yahoo-eng-team team mailing list archive

[Bug 1975771] [NEW] instance stuck in BUILD state with vm_state building

 

Public bug reported:

Description
===========

In a Train cells v2 deployment, we noticed that instances randomly remain in BUILD status with vm_state building, while nova-compute never seems to actually attempt to build the instance.
When we retry, the instances may build, which makes the issue hard to debug; in general the infrastructure seems to work. An affected instance looks like this:

+--------------------------------------+------------------------------------------------------------+
| Property                             | Value                                                      |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                     |
| OS-EXT-AZ:availability_zone          | nova                                                       |
| OS-EXT-SRV-ATTR:host                 | -                                                          |
| OS-EXT-SRV-ATTR:hostname             | test                                                       |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                                          |
| OS-EXT-SRV-ATTR:instance_name        | instance-000026cd                                          |
| OS-EXT-SRV-ATTR:kernel_id            |                                                            |
| OS-EXT-SRV-ATTR:launch_index         | 0                                                          |
| OS-EXT-SRV-ATTR:ramdisk_id           |                                                            |
| OS-EXT-SRV-ATTR:reservation_id       | r-rj1sb4zs                                                 |
| OS-EXT-SRV-ATTR:root_device_name     | -                                                          |
| OS-EXT-SRV-ATTR:user_data            | -                                                          |
| OS-EXT-STS:power_state               | 0                                                          |
| OS-EXT-STS:task_state                | scheduling                                                 |
| OS-EXT-STS:vm_state                  | building                                                   |
| OS-SRV-USG:launched_at               | -                                                          |
| OS-SRV-USG:terminated_at             | -                                                          |
| accessIPv4                           |                                                            |
| accessIPv6                           |                                                            |
| config_drive                         |                                                            |
| created                              | 2022-05-25T22:59:18Z                                       |
| description                          | test                                                       |
| flavor:disk                          | 1                                                          |
| flavor:ephemeral                     | 0                                                          |
| flavor:extra_specs                   | {}                                                         |
| flavor:original_name                 | test-flavor                                                |
| flavor:ram                           | 512                                                        |
| flavor:swap                          | 0                                                          |
| flavor:vcpus                         | 1                                                          |
| hostId                               |                                                            |
| host_status                          |                                                            |
| id                                   | 2a6cf0bf-8a25-4b9c-997f-e9dbfc7927e5                       |
| image                                | cirros-0.4.0-x86_64 (15f38ee5-b94c-4bc0-a6f4-63cb308ba7bf) |
| key_name                             | -                                                          |
| locked                               | False                                                      |
| locked_reason                        | -                                                          |
| metadata                             | {}                                                         |
| name                                 | test                                                       |
| os-extended-volumes:volumes_attached | []                                                         |
| progress                             | 0                                                          |
| server_groups                        | []                                                         |
| status                               | BUILD                                                      |
| tags                                 | []                                                         |
| tenant_id                            | b72efcd55ff84b6abc65213b92d69c2f                           |
| trusted_image_certificates           | -                                                          |
| updated                              | 2022-05-25T23:01:06Z                                       |
| user_id                              | 5f1bf3f91c2d4ab7b46c13441dc0952f                           |
+--------------------------------------+------------------------------------------------------------+


Steps to reproduce
==================

As a test, we issue six server create requests against a different
tenant with a fairly simple build request that does not even attach a
network:

openstack server create --image cirros-0.4.0-x86_64 --flavor <flavor>  --availability-zone nova:<hypervisor> <name>
sleep 4
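
The report shows only the single command and the sleep, not the loop around them. A minimal sketch of how the six requests could be issued follows. The `create_servers` function, the `DRY_RUN` switch, the `repro-N` server names and the `compute-01` hypervisor default are all assumptions made for illustration, not part of the original test; `test-flavor` is taken from the instance listing above.

```shell
#!/bin/sh
# Hypothetical reproduction loop (not the reporter's actual script).
# FLAVOR defaults to the flavor name shown in the instance listing;
# HYPERVISOR is a placeholder and must be set to a real compute host.
FLAVOR="${FLAVOR:-test-flavor}"
HYPERVISOR="${HYPERVISOR:-compute-01}"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

create_servers() {
    i=1
    while [ "$i" -le 6 ]; do
        # Build the command once so the printed and executed forms are identical.
        set -- openstack server create --image cirros-0.4.0-x86_64 \
            --flavor "$FLAVOR" --availability-zone "nova:$HYPERVISOR" "repro-$i"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$@"
        else
            "$@"
            sleep 4   # same pause between requests as in the report
        fi
        i=$((i + 1))
    done
}

create_servers
```

Run as-is, this only prints the six commands; with `DRY_RUN=0` and a real hypervisor name it would send the requests four seconds apart.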

Expected result
===============

The instance builds on the hypervisor we forcibly set.

Actual result
=============

The instance stays stuck until we restart nova-compute.

Environment
===========

It's a Train environment based on Ubuntu 18.04, with nova built from
source at SHA:

10df17638a1587f740c46a574c923df9348c3344  # HEAD as of 09.05.2021

Which hypervisor did you use? KVM

Which storage type did you use? local instance storage

Which networking type did you use? Neutron openvswitch

Logs & Configs
==============

See https://paste.openstack.org/show/b7OzusN6cmr6SUps0YYn/

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1975771

Title:
  instance stuck in BUILD state with vm_state building

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1975771/+subscriptions


