yahoo-eng-team team mailing list archive
Message #88962
[Bug 1975771] [NEW] instance stuck in BUILD state with vm_state building
Public bug reported:
Description
===========
In a Train cells v2 deployment, we noticed that instances randomly remain in BUILD state with vm_state building, while nova-compute never seems to actually attempt to build the instance.
When we retry, the instances may build, which makes this issue hard to debug; the infrastructure otherwise seems to work:
+--------------------------------------+------------------------------------------------------------+
| Property | Value |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | - |
| OS-EXT-SRV-ATTR:hostname | test |
| OS-EXT-SRV-ATTR:hypervisor_hostname | - |
| OS-EXT-SRV-ATTR:instance_name | instance-000026cd |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-rj1sb4zs |
| OS-EXT-SRV-ATTR:root_device_name | - |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2022-05-25T22:59:18Z |
| description | test |
| flavor:disk | 1 |
| flavor:ephemeral | 0 |
| flavor:extra_specs | {} |
| flavor:original_name | test-flavor |
| flavor:ram | 512 |
| flavor:swap | 0 |
| flavor:vcpus | 1 |
| hostId | |
| host_status | |
| id | 2a6cf0bf-8a25-4b9c-997f-e9dbfc7927e5 |
| image | cirros-0.4.0-x86_64 (15f38ee5-b94c-4bc0-a6f4-63cb308ba7bf) |
| key_name | - |
| locked | False |
| locked_reason | - |
| metadata | {} |
| name | test |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| server_groups | [] |
| status | BUILD |
| tags | [] |
| tenant_id | b72efcd55ff84b6abc65213b92d69c2f |
| trusted_image_certificates | - |
| updated | 2022-05-25T23:01:06Z |
| user_id | 5f1bf3f91c2d4ab7b46c13441dc0952f |
+--------------------------------------+------------------------------------------------------------+
Steps to reproduce
==================
As a test, we issue 6 server create requests against a different tenant,
each a fairly simple build request that does not even attach a network:
openstack server create --image cirros-0.4.0-x86_64 --flavor <flavor> --availability-zone nova:<hypervisor> <name>
sleep 4
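The two commands above can be wrapped in a small loop. This is only a sketch of how the reproduction might be scripted: the report does not show the actual loop, so the loop structure, the function name create_servers, the server names test-1 to test-6, and the default hypervisor name compute-01 are assumptions for illustration. Only the openstack server create options and the 4 second sleep come from the report.

```shell
#!/bin/sh
# Sketch of the reproduction loop. Everything marked "hypothetical" is an
# assumption and not taken from the bug report.
FLAVOR="${FLAVOR:-test-flavor}"          # flavor name as shown in the server details
HYPERVISOR="${HYPERVISOR:-compute-01}"   # hypothetical hypervisor name
COUNT="${COUNT:-6}"                      # the report issues 6 create requests
DELAY="${DELAY:-4}"                      # "sleep 4" from the report
OPENSTACK="${OPENSTACK:-openstack}"      # CLI binary, overridable for dry runs

create_servers() {
    i=1
    while [ "$i" -le "$COUNT" ]; do
        # Force the target host through the availability zone, as in the report.
        $OPENSTACK server create \
            --image cirros-0.4.0-x86_64 \
            --flavor "$FLAVOR" \
            --availability-zone "nova:$HYPERVISOR" \
            "test-$i"                    # hypothetical server name
        sleep "$DELAY"
        i=$((i + 1))
    done
}
```

Calling create_servers with OPENSTACK=echo and DELAY=0 prints the commands instead of running them, which is a cheap way to check the arguments before pointing the loop at a real cloud.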
Expected result
===============
The instance builds on the hypervisor we forced via the availability zone.
Actual result
=============
The instance remains stuck until we restart nova-compute.
Environment
===========
It's a Train environment based on Ubuntu 18.04, with nova built from
source at SHA:
10df17638a1587f740c46a574c923df9348c3344 # HEAD as of 09.05.2021
Which hypervisor did you use? KVM
Which storage type did you use? local instance storage
Which networking type did you use? Neutron openvswitch
Logs & Configs
==============
See https://paste.openstack.org/show/b7OzusN6cmr6SUps0YYn/
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1975771
Title:
instance stuck in BUILD state with vm_state building
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1975771/+subscriptions