yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #32317
[Bug 1447884] [NEW] Boot from volume, block device allocate timeout cause VM error, but volume would be available later
Public bug reported:
When we try to boot multi instances from volume (with a large image
source) at the same time, we usually got a block device allocate error
as the logs in nova-compute.log:
2015-03-30 23:22:46.920 6445 WARNING nova.compute.manager [-] Volume id: 551ea616-e1c4-4ef2-9bf3-b0ca6d4474dc finished being created but was not set as 'available'
2015-03-30 23:22:47.131 6445 ERROR nova.compute.manager [-] [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] Instance failed block device setup
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] Traceback (most recent call last):
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1829, in _prep_block_device
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] do_check_attach=do_check_attach) +
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 407, in attach_block_devices
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] map(_log_and_attach, block_device_mapping)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 405, in _log_and_attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] bdm.attach(*attach_args, **attach_kwargs)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 339, in attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] do_check_attach=do_check_attach)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 46, in wrapped
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] ret_val = method(obj, context, *args, **kwargs)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 229, in attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] volume_api.check_attach(context, volume, instance=instance)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/volume/cinder.py", line 305, in check_attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] raise exception.InvalidVolume(reason=msg)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] InvalidVolume: Invalid volume: status must be 'available'
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6]
This error cause the VM in "error" status:
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
| 1fa2d7aa-8bd9-4a22-8538-0a07d9dae8aa | inst02 | ERROR | block_device_mapping | NOSTATE | |
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
But the volume was in "available" status:
-----------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
| a9ab2dc2-b117-44ef-8678-f71067a9e770 | available | None | 2 | None | true | |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
+--------------------------------------+------------+--------+----------------------+-------------+-------
And when we teminiate this VM, the volume will still exist, since there is no volume attachment info stored in this VM.
This can be easily reproduced:
1. add the following options to nova.conf in compute node ( make sure the error will be happened even with small image such as cirros):
block_device_allocate_retries = 1
block_device_allocate_retries_interval = 1
2. restart the nova-compute service and try boot one instance from
volume:
nova boot --flavor 1 --block-device source=image,id=929dfa0b-
0b9b-4819-9f84-e4ed22e81f33,dest=volume,size=2,shutdown=remove,bootindex=0
--nic net-id=a302dfa6-0c5d-46ab-9295-8f8b9119fe83 --availability-zone
nova:compute01-juno.ibm.com inst02
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1447884
Title:
Boot from volume, block device allocate timeout cause VM error, but
volume would be available later
Status in OpenStack Compute (Nova):
New
Bug description:
When we try to boot multi instances from volume (with a large image
source) at the same time, we usually got a block device allocate
error as the logs in nova-compute.log:
2015-03-30 23:22:46.920 6445 WARNING nova.compute.manager [-] Volume id: 551ea616-e1c4-4ef2-9bf3-b0ca6d4474dc finished being created but was not set as 'available'
2015-03-30 23:22:47.131 6445 ERROR nova.compute.manager [-] [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] Instance failed block device setup
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] Traceback (most recent call last):
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1829, in _prep_block_device
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] do_check_attach=do_check_attach) +
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 407, in attach_block_devices
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] map(_log_and_attach, block_device_mapping)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 405, in _log_and_attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] bdm.attach(*attach_args, **attach_kwargs)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 339, in attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] do_check_attach=do_check_attach)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 46, in wrapped
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] ret_val = method(obj, context, *args, **kwargs)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 229, in attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] volume_api.check_attach(context, volume, instance=instance)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] File "/usr/lib/python2.6/site-packages/nova/volume/cinder.py", line 305, in check_attach
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] raise exception.InvalidVolume(reason=msg)
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6] InvalidVolume: Invalid volume: status must be 'available'
2015-03-30 23:22:47.131 6445 TRACE nova.compute.manager [instance: 483472b2-61b3-4574-95e2-8cd0304f90f6]
This error cause the VM in "error" status:
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
| 1fa2d7aa-8bd9-4a22-8538-0a07d9dae8aa | inst02 | ERROR | block_device_mapping | NOSTATE | |
+--------------------------------------+------------+--------+----------------------+-------------+----------------------+
But the volume was in "available" status:
-----------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
| a9ab2dc2-b117-44ef-8678-f71067a9e770 | available | None | 2 | None | true | |
+--------------------------------------+-----------+------+------+-------------+----------+--------------------------------------+
+--------------------------------------+------------+--------+----------------------+-------------+-------
And when we teminiate this VM, the volume will still exist, since there is no volume attachment info stored in this VM.
This can be easily reproduced:
1. add the following options to nova.conf in compute node ( make sure the error will be happened even with small image such as cirros):
block_device_allocate_retries = 1
block_device_allocate_retries_interval = 1
2. restart the nova-compute service and try boot one instance from
volume:
nova boot --flavor 1 --block-device source=image,id=929dfa0b-
0b9b-4819-9f84-e4ed22e81f33,dest=volume,size=2,shutdown=remove,bootindex=0
--nic net-id=a302dfa6-0c5d-46ab-9295-8f8b9119fe83 --availability-zone
nova:compute01-juno.ibm.com inst02
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1447884/+subscriptions
Follow ups
-
[Bug 1447884] Re: Boot from volume, block device allocate timeout cause VM error, but volume would be available later
From: Alan Pevec, 2015-07-29
-
[Bug 1447884] Re: Boot from volume, block device allocate timeout cause VM error, but volume would be available later
From: Thierry Carrez, 2015-06-24
-
[Bug 1447884] Re: Boot from volume, block device allocate timeout cause VM error, but volume would be available later
From: Matt Riedemann, 2015-06-10
-
[Bug 1447884] [NEW] Boot from volume, block device allocate timeout cause VM error, but volume would be available later
From: Lan Qi song, 2015-04-24
References