[Bug 2110738] [NEW] Stable rescue fails when necessary image properties not set
Public bug reported:
From https://issues.redhat.com/browse/OSPRH-13142:
Description of problem:
For a boot-from-volume instance, 'openstack server rescue <vm> --image
<image>' fails with the following issues:
1. It attempts to attach two disks, <instance_uuid>_disk and
<instance_uuid>_disk.rescue. Only <instance_uuid>_disk.rescue is
actually created, so it fails with the following error:
2024-01-23 16:32:14.338 2 ERROR oslo_messaging.rpc.server nova.exception.InstanceNotRescuable: Instance dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2 cannot be rescued: Driver Error: internal error: process exited while connecting to monitor: 2024-01-23T16:32:13.017966Z qemu-kvm: -blockdev {"driver":"rbd","pool":"vms","image":"dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk","server":[{"host":"172.16.1.100","port":"6789"}],"user":"openstack","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-auth-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk: No such file or directory
If you look in Ceph, only the .rescue image exists.
# rbd --id openstack -p vms ls -l
NAME SIZE PARENT FMT PROT LOCK
dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk.rescue 10 GiB 2 excl
However, we see the instance configured with both disks.
# virsh domblklist instance-00000003
Target Source
----------------------------------------------------------------
vda vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk.rescue
vdb vms/dc0812ba-b4ca-4ffa-a7e5-2157e52f35d2_disk
If I manually copy the UUID_disk.rescue image to UUID_disk (the exact
rbd cp command is shown in the reproducer below), the instance will
boot into RESCUE mode. It seems the UUID_disk volume is not needed and
should not be configured in this rescue situation.
2. The rescued instance doesn't attach the Cinder root volume. The
Cinder root volume also doesn't re-attach after "unrescuing" the instance.
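Given the bug title, the likely trigger is that the rescue image was never tagged for Nova's stable device rescue, which (as far as I understand) is the only rescue mode supported for boot-from-volume instances. A hedged sketch of what that setup would look like, not part of the original report (the image and server names are taken from the reproducer below, and the property values are typical choices):
$ openstack image set rhel8 --property hw_rescue_device=disk --property hw_rescue_bus=virtio
$ openstack --os-compute-api-version 2.87 server rescue test1 --image rhel8
hw_rescue_device and hw_rescue_bus are the Glance properties the libvirt driver checks when deciding whether it can build a stable rescue disk layout, and 2.87 is, if I recall the API history correctly, the first compute API microversion that accepts rescue for a volume-backed server.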
Reproducer:
$ openstack volume create --size 10 --image rhel8 rootvol1
$ openstack volume list
+--------------------------------------+----------+-----------+------+-------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+----------+-----------+------+-------------+
| f855dfe6-ad5a-4497-87ff-16ac5856f596 | rootvol1 | available | 10 | |
+--------------------------------------+----------+-----------+------+-------------+
$ openstack server create --key-name default --flavor rhel --volume rootvol1 --network external test1
$ openstack server show test1 -c status -c image -c volumes_attached
+------------------+--------------------------------------------------------------------------+
| Field | Value |
+------------------+--------------------------------------------------------------------------+
| image | N/A (booted from volume) |
| status | ACTIVE |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
$ openstack server rescue test1 --image rhel8
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| fault | {'code': 400, 'created': '2024-01-23T20:12:17Z', 'message': 'Instance ac3d46c0-c8d5-45df-bd17-d467baaa5a98 cannot be rescued: Driver |
| | Error: internal error: process exited while connecting to monitor: 2024-01-23T20:12:17.612453Z qemu-kvm: -blockdev |
| | {"driver":"rbd","pool":"vms","image":"ac3d46c0-c8d5-45df-bd17-d467ba'} |
| image | N/A (booted from volume) |
| status | ERROR |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------+
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
# rbd --id openstack -p vms ls -l
NAME SIZE PARENT FMT PROT LOCK
ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue 10 GiB 2
NOTE: here, if we manually create the _disk volume, the instance will
boot into rescue mode; however, the Cinder volume is not attached.
# rbd --id openstack cp vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
Image copy: 100% complete...done.
The rescue now completes and the instance is accessible (without the
Cinder root volume attached).
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+--------------------------------------------------------------------------+
| Field | Value |
+------------------+--------------------------------------------------------------------------+
| image | N/A (booted from volume) |
| status | RESCUE |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
The volume still shows as in-use:
$ openstack volume list
+--------------------------------------+----------+--------+------+--------------------------------+
| ID | Name | Status | Size | Attached to |
+--------------------------------------+----------+--------+------+--------------------------------+
| f855dfe6-ad5a-4497-87ff-16ac5856f596 | rootvol1 | in-use | 10 | Attached to test1 on /dev/vda |
+--------------------------------------+----------+--------+------+--------------------------------+
But it is not attached:
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
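A further check, not in the original report, that shows the same mismatch from the Cinder side (hypothetical command, using the volume from this reproducer): the attachments field should still list test1 even though the domain above has no disk backed by the volume.
$ openstack volume show rootvol1 -c attachments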
The other ugly thing is that unrescue does not revert this back to the original disk config.
$ openstack server unrescue test1
$ openstack server show test1 -c status -c image -c volumes_attached -c fault --fit
+------------------+--------------------------------------------------------------------------+
| Field | Value |
+------------------+--------------------------------------------------------------------------+
| image | N/A (booted from volume) |
| status | ACTIVE |
| volumes_attached | delete_on_termination='False', id='f855dfe6-ad5a-4497-87ff-16ac5856f596' |
+------------------+--------------------------------------------------------------------------+
The above looks good, but the instance is still booted on rescue disks.
# virsh domblklist instance-00000004
Target Source
----------------------------------------------------------------
vda vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk.rescue
vdb vms/ac3d46c0-c8d5-45df-bd17-d467baaa5a98_disk
A hard reboot will fix it:
$ openstack server reboot --hard test1
Now the instance is back to booting from the volume:
# virsh domblklist instance-00000004
Target Source
---------------------------------------------------------------
vda volumes/volume-f855dfe6-ad5a-4497-87ff-16ac5856f596
Version-Release number of selected component (if applicable):
Wallaby
How reproducible:
100%
Steps to Reproduce:
1. See above
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2110738
Title:
Stable rescue fails when necessary image properties not set
Status in OpenStack Compute (nova):
New