yahoo-eng-team team mailing list archive
Message #95422
[Bug 2100588] [NEW] Incorrect image ID written to libvirt metadata when unshelving ephemeral instances
Public bug reported:
Description
===========
When an ephemeral instance (an instance booted from an image) is shelved,
Nova writes the root volume of the instance to a temporary Glance image to
make sure it persists while the instance is shelved. On unshelve, the
volume is written back out to disk and the temporary image is removed.
When the libvirt domain metadata for the instance is generated during
unshelve, the temporary image's ID is written to the metadata instead of
the ID of the original image the instance is based on.
This is corrected when the instance is cold restarted, resized or cold
migrated, but the incorrect image ID can persist if the state of the
instance does not change following the unshelve. This causes issues for
other services, such as Ceilometer, which rely on this data being correct.
Steps to reproduce
==================
1. Launch a new ephemeral instance, booted from image.
2. Confirm the correct image ID of the active ephemeral instance:
$ openstack server show <instance>
3. Check the libvirt domain XML (the image ID should be correct):
$ virsh dumpxml instance-xxxxxxxx
4. Shelve the instance:
$ openstack server shelve <instance>
5. Unshelve the instance:
$ openstack server unshelve <instance>
6. Check the libvirt domain XML again (the image ID will be incorrect):
$ virsh dumpxml instance-xxxxxxxx
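Shelve and unshelve are asynchronous, so when scripting steps 4 and 5 it helps to wait for the server status to settle before dumping the domain XML again. Below is a minimal Python sketch of that, assuming the `openstack` CLI is installed and authenticated. The helper names (`run`, `wait_for_status`, `shelve_and_unshelve`) and the timeout values are illustrative only and are not part of Nova or the client. Whether the server ends up in SHELVED or SHELVED_OFFLOADED depends on the deployment's `shelved_offload_time` setting, so the sketch accepts either.

```python
import subprocess
import time


def run(cmd):
    """Run a command and return its stdout as text (raises on failure)."""
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout


def wait_for_status(server, wanted, run=run, timeout=300, interval=5,
                    sleep=time.sleep):
    """Poll `openstack server show` until the status is one of `wanted`."""
    waited = 0
    while True:
        status = run(
            ["openstack", "server", "show", server,
             "-f", "value", "-c", "status"]
        ).strip()
        if status in wanted:
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"{server} is still {status} after {timeout}s"
            )
        sleep(interval)
        waited += interval


def shelve_and_unshelve(server, run=run, sleep=time.sleep):
    """Steps 4 and 5: shelve the server, then unshelve it again."""
    run(["openstack", "server", "shelve", server])
    wait_for_status(server, {"SHELVED", "SHELVED_OFFLOADED"},
                    run=run, sleep=sleep)
    run(["openstack", "server", "unshelve", server])
    wait_for_status(server, {"ACTIVE"}, run=run, sleep=sleep)
```

Calling `shelve_and_unshelve("<instance>")` would then leave the instance ready for the second `virsh dumpxml` check, which still has to be run on whichever compute host the instance was unshelved to.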
Expected result
===============
Given the following list of images (taken while the instance 'test2' is
shelved):
$ openstack image list
+--------------------------------------+--------------------------------------+--------+
| ID                                   | Name                                 | Status |
+--------------------------------------+--------------------------------------+--------+
| 00000000-0000-0000-0000-000000000001 | alt_tempest_cirros-0.3.4-x86_64      | active |
| 4c1e8dbe-7ef5-44be-8b3c-e36f72d93589 | cirros-0.6.2-x86_64                  | active |
| a734e8cf-376d-4d00-a51e-b57995a4fc24 | octavia_image                        | active |
| 00000000-0000-0000-0000-000000000000 | tempest_cirros-0.3.4-x86_64          | active |
| f8e04922-c715-4ffe-82ef-cddace05f331 | test2-shelved                        | active |
| e69daa32-b950-40d6-bc5e-599e73f5c838 | trove-victoria-ubuntu-focal-20240514 | active |
+--------------------------------------+--------------------------------------+--------+
libvirt domain metadata before unshelve:
$ sudo virsh dumpxml 7a8af73e-9aa2-4f5d-bb20-7f5623adcd58 | xpath -q -e /domain/metadata
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
    <nova:package version="17.0.14" />
    <nova:name>test2</nova:name>
    <nova:creationTime>2025-02-27 01:26:17</nova:creationTime>
    <nova:flavor name="c1.c1r05">
      <nova:memory>512</nova:memory>
      <nova:disk>5</nova:disk>
      <nova:swap>0</nova:swap>
      <nova:ephemeral>0</nova:ephemeral>
      <nova:vcpus>1</nova:vcpus>
    </nova:flavor>
    <nova:owner>
      <nova:user uuid="a55b109e1508472cb4466bc84786680e">jeremy</nova:user>
      <nova:project uuid="ab5d28cb6e2c4b7d89c8e0f96ac8604e">test1</nova:project>
    </nova:owner>
    <nova:root type="image" uuid="4c1e8dbe-7ef5-44be-8b3c-e36f72d93589" /> <!-- cirros-0.6.2-x86_64 -->
  </nova:instance>
</metadata>
After unshelve, the image ID should stay the same.
Actual result
=============
After unshelve:
$ sudo virsh dumpxml 7a8af73e-9aa2-4f5d-bb20-7f5623adcd58 | xpath -q -e /domain/metadata
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
    <nova:package version="17.0.14" />
    <nova:name>test2</nova:name>
    <nova:creationTime>2025-02-27 01:17:00</nova:creationTime>
    <nova:flavor name="c1.c1r05">
      <nova:memory>512</nova:memory>
      <nova:disk>5</nova:disk>
      <nova:swap>0</nova:swap>
      <nova:ephemeral>0</nova:ephemeral>
      <nova:vcpus>1</nova:vcpus>
    </nova:flavor>
    <nova:owner>
      <nova:user uuid="a55b109e1508472cb4466bc84786680e">jeremy</nova:user>
      <nova:project uuid="ab5d28cb6e2c4b7d89c8e0f96ac8604e">test1</nova:project>
    </nova:owner>
    <nova:root type="image" uuid="f8e04922-c715-4ffe-82ef-cddace05f331" /> <!-- test2-shelved -->
  </nova:instance>
</metadata>
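
For anyone wanting to check affected instances without reading the XML by eye, the comparison above can be sketched in Python. This is only an illustration: it assumes the `<metadata>` element has already been captured as a string (for example from the `virsh dumpxml ... | xpath` pipeline shown above), and the function names `root_image_id` and `metadata_matches_image` are made up for this example. The sample XML is trimmed to the elements that matter here.

```python
import xml.etree.ElementTree as ET

# Namespace used by Nova's libvirt metadata, as seen in the dumps above.
NOVA_NS = {"nova": "http://openstack.org/xmlns/libvirt/nova/1.0"}

# Trimmed-down copy of the <metadata> element from the bug report.
METADATA_TEMPLATE = """<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
    <nova:name>test2</nova:name>
    <nova:root type="image" uuid="{uuid}" />
  </nova:instance>
</metadata>"""

ORIGINAL_IMAGE = "4c1e8dbe-7ef5-44be-8b3c-e36f72d93589"  # cirros-0.6.2-x86_64
SHELVED_IMAGE = "f8e04922-c715-4ffe-82ef-cddace05f331"   # test2-shelved

BEFORE_UNSHELVE = METADATA_TEMPLATE.format(uuid=ORIGINAL_IMAGE)
AFTER_UNSHELVE = METADATA_TEMPLATE.format(uuid=SHELVED_IMAGE)


def root_image_id(metadata_xml):
    """Return the uuid of <nova:root type="image">, or None if absent."""
    tree = ET.fromstring(metadata_xml)
    root = tree.find(".//nova:root[@type='image']", NOVA_NS)
    return root.get("uuid") if root is not None else None


def metadata_matches_image(metadata_xml, expected_image_id):
    """True if the metadata points at the image the instance is based on."""
    return root_image_id(metadata_xml) == expected_image_id


if __name__ == "__main__":
    for label, xml_text in (("before", BEFORE_UNSHELVE),
                            ("after", AFTER_UNSHELVE)):
        ok = metadata_matches_image(xml_text, ORIGINAL_IMAGE)
        print(label, root_image_id(xml_text), "OK" if ok else "MISMATCH")
```

With the two dumps from this report, the "before" metadata matches the original image and the "after" metadata is flagged as a mismatch.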
Environment
===========
We have seen this on Nova Queens through to Train but, judging by the
implementation, all Nova versions up to master appear to be affected.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2100588
Title:
Incorrect image ID written to libvirt metadata when unshelving
ephemeral instances
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2100588/+subscriptions