yahoo-eng-team team mailing list archive
Message #77217
[Bug 1817833] [NEW] Check compute_id existence when nova-compute reports info to placement
Public bug reported:
Description
===========
According to https://bugs.launchpad.net/nova/+bug/1756179, deleting a nova-compute service currently also deletes the compute_node records, resource provider records and host mapping records in the DB. I found that if the service is deleted while the nova-compute service is still active, the compute_node and resource_provider records are deleted from the DB without any problem, but nova-compute continues to report the old resource_provider uuid. So when we restart nova-compute to recover the service, it raises ResourceProviderCreationFailed.
Steps to reproduce
==================
1. Check the environment and the resource_providers table.
# nova service-list | grep 'nova-compute'
| 3d9092b0-e164-4094-8672-1c855971218d | nova-compute | devstack-q | nova | enabled | up |
MariaDB [placement]> select uuid,name from resource_providers;
+--------------------------------------+------------+
| uuid | name |
+--------------------------------------+------------+
| edfff022-c19f-4720-85f9-fd947ae36b07 | devstack-q |
+--------------------------------------+------------+
2. Delete the compute service while the nova-compute process is running, then check the resource_providers table.
# nova service-delete 3d9092b0-e164-4094-8672-1c855971218d
MariaDB [placement]> select * from resource_providers;
Empty set (0.00 sec)
3. Wait a minute, then restart the nova-compute process.
# systemctl restart devstack@n-cpu
Expected result
===============
nova-compute works properly and reports to resource_provider with a new uuid.
Actual result
=============
nova-compute raises a 409 when creating a resource_provider with a new uuid, and reports 'No resource provider with uuid 52943fd2-d700-416f-9e16-7fe4744979b3 found'.
I found that while nova-compute is running, it restores the old uuid to resource_providers once that uuid is gone. So the current resource_provider uuid in the DB is still 'edfff022-c19f-4720-85f9-fd947ae36b07'. After the restart, nova-compute tries to create a new resource provider with the name 'devstack-q'. Unfortunately, the name column in the table is unique, so the create request conflicts with the restored row.
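The name conflict described above can be illustrated with a minimal sketch. It uses an in-memory SQLite table as a simplified stand-in for the placement resource_providers table (the real schema has more columns and lives in MariaDB), and create_provider is a hypothetical helper, not a nova or placement function. The uuids and the name are the ones from this report.

```python
import sqlite3


def create_provider(conn, uuid, name):
    """Insert a provider row; return False if a unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO resource_providers (uuid, name) VALUES (?, ?)",
                (uuid, name),
            )
        return True
    except sqlite3.IntegrityError:
        # Assumption: this is the situation placement answers with a 409.
        return False


conn = sqlite3.connect(":memory:")
# Simplified stand-in schema: only the two columns shown in the report.
conn.execute(
    "CREATE TABLE resource_providers ("
    "uuid TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)

# The running nova-compute puts the old uuid back after the service delete.
old_ok = create_provider(
    conn, "edfff022-c19f-4720-85f9-fd947ae36b07", "devstack-q"
)
# After the restart, a new uuid is tried with the same name.
new_ok = create_provider(
    conn, "52943fd2-d700-416f-9e16-7fe4744979b3", "devstack-q"
)
row_count = conn.execute("SELECT COUNT(*) FROM resource_providers").fetchone()[0]
```

In this sketch the second insert is rejected because of the unique name, which leaves only the row with the old uuid in the table.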
So I think we should check that the compute_id exists before updating
the resource_provider_tree. If it does not exist, raise
ComputeHostNotFound instead of reporting.
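A rough sketch of the proposed check, under stated assumptions: ComputeHostNotFound here is a local stand-in class rather than the nova exception, and update_placement, existing_compute_ids and report are hypothetical names used only for illustration. The real change would have to live in the nova-compute code path that updates the resource_provider_tree.

```python
class ComputeHostNotFound(Exception):
    """Local stand-in for the nova exception of the same name."""


def update_placement(compute_id, existing_compute_ids, report):
    """Report to placement only if the compute node record still exists.

    existing_compute_ids stands in for a DB lookup of compute_node records;
    report stands in for whatever pushes the provider tree to placement.
    """
    if compute_id not in existing_compute_ids:
        # The record was deleted (e.g. by a service delete): do not re-create
        # the old resource provider, fail loudly instead.
        raise ComputeHostNotFound(
            f"compute node {compute_id} no longer exists"
        )
    report(compute_id)


reported = []
# Compute node 1 still exists, so it is reported.
update_placement(1, {1}, reported.append)

# Compute node 2 was deleted, so nothing is reported for it.
try:
    update_placement(2, {1}, reported.append)
    skipped = False
except ComputeHostNotFound:
    skipped = True
```

With a check like this, a running nova-compute whose service was deleted would stop restoring the old uuid, so a later restart could create the provider under a new uuid.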
** Affects: nova
Importance: Undecided
Assignee: xulei (605423512-j)
Status: New
** Tags: placement
** Changed in: nova
Assignee: (unassigned) => xulei (605423512-j)
** Tags added: placement
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1817833
Title:
Check compute_id existence when nova-compute reports info to
placement
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1817833/+subscriptions