yahoo-eng-team team mailing list archive
Message #80484
[Bug 1849701] [NEW] Resource_provider entry related to a deleted compute node, unable to migrate vms to the node
Public bug reported:
Description
===========
Migrating a VM to a node failed with the following error:
"There was a conflict when trying to complete your request.\n\n
Conflicting resource provider name: mymachine.maas already exists."
https://paste.ubuntu.com/p/4dxS6d8X8p/
Steps to reproduce
==================
We found that the compute node had been added multiple times. The valid
record is the one with created_at 2019-08-22 18:47:31:
mysql> select created_at, deleted_at from compute_nodes where host="mymachine";
+---------------------+---------------------+
| created_at          | deleted_at          |
+---------------------+---------------------+
| 2019-08-22 18:47:31 | NULL                |
| 2019-08-21 11:50:26 | 2019-08-22 11:04:27 |
| 2019-08-22 16:25:52 | 2019-08-22 16:58:42 |
| 2019-08-22 18:42:39 | 2019-08-22 18:45:36 |
+---------------------+---------------------+
4 rows in set (0.00 sec)
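The resource provider entry, however, was related to an already deleted compute node (its created_at is one second after the record created at 2019-08-22 18:42:39, which was deleted at 18:45:36):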
mysql> select created_at from resource_providers where name="mymachine.maas";
+---------------------+
| created_at          |
+---------------------+
| 2019-08-22 18:42:40 |
+---------------------+
1 row in set (0.00 sec)
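The correlation between the two query results can be checked mechanically. The following is only a minimal sketch in Python: the timestamps are copied from the output above, while the helper name owning_record and the rule "the provider belongs to the compute node record whose lifetime contains its created_at" are assumptions made for illustration, not something nova itself does.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (created_at, deleted_at) pairs copied from the compute_nodes query above.
COMPUTE_NODES = [
    ("2019-08-22 18:47:31", None),
    ("2019-08-21 11:50:26", "2019-08-22 11:04:27"),
    ("2019-08-22 16:25:52", "2019-08-22 16:58:42"),
    ("2019-08-22 18:42:39", "2019-08-22 18:45:36"),
]

# created_at of the resource_providers row named "mymachine.maas".
RP_CREATED_AT = "2019-08-22 18:42:40"


def parse(ts):
    """Parse a MySQL-style timestamp; None (SQL NULL) stays None."""
    return datetime.strptime(ts, FMT) if ts else None


def owning_record(rp_created_at, nodes):
    """Hypothetical helper: return the node record whose lifetime
    [created_at, deleted_at] contains rp_created_at, or None."""
    rp = parse(rp_created_at)
    for created, deleted in nodes:
        start, end = parse(created), parse(deleted)
        if start <= rp and (end is None or rp <= end):
            return (created, deleted)
    return None


match = owning_record(RP_CREATED_AT, COMPUTE_NODES)
print("provider belongs to record:", match)
print("record is deleted:", match is not None and match[1] is not None)
```

With the data above, the only record whose lifetime contains the provider's creation time is one that has a non-NULL deleted_at, which is consistent with the provider being a leftover of a deleted compute node.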
We tried to delete it:
mysql> delete from resource_providers where name="mymachine.maas";
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`nova_api`.`resource_providers`, CONSTRAINT `resource_providers_ibfk_1` FOREIGN KEY (`root_provider_id`) REFERENCES `resource_providers` (`id`))
Strangely, root_provider_id references the id column of the same table,
and in every row it points at the row itself, which seems to make it
impossible to delete any row from this table:
mysql> select id,root_provider_id from resource_providers;
+----+------------------+
| id | root_provider_id |
+----+------------------+
|  1 |                1 |
|  4 |                4 |
|  7 |                7 |
| 10 |               10 |
| 13 |               13 |
| 16 |               16 |
| 19 |               19 |
| 22 |               22 |
| 28 |               28 |
| 31 |               31 |
| 34 |               34 |
| 37 |               37 |
| 40 |               40 |
| 43 |               43 |
| 45 |               45 |
| 52 |               52 |
| 55 |               55 |
| 58 |               58 |
| 61 |               61 |
| 64 |               64 |
| 67 |               67 |
| 70 |               70 |
| 73 |               73 |
| 76 |               76 |
| 79 |               79 |
| 82 |               82 |
| 91 |               91 |
+----+------------------+
Expected result
===============
The resource provider entry should be deleted when its compute node is deleted, so that VMs can be migrated to the node.
Workaround
==========
We renamed the stale resource provider entry to "invalid":
mysql> update resource_providers set name="invalid" where name="mymachine.maas";
Query OK, 1 row affected (0.01 sec)
We then restarted nova-compute on the node:
systemctl restart nova-compute
The resource provider entry was recreated:
mysql> select * from resource_providers where name="mymachine.maas";
+---------------------+---------------------+-----+--------------------------------------+------------------+------------+----------+------------------+--------------------+
| created_at          | updated_at          | id  | uuid                                 | name             | generation | can_host | root_provider_id | parent_provider_id |
+---------------------+---------------------+-----+--------------------------------------+------------------+------------+----------+------------------+--------------------+
| 2019-10-24 15:16:51 | 2019-10-24 15:18:12 | 384 | e6dabd5d-d1ed-4fd5-a1e0-0be3b360fb28 | mymachine.maas   |          2 |     NULL |              384 |               NULL |
+---------------------+---------------------+-----+--------------------------------------+------------------+------------+----------+------------------+--------------------+
After that, migration worked.
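Why the rename helps can be illustrated with a small model. This is a hedged sketch only: it uses an in-memory SQLite database as a stand-in for the MySQL nova_api database, a reduced schema, and placeholder uuids. The UNIQUE constraint on name is an assumption inferred from the "Conflicting resource provider name" error, and register() is a hypothetical stand-in for nova-compute creating its provider; none of this is nova or placement code.

```python
import sqlite3

# In-memory stand-in for the nova_api database (assumption: SQLite instead
# of MySQL, and only the columns needed for the illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resource_providers ("
    " id INTEGER PRIMARY KEY,"
    " uuid TEXT NOT NULL UNIQUE,"
    " name TEXT UNIQUE)"  # assumed: provider names must be unique
)

# Stale provider left behind by the deleted compute node (placeholder uuid).
conn.execute(
    "INSERT INTO resource_providers (uuid, name) VALUES (?, ?)",
    ("stale-uuid", "mymachine.maas"),
)
conn.commit()


def register(conn, uuid, name):
    """Hypothetical stand-in for nova-compute registering its provider.
    Returns False when the name is already taken (the conflict case)."""
    try:
        conn.execute(
            "INSERT INTO resource_providers (uuid, name) VALUES (?, ?)",
            (uuid, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# The new compute node record has a different uuid but the same name.
before = register(conn, "new-uuid", "mymachine.maas")

# Workaround from the report: move the stale row out of the way.
conn.execute(
    "UPDATE resource_providers SET name = 'invalid' WHERE name = 'mymachine.maas'"
)
after = register(conn, "new-uuid", "mymachine.maas")

names = sorted(row[0] for row in conn.execute("SELECT name FROM resource_providers"))
print(before, after, names)
```

In this model the first registration is rejected because the stale row still holds the name, and it succeeds once that row is renamed. The stale row itself stays in the table under the name "invalid", as it does in the workaround above.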
Environment
===========
xenial-queens cloud
Nova compute node:
dpkg -l | grep nova
ii nova-api-metadata 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute - metadata API frontend
ii nova-common 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute - common files
ii nova-compute 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute - compute node libvirt support
ii python-nova 2:17.0.10-0ubuntu2.1~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 2:9.1.1-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7
Nova cloud controller:
dpkg -l | grep nova
ii nova-api-os-compute 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - OpenStack Compute API frontend
ii nova-common 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - common files
ii nova-conductor 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - conductor service
ii nova-consoleauth 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - Console Authenticator
ii nova-novncproxy 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - NoVNC proxy
ii nova-placement-api 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - placement API frontend
ii nova-scheduler 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - virtual machine scheduler
ii nova-spiceproxy 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute - spice html5 proxy
ii python-nova 2:17.0.9-0ubuntu1~cloud0 all OpenStack Compute Python libraries
ii python-novaclient 2:9.1.1-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7
** Affects: nova
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1849701