yahoo-eng-team team mailing list archive
Message #95895
[Bug 2111320] [NEW] Error detaching volumes when there are stale rows in services tables with same host field
Public bug reported:
Description
===========
I was seeing this kind of message when trying to detach a volume:
Error: Error detaching openstack_compute_volume_attach_v2 0528836c-b7e9-4ab3-9a8f-02675e58e52e/ff4dc359-eee8-4d7f-aabb-3e3d2f468c0a: Expected HTTP response code [202 204] when accessing [DELETE https://vip.cosma.example.ac.uk:8774/v2.1/servers/0528836c-b7e9-4ab3-9a8f-02675e58e52e/os-volume_attachments/ff4dc359-eee8-4d7f-aabb-3e3d2f468c0a], but got 409 instead\n{\"conflictingRequest\": {\"code\": 409, \"message\": \"Service is unavailable at this time.\"}}
I enabled debug logs on nova-api and saw:
2025-05-20 10:22:53.815 39 DEBUG nova.servicegroup.drivers.db [None req-5fc6720f-7aca-4bca-b3e8-95e66cb2815d 1b64d8057eb14eb4bb9f3fb01a414b1e 8b380af28742442ca0f7320d4a6f94c5 - - default default] Seems service nova-compute on host hypervisor01 is down. Last heartbeat was 2024-06-07 14:50:38. Elapsed time is 29961135.814929 is_up /var/lib/kolla/venv/lib64/python3.9/site-packages/nova/servicegroup/drivers/db.py:76
2025-05-20 10:22:53.815 39 INFO nova.api.openstack.wsgi [None req-5fc6720f-7aca-4bca-b3e8-95e66cb2815d 1b64d8057eb14eb4bb9f3fb01a414b1e 8b380af28742442ca0f7320d4a6f94c5 - - default default] HTTP exception thrown: Service is unavailable at this time.
2025-05-20 10:22:53.815 39 DEBUG nova.api.openstack.wsgi [None req-5fc6720f-7aca-4bca-b3e8-95e66cb2815d 1b64d8057eb14eb4bb9f3fb01a414b1e 8b380af28742442ca0f7320d4a6f94c5 - - default default] Returning 409 to user: Service is unavailable at this time. __call__ /var/lib/kolla/venv/lib64/python3.9/site-packages/nova/api/openstack/wsgi.py:936
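For context, the staleness check behind that first DEBUG line is simple: the DB servicegroup driver treats a service as down once its last heartbeat is older than the configured service_down_time (60 seconds by default). A minimal Python sketch of that logic, with illustrative names rather than Nova's actual code:

```python
from datetime import datetime, timedelta

# Illustrative staleness window; Nova's [DEFAULT]/service_down_time
# defaults to 60 seconds.
SERVICE_DOWN_TIME = timedelta(seconds=60)

def is_up(last_heartbeat: datetime, now: datetime) -> bool:
    """Report a service as up only if its heartbeat is recent enough."""
    return (now - last_heartbeat) <= SERVICE_DOWN_TIME

now = datetime(2025, 5, 20, 10, 22, 53)
stale = datetime(2024, 6, 7, 14, 50, 38)   # heartbeat on the stale row
fresh = datetime(2025, 5, 20, 10, 22, 30)  # heartbeat on the live row

print(is_up(stale, now))  # the stale row fails the check
print(is_up(fresh, now))  # the live row passes it
```

My reading is that the API is applying this check to a stale row rather than to the row that is actually heartbeating.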
Looking at the services table, I saw some soft-deleted rows sharing the same host field as the live ones:
MariaDB [nova]> select * from services;
| created_at | updated_at | deleted_at | id | host | binary | topic | report_count | disabled | deleted | disabled_reason | last_seen_up | forced_down | version | uuid |
| 2024-05-03 17:49:33 | 2025-05-20 09:56:32 | NULL | 3 | controller0 | nova-conductor | conductor | 3016236 | 0 | 0 | NULL | 2025-05-20 09:56:32 | 0 | 67 | f6c6173d-27bd-4bf5-9a45-5692ce6b8c21 |
| 2024-05-03 17:49:36 | 2025-05-20 09:56:32 | NULL | 18 | controller2 | nova-conductor | conductor | 3014186 | 0 | 0 | NULL | 2025-05-20 09:56:32 | 0 | 67 | 8126ed0f-1261-431a-8053-50d57ee0490f |
| 2024-05-03 17:49:36 | 2025-05-20 09:56:52 | NULL | 33 | controller1 | nova-conductor | conductor | 3015781 | 0 | 0 | NULL | 2025-05-20 09:56:52 | 0 | 67 | 779a72a6-cedf-4600-a17c-10a00b161b7d |
| 2024-05-03 17:49:58 | 2025-05-20 09:56:33 | NULL | 45 | controller2-ironic | nova-compute | compute | 3010744 | 0 | 0 | NULL | 2025-05-20 09:56:33 | 0 | 67 | f2866918-980b-4b1b-a3cd-28ded445186a |
| 2024-05-03 17:49:58 | 2025-05-20 09:56:25 | NULL | 48 | controller1-ironic | nova-compute | compute | 3011140 | 0 | 0 | NULL | 2025-05-20 09:56:25 | 0 | 67 | 78095773-fb60-4eec-bf74-77eac558bbd4 |
| 2024-05-03 17:49:58 | 2025-05-20 09:56:31 | NULL | 51 | controller0-ironic | nova-compute | compute | 3011526 | 0 | 0 | NULL | 2025-05-20 09:56:31 | 0 | 67 | 0213714b-7cca-4dfc-869b-28addf2ccc28 |
| 2024-05-17 18:14:51 | 2024-06-07 14:50:38 | 2024-06-10 10:33:28 | 54 | hypervisor01 | nova-compute | compute | 144610 | 0 | 54 | NULL | 2024-06-07 14:50:38 | 0 | 66 | f861cffb-370b-45a6-8a2e-33fcace2bad1 |
| 2024-05-17 18:22:37 | 2024-06-07 14:50:36 | 2024-06-10 10:33:38 | 57 | hypervisor02 | nova-compute | compute | 144458 | 0 | 57 | NULL | 2024-06-07 14:50:36 | 0 | 66 | 201fff48-52c7-44a3-adce-6e32a6c6d889 |
| 2024-06-10 10:33:33 | 2024-06-14 10:22:47 | 2024-06-14 10:36:54 | 58 | hypervisor01 | nova-compute | compute | 32905 | 0 | 58 | NULL | 2024-06-14 10:22:47 | 0 | 66 | 8daf9eb9-49a8-4d22-837a-c5b421d3e847 |
| 2024-06-10 10:33:48 | 2024-06-14 14:25:33 | 2024-06-14 16:29:30 | 61 | hypervisor02 | nova-compute | compute | 34504 | 0 | 61 | NULL | 2024-06-14 14:25:33 | 0 | 66 | 635f087b-688c-4ed3-be17-b25d5f24a9c5 |
| 2024-06-14 11:24:12 | 2024-06-14 11:57:25 | 2024-06-14 16:29:41 | 64 | hypervisor01 | nova-compute | compute | 198 | 0 | 64 | NULL | 2024-06-14 11:57:25 | 0 | 66 | 47bd807e-cea6-4fd6-aad5-314338c7f386 |
| 2024-06-14 16:32:59 | 2024-06-17 14:46:06 | 2024-06-17 15:00:34 | 67 | hypervisor02 | nova-compute | compute | 24823 | 0 | 67 | NULL | 2024-06-17 14:46:06 | 0 | 66 | 442e7495-7d72-47d4-82d6-af469cb2c8ef |
| 2024-06-14 16:33:00 | 2024-06-19 13:33:27 | 2024-06-19 13:34:17 | 70 | hypervisor01 | nova-compute | compute | 41416 | 0 | 70 | NULL | 2024-06-19 13:33:27 | 0 | 66 | 0b01d66c-e1f1-4c3e-a597-e176aa068d1d |
| 2024-06-17 15:24:52 | 2024-06-19 14:10:29 | 2024-06-19 14:41:01 | 73 | hypervisor02 | nova-compute | compute | 16568 | 1 | 73 | NULL | 2024-06-19 14:10:29 | 0 | 66 | 47f90afb-64a2-4ff5-83a0-7442d894ef6f |
| 2024-06-19 13:34:46 | 2025-05-20 09:56:27 | NULL | 76 | hypervisor01 | nova-compute | compute | 2444608 | 0 | 0 | NULL | 2025-05-20 09:56:27 | 0 | 67 | e4e7ca1c-34e8-4c1d-a32b-a7a67862ef54 |
| 2024-06-19 14:43:58 | 2025-05-20 09:56:32 | NULL | 79 | hypervisor02 | nova-compute | compute | 2283970 | 0 | 0 | NULL | 2025-05-20 09:56:32 | 0 | 67 | 1b4f3177-ae00-4b59-95d9-a234c6ec8dc6 |
I tried running `nova-manage db archive_deleted_rows --until-complete` in the hope that these rows would be moved to the services shadow table; however, the soft-deleted rows remained in the services table.
I dropped the deleted rows manually, and I could then detach the volumes. Is there an easier way? Am I likely to run into issues having done this?
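The behaviour I expected can be sketched as a lookup that filters on the deleted column, so stale rows for the same host are never consulted. This is only an illustration of the expected semantics; the dict rows and the live_service helper are hypothetical, not Nova code:

```python
# Hypothetical sketch: when several service rows share a host, only the
# live (deleted == 0) row should be used for the up/down decision.

def live_service(rows, host, binary="nova-compute"):
    """Return the non-deleted service row for a host, or None."""
    candidates = [
        r for r in rows
        if r["host"] == host and r["binary"] == binary and r["deleted"] == 0
    ]
    # At most one live row per (host, binary) is expected.
    return candidates[0] if candidates else None

rows = [
    # Stale, soft-deleted record left over from a redeploy (deleted == id).
    {"id": 54, "host": "hypervisor01", "binary": "nova-compute", "deleted": 54},
    # Current record that is actually heartbeating.
    {"id": 76, "host": "hypervisor01", "binary": "nova-compute", "deleted": 0},
]

print(live_service(rows, "hypervisor01")["id"])  # 76, the live row
```

With a filter like this, the stale id 54 row would never shadow the live id 76 row for hypervisor01.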
Steps to reproduce
==================
- Redeploy hypervisors after deleting the old nova-compute service records (my guess at what happened)
- Boot instance and attach volume (this works)
- Detach volume (this is what caused the error)
Expected result
===============
Nova ignores soft-deleted rows in the services table and the volume detaches successfully.
Actual result
=============
See description
Environment
===========
OpenStack 2024.1, deployed via Kolla.
Logs & Configs
==============
See description.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2111320