yahoo-eng-team team mailing list archive
Message #70401
[Bug 1743579] [NEW] Concurrent report_state from multiple agents: segment_host_mapping fails - StaleDataError
Public bug reported:
When multiple host agents rapidly call report_state for the first time,
we get StaleDataError, and _update_segment_host_mapping_for_agent does
not complete for all hosts.
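
As a rough illustration of how this kind of error can arise in general, the
sketch below simulates two workers acting on the same row, using only the
Python standard library. It is not neutron code: the table `demo`, the
helper `flush_update`, and the local `StaleDataError` class are hypothetical
stand-ins, and the report does not confirm that this is the exact sequence
that happens inside neutron.

```python
import sqlite3


class StaleDataError(Exception):
    """Local stand-in for the ORM error; not the real SQLAlchemy class."""


def flush_update(conn, row_id, heartbeat):
    """Issue an UPDATE that, like an ORM flush, expects to match one row."""
    cur = conn.execute(
        "UPDATE demo SET heartbeat = ? WHERE id = ?", (heartbeat, row_id)
    )
    if cur.rowcount != 1:
        raise StaleDataError(
            f"UPDATE expected to match 1 row, matched {cur.rowcount}"
        )


def simulate_race():
    """Return the error message raised when a stale copy is flushed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo (id INTEGER PRIMARY KEY, host TEXT, heartbeat INTEGER)"
    )
    conn.execute("INSERT INTO demo (id, host, heartbeat) VALUES (1, 'host0', 0)")

    # Two workers each read the same row and remember its primary key.
    query = "SELECT id FROM demo WHERE host = 'host0'"
    seen_by_a = conn.execute(query).fetchone()[0]
    seen_by_b = conn.execute(query).fetchone()[0]

    # Worker A wins the race and replaces the row under a new primary key.
    conn.execute("DELETE FROM demo WHERE id = ?", (seen_by_a,))
    conn.execute("INSERT INTO demo (id, host, heartbeat) VALUES (2, 'host0', 1)")

    # Worker B now flushes its stale in-memory copy and matches no rows.
    try:
        flush_update(conn, seen_by_b, 2)
    except StaleDataError as exc:
        return str(exc)
    finally:
        conn.close()
    return None
```

Calling `simulate_race()` returns the message of the simulated error. The
point is only that the second writer acts on a row identity that is no
longer current, which is the general condition a stale-data error signals.
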
Attached is a file with logs, a reproducer script, and instructions on
how to set up a devstack environment similar to the one I am using.
To Reproduce:
-------------
Run the script with the delay, time.sleep(10), commented out.
Results:
* 2x StaleDataError
* Only 1 attempt to add host to placement/host-aggregate.
MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| a974ae4c-1389-4e41-9ab9-820165c26acd | host2                           |
| a974ae4c-1389-4e41-9ab9-820165c26acd | routed-devstack.lab.example.com |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | host2                           |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | routed-devstack.lab.example.com |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | host2                           |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+
Conclusions:
* 2x StaleDataError
* 1x successful _update_segment_host_mapping after_create.
*** We should see 3 attempts to add a host to placement/host-aggregate,
one for each host agent. ***
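
The gap in the table above can be checked mechanically. The sketch below
takes the rows from the failing run and reports, per segment, which
expected hosts have no mapping row. The helper name `missing_hosts` is
hypothetical, and the expected host list (host0, host1, host2) is an
assumption taken from the hosts that appear in the run with the delay
enabled.

```python
from collections import defaultdict


def missing_hosts(rows, expected_hosts):
    """Map each segment_id to the sorted expected hosts that have no row."""
    seen = defaultdict(set)
    for segment_id, host in rows:
        seen[segment_id].add(host)
    expected = set(expected_hosts)
    return {
        segment_id: sorted(expected - hosts)
        for segment_id, hosts in seen.items()
        if expected - hosts
    }


# Rows copied from the SELECT output of the failing run.
FAILING_ROWS = [
    ("a974ae4c-1389-4e41-9ab9-820165c26acd", "host2"),
    ("a974ae4c-1389-4e41-9ab9-820165c26acd", "routed-devstack.lab.example.com"),
    ("bc626d3d-5503-4875-9db8-e1bcfad35979", "host2"),
    ("bc626d3d-5503-4875-9db8-e1bcfad35979", "routed-devstack.lab.example.com"),
    ("ec7717dd-8533-464f-a3c8-4ecc7dc08d10", "host2"),
    ("ec7717dd-8533-464f-a3c8-4ecc7dc08d10", "routed-devstack.lab.example.com"),
]

# Assumed set of fake agent hosts used by the reproducer.
EXPECTED_HOSTS = ["host0", "host1", "host2"]

for segment_id, hosts in missing_hosts(FAILING_ROWS, EXPECTED_HOSTS).items():
    print(segment_id, "is missing", ", ".join(hosts))
```

For the failing run this reports host0 and host1 as missing from all three
segments, matching the two StaleDataError occurrences listed above.
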
Running the reproducer script with the delay uncommented (No issue):
--------------------------------------------------------------------
Run the script with the delay, time.sleep(10), enabled.
Results:
* No StaleDataError
* 3 attempts to add the host to placement/host-aggregate.
MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host0                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host1                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host2                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | routed-devstack.lab.example.com |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host0                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host1                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host2                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | routed-devstack.lab.example.com |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host0                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host1                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host2                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+
Conclusion:
* 3x successful _update_segment_host_mapping after_create.
** NOTE: **
The RESP BODY: {"itemNotFound": {"message": "Compute host host1 could not be found.", "code": 404}} errors in the logs are expected, because the fake hosts are not in Nova.
** Affects: neutron
Importance: Undecided
Status: New
** Attachment added: "Logs, a reproducer script and devstack instructions."
https://bugs.launchpad.net/bugs/1743579/+attachment/5037862/+files/reproducer_and_logs.txt
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1743579
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1743579/+subscriptions