
yahoo-eng-team team mailing list archive

[Bug 1743579] [NEW] Concurrent report_state from multiple agents: segment_host_mapping fails - StaleDataError

 

Public bug reported:

When multiple host agents call report_state for the first time in rapid
succession, we get StaleDataError and
_update_segment_host_mapping_for_agent does not complete for all hosts.

Attached is a file with logs, a reproducer script, and instructions on
how to set up a devstack environment similar to the one I am using.

To Reproduce:
-------------

Run the script with the delay, time.sleep(10), commented out.
Results:
  * 2x StaleDataError
  * Only 1 attempt to add a host to placement/host-aggregate.

MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| a974ae4c-1389-4e41-9ab9-820165c26acd | host2                           |
| a974ae4c-1389-4e41-9ab9-820165c26acd | routed-devstack.lab.example.com |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | host2                           |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | routed-devstack.lab.example.com |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | host2                           |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+


Conclusions:
  * 2x StaleDataError
  * 1x successful _update_segment_host_mapping after_create.

*** We should see 3x attempts to add to placement/host-aggregate, one
for each host agent. ***
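
The gap in the table above can be checked mechanically. The following is
a minimal sketch, not part of the attached reproducer: it parses the
MariaDB client output and lists, per segment, which of the fake hosts
have no row. The helper names (parse_mappings, missing_hosts) and the
expected host list (host0, host1, host2, taken from the run with the
delay enabled) are assumptions made for illustration.

```python
from collections import defaultdict

# Rows copied from the failing run (delay commented out).
FAILING_RUN = """
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| a974ae4c-1389-4e41-9ab9-820165c26acd | host2                           |
| a974ae4c-1389-4e41-9ab9-820165c26acd | routed-devstack.lab.example.com |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | host2                           |
| bc626d3d-5503-4875-9db8-e1bcfad35979 | routed-devstack.lab.example.com |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | host2                           |
| ec7717dd-8533-464f-a3c8-4ecc7dc08d10 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+
"""


def parse_mappings(table_text):
    """Parse MariaDB client table output into {segment_id: set of hosts}."""
    mappings = defaultdict(set)
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "segment_id":
            continue  # skip the header row
        segment_id, host = cells
        mappings[segment_id].add(host)
    return dict(mappings)


def missing_hosts(mappings, expected_hosts):
    """Return {segment_id: sorted missing hosts} for incomplete segments."""
    expected = set(expected_hosts)
    return {
        segment_id: sorted(expected - hosts)
        for segment_id, hosts in mappings.items()
        if expected - hosts
    }


if __name__ == "__main__":
    # Hypothetical expectation: one mapping per fake host agent.
    result = missing_hosts(parse_mappings(FAILING_RUN),
                           ["host0", "host1", "host2"])
    for segment_id, hosts in sorted(result.items()):
        print(segment_id, "missing:", ", ".join(hosts))
```

For the rows above, every segment reports host0 and host1 as missing,
which matches the two StaleDataError occurrences.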


Running the reproducer script with the delay uncommented (No issue):
--------------------------------------------------------------------

Run the script with the delay, time.sleep(10), enabled.
Results:
  * No StaleDataError
  * 3 attempts to add the host to placement/host-aggregate.

MariaDB [neutron]> SELECT * FROM segmenthostmappings;
+--------------------------------------+---------------------------------+
| segment_id                           | host                            |
+--------------------------------------+---------------------------------+
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host0                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host1                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | host2                           |
| 11b9258f-8712-43b7-8f39-3eab627a8c7f | routed-devstack.lab.example.com |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host0                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host1                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | host2                           |
| 89f96bee-424c-4ee2-8639-2ca8e07a70e6 | routed-devstack.lab.example.com |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host0                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host1                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | host2                           |
| a7a7d2f4-c809-4ebb-916f-930c97fbec47 | routed-devstack.lab.example.com |
+--------------------------------------+---------------------------------+


Conclusion:
  * 3x successful _update_segment_host_mapping after_create.
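
The only difference between the two runs is the pause between reports.
The actual reproducer is in the attachment and is not shown here; the
outline below is only a hypothetical sketch of the two modes. The
function send_reports and its report_state callable are illustrative
names, not Neutron APIs, and where exactly the real script sleeps is an
assumption.

```python
import time


def send_reports(hosts, report_state, delay=0.0, sleep=time.sleep):
    """Call report_state(host) for each host in order.

    With delay == 0 the calls are issued back to back (the failing case
    in this report). With delay > 0 the function pauses between calls
    (the case where no StaleDataError was seen).
    """
    for index, host in enumerate(hosts):
        if index and delay:
            sleep(delay)  # assumption: the pause sits between reports
        report_state(host)


if __name__ == "__main__":
    # Stand-in for the real RPC call, which is not reproduced here.
    def fake_report_state(host):
        print("report_state from", host)

    send_reports(["host0", "host1", "host2"], fake_report_state, delay=0.0)
```

Passing delay=10 corresponds to running with time.sleep(10) enabled.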


** NOTE: **
The RESP BODY: {"itemNotFound": {"message": "Compute host host1 could not be found.", "code": 404}} errors in the logs are expected: the fake host is not in Nova.

** Affects: neutron
     Importance: Undecided
         Status: New

** Attachment added: "Logs, a reproducer script and devstack instructions."
   https://bugs.launchpad.net/bugs/1743579/+attachment/5037862/+files/reproducer_and_logs.txt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1743579


