
yahoo-eng-team team mailing list archive

[Bug 2045757] Re: Race condition while waiting for L2 agent to be DOWN

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/902762
Committed: https://opendev.org/openstack/neutron/commit/58dcd30dbba67464f6fd7880ce7aee543156af65
Submitter: "Zuul (22348)"
Branch:    master

commit 58dcd30dbba67464f6fd7880ce7aee543156af65
Author: Slawek Kaplonski <skaplons@xxxxxxxxxx>
Date:   Wed Dec 6 12:56:30 2023 +0100

    [Fullstack] Double check that agent is dead when it should be dead
    
    In some fullstack tests the agent is expected to be DOWN in the Neutron
    DB. Occasionally the test's client issued a GET /v2.0/agents/{agent_id}
    call and got a result with "alive=False" while, at almost the same time,
    an rpc worker in another thread was processing a heartbeat from the
    agent, so the agent was revived just after the API request finished.
    That caused test failures in some cases.
    This patch adds a second API call that fetches the agent again after
    2 seconds if it was already marked as DEAD, just to make sure that it
    is really dead ;)
    
    Closes-Bug: #2045757
    Change-Id: I1c20c90b8abd760f3a53b24024f19ef2bd189b5a
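
The double check described in the commit message can be sketched roughly as
follows. This is not the code from the patch: `wait_until_agent_dead` and
the `get_agent` callable are hypothetical names, with `get_agent` standing in
for the GET /v2.0/agents/{agent_id} call and assumed to return a dict with an
"alive" key. Only the 2 second re-check delay comes from the commit message;
the timeout and poll interval are arbitrary illustration values.

```python
import time


def wait_until_agent_dead(get_agent, agent_id, timeout=60.0,
                          poll_interval=1.0, recheck_delay=2.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Return once the agent is reported dead on two consecutive reads.

    get_agent is a hypothetical callable returning a dict with an
    "alive" key; it is an assumption, not Neutron's client API.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if not get_agent(agent_id)["alive"]:
            # First read says dead, but a heartbeat may still be in
            # flight, so wait a little and read the agent again.
            sleep(recheck_delay)
            if not get_agent(agent_id)["alive"]:
                return
            # The agent was revived in between; keep waiting.
        sleep(poll_interval)
    raise TimeoutError(f"agent {agent_id} still alive after {timeout}s")
```

A single "alive=False" read is treated as provisional here; only a second
dead read after the delay ends the wait.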


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2045757

Title:
  Race condition while waiting for L2 agent to be DOWN

Status in neutron:
  Fix Released

Bug description:
  In Fullstack tests:
  neutron.tests.fullstack.test_ports_rebind.TestVMPortRebind.test_vm_port_rebound_when_L2_agent_revived
  and
  neutron.tests.fullstack.test_ports_rebind.TestRouterPortRebind.test_vm_port_rebound_when_L2_agent_revived
  the L2 agent is disabled, the test waits for the agent to be DOWN, and
  then it tries to create a port, which is expected to be marked as
  "binding failed" because the agent on the compute node is dead.

  In some cases like:
  http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_abf/901827/1/check/neutron-fullstack-with-uwsgi/abf43a8/testr_results.html
  or
  http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_607/901894/6/check/neutron-fullstack-with-uwsgi/6071fab/testr_results.html

  it may happen that the L2 agent is already found dead but, immediately
  after that is reported to the client, it is revived because a
  heartbeat was just received. In the meantime the test's client creates
  a port, expecting that the port will fail to bind, but it is actually
  bound properly and the test fails.
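
  The interleaving described above can be illustrated with a toy model.
  This is only a sketch under assumptions: `FakeAgent`, its fields, and the
  timestamps are made up for illustration and are not Neutron's schema or
  code. It assumes liveness is derived from the time since the last
  heartbeat compared against a down-time threshold.

```python
from dataclasses import dataclass


@dataclass
class FakeAgent:
    """Toy model of an agent record; not Neutron's actual schema."""
    last_heartbeat: float
    down_time: float  # seconds without a heartbeat before "dead"

    def is_alive(self, now: float) -> bool:
        return now - self.last_heartbeat < self.down_time

    def heartbeat(self, now: float) -> None:
        self.last_heartbeat = now


def racy_sequence():
    agent = FakeAgent(last_heartbeat=0.0, down_time=10.0)
    # t=10.0: the test's API read sees the agent as dead.
    seen_by_test = agent.is_alive(now=10.0)
    # t=10.1: a late heartbeat is processed by the rpc worker.
    agent.heartbeat(now=10.1)
    # t=10.2: the port is created; binding now sees a live agent.
    seen_by_binding = agent.is_alive(now=10.2)
    return seen_by_test, seen_by_binding
```

  In this model the test observes a dead agent while the port binding a
  moment later observes a live one, which is the mismatch that makes the
  test fail.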

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2045757/+subscriptions


