
yahoo-eng-team team mailing list archive

[Bug 1533441] Re: HA router can not be deleted in L3 agent after race between HA router creating and deleting

 

Reviewed:  https://review.openstack.org/285572
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=046be0b8f30291cd029e6e97a4c6c5a1717a8bd1
Submitter: Jenkins
Branch:    master

commit 046be0b8f30291cd029e6e97a4c6c5a1717a8bd1
Author: Kevin Benton <kevin@xxxxxxxxxx>
Date:   Wed Feb 24 13:30:24 2016 -0800

    Filter HA routers without HA interface and state
    
    This patch adjusts the sync method to exclude any HA
    routers from the response that are missing necessary
    HA fields (the HA interface and the HA state).
    
    This prevents the agent from ever receiving a partially
    formed router.
    
    Co-Authored-By: Ann Kamyshnikova <akamyshnikova@xxxxxxxxxxxx>
    
    Related-Bug: #1499647
    Closes-Bug: #1533441
    Change-Id: Iadb5a69d4cbc2515fb112867c525676cadea002b
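
The filtering described in the commit message can be sketched roughly as
follows. This is a minimal illustration, not the actual neutron code: the
function name and the dictionary keys "ha", "ha_interface" and "ha_state"
are assumptions chosen for readability.

```python
def filter_incomplete_ha_routers(routers):
    """Drop HA routers that lack an HA interface or an HA state.

    Assumption: each router is a plain dict, and the key names used here
    ("ha", "ha_interface", "ha_state") are hypothetical stand-ins for the
    fields the sync method inspects.
    """
    complete = []
    for router in routers:
        is_ha = bool(router.get("ha"))
        has_ha_fields = bool(router.get("ha_interface")) and bool(router.get("ha_state"))
        if is_ha and not has_ha_fields:
            # Partially formed HA router: leave it out of the sync response.
            continue
        complete.append(router)
    return complete


routers = [
    {"id": "r1", "ha": False},
    {"id": "r2", "ha": True, "ha_interface": {"id": "p1"}, "ha_state": "active"},
    {"id": "r3", "ha": True, "ha_interface": None, "ha_state": None},
]
kept_ids = [r["id"] for r in filter_incomplete_ha_routers(routers)]
```

Non-HA routers pass through untouched; only HA routers that are missing
either field are withheld from the agent.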


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1533441

Title:
  HA router can not be deleted in L3 agent after race between HA router
  creating and deleting

Status in neutron:
  Fix Released

Bug description:
  HA router can not be deleted in L3 agent after race between HA router
  creating and deleting

  Exceptions:
  1. Unable to process HA router %s without HA port (during HA router
  initialization)

  2. AttributeError: 'NoneType' object has no attribute 'config' (during
  HA router deletion)

  
  With the newest neutron code, I found an infinite loop in _safe_router_removed.
  Consider an HA router without an HA port that has been placed in the L3 agent,
  usually because of the race condition.

  Infinite loop steps:
  1. An RPC to delete the HA router arrives.
  2. The L3 agent removes the router.
  3. RouterInfo deletes the router namespace (self.router_namespace.delete()).
  4. HaRouter runs ha_router.delete(), which raises AttributeError: 'NoneType' object has no attribute 'config' or some other error.
  5. _safe_router_removed returns False.
  6. self._resync_router(update) is called.
  7. The router namespace no longer exists, so a RuntimeError is raised; go to step 5. Steps 5 - 7 repeat forever.
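
  The loop above can be reproduced with a toy model. This is only a sketch
  under stated assumptions: FakeAgent, its attributes, and process_delete are
  hypothetical stand-ins, not neutron classes, and the retry count is capped
  so the demonstration terminates.

```python
class FakeAgent:
    """Toy stand-in for the L3 agent; every name here is hypothetical."""

    def __init__(self):
        self.namespace_exists = True
        self.ha_manager = None  # HA router was never fully initialized
        self.errors = []

    def _router_removed(self):
        if not self.namespace_exists:
            # Step 7: the namespace was already deleted on the first pass.
            raise RuntimeError("router namespace does not exist")
        self.namespace_exists = False  # Step 3: delete the namespace.
        self.ha_manager.config         # Step 4: raises AttributeError on None.

    def safe_router_removed(self):
        # Step 5: any failure is swallowed and reported as False.
        try:
            self._router_removed()
        except Exception as exc:
            self.errors.append(type(exc).__name__)
            return False
        return True


def process_delete(agent, max_resyncs=5):
    """Retry removal; the real loop is unbounded, this one is capped."""
    for _ in range(max_resyncs):
        if agent.safe_router_removed():
            return True
        # Step 6: a failed removal is queued for resync and tried again.
    return False


agent = FakeAgent()
removed = process_delete(agent)
```

  The first attempt fails with AttributeError and every later attempt fails
  with RuntimeError, so removal never succeeds no matter how often it is
  retried.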

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1533441/+subscriptions

