yahoo-eng-team team mailing list archive

[Bug 1533460] Re: DBReferenceError raised during race between HA router deleting and L3 agent sync router info

 

Reviewed:  https://review.openstack.org/260303
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d1accc36689fdda5bb5e16d276d5f284bc80a6bc
Submitter: Jenkins
Branch:    master

commit d1accc36689fdda5bb5e16d276d5f284bc80a6bc
Author: LIU Yulong <liuyulong@xxxxxxxx>
Date:   Tue Dec 22 11:02:08 2015 +0800

    Catch DBReferenceError in HA router race conditions
    
    If the router is auto-scheduled during the race, the neutron server
    may treat the HA router as 'not scheduled'. It then creates a new HA
    port binding for the router (id) that is being deleted concurrently,
    and a foreign key constraint error (IntegrityError) is raised.
    
    Change-Id: I81fdd2c971ee4ae5133126b6887ba6ad855ef138
    Closes-Bug: #1533460
    Related-Bug: #1523780


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1533460

Title:
  DBReferenceError raised during race between HA router deleting and L3
  agent sync router info

Status in neutron:
  Fix Released

Bug description:
  Env:
  devstack multi-host, 1 controller, 1 compute, 2 network(l3 agent)
  neutron master branch source code install
  VXLAN

  Description:
  If the router is auto-scheduled during a race between HA router
  creation and deletion, the neutron server may treat the HA router as
  'not scheduled'. It then creates a new HA port binding for the
  router (id) that is being deleted concurrently, and a foreign key
  constraint error (IntegrityError) is raised.

  Exception log:
  http://paste.openstack.org/show/484088/

  DBReferenceError: (IntegrityError) (1452, 'Cannot add or update a
  child row: a foreign key constraint fails
  (`neutron`.`ha_router_agent_port_bindings`, CONSTRAINT
  `ha_router_agent_port_bindings_ibfk_2` FOREIGN KEY (`router_id`)
  REFERENCES `routers` (`id`) ON DELETE CASCADE)') 'INSERT INTO
  ha_router_agent_port_bindings (port_id, router_id, l3_agent_id, state)
  VALUES (%s, %s, %s, %s)' ('xxxxxxxxxxx', 'xxxxxxxxxxxxxxxxxxx', None,
  'standby')
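
  The constraint failure can be reproduced in isolation. The following
  is a minimal sketch, not the neutron code path: it assumes SQLite as
  a stand-in for MySQL, models only the columns shown in the INSERT
  above, and uses a hypothetical helper bind_ha_port and made-up ids.
  The delete and the insert run sequentially here to imitate the order
  in which the race resolves.

```python
import sqlite3

# SQLite stand-in for the MySQL tables named in the error message.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE routers (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE ha_router_agent_port_bindings ("
    " port_id TEXT PRIMARY KEY,"
    " router_id TEXT NOT NULL REFERENCES routers (id) ON DELETE CASCADE,"
    " l3_agent_id TEXT,"
    " state TEXT)"
)


def bind_ha_port(port_id, router_id):
    """Insert a binding row; return False if the parent router row is gone."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ha_router_agent_port_bindings"
                " (port_id, router_id, l3_agent_id, state)"
                " VALUES (?, ?, ?, ?)",
                (port_id, router_id, None, "standby"),
            )
        return True
    except sqlite3.IntegrityError:
        # Same class of failure as MySQL error 1452 in the log above.
        return False


# The router exists, so the first binding succeeds.
conn.execute("INSERT INTO routers (id) VALUES ('router-1')")
conn.commit()
first = bind_ha_port("port-1", "router-1")

# The router is deleted (cascading to its bindings), then a late
# binding insert for the same router id arrives.
conn.execute("DELETE FROM routers WHERE id = 'router-1'")
conn.commit()
second = bind_ha_port("port-2", "router-1")

remaining = conn.execute(
    "SELECT COUNT(*) FROM ha_router_agent_port_bindings"
).fetchone()[0]
```

  In this sketch the second insert is rejected because its router_id
  no longer references an existing row in routers.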

  For more information:
  https://bugs.launchpad.net/neutron/+bug/1523780/

  
  Exception procedure:

  -----------------------Neutron Server Side------------------------
  (1) Create the HA router (create_router)
  (2) Init HA attrs (_set_vr_id, _create_ha_interfaces)
  (3) Schedule and notify the L3 agent (_notify_ha_interfaces_updated)
  ...

  -----------------------L3 Agent Side----------------------------
  (1) Receive a new router update RPC notification (self._queue.add(update))
  (2) Fetch this router's info (self.plugin_rpc.get_routers(self.context, [update.id]))
  (3) RPC call
  ...

  -----------------------Neutron Server Side------------------------
  (1) Get legacy router info (get_sync_data) (RPC worker)
  *(2) A delete router API call arrives (API worker)
  *(3) Remove the HA router's HA port binding attrs (API worker)
  *(4) auto_schedule the HA router because (3) removed its bindings (RPC worker)
  (5) Create the HA port bindings while the HA router is being deleted (RPC worker)
  (6) The DBReferenceError (IntegrityError) is raised
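
  The interleaving above can be sketched deterministically. This is an
  illustration under stated assumptions, not the neutron
  implementation: FakeDB, sync_router and the race hook are
  hypothetical, DBReferenceError is a local stand-in for the oslo.db
  exception of the same name, and skipping the router after catching
  the error is one plausible handling, not necessarily what the commit
  does.

```python
class DBReferenceError(Exception):
    """Local stand-in for the oslo.db exception of the same name."""


class FakeDB:
    """Hypothetical in-memory store enforcing the router_id foreign key."""

    def __init__(self):
        self.routers = set()
        self.bindings = {}  # port_id -> router_id

    def delete_router(self, router_id):
        self.routers.discard(router_id)
        # Imitate ON DELETE CASCADE on the bindings table.
        self.bindings = {
            port: rid for port, rid in self.bindings.items() if rid != router_id
        }

    def add_binding(self, port_id, router_id):
        if router_id not in self.routers:
            raise DBReferenceError(router_id)
        self.bindings[port_id] = router_id


def sync_router(db, router_id, race=lambda: None):
    """RPC worker side: auto-schedule a router that has no bindings."""
    if router_id in db.bindings.values():
        return "already scheduled"
    # Window between the 'not scheduled' check and the insert. The API
    # worker may finish deleting the router here (steps 2 to 4 above).
    race()
    try:
        db.add_binding("ha-port-" + router_id, router_id)
    except DBReferenceError:
        # The router vanished under us; treat it as deleted, not as a failure.
        return "router deleted concurrently"
    return "scheduled"


# No race: the router is scheduled normally.
quiet_db = FakeDB()
quiet_db.routers.add("r1")
no_race = sync_router(quiet_db, "r1")

# Race: the bindings are already gone and the router row is deleted
# inside the window, so the insert hits the foreign key check.
racy_db = FakeDB()
racy_db.routers.add("r1")
with_race = sync_router(racy_db, "r1", race=lambda: racy_db.delete_router("r1"))
```

  Catching the error at the insert keeps the check-then-insert
  sequence simple while still tolerating a delete that lands in
  between.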

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1533460/+subscriptions

