yahoo-eng-team team mailing list archive

[Bug 1691674] [NEW] Restarting l3 agent causes split-brain in HA router

 

Public bug reported:

Restarting the neutron-l3-agent that hosts the master instance of an HA
router causes split-brain in the HA router.

How to reproduce:
  Create an HA router
  Set a gateway for the HA router
  Restart the neutron-l3-agent that hosts the master instance of the HA router
  Split-brain occurs in the HA router

Analysis:
  1. When the neutron-l3-agent that hosts the master instance of the HA router is restarted, the keepalived process for the master router is killed and the backup router becomes master.
  2. After the neutron-l3-agent has started again, the keepalived state of the original master router is backup, but the VIPs of the original master router are not deleted. Both routers then hold the VIPs, and split-brain occurs.
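
The condition described in step 2 can be sketched as a check for VIPs that are configured on more than one host at the same time. This is only an illustration: the function name `duplicated_vips`, the host names, and the address are hypothetical and do not come from neutron.

```python
from collections import defaultdict


def duplicated_vips(addresses_by_host):
    """Return the VIPs that are configured on more than one host.

    addresses_by_host maps a host name to the set of VIPs currently
    configured for the router instance on that host (hypothetical input).
    """
    holders = defaultdict(list)
    for host, addresses in addresses_by_host.items():
        for addr in addresses:
            holders[addr].append(host)
    # A VIP held by two or more hosts is the split-brain symptom.
    return {addr: sorted(hosts)
            for addr, hosts in holders.items() if len(hosts) > 1}


if __name__ == '__main__':
    # State after the restart, as described in the analysis: the old
    # master kept its VIP and the old backup has taken it over as well.
    after_restart = {
        'node-1': {'192.0.2.10'},
        'node-2': {'192.0.2.10'},
    }
    print(duplicated_vips(after_restart))
```

An empty result means that every VIP is held by at most one host.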

Proposed fix:
 When starting keepalived for an HA router, read the state file and use it as the initial keepalived state ('MASTER' if self.ha_state == 'master' else 'BACKUP').
 If two or more keepalived processes are master, they will elect a single real master, and the other keepalived processes become backup and remove their VIPs.
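
A minimal sketch of the idea above, assuming the last known HA state is persisted in a plain-text file containing 'master' or 'backup'. The function name `initial_vrrp_state`, the file layout, and the fallback to 'BACKUP' are hypothetical and are not taken from the neutron source; only the 'MASTER'/'BACKUP' expression comes from this report.

```python
import os
import tempfile


def initial_vrrp_state(state_path):
    """Return the initial keepalived state for a router instance.

    Hypothetical helper: reads the persisted HA state and maps it to
    'MASTER' or 'BACKUP'. A missing or unreadable file is treated as
    'backup' (an assumption made for this sketch).
    """
    try:
        with open(state_path) as f:
            ha_state = f.read().strip().lower()
    except OSError:
        ha_state = 'backup'
    return 'MASTER' if ha_state == 'master' else 'BACKUP'


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, 'state')
        with open(path, 'w') as f:
            f.write('master\n')
        print(initial_vrrp_state(path))
        print(initial_vrrp_state(os.path.join(d, 'missing')))
```

With this approach the original master would start as MASTER again, so, as the report proposes, the keepalived processes would settle on one master between themselves and the other would drop its VIPs.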

** Affects: neutron
     Importance: Undecided
     Assignee: sunzuohua (zuohuasun)
         Status: New


** Tags: l3-bgp

** Tags added: l3-bgp

** Changed in: neutron
     Assignee: (unassigned) => sunzuohua (zuohuasun)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1691674

Title:
  Restarting l3 agent causes split-brain in HA router

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1691674/+subscriptions

