
yahoo-eng-team team mailing list archive

[Bug 1602320] [NEW] ha + distributed router: keepalived process kill vrrp child process

 

Public bug reported:

Code Repo: mitaka
keepalived version: 1.2.13
node mode: 4 nodes (containers), dvr_snat (l3 agent_mode)
OS: CentOS 7

I set both router_distributed and l3_ha to True. Then I create a
router; the neutron l3-agent-list-hosting-router command shows
1 active and 3 standby agents for it.

After I add a router interface, more than one agent is reported as active.
I checked /var/log/messages on the node hosting the originally active l3 agent:
2016-07-12T16:33:32.083140+08:00 localhost Keepalived[1320437]: VRRP child process(1320438) died: Respawning
2016-07-12T16:33:32.083613+08:00 localhost Keepalived[1320437]: Starting VRRP child process, pid=1340135
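
A minimal sketch for spotting such respawn events in a syslog file, assuming the line format is exactly as in the two lines above. The helper name find_vrrp_respawns is hypothetical and is not part of neutron or keepalived.

```python
import re

# Assumed format, taken from the log excerpt in this report:
# <timestamp> <host> Keepalived[<parent pid>]: VRRP child process(<child pid>) died: Respawning
RESPAWN_RE = re.compile(
    r"^(?P<ts>\S+) \S+ Keepalived\[(?P<parent>\d+)\]: "
    r"VRRP child process\((?P<child>\d+)\) died: Respawning"
)


def find_vrrp_respawns(lines):
    """Return (timestamp, parent_pid, child_pid) for each respawn line."""
    events = []
    for line in lines:
        match = RESPAWN_RE.match(line.strip())
        if match:
            events.append(
                (match.group("ts"),
                 int(match.group("parent")),
                 int(match.group("child")))
            )
    return events


# The two lines from /var/log/messages quoted in this report.
sample = [
    "2016-07-12T16:33:32.083140+08:00 localhost Keepalived[1320437]: "
    "VRRP child process(1320438) died: Respawning",
    "2016-07-12T16:33:32.083613+08:00 localhost Keepalived[1320437]: "
    "Starting VRRP child process, pid=1340135",
]
events = find_vrrp_respawns(sample)
```

Running the same function over open("/var/log/messages") on each l3 agent node would show whether the extra active agents line up in time with a respawn on the original master.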

Strace info:
http://paste.openstack.org/show/530791/

The failure is intermittent; sometimes only 1 agent is active. It may
be related to the environment, because I cannot reproduce it in VMs.

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: l3-dvr-backlog l3-ha

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1602320

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1602320/+subscriptions

