yahoo-eng-team team mailing list archive
Message #52973
[Bug 1597461] [NEW] L3 HA + DVR: 2 masters after reboot of controller
Public bug reported:
ENV: Mitaka, 3 controllers, 45 computes, DVR + L3 HA
After a reboot of the controller on which the l3 agent is active, another
l3 agent becomes active. When the rebooted node recovers, its l3 agent
becomes active as well - this leads to additional loss of external
connectivity in the tenant network. After some time only one agent remains
active - the one on the rebooted node. Sometimes connectivity does not
come back, as the snat port ends up on the wrong host.
The root cause of this problem is that routers are processed by the l3
agent before the openvswitch agent sets up the appropriate ha ports, so for
some time the recovered ha routers are isolated from the ha routers on
other hosts and become active.
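The effect of the isolation can be illustrated with a toy model of a VRRP-style election. This is only a sketch, not keepalived or neutron code: the function `elect_masters`, the node names and the priority values are hypothetical and chosen purely for illustration. The assumption is that an instance stays backup only while it hears advertisements from a peer with a higher priority.

```python
def elect_masters(priorities, reachable):
    """Return the set of nodes that consider themselves master.

    priorities: dict mapping node -> priority (higher wins, values unique).
    reachable:  dict mapping node -> set of peers whose advertisements
                the node can currently hear.
    A node stays backup only if it hears a peer with a higher priority.
    """
    masters = set()
    for node, prio in priorities.items():
        heard = [priorities[peer] for peer in reachable.get(node, set())]
        if not any(p > prio for p in heard):
            masters.add(node)
    return masters


# Hypothetical priorities for the three controllers.
priorities = {"ctl1": 50, "ctl2": 51, "ctl3": 49}

# All ha ports wired: every node hears every other node.
full_mesh = {n: set(priorities) - {n} for n in priorities}

# ctl3 has just rebooted and its ha port is not wired yet:
# it hears nobody, and nobody hears it.
ctl3_isolated = {"ctl1": {"ctl2"}, "ctl2": {"ctl1"}, "ctl3": set()}

print(sorted(elect_masters(priorities, full_mesh)))      # ['ctl2']
print(sorted(elect_masters(priorities, ctl3_isolated)))  # ['ctl2', 'ctl3']
```

In the isolated case the rebooted node hears no advertisements at all, so it declares itself master next to the existing one, which matches the two-masters symptom described above.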
A possible solution is proper serialization: the l3 agent should set up
its ha routers only after the ha network has been set up on the controller.
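One way to picture this serialization is a gate in front of router processing. The sketch below is an assumption about the shape of such a fix, not the actual neutron change: `wait_until_port_active`, `process_ha_router`, `get_port_status` and `start_vrrp` are hypothetical names, and the "ACTIVE" status string stands in for whatever signal tells the l3 agent that the openvswitch agent has wired the ha port.

```python
import time


def wait_until_port_active(get_port_status, timeout=60.0, interval=1.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll until the ha port is reported ACTIVE or the timeout expires.

    get_port_status is a hypothetical callable returning a status string
    such as "DOWN" or "ACTIVE".
    """
    deadline = clock() + timeout
    while True:
        if get_port_status() == "ACTIVE":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def process_ha_router(router_id, get_port_status, start_vrrp, **wait_kwargs):
    """Start VRRP for a router only once its ha port is wired.

    Returns False without starting anything if the port never becomes
    ACTIVE, so the caller can retry later instead of starting an
    isolated instance that would elect itself master.
    """
    if not wait_until_port_active(get_port_status, **wait_kwargs):
        return False
    start_vrrp(router_id)
    return True
```

With a gate like this, a router whose ha port is still down after the timeout is left for a later retry rather than started in isolation.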
With 100 routers and networks this issue has been reproduced on every
reboot.
** Affects: neutron
Importance: Undecided
Status: New
** Tags: l3-dvr-backlog l3-ha
** Attachment added: "openvswitch agent logs"
https://bugs.launchpad.net/bugs/1597461/+attachment/4692415/+files/neutron-openvswitch-agent.log.3.gz
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1597461
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1597461/+subscriptions