yahoo-eng-team team mailing list archive
Message #73648
[Bug 1744062] Re: L3 HA: multiple agents are active at the same time
As reported by Xav in https://bugs.launchpad.net/ubuntu/+bug/1731595:
"Comment for the folks that are noticing this as 'fix released' but
still affected - see
https://github.com/acassen/keepalived/commit/e90a633c34fbe6ebbb891aa98bf29ce579b8b45c
for the rest of this fix, we need keepalived to be at least 1.4.0 in
order to have this commit."
I just checked and the patch Xav referenced can be backported fairly
cleanly to at least keepalived 1:1.2.19-1 (xenial/mitaka) and above.
** Also affects: keepalived (Ubuntu)
Importance: Undecided
Status: New
** No longer affects: keepalived (Ubuntu Artful)
** Changed in: keepalived (Ubuntu)
Importance: Undecided => High
** Changed in: keepalived (Ubuntu)
Status: New => Triaged
** Changed in: keepalived (Ubuntu Xenial)
Importance: Undecided => High
** Changed in: keepalived (Ubuntu Xenial)
Status: New => Triaged
** Changed in: keepalived (Ubuntu Bionic)
Importance: Undecided => High
** Changed in: keepalived (Ubuntu Bionic)
Status: New => Triaged
** No longer affects: cloud-archive/newton
** No longer affects: neutron (Ubuntu Artful)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1744062
Title:
L3 HA: multiple agents are active at the same time
Status in Ubuntu Cloud Archive:
Triaged
Status in Ubuntu Cloud Archive mitaka series:
Triaged
Status in Ubuntu Cloud Archive ocata series:
Triaged
Status in Ubuntu Cloud Archive pike series:
Triaged
Status in Ubuntu Cloud Archive queens series:
Triaged
Status in neutron:
New
Status in keepalived package in Ubuntu:
Triaged
Status in neutron package in Ubuntu:
Triaged
Status in keepalived source package in Xenial:
Triaged
Status in neutron source package in Xenial:
Triaged
Status in keepalived source package in Bionic:
Triaged
Status in neutron source package in Bionic:
Triaged
Bug description:
This is the same issue reported in
https://bugs.launchpad.net/neutron/+bug/1731595. That bug is marked
'Fix Released', but the issue is still occurring, and since I can't
change its status back to 'New' it seems best to open a new bug.
It seems as if this bug surfaces due to load issues. While the fix
provided by Venkata (https://review.openstack.org/#/c/522641/) should
help clean things up at the time of L3 agent restart, issues seem to
come back later down the line in some circumstances. xavpaice
mentioned he saw multiple routers active at the same time when they
had 464 routers configured on 3 neutron gateway hosts using L3HA, and
each router was scheduled to all 3 hosts. However, jhebden mentions
that things seem stable at the 400 L3HA router mark, and it's worth
noting this is the same deployment that xavpaice was referring to.
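To check a deployment for the symptom described above, the per-router HA state of each hosting agent can be scanned for routers with more than one 'active' entry. The following is a minimal sketch, not part of the original report: the function name, the tuple layout, and the sample data are all hypothetical, and it assumes the HA states have already been collected (for example from the ha_state column shown by `neutron l3-agent-list-hosting-router <router>`).

```python
from collections import defaultdict


def routers_with_multiple_active(bindings):
    """Return {router_id: [hosts]} for routers active on more than one host.

    `bindings` is a hypothetical iterable of (router_id, host, ha_state)
    tuples; ha_state is expected to be "active" or "standby".
    """
    active_hosts = defaultdict(list)
    for router_id, host, ha_state in bindings:
        if ha_state == "active":
            active_hosts[router_id].append(host)
    # A healthy L3 HA router has exactly one active instance.
    return {r: sorted(h) for r, h in active_hosts.items() if len(h) > 1}


# Made-up sample data: router-b is active on two gateway hosts at once.
sample = [
    ("router-a", "gw1", "active"),
    ("router-a", "gw2", "standby"),
    ("router-a", "gw3", "standby"),
    ("router-b", "gw1", "active"),
    ("router-b", "gw2", "active"),
    ("router-b", "gw3", "standby"),
]
print(routers_with_multiple_active(sample))
```

Running a check like this periodically, rather than only after an agent restart, would fit the observation that the problem reappears later under load.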
It seems to me that something is being pushed to its limit, and
possibly once that limit is hit, master router advertisements aren't
being received, causing a new master to be elected. If this is the
case it would be great to get to the bottom of what resource is
getting constrained.
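For a sense of how long advertisements have to go missing before a backup takes over, VRRPv2 (RFC 3768) defines the master-down interval as 3 * Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256 seconds. The sketch below only evaluates that formula. The example values (a 2-second advertisement interval and priority 50) are assumptions about what the deployment uses, not something stated in this report, and should be checked against the actual keepalived configuration.

```python
def master_down_interval(advert_int, priority):
    """VRRPv2 master-down interval in seconds (RFC 3768).

    advert_int: advertisement interval in seconds.
    priority:   VRRP priority of the backup router (1-254).
    """
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


# Assumed example values; verify against the real keepalived config.
print(master_down_interval(advert_int=2, priority=50))
```

Under those assumed values, a backup that misses advertisements for roughly seven seconds would declare the master down, so a host stalled for that long under load could plausibly trigger a second master.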
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1744062/+subscriptions