yahoo-eng-team team mailing list archive
Message #93377
[Bug 2030741] Re: [OVN] Lack of AZs awareness in L3 port scheduler
Reviewed: https://review.opendev.org/c/openstack/neutron/+/892604
Committed: https://opendev.org/openstack/neutron/commit/a29ea3724e1f6bb54b76d1b9915c13014272fdcd
Submitter: "Zuul (22348)"
Branch: master
commit a29ea3724e1f6bb54b76d1b9915c13014272fdcd
Author: Yann Morice <yann.morice@xxxxxxxxx>
Date: Thu Aug 24 15:56:48 2023 +0200
[ovn] AZs distribution in L3 port scheduler
Update the l3 ovn schedulers (chance, leastloaded) to ensure that LRP gateways are distributed over chassis in the different eligible AZs.
The previous version already ensured that LRP gateways were scheduled over chassis in eligible AZs. However, depending on the deployment characteristics, all of these chassis could be in the same AZ. Some use cases need LRP gateways in different AZs to be resilient to failures.
This patch reorders the list of eligible chassis to give priority to selecting chassis in different AZs.
This should provide a solution for users who need to have their router gateways scheduled on chassis from different AZs.
Closes-Bug: #2030741
Change-Id: I72973abbb8b0f9cc5848fd3b4f6463c38c6595f8
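The reordering described in the commit message can be pictured with a small sketch. This is not the code from the patch: the function name interleave_by_az, the (chassis, az) tuple input, and the one-AZ-per-chassis simplification are all assumptions made for illustration. It only shows how a round-robin pass over per-AZ groups puts chassis from different AZs at the front of the candidate list, so that a later "take the first N" step spans several AZs.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_by_az(candidates):
    """Reorder (chassis, az) pairs so consecutive entries alternate AZs.

    Hypothetical helper, not a neutron API. Assumes each chassis belongs
    to exactly one AZ and that `candidates` is already in the scheduler's
    preferred order (random for chance, by load for leastloaded).
    """
    by_az = defaultdict(list)
    for chassis, az in candidates:
        by_az[az].append(chassis)

    ordered = []
    # Take one chassis from each AZ per round; AZs that run out are skipped.
    for round_ in zip_longest(*by_az.values()):
        ordered.extend(c for c in round_ if c is not None)
    return ordered


candidates = [
    ("gw1", "az1"), ("gw2", "az1"), ("gw3", "az1"),
    ("gw4", "az1"), ("gw5", "az1"), ("gw6", "az1"),
    ("gw7", "az2"), ("gw8", "az2"),
]
reordered = interleave_by_az(candidates)
top_five = reordered[:5]  # 5 mirrors MAX_GW_CHASSIS from the bug report
```

With six candidates in az1 and two in az2, the first five entries of the reordered list include chassis from both AZs, whereas the first five of the original list are all in az1.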
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2030741
Title:
[OVN] Lack of AZs awareness in L3 port scheduler
Status in neutron:
Fix Released
Bug description:
The OVN L3 port scheduler assigns router ports to gateway chassis.
It retrieves the chassis list from nodes configured as gateways
(external_ids:ovn-cms-options=enable-chassis-as-gw). This list can
be filtered by availability zones. In this case, the scheduler
filters out chassis from invalid AZs (scheduler/l3_ovn_scheduler.py).
As a result, we have a list of all eligible chassis for gateway ports,
across all AZs where the ports could be scheduled.
Then, both the chance and leastloaded schedulers select 5 nodes from this
list (hardcoded in common/ovn/constants.py:MAX_GW_CHASSIS = 5)
regardless of AZ membership. Everything seems OK, but when more than 5
nodes are available in one of the AZs, all the gateways for a router can
be scheduled in *only* one AZ.
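The failure mode can be reproduced with a toy model. The sketch below is a simplified stand-in for a leastloaded-style selection, not the neutron scheduler itself: pick_least_loaded, the chassis names, the load figures, and the AZ mapping are all hypothetical. Only the limit of 5 comes from the report.

```python
MAX_GW_CHASSIS = 5  # value quoted in the bug report


def pick_least_loaded(load_by_chassis, limit=MAX_GW_CHASSIS):
    """Return up to `limit` chassis with the lowest load, ignoring AZs.

    Hypothetical stand-in for an AZ-unaware scheduler.
    """
    return sorted(load_by_chassis, key=load_by_chassis.get)[:limit]


# Six eligible gateway chassis in az1, two in az2 (illustrative data).
az_of = {
    "gw1": "az1", "gw2": "az1", "gw3": "az1",
    "gw4": "az1", "gw5": "az1", "gw6": "az1",
    "gw7": "az2", "gw8": "az2",
}
load = {
    "gw1": 0, "gw2": 0, "gw3": 1, "gw4": 1,
    "gw5": 2, "gw6": 2, "gw7": 3, "gw8": 3,
}

selected = pick_least_loaded(load)
azs = {az_of[chassis] for chassis in selected}
```

Here every selected chassis sits in az1 even though az2 has eligible chassis, so losing az1 would take down all gateways for the router.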
In some use cases, where AZs are mapped to “failure domains”, this
could be a problem. In OVS l3_ha mode, router instances were
placed by “neutron.scheduler.l3_agent_scheduler.AZ*Scheduler”, which
took AZs into account, and so were their ports. This does not currently
seem to be feasible out of the box with OVN.