yahoo-eng-team team mailing list archive
Message #90960
[Bug 1981113] Re: OVN metadata agent can be slow with large amount of subnets
Reviewed: https://review.opendev.org/c/openstack/neutron/+/861124
Committed: https://opendev.org/openstack/neutron/commit/edf48e46a1f0227f84b05ab39da005393e5fa73f
Submitter: "Zuul (22348)"
Branch: master
commit edf48e46a1f0227f84b05ab39da005393e5fa73f
Author: Miro Tomaska <mtomaska@xxxxxxxxxx>
Date: Wed Oct 12 08:42:18 2022 -0500
Improve agent provision performance for large networks
Before this patch, the metadata agent would provision the network
namespace for all subnets under a network (datapath) as soon as the
first VM (VIF port) was mounted on the chassis. This operation can take
a very long time for networks with many subnets. See the linked bug for
more details.
This patch changes this mechanism to "lazy load", where the metadata
agent provisions the metadata namespace with only the subnets belonging
to the active ports on the chassis. This results in virtually constant
throughput that is not affected by the number of subnets.
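The "lazy load" selection described above can be illustrated with a
minimal sketch. This is not the code from the patch: the Port class, the
cidrs_for_active_ports helper, and the example addresses are all
hypothetical, and the sketch assumes each active port exposes a single
IP address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for a VIF port bound to this chassis."""
    name: str
    ip: str


def cidrs_for_active_ports(network_cidrs, active_ports):
    """Return only the network CIDRs that contain an active port address."""
    port_ips = [ipaddress.ip_address(p.ip) for p in active_ports]
    needed = []
    for cidr in network_cidrs:
        subnet = ipaddress.ip_network(cidr)
        # Keep the subnet only if some port on this chassis lives in it.
        if any(ip in subnet for ip in port_ips):
            needed.append(cidr)
    return needed


# A network with 1700 subnets (hypothetical addressing) and one local VM.
network_cidrs = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1700)]
active_ports = [Port(name="vm1", ip="10.0.3.15")]
selected = cidrs_for_active_ports(network_cidrs, active_ports)
print(selected)
```

With one active port, only one subnet is selected for provisioning, no
matter how many subnets the network has.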
Closes-Bug: #1981113
Change-Id: Ia2a66cfd3fd1380c5204109742d44f09160548d2
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1981113
Title:
OVN metadata agent can be slow with large amount of subnets
Status in neutron:
Fix Released
Bug description:
The OVN metadata agent can take a very long time (observed ~40s) to add
cidrs to a metadata namespace tap interface when a network consists
of many subnets (observed ~1700 subnets). The long processing time can
result in ovn-metadata-agent not having haproxy ready by the time the
first VM's cloud-init requests its metadata. As a result, the VM is
missing the metadata it needs for proper operation.
Steps to reproduce:
- Create a network with hundreds or thousands of subnets. The more subnets, the more obvious the problem is
- Create a VM connected to that network. Make sure this is the first VM on the deployed compute node (hypervisor).
- Once the VM is created, observe that the VM's cloud-init requests time out due to no response from 169.254.169.254/openstack
- Inspect the ovn-metadata-agent log and notice that this is due to ovn-metadata-agent taking a very long time to process the cidrs [1]
Possible solutions:
1. (Low-hanging fruit?) See if there is a way to improve the execution time of the `ip.add` call. Perhaps passing a list of cidrs instead of a single cidr at a time can improve performance?
2. (More involved) Refactor the code so that ovn-metadata-agent adds only the single cidr that belongs to the VM being created, instead of unconditionally adding all cidrs for the network when the first VM is created (the current implementation).
[1]
https://github.com/openstack/neutron/blob/41bf8054017c72815226d5df50fd321b30fcba13/neutron/agent/ovn/metadata/agent.py#L488-L495
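One way to picture possible solution 1 is to hand all addresses to a
single iproute2 invocation in batch mode rather than issuing one call
per cidr. The sketch below is an assumption for illustration only: it
is not how the agent's `ip.add` is implemented, build_ip_batch and
apply_batch are hypothetical helpers, the device and namespace names are
placeholders, and whether batching is actually faster here is not
established in this report. Applying the batch requires root privileges.

```python
import subprocess


def build_ip_batch(device, cidrs):
    """Render one `addr add` command per cidr, in `ip -batch` format."""
    return "".join(f"addr add {cidr} dev {device}\n" for cidr in cidrs)


def apply_batch(namespace, script):
    """Feed the batch script to a single `ip` process on stdin."""
    subprocess.run(
        ["ip", "-netns", namespace, "-batch", "-"],
        input=script,
        text=True,
        check=True,
    )


# Hypothetical tap device and addresses; only the script is built here.
script = build_ip_batch("tap0", ["10.0.0.2/24", "10.0.1.2/24"])
print(script, end="")
```

Calling apply_batch("some-namespace", script) would then add every
address with one process spawn instead of one per cidr.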
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1981113/+subscriptions