yahoo-eng-team team mailing list archive
Message #89274
[Bug 1981113] [NEW] OVN metadata agent can be slow with large amount of subnets
Public bug reported:
The OVN metadata agent can take a very long time (observed ~40s) to add CIDRs
to the tap interface in a metadata namespace when a network consists of many
subnets (observed ~1700 subnets). Because of the long processing time,
ovn-metadata-agent may not have haproxy ready by the time cloud-init on the
first VM requests its metadata. The VM then boots without the metadata it
needs to operate properly.
Steps to reproduce:
- Create a network with thousands of subnets.
- Create a VM connected to that network. Make sure it is the first VM on the compute node (hypervisor) where it is deployed. Observe that the VM's cloud-init requests time out because there is no response from 169.254.169.254/openstack.
- Observe in the ovn-metadata-agent logs that the agent is probably still executing, or was executing, this code [1].
Possible solutions:
1. (Low-hanging fruit?) See if there is a way to improve the execution time of the `ip.add` call. Perhaps passing a list of CIDRs instead of a single CIDR at a time would improve performance?
2. (More involved) Refactor the code so that ovn-metadata-agent adds only the single CIDR that belongs to the VM being created, instead of unconditionally adding all CIDRs for the network when the first VM is created (the current implementation).
[1]
https://github.com/openstack/neutron/blob/41bf8054017c72815226d5df50fd321b30fcba13/neutron/agent/ovn/metadata/agent.py#L488-L495
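A rough sketch of both ideas, written standalone and not taken from the neutron tree. The function names (`build_ip_batch`, `apply_batch`, `cidrs_for_port`) are hypothetical. For idea 1 it assumes that handing all addresses to a single `ip -batch` process is cheaper than one call per CIDR, which has not been measured here. For idea 2 it assumes the agent knows the new VM port's fixed IPs and that only IPv4 addresses are wanted.

```python
import ipaddress
import subprocess


def build_ip_batch(device, cidrs):
    """Return `ip -batch` input that adds every IPv4 CIDR to `device`."""
    lines = []
    for cidr in sorted(cidrs):
        # Assumption: only IPv4 addresses are wanted on the metadata tap.
        if ipaddress.ip_interface(cidr).version == 4:
            lines.append(f"addr add {cidr} dev {device}")
    return "\n".join(lines) + "\n" if lines else ""


def apply_batch(namespace, batch_text):
    """Feed the batch to one `ip` process inside `namespace` (needs root)."""
    # `-batch -` asks iproute2 to read the commands from standard input.
    subprocess.run(
        ["ip", "-netns", namespace, "-batch", "-"],
        input=batch_text,
        text=True,
        check=True,
    )


def cidrs_for_port(port_ips, metadata_cidrs):
    """Keep only the metadata CIDRs whose subnet contains a port IP."""
    addrs = [ipaddress.ip_address(ip) for ip in port_ips]
    selected = set()
    for cidr in metadata_cidrs:
        network = ipaddress.ip_interface(cidr).network
        # Membership is False when the IP versions differ.
        if any(addr in network for addr in addrs):
            selected.add(cidr)
    return selected


if __name__ == "__main__":
    # Example values only; no system state is touched here.
    metadata_cidrs = ["10.0.0.2/24", "10.0.1.2/24", "fd00::2/64"]
    wanted = cidrs_for_port(["10.0.1.15"], metadata_cidrs)
    print(build_ip_batch("tap0", wanted), end="")
```

The two helpers can be combined: select the CIDRs for the port first, then apply whatever is left in one batch. Whether either change fits the agent's existing address sync logic would need to be checked against the code in [1].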
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1981113