yahoo-eng-team team mailing list archive
Message #32717
[Bug 1452886] [NEW] Port stuck in BUILD state results in limited instance connectivity
Public bug reported:
I am currently seeing intermittent cases where newly spun-up instances
have limited connectivity. The environment has about 650 instances and
45 networks.
Network Info:
- ML2/LinuxBridge/l2pop
- VXLAN networks
Symptoms:
- On the local compute node, the instance's tap interface is in the bridge and everything looks correct.
- The instance is reachable from some, but not all, instances/devices in the same subnet across the compute and network nodes
- On some compute nodes and network nodes, the ARP and FDB entries for the instance do not exist. Instances/devices on these nodes cannot communicate with the new instance.
- No errors are logged
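To compare nodes for the missing entries described above, something like the following could be run against output captured on each node. This is only a rough sketch: it assumes the text comes from `bridge fdb show` and `ip neigh show` in the usual iproute2 layout, the function names are made up for illustration, and the sample lines in the comments are invented rather than taken from this environment.

```python
def fdb_has_mac(fdb_output: str, mac: str) -> bool:
    """Return True if any FDB line starts with the given MAC address.

    Assumes lines shaped like (invented sample):
        fa:16:3e:11:22:33 dst 192.0.2.11 self permanent
    """
    mac = mac.lower()
    for line in fdb_output.splitlines():
        fields = line.split()
        if fields and fields[0].lower() == mac:
            return True
    return False


def neigh_has_mac(neigh_output: str, mac: str) -> bool:
    """Return True if any neighbour (ARP) line maps to the given MAC.

    Assumes lines shaped like (invented sample):
        10.0.0.5 lladdr fa:16:3e:11:22:33 PERMANENT
    """
    mac = mac.lower()
    for line in neigh_output.splitlines():
        fields = line.split()
        if "lladdr" in fields:
            idx = fields.index("lladdr")
            if idx + 1 < len(fields) and fields[idx + 1].lower() == mac:
                return True
    return False
```

A node where both functions return False for the new instance's MAC would match the symptom of instances on that node being unable to reach it.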
Here are some observations for the non-working instances:
- The corresponding Neutron port is stuck in a BUILD state
- The binding:host_id value of the port (e.g. compute-xxx) does not match the OS-EXT-SRV-ATTR:host value of the instance (e.g. compute-zzz). For working instances, these values match.
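The two observations above can be checked in bulk. The sketch below assumes the port and server records have already been fetched as plain dictionaries carrying the usual API fields (`id`, `device_id`, `status`, `binding:host_id` for ports; `id`, `OS-EXT-SRV-ATTR:host` for servers). The function name is hypothetical, and a port that is only briefly in BUILD during normal provisioning would also be listed, so the result is a list of suspects rather than confirmed failures.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Suspect = Tuple[str, Optional[str], Optional[str], str]


def find_suspect_ports(ports: Iterable[Dict], servers: Iterable[Dict]) -> List[Suspect]:
    """Return (port_id, status, port_host, instance_host) for suspicious ports.

    A port is flagged when it is in BUILD state or when its binding:host_id
    differs from the host the instance is reported to be running on.
    """
    host_by_server = {s["id"]: s.get("OS-EXT-SRV-ATTR:host") for s in servers}
    suspects: List[Suspect] = []
    for port in ports:
        instance_host = host_by_server.get(port.get("device_id"))
        if instance_host is None:
            # Not an instance port (e.g. DHCP or router), or host unknown.
            continue
        port_host = port.get("binding:host_id")
        status = port.get("status")
        if status == "BUILD" or port_host != instance_host:
            suspects.append((port["id"], status, port_host, instance_host))
    return suspects
```

Ports whose `device_id` does not correspond to a known server are skipped, so DHCP and router ports do not show up as false positives.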
I am unable to replicate this consistently at this time, nor am I sure
where to begin pinpointing the issue. Any help is appreciated.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1452886
Title:
Port stuck in BUILD state results in limited instance connectivity
Status in OpenStack Neutron (virtual network service):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1452886/+subscriptions