yahoo-eng-team team mailing list archive
Message #24072
[Bug 1385234] [NEW] OVS tunneling between multiple neutron nodes misconfigured if amqp is restarted
Public bug reported:
After a deployment with multiple controllers completes, inspecting the
GRE tunnels that the neutron ovs-agent creates in OVS shows that some
neutron nodes may be missing the tunnels between them.
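As a rough illustration of the symptom, the sketch below lists node pairs that lack a tunnel. It assumes a full mesh is expected and that the remote_ip of each node's GRE ports has been collected by hand (for example from `ovs-vsctl show`). The function name, the data layout and the IP addresses are hypothetical and not part of neutron.

```python
def missing_tunnels(remote_ips_by_node):
    """Return (local, remote) pairs for which `local` has no tunnel port.

    `remote_ips_by_node` maps each node's tunnel IP to the set of
    remote_ip values found on that node's GRE ports (hypothetical
    layout, gathered manually for this sketch).
    """
    nodes = sorted(remote_ips_by_node)
    missing = []
    for local in nodes:
        for remote in nodes:
            # In a full mesh every node has a port towards every other node.
            if remote != local and remote not in remote_ips_by_node[local]:
                missing.append((local, remote))
    return missing


# Example data only: 10.0.0.2 and 10.0.0.3 never learned about each other.
observed = {
    "10.0.0.1": {"10.0.0.2", "10.0.0.3"},
    "10.0.0.2": {"10.0.0.1"},
    "10.0.0.3": {"10.0.0.1"},
}
gaps = missing_tunnels(observed)
```

An empty result means every node has a port towards every other node.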
This happens because ovs-agents get disconnected from the rabbit cluster
without noticing and, as a result, can neither receive updates from
other nodes nor publish their own.
The disconnection may happen after a rabbit node is reconfigured, after
the VIP moves to a different node, or even _during_ deployment because
of the rabbit cluster configuration.
This was observed using Kombu 3.0.33 as well as 2.5.
Using an aggressive (low) kernel keepalive probe interval seems to
improve reliability, but a more appropriate fix seems to be heartbeat
support in oslo.messaging.
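The keepalive workaround can be sketched per socket. The report most likely refers to system-wide settings (on Linux, the net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl and net.ipv4.tcp_keepalive_probes sysctls), so the per-socket variant below is only an assumption-laden illustration. The helper name and the timing values are hypothetical and were not taken from the report.

```python
import socket


def enable_aggressive_keepalive(sock, idle=5, interval=1, count=5):
    """Turn on TCP keepalive with short timers on an existing socket.

    idle/interval/count are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket timer options are platform specific (present on
    # Linux); skip any that this platform does not expose.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # seconds idle before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the drop
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


sock = enable_aggressive_keepalive(
    socket.socket(socket.AF_INET, socket.SOCK_STREAM)
)
```

With these example values a dead peer would be detected after roughly idle + interval * count seconds. Keepalive only detects dead TCP connections, which is why an AMQP-level heartbeat is suggested as the more appropriate fix.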
** Affects: neutron
Importance: Undecided
Status: New
** Affects: oslo.messaging
Importance: Undecided
Status: New
** Affects: tripleo
Importance: High
Status: New
** Also affects: oslo.messaging
Importance: Undecided
Status: New
** Also affects: neutron
Importance: Undecided
Status: New
** Summary changed:
- OVS tunneling between multiple neutron nodes breaks if amqp is restarted
+ OVS tunneling between multiple neutron nodes misconfigured if amqp is restarted
** Description changed:
+ This was observed using Kombu 3.0.33 as well as 2.5.
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1385234
Title:
OVS tunneling between multiple neutron nodes misconfigured if amqp is
restarted
Status in OpenStack Neutron (virtual network service):
New
Status in Messaging API for OpenStack:
New
Status in tripleo - openstack on openstack:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1385234/+subscriptions