Message #81393
[Bug 1860521] [NEW] L2 pop notifications are not reliable
Public bug reported:
Problem: nodes/VMs in an L2 segment can lose connectivity (e.g. missing
VXLAN tunnels or OVS flows) because of partial RabbitMQ unavailability,
RPC message loss, or an agent failing to apply FDB entry updates.
Why: currently the neutron server sends FDB entries to L2 agents one-way
(no feedback), so an agent has no way to detect whether all required
tunnels/flows have been built. Conversely, the server has no way to
detect whether all sent FDB entries were delivered and the required
flows applied. If some messages are lost, only an agent restart fixes
the resulting issues.
Way to address: a new synchronization mechanism on the L2 agent side
that periodically requests the network topology from the server,
compares it with the configuration actually applied on the node, and
applies any missing parts.
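The comparison step described above could look roughly like the
following Python sketch. It is an illustration only, not neutron code:
the table shape (network ID mapped to a set of (tunnel IP, MAC, port IP)
tuples) and the names find_missing and apply_missing are hypothetical.
A real agent would fetch the expected table from the server over RPC
and program tunnels/flows instead of updating a dict.

```python
from typing import Dict, Set, Tuple

# Hypothetical entry shape: (remote tunnel IP, port MAC, port IP).
FdbEntry = Tuple[str, str, str]
# Hypothetical table shape: network ID -> set of FDB entries.
FdbTable = Dict[str, Set[FdbEntry]]


def find_missing(expected: FdbTable, applied: FdbTable) -> FdbTable:
    """Return the entries the server expects but the node has not applied."""
    missing: FdbTable = {}
    for net_id, want in expected.items():
        absent = want - applied.get(net_id, set())
        if absent:
            missing[net_id] = absent
    return missing


def apply_missing(applied: FdbTable, missing: FdbTable) -> None:
    """Stand-in for programming tunnels/flows: record entries as applied."""
    for net_id, entries in missing.items():
        applied.setdefault(net_id, set()).update(entries)


# Example: one entry was lost in transit and never reached the agent.
expected = {
    "net-1": {
        ("10.0.0.2", "fa:16:3e:00:00:01", "192.168.1.5"),
        ("10.0.0.3", "fa:16:3e:00:00:02", "192.168.1.6"),
    }
}
applied = {"net-1": {("10.0.0.2", "fa:16:3e:00:00:01", "192.168.1.5")}}

missing = find_missing(expected, applied)
apply_missing(applied, missing)
```

Running such a pass on a timer would let the agent converge on the
server's view without a restart, at the cost of one extra topology
request per interval.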
Option 2: move from RPC fanouts and casts to RPC calls, which block
until the receiver replies and so let the sender confirm delivery.
Concerns: scalability and increased load on the neutron server.
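To illustrate the difference Option 2 relies on, here is a toy Python
model of cast versus call semantics. It does not use oslo.messaging or
RabbitMQ; LossyTransport, cast and call are hypothetical names, and the
"drop the first N messages" behaviour is an assumption made only to
keep the example deterministic.

```python
class LossyTransport:
    """Toy message bus that silently drops the first `drops` messages."""

    def __init__(self, drops: int) -> None:
        self.drops = drops
        self.delivered = []

    def send(self, msg: str) -> bool:
        """Return True if the receiver got the message, False if it was lost."""
        if self.drops > 0:
            self.drops -= 1
            return False
        self.delivered.append(msg)
        return True


def cast(transport: LossyTransport, msg: str) -> None:
    """Fire and forget: the result is discarded, so loss goes unnoticed."""
    transport.send(msg)


def call(transport: LossyTransport, msg: str) -> bool:
    """Wait for a reply: a lost message surfaces as an error to the sender."""
    if not transport.send(msg):
        raise TimeoutError("no reply for %r" % msg)
    return True


# With a cast, the sender cannot tell that the update was lost.
bus = LossyTransport(drops=1)
cast(bus, "fdb-add")
lost_silently = bus.delivered == []

# With a call, the loss is visible and the sender can retry.
bus = LossyTransport(drops=1)
try:
    call(bus, "fdb-add")
    detected = False
except TimeoutError:
    detected = True
retried_ok = call(bus, "fdb-add")
```

The trade-off named in the report shows up here too: every call waits
for a reply, so the sender does one round trip per receiver instead of
a single fanout.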
** Affects: neutron
Importance: Undecided
Status: New
** Tags: rfe
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1860521
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1860521/+subscriptions