yahoo-eng-team team mailing list archive
Message #74992
[Bug 1795212] [NEW] [RFE] Prevent DHCP agent from processing stale RPC messages when restarting up
Public bug reported:
Network rescheduling is triggered when the neutron server discovers
that agents are down. Some bare metal and node management systems
reboot those same nodes at the same time. When the two actions
coincide, the server sends RPC notifications to agents whose nodes are
being rebooted, and those notifications are stale by the time the DHCP
agents return to service. The messages were sent before the node was
rebooted but were not processed because the agent was shut down at the
time.
The negative effects of this are as follows. When an agent receives a
stale network create/end notification, it starts servicing the network
even though the server may already have assigned that network to a
different agent. Because the agent does not periodically audit the list
of networks that it is servicing, it could continue servicing a network
that was not assigned to it indefinitely. Similarly, a stale delete
message may be processed, causing the agent to stop servicing a network
that it is actually supposed to service.
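One possible mitigation, sketched here purely for illustration, is to
compare a timestamp carried by each notification against the time the
agent started, and to ignore anything sent earlier. This is not
neutron's implementation: the Notification and DhcpAgentSketch classes,
the sent_at field, and the method strings are all hypothetical names
introduced for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Notification:
    """Hypothetical RPC payload; 'sent_at' is an assumed sender timestamp."""
    method: str        # assumed values: "network_create_end", "network_delete_end"
    network_id: str
    sent_at: float     # seconds since the epoch, stamped by the sender


@dataclass
class DhcpAgentSketch:
    """Toy stand-in for a DHCP agent; not neutron's agent class."""
    started_at: float = field(default_factory=time.time)
    serviced: set = field(default_factory=set)

    def handle(self, msg: Notification) -> bool:
        """Apply msg unless it predates this agent's start.

        Returns True if the message was applied, False if it was dropped.
        """
        if msg.sent_at < self.started_at:
            # Sent before this agent (re)started: the server may have
            # rescheduled the network since then, so ignore the message.
            return False
        if msg.method == "network_create_end":
            self.serviced.add(msg.network_id)
        elif msg.method == "network_delete_end":
            self.serviced.discard(msg.network_id)
        return True


# Usage: an agent that restarted at t=1000 drops a message sent at t=990.
agent = DhcpAgentSketch(started_at=1000.0)
dropped = agent.handle(Notification("network_create_end", "net-a", sent_at=990.0))
applied = agent.handle(Notification("network_create_end", "net-b", sent_at=1010.0))
```

Comparing a sender timestamp with a local start time assumes that the
server and agent clocks are reasonably synchronized. A periodic audit of
the serviced networks against the server's assignments would be a
complementary safeguard.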
** Affects: neutron
Importance: Undecided
Assignee: Kailun Qin (kailun.qin)
Status: New
** Changed in: neutron
Assignee: (unassigned) => Kailun Qin (kailun.qin)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1795212
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1795212/+subscriptions