
yahoo-eng-team team mailing list archive

[Bug 1516260] [NEW] L3 agent sync_routers timeouts may cause cluster to fall down

 

Public bug reported:

The L3 agent sends the 'sync_routers' RPC call when it starts or when
an exception occurs. The call uses the default timeout of 60 seconds
(an oslo.messaging config option). At scale the server can take longer
than that to answer, so the call times out and the agent sends the
message again. The resent messages add to the server's load, causing a
cascading failure that does not resolve itself. The sync_routers
server RPC response was optimized to mitigate this; it could also be
helpful to simply increase the timeout.
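
The feedback loop described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not neutron code: the
function simulate() and its parameters are hypothetical, the server is
modelled as a single worker that handles requests in the order they
were sent, and it still spends time on requests whose caller has
already given up. Every agent sends its first request at t=0 and
resends as soon as its timeout expires.

```python
import heapq


def simulate(agents, service_time, timeout, horizon=3600):
    """Toy model of agents retrying a timed-out RPC call.

    Returns the time (in seconds) at which the last agent receives a
    reply within its timeout, or None if the backlog has not drained
    by `horizon` seconds. All names here are illustrative assumptions.
    """
    # Each entry is (send_time, agent_id); all agents start at t=0.
    pending = [(0, agent) for agent in range(agents)]
    heapq.heapify(pending)
    now = 0
    while pending:
        # The server picks the oldest request first (FIFO by send time).
        sent, agent = heapq.heappop(pending)
        start = max(now, sent)
        if start > horizon:
            return None  # still backlogged at the end of the window
        # The server does the work even if the caller already gave up.
        now = start + service_time
        if now - sent > timeout:
            # The reply arrived too late: the agent resent the call
            # at the moment its timeout expired.
            heapq.heappush(pending, (sent + timeout, agent))
    return now


# 200 agents, 1 second per reply, 60 second timeout: never settles.
print(simulate(200, 1, 60))   # None
# Same load with a 300 second timeout: settles after 200 seconds.
print(simulate(200, 1, 300))  # 200
# 100 agents with the 60 second timeout settle, but only after
# 160 seconds instead of 100, because 60 seconds go to stale requests.
print(simulate(100, 1, 60))   # 160
```

In this model a retry storm becomes permanent once each round of
resent requests takes the server longer to work through than one
timeout period, which is why a larger timeout alone is enough to let
the backlog drain. In oslo.messaging the timeout in question is, as
far as I know, the rpc_response_timeout option.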

** Affects: neutron
     Importance: Low
     Assignee: Assaf Muller (amuller)
         Status: New


** Tags: l3-ipam-dhcp

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1516260

Title:
  L3 agent sync_routers timeouts may cause cluster to fall down

Status in neutron:
  New

Bug description:
  The L3 agent sends the 'sync_routers' RPC call when it starts or when
  an exception occurs. The call uses the default timeout of 60 seconds
  (an oslo.messaging config option). At scale the server can take longer
  than that to answer, so the call times out and the agent sends the
  message again. The resent messages add to the server's load, causing a
  cascading failure that does not resolve itself. The sync_routers
  server RPC response was optimized to mitigate this; it could also be
  helpful to simply increase the timeout.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1516260/+subscriptions

