
yahoo-eng-team team mailing list archive

[Bug 1794809] Re: Gateway ports are down after reboot of control plane nodes

 

Reviewed:  https://review.openstack.org/606085
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f787f12aa3441ecffef55f261c4d87dbb12ca6cf
Submitter: Zuul
Branch:    master

commit f787f12aa3441ecffef55f261c4d87dbb12ca6cf
Author: Slawek Kaplonski <skaplons@xxxxxxxxxx>
Date:   Fri Sep 28 13:07:28 2018 +0200

    Make port binding attempt after agent is revived
    
    In some cases a port may end up in the "binding_failed" state
    because the L2 agent on the destination host was down, even
    though the outage was only temporary.
    This happens, for example, when L3 HA is in use and the
    master and backup network nodes are rebooted.
    The L3 agent may start running before the L2 agent on the host,
    and if that host is the new master node, the router ports will
    be in the "binding_failed" state.
    
    When the agent sends a heartbeat and comes back to life,
    the ML2 plugin will try to bind all "binding_failed" ports
    on this host.
    
    Change-Id: I3bedb7c22312884cc28aa78aa0f8fbe418f97090
    Closes-Bug: #1794809
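
The retry-on-revival behaviour described in the commit message can be
illustrated with a small, self-contained Python sketch. This is not the
neutron code from the commit: the Plugin and Port classes, the heartbeat()
and retry_failed_bindings() methods, the host names, and the "ovs" bound
state are all hypothetical stand-ins chosen for illustration. Only the
"binding_failed" state and the idea of retrying on agent revival come from
the message above.

```python
from dataclasses import dataclass, field

BINDING_FAILED = "binding_failed"  # state named in the commit message
BOUND = "ovs"                      # hypothetical value for a successful bind


@dataclass
class Port:
    port_id: str
    host: str
    vif_type: str = BINDING_FAILED


@dataclass
class Plugin:
    """Toy stand-in for the ML2 plugin; not the real neutron class."""
    ports: list = field(default_factory=list)
    alive_agents: set = field(default_factory=set)

    def bind_port(self, port):
        # Binding succeeds only if an L2 agent is alive on the port's host.
        if port.host in self.alive_agents:
            port.vif_type = BOUND
        else:
            port.vif_type = BINDING_FAILED

    def heartbeat(self, host):
        # A heartbeat from a host not currently marked alive is a revival.
        revived = host not in self.alive_agents
        self.alive_agents.add(host)
        if revived:
            self.retry_failed_bindings(host)

    def retry_failed_bindings(self, host):
        # Retry only ports on this host whose earlier binding failed.
        for port in self.ports:
            if port.host == host and port.vif_type == BINDING_FAILED:
                self.bind_port(port)


plugin = Plugin()
gateway = Port("gw-port", host="net-node-1")
plugin.ports.append(gateway)

plugin.bind_port(gateway)       # L3 side asks for the port before L2 agent is up
print(gateway.vif_type)         # binding_failed

plugin.heartbeat("net-node-1")  # L2 agent comes back and reports in
print(gateway.vif_type)         # ovs
```

In this sketch the retry is limited to ports on the revived host, so ports
whose L2 agent is still down elsewhere are left untouched.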


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1794809

Title:
  Gateway ports are down after reboot of control plane nodes

Status in neutron:
  Fix Released

Bug description:
  Sometimes, when control plane nodes go down and come back up, the
  active L3 HA router fails over. If the L3 agent is running before the
  openvswitch agent on the host in that case, the gateway port may be in
  the "binding failed" state on the new MASTER agent.
  As a result, there is no connectivity to floating IPs on this router.

  I tested this on Queens, but it seems that there have not been any
  changes in this area since Queens.

  One possible solution might be to trigger another bind attempt for all
  ports on a host that are in the binding_failed state when the L2 agent
  on that host is revived. I will investigate whether that would work.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1794809/+subscriptions

