yahoo-eng-team team mailing list archive

[Bug 1959151] Re: Don't set HA ports down while L3 agent restart.

 

Reviewed:  https://review.opendev.org/c/openstack/neutron/+/826545
Committed: https://opendev.org/openstack/neutron/commit/f430cd00725f8303f5313cb7784c9aed4b982e62
Submitter: "Zuul (22348)"
Branch:    master

commit f430cd00725f8303f5313cb7784c9aed4b982e62
Author: labedz <krzysztof.tomaszewski@xxxxxxxxxxxx>
Date:   Thu Jan 27 00:13:40 2022 +0100

    Don't set HA ports down while L3 agent restart.
    
    Because of the fix for bug [1] and an issue with linux_utils
    get_process_count_by_name(), the L3 agent puts all its HA ports down
    during the initialization phase. Unfortunately, this operation can
    break L3 communication that is already working. Rewiring an ha-* port
    from the down state to up can take a few seconds, and some VRRP
    packets may be lost in the meantime. That triggers keepalived on the
    other node, so a router HA state change may follow.
    
    This change prevents putting HA ports down when, during the
    initialization phase, the L3 agent finds its own network namespaces
    already configured. The existence of such a namespace is good evidence
    that network configuration is in place, so the host was not rebooted
    and this is most probably just an agent restart.
    
    [1] https://bugs.launchpad.net/neutron/+bug/1597461
    
    Closes-Bug: #1959151
    Change-Id: Id9c906b2d141c3bedd80fb5f868190f8a4b66f54


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1959151

Title:
  Don't set HA ports down while L3 agent restart.

Status in neutron:
  Fix Released

Bug description:
  Because of the fix for bug #1597461 [1], the L3 agent puts all its
  HA ports down during the initialization phase. Unfortunately, this
  operation can break already working L3 communication when
  you restart the agent service (rewiring a port from the down state
  to up can take a few seconds, and some VRRP packets could be lost,
  so a router HA state change may be triggered).

  This is an effect of calling
  self.plugin_rpc.update_all_ha_network_port_statuses
  in neutron/agent/l3/agent.py#L393 during the L3 agent
  initialization phase, in _check_ha_router_process_status.

  Restarting the agent process should not affect an already working
  configuration (customer traffic).

  A possible workaround would be to put HA ports into the DOWN state
  only on host restart, not on every L3 agent restart.

  [1] https://bugs.launchpad.net/neutron/+bug/1597461
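
  The decision described above (skip the HA port reset when router
  namespaces from a previous run still exist) could be sketched roughly
  as follows. This is only an illustration, not the code from the
  committed change: the helper names list_named_namespaces and
  should_set_ha_ports_down are hypothetical, and the sketch assumes that
  router namespaces carry the "qrouter-" prefix and that named namespaces
  appear as entries under /var/run/netns, which is normally cleared on
  reboot.

```python
import os

# Assumption: prefix of router network namespaces on the host.
ROUTER_NS_PREFIX = "qrouter-"


def list_named_namespaces(netns_dir="/var/run/netns"):
    """Return the names of named network namespaces found in netns_dir.

    Hypothetical helper: reads the directory where iproute2 keeps named
    namespaces. A missing directory is treated as "no namespaces".
    """
    try:
        return sorted(os.listdir(netns_dir))
    except FileNotFoundError:
        return []


def should_set_ha_ports_down(namespaces, prefix=ROUTER_NS_PREFIX):
    """Decide whether HA ports should be set DOWN at agent start-up.

    If at least one router namespace already exists, the host was most
    probably not rebooted (only the agent restarted), so the ports are
    left alone. With no router namespace, a host restart is assumed.
    """
    return not any(name.startswith(prefix) for name in namespaces)


if __name__ == "__main__":
    found = list_named_namespaces()
    if should_set_ha_ports_down(found):
        print("No router namespaces found: HA ports would be set DOWN.")
    else:
        print("Router namespaces found: HA ports would be left untouched.")
```

  Taking the namespace list as a plain argument keeps the decision easy
  to test in isolation; a real agent would obtain that list through its
  own namespace utilities rather than by reading the directory directly.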

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1959151/+subscriptions


