yahoo-eng-team team mailing list archive

[Bug 1612069] Re: HA router state change takes too much time to notify neutron server

 

Reviewed:  https://review.openstack.org/364803
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e795a3fcf882ad8130018f32b57f2f887a1d20da
Submitter: Jenkins
Branch:    master

commit e795a3fcf882ad8130018f32b57f2f887a1d20da
Author: LIU Yulong <liuyulong@xxxxxx>
Date:   Thu Aug 11 16:58:48 2016 +0800

    Make the HA router state change notification faster
    
    An HA router state change takes too long to reach the neutron server.
    With the default ha_vrrp_advert_int of 2s, a single HA router state
    change takes almost 16s to be reported.
    
    Suppose an HA router changes state 8 times within that 16s window.
    After the 16s, the first change is dequeued and sent to the neutron
    server, then the 2nd, the 3rd, and so on. As a result, if you run
    `neutron l3-agent-list-hosting-router ha_router_id` after those 16
    seconds, you may see the router state on one specific agent
    alternating between active and standby. It does not reflect the real
    state, because the notifications are delayed.
    
    This patch sets the BatchNotifier interval to ha_vrrp_advert_int (default
    2s) so that HA router state changes are reported faster.
    
    NOTE: the BatchNotifier event queue is still needed, because HA router
    state changes must be sent in the proper order so that the neutron
    server can set the HA state correctly.
    
    Closes-Bug: #1612069
    Change-Id: Ife687038d31bd1e1ee264ff8b6ae1264fdd05489


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1612069

Title:
  HA router state change takes too much time to notify neutron server

Status in neutron:
  Fix Released

Bug description:
  The HA state change BatchNotifier uses the following calculated
  interval:

      def _calculate_batch_duration(self):
          # Slave becomes the master after not hearing from it 3 times
          detection_time = self.conf.ha_vrrp_advert_int * 3

          # Keepalived takes a couple of seconds to configure the VIPs
          configuration_time = 2

          # Give it enough slack to batch all events due to the same failure
          return (detection_time + configuration_time) * 2

  With the default ha_vrrp_advert_int of 2s, a single HA router state change takes almost 16s to be reported to the neutron server.
  By the time this notification is sent, the ip MonitorDaemon has already set the router to its relevant state,
  so there is no need to wait this long.
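
The 16s figure follows directly from the formula quoted above. As a rough, standalone sketch (not the neutron code itself), the arithmetic can be checked like this. The function names `old_batch_duration` and `patched_batch_duration` are hypothetical, and the patched value assumes only what the commit message states, namely that the interval is set to ha_vrrp_advert_int:

```python
def old_batch_duration(ha_vrrp_advert_int):
    """Interval from the formula quoted in the bug description."""
    # Three missed advertisements before the backup takes over
    detection_time = ha_vrrp_advert_int * 3
    # Fixed allowance for keepalived to configure the VIPs
    configuration_time = 2
    # Doubled to batch all events caused by the same failure
    return (detection_time + configuration_time) * 2


def patched_batch_duration(ha_vrrp_advert_int):
    """Hypothetical stand-in: the commit message says the interval
    becomes ha_vrrp_advert_int itself."""
    return ha_vrrp_advert_int


if __name__ == "__main__":
    for advert_int in (1, 2, 5):
        print("advert_int=%ss old=%ss patched=%ss" % (
            advert_int,
            old_batch_duration(advert_int),
            patched_batch_duration(advert_int)))
```

With the default of 2s this gives (2 * 3 + 2) * 2 = 16s under the old formula, against 2s once the interval is tied to ha_vrrp_advert_int.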

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1612069/+subscriptions

