yahoo-eng-team team mailing list archive

[Bug 1807153] Re: Race condition in metering agent when creating iptable managers for router namespaces

 

Reviewed:  https://review.opendev.org/666970
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9f541521bbbcf36325bfc250e3ce27a138ddef3c
Submitter: Zuul
Branch:    master

commit 9f541521bbbcf36325bfc250e3ce27a138ddef3c
Author: bno1 <alex@xxxxxxxxxxxxxxxxx>
Date:   Sun Jun 23 00:51:02 2019 +0300

    Retry creating iptables managers and adding metering rules
    
    This change makes the metering agent retry creating the iptables
    managers for each router and applying the metering rules.
    This is needed in case the metering agent starts before some or all of
    the namespaces are created.
    
    Change-Id: Ifc565feb98c7f02df5c2831a3607c3e526a2e703
    Closes-Bug: #1807153


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1807153

Title:
  Race condition in metering agent when creating iptable managers for
  router namespaces

Status in neutron:
  Fix Released

Bug description:
  Sometimes the metering agent fails to send meter information. When
  this happens and I ssh into the network node and run
  `ip netns exec qrouter-<router-id> iptables -nvL`, I see no metering
  rules in the output. Restarting the metering agent fixes this.

  I suspect that this is a race condition which happens when the
  metering agent is notified of a router before the L3 agent creates the
  namespaces for it. This causes the metering agent to not create an
  IptablesManager for the qrouter namespace and not add the metering
  rules (this happens in
  `neutron/services/metering/drivers/iptables/iptables_driver.py` in
  `RouterWithMetering.__init__`).
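
  The suspected race, and how a retry would recover from it, can be
  illustrated with a minimal self-contained sketch. This is not the
  Neutron code: `RouterWithMeteringSketch`, `namespace_exists` and
  `ensure_iptables_manager` are hypothetical names chosen for
  illustration, and a plain object stands in for `IptablesManager`.

```python
class RouterWithMeteringSketch:
    """Toy stand-in for RouterWithMetering; not the Neutron implementation."""

    def __init__(self, router, namespace_exists):
        self.router = router
        self.ns_name = "qrouter-" + router["id"]
        # Hypothetical callable that reports whether a namespace exists.
        self._namespace_exists = namespace_exists
        self.iptables_manager = None
        # First attempt; this does nothing if the namespace is still missing.
        self.ensure_iptables_manager()

    def ensure_iptables_manager(self):
        """Create the manager if it is missing and the namespace now exists.

        Returns True when a manager is available afterwards.
        """
        if self.iptables_manager is None and self._namespace_exists(self.ns_name):
            # A plain object stands in for a real IptablesManager here.
            self.iptables_manager = object()
        return self.iptables_manager is not None


# Simulate the race: the metering agent learns about the router first.
namespaces = set()
rm = RouterWithMeteringSketch({"id": "r1", "name": "public_router1"},
                              namespaces.__contains__)
before = rm.iptables_manager is not None  # no manager was created

# The L3 agent creates the namespace later; a retry now succeeds.
namespaces.add("qrouter-r1")
after = rm.ensure_iptables_manager()
print(before, after)
```

  If nothing ever calls the retry method, the manager stays unset for
  the lifetime of the object, which matches the observation that only
  an agent restart restores the metering rules.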

  I tested this in the following manner:
  1. Have a public router and a metering label with at least one rule attached. In my case I have two public routers (one distributed and the other centralized), a metering label with the ingress rule 0.0.0.0/0 and another metering label with the egress rule 0.0.0.0/0.
  2. Have a single network node. This makes the test easier to control.
  3. On the network node manually edit iptables_driver.py and add the following 2 if statements at the end of `RouterWithMetering.__init__`:

  if not self.iptables_manager:
      LOG.debug('Router %s has no iptables manager', router['name'])

  if not self.snat_iptables_manager:
      LOG.debug('Router %s has no snat iptables manager', router['name'])

  4. Set debug=True in /etc/neutron/metering_agent.ini on the network node.
  5. Reboot the network node.
  6. After it boots up, check the iptables rules in the qrouter namespace. If the metering rules are missing, run `grep 'iptables manager' /var/log/neutron/metering-agent.log`. I get the following output:

  2018-12-06 07:55:26.103 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
  2018-12-06 07:55:26.104 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107
  2018-12-06 07:55:26.158 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
  2018-12-06 07:55:26.159 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107

  This confirms that when the metering agent started, the router
  namespaces didn't exist yet, so the metering rules weren't applied.
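
  When many routers are involved, the grep output can be summarized
  per router. The following is a rough sketch that assumes the exact
  wording of the two debug messages added in step 3;
  `missing_managers` is a hypothetical helper, and the sample lines
  are abbreviated versions of the output above.

```python
import re
from collections import defaultdict

# Matches the two debug messages added to RouterWithMetering.__init__ in step 3.
PATTERN = re.compile(r"Router (\S+) has no (snat iptables|iptables) manager")


def missing_managers(lines):
    """Map each router name to the set of manager kinds reported missing."""
    result = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            result[match.group(1)].add(match.group(2))
    return dict(result)


# Abbreviated log lines; the real ones carry the full logger prefix.
sample = [
    "2018-12-06 07:55:26.103 1632 DEBUG ... Router public_router1 has no iptables manager __init__ ...",
    "2018-12-06 07:55:26.104 1632 DEBUG ... Router public_router1 has no snat iptables manager __init__ ...",
    "2018-12-06 07:55:26.158 1632 DEBUG ... Router public_router2 has no iptables manager __init__ ...",
    "2018-12-06 07:55:26.159 1632 DEBUG ... Router public_router2 has no snat iptables manager __init__ ...",
]

summary = missing_managers(sample)
for name in sorted(summary):
    print(name, sorted(summary[name]))
```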

  We are running a multi-node OpenStack Pike deployment on CentOS 7.4.1708 servers.
  The network node runs the Neutron L3, DHCP, VPN, Metering, Metadata and Open vSwitch agents, along with other Open vSwitch services.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1807153/+subscriptions

