
yahoo-eng-team team mailing list archive

[Bug 1807153] [NEW] Race condition in metering agent when creating iptable managers for router namespaces

 

Public bug reported:

Sometimes the metering agent fails to send meter information. When this
happens, I ssh into the network node and run
`ip netns exec qrouter-<router-id> iptables -nvL`, and I see no metering
rules in the output. Restarting the metering agent fixes this.

I suspect that this is a race condition which happens when the metering
agent is notified of a router before the L3 agent creates the namespaces
for it. This causes the metering agent to not create an IptablesManager
for the qrouter namespace and not add the metering rules (this happens
in `neutron/services/metering/drivers/iptables/iptables_driver.py` in
`RouterWithMetering.__init__`).
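
To illustrate the suspected race, here is a minimal, self-contained sketch. It does not use neutron code: `LazyRouterMetering`, `namespace_exists` and `make_manager` are hypothetical names invented for this example, and the only detail taken from the report is the `qrouter-<router-id>` namespace naming. It shows how a manager that is only created in the constructor stays missing if the namespace appears later, and how re-checking on demand would recover.

```python
class LazyRouterMetering(object):
    """Hypothetical stand-in for RouterWithMetering; not neutron code."""

    def __init__(self, router_id, namespace_exists, make_manager):
        # Namespace name follows the qrouter-<router-id> convention.
        self.ns_name = 'qrouter-' + router_id
        # Both callables are assumptions made for this sketch.
        self._namespace_exists = namespace_exists
        self._make_manager = make_manager
        self.iptables_manager = None
        # First attempt happens at construction time, like the reported code path.
        self.ensure_manager()

    def ensure_manager(self):
        """Create the manager if it is missing and the namespace now exists."""
        if self.iptables_manager is None and self._namespace_exists(self.ns_name):
            self.iptables_manager = self._make_manager(self.ns_name)
        return self.iptables_manager


# Simulate the metering agent learning about the router before the
# namespace has been created.
existing_namespaces = set()
router = LazyRouterMetering(
    'abc',
    namespace_exists=lambda ns: ns in existing_namespaces,
    make_manager=lambda ns: 'manager-for-' + ns,
)
first_attempt = router.iptables_manager      # namespace missing at this point

# The namespace shows up later; a second check succeeds.
existing_namespaces.add('qrouter-abc')
second_attempt = router.ensure_manager()
```

Without the second `ensure_manager()` call, the object would keep `iptables_manager = None` indefinitely, which matches the symptom that only a restart of the metering agent restores the rules.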

I tested this in the following manner:
1. Have a public router and a metering label with at least one rule attached. In my case I have two public routers (one distributed and the other centralized), a metering label with the ingress rule 0.0.0.0/0 and another metering label with the egress rule 0.0.0.0/0.
2. Have a single network node. This makes the test easier to control.
3. On the network node manually edit iptables_driver.py and add the following 2 if statements at the end of `RouterWithMetering.__init__`:

if not self.iptables_manager:
    LOG.debug('Router %s has no iptables manager', router['name'])

if not self.snat_iptables_manager:
    LOG.debug('Router %s has no snat iptables manager', router['name'])

4. Set debug=True in /etc/neutron/metering_agent.ini on the network node.
5. Reboot the network node.
6. After it boots up, check the iptables rules in the qrouter namespace and, if the metering rules are missing, run `grep 'iptables manager' /var/log/neutron/metering-agent.log`. I get the following output:

2018-12-06 07:55:26.103 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.104 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107
2018-12-06 07:55:26.158 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.159 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107

This confirms that at the time the metering agent started, the router
namespaces didn't exist, so the metering rules weren't applied.
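
For anyone repeating this check across several routers, the grep step can be summarized with a small script. This is only a sketch: it assumes the two debug messages added in step 3 are present in the log with exactly that wording, and that router names contain no spaces. The function name `routers_missing_managers` is made up for this example.

```python
import re

# Matches the two debug messages added in step 3.
PATTERN = re.compile(
    r'Router (?P<name>\S+) has no (?P<kind>snat iptables|iptables) manager')


def routers_missing_managers(lines):
    """Map each router name to the list of manager kinds it is missing."""
    missing = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            missing.setdefault(match.group('name'), []).append(match.group('kind'))
    return missing


# Shortened sample lines in the same shape as the log output above.
sample = [
    'DEBUG ... Router public_router1 has no iptables manager __init__ ...',
    'DEBUG ... Router public_router1 has no snat iptables manager __init__ ...',
    'DEBUG ... Router public_router2 has no iptables manager __init__ ...',
    'DEBUG ... some unrelated message',
]
result = routers_missing_managers(sample)
```

To run it against a real log, pass an open file object instead of `sample`, since iterating over a file yields its lines.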

We are running a multi-node OpenStack Pike deployment on CentOS 7.4.1708 servers.
The network node runs the Neutron L3, DHCP, VPN, Metering, Metadata, and Open vSwitch agents, plus other Open vSwitch services.

** Affects: neutron
     Importance: Undecided
         Status: New


** Tags: metering

** Tags added: metering

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1807153

Title:
  Race condition in metering agent when creating iptable managers for
  router namespaces

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1807153/+subscriptions

