yahoo-eng-team team mailing list archive
Message #79325
[Bug 1807153] Re: Race condition in metering agent when creating iptable managers for router namespaces
Reviewed: https://review.opendev.org/666970
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9f541521bbbcf36325bfc250e3ce27a138ddef3c
Submitter: Zuul
Branch: master
commit 9f541521bbbcf36325bfc250e3ce27a138ddef3c
Author: bno1 <alex@xxxxxxxxxxxxxxxxx>
Date: Sun Jun 23 00:51:02 2019 +0300
Retry creating iptables managers and adding metering rules
This change makes the metering agent retry creating the iptables
managers for each router and applying the metering rules.
This is needed in case the metering agent starts before some or all of
the namespaces are created.
Change-Id: Ifc565feb98c7f02df5c2831a3607c3e526a2e703
Closes-Bug: #1807153
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1807153
Title:
Race condition in metering agent when creating iptable managers for
router namespaces
Status in neutron:
Fix Released
Bug description:
Sometimes the metering agent fails to send meter information. When
this happens and I ssh into the network node and run `ip netns exec
qrouter-<router-id> iptables -nvL` I see no metering rules in the output.
Restarting the metering agent fixes this.
I suspect that this is a race condition which happens when the
metering agent is notified of a router before the L3 agent creates the
namespaces for it. This causes the metering agent to not create an
IptablesManager for the qrouter namespace and not add the metering
rules (this happens in
`neutron/services/metering/drivers/iptables/iptables_driver.py` in
`RouterWithMetering.__init__`).
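The suspected race, and a retry that would work around it, can be modelled in plain Python. This is a minimal sketch, not the Neutron driver code: `FakeIptablesManager`, `RouterWithMeteringSketch`, `ensure_manager` and the injected `namespace_exists` callable are all hypothetical names used for illustration, and no real namespaces or iptables rules are touched.

```python
# Hypothetical model of the race; none of these names are Neutron APIs.


class FakeIptablesManager:
    """Stand-in for an iptables manager bound to one namespace."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.rules = []


class RouterWithMeteringSketch:
    """Creates its manager lazily so a late namespace is picked up on retry."""

    def __init__(self, router_id, namespace_exists):
        self.ns_name = 'qrouter-' + router_id
        # namespace_exists is an injected callable (an assumption) so the
        # sketch runs without touching real network namespaces.
        self._namespace_exists = namespace_exists
        self.iptables_manager = None
        self.ensure_manager()

    def ensure_manager(self):
        """Try to create the manager; return True once it exists."""
        if (self.iptables_manager is None
                and self._namespace_exists(self.ns_name)):
            self.iptables_manager = FakeIptablesManager(self.ns_name)
        return self.iptables_manager is not None

    def add_metering_rule(self, rule):
        """Apply a rule, retrying manager creation first."""
        if not self.ensure_manager():
            return False  # namespace still missing; caller retries later
        self.iptables_manager.rules.append(rule)
        return True


existing = set()  # namespaces the simulated L3 agent has created so far
router = RouterWithMeteringSketch('r1', lambda ns: ns in existing)
first_try = router.add_metering_rule('ingress 0.0.0.0/0')   # before namespace
existing.add('qrouter-r1')                                  # L3 agent catches up
second_try = router.add_metering_rule('ingress 0.0.0.0/0')  # retry succeeds
```

In this model the first attempt fails because the namespace is missing at construction time. Without the retry in `add_metering_rule`, the router object would stay without a manager until the process restarts, which matches the observed behaviour.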
I tested this in the following manner:
1. Have a public router and a metering label with at least one rule attached. In my case I have two public routers (one distributed and the other centralized), a metering label with the ingress rule 0.0.0.0/0 and another metering label with the egress rule 0.0.0.0/0.
2. Have a single network node. This makes the test easier to control.
3. On the network node manually edit iptables_driver.py and add the following 2 if statements at the end of `RouterWithMetering.__init__`:
    if not self.iptables_manager:
        LOG.debug('Router %s has no iptables manager', router['name'])
    if not self.snat_iptables_manager:
        LOG.debug('Router %s has no snat iptables manager', router['name'])
4. Set debug=True in /etc/neutron/metering_agent.ini on the network node.
5. Reboot the network node.
6. After it boots up, check the iptables rules in the qrouter namespace. If the metering rules are missing, run `grep 'iptables manager' /var/log/neutron/metering-agent.log`. I get the following output:
2018-12-06 07:55:26.103 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.104 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107
2018-12-06 07:55:26.158 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.159 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107
This confirms that when the metering agent started, the router
namespaces did not exist yet, so the metering rules were not applied.
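For a node with many routers, the grep output can be summarized per router. The following is a small sketch that assumes the debug messages use exactly the wording added in step 3; the function name `routers_missing_managers` and the shortened sample lines are illustrative, not part of Neutron or of the real log.

```python
import re

# Matches the debug messages added in step 3 of the report.
PATTERN = re.compile(r'Router (\S+) has no (snat )?iptables manager')


def routers_missing_managers(lines):
    """Map router name -> set of missing manager kinds ('main', 'snat')."""
    missing = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            kind = 'snat' if match.group(2) else 'main'
            missing.setdefault(match.group(1), set()).add(kind)
    return missing


# Shortened, made-up sample lines in the same message format.
sample = [
    'DEBUG Router public_router1 has no iptables manager __init__',
    'DEBUG Router public_router1 has no snat iptables manager __init__',
    'DEBUG Router public_router2 has no iptables manager __init__',
]
result = routers_missing_managers(sample)
```

To run it against the real log, pass the open file object for the metering agent log instead of `sample`.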
We run a multi-node OpenStack Pike deployment on CentOS 7.4.1708 servers.
The network node runs the Neutron L3, DHCP, VPN, Metering, Metadata, and Open vSwitch agents, plus other Open vSwitch services.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1807153/+subscriptions