yahoo-eng-team team mailing list archive
Message #56561
[Bug 1622833] Re: timestamp mechanism in linux bridge false positives
Reviewed: https://review.openstack.org/369179
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a2bd0b4b53db8468681eb2905e2fbc2f9073869a
Submitter: Jenkins
Branch: master
commit a2bd0b4b53db8468681eb2905e2fbc2f9073869a
Author: Kevin Benton <kevin@xxxxxxxxxx>
Date: Mon Sep 12 22:27:33 2016 -0700
LinuxBridge: Use ifindex for logical 'timestamp'
With Xenial (and maybe older versions), the modified timestamps
in /sys/class/net/(device_name) are not stable. They appear to
work for a period of time, and then when some kind of cache clears
on the kernel side, all of the timestamps are reset to the latest
access time.
This was causing the Linux Bridge agent to think that the interfaces
were experiencing local changes much more frequently than they actually
were, resulting in more polling to the Neutron server and subsequently
more BUILD->ACTIVE->BUILD->ACTIVE transitions in the logical model.
The purpose of the timestamp patch was to catch rapid server REBUILD
operations where the interface would be deleted and re-added within
a polling interval. Without it, these would be stuck in the BUILD
state since the agent wouldn't realize it needed to wire the ports.
This patch switches to looking at the IFINDEX of the interfaces to
use as a sort of logical timestamp. If an interface gets removed
and re-added, it will get a different index, so the original timestamp
comparison logic will still work.
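As a rough illustration of the idea above, the following is a minimal sketch and not the code from the commit. It assumes the index is read from the sysfs attribute /sys/class/net/(device_name)/ifindex; the function names read_ifindex and locally_changed, and the sysfs_root parameter, are hypothetical and exist only for this example.

```python
import os


def read_ifindex(device, sysfs_root="/sys/class/net"):
    """Return the kernel interface index for `device`, or None if unreadable.

    `sysfs_root` is a hypothetical parameter so the function can be
    exercised against a scratch directory instead of the real sysfs.
    """
    path = os.path.join(sysfs_root, device, "ifindex")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Device vanished between listing and reading, or odd content.
        return None


def locally_changed(previous, current):
    """Return devices whose logical timestamp differs between two scans.

    Both arguments map device name -> ifindex (or None). Devices with no
    known previous value are skipped here; in this sketch they would be
    reported as 'added' by other logic, not as locally changed.
    """
    return {
        dev for dev, idx in current.items()
        if previous.get(dev) is not None and previous[dev] != idx
    }
```

Because the kernel assigns a new index when an interface is deleted and recreated, a rebuild inside one polling interval shows up as a changed value, while an untouched interface keeps the same value across scans.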
In the future, the agent should undergo a larger refactor to just
watch 'ip monitor' for netlink events to replace the polling of the
interface listing and the timestamp logic entirely. However, this
approach was taken due to the near-term release and the ability to
back-port it to older releases.
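To give a feel for what such a refactor might consume, here is a hedged sketch of parsing link events. It is not part of the commit. It assumes one-line records as printed by `ip -o monitor link`, where a removal is prefixed with "Deleted"; the regular expression, the function name parse_link_event, and the event labels are all hypothetical, and the example lines in the comments are illustrative, not captured output.

```python
import re

# Assumed record shape (illustrative, not captured from a real host):
#   "12: tap0: <BROADCAST,MULTICAST,UP> mtu 1450 ..."
#   "Deleted 12: tap0: <BROADCAST,MULTICAST> mtu 1450 ..."
_LINK_EVENT = re.compile(
    r"^(?P<deleted>Deleted\s+)?(?P<index>\d+):\s+(?P<name>[^:@\s]+)"
)


def parse_link_event(line):
    """Parse one monitor record into (event, device_name, ifindex).

    Returns None for lines that do not look like a link record, such as
    continuation lines carrying the link-layer address.
    """
    match = _LINK_EVENT.match(line.strip())
    if match is None:
        return None
    event = "removed" if match.group("deleted") else "updated"
    return event, match.group("name"), int(match.group("index"))
```

An agent built this way would react to each event as it arrives instead of comparing snapshots, so a delete followed by a quick re-add would be seen as two separate events regardless of the polling interval.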
This was verified with both Nova rebuild actions and Nova interface
attach/detach actions.
Change-Id: I016019885446bff6806268ab49cd5476d93ec61f
Closes-Bug: #1622833
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1622833
Title:
timestamp mechanism in linux bridge false positives
Status in neutron:
Fix Released
Bug description:
The Linux bridge agent's mechanism for detecting locally modified
devices is producing too many false positives. In the following log
excerpt, the four tap devices attached to a particular bridge had
timestamps that jumped forward even though none of the interfaces
actually changed:
2016-09-13 00:13:38.744 14179 DEBUG neutron.plugins.ml2.drivers.agent._common_agent [req-82c02245-80fd-4712-baa6-cdd4033315d1 - -] Adding locally changed devices to updated set: set(['tap422b85d9-95', 'tap9b365584-34', 'tapee2684f8-51', 'tap66ef2d8e-3b']) scan_devices /opt/stack/new/neutron/neutron/plugins/ml2/drivers/agent/_common_agent.py:397
2016-09-13 00:13:38.744 14179 DEBUG neutron.plugins.ml2.drivers.agent._common_agent [req-82c02245-80fd-4712-baa6-cdd4033315d1 - -] Agent loop found changes! {'current': set(['tap422b85d9-95', 'tapee2684f8-51', 'tap6028e7a2-c0', 'tap9b365584-34', 'tap0960ffac-f9', 'tap7ba5f865-54', 'tap66ef2d8e-3b', 'tapfe427ba3-63', 'tap475f33ef-c3']), 'timestamps': {'tap422b85d9-95': 1473725618.73996, 'tapee2684f8-51': 1473725618.73996, 'tap6028e7a2-c0': None, 'tap9b365584-34': 1473725618.73996, 'tap0960ffac-f9': 1473725618.73996, 'tap7ba5f865-54': 1473725616.7399597, 'tap66ef2d8e-3b': 1473725618.73996, 'tapfe427ba3-63': 1473725616.7399597, 'tap475f33ef-c3': None}, 'removed': set([]), 'added': set([]), 'updated': set(['tap422b85d9-95', 'tap9b365584-34', 'tapee2684f8-51', 'tap66ef2d8e-3b'])} daemon_loop /opt/stack/new/neutron/neutron/plugins/ml2/drivers/agent/_common_agent.py:448
This leads to the agent refetching the port details, which puts the port in BUILD and then back to ACTIVE. That in turn causes sporadic failures in tempest tests that assert a port is in the ACTIVE status.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1622833/+subscriptions