yahoo-eng-team team mailing list archive
Message #74718
[Bug 1788865] Re: neutron-openvswitch-agent interface monitor does not work if ovsdb-client generates warnings (ovs 2.10)
Reviewed: https://review.openstack.org/596717
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f6d98a747b03e4da5109b2ee0e3c1bd7e88aee49
Submitter: Zuul
Branch: master
commit f6d98a747b03e4da5109b2ee0e3c1bd7e88aee49
Author: Bernard Cafarelli <bcafarel@xxxxxxxxxx>
Date: Mon Aug 27 14:37:15 2018 +0200
ovsdb monitor: do not die on ovsdb-client stderr output
That process may generate stderr output (ovs 2.10 with dpdk support will
log about missing optional libraries for example), in which case the
agent will loop forever respawning the ovsdb-client processes.
AsyncProcess already handles processes exiting uncleanly, and logs
stderr output with log_output=True (which is the case for OvsdbMonitor).
As the monitors work on stdout output, disabling die_on_error is enough
to make them work with this behaviour.
Change-Id: I8f2e5b93b9c16f9b288046911b5aeb4938845233
Closes-Bug: #1788865
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1788865
Title:
neutron-openvswitch-agent interface monitor does not work if ovsdb-client generates warnings (ovs 2.10)
Status in neutron:
Fix Released
Bug description:
This was found while testing with ovs 2.10
openvswitch has all drivers built in, but the Mellanox driver needs extra libraries and cannot be initialized without them. On a system that lacks these libraries, the visible result is a warning on stderr whenever ovsdb-client is called.
A typical call as done by OvsdbMonitor is [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]
It will result in
PMD: net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory # on stderr
[...]
{"data":[...]} # Proper JSON output on stdout
But OvsdbMonitor is an AsyncProcess(die_on_error=True), so if any
stderr output is found, the process is killed. With the libibverbs
warning, this effectively leaves the agent non-working.
There are possible workarounds and fixes on the ovs side of course,
but the agent should be more robust to this kind of event (stderr
output is not always fatal).
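To illustrate the failure mode, here is a minimal sketch, not the actual neutron code. It assumes a hypothetical run_monitor() helper and a small Python child process standing in for ovsdb-client. It runs synchronously, whereas the real AsyncProcess is asynchronous and respawns the process. The child writes a warning to stderr and valid JSON to stdout, and the die_on_error flag decides whether the warning is treated as fatal.

```python
import subprocess
import sys

# Hypothetical stand-in for ovsdb-client: a warning on stderr, JSON on stdout.
CHILD = (
    "import sys; "
    "sys.stderr.write('PMD: net_mlx5: cannot load glue library\\n'); "
    "sys.stdout.write('{\"data\": []}\\n')"
)


def run_monitor(die_on_error):
    """Run the stand-in child and return (stdout_lines, stderr_lines).

    With die_on_error=True, any stderr output is treated as fatal, which
    mimics the behaviour described in this bug.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    stderr_lines = proc.stderr.splitlines()
    if die_on_error and stderr_lines:
        raise RuntimeError("stderr output seen: %s" % stderr_lines[0])
    return proc.stdout.splitlines(), stderr_lines


if __name__ == "__main__":
    # The warning is kept for logging, and the JSON on stdout stays usable.
    out, err = run_monitor(die_on_error=False)
    print("stdout:", out)
    print("stderr:", err)
```

With die_on_error=False the JSON line is still delivered even though the warning was printed. With die_on_error=True the same run raises before any output is consumed.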
Initial fix ideas:
* Disable die_on_error in OvsdbMonitor, and update the sub-classes' process_events() to filter out non-JSON output. Log error lines at debug level or similar (as they may be quite verbose in this ovs 2.10 warning case). This is a short-term fix, but we may miss actual errors and react to them more slowly (only once we hit the timeout).
* Update the OvsdbMonitor/AsyncProcess logic to check the process return code. This allows stderr output to be ignored or logged at a low level, relying on the process itself to report success. But it is a bigger change and is still vulnerable to CLI changes.
* Use the native ovsdb implementation. No more subprocess and no vulnerability to CLI changes, but it is a longer-term solution.
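The filtering step of the first idea could look roughly like the sketch below. It is an illustration only: parse_json_lines() is a hypothetical helper, not an existing neutron function, and it assumes the monitor output arrives as a list of text lines with one JSON object per line.

```python
import json
import logging

LOG = logging.getLogger(__name__)


def parse_json_lines(lines):
    """Return the decoded JSON objects found in lines.

    Lines that are empty or not valid JSON (for example library warnings)
    are logged at debug level and skipped instead of being treated as fatal.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            decoded = json.loads(line)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            LOG.debug("Ignoring non-JSON monitor output: %s", line)
            continue
        if isinstance(decoded, dict):
            events.append(decoded)
    return events
```

As noted in the first idea, the drawback is that a real error message would be skipped in the same way as a harmless warning.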
Original downstream bug with some additional info and workarounds:
https://bugzilla.redhat.com/show_bug.cgi?id=1619387