yahoo-eng-team team mailing list archive
Message #74421
[Bug 1788865] [NEW] neutron-openvswitch-agent interface monitor does not work if ovsdb-client generates warnings (ovs 2.10)
Public bug reported:
This was found while testing with ovs 2.10.
openvswitch is built with all drivers included, and the Mellanox driver needs extra libraries, so it cannot be initialized when those libraries are missing. The visible result on such a system is a warning on stderr whenever ovsdb-client is called.
A typical call as done by OvsdbMonitor is [ovsdb-client monitor tcp:127.0.0.1:6640 Interface name,ofport,external_ids --format=json]
It will result in
PMD: net_mlx5: cannot load glue library: libibverbs.so.1: cannot open shared object file: No such file or directory # on stderr
[...]
{"data":[...]} # Proper JSON output on stdout
But OvsdbMonitor is an AsyncProcess(die_on_error=True), so if any stderr
output is found, the process is killed. With the libibverbs warning, that
effectively leaves the agent non-working.
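To illustrate the failure mode, here is a minimal sketch. It is not the neutron AsyncProcess code: run_monitor and CHILD are hypothetical names, and the child process is a stand-in for ovsdb-client that writes one warning to stderr and one JSON line to stdout.

```python
import subprocess
import sys

# Hypothetical stand-in for ovsdb-client: one warning on stderr,
# one JSON line on stdout.
CHILD = (
    "import sys; "
    "sys.stderr.write('PMD: net_mlx5: cannot load glue library\\n'); "
    "sys.stdout.write('{\"data\": []}\\n')"
)


def run_monitor(die_on_error):
    """Return the stdout lines kept under the given stderr policy."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=False,
    )
    if die_on_error and proc.stderr:
        # Strict policy: any stderr output is fatal, so the valid
        # JSON on stdout is thrown away with the process.
        return []
    return proc.stdout.splitlines()


strict = run_monitor(die_on_error=True)
tolerant = run_monitor(die_on_error=False)
```

Under the strict policy the valid JSON line is lost even though the child exited successfully, which is the situation described above.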
There are possible workarounds and fixes on the ovs side of course, but
the agent should be more robust to this kind of event (stderr is not
always fatal).
Initial fix ideas:
* Disable die_on_error in OvsdbMonitor and update the sub-classes' process_events() to filter out non-JSON output. Log error lines at debug level or similar (as the output may be quite verbose in this ovs 2.10 warning case). This is a short-term fix, but we may miss actual errors and react to them more slowly (only once we hit a timeout).
* Update the OvsdbMonitor/AsyncProcess logic to check the process return code. This allows stderr output to be ignored or logged at a low level while relying on the process reporting success. But it is a bigger change and is still vulnerable to CLI changes.
* Use the native ovsdb implementation. No more subprocess and no vulnerability to CLI changes, but a somewhat longer-term solution.
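The filtering step in the first idea could look roughly like the sketch below. The helper name split_json_lines is hypothetical and this is not the existing process_events() code; it only shows how non-JSON lines might be separated out and logged at debug level instead of being treated as fatal.

```python
import json
import logging

LOG = logging.getLogger(__name__)


def split_json_lines(lines):
    """Return (events, ignored): parsed JSON objects and non-JSON lines."""
    events, ignored = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        try:
            parsed = json.loads(text)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            LOG.debug("Ignoring non-JSON monitor output: %s", text)
            ignored.append(text)
            continue
        if isinstance(parsed, dict):
            events.append(parsed)
        else:
            # Valid JSON but not an object, so not a monitor update.
            ignored.append(text)
    return events, ignored


sample = [
    "PMD: net_mlx5: cannot load glue library: libibverbs.so.1",
    '{"data": [["eth0", 1]]}',
    "",
]
events, ignored = split_json_lines(sample)
```

Keeping the ignored lines around, rather than dropping them silently, leaves room to escalate them later if the process also exits with a failure code.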
Original downstream bug with some additional info and workarounds:
https://bugzilla.redhat.com/show_bug.cgi?id=1619387
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1788865