yahoo-eng-team team mailing list archive
Message #44776
[Bug 1531772] Re: Liberty server and Kilo security group aware agent fail to refresh firewall for DHCP and router IPv6 ports
Reviewed: https://review.openstack.org/266886
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f8f366024052a191eb0fc74af1643be15c541aef
Submitter: Jenkins
Branch: master
commit f8f366024052a191eb0fc74af1643be15c541aef
Author: Ihar Hrachyshka <ihrachys@xxxxxxxxxx>
Date: Wed Jan 13 12:37:21 2016 +0100
Make security_groups_provider_updated work with Kilo agents
Initially, we bumped the required version for the agent endpoint from
1.1 (the initial version that implemented security groups) to 1.3
without considering that the code should work with old agents that do
not yet know about the new devices_to_update argument.
Actually, there was no need to bump the version: old agent side code
already captures all unknown arguments that could be passed from the
server, ignoring them:
https://github.com/openstack/neutron/blob/608b54137fb67512c07099089ea7e074176e12df/neutron/agent/securitygroups_rpc.py#L155
(^ the link shows the latest Kilo code as of writing)
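The behaviour described above can be illustrated with a minimal sketch.
The class below is a hypothetical stand-in, not the actual Neutron agent
code; it only assumes, as the commit message states, that the old
handler collects unknown keyword arguments and ignores them:

```python
class OldStyleAgentCallback:
    """Hypothetical stand-in for a Kilo-era agent endpoint (not Neutron code)."""

    def __init__(self):
        self.full_refreshes = 0

    def security_groups_provider_updated(self, context, **kwargs):
        # Arguments the handler does not know about (for example
        # devices_to_update) end up in kwargs and are never used, so the
        # handler falls back to refreshing the firewall for everything.
        self.full_refreshes += 1
        return sorted(kwargs)


agent = OldStyleAgentCallback()
ignored = agent.security_groups_provider_updated(
    context=None, devices_to_update=["port-1", "port-2"])
```

Here the extra argument is accepted without error and the agent simply
performs one full refresh.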
Note: some people may argue that the approach that is taken in Neutron
to support backwards compatibility for server notifications is wrong,
and we instead should adopt some stricter mechanism like nova version
pinning. While that is a noble thing to do, it's out of scope for the
patch that is designed to be easily backportable to stable/liberty.
Note: some people may also argue that the patch should go straight into
stable/liberty because we don't claim support for rolling upgrade
scenarios that span multiple releases. That's indeed true, though my
take on it is that if we have a way to handle more unofficial scenarios
without more coding effort, it's worth doing it.
Change-Id: I741e6e5c460658ac17095551040e67e8d1990812
Closes-Bug: #1531772
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1531772
Title:
Liberty server and Kilo security group aware agent fail to refresh
firewall for DHCP and router IPv6 ports
Status in neutron:
Fix Released
Bug description:
When we try to mix a Liberty server with a Kilo L2 agent, we get the
following traceback in the agent log:
ERROR oslo_messaging.rpc.dispatcher [-] Exception during message handling: Endpoint does not support RPC version 1.3. Attempted method: security_groups_provider_updated
TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
TRACE oslo_messaging.rpc.dispatcher executor_callback))
TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 195, in _dispatch
TRACE oslo_messaging.rpc.dispatcher raise UnsupportedVersion(version, method=method)
TRACE oslo_messaging.rpc.dispatcher UnsupportedVersion: Endpoint does not support RPC version 1.3. Attempted method: security_groups_provider_updated
In Kilo, the server just sent a bare notification that something had
changed, and the firewall was reset for all devices; in Liberty, it
passes the list of devices to refresh, so that firewall setup on a
security group change is cheaper.
Missing the notification can cause a variety of issues that all come
down to ‘my firewall is not updated after security group change’.
From what I see in the code, it would affect DHCP and router IPv6 ports
only.
Now, since the signature of the RPC call was changed (adding the list
of devices), the server requires version 1.3 of the agent endpoint,
which knows about the new argument. If this were an ordinary
notification directed at a specific agent, we could use call() instead
of cast() and handle the UnsupportedVersion exception by repeating the
remote call without the device list. But since it’s a fanout, we
can’t do that.
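To make the version problem concrete, here is a simplified sketch of a
version check of the kind that produces the error in the traceback. It
is not the oslo.messaging implementation; the function names, the
"same major, requested minor not newer" rule, and the endpoint version
1.1 used for the old agent are assumptions made for illustration:

```python
class UnsupportedVersion(Exception):
    """Illustrative exception, named after the one in the traceback."""


def version_is_compatible(implemented, requested):
    """Simplified rule: same major, requested minor <= implemented minor."""
    imp_major, imp_minor = (int(part) for part in implemented.split("."))
    req_major, req_minor = (int(part) for part in requested.split("."))
    return imp_major == req_major and req_minor <= imp_minor


def dispatch(endpoint_version, requested_version, method):
    # Hypothetical dispatcher: reject the message when the endpoint is
    # older than the version the sender asked for.
    if not version_is_compatible(endpoint_version, requested_version):
        raise UnsupportedVersion(
            "Endpoint does not support RPC version %s. Attempted method: %s"
            % (requested_version, method))
    return "dispatched"
```

Under these assumptions, dispatch("1.1", "1.3", ...) raises, while an
endpoint at 1.3 still accepts a message sent at 1.1. With a fanout
cast() the sender never sees the exception, so it cannot retry with the
old signature.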
The solution for the upgrade issue would probably be to revert the
optimization in Liberty. Since we don’t yet support upgrades that span
multiple cycles, that should be enough.
Other alternatives do not seem to work here:
- cast()ing for both new and old signatures would effectively disable the optimization, because the same agent would receive both versions of the method, and the old one will trigger full firewall reset anyway;
- calling cast() with the new signature but without a version specified would probably make the older Kilo agent crash in a more horrible way (note: I need to check that locally).
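The first alternative can be sketched as follows. The agent class is
hypothetical and only models the behaviour described in this report: a
newer agent refreshes just the listed devices when a list is passed,
and everything otherwise:

```python
class NewStyleAgent:
    """Hypothetical Liberty-style agent (illustration only)."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.refreshed = []

    def security_groups_provider_updated(self, context,
                                         devices_to_update=None):
        # With a device list, refresh only those devices; without one,
        # fall back to refreshing every device the agent manages.
        if devices_to_update is None:
            targets = self.devices
        else:
            targets = devices_to_update
        self.refreshed.extend(targets)


agent = NewStyleAgent(["port-1", "port-2", "port-3"])
# The server casts both signatures; a new agent receives both of them.
agent.security_groups_provider_updated(None, devices_to_update=["port-2"])
agent.security_groups_provider_updated(None)
```

After both messages the agent has refreshed port-2 and then all three
ports again, so the targeted refresh saved nothing.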
Side note: it’s interesting that we have backwards compatible code on
the agent side to accommodate older servers. I will probably kill it,
since it’s not in line with the usual rolling upgrade scenarios that we
support, where you never run a server older than an agent in the
cluster.