yahoo-eng-team team mailing list archive
Message #85530
[Bug 1917409] Re: neutron-l3-agents won't become active
Argh, perhaps I've made things worse: I added an Ubuntu source neutron
task for this and unmarked the bug as a duplicate, but that sets the
wrong state for upstream neutron, which was handled in
https://bugs.launchpad.net/neutron/+bug/1883089 -- I'm not sure how to
undo the mess I've made. Anyway, Brad mentions this affects the Ubuntu
package.
** Also affects: neutron (Ubuntu)
Importance: Undecided
Status: New
** This bug is no longer a duplicate of bug 1883089
[L3] floating IP failed to bind due to no agent gateway port(fip-ns)
** Changed in: neutron
Status: New => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1917409
Title:
neutron-l3-agents won't become active
Status in neutron:
Fix Released
Status in neutron package in Ubuntu:
New
Bug description:
We have an Ubuntu Ussuri cloud deployed on Ubuntu 20.04 using the Juju
charms from the 20.08 bundle (planning to upgrade soon).
The problem that is occurring is that all L3 agents for routers using a
particular external network show up with their ha_state in standby.
I've tried removing and re-adding them, and we never see the state go
to active.
$ neutron l3-agent-list-hosting-router bradm-router
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+-------------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+-------------+----------------+-------+----------+
| 09ae92c9-ae8f-4209-b1a8-d593cc6d6602 | oschv1.maas | True | :-) | standby |
| 4d9fe934-b1f8-4c2b-83ea-04971f827209 | oschv2.maas | True | :-) | standby |
| 70b8b60e-7fbd-4b3a-80a3-90875ca72ce6 | oschv4.maas | True | :-) | standby |
+--------------------------------------+-------------+----------------+-------+----------+
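The deprecation notice above points at the `openstack` CLI; the equivalent query there is `openstack network agent list --router <router> --long`, which prints a table in the same shape. As a purely illustrative aid (not part of neutron), a small helper can flag the failure mode shown above, where every agent hosting the router reports ha_state standby instead of exactly one being active:

```python
# Illustrative helper (hypothetical, not part of neutron): scan the ASCII
# table printed by `neutron l3-agent-list-hosting-router` (or
# `openstack network agent list --router <router> --long`) and report
# whether every agent hosting the router is stuck in 'standby'.

def all_agents_standby(table_text):
    """Return True if every data row of the agent table shows 'standby'."""
    states = []
    for line in table_text.splitlines():
        if not line.startswith('|') or 'ha_state' in line:
            continue  # skip the +---+ borders and the header row
        cells = [c.strip() for c in line.strip('|').split('|')]
        states.append(cells[-1])  # ha_state is the last column
    return bool(states) and all(s == 'standby' for s in states)

table = """\
+----+-------------+----------------+-------+----------+
| id | host        | admin_state_up | alive | ha_state |
+----+-------------+----------------+-------+----------+
| a1 | oschv1.maas | True           | :-)   | standby  |
| a2 | oschv2.maas | True           | :-)   | standby  |
+----+-------------+----------------+-------+----------+
"""
print(all_agents_standby(table))  # → True (a healthy HA router has one 'active')
```

On a healthy L3 HA router, exactly one agent should report active and the rest standby; all-standby is the symptom described here.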
This generates a stack trace:
2021-03-01 02:59:47.344 3675486 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped
    setattr(e, '_RETRY_EXCEEDED', True)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper
    ectxt.value = e.inner_exc
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped
    LOG.debug("Retry wrapper got retriable exception: %s", e)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped
    return f(*dup_args, **dup_kwargs)
  File "/usr/lib/python3/dist-packages/neutron/api/rpc/handlers/l3_rpc.py", line 306, in get_agent_gateway_port
    agent_port = self.l3plugin.create_fip_agent_gw_port_if_not_exists(
  File "/usr/lib/python3/dist-packages/neutron/db/l3_dvr_db.py", line 1101, in create_fip_agent_gw_port_if_not_exists
    self._populate_mtu_and_subnets_for_ports(context, [agent_port])
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in _populate_mtu_and_subnets_for_ports
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in <listcomp>
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1720, in _each_port_having_fixed_ips
    fixed_ips = port.get('fixed_ips', [])
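Read bottom-up, the traceback shows `create_fip_agent_gw_port_if_not_exists` passing an `agent_port` that is `None` into `_populate_mtu_and_subnets_for_ports`, so the per-port `port.get(...)` call fails. A standalone sketch of that failure mode, with logic simplified from the neutron helper named in the traceback:

```python
# Standalone sketch of the crash site (simplified from
# neutron.db.l3_db._each_port_having_fixed_ips): if the FIP agent gateway
# port lookup yields None, the per-port .get() call raises the
# "'NoneType' object has no attribute 'get'" error seen in the agent log.

def each_port_having_fixed_ips(ports):
    for port in ports or []:
        fixed_ips = port.get('fixed_ips', [])  # blows up when port is None
        if fixed_ips:
            yield port

try:
    list(each_port_having_fixed_ips([None]))  # a failed port lookup
except AttributeError as exc:
    print(exc)  # → 'NoneType' object has no attribute 'get'
```

This matches the error line logged above; the upstream fix tracked in bug 1883089 addresses the missing agent gateway port rather than this symptom.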
This system was running successfully after deployment and had been left
running for a while; when it was revisited, it was in this state. I've
been unable to work out what caused it.
Versions:
Ubuntu 20.04
Juju charms 20.08
OpenStack Ussuri
Environment: Clustered services using containers on converged hypervisors
$ dpkg-query -W neutron-common
neutron-common 2:16.2.0-0ubuntu2
Please let me know if there is any further information that could be
used to see what is happening here.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1917409/+subscriptions