yahoo-eng-team team mailing list archive
Message #59111
[Bug 1641879] Re: No routers created after l3-agent start - error during L3NATAgentWithStateReport.periodic_sync_routers_task
Reviewed: https://review.openstack.org/400233
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=62176a9b40619327aacee9ed4162252d1245d019
Submitter: Jenkins
Branch: master
commit 62176a9b40619327aacee9ed4162252d1245d019
Author: Pepijn Oomen <pepijn@xxxxxxxxxxxxxxxxxx>
Date: Mon Nov 21 12:07:45 2016 +0100
Solve unexpected NoneType returned by _get_routers_can_schedule.
Solve a problem with an unexpected NoneType returned by
_get_routers_can_schedule called from within
_schedule_ha_routers_to_additional_agent when using:
router_scheduler_driver = neutron.scheduler.l3_agent_scheduler.AZLeastRoutersScheduler
This was leading to problems with starting neutron-l3-agent on network
nodes, causing HA routers to fail to start.
Closes-Bug: #1641879
Change-Id: I33c5a6214670f0ada9c2293b0eb2ff243f6f7b1b
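The failure mode named in the commit message can be illustrated with a minimal sketch. This is not the neutron code and not necessarily how the commit fixes it: the function names below are hypothetical stand-ins, and the only assumption taken from the report is that a helper returned None where its caller iterated over the result.

```python
# Hypothetical stand-ins; NOT the actual neutron implementation.

def get_routers_can_schedule(routers, agent_is_candidate):
    """Return schedulable routers, or None when the agent is not a candidate."""
    if not agent_is_candidate:
        return None  # the problematic case: None instead of an empty list
    return [r for r in routers if r.get("schedulable")]


def schedule_unguarded(routers, agent_is_candidate):
    """Iterates directly over the helper's result, so None raises TypeError."""
    scheduled = []
    for router in get_routers_can_schedule(routers, agent_is_candidate):
        scheduled.append(router["id"])
    return scheduled


def schedule_guarded(routers, agent_is_candidate):
    """Treats a None result as 'nothing to schedule'."""
    schedulable = get_routers_can_schedule(routers, agent_is_candidate) or []
    return [router["id"] for router in schedulable]
```

With a non-candidate agent, schedule_unguarded raises "TypeError: 'NoneType' object is not iterable", the same message seen in the traceback, while schedule_guarded returns an empty list.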
** Changed in: neutron
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1641879
Title:
No routers created after l3-agent start - error during
L3NATAgentWithStateReport.periodic_sync_routers_task
Status in neutron:
Fix Released
Bug description:
After (re)starting the L3 agent, a failure during
L3NATAgentWithStateReport.periodic_sync_routers_task was encountered.
This resulted in a Python traceback being dumped to the log ending
with "TypeError: 'NoneType' object is not iterable". The error repeats
every minute or so. (See below.)
While the error persists, no routers are provisioned by the L3 agent.
It just keeps failing and failing and failing in a seemingly infinite
loop.
The fix/workaround was to remove a random router from the L3 agent
having problems, like so:
$ neutron l3-agent-router-remove <uuid-of-l3-agent-having-problems> <uuid-of-some-router>
It did not seem to matter which of the many routers scheduled on the
problematic L3 agent was removed in this way. The removal itself
seemed to clear the blockage, allowing the L3 agent to start normal
operations shortly after.
Excerpts from l3-agent.log:
2016-11-15 03:14:26.201 2988 INFO neutron.common.config [-] Logging enabled!
2016-11-15 03:14:26.201 2988 INFO neutron.common.config [-] /usr/bin/neutron-l3-agent version 8.3.0
2016-11-15 03:14:26.504 2988 INFO eventlet.wsgi.server [-] (2988) wsgi starting up on http:/var/lib/neutron/keepalived-state-change
2016-11-15 03:14:26.548 2988 INFO neutron.agent.l3.agent [-] Agent has just been revived. Doing a full sync.
2016-11-15 03:14:26.781 2988 INFO neutron.agent.l3.agent [-] L3 agent started
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task [req-97b76ec8-9eb6-4e23-8074-63c889c3e199 - - - - -] Error during L3NATAgentWithStateReport.periodic_sync_routers_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task Traceback (most recent call last):
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task task(self, context)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 545, in periodic_sync_routers_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task self.fetch_and_sync_all_routers(context, ns_manager)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 556, in fetch_and_sync_all_routers
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task self.plugin_rpc.get_router_ids(context))
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 108, in get_router_ids
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task return cctxt.call(context, 'get_router_ids', host=self.host)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/common/rpc.py", line 136, in call
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task return self._original_context.call(ctxt, method, **kwargs)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 158, in call
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task retry=self.retry)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in _send
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task timeout=timeout, retry=retry)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 470, in send
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task retry=retry)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 461, in _send
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task raise result
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task TypeError: 'NoneType' object is not iterable
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task Traceback (most recent call last):
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task incoming.message))
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 185, in _dispatch
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task return self._do_dispatch(endpoint, method, ctxt, args)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task result = func(ctxt, **new_args)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/api/rpc/handlers/l3_rpc.py", line 75, in get_router_ids
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task router_ids=None)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/db/l3_agentschedulers_db.py", line 525, in auto_schedule_routers
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task self, context, host, router_ids)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/scheduler/l3_agent_scheduler.py", line 148, in auto_schedule_routers
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task plugin, context, l3_agent)
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/neutron/scheduler/l3_agent_scheduler.py", line 314, in _schedule_ha_routers_to_additional_agent
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task for router in schedulable_routers:
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task TypeError: 'NoneType' object is not iterable
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
2016-11-15 03:14:28.297 2988 ERROR oslo_service.periodic_task
[ some time passes ]
2016-11-15 03:15:08.152 2988 ERROR oslo_service.periodic_task [req-97b76ec8-9eb6-4e23-8074-63c889c3e199 - - - - -] Error during L3NATAgentWithStateReport.periodic_sync_routers_task
2016-11-15 03:15:08.152 2988 ERROR oslo_service.periodic_task Traceback (most recent call last):
2016-11-15 03:15:08.152 2988 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2016-11-15 03:15:08.152 2988 ERROR oslo_service.periodic_task task(self, context)
[ ... ]
[ "neutron l3-agent-router-remove <l3-agent-uuid> d95d9198-9960-4f2c-b33f-d6211ef1a8a3" is run ]
2016-11-15 03:15:50.441 2988 WARNING neutron.agent.l3.agent [-] Info for router d95d9198-9960-4f2c-b33f-d6211ef1a8a3 was not found. Performing router cleanup
2016-11-15 03:17:23.102 2988 WARNING oslo.service.loopingcall [req-97b76ec8-9eb6-4e23-8074-63c889c3e199 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 15.31 sec
2016-11-15 03:17:23.174 2988 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=9015
2016-11-15 03:17:23.634 2988 INFO neutron.agent.l3.ha [-] Router 19258eff-2755-41b8-aefb-1c9b2af991f4 transitioned to backup
[ ... ]
At this point, the L3 agent starts operating normally, creating all
the routers it's scheduled to run and so on.
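To check how often the periodic task fails, the "Error during ..." header lines in l3-agent.log can be picked out and their timestamps compared. The following is a rough sketch only: the regular expression assumes the exact line layout shown in the excerpts above, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches only the header line of each failure, e.g.
# "<date> <time> <pid> ERROR oslo_service.periodic_task [req-...] Error during <task>"
# The layout is assumed from the excerpts in this report.
ERROR_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \d+ ERROR "
    r"oslo_service\.periodic_task \[[^\]]*\] Error during (\S+)$"
)


def error_occurrences(lines):
    """Return (timestamp, task_name) for every 'Error during' header line."""
    hits = []
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            hits.append((ts, match.group(2)))
    return hits


def gaps_in_seconds(hits):
    """Return the time between consecutive failures, in seconds."""
    times = [ts for ts, _ in hits]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Applied to the two failures shown in the excerpt (03:14:28.297 and 03:15:08.152), the gap is about 40 seconds.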
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1641879/+subscriptions