yahoo-eng-team team mailing list archive
Message #89595
[Bug 1971760] Re: nova-compute leaks green threads
[Expired for OpenStack Compute (nova) because there has been no activity
for 60 days.]
** Changed in: nova
Status: Incomplete => Expired
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1971760
Title:
nova-compute leaks green threads
Status in OpenStack Compute (nova):
Expired
Bug description:
At the moment, if the cloud sustains a large number of VIF plugging
timeouts, it ends up with a large number of leaked green threads, which
can cause the nova-compute process to stop reporting and responding.
The resulting tracebacks look like this:
===
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] Traceback (most recent call last):
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 7230, in _create_guest_with_network
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] guest = self._create_guest(
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] next(self.gen)
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/nova/compute/manager.py", line 479, in wait_for_instance_event
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] actual_event = event.wait()
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/event.py", line 125, in wait
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] result = hub.switch()
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] return self.greenlet.switch()
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] eventlet.timeout.Timeout: 300 seconds
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b]
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] During handling of the above exception, another exception occurred:
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b]
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] Traceback (most recent call last):
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/nova/compute/manager.py", line 2409, in _build_and_run_instance
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] self.driver.spawn(context, instance, image_meta,
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 4193, in spawn
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] self._create_guest_with_network(
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 7256, in _create_guest_with_network
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] raise exception.VirtualInterfaceCreateException()
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b] nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed
2022-04-17 00:21:28.651 877893 ERROR nova.compute.manager [instance: 0c0d2422-781c-4bd2-b6bd-e5e3c94b602b]
===
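To gauge how many of these failures a compute node has accumulated, the log can be scanned for the final exception line of each traceback. The following is only a minimal sketch: it assumes log lines shaped like the excerpt above, and `count_vif_failures` is a hypothetical helper name, not part of nova.

```python
import re
from collections import Counter

# Matches the "[instance: <uuid>]" tag seen on each line of the excerpt above.
INSTANCE_RE = re.compile(r"\[instance: ([0-9a-f-]{36})\]")
# The last line of each traceback names the fully qualified exception class.
FAILURE_MARKER = "nova.exception.VirtualInterfaceCreateException"


def count_vif_failures(log_lines):
    """Return a Counter mapping instance UUID -> number of VIF creation failures."""
    failures = Counter()
    for line in log_lines:
        if FAILURE_MARKER not in line:
            continue
        match = INSTANCE_RE.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures
```

Called as `count_vif_failures(open(path))` on a nova-compute log (the path is site-specific), it gives a per-instance count that can be compared against the number of leaked green threads.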
Eventually, with enough of these, the nova-compute process would hang.
The GMR (Guru Meditation Report) output shows about 6094 threads, around
3038 of which have the traceback below:
===
------ Green Thread ------
/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py:355 in run
`self.fire_timers(self.clock())`
/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py:476 in fire_timers
`timer()`
/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/timer.py:59 in __call__
`cb(*args, **kw)`
/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/__init__.py:151 in _timeout
`current.throw(exc)`
===
In addition, 3039 of those threads show only the following:
===
------ Green Thread ------
No Traceback!
===
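Counts like the ones quoted above can be tallied from a saved GMR dump by grouping the green thread stanzas by their traceback text. This is a rough sketch, assuming the text passed in contains only green thread stanzas laid out as in the excerpts above; `tally_green_threads` is a hypothetical helper name.

```python
from collections import Counter

STANZA_HEADER = "------ Green Thread ------"


def tally_green_threads(gmr_text):
    """Group "Green Thread" stanzas of a GMR dump by their traceback text."""
    counts = Counter()
    current = None  # lines of the stanza being read; None before the first header
    for raw in gmr_text.splitlines():
        line = raw.strip()
        if line == STANZA_HEADER:
            # A new header closes the previous stanza, if there was one.
            if current is not None:
                counts["\n".join(current)] += 1
            current = []
        elif current is not None and line and line != "===":
            # Assumption: "===" is only the delimiter used in this report.
            current.append(line)
    if current is not None:
        counts["\n".join(current)] += 1
    return counts
```

`tally_green_threads(text).most_common(5)` then lists the most frequent stanzas, which makes a large group of identical or empty tracebacks easy to spot.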
Together, that puts 6077 green threads in that odd state.
We discussed this on IRC (link below), and it seems that it may be
related to the use of `spawn_n`.
https://meetings.opendev.org/irclogs/%23openstack-nova/%23openstack-nova.2022-05-05.log.html#t2022-05-05T16:20:37
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1971760/+subscriptions