yahoo-eng-team team mailing list archive
Message #59691
[Bug 1648840] Re: libvirt driver leaves interface residue after failed start
Reviewed: https://review.openstack.org/408806
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5e7f765266e0b94807e019b645c8be89770e7428
Submitter: Jenkins
Branch: master
commit 5e7f765266e0b94807e019b645c8be89770e7428
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date: Thu Dec 8 12:25:37 2016 -0800
Cleanup after any failed libvirt spawn
When we go to spawn a libvirt domain, we catch a few types of exceptions
and perform cleanup before failing the operation. For some reason, we
don't do this universally, which means that we leave things like network
devices lying around (from plug_vifs()). If a delete comes later, it
should clean those things up. However, if a subsequent failure prevents
that, and especially if we do a local delete at the API, we'll leak those
interfaces.
As seen in at least one real-world situation, this can cause us to leak
interfaces until we have tens of thousands of them on the system, which
then causes secondary failures.
Since we run the cleanup() routine for certain failures, it certainly
seems appropriate to run it always and not leave residue until a
successful delete is performed.
Closes-Bug: #1648840
Change-Id: Iab5afdf1b5b8d107ea0e5895c24d50712e7dc7b1
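The pattern the commit message describes, running cleanup for every failure type and then re-raising, can be sketched roughly as follows. This is a minimal illustration and not the actual nova change: the function name spawn_with_cleanup and its three callable parameters are hypothetical stand-ins for the driver's vif plugging, domain start, and cleanup routines.

```python
def spawn_with_cleanup(plug_vifs, start_domain, cleanup):
    """Plug interfaces and start a guest, cleaning up on any failure.

    All three arguments are hypothetical callables used for illustration;
    they are not the real libvirt driver methods.
    """
    try:
        plug_vifs()      # may create network devices on the host
        start_domain()   # may fail for many unrelated reasons
    except Exception:
        # Catch every failure type rather than a chosen few, so that
        # nothing created by plug_vifs() is left behind, then re-raise
        # so the caller still sees the original error.
        cleanup()
        raise
```

With this shape, a failed start leaves no residue even if no delete ever reaches the compute host, because cleanup no longer depends on which exception was raised.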
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1648840
Title:
libvirt driver leaves interface residue after failed start
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) newton series:
Confirmed
Bug description:
When the libvirt driver fails to start a VM due to reasons other than
neutron plug timeout, it leaves interfaces on the system from the vif
plugging. If a subsequent delete is performed and completes
successfully, these will be removed. However, in cases where
connectivity is preventing a normal delete, a local delete will be
performed at the API level and the interfaces will remain.
In at least one real-world situation I have observed, a script was
creating test instances which were failing and leaving residue. After
the residue interface count reached about 6,000 on the system, VM
creates started failing with "Argument list too long" as libvirt was
choking on enumerating the interfaces it had left behind.
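One rough way to spot this kind of buildup on a Linux host is to count the network interfaces whose names match the prefix used for instance devices. The sketch below is an illustration under stated assumptions: the "tap" prefix and the /sys/class/net path are assumptions about the host setup, and both helper functions are hypothetical, not part of nova.

```python
import os


def count_interfaces(names, prefix="tap"):
    """Count interface names starting with the given prefix.

    The default "tap" prefix is an assumption; adjust it to match
    the device naming used on the host being inspected.
    """
    return sum(1 for name in names if name.startswith(prefix))


def list_host_interfaces(sysfs_net="/sys/class/net"):
    """Return interface names from sysfs (Linux only, path assumed)."""
    return os.listdir(sysfs_net)
```

Calling count_interfaces(list_host_interfaces()) periodically and comparing the result against the number of running instances would give an early hint that failed starts are leaving interfaces behind.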
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1648840/+subscriptions