yahoo-eng-team team mailing list archive

[Bug 1648840] Re: libvirt driver leaves interface residue after failed start

 

Reviewed:  https://review.openstack.org/408806
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5e7f765266e0b94807e019b645c8be89770e7428
Submitter: Jenkins
Branch:    master

commit 5e7f765266e0b94807e019b645c8be89770e7428
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date:   Thu Dec 8 12:25:37 2016 -0800

    Cleanup after any failed libvirt spawn
    
    When we go to spawn a libvirt domain, we catch a few types of exceptions
    and perform cleanup before failing the operation. For some reason, we
    don't do this universally, which means that we leave things like network
    devices lying around (from plug_vifs()). If a delete comes later, it
    should clean those things up. However, if a subsequent failure prevents
    that, and especially if we do a local delete at the API, we'll leak those
    interfaces.
    
    As seen in at least one real-world situation, this can cause us to leak
    interfaces until we have tens of thousands of them on the system, which
    then causes secondary failures.
    
    Since we run the cleanup() routine for certain failures, it certainly
    seems appropriate to run it always and not leave residue until a
    successful delete is performed.
    
    Closes-Bug: #1648840
    Change-Id: Iab5afdf1b5b8d107ea0e5895c24d50712e7dc7b1
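
The behaviour change described in the commit message can be sketched as follows. This is only an illustration under stated assumptions, not the actual nova patch: FakeDriver, SpawnError, _start_domain, spawn_old and spawn_new are hypothetical names invented for the example, and only plug_vifs() and cleanup() are named in the commit message.

```python
class SpawnError(Exception):
    """Hypothetical stand-in for one of the 'few types' of expected failures."""


class FakeDriver:
    """Hypothetical miniature driver; NOT nova's real libvirt driver."""

    def __init__(self, fail_with=None):
        self.fail_with = fail_with  # exception to raise while starting, if any
        self.plugged = set()        # instances whose interfaces are plugged

    def plug_vifs(self, instance):
        # Stands in for creating network devices on the host.
        self.plugged.add(instance)

    def cleanup(self, instance):
        # Stands in for removing those devices again.
        self.plugged.discard(instance)

    def _start_domain(self, instance):
        # Hypothetical helper: simulate the domain failing to start.
        if self.fail_with is not None:
            raise self.fail_with

    def spawn_old(self, instance):
        """Before: clean up only for selected exception types."""
        self.plug_vifs(instance)
        try:
            self._start_domain(instance)
        except SpawnError:
            self.cleanup(instance)
            raise

    def spawn_new(self, instance):
        """After: clean up on any failure, then re-raise the original error."""
        self.plug_vifs(instance)
        try:
            self._start_domain(instance)
        except Exception:
            self.cleanup(instance)
            raise
```

With an unexpected error such as RuntimeError, spawn_old leaves the instance in `plugged` (the leaked interface), while spawn_new removes it before the error propagates. A successful spawn keeps its interfaces in both variants.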


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1648840

Title:
  libvirt driver leaves interface residue after failed start

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) newton series:
  Confirmed

Bug description:
  When the libvirt driver fails to start a VM due to reasons other than
  neutron plug timeout, it leaves interfaces on the system from the vif
  plugging. If a subsequent delete is performed and completes
  successfully, these will be removed. However, in cases where
  connectivity is preventing a normal delete, a local delete will be
  performed at the API level and the interfaces will remain.

  In at least one real-world situation I have observed, a script was
  creating test instances which were failing and leaving residue. After
  the residue interface count reached about 6,000 on the system, VM
  creates started failing with "Argument list too long" as libvirt was
  choking on enumerating the interfaces it had left behind.
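
  One way to check a host for this kind of buildup is to count network
  interfaces by name prefix. The sketch below is an assumption-laden
  illustration, not part of the bug report: it assumes a Linux host where
  /sys/class/net lists interfaces, and the prefixes ("tap", "qbr", "qvo",
  "qvb") are assumed naming conventions that may not match a given
  deployment. The function name is hypothetical.

```python
import os
from collections import Counter


def count_interfaces_by_prefix(net_dir="/sys/class/net",
                               prefixes=("tap", "qbr", "qvo", "qvb")):
    """Count entries in net_dir whose names start with one of the prefixes.

    Both defaults are assumptions: the sysfs path is Linux-specific and the
    prefixes are guesses at how residue interfaces might be named.
    """
    counts = Counter()
    for name in os.listdir(net_dir):
        for prefix in prefixes:
            if name.startswith(prefix):
                counts[prefix] += 1
                break  # count each interface at most once
    return counts
```

  Comparing the totals against the number of running instances on the host
  gives a rough indication of whether interfaces are being left behind.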

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1648840/+subscriptions

