yahoo-eng-team team mailing list archive

[Bug 1102714] Re: nova-network related fixed ip cleanup is needed

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: Sean Dague <sean@xxxxxxxxx>
Date: Mon, 30 Mar 2015 12:20:43 -0000
Reply-to: Bug 1102714 <1102714@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
** Changed in: nova
       Status: Triaged => Opinion

** Changed in: nova
   Importance: Low => Wishlist

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1102714

Title:
  nova-network related fixed ip cleanup is needed

Status in OpenStack Compute (Nova):
  Opinion

Bug description:
  overall description: 
  If nova-network crashes (and then restarts) during a "nova delete" operation, the ip associated with the deleted VM instance may remain in the "allocated" state. This may affect subsequent network-related operations.

  concrete example:
  step 1: create a fixed ip network with only 4 ips.
  the "fixed_ips" table looks like the following at this stage:
  ========================================
  mysql> select id,address,instance_id,allocated,leased,reserved from fixed_ips;
  +----+---------------+-------------+-----------+--------+----------+
  | id | address       | instance_id | allocated | leased | reserved |
  +----+---------------+-------------+-----------+--------+----------+
  |  1 | 192.199.196.0 |        NULL |         0 |      0 |        1 |
  |  2 | 192.199.196.1 |        NULL |         0 |      0 |        1 |
  |  3 | 192.199.196.2 |        NULL |         0 |      0 |        0 |
  |  4 | 192.199.196.3 |        NULL |         0 |      0 |        1 |
  +----+---------------+-------------+-----------+--------+----------+
  ========================================

  step 2: create a VM and then delete it.

  during the execution of the "nova delete" command, nova-compute sends
  an rpc call and then an rpc cast to nova-network. Just before the rpc
  cast is sent, we stop the nova-network service (to emulate a service
  crash). The rpc cast is lost if the rpc queue related to nova-network
  is auto-deleted along with the stopped service.
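
  The effect of an auto-deleted queue on a fire-and-forget cast can be
  sketched with a toy in-memory broker. This is only an illustration:
  ToyBroker, simulate, the queue name "network", and the message
  contents are hypothetical stand-ins, not nova's rpc layer or a real
  AMQP client.

```python
class ToyBroker:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self):
        self.queues = {}  # queue name -> list of pending messages

    def declare(self, name):
        # Create the queue if it does not exist yet.
        self.queues.setdefault(name, [])

    def consumer_gone(self, name, auto_delete):
        # An auto-delete queue is removed once its consumer disconnects.
        if auto_delete:
            self.queues.pop(name, None)

    def cast(self, name, message):
        # Fire-and-forget: the sender does not wait for a reply. The
        # return value exists only so this sketch can show what happened.
        queue = self.queues.get(name)
        if queue is None:
            return False  # no queue, so the message is dropped
        queue.append(message)
        return True


def simulate(auto_delete):
    broker = ToyBroker()
    broker.declare("network")                     # service is running
    broker.consumer_gone("network", auto_delete)  # service stops
    delivered = broker.cast("network", {"release": "192.199.196.2"})
    broker.declare("network")                     # service restarts
    return delivered, list(broker.queues["network"])


if __name__ == "__main__":
    print("auto-delete queue:", simulate(auto_delete=True))
    print("durable queue:    ", simulate(auto_delete=False))
```

  With auto_delete=True the restarted service finds an empty queue, so
  the release request is never processed. With auto_delete=False the
  message waits in the queue until the service comes back.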

  the "nova delete" command returns successfully. the VM is marked as
  deleted. but the ip associated with the VM is still marked as
  "allocated" due to the lost rpc message.

  the "fixed_ips" table looks like the following at this stage:
  ========================================
  mysql> select id,address,instance_id,allocated,leased,reserved from fixed_ips;
  +----+---------------+-------------+-----------+--------+----------+
  | id | address       | instance_id | allocated | leased | reserved |
  +----+---------------+-------------+-----------+--------+----------+
  |  1 | 192.199.196.0 |        NULL |         0 |      0 |        1 |
  |  2 | 192.199.196.1 |        NULL |         0 |      0 |        1 |
  |  3 | 192.199.196.2 |           1 |         1 |      0 |        0 |
  |  4 | 192.199.196.3 |        NULL |         0 |      0 |        1 |
  +----+---------------+-------------+-----------+--------+----------+
  ========================================

  step 3: restart the nova-network service. try to create another VM via
  "nova boot". this time we will get an exception due to "zero fixed ips
  available."

  thought:
  from a user's perspective, the VM in step 3 should be created, but the operation fails because an "orphan" ip tied to a previously deleted instance still occupies the only usable address. Admittedly, the situation in the above example can be avoided by configuring the rpc queue related to nova-network not to be auto-deleted. But periodic orphan ip cleanup logic may help in general in such cases.

  Is there such logic or such a module in OpenStack? If there is, should
  it be triggered when the "fixed_ips" table is about to be exhausted (as
  in the above example)?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1102714/+subscriptions