yahoo-eng-team team mailing list archive
Message #78800
[Bug 1830081] Re: Nova unplug interface race condition when deleting an instance
** Also affects: nova/queens
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
** Also affects: nova/stein
Importance: Undecided
Status: New
** Changed in: nova/queens
Status: New => Confirmed
** Changed in: nova/rocky
Status: New => Confirmed
** Changed in: nova/stein
Status: New => Confirmed
** Changed in: nova/queens
Importance: Undecided => Low
** Changed in: nova/rocky
Importance: Undecided => Low
** Changed in: nova/stein
Importance: Undecided => Low
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830081
Title:
Nova unplug interface race condition when deleting an instance
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) queens series:
Confirmed
Status in OpenStack Compute (nova) rocky series:
Confirmed
Status in OpenStack Compute (nova) stein series:
Confirmed
Bug description:
Description
===========
When nova starts an instance, it asks neutron to create a port and then updates the instance info cache based on information from neutron.
If the instance is deleted in the middle of spawning, the terminate_instance function is called with an instance object that DOES NOT contain any network info.
As a result, nova deletes the instance but never unplugs the interface.
Steps to reproduce
==================
I boot an instance and immediately delete it with a command like:
$ openstack server create --key-name fake --image ubuntu1810 --flavor c2-7 --net Ext-Net arnaudubuntu1810-3 ; nova delete arnaudubuntu1810-3
- [1] build_and_run_instance is executed under a semaphore, thus locking the instance. While booting, nova fills the network_info cache by calling [2] update_instance_cache_with_nw_info.
- [3] terminate_instance is executed a few seconds later, but waits for the semaphore to be released. At this point, the instance network_info cache may not be filled, depending on whether [2] update_instance_cache_with_nw_info has already been executed.
- Following the code, we end up at _shutdown_instance [4], which calls [5] get_network_info, which returns a NetworkInfo object that contains no interface.
- In the end, nova calls _unplug_vifs [6], which does nothing (no VIF).
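The sequence above can be sketched as a small, self-contained simulation. This is not nova code: FakeInstance, FakeCompute, simulate and the refresh flag are hypothetical names invented for illustration, and the model assumes terminate_instance works on a copy of the instance taken before the network_info cache was filled. The refresh flag only shows that re-reading the instance after acquiring the lock would make the VIF visible; it is not presented as the fix chosen for this bug.

```python
import copy
import threading


class FakeInstance:
    """Hypothetical stand-in for a nova Instance and its info cache."""

    def __init__(self, name):
        self.name = name
        self.network_info = []  # empty until the cache is filled


class FakeCompute:
    """Hypothetical stand-in for the compute manager (not nova's API)."""

    def __init__(self):
        self.lock = threading.Lock()  # models the per-instance semaphore
        self.plugged = set()          # VIFs currently plugged on the host
        self.db = {}                  # authoritative instance records

    def build_and_run_instance(self, instance):
        with self.lock:
            vif = "vif-" + instance.name
            self.plugged.add(vif)
            # Models update_instance_cache_with_nw_info: only the
            # authoritative record learns about the new VIF.
            self.db[instance.name].network_info = [vif]

    def terminate_instance(self, instance, refresh=False):
        with self.lock:
            if refresh:
                # Assumed mitigation, for illustration only: re-read
                # the record once the lock is held.
                instance = self.db[instance.name]
            # Models _unplug_vifs: iterates over whatever the instance
            # object in hand believes is plugged.
            for vif in instance.network_info:
                self.plugged.discard(vif)


def simulate(refresh):
    compute = FakeCompute()
    record = FakeInstance("vm1")
    compute.db[record.name] = record
    # The delete request captures the instance before the cache is filled.
    stale = copy.deepcopy(record)
    compute.build_and_run_instance(record)
    compute.terminate_instance(stale, refresh=refresh)
    return compute.plugged  # VIFs left behind after the delete


leaked = simulate(refresh=False)
clean = simulate(refresh=True)
```

In this model, simulate(refresh=False) leaves the VIF plugged because the stale copy lists no interfaces, while simulate(refresh=True) leaves nothing behind.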
Note that I am running the OpenStack Newton release, but the code involved
seems identical on master.
[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1837
[2] https://github.com/openstack/nova/blob/master/nova/network/base_api.py#L34
[3] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2765
[4] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2559
[5] https://github.com/openstack/nova/blob/master/nova/objects/instance.py#L1252
[6] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L919
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830081/+subscriptions