yahoo-eng-team team mailing list archive
Message #21682
[Bug 1161657] Re: nova.compute.manager.py needs better rollbacks
This isn't really a bug; it is something that should come in via the
specs process.
** Changed in: nova
Status: Confirmed => Opinion
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1161657
Title:
nova.compute.manager.py needs better rollbacks
Status in OpenStack Compute (Nova):
Opinion
Bug description:
As documented at
https://review.openstack.org/#/c/25075/2/nova/compute/manager.py there
are cases in the compute manager that leave the database, network, or
instances themselves in an inconsistent (or entirely wrong) state. It
would be useful to verify that, when a plugin is called, there is a
defined interface and a known set of errors that the interface can
throw, along with a way to roll back from each of those allowed errors.
The top-level manager code must correctly roll back state (as needed)
so that the compute node is left in a pristine state when an underlying
driver does not behave correctly (or just doesn't work).
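One way to picture "a defined interface and a known set of errors" is
the minimal sketch below. Every name in it (DriverError, SpawnFailed,
NetworkSetupFailed, UnexpectedDriverError, call_driver) is hypothetical
and chosen for illustration; none of it is Nova's actual API. The idea
is only that the manager lets declared errors through unchanged and
wraps anything else, so the calling code knows when a driver has broken
its contract.

```python
class DriverError(Exception):
    """Base class for errors a driver is allowed to raise (hypothetical)."""


class SpawnFailed(DriverError):
    """Declared error: the driver could not start the instance."""


class NetworkSetupFailed(DriverError):
    """Declared error: the driver could not set up networking."""


class UnexpectedDriverError(Exception):
    """Raised by the manager when a driver raises an undeclared error."""


def call_driver(func, *args, allowed=(DriverError,), **kwargs):
    """Call a driver function, enforcing the declared error set.

    Errors listed in `allowed` propagate unchanged so the caller can
    roll back in a way specific to each one. Any other exception is
    wrapped in UnexpectedDriverError, keeping the original as the cause.
    """
    try:
        return func(*args, **kwargs)
    except allowed:
        raise
    except Exception as exc:
        name = getattr(func, "__name__", repr(func))
        raise UnexpectedDriverError(
            f"{name} raised undeclared {type(exc).__name__}"
        ) from exc
```

With this shape, the manager needs one rollback path per declared error
plus a single conservative path for UnexpectedDriverError.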
Let's first attack one function on the critical path, _run_instance(),
and the functions it calls directly, _spawn() and _prep_block_device().
Certain calls noted:
- Deallocating networks/volumes (not always done) -> _setup_block_device_mapping is never rolled back...
- Un-preparing a block device (on later failure)
- A driver can affect the MACs for an instance (self.driver.macs_for_instance), and since this is third-party driver code, if the driver 'locks' those MACs (via whatever mechanism), then those MACs are not rolled back later.
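
The calls noted above share one pattern: each step that changes state
needs a matching undo step that runs if a later step fails. A minimal
sketch of that pattern follows, using contextlib.ExitStack from the
Python standard library. The function run_instance and its parameters
are hypothetical stand-ins passed in as plain callables, not the real
_run_instance() signature.

```python
from contextlib import ExitStack


def run_instance(allocate_network, deallocate_network,
                 prep_block_device, unprep_block_device, spawn):
    """Run the steps in order, undoing completed steps on failure.

    All five arguments are hypothetical zero-argument callables that
    stand in for the real manager and driver calls.
    """
    with ExitStack() as undo:
        allocate_network()
        # Register the undo action only after the step has succeeded.
        undo.callback(deallocate_network)

        prep_block_device()
        undo.callback(unprep_block_device)

        spawn()

        # Success: drop the registered undo actions without running them.
        undo.pop_all()
```

If spawn() raises, the with block exits with the callbacks still
registered, so they run in reverse order (block device first, then
network) and the original exception propagates to the caller. A MAC
lock taken by a driver could be handled the same way, by registering
its release as another callback.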
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1161657/+subscriptions