yahoo-eng-team team mailing list archive
Message #93246
[Bug 2047182] [NEW] BFV VM may be unexpectedly moved to different AZ
Public bug reported:
In cases when:
- each availability zone has a separate storage cluster (the [cinder]/cross_az_attach option helps to achieve that)
and
- there is no default_schedule_zone
a VM may be unexpectedly moved to a different AZ.
When a VM is created from a pre-existing volume, nova places the
specific availability zone in request_specs, which prevents the VM from
being moved to a different AZ during resize/migrate[1]. In this case,
everything works fine.
Unfortunately, problems start in the following cases:
a) the VM is created with the --boot-from-volume argument, which dynamically creates a volume for the VM
b) the VM has only an ephemeral volume
Let's focus on case a), because case b) may not work "by design".
The _get_volume_from_bdms() method considers only pre-existing volumes[2]. A volume that will be created later by `--boot-from-volume` does not exist yet, so the method cannot fetch its availability zone.
As a result, request_specs contains '"availability_zone": null' and the VM can be moved to a different AZ during resize/migrate. Because storage is not shared between AZs, this breaks the VM.
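The gap described above can be modelled with a small standalone sketch. This is not nova's code: `BlockDeviceMapping`, `VOLUME_AZ` and `requested_az_from_bdms` are hypothetical names, and the lookup table stands in for a cinder query. It only illustrates the behaviour reported here, assuming that the AZ can be derived solely from mappings that already reference an existing volume.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for cinder: volume id -> availability zone.
VOLUME_AZ = {"vol-1": "az1"}


@dataclass
class BlockDeviceMapping:
    """Simplified model of a block device mapping (not nova's object)."""
    source_type: str                 # "volume", "image" or "blank"
    destination_type: str            # "volume" or "local"
    volume_id: Optional[str] = None  # set only for pre-existing volumes


def requested_az_from_bdms(bdms: List[BlockDeviceMapping]) -> Optional[str]:
    """Return the AZ that would be pinned in the request spec, if any."""
    for bdm in bdms:
        # Only a volume that already exists has an AZ to look up.
        if bdm.volume_id is not None:
            return VOLUME_AZ[bdm.volume_id]
    # Volumes created later (image -> volume) and ephemeral disks
    # contribute nothing, so no AZ is pinned.
    return None


existing = [BlockDeviceMapping("volume", "volume", volume_id="vol-1")]
bfv_new = [BlockDeviceMapping("image", "volume")]   # case a)
ephemeral = [BlockDeviceMapping("image", "local")]  # case b)

print(requested_az_from_bdms(existing))
print(requested_az_from_bdms(bfv_new))
print(requested_az_from_bdms(ephemeral))
```

In this model only the first call yields an AZ ("az1"); cases a) and b) both yield None, which corresponds to the null availability_zone seen in request_specs.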
It's not easy to fix because:
- the nova API is not aware of the designated AZ at the time it places request_specs in the DB
- looking at the schedule_and_build_instances method[3], we do not create the cinder volumes before downcalling to the compute agent. We also do not allow upcalls from the compute agent to the API DB in general, so it's hard to update request_specs after the volume is created.
Unfortunately, at this point I don't see any easy way to fix this issue.
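The ordering problem can be sketched as a toy timeline. Everything here is a simplified assumption made for illustration: `api_db`, `HOSTS`, `api_create`, `compute_build` and `migration_candidates` are hypothetical names, and the AZ check is a stand-in for the scheduler, not its actual implementation.

```python
# Hypothetical hosts and their availability zones.
HOSTS = {"host-a": "az1", "host-b": "az2"}

# Stands in for the API database that holds request specs.
api_db = {}


def api_create(instance_id, existing_volume_az=None):
    """API layer: persist the request spec before scheduling.

    The AZ is known only when a pre-existing volume was supplied.
    """
    api_db[instance_id] = {"availability_zone": existing_volume_az}


def compute_build(instance_id, host):
    """Compute layer: create the volume in the chosen host's AZ.

    The result is returned locally and nothing is written back to
    api_db, modelling the no-upcall rule described in the report.
    """
    return {"instance": instance_id, "volume_az": HOSTS[host]}


def migration_candidates(instance_id):
    """Hosts allowed for resize/migrate under a simplified AZ check."""
    pinned = api_db[instance_id]["availability_zone"]
    return sorted(
        host for host, az in HOSTS.items()
        if pinned is None or az == pinned
    )


# Case a): the volume is created during the build, after the spec is stored.
api_create("vm-bfv")
volume = compute_build("vm-bfv", "host-a")

# Pre-existing volume: the AZ is pinned up front.
api_create("vm-existing", existing_volume_az="az1")

print(volume["volume_az"], migration_candidates("vm-bfv"))
print(migration_candidates("vm-existing"))
```

In this model the BFV instance ends up with a volume in az1 but remains eligible for host-b in az2, while the instance booted from a pre-existing volume is limited to host-a.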
[1] https://github.com/openstack/nova/blob/d28a55959e50b472e181809b919e11a896f989e3/nova/compute/api.py#L1268C19
[2] https://github.com/openstack/nova/blob/d28a55959e50b472e181809b919e11a896f989e3/nova/compute/api.py#L1247
[3] https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1646
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2047182
Title:
BFV VM may be unexpectedly moved to different AZ
Status in OpenStack Compute (nova):
New