yahoo-eng-team team mailing list archive
Message #36364
[Bug 1480441] [NEW] Live migration doesn't retry on migration pre-check failure
Public bug reported:
When live migrating an instance, nova is supposed to retry some
(configurable) number of times. However, it only retries if the host
compatibility and migration pre-checks raise nova.exception.Invalid:
https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L167-L174
If, for instance, a destination hypervisor has run out of disk space it
will not raise an Invalid subclass, but rather MigrationPreCheckError,
which causes the retry loop to short-circuit. Nova should instead retry
as long as either Invalid or MigrationPreCheckError is raised.
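
A minimal sketch of the proposed behavior, not nova's actual code: the
exception classes below are local stand-ins for the nova ones, and
find_destination, its parameters, NoValidHost as used here, and
demo_pre_check are hypothetical names chosen for illustration. The only
point it makes is that the except clause has to name both exception
types for the loop to move on to the next candidate host.

```python
class Invalid(Exception):
    """Local stand-in for nova.exception.Invalid."""


class MigrationPreCheckError(Exception):
    """Local stand-in for nova's MigrationPreCheckError.

    Deliberately NOT a subclass of Invalid, as described in this report.
    """


class NoValidHost(Exception):
    """Hypothetical error for 'ran out of candidates or retries'."""


def find_destination(candidates, pre_check, max_retries=3):
    """Return the first candidate host that passes pre_check.

    candidates: hosts in the order the scheduler would supply them.
    pre_check: callable that raises if the host is unsuitable.
    max_retries: retries allowed after the first attempt (assumed meaning).
    """
    attempted = []
    for host in candidates:
        if len(attempted) > max_retries:
            raise NoValidHost("exceeded max retries: %s" % attempted)
        try:
            pre_check(host)
        except (Invalid, MigrationPreCheckError):
            # Catching only Invalid here would let MigrationPreCheckError
            # propagate and end the loop on the first bad host.
            attempted.append(host)
            continue
        return host
    raise NoValidHost("no suitable host among: %s" % attempted)


def demo_pre_check(host):
    # Hypothetical check: pretend one host has run out of disk.
    if host == "full-disk-host":
        raise MigrationPreCheckError("Disk of instance is too large")


print(find_destination(["full-disk-host", "healthy-host"], demo_pre_check))
```

With the full host returned first, the sketch skips it and prints
healthy-host. Narrowing the except clause to Invalid alone makes the
same call fail with MigrationPreCheckError instead.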
This can be tricky to reproduce because it only occurs if a host raises
MigrationPreCheckError before a valid host is found, so it's dependent
upon the order in which the scheduler supplies possible destinations to
the conductor. In theory, though, it can be reproduced by bringing up a
number of hypervisors, exhausting the disk on one -- ideally the one
that the scheduler will return first -- and then attempting a live
migration. Even when there are valid hosts to migrate to, it will fail
with something like:
$ nova live-migration --block-migrate stpierre-test-1
ERROR (BadRequest): Migration pre-check error: Unable to migrate
f44296dd-ffa6-4ec0-8256-c311d025d46c: Disk of instance is too
large(available on destination host:-38654705664 < need:1073741824)
(HTTP 400) (Request-ID: req-9951691a-c63c-4888-bec5-30a072dfe727)
** Affects: nova
Importance: Undecided
Assignee: Chris St. Pierre (stpierre)
Status: In Progress
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1480441
Title:
Live migration doesn't retry on migration pre-check failure
Status in OpenStack Compute (nova):
In Progress
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1480441/+subscriptions