[Bug 1467570] [NEW] Nova can't provision instance from snapshot with a ceph backend
Public bug reported:
This is a weird issue that does not happen in our Juno setup, but
happens in our Kilo setup. The configuration between the two setups is
pretty much the same, with only Kilo-specific changes applied (namely,
moving configuration lines to new sections).
Here's how to reproduce:
1. Provision an instance.
2. Make a snapshot of this instance.
3. Try to provision an instance with that snapshot.
Nova-compute will complain that it can't find the disk, and the instance
will fall into an error state.
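For what it's worth, the reproduction can be scripted with python-novaclient. This is just a minimal sketch assuming Kilo-era novaclient with Keystone v2.0 auth; all credentials, UUIDs and names below are placeholders, not values from our setup:

    from novaclient import client

    # Placeholder credentials and endpoint (assumptions).
    nova = client.Client('2', 'admin', 'secret', 'demo',
                         'http://controller:5000/v2.0')

    # 1. Provision an instance from a glance image.
    server = nova.servers.create('test-vm', 'GLANCE_IMAGE_UUID', 'FLAVOR_ID')

    # (wait for the server to reach ACTIVE before snapshotting)

    # 2. Make a snapshot of this instance; returns the new image's UUID.
    snap_id = nova.servers.create_image(server, 'test-vm-snap')

    # 3. Try to provision an instance with that snapshot.
    # On Kilo with the ceph backend, this is the boot that errors out.
    nova.servers.create('from-snap', snap_id, 'FLAVOR_ID')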
Here's what the default behavior is supposed to be, from my observations:
- When the image is uploaded into ceph, a snapshot is created automatically inside ceph (this is NOT an instance snapshot per se, but a ceph-internal snapshot).
- When an instance is booted from an image in nova, this snapshot gets a clone in the nova ceph pool. Nova then uses that clone as the instance's disk. This is called copy-on-write cloning.
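To illustrate, the copy-on-write clone amounts to roughly the following with the python-rbd bindings. The pool names ('images', 'vms') and image names are assumptions about a typical setup, not values from ours:

    import rados
    import rbd

    # Pool names and UUIDs below are illustrative assumptions.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    glance_ioctx = cluster.open_ioctx('images')  # pool glance uploads to
    nova_ioctx = cluster.open_ioctx('vms')       # pool nova keeps disks in
    try:
        # Glance protects a snapshot named 'snap' on the uploaded image;
        # boot-from-image clones it into nova's pool as the instance disk.
        rbd.RBD().clone(glance_ioctx, 'GLANCE_IMAGE_UUID', 'snap',
                        nova_ioctx, 'INSTANCE_UUID_disk',
                        features=rbd.RBD_FEATURE_LAYERING)
    finally:
        glance_ioctx.close()
        nova_ioctx.close()
        cluster.shutdown()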
Here's where things get funky:
- When an instance is booted from a snapshot, the copy-on-write cloning
does not happen. Nova looks for the disk and, of course, fails to find
it in its pool, thus failing to provision the instance. There's no trace
anywhere of the copy-on-write clone failing (in part because ceph
doesn't log client commands, from what I can see).
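You can confirm the clone is simply absent by listing nova's pool with python-rbd. Again, 'vms' and the '<instance uuid>_disk' naming are assumptions about the usual conventions:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('vms')  # assumed nova pool name
    try:
        disks = rbd.RBD().list(ioctx)
        # After a failed boot-from-snapshot, the expected disk entry
        # is missing from this list.
        print('INSTANCE_UUID_disk' in disks, disks)
    finally:
        ioctx.close()
        cluster.shutdown()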
The compute logs I got are in this pastebin:
http://pastebin.com/ADHTEnhn
There are a few things I noticed here that I'd like to point out:
- Nova creates an ephemeral drive file, then proceeds to delete it before
using rbd_utils instead. While strange, this may be intended but somewhat
dirty behavior: nova treats it as an ephemeral instance before realizing
that it's actually a ceph-backed instance and doesn't need the ephemeral
disk. Or maybe these conjectures are completely wrong and this is part of
the issue.
-Nova "creates the image" (I'm guessing it's the copy-on-write cloning
happening here). What exactly happens here isn't very clear, but then it
complains that it can't find the clone in its pool to use as block
device.
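One quick way to see whether the clone was ever made is to ask the disk image for its parent: a copy-on-write clone reports the parent pool, image and snapshot, while a flat image (or a missing one) raises ImageNotFound. Pool and image names below are again assumptions:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('vms')  # assumed nova pool name
    try:
        with rbd.Image(ioctx, 'INSTANCE_UUID_disk', read_only=True) as img:
            # For a COW clone this prints (pool, parent image, snapshot).
            print(img.parent_info())
    except rbd.ImageNotFound:
        print('disk missing, or present but not a clone')
    finally:
        ioctx.close()
        cluster.shutdown()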
** Affects: nova
Importance: Undecided
Status: New
** Tags: ceph
** Tags added: ceph