yahoo-eng-team team mailing list archive

[Bug 1467570] Re: Nova can't provision instance from snapshot with a ceph backend

 

** Also affects: horizon
   Importance: Undecided
       Status: New

** Changed in: horizon
     Assignee: (unassigned) => lyanchih (lyanchih)

** Changed in: horizon
       Status: New => Confirmed

** Changed in: nova
       Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1467570

Title:
  Nova can't provision instance from snapshot with a ceph backend

Status in OpenStack Dashboard (Horizon):
  Confirmed
Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  This is a weird issue that does not happen in our Juno setup, but
  happens in our Kilo setup. The configuration of the two setups is
  pretty much the same, with only Kilo-specific changes made (namely,
  moving config lines to their new sections).

  Here's how to reproduce:
  1. Provision an instance.
  2. Make a snapshot of this instance.
  3. Try to provision an instance with that snapshot.
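
  The three steps above can be sketched as nova CLI calls. This is only
  an illustration: the flavor, image, and server names are made-up
  placeholders, not values from our setup, and the helper only builds
  the command lines instead of running them.

```python
def reproduce_commands(flavor="m1.small", base_image="cirros",
                       first="test-1", snapshot="test-1-snap",
                       second="test-2"):
    """Build the nova CLI invocations for reproduction steps 1-3.

    All default names are hypothetical placeholders.
    """
    return [
        # Step 1: provision an instance from a regular image.
        ["nova", "boot", "--flavor", flavor, "--image", base_image, first],
        # Step 2: make a snapshot of that instance.
        ["nova", "image-create", first, snapshot],
        # Step 3: provision a second instance from the snapshot.
        ["nova", "boot", "--flavor", flavor, "--image", snapshot, second],
    ]


if __name__ == "__main__":
    for argv in reproduce_commands():
        print(" ".join(argv))
```

  The snapshot from step 2 has to finish uploading before step 3 is
  attempted.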

  Nova-compute will complain that it can't find the disk and the
  instance will go into the error state.

  Here's what the default behavior is supposed to be from my observations:
  -When the image is uploaded into ceph, a snapshot is created automatically inside ceph (this is NOT an instance snapshot per se, but a ceph internal snapshot).
  -When an instance is booted from image in nova, this snapshot gets a clone in the nova ceph pool. Nova then uses that clone as the instance's disk. This is called copy-on-write cloning.
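
  As a rough sketch of the flow described above, these are the rbd
  commands that correspond to it. The pool names ("images", "vms"), the
  snapshot name ("snap"), and the "<uuid>_disk" naming are assumptions
  for illustration, not values taken from our configuration. The helper
  only builds the command lines; it does not talk to ceph.

```python
def cow_clone_commands(image_id, instance_uuid,
                       glance_pool="images", nova_pool="vms", snap="snap"):
    """Build the rbd commands behind copy-on-write cloning.

    Pool names, snapshot name, and disk naming are assumed defaults.
    """
    parent = "{}/{}@{}".format(glance_pool, image_id, snap)
    child = "{}/{}_disk".format(nova_pool, instance_uuid)
    return [
        # At image upload: snapshot the image inside ceph ...
        ["rbd", "snap", "create", parent],
        # ... and protect the snapshot so it can be cloned.
        ["rbd", "snap", "protect", parent],
        # At instance boot: clone the snapshot into the nova pool.
        ["rbd", "clone", parent, child],
    ]


if __name__ == "__main__":
    for argv in cow_clone_commands("IMAGE_ID", "INSTANCE_UUID"):
        print(" ".join(argv))
```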

  Here's when things get funky:
  -When an instance is booted from a snapshot, the copy-on-write
  cloning does not happen. Nova looks for the disk and, of course,
  fails to find it in its pool, thus failing to provision the instance.
  There's no trace anywhere of the copy-on-write clone failing (in part
  because ceph doesn't log client commands, from what I see).
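
  Since ceph leaves no trace of the failed clone, one way to confirm
  that the disk is missing is to list the nova pool and look for it.
  A minimal sketch, assuming the pool is called "vms" and the disk is
  named "<uuid>_disk" (both assumptions); the listing passed in is the
  text printed by "rbd ls <pool>".

```python
def list_pool_command(nova_pool="vms"):
    """Command that lists the images in the (assumed) nova pool."""
    return ["rbd", "ls", nova_pool]


def disk_in_pool(rbd_ls_output, instance_uuid):
    """Return True if the instance's disk appears in `rbd ls` output.

    Assumes one image name per line and a "<uuid>_disk" naming scheme.
    """
    expected = instance_uuid + "_disk"
    names = [line.strip() for line in rbd_ls_output.splitlines()]
    return expected in names


if __name__ == "__main__":
    # Made-up sample listing, not real output from our cluster.
    sample = "aaaa_disk\nbbbb_disk\n"
    print(disk_in_pool(sample, "aaaa"))
    print(disk_in_pool(sample, "cccc"))
```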

  The compute logs I got are in this pastebin:
  http://pastebin.com/ADHTEnhn

  There are a few things I noticed here that I'd like to point out:

  -Nova creates an ephemeral drive file, then proceeds to delete it
  before using rbd_utils instead. While strange, this may be the
  intended but somewhat dirty behavior, as nova considers it an
  ephemeral instance before realizing that it's actually a ceph
  instance and doesn't need its ephemeral disk. Or maybe these
  conjectures are completely wrong and this is part of the issue.

  -Nova "creates the image" (I'm guessing it's the copy-on-write cloning
  happening here). What exactly happens here isn't very clear, but then
  it complains that it can't find the clone in its pool to use as block
  device.

  This issue does not happen on ephemeral storage.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1467570/+subscriptions

