yahoo-eng-team team mailing list archive

[Bug 1669844] Re: Host failure shortly after image download can result in data corruption

 

Reviewed:  https://review.openstack.org/443230
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1301368bf2352eddcc664202d7f159f523f681e2
Submitter: Jenkins
Branch:    master

commit 1301368bf2352eddcc664202d7f159f523f681e2
Author: Matthew Booth <mbooth@xxxxxxxxxx>
Date:   Wed Mar 8 16:38:49 2017 +0000

    Ensure image conversion flushes output data to disk
    
    qemu-img convert defaults to cache=unsafe, which means that when it
    completes the output data may still only be in the host kernel's
    cache rather than on persistent storage. A host crash at this point
    will leave a file with the correct metadata (name, size, ownership,
    permissions), but no contents. This will prevent qcow2 instances on
    that compute host which use that image from restarting, and requires
    manual intervention from an operator to fix.
    
    See also change Id9905a87, which fixes this issue for downloads
    without a conversion.
    
    Closes-Bug: #1669844
    Change-Id: I33bd99b0752111ff7057f9bd40e58dcde77c7d95
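
A minimal sketch of the idea in the commit message above: pass an explicit
cache mode to qemu-img convert so the output is not left only in the host
kernel's cache. The helper name build_convert_command and the choice of
"writethrough" are assumptions made for illustration, not the code from the
actual change; only the -t (destination cache mode) and -O (output format)
options of qemu-img convert are relied on.

```python
def build_convert_command(source, dest, out_format, cache_mode="writethrough"):
    """Return an argv list for qemu-img convert with an explicit cache mode.

    Hypothetical helper: the default cache_mode here is an assumption.
    "writethrough" asks QEMU to flush each write through to storage;
    "none" bypasses the host page cache where direct I/O is supported.
    """
    return [
        "qemu-img", "convert",
        "-t", cache_mode,   # cache mode for the destination file
        "-O", out_format,   # output image format, e.g. "raw" or "qcow2"
        source,
        dest,
    ]
```

The returned list can be handed to subprocess.run(cmd, check=True) on a host
where qemu-img is installed.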


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1669844

Title:
  Host failure shortly after image download can result in data
  corruption

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  GlanceImageServiceV2.download() ensures its downloaded file is closed
  before releasing it for use by an external qemu process, but it doesn't
  do an fdatasync(). This means that the downloaded file may temporarily
  exist only in the host kernel's cache rather than on disk, leaving a
  short window in which a host crash will lose the contents of the
  backing file, despite it being in use by a running instance.
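
A minimal sketch of the missing step described above, assuming the download
arrives as an iterable of byte chunks. The function name
write_chunks_durably is hypothetical and this is not the nova
implementation; it only shows flushing and syncing a file before it is
closed and handed to another process.

```python
import os


def write_chunks_durably(chunks, path):
    """Write byte chunks to path and sync the data to disk before closing."""
    # os.fdatasync is not available on every platform; fall back to os.fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()         # move Python's userspace buffer into the kernel
        sync(f.fileno())  # ask the kernel to write the file data to storage
    return path
```

Without the sync call, close() alone only guarantees the data has reached
the kernel, not persistent storage.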

  Disclaimer: I'm not personally able to reproduce this, but it looks
  sane and our QE team is reliably hitting it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1669844/+subscriptions

