yahoo-eng-team team mailing list archive

[Bug 2006559] [NEW] Interrupted snapshot is not trigger during nova-compute restart service

 

Public bug reported:

Description
===========
When the nova-compute service initializes, it resets the state of an instance that was snapshotting, but it takes no action to stop the libvirt job in progress or to clean the snapshot directory.
Aborting the libvirt job during init and deleting the data generated in the snapshot directory would avoid leaving leaked data on the compute node.
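
The cleanup proposed above could look roughly like the following. This is a minimal sketch, not nova's actual code: the function name abort_interrupted_snapshot and its parameters are hypothetical, and the caller is assumed to already know which temporary directories belong to the interrupted snapshot. The domain argument is assumed to behave like a libvirt-python virDomain, whose blockJobInfo() returns an empty dict when no job is active.

```python
import shutil
from pathlib import Path


def abort_interrupted_snapshot(domain, disk_path, snapshot_tmp_dirs):
    """Abort a leftover block job and remove snapshot temp directories.

    Hypothetical helper: `domain` is assumed to expose the libvirt-python
    virDomain methods blockJobInfo(disk, flags) and blockJobAbort(disk, flags).
    Returns (job_was_aborted, list_of_removed_directories).
    """
    aborted = False
    # An empty dict means libvirt reports no active block job on this disk.
    if domain.blockJobInfo(disk_path, 0):
        domain.blockJobAbort(disk_path, 0)
        aborted = True

    removed = []
    for tmp_dir in map(Path, snapshot_tmp_dirs):
        # Only delete directories that still exist; missing ones are skipped.
        if tmp_dir.is_dir():
            shutil.rmtree(tmp_dir)
            removed.append(str(tmp_dir))
    return aborted, removed
```

Passing the directories explicitly, rather than deleting everything under snapshots_directory, is a deliberate choice in this sketch so that unrelated data is never removed.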

Steps to reproduce
==================
* use a devstack deployment
* add the option "snapshots_directory = /opt/stack/qemu/snaphot" to the [libvirt] section of nova.conf
* create an instance with local storage and some data
$ cat gen-data.sh
#!/bin/bash
head -c 10G </dev/urandom >bigfile

$ openstack server create --flavor m1.small --image 3f4b3e2c-9618-4504-b97b-89a5aa84241c --nic net-id=47b09fbe-9f4b-449a-9d78-58af11a798eb test-snapshot --key-name pierre-test --user-data gen-data.sh

* create a snapshot of the instance
$ openstack server image create --name snap1 6bfd51f6-694b-4f2f-935b-b3d554806d82

* restart nova-compute on the host during the snapshot
$ sudo systemctl restart devstack@n-cpu.service

* the instance goes to ACTIVE
* the block job is not aborted and the snapshot directory is not cleaned
$ virsh blockjob --info 6bfd51f6-694b-4f2f-935b-b3d554806d82 /opt/stack/data/nova/instances/6bfd51f6-694b-4f2f-935b-b3d554806d82/disk
Block Copy: [ 91 %]

$ ls -al /opt/stack/qemu/snaphot
total 12
drwxrwxr-x 3 stack stack   4096 Feb  8 07:47 .
drwxrwxr-x 3 stack stack   4096 Feb  8 07:45 ..
drwx-----x 2 stack libvirt 4096 Feb  8 07:47 tmp59vspxay
$ ls -al /opt/stack/qemu/snaphot/tmp59vspxay
total 10511512
drwx-----x 2 stack        libvirt        4096 Feb  8 07:47 .
drwxrwxr-x 3 stack        stack          4096 Feb  8 07:47 ..
-rw-r--r-- 1 libvirt-qemu kvm     10763829248 Feb  8 07:47 805be505a09b46bfa7b584f1ec14a52b.delta

Expected result
===============
* No block job running in libvirt
* No instance disk data in the snapshot directory

Actual result
=============
* A block job is still running in libvirt
* Instance disk data remains in the snapshot directory
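
The directory half of the expected and actual results can be checked with a small script instead of reading ls output by hand. This is an illustrative sketch only: the function name find_leftover_snapshot_data is hypothetical, and it assumes that leftover snapshot data lives in subdirectories whose names start with "tmp", as in the listing above.

```python
from pathlib import Path


def find_leftover_snapshot_data(snapshots_directory):
    """Return (path, size_in_bytes) for each file under tmp* subdirectories.

    Assumption: interrupted snapshots leave their data in directories named
    tmp* directly below the snapshots directory.
    """
    leftovers = []
    root = Path(snapshots_directory)
    for tmp_dir in sorted(root.glob("tmp*")):
        if not tmp_dir.is_dir():
            continue
        # Walk the temp directory and record every regular file found.
        for entry in sorted(tmp_dir.rglob("*")):
            if entry.is_file():
                leftovers.append((str(entry), entry.stat().st_size))
    return leftovers
```

Calling find_leftover_snapshot_data("/opt/stack/qemu/snaphot") after the restart and getting an empty list would correspond to the expected result; any entries correspond to the actual result reported here.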

Environment
===========
OpenStack master
Libvirt + KVM
Local storage

** Affects: nova
     Importance: Undecided
     Assignee: Pierre Libeau (pierre-libeau)
         Status: In Progress

** Changed in: nova
     Assignee: (unassigned) => Pierre Libeau (pierre-libeau)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2006559
