yahoo-eng-team team mailing list archive

[Bug 1552740] [NEW] Nova hard reboot fails to mount logical volume (LVM + libvirt-lxc)

 

Public bug reported:

Initially discovered with the experimental libvirt-lxc tempest gate job,
then pared down to a simpler test using Devstack on a node like those
used in our CI for devstack-gate tests. Here's an etherpad with many
details: https://etherpad.openstack.org/p/lxc_driver_devstack_gate.

The gist is that there appears to be a bug when hard rebooting a
libvirt-lxc instance in nova with the LVM storage backend: when nova
tries to mount the LV, the mount sometimes fails with:

```
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
    incoming.message))
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/opt/stack/nova/nova/exception.py", line 110, in wrapped
    payload)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
    return f(self, context, *args, **kw)
  File "/opt/stack/nova/nova/compute/manager.py", line 359, in decorated_function
    LOG.warning(msg, e, instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 328, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 409, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 387, in decorated_function
    kwargs['instance'], e, sys.exc_info())
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 375, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 3061, in reboot_instance
    self._set_instance_obj_error_state(context, instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 3042, in reboot_instance
    bad_volumes_callback=bad_volumes_callback)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2404, in reboot
    block_device_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2501, in _hard_reboot
    vifs_already_plugged=True)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4904, in _create_domain_and_network
    block_device_info, disk_info):
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4814, in _lxc_disk_handler
    block_device_info, disk_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4764, in _create_domain_setup_lxc
    container_dir=container_dir)
  File "/opt/stack/nova/nova/virt/disk/api.py", line 428, in setup_container
    raise exception.NovaException(img.errors)
NovaException:
--
Failed to mount filesystem: Unexpected error while running command.
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/stack-volumes-default/692321ba-dd42-4c31-84af-10ca2f10324d_disk /opt/stack/data/nova/instances/692321ba-dd42-4c31-84af-10ca2f10324d/rootfs
Exit code: 32
Stdout: u''
Stderr: u'mount: /dev/mapper/stack--volumes--default-692321ba--dd42--4c31--84af--10ca2f10324d_disk already mounted or /opt/stack/data/nova/instances/692321ba-dd42-4c31-84af-10ca2f10324d/rootfs busy\n'
```
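
The two device paths in the mount error above refer to the same logical
volume: device-mapper doubles each hyphen inside the VG and LV names and
joins the two with a single hyphen. As a rough diagnostic sketch (not
part of nova; the function names here are hypothetical), the following
maps a VG/LV pair to its /dev/mapper name and looks through
/proc/mounts-style text for an existing mount of either the device or
the rootfs directory:

```python
def dm_name(vg, lv):
    """Return the device-mapper name for an LVM VG/LV pair.

    Device-mapper doubles every hyphen inside each name and joins the
    two names with a single hyphen.
    """
    return "{}-{}".format(vg.replace("-", "--"), lv.replace("-", "--"))


def find_mounts(mounts_text, device_paths, mountpoint):
    """Return (device, mountpoint) pairs from /proc/mounts-style text that
    involve any of the given device paths or the given mountpoint."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        dev, mnt = fields[0], fields[1]
        if dev in device_paths or mnt == mountpoint:
            hits.append((dev, mnt))
    return hits


def existing_mounts(vg, lv, mountpoint, mounts_path="/proc/mounts"):
    """Check the live mount table for the LV under both of its names."""
    devices = {
        "/dev/{}/{}".format(vg, lv),
        "/dev/mapper/{}".format(dm_name(vg, lv)),
    }
    with open(mounts_path) as f:
        return find_mounts(f.read(), devices, mountpoint)
```

Called on the compute node with the VG, LV and rootfs path from the
failing command, a non-empty result from `existing_mounts` would suggest
that the previous mount was still in place when the hard reboot tried to
mount the LV again.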

I can recreate this fairly consistently in devstack using the same form
factor as nodes in our CI devstack-gate:

local.conf:
```
[[local|localrc]]
LIBVIRT_TYPE=lxc
NOVA_BACKEND=LVM
```

```
$ ./stack.sh
```
...

```
$ source openrc
$ for i in `seq 1 1 10`; do ( nova boot --image "cirros-0.3.4-x86_64-rootfs" --flavor 42 test$i & ); done
$ for i in `seq 1 1 10`; do ( nova reboot --hard test$i & ); done
```
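
For repeated runs, the two loops above could be driven from a small
script. This is only a sketch under assumptions: it expects the `nova`
CLI on PATH with the openrc credentials already in the environment, the
helper names are made up for illustration, and it does not wait for the
instances to become ACTIVE between the two steps.

```python
import subprocess

IMAGE = "cirros-0.3.4-x86_64-rootfs"  # image name from the commands above
FLAVOR = "42"                         # flavor id from the commands above


def instance_names(count, prefix="test"):
    """Return names test1..testN, matching the shell loops above."""
    return ["{}{}".format(prefix, i) for i in range(1, count + 1)]


def boot_cmd(name):
    """argv for booting one instance."""
    return ["nova", "boot", "--image", IMAGE, "--flavor", FLAVOR, name]


def reboot_cmd(name):
    """argv for hard rebooting one instance."""
    return ["nova", "reboot", "--hard", name]


def run_concurrently(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


# Example (not executed here):
#   names = instance_names(10)
#   run_concurrently([boot_cmd(n) for n in names])
#   run_concurrently([reboot_cmd(n) for n in names])
```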

After doing so, some of the instances should go into ERROR with the
traceback above in the compute log. Booting several instances at once
is meant to trigger the issue more reliably. It doesn't always happen,
but it has also happened several times when I've spun up just one
instance and tried.

I am running HEAD in Devstack when I see this problem.

Note: On this setup, nova's soft reboot is falling through to hard
reboot, I believe due to this bug:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.

** Affects: nova
     Importance: Undecided
         Status: New

** Description changed:

  I am running HEAD in Devstack when I see this problem.
  
- Some nuances:
+ Some nuance(s):
  
- 1. On this set up, nova reboot is falling through to hard reboot, I believe due to this bug: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.
- 2.
+ 1. On this set up, nova reboot is falling through to hard reboot, I
+ believe due to this bug:
+ https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.

** Description changed:

  I am running HEAD in Devstack when I see this problem.
  
- Some nuance(s):
- 
- 1. On this set up, nova reboot is falling through to hard reboot, I
+ Note: On this set up, nova reboot is falling through to hard reboot, I
  believe due to this bug:
  https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.

** Summary changed:

- Nova reboot fails to mount logical volume (LVM + libvirt-lxc)
+ Nova hard reboot fails to mount logical volume (LVM + libvirt-lxc)

** Description changed:

  ```
  $ source openrc
  $ for i in `seq 1 1 10`; do ( nova boot --image "cirros-0.3.4-x86_64-rootfs" --flavor 42 test$i & ); done
- $ for i in `seq 1 1 10`; do ( nova reboot test$i & ); done
+ $ for i in `seq 1 1 10`; do ( nova reboot --hard test$i & ); done
  ```
  

** Description changed:

  I am running HEAD in Devstack when I see this problem.
  
- Note: On this set up, nova reboot is falling through to hard reboot, I
- believe due to this bug:
+ Note: On this set up, nova's soft reboot is falling through to hard
+ reboot, I believe due to this bug:
  https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1552740

Title:
  Nova hard reboot fails to mount logical volume (LVM + libvirt-lxc)

Status in OpenStack Compute (nova):
  New

Bug description:
  Initially discovered with the experimental libvirt-lxc tempest gate
  job, then pared down to a simpler test using Devstack on a node like
  those used in our CI for devstack-gate tests. Here's an etherpad with
  many details:
  https://etherpad.openstack.org/p/lxc_driver_devstack_gate.

  The gist is that there appears to be a bug where hard rebooting a
  libvirt-lxc instance in nova, with the LVM storage backend, sometimes
  fails when nova tries to mount the LV:

  ```
  Traceback (most recent call last):
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
      incoming.message))
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch
      return self._do_dispatch(endpoint, method, ctxt, args)
    File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
      result = func(ctxt, **new_args)
    File "/opt/stack/nova/nova/exception.py", line 110, in wrapped
      payload)
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
      return f(self, context, *args, **kw)
    File "/opt/stack/nova/nova/compute/manager.py", line 359, in decorated_function
      LOG.warning(msg, e, instance=instance)
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/nova/nova/compute/manager.py", line 328, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 409, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 387, in decorated_function
      kwargs['instance'], e, sys.exc_info())
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/nova/nova/compute/manager.py", line 375, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 3061, in reboot_instance
      self._set_instance_obj_error_state(context, instance)
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/nova/nova/compute/manager.py", line 3042, in reboot_instance
      bad_volumes_callback=bad_volumes_callback)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2404, in reboot
      block_device_info)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2501, in _hard_reboot
      vifs_already_plugged=True)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4904, in _create_domain_and_network
      block_device_info, disk_info):
    File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
      return self.gen.next()
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4814, in _lxc_disk_handler
      block_device_info, disk_info)
    File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4764, in _create_domain_setup_lxc
      container_dir=container_dir)
    File "/opt/stack/nova/nova/virt/disk/api.py", line 428, in setup_container
      raise exception.NovaException(img.errors)
  NovaException:
  --
  Failed to mount filesystem: Unexpected error while running command.
  Command: sudo nova-rootwrap /etc/nova/rootwrap.conf mount /dev/stack-volumes-default/692321ba-dd42-4c31-84af-10ca2f10324d_disk /opt/stack/data/nova/instances/692321ba-dd42-4c31-84af-10ca2f10324d/rootfs
  Exit code: 32
  Stdout: u''
  Stderr: u'mount: /dev/mapper/stack--volumes--default-692321ba--dd42--4c31--84af--10ca2f10324d_disk already mounted or /opt/stack/data/nova/instances/692321ba-dd42-4c31-84af-10ca2f10324d/rootfs busy\n'
  ```
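
As a diagnostic aid, here is a minimal sketch (not part of nova) for checking whether the LV is still mounted somewhere when the error appears. It relies on the device-mapper naming visible in the stderr above: hyphens inside the VG and LV names are doubled and the two names are joined by a single hyphen. The function names are hypothetical, and the mounts text is expected to be in /proc/mounts format.

```python
def dm_name(vg, lv):
    """Return the device-mapper name for an LV.

    Hyphens inside the VG and LV names are doubled, and the two parts
    are joined with a single hyphen.
    """
    return "{}-{}".format(vg.replace("-", "--"), lv.replace("-", "--"))


def lv_mount_points(vg, lv, mounts_text):
    """Return the mount points where the LV appears as the source device.

    mounts_text is the content of a /proc/mounts-style file. Both the
    /dev/<vg>/<lv> path and the /dev/mapper/<dm name> path are matched,
    since either may be reported as the source.
    """
    candidates = {
        "/dev/{}/{}".format(vg, lv),
        "/dev/mapper/" + dm_name(vg, lv),
    }
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the source device, field 1 is the mount point.
        if len(fields) >= 2 and fields[0] in candidates:
            points.append(fields[1])
    return points
```

On the compute host, passing the contents of /proc/mounts along with the VG `stack-volumes-default` and the LV `692321ba-dd42-4c31-84af-10ca2f10324d_disk` would show whether a stale mount of the rootfs is still present before the hard reboot tries to mount it again.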

  I can recreate this fairly consistently in devstack using the same
  form factor as nodes in our CI devstack-gate:

  local.conf:
  ```
  [[local|localrc]]
  LIBVIRT_TYPE=lxc
  NOVA_BACKEND=LVM
  ```

  ```
  $ ./stack.sh
  ```
  ...

  ```
  $ source openrc
  $ for i in `seq 1 1 10`; do ( nova boot --image "cirros-0.3.4-x86_64-rootfs" --flavor 42 test$i & ); done
  $ for i in `seq 1 1 10`; do ( nova reboot --hard test$i & ); done
  ```
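
The loops above can also be driven from Python, which makes it easier to collect exit codes. This is only a sketch: it assumes the `nova` CLI is on PATH and that the credentials from openrc are already in the environment. The helper names are hypothetical; the image, flavor and flags are the ones used in the shell commands above.

```python
import subprocess

IMAGE = "cirros-0.3.4-x86_64-rootfs"
FLAVOR = "42"


def boot_cmd(name):
    """Build the `nova boot` command line for one instance."""
    return ["nova", "boot", "--image", IMAGE, "--flavor", FLAVOR, name]


def hard_reboot_cmd(name):
    """Build the `nova reboot --hard` command line for one instance."""
    return ["nova", "reboot", "--hard", name]


def run_concurrently(cmds, popen=subprocess.Popen):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as cmds. The popen
    parameter exists only so the launcher can be replaced in tests.
    """
    procs = [popen(cmd) for cmd in cmds]
    return [proc.wait() for proc in procs]


def reproduce(count=10):
    """Boot `count` instances in parallel, then hard reboot them all."""
    names = ["test{}".format(i) for i in range(1, count + 1)]
    boot_codes = run_concurrently([boot_cmd(n) for n in names])
    reboot_codes = run_concurrently([hard_reboot_cmd(n) for n in names])
    return boot_codes, reboot_codes
```

Note that a zero exit code from the CLI only means the request was accepted; the ERROR state still has to be checked on the instances and in the compute log.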

  After doing so, some of the instances should go into ERROR with the
  traceback above in the compute log. Booting many instances at once is
  meant to trigger the issue more reliably. It doesn't always happen,
  but it has also happened several times when I've spun up just one
  instance and tried.

  I am running HEAD in Devstack when I see this problem.

  Note: On this setup, nova's soft reboot is falling through to hard
  reboot. I believe this is due to the following bug:
  https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1536280.
