
yahoo-eng-team team mailing list archive

[Bug 1448333] [NEW] unable to start nova compute if there are inaccessible lvm iSCSI devices in /dev/disk/by-path

 

Public bug reported:

I saw this problem on a system that had been used for testing.  It appears
that at least one stale iSCSI device had been left behind in
/dev/disk/by-path.
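
One way to check for such entries is sketched below. It is a minimal
diagnostic, not part of nova: find_unopenable_devices is a hypothetical
helper name, and it assumes a stale entry shows up as a failure to open
the device node. It would normally need to run as root, since block
devices are not readable by ordinary users.

```python
import errno
import os


def find_unopenable_devices(directory="/dev/disk/by-path"):
    """Return (path, errno name) pairs for entries that cannot be opened.

    Hypothetical diagnostic helper; the name and return shape are
    assumptions made for this sketch.
    """
    problems = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # O_NONBLOCK keeps the open from hanging on unusual nodes.
            fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        except OSError as exc:
            # ENXIO is the errno behind "No such device or address".
            reason = errno.errorcode.get(exc.errno, str(exc.errno))
            problems.append((path, reason))
        else:
            os.close(fd)
    return problems
```

Calling find_unopenable_devices() with no arguments scans
/dev/disk/by-path; an entry reported with ENXIO would match the
"No such device or address" message from blockdev.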

I restarted nova-compute to enable debug logging for another issue
and saw the following in the log files:

2015-04-23 23:24:30.952 12377 ERROR nova.openstack.common.threadgroup [req-df828496-65f8-4bc1-9539-f937921c96c6 - - - - -] Unexpected error while running command.
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0
Exit code: 1
Stdout: u''
Stderr: u'blockdev: cannot open /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0: No such device or address\n'
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 145, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     x.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 47, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return self.thread.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return self._exit_event.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return hubs.get_hub().switch()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return self.greenlet.switch()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     result = function(*args, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/openstack/common/service.py", line 497, in run_service
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     service.start()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/service.py", line 183, in start
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     self.manager.pre_start_hook()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1293, in pre_start_hook
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     self.update_available_resource(nova.context.get_admin_context())
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6300, in update_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     rt.update_available_resource(context)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 376, in update_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     resources = self.driver.get_available_resource(self.nodename)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4908, in get_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     disk_over_committed = self._get_disk_over_committed_size_total()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6094, in _get_disk_over_committed_size_total
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     self._get_instance_disk_info(dom.name(), xml))
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6047, in _get_instance_disk_info
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     dk_size = lvm.get_volume_size(path)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/lvm.py", line 172, in get_volume_size
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     run_as_root=True)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 55, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return utils.execute(*args, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/nova/utils.py", line 206, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     return processutils.execute(*cmd, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup   File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 233, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup     cmd=sanitized_cmd)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup ProcessExecutionError: Unexpected error while running command.
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Exit code: 1
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Stdout: u''
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Stderr: u'blockdev: cannot open /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0: No such device or address\n'
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup

Other places in the code appear to handle a failure from
"get_volume_size", but _get_instance_disk_info does not.  Could this
problem be resolved by handling the error there?  As it stands, the
compute process errors out at that point and we get no further in
bringing the node up.
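
The kind of handling I have in mind could look roughly like the sketch
below. It is standalone so it can run outside nova: it calls blockdev
through subprocess rather than nova-rootwrap, and VolumeSizeError,
total_disk_size and the injectable run/size_of parameters are
hypothetical names for illustration only. In nova itself the exception
to catch would be the ProcessExecutionError shown in the traceback.

```python
import subprocess


class VolumeSizeError(Exception):
    """Raised when the size of a block device cannot be determined."""


def get_volume_size(path, run=subprocess.run):
    """Return the size in bytes of the block device at `path`.

    Uses the same `blockdev --getsize64` call seen in the traceback.
    `run` is injectable so the function can be tried without devices.
    """
    try:
        result = run(["blockdev", "--getsize64", path],
                     capture_output=True, text=True, check=True)
    except (subprocess.CalledProcessError, OSError) as exc:
        raise VolumeSizeError("cannot size %s: %s" % (path, exc)) from exc
    return int(result.stdout.strip())


def total_disk_size(paths, size_of=get_volume_size):
    """Sum device sizes, skipping devices whose size cannot be read."""
    total = 0
    skipped = []
    for path in paths:
        try:
            total += size_of(path)
        except VolumeSizeError:
            # A stale /dev/disk/by-path entry should not abort startup;
            # record it so it can be logged and cleaned up later.
            skipped.append(path)
    return total, skipped
```

Whether an inaccessible device should be skipped, counted as zero, or
reported some other way is a design choice for the nova developers;
the sketch only shows that the failure need not propagate up through
update_available_resource and stop the service from starting.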

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1448333

Title:
  unable to start nova compute if there are inaccessible lvm iSCSI
  devices in /dev/disk/by-path

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1448333/+subscriptions

