yahoo-eng-team team mailing list archive
Message #32378
[Bug 1448333] [NEW] unable to start nova compute if there are inaccessible lvm iSCSI devices in /dev/disk/by-path
Public bug reported:
I saw this problem on a system that had been used for testing. It appears
that at least one stale iSCSI device had been left behind in
/dev/disk/by-path.
I attempted to restart nova-compute to enable debug for another issue
and I saw the following in the log files:
2015-04-23 23:24:30.952 12377 ERROR nova.openstack.common.threadgroup [req-df828496-65f8-4bc1-9539-f937921c96c6 - - - - -] Unexpected error while running command.
Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0
Exit code: 1
Stdout: u''
Stderr: u'blockdev: cannot open /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0: No such device or address\n'
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 145, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup x.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/threadgroup.py", line 47, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/common/service.py", line 497, in run_service
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup service.start()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/service.py", line 183, in start
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1293, in pre_start_hook
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context())
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6300, in update_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 376, in update_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup resources = self.driver.get_available_resource(self.nodename)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4908, in get_available_resource
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup disk_over_committed = self._get_disk_over_committed_size_total()
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6094, in _get_disk_over_committed_size_total
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup self._get_instance_disk_info(dom.name(), xml))
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6047, in _get_instance_disk_info
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup dk_size = lvm.get_volume_size(path)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/lvm.py", line 172, in get_volume_size
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup run_as_root=True)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 55, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return utils.execute(*args, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/utils.py", line 206, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup return processutils.execute(*cmd, **kwargs)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 233, in execute
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup cmd=sanitized_cmd)
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup ProcessExecutionError: Unexpected error while running command.
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev --getsize64 /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Exit code: 1
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Stdout: u''
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup Stderr: u'blockdev: cannot open /dev/disk/by-path/ip-9.119.59.151:3260-iscsi-iqn.2010-10.org.openstack:volume-4b0d679c-bc90-46bd-951b-3eeeed04f302-lun-0: No such device or address\n'
2015-04-23 23:24:30.952 12377 TRACE nova.openstack.common.threadgroup
Other places in the code appear to handle a failure from
"get_volume_size", but _get_instance_disk_info does not. Could this
problem be resolved by handling the error there? As it stands, the
nova-compute process errors out at that point during startup and the
node never finishes coming up.
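
To illustrate the kind of handling I have in mind, here is a minimal
standalone sketch. It is not the nova code: VolumeSizeError is a
hypothetical stand-in for the ProcessExecutionError in the traceback,
total_disk_size is a hypothetical caller, and the blockdev call omits
sudo and nova-rootwrap. The idea is only that a device which cannot be
opened gets skipped and recorded instead of aborting the whole scan.

```python
import subprocess


class VolumeSizeError(Exception):
    """Hypothetical stand-in for ProcessExecutionError from the traceback."""


def get_volume_size(path):
    """Return the size in bytes of the block device at path.

    Runs the same blockdev command seen in the log, without
    sudo/nova-rootwrap (an assumption made to keep the sketch standalone).
    """
    try:
        out = subprocess.check_output(
            ["blockdev", "--getsize64", path], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError) as exc:
        raise VolumeSizeError("cannot size %s: %s" % (path, exc))
    return int(out)


def total_disk_size(paths, size_fn=get_volume_size):
    """Sum device sizes, skipping devices that cannot be sized.

    Returns (total_bytes, skipped_paths). size_fn is injectable so the
    logic can be exercised without real block devices.
    """
    total = 0
    skipped = []
    for path in paths:
        try:
            total += size_fn(path)
        except VolumeSizeError:
            # A stale by-path entry should not abort the whole scan.
            skipped.append(path)
    return total, skipped
```

Whether a skipped device should count as zero bytes or be reported some
other way is a design decision for whoever fixes this; the sketch just
keeps the list of skipped paths so they could be logged.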
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1448333