← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1257411] [NEW] Intermittent boot instance failure, "libvirt unable to read from monitor"

 

Public bug reported:

devstack install using master on precise intermittent failures when
trying to boot instances.  (cirros image, flavor 1).  Typically simply
running again this will work.  n-cpu logs contain the following trace:

2013-12-03 11:11:01.124 DEBUG nova.compute.manager [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Re-scheduling run_instance: attempt 1 from (pid=30610) _reschedule /opt/stack/nova/nova/compute/man
ager.py:1167
2013-12-03 11:11:01.124 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553
2013-12-03 11:11:01.125 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] MSG_ID is 7b83e1059204445ba23ed876943eea2d from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:556
2013-12-03 11:11:01.125 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is 67de22630ca94eee9f409ee8aeaece1c. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Got semaphore "compute_resources" from (pid=30610) lock /opt/stack/nova/nova/openstack/common/lockutils.py:167
2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Got semaphore / lock "update_usage" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:247
2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Semaphore / lock released "update_usage" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:251
2013-12-03 11:11:01.239 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making asynchronous cast on scheduler... from (pid=30610) cast /opt/stack/nova/nova/openstack/common/rpc/amqp.py:582
2013-12-03 11:11:01.239 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is ce107999c53949fa8aef7d13586a3d5a. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
2013-12-03 11:11:01.247 ERROR nova.compute.manager [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Error: Unable to read from monitor: Connection reset by peer
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Traceback (most recent call last):
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1049, in _build_instance
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     set_access_ip=set_access_ip)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1453, in _spawn
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     LOG.exception(_('Instance failed to spawn'), instance=instance)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1450, in _spawn
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     block_device_info)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2161, in spawn
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     block_device_info)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3395, in _create_domain_and_network
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain = self._create_domain(xml, instance=instance, power_on=power_on)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3338, in _create_domain
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain.XMLDesc(0))
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3333, in _create_domain
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain.createWithFlags(launch_flags)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 179, in doit
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 139, in proxy_call
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     rv = execute(f,*args,**kwargs)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 77, in tworker
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     rv = meth(*args,**kwargs)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 581, in createWithFlags
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] libvirtError: Unable to read from monitor: Connection reset by peer
2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]
2013-12-03 11:11:01.256 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Semaphore / lock released "do_run_instance" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:251
2013-12-03 11:11:01.260 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553
2013-12-03 11:11:01.262 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] MSG_ID is 59e775b22f894384b10b6eee9ad0744c from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:556
2013-12-03 11:11:01.263 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is 8fab369e677f47a48452cc65e26631a1. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
2013-12-03 11:11:09.109 DEBUG nova.openstack.common.rpc.amqp [-] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1257411

Title:
  Intermittent boot instance failure, "libvirt unable to read from
  monitor"

Status in OpenStack Compute (Nova):
  New

Bug description:
  devstack install using master on precise intermittent failures when
  trying to boot instances.  (cirros image, flavor 1).  Typically simply
  running again this will work.  n-cpu logs contain the following trace:

  2013-12-03 11:11:01.124 DEBUG nova.compute.manager [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Re-scheduling run_instance: attempt 1 from (pid=30610) _reschedule /opt/stack/nova/nova/compute/man
  ager.py:1167
  2013-12-03 11:11:01.124 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553
  2013-12-03 11:11:01.125 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] MSG_ID is 7b83e1059204445ba23ed876943eea2d from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:556
  2013-12-03 11:11:01.125 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is 67de22630ca94eee9f409ee8aeaece1c. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
  2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Got semaphore "compute_resources" from (pid=30610) lock /opt/stack/nova/nova/openstack/common/lockutils.py:167
  2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Got semaphore / lock "update_usage" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:247
  2013-12-03 11:11:01.237 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Semaphore / lock released "update_usage" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:251
  2013-12-03 11:11:01.239 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making asynchronous cast on scheduler... from (pid=30610) cast /opt/stack/nova/nova/openstack/common/rpc/amqp.py:582
  2013-12-03 11:11:01.239 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is ce107999c53949fa8aef7d13586a3d5a. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
  2013-12-03 11:11:01.247 ERROR nova.compute.manager [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Error: Unable to read from monitor: Connection reset by peer
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] Traceback (most recent call last):
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1049, in _build_instance
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     set_access_ip=set_access_ip)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1453, in _spawn
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     LOG.exception(_('Instance failed to spawn'), instance=instance)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/compute/manager.py", line 1450, in _spawn
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     block_device_info)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2161, in spawn
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     block_device_info)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3395, in _create_domain_and_network
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain = self._create_domain(xml, instance=instance, power_on=power_on)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3338, in _create_domain
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain.XMLDesc(0))
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 3333, in _create_domain
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     domain.createWithFlags(launch_flags)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 179, in doit
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     result = proxy_call(self._autowrap, f, *args, **kwargs)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 139, in proxy_call
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     rv = execute(f,*args,**kwargs)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 77, in tworker
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     rv = meth(*args,**kwargs)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 581, in createWithFlags
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96] libvirtError: Unable to read from monitor: Connection reset by peer
  2013-12-03 11:11:01.247 TRACE nova.compute.manager [instance: e7b65a0e-d9cf-4d4e-939c-5424e45c3d96]
  2013-12-03 11:11:01.256 DEBUG nova.openstack.common.lockutils [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Semaphore / lock released "do_run_instance" from (pid=30610) inner /opt/stack/nova/nova/openstack/common/lockutils.py:251
  2013-12-03 11:11:01.260 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553
  2013-12-03 11:11:01.262 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] MSG_ID is 59e775b22f894384b10b6eee9ad0744c from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:556
  2013-12-03 11:11:01.263 DEBUG nova.openstack.common.rpc.amqp [req-a228c55f-5050-4324-9853-6e72036e9449 demo demo] UNIQUE_ID is 8fab369e677f47a48452cc65e26631a1. from (pid=30610) _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
  2013-12-03 11:11:09.109 DEBUG nova.openstack.common.rpc.amqp [-] Making synchronous call on conductor ... from (pid=30610) multicall /opt/stack/nova/nova/openstack/common/rpc/amqp.py:553

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1257411/+subscriptions


Follow ups

References