openstack team mailing list archive
Message #15670
Trouble getting instances back up after hard server reboot
Hi all,
I am having a terrible time getting my instances to work after a hard
reboot. I am using the most up-to-date versions of all OpenStack
packages provided by Ubuntu. I have included a list of packages, with
versions, at the end of this email.
After a hard reboot, "nova list" reports that the instance is active,
but there are no kvm processes running. Grepping the log file for
errors, I find this in nova-compute.log:
2012-08-09 14:32:51 INFO nova.rpc.common [req-dd6fcade-73ec-4378-9a6b-7bc709eefcd4 None None] Connected to AMQP server on cloudy-priv:5672
2012-08-09 14:33:51 ERROR nova.rpc.common [req-dd6fcade-73ec-4378-9a6b-7bc709eefcd4 None None] Timed out waiting for RPC response: timed out
2012-08-09 14:33:51 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-08-09 14:33:51 TRACE nova.rpc.common return method(*args, **kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-08-09 14:33:51 TRACE nova.rpc.common return self.connection.drain_events(timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common return self.transport.drain_events(self.connection, **kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common return connection.drain_events(**kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common return self.wait_multi(self.channels.values(), timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
2012-08-09 14:33:51 TRACE nova.rpc.common chanmap.keys(), allowed_methods, timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
2012-08-09 14:33:51 TRACE nova.rpc.common channel, method_sig, args, content = read_timeout(timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
2012-08-09 14:33:51 TRACE nova.rpc.common return self.method_reader.read_method()
2012-08-09 14:33:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
2012-08-09 14:33:51 TRACE nova.rpc.common raise m
2012-08-09 14:33:51 TRACE nova.rpc.common timeout: timed out
2012-08-09 14:33:51 TRACE nova.rpc.common
2012-08-09 14:33:51 CRITICAL nova [-] Timeout while waiting on RPC response.
Restarting nova-compute brings the instance up, so it looks like
nova-compute is starting before rabbitmq. Is there a clean way
around this, or should I put "service nova-compute restart" in
rc.local?
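If start order really is the cause, one option short of a blind restart is to have rc.local wait until the broker answers before restarting nova-compute. The following is only a sketch under that assumption: the function names retry_until and restart_compute_when_broker_ready are made up for illustration, and the 60 attempts at 1-second intervals are arbitrary. It relies on "rabbitmqctl status" exiting non-zero while the broker is not running.

```shell
#!/bin/sh
# retry_until TRIES DELAY CMD [ARGS...]
# Run CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
# (Hypothetical helper, not part of nova or rabbitmq.)
retry_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Restart nova-compute only once the broker responds.
# Assumption: run as root (as rc.local is), so rabbitmqctl is permitted.
restart_compute_when_broker_ready() {
    retry_until 60 1 rabbitmqctl status && service nova-compute restart
}
```

In rc.local this would be called as "restart_compute_when_broker_ready". Note that the log above shows nova-compute connecting to AMQP and then timing out on an RPC reply, so the service it is waiting on may be something other than rabbitmq itself; the same helper can wrap a different readiness check in that case.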
If I have a volume attached, things get much worse. I can still start
the instance by restarting nova-compute, but the volume does not
attach, and I cannot seem to detach the volume in order to attach it
again. Below is the only error in the log file, followed by how I attach
the image that contains the nova-volumes volume group. The error occurs
because nova-volume is started before the loopback device is
set up. The command in rc.local restarts the service, making the
volume group available.
From nova-volume.log:
2012-08-09 14:32:40 CRITICAL nova [-] volume group nova-volumes doesn't exist
From rc.local:
losetup -f /var/lib/nova/nova-volumes.img
service nova-volume restart
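A slightly more defensive variant of the two rc.local lines above might attach the image only if it is not already attached, and confirm that LVM can see the volume group before restarting the service. This is a sketch, not a tested fix: the image path and volume group name come from this message, while the function name prepare_volume_group and the overridable LOSETUP and VGS variables are assumptions added for illustration.

```shell
#!/bin/sh
# IMG and VG are taken from the message; everything else is hypothetical.
IMG=${IMG:-/var/lib/nova/nova-volumes.img}
VG=${VG:-nova-volumes}
LOSETUP=${LOSETUP:-losetup}
VGS=${VGS:-vgs}

# Attach IMG to a free loop device unless it is already attached, then
# return success only if LVM reports the volume group VG.
prepare_volume_group() {
    # "losetup -j FILE" lists loop devices backed by FILE (empty if none).
    if [ -z "$($LOSETUP -j "$IMG")" ]; then
        $LOSETUP -f "$IMG" || return 1
    fi
    # "vgs NAME" exits non-zero when the volume group cannot be found.
    $VGS "$VG" >/dev/null 2>&1
}
```

In rc.local this could be used as "prepare_volume_group && service nova-volume restart", so the restart is skipped when the volume group is still missing. It does not address volumes that were attached to instances before the reboot.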
Any idea how I should solve these problems? I could stop upstart
from bringing the services up automatically and start them in the
correct order in rc.local, but I don't think this would solve the
volume attachment issue.
I am so frustrated that I created this script for testing, which
completely resets the nova database tables and iptables, and recreates
everything.
http://paste2.org/p/2100211
I know it is a dirty, dirty hack, but I can't seem to figure out what
is going on.
Thanks in advance for the help.
Sam
root@cloudy:/var/log/nova# dpkg -l | grep -E "(nova|glance|keystone|tgt|rabbit|ntp|mysql|libvirt|kvm)"
ii  glance                2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Daemons
ii  glance-api            2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - API
ii  glance-client         2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Registry
ii  glance-common         2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Common
ii  glance-registry       2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Registry
ii  keystone              2012.1+stable~20120608-aff45d6-0ubuntu1    OpenStack identity service - Daemons
ii  kvm                   1:84+dfsg-0ubuntu16+1.0+noroms+0ubuntu14.1 dummy transitional package from kvm to qemu-kvm
ii  kvm-ipxe              1.0.0+git-3.55f6c88-0ubuntu1               PXE ROM's for KVM
ii  libdbd-mysql-perl     4.020-1build2                              Perl5 database interface to the MySQL database
ii  libmysqlclient18      5.5.24-0ubuntu0.12.04.1                    MySQL database client library
ii  libsys-virt-perl      0.9.7-2                                    Perl module providing an extension for the libvirt library
ii  libvirt-bin           0.9.8-2ubuntu17.3                          programs for the libvirt library
ii  libvirt0              0.9.8-2ubuntu17.3                          library for interfacing with different virtualization systems
ii  mysql-client-5.5      5.5.24-0ubuntu0.12.04.1                    MySQL database client binaries
ii  mysql-client-core-5.5 5.5.24-0ubuntu0.12.04.1                    MySQL database core client binaries
ii  mysql-common          5.5.24-0ubuntu0.12.04.1                    MySQL database common files, e.g. /etc/mysql/my.cnf
ii  mysql-server          5.5.24-0ubuntu0.12.04.1                    MySQL database server (metapackage depending on the latest version)
ii  mysql-server-5.5      5.5.24-0ubuntu0.12.04.1                    MySQL database server binaries and system database setup
ii  mysql-server-core-5.5 5.5.24-0ubuntu0.12.04.1                    MySQL database server binaries
ii  nova-api              2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - API frontend
ii  nova-common           2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - common files
ii  nova-compute          2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - compute node
ii  nova-compute-kvm      2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - compute node (KVM)
ii  nova-network          2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - Network manager
ii  nova-scheduler        2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - virtual machine scheduler
ii  nova-volume           2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - storage
ii  ntp                   1:4.2.6.p3+dfsg-1ubuntu3.1                 Network Time Protocol daemon and utility programs
ii  ntpdate               1:4.2.6.p3+dfsg-1ubuntu3.1                 client for setting system time from NTP servers
ii  python-glance         2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Python library
ii  python-keystone       2012.1+stable~20120608-aff45d6-0ubuntu1    OpenStack identity service - Python library
ii  python-keystoneclient 2012.1-0ubuntu1                            Client libary for Openstack Keystone API
ii  python-libvirt        0.9.8-2ubuntu17.3                          libvirt Python bindings
ii  python-mysqldb        1.2.3-1build1                              Python interface to MySQL
ii  python-nova           2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute Python libraries
ii  python-novaclient     2012.1-0ubuntu1                            client library for OpenStack Compute API
ii  qemu-kvm              1.0+noroms-0ubuntu14.1                     Full virtualization on i386 and amd64 hardware
ii  rabbitmq-server       2.7.1-0ubuntu4                             An AMQP server written in Erlang
ii  tgt                   1:1.0.17-1ubuntu2                          Linux SCSI target user-space tools