openstack team mailing list archive

Trouble getting instances back up after hard server reboot

Hi all,


I am having a terrible time getting my instances to work after a hard
reboot.  I am using the most up-to-date version of all OpenStack
packages provided by Ubuntu.  I have included a list of packages, with
versions, at the end of this email.

After a hard reboot, "nova list" reports that the instance is active,
but there are no kvm processes running.  Grepping the log file for
errors, I find this in nova-compute.log:


2012-08-09 14:32:51 INFO nova.rpc.common [req-dd6fcade-73ec-4378-9a6b-7bc709eefcd4 None None] Connected to AMQP server on cloudy-priv:5672
2012-08-09 14:33:51 ERROR nova.rpc.common [req-dd6fcade-73ec-4378-9a6b-7bc709eefcd4 None None] Timed out waiting for RPC response: timed out
2012-08-09 14:33:51 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-08-09 14:33:51 TRACE nova.rpc.common     return method(*args, **kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-08-09 14:33:51 TRACE nova.rpc.common     return self.connection.drain_events(timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common     return self.transport.drain_events(self.connection, **kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common     return connection.drain_events(**kwargs)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
2012-08-09 14:33:51 TRACE nova.rpc.common     return self.wait_multi(self.channels.values(), timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
2012-08-09 14:33:51 TRACE nova.rpc.common     chanmap.keys(), allowed_methods, timeout=timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
2012-08-09 14:33:51 TRACE nova.rpc.common     channel, method_sig, args, content = read_timeout(timeout)
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
2012-08-09 14:33:51 TRACE nova.rpc.common     return self.method_reader.read_method()
2012-08-09 14:33:51 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
2012-08-09 14:33:51 TRACE nova.rpc.common     raise m
2012-08-09 14:33:51 TRACE nova.rpc.common timeout: timed out
2012-08-09 14:33:51 TRACE nova.rpc.common
2012-08-09 14:33:51 CRITICAL nova [-] Timeout while waiting on RPC response.

Restarting nova-compute brings the instance up, so it looks like
nova-compute is starting before rabbitmq.  Is there a clean way
around this, or should I put "service nova-compute restart" in
rc.local?
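
One alternative to a blind restart in rc.local is to poll for the
dependency first and restart only once it answers.  The sketch below is
an illustration only: wait_for is a hypothetical helper, not part of
nova or Ubuntu, and the usage lines assume that nc (netcat) is installed
and that the broker listens on cloudy-priv:5672 as in the log above.
Note that the log shows a successful AMQP connection one minute before
the timeout, so the component that failed to answer may not be rabbitmq
itself; the probe is passed in as a command for that reason.

```shell
#!/bin/sh
# wait_for TRIES CMD [ARGS...]
# Hypothetical helper: run CMD up to TRIES times, one second apart.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Possible use in rc.local (left commented out; host and port are
# taken from the log, the 60-attempt limit is an arbitrary choice):
# if wait_for 60 nc -z cloudy-priv 5672; then
#     service nova-compute restart
# fi
```

Because the probe is just a command, the same helper could wait on
something other than the AMQP port if that turns out to be the service
nova-compute is really waiting for.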



If I have a volume attached, things get much worse.  I can still start
the instance by restarting nova-compute, but the volume does not
attach, and I cannot seem to detach the volume in order to attach it
again.  Below are the only error in the log file and the rc.local
commands I use to attach the image that contains the nova-volumes
volume group.  The error occurs because nova-volume is started before
the loopback device is set up.  The commands in rc.local attach the
device and restart the service, making the volume group available.

From nova-volume.log:

2012-08-09 14:32:40 CRITICAL nova [-] volume group nova-volumes doesn't exist

From rc.local:

losetup -f /var/lib/nova/nova-volumes.img
service nova-volume restart
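
The two rc.local lines above attach a new loop device on every run, so
running them twice leaves the image attached twice.  A minimal sketch of
a more defensive version follows.  ensure_loop is a hypothetical helper
name; it relies on "losetup -j FILE", which lists loop devices already
associated with FILE.  The explicit "vgchange -ay" step is an
assumption: it may or may not be needed for the volume group to be
activated on this setup.

```shell
#!/bin/sh
# ensure_loop IMAGE
# Hypothetical helper: attach IMAGE to the first free loop device
# unless "losetup -j" reports that it is already attached.
ensure_loop() {
    img=$1
    if [ -n "$(losetup -j "$img")" ]; then
        return 0    # already attached, nothing to do
    fi
    losetup -f "$img"
}

# Possible use in rc.local (left commented out; the image path and
# the volume group name are the ones from this email):
# ensure_loop /var/lib/nova/nova-volumes.img
# vgchange -ay nova-volumes     # assumption: activate the VG explicitly
# service nova-volume restart
```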

Any idea how I should solve these problems?  I could disable upstart
from bringing the services up automatically and start them in the
correct order in rc.local, but I don't think this would solve the
volume attachment issue.

I am so frustrated that I wrote this test script, which completely
resets the nova database and iptables and then recreates everything:
http://paste2.org/p/2100211

I know it is a dirty, dirty hack, but I can't seem to figure out what
is going on.

Thanks in advance for the help.
Sam


root@cloudy:/var/log/nova# dpkg -l | grep -E "(nova|glance|keystone|tgt|rabbit|ntp|mysql|libvirt|kvm)"
ii  glance                 2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Daemons
ii  glance-api             2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - API
ii  glance-client          2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Registry
ii  glance-common          2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Common
ii  glance-registry        2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Registry
ii  keystone               2012.1+stable~20120608-aff45d6-0ubuntu1    OpenStack identity service - Daemons
ii  kvm                    1:84+dfsg-0ubuntu16+1.0+noroms+0ubuntu14.1 dummy transitional package from kvm to qemu-kvm
ii  kvm-ipxe               1.0.0+git-3.55f6c88-0ubuntu1               PXE ROM's for KVM
ii  libdbd-mysql-perl      4.020-1build2                              Perl5 database interface to the MySQL database
ii  libmysqlclient18       5.5.24-0ubuntu0.12.04.1                    MySQL database client library
ii  libsys-virt-perl       0.9.7-2                                    Perl module providing an extension for the libvirt library
ii  libvirt-bin            0.9.8-2ubuntu17.3                          programs for the libvirt library
ii  libvirt0               0.9.8-2ubuntu17.3                          library for interfacing with different virtualization systems
ii  mysql-client-5.5       5.5.24-0ubuntu0.12.04.1                    MySQL database client binaries
ii  mysql-client-core-5.5  5.5.24-0ubuntu0.12.04.1                    MySQL database core client binaries
ii  mysql-common           5.5.24-0ubuntu0.12.04.1                    MySQL database common files, e.g. /etc/mysql/my.cnf
ii  mysql-server           5.5.24-0ubuntu0.12.04.1                    MySQL database server (metapackage depending on the latest version)
ii  mysql-server-5.5       5.5.24-0ubuntu0.12.04.1                    MySQL database server binaries and system database setup
ii  mysql-server-core-5.5  5.5.24-0ubuntu0.12.04.1                    MySQL database server binaries
ii  nova-api               2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - API frontend
ii  nova-common            2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - common files
ii  nova-compute           2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - compute node
ii  nova-compute-kvm       2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - compute node (KVM)
ii  nova-network           2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - Network manager
ii  nova-scheduler         2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - virtual machine scheduler
ii  nova-volume            2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute - storage
ii  ntp                    1:4.2.6.p3+dfsg-1ubuntu3.1                 Network Time Protocol daemon and utility programs
ii  ntpdate                1:4.2.6.p3+dfsg-1ubuntu3.1                 client for setting system time from NTP servers
ii  python-glance          2012.1+stable~20120608-5462295-0ubuntu2.2  OpenStack Image Registry and Delivery Service - Python library
ii  python-keystone        2012.1+stable~20120608-aff45d6-0ubuntu1    OpenStack identity service - Python library
ii  python-keystoneclient  2012.1-0ubuntu1                            Client libary for Openstack Keystone API
ii  python-libvirt         0.9.8-2ubuntu17.3                          libvirt Python bindings
ii  python-mysqldb         1.2.3-1build1                              Python interface to MySQL
ii  python-nova            2012.1+stable~20120612-3ee026e-0ubuntu1.2  OpenStack Compute Python libraries
ii  python-novaclient      2012.1-0ubuntu1                            client library for OpenStack Compute API
ii  qemu-kvm               1.0+noroms-0ubuntu14.1                     Full virtualization on i386 and amd64 hardware
ii  rabbitmq-server        2.7.1-0ubuntu4                             An AMQP server written in Erlang
ii  tgt                    1:1.0.17-1ubuntu2                          Linux SCSI target user-space tools
