yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #26346
[Bug 1404241] [NEW] nova-compute state not updated
Public bug reported:
I'm running 2014.2.1 on CentOS7. 1 controller and 5 compute nodes are
deployed using packstack.
Whenever I reboot the controller node, some of nova-compute services
report state=XXX even after 60 minutes after reboot completed and
controller node is up and running again:
[root@juno1 ~(keystone_admin)]# nova-manage service list
Binary Host Zone Status State Updated_At
nova-consoleauth juno1 internal enabled :-) 2014-12-19 13:17:48
nova-scheduler juno1 internal enabled :-) 2014-12-19 13:17:47
nova-conductor juno1 internal enabled :-) 2014-12-19 13:17:47
nova-cert juno1 internal enabled :-) 2014-12-19 13:17:48
nova-compute juno4 nova enabled XXX 2014-12-19 12:26:56
nova-compute juno5 nova enabled :-) 2014-12-19 13:17:47
nova-compute juno6 nova enabled :-) 2014-12-19 13:17:46
nova-compute juno3 nova enabled :-) 2014-12-19 13:17:46
nova-compute juno2 nova enabled XXX 2014-12-19 12:21:52
Here is the chunk of nova-compute log from juno4:
2014-12-19 15:46:02.082 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:02.083 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to consume message from queue: Socket closed
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 655, in ensure
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return method()
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 735, in _consume
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.connection.drain_events(timeout=timeout)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 281, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.transport.drain_events(self.connection, **kwargs)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 94, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return connection.drain_events(**kwargs)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 299, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit chanmap, None, timeout=timeout,
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 362, in _wait_multiple
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit channel, method_sig, args, content = read_timeout(timeout)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 333, in read_timeout
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.method_reader.read_method()
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 189, in read_method
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit raise m
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit IOError: Socket closed
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit
2014-12-19 15:46:02.084 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:03.084 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:03.096 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:03.105 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 9.167.185.90:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-12-19 15:46:03.105 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 9.167.185.90:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-12-19 15:46:04.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:04.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:05.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.116 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.157 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.159 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:33.229 5193 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2014-12-19 15:46:33.781 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to publish message to topic 'conductor': [Errno 104] Connection reset by peer
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 655, in ensure
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit return method()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 752, in _publish
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit publisher = cls(self.conf, self.channel, topic=topic, **kwargs)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 378, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit **options)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 326, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.reconnect(channel)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 334, in reconnect
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit routing_key=self.routing_key)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 83, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.revive(self._channel)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 214, in revive
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.declare()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 100, in declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.exchange.declare()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 163, in declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit nowait=nowait, passive=passive,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 613, in exchange_declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self._send_method((40, 10), args)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 56, in _send_method
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.channel_id, method_sig, args, content,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 221, in write_method
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit write_frame(1, channel, payload)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 177, in write_frame
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit frame_type, channel, size, payload, 0xce,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 359, in sendall
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit tail = self.send(data, flags)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 342, in send
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit total_sent += fd.send(data[total_sent:], flags)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit error: [Errno 104] Connection reset by peer
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit
2014-12-19 15:46:33.782 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:34.782 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:34.796 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
I see, that nova-compute lost AMQ connection with controller node because of reboot, but then connection was established again after reboot completed.
I also can manage running instances on that compute node and see correct instance states.
The problem can be solved if I do: systemctl restart openstack-nova-
compute on nodes with XXX state. After that I see expected :-)
** Affects: nova
Importance: Undecided
Status: New
** Attachment added: "nova.conf on compute node"
https://bugs.launchpad.net/bugs/1404241/+attachment/4284163/+files/nova.conf
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1404241
Title:
nova-compute state not updated
Status in OpenStack Compute (Nova):
New
Bug description:
I'm running 2014.2.1 on CentOS7. 1 controller and 5 compute nodes are
deployed using packstack.
Whenever I reboot the controller node, some of nova-compute services
report state=XXX even after 60 minutes after reboot completed and
controller node is up and running again:
[root@juno1 ~(keystone_admin)]# nova-manage service list
Binary Host Zone Status State Updated_At
nova-consoleauth juno1 internal enabled :-) 2014-12-19 13:17:48
nova-scheduler juno1 internal enabled :-) 2014-12-19 13:17:47
nova-conductor juno1 internal enabled :-) 2014-12-19 13:17:47
nova-cert juno1 internal enabled :-) 2014-12-19 13:17:48
nova-compute juno4 nova enabled XXX 2014-12-19 12:26:56
nova-compute juno5 nova enabled :-) 2014-12-19 13:17:47
nova-compute juno6 nova enabled :-) 2014-12-19 13:17:46
nova-compute juno3 nova enabled :-) 2014-12-19 13:17:46
nova-compute juno2 nova enabled XXX 2014-12-19 12:21:52
Here is the chunk of nova-compute log from juno4:
2014-12-19 15:46:02.082 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:02.083 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to consume message from queue: Socket closed
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 655, in ensure
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return method()
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 735, in _consume
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.connection.drain_events(timeout=timeout)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 281, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.transport.drain_events(self.connection, **kwargs)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 94, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return connection.drain_events(**kwargs)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 299, in drain_events
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit chanmap, None, timeout=timeout,
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 362, in _wait_multiple
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit channel, method_sig, args, content = read_timeout(timeout)
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 333, in read_timeout
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit return self.method_reader.read_method()
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 189, in read_method
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit raise m
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit IOError: Socket closed
2014-12-19 15:46:02.083 5193 TRACE oslo.messaging._drivers.impl_rabbit
2014-12-19 15:46:02.084 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:03.084 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:03.096 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:03.105 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 9.167.185.90:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-12-19 15:46:03.105 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 9.167.185.90:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-12-19 15:46:04.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:04.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:05.106 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.116 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.157 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:05.159 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:33.229 5193 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2014-12-19 15:46:33.781 5193 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to publish message to topic 'conductor': [Errno 104] Connection reset by peer
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 655, in ensure
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit return method()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 752, in _publish
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit publisher = cls(self.conf, self.channel, topic=topic, **kwargs)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 378, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit **options)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 326, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.reconnect(channel)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/impl_rabbit.py", line 334, in reconnect
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit routing_key=self.routing_key)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 83, in __init__
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.revive(self._channel)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 214, in revive
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.declare()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 100, in declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.exchange.declare()
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 163, in declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit nowait=nowait, passive=passive,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 613, in exchange_declare
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self._send_method((40, 10), args)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 56, in _send_method
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit self.channel_id, method_sig, args, content,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 221, in write_method
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit write_frame(1, channel, payload)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 177, in write_frame
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit frame_type, channel, size, payload, 0xce,
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 359, in sendall
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit tail = self.send(data, flags)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio.py", line 342, in send
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit total_sent += fd.send(data[total_sent:], flags)
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit error: [Errno 104] Connection reset by peer
2014-12-19 15:46:33.781 5193 TRACE oslo.messaging._drivers.impl_rabbit
2014-12-19 15:46:33.782 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2014-12-19 15:46:34.782 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 9.167.185.90:5672
2014-12-19 15:46:34.796 5193 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 9.167.185.90:5672
I see, that nova-compute lost AMQ connection with controller node because of reboot, but then connection was established again after reboot completed.
I also can manage running instances on that compute node and see correct instance states.
The problem can be solved if I do: systemctl restart openstack-nova-
compute on nodes with XXX state. After that I see expected :-)
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1404241/+subscriptions
Follow ups
References