openstack team mailing list archive
-
openstack team
-
Mailing list archive
-
Message #13362
Re: Timeout during image build (task Networking)
Vish, Jay,
OK, this looks promising. A couple of questions…
I'm seeing this RPC timeout on the Essex 2012.1 packages released with Ubuntu 12.04. I'm assuming these packages are affected by this bug?
Why would something this fundamental not show up during Essex RC.X testing?
How best to 'fix' this for our production environment (thank god it's only the research organization!)
My previous testing of Essex (running on Diablo) didn't exhibit this problem. However during testing, I was configured using FlatDHCP networking versus our production cloud using VLAN networking. This was what lead me to believe that it might be a network configuration issue. Apparently not.
As it stands right now, we're dead in the water, so I hope some easy fix for the Ubuntu Essex release is possible.
Thanks for everyone looking this. Hope to hear a resolution soon.
Regards,
Ross
On Jun 19, 2012, at 2:13 PM, Vishvananda Ishaya wrote:
Sorry, paste fail on the last message.
This seems like a likely culprit:
https://review.openstack.org/#/c/8339/
I'm guessing it only happens on concurrent builds? We probably need a synchronized somewhere.
Vish
On Jun 19, 2012, at 12:03 PM, Jay Pipes wrote:
cc'ing Vish on this, as this is now occurring on every single devstack + Tempest run, for multiple servers.
Vish, I am seeing the exact same issue as shown below. Instances end up in ERROR state and looking into the nova-network log, I find *no* errors at all, and yet looking at the nova-compute log, I see multiple timeout errors -- all of them trying to RPC while in the allocate_network method. Always the same method, always the same error, and no errors in nova-network or nova-api (other than just reporting a failed build)
Any idea on something that may have crept in recently? This wasn't happening a week or so ago, AFAICT.
Best,
-jay
On 06/18/2012 06:03 PM, Lillie Ross-CDSR11 wrote:
I'm receiving RPC timeouts when trying to launch an instance. My
installation is the Essex release running on Ubuntu 12.04.
When I launch a test image, the launch fails. In my setup, Nova network
runs on a controller node, and all compute instances run on separate,
dedicated server nodes. The failure is repeatable. Upon examining the
various logs, I see the following (see below). Any insight would be welcome.
Regards,
Ross
>From 'nova show <instance name>' I read the following:
root@cirrus1:~# nova show test
+-------------------------------------+-----------------------------------------------------------------------------+
| Property | Value |
+-------------------------------------+-----------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-SRV-ATTR:host | nova8 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | instance-00000005 |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | networking |
| OS-EXT-STS:vm_state | error |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2012-06-18T20:42:56Z |
| fault | {u'message': u'Timeout', u'code': 500, u'created':
u'2012-06-18T20:43:58Z'} |
| flavor | m1.tiny |
| hostId | 50272989300483e2b5e5236cd572fef3f9149ae60faa5f5660f8da54 |
| id | d569b16f-10a8-4cb8-90a3-d5b664c2322d |
| image | tty-linux |
| key_name | admin |
| metadata | {} |
| name | test |
| private_0 network | |
| status | ERROR |
| tenant_id | 1 |
| updated | 2012-06-18T20:43:57Z |
| user_id | 1 |
+-------------------------------------+-----------------------------------------------------------------------------+
>From the nova-network.log I see the following:
2012-06-18 15:43:36 DEBUG nova.manager [-] Running periodic task
VlanManager._disassociate_stale_fixed_ips from (pid=1381) periodic_tasks
/usr/lib/python2.7/dist-packages
/nova/manager.py:152
2012-06-18 15:43:57 ERROR nova.rpc.common [-] Timed out waiting for RPC
response: timed out
2012-06-18 15:43:57 TRACE nova.rpc.common Traceback (most recent call last):
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in
ensure
2012-06-18 15:43:57 TRACE nova.rpc.common return method(*args, **kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in
_consume
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.connection.drain_events(timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in
drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.transport.drain_events(self.connection, **kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
238, in drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
connection.drain_events(**kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
57, in drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.wait_multi(self.channels.values(), timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
63, in wait_multi
2012-06-18 15:43:57 TRACE nova.rpc.common chanmap.keys(),
allowed_methods, timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
120, in _wait_multiple
2012-06-18 15:43:57 TRACE nova.rpc.common channel, method_sig, args,
content = read_timeout(timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
94, in read_timeout
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.method_reader.read_method()
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py",
line 221, in read_method
2012-06-18 15:43:57 TRACE nova.rpc.common raise m
2012-06-18 15:43:57 TRACE nova.rpc.common timeout: timed out
2012-06-18 15:43:57 TRACE nova.rpc.common
2012-06-18 15:43:58 DEBUG nova.utils
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] Attempting to grab
semaphore "get_dhcp" for method "_get_dhcp_ip"... from (pid=1381) i
nner /usr/lib/python2.7/dist-packages/nova/utils.py:927
2012-06-18 15:43:58 DEBUG nova.utils
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] Got semaphore "get_dhcp"
for method "_get_dhcp_ip"... from (pid=1381) inner /usr/lib/p
ython2.7/dist-packages/nova/utils.py:931
2012-06-18 15:43:58 DEBUG nova.utils
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] Attempting to grab
semaphore "get_dhcp" for method "_get_dhcp_ip"... from (pid=1381) i
and from nova-compute.log I read:
2012-06-18 15:43:57 ERROR nova.rpc.common
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] Timed out waiting for RPC
response: timed out
2012-06-18 15:43:57 TRACE nova.rpc.common Traceback (most recent call last):
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in
ensure
2012-06-18 15:43:57 TRACE nova.rpc.common return method(*args, **kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in
_consume
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.connection.drain_events(timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in
drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.transport.drain_events(self.connection, **kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
238, in drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
connection.drain_events(**kwargs)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
57, in drain_events
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.wait_multi(self.channels.values(), timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
63, in wait_multi
2012-06-18 15:43:57 TRACE nova.rpc.common chanmap.keys(),
allowed_methods, timeout=timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
120, in _wait_multiple
2012-06-18 15:43:57 TRACE nova.rpc.common channel, method_sig, args,
content = read_timeout(timeout)
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
94, in read_timeout
2012-06-18 15:43:57 TRACE nova.rpc.common return
self.method_reader.read_method()
2012-06-18 15:43:57 TRACE nova.rpc.common File
"/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py",
line 221, in read_method
2012-06-18 15:43:57 TRACE nova.rpc.common raise m
2012-06-18 15:43:57 TRACE nova.rpc.common timeout: timed out
2012-06-18 15:43:57 TRACE nova.rpc.common
2012-06-18 15:43:57 ERROR nova.compute.manager
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] Instance failed network setup
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] Traceback (most recent call last):
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 559, in
_allocate_network
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] requested_networks=requested_networks)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/network/api.py", line 170, in
allocate_for_instance
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] 'args': args})
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/__init__.py", line 68, in call
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] return _get_impl().call(context,
topic, msg, timeout)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 674, in call
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] return rpc_amqp.call(context,
topic, msg, timeout, Connection.pool)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 338, in call
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] rv = list(rv)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 299, in __iter__
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] self._iterator.next()
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 572, in
iterconsume
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] yield self.ensure(_error_callback,
_consume)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 503, in
ensure
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] error_callback(e)
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 553, in
_error_callback
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] raise rpc_common.Timeout()
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d] Timeout: Timeout while waiting on
RPC response.
2012-06-18 15:43:57 TRACE nova.compute.manager [instance:
d569b16f-10a8-4cb8-90a3-d5b664c2322d]
2012-06-18 15:43:58 ERROR nova.rpc.amqp
[req-16158a6b-f3d6-49f3-977e-3ccfecc791ca 1 1] Exception during message
handling
2012-06-18 15:43:58 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 252, in
_process_data
2012-06-18 15:43:58 TRACE nova.rpc.amqp rval = node_func(context=ctxt,
**node_args)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-06-18 15:43:58 TRACE nova.rpc.amqp return f(*args, **kw)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 177, in
decorated_function
2012-06-18 15:43:58 TRACE nova.rpc.amqp sys.exc_info())
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-06-18 15:43:58 TRACE nova.rpc.amqp self.gen.next()
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 171, in
decorated_function
2012-06-18 15:43:58 TRACE nova.rpc.amqp return function(self, context,
instance_uuid, *args, **kwargs)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 651, in
run_instance
2012-06-18 15:43:58 TRACE nova.rpc.amqp do_run_instance()
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/utils.py", line 945, in inner
2012-06-18 15:43:58 TRACE nova.rpc.amqp retval = f(*args, **kwargs)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 650, in
do_run_instance
2012-06-18 15:43:58 TRACE nova.rpc.amqp self._run_instance(context,
instance_uuid, **kwargs)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 451, in
_run_instance
2012-06-18 15:43:58 TRACE nova.rpc.amqp
self._set_instance_error_state(context, instance_uuid)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-06-18 15:43:58 TRACE nova.rpc.amqp self.gen.next()
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 424, in
_run_instance
2012-06-18 15:43:58 TRACE nova.rpc.amqp requested_networks)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 559, in
_allocate_network
2012-06-18 15:43:58 TRACE nova.rpc.amqp
requested_networks=requested_networks)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/network/api.py", line 170, in
allocate_for_instance
2012-06-18 15:43:58 TRACE nova.rpc.amqp 'args': args})
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/__init__.py", line 68, in call
2012-06-18 15:43:58 TRACE nova.rpc.amqp return _get_impl().call(context,
topic, msg, timeout)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 674, in call
2012-06-18 15:43:58 TRACE nova.rpc.amqp return rpc_amqp.call(context,
topic, msg, timeout, Connection.pool)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 338, in call
2012-06-18 15:43:58 TRACE nova.rpc.amqp rv = list(rv)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 299, in __iter__
2012-06-18 15:43:58 TRACE nova.rpc.amqp self._iterator.next()
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 572, in
iterconsume
2012-06-18 15:43:58 TRACE nova.rpc.amqp yield
self.ensure(_error_callback, _consume)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 503, in
ensure
2012-06-18 15:43:58 TRACE nova.rpc.amqp error_callback(e)
2012-06-18 15:43:58 TRACE nova.rpc.amqp File
"/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 553, in
_error_callback
2012-06-18 15:43:58 TRACE nova.rpc.amqp raise rpc_common.Timeout()
2012-06-18 15:43:58 TRACE nova.rpc.amqp Timeout: Timeout while waiting
on RPC response.
2012-06-18 15:43:58 TRACE nova.rpc.amqp
2012-06-18 15:44:05 DEBUG nova.manager [-] Running periodic task
ComputeManager._publish_service_capabilities from (pid=13262)
periodic_tasks /usr/lib/python2.7/dist-packages/nova/manager.py:152
2012-06-18 15:44:05 DEBUG nova.manager [-] Notifying Schedulers of
capabilities ... from (pid=13262) _publish_service_capabilities
/usr/lib/python2.7/dist-packages/nova/manager.py:203
2012-06-18 15:44:05 DEBUG nova.rpc.amqp [-] Making asynchronous fanout
cast... from (pid=13262) fanout_cast
/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py:354
2012-06-18 15:44:05 DEBUG nova.manager [-] Running periodic task
ComputeManager._poll_rescued_instances from (pid=13262) periodic_tasks
/usr/lib/python2.7/dist-packages/nova/manager.py:152
2012-06-18 15:44:05 DEBUG nova.manager [-] Skipping
ComputeManager._sync_power_states, 1 ticks left until next run from
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack@xxxxxxxxxxxxxxxxxxx<mailto:openstack@xxxxxxxxxxxxxxxxxxx>
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack@xxxxxxxxxxxxxxxxxxx<mailto:openstack@xxxxxxxxxxxxxxxxxxx>
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
References