yahoo-eng-team team mailing list archive
Message #21825
[Bug 1371812] [NEW] n-cpu timeout on rpc allocating network
Public bug reported:
During a server create, when the ec2 metadata triggers the async creation
of the network information for the server, n-cpu times out waiting for
the RPC response from n-net, causing the server to go into an error
state. See:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-cpu.txt.gz#_2014-09-19_17_54_32_139
The n-net logs for the same req-id show that n-net got the
allocate network call at:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_53_11_153
and completed it successfully a little over 1 minute later at:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_54_32_681
and there is no other indication as to why n-cpu raised the timeout. The
rabbit logs don't show an error around that time:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/rabbitmq/clarkbtest.txt.gz
(ignore the unusual path; it is the rabbit log for that run)
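The gaps between the events can be worked out from the "#_..." anchors of
the three log URLs above. This is only a rough sketch: it assumes each
anchor is the timestamp of the relevant log line, and the names
parse_anchor, n_net_received, n_net_completed and n_cpu_timeout are made
up for illustration.

```python
from datetime import datetime

# Format of the "#_..." anchors in the log URLs (assumed to be log timestamps).
ANCHOR_FORMAT = "_%Y-%m-%d_%H_%M_%S_%f"


def parse_anchor(anchor):
    """Turn an anchor such as '_2014-09-19_17_53_11_153' into a datetime."""
    return datetime.strptime(anchor, ANCHOR_FORMAT)


n_net_received = parse_anchor("_2014-09-19_17_53_11_153")   # n-net got the call
n_net_completed = parse_anchor("_2014-09-19_17_54_32_681")  # n-net finished
n_cpu_timeout = parse_anchor("_2014-09-19_17_54_32_139")    # n-cpu timeout line

# Time n-net spent on the allocate network call.
allocate_seconds = (n_net_completed - n_net_received).total_seconds()
# Time from n-net receiving the call to the n-cpu timeout line.
timeout_seconds = (n_cpu_timeout - n_net_received).total_seconds()

print(f"n-net allocation took {allocate_seconds:.3f}s")          # 81.528s
print(f"n-cpu timeout logged {timeout_seconds:.3f}s after call")  # 80.986s
```

These three anchors do not show when n-cpu started waiting for the reply,
so they cannot confirm the exact timeout value on their own.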
My guess is that the timeout for that RPC call was 60 seconds, the call
took a little longer than that, and this run just landed on a slow node.
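To illustrate that guess (and only as a toy model, not nova or n-net
code): a caller that waits with a fixed timeout can give up while the
remote side still finishes successfully afterwards. The durations below
are hypothetical and scaled down so the sketch runs quickly;
allocate_network, CALLER_TIMEOUT and WORK_DURATION are invented names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical, scaled-down stand-ins for the values discussed above.
CALLER_TIMEOUT = 0.1  # stands in for the guessed 60 second RPC timeout
WORK_DURATION = 0.3   # stands in for the slower network allocation


def allocate_network():
    """Pretend to be the remote side doing a slow allocation."""
    time.sleep(WORK_DURATION)
    return "network allocated"


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(allocate_network)
    try:
        future.result(timeout=CALLER_TIMEOUT)
        caller_timed_out = False
    except FutureTimeout:
        # The caller gives up here, much like n-cpu reporting a timeout.
        caller_timed_out = True
    # The worker is unaffected by the caller's timeout and still completes.
    worker_result = future.result()

print(caller_timed_out, worker_result)
```

In this model the worker's log would show a clean, successful completion
even though the caller has already reported a timeout.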
A logstash search for:
message:"MessagingTimeout: Timed out waiting for a reply to message" AND
message:"ec2utils.get_ip_info_for_instance_from_nw_info" AND
tags:"screen-n-cpu.txt"
shows only one hit in the past 7 days, so it's probably not very
common.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1371812
Title:
n-cpu timeout on rpc allocating network
Status in OpenStack Compute (Nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1371812/+subscriptions