yahoo-eng-team team mailing list archive
  
    Message #21825
  
 [Bug 1371812] [NEW] n-cpu timeout on rpc allocating network
  
Public bug reported:
During a server create, when the ec2 metadata triggers the async creation
of the network information for the server, n-cpu times out waiting for
the rpc response from n-net, causing the server to go into an error
state. See:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-cpu.txt.gz#_2014-09-19_17_54_32_139
The n-net logs for the same req-id show that n-net got the allocate
network call at:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_53_11_153
and completed it successfully a little over 1 min later at:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_54_32_681
and there is no other indication as to why n-cpu raised the timeout. The
rabbit logs don't show an error around that time:
http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/rabbitmq/clarkbtest.txt.gz
(ignore the unusual path; it's the rabbit log for that run)
My guess is that the timeout for that rpc call was 60 sec, that taking a
little longer than that caused the timeout, and that this run just
landed on a slow node.
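To put numbers on that guess, here is a rough sketch that parses the three
timestamps from the log anchors linked above and compares them. The 60
second value is only the guess from this report (it happens to match the
usual default of the rpc_response_timeout option, but I have not checked
what this run was configured with), and the variable names are made up for
illustration.

```python
from datetime import datetime

# Timestamps taken from the "#_..." anchors in the log links above.
ANCHOR_FORMAT = "%Y-%m-%d_%H_%M_%S_%f"

def parse_anchor(anchor: str) -> datetime:
    """Parse a log anchor such as 2014-09-19_17_53_11_153."""
    return datetime.strptime(anchor, ANCHOR_FORMAT)

n_net_received = parse_anchor("2014-09-19_17_53_11_153")
n_net_completed = parse_anchor("2014-09-19_17_54_32_681")
n_cpu_timed_out = parse_anchor("2014-09-19_17_54_32_139")

# Assumption: 60 seconds is the guess made in this report, not a value
# confirmed from the configuration of this run.
ASSUMED_RPC_TIMEOUT_S = 60

# How long n-net spent on the allocate network call.
n_net_duration_s = (n_net_completed - n_net_received).total_seconds()

# How long after the n-cpu timeout n-net logged its successful completion.
late_by_s = (n_net_completed - n_cpu_timed_out).total_seconds()

print(f"n-net handled the call in {n_net_duration_s:.3f}s")
print(f"exceeds assumed timeout: {n_net_duration_s > ASSUMED_RPC_TIMEOUT_S}")
print(f"n-net finished {late_by_s:.3f}s after n-cpu timed out")
```

This only compares the log timestamps; it does not show when n-cpu actually
sent the call, so it cannot confirm the timeout value by itself.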
Looking at logstash for:
message:"MessagingTimeout: Timed out waiting for a reply to message" AND
message:"ec2utils.get_ip_info_for_instance_from_nw_info" AND
tags:"screen-n-cpu.txt"
only shows one hit in the past 7 days, so it's probably not super
prolific.
** Affects: nova
     Importance: Undecided
         Status: New
-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1371812