yahoo-eng-team team mailing list archive

[Bug 1371812] Re: n-cpu timeout on rpc allocating network

To: yahoo-eng-team@xxxxxxxxxxxxxxxxxxx
From: "Markus Zoeller (markus_z)" <mzoeller@xxxxxxxxxxxxxxxxxx>
Date: Tue, 05 Jul 2016 09:53:11 -0000
Reply-to: Bug 1371812 <1371812@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (LIBERTY, MITAKA, OCATA, NEWTON).
  Valid example: CONFIRMED FOR: LIBERTY


** Changed in: nova
       Status: Confirmed => Expired

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1371812

Title:
  n-cpu timeout on rpc allocating network

Status in OpenStack Compute (nova):
  Expired

Bug description:
  During a server create, when the ec2 metadata triggers the async
  creation of the network information for the server, n-cpu times out
  waiting for the rpc response from n-net, causing the server to go into
  an error state. See:

  http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-cpu.txt.gz#_2014-09-19_17_54_32_139

  Looking at the n-net logs for the same req-id shows that n-net got the
  allocate network call at:

  http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_53_11_153

  and completed it successfully a little over 1 min later at:

  http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/screen-n-net.txt.gz#_2014-09-19_17_54_32_681

  and there is no other indication as to why n-cpu raised the timeout.
  The rabbit logs don't show an error around that time:

  http://logs.openstack.org/27/122527/2/check/check-tempest-dsvm-full/0258bda/logs/rabbitmq/clarkbtest.txt.gz
  (ignore the unusual path; it's the rabbit log for that run)

  My guess is that the timeout for that rpc call was 60 sec, that the call
  took a little longer than that, and that this was just run on a slow node.
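
  As a rough check of the timing, the intervals can be computed from the
  timestamps in the log URL anchors quoted above. This is only a sketch: it
  assumes the anchors are log timestamps of the form YYYY-MM-DD_HH_MM_SS_mmm
  taken from the same clock, and the variable names are illustrative. The
  n-cpu anchor marks where the timeout shows up in the log; the moment the
  call was sent is not given in this report, so this does not verify the
  60 sec guess.

```python
from datetime import datetime

# Assumed anchor format: YYYY-MM-DD_HH_MM_SS_mmm (milliseconds last).
ANCHOR_FORMAT = "%Y-%m-%d_%H_%M_%S_%f"


def parse_anchor(anchor: str) -> datetime:
    """Parse a log URL anchor such as '2014-09-19_17_53_11_153'."""
    return datetime.strptime(anchor, ANCHOR_FORMAT)


# Timestamps copied from the URL anchors in the bug description.
n_net_received = parse_anchor("2014-09-19_17_53_11_153")
n_net_completed = parse_anchor("2014-09-19_17_54_32_681")
n_cpu_timeout = parse_anchor("2014-09-19_17_54_32_139")

# How long n-net spent on the allocate network call.
handling_seconds = (n_net_completed - n_net_received).total_seconds()

# How long after the n-cpu timeout entry n-net logged completion.
margin_seconds = (n_net_completed - n_cpu_timeout).total_seconds()

print(f"n-net handling time: {handling_seconds:.3f} s")
print(f"n-net finished {margin_seconds:.3f} s after the n-cpu timeout entry")
```

  With these anchors, n-net took about 81.5 s and finished roughly half a
  second after the n-cpu timeout entry.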

  Looking at logstash for:

  message:"MessagingTimeout: Timed out waiting for a reply to message"
  AND message:"ec2utils.get_ip_info_for_instance_from_nw_info" AND
  tags:"screen-n-cpu.txt"

  only shows one hit in the past 7 days, so it's probably not super
  prolific.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1371812/+subscriptions

References

[Bug 1371812] [NEW] n-cpu timeout on rpc allocating network
From: Matthew Treinish, 2014-09-19