yahoo-eng-team team mailing list archive

[Bug 1053286] Re: nova-api not shutting down cleanly under some conditions

 

This bug lacks the information needed to reproduce and fix it, so it
has been closed. Feel free to reopen the bug by providing the
requested information and setting the bug status back to 'New'.

** Changed in: nova
       Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1053286

Title:
  nova-api not shutting down cleanly under some conditions

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  Some condition seems to prevent nova-api from shutting down cleanly:
  one process can be left behind, which prevents restarts.
  Unfortunately I don't have much detail, but here is what I know:

  nova-api.log shows the service shutting down without any exceptions.
  The last request was handled correctly, and then the standard
  shutdown sequence follows:

  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, stopping children
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Waiting on 21 children to exit
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
  2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
  2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.

  The main process reports waiting on 21 children, and the number of
  SIGTERM messages is consistent with that (21 children plus the
  parent):

  $ tail -n 50 /var/log/nova/nova-api.log.1 | grep SIGTERM | wc -l
  22

  I'm not sure about the 'Stopping WSGI server' messages, though; there
  are fewer of them:

  $ tail -n 50 /var/log/nova/nova-api.log.1 | grep 'Stopping WSGI server' | wc -l
  12
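
The two counts above can be gathered in one pass, which makes the mismatch easier to see. This is only a sketch: tally_shutdown is a hypothetical helper name, and the patterns are copied from the log lines quoted in this report, so they may not match other nova versions.

```shell
# Tally nova-api shutdown messages in the tail of a log file.
# tally_shutdown is a hypothetical helper; the patterns are taken from
# the log lines quoted in this report.
tally_shutdown() {
    log_file=$1
    lines=${2:-50}   # how many trailing lines to inspect (default 50)
    tail -n "$lines" "$log_file" | awk '
        /Caught SIGTERM, exiting/            { child++ }
        /Caught SIGTERM, stopping children/  { parent++ }
        /Stopping WSGI server/               { wsgi++ }
        /Waiting on [0-9]+ children to exit/ {
            # The number the parent expects follows the word "on".
            for (i = 1; i <= NF; i++) if ($i == "on") expected = $(i + 1)
        }
        END {
            printf "children_exiting=%d parent_stopping=%d wsgi_stopped=%d expected_children=%d\n",
                   child, parent, wsgi, expected
        }'
}
```

For example, `tally_shutdown /var/log/nova/nova-api.log.1` prints all four numbers on one line. If children_exiting matches expected_children but wsgi_stopped is lower, some children logged the signal without logging a WSGI stop.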

  One process is still left, but it doesn't seem to be the main one.
  Upstart reported the following in syslog:

  Sep 19 20:27:34 nv-aw1rde1-schedule0000 kernel: [4316283.745184] init:
  nova-api main process (5832) terminated with status 143

  So it seems to be a child process that stays around, because its PID
  differs from the one Upstart reported:

  $ pgrep nova-api
  1535
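
One way to check whether the leftover process really is an orphaned child is to look at its parent PID. The sketch below reads /proc, so it assumes Linux; proc_parent is a hypothetical helper name, not part of nova or Upstart.

```shell
# Print the parent PID and state of a process by reading /proc (Linux only).
# proc_parent is a hypothetical helper name.
proc_parent() {
    pid=$1
    status_file="/proc/$pid/status"
    [ -r "$status_file" ] || { echo "no such process: $pid" >&2; return 1; }
    awk -v pid="$pid" '
        /^PPid:/  { ppid = $2 }    # parent process ID
        /^State:/ { state = $2 }   # one-letter state, e.g. S for sleeping
        END       { printf "pid=%s ppid=%s state=%s\n", pid, ppid, state }
    ' "$status_file"
}
```

Something like `for p in $(pgrep nova-api); do proc_parent "$p"; done` would cover every remaining process. A ppid of 1 would suggest the process was reparented to init after its original parent exited, but that is an inference rather than proof.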

  The process itself seems to just sleep in 1-minute intervals, with
  no sockets to monitor, as strace shows when attached to it:

  $ sudo strace -p 1535
  Process 1535 attached - interrupt to quit
  select(0, NULL, NULL, NULL, {22, 691264}) = 0 (Timeout)
  select(0, NULL, NULL, NULL, {60, 0}
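
The select() calls above pass no descriptors, but that alone does not show whether the process still holds any sockets open. A rough check, again assuming Linux and usually needing root for another user's process, is to count the socket entries under /proc/PID/fd. count_sockets is a hypothetical helper name.

```shell
# Count the socket file descriptors a process still holds (Linux only).
# count_sockets is a hypothetical helper name.
count_sockets() {
    pid=$1
    fd_dir="/proc/$pid/fd"
    [ -d "$fd_dir" ] || { echo "no such process: $pid" >&2; return 1; }
    count=0
    for fd in "$fd_dir"/*; do
        # Each entry is a symlink; sockets resolve to "socket:[inode]".
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            socket:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}
```

For the process above this would be `sudo sh -c '. ./helpers.sh; count_sockets 1535'`, where helpers.sh is an assumed file holding the function. A non-zero result might mean the leftover process still holds the listening socket, which could explain the failed restarts.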

  I don't have any more information. Is there something else that I
  could check?

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1053286/+subscriptions