yahoo-eng-team team mailing list archive
Message #01364
[Bug 1053286] Re: nova-api not shutting down cleanly under some conditions
This bug lacks the information needed to reproduce and fix it, so it
has been closed. Feel free to reopen the bug by providing the requested
information and setting the bug status back to "New".
** Changed in: nova
Status: Incomplete => Invalid
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1053286
Title:
nova-api not shutting down cleanly under some conditions
Status in OpenStack Compute (Nova):
Invalid
Bug description:
Some situation seems to prevent nova-api from shutting down cleanly.
One process can be left behind, which prevents restarts.
Unfortunately I don't have much detailed information, but here is
what I know:
nova-api.log reported the service shutting down without any
exceptions. The last request was handled correctly, and then the
standard shutdown sequence follows:
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, stopping children
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Waiting on 21 children to exit
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting
2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.
The main process reports waiting for 21 children, and the count of
SIGTERM messages matches (21 children plus the parent):
$ tail -n 50 /var/log/nova/nova-api.log.1 | grep SIGTERM | wc -l
22
Not sure about the 'Stopping WSGI server' messages though:
$ tail -n 50 /var/log/nova/nova-api.log.1 | grep 'Stopping WSGI server' | wc -l
12
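The two counts above can be broken down per message type. The following is a minimal Python sketch, not part of nova: the function name `summarize_shutdown` and the category names are made up for illustration, and the marker strings are copied from the log excerpt quoted in this report.

```python
from collections import Counter

# Substrings copied from the log excerpt in this report. The category
# names on the left are hypothetical labels chosen for this sketch.
MARKERS = {
    "child_sigterm": "Caught SIGTERM, exiting",
    "parent_sigterm": "Caught SIGTERM, stopping children",
    "wsgi_stop": "Stopping WSGI server",
}


def summarize_shutdown(lines):
    """Count shutdown messages per category in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, marker in MARKERS.items():
            if marker in line:
                counts[name] += 1
    return counts


# Three lines taken verbatim from the excerpt, one per category.
sample = [
    "2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, exiting",
    "2012-09-19 20:27:34 INFO nova.wsgi [-] Stopping WSGI server.",
    "2012-09-19 20:27:34 INFO nova.service [-] Caught SIGTERM, stopping children",
]
print(summarize_shutdown(sample))
```

Run against the full excerpt quoted above, this split would show 21 child SIGTERM messages, 1 parent message, and 12 WSGI stop messages. If each worker is expected to log one "Stopping WSGI server" line, that would leave 9 workers without one, though this report does not confirm that expectation.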
There is still one process left, but it doesn't seem to be the main
one. Upstart reported the following in syslog:
Sep 19 20:27:34 nv-aw1rde1-schedule0000 kernel: [4316283.745184] init: nova-api main process (5832) terminated with status 143
So it seems that a child process stays around, because the remaining
pid differs from the one Upstart reported:
$ pgrep nova-api
1535
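The status 143 in the Upstart message is consistent with the SIGTERM lines in nova-api.log if it is read with the common shell convention of 128 plus the signal number. Whether nova-api or Upstart actually uses that convention here is an assumption. A small Python sketch of the arithmetic, with a hypothetical helper name `describe_exit_status`:

```python
import signal


def describe_exit_status(status):
    """Map an exit status to a signal name, assuming the common
    shell convention status = 128 + signal number.

    Returns None when the status does not fit that convention.
    """
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        # No signal with that number on this platform.
        return None


print(describe_exit_status(143))
```

On Linux, 143 - 128 = 15, which is SIGTERM.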
Attaching strace shows that the process itself only seems to sleep in
1-minute intervals, with no sockets to monitor:
$ sudo strace -p 1535
Process 1535 attached - interrupt to quit
select(0, NULL, NULL, NULL, {22, 691264}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {60, 0}
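A select() call with zero descriptors and only a timeout acts as a timed sleep: nothing can become ready, so it returns when the timeout expires. The Python sketch below shows a comparable call pattern on a POSIX system. It is not nova's code, and the name `idle_wait` is hypothetical.

```python
import select
import time


def idle_wait(timeout_seconds):
    """Block in select() with no file descriptors to watch.

    With empty read, write, and exceptional lists, the call can only
    end by timing out, so it behaves like a sleep.
    """
    start = time.monotonic()
    ready = select.select([], [], [], timeout_seconds)
    return ready, time.monotonic() - start


# A short timeout stands in for the 60 seconds seen in the strace output.
ready, elapsed = idle_wait(0.1)
print(ready, round(elapsed, 2))
```

A process that does nothing but this has no sockets left to serve, which would match the observation that the leftover process is idle rather than handling requests.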
I don't have any more information - is there something else that I
could check?
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1053286/+subscriptions