yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #93670
[Bug 2056366] [NEW] Neutron ml2/ovn does not exit when killed with SIGTERM
Public bug reported:
When Neutron is killed with SIGTERM (like via systemctl), when using
ML2/OVN neutron workers do not exit and instead are eventually killed
with SIGKILL when the graceful timeout is reached (often around 1
minute).
This is happening due to the signal handlers for SIGTERM. There are
multiple issues.
1) oslo_service, ml2/ovn mech_driver, and ml2/ovo_rpc.py all call signal.signal(signal.SIGTERM, ...) overwriting each others signal handlers.
2) SIGTERM is handled in the main thread, and running blocking code there causes AssertionErrors in eventlet
3) The ml2/ovn cleanup code doesn't cause the process to end, so it interrupts the killing of the process
oslo_service has a singleton SignalHandler class that solves all of
these issues and we should use that instead of calling signal.signal()
ourselves.
** Affects: neutron
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2056366
Title:
Neutron ml2/ovn does not exit when killed with SIGTERM
Status in neutron:
New
Bug description:
When Neutron is killed with SIGTERM (like via systemctl), when using
ML2/OVN neutron workers do not exit and instead are eventually killed
with SIGKILL when the graceful timeout is reached (often around 1
minute).
This is happening due to the signal handlers for SIGTERM. There are
multiple issues.
1) oslo_service, ml2/ovn mech_driver, and ml2/ovo_rpc.py all call signal.signal(signal.SIGTERM, ...) overwriting each others signal handlers.
2) SIGTERM is handled in the main thread, and running blocking code there causes AssertionErrors in eventlet
3) The ml2/ovn cleanup code doesn't cause the process to end, so it interrupts the killing of the process
oslo_service has a singleton SignalHandler class that solves all of
these issues and we should use that instead of calling signal.signal()
ourselves.
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2056366/+subscriptions
Follow ups