yahoo-eng-team team mailing list archive
  
    Message #06239
  
 [Bug 1257524] [NEW] If neutron spawned dnsmasq dies, neutron-dhcp-agent will be totally unaware
  
Public bug reported:
I recently hit a bug in dnsmasq that caused it to segfault in certain
situations. The fault was in dnsmasq itself, but it was quite troubling
that Neutron never noticed that dnsmasq had stopped working. This is
because dnsmasq is spawned as a daemon, even though it is most
definitely "owned" by neutron-dhcp-agent. Conversely, if
neutron-dhcp-agent dies, the daemonized dnsmasq keeps running and goes
"stale", requiring manual intervention to clean up. If dnsmasq ran in
the foreground instead, it would stay in neutron-dhcp-agent's process
group, so it should die along with the agent or, if need be, be cleaned
up by init.
I did some analysis but will not be able to dig into the actual
implementation. My analysis suggests that the following would work:
* use utils.create_process instead of execute and remember returned Popen object.
* spawn a greenthread to wait() on the process
* if it dies, restart it and log the error code
* pass the -k option so dnsmasq stays in foreground
* kill the process using child signals
Not sure how, or whether, SIGCHLD plays a factor.
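
The steps above could be sketched roughly as follows. This is only a
minimal illustration using the Python standard library, not Neutron
code: it uses subprocess.Popen directly instead of utils.create_process,
and a plain thread instead of a greenthread. The SupervisedProcess class
and its max_restarts parameter are hypothetical names introduced for
this example.

```python
import logging
import subprocess
import threading

LOG = logging.getLogger(__name__)


class SupervisedProcess:
    """Hypothetical helper: run a foreground command, restart it if it dies."""

    def __init__(self, cmd, max_restarts=3):
        self.cmd = list(cmd)
        self.max_restarts = max_restarts  # assumption: cap restarts to avoid a crash loop
        self.exit_codes = []              # exit codes of unexpected deaths
        self._proc = None
        self._stopping = False
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self):
        # Keep the Popen object so the child can be waited on and killed later.
        self._proc = subprocess.Popen(self.cmd)
        self._thread.start()

    def _watch(self):
        while True:
            code = self._proc.wait()  # blocks until the child exits
            with self._lock:
                if self._stopping:
                    return  # exit was requested by stop(), do not restart
                self.exit_codes.append(code)
                LOG.error("%s exited unexpectedly with code %s", self.cmd[0], code)
                if len(self.exit_codes) > self.max_restarts:
                    LOG.error("giving up on %s after %d restarts",
                              self.cmd[0], self.max_restarts)
                    return
                self._proc = subprocess.Popen(self.cmd)

    def join(self, timeout=None):
        self._thread.join(timeout)

    def stop(self):
        with self._lock:
            self._stopping = True
            if self._proc.poll() is None:
                self._proc.terminate()  # sends SIGTERM on POSIX
        self._thread.join()
```

For dnsmasq the command would presumably look something like
["dnsmasq", "-k", ...], with the remaining arguments being whatever the
agent already passes. Capping the number of restarts is a design choice
of this sketch, so that a binary which segfaults on every start does not
spin forever.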
** Affects: neutron
     Importance: Undecided
         Status: New
-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1257524
Title:
  If neutron spawned dnsmasq dies, neutron-dhcp-agent will be totally
  unaware
Status in OpenStack Neutron (virtual network service):
  New