
yahoo-eng-team team mailing list archive

[Bug 1632099] Re: Deleting an HA router kills keepalived-state-change with signal 9, leaving children behind

 

Reviewed:  https://review.openstack.org/411968
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=17c2b45d916168fa396caf1a6188e6c8a0c8d19a
Submitter: Jenkins
Branch:    master

commit 17c2b45d916168fa396caf1a6188e6c8a0c8d19a
Author: Daniel Alvarez <dalvarez@xxxxxxxxxx>
Date:   Fri Dec 16 17:52:05 2016 +0000

    Kill neutron-keepalived-state-change gracefully
    
    When an HA router was deleted, neutron-keepalived-state-change was
    killed by l3/ha_router using SIGKILL, orphaning its child process
    'ip -o monitor'.
    
    This patch kills it gracefully by sending SIGTERM first and falling
    back to SIGKILL if keepalived-state-change has not exited after a
    10-second timeout. The SIGTERM handler in keepalived-state-change
    kills the ip monitor process using SIGKILL.
    
    The handler kills the ip_monitor PID using kill instead of calling
    IPMonitor.stop() because the monitor runs as root and
    keepalived-state-change doesn't (it drops its privileges after
    launching the monitor), so calling IPMonitor.stop() would lead to a
    "Permission denied". We can safely kill the monitor this way since it
    was started with respawn_interval set to None, which means that it
    won't be automatically respawned.
    
    Closes-Bug: #1632099
    Change-Id: I7385172a007fdd252ea3a1c03c58064160d07e9e
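
The two halves of the approach described in the commit message can be
sketched in plain Python. This is only an illustration under stated
assumptions, not the code from the patch: kill_gracefully and
make_sigterm_handler are hypothetical names, and the sketch uses
os.kill directly, which only works when both processes run as the same
user. The privilege handling discussed in the commit message is not
reproduced here.

```python
import os
import signal
import subprocess
import sys


def kill_gracefully(proc, timeout=10.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL.

    `proc` is a subprocess.Popen object (an assumption of this sketch).
    Returns the last signal that had to be sent.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return signal.SIGTERM
    except subprocess.TimeoutExpired:
        # The process did not exit in time, so force it.
        proc.kill()
        proc.wait()
        return signal.SIGKILL


def make_sigterm_handler(child_pid):
    """Build a SIGTERM handler that kills `child_pid` before exiting."""

    def handler(signum, frame):
        try:
            os.kill(child_pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # The child is already gone.
        sys.exit(0)

    return handler
```

A daemon that spawns a monitor child would register the handler with
signal.signal(signal.SIGTERM, make_sigterm_handler(child.pid)), so that
a SIGTERM from the parent takes the child down with it instead of
orphaning it.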


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1632099

Title:
  Deleting an HA router kills keepalived-state-change with signal 9,
  leaving children behind

Status in neutron:
  Fix Released

Bug description:
  When deleting an HA router, the agent kills the
  neutron-keepalived-state-change process with signal 9, leaving behind
  the child process it spawns, "ip -o monitor address".

  How to reproduce:

  $ ps aux | grep "monitor address"  # Verify you've got nothing
  $ tox -e dsvm-functional test_ha_router_lifecycle  # The test creates and deletes an HA router
  $ ps aux | grep "monitor address"  # Oops, leaked process!
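
  The before/after check in the steps above can also be scripted. The
  sketch below is a hypothetical helper, not part of neutron; it parses
  the output of `ps -eo pid,args` so that the search command itself is
  not counted as a match, which can happen with a plain `ps | grep`.

```python
import subprocess


def find_processes(ps_output, needle):
    """Return (pid, command line) pairs whose command line contains `needle`."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line without a numeric PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = parts
        if needle in args:
            matches.append((int(pid), args))
    return matches


def leaked_monitors(needle="monitor address"):
    """List running processes whose command line contains `needle`."""
    # `ps -eo pid,args` prints one process per line: PID, then the command line.
    output = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return find_processes(output, needle)
```

  Calling leaked_monitors() before and after the functional test and
  comparing the two lists shows whether an "ip -o monitor address"
  process was left behind.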

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1632099/+subscriptions

