
yahoo-eng-team team mailing list archive

[Bug 1561046] Re: If there is a /var/lib/neutron/ha_confs/<router-id>.pid then l3 agent fails to spawn a keepalived process for that router

 

Reviewed:  https://review.openstack.org/296532
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e98fabb5836b12bc40a2b64a2668893ea73c2320
Submitter: Jenkins
Branch:    master

commit e98fabb5836b12bc40a2b64a2668893ea73c2320
Author: Hynek Mlnarik <hmlnarik@xxxxxxxxxx>
Date:   Wed Mar 23 14:51:59 2016 +0100

    Remove obsolete keepalived PID files before start
    
    keepalived refuses to start and claims "daemon already started"
    when there is already a process with the same PID as found in
    either the VRRP or the main process PID file. This happens even
    when that process is not keepalived. The situation can arise
    when the neutron node is reset and the obsolete PID files are
    not cleaned up before neutron is started.
    
    This commit adds PID file cleanup before keepalived start.
    
    Closes-Bug: 1561046
    Change-Id: Ib6b6f2fe76fe82253f195c9eab6b243d9eb76fa2


** Changed in: neutron
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1561046

Title:
  If there is a /var/lib/neutron/ha_confs/<router-id>.pid then l3 agent
  fails to spawn a keepalived process for that router

Status in neutron:
  Fix Released

Bug description:
  If the .pid file for the previous keepalived process (located in
  /var/lib/neutron/ha_confs/<router_id>.pid) already exists then the L3
  agent fails to spawn a keepalived process for that router.

  For example, when a neutron node is shut down and restarted, new
  processes are assigned PIDs that can be the same as those previously
  assigned to some of the keepalived processes. The old PIDs are still
  recorded in the PID files, so when keepalived starts it detects a
  running process with that PID and reports "daemon is already running".
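
  The failure mode above can be illustrated with a minimal Python
  sketch. This is not keepalived's code (keepalived is written in C and
  its actual check is not shown in this report); it only assumes a check
  that trusts a PID file whenever any process currently has that PID.
  The function names are hypothetical, and a POSIX system is assumed.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if some process with this PID currently exists."""
    if pid <= 0:
        # PIDs <= 0 address process groups in kill(2), not one process.
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def pid_file_looks_active(path: str) -> bool:
    """Naive check: treat the PID file as valid if its PID is in use.

    This does not verify that the process is actually keepalived,
    which is exactly how a stale file can block a new start.
    """
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False
    return pid_is_alive(pid)
```

  With such a check, a PID file left over from before a reboot is
  reported as active as soon as any unrelated process happens to be
  assigned the recorded PID.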

  Steps to reproduce:
  1) Pick a router on which to reproduce the issue and record its router_id.
  2) Kill the two processes whose PIDs are stored in these two files: /var/lib/neutron/ha_confs/<router_id>.pid and /var/lib/neutron/ha_confs/<router_id>.pid-vrrp
  3) Make sure that no keepalived process comes back for this router.
  4) Pick an existing process ID (any process that is actually running) and put it into the PID files. For example, the PID of a background sleep process running as pid 12345 can be put into the <router_id>.pid and <router_id>.pid-vrrp files.
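
  Step 4 can be scripted along the following lines. This is a sketch
  only: plant_pid_files is a hypothetical helper, the directory is
  passed in by the caller, and writing the PID followed by a newline is
  an assumption about the file layout rather than something stated in
  this report.

```python
import os
from typing import List


def plant_pid_files(ha_confs_dir: str, router_id: str, pid: int) -> List[str]:
    """Write `pid` into <router_id>.pid and <router_id>.pid-vrrp.

    Returns the paths of the two files that were written.
    """
    paths = [
        os.path.join(ha_confs_dir, router_id + suffix)
        for suffix in (".pid", ".pid-vrrp")
    ]
    for path in paths:
        with open(path, "w") as f:
            f.write("%d\n" % pid)
    return paths
```

  For instance, after starting a background process with
  subprocess.Popen(["sleep", "600"]), its .pid attribute could be passed
  as the pid argument together with the ha_confs directory and the
  router_id recorded in step 1.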

  Bug reproduced with keepalived versions 1.2.13 and 1.2.19.
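
  The commit quoted above adds PID file cleanup before keepalived is
  started. The following is a minimal sketch of that idea, not the
  actual neutron change: remove_stale_pid_files is a hypothetical name,
  and it assumes the caller has already established that no keepalived
  instance is running for the router, since the removal is
  unconditional.

```python
import os
from typing import List


def remove_stale_pid_files(ha_confs_dir: str, router_id: str) -> List[str]:
    """Delete <router_id>.pid and <router_id>.pid-vrrp if they exist.

    Returns the paths that were actually removed. Missing files are
    ignored so the call is safe on a clean node.
    """
    removed = []
    for suffix in (".pid", ".pid-vrrp"):
        path = os.path.join(ha_confs_dir, router_id + suffix)
        try:
            os.remove(path)
        except FileNotFoundError:
            continue
        removed.append(path)
    return removed
```

  Calling this immediately before spawning keepalived means a PID left
  over from before a node reset can no longer collide with an unrelated
  running process.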

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1561046/+subscriptions

