group.of.nepali.translators team mailing list archive

[Bug 1644530] Re: keepalived fails to restart cleanly due to the wrong systemd settings

 

On Zesty (still without the pidfile in the service) it seems to work
fine (Version 1:1.3.2-1).

Test:
2x zesty KVM guests
sudo apt-get install keepalived
Set up as above, but with ens3 (virtual Ethernet device).

I think the backup/master conf names are interchanged in the
example above, but it works fine.


0) pre restart
  Process: 2173 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS 
 Main PID: 2183 (keepalived)
           ├─2183 /usr/sbin/keepalived
           ├─2184 /usr/sbin/keepalived
           └─2185 /usr/sbin/keepalived

1) First restart
  Process: 2393 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS
 Main PID: 2404 (keepalived)
           ├─2404 /usr/sbin/keepalived
           ├─2408 /usr/sbin/keepalived
           └─2409 /usr/sbin/keepalived

I see the backup server properly detect the restart, transition to take over, and go back to backup.
I was restarting on the master above; now trying on the backup.

0) pre restart
  Process: 2137 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS 
 Main PID: 2147 (keepalived)
           ├─2147 /usr/sbin/keepalived
           ├─2148 /usr/sbin/keepalived
           └─2149 /usr/sbin/keepalived

1) First restart
  Process: 2765 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS
 Main PID: 2776 (keepalived)
           ├─2776 /usr/sbin/keepalived
           ├─2777 /usr/sbin/keepalived
           └─2778 /usr/sbin/keepalived


Note: since the Zesty package did not add PIDFile, we might want/need to understand what was done to fix it here (to backport that as an SRU).
Next: I'm trying to reproduce on Xenial as reported.

** Changed in: keepalived (Ubuntu)
       Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1644530

Title:
  keepalived fails to restart cleanly due to the wrong systemd settings

Status in keepalived package in Ubuntu:
  Fix Released
Status in keepalived source package in Xenial:
  Confirmed

Bug description:
  Because the "PIDFile=" directive is missing from the systemd unit file,
  keepalived sometimes fails to kill all old processes on restart. The old
  processes keep running with the old settings and cause unexpected
  behavior. The details are described in the upstream ticket:
  https://github.com/acassen/keepalived/issues/443.

  The official systemd unit file has been available since version 1.2.24,
  added by this commit:

  https://github.com/acassen/keepalived/commit/635ab69afb44cd8573663e62f292c6bb84b44f15

  It correctly includes the "PIDFile" directive:

  PIDFile=/var/run/keepalived.pid

  We should go the same way.
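One way to try the directive locally, without editing the packaged unit file, is a systemd drop-in. The sketch below is only an illustration: the function name `write_pidfile_dropin` and the file name `pidfile.conf` are made up for this example, and it assumes the packaged unit is a forking service that writes /var/run/keepalived.pid. It is not the change that was eventually packaged.

```shell
#!/bin/sh
# write_pidfile_dropin DIR
# Hypothetical helper: writes a drop-in named pidfile.conf into DIR that adds
# the PIDFile directive quoted in this report.
# Assumption: the unit is Type=forking and keepalived writes this pidfile.
write_pidfile_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '[Service]\nPIDFile=/var/run/keepalived.pid\n' > "$dir/pidfile.conf"
}
```

To apply it, run the function as root with /etc/systemd/system/keepalived.service.d as DIR, then run `systemctl daemon-reload` and restart the service.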

  I am using Ubuntu 16.04.1, kernel 4.4.0-45-generic.

  Package: keepalived
  Version: 1.2.19-1

  =======================================================================

  How to reproduce:

  I used two instances of Ubuntu 16.04.2 on DigitalOcean:

  Configurations
  --------------

  MASTER server's /etc/keepalived/keepalived.conf:

    vrrp_script chk_nothing {
       script "/bin/true"
       interval 2
    }

    vrrp_instance G1 {
      interface eth1
      state BACKUP
      priority 100

      virtual_router_id 123
      unicast_src_ip <primal IP>
      unicast_peer {
        <secondal IP>
      }
      track_script {
        chk_nothing
      }
    }

  BACKUP server's /etc/keepalived/keepalived.conf:

    vrrp_script chk_nothing {
       script "/bin/true"
       interval 2
    }

    vrrp_instance G1 {
      interface eth1
      state MASTER
      priority 200

      virtual_router_id 123
      unicast_src_ip <secondal IP>
      unicast_peer {
        <primal IP>
      }
      track_script {
        chk_nothing
      }
    }

  Procedures
  ----------

  1) Start keepalived on both servers

    $ sudo systemctl start keepalived.service

  2) Restart keepalived on either one

    $ sudo systemctl restart keepalived.service

  3) Check status and PID

    $ systemctl status -n0 keepalived.service
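Steps 2 and 3 can be combined into a small check that compares the Main PID systemd reports before and after a restart. This is a rough sketch, not part of the original report: `main_pid`, `restart_and_compare` and the `SYSTEMCTL` override variable are names invented here. It only catches the case where the Main PID does not change at all; a restart that leaves old children behind can still pass.

```shell
#!/bin/sh
# SYSTEMCTL can be overridden (for example with a stub) when testing.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# main_pid UNIT: print the Main PID systemd currently tracks for UNIT.
# "systemctl show -p MainPID" prints a line of the form "MainPID=1234".
main_pid() {
    out=$($SYSTEMCTL show -p MainPID "$1") || return 1
    echo "${out#MainPID=}"
}

# restart_and_compare UNIT: restart UNIT (needs root) and succeed only if
# the reported Main PID changed across the restart.
restart_and_compare() {
    unit=$1
    old=$(main_pid "$unit") || return 2
    $SYSTEMCTL restart "$unit" || return 2
    new=$(main_pid "$unit") || return 2
    echo "Main PID before: $old, after: $new"
    [ "$old" != "$new" ]
}
```

For example, `restart_and_compare keepalived.service` run as root prints both PIDs and returns non-zero if they are identical.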

  Result
  ------

  0) Before restart

  The ExecStart process is 3402, the Main PID is 3403, and the other
  keepalived processes are 3405 and 3406. So far so good.

    root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived
    ● keepalived.service - Keepalive Daemon (LVS and VRRP)
       Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
       Active: active (running) since Sat 2017-03-04 01:37:12 UTC; 14min ago
      Process: 3402 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)
     Main PID: 3403 (keepalived)
        Tasks: 3
       Memory: 1.7M
          CPU: 1.900s
       CGroup: /system.slice/keepalived.service
               ├─3403 /usr/sbin/keepalived
               ├─3405 /usr/sbin/keepalived
               └─3406 /usr/sbin/keepalived

  1) First restart

  The Main PID is still reported as 3403, which belonged to the previous
  run and has actually exited. Something is wrong. Still, the previous
  processes have all exited, so we are not likely to see any weird
  behavior yet.

    root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived
    root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived
    ● keepalived.service - Keepalive Daemon (LVS and VRRP)
       Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
       Active: active (running) since Sat 2017-03-04 01:51:45 UTC; 1s ago
      Process: 4782 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)
     Main PID: 3403 (code=exited, status=0/SUCCESS)
        Tasks: 3
       Memory: 1.7M
          CPU: 11ms
       CGroup: /system.slice/keepalived.service
               ├─4783 /usr/sbin/keepalived
               ├─4784 /usr/sbin/keepalived
               └─4785 /usr/sbin/keepalived
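The symptom in this output (a Main PID that is marked as exited and is not in the CGroup tree) can be detected mechanically from saved status output. The following is a minimal sketch: `check_main_pid` is a hypothetical helper, and it assumes the daemon path is /usr/sbin/keepalived as in the output shown here.

```shell
#!/bin/sh
# check_main_pid: read `systemctl status` output on stdin and check whether
# the reported Main PID is alive and listed in the CGroup process tree.
# Exit status: 0 = looks fine, 1 = stale Main PID, 2 = no Main PID line.
check_main_pid() {
    awk '
        /Main PID:/ {
            # The PID is the field right after "PID:".
            for (i = 1; i <= NF; i++) if ($i == "PID:") main = $(i + 1)
            if ($0 ~ /code=exited/) exited = 1
        }
        /\/usr\/sbin\/keepalived/ && !/ExecStart=/ {
            # CGroup tree lines: strip the tree-drawing characters,
            # keeping only the digits of the PID.
            pid = $1
            gsub(/[^0-9]/, "", pid)
            cgroup[pid] = 1
        }
        END {
            if (main == "") { print "no Main PID found"; exit 2 }
            if (exited || !(main in cgroup)) {
                print "stale Main PID " main
                exit 1
            }
            print "Main PID " main " is running"
        }
    '
}
```

Usage: `systemctl status -n0 keepalived.service | check_main_pid`. Note that this only flags a Main PID that systemd itself shows as gone; it does not tell whether the listed processes are left over from an earlier start.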

  2) Second restart

  Now the Main PID is 4783 and the processes are still 4783-4785. This is
  problematic because these are the old processes from the first restart,
  which should have exited before the new processes started. As a result,
  keepalived keeps running with the old settings while users believe it
  uses the new ones.

    root@ubuntu-2gb-sgp1-01:~# systemctl restart keepalived
    root@ubuntu-2gb-sgp1-01:~# systemctl status -n0 keepalived
    ● keepalived.service - Keepalive Daemon (LVS and VRRP)
       Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
       Active: active (running) since Sat 2017-03-04 01:51:49 UTC; 1s ago
      Process: 4796 ExecStart=/usr/sbin/keepalived $DAEMON_ARGS (code=exited, status=0/SUCCESS)
     Main PID: 4783 (keepalived)
        Tasks: 3
       Memory: 1.7M
          CPU: 6ms
       CGroup: /system.slice/keepalived.service
               ├─4783 /usr/sbin/keepalived
               ├─4784 /usr/sbin/keepalived
               └─4785 /usr/sbin/keepalived
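The leftover processes can also be spotted by comparing the CGroup PIDs in status output saved before and after a restart. This is a sketch under the same assumptions as the output above (daemon path /usr/sbin/keepalived); `cgroup_pids` and `surviving_pids` are names invented for the example, and PID reuse by the kernel could in principle produce a false match.

```shell
#!/bin/sh
# cgroup_pids: print the PIDs from the CGroup tree of `systemctl status`
# output read on stdin, one per line.
cgroup_pids() {
    awk '/\/usr\/sbin\/keepalived/ && !/ExecStart=/ {
        pid = $1
        gsub(/[^0-9]/, "", pid)   # drop the tree-drawing characters
        print pid
    }'
}

# surviving_pids BEFORE AFTER: print every PID that appears in both saved
# status outputs, i.e. processes that were not replaced by the restart.
surviving_pids() {
    before=$(cgroup_pids < "$1")
    after=$(cgroup_pids < "$2")
    for pid in $before; do
        for other in $after; do
            if [ "$pid" = "$other" ]; then
                echo "$pid"
            fi
        done
    done
}
```

Usage: save `systemctl status -n0 keepalived.service` to one file before the restart and to another afterwards, then run `surviving_pids before.txt after.txt`. Any output means some processes outlived the restart.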

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1644530/+subscriptions