group.of.nepali.translators team mailing list archive

[Bug 1586876] Re: Corosync report "Started" itself too early

 

From upstream documentation:

"""
Pacemaker used to obtain membership and quorum from a custom Corosync plugin. This plugin also had the capability to start Pacemaker automatically when Corosync was started. Neither behavior is possible with Corosync 2.0 and beyond as support for plugins was removed.

Instead, Pacemaker must be started as a separate job/initscript. Also, since Pacemaker made use of the plugin for message routing, a node using the plugin (Corosync prior to 2.0) cannot talk to one that isn’t (Corosync 2.0+).
Rolling upgrades between these versions are therefore not possible and an alternate strategy must be used.
"""

This shows that, since Ubuntu Trusty, this detection behavior is no
longer supported. Nowadays, we start both services separately, using
systemd.

Corosync starts with a simple single-node (localhost) ring configured:

(c)rafaeldtinoco@clusterdev:~$ systemctl status corosync
● corosync.service - Corosync Cluster Engine
     Loaded: loaded (/lib/systemd/system/corosync.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2020-03-19 20:16:49 UTC; 45min ago
       Docs: man:corosync
             man:corosync.conf
             man:corosync_overview
   Main PID: 851 (corosync)
      Tasks: 9 (limit: 23186)
     Memory: 125.9M
     CGroup: /system.slice/corosync.service
             └─851 /usr/sbin/corosync -f

(c)rafaeldtinoco@clusterdev:~$ sudo corosync-quorumtool
Quorum information
------------------
Date:             Thu Mar 19 21:02:21 2020
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          1
Ring ID:          1.5
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   1
Highest expected: 1
Total votes:      1
Quorum:           1
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 node1 (local)

Systemd is responsible for guaranteeing the synchronization needed
between the two services.

----

From the pacemaker service unit:

...
After=corosync.service
Requires=corosync.service

...

# If you want Corosync to stop whenever Pacemaker is stopped,
# uncomment the next line too:
#
# ExecStopPost=/bin/sh -c 'pidof pacemaker-controld || killall -TERM corosync'

...

# Pacemaker will restart along with Corosync if Corosync is stopped while
# Pacemaker is running.
# In this case, if you want to be fenced always (if you do not want to restart)
# uncomment ExecStopPost below.
#
# ExecStopPost=/bin/sh -c 'pidof corosync || \
#              /usr/bin/systemctl --no-block stop pacemaker' 

So you have different options to control start, stop, and restart
behavior according to the Corosync status.
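
As a rough illustration of the ordering described above, the following
sketch checks whether a unit file declares a given dependency. The
helper name unit_declares is hypothetical, and it only reads the plain
"Directive=value" lines of a single file; it does not account for
drop-ins or for dependencies that systemd adds implicitly.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd, corosync, or pacemaker):
# succeed if unit file $1 lists unit $3 as a value of directive $2.
# Usage: unit_declares FILE DIRECTIVE UNIT
unit_declares() {
    file=$1
    directive=$2
    unit=$3
    # Print the value part of every "DIRECTIVE=" line, put each
    # whitespace-separated value on its own line, then look for an
    # exact, whole-line match of the unit name.
    sed -n "s/^${directive}=//p" "$file" \
        | tr -s ' \t' '\n\n' \
        | grep -Fxq "$unit"
}
```

For example, assuming the pacemaker unit is installed under
/lib/systemd/system, as the corosync one is above:

    unit_declares /lib/systemd/system/pacemaker.service After corosync.service

On a running system, "systemctl show -p After -p Requires
pacemaker.service" reports the effective values, including drop-ins.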


** Changed in: corosync (Ubuntu Focal)
       Status: Triaged => Fix Released

** Changed in: corosync (Ubuntu Eoan)
       Status: Triaged => Fix Released

** Changed in: corosync (Ubuntu Bionic)
       Status: Triaged => Fix Released

** Changed in: corosync (Ubuntu Xenial)
       Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1586876

Title:
  Corosync report "Started" itself too early

Status in corosync package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  Won't Fix
Status in corosync source package in Xenial:
  Won't Fix
Status in corosync source package in Bionic:
  Fix Released
Status in corosync source package in Disco:
  Won't Fix
Status in corosync source package in Eoan:
  Fix Released
Status in corosync source package in Focal:
  Fix Released

Bug description:
  Problem description:
  Currently, do_start() performs no service state check after calling
  start-stop-daemon. If corosync reports itself as started too early,
  pacemaker may assume a 'heartbeat'-based cluster, which is not what we
  want. We should check that corosync has "really" started before
  reporting its state.

  syslog with wrong state:
  May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
  May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Corosync built-in features: nss
  May 24 19:53:50 myhost corosync[1018]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
  May 24 19:53:50 myhost corosync[1018]:   [TOTEM ] Initializing transport (UDP/IP Unicast).
  May 24 19:53:50 myhost corosync[1018]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
  May 24 19:53:50 myhost pacemakerd: [1094]: info: Invoked: pacemakerd
  May 24 19:53:50 myhost pacemakerd: [1094]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
  May 24 19:53:50 myhost pacemakerd: [1094]: info: get_cluster_type: Assuming a 'heartbeat' based cluster
  May 24 19:53:50 myhost pacemakerd: [1094]: info: read_config: Reading configure for stack: heartbeat

  expected result:
  May 24 21:45:02 myhost corosync[1021]:   [MAIN  ] Completed service synchronization, ready to provide service.
  May 24 21:45:02 myhost pacemakerd: [1106]: info: Invoked: pacemakerd
  May 24 21:45:02 myhost pacemakerd: [1106]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
  May 24 21:45:02 myhost pacemakerd: [1106]: info: config_find_next: Processing additional service options...
  May 24 21:45:02 myhost pacemakerd: [1106]: info: get_config_opt: Found 'pacemaker' for option: name
  May 24 21:45:02 myhost pacemakerd: [1106]: info: get_config_opt: Found '1' for option: ver
  May 24 21:45:02 myhost pacemakerd: [1106]: info: get_cluster_type: Detected an active 'classic openais (with plugin)' cluster

  Please note the order of the following two lines:
  * corosync: [MAIN  ] Completed service synchronization, ready to provide service.
  * pacemakerd: info: get_cluster_type: ...

  affected versions:
  ALL (precise, trusty, vivid, wily, xenial, yakkety)

  upstream solution: wait_for_ipc()
  https://github.com/corosync/corosync/blob/master/init/corosync.in#L84-L99
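
The idea behind such a wait can be sketched as a small polling loop:
retry a readiness probe until it succeeds or the attempts run out, and
only then report the service as started. This is a minimal sketch, not
the upstream code; the function name wait_for_ready and its arguments
are hypothetical, and using "corosync-cfgtool -s" as the probe is an
assumption.

```shell
#!/bin/sh
# Hypothetical helper (a sketch, not the upstream wait_for_ipc):
# run COMMAND up to TRIES times, sleeping INTERVAL seconds between
# failed attempts. Returns 0 as soon as COMMAND succeeds, 1 otherwise.
# Usage: wait_for_ready TRIES INTERVAL COMMAND [ARGS...]
wait_for_ready() {
    tries=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # The probe's output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

An init script could then report success only after the probe passes,
for example:

    wait_for_ready 20 1 corosync-cfgtool -s && echo "corosync is ready"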

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1586876/+subscriptions