sts-sponsors team mailing list archive
Message #00232
[Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail
Currently, in bionic:
$ systemctl cat pacemaker.service
# /lib/systemd/system/pacemaker.service
After=corosync.service
Requires=corosync.service
$ systemctl cat corosync.service
<nothing about pacemaker>
Desired properties:
i) when corosync is started, attempt to start pacemaker
ii) when corosync is restarted, attempt to restart pacemaker too
iii) when corosync is stopped, do not stop pacemaker
1) Property i) can be satisfied with an [Install] section containing
WantedBy=corosync.service in pacemaker.service.
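As a sketch, the stanza in pacemaker.service would look like this
(excerpt only; the rest of the unit is unchanged):

```ini
# pacemaker.service (excerpt)
[Install]
WantedBy=corosync.service
```

With this in place, `systemctl enable pacemaker.service` creates a
Wants symlink under corosync.service.wants/, so starting corosync
also pulls in pacemaker.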
2) Requires=corosync.service is too strong, as it means that pacemaker
cannot operate without corosync. Is this true?
3) Currently, on upgrade, the corosync prerm script does "stop corosync"
and the postinst later does "start corosync". My understanding is that it
would be better, on upgrades, to simply "restart" corosync instead of
doing a stop & start. Please consider switching the corosync package to
use dh_systemd and the restart-on-upgrade dh_installinit/systemd option.
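A minimal sketch of what that could look like in debian/rules, assuming
the dh_systemd addon is used (note that debhelper spells the option
--restart-after-upgrade):

```makefile
# debian/rules (excerpt; illustrative sketch only)
%:
	dh $@ --with systemd

override_dh_systemd_start:
	# Restart corosync across the upgrade instead of
	# stop (prerm) + start (postinst)
	dh_systemd_start --restart-after-upgrade corosync.service
```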
4) Properties ii) and iii) cannot currently be satisfied simultaneously
with simple stanzas. If pacemaker requires corosync at all times, then
pacemaker.service should declare PartOf=corosync.service; a stop or
restart of corosync would then stop or restart pacemaker as well. That
satisfies condition ii) but violates condition iii). We can instead
introduce a helper unit to achieve both ii) and iii) simultaneously,
e.g.:
pacemaker-restart.service
[Unit]
PartOf=corosync.service
[Service]
# Stay "active" after /bin/true exits, so that ExecStop only
# runs when corosync itself is stopped or restarted
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/bin/systemctl restart pacemaker.service
[Install]
WantedBy=corosync.service
This means that whenever corosync is stopped or restarted,
pacemaker.service will be restarted too. This extra unit satisfies
conditions ii) and iii) as stated.
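If this approach were taken, the helper would be enabled like any other
unit; the commands below are illustrative only:

```
sudo systemctl daemon-reload
sudo systemctl enable pacemaker-restart.service
# Now restarting corosync also restarts pacemaker:
sudo systemctl restart corosync.service
systemctl is-active pacemaker.service
```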
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1740892
Title:
corosync upgrade on 2018-01-02 caused pacemaker to fail
Status in OpenStack hacluster charm:
Invalid
Status in corosync package in Ubuntu:
In Progress
Status in corosync source package in Trusty:
New
Status in corosync source package in Xenial:
New
Status in corosync source package in Zesty:
New
Status in corosync source package in Artful:
New
Status in corosync source package in Bionic:
In Progress
Bug description:
During upgrades on 2018-01-02, corosync and its libraries were upgraded:
(from a trusty/mitaka cloud)
Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64
(2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3,
2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64
(2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3,
2.3.3-1ubuntu4)
During this process, it appears that the pacemaker service is restarted
and errors out:
syslog:Jan 2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now lost (was member)
syslog:Jan 2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]: notice: crm_update_peer_state: pcmk_quorum_notification: Node juju-machine-1-lxc-3[1001] - state is now member (was lost)
syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: cfg_connection_destroy: Connection destroyed
syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: notice: stop_child: Stopping crmd: Sent -15 to process 2050
syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
syslog:Jan 2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]: error: mcp_cpg_destroy: Connection destroyed
This also affected xenial/ocata.
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-hacluster/+bug/1740892/+subscriptions