sts-sponsors team mailing list archive
Message #00191
[Bug 1739033] Re: Corosync: Assertion 'sender_node != NULL' failed when bind iface is ready after corosync boots
#VERIFICATION FOR TRUSTY
- Packages
ii corosync 2.3.3-1ubuntu4 amd64 Standards-based cluster framework (daemon and modules)
ii libcorosync-common4 2.3.3-1ubuntu4 amd64 Standards-based cluster framework, common library
- Reproducer
Using a config file with bad entries (as shown in the description)
ifdown interface
/usr/sbin/corosync -f
ifup interface
- Debug output:
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] entering GATHER state from 0(consensus timeout).
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Creating commit token because I am the rep.
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Saving state aru 0 high seq received 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Storing new sequence id for ring 4
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] entering COMMIT state.
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] got commit token
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] entering RECOVERY state.
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] position [0] member 169.254.241.20:
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] previous ring seq 0 rep 127.0.0.1
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] aru 0 high delivered 0 received flag 1
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Did not need to originate any messages in recovery.
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] got commit token
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Sending initial ORF token
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] Resetting old ring state
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] recovery to regular 1-0
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] waiting_trans_ack changed to 1
Dec 22 12:18:27 trusty-corosync corosync[3910]: [MAIN ] Member joined: r(0) ip(169.254.241.20)
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] entering OPERATIONAL state.
Dec 22 12:18:27 trusty-corosync corosync[3910]: [TOTEM ] A new membership (169.254.241.20:4) was formed. Members joined: 1
Dec 22 12:18:27 trusty-corosync corosync[3910]: [QUORUM] got nodeinfo message from cluster node 1
Dec 22 12:18:27 trusty-corosync corosync[3910]: [QUORUM] nodeinfo message[1]: votes: 1, expected: 2 flags: 8
root@trusty-corosync:/home/vtapia# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 169.254.241.20
status = ring 0 active with no faults
Corosync starts as expected.
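For repeated verification runs, the manual check of the corosync-cfgtool -s output above could be automated. The following is a minimal Python sketch that parses output in the format shown in this message. The helper names parse_ring_status and ring_is_healthy are hypothetical, and the parser assumes the output looks like the sample above; other corosync versions may print it differently.

```python
import re

# Sample copied from the corosync-cfgtool -s output in this message.
SAMPLE = """\
Printing ring status.
Local node ID 1
RING ID 0
id = 169.254.241.20
status = ring 0 active with no faults
"""


def parse_ring_status(text):
    """Return (local_node_id, {ring_id: {"id": ..., "status": ...}})."""
    node_id = None
    rings = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Local node ID (\d+)$", line)
        if m:
            node_id = int(m.group(1))
            continue
        m = re.match(r"RING ID (\d+)$", line)
        if m:
            # Start collecting fields for this ring.
            current = rings.setdefault(int(m.group(1)), {})
            continue
        m = re.match(r"(id|status)\s*=\s*(.+)$", line)
        if m and current is not None:
            current[m.group(1)] = m.group(2)
    return node_id, rings


def ring_is_healthy(ring):
    """True when the status line reports the ring as active with no faults."""
    return ring.get("status", "").endswith("active with no faults")


node_id, rings = parse_ring_status(SAMPLE)
print(node_id, rings[0]["id"], ring_is_healthy(rings[0]))
```

In a real run the text would come from the command itself, for example via subprocess.run(["corosync-cfgtool", "-s"], capture_output=True, text=True).stdout.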
--
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1739033
Title:
Corosync: Assertion 'sender_node != NULL' failed when bind iface is
ready after corosync boots
Status in corosync package in Ubuntu:
Fix Released
Status in corosync source package in Trusty:
Fix Committed
Status in corosync source package in Xenial:
Fix Committed
Status in corosync source package in Zesty:
Fix Released
Status in corosync source package in Artful:
Fix Released
Bug description:
[Impact]
Corosync aborts with SIGABRT if it starts before the interface it has to
bind to is ready.
On boot, if no interface in the bindnetaddr range is up/configured,
corosync binds to lo (127.0.0.1). Once an applicable interface is up,
corosync crashes with the following error message:
corosync: votequorum.c:2019: message_handler_req_exec_votequorum_nodeinfo: Assertion `sender_node != NULL' failed.
Aborted (core dumped)
The last log entries show that the interface is trying to join the
cluster:
Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug [TOTEM ] totemsrp.c:2089 entering OPERATIONAL state.
Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:444) was formed. Members joined: 704573706
During the quorum calculation, the generated nodeid (704573706) for
the node is being used instead of the nodeid specified in the
configuration file (1), and the assert fails because the nodeid is not
present in the member list. Corosync should use the correct nodeid and
continue running after the interface is up, as shown in a fixed
corosync boot:
Dec 19 11:50:56 [4824] xenial-corosync corosync notice [TOTEM ] totemsrp.c:2095 A new membership (169.254.241.10:80) was formed. Members joined: 1
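As a cross-check on the nodeid mismatch described above: 704573706 equals the bind address 169.254.241.10 read as a 32-bit big-endian integer with the top bit cleared. The Python sketch below reproduces that arithmetic. That corosync derived the value this way in the failing run is an inference from the numbers, not something stated in this report, and auto_nodeid is a hypothetical helper name.

```python
import ipaddress


def auto_nodeid(addr, clear_high_bit=True):
    """Turn an IPv4 address into a 32-bit integer, optionally masking bit 31.

    Assumption: this mirrors how the generated nodeid in the log was
    obtained; it is checked here only against the value in the report.
    """
    value = int(ipaddress.IPv4Address(addr))
    if clear_high_bit:
        value &= 0x7FFFFFFF
    return value


generated = auto_nodeid("169.254.241.10")
print(generated)  # 704573706, the nodeid seen in the failing log
```

The nodelist in the test case maps 169.254.241.10 to nodeid 1, so a member announcing itself as 704573706 has no matching entry, which fits the failed sender_node lookup described above.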
[Environment]
Xenial 16.04.3
Packages:
ii corosync 2.3.5-3ubuntu1 amd64 cluster engine daemon and utilities
ii libcorosync-common4:amd64 2.3.5-3ubuntu1 amd64 cluster engine common library
[Test Case]
Config:
totem {
version: 2
member {
memberaddr: 169.254.241.10
}
member {
memberaddr: 169.254.241.20
}
transport: udpu
crypto_cipher: none
crypto_hash: none
nodeid: 1
interface {
ringnumber: 0
bindnetaddr: 169.254.241.0
mcastport: 5405
ttl: 1
}
}
quorum {
provider: corosync_votequorum
expected_votes: 2
}
nodelist {
node {
ring0_addr: 169.254.241.10
nodeid: 1
}
node {
ring0_addr: 169.254.241.20
nodeid: 2
}
}
1. ifdown interface (169.254.241.10)
2. start corosync (/usr/sbin/corosync -f)
3. ifup interface
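The three steps above can be scripted when the test has to be repeated on several releases. This is a rough Python sketch, not part of the original test case: the interface name passed in, the settle time, and the helper names build_steps and reproduce are all assumptions. It must run as root on a machine using the configuration above.

```python
import subprocess
import time


def build_steps(iface, corosync="/usr/sbin/corosync"):
    """Return the argv lists for the three steps of the test case."""
    return [
        ["ifdown", iface],   # 1. take the bind interface down
        [corosync, "-f"],    # 2. start corosync in the foreground
        ["ifup", iface],     # 3. bring the interface back up
    ]


def reproduce(iface, settle_seconds=10.0):
    """Run the steps and return corosync's exit status.

    Returns None if corosync is still running after the interface comes
    up (expected with the fix), or a negative signal number such as -6
    (SIGABRT) if it was killed (expected without the fix).
    The settle time is an arbitrary assumption.
    """
    down, daemon, up = build_steps(iface)
    subprocess.run(down, check=True)
    proc = subprocess.Popen(daemon)
    try:
        time.sleep(settle_seconds)
        subprocess.run(up, check=True)
        time.sleep(settle_seconds)
        return proc.poll()
    finally:
        # Clean up the foreground daemon if it survived the test.
        if proc.poll() is None:
            proc.terminate()
            proc.wait()
```

Calling reproduce("eth1") would run the sequence against a hypothetical interface named eth1; substitute the interface that carries 169.254.241.10.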
[Regression Potential]
This patch affects corosync boot; any regression would most likely
show up as problems during corosync startup or configuration parsing.
[Other info]
# Upstream corosync commit:
https://github.com/corosync/corosync/commit/aab55a004bb12ebe78db341dc56759dfe710c1b2
# git describe aab55a004bb12ebe78db341dc56759dfe710c1b2
v2.3.5-45-gaab55a0
# rmadison corosync
corosync | 2.3.3-1ubuntu1 | trusty | source, amd64, arm64, armhf, i386, powerpc, ppc64el
corosync | 2.3.3-1ubuntu3 | trusty-updates | source, amd64, arm64, armhf, i386, powerpc, ppc64el
corosync | 2.3.5-3ubuntu1 | xenial | source, amd64, arm64, armhf, i386, powerpc, ppc64el, s390x
corosync | 2.4.2-3build1 | zesty | source, amd64, arm64, armhf, i386, ppc64el, s390x
corosync | 2.4.2-3build1 | artful | source, amd64, arm64, armhf, i386, ppc64el, s390x
corosync | 2.4.2-3build1 | bionic | source, amd64, arm64, armhf, i386, ppc64el, s390x