maria-discuss team mailing list archive
Message #05181
Galera partitioning question
Greetings all,
I am running MariaDB 10.2.16 on CentOS in AWS and am seeing sporadic
cluster partitioning and rejoining with no apparent cause.
* I have nodes in 3 different AWS availability zones in a single
Galera cluster.
* Monitoring the logs, I see this message:
Jul 29 05:33:53 server01 mysqld: 2018-07-29 5:33:53 139633883080448
[Note] WSREP: (eabb848a, 'tcp://0.0.0.0:4567') connection to peer
392b9516 with addr tcp://172.31.17.60:4567 timed out, no messages
seen in PT3S
* I have tried forcing a 1500-byte MTU, as some other sources mentioned
that jumbo frames could negatively impact Galera replication.
* Running prolonged packet captures between nodes, I cannot find
anything else wrong: network connectivity isn't interrupted and
no service restarts occur.
* These partition events happen multiple times per day.
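To quantify how often these events occur and which peers are involved, the timeout messages can be pulled out of the log and counted. The following is a minimal sketch, assuming each message sits on a single line in the syslog format quoted above; the function names and the regular expression are illustrative only and are not part of MariaDB or Galera.

```python
import re
from collections import Counter

# Matches the WSREP peer timeout message in the format quoted above.
# Assumption: the whole message is on one log line.
TIMEOUT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}:\d{2})\s+\d+\s+\[Note\]\s+WSREP:\s+"
    r"\((?P<local>[0-9a-f]+),\s*'(?P<listen>[^']+)'\)\s+"
    r"connection to peer (?P<peer>[0-9a-f]+) with addr (?P<addr>\S+) "
    r"timed out, no messages seen in (?P<period>PT[0-9A-Z.]+)"
)


def parse_timeouts(lines):
    """Return one dict per timeout message found in the given log lines."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def count_by_peer(events):
    """Count timeout events per (peer id, peer address) pair."""
    return Counter((e["peer"], e["addr"]) for e in events)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python wsrep_timeouts.py < /var/log/messages
    for (peer, addr), n in count_by_peer(parse_timeouts(sys.stdin)).most_common():
        print(f"{peer} {addr} {n}")
```

Running this on each node and comparing the counts per peer may show whether the timeouts cluster around one node or one availability zone.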
Has anyone seen this sporadic cluster disconnect and rejoin issue in a
similar environment? I did not previously see this behavior on 10.1.
Any help is much appreciated.
-Ryan