← Back to team overview

maria-discuss team mailing list archive

Synchronization down after reboot

 

    Hello,

I'm using MariaDB with Galera replication on 3 Ubuntu 12.04 servers :
Server version: 5.5.33a-MariaDB-1~precise mariadb.org binary distribution, wsrep_23.7.6.rXXXX

They're all development servers, in the same local network, so there's no firewall between them.
They were all working fine until a reboot.
Now node 2 and 3 aren't in sync anymore and mysql refuse to start.

Replication is done using xtrabackup.

I can start over by doing a mysqlhotcopy on node1 and put the data on node2 and node3 but i'd like to understand why it didn't survived a reboot.

Here's my nodes :
* Node 1 : dev-sql-001 / 192.168.3.121
* Node 2 : dev-sql-002 / 192.168.3.221
* Node 3 : dev-sql-003 / 192.168.3.222

Here's the log on node 1 when I try to start node 2 :
Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: declaring abbffcb5-3af2-11e3-a043-5b38dfb689cf stable Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: view(view_id(PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:29 dev-sql-001 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf, Oct 22 10:19:29 dev-sql-001 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:29 dev-sql-001 mysqld: } joined {
Oct 22 10:19:29 dev-sql-001 mysqld: } left {
Oct 22 10:19:29 dev-sql-001 mysqld: } partitioned {
Oct 22 10:19:29 dev-sql-001 mysqld: })
Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: sent state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 0 (dev-sql-002) Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 1 (dev-sql-001) Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Quorum results:
Oct 22 10:19:29 dev-sql-001 mysqld: #011version    = 2,
Oct 22 10:19:29 dev-sql-001 mysqld: #011component  = PRIMARY,
Oct 22 10:19:29 dev-sql-001 mysqld: #011conf_id    = 5,
Oct 22 10:19:29 dev-sql-001 mysqld: #011members    = 1/2 (joined/total),
Oct 22 10:19:29 dev-sql-001 mysqld: #011act_id     = 968,
Oct 22 10:19:29 dev-sql-001 mysqld: #011last_appl. = 0,
Oct 22 10:19:29 dev-sql-001 mysqld: #011protocols  = 0/4/2 (gcs/repl/appl),
Oct 22 10:19:29 dev-sql-001 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Flow-control interval: [23, 23] Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 6: Primary, number of nodes: 2, my index: 1, protocol version 2 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Node 0 (dev-sql-002) requested state transfer from '*any*'. Selected 1 (dev-sql-001)(SYNCED) as donor. Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 968) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968'' Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: sst_donor_thread signaled with 0 Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Streaming with tar (20131022 10:19:32.294) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20131022 10:19:32.301) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Streaming the backup to joiner at 192.168.3.221 4444 (20131022 10:19:32.330) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20131022 10:19:32.662) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [ERROR] Cleanup after exit with status:22 (20131022 10:19:32.668) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968' Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968': 22 (Invalid argument) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Warning] WSREP: 1 (dev-sql-001): State transfer to 0 (dev-sql-002) failed: -1 (Operation not permitted) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 968) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Member 1 (dev-sql-001) synced with group. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 968) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Synchronized with group, ready for connections Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: view(view_id(PRIM,f743192f-3aee-11e3-a47b-97626a725dde,7) memb { Oct 22 10:19:33 dev-sql-001 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:33 dev-sql-001 mysqld: } joined {
Oct 22 10:19:33 dev-sql-001 mysqld: } left {
Oct 22 10:19:33 dev-sql-001 mysqld: } partitioned {
Oct 22 10:19:33 dev-sql-001 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf,
Oct 22 10:19:33 dev-sql-001 mysqld: })
Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: forgetting abbffcb5-3af2-11e3-a043-5b38dfb689cf (tcp://192.168.3.221:4567) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ae9636c9-3af2-11e3-b69e-0fef55014e34 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE EXCHANGE: sent state msg: ae9636c9-3af2-11e3-b69e-0fef55014e34 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE EXCHANGE: got state msg: ae9636c9-3af2-11e3-b69e-0fef55014e34 from 0 (dev-sql-001) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Quorum results:
Oct 22 10:19:33 dev-sql-001 mysqld: #011version    = 2,
Oct 22 10:19:33 dev-sql-001 mysqld: #011component  = PRIMARY,
Oct 22 10:19:33 dev-sql-001 mysqld: #011conf_id    = 6,
Oct 22 10:19:33 dev-sql-001 mysqld: #011members    = 1/1 (joined/total),
Oct 22 10:19:33 dev-sql-001 mysqld: #011act_id     = 968,
Oct 22 10:19:33 dev-sql-001 mysqld: #011last_appl. = 0,
Oct 22 10:19:33 dev-sql-001 mysqld: #011protocols  = 0/4/2 (gcs/repl/appl),
Oct 22 10:19:33 dev-sql-001 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Flow-control interval: [16, 16] Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 7: Primary, number of nodes: 1, my index: 0, protocol version 2 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:38 dev-sql-001 mysqld: 131022 10:19:38 [Note] WSREP: cleaning up abbffcb5-3af2-11e3-a043-5b38dfb689cf (tcp://192.168.3.221:4567)


/var/lib/mysql//innobackup.backup.log on node1 :
131022 10:19:32 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/mysql/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock' as 'root' (using password: YES). innobackupex: Error: Failed to connect to MySQL server: DBI connect(';mysql_read_default_file=/etc/mysql/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock','root',...) failed: Access denied for user 'root'@'localhost' (using password: YES) at /usr//bin/innobackupex line 2902


As for node2 logs :
Oct 22 10:19:25 dev-sql-002 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql Oct 22 10:19:25 dev-sql-002 mysqld_safe: WSREP: Running position recovery with --log_error=/tmp/tmp.gp3R8iSBdK --pid-file=/var/lib/mysql/dev-sql-002-recover.pid Oct 22 10:19:25 dev-sql-002 mysqld: 131022 10:19:25 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288 Oct 22 10:19:28 dev-sql-002 mysqld_safe: WSREP: Recovered position 2896facc-2110-11e3-0800-717f6d7ec2af:370 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_start_position var submitted: '2896facc-2110-11e3-0800-717f6d7ec2af:370' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Read nil XID from storage engines, skipping position init Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_load(): Galera 23.2.7-wheezy(r) by Codership Oy <info@xxxxxxxxxxxxx> loaded succesfully. Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'. Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Passing config to GCS: base_host = 192.168.3.221; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_sst_grab() Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Start replication Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: protonet asio version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: backend: asio Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: GMCast version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: EVS version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: PC version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: gcomm: connecting to group 'dev-sql', peer 'dev-sql-001:,dev-sql-002:' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Warning] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') address 'tcp://192.168.3.221:4567' points to own listening address, blacklisting Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') address 'tcp://192.168.3.221:4567' pointing to uuid abbffcb5-3af2-11e3-a043-5b38dfb689cf is blacklisted, skipping Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: declaring f743192f-3aee-11e3-a47b-97626a725dde stable Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: view(view_id(PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:29 dev-sql-002 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf, Oct 22 10:19:29 dev-sql-002 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:29 dev-sql-002 mysqld: } joined {
Oct 22 10:19:29 dev-sql-002 mysqld: } left {
Oct 22 10:19:29 dev-sql-002 mysqld: } partitioned {
Oct 22 10:19:29 dev-sql-002 mysqld: })
Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: gcomm: connected Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Opened channel 'dev-sql' Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Waiting for SST to complete. Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: sent state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 0 (dev-sql-002) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 1 (dev-sql-001) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Quorum results:
Oct 22 10:19:29 dev-sql-002 mysqld: #011version    = 2,
Oct 22 10:19:29 dev-sql-002 mysqld: #011component  = PRIMARY,
Oct 22 10:19:29 dev-sql-002 mysqld: #011conf_id    = 5,
Oct 22 10:19:29 dev-sql-002 mysqld: #011members    = 1/2 (joined/total),
Oct 22 10:19:29 dev-sql-002 mysqld: #011act_id     = 968,
Oct 22 10:19:29 dev-sql-002 mysqld: #011last_appl. = -1,
Oct 22 10:19:29 dev-sql-002 mysqld: #011protocols  = 0/4/2 (gcs/repl/appl),
Oct 22 10:19:29 dev-sql-002 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Flow-control interval: [23, 23] Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 968) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: State transfer required: Oct 22 10:19:29 dev-sql-002 mysqld: #011Group state: 2896facc-2110-11e3-0800-717f6d7ec2af:968 Oct 22 10:19:29 dev-sql-002 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 6: Primary, number of nodes: 2, my index: 0, protocol version 2 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Warning] WSREP: Gap in state sequence. Need state transfer. Oct 22 10:19:31 dev-sql-002 mysqld: 131022 10:19:31 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '192.168.3.221' --auth 'root:test' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '13950'' Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] Streaming with tar (20131022 10:19:31.981) Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20131022 10:19:31.988) Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] However, xbstream requires manual cleanup of data directory before SST - lp:1193240. (20131022 10:19:31.994) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Prepared SST request: xtrabackup|192.168.3.221:4444/xtrabackup_sst Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (2896facc-2110-11e3-0800-717f6d7ec2af): 1 (Operation not permitted) Oct 22 10:19:32 dev-sql-002 mysqld: #011 at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Node 0 (dev-sql-002) requested state transfer from '*any*'. Selected 1 (dev-sql-001)(SYNCED) as donor. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 968) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Requesting state transfer: success, donor: 1 Oct 22 10:19:32 dev-sql-002 mysqld: tar: This does not look like a tar archive Oct 22 10:19:32 dev-sql-002 mysqld: tar: Exiting with failure status due to previous errors Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 2 (20131022 10:19:32.659) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Data directory /var/lib/mysql/ needs to be empty for SST: Manual intervention required in that case (20131022 10:19:32.666) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Cleanup after exit with status:32 (20131022 10:19:32.672) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Warning] WSREP: 1 (dev-sql-001): State transfer to 0 (dev-sql-002) failed: -1 (Operation not permitted) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: terminating thread Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: joining thread Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: closing backend Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Killing nc pid (20131022 10:19:32.707) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [INFO] Removing the sst_in_progress file (20131022 10:19:32.725) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '192.168.3.221' --auth 'root:test' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '13950': 32 (Broken pipe) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: Failed to read uuid:seqno from joiner script. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: SST failed: 32 (Broken pipe)
Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] Aborting
Oct 22 10:19:32 dev-sql-002 mysqld:
Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: view(view_id(NON_PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:33 dev-sql-002 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf,
Oct 22 10:19:33 dev-sql-002 mysqld: } joined {
Oct 22 10:19:33 dev-sql-002 mysqld: } left {
Oct 22 10:19:33 dev-sql-002 mysqld: } partitioned {
Oct 22 10:19:33 dev-sql-002 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:33 dev-sql-002 mysqld: })
Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: view((empty)) Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: gcomm: closed Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: /usr/sbin/mysqld: Terminated. Oct 22 10:19:33 dev-sql-002 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)' Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]:

WSREP status on node1 :
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_%'
    -> ;
+----------------------------+--------------------------------------+
| Variable_name              | Value |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 2896facc-2110-11e3-0800-717f6d7ec2af |
| wsrep_protocol_version     | 4 |
| wsrep_last_committed       | 968 |
| wsrep_replicated           | 0 |
| wsrep_replicated_bytes     | 0 |
| wsrep_received             | 12 |
| wsrep_received_bytes       | 1126 |
| wsrep_local_commits        | 0 |
| wsrep_local_cert_failures  | 0 |
| wsrep_local_bf_aborts      | 0 |
| wsrep_local_replays        | 0 |
| wsrep_local_send_queue     | 0 |
| wsrep_local_send_queue_avg | 0.500000 |
| wsrep_local_recv_queue     | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused  | 0.000000 |
| wsrep_flow_control_sent    | 0 |
| wsrep_flow_control_recv    | 0 |
| wsrep_cert_deps_distance   | 0.000000 |
| wsrep_apply_oooe           | 0.000000 |
| wsrep_apply_oool           | 0.000000 |
| wsrep_apply_window         | 0.000000 |
| wsrep_commit_oooe          | 0.000000 |
| wsrep_commit_oool          | 0.000000 |
| wsrep_commit_window        | 0.000000 |
| wsrep_local_state          | 4 |
| wsrep_local_state_comment  | Synced |
| wsrep_cert_index_size      | 0 |
| wsrep_causal_reads         | 0 |
| wsrep_incoming_addresses   | 192.168.3.121:3306 |
| wsrep_cluster_conf_id      | 5 |
| wsrep_cluster_size         | 1 |
| wsrep_cluster_state_uuid   | 2896facc-2110-11e3-0800-717f6d7ec2af |
| wsrep_cluster_status       | Primary |
| wsrep_connected            | ON |
| wsrep_local_index          | 0 |
| wsrep_provider_name        | Galera |
| wsrep_provider_vendor      | Codership Oy <info@xxxxxxxxxxxxx>    |
| wsrep_provider_version     | 23.2.7-wheezy(r) |
| wsrep_ready                | ON |
+----------------------------+--------------------------------------+
40 rows in set (0.00 sec)

Thanks,

--
  Guillaume Hilt



Follow ups