← Back to team overview

maria-discuss team mailing list archive

Re: Synchronization down after reboot

 

I fixed the issue bu replicating the mysql folder of the first node to the others nodes. I think the problem was caused one evening when all the node were rebooted at the same time (the hypervisor hosting them was halted) and didn't had time to send unregister from the other nodes.

  Guillaume Hilt

Le 22/10/2013 10:46, Guillaume Hilt a écrit :
Mariadb conf file on node1 :

[mysqld]
wsrep_cluster_name = 'dev-sql'
wsrep_node_name = dev-sql-001
wsrep_node_address = "192.168.3.121"
wsrep_cluster_address = 'gcomm://'
wsrep_retry_autocommit = 0
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_retry_autocommit = 0
wsrep_sst_method = xtrabackup
wsrep_sst_auth = root:test

# Other mysqld options
binlog_format = ROW
innodb_autoinc_lock_mode = 2
innodb_flush_log_at_trx_commit = 2
innodb_locks_unsafe_for_binlog = 1

Mariadb conf file on node2 :

[mysqld]
wsrep_cluster_name = 'dev-sql'
wsrep_node_name = dev-sql-002
wsrep_node_address = "192.168.3.221"
wsrep_cluster_address = 'gcomm://dev-sql-001,dev-sql-002,dev-sql-003'
wsrep_retry_autocommit = 0
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_sst_method = xtrabackup
wsrep_sst_auth = root:test

# Other mysqld options
binlog_format = ROW
innodb_autoinc_lock_mode = 2
innodb_flush_log_at_trx_commit = 2
innodb_locks_unsafe_for_binlog = 1

All servers have the hosts file with the other hosts names in it.

  Guillaume Hilt

Le 22/10/2013 10:35, Guillaume Hilt a écrit :
    Hello,

I'm using MariaDB with Galera replication on 3 Ubuntu 12.04 servers :
Server version: 5.5.33a-MariaDB-1~precise mariadb.org binary distribution, wsrep_23.7.6.rXXXX

They're all development servers, in the same local network, so there's no firewall between them.
They were all working fine until a reboot.
Now node 2 and 3 aren't in sync anymore and mysql refuse to start.

Replication is done using xtrabackup.

I can start over by doing a mysqlhotcopy on node1 and put the data on node2 and node3 but i'd like to understand why it didn't survived a reboot.

Here's my nodes :
* Node 1 : dev-sql-001 / 192.168.3.121
* Node 2 : dev-sql-002 / 192.168.3.221
* Node 3 : dev-sql-003 / 192.168.3.222

Here's the log on node 1 when I try to start node 2 :
Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: declaring abbffcb5-3af2-11e3-a043-5b38dfb689cf stable Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: view(view_id(PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:29 dev-sql-001 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf, Oct 22 10:19:29 dev-sql-001 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:29 dev-sql-001 mysqld: } joined {
Oct 22 10:19:29 dev-sql-001 mysqld: } left {
Oct 22 10:19:29 dev-sql-001 mysqld: } partitioned {
Oct 22 10:19:29 dev-sql-001 mysqld: })
Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID. Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: sent state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 0 (dev-sql-002) Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 1 (dev-sql-001) Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Quorum results:
Oct 22 10:19:29 dev-sql-001 mysqld: #011version    = 2,
Oct 22 10:19:29 dev-sql-001 mysqld: #011component  = PRIMARY,
Oct 22 10:19:29 dev-sql-001 mysqld: #011conf_id    = 5,
Oct 22 10:19:29 dev-sql-001 mysqld: #011members    = 1/2 (joined/total),
Oct 22 10:19:29 dev-sql-001 mysqld: #011act_id     = 968,
Oct 22 10:19:29 dev-sql-001 mysqld: #011last_appl. = 0,
Oct 22 10:19:29 dev-sql-001 mysqld: #011protocols = 0/4/2 (gcs/repl/appl), Oct 22 10:19:29 dev-sql-001 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Flow-control interval: [23, 23] Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 6: Primary, number of nodes: 2, my index: 1, protocol version 2 Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:29 dev-sql-001 mysqld: 131022 10:19:29 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Node 0 (dev-sql-002) requested state transfer from '*any*'. Selected 1 (dev-sql-001)(SYNCED) as donor. Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 968) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968'' Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: sst_donor_thread signaled with 0 Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Streaming with tar (20131022 10:19:32.294) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20131022 10:19:32.301) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [INFO] Streaming the backup to joiner at 192.168.3.221 4444 (20131022 10:19:32.330) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20131022 10:19:32.662) Oct 22 10:19:32 dev-sql-001 mysqld: WSREP_SST: [ERROR] Cleanup after exit with status:22 (20131022 10:19:32.668) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968' Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '192.168.3.221:4444/xtrabackup_sst' --auth 'root:test' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --gtid '2896facc-2110-11e3-0800-717f6d7ec2af:968': 22 (Invalid argument) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Warning] WSREP: 1 (dev-sql-001): State transfer to 0 (dev-sql-002) failed: -1 (Operation not permitted) Oct 22 10:19:32 dev-sql-001 mysqld: 131022 10:19:32 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 968) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Member 1 (dev-sql-001) synced with group. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 968) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Synchronized with group, ready for connections Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: view(view_id(PRIM,f743192f-3aee-11e3-a47b-97626a725dde,7) memb { Oct 22 10:19:33 dev-sql-001 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:33 dev-sql-001 mysqld: } joined {
Oct 22 10:19:33 dev-sql-001 mysqld: } left {
Oct 22 10:19:33 dev-sql-001 mysqld: } partitioned {
Oct 22 10:19:33 dev-sql-001 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf,
Oct 22 10:19:33 dev-sql-001 mysqld: })
Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: forgetting abbffcb5-3af2-11e3-a043-5b38dfb689cf (tcp://192.168.3.221:4567) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ae9636c9-3af2-11e3-b69e-0fef55014e34 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE EXCHANGE: sent state msg: ae9636c9-3af2-11e3-b69e-0fef55014e34 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: STATE EXCHANGE: got state msg: ae9636c9-3af2-11e3-b69e-0fef55014e34 from 0 (dev-sql-001) Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Quorum results:
Oct 22 10:19:33 dev-sql-001 mysqld: #011version    = 2,
Oct 22 10:19:33 dev-sql-001 mysqld: #011component  = PRIMARY,
Oct 22 10:19:33 dev-sql-001 mysqld: #011conf_id    = 6,
Oct 22 10:19:33 dev-sql-001 mysqld: #011members    = 1/1 (joined/total),
Oct 22 10:19:33 dev-sql-001 mysqld: #011act_id     = 968,
Oct 22 10:19:33 dev-sql-001 mysqld: #011last_appl. = 0,
Oct 22 10:19:33 dev-sql-001 mysqld: #011protocols = 0/4/2 (gcs/repl/appl), Oct 22 10:19:33 dev-sql-001 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Flow-control interval: [16, 16] Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 7: Primary, number of nodes: 1, my index: 0, protocol version 2 Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:33 dev-sql-001 mysqld: 131022 10:19:33 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:38 dev-sql-001 mysqld: 131022 10:19:38 [Note] WSREP: cleaning up abbffcb5-3af2-11e3-a043-5b38dfb689cf (tcp://192.168.3.221:4567)


/var/lib/mysql//innobackup.backup.log on node1 :
131022 10:19:32 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/mysql/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock' as 'root' (using password: YES). innobackupex: Error: Failed to connect to MySQL server: DBI connect(';mysql_read_default_file=/etc/mysql/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock','root',...) failed: Access denied for user 'root'@'localhost' (using password: YES) at /usr//bin/innobackupex line 2902


As for node2 logs :
Oct 22 10:19:25 dev-sql-002 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql Oct 22 10:19:25 dev-sql-002 mysqld_safe: WSREP: Running position recovery with --log_error=/tmp/tmp.gp3R8iSBdK --pid-file=/var/lib/mysql/dev-sql-002-recover.pid Oct 22 10:19:25 dev-sql-002 mysqld: 131022 10:19:25 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288 Oct 22 10:19:28 dev-sql-002 mysqld_safe: WSREP: Recovered position 2896facc-2110-11e3-0800-717f6d7ec2af:370 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_start_position var submitted: '2896facc-2110-11e3-0800-717f6d7ec2af:370' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Read nil XID from storage engines, skipping position init Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_load(): Galera 23.2.7-wheezy(r) by Codership Oy <info@xxxxxxxxxxxxx> loaded succesfully. Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'. Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Passing config to GCS: base_host = 192.168.3.221; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: wsrep_sst_grab() Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Start replication Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: protonet asio version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: backend: asio Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: GMCast version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') multicast: , ttl: 1 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: EVS version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: PC version 0 Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: gcomm: connecting to group 'dev-sql', peer 'dev-sql-001:,dev-sql-002:' Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Warning] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') address 'tcp://192.168.3.221:4567' points to own listening address, blacklisting Oct 22 10:19:28 dev-sql-002 mysqld: 131022 10:19:28 [Note] WSREP: (abbffcb5-3af2-11e3-a043-5b38dfb689cf, 'tcp://0.0.0.0:4567') address 'tcp://192.168.3.221:4567' pointing to uuid abbffcb5-3af2-11e3-a043-5b38dfb689cf is blacklisted, skipping Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: declaring f743192f-3aee-11e3-a47b-97626a725dde stable Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Node f743192f-3aee-11e3-a47b-97626a725dde state prim Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: view(view_id(PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:29 dev-sql-002 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf, Oct 22 10:19:29 dev-sql-002 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:29 dev-sql-002 mysqld: } joined {
Oct 22 10:19:29 dev-sql-002 mysqld: } left {
Oct 22 10:19:29 dev-sql-002 mysqld: } partitioned {
Oct 22 10:19:29 dev-sql-002 mysqld: })
Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: gcomm: connected Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Opened channel 'dev-sql' Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Waiting for SST to complete. Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: sent state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 0 (dev-sql-002) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: STATE EXCHANGE: got state msg: ac5a1fa8-3af2-11e3-ba64-7e36060f57f6 from 1 (dev-sql-001) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Quorum results:
Oct 22 10:19:29 dev-sql-002 mysqld: #011version    = 2,
Oct 22 10:19:29 dev-sql-002 mysqld: #011component  = PRIMARY,
Oct 22 10:19:29 dev-sql-002 mysqld: #011conf_id    = 5,
Oct 22 10:19:29 dev-sql-002 mysqld: #011members    = 1/2 (joined/total),
Oct 22 10:19:29 dev-sql-002 mysqld: #011act_id     = 968,
Oct 22 10:19:29 dev-sql-002 mysqld: #011last_appl. = -1,
Oct 22 10:19:29 dev-sql-002 mysqld: #011protocols = 0/4/2 (gcs/repl/appl), Oct 22 10:19:29 dev-sql-002 mysqld: #011group UUID = 2896facc-2110-11e3-0800-717f6d7ec2af Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Flow-control interval: [23, 23] Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 968) Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: State transfer required: Oct 22 10:19:29 dev-sql-002 mysqld: #011Group state: 2896facc-2110-11e3-0800-717f6d7ec2af:968 Oct 22 10:19:29 dev-sql-002 mysqld: #011Local state: 00000000-0000-0000-0000-000000000000:-1 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Note] WSREP: New cluster view: global state: 2896facc-2110-11e3-0800-717f6d7ec2af:968, view# 6: Primary, number of nodes: 2, my index: 0, protocol version 2 Oct 22 10:19:29 dev-sql-002 mysqld: 131022 10:19:29 [Warning] WSREP: Gap in state sequence. Need state transfer. Oct 22 10:19:31 dev-sql-002 mysqld: 131022 10:19:31 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '192.168.3.221' --auth 'root:test' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '13950'' Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] Streaming with tar (20131022 10:19:31.981) Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] Note: Advanced xtrabackup features - encryption,compression etc. not available with tar. (20131022 10:19:31.988) Oct 22 10:19:31 dev-sql-002 mysqld: WSREP_SST: [INFO] However, xbstream requires manual cleanup of data directory before SST - lp:1193240. (20131022 10:19:31.994) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Prepared SST request: xtrabackup|192.168.3.221:4444/xtrabackup_sst Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Assign initial position for certification: 968, protocol version: 2 Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (2896facc-2110-11e3-0800-717f6d7ec2af): 1 (Operation not permitted) Oct 22 10:19:32 dev-sql-002 mysqld: #011 at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Node 0 (dev-sql-002) requested state transfer from '*any*'. Selected 1 (dev-sql-001)(SYNCED) as donor. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 968) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: Requesting state transfer: success, donor: 1 Oct 22 10:19:32 dev-sql-002 mysqld: tar: This does not look like a tar archive Oct 22 10:19:32 dev-sql-002 mysqld: tar: Exiting with failure status due to previous errors Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 2 (20131022 10:19:32.659) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Data directory /var/lib/mysql/ needs to be empty for SST: Manual intervention required in that case (20131022 10:19:32.666) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Cleanup after exit with status:32 (20131022 10:19:32.672) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Warning] WSREP: 1 (dev-sql-001): State transfer to 0 (dev-sql-002) failed: -1 (Operation not permitted) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: terminating thread Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: joining thread Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [Note] WSREP: gcomm: closing backend Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [ERROR] Killing nc pid (20131022 10:19:32.707) Oct 22 10:19:32 dev-sql-002 mysqld: WSREP_SST: [INFO] Removing the sst_in_progress file (20131022 10:19:32.725) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '192.168.3.221' --auth 'root:test' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '13950': 32 (Broken pipe) Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: Failed to read uuid:seqno from joiner script. Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] WSREP: SST failed: 32 (Broken pipe)
Oct 22 10:19:32 dev-sql-002 mysqld: 131022 10:19:32 [ERROR] Aborting
Oct 22 10:19:32 dev-sql-002 mysqld:
Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: view(view_id(NON_PRIM,abbffcb5-3af2-11e3-a043-5b38dfb689cf,6) memb { Oct 22 10:19:33 dev-sql-002 mysqld: #011abbffcb5-3af2-11e3-a043-5b38dfb689cf,
Oct 22 10:19:33 dev-sql-002 mysqld: } joined {
Oct 22 10:19:33 dev-sql-002 mysqld: } left {
Oct 22 10:19:33 dev-sql-002 mysqld: } partitioned {
Oct 22 10:19:33 dev-sql-002 mysqld: #011f743192f-3aee-11e3-a47b-97626a725dde,
Oct 22 10:19:33 dev-sql-002 mysqld: })
Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: view((empty)) Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: gcomm: closed Oct 22 10:19:33 dev-sql-002 mysqld: 131022 10:19:33 [Note] WSREP: /usr/sbin/mysqld: Terminated. Oct 22 10:19:33 dev-sql-002 mysqld_safe: mysqld from pid file /var/run/mysqld/mysqld.pid ended Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: #007/usr/bin/mysqladmin: connect to server at 'localhost' failed Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)' Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
Oct 22 10:19:55 dev-sql-002 /etc/init.d/mysql[14428]:

WSREP status on node1 :
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_%'
    -> ;
+----------------------------+--------------------------------------+
| Variable_name              | Value |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 2896facc-2110-11e3-0800-717f6d7ec2af |
| wsrep_protocol_version     | 4 |
| wsrep_last_committed       | 968 |
| wsrep_replicated           | 0 |
| wsrep_replicated_bytes     | 0 |
| wsrep_received             | 12 |
| wsrep_received_bytes       | 1126 |
| wsrep_local_commits        | 0 |
| wsrep_local_cert_failures  | 0 |
| wsrep_local_bf_aborts      | 0 |
| wsrep_local_replays        | 0 |
| wsrep_local_send_queue     | 0 |
| wsrep_local_send_queue_avg | 0.500000 |
| wsrep_local_recv_queue     | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused  | 0.000000 |
| wsrep_flow_control_sent    | 0 |
| wsrep_flow_control_recv    | 0 |
| wsrep_cert_deps_distance   | 0.000000 |
| wsrep_apply_oooe           | 0.000000 |
| wsrep_apply_oool           | 0.000000 |
| wsrep_apply_window         | 0.000000 |
| wsrep_commit_oooe          | 0.000000 |
| wsrep_commit_oool          | 0.000000 |
| wsrep_commit_window        | 0.000000 |
| wsrep_local_state          | 4 |
| wsrep_local_state_comment  | Synced |
| wsrep_cert_index_size      | 0 |
| wsrep_causal_reads         | 0 |
| wsrep_incoming_addresses   | 192.168.3.121:3306 |
| wsrep_cluster_conf_id      | 5 |
| wsrep_cluster_size         | 1 |
| wsrep_cluster_state_uuid   | 2896facc-2110-11e3-0800-717f6d7ec2af |
| wsrep_cluster_status       | Primary |
| wsrep_connected            | ON |
| wsrep_local_index          | 0 |
| wsrep_provider_name        | Galera |
| wsrep_provider_vendor      | Codership Oy <info@xxxxxxxxxxxxx>    |
| wsrep_provider_version     | 23.2.7-wheezy(r) |
| wsrep_ready                | ON |
+----------------------------+--------------------------------------+
40 rows in set (0.00 sec)

Thanks,



_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp




Follow ups

References