maria-discuss team mailing list archive

Re: Re: Fwd: failed to connect mariadb at addition node in MariaDB galera cluster

 

Hi,
  Thanks for the response.
  No, I started mysqld on the second node with the command "mysqld --defaults-extra-file=mdb.my.cnf --debug".
  The problem is permanent.
  No abnormal output appears on the console. I suspect MariaDB is stuck somewhere. From the trace log (attached, mdb.mysqld.trace), it looks stuck in wsrep_replication_process. It may be waiting for some message from the primary node, but I have no idea what it is waiting for.
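
To see whether the first node ever registered the second one, the Galera status variables on the first node can be checked. A minimal sketch, assuming the mysql client can reach the first node on port 3307 (the port mentioned in this thread); the host, user, and the helper name wsrep_status_query are assumptions for illustration only:

```shell
#!/bin/sh
# Sketch: print a query that reports the cluster size and local node state.
# wsrep_cluster_size, wsrep_local_state_comment and wsrep_ready are standard
# Galera status variables; the helper name is hypothetical.
wsrep_status_query() {
    printf '%s\n' "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_local_state_comment','wsrep_ready');"
}

# Usage (host and user are assumptions, adjust to the real setup):
#   wsrep_status_query | mysql -h 127.0.0.1 -P 3307 -u root -p
```

If wsrep_cluster_size stays at 1 on the first node while the second node is starting, the joiner has probably not become a member of the group yet.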

--------------------------------


----- Original Message -----
From: Nirbhay Choubey <nirbhay@xxxxxxxxxxx>
To: zh1029@xxxxxxxx
Cc: maria-discuss <maria-discuss@xxxxxxxxxxxxxxxxxxx>,  "yan-jack.chen" <yan-jack.chen@xxxxxxxxx>
Subject: Re: [Maria-discuss] Fwd: failed to connect mariadb at addition node in MariaDB galera cluster
Date: 2016-09-02 23:26

Hi,
On Thu, Sep 1, 2016 at 11:08 PM, 西门吹牛 <zh1029@xxxxxxxx> wrote:
Hi,

  I deployed MariaDB Galera on two nodes to build a cluster, but I can't connect to MariaDB on the second node. It seems the port was never opened by MariaDB, which looks like it is stuck somewhere.

  Versions: mysqld 10.1.17-MariaDB-debug with galera-3-25.3.17



I started MariaDB on the first node and it seems fine: port 3307 was opened and I can log in to MariaDB with the mysql client.

You shouldn't use --wsrep-new-cluster to start the 2nd node (in case you are).

Regarding the hang:
Does it happen all the time? Is it repeatable?
Is that all you see in the error log? Nothing after the partial last line?
Would it be possible to attach a debugger to mysqld to check where exactly it hangs?

Best,
Nirbhay
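
One way to attach a debugger, as a minimal sketch: gdb can attach to the running mysqld and print a backtrace of every thread, which shows where each thread is blocked. This assumes gdb is installed on the node and the caller is allowed to trace the process (typically root); the helper name bt_command and the output file name are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: build the gdb command that dumps backtraces of all threads of a
# running process. Kept as a helper that only prints the command, so it can
# be reviewed before it is run. The helper name is hypothetical.
bt_command() {
    # $1 = PID of the hung mysqld
    printf 'gdb -p %s -batch -ex "thread apply all bt"\n' "$1"
}

# Usage (run on the second node while mysqld is hung):
#   pid=$(pidof mysqld)
#   eval "$(bt_command "$pid")" > mysqld.backtrace.txt 2>&1
```

With a debug build such as the one in this thread, the backtraces should include function names and line numbers, so the frame where the startup thread waits can be read directly from the output.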











[root@MMN-0(RCP-69) /root/test]



# /home/_rcpadmin/bin/mariadb/bin/mysqld --defaults-extra-file=./mmn.my.cnf --wsrep-new-cluster --debug



2016-09-01 16:41:37 140716544316288 [Note] /home/_rcpadmin/bin/mariadb/bin/mysqld (mysqld 10.1.17-MariaDB-debug) starting as process 15248 ...



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Setting wsrep_ready to 0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Read nil XID from storage engines, skipping position init



2016-09-01 16:41:38 140716544316288 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'



2016-09-01 16:41:38 140716544316288 [Note] WSREP: wsrep_load(): Galera 3.17(r0) by Codership Oy <info@xxxxxxxxxxxxx> loaded successfully.



2016-09-01 16:41:38 140716544316288 [Note] WSREP: CRC-32C: using hardware acceleration.



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Found saved state: 900987cc-7003-11e6-b25f-de0a52317f1d:0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Passing config to GCS: base_dir = /mariadb/; base_host = MMN-0; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /mariadb/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /mariadb//galera.cache; gcache.page_size = 300M; gcache.size = 300M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc



2016-09-01 16:41:38 140716029634304 [Note] WSREP: Service thread queue flushed.



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1



2016-09-01 16:41:38 140716544316288 [Note] WSREP: wsrep_sst_grab()



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Start replication



2016-09-01 16:41:38 140716544316288 [Note] WSREP: 'wsrep-new-cluster' option used, bootstrapping the cluster



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Setting initial position to 900987cc-7003-11e6-b25f-de0a52317f1d:0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: protonet asio version 0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Using CRC-32C for message checksums.



2016-09-01 16:41:38 140716544316288 [Note] WSREP: backend: asio



2016-09-01 16:41:38 140716544316288 [Note] WSREP: gcomm thread scheduling priority set to other:0



2016-09-01 16:41:38 140716544316288 [Warning] WSREP: access file(/mariadb//gvwstate.dat) failed(No such file or directory)



2016-09-01 16:41:38 140716544316288 [Note] WSREP: restore pc from disk failed



2016-09-01 16:41:38 140716544316288 [Note] WSREP: GMCast version 0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: (e5a32d3c, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567



2016-09-01 16:41:38 140716544316288 [Note] WSREP: (e5a32d3c, 'tcp://0.0.0.0:4567') multicast: , ttl: 1



2016-09-01 16:41:38 140716544316288 [Note] WSREP: EVS version 0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: gcomm: bootstrapping new group 'example_cluster'



2016-09-01 16:41:38 140716544316288 [Note] WSREP: start_prim is enabled, turn off pc_recovery



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Node e5a32d3c state prim



2016-09-01 16:41:38 140716544316288 [Note] WSREP: view(view_id(PRIM,e5a32d3c,1) memb {
        e5a32d3c,0
} joined {
} left {
} partitioned {
})



2016-09-01 16:41:38 140716544316288 [Note] WSREP: save pc into disk



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr without UUID: tcp://169.254.0.4:4567



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr proto entry 0x5652b0352ef0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr without UUID: tcp://169.254.0.5:4567



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr proto entry 0x5652b035b720



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr without UUID: tcp://169.254.0.6:4567



2016-09-01 16:41:38 140716544316288 [Note] WSREP: discarding pending addr proto entry 0x5652b0363ea0



2016-09-01 16:41:38 140716544316288 [Note] WSREP: gcomm: connected



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Opened channel 'example_cluster'



2016-09-01 16:41:38 140715987670784 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1



2016-09-01 16:41:38 140716544316288 [Note] WSREP: Waiting for SST to complete.



2016-09-01 16:41:38 140715987670784 [Note] WSREP: STATE_EXCHANGE: sent state UUID: e5a3bc45-701f-11e6-ba1c-471590fea490



2016-09-01 16:41:38 140715987670784 [Note] WSREP: STATE EXCHANGE: sent state msg: e5a3bc45-701f-11e6-ba1c-471590fea490



2016-09-01 16:41:38 140715987670784 [Note] WSREP: STATE EXCHANGE: got state msg: e5a3bc45-701f-11e6-ba1c-471590fea490 from 0 (MMN-0)



2016-09-01 16:41:38 140715987670784 [Note] WSREP: Quorum results:
        version    = 4,
        component  = PRIMARY,
        conf_id    = 0,
        members    = 1/1 (joined/total),
        act_id     = 0,
        last_appl. = -1,
        protocols  = 0/7/3 (gcs/repl/appl),
        group UUID = 900987cc-7003-11e6-b25f-de0a52317f1d



2016-09-01 16:41:38 140715987670784 [Note] WSREP: Flow-control interval: [16, 16]



2016-09-01 16:41:38 140715987670784 [Note] WSREP: Restored state OPEN -> JOINED (0)



2016-09-01 16:41:38 140715987670784 [Note] WSREP: Member 0.0 (MMN-0) synced with group.



2016-09-01 16:41:38 140716542806784 [Note] WSREP: New cluster view: global state: 900987cc-7003-11e6-b25f-de0a52317f1d:0, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3



2016-09-01 16:41:38 140715987670784 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)



2016-09-01 16:41:38 140716544316288 [Note] WSREP: SST complete, seqno: 0



2016-09-01 16:41:38 140716544316288 [Note] InnoDB: Using mutexes to ref count buffer pool pages



2016-09-01 16:41:38 140716544316288 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!



2016-09-01 16:41:38 140716544316288 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_SYNC_DEBUG switched on !!!!!!!!!



2016-09-01 16:41:38 140716544316288 [Note] InnoDB: The InnoDB memory heap is disabled



2016-09-01 16:41:38 140716
_______________________________________________

Mailing list: https://launchpad.net/~maria-discuss

Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx

Unsubscribe : https://launchpad.net/~maria-discuss

More help   : https://help.launchpad.net/ListHelp





Attachment: mdb.mysqld.trace
Description: Binary data

