maria-discuss team mailing list archive

Re: MariaDB Cluster won't start

 

Hi!

On Mon, Sep 29, 2014 at 4:40 PM, Carlos Raúl Laguna <carlosla1987@xxxxxxxxx>
wrote:

> The log is actually just the output of executing mysqld; I just focused on
> this part:
>
> address 'tcp://172.17.1.2:4567' points to own listening
>  address, blacklisting,
>
> However, I did manage to start one node. Now the problem seems to be with
> the other two. Here is what it shows when I try to add the second node to the
> cluster:
>
> 140929 16:24:41 [Note] WSREP: Service thread queue flushed.
> 140929 16:24:41 [Note] WSREP: Assign initial position for certification:
> -1, protocol version: -1
> 140929 16:24:41 [Note] WSREP: wsrep_sst_grab()
> 140929 16:24:41 [Note] WSREP: Start replication
> 140929 16:24:41 [Note] WSREP: Setting initial position to
> 00000000-0000-0000-0000-
> 000000000000:-1
> 140929 16:24:41 [Note] WSREP: protonet asio version 0
> 140929 16:24:41 [Note] WSREP: Using CRC-32C (optimized) for message
> checksums.
> 140929 16:24:41 [Note] WSREP: backend: asio
> 140929 16:24:41 [Note] WSREP: GMCast version 0
> 140929 16:24:41 [Note] WSREP: (a47be000-4816-11e4-b0fd-43eda06d6eff,
> 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
> 140929 16:24:41 [Note] WSREP: (a47be000-4816-11e4-b0fd-43eda06d6eff,
> 'tcp://0.0.0.0:4567') multicast: , ttl: 1
> 140929 16:24:41 [Note] WSREP: EVS version 0
> 140929 16:24:41 [Note] WSREP: PC version 0
> 140929 16:24:41 [Note] WSREP: gcomm: connecting to group 'baruwa_cluster',
> peer 'baruwadb1:'
> 140929 16:24:41 [Note] WSREP: (a47be000-4816-11e4-b0fd-43eda06d6eff,
> 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
> tcp://172.17.1.4:4567
> 140929 16:24:41 [Note] WSREP: declaring
> 498ac792-480e-11e4-994f-533085d0ff8b stable
> 140929 16:24:41 [Note] WSREP: declaring
> ef83de45-4807-11e4-8802-0203dc2e072c stable
> 140929 16:24:41 [Note] WSREP: Node 498ac792-480e-11e4-994f-533085d0ff8b
> state prim
> 140929 16:24:41 [Note] WSREP: (a47be000-4816-11e4-b0fd-43eda06d6eff,
> 'tcp://0.0.0.0:4567') turning message relay requesting off
> 140929 16:24:41 [Note] WSREP:
> view(view_id(PRIM,498ac792-480e-11e4-994f-533085d0ff8b,3) memb {
>     498ac792-480e-11e4-994f-533085d0ff8b,0
>     a47be000-4816-11e4-b0fd-43eda06d6eff,0
>     ef83de45-4807-11e4-8802-0203dc2e072c,0
> } joined {
> } left {
> } partitioned {
> })
> 140929 16:24:41 [Note] WSREP: gcomm: connected
> 140929 16:24:41 [Note] WSREP: Changing maximum packet size to 64500,
> resulting msg size: 32636
> 140929 16:24:41 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
> 140929 16:24:41 [Note] WSREP: Opened channel 'baruwa_cluster'
> 140929 16:24:41 [Note] WSREP: Waiting for SST to complete.
> 140929 16:24:41 [Note] WSREP: New COMPONENT: primary = yes, bootstrap =
> no, my_idx = 1, memb_num = 3
> 140929 16:24:41 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
> 140929 16:24:41 [Note] WSREP: STATE EXCHANGE: sent state msg:
> a49fe848-4816-11e4-a8c7-b70f254d3249
> 140929 16:24:41 [Note] WSREP: STATE EXCHANGE: got state msg:
> a49fe848-4816-11e4-a8c7-b70f254d3249 from 0 (baruwadb2)
> 140929 16:24:41 [Note] WSREP: STATE EXCHANGE: got state msg:
> a49fe848-4816-11e4-a8c7-b70f254d3249 from 2 (baruwadb1)
> 140929 16:24:41 [Note] WSREP: STATE EXCHANGE: got state msg:
> a49fe848-4816-11e4-a8c7-b70f254d3249 from 1 (baruwadb3)
> 140929 16:24:41 [Note] WSREP: Quorum results:
>     version    = 3,
>     component  = PRIMARY,
>     conf_id    = 2,
>     members    = 2/3 (joined/total),
>     act_id     = 1,
>     last_appl. = -1,
>     protocols  = 0/5/3 (gcs/repl/appl),
>     group UUID = 0c63fe44-47fe-11e4-8d91-d3636a34c2fc
> 140929 16:24:41 [Note] WSREP: Flow-control interval: [28, 28]
> 140929 16:24:41 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 1)
> 140929 16:24:41 [Note] WSREP: State transfer required:
>     Group state: 0c63fe44-47fe-11e4-8d91-d3636a34c2fc:1
>     Local state: 00000000-0000-0000-0000-000000000000:-1
> 140929 16:24:41 [Note] WSREP: New cluster view: global state:
> 0c63fe44-47fe-11e4-8d91-d3636a34c2fc:1, view# 3: Primary, number of nodes:
> 3, my index: 1, protocol version 3
> 140929 16:24:41 [Warning] WSREP: Gap in state sequence. Need state
> transfer.
> 140929 16:24:43 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner'
> --address '172.17.1.12' --auth '' --datadir '/var/lib/mysql/'
> --defaults-file '/etc/mysql/my.cnf' --parent '22299'  '' '
> 140929 16:24:44 [Note] WSREP: Prepared SST request: rsync|
> 172.17.1.12:4444/rsync_sst
> 140929 16:24:44 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
> notification.
> 140929 16:24:44 [Note] WSREP: REPL Protocols: 5 (3, 1)
> 140929 16:24:44 [Note] WSREP: Service thread queue flushed.
> 140929 16:24:44 [Note] WSREP: Assign initial position for certification:
> 1, protocol version: 3
> 140929 16:24:44 [Note] WSREP: Service thread queue flushed.
> 140929 16:24:44 [Warning]* WSREP: Failed to prepare for incremental state
> transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not
> match group state UUID (0c63fe44-47fe-11e4-8d91-d3636a34c2fc): 1 (Operation
> not permitted)*
>      at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be
> unavailable.
> 140929 16:24:44 [Note] WSREP: Member 1.0 (baruwadb3) requested state
> transfer from '*any*'. Selected 0.0 (baruwadb2)(SYNCED) as donor.
> 140929 16:24:44 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 1)
> 140929 16:24:44 [Note] WSREP: Requesting state transfer: success, donor: 0
>

There should be more after this. Besides, you should also inspect donor
logs for any issues or errors.
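
As a rough illustration of that kind of log inspection, here is a minimal
Python sketch that pulls the [Warning] and [ERROR] entries and the WSREP state
shifts out of an error log. The function name scan_wsrep_log and the regular
expressions are assumptions made for illustration, based only on the log
format quoted in this thread; they are not part of MariaDB or Galera. It
expects unwrapped log lines as they appear in the log file, not the
mail-wrapped copies above.

```python
import re

# One log entry, e.g.
#   140929 16:24:41 [Warning] WSREP: Gap in state sequence. Need state transfer.
# The timestamp is optional because some pasted lines have it stripped.
ENTRY = re.compile(
    r"^(?:(\d{6}\s+\d{1,2}:\d{2}:\d{2})\s+)?\[(Note|Warning|ERROR)\]\s+(.*)$"
)

# State transitions, e.g. "WSREP: Shifting PRIMARY -> JOINER (TO: 1)".
SHIFT = re.compile(r"WSREP: Shifting ([\w/]+) -> ([\w/]+)")


def scan_wsrep_log(lines):
    """Return (problems, shifts) found in an iterable of log lines.

    problems: list of (timestamp or None, level, message) for warnings/errors
    shifts:   list of (from_state, to_state) in the order they were logged
    """
    problems = []
    shifts = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # not a recognizable log entry (e.g. a continuation line)
        stamp, level, message = match.groups()
        if level in ("Warning", "ERROR"):
            problems.append((stamp, level, message))
        shift = SHIFT.search(message)
        if shift:
            shifts.append((shift.group(1), shift.group(2)))
    return problems, shifts
```

Running it over both the joiner and the donor log (for example with
scan_wsrep_log(open(path)), where path is wherever your error log lives) shows
the last state each node reached and any warnings or errors logged along the
way.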

Best,
Nirbhay

> Any ideas?
>
>
> 2014-09-29 15:55 GMT-04:00 Nirbhay Choubey <nirbhay@xxxxxxxxxxx>:
>
>> Hi,
>>
>> On Mon, Sep 29, 2014 at 1:11 PM, Carlos Raúl Laguna <
>> carlosla1987@xxxxxxxxx> wrote:
>>
>>> I had a 3-node galera-mariadb10 cluster on Ubuntu 14.04 working without
>>> issues until today, when all nodes stopped. All I can gather from mysqld is this:
>>>
>>
>> The provided log seems to be incomplete. I do not see any error that would
>> have caused the node to stop or abort.
>>
>>
>>>  [Warning] option 'table_cache': unsigned value 2097152 adjusted to
>>> 524288
>>> [Note] WSREP: Read nil XID from storage engines, skipping position init
>>>  [Note] WSREP: wsrep_load(): loading provider library
>>> '/usr/lib/galera/libgalera_smm.so'
>>>  [Note] WSREP: wsrep_load(): Galera 25.3.5-wheezy(rXXXX) by Codership Oy
>>> <info@xxxxxxxxxxxxx> loaded successfully.
>>>  [Note] WSREP: CRC-32C: using hardware acceleration.
>>>  [Note] WSREP: Found saved state: 193c65da-169f-11e4-a58c-ba3055701076:-1
>>>  [Note] WSREP: Passing config to GCS: base_host = 172.17.1.2; base_port
>>> = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period =
>>> PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S;
>>> evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period
>>> = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2;
>>> evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/;
>>> gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name =
>>> /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M;
>>> gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16;
>>> gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle =
>>> 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit =
>>> 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0;
>>> pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false;
>>> pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true;
>>> pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.
>>> Note] WSREP: Service thread queue flushed.
>>>  [Note] WSREP: Assign initial position for certification: -1, protocol
>>> version: -1
>>>  [Note] WSREP: wsrep_sst_grab()
>>>  [Note] WSREP: Start replication
>>>  [Note] WSREP: Setting initial position to
>>> 00000000-0000-0000-0000-000000000000:-1
>>>  [Note] WSREP: protonet asio version 0
>>>  [Note] WSREP: Using CRC-32C (optimized) for message checksums.
>>> [Note] WSREP: backend: asio
>>>  [Note] WSREP: GMCast version 0
>>>  [Note] WSREP: (d35fbdc5-47fa-11e4-88ce-ebeacde19e39, 'tcp://
>>> 0.0.0.0:4567') listening at tcp://0.0.0.0:4567
>>> 140929 13:05:34 [Note] WSREP: (d35fbdc5-47fa-11e4-88ce-ebeacde19e39,
>>> 'tcp://0.0.0.0:4567') multicast: , ttl: 1
>>>  [Note] WSREP: EVS version 0
>>> [Note] WSREP: PC version 0
>>>  [Note] WSREP: gcomm: connecting to group 'baruwa_cluster', peer
>>> 'baruwadb1:,baruwadb2:,baruwadb3:'
>>>  [Warning] WSREP: (d35fbdc5-47fa-11e4-88ce-ebeacde19e39, 'tcp://
>>> 0.0.0.0:4567') address 'tcp://172.17.1.2:4567' points to own listening
>>> address, blacklisting
>>>  [Warning] WSREP: no nodes coming from prim view, prim not possible
>>>  [Note] WSREP:
>>> view(view_id(NON_PRIM,d35fbdc5-47fa-11e4-88ce-ebeacde19e39,1) memb {
>>>         d35fbdc5-47fa-11e4-88ce-ebeacde19e39,0
>>> } joined {
>>> } left {
>>> } partitioned {
>>> })
>>>  [Note] WSREP: gcomm: connected
>>> 7 [Note] WSREP: Changing maximum packet size to 64500, resulting msg
>>> size: 32636
>>> [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
>>>  [Note] WSREP: Opened channel 'baruwa_cluster'
>>>  [Note] WSREP: Waiting for SST to complete.
>>> [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0,
>>> memb_num = 1
>>>  [Note] WSREP: Received NON-PRIMARY.
>>> [Note] WSREP: New cluster view: global state: :-1, view# -1:
>>> non-Primary, number of nodes: 1, my index: 0, protocol version -1
>>> 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
>>>  [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.5018S),
>>> skipping check
>>>
>>> my.cnf:
>>> #galera settings
>>> wsrep_provider=/usr/lib/galera/libgalera_smm.so
>>>
>>> wsrep_cluster_address="gcomm://baruwadb1,baruwadb2,baruwadb3?pc.wait_prim=no"
>>> wsrep_sst_method=rsync
>>> wsrep_cluster_name="baruwa_cluster"
>>>
>>
>> Perhaps you need to bootstrap the cluster.
>>
>> Best,
>> Nirbhay
>>
>>
>>>
>>> Any help will be appreciated. Thanks and regards
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~maria-discuss
>>> Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~maria-discuss
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>
