maria-discuss team mailing list archive
Message #03605
Re: MariaDB 10.1.14 with galera results in Signal 11
To: MariaDB discuss <maria-discuss@xxxxxxxxxxxxxxxxxxx>
From: Bill Mair <william.mair@xxxxxxxxxxx>
Date: Fri, 27 May 2016 18:09:17 +0100
In-reply-to: <CACAc7V=VGeRK0opF7jDV=pVd0oAZ1UmzBCcHjJL5vnoRQZcoiA@mail.gmail.com>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0

Hi Nirbhay,
I've built a debug build and run it (I switched to an RPi2 with 1GB RAM).
The output of gdb's "backtrace full" is:
https://gist.github.com/anonymous/8f1a5ee6967f4a1147aa120612760914
And here is the output of the command and associated trace:
https://gist.github.com/anonymous/1e41b823cfe76bbf35a4da8ef1d203e4
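
For anyone wanting to produce a similar trace, one way to get "backtrace full" output without an interactive session is gdb's batch mode. This is only a sketch: the helper name gdb_backtrace_cmd is made up for illustration, and it assumes gdb is installed and that mysqld is the debug build.

```shell
#!/bin/sh
# Hypothetical helper (not part of MariaDB or gdb): print a gdb command line
# that runs a program, waits for it to stop (e.g. on SIGSEGV), dumps
# "backtrace full", and then exits.
gdb_backtrace_cmd() {
    # "$*" is the program followed by its arguments.
    printf 'gdb -batch -ex run -ex "backtrace full" --args %s\n' "$*"
}

# Print the command for review; nothing is executed at this point.
gdb_backtrace_cmd mysqld --wsrep-new-cluster
```

The printed command could then be run as the mysql user, for example with `sudo -u mysql sh -c "$(gdb_backtrace_cmd mysqld --wsrep-new-cluster)" > backtrace.txt 2>&1`, where backtrace.txt is just an example file name.
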
MariaDB appears to work fine without the galera module.
What else can I try to fix this?
Regards,
Bill
On 27/05/16 05:32, Nirbhay Choubey wrote:
> Hi Bill,
>
> As discussed on #maria, the build machine/VM might have either less
> disk space or less memory than required.
>
> Best,
> Nirbhay
>
> On Thu, May 26, 2016 at 6:25 PM, Bill Mair <william.mair@xxxxxxxxxxx> wrote:
>
> Hi Nirbhay,
>
> I got 57% of the way through the build and it failed as follows:
>
> I cloned the source and set the branch:
>
> $ git clone https://github.com/mariadb/server mariadb
>
> $ cd mariadb
>
> $ git checkout 10.1
>
> $ cmake -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON -DCMAKE_BUILD_TYPE=Debug ./
>
> $ make
>
> ...
>
> [ 55%] Building CXX object sql/CMakeFiles/sql.dir/encryption.cc.o
> [ 55%] Building CXX object sql/CMakeFiles/sql.dir/sql_builtin.cc.o
> [ 55%] Building CXX object sql/CMakeFiles/sql.dir/sql_yacc.cc.o
> [ 55%] Building CXX object sql/CMakeFiles/sql.dir/threadpool_unix.cc.o
> [ 56%] Linking CXX static library libsql.a
> [ 57%] Built target sql
> Scanning dependencies of target explain_filename-t
> [ 57%] Building CXX object
> unittest/sql/CMakeFiles/explain_filename-t.dir/explain_filename-t.cc.o
> [ 57%] Linking CXX executable explain_filename-t
> collect2: fatal error: ld terminated with signal 9 [Killed]
> compilation terminated.
> unittest/sql/CMakeFiles/explain_filename-t.dir/build.make:120:
> recipe for target 'unittest/sql/explain_filename-t' failed
> make[2]: *** [unittest/sql/explain_filename-t] Error 1
> make[2]: *** Deleting file 'unittest/sql/explain_filename-t'
> CMakeFiles/Makefile2:1266: recipe for target
> 'unittest/sql/CMakeFiles/explain_filename-t.dir/all' failed
> make[1]: *** [unittest/sql/CMakeFiles/explain_filename-t.dir/all]
> Error 2
> Makefile:160: recipe for target 'all' failed
> make: *** [all] Error 2
>
> I have no idea if something is missing from the build environment
> or if there is some other error causing the signal 9 failure.
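
Signal 9 is SIGKILL, and a "[Killed]" ld on a small board is commonly the work of the kernel's out-of-memory killer rather than a broken build environment, although nothing in the output above confirms that. A rough way to check how much memory and swap is free before the link step is sketched below. The function names and the 1 GiB default threshold are assumptions for illustration, not a documented requirement of the build.

```shell
#!/bin/sh
# Hypothetical helper: sum MemAvailable and SwapFree (both in kB) from a
# meminfo-style file; defaults to /proc/meminfo on Linux.
mem_headroom_kb() {
    awk '/^(MemAvailable|SwapFree):/ { sum += $2 } END { printf "%d\n", sum }' \
        "${1:-/proc/meminfo}"
}

# Hypothetical check: warn when less than need_kb is free for the linker.
# The 1 GiB default is a guess; the real need of this link step is unknown.
check_link_memory() {
    need_kb="${2:-1048576}"
    have_kb=$(mem_headroom_kb "${1:-/proc/meminfo}")
    if [ "$have_kb" -lt "$need_kb" ]; then
        echo "low memory: ${have_kb} kB available, ${need_kb} kB wanted"
        return 1
    fi
    echo "ok: ${have_kb} kB available"
}

# After a failed build, the kernel log can confirm an OOM kill:
#   dmesg | grep -i 'out of memory'
```

Calling `check_link_memory` with no arguments reads /proc/meminfo. If the kernel log does show ld being killed, adding swap or building on a machine with more RAM would be the next thing to try.
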
>
> I ran it again and it failed again:
>
> [ 43%] Built target partition
> [ 44%] Built target gen_lex_token
> [ 44%] Built target GenDigestServerSource
> [ 57%] Built target sql
> [ 57%] Linking CXX executable explain_filename-t
> collect2: fatal error: ld terminated with signal 9 [Killed]
> compilation terminated.
> unittest/sql/CMakeFiles/explain_filename-t.dir/build.make:120:
> recipe for target 'unittest/sql/explain_filename-t' failed
> make[2]: *** [unittest/sql/explain_filename-t] Error 1
> make[2]: *** Deleting file 'unittest/sql/explain_filename-t'
> CMakeFiles/Makefile2:1266: recipe for target
> 'unittest/sql/CMakeFiles/explain_filename-t.dir/all' failed
> make[1]: *** [unittest/sql/CMakeFiles/explain_filename-t.dir/all]
> Error 2
> Makefile:160: recipe for target 'all' failed
> make: *** [all] Error 2
>
> The command that appears to fail is this one:
>
> 10782 pts/1 R+ 0:29 /usr/bin/ld -plugin
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/liblto_plugin.so
> -plugin-opt=/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/lto-wrapper
> -plugin-opt=-fresolution=/tmp/ccSjYPQi.res
> -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc
> -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s
> -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr
> --hash-style=gnu -dynamic-linker /lib/ld-linux-armhf.so.3 -X -m
> armelf_linux_eabi -pie -o explain_filename-t
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../Scrt1.o
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../crti.o
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/crtbeginS.o
> -L/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1
> -L/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../.. -z
> relro -z now
> CMakeFiles/explain_filename-t.dir/explain_filename-t.cc.o
> -lpthread ../../sql/libsql.a ../mytap/libmytap.a
> ../../storage/perfschema/libperfschema.a
> ../../storage/maria/libaria.a ../../storage/csv/libcsv.a
> ../../storage/sequence/libsequence.a
> ../../storage/xtradb/libxtradb.a -llz4 -llzo2 -llzma -lbz2 -laio
> ../../storage/myisammrg/libmyisammrg.a
> ../../storage/myisam/libmyisam.a ../../storage/heap/libheap.a
> ../../plugin/feedback/libfeedback.a
> ../../plugin/userstat/libuserstat.a ../../sql/libpartition.a
> ../../mysys/libmysys.a ../../mysys_ssl/libmysys_ssl.a
> ../../dbug/libdbug.a ../../mysys/libmysys.a
> ../../mysys_ssl/libmysys_ssl.a ../../dbug/libdbug.a -lz
> ../../strings/libstrings.a ../../vio/libvio.a -lpcre -ljemalloc
> -lcrypt -lssl -lcrypto -ldl ../../wsrep/libwsrep.a -lsystemd
> -lpthread -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/crtendS.o
> /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../crtn.o
>
> So as it stands I can't get a "debug build" built.
>
> I have a further question: how do I tell scons that I want to
> build a debug version of the galera shared library? Or is that not
> required, or would a debug galera build on its own be enough?
>
> Thanks,
>
> Bill
>
> On 26/05/16 02:40, Nirbhay Choubey wrote:
>> Hi Bill,
>>
>> The stack trace isn't useful. Did you build the server too?
>> Could you try repeating this on a debug build?
>> I would also advise you to file a bug at jira.mariadb.org.
>>
>> Best,
>> Nirbhay
>>
>> On Wed, May 25, 2016 at 6:29 PM, Bill Mair
>> <william.mair@xxxxxxxxxxx> wrote:
>>
>> A new install and instance crashes when I try to start it
>> with galera; I don't know what I'm doing wrong.
>>
>> I built galera from source according to the instructions
>> here: https://mariadb.com/kb/en/mariadb/installating-galera-from-source/
>>
>> I had to use "scons strict_build_flags=0" (warnings were
>> being flagged up as errors).
>>
>> # uname -a
>> Linux alarm 4.6.0-3-ARCH #1 Fri May 20 20:21:01 MDT 2016
>> armv7l GNU/Linux
>>
>> # grep wsrep /etc/mysql/my.cnf
>>
>> wsrep_on=ON
>> wsrep_provider=/usr/lib/galera/libgalera_smm.so
>> wsrep_cluster_address=gcomm://192.168.1.91,192.168.1.92,192.168.1.93
>> wsrep_node_address='192.168.1.91'
>> wsrep_node_name='node1'
>> wsrep_cluster_name='mariadb_cluster'
>> wsrep_sst_method=rsync
>> wsrep_provider_options=pc.bootstrap=true
>>
>>
>> # sudo -u mysql mysqld --wsrep-new-cluster
>> 2016-05-25 22:10:31 3061874688 [Note] mysqld (mysqld
>> 10.1.14-MariaDB) starting as process 8527 ...
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Read nil XID
>> from storage engines, skipping position init
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_load():
>> loading provider library '/usr/lib/galera/libgalera_smm.so'
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_load():
>> Galera 3.16(rXXXX) by Codership Oy <info@xxxxxxxxxxxxx> loaded successfully.
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: CRC-32C: using
>> "slicing-by-8" algorithm.
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Found saved
>> state: 00000000-0000-0000-0000-000000000000:-1
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Passing config
>> to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.1.91;
>> base_port = 4567; cert.log_conflicts = no; debug = no;
>> evs.auto_evict = 0; evs.delay_margin = PT1S;
>> evs.delayed_keep_period = PT30S; evs.inactive_check_period =
>> PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period
>> = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4;
>> evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S;
>> evs.user_send_window = 2; evs.view_forget_timeout = PT24H;
>> gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0;
>> gcache.mem_size = 0; gcache.name =
>> /var/lib/mysql//galera.cache; gcache.page_size = 128M;
>> gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0;
>> gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave =
>> no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25;
>> gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit =
>> 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version
>> = 0; pc.announce_timeout = PT3S; pc.bootstrap = true;
>> pc.checksum = false; pc.ignore_quo
>> 2016-05-25 22:10:31 2908746768 [Note] WSREP: Service thread
>> queue flushed.
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Assign initial
>> position for certification: -1, protocol version: -1
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_sst_grab()
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Start replication
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP:
>> 'wsrep-new-cluster' option used, bootstrapping the cluster
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Setting initial
>> position to 00000000-0000-0000-0000-000000000000:-1
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: protonet asio
>> version 0
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Using CRC-32C
>> for message checksums.
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: backend: asio
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm thread
>> scheduling priority set to other:0
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: restore pc from
>> disk successfully
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: GMCast version 0
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: (8708eb25,
>> 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: (8708eb25,
>> 'tcp://0.0.0.0:4567') multicast: , ttl: 1
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: EVS version 0
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm:
>> bootstrapping new group 'mariadb_cluster'
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: start_prim is
>> enabled, turn off pc_recovery
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Node 8708eb25
>> state prim
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP:
>> view(view_id(PRIM,8708eb25,17) memb {
>> 8708eb25,0
>> } joined {
>> } left {
>> } partitioned {
>> })
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: save pc into disk
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr without UUID: tcp://192.168.1.91:4567
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr proto entry 0xb652e2a0
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr without UUID: tcp://192.168.1.92:4567
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr proto entry 0xb652e380
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr without UUID: tcp://192.168.1.93:4567
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>> pending addr proto entry 0xb652e460
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: clear restored view
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm: connected
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Changing maximum
>> packet size to 64500, resulting msg size: 32636
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Shifting CLOSED
>> -> OPEN (TO: 0)
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Opened channel
>> 'mariadb_cluster'
>> 2016-05-25 22:10:31 3061874688 [Note] WSREP: Waiting for SST
>> to complete.
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: New COMPONENT:
>> primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Starting new
>> group from scratch: 1d570213-22bd-11e6-a70b-6aae917d4541
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE_EXCHANGE:
>> sent state UUID: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE EXCHANGE:
>> sent state msg: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE EXCHANGE:
>> got state msg: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c from 0
>> (node1)
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Quorum results:
>> version = 4,
>> component = PRIMARY,
>> conf_id = 0,
>> members = 1/1 (joined/total),
>> act_id = 0,
>> last_appl. = -1,
>> protocols = 0/7/3 (gcs/repl/appl),
>> group UUID = 1d570213-22bd-11e6-a70b-6aae917d4541
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Flow-control
>> interval: [16, 16]
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Restored state
>> OPEN -> JOINED (0)
>> 160525 22:10:31 [ERROR] mysqld got signal 11 ;
>> This could be because you hit a bug. It is also possible that
>> this binary
>> or one of the libraries it was linked against is corrupt,
>> improperly built,
>> or misconfigured. This error can also be caused by
>> malfunctioning hardware.
>>
>> To report this bug, see https://mariadb.com/kb/en/reporting-bugs
>>
>> We will try our best to scrape up some info that will
>> hopefully help
>> diagnose the problem, but since we have already crashed,
>> something is definitely wrong and this may fail.
>>
>> Server version: 10.1.14-MariaDB
>> key_buffer_size=0
>> read_buffer_size=262144
>> max_used_connections=0
>> max_threads=153
>> thread_count=2
>> It is possible that mysqld could use up to
>> key_buffer_size + (read_buffer_size +
>> sort_buffer_size)*max_threads = 119520 K bytes of memory
>> Hope that's ok; if not, decrease some variables in the equation.
>>
>> Thread pointer: 0x0xb65d7008
>> Attempting backtrace. You can use the following information
>> to find out
>> where mysqld died. If you see no messages after this,
>> something went
>> terribly wrong...
>> stack_bottom = 0xad90edc4 thread_stack 0x48400
>>
>> Trying to get some variables.
>> Some pointers may be invalid and cause the dump to abort.
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Member 0.0
>> (node1) synced with group.
>> 2016-05-25 22:10:31 2860512272 [Note] WSREP: Shifting JOINED
>> -> SYNCED (TO: 0)
>> Query (0x0):
>> Connection ID (thread ID): 1
>> Status: NOT_KILLED
>>
>> Optimizer switch:
>> index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
>>
>>
>> The manual page at
>> http://dev.mysql.com/doc/mysql/en/crashing.html contains
>> information that should help you find out what is causing the
>> crash.
>>
>>