maria-discuss team mailing list archive

Re: MariaDB 10.1.14 with galera results in Signal 11

Hi Nirbhay,

I've built and run it with a debug build (switched to an RPi2 with 1GB RAM).
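
The earlier link failure quoted below ("ld terminated with signal 9") is consistent with the linker being killed for lack of memory, since signal 9 is what the kernel's OOM killer sends. A minimal sketch of how one might check for this, assuming the build output was saved to a file (build.log is a hypothetical name, not something from the original build):

```shell
#!/bin/sh
# Sketch: decide whether a saved build log shows the linker being killed by
# signal 9 (SIGKILL), the signal sent by the kernel OOM killer.
# "build.log" is a hypothetical file name, e.g. from: make 2>&1 | tee build.log

link_killed() {
    # Succeeds if the log contains the collect2 message for a killed ld.
    grep -q 'ld terminated with signal 9' "$1"
}

if [ -f build.log ] && link_killed build.log; then
    echo "ld was killed; check memory and swap:"
    free -m                                   # RAM and swap totals in MiB
    dmesg | grep -i 'out of memory' || true   # may need root on some systems
fi
```

If the kernel log does show an out-of-memory kill, building on a machine with more RAM (as done here) or adding swap would be the usual things to try.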

The output of gdb's "backtrace full" is:
https://gist.github.com/anonymous/8f1a5ee6967f4a1147aa120612760914

And here is the output of the command and associated trace:
https://gist.github.com/anonymous/1e41b823cfe76bbf35a4da8ef1d203e4
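
For anyone wanting to reproduce a trace like this, the following is a minimal sketch of a non-interactive gdb run. The helper name gdb_backtrace_cmd is hypothetical; --batch, -ex and --args are standard gdb options, and the mysqld invocation is the one used in this thread. It only prints the command so it can be checked before running it:

```shell
#!/bin/sh
# Sketch: build a gdb command line that runs a program and prints a full
# backtrace once it stops on a signal such as SIGSEGV (signal 11).

gdb_backtrace_cmd() {
    # Prints the command instead of running it, so it can be inspected first.
    printf 'gdb --batch -ex run -ex "backtrace full" --args'
    printf ' %s' "$@"
    printf '\n'
}

gdb_backtrace_cmd mysqld --wsrep-new-cluster
```

The printed command would presumably need to be run as the mysql user (for example via sudo -u mysql, as in the original start command) against a binary built with debug symbols.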

MariaDB appears to work fine without the galera module.

What else can I try to fix this?

Regards,

    Bill


On 27/05/16 05:32, Nirbhay Choubey wrote:
> Hi Bill,
>
> As discussed on #maria, the build machine/VM might have either less
> disk space or less memory than required.
>
> Best,
> Nirbhay
>
> On Thu, May 26, 2016 at 6:25 PM, Bill Mair <william.mair@xxxxxxxxxxx> wrote:
>
>     Hi Nirbhay,
>
>     I got 57% of the way through the build and it failed as follows:
>
>     I cloned the source and set the branch:
>
>     $ git clone https://github.com/mariadb/server mariadb
>
>     $ git checkout 10.1
>
>     $ cmake -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON
>     -DCMAKE_BUILD_TYPE=Debug ./
>
>     $ make
>
>     ...
>
>     [ 55%] Building CXX object sql/CMakeFiles/sql.dir/encryption.cc.o
>     [ 55%] Building CXX object sql/CMakeFiles/sql.dir/sql_builtin.cc.o
>     [ 55%] Building CXX object sql/CMakeFiles/sql.dir/sql_yacc.cc.o
>     [ 55%] Building CXX object sql/CMakeFiles/sql.dir/threadpool_unix.cc.o
>     [ 56%] Linking CXX static library libsql.a
>     [ 57%] Built target sql
>     Scanning dependencies of target explain_filename-t
>     [ 57%] Building CXX object
>     unittest/sql/CMakeFiles/explain_filename-t.dir/explain_filename-t.cc.o
>     [ 57%] Linking CXX executable explain_filename-t
>     collect2: fatal error: ld terminated with signal 9 [Killed]
>     compilation terminated.
>     unittest/sql/CMakeFiles/explain_filename-t.dir/build.make:120:
>     recipe for target 'unittest/sql/explain_filename-t' failed
>     make[2]: *** [unittest/sql/explain_filename-t] Error 1
>     make[2]: *** Deleting file 'unittest/sql/explain_filename-t'
>     CMakeFiles/Makefile2:1266: recipe for target
>     'unittest/sql/CMakeFiles/explain_filename-t.dir/all' failed
>     make[1]: *** [unittest/sql/CMakeFiles/explain_filename-t.dir/all]
>     Error 2
>     Makefile:160: recipe for target 'all' failed
>     make: *** [all] Error 2
>
>     I have no idea if something is missing from the build environment
>     or if there is some other error causing the signal 9 failure.
>
>     I ran it again and it failed again:
>
>     [ 43%] Built target partition
>     [ 44%] Built target gen_lex_token
>     [ 44%] Built target GenDigestServerSource
>     [ 57%] Built target sql
>     [ 57%] Linking CXX executable explain_filename-t
>     collect2: fatal error: ld terminated with signal 9 [Killed]
>     compilation terminated.
>     unittest/sql/CMakeFiles/explain_filename-t.dir/build.make:120:
>     recipe for target 'unittest/sql/explain_filename-t' failed
>     make[2]: *** [unittest/sql/explain_filename-t] Error 1
>     make[2]: *** Deleting file 'unittest/sql/explain_filename-t'
>     CMakeFiles/Makefile2:1266: recipe for target
>     'unittest/sql/CMakeFiles/explain_filename-t.dir/all' failed
>     make[1]: *** [unittest/sql/CMakeFiles/explain_filename-t.dir/all]
>     Error 2
>     Makefile:160: recipe for target 'all' failed
>     make: *** [all] Error 2
>
>     The command that appears to fail is this one:
>
>     10782 pts/1    R+     0:29 /usr/bin/ld -plugin
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/liblto_plugin.so
>     -plugin-opt=/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/lto-wrapper
>     -plugin-opt=-fresolution=/tmp/ccSjYPQi.res
>     -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc
>     -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s
>     -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr
>     --hash-style=gnu -dynamic-linker /lib/ld-linux-armhf.so.3 -X -m
>     armelf_linux_eabi -pie -o explain_filename-t
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../Scrt1.o
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../crti.o
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/crtbeginS.o
>     -L/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1
>     -L/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../.. -z
>     relro -z now
>     CMakeFiles/explain_filename-t.dir/explain_filename-t.cc.o
>     -lpthread ../../sql/libsql.a ../mytap/libmytap.a
>     ../../storage/perfschema/libperfschema.a
>     ../../storage/maria/libaria.a ../../storage/csv/libcsv.a
>     ../../storage/sequence/libsequence.a
>     ../../storage/xtradb/libxtradb.a -llz4 -llzo2 -llzma -lbz2 -laio
>     ../../storage/myisammrg/libmyisammrg.a
>     ../../storage/myisam/libmyisam.a ../../storage/heap/libheap.a
>     ../../plugin/feedback/libfeedback.a
>     ../../plugin/userstat/libuserstat.a ../../sql/libpartition.a
>     ../../mysys/libmysys.a ../../mysys_ssl/libmysys_ssl.a
>     ../../dbug/libdbug.a ../../mysys/libmysys.a
>     ../../mysys_ssl/libmysys_ssl.a ../../dbug/libdbug.a -lz
>     ../../strings/libstrings.a ../../vio/libvio.a -lpcre -ljemalloc
>     -lcrypt -lssl -lcrypto -ldl ../../wsrep/libwsrep.a -lsystemd
>     -lpthread -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/crtendS.o
>     /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.1.1/../../../crtn.o
>
>     So as it stands I can't get a "debug build" built.
>
>     I have a further question: how do I tell scons that I want to
>     build a debug version of the galera shared library? Or is that not
>     required, or would a debug galera build be enough?
>
>     Thanks,
>
>         Bill
>
>     On 26/05/16 02:40, Nirbhay Choubey wrote:
>>     Hi Bill,
>>
>>     The stack trace isn't useful. Did you build the server too?
>>     Could you try repeating this on a debug build?
>>     I would also advise you to file a bug at jira.mariadb.org.
>>
>>     Best,
>>     Nirbhay
>>
>>     On Wed, May 25, 2016 at 6:29 PM, Bill Mair
>>     <william.mair@xxxxxxxxxxx> wrote:
>>
>>         A new install and instance crashes when I try to start it
>>         with galera; I don't know what I'm doing wrong.
>>
>>         I built galera from source according to the instructions
>>         here: https://mariadb.com/kb/en/mariadb/installating-galera-from-source/
>>
>>         I had to use "scons strict_build_flags=0" (warnings were
>>         being flagged up as errors).
>>
>>         # uname -a
>>         Linux alarm 4.6.0-3-ARCH #1 Fri May 20 20:21:01 MDT 2016
>>         armv7l GNU/Linux
>>
>>         # grep wsrep /etc/mysql/my.cnf
>>
>>         wsrep_on=ON
>>         wsrep_provider=/usr/lib/galera/libgalera_smm.so
>>         wsrep_cluster_address=gcomm://192.168.1.91,192.168.1.92,192.168.1.93
>>         wsrep_node_address='192.168.1.91'
>>         wsrep_node_name='node1'
>>         wsrep_cluster_name='mariadb_cluster'
>>         wsrep_sst_method=rsync
>>         wsrep_provider_options=pc.bootstrap=true
>>
>>
>>         # sudo -u mysql mysqld --wsrep-new-cluster
>>         2016-05-25 22:10:31 3061874688 [Note] mysqld (mysqld
>>         10.1.14-MariaDB) starting as process 8527 ...
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Read nil XID
>>         from storage engines, skipping position init
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_load():
>>         loading provider library '/usr/lib/galera/libgalera_smm.so'
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_load():
>>         Galera 3.16(rXXXX) by Codership Oy <info@xxxxxxxxxxxxx>
>>         loaded successfully.
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: CRC-32C: using
>>         "slicing-by-8" algorithm.
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Found saved
>>         state: 00000000-0000-0000-0000-000000000000:-1
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Passing config
>>         to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.1.91;
>>         base_port = 4567; cert.log_conflicts = no; debug = no;
>>         evs.auto_evict = 0; evs.delay_margin = PT1S;
>>         evs.delayed_keep_period = PT30S; evs.inactive_check_period =
>>         PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period
>>         = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4;
>>         evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S;
>>         evs.user_send_window = 2; evs.view_forget_timeout = PT24H;
>>         gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0;
>>         gcache.mem_size = 0; gcache.name =
>>         /var/lib/mysql//galera.cache; gcache.page_size = 128M;
>>         gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0;
>>         gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave =
>>         no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25;
>>         gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit =
>>         0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version
>>         = 0; pc.announce_timeout = PT3S; pc.bootstrap = true;
>>         pc.checksum = false; pc.ignore_quo
>>         2016-05-25 22:10:31 2908746768 [Note] WSREP: Service thread
>>         queue flushed.
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Assign initial
>>         position for certification: -1, protocol version: -1
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: wsrep_sst_grab()
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Start replication
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP:
>>         'wsrep-new-cluster' option used, bootstrapping the cluster
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Setting initial
>>         position to 00000000-0000-0000-0000-000000000000:-1
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: protonet asio
>>         version 0
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Using CRC-32C
>>         for message checksums.
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: backend: asio
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm thread
>>         scheduling priority set to other:0
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: restore pc from
>>         disk successfully
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: GMCast version 0
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: (8708eb25,
>>         'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: (8708eb25,
>>         'tcp://0.0.0.0:4567') multicast: , ttl: 1
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: EVS version 0
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm:
>>         bootstrapping new group 'mariadb_cluster'
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: start_prim is
>>         enabled, turn off pc_recovery
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Node 8708eb25
>>         state prim
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP:
>>         view(view_id(PRIM,8708eb25,17) memb {
>>         8708eb25,0
>>         } joined {
>>         } left {
>>         } partitioned {
>>         })
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: save pc into disk
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr without UUID: tcp://192.168.1.91:4567
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr proto entry 0xb652e2a0
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr without UUID: tcp://192.168.1.92:4567
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr proto entry 0xb652e380
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr without UUID: tcp://192.168.1.93:4567
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: discarding
>>         pending addr proto entry 0xb652e460
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: clear restored view
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: gcomm: connected
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Changing maximum
>>         packet size to 64500, resulting msg size: 32636
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Shifting CLOSED
>>         -> OPEN (TO: 0)
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Opened channel
>>         'mariadb_cluster'
>>         2016-05-25 22:10:31 3061874688 [Note] WSREP: Waiting for SST
>>         to complete.
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: New COMPONENT:
>>         primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Starting new
>>         group from scratch: 1d570213-22bd-11e6-a70b-6aae917d4541
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE_EXCHANGE:
>>         sent state UUID: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE EXCHANGE:
>>         sent state msg: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: STATE EXCHANGE:
>>         got state msg: 1d574d2d-22bd-11e6-8936-86e5eb8c6c4c from 0
>>         (node1)
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Quorum results:
>>         version = 4,
>>         component = PRIMARY,
>>         conf_id = 0,
>>         members = 1/1 (joined/total),
>>         act_id = 0,
>>         last_appl. = -1,
>>         protocols = 0/7/3 (gcs/repl/appl),
>>         group UUID = 1d570213-22bd-11e6-a70b-6aae917d4541
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Flow-control
>>         interval: [16, 16]
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Restored state
>>         OPEN -> JOINED (0)
>>         160525 22:10:31 [ERROR] mysqld got signal 11 ;
>>         This could be because you hit a bug. It is also possible that
>>         this binary
>>         or one of the libraries it was linked against is corrupt,
>>         improperly built,
>>         or misconfigured. This error can also be caused by
>>         malfunctioning hardware.
>>
>>         To report this bug, see https://mariadb.com/kb/en/reporting-bugs
>>
>>         We will try our best to scrape up some info that will
>>         hopefully help
>>         diagnose the problem, but since we have already crashed,
>>         something is definitely wrong and this may fail.
>>
>>         Server version: 10.1.14-MariaDB
>>         key_buffer_size=0
>>         read_buffer_size=262144
>>         max_used_connections=0
>>         max_threads=153
>>         thread_count=2
>>         It is possible that mysqld could use up to
>>         key_buffer_size + (read_buffer_size +
>>         sort_buffer_size)*max_threads = 119520 K bytes of memory
>>         Hope that's ok; if not, decrease some variables in the equation.
>>
>>         Thread pointer: 0x0xb65d7008
>>         Attempting backtrace. You can use the following information
>>         to find out
>>         where mysqld died. If you see no messages after this,
>>         something went
>>         terribly wrong...
>>         stack_bottom = 0xad90edc4 thread_stack 0x48400
>>
>>         Trying to get some variables.
>>         Some pointers may be invalid and cause the dump to abort.
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Member 0.0
>>         (node1) synced with group.
>>         2016-05-25 22:10:31 2860512272 [Note] WSREP: Shifting JOINED
>>         -> SYNCED (TO: 0)
>>         Query (0x0):
>>         Connection ID (thread ID): 1
>>         Status: NOT_KILLED
>>
>>         Optimizer switch:
>>         index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
>>
>>
>>         The manual page at
>>         http://dev.mysql.com/doc/mysql/en/crashing.html contains
>>         information that should help you find out what is causing the
>>         crash.
>>
>>
>>
>
>

