← Back to team overview

maria-developers team mailing list archive

Re: mariadb 10.1.5 crash

 

Hi,

I have a very similar crash while testing optimistic parallel replication (stack trace in attachment). MariaDB was running with:

 * binlog enabled, but log-slave-updates disabled
 * reduced durability (trx-commit=2 + sync_binlog=0)
 * slave_parallel_mode=aggressive + slave_parallel_threads=320 (yes,
   this number of thread can make sense, but now is not the time to
   extend on that)

I also have crashes, but without stack trace at the moment, for:

 * Same as above, but with 240 threads
 * high durability (trx-commit=1 + sync_binlog=1) and 320 threads

I will gather more information on the crashes, but it might take a few days (those are benchmarks of the same workload, so I can rerun it). When the bug is created, please provide bug number so I can add the relevant information in the ticket.

Note that I had no crashes with:

 * high durability and 5,10,20,40,80,120,160,240 threads, with 5,10,20
   ran twice
 * low durability and 5,10,20,40,80,120,160 threads

So the bug looks like a race condition that does not happen very often.

Thanks,

Jean-François Gagné

On 2015-06-18 18:53, Roberto Spadim wrote:
Hi sergey, i'm loggins querirss now to check when the crash occur
I think it's not related with create/drop, vit since i don't have the log i cant provide some kind of 100% information just the log
You can create the bug eithout problemns
It's mariadb 10.1.5 glibc>2.14
The serve load was low i was testing only, near to 5 active connections, others was connections not running anything just connections waiting queries


Em quinta-feira, 18 de junho de 2015, Sergey Vojtovich <svoj@xxxxxxxxxxx <mailto:svoj@xxxxxxxxxxx>> escreveu:

    Hi Roberto,

    according to this trace server crashed while evicting an object
    from table
    cache (happens when we run out of table_open_cache). Likely
    because per-share
    free tables list was broken. The rest is quite moot yet: e.g. it
    may (or may
    not) also involve table definition cache eviction.

    Will you report a bug or should I? A valuable information would be
    if this load
    involved DDL and admin statements at least. table_open_cache and
    table_definition_cache values, number of active connections,
    number of tables
    participating in this load.

    Thanks,
    Sergey

    On Wed, Jun 17, 2015 at 07:46:34PM -0300, Roberto Spadim wrote:
    > hi guys, i was testing mariadb with test servers and i hit a crash
    > could anyone check? server has 8GB of ram, i don't know what
    query is
    >
    >
    >
    >
    > 2015-06-17 14:01:32 140497706510080 [Note] WSREP: Stop replication
    > 2015-06-17 14:01:32 140497706510080 [Note] Event Scheduler:
    Purging the
    > queue. 0 events
    > 2015-06-17 14:01:32 140497706510080 [Note]
    > /usr/mariadb/mariadb/bin//mysqld: Shutdown complete
    >
    > 150617 14:01:32 mysqld_safe mysqld from pid file
    /home/mysql/pid/local.pid
    > ended
    > 150617 14:02:38 mysqld_safe Starting mysqld daemon with
    databases from
    > /home/mysql/data/local/
    > 2015-06-17 14:02:38 140260849645440 [Warning]
    'THREAD_CONCURRENCY' is
    > deprecated and will be removed in a future release.
    > 2015-06-17 14:02:38 140260849645440 [Note]
    /usr/mariadb/mariadb/bin//mysqld
    > (mysqld 10.1.5-MariaDB-log) starting as process 2321 ...
    > 2015-06-17 14:02:38 140260849645440 [Note] Plugin 'InnoDB' is
    disabled.
    > 2015-06-17 14:02:38 140260849645440 [Note] Plugin 'FEEDBACK' is
    disabled.
    > 2015-06-17 14:02:38 140260849645440 [Note] Server socket created
    on IP:
    > '::'.
    > 2015-06-17 14:02:38 140260849645440 [Note] Event Scheduler:
    Loaded 0 events
    > 2015-06-17 14:02:38 140260849645440 [Note] WSREP: Read nil XID
    from storage
    > engines, skipping position init
    > 2015-06-17 14:02:38 140260849645440 [Note] WSREP: wsrep_load():
    loading
    > provider library 'none'
    > 2015-06-17 14:02:38 140260849645440 [Note]
    > /usr/mariadb/mariadb/bin//mysqld: ready for connections.
> Version: '10.1.5-MariaDB-log' socket: '/tmp/mysql.BH.sock' port: 3307
    >  MariaDB Server
    > 150617 17:02:14 [ERROR] mysqld got signal 11 ;
    > This could be because you hit a bug. It is also possible that
    this binary
    > or one of the libraries it was linked against is corrupt,
    improperly built,
    > or misconfigured. This error can also be caused by
    malfunctioning hardware.
    >
    > To report this bug, see http://kb.askmonty.org/en/reporting-bugs
    >
    > We will try our best to scrape up some info that will hopefully help
    > diagnose the problem, but since we have already crashed,
    > something is definitely wrong and this may fail.
    >
    > Server version: 10.1.5-MariaDB-log
    > key_buffer_size=629145600
    > read_buffer_size=4194304
    > max_used_connections=23
    > max_threads=102
    > thread_count=16
    > It is possible that mysqld could use up to
    > key_buffer_size + (read_buffer_size +
    sort_buffer_size)*max_threads =
    > 1452041 K  bytes of memory
    > Hope that's ok; if not, decrease some variables in the equation.
    >
    > Thread pointer: 0x0x7f90d0353008
    > Attempting backtrace. You can use the following information to
    find out
    > where mysqld died. If you see no messages after this, something went
    > terribly wrong...
    > stack_bottom = 0x7f9103539e80 thread_stack 0x48000
    > mysys/stacktrace.c:247(my_print_stacktrace)[0x7f9106c50b0b]
    > sql/signal_handler.cc:153(handle_fatal_signal)[0x7f91067eadb5]
    > /lib64/libpthread.so.0(+0x10d00)[0x7f910537dd00]
    > sql/sql_plist.h:181(I_P_List_iterator)[0x7f910679cf03]
    > mysys/lf_hash.c:125(lfind)[0x7f9106c5a1a0]
    > sql/table_cache.cc:1084(tdc_iterate(THD*, char (*)(void*,
    void*), void*,
    > bool))[0x7f910679e4e3]
    > sql/table_cache.cc:267(tc_add_table(THD*, TABLE*))[0x7f910679e808]
    > sql/sql_base.cc:2572(open_table(THD*, TABLE_LIST*,
    > Open_table_context*))[0x7f9106633272]
    > sql/sql_base.cc:4016(open_and_process_table)[0x7f91066370c8]
    > sql/sql_base.cc:5184(open_and_lock_tables(THD*, DDL_options_st
    const&,
    > TABLE_LIST*, bool, unsigned int,
    Prelocking_strategy*))[0x7f9106637394]
    > sql/sql_parse.cc:5718(execute_sqlcom_select)[0x7f910666f083]
    > sql/sql_parse.cc:2938(mysql_execute_command(THD*))[0x7f910667a9ce]
    > sql/sql_parse.cc:7173(mysql_parse)[0x7f910667dc3d]
    > sql/sql_parse.cc:1470(dispatch_command(enum_server_command,
    THD*, char*,
    > unsigned int))[0x7f91066802a8]
    > sql/sql_parse.cc:1095(do_command(THD*))[0x7f910668083b]
    >
    sql/sql_connect.cc:1347(do_handle_one_connection(THD*))[0x7f91067435ef]
    > sql/sql_connect.cc:1260(handle_one_connection)[0x7f9106743747]
    > /lib64/libpthread.so.0(+0x7654)[0x7f9105374654]
    > /lib64/libc.so.6(clone+0x6d)[0x7f9104a987dd]
    >
    > Trying to get some variables.
    > Some pointers may be invalid and cause the dump to abort.
    > Query (0x7f90c8424020): is an invalid pointer
    > Connection ID (thread ID): 516
    > Status: NOT_KILLED
    >
    > Optimizer switch:
    >
    index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
    >
    > The manual page at
    http://dev.mysql.com/doc/mysql/en/crashing.html contains
    > information that should help you find out what is causing the crash.
    > 150617 17:02:14 mysqld_safe Number of processes running now: 0
    > 150617 17:02:14 mysqld_safe mysqld restarted
    > 2015-06-17 17:02:15 140086665152384 [Warning]
    'THREAD_CONCURRENCY' is
    > deprecated and will be removed in a future release.
    > 2015-06-17 17:02:15 140086665152384 [Note]
    /usr/mariadb/mariadb/bin//mysqld
    > (mysqld 10.1.5-MariaDB-log) starting as process 9785 ...
    > 2015-06-17 17:02:15 140086665152384 [Note] mysqld: Aria engine:
    starting
    > recovery
    > recovered pages: 0% 10% 21% 31% 41% 51% 78% 90% 100% (0.0
    seconds); tables
    > to flush: 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22
    21 20 19 18
    > 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 (0.3 seconds);
    > 2015-06-17 17:02:15 140086665152384 [Note] mysqld: Aria engine:
    recovery
    > done
    > 2015-06-17 17:02:15 140086665152384 [Note] Plugin 'InnoDB' is
    disabled.
    > 2015-06-17 17:02:15 140086665152384 [Note] Plugin 'FEEDBACK' is
    disabled.
    > 2015-06-17 17:02:15 140086665152384 [Note] Recovering after a
    crash using
    > tc.log
    > 2015-06-17 17:02:15 140086665152384 [Note] Starting crash
    recovery...
    > 2015-06-17 17:02:15 140086665152384 [Note] Crash recovery finished.
    > 2015-06-17 17:02:15 140086665152384 [Note] Server socket created
    on IP:
    > '::'.
    > 2015-06-17 17:02:15 140086665152384 [Note] Event Scheduler:
    Loaded 0 events
    > 2015-06-17 17:02:15 140086665152384 [Note] WSREP: Read nil XID
    from storage
    > engines, skipping position init
    > 2015-06-17 17:02:15 140086665152384 [Note] WSREP: wsrep_load():
    loading
    > provider library 'none'
    > 2015-06-17 17:02:15 140086665152384 [Note]
    > /usr/mariadb/mariadb/bin//mysqld: ready for connections.
> Version: '10.1.5-MariaDB-log' socket: '/tmp/mysql.BH.sock' port: 3307
    >  MariaDB Server
    > 2015-06-17 17:02:19 140086617798400 [ERROR] mysqld: Table
    > './conquista_comercial/est_atu' is marked as crashed and should
    be repaired
    > 2015-06-17 17:02:19 140086617798400 [Warning] Checking table:
    > './conquista_comercial/est_atu'
    > 2015-06-17 17:02:19 140086617798400 [ERROR] mysqld: Table
    > './conquista_comercial/est_spa' is marked as crashed and should
    be repaired
    > 2015-06-17 17:02:19 140086617798400 [Warning] Checking table:
    > './conquista_comercial/est_spa'
    > 2015-06-17 17:02:51 140086617798400 [ERROR] mysqld: Table
    > './conquista_comercial/est_dia' is marked as crashed and should
    be repaired
    > 2015-06-17 17:02:51 140086617798400 [Warning] Checking table:
    > './conquista_comercial/est_dia'
    > 2015-06-17 17:02:52 140086617798400 [ERROR] mysqld: Table
    > './conquista_comercial/rastreabilidade_impressao' is marked as
    crashed and
    > should be repaired
    > 2015-06-17 17:02:52 140086617798400 [Warning] Checking table:
    > './conquista_comercial/rastreabilidade_impressao'
    > 2015-06-17 17:29:58 140086663948032 [ERROR] mysqld: Table
    > './conquista_comercial/est_atu_erros' is marked as crashed and
    should be
    > repaired
    > 2015-06-17 17:29:58 140086663948032 [Warning] Checking table:
    > './conquista_comercial/est_atu_erros'
    > 2015-06-17 19:24:04 140086538364672 [Note]
    > /usr/mariadb/mariadb/bin//mysqld: Normal shutdown
    >
    > 2015-06-17 19:24:04 140086538364672 [Note] WSREP: Stop replication
    > 2015-06-17 19:24:06 140086538364672 [Note] Event Scheduler:
    Purging the
    > queue. 0 events
    > 2015-06-17 19:24:07 140086538364672 [Note]
    > /usr/mariadb/mariadb/bin//mysqld: Shutdown complete
    >
    > 150617 19:24:07 mysqld_safe mysqld from pid file
    /home/mysql/pid/local.pid
    > ended
    >
    >
    >
    > --
    > Roberto Spadim
    > SPAEmpresarial - Software ERP
    > Eng. Automação e Controle

    > _______________________________________________
    > Mailing list: https://launchpad.net/~maria-developers
    <https://launchpad.net/%7Emaria-developers>
    > Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx <javascript:;>
    > Unsubscribe : https://launchpad.net/~maria-developers
    <https://launchpad.net/%7Emaria-developers>
    > More help   : https://help.launchpad.net/ListHelp



--
Roberto Spadim
SPAEmpresarial - Software ERP
Eng. Automação e Controle



_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp

150619  7:06:19 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.1.5-MariaDB-log
key_buffer_size=268435456
read_buffer_size=4194304
max_used_connections=8
max_threads=3002
thread_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 15693505 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7fba1614a008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fd25d9efc80 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0x7fe6c150083b]
sql/signal_handler.cc:153(handle_fatal_signal)[0x7fe6c109a8a5]
/lib64/libpthread.so.0(+0xf710)[0x7fe6c06a7710]
sql/sql_plist.h:181(I_P_List_iterator)[0x7fe6c104cab3]
mysys/lf_hash.c:125(lfind)[0x7fe6c1509ee0]
sql/table_cache.cc:1084(tdc_iterate(THD*, char (*)(void*, void*), void*, bool))[0x7fe6c104e093]
sql/table_cache.cc:267(tc_add_table(THD*, TABLE*))[0x7fe6c104e3b8]
sql/sql_base.cc:2572(open_table(THD*, TABLE_LIST*, Open_table_context*))[0x7fe6c0ee1e82]
sql/sql_base.cc:4016(open_and_process_table)[0x7fe6c0ee5cc0]
sql/sql_base.cc:5184(open_and_lock_tables(THD*, DDL_options_st const&, TABLE_LIST*, bool, unsigned int, Prelocking_strategy*))[0x7fe6c0ee5f84]
sql/sql_insert.cc:719(mysql_insert(THD*, TABLE_LIST*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool))[0x7fe6c0f1271b]
sql/sql_parse.cc:3866(mysql_execute_command(THD*))[0x7fe6c0f27687]
sql/sql_parse.cc:7173(mysql_parse)[0x7fe6c0f2d74d]
sql/log_event.cc:4274(Query_log_event::do_apply_event(rpl_group_info*, char const*, unsigned int))[0x7fe6c115a92c]
sql/log_event.h:1348(Log_event::apply_event(rpl_group_info*))[0x7fe6c0eba711]
sql/slave.cc:3291(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*, rpl_parallel_thread*))[0x7fe6c0eb0b1b]
sql/rpl_parallel.cc:49(rpt_handle_event)[0x7fe6c10356e2]
perfschema/pfs.cc:1863(pfs_spawn_thread)[0x7fe6c13ee35d]
/lib64/libpthread.so.0(+0x79d1)[0x7fe6c069f9d1]
/lib64/libc.so.6(clone+0x6d)[0x7fe6beb8586d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fba8c385b0f): is an invalid pointer
Connection ID (thread ID): 4028
Status: NOT_KILLED

Follow ups

References