maria-developers team mailing list archive
Message #06319
Re: Questions re MDEV-4736 and MDEV-4739 (was Re: Spider's installation sql file)
Hi Kentoku,
On Thu, Sep 19, 2013 at 11:47:33PM +0900, kentoku wrote:
> Hi Sergey,
>
> > I'm afraid fixing rnd_end() callers in the server may stall for a long
> > time.
> O.K. No need to fix it if it is not easy. This is not a high-priority
> request.
>
> > Is it acceptable for Spider to use the bulk updates and deletes API instead
> > (see handler.h: start_bulk_update/start_bulk_delete).
> Spider uses it already. This API is not used if the target table has an AFTER
> trigger. I understand why this API is not used in this case, but it
> sometimes causes performance problems. So, I thought it would be better to
> prepare another option in case something goes wrong. Anyway, I will disable
> the bulk updating/deleting feature without using the API.
I see. Hope it is acceptable.
>
> > Nope, it looks like a bug in thread pool. MDEV-4739 has different trace, how
> > did you get this one? Just executed given test?
> I got this trace when I tried to reproduce MDEV-4739. I did the following.
>
> 1. start mysqld
> 2. log in mysqld
> 3. mysql> CREATE TABLE t1 (a INT) ENGINE=InnoDB;
> 4. mysql> XA START 'xa1';
> 5. mysql> INSERT INTO t1 (a) VALUES (1),(2);
> 6. mysql> XA END 'xa1';
> 7. mysql> XA PREPARE 'xa1';
> 8. kill -9 mysqld_safe and mysqld from another terminal
> 9. start mysqld on gdb
>
> At that time, InnoDB and Spider were enabled and log-bin was disabled. So
> probably "total_ha_2pc > 1" was true, "opt_bin_log" was false.
> Does it help you?
We haven't been able to reproduce it yet. :(
Looking through the code, I noticed that the call to thd_wait_begin() looks as
follows:
  static void scheduler_wait_sync_begin(void) {
    thd_wait_begin(NULL, THD_WAIT_SYNC);
  }
Note that thd is always NULL. And it must be NULL at this point, because we're
booting. But according to your trace thd is not NULL.
#0 0x00000000005eabf6 in thd_wait_begin (
thd=0x29da060, wait_type=10)
at /ssd1/mariadb-10.0.4/sql/sql_class.cc:4277
#1 0x000000000072a114 in scheduler_wait_sync_begin ()
at /ssd1/mariadb-10.0.4/sql/scheduler.cc:59
...
The above should have been fixed back in the beginning of 2012. Which MariaDB
revision are you testing with?
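
To make the mismatch concrete, here is a minimal standalone sketch. It is not the server code: ThdModel, SchedulerModel, current_thd_model and thd_wait_begin_model are hypothetical stand-ins, and the fallback from a NULL argument to a per-thread "current THD" pointer is an assumption made for illustration. Under that assumption, a caller that always passes NULL can still end up with a non-NULL thd inside the function if a per-thread pointer happens to be set.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; these are NOT the real server types.
struct SchedulerModel {
    int wait_begin_calls = 0;   // counts simulated callback invocations
};

struct ThdModel {
    SchedulerModel *scheduler = nullptr;
};

// Assumption: models a per-thread "current THD" pointer.
static thread_local ThdModel *current_thd_model = nullptr;

// Returns true if the scheduler callback was invoked.
bool thd_wait_begin_model(ThdModel *thd) {
    if (thd == nullptr) {
        thd = current_thd_model;    // caller passed NULL: try the per-thread pointer
        if (thd == nullptr)
            return false;           // e.g. still booting: nothing to notify
    }
    if (thd->scheduler == nullptr)
        return false;               // guard against an unset scheduler
    thd->scheduler->wait_begin_calls++;
    return true;
}
```

With no per-thread pointer set, the call is a no-op. Once the pointer is set, the same NULL argument reaches the scheduler, which is the only point where an invalid scheduler value could be dereferenced.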
Thanks,
Sergey
>
> Thanks,
> Kentoku
>
>
>
> 2013/9/19 Sergey Vojtovich <svoj@xxxxxxxxxxx>
>
> > Hi Kentoku,
> >
> > I'm adding MariaDB developers to CC.
> >
> > On Thu, Sep 19, 2013 at 01:19:13AM +0900, kentoku wrote:
> > > Hi Sergey,
> > >
> > > > But what kind of errors are possible in your case? Other storage engines
> > > > don't seem to suffer from this API violation.
> > >
> > > Spider supports bulk updating and deleting to avoid network round trips
> > > between data nodes. Sometimes the last bulk update is executed in the
> > > rnd_end() function, so rnd_end() can get errors from a data node.
> > I'm afraid fixing rnd_end() callers in the server may stall for a long
> > time.
> > Is it acceptable for Spider to use the bulk updates and deletes API instead
> > (see handler.h: start_bulk_update/start_bulk_delete).
> >
> > > By the way, about MDEV-4739. I get the following stack trace.
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x00000000005eabf6 in thd_wait_begin (thd=0x29da060,
> > > wait_type=10)
> > > at /ssd1/mariadb-10.0.4/sql/sql_class.cc:4277
> > > 4277 MYSQL_CALLBACK(thd->scheduler, thd_wait_begin, (thd, wait_type));
> > > (gdb) print thd
> > > $1 = (THD *) 0x29da060
> > > (gdb) bt
> > > #0 0x00000000005eabf6 in thd_wait_begin (
> > > thd=0x29da060, wait_type=10)
> > > at /ssd1/mariadb-10.0.4/sql/sql_class.cc:4277
> > > #1 0x000000000072a114 in scheduler_wait_sync_begin ()
> > > at /ssd1/mariadb-10.0.4/sql/scheduler.cc:59
> > > #2 0x0000000000d6dc20 in my_sync (fd=23, my_flags=0)
> > > at /ssd1/mariadb-10.0.4/mysys/my_sync.c:76
> > > #3 0x0000000000d6b54f in my_msync (fd=23,
> > > addr=0x7ffff7ff4000, len=4096, flags=4)
> > > at /ssd1/mariadb-10.0.4/mysys/my_mmap.c:27
> > > #4 0x00000000008bea03 in TC_LOG_MMAP::open (
> > > this=0x16e6a00, opt_name=0xe19c87 "tc.log")
> > > at /ssd1/mariadb-10.0.4/sql/log.cc:7735
> > > #5 0x00000000005751cb in init_server_components ()
> > > at /ssd1/mariadb-10.0.4/sql/mysqld.cc:4797
> > > #6 0x0000000000575a07 in mysqld_main (argc=30,
> > > argv=0x1f204d0)
> > > at /ssd1/mariadb-10.0.4/sql/mysqld.cc:5208
> > > #7 0x000000000056d884 in main (argc=11,
> > > argv=0x7fffffffe3a8)
> > > at /ssd1/mariadb-10.0.4/sql/main.cc:25
> > > (gdb) print thd->scheduler
> > > $2 = (scheduler_functions *) 0x8f8f8f8f8f8f8f8f
> > > (gdb) print thd_wait_begin
> > > $3 = {void (THD *, int)} 0x5eaba4 <thd_wait_begin(THD*, int)>
> > > (gdb) print wait_type
> > > $4 = 10
> > >
> > > It looks like "thd->scheduler" is not initialized. What do you think?
> > > Must the storage engine set it?
> > Nope, it looks like a bug in thread pool. MDEV-4739 has different trace, how
> > did you get this one? Just executed given test?
> >
> > Regards,
> > Sergey
> >