maria-developers team mailing list archive

Wrong SQL thread position reporting after IO thread restart

To: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
From: Pavel Ivanov <pivanof@xxxxxxxxxx>
Date: Thu, 12 Sep 2013 00:14:15 -0700
Cc: maria-developers <maria-developers@xxxxxxxxxxxxxxxxxxx>

Kristian,

I found what I think is a bug in IO and SQL thread position accounting.
How to reproduce:
1) Set up two servers, S1 and S2. S1 is the master; S2 is a slave with
master_use_gtid = current_pos.
2) Execute some transactions on the master, e.g.

create database d;
create table d.t (n int);
insert into d.t values (1);

3) Both servers are now at 0-1-3, and SHOW SLAVE STATUS on S2 shows
Read_Master_Log_Pos equal to Exec_Master_Log_Pos.
4) Execute STOP SLAVE IO_THREAD on S2.
5) S2 reports in the logs: "Slave I/O thread exiting, ... GTID
position 0-1-2". So the IO thread didn't realize that it had received
the full transaction for 0-1-3, even though it had not yet received the
next GTID event.
6) Execute START SLAVE IO_THREAD on S2.
7) At this point SHOW SLAVE STATUS on S2 shows the same
Read_Master_Log_Pos as in step 3, but Exec_Master_Log_Pos is now lower
than in step 3, as if the SQL thread had not yet caught up with the IO
thread. Yet even with both threads running and no further transactions
executed on the master, Exec_Master_Log_Pos never changes and never
becomes equal to Read_Master_Log_Pos. This apparently happens because
the IO thread restarts from one transaction behind and adds to the
relay log the Rotate event that the master sends with the position of
that transaction, but then doesn't add any events for the transaction
itself because it knows they are already in the relay log.
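
For convenience, the S2 side of the steps above can be condensed into the
statements below. This is only a sketch: it assumes replication from S1 is
already configured on S2 as in step 1, and the final SELECT of
@@gtid_slave_pos is an extra cross-check that is not part of the steps
above.

```sql
-- Run on S2 (the slave), after the transactions from step 2 have replicated.

-- Step 3: note Read_Master_Log_Pos and Exec_Master_Log_Pos; they should match.
SHOW SLAVE STATUS;

-- Step 4: stop only the IO thread, then check the error log for the
-- "Slave I/O thread exiting" message and the GTID position it reports (step 5).
STOP SLAVE IO_THREAD;

-- Step 6: restart the IO thread.
START SLAVE IO_THREAD;

-- Step 7: compare the two positions again with the values noted in step 3.
SHOW SLAVE STATUS;

-- Extra check (an assumption, not one of the original steps): the GTID
-- position the slave has actually applied.
SELECT @@gtid_slave_pos;
```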

I think both problems are bugs. Although fixing the first would make
the second really hard (if at all possible) to reproduce, I think the
reporting of the SQL thread's position should still be fixed.


Thank you,
Pavel

Follow ups

Re: Wrong SQL thread position reporting after IO thread restart
From: Pavel Ivanov, 2013-09-17