Re: gtid_slave_pos row count
> Do you have any errors in the error log about failure to delete rows?
Nope, no errors.
> Anything else special to your setup that might be causing this?
At some point I thought maybe tokudb_analyze_in_background / tokudb_auto_analyze was messing things up with its background checks (you can also see the row count growing in these log lines):
2018-09-29 11:05:48 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 423840 is greater than 40 percent of 1059601 rows. - succeeded.
2018-09-29 11:09:35 134490 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424359 is greater than 40 percent of 1060885 rows. - succeeded.
2018-09-29 11:13:23 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424888 is greater than 40 percent of 1062196 rows. - succeeded.
(it also triggers in conservative mode, but there it fires simply because a single changed row is >40% of the tiny table)
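To rule the background analysis out entirely I could also just disable it; a rough sketch, assuming the stock TokuDB variables (I haven't verified yet whether this changes the row count behaviour):
set global tokudb_auto_analyze = 0;            -- 0 disables the auto-scheduled analysis
set global tokudb_analyze_in_background = 0;   -- run any remaining ANALYZE in the foreground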
I tried switching off gtid_pos_auto_engines to use a single InnoDB gtid_slave_pos table, and it makes no difference: in conservative mode everything is fine, in optimistic mode the table fills up.
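(For reference, that switch was nothing more than roughly this, from memory:
stop slave;
set global gtid_pos_auto_engines = '';   -- back to the single InnoDB mysql.gtid_slave_pos table
start slave;
)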
The odd thing is that I'm actually not even using GTID for the replication:
MariaDB [mysql]> show slave status\G
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.8.211
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.096519
Read_Master_Log_Pos: 79697585
Relay_Log_File: db-relay-bin.000142
Relay_Log_Pos: 78464847
Relay_Master_Log_File: mysql-bin.096519
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 79697245
Relay_Log_Space: 595992008
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
..
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 211
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Slave_DDL_Groups: 25
Slave_Non_Transactional_Groups: 284193
Slave_Transactional_Groups: 452098720
The other possibly "special" thing is that the master is still on 10.2.4 - but that shouldn't be the problem, should it?
I have 2 slaves (both 10.3.9; I might try downgrading back to 10.2.x or to earlier 10.3.x versions, as I don't know the exact point when this started to happen), and the issue is triggered immediately when switching the parallel mode, as shown below.
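By "switching the parallel mode" I mean nothing more exotic than (roughly):
stop slave;
set global slave_parallel_mode = 'optimistic';   -- gtid_slave_pos starts growing right after this
start slave;
and switching back to conservative the same way.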
> Can you share the contents of the mysql.gtid_slave_pos table when this
> happens?
Sure,
MariaDB [mysql]> select @@gtid_slave_pos;
+--------------------+
| @@gtid_slave_pos |
+--------------------+
| 0-211-211038653075 |
+--------------------+
1 row in set (0.000 sec)
MariaDB [mysql]> select * from gtid_slave_pos limit 10;
+-----------+--------+-----------+--------------+
| domain_id | sub_id | server_id | seq_no |
+-----------+--------+-----------+--------------+
| 0 | 29488 | 211 | 210594092751 |
| 0 | 29490 | 211 | 210594092753 |
| 0 | 29957 | 211 | 210594093220 |
| 0 | 29958 | 211 | 210594093221 |
| 0 | 29961 | 211 | 210594093224 |
| 0 | 29962 | 211 | 210594093225 |
| 0 | 30095 | 211 | 210594093358 |
| 0 | 30096 | 211 | 210594093359 |
| 0 | 30247 | 211 | 210594093510 |
| 0 | 30275 | 211 | 210594093538 |
+-----------+--------+-----------+--------------+
10 rows in set (0.000 sec)
MariaDB [mysql]> select count(*) from gtid_slave_pos;
+----------+
| count(*) |
+----------+
| 2395877 |
+----------+
1 row in set (0.578 sec)
MariaDB [mysql]> select * from gtid_slave_pos_TokuDB limit 10;
+-----------+--------+-----------+--------------+
| domain_id | sub_id | server_id | seq_no |
+-----------+--------+-----------+--------------+
| 0 | 29373 | 211 | 210594092636 |
| 0 | 29911 | 211 | 210594093174 |
| 0 | 29912 | 211 | 210594093175 |
| 0 | 30282 | 211 | 210594093545 |
| 0 | 30283 | 211 | 210594093546 |
| 0 | 30284 | 211 | 210594093547 |
| 0 | 30285 | 211 | 210594093548 |
| 0 | 30287 | 211 | 210594093550 |
| 0 | 30348 | 211 | 210594093611 |
| 0 | 30349 | 211 | 210594093612 |
+-----------+--------+-----------+--------------+
10 rows in set (0.001 sec)
MariaDB [mysql]> select count(*) from gtid_slave_pos_TokuDB;
+----------+
| count(*) |
+----------+
| 840001 |
+----------+
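In case it helps, here's an ad hoc query I can run to see how far back the stale rows reach per domain (just a sketch of a diagnostic; the column aliases are mine):
select domain_id, min(sub_id) as oldest_sub_id, max(sub_id) as newest_sub_id, count(*) as cnt
from mysql.gtid_slave_pos
group by domain_id;   -- only the row with the highest sub_id per domain should normally survive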
> > Is there something wrong with the purger?
> > (something similar like in https://jira.mariadb.org/browse/MDEV-12147
> > ? )
>
> That bug is rather different - the row count in the table is not growing, but
> number of unpurged rows is.
True, I just searched Jira for similar issues (related to gtid_slave_pos) and saw that one is still open.
The optimistic mode makes a big difference in our setup, as with conservative mode there are times when the slaves start to lag several days behind.
rr