Re: [Commits] 8bfb140d5dc: Move deletion of old GTID rows to slave background thread


Salute, Kristian!

> Hi Andrei,
> Thanks for review! I rebased the patch on 10.4, ran it through another
> buildbot run, and pushed it to 10.4.
> I think with this patch I'll close MDEV-12147, ok?

Committed :-)

> I wrote up the below documentation, I'm planning on adding it to the
> knowledgebase, unless it is better to send it to someone for them to add
> (with proper English spelling/grammar, etc)?

I am cc-ing Ian Gilfillan on this matter. Let me throw in a couple of
questions/recommendations on the description below.

> andrei.elkin@xxxxxxxxxx writes:
>> There is something to improve in the test organization, like
>> to base two tests of
>>>  storage/rocksdb/mysql-test/rocksdb_rpl/t/mdev12179.test
>>>  storage/tokudb /mysql-test/tokudb_rpl /t/mdev12179.test
>> on a common parent.
>> I thought for a second to place it in mysql-test/include/
>> but again the parent file is so specific that I had to stop it.
>> This apparently can wait until a third engine shows up and requires the
>> same coverage.
> Right, I had the same thoughts... but yes, this is probably for another
> task (I only modified those tests because they needed adjustment to work
> with the new way of mysql.gtid_slave_pos cleanup).
>  - Kristian.

> -----------------------------------------------------------------------
> mysql.gtid_slave_pos functionality
> The mysql.gtid_slave_pos table is maintained automatically by the server;
> there is generally no need to manually inspect or modify it in any way. This
> description is just a reference for understanding the internal workings of
> the server.
> The table is automatically created when installing or upgrading the server
> with mysql_install_db or mysql_upgrade.
> Each replicated transaction (internally referred to as "event group") inserts
> a new row in the table as the last step before committing.
> Each new row
> increments the value of sub_id, so the last GTID replicated is always found
> from the row with the largest sub_id.

[Insert the following clause here to clarify 'sub_id']

As event groups commit in master binlog order, 'sub_id' records that
same order on the slave.

> The insert is committed as part of the replicated transaction (for DML to
> transactional storage engines like InnoDB); this makes the replication GTID
> position crash-safe.
> At server start, the table is read, and the row with the highest sub_id
> value (within each GTID domain) is used to initialize the value of
> @@gtid_slave_pos. After reading the table, any redundant rows (those not
> having the highest sub_id within their domain) are deleted from the table.
> As new rows are inserted into the table, old rows are automatically removed
> by a background process. The removal happens asynchronously and the exact
> duration before a row is removed depends on server and system load. The
> frequency at which rows are removed can be controlled with the system
> variable @@gtid_cleanup_batch_size. A larger value of
> @@gtid_cleanup_batch_size reduces the overhead of removing old rows but
> increases the number of old rows that can exist in the table; in most cases
> the impact of changing @@gtid_cleanup_batch_size will be minimal.
> Prior to MariaDB 10.4.1 there is no background process to remove old rows
> in the table. Instead, rows that are no longer needed are removed
> synchronously as part of the replication of the next transaction within the
> same GTID domain. This means there will usually be two rows for each domain
> in the table, though with parallel replication the number of rows can
> temporarily increase beyond that.
> -----------------------------------------------------------------------
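
For the knowledgebase readers, the startup step described above could
perhaps be illustrated with a toy model. The sketch below is plain Python
over an in-memory list, not the server code; Row and load_slave_pos are
hypothetical names made up for the illustration, and the fields mirror the
table's columns (domain_id, sub_id, server_id, seq_no).

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for one row of mysql.gtid_slave_pos.
Row = namedtuple("Row", ["domain_id", "sub_id", "server_id", "seq_no"])


def load_slave_pos(rows):
    """Toy model of startup: pick the highest sub_id per GTID domain.

    Returns the resulting position string and the list of redundant rows
    that would be deleted after the table has been read.
    """
    latest = {}
    for row in rows:
        current = latest.get(row.domain_id)
        if current is None or row.sub_id > current.sub_id:
            latest[row.domain_id] = row

    # Every row that is not the newest one in its domain is redundant.
    redundant = [r for r in rows if latest[r.domain_id] is not r]

    # A GTID is written as domain_id-server_id-seq_no.
    position = ",".join(
        f"{r.domain_id}-{r.server_id}-{r.seq_no}"
        for r in sorted(latest.values())
    )
    return position, redundant


rows = [
    Row(domain_id=0, sub_id=1, server_id=1, seq_no=100),
    Row(domain_id=0, sub_id=2, server_id=1, seq_no=101),
    Row(domain_id=1, sub_id=3, server_id=2, seq_no=7),
]
position, redundant = load_slave_pos(rows)
print(position)
print(redundant)
```

In this example the older row of domain 0 is the only one reported as
redundant; the other two rows define the position.
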
> @@gtid_cleanup_batch_size
> This variable controls the frequency at which a background process runs to
> remove no longer needed rows from the mysql.gtid_slave_pos table. Normally,
> tuning this variable will have little impact on server performance and
> should not be needed.
> The server counts the number of GTIDs replicated; when this number reaches
> @@gtid_cleanup_batch_size, the background process is signalled to start
> cleanup of no longer needed rows in the mysql.gtid_slave_pos table, and the
> counter is reset.
> Note that the cleanup happens asynchronously, and system load can cause the
> cleanup step to be delayed or

> even skipped completely in rare cases;

I can only think of crashes here... Do you mean anything else?

> thus
> the number of rows in mysql.gtid_slave_pos can temporarily be larger than
> @@gtid_cleanup_batch_size.
> The @@gtid_cleanup_batch_size variable was introduced in MariaDB 10.4.1.
> -----------------------------------------------------------------------
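
The counting scheme described for @@gtid_cleanup_batch_size might also be
worth a small illustration. The following is only a sketch of the
count-and-reset idea in plain Python, assuming a single counter and no
concurrency; CleanupCounter and on_gtid_replicated are hypothetical names,
the batch size of 64 is just an illustrative value, and the real background
thread and its signalling are not modelled.

```python
class CleanupCounter:
    """Toy model: count replicated GTIDs, signal cleanup once per batch."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.count = 0    # GTIDs replicated since the last signal
        self.signals = 0  # how many times cleanup has been requested

    def on_gtid_replicated(self):
        """Record one replicated GTID; return True if cleanup is signalled."""
        self.count += 1
        if self.count >= self.batch_size:
            # The counter is reset when the background cleanup is signalled.
            self.count = 0
            self.signals += 1
            return True
        return False


counter = CleanupCounter(batch_size=64)
fired = [counter.on_gtid_replicated() for _ in range(200)]
print(counter.signals, counter.count)
```

Since the signal only requests a cleanup, rows keep accumulating until the
background process actually runs, which is why the table can temporarily
hold more rows than the batch size.
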

Many thanks for this great piece of work!


