
maria-developers team mailing list archive

[GSoC] MDEV-7502 Automatic provisioning of slave

 

Hello,

I am a student interested in the GSoC project "Automatic provisioning of
slave" ( https://mariadb.atlassian.net/browse/MDEV-7502 ). I have read
the current code and put together my idea of how it could be implemented.
Please correct me if there is a better solution for some steps or if I
have misunderstood something.

The slave starts a regular SQL thread and something very similar to
the current IO thread. The best starting point may be a copy of
handle_slave_io and related functions; this may leave some duplicate
code, which can be moved into new common functions at the end.

The slave will receive the GTIDs of the current master in response to
a modified copy of COM_BINLOG_DUMP. They mark the position from which
binlogs will be sent to the slave.

The master then builds a list of what needs to be sent to the slave.
The exact implementation depends on whether the master itself blocks
DDL queries during the provisioning process. If it does (it probably
won't, since that is not necessary and would just slow it down), we
are guaranteed that the internal containers holding databases/tables
won't change, so they can simply be iterated safely (not true for
procedure/trigger containers?). Otherwise, it will probably be
necessary to prefetch the list of databases/tables to copy to the
slave (this can be done lazily for tables/triggers/procedures/…,
holding only the list of tables/… for the database currently being
processed). In this second case, the slave will have to detect a DDL
event and interrupt the connection.
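
A minimal sketch of the lazy variant, in Python rather than server
code. Everything here is a hypothetical stand-in: the catalog is a
plain dict and iter_objects is an invented name, not the master's real
internal containers. It only shows the order of snapshots: database
names are prefetched once, and the object list is copied per database
just before that database is processed.

```python
def iter_objects(catalog):
    """Yield (database, kind, name) for every object to provision.

    `catalog` is a hypothetical stand-in for the master's metadata:
    a dict mapping database name -> {kind: [object names]}.
    """
    db_names = list(catalog)  # prefetched snapshot of database names
    for db in db_names:
        objects = catalog.get(db)
        if objects is None:
            continue  # database was dropped after the prefetch
        # Copy the object lists of this one database only (lazy step).
        snapshot = {kind: list(names) for kind, names in objects.items()}
        for kind, names in snapshot.items():
            for name in names:
                yield db, kind, name


catalog = {
    "shop": {"table": ["orders", "items"], "trigger": ["trg_orders"]},
    "blog": {"table": ["posts"]},
}
it = iter_objects(catalog)
first = next(it)
# Simulated concurrent DDL while provisioning is in progress:
catalog["shop"]["table"].append("late_table")  # created after the snapshot
del catalog["blog"]                            # dropped before being reached
rest = list(it)
```

In this run late_table is never yielded, because it was created after
the snapshot of its database was taken. That is the situation in which
the slave has to notice the DDL event in the binlog stream and
interrupt the connection.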

For the range scans on the primary key mentioned in the description, I
assume it would look like
SELECT * FROM tbl WHERE pk > @lastChunkKey ORDER BY pk LIMIT @chunkSize;
only done more efficiently with internal functions, with the result
converted to a Rows_log_event that is sent to the slave.
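
The chunked scan can be tried outside the server with a short Python
sketch. It uses SQLite purely as a stand-in engine; iter_chunks and
the tbl/pk/val names are illustrative assumptions, and a real
implementation would use the storage engine's internal functions and
emit row events instead of returning tuples.

```python
import sqlite3


def iter_chunks(conn, table, pk, chunk_size):
    """Yield lists of rows from `table` in primary-key order.

    Each chunk starts strictly after the last key of the previous one,
    so no row is sent twice and no OFFSET scan is needed.
    Table and column names are interpolated directly, so they must be
    trusted identifiers (fine for a sketch, not for user input).
    """
    last_key = None
    while True:
        if last_key is None:
            cur = conn.execute(
                f"SELECT * FROM {table} ORDER BY {pk} LIMIT ?",
                (chunk_size,),
            )
        else:
            cur = conn.execute(
                f"SELECT * FROM {table} WHERE {pk} > ? ORDER BY {pk} LIMIT ?",
                (last_key, chunk_size),
            )
        pk_index = [d[0] for d in cur.description].index(pk)
        rows = cur.fetchall()
        if not rows:
            return
        yield rows
        last_key = rows[-1][pk_index]  # plays the role of @lastChunkKey


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (pk INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 8)]
)
chunks = list(iter_chunks(conn, "tbl", "pk", 3))
```

With seven rows and a chunk size of three, the scan produces chunks of
three, three and one rows, and stops on the first empty result.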

To the restrictions I would add at least one more: provisioning cannot
be continued if the master is restarted during the process (or can we
actually store information about the provisioning process in the
database?).

Another possible problem is slave reconnect. I found the
kill_zombie_dump_threads function; in this case, the new thread could
take the information about provisioning progress from the old thread
and, with the help of fake GTIDs bundled with the chunks, somehow
figure out which chunk was the last one successfully received by the
slave before the disconnect. This would require additional memory to
store a few (or more?) of the most recently sent chunks, and it
wouldn't guarantee success.
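
The bounded history of recently sent chunks could be sketched as
follows, again in Python and under stated assumptions: RecentChunks is
a hypothetical name, each chunk is tagged with a consecutive sequence
number (standing in for the fake GTID), and the reconnecting slave
reports the last sequence number it received.

```python
from collections import OrderedDict


class RecentChunks:
    """Keep the last `capacity` sent chunks, keyed by sequence number."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._chunks = OrderedDict()  # seq_no -> chunk, oldest first

    def record(self, seq_no, chunk):
        """Remember a chunk that was just sent; evict the oldest if full."""
        self._chunks[seq_no] = chunk
        while len(self._chunks) > self.capacity:
            self._chunks.popitem(last=False)

    def resume_from(self, last_received_seq_no):
        """Return the (seq_no, chunk) pairs to resend, or None on failure.

        None means the history does not reach back far enough, so
        provisioning cannot be resumed from this buffer.
        Assumes sequence numbers are consecutive.
        """
        if not self._chunks:
            return None
        oldest = next(iter(self._chunks))
        if last_received_seq_no < oldest - 1:
            return None  # chunks between were already evicted
        return [
            (seq, chunk)
            for seq, chunk in self._chunks.items()
            if seq > last_received_seq_no
        ]


buf = RecentChunks(capacity=3)
for seq in range(1, 6):
    buf.record(seq, f"chunk{seq}")  # buffer now holds 3, 4, 5

resend = buf.resume_from(3)   # slave has everything up to 3
too_old = buf.resume_from(1)  # chunk 2 was evicted: cannot resume
```

The second call shows why this approach cannot guarantee success: once
the slave falls further behind than the buffer's capacity, the missing
chunks are gone.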

A great way to solve this would be to map a GTID sequence number to an
exact chunk of data, but I can't see how that can be done without
assuming something about the maximum number of rows in a table /
number of tables / triggers / …

Thanks for any feedback, Martin Kaluznik ( martin.kaluznik [at] gmail.com ).
I can also be found on IRC under the nick Farnham.

