maria-developers team mailing list archive

Re: MDEV-7502 Automatic provisioning of slave – retrieval of data from master

 

Martin Kaluznik <martin.kaluznik@xxxxxxxxx> writes:

> Meta-data
> For this I found only one method, pass output of SHOW CREATE * to
> slave in 'Query_log_event'. When I was looking for alternatives, I
> have found MySQL online backup work log where they are describing
> similar/same problem under High Level Architecture – Decision (
> http://dev.mysql.com/worklog/task/?id=3574 ). Because of this, I find
> it unlikely, that there currently is better way.

Agreed. This is also the method used by e.g. mysqldump.

> Because meta-data retrieval will take only small fraction of
> provisioning time, I think, that it could be implemented using
> 'Ed_connection' class, it executes query in string form and returns
> result set from within replication thread. Disadvantage of this

I am not familiar with Ed_connection, but it seems fine. I agree performance
is not an issue here. I believe there are various functions to build CREATE
TABLE statements (and similar), but if Ed_connection works, I see no reason
not to use it.

> Row data
> For this, same method with 'Ed_connection' could work, but it would be
> probably too big overhead – parser, result processing. So I created
> low level loop which goes through records of table and fills them in
> 'Write_rows_log_event'. Could you please check it, if it looks
> acceptable and I will then rewrite it to generic form, it is
> provisioning_send_info::send_table_data() function.
> https://github.com/f4rnham/server/blob/3d36cbfc20dff73d92ce61168dcc9805d68d3e2b/sql/rpl_provisioning.cc

Yes, it seems like a good start.
Again, the exact way to open tables and access them, and to handle
transaction start/end and lock release, is not something I am very familiar
with. But you already seem to have something working, so I suggest
continuing with the generic code. And when you have something, or if you have
a question I cannot answer, I will find someone else to help, probably Monty
or Serg.

I wrote some simple code that scans a table when I implemented GTID. Maybe
you can get inspiration from that. Look at rpl_slave_state::record_gtid()
and rpl_load_gtid_slave_state(). rpl_slave_state::record_gtid() works both
as part of an existing transaction and as a stand-alone transaction if there
is no existing transaction, so it might help you with the lock release
handling. Monty reviewed that code at one point, so it should be ok.

The idea with MDEV-7502 is to scan by primary key, so that we can send
the table in chunks and not keep a read transaction open for a long time on
large tables. Your code already uses the primary key for the scan, I think,
so you probably already planned this. It is probably a good idea in any case
to start by getting a simpler version working where everything is done in
one scan, and then add the chunking later when the simple version works.
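
A minimal sketch of the chunked scan described above, assuming an in-memory
std::map as a stand-in for a table ordered by primary key. The names Table,
Row, read_chunk() and scan_in_chunks() are hypothetical and are not part of
the server code; in the server each chunk would presumably be read inside
its own short read transaction and packed into row events.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table: primary key -> row payload.
// std::map keeps entries ordered by key, like a primary key index.
using Table = std::map<int, std::string>;
using Row = std::pair<int, std::string>;

// Read up to chunk_size rows with key strictly greater than last_key.
// When have_last_key is false, reading starts at the first row.
// Each call models one short, independent read transaction.
std::vector<Row> read_chunk(const Table &table, bool have_last_key,
                            int last_key, std::size_t chunk_size)
{
  assert(chunk_size > 0);
  std::vector<Row> chunk;
  Table::const_iterator it =
      have_last_key ? table.upper_bound(last_key) : table.begin();
  for (; it != table.end() && chunk.size() < chunk_size; ++it)
    chunk.emplace_back(it->first, it->second);
  return chunk;
}

// Drive the scan: remember only the last key sent between chunks, so no
// cursor or transaction has to stay open across chunk boundaries.
std::vector<std::vector<Row>> scan_in_chunks(const Table &table,
                                             std::size_t chunk_size)
{
  std::vector<std::vector<Row>> chunks;
  bool have_last_key = false;
  int last_key = 0;
  for (;;)
  {
    std::vector<Row> chunk =
        read_chunk(table, have_last_key, last_key, chunk_size);
    if (chunk.empty())
      break;  // no rows beyond last_key: the scan is complete
    last_key = chunk.back().first;
    have_last_key = true;
    chunks.push_back(std::move(chunk));
  }
  return chunks;
}
```

The only state carried between chunks is the last key that was sent, which
is why a primary key (or some other unique ordering) is needed for the scan
to resume at the right place.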

You should probably check whether the table actually has a primary key. I
think it is fine to give an error if it does not. Or we could fall back to a
full scan without chunking for such tables; rpl_load_gtid_slave_state() does
such a full table scan.
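
The two options above (error out, or fall back to an unchunked full scan)
can be sketched as a small decision function. This is only an illustration:
TableInfo, ScanPlan and choose_scan_plan() are hypothetical names, and how
the presence of a primary key is actually detected in the server is not
shown here.

```cpp
#include <string>

// Hypothetical description of a table to be provisioned.
struct TableInfo
{
  std::string name;
  bool has_primary_key;
};

// Possible ways to send the table's rows to the slave.
enum class ScanPlan
{
  ChunkedByPrimaryKey,        // resume by key, short read transactions
  FullScanSingleTransaction,  // one scan, read transaction held throughout
  Error                       // refuse to provision this table
};

// Pick a plan: chunk by primary key when one exists; otherwise either fall
// back to a full scan or report an error, depending on the chosen policy.
ScanPlan choose_scan_plan(const TableInfo &table, bool allow_full_scan_fallback)
{
  if (table.has_primary_key)
    return ScanPlan::ChunkedByPrimaryKey;
  return allow_full_scan_fallback ? ScanPlan::FullScanSingleTransaction
                                  : ScanPlan::Error;
}
```

Keeping the policy in one place makes it easy to start with the error
behaviour and enable the fallback later without touching the scan code.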

Looks good to me so far.

Thanks,

 - Kristian.

