← Back to team overview

maria-developers team mailing list archive

Re: [External] Lag Free ALTER Table for Slave

 

Hi Simon,

On Tue, Feb 25, 2020 at 12:21 AM Simon Mudd <simon.mudd@xxxxxxxxxxx> wrote:
> Actually this is critical. Imagine I kick off an ALTER TABLE of a 1 TB table.
> That is going to take a long time to run, so I need a reasonably easy way to figure out how far things have progressed.
> I had anticipated something like a counter of bytes or rows affected / copied so far,
> or an indication of the position copying data from the source to destination table.

This is also a weak point in the InnoDB storage engine in MariaDB.

MySQL 5.7 introduced a progress reporting interface for InnoDB native
ALTER TABLE, but we never properly integrated it with MariaDB, even
though MariaDB Server 10.2 and later are based on MySQL 5.7.

For ALGORITHM=COPY, MariaDB actually introduced progress reporting,
way before MySQL had anything comparable:
https://mariadb.com/kb/en/progress-reporting/

When https://jira.mariadb.org/browse/MDEV-16329 implements online
ALGORITHM=COPY, we should make sure that also the log of concurrent
modifications will be accounted for in the progress reporting.

This progress reporting should also be exposed somehow in replication
in this https://jira.mariadb.org/browse/MDEV-11675 that you are
commenting on. How, I do not know, and I would expect that you would
have to query the replicas directly, not via a master. If I remember
correctly, MySQL 5.7 used PERFORMANCE_SCHEMA for the InnoDB native
ALTER TABLE progress reporting.

For the InnoDB storage engine in MariaDB, we have two open tasks,
which look like they could be duplicating each other:

https://jira.mariadb.org/browse/MDEV-9260 Improve progress report on
on-line alter table
https://jira.mariadb.org/browse/MDEV-12512 Accurately report progress
for ALTER TABLE...ALGORITHM=INPLACE

> Some of the OSCs I have done in production have taken over 3 months to run so having a progress meter of some sort
> tends to be really useful.

This strongly suggests that we need test coverage for various 'bad
day' scenarios, such as slow or intermittent connection for the
replication stream, or the restart of the master or replica (with or
without crash).

Best regards,

Marko
-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation


References