maria-developers team mailing list archive
Mailing list archive
Re: InnoDB: Instant ADD COLUMN that works on old data files
On Wed, Jan 11, 2017 at 9:07 AM, Jan Lindström
> Number of the fields indeed would solve the ADD COLUMN default case but has
> some limitations
> ADD COLUMN m default xxx; would be still slow, xxx = NULL maybe possible, in
> other cases could we store default on SYS_COLUMNS?
The instant ADD COLUMN described in MDEV-11369 is just a start.
Changing the DEFAULT of an instantly added column would require a
rebuild, but specifying the DEFAULT at the time of adding the column
can be done instantly.
Changing a previously NOT NULL column to allow NULLs would be in the realm of
https://jira.mariadb.org/browse/MDEV-11424 Instant ALTER TABLE of
failure-free record format changes.
Currently the DEFAULT values of columns are stored into the *.frm
files by the SQL layer, not at all in InnoDB. If we allow instant
changes of the DEFAULT values of instantly added columns, we must make
sure that any rows that were inserted before the change to the DEFAULT
would use the old default value. I believe that for this, we would
effectively have to store all default values that ever existed, and we
would have to identify when each record or page was created with
respect to the instant ALTER operations. MDEV-11424 does cover all
Monty has expressed a long-term goal of allowing the server to
discover InnoDB tables after *.frm and *.ibd files were copied to the
data directory. I think that we should avoid extending the InnoDB
SYS_* tables any further, because that would seem to move us further
away from the goal. I think that any metadata related to the instant
ALTER has to be stored in the data file itself, and the high-level
data dictionary should only reflect the latest committed data
definition. I hope that we can some day deliver a transactional data
dictionary in MDEV-11655 and remove all SYS_* tables.
> DROP COLUMN would be still slow
Yes, DROP COLUMN (as well as changing the order of columns, or adding
a column somewhere in the middle) would require a table rebuild until
MDEV-11424 is implemented.
> Problem naturally is that we do not have suitable metadata on every page
> where to identify what columns are stored. I do not see way out of this
> problem without introducing yet a another file forma (we have too many of
> them already).
Yes, that is the case. I agree with you that we must be careful when
introducing new file formats.
Luckily we do have 8 previously unused bytes (always written as 0
until now) in each clustered index page, the PAGE_MAX_TRX_ID field.
(MDEV-6076 in MariaDB 10.2.4 will repurpose the field in clustered
index root pages for a persistent AUTO_INCREMENT field.) In MDEV-11369
we repurpose a subset 1..1023 of the values, and in MDEV-11424 we can
repurpose the remaining values to indicate the format of the records
in the page. MDEV-11424 would also require a metadata history (or
record conversion recipes) to be stored somewhere in the data file,
separately from the clustered index tree.
April 11 - 12, 2017
The Conrad Hotel
New York City
Marko Mäkelä, Lead Developer InnoDB