← Back to team overview

maria-developers team mailing list archive

Re: [Commits] a2a42a9: MDEV-11623: MariaDB 10.1 fails to start datadir created with MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K

 

I realized that the MySQL 5.6/MariaDB 10.1 innochecksum does not have
code to update the data files. 5.7/10.2 got the ability to rewrite
page checksums.
I did a quick attempt at porting innochecksum from 10.2 to 10.1, but
that felt too risky.
So, instead of writing a proper innochecksum tool, I wrote a
quick&dirty Perl routine that downgrades the FSP_SPACE_FLAGS to the
old 10.1.x format. That routine is used in the test
innodb.101_compatibility that I added.

Jan, please review bb-10.1-mdev-11623. I hope to push it to 10.1
tomorrow if the tests are fine. Elena is running some tests.

Marko

On Wed, Jan 11, 2017 at 3:42 PM, Marko Mäkelä <marko.makela@xxxxxxxxxxx> wrote:
> Sorry for the delay. This patch looks great to me, except for one
> thing: I think that we should write the corrected flags back to the
> data file.
>
> Given that 10.1 is already in GA status, we should consider our
> options carefully. My preference would be (1) below:
>
> (1) If we adjust 10.1.21 so that it writes back corrected flags for
> both old and new data files, then older 10.1.x would be unable to open
> the files (if compression, DATA_DIRECTORY or non-default
> innodb_page_size was used).
>
> (2) If we adjust 10.1.21 so that it writes correct flags for new data
> files (like the current patch does), then older 10.1.x would not be
> able to read those created-with-newer-version data files.
>
> (3) If we adjust 10.2 only, so that it writes the correct flags for
> new files and also adjusts the flags in old files, then we would seem
> to have a clear upgrade path.
>
> Whatever we choose, I would like to remove the flag-adjustment code
> from 10.3 or 10.4. This would mean that an upgrade (or import)
> directly from 10.1 is only possible if the problematic features
> (DATA_DIRECTORY, compression, non-standard innodb_page_size) were not
> used. Otherwise the upgrade would have to be done through 10.2.
>
> On Thu, Dec 29, 2016 at 10:41 PM, Jan Lindström
> <jan.lindstrom@xxxxxxxxxxx> wrote:
>> revision-id: a2a42a9181729b661ba308f2d0a75d8b547ed09d (mariadb-10.1.20-15-ga2a42a9)
>> parent(s): 23cc1be270c7304963643947d8e5ab88f4e312ee
>> author: Jan Lindström
>> committer: Jan Lindström
>> timestamp: 2016-12-29 22:40:30 +0200
>> message:
>>
>> MDEV-11623: MariaDB 10.1 fails to start datadir created with MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K
>>
>> Problem was that in MariaDB 10.1 page size flag is on different
>> position (13) on tablespace flags compared to MySQL 5.6 and
>> MariaDB 10.0 or older position (6). DATA_DIR flag position difference
>> was already handled.
>>
>> /** Tablespace flags position and name in MySQL 5.6/MariaDB 10.0 or older
>> and MariaDB 10.1.20 or older MariaDB 10.1 and in MariaDB 10.1.21
>> or newer.
>> MySQL 5.6               MariaDB 10.1.x          MariaDB 10.1.21
>> ====================================================================
>> Below flags in same offset
>> ====================================================================
>> 0: POST_ANTELOPE        0:POST_ANTELOPE         0: POST_ANTELOPE
>> 1: ZIP_SSIZE            1:ZIP_SSIZE             1: ZIP_SSIZE
>> 5: ATOMIC_BLOBS         5:ATOMIC_BLOBS          5: ATOMIC_BLOBS
>
> It could be clearer to explicitly identify multi-bit fields, e.g.,
> 1..4 for ZIP_SSIZE.
>
>> =====================================================================
>> Below note the difference in order
>> =====================================================================
>> 6:  PAGE_SSIZE          6:PAGE_COMPRESSION      6: PAGE_SSIZE
>> 10: DATA_DIR            7:PAGE_COMPRESSION_LEVEL  10: DATA_DIR
>> 11: UNUSED              11:ATOMIC_WRITES
>> =====================================================================
>> Note that below flags have been moved
>> =====================================================================
>>                         13:PAGE_SSIZE           11: RESERVED (5.7 SHARED)
>>                         17:DATA_DIR             12: RESERVED (5.7 TEMPORARY)
>>                         18:UNUSED               13: RESERVED (5.7 ENCRYPTION)
>>                                                 14: PAGE_COMPRESSION
>>                                                 15: PAGE_COMPRESSION_LEVEL
>>                                                 19: ATOMIC_WRITES
>>                                                 21: UNUSED
>
> s/have been moved/were in incorrect position in MariaDB 10.1.x/
>
> It could be good to explain the impact of this.
> That is, what would happen if we upgrade from MariaDB 10.0 to the
> buggy MariaDB 10.1, or
> if we upgrade from the buggy MariaDB 10.1 to 10.1.21 where this has been fixed.
>
> As far as I can tell, with respect to the DATA_DIR flag we should be
> mostly OK, because that flag should only play a role in the table
> flags when determining where the data file is located, not so much in
> the tablespace flags inside the data file. (Moving files from MariaDB
> 10.1 to MySQL 5.7 could be hurt, because the DATA_DIR flag would be
> interpreted as the 5.7 ENCRYPTION flag.)
>
> Next, let us consider PAGE_SSIZE (bits 13..16 in the buggy MariaDB
> 10.1.x, 6..9 elsewhere).
>
> Upgrade from 10.0 or 5.6 to the buggy 10.1.x: We read the bits 13..16.
> These would always be 0 in the old data files, so we will interpret
> the data file as if it had innodb_page_size=16k. That is, upgrade or
> import from 5.6 or 10.0 to the buggy 10.1.x will be refused if
> innodb_page_size differs from 16k. Is this correct? (This is something
> that we cannot fix, because we cannot retroactively change old MariaDB
> 10.1.x versions.)
>
> Upgrade from the buggy 10.1.x to the fixed 10.1.x or 10.2: We read the
> bits 6..9 which were incorrectly used as PAGE_COMPRESSION,
> PAGE_COMPRESSION_LEVEL. That is, upgrade should work if
> innodb_page_size=16k (the flags are expected to be 0) and compression
> is not used (the bits 6..9 were written as 0 by the buggy 10.1.x).
>
> To help users who use the buggy 10.1.x with innodb_page_size!=16k or
> with compression, we can adjust the flags:
>
>> fsp0fsp.ic: fsp_flags_get_page_size() : By default read
>> used page size from original position i.e. after ATOMIC_BLOBS.
>> But if we detect that buggy MariaDB 10.1 flags are used, then
>> we read it from FSP_FLAGS_PAGE_SSIZE_MARIADB101.
> [snip]
>> Tested also with MySQL 5.7 datadir (after deleting redo logs
>> and copying mysql/* files from 10.1) but that failed because
>> SYS_INDEXES got an extra column in 5.7 SYS_INDEXES.MERGE_THRESHOLD.
>
> You could theoretically have initialized the database in MySQL 5.6 and
> then upgraded it to 5.7. All tables that were created in 5.6 should be
> accessible. But would not have added much value.
>
>> After shutdown, started MariaDB 10.1.21 with fix and
>> verified that all tables created are accessible.
>
> Were the flags corrected in the data file afterwards? I think that we
> should correct the flags.
>
> Please add a mysql-test case that demonstrates that the flags are
> corrected. Something like this:
>
> --source include/have_innodb.inc
> --source include/not_embedded.inc
>
> # restart will restore the old values
> SET GLOBAL innodb_file_per_table=1;
> SET GLOBAL innodb_file_format=Barracuda;
> let INNODB_PAGE_SIZE=`select @@innodb_page_size`;
> let MYSQLD_DATADIR=`select @@datadir`;
>
> CREATE TABLE tr(a INT)ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
> CREATE TABLE tc(a INT)ENGINE=InnoDB ROW_FORMAT=COMPACT;
> CREATE TABLE td(a INT)ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
> CREATE TABLE tz(a INT)ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1;
> # TODO: repeat the same with DATA_DIRECTORY and COMPRESSION
> --source include/shutdown_mysqld.inc
>
> perl;
> # corrupt the FSP_FLAGS in each file; write the page checksum as 0xdeadbeef
> # (see log_data_file_size.test or some recent test from MySQL)
> EOF;
>
> --let $restart_parameters = --innodb-read-only;
> --source include/start_mysqld.inc
> CHECK TABLE tr;
> CHECK TABLE tc;
> CHECK TABLE td;
> CHECK TABLE tz;
> --source include/shutdown_mysqld.inc
>
> perl;
> # check that the FSP_FLAGS are still corrupted
> # (they must not be adjusted in read-only mode)
> EOF;
>
> --let $restart_parameters=;
> --source include/start_mysqld.inc
> CHECK TABLE tr;
> CHECK TABLE tc;
> CHECK TABLE td;
> CHECK TABLE tz;
> --source include/shutdown_mysqld.inc
>
> perl;
> # check that the FSP_FLAGS now correspond to $ENV{INNODB_PAGE_SIZE}
> EOF;
> --source include/start_mysqld.inc
> DROP TABLE tr,tc,td,tz;
>
>> @@ -3859,21 +3837,11 @@ fil_open_single_table_tablespace(
>> +               /* Validate this single-table-tablespace with SYS_TABLES. */
>> +               bool flags_correct = dict_compare_flags(def.flags, flags);
>
> I think that the function name should be more descriptive, making
> clear that the first argument contains the table flags and the second
> the FIL_SPACE_FLAGS:
>
>     dict_tf_match_space(def.flags, flags)
>
> The comment does not really add any value. I would omit it.
>
>> diff --git a/storage/innobase/fsp/fsp0fsp.cc b/storage/innobase/fsp/fsp0fsp.cc
>> index 87aa5f7..7ff5244 100644
>> --- a/storage/innobase/fsp/fsp0fsp.cc
>> +++ b/storage/innobase/fsp/fsp0fsp.cc
>> @@ -4182,3 +4182,141 @@ fsp_page_is_free_func(
>>         return xdes_mtr_get_bit(
>>                 descr, XDES_FREE_BIT, page_no % FSP_EXTENT_SIZE, mtr);
>>  }
>> +
>> +/********************************************************************//**
>> +Verify that dictionary flags modified to tablespace flags
>> +and actual tablespace flags stored to FSP header match.
>> +@param[in]     dict_flags      dict_tf_to_fsp_flags(dict_table_t::flags)
>> +@param[in]     fsp_flags       Actual flags stored to FSP header in page 0
>> +@return        true if flags match, false if not */
>> +bool
>> +fsp_verify_flags(
>> +       ulint   dict_flags,
>> +       ulint   fsp_flags)
>> +{
>> +       ulint   dict_unused = FSP_FLAGS_GET_UNUSED(dict_flags);
>> +       ulint   dict_antelope = FSP_FLAGS_GET_POST_ANTELOPE(dict_flags);
>> +       ulint   dict_zssize = FSP_FLAGS_GET_ZIP_SSIZE(dict_flags);
>> +       ulint   dict_ablobs = FSP_FLAGS_HAS_ATOMIC_BLOBS(dict_flags);
>
> Add an early return before all these computations if dict_flags==fsp_flags.
>
> I think that the name dict_flags is misleading. Both flags are
> supposed to be tablespace flags, not data dictionary (or table) flags.
>
> How about this:
>
> s/dict_flags/expected/
> s/fsp_flags/actual/
>
> And then define the local variables with expected_ and actual_
> prefixes, instead of dict_ and fsp_. That would make the code easier
> to follow.
>
>> +       DBUG_EXECUTE_IF("fsp_verify_flags_failure",
>> +                       return(false););
>
> Where is the test to exercise this? Do we even need this? I think it
> is better to inject faults into the data files using Perl code in
> mysql-test (see my previous suggestion).
>
> Other than this, the function looks OK to me.
>
>> diff --git a/storage/innobase/include/dict0dict.h b/storage/innobase/include/dict0dict.h
>> index 42f93b5..479954d 100644
>> --- a/storage/innobase/include/dict0dict.h
>> +++ b/storage/innobase/include/dict0dict.h
>> @@ -1904,6 +1904,17 @@ dict_table_get_index_on_first_col(
>>  #endif /* !UNIV_HOTBACKUP */
>>
>>
>> +/** Compare tablespace flags.
>> +@param[in]     flags           SYS_TABLES.flags
>> +@param[in]     mod_flags       Tablespace flags.
>> +@return true if flags match, false if not */
>> +UNIV_INLINE
>> +bool
>> +dict_compare_flags(
>> +       ulint   flags,
>> +       ulint   mod_flags)
>> +       MY_ATTRIBUTE((warn_unused_result));
>> +
>
> Can we declare the inline function directly in the .h file, instead of
> splitting it into the .ic file?
>
> I would suggest different names:
>
> bool dict_tf_match_space(ulint table_flags, ulint fsp_flags);
>
>> @@ -980,12 +982,11 @@ dict_sys_tables_type_to_tf(
>>         PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES are the same. */
>>         flags |= type & (DICT_TF_MASK_ZIP_SSIZE
>>                          | DICT_TF_MASK_ATOMIC_BLOBS
>> -                        | DICT_TF_MASK_DATA_DIR
>> -                        | DICT_TF_MASK_PAGE_COMPRESSION
>> -                        | DICT_TF_MASK_PAGE_COMPRESSION_LEVEL
>> -                        | DICT_TF_MASK_ATOMIC_WRITES
>> +                        | DICT_TF_MASK_DATA_DIR
>> +                         | DICT_TF_MASK_PAGE_COMPRESSION
>> +                         | DICT_TF_MASK_PAGE_COMPRESSION_LEVEL
>> +                        | DICT_TF_MASK_ATOMIC_WRITES);
>>
>> -       );
>>
>>         return(flags);
>>  }
>
> This appears to be a white-space-only change. We could avoid it,
> unless this is actually fixing the indentation. The spaces and TABs
> got lost in transit, so I cannot tell..
>
>> @@ -1018,8 +1019,8 @@ dict_tf_to_sys_tables_type(
>>         type |= flags & (DICT_TF_MASK_ZIP_SSIZE
>>                          | DICT_TF_MASK_ATOMIC_BLOBS
>>                          | DICT_TF_MASK_DATA_DIR
>> -                        | DICT_TF_MASK_PAGE_COMPRESSION
>> -                        | DICT_TF_MASK_PAGE_COMPRESSION_LEVEL
>> +                         | DICT_TF_MASK_PAGE_COMPRESSION
>> +                         | DICT_TF_MASK_PAGE_COMPRESSION_LEVEL
>>                          | DICT_TF_MASK_ATOMIC_WRITES);
>>
>>         return(type);
>
> Same here.
>
>>         DBUG_EXECUTE_IF("dict_tf_verify_flags_failure",
>> -                       return(ULINT_UNDEFINED););
>> +                       return(false););
>
> Nice catch. But is there any test to exercise this? If not, I would
> remove the DBUG_EXECUTE_IF altogether.
>
>>         ut_a(page_ssize == 0 || page_ssize != 0); /* silence compiler */
>>         ut_a(compact == 0 || compact == 1); /* silence compiler */
>>         ut_a(data_dir == 0 || data_dir == 1); /* silence compiler */
>> -       ut_a(post_antelope == 0 || post_antelope == 1); /* silence compiler */
>> +       ut_a(post_antelope == 0 || post_antelope == 1); /* silence compiler
>> +                                                       */
>
> I think it would be better to declare these variables as
> __attribute__((unused)) or to remove the variables altogether if they
> are unused.
>
>> +
>> +       if (table_unused || fsp_unused) {
>> +               ib_logf(IB_LOG_LEVEL_ERROR,
>> +                       "Table flags has unused %ld"
>> +                       " in the data dictionary"
>> +                       " but the flags in file has unused %ld\n",
>> +                       table_unused, fsp_unused);
>> +
>> +               return (false);
>> +       }
>
> 0x%lx would be more user-friendly here.
>
>> +               ib_logf(IB_LOG_LEVEL_ERROR,
>> +                       "Table flags has page_compression %ld"
>> +                       " in the data dictionary"
>> +                       " but the flags in file ahas page_compression %ld\n",
>>                         page_compression, fsp_page_compression);
>
> s/ahas/has/
>
>> +/********************************************************************//**
>> +Verify that dictionary flags modified to tablespace flags
>> +and actual tablespace flags stored to FSP header match.
>> +@param[in]     dict_flags      dict_tf_to_fsp_flags(dict_table_t::flags)
>> +@param[in]     fsp_flags       Actual flags stored to FSP header in page 0
>> +@return        true if flags match, false if not */
>> +bool
>> +fsp_verify_flags(
>> +       ulint   dict_flags,
>> +       ulint   fsp_flags)
>> +       MY_ATTRIBUTE((warn_unused_result));
>> +
>
> This seems to be missing UNIV_INTERN? I think we need it in 10.1 at
> least for innodb_plugin.
>
>> +       /** In MariaDB 10.1.20 or older MariaDB 10.1 PAGE_SSIZE
>> +       was stored after ATOMIC_WRITES not after ATOMIC_BLOBS
>> +       as in MySQL 5.6, MariaDB 10.0 or older. New tables
>> +       in MariDB 10.1.21 or newer store PAGE_SSIZE after ATOMIC_BLOBS.
>> +       */
>> +       ssize = FSP_FLAGS_GET_PAGE_SSIZE(flags);
>
> Do not use /** comments before local variables.
>
> I think that one of the ‘after’ in the comment should be ‘before’.
> Also fix the typo MariDB.
> --
> DON’T MISS
> M|17
> April 11 - 12, 2017
> The Conrad Hotel
> New York City
> https://m17.mariadb.com/
>
> Marko Mäkelä, Lead Developer InnoDB
> MariaDB Corporation



-- 
DON’T MISS
M|17
April 11 - 12, 2017
The Conrad Hotel
New York City
https://m17.mariadb.com/

Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation


References