← Back to team overview

maria-developers team mailing list archive

Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table

 

Hi, Aleksey!

It's great that you made it atomic. Frankly, I didn't think of it, MDEV
defines the feature as being exactly equivalent to the 3-command
sequence, so non-atomic. But, of course, making it atomic makes perfect
sense.

See other comments below.

On Aug 23, Aleksey Midenkov wrote:
> commit 6768e3a2830
> Author: Aleksey Midenkov <midenok@xxxxxxxxx>
> Date:   Thu Jul 29 22:54:12 2021 +0300
> 
> diff --git a/sql/lex.h b/sql/lex.h
> index cbf9d9d51b2..55eefc39a7e 100644
> --- a/sql/lex.h
> +++ b/sql/lex.h
> @@ -240,6 +240,7 @@ SYMBOL symbols[] = {
>    { "EXPLAIN",         SYM(DESCRIBE)},
>    { "EXTENDED",        SYM(EXTENDED_SYM)},
>    { "EXTENT_SIZE",     SYM(EXTENT_SIZE_SYM)},
> +  { "EXTRACT",         SYM(EXTRACT_SYM)},

So, you did EXTRACT after all, not MIGRATE?
ADD/EXTRACT aren't opposites in SQL.

It would be rather consistent to have

    ALTER ... ADD PARTITION (...) FROM table_name

and

    ALTER ... DROP PARTITION ... TO table_name

except that DROP...TO looks kind of strange, and ADD...FROM
does not convey the meaning that the table disappears.

The second issue also applies to IMPORT/EXPORT.

Another option, also kind of strange, but also consistent and
makes it clear that the table/partition disappears:

    ALTER ... MOVE TABLE table_name TO PARTITION (...)
    ALTER ... MOVE PARTITION ... TO TABLE table_name

or the same with CHANGE instead of MOVE to reuse an existing keyword.

I'm not saying anything of the above is perfect, and better variants are
very much welcome. But pretty much anything is better than ADD/EXTRACT.

>    { "FALSE",           SYM(FALSE_SYM)},
>    { "FAST",            SYM(FAST_SYM)},
>    { "FAULTS",  SYM(FAULTS_SYM)},
> diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> index dc2eb4425f0..a41ea7d3e4a 100644
> --- a/sql/sql_yacc.yy
> +++ b/sql/sql_yacc.yy
> @@ -7605,6 +7605,25 @@ alter_commands:
>              if (Lex->stmt_alter_table_exchange_partition($6))
>                MYSQL_YYABORT;
>            }
> +        | EXTRACT_SYM PARTITION_SYM alt_part_name_item
> +          AS TABLE_SYM table_ident have_partitioning
> +          {
> +            LEX *lex= Lex;
> +            lex->first_select_lex()->db= $6->db;
> +            if (lex->first_select_lex()->db.str == NULL &&
> +                lex->copy_db_to(&lex->first_select_lex()->db))
> +              MYSQL_YYABORT;
> +            if (unlikely(check_table_name($6->table.str,$6->table.length,
> +                                          FALSE)) ||
> +                ($6->db.str && unlikely(check_db_name((LEX_STRING*) &$6->db))))
> +              my_yyabort_error((ER_WRONG_TABLE_NAME, MYF(0), $6->table.str));

good point, LEX::stmt_alter_table_exchange_partition() doesn't seem to do it

> +            lex->name= $6->table;
> +
> +            lex->m_sql_cmd= new (thd->mem_root) Sql_cmd_alter_table();
> +            if (unlikely(lex->m_sql_cmd == NULL))
> +              MYSQL_YYABORT;
> +            lex->alter_info.partition_flags|= ALTER_PARTITION_EXTRACT;

why wouldn't you combine the common code from here and EXCHANGE PARTITION?
less code duplication and as a bonus EXCHANGE will get proper
table/db name checks.

> +          }
>          ;
>  
>  remove_partitioning:
> diff --git a/sql/sql_alter.h b/sql/sql_alter.h
> index a0f89c28d2a..6aa65d232ca 100644
> --- a/sql/sql_alter.h
> +++ b/sql/sql_alter.h
> @@ -345,6 +348,8 @@ class Alter_table_ctx
>    /** Indicates that we are altering temporary table */
>    bool tmp_table;
>  
> +  DDL_LOG_STATE *ddl_log_state;

What do you need this for?

> +
>  private:
>    char new_filename[FN_REFLEN + 1];
>    char new_alias_buff[NAME_LEN + 1];
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 779d125af70..033d77a827d 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -10016,7 +10146,7 @@ bool mysql_alter_table(THD *thd, const LEX_CSTRING *new_db,
>      }
>  
>      // In-place execution of ALTER TABLE for partitioning.
> -    DBUG_RETURN(fast_alter_partition_table(thd, table, alter_info,
> +    DBUG_RETURN(fast_alter_partition_table(thd, table, alter_info, &alter_ctx,
>                                             create_info, table_list,
>                                             &alter_ctx.db,
>                                             &alter_ctx.table_name));

you no longer need to pass alter_ctx.db and alter_ctx.table_name,
if you pass the whole alter_ctx down.

> diff --git a/include/my_dbug.h b/include/my_dbug.h
> index e25bfcf28a7..cf969ce5750 100644
> --- a/include/my_dbug.h
> +++ b/include/my_dbug.h
> @@ -104,6 +104,8 @@ extern int (*dbug_sanity)(void);
>          (_db_keyword_(0,(keyword), 0) ? (a1) : (a2))
>  #define DBUG_EVALUATE_IF(keyword,a1,a2) \
>          (_db_keyword_(0,(keyword), 1) ? (a1) : (a2))
> +#define DBUG_TRUE_IF(keyword) DBUG_EVALUATE_IF(keyword, 1, 0)
> +#define DBUG_FALSE_IF(keyword) DBUG_EVALUATE_IF(keyword, 0, 1)

hmm. DBUG_FALSE_IF is not used anywhere, DBUG_TRUE_IF is used once as

  if (DBUG_TRUE_IF("error_extract_partition_00") ||
      unlikely(error= file->ha_rename_table(from_name, to_name)))

and this is typically written as

  if (DBUG_EVALUATE_IF("error_extract_partition_00", 1,
        error= file->ha_rename_table(from_name, to_name)))

so I see no need for either DBUG_TRUE_IF or DBUG_FALSE_IF,
please, remove both

>  #define DBUG_PRINT(keyword,arglist) \
>          do if (_db_pargs_(__LINE__,keyword)) _db_doprnt_ arglist; while(0)
>  
> diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc
> index 85d880cbbb4..4172a61812e 100644
> --- a/sql/sql_partition.cc
> +++ b/sql/sql_partition.cc
> @@ -5432,7 +5432,7 @@ uint prep_alter_part_table(THD *thd, TABLE *table, Alter_info *alter_info,
>        } while (++part_count < tab_part_info->num_parts);
>        if (num_parts_found != num_parts_dropped)
>        {
> -        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), "DROP");
> +        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), cmd);

Not part of your patch, I know, but the error message here is

   "Error in list of partitions to %-.64s"

shouldn't it say "Partition does not exist" or something?

>          goto err;
>        }
>        if (table->file->is_fk_defined_on_table_or_index(MAX_KEY))
> @@ -7268,6 +7420,48 @@ uint fast_alter_partition_table(THD *thd, TABLE *table,
>      if (alter_partition_lock_handling(lpt))
>        goto err;
>    }
> +  else if (alter_info->partition_flags & ALTER_PARTITION_EXTRACT)
> +  {
> +    if (mysql_write_frm(lpt, WFRM_WRITE_EXTRACTED) ||
> +        write_log_drop_shadow_frm(lpt) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_1") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_1") ||
> +        mysql_write_frm(lpt, WFRM_WRITE_SHADOW) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_2") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_2") ||
> +        wait_while_table_is_used(thd, table, HA_EXTRA_NOT_USED) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_3") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_3") ||
> +        write_log_extract_partition(lpt) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_4") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_4") ||
> +        alter_close_table(lpt) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_5") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_5") ||
> +        alter_partition_extract(lpt) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_7") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_7") ||
> +        (frm_install= TRUE, FALSE) ||
> +        mysql_write_frm(lpt, WFRM_INSTALL_SHADOW|WFRM_BACKUP_ORIGINAL) ||
> +        log_partition_alter_to_ddl_log(lpt) ||
> +        (frm_install= FALSE, FALSE) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_8") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_8") ||
> +        (ddl_log_complete(lpt->part_info), false) ||

as far as I understand this ^^^ should be done *after* binlogging.
I suspect that if you crash here, blnlog won't get this ALTER TABLE.

> +        ((!thd->lex->no_write_to_binlog) &&
> +         (write_bin_log(thd, FALSE,
> +                        thd->query(), thd->query_length()), FALSE)) ||

if you crash here, what will roll forward on recovery and
drop the backup? The log is already closed here, isn't it?

I don't understand why you need a backup. Why not to install shadow
after the binlog was written?

> +        mysql_write_frm(lpt, WFRM_DROP_BACKUP) ||
> +        ERROR_INJECT_CRASH("crash_extract_partition_9") ||
> +        ERROR_INJECT_ERROR("fail_extract_partition_9"))
> +    {
> +      (void) ddl_log_revert(thd, lpt->part_info);
> +      handle_alter_part_error(lpt, true, true, frm_install);
> +      goto err;
> +    }
> +    if (alter_partition_lock_handling(lpt))
> +      goto err;
> +  }
>    else if ((alter_info->partition_flags & ALTER_PARTITION_ADD) &&
>             (part_info->part_type == RANGE_PARTITION ||
>              part_info->part_type == LIST_PARTITION))
> diff --git a/mysql-test/include/lcase_names.inc b/mysql-test/include/lcase_names.inc
> new file mode 100644
> index 00000000000..e69de29bb2d

better put here some comment, like in other dummy *.inc files
see e.g. platform.inc or protocol.inc

> diff --git a/mysql-test/suite/atomic/alter_partition,aria.rdiff b/mysql-test/suite/atomic/alter_partition,aria.rdiff
> new file mode 100644
> index 00000000000..20c0510c5ed
> --- /dev/null
> +++ b/mysql-test/suite/atomic/alter_partition,aria.rdiff
> @@ -0,0 +1,219 @@
> +--- alter_partition.result
> ++++ alter_partition,aria.reject
> +@@ -1,4 +1,4 @@
> +-set @@default_storage_engine=MyISAM;
> ++set @@default_storage_engine=Aria;
> + # Crash recovery
> + create or replace procedure prepare_table()
> + begin
> +@@ -13,12 +13,12 @@
> + end $
> + # QUERY: ALTER TABLE t1 EXTRACT PARTITION p1 AS TABLE tp1
> + # CRASH POINT: ddl_log_create_before_create_frm
> +-t1#P#p0.MYD
> +-t1#P#p0.MYI
> +-t1#P#p1.MYD

may be better `replace_result MAI MYI MAD MYD` ?
if that's the only difference. To avoid big rdiff files.

> +-t1#P#p1.MYI
> +-t1#P#pn.MYD
> +-t1#P#pn.MYI
> ++t1#P#p0.MAD
> ++t1#P#p0.MAI
 
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx


Follow ups