← Back to team overview

maria-developers team mailing list archive

Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table

 

Hi Sergei!

Updated bb-10.7-midenok-MDEV-22166 is d4668e7254c6

On Thu, Aug 26, 2021 at 9:58 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Aleksey!
>
> Some replies.
>
> On Aug 26, Aleksey Midenkov wrote:
> > > > diff --git a/sql/lex.h b/sql/lex.h
> > > > index cbf9d9d51b2..55eefc39a7e 100644
> > > > --- a/sql/lex.h
> > > > +++ b/sql/lex.h
> > > > @@ -240,6 +240,7 @@ SYMBOL symbols[] = {
> > > >    { "EXPLAIN",         SYM(DESCRIBE)},
> > > >    { "EXTENDED",        SYM(EXTENDED_SYM)},
> > > >    { "EXTENT_SIZE",     SYM(EXTENT_SIZE_SYM)},
> > > > +  { "EXTRACT",         SYM(EXTRACT_SYM)},
> > >
> > > So, you did EXTRACT after all, not MIGRATE?
> >
> > I did like the original description said but I agree that this should be
> > changed to something better. Moreover EXTRACT parsing is now
> > incorrect, see
> >
> > git show 3b693cd3d9 mysql-test/main/parser.result
>
> Yes, I've noticed, didn't comment because we still discuss the syntax.
> If we'll decide to use EXTRACT after all, then perhaps it could be made
> non-reserved.
>
> > > ADD/EXTRACT aren't opposites in SQL.
> > >
> > > It would be rather consistent to have
> > >
> > >     ALTER ... ADD PARTITION (...) FROM table_name
> > >
> > > and
> > >
> > >     ALTER ... DROP PARTITION ... TO table_name
> > >
> > > except that DROP...TO looks kind of strange, and ADD...FROM
> > > does not convey the meaning that the table disappears.
> > >
> > > The second issue also applies to IMPORT/EXPORT.
> > >
> > > Another option, also kind of strange, but also consistent and
> > > makes it clear that the table/partition disappears:
> > >
> > >     ALTER ... MOVE TABLE table_name TO PARTITION (...)
> > >     ALTER ... MOVE PARTITION ... TO TABLE table_name
> > >
> > > or the same with CHANGE instead of MOVE to reuse an existing keyword.
> >
> > There is no MOVE symbol and we should not add a new one because of the
> > same compatibility reason: it will not be possible to create table
> > "move" (as well as to work with existing one I suspect).
>
> it will be, if MOVE won't be a reserved work.
> But still, keywords, even non-reserved, make the parser a bit slower.
> And may be a keyword in that context has to be reserved?
>
> So, yes, I totally agree that it's best to avoid adding new keywords to
> the grammar.
>
> > That's why we found MIGRATE as a good candidate And the reasons to
> > have some special keyword, not just "ADD PARTITION ... FROM ..." are:
> >
> > 1. it is not clear whether the source table stays or not (that was the
> > discussion in MDEV-22165);
> > 2. for keyword like MIGRATE it is clear how it works in both
> > directions and no need to remember or guess the opposite keyword;
> > 3. (and most important) is scalability: we may want to expand the
> > syntax for copying the table instead of moving it so instead of
> > MIGRATE we put COPY and that's it: same formula, clear how it works,
> > easy to interchange between MIGRATE and COPY.
> >
> > I've already thought about MOVE as a better keyword but sadly enough
> > it is not reserved. So how about the proposed syntax with MIGRATE?
>
> MIGRATE is not reserved, by the way, it's a non-reserved keyword.
>
> If it's my suggested synax from above, but with MIGRATE, it becomes
>
>   ALTER TABLE ... MIGRATE TABLE tbl_name TO PARTITION (partition definition)
>   ALTER TABLE ... MIGRATE PARTITION part_name TO TABLE tbl_name
>
> so very symmetrical.
>
>   ALTER TABLE ...  ADD PARTITION ... MIGRATE FROM TABLE ...
>
> looks kind of backward to me. At least if the reverse is MIGRATE PARTITION TO TABLE.
>
> May be MIGRATE PARTITION TO TABLE and MIGRATE PARTITION FROM TABLE ?
> Instead of MIGRATE TABLE FROM PARTITION.

Looks nice. Made this kind of syntax.

>
> > > I'm not saying anything of the above is perfect, and better variants are
> > > very much welcome. But pretty much anything is better than ADD/EXTRACT.
> > >
> > > >    { "FALSE",           SYM(FALSE_SYM)},
> > > >    { "FAST",            SYM(FAST_SYM)},
> > > >    { "FAULTS",  SYM(FAULTS_SYM)},
> > > > diff --git a/include/my_dbug.h b/include/my_dbug.h
> > > > index e25bfcf28a7..cf969ce5750 100644
> > > > --- a/include/my_dbug.h
> > > > +++ b/include/my_dbug.h
> > > > @@ -104,6 +104,8 @@ extern int (*dbug_sanity)(void);
> > > >          (_db_keyword_(0,(keyword), 0) ? (a1) : (a2))
> > > >  #define DBUG_EVALUATE_IF(keyword,a1,a2) \
> > > >          (_db_keyword_(0,(keyword), 1) ? (a1) : (a2))
> > > > +#define DBUG_TRUE_IF(keyword) DBUG_EVALUATE_IF(keyword, 1, 0)
> > > > +#define DBUG_FALSE_IF(keyword) DBUG_EVALUATE_IF(keyword, 0, 1)
> > >
> > > hmm. DBUG_FALSE_IF is not used anywhere, DBUG_TRUE_IF is used once as
> > >   if (DBUG_TRUE_IF("error_extract_partition_00") ||
> > >       unlikely(error= file->ha_rename_table(from_name, to_name)))
> > >
> > > and this is typically written as
> > >
> > >   if (DBUG_EVALUATE_IF("error_extract_partition_00", 1,
> > >         error= file->ha_rename_table(from_name, to_name)))
> > >
> > > so I see no need for either DBUG_TRUE_IF or DBUG_FALSE_IF,
> > > please, remove both
> >
> > Please, (1) looks much better! Considering that there are no complex
> > expressions for a1 in the code but only 1 or 0, we should use
> > DBUG_TRUE_IF/DBUG_FALSE_IF instead. I can change all entries of
> > DBUG_EVALUATE_IF for that. This is supposed to be an independent refactoring.
>
> Okay, indeed, it seems that DBUG_EVALUATE_IF is almost always used with
> one of the arguments being 0 or 1. If you replace all
> DBUG_EVALUATE_IF's, let's just remove it completely.
> Note, that DBUG_TRUE_IF can be defined simply as
>
>   #define DBUG_TRUE_IF(keyword)  _db_keyword_(0, (keyword), 1)
>
> so may be you'd like to call id DBUG_KEYWORD_IF or something?
> Although DBUG_TRUE_IF is shorter :)

It turned out DBUG_IF() is even more short! ;)

>
> > > >  #define DBUG_PRINT(keyword,arglist) \
> > > >          do if (_db_pargs_(__LINE__,keyword)) _db_doprnt_ arglist; while(0)
> > > >
> > > > diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc
> > > > index 85d880cbbb4..4172a61812e 100644
> > > > --- a/sql/sql_partition.cc
> > > > +++ b/sql/sql_partition.cc
> > > > @@ -5432,7 +5432,7 @@ uint prep_alter_part_table(THD *thd, TABLE *table, Alter_info *alter_info,
> > > >        } while (++part_count < tab_part_info->num_parts);
> > > >        if (num_parts_found != num_parts_dropped)
> > > >        {
> > > > -        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), "DROP");
> > > > +        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), cmd);
> > >
> > > Not part of your patch, I know, but the error message here is
> > >
> > >    "Error in list of partitions to %-.64s"
> > >
> > > shouldn't it say "Partition does not exist" or something?
> >
> > Why does DROP use such wording then? I don't think here should be different
> > error code, but the code name itself should be something like
> > ER_PARTITION_NOT_EXISTS.
>
> I don't know why.
> I agree it should be ER_PARTITION_NOT_EXISTS, not a different error code.
> I meant that perhaps we should fix the error message text?

Renamed this error code and ugly ER_KEY_COLUMN_DOES_NOT_EXITS
Changed the message.

>
> > > >          goto err;
> > > >        }
> > > >        if (table->file->is_fk_defined_on_table_or_index(MAX_KEY))
> > > > @@ -7268,6 +7420,48 @@ uint fast_alter_partition_table(THD *thd, TABLE *table,
> > > >      if (alter_partition_lock_handling(lpt))
> > > >        goto err;
> > > >    }
> > > > +  else if (alter_info->partition_flags & ALTER_PARTITION_EXTRACT)
> > > > +  {
> > > > +    if (mysql_write_frm(lpt, WFRM_WRITE_EXTRACTED) ||
> > > > +        write_log_drop_shadow_frm(lpt) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_1") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_1") ||
> > > > +        mysql_write_frm(lpt, WFRM_WRITE_SHADOW) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_2") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_2") ||
> > > > +        wait_while_table_is_used(thd, table, HA_EXTRA_NOT_USED) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_3") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_3") ||
> > > > +        write_log_extract_partition(lpt) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_4") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_4") ||
> > > > +        alter_close_table(lpt) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_5") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_5") ||
> > > > +        alter_partition_extract(lpt) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_7") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_7") ||
> > > > +        (frm_install= TRUE, FALSE) ||
> > > > +        mysql_write_frm(lpt, WFRM_INSTALL_SHADOW|WFRM_BACKUP_ORIGINAL) ||
> > > > +        log_partition_alter_to_ddl_log(lpt) ||
> > > > +        (frm_install= FALSE, FALSE) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_8") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_8") ||
> > > > +        (ddl_log_complete(lpt->part_info), false) ||
> > > > +        ((!thd->lex->no_write_to_binlog) &&
> > > > +         (write_bin_log(thd, FALSE,
> > > > +                        thd->query(), thd->query_length()), FALSE)) ||
> > >
> > > if you crash here, what will roll forward on recovery and
> > > drop the backup? The log is already closed here, isn't it?
> > >
> > > I don't understand why you need a backup. Why not to install shadow
> > > after the binlog was written?
> >
> > If we install shadow after binlog was written and before
> > ddl_log_complete() and it crashes after binlog was written and before
> > ddl_log_complete() the chain will be closed via XID and we are left
> > with (or without) original frm and without installed frm. Let's
> > imagine we do not close chain via XID, then it is replayed back and we
> > are left with master-slave inconsistency. There are more examples of
> > how it can crash and how we can try to recover, but the simplest and
> > most reliable implementation is via WFRM_BACKUP_ORIGINAL. If you want
> > to discuss more, let's do that in chat.
>
> ok. let me check the new patch first, may be it'll answer my questions.
>
> > > > +        mysql_write_frm(lpt, WFRM_DROP_BACKUP) ||
> > > > +        ERROR_INJECT_CRASH("crash_extract_partition_9") ||
> > > > +        ERROR_INJECT_ERROR("fail_extract_partition_9"))
> > > > +    {
> > > > +      (void) ddl_log_revert(thd, lpt->part_info);
> > > > +      handle_alter_part_error(lpt, true, true, frm_install);
> > > > +      goto err;
> > > > +    }
> > > > +    if (alter_partition_lock_handling(lpt))
> > > > +      goto err;
> > > > +  }
>
> Regards,
> Sergei
> VP of MariaDB Server Engineering
> and security@xxxxxxxxxxx



-- 
All the best,

Aleksey Midenkov
@midenok


Follow ups

References