maria-developers team mailing list archive

Thread
Date
Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table

To: Aleksey Midenkov <midenok@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Thu, 26 Aug 2021 20:58:30 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CAF8BazAfMe4zbgbLWUpoj1XG-ChoffiRA01fr1+GaPcbrocZ5Q@mail.gmail.com>
Hi, Aleksey!

Some replies.

On Aug 26, Aleksey Midenkov wrote:
> > > diff --git a/sql/lex.h b/sql/lex.h
> > > index cbf9d9d51b2..55eefc39a7e 100644
> > > --- a/sql/lex.h
> > > +++ b/sql/lex.h
> > > @@ -240,6 +240,7 @@ SYMBOL symbols[] = {
> > >    { "EXPLAIN",         SYM(DESCRIBE)},
> > >    { "EXTENDED",        SYM(EXTENDED_SYM)},
> > >    { "EXTENT_SIZE",     SYM(EXTENT_SIZE_SYM)},
> > > +  { "EXTRACT",         SYM(EXTRACT_SYM)},
> >
> > So, you did EXTRACT after all, not MIGRATE?
> 
> I did like the original description said but I agree that this should be
> changed to something better. Moreover EXTRACT parsing is now
> incorrect, see
> 
> git show 3b693cd3d9 mysql-test/main/parser.result

Yes, I've noticed, didn't comment because we still discuss the syntax.
If we'll decide to use EXTRACT after all, then perhaps it could be made
non-reserved.

> > ADD/EXTRACT aren't opposites in SQL.
> >
> > It would be rather consistent to have
> >
> >     ALTER ... ADD PARTITION (...) FROM table_name
> >
> > and
> >
> >     ALTER ... DROP PARTITION ... TO table_name
> >
> > except that DROP...TO looks kind of strange, and ADD...FROM
> > does not convey the meaning that the table disappears.
> >
> > The second issue also applies to IMPORT/EXPORT.
> >
> > Another option, also kind of strange, but also consistent and
> > makes it clear that the table/partition disappears:
> >
> >     ALTER ... MOVE TABLE table_name TO PARTITION (...)
> >     ALTER ... MOVE PARTITION ... TO TABLE table_name
> >
> > or the same with CHANGE instead of MOVE to reuse an existing keyword.
> 
> There is no MOVE symbol and we should not add a new one because of the
> same compatibility reason: it will not be possible to create table
> "move" (as well as to work with existing one I suspect).

it will be, if MOVE won't be a reserved work.
But still, keywords, even non-reserved, make the parser a bit slower.
And may be a keyword in that context has to be reserved?

So, yes, I totally agree that it's best to avoid adding new keywords to
the grammar.

> That's why we found MIGRATE as a good candidate And the reasons to
> have some special keyword, not just "ADD PARTITION ... FROM ..." are:
> 
> 1. it is not clear whether the source table stays or not (that was the
> discussion in MDEV-22165);
> 2. for keyword like MIGRATE it is clear how it works in both
> directions and no need to remember or guess the opposite keyword;
> 3. (and most important) is scalability: we may want to expand the
> syntax for copying the table instead of moving it so instead of
> MIGRATE we put COPY and that's it: same formula, clear how it works,
> easy to interchange between MIGRATE and COPY.
> 
> I've already thought about MOVE as a better keyword but sadly enough
> it is not reserved. So how about the proposed syntax with MIGRATE?

MIGRATE is not reserved, by the way, it's a non-reserved keyword.

If it's my suggested synax from above, but with MIGRATE, it becomes

  ALTER TABLE ... MIGRATE TABLE tbl_name TO PARTITION (partition definition)
  ALTER TABLE ... MIGRATE PARTITION part_name TO TABLE tbl_name

so very symmetrical.

  ALTER TABLE ...  ADD PARTITION ... MIGRATE FROM TABLE ...

looks kind of backward to me. At least if the reverse is MIGRATE PARTITION TO TABLE.

May be MIGRATE PARTITION TO TABLE and MIGRATE PARTITION FROM TABLE ?
Instead of MIGRATE TABLE FROM PARTITION.

> > I'm not saying anything of the above is perfect, and better variants are
> > very much welcome. But pretty much anything is better than ADD/EXTRACT.
> >
> > >    { "FALSE",           SYM(FALSE_SYM)},
> > >    { "FAST",            SYM(FAST_SYM)},
> > >    { "FAULTS",  SYM(FAULTS_SYM)},
> > > diff --git a/include/my_dbug.h b/include/my_dbug.h
> > > index e25bfcf28a7..cf969ce5750 100644
> > > --- a/include/my_dbug.h
> > > +++ b/include/my_dbug.h
> > > @@ -104,6 +104,8 @@ extern int (*dbug_sanity)(void);
> > >          (_db_keyword_(0,(keyword), 0) ? (a1) : (a2))
> > >  #define DBUG_EVALUATE_IF(keyword,a1,a2) \
> > >          (_db_keyword_(0,(keyword), 1) ? (a1) : (a2))
> > > +#define DBUG_TRUE_IF(keyword) DBUG_EVALUATE_IF(keyword, 1, 0)
> > > +#define DBUG_FALSE_IF(keyword) DBUG_EVALUATE_IF(keyword, 0, 1)
> >
> > hmm. DBUG_FALSE_IF is not used anywhere, DBUG_TRUE_IF is used once as
> >   if (DBUG_TRUE_IF("error_extract_partition_00") ||
> >       unlikely(error= file->ha_rename_table(from_name, to_name)))
> >
> > and this is typically written as
> >
> >   if (DBUG_EVALUATE_IF("error_extract_partition_00", 1,
> >         error= file->ha_rename_table(from_name, to_name)))
> >
> > so I see no need for either DBUG_TRUE_IF or DBUG_FALSE_IF,
> > please, remove both
> 
> Please, (1) looks much better! Considering that there are no complex
> expressions for a1 in the code but only 1 or 0, we should use
> DBUG_TRUE_IF/DBUG_FALSE_IF instead. I can change all entries of
> DBUG_EVALUATE_IF for that. This is supposed to be an independent refactoring.

Okay, indeed, it seems that DBUG_EVALUATE_IF is almost always used with
one of the arguments being 0 or 1. If you replace all
DBUG_EVALUATE_IF's, let's just remove it completely.
Note, that DBUG_TRUE_IF can be defined simply as

  #define DBUG_TRUE_IF(keyword)  _db_keyword_(0, (keyword), 1)

so may be you'd like to call id DBUG_KEYWORD_IF or something?
Although DBUG_TRUE_IF is shorter :)

> > >  #define DBUG_PRINT(keyword,arglist) \
> > >          do if (_db_pargs_(__LINE__,keyword)) _db_doprnt_ arglist; while(0)
> > >
> > > diff --git a/sql/sql_partition.cc b/sql/sql_partition.cc
> > > index 85d880cbbb4..4172a61812e 100644
> > > --- a/sql/sql_partition.cc
> > > +++ b/sql/sql_partition.cc
> > > @@ -5432,7 +5432,7 @@ uint prep_alter_part_table(THD *thd, TABLE *table, Alter_info *alter_info,
> > >        } while (++part_count < tab_part_info->num_parts);
> > >        if (num_parts_found != num_parts_dropped)
> > >        {
> > > -        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), "DROP");
> > > +        my_error(ER_DROP_PARTITION_NON_EXISTENT, MYF(0), cmd);
> >
> > Not part of your patch, I know, but the error message here is
> >
> >    "Error in list of partitions to %-.64s"
> >
> > shouldn't it say "Partition does not exist" or something?
> 
> Why does DROP use such wording then? I don't think here should be different
> error code, but the code name itself should be something like
> ER_PARTITION_NOT_EXISTS.

I don't know why.
I agree it should be ER_PARTITION_NOT_EXISTS, not a different error code.
I meant that perhaps we should fix the error message text?

> > >          goto err;
> > >        }
> > >        if (table->file->is_fk_defined_on_table_or_index(MAX_KEY))
> > > @@ -7268,6 +7420,48 @@ uint fast_alter_partition_table(THD *thd, TABLE *table,
> > >      if (alter_partition_lock_handling(lpt))
> > >        goto err;
> > >    }
> > > +  else if (alter_info->partition_flags & ALTER_PARTITION_EXTRACT)
> > > +  {
> > > +    if (mysql_write_frm(lpt, WFRM_WRITE_EXTRACTED) ||
> > > +        write_log_drop_shadow_frm(lpt) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_1") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_1") ||
> > > +        mysql_write_frm(lpt, WFRM_WRITE_SHADOW) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_2") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_2") ||
> > > +        wait_while_table_is_used(thd, table, HA_EXTRA_NOT_USED) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_3") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_3") ||
> > > +        write_log_extract_partition(lpt) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_4") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_4") ||
> > > +        alter_close_table(lpt) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_5") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_5") ||
> > > +        alter_partition_extract(lpt) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_7") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_7") ||
> > > +        (frm_install= TRUE, FALSE) ||
> > > +        mysql_write_frm(lpt, WFRM_INSTALL_SHADOW|WFRM_BACKUP_ORIGINAL) ||
> > > +        log_partition_alter_to_ddl_log(lpt) ||
> > > +        (frm_install= FALSE, FALSE) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_8") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_8") ||
> > > +        (ddl_log_complete(lpt->part_info), false) ||
> > > +        ((!thd->lex->no_write_to_binlog) &&
> > > +         (write_bin_log(thd, FALSE,
> > > +                        thd->query(), thd->query_length()), FALSE)) ||
> >
> > if you crash here, what will roll forward on recovery and
> > drop the backup? The log is already closed here, isn't it?
> >
> > I don't understand why you need a backup. Why not to install shadow
> > after the binlog was written?
> 
> If we install shadow after binlog was written and before
> ddl_log_complete() and it crashes after binlog was written and before
> ddl_log_complete() the chain will be closed via XID and we are left
> with (or without) original frm and without installed frm. Let's
> imagine we do not close chain via XID, then it is replayed back and we
> are left with master-slave inconsistency. There are more examples of
> how it can crash and how we can try to recover, but the simplest and
> most reliable implementation is via WFRM_BACKUP_ORIGINAL. If you want
> to discuss more, let's do that in chat.

ok. let me check the new patch first, may be it'll answer my questions.

> > > +        mysql_write_frm(lpt, WFRM_DROP_BACKUP) ||
> > > +        ERROR_INJECT_CRASH("crash_extract_partition_9") ||
> > > +        ERROR_INJECT_ERROR("fail_extract_partition_9"))
> > > +    {
> > > +      (void) ddl_log_revert(thd, lpt->part_info);
> > > +      handle_alter_part_error(lpt, true, true, frm_install);
> > > +      goto err;
> > > +    }
> > > +    if (alter_partition_lock_handling(lpt))
> > > +      goto err;
> > > +  }

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups

Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table
From: Sergei Golubchik, 2021-09-04
Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table
From: Aleksey Midenkov, 2021-08-30
References

Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table
From: Sergei Golubchik, 2021-08-25
Re: 6768e3a2830: MDEV-22166 MIGRATE PARTITION: move out partition into a table
From: Aleksey Midenkov, 2021-08-26