← Back to team overview

maria-developers team mailing list archive

Re: 12d5c4f2813: Fixed problems with S3 after "DROP TABLE FORCE" changes

 

Hi!

On Sun, Oct 18, 2020 at 3:38 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Monty!

> > Fixed problems with S3 after "DROP TABLE FORCE" changes
> >
> > MDEV-23691 S3 storage engine: delayed slave can drop the table
>
> please put the MDEV on the first line of the commit comment

I prefer to have the first row readable and understandable for all
possible readers, which the MDEV
doesn't do.  I can move the MDEV to the first row in this case, but if
the commit touches
several MDEV's, I will list these separately and keep the first row as
a summary of what the commit is for.

> > This commit also fixes all failing replication S3 tests.
> >
> > A slave is delayed if it is trying to execute replicated queries on a
> > table that is already converted to S3 by the master.
>
> What do you mean, a slave is delayed?

The above explains it, as those the MDEV.
It means that the slave is not up to date with the master (either
intentionally with stop slave or because the slave still has local
not-s3 tables as it has not yet seen ALTER TABLE ... engine=s3).

The assumption is also that anyone reading the commit knows what
'slave' and 'master' and 'delayed slave' means

<cut>

> > diff --git a/sql/log.cc b/sql/log.cc
> > index 2a887a68606..9c656387e90 100644
> > --- a/sql/log.cc
> > +++ b/sql/log.cc
> > @@ -2151,6 +2151,12 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
> >
> >      DBUG_RETURN(0);
> >    }
> > +  /*
> > +    This is true if we are doing an alter table that is replicated as
> > +    CREATE TABLE ... SELECT
> > +  */
> > +  if (thd->variables.option_bits & OPTION_BIN_COMMIT_OFF)
> > +    DBUG_RETURN(0);
>
> why binlog_commit is even called here?

Binlog_drop_table() and binlog_create_table() calls binlog_query(),
which always does a commit
for any query events.
<cut>

> > +    The following is needed if the query was "ignored" by a shared distributed
> > +    engine, in which case decide_logging_format() will set the binlog filter
> > +  */
> > +  reset_binlog_local_stmt_filter();
> > +  clear_binlog_local_stmt_filter();
> > +  return binlog_query(THD::STMT_QUERY_TYPE, query(), query_length(),
> > +                      /* is_trans */     FALSE,
> > +                      /* direct */       FALSE,
> > +                      /* suppress_use */ FALSE,
> > +                      /* Error */        0) > 0;
> > +}

> you created a generally looking function with a rather special
> behavior. Please rename it to make sure it's not a general purpose
> helper. For example, call it binlog_current_query_unfiltered() or
> force_binlog_current_query(). And the function comment should say
>
>   Force-binlog current query as a statement ignoring the binlog filter setting

First a note. The function can be used anywhere where one would binlog
the current query.
The ..._local_stmt_filter() functions are only there to force the
query to be logged statement
based even if decide_logging_format(), used by some statement, would
have wanted to
do things differently.

I have updated the function comment.
I am not sure how many will understand the word 'unfiltered'
I have expanded the function comment to explain this in more detail and also
renamed the function binlog_current_query_unfiltered().

> >  void
> >  THD::wait_for_wakeup_ready()
> >  {
> > diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> > index 4ad4c478937..5f59e064a35 100644
> > --- a/sql/sql_insert.cc
> > +++ b/sql/sql_insert.cc
> > @@ -724,6 +724,7 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
> >    Item *unused_conds= 0;
> >    DBUG_ENTER("mysql_insert");
> >
> > +  bzero((char*) &info,sizeof(info));
>
> why did you move it up?

Proper place to have it at start. Not directly related to this patch.


> >    create_explain_query(thd->lex, thd->mem_root);
> >    /*
> >      Upgrade lock type if the requested lock is incompatible with
> > @@ -764,16 +765,27 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
> >      DBUG_RETURN(TRUE);
> >    value_count= values->elements;
> >
> > -  if (mysql_prepare_insert(thd, table_list, table, fields, values,
> > -                           update_fields, update_values, duplic,
> > -                           &unused_conds, FALSE))
> > +  if ((res= mysql_prepare_insert(thd, table_list, table, fields, values,
> > +                                 update_fields, update_values, duplic,
> > +                                 &unused_conds, FALSE)))
> > +  {
> > +    retval= thd->is_error();
> > +    if (res < 0)
> > +    {
> > +      /*
> > +        Insert should be ignored but we have to log the query in statement
> > +        format in the binary log
> > +      */
> > +      res= thd->binlog_current_query();
>
> abort label doesn't look at `res` it returns `retval`.

Yes, this is correct.  This is why 'retval' is set to is_error() above.
res is only used to know if the binlog

> So I don't think you need `retval= thd->is_error()` at all,
> and instead you need

>    if (res < 0)
>      retval= thd->binlog_current_query();

Actually this is also wrong as it would reset 'result' if it was set
by is_error() ;)
I have changed it to:

      if (thd->binlog_current_query_unfiltered())
        retval= 1;

> > +    }
> >      goto abort;
> > +  }
> > +  /* mysql_prepare_insert sets table_list->table if it was not set */
> > +  table= table_list->table;
> >
> >    /* Prepares LEX::returing_list if it is not empty */
> >    if (returning)
> >      result->prepare(returning->item_list, NULL);
> > -  /* mysql_prepare_insert sets table_list->table if it was not set */
> > -  table= table_list->table;
>
> why did you move this up?

I got crashes is result->prepared and got confused in the debugger
that 'table' was not
set.  Better to set the table early instead of having it undefined for
a longer amount of time.

> >
> >    context= &thd->lex->first_select_lex()->context;
> >    /*
> > @@ -1575,21 +1587,29 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
> >    DBUG_ASSERT (!select_insert || !values);
> >
> >    if (mysql_handle_derived(thd->lex, DT_INIT))
> >      DBUG_RETURN(1);
> >    if (table_list->handle_derived(thd->lex, DT_MERGE_FOR_INSERT))
> >      DBUG_RETURN(1);
> >    if (thd->lex->handle_list_of_derived(table_list, DT_PREPARE))
> >      DBUG_RETURN(1);
> >
> >    if (duplic == DUP_UPDATE)
> >    {
> >      /* it should be allocated before Item::fix_fields() */
> >      if (table_list->set_insert_values(thd->mem_root))
> >        DBUG_RETURN(1);
> > +  }
> > +
> > +  if (!table)
> > +    table= table_list->table;
>
> As far as I can see, table in mysql_prepare_insert is *always*
> either table_list->table or NULL. At least in 10.5.

> So I'd rather remove the table argument completely and let
> mysql_prepare_insert always use table_list->table.

Agree that it's safe to remove the table argument. Will do that.

> > --- a/sql/sql_table.cc
> > +++ b/sql/sql_table.cc
> > @@ -5206,7 +5215,19 @@ int create_table_impl(THD *thd, const LEX_CSTRING &orig_db,
> >            goto err;
> >        }
> >        else if (options.if_not_exists())
> > +      {
> > +        /*
> > +          We never come here as part of normal create table as table existance
> > +          is  checked in open_and_lock_tables(). We may come here as part of
> > +          ALTER TABLE
> > +        */
>
> How so? There is no ALTER IF NOT EXISTS.
> And this is the old code, so it seems to be for CREATE TABLE IF NOT EXISTS.

Alter table s3_table engine=innodb table, could come here.
I added the comment to clarify this.

Regards,
Monty


References