← Back to team overview

maria-developers team mailing list archive

Re: 12d5c4f2813: Fixed problems with S3 after "DROP TABLE FORCE" changes

 

Hi, Monty!

On Oct 16, Michael Widenius wrote:
> revision-id: 12d5c4f2813 (mariadb-10.5.4-190-g12d5c4f2813)
> parent(s): afa5f735f30
> author: Michael Widenius <michael.widenius@xxxxxxxxx>
> committer: Michael Widenius <michael.widenius@xxxxxxxxx>
> timestamp: 2020-09-13 17:48:15 +0300
> message:
> 
> Fixed problems with S3 after "DROP TABLE FORCE" changes
> 
> MDEV-23691 S3 storage engine: delayed slave can drop the table

please put the MDEV on the first line of the commit comment

> 
> This commit also fixes all failing replication S3 tests.
> 
> A slave is delayed if it is trying to execute replicated queries on a
> table that is already converted to S3 by the master.

What do you mean, a slave is delayed?

> 
> Fixes for replication events on S3 tables for delayed slaves:
> - INSERT and INSERT ... SELECT and CREATE TABLE are ignored but written
>   to the binary log.   UPDATE & DELETE will be fixed in a future commit.
> 
> Other things:
> - On slaves with --s3-slave-ignore-updates set, allow S3 tables to be
>   opened in read-write mode. This was done to be able to
>   ignore-but-replicate queries like insert.  Without this change any
>   open of an S3 table failed with 'Table is read only' which is too
>   early to be able to replicate the original query.
> - Errors are now printed if handler::extra() call fails in
>   wait_while_tables_are_used()
> - Error message for row changes are changed from HA_ERR_WRONG_COMMAND
>   to HA_ERR_TABLE_READONLY
> - Disable some maria_extra() calls for S3 tables. This could cause
>   S3 tables to fail in some cases.
> - Added missing thr_lock_delete() to ma_open() in case of failure
> 
> diff --git a/sql/log.cc b/sql/log.cc
> index 2a887a68606..9c656387e90 100644
> --- a/sql/log.cc
> +++ b/sql/log.cc
> @@ -2151,6 +2151,12 @@ static int binlog_commit(handlerton *hton, THD *thd, bool all)
>  
>      DBUG_RETURN(0);
>    }
> +  /*
> +    This is true if we are doing an alter table that is replicated as
> +    CREATE TABLE ... SELECT
> +  */
> +  if (thd->variables.option_bits & OPTION_BIN_COMMIT_OFF)
> +    DBUG_RETURN(0);

why binlog_commit is even called here?

>  
>    DBUG_PRINT("debug",
>               ("all: %d, in_transaction: %s, all.modified_non_trans_table: %s, stmt.modified_non_trans_table: %s",
> diff --git a/sql/sql_class.cc b/sql/sql_class.cc
> index f4cbda5490c..7c103756566 100644
> --- a/sql/sql_class.cc
> +++ b/sql/sql_class.cc
> @@ -7476,6 +7476,34 @@ int THD::binlog_query(THD::enum_binlog_query_type qtype, char const *query_arg,
>    DBUG_RETURN(0);
>  }
>  
> +
> +/**
> +  Binlog current query as a statement if binary log is open
> +
> +  @retval false   ok
> +  @retval true    error
> +*/
> +
> +
> +bool THD::binlog_current_query()
> +{
> +  if (!mysql_bin_log.is_open())
> +    return 0;
> +
> +  /*
> +    The following is needed if the query was "ignored" by a shared distributed
> +    engine, in which case decide_logging_format() will set the binlog filter
> +  */
> +  reset_binlog_local_stmt_filter();
> +  clear_binlog_local_stmt_filter();
> +  return binlog_query(THD::STMT_QUERY_TYPE, query(), query_length(),
> +                      /* is_trans */     FALSE,
> +                      /* direct */       FALSE,
> +                      /* suppress_use */ FALSE,
> +                      /* Error */        0) > 0;
> +}

you created a generally looking function with a rather special
behavior. Please rename it to make sure it's not a general purpose
helper. For example, call it binlog_current_query_unfiltered() or
force_binlog_current_query(). And the function comment should say

  Force-binlog current query as a statement ignoring the binlog filter setting

> +
> +
>  void
>  THD::wait_for_wakeup_ready()
>  {
> diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> index 4ad4c478937..5f59e064a35 100644
> --- a/sql/sql_insert.cc
> +++ b/sql/sql_insert.cc
> @@ -724,6 +724,7 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
>    Item *unused_conds= 0;
>    DBUG_ENTER("mysql_insert");
>  
> +  bzero((char*) &info,sizeof(info));

why did you move it up?

>    create_explain_query(thd->lex, thd->mem_root);
>    /*
>      Upgrade lock type if the requested lock is incompatible with
> @@ -764,16 +765,27 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
>      DBUG_RETURN(TRUE);
>    value_count= values->elements;
>  
> -  if (mysql_prepare_insert(thd, table_list, table, fields, values,
> -                           update_fields, update_values, duplic,
> -                           &unused_conds, FALSE))
> +  if ((res= mysql_prepare_insert(thd, table_list, table, fields, values,
> +                                 update_fields, update_values, duplic,
> +                                 &unused_conds, FALSE)))
> +  {
> +    retval= thd->is_error();
> +    if (res < 0)
> +    {
> +      /*
> +        Insert should be ignored but we have to log the query in statement
> +        format in the binary log
> +      */
> +      res= thd->binlog_current_query();

abort label doesn't look at `res` it returns `retval`.
So I don't think you need `retval= thd->is_error()` at all,
and instead you need

   if (res < 0)
     retval= thd->binlog_current_query();

> +    }
>      goto abort;
> +  }
> +  /* mysql_prepare_insert sets table_list->table if it was not set */
> +  table= table_list->table;
>  
>    /* Prepares LEX::returing_list if it is not empty */
>    if (returning)
>      result->prepare(returning->item_list, NULL);
> -  /* mysql_prepare_insert sets table_list->table if it was not set */
> -  table= table_list->table;

why did you move this up?

>  
>    context= &thd->lex->first_select_lex()->context;
>    /*
> @@ -1575,21 +1587,29 @@ bool mysql_prepare_insert(THD *thd, TABLE_LIST *table_list,
>    DBUG_ASSERT (!select_insert || !values);
>  
>    if (mysql_handle_derived(thd->lex, DT_INIT))
>      DBUG_RETURN(1);
>    if (table_list->handle_derived(thd->lex, DT_MERGE_FOR_INSERT))
>      DBUG_RETURN(1);
>    if (thd->lex->handle_list_of_derived(table_list, DT_PREPARE))
>      DBUG_RETURN(1);
>  
>    if (duplic == DUP_UPDATE)
>    {
>      /* it should be allocated before Item::fix_fields() */
>      if (table_list->set_insert_values(thd->mem_root))
>        DBUG_RETURN(1);
> +  }
> +
> +  if (!table)
> +    table= table_list->table;

As far as I can see, table in mysql_prepare_insert is *always*
either table_list->table or NULL. At least in 10.5.

So I'd rather remove the table argument completely and let
mysql_prepare_insert always use table_list->table.

> +
> +  if (table->file->check_if_updates_are_ignored("INSERT"))
> +  {
> +    DBUG_RETURN(-1);
>    }
>  
>    if (mysql_prepare_insert_check_table(thd, table_list, fields, select_insert))
>      DBUG_RETURN(1);
>  
>    /* Prepare the fields in the statement. */
>    if (values)
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 15d190c3139..2843496f783 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -5206,7 +5215,19 @@ int create_table_impl(THD *thd, const LEX_CSTRING &orig_db,
>            goto err;
>        }
>        else if (options.if_not_exists())
> +      {
> +        /*
> +          We never come here as part of normal create table as table existance
> +          is  checked in open_and_lock_tables(). We may come here as part of
> +          ALTER TABLE
> +        */

How so? There is no ALTER IF NOT EXISTS.
And this is the old code, so it seems to be for CREATE TABLE IF NOT EXISTS.

> +
> +        /* Log CREATE IF NOT EXISTS on slave for distributed engines */
> +        if (thd->slave_thread && (exists_hton && exists_hton->flags &
> +                                  HTON_IGNORE_UPDATES))
> +          thd->log_current_statement= 1;
>          goto warn;

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx


Follow ups