← Back to team overview

maria-developers team mailing list archive

Re: 250bc3b6d74: Ensure that table is truly dropped when using DROP TABLE

 

> Hi, Michael!

Please use Monty. No reason to be formal!

> > Ensure that table is truly dropped when using DROP TABLE
> >
> > MDEV-11412 AliSQL: [Feature] Issue#34 Support force drop table
>
> please, put this line ^^^ first in the comment comment.

Already done, as we discussed

> > The used code is largely based on code from Tencent
> >
> > The problem is that in some rare cases there may be a conflict between .frm
> > files and the files in the storage engine. In this case the DROP TABLE
> > was not able to properly drop the table.
> >
> > Some MariaDB/MySQL forks has solved this by adding a FORCE option to
>
> TSQL? Or some other fork?
>

At least Alibaba and TSQL.


> > DROP TABLE. After some discussion among MariaDB developers, we concluded
> > that users expects that DROP TABLE should always work, even if the
> > table would not be consistent. There should not be a need to use a
> > separate keyword to ensure that the table is really deleted.
> >
> > The used solution is:
> > - If a .frm table doesn't exists, try dropping the table from all storage
> >   engines.
> > - If the .frm table exists but the table does not exist in the engine
> >   try dropping the table from all storage engines.
>
> I don't see why is that needed, if .frm exists, it tells exactly what
> storage engine to use, there is no point in trying other engines.

I had cases on my machine where I had a frm but tables from another engine.
May happen when you do an alter table from one engine to another.

If the .frm is inconsistent, there is no harm to try a drop in all engines.
After all, this is not something that happens often and when it happens,
better to check things around.


<cut>

> >
> > Bugs fixed introduced by the original version of this commit:
> > MDEV-22826 Presence of Spider prevents tables from being force-deleted from
> >            other engines
>
> I don't think this part of the comment makes any sense, when one looks
> through git history there is no "original version of this commit".

I want to record that a MDEV is fixed by this commit. I think that is
perfectly reasonable and correct thing to do. So this is more a comment
for Jira than anything else.

> It's like a rolled back transaction, that value did not exist, unless
> you use READ UNCOMMITTED :)

Yes, it doesn't exist in the code, but it exists in Jira.
I assume you would not like me to delete the Jira entry?

> > diff --git a/mysql-test/main/drop_table_force.test b/mysql-test/main/drop_table_force.test
> > new file mode 100644
> > index 00000000000..8fdd79465b7
> > --- /dev/null
> > +++ b/mysql-test/main/drop_table_force.test
> > @@ -0,1 +1,197 @@
> >  +--source include/have_log_bin.inc
> > +--source include/have_innodb.inc
> > +
> > +#
> > +# This test is based on the orginal test from Tencent for DROP TABLE ... FORCE
> > +# In MariaDB we did reuse the code but MariaDB does not require the FORCE
> > +# keyword to drop a table even if the .frm file or some engine files are
> > +# missing.
> > +# To make it easy to see the differences between the orginal code and
> > +# the new one, we have left some references to the original test case
> > +#
> > +
> > +CALL mtr.add_suppression("Operating system error number");
> > +CALL mtr.add_suppression("The error means the system cannot");
> > +CALL mtr.add_suppression("returned OS error 71");
>
> not sure it's a good idea, error numbers aren't very portable, this will
> blow up some day like a time bomb

This is from the original commit. I don't think they are relevant anymore
but no harm in keeping them. Buildbot will show us if things are wrong,
so nothing that can really 'blow up'.

<cut>
> > +# create test user
> > +create user test identified by '123456';
> > +grant all privileges on test.t1 to 'test'@'%'identified by '123456' with grant option;
>
> why with grant option?
> better remove it if it's not essential for drop to work.
> if it is essential, why is it?

Read the comment in the beginning of the test.
This is to test that one doesn't need a separate privilge to be able to
drop an inconsistent table.



> > +
> > +# connect as test
> > +connect (con_test, localhost, test,'123456', );
> > +--connection con_test
> > +
> > +# drop table with user test
> > +drop table t1;
> > +--error ER_BAD_TABLE_ERROR
> > +drop table t1;
> > +
> > +# connect as root
> > +--connection default
> > +
> > +--disconnect con_test
> > +drop user test;
> > +
> > +# check files in datadir about t1
> > +--list_files  $DATADIR/test/
> > +
> > +--echo #Test4: drop table can drop consistent table as well
> > +create table t1(a int) engine=innodb;
> > +drop table t1;
>
> aha, so it works! :)

Orginal code was drop table followed by drop table force.

<cut>

> > +# second drop with if exists will success
> > +drop table if exists t1;
>
> you've just tested it in Test7, haven't you?

Read again comment at the beginning of the test file.

<cut>

> > +# create Aria table t2 and rm .MAI and .MAD
> > +create table t2(a int) engine=aria;
> > +flush tables;
> > +--remove_file $DATADIR/test/t2.MAD
> > +--remove_file $DATADIR/test/t2.MAI
> > +--list_files  $DATADIR/test/
> > +--replace_result \\ /
> > +drop table t2;

> may be csv and archive tests too?

CSV has only one table file, so not that interesting as it's covered by
other tests.
Archive has discovery and thus will automatically remove the .frm or .ARZ
in case of missing files.

I have now added bot to make the test case more complete.


>
> > diff --git a/sql/handler.cc b/sql/handler.cc
> > index 1af0157783e..5dd02d057c7 100644
> > --- a/sql/handler.cc
> > +++ b/sql/handler.cc
> > @@ -2674,26 +2712,34 @@ const char *get_canonical_filename(handler *file, const char *path,
> >  }
> >
> >
> > -/** delete a table in the engine
> > +/**
> > +   Delete a table in the engine
> >
> > +   @return 0   Table was deleted
> > +   @return -1  Table didn't exists, no error given
> > +   @return #   Error from table handler
> > +
> >    @note
> >    ENOENT and HA_ERR_NO_SUCH_TABLE are not considered errors.
> > -  The .frm file will be deleted only if we return 0.
> > +  The .frm file should be deleted by the caller only if we return <= 0.
> >  */
> > +
> >  int ha_delete_table(THD *thd, handlerton *table_type, const char *path,
> > -                    const LEX_CSTRING *db, const LEX_CSTRING *alias, bool generate_warning)
> > +                    const LEX_CSTRING *db, const LEX_CSTRING *alias,
> > +                    bool generate_warning)
> >  {
> >    handler *file;
> >    char tmp_path[FN_REFLEN];
> >    int error;
> >    TABLE dummy_table;
> >    TABLE_SHARE dummy_share;
> > +  bool is_error= thd->is_error();
> >    DBUG_ENTER("ha_delete_table");
> >
> >    /* table_type is NULL in ALTER TABLE when renaming only .frm files */
> >    if (table_type == NULL || table_type == view_pseudo_hton ||
> >        ! (file=get_new_handler((TABLE_SHARE*)0, thd->mem_root, table_type)))
> > -    DBUG_RETURN(0);
> > +    DBUG_RETURN(-1);
>
> "Table didn't exists, no error given"?

Yes (for this function)

> if get_new_handler() fails it's OOM.

No problem with OOM as it sets thd->error() and the user will get that error
message.


> >    bzero((char*) &dummy_table, sizeof(dummy_table));
> >    bzero((char*) &dummy_share, sizeof(dummy_share));
> > @@ -2723,8 +2769,10 @@ int ha_delete_table(THD *thd, handlerton *table_type, const char *path,
> >      }
> >      if (intercept)
> >      {
> > +      /* Clear error if we got it in this function */
> > +      if (!is_error)
> >        thd->clear_error();
>
> this is generally a wrong approach. One should use push_internal_handler
> to silence errors, not thd->is_error() or thd->clear_error().

I know that silensing the error is normally wrong approach. Old code did that
without no testing which was definitely wrong and hide some errors.
However if we got a table not exists error from the handler and there
was no error from before, this is a safe way to do it.

> In this particilar case there will be no errors if there was no error
> before ha_delete_table() - which is correct. But if there was an error
> before, this will leave spurious and confusing errors in `SHOW WARNINGS`
> output.

The only thing that will be left in SHOW WARNINGS is a note that a file
didn't exist, not a big deal for an edge case.  In case of
ha_delete_table_force(), which is the one that can generate many warnings
from different engines, we have error handler in place.

> >    delete file;
> > @@ -4365,45 +4413,63 @@ uint handler::get_dup_key(int error)
> >
> >    @note
> >      We assume that the handler may return more extensions than
> > -    was actually used for the file.
> > +    was actually used for the file. We also assume that the first
> > +    extension is the most important one. If this exist and we can't delete
> > +    that one we will abort the delete.
>
> after "the most important one" you can add "(see the comment near
> handlerton::tablefile_extensions)"

Done.

>
> > +    If the first one doesn't exists, we have to try to delete all other
>
> "doesn't exist"
>
> > +    extension as there is chance that the server had crashed between
> > +    the delete of the first file and the next
> >
> >    @retval
> >      0   If we successfully deleted at least one file from base_ext and
> >      didn't get any other errors than ENOENT
> > +
> >    @retval
> >      !0  Error
> >  */
> > +
> >  int handler::delete_table(const char *name)
> >  {
> > -  int saved_error= 0;
> > -  int error= 0;
> > -  int enoent_or_zero;
> > +  int saved_error= ENOENT;
> > +  bool abort_if_first_file_error= 1;
> > +  bool some_file_deleted= 0;
> > +  DBUG_ENTER("handler::delete_table");
> >
> > +  // For discovery tables, it's ok if first file doesn't exists
>
> why would that be ok?

For tables with discovery, there may not be any files on disk. They may
be generated by the discovery process. That first file in a list is
missing doesn't mean that there isn't other files.

> >    for (const char **ext= bas_ext(); *ext ; ext++)
> >    {
> > +    int error;
> > -    if (mysql_file_delete_with_symlink(key_file_misc, name, *ext, 0))
> > +    if ((error= mysql_file_delete_with_symlink(key_file_misc, name, *ext,
> > +                                              MYF(0))))
>
> why do you need this `error` variable, if you never use it?

Mostly for debugging. The compiler will delete it, so no big deal.
I originally thought my_handler_delete_with_symlink() should return the
errno, but as it can generate several errno that didn't work.

<cut>

> >  /**
> > +   Return true if the error from drop table means that the
> > +   table didn't exists
> > +*/
> > +
> > +bool non_existing_table_error(int error)
> > +{
> > +  return (error == ENOENT || error == HA_ERR_NO_SUCH_TABLE ||
> > +          error == HA_ERR_UNSUPPORTED ||
> > +          error == ER_NO_SUCH_TABLE ||
> > +          error == ER_NO_SUCH_TABLE_IN_ENGINE ||
> > +          error == ER_WRONG_OBJECT);
> > +}
>
> this should, I believe, be inline in handler.h

After having to change this multiple times and have to recompile,
I added this to the .c file.
As this is only used for DDL's (which are slow), it's better to NOT have
it in the head file as it's generates less code to have it here.

<cut>

> > +static my_bool delete_table_force(THD *thd, plugin_ref plugin, void *arg)
> > +{
> > +  handlerton *hton = plugin_hton(plugin);
> > +  st_force_drop_table_params *param = (st_force_drop_table_params *)arg;
> > +
> > +  /*
> > +    We have to ignore HEAP tables as these may not have been created yet
> > +    We also remove engines that is using discovery (as these will recrate
> > +    any missing .frm if needed) and tables marked with
> > +    HTON_AUTOMATIC_DELETE_TABLE as for these we can't check if the table
> > +    ever existed.
> > +  */
> > +  if (!hton->discover_table && hton->db_type != DB_TYPE_HEAP &&
> > +      !(hton->flags & HTON_AUTOMATIC_DELETE_TABLE))
> > +  {
> > +    int error;
> > +    error= ha_delete_table(thd, hton, param->path, param->db,
> > +                           param->alias, param->generate_warning);
>
> This is doing a a lot of extra work. dummy_table/dummy_share, if()s.
> Allocating and constructing new handlers.

This is for extreme cases when there is no reason to optimize for
performance but less code is better.

> I suggest the following:
>
>   1. Introduce new handlerton method hton->drop_table().
>      By default it'll be
>
>         hton->get_new_handler()
>         get_canonical_filename()
>         file->ha_delete_table()
>
>      like now, so all engines will still work. But later we can start
>      moving drop table functionality out of the handler. In this commit
>      only engines like blackhole will have a dummy drop_table() method.
>

We already have two handler delete_table methods. Don't want to add yet
another one.

Note that a several handlers needs special handling. (Just check who have
HTON_AUTOMATIC_DELETE_TABLE), not just blackhole,

According to your suggestion, we would have to create a new
drop_table() for all engines and have flags when to generate errors
and which errors.  And we would have to duplicate some of that code in
ha_delete_table().

Definetly much more code than the current one and much more changes in
many engines.

>   2. Here you can do, basically
>
>        hton->drop_table(hton, param->path)
>
>      without checking for discovery, without a new flag
>      HTON_AUTOMATIC_DELETE_TABLE. I'm not sure why you check for
>      DB_TYPE_HEAP, could you please explain that?

See code comment!  HEAP TABLES are created on open, so when we call
delete_table_force() the table is not yet created. We can't detect
that a not yet created table exists, can we ?

<cut>


> > +int ha_delete_table_force(THD *thd, const char *path, const LEX_CSTRING *db,
> > +                          const LEX_CSTRING *alias, bool generate_warning)
> > +{
> > +  st_force_drop_table_params param;
> > +  Table_exists_error_handler no_such_table_handler;
> > +  DBUG_ENTER("ha_delete_table_force");
> > +
> > +  param.path=             path;
> > +  param.db=               db;
> > +  param.alias=            alias;
> > +  param.generate_warning= generate_warning;
>
> this is never used, you always invoke ha_delete_table_force() with
> generate_warning=0.

Yes, I know. This was in the original code and it was not used there either.
However, I can imagine there are scenarios where this would be useful in
the future, so I kept it. If you feel strongly about it I can delete the
generate_warning options.

<cut>

> > index 61b1113f680..7c79acef3ea 100644
> > --- a/sql/sql_table.cc
> > +++ b/sql/sql_table.cc
> > @@ -2327,36 +2329,42 @@ int mysql_rm_table_no_locks(THD *thd, TABLE_LIST *tables, bool if_exists,
> >                    thd->find_temporary_table(table) &&
> >                    table->mdl_request.ticket != NULL));
> >
> > +    /* First try to delete temporary tables and temporary sequences */
> > -    if (table->open_type == OT_BASE_ONLY || !is_temporary_table(table) ||
> > -        (drop_sequence && table->table->s->table_type != TABLE_TYPE_SEQUENCE))
> > -      error= 1;
> > -    else
> > +    if ((table->open_type != OT_BASE_ONLY && is_temporary_table(table)) &&
>
> unnecessary parentheses

Yes, but easier to read with != and other things and especially when
it's combined with another expression with !=

<cut>

> >        if (!dont_log_query && table_creation_was_logged)
> >        {
> >          /*
> > +          DROP TEMPORARY succeded. For the moment when we only come
> > +          here on success (error == 0)
>
> DBUG_ASSERT(error == 0) please

Not needed as the code makes it clear if one looks up a couple of lines.
The comment is more clear than an assert.

<cut>

> > @@ -2387,7 +2395,8 @@ int mysql_rm_table_no_locks(THD *thd, TABLE_LIST *tables, bool if_exists,
> >          is no need to proceed with the code that tries to drop a regular
> >          table.
> >        */
> > -      if (!error) continue;
> > +      if (temporary_table_was_dropped)
>
> can temporary_table_was_dropped be false here?

yes

The condition for the branch is:

   if ((drop_temporary && if_exists) || temporary_table_was_dropped)



>
> > +        continue;
> >      }
> >      else if (!drop_temporary)
> >      {
> > @@ -2402,48 +2411,33 @@ int mysql_rm_table_no_locks(THD *thd, TABLE_LIST *tables, bool if_exists,
> >        path_length= build_table_filename(path, sizeof(path) - 1, db.str,
> >                                          alias.str, reg_ext, 0);
> >      }
> > +
> >      DEBUG_SYNC(thd, "rm_table_no_locks_before_delete_table");
> >      error= 0;
> > -    if (drop_temporary ||
> > -        (ha_table_exists(thd, &db, &alias, &table_type, &is_sequence) == 0 &&
> > +    if (drop_temporary)
> > +    {
> > +      /* "DROP TEMPORARY" but a temporary table was not found */
> > +      error= ENOENT;
> > +    }
> > +    else if (((frm_exists= ha_table_exists(thd, &db, &alias, &table_type,
> > +                                           &is_sequence)) == 0 &&
>
> this is an overkill. ha_table_exists() will do a full table discovery,
> you don't need it here, because you'll try to delete anyway.


This was in old code, nothing changed. I assume that we also need this for
discovery as ha_delete_table() needs to know in which engine the
table exists.

Note that if .frm file exists, ha_table_exists() is going to do just
dd_frm_type().

<cut>

> > +      if (frm_delete_error)
> >        {
> > -        int trigger_drop_error= 0;
> > +        /*
> > +          Remember error if unexpected error from dropping the .frm file
> > +          or we got an error from ha_delete_table()
> > +        */
> > +        if (frm_delete_error != ENOENT)
> > +          error= frm_delete_error;
> > +        else if (if_exists && ! error)
> > +          thd->clear_error();
>
> no, please, don't do thd->clear_error().
> I've seen and fixed so many bugs because of that. The last one - just an
> hour ago. Use push_internal_handler() instead.

This is from the old code, nothing I added (I just moved thing around).
I don't have time to fix this now and better for this patch keep things
same as before. You can create a new MDEV for fixing this if needed.

<cut>

> > +        if (Table_triggers_list::drop_all_triggers(thd, &db,
> > +                                                   &table->table_name,
> > +                                                   MYF(MY_WME |
> > +                                                       MY_IGNORE_ENOENT)))
> > +          ferror= -1;
>
> Why do you need two copies of code to drop triggers?
> Why cannot you do it once, on the first drop_all_triggers() place?

Because the first drop_all_triggers is only done if the .frm exists and
the table was dropped. This is part of the "drop force", but only in the
cases where it makes sence to do DROP FORCE:

    if (non_existing_table_error(error) && !drop_temporary &&
        table_type != view_pseudo_hton && !trigger_drop_executed &&
        !wrong_drop_sequence)
    {

Regards,
Monty


Follow ups

References