maria-developers team mailing list archive

Thread
Date
Re: acbe14b122c: Aria will now register it's transactions

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: Michael Widenius <michael.widenius@xxxxxxxxx>
Date: Fri, 22 May 2020 11:19:14 +0300
Cc: MariaDB Developers <maria-developers@xxxxxxxxxxxxxxxxxxx>, Michael Widenius <monty@xxxxxxxxxxx>
In-reply-to: <20200520075203.GA27000@pweza.fritz.box>
Hi!

On Wed, May 20, 2020 at 10:52 AM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Michael!

> > +class start_new_trans
> > +{
> > +  /* container for handler's private per-connection data */
> > +  Ha_data old_ha_data[MAX_HA];
> > +  struct THD::st_transactions *old_transaction, new_transaction;
> > +  Open_tables_backup open_tables_state_backup;
> > +  MDL_savepoint mdl_savepoint;
> > +  PSI_transaction_locker *m_transaction_psi;
> > +  THD *org_thd;
> > +  uint in_sub_stmt;
> > +  uint server_status;
> > +
> > +public:
> > +  start_new_trans(THD *thd);
> > +  ~start_new_trans()
> > +  {
> > +    destroy();
> > +  }
> > +  void destroy()
> > +  {
> > +    if (org_thd)                                // Safety
> > +      restore_old_transaction();
> > +    new_transaction.free();
> > +  }
> > +  void restore_old_transaction();
>
> interesting. You made restore_old_transaction() public and you
> use it in many places. Why not to use destroy() instead?
> When one would want to use restore_old_transaction() instead of destroy()?

We already discussed this on slack.
- In some cases it's an advantage/a need to end the transaction before the
  variable is destroyed.
- I don't like to have things happen automatically, especially not something
  as important as doing a commit.  I prefer to have it spelled out.
  The commit in 'destroy()' is a there mainly as a safeguard if one forgets
  to call restore_old_transaction().  I may at some point add an assert to
  force one to call restore_old_transaction() at least in debug code.

<cut>

> > +int THD::commit_whole_transaction_and_close_tables()
> > +{
> > +  int error, error2;
> > +  DBUG_ENTER("THD::commit_whole_transaction_and_close_tables");
> > +
> > +  /*
> > +    This can only happened if we failed to open any table in the
> > +    new transaction
> > +  */
> > +  if (!open_tables)
> > +    DBUG_RETURN(0);
>
> Generally, I think, it should still end the transaction here.
> Or add an assert that there is no active transaction at the moment.

There can't be any active transactions if there is no open tables.
So this is safe.

> > +
> > +  /*
> > +    Ensure table was locked (opened with open_and_lock_tables()). If not
> > +    the THD can't be part of any transactions and doesn't have to call
> > +    this function.
> > +  */
> > +  DBUG_ASSERT(lock);
> > +
> > +  error= ha_commit_trans(this, FALSE);
> > +  /* This will call external_lock to unlock all tables */
> > +  if ((error2= mysql_unlock_tables(this, lock)))
> > +  {
> > +    my_error(ER_ERROR_DURING_COMMIT, MYF(0), error2);
> > +    error= error2;
> > +  }
> > +  lock= 0;
> > +  if ((error2= ha_commit_trans(this, TRUE)))
> > +    error= error2;
> > +  close_thread_tables(this);
>
> I wonder why you're doing it in that specific order.
> commit(stmt)-unlock-commit(all)-close

Because of unlock tables calls external_lock which affects
transactions in some engines.
This is the fastest way to safely close a transaction.

The other option would be:
commit(stmt) - close()  - commit(all)

but this may in some cases be not optimal if the table is evicted from
the table cache between close() and commit().
(At least Aria will keep the read view open until the table is
commited and the external unlock is called, even over close).

<cut>

> > +start_new_trans::start_new_trans(THD *thd)
> > +{
> > +  org_thd= thd;
> > +  mdl_savepoint= thd->mdl_context.mdl_savepoint();
> > +  memcpy(old_ha_data, thd->ha_data, sizeof(old_ha_data));
> > +  thd->reset_n_backup_open_tables_state(&open_tables_state_backup);
> > +  bzero(thd->ha_data, sizeof(thd->ha_data));
> > +  old_transaction= thd->transaction;
> > +  thd->transaction= &new_transaction;
> > +  new_transaction.on= 1;
> > +  in_sub_stmt= thd->in_sub_stmt;
> > +  thd->in_sub_stmt= 0;
> > +  server_status= thd->server_status;
> > +  m_transaction_psi= thd->m_transaction_psi;
> > +  thd->m_transaction_psi= 0;
> > +  thd->server_status&= ~(SERVER_STATUS_IN_TRANS |
> > +                         SERVER_STATUS_IN_TRANS_READONLY);
> > +  thd->server_status|= SERVER_STATUS_AUTOCOMMIT;
> > +}
>
> Few thoughts:
>
> 1. If you need to save and restore _all that_ then, perhaps,
>    all that should be inside st_transactions ?

Yes, I think that would be much better. We even discussed this and I suggested
that we do that in 10.6

> 2. strictly speaking, ha_data is _per connection_. If you just bzero it,
>    the engine will think it's a new connection, and you cannot just
>    overwrite it on restore without hton->close_connection.

Yes, and that is how things works now.  The engine threats this as a new
connection and when the new transactions ends, we call hton->free_connection()
on the ha_data before restoring it.

>    A strictly "proper" solution would be to introduce ha_data per
>    transaction in addition to what we have now. But it looks like an
>    overkill. So I'd just add close_system_tables() now into restore_old_transaction()

We already have in restore_old_transaction the code:
ha_free_transactions(org_thd);

I don't think we can call close_system_tables() here.
In the code, we only call close if the open succeeded, so
it doesn't belong here.

>    A strictly "proper" solution would be to introduce ha_data per
>    transaction in addition to what we have now. But it looks like an
>    overkill. So I'd just add ha_close_connection() now into
>    restore_old_transaction() and that's all.
>
>    I see that you've added free_transaction() call, but isn't it redundant?
>    ha_close_connection() does that now.

I discussed this with Marko hand he suggested we did this with a new call.

The calls are in InnoDB:

void
innobase_free_transaction(handlerton *hton, THD *thd)
{
  if (trx_t *trx= thd_to_trx(thd))
  {
    trx_free(trx);
    thd_set_ha_data(thd, hton, nullptr);
  }
}

and
static int innobase_close_connection(handlerton *hton, THD *thd)
{
  DBUG_ASSERT(hton == innodb_hton_ptr);
  if (auto trx= thd_to_trx(thd))
  {
    if (trx->state == TRX_STATE_PREPARED && trx->has_logged_persistent())
    {
      trx_disconnect_prepared(trx);
      return 0;
    }
    innobase_rollback_trx(trx);
    trx_free(trx);
  }
  return 0;
}

I am not sure if it's always safe to call innodb_close_connection()
instead of innobase_free_connection()
Should be as I assume that for independent transactions the state
should probably never be in a prepared state.
I am just a bit nervous that in case of TRX_STATE_PREPARED the trx
will not be freed.
I will check with Marko and if he says it's ok, I will remove
free_transaction()  and use ha_close_connection()
instead.

<cut>

> > index af43d92dea7..82b3968de85 100644
> > --- a/sql/event_db_repository.cc
> > +++ b/sql/event_db_repository.cc
> > @@ -742,7 +748,8 @@ Event_db_repository::create_event(THD *thd, Event_parse_data *parse_data,
> >    ret= 0;
> >
> >  end:
> > -  close_thread_tables(thd);
> > +  if (table)
> > +    thd->commit_whole_transaction_and_close_tables();
>
> You're checking twice. Here in the caller, and the first thing
> inside commit_whole_transaction_and_close_tables().
> Here and in many places.
> If you want to check in the caller, there's no need to check inside?
> You can add an assert instead, can't you?

Yes, I prefer to add an assert in commit_whole_transaction_and_close_tables()
as one shouldn't call close() if one doesn't have anything opened.
However I want to still keep the test for open_tables in the commit_...
as this is an error case and it's hard to ensure that we can cover all
error cases in the code.


> >    thd->mdl_context.rollback_to_savepoint(mdl_savepoint);
> >
> >    thd->variables.sql_mode= saved_mode;
> > @@ -1117,22 +1124,20 @@ update_timing_fields_for_event(THD *thd,
> >    TABLE *table= NULL;
> >    Field **fields;
> >    int ret= 1;
> > -  enum_binlog_format save_binlog_format;
> >    MYSQL_TIME time;
> >    DBUG_ENTER("Event_db_repository::update_timing_fields_for_event");
> >
> > -  /*
> > -    Turn off row binlogging of event timing updates. These are not used
> > -    for RBR of events replicated to the slave.
> > -  */
> > -  save_binlog_format= thd->set_current_stmt_binlog_format_stmt();
> > -
> >    DBUG_ASSERT(thd->security_ctx->master_access & PRIV_IGNORE_READ_ONLY);
> >
> >    if (open_event_table(thd, TL_WRITE, &table))
> >      goto end;
>
> Why do you reuse a current transaction, instead of creating a new one
> as everywhere above?

This is done from the event thread handler loop and this THD cannot have any
old active transactions.  No need to create a new transaction when
there is only one
transaction active at a time.

<cut>

> > +/*
> > +  True if handler cannot roolback transactions. If not true, the transaction
> > +  will be put in the transactional binlog cache.
> > +*/
> > +#define HTON_NO_ROLLBACK (1 << 16)
>
> How is it different from HA_PERSISTENT_TABLE?

HA_PERSISTENT_TABLE means that any change is at once
stored in the table and in the redo and can't be rolled back.

HTON_NO_ROLLBACK affects mainly the binlog cache, but with Aria the
transaction can rollback in case of crash.

Should maybe be renamed to HTON_NO_EXPLICIT_ROLLBACK ?

In any case, I have updated the comment to be:
/*
  True if handler cannot rollback transactions. If not true, the transaction
  will be put in the transactional binlog cache.
  For some engines, like Aria, the rollback can happen in case of crash, but
  not trough a handler rollback call.
*/

Another difference is that HA_PERSISTENT_TABLE is part of a table
while HTON_... is part of the handlerton.


> > +
> >  class Ha_trx_info;
> >
> >  struct THD_TRANS
> > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > index 582c9bb110b..0c53bdc5bbe 100644
> > --- a/sql/ha_partition.cc
> > +++ b/sql/ha_partition.cc
> > @@ -11016,7 +11016,7 @@ int ha_partition::check_misplaced_rows(uint read_part_id, bool do_repair)
> >              If the engine supports transactions, the failure will be
> >              rollbacked.
>
> if you're changing it anyway, can you also change
>
>   rollbacked -> rolled back

Done.


> ? thanks.
>
> >            */
> > -          if (!m_file[correct_part_id]->has_transactions())
> > +          if (!m_file[correct_part_id]->has_transactions_and_rollback())
> >            {
> >              /* Log this error, so the DBA can notice it and fix it! */
> >              sql_print_error("Table '%-192s' failed to move/insert a row"
> > diff --git a/sql/handler.cc b/sql/handler.cc
> > index 39841cc28d7..e3f5773717d 100644
> > --- a/sql/handler.cc
> > +++ b/sql/handler.cc
> > @@ -4514,7 +4515,6 @@ void handler::mark_trx_read_write_internal()
> >    */
> >    if (ha_info->is_started())
> >    {
> > -    DBUG_ASSERT(has_transaction_manager());
>
> there should be *some* assert here, I think

I don't think this is needed.  ha_info cannot be set if the engine
doesn't support transactions.
I can't come up with any case when an engine has registered it's
transaction in ha_info and
it would not support transactions.

> > --- a/sql/share/errmsg-utf8.txt
> > +++ b/sql/share/errmsg-utf8.txt
> > @@ -7961,3 +7961,5 @@ ER_KEY_CONTAINS_PERIOD_FIELDS
> >          eng "Key %`s cannot explicitly include column %`s"
> >  ER_KEY_CANT_HAVE_WITHOUT_OVERLAPS
> >          eng "Key %`s cannot have WITHOUT OVERLAPS"
> > +ER_DATA_WAS_COMMITED_UNDER_ROLLBACK
> > +        eng "Engine %s does not support rollback. Changes where commited during rollback call"
>
> A confusing error message. I'd just say "table is not transactional".

One only gets the error in case of the user calls ROLLBACK implicitly
or explicitly.
"not transactional" doesn't help the user at all to know what has happened.

The implication is that when one gets the above error when executing a
rollback, one will hopefully
understand that the data was stored permanently. Getting an error
'table is not transactional' doesn't
tell you anything about what really happened.

> This is the user point of view difference between a "transactional engine"
> like InnoDB and "crash safe engine" like Aria.

> It's very confusing to invent a new category of "transactional engines
> that cannot roll back" even if internally Aria is treated like one.

>From the user point of view, they get a specific error that tells what
happend when they
tried to execute rollback. I prefer to keep the new error message.  Do
you have any suggestions
of how to improve it to make it more clear of what did happen?

> > diff --git a/sql/sp.cc b/sql/sp.cc
> > index 51bbeeef368..1629290eb73 100644
> > --- a/sql/sp.cc
> > +++ b/sql/sp.cc
> > @@ -470,27 +471,29 @@ static Proc_table_intact proc_table_intact;
> >                   currently open tables will be saved, and from which will be
> >                   restored when we will end work with mysql.proc.
> >
> > +  NOTES
> > +    On must have a start_new_trans object active when calling this function
>
> I think this:
>
>    DBUG_ASSERT(thd->transcation != &thd->default_transaction);
>
> could be a bit safer than a NOTE :)

I added the assert; When I originally wrote it, I didn't have a good
readable assert, now we have one:

 DBUG_ASSERT(thd->internal_transaction());

<cut>

> > @@ -517,7 +520,8 @@ static TABLE *open_proc_table_for_update(THD *thd)
> >    MDL_savepoint mdl_savepoint= thd->mdl_context.mdl_savepoint();
> >    DBUG_ENTER("open_proc_table_for_update");
> >
>
> same? also need start_new_trans?


No, not needed. I added already a comment about this:
/**
  Open the mysql.proc table for update.

  @param thd  Thread context

  @note
    Table opened with this call should closed using close_thread_tables().

    We don't need to use the start_new_transaction object when calling this
    as there can't be any active transactions when we create or alter
    stored procedures

> > --- a/sql/sql_base.cc
> > +++ b/sql/sql_base.cc
> > @@ -4259,7 +4259,7 @@ bool open_tables(THD *thd, const DDL_options_st &options,
> >        list, we still need to call open_and_process_routine() to take
> >        MDL locks on the routines.
> >      */
> > -    if (thd->locked_tables_mode <= LTM_LOCK_TABLES)
> > +    if (thd->locked_tables_mode <= LTM_LOCK_TABLES && *sroutine_to_open)
>
> why?

The following code is only for the case when we have to open a stored function.
We already know at this point if there are any stored functions, and if not
we don't need to execute any of the code inside the if.


> >      {
> >        /*
> >          Process elements of the prelocking set which are present there
> > @@ -8887,17 +8887,16 @@ bool is_equal(const LEX_CSTRING *a, const LEX_CSTRING *b)
> >      open_system_tables_for_read()
> >        thd         Thread context.
> >        table_list  List of tables to open.
> > -      backup      Pointer to Open_tables_state instance where
> > -                  information about currently open tables will be
> > -                  saved, and from which will be restored when we will
> > -                  end work with system tables.
> >
> >    NOTES
> > +    Caller should have used start_new_trans object to start a new
> > +    transcation when reading system tables.
>
> assert, may be?

My code already have an assert; I thought your copy should also have it.

> > @@ -709,8 +709,9 @@ static bool mysqld_help_internal(THD *thd, const char *mask)
> >      Reset and backup the current open tables state to
> >      make it possible.
> >    */
> > -  Open_tables_backup open_tables_state_backup;
> > -  if (open_system_tables_for_read(thd, tables, &open_tables_state_backup))
> > +  start_new_trans new_trans(thd);
>
> I'm starting to think that this (and in some other places)
> is rather redundant. Transactions can be expensive, the overhead being
> quite big. It seems like a overkill to wrap read-only system table
> accesses in a separate transaction, they could as well be performed
> as a part of the parent transaction.
>
> Write accesses has to be wrapped in a dedicated transaction, of course.

The problem is that we have never registered the system tables into the
original open table list.  That is why we create a new context of opening
the system tables invisible from the upper level transaction.
Changing this is possible, but not trivial and I don't want to do that change
in 10.5

In any case, createing a new transaction is much less overhead than we
already have
for opening and closing system tables, so even if it's a small
addition it's probably not notable
in any benchmark.

> > --- a/sql/sql_show.cc
> > +++ b/sql/sql_show.cc
> > @@ -6105,7 +6105,7 @@ static my_bool iter_schema_engines(THD *thd, plugin_ref plugin,
> >        table->field[1]->store(option_name, strlen(option_name), scs);
> >        table->field[2]->store(plugin_decl(plugin)->descr,
> >                               strlen(plugin_decl(plugin)->descr), scs);
> > -      tmp= &yesno[MY_TEST(hton->commit)];
> > +      tmp= &yesno[MY_TEST(hton->commit && !(hton->flags & HTON_NO_ROLLBACK))];
>
> Why not to say that hton->rollback==NULL means no rollback?

Doesn't work as Aria must register hton->rollback.  This will call
commit + error message.
If not, the transaction could be 'hanging' around without either
rollback or commit.


> Then you won't need a new flag for that.
> And you'll remove redundancy, what would it mean if HTON_NO_ROLLBACK is
> present, but hton->rollback!=NULL?

This means that hton->rollback should be called on rollback.
We call hton->rollback independent of the HTON_NO_ROLLBACK flag.

Or the other way around, HTON_NO_ROLLBACK
> is not present, but hton->rollback==NULL?

That works too.  They are independent.  (This is the case for MyISAM tables).

<cut>

Regards,
Monty
References

Re: acbe14b122c: Aria will now register it's transactions
From: Sergei Golubchik, 2020-05-20