maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #13177
Re: 3e2d297830b: MDEV-29013 ER_KEY_NOT_FOUND/lock timeout upon online alter with long unique
Hi, Nikita,
On Jul 08, Nikita Malyavin wrote:
> revision-id: 3e2d297830b (mariadb-10.6.1-511-g3e2d297830b)
> parent(s): c1feb8c3c32
> author: Nikita Malyavin
> committer: Nikita Malyavin
> timestamp: 2022-07-05 15:20:04 +0300
> message:
>
> MDEV-29013 ER_KEY_NOT_FOUND/lock timeout upon online alter with long unique
>
> 1. ER_KEY_NOT_FOUND
> Some virtual columns were not updated because were not included in read_set
> to the moment of calculation.
>
> In Row_logs_event::do_apply_event some fields are excluded based on m_cols
> value, the number of replicated rows.
the number of replicated *columns*
> We can't rely on this. Basically, we'd be satisfied, if all columns were
> just set, at least for now.
>
> 2. ER_LOCK_WAIT_TIMEOUT
> This is a long unique specific problem.
>
> Sometimes, lookup_handler is created for to->file. To properly free it,
> ha_reset should be called. It is usually done by calling
> close_thread_table, but ALTER TABLE makes it differently. Hence, a single
> ha_reset call is added to mysql_alter_table.
>
> Also, the lifetime of lookup_handler is corrected: it's been allocated on
> thd->mem_root, which in case of online alter
> is event's mem_root, and can vary other ways.
> It is changed to table->mem_root, which is more appropriate:
> ha_reset is called only when table is closed, or in a few additional cases,
> which correspond to a statement end. So memory leaks are unlikely here.
>
> diff --git a/sql/handler.cc b/sql/handler.cc
> index d71c0fc83a3..fbebfd27203 100644
> --- a/sql/handler.cc
> +++ b/sql/handler.cc
> @@ -3212,7 +3212,7 @@ int handler::create_lookup_handler()
> handler *tmp;
> if (lookup_handler != this)
> return 0;
> - if (!(tmp= clone(table->s->normalized_path.str, table->in_use->mem_root)))
> + if (!(tmp= clone(table->s->normalized_path.str, &table->mem_root)))
that's wrong, the lifetime of a lookup handler is one statement, it's
destroyed at the end. You cannot keep allocating new handlers on the
table->mem_root for every statement.
> return 1;
> lookup_handler= tmp;
> return lookup_handler->ha_external_lock(table->in_use, F_RDLCK);
> diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc
> index 89ca2a41937..2c19bf67586 100644
> --- a/sql/log_event_server.cc
> +++ b/sql/log_event_server.cc
> @@ -6007,6 +6007,11 @@ int Rows_log_event::do_apply_event(rpl_group_info *rgi)
> if (m_width == table->s->fields && bitmap_is_set_all(&m_cols))
> set_flags(COMPLETE_ROWS_F);
>
> + Rpl_table_data rpl_data{};
> + if (rgi) rgi->get_table_data(table, &rpl_data);
> +
> + if (!rpl_data.copy_fields)
> + {
> /*
> Set tables write and read sets.
>
> @@ -6027,17 +6027,19 @@ int Rows_log_event::do_apply_event(rpl_group_info *rgi)
> bitmap_set_all(table->read_set);
> if (get_general_type_code() == DELETE_ROWS_EVENT ||
> get_general_type_code() == UPDATE_ROWS_EVENT)
> bitmap_intersect(table->read_set,&m_cols);
>
> bitmap_set_all(table->write_set);
>
> /* WRITE ROWS EVENTS store the bitmap in m_cols instead of m_cols_ai */
> MY_BITMAP *after_image= ((get_general_type_code() == UPDATE_ROWS_EVENT) ?
> &m_cols_ai : &m_cols);
> bitmap_intersect(table->write_set, after_image);
>
> this->slave_exec_mode= slave_exec_mode_options; // fix the mode
> + }
I'm sorry, I don't quite understand. "some fields are excluded based on
m_cols value" - what columns were excluded? What was the value of m_cols
here?
> table->rpl_write_set= table->write_set;
>
> // Do event specific preparations
> error= do_before_row_operations(rli);
> @@ -6045,8 +6052,6 @@ int Rows_log_event::do_apply_event(rpl_group_info *rgi)
> extra columns on the slave. In that case, do not force
> MODE_NO_AUTO_VALUE_ON_ZERO.
> */
> - Rpl_table_data rpl_data{};
> - if (rgi) rgi->get_table_data(table, &rpl_data);
> sql_mode_t saved_sql_mode= thd->variables.sql_mode;
> if (!is_auto_inc_in_extra_columns())
> thd->variables.sql_mode= (rpl_data.copy_fields ? saved_sql_mode : 0)
> diff --git a/sql/sql_table.cc b/sql/sql_table.cc
> index 29c4d09cd39..a3efcdd9785 100644
> --- a/sql/sql_table.cc
> +++ b/sql/sql_table.cc
> @@ -11011,6 +11011,9 @@ do_continue:;
> !(table->file->ha_table_flags() & HA_REUSES_FILE_NAMES) &&
> !(new_table->file->ha_table_flags() &
> HA_REUSES_FILE_NAMES));
> +
> + // Close lookup_handler.
> + new_table->file->ha_reset();
please, put it not here, but inside the following
thd->drop_temporary_table() call. It'll also fix Sanja's bug MDEV-26433.
And please coordinate with him to put it on the same place so that git
would detect a conflict and we could merge it to have only one reset
call in that method, not two.
> /*
> Close the intermediate table that will be the new table, but do
> not delete it! Even though MERGE tables do not have their children
> @@ -11854,8 +11857,11 @@ copy_data_between_tables(THD *thd, TABLE *from, TABLE *to,
>
> // We'll be filling from->record[0] from row events
> bitmap_set_all(from->write_set);
> - // We restore bitmaps, because update event is going to mess up with them.
> + // Use all columns. UPDATE/DELETE events reset read_set and write_set to
> + // def_*_set after each row operation, so using all_set won't work.
> to->default_column_bitmaps();
> + bitmap_set_all(&to->def_read_set);
> + bitmap_set_all(&to->def_write_set);
>
> end_read_record(&info);
> init_read_record_done= false;
>
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups