maria-developers team mailing list archive

Thread
Date
Re: df203d1daa1: MDEV-15990: REPLACE on a precise-versioned table returns duplicate key error (ER_DUP_ENTRY)

To: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Thu, 27 Dec 2018 13:18:46 +0100
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <5vuBlfDbF6L.A.5O.4G3IcB@meddwl>
User-agent: Mutt/1.7.2 (2016-11-26)
Hi, Nikita!

Sorry, I don't understand what are you fixing and why :(
Could you please elaborate?

See few minor comments below and the main question at the very end.

On Dec 26, Nikita Malyavin wrote:
> revision-id: df203d1daa1 (versioning-1.0.5-42-gdf203d1daa1)
> parent(s): 5e89a0dc25a
> author: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
> committer: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
> timestamp: 2018-07-21 00:14:57 +1000
> message:
> 
> MDEV-15990: REPLACE on a precise-versioned table returns duplicate key error (ER_DUP_ENTRY)
> 
> * handle zero value in `row_start` `trx_id`
> * refuse to do more than one versioned insert on the same transaction and same row
> * generalize versioned cases with and without triggers
> 
> [closes tempesta-tech/mariadb#517]
> [closes tempesta-tech/mariadb#520]
> 
> diff --git a/mysql-test/suite/versioning/r/replace.result b/mysql-test/suite/versioning/r/replace.result
> --- a/mysql-test/suite/versioning/r/replace.result
> +++ b/mysql-test/suite/versioning/r/replace.result
> @@ -7,10 +7,31 @@ period for system_time (row_start, row_end)
>  ) with system versioning;
>  insert t values (1, 2);
>  replace t values (1, 3);
> -select *, current_row(row_end) as current from t for system_time all order by x;
> -id	x	current
> -1	2	0
> -1	3	1
> +replace t values (1, 3);
> +# MDEV-15990: REPLACE on a precise-versioned table
> +# returns duplicate key error (ER_DUP_ENTRY)
> +replace t values (1, 3), (1, 30), (1, 40);
> +create table log(s varchar(3));
> +create trigger tr1del before delete on t
> +for each row insert into log values('DEL');
> +replace t values (1, 4), (1, 40), (1, 400);
> +select *, check_row(row_start, row_end), row_start != 0 as start_nonzero
> +from t for system_time all order by x, row_end;
> +id	x	check_row(row_start, row_end)	start_nonzero
> +1	2	HISTORICAL ROW	1
> +1	3	HISTORICAL ROW	1
> +1	3	HISTORICAL ROW	1
> +1	40	HISTORICAL ROW	1
> +1	400	CURRENT ROW	1
> +select * from log;
> +s
> +DEL
> +DEL
> +DEL

better to insert OLD.x too, to be able to see where the DEL comes from

> +# ensure that row_start is not duplicated
> +select count(row_start) as empty from t for system_time all
> +group by row_start having count(row_start) > 1;
> +empty
>  drop table t;
>  create table t (
>  id int unique,
> diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> --- a/sql/sql_insert.cc
> +++ b/sql/sql_insert.cc
> @@ -1969,23 +1952,22 @@ int write_record(THD *thd, TABLE *table,COPY_INFO *info)
>                                                  TRG_ACTION_BEFORE, TRUE))
>              goto before_trg_err;
>  
> -          if (!table->versioned(VERS_TIMESTAMP))
> +          if (table->versioned(VERS_TRX_ID) && table->vers_write)
> +          {

can table->vers_write be false here? I suspect it can never be false for
REPLACE, so I'd write it as

             if (table->versioned(VERS_TRX_ID))
             {
               DBUG_ASSERT(table->vers_write);

> +            bitmap_set_bit(table->write_set, table->vers_start_field()->field_index);
> +            table->vers_start_field()->store(0, false);
> +          }
> +
> +          if (!table->versioned())
>              error= table->file->ha_delete_row(table->record[1]);
>            else
> -          {
> -            store_record(table, record[2]);
> -            restore_record(table, record[1]);
> -            table->vers_update_end();
>              error= table->file->ha_update_row(table->record[1],
>                                                table->record[0]);
> -            restore_record(table, record[2]);
> -          }
> -          if (unlikely(error))
> +          if (unlikely(error && error != HA_ERR_RECORD_IS_THE_SAME))
>              goto err;
> -          if (!table->versioned(VERS_TIMESTAMP))
> +          if (likely(!error))
>              info->deleted++;
> -          else
> -            info->updated++;
> +          error= 0;
>            if (!table->file->has_transactions())
>              thd->transaction.stmt.modified_non_trans_table= TRUE;
>            if (table->triggers &&
> @@ -1995,6 +1977,17 @@ int write_record(THD *thd, TABLE *table,COPY_INFO *info)
>              trg_error= 1;
>              goto ok_or_after_trg_err;
>            }
> +          if (table->versioned(VERS_TIMESTAMP) && table->vers_write)

same

> +          {
> +            store_record(table, record[2]);
> +            error = vers_insert_history_row(table);

why are you doing that (insert + goto after_trg_n_copied_inc)
instead of going the normal REPLACE code path?

> +            restore_record(table, record[2]);
> +            if (unlikely(error))
> +              goto err;
> +          }
> +          if (table->versioned())
> +            goto after_trg_n_copied_inc;
> +
>            /* Let us attempt do write_row() once more */
>          }
>        }
> diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
> index db755077e0b..139c43ced38 100644
> --- a/storage/innobase/handler/ha_innodb.cc
> +++ b/storage/innobase/handler/ha_innodb.cc
> @@ -8916,6 +8916,19 @@ ha_innobase::update_row(
>  	upd_t*		uvect = row_get_prebuilt_update_vector(m_prebuilt);
>  	ib_uint64_t	autoinc;
>  
> +	bool force_insert= table->versioned_write(VERS_TRX_ID)
> +	                   && table->vers_start_id() == 0;

hmm, so you're using vers_start_id() == 0 as a special hack to signal
REPLACE to the storage engine...

> +
> +	if (force_insert) {
> +		table->vers_start_field()->store(trx->id);
> +
> +		Field *start= table->vers_start_field();
> +		trx_id_t old_trx_id= uint8korr(start->ptr_in_record(old_row));
> +
> +		/* avoid double versioned insert on the same transaction and same row */
> +		force_insert= (old_trx_id != trx->id);

okay, I'm confused. What was wrong in the old code? Normally REPLACE is
DELETE + INSERT, why was it not enough for InnoDB and you needed some
special logic here to handle the REPLACE case?

> +	}
> +
>  	/* Build an update vector from the modified fields in the rows
>  	(uses m_upd_buf of the handle) */
>  
Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx
Follow ups

Re: df203d1daa1: MDEV-15990: REPLACE on a precise-versioned table returns duplicate key error (ER_DUP_ENTRY)
From: Nikita Malyavin, 2019-02-28