maria-developers team mailing list archive

Thread
Date
Re: 7b03ce7af3d: MDEV-17395 REPLACE/INSERT ODKU: support WITHOUT OVERLAPS

To: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Thu, 12 Mar 2020 13:10:05 +0100
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <LkbuxxYpEVM.A.eOE.r3oYeB@pweza>
User-agent: Mutt/1.10.1 (2018-07-13)
Hi, Nikita!

First, about tests:

On Mar 06, Nikita Malyavin wrote:
> revision-id: 7b03ce7af3d (mariadb-10.4.4-504-g7b03ce7af3d)
> parent(s): 5c94cf3bf44
> author: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
> committer: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
> timestamp: 2019-11-28 01:38:53 +1000
> message:
> 
> MDEV-17395 REPLACE/INSERT ODKU: support WITHOUT OVERLAPS
> 
> ---
>  mysql-test/suite/period/r/insert_replace.result |  56 +++
>  mysql-test/suite/period/t/insert_replace.test   |  37 ++
>  sql/handler.cc                                  |  14 +
>  sql/handler.h                                   |   3 +-
>  sql/sql_insert.cc                               | 492 ++++++++++++++----------
>  sql/sql_load.cc                                 |   6 +
>  sql/table.cc                                    |   4 +-
>  sql/table.h                                     |   5 +-
>  8 files changed, 409 insertions(+), 208 deletions(-)
> 
> diff --git a/mysql-test/suite/period/r/insert_replace.result b/mysql-test/suite/period/r/insert_replace.result
> --- /dev/null
> +++ b/mysql-test/suite/period/r/insert_replace.result
> @@ -0,0 +1,56 @@
> +create or replace table t(id int, val int, s date, e date,
> +period for p(s,e),
> +primary key(id, p without overlaps)) engine=myisam;
> +insert into t values (1, 1, '2003-01-01', '2003-03-01'),
> +(1, 2, '2003-05-01', '2003-07-01');
> +# This just inserts a row; no rows matched
> +insert into t values (2, 3, '2003-01-01', '2003-04-01')
> +on duplicate key update val=3;
> +# The following command is equivalent to
> +# MERGE INTO t USING t
> +#  ON id = 1 AND s <= 2003-04-01 AND e > 2003-01-01
> +#  WHEN MATCHED UPDATE SET val=3
> +#  WHEN NOT MATCHED INSERT VALUES (1, 3, '2003-01-01', '2003-04-01');
> +insert into t values (1, 3, '2003-01-01', '2003-04-01')
> +on duplicate key update val=3;
> +select row_count();
> +row_count()
> +2
> +select * from t;
> +id   val     s       e
> +1    3       2003-01-01      2003-03-01
> +1    2       2003-05-01      2003-07-01
> +2    3       2003-01-01      2003-04-01

It's somewhat misleading, because you've used '3' everywhere.
Please rewrite your tests (all of them, also tests below) to identify
every operation uniquely. For example:

  insert into t values (1, 1, '2003-01-01', '2003-03-01'),
  (1, 2, '2003-05-01', '2003-07-01');
  # This just inserts a row; no rows matched
  insert into t values (2, 3, '2003-01-01', '2003-04-01')
  on duplicate key update val=4;
  # The following command is equivalent to
  # MERGE INTO t USING t
  #  ON id = 1 AND s <= 2003-04-01 AND e > 2003-01-01
  #  WHEN MATCHED UPDATE SET val=6
  #  WHEN NOT MATCHED INSERT VALUES (1, 5, '2003-01-01', '2003-04-01');
  insert into t values (1, 5, '2003-01-01', '2003-04-01')
  on duplicate key update val=6;

> +insert into t values (1, 3, '2003-01-01', '2003-06-01')
> +on duplicate key update val=4;
> +select row_count();
> +row_count()
> +4
> +select * from t;
> +id   val     s       e
> +1    4       2003-01-01      2003-03-01
> +1    4       2003-05-01      2003-07-01
> +2    3       2003-01-01      2003-04-01

I don't think IODKU is defined via MERGE. Unfortunately.
See, how it works:

  create table t1 (a int not null, b int not null, c int, unique(a), unique(b));
  insert t1 values (1,1,1), (2,2,2);
  insert t1 values (1,2,3) on duplicate key update c=4;
  select * from t1;
  a       b       c
  1       1       4
  2       2       2

here only one row was updated. If it would've been defined as

  MERGE INTO t1 USING t1
    ON a=1 OR b=2
    WHEN MATCHED UPDATE c=4
    WHEN NOT MATCHED INSERT VALUES (1,2,3)

then it would've updated both rows.

As you can see it literally is defined as "insert, and if there's a
duplicate key error, then update the conflicting row instead"

That is, in your case it should've updated only one row too.

Also, please, add this statement to your test:

 insert into t values (1, 3, '2003-01-01', '2003-02-01')
 on duplicate key update val=4;

> +# No rows matched
> +insert into t values (1, 3, '2003-07-01', '2003-08-01')
> +on duplicate key update val=5;
> +select row_count();
> +row_count()
> +1
> +select * from t;
> +id   val     s       e
> +1    4       2003-01-01      2003-03-01
> +1    4       2003-05-01      2003-07-01
> +2    3       2003-01-01      2003-04-01
> +1    3       2003-07-01      2003-08-01
> +replace into t values(1, 6, '2003-01-01', '2003-06-01');
> +select row_count();
> +row_count()
> +4
> +select * from t;
> +id   val     s       e
> +1    6       2003-01-01      2003-06-01
> +1    4       2003-06-01      2003-07-01
> +2    3       2003-01-01      2003-04-01
> +1    3       2003-07-01      2003-08-01

Here you do DELETE FOR PERIOD. But above you didn't do UPDATE FOR PERIOD.

Add also this: replace into t values(1, 6, '2003-01-01', '2003-02-01');

And tests for INSERT SELECT (also with IGNORE, REPLACE, ODKU) and for
LOAD DATA (also with IGNORE and REPLACE).

> +drop table t;

Now, about semantics. It is very arguable here. One options is to do,
literally, "insert, if fails delete/update the conflicting row". No
periods involved here. The other option is to use FOR PERIOD implicitly
for updates and deletes.

Example, in all test cases below I'll assume:

  insert t1 values (1,1,'2003-01-01','2003-03-01');
  insert t1 values (1,2,'2003-05-01','2003-06-01');
 ->  1     1     2003-01-01     2003-03-01
     1     2     2003-05-01     2003-06-01

So, option one:

  insert t1 values (1,3,'2003-01-01','2003-02-01')
  on duplicate key update val=4;
 ->  1     4     2003-01-01     2003-02-01
     1     2     2003-05-01     2003-06-01
  
Option two:

  insert t1 values (1,3,'2003-01-01','2003-02-01')
  on duplicate key update val=4;
 ->  1     4     2003-01-01     2003-02-01
     1     1     2003-02-01     2003-03-01
     1     2     2003-05-01     2003-06-01

Overlapping range, option two:

  insert t1 values (1,3,'2003-01-01','2003-04-01')
  on duplicate key update val=4;
 ->  1     4     2003-01-01     2003-03-01
     1     3     2003-03-01     2003-04-01
     1     2     2003-05-01     2003-06-01

And the same for replace, basically:

  replace t1 values (1,3,'2003-01-01','2003-02-01');
 ->  1     3     2003-01-01     2003-02-01
     1     2     2003-05-01     2003-06-01

or

 ->  1     3     2003-01-01     2003-02-01
     1     1     2003-02-01     2003-03-01
     1     2     2003-05-01     2003-06-01

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups

Re: 7b03ce7af3d: MDEV-17395 REPLACE/INSERT ODKU: support WITHOUT OVERLAPS
From: Nikita Malyavin, 2020-04-03