← Back to team overview

maria-developers team mailing list archive

Re: fd18ed07d65: MDEV-16973 Application period tables: DELETE

 

Hi!
> I'd insert also, say, (5, '2010-01-01', '2015-01-01').
Ok no problem

> > +select * from t;
>
> try to avoid unsorted selects, the row order is undefined in these cases
> use --sorted_results command before the select or the order by.
>
> --sorted_results is generally better, because it's done on the client,
> so it doesn't change the execution plan on the server.
>
Added --sorted_result to all plain selects✅

> > +delete history from t2 for portion of apptime from '2000-01-01' to '2018-01-01';
> > +ERROR 42000: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '' at line 1
>
> The error is good, ER_PARSE_ERROR is very appropriate here.
> But it should say "right syntax to use near 'for portion of' at line 1"
>
Well I just used `MYSQL_YYABORT_UNLESS` in parser, as in many other
places, and they all have this problem at this moment. Maybe that's a
bug of this macro.
> > +delete from t for portion of othertime from '2000-01-01' to '2018-01-01';
> > +ERROR HY000: period `othertime` is not found in table
> > +delete from t for portion of system_time from '2000-01-01' to '2018-01-01';
> > +ERROR HY000: period SYSTEM_TIME is not allowed here
>
> could be just ER_PARSE_ERROR too, I suppose
>
I think it worth separate error here. ER_PARSE_ERROR will make "right
syntax to use near '' at line 1", just like in the above case.
However, these cases differ -- it is very likely that user will try to
use SYSTEM_TIME period, and we should tell him this is a voilation.

> > +select * from t;
>
> --sorted_results or order by
>
✅

> > +# SQL17 Case 15.7-8.b.i
>
> good! you forgot "part 2" and "general rules", but even without
> them it only takes a couple of seconds to find this quote.
>
To be honest, the whole idea to cite the standard in tests belongs to
Alexey Midenkov.
I've just pasted his citations.
Added the rest ✅

> > +# auto_inrement field is updated
>
> "increment"
>
✅

> > +create or replace table t (id int primary key auto_increment, s date, e date,
> > +period for apptime(s, e));
> > +insert into t values (default, '1999-01-01', '2018-12-12');
>
> please, add a select * here.
>
✅

> > +  { "PORTION",               SYM(PORTION)},
>
> PORTION_SYM, please
>
✅

> > +%token  PORTION
>
> add /* SQL-2016-R */ comment
>
✅

> and it seems to be 2016 standard, not 2017 standard,
> https://en.wikipedia.org/wiki/SQL
> please change it everywhere, it's too confusing otherwise
>
uh yes, really, missed it somehow. Replaced ✅

> >  %token  POSITION_SYM                  /* SQL-2003-N */
> >  %token  PRECISION                     /* SQL-2003-R */
> >  %token  PRIMARY_SYM                   /* SQL-2003-R */
> > @@ -13437,23 +13449,26 @@ delete_part2:
> >            opt_delete_options single_multi {}
> >          | HISTORY_SYM delete_single_table opt_delete_system_time
> >            {
> > +            MYSQL_YYABORT_UNLESS(!Lex->last_table()->has_period());
> >              Lex->last_table()->vers_conditions= Lex->vers_conditions;
> >              Lex->pop_select(); //main select
> >            }
> >          ;
> >
> >  delete_single_table:
> > -          FROM table_ident opt_use_partition
> > +          FROM table_ident opt_for_portion_of_time_clause opt_use_partition
>
> better put opt_for_portion_of_time_clause after opt_use_partition,
> I think it's more logical that way. FOR PORTION OF is logical
> specification, while opt_use_partition is physical one, just like
> the table name, so it makes sense to keep them together
>
✅

> > +ER_PERIOD_NOT_FOUND
> > +        eng "period %`s is not found in table"
>
> first letter capitalized, please
>
✅

> > +
> > +ER_PERIOD_SYSTEM_TIME_NOT_ALLOWED
> > +        eng "period SYSTEM_TIME is not allowed here"
>
> same here, or, better, use ER_PARSE_ERROR
>
✅

> > diff --git a/sql/sql_delete.cc b/sql/sql_delete.cc
> > index 3ae113a4595..e76eb729536 100644
> > --- a/sql/sql_delete.cc
> > +++ b/sql/sql_delete.cc
> > @@ -287,7 +287,8 @@ bool mysql_delete(THD *thd, TABLE_LIST *table_list, COND *conds,
> >    bool               return_error= 0;
> >    ha_rows    deleted= 0;
> >    bool          reverse= FALSE;
> > -  bool          has_triggers;
> > +  bool          has_triggers= false;
> > +  bool          has_period_triggers= false;
>
> I don't see why you need has_triggers and has_period_triggers,
> I find it confusing.
>
> just make `has_triggers=true` if there's `FOR PORTION OF` and there
> are insert triggers.
>
> In other words, has_triggers should be true if there are any triggers
> that this statement might need to fire.
>
ok ✅

> > +    conds= table_list->on_expr;
> > +    table_list->on_expr= NULL;
>
> Why did you define SELECT_LEX::period_setup_conds to put the
> condition in TABLE_LIST::on_expr only to hack it up from there?
> Just make it to return the new condition or NULL in case of an error.
>
I just wanted to reflect versioning behavior as much as possible.
Made it just to return a condition ✅

> > +  portion_of_time_through_update= !has_period_triggers
> > +                                  && !table->versioned(VERS_TIMESTAMP);
>
> I suspect that with VERS_TRX_ID you'd also need to use delete/insert.
> Add a test, please.
>
There is a system versioning test at the end of 'delete.test'. It runs
for both 'timestamp' and 'trx_id'  cases.
But what I suspect is that this optimization is meaningful in case of
VERS_TRX_ID.
The resulting operation will be converted to update+insert in both
cases, so I am also removing it for VERS_TRX_ID.
✅

> > -  if (unique_table(thd, table_list, table_list->next_global, 0))
> > +  if (table_list->has_period()
>
> Really? Why?
>
Yes, really. Table crashes myisam -- just tested. This is related to
that we mix deletes/updates and inserts, and it breaks cursor.
Maybe we can do something like Eugene did in latest TABLE::delete_row
patch -- i.e. save/restore cursor, but delete_while_scanning= false
just works.
> > +      || unique_table(thd, table_list, table_list->next_global, 0))
> >      *delete_while_scanning= false;
> >

> > @@ -8302,6 +8302,7 @@ TABLE_LIST *st_select_lex::add_table_to_list(THD *thd,
> >    ptr->ignore_leaves= MY_TEST(table_options & TL_OPTION_IGNORE_LEAVES);
> >    ptr->sequence=      MY_TEST(table_options & TL_OPTION_SEQUENCE);
> >    ptr->derived=          table->sel;
> > +  ptr->vers_conditions.name= "SYSTEM_TIME";
>
> why?
>
Guess it is junk for now... I was thinking about SELECT...AS OF
support, and had get_period_by_name function somewhere in TABLE, but
it's gone after several review iterations. I was also trying to not so
much distinguish SYSTEM_TIME and application-time period already on
this step.
Removed from this commit and added note in MDEV-16977 ✅

> > +void period_update_condition(THD *thd, TABLE_LIST *table, SELECT_LEX *select,
>
> static
>
✅
> > +                             vers_select_conds_t &conds, bool timestamp)
>
> please, either use pointers or _const_ references.
> in other words, `conds` here should be a pointer
>
✅

> > +    switch (conds.type)
> > +    {
> > +    case SYSTEM_TIME_UNSPECIFIED:
> > +      curr= newx Item_int(thd, ULONGLONG_MAX);
> > +      cond1= newx Item_func_eq(thd, conds.field_end, curr);
>
> add here
>
>    DBUG_ASSERT(!conds.start.item);
>    DBUG_ASSERT(!conds.end.item);
>
✅
> > +      break;
> > +    case SYSTEM_TIME_AS_OF:
> > +      cond1= newx Item_func_trt_trx_sees_eq(thd, trx_id0, conds.field_start);
> > +      cond2= newx Item_func_trt_trx_sees(thd, conds.field_end, trx_id0);
>
>   DBUG_ASSERT(!conds.end.item);
>
✅
> > +      break;
> > +    case SYSTEM_TIME_FROM_TO:
> > +      cond1= newx Item_func_trt_trx_sees(thd, trx_id1, conds.field_start);
> > +      cond3= newx Item_func_lt(thd, conds.start.item, conds.end.item);
> > +      break;
>
> hmm, you don't use trx_id0 here at all. did you forget
>
>   cond2= newx Item_func_trt_trx_sees_eq(thd, conds.field_end, trx_id0);
>
yes, looks so... The tests did not catch this

> > +int SELECT_LEX::period_setup_conds(THD *thd, TABLE_LIST *tables, Item *where)
> > +{
> > +  DBUG_ENTER("SELECT_LEX::period_setup_conds");
> > +
> > +  if (!thd->stmt_arena->is_conventional() &&
> > +      !thd->stmt_arena->is_stmt_prepare_or_first_sp_execute())
> > +  {
> > +    // statement is already prepared
> > +    DBUG_RETURN(0);
> > +  }
> > +
> > +  if (thd->lex->is_view_context_analysis())
> > +    DBUG_RETURN(0);
>
> I see that you've copied these two if's from SELECT_LEX::vers_setup_conds.
> I suspect that the first if() is not quite correct here,
> and I plan to take a closer look at it.
>
> so, it'd be great if you could come up with some way to avoid
> copying these two if()'s. So that there will be only one
> place to fix.
>
> May be, extract them into a function or something, I don't know.
>
Those lines were made for a couple of bugfixes. Eugene's just
mentioned the latest one:
https://github.com/MariaDB/server/commit/748ef3ec91d35aa6cd5b230c71084f6c83c1c92e

I've just extracted those if's to a separate function. Have no better idea.

> > +
> > +  for (TABLE_LIST *table= tables; table; table= table->next_local)
>
> 1. can here be more than one table? you only allow FOR PORTION OF
>    in delete_single_table parser rule.
> 2. if no, add an assert here that there's only one table in the list
That's not the case for UPDATE, so assert will be inappropriate here.

> 3. and please add a test for DELETE FROM view FOR PORTION OF
>
✅
> > +  {
> > +    if (!table->table)
> > +      continue;
> > +    vers_select_conds_t &conds= table->period_conditions;
> > +    if (0 == strcasecmp(conds.name.str, "SYSTEM_TIME"))
>
> why not to check it in the parser?
>
Maybe.. but anyway what's the defference? This place is good for error
handling too.

> > @@ -6426,9 +6426,14 @@ void TABLE::mark_columns_needed_for_delete()
> >
> >    if (s->versioned)
> >    {
> > -    bitmap_set_bit(read_set, s->vers.start_field(s)->field_index);
> > -    bitmap_set_bit(read_set, s->vers.end_field(s)->field_index);
> > -    bitmap_set_bit(write_set, s->vers.end_field(s)->field_index);
> > +    bitmap_set_bit(read_set, s->vers.start_fieldno);
> > +    bitmap_set_bit(read_set, s->vers.end_fieldno);
> > +    bitmap_set_bit(write_set, s->vers.end_fieldno);
> > +  }
> > +
> > +  if (with_period)
> > +  {
> > +    use_all_columns();
> >    }
>
> Nope, I think this is conceptually wrong.
> TABLE::mark_columns_needed_for_delete marks columns that are needed
> to *delete* a row. The fact that you might need to *insert* if
> there's a FOR PORTION OF, is irrelevant here, it's something
> the caller should bother about.
>
I thought this function is to collect all the weird logic of columns
marking in one convenient place.
A tip is that it's used only in DELETE statement code, but not in
replication, or anywhere else.
> >  }
> >
> > @@ -7834,6 +7839,101 @@ int TABLE::update_default_fields(bool update_command, bool ignore_errors)
> >    DBUG_RETURN(res);
> >  }
> >
> > +static int table_update_generated_fields(TABLE *table)
> > +{
> > +  int res= 0;
> > +  if (table->found_next_number_field)
> > +  {
> > +    table->next_number_field= table->found_next_number_field;
> > +    res= table->found_next_number_field->set_default();
> > +    table->file->update_auto_increment();
> > +  }
> > +
> > +  if (table->default_field)
> > +    res= table->update_default_fields(0, false);
>
> why is that?
auto_increment fields are allowed
>
> > +
> > +  if (likely(!res) && table->vfield)
> > +    res= table->update_virtual_fields(table->file, VCOL_UPDATE_FOR_WRITE);
> > +  if (likely(!res) && table->versioned())
> > +    table->vers_update_fields();
>
> I believe you need to verify CHECK constraints too.
> and, please, add a test where they fail here.
I think there'll be no such test, because generated fields are
forbidden to use application-time period fields.
Except auto_increment, but, those are not allowed to be used in virtual fields.
The SQL16 standard in 15.7 Effect of deleting rows from base tables,
General Rules, 8)c)ii) says:
>The following <insert statement> is effectively executed without further Access Rule and constraint checking:
>INSERT INTO TN VALUES (VL1, ..., VLd)
>NOTE 691 — Constraint checking is performed at the end of triggering DELETE statement.

Not sure what does the NOTE 691 mean exactly, but I interpret the
statement as this INSERT command doesn't require any constraint check.
BTW what's important, this operation does not require INSERT privilege!

>
> > +  return res;
> > +}
> > +
> > +static int period_make_insert(TABLE *table, Item *src, Field *dst)
> > +{
> > +  THD *thd= table->in_use;
> > +
> > +  store_record(table, record[1]);
> > +  int res= src->save_in_field(dst, true);
> > +
> > +  if (likely(!res))
> > +    table_update_generated_fields(table);
>
> perhaps, move store_record() and save_in_field() into
> table_update_generated_fields() ? less duplicated code.
> But the function will need to be renamed, like prepare_for_period_insert
> or something
>
Because of two duplicating lines?:)

I actually don't quite like this idea because store_record is
symmetrically paired with restore_record here, and this pairing fits
on one screen, which is very convenient for reading.

> > +
> > +  if (likely(!res) && table->triggers)
> > +    res= table->triggers->process_triggers(thd, TRG_EVENT_INSERT, TRG_ACTION_BEFORE, true);
> > +
> > +  if (likely(!res))
> > +    res = table->file->ha_write_row(table->record[0]);
> > +
> > +  if (likely(!res) && table->triggers)
> > +    res= table->triggers->process_triggers(thd, TRG_EVENT_INSERT, TRG_ACTION_AFTER, true);
> > +
> > +  restore_record(table, record[1]);
> > +  return res;
> > +}
> > +
> > +int TABLE::update_portion_of_time(THD *thd, vers_select_conds_t &period_conds,
> > +                                  bool *inside_period)
>
> is it used for UPDATE too? or only for DELETE?
>
Looks like for DELETE only. Does the naming look confusing?

> > +{
> > +  bool lcond= period_conds.field_start->val_datetime_packed(thd)
> > +              < period_conds.start.item->val_datetime_packed(thd);
> > +  bool rcond= period_conds.field_end->val_datetime_packed(thd)
> > +              > period_conds.end.item->val_datetime_packed(thd);
> > +
> > +  *inside_period= !lcond && !rcond;
> > +  if (*inside_period)
> > +    return 0;
> > +
> > +  int res= 0;
> > +  Item *src= lcond ? period_conds.start.item : period_conds.end.item;
> > +  uint dst_fieldno= lcond ? s->period.end_fieldno : s->period.start_fieldno;
> > +
> > +  store_record(this, record[1]);
> > +  if (likely(!res))
> > +    res= src->save_in_field(field[dst_fieldno], true);
> > +
> > +  if (likely(!res))
> > +    res= table_update_generated_fields(this);
> > +
>
> may be, an assert that there're no delete/insert triggers in a table?
>
Hmm maybe. I doubt it will prevent any bug hereafter, but added, let's
see how it'll go.


Sincerely,
Nikita Malyavin


Follow ups

References