maria-developers team mailing list archive

Thread
Date
Re: c76ac1e6de8: MDEV-24089 support oracle syntax: rownum

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: Michael Widenius <michael.widenius@xxxxxxxxx>
Date: Sun, 18 Apr 2021 19:11:33 +0300
Cc: MariaDB Developers <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <YHRzNbjICgzqPzi1@pweza>
On Mon, Apr 12, 2021 at 7:20 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Michael!
>
> On Apr 12, Michael Widenius wrote:
> > revision-id: c76ac1e6de8 (mariadb-10.5.2-551-gc76ac1e6de8)
> > parent(s): a507eb17708
> > author: Michael Widenius <michael.widenius@xxxxxxxxx>
> > committer: Michael Widenius <michael.widenius@xxxxxxxxx>
> > timestamp: 2021-03-24 19:42:24 +0200
> > message:

Please include always the first row in the commit message so that I
can know which commit you are reviewing.
(revision id's are often useless for reviews)

> > diff --git a/mysql-test/main/rownum.test b/mysql-test/main/rownum.test
> > new file mode 100644
> > index 00000000000..8d89eed2f86
> > --- /dev/null
> > +++ b/mysql-test/main/rownum.test
> > @@ -0,0 +1,542 @@
> > +--source include/have_sequence.inc
> > +#
> > +# Test of basic rownum() functionallity
> > +# Test are done with Aria to ensure that row order is stable
>
> what does that mean?

It means that SELECT * will return the rows in the same order for the test.
(There is normally no guaranteed that insert and select order are the same).

Please do not include things that your not commenting upon, to make it
easier to find your
comments.

> > +  OPEN c_cursor;
> > +  FETCH c_cursor INTO v_a, v_b, v_rn;
> > +  WHILE c_cursor%FOUND LOOP
> > +     SELECT concat(v_a,'--',v_b,'--',v_rn);
> > +     FETCH c_cursor INTO v_a, v_b, v_rn;
> > +  END LOOP;
> > +  CLOSE c_cursor;
> > +END;|
> > +DELIMITER ;|
> > +
> > +--error ER_FUNCTION_CANNOT_BE_USED_IN_CLAUSE
> > +select a, rownum from t1 group by a, rownum having rownum < 3;
> > +select a, rownum from t1 group by a, rownum having "rownum" < 3;
>
> what's the difference between these two? Why the first is an error and the
> second is not?

There are quotes around rownum in the second version.  In Oracle mode that is
an error, in MariaDB mode it is not.

> > +select a, rownum as r from t1 group by a, rownum having r < 3;
>
> And the this one. Why is it not an error either?

Because 'r' is a reference, not a function.  'r' works on the result set.
rownum cannot be executed in the having clause (which is executed
after all rows are returned).

> > diff --git a/sql/filesort.h b/sql/filesort.h
> > index 29ae5e20cc6..806a4a8b027 100644
> > --- a/sql/filesort.h
> > +++ b/sql/filesort.h
> > @@ -44,24 +44,21 @@ class Filesort: public Sql_alloc
> >    /** Number of records to return */
> >    ha_rows limit;
> >    /** ORDER BY list with some precalculated info for filesort */
> > +  ha_rows *accepted_rows;                       /* For ROWNUM */
> >    SORT_FIELD *sortorder;
>
> first, I suspect that you want to put your accepted_rows declaration
> before the comment that applies to sortorder.

Fixed.

> second, could you do a bit more than just "For ROWNUM"?
> why do you need something "for ROWNUM" inside Filesort?

Because we excute the WHERE clause under filesort.
and then rownum function needs to know how many rows we have found so far.


I added
  /* Used with ROWNUM. Contains the number of rows filesort has found so far */

 > index 53c236bbce7..c237fcd9ecc 100644
> > --- a/sql/item_func.h
> > +++ b/sql/item_func.h
> > @@ -2125,6 +2125,63 @@ class Item_func_rand :public Item_real_func
> >  };
> >
> >
> > +class Item_func_rownum :public Item_longlong_func
> > +{
> > +  /*
> > +    This points to a variable that contains the number of rows
> > +    accpted so far in the result set
> > +  */
> > +  ha_rows *accepted_rows;
> > +  SELECT_LEX *select;
> > +public:
> > +  Item_func_rownum(THD *thd);
> > +  longlong val_int() override;
> > +  LEX_CSTRING func_name_cstring() const override
> > +  {
> > +    static LEX_CSTRING name= {STRING_WITH_LEN("rownum") };
> > +    return name;
> > +  }
> > +  enum Functype functype() const override { return ROWNUM_FUNC; }
> > +  void update_used_tables() override {}
> > +  bool const_item() const override { return 0; }
> > +  void fix_after_optimize(THD *thd) override;
> > +  bool fix_fields(THD* thd, Item **ref) override;
> > +  bool fix_length_and_dec() override
> > +  {
> > +    unsigned_flag= 1;
> > +    used_tables_cache= RAND_TABLE_BIT;
> > +    const_item_cache=0;
> > +    set_maybe_null();
>
> how can rownum be NULL?

In context where there are no rows.   like SET @A=rownum;

> > +    return FALSE;
> > +  }
> > +  void cleanup() override
> > +  {
> > +    Item_longlong_func::cleanup();
> > +    /* Ensure we don't point to freed memory */
> > +    accepted_rows= 0;
> > +  }
> > +  bool check_vcol_func_processor(void *arg) override
> > +  {
> > +    return mark_unsupported_function(func_name(), "()", arg,
> > +                                     VCOL_IMPOSSIBLE);
> > +  }
> > +  bool check_handler_func_processor(void *arg) override
> > +  {
> > +    return mark_unsupported_function(func_name(), "()", arg,
> > +                                     VCOL_IMPOSSIBLE);
>
> first, as I wrote above, I think that there's no logical reason why rownum should
> not with with handler.

Please read the patch. It cannot be in the handler for multiple reasons:
- INSERT, UPDATE, LOAD DATA
- handler does not know if the row is accepted
- joins
 etc.

> but even if it doesn't, it's much better to check for SQLCOM_HA_READ
> from Item_func_rownum::fix_fields instead of traversing the whole
> item tree for every single HANDLER READ command looking for Item_func_rownum
> that nobody (literally, not a single user out of all our millions) will use there.

There is only a traverse of the tree looking for rownum if the rownum
function is used.

Why do you think there is some extra travelling of the tree for HANDLER_READ ?

> > +  }
> > +  Item *get_copy(THD *thd) override
> > +  {
> > +    return this;
>
> It looks wrong. We don't have any other Item that returns `this` here.
> I think rownum cannot be pushed down because of RAND_TABLE_BIT,
> So better return NULL here, meaning "cannot be pushed down".
> Instead of `this` meaning "if you push it down, it'll surreptitiously
> start corrupting the memory"

I don't think it can trash memory even if it would be pushed down.
As long as the item exists, it should be safe to use anywhere.
I also don't trust the code to be able to cope with returning null
in get_copy.

> > +  }
> > +  /* This function is used in insert, update and delete */
> > +  void store_pointer_to_row_counter(ha_rows *row_counter)
>
> So this isn't a virtual method, and to use it you downcast after checking
> type() == FUNC_ITEM and functype() == ROWNUM_FUNC. To avoid adding a new
> virtual method thus blowing up the vtable, I presume.

Yes, it is not a virtual and should not be.  And yes, we do not want
to add any new
virtual functions to items when we have close to 600+ items in the system...

It is also much faster to do an if on upper level than doing a function call.

> And fix_after_optimize() is virtual. To avoid error-prone downcasting, would
> be my guess.

Yes. This is something we may need for other items in the future, which
is why I made it virtaul.

> Can you explain when you use what approach, please?

See above.

> > --- a/sql/item_func.cc
> > +++ b/sql/item_func.cc
> > @@ -7206,3 +7206,72 @@ void Item_func_setval::print(String *str, enum_query_type query_type)
> >    str->append_ulonglong(round);
> >    str->append(')');
> >  }
> > +
> > +
> > +/*
> > +  Return how many row combinations has accepted so far + 1
> > +
> > +  The + 1 is to ensure that 'WHERE ROWNUM <=1' returns one row
>
> this is a very weird explanation, it sounds as if it's
> a heuristical a hack for ROWNUM <=1 to work
>
> Why not to write "+1 takes into account the currently computed row"

Because the row is not yet computed or accepted. We can call rownum any
time, after a row is accpted, before etc.

I personally prefer the current comment as it is more well defined.

> > +*/
> > +
> > +longlong Item_func_rownum::val_int()
> > +{
> > +  if (!accepted_rows)
> > +  {
> > +    /*
> > +      Rownum is not properly set up. Probably used in wrong context when
> > +      it should not be used. In this case returning 0 is probably the best
> > +      solution.
> > +    */
>
> DBUG_ASSERT(0); here?

No.  One can use rownum in a lot of context where it does makes sence.
Instead of trying to generate an error for every of these cases, I decided
to return 0 for these.  In the future maybe some of these will be valid
and then things can change.  I do not want to see asserts in test cases
people add for things like this.


> > +  select->with_rownum= 1;
> > +  thd->lex->with_rownum= 1;
> > +  thd->lex->uncacheable(UNCACHEABLE_RAND);
>
> Why? ROWNUM doesn't make a query uncacheable, it's perfectly
> deterministic and repeatable.

Not necessartly.  Row order may be different from the engine on
subsequent reads for example.

> > +
> > +  /* If this command changes data, mark it as unsafe for statement logging */
> > +  if (sql_command_flags[thd->lex->sql_command] &
> > +      (CF_UPDATES_DATA | CF_DELETES_DATA))
> > +    thd->lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_SYSTEM_FUNCTION);
>
> Why? It's the same as LIMIT. unsafe without ORDER BY and perfectly safe with.

It is much worse.  rownum is not safe to use anytime, not even with
ORDER BY etc.
(Except maybe with INSERT)..
So better to give an error if it is used in any context.

> > +  Store a reference to the variable that contains number of accepted rows
> > +*/
> > +
> > +void Item_func_rownum::fix_after_optimize(THD *thd)
> > +{
> > +  accepted_rows= &select->join->accepted_rows;
>
> why do you need both that and fix_rownum_pointers()?
> I'd think that the latter alone would be sufficient

select is not calling fix_rownum_pointers().
Also, fix_after_optimize() is much faster as it is ONLY called on functions
that require this call; No loop required over all existing items, like with
insert/update etc.

> > diff --git a/sql/share/errmsg-utf8.txt b/sql/share/errmsg-utf8.txt
> > index badaaa17716..aecb00563f7 100644
> > --- a/sql/share/errmsg-utf8.txt
> > +++ b/sql/share/errmsg-utf8.txt
> > @@ -7973,3 +7973,5 @@ ER_DATA_WAS_COMMITED_UNDER_ROLLBACK
> >          eng "Engine %s does not support rollback. Changes were committed during rollback call"
> >  ER_PK_INDEX_CANT_BE_IGNORED
> >          eng "A primary key cannot be marked as IGNORE"
> > +ER_FUNCTION_CANNOT_BE_USED_IN_CLAUSE
> > +        eng "Function '%s' cannot be used in the %s clause"
>
> Is it the first function we've got that cannot be used in HAVING?

At least I could not find any similar error.
There is no 'having' text string at all in errmsg-utf8.txt.
The only reference in the code we have to the "HAVING" string is:

    my_error(ER_NON_GROUPING_FIELD_USED, MYF(0),
             ref->name.str, "HAVING");

But the error:
    my_error(ER_NON_GROUPING_FIELD_USED, MYF(0),
             ref->name.str, "HAVING");
        eng "Non-grouping field '%-.192s' is used in %-.64s clause"

Is not really usable in this context.

> > diff --git a/sql/sql_derived.cc b/sql/sql_derived.cc
> > index 35b4294707f..da6c567b87c 100644
> > --- a/sql/sql_derived.cc
> > +++ b/sql/sql_derived.cc
> > @@ -656,14 +656,22 @@ static
> >  bool mysql_derived_prepare(THD *thd, LEX *lex, TABLE_LIST *derived)
> >  {
> >    SELECT_LEX_UNIT *unit= derived->get_unit();
> > -  bool res= FALSE;
> > +  SELECT_LEX *first_select;
> > +  bool res= FALSE, keep_row_order;
> >    DBUG_ENTER("mysql_derived_prepare");
> >    DBUG_PRINT("enter", ("unit: %p  table_list: %p  alias: '%s'",
> >                         unit, derived, derived->alias.str));
> >    if (!unit)
> >      DBUG_RETURN(FALSE);
> >
> > -  SELECT_LEX *first_select= unit->first_select();
> > +  first_select= unit->first_select();
> > +  /*
> > +    If rownum() is used we have to preserve the insert row order
> > +    to make group by with filesort to work.
>
> it'd help if you'd add an example of a SELECT here

Any FILESORT:
SELECT a from t1 GROUP BY a;

> > +   */
> > +  keep_row_order= (thd->lex->with_rownum &&
> > +                   (first_select->group_list.elements ||
> > +                    first_select->order_list.elements));

I think that the above test is self described. If there is any ORDER
BY or GROUP BY use
we have to keep the insert order.

Anyway, a better comment is of course good:

  /*
    If rownum() is used we have to preserve the insert row order
    to make GROUP BY and ORDER BY with filesort work.

    SELECT * from (SELECT a,b from t1 ORDER BY a)) WHERE rownum <= 0;

    When rownum is not used the optimizer will skip the ORDER BY clause.
    With rownum we have to keep the ORDER BY as this is what is expected.
    We also have to create any sort result temporary table in such a way
    that the inserted row order is maintained.
  */

> >    if (derived->is_recursive_with_table() &&
> >        !derived->is_with_table_recursive_reference() &&
> > diff --git a/sql/sql_insert.cc b/sql/sql_insert.cc
> > index 806049b4465..5c4dff33914 100644
> > --- a/sql/sql_insert.cc
> > +++ b/sql/sql_insert.cc
> > @@ -746,6 +746,9 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
> >      DBUG_RETURN(TRUE);
> >    }
> > +
> > +  if (thd->lex->with_rownum && table_list->lock_type == TL_WRITE_DELAYED)
> > +    table_list->lock_type= TL_WRITE;
>
> why? do you have a test case when it's needed?

Because we cannot calculate row number inside insert delayed...
(We do not know if an row is accepted or not).

> >    if (table_list->lock_type == TL_WRITE_DELAYED)
> >    {
> >      if (open_and_lock_for_insert_delayed(thd, table_list))
> > @@ -968,11 +971,16 @@ bool mysql_insert(THD *thd, TABLE_LIST *table_list,
> >    */
> >    if (returning &&
> >        result->send_result_set_metadata(returning->item_list,
> > -                                Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF))
> > +                                       Protocol::SEND_NUM_ROWS |
> > +                                       Protocol::SEND_EOF))
> >      goto values_loop_end;
> >
> >    THD_STAGE_INFO(thd, stage_update);
> >    thd->decide_logging_format_low(table);
> > +  fix_rownum_pointers(thd, thd->lex->current_select, &info.copied);
> > +  if (returning)
> > +    fix_rownum_pointers(thd, thd->lex->returning(), &info.copied);
> > +
>
> Note that info.copied can be incremented more than once per row (in INSERT...ON
> DUPLICATE KEY UPDATE, for example). This will probably break rownum logic.

I am not sure how it should work in case of duplicate.  I do not know
how the logic
in Oracle is for this (or if it even supports on duplicate key update)
and I do not want to
speculate on this too much. I do not think this is right as versioning
should be invisible for the end user.

The current logic for rownum is that rownum is incremented for any
accepted rows,
either for insert or update.  In the current code we seam to do that,
execpt for versioning
where it is counted twice.  I think this is a bug as versioning should
be undetectable for
a normal user and users may use the return values to check what
happened with the row
(duplicate or not). Versioning will make any such test unusable.

What do you suggest?
- Only count the true inserts we are doing?
- Remove the copied++ for versioning, in which case the rownum logic
should work.

As far as I understands, one can only safely use rownum when inserting
into an empty
table, so do not know how important this is.

> >    do
> >    {
> >      DBUG_PRINT("info", ("iteration %llu", iteration));
> > @@ -2133,8 +2141,14 @@ int write_record(THD *thd, TABLE *table, COPY_INFO *info, select_result *sink)
> >      autoinc values (generated inside the handler::ha_write()) and
> >      values updated in ON DUPLICATE KEY UPDATE.
> >    */
> > -  if (sink && sink->send_data(thd->lex->returning()->item_list) < 0)
> > -    trg_error= 1;
> > +  if (sink)
> > +  {
> > +    /* We have to adjust 'copied' to make ROWNUM() work in RETURNING */
> > +    info->copied--;
> > +    if (sink->send_data(thd->lex->returning()->item_list) < 0)
> > +      trg_error= 1;
> > +    info->copied++;
>
> this is quite silly. Can you simply increment info->copied after this
> sink->send_data call?

No, as this leads to wrong resutls.  The problem is that for
'returning' the row is already accepted
while in all other code it is called before the row is accepted.

> > +  }
> >
> >  after_trg_or_ignored_err:
> >    if (key)
> > diff --git a/sql/sql_limit.h b/sql/sql_limit.h
> > index a4fcedac14a..62fd0546af0 100644
> > --- a/sql/sql_limit.h
> > +++ b/sql/sql_limit.h
> > @@ -67,6 +67,15 @@ class Select_limit_counters
> >     { return select_limit_cnt; }
> >     ha_rows get_offset_limit()
> >     { return offset_limit_cnt; }
> > +
> > +   void set_limit_if_lower(ha_rows limit)
> > +   {
> > +     if (!offset_limit_cnt)
>
> I suppose, alternatively, you could've had
>
>   if (offset_limit_cnt + select_limit_cnt > limit)
>     select_limit_cnt= limit - offset_limit_cnt;
>
> > +     {
> > +       if (select_limit_cnt > limit)
> > +         select_limit_cnt= limit;
> > +     }
> > +   }
> >  };

In some cases LIMIT is absolute.  With rownum it only applies if limit
is higher.
I think it is good to separate those tests.


> >  #endif // INCLUDES_MARIADB_SQL_LIMIT_H
> > diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
> > index 2bd2575b882..57ba9df42c0 100644
> > --- a/sql/sql_yacc.yy
> > +++ b/sql/sql_yacc.yy
> > @@ -10286,6 +10287,22 @@ function_call_nonkeyword:
> >              if (unlikely($$ == NULL))
> >                MYSQL_YYABORT;
> >            }
> > +/* Start SQL_MODE_ORACLE_SPECIFIC
> > +         | ROWNUM_SYM optional_braces
> > +          {
> > +            $$= new (thd->mem_root) Item_func_rownum(thd);
> > +            if (unlikely($$ == NULL))
> > +              MYSQL_YYABORT;
> > +          }
> > +End SQL_MODE_ORACLE_SPECIFIC */
> > +/* Start SQL_MODE_DEFAULT_SPECIFIC */
> > +         | ROWNUM_SYM '(' ')'
> > +          {
> > +            $$= new (thd->mem_root) Item_func_rownum(thd);
> > +            if (unlikely($$ == NULL))
> > +              MYSQL_YYABORT;
> > +          }
> > +/* End SQL_MODE_DEFAULT_SPECIFIC */
>
> You have two nearly identical rules compiled conditionally.
> I can pretty much guarantee that they'll go out of sync eventually.

This was what bar suggested we should use.


> Better do something like
>
>  /* Start SQL_MODE_ORACLE_SPECIFIC
>     oracle_optional_braces: optional_braces ;
>  End SQL_MODE_ORACLE_SPECIFIC */
>  /* Start SQL_MODE_DEFAULT_SPECIFIC */
>     oracle_optional_braces: '(' ')' ;
>  /* End SQL_MODE_DEFAULT_SPECIFIC */

Which is totally unreadable.

> or even
>
>         | ROWNUM_SYM
> /* Start SQL_MODE_ORACLE_SPECIFIC
>           optional_braces
> End SQL_MODE_ORACLE_SPECIFIC */
> /* Start SQL_MODE_DEFAULT_SPECIFIC */
>           '(' ')'
> /* End SQL_MODE_DEFAULT_SPECIFIC */
>          {
>            $$= new (thd->mem_root) Item_func_rownum(thd);
>            if (unlikely($$ == NULL))
>              MYSQL_YYABORT;
>          }

A little better, but not much. Still not readable.

<cut>

> > diff --git a/sql/table.cc b/sql/table.cc
> > index 7f3a6468aa2..100813e1f95 100644
> > --- a/sql/table.cc
> > +++ b/sql/table.cc
> > @@ -9331,6 +9331,10 @@ bool TABLE_LIST::init_derived(THD *thd, bool init_view)
> >    {
> >      /* A subquery might be forced to be materialized due to a side-effect. */
> >      if (!is_materialized_derived() && first_select->is_mergeable() &&
> > +        (unit->outer_select() && !unit->outer_select()->with_rownum) &&
>
> no merging if *outer select* uses rownum? why? example?

We cannot merge sub queries with rownum.

> > +        (!thd->lex->with_rownum ||
> > +         (!first_select->group_list.elements &&
> > +          !first_select->order_list.elements)) &&
>
> no merging if no rownum but there's group by or order by?
> group by - this is obvious, and is already checked earlier in this if()
> order by - why?

Becase we have to do the order by. A merge would destroy that.
(rownum requires original row order)

> > diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> > +  // rownum used somewhere in query, no limits and it is derived
> > +  if (unlikely(thd->lex->with_rownum &&
> > +               select_lex->first_cond_optimization &&
> > +               select_lex->master_unit()->derived))
> > +    optimize_upper_rownum_func();
> > +
> > +  do_send_rows = (unit->lim.get_select_limit()) ? 1 : 0;
>
> Is there any particular reason why you put that between
>
>    DEBUG_SYNC(thd, "before_join_optimize");
>
> and
>
>    THD_STAGE_INFO(thd, stage_optimizing);
>
> ?
> I'd expect them logically to go directly one after the other

No particualr reason. I did not see the relationship between the above
calls, especially as there was a blank line in between,
but I see the point now. Will move the code.


> > +
> > +
> >    THD_STAGE_INFO(thd, stage_optimizing);
> >
> >    set_allowed_join_cache_types();
> > @@ -14210,7 +14236,24 @@ static ORDER *
> >  remove_const(JOIN *join,ORDER *first_order, COND *cond,
> >               bool change_list, bool *simple_order)
> >  {
> > -  *simple_order= join->rollup.state == ROLLUP::STATE_NONE;
> > +  /*
> > +    We can't do ORDER BY using filesort if the select list contains a non
> > +    deterministic value like RAND() or ROWNUM().
> > +    For example:
> > +    SELECT RAND() as 'r' from t1 order by r;
>
> This one uses temporary for me. Can you put here a correct example please?

 It is a correct example and probably the best possible example.
I do not understand your comment (does not make any sense).
Could you please provide a correct comment?
Also, try the example and try to figure out why it works.
Check also the comment below.

> > +
> > +    If we would first sort the table 't1', the RAND() column would be generated
> > +    during end_send() and the order would be wrong.
> > +
> > +    Previously we had here also a test of ROLLUP:
> > +    'join->rollup.state == ROLLUP::STATE_NONE'
> > +    I deleted this because
> > +    the ROLLUP was never enforced because of a bug where the inital value
> > +    of simple_order was ignored.  Having ROLLUP tested now when the code
> > +    is fixed causes many test failure and some wrong results so better to
> > +    leave the code as it was related to ROLLUP.
>
> No need to explain in a comment why something that is no longer here was removed.
> Use git for that, commit comments. The code comment should explain the code as it is.

No that is not the right approach as otherwise someone would add a similar code
back!  The comment explains what was removed and why, which is correct thing to
do in the code if there is a risk someone would try to add it back.

> > +  */
> > +  *simple_order= !join->rand_table_in_field_list;
>
> why do you need rand_table_in_field_list instead of writing
>
>   *simple_order= join->select_lex->select_list_tables & RAND_TABLE_BIT;

I assume the reason was that originally select_lex was an argument to
JOIN::prepare() and someone thought the bit could be reset later
or we could not reliable access it from join remove_const().

I agree that it is proably safe to remove the variable and do the test as above.
I will test this and do the change if the test passes (I do not see
any reason why they shouldn't)

> > +static void fix_items_after_optimize(THD *thd, SELECT_LEX *select_lex)
> > +{
> > +  List_iterator<Item> li(select_lex->fix_after_optimize);
> > +
> > +  while (Item *item= li++)
> > +    item->fix_after_optimize(thd);
> > +}
> > +
> > +
>
> here you could've had a comment, like
>
>  /* callbacks for process_direct_rownum_comparison() */
>
> > +static bool is_const(Item *item)
> > +{
> > +  /*
> > +    Constant expression in WHERE can be also subquery or other expensive
> > +    thing we shoud not evaluate on OPTIMIZATION phase
> > +  */
> > +  return item->const_item() && !item->is_expensive();
> > +}
> > +
> > +
> > +// Check if this is an unexpensive functional expression over basic constants
> > +
> > +static bool is_basic_const(Item *item)
> > +{
> > +  if (item->basic_const_item())
> > +    return true;
> > +  if (!item->const_item() || item->is_expensive())
> > +    return false;
> > +  if (item->type() == Item::FUNC_ITEM)
> > +  {
> > +    Item_func *func= ((Item_func *)item);
> > +    if (func->argument_count() == 0)
> > +      return false;
> > +    bool result= true;
> > +    Item **args= func->arguments();
> > +    for (uint i= 0; i < func->argument_count() && result; i++)
> > +      result= result && is_basic_const(args[i]);
> > +    return result;
> > +  }
> > +  return false;
> > +}
>
> is_basic_const() is a misleading name, a function of basic consts isn't
> a basic const itself.

This is how bar has named the item and classes. Do a grep after
basic_const..

> what's an example of an expression has is_const() == true, but
> is_basic_const() == false?

"1+1"  THis is the sum of two basic constants.



> > +
> > +
> > +inline bool is_simple_rownum_exp(Item *rownum, Item *constant)
>
> unused
>
> > +{
> > +  return (constant->basic_const_item() &&
> > +          rownum->type() == Item::FUNC_ITEM &&
> > +          ((Item_func*) rownum)->functype() == Item_func::ROWNUM_FUNC);
> > +}
> > +
> > +
> > +static bool temporary_set_limit(THD *thd __attribute__((unused)),
> > +                                SELECT_LEX_UNIT *unit,
> > +                                ha_rows lim)
> > +{
> > +  unit->set_limit_if_lower(lim);
>
> why is it temporary?
> in the PS/SP sense, just for one execution?

You have to ask Sanja that.
I assume it is just for one exection as limits may be parameters and constant
can change.  After all, we have also permanent limit and we have to distingush
these functions somehow. One different is that permanent limit is show
in explain.

> > +  return false;
> > +}
> > +
> > +
> > +static bool permanent_limit(THD *thd, SELECT_LEX_UNIT *unit, ha_rows lim)
> > +{
> > +  SELECT_LEX *gpar= unit->global_parameters();
> > +  if (gpar->select_limit != 0  &&
> > +       // limit can not be an expression but can be parameter
> > +      (!gpar->select_limit->basic_const_item() ||
> > +       ((ha_rows)gpar->select_limit->val_int()) < lim))
> > +    return false;
> > +
> > +  Query_arena *arena, backup;
> > +  arena= thd->activate_stmt_arena_if_needed(&backup);
> > +
> > +  gpar->select_limit=
> > +    new (thd->mem_root)Item_int(thd, lim, 20 /* max 64 bit length*/ );
>
> we have defines for that, MAX_BIGINT_WIDTH, MY_INT64_NUM_DECIMAL_DIGITS,
> LONGLONG_BUFFER_SIZE - whatever you prefer.

This come from Sanja, but I can change it. (I also updated the
comments to be "more english")



> > +  if (gpar->select_limit == 0)
> > +    return true; // EOM
> > +
> > +  unit->set_limit(gpar);
> > +
> > +  gpar->explicit_limit= true; // to show in EXPLAIN
> > +
> > +  if (arena)
> > +    thd->restore_active_arena(arena, &backup);
> > +
> > +  return false;
> > +}
> > +
> > +
> > +/**
> > +  Check posibility of LIMIT setting by rownum() of upper SELECT and do it
> > +
> > +  @note Ideal is to convert something like
> > +    SELECT ...
> > +      FROM (SELECT ...) table
> > +      WHERE rownum() < <CONSTANT>;
> > +  to
> > +    SELECT ...
> > +      FROM (SELECT ... LIMIT <CONSTANT>) table
> > +      WHERE rownum() < <CONSTANT>;
>
> 1. what if the outer select uses LIMIT, not ROWNUM?
>    why not to push that down too?

We already have code for that.

> 2. why are you not removing rownum from the outer select?

Because it may not be safe to do in some cases. Better safe than sorry..

>    if you remove rownum and reset with_rownum=false, it'll generally
>    give optimizer more freedom.

Has no effect on our optimizer in this particular case.

> > +static bool check_rownum_usage(Item_func *func_item, longlong *limit,
> > +                               bool *inv_order, const_probe probe)
> > +{
> > +  Item *arg1, *arg2;
> > +  *inv_order= 0;
> > +  DBUG_ASSERT(func_item->argument_count() == 2);
> > +
> > +  /* 'rownum op const' or 'const op field' */
> > +  arg1= func_item->arguments()[0]->real_item();
> > +  if (arg1->type() == Item::FUNC_ITEM &&
> > +      ((Item_func*) arg1)->functype() == Item_func::ROWNUM_FUNC)
> > +  {
> > +    arg2= func_item->arguments()[1]->real_item();
> > +    if ((*probe)(arg2))
> > +    {
> > +      *limit= arg2->val_int();
> > +      return *limit <= 0 || (ulonglong) *limit >= HA_POS_ERROR;
> > +    }
> > +  }
> > +  else if ((*probe)(arg1))
> > +  {
> > +    arg2= func_item->arguments()[1]->real_item();
>
> why not to assign arg2 outside of the if() ?

Because we should not evaluate it if probe is not true.
It can be expensive to evalute it or it can have side effects that
we may want to avoid.  For example, it could be a sub query
and probe is there to ensure we don't evaluate it here.



> > +    if (arg2->type() == Item::FUNC_ITEM &&
> > +        ((Item_func*) arg2)->functype() == Item_func::ROWNUM_FUNC)
> > +    {
> > +      *limit= arg1->val_int();
> > +      *inv_order= 1;
> > +      return *limit <= 0 || (ulonglong) *limit >= HA_POS_ERROR;
> > +    }
> > +  }
> > +  return 1;
> > +}
> > +
> > +
> > +/*
> > +  Limit optimization for ROWNUM()
> > +
> > +  Go through the WHERE clause and find out if there are any of the following
> > +  constructs on the top level:
> > +  rownum() <= integer_constant
> > +  rownum() <  integer_constant
> > +  rownum() = 1
> > +
> > +  If yes, then threat the select as if 'LIMIT integer_constant' would
> > +  have been used
> > +*/
> > +
> > +static void optimize_rownum(THD *thd, SELECT_LEX_UNIT *unit, Item *cond)
> > +{
> > +  DBUG_ENTER("optimize_rownum");
> > +
> > +  if (cond->type() == Item::COND_ITEM)
> > +  {
> > +    if (((Item_cond*) cond)->functype() == Item_func::COND_AND_FUNC)
> > +    {
> > +      List_iterator<Item> li(*((Item_cond*) cond)->argument_list());
> > +      Item *item;
> > +      while ((item= li++))
> > +        optimize_rownum(thd, unit, item);
> > +    }
> > +    DBUG_VOID_RETURN;
> > +  }
> > +
> > +  process_direct_rownum_comparison(thd, unit, cond, &is_const,
> > +                                   &temporary_set_limit);
>
> please, explain why it's is_const/temporary_set_limit here and
> is_basic_const/permanent_limit in optimize_upper_rownum_func().

Please ask Sanja.
I think the short version is that const/temporary is for the current select
why is_basic_const/permanent is for pushing limit from an outer select
to an inner select and there is more restrictions for that.

<cut>

> > +    switch (pred_type) {
> > +    case Item_func::LT_FUNC:                    // rownum() < #
> > +    {
> > +      if (limit <= 0)
> > +        DBUG_RETURN(false);
> > +      DBUG_RETURN((*set_limit)(thd, unit, limit - 1));
> > +    case Item_func::LE_FUNC:
> > +      DBUG_RETURN((*set_limit)(thd, unit, limit));
> > +    case Item_func::EQ_FUNC:
> > +      if (limit == 1)
> > +        DBUG_RETURN((*set_limit)(thd, unit, limit));
> > +      break;
> > +    default:
> > +      break;
> > +    }
> > +    }
>
> what if it's rownum() > 5 ?
> why not to mark it as always false right away?

We decided to not do this optimization at this point in time
Would require a lot of more work and testing to ensure
detection of wrong rownum usage would always work.
Not important for the real world.

Regards,
Monty
Follow ups

Re: c76ac1e6de8: MDEV-24089 support oracle syntax: rownum
From: Sergei Golubchik, 2021-04-19
References

Re: c76ac1e6de8: MDEV-24089 support oracle syntax: rownum
From: Sergei Golubchik, 2021-04-12