← Back to team overview

maria-developers team mailing list archive

Re: New patch on the JSON_TABLE.


Hi Alexey,

On Mon, Sep 14, 2020 at 03:38:38PM +0400, Alexey Botchkov wrote:
> Hi, Sergei!.
> so the branch name is 'bb-mdef17399-hf'.
> It has slightly changed since last patch so please pay attention.
> There i present the patch that resolves issues that you mentioned
> in your last comments.
> One important is not fixed though.
> That is:
> INSERT INTO t1 VALUES (1, '{\"1\": 1}'), (2, '{\"1\": 2}'), (3, '{\"1\":
> 3}'),
> (4, '{\"1\": 4}'), (5, '{\"1\": 5}'), (6, '{\"1\": 6}');
>   (SELECT id FROM t1 as tt2,
>       JSON_TABLE(f1, "$" COLUMNS (jf FOR ORDINALITY)) AS tbl);
> That SELECT crashes in the optimizer, as the greedy_search() can't find any
> satisfying plan.
> I got rid of the crash with this line:
> @@ -9520,7 +9528,7 @@ best_extension_by_limited_search(JOIN      *join,
>      table_map real_table_bit= s->table->map;
>      if ((remaining_tables & real_table_bit) &&
>          (allowed_tables & real_table_bit) &&
> -        !(remaining_tables & s->dependent) &&
> +//        !(remaining_tables & s->dependent) &&
> But i don't mean it's an acceptable fix.
> It seems to me the problem is in the optimizer, so i'd like to ask your
> opinion.
> Maybe you just know the answer at once.

After some debugging I see that the following happens:

1. Execution reaches Table_function_json_table::setup

It computes

   m_dep_tables= m_json->used_tables();

this is =1 (In the subquery, table tt2 has map=1)

2. Then, subquery is converted into a semi-join. That is, it is merged into its
parent subquery. There, we have:

table t1: map=1
table tt2:  map=2
table JSON_TABLE(...) AS tbl: map=3

3.  The subquery was uncorrelated, so SJ-Materialization is a possible option and
the join optimizer attempts to construct a join order for the subquery tables.

However it can't succeed, as the table "JSON_TABLE(..) AS tbl" is set to be
dependent on table with map=1, which is not put into the join order because it
is not a part of subquery.
As a result, we get assertion failure when the optimizer fails to produce any
join orders.

The probem here is on step #2. 
When the subquery's tables are moved to the parent query, they get new values 
of TABLE::map, and also all their attributes that contain table maps need to 
be updated.

This done here in convert_subq_to_sj() function:

  /* n. Walk through child's tables and adjust table->map */
  List_iterator_fast<TABLE_LIST> si(subq_lex->leaf_tables);
  while ((tl= si++))
    if (tl->is_jtbm())
      tl->jtbm_table_no= table_no;
      Item *dummy= tl->jtbm_subselect;
      tl->jtbm_subselect->fix_after_pullout(parent_lex, &dummy, true);

In that code one can see: if the subquery has table that has 'is_jtbm()==true',
then ... some processing is done to re-compute its attributes after the table
has been pulled up into the parent select.

The same should be done for TABLE_LIST elements that are JSON_TABLE(..). 

If you'll introduce a new function, fix_after_pullout() is a good name. It is
already used for Item-derived classes.

> If not, i'll dig into it.
> And you're welcome to observe the new patch.
> Best regards.
> HF

Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog

Follow ups