← Back to team overview

maria-developers team mailing list archive

Re: Proposal: change SUBQUERY to MATERIALIZE

 

On Thu, Dec 01, 2011 at 01:12:00PM +0200, Timour Katchaounov wrote:
> Ok, I agree to change to MATERIALIZED (not MATERIALIZE).
> Who will do the change?

I could make it after I'm done with my current crashing bug.

> In addition, in the EXTRA field there should be info
> which materialization is used. If I am not mistaken,
> there are three variants:
> - SJ-mat,
> - SJ-inside-out,
> - NON-SJ-mat
> (better terms are needed for explain).
>
(will reply to this separately)

> Timour
>
> On  1.12.2011 12:54, Sergei Petrunia wrote:
>> Reply-To:
>>
>> Hello,
>>
>> I think we have already discussed this before (multiple times), and the
>> idea was received with some conditional approval of the idea. Nothing has
>> been done or put to paper, though, so now I'm posting this in writing.
>>
>> I'm posting this now, because we're close to RC and I think it's better to do
>> this kind of changes before the RC.
>>
>> The proposal is: if a subquery is materialized (that is, the subquery is run
>> and it's output is stored with a temporary table with unique key), EXPLAIN
>> should not show "SUBQUERY" (like it does now), it should show "MATERIALIZE".
>>
>> Consider the following two queries. The EXPLAINs are the same, although one of
>> them runs materialization and the other does not:
>>
>> MariaDB [test]>  EXPLAIN select * from ten where a<  any (select max(a) from ten);
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> |  1 | PRIMARY     | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where |
>> |  2 | SUBQUERY    | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 |             |
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> 2 rows in set (0.00 sec)
>>
>> MariaDB [test]>  EXPLAIN select * from ten where a = any (select max(a) from ten) or a<  3;
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> |  1 | PRIMARY     | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where |
>> |  2 | SUBQUERY    | ten   | ALL  | NULL          | NULL | NULL    | NULL |   10 |             |
>> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
>> 2 rows in set (0.00 sec)
>>
>> You can only tell that the second one uses Materialization if look at EXPLAIN
>> EXTENDED warning text.  I think it is extremely bad that radically new
>> execution strategy lurks somewhere inside EXPLAIN EXTENDED.
>>
>>
>> I remember you've used to argue that 'SUBUQERY' represents the fact that the
>> subquqery is evaluated once, and materialization should be shown elsewhere. I
>> object to that argument
>> - materialization is currently shown *nowhere* (except EXPLAIN EXTENDED which
>>    is pain to read except for the most simplest queries)
>> - Derived table subqueries, which are also executed once (and stored in a
>>    temporary table, like with materialization), have their special select_type
>>    value 'DERIVED':
>>
>> MariaDB [test]>  explain select * from one_k, (select max(A.a+B.a) from ten A, ten B ) foo;
>> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+
>> | id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra                              |
>> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+
>> |  1 | PRIMARY     |<derived2>  | ALL  | NULL          | NULL | NULL    | NULL |  100 |                                    |
>> |  1 | PRIMARY     | one_k      | ALL  | NULL          | NULL | NULL    | NULL | 1000 | Using join buffer (flat, BNL join) |
>> |  2 | DERIVED     | A          | ALL  | NULL          | NULL | NULL    | NULL |   10 |                                    |
>> |  2 | DERIVED     | B          | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using join buffer (flat, BNL join) |
>> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+
>> 4 rows in set (0.02 sec)
>>
>>
>> I think the above should be sufficient to make a decision to change
>> Materialized subqueries to show 'MATERIALIZED' (or 'MATERIALIZE') in EXPLAIN
>> output.  Any objections?
>>
>>
>>
>>

-- 
BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog


References