maria-developers team mailing list archive

Thread
Date
Re: Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/

To: Sergey Petrunya <psergey@xxxxxxxxxxxx>
From: Igor Babaev <igor@xxxxxxxxxxxx>
Date: Mon, 21 Feb 2011 15:20:13 -0800
Cc: maria-developers <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20110221203008.GB23800@pslp2>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101026 SUSE/3.0.10 Thunderbird/3.0.10
On 02/21/2011 12:30 PM, Sergey Petrunya wrote:
> On Fri, Feb 11, 2011 at 08:49:37PM -0800, Igor Babaev wrote:
>> Hi Sergey,
>>
>> Please review this patch that fixes a defect of the current
>> implementation of the mwl #128 that never couples hash join
>> with range/index_merge scans.
>> The patch also adds prefix #hash# to the names of the used hash indexes.
>> (we agreed upon this at the last optimizer meeting).
>>
>> Regards,
>> Igor.
>>
>> -------- Original Message --------
>> Subject: [Commits] Rev 2901: BNLH algorithm always used a full table
>> scan over the joined table in
>> file:///home/igor/maria/maria-5.3-mwl128-hashrange/
>> Date: Fri, 11 Feb 2011 20:41:23 -0800 (PST)
>> From: Igor Babaev <igor@xxxxxxxxxxxx>
>> Reply-To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> To: <commits@xxxxxxxxxxx>
>>
>> At file:///home/igor/maria/maria-5.3-mwl128-hashrange/
>>
>> ------------------------------------------------------------
>> revno: 2901
>> revision-id: igor@xxxxxxxxxxxx-20110212044122-w9n3jdk3d2ps0w3o
>> parent: igor@xxxxxxxxxxxx-20110207231903-3rqbfs50d33lk3r9
>> committer: Igor Babaev <igor@xxxxxxxxxxxx>
>> branch nick: maria-5.3-mwl128-hashrange
>> timestamp: Fri 2011-02-11 20:41:22 -0800
>> message:
>>   BNLH algorithm always used a full table scan over the joined table
>>   even in the cases when there existed range/index-merge scans that
>>   were cheaper than the full table scan.
>>   This was a defect/bug of the implementation of mwl #128.
>>   Now hash join can work not only with full table scan of the joined
>>   table, but also with full index scan, range and index-merge scans.
>>   Accordingly, in the cases when hash join is used the column 'type'
>>   in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
>>   and 'hash_index_merge'. If hash join is coupled with a range/index_merge
>>   scan then the columns 'key' and 'key_len' contain info not only on
>>   the used hash index, but also on the indexes used for the scan.
> ...
>> === modified file 'sql/sql_select.cc'
>> --- a/sql/sql_select.cc	2011-02-06 04:57:03 +0000
>> +++ b/sql/sql_select.cc	2011-02-12 04:41:22 +0000
> ...
>> @@ -8419,6 +8438,11 @@
>>            sort_by_tab->type= JT_ALL;
>>            sort_by_tab->read_first_record= join_init_read_record;
>>          }
>> +        else if (sort_by_tab->type == JT_HASH_NEXT)
>> +        {
>> +          sort_by_tab->type= JT_HASH;
>> +          sort_by_tab->read_first_record= join_init_read_record;
>> +        }
>>        }
>>        break;
>>      }
> I have a question only to this part of the patch: I don't understand how the
> above code could work. 
> 
> The if statement above the one you've added checks if sort_by_tab was using
> [ordered] index scan to produce records in orderer, and if yes, switches it to
> a full scan (because use of join buffering will break ordering anyway).
> 
> I suppose the part you've added does something similar, but I totally fail to
> understand what exactly it does.

The above code says:
if hash join is used do not use full index scan to look through the
joined table, rather use full table scan in this case.

> 
> Now, some comments on the new EXPLAIN output:
> 
> 
> ISSUE1. 
> 
> Consider a query:
> MariaDB [j1]> explain select * from t3, t4 where t3.col1=t4.col2;
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> | id | select_type | table | type     | possible_keys | key       | key_len | ref        | rows | Extra                                            |
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> |  1 | SIMPLE      | t3    | ALL      | NULL          | NULL      | NULL    | NULL       | 1000 | Using where                                      |
> |  1 | SIMPLE      | t4    | hash_ALL | NULL          | #hash#$hj | 5       | j1.t3.col1 | 1000 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> 2 rows in set (0.00 sec)
> 
> AFAIU #hash#$hj means the hash key on the hashtable that's used for
> hash join. I think the name is too convoluted.  The hashtable has only one
> key, if we need to tell it from t4's indexes it would be sufficient to use
> "#hashkey#", without the cryptic "$hj".
> 
> As for "#" characters: EXPLAIN already shows "internal" tables/indexes:

If you look into #maria-call log you'll see that I followed strictly
Monty's instructions:
to use the prefix #hash# before the index name I used to use, i.e.
<used_index> should be substituted for #hash#<used_index>.

The only deviation I afforded myself was: I used $hj instead of hj_key.

> 
> MariaDB [j1]> explain select * from (select * from t1) X;
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
> | id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra |
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
> |  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 1000 |       |
> |  2 | DERIVED     | t1         | ALL  | NULL          | NULL | NULL    | NULL | 1000 |       |
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
> 
> # The following is from MWL#90 tree:
> MariaDB [test]> explain select * from t10 where a in (select max(b) from t10 group by a);
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> | id | select_type | table       | type   | possible_keys | key          | key_len | ref        | rows | Extra                           |
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> |  1 | PRIMARY     | t10         | ALL    | NULL          | NULL         | NULL    | NULL       | 1000 | Using where                     |
> |  1 | PRIMARY     | <subquery2> | eq_ref | distinct_key  | distinct_key | 5       | test.t10.a |    1 |                                 |
> |  2 | SUBQUERY    | t10         | ALL    | NULL          | NULL         | NULL    | NULL       | 1000 | Using temporary; Using filesort |
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> 3 rows in set (0.01 sec)
> 
> # The following is from current 5.3:
> MariaDB [j1]> explain select distinct * from t1 where a in (select b from t2);
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
> | id | select_type | table      | type   | possible_keys | key        | key_len | ref  | rows | Extra           |
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
> |  1 | PRIMARY     | t1         | ALL    | NULL          | NULL       | NULL    | NULL | 1000 | Using temporary |
> |  1 | PRIMARY     | subselect2 | eq_ref | unique_key    | unique_key | 5       | func |    1 |                 |
> |  2 | SUBQUERY    | t2         | ALL    | NULL          | NULL       | NULL    | NULL | 1000 | Distinct        |
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
> 
> 
> ... and it either uses no quoting at all (see "unique_key"), or angle brackets
> quoting (see <subquery2>). I think we should not multiply the ways of quoting,
> so please consider switching to <quoting> or to not using any quoting at all.

We use # in generated names for temporary tables. I don't see any
problem here.
> 
> ISSUE2
> 
> MariaDB [j1]> EXPLAIN
>     -> SELECT City.Name, Country.Name, CountryLanguage.Language
>     -> FROM City,Country,CountryLanguage
>     -> WHERE City.Country=Country.Code AND
>     -> CountryLanguage.Country=Country.Code AND
>     -> City.Name LIKE 'L%' AND Country.Population > 3000000 AND
>     -> CountryLanguage.Percentage > 50 AND
>     -> LENGTH(Language) < LENGTH(City.Name) - 2;
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> | id | select_type | table           | type     | possible_keys      | key           | key_len | ref                        | rows | Extra                                            |
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> |  1 | SIMPLE      | CountryLanguage | ALL      | PRIMARY,Percentage | NULL          | NULL    | NULL                       |  984 | Using where                                      |
> |  1 | SIMPLE      | Country         | hash_ALL | PRIMARY            | #hash#PRIMARY | 3       | j1.CountryLanguage.Country |  239 | Using where; Using join buffer (flat, BNLH join) |
> |  1 | SIMPLE      | City            | hash_ALL | Country            | #hash#Country | 3       | j1.CountryLanguage.Country | 4079 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> 3 rows in set (0.01 sec)
> 
> "key" column shows "#hash#PRIMARY". I suppose, this is because hash join opimizer is
> internally hooked into ref optimizer, and it used the PRIMARY index.  This
> might be useful for developer, but is this information meaningful for the user?
> If one changes #hash#PRIMARY to #hash#IDX2  (suppose there is an index called
> IDX2 which has the same definition as primary key), or to #hash#$hj, will it be
> a different query plan?  I think it should be "<hashkey>" always.
> 
We can use (and are going to use it) statistics provided for used_index.
So #hash#<associated_index> is quite relevant here.
> 
> ISSUE3
> For hash_range, "key" column shows {hashkey}:{range_key}, which allows to
> determine that which key was used for range access. For hash_index type, 
> "key" column shows only the hash's pseudo-index: 
> 
> MariaDB [j3]> EXPLAIN SELECT t2.i FROM t1,t2 WHERE t1.cu = t2.cl;
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
> | id | select_type | table | type       | possible_keys | key       | key_len | ref  | rows | Extra                                            |
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
> |  1 | SIMPLE      | t2    | ALL        | NULL          | NULL      | NULL    | NULL |    6 |                                                  |
> |  1 | SIMPLE      | t1    | hash_index | cu1,cu2       | #hash#cu1 | 33      | func |   10 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
> 
> What is the index that's used for scanning the table t1?  "#hash#cu1" shows
> that hash join optimizer used access on index "cu1" in its analysis, but
> generally it doesn't mean that full index scan on table t1 was done on index
> cu1, doesn't it? 
> I would very much prefer to see #hash#cu1:USED_INDEX there (or taking into
> account previous suggestions, hashkey:USED_INDEX). It will also be
> consistent with how hash_range and hash_index_merge use that column.
>
Yes, the scan index is missing here. I will add.


> ISSUE4
> If key and key_len use hash_part:real_part notation, i.e. use semicolon, why
> use underscore in hash_ALL, hash_index, etc? Won't hash:ALL, hash:index look
> more consistent with other columns?

again: I follow here Monty's recommendations. Nothing more, nor less.

Regards,
Igor.
> 
> 
> 
> 
> 
> 
> BR
>  Sergey
Follow ups

Re: Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/
From: Sergey Petrunya, 2011-02-22
References

Re: Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/
From: Sergey Petrunya, 2011-02-21