maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #03928
Re: Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/
On 02/21/2011 12:30 PM, Sergey Petrunya wrote:
> On Fri, Feb 11, 2011 at 08:49:37PM -0800, Igor Babaev wrote:
>> Hi Sergey,
>>
>> Please review this patch that fixes a defect of the current
>> implementation of the mwl #128 that never couples hash join
>> with range/index_merge scans.
>> The patch also adds prefix #hash# to the names of the used hash indexes.
>> (we agreed upon this at the last optimizer meeting).
>>
>> Regards,
>> Igor.
>>
>> -------- Original Message --------
>> Subject: [Commits] Rev 2901: BNLH algorithm always used a full table
>> scan over the joined table in
>> file:///home/igor/maria/maria-5.3-mwl128-hashrange/
>> Date: Fri, 11 Feb 2011 20:41:23 -0800 (PST)
>> From: Igor Babaev <igor@xxxxxxxxxxxx>
>> Reply-To: maria-developers@xxxxxxxxxxxxxxxxxxx
>> To: <commits@xxxxxxxxxxx>
>>
>> At file:///home/igor/maria/maria-5.3-mwl128-hashrange/
>>
>> ------------------------------------------------------------
>> revno: 2901
>> revision-id: igor@xxxxxxxxxxxx-20110212044122-w9n3jdk3d2ps0w3o
>> parent: igor@xxxxxxxxxxxx-20110207231903-3rqbfs50d33lk3r9
>> committer: Igor Babaev <igor@xxxxxxxxxxxx>
>> branch nick: maria-5.3-mwl128-hashrange
>> timestamp: Fri 2011-02-11 20:41:22 -0800
>> message:
>> BNLH algorithm always used a full table scan over the joined table
>> even in the cases when there existed range/index-merge scans that
>> were cheaper than the full table scan.
>> This was a defect/bug of the implementation of mwl #128.
>> Now hash join can work not only with full table scan of the joined
>> table, but also with full index scan, range and index-merge scans.
>> Accordingly, in the cases when hash join is used the column 'type'
>> in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
>> and 'hash_index_merge'. If hash join is coupled with a range/index_merge
>> scan then the columns 'key' and 'key_len' contain info not only on
>> the used hash index, but also on the indexes used for the scan.
> ...
>> === modified file 'sql/sql_select.cc'
>> --- a/sql/sql_select.cc 2011-02-06 04:57:03 +0000
>> +++ b/sql/sql_select.cc 2011-02-12 04:41:22 +0000
> ...
>> @@ -8419,6 +8438,11 @@
>> sort_by_tab->type= JT_ALL;
>> sort_by_tab->read_first_record= join_init_read_record;
>> }
>> + else if (sort_by_tab->type == JT_HASH_NEXT)
>> + {
>> + sort_by_tab->type= JT_HASH;
>> + sort_by_tab->read_first_record= join_init_read_record;
>> + }
>> }
>> break;
>> }
> I have a question only to this part of the patch: I don't understand how the
> above code could work.
>
> The if statement above the one you've added checks if sort_by_tab was using
> [ordered] index scan to produce records in orderer, and if yes, switches it to
> a full scan (because use of join buffering will break ordering anyway).
>
> I suppose the part you've added does something similar, but I totally fail to
> understand what exactly it does.
The above code says:
if hash join is used do not use full index scan to look through the
joined table, rather use full table scan in this case.
>
> Now, some comments on the new EXPLAIN output:
>
>
> ISSUE1.
>
> Consider a query:
> MariaDB [j1]> explain select * from t3, t4 where t3.col1=t4.col2;
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> | 1 | SIMPLE | t3 | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
> | 1 | SIMPLE | t4 | hash_ALL | NULL | #hash#$hj | 5 | j1.t3.col1 | 1000 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-------+----------+---------------+-----------+---------+------------+------+--------------------------------------------------+
> 2 rows in set (0.00 sec)
>
> AFAIU #hash#$hj means the hash key on the hashtable that's used for
> hash join. I think the name is too convoluted. The hashtable has only one
> key, if we need to tell it from t4's indexes it would be sufficient to use
> "#hashkey#", without the cryptic "$hj".
>
> As for "#" characters: EXPLAIN already shows "internal" tables/indexes:
If you look into #maria-call log you'll see that I followed strictly
Monty's instructions:
to use the prefix #hash# before the index name I used to use, i.e.
<used_index> should be substituted for #hash#<used_index>.
The only deviation I afforded myself was: I used $hj instead of hj_key.
>
> MariaDB [j1]> explain select * from (select * from t1) X;
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
> | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | |
> | 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | |
> +----+-------------+------------+------+---------------+------+---------+------+------+-------+
>
> # The following is from MWL#90 tree:
> MariaDB [test]> explain select * from t10 where a in (select max(b) from t10 group by a);
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> | 1 | PRIMARY | t10 | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
> | 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 5 | test.t10.a | 1 | |
> | 2 | SUBQUERY | t10 | ALL | NULL | NULL | NULL | NULL | 1000 | Using temporary; Using filesort |
> +----+-------------+-------------+--------+---------------+--------------+---------+------------+------+---------------------------------+
> 3 rows in set (0.01 sec)
>
> # The following is from current 5.3:
> MariaDB [j1]> explain select distinct * from t1 where a in (select b from t2);
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
> | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1000 | Using temporary |
> | 1 | PRIMARY | subselect2 | eq_ref | unique_key | unique_key | 5 | func | 1 | |
> | 2 | SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 1000 | Distinct |
> +----+-------------+------------+--------+---------------+------------+---------+------+------+-----------------+
>
>
> ... and it either uses no quoting at all (see "unique_key"), or angle brackets
> quoting (see <subquery2>). I think we should not multiply the ways of quoting,
> so please consider switching to <quoting> or to not using any quoting at all.
We use # in generated names for temporary tables. I don't see any
problem here.
>
> ISSUE2
>
> MariaDB [j1]> EXPLAIN
> -> SELECT City.Name, Country.Name, CountryLanguage.Language
> -> FROM City,Country,CountryLanguage
> -> WHERE City.Country=Country.Code AND
> -> CountryLanguage.Country=Country.Code AND
> -> City.Name LIKE 'L%' AND Country.Population > 3000000 AND
> -> CountryLanguage.Percentage > 50 AND
> -> LENGTH(Language) < LENGTH(City.Name) - 2;
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> | 1 | SIMPLE | CountryLanguage | ALL | PRIMARY,Percentage | NULL | NULL | NULL | 984 | Using where |
> | 1 | SIMPLE | Country | hash_ALL | PRIMARY | #hash#PRIMARY | 3 | j1.CountryLanguage.Country | 239 | Using where; Using join buffer (flat, BNLH join) |
> | 1 | SIMPLE | City | hash_ALL | Country | #hash#Country | 3 | j1.CountryLanguage.Country | 4079 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-----------------+----------+--------------------+---------------+---------+----------------------------+------+--------------------------------------------------+
> 3 rows in set (0.01 sec)
>
> "key" column shows "#hash#PRIMARY". I suppose, this is because hash join opimizer is
> internally hooked into ref optimizer, and it used the PRIMARY index. This
> might be useful for developer, but is this information meaningful for the user?
> If one changes #hash#PRIMARY to #hash#IDX2 (suppose there is an index called
> IDX2 which has the same definition as primary key), or to #hash#$hj, will it be
> a different query plan? I think it should be "<hashkey>" always.
>
We can use (and are going to use it) statistics provided for used_index.
So #hash#<associated_index> is quite relevant here.
>
> ISSUE3
> For hash_range, "key" column shows {hashkey}:{range_key}, which allows to
> determine that which key was used for range access. For hash_index type,
> "key" column shows only the hash's pseudo-index:
>
> MariaDB [j3]> EXPLAIN SELECT t2.i FROM t1,t2 WHERE t1.cu = t2.cl;
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
> | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 6 | |
> | 1 | SIMPLE | t1 | hash_index | cu1,cu2 | #hash#cu1 | 33 | func | 10 | Using where; Using join buffer (flat, BNLH join) |
> +----+-------------+-------+------------+---------------+-----------+---------+------+------+--------------------------------------------------+
>
> What is the index that's used for scanning the table t1? "#hash#cu1" shows
> that hash join optimizer used access on index "cu1" in its analysis, but
> generally it doesn't mean that full index scan on table t1 was done on index
> cu1, doesn't it?
> I would very much prefer to see #hash#cu1:USED_INDEX there (or taking into
> account previous suggestions, hashkey:USED_INDEX). It will also be
> consistent with how hash_range and hash_index_merge use that column.
>
Yes, the scan index is missing here. I will add.
> ISSUE4
> If key and key_len use hash_part:real_part notation, i.e. use semicolon, why
> use underscore in hash_ALL, hash_index, etc? Won't hash:ALL, hash:index look
> more consistent with other columns?
again: I follow here Monty's recommendations. Nothing more, nor less.
Regards,
Igor.
>
>
>
>
>
>
> BR
> Sergey
Follow ups
References