← Back to team overview

maria-discuss team mailing list archive

Re: Maraidb 10.1, Dependent Subqueries, much slower than 5.5

 

It appears I was mistaken about persistent stats, confusing them with engine independent table stats (thought they were the same thing), which are persistent as well. So, my settings which were intended to do something else, caused this issue.

Am I to understand correctly, that EITS are *never* updated unless another analyze table is done (assuming I do not update them manually)?


> On Apr 7, 2017, at 11:26 AM, Sales <info@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> 
> Yep, I noticed that as well, wondering if maybe persistent (engine independent) stats has something to do with this. Mine are set to preferably. For this table, Product_Inventory_Summed, there are 15,841,316 rows, a few more than the 1. There are no persistent stats for this table. With preferred, that’s supposed to mean it reverts back to engine stats. I believe it’s estimated count is the crux of the matter. 
> 
> Here’s the explain from the variant query on 5.5 so I can get an explain. But note it’s a different server with a slightly larger number of rows, however, it’s exactly the same structure, etc.
> 
> MariaDB [feeds]> explain 
>    -> select 
>    ->  * ,
>    ->  (select 
>    ->     sum(ifnull(pi.Quantity_On_Hand, pi.Other_Quantity)) 
>    ->   from
>    ->     Product_Inventory_Summed pi 
>    ->   where 
>    ->     pi.Product_ID = pm.Product_ID 
>    ->   group by 
>    ->     pi.Product_ID
>    ->  )
>    -> from Product_Master pm;
> +------+--------------------+-------+------+---------------+---------+---------+---------------------+---------+-------------+
> | id   | select_type        | table | type | possible_keys | key     | key_len | ref                 | rows    | Extra       |
> +------+--------------------+-------+------+---------------+---------+---------+---------------------+---------+-------------+
> |    1 | PRIMARY            | pm    | ALL  | NULL          | NULL    | NULL    | NULL                | 7537459 |             |
> |    2 | DEPENDENT SUBQUERY | pi    | ref  | PRIMARY       | PRIMARY | 22      | feeds.pm.Product_ID |       1 | Using where |
> +------+--------------------+-------+------+---------------+---------+---------+---------------------+---------+-------------+
> 
> Setting use_stat_tables=never does not change the explain on 10.1. This might indicate an issue with innodb stats. Looking at the new 10.1 stats variables for innodb, I see a bunch of new ones. I look specifically at innodb_stats variables:
> 
> +--------------------------------------+-------------+
> | Variable_name                        | Value       |
> +--------------------------------------+-------------+
> | innodb_stats_auto_recalc             | OFF         |
> | innodb_stats_include_delete_marked   | OFF         |
> | innodb_stats_method                  | nulls_equal |
> | innodb_stats_modified_counter        | 0           |
> | innodb_stats_on_metadata             | OFF         |
> | innodb_stats_persistent              | ON          |
> | innodb_stats_persistent_sample_pages | 20          |
> | innodb_stats_sample_pages            | 12          |
> | innodb_stats_traditional             | ON          |
> | innodb_stats_transient_sample_pages  | 12          |
> +--------------------------------------+——————+ 
> 
> I wonder about innodb_stats_on_metadata, especially as new tables are created. This summed table is created in a previous step, not sure when innodb statistics are updated any more if this is set to off, the new default. Might have to try turning that back on.
> 
> 
> 
>> On Apr 7, 2017, at 5:41 AM, Sergey Petrunia <sergey@xxxxxxxxxxx> wrote:
>> 
>> Hello,
>> 
>> On Thu, Apr 06, 2017 at 01:51:09PM -0500, Sales wrote:
>>> We are noticing many dependent subqueries not working as well as they did on Mariadb 5.5, we’re having to add a force index.
>>> 
>>> Relatively simple example:
>>> 
>>> explain update Product_Master pm set Total_Inventory = (select sum(ifnull(pi.Quantity_On_Hand, pi.Other_Quantity)) from Product_Inventory_Summed pi where pi.Product_ID = pm.Product_ID group by pi.Product_ID);
>>> +------+--------------------+-------+-------+---------------+---------+---------+------+---------+-------------+
>>> | id   | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra       |
>>> +------+--------------------+-------+-------+---------------+---------+---------+------+---------+-------------+
>>> |    1 | PRIMARY            | pm    | index | NULL          | PRIMARY | 22      | NULL | 5001829 |             |
>>> |    2 | DEPENDENT SUBQUERY | pi    | ALL   | PRIMARY       | NULL    | NULL    | NULL |       1 | Using where |
>>> +------+--------------------+-------+-------+---------------+---------+---------+------+---------+——————+
>>> 
>>> This query runs in a couple of minutes on 5.5, runs for many many hours on 10.1
>> 
>> I take the EXPLAIN output is from 10.1?  The plan is to do a full table scan on
>> Product_Inventory_Summed, and the optimizer expects to read just one row. How 
>> many rows does Product_Inventory_Summed table actually have?
>> 
>> (It would be nice to get an EXPLAIN from 5.5, but this is an UPDATE statement, and
>> EXPLAIN UPDATE is not supported in 5.5. Can you run a SELECT variant on 5.5:
>> 
>> explain 
>> select 
>> * ,
>> (select 
>>    sum(ifnull(pi.Quantity_On_Hand, pi.Other_Quantity)) 
>>  from
>>    Product_Inventory_Summed pi 
>>  where 
>>    pi.Product_ID = pm.Product_ID 
>>  group by 
>>    pi.Product_ID
>> )
>> from Praoduct_Master pm;
>> 
>> 
>>> By adding the force index, it is back to “normal”. 
>>> 
>>> explain update Product_Master pm set Total_Inventory = (select sum(ifnull(pi.Quantity_On_Hand, pi.Other_Quantity)) from Product_Inventory_Summed pi force index (PRIMARY) where pi.Product_ID = pm.Product_ID group by pi.Product_ID);
>>> +------+--------------------+-------+-------+---------------+---------+---------+---------------------+---------+-------+
>>> | id   | select_type        | table | type  | possible_keys | key     | key_len | ref                 | rows    | Extra |
>>> +------+--------------------+-------+-------+---------------+---------+---------+---------------------+---------+-------+
>>> |    1 | PRIMARY            | pm    | index | NULL          | PRIMARY | 22      | NULL                | 5001829 |       |
>>> |    2 | DEPENDENT SUBQUERY | pi    | ref   | PRIMARY       | PRIMARY | 22      | feeds.pm.Product_ID |       1 |       |
>>> +------+--------------------+-------+-------+---------------+---------+---------+---------------------+---------+———+
>>> 
>>> We are using persistent statistics on both tables, so, it knows there are a
>>> lot of rows in each. I don’t see any optimizer switches that are turned off.
>> 
>> Just to clarify: are you using "Engine indepdent table statistics", like 
>> documented here:
>> https://mariadb.com/kb/en/mariadb/engine-independent-table-statistics/
>> 
>> or just innodb_stats_persistent=ON?
>> 
>> what values do you have for @@optimizer_use_condition_selectivity and
>> @@use_stat_tables?
>> 
>>> 
>>> Any hints as to why or is this a current limitation of 10.1?
>> 
>> Shouldn't be happening. Let's continue to discuss and find out.
>> 
>> BR
>> Sergei
>> -- 
>> Sergei Petrunia, Software Developer
>> MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
>> 
>> 
> 
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp



References