maria-developers team mailing list archive

Re: [Commits] 56d339111a4: MDEV-16374: Filtered shows 0 for materilization scan for a semi join, which makes optimizer always picks

 

Hi Varun,

On Mon, Jun 04, 2018 at 12:34:49PM +0530, Varun wrote:
> revision-id: 56d339111a4e5c868ffa3b1f2538c5f101f67767 (mariadb-10.0.30-366-g56d339111a4)
> parent(s): a61724a3ca187f6eb44fc67948acb76a94efa783
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2018-06-04 12:26:56 +0530
> message:
> 
> MDEV-16374: Filtered shows 0 for materilization scan for a semi join, which makes optimizer always picks
>             materialization scan over materialization lookup
> 
> For non-mergeable semi-joins we don't store the estimates of the IN subquery in table->file->stats.records.
> In the function TABLE_LIST::fetch_number_of_rows, we store the number of rows in the tables
> (estimates in case of derived table/views).
> Currently we don't store the estimates for non-mergeable semi-joins, which leads to a problem of selecting
> materialization scan over materialization lookup.
> Fixed this by storing these estimated appropriately
> 

* This is a change in the optimizer. Should we really do it in 10.0? In my
opinion, this should go into 10.3 or 10.4.  Let's discuss this, and I think
Igor may have something to say, too.

* The patch updates selectivity.test but not selectivity_innodb.result
(selectivity_innodb.test includes selectivity.test, so its result file needs
the same update).


* Please see my last comments at https://jira.mariadb.org/browse/MDEV-16374 .
Here is an alternative patch (it passes the test suite and produces the same
results for the selectivity* tests):

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 1285835..884997c 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -6602,7 +6602,7 @@ double matching_candidates_in_table(JOIN_TAB *s, bool with_found_constraint,
   {
     TABLE *table= s->table;
     double sel= table->cond_selectivity;
-    double table_records= (double)table->stat_records();
+    double table_records= s->records; // psergey-10
     dbl_records= table_records * sel;
     return dbl_records;
   }
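
To make the effect of the one-line change concrete, here is a toy sketch of
the two ways of computing the estimate. It is not server code: ToyTable,
ToyJoinTab and both function names are hypothetical stand-ins, and the numbers
in the note below are made up. It only assumes what is stated in this thread,
namely that for a JTBM semi-join the estimate lives in JOIN_TAB::records while
the table-level row count is left unset.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the server structures.
// These are NOT the real MariaDB definitions.
struct ToyTable {
  double stat_records;      // plays the role of table->stat_records()
  double cond_selectivity;  // plays the role of table->cond_selectivity
};

struct ToyJoinTab {
  ToyTable *table;
  double records;           // plays the role of JOIN_TAB::records
};

// Estimate as computed before the alternative patch:
// the row count is taken from the table-level statistics.
double candidates_from_table_stats(const ToyJoinTab &s) {
  assert(s.table->cond_selectivity >= 0.0 && s.table->cond_selectivity <= 1.0);
  return s.table->stat_records * s.table->cond_selectivity;
}

// Estimate as computed with the alternative patch:
// the row count is taken from the join tab itself.
double candidates_from_join_tab(const ToyJoinTab &s) {
  assert(s.table->cond_selectivity >= 0.0 && s.table->cond_selectivity <= 1.0);
  return s.records * s.table->cond_selectivity;
}
```

With a table-level count of 0, a join-tab estimate of 100 rows and a
selectivity of 0.5, the first function returns 0 and the second returns 50.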

What do you think about it?
I would like to get Igor's opinion also because I'm not certain which member
variable should be changed here.

(Why does a derived table modify table->stats.records while JTBM semi-joins
modify JOIN_TAB::records? These look like two solutions to the same problem.)

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog