maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #03650
Performance problem: need index-based access when have only non-sargable conditions
Hi!
I was asked about this performance problem: consider a query
select * from tbl where tbl.key like '%foo%'
Table records are big, much larger than the length of the key. LIKE condition
is very selective, however its pattern starts with '%', so we can't construct
range access (and they are really searching for matches in word
suffixes/infixes, so fulltext index isn't an option).
The desired execution strategy is:
- Scan the index on tbl.key and collect index tuples that match the LIKE
condition.
- For those index tuples, retrieve full records.
One can simulate it by rewriting the query as a self join:
select * from tbl A, tbl B
where
B.primary_key = A.primary_key AND B.key LIKE '%foo%';
However, in this particular case they have a problem doing such rewrites
because the queries are generated by some ORM tool.
Proposed solution from our side
-------------------------------
It is not difficult to add an execution strategy like this
get_next_record()
{
read index tuple;
if (condition satisified)
{
read full record
}
}
(with optional rowid-ordered retrieval for full records).
The difficult part is to find a way to decided whether this strategy should be
used. We have no meaningful selectivity estimates for 'column LIKE '%foo%'.
The only way we could achieve this is by having the user manually specify
condition selectivities. I can see one way to achieve this without getting into
painful syntax modifications:
select * from tbl where selectivity(tbl.key like '%foo%', 0.05)
here 'selectivity(X,y)' will evaluate to value of X, but will also inform the
optimizer that P(X is true)=y.
Any thoughts about this?
BR
Sergey
--
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
Follow ups