maria-developers team mailing list archive

Thread
Date

Re: Question about Query plan

To: Sergei Petrunia <psergey@xxxxxxxxxxxx>
From: Markus Pilman <mpilman@xxxxxxxxxxx>
Date: Mon, 14 May 2012 16:31:18 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20120514135803.GD25281@pslp2>

Ah cool, thanks a lot for this information, that is all very useful!
So we added the implementation of these two hanlder methods on our
todo list. I also did not know about the keyread_time (etc) functions.

Thanks and best regards

Markus

On Mon, May 14, 2012 at 3:58 PM, Sergei Petrunia <psergey@xxxxxxxxxxxx> wrote:
> On Mon, May 14, 2012 at 03:15:09PM +0200, Markus Pilman wrote:
>> > - Do you really records_in_range() implementation like in (*)
>> Yes.
>>
>> > - Are estimates of 10 and 20 rows close to reality?
>> Not at all... The thing is: we can implement records_in_range not as a
>> function with constant complexity and since a full table scan is not
>> faster than a index range lookup in our storage engine (even if the
>> whole table is contained in the range), we wanted to force MySQL to
>> make always an index lookup. If I understand everything correctly, my
>> error in thinking was, that I thought MySQL uses the information
>> returned by records_in_range only to chose between index lookup and
>> fulltable scan (since on a traditional disk architecture a full table
>> scan will be faster than an index scan which scans the whole table).
>> But this is obviously not the case.
>
> As far as I understand you, you're correct. Information returned by
> records_in_range() is used in multiple places:
> 1. It is used to determine whether to make range scan or a full table scan.
> 2. It affects the order in which tables are joined.
> 3. (maybe there is something else it affects also)
>
> I think you should be able to force MySQL into doing index lookups instead of
> table scans by returing appropriate costs from
> handler->keyread_time() and handler->read_time() as opposed to scan_time().
>
>> So we will have to implemented this records_in_range method. Do I
>> understand this correct, that a logarithmic (with respect to the table
>> size) run time will be ok for this method (InnoDB seems to walk two
>> times through the index if I get the comments in the code correct)?
>
> I think logarithmic time should be ok.
>
> Note that records_in_range() can be called multiple times if you have multiple
> ranges. For example, for query like
>
> SELECT * FROM tbl WHERE t.key < 11 OR t.key IN (15, 17, 19, 22)
>
> you will get these calls:
>
>  records_in_range(t.key < 11)
>  records_in_range(t.key=15)
>  records_in_range(t.key=17)
>  records_in_range(t.key=19)
>  records_in_range(t.key=22)
>
> If you prefer to get all ranges at once, check out
> handler::multi_range_read_info_const() in sql/multi_range_read.cc - you can
> implement that function and get access to a  disjoint-and-sorted list of ranges.
> (however, records_in_range() will still need to be implemented because there
> are certain cases where records_in_range() is called directly, not through
> multi_range_read_info_const()).
>
> BR
>  Sergei
> --
> Sergei Petrunia, Software Developer
> Monty Program AB, http://askmonty.org
> Blog: http://s.petrunia.net/blog

References

Question about Query plan
From: Markus Pilman, 2012-05-14
Re: Question about Query plan
From: Sergei Petrunia, 2012-05-14
Re: Question about Query plan
From: Markus Pilman, 2012-05-14
Re: Question about Query plan
From: Sergei Petrunia, 2012-05-14