← Back to team overview

maria-developers team mailing list archive

Re: Performance of index intersection

 

Thanks so much for looking into this!

Last weekend I started playing with the code, but it looks like I may need
to upgrade to an SSD to get a reasonable build time.

For Zone B I was imagining there would be a way to try increasingly large
jumps, perhaps aided by an index cursor that remembered some information it
saw as it walked through the B*-tree and page directory.

I'm happy to hear an optimization to skip over Zone A looks fairly easy.

As for real data, do full-text indexes return rowid-ordered rows? If so,
queries for a long-tail word (e.g. "hobbit") AND a frequent value (e.g.
soft_deleted=0) should benefit significantly from skipping over Zone A.
Maybe we could use a Wikipedia dump?

Thank you,
David




On Wed, 2 Oct 2019, 8:51 AM jocelyn fournier, <jocelyn.fournier@xxxxxxxxx>
wrote:

> Hi Sergey!
>
> Actually fetching min & max value for each keys (key1=1 and key2=2 in your
> case) should not cost much?
> Then you can compute the overlap zone (Zone B in your graph) and directly
> perform a jump on the beginning of Zone B and stop at the end of Zone B
> instead of EOF, using the index with the less amount of matching rows.
>
> BR,
>   Jocelyn
>
>
> > Le 1 oct. 2019 à 21:31, Sergey Petrunia <sergey@xxxxxxxxxxx> a écrit :
> >
> > Hello,
> >
> > More thoughts on this:
> >
> > Let's consider a query
> >
> > select * from t1 where key1=1 and key2=2
> >
> > let's assume it uses index itnersection.
> >
> > Check the attached picture. It shows the table rowids at the center,
> > key1 on the left, key2 on the right.
> >
> > There are three zones:
> >
> > - Zone A, where rowids matching 'key1=1' have no overlap with any rowids
> for 'key2=2'
> >
> > - Zone B, where rowids from two scans overlap.
> >
> > - Zone C, similar to zone A but at the end of the scan. Here, rowids
> matching
> > key2=2 have no overlap.
> >
> > The current "merge-ordered-streams" algorithm takes care of "Zone C".
> > the code will return EOF as soon as one of the merged streams has
> produced EOF.
> >
> > Now, let's look at Zone B.  Here, the code would scan both indexes
> forward
> > and search for records with key=1 that have matches with key=2.
> >
> > The search forward is this loop in QUICK_ROR_INTERSECT_SELECT::get_next:
> >
> >    do
> >    {
> >      DBUG_EXECUTE_IF("innodb_quick_report_deadlock",
> >                      DBUG_SET("+d,innodb_report_deadlock"););
> >      if (unlikely((error= quick->get_next())))
> >      {
> >        /* On certain errors like deadlock, trx might be rolled back.*/
> >        if (!thd->transaction_rollback_request)
> >          quick_with_last_rowid->file->unlock_row();
> >        DBUG_RETURN(error);
> >      }
> >      quick->file->position(quick->record);
> >      cmp= head->file->cmp_ref(quick->file->ref, last_rowid);
> >      if (cmp < 0)
> >      {
> >        /* This row is being skipped.  Release lock on it. */
> >        quick->file->unlock_row();
> >      }
> >    } while (cmp < 0);
> >
> >
> > the quick->get_next() call (after several levels of indirection) do this
> > storage engine API call:
> >
> >  handler->index_next_same();
> >
> > And the question is whether it would be faster if this call was made
> instead:
> >
> >  handler->index_read_map(key_value, rowid_of_match_candidate);
> >
> > The answer is that it's faster to do a jump when we will jump over many
> > records. When we jump over a few, it is actually slower than making
> several
> > index_next_same() calls.  The first reason for this is that doing seeks
> will
> > break InnoDB's prefetch buffer optimization. There might be other
> reasons.
> >
> > I don't have any idea how one could predict the jump size.
> >
> > As for Zone A, we could jump it over with one jump. If we first read the
> rowid
> > from "key2=2", then we could already use it when making a lookup with
> key1=1.
> >
> > But what if we first did an index lookup on key1=1? Then we will miss the
> > chance to jump to the first rowid for that one sees for key2=2...
> >
> > If we assume that the first jump is more important than others, we could
> just
> > look at the first record in each of the merged indexes and then feed
> them all
> > back the maximum rowid we saw as the starting point. This would let us
> jump
> > over zone A.
> >
> > I think, implementing "jump over zone A" is easier than inventing a way
> to do
> > jumps while in zone B. (And this is actually what your idea was).
> >
> >
> > A different question - are there any example datasets or queries we
> > could try this on? One can of course construct an artificial dataset and
> > queries but it would be much nicer to try this on a real dataset and a
> real
> > query.
> >
> > BR
> > -- Sergei P.
> >
> > On Fri, Sep 20, 2019 at 12:31:10PM +0300, Sergey Petrunia wrote:
> >> Hello David,
> >>
> >> On Mon, Sep 16, 2019 at 09:07:09PM +1200, David Sickmiller wrote:
> >>> I've been using MySQL/MariaDB for two decades but have more recently
> been
> >>> working with Elasticsearch.  I knew to expect an inverted index to
> speed up
> >>> querying full text fields, but I've been surprised (and a bit annoyed)
> at
> >>> how fast ES can query structured data.  (In my case, I'm largely
> looking
> >>> for intersections of a number of varchar fields with lowish
> cardinality,
> >>> e.g. WHERE country = 'US' AND client = 'Microsoft' AND status =
> >>> 'Completed'.)
> >>>
> >>> Elasticsearch seems to have several things going on, but I think a core
> >>> aspect, to use RDMS terminology, is that each column is indexed, and
> index
> >>> unions/intersections are used if the WHERE clause references multiple
> >>> columns.
> >>>
> >>> I've heard that MySQL/MariaDB has the ability to merge indexes, but
> I've
> >>> rarely observed it in person.  Googling for it yields a bunch of
> >>> StackOverflow posts complaining how slow it is, with responses
> agreeing and
> >>> explaining how to disable it.
> >>>
> >>> If I'm reading the MySQL/MariaDB code correctly, it looks like MariaDB
> will
> >>> intersect indexes by looping through each index, reading the rowids of
> all
> >>> matching keys and then, at the end (or once the buffer is full),
> checking
> >>> whether each rowid is present in each index.
> >>
> >> There are two algorithms:
> >>
> >> 1. index_merge/intersection
> >>
> >> This is implemented in QUICK_ROR_INTERSECT_SELECT.
> >> It is applicable when rowids from each source we are merging come
> ordered by
> >> the rowid value.
> >>
> >> This requirement is met when the scan over a single-point range. If the
> table
> >> has an index defined as
> >>
> >> INDEX i1(col1, ..., colN)
> >>
> >> then index tuples that compare as equal are ordered by their rowid.
> That is,
> >> if one does an index scan over
> >>
> >> (col1, ... colN)=(const1, ..., constN)
> >>
> >> they will get the records i`n rowid order.
> >>
> >> QUICK_ROR_INTERSECT select will run the scans on all merged streams
> >> simultaneously and do ordered-stream-merge on them.
> >>
> >> 2. index_merge/sort-interesect
> >> ( https://mariadb.com/kb/en/library/index_merge-sort_intersection/)
> >>
> >> This is implemented in QUICK_INDEX_INTERSECT_SELECT.
> >>
> >> The algorithm doesn't assume that the input streams are ordered.
> >>
> >> It scans all inputs and puts the rowids into a "Unique" object (think
> RB tree
> >> which overflows to disk). After the scans are finished, it can check
> which
> >> rowids were produced in all of the inputs, and those are in the
> intersection.
> >>
> >>> I wonder if there's an opportunity to speed this up.  If we first read
> in
> >>> the rowids from one index (ideally the one with the fewest matches), we
> >>> could tell the storage engine that, when reading the next index, it
> should
> >>> skip over rowids lower than the next candidate intersection.
> >>
> >> This is a good idea, neither QUICK_ROR_INTERSECT_SELECT, nor
> >> QUICK_INDEX_INTERSECT_SELECT do this.
> >>
> >>> In the best
> >>> case scenario, I think this could enable InnoDB to use its page
> directory
> >>> to skip past some of the keys, improving the performance from O(n) to
> O(log
> >>> n).
> >>
> >> Agree.
> >> If the scans being merged produce data with non-overlapping rowid
> ranges,
> >> then things could be sped up. I'm just wondering how often this is the
> case
> >> in practice. Do you have any thoughts this?
> >>
> >>> That said, this is all new to me.  Maybe there's an obvious reason this
> >>> wouldn't make much of an improvement, or maybe I've overlooked that
> it's
> >>> already been done.  However, if it looks promising to you folk, and
> it's
> >>> something you'd consider merging, I'd be willing to attempt writing a
> PR
> >>> for it.
> >>
> >> BR
> >> Sergei
> >> --
> >> Sergei Petrunia, Software Developer
> >> MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
> >>
> >>
> >
> > BR
> > Sergei
> > --
> > Sergei Petrunia, Software Developer
> > MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
> >
> >
> > <index-merge1.jpg>
>
>

Follow ups

References