maria-developers team mailing list archive

Thread
Date

Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS

To: Sergei Golubchik <serg@xxxxxxxxxxx>
From: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
Date: Wed, 4 Dec 2019 01:00:03 +1000
Cc: "MariaDB Developers \(maria-developers@xxxxxxxxxxxxxxxxxxx\)" <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20191203135957.GA10727@pweza.fritz.box>

On Tue, 3 Dec 2019 at 23:59, Sergei Golubchik <serg@xxxxxxxxxxx> wrote:

> > > > > But it can be done better. You search for the key with the same
> > > > > value and a period start <= than the period start of new row.
> > > > > And then you have to check two rows for overlaps. If you'll
> > > > > search for a key with the period start <= than the _period end_
> > > > > of the new row, then you'll only need to check one row (may be
> > > > > one more, if updating).
> > > > >
> > > >  It can't work just that way: to handle case when period start =
> > > >  _period end_ of the new row, you should write even more checks,
> > > >  and then move cursor left.
> > >
> > > No, I don't see that. Can you show an example where you'd need even
> > > more checks?
> >
> > Ok,
> > suppose you have to rows in table with periods:
> > (b, c),
> > (c, d).
> >
> > Now you insert (a, c). ha_index_read_map will look for period_start <=
> > c, so it will return (c, d). These rows do not overlap, but we yet do
> > not know if (b, c) is there. So we need to go to the previous row.
> >
> > Now the algorithm looks same complex: read key, make some checks, got
> > to the prev record.
>
> I see. Because intervals are inclusive we probably have to search
> with < not <=. HA_READ_BEFORE_KEY
>
> Yes, suddenly I didn't think about <  (i.e., HA_READ_BEFORE_KEY), but that
can work

> > > > Also, why do you store both period ends in the index? It seems
> > > > > like it'd be enough to store only one end - only the start or
> > > > > only the end. Both ends help if you use keyreads, but you don't.
> > > > > On the second thought, perhaps, you should use keyreads, there's
> > > > > no need to fetch the whole row for this overlap check. On the
> > > > > yet another thought it might not work with
> > > > > HA_EXTRA_REMEMBER_POS.
> > > >
> > > > I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I
> > > > agree that using keyreads can be much more efficient. However, all
> > > > of my latest code, namely FKs and REPLACE/IODKU, are not based on
> > > > keyreads, and rewriting it now will require significant effort.
> > > > Let's make it a separate task, maybe?
> > >
> > > Why would it require significant effort? As far as I understand in
> > > your current code you only need to enable keyreads and that's all.
> >
> > it'll return in different format, isn't it? You can't rely on
> > key_part->field->ptr_in_record after that.
>
> Same format, it'll unpack the key into the table->record[0]
> so you can use Field->val* methods normally.
>
> Aah, now i see how it works. It just leaves the holes in record. OK then,
I can try enablling it.


> > > > > +      /* In case of update it could appear that the nearest
> neighbour is
> > > > > > +       * a record we are updating. It means, that there are no
> overlaps
> > > > > > +       * from this side. */
> > > > > > +      if (is_update && memcmp(old_data + table->s->null_bytes,
> > > > > > +                              record_buffer +
> table->s->null_bytes,
> > > > > > +                              table->s->reclength -
> table->s->null_bytes) == 0)
> > > > > > +      {
> > > > > > +        continue;
> > > > > > +      }
> > > > >
> > > > > I'd rather compare row positions here.
> > > > What do you mean by that?
> > >
> > > two rows are the same, if their "positions" are equal, not if their
> > > column values are equal. Also positions are much shorter to compare.
> > >
> > > after ha_index_read_map or ha_index_next you do
> > >
> > >   handler->position(record_buffer)
> > >
> > > and then you have a "position" stored in handler->ref, and it has the
> > > length of handler->ref_length bytes. For MyISAM it's usually the file
> > > offset, for InnoDB - PK value.
> > >
> > > For UPDATE you can, I suppose, call this->position(old_data) to get the
> > > position.
> >
> > It actually returns the ref for the last fetched row. the argument passed
> > is not even used😐 . Only innodb uses it, and only for the primary key
> case.
>
> Right, but it doesn't matter what the engine internally uses, you have
> to pass the row - that's how the method is defined.
>
> But yes, it'll return positions for the last fetched row, that's why
> handler->position(record_buffer) and this->position(old_data)
> will return you two positions that you can compare.
>

old_data could have been even not fetched, and in some cases handler is not
cloned, so it can't work in general.

BTW I also wonder, would this memcpy work with KEYREAD?


-- 
Yours truly,
Nikita Malyavin

Follow ups

Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Nikita Malyavin, 2019-12-04
Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Sergei Golubchik, 2019-12-03

References

Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Sergei Golubchik, 2019-11-19
Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Nikita Malyavin, 2019-11-26
Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Sergei Golubchik, 2019-12-03
Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Nikita Malyavin, 2019-12-03
Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS
From: Sergei Golubchik, 2019-12-03