maria-developers team mailing list archive

Re: 6ac19d09c66: MDEV-16978 Application-time periods: WITHOUT OVERLAPS

To: Nikita Malyavin <nikitamalyavin@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Tue, 3 Dec 2019 14:59:57 +0100
Cc: "MariaDB Developers (maria-developers@xxxxxxxxxxxxxxxxxxx)" <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAOByAQXUj-6h98+OkYHzDDLc6tpDcd1vXFksqKoAopmYZCqv5A@mail.gmail.com>
User-agent: Mutt/1.10.1 (2018-07-13)

Hi, Nikita!

On Dec 03, Nikita Malyavin wrote:
> > > >
> > > > why "if (handler->inited != NONE)" ?
> > > > What happens if it is NONE?
> > > >
> > > then `this` is used
> >
> > right, but when handler->inited is NONE? How can that happen?
> 
> When you just insert the record, for example. Neither rnd, nor index
> can be inited.

Okay

> > > > But it can be done better. You search for the key with the same
> > > > value and a period start <= the period start of the new row,
> > > > and then you have to check two rows for overlaps. If you
> > > > search for a key with the period start <= the _period end_
> > > > of the new row, then you'll only need to check one row (maybe
> > > > one more, if updating).
> > > >
> > > It can't work just that way: to handle the case when period start =
> > > _period end_ of the new row, you would have to write even more
> > > checks, and then move the cursor left.
> >
> > No, I don't see that. Can you show an example where you'd need even
> > more checks?
> 
> Ok,
> suppose you have two rows in the table with periods:
> (b, c),
> (c, d).
> 
> Now you insert (a, c). ha_index_read_map will look for period_start <=
> c, so it will return (c, d). These rows do not overlap, but we do not
> yet know if (b, c) is there. So we need to go to the previous row.
> 
> Now the algorithm looks just as complex: read the key, make some
> checks, go to the previous record.

I see. Because intervals are inclusive, we probably have to search
with <, not <=, that is, with HA_READ_BEFORE_KEY.
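
To illustrate the idea outside the server, here is a minimal sketch. It is
not the patch code: std::map stands in for the index on period start within
one key value, the names PeriodIndex and overlaps_existing are made up for
this example, and periods are assumed to be half-open, [start, end), which
matches the (b, c), (c, d) example above where the two rows do not overlap.
The update case, where the nearest neighbour may be the row being updated,
is not handled here.

```cpp
#include <map>

// Hypothetical stand-in for a unique index over period_start within one key
// value: maps period_start -> period_end. Periods are treated as half-open,
// [start, end), and the stored rows are assumed not to overlap each other.
using PeriodIndex = std::map<int, int>;

// Returns true if [new_start, new_end) overlaps a stored period.
bool overlaps_existing(const PeriodIndex &index, int new_start, int new_end)
{
  // First row with period_start >= new_end; such rows cannot overlap.
  auto it = index.lower_bound(new_end);
  if (it == index.begin())
    return false;            // no row starts before new_end
  --it;                      // nearest row with period_start < new_end
  // That row overlaps iff it ends after the new row starts. Earlier rows end
  // at or before this row's start, so they need no separate check.
  return new_start < it->second;
}
```

With rows (10, 20) and (20, 30) stored, a new row (5, 20) lands on (10, 20)
in a single step and is reported as overlapping, while (30, 40) lands on
(20, 30) and is accepted.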

> > > > Also, why do you store both period ends in the index? It seems
> > > > like it'd be enough to store only one end - only the start or
> > > > only the end. Both ends help if you use keyreads, but you don't.
> > > > On second thought, perhaps, you should use keyreads, there's
> > > > no need to fetch the whole row for this overlap check. On
> > > > yet another thought it might not work with
> > > > HA_EXTRA_REMEMBER_POS.
> > >
> > > I think keyreads can work with HA_EXTRA_REMEMBER_POS. Moreover, I
> > > agree that using keyreads can be much more efficient. However, all
> > > of my latest code, namely FKs and REPLACE/IODKU, are not based on
> > > keyreads, and rewriting it now will require significant effort.
> > > Let's make it a separate task, maybe?
> >
> > Why would it require significant effort? As far as I understand in
> > your current code you only need to enable keyreads and that's all.
> 
> It'll return the data in a different format, won't it? You can't rely
> on key_part->field->ptr_in_record after that.

Same format, it'll unpack the key into table->record[0],
so you can use the Field->val* methods normally.

> > > > > +      /* In case of update it could appear that the nearest neighbour is
> > > > > +       * a record we are updating. It means, that there are no overlaps
> > > > > +       * from this side. */
> > > > > +      if (is_update && memcmp(old_data + table->s->null_bytes,
> > > > > +                              record_buffer + table->s->null_bytes,
> > > > > +                              table->s->reclength - table->s->null_bytes) == 0)
> > > > > +      {
> > > > > +        continue;
> > > > > +      }
> > > >
> > > > I'd rather compare row positions here.
> > > What do you mean by that?
> >
> > two rows are the same, if their "positions" are equal, not if their
> > column values are equal. Also positions are much shorter to compare.
> >
> > after ha_index_read_map or ha_index_next you do
> >
> >   handler->position(record_buffer)
> >
> > and then you have a "position" stored in handler->ref, and it has the
> > length of handler->ref_length bytes. For MyISAM it's usually the file
> > offset, for InnoDB - PK value.
> >
> > For UPDATE you can, I suppose, call this->position(old_data) to get the
> > position.
> 
> It actually returns the ref for the last fetched row. The argument passed
> is not even used 😐. Only InnoDB uses it, and only for the primary key case.

Right, but it doesn't matter what the engine internally uses, you have
to pass the row - that's how the method is defined.

But yes, it'll return positions for the last fetched row, that's why
handler->position(record_buffer) and this->position(old_data)
will return you two positions that you can compare.
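
A rough sketch of the comparison itself, with stand-ins instead of the real
handler: RowRef and same_row are hypothetical names, the byte vector plays
the role of the bytes in handler->ref, and ref_length plays the role of
handler->ref_length. It only shows the byte-wise check, not how the
positions are obtained.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the bytes an engine leaves in handler->ref after
// position(): a file offset for MyISAM, a primary key value for InnoDB.
using RowRef = std::vector<unsigned char>;

// Two fetched rows are the same row iff their positions are byte-equal.
// ref_length plays the role of handler->ref_length.
bool same_row(const RowRef &a, const RowRef &b, std::size_t ref_length)
{
  if (a.size() < ref_length || b.size() < ref_length)
    return false;            // malformed position, treat as different rows
  return std::memcmp(a.data(), b.data(), ref_length) == 0;
}
```

Unlike the memcmp over the whole record, this compares only ref_length
bytes, and two distinct rows that happen to hold equal column values still
compare as different.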

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
