maria-developers team mailing list archive

Thread
Date

Re: 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

To: Aleksey Midenkov <midenok@xxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Tue, 22 Dec 2020 17:58:40 +0100
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CAF8BazCrk5L-6ZzVS9xo1WxK__8eTCn-WdxdbGkbCod_WkF-rQ@mail.gmail.com>

Hi, Aleksey!

On Nov 03, Aleksey Midenkov wrote:
> 
> the updated patches are here:
> 
> https://github.com/MariaDB/server/commits/5f6bcd6cd0145513974b0dfc5b2cefba28b72f1a

I started writing my comments there, but realized that it'd be clearer
if I reply here, more context. That is, see below.

One comment about the new test case, you have 16K lines in the result
file, are all 16K necessary? Can you do something like

   select x, left(b,10), left(v,10)

?

And please, pretty please, don't use non-constant references as function
arguments.

> On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
> >
> > > diff --git a/sql/field.cc b/sql/field.cc
> > > index bdaaecc20269..0fd40c979d2c 100644
> > > --- a/sql/field.cc
> > > +++ b/sql/field.cc
> > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint length,CHARSET_INFO *cs)
> > >    copy_length= copier.well_formed_copy(field_charset,
> > >                                         (char*) value.ptr(), new_length,
> > >                                         cs, from, length);
> > > +  value.length(copy_length);
> >
> > good! Could this be a distinct bug with its own test case?
> 
> Do we need to spend time on that bureaucracy? There are real problems to
> solve... Tickets drain time from the management people and generally from
> anyone who sees them.

I didn't mean opening an MDEV. Only writing a test case and extracting
it into a separate commit.

If this is a genuine bug fix and you don't create a test case - how can
you be sure the fix will stay? It may disappear in a merge or a rebase,
manually or automatically. Or in a refactoring that "looks fine, all
tests pass". If you can create a test case - please, do.

> > >    Field_blob::store_length(copy_length);
> > >    bmove(ptr+packlength,(uchar*) &tmp,sizeof(char*));
> > >
> > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > index 9d82df0235c7..7c658082d397 100644
> > > --- a/sql/ha_partition.cc
> > > +++ b/sql/ha_partition.cc
> > > @@ -5125,14 +5128,14 @@ bool ha_partition::init_record_priority_queue()
> > >    uint alloc_len;
> > >    uint used_parts= bitmap_bits_set(&m_part_info->read_partitions);
> > >    /* Allocate record buffer for each used partition. */
> > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > >    if (!m_using_extended_keys)
> > >        m_priority_queue_rec_len += m_file[0]->ref_length;
> > >    alloc_len= used_parts * m_priority_queue_rec_len;
> > >    /* Allocate a key for temporary use when setting up the scan. */
> > >    alloc_len+= table_share->max_key_length;
> > >
> > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, MYF(MY_WME))))
> > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(&m_ordered_root, alloc_len)))
> >
> > I don't see why you need a memroot here. memroot is needed when you
> > allocate later at difeerent points in time some initially unpredictable
> > amount of memory that has the same lifetime and needs to be
> > freed all at once.
> 
> I know... This is just another convenient way of allocation with presumably
> minimized number of mallocs (probably pre_alloc_size should be tweaked to a
> better value).

memroot wasn't created for that. There are approaches with much less
overhead

> >
> > But here it seems that you only allocate once. You could just make
> >
> >    m_priority_queue_rec_len = m_rec_length + PARTITION_BYTES_IN_POS +
> >               table->s->blob_fields * sizeof(Ordered_blob_storage)
> 
> I don't wnat this data serialized. It is inconvenient to read the dump of
> the record. Why we should make it harder to code, read the code and debug
> while we can make it easier? If it is only for guaranteed 1 malloc, I don't
> think it's worth it.

What do you mean "serialized"?
What do you mean "read the dump"?

Note that you can also use my_multi_malloc, that allocates many buffers
together and hides address arithmetcs from the caller. Like

  my_multi_malloc(MYF(MY_WME),
     &m_ordered_rec_buffer, m_rec_length + PARTITION_BYTES_IN_POS,
     &blob_storage, table->s->blob_fields * sizeof(Ordered_blob_storage),
     NULL)

(but with error checking). And then, perhaps, with placement new[]
you could construct all Ordered_blob_storage objects in one line.

> > > @@ -6178,7 +6203,11 @@ int ha_partition::handle_ordered_index_scan(uchar *buf, bool reverse_order)
> > >        */
> > >        error= file->read_range_first(m_start_key.key? &m_start_key: NULL,
> > >                                      end_range, eq_range, TRUE);
> > > -      memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > +      if (!error)
> > > +      {
> > > +        memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > +      }
> >
> > did you have a bug because of that? something didn't work?
> 
> No, I just saw worthless memcpy and avoided it.

You're right that memcopy is unnecessary in case of an error.
But most of the time there is no error. You made an exceptional code path
faster by making a common code path slower. I think it's generally 
better to do the opposite, making the common, no-error code path
faster, fewer conditionals, in particular. At the cost of making the error
path slower, if needed.

> > > @@ -6310,6 +6349,43 @@ int ha_partition::handle_ordered_index_scan_key_not_found()
> > > +      if (cached)
> > > +      {
> > > +        cached->swap(s.blob);
> > > +        s.set_read_value= set_read_value;
> >
> > is it indeed possible here for a value to be either in `value` or in
> > `read_value` ?
> >
> > When happens what?
> 
> Yes, it uses `value` for plain fields and `read_value` for virtual fields.

Oh, I see.

This read_value was introduced in ea1b25046c81db8fdf to solve the case
of UPDATE, where a row is moved with store_record(table, record[1]) and
blob pointers aren't updated. This is very similar to what you're
fixing, perhaps read_value should go away and both bugs should have a
one unified solution that allows to "park" a record somewhere out of
record[0] and restore later, all with proper blob handling?

> >
> > > +      }
> > > +    }
> > > +  }
> > > +  table->move_fields(table->field, table->record[0], rec_buf);
> > > +}

Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx

Follow ups

Re: 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()
From: Aleksey Midenkov, 2021-01-25

References

Re: 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()
From: Sergei Golubchik, 2020-11-02
Re: 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()
From: Aleksey Midenkov, 2020-11-03