← Back to team overview

maria-developers team mailing list archive

Re: 2f9bbf392072: MDEV-18734 optimization by taking ownership via String::swap()

 

Hi Sergei!

On Thu, Jun 3, 2021 at 1:58 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
>
> Hi, Aleksey!
>
> On Jan 26, Aleksey Midenkov wrote:
> > On Tue, Dec 22, 2020 at 7:58 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
> > > On Nov 03, Aleksey Midenkov wrote:
> > >
> > > And please, pretty please, don't use non-constant references as
> > > function arguments.
> >
> > Why? With pointers we can pass NULL and we don't need NULL there, do we?
>
> Because this is the policy. If you convince Monty that non-constant
> references are fine - feel free to use them.

If this is the policy, why there is

void String::swap(String &s)

since 2004? Are you asking me to pass the pointer and then dereference
it and pass it as a reference into String class? That would look weird
to me. I'm going to agree if you insist, but really?

>
> > > > On Mon, Nov 2, 2020 at 11:33 PM Sergei Golubchik <serg@xxxxxxxxxxx> wrote:
> > > > >
> > > > > > diff --git a/sql/field.cc b/sql/field.cc
> > > > > > index bdaaecc20269..0fd40c979d2c 100644
> > > > > > --- a/sql/field.cc
> > > > > > +++ b/sql/field.cc
> > > > > > @@ -8310,6 +8310,7 @@ int Field_blob::store(const char *from,uint length,CHARSET_INFO *cs)
> > > > > >    copy_length= copier.well_formed_copy(field_charset,
> > > > > >                                         (char*) value.ptr(), new_length,
> > > > > >                                         cs, from, length);
> > > > > > +  value.length(copy_length);
> > > > >
> > > > > good! Could this be a distinct bug with its own test case?
> > > >
> > > > Do we need to spend time on that bureaucracy? There are real problems to
> > > > solve... Tickets drain time from the management people and generally from
> > > > anyone who sees them.
> > >
> > > I didn't mean opening an MDEV. Only writing a test case and extracting
> > > it into a separate commit.
> > >
> > > If this is a genuine bug fix and you don't create a test case - how can
> > > you be sure the fix will stay? It may disappear in a merge or a rebase,
> > > manually or automatically. Or in a refactoring that "looks fine, all
> > > tests pass". If you can create a test case - please, do.
> >
> > It is tested by federated.federated_partition test case.
>
> Do you mean, that if I remove this line, federated.federated_partition
> test will fail? If yes - good, this is the test I meant, no need to
> create another one.

Yes, the test will fail.

>
> > Why do we have to extract this side-fix to a separate commit?
>
> Because later when someone looks at the code it helps to be able to
> understand what a commit is actually fixing.

As long as both fixes are important and there is no sense to apply one
without the other I see separate commits as a measure to satisfy
someone's curiosity which I do not share (and there is nothing to be
curious about that one-liner). Though I'm going to agree if you ask me
to do that.

>
> > > > > > diff --git a/sql/ha_partition.cc b/sql/ha_partition.cc
> > > > > > --- a/sql/ha_partition.cc
> > > > > > +++ b/sql/ha_partition.cc
> > > > > > @@ -5125,14 +5128,14 @@ bool ha_partition::init_record_priority_queue()
> > > > > >    uint alloc_len;
> > > > > >    uint used_parts= bitmap_bits_set(&m_part_info->read_partitions);
> > > > > >    /* Allocate record buffer for each used partition. */
> > > > > > -  m_priority_queue_rec_len= m_rec_length + PARTITION_BYTES_IN_POS;
> > > > > > +  m_priority_queue_rec_len= m_rec_length + ORDERED_REC_OFFSET;
> > > > > >    if (!m_using_extended_keys)
> > > > > >        m_priority_queue_rec_len += m_file[0]->ref_length;
> > > > > >    alloc_len= used_parts * m_priority_queue_rec_len;
> > > > > >    /* Allocate a key for temporary use when setting up the scan. */
> > > > > >    alloc_len+= table_share->max_key_length;
> > > > > >
> > > > > > -  if (!(m_ordered_rec_buffer= (uchar*)my_malloc(alloc_len, MYF(MY_WME))))
> > > > > > +  if (!(m_ordered_rec_buffer= (uchar*) alloc_root(&m_ordered_root, alloc_len)))
> > > > >
> > > > > I don't see why you need a memroot here. memroot is needed when
> > > > > you allocate later at difeerent points in time some initially
> > > > > unpredictable amount of memory that has the same lifetime and
> > > > > needs to be freed all at once.
> > > >
> > One should keep in mind this is an artificial usage of the prio queue.
> > In diversified real-world usage the share of such kind SELECT is say
> > 0.1%. But even if it is 1% we divide our stats by 100. So, do you
> > really care about 0.0015%?
>
> A right tool for a job. If you need to allocate few buffers at the same
> time - use multi_alloc, if you do many allocations over time and want to
> fee them at once - use memroot.

Changed to my_multi_alloc().

>
> > > > > > @@ -6178,7 +6203,11 @@ int ha_partition::handle_ordered_index_scan(uchar *buf, bool reverse_order)
> > > > > >        */
> > > > > >        error= file->read_range_first(m_start_key.key? &m_start_key: NULL,
> > > > > >                                      end_range, eq_range, TRUE);
> > > > > > -      memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > > > > +      if (!error)
> > > > > > +      {
> > > > > > +        memcpy(rec_buf_ptr, table->record[0], m_rec_length);
> > > > > > +      }
> > > > >
> > > > > did you have a bug because of that? something didn't work?
> > > >
> > > > No, I just saw worthless memcpy and avoided it.
> > >
> > > You're right that memcopy is unnecessary in case of an error.
> > > But most of the time there is no error. You made an exceptional code path
> > > faster by making a common code path slower. I think it's generally
> > > better to do the opposite, making the common, no-error code path
> > > faster, fewer conditionals, in particular. At the cost of making the error
> > > path slower, if needed.
> >
> > I'm usually aware of such things in my decisions. The slowdown is
> > close to 0, it is not even 0.0001%, it is much lower. OTOH memcpy()
> > slowdown is much higher and this is not a fully exceptional code path:
> > HA_ERR_KEY_NOT_FOUND continues the loop.
>
> This is a wrong code pattern, you should optimize for the normal code
> path, not for errors. Errors are exceptions, they're much more rare than
> non-errors.

Okay. Looks like no sense to argue that anymore. Reverted.

>
> > > > > > @@ -6310,6 +6349,43 @@ int ha_partition::handle_ordered_index_scan_key_not_found()
> > > > > > +      if (cached)
> > > > > > +      {
> > > > > > +        cached->swap(s.blob);
> > > > > > +        s.set_read_value= set_read_value;
> > > > >
> > > > > is it indeed possible here for a value to be either in `value` or in
> > > > > `read_value` ?
> > > > >
> > > > > When happens what?
> > > >
> > > > Yes, it uses `value` for plain fields and `read_value` for virtual fields.
> > >
> > > Oh, I see.
> > >
> > > This read_value was introduced in ea1b25046c81db8fdf to solve the
> > > case of UPDATE, where a row is moved with store_record(table,
> > > record[1]) and blob pointers aren't updated. This is very similar to
> > > what you're fixing, perhaps read_value should go away and both bugs
> > > should have a one unified solution that allows to "park" a record
> > > somewhere out of record[0] and restore later, all with proper blob
> > > handling?
> >
> > Do you think this is necessary now? I believe there are a lot of
> > *much* more important tasks.
>
> Uhm, yes? This would be a notable code simplification, and it won't take
> long, a low-hanging fruit.

I'm not sure if this is a "fruit". Can you please elaborate the
change? Use so-called Ordered_blob_storage instead of read_value and
do swap_blobs() on store_record()? And have 2 arrays of
Ordered_blob_storage for each TABLE::record like that:

diff --git a/sql/table.h b/sql/table.h
index 69bd14b2834..40042dc1df0 100644
--- a/sql/table.h
+++ b/sql/table.h
@@ -1151,6 +1151,8 @@ struct TABLE
  THD  *in_use;                        /* Which thread uses this */

  uchar *record[2];                    /* Pointer to records */
+  Ordered_blob_storage **blob_storage[2]; /* NULL-terminated arrays of pointers
+                                             to blob-storages */
  uchar *write_row_record;             /* Used as optimisation in
                                          THD::write_row */
  uchar *insert_values;                  /* used by INSERT ... UPDATE */

>
> Regards,
> Sergei
> VP of MariaDB Server Engineering
> and security@xxxxxxxxxxx



--
All the best,

Aleksey Midenkov
@midenok


Follow ups

References