maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #12600
Re: f46382ba93c: Optimize usage of c_ptr(), c_ptr_quick() and String::alloc()
Hi, Michael!
On Mar 27, Michael Widenius wrote:
> revision-id: f46382ba93c (mariadb-10.5.2-515-gf46382ba93c)
> parent(s): c157c285db9
> author: Michael Widenius <michael.widenius@xxxxxxxxx>
> committer: Michael Widenius <michael.widenius@xxxxxxxxx>
> timestamp: 2021-03-24 14:31:53 +0200
> message:
>
> Optimize usage of c_ptr(), c_ptr_quick() and String::alloc()
>
> The proble was that hen using String::alloc() to allocate a string,
"hen" ? after a minute of staring I realized that you probably meant "when".
(also, "problem")
> the String ensures that there is space for an extra NULL byte in the
> buffer and if not, reallocates the string. This is a problem with the
> String::set_int() that calls alloc(21), which forces
> an extra malloc/free to happen.
>
> - We do not anymore re-allocate String if alloc() is called with the
> Allocated_length. This reduces number of malloc() allocations,
> especially one big re-allocation in Protocol::send_result_Set_metadata()
> for almost every query that produced a result to the connnected client.
> - Avoid extra mallocs when using LONGLONG_BUFFER_SIZE
> This can now be done as alloc() doesn't increase buffers if new length is
> not bigger than old one.
> - c_ptr() is redesigned to be safer (but a bit longer) than before.
> - Remove wrong usage of c_ptr_quick()
> c_ptr_quick() where used in many cases to get the pointer to the used
s/where/was/
> buffer, even when it didn't need to be \0 terminated. In this case
> ptr() is a better substitute.
> Another problem with c_ptr_quick() is that it didn't guarantee that
> the string would be \0 terminated.
> - item_val_str(), an API function not used currently by the server,
> now always returns a null terminated string (before it didn't always
> do that).
> - Ensure that all String allocations uses STRING_PSI_MEMORY_KEY. The old
> mixed usage of performance keys caused assert's when String buffers
> where shrunk.
> - Binary_string::shrink() is simplifed
>
> diff --git a/sql/item.cc b/sql/item.cc
> index f6f3e2720fa..a8a43c6266a 100644
> --- a/sql/item.cc
> +++ b/sql/item.cc
> @@ -4703,7 +4703,7 @@ Item *Item_param::value_clone_item(THD *thd)
> return 0; // Should create Item_decimal. See MDEV-11361.
> case STRING_RESULT:
> return new (mem_root) Item_string(thd, name,
> - Lex_cstring(value.m_string.c_ptr_quick(),
> + Lex_cstring(value.m_string.ptr(),
Hmm, you said that LEX_CSTRING::str should always be \0-terminated.
If that's the case, then c_ptr() would be correct here, not ptr()
> value.m_string.length()),
> value.m_string.charset(),
> collation.derivation,
> diff --git a/sql/partition_info.cc b/sql/partition_info.cc
> index 871411cf6c4..b3e1bceec31 100644
> --- a/sql/partition_info.cc
> +++ b/sql/partition_info.cc
> @@ -126,7 +126,7 @@ partition_info *partition_info::get_clone(THD *thd)
> /**
> Mark named [sub]partition to be used/locked.
>
> - @param part_name Partition name to match.
> + @param part_name Partition name to match. Must be \0 terminated!
> @param length Partition name length.
>
> @return Success if partition found
> @@ -172,9 +172,9 @@ bool partition_info::add_named_partition(const char *part_name, size_t length)
> else
> bitmap_set_bit(&read_partitions, part_def->part_id);
> }
> - DBUG_PRINT("info", ("Found partition %u is_subpart %d for name %s",
> + DBUG_PRINT("info", ("Found partition %u is_subpart %d for name %.*s",
> part_def->part_id, part_def->is_subpart,
> - part_name));
> + length, part_name));
These two changes contradict each other. Either part_name must be
\0-terminated, and then you don't need to specify a length in printf.
Or it is not \0-terminated and you must specify a length.
> DBUG_RETURN(false);
> }
>
> diff --git a/sql/sql_string.cc b/sql/sql_string.cc
> index f2a0f55aec8..95a57017c53 100644
> --- a/sql/sql_string.cc
> +++ b/sql/sql_string.cc
> @@ -118,9 +121,14 @@ bool Binary_string::realloc_raw(size_t alloc_length)
> return FALSE;
> }
>
> +
> bool String::set_int(longlong num, bool unsigned_flag, CHARSET_INFO *cs)
> {
> - uint l=20*cs->mbmaxlen+1;
> + /*
> + This allocates a few bytes extra in the unlikely case that cs->mb_maxlen
> + > 1, but we can live with that
dunno, it seems that utf8 is the *likely* case and it's getting more and
more likely with the time,
> + */
> + uint l= LONGLONG_BUFFER_SIZE * cs->mbmaxlen;
> int base= unsigned_flag ? 10 : -10;
>
> if (alloc(l))
> @@ -1254,21 +1262,16 @@ bool String::append_semi_hex(const char *s, uint len, CHARSET_INFO *cs)
> // Shrink the buffer, but only if it is allocated on the heap.
> void Binary_string::shrink(size_t arg_length)
> {
> - if (!is_alloced())
> - return;
> - if (ALIGN_SIZE(arg_length + 1) < Alloced_length)
> + if (is_alloced() && ALIGN_SIZE(arg_length + 1) < Alloced_length)
> + {
> + char *new_ptr;
> + /* my_realloc() can't fail as new buffer is less than the original one */
> + if ((new_ptr= (char*) my_realloc(STRING_PSI_MEMORY_KEY, Ptr, arg_length,
you don't need an if() because, as you wrote yourself "my_realloc()
can't fail" here.
> + MYF(thread_specific ?
> + MY_THREAD_SPECIFIC : 0))))
> {
> diff --git a/sql/sql_string.h b/sql/sql_string.h
> index b3eca118b63..ba0cff68fb4 100644
> --- a/sql/sql_string.h
> +++ b/sql/sql_string.h
> @@ -600,25 +600,34 @@ class Binary_string: public Static_binary_string
>
> inline char *c_ptr()
> {
> - DBUG_ASSERT(!alloced || !Ptr || !Alloced_length ||
> - (Alloced_length >= (str_length + 1)));
> -
> - if (!Ptr || Ptr[str_length]) // Should be safe
> - (void) realloc(str_length);
> + if (unlikely(!Ptr))
> + return (char*) "";
> + /*
> + Here we assume that any buffer used to initalize String has
> + an end \0 or have at least an accessable character at end.
> + This is to handle the case of String("Hello",5) efficently.
> + */
> + if (unlikely(!alloced && !Ptr[str_length]))
> + return Ptr;
No, this is not good. Note the difference between
String a("Hello", 5)
and
char hello[5];
String a(buf, 5);
Your assumption should only apply to the first case, not to the second.
In the first case alloced=Alloced_length=0, in the second case only
alloced=0 and Alloced_length=5. So in the if() above you need to look
at Alloced_length:
if (!Alloced_length && !Ptr[str_length])
return Ptr;
> + if (str_length < Alloced_length)
> + {
> + Ptr[str_length]=0;
> + return Ptr;
> + }
> + (void) realloc(str_length+1); /* This will add end \0 */
> return Ptr;
> }
> + /* Use c_ptr() instead. This will be deleted soon, kept for compatiblity */
> inline char *c_ptr_quick()
> {
> - if (Ptr && str_length < Alloced_length)
> - Ptr[str_length]=0;
> - return Ptr;
> + return c_ptr_safe();
1. why not to remove it now?
2. it's strange that you write "use c_ptr() instead" but you actually
use c_ptr_safe() instead.
> }
> inline char *c_ptr_safe()
> {
> if (Ptr && str_length < Alloced_length)
> Ptr[str_length]=0;
> else
> - (void) realloc(str_length);
> + (void) realloc(str_length + 1);
> return Ptr;
> }
could you write a comment, explaining when one should use c_ptr() and
when - c_ptr_safe() ?
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups