← Back to team overview

maria-developers team mailing list archive

Re: MDEV-9332 Bug after upgrade to 10.1.10

 

Hi, Alexander!

Looks good. Great comments.
See my suggestions below.

On Jan 26, Alexander Barkov wrote:
> diff --git a/sql/sql_string.cc b/sql/sql_string.cc
> index b14c3af..606e71d 100644
> --- a/sql/sql_string.cc
> +++ b/sql/sql_string.cc
> @@ -789,21 +789,117 @@ int stringcmp(const String *s,const String *t)
>  }
>  
> +/**
> +  Return a string which has the same value with "from" and
> +  which is safe to modify, trying to avoid unnecessary allocation
> +  and copying when possible.
> +
> +  @param to           Buffer. Must not be a constant string.
> +  @param from         Some existing value. We'll try to reuse it.
> +                      Can be a constant or a variable string.
> +  @param from_length  The total size that will be possibly needed.
> +                      Note, can be 0.
> +
> +  Note, in some cases "from" and "to" can point to the same object.
> +
> +  If "from" is a variable string and its allocated memory is enough
> +  to store "from_length" bytes, then "from" is returned as is.
> +
> +  If "from" is a variable string and its allocated memory is not enough
> +  to store "from_length" bytes, then then "from" is reallocated and returned.
> +
> +  Otherwise (if "from" is a constant string, or looks like a constant string),
> +  then "to" is reallocated to fit "from_length" bytes, the value is copied
> +  from "from" to "to", then "to" is returned.
> +*/
>  String *copy_if_not_alloced(String *to,String *from,uint32 from_length)
>  {
> -  if (from->Alloced_length >= from_length)
> -    return from;
> -  if ((from->alloced && (from->Alloced_length != 0)) || !to || from == to)
> +  DBUG_ASSERT(to);
> +  /*
> +    If "from" is a constant string, e.g.:
> +       SELECT INSERT('', <pos>, <length>, <replacement>);
> +    we should not return it. See MDEV-9332.
> +
> +    The code below detects different string types:
> +
> +    a. All constant strings have Alloced_length==0 and alloced==false.
> +       They point to a static memory array, or a mem_root memory,
> +       and should stay untouched until the end of their life cycle.
> +       Not safe to reuse.
> +
> +    b. Some variable string have Alloced_length==0 and alloced==false initially,
> +       they are not bound to any char array and allocate space of the first use
> +       (and become #d). A typical example of such String is Item::str_value.
> +       This type of string could be reused, but there is no a way to distinguish
> +       them from the true constant strings (#a).
> +
> +    c. Some variable strings have Alloced_length>0 and alloced==false.
> +       They point to a fixed length writtable char array initially but can
> +       later allocate more space on the heap when the array is too small
> +       (these strings become #d after allocation).
> +       Safe to reuse.
> +
> +    d. Some variable strings have Alloced_length>0 and alloced==true.
> +       They already store data on the heap.
> +       Safe to reuse.
> +
> +    e. Some strings can have Alloced_length==0 and alloced==true.
> +       This type of strings allocate space on the heap, but then are marked
> +       as constant strings using String::mark_as_const().
> +       A typical example - the result of a character set conversion
> +       of a constant string.
> +       Not safe to reuse.
> +  */
> +  if (from->Alloced_length > 0) // "from" is  #c or #d (not a constant)
>    {
> -    (void) from->realloc(from_length);
> -    return from;
> +    if (from->Alloced_length >= from_length)
> +      return from; // #c or #d (large enough to store from_length bytes)
> +
> +    if (from->alloced)
> +    {
> +      (void) from->realloc(from_length);
> +      return from; // #d (reallocated to fit from_length bytes)
> +    }
> +    /*
> +      "from" is of type #c. It currently points to a writtable char array
> +      (typically on stack), but is too small for "from_length" bytes.
> +      We need to reallocate either "from" or "to".
> +
> +      "from" typically points to a temporary buffer inside Item_xxx::val_str(),
> +      or to Item::str_value, and thus is "less permanent" than "to".
> +
> +      Reallocating "to" may give more benifits:
> +      - "to" can point to a more permanent storage and can be reused
> +        for multiple rows, e.g. str_buffer in Protocol::send_result_set_row(),
> +        which is passed to val_str() for all string type rows.
> +      - "from" can stay pointing to its original fixed length stack char array,
> +        and thus reduce the total amount of my_alloc/my_free.
> +    */
> +  }
> +
> +  if (from == to)
> +  {
> +    /*
> +      Possible string types:
> +      #a  not possible (constants should not be passed as "to")
> +      #b  possible     (a fresh variable with no associated char buffer)
> +      #c  possible     (a variable with a char buffer,
> +                        in case it's smaller than fixed_length)
> +      #d  not possible (handled earlier)
> +      #e  not possible (constants should not be passed as "to")
> +
> +      If a string of types #a or #e appears here, that means the caller made
> +      something wrong. Otherwise, it's safe to reallocate and return "to".
> +    */
> +    (void) to->realloc(from_length);
> +    return to;

1. use 'from->realloc()' and 'return from' here
   (a bit easier to read, because between 'to->realloc()' and 'return to'
   one normally needs to copy 'from' to 'to'. Or one needs to read back to
   remember that here from == to)

2. please add asserts to make sure it's not #a or #e
   (remember not to use assert(expr1 && expr2); use two asserts in such
   a case)

3. sometimes you check the realloc's return value and sometimes you
   don't. shall we be consistent here?

>    }
>    if (to->realloc(from_length))
>      return from;				// Actually an error
>    if ((to->str_length=MY_MIN(from->str_length,from_length)))
>      memcpy(to->Ptr,from->Ptr,to->str_length);
>    to->str_charset=from->str_charset;
> -  return to;
> +  return to; // "from" was of types #a, #b, #e, or small #c.
>  }

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


Follow ups

References