← Back to team overview

maria-developers team mailing list archive

Re: Please review MDEV-9842 LOAD DATA INFILE does not work well with a TEXT column when using sjis

 

Hi, Alexander!

On Apr 01, Alexander Barkov wrote:
> Hi Sergei,
> 
> Please review a patch for MDEV-9842.
> 
> This is a prerequisite for:
> 
>  MDEV-6353 my_ismbchar() and my_mbcharlen() refactoring
> 
> Thanks.

> commit 4d615db357a5b513c915ab677afd130a0673f0da
> Author: Alexander Barkov <bar@xxxxxxxxxxx>
> Date:   Fri Apr 1 14:53:00 2016 +0400
> 
>     MDEV-9842 LOAD DATA INFILE does not work well with a TEXT column when using sjis
>     Fixing READ_INFO to store the loaded data in a String object.
>     Making sure to reserve MY_CS_MBMAXLEN bytes before loading the next character
>     in read_field().

please, always add an empty line after the commit comment subject (first
line) and the commit comment body (the rest of the comment).

> diff --git a/sql/sql_load.cc b/sql/sql_load.cc
> index f1c2920..d7ca8ef 100644
> --- a/sql/sql_load.cc
> +++ b/sql/sql_load.cc
> @@ -66,10 +66,9 @@ XML_TAG::XML_TAG(int l, String f, String v)
>  
>  class READ_INFO {
>    File	file;
> -  uchar	*buffer,			/* Buffer for read text */
> -	*end_of_buff;			/* Data in bufferts ends here */
> -  uint	buff_length,			/* Length of buffert */
> -	max_length;			/* Max length of row */
> +  String data;                          /* Read buffer */
> +  uint fixed_length;                    /* Length of the fixed length record */
> +  uint max_length;                      /* Max length of row */
>    const uchar *field_term_ptr,*line_term_ptr;
>    const char  *line_start_ptr,*line_start_end;
>    uint	field_term_length,line_term_length,enclosed_length;
> @@ -1349,10 +1348,11 @@ READ_INFO::READ_INFO(THD *thd, File file_par, uint tot_length, CHARSET_INFO *cs,
>  		     String &field_term, String &line_start, String &line_term,
>  		     String &enclosed_par, int escape, bool get_it_from_net,
>  		     bool is_fifo)
> -  :file(file_par), buffer(NULL), buff_length(tot_length), escape_char(escape),
> +  :file(file_par), fixed_length(tot_length), escape_char(escape),
>     found_end_of_line(false), eof(false),
>     error(false), line_cuted(false), found_null(false), read_charset(cs)
>  {
> +  data.set_thread_specific();
>    /*
>      Field and line terminators must be interpreted as sequence of unsigned char.
>      Otherwise, non-ascii terminators will be negative on some platforms,
> @@ -1394,18 +1394,16 @@ READ_INFO::READ_INFO(THD *thd, File file_par, uint tot_length, CHARSET_INFO *cs,
>    set_if_bigger(length,line_start.length());
>    stack= stack_pos= (int*) thd->alloc(sizeof(int) * length);
>  
> -  if (!(buffer=(uchar*) my_malloc(buff_length+1,MYF(MY_THREAD_SPECIFIC))))
> +  if (data.reserve(tot_length))
>      error=1; /* purecov: inspected */
>    else
>    {
> -    end_of_buff=buffer+buff_length;
>      if (init_io_cache(&cache,(get_it_from_net) ? -1 : file, 0,
>  		      (get_it_from_net) ? READ_NET :
>  		      (is_fifo ? READ_FIFO : READ_CACHE),0L,1,
>  		      MYF(MY_WME | MY_THREAD_SPECIFIC)))
>      {
> -      my_free(buffer); /* purecov: inspected */
> -      buffer= NULL;
> +      data.free();

do you really need to call data.free() explicitly? String destructor
takes care of that automatically.

>        error=1;
>      }
>      else
> @@ -1428,7 +1426,7 @@ READ_INFO::READ_INFO(THD *thd, File file_par, uint tot_length, CHARSET_INFO *cs,
>  READ_INFO::~READ_INFO()
>  {
>    ::end_io_cache(&cache);
> -  my_free(buffer);
> +  data.free();

same here

>    List_iterator<XML_TAG> xmlit(taglist);
>    XML_TAG *t;
>    while ((t= xmlit++))
> @@ -1492,7 +1489,7 @@ int READ_INFO::read_field()
>  
>    for (;;)
>    {
> -    while ( to < end_of_buff)
> +    while (data.length() + MY_CS_MBMAXLEN < data.alloced_length())
>      {
>        chr = GET;
>        if (chr == my_b_EOF)
>  #ifdef USE_MB
> -      if (my_mbcharlen(read_charset, chr) > 1 &&
> -          to + my_mbcharlen(read_charset, chr) <= end_of_buff)
> +      if (my_mbcharlen(read_charset, chr) > 1)
>        {

The fix would've been much easier to understand if you'd put the cleanup
change into a separate commit. And why did you change READ_INFO to use
String at all?

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


References