← Back to team overview

maria-developers team mailing list archive

MDEV-10142 - bb-10.2-compatibility / MDEV-10574

 

Hello Alexander,
Here are patches for replace, trim, substring functions and empty strings:
 - replace.diff
 - trim.diff
- substring.patch
- empty_strings.diff

Best regards,
Jérôme.

> -----Message d'origine-----
> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx]
> Envoyé : mercredi 26 avril 2017 06:12
> À : jerome brauge
> Cc : MariaDB Developers
> Objet : Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574
> 
> Hello Jerome,
> 
> Thanks for working on this!
> 
> On 04/25/2017 11:42 PM, jerome brauge wrote:
> > Hello Alexander,
> > Some times ago, we have had the below discussion.
> > I made a patch following my first idea (it's enough from our point of view) :
> replace empty string by null in literals and in make_empty_string.
> > I think this patch can be complementary to your suggestion of adding
> > an Item_func_eq_oracle (for data inserted in default sql_mode)
> >
> > This patch also contains following some functions  changes:
> > - replace ('ab','a',null) returns 'a' instead null
> > - trim / ltrim / rtrim returns null when input only contains spaces
> > - lpad /rpad can accept 2 args (default padding char is space) and if
> > length equals to 0, returns null -substring : if start index is equal
> > to 0, act as if it's equals to 1 -add a function chr() as a synonym of
> > char() (but accept only one arg) -add a function lengthb() as a
> > synonym of lentgh() -change the semantic of length for oracle (where
> > length() is the character length and not the byte length)
> >
> > What do you think about this patch ?
> >
> > (Perhaps I have to make 2 patchs, one for empty string, and one for
> > functions changes with separate test file for each modified function)
> 
> Right, can you please split the patch into pieces?
> I suggest even more pieces than two. One patch for one function.
> I'll create MDEVs, one MDEV per function and put examples how Oracle
> works.
> 
> Can we do it in the following order:
> 
> 1. LPAD / RPAD - making the third parameter optional.
>    This extension will be good for all sql_modes.
>    I checked IBM DB2, Oracle PostgreSQL, Firebird - they all have
>    the third parameter to LPAD/RPAD optional.
> 
> 2. CHR().
> 
> 3. LENGTH() and LENGTHB().
> 
> 
> Then we can continue with REPLACE, TRIM. SUBSTRING in no particular
> order, one patch per function.
> 
> 
> Some general notes:
> 
> 
> 1. Unify *.yy files
> 
> Please put sql_mode dependent code into LEX rather than *.yy files directly.
> We're going to unify the two files soon. See here for details:
> MDEV-12518) Unify sql_yacc.yy and sql_yacc_ora.yy
> 
> So instead of doing:
> 
> | SUBSTRING '(' expr ',' expr ',' expr ')'
>  {
>    $$= new (thd->mem_root) Item_func_substr(thd, $3, $5, $7);
>    if ($$ == NULL)
>    MYSQL_YYABORT;
>  }
> 
> 
> Please do something like this in both sql_yacc.yy and sql_yacc_ora.yy:
> 
> 
> | SUBSTRING '(' expr ',' expr ',' expr ')'
>  {
>    if (!($$= Lex->make_item_func_substr(thd, $3, $5, $7)))
>      MYSQL_YYABORT;
>  }
> 
> where LEX::make_item_func_substr() is a new method, creating either
> Item_func_substr or Item_func_substr_oracle depending on sql_mode.
> 
> The same applies to all other functions, and to text_literal.
> 
> 
> 2. Don't use Item::is_null() when you call val_str() anyway.
> 
> There are two reasons that:
> - Performance
> is_null() involves the second val_str() call, which can be resource consuming
> in case of complex functions and subselect.
> 
> - Side effect
> Expressions can have some side effects.
> For example, in case if the argument is a stored function which counts how
> many times it was called and stores the counter in a user variable, or writes
> something to a table on every execution.
> If we do is_null() followed by val_str(), the side effect will done two times.
> We need one time only.
> 
> So instead of:
> 
> String *Item_func_replace_oracle::val_str(String *str) {
>   String * res;
>   if (res = replace_value(str,
>             args[1]->is_null() ? &tmp_emtpystr :
> rgs[1]->val_str(&tmp_value2),
>             args[2]->is_null() ? &tmp_emtpystr :
> rgs[2]->val_str(&tmp_value3))
> 
>     return (res->length() == 0) ? make_empty_result () : res;
>   else
>     return 0;
> }
> 
> 
> Please do something like this:
> 
> String *Item_func_replace_oracle::val_str(String *str) {
>   String * res;
>   String *str1= args[1]->val_str(&tmp_value1);
>   String *str2= args[2]->val_str(&tmp_value2);
>   if (res= replace_value(str,
>                          str1 ? str1 : &tmp_emptystr,
>                          str2 ? str2 : &tmp_emptystr)
>     return (res->length() == 0) ? make_empty_result () : res;
>   else
>     return 0;
> }
> 
> 
> Thanks!
> 
> 
> 
> >
> > Best regards,
> > Jérôme.
> >
> >> -----Message d'origine-----
> >> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] Envoyé : lundi 23
> >> janvier 2017 12:49 À : jerome brauge Cc : MariaDB Developers Objet :
> >> Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574
> >>
> >> Hi Jerome,
> >>
> >> On 01/18/2017 01:22 PM, jerome brauge wrote:
> >>> Hello Alexander,
> >>> Sometime ago, when I ask you about plan for MDEV-10574, you replied :
> >>>
> >>>> The current plan is to do these transformations:
> >>>>
> >>>> 1. Transform Insert
> >>>> - insert values ("") -> insert values (null)
> >>>>
> >>>> 2. Transform Select
> >>>>
> >>>> - where v=x => (v <> "" and V=X)
> >>>> - where v is null => (v="" or v is null)
> >>>>
> >>>> We didn't plan to change functions yet. Thanks for bringing this up.
> >>>> We'll discuss this.
> >>>
> >>> I've done some tests just by changing :
> >>> - insert an Item_null instead of an Item_string when $1.length==0 in
> >>> rule text_literal of sql_yacc_ora.yy
> >>> - return null instead of an empty string in
> >>> Item_str_func::make_empty_result
> >>>
> >>> My first tests seem promising.
> >>>
> >>> Of course this solution does not allow to "see" the records created
> >>> with
> >> empty strings as null values.
> >>> I don't see the importance of being able to do this in a transparent way.
> >>> We can explicitly select these row by adding rtrim on these columns.
> >>>
> >>> If you are interesting, I can begin to write a test to evaluate the
> >>> coverage of
> >> this solution.
> >>
> >> I'm afraid this will fix the behavior only for literals.
> >>
> >> This script:
> >>
> >> DROP TABLE t1;
> >> CREATE TABLE t1 (a VARCHAR(10), b VARCHAR(10)); INSERT INTO t1
> VALUES
> >> ('',''); SELECT COUNT(*) FROM t1 WHERE a=''; SELECT COUNT(*) FROM t1
> >> WHERE a=b;
> >>
> >> returns 0 for both SELECT queries in Oracle, and returns 1 for both
> >> SELECT queries in MariaDB, Your approach will fix the first SELECT only.
> >> The second will still return 1.
> >>
> >>
> >> For now, I'm not sure how to do this correctly. Some ways possible:
> >>
> >> 1. We could add new classes, e.g. Item_func_eq_oracle as a pair for
> >> Item_func_eq, which will handle both empty strings and NULLs inside
> >> their
> >> val_int() implementations.
> >>
> >> 2. Or there could be some after-processing on the created Item tree
> >> which will replace all things like Item_func_eq to Item_cond_and with
> >> Item_func_eq and Item_func_null_predicate passed as arguments.
> >>
> >> Currently I'm inclined to #1, because we don't have a good mechanism
> >> to clone items to implement #2 correctly. We'd need such cloning to
> >> pass arguments both to Item_func_eq and Item_func_null_predicate
> correctly.
> >> Some places in the code don't expect that the same arguments can be
> >> passed to two different Item_func instances.
> >>
> >> On the other hand, implementing #1 will need more changes in the
> >> optimizer.
> >>
> >> We were going to think on this later.
> >>
> >>>
> >>> Best regard.
> >>> Jérôme.
> >>>

Attachment: replace.diff
Description: replace.diff

Attachment: trim.diff
Description: trim.diff

Attachment: substring.patch
Description: substring.patch

Attachment: empty_strings.diff
Description: empty_strings.diff


References