← Back to team overview

maria-developers team mailing list archive

Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574

 

Hello Jerome,

Thanks for working on this!

On 04/25/2017 11:42 PM, jerome brauge wrote:
> Hello Alexander,
> Some times ago, we have had the below discussion.
> I made a patch following my first idea (it's enough from our point of view) : replace empty string by null in literals and in make_empty_string.
> I think this patch can be complementary to your suggestion of adding an Item_func_eq_oracle (for data inserted in default sql_mode)
> 
> This patch also contains following some functions  changes:
> - replace ('ab','a',null) returns 'a' instead null
> - trim / ltrim / rtrim returns null when input only contains spaces
> - lpad /rpad can accept 2 args (default padding char is space) and if length equals to 0, returns null
> -substring : if start index is equal to 0, act as if it's equals to 1
> -add a function chr() as a synonym of char() (but accept only one arg)
> -add a function lengthb() as a synonym of lentgh()
> -change the semantic of length for oracle (where length() is the character length and not the byte length)
> 
> What do you think about this patch ?
> 
> (Perhaps I have to make 2 patchs, one for empty string, and one for functions changes with separate test file for each modified function)

Right, can you please split the patch into pieces?
I suggest even more pieces than two. One patch for one function.
I'll create MDEVs, one MDEV per function and put examples how Oracle works.

Can we do it in the following order:

1. LPAD / RPAD - making the third parameter optional.
   This extension will be good for all sql_modes.
   I checked IBM DB2, Oracle PostgreSQL, Firebird - they all have
   the third parameter to LPAD/RPAD optional.

2. CHR().

3. LENGTH() and LENGTHB().


Then we can continue with REPLACE, TRIM. SUBSTRING in no particular
order, one patch per function.


Some general notes:


1. Unify *.yy files

Please put sql_mode dependent code into LEX rather than *.yy files
directly. We're going to unify the two files soon. See here for details:
MDEV-12518) Unify sql_yacc.yy and sql_yacc_ora.yy

So instead of doing:

| SUBSTRING '(' expr ',' expr ',' expr ')'
 {
   $$= new (thd->mem_root) Item_func_substr(thd, $3, $5, $7);
   if ($$ == NULL)
   MYSQL_YYABORT;
 }


Please do something like this in both sql_yacc.yy and sql_yacc_ora.yy:


| SUBSTRING '(' expr ',' expr ',' expr ')'
 {
   if (!($$= Lex->make_item_func_substr(thd, $3, $5, $7)))
     MYSQL_YYABORT;
 }

where LEX::make_item_func_substr() is a new method,
creating either Item_func_substr or Item_func_substr_oracle
depending on sql_mode.

The same applies to all other functions, and to text_literal.


2. Don't use Item::is_null() when you call val_str() anyway.

There are two reasons that:
- Performance
is_null() involves the second val_str() call, which can be
resource consuming in case of complex functions and subselect.

- Side effect
Expressions can have some side effects.
For example, in case if the argument is a stored function
which counts how many times it was called and stores the counter in a
user variable, or writes something to a table on every execution.
If we do is_null() followed by val_str(), the side effect will done two
times. We need one time only.

So instead of:

String *Item_func_replace_oracle::val_str(String *str)
{
  String * res;
  if (res = replace_value(str,
            args[1]->is_null() ? &tmp_emtpystr :
rgs[1]->val_str(&tmp_value2),
            args[2]->is_null() ? &tmp_emtpystr :
rgs[2]->val_str(&tmp_value3))

    return (res->length() == 0) ? make_empty_result () : res;
  else
    return 0;
}


Please do something like this:

String *Item_func_replace_oracle::val_str(String *str)
{
  String * res;
  String *str1= args[1]->val_str(&tmp_value1);
  String *str2= args[2]->val_str(&tmp_value2);
  if (res= replace_value(str,
                         str1 ? str1 : &tmp_emptystr,
                         str2 ? str2 : &tmp_emptystr)
    return (res->length() == 0) ? make_empty_result () : res;
  else
    return 0;
}


Thanks!



> 
> Best regards,
> Jérôme.
> 
>> -----Message d'origine-----
>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx]
>> Envoyé : lundi 23 janvier 2017 12:49
>> À : jerome brauge
>> Cc : MariaDB Developers
>> Objet : Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574
>>
>> Hi Jerome,
>>
>> On 01/18/2017 01:22 PM, jerome brauge wrote:
>>> Hello Alexander,
>>> Sometime ago, when I ask you about plan for MDEV-10574, you replied :
>>>
>>>> The current plan is to do these transformations:
>>>>
>>>> 1. Transform Insert
>>>> - insert values ("") -> insert values (null)
>>>>
>>>> 2. Transform Select
>>>>
>>>> - where v=x => (v <> "" and V=X)
>>>> - where v is null => (v="" or v is null)
>>>>
>>>> We didn't plan to change functions yet. Thanks for bringing this up.
>>>> We'll discuss this.
>>>
>>> I've done some tests just by changing :
>>> - insert an Item_null instead of an Item_string when $1.length==0 in
>>> rule text_literal of sql_yacc_ora.yy
>>> - return null instead of an empty string in
>>> Item_str_func::make_empty_result
>>>
>>> My first tests seem promising.
>>>
>>> Of course this solution does not allow to "see" the records created with
>> empty strings as null values.
>>> I don't see the importance of being able to do this in a transparent way.
>>> We can explicitly select these row by adding rtrim on these columns.
>>>
>>> If you are interesting, I can begin to write a test to evaluate the coverage of
>> this solution.
>>
>> I'm afraid this will fix the behavior only for literals.
>>
>> This script:
>>
>> DROP TABLE t1;
>> CREATE TABLE t1 (a VARCHAR(10), b VARCHAR(10)); INSERT INTO t1 VALUES
>> ('',''); SELECT COUNT(*) FROM t1 WHERE a=''; SELECT COUNT(*) FROM t1
>> WHERE a=b;
>>
>> returns 0 for both SELECT queries in Oracle, and returns 1 for both SELECT
>> queries in MariaDB, Your approach will fix the first SELECT only.
>> The second will still return 1.
>>
>>
>> For now, I'm not sure how to do this correctly. Some ways possible:
>>
>> 1. We could add new classes, e.g. Item_func_eq_oracle as a pair for
>> Item_func_eq, which will handle both empty strings and NULLs inside their
>> val_int() implementations.
>>
>> 2. Or there could be some after-processing on the created Item tree which
>> will replace all things like Item_func_eq to Item_cond_and with
>> Item_func_eq and Item_func_null_predicate passed as arguments.
>>
>> Currently I'm inclined to #1, because we don't have a good mechanism to
>> clone items to implement #2 correctly. We'd need such cloning to pass
>> arguments both to Item_func_eq and Item_func_null_predicate correctly.
>> Some places in the code don't expect that the same arguments can be
>> passed to two different Item_func instances.
>>
>> On the other hand, implementing #1 will need more changes in the
>> optimizer.
>>
>> We were going to think on this later.
>>
>>>
>>> Best regard.
>>> Jérôme.
>>>


Follow ups

References