← Back to team overview

maria-developers team mailing list archive

Re: MDEV-10142 - bb-10.2-compatibility

 

Hi Jerome,

On 05/04/2017 08:16 AM, Alexander Barkov wrote:
> Hello Jerome,
> 
> 
> On 04/27/2017 08:43 AM, jerome brauge wrote:
>> Hello Alexander,
>> This is the patch for CHR().
>> Function CHR(n) returns the character having the binary equivalent to n in the database character set. Returns null on null input.
> 
> Looking at Oracle's manual:
> 
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions019.htm
> 
> I think that:
> 
> 1. We should support the optional USING clause.
> CHR(x USING y) could be just translated to CHAR(x USING y).
> 
> 2. When no USING is specified,
> CHR(x) could be translated to CHAR(x USING database_character_set)
> 
> MariaDB's CHAR() uses &my_charset_bin as the default character set,
> which means VARBINARY result.
> Oracle's CHR() uses the database character set as the default character set.
> 
> 
> It seems that both can be done in sql_yacc_ora.yy,
> and no new class is needed.
> 
> 
> I suggest we do the following:
> 
> 1. Add this into lex.h:
> 
>   { "CHR",             SYM(CHR_SYM)},
> 
> 2. Add this into sql_yacc.yy and sql_yacc_ora.yy:
> 
> %token  CHR_SYM
> 
> This is to make this keyword non-reserved.
> 
> 
> 3. Add CHR_SYM into sql_yacc.yy, into the keyword_sp rule.
> 
> 4. Add CHR_SYM into sql_yacc_ora.yy, into the keyword_sp_not_data_type rule.

I pushed a partial patch for sql_yacc.yy and sql_yacc_ora.yy unification.

So now sql_yacc.yy also has keyword_sp_not_data_type.

CHR_SYM now needs to go to keyword_sp_not_data_type in both sql_yacc.yy
and sql_yacc_ora.yy.

> 
> This is to make this keyword non-reserved for sql_mode=ORACLE.
> 
> 
> 5. Add this into sql_yacc_ora.yy, into function_call_keyword
> 
>           CHR_SYM '(' expr_list ')'
>           {
>             $$= new (thd->mem_root)
>                 Item_func_char(thd, *$3,
>                                thd->variables.collation_database);
>             if ($$ == NULL)
>               MYSQL_YYABORT;
>           }
>         | CHR_SYM '(' expr_list USING charset_name ')'
>           {
>             $$= new (thd->mem_root) Item_func_char(thd, *$3, $5);
>             if ($$ == NULL)
>               MYSQL_YYABORT;
>           }
> 
> 
> 
> In the meanwhile I'll fix two bugs that I found reading the code
> in Item_func_char:
> 
> 
> MDEV-12680 Crash when CREATE TABLE t1 AS SELECT char(100 using filename)
> MDEV-12681 Wrong VIEW results for CHAR(0xDF USING latin1)
> 
> Thanks!
> 
>>
>> Regard,
>> Jérôme.
>>
>> PS : I share this code under terms of MCA.
>>
>>
>>> -----Message d'origine-----
>>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx]
>>> Envoyé : mercredi 26 avril 2017 06:12
>>> À : jerome brauge
>>> Cc : MariaDB Developers
>>> Objet : Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574
>>>
>>> Hello Jerome,
>>>
>>> Thanks for working on this!
>>>
>>> On 04/25/2017 11:42 PM, jerome brauge wrote:
>>>> Hello Alexander,
>>>> Some times ago, we have had the below discussion.
>>>> I made a patch following my first idea (it's enough from our point of view) :
>>> replace empty string by null in literals and in make_empty_string.
>>>> I think this patch can be complementary to your suggestion of adding
>>>> an Item_func_eq_oracle (for data inserted in default sql_mode)
>>>>
>>>> This patch also contains following some functions  changes:
>>>> - replace ('ab','a',null) returns 'a' instead null
>>>> - trim / ltrim / rtrim returns null when input only contains spaces
>>>> - lpad /rpad can accept 2 args (default padding char is space) and if
>>>> length equals to 0, returns null -substring : if start index is equal
>>>> to 0, act as if it's equals to 1 -add a function chr() as a synonym of
>>>> char() (but accept only one arg) -add a function lengthb() as a
>>>> synonym of lentgh() -change the semantic of length for oracle (where
>>>> length() is the character length and not the byte length)
>>>>
>>>> What do you think about this patch ?
>>>>
>>>> (Perhaps I have to make 2 patchs, one for empty string, and one for
>>>> functions changes with separate test file for each modified function)
>>>
>>> Right, can you please split the patch into pieces?
>>> I suggest even more pieces than two. One patch for one function.
>>> I'll create MDEVs, one MDEV per function and put examples how Oracle
>>> works.
>>>
>>> Can we do it in the following order:
>>>
>>> 1. LPAD / RPAD - making the third parameter optional.
>>>    This extension will be good for all sql_modes.
>>>    I checked IBM DB2, Oracle PostgreSQL, Firebird - they all have
>>>    the third parameter to LPAD/RPAD optional.
>>>
>>> 2. CHR().
>>>
>>> 3. LENGTH() and LENGTHB().
>>>
>>>
>>> Then we can continue with REPLACE, TRIM. SUBSTRING in no particular
>>> order, one patch per function.
>>>
>>>
>>> Some general notes:
>>>
>>>
>>> 1. Unify *.yy files
>>>
>>> Please put sql_mode dependent code into LEX rather than *.yy files directly.
>>> We're going to unify the two files soon. See here for details:
>>> MDEV-12518) Unify sql_yacc.yy and sql_yacc_ora.yy
>>>
>>> So instead of doing:
>>>
>>> | SUBSTRING '(' expr ',' expr ',' expr ')'
>>>  {
>>>    $$= new (thd->mem_root) Item_func_substr(thd, $3, $5, $7);
>>>    if ($$ == NULL)
>>>    MYSQL_YYABORT;
>>>  }
>>>
>>>
>>> Please do something like this in both sql_yacc.yy and sql_yacc_ora.yy:
>>>
>>>
>>> | SUBSTRING '(' expr ',' expr ',' expr ')'
>>>  {
>>>    if (!($$= Lex->make_item_func_substr(thd, $3, $5, $7)))
>>>      MYSQL_YYABORT;
>>>  }
>>>
>>> where LEX::make_item_func_substr() is a new method, creating either
>>> Item_func_substr or Item_func_substr_oracle depending on sql_mode.
>>>
>>> The same applies to all other functions, and to text_literal.
>>>
>>>
>>> 2. Don't use Item::is_null() when you call val_str() anyway.
>>>
>>> There are two reasons that:
>>> - Performance
>>> is_null() involves the second val_str() call, which can be resource consuming
>>> in case of complex functions and subselect.
>>>
>>> - Side effect
>>> Expressions can have some side effects.
>>> For example, in case if the argument is a stored function which counts how
>>> many times it was called and stores the counter in a user variable, or writes
>>> something to a table on every execution.
>>> If we do is_null() followed by val_str(), the side effect will done two times.
>>> We need one time only.
>>>
>>> So instead of:
>>>
>>> String *Item_func_replace_oracle::val_str(String *str) {
>>>   String * res;
>>>   if (res = replace_value(str,
>>>             args[1]->is_null() ? &tmp_emtpystr :
>>> rgs[1]->val_str(&tmp_value2),
>>>             args[2]->is_null() ? &tmp_emtpystr :
>>> rgs[2]->val_str(&tmp_value3))
>>>
>>>     return (res->length() == 0) ? make_empty_result () : res;
>>>   else
>>>     return 0;
>>> }
>>>
>>>
>>> Please do something like this:
>>>
>>> String *Item_func_replace_oracle::val_str(String *str) {
>>>   String * res;
>>>   String *str1= args[1]->val_str(&tmp_value1);
>>>   String *str2= args[2]->val_str(&tmp_value2);
>>>   if (res= replace_value(str,
>>>                          str1 ? str1 : &tmp_emptystr,
>>>                          str2 ? str2 : &tmp_emptystr)
>>>     return (res->length() == 0) ? make_empty_result () : res;
>>>   else
>>>     return 0;
>>> }
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>>>>
>>>> Best regards,
>>>> Jérôme.
>>>>
>>>>> -----Message d'origine-----
>>>>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] Envoyé : lundi 23
>>>>> janvier 2017 12:49 À : jerome brauge Cc : MariaDB Developers Objet :
>>>>> Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574
>>>>>
>>>>> Hi Jerome,
>>>>>
>>>>> On 01/18/2017 01:22 PM, jerome brauge wrote:
>>>>>> Hello Alexander,
>>>>>> Sometime ago, when I ask you about plan for MDEV-10574, you replied :
>>>>>>
>>>>>>> The current plan is to do these transformations:
>>>>>>>
>>>>>>> 1. Transform Insert
>>>>>>> - insert values ("") -> insert values (null)
>>>>>>>
>>>>>>> 2. Transform Select
>>>>>>>
>>>>>>> - where v=x => (v <> "" and V=X)
>>>>>>> - where v is null => (v="" or v is null)
>>>>>>>
>>>>>>> We didn't plan to change functions yet. Thanks for bringing this up.
>>>>>>> We'll discuss this.
>>>>>>
>>>>>> I've done some tests just by changing :
>>>>>> - insert an Item_null instead of an Item_string when $1.length==0 in
>>>>>> rule text_literal of sql_yacc_ora.yy
>>>>>> - return null instead of an empty string in
>>>>>> Item_str_func::make_empty_result
>>>>>>
>>>>>> My first tests seem promising.
>>>>>>
>>>>>> Of course this solution does not allow to "see" the records created
>>>>>> with
>>>>> empty strings as null values.
>>>>>> I don't see the importance of being able to do this in a transparent way.
>>>>>> We can explicitly select these row by adding rtrim on these columns.
>>>>>>
>>>>>> If you are interesting, I can begin to write a test to evaluate the
>>>>>> coverage of
>>>>> this solution.
>>>>>
>>>>> I'm afraid this will fix the behavior only for literals.
>>>>>
>>>>> This script:
>>>>>
>>>>> DROP TABLE t1;
>>>>> CREATE TABLE t1 (a VARCHAR(10), b VARCHAR(10)); INSERT INTO t1
>>> VALUES
>>>>> ('',''); SELECT COUNT(*) FROM t1 WHERE a=''; SELECT COUNT(*) FROM t1
>>>>> WHERE a=b;
>>>>>
>>>>> returns 0 for both SELECT queries in Oracle, and returns 1 for both
>>>>> SELECT queries in MariaDB, Your approach will fix the first SELECT only.
>>>>> The second will still return 1.
>>>>>
>>>>>
>>>>> For now, I'm not sure how to do this correctly. Some ways possible:
>>>>>
>>>>> 1. We could add new classes, e.g. Item_func_eq_oracle as a pair for
>>>>> Item_func_eq, which will handle both empty strings and NULLs inside
>>>>> their
>>>>> val_int() implementations.
>>>>>
>>>>> 2. Or there could be some after-processing on the created Item tree
>>>>> which will replace all things like Item_func_eq to Item_cond_and with
>>>>> Item_func_eq and Item_func_null_predicate passed as arguments.
>>>>>
>>>>> Currently I'm inclined to #1, because we don't have a good mechanism
>>>>> to clone items to implement #2 correctly. We'd need such cloning to
>>>>> pass arguments both to Item_func_eq and Item_func_null_predicate
>>> correctly.
>>>>> Some places in the code don't expect that the same arguments can be
>>>>> passed to two different Item_func instances.
>>>>>
>>>>> On the other hand, implementing #1 will need more changes in the
>>>>> optimizer.
>>>>>
>>>>> We were going to think on this later.
>>>>>
>>>>>>
>>>>>> Best regard.
>>>>>> Jérôme.
>>>>>>


References