Thread Previous • Date Previous • Date Next • Thread Next |
Hello Alexander, Here are patches for replace, trim, substring functions and empty strings: - replace.diff - trim.diff - substring.patch - empty_strings.diff Best regards, Jérôme. > -----Message d'origine----- > De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] > Envoyé : mercredi 26 avril 2017 06:12 > À : jerome brauge > Cc : MariaDB Developers > Objet : Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574 > > Hello Jerome, > > Thanks for working on this! > > On 04/25/2017 11:42 PM, jerome brauge wrote: > > Hello Alexander, > > Some times ago, we have had the below discussion. > > I made a patch following my first idea (it's enough from our point of view) : > replace empty string by null in literals and in make_empty_string. > > I think this patch can be complementary to your suggestion of adding > > an Item_func_eq_oracle (for data inserted in default sql_mode) > > > > This patch also contains following some functions changes: > > - replace ('ab','a',null) returns 'a' instead null > > - trim / ltrim / rtrim returns null when input only contains spaces > > - lpad /rpad can accept 2 args (default padding char is space) and if > > length equals to 0, returns null -substring : if start index is equal > > to 0, act as if it's equals to 1 -add a function chr() as a synonym of > > char() (but accept only one arg) -add a function lengthb() as a > > synonym of lentgh() -change the semantic of length for oracle (where > > length() is the character length and not the byte length) > > > > What do you think about this patch ? > > > > (Perhaps I have to make 2 patchs, one for empty string, and one for > > functions changes with separate test file for each modified function) > > Right, can you please split the patch into pieces? > I suggest even more pieces than two. One patch for one function. > I'll create MDEVs, one MDEV per function and put examples how Oracle > works. > > Can we do it in the following order: > > 1. LPAD / RPAD - making the third parameter optional. > This extension will be good for all sql_modes. > I checked IBM DB2, Oracle PostgreSQL, Firebird - they all have > the third parameter to LPAD/RPAD optional. > > 2. CHR(). > > 3. LENGTH() and LENGTHB(). > > > Then we can continue with REPLACE, TRIM. SUBSTRING in no particular > order, one patch per function. > > > Some general notes: > > > 1. Unify *.yy files > > Please put sql_mode dependent code into LEX rather than *.yy files directly. > We're going to unify the two files soon. See here for details: > MDEV-12518) Unify sql_yacc.yy and sql_yacc_ora.yy > > So instead of doing: > > | SUBSTRING '(' expr ',' expr ',' expr ')' > { > $$= new (thd->mem_root) Item_func_substr(thd, $3, $5, $7); > if ($$ == NULL) > MYSQL_YYABORT; > } > > > Please do something like this in both sql_yacc.yy and sql_yacc_ora.yy: > > > | SUBSTRING '(' expr ',' expr ',' expr ')' > { > if (!($$= Lex->make_item_func_substr(thd, $3, $5, $7))) > MYSQL_YYABORT; > } > > where LEX::make_item_func_substr() is a new method, creating either > Item_func_substr or Item_func_substr_oracle depending on sql_mode. > > The same applies to all other functions, and to text_literal. > > > 2. Don't use Item::is_null() when you call val_str() anyway. > > There are two reasons that: > - Performance > is_null() involves the second val_str() call, which can be resource consuming > in case of complex functions and subselect. > > - Side effect > Expressions can have some side effects. > For example, in case if the argument is a stored function which counts how > many times it was called and stores the counter in a user variable, or writes > something to a table on every execution. > If we do is_null() followed by val_str(), the side effect will done two times. > We need one time only. > > So instead of: > > String *Item_func_replace_oracle::val_str(String *str) { > String * res; > if (res = replace_value(str, > args[1]->is_null() ? &tmp_emtpystr : > rgs[1]->val_str(&tmp_value2), > args[2]->is_null() ? &tmp_emtpystr : > rgs[2]->val_str(&tmp_value3)) > > return (res->length() == 0) ? make_empty_result () : res; > else > return 0; > } > > > Please do something like this: > > String *Item_func_replace_oracle::val_str(String *str) { > String * res; > String *str1= args[1]->val_str(&tmp_value1); > String *str2= args[2]->val_str(&tmp_value2); > if (res= replace_value(str, > str1 ? str1 : &tmp_emptystr, > str2 ? str2 : &tmp_emptystr) > return (res->length() == 0) ? make_empty_result () : res; > else > return 0; > } > > > Thanks! > > > > > > > Best regards, > > Jérôme. > > > >> -----Message d'origine----- > >> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] Envoyé : lundi 23 > >> janvier 2017 12:49 À : jerome brauge Cc : MariaDB Developers Objet : > >> Re: MDEV-10142 - bb-10.2-compatibility / MDEV-10574 > >> > >> Hi Jerome, > >> > >> On 01/18/2017 01:22 PM, jerome brauge wrote: > >>> Hello Alexander, > >>> Sometime ago, when I ask you about plan for MDEV-10574, you replied : > >>> > >>>> The current plan is to do these transformations: > >>>> > >>>> 1. Transform Insert > >>>> - insert values ("") -> insert values (null) > >>>> > >>>> 2. Transform Select > >>>> > >>>> - where v=x => (v <> "" and V=X) > >>>> - where v is null => (v="" or v is null) > >>>> > >>>> We didn't plan to change functions yet. Thanks for bringing this up. > >>>> We'll discuss this. > >>> > >>> I've done some tests just by changing : > >>> - insert an Item_null instead of an Item_string when $1.length==0 in > >>> rule text_literal of sql_yacc_ora.yy > >>> - return null instead of an empty string in > >>> Item_str_func::make_empty_result > >>> > >>> My first tests seem promising. > >>> > >>> Of course this solution does not allow to "see" the records created > >>> with > >> empty strings as null values. > >>> I don't see the importance of being able to do this in a transparent way. > >>> We can explicitly select these row by adding rtrim on these columns. > >>> > >>> If you are interesting, I can begin to write a test to evaluate the > >>> coverage of > >> this solution. > >> > >> I'm afraid this will fix the behavior only for literals. > >> > >> This script: > >> > >> DROP TABLE t1; > >> CREATE TABLE t1 (a VARCHAR(10), b VARCHAR(10)); INSERT INTO t1 > VALUES > >> ('',''); SELECT COUNT(*) FROM t1 WHERE a=''; SELECT COUNT(*) FROM t1 > >> WHERE a=b; > >> > >> returns 0 for both SELECT queries in Oracle, and returns 1 for both > >> SELECT queries in MariaDB, Your approach will fix the first SELECT only. > >> The second will still return 1. > >> > >> > >> For now, I'm not sure how to do this correctly. Some ways possible: > >> > >> 1. We could add new classes, e.g. Item_func_eq_oracle as a pair for > >> Item_func_eq, which will handle both empty strings and NULLs inside > >> their > >> val_int() implementations. > >> > >> 2. Or there could be some after-processing on the created Item tree > >> which will replace all things like Item_func_eq to Item_cond_and with > >> Item_func_eq and Item_func_null_predicate passed as arguments. > >> > >> Currently I'm inclined to #1, because we don't have a good mechanism > >> to clone items to implement #2 correctly. We'd need such cloning to > >> pass arguments both to Item_func_eq and Item_func_null_predicate > correctly. > >> Some places in the code don't expect that the same arguments can be > >> passed to two different Item_func instances. > >> > >> On the other hand, implementing #1 will need more changes in the > >> optimizer. > >> > >> We were going to think on this later. > >> > >>> > >>> Best regard. > >>> Jérôme. > >>>
Attachment:
replace.diff
Description: replace.diff
Attachment:
trim.diff
Description: trim.diff
Attachment:
substring.patch
Description: substring.patch
Attachment:
empty_strings.diff
Description: empty_strings.diff
Thread Previous • Date Previous • Date Next • Thread Next |