← Back to team overview

maria-developers team mailing list archive

Re: MDEV-13418 Compatibility: The order of evaluation of SELECT..INTO assignments

 

Hello Jerome,


On 11/09/2017 12:03 PM, jerome brauge wrote:
> Hello Alexander,
> 
>> -----Message d'origine-----
>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx]
>> Envoyé : mardi 7 novembre 2017 16:34
>> À : jerome brauge
>> Cc : 'MariaDB Developers (maria-developers@xxxxxxxxxxxxxxxxxxx)'
>> Objet : Re: MDEV-13418 Compatibility: The order of evaluation of
>> SELECT..INTO assignments
>>
>> Hi Jerome,
>>
>>
>> On 11/06/2017 01:01 PM, jerome brauge wrote:
>>> Hello Alexander,
>>> We have checked all our stored procedure and the good news for us is that
>> only 2 statements are concerned !
>>> So, MDEV-13418 is not mandatory for us.
>>>
>>> However, can we consider to do a full parse just at compile time ?
>>> It would allow to :
>>>  - determining variables which must have a temporary pair and
>>> modifying stored source code to handle them
>>>  - issuing error or warning when these statements have syntax error or
>>> call unknown functions ( like Oracle, SQLserver , UDB , .. )
>>
>> I'm not sure I understood about full parser at compile time.
>> Can you please give an example?
>>
> 
> Currently, the following function is compiled without any warning and run fine as long as b1 is not between 1 and 5.
> 
> delimiter /
> CREATE or replace procedure p1(b1 integer)
> BEGIN
>   declare res INTEGER default 0;  
>   if b1 = 0 then
>     -- unknow variable
>     set res=x;
>   end if;
>   if b1 = 1 then
>     -- unknow function in set
>     set res=lenngth('v');
>   end if;
>   if b1 = 2 then
>     -- unknow variable
>     select a into res;
>   end if;
>   if b1 = 3 then
>     -- unknow table / column / function
>    select 1 into res from unknowtable where unknowcolumn = unknowfunc();   
>   end if;
>   if b1 = 4 then
>     -- unknow group by column 
>     select 1 into res from dual group by 13;
>   end if;
>   if b1 = 5 then
>     -- unknow function in test 
>     if coalesc(res,'x') = 'y' then
>       set res:='z';
>     end if;
>   end if;
> 
> END
> /
> call p1(-1)
> /
> 
> My idea is to have different behavior during "create procedure" and load_routine.
> Load_routine must be fast but "create procedure" can be slower and do more check (at least check existence of variables, tables, columns, functions and procedure)
> It's not truly acceptable to throw this kind of error only when the code is really executed.

At CREATE time, we can only add a warning when an unknown procedure or
function is called. Issuing errors is not possible, because one would
not be able to create two routines mutually executing each other
(now it is possible).

So issuing errors is possible at execution time only.

When a routine is loaded, it could be checked for all
called procedures and functions, and a warning or an
error could be issued if some routine is missing
(independently on conditions in control structures).


> 
> To do this, each select list expression have to be "parse" and so we can determine variables which must have a temporary pair.
> Ex:
> CREATE PROCEDURE p1(res OUT VARCHAR)
> AS
>   b1 INT:=10;
> BEGIN
>   SELECT  1,CAST(b1 AS VARCHAR(10)) INTO b1, res FROM dual;  
> END;
> 
> Can be transform and store as
> CREATE PROCEDURE p1(res OUT VARCHAR)
> AS
>   b1 INT:=10;
> BEGIN
>   DECLARE  _b1 INTEGER DEFAULT b1;
>   SELECT  1,CAST(b1 AS VARCHAR(10)) INTO b1, res FROM dual;  
>   SET b1=_b1;
> END;
> 
> So, no additional works should be done during load_routine.

Now we always put the routine into mysql.proc AS IS,
how the definer defined it.

I'm not sure if we should rewrite CREATE statements...


>>>
>>> Without this second point we are exposed to keystrokes errors and to find
>> these mistakes only at runtime.
>>> It's already very hard to have tests that cover most of the code, but here
>> we must have 100% of coverage !
>>> Fortunately for us, SQLServer and Oracle does the work for us :-)
>>>
>>> Other idea : maybe the behavior of "set" could be conditioned by another
>> sql_mode.
>>> Sybase ,  SQLServer, Mariadb : multiple "set" are evaluated as multiple
>> single "set"
>>> DB2 : work as select (each expression is evaluated before the
>>> assignments are performed)
>>>
>>> We could have three sql_mode:
>>> - SET_CONSISTENCY_UPDATE
>>> - SET_CONSISTENCY_SELECT
>>> - SET_CONSISTENCY_SET
>>
>> What would SET_CONSISTENCY_SELECT stand for?
>> Can you give an example?
> 
> I thought to:
> - SET_CONSISTENCY_UPDATE for MDEV-13417 : UPDATE produces wrong values if an updated column is later used as an update source
> - SET_CONSISTENCY_SELECT for MDEV-13418 : The order of evaluation of SELECT..INTO assignments
> - SET_CONSISTENCY_SET  :  evaluate expression before any assignment in "grouped" SET 

Thanks!

I thought that behavior of SET is directly related to behavior of
SELECT INTO, as both save data into variables. Why have two different
sql_mode options for SET and SELECT INTO?

Btw, why not have just one option SET_CONSISTENCY,
which will control all three (UPDATE, SELECT INTO, SET) ?

I didn't understand why you propose separate flags.
Can you clarify please?

> 
>>
>> Thanks!
>>
>>
>>
>>>
>>> Regards,
>>> Jérôme.
>>>
>>>
>>>> -----Message d'origine-----
>>>> De : jerome brauge
>>>> Envoyé : vendredi 3 novembre 2017 12:18 À : 'Alexander Barkov'
>>>> Cc : MariaDB Developers (maria-developers@xxxxxxxxxxxxxxxxxxx)
>>>> Objet : RE: MDEV-13418 Compatibility: The order of evaluation of
>>>> SELECT..INTO assignments
>>>>
>>>> Hi Alexander,
>>>> Thanks for this idea, but for us, cpu is a critical hotspot.
>>>> I will evaluate how many statements in our application are affected.
>>>> If this number is not to high, we modify them.
>>>> In this case, we will need two sql_mode (one for MDEV-13417 and one
>>>> for MDEV-13418).
>>>>
>>>> Best regards.
>>>> Jérôme.
>>>>
>>>>
>>>>> -----Message d'origine-----
>>>>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] Envoyé : vendredi 3
>>>>> novembre 2017 11:02 À : jerome brauge Cc : MariaDB Developers
>>>>> (maria-developers@xxxxxxxxxxxxxxxxxxx)
>>>>> Objet : Re: MDEV-13418 Compatibility: The order of evaluation of
>>>>> SELECT..INTO assignments
>>>>>
>>>>> Hello Jerome,
>>>>>
>>>>>
>>>>> On 11/03/2017 12:22 PM, jerome brauge wrote:
>>>>>> Hello Alexander.
>>>>>> I've begun to implement your proposal but now I'm not sure that
>>>>>> it's a
>>>>> better solution than mine.
>>>>>> Let me explain .
>>>>>> - first : number of temporary variables can be significant because
>>>>>> we don't
>>>>> know when they are really needed and their scope are local to the
>>>>> statement.
>>>>>>
>>>>>>   declare b1 INTEGER;
>>>>>>   declare res INTEGER;
>>>>>> ...
>>>>>>   if b1 = 0 then
>>>>>>     select 1,b1+1 into b1, res from dual;
>>>>>>   end if;
>>>>>>   if b1 = 1 then
>>>>>>     select 2,b1+2 into b1, res from dual;
>>>>>>   end if;
>>>>>>
>>>>>> will be transform in :
>>>>>>
>>>>>>   declare b1 INTEGER;
>>>>>>   declare res INTEGER;
>>>>>> ...
>>>>>>   if b1 = 0 then
>>>>>>     declare _b1 INTEGER default res;
>>>>>>     declare _res INTEGER default res;
>>>>>>     select 1,b1+1 into _b1, _res from dual;
>>>>>>     set b1=_b1;
>>>>>>     set res=_res;
>>>>>>     -- _res is not needed, but we don't know because the select
>>>>>> statement is
>>>>> not parsed
>>>>>>   end if;
>>>>>>   if b1 = 1 then
>>>>>>     declare _b1 INTEGER default res;
>>>>>>     declare _res INTEGER default res;
>>>>>>     select 2,b1+2 into b1, res from dual;
>>>>>>     set b1=_b1;
>>>>>>     set res=_res;
>>>>>>     -- same thing here, and we have declare two variables for each
>>>>>> target
>>>>> variables
>>>>>>   end if;
>>>>>>
>>>>>> Perhaps we could
>>>>>>  - declare these temporary variables only one time in the first
>>>>>> frame of the stored procedure (may be tricky)
>>>>>>  - parse columns of each select to know what variables are really
>>>>>> assigned and reused  (heavy cost in cpu and time)
>>>>>>
>>>>>> - second : if we can't determine variables which must have a
>>>>>> temporary
>>>>> variable, number of sp_instr_set and sp_instr_set_var will be too
>>>>> high and their cpu cost is not acceptable.
>>>>>>
>>>>>> My first solution has a fixed memory impact (and memory is not an
>>>>>> issue
>>>>> nowadays), and especially a very light cpu cost.
>>>>>>
>>>>>> What do you think about ?
>>>>>
>>>>> I agree that determining variables which must have a temporary pair
>>>>> by full scanning the SELECT list expressions is probably not a good idea.
>>>>>
>>>>> Declaring these variables only one time is easy.
>>>>> I earlier made about the same trick with cursor parameters.
>>>>> See sp_pcontext::retrieve_field_definitions().
>>>>>
>>>>> The idea is that we can put such backup variables into a separate
>>>>> child sp_pcontext frame, to make sure that only one backup variable
>>>>> exists if the real variable appears multiple time as a SELECT INTO
>>>>> target. Having a dedicated frame allows:
>>>>> - to add variables in the middle of a BEGIN..END block, without
>>>>> having to re- enumerate local variables of the same block.
>>>>> - handle unique names
>>>>>
>>>>> See the comment in  sp_pcontext::retrieve_field_definitions() about
>>>>> frame positions and run-time positions, and the CURSOR declaration
>>>>> grammar in sql_yacc_ora.yy.
>>>>>
>>>>>
>>>>> Every sp_pcontext frame should have one sp_pcontext frame for
>> backup
>>>>> variables, which should be put into m_children.
>>>>>
>>>>> sp_pcontext should probably have a new flag:
>>>>>   bool m_is_backup;
>>>>> So we can iterate through m_children and find the backup frame.
>>>>>
>>>>> Another option is to add:
>>>>>
>>>>>   int m_backup_child_context_index;
>>>>>
>>>>> which will store -1 if the current frame does not have a backup
>>>>> child yet, or a non-negative value meaning the index in m_children.
>>>>> So a new method could looks like this:
>>>>>
>>>>> sp_pcontext *get_backup_context()
>>>>> {
>>>>>   if (m_backup_child_contex_index < 0)
>>>>>     return 0;
>>>>>   return m_children(m_backup_child_context_index);
>>>>> }
>>>>>
>>>>>
>>>>>>
>>>>>> Regard,
>>>>>> Jérôme.
>>>>>>
>>>>>>
>>>>>>> -----Message d'origine-----
>>>>>>> De : Alexander Barkov [mailto:bar@xxxxxxxxxxx] Envoyé : lundi 30
>>>>>>> octobre 2017 10:02 À : jerome brauge Objet : Re: MDEV-14139
>>>>>>> Anchored data types for variables
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/28/2017 07:29 PM, Alexander Barkov wrote:
>>>>>>>>
>>>>>>>> On 10/27/2017 10:27 PM, Alexander Barkov wrote:
>>>>>>>>>   Hello Jerome,
>>>>>>>>>
>>>>>>>>> I have implemented "MDEV-14139 Anchored data types for
>>>> variables".
>>>>>>>>> and pushed it to bb-10.2-ext:
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>> https://github.com/MariaDB/server/commit/5dd5253f7e50c21fa758e2eb58f
>>>>>>> 3
>>>>>>>>> aa9c9754e733
>>>>>>>>>
>>>>>>>>> So now it should be easier to implement consistent SET by
>>>>>>>>> creating backup variables.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> LEX::sp_variable_declarations_vartype_finalize() implements the
>>>>>>>>> logic which copies data type from another variable.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The idea is that for all variables, which are assignment targets
>>>>>>>>> in a SET or a SELECT INTO statement, we create a backup variable.
>>>>>>>>>
>>>>>>>>> It will involve these calls for every such variable:
>>>>>>>>>
>>>>>>>>> LEX::sp_variable_declarations_init(thd, 1);
>>>>>>>>> sp_pcontext::add_variable(thd, backup_variable_name);
>>>>>>>>> LEX::sp_variable_declarations_vartype_finalize(thd, 1,
>>>>>>>>> orig_variable_name, def);
>>>>>>>>>
>>>>>>>>> where "def" is Item_splocal created for the original variable.
>>>>>>>>
>>>>>>>> Just an idea: instead of creating sp_instr_set, it's easier to
>>>>>>>> introduce a new class sp_instr_set_var, to copy the value from
>>>>>>>> one variable to another variable.
>>>>>>>>
>>>>>>>> This operation will not need neither Item, nor sp_lex_keeper. It
>>>>>>>> will only need two offsets:
>>>>>>>> - the source variable offset and
>>>>>>>> - the target variable offset.
>>>>>>>>
>>>>>>>> Using these offsets, we can access to
>>>>>>>> spcont->m_var_table->field[source] and
>>>>>>>> spcont->spcont->m_var_table->field[target]
>>>>>>>> and copy the value between them using Field::store_field().
>>>>>>>>
>>>>>>>> This won't however for the ROW variables at the moment, because
>>>>>>>> ROW fields are stored in the Item_spvar_args::m_table member of
>>>>>>>> Item_field_row.
>>>>>>>>
>>>>>>>> It seems we need a new class Field_row and move
>> Virtual_tmp_table
>>>>>>>> from Item_field_row to Field_row.
>>>>>>>>
>>>>>>>> I will try to do it.
>>>>>>>
>>>>>>>
>>>>>>> I have implemented "MDEV-14212 Add Field_row for SP ROW
>> variables"
>>>>>>> and pushed to bb-10.2-ext.
>>>>>>>
>>>>>>> Also, added a comment:
>>>>>>>
>>>>>>> MDEV-13418 Compatibility: The order of evaluation of SELECT..INTO
>>>>>>> assignments
>>>>>>>
>>>>>>>
>>>>>>> Now, when MDEV-14212 is done, these declarations:
>>>>>>>
>>>>>>>     DECLARE a_tmp TYPE OF a DEFAULT a;
>>>>>>>     DECLARE b_tmp TYPE OF b DEFAULT b;
>>>>>>>
>>>>>>> and these assignments:
>>>>>>>
>>>>>>>     SET a=a_tmp;
>>>>>>>     SET b=b_tmp;
>>>>>>>
>>>>>>> can use a new sp_instr_setvar instead of sp_inst_set.
>>>>>>>
>>>>>>> The new command sp_instr_setvar can do direct copying between
>> two
>>>>>>> spcont->m_var_table->field[XXX], without a need to create Item and
>>>> LEX.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Greetings.
>>>>>>>>>


Follow ups

References