← Back to team overview

maria-developers team mailing list archive

INSERT ... RETURNING intermediate review

 

Hi Rucha!
I've spent quite a bit of time looking at the implementation. It is working and
passes almost all test cases, although you forgot to also commit the
insert_parser updated result file. Good job so far.
I want to give you something to work on and think about for a bit before we do
another overall overview of the whole project. Hopefully this will help you
across all your coding endeavours, not just now. What I want you to learn is
that it is important to think about how your code can and will evolve in the
future. Just getting things to work is ok up to a certain point, but not
implementing things in a decoupled fashion is what we call "hacks". These hacks
are to be avoided as much as possible as it leads to much more difficult work
down the line. Let's elaborate on the topic.
There are a number of problems with our current code (hacks). I am not referring
to yours in particular. All these hacks are a bit difficult to overcome in one
go. I do not expect you to fix them yourself, not during this GSoC at least, but
it is something to consider and if you are willing to work towards fixing some
of them, excellent!
Historically our codebase evolved from a simple parser to the "monster" that you
see now. Sometimes due to time constraints, things were not implemented in the
most extendable way, rather they made use of certain particular constructs, only
valid in the contexts they were written in. Keyword here is context, as you'll
soon see. Some of this contain things you have interacted with so far in your
project. Here is an example that you've had to deal with:
A SELECT statement has what we call a SELECT LIST, this list is traditionally
stored in SELECT_LEX::item_list. This item_list however is overloaded at times.
For example DELETE ... RETURNING uses it to store the expressions part of the
RETURNING clause. This overloading generally works ok, unless you run into
situations like you did with your INSERT ... RETURNING project.
INSERT ... RETURNING clauses already made use of the item_list to store
something else. Now you were forced to introduce another list in SELECT_LEX,
which we called returning_list. Everything ok so far. But then, you ran into
another problem, the bison grammar rules do not put Items in the list you want.
They always used item_list, so you could not use your returning_list. So what
did we do then? Well, we came up with another hack! We suggested this hack to
you because it is something that will work quickly and get you unstuck and that
is to use the same grammar rules like before, but swap the item_list with the
returning list temporarily, such that the grammar rules will put the right items
in the right list. This works, but it masks the underlying problem. The problem
is that we have a very rigid implementation for select_item_list grammar rule. 
What we should do is to make select_item_list grammar return a list using the
bison stack. That means no relying on SELECT_LEX::item_list to store our result
temporarily. You have done steps towards fixing this, which brings us to where
your code's current state. The problem with your implementation is you have not
lifted the restriction of the grammar rules making use of
SELECT_LIST::item_list. Instead, you have masked it by returning the address of
that member on the bison stack. The funny part is that this works, but it still
a hack. It is still a hack, because these grammar rules have a side effect of
modifying this member behind the scenes. A very, very, very (!) good rule is to
consider all side effects as bad, especially now that you are starting out. With
experience you might get away with them every now-and-then, but for now avoid
them like the plague and try to remove them whenever possible.
The current code makes use of a function I really don't like. The inline
function add_item_list(). It is one of those hacky functions that strictly
relies on context. The context is: take the thd->lex->current_select and put the
item passed as a parameter in the item_list. The diff I've attached shows an
implementation of how to use the bison stack properly. I took inspiration from
the "expr" rule. Unfortunately the hack chain goes deeper, but we need to do
baby steps and some things are best left outside your GSoC project, or we risk
going down a rabbit whole we might not get out of.
One other issue I've observed is with the REPLACE code here:
          insert_field_spec           {            if
($6)            {              List<Item>
list;              list=*($6);              Lex->current_select-
>item_list.empty();              $6=&list;            }          }          opt_
select_expressions          {            if ($8)              Lex-
>returning_list=*($8);            if ($6)              Lex->current_select-
>item_list=*($6);            Lex->pop_select(); //main select            if
(Lex->check_main_unit_semantics())              MYSQL_YYABORT;          }
This is really problematic and your probably wrote it because of
opt_select_expressions putting stuff into item_list. In the insert_field_spec
rule, you receive the address of current_select->item_list as parameter $6. You
store the value of this list (basically the pointer to the head element) in a
temporary variable that is only in scope during the if
statement (!).  Attempting to use it after the if statement is over (via the $6
parameter) is illegal! It only works because the compiler did not chose to use
that area of memory for something else. Please find a way to store things here
properly.
After this is fixed, we'll spend the next month cleaning everything up,
documenting our code, the added test cases etc. We have an overall working
solution, but we need to make it as hack-free as possible, so we can later work
on supporting INSERT ... RETURNING in subqueries.
Let me know if anything in the email is unclear. It took me a while to write it,
trying to explain the reasoning, but I may have missed a few things. :)
Vicențiu   
diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
index 099ad17c48f..4cbe8c38e48 100644
--- a/sql/sql_yacc.yy
+++ b/sql/sql_yacc.yy
@@ -1902,6 +1902,7 @@ bool my_yyoverflow(short **a, YYSTYPE **b, size_t *yystacksize);
         simple_target_specification
         condition_number
         reset_lex_expr
+        select_item
 
 %type <item_param> param_marker
 
@@ -1911,11 +1912,14 @@ bool my_yyoverflow(short **a, YYSTYPE **b, size_t *yystacksize);
 %type <item_basic_constant> text_literal
 
 %type <item_list>
-        insert_field_spec opt_insert_update insert_values opt_select_expressions expr_list opt_udf_expr_list udf_expr_list when_list when_list_opt_else
+        insert_field_spec opt_insert_update insert_values
+        opt_select_expressions expr_list
+        opt_udf_expr_list udf_expr_list when_list when_list_opt_else
         ident_list ident_list_arg opt_expr_list
         decode_when_list_oracle
         execute_using
         execute_params
+        select_item_list
 
 %type <sp_cursor_stmt>
         sp_cursor_stmt_lex
@@ -2048,7 +2052,7 @@ bool my_yyoverflow(short **a, YYSTYPE **b, size_t *yystacksize);
         keycache_list keycache_list_or_parts assign_to_keycache
         assign_to_keycache_parts
         preload_list preload_list_or_parts preload_keys preload_keys_parts
-        select_item_list select_item values_list no_braces
+        values_list no_braces
         delete_limit_clause fields opt_values values
         no_braces_with_names opt_values_with_names values_with_names
         procedure_list procedure_list2 procedure_item
@@ -9223,6 +9227,7 @@ query_specification_start:
           select_item_list
           {
             Select->parsing_place= NO_MATTER;
+            Select->item_list= *($5);
           }
           ;
 
@@ -9546,7 +9551,15 @@ opt_lock_wait_timeout_new:
 
 select_item_list:
           select_item_list ',' select_item
+          {
+            $1->push_back($3, thd->mem_root);
+            $$= $1;
+          }
         | select_item
+          {
+            if (unlikely(!($$= List<Item>::make(thd->mem_root, $1))))
+              MYSQL_YYABORT;
+          }
         | '*'
           {
             Item *item= new (thd->mem_root)
@@ -9554,8 +9567,9 @@ select_item_list:
                                      NULL, NULL, &star_clex_str);
             if (unlikely(item == NULL))
               MYSQL_YYABORT;
-            if (unlikely(add_item_to_list(thd, item)))
+            if (unlikely(!($$= List<Item>::make(thd->mem_root, item))))
               MYSQL_YYABORT;
+
             (thd->lex->current_select->with_wild)++;
           }
         ;
@@ -9563,15 +9577,13 @@ select_item_list:
 select_item:
           remember_name select_sublist_qualified_asterisk remember_end
           {
-            if (unlikely(add_item_to_list(thd, $2)))
-              MYSQL_YYABORT;
+            $$= $2;
           }
         | remember_name expr remember_end select_alias
           {
             DBUG_ASSERT($1 < $3);
 
-            if (unlikely(add_item_to_list(thd, $2)))
-              MYSQL_YYABORT;
+            $$= $2;
             if ($4.str)
             {
               if (unlikely(Lex->sql_command == SQLCOM_CREATE_VIEW &&
@@ -13764,6 +13776,8 @@ single_multi:
           {
             if ($3)
               Select->order_list= *($3);
+            if ($5)
+              Select->item_list= *($5);
             Lex->pop_select(); //main select
           }
         | table_wild_list
@@ -13800,7 +13814,7 @@ opt_select_expressions:
           /* empty */ {$$=NULL;}
         | RETURNING_SYM select_item_list
           {
-            $$=&Lex->current_select->item_list;
+            $$= $2;
           }
         ;
 

Follow ups