← Back to team overview

maria-developers team mailing list archive

Re: INSERT ... RETURNING intermediate review



Intermediate review was very helpful. It has made things more clear. I
worked on it and here is the report for the same.

I committed the result file for insert_parser. Here is the link:

I have made changes according to the diff file and also fixed
REPLACE...RETURNING and have pushed the latest changes on my repo.
I also ran the tests related to INSERT...RETURNING. They are passing.

Also, I am willing to work on fixing some of the hacks once GSoC project
implementation is perfect.

Rucha Deodhar

On Sun, Jul 21, 2019, 12:05 AM Vicențiu Ciorbaru <vicentiu@xxxxxxxxxxx>

> Hi Rucha!
> I've spent quite a bit of time looking at the implementation. It is
> working and passes almost all test cases, although you forgot to also
> commit the insert_parser updated result file. Good job so far.
> I want to give you something to work on and think about for a bit before
> we do another overall overview of the whole project. Hopefully this will
> help you across all your coding endeavours, not just now. What I want you
> to learn is that it is important to think about how your code can and will
> evolve in the future. Just getting things to work is ok up to a certain
> point, but not implementing things in a decoupled fashion is what we call
> "hacks". These hacks are to be avoided as much as possible as it leads to
> much more difficult work down the line. Let's elaborate on the topic.
> There are a number of problems with our current code (hacks). I am not
> referring to yours in particular. All these hacks are a bit difficult to
> overcome in one go. I do not expect you to fix them yourself, not during
> this GSoC at least, but it is something to consider and if you are willing
> to work towards fixing some of them, excellent!
> Historically our codebase evolved from a simple parser to the "monster"
> that you see now. Sometimes due to time constraints, things were not
> implemented in the most extendable way, rather they made use of certain
> particular constructs, only valid in the contexts they were written in.
> Keyword here is context, as you'll soon see. Some of this contain things
> you have interacted with so far in your project. Here is an example that
> you've had to deal with:
> A SELECT statement has what we call a SELECT LIST, this list is
> traditionally stored in SELECT_LEX::item_list. This item_list however is
> overloaded at times. For example DELETE ... RETURNING uses it to store the
> expressions part of the RETURNING clause. This overloading generally works
> ok, unless you run into situations like you did with your INSERT ...
> RETURNING project.
> INSERT ... RETURNING clauses already made use of the item_list to store
> something else. Now you were forced to introduce another list in
> SELECT_LEX, which we called returning_list. Everything ok so far. But then,
> you ran into another problem, the bison grammar rules do not put Items in
> the list you want. They *always* used item_list, so you could not use
> your *returning_list*. So what did we do then? Well, we came up with
> another hack! We suggested this hack to you because it is something that
> will work quickly and get you unstuck and that is to use the same grammar
> rules like before, but swap the item_list with the returning list
> temporarily, such that the grammar rules will put the right items in the
> right list. This works, but it masks the underlying problem. The problem is
> that we have a very rigid implementation for *select_item_list *grammar
> rule.
> What we should do is to make *select_item_list* grammar return a list
> using the bison stack. That means no relying on SELECT_LEX::item_list to
> store our result temporarily. You have done steps towards fixing this,
> which brings us to where your code's current state. The problem with your
> implementation is you have not lifted the restriction of the grammar rules
> making use of SELECT_LIST::item_list. Instead, you have masked it by
> returning the *address* of that member on the bison stack. The funny part
> is that this *works*, but it still a hack. It is still a hack, because
> these grammar rules have a side effect of modifying this member behind the
> scenes. A very, very, very (!) good rule is to consider all side effects as
> bad, especially now that you are starting out. With experience you might
> get away with them every now-and-then, but for now avoid them like the
> plague and try to remove them whenever possible.
> The current code makes use of a function I really don't like. The inline
> function add_item_list(). It is one of those hacky functions that strictly
> relies on context. The context is: take the thd->lex->current_select and
> put the item passed as a parameter in the item_list. The diff I've attached
> shows an implementation of how to use the bison stack properly. I took
> inspiration from the "expr" rule. Unfortunately the hack chain goes deeper,
> but we need to do baby steps and some things are best left outside your
> GSoC project, or we risk going down a rabbit whole we might not get out of.
> One other issue I've observed is with the REPLACE code here:
>           insert_field_spec
>           {
>             if ($6)
>             {
>               List<Item> list;
>               list=*($6);
>               Lex->current_select->item_list.empty();
>               $6=&list;
>             }
>           }
>           opt_select_expressions
>           {
>             if ($8)
>               Lex->returning_list=*($8);
>             if ($6)
>               Lex->current_select->item_list=*($6);
>             Lex->pop_select(); //main select
>             if (Lex->check_main_unit_semantics())
>               MYSQL_YYABORT;
>           }
> This is really problematic and your probably wrote it because of
> opt_select_expressions putting stuff into item_list. In the
> insert_field_spec rule, you receive the address
> of current_select->item_list as parameter *$6*. You store the value of
> this list (basically the pointer to the head element) in a temporary
> variable that is only in scope during the if statement (!). Attempting to
> use it after the if statement is over (via the $6 parameter) is illegal! It
> only works because the compiler did not chose to use that area of memory
> for something else. Please find a way to store things here properly.
> After this is fixed, we'll spend the next month cleaning everything up,
> documenting our code, the added test cases etc. We have an overall working
> solution, but we need to make it as hack-free as possible, so we can later
> work on supporting INSERT ... RETURNING in subqueries.
> Let me know if anything in the email is unclear. It took me a while to
> write it, trying to explain the reasoning, but I may have missed a few
> things. :)
> Vicențiu