← Back to team overview

maria-developers team mailing list archive

Re: Problem with prep_where (LP BUG#675248)

 

Hi!

On Mon, Nov 15, 2010 at 04:47:54PM +0200, Oleksandr Byelkin wrote:
> Preparation statement for PS looks like this (example uses prep_where,  
> but we have the same problem for other parts stored in  
> st_select_lex::fix_prepare_information()):
>
>
> ...
>    conds->fix_fields(...&conds...) /* (1.1) JOIN::prepare()*/
> ...
>    prep_where= *conds;  /* (1.2)  
> st_select_lex::fix_prepare_information() */
>    *conds= where= prep_where->copy_andor_structure()
> ...
>
> Then executing of the prepared statement is like this:
>
> where= prep_where->copy_andor_structure() /* (2.1) reinit_stmt_before_use */
>    ...
> where->fix_fields() /* (2.2) JOIN::prepare() */
>    ...
>
> The problem is that during (1.1) we could substitute conds with  
> reference on other item (for example if we change reference on  
> Item_field on Item_ref* or Item_ref on Item_field). The new item will be  
> destroyed after preparation (i.e before (2.1)). So during (2.1)  
> prep_where points on freed memory.
>
> How to solve the problem.
>
> 1. Workarounds
>
>   1.1. Make changing Items inherited from Item_ident (Item_field and  
> Item_ref*) during resolving permanent.
>     Pros:
>       - Looks logical
>       - reduce work during second preparation
>     Cons:
>       - Play with statement memory usually complex enough.
>       - have to be sure that there is no any reference on the original Item
we will also need to make sure that each re-written item $i allows one to make
a $i->fix_fields() call (although I'm not sure what exactly its meaning should
be). 

Another possible concern is that currently fix_fields() calls for some items
have side effects, e.g. running item->fix_fields() on a reference to a field 
in ancestor query will mark the subquery predicate we're in as correlated.
I'm not certain, but it could turn out that for some kinds of items, we need to 
re-do the item->fix_fields() call for every PS re-execution because of the side
effects it has.  It's possible to move that code out somewhere of course, but
that can be labor-intensive and end up in a big patch.

>   1.2 Add new method for Item, which return 'this' pointer but for  
> Item_idents creating during resolving original Item. (1.2) turns to  
> something like:
> prep_where= (*conds)->original_ident_item();
>     Pros:
>       - Looks simple
>     Cons:
>       - Item already overloaded with methods like this
I really don't like the cons part here.

>   1.3 Assign prep_where before (1.1)
>     Pros:
>       - Easy to implement
>     Cons:
>       - Need changing logic and semantic of some calls and  
> prep_where/prep_having... class variables.
>
> 2. Global solutions for the problem of double/triple/... reference on  
> transformed expression (it is not applicadle for 5.1-5.3 but could de  
> universal solution in 5.5 of higher):
>    2.1 Absolutely transparent wrapper to save only one reference on any  
> expression which could be transformed:
>     Pros:
>        - Could base class for many other wrappers
>     Cons:
>        - It is difficult to implement such absolutely transparent wrapper.
>        - A lot of places where it should be put, make Item tree bigger

I suspect, without a clear definition of where it should be put, we will end up
having this wrapper injected in too many (and at the same time, too few :) places.

>    2.2 Turn THD::change_list to hash (for example) and chack presence of 
> assigning variable in the list, if it found, also register changing for 
> variable we are assigning value now. For example instead of prep_where= 
> *conds; use thd->assign_changeable(&prep_where, conds), where:
> void THD::assign_changeable(void **newref, void **oldref)
> {
>   if (in_changed_references(oldref)
>     change_item_tree(newref, *oldref);
>   else
>     *newref= *oldref;
> }
>      Pros:
>        - no additional work for usual statements
>      Cons:
>        - List supports order of assigments easyly, but hash can't
>        - time for finding pointers in the hash

So, this approach is an attempt to extend the thd->change_item_tree()
mechanism to handle Item_field->Item_ref change/reversal for the case when the
changed item is the top-most item of the WHERE clause. It seems to be nice that
we won't have to introduce additional entities.

With regards to performance, I don't think we should care to even have a hash
table: a query will have at most O(#subselects)  assign_changeable() calls, and
I think we could safely assume that there are not more than O(#subselects)
thd->change_item_tree() calls, too.

BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog



References