← Back to team overview

maria-developers team mailing list archive

Re: Rev 2765: Change Field_enumerator to enumerate Item_field-s not Field-s. in file:///home/psergey/dev/maria-5.3-subqueries-r7/

 

On Sun, Feb 21, 2010 at 09:07:34AM +0200, Sergey Petrunya wrote:
> On Sun, Feb 21, 2010 at 09:32:26AM +0300, Sergey Petrunya wrote:
> > At file:///home/psergey/dev/maria-5.3-subqueries-r7/
> > 
> > ------------------------------------------------------------
> > revno: 2765
> > revision-id: psergey@xxxxxxxxxxxx-20100221063223-h0f7u2low7rtjixc
> > parent: psergey@xxxxxxxxxxxx-20100221033618-83dgm2h9ingzmhcc
> > committer: Sergey Petrunya <psergey@xxxxxxxxxxxx>
> > branch nick: maria-5.3-subqueries-r7
> > timestamp: Sun 2010-02-21 08:32:23 +0200
> > message:
> >   Change Field_enumerator to enumerate Item_field-s not Field-s.
> >   In Item_ref::fix_fields() do invoke mark_as_dependent() for outside 
> >   references in all cases (see email for more details)
> 
> The problem occurs only when using --ps-protocol and manifests itself as
> follows:
> 
> CURRENT_TEST: main.subselect
> mysqltest: At line 258: query 'select numeropost as a FROM t1 ORDER BY (SELECT 1 FROM t1 HAVING a=1)' succeeded - should have failed with errno 1242...
> 
> The result from queries just before the failure was:
> < snip >
> CREATE TABLE `t1` (
> `numeropost` mediumint(8) unsigned NOT NULL auto_increment,
> `maxnumrep` int(10) unsigned NOT NULL default '0',
> PRIMARY KEY  (`numeropost`),
> UNIQUE KEY `maxnumrep` (`maxnumrep`)
> ) ENGINE=MyISAM ROW_FORMAT=FIXED;
> INSERT INTO t1 (numeropost,maxnumrep) VALUES (1,0),(2,1);
> select numeropost as a FROM t1 GROUP BY (SELECT 1 FROM t1 HAVING a=1);
> ERROR 21000: Subquery returns more than 1 row
> 
> 
> The lack of error is caused by "HAVING a=1" not being true, which itself is
> caused by "a" having garbage, because the optimizer evaluates the subquery
> before it starts exection of the upper select, which it does because it sees
> that
>   
>   subselect_item->used_tables() == 0  (*)
> 
> which is wrong.
> 
> (*) occurs because during EXECUTE running fix_fields() for "HAVING a=1" does
> not trigger any mark_as_dependent() calls. 
> 
> (I'm not sure why this happens in this case but not in others. Normally,
> fix_fields() call for outer references will cause mark_as_dependent calls. 
> There is a whole mark_select_range_as_dependent() function which is called 
> only by EXECUTE (and not by regular non-PS execution).
> 
> Phyisically it looks as follows:
> 
> 
> bool Item_ref::fix_fields(THD *thd, Item **reference)
> {
>   ...
>   if (!ref || ref == not_found_item)
>   {
>      (All mark_as_dependent() or mark_select_range_as_dependent() calls are
>      made from here)
>   }
>   DBUG_ASSERT(*ref);
>   ...
> }
> 
> For the example query, the if branch is taken during PREPARE command, but not 
> in EXECUTE. This means that during EXECUTE branch we don't call
> mark.*as_dependent(), which causes subselect's outside reference not to be
> recorded, and then subselect_item->used_tables()==0 which leads to errors.
> 
> To avoid the problem, I've changed the code to be:
> 
> bool Item_ref::fix_fields(THD *thd, Item **reference)
> {
>   ...
>   if (!ref || ref == not_found_item)
>   {
>      (All mark_as_dependent() or mark_select_range_as_dependent() calls are
>      made from here)
>   }
>   else
>   {
>     (Enumerate all table field references in *ref and call mark_as_dependent()
>     as necessary)
>   }
>   DBUG_ASSERT(*ref);
>   ...
> }

An alternative would be to make Item_ref resolve its reference in every
fix_fields() call. So I've tried to do that in this way:

=== modified file 'sql/item.cc'
--- sql/item.cc 2010-02-24 11:33:42 +0000
+++ sql/item.cc 2010-02-27 15:51:00 +0000
@@ -6141,6 +6141,7 @@ void Item_ref::cleanup()
 {
   DBUG_ENTER("Item_ref::cleanup");
   Item_ident::cleanup();
+  ref= 0; //psergey-try
   result_field= 0;
   DBUG_VOID_RETURN;
 }


And then I get numerous failures like this:

./mysql-test-run --retry=1 --ps-protocol t/subselect.test

CURRENT_TEST: main.subselect
mysqltest: At line 101: query 'select * from t3 where a not in (select b from t2)' failed: 1054: Unknown column '<no matter>.<left expr>' in 'where clause'

This means that Item_ref is unable to re-resolve Item_ref::ref on second re-execution.

The question for Sanja is whether it's possible to make Item_ref re-resolve 
Item_ref::ref and make appropriate mark_*as_dependent() calls.

BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog



References