← Back to team overview

maria-developers team mailing list archive

Re: engine condition pushdown

 

Hi, AugustQ!

First, there won't be an answer to your question here, sorry :(

A couple of thoughts:

cond->str_value is not the how you get the value of an Item.
Item::str_value is kind of a internal storage in the Item, can be used
as a cache or temporary value holder or anything. Or not used at all,
it's up to the individual item implementation. For some items it might
hold the value.

Condition pushdown was originally created for engines like NDB, that
store data remotely and need to send to the SQL executor for processing.
With condition pushdown they could avoid sending data that will be
anyway discarded by the server. So, if something isn't pushed down, it
may be because pushing it wasn't helping in that particular use case.

Now, if you'd provided a complete test case - an sql file that I could
just feed into mysql command line client and it would create and
populate tables and execute your query - I could've looked in a
debugger and answer your question exactly.

Without a test case - I tried to, but couldn't (read your blog too).

On Feb 16, AugustQ wrote:
> 
> by playing with a SQL-statement I encountered a problem. I'm not sure
> if this is a bug or if I did not use the correct method to get the
> information I expected. 
> 
> So let me please explain the situation in which I encountered this
> problem. But it will take some lines of text to come to the point.
> 
> My environment: MariaDB 10.0.10 source-code-version on Linux.
> 
> The statement: select SQL_NO_CACHE   B.PZN, B.ArtikelText  from
> TestSmall B where exists ( select 1 from TestBig A where A.Hersteller =
> '36367' and A.PZN = B.PZN) and B.Hersteller = '00020'
> Engine used is MyISAM but the effect should be independent of any
> engine (not checked). No indexes exist on the tables involved.
> 
> You will find a lot more of details here: http://augustq.blogspot.com/2
> 017/02/subselect-execution-4.html
> 
> Execution of this (silly) statement is done by a table-scan on the
> table TestSmall. When a record is found the matches the WHERE-condition 
> the server switches over to the table TestBig and does a table-scan on
> this table. When a match is found it returns to the table TestSmall,
> otherwise the table TestBig is read to the end and then the server
> returns to the table TestSmall. The server continues scanning the table
> TestSmall, switching over to TestBig, returning to TestSmall, continues
> reading TestSmall and so on, until it reaches the end of the table
> TestSmall.
> 
> Of interest for me was the condition tree which can be inspected by
> looking at the function cond_push() (in my case
> ha_myisam::cond_push()).
> 
> So let's start.
> 
> The function cond_push() is initally called for the tables TestSmall
> and TestBig, this looks good.
> 
> Then the server starts reading the table TestSmall and therefore the
> function cond_push()  is called again. This looks good too.
> 
> Now a matching record is found  in TestSmall: it switches over to the
> table TestBig and calls cond_push() with the condition-tree for this
> table. This tree looks like this:
> 
>     COND-ITEM   args: 0 type=[COND_AND_FUNC]    
> FUNC-ITEM   [=] args: 2 type=[EQ_FUNC]  
> FIELD-ITEM  [TestOpt] [A] [Hersteller]  name=<Hersteller>
> STRING-ITEM     str_length=<5>  str_value=<36367>   name=<36367>
> FUNC-ITEM   [=] args: 2 type=[EQ_FUNC]  
> FIELD-ITEM  [TestOpt] [A] [PZN] name=<PZN>
> FIELD-ITEM  [TestOpt] [B] [PZN] name=<PZN>
> 
> The last line is of interest (I've marked it in bold). For the table
> with the alias B (= TestSmall) no value is given for the column PZN but
> this value is available. This is the first point of interest.
> 
> Let's continue with the operation. When the table TestBig is read to
> the end, the scanning of TestSmall continues and a matching record is
> found over there the server again switches over to TestBig. Before it
> starts reading the records from this table the function cond_push() is
> called again. This is the output of the condition-tree, restricted to
> the line of interest (the last line):
> 
>     FIELD-ITEM  [TestOpt] [B] [PZN]
>     str_length=<2>  str_value=<37>  name=<PZN>
> 
> So let's continue: when it switches to TestBig for the 3rd time this
> output looks like:
> 
>     FIELD-ITEM  [TestOpt] [B] [PZN] str_length=<7>  str_value=<222    >
>     name=<PZN>
> 
> Enough of data. I can deliver more examples if needed or you may look
> into my text.
> 
> Judgement:
> The output of the first case des not show the value of the column PZN
> from the current record in TestSmall although the value is available.
> 
> The output in the second case shows the value 37 (because of the length
> eq. 2) but the correct value would be 3717968 (and length eq. 7).
> 
> The output in the third case shows the value 222 followed by 4 spaces
> (length eq. 7), but the correct value is 222 (length eq. 3).
> 
> What I see is the correct value (exclude the first example presented
> here) but the length-information is one row behind (one row of
> TestSmall).
> 
> For accessing the information presented here I used COND->str_value.
> 
> So the question is:
> 
> 	- the information is available but I used the wrong
> function/variable?
> 
> OR
> 
> 	- is this a bug in the code?

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


Follow ups

References