maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #10409
Re: engine condition pushdown
Hi Sergej,
attached you will find a file with commands for generating the effect.
It's not a script so please copy and paste the lines into your
favourite front-end.
Here is the result on my test:
# the result in cond_push():
for TestSmall: OK
for TestBig: OK
start scanning TestSmall: OK
1st scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] name=<PZN>
2nd scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN]
str_length=<2> str_value=<37> name=<PZN>
3rd scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<7> str_value=<222 >
name=<PZN>
4th scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<3> str_value=<371>
name=<PZN>
5th scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<7> str_value=<3717997>
name=<PZN>
as you cann see from the INSERT-statement above it is:
1st: is <>, should have been <12>
2nd: is <2>, should have been <3717968>
3rd: is <222 >, should have been <222>
4th: ia <371>, should have been <3717980>
5th: is <3717997>, this is correct!The data is almost identical to the data in my text.Hope this helps.
AugustQ
Am Sonntag, den 19.02.2017, 17:32 +0100 schrieb Sergei Golubchik:
> Hi, AugustQ!
>
> First, there won't be an answer to your question here, sorry :(
>
> A couple of thoughts:
>
> cond->str_value is not the how you get the value of an Item.
> Item::str_value is kind of a internal storage in the Item, can be
> used
> as a cache or temporary value holder or anything. Or not used at all,
> it's up to the individual item implementation. For some items it
> might
> hold the value.
>
> Condition pushdown was originally created for engines like NDB, that
> store data remotely and need to send to the SQL executor for
> processing.
> With condition pushdown they could avoid sending data that will be
> anyway discarded by the server. So, if something isn't pushed down,
> it
> may be because pushing it wasn't helping in that particular use case.
>
> Now, if you'd provided a complete test case - an sql file that I
> could
> just feed into mysql command line client and it would create and
> populate tables and execute your query - I could've looked in a
> debugger and answer your question exactly.
>
> Without a test case - I tried to, but couldn't (read your blog too).
>
> On Feb 16, AugustQ wrote:
> >
> >
> > by playing with a SQL-statement I encountered a problem. I'm not
> > sure
> > if this is a bug or if I did not use the correct method to get the
> > information I expected.
> >
> > So let me please explain the situation in which I encountered this
> > problem. But it will take some lines of text to come to the point.
> >
> > My environment: MariaDB 10.0.10 source-code-version on Linux.
> >
> > The statement: select SQL_NO_CACHE B.PZN, B.ArtikelText from
> > TestSmall B where exists ( select 1 from TestBig A where
> > A.Hersteller =
> > '36367' and A.PZN = B.PZN) and B.Hersteller = '00020'
> > Engine used is MyISAM but the effect should be independent of any
> > engine (not checked). No indexes exist on the tables involved.
> >
> > You will find a lot more of details here: http://augustq.blogspot.c
> > om/2
> > 017/02/subselect-execution-4.html
> >
> > Execution of this (silly) statement is done by a table-scan on the
> > table TestSmall. When a record is found the matches the WHERE-
> > condition
> > the server switches over to the table TestBig and does a table-scan
> > on
> > this table. When a match is found it returns to the table
> > TestSmall,
> > otherwise the table TestBig is read to the end and then the server
> > returns to the table TestSmall. The server continues scanning the
> > table
> > TestSmall, switching over to TestBig, returning to TestSmall,
> > continues
> > reading TestSmall and so on, until it reaches the end of the table
> > TestSmall.
> >
> > Of interest for me was the condition tree which can be inspected by
> > looking at the function cond_push() (in my case
> > ha_myisam::cond_push()).
> >
> > So let's start.
> >
> > The function cond_push() is initally called for the tables
> > TestSmall
> > and TestBig, this looks good.
> >
> > Then the server starts reading the table TestSmall and therefore
> > the
> > function cond_push() is called again. This looks good too.
> >
> > Now a matching record is found in TestSmall: it switches over to
> > the
> > table TestBig and calls cond_push() with the condition-tree for
> > this
> > table. This tree looks like this:
> >
> > COND-ITEM args: 0 type=[COND_AND_FUNC]
> > FUNC-ITEM [=] args: 2 type=[EQ_FUNC]
> > FIELD-ITEM [TestOpt] [A] [Hersteller] name=<Hersteller>
> > STRING-ITEM str_length=<5> str_value=<36367> name=<36367>
> > FUNC-ITEM [=] args: 2 type=[EQ_FUNC]
> > FIELD-ITEM [TestOpt] [A] [PZN] name=<PZN>
> > FIELD-ITEM [TestOpt] [B] [PZN] name=<PZN>
> >
> > The last line is of interest (I've marked it in bold). For the
> > table
> > with the alias B (= TestSmall) no value is given for the column PZN
> > but
> > this value is available. This is the first point of interest.
> >
> > Let's continue with the operation. When the table TestBig is read
> > to
> > the end, the scanning of TestSmall continues and a matching record
> > is
> > found over there the server again switches over to TestBig. Before
> > it
> > starts reading the records from this table the function cond_push()
> > is
> > called again. This is the output of the condition-tree, restricted
> > to
> > the line of interest (the last line):
> >
> > FIELD-ITEM [TestOpt] [B] [PZN]
> > str_length=<2> str_value=<37> name=<PZN>
> >
> > So let's continue: when it switches to TestBig for the 3rd time
> > this
> > output looks like:
> >
> > FIELD-ITEM [TestOpt] [B] [PZN]
> > str_length=<7> str_value=<222 >
> > name=<PZN>
> >
> > Enough of data. I can deliver more examples if needed or you may
> > look
> > into my text.
> >
> > Judgement:
> > The output of the first case des not show the value of the column
> > PZN
> > from the current record in TestSmall although the value is
> > available.
> >
> > The output in the second case shows the value 37 (because of the
> > length
> > eq. 2) but the correct value would be 3717968 (and length eq. 7).
> >
> > The output in the third case shows the value 222 followed by 4
> > spaces
> > (length eq. 7), but the correct value is 222 (length eq. 3).
> >
> > What I see is the correct value (exclude the first example
> > presented
> > here) but the length-information is one row behind (one row of
> > TestSmall).
> >
> > For accessing the information presented here I used COND-
> > >str_value.
> >
> > So the question is:
> >
> > - the information is available but I used the wrong
> > function/variable?
> >
> > OR
> >
> > - is this a bug in the code?
> Regards,
> Sergei
> Chief Architect MariaDB
> and security@xxxxxxxxxxx
# the statements:
create database TestECP;
use TestECP;
CREATE TABLE `TestSmall` (
`Id` char(8) DEFAULT NULL,
`PZN` char(7) DEFAULT NULL,
`EVP` decimal(7,2) DEFAULT NULL,
`HAP` decimal(7,2) DEFAULT NULL,
`ArtikelBez` varchar(40) DEFAULT NULL,
`ArtikelText` varchar(26) DEFAULT NULL,
`Hersteller` char(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
;
CREATE TABLE `TestBig` (
`Id` char(8) DEFAULT NULL,
`PZN` char(7) DEFAULT NULL,
`EVP` decimal(7,2) DEFAULT NULL,
`HAP` decimal(7,2) DEFAULT NULL,
`ArtikelBez` varchar(40) DEFAULT NULL,
`ArtikelText` varchar(26) DEFAULT NULL,
`Hersteller` char(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
;
insert into TestSmall values( '1002100', '12', 3.95, 1.83, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '00020'),
( '1025266', '3717968', 0.00, 2.90, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '00020'),
( '1025267', '222', 0.00, 4.45, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '00020'),
( '1025268', '3717980', 0.00, 6.30, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '00020'),
( '1025269', '3717997', 0.00, 6.30, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '00020') ;
insert into TestBig values ( '01000001', '111', 0.00, 18.91, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000025', '999999', 82.95, 0.00, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000045', '222', 0.00, 383.92, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000072', '999999', 0.00, 18.32, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000110', '999999', 0.00, 0.00, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000114', '999999', 13.06, 12.54, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000118', '204665', 25.43, 7.87, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000120', '999999', 0.00, 0.00, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000139', '999999', 11.80, 3.90, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367'),
( '01000141', '999999', 0.00, 0.00, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXXXXXXXXXX', '36367') ;
# to produce the effect described:
set optimizer_switch='engine_condition_pushdown=on';
select SQL_NO_CACHE B.PZN, B.ArtikelText from TestSmall B where exists ( select 1 from TestBig A where A.Hersteller = '36367' and A.PZN = B.PZN) and B.Hersteller = '00020';
# the result in cond_push():
for TestSmall: OK
for TestBig: OK
start scanning TestSmall: OK
1st scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] name=<PZN>
2nd scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<2> str_value=<37> name=<PZN>
3rd scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<7> str_value=<222 > name=<PZN>
4th scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<3> str_value=<371> name=<PZN>
5th scanning TestBig:
FIELD-ITEM [TestECP] [B] [PZN] str_length=<7> str_value=<3717997> name=<PZN>
as you cann see from the INSERT-statement above it is:
1st: is <>, should have been <12>
2nd: is <2>, should have been <3717968>
3rd: is <222 >, should have been <222>
4th: ia <371>, should have been <3717980>
5th: is <3717997>, this is correct!
Attachment:
signature.asc
Description: This is a digitally signed message part
Follow ups
References