maria-developers team mailing list archive

Re: RFC - query cache insert

To: Oleksandr Byelkin <sanja@xxxxxxxxxxxxxxxx>
From: Roberto Spadim <roberto@xxxxxxxxxxxxx>
Date: Tue, 6 Aug 2013 14:49:20 -0300
Cc: "maria-developers@xxxxxxxxxxxxxxxxxxx" <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <5200B6A3.9080802@montyprogram.com>

hmm, i will check the source to understand how it is done,
thanks oleksandr!

but... another question...

today the query cache has:
query blocks (with query and flags),
table blocks (with tables),
result blocks (with results)
is there any way for two (or more) query blocks to link to the same result
block?

for example...
we execute the query "SELECT * FROM A"; we will have 1 block of each
(query, table, result)
now we execute the same query, but instead of "SELECT * FROM A" we run
"select * from A" (for example)
now we will have 2 query blocks, 2 result blocks and 1 table block (we
waste the space of 1 result block)
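
a minimal sketch of the waste described above, assuming a cache keyed only on
the exact query text (the real key also carries flags, which this toy model
leaves out; count_blocks is a made-up name for illustration only):

```python
def count_blocks(queries):
    """Model a cache keyed on the exact query text (hypothetical, simplified)."""
    query_blocks = {}   # raw query text -> index of its result block
    result_blocks = []  # one stored result per distinct raw text
    for q in queries:
        if q not in query_blocks:
            # A different spelling is a different key, so the same rows
            # get stored again in a new result block.
            result_blocks.append(("rows for", q))
            query_blocks[q] = len(result_blocks) - 1
    return len(query_blocks), len(result_blocks)


# Two spellings of the same query end up with two result blocks.
print(count_blocks(["SELECT * FROM A", "select * from A"]))  # (2, 2)
```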

the point is... could we cache both the first query "SELECT * FROM A" and
the normalized version of the query "SELECT `a`.`a`,`a`.`b` .... FROM A"?
i haven't found how EXPLAIN EXTENDED writes the 'normalized' query, but
it's something like the output of "explain extended QUERY;show warnings;"

in this case we will have 2 query blocks (one for the raw query and
one for the normalized query), 1 table block and 1 result block
now we could check the "raw" query cache and then the "normalized" query
cache; when a new query hits the cache via the 'normalized' cache
(after parsing the query) we only add the 'raw' query to a query block
and return the results from the existing result block (instead of
executing the query and spending disk i/o)

for example...
1)"SELECT * FROM A" -> 1 raw query, 1 normalized query, 1 result, 1 table
(this one will be executed and consume disk i/o)
here we will have "SELECT * FROM A" (raw query) + "SELECT `a`.`a`,`a`.`b`
FROM `database`.`a`" (normalized query)

2)"SELECT * FROm A" -> 2 raw queries, 1 normalized query, 1 result, 1 table
(this one is a hit found after the parse, when we check the query cache via
the normalized cache)
here we will parse the raw query ("SELECT * FROm A") to => "SELECT
`a`.`a`,`a`.`b` FROM `database`.`a`" (normalized query), and we can find
the normalized query in the query cache; now the job is to put the raw
query into the query cache, link the result and table blocks to it, and
return the result to the client without consuming disk i/o

3)"SELECT * FRoM A" -> 3 raw queries, 1 normalized query, 1 result, 1 table
the same as (2)

4)"SELECT a,b FRoM A" -> 4 raw queries, 1 normalized query, 1 result, 1 table
the same as (2)
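
the four steps above can be sketched like this. it is only a toy model under
loud assumptions: normalize() is a made-up stand-in for the server's parser
that handles nothing but "SELECT <cols> FROM <table>", SCHEMA and DATABASE
are invented for the example, TwoLevelCache is not a real server class, and
table blocks and invalidation are left out:

```python
import re

# Hypothetical schema, used only by the toy normalizer below.
SCHEMA = {"a": ["a", "b"]}
DATABASE = "database"


def normalize(raw):
    """Toy stand-in for the parser: only SELECT <cols> FROM <table>."""
    m = re.fullmatch(r"\s*select\s+(.+?)\s+from\s+(\w+)\s*", raw, re.IGNORECASE)
    if m is None:
        raise ValueError("unsupported query in this sketch")
    table = m.group(2).lower()
    cols_text = m.group(1).strip()
    if cols_text == "*":
        cols = SCHEMA[table]  # expand * to the full column list
    else:
        cols = [c.strip().lower() for c in cols_text.split(",")]
    col_list = ",".join(f"`{table}`.`{c}`" for c in cols)
    return f"SELECT {col_list} FROM `{DATABASE}`.`{table}`"


class TwoLevelCache:
    def __init__(self):
        self.raw = {}         # raw query text -> result block id
        self.normalized = {}  # normalized query text -> result block id
        self.results = {}     # result block id -> rows
        self.executions = 0   # how many times we really ran a query

    def query(self, raw, execute):
        # 1) raw lookup, before any parsing
        if raw in self.raw:
            return self.results[self.raw[raw]]
        # 2) parse, then look up the normalized text
        norm = normalize(raw)
        block = self.normalized.get(norm)
        if block is None:
            # miss on both indexes: execute and store ONE result block
            self.executions += 1
            block = len(self.results)
            self.results[block] = execute(norm)
            self.normalized[norm] = block
        # link the new raw text to the shared result block
        self.raw[raw] = block
        return self.results[block]


def run_example():
    cache = TwoLevelCache()
    fake_rows = [(1, "x"), (2, "y")]  # pretend result of the query
    for q in ["SELECT * FROM A", "SELECT * FROm A",
              "SELECT * FRoM A", "SELECT a,b FRoM A"]:
        cache.query(q, lambda _norm: fake_rows)
    # (raw queries, normalized queries, result blocks, executions)
    return (len(cache.raw), len(cache.normalized),
            len(cache.results), cache.executions)


print(run_example())  # (4, 1, 1, 1)
```

only the first query is executed; the other three pay for the parse but reuse
the single result block.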


if we could do this we could reduce the memory consumption of the query
cache (instead of 4 result blocks we use only one, in other words 75% less
memory for results)
another pro is that the parse pays off: if the query is found in the
normalized query cache, we skip the execution and its disk i/o and return
the result via the query cache

it's a raw idea and i don't know if it can be done internally
i opened an MDEV about it some time ago but i haven't started on it; i'm
still studying the code and have other jobs in parallel to my "hobby" as a
mariadb/mysql "coder"
the MDEV is here: https://mariadb.atlassian.net/browse/MDEV-4681

thanks for your time

Follow ups

Re: RFC - query cache insert
From: Oleksandr Byelkin, 2013-08-07

References

RFC - query cache insert
From: Roberto Spadim, 2013-08-05
Re: RFC - query cache insert
From: Oleksandr Byelkin, 2013-08-06