maria-developers team mailing list archive
Re: MDEV-10306 Wrong results with combination of CONCAT, SUBSTR and CONVERT in subquery
On 01.03.2017 at 13:56, Sergei Golubchik wrote:
IMHO 2 is the most realistic and safe. I can imagine many situations where
one item's val_* is called many times, and I have no idea how that could
easily be avoided without major refactoring (this concerns #3 and #4).
On Feb 28, Alexander Barkov wrote:
Author: Alexander Barkov <bar@xxxxxxxxxxx>
Date: Tue Feb 28 10:28:09 2017 +0400
MDEV-10306 Wrong results with combination of CONCAT, SUBSTR and CONVERT in subquery
The bug happens because of a combination of unfortunate circumstances:
1. Arguments args and args of Item_func_concat point recursively
(through Item_direct_view_ref's) to the same Item_func_conv_charset.
Both args->args->ref and args->args->ref refer to
2. When Item_func_concat::args->val_str() is called,
Item_func_conv_charset::val_str() writes its result to
3. Then, for optimization purposes (to avoid copying),
Item_func_substr::val_str() initializes Item_func_substr::tmp_value
to point to the buffer fragment owned by Item_func_conv_charset::tmp_value
Item_func_substr::tmp_value is returned as a result of
4. Due to an optimization to avoid memory reallocs,
Item_func_concat::val_str() remembers the result of args->val_str()
in "res" and further uses "res" to collect the return value.
5. When Item_func_concat::args->val_str() is called,
Item_func_conv_charset::tmp_value gets overwritten (see #1),
which effectively overwrites args's Item_func_substr::tmp_value (see #3),
which effectively overwrites "res" (see #4).
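
The chain of events above can be reproduced in miniature. The following is only a sketch under assumptions: View, Conv, Substr and concat_in_place are hypothetical stand-ins written for illustration, not the server's String or Item classes, and capacity checks are omitted. It shows how an accumulator that points into a shared buffer is clobbered when the shared item is evaluated a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning value: a pointer and a length into a buffer
// that belongs to some other object.
struct View {
    char *ptr;
    std::size_t len;
    std::string str() const { return std::string(ptr, len); }
};

// Stand-in for Item_func_conv_charset: every evaluation rewrites the
// same tmp_value buffer.
struct Conv {
    const char *source;
    char tmp_value[32];
    View val_str() {
        std::size_t n = std::strlen(source);
        assert(n < sizeof(tmp_value));
        std::memcpy(tmp_value, source, n);
        return View{tmp_value, n};
    }
};

// Stand-in for Item_func_substr: to avoid copying, it returns a fragment
// of the argument's buffer.
struct Substr {
    Conv *arg;
    std::size_t start, count;
    View val_str() {
        View v = arg->val_str();
        assert(start + count <= v.len);
        return View{v.ptr + start, count};
    }
};

// Stand-in for the CONCAT optimization: the first argument's result is
// reused as the accumulator for the return value.
std::string concat_in_place(Substr &a, char sep, Substr &b) {
    View res = a.val_str();       // res points into Conv::tmp_value
    res.ptr[res.len++] = sep;     // append the separator in place
    View second = b.val_str();    // re-evaluates Conv, overwriting res
    std::memmove(res.ptr + res.len, second.ptr, second.len);
    res.len += second.len;
    return res.str();
}
```

With a Conv over "abcdefgh", a first Substr of {0, 3} and a second of {4, 4}, the expected result is "abc-efgh", but concat_in_place returns "abcdefgh": the separator written into the shared buffer is lost when Conv runs again.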
The fix marks Item_func_substr::tmp_value as a constant string, which
tells Item_func_concat::val_str "Don't use me as the return value, you cannot
append to me because I'm pointing to a buffer owned by some other String".
This looks pretty much like a hack that makes the bug disappear in this
particular test case.
What if SUBSTR() wasn't used? CONCAT would still modify
args->tmp_value, and it would be overwritten by args->val_str().
On the other hand, if you remove args from the test case, then CONCAT
can safely modify args's buffer and marking SUBSTR as const would
prevent a valid optimization.
So, I see a few possible approaches to this and other similar queries:
1. We specify that no Item's val method can modify the buffer of the
arguments. That is, CONCAT will always have to copy. SUBSTR won't
need to copy, because it doesn't modify the buffer, it only returns a
pointer into it.
2. Maybe #1 is not strict enough, and we'll need to disallow pointers
into the arguments' buffer too. Because, perhaps, args->val_str()
could realloc and then the pointer will become invalid.
3. A different approach would be to disallow one item to appear twice in
an expression. No idea how to do that.
4. A variant of #3, an item can appear many times, but it'll be only
evaluated once per row. That still needs #1, but #2 is unnecessary.
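
To make approach #2 concrete, here is a minimal sketch, again with hypothetical stand-in types (Conv, Substr and concat are illustration only, not server code). It assumes every val_str() returns an owned copy, so nothing points into an argument's buffer. The evaluations counter is an addition of this sketch to show that the shared item is still evaluated once per reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Stand-in for the shared conversion item. It hands out copies, so a
// later evaluation cannot disturb an earlier result.
struct Conv {
    std::string source;
    int evaluations;
    explicit Conv(std::string s) : source(std::move(s)), evaluations(0) {}
    std::string val_str() {
        ++evaluations;
        return source;            // owned copy, not a pointer into Conv
    }
};

// Stand-in for SUBSTR: returns its own string instead of a fragment of
// the argument's buffer.
struct Substr {
    Conv *arg;
    std::size_t start, count;
    std::string val_str() { return arg->val_str().substr(start, count); }
};

// Stand-in for CONCAT: accumulates into a buffer that it owns.
std::string concat(Substr &a, const std::string &sep, Substr &b) {
    std::string res = a.val_str();
    res += sep;
    res += b.val_str();
    return res;
}
```

Under these assumptions the result is correct even when both arguments share one Conv, at the cost of extra copies and of evaluating the shared item twice per row, which is the part that #4 would address.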
IMHO 2 is a good idea (actually, I thought it was already done like 2).
Chief Architect MariaDB