← Back to team overview

maria-developers team mailing list archive

Issue with index condition pushdown being used with no end_range set

 

Hello all,

I've noticed a scenario with index condition pushdown (ICP) in MariaDB
that leads to really bad performance in TokuDB. I don't know if it is
a bug in ICP or if TokuDB is misusing/misunderstanding the API.

Here is the problem. Suppose we run the following query on the following schema:
SELECT * FROM foo WHERE col1 = '7' ORDER BY b desc
| foo   | CREATE TABLE `foo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `col1` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
  `b` int(11) DEFAULT NULL,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  CLUSTERING KEY `col1_2` (`col1`,`b`)
) ENGINE=TokuDB AUTO_INCREMENT=8001 DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci ROW_FORMAT=TOKUDB_ZLIB |

The output of explain is:
MariaDB [test]> explain SELECT * FROM foo WHERE col1 = '7' ORDER BY b desc;
+------+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
| id   | select_type | table | type | possible_keys | key    | key_len
| ref   | rows | Extra       |
+------+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
|    1 | SIMPLE      | foo   | ref  | col1_2        | col1_2 | 302
| const | 1320 | Using where |
+------+-------------+-------+------+---------------+--------+---------+-------+------+-------------+
1 row in set (2.32 sec)

The query is doing a reverse range scan, as it should.

We get handler::idx_cond_push called, but end_range is not set. As a
result, TokuDB thinks it can use index condition pushdown to filter
rows. As we do this and get to the end, because end_range is not set,
we never get a result of ICP_OUT_OF_RANGE, and always get a result of
ICP_NO_MATCH. So, when we go to retrieve that first row past the end
of the range, the row that will tell MySQL it should stop searching,
TokuDB never finds a row and never gets ICP_NO_MATCH. It scans to the
beginning of the index (because it is running in reverse order),
getting ICP_NO_MATCH for every row it encounters.

Although my index is clustering, I've seen this with a normal key as well.

If col1 is an int instead of the funky varchar, handler::idx_cond_push
is never called, so this problem does not exist.

It seems to me that if end_range is not set and we cannot reliably
learn when we go out of range, we should not have
handler::idx_cond_push called, otherwise we can get bad performance
such as the example above.

Thoughts?

Thanks

--Zardosht


Follow ups