← Back to team overview

maria-developers team mailing list archive

Re: On constructing ref access from mis-matched charset comparisons

 

Hi, Sergey!

On Feb 24, Sergey Petrunia wrote:
> == Long form ==
> 
> === Example dataset ===
> 
> create table t0 (
>   a varchar(32) character set utf8 collate utf8_general_ci
> );
> 
> create table t1 (
>   a varchar(32) character set latin1 collate latin1_swedish_ci,
>   col1 varchar(32),
>   key(a)
> );
> 
> insert into t0 select a from ten;
> insert into t1 select a,a from one_k;
> 
> === Queries ===
> 
> MariaDB [test]> explain 
>     -> select * from t0, t1 where t0.a=t1.a;
> +------+-------------+-------+------+---------------+------+---------+------+
> | id   | select_type | table | type | possible_keys | key  | key_len | ref  |
> +------+-------------+-------+------+---------------+------+---------+------+
> |    1 | SIMPLE      | t0    | ALL  | NULL          | NULL | NULL    | NULL |
> |    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |
> +------+-------------+-------+------+---------------+------+---------+------+
> 2 rows in set (0.00 sec)
> 
> Oops, the index t1(a) is not used.  Still it feels like the optimizer could
> 
> * Try to convert the value of t0.a into latin1.  
> * If it can't be represented in latin1, then we know that no row
>   in t1 has t1.a=t0.a.
> * make a lookup on t1.a= convert(t0.a using latin1) , using t1.a's collation
>   rules.
> 
> Is this true [for some charsets] ?

I suppose it's true for *some charsets*, it's not true generally.
It is certainly not true that if you can convert, you can compare. For
example, let's compare t1.a with a latin1_german2_ci literal, for
example, "ö". You can convert "ö" to latin1_swedish_ci. But

  t1.a= "ö" collate latin1_german2_ci

is not the same as

  t1.a= "ö" collate latin1_swedish_ci

In the first case you'll find 'oe', in the second - you won't.

Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx


References