
maria-developers team mailing list archive

Re: MDEV-7649 wrong result when comparing utf8 column with an invalid literal

 

Hi, Alexander!

On Apr 07, Alexander Barkov wrote:
> Hi Sergei,
> 
> I tested comparison behaviour in a variety of situations.
> 
> Please find the comment with the summary table at the end of:
> https://mariadb.atlassian.net/browse/MDEV-7649
> 
> Currently there are five possible reactions to bad bytes during
> comparison, depending on the collation, the presence of an index, and
> the character_set_connection value:
> 
> - truncate on bad byte, compare only the well-formed prefix (#1)
> - treat bad bytes as '?' (#7 and #8)
> - empty set with no warning (#3)
> - empty set with a warning (#2)
> - error (#5 and #6)
> 
> I'm not sure what to do for 10.1 and for 5.5. Please suggest.
> 
> 1. For 10.1 or 10.2 I think it would be nice to make all these cases
> work exactly the same way. I am not sure which way would be best.
> Any ideas?

I like your last idea from MDEV-7649 -
treating invalid symbols as "larger than any valid symbol".

If that's doable without major changes.
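
To make the idea concrete, here is a minimal sketch of such an ordering. It is only an illustration, not MariaDB code: the function name `weights` is hypothetical, Python's strict UTF-8 decoder stands in for the server's well-formedness check, and plain code point order stands in for a real collation. Each valid character gets its code point as a weight, and each bad byte gets a weight above U+10FFFF, so it sorts after every valid character.

```python
MAX_CODE_POINT = 0x10FFFF  # largest valid Unicode code point


def weights(data: bytes) -> list:
    """Return comparison weights for a possibly ill-formed UTF-8 string.

    Hypothetical helper: a valid character maps to its code point,
    a bad byte maps to a weight greater than any valid code point.
    """
    out = []
    i = 0
    while i < len(data):
        # Try the shortest sequence first; UTF-8 characters are 1..4 bytes.
        for length in range(1, 5):
            chunk = data[i:i + length]
            try:
                ch = chunk.decode("utf-8")
            except UnicodeDecodeError:
                continue
            out.append(ord(ch))
            i += length
            break
        else:
            # No well-formed character starts here: treat this single
            # byte as "larger than any valid symbol" and move on.
            out.append(MAX_CODE_POINT + 1 + data[i])
            i += 1
    return out


def compare(a: bytes, b: bytes) -> int:
    """Three-way comparison (-1, 0, 1) based on the weights above."""
    wa, wb = weights(a), weights(b)
    return (wa > wb) - (wa < wb)
```

With this ordering, `compare(b"a\xff", b"a")` is positive rather than zero, so a literal with a bad byte is neither truncated to its well-formed prefix nor matched against '?'.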

> 2. For 5.5 we need to fix the bug with minimal changes. Do you agree?
> I suggest we fix only #1 and maybe #2.

I'd suggest fixing only #1, and making the comparison return an empty
set.
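
A rough sketch of the difference, under stated assumptions: the helper names below are hypothetical, rows are modelled as plain byte strings, and Python's strict UTF-8 decoder stands in for the server's well-formedness check. The first function mimics case #1 (compare only the well-formed prefix); the second shows the suggested behaviour, where an ill-formed literal matches nothing.

```python
def well_formed_prefix(data: bytes) -> bytes:
    """Return the longest prefix of data that is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first undecodable byte.
        return data[:exc.start]


def select_equal_truncating(rows, literal: bytes):
    """Case #1: the literal is silently cut at the first bad byte."""
    prefix = well_formed_prefix(literal)
    return [row for row in rows if row == prefix]


def select_equal_empty_on_bad(rows, literal: bytes):
    """Suggested behaviour: an ill-formed literal yields an empty set."""
    if well_formed_prefix(literal) != literal:
        return []
    return [row for row in rows if row == literal]
```

For rows `[b"a", b"b"]` and the literal `b"a\xff"`, the truncating version returns the row `b"a"`, which is the wrong result, while the second version returns no rows.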

Regards,
Sergei


