← Back to team overview

maria-discuss team mailing list archive

Re: Collations, trailing spaces and unique indexes

 

Just a quick idea... Perhaps you can include an additional field in the unique index containing the length on the string. So

Str,	str_len
'a', 	1
'a ',	2

Unq index on (str, str_len). 

Rhys

-----Original Message-----
From: Maria-discuss [mailto:maria-discuss-bounces+rhys.campbell=swisscom.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Binarus
Sent: Friday, March 11, 2016 9:19 AM
To: maria-discuss@xxxxxxxxxxxxxxxxxxx
Subject: [Maria-discuss] Collations, trailing spaces and unique indexes

Dear all,

I already have done some MySQL development and now eventually would like to switch to MariaDB because I no longer can stand the following:

"All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces. “Comparison” in this context does not include the LIKE pattern-matching operator, for which trailing spaces are significant."

"For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error."

(taken from https://dev.mysql.com/doc/refman/5.5/en/char.html)

I did not find any information whether the collations which MariaDB provides are of type PADSPACE as in MySQL (none of the collation tables from MariaDB's documentation does contain this information) or not.

To make a long story short: I would like to have CHAR and VARCHAR fields with unique indexes which consider 'a' and 'a ' to be different values (which is the only reasonable point of view IMHO). This is not possible with MySQL. Is it possible with MariaDB? I am generally working with charset UTF8 and would be happy with a general solution, but also with a non-general solution which only works with UTF8.

Since the index behaviour obviously depends on the collation, would building an own collation which does not PADSPACE be an option? I have read vague hints about that in MariaDB's documentation, but didn't try yet (before doing so, I would like to know if there is an easy, clean solution).

Thank you very much for any ideas,

Binarus

_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

Follow ups

References