maria-discuss team mailing list archive

Re: Limited Unicode Support?

 

Thanks for the replies. I've tried to just replace all occurrences of
"utf8" in my example with "utf8mb4" and it works.

Inconveniently, this will require major conversions and downtime for my
application, but at least I know what I must do to make it work.

However, the "mb4" sounds a little suspicious. While there are no
Unicode code points numbered high enough yet to make such a measure
necessary, the UTF-8 encoding allows for characters up to seven bytes
long, if I am not mistaken. Does utf8mb4 allow for characters longer than
four bytes, if and when the time comes?
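
On the byte-length question, here is a minimal Python sketch (Python is not
used anywhere in this thread, and `utf8_len` is a hypothetical helper name
chosen for illustration). It only shows how many bytes UTF-8 needs for a few
code points, including the emoji from the example and U+10FFFF, the highest
code point Unicode currently defines. As far as I know, RFC 3629 restricts
UTF-8 to that range, so four bytes per character is the maximum in practice.

```python
# Sketch only: count the UTF-8 bytes needed for selected code points.
# utf8_len is a hypothetical helper, not part of MariaDB or any library.

def utf8_len(code_point: int) -> int:
    """Return the number of bytes UTF-8 uses to encode one code point."""
    return len(chr(code_point).encode("utf-8"))

samples = {
    "U+0041 (A)": 0x41,
    "U+00F6 (ö)": 0xF6,
    "U+FFFF (last code point of the Basic Multilingual Plane)": 0xFFFF,
    "U+1F64B (🙋)": 0x1F64B,
    "U+10FFFF (highest Unicode code point)": 0x10FFFF,
}

for label, cp in samples.items():
    print(f"{label}: {utf8_len(cp)} byte(s)")

# The emoji's bytes, for comparison with the ones quoted in the reply below.
print("🙋".encode("utf-8").hex(" "))
```

Anything above U+FFFF needs four bytes, which is why a 3-byte "utf8" column
cannot hold the emoji, while a 4-byte encoding covers the whole current range.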

On Thu, 10 Oct 2019 at 17:18, Diego Dupin <
diego.dupin@xxxxxxxxxxx> wrote:

> Hi Björn,
>
> 🙋 is a 4-byte encoded character (0xF0 0x9F 0x99 0x8B).
>
> "utf8" is a 3-byte UTF-8 Unicode encoding.
> You have to configure the charset "utf8mb4", which permits full UTF-8 support.
> https://jira.mariadb.org/browse/MDEV-8334 in 10.5 is the first step towards
> making utf8mb4 the default for 'utf8'.
>
> regards,
> diego.
>
>
> On Thu, Oct 10, 2019 at 3:53 PM Björn Keil <schattenkeil@xxxxxxxxxxxxxx>
> wrote:
>
>> Hello,
>>
>> I hope this is the proper mailing list to ask such questions; I apologise
>> if it isn't.
>>
>> I am having some problems with unusual Unicode characters in my MariaDB
>> database.
>>
>> $ mariadb --version
>> mariadb  Ver 15.1 Distrib 10.3.17-MariaDB, for debian-linux-gnu (x86_64)
>> using readline 5.2
>> $ sudo ./mariadb.php
>> [sudo] Passwort für bjoern:
>> Query: INSERT INTO `test` SET `string` = '🙋 Huhu. wie geht es dir?'
>> Inserted: '🙋 Huhu. wie geht es dir?'
>> Returned: '???? Huhu. wie geht es dir?'
>>
>> SHOW VARIABLES LIKE 'character%':
>> character_set_client utf8
>> character_set_connection utf8
>> character_set_database utf8
>> character_set_filesystem binary
>> character_set_results utf8
>> character_set_server latin1
>> character_set_system utf8
>> character_sets_dir /usr/share/mysql/charsets/
>>
>> As you can see here, MariaDB does not take the character '🙋' (
>> https://www.fileformat.info/info/unicode/char/1f64b/index.htm ) and
>> instead replaces it with four question marks and I have no idea why.
>>
>> I've attached the PHP code for the example.
>>
>> I would be most grateful for any suggestion.
>>
>> Regards,
>> Björn Keil
>> _______________________________________________
>> Mailing list: https://launchpad.net/~maria-discuss
>> Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~maria-discuss
>> More help   : https://help.launchpad.net/ListHelp
>>
>
