← Back to team overview

maria-discuss team mailing list archive

Re: execute failed: Incorrect string value: '\xD6sterl...' with mariadb and perl DBD

 

Hi Dave,

You can convert iso8859-1 to UTF8 using the iconv utility. Then feed the output into MariaDB.

-FG

> On Jun 19, 2019, at 2:21 PM, Dave Wreski <dwreski@xxxxxxxxxxxxxxxxxxx> wrote:
> 
> Hi Felipe,
> 
>> On 6/19/19 1:59 PM, Felipe Gasper wrote:
>> It looks like your input, with “Ö” as a single byte \xd6, is not
>> UTF-8. Any Unicode code point above 127 requires at least 2 bytes in
>> UTF-8. So if MySQL is expecting UTF-8, it may be choking on what it
>> sees as invalid input.
> 
> It's from an email with the following:
> 
> Content-Type: text/plain; charset=iso-8859-1
> Content-Disposition: inline
> Content-Transfer-Encoding: quoted-printable
> 
> Do you have any recommendations for fixing it? I can't imagine this is the first time it's been encountered. I thought about escaping or encoding it, but I don't think joomla would know to decode it.
> 
> I also found this reference, which points to making changes to the tables and columns, but it's like seven years old:
> 
> https://mathiasbynens.be/notes/mysql-utf8mb4
> 
> Is this still relevant? If so, is there a way to automate the changes? There are dozens of tables...
> 
> Thanks,
> Dave
> 
>> -F
>>> On Jun 19, 2019, at 1:56 PM, Dave Wreski
>>> <dwreski@xxxxxxxxxxxxxxxxxxx> wrote:
>>> Hi,
>>> I'm trying to use perl-DBD to write a buffer of text that contains
>>> an email with umlauts and other non-ASCII characters to a joomla
>>> database and having a problem.
>>> DBD::mysql::st execute failed: Incorrect string value:
>>> '\xD6sterl...' for column `lsv5webstage`.`xuxgc_content`.`fulltext`
>>> at row 1 at /home/alerts/scripts_linstage/AdvisoryTest.pm line
>>> 373.
>>> I'm not familiar enough with how encoding works to fully understand
>>> what the problem is. This is a fedora29 system with mariadb-10.3.12
>>> and joomla-3.9.
>>> Apparently the '\xD6' is an O with an umlaut in "Sebastian
>>> �sterlund". I read something about utf8 not being able to handle
>>> 4-char, but I don't fully understand.
>>> I found the following reference online which talks about changing
>>> the encoding type from utf8 to utf8mb4, but the tables all appear
>>> to already be using that encoding:
>>>> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR
>>> Variable_name LIKE 'collation%'; +--------------------------+--------------------+ | Variable_name
>>> | Value              | +--------------------------+--------------------+ |
>>> character_set_client     | utf8mb4            | |
>>> character_set_connection | utf8mb4            | |
>>> character_set_database   | utf8mb4            | |
>>> character_set_filesystem | binary             | |
>>> character_set_results    | utf8mb4            | |
>>> character_set_server     | utf8mb4            | |
>>> character_set_system     | utf8               | |
>>> collation_connection     | utf8mb4_unicode_ci | |
>>> collation_database       | utf8mb4_unicode_ci | | collation_server
>>> | utf8mb4_unicode_ci | +--------------------------+--------------------+
>>> execute failed: Incorrect string value: '\xD6sterl...' with mariadb
>>> and perl DBD a novice perl programmer I'm not sure it's helpful,
>>> but this is the insert statement I'm using in my perl code:
>>> my $sql                 = <<EOF; INSERT INTO xuxgc_content (title,
>>> alias, introtext, `fulltext`, state, catid, created, created_by,
>>> created_by_alias, modified, modified_by, checked_out,
>>> checked_out_time, publish_up, publish_down, images, urls, attribs,
>>> version, ordering, metakey, metadesc, metadata, access, hits,
>>> language) VALUES ($title, "$title_alias", $introText, $fullText,
>>> $state, $catid, $created, $created_by, $created_by_alias,
>>> $modified, $modified_by, $checked_out, $checked_out_time,
>>> $publish_up, $publish_down, $images, $urls, $attribs, $version,
>>> $ordering, $metakey, $metadesc, $metadata, $access, $hits,
>>> $language); EOF
>>> my $sth = $dbh->prepare($sql); $sth->execute(); db_disconnect($dbh);
>>> I don't recall having this problem in the past, and this script has
>>> been in use for quite a while.
>>> What am I doing wrong?
>>> _______________________________________________ Mailing list:
>>> https://launchpad.net/~maria-discuss Post to     :
>>> maria-discuss@xxxxxxxxxxxxxxxxxxx Unsubscribe :
>>> https://launchpad.net/~maria-discuss More help   :
>>> https://help.launchpad.net/ListHelp



References