mahara-contributors team mailing list archive

[Bug 1613392] Re: PostgreSQL insert error into site_content with multibyte language packs.

 

I was finally able to replicate this! I had assumed the "\x..."
formatting in the original error stack was generated by var_dump(), but
then I realized, perhaps that was the actual content of the lang string.
And sure enough, copying the original string with all the "\x..."s in
it, instead of the decoded Japanese glyphs, replicated the error.

And I was able to minimize it down to that suspect \x81\xa7 character.

1. Modify htdocs/lang/en.utf8/install.php so that
'loggedouthomedefaultcontent' is this:

$string['loggedouthomedefaultcontent'] = "\x81\xa7";

2. Log in. Create a new institution.

Expected result: You create a new institution.
Actual result: Error stack containing this message: Failed to get a recordset: postgres8 error: [-1: ERROR:  invalid byte sequence for encoding "UTF8": 0x81]

The error goes away if you do any of these:

- Replace with "\xe3\x81\xa7" (で) http://www.mclean.net.nz/ucf/?c=U+3067
- Replace with "で"
- Replace with "\xe8\x86\xa7" (膧) http://www.mclean.net.nz/ucf/?c=U+81A7
- Replace with "膧"
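
The difference between the failing value and the working ones can also be checked outside Mahara. This is only a minimal sketch: it assumes PHP's mbstring extension is installed, and the array keys are made up for illustration. mb_check_encoding() reports whether a byte string is well-formed in the given encoding.

```php
<?php
// Candidate values for the lang string: the failing one and two working ones.
$candidates = array(
    'broken' => "\x81\xa7",      // 0x81 is a continuation byte with no lead byte
    'u3067'  => "\xe3\x81\xa7",  // U+3067
    'u81a7'  => "\xe8\x86\xa7",  // U+81A7
);

$results = array();
foreach ($candidates as $label => $bytes) {
    // true only when $bytes is well-formed UTF-8
    $results[$label] = mb_check_encoding($bytes, 'UTF-8');
    echo $label, ': ', ($results[$label] ? 'valid' : 'invalid'), " UTF-8\n";
}
```

Only the two-byte value is rejected, which matches the byte (0x81) that PostgreSQL complains about.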

So it does seem to be an issue with the lang file itself, not with
Mahara. The code change shouldn't be necessary so long as lang files
contain only valid UTF-8. User input becomes UTF-8 encoded via the
browser, due to the "<meta http-equiv="Content-type" content="text/html;
charset=UTF-8">" tag we put on every page. I suppose there is still the
possibility of data from other sources having incompatible encoding,
like RSS feeds or Leap2a files. But we'll cross that bridge when we come
to it.
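
If someone wants to check a lang pack for this kind of problem, something along these lines might do as a rough sketch. find_invalid_utf8_strings() is a hypothetical helper, not an existing Mahara function, and the second string key is invented for the example. It relies on preg_match() returning false (rather than 0) when the /u modifier is used on a subject that is not valid UTF-8, so it does not need mbstring.

```php
<?php
/**
 * Hypothetical helper (not part of Mahara): return the keys of all
 * entries in a lang-string array that are not well-formed UTF-8.
 */
function find_invalid_utf8_strings(array $strings) {
    $bad = array();
    foreach ($strings as $key => $value) {
        // With /u, preg_match() returns false when $value is not valid UTF-8.
        if (preg_match('//u', $value) === false) {
            $bad[] = $key;
        }
    }
    return $bad;
}

// Stand-in for the $string array that a lang file such as install.php fills in.
$string = array();
$string['loggedouthomedefaultcontent'] = "\x81\xa7";      // the failing value
$string['examplevalidstring']          = "\xe3\x81\xa7";  // invented key, valid UTF-8

$bad = find_invalid_utf8_strings($string);
print_r($bad);
```

Loading a real lang file into $string is left out here; the sketch only shows the check itself.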

Cheers,
Aaron

** Changed in: mahara
       Status: Incomplete => Invalid

** Changed in: mahara
       Status: Invalid => Won't Fix

** Changed in: mahara
       Status: Won't Fix => Invalid

** Changed in: mahara
       Status: Invalid => Won't Fix

-- 
You received this bug notification because you are a member of Mahara
Contributors, which is subscribed to Mahara.
Matching subscriptions: Subscription for all Mahara Contributors -- please ask on #mahara-dev or mahara.org forum before editing or unsubscribing it!
https://bugs.launchpad.net/bugs/1613392

Title:
  PostgreSQL insert error into site_content with multibyte language
  packs.

Status in Mahara:
  Won't Fix

Bug description:
  When we try to add a new institution using the Japanese language menu,
  we get the error message "Mahara: Site unavailable A nonrecoverable
  error occurred. This probably means you have encountered a bug in the
  system" and cannot add the institution.

  The Apache error log says:
  [error] [client xxx.xxx.xxx.xxx] [WAR] 18 (lib/errors.php:820) HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".] in adodb_throw(INSERT INTO "site_content" ("name", "content", "ctime", "mtime", "institution") VALUES (?, ?, ?, ?, ?), home,<h1>Mahara\xe3\x81\xab\xe3\x82\x88\xe3\x81\x86\xe3\x81\x93\xe3\x81\x9d</h1><p>[<b>\xe3\x81\x82\xe3\x81\xaa\xe3\x81\x9f\xe3\x81\xae\xe7\xb5\x84\xe7\xb9\x94\xe5\x90\x8d</b>]\xe3\x81\xaf\xe3\x82\xaa\xe3\x83\xb3\xe3\x83\xa9\xe3\x82\xa4\xe3\x83\xb3\xe3\x82\xb3\xe3\x83\x9f\xe3\x83\xa5\xe3\x83\x8b\xe3\x83\x86\xe3\x82\xa3\xe3\x82\x92\xe6\xa7\x8b\xe7\xaf\x89\xe3\x81\x99\xe3\x82\x8b\xe3\x81\x9f\xe3\x82\x81\xe3\x81\xae\xe5\x8d\x81\xe5\x88\x86\xe3\x81\xaa\xe6\xa9\x9f\xe8\x83\xbd\xe3\x82\x92\xe6\x9c\x89\xe3\x81\x99\xe3\x82\x8b\xe3\x82\xa4\xe3\x83\xb3\xe3\x82\xbf\xe3\x83\xbc\xe3\x83\x8d\xe3\x83\x83\xe3\x83\x88\xe4\xb8\x8a\xe3\x81\xae\xe3\x83\x9d\xe3\x83\xbc\xe3\x83\x88\xe3\x83\x95\xe3\x82\xa9\xe3\x83\xaa\xe3\x82\xaa\xe3\x82\xb7\xe3\x82\xb9\xe3\x83\x86\xe3\x83\xa0\xe3\x81\xa7\xe3\x81\x99\xe3\x80\x82</p><p>Mahara\xe3\x81\xab\xe9\x96\xa2\xe3\x81\x99\xe3\x82\x8b\xe8\xa9\xb3\xe7\xb4\xb0\xe3\x81\xaf<ahref="about.php">About</a>[\xe3\x81\x93\xe3\x81\xae\xe3\x83\x9a\xe3\x83\xbc\xe3\x82\xb8\xe3\x82\x92\xe5\xbf\x98\xe3\x82\x8c\xe3\x81\x9a\xe3\x81\xab\xe7\xb7\xa8\xe9\x9b\x86\xe3\x81\x97\xe3\x81\xa6\xe3\x81\x8f\xe3\x81\xa0\xe3\x81\x95\xe3\x81\x84]\xe3\x82\x92\xe3\x81\x8a\xe8\xaa\xad\xe3\x81\xbf\xe3\x81\x8f\xe3\x81\xa0\xe3\x81\x95\xe3\x81\x84\xe3\x80\x82\xe3\x81\xbe\xe3\x81\x9f\xe3\x80\x81\xe7\xa7\x81\xe3\x81\x9f\xe3\x81\xa1\xe3\x81\xab<ahref="contact.php">\xe3\x81\x8a\xe6\xb0\x97\xe8\xbb\xbd\xe3\x81\xab\xe3\x81\x8a\xe5\x95\x8f\xe3\x81\x84\xe5\x90\x88\xe3\x82\x8f\xe3\x81\x9b\xe3\x81\x8f\xe3\x81\xa0\xe3\x81\x95\xe3\x81\x84</a>\xe3\x80\x82</p><p>\xe3\x81\x82\xe3\x81\xaa\xe3\x81\x9f\xe3\x81\xaf\xe3\x81\x93\xe3\x81\xae\xe3\x83\x86\xe3\x82\xad\xe3\x82\xb9\xe3\x83\x88\xe3\x82\x92\x81\xa7\xe7\xb7\xa8\xe9\x9b\x86\xe3\x81\x99\xe3\x82\x8b\xe3\x81\x93\xe3\x81\xa8\xe3\x81\x8c\xe3\x81\xa7\xe3\x81\x8d\xe3\x81\xbe\xe3\x81\x99\xe3\x80\x82</p>,2016-08-1512:50:39,2016-08-1512:50:39,inst), referer: https://mahara.xxxxx.com/admin/users/institutions.php

  To avoid this error, I recommend modifying lib/dml.php as below.

  File:
  lib/dml.php

  Line:
  1061

  [ Before ]
    // Pull out data matching these fields
      $ddd = array();
      foreach ($columns as $column) {
          if (isset($data[$column->name])) {
              if ($column->name == $primarykey && empty($setfromseq)) {
                  continue;
              }
              $ddd[$column->name] = $data[$column->name];
          }
      }

  [ After ]
    // Pull out data matching these fields
      $ddd = array();
      foreach ($columns as $column) {
          if (isset($data[$column->name])) {
              if ($column->name == $primarykey && empty($setfromseq)) {
                  continue;
              }
              if (function_exists('mb_convert_encoding')) {
                  $ddd[$column->name] = mb_convert_encoding($data[$column->name], "UTF-8", "auto");
              } else {
                  $ddd[$column->name] = $data[$column->name];
              }
          }
      }

To manage notifications about this bug go to:
https://bugs.launchpad.net/mahara/+bug/1613392/+subscriptions

