
ubuntu-developer-manual team mailing list archive

Re: language issue: "fr_CA" (French Canada) doesn't work

 

So for example, the pt_BR.po file contains an error (unescaped underscore) that I introduced (and committed, oops).

This is the result:
"
$ ./lang_pdfs
po file: po/pt_BR.po has Errors:
    msgstr "pt_BR Intro to LP"
Errors in po files. Stopping.
"

It reports which po file has the error and the msgstr that contains it.

Escape that underscore, and all PDFs for all valid linguas are created.

Cheers,
Kyle



On 01/06/2011 05:15 PM, Kyle Nitzsche wrote:
Hi Kevin,

My solution just flags errors in po files IF they exist. It doesn't actually escape them.

My po files worked fine, until I manually introduced an unescaped underscore, something a translator might do.

I did this when testing "fr_CA". That is, I added "fr_CA" to a msgstr in a po file to test translation/PDF generation. This caused the PDF build to fail.

If translators were to do the same, this would be handy and would lead us directly to the problematic translation/language.

It simply notices the failure condition and reports the problem meaningfully to standard out.

Cheers,
Kyle

On 01/06/2011 04:44 PM, Kevin Godby wrote:
Hey, Kyle.

On Thu, Jan 6, 2011 at 12:47 PM, Kyle Nitzsche
<kyle.nitzsche@xxxxxxxxxxxxx>  wrote:
The Solution:
  * I've written a python script that checks each valid po file (that is in
LINGUAS and is present) for such unescaped special chars in the translations
(the msgstr fields)
  * if it finds any, it reports them to stdout and returns with exit status 1
  * lang_pdfs script now runs the python script first and if errors are
found, doesn't proceed with the localized builds and errors are reported to
stdout.
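
A minimal sketch of what such a check could look like, not the actual script. The names find_bad_msgstrs and check_files are hypothetical, only the underscore is checked, and multi-line msgstr continuation lines are ignored. The output format mirrors the sample run shown in this thread.

```python
import re

# Assumption: an underscore counts as "escaped" when a backslash sits
# directly in front of it in the po file. Other special characters and
# multi-line msgstr continuations are not handled in this sketch.
UNESCAPED_UNDERSCORE = re.compile(r"(?<!\\)_")


def find_bad_msgstrs(po_text):
    """Return the msgstr lines in po_text that contain a bare underscore."""
    bad = []
    for line in po_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("msgstr ") and UNESCAPED_UNDERSCORE.search(stripped):
            bad.append(stripped)
    return bad


def check_files(paths):
    """Print the offending msgstrs per po file; return 1 if any, else 0."""
    status = 0
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            bad = find_bad_msgstrs(handle.read())
        if bad:
            status = 1
            print(f"po file: {path} has Errors:")
            for line in bad:
                print(f"    {line}")
    return status
```

A wrapper such as lang_pdfs could pass the return value of check_files to sys.exit and skip the localized builds when it is non-zero.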

So this enables us to identify errors automatically before localized
building.

I'm afraid that the errors your script reports will consist almost
entirely of false positives.

The reason you encountered a naked underscore in your .po files was
because you created them by hand.  Our current .tex files do contain
underscores (appropriately escaped, otherwise you wouldn't be able to
build the PDF) and po4a escapes the backslashes when it generates the
.pot file.  As long as the backslashes aren't removed by the
translators, everything will work out okay.  (And in the instances
they are removed, the problem is usually fairly obvious and can be
corrected quickly.)

Here are a couple reasons why your script won't do what you intend it to do:

1. As noted parenthetically in the style guide, those characters are
special because they're part of LaTeX's syntax.  So if you have a
string like this:

   \emph{This text has been emphasized.}

and you blindly escape the backslash and braces:

   \\emph\{This text has been emphasized.\}

you'll have changed the meaning of the text.  (The \\ will insert a
line break, 'emph' will appear in the text, the braces will appear in
the text, and the text 'This text...' will not, in fact, be
emphasized.)
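
To make the point concrete, here is a small sketch of a blind escaper. The function name blind_escape is made up for illustration and is not part of any script discussed in this thread.

```python
def blind_escape(text):
    """Naively put a backslash in front of every backslash, brace and underscore."""
    return "".join("\\" + ch if ch in "\\{}_" else ch for ch in text)


original = r"\emph{This text has been emphasized.}"
# Prints: \\emph\{This text has been emphasized.\}
print(blind_escape(original))
```

The result is still valid LaTeX, so the build succeeds, but the command has been turned into literal text.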

2. The meaning of characters in TeX can change based on the context.
While 'this_is_a_long_variable_name' would need the underscores
escaped in normal text, they don't need to be escaped in this case:

   \lstinline|this_is_a_long_variable_name|

or in these cases:

   \begin{verbatim}
   this_is_a_long_variable_name
   \end{verbatim}

   \begin{lstlisting}
   this_is_a_long_variable_name
   \end{lstlisting}

   $a_1, a_2, \ldots, a_n$
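
As a rough illustration of why a context-free check misfires, the sketch below compares a naive underscore check with one that first strips the contexts listed above. The names and the list of safe contexts are assumptions for illustration only. Regular expressions cannot model TeX fully, so this is not a complete solution.

```python
import re

# Contexts from the examples above in which a bare underscore is legal.
# Assumption: this list is illustrative, not a complete model of TeX.
SAFE_CONTEXTS = re.compile(
    r"\\begin\{(verbatim|lstlisting)\}.*?\\end\{\1\}"  # verbatim-like environments
    r"|\\lstinline\|[^|]*\|"                            # inline listing with | delimiters
    r"|\$[^$]*\$",                                      # inline math
    re.DOTALL,
)
BARE_UNDERSCORE = re.compile(r"(?<!\\)_")


def naive_check(tex):
    """Flag any underscore that has no backslash directly before it."""
    return bool(BARE_UNDERSCORE.search(tex))


def context_aware_check(tex):
    """Run the same check after removing the known safe contexts."""
    return bool(BARE_UNDERSCORE.search(SAFE_CONTEXTS.sub("", tex)))
```

With these definitions, naive_check flags \lstinline|this_is_a_long_variable_name| as an error while context_aware_check does not. Both still flag a string like "pt_BR Intro to LP".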

I think, in the end, a better solution would be to make a list of
common error messages and add them to the style guide to ease
troubleshooting.

If you're itching to find ways to help with the translation side of
things, could you have a look at po4a?  We'll need to tell po4a about
some of the environments and commands we're using so that it doesn't
mangle their contents/arguments so much.  For instance, the
lstlisting environment should be treated as a verbatim environment.
Its newlines should be left as is.  (po4a has a nasty habit of rewrapping
all the lines, which can wreak havoc with code listings and the like.)

--Kevin




