ubuntu-developer-manual team mailing list archive

Re: language issue: "fr_CA" (French Canada) doesn't work

 

On 01/06/2011 06:32 PM, Kevin Godby wrote:
On Thu, Jan 6, 2011 at 5:21 PM, Kyle Nitzsche <kyle.nitzsche@xxxxxxxxxxxxx> wrote:
So, to remove false positives, I wonder whether it is only some of the
special characters that could ever appear unescaped in a msgstr.

For example, maybe curly braces can and do, but underscores never should?

If so, I can just remove curly braces from the list of characters it checks
for.

Do you know?

There's absolutely no way to remove all the false positives without
parsing all of the TeX code (including the document class, packages,
etc.) because the way that TeX parses those characters can be changed
as it goes along.

An example of this that I mentioned above is the lstlisting
environment and the \lstinline command (both from the listings
package).  The contents of those are parsed in such a way that the
underscore should *not* be escaped.

Another example is math mode:

   The Pythagorean theorem may be stated as $c = \sqrt{a^{2} + b^{2}}$.

In that line of text, the only thing that should be escaped in the .po
file is the backslash—and that's only because the po parsers require
it.  The $, {, }, and ^ should not be escaped.

I think that it may be helpful if you inverted your logic.  Instead of
reading a .tex file and assuming that those special characters should
be escaped except under given circumstances, you should instead assume
that those characters are part of LaTeX's syntax and *only need to be
escaped when the characters are to appear as they are in the body
text*.

In other words, only escape those characters if you want them to
appear verbatim in the normal body text of the book.

In practice (with this book), I think it will be rare that those
characters need to be escaped because they will be handled by special
environments and commands that know how to parse them properly.
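As a rough illustration of the inverted logic, the following Python sketch escapes LaTeX special characters only when it is called explicitly on text meant to appear verbatim in the body. The function name and the mapping are assumptions for illustration; \textbackslash, \textasciicircum and \textasciitilde are the standard LaTeX text commands for those three characters.

```python
import re

# Hypothetical mapping: LaTeX special character -> body-text replacement.
LATEX_BODY_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "#": r"\#",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
}

# One character class, so every character is replaced in a single pass.
_SPECIAL = re.compile("[" + re.escape("".join(LATEX_BODY_ESCAPES)) + "]")


def escape_verbatim_text(text):
    """Escape text that should appear literally in normal body text.

    Do NOT call this on LaTeX markup, math mode or listings content;
    those rely on the special characters keeping their meaning.
    """
    return _SPECIAL.sub(lambda m: LATEX_BODY_ESCAPES[m.group(0)], text)


print(escape_verbatim_text("50% of my_file"))
```

Nothing is escaped by default; the caller decides which strings are plain body text.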

--Kevin

OK. Kevin, it seems the Python script (thing o' beauty though it is ;) should be dropped, and that we should deal with unescaped *problematic* special characters in msgstr fields in another way. Do you agree?

Another question: how ARE we going to ensure that msgids that should NOT be translated are left untranslated? It seems that there are quite a few of them...

In normal code, the developer adds a comment just before the string, and that comment is passed through the toolchain and displayed to the translator, whether s/he's reviewing a .po file or using LP.
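For reference, this is roughly what that looks like in ordinary Python code using gettext. It is only a sketch: the function, the message and the "Translators:" tag are illustrative, and whether the book's LaTeX toolchain offers an equivalent is not established in this thread.

```python
import gettext

# NullTranslations returns the msgid unchanged; a real program would
# load a catalog with gettext.translation() instead.
_ = gettext.NullTranslations().gettext


def prompt_label():
    # Translators: "sudo" is a command name; do not translate it.
    return _("Run sudo to gain administrative privileges.")


print(prompt_label())
```

When the template is extracted with xgettext --add-comments=Translators, the comment is copied into the .pot file as a "#." line directly above the msgid, which is where translators see it.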


Thanks for your detailed responses, by the way!
Cheers,
Kyle


