Hey, Kyle.
On Thu, Jan 6, 2011 at 12:47 PM, Kyle Nitzsche
<kyle.nitzsche@xxxxxxxxxxxxx> wrote:
> The Solution:
> * I've written a python script that checks each valid po file (that is in
> LINGUAS and is present) for such unescaped special chars in the translations
> (the msgstr fields)
> * if it finds any, it reports them to stdout and returns with exit status 1
> * lang_pdfs script now runs the python script first and if errors are
> found, doesn't proceed with the localized builds and errors are reported to
> stdout.
> So this enables us to identify errors automatically before localized
> building.
I'm afraid that the errors your script reports will consist almost
entirely of false positives.
You encountered a naked underscore in your .po files because you
created them by hand. Our current .tex files do contain underscores
(appropriately escaped, otherwise you wouldn't be able to build the
PDF), and po4a escapes the backslashes when it generates the .pot
file. As long as the translators don't remove the backslashes,
everything will work out okay. (And in the instances where they are
removed, the problem is usually fairly obvious and can be corrected
quickly.)
Here are a couple of reasons why your script won't do what you intend it to do:
1. As noted parenthetically in the style guide, those characters are
special because they're part of LaTeX's syntax. So if you have a
string like this:
\emph{This text has been emphasized.}
and you blindly escape the backslash and braces:
\\emph\{This text has been emphasized.\}
you'll have changed the meaning of the text. (The \\ will insert a
line break, 'emph' will appear in the text, the braces will appear in
the text, and the text 'This text...' will not, in fact, be
emphasized.)
2. The meaning of characters in TeX can change based on the context.
While 'this_is_a_long_variable_name' would need the underscores
escaped in normal text, they don't need to be escaped in this case:
\lstinline|this_is_a_long_variable_name|
or in these cases:
\begin{verbatim}
this_is_a_long_variable_name
\end{verbatim}
\begin{lstlisting}
this_is_a_long_variable_name
\end{lstlisting}
$a_1, a_2, \ldots, a_n$
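
To make the two points above concrete, here is a minimal Python sketch. It is not the script under discussion: naive_escape, PROTECTED and bare_underscores are hypothetical names invented for illustration, and the list of protected contexts is an assumption taken only from the examples above. The first function shows how blind escaping rewrites markup; the second shows roughly what a checker would have to skip before it could flag an underscore.

```python
import re

# Hypothetical naive fixer: backslash-escapes every special character,
# with no idea whether the character is text or LaTeX syntax.
SPECIALS = re.compile(r'([\\{}_$&%#])')

def naive_escape(text):
    return SPECIALS.sub(r'\\\1', text)

# Spans in which a bare underscore is legitimate. This list is an
# assumption drawn from the examples in this message, not a model of TeX.
PROTECTED = re.compile(
    r'\\begin\{(verbatim|lstlisting)\}.*?\\end\{\1\}'  # verbatim-like environments
    r'|\\lstinline(.).*?\2'                            # \lstinline with any delimiter
    r'|\$[^$]*\$',                                     # inline math
    re.DOTALL,
)

def bare_underscores(text):
    """Return offsets of unescaped underscores outside protected spans."""
    # Blank out protected spans with same-length filler so offsets stay valid.
    masked = PROTECTED.sub(lambda m: ' ' * len(m.group(0)), text)
    return [m.start() for m in re.finditer(r'(?<!\\)_', masked)]

# Point 1: blind escaping turns the command into literal text.
print(naive_escape(r'\emph{This text has been emphasized.}'))
# \\emph\{This text has been emphasized.\}

# Point 2: the same underscore is an error in one context and fine in others.
print(bare_underscores('this_is_a_long_variable_name'))               # flagged
print(bare_underscores(r'\lstinline|this_is_a_long_variable_name|'))  # not flagged
print(bare_underscores(r'$a_1, a_2, \ldots, a_n$'))                   # not flagged
```

Even this is only a heuristic. It does not handle \lstinline with optional arguments, display math, escaped dollar signs, or macros that change category codes, which is why a regex-based check is likely to keep producing false positives or false negatives.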
I think, in the end, a better solution would be to make a list of
common error messages and add them to the style guide to ease
troubleshooting.
If you're itching to find ways to help with the translation side of
things, could you have a look at po4a? We'll need to tell po4a about
some of the environments and commands we're using so that it doesn't
mangle their contents/arguments so much. For instance, the
lstlisting environment should be treated as a verbatim environment: its
newlines should be left as is. (po4a has a nasty habit of rewrapping
all the lines, which can wreak havoc with code listings and the like.)
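
As a rough illustration of why rewrapping hurts code listings, the sketch below runs a small listing through Python's textwrap.fill. This is only a stand-in for a generic paragraph rewrapper; it does not reproduce po4a's own wrapping logic, and the sample listing is made up.

```python
import textwrap

# A made-up code listing where newlines and indentation carry meaning.
listing = (
    "def greet(name):\n"
    "    if name:\n"
    "        print('hello', name)\n"
)

# Treat the listing as an ordinary paragraph and rewrap it.
# Newlines become spaces and the line structure is lost.
rewrapped = textwrap.fill(listing, width=40)

print(listing)
print(rewrapped)
```

The rewrapped text no longer has one statement per line, so it would neither typeset as intended nor run if copied out of the PDF.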
--Kevin