
openstack team mailing list archive

Re: Proposal for manuals translation process

 

At present I am simply trying out various tools to see what it takes to integrate with them, and how they would work for both individual projects and OpenStack as a whole. Nothing I’m doing is official yet, so don’t get too worked up. Several translation platforms were suggested at the summit, and narrowing them down isn’t a trivial task.

I’ll report back next week with more information. In general I look forward to kick-starting a more concerted translation process for OpenStack.

All the best,


- Gabriel Hurley

From: openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx [mailto:openstack-bounces+gabriel.hurley=nebula.com@xxxxxxxxxxxxxxxxxxx] On Behalf Of Ying Chun Guo
Sent: Saturday, April 28, 2012 2:35 AM
To: Anne Gentle
Cc: annegentle@xxxxxxxxxxxxxxxxxx; openstack@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Openstack] Proposal for manuals translation process


Hi, Anne

Thank you for your comments. I'm glad to know that you are working toward a larger "goal". I didn't know
that Launchpad is broken with code strings now. What did you mean when you said "Launchpad to be
broken with code strings now"? I took a look at Horizon in Transifex; Gabriel Hurley is the coordinator.
May I ask why Dashboard turned to Transifex rather than Launchpad?

Pootle is a good open source project to look into. It supports a very powerful "Terminology matching" feature,
which can match and list the relevant terminology in real time. Its translation memory feature is less powerful:
suggested translations from a translation memory must be generated before translation starts, while Transifex can list
suggested translations in real time. I will add a third column to our wiki page.
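
To make the idea of translation memory suggestions concrete, here is a minimal Python sketch that proposes stored translations whose source text is similar to a new string. It uses the standard library's difflib and is not taken from Pootle, Transifex, or Launchpad; the function name suggest_translations, the cutoff value, and the sample entries are illustrative assumptions.

```python
import difflib


def suggest_translations(source, memory, limit=3, cutoff=0.6):
    """Return (known_source, translation) pairs whose source text resembles `source`.

    `memory` maps already-translated source strings to their translations.
    Results are ordered from most to least similar.
    """
    matches = difflib.get_close_matches(source, list(memory), n=limit, cutoff=cutoff)
    return [(match, memory[match]) for match in matches]


# Illustrative translation memory (hypothetical entries, not from any real project).
memory = {
    "Create a new instance": "创建新实例",
    "Delete the instance": "删除实例",
    "List all volumes": "列出所有卷",
}

# A new string that is close to, but not identical with, a stored one.
suggestions = suggest_translations("Create new instance", memory)
for known_source, translation in suggestions:
    print(known_source, "->", translation)
```

An exact-match memory would return nothing for this input, which is the practical difference between exact and similar matching.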

There is one major issue with Pootle that we need to consider. Although Pootle has an official server that hosts the translation
of the Pootle UI and related projects, its policy for selecting projects to host on that server is not finalised yet,
and I cannot find a way to register our projects there. So we might need to host our
own Pootle server if we use it. Are we able to host our own translation server? I'm not sure whether Pootle
supports OpenID. If we are going to host our own Pootle server, maybe we can enhance it to enable
OpenID authentication.

Launchpad, Transifex, and even Pootle each manage the translation review process on their own. I think the translation quality
review should be done in the translation tool. Gerrit and Jenkins will play an important role in the Generating step.
I'm not familiar with Gerrit and Jenkins, so if my description is wrong, feel free to correct me. After the fourth step, "Converging",
DocBooks in different languages will be generated and submitted to Gerrit for review. Jenkins will run the Maven build and upload the
generated sources to the server. The reviewer (the translation coordinator) will accept the changes.

When I proposed the five steps, I was only thinking of manuals translation. Code strings may be a little different. Is there any globalization
testing in Jenkins? If not, we may need to add globalization tests.

Regards
Daisy

annegentle@xxxxxxxxxxxxxxxxxx wrote on 04/28/2012 03:21:07 AM:

> From: Anne Gentle <anne@xxxxxxxxxxxxx>
> Sent by: annegentle@xxxxxxxxxxxxxxxxxx
> Date: 04/28/2012 03:21 AM
> To: Ying Chun Guo/China/IBM@IBMCN
> Cc: openstack@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Openstack] Proposal for manuals translation process
>
> Hi Daisy,
>
> Thanks so much for this detailed proposal. I'd like you to put it on
> the OpenStack wiki, at http://wiki.openstack.org/Translations.
>
> My first read-through and discussion with the CI team brings up a
> few comments:
> - Whatever we do for docs, we should also do for code strings. So
> unfortunately the scope for the "Goal" probably cannot be so narrow.
> We know Launchpad to be broken with code strings now; that data
> point should be reflected in this point-in-time analysis.
>
> - Dashboard uses Transifex now (while the other projects
> unsuccessfully use Launchpad). Tres Henry, can you comment on the
> number of translators of Dashboard strings you have on the Transifex
> side already?
>
> - Not that I want analysis paralysis, but we may need to add a
> third column for a crowd-sourced translation option like Pootle that
> is familiar to open-source translators. Also, the lack of a
> translation memory/dictionary (and having to hold such a valuable
> asset in a wiki page) is troubling. Can we also analyze an option
> that offers a translation dictionary? So much re-use would be
> available to all the projects.
>
> - There seem to be assumptions that Jenkins and Gerrit will "just
> work", without much description of the role those two crucial tools
> play. Can you further describe the workflow for those in the
> Slicing, Uploading, Downloading, Converging, and Generating steps?
>
> I appreciate all the hard work I _know_ went into this proposal.
> Let's get it on the wiki, discuss more, and keep adding details. I'd
> like to make this a blueprint, could be for openstack-manuals, could
> be for horizon, I don't know yet. Thanks for stepping up and
> embracing our international community's needs!
>
> Thanks,
> Anne
>

> On Fri, Apr 27, 2012 at 3:45 AM, Ying Chun Guo <guoyingc@xxxxxxxxxx> wrote:
> Hi, all
>
> During the "I18N in OpenStack" discussion at the design summit, it was
> mentioned that the documents need I18N as well. I have also noticed requests
> from users in China for a Chinese version of the manuals. But unlike Gettext
> strings in the code, there is no process for DocBook translation
> yet. Translators who want to help have to load a
> DocBook into a tool and translate a copy, which is then
> saved as a new file. This traditional translation model is not
> good for collaboration. Open source translation usually depends
> on volunteers, so it is better to use a crowd translation model, which
> lets many translators work on the same job. With the
> Launchpad web UI for Gettext string translation, for example, anyone can
> jump in at any time and contribute to any part of the translatable content.
>
> In order to facilitate the manuals translation, I investigated
> several translation websites and several open source projects. I
> composed this proposal. Now it's open for suggestions and comments.
>
> Goal
> ------------
> A process for manuals translation
>
> Background
> --------------
> OpenStack Manuals are in DocBook format. The source is on GitHub:
> http://github.com/openstack/openstack-manuals
> Launchpad and Transifex are free web-based tools used for crowd
> translation. Both of them provide a simple web interface in which
> non-technical people can help with translation. Neither supports the
> DocBook format, but both support the popular GNU Gettext file formats (PO
> Template or PO).
>
> Translation Process
> -------------------
> In order to translate the OpenStack Manuals, which are in DocBook
> format, into multiple languages, we can slice the documents into short
> statements, then use a web-based translation management tool to
> manage the translation process, and finally converge the translated
> content into new copies of the DocBooks.
>
> Here are the five steps of the translation process:
> Step #1 Slicing - extract translatable content from DocBooks and
> generate Gettext compatible POT files (PO Template or PO);
> Step #2 Uploading - upload the POT (or PO) files to a web based
> translation management tool;
> Step #3 Downloading - download PO (or MO) files from the web tool
> after translation and review;
> Step #4 Converging - converge the translated content into new
> copies of the DocBooks, creating DocBooks in multiple languages;
> Step #5 Generating - generate HTML/PDF in multiple languages from
> the DocBooks in multiple languages.
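
As an illustration of what Step #1 (Slicing) involves, the following Python sketch extracts text from a small DocBook fragment and emits Gettext-style template entries. It is a simplification under stated assumptions and is not xml2po: it only looks at title and para elements, assumes no XML namespace, writes no POT header, and the names slice_docbook, to_pot, and SAMPLE_DOCBOOK are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment used only for this illustration.
SAMPLE_DOCBOOK = """<chapter>
  <title>Installing Compute</title>
  <para>Install the packages on every node.</para>
  <para>Restart the service when the installation finishes.</para>
</chapter>"""

# Assumption: a real slicer would cover many more DocBook elements.
TRANSLATABLE_TAGS = {"title", "para"}


def escape_po(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def slice_docbook(xml_text):
    """Return the whitespace-normalized text of each translatable element.

    Messages are kept in document order and duplicates are dropped.
    """
    root = ET.fromstring(xml_text)
    seen = set()
    messages = []
    for element in root.iter():
        if element.tag in TRANSLATABLE_TAGS:
            text = " ".join("".join(element.itertext()).split())
            if text and text not in seen:
                seen.add(text)
                messages.append(text)
    return messages


def to_pot(messages):
    """Render messages as msgid/msgstr pairs with empty translations."""
    entries = ['msgid "{}"\nmsgstr ""\n'.format(escape_po(m)) for m in messages]
    return "\n".join(entries)


messages = slice_docbook(SAMPLE_DOCBOOK)
pot_text = to_pot(messages)
print(pot_text)
```

Each msgid becomes one unit that a translator can work on independently in the web tool, which is what makes the crowd model possible.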
>
> The picture in the attachment describes these steps.
> (See attached file: DocBook translation process.png)
>
> Comparison of Launchpad and Transifex
> -------------------
> Launchpad (https://launchpad.net/) and Transifex (https://www.transifex.net/)
> are similar web-based tools used for crowd translation. The goal
> of the comparison is to find the most appropriate tool for this
> scenario. The comparison is between Launchpad and the free version of
> Transifex for open source projects. (Refer to https://www.transifex.net/plans/
> for details of the “Transifex free version for open sources”.)
>
> After considering the requirements for manuals translation, the
> following aspects were taken into consideration:
> *Supported format
> *DocBook slicing support
> *Converging support
> *Source uploading method
> *Output downloading method
> *Translation Memory support
> *Translation history support
> *Change management
> *Translation Dictionary
> Refer to Table 1 for detailed information on the comparison.
>
> Another important measure to compare is the workload. Having the
> five steps in the process execute automatically as much as possible
> will decrease the workload of translation coordinators.
> Refer to Table 2 for details of the workload comparison when using
> Launchpad or Transifex for DocBook translation.
>
> Here are the conclusions from the comparison:
> (1) The workload using Transifex is similar to the workload using Launchpad.
> (2) The advantages of Launchpad are:
> * It leverages the same user IDs and user groups as the developers, users,
> and translators of Gettext strings.
> * It leverages the same contribution scoring method, "Karma", that is used
> for fixing bugs, answering questions, and translating Gettext strings.
> (3) The advantage of Transifex is better translation memory support.
> The disadvantage of Transifex is its separate user registration
> and user interface. Both the translators and the coordinators need
> to register on a new website and get familiar with a new user
> interface before translating.
>
> Based on this analysis, I think using Launchpad for the manuals
> translation is a good choice.
>
> Other considerations
> -------------------
> *Translation Dictionary
> Translation Dictionary here means terminology translation. It is
> very helpful for ensuring translation quality. Unfortunately, neither
> Launchpad nor Transifex supports a Translation Dictionary. I
> suggest using wiki pages to document the terminology translations
> for translators' reference.
> Here is a sample wiki page for Eclipse globalization:
> http://wiki.eclipse.org/French_Glossary.
>
> *Change Management
> Launchpad and Transifex each support synchronizing old PO files with
> new PO files in their own way. They compare the new PO file with the
> existing one and handle the changes automatically. But new PO files
> are not generated automatically after DocBooks change;
> translation coordinators need to generate them by running a
> Python program manually.
> I suggest developing a program in the future to monitor updates
> to the manuals GitHub repository. When a DocBook is updated, a new PO
> file would be generated and synchronized with the old one on the
> Launchpad server.
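
The synchronization of old and new PO content described above can be sketched in a few lines of Python. This is only an illustration of the general idea (carry over translations whose source string is unchanged, flag the rest), not the actual logic of Launchpad or Transifex; the function name merge_catalog and the sample strings are assumptions. For real PO files, GNU gettext's msgmerge tool performs this kind of merge.

```python
def merge_catalog(old_translations, new_msgids):
    """Carry existing translations over to a freshly generated template.

    old_translations: dict mapping source strings to translations.
    new_msgids: source strings extracted from the updated documents, in order.
    Returns (merged, untranslated, obsolete).
    """
    merged = {}
    untranslated = []
    for msgid in new_msgids:
        msgstr = old_translations.get(msgid, "")
        merged[msgid] = msgstr
        if not msgstr:
            # New or changed source text: a translator has to look at it.
            untranslated.append(msgid)
    # Strings that disappeared from the documents are no longer needed.
    obsolete = sorted(set(old_translations) - set(new_msgids))
    return merged, untranslated, obsolete


# Hypothetical catalog before and after a DocBook update.
old = {
    "Install the packages.": "安装软件包。",
    "Reboot the node.": "重启节点。",
}
new_ids = ["Install the packages.", "Restart the service."]

merged, untranslated, obsolete = merge_catalog(old, new_ids)
print("needs translation:", untranslated)
print("obsolete:", obsolete)
```

A monitoring program of the kind suggested above could run the slicing step on each repository update and then apply a merge like this before uploading.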
>
> *Machine translation
> Is it necessary to include machine translation? Machine translation
> can be run before human review, so translators don't
> need to translate from scratch; they can review the machine
> translation results and correct them.
> But after investigation, I found that the quality of the free machine
> translation services that expose an API is not very good. I doubt
> whether poor-quality machine translation is helpful.
> Anyway, if most community members want to include machine
> translation, it is possible to extend the slicing program to
> generate a PO file prefilled with machine translation results.
>
> Reference
> -------------------
> Table 1 - Comparison of Launchpad and Transifex
>
> Supported format
>   Launchpad: pot file (.pot), po file (.po)
>   Transifex: android string resources (.xml), po file (.po),
>     html (.html), WIKI file (.wiki), etc.
>     (Note: DocBook is not a supported file format; OpenStack Wiki
>     format is not a supported wiki format.)
>
> DocBook slicing support
>   Launchpad: No
>   Transifex: No
>
> Converging support
>   Launchpad: No
>   Transifex: No
>
> Source uploading method
>   Launchpad: two methods:
>     a> Automatic template imports from a Bazaar branch;
>     b> Manually upload a template (or an archive) through
>        Launchpad's web interface.
>   Transifex: two methods:
>     a> Use the command-line tool “Transifex Client” to synchronize
>        the server with the local repository (local folder) by
>        typing several commands;
>     b> Manually upload a source translation file through the web
>        interface.
>
> Output downloading method
>   Launchpad: two methods:
>     a> Automatically save output files to a Bazaar branch;
>     b> Manually download output files through the web interface.
>   Transifex: two methods:
>     a> Use “Transifex Client” to download the latest translations
>        from the server by typing one command;
>     b> Manually download through the web interface.
>
> Translation Memory support
>   Launchpad: Exactly matching translation items in other projects
>     can be listed as a reference.
>   Transifex: Similar translation items are listed as a reference.
>     Translation memory can be shared between two or more projects.
>
> Translation history support
>   Launchpad: Yes
>   Transifex: Yes
>
> Change management
>   Launchpad: Launchpad automatically updates its data every time
>     you push a new revision to the Bazaar branch.
>   Transifex: When you push local updates to the server, Transifex
>     overwrites the existing source strings and translations with
>     the updated version.
>     (Note: This may lead to loss of translations, so users need to
>     make sure the local repository contains the latest translation
>     results from the server.)
>
> Translation Dictionary
>   Launchpad: No
>   Transifex: No
>
> Table 2 - Workload comparison when using Launchpad or Transifex for
> DocBook translation
>
> Step 1: Slicing
>   Launchpad: A Python program [1] can be used to slice all the
>     DocBooks together in one command.
>   Transifex: Same as Launchpad.
>
> Step 2: Uploading
>   Launchpad: If the source code is synchronized with Bazaar, the
>     uploading can be handled automatically by Launchpad.
>   Transifex: Use “Transifex Client” to upload resources to the
>     Transifex server from the local repository (local folder) by
>     typing several commands.
>
> Step 3: Downloading
>   Launchpad: Launchpad can commit daily snapshots of the
>     translations to a Bazaar branch in a specific folder.
>   Transifex: Use “Transifex Client” to download the latest
>     translations from the server by typing one command.
>
> Step 4: Converging
>   Launchpad: A Python program [2] can be used to converge all the
>     PO files back to DocBooks.
>   Transifex: Same as Launchpad.
>
> Step 5: Generating
>   Launchpad: A Maven command can be used to generate HTML/PDF from
>     the DocBooks.
>   Transifex: Same as Launchpad.
>
> [1] The Python program can be written based on “xml2po” to slice all
> DocBooks of the manuals project into translatable strings in batch.
> “xml2po” is an existing Python program in the GNOME gnome-doc-utils
> package which extracts translatable content from free-form XML
> documents and outputs gettext-compatible POT files.
> [2] The Python program can be written based on “xml2po” to converge
> the translated strings back into copies of the DocBooks in batch.
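
For readers unfamiliar with the converging step in [2], here is a minimal Python sketch that writes translated strings back into a copy of a DocBook fragment. It is not xml2po and makes simplifying assumptions: only title and para elements without inline markup are handled, untranslated strings are left in the source language, and the names converge, SOURCE_DOCBOOK, and the sample translation are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: a real tool would handle many more elements and inline markup.
TRANSLATABLE_TAGS = {"title", "para"}

# Hypothetical source document and translation catalog for illustration.
SOURCE_DOCBOOK = (
    "<chapter>"
    "<title>Installing Compute</title>"
    "<para>Install the packages on every node.</para>"
    "</chapter>"
)
translations = {"Installing Compute": "安装 Compute"}


def converge(xml_text, catalog):
    """Return a translated copy of the document as a string.

    Elements whose normalized text has a non-empty entry in `catalog`
    get the translation; everything else is left unchanged.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        is_leaf = len(element) == 0  # skip elements with child markup
        if element.tag in TRANSLATABLE_TAGS and is_leaf and element.text:
            key = " ".join(element.text.split())
            translated = catalog.get(key)
            if translated:
                element.text = translated
    return ET.tostring(root, encoding="unicode")


translated_docbook = converge(SOURCE_DOCBOOK, translations)
print(translated_docbook)
```

Leaving untranslated strings in the source language keeps the generated manual complete even when a translation is only partly finished.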
>
>
> Regards
> Daisy Guo
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx<mailto:openstack@xxxxxxxxxxxxxxxxxxx>
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
