Re: Proposal for manuals translation process

 

Hi Daisy,

Thanks so much for this detailed proposal. I'd like you to put it on the
OpenStack wiki, at http://wiki.openstack.org/Translations.

My first read-through and discussion with the CI team brings up a few
comments:
- Whatever we do for docs, we should also do for code strings, so
unfortunately the scope of the "Goal" probably cannot be so narrow. We
know Launchpad is currently broken for code strings; that data point should
be reflected in this point-in-time analysis.

- Dashboard uses Transifex now (while the other projects unsuccessfully use
Launchpad). Tres Henry, can you comment on the number of translators of
Dashboard strings you have on the Transifex side already?

- Not that I want analysis paralysis, but we may need to add a third
column for a crowd-sourced translation option like Pootle, which is familiar
to open-source translators. Also, the lack of a translation
memory/dictionary (and having to hold such a valuable asset in a wiki page)
is troubling. Can we also analyze an option that offers a translation
dictionary? So much re-use would be available to all the projects.

- There seem to be assumptions that Jenkins and Gerrit will "just work",
without much description of the role those two crucial tools play. Can you
further describe the workflow for those in the Slicing, Uploading,
Downloading, Converging, and Generating steps?

I appreciate all the hard work I _know_ went into this proposal. Let's get
it on the wiki, discuss more, and keep adding details. I'd like to make
this a blueprint; it could be for openstack-manuals or for horizon, I
don't know yet. Thanks for stepping up and embracing our international
community's needs!

Thanks,
Anne





On Fri, Apr 27, 2012 at 3:45 AM, Ying Chun Guo <guoyingc@xxxxxxxxxx> wrote:

> Hi, all
>
> During the "I18N in OpenStack" discussion at the design summit, it was
> mentioned that the documents need I18N. I have also noticed requests from
> users in China for a Chinese version of the manuals. But unlike the Gettext
> strings in the code, there is no process for DocBook translation yet.
> Translators who want to help have to load a DocBook file into a tool and
> translate a copy, which is saved as a new file. This traditional
> translation model is not good for collaboration. Open source translation
> usually depends on volunteers, so it is better to use the crowd
> translation model, which enables many translators to work on the
> same job. As with the Launchpad web UI for Gettext string translation,
> anyone can jump in at any time and contribute to any part of the
> translatable content.
>
> To facilitate manuals translation, I investigated several translation
> websites and several open source projects, and composed this
> proposal. It is now open for suggestions and comments.
>
> *Goal*
> *------------*
> A process for manuals translation
>
> *Background*
> *--------------*
> OpenStack Manuals are in DocBook format. The source is on GitHub:
> http://github.com/openstack/openstack-manuals
> Launchpad and Transifex are free web-based tools for crowd
> translation. Both provide a simple web interface through which
> non-technical people can help with translation. Neither supports the
> DocBook format, but both support the popular GNU Gettext file formats
> (PO Template or PO).
>
> *Translation Process*
> *-------------------*
> To translate the OpenStack Manuals, which are in DocBook format, into
> multiple languages, we can slice the documents into short statements,
> use a web-based translation management tool to manage the translation
> process, and finally converge the translated content into a new copy of
> each DocBook.
>
> Here are the five steps of the translation process:
> Step #1 Slicing - extract translatable content from the DocBooks and
> generate Gettext-compatible files (PO Template or PO);
> Step #2 Uploading - upload the POT (or PO) files to a web-based
> translation management tool;
> Step #3 Downloading - download the PO (or MO) files from the web tool
> after translation and review;
> Step #4 Converging - converge the translated content into new copies of
> the DocBooks, creating DocBooks in multiple languages;
> Step #5 Generating - generate HTML/PDF in multiple languages from the
> DocBooks in multiple languages.
>
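To make the Slicing step concrete, here is a minimal sketch in Python. It is not xml2po and not the proposed program: it assumes that only `title` and `para` elements are translatable (the `TRANSLATABLE_TAGS` set is a hypothetical choice) and it writes a POT body without the usual header entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for xml2po: which DocBook elements count as translatable
TRANSLATABLE_TAGS = {"title", "para"}


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://docbook.org/ns/docbook}'."""
    return tag.rsplit("}", 1)[-1]


def slice_docbook(xml_text):
    """Return the unique translatable strings of a DocBook document, in order."""
    root = ET.fromstring(xml_text)
    seen = []
    for elem in root.iter():
        if local_name(elem.tag) in TRANSLATABLE_TAGS:
            # Collapse whitespace so reflowed source text maps to one msgid
            text = " ".join("".join(elem.itertext()).split())
            if text and text not in seen:
                seen.append(text)
    return seen


def escape_po(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_pot(msgids):
    """Render msgids as a minimal POT body (header entry omitted for brevity)."""
    entries = ['msgid "%s"\nmsgstr ""\n' % escape_po(m) for m in msgids]
    return "\n".join(entries)


SAMPLE = """<chapter xmlns="http://docbook.org/ns/docbook">
  <title>Installing Compute</title>
  <para>Install the packages on
    each node.</para>
</chapter>"""

if __name__ == "__main__":
    print(to_pot(slice_docbook(SAMPLE)))
```

A real slicer would also have to preserve inline markup and skip non-translatable content such as program listings, which this sketch ignores.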
> The picture in the attachment describes these steps.
> *(See attached file: DocBook translation process.png)*
>
> *Comparison of Launchpad and Transifex*
> *-------------------*
> Launchpad (https://launchpad.net/) and Transifex
> (https://www.transifex.net/) are similar web-based tools for crowd
> translation. The goal of the comparison is to find the most appropriate
> tool for this scenario. The comparison is between Launchpad and the free
> Transifex version for open source projects. (Refer to
> https://www.transifex.net/plans/ for details of the “Transifex free
> version for open sources”.)
>
> After considering the requirements for manuals translation, I took the
> following aspects into consideration:
> * Supported format
> * DocBook slicing support
> * Converging support
> * Source uploading method
> * Output downloading method
> * Translation Memory support
> * Translation history support
> * Change management
> * Translation Dictionary
> Refer to Table 1 for the details of the comparison.
>
> Another important measure to compare is the workload. Having the five
> steps of the process run automatically as much as possible will
> decrease the workload of the translation coordinators.
> Refer to Table 2 for the details of the workload comparison when using
> Launchpad or Transifex for DocBook translation.
>
> Here are the conclusions of the comparison:
> (1) The workload using Transifex is similar to that using Launchpad.
> (2) The advantages of Launchpad are:
> * It reuses the same user IDs and user groups that developers, users, and
> translators of Gettext strings already have.
> * It reuses the same contribution measure, "Karma", that is already used
> for fixing bugs, answering questions, and translating Gettext strings.
> (3) The advantage of Transifex is better translation memory support.
> The disadvantage of Transifex is its separate user registration and
> user interface. Both translators and coordinators need to register
> on a new website and get familiar with a new user interface before
> translating.
>
> Based on this analysis, I think using Launchpad for the manuals
> translation is a good choice.
>
> *Other considerations*
> *-------------------*
> *Translation Dictionary
> Translation Dictionary here means terminology translation. It is very
> helpful for ensuring translation quality. Unfortunately, neither Launchpad
> nor Transifex supports a Translation Dictionary. I suggest using wiki
> pages to document the terminology translations for translators' reference.
> Here is a sample wiki page from Eclipse globalization:
> http://wiki.eclipse.org/French_Glossary.
>
> *Change Management
> Launchpad and Transifex each support synchronizing old PO files with new
> PO files in their own way. They compare the new PO file with the existing
> one and handle the changes automatically. But new PO files won't be
> generated automatically after the DocBooks change. Translation
> coordinators need to generate new PO files by running a Python program
> manually.
> I suggest developing a program in the future to monitor updates to the
> manuals GitHub repository. When a DocBook is updated, a new PO file will be
> generated and synchronized with the old one on the Launchpad server.
>
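As an illustration of what synchronizing an old PO file with a newly sliced one involves, here is a minimal sketch, assuming the catalogs are already loaded as plain Python dicts. The function name `merge_catalog` is hypothetical, and only exact matches are carried over; gettext's `msgmerge` tool performs a similar merge on real PO files and can additionally match near-identical strings.

```python
def merge_catalog(old_translations, new_msgids):
    """Carry existing translations over to a freshly sliced template.

    old_translations: dict mapping msgid -> msgstr from the previous PO file
    new_msgids: list of msgids extracted from the updated DocBook
    Returns (merged, obsolete): merged follows the new order, with "" for
    strings that still need translating; obsolete holds dropped entries.
    """
    merged = {}
    for msgid in new_msgids:
        # Reuse the old translation only when the source string is unchanged
        merged[msgid] = old_translations.get(msgid, "")
    obsolete = {k: v for k, v in old_translations.items() if k not in merged}
    return merged, obsolete


if __name__ == "__main__":
    # Placeholder translations, for illustration only
    old = {"Install the packages.": "T1", "Old sentence.": "T2"}
    merged, obsolete = merge_catalog(old, ["Install the packages.", "New sentence."])
    print(merged)
    print(obsolete)
```

Keeping the obsolete entries around, rather than discarding them, leaves room to recover a translation if a sentence is later restored.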
> *Machine translation
> Is it necessary to include machine translation? Machine translation can
> be run before human review, so translators won't need to
> translate from scratch. They can review the results of the machine
> translation and correct them.
> But after investigation, I found that the quality of the free machine
> translation services that export an API is not very good. I doubt whether
> poor-quality machine translation is helpful.
> Anyway, if most of the community members want to include machine
> translation, it is possible to extend the slicing program to generate a
> PO file pre-filled with the results of machine translation.
>
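If machine translation were included, one way to keep its output from being mistaken for reviewed work is gettext's `fuzzy` flag, which marks an entry as needing review. The sketch below assumes a hypothetical `translate` callable standing in for whatever machine translation service might be chosen; no real service API is implied.

```python
def escape_po(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def prefill_entry(msgid, translate):
    """Build one PO entry, pre-filled by a machine translation callable.

    translate is a hypothetical function: source string -> suggestion or "".
    Pre-filled entries get the gettext "fuzzy" flag so they still need review.
    """
    suggestion = translate(msgid) or ""
    lines = []
    if suggestion:
        lines.append("#, fuzzy")
    lines.append('msgid "%s"' % escape_po(msgid))
    lines.append('msgstr "%s"' % escape_po(suggestion))
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in translator backed by a lookup table, for illustration only
    table = {"Hello": "Bonjour"}
    print(prefill_entry("Hello", lambda text: table.get(text, "")))
    print(prefill_entry("Goodbye", lambda text: table.get(text, "")))
```

Entries without a suggestion are left empty and unflagged, so translators can tell untouched strings from machine-suggested ones.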
> *Reference *
> *-------------------*
> Table 1 - Comparison of Launchpad and Transifex
>
>    Supported format
>      Launchpad: pot file (.pot), po file (.po)
>      Transifex: android string resources (.xml), po file (.po),
>                 html (.html), WIKI file (.wiki), etc.
>                 (Note: DocBook is not a supported file format; the
>                 OpenStack Wiki format is not a supported wiki format.)
>
>    DocBook slicing support
>      Launchpad: No
>      Transifex: No
>
>    Converging support
>      Launchpad: No
>      Transifex: No
>
>    Source uploading method
>      Launchpad: Two methods:
>                 a> Automatic template imports from a Bazaar branch;
>                 b> Manually upload a template (or an archive) through
>                 Launchpad's web interface.
>      Transifex: Two methods:
>                 a> Use the command-line tool “Transifex Client” to
>                 synchronize the server with a local repository (local
>                 folder) by typing several commands;
>                 b> Manually upload a source translation file through
>                 the web interface.
>
>    Output downloading method
>      Launchpad: Two methods:
>                 a> Automatically save output files to a Bazaar branch;
>                 b> Manually download output files through the web
>                 interface.
>      Transifex: Two methods:
>                 a> Use “Transifex Client” to download the latest
>                 translations from the server by typing one command;
>                 b> Manually download through the web interface.
>
>    Translation Memory support
>      Launchpad: Exactly matching translation items in other projects
>                 can be listed as a reference.
>      Transifex: Similar translation items are listed as a reference.
>                 Translation memory can be shared between two or more
>                 projects.
>
>    Translation history support
>      Launchpad: Yes
>      Transifex: Yes
>
>    Change management
>      Launchpad: Launchpad automatically updates its data every time you
>                 push a new revision to the Bazaar branch.
>      Transifex: When you push local updates to the server, Transifex
>                 overwrites the existing source strings and translations
>                 with the updated version.
>                 (Note: This may lead to loss of translations, so users
>                 need to make sure the local repository contains the
>                 latest translation results from the server.)
>
>    Translation Dictionary
>      Launchpad: No
>      Transifex: No
>
>
>
>
> Table 2 - Workload comparison when using Launchpad or Transifex for
> DocBook translation
>
>    Step 1: Slicing
>      Launchpad: Python program [1] can be used to slice all the
>                 DocBooks together in one command.
>      Transifex: Same as with Launchpad.
>
>    Step 2: Uploading
>      Launchpad: If the source code is synchronized with Bazaar, the
>                 uploading can be handled automatically by Launchpad.
>      Transifex: Use “Transifex Client” to upload resources to the
>                 Transifex server from a local repository (local
>                 folder) by typing several commands.
>
>    Step 3: Downloading
>      Launchpad: Launchpad can commit daily snapshots of the
>                 translations to a Bazaar branch in a specific folder.
>      Transifex: Use “Transifex Client” to download the latest
>                 translations from the server by typing one command.
>
>    Step 4: Converging
>      Launchpad: Python program [2] can be used to converge all the po
>                 files back to DocBooks.
>      Transifex: Same as with Launchpad.
>
>    Step 5: Generating
>      Launchpad: A Maven command can be used to generate HTML/PDF from
>                 the DocBooks.
>      Transifex: Same as with Launchpad.
>
>
>
> [1] The Python program can be written based on “xml2po” to slice all
> DocBooks of the manuals project into translatable strings in batch.
> “xml2po” is an existing Python program in the GNOME gnome-doc-utils
> package which extracts translatable content from free-form XML documents
> and outputs gettext-compatible POT files.
> [2] The Python program can be written based on “xml2po” to converge the
> translated strings back into copies of the DocBooks in batch.
>
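For the Converging step, a minimal sketch of the idea follows. It is not based on xml2po: it assumes translations are available as a dict from source string to translated string, treats only `title` and `para` as translatable (a hypothetical choice), and skips elements that contain inline markup. The sample translation is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical choice of translatable DocBook elements
TRANSLATABLE_TAGS = {"title", "para"}


def local_name(tag):
    """Strip an XML namespace prefix from a tag name, if present."""
    return tag.rsplit("}", 1)[-1]


def converge(xml_text, translations):
    """Return a translated copy of a DocBook document as a string.

    translations maps whitespace-normalized source text to translated text.
    Elements with child elements (inline markup) are left untouched here.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if local_name(elem.tag) in TRANSLATABLE_TAGS and len(elem) == 0:
            source = " ".join((elem.text or "").split())
            translated = translations.get(source)
            if translated:
                elem.text = translated
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    "<chapter><title>Installing Compute</title>"
    "<para>Install the packages on each node.</para></chapter>"
)

if __name__ == "__main__":
    # Placeholder translation, for illustration only
    print(converge(SAMPLE, {"Installing Compute": "[fr] Installing Compute"}))
```

Strings with no translation stay in the source language, so a partially translated manual can still be generated.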
>
> Regards
> Daisy Guo
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>
