ubuntu-manual team mailing list archive

Re: Documentation pool sample content


Hello, Phil.

On Wed, Aug 4, 2010 at 5:56 PM, Phil Bull <philbull@xxxxxxxxx> wrote:
> Sorry to change tack slightly, but I'm wondering whether the format
> discussion should be deferred until some more important issues are
> resolved. If we choose an XML format, we can always transform to other
> formats with a bit of work.

That's true to a degree.  But if one format has more granular tags
than another, it's hard to convert from the coarser format to the
more granular one.  (It's easy to swap one tag for another or to
remove unnecessary tags, but it's hard to add tags whose information
was never recorded.)
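
To make the asymmetry concrete, here is a small Python sketch.  The tag
names and the mapping are illustrative assumptions only (loosely
modelled on DocBook-style and HTML-style markup), not a proposal for
the pool format.  Going from granular to general is a simple lookup;
going back is ambiguous, because several granular tags collapse into
the same general one.

```python
import re

# Hypothetical mapping from granular (semantic) tags to general
# (presentational) tags.  Several granular tags share one general tag.
GRANULAR_TO_GENERAL = {
    "guimenu": "b",
    "guibutton": "b",
    "filename": "tt",
    "command": "tt",
}


def to_general(text):
    """Swap each granular tag for its general counterpart."""
    def swap(match):
        slash, tag = match.group(1), match.group(2)
        return "<%s%s>" % (slash, GRANULAR_TO_GENERAL.get(tag, tag))
    return re.sub(r"<(/?)(\w+)>", swap, text)


def candidates_for(general_tag):
    """List every granular tag that a general tag could have come from."""
    return sorted(granular for granular, general in GRANULAR_TO_GENERAL.items()
                  if general == general_tag)


granular = "Click <guibutton>Open</guibutton> in the <guimenu>File</guimenu> menu."
general = to_general(granular)   # both tags become <b>
ambiguous = candidates_for("b")  # more than one possible origin for <b>
```

Once the text only says `<b>`, nothing in the file tells a converter
whether it was a button or a menu; a person has to decide.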

> I discussed the content pool in passing with a few GNOME docs guys at
> GUADEC, and one particularly serious concern was how the content would
> fit together. Books are written in a different style to help topics,
> which are written in a different style to training worksheets (and what
> have you). How would chunks of material from the documentation pool be
> reused for different purposes?
> My thoughts were that reuse would require significant editing (making
> material from a help topic flow better when used in the book context,
> for example). But surely this would remove a major benefit of having a
> shared pool: namely, reduced translation duplication?
> Do you have any ideas on how we can overcome problems like this? I'm
> worried that, in practice, documentation from the different teams would
> overlap in subject but not in style, so we'd either have a pool
> containing items which are only suitable for one team, or items which
> need significant (and destructive) editing to get them to work together.

The way I've been thinking about the document pool is that it provides
us with some transparency of provenance for our documentation.

An example to illustrate: Let's say that the Ubuntu docs team has
written a fantastic page on using Rhythmbox.  The manual team would
like to keep the same information but reword it to fit the flow of
their book.  The manual team would branch the document and modify it
to suit their needs.  If, later on, the docs team rewrites a portion
of their Rhythmbox article (say, a Rhythmbox feature has been modified
somehow), the manual team could easily tell that the 'upstream'/source
document has been modified and tweak their document accordingly.
Similarly, the docs team could see what changes the manual team has
made and incorporate those changes back into the original document.
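
A rough Python sketch of that 'has upstream changed since we branched?'
check follows.  In practice a version control system would do this
work; the `Branch` class, its method names, and the sample text are all
made up for illustration.

```python
import difflib
import hashlib


def fingerprint(text):
    """Return a stable hash of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Branch:
    """A derived document that remembers the upstream text it came from."""

    def __init__(self, upstream_text, text=None):
        self.base_text = upstream_text  # upstream as it was when we branched
        self.base_fingerprint = fingerprint(upstream_text)
        self.text = upstream_text if text is None else text

    def upstream_changed(self, current_upstream):
        """True if upstream differs from the version we branched from."""
        return fingerprint(current_upstream) != self.base_fingerprint

    def upstream_diff(self, current_upstream):
        """Unified diff of what changed upstream since branching."""
        return list(difflib.unified_diff(
            self.base_text.splitlines(), current_upstream.splitlines(),
            fromfile="upstream@branch", tofile="upstream@now", lineterm=""))


# Made-up sample text standing in for the docs team's article.
docs_v1 = "Rhythmbox plays music.\nUse the Music menu to import files."
docs_v2 = "Rhythmbox plays music.\nUse the File menu to import files."

# The manual team branches v1 and rewords the first line for the book.
manual = Branch(docs_v1, text="Meet Rhythmbox.\nUse the Music menu to import files.")

changed = manual.upstream_changed(docs_v2)
diff = manual.upstream_diff(docs_v2)
```

The same comparison run in the other direction (branch text against
the base text) would show the docs team what the manual team changed.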

Any unmodified text in this branching process would inherit the
original translations.  And the translators could be notified of the
modified text that requires retranslation.
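
As a sketch of how that inheritance might work (assuming, purely for
illustration, that translations are stored per paragraph and keyed by
the exact source text), the function below splits a branched document
into paragraphs that can reuse an existing translation and paragraphs
that need a translator's attention.  The function name and the
placeholder translations are hypothetical.

```python
def inherit_translations(branch_paragraphs, upstream_translations):
    """Split a branched document into reused and untranslated paragraphs.

    upstream_translations maps each original paragraph to its translation.
    Returns (reused, needs_translation).
    """
    reused = {}
    needs_translation = []
    for paragraph in branch_paragraphs:
        if paragraph in upstream_translations:
            # Unmodified text: inherit the original translation as-is.
            reused[paragraph] = upstream_translations[paragraph]
        else:
            # Modified or new text: notify the translators.
            needs_translation.append(paragraph)
    return reused, needs_translation


# Placeholder strings stand in for real translations.
upstream_translations = {
    "Rhythmbox plays music.": "<translation of paragraph 1>",
    "Use the Music menu to import files.": "<translation of paragraph 2>",
}

branch = [
    "Meet Rhythmbox, the music player.",    # reworded by the manual team
    "Use the Music menu to import files.",  # untouched
]

reused, needs_translation = inherit_translations(branch, upstream_translations)
```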

Translation is one of the bigger problems we've been wrestling with on
the manual project.  Currently, we're using the po4a Perl script to
convert between LaTeX and PO formats.  The translations are done via
Launchpad.  Each 'unit' of translation is a full paragraph (because
that's how po4a operates).  This is nice in that it provides some
context for a sentence, but causes a lot of problems for minor
updates.  If someone adds a missing comma to the original English,
then the entire paragraph is marked as requiring a new translation
(despite the fact that the translator may have fixed the comma in the
translation as she worked on it).

I'd like to see smaller and smarter translation units.  Translating an
entire sentence seems sensible for the most part, but we should
continue to provide the context of the sentence so the translation can
flow as well as the original text.
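
A minimal Python sketch of sentence-sized units that still carry their
neighbours as context is below.  The sentence splitter is deliberately
naive (it would trip over abbreviations such as "e.g."), and the `Unit`
type and function names are invented for this example; this is not how
po4a works.

```python
import re
from collections import namedtuple

# A translation unit: one sentence, plus its neighbours shown as context.
Unit = namedtuple("Unit", "text before after")


def sentence_units(paragraph):
    """Split a paragraph into sentence units, keeping neighbours as context."""
    # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    units = []
    for i, sentence in enumerate(sentences):
        before = sentences[i - 1] if i > 0 else ""
        after = sentences[i + 1] if i + 1 < len(sentences) else ""
        units.append(Unit(sentence, before, after))
    return units


def stale_units(old_paragraph, new_paragraph):
    """Return units of the new paragraph with no exact match in the old one."""
    old_texts = {unit.text for unit in sentence_units(old_paragraph)}
    return [unit for unit in sentence_units(new_paragraph)
            if unit.text not in old_texts]


old = "Open Rhythmbox. To import music click Import. Your songs appear in the library."
new = "Open Rhythmbox. To import music, click Import. Your songs appear in the library."

# Only the sentence that gained a comma needs another look.
stale = stale_units(old, new)
```

Because the context is shown to the translator but is not part of the
lookup key, editing one sentence does not invalidate its neighbours.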

We could help translators by providing lookups of menu item strings
automatically (thereby ensuring that the translation of the
documentation matches the translation of the application).
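
One way such a lookup could be sketched in Python is shown below.  Two
assumptions are made loudly here: that menu items in the source are
marked with a `\menu{...}` macro (a hypothetical name), and that the
application's translations are available as a simple dictionary (in
practice they would be loaded from the application's PO files).  The
German strings are sample data only.

```python
import re

# Sample excerpt of an application's translation catalog (source -> German).
# In a real setup this would come from the application's PO files.
APP_CATALOG = {
    "File": "Datei",
    "Open": "Öffnen",
}

# Hypothetical markup: menu items are written as \menu{Label} in the source.
MENU_PATTERN = re.compile(r"\\menu\{([^}]*)\}")


def suggest_menu_translations(source_text, app_catalog):
    """Map each marked menu label to the application's translation, or None."""
    suggestions = {}
    for label in MENU_PATTERN.findall(source_text):
        suggestions[label] = app_catalog.get(label)
    return suggestions


text = r"Choose \menu{File} and then \menu{Open}, or use \menu{Import Folder}."
suggestions = suggest_menu_translations(text, APP_CATALOG)
```

A label that maps to `None` has no application translation yet, so the
translator would know to check with the application's translators
rather than invent a string.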

I think that getting translations right for large chunks of text (as
opposed to short UI element strings) would go a long way toward
helping all of our projects.

(Did I stray wildly off-topic and completely forget to answer your question?)

--Kevin Godby