
RE: Call for testing new Launchpad Translations code performance



On Wed, 21-11-2007 at 15:22 +0100, Philippe Verdy wrote:
> Could I request a missing item for translations: currently, there's
> absolutely NO way of entering non-breaking spaces in translations, even
> though some translations require them.

That's not completely true. We should expose that feature better, but we
do allow you to enter non-breaking spaces:

https://bugs.launchpad.net/rosetta/+bug/81281

> 
> When we submit a translation string using the online web form, these
> non-breaking spaces become regular spaces, which is wrong when NBSP are
> needed, for example with some punctuation signs (in French for example, or
> as group separators in decimal numbers). This causes some messages to
> display with an incorrect line break when lines are wrapped (for example an
> undesirable line break in the middle of a number, or between a word and a
> punctuation mark).

If you take a look at
https://bugzilla.mozilla.org/show_bug.cgi?id=218277 you can see that
it's a bug in Mozilla-based browsers. It seems the 3.x versions of
Firefox should have it fixed, but that bug is why we had to add a
workaround in Launchpad to allow entering them. We don't strip them at
all; your browser is doing that. If you are not using Firefox or another
browser based on Mozilla's Gecko, then yes, it may be a bug in our code,
and we would appreciate a bug report at
https://bugs.launchpad.net/rosetta with your browser information, the
concrete URLs that fail for you and what you submitted, so we can debug
the problem and fix it.

> 
> Currently the web form displays non-breaking spaces as "[nbsp]" but trying
> to enter this in the web form does not recreate the expected code, but
> enters the "[nbsp]" string as a literal.

Well, "[nbsp]" is only a visual marker that the web interface uses to
show the character. Did you try downloading the file to see what gets
exported?
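
One way to check an exported file is to look for U+00A0 directly. The
following is only a minimal Python sketch: the helper name
nbsp_positions and the sample string are made up for illustration and
are not part of Launchpad.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def nbsp_positions(text):
    """Return the index of every U+00A0 character in text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]


# Hypothetical translated string; in practice, use a msgstr read from
# the .po file you downloaded.
exported = "Voulez-vous continuer\u00a0?"
print(nbsp_positions(exported))
```

An empty list for a string that should contain a non-breaking space
means the character was lost somewhere before the export.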

> 
> So please, make sure that the web input form is processed correctly: don't
> convert non-breaking spaces from the input form into regular spaces. Similar
> problems occur also with some format controls needed for some languages,
> notably:
> 
> - direction controls for embedding multiscript texts: LRE, RLE and LRM
> (often needed for embedding Latin fragments including punctuation, to avoid
> incorrect reordering or mirroring notably at the inter-script boundaries).
> 
> - word-breaking controls: ZWJ, ZWNJ (really needed for supporting
> South-Asian scripts)
> 
> - combining grapheme controls (needed because of Unicode normalization): CGJ
> (really needed for supporting Hebrew, due to the way Hebrew combining points
> in full-pointed texts are normalized because of the "incorrect" but
> unchangeable combining weights that are assigned to Hebrew combining points;
> with CGJ, the reordering of multiple Hebrew points can be blocked during
> normalization; this CGJ is normally not needed for modern Hebrew as there
> is normally only one point per Hebrew base letter, but this still occurs
> with vowel points added on letters with a dagesh or resh point; the Hebrew
> combining marks were given default combining weights assuming only the
> modern usage, but problems happen immediately when you have to manage
> Biblical and other religious texts that use a lot of additional combining
> marks, including cantillation).
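
The reordering described in the CGJ item can be reproduced with Python's
standard unicodedata module. This is only an illustrative sketch: the
choice of bet, dagesh and patah is an assumption made for the example,
and nothing here is Launchpad code.

```python
import unicodedata

BET = "\u05d1"     # HEBREW LETTER BET
DAGESH = "\u05bc"  # HEBREW POINT DAGESH OR MAPIQ, combining class 21
PATAH = "\u05b7"   # HEBREW POINT PATAH, combining class 17
CGJ = "\u034f"     # COMBINING GRAPHEME JOINER, combining class 0


def nfc(text):
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


plain = BET + DAGESH + PATAH          # marks in the order they were typed
guarded = BET + DAGESH + CGJ + PATAH  # same marks with CGJ between them

# Without CGJ, normalization sorts the marks by combining class.
print(nfc(plain) == BET + PATAH + DAGESH)
# With CGJ, the typed order survives normalization.
print(nfc(guarded) == guarded)
```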

By default, we shouldn't remove any of those characters. If you have
specific issues, please file a bug with the concrete Unicode code points
so we can debug it, find what's causing the problem and fix it, when
that's possible and it isn't a problem with the browser.
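
To collect the exact code points for such a report, something like the
following Python sketch may help. The function name describe and the
sample string are hypothetical and only serve the illustration.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' line per character in text."""
    lines = []
    for ch in text:
        # Characters without a Unicode name (e.g. newline) get a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X} {name}")
    return lines


# Hypothetical sample: a number with a non-breaking group separator.
for line in describe("10\u00a0000"):
    print(line)
```

Running it on both the string you typed and the string you got back
shows which character, if any, was changed on the way.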

> 
> Or suggest an input syntax for allowing entering them, for example by using
> a "\uNNNN" notation (if the texts contain literal "\u" convert them first
> to "\\u" before displaying them and allowing them to be entered in the input
> form). The Javascript in the input form could be used to redisplay these
> controls correctly (for example displaying "[nbsp]" with smaller letters,
> within a box with greyed background), and could use this trick to support
> correct input of literal backslashes (entered as "\\" internally but viewed
> as a "\" in a greyed box), or newlines other than single U+000A.
> 
> But currently, the current code alters/destroys the existing non-breaking
> spaces in existing translations submitted to Launchpad. This makes the
> Launchpad website not very suited for handling international texts, notably
> for the "Translations" where this is really needed!
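
For reference, the "\uNNNN" notation proposed in the quoted text could
be sketched in Python roughly as follows. This is purely hypothetical
and is not how Launchpad works; the SPECIAL set and the escape and
unescape names are assumptions made for the example.

```python
import re

# Characters shown in escaped form (an assumed, incomplete selection):
# NBSP, ZWNJ, ZWJ, LRM, LRE, RLE and CGJ.
SPECIAL = {"\u00a0", "\u200c", "\u200d", "\u200e", "\u202a", "\u202b", "\u034f"}


def escape(text):
    """Double literal backslashes and write special characters as \\uNNNN."""
    out = []
    for ch in text:
        if ch == "\\":
            out.append("\\\\")
        elif ch in SPECIAL:
            out.append(f"\\u{ord(ch):04X}")
        else:
            out.append(ch)
    return "".join(out)


# Matches a doubled backslash or a backslash-u escape with four hex digits.
_TOKEN = re.compile(r"\\(\\|u[0-9A-Fa-f]{4})")


def unescape(text):
    """Reverse escape(): restore backslashes and special characters."""
    def repl(match):
        body = match.group(1)
        return "\\" if body == "\\" else chr(int(body[1:], 16))
    return _TOKEN.sub(repl, text)
```

Doubling the backslash first is what keeps a text that already contains
a literal "\u" from being misread as an escape on the way back.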

Again, please take into account that this kind of thing could be a
problem either in Launchpad or in your browser. For non-breaking spaces,
it's a browser bug, not a bug on our side.

I wonder whether the other special characters are really broken in
Launchpad or whether you are just assuming they are. I'm not sure, so I
want to know whether you have actually run into that kind of problem
while using Launchpad for translations.

Cheers.

P.S.: Please, open a new thread in the mailing list for things related
to this email so we have it isolated from the performance testing
thread.






This is the launchpad-users mailing list archive — see also the general help for Launchpad.net mailing lists.
