Carlos Perelló Marín [mailto:carlos.perello@xxxxxxxxxxxxx]
> On Wed, 21-11-2007 at 15:22 +0100, Philippe Verdy wrote:
> > Could I request a missing item for translations: currently, there's
> > absolutely NO ways of entering non-breaking spaces in translations, despite
> > some translations require them.
>
> That's not completely true, we should expose that feature more but we do
> allow you to introduce non-breaking spaces:
>
> https://bugs.launchpad.net/rosetta/+bug/81281

No, this occurs in IE7 as well, not just in Mozilla-based browsers. There is apparently something wrong in the way the content is encoded or interpreted in browsers, possibly because it is not correctly encoded to support "alternate" spaces, or because of the encoding used to return the submitted form.

I see only one way to solve it: don't send pure text to browsers; send escaped text (using a trick like "\uNNNN" encoding). Then:

* If Javascript is not supported or is disabled, users will see the text with these escapes, and will submit the data using a "safe" encoding that is preserved.
* If Javascript is enabled, your Javascript can PRESENT the decoded text to the user, and process the user's input back, so that the text is shown in WYSIWYG mode.

There should then be a checkbox allowing the text input form to show either the decoded (WYSIWYG) or the encoded (\uNNNN) form. Only the encoded form (with escapes) would be used to talk to the web server. Being able to switch the content of an input form between the two forms will help users see the otherwise invisible characters that may cause problems, and will help them enter these characters.
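To make the escaping idea above concrete, here is a minimal sketch in Python, for illustration only. The function names encode_escaped and decode_escaped are hypothetical, and so are the details the proposal leaves open: doubling the backslash so that decoding is unambiguous, keeping newlines literal, and writing characters outside the Basic Multilingual Plane as UTF-16 surrogate pairs. The same logic would have to be mirrored in the browser-side Javascript.

```python
import re

# One escaped unit: a doubled backslash, or "\u" followed by four hex digits.
_ESCAPE = re.compile(r"\\(\\|u[0-9a-fA-F]{4})")


def encode_escaped(text: str) -> str:
    """Return an ASCII-only form of `text` using \\uNNNN escapes (assumed scheme)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")  # double the backslash so decoding is unambiguous
        elif 0x20 <= cp < 0x7F or ch == "\n":
            out.append(ch)  # printable ASCII and newlines stay readable
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            # Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


def decode_escaped(escaped: str) -> str:
    """Inverse of encode_escaped: turn the escaped form back into real text."""

    def replace(match: "re.Match[str]") -> str:
        token = match.group(1)
        if token == "\\":
            return "\\"
        return chr(int(token[1:], 16))

    text = _ESCAPE.sub(replace, escaped)
    # Recombine surrogate pairs into single characters.
    return text.encode("utf-16", "surrogatepass").decode("utf-16", "surrogatepass")
```

With this sketch, a non-breaking space in "10\u00a0%" travels as the six visible ASCII characters \u00A0, so a browser that mangles special spaces has nothing to mangle, and decode_escaped(encode_escaped(s)) gives back s unchanged.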
Note this: browsers don't let users enter, for example, an NBSP character, EVEN IF this character is mapped on the keyboard. It is generally mapped on Ctrl+Space or Ctrl+Shift+Space, and browsers modify the keymap and disable the Control key, or transform the input as soon as it is entered, even if this input comes from a copy/paste operation. Instead, they assume that the language entered will match the language used in the web page, and force the input to adopt the encoding and character subsets used in the language determined from the web page. (They use various tricks to determine it, including not only the page encoding but also metadata sent in the HTTP headers, or some other elements in the page; generally they ignore the xml:lang or lang attribute set on the web input elements, and this becomes even more complex when stylesheets come into play.)

The encoding interactions are really complex to handle over the HTTP interface and with the interaction of HTML syntax. One way to avoid this is to simplify the encoding at this interface, and then let some Javascript do the work locally in the browser, rendering the text back to normal without the intermediate encoding. Such local Javascript would perform the input decoding/encoding, validation and reformatting dynamically. If Javascript is not enabled, users will still be able to interact with a normal browser, but using only "safe" characters and an escaping syntax.

Note also that it is not clear what Launchpad does with translations whose text contains something that looks like literal HTML or literal named character references. I have seen them change after editing just one character in the resource, although this was not expected to affect the encoding of the rest of the text.
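One way to spot this kind of silent change is to compare the text that was submitted with the text that comes back, code point by code point, and to name the characters that differ, since a regular space and a non-breaking space look identical on screen. The following is only a sketch in Python; describe_changes is a hypothetical helper, not something Launchpad provides.

```python
import difflib
import unicodedata


def _label(chars: str) -> str:
    """Spell out each character as 'U+XXXX NAME' so invisible ones show up."""
    if not chars:
        return "<nothing>"
    return " ".join(
        f"U+{ord(c):04X} {unicodedata.name(c, '<unnamed>')}" for c in chars
    )


def describe_changes(before: str, after: str) -> list:
    """List (offset in `before`, old characters, new characters) for every difference."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((i1, _label(before[i1:i2]), _label(after[j1:j2])))
    return changes


# A non-breaking space that was silently turned into a regular space:
print(describe_changes("10\u00a0%", "10 %"))
# [(2, 'U+00A0 NO-BREAK SPACE', 'U+0020 SPACE')]
```

Run over each resource before upload and after download, an empty list means the text survived the round trip unchanged.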
So when I download the translation results back, I can see that they have been transformed, without any warning shown to the user (and no visible difference) when submitting the data. For this reason, I have given up using Launchpad: it cannot handle international text properly, and it really breaks existing resources that were working properly before and were already properly encoded.

(For the project I'm interested in, the resources are to be converted into Java properties files, and really contain Unicode text. Unicode is used as the central encoding, even if the text is then automatically converted to ASCII only, using the Java-specific resource format with Unicode escapes in an ASCII-only file.) Very large files, with thousands of resources that had been completed and reviewed by many people over several years, have suddenly been degraded to the point of being almost unusable, and everything needs to be rechecked manually. The project counts more than 400,000 resources in various languages and scripts, the whole set of texts occupying several megabytes uncompressed. The translation status was suddenly degraded so that many existing languages were no longer usable and would have been removed from the distribution. This included very common languages whose resources were translated at nearly 100%, with just a few to maintain from time to time, and whose translation level suddenly dropped below 50% in the needed core resources.

Note also that on your site:

* Please don't let input boxes force their width so that they require horizontal scrolling (even on a high-resolution display) just because the text to translate is a single paragraph without any newline. In that case the text is supposed to be displayed with automatic line wrapping, and your interface already allows seeing the positions where newlines are actually encoded in the resource text.
* The stylesheet is nearly unusable for proper text input: the text is really TOO SMALL for entering anything other than Basic Latin (i.e. English and a few other languages). Most other languages use non-ASCII characters, and they are really hard to see and correct; for languages with complex scripts or subtle glyph distinctions, like Chinese, pointed Arabic and Indian scripts, but also Korean using the regular Hangul alphabet, it is almost impossible to read the text properly.
* The fonts specified are forced, but do not allow proper input of international text. Please remove the font assignment at least in the input box, or in the resource display: let users specify their own font in the browser settings, or at least make sure that the language being worked on has its localized text styled with fonts that DO WORK with that language. Test a list of fonts that work for each language/script pair, i.e. for each of the supported locales; then mark the displayed text with a style specific to that language, map a list of fonts to this language in a CSS "font-family:", and make sure that the fonts are styled with a sufficient point size! And please let users increase the font size, for the display of the resources and/or the input element.

My message is on topic, because this thread is about a new Launchpad Translations site that is testing some new features (primarily to address the current performance problem, but this should also cover usability problems).
This is the launchpad-users mailing list archive — see also the general help for Launchpad.net mailing lists.