Online Multilinguism by Chris Waigl

How not to translate websites! Internationalisation (i18n)… localisation (l10n)… these are buzzwords of importance for any website of size. As I work for Last.fm, I figured this would be a topic of practical application to me.

Chris, of German origin, is multilingual, and her interest in this area stems from the fact that when she attempted to publish a bilingual blog, she could not find the tools to support her.

First, some facts. There are 347 languages in the world with more than 1 million speakers… that’s only 5% of the languages on Earth, but represents 95% of the planet’s speakers. Stepping it up a gear: there are 75 languages with more than 10 million speakers, covering 80% of the speakers on the planet.

The largest languages are Mandarin with 900 million speakers, and then Spanish, Hindi and English with 300 to 350 million speakers each. Interestingly, English has 2.5 to 3 times as many fluent second-language speakers as first-language speakers. This means that the majority of English readers would prefer another language.

Multilingualism is widespread outside of English-speaking countries and, in contrast to rich countries of the west, not a mark of privilege. And it changes fast. Anecdotally, Chris gives the example of Irish: adding Irish to the national curriculum, funded by the government, saved the language from extinction within 5 to 10 years.

Internationalisation is a user-experience problem and should be integrated throughout the design, build and testing of a website or product – in the best case, testing with non-primary-language speakers and multilingual users. But what tools are available to help with this process?

From the web-user’s point of view, any language I read well is a language I can do business in. Searching is another matter – I want the results of everything I can read together. Community-wise, I want to post content in all languages but not deal with content I don’t understand.

In an ideal world, automatic translation can solve some of these problems, but in reality there are very few cases in which that’s acceptable. Chris suggests that searching for, say, Korean news results and having them translated to a language she reads is acceptable – even with the awkwardness of the robot inflection, it’s still access to previously unavailable content. But what can be done better where multilingualism CAN be dealt with effectively?

Other key points to realise for an international web developer are that language is not the same as country; and country is not the same as language. You cannot convert directly from one to the other, despite it being the path of least resistance for many developers.

An experiment. Chris travelled to Prague, with her laptop set to the German locale, to see how some websites behave when these two pieces of information conflict. Facebook immediately offers to show the site in Czech if you wish, while still showing the site in German – taking into account both the language setting and the location of the connection.

Google, on the other hand, assumes that because you’re in the Czech Republic you want Czech, despite your browser being set to the language of your choice. This problem can go wider, especially for a traveller using an internet cafe and unable to understand the operating system itself…

PayPal is a third example: at one time, it automatically chose the language based on your postal address, even if you were, say, an expat! Fortunately that situation has since changed. Nevertheless, the choices of country/language combination are still very limited. Too many assumptions!
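To make the distinction concrete, here is a minimal sketch in Python of the behaviour Facebook showed in the Prague experiment: the display language comes from the browser's Accept-Language header, and the detected location is only used to offer alternatives. This is my own illustration, not something from the talk. The function names, the `supported` set and the `country_languages` list (the languages you might associate with the visitor's detected country) are all hypothetical.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q value counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unreadable weight: treat as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_languages(accept_language, country_languages, supported, default="en"):
    """Pick the display language from the browser; location only yields offers.

    `supported` is assumed to hold lowercase language tags (hypothetical).
    """
    display = default
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]  # "de-de" -> "de"
        if tag in supported:
            display = tag
            break
        if primary in supported:
            display = primary
            break
    # Languages of the detected country are suggested, never imposed.
    offers = [lang for lang in country_languages
              if lang in supported and lang != display]
    return display, offers


# A German-configured browser connecting from Prague:
print(choose_languages("de-DE,de;q=0.9,en;q=0.8", ["cs"], {"en", "de", "cs"}))
# prints ('de', ['cs'])
```

The point of the design is simply that the location never overrides an explicit language preference. A real site would also want to remember a choice the user makes by hand.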

Chris was generally full of praise for Last.fm, but did mention from a user experience perspective how it’s potentially inappropriate to represent our language choices with country flags; a German-speaking citizen of Switzerland is not German, after all. Not a point I’d previously thought about, but she concedes that it’s very hard to represent a language pictographically.

Next in the firing line is Amazon. Despite having high usability street-cred, Amazon does have a habit of redirecting users to specific language versions based on location. It’s a complicated problem and Amazon does not handle it too gracefully. Edge cases that fail include a French speaker in Switzerland being redirected to the German site; a traveller with an amazon.co.uk cookie being continually offered the choice of the .co.uk site despite living in France; and someone ordering in one language from another country to ship as a gift.

Any multilingual blogger has three areas to worry about: translating posts, interface elements and static content. There are some bilingual plugins for WordPress, but many issues are left to iron out – how do you translate tags, for example, and is that even necessary?

This was a mainly non-technical talk focusing on user experience, but Chris touched briefly at the end on some of the ongoing sticky topics in this area, including Unicode normalisation, UTF-8 and Accept-Language HTTP headers – stuff every web-dev should know, and few are comfortable with.
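As a small illustration of why Unicode normalisation is sticky (my own example, not one from the talk): the same visible word can be stored as different code point sequences, which then encode to different UTF-8 bytes and fail a naive comparison. A minimal Python sketch using the standard `unicodedata` module, where `same_text` is a hypothetical helper name:

```python
import unicodedata

composed = "caf\u00e9"     # 'é' as a single precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent


def same_text(a, b):
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)           # prints False
print(len(composed.encode("utf-8")),
      len(decomposed.encode("utf-8")))  # prints 5 6
print(same_text(composed, decomposed))  # prints True
```

Both strings look identical on screen, so anything that compares or indexes user-supplied text (tags, search terms, usernames) probably wants to normalise it first.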

In conclusion, a worthy topic with some excellent examples of what to do and what not to do, and I will be feeding back some of this information on Monday…


This entry was posted on Saturday, October 4th, 2008 at 2:46 pm and is filed under Sessions.
