Wednesday, January 13, 2010

Better Translating Service

The problem of producing an intelligible automated translation, much less one that actually sounds natural, is not easy, and it hasn't really been solved anywhere.

The proposal is to make translations more direct and decipherable by not "sugar-coating" them, that is, by not forcing them into the regular grammar of English (or whatever the recipient language is). For example, Mandarin grammar is *very* different from English grammar. When translating from Mandarin, notation (brackets, etc.) could be used instead of normal English grammatical structures to relay the grammatical structure of the Mandarin phrase. Most people are smart enough to get the hang of such a system after a couple of translations; the bigger problem is simply not knowing the words of the other language.

Another example of this approach is how to handle agglutination (compounding). To translate a complicated German compound word into English, we could simply present something like zeitgeist as "time-spirit." People are intelligent enough to infer the real meaning behind such things from context in a lot of cases; again, the translator only needs to come half-way. A reader can't be expected to spend years learning a language just to read one text.
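As a rough sketch of the "time-spirit" idea, here is one way a compound could be split into known parts and glossed piece by piece. The tiny `GLOSSARY` dictionary and the function names are made up for illustration; a real system would need a full lexicon and would have to deal with linking letters and ambiguous splits.

```python
# Hypothetical mini-glossary of German word parts (an assumption for this sketch).
GLOSSARY = {
    "zeit": "time",
    "geist": "spirit",
    "welt": "world",
    "schmerz": "pain",
}

def split_compound(word, glossary=GLOSSARY):
    """Return a list of glossary parts that exactly cover `word`, or None."""
    word = word.lower()
    if not word:
        return []
    # Try the longest possible leading part first, then recurse on the rest.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in glossary:
            rest = split_compound(word[end:], glossary)
            if rest is not None:
                return [head] + rest
    return None

def gloss_compound(word, glossary=GLOSSARY):
    """Gloss a compound part by part; unknown words are returned unchanged."""
    parts = split_compound(word, glossary)
    if parts is None:
        return word
    return "-".join(glossary[part] for part in parts)

print(gloss_compound("Zeitgeist"))
```

With this glossary, "Zeitgeist" comes out as "time-spirit", and a word with no known parts is simply left as it is for the reader to look up.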

Another great feature would be the ability to click on any word in the translation and get a pop-up list of other words it could have been translated as. Obviously translations aren't strictly word-by-word, but most of the time a word could be tagged (in an HTML id attribute, for example) as the translation of a particular word in the source text, for use during lookup.
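The tagging could look something like the sketch below. It assumes the translator already knows which source word each output word came from; the `tag_translation` name and the `data-source` attribute are my own choices for illustration, with the `id` used only to give each output word a unique handle.

```python
from html import escape

def tag_translation(aligned_pairs):
    """Wrap each translated word in a <span> that records its source word.

    `aligned_pairs` is a list of (translated_word, source_word) tuples,
    in the order the translated words appear.
    """
    spans = []
    for position, (target_word, source_word) in enumerate(aligned_pairs):
        spans.append(
            '<span id="w{}" data-source="{}">{}</span>'.format(
                position, escape(source_word), escape(target_word)
            )
        )
    return " ".join(spans)

print(tag_translation([("time-spirit", "Zeitgeist")]))
```

A click handler on the page could then read the source word back out of the span and use it to look up the alternative translations.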

Also, in some cases where there are only 2 or 3 possible translations of a word but they're significantly different from each other, the word could be presented in a way that shows all of them; for example, "geist" could show up as "[ghost|spirit]".
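Here is a minimal sketch of that display rule. It assumes the list of candidates has already been narrowed down to meanings that really are different from each other (deciding that is the hard part, and isn't attempted here); `render_word` and `max_inline` are names invented for this example.

```python
def render_word(candidates, max_inline=3):
    """Show 2 to `max_inline` candidates inline as [a|b]; otherwise show the first.

    `candidates` is a non-empty list of possible translations, best first.
    """
    if not candidates:
        raise ValueError("need at least one candidate translation")
    if 2 <= len(candidates) <= max_inline:
        return "[" + "|".join(candidates) + "]"
    # One candidate, or too many to show inline: fall back to the top choice.
    return candidates[0]

print(render_word(["ghost", "spirit"]))
```

Words with a long list of candidates fall back to the first choice, which is where the click-for-alternatives pop-up would take over.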