How Does Google Translate Work?
29.09.2010
Google Translate is by no means a perfect translation service (you’re still going to have to invest in those language classes if you want to communicate fluently with speakers of foreign tongues), but if you remember the online wasteland that preceded it, you’ll know it’s pretty darn good at conveying the gist of what you’re trying to express. Also: that ‘detect language’ feature saves you the hassle of fiddling with lots of dropdown menus.

So how does it do it? As with many of Google’s technologies, Google Translate’s M.O. consists of sifting through large piles of data; in this case, text. Google calls this process of translating by finding patterns in vast swathes of writing “statistical machine translation.” As humans, when we learn languages, we do so by navigating the sets of rules that govern them, so Google’s process might seem deeply unintuitive. However, when you compare its results to those of translation services like Babel Fish, which is powered by SYSTRAN’s rule-based machine translation, the improved accuracy speaks for itself.

Indeed, Google used SYSTRAN for its translations up until 2007, when it switched to its own system. At the time, Google research scientist Franz Och explained the switch as follows:

“Most state-of-the-art commercial machine translation systems in use today have been developed using a rules-based approach and require a lot of work by linguists to define vocabularies and grammars. Several research systems, including ours, take a different approach: we feed the computer with billions of words of text, both monolingual text in the target language, and aligned text consisting of examples of human translations between the languages. We then apply statistical learning techniques to build a translation model.”

With a much more robust set of data at its disposal today, Google continues to stick up for its data-driven approach to translation while granting that it’s imperfect. Google has reportedly fed thousands of multilingual United Nations and European Union official documents to Google Translate to get it up to speed, Rosetta Stone-style, and the millions of pages it indexes can’t hurt.

Recently, Google released a video to explain Google Translate:

“If you want to teach someone a new language you might start by teaching them vocabulary words and grammatical rules that explain how to construct sentences. A computer can learn foreign language the same way – by referring to vocabulary and a set of rules. But languages are complicated and, as any language learner can tell you, there are exceptions to almost any rule. When you try to capture all of these exceptions, and exceptions to the exceptions, in a computer program, the translation quality begins to break down. Google Translate takes a different approach…”

“…Once the computer finds a pattern, it can use this pattern to translate similar texts in the future. When you repeat this process billions of times you end up with billions of patterns and one very smart computer program. For some languages however we have fewer translated documents available and therefore fewer patterns that our software has detected. This is why our translation quality will vary by language and language pair.”

Source: http://www.geekosystem.com

See also:
28.09.2010
Language in the digital age
23.09.2010
8 Steps to Website Globalization
23.09.2010
Translation and Localization Industry Pricing Survey Reveals While Demand is Up, Prices are Down
20.09.2010
Top 3 rumours on translation quality