Friday, January 13, 2017

Machine translation in 2017: It's getting neural

In November 2016, something happened to the most widely used open machine translation platform, Google Translate. Users around the globe (mainly those translating between English and Chinese, Spanish, French, Japanese, German, Korean, Portuguese and Turkish) noticed a major improvement in the machine translation results, which suddenly handled whole sentences, worked with much broader context and got, well, more human. Google itself claims the platform has improved "more in a single leap than we’ve seen in the last ten years combined". So what happened?

Comparison of Statistical Machine Translation vs. Neural Machine Translation results.
Source: blog.google.com

SMT vs. NMT

In short, neural networks and AI happened. Rather than relying on statistical methods that solve problems "by force" (the larger the databases and the more computing power available, the better the results), neural networks compute with artificial neurons that loosely imitate the way a biological brain works. Google's Statistical Machine Translation (SMT) methods impressed the world with their ability to translate words and short phrases with more or less acceptable accuracy in over 100 languages (currently 103, to be exact). But the newly implemented Neural Machine Translation (NMT) goes beyond this. Using deep-learning techniques, it first picks the translation variant that best fits the context of the whole sentence rather than just a limited phrase, and then adjusts it to match human speech and grammar as closely as possible (as demonstrated in the picture above).
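To make the difference between phrase-level and sentence-level choices a little more concrete, here is a toy Python sketch. It is not a neural network and has nothing to do with Google's actual system; the tiny lexicon, the scores and the function names are all invented for illustration. It only shows why looking at the rest of the sentence can change which translation of an ambiguous word wins.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: English word -> candidate Spanish translations
# with made-up prior scores. None of these numbers come from real data.
LEXICON: Dict[str, List[Tuple[str, float]]] = {
    "money": [("dinero", 1.0)],
    "river": [("río", 1.0)],
    "bank": [("banco", 0.8), ("orilla", 0.2)],
}

# Hypothetical context bonuses: how strongly another source word in the
# sentence supports a given candidate translation.
CONTEXT_BONUS: Dict[Tuple[str, str], float] = {
    ("river", "orilla"): 0.9,
    ("money", "banco"): 0.9,
}


def translate_in_isolation(words: List[str]) -> List[str]:
    """Pick the highest-prior candidate for each word, ignoring its neighbours."""
    return [max(LEXICON[w], key=lambda c: c[1])[0] for w in words]


def translate_with_context(words: List[str]) -> List[str]:
    """Pick each candidate by its prior plus bonuses from the other words."""
    result = []
    for i, word in enumerate(words):
        context = words[:i] + words[i + 1:]

        def score(candidate: Tuple[str, float]) -> float:
            target, prior = candidate
            bonus = sum(CONTEXT_BONUS.get((c, target), 0.0) for c in context)
            return prior + bonus

        result.append(max(LEXICON[word], key=score)[0])
    return result


if __name__ == "__main__":
    sentence = ["river", "bank"]
    print(translate_in_isolation(sentence))
    print(translate_with_context(sentence))
```

In this made-up setup, "bank" next to "river" comes out as "banco" when each word is chosen on its own, and as "orilla" once the neighbouring word is taken into account. A real NMT system learns such preferences from data instead of reading them from a hand-written table.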

How good can NMT get?

Naturally, neural networks get better over time as they learn, and Google's NMT still has a lot of learning to do before it can match professional human translation, mainly for inflected languages (seems like Latin and Greek will be the last to go). But the recent evolution of the technology shows how quickly machine translation is improving. Over the next few years, Google will likely keep refining its NMT results for all 103 languages covered and build the translation feature into the very DNA of "intelligent" online platforms and apps.

As it has now been 10 years since the introduction of Google Translate, it will be interesting to see where the service will be in another 10 years and how deeply it will affect the professional human translation industry. Will human translators become human editors with additional required specialization and skills by 2027?


