Why Is Machine Translation the Best Upgrade CAT Tools Could Hope For?

Lately, I have been feeling quite overwhelmed by the Translation Studies programme I’ve been following since last September. Not because there is a lot to do – and there is – but because it has left me feeling quite obsolete and… limited. If we take a look at the translation market today, three elements stand out:

  1. The ever-growing amount of data
  2. The progress of artificial intelligence (AI)
  3. The downward pressure on prices

Considering these factors, it’s easy to see that our limited biological bodies will never be able to keep up. Did you know that over 50 million webpages are translated by machines every day? Call me bold, but I think even Luc Besson’s Lucy would struggle to reach such levels of productivity. That’s why today’s topic is neural machine translation, the current state of the art in MT.

Neural MT (NMT) is the result of decades of trial and error in the field of autonomous translation, i.e. machines that can translate without the help of human beings, such as DeepL or Google Translate. I haven’t even completed half of my master’s degree yet, so perhaps the proposition I’m about to formulate is already a reality somewhere, but I am convinced that incorporating MT engines into CAT tools could be the greatest upgrade the latter could ever hope for.

To prove my point, let me briefly explain what NMT consists of.

First things first: there are three main types of MT – rule-based, statistical and neural (ranked from oldest to newest and from least to most autonomous). Statistical and neural MT work from huge corpora, and their outputs rely mostly on probability calculations, so they do NOT understand the texts per se; it is more an exercise in deduction. I don’t like to imply that computers function like humans, but their behaviour recalls that of any language learner in a foreign country. You know those words or phrases you have always heard in a specific context, whose definition you never bothered to ask about, and that you keep using because they feel right? Well, that is, in a nutshell, how NMT works. NMT is basically a drunk English learner at a party in Australia confidently telling the person he or she is trying to impress: “It was fair dinkum madness!”
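
To make the “probability, not understanding” point concrete, here is a deliberately naive Python sketch (the words and counts are all invented, and real statistical MT systems are vastly more sophisticated): the “best” translation is simply the one the corpus counts make most frequent.

```python
# Toy sketch of the statistical idea: pick the translation that the
# (invented) corpus counts make most probable. No understanding
# involved, only frequency.
counts = {
    "ring": {"bague": 70, "anneau": 20, "sonnerie": 10},
}

def most_likely(source_word):
    candidates = counts[source_word]
    total = sum(candidates.values())
    # Keep the highest-count candidate and express it as a probability.
    best = max(candidates, key=candidates.get)
    return best, candidates[best] / total

print(most_likely("ring"))  # ('bague', 0.7)
```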

To be able to guess which word to suggest and when, NMT relies on huge databases as well as a deep learning method called recurrent neural networks (RNNs), i.e. networks of cells that communicate with one another to determine which word or sequence from the training data fits best.
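
If you’re curious what that “communication between cells” looks like, here is a minimal, made-up recurrent cell in Python (numpy only, random weights, nothing trained): the point is just that each word updates a hidden state that also carries everything the network has “read” so far.

```python
import numpy as np

# Minimal untrained recurrent cell; sizes and weights are arbitrary.
rng = np.random.default_rng(0)
hidden_size, embed_size = 8, 4
W_x = rng.normal(size=(hidden_size, embed_size))   # input weights
W_h = rng.normal(size=(hidden_size, hidden_size))  # recurrent weights

def rnn_step(x, h):
    # The new hidden state depends on the current word AND on
    # everything seen before it, carried in h.
    return np.tanh(W_x @ x + W_h @ h)

h = np.zeros(hidden_size)
sentence = [rng.normal(size=embed_size) for _ in range(5)]  # 5 fake word vectors
for x in sentence:
    h = rnn_step(x, h)
print(h.shape)  # (8,) — a running summary of the whole sequence
```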

During the learning phase, the AI is fed vast amounts of data through aligned corpora, termbases, sets of rules (e.g. grammar and syntax) and so on, in order to “master” the source and target languages. The notion of error is also built into the system, and it is then used to improve the target sentence: a source word or sequence is associated with its correct translation in the target language, and this association becomes a reference that the AI will try to approximate. Before we go any further, it has to be clear that the algorithms do not process words as such, but rather as vectors, i.e. sets of numbers that represent them.
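
Here is what those vectors might look like in a toy example (all the numbers are invented; real embeddings have hundreds of learned dimensions): words with related uses end up with similar vectors, and that similarity can be measured, for instance with the cosine of the angle between them.

```python
import numpy as np

# Invented 3-dimensional "embeddings" for illustration only.
vectors = {
    "ring":     np.array([0.9, 0.1, 0.3]),
    "alliance": np.array([0.8, 0.2, 0.4]),
    "sonnerie": np.array([0.1, 0.9, 0.2]),
}

def cosine(a, b):
    # 1.0 means identical direction; values near 0 mean unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["ring"], vectors["alliance"]))  # high: close in "meaning"
print(cosine(vectors["ring"], vectors["sonnerie"]))  # much lower
```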

So, for instance, say the English word “ring” is associated with the French words “anneau”, “alliance”, “bague” and “sonnerie”, but only with “alliance” when it comes after “wedding”.

“Wedding + ring = alliance” then becomes a reference, and every time the algorithm proposes any other translation for “ring”, it compares it to “alliance” and adjusts its solution until the value obtained matches that of “alliance”, or at least comes closer to it than any other candidate does.
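
As a rough illustration of that comparison step (the vectors and distances below are invented, and real engines use far subtler scoring), the candidate whose vector lands closest to the reference wins:

```python
import numpy as np

reference = np.array([0.8, 0.2, 0.4])   # (invented) vector for "alliance"
candidates = {
    "bague":    np.array([0.6, 0.5, 0.1]),
    "anneau":   np.array([0.5, 0.4, 0.3]),
    "alliance": np.array([0.8, 0.2, 0.4]),
}

def closest_to_reference(cands, ref):
    # Pick the candidate with the smallest distance to the reference.
    return min(cands, key=lambda w: np.linalg.norm(cands[w] - ref))

print(closest_to_reference(candidates, reference))  # alliance
```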

Another critical component of AI is its flexibility: every time a mistake is corrected or a translation is deemed accurate, the probability of that solution increases, so the next time the algorithm has to translate a similar sequence, it will be more likely to reuse it.
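
A deliberately naive sketch of that feedback loop (the probabilities and the update rule are made up): each confirmed translation gets a nudge upwards, and everything is rescaled so the probabilities still sum to one.

```python
probs = {"bague": 0.5, "anneau": 0.3, "alliance": 0.2}

def reinforce(word, rate=0.1):
    # Reward the confirmed translation...
    probs[word] += rate
    # ...then renormalise so all probabilities still sum to 1.
    total = sum(probs.values())
    for w in probs:
        probs[w] = round(probs[w] / total, 3)

reinforce("alliance")   # a reviewer confirms "alliance"
print(probs)  # "alliance" is now more probable than before
```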

Apart from their ability to correct themselves and learn, one last specificity of RNNs I’d like to highlight is how they handle context, thanks to a specific kind of memory called Long Short-Term Memory (LSTM). While almost any algorithm (even a statistical one) can take the nearest one or two words in the sequence into account, the type of RNN used in NMT can remember words located much further away in the sentence, whether at the beginning or towards the end.
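
You don’t have to build an LSTM by hand to see this: here is a tiny, untrained example using PyTorch’s off-the-shelf LSTM layer (the sizes are arbitrary), where the state returned after the last word still depends on the very first one.

```python
import torch

# An LSTM layer carrying information across a whole sentence.
lstm = torch.nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
sentence = torch.randn(1, 12, 4)   # 1 sentence, 12 "words", 4 numbers each
outputs, (h_n, c_n) = lstm(sentence)
print(outputs.shape)  # torch.Size([1, 12, 8]) — one state per word
print(h_n.shape)      # torch.Size([1, 1, 8]) — summary of the full sentence
```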

For a live demonstration of this process, I recommend submitting the following sentence to DeepL from English to French (no copy/paste – you have to type every word to see the magic!): “I asked my brother to bring the rings for my mother’s best friend’s wedding”. Did you notice how “bagues” becomes “alliances” once you type “wedding”? If that is not proof that AI is one of the translator’s most valuable tools, I don’t know what is!

DeepL when you still haven’t made clear whether the ring is for your significant other or your mum

Now that you have an idea of how powerful these tools are, imagine being able to use them alongside CAT tools such as SDL Trados Studio (if these words don’t ring a bell, my colleague Emmy Tournier will tell you more about them). As far as I’m concerned, the hardest part of any translation activity is crafting sentences while working from the source text alone. I reckon my work sessions would be so much more productive and efficient if I could just take any text, open it in SDL Trados Studio and translate it with DeepL directly within Studio’s interface. I’d just have to make sure the terminology is correct and consistent, and check whether the translation memory offers better solutions – all of this on one screen! To conclude: now that pretty much all translators have accepted CAT tools as a precious productivity booster, we might as well accept that NMT is a small miracle too and give it a warm welcome!

