CAT Tools are what happened after we woke up from the perfect machine translation dream
The Dream of Perfect Computer Translation
To understand computer-aided translation (CAT) tools, think of them as a partial solution to our unrealized dream of perfect machine translation (MT). By the 1990s, attempts to create a “universal translator” had failed, even in our imagined science-fiction future:
Sure, to an alien, “human beings” are “bags of mostly water”—but that’s not all we are!
In the post-WWII era, scientists and mathematicians like Warren Weaver believed that we could solve the problem of translation in the same way that we had deciphered enemy texts during the war. The solution, he said, was not a one-to-one replacement of single words, but replacement in context:
“… it is obviously impossible to determine, one at a time, the meaning of words… until one can see not only the central word in question but also see [a set of] words on either side, then, if [the set] is large enough one can unambiguously decide the meaning …”
Weaver thought that a set of four or five words surrounding a particular word would be enough to establish context for an accurate translation. Establishing “context” turned out to be much harder than he expected.
Now we know that translation systems like the ones Weaver envisioned often do not accurately capture meaning. Relying on even the most state-of-the-art MT is risky: it could result in financial harm, loss of reputation, and even death.
The Rise of the CAT Tool
As they realized the limitations of MT, developers shifted their focus from replacing human beings to augmenting them. Human translators discern and translate meaning, while computers do what they do best: compare, store, and retrieve data.
The Translation Memory (Augments Translator Efficiency/Capacity)
The earliest and still most critical development in CAT technology was the creation of a simple database for translators to draw from: the translation memory (TM). What makes a TM different from a dictionary is the length of the units, or segments, that it saves and sorts. As a general rule, segments are full sentences. TMs do not store individual words, as a foreign language dictionary does, nor do they simply store a few surrounding words, as Warren Weaver dreamed. Instead, an entire sentence in the source language is paired with its faithful translation in the target language.
When a translator is working with a CAT tool, the basic interface presents rows of segments broken into two columns: source in the left column and target in the right column. The translator fills in empty target segments until the translation has been completed. When the TM has a match for a particular segment, that segment is already filled in for the translator. This is known as “pre-translation.” The more matches in the TM, the more pre-translation, thus increasing the translation capacity for an individual translator.
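The pre-translation step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the TM is modeled as a plain dictionary of exact sentence pairs, and the sample sentences and the `pre_translate` function are made up for this example. Real CAT tools use their own storage formats and matching logic.

```python
# Hypothetical translation memory: full source sentences paired with
# their stored translations (sample data invented for illustration).
translation_memory = {
    "Press the power button.": "Appuyez sur le bouton d'alimentation.",
    "Close the cover.": "Fermez le couvercle.",
}


def pre_translate(source_segments, tm):
    """Build (source, target) rows; target is None when the TM has no exact match."""
    return [(segment, tm.get(segment)) for segment in source_segments]


document = [
    "Press the power button.",
    "Wait for the green light.",
    "Close the cover.",
]

rows = pre_translate(document, translation_memory)

# Print the two-column view: source on the left, target on the right.
for source, target in rows:
    print(f"{source:<28} | {target or ''}")
```

In this toy document, two of the three rows arrive already filled in, and the translator only has to work on the middle one.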
This works wonderfully for translating and updating standardized documents, for example, legal/corporate documents, patent applications, software updates, and technical manuals.
Beyond the TM: Quality Assurance
A major strength of CAT tools is their quality assurance (QA) capability. This includes an array of functions to compensate for human error: typos and spelling errors, accidental formatting changes, deviations from approved terminologies (medical, industrial, legal), and transposed numbers. For these tasks, the computer excels where humans struggle. For example, strict numeric comparisons across large data sets can be mind-numbing for a human proofreader, but CAT tools reveal in an instant whether the translator typed 892 in the target column instead of 829.
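To make the number check concrete, here is a minimal sketch of how such a comparison might work. The `number_mismatches` function and the sample sentences are assumptions made for this example, not the logic of any particular CAT tool. Note that this naive version would also flag numbers that are legitimately reformatted for the target locale (for example, 1,000 versus 1.000).

```python
import re
from collections import Counter

# Matches digit runs, optionally joined by "." or "," (e.g. 829, 3.5, 1,000).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def number_mismatches(source, target):
    """Return (missing, unexpected): numbers absent from the target,
    and numbers in the target that the source does not contain."""
    source_numbers = Counter(NUMBER.findall(source))
    target_numbers = Counter(NUMBER.findall(target))
    missing = sorted((source_numbers - target_numbers).elements())
    unexpected = sorted((target_numbers - source_numbers).elements())
    return missing, unexpected


missing, unexpected = number_mismatches(
    "Tighten to 829 Nm within 5 minutes.",
    "Serrer à 892 Nm dans les 5 minutes.",
)
print("Missing from target:", missing)
print("Unexpected in target:", unexpected)
```

Here the check reports that 829 is missing from the target and that 892 appears there unexpectedly, which is exactly the kind of transposition a tired proofreader can miss.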
How the Client Benefits from LSPs’ Adoption of CAT Tools
There are three key benefits of well-managed CAT tools that flow directly to the client: cost savings, speed, and quality.
Before we quote a price for a new job, we use a CAT tool to analyze the files to be translated. We are looking for at least two things: 1) repetitions of sentences or other freestanding text segments within a document and across a set of documents, and 2) matches between document segments and segments already in the translation memory, which exist if previous translations for the same language pair, client, and subject domain were performed with a CAT tool. Those matches can be exact or “fuzzy”: a fuzzy match is a stored segment that is similar to, but not precisely the same as, the source segment in need of translation. Repetitions and TM matches can lead to substantial discounts for the client because they allow the new documents to be partially pre-translated. Robust translation memories can reduce the billable word count of a job by 25% or more.
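A rough sketch of this kind of analysis follows. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for fuzzy matching; commercial CAT tools use their own scoring methods. The rate bands, the repetition rate, the function names, and the sample sentences are all assumptions chosen for illustration, not anyone's actual pricing.

```python
from difflib import SequenceMatcher

# Hypothetical rate bands: (minimum similarity, fraction of the full per-word rate).
# Real agencies and tools set their own thresholds and discounts.
RATE_BANDS = [(1.0, 0.25), (0.75, 0.5), (0.0, 1.0)]
REPETITION_RATE = 0.25  # also an assumption


def similarity(a, b):
    """Word-level similarity between two segments, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_score(segment, tm_sources):
    """Highest similarity between the segment and any TM source segment."""
    return max((similarity(segment, s) for s in tm_sources), default=0.0)


def weighted_word_count(segments, tm_sources):
    """Sum of word counts, each scaled by the rate for its match type."""
    seen = set()
    total = 0.0
    for segment in segments:
        words = len(segment.split())
        if segment in seen:
            rate = REPETITION_RATE  # repeated within the job
        else:
            score = best_score(segment, tm_sources)
            rate = next(r for floor, r in RATE_BANDS if score >= floor)
            seen.add(segment)
        total += words * rate
    return total


tm_sources = ["Press the power button.", "Close the cover."]
job = [
    "Press the power button.",      # exact match
    "Press the red power button.",  # fuzzy match
    "Wait for the green light.",    # no useful match
    "Wait for the green light.",    # repetition
]

raw_words = sum(len(s.split()) for s in job)
print("Raw word count:", raw_words)
print("Weighted word count:", weighted_word_count(job, tm_sources))
```

Under these made-up bands, the weighted count comes out well below the raw count, which is the mechanism behind the discounts described above.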
With the aid of a CAT tool, a translator can complete whole projects in much shorter time frames. Translators can quickly review or skip over pre-translated content, allowing them to work through documents more efficiently. In addition, when the translator translates a new segment that appears throughout the document or across a batch of documents, the CAT tool will auto-propagate that translation wherever the source segment appears and save it in the TM for future jobs.
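Auto-propagation can be pictured roughly as follows. This is a simplified sketch: the `confirm_segment` function, the row layout, and the sample sentences are hypothetical, and real tools offer more options (for example, asking before overwriting rows that already have a translation).

```python
def confirm_segment(rows, tm, index, translation):
    """Confirm one row, store the pair in the TM, and fill every
    still-empty row that has the same source text."""
    source, _ = rows[index]
    tm[source] = translation  # saved for future jobs
    for i, (src, tgt) in enumerate(rows):
        if src == source and (i == index or tgt is None):
            rows[i] = (src, translation)


# (source, target) rows; None marks a segment not yet translated.
rows = [
    ("Wait for the green light.", None),
    ("Close the cover.", "Fermez le couvercle."),
    ("Wait for the green light.", None),
]
tm = {"Close the cover.": "Fermez le couvercle."}

# The translator translates the first row once...
confirm_segment(rows, tm, 0, "Attendez le voyant vert.")

# ...and the identical third row is filled in automatically.
print(rows[2])
print(tm["Wait for the green light."])
```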
CAT tools let the translator focus on translation rather than on the formatting of the files they have been asked to translate. The stronger CAT tools, such as SDL Trados Studio and memoQ, are capable of filtering out the formatting and other tags (from HTML, XML, or design-program files such as Adobe InDesign) that are critical to the final document. QA tools can compare the formatting of source and target and call any discrepancies to the translator’s attention. In addition, when standardized terminology is necessary throughout the text, the tools implement glossaries to ensure consistency.
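The tag and terminology checks mentioned above might look something like the sketch below. Both functions, the glossary, and the sample segment are assumptions made for illustration. The glossary check in particular is deliberately naive: it does plain substring matching and ignores inflected forms, which real terminology tools handle far more carefully.

```python
import re
from collections import Counter

# Matches simple HTML/XML-style tags such as <b>, </b>, or <a href="...">.
TAG = re.compile(r"</?[A-Za-z][^>]*>")


def tag_discrepancies(source, target):
    """Return (missing, extra): tags lost from or added to the target."""
    source_tags = Counter(TAG.findall(source))
    target_tags = Counter(TAG.findall(target))
    missing = sorted((source_tags - target_tags).elements())
    extra = sorted((target_tags - source_tags).elements())
    return missing, extra


def glossary_violations(source, target, glossary):
    """Return (source_term, approved_term) pairs where the source uses
    a glossary term but the approved translation is absent from the target."""
    return [
        (term, approved)
        for term, approved in glossary.items()
        if term.lower() in source.lower()
        and approved.lower() not in target.lower()
    ]


# Hypothetical client glossary and one segment pair with two problems:
# the closing </b> tag was dropped, and "Save" was not rendered as approved.
glossary = {"Save": "Enregistrer", "file": "fichier"}
source = "Click <b>Save</b> to store the file."
target = "Cliquez sur <b>Sauvegarder pour stocker le fichier."

print(tag_discrepancies(source, target))
print(glossary_violations(source, target, glossary))
```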
The Future of CAT tools in Language Services
Although machine translation cannot replace human translators, the enhancement of the translation process with CAT tools is very much a reality today.
As the world becomes more comfortable working in multiple environments across devices, new forms of CAT tools are beginning to appear, many of which take advantage of the possibilities of cloud-based architectures. This new phase of collaborative translation technologies offers much hope for the future, but the transition is not a straightforward one. If we lose sight of the human elements of translation, relying too much on the latest technologies in translation, quality suffers.
Translators have unique skills and come at the translation process in their own ways. Many cloud technologies today force translators to work in prescribed ways. Some translators adapt readily to such changes; others are more resistant. Further, when new technologies offer too much information to the translator, or when the utility of that information drops below a certain level, the tool becomes a distraction rather than an enhancement. Not every new translation tool is a step forward toward the real goal of assisting translators. There must always be a balance. MTM LinguaSoft has built its business on respecting the art of translation and ensuring that technology does not get in the way.
CAT tools have led to great improvements in translator efficiency and accuracy and in cost savings, providing significant benefit to the client. Their utility in the translation process will continue to grow. Nevertheless, human translators remain the foundation of a quality translation process. Language Service Providers (LSPs) strive to stay informed about new technological solutions, while being careful not to adopt new technologies too quickly, without first considering the core human relations upon which true understanding is built.