Employment Guide: Machine Translation: A useful, but imperfect, tool

When Carol Bové ran a passage she uses in her class through an online translator, the software lost some things in translation.

“I spent four months in America is low, and in addition I traveled for pleasure and random occasions, there are huge areas of the new world, which I did not have any escape,” begins the Google Translate edition of Simone de Beauvoir’s “America Day by Day.”

Bové, a senior lecturer in Pitt’s Department of English who teaches a translation studies course, said that “is low” should have read “it’s a little bit,” and “escape” should have been translated as “glimpses.”

While software developers and professional translators agree that the flowery language of poetry and literature would be better left to human translators, they have more trouble agreeing on the best use of machine translation programs. Professionals are debating the accuracy of translation software, its effect on jobs and what texts it can and cannot translate.

In 2010, the U.S. Bureau of Labor Statistics estimated that the number of positions for foreign-language translators in the U.S. would increase by 42 percent over the next decade — three times the average for all industries in the U.S. job market. While exact numbers are unknown, there are probably hundreds of thousands of translators, interpreters and people who use language in their jobs.

While these professionals would probably all be able to tell the difference between “escape” and “glimpses” in their respective languages, they are needed less and less for jobs that involve translating chunks of text. Instead, machine translation, completed on computers, iPads and smartphones, is taking over those jobs.

Jaime Carbonell, the director of Carnegie Mellon University’s Language Technologies Institute, said translation software relies on having enough digitized text in both the original and the target language to generate the meaning that is most likely to be correct.

This means that for commonly written languages such as English, Chinese and Hindi, there is a lot of text for machine translators to use. But for less commonly used languages, there is consequently less text to guide machine translation programs toward a correct translation.

Using humans and using computers each has its own advantages.

“Machines tend to interpret technical words right, and humans tend to be the other way around,” Carbonell said.

While humans are unlikely to remember specific, technical vocabulary offhand unless they use it frequently, computers can store almost endless dictionary files. 

But machine translation programs struggle with sentence structure, especially when the two languages are structured very differently. For example, designing a program that can rewrite English in German or Spanish is easier than designing one that can rewrite it in Chinese.

Carbonell said that, in some cases, it is relatively easy for machine translation programs to define a word based on context. For instance, when the word “bank” appears next to “account,” it is likely that a translation program would know it refers to a financial institution rather than to a riverbank. But if such contextual clues are farther apart, computers have considerably more trouble.
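The "bank" example can be sketched in a few lines of Python. This is only an illustration of the idea Carbonell describes, not how any particular product works: the cue-word lists, the function name guess_sense and the three-word window are all assumptions made for the example.

```python
import re

# Hypothetical cue words for each sense of "bank".
# A real system would learn these from large amounts of text.
SENSE_CUES = {
    "financial institution": {"account", "deposit", "loan", "money"},
    "riverbank": {"river", "water", "shore", "fishing"},
}

def guess_sense(text, target="bank", window=3):
    """Pick the sense whose cue words appear within `window` words of `target`."""
    words = re.findall(r"[a-z]+", text.lower())
    if target not in words:
        return None
    i = words.index(target)
    # Only words close to the target are considered as context.
    nearby = set(words[max(0, i - window): i + window + 1])
    scores = {sense: len(cues & nearby) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

near = guess_sense("She opened a bank account yesterday.")
far = guess_sense(
    "The bank was quiet that morning, and only much later "
    "did she ask about her account."
)
```

In the first sentence the clue sits right beside the word, so the sketch settles on the financial sense. In the second, "account" falls outside the window and the program has nothing to go on, which is the difficulty with distant clues described above.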

But translating poems isn’t at the top of companies’ to-do lists. Computers can get a general idea of long passages almost instantly, while humans have to labor through them.

Carbonell said that while machine translation can’t replace human knowledge, the logical next step for companies is to use computers to do the grunt work while humans look over their shoulders and check the finished product.

“The real demand is for people who are translators and who have some skills in editing,” he said.

Bové wrote after her test of Google Translate that, although the program did get a great deal right, “a person with foreign language skills is needed in order to make the best use of machine translation for a complex text.”

Mark Cavanagh, the vice president of U.S. operations for the multinational translation agency Translate Media, said that machine translation programs are very good at sorting through large amounts of data to see what’s important. For instance, running legal documents through a machine translation program to find the relevant ones could save a human a great deal of trouble.

But for some things, computers have more trouble. For instance, even the best translation software can’t convey nuance. This means that machine translation would be almost useless in literature or poetry.

This also limits what it can do in advertising, where persuading the audience requires much more than simply translating for meaning.

“If your sentence has simile or sarcasm, it’s important to reflect that in the translation,” Cavanagh said.

Cavanagh said that Translate Media, which has a total of about 6,500 linguists worldwide, employs about 2,000 translators who have proved themselves reliable enough to check and edit others’ work. These editors also check translations produced by software. Translate Media uses machine translation on some of its projects, but only with the permission of clients. It also uses translation memory, which differs from machine translation in an important way.

In translation memory, a computer looks in the document for segments of text that have already been translated.

When a word, phrase or sentence in the new text matches a segment that has already been translated and stored, the translation memory program replaces that segment with the stored text in the target language.

In machine translation, by contrast, a computer creates its own translation based on rules, a statistical algorithm or a combination of both.
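The translation memory half of that distinction amounts to a lookup, as this minimal Python sketch suggests. The stored segments, the French strings and the name apply_memory are hypothetical and chosen only for illustration. Commercial tools typically also handle near matches, which this sketch does not attempt.

```python
# Hypothetical stored segments (English -> French), for illustration only.
MEMORY = {
    "Terms of use": "Conditions d'utilisation",
    "Contact us": "Contactez-nous",
}

def apply_memory(segments, memory):
    """Reuse a stored translation for each exact match; mark the rest as None."""
    return [(segment, memory.get(segment)) for segment in segments]

result = apply_memory(["Contact us", "Our new tablet ships in May"], MEMORY)
```

The first segment comes back with its stored translation. The second comes back empty, because a translation memory never invents text: that segment would go to a human translator or to a machine translation program.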

Digital Rosetta Stones

Some machine translation developers say that their programs are becoming good enough to translate a company’s image from one language to another.

Udi Hershkovich is the vice president of business development at Safaba Translation Solutions, which is based in Squirrel Hill.

Safaba, whose name combines the Hebrew words for “language” and “within” to communicate the idea that “language is within the machine,” develops programs for companies that translate their e-commerce sites, marketing literature, tech support and even internal emails into multiple languages.

The company, which was founded in 2009, is considered a spinoff of CMU’s Language Technologies Institute. Although Safaba is institutionally separate from the university, Hershkovich said that CMU owns a minority share of the company. Dell and PayPal are the only two Safaba clients that have been made public.

Safaba’s programs don’t just communicate meaning from one language to another, according to Hershkovich. Each program is designed specifically for a client to communicate the client’s brand, which relies heavily on word choice. Hershkovich used the corporate language that Apple employs on its website as an example of a very artistic image.

He pointed to a web page for one of Apple’s products as an example. On the page, Apple writes that the device includes “sensors to dim the screen in low-light conditions.”

“If I translate that into Russian, I could say, ‘dim the screen in low-light conditions’ in so many different ways,” Hershkovich said. “But they chose that way.”

If Safaba designed a program for Apple, Hershkovich said, it would favor such artistic phrases.

Safaba’s programs rely on statistical translation, in which the program uses previously translated text in the same pair of languages as a kind of Rosetta Stone to choose the translation that is most likely to be correct.
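A toy version of that statistical choice, in Python, might look like the sketch below. It is not Safaba's method, which the company did not describe in detail. It simply counts how often each candidate translation appeared in earlier work and picks the most frequent one. The phrase pairs and the name most_likely are assumptions made for the example.

```python
from collections import Counter

# Hypothetical phrase pairs (English -> Spanish), as if pulled from
# documents that human translators had already translated.
ALIGNED = [
    ("bank account", "cuenta bancaria"),
    ("bank account", "cuenta bancaria"),
    ("bank account", "cuenta de banco"),
    ("river bank", "orilla del río"),
]

def most_likely(phrase, pairs):
    """Return the most frequent past translation and its share of all matches."""
    counts = Counter(target for source, target in pairs if source == phrase)
    if not counts:
        return None  # no earlier translation to learn from
    translation, n = counts.most_common(1)[0]
    return translation, n / sum(counts.values())

choice = most_likely("bank account", ALIGNED)
unseen = most_likely("sonnet", ALIGNED)
```

Here "cuenta bancaria" wins because it appears in two of the three earlier translations. A phrase that never appeared before returns nothing, which is why such programs depend on having a large body of previously translated text.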

Hershkovich said that each client of Safaba employs translators who edit and correct the machine translator’s work. As they tweak the program, it learns how to produce translations tailored to the company’s image.

Hershkovich also predicted that advances in machine translation would drastically change the market and job description for translators. While many translators now start from scratch, they could instead take a shortcut by using a machine translation program to do most of the work for them.

Because even the best machine translation programs still need a human with skills in both languages to check the translation and make sure it’s worded correctly, human translators are not in danger of losing their jobs, Hershkovich said.

Instead, by letting machine translation programs translate text first and then editing it for mistakes, a human translator can produce between two and four times as much work.

But not all translation professionals agree.

A cleaner chainsaw

Kevin Hendzel, a professional translator who first started translating English and Russian more than 25 years ago, said that machine translation is ideally suited for wading through large amounts of information in an unfamiliar language. He compared it to using a chainsaw to clear a large area.

“It’s very fast for cutting down a lot of trees, but you’re not going to use it for brain surgery,” he said. “You can talk about building a faster chainsaw. You can talk about building a cleaner chainsaw. But it’s still a chainsaw.”

Hendzel insisted that it could be dangerous for a company to use machine translation for any of its own legal texts or any other sensitive material.

“Machine translation is not designed to be perfect,” Hendzel said. “It’s designed to be imperfect and fast.”

He added that translation software, even with human translators correcting it, would create a tremendous risk. This is especially true in scientific, financial or legal translations, where mistranslation creates severe liabilities.

Correcting a machine’s output, Hendzel said, would be like having a room full of eighth-graders each write a paragraph and then trying to edit the paragraphs into a complete article for publication. It would be easier to write a whole article from scratch.

“Don’t get me wrong, they’re really enthusiastic eighth-graders,” he said. “But they’re still eighth-graders.”

Hendzel said companies would be unwise to use machine translation for anything that could “endanger their brand,” such as legal, financial and marketing information.

Cavanagh agreed and said that Translate Media would “absolutely not” use software to translate any of its own legal documents. Cavanagh also said that humans can produce higher-quality translations without machine translation than by rewriting a computer’s awkward sentence structure.

Hendzel said that even the best-known machine translation companies, including Microsoft, which owns Bing Translator, and Google, rely on humans to translate those types of sensitive information for them.

Google did not answer multiple requests for comment. A spokesman for Microsoft, who declined to be named, would not say whether Microsoft uses its own programs to translate legal documents.

But if this is true, Safaba is an exception. Hershkovich said that the company plans to launch a marketing website translated by its own software by the end of this year. His company will contract with human translators to check the web pages first.

Try poetry 

Carbonell also insisted that machine translations have advanced to the point where they can be relied on for the services that a company would need, even if their work still requires careful editing by human translators.

“If you want an area where machine translation still fails to generate useful drafts, try poetry, where meter and rhyme and such matter greatly,” Carbonell wrote in an email.

Pitt News Staff
