|Type of site||Machine translation|
Before October 2007, for languages other than Arabic, Chinese and Russian, Google Translate was based on SYSTRAN, a software engine which is still used by several other online translation services such as Yahoo! Babel Fish and AOL. Since October 2007, Google Translate has used proprietary, in-house technology based on statistical machine translation instead.
On May 26, 2011, Google announced that the Google Translate API for software developers had been deprecated and would cease functioning on December 1, 2011, "due to the substantial economic burden caused by extensive abuse." Because the API was used in numerous third-party websites, this decision led some developers to criticize Google and question the viability of using Google APIs in their products. In response to public pressure, Google announced on June 3, 2011, that the API would continue to be available as a paid service.
Google Translate offers a web interface, mobile interfaces for Android and iOS, and an API that developers can use to build browser extensions, applications, and other software. For some languages, Google Translate can pronounce translated text, highlight corresponding words and phrases in the source and target text, and act as a simple dictionary for single-word input. If "Detect language" is selected, text in an unknown language can be identified.
In the web interface, users can suggest alternate translations, such as for technical terms, or correct mistakes. These suggestions are included in future updates to the translation process. If a user enters a URL in the source text, Google Translate will produce a hyperlink to a machine translation of the website. For some languages, text can be entered via an on-screen keyboard, handwriting recognition, or speech recognition.
Google Translate is available in some browsers as an extension which can translate websites.
Google Translate is available as a free downloadable application for Android users. The first version was launched in January 2010. It works much like the browser version. Google Translate for Android offers two main options: "SMS translation" and "History".
An early 2011 version supported Conversation Mode when translating between English and Spanish (in alpha testing). This interface within Google Translate allows users to communicate fluidly with a nearby person in another language. In October 2011 it was expanded to 14 languages.
The application supports 53 languages and voice input for 15 languages. It is available for devices running Android 2.1 and above and can be downloaded by searching for "Google Translate" in Google Play. It was first released in January 2010, with an improved version available on January 12, 2011.
Latest version: 2.0.0 build 42.
In August 2008, Google launched a Google Translate HTML5 web application for iOS for iPhone and iPod Touch users. The official iOS app for Google Translate was released February 8, 2011. It accepts voice input for 15 languages and allows translation of a word or phrase into one of more than 50 languages. Translations can be spoken out loud in 23 different languages.
Google Translate, like other automatic translation tools, has its limitations. The service limits the number of paragraphs and the range of technical terms that can be translated, and while it can help the reader to understand the general content of a foreign language text, it does not always deliver accurate translations. Some languages produce better results than others. Google Translate performs well especially when English is the target language and the source language is one of the languages of the European Union. A 2010 analysis indicated that French to English translation is relatively accurate, and 2011 and 2012 analyses showed that Italian to English translation is relatively accurate as well. However, if the source text is shorter, rule-based machine translations often perform better; this effect is particularly evident in Chinese to English translations. While edits of translations may be submitted, in Chinese specifically one is not able to edit sentences as a whole. Instead, one must edit sometimes arbitrary sets of characters, leading to incorrect edits.
Texts written in the Greek, Devanagari, Cyrillic and Arabic scripts can be transliterated automatically from phonetic equivalents written in the Latin alphabet. The browser version of Google Translate provides a "read phonetically" option for Japanese to English conversion. The same option is not available in the paid API version.
Many of the more popular languages have a "text-to-speech" audio function that can read back a text in that language, up to a few dozen words or so. In the case of pluricentric languages, the accent depends on the region. For English, the audio uses a female General American accent in the Americas, most of the Asia-Pacific and West Asia, whereas a female British English accent is used in Europe, Hong Kong, Malaysia, Singapore, Guyana and all other parts of the world, except for a special Oceania accent used in Australia, New Zealand and Norfolk Island. For Spanish, a Latin American Spanish accent is used in the Americas, while a Castilian Spanish accent is used in the other parts of the world. Portuguese uses a São Paulo accent throughout the world, except in Portugal, where the native accent is used. Some languages use the open-source eSpeak synthesizer for their speech.
Google Translate does not apply grammatical rules, since its algorithms are based on statistical analysis rather than traditional rule-based analysis. Indeed, the system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of statistical approaches. The system is based on a method called statistical machine translation and, more specifically, on research by Och, who won the DARPA contest for speed machine translation in 2003. He is now the head of Google's machine translation group.
Google often does not translate directly from one language to another (L1 → L2), but instead translates first to English and then to the target language (L1 → EN → L2). However, because English, like all human languages, is ambiguous and depends on context, this can cause translation errors. For example, translating vous from French to Russian gives vous → you → ты OR Вы/вы. If Google were using an unambiguous, artificial language as the intermediary, it would be vous → you → Вы/вы OR tu → thou → ты. Such a suffixing of words disambiguates their different meanings. Hence, publishing in English, using unambiguous words, providing context, or using expressions such as "you all" often makes for a better one-step translation.
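The loss of information through an English pivot can be sketched with a toy example. The dictionaries below are hypothetical and hand-made for illustration; they are not Google's data or method, only a minimal model of chaining two word tables.

```python
# Toy illustration of pivot translation (L1 -> EN -> L2).
# All dictionaries are hypothetical, hand-made examples.

FR_TO_EN = {"vous": ["you"], "tu": ["you"]}   # both French pronouns collapse to "you"
EN_TO_RU = {"you": ["ты", "вы"]}              # English "you" is ambiguous in Russian

# A hypothetical direct French-to-Russian table keeps the distinction.
FR_TO_RU = {"vous": ["вы"], "tu": ["ты"]}


def pivot_translate(word, src_to_pivot, pivot_to_tgt):
    """Return every target candidate reachable through the pivot language."""
    candidates = []
    for pivot_word in src_to_pivot.get(word, []):
        for target_word in pivot_to_tgt.get(pivot_word, []):
            if target_word not in candidates:
                candidates.append(target_word)
    return candidates


if __name__ == "__main__":
    # Through English, "vous" and "tu" yield the same ambiguous candidate list.
    print(pivot_translate("vous", FR_TO_EN, EN_TO_RU))
    print(pivot_translate("tu", FR_TO_EN, EN_TO_RU))
    # The direct table gives a single candidate for "vous".
    print(FR_TO_RU["vous"])
```

In this sketch the pivot step cannot tell the formal and informal pronouns apart, because both arrive at the same English word before the second lookup.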
The following languages do not have a direct Google translation to or from English. These languages are translated through the indicated intermediate language (which in all cases is closely related to the desired language but more widely spoken) in addition to through English:
According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than a million words, and two monolingual corpora each of more than a billion words. Statistical models from these data are then used to translate between those languages.
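One common textbook formulation of statistical machine translation, the noisy-channel model, suggests why both kinds of corpus are needed. It is shown here as an illustrative assumption; the text above does not state which model Google uses. Here f is the source sentence and e is a candidate translation.

```latex
\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\, P(e)
```

In this formulation, the translation model P(f | e) would be estimated from the bilingual corpus, and the language model P(e) from a monolingual corpus in the target language.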
To acquire this huge amount of linguistic data, Google used United Nations documents. The UN typically publishes documents in all six official UN languages, which has produced a very large 6-language corpus.
Google representatives have been involved with domestic conferences in Japan where Google has solicited bilingual data from researchers.
When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation. By detecting patterns in documents that have already been translated by human translators, Google Translate makes intelligent guesses (AI) as to what an appropriate translation should be.
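The idea of choosing a translation from patterns in previously translated text can be sketched as simple counting. The phrase pairs below are hypothetical, and the relative-frequency rule is a deliberately minimal stand-in, not Google's actual algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made (English, French) pairs standing in for
# documents already translated by human translators.
ALIGNED_PAIRS = [
    ("bank", "banque"),
    ("bank", "banque"),
    ("bank", "rive"),
    ("house", "maison"),
]


def build_table(pairs):
    """Count how often each source phrase was translated as each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def best_translation(table, source):
    """Return (most frequent target, its relative frequency), or None if unseen."""
    counts = table.get(source)
    if not counts:
        return None
    target, count = counts.most_common(1)[0]
    return target, count / sum(counts.values())


if __name__ == "__main__":
    table = build_table(ALIGNED_PAIRS)
    print(best_translation(table, "bank"))
    print(best_translation(table, "river"))
```

Because the choice depends only on how often a pairing was observed, a frequent but contextually wrong pairing can win, which is one way purely statistical guesses go astray.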
|Language||WordNet||License|
|Albanian||Albanet||CC-BY 3.0/GPL 3|
|Arabic||Arabic Wordnet||CC-BY-SA 3|
|Hindi||IIT Bombay Wordnet||Indo Wordnet|
|Farsi/Persian||Persian Wordnet||Free to Use|
|French||WOLF (WOrdnet Libre du Français)||CeCILL-C|
|Catalan||Multilingual Central Repository||CC-BY-3.0|
|Galician||Multilingual Central Repository||CC-BY-3.0|
|Spanish||Multilingual Central Repository||CC-BY-3.0|
Shortly after launching the translation service, Google won an international competition for English–Arabic and English–Chinese machine translation.
Because Google Translate uses statistical matching to translate rather than a dictionary and grammar rules approach, translated text can often include apparently nonsensical and obvious errors, such as swapping common terms for similar but nonequivalent common terms in the other language, or inverting the meaning of a sentence. Also, for speech output, it uses only European French and only Latin American Spanish worldwide, but both European and Brazilian Portuguese (European for translate.google.pt and Brazilian for all other Google Translate sites).
Google has been accused of sexism due to the statistical assignment of gender when translating from or through English into languages where verbs are conjugated by gender. For example, the phrase "I drive" used to be translated into a masculine conjugation, while "I cook" was translated into a feminine conjugation, due to the higher occurrence of such forms in corpora. Following public criticism in Israel, Google manually fixed some apparent cases of sexist translation into Hebrew by using the masculine form for all verbs.