How does it work?

Yandex.Translate is a statistical machine translation system. It translates individual words, complete texts, and web pages. It is available as a web service and a mobile application, and is also built into other Yandex products; for example, it translates web pages in Yandex.Browser.

Yandex.Translate has an automated dictionary, which sets it apart from the few similar services that exist. The technology, developed by a Yandex team of linguists and programmers, combines current statistical machine translation approaches with traditional linguistic tools.

Yandex machine translation is based on the statistical approach. To learn a language, the system compares hundreds of thousands of parallel texts, that is, texts that are sentence-by-sentence translations of each other. The system has two main components: the translation model and the language model.

The translation model constructs a graph containing all the possible ways to translate a sentence. The language model then selects the best translation: the candidate whose word combinations are the most natural in the target language.
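The interplay of the two models can be illustrated with a minimal sketch. It is not Yandex's implementation: the word options, probabilities, and function names below are hypothetical, the "graph" is reduced to a word-by-word list of alternatives, and the language model is a tiny bigram table.

```python
import itertools
import math

# Hypothetical toy data: for each source word, the candidate translations
# and their translation-model probabilities.
TRANSLATION_OPTIONS = [
    {"powerful": 0.6, "strong": 0.4},
    {"tea": 1.0},
]

# Hypothetical bigram probabilities standing in for the language model.
BIGRAM_PROB = {
    ("strong", "tea"): 0.5,
    ("powerful", "tea"): 0.01,
}
UNSEEN_PROB = 0.001  # assumed fallback for word pairs missing from the table


def candidates(options):
    """Yield every path through the options as (words, translation log-probability)."""
    for combo in itertools.product(*(o.items() for o in options)):
        words = tuple(word for word, _ in combo)
        log_prob = sum(math.log(prob) for _, prob in combo)
        yield words, log_prob


def lm_log_prob(words):
    """Score a word sequence by the log-probabilities of its adjacent word pairs."""
    return sum(
        math.log(BIGRAM_PROB.get(pair, UNSEEN_PROB))
        for pair in zip(words, words[1:])
    )


def best_translation(options):
    """Return the candidate with the highest combined score of both models."""
    words, _ = max(candidates(options), key=lambda c: c[1] + lm_log_prob(c[0]))
    return words


print(best_translation(TRANSLATION_OPTIONS))
```

In this toy data the translation model alone prefers "powerful", but the language model knows that "strong tea" is the natural combination, so the combined score selects it.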

The translation model learns from extensive bilingual parallel corpora. The language model is built from large monolingual corpora and contains the language's most frequent n-word combinations (n-grams), where n ranges from 1 to 7 (usually 5).
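As a rough illustration of how n-word combinations can be collected from a monolingual corpus, here is a minimal sketch. The function name, the whitespace tokenization, and the two-sentence corpus are assumptions made for the example; a real system works with far larger corpora and keeps only the most frequent combinations.

```python
from collections import Counter


def ngram_counts(sentences, max_n=5):
    """Count every n-word combination for n from 1 to max_n."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive tokenization, for illustration only
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical miniature corpus.
corpus = ["the cat sat on the mat", "the cat ate"]
counts = ngram_counts(corpus, max_n=3)

# The most frequent combinations are the ones a language model would keep.
for ngram, count in counts.most_common(3):
    print(" ".join(ngram), count)
```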

Yandex uses the BLEU metric to automatically evaluate the quality of machine translation; it measures the percentage of n-grams (n<=4) that match between the machine translation and a reference translation of a sentence. In manual evaluation, translations are usually rated on two factors, adequacy and fluency, using a 5-point scale.
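The n-gram matching at the core of BLEU can be sketched as follows. This is a simplified illustration with hypothetical function names and example sentences, not the evaluation code Yandex uses: it computes only the per-n share of matching n-grams, whereas the full BLEU score also combines these values with a geometric mean and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a correct word does not inflate the score.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical example sentences.
machine = "the cat sat on a mat"
reference = "the cat sat on the mat"

for n in range(1, 5):
    print(n, round(ngram_precision(machine, reference, n), 3))
```

For this pair, 5 of the 6 single words match, but only 1 of the 3 four-word combinations does, which shows why longer n-grams are a stricter test of word order and phrasing.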