What is the state-of-the-art technique for Machine Translation?

  1. Rule-based machine translation (older technique): Uses a dictionary between the words of the two languages, together with syntactic, semantic, and morphological analysis of the source sentence, to determine context. Linguistic rules are then defined to translate a specific word in a given context into the target language. https://en.wikipedia.org/wiki/Rule-based_machine_translation

Advantages of this approach: No parallel corpora are required.

Disadvantages: The rules are hard to define and must be written separately for each language pair.

  2. Statistical Machine Translation: This technique works with parallel corpora that are aligned sentence by sentence (and sometimes word by word).

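The IBM models were very popular for a long time: a series of increasingly complex probabilistic models that learn word-to-word translation and alignment from such corpora. Basically, if S is the source sentence and T is the target sentence, we want to find the target sentence T that maximizes P(T|S). By Bayes' rule, this is the same as maximizing P(S|T) P(T), the product of a translation model and a language model over target sentences.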
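To make the idea of learning word-to-word translation probabilities concrete, here is a minimal sketch in the spirit of IBM Model 1, trained with expectation-maximization on a toy parallel corpus. The function name `train_ibm_model1`, the three-sentence corpus, and the iteration count are illustrative assumptions, and the sketch leaves out the NULL source word and everything the later IBM models add (distortion, fertility).

```python
def train_ibm_model1(pairs, iterations=20):
    """Estimate word-translation probabilities t[(target_word, source_word)].

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Simplified sketch: no NULL word, no smoothing (assumptions of this example).
    """
    # Start from a constant value for every co-occurring word pair.
    t = {}
    for src, tgt in pairs:
        for tw in tgt:
            for sw in src:
                t[(tw, sw)] = 1.0

    for _ in range(iterations):
        count = {key: 0.0 for key in t}  # expected pair counts
        total = {}                       # expected counts per source word
        # E-step: spread each target word over the source words of its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] = total.get(sw, 0.0) + frac
        # M-step: renormalize so the probabilities for each source word sum to 1.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


# Toy German-English corpus (hypothetical data, for illustration only).
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm_model1(pairs)
for (tw, sw), p in sorted(t.items(), key=lambda kv: (kv[0][1], -kv[1])):
    print(f"t({tw} | {sw}) = {p:.3f}")
```

Even though no word alignments are given, the co-occurrence pattern across sentences is enough for the estimates to favor pairs such as "das"/"the" and "Buch"/"book". A full statistical system would combine such a translation model with a language model and a decoder that searches for the best target sentence.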

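  3. Neural Machine Translation: This is the current state of the art and the technique used by Google Translate. It is based on sequence-to-sequence models, in which a neural network encodes the source sentence and then generates the target sentence word by word.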

https://arxiv.org/abs/1703.01619

https://arxiv.org/pdf/1409.0473.pdf

 
