- Rule-based machine translation (older technique): Uses a bilingual dictionary together with morphological, syntactic, and semantic analysis of the source sentence to establish context. Hand-written linguistic rules then specify how a given word in a given context is translated into the target language. https://en.wikipedia.org/wiki/Rule-based_machine_translation
Advantages of this approach: No parallel corpora are required.
Disadvantages: The rules are hard to define for every language pair.
- Statistical machine translation: This technique works with parallel corpora that are aligned sentence by sentence (and sometimes word by word).
The IBM models, popular for a long time, are a series of increasingly complex probabilistic models that learn word-to-word translation and alignment probabilities. Basically, if S is the source sentence, we want to find the target sentence T that maximizes P(T|S).
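The word-to-word learning idea can be illustrated with a minimal sketch in the spirit of IBM Model 1, the simplest model in the series. It is a simplification under stated assumptions: there is no NULL word, the toy corpus is made up for illustration, and the table `t` stores the probability of a target word given a source word. The function name `train_ibm_model1` is hypothetical, not taken from any library.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t(target | source) with EM.

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict keyed by (target_word, source_word).
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start from a uniform distribution over target words.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word
        # E-step: spread each target word over the source words in its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalize the expected counts into probabilities.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


# Toy parallel corpus (assumed for illustration), aligned sentence by sentence.
pairs = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
t = train_ibm_model1(pairs)
for tw, sw in [("the", "das"), ("house", "das"), ("book", "buch"), ("a", "ein")]:
    print(f"t({tw} | {sw}) = {t[(tw, sw)]:.3f}")
```

No word-level alignment is given in the data; the co-occurrence statistics across sentence pairs are enough for EM to concentrate probability on the consistent word pairs. A full system would combine such a table with alignment and language models to search for the best T.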
- Neural machine translation: This is the technique employed by Google Translate and the current state of the art; it is based on sequence-to-sequence models.
https://arxiv.org/abs/1703.01619
https://arxiv.org/pdf/1409.0473.pdf
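The second paper linked above (arXiv 1409.0473) adds an attention mechanism to the sequence-to-sequence decoder. Below is a rough sketch of one attention step in plain Python, assuming additive scoring of the form v · tanh(W_s s + W_h h). The weight matrices, vectors, and the function name `additive_attention` are made-up illustrations; in a real model the parameters are learned and the states come from recurrent networks.

```python
import math


def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Return (attention weights, context vector) for one decoder step."""

    def matvec(M, x):
        # Multiply matrix M (list of rows) by vector x.
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    projected_state = matvec(W_s, decoder_state)

    # Score every encoder state against the current decoder state.
    scores = []
    for h in encoder_states:
        projected_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax turns the scores into weights that sum to 1.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]

    # The context vector is the weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)
    ]
    return weights, context


# Hypothetical values: three source positions, two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
W_s = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = additive_attention(decoder_state, encoder_states, W_s, W_h, v)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

The weights play a role loosely similar to the word alignments of the statistical approach, except that they are soft and recomputed at every output position.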