Natural Language Processing – Machine Learning Interviews

Positional Encoding in the Transformer Model

Posted on May 3, 2024 by MLNerds

Skip or Residual Connections in Deep Networks

Posted on March 1, 2024 by MLNerds

The BERT Score – Evaluating Text Generation

Posted on November 28, 2023 by MLNerds

This video explains the BERTScore evaluation metric: why it is needed alongside existing metrics such as BLEU, how it is computed, and how it is evaluated. Traditional metrics look for exact text matches. BERTScore instead measures semantic similarity, using contextual embeddings of the words in the candidate and reference sentences.
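As a rough illustration of the idea, the sketch below scores a candidate against a reference by greedily matching each token vector to its most similar counterpart under cosine similarity, then combining the resulting precision and recall into an F1 value. The tiny 2-d vectors are made-up stand-ins for contextual embeddings, and the function names (`cosine`, `greedy_match_scores`) are hypothetical; real BERTScore takes its vectors from a BERT-style model and can add refinements not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(cand_vecs, ref_vecs):
    """Return (precision, recall, f1) from greedy token matching."""
    # Precision: each candidate token is matched to its closest reference token.
    precision = sum(
        max(cosine(c, r) for r in ref_vecs) for c in cand_vecs
    ) / len(cand_vecs)
    # Recall: each reference token is matched to its closest candidate token.
    recall = sum(
        max(cosine(r, c) for c in cand_vecs) for r in ref_vecs
    ) / len(ref_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Toy "embeddings" (assumed values, not produced by any model).
reference = [[1.0, 0.0], [0.0, 1.0]]
full_candidate = [[1.0, 0.0], [0.0, 1.0]]
partial_candidate = [[1.0, 0.0]]

full_scores = greedy_match_scores(full_candidate, reference)
partial_scores = greedy_match_scores(partial_candidate, reference)
```

In this toy setup the partial candidate keeps perfect precision but loses recall, because one reference token has nothing similar to match against.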

BERT Model

Posted on October 30, 2023 by MLNerds

BLEU Score

Posted on October 22, 2021 (updated November 5, 2021) by MLNerds

This brief video describes the BLEU score, a popular evaluation metric used for several tasks such as machine translation and text summarization. What is the BLEU score? BLEU stands for Bilingual Evaluation Understudy. It is a metric used to evaluate the quality of machine-generated text by comparing it with a reference text that…

How do you deal with out-of-vocabulary words at run time when you build a language model?

Posted on February 26, 2019 (updated July 31, 2019) by MLInterview

Out-of-vocabulary words are words that do not appear in the training set but do appear in the test set or in real data. The main problem is that the model assigns zero probability to out-of-vocabulary words, resulting in a zero likelihood. This is a common problem, especially when you have not trained on a…
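To make the zero-probability problem concrete, here is a minimal sketch using a unigram model. It first shows a plain maximum-likelihood estimate returning zero for an unseen word, then one common workaround: mapping rare training words to an `<UNK>` token and applying add-one smoothing. The `min_count` threshold, the function names, and the toy corpus are all assumptions made for illustration, and this is not necessarily the approach the full post recommends.

```python
from collections import Counter

UNK = "<UNK>"


def mle_prob(word, tokens):
    """Plain maximum-likelihood unigram probability (zero for unseen words)."""
    counts = Counter(tokens)
    return counts[word] / len(tokens)


def train_unigram(tokens, min_count=2):
    """Count tokens, replacing words seen fewer than min_count times with UNK."""
    raw = Counter(tokens)
    mapped = [t if raw[t] >= min_count else UNK for t in tokens]
    counts = Counter(mapped)
    counts.setdefault(UNK, 0)  # make sure UNK is always in the vocabulary
    return counts


def smoothed_prob(word, counts):
    """Add-one smoothed probability; unseen words fall back to UNK."""
    total = sum(counts.values())
    vocab_size = len(counts)
    key = word if word in counts else UNK
    return (counts[key] + 1) / (total + vocab_size)


train_tokens = "the cat sat on the mat the cat".split()
model = train_unigram(train_tokens)

zero_case = mle_prob("dog", train_tokens)   # unseen word under plain MLE
unk_case = smoothed_prob("dog", model)      # same word with UNK + smoothing
```

With this setup an unseen word such as "dog" borrows the probability mass reserved for `<UNK>`, so the likelihood of a sentence containing it no longer collapses to zero.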

You want to find food-related topics on Twitter – how do you go about it?

Posted on February 21, 2019 (updated May 13, 2019) by MLInterview

One can use any of the topic models above to get topics. To steer the topics toward food-related information, specialized topic modeling algorithms are available. However, one simple way to do this is: filter tweets by a limited set of food-related keywords (food, meal, dinner,…
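The keyword-filtering step mentioned in this excerpt could look roughly like the sketch below. It uses only the three keywords the excerpt names (the rest of the list is cut off, so it is not filled in here), and the function names and sample tweets are hypothetical. The filtered tweets would then be passed on to whichever topic model is being used.

```python
import re

# Only the keywords visible in the excerpt; the original list continues.
FOOD_KEYWORDS = {"food", "meal", "dinner"}


def is_food_related(tweet, keywords=FOOD_KEYWORDS):
    """True if any lowercase word in the tweet is one of the keywords."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return any(tok in keywords for tok in tokens)


def filter_tweets(tweets, keywords=FOOD_KEYWORDS):
    """Keep only the tweets that mention at least one keyword."""
    return [t for t in tweets if is_food_related(t, keywords)]


sample_tweets = [
    "Great dinner tonight!",
    "Traffic is awful this morning",
    "Fast FOOD again...",
    "Trying a new #meal prep routine",
]
food_tweets = filter_tweets(sample_tweets)
```

Matching whole lowercase tokens rather than substrings avoids false hits such as "foodie" counting as "food", at the cost of missing inflected forms like "meals".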

What are common tools for speech recognition? What are the advantages and disadvantages of each?

Posted on February 21, 2019 by MLInterview

There are several ready-made tools for speech recognition that one can use to train custom models given an appropriate dataset. CMU Sphinx: one of the oldest libraries, used more in academic settings. Kaldi: hard to set up, very flexible to use, typically used by academics. Deep Speech: easy to set up,…


Category: Natural Language Processing

Knowledge Distillation