How many parameters are there in an HMM?

Let us calculate the number of parameters of a bi-gram HMM, given as

    \[p(x, y)=p(x|y)p(y)\,=\,\prod_{t=1}^{T}p(x_{t}|y_{t})p(y_{t}|y_{t-1})\]

Let N be the total number of states y, V the vocabulary size, and T the length of the sequence.

  1. Before estimating the number of parameters directly, let us first see which probabilities, or rather probability matrices, the model has.
  2. Once we know the probability matrices, we can count the parameters from their sizes.
  3. If you’re wondering where a probability matrix comes from, notice that we have the conditional probabilities p(x_{t}|y_{t}) and p(y_{t}|y_{t-1}). Each is a probability expression involving two variables, so each can be stored as a matrix indexed by those two variables.
  4. For p(x_{t}|y_{t}), we have a probability matrix where each row is a state y and each column is an output symbol x. This matrix is of size N*V, giving N*V parameters.
  5. Similarly, for p(y_{t}|y_{t-1}), we have an N x N matrix and hence N*N parameters.
  6. From steps 4 and 5, we have at least N*N + V*N parameters.
  7. Now think carefully about whether we missed anything in our calculation.
  8. We also have a start token x_{0} and an initial state y_{0}. The state transition matrix from step 5 therefore gets one more row, for y_{0}, while the number of columns stays N. Its size becomes (N+1) x N, so the number of transition parameters, including those due to the initial state y_{0}, becomes N*(N+1).
  9. Once you’re convinced, you can see that the total number of parameters in the above HMM is N*(N+1) + N*V = N*(N+V+1).
  10. Try this for a general n in an n-gram HMM!
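
The count above can be checked with a minimal Python sketch. The function name `bigram_hmm_param_count` and the example values N = 3, V = 5 are made up for illustration and are not part of the original derivation. Like the steps above, it counts every entry of the two tables and does not subtract anything for the constraint that each row sums to 1.

```python
def bigram_hmm_param_count(num_states, vocab_size):
    """Count the table entries of a bi-gram HMM (hypothetical helper).

    num_states is N and vocab_size is V in the text above.
    """
    # Emission table p(x_t | y_t): one row per state, one column per symbol.
    emission_shape = (num_states, vocab_size)
    # Transition table p(y_t | y_{t-1}): N rows for the previous state,
    # plus one extra row for the initial state y_0; N columns.
    transition_shape = (num_states + 1, num_states)

    emission_params = emission_shape[0] * emission_shape[1]
    transition_params = transition_shape[0] * transition_shape[1]
    return emission_params + transition_params


# Example with assumed values N = 3 states and V = 5 symbols:
# emission 3*5 = 15, transition 4*3 = 12, total 27 = 3*(3+5+1).
print(bigram_hmm_param_count(3, 5))
```

For any N and V the result should agree with the closed form N*(N+V+1) from step 9.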
