Gaussian mixture model

Mixture model

aka mixture of Gaussians

Example in 1D
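A minimal numerical sketch of such an example (the two-component parameters below are made up for illustration, not from the note): in 1D the mixture density is just a $\phi$-weighted sum of Gaussian bells.

```python
import numpy as np

# Hypothetical parameters for a 1D mixture of two Gaussians.
phi = np.array([0.3, 0.7])    # mixing proportions, sum to 1
mu = np.array([-2.0, 1.0])    # component means
sigma = np.array([0.5, 1.0])  # component standard deviations

def mixture_pdf(x):
    # p(x) = sum_j phi_j * N(x; mu_j, sigma_j^2)
    comps = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return comps @ phi

xs = np.linspace(-5.0, 5.0, 201)
density = mixture_pdf(xs)     # bimodal: one bump per component
```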

Algorithm definition

Assume there is a latent (hidden/unobserved) Random variable $z$, and that $x^{(i)}, z^{(i)}$ have a joint distribution

$P(x^{(i)}, z^{(i)}) = P(x^{(i)} \mid z^{(i)})\, P(z^{(i)})$

$z^{(i)} \sim \text{Multinomial}(\phi)$

($\phi_j \geq 0$, $\sum_j \phi_j = 1$)

$x^{(i)} \mid (z^{(i)} = j) \sim \mathcal{N}(\mu_j, \Sigma_j)$

This is very similar to Gaussian discriminant analysis, but with the known labels replaced by unknown hidden variables $z$. (vid).
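To make the generative story concrete, here is a sketch of sampling from this model (the toy values of $\phi$, $\mu_j$, $\Sigma_j$ below are assumptions for illustration): first draw $z^{(i)}$ from the multinomial, then draw $x^{(i)}$ from the corresponding Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy parameters for a k=2, d=2 mixture (illustrative only).
phi = np.array([0.4, 0.6])                       # mixing proportions, sum to 1
mus = np.array([[0.0, 0.0], [3.0, 3.0]])         # component means mu_j
Sigmas = np.array([np.eye(2), 0.5 * np.eye(2)])  # component covariances Sigma_j

def sample(m):
    # z^(i) ~ Multinomial(phi), then x^(i) | z^(i)=j ~ N(mu_j, Sigma_j)
    z = rng.choice(len(phi), size=m, p=phi)
    x = np.array([rng.multivariate_normal(mus[j], Sigmas[j]) for j in z])
    return x, z

X, Z = sample(500)  # in the unsupervised setting we observe X but not Z
```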

Train via the EM algorithm

See video: EM for mixture of Gaussians

  1. Repeat until convergence
    1. E-step. Guess values of the $z^{(i)}$s. In particular, compute the posterior probability $w^{(i)}_j = P(z^{(i)} = j \mid x^{(i)}; \phi, \mu, \Sigma) = \frac{P(x^{(i)} \mid z^{(i)} = j)\, P(z^{(i)} = j)}{\sum_{l=1}^k P(x^{(i)} \mid z^{(i)} = l)\, P(z^{(i)} = l)} = \frac{\frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left\{-\frac{1}{2} (x^{(i)} - \mu_j)^T \Sigma_j^{-1} (x^{(i)} - \mu_j)\right\} \phi_j}{\sum_{l=1}^k \frac{1}{(2\pi)^{d/2} |\Sigma_l|^{1/2}} \exp\left\{-\frac{1}{2} (x^{(i)} - \mu_l)^T \Sigma_l^{-1} (x^{(i)} - \mu_l)\right\} \phi_l}$
    2. M-step. Update the parameters, using the $w_j^{(i)}$ as soft assignments: $\phi_j = \frac{1}{m} \sum_{i=1}^m w_j^{(i)}$, $\mu_j = \frac{\sum_{i=1}^m w_j^{(i)} x^{(i)}}{\sum_{i=1}^m w_j^{(i)}}$, $\Sigma_j = \frac{\sum_{i=1}^m w_j^{(i)} (x^{(i)} - \mu_j)(x^{(i)} - \mu_j)^T}{\sum_{i=1}^m w_j^{(i)}}$. (A code sketch of both steps follows this list.)
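A minimal NumPy/SciPy sketch of these two steps (the function name `em_gmm`, the random-data-point initialization, and the fixed iteration count are my own choices, not from the video):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, k, n_iter=100, seed=0):
    # Sketch of EM for a mixture of Gaussians: fixed iteration count,
    # no convergence test or covariance regularization.
    m, d = X.shape
    rng = np.random.default_rng(seed)
    phi = np.full(k, 1.0 / k)                     # uniform mixing proportions
    mu = X[rng.choice(m, size=k, replace=False)]  # init means at random data points
    Sigma = np.array([np.cov(X.T) for _ in range(k)])
    for _ in range(n_iter):
        # E-step: w[i, j] = P(z^(i) = j | x^(i); phi, mu, Sigma)
        w = np.stack([phi[j] * multivariate_normal.pdf(X, mu[j], Sigma[j])
                      for j in range(k)], axis=1)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters with the w's as soft assignments
        Nj = w.sum(axis=0)                        # effective count per component
        phi = Nj / m
        mu = (w.T @ X) / Nj[:, None]
        for j in range(k):
            diff = X - mu[j]
            Sigma[j] = (w[:, j, None] * diff).T @ diff / Nj[j]
    return phi, mu, Sigma
```

Running `phi_hat, mu_hat, Sigma_hat = em_gmm(X, k=2)` on the samples drawn in the earlier sketch should approximately recover the toy parameters; in practice one would also monitor the log-likelihood for convergence and add a small ridge to each $\Sigma_j$ to keep it invertible.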