Intro – Motivation.
High-dimensional data
Useful for high-dimensional data, where the dimension $d$ is similar to, or much larger than, the number of data samples $n$. In this regime the maximum likelihood estimate of the parameters of a fitted Gaussian has problems: with $n < d$, the MLE covariance matrix $\Sigma = \frac{1}{n}\sum_{i=1}^{n}(x^{(i)} - \mu)(x^{(i)} - \mu)^\top$ is singular, so it can't be inverted and the fitted Gaussian density is undefined. Similar problems would occur for a Gaussian mixture model (of which a single Gaussian is a particular example).
To solve this, we could constrain the covariance matrix of the Gaussian to be diagonal, $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_d^2)$. We could also constrain it to be proportional to the identity matrix, $\Sigma = \sigma^2 I$.
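A minimal numpy sketch of the problem and of the two constrained estimates (dimensions and data made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 100                          # fewer samples than dimensions
X = rng.normal(size=(n, d))

mu = X.mean(axis=0)
Sigma = (X - mu).T @ (X - mu) / n       # MLE covariance (d x d)

# rank(Sigma) <= n - 1 < d, so Sigma is singular: it can't be inverted
# and the fitted Gaussian density is undefined.
print(np.linalg.matrix_rank(Sigma))     # 29

# Constrained estimates that are invertible for any n:
Sigma_diag = np.diag(X.var(axis=0))            # diagonal covariance
Sigma_iso = X.var(axis=0).mean() * np.eye(d)   # proportional to identity
```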
The factor analysis model is another way to do this that doesn't throw away the correlations between coordinates.
Assume a latent variable $z \in \mathbb{R}^k$, $z \sim \mathcal{N}(0, I)$, $k < d$.
Then the data has conditional distribution $x \mid z \sim \mathcal{N}(\mu + \Lambda z, \Psi)$, with parameters $\mu \in \mathbb{R}^d$, $\Lambda \in \mathbb{R}^{d \times k}$, $\Psi \in \mathbb{R}^{d \times d}$. Equivalently, $x = \mu + \Lambda z + \varepsilon$, where $\varepsilon \sim \mathcal{N}(0, \Psi)$. We also assume that $\Psi$ is diagonal.
Basically, we model the data as lying near the subspace $\mu + \mathrm{span}(\text{columns of } \Lambda)$, which is possibly much lower-dimensional than the space of $x$, and having some noise around this subspace.
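A quick sketch of sampling from this generative model (the ground-truth parameters here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 50, 3, 100_000                # observed dim, latent dim, samples

# Hypothetical ground-truth parameters, purely for illustration.
mu = rng.normal(size=d)
Lam = rng.normal(size=(d, k))           # factor loadings Lambda (d x k)
Psi = rng.uniform(0.1, 0.5, size=d)     # diagonal of the noise covariance

z = rng.normal(size=(n, k))                    # z ~ N(0, I)
eps = rng.normal(size=(n, d)) * np.sqrt(Psi)   # eps ~ N(0, Psi), Psi diagonal
X = mu + z @ Lam.T + eps                       # x = mu + Lambda z + eps

# Sanity check: the empirical covariance approaches Lambda Lambda^T + Psi.
gap = np.abs(np.cov(X, rowvar=False) - (Lam @ Lam.T + np.diag(Psi))).max()
print(gap)                              # shrinks as n grows
```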
Some notation and some probability results for Gaussians (recap)
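The standard results being used: if a vector is jointly Gaussian with partition

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim \mathcal{N}\!\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right),$$

then the marginal is $x_1 \sim \mathcal{N}(\mu_1, \Sigma_{11})$, and the conditional is also Gaussian:

$$x_1 \mid x_2 \sim \mathcal{N}\!\big(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\big).$$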
Distribution of the random vector $(z, x)$. Result:

$$\begin{bmatrix} z \\ x \end{bmatrix} \sim \mathcal{N}\!\left(\begin{bmatrix} 0 \\ \mu \end{bmatrix}, \begin{bmatrix} I & \Lambda^\top \\ \Lambda & \Lambda\Lambda^\top + \Psi \end{bmatrix}\right).$$
This implies that if we marginalize out $z$, we find $x \sim \mathcal{N}(\mu, \Lambda\Lambda^\top + \Psi)$.
We could fit the parameters $(\mu, \Lambda, \Psi)$ by maximizing this (marginal) likelihood, but it turns out that the resulting optimization problem can't be solved in closed form, and it's quite hard. Therefore, we actually use the EM algorithm. Note that $z$ is now continuous, so sums over its values become integrals.
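For reference, using the marginal above, the log-likelihood that has no closed-form maximizer in $\Lambda$ and $\Psi$ is

$$\ell(\mu, \Lambda, \Psi) = \sum_{i=1}^{n} \log \frac{1}{(2\pi)^{d/2}\,|\Lambda\Lambda^\top + \Psi|^{1/2}} \exp\!\left(-\frac{1}{2}(x^{(i)} - \mu)^\top(\Lambda\Lambda^\top + \Psi)^{-1}(x^{(i)} - \mu)\right).$$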
E-step. Video: using the properties of Gaussians he mentioned above, compute the posterior $Q_i(z^{(i)}) = p(z^{(i)} \mid x^{(i)}; \mu, \Lambda, \Psi)$, which is itself Gaussian.
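Concretely, applying the conditional-Gaussian formula from the recap to the joint distribution of $(z, x)$ above gives

$$z^{(i)} \mid x^{(i)} \sim \mathcal{N}\!\big(\mu_{z^{(i)}\mid x^{(i)}},\; \Sigma_{z^{(i)}\mid x^{(i)}}\big),$$

with

$$\mu_{z^{(i)}\mid x^{(i)}} = \Lambda^\top(\Lambda\Lambda^\top + \Psi)^{-1}(x^{(i)} - \mu), \qquad \Sigma_{z^{(i)}\mid x^{(i)}} = I - \Lambda^\top(\Lambda\Lambda^\top + \Psi)^{-1}\Lambda.$$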
M-step. Video (using a special trick for Gaussian integrals): maximize $\sum_i \mathbb{E}_{z^{(i)} \sim Q_i}\big[\log \frac{p(x^{(i)}, z^{(i)};\, \mu, \Lambda, \Psi)}{Q_i(z^{(i)})}\big]$ over the parameters. Go to the notes for the full derivation, really.
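A minimal numpy sketch of the whole EM loop, under the assumption that the standard closed-form updates for factor analysis are used; the function name and the initialization are my own choices, not from the lecture:

```python
import numpy as np

def factor_analysis_em(X, k, n_iters=100, seed=0):
    """Fit x = mu + Lambda z + eps by EM (a sketch, not a reference impl).

    X is an (n, d) data matrix, k is the latent dimension.
    Returns (mu, Lam, Psi) with Psi stored as a length-d vector (diagonal).
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)

    mu = X.mean(axis=0)                  # MLE of mu, fixed throughout
    Xc = X - mu                          # centered data
    Lam = rng.normal(size=(d, k))        # random init of factor loadings
    Psi = np.var(Xc, axis=0) + 1e-6      # diagonal noise variances

    for _ in range(n_iters):
        # --- E-step: posterior of z given each x ---
        # G = (I + Lam^T Psi^{-1} Lam)^{-1} equals the posterior covariance
        # Sigma_{z|x} (Woodbury identity), and avoids inverting the d x d
        # matrix Lam Lam^T + Psi directly.
        LtPinv = Lam.T / Psi             # (k, d) = Lam^T Psi^{-1}
        G = np.linalg.inv(np.eye(k) + LtPinv @ Lam)   # (k, k)
        Ez = Xc @ LtPinv.T @ G           # (n, k) posterior means mu_{z|x}
        # sum_i E[z z^T] = n * Sigma_{z|x} + sum_i Ez_i Ez_i^T
        Ezz_sum = n * G + Ez.T @ Ez      # (k, k)

        # --- M-step: closed-form updates for Lambda and Psi ---
        Lam = (Xc.T @ Ez) @ np.linalg.inv(Ezz_sum)
        # Psi keeps only the diagonal of the residual covariance.
        Psi = np.mean(Xc**2, axis=0) - np.mean(Xc * (Ez @ Lam.T), axis=0)
        Psi = np.maximum(Psi, 1e-6)      # guard against numerical issues

    return mu, Lam, Psi
```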