Latent semantic indexing

cosmos 4th November 2016 at 2:43pm
Natural language processing Principal component analysis

Essentially an application of PCA to text data, where we usually skip the pre-processing step (mean-centering and variance scaling), so the sparse document-term matrix stays sparse.

We use it to measure document similarity: each document is represented as a vector, and two documents are similar when the angle between their vectors is small (cosine close to 1).
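A minimal sketch of the idea, assuming a toy document-term count matrix and a plain NumPy SVD. The vocabulary, the counts, and the helper names `lsi_embed` and `cosine_similarity` are made up for illustration and are not from any particular library. The SVD is applied to the raw counts without centering, the top `k` components are kept, and documents are compared by the cosine of the angle between their reduced vectors.

```python
import numpy as np

# Toy document-term count matrix (rows = documents, columns = terms).
# Vocabulary and counts are invented for illustration only.
vocab = ["cat", "dog", "pet", "stock", "market"]
X = np.array([
    [2, 1, 1, 0, 0],   # doc 0: about pets
    [1, 2, 1, 0, 0],   # doc 1: about pets
    [0, 0, 0, 2, 1],   # doc 2: about finance
    [0, 0, 0, 1, 2],   # doc 3: about finance
], dtype=float)

def lsi_embed(X, k):
    """Project documents onto the top-k singular directions.

    No mean-centering is done, which is the skipped
    pre-processing step compared with ordinary PCA.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]   # one k-dimensional row per document

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

docs = lsi_embed(X, k=2)
sim_same = cosine_similarity(docs[0], docs[1])  # two pet documents
sim_diff = cosine_similarity(docs[0], docs[2])  # pet vs. finance document

print("pet vs pet:    ", round(sim_same, 3))
print("pet vs finance:", round(sim_diff, 3))
```

In this toy case the two pet documents end up pointing in nearly the same direction, while the pet and finance documents are close to orthogonal. On real corpora the matrix is large and sparse, so a truncated sparse SVD would normally replace the dense `np.linalg.svd` call.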

Intuition on applying PCA to text