The idea, I think, is that the network should develop a good hierarchical model if its hidden activations, even after noise is added (or after some other information bottleneck), still help to reconstruct the activations at lower levels of abstraction.
It is mainly used for semi-supervised learning, although it can also be used as a generative model and for unsupervised learning.
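A rough sketch of the reconstruction idea, to make it concrete. This is not the architecture from the paper: the linear `lateral_mix` blend is a stand-in for the learned combinator function, there is no batch normalization or supervised cost, and all names (`encode`, `decode`, `ladder_reconstruction_cost`, `layer_weights`) are made up for illustration. It only shows the shape of the unsupervised cost: run the encoder once clean and once with noise, rebuild every level top-down from the noisy pass, and penalize the distance to the clean activations at each level.

```python
import numpy as np

def encode(x, enc_weights, noise_std, rng):
    """Bottom-up pass. Returns the activation at every level, input first."""
    acts = [x + noise_std * rng.standard_normal(x.shape)]
    for W in enc_weights:
        pre = acts[-1] @ W
        pre = pre + noise_std * rng.standard_normal(pre.shape)
        acts.append(np.maximum(pre, 0.0))  # ReLU
    return acts

def decode(noisy_acts, dec_weights, lateral_mix=0.5):
    """Top-down pass. Each level is rebuilt from the level above plus a lateral
    copy of its own noisy activation (assumed linear blend, not a learned combinator)."""
    recon = [None] * len(noisy_acts)
    recon[-1] = noisy_acts[-1]  # top level: only the lateral signal is available
    for l in range(len(noisy_acts) - 2, -1, -1):
        top_down = recon[l + 1] @ dec_weights[l]
        recon[l] = lateral_mix * noisy_acts[l] + (1.0 - lateral_mix) * top_down
    return recon

def ladder_reconstruction_cost(x, enc_weights, dec_weights, noise_std,
                               layer_weights, rng, lateral_mix=0.5):
    """Weighted sum of per-level squared errors between the denoised
    reconstructions and the clean encoder activations."""
    clean = encode(x, enc_weights, 0.0, rng)        # targets
    noisy = encode(x, enc_weights, noise_std, rng)  # corrupted pass
    recon = decode(noisy, dec_weights, lateral_mix)
    return sum(lam * np.mean((r - c) ** 2)
               for lam, r, c in zip(layer_weights, recon, clean))

# Toy usage with random, untrained weights (sizes are arbitrary).
rng = np.random.default_rng(0)
dims = [8, 6, 4]
enc_weights = [0.1 * rng.standard_normal((a, b)) for a, b in zip(dims[:-1], dims[1:])]
dec_weights = [0.1 * rng.standard_normal((b, a)) for a, b in zip(dims[:-1], dims[1:])]
x = rng.standard_normal((16, dims[0]))

cost = ladder_reconstruction_cost(x, enc_weights, dec_weights, noise_std=0.3,
                                  layer_weights=[1.0, 0.1, 0.1], rng=rng)
print(float(cost))
```

In training, this cost would be minimized with respect to the encoder and decoder weights, which covers the unsupervised case. For semi-supervised use, a supervised loss on the top-level output for the labeled examples would be added to it.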
Semi-Supervised Learning with Ladder Networks
See EM algorithm
http://users.ics.aalto.fi/harri/ica2000a/node2.html, http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=0814D988A82AFA273E351B62A9FCBC55?doi=10.1.1.149.5636&rep=rep1&type=pdf