Generative vs discriminative models

cosmos 17th March 2017 at 1:41pm

Comparison of Generative supervised learning and Discriminative learning

If you have enough data and only care about $p(y|x)$ (prediction), discriminative models tend to work best, because you model directly what you care about and do not constrain other aspects of the system.

Generative models capture more of the system producing the data (the joint $p(x, y)$, not just $p(y|x)$). This usually means imposing more constraints, since one commits to a particular family of models for more aspects of the system. The model is therefore less flexible, but if the assumptions are reasonable it can work well with much less data. One may also want to model these extra aspects because one is interested in more than prediction, for instance in generating samples.
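As a rough illustration of the two routes, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration: the data (two 1D Gaussian classes with unit variance), the generative family (one Gaussian per class with a shared variance, as in linear discriminant analysis), the discriminative family (logistic regression fit by plain gradient ascent), and all function names.

```python
import math
import random


def make_data(n, rng):
    """Toy data (an assumption): y ~ Bernoulli(0.5), x | y ~ Normal(+1 or -1, 1)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data


def fit_generative(data):
    """Model p(y) and p(x|y) (Gaussians, shared variance), then derive p(y|x)."""
    x0 = [x for x, y in data if y == 0]
    x1 = [x for x, y in data if y == 1]
    mu0, mu1 = sum(x0) / len(x0), sum(x1) / len(x1)
    var = (sum((x - mu0) ** 2 for x in x0)
           + sum((x - mu1) ** 2 for x in x1)) / len(data)
    prior1 = len(x1) / len(data)
    # Bayes' rule gives a logistic posterior: p(y=1|x) = sigmoid(w * x + b).
    w = (mu1 - mu0) / var
    b = (mu0 ** 2 - mu1 ** 2) / (2 * var) + math.log(prior1 / (1 - prior1))
    return w, b


def fit_discriminative(data, steps=500, lr=0.5):
    """Model p(y=1|x) = sigmoid(w * x + b) directly, by gradient ascent
    on the mean conditional log-likelihood."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (y - p) * x
            grad_b += y - p
        w += lr * grad_w / n
        b += lr * grad_b / n
    return w, b


def accuracy(params, data):
    """Fraction of points where the sign of w * x + b matches the label."""
    w, b = params
    return sum((w * x + b > 0) == (y == 1) for x, y in data) / len(data)


rng = random.Random(0)
train, test = make_data(200, rng), make_data(5000, rng)
gen_params = fit_generative(train)
dis_params = fit_discriminative(train)
print("generative     (w, b):", gen_params, "accuracy:", accuracy(gen_params, test))
print("discriminative (w, b):", dis_params, "accuracy:", accuracy(dis_params, test))
```

Both fits end up with a posterior of the same logistic form, but they reach it differently. The generative fit estimates class means, a variance and a prior. The discriminative fit never looks at how x is distributed. Here the Gaussian assumption matches the data, which is the situation where the generative route is expected to do well with few samples.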

See Andrew Ng's lecture and the graphical model lectures. For instance, use a discriminative model when you don't care about the distribution of the input data.


I'd be interested, however, in knowing under what circumstances the two approaches give the same result and when they differ. For instance, if you make an equivalent set of assumptions, they should give the same result, as you are then maximizing the same quantity (the likelihood), right? If so, the only difference is in the set of assumptions, i.e. the family of models describing your system that you consider and parametrize.
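One way to make this question precise is to write down what each approach maximizes. The sketch below assumes $N$ i.i.d. samples $(x_i, y_i)$ and a single parameter vector $\theta$; this notation is introduced here for illustration.

```latex
\begin{align*}
\hat\theta_{\text{gen}} &= \arg\max_\theta \sum_{i=1}^{N} \log p(x_i, y_i; \theta)
  = \arg\max_\theta \sum_{i=1}^{N} \bigl[ \log p(y_i \mid x_i; \theta) + \log p(x_i; \theta) \bigr] \\
\hat\theta_{\text{disc}} &= \arg\max_\theta \sum_{i=1}^{N} \log p(y_i \mid x_i; \theta)
\end{align*}
```

Under these assumptions the two objectives differ by the term $\sum_i \log p(x_i; \theta)$. They select the same parameters for $p(y|x)$ when that term does not depend on those parameters, for example when $p(x)$ has its own separate parameters. Otherwise the generative fit is also pulled toward explaining the inputs, and the two estimates can differ even within the same model family.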

See Learning theory.