Model selection

cosmos 16th May 2017 at 2:05pm
Statistical inference

This includes Model evaluation, the process of assessing candidate models so that one can be selected.

Introduction; see overfitting and underfitting below. Model selection algorithms provide methods for automatically choosing an optimal bias/variance tradeoff. Explanation
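
For instance, a minimal sketch (example code, not from the original note) of picking a point on the bias/variance tradeoff by choosing the polynomial degree with the lowest validation error:

```python
# Choosing model complexity (polynomial degree) by held-out validation error.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + 0.3 * rng.standard_normal(60)   # noisy ground truth

x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

best_degree, best_err = None, np.inf
for degree in range(1, 11):
    coeffs = np.polyfit(x_train, y_train, degree)                 # fit on training data
    val_err = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)   # validation MSE
    if val_err < best_err:
        best_degree, best_err = degree, val_err

print(f"selected degree: {best_degree}, validation MSE: {best_err:.3f}")
```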

Predictive posterior

Predictive posterior checks: evaluate the likelihood of the data under the model, mostly on test data (see Cross-validation).

Check distribution of extreme values
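
A minimal sketch of such a check on an extreme-value statistic (the model and data are illustrative assumptions, and it uses a plug-in fit rather than a full posterior):

```python
# Predictive check: fit a Gaussian to observed data, simulate replicated datasets
# from the fitted model, and compare the observed maximum to the simulated maxima.
import numpy as np

rng = np.random.default_rng(1)
observed = rng.standard_t(df=3, size=200)          # heavy-tailed data the Gaussian will miss

mu, sigma = observed.mean(), observed.std(ddof=1)  # plug-in fit of a Gaussian model

replicated_maxima = np.array([
    rng.normal(mu, sigma, size=observed.size).max()
    for _ in range(2000)
])

# Fraction of replicated maxima at least as extreme as the observed maximum.
# Values near 0 or 1 signal that the model misfits the tails.
p_value = np.mean(replicated_maxima >= observed.max())
print(f"observed max: {observed.max():.2f}, check p-value: {p_value:.3f}")
```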

Information criteria

Paper
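
A minimal sketch of computing AIC and BIC for a simple Gaussian model (the model and data here are illustrative assumptions):

```python
# Information criteria from the maximized log-likelihood; lower is better
# when comparing models fit to the same data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.normal(loc=1.0, scale=2.0, size=100)

# Maximum-likelihood Gaussian fit: k = 2 free parameters (mean and variance).
mu, sigma = data.mean(), data.std()
log_lik = stats.norm.logpdf(data, loc=mu, scale=sigma).sum()

k, n = 2, data.size
aic = 2 * k - 2 * log_lik
bic = k * np.log(n) - 2 * log_lik
print(f"AIC: {aic:.1f}, BIC: {bic:.1f}")
```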

Cross-validation

Feature selection

Structural risk minimization


Many of these methods are closely related to Regularization methods: both aim to improve the model, usually by making it generalize better, which is typically achieved by controlling model complexity.

Cross-validation can be used for regularization via Early stopping on the validation set.
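
A minimal sketch of early stopping on a validation set (the linear model and hyperparameters are illustrative assumptions): training stops once the validation loss has not improved for `patience` epochs.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.5 * rng.standard_normal(200)

X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

w = np.zeros(5)
lr, patience = 0.01, 10
best_val, best_w, bad_epochs = np.inf, w.copy(), 0

for epoch in range(1000):
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)  # gradient of training MSE
    w -= lr * grad
    val_loss = np.mean((X_val @ w - y_val) ** 2)
    if val_loss < best_val:
        best_val, best_w, bad_epochs = val_loss, w.copy(), 0       # keep best weights so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                                  # stop when validation loss stalls
            break

print(f"stopped at epoch {epoch}, best validation MSE: {best_val:.3f}")
```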


Model selection for Artificial neural networks: Neural networks [2.10] : Training neural networks - model selection

Old comment: One can show (though I don't know all the technical details) that, given the real distribution of the data and a sample used for training, the error measured on that training sample is likely to underestimate the true error. So I think cross-validation can be shown rigorously to be a good way of assessing a model's predictive power (i.e. its probability of predicting correctly). See the Elements of Statistical Learning book for all the details.

Cross-validation is a way to find out whether you are overfitting; see the sketch below.
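
A minimal sketch of using k-fold cross-validation to detect overfitting (model and data are illustrative assumptions): a training error much lower than the cross-validation error suggests the model is overfitting.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, (100, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(100)

model = DecisionTreeRegressor()             # unregularized tree, prone to overfitting
model.fit(X, y)
train_mse = np.mean((model.predict(X) - y) ** 2)

cv_mse = -cross_val_score(DecisionTreeRegressor(), X, y,
                          scoring="neg_mean_squared_error", cv=5).mean()

print(f"train MSE: {train_mse:.3f}, 5-fold CV MSE: {cv_mse:.3f}")
```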

Related: https://en.wikipedia.org/wiki/Testing_hypotheses_suggested_by_the_data