Linear regression with Tikhonov regularization.
The constant (intercept) term doesn't contribute to model complexity, so it is left out of the penalty.
Scaling the input variables shouldn't change the model complexity, so we normalize (standardize) them before fitting.
Estimate for ridge regression: $\hat{w} = (X^\top X + \lambda I)^{-1} X^\top y$.
This is easier to solve than the normal equations of standard linear regression, since $X^\top X + \lambda I$ is always invertible for $\lambda > 0$. $\lambda$ is actually a Lagrange multiplier: depending on its value, we are constraining $w$ to a sphere around the origin, i.e. minimizing the squared error subject to $\|w\|^2 \le C$ for some constant $C$; differentiating the Lagrangian w.r.t. $\lambda$ recovers exactly this constraint.
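A minimal NumPy sketch of the closed form above; the helper name `ridge_fit`, the `lam` parameter, and the convention of centering $X$ and $y$ to leave the intercept unpenalized are assumptions for illustration, not from the original notes:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (hypothetical helper).
    The intercept is left unpenalized by centering X and y first."""
    # Standardize inputs so the penalty treats all features alike.
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma
    y_mean = y.mean()

    # w = (Xs^T Xs + lam * I)^{-1} Xs^T (y - y_mean)
    d = Xs.shape[1]
    w = np.linalg.solve(Xs.T @ Xs + lam * np.eye(d), Xs.T @ (y - y_mean))

    # Map coefficients back to the raw feature scale.
    w_raw = w / sigma
    b = y_mean - mu @ w_raw  # intercept, recovered separately and never penalized
    return w_raw, b
```

Usage would be e.g. `w, b = ridge_fit(X, y, lam=1.0)`; `np.linalg.solve` is preferred over an explicit matrix inverse for numerical stability.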
Also called L2 regularization or weight decay.
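The "weight decay" name comes from the gradient-descent update on the penalized loss; a one-step sketch (with step size $\eta$):

$$
w_{t+1} = w_t - \eta \nabla\!\left(L(w_t) + \tfrac{\lambda}{2}\|w_t\|^2\right) = (1 - \eta\lambda)\, w_t - \eta \nabla L(w_t),
$$

so each step first shrinks (decays) the weights by the factor $(1 - \eta\lambda)$ before the usual gradient update.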
In Bayesian terms, this corresponds to a Gaussian prior on the weights, and the ridge estimate is the maximum a posteriori (MAP) solution.
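In symbols (a sketch, assuming Gaussian noise with variance $\sigma^2$ and prior $w \sim \mathcal{N}(0, \tau^2 I)$):

$$
\hat{w}_{\text{MAP}} = \arg\max_w \, p(w \mid y)
= \arg\min_w \, \frac{1}{2\sigma^2}\|y - Xw\|^2 + \frac{1}{2\tau^2}\|w\|^2,
$$

which is exactly ridge regression with $\lambda = \sigma^2 / \tau^2$.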