Kernel trick

cosmos 30th November 2017 at 9:26pm
Kernel method

See Kernel method, and here

The kernel trick generalizes Dictionary learning to learning over a Reproducing kernel Hilbert space, where function evaluation can be expressed as an inner product. Operationally, one often only has to replace inner products between input vectors with a Kernel function evaluated at those input vectors to obtain a new learning algorithm. The new algorithm can have an infinite-dimensional hypothesis class (referring to the dimension of the Hilbert space of functions), whereas the hypothesis class for dictionary learning is finite-dimensional.
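The substitution can be checked numerically. The sketch below is an illustration only, not taken from the references: it assumes the homogeneous degree-2 polynomial kernel k(x, z) = (x . z)^2, whose explicit feature map is small enough to write out, and ridge regression as the learning algorithm. The names `phi`, `poly_kernel` and the value of `lam` are hypothetical choices made for the example.

```python
import numpy as np

def phi(x):
    """Explicit feature map for k(x, z) = (x . z)**2: all products x_i * x_j."""
    return np.outer(x, x).ravel()

def poly_kernel(A, B):
    """Gram matrix of the homogeneous degree-2 polynomial kernel."""
    return (A @ B.T) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))       # 20 training inputs in R^3
y = rng.normal(size=20)            # training targets
X_new = rng.normal(size=(5, 3))    # inputs to predict on
lam = 0.1                          # ridge penalty (illustrative value)

# Primal ridge regression on explicit features:
#   w = (F^T F + lam I)^{-1} F^T y
F = np.array([phi(x) for x in X])
F_new = np.array([phi(x) for x in X_new])
w = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ y)
pred_primal = F_new @ w

# Kernelized (dual) form: only kernel evaluations are needed.
#   alpha = (K + lam I)^{-1} y,  f(x) = sum_i alpha_i k(x_i, x)
K = poly_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
pred_dual = poly_kernel(X_new, X) @ alpha

print(np.allclose(F @ F.T, K))               # kernel equals feature inner product
print(np.allclose(pred_primal, pred_dual))   # both forms give the same predictions
```

The dual form never builds `phi` explicitly, which is why the same recipe still works for kernels (such as the Gaussian kernel) whose feature space is infinite-dimensional.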

See a derivation here: https://arxiv.org/pdf/1611.03530.pdf , and original reference

Note that the kernel solution derived there (form the Gram matrix K = XX^T, solve Kα = y for α, and set w = X^T α) has an appealing interpretation in terms of implicit regularization. Simple algebra reveals that it is equivalent to the minimum ℓ2-norm solution of Xw = y. That is, out of all models that exactly fit the data, SGD will often converge to the solution with minimum norm.
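The minimum-norm claim is easy to verify on a small overparameterized linear problem. The following is a minimal sketch under assumed random data (the sizes `n` and `d` are arbitrary); it compares the kernel solution with the pseudoinverse solution, which is the minimum ℓ2-norm solution of Xw = y, and with one other interpolating solution. It does not simulate SGD itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                       # fewer equations than unknowns
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Kernel solution: solve (X X^T) alpha = y, then w = X^T alpha
K = X @ X.T
alpha = np.linalg.solve(K, y)
w_kernel = X.T @ alpha

# Minimum l2-norm solution of Xw = y via the Moore-Penrose pseudoinverse
w_pinv = np.linalg.pinv(X) @ y

# Another exact fit: add a component from the null space of X
v = rng.normal(size=d)
v_null = v - np.linalg.pinv(X) @ (X @ v)   # projection of v onto null(X)
w_other = w_kernel + v_null

print(np.allclose(X @ w_kernel, y))        # kernel solution interpolates
print(np.allclose(w_kernel, w_pinv))       # and matches the min-norm solution
print(np.allclose(X @ w_other, y))         # w_other also interpolates
print(np.linalg.norm(w_kernel) < np.linalg.norm(w_other))
```

The kernel solution lies in the row space of X, so any other interpolating solution differs from it by a null-space vector orthogonal to it, which can only increase the norm.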

A generalized representer theorem