Max-margin learning: Cosmos — All that is, or was, or ever will be

Max-margin learning

cosmos 7th November 2016 at 6:12pm

Maximum margin principle

Choose the decision boundary that maximises the distance to the closest data point (the margin).

The points closest to the decision boundary are the support vectors.
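A minimal sketch of these two definitions, assuming a linear boundary w·x + b = 0 and labels in {+1, -1}. The toy data, the hand-picked boundary, and the helper name `signed_distances` are illustrative assumptions, not part of the original note.

```python
import numpy as np

def signed_distances(w, b, X, y):
    """Geometric distance of each point to the hyperplane w.x + b = 0,
    positive when the point is on the side its label says it should be."""
    return y * (X @ w + b) / np.linalg.norm(w)

# Hypothetical 2-D toy data: two positive and two negative points.
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Hand-picked candidate boundary: x1 + x2 - 2 = 0.
w = np.array([1.0, 1.0])
b = -2.0

dist = signed_distances(w, b, X, y)

# The margin of this boundary is the distance of the closest point.
margin = dist.min()

# Support vectors are the points that attain that minimum distance.
support = np.flatnonzero(np.isclose(dist, margin))
```

A max-margin learner would search over `w` and `b` for the boundary with the largest `margin`. Here the boundary is fixed by hand so that only the two definitions are shown.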

Support vector machines

Application to Transfer learning

Max-margin: learn a function that identifies sensible data (e.g. sentences that make sense). That is what the algorithm he explains does: it finds a probability distribution that is larger at the data points than "anywhere" else. In particular, this makes the NN learn a good representation of the data, or embedding. For this we use the hinge loss. In practice, we do this
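As an illustration of the hinge loss mentioned above, here is a hedged sketch of one common pairwise form: a real example should score at least `margin` higher than a corrupted one. The pairing with corrupted examples, the function name `ranking_hinge_loss`, the margin of 1.0, and the scores are all assumptions made for illustration. The note does not say which variant is used.

```python
import numpy as np

def ranking_hinge_loss(score_real, score_corrupt, margin=1.0):
    """Mean of max(0, margin - s(real) + s(corrupt)) over pairs.
    The loss is zero for a pair once the real example outscores
    the corrupted one by at least `margin`."""
    return np.maximum(0.0, margin - score_real + score_corrupt).mean()

# Hypothetical scores from some model (e.g. an NN) for three pairs:
# a sensible sentence and a corrupted version of it.
score_real = np.array([2.0, 0.5, 1.0])
score_corrupt = np.array([0.0, 0.0, 1.5])

loss = ranking_hinge_loss(score_real, score_corrupt)
```

Only the second and third pairs contribute to the loss. The first pair already separates the real example from the corrupted one by more than the margin.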