Online learning

cosmos 13th July 2018 at 10:52pm
Learning theory

This takes the distribution-free bounds of Statistical learning theory one step further, by not assuming that the data follows any distribution at all. Instead the data is treated as arbitrary, so that to bound the Regret (Online learning) we must consider the worst-case (adversarial) data sequence.
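As a reminder (the standard definition, not spelled out in these notes): the regret after T rounds compares the learner's cumulative loss to that of the best fixed hypothesis in hindsight, and worst-case bounds take a supremum over data sequences,

$$
R_T = \sum_{t=1}^{T} \ell(h_t, z_t) \;-\; \min_{h \in \mathcal{H}} \sum_{t=1}^{T} \ell(h, z_t),
\qquad \text{worst case: } \sup_{z_1,\dots,z_T} R_T ,
$$

where $\mathcal{H}$ is the hypothesis class and $z_t$ the adversary's choice at round $t$.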

lecture notes, intro notes

Other notes

Prediction with expert advice

Can relate to Learning boosting

The essential difference between online learning and statistical learning, in addition to the sequential aspect, is the fact that no probabilistic assumption is made on the dataset.
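A minimal sketch of prediction with expert advice (the exponentially weighted average / Hedge forecaster; this is an assumed illustration, not taken from the linked notes, and the learning rate eta is an assumed parameter):

```python
import numpy as np

def hedge(expert_losses, eta=0.5):
    """Exponentially weighted average forecaster (Hedge).

    expert_losses: (T, N) array; entry [t, i] is the loss of expert i at round t,
                   revealed only after the learner commits to its weights for round t.
    eta:           learning rate (fixed here; theory suggests eta ~ sqrt(8 ln N / T)).
    Returns the learner's per-round expected losses.
    """
    T, N = expert_losses.shape
    log_w = np.zeros(N)              # log-weights, start uniform
    learner_losses = []
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                 # current distribution over experts
        learner_losses.append(p @ expert_losses[t])
        log_w -= eta * expert_losses[t]   # multiplicative-weights update
    return np.array(learner_losses)

# Toy usage: regret vs. the best expert on an arbitrary (here random) loss sequence.
rng = np.random.default_rng(0)
losses = rng.uniform(size=(1000, 5))
regret = hedge(losses).sum() - losses.sum(axis=0).min()
print(regret)  # grows like O(sqrt(T log N)), not linearly in T
```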

Online learnability (finite worst-case regret) is stronger than PAC learnability: it implies it.
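One way to see this is the standard online-to-batch conversion (sketched here as an assumed example, not taken from the linked notes): run the online learner on i.i.d. data and average (or randomize over) its iterates, and the regret bound becomes a risk bound,

$$
\mathbb{E}\!\left[ \frac{1}{T}\sum_{t=1}^{T} L_D(h_t) \right]
\;\le\; \min_{h\in\mathcal{H}} L_D(h) \;+\; \frac{\mathbb{E}[R_T]}{T},
$$

so vanishing average regret gives a PAC-style guarantee ($L_D$ is the risk under the data distribution $D$).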

Bandit problem

An example of an online learning problem with limited feedback: instead of observing the adversary's full action (e.g. the losses of all arms), in the bandit setting we observe only the loss actually incurred by the arm we chose.
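A minimal sketch of the adversarial bandit setting (EXP3, assumed here as the standard example; eta is an assumed parameter): only the chosen arm's loss is observed, so an importance-weighted loss estimate feeds the exponential-weights update.

```python
import numpy as np

def exp3(loss_fn, K, T, eta=0.05, rng=None):
    """EXP3 for adversarial multi-armed bandits.

    loss_fn(t, arm) -> loss in [0, 1] of the chosen arm at round t
                       (the losses of the other arms stay hidden).
    K, T:              number of arms and number of rounds.
    eta:               learning rate (assumed; theory suggests eta ~ sqrt(ln K / (K T))).
    """
    rng = rng or np.random.default_rng()
    log_w = np.zeros(K)
    total_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        arm = rng.choice(K, p=p)      # the agent must randomize
        loss = loss_fn(t, arm)        # limited feedback: one arm's loss only
        total_loss += loss
        est = np.zeros(K)
        est[arm] = loss / p[arm]      # unbiased importance-weighted loss estimate
        log_w -= eta * est            # exponential-weights update on the estimate
    return total_loss
```

The sampling step also illustrates the point below: even though the analysis is probability-free with respect to the data, the agent itself has to be stochastic.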

Multiarmed Bandits With Limited Expert Advice



Adversarial online learning is about being probability-free, but the solution requires a stochastic agent, so probability is needed in the end anyway.

Is the usefulness of adversarial/agnostic analyses that they often imply learnability in non-worst-case settings as well?

Adversarial analysis of the agent makes less sense, since we have control over the agent (unlike over the data).