This takes the distribution-free bounds of Statistical learning theory and goes one step further, by assuming that the data does not necessarily follow a distribution at all. Instead we consider it arbitrary so that to obtain bounds on the Regret (Online learning), we need to consider the worst-case data.
Can relate to Learning boosting
he essential difference be-tween online learning and statistical learning, in addition to the sequential aspect,is the fact that no probabilistic assumption is made on the dataset.
Online learnability (worst-case regret) is stronger (it implies) PAC learnability
An example of an online learning problem with limited feedback: instead of observing the adversary's action, we observe only the incurred loss in the case of bandits.
Multiarmed Bandits With Limited Expert Advice
Adversarial online learning is about being probability-free, but solution requires stochastic agent, so needs probability at the end..
Is the usefulness of adversarial/agnostic analyses that they tend to imply non-worst case learnability often..?
Adversarial analysis of the agent. we have control over it. Doesn't make so much sense