Growth function: Cosmos — All that is, or was, or ever will be

Growth function

cosmos 17th May 2019 at 12:39am

The growth function is defined, for every natural number $m$ , as

\Pi_C(m) = max\{|\Pi_C(S)| | |S| = m\}

where $\Pi_C(S)$ is the number of dichotomies of Concept class $C$ on the set $S$ .

For a concept class of VC dimension $d$ , we have

For $m \leq d$ , $\Pi_C (m) = 2^m$
$m \geq d$ , $\Pi_C(m)$ is bounded a degree $d$ polynomial in $m$ ! See here and VC dimension, PAC learning with infinite concept classes

See here

Growth function-based Generalization error bounds

Growth function generalization error bound

Can be shown using Rademacher complexity, and is part of most proofs of VC dimension-based bounds (so uses the double-sample argument, etc)

One can obtain realizable versions of the bounds as well. See the proof of growth function generalization error bound for realizable case (see my notes, or Understanding Machine Learning by Shai and Shai).

From Understanding Machine Learning by Shai and Shai page 51

From here

Tighter bound (but not uniform over training errors)

See here, A result of Vapnik with applications. Note that this is a bound that can be used to get bounds of the form if training error less than blah, then test error is less than blah. But it doesn't give {a bound on the test error, that is dependent on the training error}, which holds uniformly over training errors!

> In here they prove a tighter version of the Growth function generalization error bound, which allows them to bound a sort of relative version of the generalization gap (divided by the square root of the true error, or by the actual true error, if they assume a bound on it too). > The bound they get is not uniform over training error, it's more like a generalization of the realizable bound, it shows that with high probability, if training error is less than blah (this is fixed before drawing the sample), then the generalization error is less than blah (just like this other blah). > That is why they later do the weighted union bound in here, because they don't really have an agnostic bound. They have a bound that holds can be applied to any training error, but doesn't work uniformly over all training errors. It's a different kind of thing.

Sauer's lemma

bound the Growth function of sets with bounded VC dimension.