Learning Boolean conjunctions


See Probably approximately correct

Variables $x_1, \ldots, x_n$.

Target conjunction, e.g. $c = x_1 \wedge x_2 \wedge \bar{x_3} \wedge \ldots \wedge x_n$.

Algorithm

  1. Start with $h = x_1 \wedge \bar{x_1} \wedge \ldots \wedge x_n \wedge \bar{x_n}$ (all $2n$ literals).
  2. For each observed input/output pair $(\vec{a}, 1)$ (only positive examples are used): for all $i$, if $a_i = 1$, drop $\bar{x_i}$; if $a_i = 0$, drop $x_i$. A code sketch of this rule follows the list.
  3. Return $h$.
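
A minimal code sketch of this elimination rule (my own illustration, not from the note; the encoding of a literal as a pair (i, v), meaning "variable i equals v", is an assumption of the sketch):

    # Minimal sketch of the elimination algorithm (illustrative names).
    # A literal is a pair (i, v): it is satisfied by input a iff a[i] == v,
    # so (i, 1) stands for "variable i is true" and (i, 0) for its negation.

    def learn_conjunction(n, positive_examples):
        """Learn a conjunction over n Boolean variables from positive examples."""
        # Step 1: start with all 2n literals.
        h = {(i, v) for i in range(n) for v in (0, 1)}
        # Step 2: for each positive example a, drop every literal that a falsifies
        # (if a_i = 1 drop the negated literal (i, 0); if a_i = 0 drop (i, 1)).
        for a in positive_examples:
            h = {(i, v) for (i, v) in h if a[i] == v}
        # Step 3: return h.
        return h

    def evaluate(h, a):
        """h(a) = 1 iff every remaining literal is satisfied by a."""
        return all(a[i] == v for (i, v) in h)

For example, learn_conjunction(4, [(1, 1, 0, 1), (1, 0, 0, 1)]) returns the literal set corresponding to $x_1 \wedge \bar{x_3} \wedge x_4$.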

We can only make false negative errors: every literal of $c$ is satisfied by every positive example, so it is never dropped, and the literal set of $h$ always contains that of $c$. Hence, for the real function $c$:

$c(a) = 1$: $h(a)$ may be 0 or 1.

$c(a) = 0 \Rightarrow h(a) = 0$.

A literal $l$ is something like $x_1$ or $\bar{x_1}$. For each literal define $p(l) = P_{a \sim D}[c(a) = 1 \wedge l = 0 \text{ in } a]$.

A literal $l$ can cause a misclassification only if the event $\{c(a) = 1 \wedge l = 0 \text{ in } a\}$ is never observed among the training examples; if it is observed, $l$ is eliminated from $h$.

We only worry about literals for which this event is not rare: $p(l) \geq \frac{\epsilon}{2n}$. We call these bad literals.

$A_l$ is the event that, after $m$ points drawn independently from $D$, the bad literal $l$ has not been eliminated from $h$.

Probability: $P(A_l) = (1 - p(l))^m \leq \left(1 - \frac{\epsilon}{2n}\right)^m$ for a bad literal $l$.

$E = \bigcup_{l \text{ bad}} A_l$

Probably: the probability that we don't eliminate all bad literals is small. By the union bound,

$P(E) \leq \sum_{l \text{ bad}} P(A_l) \leq (\#\text{bad literals}) \left(1 - \frac{\epsilon}{2n}\right)^m \leq 2n \left(1 - \frac{\epsilon}{2n}\right)^m \leq 2n \exp(-m\epsilon/2n) \leq \delta$

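Rearranging the last inequality gives the sample complexity (a standard step, spelled out here for completeness):

$2n \exp(-m\epsilon/2n) \leq \delta \iff m \geq \frac{2n}{\epsilon} \ln \frac{2n}{\delta}$

For instance, with $n = 10$, $\epsilon = 0.1$ and $\delta = 0.05$, this requires $m \geq 200 \ln 400 \approx 1199$ examples.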

Approximately: the probability of error, if we have eliminated all bad literals, is small: $err(h) = P_{a \sim D}[c(a) = 1 \wedge h(a) = 0] \leq \sum_{l \in h} p(l) \leq 2n \cdot \frac{\epsilon}{2n} = \epsilon$, since every literal left in $h$ is good (i.e. has $p(l) < \frac{\epsilon}{2n}$) and $h$ contains at most $2n$ literals.

The first inequality holds because $c(a) = 1 \wedge h(a) = 0 \Rightarrow \exists l \in h$ (necessarily $l \notin c$) such that $l = 0$ in $a$, which for a fixed $l$ is exactly the event of probability $p(l)$.
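
As a sanity check of the PAC guarantee, here is a small self-contained simulation sketch (my own, with illustrative assumptions: $D$ uniform over $\{0,1\}^n$, a fixed small target conjunction, and the same elimination rule as the code sketch above):

    # Illustrative PAC sanity check (assumptions: D uniform over {0,1}^n,
    # target c is a fixed small conjunction; same elimination rule as above).
    import random

    def learn_conjunction(n, positive_examples):
        h = {(i, v) for i in range(n) for v in (0, 1)}
        for a in positive_examples:
            h = {(i, v) for (i, v) in h if a[i] == v}
        return h

    def evaluate(literals, a):
        return all(a[i] == v for (i, v) in literals)

    n, eps, delta = 10, 0.1, 0.05
    m = 1199  # roughly (2n/eps) * ln(2n/delta), as derived above

    random.seed(0)
    c = {(0, 1), (3, 0), (7, 1)}  # target conjunction on 3 of the 10 variables
    sample = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(m)]
    h = learn_conjunction(n, [a for a in sample if evaluate(c, a)])

    test = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(20000)]
    err = sum(evaluate(c, a) != evaluate(h, a) for a in test) / len(test)
    print(f"estimated err(h) = {err:.4f}  (bound: <= eps = {eps} w.h.p.)")

In this easy regime $h$ typically recovers $c$ exactly, so the estimated error is essentially 0, well below $\epsilon$.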