Learn a DFA representing a target language L⊆Σ∗ using membership and equivalence queries to a teacher
- Membership query. selects string and asks whether that string is part of the langauge
- Equivalence query. learner selects hypothesis DFA H and asks does L(H)=L? If yes, the algo halts. If no, the teacher gives a counterexample.
We want to exactly learn L using # of queries polynomial in the size of a in DFA for L and the length of the longer coutnerexample given by the teacher.
Angulin's L∗-algorithm
See also Evolving automata.
Target language L. The learner maintans
- A set Q⊆Σ∗ of access worlds, with ϵ∈Q.
- A set of T⊆Σ∗ of test words.
We say that v,w∈Σ∗ are T-equivalent, deonted V≡Tw if vu∈L⇔wu∈L for all u∈T, i.e. L(vu)=L(wu).
(Q,T) is separable if no two distinct words v,w∈Q are T-equivalent.
(Q,T) is closed if for all q∈Q and all a∈Σ, there exists some q′∈Q s.t. qa≡Tq′.
If (Q,T) are separable and closed then we define a DFA H as follows;
- The set of states of H is Q.
- The initial state is ϵ.
- Transition function: Given q∈Q and a∈Σ, q makes an a-transition to the unique q′∈Q s.t. qa≡Tq′.
- The set of accepting states Q∩L
Algorithm
Invariant: (Q,T is separable
- Q:={ϵ}, T:={ϵ}.
- If (Q,T) is not closed then expand Q using membership queries until (Q,T) is closed and separable. Add the necessary qa.
- Compute hypothesis automaton and ask equivalence query.
- If yes, then halt.
- If no, then use counterexample to properly expand Q and T to again get separable (Q,T).
L={w∈{a,b}∗:∣w∣b≡3(mod4)}
∣⋅∣b means number of bs.
(Q,T) | H | Counterexample |
(ϵ,ϵ) | | bbb |
...
Prop (for step 5): Suppose (Q,T) is sep. and closed; with H the hyp. automaton. Given a counter example w∈Σ∗ , using log∣w∣ membership queries, one can find q∈Σ∗∖Q and t∈Σ∗ s.t. (Q∪{q},T∪{t}) is separable.
Let w=w1w2...wn and let q0,q1,...qn be the run of H on w. We say that q_i is correct if L(qiwi+1...wn)=L(w)....
Using membership queries and binary shearch, we find i s.t. qi−1 is correct and qi is not correct, that is,
L(qi−1wi...wn)≠L(qiwi+1...wn)
Let Q′=Q∪{qi−1wi}
T′=T∪{wi+1...wn}.
Note that qi≡Tqi−1wi. But qi\nequivT′qi−1wi, so (Q′,T′) is separable.
Proving that the algorithm halts means that it is correct. We can show Q is bounded by the size of the minimum automaton for L