Attempts to extend correction queries
Cristina Bibire
Research Group on Mathematical Linguistics, Rovira i Virgili University
Pl. Imperial Tarraco 1, 43005, Tarragona, Spain
E-mail: [email protected]
31st of October 2005, Seminar IV
Correction queries
PAC learning of DFA
Learning CFL
Learning WFA
Redefining the correcting string
References
Learning from corrections
The correcting string of s in the language L is the smallest string s' (in lex-length order, i.e. shorter strings first, ties broken lexicographically) such that s.s' belongs to L.
The answer to a correction query for a string consists of its correcting string.
Myhill-Nerode theorem:
The number of states in the smallest DFA accepting L is equal to the number of equivalence classes of the relation $\equiv_L$, where

$$x \equiv_L y \iff \forall z \in \Sigma^* \ (xz \in L \iff yz \in L)$$
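For a regular language given by a DFA, the correcting string defined on this slide can be found by a breadth-first search starting from the state reached on s. The sketch below is only an illustration: the function name `correcting_string`, the dictionary encoding of the transition function, the example DFA (even number of 0s and even number of 1s) and the convention of returning `None` when no correction exists are all assumptions, not taken from the slides.

```python
from collections import deque

def correcting_string(s, delta, start, accepting, alphabet):
    """Return the smallest s' in lex-length order with s + s' accepted, or None."""
    # Run the DFA on s to find the state from which the search starts.
    state = start
    for ch in s:
        state = delta[state][ch]
    # Breadth-first search with symbols tried in sorted order: states are
    # discovered by shortest suffix first, ties broken lexicographically.
    seen = {state}
    queue = deque([(state, "")])
    while queue:
        state, suffix = queue.popleft()
        if state in accepting:
            return suffix
        for a in sorted(alphabet):
            nxt = delta[state][a]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, suffix + a))
    return None  # assumed convention: no string s' makes s + s' a member of L

# Hypothetical example DFA: state (p, q) holds the parities of 0s and 1s read.
example_delta = {(p, q): {"0": (1 - p, q), "1": (p, 1 - q)}
                 for p in (0, 1) for q in (0, 1)}
print(correcting_string("01", example_delta, (0, 0), {(0, 0)}, "01"))  # prints 01
```

A string that is already in the language gets the empty string as its correction, which is how membership can be read off from a correction query.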
How can we extend CQ?
• PAC learning of DFA with CQ
• Learning CFL with CQ
• Learning WFA with CQ
• Redefining the correcting string
PAC learning of DFA with CQ
We assume that there is some probability distribution Pr on the set of all strings over the alphabet Σ, and we let L be an unknown regular set.
The Learner has access to information about L by means of two oracles:
• C(x) returns the correcting string for x.
• Ex( ) is a random sampling oracle that selects a string x from Σ* according to the distribution Pr and returns the pair (x, C(x)).
In addition, the Learner is given the accuracy ε and the confidence δ.
Definition: We say that the language L1 is an ε-approximation of the language L2 provided that

$$\sum_{x \in L_1 \triangle L_2} \Pr(x) \le \varepsilon$$

where $L_1 \triangle L_2$ is the symmetric difference of the two languages.
If A is a DFA, it is said to be an ε-approximation of the set L if L(A) is an ε-approximation of L.
PAC learning of DFA with CQ
If A is an ε-approximation of L, then the probability of finding a discrepancy between L(A) and L with one call to the random sampling oracle Ex( ) is at most ε.
The approximate learner LCAapprox is obtained by modifying LCA. A correction query of the string x is satisfied by a call to C(x). Each conjecture is tested by a number of calls to Ex( ).
• If any of the calls to Ex( ) returns a pair (t, C(t)) such that:
- C(t)=λ but A(S,E,C) rejects it or
- C(t)≠λ but A(S,E,C) accepts it
then t is said to be a counterexample and LCAapprox proceeds as LCA
• If none of the calls to Ex( ) returns a counterexample, then LCAapprox halts and outputs A(S,E,C)
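The conjecture test described above can be sketched as follows. This is a minimal illustration under stated assumptions: `find_counterexample`, `accepts` (the membership test of the conjectured automaton) and `ex` (a zero-argument stand-in for the Ex( ) oracle) are hypothetical names, and the empty Python string plays the role of λ.

```python
def find_counterexample(accepts, ex, num_calls):
    """Test a conjecture with num_calls samples; return a counterexample or None."""
    for _ in range(num_calls):
        t, correction = ex()          # pair (t, C(t)) from the sampling oracle
        in_target = correction == ""  # C(t) = λ exactly when t belongs to L
        if in_target != accepts(t):   # conjecture and target disagree on t
            return t
    return None  # no discrepancy seen: the learner halts with this conjecture

# Hypothetical usage with a fixed list standing in for random samples.
samples = iter([("", ""), ("0", "0"), ("00", "")])
print(find_counterexample(lambda t: True, lambda: next(samples), 3))  # prints 0
```

Only the comparison between C(t) = λ and acceptance is used here; the non-empty corrections would additionally be stored by the learner in its table.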
PAC learning of DFA with CQ
How many calls to Ex( ) does LCAapprox make to test a given conjecture? The number depends on:
• the accuracy and confidence parameters, ε and δ
• how many previous conjectures have been tested
Let

$$r_i = \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + (i+1)\ln 2\right)$$

If i previous conjectures have been tested, then LCAapprox makes $\lceil r_i \rceil$ calls to Ex( ).
Theorem. If n is the number of states in the minimum DFA for the target language L, then LCAapprox terminates after $O\left(n + \frac{1}{\varepsilon}\left(n\ln\frac{1}{\delta} + n^2\right)\right)$ calls to the Ex( ) oracle. Moreover, the probability that the automaton output by LCAapprox is an ε-approximation of L is at least 1-δ.
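To get a feel for the sample sizes, the number of calls can be computed directly from $r_i = \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + (i+1)\ln 2\right)$. The function name `calls_for_conjecture` and the parameter values in the usage lines are illustrative assumptions.

```python
from math import ceil, log

def calls_for_conjecture(i, eps, delta):
    """Number of Ex() calls used to test a conjecture after i earlier ones."""
    r_i = (log(1 / delta) + (i + 1) * log(2)) / eps
    return ceil(r_i)  # the learner makes the ceiling of r_i calls

# Hypothetical parameters: accuracy 0.1, confidence 0.05.
print(calls_for_conjecture(0, 0.1, 0.05))  # prints 37
print(calls_for_conjecture(1, 0.1, 0.05))  # prints 44
```

Each additional conjecture adds about (ln 2)/ε samples, which is where the n² term in the theorem comes from.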
PAC learning of DFA with CQ
Sketch of the proof:
• the total number of counterexamples is at most n-1, so the total number of calls to Ex( ) is at most

$$\sum_{i=0}^{n-2} (r_i + 1) = O\left(n + \frac{1}{\varepsilon}\left(n\ln\frac{1}{\delta} + n^2\right)\right)$$

• the probability that LCAapprox will terminate with an automaton that is not an ε-approximation of L is at most

$$\sum_{i=0}^{n-2} (1-\varepsilon)^{r_i} \le \sum_{i=0}^{n-2} e^{-\varepsilon r_i} \le \sum_{i=0}^{n-2} \frac{\delta}{2^{i+1}} \le \delta$$
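The bound on the failure probability follows from the definition $r_i = \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + (i+1)\ln 2\right)$ and the inequality $1-\varepsilon \le e^{-\varepsilon}$. The worked steps below are a reading aid written out under that definition, not a transcription of the slide.

```latex
\begin{align*}
(1-\varepsilon)^{r_i} &\le e^{-\varepsilon r_i}
  && \text{since } 1-\varepsilon \le e^{-\varepsilon} \\
e^{-\varepsilon r_i} &= \exp\!\left(-\ln\tfrac{1}{\delta} - (i+1)\ln 2\right)
  = \frac{\delta}{2^{i+1}}
  && \text{by the definition of } r_i \\
\sum_{i \ge 0} \frac{\delta}{2^{i+1}} &= \delta
  && \text{geometric series}
\end{align*}
```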
Learning CFL
The setting
There is an unknown CFG G in Chomsky normal form. The Learner knows the set T of terminal symbols, the set N of nonterminal symbols and the start symbol S of G. The Teacher is assumed to answer two types of questions:
• MEMBER(x,A) – if the string x can be derived from the non-terminal A in the grammar G, the answer is yes; otherwise, it is no
• EQUIV(H) – if H is equivalent to G, the answer is yes; otherwise, it replies with a counterexample t.
Learning CFL
The Learner LCF
LCF can explicitly enumerate all the possible productions of G in polynomial time (in |T| and |N|). Initially LCF places all possible productions of G in the hypothesized set of productions P.
The main loop of LCF asks an EQUIV(H) question for the grammar H=(T,N,S,P).
• if H is equivalent to G, then LCF halts and outputs H
• otherwise, it “diagnoses” the counterexample t returned, which results in removing at least one production from P; the main loop is then repeated.
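The initial hypothesis of LCF is small enough to enumerate explicitly, because a grammar in Chomsky normal form only has productions of the shapes A → BC and A → a. A minimal sketch, assuming productions are encoded as (left-hand side, right-hand side tuple) pairs; the name `all_cnf_productions` and the sample symbols are hypothetical.

```python
from itertools import product

def all_cnf_productions(terminals, nonterminals):
    """Enumerate every Chomsky-normal-form production over the given symbols."""
    # A -> B C for every triple of nonterminals: |N|^3 productions.
    binary = [(a, (b, c)) for a, b, c in product(nonterminals, repeat=3)]
    # A -> a for every nonterminal and terminal: |N| * |T| productions.
    unary = [(a, (t,)) for a in nonterminals for t in terminals]
    return binary + unary

# Hypothetical symbols: two nonterminals and two terminals.
P = all_cnf_productions(["a", "b"], ["S", "A"])
print(len(P))  # prints 12, i.e. 2**3 + 2*2
```

Since each diagnosed counterexample removes at least one production, the main loop runs at most |N|³ + |N||T| times under this encoding.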
Learning WFA
Let $K$ be a field and $f: \Sigma^* \to K$ be a function. Associate with $f$ an infinite matrix $F$ with rows indexed by strings $x \in \Sigma^*$ and columns indexed by strings $y \in \Sigma^*$. The $(x, y)$ entry of $F$ contains the value f(x.y). The function $f$ is called a power series and $F$ its Hankel matrix.
If we have a WFA A we can associate a function $f_A$ and vice versa, for every function $f$ there exists a smallest WFA A such that $f_A \equiv f$.
Theorem [Carlyle, Paz 1971] Let $f: \Sigma^* \to K$ such that $f \not\equiv 0$ and let F be the corresponding Hankel matrix. Then, the size r of the smallest WFA A such that $f_A \equiv f$ satisfies r=rank(F).
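The rank statement can be checked on a finite block of the Hankel matrix. The sketch below is an assumption-laden illustration: it takes the rationals as the field K, uses the hypothetical example function f(w) = number of 1s in w, and only inspects strings up to a chosen length, so it gives a lower bound on rank(F) rather than the rank of the infinite matrix.

```python
from fractions import Fraction
from itertools import product

def strings_up_to(alphabet, max_len):
    """All strings over the alphabet of length at most max_len."""
    return ["".join(p) for n in range(max_len + 1)
            for p in product(alphabet, repeat=n)]

def hankel_block(f, alphabet, max_len):
    """Finite block of the Hankel matrix: entry (x, y) holds f(x + y)."""
    words = strings_up_to(alphabet, max_len)
    return [[f(x + y) for y in words] for x in words]

def rank(matrix):
    """Rank over the rationals by Gauss-Jordan elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

count_ones = lambda w: w.count("1")  # hypothetical target function
print(rank(hankel_block(count_ones, "01", 2)))  # prints 2
```

Here f(x.y) = f(x) + f(y), so every row is a combination of two fixed vectors, which matches a two-state WFA for counting 1s.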
Learning WFA
Let f be a target function. The learning algorithm may ask the oracle two types of query:
• EQ(h): if h is equivalent to f on all input assignments, then the answer to the query is yes; otherwise, the answer is no and the algorithm receives a counterexample z (with $f(z) \ne h(z)$).
• MQ(z): the oracle has to return f(z).
The algorithm learns a function f using its Hankel matrix, F. Because of the Carlyle-Paz theorem, it is enough to keep an $r \times r$ sub-matrix of F of full rank. Therefore the learning algorithm can be viewed as a search for appropriate r rows and r columns.
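Learning WFA
The algorithm
Here $\hat F_x$ denotes row x of the Hankel matrix F restricted to the columns in Y, so $\hat F_x(y) = f(x.y)$ for $y \in Y$.
(1) Initialize: $x_1 \leftarrow \lambda$, $y_1 \leftarrow \lambda$, $X \leftarrow \{x_1\}$, $Y \leftarrow \{y_1\}$ and $\ell \leftarrow 1$
(2) Define a hypothesis h
Let $\hat\gamma = (f(x_1), \dots, f(x_\ell))$
For every $\sigma \in \Sigma$, define a matrix $\hat\mu_\sigma$ such that $\hat F_{x_i.\sigma} = \sum_{j=1}^{\ell} [\hat\mu_\sigma]_{i,j}\, \hat F_{x_j}$ for $i = 1, \dots, \ell$
For every $w = \sigma_1\sigma_2\cdots\sigma_k \in \Sigma^*$, define $\hat\mu(w) = \hat\mu_{\sigma_1}\hat\mu_{\sigma_2}\cdots\hat\mu_{\sigma_k}$ and $h(w) = [\hat\mu(w)]_1 \cdot \hat\gamma$
(3) Ask an equivalence query EQ(h)
• If the answer is yes, halt and output h
• Otherwise, the answer is no and we receive a counterexample z
Using MQ, find a string w.σ, a prefix of z, such that
(a) $\hat F_w = \sum_{i=1}^{\ell} [\hat\mu(w)]_{1,i}\, \hat F_{x_i}$
(b) $\exists y \in Y$ s.t. $\hat F_{w.\sigma}(y) \ne \sum_{i=1}^{\ell} [\hat\mu(w)]_{1,i}\, \hat F_{x_i.\sigma}(y)$
Set $x_{\ell+1} \leftarrow w$, $y_{\ell+1} \leftarrow \sigma.y$, $X \leftarrow X \cup \{x_{\ell+1}\}$, $Y \leftarrow Y \cup \{y_{\ell+1}\}$ and $\ell \leftarrow \ell + 1$
Go to (2)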
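Step (2) defines the hypothesis as h(w) = [μ̂(w)]₁ · γ̂, where μ̂(w) is the product of the per-symbol matrices along w. The sketch below evaluates such a hypothesis. The function names, the choice of rows x₁ = λ and x₂ = 1, and the example matrices (worked out by hand for the hypothetical target f(w) = number of 1s in w) are assumptions made for illustration.

```python
def row_times_matrix(row, matrix):
    """Multiply a row vector by a square matrix given as a list of rows."""
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(len(matrix[0]))]

def evaluate(w, mu, gamma):
    """Compute h(w) = [mu(w)]_1 . gamma without building the full product."""
    # The first row of mu(λ), the identity matrix, is the unit vector e_1.
    row = [1] + [0] * (len(gamma) - 1)
    for sigma in w:
        row = row_times_matrix(row, mu[sigma])
    return sum(r * g for r, g in zip(row, gamma))

# Hypothetical hypothesis for f(w) = number of 1s, with rows x_1 = λ, x_2 = 1:
gamma = [0, 1]                 # (f(λ), f(1))
mu = {
    "0": [[1, 0], [0, 1]],     # reading 0 leaves both rows unchanged
    "1": [[0, 1], [-1, 2]],    # F_{λ.1} = F_1 and F_{1.1} = 2 F_1 - F_λ
}
print(evaluate("1101", mu, gamma))  # prints 3
```

Only the first row of the running product is ever needed, so evaluation costs one vector-matrix product per symbol of w.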
Redefining the correcting string
• Hamming distance (only for strings of the same length). For two strings s and t, H(s, t) is the number of places in which the two strings differ, i.e., have different characters.
• $C(s) = s'$, s.t. $s' \in L$ and $H(s, s')$ is minimum
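The Hamming-based correction, C(s) = the string s' in L closest to s, can be sketched for a finite sample of the language. The names `hamming` and `closest_string`, the finite set standing in for L, the lexicographic tie-break and the use of `None` when no string of the right length exists are assumptions for illustration.

```python
def hamming(s, t):
    """Number of positions where two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of the same length")
    return sum(a != b for a, b in zip(s, t))

def closest_string(s, language):
    """Member of a finite language sample closest to s in Hamming distance."""
    candidates = [t for t in language if len(t) == len(s)]
    if not candidates:
        return None  # assumed convention: no string of the same length in L
    # Ties are broken by taking the lexicographically smallest candidate.
    return min(candidates, key=lambda t: (hamming(s, t), t))

# Hypothetical finite sample of a language over {0, 1}.
sample = {"", "00", "11", "0110", "1010"}
print(closest_string("01", sample))  # prints 00
```

With this sample, "01" is at distance 1 from both "00" and "11", and the tie-break selects "00".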
[Figure: target DFA with states q0, q1, q2 and q3 over the alphabet {0, 1}]

Observation table for this definition (rows λ, 0, 00 form S; rows 1, 01, 000, 001 form SΣ \ S):

| | λ | States |
|---|---|---|
| λ | λ | q0 |
| 0 | φ | q1 |
| 00 | 00 | q2 |
| 1 | φ | q1 |
| 01 | 00 | q2 |
| 000 | φ | q1 |
| 001 | φ | q1 |

[Figure: DFA conjectured from this observation table, with states q0, q1 and q2 and transitions labelled 0, 1]
Redefining the correcting string
• Hamming distance (only for strings of the same length): the answer to a query is now the minimum distance itself,

$$C(s) = \min\{H(s, s') \mid s' \in L\}$$

with $C(s) = \infty$ when L contains no string of the same length as s.
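When L is given by a DFA, the minimum Hamming distance from s to L can be computed by dynamic programming over the states. The sketch below is illustrative: the name `min_hamming_to_dfa`, the dictionary encoding of the transition function and the example DFA (even number of 0s and even number of 1s, chosen as a hypothetical four-state target) are assumptions, not taken from the slides.

```python
from math import inf

def min_hamming_to_dfa(s, delta, start, accepting):
    """Smallest H(s, s') over strings s' of length |s| accepted by the DFA."""
    # cost[q] = fewest substitutions in the prefix read so far such that the
    # rewritten prefix drives the DFA from the start state to q.
    cost = {start: 0}
    for ch in s:
        nxt = {}
        for q, c in cost.items():
            for a, q2 in delta[q].items():
                c2 = c + (a != ch)  # pay 1 when the symbol is replaced
                if c2 < nxt.get(q2, inf):
                    nxt[q2] = c2
        cost = nxt
    # inf signals that no string of the same length belongs to the language.
    return min((c for q, c in cost.items() if q in accepting), default=inf)

# Hypothetical example DFA: state (p, q) holds the parities of 0s and 1s read.
parity = {(p, q): {"0": (1 - p, q), "1": (p, 1 - q)}
          for p in (0, 1) for q in (0, 1)}
print(min_hamming_to_dfa("01", parity, (0, 0), {(0, 0)}))  # prints 1
print(min_hamming_to_dfa("0", parity, (0, 0), {(0, 0)}))   # prints inf
```

The running time is proportional to |s| times the number of transitions, so a teacher holding the DFA could answer such distance queries efficiently.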

[Figure: target DFA with states q0, q1, q2 and q3 over the alphabet {0, 1}]

Observation table for the distance-valued definition (rows λ, 0, 01 form S; rows 1, 00, 010, 011 form SΣ \ S):

| | λ | States |
|---|---|---|
| λ | 0 | q0 |
| 0 | ∞ | q1 |
| 01 | 1 | q2 |
| 1 | ∞ | q1 |
| 00 | 0 | q0 |
| 010 | ∞ | q1 |
| 011 | ∞ | q1 |

[Figure: DFA conjectured from this observation table, with states q0, q1 and q2 and transitions labelled 0 and 1]
Redefining the correcting string
• Hamming distance: $C(s) = \min\{H(s, s') \mid s' \in L\}$

[Figure: target DFA with states q0, q1, q2 and q3 over the alphabet {0, 1}]

Observation table with experiments λ and 0 (rows λ, 0, 1, 01, 10 form S; the remaining rows form SΣ \ S):

| | λ | 0 | States |
|---|---|---|---|
| λ | 0 | ∞ | q0 |
| 0 | ∞ | 0 | q1 |
| 1 | ∞ | 1 | q2 |
| 01 | 1 | ∞ | q3 |
| 10 | 1 | ∞ | q3 |
| 00 | 0 | ∞ | q0 |
| 11 | 0 | ∞ | q0 |
| 010 | ∞ | 1 | q2 |
| 011 | ∞ | 0 | q1 |
| 100 | ∞ | 1 | q2 |
| 101 | ∞ | 0 | q1 |
Redefining the correcting string
• Hamming distance: $C(s) = \min\{H(s, s') \mid s' \in L\}$

[Figure: target DFA with states q0, q1 and q2 over the alphabet {0, 1}]

Observation table with experiments λ and 0 (rows λ, 0, 00, 01 form S; the remaining rows form SΣ \ S):

| | λ | 0 | States |
|---|---|---|---|
| λ | ∞ | ∞ | q0 |
| 0 | ∞ | 0 | q1 |
| 00 | 0 | 1 | q2 |
| 01 | 1 | 0 | q3 |
| 1 | ∞ | 1 | q4 |
| 000 | 1 | 2 | q5 |
| 001 | 0 | 1 | q2 |
| 010 | 0 | 1 | q2 |
| 011 | 1 | 0 | q2 |
Redefining the correcting string
• Levenshtein (or edit) distance. Unlike the Hamming distance, it also counts positions where one string has a character and the other does not, so it is defined for strings of different lengths.
For two characters a and b, define:

$$r(a, b) = \begin{cases} 0, & a = b \\ 1, & a \ne b \end{cases}$$

Assume we are given two strings s and t of length n and m, respectively. We fill an (n+1)×(m+1) array d, indexed from 0, with integers such that the lower-right corner element d(n, m) gives the Levenshtein distance Lev(s, t).
The definition of the entries of d is recursive.
First set $d(i, 0) = i$, $i = 0, \dots, n$, and $d(0, j) = j$, $j = 0, \dots, m$.
For other pairs i, j use

$$d(i, j) = \min\big(d(i-1, j) + 1,\ d(i, j-1) + 1,\ d(i-1, j-1) + r(s(i), t(j))\big)$$
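The recursive definition translates directly into a table-filling routine. A minimal sketch, with the array indexed from 0 so that the answer sits in d[n][m]; the function name `levenshtein` and the sample strings are illustrative choices.

```python
def levenshtein(s, t):
    """Edit distance between s and t via the (n+1) x (m+1) table d."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i characters of the prefix of s
    for j in range(m + 1):
        d[0][j] = j  # insert all j characters of the prefix of t
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # r(a, b) from the definition; Python strings are 0-indexed,
            # so the i-th character of s is s[i - 1].
            r = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + r)  # match or substitution
    return d[n][m]

print(levenshtein("kitten", "sitting"))  # prints 3
print(levenshtein("", "00"))             # prints 2
```

The second call shows the difference from the Hamming distance: the empty string is at a finite distance from strings of other lengths.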
Redefining the correcting string
• Levenshtein distance: $C(s) = \min\{Lev(s, s') \mid s' \in L\}$

[Figure: target DFA with states q0, q1 and q2 over the alphabet {0, 1}]

Observation table (rows λ, 0, 00 form S; rows 1, 01, 000, 001 form SΣ \ S):

| | λ | States |
|---|---|---|
| λ | 2 | q0 |
| 0 | 1 | q1 |
| 00 | 0 | q2 |
| 1 | 2 | q0 |
| 01 | 1 | q1 |
| 000 | 1 | q1 |
| 001 | 0 | q2 |

[Figure: DFA conjectured from this observation table, with states q0, q1 and q2 and transitions labelled 0 and 1]
Redefining the correcting string
• Levenshtein distance: $C(s) = \min\{Lev(s, s') \mid s' \in L\}$

[Figure: target DFA with states q0, q1 and q2 over the alphabet {0, 1}]

Observation table with experiments λ and 0 (rows λ, 0, 00, 000, 0000 form S; the remaining rows form SΣ \ S):

| | λ | 0 | States |
|---|---|---|---|
| λ | 2 | 1 | q0 |
| 0 | 1 | 0 | q1 |
| 00 | 0 | 1 | q2 |
| 000 | 1 | 1 | q3 |
| 0000 | 1 | 0 | q1 |
| 1 | 2 | 1 | q0 |
| 01 | 1 | 0 | q1 |
| 001 | 0 | 1 | q2 |
| 0001 | 1 | 1 | q3 |
| 00000 | 0 | 1 | q2 |
| 00001 | 1 | 0 | q1 |
References
D. Angluin. Learning Regular Sets from Queries and Counterexamples. Information and Computation 75, 87-106 (1987)
L. Lee. Learning of Context-Free Languages: A Survey of the Literature. Harvard University Technical Report TR-12-1996 (written in 1994)
C. de la Higuera. Learning Stochastic Finite Automata from Experts. In Proceedings of the 4th International Colloquium on Grammatical Inference, Lecture Notes In Computer Science 1433, 79-89 (1998)
F. Bergadano, N. Bshouty, A. Beimel, E. Kushilevitz and S. Varricchio. Learning Functions Represented as Multiplicity Automata. Journal of the ACM 47, 506-530 (2000)
http://www.cut-the-knot.org/do_you_know/Strings.shtml