
Unambiguous automata inference by means of states-merging methods

François Coste, Daniel Fredouille
{fcoste|dfredoui}@irisa.fr
http://www.irisa.fr/symbiose

IRISA-INRIA, Campus de Beaulieu
35042 Rennes Cedex
France


I - Automata inference

Definitions

Alphabet: Σ = {a,b}

Word: abbabbabbaaa ∈ Σ*

Language: L ⊆ Σ*

Automaton: [Figure: automaton recognizing L = {a+b}*a{a+b}]


Classes of automata (1/3)

Nondeterministic Automata (NFA)

Deterministic Automata (DFA)
– one outgoing transition per input symbol from each state

[Figure: example automata for L = {a+b}*a{a+b}]


Classes of automata (2/3)

Unambiguous Automata (UFA) [SH85]
– at most one accepting path per word

[Figure: example automata for L = {a+b}*a{a+b}, labelled NFA, UFA, DFA]
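The "one acceptance per word" condition can be checked for a given word by counting its accepting paths. The sketch below is only an illustration: the `(initial, final, delta)` tuple encoding, the function name and both example automata are assumptions made here, not taken from the slides. `NFA_L` is one possible NFA for L = {a+b}*a{a+b}; `AMBIGUOUS` recognizes a different language (words containing an a) and is included only for contrast.

```python
from collections import Counter

def count_accepting_paths(nfa, word):
    """Count the distinct accepting paths of `word` in an NFA without epsilon moves."""
    initial, final, delta = nfa
    counts = Counter({q: 1 for q in initial})  # number of paths reaching each state
    for symbol in word:
        following = Counter()
        for (p, a, q) in delta:
            if a == symbol:
                following[q] += counts[p]      # a missing key counts as 0
        counts = following
    return sum(n for q, n in counts.items() if q in final)

# Hypothetical encoding: (initial states, final states, transitions (source, symbol, target)).
NFA_L = ({0}, {2}, {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 2), (1, "b", 2)})
AMBIGUOUS = ({0}, {1}, {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 1), (1, "b", 1)})

print(count_accepting_paths(NFA_L, "abbabbabbaaa"))  # one accepting path
print(count_accepting_paths(AMBIGUOUS, "aa"))        # two accepting paths
```

An automaton is unambiguous when this count never exceeds 1, whatever the word; testing a single word, as here, can only exhibit an ambiguity, not rule it out.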


Automata inference

Examples: S+ = {aa, abab}

Counter-examples: S- = {ba, abbb}

L = {a+b}*a{a+b}
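As a quick sanity check, the two samples can be tested against L. This is a minimal sketch that assumes L = {a+b}*a{a+b} denotes the words over {a,b} whose second-to-last symbol is a, written here as the Python regular expression `[ab]*a[ab]`. The helper names are hypothetical.

```python
import re

# Assumed reading of L = {a+b}*a{a+b}: any word over {a,b} whose second-to-last symbol is a.
TARGET = re.compile(r"[ab]*a[ab]")

S_PLUS = {"aa", "abab"}    # examples
S_MINUS = {"ba", "abbb"}   # counter-examples

def in_target(word):
    """True if the whole word belongs to L."""
    return TARGET.fullmatch(word) is not None

def sample_is_consistent(positive, negative):
    """All examples are in L and no counter-example is."""
    return all(in_target(w) for w in positive) and not any(in_target(w) for w in negative)

print(sample_is_consistent(S_PLUS, S_MINUS))
```

Inference runs the other way round: the learner sees only S+ and S- and has to propose an automaton whose language agrees with both.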


Why this study?

State of the art: DFA inference

Our goal: introducing some amount of non-determinism

Why?
– NFA << DFA (an NFA can be much smaller than an equivalent DFA)
– inferring with less data
– inferring “explicit” representations

Method:
– extending the classical DFA inference algorithm


II - Study of the DFA inference framework


Search space for NFAs [DMV94]

[Figure: search space of automata derived by state merging, from the MCA up to the UA]


Counter-examples: compatibility

– S- ∩ L = ∅ (compatible)
– S- ∩ L ≠ ∅ (incompatible)

[Figure: compatible and incompatible regions of the search space between the MCA and the UA]
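The compatibility test can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: MCA is read as the maximal canonical automaton of [DMV94] (one disjoint branch per word of S+, sharing the initial state), UA as the one-state universal automaton, and all function names and the `(initial, final, delta)` encoding are hypothetical.

```python
def build_mca(positive):
    """Assumed MCA construction: one disjoint branch per positive word, all from state 0."""
    initial, final, delta = {0}, set(), set()
    next_state = 1
    for word in sorted(positive):
        current = 0
        for symbol in word:
            delta.add((current, symbol, next_state))
            current, next_state = next_state, next_state + 1
        final.add(current)
    return initial, final, delta

def universal_automaton(alphabet):
    """One state, initial and final, looping on every symbol: accepts every word."""
    return {0}, {0}, {(0, a, 0) for a in alphabet}

def accepts(nfa, word):
    """Standard NFA simulation on sets of states (no epsilon moves)."""
    initial, final, delta = nfa
    current = set(initial)
    for symbol in word:
        current = {q for (p, a, q) in delta if p in current and a == symbol}
    return bool(current & final)

def is_compatible(nfa, negative):
    """Compatible: no counter-example is accepted, i.e. S- ∩ L(A) = ∅."""
    return not any(accepts(nfa, w) for w in negative)

S_PLUS, S_MINUS = {"aa", "abab"}, {"ba", "abbb"}
mca = build_mca(S_PLUS)
print(is_compatible(mca, S_MINUS))                        # the MCA accepts exactly S+
print(is_compatible(universal_automaton("ab"), S_MINUS))  # the UA accepts every word
```

With these samples the MCA is compatible and the UA is not, so any search that merges states from the MCA toward the UA has to stop somewhere in between.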


The search space for DFA

[Figure: the DFA search space between the MCA and the UA; legend: state merging, deterministic merging]


Merging for determinization procedure

∀ q1,q2 ∈ Q, ∀ w ∈ Σ*: w ∈ pref(q1) ∧ w ∈ pref(q2) ⇒ state-merging(q1,q2)
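The rule above (merge any two states that share a prefix) can be sketched as a fixpoint loop. This is a minimal illustration, not the authors' implementation: it assumes every state is reachable, encodes an automaton as a hypothetical `(initial, final, delta)` tuple, and enforces the rule locally by merging the targets of two same-symbol transitions leaving one state (or two initial states) until none remain.

```python
def merge_states(nfa, q1, q2):
    """Plain state merging: q2 is renamed to q1 everywhere."""
    initial, final, delta = nfa
    r = lambda q: q1 if q == q2 else q
    return ({r(q) for q in initial},
            {r(q) for q in final},
            {(r(p), a, r(q)) for (p, a, q) in delta})

def find_nondeterminism(nfa):
    """Return two distinct states reached by a common prefix, or None."""
    initial, _, delta = nfa
    if len(initial) > 1:
        return tuple(sorted(initial)[:2])
    targets = {}
    for p, a, q in sorted(delta):
        if (p, a) in targets and targets[(p, a)] != q:
            return targets[(p, a)], q
        targets.setdefault((p, a), q)
    return None

def merge_for_determinization(nfa):
    """Keep merging until the automaton is deterministic."""
    while (pair := find_nondeterminism(nfa)) is not None:
        nfa = merge_states(nfa, *pair)
    return nfa

def deterministic_merge(nfa, q1, q2):
    """Deterministic merging = state merging + merging for determinization."""
    return merge_for_determinization(merge_states(nfa, q1, q2))

# Prefix tree of S+ = {aa, abab}; state numbering is arbitrary.
PTA = ({0}, {2, 5}, {(0, "a", 1), (1, "a", 2), (1, "b", 3), (3, "a", 4), (4, "b", 5)})
print(deterministic_merge(PTA, 0, 1))
```

Merging states 0 and 1 creates two a-transitions out of state 0, so the procedure also folds state 2 into state 0 and stops on a four-state DFA.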


Deterministic merging operator = state merging + merging for determinization

Very commonly used [OG92, LPP98, ...]

Demonstration of formal properties
– Merging for determinization
  • Reaches the “closest” DFA from the original NFA
– Deterministic merging
  • Reaches every DFA that can be derived from a given DFA
– ... (see tech. rep.)


III - From DFA to UFA inference, or how to introduce some amount of non-determinism in inference


Inferring non-deterministic representations: the choice of UFA

Why UFA?
– unity in the search space (like DFA)

[Figure: NFA, UFA and DFA regions of the search space between MCA({aaaaa}) and the UA]


Merging for disambiguation procedure

∀ q1,q2 ∈ Q, ∀ w1,w2 ∈ Σ*: w1 ∈ pref(q1) ∧ w1 ∈ pref(q2) ∧ w2 ∈ suff(q1) ∧ w2 ∈ suff(q2) ⇒ state-merging(q1,q2)
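A minimal sketch of this rule, under assumptions: two distinct states that share both a prefix and a suffix are merged, and this is repeated until no such pair remains. The `(initial, final, delta)` encoding, the function names and the two example automata are hypothetical. Pairs of states sharing a prefix are found by exploring pairs reachable on a common word; pairs sharing a suffix are found the same way on the reversed automaton.

```python
def merge_states(nfa, q1, q2):
    """Plain state merging: q2 is renamed to q1 everywhere."""
    initial, final, delta = nfa
    r = lambda q: q1 if q == q2 else q
    return ({r(q) for q in initial},
            {r(q) for q in final},
            {(r(p), a, r(q)) for (p, a, q) in delta})

def pairs_reached_together(start, delta):
    """Pairs (p, q) reachable from `start` states by reading one common word."""
    seen = {(p, q) for p in start for q in start}
    todo = list(seen)
    while todo:
        p, q = todo.pop()
        for (p1, a, p2) in delta:
            for (q1, b, q2) in delta:
                if p1 == p and q1 == q and a == b and (p2, q2) not in seen:
                    seen.add((p2, q2))
                    todo.append((p2, q2))
    return seen

def find_ambiguity(nfa):
    """Return two distinct states sharing a prefix and a suffix, or None."""
    initial, final, delta = nfa
    same_prefix = pairs_reached_together(initial, delta)
    reverse = {(q, a, p) for (p, a, q) in delta}
    same_suffix = pairs_reached_together(final, reverse)
    clashes = sorted((p, q) for (p, q) in same_prefix & same_suffix if p != q)
    return clashes[0] if clashes else None

def merge_for_disambiguation(nfa):
    """Keep merging until no word has two accepting paths."""
    while (pair := find_ambiguity(nfa)) is not None:
        nfa = merge_states(nfa, *pair)
    return nfa

# "ab" has two accepting paths here (through state 1 and through state 2).
AMBIGUOUS = ({0}, {3}, {(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "b", 3)})
# Nondeterministic but unambiguous automaton for L = {a+b}*a{a+b}.
NFA_L = ({0}, {2}, {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 2), (1, "b", 2)})

print(merge_for_disambiguation(AMBIGUOUS))
print(merge_for_disambiguation(NFA_L) == NFA_L)
```

The second example is left untouched: its states 0 and 1 share the prefix a but no suffix, so the prefix-only rule used for determinization would merge them while this rule does not.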


Unambiguous merging = state merging + merging for disambiguation

Finer operator than merging for determinization

Demonstration of formal properties
– Merging for disambiguation
  • Reaches the “closest” UFA from the original NFA
– Unambiguous merging
  • Reaches every UFA that can be derived from a given UFA
– ... (see tech. rep.)


IV - Comparative experiments

- Inference algorithms
- Benchmarks
- Experimental results


Algorithms

UFA
– Hill-climbing heuristic

DFA
– EDSM heuristic [LPP98]
– Hill-climbing heuristic

RFSA
– DeLeTe II [DLT01]


Counter-example use for DFA and UFA inference

Compatibility [DMV94]
– generalization of S+, stopped by S-

Functionality [AS95]
– generalization of S+ and S-, stopped by empty intersection


Benchmarks

[DLT01]
– Generation: DFA, NFA, Regular Expression
– 4 sizes of training sample
– 30 languages generated for each generation mode and sample size

+ UFA generator

Evaluation based on
– average recognition rate on test sets
– pairwise matches between algorithms on recognition rate


Results

Best algorithms w.r.t. benchmarks
– DFA bench: UFA inference with hill-climbing
– UFA bench: DFA inference with hill-climbing, UFA inference with hill-climbing
– NFA bench: RFSA inference
– Reg. Expr.: RFSA inference, DFA inference with hill-climbing


Results

Heuristic:
– Hill-climbing >> EDSM when inferring DFAs for the NFA / Regular Expression / UFA benchmarks

Counter-examples:
– Compatibility vs. Functionality


Results (matches)

Training-sample sizes: 200, 150, 100, 50 (one column per sample size; each cell holds three match counts)

Generator: UFA
ufaC-dfaC  | 13 13 4  | 11 14 5  | 8 16 6   | 15 11 4
DLTII-ufaC | 1 7 22   | 1 2 27   | 2 6 22   | 2 5 23
dfaF-dfaC  | 4 10 16  | 2 7 21   | 2 12 16  | 7 10 13
ufaF-ufaC  | 8 16 6   | 4 21 5   | 10 18 2  | 10* 16 3

Generator: DFA
ufaC-dfaC  | 17 6 7   | 13 14 3  | 7 15 8   | 13 9 8
DLTII-ufaC | 6 11 13  | 4 14 12  | 3 13 14  | 6 13 11
dfaF-dfaC  | 1 4 25   | 1 6 23   | 2 13 15  | 6 6 18
ufaF-ufaC  | 5 15 10  | 4 14 12  | 3* 15 11 | 3 12 15

Generator: NFA
ufaC-dfaC  | 16 9 5   | 11 12 7  | 12 10 8  | 12 10 8
DLTII-ufaC | 9 6 15   | 12 13 5  | 14 10 6  | 16 11 3
dfaF-dfaC  | 10 8 12  | 5 13 12  | 5 10 15  | 6 12 12
ufaF-ufaC  | 4 11 15  | 4 20 6   | 5 18 7   | 6 11 13

Generator: Reg. Expr.
ufaC-dfaC  | 5 12 13  | 11 6 13  | 10 14 6  | 6 10 14
DLTII-ufaC | 15 8 7   | 17 8 5   | 10 10 10 | 18 9 3
dfaF-dfaC  | 7 12 11  | 5 13 12  | 5 16 9   | 3 23 4
ufaF-ufaC  | 9 13 8   | 9 5 16   | 5 13 12  | 6 13 11


Conclusion

UFA inference
– Merging for disambiguation
– Heuristic

Comparison with EDSM & DeLeTe II

Perspectives
– Speeding up the algorithm
– Application
– Using properties of the DFA/UFA space


References

[AS95] Alquézar, Sanfeliu, “Incremental grammatical inference from positive and negative data using unbiased finite state automata”, SSPR ’94

[DMV94] Dupont et al., “What is the search space of the regular inference?”, ICGI ’94

[DLT00] Denis et al., “Learning regular languages using nondeterministic automata”, ICGI ’00

[SH85] Stearns, Hunt, “On the equivalence and containment problems for unambiguous regular expressions, regular grammars and finite automata”, SIAM vol. 14

[tech. rep.] Coste, Fredouille, “What is the search space for NFA, UFA and DFA inference?”, IRISA