incremental evolving grammar fragments

Incremental Evolutionary Grammar Fragments

Nurfadhlina Mohd Sharef, Trevor Martin, Yun Shen

Artificial Intelligence Group, University of Bristol, BS8 1TR UK


Outline

• Background of problem

• Literature review

• Shortcoming of evolutionary approach

• Fuzzy text pattern learning

• Grammar Approximation

• Conclusion

Digital Obesity

Sources: reports, websites, newspapers, TV news, pamphlets, SMS/MMS, comic books, brochures, meetings

Information Overload?

Text Structure

• Grammar: the word order governs the message that is to be delivered in the sentences

• Short vs. Long texts

• A full language model (such as the subject-verb-object approach) is difficult to specify, complex to process, and dependent on the problem domain.

Learning Text Fragments

• Shorter sentences

• Less structured

• Multiple patterns

• Do not follow formal grammar rules

• No need for a complete language model

• Examples: dates and times, names of products, names of people, and simple sentence forms such as questions, complaints, and news.

Grammars for Postal Address

• ‘21 London Rd Ipswich Suffolk IP1 2EZ’ number, street name, town, postCode

• And others:

• ‘29 Meredith Rd Ipswich’ number, street name, town

• ‘Belfairs Hotel 33 Graham Rd Ipswich’ word, business, number, street name, town

• The variations of the pattern will probably increase as more data samples are encountered.

Address A: 29 Meredith Rd Ipswich

Address B: Future House, 31, Mars Ave, Mars

A is an address, but is B a valid address?

A and B are valid addresses!

Existing Approaches

• Tagging-based information extraction

• Document distributions and statistical models

• Evolutionary genetic algorithms

• Semantic nets

• Fuzzy methods

These approaches aim to generate grammars that parse a fully defined dataset, and they cannot easily cope with the addition of a new training example.

Figure 1: Example of information tagging

Genetic Algorithm for Grammar Parsing

• Goal: Generate grammar that would cover past and new examples

• Approach: binary trees of non-terminal nodes

  left branch: T := {word, number, street ending,…}

  right branch: T ∪ {AND, OR, OPTIONAL}

• Population Setting: Groups of grammar files with varied number of grammar definitions

• Mating selection (Elitist): Among files within and between groups and among grammar elements in and between groups

• Genetic operators: crossover and mutation

• Fitness Function: measure the ability of the grammar to parse test strings
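The grammar representation and fitness measure listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Term` and `Op` classes, the `token_class` tagger, the small street-ending list, and the choice of fitness as the fraction of test strings fully parsed are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical lexicon, used only for this illustration.
STREET_ENDINGS = {"rd", "road", "ave", "avenue", "st", "street", "ln", "lane"}


def token_class(token: str) -> str:
    """Map a token to one of the terminal classes: number, street ending, word."""
    if token.isdigit():
        return "number"
    if token.lower() in STREET_ENDINGS:
        return "street ending"
    return "word"


@dataclass(frozen=True)
class Term:
    name: str  # a terminal class such as "word" or "number"


@dataclass(frozen=True)
class Op:
    kind: str  # "AND", "OR" or "OPTIONAL"
    left: "Node"
    right: Optional["Node"] = None  # unused for OPTIONAL


Node = Union[Term, Op]


def ends(node: Node, classes: list, start: int) -> set:
    """Return every position at which `node` can finish when it starts at `start`."""
    if isinstance(node, Term):
        if start < len(classes) and classes[start] == node.name:
            return {start + 1}
        return set()
    if node.kind == "AND":  # left followed by right
        return {e for mid in ends(node.left, classes, start)
                for e in ends(node.right, classes, mid)}
    if node.kind == "OR":  # either branch
        return ends(node.left, classes, start) | ends(node.right, classes, start)
    if node.kind == "OPTIONAL":  # skip the branch or match it
        return {start} | ends(node.left, classes, start)
    raise ValueError(f"unknown operator: {node.kind}")


def parses(grammar: Node, text: str) -> bool:
    """True if the grammar consumes every token of the text."""
    classes = [token_class(t) for t in text.split()]
    return len(classes) in ends(grammar, classes, 0)


def fitness(grammar: Node, test_strings: list) -> float:
    """Assumed fitness: fraction of test strings that the grammar parses."""
    return sum(parses(grammar, s) for s in test_strings) / len(test_strings)


# number AND word AND street ending AND OPTIONAL(word)
GRAMMAR = Op("AND", Term("number"),
             Op("AND", Term("word"),
                Op("AND", Term("street ending"), Op("OPTIONAL", Term("word")))))

if __name__ == "__main__":
    tests = ["25 acacia avenue", "29 Meredith Rd Ipswich", "acacia avenue"]
    print(fitness(GRAMMAR, tests))
```

In a genetic algorithm, crossover and mutation would swap or alter subtrees of such a grammar, and a score like `fitness` would rank the candidates.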

Figure 2: Address Grammar Fragments Binary Tree (string shown in the figure: ‘25 acacia avenue’)

[Chart: Gen 0 Groups Total Fitness. x-axis: Group (1 to 10); y-axis: grammar files score (0 to 0.35)]

[Chart: Gen 32 Groups Total Fitness. x-axis: group; y-axis: grammar file score (0 to 0.45)]

Result:

1. Fitness is low although all grammars have converged (average highest score = 0.388, highest score = 0.6).

2. Effective for grammar building, but requires complete retraining if the initial set of examples is not sufficiently general to create a good classifier.

Figure 3: parsing score of generated grammar groups in generation 0

Figure 4: parsing score of generated grammar groups in generation 32

Fuzzy Approach for Text Pattern Learning

• Describes a relation between the text and the grammar fragment

• Represents the membership degree of the grammar belonging to the text

• Grammar elements can be terminal symbols as well as fuzzy sets of terminal symbols

[Diagram nodes: ALPHANUMERIC, NUMBER, ALPHABETIC, ANYWORD, PC1, PC2, PLACENAME, BUSINESS TYPE, CITY NAME, ETC]

Figure 5: Partial Order Table for UK Address

Grammar Similarity

• Fuzzy Grammar and Fuzzy Membership

• Loosely inspired by Levenshtein Edit Distance

Source string  | W   | E   | D   | N   | E | S | D | A | Y
Target string  | T   | U   | -   | -   | E | S | D | A | Y
Edit distance* | S=1 | S=1 | D=1 | D=1 | = | = | = | = | =

Table 1: Example of string edit distance operation (*I:Insert, D:Delete, S:Substitute)
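As a reference point for Table 1, the standard Levenshtein distance can be computed with a short dynamic program. This is the textbook algorithm with unit costs, shown here only to make the string example concrete; the function name is illustrative.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character inserts, deletes and substitutions."""
    # prev[j] holds the distance between the processed prefix of source and target[:j].
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        curr = [i]  # deleting the first i source characters
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1,             # delete s
                curr[j - 1] + 1,         # insert t
                prev[j - 1] + (s != t),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("WEDNESDAY", "TUESDAY"))
```

For WEDNESDAY and TUESDAY the result is 4, which agrees with the two substitutions and two deletions shown in Table 1.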

Source grammar | Number | Word      | Word | Streetending | Placename | -
Target grammar | Number | Placename | -    | Streetending | Placename | Countyname
Edit distance* | =      | S=1       | D=1  | =            | =         | I=1

Table 2: Example of Grammar Edit Distance Operation (*I:Insert, D:Delete, S:Substitute)
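The same dynamic program can be applied to sequences of grammar elements instead of characters, keeping separate counts of insertions, deletions, and substitutions. The sketch below is an assumption-laden simplification: it uses unit costs and exact matching of element names, whereas the fuzzy method in this deck allows partial matches between elements. The function name is hypothetical.

```python
from typing import Sequence, Tuple

Cost = Tuple[int, int, int]  # (insertions, deletions, substitutions)


def grammar_edit_cost(source: Sequence[str], target: Sequence[str]) -> Cost:
    """Return the (I, D, S) counts of a cheapest edit script from source to target."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[(0, 0, 0)] * cols for _ in range(rows)]
    for j in range(1, cols):
        table[0][j] = (j, 0, 0)  # only insertions remain
    for i in range(1, rows):
        table[i][0] = (0, i, 0)  # only deletions remain
    for i in range(1, rows):
        for j in range(1, cols):
            diag, left, up = table[i - 1][j - 1], table[i][j - 1], table[i - 1][j]
            same = source[i - 1] == target[j - 1]
            candidates = [
                (diag[0], diag[1], diag[2] + (0 if same else 1)),  # match or substitute
                (left[0] + 1, left[1], left[2]),                   # insert target element
                (up[0], up[1] + 1, up[2]),                         # delete source element
            ]
            table[i][j] = min(candidates, key=sum)  # cheapest total cost
    return table[-1][-1]


if __name__ == "__main__":
    src = ["Number", "Word", "Word", "Streetending", "Placename"]
    tgt = ["Number", "Placename", "Streetending", "Placename", "Countyname"]
    print(grammar_edit_cost(src, tgt))
```

For the grammars in Table 2 this gives one insertion, one deletion, and one substitution, matching the table.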

Fuzzy Parsing

• Fuzzy Membership: measures the degree to which a grammar parses a string

• Fuzzy Overlap: CostGG(GS, GT) is an estimate of the cost of changing a string parsed by the grammar GS into one parsed by the grammar GT.

… (Eq. 1)

… (Eq. 2)

… (Eq. 3)

I: Insertion

D: Deletion

S: Substitution

Rs: Remainder in the source

Rt: Remainder in the target

Equations (I)

S, T : sequences of grammar elements

s, t : terminal symbols

TSi and TSj : (fuzzy) sets of terminal symbols

X : any single grammar element

Hs, Ht : tags

Equations (II)

S, T : sequences of grammar elements

Equations (III)

S, T : sequences of grammar elements



Incremental Evolution Strategy

Suppose we have a set of positive examples (P).

We find the grammar fragment Hmax that parses Sp with maximum membership.

• If CostGG(Sp,Hmax) ≤ (CostGG(Sp,Hi))

• Then we shall incrementally alter Hmax or create a new grammar.

CostGG(Sp,Hmax) ≥ max (CostGG(Hi,Hmax))

Grammar Approximation Operators

• Create a new rule Hnew ::= Sp, where an appropriate substring can be tagged, restricted so that a single optional element is maintained

  Hfinal=[Hi]GHi+1

• Merge duplicate grammar definitions that can be generalized and replace them with a more general fuzzy superset grammar

  Hi:={gi, gi+1,…, gn}, gi = moreGeneral(gS,gT)

• Replace contiguous optional grammars with an optional fuzzy grammar

  [Hnew]={gi, gi+1,…, gn}
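The merge operator relies on a `moreGeneral` comparison between grammar elements. A minimal sketch of one way to do this is given below, assuming the elements form a hierarchy in which each element has at most one more general parent. The `PARENT` table is a hypothetical ordering made up for illustration (the exact ordering in Figure 5 is not reproduced here), and returning the least common ancestor for incomparable elements is also an assumption.

```python
from typing import List, Optional

# Hypothetical partial order: child -> more general parent.
PARENT = {
    "placeName": "anyWord",
    "businessType": "anyWord",
    "anyWord": "alphabetic",
    "alphabetic": "alphanumeric",
    "number": "alphanumeric",
}


def ancestors(element: str) -> List[str]:
    """The element followed by its increasingly general ancestors."""
    chain = [element]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def more_general(a: str, b: str) -> Optional[str]:
    """Most specific element that covers both a and b, or None if there is none."""
    covers_a = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in covers_a:
            return candidate
    return None


def merge_alternatives(alternatives: List[str]) -> Optional[str]:
    """Collapse several alternative definitions of a rule into one general element."""
    result: Optional[str] = alternatives[0]
    for alt in alternatives[1:]:
        result = more_general(result, alt)
        if result is None:
            return None
    return result


if __name__ == "__main__":
    # Two definitions of the same rule, X:=placeName and X:=anyWord.
    print(merge_alternatives(["placeName", "anyWord"]))
```

Under this assumed ordering, two alternative definitions `placeName` and `anyWord` merge into `anyWord`, while elements with no common ancestor are left unmerged.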

Fuzzy Grammar Overlap

Rows: source grammar. Columns: target grammar. Cell values: I D S.

                | Column=0 | Column=1  | Column=2  | Column=3  | Column=4  | Column=5
                | Null     | number    | anyWord   | streetend | placename | postcode
Row=0 Null      | 0 0 0    | 1 0 0     | 2 0 0     | 3 0 0     | 4 0 0     | 5 0 0
Row=1 number    | 0 1 0    | 0 0 0 (E) | 1 0 0     | 2 0 0     | 3 0 0     | 5 0 0
Row=2 placename | 0 2 0    | 0 1 0     | 0 0 1 (D) | 1 0 1     | 2 0 1     | 4 0 1
Row=3 streetend | 0 3 0    | 0 2 0     | 0 1 1     | 0 0 1 (C) | 1 0 1 (B) | 3 0 1 (A)

Figure 6: Overlap Matrix

Approximation Result:

X:=placeName

X:=anyWord

G1:=placeName-postCode

Addr:=number-X-streetend-[G1]

Approximation Result Generalized:

G1:=placeName-postCode

Addr:=number-anyWord-streetend-[G1]

Final cost: I=0, D=0, S=0

I=0, D=0, S=1

Labelled cells: 3 0 1 (A), 1 0 1 (B), 0 0 1 (C), 0 0 1 (D), 0 0 0 (E)

I: insertion + remainder of target

D: deletion + remainder of source

S: substitution

Grammar Approximation

Address | Grammar derived from address | Approximated Grammar

107 hatfield rd ipswich ip3 9ag | number-placeName-streetend-placeName-postCode | ADDR:=number-placeName-streetend-placeName-postCode

121 sidegate ln ipswich | number-anyWord-streetend-placeName | G1:=streetend-placeName-[postCode]; G2:=anyWord; G2:=placeName; ADDR:=number-G2-G1

alnesbourne priory club nacton rd ipswich | anyWord-anyWord-anyWord-anyWord-streetend-placeName | G1:=postCode; G2:=streetend-placeName-[G1]; G3:=anyWord-anyWord; G4:=anyWord; G4:=number; ADDR:=G4-anyWord-[G3]-G2

Figure 7: Grammar Approximation Example
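The "grammar derived from address" step in the example above can be sketched as a simple token tagger. The lexicons and the two postcode patterns below are hypothetical and deliberately tiny; they are chosen only so that the three example addresses produce the same derived grammars as in the example, and a real system would need much larger lists.

```python
import re

# Hypothetical lexicons for illustration only.
STREET_ENDINGS = {"rd", "ln", "ave", "st"}
PLACE_NAMES = {"hatfield", "ipswich"}

# Assumed shapes of the two halves of a UK postcode, e.g. "ip3" and "9ag".
PC1 = re.compile(r"^[a-z]{1,2}\d[a-z\d]?$")
PC2 = re.compile(r"^\d[a-z]{2}$")


def derive_grammar(address: str) -> str:
    """Tag each token with a grammar element and join the tags with '-'."""
    tokens = address.lower().split()
    tags = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.isdigit():
            tags.append("number")
        elif tok in STREET_ENDINGS:
            tags.append("streetend")
        elif tok in PLACE_NAMES:
            tags.append("placeName")
        elif PC1.match(tok) and i + 1 < len(tokens) and PC2.match(tokens[i + 1]):
            tags.append("postCode")  # two tokens form one postcode
            i += 1
        else:
            tags.append("anyWord")
        i += 1
    return "-".join(tags)


if __name__ == "__main__":
    for addr in ("107 hatfield rd ipswich ip3 9ag",
                 "121 sidegate ln ipswich",
                 "alnesbourne priory club nacton rd ipswich"):
        print(addr, "->", derive_grammar(addr))
```

The derived tag sequence is the input to the approximation step, which compares it with the existing grammar and either alters that grammar or adds a new rule.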

Conclusion and Future Work

• The fuzzy method outperforms standard genetic techniques for creating fuzzy grammars

• Highlight: ability to learn new text patterns without sacrificing past data

• Approximation operators: depart from the common genetic operators (crossover and mutation)

• Future Work: refine the approximation method and test it on other loosely structured data
