part-of-speech tagging - uppsala universitynivre/gslt/tagging_nlp1.pdfo part-of-speech tagging is...

8
Part-of-Speech Tagging Torbjörn Lager Department of Linguistics Göteborg University NLP1 - Torbjörn Lager 2 Part-of-Speech Tagging: Definition o From Jurafsky & Martin 2000: o Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in a corpus. o The input to a tagging algorithm is a string of words and a specified tagset. The output is a single best tag for each word. o A bit too narrow for my taste... NLP1 - Torbjörn Lager 3 Part-of-Speech Tagging: Example 1 o Input He can can a can o Output He/pron can/aux can/vb a/det can/n o Another possible output He/{pron} can/{aux,n} can/{vb} a/{det} can/{n,vb} NLP1 - Torbjörn Lager 4 Tag Sets o The Penn Treebank tag set (see HTML page) NLP1 - Torbjörn Lager 5 Why Part-of-Speech Tagging? o A first step towards parsing o A first step towards word sense disambiguation o Provide clues to pronounciation o "object" -> OBject or obJECT o (but note: BAnan vs baNAN) o Research in Corpus Linguistics NLP1 - Torbjörn Lager 6 Part-of-Speech Tagging: Example 2 o I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire .

Upload: others

Post on 03-Jun-2020

31 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

1

Part-of-Speech Tagging

Torbjörn LagerDepartment of LinguisticsGöteborg University

NLP1 - Torbjörn Lager 2

Part-of-Speech Tagging: Definition

o From Jurafsky & Martin 2000:

o Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in a corpus.

o The input to a tagging algorithm is a string of words and a specified tagset. The output is a single best tag for each word.

o A bit too narrow for my taste...

NLP1 - Torbjörn Lager 3

Part-of-Speech Tagging: Example 1

o Input

He can can a can

o Output

He/pron can/aux can/vb a/det can/n

o Another possible output

He/{pron} can/{aux,n} can/{vb} a/{det} can/{n,vb}

NLP1 - Torbjörn Lager 4

Tag Sets

o The Penn Treebank tag set(see HTML page)

NLP1 - Torbjörn Lager 5

Why Part-of-Speech Tagging?

o A first step towards parsing

o A first step towards word sense disambiguation

o Provide clues to pronounciationo "object" -> OBject or obJECTo (but note: BAnan vs baNAN)

o Research in Corpus Linguistics

NLP1 - Torbjörn Lager 6

Part-of-Speech Tagging: Example 2

o I can light a fire and you can open a can of beans. Now the canis open and we can eat in the light of the fire.

Page 2: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

2

NLP1 - Torbjörn Lager 7

Relevant Information

o Lexical information

o Local contextual information

NLP1 - Torbjörn Lager 8

Part-of-Speech Tagging: Example 2

o I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.

o I/PRP can/MD light/VB a/DT fire/NN and/CC you/PRP can/MD open/VB a/DT can/NN of/IN beans/NNS ./. Now/RB the/DT can/NN is/VBZ open/JJ and/CC we/PRP can/MD eat/VB in/IN the/DT light/NN of/IN the/DT fire/NN ./.

NLP1 - Torbjörn Lager 9

Part-of-Speech Tagging

Processor

Knowledge

Text POS tagged text

Needed:- Some strategy for representing the knowledge- Some method for acquiring the knowledge- Some method of applying the knowledge

NLP1 - Torbjörn Lager 10

Approaches to PoS Tagging

o The bold approach: 'Use all the information you have and guess"'

o The whimsical approach: 'Guess first, then change your mind if nessessary!'

o The cautious approach: 'Don't guess, just eliminate the impossible!'

NLP1 - Torbjörn Lager 11

Some POS-Tagging Issues

o Accuracyo Speedo Space requirementso Learningo Intelligibility

Processor

Knowledge

Text POS tagged text

NLP1 - Torbjörn Lager 12

Cutting the Cake

o Tagging methodso Rule basedo Statisticalo Mixedo Other methods

o Learning methodso Supervised learningo Unsupervised learning

Page 3: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

3

NLP1 - Torbjörn Lager 13

HMM Tagging

o The bold approach: 'Use all the information you have and guess"'

o Statistical methodo Supervised (or unsupervised) learning

NLP1 - Torbjörn Lager 14

NLP1 - Torbjörn Lager 15

The Naive Approach and its Problem

o Traverse all the paths compatible with the input and then pick the most probable one

o Problem:o There are 27 paths in the HMM for S="he can can

a can"o Doubling the length of S (with a conjunction in

between) -> 729 pathso Doubling S again -> 531431 paths!o Exponential time complexity!

NLP1 - Torbjörn Lager 16

Solution

o Use the Viterbi algorithm

o Tagging can be done in time proportional to the length of input.

o How and Why does the Viterbi algorithm work? We save this for later...

NLP1 - Torbjörn Lager 17

Training an HMM

o Estimate probabilities from relative frequencies. o Output probabilities P(w|t): the number of occurrences of w

tagged as t, divided by the number of occurrences of t. o Transitional probabilities P(t1|t2): the number of occurrences of

t1 followed by t 2, divided by the number of occurrences of t 2.

o Use smoothing to overcome the sparse data problem(unknown words, uncommon words, uncommon contexts)

NLP1 - Torbjörn Lager 18

Transformation-Based Learning

o The whimsical approach: 'Guess first, then change your mind if necessary!'

o Rule based tagging, statistical learningo Supervised learningo Method due to Eric Brill (1995)

Page 4: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

4

Transformation-Based Tagging: Small Part-of-Speech Tagging Example

rulespos:NN>VB <- pos:TO@[-1] opos:VB>NN <- pos:DT@[-1] o....

inputShe decided to table her data

lexicondata:NNdecided:VBher:PNshe:PN Ntable:NN VB

to:TO

NP VB TO NN PN NNNP VB TO NN PN NNNP VB TO VB PN NN

NLP1 - Torbjörn Lager 20

A Longer Rule Sequence

tag:'NN'>'VB' <- tag:'TO'@[-1] otag:'VBP'>'VB' <- tag:'MD'@[-1,-2,-3] otag:'NN'>'VB' <- tag:'MD'@[-1,-2] otag:'VB'>'NN' <- tag:'DT'@[-1,-2] otag:'VBD'>'VBN' <- tag:'VBZ'@[-1,-2,-3] otag:'VBN'>'VBD' <- tag:'PRP'@[-1] otag:'POS'>'VBZ' <- tag:'PRP'@[-1] otag:'VB'>'VBP' <- tag:'NNS'@[-1] otag:'IN'>'RB' <- wd:as@[0] & wd:as@[2] otag:'IN'>'WDT' <- tag:'VB'@[1,2] otag:'VB'>'VBP' <- tag:'PRP'@[-1] otag:'IN'>'WDT' <- tag:'VBZ'@[1] o...

Transformation-Based Painting

bluered

K. Samuel 1998

Transformation-Based Painting

K. Samuel 1998

Transformation-Based Painting

red

K. Samuel 1998

Transformation-Based Painting

red

K. Samuel 1998

Page 5: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

5

Transformation-Based Painting

K. Samuel 1998

Transformation-Based Painting

bluered

K. Samuel 1998

Transformation-Based Painting

bluered

K. Samuel 1998

Learning Transformation Rules

Training Corpus

Baseline System

Current Corpus

Derive and ScoreCandidate Rules

Select Best Rule

Apply Rule

Rule Templates

ReferenceCorpus

Learned RuleSequence

Stop when score of best rule falls below threshold.

Learning Transformation Rules

Training Corpus

Baseline System

Current Corpus

Derive and ScoreCandidate Rules

Select Best Rule

Apply Rule

Rule Templates

ReferenceCorpus

Learned RuleSequence

Stop when score of best rule falls below threshold.

Various Corpora

• Training corpusw0 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10

• Current corpus (CC 1)

dt vb nn dt vb kn dt vb ab dt vb

• Reference corpus

dt nn vb dt nn kn dt jj kn dt nn

Page 6: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

6

Learning Transformation Rules

Training Corpus

Baseline System

Current Corpus

Derive and ScoreCandidate Rules

Select Best Rule

Apply Rule

Rule Templates

ReferenceCorpus

Learned RuleSequence

Rule Templates

• In TBL, only rules that are instances of templates can be learned.

• For example, the rules

tag:'VB'>'NN' <- tag:'DT'@[-1].tag:’NN’>’VB' <- tag:'DT'@[-1].

• are instances of the template

tag:A>B <- tag:C@[-1].

Learning Transformation Rules

Training Corpus

Baseline System

Current Corpus

Derive and ScoreCandidate Rules

Select Best Rule

Apply Rule

Rule Templates

ReferenceCorpus

Learned RuleSequence

• The score of a rule is the number of its positive matches minus the number of its negative instances:

score(R) = |pos(R)| - |neg(R)|

• The accuracy of a rule is its number of positive matches divided by the total number of matches of the rule:

accuracy(R) =

• The score threshold and the accuracy threshold are the lowest score and the lowest accuracy, respectively, that the highest scoring rule must have in order to be considered.

• In ordinary TBL, we work with an accuracy threshold < 0.5.

Score, Accuracy and Thresholds

|pos(R)| |pos(R)| + |neg(R)|

Derive and Score Candidate Rule 1

CC i+1

CC i

nndtabnndtknnndtnnnndt

vbdtabvbdtknvbdtnnvbdt

Ref. C nndtknjjdtknnndtvbnndt

pos(R1) = 3neg(R1) = 1score(R1) = pos(R1) - neg(R1) = 3-1 = 2

R1 = tag:vb>nn <- tag:dt@[-1]

Template = tag:_>_ <- tag:_@[-1]

Derive and Score Candidate Rule 2

CC i+1

CC i

vbdtabvbdtknvbdtvbvbdt

vbdtabvbdtknvbdtnnvbdt

Ref. C nndtknnndtknnndtvbnndt

pos(R2) = 1neg(R2) = 0score(R2) = pos(R2) - neg(R2) = 1-0 = 1

R2 = tag:nn>vb <- tag:vb@[-1]

Template = tag:_>_ <- tag:_@[-1]

Page 7: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

7

Learning Transformation Rules

Training Corpus

Baseline System

Current Corpus

Derive and ScoreCandidate Rules

Select Best Rule

Apply Rule

Rule Templates

ReferenceCorpus

Learned RuleSequence

Stop when score of best rule falls below threshold.

• Current ranking of rule candidates

R1 = tag:vb>nn <- tag:dt@[-1] Score = 2R2 = tag:nn>vb <- tag:vb@[-1] Score = 1

• If score threshold =< 2 then select R1, else if score threshold > 2, terminate.

Select Best Rule

NLP1 - Torbjörn Lager 39

Constraint-Grammar Tagging

o Due to Fred Karlsson et al.o The cautious approach: 'Don't guess, just

eliminate the impossible!'o Rule basedo No learning ('learning by injection')

NLP1 - Torbjörn Lager 40

Constraint Grammar Example

o I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.

o I/{PRP} can/{MD,NN} light/{JJ,NN,VB} a/{DT} fire/{NN} and/{CC} you/{PRP} can/{MD,NN} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ,VB} and/{CC} we/{PRP} can/{MD,NN} eat/{VB} in/{IN} the/{DT} light/{JJ,NN,VB} of/{IN} the/{DT} fire/{NN} ./{.}

NLP1 - Torbjörn Lager 41

Constraint Grammar Example

tag:red 'RP' <- wd:in@[0] & tag:'NN'@[-1] otag:red 'RB' <- wd:in@[0] & tag:'NN'@[-1] otag:red 'VB' <- tag:'DT'@[-1] otag:red 'NP' <- wd:'The'@[0] otag:red 'VBN' <- wd:said@[0] otag:red 'VBP' <- tag:'TO'@[-1,-2] otag:red 'VBP' <- tag:'MD'@[-1,-2,-3] otag:red 'VBZ' <- wd:'\'s'@[0] & tag:'NN'@[1] otag:red 'RP' <- wd:in@[0] & tag:'NNS'@[-1] otag:red 'RB' <- wd:in@[0] & tag:'NNS'@[-1] o...

NLP1 - Torbjörn Lager 42

Constraint Grammar Example

o I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.

o I/{PP} can/{MD} light/{JJ,VB} a/{DT} fire/{NN} and/{CC} you/{PP} can/{MD} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ} and/{CC} we/{PP} can/{MD} eat/{VB} in/{IN} the/{DT} light/{NN} of/{IN} the/{DT} fire/{NN} ./{.}

Page 8: Part-of-Speech Tagging - Uppsala Universitynivre/gslt/tagging_NLP1.pdfo Part-of-speech tagging is the process of assigning a part -of-speech or other lexical class marker to each word

8

NLP1 - Torbjörn Lager 43

Evaluation

o Two reasons for evaluating:o Compare with other peoples methods/systemso Compare with earlier versions of your own system

o Accuracy (recall and precision)

o Baseline

o Ceiling

o N-fold cross-validation methodology => Good use of the data + More statistically reliable results.

NLP1 - Torbjörn Lager 44

Assessing the Taggers

o Accuracy

o Speed

o Space requirements

o Learning

o Intelligibility

NLP1 - Torbjörn Lager 45

Demo Taggers

o Transformation-Based Tagger:o www.ling.gu.se/~lager/Home/brilltagger_ui.html

o Constraint-Grammar Taggero www.ling.gu.se/~lager/Home/cgtagger_ui.html

o Featuring tracing facilities!o Try it yourself!