POS tagging Intro to NLP - ETHZ - 17/03/2014


Page 1: POS tagging - Systems Group · - Named entity recognition, shallow parsing - Word segmentation - Optical Character Recognition • Speech recognition • Computer Vision – image


Summary

● Parts of speech
● Tagsets
● Part of speech tagging
● HMM Tagging:
  ○ Most likely tag sequence
  ○ Probability of an observation
  ○ Parameter estimation
  ○ Evaluation

POS ambiguity

"Squad helps dog bite victim"

● bite -> verb?
● bite -> noun?

Parts of Speech (PoS)

Traditional parts of speech:
● Noun, verb, adjective, preposition, adverb, article, interjection, pronoun, conjunction, etc.
● Called: parts-of-speech, lexical categories, word classes, morphological classes, lexical tags...

Examples

N (noun): car, squad, dog, bite, victim
V (verb): help, bite
ADJ (adjective): purple, tall
ADV (adverb): unfortunately, slowly
P (preposition): of, by, to
PRO (pronoun): I, me, mine
DET (determiner): the, a, that, those

Open and closed classes

1. Closed class: small stable set
   a. Auxiliaries: may, can, will, been, ...
   b. Prepositions: of, in, by, ...
   c. Pronouns: I, you, she, mine, his, them, ...
   d. Usually function words, short, grammar role
2. Open class:
   a. new ones are created all the time ("to google/tweet", "e-quaintance", "captcha", "cloud computing", "netbook", "webinar", "widget")
   b. English has 4: Nouns, Verbs, Adjectives, Adverbs
   c. Many languages have these 4, not all!

Open class words

1. Nouns:
   a. Proper nouns: Zurich, IBM, Albert Einstein, The Godfather, ... Capitalized in many languages.
   b. Common nouns: the rest, also capitalized in German, mass/count nouns (goat/goats, snow/*snows)
2. Verbs:
   a. Morphological affixes in English: eat/eats/eaten
3. Adverbs: tend to modify things:
   a. John walked home extremely slowly yesterday
   b. Directional/locative adverbs (here, home, downhill)
   c. Degree adverbs (extremely, very, somewhat)
   d. Manner adverbs (slowly, slinkily, delicately)
4. Adjectives: qualify nouns and noun phrases

Closed class words

1. prepositions: on, under, over, ...
2. particles: up, down, on, off, ...
3. determiners: a, an, the, ...
4. pronouns: she, who, I, ...
5. conjunctions: and, but, or, ...
6. auxiliary verbs: can, may, should, ...
7. numerals: one, two, three, third, ...

Prepositions with corpus frequencies

Conjunctions

Pronouns

Auxiliaries

Applications

● A useful pre-processing step in many tasks
● Syntactic parsing: important source of information for syntactic analysis
● Machine translation
● Information retrieval: stemming, filtering
● Named entity recognition
● Summarization?

Applications

Speech synthesis, for correct pronunciation of "ambiguous" words:

○ lead /lid/ (guide) vs. /led/ (chemical)
○ inSULT vs. INsult
○ obJECT vs. OBject
○ overFLOW vs. OVERflow
○ conTENT vs. CONtent


Summarization

● Idea: Filter out sentences starting with certain PoS tags

● Use PoS statistics from gold standard titles (might need cross-validation)


● Title1: "Apple introduced Siri, an intelligent personal assistant to which you can ask questions"

● Title2: "Especially now that a popular new feature from Apple is bound to make other phones users envious: voice control with Siri"

● Title3: "But Siri, Apple's personal assistant application on the iPhone 4s, doesn't disappoint"
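A minimal Python sketch of the filtering idea, applied to the three titles above. The small `FIRST_WORD_TAG` lookup is a hypothetical stand-in for a real tagger, and filtering on adverbs (RB) and conjunctions (CC) is an assumption made for illustration; in practice the set of tags would come from PoS statistics on gold standard titles.

```python
# Hypothetical stand-in for a tagger: PTB tag of each title's first word.
FIRST_WORD_TAG = {"Apple": "NNP", "Especially": "RB", "But": "CC"}

# Assumed set of sentence-initial tags to filter out (illustrative only).
FILTERED_TAGS = {"RB", "CC"}

TITLES = [
    "Apple introduced Siri, an intelligent personal assistant to which you can ask questions",
    "Especially now that a popular new feature from Apple is bound to make other phones users envious: voice control with Siri",
    "But Siri, Apple's personal assistant application on the iPhone 4s, doesn't disappoint",
]


def keep_title(title):
    """Keep a title unless its first word carries a filtered tag."""
    first_word = title.split()[0]
    return FIRST_WORD_TAG.get(first_word) not in FILTERED_TAGS


kept = [t for t in TITLES if keep_title(t)]
print(kept)
```

With these assumed tags, only Title1 survives the filter.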


PoS tagging

● The process of assigning a part-of-speech tag (label) to each word in a text.
● Pre-processing: tokenization

Word     Tag
Squad    N
helps    V
dog      N
bite     N
victim   N

Choosing a tagset

1. There are many parts of speech, potential distinctions we can draw
2. For POS tagging, we need to choose a standard set of tags to work with
3. Coarse tagsets: N, V, Adj, Adv.
   a. A universal PoS tagset? http://en.wikipedia.org/wiki/Part-of-speech_tagging
4. A more commonly used set is finer grained: the “Penn TreeBank tagset”, 45 tags


PTB tagset

Examples*

1. I/PRP need/VBP a/DT flight/NN from/IN Atlanta/NNP
2. Does/VBZ this/DT flight/NN serve/VB dinner/NN
3. I/PRP have/VBP a/DT friend/NN living/VBG in/IN Denver/NNP
4. Can/MD you/PRP list/VB the/DT nonstop/JJ afternoon/NN flights/NNS

The slides reach this version by correcting one tag per step: Atlanta/NN to NNP, dinner/NNS to NN, have/VB to VBP, Can/VBP to MD.
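The word/TAG notation used in these examples is easy to read programmatically. The sketch below is one way to do it in Python; the function name `parse_tagged` is made up for illustration.

```python
def parse_tagged(sentence):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the last '/', so a word containing '/' survives.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


tagged = parse_tagged("Can/MD you/PRP list/VB the/DT nonstop/JJ afternoon/NN flights/NNS")
tags = [tag for _, tag in tagged]
print(tagged)
```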

Complexities

● Book/VB that/DT flight/NN ./.
● There/EX are/VBP 70/CD children/NNS there/RB ./.
● Mrs./NNP Shaefer/NNP never/RB got/VBD around/RP to/TO joining/VBG ./.
● All/DT we/PRP gotta/VBN do/VB is/VBZ go/VB around/IN the/DT corner/NN ./.
● Unresolvable ambiguity:
  ○ The Duchess was entertaining last night .

Words      PoS WSJ   PoS Universal
The        DT        DET
oboist     NN        NOUN
Heinz      NNP       NOUN
Holliger   NNP       NOUN
has        VBZ       VERB
taken      VBN       VERB
a          DT        DET
hard       JJ        ADJ
line       NN        NOUN
about      IN        ADP
the        DT        DET
problems   NNS       NOUN
.          .         .
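The coarse column can be derived from the fine-grained one with a lookup table. The Python sketch below covers only the tags that appear in the table above; a full mapping would need an entry for every Penn TreeBank tag.

```python
# PTB-to-universal mapping, restricted to the tags in the table above.
PTB_TO_UNIVERSAL = {
    "DT": "DET", "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VBZ": "VERB", "VBN": "VERB", "JJ": "ADJ", "IN": "ADP", ".": ".",
}

# WSJ tags for "The oboist Heinz Holliger has taken a hard line about the problems ."
wsj_tags = ["DT", "NN", "NNP", "NNP", "VBZ", "VBN", "DT", "JJ", "NN", "IN", "DT", "NNS", "."]
universal = [PTB_TO_UNIVERSAL[tag] for tag in wsj_tags]
print(universal)
```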

POS Tagging

● Words often have more than one POS: back
  ○ The back door = JJ
  ○ On my back = NN
  ○ Win the voters back = RB
  ○ Promised to back the bill = VB

The POS tagging problem is to determine the POS tag for a particular instance of a word.


Word type tag ambiguity

Methods

1. Rule-based
   a. Start with a dictionary
   b. Assign all possible tags
   c. Write rules by hand to remove tags in context
2. Stochastic
   a. Supervised/Unsupervised
   b. Generative/discriminative
   c. Independent/structured output
   d. HMMs

Rule-based tagging

1. Start with a dictionary:

she        PRP
promised   VBN, VBD
to         TO
back       VB, JJ, RB, NN
the        DT
bill       NN, VB

2. Assign all possible tags:

                     NN
                     RB
      VBN            JJ          NN
PRP   VBD        TO  VB     DT   VB
she   promised   to  back   the  bill

3. Introduce rules to reduce ambiguity:

Rule: "<start> PRP {VBN, VBD}" -> "<start> PRP VBD"
Rule: "TO {VB, NN, JJ, RB} DT" -> "TO VB DT"
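The three steps can be sketched in Python as follows. This is a toy illustration, not a real rule engine: the two rules are hard-coded as `if` tests and the function name `tag_with_rules` is made up.

```python
# Step 1: the dictionary from the slide, mapping each word to its possible tags.
LEXICON = {
    "she": {"PRP"}, "promised": {"VBN", "VBD"}, "to": {"TO"},
    "back": {"VB", "JJ", "RB", "NN"}, "the": {"DT"}, "bill": {"NN", "VB"},
}


def tag_with_rules(words):
    # Step 2: assign every possible tag to every word.
    cands = [set(LEXICON[w]) for w in words]
    # Step 3, rule 1: "<start> PRP {VBN, VBD}" -> "<start> PRP VBD"
    if len(cands) > 1 and cands[0] == {"PRP"} and cands[1] == {"VBN", "VBD"}:
        cands[1] = {"VBD"}
    # Step 3, rule 2: "TO {VB, NN, JJ, RB} DT" -> "TO VB DT"
    for i in range(1, len(cands) - 1):
        if (cands[i - 1] == {"TO"} and cands[i + 1] == {"DT"}
                and cands[i] == {"VB", "NN", "JJ", "RB"}):
            cands[i] = {"VB"}
    return cands


result = tag_with_rules("she promised to back the bill".split())
print(result)
```

After both rules, "promised" and "back" are resolved, while "bill" still carries {NN, VB}: a further rule would be needed to disambiguate it.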

Statistical models for POS tagging

1. A classic: one of the first successful applications of statistical methods in NLP
2. Extensively studied with all possible approaches (sequence models benchmark)
3. Simple to get started on: data, eval, literature
4. An introduction to more complex segmentation and labelling tasks: NER, shallow parsing, global optimization
5. An introduction to HMMs, used in many variants in POS tagging and related tasks.

Supervision and resources

1. Supervised case: data with words manually annotated with POS tags
2. Partially supervised: annotated data + un-annotated data
3. Unsupervised: only raw text available
4. Resources: dictionaries listing each word's possible tags
5. Start with the supervised task

HMMs

HMM = (Q, O, A, B)
1. States: $Q = q_1 .. q_N$ [the part of speech tags]
2. Observation symbols: $O = o_1 .. o_V$ [words]
3. Transitions:
   a. $A = \{a_{ij}\}$; $a_{ij} = P(t_s = q_j \mid t_{s-1} = q_i)$
   b. $t_s \mid t_{s-1} = q_i \sim \mathrm{Multi}(a_i)$
   c. Special vector of initial/final probabilities
4. Emissions:
   a. $B = \{b_{ik}\}$; $b_{ik} = P(w_s = o_k \mid t_s = q_i)$
   b. $w_s \mid t_s = q_i \sim \mathrm{Multi}(b_i)$

Markov Chain Interpretation

Tagging process as a (hidden) Markov process
● Independence assumptions
  1. Limited horizon
  2. Time-invariant
  3. Observation depends only on the state
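Written out for a first-order HMM, the three assumptions take the following standard form. This is a reconstruction using $t_s$ for the tag and $w_s$ for the word at position $s$, with $a_{ij}$ as the transition probability; it is not a copy of the slide's own formulas, which the transcript does not preserve.

```latex
\begin{align*}
\text{Limited horizon:}\quad
  & P(t_s \mid t_1, \dots, t_{s-1}) = P(t_s \mid t_{s-1}) \\
\text{Time-invariant:}\quad
  & P(t_s = q_j \mid t_{s-1} = q_i) = a_{ij} \quad \text{for every position } s \\
\text{Observation:}\quad
  & P(w_s \mid t_1, \dots, t_N, w_1, \dots, w_{s-1}) = P(w_s \mid t_s)
\end{align*}
```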

Complete data likelihood

The joint probability of a sequence of words and tags, given a model:

Generative process:
1. generate a tag sequence
2. emit the words for each tag
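Under the independence assumptions of a first-order HMM, the joint probability factorizes into one transition term and one emission term per word. The formula below is the standard form, written here as a reconstruction with $t_0$ taken to be the START state; a model with an explicit END state, such as the worked example later in the slides, multiplies in one extra transition $P(\mathrm{END} \mid t_N)$.

```latex
P(w_{1:N}, t_{1:N} \mid \theta)
  = \prod_{i=1}^{N} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i),
  \qquad t_0 = \mathrm{START}
```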

Inference in HMMs

Three fundamental problems:
1. Given an observation (e.g., a sentence), find the most likely sequence of states (e.g., POS tags)
2. Given an observation, compute its probability
3. Given a dataset of observations (sequences), estimate the model's parameters: theta = (A,B)

HMMs and FSA

also Bayes nets, directed graphical models, etc.

Other applications of HMMs

• NLP
  - Named entity recognition, shallow parsing
  - Word segmentation
  - Optical Character Recognition
• Speech recognition
• Computer Vision – image segmentation
• Biology - Protein structure prediction
• Economics, Climatology, Robotics...

POS as sequence classification

● Observation: a sequence of N words $w_{1:N}$
● Response: a sequence of N tags $t_{1:N}$
● Task: find the predicted $t'_{1:N}$, the best possible tagging for the sequence.
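In the usual probabilistic reading, "best possible tagging" means the tag sequence with the highest probability given the words. The objective below is that standard formulation, given as a reconstruction because the slide's own formula is not in the transcript.

```latex
t'_{1:N} = \operatorname*{arg\,max}_{t_{1:N}} \; P(t_{1:N} \mid w_{1:N})
```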


Bayes rule reformulation
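A sketch of the standard derivation this slide title refers to, reconstructed rather than copied from the slide: Bayes' rule rewrites the posterior, the denominator $P(w_{1:N})$ is constant over tag sequences and can be dropped, and the HMM independence assumptions turn the remaining terms into a product over positions (with $t_0 = \mathrm{START}$).

```latex
\begin{align*}
t'_{1:N}
  &= \operatorname*{arg\,max}_{t_{1:N}} \frac{P(w_{1:N} \mid t_{1:N}) \, P(t_{1:N})}{P(w_{1:N})} \\
  &= \operatorname*{arg\,max}_{t_{1:N}} P(w_{1:N} \mid t_{1:N}) \, P(t_{1:N}) \\
  &\approx \operatorname*{arg\,max}_{t_{1:N}} \prod_{i=1}^{N} P(w_i \mid t_i) \, P(t_i \mid t_{i-1})
\end{align*}
```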

HMM POS tagging

● How can we find $t'_{1:N}$?
● Enumeration of all possible sequences?
  ○ $O(|\mathrm{Tagset}|^N)$ !
● Dynamic programming: Viterbi algorithm


Viterbi algorithm
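The recursion behind the algorithm, in its standard form and reconstructed rather than taken from the slide. Here $d_s(t)$ is the probability of the most likely path ending in state $s$ at time $t$, $r$ ranges over the possible previous states, and $\psi_s(t)$ is a back-pointer symbol introduced for this sketch.

```latex
\begin{align*}
\text{Init:}\quad & d_s(1) = P(s \mid \mathrm{START}) \, P(w_1 \mid s) \\
\text{Recursion:}\quad & d_s(t) = \max_{r} \; d_r(t-1) \, P(s \mid r) \, P(w_t \mid s) \\
\text{Back-pointer:}\quad & \psi_s(t) = \operatorname*{arg\,max}_{r} \; d_r(t-1) \, P(s \mid r) \\
\text{Termination:}\quad & d_{\mathrm{END}} = \max_{s} \; d_s(N) \, P(\mathrm{END} \mid s)
\end{align*}
```

Each cell looks only at the previous column, so the cost is $O(|\mathrm{Tagset}|^2 \cdot N)$ instead of exponential in $N$.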

Example: model

A+ =

         N     V     END
V        0.8   0.2   0.3
N        0.3   0.7   0.7
START    0.6   0.4

B =

     board   backs   plan
V    0.3     0.3     0.4
N    0.4     0.2     0.4

Example: observation

● Sentence: "Board backs plan"
● Find the most likely tag sequence
● $d_s(t)$ = probability of most likely path ending at state s at time t

Viterbi algorithm: example

Viterbi: forward pass

        board   backs    plan     END
Time    1       2        3
V       d=.12   d=.050   d=.005
N       d=.24   d=.019   d=.016
END                               d=.011

Viterbi: backtrack

Viterbi: output

board/N backs/V plan/N   (d=.011)
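The trellis values can be reproduced with a short Python sketch. It assumes that the rows of A+ are the previous state and the columns the next state, and that the END column holds the probability of stopping after each state; both readings are inferred from the d values on the slides rather than stated there.

```python
# Example model from the slides, read as A[previous][next].
START = {"N": 0.6, "V": 0.4}
A = {"V": {"N": 0.8, "V": 0.2}, "N": {"N": 0.3, "V": 0.7}}
END = {"V": 0.3, "N": 0.7}
B = {"V": {"board": 0.3, "backs": 0.3, "plan": 0.4},
     "N": {"board": 0.4, "backs": 0.2, "plan": 0.4}}
STATES = ["N", "V"]


def viterbi(words):
    # d[i][s]: probability of the most likely path ending in state s at word i.
    d = [{s: START[s] * B[s][words[0]] for s in STATES}]
    back = [{}]
    for w in words[1:]:
        prev = d[-1]
        scores, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] * A[r][s])
            scores[s] = prev[best] * A[best][s] * B[s][w]
            pointers[s] = best
        d.append(scores)
        back.append(pointers)
    # Transition into END, then follow the back-pointers right to left.
    last = max(STATES, key=lambda s: d[-1][s] * END[s])
    prob = d[-1][last] * END[last]
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return list(reversed(path)), prob, d


path, prob, d = viterbi(["board", "backs", "plan"])
print(path, round(prob, 3))  # ['N', 'V', 'N'] 0.011
```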

Observation probability

● Given HMM theta = (A,B) and observation sequence $w_{1:N}$, compute $P(w_{1:N} \mid \theta)$
● Applications: language modeling
● Complete data likelihood:
● Sum over all possible tag sequences:
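The two quantities named in the last two bullets, in their standard form and reconstructed here rather than copied from the slide (with $t_0 = \mathrm{START}$):

```latex
\begin{align*}
P(w_{1:N}, t_{1:N} \mid \theta)
  &= \prod_{i=1}^{N} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i) \\
P(w_{1:N} \mid \theta)
  &= \sum_{t_{1:N}} P(w_{1:N}, t_{1:N} \mid \theta)
\end{align*}
```

The sum ranges over $|\mathrm{Tagset}|^N$ tag sequences, which is why it is computed with dynamic programming rather than by enumeration.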

Forward algorithm

● Dynamic programming: each state of the trellis stores a value $\alpha_i(s)$ = probability of being in state s having observed $w_{1:i}$
● Sum over all paths up to i-1 leading to s
● Init:
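The forward recursion in its standard form, reconstructed rather than taken from the slide. It has the same shape as the Viterbi recursion with the max replaced by a sum; $r$ ranges over the possible previous states.

```latex
\begin{align*}
\text{Init:}\quad & \alpha_1(s) = P(s \mid \mathrm{START}) \, P(w_1 \mid s) \\
\text{Recursion:}\quad & \alpha_i(s) = \Big[ \sum_{r} \alpha_{i-1}(r) \, P(s \mid r) \Big] \, P(w_i \mid s) \\
\text{Termination:}\quad & P(w_{1:N} \mid \theta) = \sum_{s} \alpha_N(s) \, P(\mathrm{END} \mid s)
\end{align*}
```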

Forward computation

        board   backs    plan     END
Time    1       2        3
V       a=.12   a=.058   a=.014
N       a=.24   a=.034   a=.022
END                               a=.020
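A Python sketch of the forward computation on the example model. As with any reading of the A+ table, it assumes rows are the previous state, columns the next state, and the END column the probability of stopping after each state; this is inferred from the trellis values, not stated on the slides.

```python
# Example model from the slides, read as A[previous][next].
START = {"N": 0.6, "V": 0.4}
A = {"V": {"N": 0.8, "V": 0.2}, "N": {"N": 0.3, "V": 0.7}}
END = {"V": 0.3, "N": 0.7}
B = {"V": {"board": 0.3, "backs": 0.3, "plan": 0.4},
     "N": {"board": 0.4, "backs": 0.2, "plan": 0.4}}
STATES = ["N", "V"]


def forward(words):
    """Return P(words) and the list of alpha columns, one per word."""
    alpha = {s: START[s] * B[s][words[0]] for s in STATES}
    columns = [alpha]
    for w in words[1:]:
        # Sum over all previous states, then emit the current word.
        alpha = {s: sum(alpha[r] * A[r][s] for r in STATES) * B[s][w]
                 for s in STATES}
        columns.append(alpha)
    total = sum(alpha[s] * END[s] for s in STATES)
    return total, columns


total, columns = forward(["board", "backs", "plan"])
print(round(total, 4))
```

With these numbers the value at END comes to about 0.0199, that is .020 when rounded to three decimals.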

Parameter estimation

Maximum likelihood estimates (MLE) on data
1. Transition probabilities:
2. Emission probabilities:
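In the supervised case the maximum likelihood estimates are relative frequencies: the transition probability is count(previous tag, tag) / count(previous tag), and the emission probability is count(tag, word) / count(tag). The Python sketch below applies this to a toy corpus built from two sentences used earlier in the slides; the corpus choice and all helper names are illustrative assumptions, and the estimates are unsmoothed.

```python
from collections import Counter

# Toy tagged corpus (coarse tags N/V as on the earlier slides).
CORPUS = [
    [("Squad", "N"), ("helps", "V"), ("dog", "N"), ("bite", "N"), ("victim", "N")],
    [("board", "N"), ("backs", "V"), ("plan", "N")],
]

transitions = Counter()   # (previous tag, tag) counts, with START/END markers
emissions = Counter()     # (tag, word) counts
prev_totals = Counter()   # how often each tag occurs as a previous tag
tag_totals = Counter()    # how often each tag occurs at all

for sentence in CORPUS:
    tags = ["START"] + [tag for _, tag in sentence] + ["END"]
    for prev, cur in zip(tags, tags[1:]):
        transitions[(prev, cur)] += 1
        prev_totals[prev] += 1
    for word, tag in sentence:
        emissions[(tag, word)] += 1
        tag_totals[tag] += 1


def p_transition(prev, cur):
    return transitions[(prev, cur)] / prev_totals[prev]


def p_emission(tag, word):
    return emissions[(tag, word)] / tag_totals[tag]


print(p_transition("N", "V"), p_emission("N", "dog"))
```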

Implementation details

1. Start/End states
2. Log space/Rescaling
3. Vocabularies: model pruning
4. Higher order models:
   a. states representation
   b. Estimation and sparsity: deleted interpolation
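The reason for working in log space can be seen in a few lines of Python. The sketch below uses 100 made-up factors of 1e-5 to stand in for the transition and emission probabilities of a long sentence: their product underflows to zero in double precision, while the sum of their logs stays finite.

```python
import math

probs = [1e-5] * 100   # made-up small factors, for illustration only

product = 1.0
for p in probs:
    product *= p        # the true value, 1e-500, is too small for a float

log_score = sum(math.log(p) for p in probs)   # about -1151.3

print(product, log_score)
```

In log space the products of the Viterbi recursion become sums, and max is unaffected because log is monotonic. The forward algorithm additionally needs a log-sum-exp step or per-column rescaling, since it adds probabilities.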

Evaluation

● So once you have your POS tagger running, how do you evaluate it?

1. Overall error rate with respect to a gold-standard test set.
   a. ER = # words incorrectly tagged / # words tagged
2. Error rates on particular tags (and pairs)
3. Error rates on particular words (especially unknown words)
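The overall error rate is straightforward to compute once gold and predicted tags are aligned. A minimal Python sketch, using the sentence "I need a flight from Atlanta" from the earlier examples with Atlanta mis-tagged as NN; the function name `error_rate` is made up.

```python
def error_rate(gold, predicted):
    """ER = # words incorrectly tagged / # words tagged."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    wrong = sum(g != p for g, p in zip(gold, predicted))
    return wrong / len(gold)


gold = ["PRP", "VBP", "DT", "NN", "IN", "NNP"]
predicted = ["PRP", "VBP", "DT", "NN", "IN", "NN"]   # Atlanta/NN instead of NNP
er = error_rate(gold, predicted)
print(er)
```

One wrong tag out of six gives an error rate of 1/6, or an accuracy of about 83%.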

Evaluation

● The result is compared with a manually coded “Gold Standard”
● Typically accuracy > 97% on WSJ PTB
● This may be compared with the result for a baseline tagger (one that uses no context).
  ○ Baselines (most frequent tag) can achieve up to 90% accuracy.
● Important: 100% is impossible even for human annotators.
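A most-frequent-tag baseline fits in a few lines of Python. The training sentences below are made up for illustration, and falling back to NN for unknown words is an assumption of this sketch, not something the slides prescribe.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag_baseline(words, model, default="NN"):
    # No context is used: each word is tagged independently of its neighbours.
    return [model.get(word, default) for word in words]


# Made-up training data in which "back" is a noun twice and an adjective once.
train = [
    [("the", "DT"), ("back", "JJ"), ("door", "NN")],
    [("my", "PRP$"), ("back", "NN")],
    [("his", "PRP$"), ("back", "NN")],
]
model = train_baseline(train)
print(tag_baseline(["the", "back", "bill"], model))
```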

Summary

● Parts of speech
● Tagsets
● Part of speech tagging
● HMM Tagging:
  ○ Most likely tag sequence (decoding)
  ○ Probability of an observation (word sequence)
  ○ Parameter estimation (supervised)
  ○ Evaluation

Next class

● Unsupervised POS tagging models (HMMs)
● Parameter estimation: forward-backward algorithm
● Discriminative sequence models: MaxEnt, CRF, Perceptron, SVM, etc.
● Read J&M 5-6
● Pre-process and POS tag the data: report problems & baselines