Post on 17-Dec-2015

Named Entity Classification

Chioma Osondu & Wei Wei

Classifiers

• Decision Tree
• Multinomial Naïve Bayes
• Support Vector Machines
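To make one of the listed classifiers concrete, here is a minimal sketch of multinomial Naïve Bayes with add-one (Laplace) smoothing, written in plain Python. The class name, the toy training phrases, and the choice to skip unseen features at prediction time are assumptions for illustration; the slides do not say how the authors implemented or configured their classifiers.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes over lists of feature strings."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (1.0 = add-one smoothing)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for features, label in zip(docs, labels):
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total_docs)
            for f in features:
                if f in self.vocab:  # assumption: ignore features never seen in training
                    score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this sketch.
train_docs = [["pfizer", "inc"], ["acme", "corp"], ["dr", "john", "smith"], ["mr", "lee"]]
train_labels = ["company", "company", "person", "person"]
model = TinyMultinomialNB().fit(train_docs, train_labels)
```

Calling `model.predict(["zeta", "inc"])` scores each category and returns the one with the highest log probability.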

Features

• Unigrams
• Bigrams
• Trigrams
• Quadrigrams
• Specialized features such as number of words, presence of numbers, etc.
• Stemmed words
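The feature list above can be sketched as a small extraction function. This assumes word-level n-grams and a simple "name=value" string encoding; the slides do not state whether the n-grams are over words or characters, and the function names here are hypothetical. Stemming is left out because it would need a real stemmer (for example a Porter stemmer).

```python
def word_ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(phrase, max_n=4):
    """Unigrams through quadrigrams plus two specialized features."""
    tokens = phrase.lower().split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(f"{n}gram={g}" for g in word_ngrams(tokens, n))
    # Specialized features mentioned on the slide.
    features.append(f"num_words={len(tokens)}")
    features.append(f"has_digit={any(ch.isdigit() for ch in phrase)}")
    return features


features = extract_features("Bank of America")
```

A three-word phrase yields three unigrams, two bigrams, one trigram, no quadrigrams, and the two specialized features.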

Accuracy with Tree Depth

• Accuracy does not grow with the tree depth

• Accuracy is lower than that of the Maximum Entropy Model (MEM) with the same feature sets.

[Figure: Tree depth vs. accuracy. X-axis: tree depth (0 to 100); y-axis: accuracy (0.5 to 0.8). Series: without unigram, with unigram.]
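A depth sweep like the one plotted on this slide could be run along the following lines. This is a toy sketch under stated assumptions: binary "feature present" splits chosen by information gain, a hand-made data set, and hypothetical function names. It is not the authors' decision tree and does not reproduce their numbers.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def build_tree(rows, labels, max_depth):
    """rows: list of sets of feature strings. Returns a label (leaf) or a tuple node."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    base, best_feature, best_gain = entropy(labels), None, 0.0
    for feature in sorted(set().union(*rows)):
        with_f = [l for r, l in zip(rows, labels) if feature in r]
        without_f = [l for r, l in zip(rows, labels) if feature not in r]
        if not with_f or not without_f:
            continue
        remainder = (len(with_f) * entropy(with_f)
                     + len(without_f) * entropy(without_f)) / len(labels)
        if base - remainder > best_gain + 1e-12:
            best_feature, best_gain = feature, base - remainder
    if best_feature is None:
        return majority
    yes = [(r, l) for r, l in zip(rows, labels) if best_feature in r]
    no = [(r, l) for r, l in zip(rows, labels) if best_feature not in r]
    return (best_feature,
            build_tree([r for r, _ in yes], [l for _, l in yes], max_depth - 1),
            build_tree([r for r, _ in no], [l for _, l in no], max_depth - 1))


def predict(tree, row):
    while isinstance(tree, tuple):
        feature, yes_branch, no_branch = tree
        tree = yes_branch if feature in row else no_branch
    return tree


def accuracy_by_depth(train_rows, train_labels, test_rows, test_labels, depths):
    """Train one tree per depth limit and report held-out accuracy for each."""
    result = {}
    for d in depths:
        tree = build_tree(train_rows, train_labels, d)
        correct = sum(predict(tree, r) == l for r, l in zip(test_rows, test_labels))
        result[d] = correct / len(test_labels)
    return result


# Toy data, invented for this sketch.
train_rows = [{"inc"}, {"corp"}, {"dr"}, {"mr"}, {"inc", "new"}]
train_labels = ["company", "company", "person", "person", "company"]
test_rows = [{"inc", "old"}, {"dr"}]
test_labels = ["company", "person"]
sweep = accuracy_by_depth(train_rows, train_labels, test_rows, test_labels, [0, 1, 2, 10])
```

On this toy data the held-out accuracy stops changing once the depth limit exceeds what the data can use, which is the kind of plateau the first bullet describes.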

Results & Error Analysis (1)

• Features are not abstract enough: Corp., Corporation, and Inc. are really the same feature.

• Out of the 599 disputed classifications (cases where the two classifiers disagreed), MEM was correct on 481 (about 80%) and the decision tree on 118 (about 20%).

• Not enough features are defined for Place, Movie, and Person.
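One way to make features more abstract, as the first bullet suggests, is to map equivalent tokens to a shared placeholder before feature extraction. The sketch below is an assumption about how that could look; the suffix set, the placeholder name, and the function names are hypothetical and not taken from the slides.

```python
# Hypothetical equivalence class: the three forms named on the slide,
# plus undotted spellings added for illustration.
COMPANY_SUFFIXES = {"corp", "corp.", "corporation", "inc", "inc."}


def abstract_token(token):
    """Map any company-suffix variant to one shared feature value."""
    lowered = token.lower()
    return "<COMPANY_SUFFIX>" if lowered in COMPANY_SUFFIXES else lowered


def abstract_tokens(phrase):
    # Commas are treated as separators so "Acme, Inc." splits cleanly.
    return [abstract_token(t) for t in phrase.replace(",", " ").split()]
```

With this mapping, "Acme Corp.", "Acme Corporation", and "Acme, Inc." all produce the same token sequence, so a classifier sees one feature instead of three sparse ones.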

Results & Error Analysis (2)

[Figure: Error percentage in data. X-axis: category (Drug, Person, Place, Movie, Company); y-axis: error % (0 to 0.35). Series: error in both classifiers, error in decision tree, error in Maximum Entropy Model.]

Results & Error Analysis (3)

[Figure: bar chart, "Classification of atomic elements in drug category". Bars: drug, person, place, movie, company; y-axis: 0 to 350. Data labels as extracted: 347, 7, 347, 7, 4 (assignment to categories unclear).]

Results & Error Analysis (4)

[Figure: bar chart, "Classification of atomic elements in company category". Bars: drug, person, place, movie, company; y-axis: 0 to 350. Data labels as extracted, in legend order: 1, 3, 10, 6, 324.]

Conclusion & Future Work

• Stemmed words are too coarse for multi-way classification

• Better accuracies of over 94% can be achieved using a combination of features. See "Automatic Classification of Previously Unseen Proper Noun Phrases into Semantic Categories Using an N-Gram Letter Model" by Stephen Patel & Joseph Smarr (2001 Final Project)

• Combining classifiers
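The slides do not say how classifiers would be combined. One simple option, shown here purely as an assumption, is majority voting over the individual predictions, with ties going to the earliest classifier in the list. The function names and the stand-in classifiers are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; on a tie, the earliest tied prediction wins."""
    counts = Counter(predictions)
    top = max(counts.values())
    return next(label for label in predictions if counts[label] == top)


def combine(classifiers, phrase):
    """classifiers: callables mapping a phrase to a category label."""
    return majority_vote([clf(phrase) for clf in classifiers])


# Stand-in classifiers, invented for this sketch.
def always_company(phrase):
    return "company"


def digits_mean_drug(phrase):
    return "drug" if any(ch.isdigit() for ch in phrase) else "company"


def always_person(phrase):
    return "person"


label = combine([always_company, digits_mean_drug, always_person], "Acme Inc.")
```

Ordering the list so that the strongest classifier comes first makes the tie-break favor it, which may matter when the classifiers disagree often.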