The classification problem (recap from LING570). LING 572, Fei Xia, Dan Jinguji. Week 1: 1/10/08
Outline
• Probability theory
• The classification task
=> Both were covered in LING570 and are therefore part of the prerequisites.
Three types of probability
• Joint prob: P(x,y)= prob of x and y happening together
• Conditional prob: P(x|y) = prob of x given a specific value of y
• Marginal prob: P(x) = prob of x, obtained by summing P(x,y) over all possible values of y
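The three kinds of probability above can all be read off one joint table of counts. A minimal sketch in Python, assuming an invented toy corpus of (x, y) pairs (none of the names below come from the slides):

```python
from collections import Counter

# Toy corpus of (x, y) pairs; counts stand in for a probability table.
pairs = [("rain", "cold"), ("rain", "cold"), ("sun", "warm"),
         ("sun", "cold"), ("sun", "warm"), ("rain", "warm")]

joint = Counter(pairs)   # counts of each (x, y) pair
n = len(pairs)

def p_joint(x, y):       # P(x, y): x and y happening together
    return joint[(x, y)] / n

def p_marginal_x(x):     # P(x): sum P(x, y) over all values of y
    return sum(c for (xi, _), c in joint.items() if xi == x) / n

def p_cond(x, y):        # P(x | y) = P(x, y) / P(y)
    p_y = sum(c for (_, yi), c in joint.items() if yi == y) / n
    return p_joint(x, y) / p_y

print(p_joint("rain", "cold"))    # 2/6
print(p_marginal_x("rain"))       # 3/6
print(p_cond("rain", "cold"))     # (2/6) / (3/6) = 2/3
```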
Common tricks (III): Bayes rule

P(B|A) = P(A,B) / P(A) = P(A|B) P(B) / P(A)

y* = argmax_y P(y|x) = argmax_y P(x|y) P(y) / P(x) = argmax_y P(x|y) P(y)
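The argmax trick on this slide drops P(x) because it is the same for every candidate y. A minimal sketch, with invented priors P(y) and likelihoods P(x|y):

```python
# Pick y* = argmax_y P(x|y) P(y); the denominator P(x) is constant in y,
# so it can be ignored. All probability values here are illustrative.
priors = {"c1": 0.7, "c2": 0.3}                       # P(y)
likelihoods = {("w", "c1"): 0.1, ("w", "c2"): 0.8}    # P(x|y)

def classify(x):
    # max over classes of the unnormalized posterior P(x|y) * P(y)
    return max(priors, key=lambda y: likelihoods[(x, y)] * priors[y])

print(classify("w"))  # c2, since 0.8 * 0.3 = 0.24 > 0.1 * 0.7 = 0.07
```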
Common tricks (IV):Independence assumption
)|(
),...|(),...,(
11
111
1
ii
i
ii
in
AAP
AAAPAAP
8
A and B are conditionally independent given C:
P(A|B,C) = P(A|C)
P(A,B|C) = P(A|C) P(B|C)
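The chain rule plus the independence assumption is exactly how a bigram model scores a sequence: each item depends only on its predecessor. A minimal sketch, with made-up bigram probabilities (the words and values are invented):

```python
# P(A1,...,An) = prod_i P(Ai | A1..Ai-1) is approximated by
# prod_i P(Ai | Ai-1) under the independence (Markov) assumption.
bigram = {("<s>", "the"): 0.4, ("the", "cat"): 0.1, ("cat", "sat"): 0.3}

def p_seq(words):
    """Approximate joint probability of a sequence under the bigram assumption."""
    prev = "<s>"   # sentence-start symbol
    p = 1.0
    for w in words:
        p *= bigram[(prev, w)]   # P(w | prev) only, not the full history
        prev = w
    return p

print(p_seq(["the", "cat", "sat"]))  # 0.4 * 0.1 * 0.3 = 0.012
```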
Definition of classification problem
• Task:
  – C = {c1, c2, ..., cm} is a finite set of pre-defined classes (a.k.a. labels or categories).
  – Given an input x, decide on its category y.
• Multi-label vs. single-label problem
  – Single-label: each x is assigned exactly one class.
  – Multi-label: an x can have multiple labels.
• Multi-class vs. binary classification problem
  – Binary: |C| = 2.
  – Multi-class: |C| > 2.
Conversion to single-label binary problem
• Multi-label → single-label
  – If labels are unrelated, we can convert a multi-label problem into |C| binary problems: does x have label c1? Does it have label c2? ... Does it have label cm?
• Multi-class → binary problem
  – We can convert a multi-class problem into several binary problems. We will discuss this in Week #6.
=> We will focus on the single-label binary classification problem.
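The multi-label → binary conversion above can be sketched directly: one binary training set per class, asking "does x have label ci?". The instances and labels below are invented for illustration:

```python
# Turn a multi-label problem into |C| binary problems (one per class).
C = ["c1", "c2", "c3"]
data = [("x1", {"c1", "c3"}), ("x2", {"c2"})]   # multi-label instances

def binarize(data, c):
    """Binary training set for class c: target is +1 if c is among
    the instance's labels, -1 otherwise."""
    return [(x, +1 if c in labels else -1) for x, labels in data]

for c in C:
    print(c, binarize(data, c))
```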
Examples of classification tasks
• Text classification
• Document filtering
• Language/Author/Speaker id
• WSD (word sense disambiguation)
• PP (prepositional phrase) attachment
• Automatic essay grading
• …
Sequence labeling tasks
• Tokenization / word segmentation
• POS tagging
• NE detection
• NP chunking
• Parsing
• Reference resolution
• …
We can use classification algorithms + beam search
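The "classification + beam search" idea: at each position, a classifier scores every tag given the previous decision, and only the top-k partial sequences are kept. A minimal sketch, where score() is an invented stand-in for a real classifier:

```python
# Beam search over tag sequences using a local classifier score.
def score(prev_tag, word, tag):
    """Toy local score for P(tag | prev_tag, word); a trained
    classifier would go here. Values are invented."""
    return 0.9 if (word, tag) in {("dog", "N"), ("runs", "V")} else 0.1

def beam_search(words, tags, k=2):
    beam = [([], 1.0)]   # list of (partial tag sequence, score)
    for w in words:
        # Expand every hypothesis with every tag, then prune to top k.
        expanded = [(seq + [t], p * score(seq[-1] if seq else "<s>", w, t))
                    for seq, p in beam for t in tags]
        beam = sorted(expanded, key=lambda sp: -sp[1])[:k]
    return beam[0][0]    # best full sequence

print(beam_search(["dog", "runs"], ["N", "V"]))  # ['N', 'V']
```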
Steps for solving a classification problem
• Split data into training/test/validation
• Data preparation
• Training
• Decoding
• Postprocessing
• Evaluation
The three main steps
• Data preparation: represent the data as feature vectors.
• Training: a trainer takes the training data as input and outputs a classifier.
• Decoding: a decoder takes a classifier and test data as input and outputs classification results.
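The trainer/decoder interface above can be sketched with a placeholder learner. A minimal sketch, assuming a toy majority-class "trainer" (not one of the course's algorithms):

```python
from collections import Counter

def train(labeled_data):
    """Trainer: training data in, classifier out. Here the 'classifier'
    just predicts the majority class seen in training."""
    majority = Counter(y for _, y in labeled_data).most_common(1)[0][0]
    return lambda x: majority

def decode(classifier, test_xs):
    """Decoder: apply the classifier to each test instance."""
    return [classifier(x) for x in test_xs]

clf = train([("d1", "c1"), ("d2", "c1"), ("d3", "c2")])
print(decode(clf, ["d4", "d5"]))  # ['c1', 'c1']
```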
Data
• An instance: (x, y)
• Labeled data: y is known
• Unlabeled data: y is unknown
• Training/test data: a set of instances.
Data preparation: creating attribute-value table
        f1    f2   …    fK      Target
  d1    yes   1    no   -1000   c2
  d2
  d3
  …
  dn
Attribute-value table
• Each row corresponds to an instance.
• Each column corresponds to a feature.
• A feature type (a.k.a. a feature template): w-1
• A feature: w-1=book
• Binary feature vs. non-binary feature
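Instantiating feature types like w-1 into concrete features such as w-1=book, and then into rows of the attribute-value table, can be sketched as follows. The instances and vocabulary are invented; only the w-1 naming comes from the slide:

```python
# Build binary feature vectors from (previous word, current word) instances.
def features(prev_word, cur_word):
    """One instance as a dict of feature=value pairs (binary features).
    "w-1=..." instantiates the feature type w-1."""
    return {f"w-1={prev_word}": 1, f"w0={cur_word}": 1}

# Collect a fixed feature inventory (the table's columns) from the data.
feat_names = sorted({f for inst in [("the", "book"), ("a", "cat")]
                     for f in features(*inst)})

def to_vector(inst):
    """Fixed-length row of the attribute-value table for one instance."""
    fs = features(*inst)
    return [fs.get(name, 0) for name in feat_names]

print(feat_names)
print(to_vector(("the", "book")))
```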
The training stage
• Three types of learning
  – Supervised learning: the training data is labeled.
  – Unsupervised learning: the training data is unlabeled.
  – Semi-supervised learning: the training data consists of both.
• We will focus on supervised learning in LING572.
The decoding stage
• A classifier is a function f: f(x) = {(ci, scorei)}.
• Given the test data, a classifier “fills out” a decision matrix:

        d1    d2    d3   …
  c1    0.1   0.4   0    …
  c2    0.9   0.1   0    …
  c3
  …
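Filling out the decision matrix just means calling f(x) = {(ci, scorei)} on every test instance. A sketch with a hand-built scorer whose invented numbers mirror the example matrix:

```python
# Fill a decision matrix (classes x test instances) from a classifier.
def classify(x):
    """Toy classifier: return a score for every class. The scores are
    invented placeholders, not from any trained model."""
    return {"c1": 0.1, "c2": 0.9} if x == "d1" else {"c1": 0.4, "c2": 0.1}

test = ["d1", "d2"]
matrix = {c: [classify(d)[c] for d in test] for c in ["c1", "c2"]}
print(matrix)  # {'c1': [0.1, 0.4], 'c2': [0.9, 0.1]}
```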
Important tasks (for you) in LING 572
• Understand various learning algorithms.
• Apply the algorithms to different tasks:
  – Convert the data into an attribute-value table
    • Define feature types
    • Feature selection
    • Convert an instance into a feature vector
  – Choose an appropriate learning algorithm.