Outline ... Feature extraction Feature selection / Dim. reduction Classification ...


Page 1: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification ...

Page 2: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction

shape, texture

Feature selection / Dim. reduction Classification ...

Page 3: Outline... Feature extraction Feature selection / Dim. reduction Classification

How to extract features? E.g.:

[Fluorescence microscopy images of protein location patterns: ER, Tubulin, DNA, TfR, Actin, Nucleolin, Mito, LAMP, gpp130, giantin]

Page 4: Outline... Feature extraction Feature selection / Dim. reduction Classification

How to extract features? E.g.:

[Same images, with candidate features annotated: brightness(?) on the ER image, area(?) on the Mito image]

Page 5: Outline... Feature extraction Feature selection / Dim. reduction Classification

Images - shapes: distance function: Euclidean, on the area, perimeter, and 20 ‘moments’ [QBIC, ’95]

Q: other ‘features’ / distance functions?
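As a concrete illustration of such a shape feature vector, here is a minimal Python sketch (scikit-image and NumPy are assumed library choices, not the original QBIC implementation) that computes area, perimeter and a few moments of a binary shape and compares two shapes with the Euclidean distance:

```python
# Minimal sketch: area, perimeter and (Hu) moments of a binary shape,
# compared with the Euclidean distance. QBIC's exact 20 'moments' are not reproduced here.
import numpy as np
from skimage.measure import label, regionprops

def shape_features(mask):
    """mask: 2-D boolean array, True on the object."""
    region = max(regionprops(label(mask.astype(int))), key=lambda r: r.area)
    return np.hstack([region.area, region.perimeter, region.moments_hu])

def shape_distance(mask1, mask2):
    return np.linalg.norm(shape_features(mask1) - shape_features(mask2))
```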

Page 6: Outline... Feature extraction Feature selection / Dim. reduction Classification

Images - shapes (A1: turning angle) A2: wavelets A3: morphology: dilations/erosions ...

Page 7: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets - example: http://grail.cs.washington.edu/projects/query/

Wavelets achieve *great* compression:

[Reconstructions using 20, 100, 400, and 16,000 coefficients]
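A minimal sketch of the compression idea: keep only the k largest wavelet coefficients and reconstruct (PyWavelets is an assumed library choice; the coefficient counts above come from the linked example, not from this code):

```python
# Sketch: reconstruct an image from only its k largest-magnitude wavelet coefficients.
import numpy as np
import pywt

def compress(img, k, wavelet='haar'):
    coeffs = pywt.wavedec2(img, wavelet)
    arr, slices = pywt.coeffs_to_array(coeffs)        # all coefficients in one array
    thresh = np.sort(np.abs(arr), axis=None)[-k]      # k-th largest magnitude
    arr[np.abs(arr) < thresh] = 0                     # drop everything smaller
    kept = pywt.array_to_coeffs(arr, slices, output_format='wavedec2')
    return pywt.waverec2(kept, wavelet)               # approximate reconstruction
```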

Page 8: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets - intuition Edges (horizontal; vertical; diagonal)

Page 9: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets - intuition Edges (horizontal; vertical; diagonal) recurse

Page 10: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets - intuition Edges (horizontal; vertical; diagonal) http://www331.jpl.nasa.gov/public/wave.html

Page 11: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets Many wavelet bases:

Haar Daubechies (-4, -6, -20) Gabor ...

Page 12: Outline... Feature extraction Feature selection / Dim. reduction Classification

Daubechies D4 decomposition

Original image Wavelet Transformation

Page 13: Outline... Feature extraction Feature selection / Dim. reduction Classification

Gabor Function

We can extend the function to generate Gabor filters by rotating and dilating it

Page 14: Outline... Feature extraction Feature selection / Dim. reduction Classification

Feature Calculation - Preprocessing: background subtraction and thresholding, translation and rotation

Wavelet transformation: the Daubechies 4 wavelet, 10th-level decomposition; the average energy of the three high-frequency components

adv
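A minimal Python sketch of the wavelet features just described, assuming PyWavelets; preprocessing is left out, the image must be large enough for a 10-level decomposition, and the exact energy definition (mean squared detail coefficients) is an assumption:

```python
# Sketch: Daubechies-4 ('db4') decomposition; for each level, the average energy
# of the three high-frequency (detail) sub-bands is one feature.
import numpy as np
import pywt

def wavelet_features(img, levels=10, wavelet='db4'):
    coeffs = pywt.wavedec2(img, wavelet, level=levels)
    feats = []
    for cH, cV, cD in coeffs[1:]:            # detail sub-bands, one tuple per level
        feats.append(np.mean([np.mean(cH ** 2), np.mean(cV ** 2), np.mean(cD ** 2)]))
    return np.array(feats)                   # one average energy per level
```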

Page 15: Outline... Feature extraction Feature selection / Dim. reduction Classification

Feature Calculation: 30 Gabor filters were generated using five different scales and six different orientations. Convolve an input image with a Gabor filter; take the mean and standard deviation of the convolved image: 60 Gabor texture features

adv
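A minimal sketch of these Gabor texture features, assuming scikit-image's gabor filter; the five frequencies below are illustrative stand-ins for the original scales, and using the magnitude of the complex response is an assumption:

```python
# Sketch: 5 scales x 6 orientations = 30 Gabor filters; mean and std of each
# filtered image give 60 texture features.
import numpy as np
from skimage.filters import gabor

def gabor_features(img, frequencies=(0.05, 0.1, 0.2, 0.3, 0.4), n_orient=6):
    feats = []
    for f in frequencies:                              # five scales (assumed values)
        for k in range(n_orient):                      # six orientations
            real, imag = gabor(img, frequency=f, theta=k * np.pi / n_orient)
            mag = np.hypot(real, imag)                 # magnitude of the convolved image
            feats += [mag.mean(), mag.std()]
    return np.array(feats)                             # 60 Gabor texture features
```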

Page 16: Outline... Feature extraction Feature selection / Dim. reduction Classification

Wavelets: extremely useful. Excellent compression / feature extraction for natural images; fast to compute (O(N))

Page 17: Outline... Feature extraction Feature selection / Dim. reduction Classification

Images - shapes (A1: turning angle) A2: wavelets A3: morphology: dilations/erosions ...

Page 18: Outline... Feature extraction Feature selection / Dim. reduction Classification

Other shape features: Morphology (dilations, erosions, openings, closings) [Korn+, VLDB96]

[Figure: a B/W shape and a ‘structuring element’ of radius R=1]

Page 19: Outline... Feature extraction Feature selection / Dim. reduction Classification

Other shape features: Morphology (dilations, erosions, openings, closings) [Korn+, VLDB96]

[Figure: shape and ‘structuring elements’ with R=0.5, R=1, R=2]

Page 20: Outline... Feature extraction Feature selection / Dim. reduction Classification

Other shape features: Morphology (dilations, erosions, openings, closings) [Korn+, VLDB96]

[Same figure as the previous slide: shape and ‘structuring elements’ with R=0.5, R=1, R=2]

Page 21: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology: closing - fills in small gaps; very similar to ‘alpha contours’

Page 22: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology: closing - fills in small gaps

[Figure: result of ‘closing’ with R=1]

Page 23: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology: opening = ‘closing’ of the complement = trims small extremities

Page 24: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology: opening = ‘closing’ of the complement = trims small extremities

[Figure: result of ‘opening’ with R=1]

Page 25: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology Closing: fills in gaps

Opening: trims extremities

All wrt a structuring element:

Page 26: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology Features: areas of openings (R=1, 2, …) and closings

Page 27: Outline... Feature extraction Feature selection / Dim. reduction Classification

Morphology: the resulting areas form the ‘pattern spectrum’ - translation (and rotation) independent

As described: on b/w images; can be extended to grayscale ones (e.g., by thresholding)
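A minimal sketch of such a pattern spectrum on a b/w image, assuming SciPy's binary morphology; the square structuring element is an illustrative choice, not the one from the paper:

```python
# Sketch: areas of openings and closings for growing structuring elements (R = 1, 2, ...).
import numpy as np
from scipy import ndimage

def pattern_spectrum(mask, radii=(1, 2, 3, 4)):
    feats = []
    for r in radii:
        selem = np.ones((2 * r + 1, 2 * r + 1), dtype=bool)                 # square element of 'radius' r
        feats.append(ndimage.binary_opening(mask, structure=selem).sum())   # opening: trims extremities
        feats.append(ndimage.binary_closing(mask, structure=selem).sum())   # closing: fills in gaps
    return np.array(feats)   # translation-independent areas: the 'pattern spectrum'
```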

Page 28: Outline... Feature extraction Feature selection / Dim. reduction Classification

Conclusions - Shape: wavelets; mathematical morphology. Texture: wavelets; Haralick texture features

Page 29: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Faloutsos, C., R. Barber, et al. (July 1994). “Efficient and Effective Querying by Image Content.” J. of Intelligent Information Systems 3(3/4): 231-262.

Faloutsos, C. and K.-I. D. Lin (May 1995). FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets. Proc. of ACM-SIGMOD, San Jose, CA.

Faloutsos, C., M. Ranganathan, et al. (May 25-27, 1994). Fast Subsequence Matching in Time-Series Databases. Proc. ACM SIGMOD, Minneapolis, MN.

Page 30: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Christos Faloutsos, Searching Multimedia Databases by Content, Kluwer, 1996.

Page 31: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Flickner, M., H. Sawhney, et al. (Sept. 1995). “Query by Image and Video Content: The QBIC System.” IEEE Computer 28(9): 23-32.

Goldin, D. Q. and P. C. Kanellakis (Sept. 19-22, 1995). On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. CP95, Cassis, France.

Page 32: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Charles E. Jacobs, Adam Finkelstein, and David H. Salesin. Fast Multiresolution Image Querying. SIGGRAPH '95, pages 277-286. ACM, New York, 1995.

Flip Korn, Nikolaos Sidiropoulos, Christos Faloutsos, Eliot Siegel, Zenon Protopapas: Fast Nearest Neighbor Search in Medical Image Databases. VLDB 1996: 215-226.

Page 33: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification ...

Page 34: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature selection / Dim. reduction

PCA ICA Fractal Dim. reduction variants

...

Page 35: Outline... Feature extraction Feature selection / Dim. reduction Classification

Feature Reduction: remove non-discriminative features; remove redundant features. Benefits: speed, accuracy, multimedia indexing

Page 36: Outline... Feature extraction Feature selection / Dim. reduction Classification

SVD - Motivation

[Scatter plot of objects in the ‘area’ vs. ‘brightness’ feature space]

Page 37: Outline... Feature extraction Feature selection / Dim. reduction Classification

SVD - Motivation

[Same ‘area’ vs. ‘brightness’ scatter plot]

Page 38: Outline... Feature extraction Feature selection / Dim. reduction Classification

SVD – dim. reduction

SVD gives the best axis to project on (minimum RMS error)

[Same ‘area’ vs. ‘brightness’ scatter plot, with the first singular vector v1 drawn through the data]

Page 39: Outline... Feature extraction Feature selection / Dim. reduction Classification

SVD - Definition: A = U Λ V^T - example:

[Worked numerical example of the decomposition, with the first singular vector v1]

Page 40: Outline... Feature extraction Feature selection / Dim. reduction Classification

SVD - Properties

THEOREM [Press+92]: it is always possible to decompose a matrix A into A = U Λ V^T, where:

U, V: unique (*)

U, V: column-orthonormal (i.e., columns are unit vectors, orthogonal to each other): U^T U = I; V^T V = I (I: identity matrix)

Λ: eigenvalues are positive, and sorted in decreasing order

math
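A minimal NumPy sketch of the decomposition and of projecting onto the best axes (toy data; the ‘area’/‘brightness’ features of the motivating example are only implied):

```python
# Sketch: A = U Lambda V^T with numpy, plus a rank-k projection (minimum RMS error).
import numpy as np

A = np.random.rand(100, 6)                          # 100 objects x 6 features (toy data)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U.T @ U, np.eye(U.shape[1]))     # U is column-orthonormal
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))  # V is column-orthonormal
# s holds the singular values, positive and sorted in decreasing order
k = 2
A_k = U[:, :k] * s[:k]                              # coordinates of each object on the k best axes
```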

Page 41: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature selection / Dim. reduction

PCA ICA Fractal Dim. reduction variants

...

Page 42: Outline... Feature extraction Feature selection / Dim. reduction Classification

ICA Independent Component Analysis

better than PCA; also known as ‘blind source separation’ (the ‘cocktail party’ problem)
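A minimal ‘cocktail party’ sketch using scikit-learn's FastICA (an assumed implementation choice): two independent sources are mixed by an unknown matrix and then recovered.

```python
# Sketch: blind source separation of two mixed signals with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(3 * t), np.sign(np.sin(5 * t))]      # two independent 'speakers'
X = S @ np.array([[1.0, 0.5], [0.7, 1.0]]).T          # unknown mixing = the observed signals
S_est = FastICA(n_components=2, random_state=0).fit_transform(X)   # estimated sources
```

PCA would only decorrelate the mixtures; ICA aims at statistically independent components, which is why it runs a PCA/whitening step first (see the ICA conclusions below).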

Page 43: Outline... Feature extraction Feature selection / Dim. reduction Classification

Intuition behind ICA:

[Scatter plot of objects on ‘Zernike moment #1’ vs. ‘Zernike moment #2’]

Page 44: Outline... Feature extraction Feature selection / Dim. reduction Classification

Motivating Application 2: Data analysis

[Same Zernike-moment scatter plot, with the PCA axes overlaid]

Page 45: Outline... Feature extraction Feature selection / Dim. reduction Classification

Motivating Application 2: Data analysis

[Same Zernike-moment scatter plot, with both the ICA and the PCA axes overlaid]

Page 46: Outline... Feature extraction Feature selection / Dim. reduction Classification

Conclusions for ICA Better than PCA Actually, uses PCA as a first step!

Page 47: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature selection / Dim. reduction

PCA ICA Fractal Dim. reduction variants

...

Page 48: Outline... Feature extraction Feature selection / Dim. reduction Classification

Fractal Dimensionality Reduction

1. Calculate the fractal dimensionality of the training data.

2. Forward-Backward select features according to their impact on the fractal dimensionality of the whole data.

adv
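Step 1 can be illustrated with a rough box-counting estimate of the (correlation) fractal dimension; this is only a sketch of the idea, not the algorithm of the fractal feature-selection paper cited in the references. Step 2 would then compare this value with and without each candidate attribute:

```python
# Sketch: estimate the intrinsic ('fractal') dimensionality by box counting.
import numpy as np

def fractal_dim(X, grid_sizes=(2, 4, 8, 16, 32)):
    X = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)    # normalize to the unit cube
    log_r, log_S = [], []
    for g in grid_sizes:
        cells = np.floor(X * g).astype(int)               # grid cell of each point
        _, counts = np.unique(cells, axis=0, return_counts=True)
        p = counts / len(X)                               # occupancy of each non-empty cell
        log_r.append(np.log(1.0 / g))
        log_S.append(np.log(np.sum(p ** 2)))              # sum of squared occupancies
    return np.polyfit(log_r, log_S, 1)[0]                 # slope ~ intrinsic dimensionality
```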

Page 49: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction

Spot and drop attributes with strong (non-)linear correlations. Q: how do we do that?

adv

Page 50: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction - w/ fractals

[Figure: three 2-D datasets on the unit square: (a) quarter-circle, (b) line, (c) spike; annotation: ‘not informative’]

adv

Page 51: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction

Spot and drop attributes with strong (non-)linear correlations. Q: how do we do that? A: compute the intrinsic (‘fractal’) dimensionality ~ degrees of freedom

Page 52: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction - w/ fractals

[Same figure: (a) quarter-circle, (b) line, (c) spike, annotated: PFD~0, PFD=1, global FD=1]

adv

Page 53: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction - w/ fractals

[Same figure, annotated: PFD=1, PFD=1, global FD=1]

adv

Page 54: Outline... Feature extraction Feature selection / Dim. reduction Classification

Dim. reduction - w/ fractals

[Same figure, annotated: PFD~1, PFD~1, global FD=1]

adv

Page 55: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature selection / Dim. reduction

PCA ICA Fractal Dim. reduction variants

...

Page 56: Outline... Feature extraction Feature selection / Dim. reduction Classification

Nonlinear PCA

[Scatter plot of 2-D data (x, y) with non-linear structure]

adv

Page 57: Outline... Feature extraction Feature selection / Dim. reduction Classification

Nonlinear PCA

[Same scatter plot]

adv

Page 58: Outline... Feature extraction Feature selection / Dim. reduction Classification

Nonlinear PCA

X_{n×m} is the original data matrix (n points, m dimensions); apply a non-linear transformation F before PCA:

X'_{n×m} = F(X_{n×m})

[Same x-y scatter plot]

adv

Page 59: Outline... Feature extraction Feature selection / Dim. reduction Classification

Kernel PCA

K(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩

Kernel function (Gaussian/RBF):

K(x_i, x_j) = exp( −‖x_i − x_j‖² / (2σ²) )

[Figure: the data in the original (x, y) space and in the transformed (x', y') space]

adv
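A minimal scikit-learn sketch of kernel PCA with the Gaussian kernel above; note that scikit-learn parameterizes it as exp(−gamma·‖x_i − x_j‖²), i.e. gamma = 1/(2σ²), and the toy dataset is an assumption:

```python
# Sketch: linear PCA vs. kernel PCA (RBF kernel) on data with non-linear structure.
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

X, _ = make_circles(n_samples=300, factor=0.3, noise=0.05)        # two nested circles
X_pca = PCA(n_components=2).fit_transform(X)                      # linear PCA: circles stay nested
X_kpca = KernelPCA(n_components=2, kernel='rbf', gamma=5.0).fit_transform(X)  # unfolds the structure
```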

Page 60: Outline... Feature extraction Feature selection / Dim. reduction Classification

Genetic Algorithm

[Figure: bit strings encode feature subsets. Generation 1.1: 1 1 0 0 1 0 0 1 1 0 0 0 …, Generation 1.2: 0 1 1 0 0 0 0 1 1 0 1 0 …; mutation (bit flips) yields Generations 2.1 and 2.2; crossover (swapping tails) yields Generations 3.1 and 3.2; an evaluation function (a classifier) scores each generation]

adv
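A toy sketch of GA-based feature selection: bit strings mark the selected features, crossover and mutation produce new generations, and a user-supplied fitness function (e.g., the cross-validated accuracy of a classifier trained on the selected features) plays the role of the evaluation function. Everything here, including the parameters, is illustrative:

```python
# Sketch: genetic algorithm over feature-selection bit strings.
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, n_feat, pop=20, gens=30, p_mut=0.05):
    P = rng.integers(0, 2, size=(pop, n_feat))               # random initial bit strings
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in P])       # evaluation function (classifier)
        parents = P[np.argsort(scores)[-pop // 2:]]          # keep the fitter half
        kids = []
        while len(parents) + len(kids) < pop:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat)                    # one-point crossover
            child = np.r_[a[:cut], b[cut:]]
            flip = rng.random(n_feat) < p_mut                # mutation: flip a few bits
            child = np.where(flip, 1 - child, child)
            kids.append(child)
        P = np.vstack([parents, np.array(kids)])
    return P[np.argmax([fitness(ind) for ind in P])]         # best feature mask found
```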

Page 61: Outline... Feature extraction Feature selection / Dim. reduction Classification

Stepwise Discriminant Analysis

1. Calculate Wilks’ lambda and its corresponding F-statistic of the training data.

2. Forward-backward select features according to the F-statistics.

Λ(m) = |W(X)| / |T(X)|,  X = [X_1, X_2, …, X_m]  (W: within-group scatter, T: total scatter)

F_to-enter = ((n − q − m) / (q − 1)) · (1 − Λ(m+1)/Λ(m)) / (Λ(m+1)/Λ(m))

adv
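A minimal NumPy sketch of the Wilks' lambda part, Λ = |W| / |T| with W the within-class and T the total scatter matrix of the currently selected features; the F-to-enter bookkeeping of the forward-backward loop is not shown:

```python
# Sketch: Wilks' lambda for a candidate feature subset; smaller = more discriminative.
import numpy as np

def wilks_lambda(X, y):
    Xc = X - X.mean(axis=0)
    T = Xc.T @ Xc                                            # total scatter matrix
    W = sum(((X[y == c] - X[y == c].mean(axis=0)).T @
             (X[y == c] - X[y == c].mean(axis=0)))
            for c in np.unique(y))                           # within-class scatter matrix
    return np.linalg.det(W) / np.linalg.det(T)
```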

Page 62: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Berry, Michael: http://www.cs.utk.edu/~lsi/

Duda, R. O. and P. E. Hart (1973). Pattern Classification and Scene Analysis. New York, Wiley.

Faloutsos, C. (1996). Searching Multimedia Databases by Content, Kluwer Academic Inc.

Foltz, P. W. and S. T. Dumais (Dec. 1992). "Personalized Information Delivery: An Analysis of Information Filtering Methods." Comm. of ACM (CACM) 35(12): 51-60.

Page 63: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition, Academic Press.

Jolliffe, I. T. (1986). Principal Component Analysis, Springer Verlag.

Aapo Hyvarinen, Juha Karhunen, and Erkki Oja (2001). Independent Component Analysis, John Wiley & Sons.

Page 64: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Korn, F., A. Labrinidis, et al. (2000). "Quantifiable Data Mining Using Ratio Rules." VLDB Journal 8(3-4): 254-266.

Jia-Yu Pan, Hiroyuki Kitagawa, Christos Faloutsos, and Masafumi Hamamoto. AutoSplit: Fast and Scalable Discovery of Hidden Variables in Stream and Multimedia Databases. PAKDD 2004.

Page 65: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Press, W. H., S. A. Teukolsky, et al. (1992). Numerical Recipes in C, Cambridge University Press.

Strang, G. (1980). Linear Algebra and Its Applications, Academic Press.

Caetano Traina Jr., Agma Traina, Leejay Wu and Christos Faloutsos, Fast Feature Selection Using the Fractal Dimension, XV Brazilian Symposium on Databases (SBBD), Paraiba, Brazil, October 2000.

Page 66: Outline... Feature extraction Feature selection / Dim. reduction Classification
Page 67: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification ...

Page 68: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification

classification trees support vector machines mixture of experts

Page 69: Outline... Feature extraction Feature selection / Dim. reduction Classification

[Example images of classes: Nucleolar, Mitoch., Actin, Tubulin, Endosomal, plus an unknown image marked ‘???’]

Page 70: Outline... Feature extraction Feature selection / Dim. reduction Classification

[Figure: labeled ‘+’ and ‘-’ examples, and an unknown ‘???’ to classify]

Page 71: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - Problem

??

Page 72: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - Pictorially, we have:

[Scatter plot of ‘+’ and ‘-’ training points over num. attr#1 (e.g., ‘area’) and num. attr#2 (e.g., brightness)]

Page 73: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - and we want to label ‘?’

[Same scatter plot, plus an unlabeled point ‘?’]

Page 74: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - so we build a decision tree:

[Same scatter plot, with candidate split values 50 (area) and 40 (brightness) marked]

Page 75: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - so we build a decision tree:

area < 50?
  Y: +
  N: bright. < 40?
       Y: -
       N: ...

[Same scatter plot, partitioned at area = 50 and brightness = 40]

Page 76: Outline... Feature extraction Feature selection / Dim. reduction Classification

Decision trees - Goal: split the address space into (almost) homogeneous regions

area < 50?
  Y: +
  N: bright. < 40?
       Y: -
       N: ...

[Same partitioned scatter plot]
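A minimal scikit-learn sketch of the same idea: split the (area, brightness) space into almost homogeneous regions and then label the ‘?’ point. The toy numbers below are made up to match the 50/40 splits of the figure:

```python
# Sketch: a depth-2 decision tree over [area, brightness] features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[30, 70], [35, 60], [45, 80],     # '+' examples (small area)
              [60, 30], [70, 20], [80, 35]])    # '-' examples (large area)
y = np.array(['+', '+', '+', '-', '-', '-'])
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(tree.predict([[40, 50]]))                 # label the unknown '?' point
```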

Page 77: Outline... Feature extraction Feature selection / Dim. reduction Classification

Details - Variations

Pruning: to avoid over-fitting

AdaBoost: re-train on the samples that the first classifier failed on

Bagging: draw k samples (with replacement); train k classifiers; majority vote

adv

Page 78: Outline... Feature extraction Feature selection / Dim. reduction Classification

AdaBoost: creates new and improved base classifiers during training by manipulating the training dataset. At each iteration it feeds the base classifier a different distribution of the data, to focus the base classifier on the hard examples. The final hypothesis is a weighted sum of all base classifiers:

D_{t+1}(i) = D_t(i) · exp(−α_t y_i h_t(x_i)) / Z_t

H(x) = sign( Σ_{t=1..T} α_t h_t(x) )

adv
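A minimal sketch that spells out the update above, with depth-1 trees (‘stumps’) as the base classifiers; labels must be ±1, and scikit-learn's AdaBoostClassifier would do the same job in one call:

```python
# Sketch: AdaBoost weight update D_{t+1}(i) = D_t(i) exp(-alpha_t y_i h_t(x_i)) / Z_t
# and final hypothesis H(x) = sign(sum_t alpha_t h_t(x)).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=20):                       # y in {-1, +1}
    n = len(y)
    D = np.full(n, 1.0 / n)                     # initial distribution over examples
    stumps, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        err = D[pred != y].sum()
        alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
        D = D * np.exp(-alpha * y * pred)       # up-weight the hard (misclassified) examples
        D /= D.sum()                            # normalize by Z_t
        stumps.append(h)
        alphas.append(alpha)
    return lambda Xq: np.sign(sum(a * h.predict(Xq) for a, h in zip(alphas, stumps)))
```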

Page 79: Outline... Feature extraction Feature selection / Dim. reduction Classification

Bagging: uses another strategy to manipulate the training data: bootstrap resampling with replacement.

About 63.2% of the original training examples are retained in each bootstrapped set.

Good for training unstable base classifiers such as neural networks and decision trees.

The final prediction is a weighted sum (vote) of all base classifiers.

adv
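A minimal scikit-learn sketch of bagging with trees as the (unstable) base classifier; the number of estimators is arbitrary:

```python
# Sketch: bootstrap samples (with replacement), one tree per sample, majority vote.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag = BaggingClassifier(estimator=DecisionTreeClassifier(),   # 'base_estimator' in older scikit-learn
                        n_estimators=25, bootstrap=True)
# bag.fit(X_train, y_train); bag.predict(X_test)   # each bootstrap keeps ~63.2% of the examples
```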

Page 80: Outline... Feature extraction Feature selection / Dim. reduction Classification

Conclusions - Practitioner’s guide:

Many available implementations, e.g., C4.5 (freeware), C5.0; also inside larger stat. packages

Advanced ideas: boosting, bagging; recent, scalable methods

See [Mitchell] or [Han+Kamber] for details

Page 81: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

Tom Mitchell, Machine Learning, McGraw Hill, 1997.

Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.

Page 82: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification

classification trees support vector machines mixture of experts

Page 83: Outline... Feature extraction Feature selection / Dim. reduction Classification

Problem: Classification - we want to label ‘?’

[Scatter plot of ‘+’ and ‘-’ points over num. attr#1 (e.g., area) and num. attr#2 (e.g., bright.), with an unknown point ‘?’]

Page 84: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator??

[Scatter plot of ‘+’ and ‘-’ points (area vs. bright.) with the unknown point ‘?’]

Page 85: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator??

[Same plot, with a candidate linear separator drawn]

Page 86: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator??

[Same plot, with another candidate linear separator]

Page 87: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator??

[Same plot, with another candidate linear separator]

Page 88: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator??

[Same plot, with another candidate linear separator]

Page 89: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator?? A: the one with the widest corridor!

[Same plot, with the maximum-margin separator and its ‘corridor’ drawn]

Page 90: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - we want to label ‘?’: linear separator?? A: the one with the widest corridor!

[Same plot; the points on the corridor’s boundary are the ‘support vectors’]
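A minimal scikit-learn sketch: a linear SVM finds the widest-corridor separator and exposes the support vectors; the C parameter is the penalty on mis-classifications mentioned on the next slide. Toy data again:

```python
# Sketch: linear SVM on [area, brightness]; the margin-defining points are support_vectors_.
import numpy as np
from sklearn.svm import SVC

X = np.array([[30, 70], [35, 60], [45, 80], [60, 30], [70, 20], [80, 35]])
y = np.array([+1, +1, +1, -1, -1, -1])
svm = SVC(kernel='linear', C=10.0).fit(X, y)   # C: penalty for mis-classified points (soft margin)
print(svm.support_vectors_)                    # the 'support vectors' on the corridor boundary
print(svm.predict([[40, 50]]))                 # label the unknown '?' point
```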

Page 91: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - Q: what if + and - are not separable? A: penalize mis-classifications

[Same plot, with a ‘+’ point on the ‘-’ side of the corridor]

Page 92: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - Q: how about non-linear separators? A:

[Same plot]

adv

Page 93: Outline... Feature extraction Feature selection / Dim. reduction Classification

Support Vector Machines (SVMs) - Q: how about non-linear separators? A: possible (but needs a human)

[Same plot, with a non-linear separating curve]

adv
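For non-linear separators the usual route is the kernel trick; a one-line scikit-learn sketch (the kernel and its parameters still have to be chosen by a human):

```python
# Sketch: SVM with a non-linear (RBF) kernel instead of a linear separator.
from sklearn.svm import SVC

svm_rbf = SVC(kernel='rbf', gamma='scale', C=1.0)
# svm_rbf.fit(X_train, y_train); svm_rbf.predict(X_test)
```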

Page 94: Outline... Feature extraction Feature selection / Dim. reduction Classification

Performance - training: O(Ns^3 + Ns^2 L + Ns L d) to O(d L^2), where Ns: # of support vectors, L: size of the training set, d: dimensionality

adv

Page 95: Outline... Feature extraction Feature selection / Dim. reduction Classification

Performance - classification: O(M Ns), where Ns: # of support vectors, M: # of operations to compute the similarity (~ inner product ~ ‘kernel’)

adv

Page 96: Outline... Feature extraction Feature selection / Dim. reduction Classification

References

C. J. C. Burges: A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery 2, 121-167, 1998.

Nello Cristianini and John Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press, Cambridge, UK, 2000.

Software: http://svmlight.joachims.org/ and http://www.kernel-machines.org/

Page 97: Outline... Feature extraction Feature selection / Dim. reduction Classification

Outline ... Feature extraction Feature selection / Dim. reduction Classification

classification trees support vector machines mixture of experts

Page 98: Outline... Feature extraction Feature selection / Dim. reduction Classification

Mixture of experts: train several classifiers; use a (weighted) majority vote scheme
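A minimal sketch of the weighted-majority-vote idea, using scikit-learn's VotingClassifier as a stand-in (a full mixture of experts would also learn a gating function; the experts and weights here are arbitrary):

```python
# Sketch: several trained classifiers combined by a (weighted) vote.
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

experts = VotingClassifier(
    estimators=[('tree', DecisionTreeClassifier()),
                ('svm', SVC(probability=True)),
                ('knn', KNeighborsClassifier())],
    voting='soft', weights=[1, 2, 1])          # weighted vote across the experts
# experts.fit(X_train, y_train); experts.predict(X_test)
```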

Page 99: Outline... Feature extraction Feature selection / Dim. reduction Classification

Conclusions: 6 powerful tools:

Shape & texture features: wavelets; mathematical morphology

Dim. reduction: SVD/PCA; ICA

Classification: decision trees; SVMs

Page 100: Outline... Feature extraction Feature selection / Dim. reduction Classification