
Page 1: EE 6882 Statistical Methods for Video Indexing and Analysissfchang/course/svia-F04/slides/lecture2-A.pdf · EE E6882 SVIA Lecture 2 Review: Image features, color feature, ... Content-based


EE 6882 Statistical Methods for Video Indexing and Analysis

Fall 2004
Prof. Shih-Fu Chang

http://www.ee.columbia.edu/~sfchang

Lecture 2 Part A (9/15/04)

2EE6882-Chang

EE E6882 SVIA Lecture 2
Review: image features, color feature, similarity metrics
Additional distance metrics
Texture feature
Performance evaluation metrics
Review of statistical techniques
  Probability, distribution functions and Matlab demos
  Entropy and mutual information
  Discriminant classifiers
  Bayesian classifiers, GMM estimation by Expectation Maximization
Readings
  Readings on the class web site about content-based image search
  Vittorio Castelli, Probability Refresher, notes for EE E6880, Statistical Pattern Recognition, Spring 2002.
  A. Jain et al., "Statistical Pattern Recognition: A Review," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, Jan. 2000.
  Digital image processing textbooks: image classification
    Gonzalez and Woods Chap. 12, Anil Jain Chap. 9.14


3EE6882-Chang

Review: Statistical Pattern Recognition
Image/video pre-processing: quality, resolution, etc.
Feature extraction
  Color, texture, motion, shape, layout, regions, parts, etc.
Feature representation
  Discrete vs. continuous, vectorization, dimension
  Invariance to scale, rotation, translation, …
Feature selection
  PCA, MDS, Kernel PCA, etc.
Classification models
  Discriminative vs. generative
  Multi-modal fusion, early fusion vs. late fusion
Size of training/test data and manual supervision efforts
Validation and evaluation processes

[Figure, panel "Probabilistic": likelihood curves of Class 1 and Class 2 over x (height, income, …); to decide C(x0), compare P(x0|C=1) with P(x0|C=2).]

[Figure, panel "Discriminative": + and - samples in the (x1, x2) plane separated by a decision boundary; f(x) is the discriminant function, with f(x) > 0 on one side and f(x) < 0 on the other.]
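The two views on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class-conditional likelihoods are taken to be 1-D Gaussians, the discriminant is taken to be linear, and all parameter values and function names are hypothetical rather than taken from the lecture.

```python
import math

def gaussian_pdf(x, mean, std):
    """1-D Gaussian density, used as an assumed class-conditional likelihood P(x|C)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify_by_likelihood(x, params_c1, params_c2):
    """Probabilistic view: choose the class whose likelihood P(x|C) is larger."""
    p1 = gaussian_pdf(x, *params_c1)
    p2 = gaussian_pdf(x, *params_c2)
    return 1 if p1 > p2 else 2

def classify_by_discriminant(point, weights, bias):
    """Discriminative view: the sign of f(x) = w . x + b picks the side of the boundary."""
    f = sum(w * v for w, v in zip(weights, point)) + bias
    return "+" if f > 0 else "-"

# Hypothetical (mean, std) pairs and weights, chosen only for illustration.
label = classify_by_likelihood(1.0, (0.0, 1.0), (3.0, 1.0))
side = classify_by_discriminant((2.0, 1.0), (1.0, -1.0), 0.0)
```

The first function needs a model of each class, while the second only needs the boundary itself, which is the generative vs. discriminative distinction listed above.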

4EE6882-Chang

Review: Feature-Based Image Matching

[Figure: system diagram. The user works through a user interface showing image thumbnails and images & videos; the interface connects over the network to a query server (backed by an index) and an image/video server (backed by an archive).]

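HSI-cone (cylindrical coordinates)

VisualSEEk system: 166 quantized bins in HSI space

RGB is mapped linearly to (I, V1, V2), with I = (R + G + B)/3. The V1 and V2 rows of the transform matrix are garbled in this transcript (coefficient magnitudes 1/6, 1/6, 2/6 and 1/6, 1/6, 0, three of them negative).

$$H = \tan^{-1}\left(\frac{V_2}{V_1}\right)$$

$$S = \left(V_1^2 + V_2^2\right)^{1/2}$$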


5EE6882-Chang

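Review: Similarity Metrics

L1 distance

$$D_1(i, i+1) = \sum_j \left| H_i(j) - H_{i+1}(j) \right|$$

L2 distance

$$D_2(i, i+1) = \sum_j \left( H_i(j) - H_{i+1}(j) \right)^2$$

Histogram Intersection

$$D_I(i, i+1) = 1 - \frac{\sum_j \min\left(H_i(j), H_{i+1}(j)\right)}{\min\left(\sum_j H_i(j), \sum_j H_{i+1}(j)\right)}$$

Mahalanobis distance

$$D_{mah}^2(x_1, x_2) = (x_1 - x_2)^T C_x^{-1} (x_1 - x_2), \quad C_x: \text{covariance matrix}$$

[Figure: two scatter plots of feature x_i against x_j, one with correlated features (nonzero covariance c) and one with uncorrelated features (c = 0).]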
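A minimal Python sketch of the four metrics on this slide follows. The function names and sample histograms are illustrative assumptions; the L2 variant is written as a sum of squared bin differences without a square root, and the Mahalanobis helper is restricted to 2-D vectors so the covariance inverse can be written out by hand.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def l2_distance(h1, h2):
    """Sum of squared bin differences (no square root taken)."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def intersection_distance(h1, h2):
    """1 minus the histogram intersection, normalized by the smaller total mass."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return 1.0 - overlap / min(sum(h1), sum(h2))

def mahalanobis_sq_2d(x1, x2, cov):
    """Squared Mahalanobis distance for 2-D vectors; cov is a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x1[0] - x2[0], x1[1] - x2[1])
    tmp = (inv[0][0] * dx[0] + inv[0][1] * dx[1],
           inv[1][0] * dx[0] + inv[1][1] * dx[1])
    return dx[0] * tmp[0] + dx[1] * tmp[1]

# Hypothetical 4-bin histograms with equal total mass.
h_a = [4, 2, 0, 2]
h_b = [2, 2, 2, 2]
d1 = l1_distance(h_a, h_b)
d2 = l2_distance(h_a, h_b)
di = intersection_distance(h_a, h_b)
```

With an identity covariance the Mahalanobis form reduces to the squared Euclidean distance; a larger variance along one feature shrinks that feature's contribution.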

6EE6882-Chang

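Earth Mover's Distance (EMD)
Rubner, Tomasi, Guibas '98

Transportation Problem [Dantzig '51]

[Figure: bipartite graph from suppliers I to consumers J, with cost c_ij on each edge.]

I: set of suppliers
J: set of consumers
c_ij: cost of shipping a unit of supply from i to j

Problem: find the optimal set of flows f_ij to

$$\text{minimize } \sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij} \quad \text{s.t.}$$

$$f_{ij} \ge 0, \quad i \in I,\ j \in J \quad \text{(no reverse shipping)}$$

$$\sum_{i \in I} f_{ij} = y_j, \quad j \in J \quad \text{(satisfy each consumer's need/capacity)}$$

$$\sum_{j \in J} f_{ij} \le x_i, \quad i \in I \quad \text{(bounded by each supplier's limit)}$$

$$\sum_{j \in J} y_j \le \sum_{i \in I} x_i \quad \text{(feasibility)}$$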


7EE6882-Chang

Advantages of EMD
  Efficient implementations exist (Simplex Method)
  Also supports partial matching (||I|| ≠ ||J||, e.g., histograms defined in different color spaces or scales)
  If the masses of the two distributions are equal, then EMD is a true metric
  Allows flexible structures, e.g., matching multiple regions in each image
    Multiple regions in one image, each region represented by an individual feature vector

Region set: {R1, R2, R3}    Region set: {R1', R2', R3', R4'}

Cij = dist(Ri, Rj'), which can also be based on EMD

8EE6882-Chang

EMD of Color Histogram

$$h = (h(1), h(2), \ldots, h(M)), \quad g = (g(1), g(2), \ldots, g(N)), \quad \text{assume } \sum_{j=1}^{N} g(j) \le \sum_{i=1}^{M} h(i)$$

$$\mathrm{EMD}(h, g) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij}}{\sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij}}{\sum_{j=1}^{N} g(j)}$$

h is the earth and g the holes; the second equality holds because each hole is filled up.

C_ij: distance between color i in color space h and color j in color space g
f_ij: move f_ij units of mass from color i in h to color j in g

Normalization by the denominator term
  Avoids bias toward low-mass distributions (i.e., small images)
  What's the difference if both h and g are normalized first?
    Exact matching of sub-parts is changed.
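In general the flows f_ij come from solving the transportation problem with a linear-programming method. The sketch below covers only a special case that needs no solver, and that restriction is an assumption of this example rather than something stated in the lecture: both histograms are 1-D, share the same ordered bins, have equal total mass, and use ground distance C_ij = |i - j|. Under those conditions the minimum work equals the summed absolute difference of the cumulative histograms. The function name `emd_1d` is hypothetical.

```python
from itertools import accumulate

def emd_1d(h, g):
    """EMD for equal-mass 1-D histograms on the same ordered bins, with C_ij = |i - j|.

    The minimum work is the sum of absolute differences between the cumulative
    histograms; dividing by the total flow (the common mass) gives the EMD.
    """
    if len(h) != len(g):
        raise ValueError("histograms must have the same number of bins")
    total = sum(h)
    if total <= 0 or abs(total - sum(g)) > 1e-9:
        raise ValueError("this special case requires equal, positive total mass")
    work = sum(abs(ch - cg) for ch, cg in zip(accumulate(h), accumulate(g)))
    return work / total

# All mass must travel two bins to the right.
far = emd_1d([1, 0, 0], [0, 0, 1])
# Half of the mass stays in place, the other half moves one bin.
near = emd_1d([2, 0], [1, 1])
```

Partial matching (unequal masses) and multi-dimensional color bins fall outside this shortcut and need the full transportation formulation.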


9EE6882-Chang

Texture
What is texture?
  Has structure or repetitious pattern, e.g., checkered
  Has statistical pattern, e.g., grass, sand, rocks
Why texture?
  Application to satellite images, medical images
  Describes contents of real-world images, e.g., clouds, fabrics, surfaces, wood, stone
Challenging issues
  Rotation and scale invariance (3D)
  Segmentation/extraction of texture regions from images
  Texture in noise

10EE6882-Chang


11EE6882-Chang

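Some approaches for texture features
Fourier Domain Energy Distribution

Angular features (directionality)

$$V_{\theta_1 \theta_2} = \iint |F(u,v)|^2 \, du \, dv, \quad \text{where } \theta_1 \le \tan^{-1}\left(\frac{v}{u}\right) < \theta_2$$

Radial features (coarseness)

$$V_{r_1 r_2} = \iint |F(u,v)|^2 \, du \, dv, \quad \text{where } r_1 \le \sqrt{u^2 + v^2} < r_2$$

[Figure: frequency plane in polar coordinates (r, φ).]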

12EE6882-Chang

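Approaches to texture

Co-occurrence Matrix (image with m levels)

Popular early texture approach

$$Q_{R(d,\theta)} = \left[ Q_{R(d,\theta)}(i,j) \right] = \begin{bmatrix} Q(0,0) & \cdots & Q(0,m) \\ \vdots & & \vdots \\ Q(m,0) & \cdots & Q(m,m) \end{bmatrix}$$

where R(d, θ) is the relation between two pixels, e.g., "north", "NW", and Q(i, j) counts the pixel pairs with

$$I[x_0, y_0] = i \ \text{ and } \ I[x_1, y_1] = j \ \text{ and } \ x_1 = x_0 + d\cos(\theta) \ \text{ and } \ y_1 = y_0 + d\sin(\theta)$$

[Figure: pixels P0 and P1 separated by distance d at angle θ.]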


13EE6882-Chang

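Co-occurrence Matrix
(also called Spatial Grey-Level Dependence, SGLD)

Measures on $Q_{R(d,\theta)}(i,j)$, written $Q_R(i,j)$ below

Energy

$$E(d,\theta) = \sum_i \sum_j Q_R(i,j)$$

Entropy

$$H(d,\theta) = -\sum_i \sum_j \frac{Q_R(i,j)}{E} \log\left(\frac{Q_R(i,j)}{E}\right)$$

Correlation

$$C(d,\theta) = \frac{\sum_i \sum_j (i - \mu_x)(j - \mu_y) \, Q_R(i,j)}{\sigma_x \sigma_y}$$

Inertia

$$I(d,\theta) = \sum_i \sum_j (i - j)^2 \, Q_R(i,j)$$

Local Homogeneity

$$L(d,\theta) = \sum_i \sum_j \frac{1}{1 + (i - j)^2} \, Q_R(i,j)$$

Statistical measures: none corresponds to a visual component.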
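As a rough illustration of the co-occurrence approach, the Python sketch below counts pixel pairs for one displacement and derives three of the measures. Several choices here are assumptions of the example, not of the lecture: the displacement is given directly as integer offsets (dx, dy) instead of (d, θ), pairs that fall outside the image are skipped, and the measures are computed on the matrix normalized by its total count. Function names are hypothetical.

```python
import math

def cooccurrence(image, dx, dy, levels):
    """Q[i][j] = number of pairs with image[y][x] == i and image[y + dy][x + dx] == j."""
    q = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            y1, x1 = y + dy, x + dx
            if 0 <= y1 < rows and 0 <= x1 < cols:  # skip pairs leaving the image
                q[image[y][x]][image[y1][x1]] += 1
    return q

def texture_measures(q):
    """Entropy, inertia and local homogeneity of the normalized co-occurrence matrix."""
    n = len(q)
    total = sum(sum(row) for row in q)
    p = [[v / total for v in row] for row in q]
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    inertia = sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))
    return {"total": total, "entropy": entropy,
            "inertia": inertia, "homogeneity": homogeneity}

# A two-level checkerboard: every horizontal neighbor pair changes level.
board = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
q_right = cooccurrence(board, dx=1, dy=0, levels=2)
stats = texture_measures(q_right)
```

For the checkerboard all mass sits off the diagonal, so inertia is high and local homogeneity is low, which matches the intuition of a fine, high-contrast pattern.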

14EE6882-Chang

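Laws Filters [1980]

Non-Fourier type basis
Matched better to intuitive texture features
Examples of filters (out of total 12)

$$\begin{bmatrix} -1 & -4 & -6 & -4 & -1 \\ -2 & -8 & -12 & -8 & -2 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & 8 & 12 & 8 & 2 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \quad \begin{bmatrix} 1 & 0 & -2 & 0 & 1 \\ 2 & 0 & -4 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ -2 & 0 & 4 & 0 & -2 \\ -1 & 0 & 2 & 0 & -1 \end{bmatrix} \quad \begin{bmatrix} 1 & -4 & 6 & -4 & 1 \\ -4 & 16 & -24 & 16 & -4 \\ 6 & -24 & 36 & -24 & 6 \\ -4 & 16 & -24 & 16 & -4 \\ 1 & -4 & 6 & -4 & 1 \end{bmatrix}$$

Measure energy of output from each filter

[Figure: image I passed through the filter bank, giving 12 outputs.]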
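The 5x5 Laws masks are outer products of short 1-D kernels, which the following Python sketch makes explicit. It is a sketch under assumptions: the kernel names L5, E5, S5, R5 follow the usual Laws naming, "energy" is taken to be the mean squared filter response over positions where the mask fits entirely inside the image, and the function names are hypothetical. The masks are applied without flipping, which changes a response at most by sign here and so leaves the energy unchanged.

```python
# Standard 1-D Laws kernels (level, edge, spot, ripple).
L5 = [1, 4, 6, 4, 1]
E5 = [-1, -2, 0, 2, 1]
S5 = [-1, 0, 2, 0, -1]
R5 = [1, -4, 6, -4, 1]

def outer(col, row):
    """5x5 mask as the outer product of a column kernel and a row kernel."""
    return [[c * r for r in row] for c in col]

def filter_energy(image, mask):
    """Mean squared mask response over all positions where the mask fits."""
    k = len(mask)
    rows, cols = len(image), len(image[0])
    squared = []
    for y in range(rows - k + 1):
        for x in range(cols - k + 1):
            resp = sum(mask[a][b] * image[y + a][x + b]
                       for a in range(k) for b in range(k))
            squared.append(resp * resp)
    return sum(squared) / len(squared)

e5l5 = outer(E5, L5)   # responds to horizontal edges (intensity changing with y)
r5r5 = outer(R5, R5)   # responds to fine ripple patterns

flat = [[7] * 6 for _ in range(6)]                 # constant image
ramp = [[y for _ in range(6)] for y in range(6)]   # intensity grows with row index
flat_energy = filter_energy(flat, e5l5)
ramp_energy = filter_energy(ramp, e5l5)
```

Stacking the energies of all masks for one image region gives a texture feature vector for that region.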


15EE6882-Chang

Tamura Texture
Methods for approximating intuitive texture features
Example: 'coarseness'; others: 'contrast', 'directionality'

Step 1: compute averages at different scales, 1x1, 2x2, 4x4 pixels

$$\forall (x,y): \quad A_k(x,y) = \sum_{i = x - 2^{k-1}}^{x + 2^{k-1} - 1} \ \sum_{j = y - 2^{k-1}}^{y + 2^{k-1} - 1} \frac{f(i,j)}{2^{2k}}$$

Step 2: compute neighborhood difference at each scale

$$\forall (x,y): \quad E_{k,h}(x,y) = \left| A_k(x + 2^{k-1}, y) - A_k(x - 2^{k-1}, y) \right|$$

Step 3: select the scale with the largest variation

$$\forall (x,y): \quad \text{determine } k \text{ such that } E_k = \max(E_1, E_2, \ldots, E_L), \quad S_{Best}(x,y) = 2^k$$

Step 4: compute the coarseness

$$F_{CRS} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} S_{Best}(i,j) \quad \text{for an } m \times n \text{ image}$$
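The four steps can be sketched directly in Python. This is a simplified illustration, not a reference implementation: only the horizontal difference shown on the slide is used, border pixels are replicated when a window leaves the image (the slides do not specify boundary handling), ties between scales go to the smallest scale, and the name `tamura_coarseness` and the `max_k` parameter are hypothetical.

```python
def tamura_coarseness(img, max_k=3):
    """Average over all pixels of the best window size 2**k (horizontal differences only)."""
    rows, cols = len(img), len(img[0])

    def pixel(x, y):
        # Replicate border pixels: an assumption made for this sketch.
        return img[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    def average(k, x, y):
        # Step 1: mean over the 2**k x 2**k window around (x, y).
        half = 2 ** (k - 1)
        total = sum(pixel(i, j)
                    for i in range(x - half, x + half)
                    for j in range(y - half, y + half))
        return total / 2 ** (2 * k)

    s_best_sum = 0
    for y in range(rows):
        for x in range(cols):
            best_k, best_e = 1, -1.0
            for k in range(1, max_k + 1):
                half = 2 ** (k - 1)
                # Step 2: difference of non-overlapping windows left and right of (x, y).
                e = abs(average(k, x + half, y) - average(k, x - half, y))
                # Step 3: keep the scale with the largest difference.
                if e > best_e:
                    best_k, best_e = k, e
            s_best_sum += 2 ** best_k
    # Step 4: coarseness is the mean best size over the image.
    return s_best_sum / (rows * cols)

flat = [[5] * 4 for _ in range(4)]
stripes = [[(x // 4) % 2 for x in range(16)] for _ in range(16)]
flat_crs = tamura_coarseness(flat)
stripe_crs = tamura_coarseness(stripes)
```

The result always lies between 2 and 2**max_k; a featureless image falls back to the smallest scale because no scale shows any variation.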

16EE6882-Chang

Content-based Image and Video Retrieval System

[Figure: system diagram. The user works through a user interface showing image thumbnails and images & videos; the interface connects over the network to a query server (backed by an index) and an image/video server (backed by an archive).]

What are the bottlenecks of the system?
What functionalities should each component have?


17EE6882-Chang

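Evaluation

N images in DB; K ranked results returned

$$V_n = \begin{cases} 1 & \text{"relevant"} \\ 0 & \text{"irrelevant"} \end{cases}, \quad n = 0, \ldots, N-1$$

Detections: $A = \sum_{n=0}^{K-1} V_n$
False alarms: $B = \sum_{n=0}^{K-1} (1 - V_n)$
Misses: $C = \sum_{n=0}^{N-1} V_n - A$
Correct dismissals: $D = \sum_{n=0}^{N-1} (1 - V_n) - B$

Recall: $R = A / (A + C)$
Precision: $P = A / (A + B)$
Fallout: $F = B / (B + D)$
Combined: $F_1 = \dfrac{P \cdot R}{(P + R)/2}$

[Figure: Venn diagram of the "Returned" set and the "Relevant Ground Truth" set: A is their overlap, B is returned only, C is relevant only, D is neither.]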
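The counts A, B, C, D and the derived measures translate directly into code. The sketch below assumes the relevance labels V_n for all N database items are available in ranked order and that the first K are the returned results; the function name and the zero-denominator fallback are choices made for this example.

```python
def retrieval_metrics(relevance, k):
    """relevance: V_n (1 = relevant, 0 = irrelevant) for all N items in ranked order;
    the first k items are the returned results."""
    n_total = len(relevance)
    n_relevant = sum(relevance)
    a = sum(relevance[:k])            # detections
    b = k - a                         # false alarms
    c = n_relevant - a                # misses
    d = (n_total - n_relevant) - b    # correct dismissals

    def ratio(num, den):
        # Return 0.0 for an empty denominator (a convention chosen for this sketch).
        return num / den if den else 0.0

    recall = ratio(a, a + c)
    precision = ratio(a, a + b)
    fallout = ratio(b, b + d)
    f1 = ratio(precision * recall, (precision + recall) / 2)
    return {"A": a, "B": b, "C": c, "D": d,
            "recall": recall, "precision": precision,
            "fallout": fallout, "f1": f1}

# Hypothetical ranking of N = 10 items with K = 4 returned.
m = retrieval_metrics([1, 0, 1, 1, 0, 0, 0, 1, 0, 0], k=4)
```

Sweeping k from 1 to N and recording (recall, precision) at each step yields the points of a precision-recall curve.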

18EE6882-Chang

Evaluation Measures
1. Precision-Recall Curve: P vs. R
2. Receiver Operating Characteristic (ROC Curve): A vs. B
3. Relative Operating Characteristic: A vs. F
4. P value: precision at a cutoff point k, $P_k = \frac{1}{k} \sum_{n=0}^{k-1} V_n$
5. 3-point P value: average P at R = 0.2, 0.5, 0.8

[Figure: curve plotted with A (hit) on one axis and B (false) on the other.]


19EE6882-Chang

Evaluation Metric: Average Precision

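S: ranked list of data in response to a query (s items)

Rank j:        1    2    3    4    5    6    7
Ground truth:  1    0    1    1    0    0    0
Precision:    1/1  1/2  2/3  3/4  3/5  3/6  3/7

Average precision:

$$AP = \frac{1}{R} \sum_{j=1}^{s} \frac{R_j}{j} I_j, \quad R: \text{total number of relevant data}$$

where R_j is the number of relevant items among the top j and I_j = 1 if item j is relevant, 0 otherwise.

AP measures the average of precision values at R relevant data points

[Figure: precision plotted against rank j = 0..7, annotated AP = (Σ P_i)/3; R_j plotted against rank j = 0..7, stepping through 1, 2, 3.]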
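Average precision can be computed in a single pass over the ranked ground-truth labels. The sketch below follows the definition on this slide; the function name and the optional `total_relevant` argument (for the case where some relevant items never appear in the ranked list) are assumptions of the example.

```python
def average_precision(ground_truth, total_relevant=None):
    """ground_truth: ranked list of 1 (relevant) / 0 (irrelevant) labels.

    total_relevant is R; by default it is the number of relevant items in the list.
    """
    if total_relevant is None:
        total_relevant = sum(ground_truth)
    if total_relevant == 0:
        return 0.0
    hits = 0      # R_j: relevant items seen so far
    acc = 0.0     # running sum of precision values at relevant ranks
    for j, rel in enumerate(ground_truth, start=1):
        if rel:
            hits += 1
            acc += hits / j
    return acc / total_relevant

# The ranked list from this slide: relevant items at ranks 1, 3 and 4.
ap_slide = average_precision([1, 0, 1, 1, 0, 0, 0])
```

For the list on this slide the precision values at the three relevant ranks are 1/1, 2/3 and 3/4, so the result is their mean, roughly 0.81.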

20EE6882-Chang

Evaluation Metric: Average Precision
Alternative Measure
  Ranked results are manually inspected to a depth of N1
  E.g., in TREC VIDEO 2003, N1 = 100; in TREC VIDEO 2004, N1 = 1000

Observations (AP)
  AP depends on the rankings of relevant data and the size of the relevant data set. E.g., R = 10

  Case I:   + + + + + + + + + - - - - - - +
            Pre: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1    AP ≈ 1

  Case II:  - + - + - + - + - + - + - + - + - + - +
            Pre: 1/2 at each relevant item    AP = 1/2

  Case III: - - - - - - - - - - + + + + + + + + + +
            Pre: 1/11, 2/12, …, 10/20    AP ~ 0.3


21EE6882-Chang

Evaluation Metric: Average Precision
Observations (AP)
  E.g., R = 2

  Case I:  + + - - - -
           AP = 1

  Case II: - - - - + - - - - +
           Precision: 0.2, 0.2    AP = 0.2

  AP is different from interpolated average of precision values

22EE6882-Chang

Readings available on the class site for content-based image retrieval
  Consider this topic for class presentation

How to get hands on …
  Get the image content set from TA
  Get familiar with programming tools, e.g., Matlab

Introduction to Matlab basic commands
  http://www.ee.columbia.edu/~sfchang/tools/matlab.intro.html

Introduction to basic image processing commands in Matlab
  http://www.ee.columbia.edu/~sfchang/tools/DIPtutorial.m


23EE6882-Chang

Paper List for Fall 2004
Updated paper list available at the course web site
Topics
  Content-based image search
  Web image search
  Media fingerprinting
  Image classification
    Bayesian, Boosting, SVM
  Relevance feedback
  Document clustering
  HMM and video classification
  Language models and applications in multimedia IR

Feel free to propose additional topics