
Image Processing, Image Recognition,

Computer Vision, Image Understanding

Takashi Matsuyama

tm@i.kyoto-u.ac.jp

Dept. of Intelligence Science and Technology

Graduate School of Informatics, Kyoto University

Sound Processing, Speech Recognition,

Auditory Scene Understanding

Mathematical Theory of Pattern Recognition

The introduction will be given in two weeks.
Slides will be posted at http://vision.kuee.kyoto-u.ac.jp/lecture/dsp
Questions should be sent to tm@i.kyoto-u.ac.jp
All reports should be sent to tm@i.kyoto-u.ac.jp by May 2nd (Fri.)

What is recognition?

Longman Dictionary:
1. the act of realizing and accepting that something is true or important
2. public respect and thanks for someone's work or achievements
3. the act of knowing someone or something because you have known or learned about them in the past
4. the act of officially accepting that an organization, government, person etc has legal or official authority

Related intelligent mental functions: awareness, perception, cognition, understanding.

Architecture of Intelligent Systems

[Figure: the intelligent system faces the outer world (environments, other systems) through Perception (sensory system: sensation, recognition) on the input side and Action/Manipulation (motor system) on the output side; between them sits Thought (reasoning, learning, knowledge). Interaction flows in both directions.]

Report 1

(a) Describe the meanings of and differences among 1. Sensation, 2. Perception, and 3. Recognition.

(b) Describe the differences between 1. Cognition and 2. Recognition.

Information at the (physical) signal level vs. information at the (mental) cognitive level: discrimination between these levels is important!

[Figure: pairs of equal-length line segments with inward/outward arrowheads (> <) that are perceived as unequal.]

Physical Quantity vs Psychophysical Quantity

We see what we want to see and hear what we want to hear.

Information at the image (signal) level: a pair of intersecting line segments in an image.

Information at the cognitive level: the same imaged part is interpreted differently in (a) the block world and (b) the electric circuit world; the cognitive reading depends on the assumed 3D world and semantic world.

Shape from Shading

[Figure: the Architecture of Intelligent Systems diagram again, now split between the Mental World (Informatics) and the Physical World (Physics); Perception and Action/Manipulation bridge the two.]

How to bridge the two worlds?

Architecture of the 21st Century

Social structure in the 21st century: a Cyber Network Society (the mental world) built on top of the Physical Real World.

Cyber-Physical Systems: the Physical Real World obeys physical laws and is captured by physical models; the Cyber Network Society complies with rules and standards and is captured by computation models.

Cyber-Physical Systems for Developing a Smart Society (bridging the Cyber Network Society and the Physical Real World):
1. e-Money in economy
2. e-Tag in transportation (ubiquitous systems)
3. Digitizing human activities
4. Smart energy management

(1) e-Money in economy: authentication, security, pricing, credit, and warranty functions connect the Cyber Network Society with the Physical Real World.

(2) e-Tag in transportation (ubiquitous systems): physical objects and people carry e-tags whose IDs index records in the Cyber Network Society, combined with location information (GIS).

ID | type  | age | origin | grade
11 | onion | 1   | kyoto  | A
12 | beef  | 3   | USA    | B

ID | role   | name | opinion
8  | leader | Jim  | Yes
10 | chair  | John | ?

(3) Digitizing Human Activities: sensor networks embedded in the real world (e.g. a motion and blood-pressure sensor, taken from the Panasonic homepage) feed sensing and recognition into the Cyber Network Society, which returns presentation and control to the human through real-time integration.

(4) Integrating Information and Electricity Networks: real-time sensing and control of power, frequency, and phase couple the Cyber Network Society with the Physical Real World (the electric power network).

[Figure: a power network of distributed sources and sinks: solar cells, a fuel cell, batteries, and EVs.]
Key properties: distributed, personalized, bi-directional.

Fundamental Concepts

Symbol (in the cyber network society): a segmented entity with a unique ID (basic processing: entity identification).

Signal (in the physical real world): non-segmented numerical data with physical measures (basic processing: segmentation, similarity evaluation).

Pattern Recognition: transform signal data into symbols.

Informationization: bridging the cyber and physical worlds.

Digitization ≠ Informationization

Mechanism of Informationization

Physical Real World: real-world objects (human, car, dog, cat, real estate) and their relations and interactions.
Cyber Network Society: data structures (numeral, character, figure, graph, tree) and computation (modeling, prediction).
ID authentication and object recognition bridge the two.

Representing Information in a Computer

World of Information (Cyber Society): numerals, characters, sounds, images, trees, graphs, knowledge, concepts; computation and reasoning.
Internal state of a computer (electronic circuit): bit sequences (0100, 1110, ...); bit operations.
Coding maps information onto bit sequences; algorithms realize computation and reasoning as bit operations.

Report 2

For each application of the cyber-physical systems, explain how pattern recognition technologies can be used to realize informationization.
1. e-Money in economy
2. e-Tag in transportation (ubiquitous systems)
3. Digitizing human activities
4. Smart energy management

Pattern Recognition

Pattern Recognition in Informatics

1. What are patterns?
① Classes/categories/types of objects (class: a set of objects)
② Internal structures of objects (examples: design patterns, fabric patterns, sound and image patterns, behavior patterns)

2. What is recognition?
① Decision about the membership of a set (class/category/type classification): X (observed data) ∈ C (class)?
② Identification of an object (similarity, identity): X (observed data) = M (object model)?

Types of Pattern Recognition Methods

Types of information \ Method of recognition | Classification (Categorization) | Matching (Identification)
Attributes | Statistical Pattern Classification | Pattern Matching
Relations | Syntactic Pattern Classification | Computer Vision, Image Understanding

Data Representation in Statistical Pattern Classification

All data are represented by Feature Vectors

X = (x1, x2, ..., xn)^t, e.g. (height, weight, ..., age)^t

Vector Representation of Video Data

[Figure: each frame (1/30 second) of the video is raster-scanned along its scan lines, so frame 1, frame 2, ... become row1, row2, ..., rowN concatenated into a single 1D signal over t, e.g. 5 10 11 9 6 3 3 3 5 13 15 11.]
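As a concrete sketch of this raster-scan representation (using numpy; the array shapes and random frames are stand-ins for real video):

```python
import numpy as np

# Flatten each frame of a video into one feature vector by raster scan
# (row1, row2, ..., rowN), then concatenate the frames into a 1D signal.
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(90, 48, 64))  # (frames, rows, cols); stand-in data

vectors = video.reshape(video.shape[0], -1)      # one 3072-dim vector per frame
signal_1d = vectors.ravel()                      # frame1 rows, frame2 rows, ...
print(vectors.shape, signal_1d.shape)            # (90, 3072) (276480,)
```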

[Figure: a 2D feature space (x1, x2) partitioned into Class 1, Class 2, and Class 3 by decision boundaries; each datum is a feature vector X = (x1, x2, ..., xn)^t, e.g. (height, weight, ..., age)^t.]

Basic Scheme of Statistical Pattern Classification

Processes of Image Recognition

How Pattern Classification is used in practical applications.

Image Processing, Image Recognition,

Computer Vision, Image Understanding

Input / Output Data
• Image Processing: image → image
• Pattern Classification: feature vector → class name
• Computer Vision, Image Understanding: image → scene description

Computational Methods
• Image Processing: signal processing + geometric processing
• Image Recognition: image processing + pattern recognition
• Computer Vision: image processing + camera/3D model
• Image Understanding: image processing + knowledge/reasoning

Image Processing - contrast enhancement -

Pipeline: image input → preprocessing.

Spatial Filtering (2D Convolution): a weighting matrix is slid over the input image, and each output pixel is the sum of products of the weights and the covered input pixels.

2D convolution:
S(α, β) = ∬_S f(x, y) t(α − x, β − y) dx dy

Sharpening filter:
 0 −1  0
−1  5 −1
 0 −1  0
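A minimal discrete sketch of the spatial filtering above, with the slide's 3×3 sharpening kernel (scipy's ndimage.convolve stands in for the sum-of-products loop; the random image is a placeholder):

```python
import numpy as np
from scipy.ndimage import convolve

# Sharpening filter from the slide: center weight 5, 4-neighbors -1.
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])

image = np.random.rand(128, 128)                   # stand-in for the input image
output = convolve(image, sharpen, mode='nearest')  # sum of products at each pixel
```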

Image Processing - silhouette extraction -

Pipeline: image input → preprocessing → image feature extraction (segmentation).

Geometric Processing - extraction of small defects -

Pipeline: input binary image → expansion/erosion passes (morphological smoothing) → XOR with the input → output: the small defects.
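One plausible reading of the defect-extraction pipeline as code (a sketch only; the slide does not fix the exact order of the expansion/erosion passes):

```python
import numpy as np
from scipy.ndimage import binary_closing, binary_opening

# Morphological smoothing removes structures smaller than the structuring
# element; XOR against the original leaves exactly those small defects.
binary = np.random.rand(64, 64) > 0.5              # stand-in for the input binary image

smoothed = binary_opening(binary_closing(binary))  # expansion/erosion passes
defects = np.logical_xor(binary, smoothed)         # small defects only
```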

Feature Measurement

Pipeline: image input → preprocessing → image feature extraction (segmentation) → feature measurement.

Features measured on a segmented shape:
• Area size
• Boundary length
• Bounding box (area size, location)
• Convex hull (area size)
• Concavity (area size, number)
• Hole (area size, number)
• Principal axis (moment, direction)
• Chord (length)
• Shape projection

Feature vector

A region or line in an image is described by a feature vector
X = (x1, x2, ..., xn)^t
whose components are color features, shape features, and texture features.

Pipeline so far (all of it Image Processing): image input → preprocessing → image feature extraction (segmentation) → feature measurement → feature vector.

The final stage is recognition: Pattern Classification.

[Figure: the feature-space diagram again: feature vectors X = (x1, ..., xn)^t assigned to Class 1/2/3 by decision boundaries.]

Basic Scheme of Statistical Pattern Classification

Types of Pattern Recognition Methods (recap: see the table above).

Recognition by Matching

【Signal Matching】
① Template Matching
② Elastic Matching
③ Model Matching: argmin_θ Σ_t (model(t | θ) − signal(t))²

【Symbol Matching】
① Word Matching
② DNA Analysis
③ String Pattern Matching: "at" matches with "hat", "cat", "bat", ...
④ Unification: unify(f(x), f(g(a))) ⇒ x = g(a)

Correlation

Correlation function between f(t) and g(t):
y(t) = ∫ f(s) g(s − t) ds (integral over −∞ < s < ∞)

Fourier transform: Y(ω) = F(ω) G*(ω)

Correlation Function between Signals

Sum of squared differences:
∫ (f(t) − g(t))² dt = ∫ f(t)² dt + ∫ g(t)² dt − 2 ∫ f(t) g(t) dt
(all integrals over −∞ < t < ∞)

The first two terms are the fixed energies of the signals, so minimizing the squared difference is equivalent to maximizing the correlation term ∫ f(t) g(t) dt.

Correlation function: r(t) = ∫ g(s) m(s − t) ds, where g is the input signal to be processed and m is the target signal to be matched.

[Figure: r(t) peaks at the positions t where the target appears in the input signal.]

Normalized Correlation Function

Correlation function (similarity):
y(t) = ∫ f(s) g(s − t) ds

Normalized correlation function:
y*(t) = ∫ (f(s) − f̄)(g(s − t) − ḡ) ds / (‖f − f̄‖ · ‖g − ḡ‖)
where f̄ and ḡ are the means of f and g, and ‖f − f̄‖ = [∫ (f(s) − f̄)² ds]^(1/2).

Invariant against biasing and scaling.

Difference between two images (dissimilarity):
D(α, β) = ∬_S (f(x, y) − t(x − α, y − β))² dx dy
        = ∬_S f(x, y)² dx dy + ∬_S t(x − α, y − β)² dx dy − 2 ∬_S f(x, y) t(x − α, y − β) dx dy

Correlation function (similarity):
S(α, β) = ∬_S f(x, y) t(x − α, y − β) dx dy

Normalized correlation function:
S*(α, β) = ∬_S (f(x, y) − f̄)(t(x − α, y − β) − t̄) dx dy
           / ( [∬_S (f(x, y) − f̄)² dx dy]^(1/2) · [∬_S (t(x − α, y − β) − t̄)² dx dy]^(1/2) )
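Since the slides give no implementation, here is a minimal, unoptimized numpy sketch of template matching with the normalized correlation S*(α, β); the image and template are synthetic stand-ins:

```python
import numpy as np

def ncc(f, t):
    """Normalized correlation between image f and template t at every
    valid offset (a direct, unoptimized sketch of S*(alpha, beta))."""
    H, W = t.shape
    h = f.shape[0] - H + 1
    w = f.shape[1] - W + 1
    out = np.empty((h, w))
    tz = t - t.mean()
    tn = np.sqrt((tz ** 2).sum())
    for a in range(h):
        for b in range(w):
            patch = f[a:a + H, b:b + W]
            pz = patch - patch.mean()
            out[a, b] = (pz * tz).sum() / (np.sqrt((pz ** 2).sum()) * tn + 1e-12)
    return out

# Best matching position = argmax of the normalized correlation surface.
f = np.random.rand(64, 64)
t = f[20:30, 40:50].copy()                            # template cut from the image
score = ncc(f, t)
print(np.unravel_index(score.argmax(), score.shape))  # -> (20, 40)
```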

Image Processing by Correlation: Template Matching
f(x, y): input image, t(x, y): template.

Stereo Image Analysis

[Figure: a 3D scene point P under a light source is imaged by two cameras; candidate matches P′ and P″ appear in the images.]

Finding the best matching point between f(x, y) and the template t(x, y):
• The resultant displacement is in units of pixels.

Depth Measurement by Triangulation

The 3D depth of a scene point can be computed from a pair of matching image points in the left and right images.

[Figure: two cameras with image planes, baseline b, viewing angles θ1 and θ2, rays of length l1 and l2, and depth d.]

l1 cos θ1 + l2 cos θ2 = b
l1 sin θ1 = l2 sin θ2 = d

Eliminating l1 and l2:
d = b / (cot θ1 + cot θ2)
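A small sketch of the depth formula (the baseline and angles are made-up values):

```python
import math

def depth_from_angles(b, theta1, theta2):
    """Depth d from l1*cos(t1) + l2*cos(t2) = b and l1*sin(t1) = l2*sin(t2) = d,
    which eliminates l1, l2 to give d = b / (cot(t1) + cot(t2))."""
    return b / (1.0 / math.tan(theta1) + 1.0 / math.tan(theta2))

# Two cameras 1 m apart, both viewing the point at 60 degrees:
print(depth_from_angles(1.0, math.radians(60), math.radians(60)))  # ~0.866 m
```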

Motion Analysis by Correlation

[Figure: a template image cut at position (x0, y0) in frame T = 1 is searched for in the observed 2D motion images T = 2, ..., F; the best matching position (i0, j0) gives the motion vector (i0 − x0, j0 − y0).]

Elastic (DP) Matching

Model: P1 P2 P3 P4 ... Pn
Signal: Q1 Q2 Q3 Q4 ... Qm

The matching is elastic: a signal point may match several model points, e.g. Q1 matches with P1 and P2.

Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

Minimum cost to reach (i, j):
E(i, j) = min{ E(i, j−1) + c(⟨Pi, Qj⟩, ⟨Pi, Qj−1⟩),
               E(i−1, j−1) + c(⟨Pi, Qj⟩, ⟨Pi−1, Qj−1⟩),
               E(i−1, j) + c(⟨Pi, Qj⟩, ⟨Pi−1, Qj⟩) }

[Figure: the optimal paths to the predecessors (i, j−1), (i−1, j−1), (i−1, j) have already been computed.]
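A compact sketch of the recurrence (here the cost is simplified to depend on the current pair ⟨Pi, Qj⟩ only, whereas the slide's c(·,·) may also depend on the predecessor pair):

```python
import numpy as np

def elastic_match(P, Q, cost=lambda p, q: abs(p - q)):
    """Minimum-cost elastic (DP) matching between model P and signal Q via
    E(i,j) = c(Pi,Qj) + min(E(i,j-1), E(i-1,j-1), E(i-1,j))."""
    n, m = len(P), len(Q)
    E = np.full((n + 1, m + 1), np.inf)
    E[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(P[i - 1], Q[j - 1])
            E[i, j] = c + min(E[i, j - 1], E[i - 1, j - 1], E[i - 1, j])
    return E[n, m]

print(elastic_match([1, 2, 3, 4], [1, 1, 2, 3, 3, 4]))  # 0.0: Q stretches onto P
```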

Report 3

Convolution and correlation look very similar:

Convolution: C(α, β) = ∬_S f(x, y) t(α − x, β − y) dx dy
Correlation: S(α, β) = ∬_S f(x, y) t(x − α, y − β) dx dy

Discuss their relation.

Computational Scheme of Statistical Pattern Classification

Types of Pattern Recognition Methods (recap: see the table above).

Architecture of Statistical Pattern Classification Systems

Natural Pattern → Feature Measurement → Feature Selection (Extraction) → Classification → Class Name, with Learning driven by sample patterns (training samples).

Picture data is first measured into Feature Set 1, x = (x1, x2, ..., xm)^t, which feature selection reduces to Feature Set 2, y = (y1, y2, ..., yn)^t; the classifier maps y to a class name α.

Feature vector: x = (x1, x2, ..., xn)^t, where component xi is measurement i.

Architecture of Pattern Classifiers

[Figure: the feature vector (x1, x2, ..., xd) is fed to discriminant functions g1(x), g2(x), ..., gc(x); a maximum selector outputs the decision α = argmax_i gi(x).]

[1] Nearest Neighbor Classification

[Figure: the feature-space diagram again: a new feature vector X = (x1, ..., xn)^t is assigned to the class of its nearest stored samples, which induces the decision boundaries.]

Basic Scheme of Nearest Neighbor Classification

[Q1] What distance measure?

The measuring-unit problem: changing weight from kilograms to grams stretches one axis of the (height, weight) feature space, so the nearest neighbors change with the choice of units. This motivates a non-isotropic distance measure based on the shape of the data distribution.

Distance measures between a pair of feature vectors
X = (x1, x2, ..., xn)^t and Y = (y1, y2, ..., yn)^t:

1. Euclidean distance: [Σ_{i=1..n} (xi − yi)²]^(1/2)
2. L1 distance: Σ_{i=1..n} |xi − yi|
3. Similarity: cos θ = X·Y / (‖X‖ ‖Y‖)
4. Mahalanobis distance: [(X − M)^t Σ⁻¹ (X − M)]^(1/2) (M: mean vector, Σ: covariance matrix)
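The four measures in plain numpy (a sketch; the mean vector M and covariance matrix would come from a class's training samples, and the numbers here are invented):

```python
import numpy as np

x = np.array([170.0, 60.0, 30.0])
y = np.array([160.0, 55.0, 35.0])

euclidean = np.sqrt(((x - y) ** 2).sum())                 # L2 distance
l1        = np.abs(x - y).sum()                           # L1 distance
cosine    = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))  # similarity

M   = np.array([165.0, 58.0, 32.0])        # mean vector of a class (invented)
cov = np.diag([25.0, 16.0, 36.0])          # covariance matrix of the class (invented)
mahalanobis = np.sqrt((x - M) @ np.linalg.inv(cov) @ (x - M))
```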

Mean and covariance matrix of a data distribution

For x = (x1, x2, ..., xp)^t: mean μ = (μ1, μ2, ..., μp)^t and covariance matrix Σ = [σjk] (p × p).

Parameter estimation from the sample set x1, x2, ..., xN:
μi = (1/N) Σ_{l=1..N} x_li,  i = 1, 2, ..., p
σjk = (1/(N−1)) Σ_{l=1..N} (x_lj − μj)(x_lk − μk),  j, k = 1, 2, ..., p
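These estimators in numpy (a sketch with synthetic samples; np.cov with ddof=1 implements the 1/(N−1) form):

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(size=(200, 2)) @ [[3, 1], [1, 2]]   # N=200, p=2 synthetic data

mu    = samples.mean(axis=0)                   # mu_i = (1/N) sum_l x_li
sigma = np.cov(samples, rowvar=False, ddof=1)  # sigma_jk from centered sample pairs
print(mu.shape, sigma.shape)                   # (2,) (2, 2)
```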

[Figure: two representations of a normal density: (a) bivariate normal density surface, (b) scatter diagram.]

n-dimensional Normal Distribution

[Q2] Distance between which entities?

Option 1: distance to distribution centers. [Figure: each class is represented by the center of its distribution; the feature vector X = (x1, ..., xn)^t goes to the class with the nearest center, which induces the decision boundaries.]

Option 2: distance to sample data.
• Decision rule:
 - Find the k nearest neighbor samples of the input feature vector x in the n-dimensional feature space.
 - Find the most popular class by voting among those k nearest neighbor samples.

Distance by voting: k-nearest neighbor classification
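A minimal sketch of this decision rule (Euclidean distance here, though any measure from the earlier slide could be substituted; the toy samples are invented):

```python
import numpy as np

def knn_classify(x, samples, labels, k=3):
    """k-nearest neighbor rule: find the k nearest training samples
    and vote for the most popular class among them."""
    d = np.linalg.norm(samples - x, axis=1)    # distance to every sample
    nearest = labels[np.argsort(d)[:k]]        # labels of the k nearest
    classes, votes = np.unique(nearest, return_counts=True)
    return classes[votes.argmax()]

samples = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
labels  = np.array([1, 1, 1, 2, 2, 2])
print(knn_classify(np.array([4.5, 5.0]), samples, labels, k=3))  # -> 2
```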

Report 4

Compare the performance between 1. the nearest neighbor classification with the Mahalanobis distance and 2. the k-nearest neighbor classification in the following case.

[2] Statistical Pattern Classification (Bayes Decision Theory)

Ω = {ω1, ..., ωs}: the finite set of states of nature, i.e. object classes.
A = {α1, ..., αa}: the finite set of possible actions, i.e. diagnoses.
λ(αi | ωj): the loss incurred for taking action αi when the state of nature is ωj.

Bayes' rule converts the a priori probability P(ωi) and the probability distribution p(x | ωi) into the a posteriori probability:
P(ωi | x) = p(x | ωi) P(ωi) / p(x), where p(x) = Σ_{j=1..s} p(x | ωj) P(ωj).

Conditional risk:
R(αi | x) = Σ_{j=1..s} λ(αi | ωj) P(ωj | x)

Two-category case:
R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)

Bayesian decision rule: take the action αi with minimal conditional risk R(αi | x).
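The Bayes decision in code (a toy sketch with invented losses, priors, and likelihoods):

```python
import numpy as np

# Pick the action alpha_i minimizing the conditional risk
# R(alpha_i | x) = sum_j lam(alpha_i | omega_j) P(omega_j | x).
lam   = np.array([[0.0, 10.0],     # lam[i, j]: loss of action i in state j
                  [1.0,  0.0]])
prior = np.array([0.7, 0.3])       # P(omega_j), a priori
like  = np.array([0.2, 0.9])       # p(x | omega_j) at the observed x

posterior = like * prior / (like * prior).sum()   # Bayes rule: P(omega_j | x)
risk = lam @ posterior                            # R(alpha_i | x) for each action
print(risk.argmin())                              # action with minimal risk -> 1
```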

Maximum-Likelihood Classifier

[Figure: class-conditional densities p(x | ω1) and p(x | ω2) plotted over x.]

Decide x ∈ ωi if and only if p(x | ωi) ≥ p(x | ωj) for all j = 1, 2, ..., m
(regard p(x | ωi) as a function of ωi).

Minimal Error Rate Classifier

[Figure: P(ω1) p(x | ω1) and P(ω2) p(x | ω2) over x; the boundary B splits the axis into decision regions R1 and R2, and the overlapping tails are the error masses E12 and E21.]

P(E) = E12 + E21, where
E12 = ∫_{R1} P(ω2) p(x | ω2) dx
E21 = ∫_{R2} P(ω1) p(x | ω1) dx

Placing the boundary where P(ω1) p(x | ω1) = P(ω2) p(x | ω2) minimizes the total error probability P(E).

Estimation of Probability Distribution Function

【1】 Assume normal distributions:
p(x | ω1) = N(μ1, Σ1), p(x | ω2) = N(μ2, Σ2)
Estimate μ1, μ2, Σ1, Σ2 from sample data.

【2】 Assume mixtures of normal distributions:
p(x | ω1) = Σ_{i=1..N1} ai N(μ1i, Σ1i), p(x | ω2) = Σ_{j=1..N2} bj N(μ2j, Σ2j)
Estimate ai and bj, as well as μ1i, μ2j, Σ1i, Σ2j, from sample data. It is difficult to estimate N1 and N2.

Report 5

1. Discuss the differences between Probability Theory and Statistics.
2. Using histograms computed from sample data as probability distribution functions is not effective, due to over-fitting (over-training). Explain why.

[3] Linear Discriminant Function

[Figure: the pattern-classifier architecture again: feature vector (x1, ..., xd) → discriminant functions g1, ..., gc → maximum selector → decision α.]

Linear discriminant functions:
g1(x) = w1^t x + w10, g2(x) = w2^t x + w20

Decision boundary:
g(x) = g1(x) − g2(x) = (w1 − w2)^t x + (w10 − w20) = 0

Let g(x) = w^t x + w0, and make the decision based on the sign of g(x).

[Figure: Class 1 lies in the region g > 0, Class 2 in g < 0; the boundary is g = 0.]

Two Class Linear Discriminant Function

Geometric Representation

Let x1 and x2 denote points on the decision boundary. Then w^t x1 + w0 = w^t x2 + w0 = 0, so w^t (x1 − x2) = 0. This means that w is orthogonal to the hyperplane defined by g(x) = 0.

Let xp denote the orthogonal projection of x onto g(x) = 0. Then we can represent x as
x = xp + r · w/‖w‖,
from which we have
g(x) = w^t x + w0 = (w^t xp + w0) + r · w^t w/‖w‖ = r ‖w‖.
Since g(xp) = 0, |r| = |g(x)|/‖w‖, and r is positive when x lies toward the positive region of g. The distance from the origin to the hyperplane is |w0|/‖w‖.

[Figure: the hyperplane g = 0 with normal vector w; the perpendicular distances |g(x)|/‖w‖ and |w0|/‖w‖.]

Note that the coefficient vector w and the feature vector x are represented in the same coordinate system!

Learning the coefficient vector from sample data

STEP 1: Extend the feature and coefficient vectors: y = (1, x)^t, a = (w0, w)^t. Then the discriminant function is represented by g(x) = a^t y.

STEP 2: Flip the sign of all sample data of class ω2: if yi ∈ ω2, then yi ← −yi. The task becomes: find a such that a^t yi > 0 for all samples yi.

STEP 3:
【Gradient Descent Procedure】 Minimize the Perceptron criterion function Jp(a) = Σ_{y ∈ Y} (−a^t y), where Y = {y | a^t y ≤ 0} is the set of misclassified sample data. Update: a(k+1) = a(k) − ρk ∇Jp = a(k) + ρk Σ_{y ∈ Yk} y.
【Single Sample Correction Procedure】 Cycle through the sample data y1, y2, y3, y1, y2, y3, ...; start from an arbitrary initial parameter a(1); whenever yk is misclassified by a(k) (i.e. a(k)^t yk ≤ 0), set a(k+1) = a(k) + yk.
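The single-sample correction procedure as a sketch (the toy 2D data and class assignments are invented):

```python
import numpy as np

# STEP 1: extend vectors with a constant 1; STEP 2: flip class-2 signs;
# STEP 3: add any misclassified sample to the coefficient vector a.
X  = np.array([[0.0, 0], [0, 1], [1, 0], [2, 2], [2, 3], [3, 2]])
w2 = np.array([False, False, False, True, True, True])   # in class omega_2?

Y = np.hstack([np.ones((len(X), 1)), X])   # y = (1, x)
Y[w2] *= -1                                # flipped: want a @ y > 0 for all y

a = np.zeros(Y.shape[1])                   # arbitrary initial parameter
for _ in range(100):                       # cycle through the samples
    errors = 0
    for y in Y:
        if a @ y <= 0:                     # misclassified by the current a
            a = a + y                      # correction: a(k+1) = a(k) + y(k)
            errors += 1
    if errors == 0:                        # linearly separable -> converged
        break
print(a)                                   # g(x) = a @ (1, x); sign decides
```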

[Figure: the correction step: when a^t yk ≤ 0, adding yk to a(k) rotates the coefficient vector toward the misclassified sample. Note that the coefficient vector a and the feature vectors y are represented in the same coordinate system.]

Report 6

Describe how we can generalize the two-class linear classifier to a multi-class classifier.

Optimizing Generalization Capability in Linear Discriminant Function

Note that when two-class sample data are linearly separable, we can get a discriminant function separating them. But the function is not unique!

[Figure: several different lines all separate Class 1 (g > 0) from Class 2 (g < 0); among them, choose the one that maximizes the margin.]

Without a margin: find a such that g(x) = a^t yi > 0 for all sample data yi.
With a margin: find â such that â^t yi > b > 0 for all sample data yi, so that every sample clears the boundary by at least b.

Linearly non-separable case

[Figure: two-class data in the (x1, x2) plane that no straight line can separate.]

Extend the feature vector by a non-linear mapping: x1 → y1, x2 → y2, x1·x2 → y3. Find the linear discriminant function g(y) = w1 y1 + w2 y2 + w3 y3 + w0 in the extended feature space; it corresponds to the non-linear discriminant function g(x) = w1 x1 + w2 x2 + w3 x1 x2 + w0 in the original feature space.

Generalized Linear Discriminant Function
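A sketch of this extension on XOR-like toy data (the data and coefficients are hand-picked for illustration, not learned):

```python
import numpy as np

# Map (x1, x2) to (x1, x2, x1*x2) so that a linear function in the
# extended space realizes a non-linear boundary in the original space.
def extend(X):
    return np.column_stack([X, X[:, 0] * X[:, 1]])   # y3 = x1 * x2

# XOR-like data: no straight line separates it in (x1, x2) ...
X = np.array([[0.0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([1, -1, -1, 1])

# ... but in (y1, y2, y3) the linear function
# g(y) = -2*y1 - 2*y2 + 4*y3 + 1 separates it.
Y = extend(X)
g = Y @ np.array([-2.0, -2.0, 4.0]) + 1.0
print(np.sign(g) == t)                               # all True
```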

Perceptron

[Figure: S (sensory) layer → A (association) layer → R (response) layer. S→A: random connections with coefficients ±1; A→R: complete connections with coefficients ωij.]

ai = Σ_m αim sm, with random coefficients αim = ±1
rj = 1 if Σ_i ωij ai ≥ T, and rj = 0 if Σ_i ωij ai < T

The Perceptron is a linear discriminant function.

(Recap) Natural pattern → feature measurement → feature selection (extraction) → classification, with learning from training samples; feature sets x and y as before.

Representative methods around this architecture: Principal Component Analysis, Independent Component Analysis, Discriminant Analysis, Multivariate Analysis; k-NN method, Bayesian method, sub-space method; Support Vector Machine, Hidden Markov Model, Dynamic Programming; Model Fitting, Clustering; Self-Organizing Map, Multidimensional Scaling; ML Estimation, EM Algorithm.

Architecture of Statistical Pattern Recognition Systems

【Artifacts】
1. Highly correlated features are included: the recognition rate does not improve as expected.
2. Sparse distribution: the curse of dimensionality (Hughes effect) requires larger training samples for learning.
3. The recognition error rate may even increase: over-fitting.

Select useful features from the observed features. (Pattern recognition systems should be designed to recognize UNKNOWN data correctly!)

Myth: increasing the number of features always improves the recognition rate.
