Image Processing, Image Recognition,
Computer Vision, Image Understanding
Takashi Matsuyama
Dept. of Intelligence Science and Technology
Graduate School of Informatics, Kyoto University
Sound Processing, Speech Recognition,
Auditory Scene Understanding
Mathematical Theory of Pattern Recognition
The introduction will be given in two weeks.
Slides will be on http://vision.kuee.kyoto-u.ac.jp/lecture/dsp
Questions should be asked to [email protected]
All reports should be sent to [email protected] by May 2nd (Fri.)
What is recognition?
Longman Dictionary:
1. the act of realizing and accepting that something is true or important
2. public respect and thanks for someone's work or achievements
3. the act of knowing someone or something because you have known or learned about them in the past
4. the act of officially accepting that an organization, government, person etc has legal or official authority
awareness perception cognition
understanding
Intelligent mental function
[Figure: an Intelligent System interacting with the Outer World (Environments, Other systems). Perception (Sensory System): Sensation, Recognition. Thought: Reasoning, Learning, Knowledge. Action, Manipulation (Motor System).]
Architecture of Intelligent Systems
Report 1
(a) Describe the meanings of and differences among 1. Sensation, 2. Perception, and 3. Recognition.
(b) Describe the differences between 1. Cognition and 2. Recognition.
Information at the (physical) signal level
VS
Information at the (mental) cognitive level
Discrimination between these levels is important!
Physical Quantity vs Psychophysical Quantity
We see what we want to see and
hear what we want to hear.
Information at the image (signal) level
a pair of intersecting line segments in an image
an imaged part
(b) Electric circuit world
Information at the cognitive level
(a) block world
an imaged part
3D world Semantic world
θ
Shape from Shading
[Figure: Architecture of Intelligent Systems, as above, with the Mental World (Informatics) set against the Physical World (Physics)]
How to bridge two worlds
Architecture of 21st Century
Cyber Society (mental world)
Physical World
Cyber Network Society
Physical Real World
Social Structure in the 21st Century
?
Physical Laws (obey)
Rules, Standards (comply)
Physical Model
Computation Model
Cyber-Physical Systems
Cyber-Physical Systems for Developing Smart Society
1. e-money in economy
2. e-Tag in transportation (ubiquitous systems)
3. Digitizing Human Activities
4. Smart Energy Management
Cyber Network Society
Physical Real World
(1) e-Money in economy
Authentication Security Pricing
Credit Warranty
Cyber Network Society
Physical Real World
(2) e-Tag in transportation (ubiquitous systems)
E-tag
ID  type   age  origin  grade
11  onion  1    Kyoto   A
12  beef   3    USA     B

E-tag
ID  role    name  opinion
8   leader  Jim   Yes
10  chair   John  ?
Location Information GIS
(3) Digitizing Human Activities
Real-time Integration
Sensing and Recognition Presentation and Control
Cyber Network Society
Human
Sensor Networks Embedded in the Real World
Motion and Blood Pressure Sensor
Taken from Panasonic Homepage
Real-Time Sensing & Control
Power, Frequency, Phase Sensing
Power, Frequency, Phase Control
Cyber Network Society
(4) Integrating Information and Electricity Networks
Physical Real World (Electricity Power Network)
[Figure: distributed power sources and storage in the electricity power network: solar cells, fuel cell, batteries, EVs]
・distributed ・personalization ・bi-directional
Fundamental Concepts
Symbol (in the cyber network society) segmented entity with a unique ID
(basic processing: entity identification)
Signal (in the physical real world) non-segmented numerical data with physical measures
(basic processing: segmentation, similarity evaluation)
Pattern Recognition: Transform signal data to symbols.
Informationization: Bridging between cyber and physical worlds
Digitization ≠ Informationization
Cyber Network Society
Physical Real World
ID Authentication
Object Recognition
Real world objects human
car dog cat
real estate
Data Structure numeral character
figure graph tree
Relation Interaction
Computation
Modeling Prediction
Mechanism of Informationization
bit sequence:0100, 1110
Representing Information in a Computer
Internal State of a computer (Electronic Circuit)
World of Information (Cyber Society)
Algorithm, Coding
numeral, character, sound, image tree, graph, knowledge, concept
computation reasoning
bit operation
Report 2
In each application of the cyber-physical systems, explain how pattern recognition technologies can be used to realize the informationization.
1. e-money in economy
2. e-Tag in transportation (ubiquitous systems)
3. Digitizing Human Activities
4. Smart Energy Management
Pattern Recognition
Pattern Recognition in Informatics
1. What are patterns? ① Classes/categories/types of objects (class: a set of objects)
② Internal structures of objects(example: design patterns, fabric patterns, sound and image patterns, behavior patterns)
2. What is recognition? ① Decision about the membership of a set
(class/category/type classification) X(observed data) ∈ C(class)?
② Identification of an object (similarity, identity) X (observed data)= M(object model)?
Types of Pattern Recognition Methods

Method of recognition            | Information: Attributes            | Information: Relations
Classification (Categorization)  | Statistical Pattern Classification | Syntactic Pattern Classification
Matching (Identification)        | Pattern Matching                   | Computer Vision, Image Understanding
Data Representation in Statistical Pattern Classification
All data are represented by Feature Vectors
$$X = (X_1, X_2, \dots, X_n)^T \quad \text{(e.g. height, weight, ..., age)}$$
Vector Representation of Video Data
Video Data i-th frame 1D signal
1/30 second
Scan line
Raster scan
Frame 1 Frame 2
row1 row2…rowN row1 row2
t 5 10 11 9 6 3 3 3 5 13 15 11
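The raster-scan idea above can be sketched in a few lines. This is only an illustration under assumed conventions: each frame is a list of rows of pixel values, and the function names `raster_scan` and `video_to_signal` are hypothetical, not from the lecture.

```python
def raster_scan(frame):
    """Flatten one frame (a list of rows) into a 1D list, scan line by scan line."""
    return [pixel for row in frame for pixel in row]


def video_to_signal(frames):
    """Concatenate the raster scans of consecutive frames into one 1D signal."""
    signal = []
    for frame in frames:
        signal.extend(raster_scan(frame))
    return signal


# Two tiny 2x2 frames (made-up pixel values).
frames = [
    [[5, 10], [11, 9]],   # frame 1: row1, row2
    [[6, 3], [3, 3]],     # frame 2: row1, row2
]
signal = video_to_signal(frames)
```

The resulting list can then be treated as one long feature vector, at the cost of losing the explicit 2D neighborhood structure.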
[Figure: feature vectors $X = (X_1, X_2, \dots, X_n)^T$ (e.g. height, weight, ..., age) plotted in the $(x_1, x_2)$ feature space; decision boundaries separate Class 1, Class 2, and Class 3]
Basic Scheme of Statistical Pattern Classification
Processes of Image Recognition
How Pattern Classification is used in practical applications.
Image Processing, Image Recognition,
Computer Vision, Image Understanding
Input / Output Data
•Image Processing : image → image
•Pattern Classification : feature vector → class name
•Computer Vision, Image Understanding : image → scene description
Computational Methods
•Image Processing : signal processing + geometric processing
•Image Recognition : image processing + pattern recognition
•Computer Vision : image processing + camera/3D model
•Image Understanding : image processing + knowledge/reasoning
Image Processing - contrast enhancement -
Image input preprocessing
Output Image
Input Image
Weighting Matrix
Sum of Products
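Spatial Filtering (2D Convolution)
2D Convolution:
$$S(\alpha, \beta) = \iint_S f(x, y)\, t(\alpha - x, \beta - y)\, dx\, dy$$
Sharpening Filter:
$$\begin{pmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{pmatrix}$$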
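A discrete version of spatial filtering as a sum of products can be sketched as follows. The kernel is assumed to be the common 3x3 sharpening matrix (center 5, four neighbors -1); leaving border pixels unchanged is a simplification chosen here, and `convolve2d` is a hypothetical helper name.

```python
# Assumed 3x3 sharpening kernel (weights sum to 1, so flat regions are preserved).
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]


def convolve2d(image, kernel):
    """Discrete 2D convolution; border pixels are copied unchanged (a simplification)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ry, rx = kh // 2, kw // 2
    out = [row[:] for row in image]
    for y in range(ry, h - ry):
        for x in range(rx, w - rx):
            total = 0
            for j in range(kh):
                for i in range(kw):
                    # Convolution flips the kernel relative to the image.
                    total += image[y + ry - j][x + rx - i] * kernel[j][i]
            out[y][x] = total
    return out


image = [[10, 10, 10],
         [10, 20, 10],
         [10, 10, 10]]
sharpened = convolve2d(image, SHARPEN)
```

The bright center pixel is pushed further away from its neighbors (5 * 20 - 4 * 10), which is the contrast-enhancing effect of the filter.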
Image Processing - silhouette extraction -
Image input preprocessing Image feature extraction (segmentation)
Geometric Processing - extraction of small defects -
[Figure: input binary image; results of repeated expansion and erosion; output obtained by XOR]
Image input preprocessing Image feature extraction (segmentation)
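The defect-extraction idea (expansion, erosion, then XOR with the input) might be sketched like this. Several things are assumptions made for the sketch: "expansion" is read as morphological dilation, a 3x3 neighborhood is used, one pass of each operation is applied, and the function names are hypothetical.

```python
def dilate(img):
    """Expansion: a pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(img[yy][xx]
                     for yy in range(max(0, y - 1), min(h, y + 2))
                     for xx in range(max(0, x - 1), min(w, x + 2))))
             for x in range(w)] for y in range(h)]


def erode(img):
    """Erosion: a pixel stays 1 only if its full 3x3 neighborhood is inside the image and all 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(img[yy][xx]
                                for yy in range(y - 1, y + 2)
                                for xx in range(x - 1, x + 2)))
    return out


def small_defects(img):
    """XOR the input with its expanded-then-eroded version; differing pixels are defects."""
    closed = erode(dilate(img))
    return [(y, x) for y in range(len(img)) for x in range(len(img[0]))
            if closed[y][x] != img[y][x]]


# A 5x5 solid block inside a 7x7 image, with a one-pixel hole as the defect.
image = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)] for y in range(7)]
image[3][3] = 0
defects = small_defects(image)
```

Expansion followed by erosion fills holes smaller than the neighborhood while restoring the outer shape, so the XOR leaves only the small defect.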
Feature Measurement
Concavity (area size, number)
Shape projection
Bounding box (area size, location)
Principal axis (moment, direction)
Convex hull (area size)
Chord (length)
Area size
Boundary length
Hole (area size, number)
Feature measurement Image input preprocessing Image feature extraction (segmentation)
Feature vector: $X = (X_1, X_2, \dots, X_n)^T$, built from color features, shape features, and texture features of a region / line in an image
Image Processing: image input → preprocessing → image feature extraction (segmentation) → feature measurement
Pattern Classification: recognition
Recognition by Matching
【Signal Matching】
① Template Matching
② Elastic Matching
③ Model Matching: $\arg\min_{\theta} \left| \mathrm{model}(t \mid \theta) - \mathrm{signal}(t) \right|^2$
【Symbol Matching】
① Word Matching
② DNA Analysis
③ String Pattern Matching: “at” matches with “hat”, “cat”, “bat”, …
④ Unification: unify(f(x), f(g(a))) gives x = g(a)
Correlation
Correlation Function between $f(t)$ and $g(t)$:
$$y(t) = \int_{-\infty}^{\infty} f(s)\, g(s - t)\, ds$$
Fourier Transform:
$$Y(\omega) = F(\omega)\, G(\omega)^*$$
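Correlation Function between Signals
Sum of Squared Differences:
$$\int_{-\infty}^{\infty} (f(t) - g(t))^2\, dt = \int_{-\infty}^{\infty} \left( f(t)^2 - 2 f(t) g(t) + g(t)^2 \right) dt = \int_{-\infty}^{\infty} f(t)^2\, dt + \int_{-\infty}^{\infty} g(t)^2\, dt - 2 \int_{-\infty}^{\infty} f(t)\, g(t)\, dt$$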
Correlation Function:
$$r(t) = \int_{-\infty}^{\infty} g(s)\, m(s - t)\, ds$$
$g$: input signal to be processed
$m$: target signal to be matched
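For sampled signals the correlation integral becomes a sliding sum of products. The sketch below assumes finite lists, evaluates only shifts where the target fits entirely inside the input, and uses made-up sample values; `correlate` is a hypothetical helper name.

```python
def correlate(g, m):
    """r[t] = sum over u of g[t + u] * m[u], for every shift t where m fits inside g."""
    n, k = len(g), len(m)
    return [sum(g[t + u] * m[u] for u in range(k)) for t in range(n - k + 1)]


g = [0, 0, 1, 3, 1, 0, 0, 0]   # input signal to be processed
m = [1, 3, 1]                  # target signal to be matched
r = correlate(g, m)
best_shift = max(range(len(r)), key=r.__getitem__)
```

The shift with the largest correlation value is taken as the position of the target. Plain correlation also grows with overall signal amplitude, so a bright but unrelated region can score highly.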
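Normalized Correlation Function
Correlation Function (Similarity):
$$y(t) = \int_{-\infty}^{\infty} f(s)\, g(s - t)\, ds$$
Normalized Correlation Function:
$$y^*(t) = \frac{\int_{-\infty}^{\infty} (f(s) - \bar{f})\,(g(s - t) - \bar{g})\, ds}{\sqrt{\int_{-\infty}^{\infty} (f(s) - \bar{f})^2\, ds \int_{-\infty}^{\infty} (g(s - t) - \bar{g})^2\, ds}}$$
where $\bar{f}$ and $\bar{g}$ are the mean values of $f$ and $g$.
Invariant against biasing and scaling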
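Difference between two images (dissimilarity):
$$D(\alpha, \beta) = \iint_S \left( f(x, y) - t(x - \alpha, y - \beta) \right)^2 dx\, dy = \iint_S f(x, y)^2\, dx\, dy + \iint_S t(x - \alpha, y - \beta)^2\, dx\, dy - 2 \iint_S f(x, y)\, t(x - \alpha, y - \beta)\, dx\, dy$$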
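Correlation Function (Similarity):
$$S(\alpha, \beta) = \iint_S f(x, y)\, t(x - \alpha, y - \beta)\, dx\, dy$$
Normalized Correlation Function:
$$S^*(\alpha, \beta) = \frac{\iint_S (f(x, y) - \bar{f})\,(t(x - \alpha, y - \beta) - \bar{t})\, dx\, dy}{\sqrt{\iint_S (f(x, y) - \bar{f})^2\, dx\, dy \iint_S (t(x - \alpha, y - \beta) - \bar{t})^2\, dx\, dy}}$$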
Image Processing by Correlation
Template Matching
$f(x, y)$, $t(x, y)$
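Template matching by normalized correlation can be sketched as an exhaustive search over displacements. The conventions here are assumptions made for illustration: images are lists of rows, the template's top-left corner is placed at column `alpha` and row `beta`, means are taken over the current window, and a flat window scores 0.

```python
import math


def normalized_correlation(f, t, alpha, beta):
    """Normalized correlation between template t and the window of f at (alpha, beta)."""
    th, tw = len(t), len(t[0])
    patch = [f[beta + y][alpha + x] for y in range(th) for x in range(tw)]
    tmpl = [t[y][x] for y in range(th) for x in range(tw)]
    f_mean = sum(patch) / len(patch)
    t_mean = sum(tmpl) / len(tmpl)
    num = sum((p - f_mean) * (q - t_mean) for p, q in zip(patch, tmpl))
    den = math.sqrt(sum((p - f_mean) ** 2 for p in patch)
                    * sum((q - t_mean) ** 2 for q in tmpl))
    return num / den if den > 0 else 0.0


def match_template(f, t):
    """Return (score, alpha, beta) of the best matching displacement, in pixel units."""
    th, tw = len(t), len(t[0])
    best = None
    for beta in range(len(f) - th + 1):
        for alpha in range(len(f[0]) - tw + 1):
            score = normalized_correlation(f, t, alpha, beta)
            if best is None or score > best[0]:
                best = (score, alpha, beta)
    return best


t = [[1, 2],
     [3, 4]]
# The window at alpha=2, beta=1 equals 2 * t + 10: same pattern, different bias and scale.
f = [[5, 5, 5, 5],
     [5, 5, 12, 14],
     [5, 5, 16, 18],
     [5, 5, 5, 5]]
score, alpha, beta = match_template(f, t)
```

Because the mean is subtracted and the result is divided by both norms, the brighter, higher-contrast copy of the template still reaches a score of about 1.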
[Figure: a 3D scene observed by a camera under light sources; a scene point P and its image points P' and P'']
Stereo Image Analysis
Finding the best matching point
• Resultant displacement is in units of pixels.
$f(x, y)$, $t(x, y)$
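Depth Measurement by Triangulation
The 3D depth $d$ of a scene point can be computed from a pair of matching image points in the left and right images.
[Figure: cameras 1 and 2, with their image planes, separated by the baseline $b$; the rays to the scene point have lengths $l_1$ and $l_2$ and make angles $\theta_1$ and $\theta_2$ with the baseline; $d$ is the unknown depth]
$$l_1 \cos\theta_1 + l_2 \cos\theta_2 = b, \qquad l_1 \sin\theta_1 = l_2 \sin\theta_2 = d$$
Eliminate $l_1$, $l_2$:
$$d = \frac{b}{\dfrac{1}{\tan\theta_1} + \dfrac{1}{\tan\theta_2}}$$
Here $1/\tan\theta$ is the cotangent, not the inverse tangent.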
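The triangulation formula can be checked numerically with a short sketch. It assumes the two angles are measured between the baseline and the viewing rays, in radians; `depth_from_angles` is a hypothetical name.

```python
import math


def depth_from_angles(b, theta1, theta2):
    """Depth d = b / (cot(theta1) + cot(theta2)), angles in radians."""
    return b / (1.0 / math.tan(theta1) + 1.0 / math.tan(theta2))


# Symmetric case: both rays at 45 degrees, baseline 2 -> depth 1.
d = depth_from_angles(2.0, math.pi / 4, math.pi / 4)

# Consistency check against the original constraints.
l1 = d / math.sin(math.pi / 4)
l2 = d / math.sin(math.pi / 4)
baseline = l1 * math.cos(math.pi / 4) + l2 * math.cos(math.pi / 4)
```

Since matching gives displacements in pixel units, the angles (and therefore the depth) are quantized, and the error grows as the rays become nearly parallel.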
Motion Analysis by Correlation
Observed 2D motion images T=2 T=F T=1
template image
Best matching position
Motion vector: (i0-x0, j0-y0)
Elastic (DP) Matching
[Figure: model and signal sequences, $P_1, P_2, P_3, P_4, \dots, P_n$ and $Q_1, Q_2, Q_3, Q_4, \dots, Q_m$, aligned along the two axes of a grid]
Q1 matches with P1 and P2
Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
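Minimum cost to $P(i, j)$:
$$E[P(i, j)] = \min \begin{cases} E[P(i, j-1)] + c(\langle P_i, Q_{j-1} \rangle, \langle P_i, Q_j \rangle) \\ E[P(i-1, j-1)] + c(\langle P_{i-1}, Q_{j-1} \rangle, \langle P_i, Q_j \rangle) \\ E[P(i-1, j)] + c(\langle P_{i-1}, Q_j \rangle, \langle P_i, Q_j \rangle) \end{cases}$$
The optimal paths to the three preceding points, $P(i, j-1)$, $P(i-1, j-1)$, and $P(i-1, j)$, have already been computed.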
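The recurrence can be turned into a small dynamic-programming table. This is a sketch under stated assumptions: the sequences are lists of numbers, and the transition cost is simplified to depend only on the pair being reached, taken here as the absolute difference of the two values. `dp_match` is a hypothetical name.

```python
def dp_match(P, Q):
    """Minimum total cost of elastically matching sequence P against sequence Q."""
    n, m = len(P), len(Q)
    inf = float("inf")
    E = [[inf] * m for _ in range(n)]
    E[0][0] = abs(P[0] - Q[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            # The three predecessors whose optimal costs are already known.
            best_prev = min(
                E[i][j - 1] if j > 0 else inf,
                E[i - 1][j - 1] if i > 0 and j > 0 else inf,
                E[i - 1][j] if i > 0 else inf,
            )
            E[i][j] = best_prev + abs(P[i] - Q[j])
    return E[n - 1][m - 1]


# The first element of Q is matched with the first two elements of P (a stretched match).
cost_stretched = dp_match([1, 1, 2, 3], [1, 2, 3])
cost_mismatch = dp_match([1, 2, 3], [1, 2, 4])
```

By the principle of optimality each table entry is computed once from its three neighbors, so the cost is proportional to `len(P) * len(Q)` rather than to the number of possible alignments.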
Report 3
Convolution and Correlation look very similar. Discuss their relation.
Convolution:
$$C(\alpha, \beta) = \iint_S f(x, y)\, t(\alpha - x, \beta - y)\, dx\, dy$$
Correlation:
$$S(\alpha, \beta) = \iint_S f(x, y)\, t(x - \alpha, y - \beta)\, dx\, dy$$
Computational Scheme of Statistical Pattern Classification
[Figure: Natural Pattern → Feature Measurement → Feature Selection (Extraction) → Classification → Class Name, with Learning from Sample Patterns (Training Samples)]
Data at each stage: Picture Data → Feature Set 1, $\mathbf{x} = (x_1, x_2, \dots, x_m)^T$ → Feature Set 2, $\mathbf{y} = (y_1, y_2, \dots, y_n)^T$ → Class Name $\alpha$
Feature Vector: $\mathbf{x} = (x_1, x_2, \dots, x_n)^T$, where $x_i$ is measurement $i$
Architecture of Statistical Pattern Classification Systems
[Figure: FEATURE VECTOR $(x_1, x_2, \dots, x_d)$ → DISCRIMINANT FUNCTIONS $g_1(\mathbf{x}), g_2(\mathbf{x}), \dots, g_c(\mathbf{x})$ → MAXIMUM SELECTOR (MAX) → DECISION $\alpha$]
Architecture of Pattern Classifiers
[1] Nearest Neighbor Classification
[Figure: feature vectors $X = (X_1, X_2, \dots, X_n)^T$ (e.g. height, weight, ..., age) plotted in the $(x_1, x_2)$ feature space; decision boundaries separate Class 1, Class 2, and Class 3]
Basic Scheme of Nearest Neighbor Classification
[Q1]What distance measures?
Measuring Unit Problem
[Figure: the same sample data plotted as height (cm) vs. weight (kg) and, after a unit change, as height (cm) vs. weight (g)]
Non-isotropic distance measure based on the shape of data distribution
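Distance between vectors $X = (x_1, x_2, \dots, x_n)^T$ and $Y = (y_1, y_2, \dots, y_n)^T$
1. Euclidean Distance: $\left[ \sum_{i=1}^{n} (x_i - y_i)^2 \right]^{1/2}$
2. $L_1$ Distance: $\sum_{i=1}^{n} |x_i - y_i|$
3. Similarity: $\cos\theta = \dfrac{X \cdot Y}{|X| \cdot |Y|}$
4. Mahalanobis Distance: $\left[ (X - M)^t\, \Sigma^{-1} (X - M) \right]^{1/2}$
($M$ : Mean Vector, $\Sigma$ : Covariance Matrix)
[Figure: vectors $X$ and $Y$ separated by the angle $\theta$]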
Distance Measures between a pair of feature vectors
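The four measures can be sketched in plain Python. To keep the example free of external libraries, the Mahalanobis distance is written only for the 2D case, using the closed-form inverse of a 2x2 covariance matrix; that restriction and the function names are choices made here, not part of the lecture.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def l1_distance(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def mahalanobis_2d(x, mean, cov):
    """sqrt((x - M)^t Sigma^-1 (x - M)) for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    q = (d0 * (d * d0 - b * d1) + d1 * (-c * d0 + a * d1)) / det
    return math.sqrt(q)


# Variance 4 along the first axis: a Euclidean offset of 2 is only 1 standard deviation.
m_dist = mahalanobis_2d([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]])
```

Because the Mahalanobis distance rescales each direction by the spread of the data, changing a measuring unit (for example kg to g) changes the covariance matrix in a compensating way and leaves the distance unchanged.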
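$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix}, \qquad \text{Mean: } \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{pmatrix}, \qquad \text{Covariance Matrix: } \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix}$$
Parameter Estimation from sample set $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N$:
$$\mu_i = \frac{1}{N} \sum_{l=1}^{N} x_{li}, \qquad i = 1, 2, \dots, p$$
$$\sigma_{jk} = \frac{1}{N - 1} \sum_{l=1}^{N} (x_{lj} - \mu_j)(x_{lk} - \mu_k), \qquad j = 1, 2, \dots, p, \quad k = 1, 2, \dots, p$$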
Mean and Covariance matrix of data distribution
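Estimating the mean vector and covariance matrix from samples can be sketched directly from the definitions, assuming each sample is a list of p numbers and the covariance uses the N - 1 normalization. The sample values below are made up for illustration.

```python
def mean_vector(samples):
    """mu_i = (1 / N) * sum over samples of the i-th component."""
    n, p = len(samples), len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(p)]


def covariance_matrix(samples):
    """sigma_jk = (1 / (N - 1)) * sum over samples of (x_j - mu_j) * (x_k - mu_k)."""
    n, p = len(samples), len(samples[0])
    mu = mean_vector(samples)
    return [[sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in samples) / (n - 1)
             for k in range(p)] for j in range(p)]


samples = [[1, 2], [3, 6], [5, 10]]
mu = mean_vector(samples)
sigma = covariance_matrix(samples)
```

In this toy set the second feature is exactly twice the first, so the covariance matrix is singular; a Mahalanobis distance built on it would need more varied samples.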
[Figure: (a) bivariate normal density and (b) scatter diagram, drawn over the axes $x_1$, $x_2$ with mean $(\mu_1, \mu_2)$]
Two representations of a normal density.
n-dimensional Normal Distribution
[Q2]Distance between which entities?
Distance to distribution centers
[Figure: a feature vector $X = (X_1, X_2, \dots, X_n)^T$ in the $(x_1, x_2)$ feature space with Class 1, Class 2, Class 3 and a decision boundary]
Distance to sample data
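• Decision rule:
  – Find the k nearest neighbor sample data.
  – Find the most popular class by voting among the k nearest neighbor sample data.
[Figure: the input feature vector $\mathbf{x}$ and sample data $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_j, \dots, \mathbf{x}_n$ of classes $\omega_1$ and $\omega_2$ in the n-dimensional feature space]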
Distance by voting: k-nearest neighbor classification
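The decision rule above can be sketched as follows. The sketch assumes the Euclidean distance, labeled samples given as `(vector, label)` pairs, and made-up training data; `knn_classify` is a hypothetical name.

```python
import math
from collections import Counter


def knn_classify(x, samples, k=3):
    """Vote among the k samples nearest to x (Euclidean distance)."""
    neighbors = sorted(samples, key=lambda s: math.dist(x, s[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


samples = [
    ((0.0, 0.0), "w1"), ((0.0, 1.0), "w1"), ((1.0, 0.0), "w1"),
    ((5.0, 5.0), "w2"), ((5.0, 6.0), "w2"), ((6.0, 5.0), "w2"),
]
label_a = knn_classify((0.5, 0.5), samples, k=3)
label_b = knn_classify((5.2, 5.1), samples, k=3)
```

No distribution is estimated: the sample data themselves act as the model, so the result depends on the chosen distance measure and on k.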
Report 4
Compare the performance between 1. the nearest neighbor classification with the Mahalanobis distance and 2. the k-nearest neighbor classification in the following case.
[2]Statistical Pattern Classification
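$\Omega = \{\omega_1, \dots, \omega_s\}$: the finite set of $s$ states of nature, i.e. object classes
$A = \{\alpha_1, \dots, \alpha_a\}$: the finite set of $a$ possible actions, i.e. diagnoses
$\lambda(\alpha_i \mid \omega_j)$: the loss incurred for taking action $\alpha_i$ when the state of nature is $\omega_j$
Conditional risk:
$$R(\alpha_i \mid \mathbf{x}) = \sum_{j=1}^{s} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid \mathbf{x})$$
Two category case:
$$R(\alpha_1 \mid \mathbf{x}) = \lambda_{11} P(\omega_1 \mid \mathbf{x}) + \lambda_{12} P(\omega_2 \mid \mathbf{x})$$
$$R(\alpha_2 \mid \mathbf{x}) = \lambda_{21} P(\omega_1 \mid \mathbf{x}) + \lambda_{22} P(\omega_2 \mid \mathbf{x})$$
$$P(\omega_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_i)\, P(\omega_i)}{p(\mathbf{x})}, \quad \text{where } p(\mathbf{x}) = \sum_{j=1}^{s} p(\mathbf{x} \mid \omega_j)\, P(\omega_j).$$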
Statistical Pattern Classification (BAYES DECISION THEORY)
A priori probability
a posteriori probability
Bayesian Decision Rule
Probability distribution
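[Figure: the class-conditional probability distributions $p(x \mid \omega_1)$ and $p(x \mid \omega_2)$ plotted over $x$]
Decide $X \in \omega_i$ if and only if $p(x \mid \omega_i) \ge p(x \mid \omega_j)$ for all $j = 1, 2, \dots, m$.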
Maximum-Likelihood Classifier
(regard p(x|ωi) as a function of ωi)
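[Figure: the curves $P(\omega_1)\, p(x \mid \omega_1)$ and $P(\omega_2)\, p(x \mid \omega_2)$, the decision regions $R_1$ and $R_2$ separated by the boundary $B$, and the error areas $E_{12}$ and $E_{21}$]
$$P_E = E_{12} + E_{21}$$
where
$$E_{12} = \int_{R_2} P(\omega_1)\, p(x \mid \omega_1)\, dx, \qquad E_{21} = \int_{R_1} P(\omega_2)\, p(x \mid \omega_2)\, dx$$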
Minimal Error Rate Classifier
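A minimal-error-rate decision picks the class with the largest product of prior and class-conditional density. The sketch below assumes one-dimensional normal class-conditional densities with known parameters; the numbers and the function names are illustrative only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the 1D normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def bayes_decide(x, priors, params):
    """Return the index i maximizing P(w_i) * p(x | w_i)."""
    scores = [prior * normal_pdf(x, mu, sigma)
              for prior, (mu, sigma) in zip(priors, params)]
    return max(range(len(scores)), key=scores.__getitem__)


params = [(0.0, 1.0), (4.0, 1.0)]                      # (mean, standard deviation) per class
equal_priors = bayes_decide(2.2, [0.5, 0.5], params)   # boundary at x = 2
skewed_priors = bayes_decide(2.2, [0.9, 0.1], params)  # boundary moves toward class 2
```

With equal priors the rule reduces to the maximum-likelihood classifier; a larger prior for the first class shifts the boundary so that the same observation is assigned differently.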
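【1】 Assume Normal Distribution:
$$p(\mathbf{x} \mid \omega_1) = N(\boldsymbol{\mu}_1, \Sigma_1), \qquad p(\mathbf{x} \mid \omega_2) = N(\boldsymbol{\mu}_2, \Sigma_2)$$
Estimate $\boldsymbol{\mu}_1, \boldsymbol{\mu}_2, \Sigma_1, \Sigma_2$ from sample data.
【2】 Assume Mixture of Normal Distributions:
$$p(\mathbf{x} \mid \omega_1) = \sum_{i=1}^{N_1} a_i\, N(\boldsymbol{\mu}_i, \Sigma_i), \qquad p(\mathbf{x} \mid \omega_2) = \sum_{j=1}^{N_2} b_j\, N(\boldsymbol{\mu}_j, \Sigma_j)$$
Estimate $a_i$, $b_j$ as well as $\boldsymbol{\mu}_i, \boldsymbol{\mu}_j, \Sigma_i, \Sigma_j$ from sample data.
It is difficult to estimate $N_1$ and $N_2$.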
Estimation of Probability Distribution Function
Report 5
(a) Using histograms computed from sample data as probability distribution functions is not effective due to over-fitting, over-training. Explain why.
(b) Discuss differences between Probability Theory and Statistics.
[3]Linear Discriminant Function
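Linear Discriminant Functions:
$$g_1(\mathbf{x}) = \mathbf{w}_1^t \mathbf{x} + w_1, \qquad g_2(\mathbf{x}) = \mathbf{w}_2^t \mathbf{x} + w_2$$
Decision Boundary:
$$g_1(\mathbf{x}) = g_2(\mathbf{x}) \;\Rightarrow\; g(\mathbf{x}) = (\mathbf{w}_1 - \mathbf{w}_2)^t \mathbf{x} + (w_1 - w_2) = 0$$
Let us denote it as $g(\mathbf{x}) = \mathbf{w}^t \mathbf{x} + w_0$, and make the decision based on the sign of $g(\mathbf{x})$.
[Figure: Class 1 and Class 2 sample data separated by the boundary $g = 0$ into the regions $g > 0$ and $g < 0$]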
Two Class Linear Discriminant Function
Geometric Representation
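Let $\mathbf{x}_1$ and $\mathbf{x}_2$ denote points on the decision boundary. Then,
$$\mathbf{w}^t \mathbf{x}_1 + w_0 = \mathbf{w}^t \mathbf{x}_2 + w_0 \;\Rightarrow\; \mathbf{w}^t (\mathbf{x}_1 - \mathbf{x}_2) = 0.$$
This means that $\mathbf{w}$ is orthogonal to the hyperplane defined by $g(\mathbf{x}) = 0$.
Let $\mathbf{x}_p$ denote the orthogonal projection of $\mathbf{x}$ onto $g(\mathbf{x}) = 0$. Then we can represent $\mathbf{x}$ as follows:
$$\mathbf{x} = \mathbf{x}_p + r \frac{\mathbf{w}}{\|\mathbf{w}\|},$$
from which we have
$$g(\mathbf{x}) = \mathbf{w}^t \left( \mathbf{x}_p + r \frac{\mathbf{w}}{\|\mathbf{w}\|} \right) + w_0 = g(\mathbf{x}_p) + r \|\mathbf{w}\|.$$
Then since $g(\mathbf{x}_p) = 0$, $g(\mathbf{x}) = r \|\mathbf{w}\|$, $r = g(\mathbf{x}) / \|\mathbf{w}\|$.
$\mathbf{w}$ is heading toward the positive region of $g(\mathbf{x})$.
[Figure: the hyperplane $g = 0$ with the regions $g > 0$ and $g < 0$; a point $\mathbf{x}$ at distance $|g(\mathbf{x})| / \|\mathbf{w}\|$ from its projection $\mathbf{x}_p$; the hyperplane at distance $|w_0| / \|\mathbf{w}\|$ from the origin; axes $x_1$, $x_2$]
Note that the coefficient vector $\mathbf{w}$ and the feature vector $\mathbf{x}$ are represented in the same coordinate system!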
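The geometric reading of a two-class linear discriminant function, with signed distance equal to the function value divided by the norm of the coefficient vector, can be checked with a short sketch. The coefficient values are made up, and `signed_distance` is a hypothetical name.

```python
import math


def g(x, w, w0):
    """Linear discriminant function g(x) = w^t x + w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def signed_distance(x, w, w0):
    """r = g(x) / ||w||: positive on the side that w points toward."""
    return g(x, w, w0) / math.sqrt(sum(wi * wi for wi in w))


w, w0 = [3.0, 4.0], -5.0                         # ||w|| = 5
r_point = signed_distance([3.0, 4.0], w, w0)     # g = 20
r_origin = signed_distance([0.0, 0.0], w, w0)    # g = w0 = -5
```

The sign of the result gives the class decision, and the magnitude gives how far the point lies from the decision boundary.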
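【STEP 1】 Extend the feature and coefficient vectors:
$$\mathbf{y} = \begin{pmatrix} 1 \\ \mathbf{x} \end{pmatrix}, \qquad \mathbf{a} = \begin{pmatrix} w_0 \\ \mathbf{w} \end{pmatrix}$$
Then the discriminant function is represented by $g(\mathbf{x}) = \mathbf{w}^t \mathbf{x} + w_0 = \mathbf{a}^t \mathbf{y}$.
【STEP 2】 Flip the sign of all sample data of class $\omega_2$: if $\mathbf{y} \in \omega_2$ then $\mathbf{y} \to -\mathbf{y}$.
Then, find $\mathbf{a}$ such that $\mathbf{a}^t \mathbf{y}_i > 0$ for all sample $\mathbf{y}_i$.
【STEP 3】 Minimize the Perceptron Criterion Function
$$J_p(\mathbf{a}) = \sum_{\mathbf{y} \in Y(\mathbf{a})} (-\mathbf{a}^t \mathbf{y}), \qquad \text{where } Y(\mathbf{a}): \text{misclassified sample data } (\mathbf{a}^t \mathbf{y} < 0)$$
GRADIENT DESCENT PROCEDURE:
$$\mathbf{a}_{k+1} = \mathbf{a}_k - \rho_k \nabla J_p(\mathbf{a}_k) = \mathbf{a}_k + \rho_k \sum_{\mathbf{y} \in Y(\mathbf{a}_k)} \mathbf{y}$$
【STEP 3'】 SINGLE SAMPLE CORRECTION PROCEDURE:
cyclic sequence of sample data: $\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3, \mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3, \mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3, \dots$
$\mathbf{a}_1$: arbitrary initial parameter
$$\mathbf{a}_{k+1} = \mathbf{a}_k + \mathbf{y}^k, \qquad \mathbf{y}^k: \text{sample misclassified by } \mathbf{a}_k$$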
Learning the coefficient vector from sample data
[Figure: STEP 3, the gradient descent procedure, illustrated with the sample vectors $\mathbf{y}_1$, $\mathbf{y}_2$, the coefficient vector $\mathbf{a}$, and the half-space $\mathbf{a}^t \mathbf{y}_1 > 0$]
Note that the coefficient vector and the feature vector are represented in the same coordinate system.
[Figure: STEP 3', the single sample correction procedure: the coefficient vector $\mathbf{a}_k$, the sample $\mathbf{y}_k$ with its half-space $\mathbf{a}^t \mathbf{y}_k > 0$, and the corrected vector $\mathbf{a}_{k+1}$]
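The single sample correction procedure can be sketched as a short training loop. Assumptions made for this sketch: the initial coefficient vector is zero, a sample counts as misclassified when the inner product is not strictly positive (so that the zero start gets corrected), the correction step size is 1, and the loop gives up after a fixed number of passes. The data and the name `train_perceptron` are illustrative.

```python
def train_perceptron(class1, class2, max_epochs=100):
    """Return a with a^t y > 0 for all extended, sign-flipped samples y."""
    # STEP 1: extend each feature vector with a leading 1.
    # STEP 2: flip the sign of all class-2 samples.
    ys = [[1.0, *x] for x in class1] + [[-1.0, *(-v for v in x)] for x in class2]
    a = [0.0] * len(ys[0])
    for _ in range(max_epochs):
        corrected = False
        for y in ys:  # cyclic sequence of sample data
            if sum(ai * yi for ai, yi in zip(a, y)) <= 0:
                # STEP 3': add the misclassified sample to the coefficient vector.
                a = [ai + yi for ai, yi in zip(a, y)]
                corrected = True
        if not corrected:
            return a
    raise ValueError("no separating vector found; data may not be linearly separable")


class1 = [(2.0, 2.0), (3.0, 1.0)]
class2 = [(-1.0, -1.0), (0.0, -2.0)]
a = train_perceptron(class1, class2)
```

For linearly separable data the loop stops after finitely many corrections; which separating vector it returns depends on the initial value and the order of the samples.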
Report 6
Describe how we can generalize the two-class linear classifier to a multi-class classifier.
[Figure: Class 1 and Class 2 sample data separated by the boundary $g = 0$ into the regions $g > 0$ and $g < 0$]
Maximize the margin
Optimizing Generalization Capability in Linear Discriminant Function
Note that when two class sample data are linearly separable, we can get a discriminant function separating them. But the function is not unique!
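Find $\mathbf{a}$ such that $g(\mathbf{x}) = \mathbf{a}^t \mathbf{y}_i > 0$ for all sample data $\mathbf{y}_i$.
[Figure: the half-space $\mathbf{a}^t \mathbf{y}_1 > 0$ and the narrower half-space $\hat{\mathbf{a}}^t \mathbf{y}_1 > b$, with margin $b$]
Find $\hat{\mathbf{a}}$ such that $g(\mathbf{x}) = \hat{\mathbf{a}}^t \mathbf{y}_i > b > 0$ for all sample data $\mathbf{y}_i$.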
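[Figure: linearly non-separable sample data in the $(x_1, x_2)$ plane]
Linearly non-separable
Extend the feature vector by a non-linear mapping:
$$x_1 \to y_1, \qquad x_2 \to y_2, \qquad x_1 x_2 \to y_3$$
Find the linear discriminant function in the extended feature space:
$$g(\mathbf{y}) = y_1 + y_2 - 2 y_3 - \tfrac{1}{3}$$
This is a non-linear discriminant function in the original feature space:
$$g(\mathbf{x}) = x_1 + x_2 - 2 x_1 x_2 - \tfrac{1}{3}$$
[Figure: the mapping from $\mathbf{x}$ to $\mathbf{y}$, with the extended feature space axes $y_1$, $y_2$, $y_3$]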
Generalized Linear Discriminant Function
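A generalized linear discriminant function can be tried on the classic XOR arrangement, which no straight line separates. The mapping (adding the product of the two features) and the coefficients 1, 1, -2, -1/3 are assumptions used for illustration; the point is only that a function linear in the extended features is non-linear in the original ones.

```python
def extend(x):
    """Non-linear mapping (x1, x2) -> (y1, y2, y3) = (x1, x2, x1 * x2)."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def g(y):
    """Linear in the extended space; the coefficients are illustrative assumptions."""
    y1, y2, y3 = y
    return y1 + y2 - 2 * y3 - 1 / 3


# XOR-like layout: one class at (0,1) and (1,0), the other at (0,0) and (1,1).
positive = [g(extend(p)) for p in [(0, 1), (1, 0)]]
negative = [g(extend(p)) for p in [(0, 0), (1, 1)]]
```

Any learning procedure for linear discriminant functions can then be run unchanged on the extended vectors.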
[Figure: Perceptron with S, A, and R layers. Random connections between layers have coefficients $\pm 1$; complete connections have coefficients $\omega_{ij}$; a unit's output is decided by comparing a weighted sum of its inputs with a threshold $T$.]
Perceptron
The Perceptron is a linear discriminant function.
[Figure: Natural Pattern → Feature Measurement → Feature Selection (Extraction) → Classification → Class Name, with Learning from Sample Patterns (Training Samples), as above]
Methods used in these stages:
Principal Component Analysis, Independent Component Analysis, Discriminant Analysis, Multivariate Analysis
k-NN method, Bayesian Method, Sub-space method, Support Vector Machine, Hidden Markov Model, Dynamic Programming
Model Fitting, Clustering, Self-Organizing Map, Multidimensional Scaling, ML Estimation, EM Algorithm
Architecture of Statistical Pattern Recognition Systems
【Artifacts】
1. Highly correlated features are included: the recognition rate does not improve as expected.
2. Sparse distribution: the curse of dimensionality (Hughes effect) requires larger training samples for learning.
3. The recognition error rate may even increase because of over-fitting!
Select useful features from observed features. (Pattern recognition systems should be designed to recognize UNKNOWN data correctly!)
Myth: Increase features to improve recognition rate.