Recent Trends in Fuzzy Clustering: From Data to Knowledge
Witold Pedrycz, Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada, and Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland. [email protected]. Shenyang, August 2009.
Agenda
Introduction: clustering, information granulation and paradigm shift
Key challenges in clustering
Fuzzy objective-based clustering
Knowledge-based augmentation of fuzzy clustering
Collaborative fuzzy clustering
Concluding comments
Clustering
Areas of research and applications:
• Data analysis
• Modeling
• Structure determination
Google Scholar: 2,190,000 hits for “clustering” (as of August 6, 2009)
Clustering as a conceptual and algorithmic framework of information granulation
Data → information granules (clusters) → abstraction of data
Formalisms: set theory (K-Means), fuzzy sets (FCM), rough sets, shadowed sets
Main categories of clustering
Graph-oriented and hierarchical (single linkage, complete linkage, average linkage, ...)
Objective function-based clustering
Diversity of formalisms and optimization tools (e.g., methods of Evolutionary Computing)
Key challenges of clustering
Data-driven methods
Selection of distance function (geometry of clusters)
Number of clusters
Quality of clustering results
Fuzzy Clustering: Fuzzy C-Means (FCM)
Given data x1, x2, …, xN, determine its structure by forming a collection of information granules (fuzzy sets)
Objective function
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
Minimize Q; structure in data (partition matrix and prototypes)
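FCM: optimization
Minimize
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
with respect to
(a) prototypes $v_1, v_2, \ldots, v_c$
(b) partition matrix $U = [u_{ik}]$
subject to $\sum_{i=1}^{c} u_{ik} = 1$ for each $k$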
Optimization - details
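Partition matrix: the use of Lagrange multipliers (for a fixed data point $x_k$, with $t = k$)
$$V = \sum_{i=1}^{c} u_{ik}^{m} d_{ik}^{2} + \lambda \left( \sum_{i=1}^{c} u_{ik} - 1 \right)$$
$$\frac{\partial V}{\partial u_{st}} = 0, \qquad \frac{\partial V}{\partial \lambda} = 0$$
$d_{ik}^{2} = \|x_k - v_i\|^2$
$\lambda$: Lagrange multiplier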
Optimization – partition matrix (1)
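$$V = \sum_{i=1}^{c} u_{ik}^{m} d_{ik}^{2} + \lambda \left( \sum_{i=1}^{c} u_{ik} - 1 \right), \qquad \frac{\partial V}{\partial \lambda} = 0, \quad \frac{\partial V}{\partial u_{st}} = 0$$
$$\frac{\partial V}{\partial u_{st}} = m \, u_{st}^{m-1} d_{st}^{2} + \lambda = 0$$
$$u_{st} = \left( -\frac{\lambda}{m} \right)^{\frac{1}{m-1}} \frac{1}{d_{st}^{\frac{2}{m-1}}}$$
The normalization $\sum_{j=1}^{c} u_{jt} = 1$ gives
$$\left( -\frac{\lambda}{m} \right)^{\frac{1}{m-1}} \sum_{j=1}^{c} \frac{1}{d_{jt}^{\frac{2}{m-1}}} = 1 \quad \Rightarrow \quad \left( -\frac{\lambda}{m} \right)^{\frac{1}{m-1}} = \frac{1}{\sum_{j=1}^{c} d_{jt}^{-\frac{2}{m-1}}}$$
$$u_{st} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{d_{st}^{2}}{d_{jt}^{2}} \right)^{\frac{1}{m-1}}}$$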
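The closed-form membership update can be checked numerically. The following is a minimal sketch in plain Python for a single data point; the function name `update_memberships` and the handling of a zero distance (full membership shared among the coinciding prototypes) are assumptions made for illustration and are not part of the slides.

```python
def update_memberships(sq_dists, m=2.0):
    """Return u_st = 1 / sum_j (d_st^2 / d_jt^2)^(1/(m-1)) for one data point t.

    sq_dists[s] is the squared distance d_st^2 between the point and prototype s.
    """
    # Assumption (not in the slides): a point lying exactly on one or more
    # prototypes shares full membership among them.
    zeros = [s for s, d in enumerate(sq_dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if s in zeros else 0.0
                for s in range(len(sq_dists))]
    power = 1.0 / (m - 1.0)
    return [1.0 / sum((d_s / d_j) ** power for d_j in sq_dists)
            for d_s in sq_dists]


u = update_memberships([1.0, 4.0, 9.0], m=2.0)
print(u, sum(u))  # memberships decrease with distance and sum to 1
```

For m = 2 and squared distances 1, 4, 9 the memberships are 36/49, 9/49 and 4/49, which satisfies the constraint that was enforced through the Lagrange multiplier.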
Optimization- prototypes (2)
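$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \sum_{j=1}^{n} (x_{kj} - v_{ij})^{2}$$
Gradient of Q with respect to $v_s$ set to zero:
$$\sum_{k=1}^{N} u_{sk}^{m} (x_{kt} - v_{st}) = 0$$
$$v_{st} = \frac{\sum_{k=1}^{N} u_{sk}^{m} \, x_{kt}}{\sum_{k=1}^{N} u_{sk}^{m}}$$
(Euclidean distance)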
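The prototype formula is a weighted mean of the data with weights $u_{sk}^{m}$. A minimal sketch, assuming the partition matrix is stored as a list of rows `U[s][k]` and the data as `data[k][t]` (both layouts and the name `update_prototypes` are illustrative choices):

```python
def update_prototypes(data, U, m=2.0):
    """Return v_st = sum_k u_sk^m x_kt / sum_k u_sk^m for every cluster s."""
    n_features = len(data[0])
    prototypes = []
    for row in U:                       # one row of U per cluster s
        weights = [u ** m for u in row]
        total = sum(weights)
        prototypes.append([
            sum(w * x[t] for w, x in zip(weights, data)) / total
            for t in range(n_features)
        ])
    return prototypes


data = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
U = [[1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
print(update_prototypes(data, U))  # [[1.0, 0.0], [10.0, 10.0]]
```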
Fuzzy C-Means (FCM): An overview
procedure FCM-CLUSTERING (x) returns prototypes and partition matrix
  input: data x = {x1, x2, ..., xN}
  local: fuzzification parameter: m
         threshold: ε
         norm: ||.||
  INITIALIZE-PARTITION-MATRIX
  t ← 0
  repeat
    for i = 1:c do
      $v_i(t) = \sum_{k=1}^{N} u_{ik}^{m}(t) \, x_k \Big/ \sum_{k=1}^{N} u_{ik}^{m}(t)$   (compute prototypes)
    for i = 1:c do
      for k = 1:N do
        $u_{ik}(t+1) = 1 \Big/ \sum_{j=1}^{c} \left( \|x_k - v_i(t)\| / \|x_k - v_j(t)\| \right)^{2/(m-1)}$   (update partition matrix)
    t ← t + 1
  until ||U(t) − U(t−1)|| ≤ ε
  return U, V
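The procedure can be turned into a short runnable sketch. The version below is plain Python and makes several assumptions that the slides leave open: random initialization with a fixed seed, the maximum absolute change of U as the norm in the stopping test, an iteration cap `max_iter`, and a small floor on squared distances to avoid division by zero. The name `fcm` is hypothetical.

```python
import random


def fcm(data, c, m=2.0, eps=1e-6, max_iter=200, seed=0):
    """Fuzzy C-Means sketch; returns (U, V) with U[i][k] and V[i][t]."""
    rng = random.Random(seed)
    N, n = len(data), len(data[0])
    # INITIALIZE-PARTITION-MATRIX: random entries, columns normalized to 1
    U = [[rng.random() for _ in range(N)] for _ in range(c)]
    for k in range(N):
        col = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= col
    V = []
    for _ in range(max_iter):
        # compute prototypes as u^m-weighted means
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(N)]
            total = sum(w)
            V.append([sum(w[k] * data[k][t] for k in range(N)) / total
                      for t in range(n)])
        # update partition matrix
        U_new = [[0.0] * N for _ in range(c)]
        for k in range(N):
            d2 = [max(sum((data[k][t] - V[i][t]) ** 2 for t in range(n)), 1e-12)
                  for i in range(c)]  # floor is an assumption (zero distances)
            for i in range(c):
                U_new[i][k] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0))
                                        for j in range(c))
        # stopping test: max absolute change (assumed choice of norm)
        change = max(abs(U_new[i][k] - U[i][k])
                     for i in range(c) for k in range(N))
        U = U_new
        if change <= eps:
            break
    return U, V


points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
U, V = fcm(points, c=2)
print(sorted(V))
```

On two well-separated groups such as `points`, the two prototypes settle close to the group centers and every column of U sums to 1.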
Domain knowledge: categories of knowledge-oriented guidance
Partially labeled data: some data are provided with labels (classes)
Proximity knowledge: some pairs of data are quantified in terms of their proximity (closeness)
Viewpoints: some structural information is provided
Context-based guidance: clustering realized in a certain context specified with regard to some attribute
Clustering with domain knowledge (knowledge-based clustering)
[Figure, two panels. Data-driven: Data → CLUSTERING → Information granules (structure). Data- and knowledge-driven: Data + Domain knowledge → CLUSTERING → Information granules (structure).]
Context-based clustering
To align the agenda of fuzzy clustering with the principles of fuzzy modeling, the following features are considered:
Active role of the designer [customization of the model]
The structural backbone of the model is fully reflective of relationships between information granules in the input and output space
Clustering: construct clusters in input space X
Context-based clustering: construct clusters in input space X given some context expressed in output space Y
Context-based clustering: computing considerations
• computationally more efficient
• well-focused
• designer-guided clustering process
[Figure: clustering without context (data → structure) versus clustering with context (data + context → structure)]
Context-based clustering: context design
Context: a hint (piece of domain knowledge) provided by the designer, who actively impacts the development of the model. As such, the context is imposed by the designer at the beginning.
Realization of context
Designer's focus: information granule (fuzzy set)
Realized (a) by the designer, or (b) by clustering of scalar data in the output space
Context: a fuzzy set (or set) formed in the output space
Context-based clustering: modeling
Determine structure in input space given the output is high
Determine structure in input space given the output is medium
Determine structure in input space given the output is low
Input space (data)
Context-based clustering: examples
Find a structure of customer data [clustering] (no context)
Find a structure of customer data considering customers making weekly purchases in the range [$1,000, $3,000] (context)
Find a structure of customer data considering customers making weekly purchases at the level of around $2,500 (context)
Find a structure of customer data considering customers making significant weekly purchases who are young (compound context)
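The contexts in these examples can be expressed as membership functions over the output variable. The sketch below is illustrative only: the triangular shape and the spread of $500 for "around $2,500", and the use of the minimum to combine contexts into a compound one, are assumptions rather than choices stated in the slides.

```python
def interval_context(low, high):
    """Set-based context: membership 1 inside [low, high], 0 outside."""
    return lambda y: 1.0 if low <= y <= high else 0.0


def triangular_context(center, spread):
    """Fuzzy context 'around center'; the triangular shape is an assumption."""
    return lambda y: max(0.0, 1.0 - abs(y - center) / spread)


def compound_context(*memberships):
    """Combine membership grades of several contexts with min (assumed t-norm)."""
    return min(memberships)


in_range = interval_context(1000.0, 3000.0)       # purchases in [$1,000, $3,000]
around_2500 = triangular_context(2500.0, 500.0)   # purchases around $2,500

print(in_range(2000.0), around_2500(2750.0))      # 1.0 0.5
print(compound_context(around_2500(2750.0), 0.8)) # 0.5
```

The resulting grades play the role of the context memberships attached to each data point in context-oriented FCM.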
Context-oriented FCM
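Data $(x_k, \mathrm{target}_k)$, k = 1, 2, …, N
Contexts: fuzzy sets W1, W2, …, Wp
$w_{jk} = W_j(\mathrm{target}_k)$: membership of the k-th data point in the j-th context
Context-driven partition matrix
$$U(W_j) = \left\{ u_{ik} \in [0,1] \;\Big|\; \sum_{i=1}^{c} u_{ik} = w_{jk} \ \forall k \ \text{ and } \ 0 < \sum_{k=1}^{N} u_{ik} < N \ \forall i \right\}$$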
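Context-oriented FCM: optimization flow
Objective function
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2$$
Iterative adjustment of partition matrix and prototypes
$$u_{ik} = \frac{w_{jk}}{\sum_{l=1}^{c} \left( \dfrac{\|x_k - v_i\|}{\|x_k - v_l\|} \right)^{\frac{2}{m-1}}}$$
$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}$$
Subject to constraint $U \in U(W_j)$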
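The only change relative to standard FCM is that the memberships of a data point sum to its context grade instead of 1. A minimal sketch for one data point, assuming all distances are nonzero (the function name `context_memberships` is hypothetical):

```python
def context_memberships(sq_dists, w, m=2.0):
    """Context-based update for one data point k.

    sq_dists[i] is the squared distance ||x_k - v_i||^2 (assumed nonzero);
    w is the context membership w_jk of the point.
    """
    power = 1.0 / (m - 1.0)
    return [w / sum((d_i / d_l) ** power for d_l in sq_dists)
            for d_i in sq_dists]


u = context_memberships([1.0, 4.0], w=0.5)
print(u, sum(u))  # memberships sum to the context grade 0.5
```

A point with context grade 0 receives zero membership in every cluster and therefore has no influence on the prototypes, which is what focuses the clustering on the chosen context.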
Viewpoints: definition
Description of an entity (concept) that is deemed essential in describing a phenomenon (system) and helpful in casting an overall analysis in a required setting
“external” , “reinforced” clusters
[Figure: data plotted in the (x1, x2) plane, with two panels illustrating viewpoint (a, b) and viewpoint (a, ?)]
Viewpoints in fuzzy clustering
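[Figure: data in the (x1, x2) plane with viewpoints a (on x1) and b (on x2)]
$$b_{ij} = \begin{cases} 1, & \text{if the } j\text{-th feature of the } i\text{-th row of } B \text{ is determined by the viewpoint} \\ 0, & \text{otherwise} \end{cases}$$
$$B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad F = \begin{bmatrix} a & 0 \\ 0 & b \\ 0 & 0 \end{bmatrix}$$
B: Boolean matrix characterizing structure: viewpoints versus prototypes (induced by data)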
Viewpoints in fuzzy clustering
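$$Q = \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{\substack{j=1 \\ i,j:\, b_{ij}=0}}^{n} (x_{kj} - v_{ij})^{2} + \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{\substack{j=1 \\ i,j:\, b_{ij}=1}}^{n} (x_{kj} - f_{ij})^{2}$$
$$g_{ij} = \begin{cases} v_{ij}, & \text{if } b_{ij} = 0 \\ f_{ij}, & \text{if } b_{ij} = 1 \end{cases}$$
$$Q = \sum_{k=1}^{N} \sum_{i=1}^{c} u_{ik}^{m} \sum_{j=1}^{n} (x_{kj} - g_{ij})^{2}$$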
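The matrix G simply overrides prototype coordinates wherever a viewpoint fixes them. A minimal sketch, assuming B, F and V are stored as c × n lists of rows; the function names and the numeric values a = 3, b = 7 are hypothetical:

```python
def merge_prototypes(B, F, V):
    """g_ij = f_ij where b_ij = 1 (viewpoint), else v_ij (induced by data)."""
    return [[F[i][j] if B[i][j] == 1 else V[i][j]
             for j in range(len(B[i]))]
            for i in range(len(B))]


def objective(data, U, G, m=2.0):
    """Q = sum_k sum_i u_ik^m sum_j (x_kj - g_ij)^2."""
    return sum((U[i][k] ** m) * sum((x[j] - G[i][j]) ** 2 for j in range(len(x)))
               for i in range(len(G)) for k, x in enumerate(data))


a, b = 3.0, 7.0                        # hypothetical viewpoint values
B = [[1, 0], [0, 1], [0, 0]]
F = [[a, 0.0], [0.0, b], [0.0, 0.0]]
V = [[1.0, 2.0], [4.0, 5.0], [6.0, 8.0]]
G = merge_prototypes(B, F, V)
print(G)  # [[3.0, 2.0], [4.0, 7.0], [6.0, 8.0]]
```

In an iterative scheme only the coordinates with $b_{ij} = 0$ would be re-estimated from data; the coordinates supplied by viewpoints stay fixed.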
Labelled data and their description
Characterization in terms of membership degrees:
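F = [fik], i = 1, 2, …, c, k = 1, 2, …, N
and supervision indicator b = [bk], k = 1, 2, …, N
Augmented objective function
$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{2} \, \|x_k - v_i\|^2 + \beta \sum_{i=1}^{c} \sum_{k=1}^{N} (u_{ik} - f_{ik})^{2} \, b_k \, \|x_k - v_i\|^2, \qquad \beta > 0$$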
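To see how the supervision term behaves, the augmented objective can be evaluated directly. This is a minimal sketch with hypothetical names (`sq_dist`, `augmented_objective`); it only evaluates Q and does not implement the optimization.

```python
def sq_dist(x, v):
    """Squared Euclidean distance ||x - v||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def augmented_objective(data, U, V, F, b, beta):
    """Q = sum u_ik^2 d_ik^2 + beta * sum (u_ik - f_ik)^2 b_k d_ik^2."""
    q = 0.0
    for i, v in enumerate(V):
        for k, x in enumerate(data):
            d2 = sq_dist(x, v)
            q += U[i][k] ** 2 * d2                               # data-driven part
            q += beta * (U[i][k] - F[i][k]) ** 2 * b[k] * d2     # supervision part
    return q


data = [[0.0], [2.0]]
V = [[0.0], [2.0]]
U = [[0.5, 0.5], [0.5, 0.5]]
F = [[1.0, 0.0], [0.0, 1.0]]   # labels expressed as membership degrees
b = [1, 0]                     # only the first data point is labelled
print(augmented_objective(data, U, V, F, b, beta=2.0))  # 4.0
```

With `beta=0.0` the same call returns 2.0, so the remaining 2.0 is the penalty for disagreeing with the label of the first data point.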
Proximity hints
Characterization in terms of proximity degrees:
Prox(k, l), k, l=1,2, …., N
and supervision indicator matrix B = [bkl], k, l=1,2,…, N
[Figure: proximity hints Prox(k, l) and Prox(s, t) between pairs of data]
Proximity measure
Properties of proximity:
(a) Prox(k, k) = 1
(b) Prox(k, l) = Prox(l, k)
Proximity induced by partition matrix U:
$$\mathrm{Prox}(k, l) = \sum_{i=1}^{c} \min(u_{ik}, u_{il})$$
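Both properties follow directly from the formula when the columns of U sum to 1, which a small sketch makes easy to verify (the name `induced_proximity` is hypothetical):

```python
def induced_proximity(U):
    """Prox(k, l) = sum_i min(u_ik, u_il) for a partition matrix U[i][k]."""
    c, N = len(U), len(U[0])
    return [[sum(min(U[i][k], U[i][l]) for i in range(c)) for l in range(N)]
            for k in range(N)]


U = [[0.8, 0.3, 0.0],
     [0.2, 0.7, 1.0]]
P = induced_proximity(U)
print(P[0][1], P[0][2])  # 0.5 0.2
```

Here Prox(k, k) = 1 because min(u, u) = u and each column sums to 1, and symmetry holds because min is symmetric.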
Augmented objective function
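$$Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{2} \, \|x_k - v_i\|^2 + \beta \sum_{i=1}^{c} \sum_{k_1=1}^{N} \sum_{k_2=1}^{N} \left[ \mathrm{Prox}(k_1, k_2) - \mathrm{Prox}(U)(k_1, k_2) \right]^{2} b(k_1, k_2) \, \|x_{k_1} - x_{k_2}\|^2, \qquad \beta > 0$$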
Two general development strategies
(1) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES (INFORMATION GRANULES OF HIGHER TYPE)
Information granules, Type-1 → information granules, Type-2
(2) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES AND THE USE OF VIEWPOINTS
Information granules, Type-1 + viewpoints → information granules, Type-2
(3) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES: A MODE OF SUCCESSIVE CONSTRUCTION
Information granules and their representatives
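$$u_i(v_k[ii]) = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|v_k[ii] - z_i\|_{F_{ii} \cap F}}{\|v_k[ii] - z_j\|_{F_{ii} \cap F}} \right)^{2/(m-1)}}$$
Represent vk[ii] with the use of z1, z2, …, zc
[Figure: prototypes z1, z2, …, zc in feature space F and prototype v1[ii] in feature space Fii]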
Representation of fuzzy sets: two performance measures
Entropy measure
Reconstruction criterion (error)
Reconstruction error
$$Q = \sum_{ii=1}^{p} \sum_{k=1}^{c[ii]} \left\| v_k[ii] - \hat{v}(v_k[ii]) \right\|_{F_{ii}}^{2}$$
where
$$\hat{v}(v_k[ii]) = \frac{\sum_{i=1}^{c} u_i^{m}(v_k[ii]) \, z_i}{\sum_{i=1}^{c} u_i^{m}(v_k[ii])}$$
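The reconstruction step can be sketched for a single local prototype. The example assumes, for simplicity, that the local prototype and the higher-level prototypes are described by the same features, so the restriction to shared features is not modelled; the name `reconstruct` and the small floor on squared distances are likewise assumptions.

```python
def reconstruct(v, Z, m=2.0):
    """Return (u, v_hat): memberships of v in z_1..z_c and its reconstruction.

    u_i   = 1 / sum_j (||v - z_i|| / ||v - z_j||)^(2/(m-1))
    v_hat = sum_i u_i^m z_i / sum_i u_i^m
    """
    # Squared distances, floored to avoid division by zero (assumption)
    d2 = [max(sum((a - b) ** 2 for a, b in zip(v, z)), 1e-12) for z in Z]
    power = 1.0 / (m - 1.0)
    u = [1.0 / sum((d_i / d_j) ** power for d_j in d2) for d_i in d2]
    w = [u_i ** m for u_i in u]
    total = sum(w)
    v_hat = [sum(w_i * z[t] for w_i, z in zip(w, Z)) / total
             for t in range(len(v))]
    return u, v_hat


u, v_hat = reconstruct([1.0], [[0.0], [4.0]])
print(u, v_hat)
```

For v = 1 and z = 0, 4 the memberships are 0.9 and 0.1, and the reconstruction is 0.04 / 0.82; the squared difference between v and its reconstruction is this prototype's contribution to Q.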
Requirement of “coverage” condition
[Equation: coverage condition relating the feature sets F, with indices running over i = 1, …, p and k = 1, …, c]
Optimization problem
Form a collection of prototypes Z = {z1, z2, …, zc} such that entropy (or reconstruction error) is minimized while satisfying the coverage criterion:
Min_Z Q subject to the coverage condition
Optimization of fuzzification coefficient (m):
Min_Z Q subject to m > 1 and the coverage condition
Collaborative structure development (2)
[Figure: a phenomenon, process, or system observed through data-1, data-2, …, data-P; information granules are formed for each data set and then aggregated into information granules of higher type]
Collaborative structure determination: information granules of higher order
[Figure: data sets D[1], D[2], …, D[P] → prototypes → clustering → prototypes (higher order)]
Determining correspondence between clusters (3)
[Figure: clustering of the local prototypes yields higher-order prototypes zj]
Select prototypes in D[1], D[2], …, D[P] associated with zj with the highest degree of membership
Determining correspondence between clusters (4)
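[Figure: prototype vi[ii] in D[ii] and higher-order prototype zj]
$$\lambda_{ij}[ii] = \frac{1}{\sum_{k=1}^{c[ii]} \left( \dfrac{\|v_i[ii] - z_j\|}{\|v_k[ii] - z_j\|} \right)^{2}}, \qquad \lambda_{i_0 j}[ii] = \max_{i = 1, 2, \ldots, c[ii]} \lambda_{ij}[ii]$$
Prototype i0 associated with prototype zj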
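Selecting the associated prototype amounts to an arg-max over the degrees computed for one data site. A minimal sketch with hypothetical names (`best_match`) and a small floor on squared distances as an added assumption:

```python
def best_match(local_prototypes, z):
    """Return (i0, lam): index of the local prototype associated with z.

    lam[i] = 1 / sum_k (||v_i - z|| / ||v_k - z||)^2 over the prototypes of one
    data site; i0 maximizes lam.
    """
    d2 = [max(sum((a - b) ** 2 for a, b in zip(v, z)), 1e-12)
          for v in local_prototypes]          # floor is an assumption
    lam = [1.0 / sum(d_i / d_k for d_k in d2) for d_i in d2]
    i0 = max(range(len(lam)), key=lam.__getitem__)
    return i0, lam


site = [[0.0, 0.0], [5.0, 5.0], [9.0, 1.0]]   # prototypes of one data site
i0, lam = best_match(site, z=[4.5, 5.5])
print(i0, lam)  # index 1 is selected
```

Repeating this for every data site yields one associated prototype per site for the given higher-order prototype.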
Family of associated prototypes
Prototype i1 in D[1] associated with prototype zj
Prototype i2 in D[2] associated with prototype zj
…
Prototype ip in D[P] associated with prototype zj
$$v_{i_1}[1], \; v_{i_2}[2], \; \ldots, \; v_{i_p}[P]$$
From numeric prototypes to granular prototypes
$$v_{i_1}[1], \; v_{i_2}[2], \; \ldots, \; v_{i_p}[P]$$
Individual coordinate of the associated prototypes: a1, a2, …, ap with membership grades μ1, μ2, …, μp
Information granule: R → [0, 1]
The principle of justifiable granularity: interval representation
Data a1, a2, …, ap with membership grades μ1, μ2, …, μp; interval [b, d]
If $a_i \in [b, d]$, then elevate the membership grade to 1; required change: $1 - \mu_i$
If $a_i \notin [b, d]$, then reduce the membership grade to 0; required change: $\mu_i$
The principle of justifiable granularity: optimization criterion
$$\min_{b, d \in \mathbb{R}:\; b \le d} \left\{ \sum_{a_i \in [b, d]} (1 - \mu_i) + \sum_{a_i \notin [b, d]} \mu_i \right\}$$
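For a handful of values the criterion can be minimized by exhaustive search. The sketch below restricts the candidate bounds to the data values themselves, which loses nothing for intervals that contain at least one data point; the empty interval is not considered, and the function name `justifiable_interval` is hypothetical.

```python
def justifiable_interval(a, mu):
    """Brute-force search for [b, d] minimizing
    sum_{a_i in [b,d]} (1 - mu_i) + sum_{a_i not in [b,d]} mu_i.

    Returns (cost, b, d). Candidate bounds are taken from the data values.
    """
    candidates = sorted(set(a))
    best = None
    for idx, b in enumerate(candidates):
        for d in candidates[idx:]:
            cost = sum((1.0 - m_i) if b <= a_i <= d else m_i
                       for a_i, m_i in zip(a, mu))
            if best is None or cost < best[0]:
                best = (cost, b, d)
    return best


cost, b, d = justifiable_interval([1.0, 2.0, 3.0, 10.0], [0.9, 0.8, 0.7, 0.2])
print(b, d, round(cost, 3))  # 1.0 3.0 0.8
```

The value 10 with low membership is left outside the interval because including it would cost 0.8 instead of 0.2.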
Interval-valued fuzzy sets and granular prototypes
[Figure: data point x and granular prototype vi]
Bounds of distances determined coordinate-wise: $\|x - v_i\|_{\min}$ and $\|x - v_i\|_{\max}$
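Interval-valued fuzzy sets: membership function
Upper bound
$$u_i^{+}(x) = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|x - v_i\|_{\min}}{\|x - v_j\|_{\max}} \right)^{\frac{2}{m-1}}}$$
Lower bound
$$u_i^{-}(x) = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|x - v_i\|_{\max}}{\|x - v_j\|_{\min}} \right)^{\frac{2}{m-1}}}$$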
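The bounds can be sketched for prototypes given as one interval per coordinate. Several details below are assumptions: the coordinate-wise minimum distance is taken as 0 when the coordinate of x falls inside the interval, the computed bounds are clipped to 1 (the raw upper-bound expression can exceed 1), and a small floor avoids division by zero. The function names are hypothetical.

```python
def distance_bounds(x, proto):
    """Squared min and max distances between x and an interval prototype.

    proto is a list of (lo, hi) pairs, one per coordinate.
    """
    d_min = d_max = 0.0
    for xj, (lo, hi) in zip(x, proto):
        near = 0.0 if lo <= xj <= hi else min(abs(xj - lo), abs(xj - hi))
        far = max(abs(xj - lo), abs(xj - hi))
        d_min += near ** 2
        d_max += far ** 2
    return d_min, d_max


def membership_bounds(x, protos, m=2.0, tiny=1e-12):
    """Return [(lower_i, upper_i)] for each granular prototype."""
    bounds = [tuple(max(d, tiny) for d in distance_bounds(x, p)) for p in protos]
    power = 1.0 / (m - 1.0)   # applied to squared distances
    result = []
    for d_min_i, d_max_i in bounds:
        upper = 1.0 / sum((d_min_i / d_max_j) ** power for _, d_max_j in bounds)
        lower = 1.0 / sum((d_max_i / d_min_j) ** power for d_min_j, _ in bounds)
        # Clipping to 1 is an assumption, not part of the slide formulas
        result.append((min(1.0, lower), min(1.0, upper)))
    return result


print(membership_bounds([2.0], [[(0.0, 1.0)], [(4.0, 5.0)]]))
```

When every interval degenerates to a single point (lo equal to hi), the lower and upper bounds coincide with the standard FCM membership.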
Collaborative structure determination: structure refinement
Feedback and structure refinement
[Figure: information granules of higher type feed back to the information granules formed for data-1, data-2, …, data-P]
Iterate
  Clustering at the local level
  Sharing findings and clustering at the higher (global) level
  Assessment of quality of clusters in light of the global structure $\gamma_i(U)[ii]$ formed at the higher level
  Refinement of clustering
Until termination criterion satisfied
$$Q[ii] = \sum_{i=1}^{c[ii]} \gamma_i(U)[ii] \sum_{x_k \in X[ii]} \|x_k - v_i[ii]\|^2$$