Information Mining with Relational and Possibilistic Graphical Models
Prof. Dr. Rudolf Kruse
University of Magdeburg
Faculty of Computer Science
Magdeburg, Germany
Example: Continuously Adapting Gear Shift Schedule in VW New Beetle
[Block diagram: classification of driver / driving situation by fuzzy logic. Inputs: accelerator pedal, filtered speed of accelerator pedal, number of changes in pedal direction, sport factor [t-1]. Fuzzy controller: fuzzification, inference machine with rule base, defuzzification; output: sport factor [t]. Gear shift computation: determination of speed limits for shifting into higher or lower gear depending on sport factor (interpolation), followed by gear selection.]
Continuously Adapting Gear Shift Schedule: Technical Details
Mamdani controller with 7 rules
Optimized program
24 Byte RAM on Digimat
702 Byte ROM
Runtime 80 ms
12 times per second a new sport factor is assigned
How to find suitable rules?
AG4
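The principle of a Mamdani controller can be sketched in a few lines. The following is a minimal illustration only: the three rules, the triangular fuzzy sets, the normalization of all quantities to [0, 1], and the names `tri` and `sport_factor` are assumptions made for this sketch, not the seven rules or the optimized program of the actual controller.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a == b or b == c gives a shoulder)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical fuzzy sets on the normalized range [0, 1].
LOW, MEDIUM, HIGH = (0.0, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.0)

def sport_factor(pedal_speed, direction_changes):
    """Mamdani inference: min for AND and implication, max for OR and aggregation,
    centroid defuzzification on a grid of 101 points."""
    # (firing degree, output fuzzy set) for three illustrative rules.
    rules = [
        (tri(pedal_speed, *LOW), LOW),
        (min(tri(pedal_speed, *MEDIUM), tri(direction_changes, *LOW)), MEDIUM),
        (max(tri(pedal_speed, *HIGH), tri(direction_changes, *HIGH)), HIGH),
    ]
    grid = [i / 100 for i in range(101)]
    aggregated = [max(min(degree, tri(y, *fs)) for degree, fs in rules) for y in grid]
    total = sum(aggregated)
    if total == 0.0:
        return 0.5  # no rule fires: neutral fallback (an assumption of this sketch)
    return sum(y * m for y, m in zip(grid, aggregated)) / total
```

With these assumed rules, a slowly moved pedal yields a low sport factor and a quickly moved pedal a high one.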
Information Mining
Information mining is the non-trivial process of identifying valid, novel, potentially useful, and understandable information and patterns in heterogeneous information sources.
Information sources are databases, expert background knowledge, textual descriptions, images, sounds, ...
Information Mining
• Problem Understanding: Determine Problem Objectives; Assess Situations; Determine Information Mining Goals; Produce Project Plan
• Information Understanding: Collect Initial Information; Describe Information; Explore Information; Verify Information Quality
• Information Preparation: Select Information; Clean Information; Construct Information; Integrate Information; Format Information
• Modeling: Select Modeling Technique; Generate Test Design; Build Model; Assess Model
• Evaluation: Evaluate Results; Review Process; Determine Next Steps
• Deployment: Plan Deployment; Plan Monitoring and Maintenance; Produce Final Results; Review Project
Example: Line Filtering
Extraction of edge segments (Burns' operator)
Production net: edges → lines → long lines → parallel lines → runways
SOMAccess V1.0
Available on CD-ROM: G. Hartmann, A. Nölle, M. Richards, and R. Leitinger (eds.), Data Utilization Software Tools 2 (DUST-2 CD-ROM), Copernicus Gesellschaft e.V., Katlenburg-Lindau, 2000 (ISBN 3-9804862-3-0)
Current Research Topics
• Multi-dimensional data analysis: data warehouse and OLAP (on-line analytical processing)
• Association, correlation, and causality analysis
• Classification: scalability and new approaches
• Clustering and outlier analysis
• Sequential patterns and time-series analysis
• Similarity analysis: curves, trends, images, texts, etc.
• Text mining, web mining and weblog analysis
• Spatial, multimedia, scientific data analysis
• Data preprocessing and database completion
• Data visualization and visual data mining
• Many others, e.g., collaborative filtering
Fuzzy Methods in Information Mining
here: Exploiting quantitative and qualitative information
Fuzzy Data Analysis (Projects with Siemens)
Dependency Analysis (Project with Daimler)
Analysis of Imprecise Data
Statistics with fuzzy sets
Fuzzy Database
     A        B            C
1    Large    Very large   Medium
2    2.5      Medium       About 7
3    [3,4]    Small        [7,8]
Computing with words: linguistic modeling of the entries, computation of the mean of attribute A, linguistic approximation of the result.
The mean w.r.t. A is "approximately 5".
Fuzzy Data Analysis
Strong law of large numbers (Ralescu, Klement, Kruse, Miyakoshi, ...)
Let {x_k | k ≥ 1} be independent and identically distributed fuzzy random variables such that E||supp x_1|| < ∞. Then
\[ d\left( \frac{x_1 + x_2 + \cdots + x_n}{n},\; E(\operatorname{co} x_1) \right) \to 0 \]
Books:
Kruse, Meyer: Statistics with Vague Data, Reidel, 1987
Bandemer, Näther: Fuzzy Data Analysis, Kluwer, 1992
Seising, Tanaka and Guo, Wolkenhauer, Viertl, ...
Analysis of Daimler/Chrysler Database
Database: ~18,500 passenger cars, > 100 attributes per car
Analysis of dependencies between special equipment and faults.
Results used as a starting point for technical experts looking for causes.
Bayesian Networks
Qualitative Part + Quantitative Part = Model
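• Qualitative part: directed acyclic graph over A, B, C; knowledge about (conditional) independence, causality, ...
• Quantitative part: local models on low-dimensional spaces
• Model: unique joint model on the (high-dimensional) space
[Table: joint distribution P(A, B, C) with the listed probabilities 0.8, 0.1, 0.1, 0.0; the markers that distinguish the value combinations of a, b, c are not recoverable.]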
Example: Genotype Determination of Jersey Cattle
variables: 22, state space: 6 · 10^13, parameters: 324
Graphical Model
• node: random variable
• edges: conditional dependencies
• decomposition:
\[ P(X_1, \ldots, X_{22}) = \prod_{i=1}^{22} P(X_i \mid \mathrm{parents}(X_i)) \]
• diagnosis: P(· | knowledge)
[Figure: network fragment with the nodes Phenogr. 1 (3 diff.), Phenogr. 2 (3 diff.), Genotype (6 diff.)]
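The decomposition into one factor per node given its parents can be illustrated on a much smaller network. The sketch below assumes a hypothetical three-node network A → C ← B with binary variables and made-up probabilities; it is not the cattle network.

```python
from itertools import product

# Hypothetical local models (illustrative numbers only).
p_a = {True: 0.3, False: 0.7}
p_b = {True: 0.6, False: 0.4}
p_c_given_ab = {  # P(C = True | A, B)
    (True, True): 0.9, (True, False): 0.5,
    (False, True): 0.4, (False, False): 0.1,
}

def joint(a, b, c):
    """P(A, B, C) = P(A) * P(B) * P(C | A, B): one factor per node given its parents."""
    pc = p_c_given_ab[(a, b)]
    return p_a[a] * p_b[b] * (pc if c else 1.0 - pc)

def posterior_a_given_c(c=True):
    """Diagnosis P(A = True | C = c), computed by summing the factorized joint."""
    numerator = sum(joint(True, b, c) for b in (True, False))
    denominator = sum(joint(a, b, c) for a, b in product((True, False), repeat=2))
    return numerator / denominator

total = sum(joint(a, b, c) for a, b, c in product((True, False), repeat=3))
```

The full joint table has 7 free parameters, while the local models need only 1 + 1 + 4 = 6; the saving grows quickly with the number of variables.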
Learning Graphical Models
[Diagram: data + prior information → Inducer → graph structure (nodes A, B, C) + local models]
The Learning Problem
Complete data, e.g. tuples over (A, B, C): <a4, b3, c1>, <a3, b2, c4>
Incomplete data (missing values, hidden variables, ...), e.g.: <a4, ?, c1>, <a3, b2, ?>
• Known structure, complete data: statistical parametric estimation (closed-form equations): statistical parameter fitting, ML estimation, Bayesian inference, ...
• Unknown structure, complete data: discrete optimization over structures (discrete search): likelihood scores, MDL. Problem: search complexity, hence heuristics.
• Known structure, incomplete data: parametric optimization: EM, gradient descent, ...
• Unknown structure, incomplete data: combined methods: structured EM; only few approaches. Problems: criterion for fit? new variables? local maxima?
• Fuzzy values?
Information Mining
[Diagram with the elements: 18,500 passenger cars, 130 attributes per car; imprecise data; fuzzy database; linguistic modeling; learning graphical models; relational/possibilistic graphical model; rule generation; computing with words; resulting rule: IF air conditioning AND electr. roof top THEN more battery faults.]
A Simple Example
[Figure: the example world and the corresponding relation with the attributes color, shape, and size. Sizes of the ten tuples as listed: small, medium, small, medium, medium, large, medium, medium, medium, large.]
• 10 simple geometric objects, 3 attributes.
• One object is chosen at random and examined.
• Inferences are drawn about the unobserved attributes.
The Reasoning Space
[Figure: the relation (attributes color, shape, size) and its geometric interpretation in the three-dimensional space spanned by color, shape, and size (small, medium, large).]
Each cube represents one tuple.
Prior Knowledge and Its Projections
[Figure: the three-dimensional relation (prior knowledge) and its projections to the two-dimensional subspaces; size axis: small, medium, large.]
Cylindrical Extensions and Their Intersection
Intersecting the cylindrical extensions of the projection to the subspace formed by color and shape and of the projection to the subspace formed by shape and size yields the original three-dimensional relation.
[Figure: the two cylindrical extensions and their intersection; size axis: small, medium, large.]
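Projection, cylindrical extension, and intersection can be sketched with plain sets. The colors and shapes of the slide figure are not recoverable from the transcript, so the ten tuples below are a hypothetical relation; it was chosen to agree with the listed sizes and with the combination counts reported later (6, 5, and 8 occurring combinations). The function names are likewise illustrative.

```python
# Hypothetical relation over (color, shape, size).
R = {
    ("red", "circle", "small"), ("red", "circle", "medium"),
    ("yellow", "circle", "small"), ("yellow", "circle", "medium"),
    ("green", "triangle", "medium"), ("green", "triangle", "large"),
    ("blue", "triangle", "medium"), ("blue", "triangle", "large"),
    ("green", "square", "medium"), ("red", "square", "medium"),
}

def project(relation, positions):
    """Projection of a relation to the attributes at the given tuple positions."""
    return {tuple(t[i] for i in positions) for t in relation}

def join_on_shape(color_shape, shape_size):
    """Intersection of the cylindrical extensions of the two projections:
    a triple survives iff both of its two-dimensional shadows occur."""
    return {(c, s, z) for (c, s) in color_shape for (s2, z) in shape_size if s == s2}

color_shape = project(R, (0, 1))
shape_size = project(R, (1, 2))
color_size = project(R, (0, 2))
reconstructed = join_on_shape(color_shape, shape_size)
```

For this relation `reconstructed == R`, i.e. the two projections suffice to restore the three-dimensional relation.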
Reasoning
Let it be known (e.g. from an observation) that the given object is green. This information considerably reduces the space of possible value combinations.
From the prior knowledge it follows that the given object must be
• either a triangle or a square and
• either medium or large.
[Figure: the reasoning space before and after incorporating the observation; size axis: small, medium, large.]
Reasoning with Projections
The same result can be obtained using only the projections to the subspaces without reconstructing the original three-dimensional space:
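[Figure: the evidence on color is extended to the subspace color × shape and projected to shape; the result is extended to the subspace shape × size and projected to size (s, m, l).]
This justifies a network representation: color - shape - size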
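The extend-and-project steps can be sketched directly on the two-dimensional projections. The pairs below are projections of a hypothetical relation (the actual colors and shapes of the figure are not recoverable from the transcript), and `propagate` is an illustrative name.

```python
# Projections of a hypothetical relation to color x shape and shape x size.
color_shape = {
    ("red", "circle"), ("yellow", "circle"),
    ("green", "triangle"), ("blue", "triangle"),
    ("green", "square"), ("red", "square"),
}
shape_size = {
    ("circle", "small"), ("circle", "medium"),
    ("triangle", "medium"), ("triangle", "large"),
    ("square", "medium"),
}

def propagate(possible_values, pairs):
    """Extend the evidence to the two-dimensional subspace, intersect it with the
    projection, and project the result to the second attribute."""
    return {y for (x, y) in pairs if x in possible_values}

shapes = propagate({"green"}, color_shape)   # evidence: the object is green
sizes = propagate(shapes, shape_size)
```

Only the two small projections are touched; the three-dimensional relation is never reconstructed.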
Interpretation of Graphical Models
Relational graphical model: decomposition + local models
Learning a relational graphical model: searching for a suitable decomposition + local relations
Example: [Figure: the network over colour, shape, and size, drawn once as a graph and once as a hypergraph.]
Genotype Determination of Danish Jersey Cattle
[Figure labels: assumptions about parents (risk of misstatement); genotype mother; genotype father; genotype child (6 possible values); 4 lysis values measured by photometer; reliability of databases; inheritance rules; blood group determination.]
Qualitative Knowledge
[Figure: network over the 22 variables: parental error; dam correct; sire correct; phenogroup 1 and phenogroup 2 of the stated dam, the stated sire, the true dam, the true sire, and the offspring; genotype of the offspring; factor 40 (F1), factor 41 (F2), factor 42 (V1), factor 43 (V2); lysis 40, lysis 41, lysis 42, lysis 43.]
Learning Graphical Models from Data
• Test whether a distribution is decomposable w.r.t. a given graph. This is the most direct approach. It is not bound to a graphical representation, but can also be carried out w.r.t. other representations of the set of subspaces to be used to compute the (candidate) decomposition of the given distribution.
• Find an independence map by conditional independence tests. This approach exploits the theorems that connect conditional independence graphs and graphs that represent decompositions. It has the advantage that a single conditional independence test, if it fails, can exclude several candidate graphs.
• Find a suitable graph by measuring the strength of dependences. This is a heuristic, but often highly successful approach, which is based on the frequently valid assumption that in a distribution that is decomposable w.r.t. a graph an attribute is more strongly dependent on adjacent attributes than on attributes that are not directly connected to it.
Is Decomposition Always Possible?
[Figure: a three-dimensional relation with two marked tuples (1 and 2) and its projections to the two-dimensional subspaces; size axis: small, medium, large.]
Direct Test for decomposability
[Figure: eight panels (1. to 8.), one for each possible undirected graph over the attributes color, shape, and size, each shown with the corresponding relation in the space color × shape × size.]
Evaluation Measures and Search Methods
An exhaustive search over all graphs is too expensive:
• \( 2^{\binom{n}{2}} \) possible undirected graphs for n attributes.
• \( f(n) = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} 2^{i(n-i)} f(n-i) \) possible directed acyclic graphs.
Therefore all learning algorithms consist of
• an evaluation measure (scoring function), e.g. Hartley information gain, relative number of occurring value combinations,
• and a (heuristic) search method, e.g. guided random search, greedy search (K2 algorithm), conditional independence search.
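The growth of the search space is easy to check numerically. The sketch below evaluates both counts; the base case f(0) = 1 is the usual convention for the recurrence and is an assumption here, since the slide does not state it. The function names are illustrative.

```python
from functools import lru_cache
from math import comb

def count_undirected_graphs(n):
    """2^(n choose 2): every pair of attributes is either connected or not."""
    return 2 ** comb(n, 2)

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of directed acyclic graphs on n labeled nodes (recurrence from the slide)."""
    if n == 0:
        return 1  # assumed base case
    return sum((-1) ** (i + 1) * comb(n, i) * 2 ** (i * (n - i)) * count_dags(n - i)
               for i in range(1, n + 1))

for n in (3, 5, 10):
    print(n, count_undirected_graphs(n), count_dags(n))
```

Already for three attributes there are 8 undirected graphs and 25 directed acyclic graphs.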
Measuring the Strengths of Marginal Dependences
Relational networks: Find a set of subspaces, for which the intersection of the cylindrical extensions of the projections to these subspaces contains as few additional states as possible.
The size of the intersection depends on the sizes of the cylindrical extensions, which in turn depend on the sizes of the projections.
Therefore it is plausible to use the relative number of occurring value combinations to assess the quality of a subspace.
The relational network can be obtained by interpreting the relative numbers as edge weights and constructing the minimal weight spanning tree.
subspace                 color × shape   shape × size   size × color
possible combinations    12              9              12
occurring combinations   6               5              8
relative number          50%             56%            67%
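The construction of the minimal weight spanning tree from the relative numbers can be sketched with Kruskal's algorithm. The choice of Kruskal's algorithm and the helper names are assumptions of this sketch; the counts are those of the three two-dimensional subspaces.

```python
# (attribute, attribute, possible combinations, occurring combinations)
EDGES = [
    ("color", "shape", 12, 6),
    ("shape", "size", 9, 5),
    ("size", "color", 12, 8),
]

def minimum_spanning_tree(edges):
    """Kruskal's algorithm; the edge weight is occurring / possible combinations."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    tree = []
    for a, b, possible, occurring in sorted(edges, key=lambda e: e[3] / e[2]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # the edge connects two separate components
            parent[root_a] = root_b
            tree.append((a, b))
    return tree

tree = minimum_spanning_tree(EDGES)
```

The two edges with the smallest relative numbers are selected, which gives the chain color - shape - size.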
Conditional Independence Tests
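Hartley information needed to determine
• coordinates: log2 4 + log2 3 = log2 12 ≈ 3.58
• coordinate pair: log2 6 ≈ 2.58
• gain: log2 12 − log2 6 = log2 2 = 1
Definition: Let A and B be two attributes and R a discrete possibility measure with ∃a ∈ dom(A): ∃b ∈ dom(B): R(A = a, B = b) = 1. Then
\[ I_{\mathrm{gain}}^{(\mathrm{Hartley})}(A,B) = \log_2 \Big( \sum_{a \in \mathrm{dom}(A)} R(A=a) \Big) + \log_2 \Big( \sum_{b \in \mathrm{dom}(B)} R(B=b) \Big) - \log_2 \Big( \sum_{a \in \mathrm{dom}(A)} \sum_{b \in \mathrm{dom}(B)} R(A=a, B=b) \Big) \]
\[ = \log_2 \frac{ \big( \sum_{a \in \mathrm{dom}(A)} R(A=a) \big) \big( \sum_{b \in \mathrm{dom}(B)} R(B=b) \big) }{ \sum_{a \in \mathrm{dom}(A)} \sum_{b \in \mathrm{dom}(B)} R(A=a, B=b) } \]
is called the Hartley information gain of A and B w.r.t. R.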
Conditional Independence Tests (continued)
The Hartley information gain can be used directly to test for (approximate) marginal independence.
attributes      relative number of possible value combinations   Hartley information gain
color, shape    6/(3·4) = 1/2 = 50%                              log2 3 + log2 4 − log2 6 = 1
color, size     8/(3·4) = 2/3 ≈ 67%                              log2 3 + log2 4 − log2 8 ≈ 0.58
shape, size     5/(3·3) = 5/9 ≈ 56%                              log2 3 + log2 3 − log2 5 ≈ 0.85
In order to test for (approximate) conditional independence:
• Compute the Hartley information gain for each possible instantiation of the conditioning attributes.
• Aggregate the result over all possible instantiations, for instance, by simply averaging them.
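Both the marginal and the conditional test can be sketched as follows. The counts reproduce the three values of the table; the relation used for the conditional test is hypothetical (the colors and shapes of the slide figure are not recoverable), and the function names are illustrative.

```python
from math import log2

def hartley_gain_from_counts(n_a, n_b, n_pairs):
    """log2 |values of A| + log2 |values of B| - log2 |occurring pairs|."""
    return log2(n_a) + log2(n_b) - log2(n_pairs)

def hartley_gain(pairs):
    pairs = set(pairs)
    return hartley_gain_from_counts(len({a for a, _ in pairs}),
                                    len({b for _, b in pairs}), len(pairs))

def conditional_hartley_gain(triples):
    """Average gain of the outer attributes over all instantiations of the middle one."""
    groups = {}
    for a, condition, b in triples:
        groups.setdefault(condition, set()).add((a, b))
    return sum(hartley_gain(g) for g in groups.values()) / len(groups)

# Hypothetical relation over (color, shape, size).
RELATION = {
    ("red", "circle", "small"), ("red", "circle", "medium"),
    ("yellow", "circle", "small"), ("yellow", "circle", "medium"),
    ("green", "triangle", "medium"), ("green", "triangle", "large"),
    ("blue", "triangle", "medium"), ("blue", "triangle", "large"),
    ("green", "square", "medium"), ("red", "square", "medium"),
}

marginal = hartley_gain({(c, z) for c, _, z in RELATION})   # color vs. size
conditional = conditional_hartley_gain(RELATION)            # color vs. size given shape
```

In this hypothetical relation color and size are marginally dependent (gain of about 0.58 bit) but conditionally independent given shape (average gain 0).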
Direct Test for Decomposability
Definition: Let p1 and p2 be two strictly positive probability distributions on the same set \(\mathcal{E}\) of events. Then
\[ I_{\mathrm{KLdiv}}(p_1, p_2) = \sum_{E \in \mathcal{E}} p_1(E) \log_2 \frac{p_1(E)}{p_2(E)} \]
is called the Kullback-Leibler information divergence of p1 and p2.
• The Kullback-Leibler information divergence is non-negative.
• It is zero if and only if p1 ≡ p2.
Therefore it is plausible that this measure can be used to assess the quality of the approximation of a given multi-dimensional distribution p1 by the distribution p2 that is represented by a given graph: The smaller the value of this measure, the better the approximation.
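As a small illustration, the divergence between a two-attribute distribution and its approximation by the product of its marginals (the graph without an edge) can be computed as follows. The distribution and the function names are made up for this sketch.

```python
from math import log2

def kl_divergence(p1, p2):
    """Sum of p1(E) * log2(p1(E) / p2(E)); both distributions must be strictly positive."""
    return sum(p * log2(p / p2[event]) for event, p in p1.items())

def independent_approximation(joint):
    """Product of the marginals of a distribution over pairs (a, b)."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return {(a, b): p_a[a] * p_b[b] for a in p_a for b in p_b}

# Hypothetical joint distribution of two binary attributes.
joint = {("a1", "b1"): 0.4, ("a1", "b2"): 0.1, ("a2", "b1"): 0.2, ("a2", "b2"): 0.3}

divergence = kl_divergence(joint, independent_approximation(joint))
```

The divergence of a distribution from itself is 0; here the approximation without an edge loses about 0.12 bit.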
Direct Test for Decomposability (continued)
[Figure: eight panels (1. to 8.), one for each possible graph over the attributes A, B, and C, each annotated with two numbers.]
Value pairs in the order listed (upper number / lower number): 0 / −4401, 0.566 / −5041, 0.137 / −4612, 0.429 / −4830, 0.540 / −4991, 0.111 / −4563, 0.402 / −4780, 0 / −4401
Upper numbers: The Kullback-Leibler information divergence of the original distribution and its approximation.
Lower numbers: The binary logarithm of the probability of an example database (log-likelihood of data).
Evaluation Measures / Scoring Functions
Relational Networks
• Relative number of occurring value combinations
• Hartley information gain
Probabilistic Networks
• χ²-measure
• Mutual information / cross entropy / information gain
• (Symmetric) information gain ratio
• (Symmetric/modified) Gini index
• Bayesian measures (g-function, BDeu metric)
• Other measures that are known from decision tree induction
A Probabilistic Evaluation Measure
Mutual Information / Cross Entropy / Information Gain
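based on Shannon entropy
\[ H = -\sum_{i=1}^{n} p_i \log_2 p_i \]
Idea:
\[ I_{\mathrm{gain}}(A,B) = H_A - H_{A|B} = H_A + H_B - H_{AB} \]
\[ = -\sum_{a \in \mathrm{dom}(A)} P(A=a) \log_2 P(A=a) - \sum_{b \in \mathrm{dom}(B)} P(B=b) \log_2 P(B=b) + \sum_{a \in \mathrm{dom}(A)} \sum_{b \in \mathrm{dom}(B)} P(A=a, B=b) \log_2 P(A=a, B=b) \]
where
\[ H_{A|B} = -\sum_{b \in \mathrm{dom}(B)} P(B=b) \sum_{a \in \mathrm{dom}(A)} P(A=a \mid B=b) \log_2 P(A=a \mid B=b) \]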
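The information gain can be computed from the three entropies H_A, H_B, and H_AB. The joint distributions below are made up for illustration, and the function names are not taken from the slides.

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy in bits; zero probabilities contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def information_gain(joint):
    """I_gain(A, B) = H_A + H_B - H_AB for a distribution over pairs (a, b)."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return entropy(p_a.values()) + entropy(p_b.values()) - entropy(joint.values())

# Hypothetical distributions: one dependent, one independent.
dependent = {("a1", "b1"): 0.4, ("a1", "b2"): 0.1, ("a2", "b1"): 0.2, ("a2", "b2"): 0.3}
independent = {("a1", "b1"): 0.3, ("a1", "b2"): 0.2, ("a2", "b1"): 0.3, ("a2", "b2"): 0.2}

gain_dependent = information_gain(dependent)
gain_independent = information_gain(independent)
```

The gain vanishes (up to rounding) for the independent distribution and is about 0.12 bit for the dependent one.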
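Possibility Theory
[Figure: trapezoidal fuzzy set "cloudy" with axis ticks at 50, 65, 85, 100 and maximal membership degree 1.]
A fuzzy set induces a possibility measure:
\[ \Pi(A) = \sup_{\omega \in A} \pi(\omega) \]
Example: \( \Pi([50, 60]) = 2/3 \)
Axioms:
\[ \Pi(\emptyset) = 0, \qquad \Pi(\Omega) = 1 \]
\[ \Pi(A \cup B) = \max\{\Pi(A), \Pi(B)\} \]
\[ \Pi(A \cap B) \le \min\{\Pi(A), \Pi(B)\} \]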
Possibility Distributions and the Context Model
• Let Ω be the set of all possible states of the world, ω0 the actual (but unknown) state.
• Let C = {c1, …, ck} be a set of contexts (observers, frame conditions etc.), and (C, 2^C, P) a finite probability space (context weights).
• Let γ: C → 2^Ω be a set-valued mapping, assigning to each context the most specific correct set-valued specification of ω0.
• γ is called a random set (since it is a set-valued random variable); the sets γ(c) are also called focal sets.
• The induced one-point coverage of γ or the induced possibility distribution is
\[ \pi_\gamma : \Omega \to [0,1], \qquad \pi_\gamma(\omega) = P(\{ c \in C \mid \omega \in \gamma(c) \}) \]
Database-induced Possibility Distributions
Imprecise Database
A              B          C          D
a1             {b2, b3}   c3         {d1, d2}
a3             {b1, b2}   c2         d3
{a2, a4}       b3         {c1, c2}   {d1, d3, d4}
{a1, a2, a3}   b2         *          {d1, d4}
Each imprecise tuple (or, more precisely, the set of all precise tuples compatible with it) is interpreted as a focal set of a random set.
In the absence of other information equal weights are assigned to the contexts. In this way an imprecise database induces a possibility distribution.
Focal Sets (beginning of the list)
A    B    C    D
a1   b2   c3   d1
a1   b3   c3   d1
a1   b2   c3   d2
a1   b3   c3   d2
a3   b1   c2   d3
a3   b2   c2   d3
a2   b3   c1   d1
...
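The induced possibility distribution can be sketched by expanding each imprecise tuple into its focal set and counting the contexts that cover a precise tuple. The attribute domains are an assumption of this sketch (the slide does not list them), "*" is read as "any value of the domain", and the function names are illustrative.

```python
from itertools import product

ATTRIBUTES = ["A", "B", "C", "D"]
DOMAINS = {  # assumed domains
    "A": ["a1", "a2", "a3", "a4"], "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"], "D": ["d1", "d2", "d3", "d4"],
}
DATABASE = [  # the four imprecise tuples of the slide
    ({"a1"}, {"b2", "b3"}, {"c3"}, {"d1", "d2"}),
    ({"a3"}, {"b1", "b2"}, {"c2"}, {"d3"}),
    ({"a2", "a4"}, {"b3"}, {"c1", "c2"}, {"d1", "d3", "d4"}),
    ({"a1", "a2", "a3"}, {"b2"}, "*", {"d1", "d4"}),
]

def focal_set(row):
    """All precise tuples that are compatible with one imprecise tuple."""
    choices = [DOMAINS[attr] if value == "*" else sorted(value)
               for attr, value in zip(ATTRIBUTES, row)]
    return set(product(*choices))

def possibility(precise_tuple, database=DATABASE):
    """Share of the equally weighted contexts whose focal set contains the tuple."""
    return sum(precise_tuple in focal_set(row) for row in database) / len(database)
```

For example, the tuple (a1, b2, c3, d1) is compatible with the first and the fourth imprecise tuple and therefore receives the degree of possibility 2/4 = 0.5.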
Reasoning
• Using the information that the given object is green.
[Figure: three-dimensional possibility distribution over color × shape × size and its maximum projections after the evidence has been incorporated; all numbers in parts per 1000.]
Resulting degrees of possibility for the size: small 40, medium 70, large 70.
Reasoning with Projections
Again the same result can be obtained using only projections to subspaces (maximal degrees of possibility):
[Figure: the new degrees for color are combined by minimum with the projection to color × shape (min new); the maximum over each line yields the new degrees for shape (70, 60, 10); these are combined by minimum with the projection to shape × size (min new); the maximum over each column yields the new degrees for size (s, m, l: 40, 70, 70). All numbers in parts per 1000.]
This justifies a network representation: color - shape - size
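The max-min propagation can be sketched on the two projections. The degrees below (in parts per 1000) are taken from the slide as far as they can be read from the transcript; the names `color_1`, `color_2`, `color_4`, `shape_1` to `shape_3` are placeholders, the order of the m and l columns is uncertain (it does not affect the result), and `propagate` is an illustrative name.

```python
COLORS = ["color_1", "color_2", "green", "color_4"]
SHAPES = ["shape_1", "shape_2", "shape_3"]
SIZES = ["s", "m", "l"]

# One row per shape; columns follow COLORS and SIZES, respectively.
COLOR_SHAPE_ROWS = [[40, 80, 70, 10], [30, 10, 60, 70], [80, 90, 10, 20]]
SHAPE_SIZE_ROWS = [[20, 80, 70], [40, 70, 20], [90, 60, 30]]

color_shape = {(c, sh): COLOR_SHAPE_ROWS[i][j]
               for i, sh in enumerate(SHAPES) for j, c in enumerate(COLORS)}
shape_size = {(sh, z): SHAPE_SIZE_ROWS[i][j]
              for i, sh in enumerate(SHAPES) for j, z in enumerate(SIZES)}

def propagate(source_degrees, joint):
    """new(target) = max over source values of min(new(source), joint(source, target))."""
    result = {}
    for (source, target), degree in joint.items():
        result[target] = max(result.get(target, 0), min(source_degrees[source], degree))
    return result

evidence = {c: (1000 if c == "green" else 0) for c in COLORS}  # the object is green
new_shape = propagate(evidence, color_shape)
new_size = propagate(new_shape, shape_size)
```

The result agrees with the degrees stated for the size: 40 (small), 70 (medium), 70 (large).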
POSSINFER
Possibilistic Evaluation Measures / Scoring Functions
Specificity Gain [Gebhardt and Kruse 1996, Borgelt et al. 1996]
(Symmetric) Specificity Gain Ratio [Borgelt et al. 1996]
Analog of Mutual Information [Borgelt and Kruse 1997]
Analog of the χ²-measure [Borgelt and Kruse 1997]
Possibilistic Evaluation Measures
Reduction to the relational case via α-cuts
Usable relational measures:
• relative number of value combinations / Hartley information gain → specificity gain
• number of additional value combinations in the Cartesian product of the marginal distributions
Hartley information gain of the α-cuts (levels 0.4, 0.3, 0.2, 0.1, 0):
• α = 0.4: log2 1 + log2 1 − log2 1 = 0
• α = 0.3: log2 2 + log2 2 − log2 3 ≈ 0.42
• α = 0.2: log2 3 + log2 2 − log2 5 ≈ 0.26
• α = 0.1: log2 4 + log2 3 − log2 8 ≈ 0.58
• α = 0: log2 4 + log2 3 − log2 12 = 0
Specificity Gain
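Definition: Let A and B be two attributes and Π a possibility measure.
\[ S_{\mathrm{gain}}(A,B) = \int_0^{\sup \Pi} \Big( \log_2 \Big( \sum_{a \in \mathrm{dom}(A)} [\Pi]_\alpha(A=a) \Big) + \log_2 \Big( \sum_{b \in \mathrm{dom}(B)} [\Pi]_\alpha(B=b) \Big) - \log_2 \Big( \sum_{a \in \mathrm{dom}(A)} \sum_{b \in \mathrm{dom}(B)} [\Pi]_\alpha(A=a, B=b) \Big) \Big) \, d\alpha \]
is called the specificity gain of A and B w.r.t. Π.
• Generalization of Hartley information gain on the basis of the α-cut view of possibility distributions.
• Analogous to Shannon information gain.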
Specificity Gain in the Example
[Figure: for each two-dimensional subspace, the projection to the subspace, the minimum of the marginals, and the resulting specificity gain; all numbers in parts per 1000; size axis: small, medium, large (s, m, l).]
Specificity gain: color × shape 0.055 bit, shape × size 0.048 bit, color × size 0.027 bit.
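The specificity gain of a discrete two-dimensional possibility distribution can be computed by summing the Hartley information gain of the α-cuts over the intervals between successive degrees. In the sketch below the tables hold the projections to color × shape and shape × size in parts per 1000, as far as they can be read from the transcript; the row and column order is an assumption (it does not influence the gain), and the function name is illustrative.

```python
from math import log2

def specificity_gain(table):
    """Integral over alpha of the Hartley information gain of the alpha-cuts.
    The result is in the unit of the table entries (here: parts per 1000)."""
    cells = [(i, j, v) for i, row in enumerate(table) for j, v in enumerate(row)]
    levels = sorted({v for _, _, v in cells if v > 0})
    gain, previous = 0.0, 0
    for level in levels:
        cut = [(i, j) for i, j, v in cells if v >= level]   # alpha-cut for (previous, level]
        n_rows = len({i for i, _ in cut})
        n_cols = len({j for _, j in cut})
        gain += (level - previous) * (log2(n_rows) + log2(n_cols) - log2(len(cut)))
        previous = level
    return gain

COLOR_SHAPE = [[40, 80, 70, 10], [30, 10, 60, 70], [80, 90, 10, 20]]
SHAPE_SIZE = [[20, 80, 70], [40, 70, 20], [90, 60, 30]]

gain_color_shape = specificity_gain(COLOR_SHAPE) / 1000   # in bit
gain_shape_size = specificity_gain(SHAPE_SIZE) / 1000     # in bit
```

Rounded to three decimals, the two values reproduce the 0.055 bit and 0.048 bit reported for these subspaces.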
Learning Graphical Models from Data
[Figure: eight panels (1. to 8.), one for each possible undirected graph over the attributes color, shape, and size, each shown with the corresponding distribution in the space color × shape × size.]
Data Mining Tool Clementine
Analysis of Daimler/Chrysler Database
[Figure: network with the nodes electrical roof top, air conditioning, type of engine, type of tyres, slippage control, faulty battery, faulty compressor, faulty brakes.]
Fictitious example: There are significantly more faulty batteries if both air conditioning and electrical roof top are built into the car.
Example Subnet
Influence of special equipment on battery faults:
(fictitious) frequency of battery faults:
                                    air conditioning
                                    with     without
electrical sliding roof   with      8%       3%
                          without   3%       2%
• A significant deviation from the independent distribution hints at possible causes and improvements.
• Here: a larger battery may be required if an air conditioning system and an electrical sliding roof are built in.
(The dependencies and frequencies of this example are fictitious, true numbers are confidential.)
Resources
http://fuzzy.cs.uni-magdeburg.de
free software tools such as NEFCLASS, …
C. Borgelt, R. Kruse:
Graphical models – Methods for data analysis and mining
Wiley, Chichester, January 2002.