Research on Ensemble Learning
Feng Zhou, Baoliang Lu
Department of Computer Science
Shanghai Jiao Tong University
Jan. 22, 2008
Outline
1 Background: Ensemble Learning, Probabilistic Classifier, M3 Framework, Summary
2 First Method: Problem Description, Decomposition, Integration, Comparison, Summary
3 Second Method: Problem Revisited, Homo-pairwise Combination, Comparison
4 Experiments: Toy Data, Large-scale Data
Where Ensemble Learning arises
The limitations of traditional classifier algorithms:
- Statistical Problem
- Computational Problem
- Representation Problem
Existing approaches to ensemble learning:
- AdaBoost: Strong ← Weak
- One-vs-One (One-vs-All): Multiclass ← Pairwise
- M3: Complicated ← Simple
How to report Probabilistic Outputs [1]
Definition
Find $f : x \in \mathbb{R} \to p(y=1|x) \in [0,1]$, e.g. the sigmoid
$$f(x) = \frac{1}{1 + e^{Ax+B}}$$
Optimization Criterion
Minimize the cross entropy
$$-\sum_{i=1}^{n} \big[\, y_i \log(p_i) + (1-y_i)\log(1-p_i) \,\big]$$
which can be solved by a model-trust minimization algorithm.
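The fit above can be sketched numerically. The following uses plain gradient descent on the cross entropy instead of the model-trust algorithm of [1]; `scores` and `labels` are hypothetical training data (raw classifier outputs and 0/1 targets):

```python
import math

def fit_sigmoid(scores, labels, lr=0.05, epochs=2000):
    """Fit p(y=1|x) = 1 / (1 + exp(A*x + B)) by minimizing the cross entropy.

    A simple gradient-descent stand-in for the model-trust algorithm;
    scores are raw classifier outputs, labels are 0/1.
    """
    A, B = 0.0, 0.0
    for _ in range(epochs):
        gA = gB = 0.0
        for x, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(A * x + B))
            # For this parameterization, dL/dA = (y - p) * x, dL/dB = (y - p)
            gA += (y - p) * x
            gB += (y - p)
        A -= lr * gA
        B -= lr * gB
    return A, B

# Positive scores should map to high probability, so A comes out negative
# (compare the fitted A = -13.23 reported on the 20 Newsgroups slide).
A, B = fit_sigmoid([-2.0, -1.5, -1.0, 1.0, 1.5, 2.0], [0, 0, 0, 1, 1, 1])
```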
Estimated Priors
[Figure: estimated class densities $p(x|\omega_+)$ and $p(x|\omega_-)$ over the classifier outputs]
Fitting on Posteriors
[Figure: the empirical posterior $p(\omega_+|x)$ together with the fitted sigmoid]
How Min-Max-Modular (M3) works [2]
Learning on the decomposed training sets
[Figure: each class of a two-class training set is split into subsets; every pair of subsets from opposite classes forms a small subproblem learned by its own classifier]
Integrating the classifiers' reports
[Figure: the modules' outputs are first combined by Min units, and the Min results are then combined by a Max unit]
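The min-max integration above fits in a few lines. Here `modules` is a hypothetical grid of trained base classifiers, one per pair of decomposed subsets, each returning a score in [0, 1] for the positive class:

```python
def m3_predict(modules, x):
    """Min-Max-Modular integration as in [2] (a minimal sketch).

    modules[i][j] scores x on the subproblem (positive subset i vs
    negative subset j).  Each row is combined by a MIN unit, and the
    row results by a MAX unit, giving the ensemble's positive score.
    """
    row_mins = [min(module(x) for module in row) for row in modules]
    return max(row_mins)

# Toy modules with constant outputs, standing in for trained classifiers.
modules = [
    [lambda x: 0.9, lambda x: 0.2],  # row min: 0.2
    [lambda x: 0.6, lambda x: 0.7],  # row min: 0.6
]
score = m3_predict(modules, None)    # max of the row mins
```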
What we gain from past research
- Why can the Min-Max principles successfully perform the integration job?
- How does the decomposition stage influence the later integration stage?
- Could the system be accelerated?
[Figure: nine candidate decompositions of the same training set; which one is best?]
Let's consider it again [3]
[Figure: the classical pipeline on a two-class data set. Learning estimates the class densities $p(x|\omega_+)$ and $p(x|\omega_-)$ (the "Priors" panel); the Bayes Rule turns them into the posterior $p(\omega_+|x)$ (the "Posteriors" panel), whose 0.5 crossing gives the Decision Boundary used at Testing time.]
Complicated
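The Bayes Rule step of this pipeline is a one-liner. The sketch below assumes equal class priors, a simplification of the slide's setting:

```python
def posterior_pos(px_pos, px_neg, prior_pos=0.5):
    """p(ω+|x) from the class densities p(x|ω+), p(x|ω−) via the Bayes rule."""
    num = px_pos * prior_pos
    den = num + px_neg * (1.0 - prior_pos)
    return num / den

# x is assigned to ω+ whenever the posterior exceeds the 0.5 decision boundary.
p = posterior_pos(0.3, 0.1)  # 0.15 / 0.20 = 0.75, so classify as ω+
```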
How to simplify the problem
[Figure: each class of the original problem is decomposed into subsets; every pair of subsets from opposite classes yields a small two-class problem with its own estimated densities]
Original Problem → Current Problems
How to integrate the patches
[Figure: the subproblem posteriors produced by Probabilistic Outputting are combined within each group by Shrinking or Minimizing, and the group results are merged by Expanding or Maximizing into the final posterior]
What's the difference
Stage One
$$\mathrm{Min}(x) = \min_{i=1}^{d} x_i \qquad \mathrm{Shrink}(x) = \frac{1}{\sum_{i=1}^{d} \frac{1}{x_i} - (d-1)}$$
Stage Two
$$\mathrm{Max}(x) = \max_{i=1}^{d} x_i \qquad \mathrm{Expand}(x) = 1 - \frac{1}{\sum_{i=1}^{d} \frac{1}{1-x_i} - (d-1)}$$
[Figure: compared with Min, the Shrink curve lies much lower; compared with Max, the Expand curve lies much higher]
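The two rules differ only in how conservatively they combine the d subproblem posteriors. A direct transcription of the formulas, assuming each x_i lies strictly inside (0, 1):

```python
def shrink(xs):
    """Shrink(x) = 1 / (sum_i 1/x_i - (d - 1)); never exceeds min(xs)."""
    d = len(xs)
    return 1.0 / (sum(1.0 / x for x in xs) - (d - 1))

def expand(xs):
    """Expand(x) = 1 - 1 / (sum_i 1/(1 - x_i) - (d - 1)); never below max(xs)."""
    d = len(xs)
    return 1.0 - 1.0 / (sum(1.0 / (1.0 - x) for x in xs) - (d - 1))

xs = [0.2, 0.9]
# Shrink pushes below Min and Expand pushes above Max, as the curves show.
low, high = shrink(xs), expand(xs)
```

Since 1/x_i ≥ 1 for x_i ≤ 1, the denominator of Shrink is at least 1/min(xs), which is why Shrink sits at or below the Min curve; Expand is the mirror image around the two class labels.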
What we could conclude
Question One (Answered)
Why can the Min-Max principles successfully perform the integration job? Because they partially obey the Bayes Decision Rule.
Question Two (Answered)
How does the decomposition stage influence the later integration stage? Decomposing along the inner structure of each class contributes to a large distance among the patches.
Question Three (Unsolved)
Could the system be accelerated? Current complexity: O(n+ × n−).
A General Consideration
[Figure: three views of the task: a Single Classifier, the Former framework, and Another View that links the subsets through a Bridge Class]
What happens if we face the same classes
[Figure: the same set of subproblem densities combined by the Former Approach versus the Homo-pairwise combination]
What's the source of the efficiency
Complexity of Algorithms
Min-Max, Shrink-Expansion: O(n+ × n−); Homo-pairwise: O(n+ + n−)
Special Probabilistic Relationship
Suppose
$$\mathrm{Link}_x(\omega_k, \omega_i) = \frac{p(\omega_k, x)}{p(\omega_i, x)} \quad \text{and} \quad \mathrm{Link}_x(\omega_k, \omega_j) = \frac{p(\omega_k, x)}{p(\omega_j, x)}$$
Then
$$\mathrm{Link}_x(\omega_i, \omega_j) = \frac{p(\omega_i, x)/p(\omega_k, x)}{p(\omega_j, x)/p(\omega_k, x)} = \frac{\mathrm{Link}_x(\omega_k, \omega_j)}{\mathrm{Link}_x(\omega_k, \omega_i)}$$
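This identity is what lets the homo-pairwise scheme recover any cross-class ratio through a bridge class instead of training a module for every pair. A numeric check with hypothetical joint probabilities:

```python
def link(p_a, p_b):
    """Link ratio p(ω_a, x) / p(ω_b, x) between two joint probabilities."""
    return p_a / p_b

# Hypothetical joints p(ω_k, x), p(ω_i, x), p(ω_j, x) at a single point x.
pk, pi, pj = 0.5, 0.2, 0.3

direct = link(pi, pj)                     # Link_x(ω_i, ω_j) computed directly
via_bridge = link(pk, pj) / link(pk, pi)  # recovered through the bridge ω_k
```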
Does it happen as we thought (Toy Data)
Gaussian Parameters

Label   µ             Σ
ω+1     (1, 1.2)      (0.8, 2.1)
ω+2     (−1, −1)      (1.1, 1.5)
ω−1     (0.8, −0.5)   (2.2, 0.8)
ω−2     (−0.7, 1)     (1.7, 0.6)

Performance

Case ID   Min-Max   SE      SEr
1         73.00     75.00   75.00
2         78.00     83.00   83.00
3         72.00     75.00   75.00
20 Newsgroups (Large-scale Data)
Probability Estimate
[Figure: estimated densities p(x|ω+) and p(x|ω−) for alt.atheism (+) vs comp.graphics (−), together with the fitted sigmoid posterior p(ω+|x), A = −13.23, B = 6.97]
References
[1] J. Platt. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 1999.
[2] B.L. Lu and M. Ito. Task decomposition and module combination based on class relations: a modular neural network for pattern classification. IEEE Transactions on Neural Networks, 1999.
[3] F. Zhou and B.L. Lu. Learning concepts from large-scale data sets by pairwise coupling with probabilistic outputs. IEEE International Joint Conference on Neural Networks, 2007.