AdaBoost

Lecturer: Jan Šochman
Authors: Jan Šochman, Jiří Matas
Center for Machine Perception, Czech Technical University, Prague
http://cmp.felk.cvut.cz (slides: cmp.felk.cvut.cz/~sochmj1/adaboost_talk.pdf)




Presentation

Motivation

"AdaBoost with trees is the best off-the-shelf classifier in the world." (Breiman 1998)

Outline:

• AdaBoost algorithm
  • How does it work?
  • Why does it work?
• Online AdaBoost and other variants


What is AdaBoost?

AdaBoost is an algorithm for constructing a "strong" classifier as a linear combination

f(x) = Σ_{t=1}^{T} α_t h_t(x)

of "simple" "weak" classifiers h_t(x) : X → {−1, +1}.

Terminology

• h_t(x) ... "weak" or basis classifier, hypothesis, "feature"
• H(x) = sign(f(x)) ... "strong" or final classifier/hypothesis

Interesting properties

• AB is capable of reducing both the bias (e.g. stumps) and the variance (e.g. trees) of the weak classifiers
• AB has good generalisation properties (maximises the margin)
• AB output converges to the logarithm of the likelihood ratio
• AB can be seen as a feature selector with a principled strategy (minimisation of an upper bound on the empirical error)
• AB is close to sequential decision making (it produces a sequence of gradually more complex classifiers)

The AdaBoost Algorithm

Given: (x_1, y_1), ..., (x_m, y_m); x_i ∈ X, y_i ∈ {−1, +1}
Initialise weights D_1(i) = 1/m
For t = 1, ..., T:

• Find h_t = argmin_{h_j ∈ H} ε_j, where ε_j = Σ_{i=1}^{m} D_t(i) ⟦y_i ≠ h_j(x_i)⟧

• If ε_t ≥ 1/2 then stop

• Set α_t = (1/2) log((1 − ε_t)/ε_t)

• Update

  D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t

  where Z_t is a normalisation factor

Output the final classifier:

H(x) = sign(Σ_{t=1}^{T} α_t h_t(x))

[Figure: the slide is animated over rounds t = 1, 2, ..., 7 and t = 40, showing the reweighted training set together with a plot of training error versus boosting step (error axis 0 to 0.35, steps 0 to 40).]
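The steps above can be sketched directly in code. A minimal reference implementation, assuming axis-aligned decision stumps as the weak-learner class H (the function names and the stump parametrisation are illustrative choices, not from the slides):

```python
import numpy as np

def adaboost(X, y, T=40):
    """Discrete AdaBoost with axis-aligned decision stumps as weak learners.

    X: (m, n) feature matrix; y: (m,) labels in {-1, +1}.
    Returns the chosen stumps (feature, threshold, polarity) and weights alpha_t.
    """
    m, n = X.shape
    D = np.full(m, 1.0 / m)                      # D_1(i) = 1/m
    stumps, alphas = [], []
    for _ in range(T):
        # Find h_t = argmin over stumps of the weighted error eps
        best = None
        for f in range(n):
            for thr in np.unique(X[:, f]):
                for pol in (+1, -1):
                    pred = pol * np.where(X[:, f] <= thr, 1, -1)
                    eps = D[pred != y].sum()
                    if best is None or eps < best[0]:
                        best = (eps, f, thr, pol)
        eps, f, thr, pol = best
        if eps >= 0.5:                           # no better than chance: stop
            break
        eps = max(eps, 1e-12)                    # guard the log against eps = 0
        alpha = 0.5 * np.log((1 - eps) / eps)    # alpha_t = 1/2 log((1-eps)/eps)
        pred = pol * np.where(X[:, f] <= thr, 1, -1)
        D = D * np.exp(-alpha * y * pred)        # reweight: up on mistakes, down on hits
        D /= D.sum()                             # divide by the normaliser Z_t
        stumps.append((f, thr, pol))
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    """H(x) = sign(sum_t alpha_t h_t(x))."""
    f_x = sum(a * pol * np.where(X[:, f] <= thr, 1, -1)
              for (f, thr, pol), a in zip(stumps, alphas))
    return np.sign(f_x)
```

The exhaustive stump search is O(m·n) thresholds per round; in practice one would sort each feature once, but the brute-force loop keeps the correspondence with the slide's argmin explicit.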


Reweighting

Effect on the training set:

D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t

exp(−α_t y_i h_t(x_i)) is  < 1 if y_i = h_t(x_i),  > 1 if y_i ≠ h_t(x_i)

⇒ The weight of wrongly (correctly) classified examples is increased (decreased)
⇒ The weight is an upper bound on the error of the given example
⇒ All information about the previously selected "features" is captured in D_t


Upper Bound Theorem

Theorem: the following upper bound holds on the training error of H:

(1/m) |{i : H(x_i) ≠ y_i}| ≤ Π_{t=1}^{T} Z_t

Proof: by unravelling the update rule,

D_{T+1}(i) = D_T(i) exp(−α_T y_i h_T(x_i)) / Z_T
           = exp(−Σ_t α_t y_i h_t(x_i)) / (m Π_t Z_t)
           = exp(−y_i f(x_i)) / (m Π_t Z_t)

If H(x_i) ≠ y_i then y_i f(x_i) ≤ 0, implying that exp(−y_i f(x_i)) ≥ 1, thus

⟦H(x_i) ≠ y_i⟧ ≤ exp(−y_i f(x_i))

(1/m) Σ_i ⟦H(x_i) ≠ y_i⟧ ≤ (1/m) Σ_i exp(−y_i f(x_i)) = Σ_i (Π_t Z_t) D_{T+1}(i) = Π_t Z_t

[Figure: the exponential loss e^{−y f(x)} plotted against the margin y f(x), upper-bounding the 0/1 error step function.]
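The bound can be checked numerically. A small sketch, assuming synthetic weak classifiers that agree with the labels on roughly 65% of the points (this 65% rate is purely an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, T = 200, 10
y = rng.choice([-1, 1], size=m)
D = np.full(m, 1.0 / m)                  # D_1(i) = 1/m
f = np.zeros(m)                          # accumulated score f(x_i)
Z_prod = 1.0
for t in range(T):
    # synthetic weak classifier: agrees with y on roughly 65% of the points
    h = np.where(rng.random(m) < 0.65, y, -y)
    eps = D[h != y].sum()                # weighted error eps_t
    alpha = 0.5 * np.log((1 - eps) / eps)
    f += alpha * h
    w = D * np.exp(-alpha * y * h)
    Z = w.sum()                          # normalisation factor Z_t
    Z_prod *= Z
    D = w / Z
train_err = np.mean(np.sign(f) != y)
# 1/m |{i : H(x_i) != y_i}| <= prod_t Z_t
assert train_err <= Z_prod
```

The identity (1/m) Σ_i exp(−y_i f(x_i)) = Π_t Z_t holds for any α_t, so the final assertion is exactly the theorem.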

Consequences of the Theorem

• Instead of minimising the training error, its upper bound can be minimised
• This can be done by minimising Z_t in each training round by:
  • choosing the optimal h_t, and
  • finding the optimal α_t
• AdaBoost can be proved to maximise the margin
• AdaBoost iteratively fits an additive logistic regression model

Choosing α_t

We attempt to minimise Z_t = Σ_i D_t(i) exp(−α_t y_i h_t(x_i)):

dZ/dα = −Σ_{i=1}^{m} D(i) y_i h(x_i) e^{−α y_i h(x_i)} = 0

−Σ_{i: y_i = h(x_i)} D(i) e^{−α} + Σ_{i: y_i ≠ h(x_i)} D(i) e^{α} = 0

−e^{−α} (1 − ε) + e^{α} ε = 0

α = (1/2) log((1 − ε)/ε)

⇒ The minimiser of the upper bound is α_t = (1/2) log((1 − ε_t)/ε_t)

Choosing h_t

Weak classifier examples

• Decision tree (or stump), perceptron – H infinite
• Selecting the best one from a given finite set H

Justification of the weighted error minimisation

Having α_t = (1/2) log((1 − ε_t)/ε_t),

Z_t = Σ_{i=1}^{m} D_t(i) e^{−α_t y_i h_t(x_i)}
    = Σ_{i: y_i = h_t(x_i)} D_t(i) e^{−α_t} + Σ_{i: y_i ≠ h_t(x_i)} D_t(i) e^{α_t}
    = (1 − ε_t) e^{−α_t} + ε_t e^{α_t}
    = 2 √(ε_t (1 − ε_t))

⇒ Z_t is minimised by selecting the h_t with minimal weighted error ε_t
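Both closed forms from the last two slides can be verified numerically: the minimiser of Z(α) = (1 − ε)e^{−α} + ε e^{α} is α* = (1/2) log((1 − ε)/ε), and the minimum value is 2√(ε(1 − ε)). A quick grid check (sketch):

```python
import numpy as np

def Z(alpha, eps):
    # Z(alpha) = (1 - eps) * e^{-alpha} + eps * e^{alpha}, the per-round bound
    return (1 - eps) * np.exp(-alpha) + eps * np.exp(alpha)

for eps in (0.1, 0.25, 0.4):
    alpha_star = 0.5 * np.log((1 - eps) / eps)        # the claimed minimiser
    grid = np.linspace(-5, 5, 100001)
    # grid minimiser agrees with the closed form ...
    assert abs(grid[np.argmin(Z(grid, eps))] - alpha_star) < 1e-3
    # ... and the minimum value is 2 sqrt(eps (1 - eps))
    assert np.isclose(Z(alpha_star, eps), 2 * np.sqrt(eps * (1 - eps)))
```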

Generalisation (Schapire & Singer 1999)

Maximising margins in AdaBoost

P_{(x,y)∼S}[y f(x) ≤ θ] ≤ 2^T Π_{t=1}^{T} √(ε_t^{1−θ} (1 − ε_t)^{1+θ})   where f(x) = (α · h(x)) / ‖α‖_1

• Choosing the h_t(x) with minimal ε_t in each step minimises this upper bound on the margin distribution
• Margins in SVMs use the L2 norm instead: (α · h(x)) / ‖α‖_2

Upper bounds based on margin

With probability 1 − δ over the random choice of the training set S,

P_{(x,y)∼D}[y f(x) ≤ 0] ≤ P_{(x,y)∼S}[y f(x) ≤ θ] + O( (1/√m) (d log²(m/d) / θ² + log(1/δ))^{1/2} )

where D is a distribution over X × {+1, −1}, and d is the pseudodimension of H.

Problem: the upper bound is very loose. In practice AdaBoost works much better.
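The empirical margin term P_{(x,y)∼S}[y f(x) ≤ θ] on the right-hand side is easy to compute from the weak-classifier outputs; a small sketch (the function name and argument layout are illustrative):

```python
import numpy as np

def margin_distribution(alphas, weak_preds, y, thetas):
    """Empirical margin CDF P_S[y f(x) <= theta], with f(x) = (a . h(x)) / ||a||_1.

    weak_preds: (T, m) array of weak-classifier outputs in {-1, +1}.
    """
    alphas = np.asarray(alphas, dtype=float)
    # L1-normalised score, so the margin y*f lies in [-1, 1]
    f = (alphas @ weak_preds) / np.abs(alphas).sum()
    margins = y * f
    return np.array([(margins <= th).mean() for th in thetas])
```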

Convergence (Friedman et al. 1998)

Proposition 1: the discrete AdaBoost algorithm minimises J(f(x)) = E(e^{−y f(x)}) by adaptive Newton updates.

Lemma: J(f(x)) is minimised at

f(x) = Σ_{t=1}^{T} α_t h_t(x) = (1/2) log( P(y = 1|x) / P(y = −1|x) )

Hence

P(y = 1|x) = e^{f(x)} / (e^{−f(x)} + e^{f(x)})   and   P(y = −1|x) = e^{−f(x)} / (e^{−f(x)} + e^{f(x)})

Additive logistic regression model:

Σ_{t=1}^{T} a_t(x) = log( P(y = 1|x) / P(y = −1|x) )

Proposition 2: by minimising J(f(x)), discrete AdaBoost fits (up to a factor of 2) an additive logistic regression model.
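The lemma gives a direct way to read class probabilities off the boosted score, since e^{f} / (e^{−f} + e^{f}) = 1 / (1 + e^{−2f}) is a sigmoid of 2f. A one-line check of the identity (the helper name is mine):

```python
import numpy as np

def prob_pos(f_x):
    # P(y = 1 | x) = e^f / (e^{-f} + e^f) = 1 / (1 + e^{-2f})
    return 1.0 / (1.0 + np.exp(-2.0 * np.asarray(f_x, dtype=float)))

# f(x) = 1/2 log(p / (1 - p)) should map back to p:
p = 0.8
f = 0.5 * np.log(p / (1 - p))
assert np.isclose(prob_pos(f), p)
```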

The Algorithm Recapitulation

Given: (x_1, y_1), ..., (x_m, y_m); x_i ∈ X, y_i ∈ {−1, +1}
Initialise weights D_1(i) = 1/m
For t = 1, ..., T:

• Find h_t = argmin_{h_j ∈ H} ε_j, where ε_j = Σ_{i=1}^{m} D_t(i) ⟦y_i ≠ h_j(x_i)⟧

• If ε_t ≥ 1/2 then stop

• Set α_t = (1/2) log((1 − ε_t)/ε_t)

• Update D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t

Output the final classifier:

H(x) = sign(Σ_{t=1}^{T} α_t h_t(x))

[Figure: the training animation is replayed for t = 1, 2, ..., 7 and t = 40, with the training-error-versus-step plot.]


AdaBoost Variants

Freund & Schapire 1995

• Discrete AdaBoost (h : X → {0, 1})
• Multiclass AdaBoost.M1 (h : X → {0, 1, ..., k})
• Multiclass AdaBoost.M2 (h : X → [0, 1]^k)
• Real-valued AdaBoost.R (Y = [0, 1], h : X → [0, 1])

Schapire & Singer 1999

• Confidence-rated prediction (h : X → R, two-class)
• Multilabel AdaBoost.MR, AdaBoost.MH (different formulations of the minimised loss)

Oza 2001

• Online AdaBoost

Many other modifications since then: cascaded AB, WaldBoost, probabilistic boosting tree, ...


Online AdaBoost

Offline

Given:
• Set of labeled training samples X = {(x_1, y_1), ..., (x_m, y_m) | y = ±1}
• Weight distribution over X: D_0 = 1/m

For t = 1, ..., T:
• Train a weak classifier using the samples and the weight distribution: h_t(x) = L(X, D_{t−1})
• Calculate the error ε_t
• Calculate the coefficient α_t from ε_t
• Update the weight distribution D_t

Output: F(x) = sign(Σ_{t=1}^{T} α_t h_t(x))

Online

Given:
• One labeled training sample (x, y) | y = ±1
• Strong classifier to update
• Initial importance λ = 1

For t = 1, ..., T:
• Update the weak classifier using the sample and the importance: h_t(x) = L(h_t, (x, y), λ)
• Update the error estimate ε_t
• Update the weight α_t based on ε_t
• Update the importance weight λ
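The online loop above can be sketched in the style of Oza's scheme: each incoming sample is replicated Poisson(λ) times for each weak learner, and λ shrinks when the learner is already right and grows when it is wrong. This is a sketch from my reading of Oza-style online boosting, not the exact algorithm from the slides; the class names and the trivial weak learner are illustrative assumptions:

```python
import numpy as np

class MajorityLearner:
    """Trivial online weak learner (illustration only): predicts the running majority label."""
    def __init__(self):
        self.count = 0
    def partial_fit(self, x, y):
        self.count += y
    def predict(self, x):
        return 1 if self.count >= 0 else -1

class OnlineBoost:
    """Sketch of Oza-style online boosting (Poisson(lambda) replication is an assumption)."""
    def __init__(self, weak_learners, rng=None):
        self.weak = weak_learners                   # need .partial_fit(x, y) and .predict(x)
        self.lam_c = np.zeros(len(weak_learners))   # importance mass classified correctly
        self.lam_w = np.zeros(len(weak_learners))   # importance mass classified wrongly
        self.rng = rng or np.random.default_rng(0)

    def update(self, x, y):
        lam = 1.0                                   # initial importance of the sample
        for t, h in enumerate(self.weak):
            for _ in range(self.rng.poisson(lam)):  # replicate sample ~ Poisson(lambda)
                h.partial_fit(x, y)
            if h.predict(x) == y:
                self.lam_c[t] += lam
                eps = self.lam_w[t] / max(self.lam_c[t] + self.lam_w[t], 1e-12)
                lam *= 1.0 / (2.0 * max(1.0 - eps, 1e-12))  # shrink: sample was easy
            else:
                self.lam_w[t] += lam
                eps = self.lam_w[t] / max(self.lam_c[t] + self.lam_w[t], 1e-12)
                lam *= 1.0 / (2.0 * max(eps, 1e-12))        # grow: sample was hard

    def predict(self, x):
        eps = self.lam_w / np.maximum(self.lam_c + self.lam_w, 1e-12)
        alpha = 0.5 * np.log(np.maximum(1 - eps, 1e-12) / np.maximum(eps, 1e-12))
        votes = np.array([h.predict(x) for h in self.weak])
        return int(np.sign(np.sum(alpha * votes)) or 1)
```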

Online AdaBoost

Converges to the offline result on the same training set as the number of iterations N → ∞.

N. Oza and S. Russell. Online Bagging and Boosting. Artificial Intelligence and Statistics, 2001.

Pros and Cons of AdaBoost

Advantages

• Very simple to implement
• General learning scheme – can be used for various learning tasks
• Feature selection on very large sets of features
• Good generalisation
• Seems not to overfit in practice (probably due to margin maximisation)

Disadvantages

• Suboptimal solution (greedy learning)

Selected references

• Y. Freund, R.E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 1997.
• R.E. Schapire, Y. Freund, P. Bartlett, W.S. Lee. Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods. The Annals of Statistics, 1998.
• R.E. Schapire, Y. Singer. Improved Boosting Algorithms Using Confidence-Rated Predictions. Machine Learning, 1999.
• J. Friedman, T. Hastie, R. Tibshirani. Additive Logistic Regression: A Statistical View of Boosting. Technical report, 1998.
• N.C. Oza. Online Ensemble Learning. PhD thesis, 2001.
• http://www.boosting.org


Thank you for your attention

(The remaining pages contain only per-frame animation residue: repeated copies of the training-error plot, x-axis 0–40 iterations, y-axis error 0–0.35, and one plot of the loss err against the margin yf(x).)