
Page 1

Bagging

LING 572

Fei Xia

1/24/06

Page 2

Ensemble methods

• So far, we have covered several learning methods: FSA, HMM, DT, DL, TBL.

• Question: how to improve results?

• One solution: generating and combining multiple predictors
  – Bagging: bootstrap aggregating
  – Boosting
  – …

Page 3

Outline

• An introduction to the bootstrap

• Bagging: basic concepts (Breiman, 1996)

• Case study: bagging a treebank parser (Henderson and Brill, ANLP 2000)

Page 4

Introduction to the bootstrap

Page 5

Motivation

• What is the average house price? Let F be the (unknown) population distribution of prices.

• From F, get a sample x = (x1, x2, …, xn), and calculate the average u.

• Question: how reliable is u? What is the standard error of u? What is the confidence interval?
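For the sample mean, a classical closed-form estimate of the standard error exists:

$$\widehat{se}(u) = \frac{s}{\sqrt{n}}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - u)^2.$$

For most other statistics there is no such formula, which is what motivates the bootstrap on the following slides.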

Page 6

Solutions

• One possibility: get several samples from F.

• Problem: it is often impossible (or too expensive) to get multiple samples.

• Solution: bootstrap

Page 7

The general bootstrap algorithm

Let the original sample be L = (x1, x2, …, xn).

• Repeat B times:
  – Generate a sample Lk of size n from L by sampling with replacement.
  – Compute the statistic of interest $\hat{\theta}^*_k$ on Lk.

• Now we end up with B bootstrap values $\hat{\theta}^* = (\hat{\theta}^*_1, \ldots, \hat{\theta}^*_B)$.

• Use these values for calculating all the quantities of interest (e.g., standard deviation, confidence intervals).
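A minimal sketch of this algorithm in Python (the function name and B = 1000 are our choices; the data are from the example on the next slide):

```python
import random
import statistics

def bootstrap(sample, statistic, B=1000):
    """Return B bootstrap replicates of `statistic`, each computed on a
    size-n resample of `sample` drawn with replacement."""
    n = len(sample)
    return [statistic([random.choice(sample) for _ in range(n)])
            for _ in range(B)]

# Estimate the standard error of the mean for the house-price sample.
x = [3.12, 0, 1.57, 19.67, 0.22, 2.20]
replicates = bootstrap(x, statistics.mean)
print("bootstrap standard error of the mean:", statistics.stdev(replicates))
```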

Page 8

An example

Original sample: X = (3.12, 0, 1.57, 19.67, 0.22, 2.20), mean = 4.46

Bootstrap samples:

• X1 = (1.57, 0.22, 19.67, 0, 0, 2.2, 3.12), mean = 4.13

• X2 = (0, 2.20, 2.20, 2.20, 19.67, 1.57), mean = 4.64

• X3 = (0.22, 3.12, 1.57, 3.12, 2.20, 0.22), mean = 1.74

Page 9

A quick view of bootstrapping

• Introduced by Bradley Efron in 1979.

• Named after the phrase “to pull oneself up by one’s bootstraps”, which is widely believed to come from The Adventures of Baron Munchausen.

• Popularized in the 1980s by the introduction of computers into statistical practice.

• It has a strong mathematical foundation.

• It is well known as a method for estimating standard errors and bias, and for constructing confidence intervals for parameters.

Page 10

Bootstrap distribution

• The bootstrap does not replace or add to the original data.

• We use the bootstrap distribution as a way to estimate the variation in a statistic computed from the original data.

Page 11

Sampling distribution vs. bootstrap distribution

• The population has certain unknown quantities of interest (e.g., the mean).

• Multiple samples → sampling distribution

• Bootstrapping:
  – One original sample → B bootstrap samples
  – B bootstrap samples → bootstrap distribution

Page 12

• Bootstrap distributions usually approximate the shape, spread, and bias of the actual sampling distribution.

• Bootstrap distributions are centered at the value of the statistic from the original sample plus any bias.

• The sampling distribution is centered at the value of the parameter in the population, plus any bias.

Page 13

Cases where bootstrap does not apply

• Small data sets: the original sample is not a good approximation of the population

• Dirty data: outliers add variability to our estimates.

• Dependence structures (e.g., time series, spatial problems): the bootstrap is based on the assumption of independence.

• …

Page 14

How many bootstrap samples are needed?

The choice of B depends on:

• Available computing resources

• The type of problem: standard errors, confidence intervals, …

• The complexity of the problem

Page 15

Resampling methods

• Bootstrap

• Permutation tests

• Jackknife: leave out one observation at a time (a minimal sketch follows this list)

• …
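A minimal sketch of the jackknife, for contrast with the bootstrap code above (the function name is ours):

```python
def jackknife(sample, statistic):
    """Leave-one-out replicates: compute `statistic` on the sample with
    each observation removed in turn (n replicates for n observations)."""
    return [statistic(sample[:i] + sample[i + 1:]) for i in range(len(sample))]
```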

Page 16

Bagging: basic concepts

Page 17

Bagging

• Introduced by Breiman (1996)

• “Bagging” stands for “bootstrap aggregating”.

• It is an ensemble method: a method of combining multiple predictors.

Page 18

Predictors

• Let L be a training set {(xi, yi) | xi in X, yi in Y}, drawn from the set Λ of possible training sets.

• A predictor Φ: X → Y is a function that, for any given x, produces y = Φ(x).

• A learning algorithm Ψ maps training sets to predictors: given any L in Λ, it produces a predictor Φ = Ψ(L). (See the type sketch after this list.)

• Types of predictors:
  – Classifiers: DTs, DLs, TBLs, …
  – Estimators: regression trees
  – Others: parsers
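These definitions can be written down as Python type aliases (a sketch; the names are illustrative, not from the original slides):

```python
from typing import Callable, List, Tuple, TypeVar

X = TypeVar("X")  # input space
Y = TypeVar("Y")  # output space

Predictor = Callable[[X], Y]                  # Φ: X → Y
TrainingSet = List[Tuple[X, Y]]               # an L drawn from Λ
Learner = Callable[[TrainingSet], Predictor]  # Ψ: training set → predictor
```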

Page 19

Bagging algorithm

Let the original training data be L.

• Repeat B times:
  – Get a bootstrap sample Lk from L.
  – Train a predictor using Lk.

• Combine the B predictors by:
  – Voting (for classification problems)
  – Averaging (for estimation problems)
  – …
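A minimal sketch of bagging in Python, continuing the bootstrap code above (the `train` interface, which takes a list of (x, y) pairs and returns a predictor function, is our assumption):

```python
import random
from collections import Counter

def bag(train, data, B=25):
    """Train B predictors, each on a bootstrap sample of `data`.
    `train` maps a list of (x, y) pairs to a predictor function."""
    n = len(data)
    return [train([random.choice(data) for _ in range(n)]) for _ in range(B)]

def vote(predictors, x):
    """Combine classifiers by majority vote."""
    return Counter(p(x) for p in predictors).most_common(1)[0][0]

def average(predictors, x):
    """Combine estimators by averaging."""
    return sum(p(x) for p in predictors) / len(predictors)
```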

Page 20

Bagging decision trees

1. Split the data set into a training set T1 and a test set T2.
2. Bag decision trees using 50 bootstrap samples.
3. Repeat steps 1–2 100 times, and calculate the average test-set misclassification rate.
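The same protocol can be sketched with scikit-learn (not what Breiman used; synthetic data stands in for the real data sets):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_classes=3, n_informative=5)
rates = []
for trial in range(100):                                   # step 3: 100 repetitions
    X1, X2, y1, y2 = train_test_split(X, y, random_state=trial)  # step 1: T1/T2 split
    bagger = BaggingClassifier(n_estimators=50)            # step 2: 50 bootstrap samples,
    bagger.fit(X1, y1)                                     # decision trees by default
    rates.append(1 - bagger.score(X2, y2))                 # test-set misclassification rate
print("average misclassification rate:", np.mean(rates))
```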

Page 21

Bagging regression trees

Bagging with 25 bootstrap samples. Repeat 100 times.

Page 22

How many bootstrap samples are needed?

Bagging decision trees for the waveform task:

• The unbagged misclassification rate is 29.0%.

• Most of the improvement is obtained with only 10 bootstrap samples.

Page 23

Bagging k-nearest neighbor classifiers

100 bootstrap samples, 100 iterations. Bagging does not help.

Page 24

Experiment results

• Bagging works well for “unstable” learning algorithms.

• Bagging can slightly degrade the performance of “stable” learning algorithms.

Page 25

Learning algorithms

• Unstable learning algorithms: small changes in the training set result in large changes in predictions.
  – Neural networks
  – Decision trees
  – Regression trees
  – Subset selection in linear regression

• Stable learning algorithms:
  – k-nearest neighbors

Page 26

Case study

Page 27

Experiment settings

• Henderson and Brill's ANLP-2000 paper

• Parser: Collins’s Model 2 (1997)

• Training data: sections 01-21

• Test data: section 23

• Bagging:
  – Different ways of combining parsing results

Page 28

Techniques for combining parsers (Henderson and Brill, EMNLP-1999)

• Parse hybridization: combining the substructures of the input parses
  – Constituent voting (a sketch follows this list)
  – Naïve Bayes

• Parser switching: selecting one of the input parses
  – Similarity switching
  – Naïve Bayes
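A hypothetical sketch of constituent voting; representing a constituent as a (label, start, end) triple and the function name are our assumptions, and assembling the surviving constituents into a tree is omitted:

```python
from collections import Counter

def constituent_voting(parses):
    """Keep a constituent (label, start, end) for the hybrid parse iff it
    occurs in more than half of the input parses."""
    counts = Counter(c for parse in parses for c in set(parse))
    majority = len(parses) / 2
    return {c for c, n in counts.items() if n > majority}
```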

Page 29

Experiment results

• Baseline (no bagging): 88.63
• Initial (one bag): 88.38
• Final (15 bags): 89.17

Page 30

Training corpus size effects

Page 31

Summary

• The bootstrap is a resampling method.

• Bagging is directly related to the bootstrap:
  – It uses bootstrap samples to train multiple predictors.
  – The outputs of the predictors are combined by voting or other methods.

• Experimental results:
  – Bagging is effective for unstable learning methods.
  – It does not help stable learning methods.

Page 32

Uncovered issues

• How do we determine whether a learning method is stable or unstable?

• Why does bagging work for unstable algorithms?