Simplex Techniques for Quantile Regression Model Selection

Yonggang Yao

SAS Institute Inc.

8/01/2010

Nonparametric Statistics — 2010 JSM, Vancouver, Canada

Outline
Background

Quantile Regression

Linear Programming

Model Selection and Simplex Tableau

Greedy Methods

Penalty Methods

Resampling

Some Computing Issues

Background

• Three formulations of quantile regression (QR) at level $\tau$:

$$\min_{\beta}\ \frac{1}{n}\sum_{i=1}^{n}\rho_\tau(y_i - x_i'\beta)$$

$$\min_{\beta}\ \frac{1}{n}\sum_{i=1}^{n}\left[\frac{|y_i - x_i'\beta|}{2} + \left(\tau - \frac{1}{2}\right)(y_i - x_i'\beta)\right]$$

$$\max_{a}\ y'a \quad \text{s.t. } X'a = (1-\tau)X'\mathbf{1}_n \text{ and } a \in [0,1]^n$$

where $\rho_\tau(t) = \tau t^{+} + (1-\tau)t^{-}$.

• Linear Programming (LP)

$$\min_{z}\ c'z \quad \text{s.t. } Az = b \text{ and } z \ge 0$$
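The first two formulations agree term by term, because $\rho_\tau(t) = \tau t^{+} + (1-\tau)t^{-} = |t|/2 + (\tau - 1/2)t$. The following is a minimal numerical check of that identity; the function names `check_loss`, `abs_form`, and `mean_check_loss` are illustrative and do not come from the slides.

```python
def check_loss(t, tau):
    """rho_tau(t) = tau * t^+ + (1 - tau) * t^-."""
    pos = max(t, 0.0)   # t^+, the positive part
    neg = max(-t, 0.0)  # t^-, the negative part
    return tau * pos + (1.0 - tau) * neg


def abs_form(t, tau):
    """Equivalent form |t| / 2 + (tau - 1/2) * t."""
    return abs(t) / 2.0 + (tau - 0.5) * t


def mean_check_loss(y, fitted, tau):
    """(1/n) * sum of rho_tau(y_i - fitted_i), the QR objective."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted)) / len(y)


tau = 0.25
residuals = [-2.0, -0.5, 0.0, 1.0, 3.0]
# Each pair holds the two forms evaluated at the same residual.
pairs = [(check_loss(t, tau), abs_form(t, tau)) for t in residuals]
```

Every pair in `pairs` holds two equal numbers, so minimizing either sum over $\beta$ gives the same fit.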

Background

• Cast QR problem as LP problem

LP (Standard Form):

$$\min_{z}\ c'z \quad \text{s.t. } Az = b \text{ and } z \ge 0$$

$$A = (X,\ -X,\ I,\ -I), \qquad b = Y,$$

$$c = \left(0',\ 0',\ \tfrac{\tau}{n}\mathbf{1}_n',\ \tfrac{1-\tau}{n}\mathbf{1}_n'\right)', \qquad z = \left(\beta^{+\prime},\ \beta^{-\prime},\ r^{+\prime},\ r^{-\prime}\right)',$$

where c and z are m-vectors with m = 2p + 2n, A is an n-by-m matrix, and $r = Y - X\beta$ is the residual vector, with $\beta = \beta^{+} - \beta^{-}$ and $r = r^{+} - r^{-}$.
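The casting can be made concrete with a small sketch that builds $A = (X, -X, I, -I)$, $b = Y$, and $c$ from a design matrix, then checks that any coefficient vector maps to a feasible $z$ whose cost $c'z$ equals the mean check loss. The helper names `qr_lp_data` and `z_from_beta` and the toy data are assumptions made for illustration.

```python
def qr_lp_data(X, Y, tau):
    """Return (A, b, c) for min c'z s.t. Az = b, z >= 0, with m = 2p + 2n."""
    n, p = len(X), len(X[0])
    A = []
    for i in range(n):
        e = [1.0 if k == i else 0.0 for k in range(n)]  # i-th row of I
        A.append(list(X[i]) + [-v for v in X[i]] + e + [-v for v in e])
    c = [0.0] * (2 * p) + [tau / n] * n + [(1.0 - tau) / n] * n
    return A, list(Y), c


def z_from_beta(X, Y, beta):
    """Split beta and the residuals Y - X beta into non-negative parts."""
    resid = [yi - sum(xij * bj for xij, bj in zip(xi, beta))
             for xi, yi in zip(X, Y)]
    return ([max(v, 0.0) for v in beta] + [max(-v, 0.0) for v in beta]
            + [max(r, 0.0) for r in resid] + [max(-r, 0.0) for r in resid])


# Toy data: intercept plus one covariate, n = 4, p = 2.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [0.5, 1.0, 3.5, 2.0]
tau = 0.5
A, b, c = qr_lp_data(X, Y, tau)
z = z_from_beta(X, Y, beta=[1.0, -0.5])
Az = [sum(a * zj for a, zj in zip(row, z)) for row in A]  # should equal b
objective = sum(cj * zj for cj, zj in zip(c, z))          # mean check loss
```

Here `Az` reproduces `Y`, so the constraint $Az = b$ only restates $Y = X\beta + r$; the optimization happens entirely through the cost on the residual parts.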

Simplex Theory

• Let $B = \{B_1, \ldots, B_n\} \subset \{1, 2, \ldots, m\}$ denote an index set, and let $A_B = [A_{B_1}, \ldots, A_{B_n}]$ denote an invertible sub-matrix of A. $z^* = [z_1^*, \ldots, z_m^*]'$ is called a basic solution if $z^*$ satisfies:

$$z_B^* = A_B^{-1} b \quad \text{and} \quad z_j^* = 0 \ \text{ for } j \in \{1, 2, \ldots, m\} \setminus B.$$

• $z^*$ is an optimal solution of the LP (Standard Form) $\min_z c'z$ s.t. $Az = b$ and $z \ge 0$ if

$$z_B^* = A_B^{-1} b \ge 0 \quad \text{and} \quad c' - c_B' A_B^{-1} A \ge 0.$$

• Simplex Tableau

$$\begin{pmatrix} c_B' z_B^* & c' - c_B' A_B^{-1} A \\ A_B^{-1} b & A_B^{-1} A \end{pmatrix}$$
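As a worked instance of these definitions, the sketch below computes the tableau blocks $A_B^{-1}b$, $A_B^{-1}A$, $c_B'z_B^*$, and $c' - c_B'A_B^{-1}A$ for a given basis and tests the two optimality conditions. The example (the median of three numbers with an intercept only) and the helper names `solve` and `tableau` are assumptions chosen for illustration; exact fractions are used so the output is free of rounding.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs column by column with Gauss-Jordan elimination.

    M is assumed invertible; rhs is a list of rows (one row per equation).
    """
    n = len(M)
    aug = [[Fraction(v) for v in M[i]] + [Fraction(v) for v in rhs[i]]
           for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def tableau(A, b, c, B):
    """Return (c_B' z_B, reduced costs, z_B, A_B^{-1} A) for the basis B."""
    n = len(A)
    A_B = [[A[i][j] for j in B] for i in range(n)]
    body = solve(A_B, [[b[i]] + list(A[i]) for i in range(n)])
    z_B = [row[0] for row in body]   # A_B^{-1} b
    T = [row[1:] for row in body]    # A_B^{-1} A
    c_B = [Fraction(c[j]) for j in B]
    objective = sum(cb * zb for cb, zb in zip(c_B, z_B))
    reduced = [Fraction(c[j]) - sum(c_B[k] * T[k][j] for k in range(n))
               for j in range(len(c))]
    return objective, reduced, z_B, T


# Median (tau = 1/2) of Y = (1, 2, 4), intercept only: p = 1, n = 3, m = 8.
# Columns (0-based): 0 beta+, 1 beta-, 2..4 r+, 5..7 r-.
A = [[1, -1, 1, 0, 0, -1, 0, 0],
     [1, -1, 0, 1, 0, 0, -1, 0],
     [1, -1, 0, 0, 1, 0, 0, -1]]
b = [1, 2, 4]
c = [0, 0] + [Fraction(1, 6)] * 6   # tau / n = (1 - tau) / n = 1/6
B = [0, 5, 4]                       # basis: beta+, r-_1, r+_3
objective, reduced, z_B, T = tableau(A, b, c, B)
is_optimal = all(v >= 0 for v in z_B) and all(d >= 0 for d in reduced)
```

The basic solution is $\beta^{+} = 2$ with residual parts 1 and 2, which is the sample median, and both optimality conditions hold for this basis.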

Model Selection and Simplex Tableau

• Cast QR model selection problem as LP problem

LP (Standard Form):

$$\min_{z}\ c'z \quad \text{s.t. } Az = b \text{ and } z \ge 0$$

$$A = (X_1,\ -X_1,\ I,\ -I), \qquad A^* = (X_2,\ -X_2), \qquad b = Y,$$

$$c = \left(0',\ 0',\ \tfrac{\tau}{n}\mathbf{1}_n',\ \tfrac{1-\tau}{n}\mathbf{1}_n'\right)', \qquad z = \left(\beta_1^{+\prime},\ \beta_1^{-\prime},\ r^{+\prime},\ r^{-\prime}\right)',$$

with $c^* = (0',\ 0')'$ the cost vector for $(\beta_2^{+\prime},\ \beta_2^{-\prime})'$, where $\beta_2$ is forced to be a zero vector.

• Simplex Tableau for Model Selection

$$\begin{pmatrix} c_B' z_B^* & c' - c_B' A_B^{-1} A & c^{*\prime} - c_B' A_B^{-1} A^* \\ A_B^{-1} b & A_B^{-1} A & A_B^{-1} A^* \end{pmatrix}$$

where $A$ is for $X_1$, and $A^*$ is for $X_2$.

• Use index set $B = \{B_1, \ldots, B_n\}$ to control a model selection process.
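One way to read the extra tableau block is through the standard simplex entering rule: a candidate column of $A^*$ with a negative reduced cost $c_j^* - c_B'A_B^{-1}A_j^*$ would lower the objective if its coefficient were released from zero. The sketch below computes those reduced costs for one candidate effect. The toy data, the helper names, and the use of this rule to rank candidates are assumptions for illustration, not necessarily the procedure used in the talk.

```python
from fractions import Fraction


def solve_transposed(A_B, c_B):
    """Solve A_B' y = c_B exactly, so that y' = c_B' A_B^{-1}."""
    n = len(A_B)
    # Row j of the augmented system is column j of A_B followed by c_B[j].
    aug = [[Fraction(A_B[i][j]) for i in range(n)] + [Fraction(c_B[j])]
           for j in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def candidate_reduced_costs(y, A_star, c_star):
    """c*_j - y' A*_j for every candidate column j."""
    n = len(A_star)
    return [Fraction(c_star[j]) - sum(y[i] * A_star[i][j] for i in range(n))
            for j in range(len(c_star))]


# Current model: intercept only, tau = 1/2, Y = (1, 2, 4).
# Its optimal basis holds the columns of beta_1+, r-_1 and r+_3.
A_B = [[1, -1, 0],
       [1, 0, 0],
       [1, 0, 1]]
c_B = [0, Fraction(1, 6), Fraction(1, 6)]
y = solve_transposed(A_B, c_B)

# Candidate effect X_2 = (0, 1, 3)' adds the zero-cost columns (X_2, -X_2).
x2 = [0, 1, 3]
A_star = [[v, -v] for v in x2]
c_star = [0, 0]
d_star = candidate_reduced_costs(y, A_star, c_star)
```

The reduced cost of the $+X_2$ column is $-1/2$, so bringing $\beta_2^{+}$ into the basis improves the fit; with these data $y = 1 + x_2$ holds exactly, which is why the candidate looks so attractive.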

Key Ideas

A model is like a port. The simplex tableau is like a cargo ship.

Evaluate models with data.

Model Selection and Simplex Tableau

• Greedy Methods (Forward, Backward, Stepwise)

• Fit Criteria

$$\mathrm{MWAR}_F(\tau) = \min_{\text{full-model}}\ \frac{1}{n}\sum_{i=1}^{n}\rho_\tau[y_i - x_i'\beta]$$

$$\mathrm{MWAR}_R(\tau) = \min_{\text{reduced-model}}\ \frac{1}{n}\sum_{i=1}^{n}\rho_\tau[y_i - x_i'\beta]$$

$$R^1(\tau) = 1 - \frac{\mathrm{MWAR}_F(\tau)}{\mathrm{MWAR}_R(\tau)} \quad \text{vs.} \quad R^2$$

$$\mathrm{AIC}(\tau) = 2n\log(\mathrm{MWAR}(\tau)) + 2p$$

$$\mathrm{AICC}(\tau) = 2n\log(\mathrm{MWAR}(\tau)) + \frac{2(p+1)n}{n-p-2}$$

$$\mathrm{SIC}(\tau) = 2n\log(\mathrm{MWAR}(\tau)) + p\log n$$

Sawa's BIC$(\tau) = 2n\log(\mathrm{MWAR}(\tau))$ + [penalty term illegible in source]

Likelihood ratio, Wald score, Rank score

Simulation

True Model: p=20 and n=1000

$$y = [\text{six covariate terms; indices and coefficients illegible in source}] + e$$

$$x_j \sim \text{i.i.d. } \mathrm{Unif}(0,1), \qquad e \sim \text{i.i.d. } N(0, 25)$$

Simulation

Forward Selection Summary at quantile level 0.5

Step  Effect     Objective  p-value          QRR     ADJQRR       AIC       SIC
      entered    function   (Wald Scores)
--------------------------------------------------------------------------------
0     Intercept  0.875704                 0.000000  -0.000500  -131.727  -125.819
1     x10        0.858893   0.00001       0.019198   0.018707  -151.111  -145.203
2     x15        0.850475   0.00001       0.028811   0.027838  -159.960  -148.145
3     x12        0.842748   0.00001       0.037634   0.036187  -168.087  -150.364
4     x1         0.837204   0.00006       0.043965   0.042047  -173.687  -150.056*
5     x5         0.832609   0.00058       0.049213   0.046827  -178.192  -148.653
6     x18        0.830066   0.00138       0.052117   0.049260  -180.250  -144.804

* Optimal Value Of Criterion

Selection stopped as the candidate for entry has p-value > 0.1.

Stop Details

Candidate For  Effect  Candidate Significance     Compare Significance
Entry          x3      0.13957                 >  0.1000 (p-value on Wald Score)
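The QRR column can be reproduced from the objective-function column with $R^1 = 1 - \mathrm{MWAR}_F / \mathrm{MWAR}_R$. The short check below assumes that the reported objective is the MWAR of each step's model and that the reduced model is the intercept-only fit of step 0; both are readings of the output rather than statements from the slides.

```python
def r1(mwar_full, mwar_reduced):
    """R^1 = 1 - MWAR(full) / MWAR(reduced)."""
    return 1.0 - mwar_full / mwar_reduced


# Objective function and QRR columns of the forward selection summary, steps 0-6.
objective = [0.875704, 0.858893, 0.850475, 0.842748,
             0.837204, 0.832609, 0.830066]
reported_qrr = [0.000000, 0.019198, 0.028811, 0.037634,
                0.043965, 0.049213, 0.052117]

# Step 0 (intercept only) plays the role of the reduced model.
computed_qrr = [r1(v, objective[0]) for v in objective]
max_gap = max(abs(a - b) for a, b in zip(computed_qrr, reported_qrr))
```

The largest gap between computed and reported values is on the order of the rounding in the printed table, which supports reading QRR as $R^1$ relative to the intercept-only model.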

Simulation

Effects: Intercept x10 x15 x12 x1 x5 x18

Parameter Estimates at quantile level 0.5

Parameter  DF  Estimate  Standard  95% Confidence    t Value  Pr > |t|
                         Error     Limits
Intercept   1   -0.0519    0.3907  -0.8186   0.7148    -0.13    0.8944
x10         1    1.3327    0.3115   0.7215   1.9439     4.28    <.0001
x15         1    1.2865    0.3086   0.6809   1.8921     4.17    <.0001
x12         1    1.1219    0.3066   0.5203   1.7235     3.66    0.0003
x1          1    0.8864    0.3010   0.2957   1.4770     2.94    0.0033
x5          1    0.7483    0.3030   0.1536   1.3430     2.47    0.0137
x18         1    0.6918    0.3021   0.0990   1.2845     2.29    0.0222

Model Selection and Simplex Tableau

• Penalty Methods

LASSO penalty, OSCAR penalty, Grouped LASSO penalty

• Manipulating Simplex Algorithm for Penalty Methods

For example, the LASSO penalty can be measured by using a vector $a$ as follows:

LP (Parametric-Cost Form):

$$\min_{z}\ c'z + \lambda a'z \quad \text{s.t. } Az = b \text{ and } z \ge 0$$

$$\begin{pmatrix} c_B' z_B^* & c' - c_B' A_B^{-1} A \\ a_B' z_B^* & a' - a_B' A_B^{-1} A \\ A_B^{-1} b & A_B^{-1} A \end{pmatrix}$$

where $a' = (\mathbf{1}',\ \mathbf{1}',\ 0',\ 0')$ according to $z = (\beta^{+\prime},\ \beta^{-\prime},\ r^{+\prime},\ r^{-\prime})'$.
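The role of the vector $a$ can be seen in a few lines: with ones on the $\beta^{+}$ and $\beta^{-}$ positions and zeros on the residual positions, $a'z$ equals $\sum_j |\beta_j|$, so $c'z + \lambda a'z$ is the LASSO-penalized QR objective. The sketch below is illustrative only; the function names, the toy values, and the symbol `lam` for $\lambda$ are assumptions.

```python
def split(values):
    """Stack the positive parts and then the negative parts of a vector."""
    return [max(v, 0.0) for v in values] + [max(-v, 0.0) for v in values]


def lasso_vector(p, n):
    """a = (1, 1, 0, 0): ones for beta+ and beta-, zeros for r+ and r-."""
    return [1.0] * (2 * p) + [0.0] * (2 * n)


def penalized_objective(beta, resid, tau, lam):
    """Return (c'z + lam * a'z, a'z) for z built from beta and the residuals."""
    n, p = len(resid), len(beta)
    z = split(beta) + split(resid)
    c = [0.0] * (2 * p) + [tau / n] * n + [(1.0 - tau) / n] * n
    a = lasso_vector(p, n)
    cz = sum(ci * zi for ci, zi in zip(c, z))
    az = sum(ai * zi for ai, zi in zip(a, z))
    return cz + lam * az, az


beta = [2.0, -3.0, 0.0]
resid = [1.0, -1.0, 0.5, -0.5]
value, l1_norm = penalized_objective(beta, resid, tau=0.5, lam=0.1)
```

For these inputs `l1_norm` is 5, the sum of absolute coefficients, and it enters the objective linearly in `lam`, which is what lets a single tableau with an extra cost row track the whole family of penalized problems.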

Simulation

True Model: p=11 and n=1000.

$$y = \sum_{g=1}^{3}\sum_{i=1}^{p_g} x_{gi}\beta_{gi} + e$$

$$\text{True } \beta = ((2,3,2),\ (0,0,0,0),\ (-3,2,-2))$$

$$X = (x_1, \ldots, x_{10}) \sim \text{i.i.d. } N(0,1), \qquad e \sim \text{i.i.d. } N(0,50)$$
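A data set of this shape can be generated with the standard library alone. The sketch below assumes that the second argument of $N(0, 50)$ is a variance, that the intercept (presumably the eleventh parameter) is zero in the true model, and that the three groups have sizes 3, 4, and 3; the seed and the function name `simulate` are arbitrary choices.

```python
import random

# Grouped true coefficients: three groups of sizes 3, 4 and 3.
TRUE_BETA = [(2.0, 3.0, 2.0), (0.0, 0.0, 0.0, 0.0), (-3.0, 2.0, -2.0)]


def simulate(n, error_variance=50.0, seed=2010):
    """Draw n rows with x_j ~ N(0, 1) and y = x' beta + e, e ~ N(0, variance)."""
    rng = random.Random(seed)
    flat_beta = [b for group in TRUE_BETA for b in group]
    X, Y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in flat_beta]
        e = rng.gauss(0.0, error_variance ** 0.5)  # gauss takes a std. deviation
        X.append(x)
        Y.append(sum(xj * bj for xj, bj in zip(x, flat_beta)) + e)
    return X, Y


X, Y = simulate(n=1000)
```

If the slide's 50 was meant as a standard deviation instead, passing `error_variance=2500.0` gives the corresponding noise level.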

Solution Path for LASSO QR

Penalty: $\sum_{i=1}^{p} |\beta_i| \le s$.

[Figure: solution path for LASSO QR]

True $\beta$ = ((2,3,2), (0,0,0,0), (-3,2,-2))

Solution Path for OSCAR QR

[Figure: solution path for OSCAR QR]

True $\beta$ = ((2,3,2), (0,0,0,0), (-3,2,-2))

Solution Path for Grouped-LASSO QR

Penalty: $\sum_{g=1}^{G} \max\{|\beta_{g1}|, \ldots, |\beta_{gp_g}|\} \le s$.

[Figure: solution path for Grouped-LASSO QR]

True $\beta$ = ((2,3,2), (0,0,0,0), (-3,2,-2))
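To make the two penalty definitions concrete, the sketch below evaluates the LASSO penalty $\sum_i |\beta_i|$ and the grouped penalty $\sum_g \max_i |\beta_{gi}|$ at the true coefficient vector of the simulation. The function names are illustrative; the OSCAR penalty is left out because its formula does not appear on the slides.

```python
def lasso_penalty(groups):
    """Sum of absolute coefficients over all groups."""
    return sum(abs(b) for group in groups for b in group)


def grouped_lasso_penalty(groups):
    """Sum over groups of the largest absolute coefficient in each group."""
    return sum(max(abs(b) for b in group) for group in groups)


true_beta = [(2, 3, 2), (0, 0, 0, 0), (-3, 2, -2)]
lasso_value = lasso_penalty(true_beta)            # 2+3+2 + 0 + 3+2+2
grouped_value = grouped_lasso_penalty(true_beta)  # 3 + 0 + 3
```

The grouped penalty charges each group only for its largest coefficient, so once a group is active its remaining coefficients can grow up to that maximum at no extra cost; this is the mechanism by which whole groups tend to enter or leave the model together.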

Applications of Simplex Techniques

• Resampling

Cross-validation, Bootstrap

• Manipulating Simplex Algorithm for Resampling

1. Check whether an observation is active for a fitted model.
2. Drive out some observations by changing the objective function.
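One possible reading of step 2 is that a resample changes only the cost vector $c$, while $A$ and $b$ stay fixed: giving an observation weight zero makes its residual free, so it no longer influences the fit, and bootstrap multiplicities act as integer weights. The sketch below illustrates that reading. The weighting scheme, the normalization by the total weight, and the function names are assumptions, not details taken from the slides.

```python
def weighted_cost(p, n, tau, weights):
    """Cost vector c with per-observation weights on the r+ and r- columns."""
    total = float(sum(weights))
    pos = [tau * w / total for w in weights]          # costs of r+
    neg = [(1.0 - tau) * w / total for w in weights]  # costs of r-
    return [0.0] * (2 * p) + pos + neg


def weighted_objective(resid, tau, weights):
    """Weighted mean check loss of the residuals."""
    total = float(sum(weights))
    return sum(w * (tau * max(r, 0.0) + (1.0 - tau) * max(-r, 0.0))
               for r, w in zip(resid, weights)) / total


n, p, tau = 5, 2, 0.5
leave_out_third = [1, 1, 0, 1, 1]   # cross-validation: drop observation 3
bootstrap_counts = [2, 0, 1, 0, 2]  # bootstrap: multiplicities in one resample
c_cv = weighted_cost(p, n, tau, leave_out_third)
c_boot = weighted_cost(p, n, tau, bootstrap_counts)

# The dropped observation's residual (100.0) has no effect on the objective.
value = weighted_objective([1.0, -1.0, 100.0, 2.0, -2.0], tau, leave_out_third)
```

Because only $c$ changes, the current basis stays a basic solution of the new problem, so in principle the simplex iterations can restart from the previous tableau and re-optimize rather than refit from scratch.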

Key Ideas

Simplex Tableau can be used to:

• update an optimal partial model to another optimal partial model or full model.

• add extra constraints on a model.

• update an optimal model on a subset of a dataset to the optimal model on another subset of the dataset.

Computational Goals

• High-performance computing

• Massive data processing

• Re-usable programs

Parallel Computing

• Parallel computation can expedite the tableau simplex algorithm on:

1. Building initial tableau

2. Sorting/Ordering positive tableau rows

3. Changing the signs of tableau rows

4. Pivot Updating
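Pivot updating parallelizes naturally because, once the pivot row has been scaled, every other row is updated independently of the rest. The sketch below shows that structure with a thread pool over tableau rows. The small tableau and the function name `pivot` are illustrative assumptions; in CPython, threads mainly demonstrate the independence of the row updates rather than deliver a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def pivot(tableau, r, s, workers=4):
    """Pivot on entry (r, s) and return the updated tableau.

    Each row update reads only its own row and the scaled pivot row,
    so the rows can be processed concurrently.
    """
    pivot_row = [v / tableau[r][s] for v in tableau[r]]

    def update(i):
        if i == r:
            return pivot_row
        f = tableau[i][s]
        return [v - f * pv for v, pv in zip(tableau[i], pivot_row)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update, range(len(tableau))))


# Small illustrative tableau: first row is the cost row, first column the RHS.
T = [[Fraction(v) for v in row] for row in [
    [0, -3, -2, 0, 0],
    [4, 1, 1, 1, 0],
    [6, 2, 1, 0, 1],
]]
T1 = pivot(T, r=2, s=1)
pivot_column = [row[1] for row in T1]  # becomes a unit vector after the pivot
```

After the pivot, column 1 is the unit vector with its one in row 2, which is the usual sign that the corresponding variable has entered the basis.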

Reference

• Chen, C. and Wei, Y. (2005), Computational Issues for Quantile Regression, Sankhyā: The Indian Journal of Statistics, (67), pp.399-417.

• Koenker, R. (2005), Quantile Regression, Cambridge University Press.

• Koenker, R. and Machado, J.A.F. (1999), Goodness of fit and related inference processes for quantile regression, Journal of the American Statistical Association, (94), pp.1296-1310.

• Li, Y. and Zhu, J. (2008), L1-norm quantile regression, Journal of Computational & Graphical Statistics, (17), pp.163-185.

• Sawa, T. (1978), Information criteria for discriminating among alternative regression models, Econometrica, (46), pp.1273-1282.

• Schwarz, G. (1978), Estimating the dimension of a model, Annals of Statistics, (6), pp.461-464.

• Yao, Y. and Lee, Y. (2007), Another look at linear programming for feature selection via methods of regularization, Technical Report No. 800, Department of Statistics, The Ohio State University.