Alexis Boukouvalas Work in collaboration with D. M. Maniyar and D. Cornford Managing Uncertainty in Complex Models, Aston University


TRANSCRIPT

Page 1:

Alexis Boukouvalas
Work in collaboration with D. M. Maniyar and D. Cornford
Managing Uncertainty in Complex Models, Aston University

Page 2:

Develop methods for dimensionality reduction of the input and/or output space of models.

To gain an initial understanding, compare existing methods on a toy dataset.

Later, apply the methods to real-world models.

The goal is to extend the methods to work with a large number of variables, on the order of 10^5.

Page 3:

Feature Selection (also known as screening in the statistical literature)
• Select the p most relevant of the original k variables.
• The meaning of the variables is preserved, so the results are interpretable.

Projective methods
• Variables are transformed: X' = F(X).
• Transformations can be linear or non-linear.
• Interpretation is non-trivial, especially for non-linear mappings.
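
To make the distinction concrete, here is a minimal sketch in Python (scikit-learn); the data, the SelectKBest/PCA choices and all sizes are illustrative assumptions rather than anything from the slides:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 6))               # k = 6 original variables
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]      # output depends only on variables 0-2

# Feature selection / screening: keep p = 3 of the original columns.
# The retained columns keep their original meaning, so the result is interpretable.
selector = SelectKBest(f_regression, k=3).fit(X, y)
print("selected columns:", selector.get_support(indices=True))

# Projective method: X' = F(X), here a linear PCA projection.
# Each new coordinate mixes all inputs, so interpretation is non-trivial.
X_proj = PCA(n_components=3).fit_transform(X)
print("projected shape:", X_proj.shape)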

Page 4:

• Generate N base vectors x of dimensionality d by sampling a Latin hypercube. Normalize the data.
• Evaluate the generative model g(·).
• Corrupt the model output with independent, identically distributed Gaussian noise. Initially the noise variance is set to 0.1 × the signal variance.
• [Screening] Augment with extra noise dimensions e = Bx + input noise. The input noise is always N(0, I); the B matrix is described on the next slide.
• [Projection] Project to a higher-dimensional space using x' = W·F(x).
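
A minimal sketch of this generation recipe (Python with NumPy/SciPy; the model g(·), the seeds, and all sizes are placeholder assumptions):

import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(0)
N, d, n_noise = 200, 3, 3                    # observations, model dims, extra noise dims

# 1. N base vectors from a Latin hypercube, then normalise.
X = qmc.LatinHypercube(d=d, seed=0).random(N)
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Evaluate the generative model g(.) -- a placeholder function here.
def g(X):
    return np.sin(X[:, 0]) + X[:, 1] ** 2 + X[:, 2]

y = g(X)

# 3. Corrupt the output with i.i.d. Gaussian noise, variance = 0.1 * signal variance.
y_noisy = y + rng.normal(scale=np.sqrt(0.1 * y.var()), size=N)

# 4. [Screening] Augment with extra noise dimensions e = Bx + input noise, input noise ~ N(0, I).
B = np.zeros((n_noise, d))                   # B = 0 case: noise uncorrelated with the model inputs
E = X @ B.T + rng.normal(size=(N, n_noise))
X_screened = np.hstack([X, E])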

Page 5:

[Screening] The B matrix determines the correlation between the noise and model variables. Three variants are used:
• B = 0 constructs noise variables that are uncorrelated with the model variables.
• k randomly selected rows have a single non-zero entry, corresponding to the noise variable being linearly correlated with a single model variable. Currently k = 0.5 × #noise variables and the coefficient is set to 0.5.
• Same as the previous variant, but two elements of the k rows are non-zero, with k = 0.8 × #noise variables and coefficients drawn at random from the set {-0.2, -0.5, +0.5, +0.7}.
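
A sketch of how the two correlated variants of B could be built (NumPy; the sizes and the particular row/column draws are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
d, n_noise = 3, 10

# Single-entry variant: k rows get one non-zero coefficient of 0.5, k = 0.5 * #noise variables.
B = np.zeros((n_noise, d))
for r in rng.choice(n_noise, size=int(0.5 * n_noise), replace=False):
    B[r, rng.integers(d)] = 0.5

# Two-entry variant: two non-zero coefficients per selected row, drawn from {-0.2, -0.5, +0.5, +0.7}
# (k = 0.8 is read here as 0.8 * #noise variables).
B2 = np.zeros((n_noise, d))
for r in rng.choice(n_noise, size=int(0.8 * n_noise), replace=False):
    cols = rng.choice(d, size=2, replace=False)
    B2[r, cols] = rng.choice([-0.2, -0.5, 0.5, 0.7], size=2)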

Page 6:

[Projection] Project into a higher-dimensional space of dimension q:

x' = W·F(x)

where W is a q × d weight matrix and F(·) are basis functions responsible for the projection mapping. A typical choice of projection mapping is Radial Basis Functions (RBF).
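
A small sketch of such a projection (NumPy; the Gaussian RBF form, the randomly placed centres, the width, and the random W are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
N, d, q = 200, 3, 20
X = rng.uniform(size=(N, d))

centres = rng.uniform(size=(d, d))           # d Gaussian RBF centres, so F(x) is d-dimensional as on the slide
width = 0.5                                  # shared RBF width

def F(X):
    # Gaussian RBF responses, shape (N, d)
    sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2 * width ** 2))

W = rng.normal(size=(q, d))                  # q x d weight matrix
X_high = F(X) @ W.T                          # x' = W F(x), shape (N, q)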

Page 7:

Different noise models: correlated, multiplicative, and non-linear interactions of noise variables with model variables.

Mix screening and projection.
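
One way these noise variants could look in code; the exact functional forms below are assumptions made for illustration, not taken from the slides:

import numpy as np

rng = np.random.default_rng(0)
N, d, n_noise = 200, 3, 3
X = rng.uniform(size=(N, d))

# Correlated noise: extra dimensions drawn from N(0, Sigma) with a non-diagonal Sigma.
Sigma = 0.5 * np.eye(n_noise) + 0.5 * np.ones((n_noise, n_noise))
E_corr = rng.multivariate_normal(np.zeros(n_noise), Sigma, size=N)

# Multiplicative noise: noise amplitude scales with the model variables.
E_mult = X[:, :n_noise] * rng.normal(size=(N, n_noise))

# Non-linear interaction: noise modulated by a non-linear function of a model variable.
E_nonlin = np.sin(X[:, [0]]) * rng.normal(size=(N, n_noise))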

Page 8:

Variable selection methods can be broadly divided into three classes:

Variable ranking. Input variables are ranked according to the prediction accuracy of each individual input, calculated against the model output.

Wrapper methods. The emulator is used to assess the predictive power of subsets of variables.

Embedded methods. For both variable ranking and wrapper methods, the emulator is treated as a perfect black box. In embedded methods, variable selection is done as part of the training of the emulator.

Page 9:

Forward selection, where variables are progressively incorporated into larger and larger subsets (a sketch follows this list).

Backward elimination proceeds in the opposite direction.

Efroymson’s algorithm, also known as stepwise selection. Proceeds as forward selection, but after each variable is added, checks whether any of the selected variables can be deleted without significantly increasing the residual sum of squares (RSS).

Exhaustive search where all possible subsets are considered.

Branch and bound. Eliminate subset choices as early as possible. For example, with variables A–Z, if the RSS of the subset {A, B} is 100, then the branch of subsets drawn only from C–Z need not be followed if the RSS obtained using all of the variables C–Z is greater than 100.
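
A minimal sketch of forward selection scored with a linear model (Python/scikit-learn; using the training RSS as the selection criterion is a simplifying assumption):

import numpy as np
from sklearn.linear_model import LinearRegression

def forward_selection(X, y, n_select):
    """Greedily add, at each step, the variable giving the largest reduction in RSS."""
    selected, remaining = [], list(range(X.shape[1]))

    def rss(cols):
        model = LinearRegression().fit(X[:, cols], y)
        return ((y - model.predict(X[:, cols])) ** 2).sum()

    for _ in range(n_select):
        best = min(remaining, key=lambda j: rss(selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected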

Page 10:

An embedded method commonly employed in the context of Gaussian Processes is Automatic Relevance Determination (ARD), where the characteristic length scale of each input determines its relevance.
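
A sketch of ARD-based ranking using an anisotropic RBF kernel in scikit-learn (the slides do not say which GP implementation was used, so this library choice is an assumption):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def ard_ranking(X, y):
    """Fit a GP with one length scale per input; short length scales indicate relevant inputs."""
    kernel = RBF(length_scale=np.ones(X.shape[1])) + WhiteKernel()
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    lengths = gp.kernel_.k1.length_scale      # fitted per-dimension length scales
    return np.argsort(lengths)                # input indices, most relevant first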

Page 11:

The following algorithms were used in the experiments:

• BaseRelevant: Baseline run using the relevant dimensions only. The RMSE was obtained by training a GP on the relevant dimensions. This value can be interpreted as the optimal RMSE value.

• BaseAll: Baseline run using all the dimensions, i.e. relevant + extra. Again the RMSE was obtained by training a GP on this set. The difference BaseAll-BaseRelevant is a measure of the effect of the extra variables on the predictive accuracy of the GP.

• CorrCoef: Pearson Correlation Coefficient. A variable ranking is performed using formula (10) and the top 3 variables are selected and used to train a GP (a sketch of this ranking step follows the list).

• LinFS: Employ a forward-selection subset strategy using a multivariate linear regression model. The RMSE is obtained by evaluating the selected subset with a multiple linear regression model.

• GPFS: Again employ forward selection to generate subsets but use a GP rather than a linear model.

• ARD: Employ the ARD method to rank the input variables and select the top 3 to train a GP model.
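
A sketch of the CorrCoef ranking step; the exact ranking formula ("formula (10)") comes from the cited feature-selection literature, so the plain Pearson correlation is used here as a stand-in:

import numpy as np

def corrcoef_ranking(X, y, top=3):
    """Rank inputs by absolute Pearson correlation with the output and return the top indices."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(-np.abs(r))[:top]

A GP would then be trained on the selected columns X[:, corrcoef_ranking(X, y)] to obtain the reported RMSE.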

Page 12:

200 observations, 3 model dimensions, 6 total

Algorithm       Variables Selected    RMSE     Elapsed time
BaseRelevant    1,2,3                 0.9128   1.44142
BaseAll         1,2,3,4,5,6           1.0473   1.60529
CorrCoef        1,4,2 (,3,5,6)        2.1642   1.50487
LinFS           1,4,2                 2.7803   0.134283
GPFS            1,2,3                 0.9092   18.2017
ARD             1,2,3                 0.9134   5.56684

Page 13:

200 observations, 3 model dimensions, 6 total

Algorithm       Variables Selected    RMSE     Elapsed time
BaseRelevant    1,2,3                 0.9111   1.42363
BaseAll         1,2,3,4,5,6           1.0633   1.66093
CorrCoef        1,4,5 (,2,6,3)        2.6794   1.31676
LinFS           1,4,6                 2.8083   0.143308
GPFS            1,2,3                 0.9274   19.0051
ARD             1,2,3                 1.0076   5.0611

Page 14:

Initial results for high-dimensional input in the two-correlated noise case: 100 model inputs, 500 noise dimensions, 500 observations.

Length     Input Number
31.8373    361
18.7081    501
14.2097    296
12.7581    51
12.3160    456
11.8689    496
11.3176    166
10.2424    310
10.2220    420
9.6192     325
9.0732     363
8.6898     53
8.5453     347
7.9338     419
7.8201     294
7.8017     188
7.4327     103
7.3760     13
7.1526     572
7.0997     478
6.9481     393
6.6417     187

Page 15:

The best-performing methods are GPFS and ARD, which usually find the optimal subset. However, the GPFS method is on average more than three times slower than ARD.

The CorrCoef and LinFS methods are computationally inexpensive but provide unsatisfactory results.

Even for simple mapping functions (sin x), ARD breaks down on underdetermined systems, where the number of observations is smaller than the number of dimensions.

Page 16:

Batch hierarchical screening: explore the potential of partitioning the input space into groups of inputs, applying screening methods within the groups, and combining the important inputs.

Some work has already been done for linear models (Gabriel and Pan 1979).

Variables are grouped such that if two variables Xi and Xj are in different groups, their regression sums of squares (RSS) are additive: if Si is the reduction in RSS from including Xi and Sj the reduction from including Xj, then including both gives S_{i,j} = Si + Sj.
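
Written out explicitly (with RSS(M) denoting the residual sum of squares of a linear model built on variable set M; the baseline set M is notation introduced here, not from the slides), the grouping condition is:

S_i = \mathrm{RSS}(M) - \mathrm{RSS}(M \cup \{X_i\}), \qquad
S_{i,j} = \mathrm{RSS}(M) - \mathrm{RSS}(M \cup \{X_i, X_j\}), \qquad
S_{i,j} = S_i + S_j .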

Page 17:

Coupled emulation: separate emulators for different outputs, linked with some model for the covariance.

Connections to sequential methods to handle large datasets. Linked to sequential sparse GPs?

Projective methods in conjunction with feature selection.

Page 18:


[From Van der Maaten et al 2007]

Page 19:

However, [van der Maaten et al. 2007] compared the non-linear methods to linear ones and found the non-linear methods no better. The reasons they propose relate to the curse of dimensionality, overfitting of local models, and others.

Page 20:

L.J.P. van der Maaten, E.O. Postma, and H.J. van den Herik. Dimensionality Reduction: A Comparative Review. 2007.

Isabelle Guyon and André Elisseeff. An Introduction to Variable and Feature Selection. Journal of Machine Learning Research, 3:1157–1182, 2003.
