Nonparametric Bootstrap Inference on the Characterization of a Response Surface
Robert Parody
Center for Quality and Applied Statistics, Rochester Institute of Technology
2009 QPRC, June 4, 2009
Presentation Outline
– Introduction
– Previous Work
– New Technique
– Example
– Simulation Study
– Conclusion and Future Research
Introduction
Response Surface Methodology (RSM)
– Identify the relationship between a set of k predictor variables and the response variable y
– Typically, the goal of the experiment is to optimize E(Y)
– $\xi$ is transformed into coded $x$ by
$$x_i = \frac{\xi_i - c_{i0}}{s_i}, \qquad i = 1, \ldots, k$$
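A minimal sketch of the coding step in Python, assuming NumPy is available. The function name `code_variables`, the arguments `center` and `scale` (standing in for the centering constant and scale factor of each predictor), and the temperature settings are hypothetical and only illustrate the transformation.

```python
import numpy as np

def code_variables(xi, center, scale):
    """Map natural-unit settings xi to coded units: x = (xi - center) / scale."""
    xi = np.asarray(xi, dtype=float)
    return (xi - center) / scale

# Hypothetical factor: temperature run at 150, 175 and 200 degrees.
coded = code_variables([150.0, 175.0, 200.0], center=175.0, scale=25.0)
print(coded)  # low, middle and high settings in coded units
```

Coding puts every predictor on the same unitless scale, so the second-order coefficients of different factors can be compared directly.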
The Model
A second-order model is fit to the data, represented by
$$y_u = \beta_0 + \sum_{i=1}^{k} \beta_i x_{ui} + \sum_{i=1}^{k} \beta_{ii} x_{ui}^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_{ui} x_{uj} + \omega_u + \varepsilon_u$$
where:
– $\beta_i$, $\beta_{ii}$, and $\beta_{ij}$ are unknown parameters
– $\varepsilon_u \sim F(0, \sigma^2)$ and independent
– $\omega_u$ are other effects, such as block effects and covariates, which do not interact with the $x_i$'s
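Equivalently, in matrix form,
$$y_u = \beta_0 + \mathbf{x}_u'\boldsymbol{\beta} + \mathbf{x}_u'\mathbf{B}\mathbf{x}_u + \omega_u + \varepsilon_u$$
where
$$\boldsymbol{\beta} = (\beta_1, \ldots, \beta_k)' \quad \text{and} \quad \mathbf{B} = \begin{pmatrix} \beta_{11} & \beta_{12}/2 & \cdots & \beta_{1k}/2 \\ & \beta_{22} & \cdots & \beta_{2k}/2 \\ & & \ddots & \vdots \\ \text{sym.} & & & \beta_{kk} \end{pmatrix}$$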
Background
Canonical Analysis
– Rotate the axis system so that the new system lies parallel to the principal axes of the surface
– P is the matrix of eigenvectors of B, where $P'P = PP' = I$
– The rotated variables and parameters:
  – $w = P'x$
  – $\theta = P'\beta$
  – $\Lambda = P'BP = \mathrm{diag}(\lambda_i)$
Types of Surfaces
– If all $\lambda_i < 0$ ($> 0$), the stationary point is a maximizer (minimizer); contours are ellipsoidal.
– If the $\lambda_i$ have different signs, the stationary point is a minimax point (complicated hyperbolic contours).
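The canonical analysis and the surface classification can be sketched in Python with NumPy's symmetric eigensolver. The function name `canonical_analysis`, the name `theta` for the rotated first-order coefficients, the tolerance `tol`, and the numeric values of `beta` and `B` are all assumptions made for illustration. The "undetermined" branch for a numerically zero eigenvalue is also an addition that the slides do not discuss.

```python
import numpy as np

def canonical_analysis(beta, B, tol=1e-10):
    """Rotate a fitted second-order surface onto its principal axes."""
    lam, P = np.linalg.eigh(B)      # eigenvalues (ascending), orthonormal eigenvectors
    theta = P.T @ beta              # rotated first-order coefficients
    if np.all(lam < -tol):
        kind = "maximizer"
    elif np.all(lam > tol):
        kind = "minimizer"
    elif lam.min() < -tol and lam.max() > tol:
        kind = "minimax"
    else:
        kind = "undetermined"       # some eigenvalue is numerically zero
    return lam, P, theta, kind

# Hypothetical fitted coefficients for k = 2.
beta = np.array([1.0, -0.5])
B = np.array([[-2.0, 0.5],
              [0.5, -1.0]])
lam, P, theta, kind = canonical_analysis(beta, B)
print(lam, kind)
```

Because the classification depends only on the signs of the eigenvalues, simultaneous confidence intervals on them indicate whether the estimated surface type is statistically supported.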
Standard Errors for the $\lambda_i$
– Carter, Chinchilli and Campbell (1990)
  – Found standard errors and covariances for the $\lambda_i$ by way of the delta method
– Bisgaard and Ankenman (1996)
  – Simplified this with the creation of the Double Linear Regression (DLR) method
Previous Work
Edwards and Berry (1987)
– Simulated a critical point for a prespecified linear combination of the parameters
– The natural pivotal quantity for constructing simultaneous intervals for these linear combinations of the parameters is
$$Q = \max_{1 \le j \le r} \frac{|c_j'(\hat{\gamma} - \gamma)|}{(c_j' \hat{V} c_j)^{1/2}}$$
Shortcoming
The technique on the previous slide is only valid when
– The errors are i.i.d. normal with constant variance
– The set of linear combinations of interest is prespecified
Research Goal
Employ a nonparametric bootstrap based on a pivotal quantity to extend the previously mentioned work to situations where:
1. The set of linear combinations of interest is not prespecified
2. The error distribution assumption is relaxed
Bootstrap Idea
– Resample from the original data, either directly or via a fitted model, to create replicate datasets
– Use these replicate datasets to create distributions for parameters of interest
– Consider the nonparametric version by utilizing the empirical distribution
Empirical Distribution
– The empirical distribution is one in which equal probability 1/N is given to each sample value $y_i$
– The corresponding estimate of the cdf F is the empirical distribution function (EDF) $\hat{F}$, which is defined as the sample proportion:
$$\hat{F}(y) = \frac{\#\{y_i \le y\}}{N}$$
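A short Python sketch of the EDF as a sample proportion, assuming NumPy. The helper name `edf` and the four-value sample are hypothetical.

```python
import numpy as np

def edf(sample):
    """Return F_hat, where F_hat(y) = #{y_i <= y} / N."""
    data = np.sort(np.asarray(sample, dtype=float))
    def F_hat(y):
        # side="right" counts every sample value less than or equal to y
        return np.searchsorted(data, y, side="right") / data.size
    return F_hat

F_hat = edf([3.0, 1.0, 2.0, 2.0])
print(F_hat(2.0))  # proportion of sample values <= 2
```

Drawing from this distribution is the same as sampling the observed values with replacement, which is what the nonparametric bootstrap does.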
New Technique
The pivotal quantity for simultaneous inference on the $\lambda_i$:
$$Q = \max_{1 \le j \le k} \frac{|\hat{\lambda}_j - \lambda_j|}{s_{\hat{\lambda}_j}}$$
Bootstrap Equivalent
Replace the parameters with the estimates and the estimates with the bootstrap estimates to get:
$$Q^* = \max_{1 \le j \le k} \frac{|\hat{\lambda}_j^* - \hat{\lambda}_j|}{s^*_{\hat{\lambda}_j}}$$
Bootstrap Parameter Estimation
– Find the model fits
– Resample from the modified residuals N times with replacement
– Add these values to the fits and use them as observations
– Fit the new model and determine the bootstrap parameter estimates
An Adjustment
– We usually at least assume that the errors are i.i.d. from a distribution with mean 0 and constant variance $\sigma^2$
– The residuals, on the other hand, come from a common distribution with mean 0 and variance $\sigma^2(1 - h_{ii})$
– So the modified residuals become
$$d_i^* = \frac{e_i}{\sqrt{1 - h_{ii}}}$$
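The leverage adjustment and one resampling step can be sketched as follows, assuming NumPy and an ordinary least-squares fit. The function name `modified_residuals` and the straight-line data are hypothetical. The leverages $h_{ii}$ are taken from a thin QR factorization of the design matrix, which is one common way to compute the diagonal of the hat matrix.

```python
import numpy as np

def modified_residuals(D, y):
    """Return OLS fits, raw residuals e_i and modified residuals e_i / sqrt(1 - h_ii)."""
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    fits = D @ coef
    e = y - fits
    Q, _ = np.linalg.qr(D)          # thin QR: the hat matrix is Q Q'
    h = np.sum(Q**2, axis=1)        # leverages h_ii
    return fits, e, e / np.sqrt(1.0 - h)

# Hypothetical straight-line data, for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])
D = np.column_stack([np.ones_like(x), x])
fits, e, d = modified_residuals(D, y)

# One bootstrap response vector: fits plus resampled modified residuals.
rng = np.random.default_rng(0)
y_star = fits + rng.choice(d, size=d.size, replace=True)
```

Dividing by $\sqrt{1 - h_{ii}}$ inflates each residual, most strongly at high-leverage design points, so that the resampled values share the error variance rather than the smaller residual variance.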
Critical Point Procedure
– Create nonparametric bootstrap estimates for the unknown parameters in Q*
– Now find Q* by maximizing over the j elements
– Repeat this process for a large number of bootstrap samples (m) and take the $(m+1)(1-\alpha)$th order statistic
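The whole procedure can be sketched end to end in Python with NumPy. This is an illustration under stated assumptions, not the presenter's implementation. The standard errors come from a second regression on the rotated variables, which is this sketch's reading of the DLR idea. Bootstrap eigenvalues are matched to the original ones by sorted order. The design, the simulated response, the function names, and the small `m` are all hypothetical.

```python
import numpy as np
from itertools import combinations

def design(X):
    """Second-order design matrix: 1, x_i, x_i^2, then x_i * x_j for i < j."""
    n, k = X.shape
    cross = [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
    return np.column_stack([np.ones(n), X, X**2] + cross)

def ols(D, y):
    """Least-squares coefficients, their standard errors, fits and leverages."""
    Q, R = np.linalg.qr(D)
    coef = np.linalg.solve(R, Q.T @ y)
    fits = D @ coef
    resid = y - fits
    s2 = resid @ resid / (D.shape[0] - D.shape[1])
    Rinv = np.linalg.inv(R)
    se = np.sqrt(s2 * np.sum(Rinv**2, axis=1))   # diagonal of s2 * (D'D)^-1
    return coef, se, fits, np.sum(Q**2, axis=1)

def eigen_and_se(X, y):
    """Eigenvalues of B plus standard errors from a second regression on w = P'x."""
    k = X.shape[1]
    coef, _, _, _ = ols(design(X), y)
    B = np.diag(coef[1 + k:1 + 2 * k])
    for pos, (i, j) in enumerate(combinations(range(k), 2)):
        B[i, j] = B[j, i] = coef[1 + 2 * k + pos] / 2.0
    lam, P = np.linalg.eigh(B)                   # ascending eigenvalues
    _, se, _, _ = ols(design(X @ P), y)          # rows of X @ P are w' = x'P
    return lam, se[1 + k:1 + 2 * k]              # SEs of the pure quadratic terms

def bootstrap_critical_point(X, y, m=999, alpha=0.05, seed=0):
    """(m+1)(1-alpha)th order statistic of the bootstrap Q* values."""
    rng = np.random.default_rng(seed)
    _, _, fits, h = ols(design(X), y)
    d = (y - fits) / np.sqrt(1.0 - h)            # modified residuals
    lam_hat, _ = eigen_and_se(X, y)
    q_star = np.empty(m)
    for b in range(m):
        y_star = fits + rng.choice(d, size=y.size, replace=True)
        lam_star, se_star = eigen_and_se(X, y_star)
        q_star[b] = np.max(np.abs(lam_star - lam_hat) / se_star)
    rank = int(round((m + 1) * (1 - alpha)))
    return np.sort(q_star)[min(rank, m) - 1]

# Hypothetical k = 2 design: 3 x 3 grid plus three extra centre runs.
rng = np.random.default_rng(1)
grid = np.array([(a, b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)])
X = np.vstack([grid, np.zeros((3, 2))])
y = (10 - 2 * X[:, 0]**2 - X[:, 1]**2 + 0.5 * X[:, 0] * X[:, 1]
     + rng.normal(scale=0.3, size=len(X)))
q_crit = bootstrap_critical_point(X, y, m=199)
lam_hat, se_hat = eigen_and_se(X, y)
print(q_crit, lam_hat - q_crit * se_hat, lam_hat + q_crit * se_hat)
```

The printed bounds are the simultaneous intervals implied by the pivotal quantity: each eigenvalue estimate plus or minus the critical point times its standard error. A production run would use a much larger `m`, as the slides go on to discuss.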
Bootstrap Simulation Size
Edwards and Berry (1987) showed that the conditional coverage probability of 95% simulation-based bounds will be within ±0.002 of the nominal level for 99% of the generations when (m+1) = 80000
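The ±0.002 figure can be checked approximately in Python using only the standard library. This check treats the conditional coverage as approximately normal with binomial-style variance; that is an assumption of this sketch and not necessarily the derivation Edwards and Berry give. The function name `coverage_halfwidth` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def coverage_halfwidth(m_plus_1, coverage=0.95, prob=0.99):
    """Approximate half-width of the band holding `prob` of the conditional coverages."""
    z = NormalDist().inv_cdf(0.5 + prob / 2.0)   # two-sided normal quantile
    return z * sqrt(coverage * (1.0 - coverage) / m_plus_1)

halfwidth = coverage_halfwidth(80000)
print(halfwidth)
```

Under this approximation the half-width shrinks like $1/\sqrt{m+1}$, so halving it would take roughly four times as many bootstrap samples.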
Example
– Chemical process experiment with k=5 from Box (1954)
– Goal: Maximize percentage yield
Parameter Estimates
Parameter Estimate
11 -0.041
22 -0.400
33 -1.782
44 -2.625
55 -4.461
Critical Point
Using $\alpha = 0.05$ and (m+1) = 80000, we get
$$Q_{0.05} = 2.937$$
Estimates and 95% Simultaneous Confidence Intervals
Parameter LCL Estimate UCL
11 -0.741 -0.041 0.660
22 -0.840 -0.400 0.045
33 -2.553 -1.782 -1.011
44 -3.332 -2.625 -1.918
55 -5.205 -4.461 -3.717
Relative Efficiency
– Comparison of critical points
  – For the example, we would only need ~88% of the sample size for the simulation method as compared to traditional simultaneous methods
– Computer Time
  – Approximately 2 minutes on an Intel Core 2 Duo computer
Simulation Study
– 10 critical points were created
– For each critical point, 10000 confidence intervals were created by bootstrapping the residuals
– This was done 100 times for each point
Simulation Results
Conclusions
– New technique yields tighter bounds
– Works for linear combinations not prespecified
– Relaxes normality assumption on the error terms
– Simulation study yields adequate coverage
Future Research
– Relax model assumptions further to include nonhomogeneous error variances
– Apply to other situations where we are unable to prespecify the combinations, such as ridge analysis