Batch Startups Using Multivariate Statistics and Optimization
Susan L. Albin
Di Xu
Rutgers University
supported by NSF/Industry-University Cooperative Center for Quality and
Reliability Engineering
IBM, January 2003
Outline of Talk
• Batch processes & start-ups
• Multivariate models for process and product variables
• Optimization algorithm, operator assisted, to reduce batch startup time
• Optimizing startup by accounting for uncontrollable variables – raw materials, environmental variables
• Sequential sampling method to estimate uncontrollable variable parameters
Startup Stage Accounts for Up to 50% of Batch Time
[Figure: timeline of Batch 1, Batch 2, ..., each consisting of a startup stage followed by a production stage, with a long time interval between batches]
Goal: Decrease Mean and Variance of Batch Startup Time
• create capacity without adding machines, personnel, or space
• improve production planning
• reduce scrap
• ease bottleneck at off-line testing
Multiple Input and Output Variables in Batch Processes
• Process variables, X
– temperature, pressure, speeds
• Product variables, Y
– diameter, tensile strength, elongation
• Correlations among all variables
Traditional Batch Startup Procedure – One Variable at-a-time
[Flowchart: adjust process variables; take process measurement (real-time); in spec? if no, adjust again; take product sample (with delay); in spec? if no, adjust again; otherwise production]
[Figure: X1 versus X2 with LSL and USL marked for each variable; regions labeled good combination and bad combination; label: 25-D hypercube]
Consequences of Monitoring Multiple Process Variables One-at-a-Time
• X1 and X2 correlated
Why Different Settings for Different Batches?
• Long time between batches
• Uncontrollable variables change batch-to-batch
– Raw material changes
– Environment changes
– Maintenance levels
• Uncontrollable variables often unknown
– Different system
– Not easily measured by sensor
Batch Startup:
• Characterize Good Baseline Data
– Process & product variables
– Multivariate statistical model
• For New Batch - Start at baseline average
• If product not ok, select new setting
– Consistent with model
– Taking into account operator/engineering advice
Partial Least Squares (PLS) Characterizes Process & Product Variables in Baseline
Baseline: Good Production Data
Input X’s, Output Y’s
• Construct PLS component T’s
• Each T is linear combination of X’s
$T_1 = w_{11}X_1 + w_{12}X_2 + w_{13}X_3 + \cdots$
$T_2 = w_{21}X_1 + w_{22}X_2 + w_{23}X_3 + \cdots$
• PLS components are independent
• Data reduction: 3 to 5 components contain sufficient information in data
Construct PLS Component T’s
$T_1 = w_1X_1 + w_2X_2 + w_3X_3 + \cdots$
$U_1 = c_1Y_1 + c_2Y_2 + c_3Y_3 + \cdots$
• Find w’s and c’s (normalized):
$\max \mathrm{Cov}(T_1, U_1)$
• Find w’s and c’s:
$\max \mathrm{Cov}(T_2, U_2)$ s.t. $T_2 \perp T_1$
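A minimal sketch of the first step above, in Python with only the standard library. It assumes centered data and unit-length weight vectors, in which case the w and c that maximize Cov(T1, U1) are the leading singular vectors of X'Y; the sketch finds them by power iteration. The function names (`center`, `normalize`, `first_pls_weights`) and the toy data are illustrative and not from the talk.

```python
import math

def center(rows):
    """Subtract each column mean so covariances reduce to dot products."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def first_pls_weights(X, Y, iters=200):
    """Power iteration for unit vectors w, c that maximize Cov(Xw, Yc)."""
    Xc, Yc = center(X), center(Y)
    n, k, m = len(Xc), len(Xc[0]), len(Yc[0])
    # M = X'Y, the k-by-m cross-product matrix of the centered data.
    M = [[sum(Xc[r][i] * Yc[r][j] for r in range(n)) for j in range(m)]
         for i in range(k)]
    c = normalize([1.0] * m)  # arbitrary starting direction (an assumption)
    for _ in range(iters):
        w = normalize([sum(M[i][j] * c[j] for j in range(m)) for i in range(k)])
        c = normalize([sum(M[i][j] * w[i] for i in range(k)) for j in range(m)])
    return w, c

# Toy data: the first X column tracks Y closely, the second does not.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0]]
Y = [[2.0], [4.0], [6.0], [8.0]]
w, c = first_pls_weights(X, Y)
T1 = [sum(wi * xi for wi, xi in zip(w, row)) for row in center(X)]
```

Later components would repeat the step on deflated data so that each new T is orthogonal to the earlier ones; that part is omitted here.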
Comparison of Principal Components Analysis & PLS
• Both
– Reduce dimension of data
– Components are linear combinations of the X’s
• BUT PLS components consider the Y’s
– X’s that are correlated with Y’s emphasized in PLS components
Measure Distance Between Current Process & Baseline: Squared Prediction Error SPE
[Figure: X1 versus X2 showing the baseline data, the PLS model, and the current process, with SPE marked as the distance between them]
Calculate SPE
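[Diagram: the PLS baseline model maps process variables x1, x2, ..., xk to components T1, T2, T3]
• T’s predict X’s: regress X on T’s
$\hat{x} = t_1 p_1 + t_2 p_2 + t_3 p_3$
• SPE is sum over all process variables
$\mathrm{SPE} = \sum_{i=1}^{k} (x_i - \hat{x}_i)^2$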
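The SPE calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: x is already centered and scaled against the baseline, each row of `W` holds the weights of one component (so t = Wx applies directly to x), and each row of `P` holds the loading vector used to predict x from that component. The names `pls_scores`, `reconstruct`, and `spe` are hypothetical.

```python
def pls_scores(x, W):
    """t_a = sum_i W[a][i] * x[i]: each component is a linear combination of the x's."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in W]

def reconstruct(t, P):
    """x_hat = t_1 p_1 + t_2 p_2 + ...: predict the x's from the components."""
    k = len(P[0])
    return [sum(t_a * p[i] for t_a, p in zip(t, P)) for i in range(k)]

def spe(x, W, P):
    """Squared prediction error: sum over all process variables of (x_i - x_hat_i)^2."""
    x_hat = reconstruct(pls_scores(x, W), P)
    return sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))

# One component that captures only the first of three variables.
W = [[1.0, 0.0, 0.0]]
P = [[1.0, 0.0, 0.0]]
print(spe([2.0, 3.0, 4.0], W, P))  # 25.0: the residual is (0, 3, 4)
```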
A Filament Extrusion Process
• Conveying screw pushes solid raw material down length of enclosed barrel
• Melting occurs due to shear stresses, increased pressure and externally added heat
• Semi-molten extrudate pushed through die, producing desired filament shape
• Stretching and re-heating steps control molecular properties e.g. diameter and tensile strength
• Finished product wound onto take-up spools, each batch producing dozens
Process & Product Variables
• Input: 25 On-line Process Variables
– ex: temperatures, pressures, speeds
– observations every few minutes
• Output: 12 Off-line Product Vars
– ex: diameters, tensile strength
– observations every few hours
– delay of an hour or more
Develop PLS Model on Baseline Data
(17 batches, 114 observations)
• 5 PLS components account for
– 98% cov (Xs, Ys)
– 84% var(Xs)
– 29% var(Ys)
• Could use fewer - 3 comps acct for
– 91% cov (Xs, Ys)
– 70% var(Xs)
– 22% var(Ys)
1. Geladi, P. and Kowalski, B.R. (1986)
2. Lindberg, W., Persson, J., and Wold, S. (1983)
3. Wold, S. (1978)
Graph of SPE for Baseline Data with Control Limit
Baseline production data: 17 batches covering 114 observation points
[Figure: SPE for each baseline observation, with control limit]
• If observations ~ Normal, then calculate control limit
• Control limit used to assess startup
Ad Hoc Use of PLS to Find Adjustment: Decompose SPE
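[Figure: SPE of a startup batch plotted against time, and the contribution to SPE plotted for each variable i = 1, 2, 3, ...]
Contribution of variable i: the i-th term $(x_i - \hat{x}_i)^2$ of
$\mathrm{SPE} = \sum_{i=1}^{k} (x_i - \hat{x}_i)^2$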
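The decomposition amounts to looking at the individual terms of the SPE sum. A small Python sketch, with hypothetical function names and made-up numbers, that flags the variable with the largest contribution:

```python
def spe_contributions(x, x_hat):
    """Contribution of variable i to SPE is the single term (x_i - x_hat_i)^2."""
    return [(xi - xh) ** 2 for xi, xh in zip(x, x_hat)]

def largest_contributor(x, x_hat):
    """Index of the variable whose term dominates SPE (a candidate for adjustment)."""
    contrib = spe_contributions(x, x_hat)
    return max(range(len(contrib)), key=contrib.__getitem__)

x = [1.0, 4.0, 2.0]        # current (scaled) process variables, illustrative only
x_hat = [1.5, 1.0, 2.0]    # values predicted from the PLS components
print(spe_contributions(x, x_hat))    # [0.25, 9.0, 0.0]
print(largest_contributor(x, x_hat))  # 1
```

This only points at a variable; it says nothing about how far to move it or which correlated variables must move with it.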
Improving on the Ad Hoc Decomposition Method
• Decomposing SPE suggests which variable to adjust
• Does not give
– how much to adjust
– what related variables need adjustment
• New methodology
– combines optimization & multivariate statistics
– gives which variables to adjust and how much
Operator-Assisted Batch Startup
[Flowchart: begin startup; OK? If yes, production. If no, the operator may input a process variable to adjust and the algorithm recommends an adjustment]
Operator Interfaces with Startup Algorithm in Several Modes
• Operator gives the variable to adjust
– algorithm gives setting and other process settings
• Operator gives several possible variables
– algorithm helps choose
• Operator unaware adjustment needed
– without prompt, algorithm suggests adjustment
Relationship Between Process Settings and Variables
• Process variables are a linear function of process settings
$x = g(s)$
[Diagram: setpoints S enter a linear model that produces process variables X]
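Mathematical Optimization: Determine Adjusted Process Vars xa & Settings sa
• Minimize $\mathrm{SPE}(x^a)$
• Subject to:
$s_j^a = u$
$z_i = 0 \text{ or } 1, \quad i = 1, \ldots, k$
$|s_i^c - s_i^a| \le M_i z_i, \quad i = 1, \ldots, k, \quad M_i \text{ large}$
$\sum_{i=1}^{k} z_i \le L$
$x^a = g(s^a)$
$t_i = w_i' x^a, \quad i = 1, \ldots, A$
$|t_i| \le r_i, \quad i = 1, \ldots, A$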
Objective Function
• Given current process
– settings $s^c$
– variables $x^c$
• Find adjusted settings
– settings $s^a$
• Minimize $\mathrm{SPE}(x^a)$
– distance from adjusted variables to baseline
$\mathrm{SPE} = \sum_{i=1}^{k} (x_i - \hat{x}_i)^2$, with $\hat{x}_i$ predicted from PLS components
Constraint: Follow the Operator’s Recommendation
• ex: adjust setting 23 to a new value u
$s_{23}^a = u$
• ex: adjust setting 23 to a new value exceeding the current setting
$s_{23}^a > s_{23}^c$
Constraints: Limit Size of Adjustments & No. of Variables Adjusted
• Introduce one integer variable $z_i$ for each possible adjustment
$z_i = 0 \text{ or } 1, \quad i = 1, \ldots, k$
• Limit size of each adjustment
$|s_i^c - s_i^a| \le M_i z_i, \quad i = 1, \ldots, k, \quad M_i \text{ large}$
• Limit number of variables adjusted, typically 2 or 3
$\sum_{i=1}^{k} z_i \le L$
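Constraint: PLS Components Should Be Within Reasonable Range
• Compute PLS components, Ts, after adjustment
$t_i = w_i' x^a, \quad i = 1, \ldots, A$
• Ts should be in a reasonable range
$|t_i| \le r_i, \quad i = 1, \ldots, A$
[Figure: X1 versus X2 showing the baseline data and the component direction T1 = w1X1 + w2X2]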
Mixed Integer Quadratic Program
• Objective function: convex quadratic
• Mixed decision variables
– 0-1 variables in constraint limiting no. of adjustments
– continuous process settings
• Linear constraints
• Solve with Benders' algorithm or search
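To make the structure concrete, here is a deliberately reduced Python sketch of the "search" option for the special case L = 1 (adjust a single variable). It is not the Benders approach and it leaves out the PLS range and operator constraints. It also assumes x = s (settings and process variables coincide) and SPE(x) = x'Bx with B symmetric positive semidefinite; `best_single_adjustment` and the numbers are hypothetical.

```python
def quad_form(x, B):
    """SPE(x) = x'Bx."""
    n = len(x)
    return sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))

def best_single_adjustment(x_c, B, max_step):
    """Enumerate the 0-1 choices for L = 1: try each variable j alone,
    move it to the SPE-minimizing value within |x_a[j] - x_c[j]| <= max_step[j],
    and keep the choice with the smallest SPE."""
    best = (quad_form(x_c, B), None, list(x_c))  # (SPE, adjusted index, new point)
    for j in range(len(x_c)):
        if B[j][j] <= 0:
            continue  # for PSD B, a zero diagonal entry means SPE ignores x_j
        cross = sum(B[j][i] * x_c[i] for i in range(len(x_c)) if i != j)
        target = -cross / B[j][j]               # unconstrained 1-D minimizer
        lo, hi = x_c[j] - max_step[j], x_c[j] + max_step[j]
        x_a = list(x_c)
        x_a[j] = min(max(target, lo), hi)       # clip: optimal for a convex 1-D quadratic
        cand = (quad_form(x_a, B), j, x_a)
        if cand[0] < best[0]:
            best = cand
    return best

B = [[2.0, 0.0], [0.0, 1.0]]
print(best_single_adjustment([1.0, 3.0], B, max_step=[5.0, 1.0]))
# (6.0, 1, [1.0, 2.0])
```

In the example the second variable wins even though its step is capped, because moving it removes more of the SPE than zeroing the first variable would.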
Derive SPE as x’Bx
Prove B is positive semidefinite
About SPE
$\mathrm{SPE}(x) = x' B x$
• B contains
– weights to compute PLS components, t, from process variables x
– loadings to compute $\hat{x}$ from PLS components t
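One way the derivation could go, sketched under two assumptions that are not stated on the slide: W is the k-by-A matrix whose columns are the weights applied directly to x, and P is the k-by-A matrix of loadings, so that t = W'x and the prediction is Pt.

```latex
\begin{aligned}
t &= W' x, \qquad \hat{x} = P\,t = P W' x,\\
x - \hat{x} &= (I - P W')\,x,\\
\mathrm{SPE}(x) &= (x - \hat{x})'(x - \hat{x})
  = x'\,(I - P W')'\,(I - P W')\,x = x' B x,\\
B &= (I - P W')'\,(I - P W'),\\
x' B x &= \lVert (I - P W')\,x \rVert^{2} \;\ge\; 0 \quad \text{for all } x .
\end{aligned}
```

Because B has the form C'C with C = I - PW', it is symmetric and positive semidefinite, which is what makes the objective a convex quadratic.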
Example: Operator Considers Two Possibilities and Algorithm Helps to Select
• Historical
– t=40: adjust v7
– t=60: adjust v4, v5, v6
– t=210: adjust v5, v6
– t=240: adjust v5, v6
– t=330: adjust v5, v6
– t=360: adjust v7 (start) & production
• With algorithm
– t=40: input v4 OR v7, output v4, v5, v6
– t=50: production!
• Startup reduced 86% from 360 to 50 minutes
Example cont: Two possible adjustments at t=40
• Adjust v7
– SPE 13.8
– plus other adjustments
• Adjust v4
– SPE 8.3
– also adjust v5 & v6
• Select second choice with min SPE
Uncontrollable Variables Contribute to Batch-to-Batch Variability
• Uncontrollable variables are random variables
– New values for each batch
– You can measure them
– You can control them within specifications
– You cannot set them
• Examples – raw material characteristics, environmental and maintenance variables
Select Better Settings by Accounting for Uncontrollable Variables
[Diagram: input (raw material, environmental, output of stage n-1) and process settings enter the PROCESS, which produces the output]
• Feedforward control to reduce batch-to-batch variation
Objective
• Given means and variances for uncontrollable variables
– Identify optimal settings quickly
– Predict whether likely to produce successful outputs
Extend SPE to Include Uncontrollable Variables
• Original
$\mathrm{SPE}(x) = x' B x$
• Divide x into two groups
$\mathrm{SPE}(x_S, x_U) = (x_S', x_U')\, B \begin{pmatrix} x_S \\ x_U \end{pmatrix}$
– $x_S$: process settings
– $x_U$: uncontrollable variables (random variables)
Optimization Objective Function
• Min Expected Value of SPE
$\min_{x_S} E_{x_U}\left[\mathrm{SPE}(x_S, x_U)\right]$
• Select new settings $x_S$
• $x_U$ are random variables
– mean vector & variance matrix known
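Mathematical Optimization: Choose Settings xS to Minimize ESPE
$\min_{x_S} \mathrm{ESPE}(x_S, x_U)$
Subject to:
$t_i = w_i' \prod_{j=1}^{i-1} (I - p_j w_j')\, x, \quad i = 1, 2, \ldots, A$   (defn of PLS comps)
$|E(t_i)| \le r_i, \quad i = 1, 2, \ldots, A$   (PLS comps in baseline range)
$l_k \le x_{S,k} \le u_k$   (settings within limits)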
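Find xS
$\min_{x_S} E_{x_U}\left[\mathrm{SPE}(x_S, x_U)\right] = \min_{x_S} \mathrm{SPE}(x_S, \mu_U) + \mathrm{trace}(B_{UU}\,\Sigma_U)$
where $\mu_U$ and $\Sigma_U$ are the mean vector and variance matrix of $x_U$, and $B_{UU}$ is the block of B that multiplies the uncontrollable variables
• Best settings depend only on the mean of the uncontrollable variables
• Min ESPE depends on both the means and the variances of the uncontrollable variables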
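A short worked derivation of why the settings depend only on the means. The notation is an assumption made for this sketch: B is symmetric and partitioned into blocks B_SS, B_SU, B_US, B_UU conformably with (x_S, x_U), and x_U = mu_U + e with E[e] = 0 and Cov(e) = Sigma_U.

```latex
\begin{aligned}
\mathrm{SPE}(x_S, x_U)
  &= \mathrm{SPE}(x_S, \mu_U)
   + 2\,\bigl(x_S' B_{SU} + \mu_U' B_{UU}\bigr)\,e
   + e' B_{UU}\, e,\\
E\bigl[\mathrm{SPE}(x_S, x_U)\bigr]
  &= \mathrm{SPE}(x_S, \mu_U) + 0 + E\bigl[e' B_{UU}\, e\bigr]\\
  &= \mathrm{SPE}(x_S, \mu_U) + \mathrm{trace}\bigl(B_{UU}\,\Sigma_U\bigr).
\end{aligned}
```

The trace term does not involve x_S, so the minimizing settings are found from the first term alone, while the minimum value itself grows with the variances of the uncontrollable variables.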
Predicting if this Batch is Likely to Work Well
• Find mean and variance for uncontrollable variables
• Solve for optimal settings
• If min ESPE exceeds threshold from baseline data, optimal settings are unlikely to produce successful outputs
Polystyrene Extrusion Simulation: Baseline of 260 Good Batches
• 4 uncontrollable raw material vars
– density, specific heat, thermal conductivity, power law index
• 3 process settings
– flow rate, screw speed, barrel temp
• 8 outputs - extruder performance
– req axial length, bulk temp, pressure at screw tip & die entrance, max shear rate in channel & die, specific mechanical energy, ave residence time
Comparison of Success Rates: Ave Baseline vs. Min ESPE Settings
Variance of xs    Ave baseline settings    Min ESPE settings
reasonable        57/100                   93/100
large             61/100                   Quit batch! (min ESPE > 95th percentile)
• 100 scenarios
• Uncontrollable variables taken from joint normal with mean & var known
• Settings from optimization
Raw Material Sample Estimates May Be Uncertain
• High variability in some materials
– food, oil, bulk chemicals
• Measurement error
– lab-to-lab and other testing errors
• Sampling problems
– how to sample from a large lot of bulk chemical
• Constraints on time/money
– small samples
Sample Estimates of Input Variables Form Joint Confidence Interval
[Figure: point estimate surrounded by a box whose sides are the confidence intervals for $\mu_1$ and $\mu_2$]
• $\mu_1$ and $\mu_2$ are means of two inputs
• Yellow box is CI for inputs
ESPE Between Baseline and Uncontrollables Vars & Settings
[Figure: X1 versus X2 showing the baseline data, the PLS model, and the current sample]
• CI around current uncontrollables
• ESPE is distance averaged over CI
• ESPE large if CI or distance is large
Compute Confidence Interval for ESPE
[Figure: for the yellow CI on the uncontrollable variable means, find ESPE under optimal settings (math program); the min ESPE and max ESPE give the confidence interval for ESPE]
Sequential Sampling Algorithm to Determine Whether to Process Batch
Compare ESPE CI to control limit (90th percentile of SPE’s in baseline)
[Figure: three positions of the ESPE confidence interval relative to the control limit, labeled Proceed, Sample more, and Input infeasible]
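The decision rule can be sketched as a three-way comparison. The mapping below is an assumption read off the figure labels, not something the slide states outright: proceed when the whole ESPE interval lies at or below the control limit, declare the input infeasible when the whole interval lies above it, and sample more when the interval straddles the limit. The function name and numbers are illustrative.

```python
def sequential_decision(espe_low, espe_high, control_limit):
    """Compare the ESPE confidence interval [espe_low, espe_high] to the control limit."""
    if espe_high <= control_limit:
        return "proceed"           # even the worst case is within the baseline limit
    if espe_low > control_limit:
        return "input infeasible"  # even the best case exceeds the limit
    return "sample more"           # interval straddles the limit: estimates too uncertain

print(sequential_decision(2.0, 5.0, control_limit=8.0))   # proceed
print(sequential_decision(9.0, 14.0, control_limit=8.0))  # input infeasible
print(sequential_decision(6.0, 11.0, control_limit=8.0))  # sample more
```

Taking more samples narrows the confidence interval on the uncontrollable variable means, which in turn narrows the ESPE interval until one of the two terminal outcomes is reached.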
If We Proceed with Batch, Select Settings
• Use point estimates of uncontrollable variables mean and variance, find settings to min ESPE
• More conservative – Use minimax optimization to minimize “worst case” ESPE over the CI of the uncontrollable variables
Summary: Batch Startups Using Multivariate Statistics and Optimization
• Uncontrollable variables contribute to batch-to-batch variability
– no info on uncontrollables
– means and variances
– estimates of means and variances
• Feedforward info on uncontrollables to select optimal batch settings (or quit batch)
Summary: Batch Startups Using Multivariate Statistics and Optimization
• PLS baseline model characterizes uncontrollable variables, settings & process output
• Math program finds settings
– Objective: min distance from baseline PLS model to current process
– Constraints: consistent with PLS model, operator suggestions, & engineering considerations
• Synthesis of multivariate statistics and mathematical programming
Continuing Research
• Monitoring Batch-to-Batch and Within Batch Variance during the production stage
• Robust optimization - takes into account that the objective function contains parameter estimates with confidence intervals