
The Changepoint Approach to SPC

Douglas M. Hawkins, Peihua Qiu, University of Minnesota

Chang-Wook Kang, Hanyang University


Background to SPC

• Have stream of process readings $X_1, X_2, \ldots, X_n, \ldots$

• Need to decide whether all follow common statistical model, versus

• Isolated (transient) special causes (affect individual readings) or

• Persistent special causes that remain until detected and fixed.

The simplest statistical model

• In control, the $X_n$ are iid $N(\mu, \sigma^2)$

• Isolated special causes change mean and/or variance then revert.

• Persistent special cause shifts the mean and/or variance.

• For example, step change in mean from $\mu$ to $\mu + \delta$

Standard SPC methods

• Shewhart Xbar and R/S chart used for isolated special causes.

• Persistent causes need memory – cumulative sum (cusum) or exponentially weighted moving average (EWMA) chart.

• For now we concentrate on latter.

Designing a chart

• An upward cusum is defined by

$$S_n = \max(0,\; S_{n-1} + X_n - K)$$

where K is the ‘reference value’ or ‘allowance’.

• The chart signals a change if

$$S_n > H$$

where H is the ‘decision interval’.
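The recursion can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `upward_cusum`, the starting value of zero, and the toy readings (in-control mean 0, shifted mean 1, so K = 0.5) are all assumptions made for the example.

```python
def upward_cusum(readings, k_ref, h):
    """Run an upward cusum over the readings.

    k_ref is the reference value K and h is the decision interval H.
    Returns (signal, path): signal is the 1-based index of the first
    reading at which S_n > H (None if there is no signal), and path
    holds every S_n. Starting the cusum at zero is an assumption.
    """
    s = 0.0
    path = []
    signal = None
    for n, x in enumerate(readings, start=1):
        s = max(0.0, s + x - k_ref)
        path.append(s)
        if signal is None and s > h:
            signal = n
    return signal, path


# Toy data: the first three readings sit near 0, the rest near 1.
readings = [0.1, -0.3, 0.2, 1.4, 0.9, 1.6, 1.2, 0.8]
signal, path = upward_cusum(readings, k_ref=0.5, h=2.0)
print(signal, [round(s, 2) for s in path])
```

The cusum stays at zero while the readings are below K, then accumulates once they move above it.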

The things you need to know

• Cusum is the optimal way to detect step shift if K is halfway between in-control and out-of-control means.

• So you must know the in-control mean $\mu$ and the shift size $\delta$.

• You decide H by setting an acceptable in-control average run length (ARL).

• To do this, you also need to know the standard deviation $\sigma$.

Who told you the Greek stuff?

• Very rarely, you do actually know it.

• More commonly,

– do a Phase I study to estimate $\mu$ and $\sigma$

– carefully check the data for control (can use fixed-sample-size methods for this)

– pick a shift size $\delta$ big enough to matter, small enough not to be easy to see.

An estimate is not a parameter

• But sample estimates are not population parameters.

• So you have a target ARL, but your actual ARL will be a random variable.

• For sensitive methods like cusum with small K or EWMA with small $\lambda$, the resulting uncertainty in your ARL can be large.
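A small simulation can make this point concrete. The sketch below is illustrative only: it assumes a true in-control N(0, 1) process, a Phase I sample of 30 readings, and a standardized upward cusum with k = 0.5 and h = 4. The function names are hypothetical, and none of the settings come from the slides. Each Phase I sample gives its own estimates, and therefore its own in-control ARL.

```python
import random
import statistics


def cusum_run_length(rng, mu_hat, sigma_hat, k, h, max_n=5000):
    """Run length of an upward cusum on true N(0, 1) readings that are
    standardized with the estimated mu_hat and sigma_hat.
    Runs are capped at max_n to bound the simulation time."""
    s = 0.0
    for n in range(1, max_n + 1):
        z = (rng.gauss(0.0, 1.0) - mu_hat) / sigma_hat
        s = max(0.0, s + z - k)
        if s > h:
            return n
    return max_n


def conditional_arl(rng, mu_hat, sigma_hat, k=0.5, h=4.0, reps=100):
    """Monte Carlo estimate of the in-control ARL given one set of estimates."""
    runs = [cusum_run_length(rng, mu_hat, sigma_hat, k, h) for _ in range(reps)]
    return statistics.mean(runs)


rng = random.Random(1)
arls = []
for _ in range(5):
    # One Phase I study: 30 in-control readings give one (mu_hat, sigma_hat).
    phase1 = [rng.gauss(0.0, 1.0) for _ in range(30)]
    mu_hat = statistics.mean(phase1)
    sigma_hat = statistics.stdev(phase1)
    arls.append(conditional_arl(rng, mu_hat, sigma_hat))

print([round(a) for a in arls])
```

The spread of the printed values across Phase I samples is the randomness in the ARL that the slide refers to.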

What cusum optimality?

• On top of this, cusum is optimal only for shift it is tuned for. Get a much different shift, you lose performance.

• Similarly for EWMA.

The changepoint-in-mean model

• For this model

– $X_i \sim N(\mu, \sigma^2)$ for $i \le \tau$

– $X_i \sim N(\mu + \delta, \sigma^2)$ for $i > \tau$

• None of the Greeks is known a priori.

• Suppose we are at observation number n.

Likelihood approach

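• Write

$$\bar X_{j,k} = \sum_{i=j+1}^{k} X_i \,/\, (k-j), \qquad V_{j,k} = \sum_{i=j+1}^{k} (X_i - \bar X_{j,k})^2$$

• If we knew the changepoint was (say) k, then the MLEs of the pre- and post-change means would be

$$\bar X_{0,k}, \quad \bar X_{k,n}$$

• The $\sigma^2$ MLE would be $S_{k,n} = (V_{0,k} + V_{k,n})/(n-2)$ (after the usual bias adjustment).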

…. continued

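• Two-sample t for $H_0: \delta = 0$ (no change) is

$$T_{k,n} = \sqrt{\frac{k(n-k)}{n}}\;\frac{\bar X_{0,k} - \bar X_{k,n}}{\sqrt{S_{k,n}}}$$

with $S_{k,n}$ the pooled variance estimate.

• Finally, estimate $\tau$ as the k maximizing $|T_{k,n}|$

• And diagnose a step change if $T_{\max,n} > h_n$, where $T_{\max,n} = \max_k |T_{k,n}|$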
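The statistic can be computed directly from its definition. The following is a minimal sketch rather than the authors' implementation: the function name `changepoint_t` and the toy data are assumptions, and the code rescans the data for every k, which is quadratic in n but easy to read.

```python
import math


def changepoint_t(x):
    """Return (t_max, k_hat, mean_before, mean_after) for readings x.

    For each split k = 1..n-1 this computes the two-sample t statistic
    between x[:k] and x[k:] with the pooled variance (V_0k + V_kn)/(n - 2),
    and keeps the split with the largest absolute value.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 readings so that n - 2 > 0")
    best = (-1.0, None, None, None)
    for k in range(1, n):
        before, after = x[:k], x[k:]
        m0 = sum(before) / k
        m1 = sum(after) / (n - k)
        v = sum((a - m0) ** 2 for a in before) + sum((a - m1) ** 2 for a in after)
        s = v / (n - 2)
        if s > 0:
            t = math.sqrt(k * (n - k) / n) * abs(m0 - m1) / math.sqrt(s)
        else:
            # Degenerate case: no within-segment variation at all.
            t = math.inf if m0 != m1 else 0.0
        if t > best[0]:
            best = (t, k, m0, m1)
    return best


# Toy data with an obvious step after the fourth reading.
x = [10, 11, 9, 10, 14, 15, 13, 14]
t_max, k_hat, mean_before, mean_after = changepoint_t(x)
print(round(t_max, 3), k_hat, mean_before, mean_after)
```

For long series, running sums would avoid recomputing each segment from scratch; the quadratic version is kept here for clarity.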

Phase II use

• Changepoint formulation for fixed-sample (Phase I setting) is classical.

• For Phase II SPC use, n is not constant. Modify the procedure to:

– If $T_{\max,n} < h_n$, diagnose in control and continue.

– If $T_{\max,n} > h_n$, conclude out of control. Use the MLEs to diagnose the time of change and the pre- and post-change means.
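The Phase II procedure amounts to recomputing the statistic each time a reading arrives. The sketch below assumes the caller supplies the control limits through a function `h_of_n`. The constant limit of 4.0 used in the demo is a placeholder chosen for illustration, not a value from the published tables, and the names `t_max_split` and `monitor` are hypothetical.

```python
import math


def t_max_split(x):
    """Largest |two-sample t| over all splits of x, with its split point
    and the segment means: (t_max, k_hat, mean_before, mean_after)."""
    n = len(x)
    best = (-1.0, None, None, None)
    for k in range(1, n):
        before, after = x[:k], x[k:]
        m0 = sum(before) / k
        m1 = sum(after) / (n - k)
        v = sum((a - m0) ** 2 for a in before) + sum((a - m1) ** 2 for a in after)
        s = v / (n - 2)
        if s > 0:
            t = math.sqrt(k * (n - k) / n) * abs(m0 - m1) / math.sqrt(s)
        else:
            t = math.inf if m0 != m1 else 0.0
        if t > best[0]:
            best = (t, k, m0, m1)
    return best


def monitor(stream, h_of_n, start=10):
    """Process readings one at a time; testing begins at reading `start`.
    Returns a diagnosis dict at the first signal, or None if none occurs."""
    history = []
    for x in stream:
        history.append(x)
        n = len(history)
        if n < start:
            continue  # dry run: accumulate readings without testing
        t, k_hat, m0, m1 = t_max_split(history)
        if t > h_of_n(n):
            return {"n": n, "k_hat": k_hat, "mean_before": m0, "mean_after": m1}
    return None


# Twelve stable readings, then a step up. The limit 4.0 is a placeholder.
stream = [10, 11, 9, 10, 10, 11, 9, 10, 11, 9, 10, 10, 14, 15, 13, 14]
result = monitor(stream, h_of_n=lambda n: 4.0, start=10)
print(result)
```

In real use, `h_of_n` would look up the tabulated limit for the chosen false-alarm rate and starting point.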

Getting the control limits

• We need a sequence of control limits $h_n$.

• Fixed-sample theory not much help.

• A conceptual objective: pick the $h_n$ so that

$$\Pr[T_{\max,n} > h_n \mid \text{no signal before time } n] = \alpha.$$

• With such a sequence, the in-control RL would be geometric (like Shewhart), with

– In-control ARL = $1/\alpha$
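The geometric claim follows in one step. Counting n from the first reading at which testing starts (an indexing convention assumed here for simplicity), a constant conditional signal probability gives:

```latex
\begin{align*}
\Pr[\mathrm{RL} = n]
  &= \Pr[\text{no signal before } n]\;
     \Pr[T_{\max,n} > h_n \mid \text{no signal before } n]
   = (1-\alpha)^{n-1}\,\alpha, \\
\mathrm{ARL}
  &= \sum_{n=1}^{\infty} n\,\alpha\,(1-\alpha)^{n-1}
   = \frac{1}{\alpha}.
\end{align*}
```

For instance, a false-alarm rate of 0.005 corresponds to an in-control ARL of 1/0.005 = 200.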

How to get the $h_n$

• Big simulation: 16 million data sets.

• Estimated $h_n$ for several $\alpha$ values.

• All on web site www.stat.umn.edu/hawkins
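One plausible way to organize such a simulation is sketched below at toy scale. It is an assumption about the general approach, not a description of the authors' code: at each n it takes the empirical upper-alpha point of the statistic among the series that have not yet signalled, records it as the limit, and discards the series above it. Standard normal data suffice because the statistic does not depend on the in-control mean or variance. The function names, the 300 replicates and alpha = 0.05 are chosen only to keep the example fast.

```python
import math
import random


def t_max(x):
    """Largest |two-sample t| over all splits of x (pooled variance, n - 2 df)."""
    n = len(x)
    best = 0.0
    for k in range(1, n):
        before, after = x[:k], x[k:]
        m0 = sum(before) / k
        m1 = sum(after) / (n - k)
        v = sum((a - m0) ** 2 for a in before) + sum((a - m1) ** 2 for a in after)
        t = math.sqrt(k * (n - k) / n) * abs(m0 - m1) / math.sqrt(v / (n - 2))
        best = max(best, t)
    return best


def simulate_limits(alpha, n_start, n_end, reps, seed=0):
    """Estimate limits h_n for n = n_start..n_end from `reps` in-control series."""
    rng = random.Random(seed)
    series = [[rng.gauss(0.0, 1.0) for _ in range(n_start - 1)] for _ in range(reps)]
    limits = {}
    for n in range(n_start, n_end + 1):
        for s in series:
            s.append(rng.gauss(0.0, 1.0))  # one new reading per surviving series
        ranked = sorted(series, key=t_max)
        n_drop = max(1, round(len(ranked) * alpha))  # series that "signal" at n
        n_keep = len(ranked) - n_drop
        limits[n] = t_max(ranked[n_keep - 1])  # largest statistic that survives
        series = ranked[:n_keep]  # condition on no signal so far
    return limits


limits = simulate_limits(alpha=0.05, n_start=10, n_end=14, reps=300, seed=0)
for n, h in limits.items():
    print(n, round(h, 2))
```

With only a few hundred replicates the estimates are noisy, which suggests why a very large simulation is needed for small alpha.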

So why have a Phase I?

• Don’t need in-control parameter estimates, and so don’t need Phase I data gathering,

• Can get up and running in Phase II.

• As time goes by in control, ever-growing data base gives ever-better estimates (unlike conventional Phase I/II dichotomy)

….continued

• But most folk would ‘dry run’ at least some readings before turning on testing.

• For lack of obvious best choice, suggest starting testing at n=10 (but Web tables give cutoffs for starts of n=3 through 21)

• For example, for $\alpha = 0.005$:

[Figure: Plot of h_n for alpha=0.005, against LOG10N]

The 0.005 cutoff

• The cutoffs seem to tend to around 3.2

• This corresponds roughly to the two-sided 0.001 point of a N(0,1)

• This ‘Bonferroni multiplier’ of 5 is what you pay for the multiple testing.
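The normal-distribution arithmetic behind these bullets can be checked with the Python standard library. This is only a numerical check of the rounded figures quoted above; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

# Two-sided 0.001 point: 0.0005 in each tail.
z_two_sided = std_normal.inv_cdf(1 - 0.001 / 2)

# Two-sided tail probability at the approximate asymptotic cutoff of 3.2.
tail_at_cutoff = 2 * (1 - std_normal.cdf(3.2))

# Ratio of the sequential false-alarm rate to the nominal per-test level.
multiplier = 0.005 / 0.001

print(round(z_two_sided, 3), round(tail_at_cutoff, 5), multiplier)
```

The two-sided 0.001 point is about 3.29, so a cutoff near 3.2 matches it only roughly, as the slide says.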

Do we need the Shewhart?

• Changepoint formulation with k = n−1 compares the latest X with the mean of all previous data; this includes the Shewhart I chart as one of its tests. The asymptotic cutoff of 3.2 is close to the European standard.

• With k = n−m (m = 2, 3, …) it tests the mean of the newest m readings against the grand mean of all previous data; this includes the Shewhart Xbar chart for rational groups of any and all sizes.

How does method perform?

• Compared to what? Methods that fix IC ARL with unknown parameters scarce.

• Self-starting cusum doesn’t need IC parameter values. Also seamless from Phase I to Phase II.

• Does however need size of shift for tuning purposes.

A method comparison

• Three cusums, k=0.25, 0.5, 1

• (tuned for shifts of 0.5, 1, 2 sd’s)

• Two in-control ARL’s – 100, 500

• Shift occurring early (observation 10) or later (observation 100)

• a: ARL 100, early; b: ARL 100, later

• c: ARL 500, early; d: ARL 500, later

Results

• Changepoint is sometimes best.

• Mostly is second best (no surprise, given cusum’s theoretical optimality).

• Where not best, it is a close second best and has by far most robustly good performance.

Example – triglyceride data

• Data set kindly supplied by Dr. Dan Schultz, Rogosin Institute, New York.

• Assay triglyceride standard every week. Use as a QC check on unknowns. Triglyceride reading should be constant (doesn’t much matter what its value is).

• Here’s one year of data (given as I chart):

Outlier? Upward shift at end?

[Figure: I Chart for TRIG, by Case Number. Center line 119.35, control limits 109.91 and 128.79; sigma 3.1463, E(MR bar) 3.5490; Exceptions: 34]
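The numbers printed on the I chart are mutually consistent, as a quick calculation shows. The sketch assumes the usual individuals-chart conventions: sigma is estimated as the average moving range divided by d2 = 1.128 (the standard constant for moving ranges of two consecutive readings), and the limits are the center line plus or minus three sigma. The chart output itself does not state these conventions.

```python
center = 119.35   # center line read from the chart
mr_bar = 3.5490   # average moving range, "E(MR bar)" on the chart
d2 = 1.128        # standard constant for moving ranges of span 2 (assumed)

sigma_hat = mr_bar / d2
lower_limit = center - 3 * sigma_hat
upper_limit = center + 3 * sigma_hat

print(round(sigma_hat, 4), round(lower_limit, 2), round(upper_limit, 2))
```

The computed sigma and limits agree with the 3.1463, 109.91 and 128.79 shown on the chart.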

First clear exceedance is at week 40

[Figure: T_max,n and h_n]

What are estimates of the changepoint?

[Figure: MLE of changepoint]

and of the before- and after-change means

[Figure: Estimates of pre- and post-change mean (series XB_0_K and XB_K_N)]

• Don’t interpret estimate of changepoint or of separate means in non-significant bit.

• First signal is 5 weeks after apparent shift.

• Pre-change mean estimate is 117 mg/dL

• Post-change mean estimate is 124 mg/dL

• Right from first signal, all three estimates highly stable.

Conclusions

• Conventional Shewhart, cusum, EWMA calibrated assuming known parameters.

• Random errors of estimation in parameters become systematic distortions in the run length distribution of any particular chart, making IC and OOC ARLs random.

• Ugly tradeoff between Phase I sample size and control over IC RL distribution.

• The unknown-parameter changepoint formulation lets you fix in-control run length distribution exactly, with or without sizeable Phase I sample.

• Furthermore, interval alternative means performance competitive regardless of size of the shift.

References

Hawkins, D. M., Qiu, P., and Kang, C.-W., (2003), 'The Changepoint Model for Statistical Process Control', to appear in Journal of Quality Technology.

Pollak, M. and Siegmund, D., (1991), 'Sequential Detection of a Change in a Normal Mean When the Initial Value Is Unknown', Annals of Statistics, 19, 394-416.

Siegmund, D., (1985), Sequential Analysis: Tests and Confidence Intervals, Springer-Verlag, New York.

Siegmund, D. and Venkatraman, E. S., (1995), 'Using the Generalized Likelihood Ratio Statistic for Sequential Detection of a Change-point', Annals of Statistics, 23, 255-271.
