#AnalyticsX
Copyright © 2016, SAS Institute Inc. All rights reserved.
Surviving Survival Forecasting of Product Failure
Ryan Carr
Advisory Statistical Data Scientist
SAS
Agenda
• Survival Model Concepts
    Censoring & time alignment
    Preparing the data for analysis
• Parametric Models
    Exponential
    Lognormal
    Weibull 2p
    Weibull 3p
    Generalized Gamma
• Process to forecast % failure at fixed points from in-service date
• Process to forecast weekly failure based on in-service dates
• Conclusion
Survival Model Concepts
Right Censored
[Timeline figure: Unit 1 and Unit 2 are both placed in service on 1 Mar 16; remaining axis labels: 29 Jun 16, Today]
Unit 1 failed after 16 weeks.
The failure time for Unit 2 is considered “censored” since it did not fail during our study period.
Survival Model Concepts
Time Aligned
[Timeline figure, absolute time: Unit 1 starts 1 Mar 16, Unit 3 starts 29 Mar 16, both run to Today. Aligned time: Unit 1 and Unit 3 both start at Week 0; the axis ends at Week 20]
Unit 1 was placed into service the first week.
Unit 3 was not placed into service until 4 weeks later.
Time align each unit by using relative times (hours, days, weeks…) rather than absolute times.
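A minimal SAS sketch of the censoring and time-alignment step. The table and variable names (units, in_service, failed, aligned) and the study end date are assumptions made for illustration; only the in-service dates and the 16-week failure of Unit 1 come from the slides.

```sas
/* Hypothetical one-row-per-unit input; failed is missing for units still in service */
data units ;
   input unit in_service :date9. failed :date9. ;
   format in_service failed date9. ;
   datalines ;
1 01MAR2016 21JUN2016
2 01MAR2016 .
3 29MAR2016 .
;
run ;

%let study_end = '19JUL2016'd ;   /* assumed "today": 20 weeks after 1 Mar 16 */

data aligned ;
   set units ;
   censor = missing(failed) ;     /* 1 = right censored (still in service), 0 = failed */
   /* relative time: whole weeks from in-service date to failure or to study end */
   WeeksInService = floor((coalesce(failed, &study_end.) - in_service) / 7) ;
run ;
/* Unit 1: 16 weeks, censor=0; Unit 2: 20 weeks, censor=1; Unit 3: 16 weeks, censor=1 */
```

Coding censor=1 for units still in service matches the censor(1) specification used in the MODEL statements later in the deck.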
Survival Model Concepts
• Preparing the data
In Service Data
Returns by Week for each in-service date
Two ways to get 1 week in service.
Survival Model Concepts
• Preparing the data
Censored / Time Aligned Data
Censored “replacements” are actually units still in service.
This row has only been in service 1 week before the end of the study.
These 44 units actually failed only 1 week after being placed in service.
It is the sum of any failing 1 week after any in-service date.
Survival Model Concepts
• Preparing the data
Forecast Template
For forecasting (after the model is fit) …
We focus back on only those units still in service.
Comparing Models
Exponential
Weibull 2p
Weibull 3p
Lognormal
Generalized Gamma
SOME of the Relationships among the distributions:
• Exponential is Weibull 2p with Scale=1 (the LIFEREG Scale parameter, which is the reciprocal of the Weibull shape)
• Weibull 2p is Generalized Gamma with Shape=1
• Weibull 3p is Weibull 2p with an offset parameter
• LogNormal is Generalized Gamma with Shape=0
Distributions
NOTE: distribution information from https://en.wikipedia.org/wiki/Exponential_distribution
and https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_lifereg_sect019.htm
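Comparing Models
Exponential
CDF: $G(t) = 1 - \exp(-t/\alpha)$, with $\alpha = 10443$ (the Weibull Scale estimate)
/* Request the parameter-estimates table before the procedure runs */
ods output ParameterEstimates = exp_pe2 ;
proc lifereg data=Returns_Censored outest=pe_Exponential ;
   model WeeksInService*censor(1) = / distribution=exponential ;  /* censor=1 marks censored rows */
   weight replacements ;                                          /* number of units each row represents */
   output out=resid_exponential sres=sresiduals ;
   probplot / hlower=.05 ;
   inset ;
run ;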
Comparing Models
Weibull 2p
CDF: $G(t) = 1 - \exp(-(t/\alpha)^{\gamma})$, with $\alpha = 1800$ (Weibull Scale) and $\gamma = 1.374$ (Weibull Shape)
/* Request the parameter-estimates table before the procedure runs */
ods output ParameterEstimates = w2p_pe2 ;
proc lifereg data=Returns_Censored outest=pe_Weibull2p ;
   model WeeksInService*censor(1) = / distribution=weibull ;
   weight replacements ;
   output out=resid_Weibull2p sres=sresiduals ;
   probplot / hlower=0.05 ;
   inset ;
run ;
Comparing Models
Weibull 3p
The Weibull 3p model is a generalization of the Weibull 2p model where a “location” or offset parameter is added.
This offset represents the “minimum time to event”.
CDF: $G(t) = 1 - \exp(-((t-\delta)/\alpha)^{\gamma})$ for $t > \delta$, with $\alpha = 3382$, $\gamma = 1.195$, $\delta = 0.971$
/* Request the parameter-estimates table before the procedure runs */
ods output ParmEst = pe_W3P ;
proc reliability data=Returns_Censored ;
   freq replacements ;
   distribution W3 ;
   probplot WeeksInService*Censor(1) ;
run ;
Comparing Models
LogNormal
CDF
$\mu = 9.999$, $\sigma = 2.443$
/* Request the parameter-estimates table before the procedure runs */
ods output ParameterEstimates = ln_pe2 ;
proc lifereg data=Returns_Censored outest=pe_LogNormal ;
   model WeeksInService*censor(1) = / distribution=lognormal ;
   weight replacements ;
   output out=resid_LogNormal sres=sresiduals ;
   probplot / hlower=0.05 ;
   inset ;
run ;
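Comparing Models
Generalized Gamma
CDF
$a = 9.838$, $d = 18.788$, $p = -13.351$
/* Request the parameter-estimates table before the procedure runs */
ods output ParameterEstimates = gg_pe2 ;
/* in_estw supplies starting values; it is built on the "Setting initial parameters" slide */
proc lifereg data=Returns_Censored
             inest=in_estw outest=pe_GGamma ;
   model WeeksInService*censor(1) = / distribution=gamma ;
   weight replacements ;
   output out=resid_GGamma sres=sresiduals ;
   probplot ;
   inset ;
run ;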
Generalized Gamma
Setting initial parameters
NOTE: The Generalized Gamma is a fairly complex distribution and may have convergence problems in maximum likelihood parameter estimation.
Two steps to help with convergence are:
1) Start the parameter search at a reasonable position, like the Weibull 2p estimates
2) Set the maximum iterations to a higher number
/* Step 1a: fit Weibull 2p and keep its estimates */
proc lifereg data=Returns_Censored outest=out_estw noprint ;
   model WeeksInService*censor(1) = / distribution=Weibull maxiter=5000 ;
   weight replacements ;
run ;
/* Step 1b: relabel the estimates as Gamma starting values */
data in_estw ;
   set out_estw ;
   _dist_ = "Gamma" ;
   _shape1_ = 1 ; * Weibull 2p * ;
run ;
/* Step 2: fit the Generalized Gamma from those starting values with more iterations */
proc lifereg data=Returns_Censored
             inest=in_estw outest=pe_GGamma ;
   model WeeksInService*censor(1) = / distribution=gamma maxiter=10000 ;
   weight replacements ;
   output out=resid_GGamma sres=sresiduals ;
   probplot ;
   inset ;
run ;
Applying Models to forecasts
Have models
Have parameter estimates
…
How do we apply these to get estimated future values?
Applying Models – Points in time
Exponential
Direct Formula
CDF: $G(t) = 1 - \exp(-t/\alpha)$, with $\alpha = 10443$
0.25% chance of failure by week 26
exp_pe2 comes from the fit step: ods output ParameterEstimates = exp_pe2 ;
proc sql ;
   select estimate
      into :alpha
      from exp_pe2
      where parameter = 'Weibull Scale' ;
quit ;   /* PROC SQL ends with QUIT, not RUN */
data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-WeeksInService / &alpha.) ;
      output ;
   end ;
run ;
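As a hand check of the 0.25% figure, the slide's values $\alpha = 10443$ and $t = 26$ can be plugged into the direct formula used in the DATA step (the arithmetic below is rounded):

```latex
\[
1 - \exp\!\left(-\frac{26}{10443}\right)
  \approx 1 - \exp(-0.00249)
  \approx 0.00249
  \approx 0.25\%
\]
```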
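Applying Models – Points in time
Weibull 2p
Direct Formula
CDF: $G(t) = 1 - \exp(-(t/\alpha)^{\gamma})$, with $\alpha = 1800$, $\gamma = 1.374$
0.30% chance of failure by week 26
proc sql ;
   /* Note: the macro variable names are the reverse of the symbols in the CDF above.
      Macro variable alpha holds the Weibull Shape; macro variable gamma holds the Weibull Scale. */
   select put(estimate, 15.10) as estimate
      into :alpha
      from w2p_pe2
      where parameter = 'Weibull Shape' ;
   select put(estimate, 15.10) as estimate
      into :gamma
      from w2p_pe2
      where parameter = 'Weibull Scale' ;
quit ;   /* PROC SQL ends with QUIT, not RUN */
data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-((WeeksInService / &gamma.) ** &alpha.)) ;
      output ;
   end ;
run ;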
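The summary slide later in the deck lists the CDF() function as an alternative to the direct formula for the Weibull models. A small sketch comparing the two for Weibull 2p follows. It hard-codes the rounded estimates printed on the slide (scale 1800, shape 1.374) instead of reading them from w2p_pe2, and the data set and variable names are made up for illustration.

```sas
/* Cross-check: direct formula versus the CDF function for Weibull 2p */
data cdf_check ;
   w_scale = 1800 ;      /* rounded Weibull Scale from the slide */
   w_shape = 1.374 ;     /* rounded Weibull Shape from the slide */
   do WeeksInService = 4, 13, 26 ;
      direct  = 1 - exp(-((WeeksInService / w_scale) ** w_shape)) ;
      /* argument order is CDF('WEIBULL', x, shape, scale) */
      viafunc = cdf('weibull', WeeksInService, w_shape, w_scale) ;
      diff    = direct - viafunc ;   /* should be zero up to rounding error */
      output ;
   end ;
run ;

proc print data=cdf_check noobs ;
run ;
```

For the Weibull 3p model the same call applies after subtracting the threshold from the time argument, that is, passing WeeksInService minus the threshold as x.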
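Applying Models – Points in time
Weibull 3p
Direct Formula
CDF: $G(t) = 1 - \exp(-((t-\delta)/\alpha)^{\gamma})$, with $\alpha = 3382$, $\gamma = 1.195$, $\delta = 0.971$
0.28% chance of failure by week 26
…
   select estimate
      into :delta
      from pe_W3P
      where parameter = 'Weibull Threshold' ;
…
data prob_failure ;
   /* For this expression to match the CDF above, macro variable gamma must hold the
      scale (3382) and macro variable alpha the shape (1.195), the reverse of the slide symbols */
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-(((WeeksInService - &delta.) / &gamma.) ** &alpha.)) ;
      output ;
   end ;
run ;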
Applying Models – Points in time
LogNormal
CDF Function
CDF: $\mu = 9.999$, $\sigma = 2.443$
0.29% chance of failure by week 26
proc lifereg data=Returns_Censored outest=pe_LogNormal ;
...
proc sql ;
   select intercept, _scale_ into :mu, :sigma
      from pe_lognormal ;
…
data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf2 = cdf('lognormal', WeeksInService, &mu., &sigma.) ;
      output ;
   end ;
run ;
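As a hand check of the 0.29% figure: the lognormal CDF is the standard normal CDF $\Phi$ evaluated at the standardized log time. With the slide's rounded $\mu = 9.999$ and $\sigma = 2.443$ (values below are rounded):

```latex
\[
F(26) = \Phi\!\left(\frac{\ln 26 - \mu}{\sigma}\right)
      = \Phi\!\left(\frac{3.258 - 9.999}{2.443}\right)
      \approx \Phi(-2.759)
      \approx 0.0029
      \approx 0.29\%
\]
```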
Applying Models – Points in time
Generalized Gamma
Lifereg … maxiter=0
CDF: $a = 9.838$, $d = 18.788$, $p = -13.351$
0.25% chance of failure by week 26
proc lifereg … outest=pe_GGamma ;
/* Template rows: one per point in time to score */
data pfail_in ;
   censor = 0 ;
   replacements = 1 ;
   do WeeksInService = 4, 13, 26 ;
      output ;
   end ;
run ;
/* maxiter=0 means no fitting iterations, so the INEST= estimates are used as they are */
proc lifereg data=pfail_in inest=pe_GGamma noprint ;
   model WeeksInService*censor(1) = / distribution=gamma maxiter=0 ;
   output out=prob_failurel cdf=cdf ;
run ;
Applying Models – Weekly Returns
• Generate periods / weeks for projection
• Determine units still in service from each source (ship week)
• Apply models to get probability of failure each week
• For each source (ship week): apply weekly failure rates, remove units from service for next week, predict next week’s failure
• Align returns by source (ship week) and summarize expected returns each future week
Applying Models – Weekly Returns
Generate periods / weeks for projection
ForecastIn
data forecastin ;
   fcstrange = &numperiods. ;          /* number of weeks to project */
   do WeeksInService = 1 to fcstrange ;
      replacements = 1 ;
      censor = 0 ;
      output ;
   end ;
run ;
Applying Models – Weekly Returns
Apply models to get probability of failure each week
/* Score the template with the fitted model; maxiter=0 keeps the INEST= estimates unchanged */
proc lifereg data=forecastin inest=pe_GGamma noprint ;
   model WeeksInService*censor(1) = / distribution=gamma maxiter=0 ;
   weight replacements ;
   output out=predcdf cdf=cdf ;
run ;
/* Weekly failure fraction = increase in the CDF from one week to the next */
data predpct ;
   set predcdf (keep=WeeksInService CDF) ;
   prevcdf = lag(cdf) ;
   if _n_ = 1 then prevcdf = 0 ;
   retn_pct = cdf - prevcdf ;
run ;
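The retn_pct value above is the unconditional probability that a new unit fails in a given week. For units that have already survived some weeks, a common refinement (not shown in the deck) is the conditional probability of failing in week t given that the unit was still in service at the start of that week. The sketch below illustrates the difference. It uses a stand-in for predcdf built from the rounded Weibull 2p estimates on the earlier slide, which is an assumption made only so the example runs on its own; the data set names predcdf_demo and predpct_cond and the variable cond_pct are hypothetical.

```sas
/* Stand-in for predcdf: Weibull 2p CDF from the rounded slide estimates (assumption) */
data predcdf_demo ;
   do WeeksInService = 1 to 26 ;
      cdf = 1 - exp(-((WeeksInService / 1800) ** 1.374)) ;
      output ;
   end ;
run ;

data predpct_cond ;
   set predcdf_demo ;
   prevcdf = lag(cdf) ;
   if _n_ = 1 then prevcdf = 0 ;
   retn_pct = cdf - prevcdf ;              /* unconditional increment, as on the slide */
   cond_pct = retn_pct / (1 - prevcdf) ;   /* P(fail in week t | in service at start of week t) */
run ;
```

With cumulative failure probabilities as small as those in this deck (about 0.3% by week 26), the two columns are nearly identical, so the choice matters mainly for products with higher failure rates.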
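Applying Models – Weekly Returns
Determine units still in service from each source (ship week)
proc sql ;
   create table UnitsInService as
   select ship_Week,
          WeeksInService,
          censor,
          field_pop
      from ShippedStillInField
      where censor = 1                     /* censored rows are the units still in service */
      order by WeeksInService, censor ;    /* no summary function is used, so ORDER BY replaces GROUP BY */
quit ;   /* PROC SQL ends with QUIT, not RUN */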
Applying Models – Weekly Returns
For each source (ship week): apply weekly failure rates, remove units from service for next week, predict next week’s failure
/* Turn the weekly failure fractions into one row: retnpct1, retnpct2, ... */
proc transpose data=predpct out=predpctt prefix=retnpct ;
   var retn_pct ;
run ;
data fct ;
   set UnitsInService ;
   if _n_ = 1 then set predpctt (drop=_name_) ;   /* attach the fractions to every row */
   retain retnpct: ;
   array retnpct[*] retnpct: ;
   array forc[&numperiods.] ;
   offset = WeeksInService - 1 ;                  /* weeks this ship week has already been in service */
   do i = 1 to (&numperiods. - offset) ;
      forc[i] = round(field_pop * retnpct[i + offset], 1) ;   /* expected failures in future week i */
      field_pop = field_pop - forc[i] ;           /* remove failed units before the next week */
   end ;
run ;
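To make the indexing in the loop above concrete, here is a toy version with three projection weeks. The weekly failure fractions and the population of 10,000 are made-up illustration values, not figures from the deck, and toy_fct is a hypothetical data set name.

```sas
data toy_fct ;
   array retnpct[3] _temporary_ (0.01 0.02 0.03) ;   /* made-up failure fraction by week in service */
   array forc[3] ;                                   /* forecast failures in future weeks 1 to 3 */
   do WeeksInService = 1, 2 ;                        /* two ship weeks of different ages */
      field_pop = 10000 ;                            /* made-up units still in service */
      call missing(of forc[*]) ;
      offset = WeeksInService - 1 ;
      do i = 1 to (3 - offset) ;
         forc[i] = round(field_pop * retnpct[i + offset], 1) ;
         field_pop = field_pop - forc[i] ;
      end ;
      output ;
   end ;
   drop i ;
run ;

proc print data=toy_fct noobs ;
run ;
/* WeeksInService=1: forc1-forc3 = 100, 198, 291 and field_pop ends at 9411 */
/* WeeksInService=2: forc1-forc2 = 200, 294, forc3 is missing, field_pop ends at 9506 */
```

The forc index counts future weeks from today, while the retnpct index counts weeks in service, which is why older ship weeks start further along the retnpct array and fill fewer forc columns.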
Applying Models – Weekly Returns
Align returns by source (ship week) and summarize expected returns each future week.
/* shipsum and fct_sort must both be sorted by descending WeeksInService */
data forecast ;
   merge shipsum fct_sort ;
   by descending WeeksInService ;
run ;
proc print data=forecast noobs label ;
   id ship_week ;
   var field_pop forc1-forc26 ;     /* requires numperiods to be at least 26 */
   sum field_pop forc1-forc26 ;     /* column totals = expected returns per future week */
   label field_pop="Units in Service" ;
   format field_pop forc: comma9. ;
run ;
Applying Models – Weekly Returns
Comparing results with graphs
[Graphs, panel titles: Generalized Gamma, Weibull 3p]
Applying Models – Summary
Distribution         Formula   CDF()   LIFEREG
Exponential          Yes       Yes     Yes
Weibull2p            Yes       Yes     Yes
Weibull3p            Yes       Yes     With mods
LogNormal                      Yes     Yes
Generalized Gamma                      Yes
NOTE: Searching the internet for applications of the Generalized Gamma in SAS leads to many unanswered questions. The few answers I could find focused on implementation of the incomplete gamma function via SAS IML. The use of LIFEREG with maxiter=0 as a means of forecasting with an existing model and new data was not directly documented.
Conclusion – selecting model
• Can structure comparisons leveraging relationships of models and nested ML
    Lognormal, Exponential and Weibull 2p are all instances of Generalized Gamma
    But Weibull 3p is not?
• Could structure comparison using RMSE of actual vs predicted
• Ultimately test against reality
    Understand “essentially, all models are wrong, but some are useful” George E. P. Box.
• Use simplest projection method(s)
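The nested-model comparison in the first bullet can be sketched as a likelihood-ratio test of Weibull 2p against the Generalized Gamma. This sketch is not part of the original deck. It assumes the OUTEST= data sets pe_Weibull2p and pe_GGamma from the earlier slides exist with one row each, and it relies on the _LNLIKE_ variable that LIFEREG writes to OUTEST= data sets. Treating the weighted log likelihoods as directly comparable is a further assumption. The names lr_test, ll_weibull, ll_ggamma, lr_stat and p_value are hypothetical.

```sas
/* Likelihood-ratio test: Weibull 2p (Generalized Gamma with shape fixed at 1) vs Generalized Gamma */
data lr_test ;
   merge pe_Weibull2p (keep=_lnlike_ rename=(_lnlike_=ll_weibull))
         pe_GGamma    (keep=_lnlike_ rename=(_lnlike_=ll_ggamma)) ;
   lr_stat = 2 * (ll_ggamma - ll_weibull) ;   /* full model minus restricted model */
   df = 1 ;                                   /* one extra free parameter: the gamma shape */
   /* a negative statistic would suggest the Generalized Gamma fit did not converge */
   if lr_stat >= 0 then p_value = 1 - probchi(lr_stat, df) ;
run ;

proc print data=lr_test noobs ;
run ;
```

A small p_value would suggest the extra shape parameter improves the fit; a large one would favor the simpler Weibull 2p, in line with the last bullet.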