Risk Prediction Models: Calibration, Recalibration, and Remodeling


Page 1: Risk Prediction Models: Calibration, Recalibration, and Remodeling

© 2003 By Default!

A Free sample background from www.powerpointbackgrounds.com

Risk Prediction Models: Calibration, Recalibration, and Remodeling

HST 951: Biomedical Decision Support
12/04/2006 – Lecture 23

Michael E. Matheny, MD, MS
Brigham & Women’s Hospital
Boston, MA

Page 2: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Lecture Outline

Review Risk Model Performance Measurements

Individual Risk Prediction for Binary Outcomes

Inadequate Calibration is “the rule not the exception”

Addressing the problem with Recalibration and Remodeling

Page 3: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Model Performance Measures

Discrimination
– Ability to distinguish well between patients who will and will not experience an outcome

Calibration
– Ability of a model to match expected and observed outcome rates across all of the data

Page 4: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Discrimination
Area Under the Receiver Operating Characteristic Curve

$$\text{Sens} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$

$$\text{Spec} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}}$$

Page 5: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Discrimination
ROC Curve Generation
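A minimal Python sketch of how ROC points can be generated: sweep a decision threshold over the predicted risks and record the false positive and true positive rates at each cut. The ten predictions and outcomes are the example data used elsewhere in this lecture; the function names `roc_points` and `trapezoid_auc` are illustrative and not taken from the lecture.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear ROC curve (trapezoid rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


predicted = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
observed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

curve = roc_points(predicted, observed)
auc = trapezoid_auc(curve)
print(curve)
print(round(auc, 3))  # 22 of the 24 event/non-event pairs are ordered correctly
```

Each point on the curve is one (1 − specificity, sensitivity) pair, so the area summarizes discrimination over all thresholds rather than at a single cut.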

Page 6: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Calibration
Example Data

Expected Outcome    Observed Outcome
0.05                0
0.10                0
0.15                0
0.20                0
0.25                1
0.30                0
0.35                0
0.40                1
0.45                1
0.50                1
Sum: 2.75           Sum: 4

Page 7: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Standardized Outcomes Ratio

Most Aggregated (Crude) comparison of expected and observed values

1 Value for Entire Sample

Risk-Adjusted by using a risk prediction model to generate expected outcomes

$$\frac{\text{Observed Outcomes}}{\text{Expected Outcomes}} = \frac{4}{2.75} = 1.45$$
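The ratio can be checked with a short Python sketch. It uses the lecture's ten-case example data; the variable names are illustrative only.

```python
# Predicted risks (expected outcomes) and observed binary outcomes
# from the lecture's ten-case example.
expected = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
observed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

# One observed/expected ratio for the whole sample.
ratio = sum(observed) / sum(expected)
print(round(ratio, 2))  # 1.45
```

A ratio above 1 means more events occurred than the model predicted for the sample as a whole.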

Page 8: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Standardized Mortality Ratios (SMR)

CANCER MORTALITY ANALYSIS, ALL MALES, SCRANTON CITY, 1975-1985

CAUSE OF DEATH (ICD CODES 140-204)   EXPECTED DEATHS   OBSERVED DEATHS   SMR

All Cancer Deaths 1325.37 1516 1.14

Lip, Oral Cavity and Pharynx 33.81 47 1.39

Esophagus 36.84 45 1.22

Stomach 54.58 72 1.32

Colon, Rectum, Rectosigmoid 180.48 238 1.32

Pancreas 62.51 72 1.15

Trachea, Bronchus & Lung 430.98 481 1.12

Genitourinary 168.90 162 0.96

Bladder 45.02 50 1.11

Lymphomas 44.57 47 1.05

Page 9: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Outcome Ratios

Strengths
– Simple
– Frequently used in medical literature
– Easily understood by clinical audiences

Weaknesses
– Not a quantitative test of model calibration
– Unable to show variations in calibration in different risk strata
– Likely to underestimate the lack of fit

Page 10: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Outcome Ratios
Example Calibration Plot

[Figure: example calibration plot; horizontal axis Expected, both axes scaled 0 to 1]

Page 11: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Global Performance Measurements with Calibration Components

Methods that calculate a value for each data point (most granular)
– Pearson Test
– Residual Deviance
– Brier Score

$$\text{Brier} = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2$$

Page 12: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Brier Score Calculation

Expected Outcome    Observed Outcome    (Y_i – P_i)^2
0.05                0                   0.0025
0.10                0                   0.01
0.15                0                   0.0225
0.20                0                   0.04
0.25                1                   0.5625
0.30                0                   0.09
0.35                0                   0.1225
0.40                1                   0.36
0.45                1                   0.3025
0.50                1                   0.25
Sum                                     1.7625

Page 13: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Brier Score Calculation

To assess the accuracy of the set of predictions, Spiegelhalter’s method is used

– Expected Brier (EBrier) = 0.17875
– Variance of Brier (VBrier) = 0.003292

$$\text{Brier} = \frac{1}{n} \sum_{i=1}^{n} (y_i - p_i)^2 = \frac{1}{10} \times 1.7625 = 0.17625$$

$$Z = \frac{\text{Brier} - \text{EBrier}}{\text{VBrier}^{0.5}} = \frac{0.17625 - 0.17875}{0.003292^{0.5}} = -0.04357$$
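The calculation can be sketched in Python as follows. The lecture gives only the resulting numbers, so the formulas for the expected Brier score (mean of p(1 − p)) and its variance (sum of (1 − 2p)² p(1 − p), divided by n²) are assumed here to be the usual Spiegelhalter forms under the hypothesis of perfect calibration. They reproduce the variance and Z value quoted in the lecture. The function name `spiegelhalter_z` is illustrative.

```python
import math


def spiegelhalter_z(p, y):
    """Brier score with its expectation, variance, and Z under perfect calibration."""
    n = len(p)
    brier = sum((yi - pi) ** 2 for pi, yi in zip(p, y)) / n
    # Assumed null-hypothesis moments: each outcome is Bernoulli(p_i).
    expected = sum(pi * (1 - pi) for pi in p) / n
    variance = sum((1 - 2 * pi) ** 2 * pi * (1 - pi) for pi in p) / n ** 2
    z = (brier - expected) / math.sqrt(variance)
    return brier, expected, variance, z


predicted = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
observed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

brier, e_brier, v_brier, z = spiegelhalter_z(predicted, observed)
print(round(brier, 5), round(e_brier, 5), round(v_brier, 6), round(z, 5))
```

A Z value this close to zero gives no evidence of miscalibration on these ten cases.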

Page 14: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Brier Score

Strengths
– Quantitative evaluation

Weaknesses
– Sensitive to sample size (↑ sample size more likely to fail test)
– Sensitive to outliers (large differences between expected and observed)
– Difficult to determine relative performance in risk subpopulations

Page 15: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Hosmer-Lemeshow Goodness of Fit

Divide the data into subgroups and compare observed to expected outcomes by subgroup

C Test
– Divides the sample into 10 equal groups (by number of samples)

H Test
– Divides the sample into 10 groups by fixed intervals of predicted risk (0 to 0.1, 0.1 to 0.2, and so on)

Page 16: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Hosmer-Lemeshow Goodness of Fit

$$G^2_{HL} = \sum_{j=1}^{10} \frac{(O_j - E_j)^2}{E_j \, (1 - E_j / n_j)} \sim \chi^2_8$$

n_j = number of observations in the j-th group
O_j = observed number of cases in the j-th group
E_j = expected number of cases in the j-th group

Page 17: Risk Prediction Models: Calibration, Recalibration, and Remodeling

CALICO Registry
Hosmer-Lemeshow Goodness of Fit

C Test

Predicted Mortality by Decile (proportion)   Admissions   Observed Deaths   Expected Deaths   H-L Statistic

0.007 - .034 466 2 10.3 6.88

0.034 - 0.052 461 17 19.7 0.39

0.052 - 0.073 454 27 28.3 0.07

0.073 - 0.100 478 24 41.5 8.07

0.100 - 0.127 450 35 51.4 5.89

0.127 - 0.154 469 53 65.8 2.90

0.154 - 0.202 465 66 82.1 3.83

0.203 - 0.287 461 93 111.2 3.94

0.288 - 0.445 463 138 162.5 5.70

0.445 - 0.968 463 255 287.9 9.94

Total 4630 710 860.8 47.61

C = 47.61, df = 8, p < 0.0001
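The C statistic can be recomputed from the table as a rough check. This Python sketch applies the Hosmer-Lemeshow formula to the ten groups above; because the expected deaths are rounded to one decimal in the table, the result only approximately matches 47.61. The helper names are illustrative. The p-value uses the closed-form upper tail of the chi-square distribution, which holds for even degrees of freedom.

```python
import math


def hosmer_lemeshow(groups):
    """groups: iterable of (n_j, observed_j, expected_j) tuples."""
    return sum((o - e) ** 2 / (e * (1.0 - e / n)) for n, o, e in groups)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with even df (closed form)."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


# (admissions, observed deaths, expected deaths) per group, from the C test table.
calico_c = [
    (466, 2, 10.3), (461, 17, 19.7), (454, 27, 28.3), (478, 24, 41.5),
    (450, 35, 51.4), (469, 53, 65.8), (465, 66, 82.1), (461, 93, 111.2),
    (463, 138, 162.5), (463, 255, 287.9),
]

statistic = hosmer_lemeshow(calico_c)
p_value = chi2_upper_tail_even_df(statistic, 8)
print(round(statistic, 1), p_value < 0.0001)
```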

Page 18: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Calibration Plot
C Test Data

[Figure: calibration plot of Observed versus Expected, both axes scaled 0 to 1]

Page 19: Risk Prediction Models: Calibration, Recalibration, and Remodeling

CALICO Registry
Hosmer-Lemeshow Goodness of Fit

H Test

Predicted Mortality by Decile (proportion)   Admissions   Observed Deaths   Expected Deaths   H-L Statistic

0.007 - 0.100 1859 70 99.9 9.46

0.100 - 0.200 1348 149 192.0 11.24

0.200 - 0.300 555 115 135.5 4.10

0.301 - 0.400 323 97 110.9 2.65

0.400 - 0.499 185 58 83.0 13.64

0.500 - 0.598 131 70 71.7 0.09

0.600 - 0.694 103 58 66.4 3.02

0.701 - 0.800 65 48 48.6 0.03

0.803 - 0.896 48 34 40.7 7.29

0.904 - 0.968 13 11 12.1 1.59

Total 4630 710 860.8 53.10

H = 53.10, df = 8, p < 0.0001

Page 20: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Calibration Plot
H Test Data

[Figure: calibration plot of Observed versus Expected, both axes scaled 0 to 1]

Page 21: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Hosmer-Lemeshow Goodness of Fit

Strengths
– Quantitative evaluation
– Assesses calibration in risk subgroups

Weaknesses
– Disagreement over how to generate subgroups (C versus H)
– Even within the same method (C or H), different statistical packages generate different results due to rounding rule differences
– Sensitive to sample size (↑ sample size more likely to fail test)
– Sensitive to outliers (but to a lesser degree than Brier Score)

Page 22: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Risk Prediction Models for Binary Outcomes

Case Data (Variables X_1..X_i)
-> Predictive Model for Outcome Y (Yes/No)
-> Case Outcome Prediction (0 – 1)

Logistic Regression
Bayesian Networks
Artificial Neural Networks
Support Vector Machine Regression

Page 23: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Risk Prediction Models
Clinical Utility

Risk Stratification for Research and Clinical Practice

Risk-Adjusted Assessment of Providers and Institutions

Individual risk prediction

Page 24: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction

Good discrimination is necessary but not sufficient for individual risk prediction

Calibration is the key index for individual risk prediction

Page 25: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Inadequate Calibration
Why?

Models require external validation to be generally accepted, and in those studies the general trend is:
– Discrimination retained
– Calibration fails

Factors that contribute to inadequate model calibration in clinical practice
– Regional Variation
  • Different Clinical Practice Standards
  • Different Patient Case Mixes
– Temporal Variation
  • Changes in Clinical Practice
  • New diagnostic tools available
  • Changes in Disease Incidence and Prevalence

Page 26: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

10 year “Hard” Coronary heart disease risk estimation

Logistic Regression
– Framingham Heart Study

Calibration Problems
– Low SES
– Young age
– Female
– Non-US populations

Kannel et al. Am J Cardiol, 1976

Page 27: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Lifetime Invasive Breast Cancer Risk Estimation

Logistic Regression
– Gail Model

Calibration Problems
– Age <35
– Prior Hx Breast CA
– Strong Family Hx
– Lack of regular mammograms

Gail et al. JNCI, 1989

Page 28: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Intensive Care Unit Mortality Prediction
– APACHE-II
– APACHE-III
– MPM0
– MPM0-II
– SAPS
– SAPS-II

Page 29: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Ohno-Machado, et al. Annu Rev Biomed Eng. 2006;8:567-99

Page 30: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Ohno-Machado, et al. Annu Rev Biomed Eng. 2006;8:567-99

Page 31: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Interventional Cardiology Mortality Prediction

Model       Dates          Location          Sample
NY 1992     1991           NY                5827
NY 1997     1991 – 1994    NY                62670
CC 1997     1993 – 1994    Cleveland, OH     12985
NNE 1999    1994 – 1996    NH, ME, MA, VT    15331
MI 2001     1999 – 2000    Detroit, MI       10796
BWH 2001    1997 – 1999    Boston, MA        2804
ACC 2002    1998 – 2000    National          100253

Matheny, et al. J Biomed Inform. 2005 Oct;38(5):367-75

Page 32: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Individual Risk Prediction
Clinical Examples

Model       Expected Deaths    AUC     HL χ2    HL (p)
NY 1992     96.7               0.82    31.1     <0.001
NY 1997     61.6               0.88    32.2     <0.001
CC 1997     78.8               0.88    27.8     <0.001
NNE 1999    56.2               0.89    45.9     <0.001
MI 2001     61.8               0.86    30.4     <0.001
BWH 2001    136.1              0.89    39.7     <0.001
ACC 2002    49.9               0.90    42.0     <0.001
BWH 2004    70.5               0.93    7.61     0.473

Observed Deaths = 71

Matheny, et al. J Biomed Inform. 2005 Oct;38(5):367-75

Page 33: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Inadequate Calibration
What to do?

In most cases, risk prediction models are developed on much larger data sets than are available for local model generation.
– Decreased variance and increased stability of model covariate values
– Large, external models (especially those that have been externally validated) are generally accepted by domain experts

Goal is to ‘throw out’ as little prior model information as possible while improving performance

Page 34: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Recalibration and Remodeling
General Evaluation Rules

Model recalibration or remodeling follows the same rules of evaluation as model building in general
– Separate training and test data, or
– Cross-Validation, etc.

If temporal issues are central to that domain’s calibration problems, training data should be both before (in time) and separate from testing data

Page 35: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Discrimination versus Calibration

Model A Expected Outcome    Model B Expected Outcome    Observed Outcome
0.05                        0.33                        0
0.10                        0.45                        0
0.15                        0.47                        0
0.20                        0.53                        0
0.25                        0.68                        1
0.30                        0.77                        0
0.35                        0.81                        0
0.40                        0.93                        1
0.45                        0.95                        1
0.50                        0.96                        1
Sum: 2.75                   Sum: 6.88                   Sum: 4
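The contrast in this table can be made concrete with a small Python sketch. It computes the AUC as the proportion of event/non-event pairs ranked correctly (ties counted as one half), together with the observed/expected ratio for each model. The function name `concordance_auc` is illustrative, not from the lecture.

```python
def concordance_auc(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if e > n else 0.5 if e == n else 0.0
               for e in events for n in non_events)
    return wins / (len(events) * len(non_events))


model_a = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
model_b = [0.33, 0.45, 0.47, 0.53, 0.68, 0.77, 0.81, 0.93, 0.95, 0.96]
observed = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

auc_a = concordance_auc(model_a, observed)
auc_b = concordance_auc(model_b, observed)
ratio_a = sum(observed) / sum(model_a)
ratio_b = sum(observed) / sum(model_b)

print(round(auc_a, 3), round(auc_b, 3))      # identical ranking, identical AUC
print(round(ratio_a, 2), round(ratio_b, 2))  # very different observed/expected
```

Both models order the cases identically, so discrimination is the same, while Model B expects far more events than were observed.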

Page 36: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Logistic Regression
General Equation

B_0 is the intercept of the equation, which represents the outcome probability in the absence of all other risk factors (baseline risk)

The model assumes the covariates are independent of one another, and B_x is the natural log of the odds ratio of the risk attributable to that risk factor

$$P(Y = 1) = \frac{1}{1 + e^{-(B_0 + B_1 x_1 + \dots + B_i x_i)}}$$

Page 37: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Logistic Regression
“Original” Model and Cases

Variable            β coeff    Case 1    Case 2    Case 3    Case 4*
Intercept           -3         1         1         1         1
Variable 1          0.2        0         1         1         1
Variable 2          0.5        0         0         1         1
Variable 3          1.5        0         0         0         1
Case Probability               0.047     0.057     0.091     0.310

Minimum predicted risk for each case is intercept only

Adjusting intercept scales all results

* Case 4 is Outcome = 1, Cases 1-3 are Outcome = 0
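The case probabilities can be reproduced with a few lines of Python. The coefficient of 1.5 for Variable 3 is an assumption made for this sketch: it is the value that reproduces the listed Case 4 probability of 0.310 under the general logistic equation. The function name is illustrative.

```python
import math


def logistic_probability(coefficients, case):
    """P(Y = 1) for one case; the first covariate is the constant 1 for the intercept."""
    linear = sum(b * x for b, x in zip(coefficients, case))
    return 1.0 / (1.0 + math.exp(-linear))


# Intercept, Variable 1, Variable 2, Variable 3 (1.5 is assumed, see the text above).
coefficients = [-3.0, 0.2, 0.5, 1.5]
cases = [
    [1, 0, 0, 0],  # Case 1: intercept only, the minimum predicted risk
    [1, 1, 0, 0],  # Case 2
    [1, 1, 1, 0],  # Case 3
    [1, 1, 1, 1],  # Case 4: the only case with Outcome = 1
]

probabilities = [logistic_probability(coefficients, c) for c in cases]
print([round(p, 3) for p in probabilities])
print(round(sum(probabilities), 2))  # total expected outcomes for the four cases
```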

Page 38: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Intercept Recalibration

The proportion of risk contributed by the intercept (baseline) can be calculated for a data set by:

$$RiskInt(\%) = \frac{\sum_{nobs} \frac{1}{1 + e^{-(B_0)}}}{\sum_{nobs} \frac{1}{1 + e^{-(B_0 + B_1 x_1 + \dots + B_i x_i)}}}$$

Page 39: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Intercept Recalibration

The intercept contribution to risk (RiskInt(%)) is multiplied by the observed event rate, and converted back to a Beta Coefficient from a probability:

$$B_0(\text{New}) = -\ln\left(\frac{1}{RiskInt(\%) \times ObsEventRate} - 1\right)$$

A relative weakness of the method is that values can exceed 1, and must be truncated
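A hedged Python sketch of the two steps, applied to the four-case example model from this lecture. The Variable 3 coefficient of 1.5 is an assumption chosen because it reproduces the listed case probabilities; the function names are illustrative; and the truncation step mentioned above is left out, so the sketch simply rejects a baseline probability outside (0, 1).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recalibrated_intercept(coefficients, cases, outcomes):
    """Return (RiskInt, new intercept) following the two formulas above."""
    b0 = coefficients[0]
    intercept_risk = sum(sigmoid(b0) for _ in cases)
    total_risk = sum(sigmoid(sum(b * x for b, x in zip(coefficients, case)))
                     for case in cases)
    risk_int = intercept_risk / total_risk
    event_rate = sum(outcomes) / len(outcomes)
    baseline = risk_int * event_rate
    if not 0.0 < baseline < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    # Convert the new baseline probability back to a beta coefficient (logit).
    return risk_int, -math.log(1.0 / baseline - 1.0)


coefficients = [-3.0, 0.2, 0.5, 1.5]  # Variable 3 value is assumed
cases = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
outcomes = [0, 0, 0, 1]

risk_int, new_b0 = recalibrated_intercept(coefficients, cases, outcomes)
print(round(risk_int, 3), round(new_b0, 2))
```

On these four cases the new intercept comes out at roughly −2.27, close to the −2.2 used in the lecture's worked example.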

Page 40: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Intercept Recalibration
Example Model and Cases

Variable      Old β coeff    New β coeff    Case 1    Case 2    Case 3    Case 4*
Intercept     -3.0           -2.2           1         1         1         1
Variable 1    0.2            0.2            0         1         1         1
Variable 2    0.5            0.5            0         0         1         1
Variable 3    1.5            1.5            0         0         0         1
New Prob.                                   0.099     0.119     0.182     0.500
Orig Prob.                                  0.047     0.057     0.091     0.310

Original Expected = 0.51
Intercept Recalibration Expected = 0.90

Page 41: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Slope Recalibration

In this method, the output probability of the original LR equation is used to model a new LR equation with that output as the only covariate:

$$P(\text{New}) = \frac{1}{1 + e^{-(B_0 + B_1 [P(\text{Old})])}}$$

Page 42: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Slope Recalibration
Example Model and Cases

Variable                 New Model β coeff    Case 1    Case 2    Case 3    Case 4*
New Model Intercept      -3.0                 1         1         1         1
Orig Model Result        11.0                 0.047     0.057     0.091     0.310
New Probability                               0.077     0.086     0.119     0.601
Intercept Probability                         0.099     0.119     0.182     0.500

Original Expected = 0.51
Slope Recalibration Expected = 0.88
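Applying the slope-recalibrated model is a one-line transformation of the original probability. The Python sketch below uses the example's new intercept (−3.0) and slope (11.0) as given. In practice both would be fitted by logistic regression on training data, which is not shown here. The function name is illustrative.

```python
import math


def slope_recalibrate(old_probability, b0, b1):
    """New probability from a logistic model whose only covariate is the old probability."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * old_probability)))


original = [0.047, 0.057, 0.091, 0.310]  # original model output for Cases 1-4
recalibrated = [slope_recalibrate(p, -3.0, 11.0) for p in original]

print([round(p, 3) for p in recalibrated])
print(round(sum(recalibrated), 2))  # total expected outcomes after recalibration
```

Because the inputs here are already rounded to three decimals, the recomputed probabilities can differ from the table in the third decimal place.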

Page 43: Risk Prediction Models: Calibration, Recalibration, and Remodeling

LR Covariate Recalibration

Variable      Old β coeff    New β coeff    Case 1    Case 2    Case 3    Case 4*
Intercept     -3             -2.5           1         1         1         1
Variable 1    0.2            0.1            0         1         1         1
Variable 2    0.5            0.3            0         0         1         1
Variable 3    1.5            3.0            0         0         0         1
New Prob                                    0.076     0.083     0.109     0.711
Orig Prob                                   0.047     0.057     0.091     0.310

Original Expected = 0.51
Covariate Recalibration Expected = 0.97

Page 44: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Recalibration Example
Local Institutional Data

Year    Cases    Mortality (%)
2002    1947     15 (0.8%)
2003    1841     33 (1.8%)
2004    1767     33 (1.9%)

Page 45: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Recalibration Example
External Risk Prediction Models

Model                     Abbrev    Outcomes    Sample    %
National ACC              ACC       707         50123     1.4
Northern New England      NNE       165         15331     1.1
University of Michigan    MIC       169         10796     1.6
Cleveland Clinic          CCL       169         12985     1.3

Page 46: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Results
No Recalibration

Model Observed Expected HL χ2

2003

ACC 33 414 634

NNE 33 39.0 24.3

MIC 33 27.2 6.6

CCL 33 56.3 14.0

2004

ACC 33 418 641

NNE 33 36.6 51.0

MIC 33 23.3 22.9

CCL 33 60.3 21.2

Page 47: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Results
LR Intercept Recalibration

Model Observed Expected HL χ2

2003

ACC 33 45.1 10.0

NNE 33 26.0 43.6

MIC 33 22.1 12.7

CCL 33 24.8 10.5

2004

ACC 33 34.1 14.6

NNE 33 28.9 69.8

MIC 33 26.5 17.6

CCL 33 33.5 14.2

Page 48: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Results
LR Slope Recalibration

Model Observed Expected HL χ2

2003

ACC 33 24.0 12.7

NNE 33 18.6 32.9

MIC 33 20.1 24.0

CCL 33 25.5 15.2

2004

ACC 33 32.0 35.7

NNE 33 31.2 21.7

MIC 33 31.0 23.6

CCL 33 31.6 13.2

Page 49: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Clinical Applications
CALICO

California Intensive Care Outcomes (CALICO) Project
– 23 Volunteer Hospitals beginning in 2002
– Compare hospital outcomes for selected conditions, procedures, and intensive care unit types
– Identified popular, well-validated models
  • MPM0-II, SAPS-II, APACHE-II, APACHE-III
– Evaluated the models on CALICO data and, after determining they were inadequately calibrated, recalibrated each model using the LR Covariate Recalibration method

Page 50: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Clinical Applications
CALICO

Page 51: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Examples on Website

Most of the calculations from this presentation are available on the website in an Excel workbook

Page 52: Risk Prediction Models: Calibration, Recalibration, and Remodeling

Michael Matheny, MD, MS
[email protected]@dsg.harvard.edu

Brigham & Women’s Hospital
Thorn 309
75 Francis Street
Boston, MA 02115

The End