Assessing Binary Outcomes: Logistic Regression

Peter T. Donnan, Professor of Epidemiology and Biostatistics
Statistics for Health Research

Post on 13-Feb-2016


• Objectives of Session
  - Understand what is meant by a binary outcome
  - Understand how analyses of binary outcomes are implemented in a logistic regression model
  - Understand when a logistic model is appropriate
  - Be able to implement a logistic model in SPSS and interpret its output

• Binary Outcome
  Extremely common in health research:
  - Dead / Alive
  - Hospitalisation (Yes / No)
  - Diagnosis of diabetes (Yes / No)
  - Met target, e.g. total cholesterol < 5.0 mmol/l (Yes / No)
  N.B. Any coding can be used, such as 1 / 2, but 0 / 1 is mathematically easier.

• How is the relationship formulated?
  For a linear model the simplest equation is:
    y = a + bx + e
  where y is the outcome, a is the intercept, b is the slope relating y to the explanatory variable x, and e is the error term or random noise.

• Can we fit y as a probability in the range 0 to 1?
  Not quite!
  - y as a continuous variable can take any value from -∞ to +∞
  - The outcome is the probability of an event, p, on the scale 0 to 1
  - Certain transformations of p give the required scale
  - The probit is a normal transformation of p, but its results are not easy to interpret

• We can now fit p as a probability in the range 0 to 1 and y in the range -∞ to +∞:
    y = log(p / (1 - p))
  The logit transformation works!
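The logit transformation on this slide can be tried out numerically. The following is a minimal sketch; the function names `logit` and `inverse_logit` are illustrative choices, not part of the original slides.

```python
import math

def logit(p: float) -> float:
    """Map a probability strictly between 0 and 1 onto the real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(y: float) -> float:
    """Map any real number back to a probability between 0 and 1."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    z = math.exp(y)  # avoids overflow for large negative y
    return z / (1.0 + z)

# Probabilities near 0 or 1 map to large negative or positive values.
for p in (0.01, 0.5, 0.8, 0.99):
    y = logit(p)
    print(f"p = {p:.2f}  logit(p) = {y:+.3f}  back-transformed = {inverse_logit(y):.2f}")
```

Because the back-transformation always returns a value between 0 and 1, a model fitted on the logit scale cannot predict an impossible probability.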

• Logistic Regression Model
    log(p / (1 - p)) = a + bx
  This has very useful properties:
  - The term p / (1 - p) is called the odds of an event. Note: this is not the same as the probability of the event, p
  - If x is binary, coded 0 / 1, then exp(b) is the ODDS RATIO for the outcome in those coded 1 relative to those coded 0, e.g. odds of death in men (1) vs. women (0)

• Logistic Regression Model
  Consider the LDL data. It has two binary outcomes:
  - LDL target achieved
  - Chol target achieved
  For example, consider gender as a predictor, coded Male = 1 and Female = 2.

• For a binary x we can express results as odds ratios (available in crosstabs)

  LDL target achieved
            No    Yes   Odds of yes
  Male     140    563   563/140
  Female   149    531   531/149

• Odds ratio = 3.56 / 4.02 = 0.886, Female cf. Male

  LDL target achieved
            No    Yes   Odds of yes
  Male     140    563   563/140 = 4.02
  Female   149    531   531/149 = 3.56

  N.B. Odds are different from probabilities: for men, p = 563 / (140 + 563) = 0.80 or 80%.
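The crosstab arithmetic can be reproduced in a few lines. This is a minimal sketch using the counts from the LDL table; the helper names `odds_yes` and `prob_yes` are illustrative. The odds ratio for females relative to males is the female odds divided by the male odds.

```python
# Counts from the LDL crosstab: (target not achieved, target achieved)
counts = {"Male": (140, 563), "Female": (149, 531)}

def odds_yes(no: int, yes: int) -> float:
    """Odds of achieving the target: yes / no."""
    return yes / no

def prob_yes(no: int, yes: int) -> float:
    """Probability of achieving the target: yes / (no + yes)."""
    return yes / (no + yes)

odds_male = odds_yes(*counts["Male"])
odds_female = odds_yes(*counts["Female"])
or_female_vs_male = odds_female / odds_male

print(f"Odds (male)   = {odds_male:.2f}")
print(f"Odds (female) = {odds_female:.2f}")
print(f"OR female vs male = {or_female_vs_male:.3f}")
print(f"P(yes | male) = {prob_yes(*counts['Male']):.2f}")
```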

• Odds ratio from Crosstabs
  Obtain odds ratios for 2 x 2 tables from Crosstabs by selecting the Risk option.

• Results from Crosstabs
  Odds ratios for achieving the LDL target in females vs. males.
  N.B. The OR given is for female vs. male = 0.886.

• Fit Logistic Regression Model
  - Dependent: binary outcome, LDL target met (Yes = 1, No = 0)
  - Independent: gender (1 = M, 2 = F)
  - Should give the same result as crosstabs
  - Select Analyze / Regression / Binary Logistic
  - Select the option of 95% CI for exp(b)

• Regression / Binary logistic..

• Odds ratio from logistic model results for a binary predictor
  EXP(B) = odds ratio, F vs. M.
  Note that the OR for men vs. women = 1 / 0.886 = 1.13.
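As a rough cross-check outside SPSS, the same model can be fitted in plain Python. This is a minimal Newton-Raphson sketch, not SPSS's implementation; the function name `fit_logistic`, the grouped-data layout and the 0 / 1 recoding of gender (0 = male, 1 = female) are assumptions made for illustration. The counts are those of the LDL crosstab.

```python
import math

# Grouped LDL data: (x, number of patients, number achieving target).
# Assumed recoding for illustration: x = 0 for males, x = 1 for females.
groups = [(0, 140 + 563, 563), (1, 149 + 531, 531)]

def fit_logistic(groups, iterations=25):
    """Fit log(p / (1 - p)) = a + b*x by Newton-Raphson on grouped data."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # information matrix entries
        for x, n, yes in groups:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = yes - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information) * step = gradient
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    se_b = math.sqrt(h00 / det)  # standard error of b from the inverse information
    return a, b, se_b

a, b, se_b = fit_logistic(groups)
print(f"exp(b), female vs male = {math.exp(b):.3f}")
print(f"1 / exp(b), male vs female = {1 / math.exp(b):.2f}")
print(f"95% CI for exp(b): {math.exp(b - 1.96 * se_b):.3f} to {math.exp(b + 1.96 * se_b):.3f}")
```

With a single binary predictor the fitted exp(b) coincides with the crosstab odds ratio, which is why the two procedures should agree.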

• Fit Logistic Regression Model: continuous predictor
  - Dependent: binary outcome, LDL target met
  - Independent: continuous predictor, adherence
  - B represents the change in the log odds for a 1-unit increase in adherence, so exp(B) is the ODDS RATIO for a 1-unit increase
  - B x 10 represents the change in the log odds for a 10-unit increase in adherence, so exp(10 x B) is the ODDS RATIO for a 10-unit increase

• Odds ratio from logistic model results for a continuous predictor
  EXP(B) = odds ratio for a 1% increase in adherence.
  The OR for a 10% increase is exp(10 x 0.010) = 1.105, i.e. a 10.5% increase in the odds of meeting the LDL target for each 10% increase in adherence.
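The rescaling from a 1% to a 10% increase can be checked directly. This is a small sketch using the coefficient 0.010 quoted on the slide; the variable names are illustrative. Because the coefficient acts on the log-odds scale, it is multiplied by 10 before exponentiating, which is the same as raising the 1-unit odds ratio to the power 10.

```python
import math

b_per_percent = 0.010  # coefficient for a 1% increase in adherence (from the slide)

or_1 = math.exp(b_per_percent)         # odds ratio per 1% increase
or_10 = math.exp(10 * b_per_percent)   # odds ratio per 10% increase

print(f"OR per 1% increase  = {or_1:.3f}")
print(f"OR per 10% increase = {or_10:.3f}")
print(f"Same as OR_1 ** 10  = {or_1 ** 10:.3f}")
```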

• Fit Logistic Regression Model: categorical predictor
  - Dependent: binary outcome, LDL target met
  - Independent: APOE genotype (1 to 6)
  - Choose a reference category; here the worst outcome is in genotype 6, so choose 6 to give ORs > 1
  - exp(B) represents the OR for each category relative to the reference category
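Choosing a reference category amounts to indicator (dummy) coding: each non-reference category gets its own 0 / 1 variable, and the reference category is the one where all indicators are 0. The sketch below illustrates this for six genotype codes with 6 as the reference; the function name `indicator_code` and the column labels are hypothetical and are not SPSS output.

```python
def indicator_code(value, categories, reference):
    """Return one 0/1 indicator per non-reference category."""
    return {f"APOE({c})": int(value == c) for c in categories if c != reference}

categories = range(1, 7)  # genotype codes 1 to 6
reference = 6             # reference category, as on the slide

for genotype in (2, 6):
    print(genotype, indicator_code(genotype, categories, reference))
```

Each fitted coefficient then compares one category with the reference, so its exponential is the odds ratio for that category relative to genotype 6.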

• Regression / Binary Logistic: choose Categorical

• Odds ratios from logistic model results for a categorical predictor
  EXP(B) = odds ratio for APOE(2) vs. APOE(6): OR = 4.381 (95% CI 1.742, 11.021).
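The reported interval is symmetric on the log scale, not on the odds ratio scale. The sketch below illustrates this with the numbers quoted on the slide, assuming the interval is a Wald interval of the form exp(B ± 1.96 x SE); that assumption and the variable names are mine, not taken from the slides.

```python
import math

or_hat, lower, upper = 4.381, 1.742, 11.021  # values quoted on the slide

b = math.log(or_hat)  # coefficient on the log-odds scale
# Assumed Wald interval: the CI spans 2 * 1.96 standard errors on the log scale.
se = (math.log(upper) - math.log(lower)) / (2 * 1.96)

# The geometric mean of the limits should recover the odds ratio.
geometric_mid = math.sqrt(lower * upper)
rebuilt = (math.exp(b - 1.96 * se), math.exp(b + 1.96 * se))

print(f"B = {b:.3f}, implied SE = {se:.3f}")
print(f"Geometric midpoint of CI = {geometric_mid:.3f}")
print(f"Rebuilt CI = {rebuilt[0]:.3f} to {rebuilt[1]:.3f}")
```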

• Epidemiological Designs
  - The logistic model is common in epidemiological research
  - In case-control designs, cases are coded 1 and controls 0, and this is used as the dependent variable
  - In a cohort study, the outcome (e.g. death) is used as the binary outcome in the logistic model
  - Note that in a cohort study exp(b) is still an odds ratio; it approximates the Relative Risk (RR) only when the outcome is rare

• Definition: Clinical Prediction Rule
  A clinical tool that quantifies the contribution of:
  - History
  - Examination
  - Diagnostic tests
  It stratifies patients according to their probability of having the target disorder. The outcome can be in terms of diagnosis, prognosis, referral or treatment.

• Thresholds for decision making
  [Figure: derived probability of disease on a scale from 0% to 100%. Below the test / reassurance threshold: reassurance. Between the two thresholds: further diagnostic testing. Above the diagnosis / test threshold: treatment.]
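The threshold idea can be sketched as a simple lookup from a derived probability to an action. The threshold values 0.10 and 0.60 below are purely hypothetical placeholders, as are the function and parameter names; real thresholds depend on the clinical setting.

```python
def decide(probability: float,
           test_reassurance_threshold: float = 0.10,   # hypothetical value
           diagnosis_test_threshold: float = 0.60) -> str:  # hypothetical value
    """Map a derived probability of disease to one of three actions."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= diagnosis_test_threshold:
        return "treatment"
    if probability >= test_reassurance_threshold:
        return "further diagnostic testing"
    return "reassurance"

for p in (0.03, 0.35, 0.80):
    print(f"p = {p:.2f}: {decide(p)}")
```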

• Ottawa ankle rule

• Risk Stratification: Kaiser-Permanente Pyramid
  - Identify high risk through risk stratification
  - Intervene through case management at highest risk

• Framingham Risk Algorithm
  Prediction of risk: cardiovascular (Framingham). Example: 55-year-old woman, 15-20% 5-year risk.

• Increasing appearance of prediction models in literature (ISI Web of Knowledge v3)

• Stages of development and assessment of a CPR
  - Step 1, Derivation (cross-sectional or cohort study): identification of factors with predictive power
  - Step 2, Validation (cross-sectional or cohort study): evidence of reproducible accuracy. Application of a rule in similar clinical settings and populations or, better still, in multiple clinical settings and different populations with varying prevalence and outcomes of disease
  - Step 3, Impact Analysis (randomized controlled trial): evidence that the rule changes physician behaviour and improves patient outcomes and/or reduces costs

• How to derive a CPR?
  - Toss a coin to make the decision?
  - Individual opinion and experience?
  - Huddle of wise ones: Delphi technique to reach consensus?
  - Statistical prediction models!

• Regression Models for prediction
  In all of these models we combine a set of factors:
  - Usually between 2 and 20 predictors
  - Occam's razor suggests smaller is better
  - Fit a multiple regression model
  - Extract probabilities of outcome or diagnosis
  - Create the CPR

• Regression Models for prediction
  - Continuous outcomes: linear regression
  - Binary outcomes: logistic regression model; survival models (Cox PH, Weibull, log-logistic, etc.)
  - Ordinal or nominal outcomes: ordinal logistic regression

• We can now fit p as a probability in the range 0 to 1 and y in the range -∞ to +∞ using the logit transformation:
    y = log(p / (1 - p))

• Statistical prediction Models
  Logistic regression model:
    log(p / (1 - p)) = a + b1 x1 + b2 x2 + b3 x3 + ...
  where p is the probability of the event and the factors (x) increase or decrease the risk of this event.

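• Derivation of probability of events
  Logistic regression model: call the linear predictor (LP) a linear function of the predictors x1, x2, x3, etc.:
    LP = a + b1 x1 + b2 x2 + b3 x3 + ...
    log(p / (1 - p)) = LP

• Derivation of probability of events
  Then take exp of both sides:
    p / (1 - p) = exp(LP)

• Derivation of probability of events
  Then rearrange:
    p = exp(LP) / (1 + exp(LP))
  Or:
    p = 1 / (1 + exp(-LP))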
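The derivation above turns a fitted model into a predicted probability for an individual. The sketch below walks through it with made-up numbers: the intercept, the coefficients for `age` and `male`, and the example patient are all hypothetical and do not come from the slides or from any fitted model.

```python
import math

def linear_predictor(intercept: float, coefs: dict, patient: dict) -> float:
    """LP = a + b1*x1 + b2*x2 + ... for one patient."""
    return intercept + sum(coefs[name] * patient[name] for name in coefs)

def probability(lp: float) -> float:
    """p = 1 / (1 + exp(-LP)), the rearranged logistic model."""
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical coefficients and patient, for illustration only.
intercept = -2.0
coefs = {"age": 0.03, "male": 0.5}
patient = {"age": 60, "male": 1}

lp = linear_predictor(intercept, coefs, patient)
p = probability(lp)
print(f"LP = {lp:.2f}, odds = {math.exp(lp):.2f}, p = {p:.2f}")
```

Both rearranged forms give the same probability; the second is often preferred in code because it needs only one exponential.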

• Example: PEONY model
  - Predicts risk of emergency admission to hospital over the next year
  - Now implemented in NHS Tayside as part of Virtual Wards management of LTC
  - PEONY II model developed: watch this space!
  - Donnan et al., Arch Int Med 2008
  - Risk stratification based on derived probabilities

• Other binary models
  - The logistic model is only applicable when the length of follow-up is the same for each individual, e.g. 5-year follow-up of a cohort
  - For binary outcomes where censoring occurs, i.e. people leave the cohort through death or migration, the length of follow-up varies and survival models such as the Cox Proportional Hazards model are needed

• Summary
  - Logistic model easily fitted in SPSS
  - Clear link with ODDS RATIOS
  - Common model for case-control and cohort studies, as well as for development of clinical prediction models

• General References
  - Campbell MJ, Machin D. Medical Statistics: A Commonsense Approach. 3rd ed. New York: Wiley, 1999.
  - Hosmer DW, Lemeshow S. Applied Logistic Regression. New Jersey: John Wiley & Sons, 2000.
  - Altman DG. Practical Statistics for Medical Research. London: Chapman and Hall, 1991.
  - Armitage P, Berry G. Statistical Methods in Medical Research. 3rd ed. Oxford: Blackwell Scientific, 1994.
  - Agresti A. An Introduction to Categorical Data Analysis. New York: Wiley, 1996.

• Practical: Fit Multiple Logistic Regression Model
  - Dependent: binary outcome, LDL target met (Yes = 1, No = 0)
  - Independent: gender (1 = M, 2 = F); add APOE, adherence, etc.
  - Remember: select Analyze / Regression / Binary Logistic
  - Select the option of 95% CI for exp(b)

• 3) Screening for variables to eliminate
  - Consider screening procedures to eliminate some of the variables under consideration
  - Test each variable separately
  - If p > 0.3, a variable would have to be a very strong confounder to become significant on adjustment in a multiple regression, so it could be discarded
  - Hosmer-Lemeshow criteria
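The screening step can be sketched as a simple filter on the p-values from the separate single-variable models. The p-values below are invented for illustration and the function name `screen` is hypothetical; only the 0.3 cut-off comes from the slide.

```python
def screen(p_values: dict, threshold: float = 0.3):
    """Split candidate variables into those kept and those discarded."""
    keep = [name for name, p in p_values.items() if p <= threshold]
    drop = [name for name, p in p_values.items() if p > threshold]
    return keep, drop

# Hypothetical univariate p-values, one per candidate predictor.
univariate_p = {"gender": 0.36, "adherence": 0.001, "APOE": 0.02, "age": 0.25}

keep, drop = screen(univariate_p)
print("Carry forward to the multiple model:", keep)
print("Candidates to discard:", drop)
```

A filter like this is only a first pass; variables known to be clinically important can still be added back by hand.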

• 4) A mixture of automatic procedures and self-selection
  - Use automatic procedures as a guide
  - Compare stepwise and backward elimination
  - Think about what factors are important
  - Add important factors
  - Do not blindly follow statistical significance

• Remember Occam's Razor
  "Entia non sunt multiplicanda praeter necessitatem"
  "Entities must not be multiplied beyond necessity"
  William of Ockham (1288-1347), 14th-century friar and logician