
Post on 24-Jun-2020


NTDT5602 Methods in Nutrition Research

Module 1 Nutritional Epidemiology

Nutritional Epidemiology Lecture 1 – Introduction & Study Design

Introduction

What is epidemiology?

• "… the study of the distribution and determinants of health-related states or events in specified populations and the application of this study in the prevention and control of health problems"

What does this mean?

• The magnitude of a health problem
  o Frequency – how often does a health problem occur?
• The cause of a health problem
  o Determinants – the associations between lifestyle factors and disease. E.g., what is the cause of bowel cancer?
• The evaluation of a treatment or prevention campaign
  o Clinical or community application – e.g., for a drug used in medicine, we see if it works by running an RCT; for a prevention campaign, we run a cluster RCT in which some centres (e.g., schools) get the treatment and others do not, and then measure the difference between the centres

Overall aim of these lectures

• To be able to critically appraise the nutrition and dietetic scientific literature

Course overview

• Study types and levels of evidence – looking at study designs
• Measures of frequency – using the right measures of frequency
• Measures of association
• Confounding
• Selection bias – what can happen if you do not select the study population using the best methods
• Measurement error – i.e., bias in measuring the outcome factors
• Causality in nutrition-related disease – how we reach a final conclusion after looking at all the papers, and how we conclude that there is causality between diet and disease, e.g., diet with less fibre → bowel cancer in 30 yrs

Most research questions fall into 3 types:

1) How common is the nutritional problem? Or, what is the magnitude/frequency of the nutritional problem?
  o = Observational descriptive study, usually cross-sectional
    ▪ How common is the nutritional problem? FREQUENCY
      • Select study population → Representative sample of the population → Measure factor of interest
  o E.g., National Health Surveys
  o E.g., how many people in Australia are eating less than 2 serves of fruit? You can't go out and ask 20 million people how much fruit they're eating; therefore you have to select a representative sample.
  o How many adults in Australia are obese? Australians 18 years and over → Representative sample (taking into account the demographic profile of Australia) → Get weight and height and then BMI; HOW MANY ≥30?

2) What nutritional factor caused or prevented the disease?
  o = Observational cohort study, case-control study, cross-sectional analytical
  o Observational studies for causal relationships (is the study factor causing the outcome factor or not?)
    ▪ Looking at causal relationships from observational analytical studies
    ▪ Study factor (exposure) ––???––> Outcome factor
    ▪ High saturated fat diet ––???––> Ischaemic Heart Disease
  o Cohort studies (analytical)
    ▪ Exposure to the study factor is determined by the subjects – i.e., cohort studies are observational, unlike RCTs in which you assign people. In cohort studies the subjects determine their own exposure and the investigator has nothing to do with their choices; they just monitor.
    ▪ Researchers measure the extent of the exposure (usually using FFQs)
      • Researchers rank the people based on exposure, e.g., tertiles/quintiles of lowest intake → highest intake
    ▪ The outcome factor is measured LATER – e.g., see how many new cases of disease there are 5-20 yrs later
    ▪ Determine the study factor → (1) Group exposed to study factor (2) Group not exposed → Measure outcome factor in exposed versus unexposed group (continually monitored every 5 years until the point of analysis; then classify the people)
      • E.g., (1) Group exposed (smokers) (2) Group not exposed (non-smokers) → measure outcome factor (number of people who got lung cancer) in each group
      • E.g., the EPIC study (n=435000): measure dietary fibre intake, with 6-7y follow-up → (1) Very low intake (2) Low intake (3) Medium intake (4) High intake → Measure the occurrence of gastric cancer at the different intake levels

Nutritional Epidemiology Lecture 2.2 – Measures of Association

Key measures of association – (measures to decide whether a particular dietary pattern is associated with decreased/increased incidence of disease)

• Relative Risk (RR)
• Odds Ratio (OR)
• Attributable risk percent (AR%)
• Population attributable risk (PAR)
• These are measures of association in cohort studies and case-control studies
• → allow us to determine diet-disease relationships

What Associations?

• How big, or how strong, is the association between the study factor (exposure) and the outcome factor?
  – So we can apply a numerical value to the risk that a particular diet will lead to a disease

Incidence

• Cumulative Incidence = number of people experiencing a NEW event during a time period / number of susceptible people at the beginning of the time period
• Incidence is a measure of events (event rate)
• Incidence is a measure of risk

Relative Risk (Risk Ratio) – for cohort studies and RCTs

• RR = incidence of the outcome in the exposed group / incidence of the outcome in the control group
  o The RR describes how much more likely people exposed to the nutritional factor are to get the disease (RR > 1), or how far they are protected from it (RR < 1), compared with the control group
• Researchers measure participants' lifestyle behaviours, rank them into groups according to their level of exposure, and then measure how many people in each group get a particular disease.
• The lowest quintile = the control group
• E.g., if the incidence in the exposed group is 20% and the incidence in the control group is 10%, then RR = 2.0
  o This means that the exposed are twice as likely to have the event

Cohort study of effect of food x consumption on metabolic syndrome

             Present (have the disease)   Absent (don't have the disease)   Total
Exposed      2,000                        28,000                            30,000
Unexposed    500                          14,500                            15,000
Total        2,500                        43,500                            45,000

• Incidence in exposed group: 2000/30000 = 0.067
• Incidence in unexposed group: 500/15000 = 0.033
• Relative risk = (2000/30000) / (500/15000) = 2.0

Risk difference

• How many outcomes are due to the exposure?
• Risk in exposed group minus risk in the unexposed group
• Risk Difference = I(exposed) – I(unexposed)
• Risk Difference = 2000/30000 – 500/15000 = 0.033 = 33 per 1000
• This gives you the DIFFERENCE in risk between groups
• Normally we see "Relative Risk" in papers
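
The incidence, relative risk and risk difference for the food x cohort table can be checked with a few lines of Python. This is a minimal sketch: the function name `risk_measures` and its argument names are illustrative assumptions, and only the four cell counts come from the table. Working from the unrounded incidences gives a relative risk of 2.0 and a risk difference of about 33 per 1000.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return the incidence in each group, the relative risk and the risk difference."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return {
        "incidence_exposed": incidence_exposed,
        "incidence_unexposed": incidence_unexposed,
        "relative_risk": incidence_exposed / incidence_unexposed,
        "risk_difference": incidence_exposed - incidence_unexposed,
    }


# Cell counts from the food x / metabolic syndrome cohort table.
result = risk_measures(2000, 30000, 500, 15000)

print(round(result["incidence_exposed"], 3))        # 0.067
print(round(result["incidence_unexposed"], 3))      # 0.033
print(round(result["relative_risk"], 2))            # 2.0
print(round(result["risk_difference"] * 1000, 1))   # 33.3 (per 1000 people)
```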


Nutritional Epidemiology Lecture 3 – Confounding

Key Points

• Understand the definition of a confounding variable
• Understand the difference between a crude and an adjusted estimate of association
• Understand in general terms the main ways of controlling confounding (restriction, randomization, matching, stratification and multivariate analysis)
• Understand the difference between confounding and effect modification

Definition of confounding

• Confounding occurs when a measure of association is biased because of the association of the study factor with other factors that influence the outcome factor.
  o E.g., you may get a positive association between the amount of vegetable consumption and diabetes. This association may be real, or another factor that influences the development of diabetes, e.g., BMI, may be confounding the observed RR.
  o I.e., confounding occurs when there is bias in the measurement of association because another variable is changing the observed risk of the study/exposure factor.

An Example – "Is high alcohol intake associated with lung cancer?"

                            Presence of Lung Cancer
Exposed to High Alcohol     YES      NO        TOTAL
YES                         615      24385     25000
NO                          210      24790     25000
TOTAL                       825      49175     50000

• Relative Risk = (615/25000) / (210/25000) = 0.0246 / 0.0084 = 2.9
  o Incidence of lung cancer in those exposed to high alcohol = 615/25000
  o Incidence of lung cancer in those who aren't exposed to high alcohol (control) = 210/25000
  o I.e., study factor: high alcohol ––––> outcome factor: lung cancer
    ▪ However, this is the crude RR, and it may be biased due to the influence of confounders

• Confounding
  o Study factor: high alcohol → Confounding variable: smoking → Outcome factor: lung cancer
    ▪ Somewhere along the way, there may be a factor that's confounding the relationship between high alcohol and lung cancer, i.e., smoking.
    ▪ The observed strong association between alcohol and lung cancer could be due to smoking being a confounder.
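
To see how a confounder can produce a crude RR like this, the sketch below first recomputes the crude RR from the table and then splits the same people by smoking status. The stratum counts are HYPOTHETICAL (they are not given in the lecture) and were chosen only so that they add back up to the crude table; the helper name `relative_risk` is likewise an assumption. In this made-up split, heavy drinkers are mostly smokers, and within each smoking stratum alcohol makes no difference to risk.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Crude table from the lecture: high alcohol intake vs lung cancer.
crude_rr = relative_risk(615, 25000, 210, 25000)

# HYPOTHETICAL stratification by smoking status (cases, group size).
# The strata sum to the crude table: 600 + 15 = 615 and 150 + 60 = 210 cases.
strata = {
    "smokers": {"exposed": (600, 20000), "unexposed": (150, 5000)},
    "non-smokers": {"exposed": (15, 5000), "unexposed": (60, 20000)},
}

stratum_rr = {
    name: relative_risk(*groups["exposed"], *groups["unexposed"])
    for name, groups in strata.items()
}

print(round(crude_rr, 2))                              # 2.93
print({k: round(v, 2) for k, v in stratum_rr.items()})  # both strata: 1.0
```

With these illustrative numbers the crude RR of about 2.9 disappears once smokers and non-smokers are analysed separately, which is the pattern stratification is designed to reveal.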


Nutritional Epidemiology Revision

Study Design

• What is the research question? – There are 3 types of questions
• What is/are the study factors and outcome factors?
• What is the study type?
  o RCT
  o Cohort
  o Case-control
  o Cross-sectional descriptive
  o Cross-sectional analytical
• What level of evidence is offered by the study type?
  o Is the level of evidence offered high or low when studying causation?
• What study design is feasible to answer the research question?
  o What study type would give the best evidence vs what is most feasible?
    ▪ Sometimes there's a trade-off between feasibility and the highest level of evidence (because the ideal design is impractical)
    ▪ E.g., the best way to study the relationship between dietary fibre and bowel cancer is an RCT, to avoid subject bias. But it would take 20 years for the relationship between dietary fibre and bowel cancer to become apparent, and you wouldn't be able to ethically do an RCT for 20 years

Study Type – the study type you choose depends on your question (of which there are 3 types)

• How common is the nutritional problem?
  o Use a cross-sectional study, e.g., survey
  o → Prevalence (how common)
  o New cases = "incidence"
• What nutritional factor caused or prevented the disease?
  o Cohort study
  o Case-control study
    ▪ Case-control = opposite of a cohort study
    ▪ You already have your cases, and then you recruit a representative group of controls, then use recall to examine previous diet and get the OR for the disease in relation to a particular dietary exposure
  o Cross-sectional analytical (but you cannot infer causation from a cross-sectional analytical study)
• Does dietary intervention prevent or cure the disease?
  o RCT
  o E.g., does a particular drug lower blood pressure? Do n-3 fatty acids cure rheumatoid arthritis?
  o Systematic literature review with meta-analysis of RCTs = highest level of evidence

Epidemiologic Study Designs

[Figure: epidemiologic study designs]

Module 4 Statistics for Nutrition Practice

Outline

• Descriptive Statistics (Lecture 1)
  o Graphical Displays of Data
  o Measuring Centre and Spread
  o Tabular Data
• Inferential Statistics (Lectures 2 and 3)
  o How to set up a Hypothesis Test
  o Types of Hypothesis Tests

Descriptive and Inferential Statistics

• Descriptive Statistics – describe the sample
• Inferential Statistics – use the sample to test theories about the population

Statistics Lecture 1 – Descriptive Statistics

Learning Objectives

• Summarise statistical data using:
  o Graphs and tables
  o Appropriate measures of location including means and medians
  o Appropriate measures of variability including standard deviations and interquartile ranges
• Interpret significance tests and confidence intervals
• Use IBM SPSS statistical software to present data

Descriptive Statistics

• The purpose of descriptive statistics is to become familiar with the data that you have collected
  o What information can you get from the data?
• What are we looking for?
  o What is a "typical" response?
  o How different are responses from different individuals?
  o Do responses differ between groups of individuals?
  o Is a measurement of one aspect of an individual dependent on another aspect?

Types of Data

• The way that we look at data will depend on the type of data that we have
• Categorical
  o Nominal – no order. E.g., nationality, gender. Summarise with the mode.
  o Ordinal – order. E.g., Likert scale, level of education. Summarise with the mode or median.
• Numeric
  o Discrete – takes whole number values. E.g., number of birds in a tree. Summarise with the mean, median or mode.
  o Continuous. E.g., height. Summarise with the mean or median.
• If your data is symmetrical, you use the mean. If it is skewed, you use the median. Mode is used as a last resort.

Measures of Central Tendency

• Where is the 'middle' of the data? – when you need to summarise the data
  o We could also think of it as what we would expect to measure for a typical respondent.
• Three different measures
  o Mean – add the observations and then divide by the number of observations
    ▪ The mean can be quite sensitive to large values and skewness – avoid using the mean to describe skewed data (asymmetry in data)
    ▪ Skewness
      • When the mean is dragged out to the right or left (may be due to extreme values)
      • With symmetric data, the mean and the median are very similar, and often the same. However, the mean has many statistical properties that we can use, which the median doesn't have.
    ▪ Skewness from boxplots
      • If most of the observations are concentrated on the low end of the scale, the distribution is skewed right (positively skewed); and vice versa
      • If a distribution is symmetric, the observations will be evenly split at the median, as shown in the middle figure.
      • [Figure: three boxplots labelled Positively Skewed, Symmetric and Negatively Skewed]
  o Median – is the middle observation (n = odd) or the average of the two middle observations (n = even)
    ▪ The median is useful when describing skewed data
    ▪ Boxplot revisited [Figure: boxplot]
  o Mode – is the most frequently occurring observation (merely a descriptive tool)
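
The sensitivity of the mean to extreme values can be illustrated with Python's standard `statistics` module. The two small data sets below are made up for illustration (they are not from the lecture): they differ only in their largest value, which drags the mean to the right while the median stays put.

```python
from statistics import mean, median, mode

# Made-up illustrative data: identical except for the largest observation.
symmetric = [60, 65, 70, 75, 80]
skewed = [60, 65, 70, 75, 180]   # one extreme value creates positive skew

print(mean(symmetric), median(symmetric))   # 70 70
print(mean(skewed), median(skewed))         # 90 70

# The mode is simply the most frequently occurring observation.
print(mode([1, 2, 2, 3]))                   # 2
```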

Statistics Lectures 2 & 3 – Inferential Statistics

Where does inference fit in?

• So far we have looked at how to describe data, but we haven't been able to test our observations
• Inference provides the tools to test whether our observations can be applied to the population, as opposed to just seeing the results by chance
• From the sample, we can make an inference about the population, including the mean and variation.

The Structure of a Hypothesis Test

• Step 1: Set up Hypotheses
  o Null Hypothesis: H0
  o Alternative Hypothesis: H1
• Step 2: Choose an appropriate test
  o Depending on the characteristics of your dataset, there is an appropriate test to apply to test your hypothesis
• Step 3: Execute the test in SPSS and obtain a p-value
  o P-value = probability that you would observe a difference between the group means at least as large as the one in your sample when the null hypothesis is actually true
  o I.e., the chance that we would observe the sample results purely by chance when the null hypothesis is true
• Step 4: Make a conclusion
  o Either reject H0 if the p-value is small enough, or do not reject H0
  o You cannot accept a hypothesis – rejecting H0 doesn't automatically mean that the alternative hypothesis is true.
• Step 5: State the conclusion in plain language
  o So people who don't understand statistics can still understand your conclusions.

Some Concepts

• The null hypothesis can usually be expressed as an equality between means from each group
  o If there is only one group, then the null hypothesis will be that the mean equals a particular value (a hypothesized value – just a nominated value)
• The alternative hypothesis can either be one sided or two sided
  o One sided – we reject the null hypothesis if the sample mean for one group is sufficiently larger than the sample mean for the other group (or sufficiently larger than the hypothesized value)
  o Two sided – we reject the null hypothesis if the sample mean for one group is sufficiently different to the other group (larger or smaller – no nominated direction)
  o Which to choose depends on what you want to know
• We make a decision based on a p-value
  o The p-value is the probability that we would observe the unequal sample means or something more extreme by chance when the null hypothesis is true.
• We compare the p-value to a level of significance
  o The level of significance is the level of risk that we are willing to accept that we will incorrectly reject a true null hypothesis
  o By default, we will use 5% = 0.05 significance – that is, we are willing to accept a 5% chance that we will incorrectly reject a true null hypothesis
    ▪ E.g., we are willing to accept a 5% chance that we incorrectly conclude that there is inequality between the means when in fact there isn't
  o In some situations we may change the level of significance
  o If p-value < 0.05 we reject H0
  o If p-value ≥ 0.05 we do not reject H0

Types of Hypothesis Tests

• The choice of test depends on what you're trying to do:

What You're Trying to Do                                       Hypothesis Test
Comparing the mean of a sample to a value                      1-sample test
Comparing the means of two samples to each other               2-sample test
Comparing the means of more than two samples to each other     ANOVA
Comparing proportions                                          1- and 2-sample proportion tests
Finding relationships between categorical variables            Chi Square test

• It also depends on the type of data that you have
  o If you can assume that your data are normally distributed, use a parametric test (t, ANOVA)
  o If you cannot assume that your data are normally distributed, use a non-parametric test (Wilcoxon, Mann-Whitney, Kruskal-Wallis)

Which Test?

• Paired data = when the result on the 2nd measurement is dependent on the 1st measurement
  o E.g., individuals measured before and after an intervention

Types of Hypothesis Tests

• Recall that the appropriate graphical display of data depends on the type of data that you have:

Type of data    Categorical            Numeric
Categorical     Multiple Bar Charts    Multiple Boxplot
Numeric         Multiple Boxplot       Scatterplot

• It is the same for hypothesis testing

Type of data    Categorical         Numeric
Categorical     Chi-Squared Test    1-sample t, paired t, 2-sample t, ANOVA
Numeric         Regression          Regression

1-Sample Tests

• I have one column of data, and I want to compare the population mean to a particular value, e.g., a known value from a well-defined population
• 1-sample t-tests assume that the data are normally distributed
  o I.e., if you have clearly skewed data, it is not appropriate to use a 1-sample t-test

H0: µ = value
H1: µ ≠ value (two sided alternative)
    µ < value (one sided alternative)
    µ > value (one sided alternative)

• 1-sample t-test example – Pulse Data
  o In the lab sessions, we considered a data set based on the pulses of people who either ran for one minute, or rested for one minute.
  o We can use inferential techniques to test some of the theories that we may have made from our exploration.
  o Suppose that we would like to test whether our group has a starting pulse rate that is different from the typical resting pulse rate of 75bpm
  o Step 1: Set up the hypotheses
    ▪ H0: µ = 75
    ▪ H1: µ ≠ 75 (i.e., the mean is different to 75)
  o Step 2: Choose an appropriate test
    ▪ 1-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.067, which is greater than the 0.05 level of significance
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the initial pulse rate measurements do not differ significantly from the typical resting pulse rate of 75bpm (i.e., 73 is not statistically different to 75)
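
For readers without SPSS, the five steps of a two-sided 1-sample t-test can be sketched in plain Python. Everything here is an assumption for illustration: the pulse values are made up (the lecture's data set is not reproduced in the notes), the function names are hypothetical, and the p-value is obtained by numerically integrating the t density with Simpson's rule rather than by whatever method SPSS uses.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the t density between -|t| and +|t|."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # the density is symmetric about 0
    return max(0.0, 1 - central_area)


def one_sample_t(sample, hypothesised_mean):
    """Return (t statistic, two-sided p-value) for H0: mu = hypothesised_mean."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    t_stat = (mean(sample) - hypothesised_mean) / standard_error
    return t_stat, two_sided_p(t_stat, n - 1)


# Made-up starting pulse rates (bpm), for illustration only.
pulses = [68, 72, 75, 70, 74, 79, 66, 73, 77, 71]
t_stat, p_value = one_sample_t(pulses, 75)

alpha = 0.05
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(round(t_stat, 2), decision)
```

A paired t-test can be run with the same function by passing the within-person differences as the sample and 0 as the hypothesised mean. Note that `math.gamma` overflows for very large samples, so this sketch suits small teaching examples only.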

2-sample tests

• I have two samples and I want to compare them to each other
  o If each individual has one measurement in each sample, then my data is paired by the individual; use a paired t-test – also used for individuals who were matched according to certain characteristics thought to influence outcomes
  o If my data is not paired, use a 2-sample t-test
• Both tests assume that the data are normally distributed – you will get misleading results if you have skewed data
• Paired t-test example – Pulse Data
  o Suppose that I want to see whether the first and second pulse rates for those participants who ran were different or not
  o Step 1: Set up the hypotheses
    ▪ H0: µ1 = µ2 – the mean obtained from the first pulse rate measurements is the same as the mean obtained from the second pulse rate measurements
    ▪ H1: µ1 ≠ µ2 – the means are not the same
  o Step 2: Choose an appropriate test
    ▪ Paired t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance) – the probability of observing a difference of this magnitude when in truth there isn't a difference is very small
    ▪ Therefore we will reject H0 – we reject the notion of there being no difference
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the initial pulse rate and the final pulse rate for those participants who ran

• 2-Sample t-test example – Pulse Data
  o The two sample t-test determines whether the means of two unrelated populations are the same or not
  o In general, the hypotheses are:
    ▪ H0: µ1 = µ2 – i.e., the mean of group one is equal to the mean of group two
    ▪ H1: µ1 ≠ µ2 – i.e., the difference between the means of the groups is not 0
  o We can also decide whether or not we should use a "pooled variance", i.e., assume that the variances of the two groups are equal and obtain a more powerful test
    ▪ To decide this, we need to look at a 2-sample variances test
    ▪ SPSS automatically conducts a test for equal variances when it does a 2-sample t-test
      • This test is called Levene's test
    ▪ Levene's test has hypotheses:
      • H0: σ1² = σ2² – i.e., there is no difference in the variances
      • H1: σ1² ≠ σ2² – i.e., there is a difference in the variances
    ▪ If we reject the null hypothesis, we cannot assume equal variances and need to use the p-value associated with "equal variances not assumed"
    ▪ If we do not reject the null hypothesis, we can assume equal variances and use the p-value associated with "equal variances assumed"
  o Now suppose that I wish to test whether there is a significant difference in final pulse rates between the males and females who ran
    ▪ In this case, the participants in one group will be males and the other females
    ▪ Therefore the data won't be paired, and we should use a 2-sample t-test
  o Step 1: Set up the hypotheses
    ▪ H0: µM = µF – i.e., the mean final pulse rates in males and females are equal
    ▪ H1: µM ≠ µF – i.e., the mean final pulse rates in males and females are different
  o Step 2: Choose an appropriate test
    ▪ 2-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the final pulse rates of the male participants and the final pulse rates of the female participants who ran

Testing Proportions

• We can also perform tests on proportions, e.g., the percentage of smokers
• We set up hypotheses in the same way as we did for means
• A 1-proportions test will compare the sample proportion to a hypothesized proportion, similar to a 1-sample t-test
  o H0: p = p0 – i.e., there is no difference between the proportions
  o H1: p ≠ p0 (or > or <) – i.e., there is a difference between the proportions
• 1-Proportions test example – Pulse Data
  o It is claimed that more than 15% of the students in the cohort that participated in the pulse experiment smoke.
  o Using the pulse sample, we can test this claim.
  o Step 1: Set up the hypotheses
    ▪ H0: p = 0.15 – i.e., the proportion of people who smoke is equal to 15%
    ▪ H1: p > 0.15
  o Step 2: Choose an appropriate test
    ▪ 1-proportions test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportion of students who smoke is significantly greater than 15%
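
A one-sided 1-proportions test can be approximated in plain Python with a z-test. This is a sketch under stated assumptions: the counts (28 smokers out of 92 students) are illustrative and not given in the notes, the function name is hypothetical, and the normal approximation used here may differ from the exact method SPSS applies.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Test H0: p = p0 against H1: p > p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)              # upper-tail probability
    return p_hat, z, p_value


# Illustrative counts only (assumed, not taken from the notes).
p_hat, z, p_value = one_proportion_z_test(successes=28, n=92, p0=0.15)

print(round(p_hat, 3))    # 0.304
print(p_value < 0.001)    # True
```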


• Multiple Comparisons Tests
  o Analysis of Variance only considers whether or not there are differences between the means across the groups. It does not find WHERE those differences are (or how much difference there is)
  o Multiple comparisons tests test the difference between pairs of groups
  o Tukey's multiple comparisons tests set a family error rate of 0.05.
    ▪ In this case, the probability that we observe any significant differences when none exist is 0.05
  o Some tests set the individual error rate at 0.05.
    ▪ In this case, the probability of incorrectly finding differences somewhere will be a lot more than 0.05
  o Therefore, care needs to be taken when choosing a post-hoc procedure.
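
How quickly the family error rate grows when each comparison is tested at an individual rate of 0.05 can be shown with a short calculation. The sketch assumes the comparisons are independent, which pairwise comparisons between the same groups are not exactly, so the figures are a rough guide rather than exact values; the function name is illustrative.

```python
def family_error_rate(individual_alpha, n_comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - individual_alpha) ** n_comparisons


# 3 groups give 3 pairwise comparisons, 4 groups give 6, 5 groups give 10.
for k in (1, 3, 6, 10):
    print(k, round(family_error_rate(0.05, k), 3))
# 1 0.05
# 3 0.143
# 6 0.265
# 10 0.401
```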

  o Multiple Comparisons Tests Example – Pulse Data
    ▪ We used ANOVA to determine whether the changes in pulse rates differed between participants with different levels of activity; now we would like to see exactly which groups differed.
    ▪ We re-do a One-Way ANOVA and select "Post-Hoc", then select Tukey
    ▪ We will obtain the ANOVA table, as well as the following table.
    ▪ We can notice that there is a significant difference between Moderate and High levels of activity, but no significant differences between the other levels of activity.
      • The change in pulse rate for those with a Slight level of activity is not significantly different to the other groups.
    ▪ This is better understood with a multiple boxplot.
    ▪ Notice that SPSS gives 95% Confidence Intervals for the difference between pairs of groups.
      • If 0 lies in the confidence interval, then we would conclude that the means of the two groups are not significantly different
      • If both bounds are positive, or if both bounds are negative, then we would say that the groups have different means

• Kruskal-Wallis Test Example – Pulse Data
  o We can also repeat the activity level analysis using a non-parametric test
  o First we need to calculate the change in pulse rate: Change in pulse rate = Pulse2 – Pulse1
  o Step 1: Set up the hypotheses
    ▪ H0: median1 = median2 = median3
    ▪ H1: not all of the medians are equal
  o Step 2: Choose an appropriate test
    ▪ Kruskal-Wallis
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.002 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference in the changes in pulse rates between participants with different levels of activity.
  o In this case, we made the same conclusion, but the observed p-value was larger.

Chi-Square Tests

• We have looked at a test for a single categorical variable with two levels (proportions of smokers)
  o We performed hypothesis tests based on the proportions in each group
• If we wish to compare two or more groups, or have more than two levels in the categorical variable, then we need a more sophisticated test
• A chi-squared test compares two or more categorical variables, each with two or more categories.
  o This test compares the observed frequencies in each cell of the crosstab to what we would expect to see there if the variables were independent.
• We assume that we have an adequate sample for each cell of the crosstabulation
  o An observed frequency of at least 5, and an expected frequency of at least 5
  o If there is not an adequate sample, then we need to combine groups
• The hypotheses for this test are:
  o H0: The variables are independent of each other
  o H1: The variables are not independent of each other
• If we reject the null hypothesis then we can say that the variables are related, or the proportions differ between groups (depending on what you set out to find)
• Chi-square test example – Smoking
  o Suppose that we would like to test whether there are gender differences in whether a person smokes or not
  o Step 1: Set up the hypotheses
    ▪ H0: Gender and smokes are independent
    ▪ H1: Gender and smokes are not independent
  o Step 2: Choose an appropriate test
    ▪ Chi-square test
  o Step 3: Execute the test in SPSS and obtain a p-value
    ▪ Expected Count = the count if there was no association between the variables
    ▪ E.g., 62% of the participants are male, so if H0 was true, we should find 62% of the participants in each category to be male, i.e., we'd find that 62% of smokers were male. An expected count of 39.7 is 62% of the participants in that category.
  o Step 4: Make a conclusion
    ▪ P-value = 0.216 (0.05 level of significance) – we are using the Pearson chi-square p-value
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportions of males and females who smoke are not significantly different (the proportions of males and females across the groups, smokers and non-smokers, are not significantly different)
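
The expected counts and the Pearson chi-square statistic for a 2×2 crosstab can be computed by hand in Python. The observed counts below are an ASSUMPTION: the crosstab itself is not reproduced in the notes, so these figures were picked to be consistent with the 62% male share mentioned above and should not be read as the lecture's data. The function name is also hypothetical. The p-value formula used is specific to 1 degree of freedom, i.e., to a 2×2 table.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return expected, statistic, p_value


# ASSUMED counts: rows = male, female; columns = smoker, non-smoker.
observed = [[20, 37], [8, 27]]
expected, statistic, p_value = chi_square_2x2(observed)

print([[round(e, 1) for e in row] for row in expected])
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Each expected count is the row share multiplied by the column total, which is the "62% of the participants in each category" reasoning written as a formula.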

Regression

• Linear regression (Ordinary least squares/OLS)
  o Use linear regression when the outcome of interest is a continuous variable
• Logistic regression (Binary logistic)
  o Use logistic regression if the outcome variable is categorical with 2 possible outcomes (e.g., did the patients survive or not survive at the end of the period of time)
• Simple linear regression
  o The association between two continuous variables can be depicted graphically using a scatter diagram
  o The method of simple linear regression allows us to use an equation to represent the relationship between x and y
  o The equation of a straight line can be written as: y = α + βx
  o α is the expected value of y when x is zero
  o β (the regression coefficient) is the expected change in y as x increases by 1 unit (the slope)
  o A positive β indicates y increases as x increases; a negative β indicates y decreases as x increases
  o When β = 0, there is no association between y and x, because the expected value of y does not change as x changes – i.e., when the gradient is 0, there is no meaningful relationship between x and y: when x changes, y doesn't change
  o Three assumptions are made:
    1. There is a linear relationship between x and y
    2. The variability about the regression line (the line of best fit) is the same for all values of x, with constant standard deviation – i.e., the y-values are spread evenly around the line of best fit
    3. The distribution of y for any given x is normal
  o The null hypothesis is that there is no association between y and x, which is equivalent to assuming that the slope is 0.
  o Simple linear regression example – height and weight
    ▪ Step 1: Set up the hypotheses
      • H0: β = 0
      • H1: β ≠ 0 – i.e., the gradient is not 0
    ▪ Step 2: Choose an appropriate test
      • Linear regression
    ▪ Step 3: Execute the test in SPSS and obtain a p-value
    ▪ Step 4: Make a conclusion
      • P-value < 0.001 (0.05 level of significance)
      • Therefore we will reject H0 (that there is no association between height and weight)
    ▪ Step 5: State the conclusion in plain language
      • Therefore there is a relationship between weight and height:
        o weight = -91.147 + 90.008 × height
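
The least squares estimates of α and β, and a prediction from the fitted line, can be sketched in a few lines of Python. The function names are illustrative. The prediction reads the fitted equation as weight = -91.147 + 90.008 × height and ASSUMES height is measured in metres and weight in kilograms, which the notes do not state.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx                  # slope: expected change in y per unit of x
    alpha = mean_y - beta * mean_x    # intercept: expected y when x is zero
    return alpha, beta


def predict_weight(height, alpha=-91.147, beta=90.008):
    """Predicted weight from the fitted line (units assumed: metres, kilograms)."""
    return alpha + beta * height


# Points lying exactly on y = 1 + 2x recover alpha = 1 and beta = 2.
alpha, beta = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(alpha, beta)                       # 1.0 2.0

print(round(predict_weight(1.75), 1))    # 66.4
```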
