

Post on 24-Jun-2020


NTDT5602 Methods in Nutrition Research

Module 1 Nutritional Epidemiology

Nutritional Epidemiology Lecture 1 – Introduction & Study Design

Introduction

What is epidemiology?

• “… the study of the distribution and determinants of health-related states or events in specified populations and the application of this study in the prevention and control of health problems”

What does this mean?

• The magnitude of a health problem
   o Frequency – how often does a health problem occur?
• The cause of a health problem
   o Determinants – the associations between lifestyle factors and disease. E.g., what is the cause of bowel cancer?
• The evaluation of a treatment or prevention campaign
   o Clinical or community application – e.g., for a drug used in medicine, we test whether it works by running an RCT; for a prevention campaign, we run a cluster RCT in which some centres (e.g., schools) receive the intervention and others do not, and then measure the difference between the centres

Overall aim of these lectures

• To be able to critically appraise the nutrition and dietetic scientific literature

Course overview

• Study types and levels of evidence – looking at study designs
• Measures of frequency – using the right measures of frequency
• Measures of association
• Confounding
• Selection bias – what can happen if you do not select the study population using the best methods
• Measurement error – i.e., bias in measuring the outcome factors
• Causality in nutrition-related disease – how we reach a final conclusion after looking at all the paths/papers, and how we conclude that there is causality between diet and disease, e.g., diet with less fibre → bowel cancer in 30 yrs

Most research questions fall into 3 types:

1) How common is the nutritional problem? Or the magnitude/frequency of the nutritional problem?
   o = Observational descriptive study, usually cross-sectional
      ▪ How common is the nutritional problem? FREQUENCY
         • Select study population → Representative sample of the population → Measure factor of interest
   o E.g., National Health Surveys
   o E.g., How many people in Australia are eating less than 2 serves of fruit? – You can’t go out and ask 20 million people how much fruit they’re eating; therefore you have to select a representative sample.
   o How many adults in Australia are obese? Australians 18 years and over → Representative sample (taking into account the demographic profile of Australia) → Get weight and height and then BMI; HOW MANY ≥30?
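The obesity question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `bmi` and `obesity_prevalence` and the five measurements are hypothetical, and a real national survey would also weight the sample to match the demographic profile of the population.

```python
def bmi(weight_kg, height_m):
    """Body mass index = weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def obesity_prevalence(sample, cutoff=30.0):
    """Proportion of people in the sample whose BMI is at or above the cutoff."""
    obese = sum(1 for weight_kg, height_m in sample if bmi(weight_kg, height_m) >= cutoff)
    return obese / len(sample)


# Hypothetical (weight in kg, height in m) pairs, for illustration only.
sample = [(95.0, 1.70), (62.0, 1.65), (110.0, 1.80), (70.0, 1.75), (58.0, 1.60)]
print(f"Prevalence of obesity in the sample: {obesity_prevalence(sample):.0%}")
```

The prevalence is a simple proportion: the number of sampled people with BMI ≥30 divided by the number of people sampled.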

2) What nutritional factor caused or prevented the disease?
   o = Observational cohort study, case-control study, cross-sectional analytical
   o Observational studies for causal relationships: (Is the study factor causing the outcome factor or not?)
      ▪ Looking at causal relationships from observational analytical studies
      ▪ Study factor (exposure) ––???––> Outcome factor
      ▪ High saturated fat diet ––???––> Ischaemic Heart Disease

   o Cohort studies (analytical)
      ▪ Exposure to the study factor is determined by the subjects – (i.e., cohort studies are observational, unlike RCTs in which you assign people. In cohort studies the subjects determine their own exposure and the investigator has nothing to do with their choices; they just monitor)
      ▪ Researchers measure the extent of the exposure – (usually using FFQs, i.e., food frequency questionnaires)
         • Researchers rank the people based on exposure, e.g., tertiles/quintiles from lowest intake → highest intake
      ▪ The outcome factor is measured LATER – e.g., see how many new cases of disease occur 5-20 yrs later
      ▪ Determine the study factor → (1) Group exposed to study factor (2) Group not exposed → Measure outcome factor in exposed versus unexposed group (continually monitored every 5 years until point of analysis; then classify the people)
         • E.g., (1) Group exposed (smokers) (2) Group not exposed (non-smokers) → measure outcome factor (number of people who got lung cancer) in each group
         • E.g., the EPIC study (n=435000)
            Measure dietary fibre intake, with 6-7y follow-up → (1) Very low intake (2) Low intake (3) Medium intake (4) High intake → Measure the occurrence of gastric cancer in different intake levels

Nutritional Epidemiology Lecture 2.2 – Measures of Association

Key measures of association – (Measures to decide whether a particular dietary pattern is associated with decreased/increased incidence of disease)

• Relative Risk (RR)
• Odds Ratio (OR)
• Attributable risk percent (AR%)
• Population attributable risk (PAR)
• These are measures of association in cohort studies and case-control studies
• → allow us to determine diet-disease relationships

What Associations?

• How big, or how strong, is the association between the study factor (exposure) and the outcome factor?
   – So we can apply a numerical value to the risk that a particular diet will lead to a disease

Incidence

• Cumulative Incidence = number of people experiencing a NEW event during a time period / number of susceptible people at the beginning of the time period
• Incidence is a measure of events (event rate)
• Incidence is a measure of risk

Relative Risk (Risk Ratio) – for cohort studies and RCTs

• RR = incidence of the outcome in the exposed group / incidence of the outcome in the control (unexposed) group
   o The RR describes how much more (or less) likely people exposed to the nutritional factor are to get the disease than people who are not exposed
• Researchers measure participants' lifestyle behaviours, rank them into groups according to their level of exposure, and then measure how many in each group get a particular disease.
• The lowest quintile = the control group
• E.g., if the incidence in the exposed group is 20% and the incidence in the control group is 10%, the RR = 2.0
   o This means that the exposed are twice as likely to have the event

Cohort study of effect of food x consumption on metabolic syndrome

             Present (have the disease)   Absent (don’t have the disease)   Total
Exposed      2,000                        28,000                            30,000
Unexposed    500                          14,500                            15,000
Total        2,500                        43,500                            45,000

• Incidence in exposed group: 2000/30000 = 0.067
• Incidence in unexposed group: 500/15000 = 0.033
• Relative risk = 0.067/0.033 = 2.0 (dividing the rounded values gives 2.03; the unrounded ratio is exactly 2.0)
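The arithmetic for this table can be checked with a short Python sketch. The function names `cumulative_incidence` and `relative_risk` are illustrative choices, not part of the lecture; the counts are the ones in the table.

```python
def cumulative_incidence(new_cases, at_risk):
    """New cases during the period divided by people at risk at the start."""
    return new_cases / at_risk


def relative_risk(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    incidence_exposed = cumulative_incidence(cases_exposed, total_exposed)
    incidence_unexposed = cumulative_incidence(cases_unexposed, total_unexposed)
    return incidence_exposed / incidence_unexposed


# Counts from the food x / metabolic syndrome table.
rr = relative_risk(2000, 30000, 500, 15000)
print(f"Incidence (exposed):   {cumulative_incidence(2000, 30000):.3f}")
print(f"Incidence (unexposed): {cumulative_incidence(500, 15000):.3f}")
print(f"Relative risk:         {rr:.2f}")
```

Working from the raw counts rather than the rounded incidences avoids small rounding differences in the ratio.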

Risk difference

• How many outcomes are due to the exposure?
• Risk in exposed group minus risk in the unexposed group
• Risk Difference = Iexposed – Iunexposed (where I is the incidence)
• Risk Difference = 0.0667 – 0.0333 = 0.033 = 33 per 1000
• This gives you the DIFFERENCE in risk between groups
• Normally we see “Relative Risk” in papers
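A minimal Python sketch of the risk difference, using the counts from the food x cohort table. The function name `risk_difference` is an illustrative choice; expressing the result per 1000 people is simply a rescaling of the same number.

```python
def risk_difference(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Incidence in the exposed group minus incidence in the unexposed group."""
    return cases_exposed / total_exposed - cases_unexposed / total_unexposed


# Counts from the food x / metabolic syndrome cohort table.
rd = risk_difference(2000, 30000, 500, 15000)
print(f"Risk difference: {rd:.3f} ({rd * 1000:.0f} per 1000)")
```

Unlike the relative risk, the risk difference is on the absolute scale, so it answers how many extra cases occur per 1000 exposed people.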

Nutritional Epidemiology Lecture 3 – Confounding

Key Points

• Understand the definition of confounding variable
• Understand the difference between a crude and an adjusted estimate of association
• Understand in general terms the main ways of controlling confounding (restriction, randomization, matching, stratification and multivariate analysis)
• Understand the difference between confounding and effect modification

Definition of confounding

• Confounding occurs when a measure of association is biased because of the association of the study factor with other factors that influence the outcome factor.
   o E.g., you may get a positive association between the amount of vegetable consumption and diabetes. This association may be real, or other factors that influence the development of diabetes, e.g., BMI, may be confounding the observed RR.
   o I.e., confounding occurs when there is bias in the measure of association because another variable is changing the observed risk associated with the study/exposure factor.

An Example – “Is high alcohol intake associated with lung cancer?”

                                  Presence of Lung Cancer
                                  YES      NO       TOTAL
Exposed to High Alcohol   YES     615      24385    25000
                          NO      210      24790    25000
                          TOTAL   825      49175    50000

• Relative Risk = (615/25000) / (210/25000) = 0.0246 / 0.0084 = 2.9
   o Incidence of lung cancer in those exposed to high alcohol = 615/25000
   o Incidence of lung cancer in those who aren’t exposed to high alcohol (control) = 210/25000
   o I.e., study factor: high alcohol ----> outcome factor: lung cancer
      ▪ However, this is the crude RR, and it may be biased due to the influence of confounders

• Confounding
   o Study factor: high alcohol → Confounding variable: smoking → Outcome factor: lung cancer
      ▪ Somewhere along the way, there may be a factor that’s confounding the relationship between high alcohol and lung cancer, e.g., smoking.
      ▪ The observed strong association between alcohol and lung cancer could be due to smoking being a confounder.
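To make the crude versus adjusted distinction concrete, here is a hedged Python sketch. The crude counts are the ones in the alcohol and lung cancer table, but the split by smoking status is entirely HYPOTHETICAL: it was invented so that the two strata add up to the crude table and is not data from the lecture. The Mantel-Haenszel pooled risk ratio is used here as one common way of adjusting by stratification; the lecture names stratification but does not specify a method.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio.

    Each stratum is a tuple (cases_exp, n_exp, cases_unexp, n_unexp).
    """
    numerator = 0.0
    denominator = 0.0
    for cases_exp, n_exp, cases_unexp, n_unexp in strata:
        total = n_exp + n_unexp
        numerator += cases_exp * n_unexp / total
        denominator += cases_unexp * n_exp / total
    return numerator / denominator


# Crude table from the lecture example (high alcohol vs lung cancer).
crude_rr = risk_ratio(615, 25000, 210, 25000)

# HYPOTHETICAL strata, invented for illustration; they sum to the crude table.
strata = [
    (612, 15000, 204, 5000),  # smokers: cases/total in high alcohol, cases/total in low alcohol
    (3, 10000, 6, 20000),     # non-smokers
]
adjusted_rr = mantel_haenszel_rr(strata)

print(f"Crude RR: {crude_rr:.2f}")
print(f"Smoking-adjusted RR (Mantel-Haenszel): {adjusted_rr:.2f}")
```

In this made-up split, heavy drinkers are much more likely to be smokers, and within each smoking stratum the risk is the same for both alcohol groups. The crude association therefore disappears once smoking is taken into account, which is the pattern a confounder produces.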

Nutritional Epidemiology Revision

Study Design

• What is the research question? – There are 3 types of questions
• What is/are the study factors and outcome factors?
• What is the study type?
   o RCT
   o Cohort
   o Case control
   o Cross-sectional descriptive
   o Cross-sectional analytical
• What level of evidence is offered by the study type?
   o Is the level of evidence offered high or low when studying causation?
• What study design is feasible to answer the research question?
   o What study type would give best evidence vs what is most feasible?
      ▪ Sometimes there's a trade-off between feasibility and the highest level of evidence (because the best design may be impractical)
      ▪ E.g., the best way to study the relationship between dietary fibre and bowel cancer is an RCT to avoid subject bias. But it would take 20 years for the relationship between dietary fibre and bowel cancer to become apparent, and you wouldn't be able to ethically do an RCT for 20 years

Study Type – the study type you choose depends on your question (of which there are 3 types)

• How common is the nutritional problem?
   o Use a cross-sectional study, e.g., survey
   o → Prevalence (how common)
   o New cases = “incidence”
• What nutritional factor caused or prevented the disease?
   o Cohort study
   o Case control study
      ▪ Case control = opposite of a cohort study
      ▪ You already have your cases, and then you recruit a representative group of controls, then use recall to examine previous diet and estimate the OR for the association between the disease and a particular dietary exposure
   o Cross-sectional analytical (but you cannot infer causation from a cross-sectional analytical study)
• Does dietary intervention prevent or cure the disease?
   o RCT
   o E.g., does a particular drug lower blood pressure? Do n-3 fatty acids cure rheumatoid arthritis?
   o Systematic literature review with meta-analysis of RCTs = highest level of evidence

Epidemiologic Study Designs

[Figure: epidemiologic study designs (not reproduced in this transcript)]

Module 4 Statistics for Nutrition Practice

Outline

• Descriptive Statistics (Lecture 1)
   o Graphical Displays of Data
   o Measuring Centre and Spread
   o Tabular Data
• Inferential Statistics (Lectures 2 and 3)
   o How to set up a Hypothesis Test
   o Types of Hypothesis Tests

Descriptive and Inferential Statistics

• Descriptive Statistics – describe the sample
• Inferential Statistics – use the sample to test theories about the population

Statistics Lecture 1 – Descriptive Statistics

Learning Objectives

• Summarise statistical data using:
   o Graphs and tables
   o Appropriate measures of location including means and medians
   o Appropriate measures of variability including standard deviations and interquartile ranges
• Interpret significance tests and confidence intervals
• Use IBM SPSS statistical software to present data

Descriptive Statistics

• The purpose of descriptive statistics is to become familiar with the data that you have collected
   o What information can you get from the data?
• What are we looking for?
   o What is a “typical” response?
   o How different are responses from different individuals?
   o Do responses differ between groups of individuals?
   o Is a measurement of one aspect of an individual dependent on another aspect?

Types of Data

• The way that we look at data will depend on the type of data that we have

Type          Subtype      Description                 Example                             Measure of centre
Categorical   Nominal      no order                    Nationality, Gender                 Mode
Categorical   Ordinal      order                       Likert Scale, level of education    Mode or Median
Numeric       Discrete     takes whole number values   number of birds in a tree           Mean, Median or Mode
Numeric       Continuous                               height                              Mean or Median

If your data is symmetrical, you use the mean. If it is skewed, you use the median. Mode is used as a last resort.

Measures of Central Tendency

• Where is the ‘middle’ of the data? – when you need to summarise the data
   o We could also think of it as what we would expect to measure for a typical respondent.
• Three different measures
   o Mean – add the observations and then divide by the number of observations
      ▪ The mean can be quite sensitive to large values and skewness – avoid using the mean to describe skewed data (asymmetry in data)
      ▪ Skewness
         • When the mean is dragged out to the right or left (may be due to extreme values)
         • With symmetric data, the mean and the median are very similar, and often the same. However, the mean has many statistical properties that we can use, which the median doesn’t have.
      ▪ Skewness from boxplots
         • If most of the observations are concentrated on the low end of the scale, the distribution is skewed right (positively skewed); and vice versa
         • If a distribution is symmetric, the observations will be evenly split at the median, as shown in the middle figure.
         [Figure: three boxplots, labelled Positively Skewed, Symmetric, Negatively Skewed]
   o Median – is the middle observation (n = odd) or the average of the two middle observations (n = even)
      ▪ The median is useful when describing skewed data
      ▪ Boxplot revisited [figure not reproduced]
   o Mode – is the most frequently occurring observation (merely a descriptive tool)
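The three measures can be compared with Python's standard `statistics` module. The two small data sets below are hypothetical and chosen only to show the behaviour described above: one is symmetric, the other has a single extreme value on the high end.

```python
import statistics

# Hypothetical data sets, for illustration only.
symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 30]  # one extreme value on the high end

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)
    print(f"{name}: mean={mean:.2f}, median={median}, mode={mode}")
```

For the symmetric data the mean and median coincide, while for the right-skewed data the extreme value pulls the mean above the median, which is why the median is preferred for skewed data.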

Statistics Lectures 2 & 3 – Inferential Statistics

Where does inference fit in?

• So far we have looked at how to describe data, but we haven’t been able to test our observations
• Inference provides the tools to test whether our observations can be applied to the population, as opposed to just seeing the results by chance
• From the sample, we can make an inference about the population, including the mean and variation.

The Structure of a Hypothesis Test

• Step 1: Set up Hypotheses
   o Null Hypothesis: H0
   o Alternative Hypothesis: H1
• Step 2: Choose an appropriate test
   o Depending on the characteristics of your dataset, there is an appropriate test to apply to test your hypothesis
• Step 3: Execute the test in SPSS and obtain a p-value
   o P-value = probability that you would observe a difference between the group means at least as large as the one in your sample when the null hypothesis is actually true (i.e., how likely such a result would be by chance alone if there were no real effect)
   o I.e., the chance of seeing the sample results purely by chance when the null hypothesis is true
• Step 4: Make a conclusion
   o Either reject H0 if the p-value is small enough, or do not reject H0
   o You cannot accept a hypothesis – rejecting H0 doesn’t automatically mean that the alternative hypothesis is true.
• Step 5: State the conclusion in plain language
   o So people who don’t understand statistics can still understand your conclusions.


Some Concepts

• The null hypothesis can usually be expressed as an equality between means from each group
   o If there is only one group, then the null hypothesis will be that the mean equals a particular value (a hypothesized value – just a nominated value)
• The alternative hypothesis can either be one sided or two sided
   o One sided – we reject the null hypothesis if the sample mean for one group is sufficiently larger than that of the other group (or sufficiently larger than the hypothesized value)
   o Two sided – we reject the null hypothesis if the sample mean for one group is sufficiently different to the other group (larger or smaller – no nominated direction)
   o Which to choose depends on what you want to know
• We make a decision based on a p-value
   o The p-value is the probability that we would observe the unequal sample means or something more extreme by chance when the null hypothesis is true.
• We compare the p-value to a level of significance
   o The level of significance is the level of risk that we are willing to accept that we will incorrectly reject a true null hypothesis
   o By default, we will use 5% = 0.05 significance – that is, we are willing to accept a 5% chance that we will incorrectly reject the null hypothesis
      ▪ E.g., we are willing to accept a 5% chance that we incorrectly rejected the null hypothesis/made an incorrect conclusion that there is inequality between the means, when in fact there isn’t
   o In some situations we may change the level of significance
   o If p-value < 0.05 we reject H0
   o If p-value ≥ 0.05 we do not reject H0
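The decision rule can be written as a tiny Python function. This is a sketch only: the name `decide` and the wording of the returned strings are illustrative, and the p-value itself would come from the test output (e.g., SPSS).

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only if the p-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.03))              # below the default 5% level
print(decide(0.20))              # above the default 5% level
print(decide(0.03, alpha=0.01))  # a stricter level of significance changes the decision
```

Note that a p-value exactly equal to the level of significance falls on the "do not reject" side, matching the rule stated above.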

Types of Hypothesis Tests

• The choice of test depends on what you're trying to do:

What You're Trying to Do | Hypothesis Test
Comparing the mean of a sample to a value | 1-sample test
Comparing the means of two samples to each other | 2-sample test
Comparing the means of more than two samples to each other | ANOVA
Comparing proportions | 1- and 2-sample proportion tests
Finding relationships between categorical variables | Chi-Square test

• It also depends on the type of data that you have
  o If you can assume that your data are normally distributed, use a parametric test (t, ANOVA)
  o If you cannot assume that your data are normally distributed, use a non-parametric test (Wilcoxon, Mann-Whitney, Kruskal-Wallis)

Which Test?

• Paired data = when the result on the 2nd measurement is dependent on the 1st measurement
  o E.g., individuals measured before and after an intervention


Types of Hypothesis Tests

• Recall that the appropriate graphical display of data depends on the type of data that you have:

Type of data | Categorical | Numeric
Categorical | Multiple Bar Charts | Multiple Boxplot
Numeric | Multiple Boxplot | Scatterplot

• It is the same for hypothesis testing

Type of data | Categorical | Numeric
Categorical | Chi-Squared Test | 1-sample t, paired t, 2-sample t, ANOVA
Numeric | Regression | Regression

1-Sample Tests

• I have one column of data, and I want to compare the population mean to a particular value, e.g., a known value from a well-defined population
• 1-sample t-tests assume that the data are normally distributed
  o I.e., if you have clearly skewed data, it is not appropriate to use a 1-sample t-test

H0: µ = value
H1: µ ≠ value (two sided alternative)
    µ < value (one sided alternative)
    µ > value (one sided alternative)

• 1-sample t-test example – Pulse Data
  o In the lab sessions, we considered a data set based on the pulses of people who either ran for one minute, or rested for one minute.
  o We can use inferential techniques to test some of the theories that we may have made from our exploration.
  o Suppose that we would like to test whether our group has a starting pulse rate that is different from the typical resting pulse rate of 75 bpm
  o Step 1: Set up the hypotheses
    ▪ H0: µ = 75
    ▪ H1: µ ≠ 75 (i.e., the mean is different to 75)
  o Step 2: Choose an appropriate test
    ▪ 1-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.067, which is greater than the 0.05 level of significance
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the initial pulse rate measurements do not differ significantly from the typical resting pulse rate of 75 bpm (i.e., 73 is not statistically different to 75)
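For readers who want to see what SPSS computes in Step 3, the sketch below works out the 1-sample t statistic by hand using only the Python standard library. The pulse values are hypothetical (they are not the lab data set), the function name `one_sample_t` is illustrative, and the decision uses a critical value from standard t tables rather than a p-value.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical resting pulses in bpm; NOT the lab data set.
pulses = [72, 74, 76, 70, 73]
t_stat, df = one_sample_t(pulses, mu0=75)

# Two-sided 5% critical value for df = 4, taken from standard t tables.
T_CRIT_DF4 = 2.776
reject = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

Comparing |t| with the critical value is equivalent to comparing the two sided p-value with 0.05; SPSS reports the p-value directly.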


2-sample tests

• I have two samples and I want to compare them to each other
  o If each individual has one measurement in each sample, then my data are paired by the individual; use a paired t-test – also used for individuals who were matched according to certain characteristics thought to influence outcomes
  o If my data are not paired, use a 2-sample t-test
• Both tests assume that the data are normally distributed – you will get misleading results if you have skewed data
• Paired t-test example – Pulse Data
  o Suppose that I want to see whether the first and second pulse rates for those participants who ran were different or not
  o Step 1: Set up the hypotheses
    ▪ H0: µ1 = µ2 – the mean obtained from the first pulse rate measurements is the same as the mean obtained from the second pulse rate measurements
    ▪ H1: µ1 ≠ µ2 – the means are not the same
  o Step 2: Choose an appropriate test
    ▪ Paired t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance) – the probability of observing a difference of this magnitude when in truth there isn't a difference is very small
    ▪ Therefore we will reject H0 – we reject the notion of there being no difference
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the initial pulse rate and the final pulse rate for those participants who ran
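A paired t-test is a 1-sample t-test applied to the within-person differences, tested against 0. The sketch below illustrates this with hypothetical before/after pulses for four runners; the data and the name `paired_t` are assumptions made for illustration, not values from the lab.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, df) for H0: mean of (second - first) == 0."""
    if len(first) != len(second):
        raise ValueError("paired data need one value per person in each sample")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1


# Hypothetical pulses (bpm) for the same four people before and after running.
pulse1 = [70, 72, 68, 74]
pulse2 = [90, 95, 85, 98]
t_stat, df = paired_t(pulse1, pulse2)
print(f"t = {t_stat:.2f}, df = {df}")
```

Because each person acts as their own control, only the spread of the differences enters the standard error, which is why pairing can detect changes that a 2-sample test on the same numbers might miss.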


• 2-Sample t-test example – Pulse Data
  o The two sample t-test determines whether the means of two unrelated populations are the same or not
  o In general, the hypotheses are:
    ▪ H0: µ1 = µ2 – i.e., the mean of group one is equal to the mean of group two
    ▪ H1: µ1 ≠ µ2 – i.e., the difference between the means of the groups is not 0
  o We can also decide whether or not we should use a “pooled variance”, i.e., assume that the variances of the two groups are equal and obtain a more powerful test
    ▪ To decide this, we need to look at a 2-sample variances test
    ▪ SPSS automatically conducts a test for equal variances when it does a 2-sample t-test
      • This test is called Levene's test
    ▪ Levene's test has hypotheses:
      • H0: σ1² = σ2² – i.e., there is no difference in the variances
      • H1: σ1² ≠ σ2² – i.e., there is a difference in the variances
    ▪ If we reject the null hypothesis, we cannot assume equal variances and need to use the p-value associated with “equal variances not assumed”
    ▪ If we do not reject the null hypothesis, we can assume equal variances and use the p-value associated with “equal variances assumed”
  o Now suppose that I wish to test whether there is a significant difference between the final pulse rates of the males and females who ran
    ▪ In this case, the participants in one group will be males and the other females
    ▪ Therefore the data won't be paired, and we should use a 2-sample t-test
  o Step 1: Set up the hypotheses
    ▪ H0: µM = µF – i.e., the mean final pulse rates in males and females are equal
    ▪ H1: µM ≠ µF – i.e., the mean final pulse rates in males and females are different
  o Step 2: Choose an appropriate test
    ▪ 2-sample t-test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference between the final pulse rates of the male participants and the final pulse rates of the female participants who ran
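The difference between the “equal variances assumed” and “equal variances not assumed” rows can be seen by computing both statistics by hand. The sketch below assumes the second row corresponds to the Welch (unequal variance) t-test with Satterthwaite degrees of freedom; the data and the name `two_sample_t` are hypothetical.

```python
import math
import statistics


def two_sample_t(a, b, equal_var=True):
    """Return (t statistic, df) for H0: mean(a) == mean(b)."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    if equal_var:
        # Pooled variance: weighted average of the two sample variances.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Welch: each group keeps its own variance; df is approximated.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return (m1 - m2) / se, df


# Hypothetical final pulses (bpm); NOT the lab data set.
males = [90, 94, 98]
females = [100, 104, 108, 112]
t_pooled, df_pooled = two_sample_t(males, females, equal_var=True)
t_welch, df_welch = two_sample_t(males, females, equal_var=False)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

The pooled version has more degrees of freedom (hence more power) when the equal-variance assumption holds, which is why Levene's test is checked first.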


Testing Proportions

• We can also perform tests on proportions, e.g., the percentage of smokers
• We set up hypotheses in the same way as we did for means
• A 1-proportions test will compare the sample proportion to a hypothesized proportion, similar to a 1-sample t-test
  o H0: p = p0 – i.e., there is no difference between the proportions
  o H1: p ≠ p0 (or p > p0, or p < p0) – i.e., there is a difference between the proportions
• 1-Proportions test example – Pulse Data
  o It is claimed that out of the cohort that participated in the pulse experiment, more than 15% of students smoke.
  o Using the pulse sample, we can test this claim.
  o Step 1: Set up the hypotheses
    ▪ H0: p = 0.15 – i.e., the proportion of students who smoke is equal to 15%
    ▪ H1: p > 0.15
  o Step 2: Choose an appropriate test
    ▪ 1-proportions test
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value < 0.001 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportion of students who smoke is significantly greater than 15%
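One common way to carry out a 1-proportions test is the normal approximation (z-test) sketched below. This is an assumption about method: SPSS may instead report an exact binomial p-value, and the counts used here (25 smokers out of 100) are hypothetical rather than the pulse data.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """One sided test of H0: p == p0 against H1: p > p0 (normal approximation)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # upper tail only, because H1 is p > p0
    return z, p_value


# Hypothetical sample: 25 smokers among 100 students.
z, p_value = one_proportion_z(successes=25, n=100, p0=0.15)
print(f"z = {z:.3f}, one sided p-value = {p_value:.4f}")
```

For a two sided alternative the p-value would be doubled (using |z|); the approximation is only reasonable when n·p0 and n·(1 − p0) are both comfortably large.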


• Multiple Comparisons Tests
  o Analysis of Variance only considers whether or not there are differences between the means across the groups. It does not find WHERE those differences are (or how much difference there is)
  o Multiple comparisons tests test the difference between pairs of groups
  o Tukey's multiple comparisons tests set a family error rate of 0.05.
    ▪ In this case, the probability that we observe any significant differences when none exist is 0.05
  o Some tests set the individual error rate at 0.05.
    ▪ In this case, the probability of incorrectly finding differences somewhere will be a lot more than 0.05
  o Therefore, care needs to be taken when choosing a post-hoc procedure.
  o Multiple Comparisons Tests Example – Pulse Data
    ▪ We used ANOVA to determine whether the changes in pulse rates differed between participants with different levels of activity; now we would like to see exactly which groups differed.
    ▪ We re-do a One-Way ANOVA and select “Post-Hoc”, then select Tukey
    ▪ We will obtain the ANOVA table, as well as the following table.
    ▪ We can notice that there is a significant difference between Moderate and High levels of activity, but no significant differences between the other levels of activity.
      • The change in pulse rate for those with a Slight level of activity is not significantly different to the other groups.
    ▪ This is better understood with a multiple boxplot.
    ▪ Notice that SPSS gives 95% Confidence Intervals for the difference between each pair of groups.
      • If 0 lies in the confidence interval, then we would conclude that the means of the two groups are not significantly different
      • If both bounds are positive, or if both bounds are negative, then we would say that the groups have different means
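To see why an individual error rate of 0.05 inflates the chance of a false finding somewhere, the sketch below computes the family error rate for all pairwise comparisons. It makes the simplifying assumption that the comparisons are independent, which pairwise comparisons sharing a group are not, so treat the numbers as a rough illustration only. The function name is illustrative.

```python
from math import comb


def family_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons, approximate family error rate).

    Assumes every comparison is tested at level alpha and that the
    comparisons are independent (a simplification).
    """
    n_pairs = comb(n_groups, 2)
    return n_pairs, 1 - (1 - alpha) ** n_pairs


for k in (3, 5):
    pairs, rate = family_error_rate(k)
    print(f"{k} groups: {pairs} comparisons, family error rate about {rate:.3f}")
```

With three activity levels there are three pairwise comparisons, and the chance of at least one false positive is already roughly three times the nominal 0.05; Tukey's procedure adjusts for this so that the family rate stays at 0.05.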


• Kruskal-Wallis Test Example – Pulse Data
  o We can also repeat the activity level analysis using a non-parametric test
  o First we need to calculate the change in pulse rate: Change in pulse rate = Pulse2 – Pulse1
  o Step 1: Set up the hypotheses
    ▪ H0: median1 = median2 = median3
    ▪ H1: not all of the medians are equal
  o Step 2: Choose an appropriate test
    ▪ Kruskal-Wallis
  o Step 3: Execute the test in SPSS and obtain a p-value
  o Step 4: Make a conclusion
    ▪ P-value = 0.002 (0.05 level of significance)
    ▪ Therefore we will reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore there is a significant difference in the changes in pulse rates between participants with different levels of activity.
  o In this case, we made the same conclusion, but the observed p-value was larger.
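The Kruskal-Wallis test replaces the observations with their ranks in the pooled sample and compares the rank sums of the groups. The sketch below computes the H statistic without a tie correction and, for the three-group case only, the chi-square approximation to the p-value (with 2 degrees of freedom the upper tail is exp(−H/2)). The data and function names are hypothetical, and software such as SPSS may apply a tie correction that this sketch omits.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using mid-ranks for ties (no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_total = len(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Chi-square upper tail for df = 2 (valid for exactly three groups)."""
    return math.exp(-h / 2)


# Hypothetical changes in pulse rate for three activity levels.
slight, moderate, high = [2, 5, 8], [10, 12, 15], [18, 20, 25]
h = kruskal_wallis_h([slight, moderate, high])
p_value = p_value_three_groups(h)
print(f"H = {h:.2f}, approximate p-value = {p_value:.4f}")
```

The chi-square approximation is rough for groups this small; the example is sized for readability, not for realistic inference.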


Chi-Square Tests

• We have looked at a test for a single categorical variable with two levels (proportions of smokers)
  o We performed hypothesis tests based on the proportions in each group
• If we wish to compare two or more groups, or have more than two levels in the categorical variable, then we need a more sophisticated test
• A chi-squared test examines the relationship between two categorical variables, each with two or more categories.
  o This test compares the observed frequencies in each cell of the crosstab to what we would expect to see there if the variables were independent.
• We assume that we have an adequate sample for each cell of the crosstabulation
  o An observed frequency of at least 5, and an expected frequency of at least 5
  o If there is not an adequate sample, then we need to combine groups
• The hypotheses for this test are:
  o H0: The variables are independent of each other
  o H1: The variables are not independent of each other
• If we reject the null hypothesis then we can say that the variables are related, or the proportions differ between groups (depending on what you set out to find)
• Chi-square test example – Smoking
  o Suppose that we would like to test whether there are gender differences in whether a person smokes or not
  o Step 1: Set up the hypotheses
    ▪ H0: Gender and smokes are independent
    ▪ H1: Gender and smokes are not independent
  o Step 2: Choose an appropriate test
    ▪ Chi-square test
  o Step 3: Execute the test in SPSS and obtain a p-value

    ▪ Expected Count = the count if there was no association between the variables
    ▪ E.g., 62% of the participants are male, so if H0 was true, we should find 62% of the participants to be male in each category, i.e., we'd find 62% of smokers were male. For a category containing 64 participants, 62% of 64 = 39.7 participants

  o Step 4: Make a conclusion
    ▪ P-value = 0.216 (0.05 level of significance) – We are using the Pearson chi-square p-value
    ▪ Therefore we will not reject H0
  o Step 5: State the conclusion in plain language
    ▪ Therefore the proportions of males and females who smoke are not significantly different (the proportions of males and females across the groups, smokers and non-smokers, are not significantly different)
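The expected counts and the Pearson chi-square statistic for a 2 × 2 crosstab can be reproduced by hand, as in the sketch below. The cell counts are illustrative and are not given in the notes; the function name is also an assumption. For a 2 × 2 table the test has 1 degree of freedom, for which the p-value can be obtained exactly from the complementary error function.

```python
import math


def chi_square_2x2(table):
    """Return (expected counts, Pearson chi-square, p-value) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count under independence = row total * column total / n.
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, df = 1
    return expected, chi2, p_value


# Illustrative counts: rows = male, female; columns = smoker, non-smoker.
counts = [[20, 37], [8, 27]]
expected, chi2, p_value = chi_square_2x2(counts)
print(f"expected male non-smokers = {expected[0][1]:.1f}")
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
print("all expected counts >= 5:", all(e >= 5 for row in expected for e in row))
```

This sketch applies no continuity correction, so it corresponds to the Pearson chi-square row of the output rather than the continuity-corrected one.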


Regression

• Linear regression (Ordinary least squares/OLS)
  o Use linear regression when the outcome of interest is a continuous variable
• Logistic regression (Binary logistic)
  o Use logistic regression if the outcome variable is categorical with 2 possible outcomes (e.g., did the patients survive or not survive at the end of the period of time)
• Simple linear regression
  o The association between two continuous variables can be depicted graphically using a scatter diagram
  o The method of simple linear regression allows us to use an equation to represent the relationship between x and y
  o The equation of a straight line can be written as: y = α + βx
  o α is the expected value of y when x is zero
  o β (the regression coefficient) is the expected change in y as x increases by 1 unit (the slope)
  o A positive β indicates y increases as x increases; a negative β indicates y decreases as x increases
  o When β = 0, there is no association between y and x, because the expected value of y does not change as x changes – i.e., when the gradient is 0, there is no meaningful relationship between x and y – when x changes, y doesn't change
  o Three assumptions are made:
    1. There is a linear relationship between x and y
    2. The variability about the regression line (the line of best fit) is the same for all values of x, with constant standard deviation – i.e., the spread of data is even; the y-values are spread evenly around the line of best fit
    3. The distribution of y for any given x is normal
  o The null hypothesis is that there is no association between y and x, which is equivalent to assuming that the slope is 0.

  o Simple linear regression example – height and weight
    ▪ Step 1: Set up the hypotheses
      • H0: β = 0
      • H1: β ≠ 0 – i.e., the gradient is not 0
    ▪ Step 2: Choose an appropriate test
      • linear regression
    ▪ Step 3: Execute the test in SPSS and obtain a p-value
    ▪ Step 4: Make a conclusion
      • P-value < 0.001 (0.05 level of significance)
      • Therefore we will reject H0 (that there is no association between height and weight)
    ▪ Step 5: State the conclusion in plain language
      • Therefore there is a relationship between weight and height:
        o weight = -91.147 + 90.008 × height
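The intercept α and slope β of the line of best fit come from the least squares formulas β = Sxy / Sxx and α = ȳ − β·x̄. The sketch below applies them to a small hypothetical height and weight sample (heights in cm, weights in kg); the data and the name `fit_line` are assumptions and will not reproduce the coefficients in the equation above.

```python
import statistics


def fit_line(x, y):
    """Least squares estimates (alpha, beta) for the line y = alpha + beta * x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx              # expected change in y per 1 unit of x
    alpha = y_bar - beta * x_bar    # expected y when x is zero
    return alpha, beta


# Hypothetical sample: heights in cm, weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 67, 73, 80]
alpha, beta = fit_line(heights, weights)
print(f"weight = {alpha:.1f} + {beta:.2f} x height")
print(f"predicted weight at 175 cm: {alpha + beta * 175:.1f} kg")
```

The intercept here is negative because it extrapolates to a height of zero, far outside the data; as with α in the notes, it is needed to position the line but has no practical interpretation.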