Page 1: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Methods of Applied Statistics I

STA442 / STA2101

Craig Burkett

About me
- Craig Burkett
- Formerly
  - Aerospace engineer
  - High-school teacher
  - Lecturer at UBC
  - Lecturer at UTM
- Now a lecturer at U of T and a statistical consultant

Logistics
- Lecture 2-5pm every Friday
  - Quiz at ~3pm, followed by a 10-minute break (when you may retrieve old quizzes)
- All course material (outline, lecture slides, practice problems, quiz solutions) will be posted on the course website: http://utstat.toronto.edu/burkett/sta442f15/
- For all inquiries, come to office hours or speak to me before lecture or during the break
  - Please do not send me an email

Marking Scheme

Assessment               Weight  Due
Quizzes (best 10 of 11)  70%     Sept 25, Oct 2, Oct 9, Oct 16, Oct 23, Oct 30, Nov 6, Nov 13, Nov 20, Nov 27, Dec 4
Final Exam               30%     ???

Hierarchy of Stats Programs
- R / SAS
  - R more popular among academics (free!)
  - SAS more popular in large business & medicine
- Everything else that can be scripted
  - SPSS, Stata, …
- Everything else that is menu-driven
  - Minitab, R Commander, …
- Anything browser-based
  - Statcrunch, …
- Excel

Hierarchy of Operating Systems
- Unix
  - It’s the fastest, most customizable
- Windows power user
  - Because they build their own machines
- Mac power user
  - Benefit mostly to video/audio
- Windows average user
  - Good enough for this course
- Mac hipster user
  - Grrrrrrr!

Ready … ?

Introduction to Everything

Applied Statistics Process
[Flowchart. Boxes: Formulate Research Question & Sample Size; Obtain Funding; Collect Data; Merge, Sort, Clean, Aggregate; Compute Descriptive Stats & Graphs; Choose / Fit Model; Check Assumptions; Interpret Results & Follow-up Tests / CIs; Formal Writeup / Publish. Legend: Our main focus; Our secondary focus]

Types of Statements
- Analytic Propositions
  - Statements that are true by definition
    - All red cars are red
    - All bachelors are unmarried
    - $\sin^2 x + \cos^2 x = 1$
  - Domain of Mathematicians
  - If you reject these types of claims, you are widely considered an Irrational person
    - Or, at best, you don’t understand the symbols

Types of Statements
- Scientific Statements
  - Can be verified empirically
    - Tylenol is an effective painkiller
    - Fertilizer A is better than B for crop growth
    - As the force, so the deformation (Hooke’s Law: $F = kx$ for springs)
  - Domain of Scientists (that’s us!)
    - Logical Positivists
  - If you reject these types of claims, you are a very poor scientist

Types of Statements
- Dogmatic Claims
  - Cannot be verified or proven
    - I think, therefore I am
    - There is a God
    - Friends are more important than money
  - Domain of Theologians & Yoga instructors
  - If you reject these claims, you are considered either a cynic, a boor, enlightened, thoughtful, playing devil’s advocate, or a heathen, depending on who is judging you

Types of Statements
- Aesthetic/Ethical Claims
  - Confer an opinion about something
    - That painting is beautiful
    - Students should not text during class
    - Craig should not slurp his cereal
  - Domain of people who are more interesting at parties than the previous three groups
  - If you reject these claims, you might annoy my wife

Scientific Claims
- Must be falsifiable
  - Karl Popper
- Are ‘proven’ true by experiment
  - Evidence-based medicine
- A hypothesis is formed, and either rejected or not rejected based on the results of a properly designed experiment
  - That’s where we come in!
- This is the scientific method
  - If you don’t like it, go back to the Middle Ages

Applied Stats Algorithm
- We have our first steps in a methodology

[Flowchart] Is the research question scientific?
  Yes: Continue to next step
  No: Not our problem

Proven? How?

“When you have eliminated the impossible, whatever remains, however improbable, MUST be the truth”- Sherlock Holmes

Experiments
- We prove things in Science by setting up experiments
- In the simplest setup, we form two groups
  - Apply a different Treatment to each group
  - Measure something of interest (Response)
- Ideally, the groups are identical in every possible way, except for the treatment
- If the groups differ in the response, it must be due to the treatment variable
  - Elementary, my dear Watson

Observational Studies
- We also do research this way
  - Cheaper, faster
- Analytical methods basically the same as those used in experiments
- Setup is fundamentally different
  - Treatments not assigned randomly
- You can say a lot about Obs. Studies, but no matter how you slice it, you cannot determine causation
  - (Some folks are working on this …)
  - Will return to this later, in a mathematical way

Experiments and Observational Studies

Similarities and Differences

Experiments vs. Observational Studies
- Two main types of studies that use regression models (i.e. applied statistics)
- Analysis can be similar (at least for MLR models) but conclusions very different
- Experiments can determine Cause and Effect relationships
  - At least they’re our best way to determine C&E thus far

Experiments
- A treatment imposed on experimental units
- Popular in engineering, medicine, forestry, farming, education, biology, psychology …
- Engineering
  - Effect of temperature on failure rate of electronics
  - Which production method produces ‘better’ quality parts?

Experiments
- Medicine
  - Does drug A work ‘better’ than drug B?
  - Do patients receiving surgery live longer than patients on a chemical treatment?
- Forestry
  - Predict future coverage
  - Estimation of age (without core samples)

Experiments
- Farming
  - What is the optimal irrigation level and fertilizer dosage to maximize crop yield?
- Education
  - Do students in a ‘flipped classroom’ retain more?
  - Which medium is best to keep attention?

Experiments
- Biology
  - What pH level allows cells to live longest?
  - In vivo vs. in vitro
- Psychology
  - Eyewitness testimony
  - Pygmalion / Golem effect
  - Demand effect

Experiments
- Medicine - RCTs
  - Randomized Controlled Trials
  - Gold standard for evidence-based medicine
  - Registered now at ClinicalTrials.gov to avoid publication bias
  - Double-blind
    - Neither the experimental unit (patient) nor the researcher knows the treatment
    - Lanarkshire Milk Study

Experiments
- Placebo-controlled
  - Placebo effect
- Sample size calculations
  - Ethical reasons
- Intention To Treat
  - Subjects can switch treatments of their own accord – how to deal with it?
  - Often assume they stayed with the original group, to avoid selection bias
  - Conservative results
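
To make the Intention To Treat idea concrete, here is a minimal sketch in Python. The records, arm names and response values are all made up for illustration; the point is only to compare analysing subjects by the arm they were assigned to against analysing them by the arm they actually received.

```python
from statistics import mean

# Hypothetical trial records: (assigned arm, arm actually received, response).
records = [
    ("drug", "drug", 7.0),
    ("drug", "drug", 6.5),
    ("drug", "placebo", 4.0),    # switched away from the drug
    ("placebo", "placebo", 4.5),
    ("placebo", "placebo", 5.0),
    ("placebo", "drug", 6.0),    # switched to the drug
]

def group_means(records, key_index):
    """Mean response per arm, grouping on column key_index (0 = assigned, 1 = received)."""
    groups = {}
    for record in records:
        groups.setdefault(record[key_index], []).append(record[2])
    return {arm: mean(values) for arm, values in groups.items()}

itt = group_means(records, 0)          # intention to treat: analyse as assigned
as_treated = group_means(records, 1)   # analyse by treatment actually received

itt_effect = itt["drug"] - itt["placebo"]
as_treated_effect = as_treated["drug"] - as_treated["placebo"]
print(f"ITT effect: {itt_effect:.2f}, as-treated effect: {as_treated_effect:.2f}")
```

In this toy data the ITT estimate is the smaller of the two, which is the sense in which the results are conservative. Real trials will not always behave this neatly.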

Experiments
- Advantages
  - Strength of conclusion
  - If properly randomized can make cause-effect conclusions
- Disadvantages
  - Expensive
  - Time-consuming
  - Loss to follow-up
  - Ethical issues (e.g. animal testing)

Observational Studies
- Data measured without intervention
- Can determine associations, but can’t say that X causes Y
- Can make predictions
- Popular in medicine, education, social science

Observational Studies
- Medicine
  - Does smoking cause lung cancer?
  - What are the long-term effects of amputation?
  - Not ethical to assign these treatments!
- Education
  - Do students from higher income families do better?
  - Is there a difference in graduation rate based on race?
  - Not possible to assign these treatments

Observational Studies
- Social Science
  - Is there a personality difference between dog and cat owners?
  - Can we predict rat populations throughout Vancouver using demographic information?
  - Treatments not assigned but ‘chosen’ by subjects

Observational Studies
- Can be prospective or retrospective
- Prospective
  - Identify a cohort and follow them through time
- Retrospective
  - Look back through time and see what happened to a cohort
- Prospective studies provide better evidence of an association, but take longer, require ethical approval and have follow-up issues

Observational Studies
- Advantages
  - Cheaper
    - Data can be expensive, but analysis can be done by a single person
  - Instant (Retrospective, anyway)
  - Fewer ethical issues as subjects chose own Tx
  - No withdrawal problems (for retrospective)
- Disadvantages
  - No Cause-Effect conclusions
    - Although they can motivate future experiments

Observational Studies
- They differ from experiments in that the Treatments were not assigned randomly
  - They were ‘chosen’ by the subjects themselves
    - Does smoking cause lung cancer?
    - What is the effect of an amputation on longevity?
    - What is the effect of climate on happiness?
- In observational studies, causation cannot be determined

Experiments vs. Observational Studies
- This is so important that it gets its own slide
- Only properly controlled, double-blind, randomized experiments can determine causation
- Confounding cannot be removed from observational studies so a causal link cannot be made
  - Doesn’t stop people from doing it though

Anecdotal Evidence
- “My grandfather lived to 102 and he smoked a pack a day!”
- “My aunt went to this tarot-card reader and found true love the next week!”
- …
- Not statistical or scientific
- Arguably more influential than both experiments and observational studies
  - You should work to change this!
  - How? Collect and study data

Data

“In God we trust; all others bring data”

- W. Edwards Deming

(Quote on the wall at JSC – NASA)

X: Independent Variable(s)
- Not the best name for it
  - Predictor, Explanatory
- Manipulated in an experiment
- Can take one of four types
  - Categorical / Nominal
  - Ordinal
  - Interval
  - Ratio

Categorical variables
- Takes on several levels, none of which have any natural ordering
  - Sex (M, F, …)
  - Race (Black, White, Asian, …)
  - Program major (Stat, CS, Math, Psych, Bio, …)
  - Type of fertilizer (A, B, …)
  - Drug (Active, Placebo)
- When controlled by the experimenter, called a Factor

Ordinal variables
- Takes on several levels which have a natural order, but no consistent distance metric
  - Grade (A+, A, A-, B+, …)
  - Professor Rating (5, 4, 3, 2, 1)
    - Likert item
  - Level of education (PhD, Masters, Bachelors, HS, Primary, None)
  - Sports (Rugby, Football, Soccer, … Basketball)
- Difficult to deal with, so we usually consider them as either Categorical, or …

Interval variables
- Numerical variable with a consistent distance metric, but no proper zero point
  - IQ
  - Temperature (in °C)
  - SAT score
- Slope and difference are meaningful, but ratios are not

Ratio variables
- Interval variable with a proper zero point
  - Age
  - Weight
  - Temperature (in K)
  - Amount of rainfall
- Ratios are meaningful
  - Important for reporting on multiplicative effects and using log transformations
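
The interval versus ratio distinction can be checked with the temperature example from these slides. The sketch below uses two made-up temperatures; the only fact assumed beyond the slides is the standard Celsius-to-Kelvin offset of 273.15.

```python
def celsius_to_kelvin(temp_c):
    """Convert degrees Celsius to kelvin (same unit size, different zero point)."""
    return temp_c + 273.15

# Two illustrative temperatures (hypothetical values).
cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences agree on both scales, so differences are meaningful on an interval scale.
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratios depend on where zero sits, so only the Kelvin (ratio-scale) version is meaningful.
ratio_c = warm_c / cold_c   # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cold_k   # about 1.035

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```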

Example data file

Sex  Grade  IQ   Age
M    B+     110  19
M    C-     102  21
F    A+     119  19
F    C+     103  20
…    …      …    …

- Rows are cases
- Columns are variables
- That’s the way we roll – don’t rock the boat
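
As a rough sketch of the "rows are cases, columns are variables" layout, the four visible rows of the example file can be held as a list of dictionaries in Python. The helper `column` and the grade ranking `GRADE_ORDER` are assumptions made for this illustration; the slide does not define them.

```python
from collections import Counter
from statistics import mean

# Rows are cases, columns are variables (values copied from the example data file).
rows = [
    {"Sex": "M", "Grade": "B+", "IQ": 110, "Age": 19},
    {"Sex": "M", "Grade": "C-", "IQ": 102, "Age": 21},
    {"Sex": "F", "Grade": "A+", "IQ": 119, "Age": 19},
    {"Sex": "F", "Grade": "C+", "IQ": 103, "Age": 20},
]

# Assumed ranking for the ordinal Grade variable, lowest to highest.
GRADE_ORDER = ["C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

def column(rows, name):
    """Pull one variable (column) out of the list of cases (rows)."""
    return [row[name] for row in rows]

sex_counts = Counter(column(rows, "Sex"))                        # categorical: counts only
best_grade = max(column(rows, "Grade"), key=GRADE_ORDER.index)   # ordinal: compare by rank
mean_iq = mean(column(rows, "IQ"))                               # interval: means and differences
mean_age = mean(column(rows, "Age"))                             # ratio: means and ratios

print(sex_counts, best_grade, mean_iq, mean_age)
```

Each variable type supports a different set of summaries, which is why the type matters before any model is chosen.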

Y: Dependent Variable
- Not a good name for it either
  - Response variable is better
- Measured by experimenter
- Can take four types as well
  - We’ll study numerical and categorical responses in this course
- Should not be subject to Floor or Ceiling Effect
  - Math contest scores and high school grades

X: Independent Variable(s)
- Could be numerical or categorical
- If X is categorical
  - We call X a factor if we manipulate it
  - Single-Factor experiments have only one X variable, call it Factor A
    - The possible levels of Factor A are {a1, a2, a3, …}
  - Two-Factor experiments have two independent variables, Factors A and B
  - Treatments are combinations of Factor levels

Y: Response Variable(s)
- Could also be numerical or categorical
- This leads to many different setups, some of which you may have studied already

                  Y: Categorical   Y: Numerical
X: Categorical    Cont. Tables     ANOVA
X: Numerical      Log. Reg.        Regression
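
The two-by-two table above is effectively a lookup from the types of X and Y to a family of methods. A small Python sketch of that lookup follows; the function name `suggest_method` and the lowercase type labels are made up for illustration.

```python
# Keys are (type of X, type of Y); values are the method families from the table.
METHODS = {
    ("categorical", "categorical"): "contingency tables",
    ("categorical", "numerical"): "ANOVA",
    ("numerical", "categorical"): "logistic regression",
    ("numerical", "numerical"): "regression",
}

def suggest_method(x_type, y_type):
    """Return the method family for a given pair of variable types."""
    try:
        return METHODS[(x_type, y_type)]
    except KeyError:
        raise ValueError(f"unknown variable types: {x_type!r}, {y_type!r}") from None

print(suggest_method("categorical", "numerical"))
```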

Experimental Units
- A minimal unit that could possibly receive a unique treatment
  - Often, but not always, a single row in the datafile
- Each one gets a single treatment from the possible levels of all factors
  - e.g. Households may have multiple people (measurements) but if the treatment was applied to the household, then household is the unit
  - There is no way that different members of the household could get a different Tx

Example
- We want to do an experiment to see if Teaching Style affects the Learning Outcome in a classroom
  - Note italics: those are the variables
  - Which is the response? Independent?
  - What type is each? How many levels?
  - Go!

Example
- Suppose we want to see if Teaching Style and Lecture Time affect Learning Outcome
  - How many factors? Levels of each?
  - What are the possible treatments?
  - What are the experimental units?
  - Go!

Crossed Factors
- When all possible factor combinations are actual treatments, the experiment is said to be fully crossed
- These are easier to deal with than other designs such as Nested, which we may study later
  - Leads to hierarchical linear models
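
Since treatments are combinations of factor levels, a fully crossed design can be enumerated with a Cartesian product. The sketch below is one way to do that in Python; the factor names and levels are hypothetical stand-ins for the teaching example, and both helper functions are invented for this illustration.

```python
from itertools import product

def all_treatments(factors):
    """Every combination of factor levels; `factors` maps factor name -> list of levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]

def is_fully_crossed(factors, used_treatments):
    """True if every possible combination appears among the treatments actually run."""
    used = {tuple(sorted(t.items())) for t in used_treatments}
    return all(tuple(sorted(t.items())) in used for t in all_treatments(factors))

# Hypothetical levels for the two factors in the teaching example.
factors = {"style": ["lecture", "flipped"], "time": ["morning", "evening"]}
possible = all_treatments(factors)

for treatment in possible:
    print(treatment)
print(is_fully_crossed(factors, possible))       # all four combinations were run
print(is_fully_crossed(factors, possible[:3]))   # one combination is missing
```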

X vs Y
- When there is more than one response variable, this calls for a Multivariate Analysis
  - More difficult
  - We may cover these methods in this course
- Most interesting experiments have more than one independent variable
  - This is not a “multivariate” situation
  - Call it “multiple” if you must

Nuisance variables
- Anything that influences the response variable other than the treatment condition
- Teaching study
  - Different lecturers for each section
  - Amount of instruction given for each section
  - Older students may be better (or worse) at performance
  - Amount of homework done
  - Time of day for lecture section
- What can we do about them?

Control
- If we can control a nuisance variable and keep it constant throughout all treatment levels, then we should
  - It is then no longer a ‘variable’ – problem solved
- Teaching study
  - Same lecturer delivers each style
  - Amount of instruction given for each section can easily be controlled – 36h each

Blocking
- For nuisance variables that cannot be controlled but can still be observed, we can use Blocking to make sure each group has an equal amount of each
- Teaching study
  - When forming the two sections, ensure that there is an equal distribution by age in each class

Randomization
- For nuisance variables that cannot be controlled or observed, we depend on Randomization to spread these variables out evenly
- Teaching study
  - If, after any desired blocking, we allocate students randomly to each section, then the amount of homework done by each student should be about equal, on average, between the two sections
    - Even if not equal among all units
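
Blocking on age followed by random allocation within each block can be sketched as below. This is only one possible scheme, not necessarily the one used in the course. The roster, the two age bands, the section labels and the function names are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def blocked_random_assignment(units, block_of, sections=("A", "B"), seed=None):
    """Shuffle units within each block, then deal them out to the sections in turn."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomization spreads out what we cannot observe
        for position, unit in enumerate(members):
            assignment[unit] = sections[position % len(sections)]
    return assignment

# Hypothetical roster of (student id, age) pairs.
ages = [18, 19, 19, 20, 24, 25, 31, 33, 18, 22, 27, 35]
roster = [(f"s{i}", age) for i, age in enumerate(ages)]

def age_band(student):
    """Assumed blocking variable: a coarse age band."""
    return "under 23" if student[1] < 23 else "23 and over"

assignment = blocked_random_assignment(roster, age_band, seed=442)
section_counts = Counter((age_band(s), assignment[s]) for s in roster)
print(section_counts)
```

With six students in each age band, every section ends up with three students from each band, while which students land where is left to chance.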

Replication
- It would be silly to run an experiment with just one observation at each treatment combination
  - If there’s a difference between treatments, it might just be that experimental units are different!
  - Cannot say for sure
- Also, we can’t estimate variance with only one observation
- Teaching study
  - There are many students in each class
  - Another university could replicate the study
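
The point about variance is easy to check: the sample variance divides by n - 1, so it is undefined when n = 1. Python's standard `statistics.variance` raises an error in that case. The wrapper name and the scores below are made up for illustration.

```python
from statistics import StatisticsError, variance

def sample_variance_or_none(observations):
    """Sample variance (n - 1 denominator), or None when it cannot be estimated."""
    try:
        return variance(observations)
    except StatisticsError:
        # Raised when there are fewer than two observations.
        return None

print(sample_variance_or_none([72.0]))         # one observation: no estimate
print(sample_variance_or_none([70.0, 74.0]))   # two observations: an estimate exists
```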

Replication
- There are really two levels of replication
- Treatment Level
  - We take more than one observation at each treatment combination
- Experiment Level
  - We like to replicate the entire experiment, to make sure that our results were not an artifact of something else, and to generalize to other populations
  - Also an easy way to get a “Me Too!” paper published

Blinding
- If subjects are aware of the assigned treatments, they may influence the response variable
  - Consciously or not
  - Placebos
- If researchers are aware of the assigned treatments, they may influence the response too
  - Consciously or not
  - Lanarkshire Milk Study
- We try to make experiments Double-Blind for these reasons

Principles of Experiment Design
- These form the principles of good experiment design:
  - Control
  - Blocking
  - Randomization
  - Replication
  - Blinding
- Observational studies lack randomization, and likely one or more other principles as well
  - Exactly why we can’t infer causation from them

Does this matter?
- Ignoring these principles is like applying the rules of one sport (say, basketball) to another (say, soccer)
  - It might work, but it probably won’t
- It doesn’t make ANY SENSE to apply statistical methods rooted in the preceding assumptions to an experiment that hasn’t followed the principles of good design
- So yes, it does matter

Confounding
- Any nuisance variables that are systematically related to the treatments will have their effects confounded with the treatment effects we are trying to measure
- Teaching study
  - If style A is taught in the morning and style B in the evening section, then any difference in response between sections cannot be uniquely attributed to teaching style
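
A small simulation can show how this goes wrong. In the sketch below, teaching style has no effect at all, while morning sections get an assumed bonus of 5 points. Every number (baseline 70, noise SD 10, the bonus, the sample sizes) is invented for illustration.

```python
import random
from statistics import mean

def simulate_scores(n, style_effect, time_effect, rng):
    """Scores = baseline + style effect + time-of-day effect + normal noise."""
    return [70 + style_effect + time_effect + rng.gauss(0, 10) for _ in range(n)]

rng = random.Random(0)
MORNING_BONUS = 5.0   # assumed nuisance effect; the true style effect is zero throughout

# Confounded design: style A is always taught in the morning, style B in the evening.
a_confounded = simulate_scores(2000, 0.0, MORNING_BONUS, rng)
b_confounded = simulate_scores(2000, 0.0, 0.0, rng)
confounded_diff = mean(a_confounded) - mean(b_confounded)

# Crossed design: each style is taught half in the morning, half in the evening.
a_crossed = simulate_scores(1000, 0.0, MORNING_BONUS, rng) + simulate_scores(1000, 0.0, 0.0, rng)
b_crossed = simulate_scores(1000, 0.0, MORNING_BONUS, rng) + simulate_scores(1000, 0.0, 0.0, rng)
crossed_diff = mean(a_crossed) - mean(b_crossed)

print(f"confounded A - B: {confounded_diff:.2f}")
print(f"crossed A - B:    {crossed_diff:.2f}")
```

In the confounded design the sections differ by roughly the time-of-day bonus even though style does nothing. The crossed design brings the difference back near zero.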

Mozart Effect
- Babies who listen to classical music tend to do better in school later on
- Does this mean parents should play classical music for their babies?
- Please comment
  - What is one possible confounding variable?

Balance
- We generally try to design experiments with the same number of experimental units in each treatment group
  - Such an experiment is called balanced
- There are reasons to design unbalanced experiments as well, on purpose
- At any rate, as soon as somebody drops a test tube, it’s no longer balanced
  - We’ll have to learn to deal with both

Applied Stats Algorithm
[Flowchart. Node labels: Scientific question? (No); Experiment; Observational Study; Prospective; Retrospective; Properly Controlled? (No); Crossed; Nested; Single Factor; Factorial]

Applied Stats Algorithm
[Flowchart. Node labels: Scientific question? (No); Classify Study; Unacceptable]

Acquiring Data

Surveys and Samples

Another way
- We already saw two ways to come into data
  - Designed Experiments
  - Observational Studies
- You could also run a survey, or take a sample

Population and Sample
- Population
  - A (large) group of experimental (observational) units about which we want to make some inference
    - People, rabbits, trees, bags of water, …
  - Sometimes well-defined, sometimes not
- Sample
  - A (smaller) group of experimental units that we hope well-represents the population
  - Selected from a frame
    - Approximation to list of population

Population and Sample
- From our sample, we compute statistics
- We don’t really want statistics, we want the population parameters
- With our knowledge of probability and the magic of inductive reasoning, we can infer parameters from statistics, usually with some sort of interval
  - Confidence, Credible

Confidence Interval
- Pair of numbers chosen so that the probability they will enclose the (fixed) parameter, or function of parameters, is large, like 95%
- CIs are random – there is nothing particularly special about your CI
- Because they are random, they miss the parameter occasionally
  - Like $100\alpha\%$ of the time (5% for a 95% interval)
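
The "CIs are random" idea can be seen by simulation: draw many samples, build an interval from each, and count how often the fixed parameter is enclosed. The sketch below assumes normal data with a known standard deviation so that a simple z-interval applies. The true mean, SD, sample size and number of repeats are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """Interval for the mean, assuming the population SD `sigma` is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    centre = mean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return centre - half_width, centre + half_width

def coverage(true_mean, sigma, n, repeats, seed=0):
    """Fraction of simulated intervals that enclose the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= true_mean <= high
    return hits / repeats

rate = coverage(true_mean=50.0, sigma=10.0, n=25, repeats=2000)
print(f"intervals enclosing the parameter: {rate:.1%}")
```

The parameter never moves; the intervals do. About 95% of them should enclose it, and the rest are the occasional misses.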

Credible Interval
- This is a Bayesian concept, and thus heresy in this course (and UG program)
- I don’t really believe this, and we shall discuss Bayesian philosophy shortly
  - But not methods

Image credit: Monty Python Series 2, Episode 2

Page 71: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Probability Sampling Techniques

SRS
Stratified
Cluster
Multistage
Systematic

Page 72: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Simple Random Sample

Every possible subset of size n has equal chance of selection
Theoretically easy
Practically difficult
Basis against which all other samples are measured

Images: D. Kernler (CC BY-SA 4.0)
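
As a minimal sketch of an SRS, assume the frame is simply a Python list of unit labels (the labels and the helper name `simple_random_sample` are made up for illustration). Sampling without replacement from the list gives every subset of size n the same chance of selection.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)   # sampling without replacement

# Hypothetical frame of 100 units
frame = [f"unit-{i:03d}" for i in range(1, 101)]
srs = simple_random_sample(frame, n=10, seed=442)
print(srs)
```

The "practically difficult" part is hidden in the first line of the example: in real surveys, building a complete frame is usually the hard step.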

Page 73: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Stratified Sample

Divide the population into fixed ‘strata’, and sample as an SRS from each
No less efficient than SRS (with proportional allocation); potentially more
Allows stratum-level inference as well
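
A stratified draw can be sketched as one SRS per stratum. The strata names, their sizes and the 10% sampling fraction below are illustrative assumptions. Proportional allocation is used here, though other allocations are possible.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take an SRS of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = max(1, round(fraction * len(units)))  # at least one unit per stratum
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical strata: every stratum is sampled, none is skipped
strata = {
    "undergraduate": [f"UG-{i}" for i in range(80)],
    "graduate": [f"GR-{i}" for i in range(20)],
}
picked = stratified_sample(strata, fraction=0.1, seed=442)
print({name: len(units) for name, units in picked.items()})
```

Because each stratum contributes its own sample, stratum-level estimates come for free.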

Page 74: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Cluster Sample

Divide the population into natural clusters, and take an SRS of clusters
Sample entire cluster
Very cost-effective sampling strategy
Population-level frame not required
Usually implemented as multistage
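
A one-stage cluster sample can be sketched as an SRS of cluster labels followed by taking every unit in the chosen clusters. The classrooms and students below are hypothetical. Note that only a list of clusters is needed up front, not a list of every unit in the population.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """SRS of clusters; every unit in a chosen cluster is sampled."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 12 classrooms of 5 students each
clusters = {
    f"classroom-{c}": [f"student-{c}-{i}" for i in range(5)]
    for c in range(12)
}
picked = cluster_sample(clusters, n_clusters=3, seed=442)
print(sorted(picked))
```

A multistage version would replace "take the whole cluster" with another sampling step inside each chosen cluster.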

Page 75: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Stratified vs Cluster

Strata are fixed groups and there are usually few
Clusters are random and there are usually many
We sample from all strata but only some clusters
Stratified sampling is very efficient when
  The strata themselves are very homogeneous
  Differences (variation) between strata are large
Cluster sampling is very efficient when
  The clusters themselves are very heterogeneous
  Differences (variation) between clusters are minimal
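
The efficiency claim for stratified sampling can be illustrated with a small simulation. Everything in this sketch is an assumption chosen to make the effect visible: two equal-sized strata, each homogeneous (standard deviation 1), with very different means (10 and 50). It compares the variance of the sample mean under an SRS of 20 units with that of a stratified sample of 10 units per stratum.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(442)

# Hypothetical population: homogeneous strata, large differences between them
strata = {
    "A": [rng.gauss(10, 1) for _ in range(500)],
    "B": [rng.gauss(50, 1) for _ in range(500)],
}
population = strata["A"] + strata["B"]

def srs_mean(n):
    """Mean of a simple random sample of n units from the whole population."""
    return fmean(rng.sample(population, n))

def stratified_mean(n_per_stratum):
    """Strata are equal-sized, so the average of stratum means estimates the mean."""
    return fmean(
        fmean(rng.sample(units, n_per_stratum)) for units in strata.values()
    )

reps = 2000
var_srs = pvariance([srs_mean(20) for _ in range(reps)])
var_strat = pvariance([stratified_mean(10) for _ in range(reps)])
print(f"variance of SRS mean:        {var_srs:.3f}")
print(f"variance of stratified mean: {var_strat:.3f}")
```

Both designs use 20 units, but the stratified estimate varies far less because the between-strata variation is removed by design. Reversing the setup, with groups that each look like the whole population, is the case where cluster sampling does well.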

Page 76: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Systematic Sample

Choose every kth unit
Does not strictly require a frame, but watch out for natural periods!
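
The warning about natural periods is easy to demonstrate. In this sketch the random starting point and the ten-week calendar are assumptions added for illustration. When k matches the period of the data (k = 7 for days of the week), every sampled unit falls on the same weekday.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start in the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return units[start::k]

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
calendar = [days[i % 7] for i in range(70)]   # ten weeks of daily records

every_7th = systematic_sample(calendar, k=7, seed=442)
every_5th = systematic_sample(calendar, k=5, seed=442)
print(set(every_7th))   # a single weekday: the sample is locked to the period
print(set(every_5th))   # all seven weekdays appear
```

Choosing k so that it does not divide (or share a factor with) the natural period avoids the problem in this toy case.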

Page 77: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Differences with Studies

A study tries to measure the effect of something, whether or not it is causal
A survey is one way to collect data in an observational study (OS), if you don’t feel like measuring it yourself
  Or can’t
It’s certainly not an experiment, since there is no treatment applied
With surveys, you don’t need to look at “effects”
  You can still estimate parameters, with intervals

Page 78: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Probability and Statistics

What’s the difference?


Page 79: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Probability

I have a fair coin (½ chance to get tails)
I am going to flip it n times
What is the probability that I see tails exactly k times?

$$P(X = k) = \binom{n}{k} \left(\frac{1}{2}\right)^{k} \left(\frac{1}{2}\right)^{n-k}$$

You know the parameter (p = ½), and want the chances of observing a specific sample
Deductive reasoning
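
The probability calculation above can be carried out directly. This is a minimal sketch. The function name `binomial_pmf` and the choice of n = 10, k = 3 are illustrative, and the general p is included only so the fair-coin case p = ½ is visible as a default.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """P(X = k) for X = number of tails in n independent flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Fair coin, 10 flips: chance of exactly 3 tails
prob = binomial_pmf(3, 10)
print(prob)

# The probabilities over all possible k must add up to 1
total = sum(binomial_pmf(k, 10) for k in range(11))
print(total)
```

Here the parameter is known and the sample is unknown, which is the deductive direction.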

Page 80: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Statistics

I have a coin, but I don’t know if it’s fair
  p chance to get tails
I have just flipped it n times
I got tails k times
What can I say about the coin’s fairness?
  i.e. I want to estimate p
You know the outcome of a specific sample, and want to infer the parameter p
Inductive reasoning

Page 81: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Again

Probability: Parameters → Experiment
Statistics: Experiment → Parameters
They are inverse problems

Page 82: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Basic Statistical Problem

I have a coin and I want to know if it’s fair
Or, more specifically, what is p
  The probability of tails on a single toss
There are three approaches to solve this problem
1. Layman
2. Frequentist
3. Bayesian

Page 83: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Layman Approach

It’s a coin, of course it’s fair!
I don’t need to do an experiment
  Let’s go watch the hockey game instead
Obviously not very scientific

Page 84: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Frequentist Approach

I have no idea, let’s do an experiment
  Toss the coin 10 times
  Get tails on 3 out of 10 tosses
Predict p = 0.3
Much more scientific
  Definitely objective and unbiased
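
One common way to justify the estimate 3/10 is maximum likelihood: pick the value of p that makes the observed sample most probable. The slides do not name a method, so treat this as an illustrative assumption. The grid search below is a deliberately simple stand-in for the calculus.

```python
from math import comb

def likelihood(p, k=3, n=10):
    """Probability of seeing k tails in n tosses if the tails chance is p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Try p = 0.00, 0.01, ..., 1.00 and keep the value with the highest likelihood
grid = [i / 100 for i in range(101)]
p_hat = max(grid, key=likelihood)
print(p_hat)
```

The maximizer is the sample proportion k/n, which uses nothing but the data.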

Page 85: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Bayesian Approach

I think it’s probably fair since most coins are fair, but I’ll do an experiment anyway
  Toss the coin 10 times
  Get tails on 3 out of 10 tosses
I thought p = 0.5, but it looks like p = 0.3
  Predict p = 0.4
  It’s a bit more complicated than taking the midpoint
Subjective
  Uses more information than just the sample
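
To see why the answer is "a bit more complicated than taking the midpoint", here is a sketch using a Beta prior, which is the standard conjugate choice for a coin. The slides do not specify a prior. Beta(5, 5) is an assumption picked because it is centred at 0.5 and happens to carry as much weight as 10 tosses, which is what makes the posterior mean come out at the midpoint.

```python
def beta_binomial_update(a, b, tails, tosses):
    """Posterior Beta parameters after observing `tails` in `tosses` flips."""
    return a + tails, b + (tosses - tails)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Assumed prior: Beta(5, 5), centred at 0.5, worth about 10 prior tosses
posterior = beta_binomial_update(5, 5, tails=3, tosses=10)
print(posterior, beta_mean(*posterior))

# A more stubborn (also assumed) prior, worth about 100 prior tosses
stubborn = beta_binomial_update(50, 50, tails=3, tosses=10)
print(stubborn, beta_mean(*stubborn))
```

With the stronger prior the estimate stays much closer to 0.5, so the result depends on how much the prior is trusted relative to the sample. That dependence is the "subjective" part.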

Page 86: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Polling Time

Which do you prefer?
1. Layman
2. Frequentist
3. Bayesian

Page 87: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The situation

You are a teacher, and one of your students (Billy) has turned in five items of ‘A’ quality so far this term (out of five total items)
  Call them papers
  They could be essays, tests, assignments, …
A sixth paper has just come due, and you have to mark it

A A A A A ???

You turn to Statistics for guidance

Page 88: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The paper

Unfortunately for Billy, the paper is actually of ‘D’ quality
  Objectively
Look at three possible marking options

Page 89: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Layman says …

Billy always gets an A, so I’m not even going to read this paper
  Give him an A and let’s watch the game!
This approach would miss the objective fact that the paper is of D quality
Sounds ridiculous, but I’m pretty sure some of my former colleagues (ahem … English department) marked this way

Page 90: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Frequentist says …

This paper is a D, so I’m going to give Billy a D because that’s what he earned
  I don’t care that he always gets an A
Seems ‘fair’, although Billy won’t like it
  Of course, he would like it just fine if he usually got D’s and turned in an A paper

Page 91: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

The Bayesian says …

Billy always gets an A, but I’ll read his paper anyway to see if I should update my thinking
  Oh, it’s a D
  Better give him a B just to be safe
  If he turns in another D paper, I’ll give him a C, and the following D paper will get a D
Billy will appreciate this method, unless of course the situation were reversed

Page 92: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Polling Time

Which do you prefer?
1. Layman
2. Frequentist
3. Bayesian

Page 93: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Anecdotally

Most choose the Frequentist method, in contrast to the coin experiment
  Maybe because it’s ingrained in us early as teachers & students
However, if you’re looking at the name first, you’re a Bayesian whether you like it or not
  Even if you don’t look, you may recognize the handwriting
We can extend this

Page 94: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

It’s an important question

We can extend this idea beyond just marking, to cover all interactions with people
I generally like my friend Mark, but he just said something mean to me
  Should I react accordingly, or temper my reaction based on his past good behaviour?
Similarly with people who have screwed me over in the past
  Should I trust them now after one good instance, or will that be naïve?

Page 95: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Lastly

I have a 6-sided die, the general outcome of which we’ll call X
  And we’ll call it x when we observe a specific outcome
I just rolled it for you to see
  x = 5
Is x a fixed constant, or a random variable?

Page 96: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Lastly

Same setup, except I haven’t rolled the die yet
Is X a fixed constant, or a random variable?
What’s the best we can say about our upcoming outcome x?

Page 97: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Lastly

Same setup, except I rolled the die but didn’t show you the outcome
Is x a fixed constant, or a random variable?
To be clear, X (or x) is not a parameter, but it’s still something unknown, at least to you

Page 98: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Distribution = Population Histogram

Page 99: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Conditional Distribution

For each value x of the independent variable X, there is a separate distribution of the dependent variable Y
This is called the conditional distribution of Y given X = x
  Conditional distribution of height given Sex = F

Page 100: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Definition of “Related”

We will say that the independent and dependent variables are unrelated if the conditional distribution of the dependent variable is identical for each value of the independent variable
If the distribution of the dependent variable does depend on the value of the independent variable, we will describe the two variables as related, or associated

Page 101: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Testing Statistical Significance

Are the independent variable (IV) and dependent variable (DV) “really” related?
Null Hypothesis H0: They are unrelated in the population.
Reasoning: Suppose that the IV and DV are actually unrelated in the population. If H0 is true, what is the probability of obtaining a sample relationship between the variables that is as strong as or stronger than the one observed? If the probability is small (say, p < 0.05), then we describe the sample relationship as statistically significant, and it is socially acceptable to discuss the results

Page 102: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

P-value

The probability of getting results at least as extreme as ours just by chance, if the null hypothesis is true
The conditional probability of a test statistic at least as extreme as the one observed, given the null hypothesis is true
The minimum significance level at which the null hypothesis can be rejected.
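
As a worked illustration, take the earlier coin experiment (3 tails in 10 tosses) and test H0: p = 0.5. The sketch below computes an exact two-sided p-value by adding up the null probabilities of every outcome that is no more likely than the one observed. That definition of "at least as extreme" is one common convention and is an assumption here, as are the function names.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) count of tails."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_two_sided_p_value(k_obs, n, p0=0.5):
    """Sum the null probabilities of outcomes no more likely than k_obs."""
    null_probs = [binom_pmf(k, n, p0) for k in range(n + 1)]
    cutoff = null_probs[k_obs] * (1 + 1e-9)   # tolerance for floating-point ties
    return sum(pr for pr in null_probs if pr <= cutoff)

p_value = exact_two_sided_p_value(3, 10)
print(p_value)
```

For a fair-coin null the extreme outcomes are 0 to 3 and 7 to 10 tails, so the p-value is 2 × (1 + 10 + 45 + 120) / 1024 = 0.34375. Three tails in ten tosses is not statistically significant at the 0.05 level.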

Page 103: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

We can be wrong

Type I error: H0 is true, but we reject it
Type II error: H0 is false, but we fail to reject it

Page 104: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Errors

http://experimentaltheology.blogspot.ca/2010/09/theology-of-type-1-type-2-errors.html


Page 105: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Image: Stock photo

Page 106: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

We can be wrong

Type I error: H0 is true, but we reject it
Type II error: H0 is false, but we fail to reject it
Type III error: Answer the wrong question

Page 107: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Power

The probability of correctly rejecting H0, i.e. rejecting it when it is false
Power = 1 - P(Type II Error)
Power increases with true strength of relationship, and with sample size
Power can be used to select sample size in advance of data collection
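
Power can be computed exactly for the coin example. This sketch assumes an exact two-sided binomial test of H0: p = 0.5 at the 0.05 level, which is one reasonable choice rather than anything prescribed in the lecture. It finds the outcomes that would lead to rejection, then adds up their probabilities under a chosen true value of p.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) count of tails."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_value(k_obs, n, p0=0.5):
    """Exact two-sided p-value: outcomes no more likely than k_obs under H0."""
    null_probs = [binom_pmf(k, n, p0) for k in range(n + 1)]
    cutoff = null_probs[k_obs] * (1 + 1e-9)   # tolerance for floating-point ties
    return sum(pr for pr in null_probs if pr <= cutoff)

def power(n, p_true, p0=0.5, alpha=0.05):
    """Probability of rejecting H0 when the tails chance is really p_true."""
    rejection_region = [k for k in range(n + 1) if p_value(k, n, p0) <= alpha]
    return sum(binom_pmf(k, n, p_true) for k in rejection_region)

# Larger samples: more power against the same alternative (p = 0.3)
for n in (10, 50, 200):
    print(n, round(power(n, p_true=0.3), 3))

# Weaker effect (p = 0.4 is closer to 0.5): less power at the same n
print(round(power(50, p_true=0.4), 3))
```

Running `power` over a range of n and picking the smallest n that reaches a target such as 0.8 is one way to choose a sample size before collecting data. Because the binomial is discrete, power need not rise smoothly with every single increase in n.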

Page 108: Methods of Applied Statistics I · 2015-09-19 · Logistics Lecture 2-5pm every Friday Quiz at ~ 3pm, followed by 10-minute break (when you may retrieve old quizzes) All course material

Should we accept H0?

When the results are not statistically significant, usually we will say that the data do not provide enough evidence to conclude that the variables are related
Sometimes, we have to make a decision either way, in which case ‘not rejecting’ H0 is tantamount to accepting H0
  Quality control