week6 sciencespo-louisdecharsonville spring2018louisdecharson.github.io/files/srqm/week6.pdf ·...

28
Statistical Reasoning Week 6 Sciences Po - Louis de Charsonville Spring 2018 Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 1 / 28

Upload: others

Post on 06-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Statistical ReasoningWeek 6

Sciences Po - Louis de Charsonville

Spring 2018

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 1 / 28

Page 2: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Outine

Research Paper

Statistical Hypotheses

Type I and II Errors

t-Tests

Practice

Annex

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 2 / 28

Page 3: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Research Paper

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 3 / 28

Page 4: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Research Paper

Timeline

1st draft Done2nd draft 10 April

Final draft 24 April

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 4 / 28

Page 5: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Statistical Hypotheses

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 5 / 28

Page 6: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Last Week ReminderÏ Sample 6= PopulationÏ A short video on Central Limit Theorem

Univariate statistics ⇒ Bivariate statistics

Bivariate statisticsÏ Describing the association between two variablesÏ Measuring the strength of the relationship.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 6 / 28

Page 7: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Comparing means across groups

Ï Dependent Variable is quantitative → you can calculate amean

Ï The value of a mean is not always interesting in itself ...Ï But comparing the means of two groups might highlightrelationships between variables

Ï Descriptive statistics : describe and compare the differencein means

Ï Inferential statistics : determine whether or not thedifference in means is due to random chance or is itstatistically significant.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 7 / 28

Page 8: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Statistical Significance 1/2

Statistical significanceLikelihood that a statistic derived from a sample represents somegenuine phenomenon in the population.

Example

Ï Sample : 1000 womenÏ X = 9, σX = 2

Ï Population mean : 8.

1. What is the probability to observe a mean of 9 ?2. Is this difference statistically significant ?

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 8 / 28

Page 9: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Statistical Significance 2/2

Sampling error

sX = 2p1000

= 0.06

t Value

t = 9−8

0.06= 16.67

Ï Using the t distribution, the probability associated with a tvalue of 16.67 is less than 0.001 (see Table of critical valuesfor the Student distribution in Annex).

Ï This probability is known as p-value.

Is this difference statistcally significant ?Ï We need to delve into hypothesis testing.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 9 / 28

Page 10: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

No Booze ? You May Loose, 2006

"A number of theorists assume that drinking has harmful economiceffects, but data show that drinking and earnings are positivelycorrelated. We hypothesize that drinking leads to higherearnings by increasing social capital. If drinkers have largersocial networks, their earnings should increase. Examining theGeneral Social Survey, we find that self-reported drinkers earn10-14 percent more than abstainers, which replicates results fromother data sets"

H1 : "An increase in social drinkings leads to an increase inearnings."

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 10 / 28

Page 11: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 11 / 28

Page 12: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Hypotheses

Substantive hypothesesH1 : ± social drinking → ± social capital → ± earningsH2 : ± earnings → ± disposable income → ± social drinking

Rejecting the null hypothesis H0

H0 : no relationship between social drinkings and earningsH1 : any relationship between social drinking and earnings

Proof by contradiction

Ï Reject of retain H0 with a certain level of confidence (usually95% or 99%)

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 12 / 28

Page 13: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Null hypothesis

Null hypothesis H0

Ï The null hypothesis alwayssuggests that there will bean absence of effect in thepopulation. Denoted H0.

Ï Usually, researchers want toreject H0 with a certainlevel of confidence.

Alternate hypothesis H1

Ï Alternative to the nullhypothesis. Usually,it is thehypothesis that there issome effect present in thepopulation. Denoted H1.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 13 / 28

Page 14: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Type I and II Errors

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 14 / 28

Page 15: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Errors in hypothesis testing 1/2

Type I Error - Rejecting H0 when it is actually true"Last year executed man proven innocent by DNA evidence."

Ï H0 : innocent...Ï H1 : ... until proven guilty

H0 wrongly rejected

Type II Error - Retaining H0 when its actually false"I didn’t see your email Professor, it was in the spam folder."

Ï H0 : email is spamÏ H1 : email is non-spam

H0 wrongly accepted.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 15 / 28

Page 16: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Errors in hypothesis testing 2/2

H0 : Patient is not pregnant

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 16 / 28

Page 17: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

t-Tests

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 17 / 28

Page 18: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Different type of t-tests

Different t-testsÏ Independent samples t-test : compare the means of twoindependent samples on a given variable.

Ï Example : compare the average earnings between 100 randomlychosen social drinker and 100 randomly chosen nondrinker.

Ï Dependent samples t-test : compare two means on a singledependent variable of two matched or paired samples.

Ï Example : compare average grade for first draft and final draft

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 18 / 28

Page 19: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

What do t-test measure ? 1/2

Ï Do two independent samples differ from each othersignificantly in their average scores on some variable ?

Ï Significantly = the difference in samples’ average is largeenough to suggest a difference in population.

Ï Inferential statistics : Sample ⇒ Population.Ï What difference can I expect to observe given randomsampling error ?

Ï What is the average expected difference between the means oftwo samples of this size selected randomly ?

Ï What is the standard error of the difference between themeans ?

Ï Is our observed difference between our two sample means islarge relative to the standard error of the difference betweenthe means.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 19 / 28

Page 20: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

What do t-test measure ? 2/2

t-test tells us how likely it is to observe a difference between twovalues in a sample if no real difference actually exists in thepopulation.

Example - Earnings and drinkings1. Assume H0 is true : No difference between in earnings of

drinkers and nondrinkers in the population.2. Under H0, how likely (e.g. what is the probability) is it to

observe such a difference in group means ? ( p-value, p).3. Two cases :

Ï p < 0.05 (e.g. it is very unlikely) : reject H0 → there is adifference in earnings between drinkers and non-drinkers.

Ï p > 0.05 : We can’t reject H0.

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 20 / 28

Page 21: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Mathematics of the t-tests

We compute the t statistics :

t = observed difference between sample meansstandard error of the difference between the means

(1)

= X1 − X2√s2

x1+ s2

x2

(2)

With :Ï X1 is the mean for the first sampleÏ X2 is the mean for the second sampleÏ sx1 is the standard error of the mean for the first sampleÏ sx2 is the standard error of the mean for the second sample

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 21 / 28

Page 22: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

One-tailed vs Two-tailed tests

Two-tailed t-testsÏ H0 : X = A

Ï H1 : X 6= A

One-tailed t-testsÏ H0 : X = A

Ï H1 : X < A

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 22 / 28

Page 23: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

In Stata

Comparing differencesÏ Comparing means H0 :∆= X − Y = 0 ttestÏ Comparing proportions H0 :∆= Pr X −Pr Y = 0 prtest

ttest y, by(x)+Ï y is continuous, x is a dummyÏ use prtest if y is also a dummy (proportions test).Ï use tab, gen() to create dummies from categories variables

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 23 / 28

Page 24: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Bivariate statistics - Which procedure ?

The choice of the appropriate procedure depends on the type ofvariables you are working with

Ï Two quantitative variables ?Ï Correlation coefficient

Ï One quantitative / one qualitative variable ?Ï t-Tests

Ï Two qualitative variables ?Ï Cross tables ; Chi-square ; Cramer’s V

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 24 / 28

Page 25: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Practice

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 25 / 28

Page 26: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Practice

Opposition to Torture in Israel

What mattersÏ Age ? Gender ? Income ? Education ?Ï Religious faith ?Ï Media exposure ?

Before the sessionÏ Run setup.do (do it again !)Ï Check that all packages are installed for this session : fre,renvars, spineplot, tab_chi

Ï doedit code/week6.do

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 26 / 28

Page 27: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Annex

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 27 / 28

Page 28: Week6 SciencesPo-LouisdeCharsonville Spring2018louisdecharson.github.io/files/SRQM/week6.pdf · Week6 SciencesPo-LouisdeCharsonville Spring2018 Sciences Po - Louis de Charsonville

Sciences Po - Louis de Charsonville Statistical Reasoning Spring 2018 28 / 28