introduction to econometrics · introduction to econometrics lecture 2: review of basic statistics...

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Introduction to EconometricsLecture 2: Review of Basic Statistics & Randomized Controlled

Trials(RCTs)

Zhaopeng Qu

Business School,Nanjing University

Sep. 24th, 2020

Zhaopeng Qu (Nanjing University) Introduction to Econometrics Sep. 24th, 2020 1 / 106

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Outlines

1 An Brief Review of Basic Statistics

2 Review: Random Experiment as the Research Design

3 What is an RCT?

4 Assuming Case: the California School

5 Limitations of RCTs

6 An Example of Randomized Controlled Trials


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics

An Brief Review of Basic Statistics


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics Basic Concepts

Population and Sample(总体与样本）

A population is a collection of people, items, or events aboutwhich you want to make inferences.

Population always have a probability distribution.A sample is a subset of population, which draw from populationin a certain way.

The sample could also follow a probability distribution.To represent the population well, a sample should be randomlycollected and adequately large.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Random Sample(随机样本) and i.i.d(独立同分布）

DefinitionThe r.v.s are called a random sample of size n from the populationf(x) if X1, ...,Xn are mutually independent and have the samep.d.f/p.m.f f(x). Alternatively, X1, ...,Xn are called independent,and identically distributed random variables(r.v.s) with p.d.f/p.m.f, commonly abbreviated to i.i.d. r.v.s.

eg. Random sample of n respondents in a survey.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Statistic(统计量) and Sampling Distribution

DefinitionX1, ...,Xn is a random sample of size n from the population f(x). Astatistic(T) is a real-valued or vector-valued function fully dependedon X1, ...,Xn, thus

T = T(X1, ...,Xn)

A statistic is only a function of the sample (统计量是样本的函数).The probability distribution of a statistic T is called thesampling distribution (抽样分布）of T.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Sample Mean(样本均值) and Sample Variance(样本方差）

Two common and important estimators

DefinitionThe sample average or sample mean, X, of the n observationsX1, ...,Xn is

X̄ = 1n(X1 + X2 + ...+ Xn) =1

n

n∑i=1

Xi

Accordingly, the sample variance is

S2 = 1n − 1

n∑i=1

(Xi − X)2


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Sample Mean(样本均值) and Sample Variance(样本方差）

As we know that if Xi is a random variable(r.v.), then f(Xi),which is a function of Xi, is also a r.v..(随机变量的函数还是随机变量）So if Xi is a r.v., then

∑Xi is also a r.v..

the sample mean and the sample variance are alsofunctions of sums, therefore they are a r.v.s too.there are some certain probability functions which candescribe distributions of the sample mean and the samplevariance.Then naturally ask a question: what are the expectation,variance or p.d.f./c.d.f. of these distributions?


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


A simple case of sample meanLet {X1, ...,Xn} ∈ [1, 100] , assume n = 2, thus only X1and X2


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics Large-Sample Approximations to Sampling Distributions

Sampling Distributions

There are two approaches to characterizing samplingdistributions:

exact/finite sample distribution: The samplingdistribution that exactly describes the distribution of X forany n is called the exact/finite sample distribution of X.approximate/asymptotic distribution: when the samplesize n is large, the sample distribution approximates to acertain distribution function.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Two Key Tools

Two key tools used to approximate sampling distributions whenthe sample size is large, thus assume that n → ∞

The Law of Large Numbers(L.L.N.): when the sample sizeis large, X will be close to µY , the population mean withvery high probability.The Central Limit Theorem(C.L.T.): when the samplesize is large, the sampling distribution of the standardizedsample average,(Y−µY)/σY , is approximately normal.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Convergence in probability(概率收敛）

DefinitionLet X1, ...,Xn be an random variables or sequence, is said to convergein probability to a value b if for every ε > 0,

P(| Xn − b |> ε) → 0

as n → ∞. We denote this as Xnp−→ b or plim(Xn) = b.

It is similar to the concept of a limitation in a probability way.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Law of Large Numbers(大数定律）

TheoremLet X1, ...,Xn be an i.i.d draws from a distribution with mean µ andfinite variance σ2(a population) and X = 1n

∑ni=1 Xi is the sample

mean, thenX p−→ µ

Intuition: the distribution of Xn “collapses”on µ.直观解释：抽样样本量越大，样本平均值越接近总体平均值（抽样分布更紧凑）


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


A simple case

ExampleSuppose X has a Bernoulli distribution if it have a binary valuesX ∈ {0, 1} and its probability mass function is

P(X = x) ={0.78 if x = 10.22 if x = 0

then E(X) = p = 0.78 and Var(X) = p(1− p) = 0.1716.


. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Convergence in Distribution(分布收敛)DefinitionLet X1,X2,... be a sequence of r.v.s, and for n = 1, 2, .... Let Fn(x)be the c.d.f of Xn. Then it is said that X1,X2, ...converges indistribution to r.v. W with c.d.f, FW, if

limn�∞Fn(x) = FW(x)

which we write as Xn d−→ W.

Basically: when n is big, the distribution of Xn is very similar tothe distribution of W.Standardize: by subtracting its expectation and dividing by itsstandard deviation

Z = X − E[X]√Var[X]


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Central Limit Theorem(中心极限定理)TheoremLet X1, ...,Xn be an i.i.d draws from a distribution with sample size nwith mean µ and 0 < σ2 < ∞, then

Xn − µσ/√n

d∼ N(0, 1)

Because we don’t have to make any specific assumption aboutthe distribution of Xi, so whatever the distribution of Xi, when nis big,

the standardized Xn d∼ N(0, 1)or Xn d∼ N(µ, σ

2

n )

直观理解：选取的样本量越大，样本均值的分布越趋于正态分布。


. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


How large is “large enough”?

How large is large enough ?how large must n be for the distribution of Y to beapproximately normal?

The answer: it depends.if Yi are themselves normally distributed, then Y is exactlynormally distributed for all n.if Yi themselves have a distribution that is far from normal,then this approximation can require n = 30 or even more.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics Statistical Inference: Estimation, Confident Intervals and Testing

Statistical Inference: from Samples to Population

InferenceWhat is our best guess about some quantity of interest?What are a set of plausible values of the quantity of interest?

Our focus: {Y1,Y2, ...,Yn} are i.i.d. draws from f(y) or F(Y),thus population distribution.Statistical inference is using samples to infer f(y).

Normally, we don’t need to know everything of thepopulation, just some measures (the moment) enough todescribe the characteristics of the population.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Statistical Inference: Point estimation

Point estimation: providing a single “best guess”as to thevalue of some fixed, unknown quantity of interest, θ, which is is afeature of the population distribution, f(y).

µ =?E[Y]σ2 =?Var[Y]µy − µx? = E[Y]− E[X]


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Three Characteristics of an EstimatorLet µ̂Y denote the some estimation value of the populationmoment, µY and E(µ̂Y) is the mean of the sampling distributionof µ̂Y,

1 Unbiasedness: the estimator of µY is unbiased ifE(µ̂Y) = µY

2 Consistency:the estimator of µY is consistent ifµ̂Y

p−→ µY3 Efficiency:Let µ̃Y be another estimator of µY and suppose that

both µ̃Y and µ̂Y are unbiased.Then µ̂Y is said to be moreefficient than µ̂Y

var(µ̂Y) < var(µ̃Y)Comparing variances is difficult if we do not restrict ourattention to unbiased estimators because we could alwaysuse a trivial estimator with variance zero that is biased.Zhaopeng Qu (Nanjing University) Introduction to Econometrics Sep. 24th, 2020 23 / 106

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Three Characteristics of an Estimator


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.




...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Properties of the Sample Mean

Let µY and σ2Y denote the mean and variance of Y,(总体的均值和方差）Let Y = 1n

∑ni=1 Yi is the sample mean of Yi(样本均值）

Then the expectation of the sample mean(样本均值的期望) is

E(Y) = 1n

n∑i=1

E(Yi) = µY

so Y is an unbiased estimator of µY.Based on the L.L.N., Y p−→ µY, so Y is also consistent.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.



The Variance of sample mean(样本均值的方差）

Var(Y) = var(1

n

n∑i=1

Yi

)=

1

n2n∑

i=1Var(Yi) =

σ2Yn

Then, the Standard Deviation of the sample mean is σY = σY√n


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.



Follow the C.L.T, the

Y d∼ N(µY,σ2Yn )

And let Z be the standardized Y, then

Z = Y − µYσY√n

d∼ N(0, 1)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Properties of the Sample Variance

Let µY and σ2Y denote the mean and variance of Yi , then thesample variance：

S2Y =1

n − 1

n∑i=1

(Yi − Y)2

Then it is easy to prove that1 E(S2Y) = σ2Y, thus S2 is an unbiased estimator of σ2Y which is

also the reason why the average uses the divisor n − 1instead of n.

2 S2YP−→ σ2Y, thus the sample variance is a consistent

estimator of the population variance.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Standard Error

Recall: the standardized sample mean will be approximatelyfollow a standard normal distribution when n is large.

Z = Y − µYσY√n

d∼ N(0, 1)

But in general σY, the standard deviation of population isunknown, so we have to use sample to estimate it.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Standard Error

Let σY = σY√n , because S2Y is an unbiased and consistentestimator of the σ2Y, then we can use SY√n as an estimator of thestandard deviation of the sample mean, σY.It is called the standard error(标准误）of the sample mean

SE[Y] = σ̂Y =SY√

n

Equivalence to the“standard deviation”（标准差）of the sampledistribution which measures the deviations of the sample mean.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Application: Sample Size and Standard Error(Population)

Population ∼ N(0, SD)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Sample Size and Standard Error(sample n=250)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Sample Size and Standard Error(n=500)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Sample Size and Standard Error(n=1000)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Recall: The Chi-Square Distribution

Let Zi(i = 1, 2, ...,m) be independent random variables, eachdistributed as standard normal. Then a new random variablecan be defined as the sum of the squares of Zi :

X =m∑

i=1Z2i

Then X has a chi-squared distribution with m degrees offreedom.Then, it can be prove that a variation of the sample variance willfollow a Chi-Square distribution

(n − 1)S2Yσ2

∼ χ2n−1


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Student-t Distribution

The Student t distribution can be obtained from a standardnormal and a chi-square random variable.Let Z have a standard normal distribution, let X have achi-square distribution with n degrees of freedom and assumethat Z and X are independent. Then the random variable

T = Z√X/n

has has a t-distribution with n degrees of freedom, denoted asT ∼ tn.Then, the Z will follow a student t distribution.

Z = Y − µYSY√n

∼ t(n − 1)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Student-t Distribution

It does not matter alot in the large sample.As the degrees offreedom get largewhich is highlycorrelated with thesample size n, thet-distribution actuallyapproaches thestandard normaldistribution.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics Interval Estimation and Confidence Intervals

Interval Estimation

A point estimate provides no information about how close theestimate is likely to be to the population parameter.We cannot know how close an estimate for a particular sample isto the population parameter because the population is neverunknown.A different (complementary) approach to estimation is toproduce a range of values that will contain the truth with somefixed probability.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


What is a Confidence Interval?

DefinitionA 100(1− α)% confidence interval for a population parameter θ is aninterval Cn = (a, b) , where a = a(Y1, ...,Yn) and b = b(Y1, ...,Yn)are functions of the data such that

P(a < θ < b) = 1− α

In general, this confidence level is 1− α (置信水平) ; where αis called significance level.(显著性水平)The key is how to obtain or construct the values of a and b.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Interval Estimation and Confidence IntervalsSuppose the population has a normal distribution N(µ, σ2) andlet Y1,Y2, ...,Yn be a random sample from the population.

Then the sample mean Y has a normal distribution:Y ∼ N(µ, σ2n )The standardized sample mean Z is given by:Z = Y−µσ/√n ∼ N(0, 1)

Then let θ = Z, then P(a < θ < b) = 1− α turns into

a < Y − µσ/√n

< b

then it follows thatP(Y − aσ/√n < µ < Y + bσ/√n) = 1− α

Thus the random interval contains the population mean with aprobability 1− α.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Interval Estimation and Confidence IntervalsTwo cases: σ is known and unknownWhen σ is known, for example,σ = 1, thus Y ∼ N(µ, 1),

Y ∼ N(µ, σ2

n =1

n)

From this, we can standardize Y, and, because the standardizedversion of Y has a standard normal distribution, and we let α = 0.05,then we have

P(−1.96 < Y − µ1/√

n< 1.96) = 1− 0.05

The event in parentheses is identical to the eventY − 1.96/√n ≤ µ ≤ Y + 1.96/√n, so

P(Y − 1.96/√n ≤ µ ≤ Y + 1.96/√n) = 0.95The interval estimate of µ may be written as[Y − 1.96/√n,Y + 1.96/√n]


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Interval Estimation and Confidence Intervals

When σ is unknown, we could use an estimate of σ,thusSE[Y] = σ̂Y = SY√n , the standard error, replacing unknown σthus

Zt =Y − µY

SY√n

=Y − µYSE(Y)

We just suggested that it follows a student t distribution.

Zt ∼ tn−1

DefinitionThe t-statistic or t-ratio:

Y − µYSE(Y)

∼ tn−1


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.



To construct a 95% confidence interval, let c denote the 97.5thpercentile in the tn−1 distribution.

P(−c < t ≤ c) = 0.95

where cα/2 is the critical value of the t distribution.The condence interval may be written as [Y ± cα/2S/√n]


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.




...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


A simple rule of thumb for a 95% confidence interval

Because as the degrees of freedom get large which is highlycorrelated with the sample size n, the t-distribution approachesthe standard normal distribution.And Φ(1.96) = 0.975, so a rule of thumb for an approximate95% confidence interval is

[Y ± 1.96× SE(Y)]

Or[Y ± 2× SE(Y)]


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Brief Review of Basic Statistics Hypothesis Testing(假设检验）

Hypothesis TestingDefinitionA hypothesis is a statement about a population parameter, thus θ.Formally, we want to test whether is significantly different from acertain value µ0

H0 : θ = µ0which is called null hypothesis. The alternativehypothesis(two-sided) is

H1 : θ ̸= µ0

If the value µ0 does not lie within the calculated confidenceinterval, then we reject the null hypothesis.If the value µ0 lie within the calculated confidence interval, thenwe fail to reject the null hypothesis.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Introduction

In criminal law, institutions in most countries follow the rule:“innocent until proven guilty”(疑罪从无)

The prosecutor wants to prove their hypothesis that theaccused person is guilty.However, the burden is on the prosecutor to show guilt.The jury or judge starts with the “null hypothesis”thatthe accused person is innocent.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Introduction

In program evaluations,instead of “presumption of innocence,”the rule is: “presumption of insignificance”Policymaker’s hypothesis: the program improves learning.Evaluators approach experiments using the hypothesis:

there is zero impact of the programThen we test this “null hypothesis”

The burden of proof is on the programit should show a statistically significant impact.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Two Type Errors(两种错误）In both cases, there is a certain risk that our conclusion is wrong

Type I ErrorA Type I error is when we reject the null hypothesis when it is in facttrue.A Type II error is when we fail to reject the null hypothesis when it isfalse.

In criminal trialThe Type I : the judge reject the null hypothesis when thesuspect is actually no guilty.“宁可错杀一千，不能放过一个”

The Type II: the judge fail to reject the null hypothesiswhen the suspect is actually guilty.“宁可放过一千，不能错杀一个”


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Significance level(显著性水平)

DefinitionThe significance level or size of a test is the maximum probabilityfor the Type I Error

P(Type I error) = P(reject H0 | H0is true) = α

Usually, we has to carry the "burden of proof,"We would like to prove that the assertion of H1 is true byshowing that the data rejects H0.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Testing procedure

The following are the steps of the hypothesis testing:1 Specify H0 and H1.2 Choose the significance level α.3 Define a decision rule (critical value).4 Given the data compute the test statistic and see if it falls

into the critical region.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Decision Rule

The decision rule that leads us to reject or not to reject H0 isbased on a test statistic, which is a function of the data

Tn = T(Y1, ...,Yn)

Usually, one rejects H0 if the test statistic falls into a criticalregion(rejection region). A critical region is constructed bytaking into account the probability of making wrongdecisions,thus α.By convention, α is chosen to be a small number, for example,α = 0.01, 0.05, or 0.10


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


P-Value

To provide additional information, we could ask the question:What is the largest significance level at which we could carry outthe test and still fail to reject the null hypothesis?Or in other word, given the data, the smallest significance levelat which the null can be rejected.We can consider the p-value of a test

1 Calculate the t-statistic t2 The largest significance level at which we would fail to reject

H0 is the significance level associated with using t as ourcritical value

p − value = 1− Φ(t)where Φ(t) denotes the standard normal c.d.f.(we assumethat n is large enough)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


P-ValueSuppose that t = 1.52, then we can find the largest significance levelat which we would fail to reject H0

p − value = P(T > 1.52 | H0) = 1− Φ(1.52) = 0.065


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Hypothesis Test of of Ȳ

Specify H0 and H1

H0 : E[Y] = µY,0 H1 : E[Y] ̸= µY,0

Choose the significance level α and define a decision rule (criticalregion or critical value)

eg. if we choose α = 0.05, then the critical value is 1.96,then the region is (−∞,−1.96] and [1.96,+∞)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.




...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.



Given the data compute the test statisticStep1: Compute the sample average ȲStep2: Compute the standard error of Ȳ

SE(Y) = sY√n

Step3: Compute the t-statistic

tact = Ȳ − µY,0SE(Ȳ)

Step4: Reject the null hypothesis if| tact |> critical valueor if p − value < significance level


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Review: Random Experiment as the Research Design



...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Recall the last lecture

The Core of Empirical Studies: Causality v.s. ForecastingThe Central Question of Causality?

Rubin Causal Model: comparing counterfactuals or potentialoutcomes.However, we can never observe both counterfactuals —fundamental problem of causal inference.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Recall the last lecture

Random Assignment Solves the Selection Problem.We should treat experimental design as a Benchmark.To construct the counterfactuals, two broad categories ofempirical strategies.

Random Controlled Trials/Experiments:it can eliminates selection bias which is the mostimportant bias arises in empirical research. If we couldobserve the counterfactual directly, then there is noevaluation problem, just simply difference.

Program Evaluation Econometrics:The various approaches using naturally-occurring dataprovide alternative methods of constructing the propercounterfactual.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

What is an RCT?


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

Randomized Controlled Trials(RCTs)(随机可控试验）

In essence, an RCT is an experiment carried out on two ormore groups where participants are randomly assigned toreceive an intervention or not.

Participants are randomly assigned to either an treatmentgroup who are given the intervention, or a control groupwho are not..

In RCTs, each group is tested at the end of the trial and theresults from the groups are compared to see if the interventionhas made a difference and achieved its desired outcome. If therandomized groups are large enough, you can be confident thatdifferences observed are due to the intervention and not someother factor.RCTs are considered the gold standard for establishing a causallink between an intervention and change.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in History

First recorded RCTwas done in 1747 byJames Lind,who wasa Scottish physician inthe Royal Navy.Scurvy(败血症) is aterrible disease causedby Vitamin Cdeficiency. Seriousissue during long seavoyages.Lind took 12 sailorswith scurvy and splitthem into six groups oftwo.Groups were assigned:

(1) 1 qt cider(苹果酒) (2) 25 dropsof vitriol(硫酸）(3) 6 spoonfuls ofvinegar, (4) 1/2 ptof sea water, (5)garlic, mustard(芥末)and barleywater（大麦汤）,(6) 2 oranges and1 lemon


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in History

Only Group 6 (citrus fruit) showed substantial improvement.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in History

Ronald A.Fisher(1890-1962),Britishstatistician and geneticist whopioneered the application ofstatistical procedures to the design ofscientific experiments.“a genius who almostsingle-handedly created thefoundations for modernstatistical science”.“Rothamsted Experimental

Station”


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCTs in Economics


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

Randomized Experimental Methods: Noble Prize 2019


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCTs in Social Policies

According to Baruch (1978), 245 randomized field experimentshad been conducted in U.S for social policies evaluations up to1978.The huge effort has been prompted by the 1% part of everysocial budget devoted to evaluation.Some of them were ambitious and very costly, and affecteddifferent kind of policies.

the Perry Preschool Program in 1961The Rand Health Insurance Experiment from 1974-1982.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

Education: the Perry Preschool Program

123 children born between 1958 and 1962 in MichiganHalf of them (drawn at random) entered the perry schoolprogram at 3 or 4 years old.Education by skilled professionals in nurseries and kindergarten.Program duration circle 30 weeksfollow-up survey (age : 14, 15, 19, 27 and 40 years old)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

Health Care: The Rand Health Insurance Experiment

5809 people randomly assigned in 1974 to different insuranceprograms with 0%, 25%, 50% and 75% sharing.They were followed until 1982.Main results : paying a portion of health cost make people giveup some “superfluous”cares, with little harm on their health.But some heterogeneity : not true for poor people.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCTs in China

“One egg a day”program in rural China by REAP at Stanford.One egg a day

“Free-lunch”program in primary schools at Western China.Free Lunch


https://reap.fsi.stanford.edu/research/egg_programs_and_nutritional_deficiencies_in_gansu_worth_the_efforthttp://www.mianfeiwucan.org/

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in Business

An interesting question: What is the optimal color for taxis?Ho, Chong and Xia(2017), Yellow taxis have fewer accidentsthan blue taxis because yellow is more visible than blue,PNAS.


https://www.pnas.org/content/early/2017/02/28/1612551114https://www.pnas.org/content/early/2017/02/28/1612551114https://www.pnas.org/content/early/2017/02/28/1612551114

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in Business


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

RCT in Business

Another Critical Question for business: Is Working at Home isbetter than Working at Office?

Bloom, Liang, Roberts and Ying,(2015), “Does Working from HomeWork? Evidence from a Chinese Experiment”, The Quarterly Journalof Economics


https://academic.oup.com/qje/article-abstract/130/1/165/2337855?redirectedFrom=fulltexthttps://academic.oup.com/qje/article-abstract/130/1/165/2337855?redirectedFrom=fulltexthttps://academic.oup.com/qje/article-abstract/130/1/165/2337855?redirectedFrom=fulltext

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

What is an RCT?

Types of RCTs

Lab Experimentseg: students evolves a experiment in a classroom.eg: computer game for gamble in Lab

Field Experimentseg: the role of women in household’s decision or fakeresumes in job application

Quasi-Experiment or Natural Experiments: some unexpectedinstitutional change or natural shock

eg: Germany Reunion(德国统一), Great Famine inChina(1959-1961 年大饥荒）and U.S Bombing inVietnam(美国轰炸越南).


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Assuming Case: the California School

Assuming Case: the California School


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Assuming Case: the California School An assuming case: the California School

A Case: the California School

Draw schools (n = 420) randomly from all school in CaliforniaVariables:

5th grade test scores (Stanford-9 achievement test, combined mathand reading), district averageStudent-teacher ratio (STR) = no. of students in the district dividedby no. full-time equivalent teachers


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Summary Table: Descriptive Statistics

Does this table tell us anything about the relationship between testscores and the STR?


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Scatterplot: test score v. student-teacher ratio

What does this figure show? and it may suggest...?


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The California Test Score

We need to get some numerical evidence on whether districts withlow STRs have higher test scores.But how?

1 Compare average test scores in districts with low STRs to those withhigh STRs (“estimation”)

2 Test the “null”hypothesis that the mean test scores in the two typesof districts are the same, against the “alternative”hypothesis thatthey differ (“hypothesis testing”)

3 Estimate an interval for the difference in the mean test scores, high v.low STR districts (“confidence interval”)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The California Test Score

Compare districts with “small”and “large”class sizes:

Small v.s. LargeClass Size Average score(Y) Standard deviation N

Small(STR < 20) 657.4 19.4 238Large(STR ⩾ 20) 650.0 17.9 182

1 Estimation of ∆= difference between group means2 Test the hypothesis that ∆ = 03 Construct a confidence interval for ∆


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Assuming Case: the California School Comparing Means from Different Populations

An Example: Comparing Means from Different Populations

In an RCT, we would like to estimate the average causal effectsover the population

ATE = ATT = E{Yi(1)− Yi(0)}

We only have random samples and random assignment totreatment, then what we can estimate instead

difference in mean = Ytreated − Ycontrol

Under randomization, difference-in-means is a good estimate forthe ATE.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Hypothesis Tests for the Difference Between Two Means

To illustrate a test for the difference between two means, let µwbe the mean hourly earning in the population of women recentlygraduated from college and let µm be the population mean forrecently graduated men.Then the null hypothesis and the two-sided alternativehypothesis are

H0 : µm = µwH1 : µm ̸= µw

Consider the null hypothesis that mean earnings for these twopopulations differ by a certain amount, say d0. The nullhypothesis that men and women in these populations have thesame mean earnings corresponds to H0 : H0 : d0 = µm − µw = 0


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Difference Between Two MeansSuppose we have samples of nm men and nw women drawn atrandom from their populations. Let the sample average annualearnings be Ym for men and Yw for women. Then an estimatorof µm − µw is Ym − Yw .Let us discuss the distribution of Ym − Yw .

∼ N(µm − µw,σ2mnm

+σ2wnw

)

if σ2mand σ2w are known, then the this approximate normaldistribution can be used to compute p-values for the test of thenull hypothesis. In practice, however, these population variancesare typically unknown so they must be estimated.Thus the standard error of Ym − Yw is

SE(Ym − Yw) =

√s2mnm

+s2wnw


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Difference Between Two Means

The t-statistic for testing the null hypothesis is constructedanalogously to the t-statistic for testing a hypothesis about asingle population mean, thus t-statistic for comparing two meansis

tact =Ym − Yw − d0SE(Ym − Yw)

If both nmand nm are large, then this t-statistic has a standardnormal distribution when the null hypothesis is true,thusYm − Yw = 0.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Confidence Intervals for the Difference Between TwoMeans

the 95% two-sided confidence interval for d consists of thosevalues of d within ±1.96 standard errors of Ym − Yw , thusd = µm − µw is

(Ym − Yw)± 1.96SE(Ym − Yw)


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Hypothesis Test of the Difference Between Two Means

Reject the null hypothesis if| tact |=| Ym−Yw−d0SE(Ym−Yw) |> critical valueor if p − value < significance level


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Limitations of RCTs

Limitations of RCTs


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Limitations of RCTs

RCT are far from perfect!

High Costs, Long DurationPotential Ethical Problems: “Parachutes reduce the risk ofinjury after gravitational challenge, but their effectiveness has notbeen proved with randomized controlled trials."

Milgram ExperimentStanford Prison ExperimentMonkey Experiment

Limited GeneralizabilityRCTs allow us to gain knowledge about causal effects butwithout knowing the mechanism.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Limitations of RCTs

Potential Problems in Practice

Small sample: Student EffectHawthorne effect(霍桑效应）：The subjects are in an experimentcan change their behavior.Attrition（样本流失）：It refers to subjects dropping out of thestudy after being randomly assigned to the treatment or controlgroup.Failure to randomize or failure to follow treatment protocol:People don’t always do what they are told.

Wearing glasses program in Western Rural China.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

Limitations of RCTs

Program Evaluation Econometrics(项目评估计量经济学）Question: How to do empirical research scientifically when wecan not do experiments? It means that we always have selectionbias in our data, or in term of “endogeneity”.Answer: Build a reasonable counterfactual world by naturallyoccurring data to find a proper control group is the core ofeconometrical methods.Here you Furious Seven Weapons in Applied Econometrics(七种盖世武器）

1 Random Controlled Trials (RCT)（随机实验）2 OLS(最小二乘回归）3 Decomposition（分解）4 Instrumental Variable（工具变量）5 Differences in Differences（双差分）6 Matching and Propensity Score（匹配）7 Regression Discontinuity（断点回归）8 Synthetic Control （合成控制法）


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

An Example of Randomized Controlled Trials



...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Working from Home(WFH) v.s Working from Office

“Does Working from Home Work? Evidence from a ChineseExperiment”,by Nicholas A. Bloom, James Liang, John Roberts,Zhichun Jenny Ying The Quarterly Journal ofEconomics,February 2015, Vol. 130, Issue 1, Pages 165-218.Basic Question: WFH = SFH

SFH(Shirking from Home)?


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Working from Home(WFH) is a trend internationally


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Motivations

Working from home is a modern management practice whichappears to be stochastically spreading in the US and Europe20 million people in US report working from home at least onceper weekLittle evidence on the effect of workplace flexibility

productivityemployee satisfactionshirking


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Ctrip Experiment

Ctrip, China’s largest travel-agent, with16,000 employees, $6bnNASDAQ.Co-founder of Ctrip, James Liang, was an Econ PhD atStanford and decided to run a experiment to test WFH.


http://www.ctrip.comhttp://www.itb-china.com/speaker/mr-james-jianzhang-liang/

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Ctrip Experiment: A call center in ShanghaiThe experiment runs on airfare & hotel departments in Shanghai.Main Work: Employees take calls and make bookings.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Experimental Design

Treatment: work 4 shifts (days) a week at home and to workthe 5th shift in the office on a fixed day.Control: work in the office on all 5 days.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Experimental Design: Timeline

In early November 2010, employees in the airfare and hotelbooking departments were informed of the WFH program.Of the 994 employees in the airfare and hotel bookingdepartments, 503 (51%) volunteered for the experiment.Among the volunteers, 249 (50%) of the employees met theeligibility requirements and were recruited into the experiment.The treatment and control groups were then determined fromthis group of 249 employees through a public lottery.


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


The Experimental Design


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Results: the number of receiving calls


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Results: Working hours


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Results:Many Outcomes


...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.

...

.


Conclusion: Very positive

They found a highly significant 13% increase in employeeperformance from WFH,

of which about 9% was from employees working moreminutes of their shift period (fewer breaks and sick days)and about 4% from higher performance per minute.

Home workers also reported substantially higher work satisfactionand psychological attitude scores, and their job attrition rates fellby over 50%.


An Brief Review of Basic Statistics Basic ConceptsLarge-Sample Approximations to Sampling DistributionsStatistical Inference: Estimation, Confident Intervals and Testing Interval Estimation and Confidence IntervalsHypothesis Testing(假设检验）

Review: Random Experiment as the Research DesignWhat is an RCT?Assuming Case: the California SchoolAn assuming case: the California SchoolComparing Means from Different Populations

Limitations of RCTsAn Example of Randomized Controlled Trials

introduction to econometrics · introduction to econometrics lecture 2: review of basic statistics...

Documents