
Page 1: Probability and Statistics
(source: web.stanford.edu/class/cme001/handouts/changhan/Refresher1.pdf)

Probability and Statistics
Part 1. Probability Concepts and Limit Theorems

Chang-han Rhee
Stanford University
Sep 19, 2011 / CME001

Page 2

Outline

Probability Concepts
  Probability Space
  Random Variables
  Expectation
  Conditional Probability and Expectation

Limit Theorems
  Modes of Convergence
  Law of Large Numbers
  Central Limit Theorem

Page 4

Probability of an Event (in a random experiment)

The probability of an event is the relative frequency of the event when the random experiment is repeated many times.

e.g. coin flip, dice roll, roulette
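As a quick illustration (not part of the original slides), the relative-frequency idea can be simulated; the variable names here are illustrative:

```python
import random

random.seed(0)

# Estimate P(heads) for a fair coin as the relative frequency of heads
# over many repetitions of the experiment.
n = 100_000
heads = sum(random.random() < 0.5 for _ in range(n))
rel_freq = heads / n
assert abs(rel_freq - 0.5) < 0.01  # close to 1/2 for large n
```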

Page 5

Sample Space

The set of all possible outcomes.

- Single coin flip: Ω = {H, T}
- Two coin flips: Ω = {(H,H), (H,T), (T,H), (T,T)}
- Single die roll: Ω = {1, 2, 3, 4, 5, 6}
- Two die rolls:
  Ω = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
        (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
        (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
        (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
        (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
        (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }

Page 6

Event

A subset of the sample space.

- Single coin flip: the event that the coin lands heads
  A = {H}
- Two coin flips: the event that the first coin lands heads
  A = {(H,H), (H,T)}
- Single die roll: the event that the die shows an odd number
  A = {1, 3, 5}
- Two die rolls: the event that the sum is 4
  A = {(1,3), (2,2), (3,1)}

Page 7

Sample space: Ω = {(H,H), (H,T), (T,H), (T,T)}

Event (the first coin lands heads): {(H,H), (H,T)}

Outcome (both coins land tails): (T,T)

Page 8

Probability

Definition
A set function P is called a probability if
- 0 ≤ P(A) ≤ 1 for each event A
- P(Ω) = 1 (Unitarity)
- For each sequence A1, A2, . . . of mutually disjoint events,

  P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai)   (Countable Additivity)

Page 9

Back to the Examples

- Fair coin:
  P(∅) = 0, P({H}) = 1/2, P({T}) = 1/2, P({H, T}) = 1

- Biased coin (p ∈ [0, 1]):
  P(∅) = 0, P({H}) = p, P({T}) = 1 − p, P({H, T}) = 1
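A finite probability space can be sketched as a map from outcomes to weights; the helper name `prob` below is illustrative, not from the slides, and the check mirrors the defining properties on the biased-coin example:

```python
def prob(measure, event):
    """P(event) on a finite sample space, given outcome weights."""
    return sum(measure[w] for w in event)

p = 0.25  # biased coin (0.25 chosen so float sums are exact)
P = {"H": p, "T": 1 - p}

assert prob(P, set()) == 0             # P(empty set) = 0
assert prob(P, {"H", "T"}) == 1        # unitarity
# additivity on the disjoint events {H} and {T}
assert prob(P, {"H"}) + prob(P, {"T"}) == prob(P, {"H", "T"})
```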

Page 11

Random Variables

A random variable is a function from a sample space to the real numbers.
e.g.

- Winnings in a single coin flip:
  X(H) = 1, X(T) = −1

- First roll, second roll, and sum of two dice:
  X(i, j) = i, Y(i, j) = j, Z(i, j) = i + j
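The "random variable as a function on the sample space" view can be sketched directly (a minimal Python sketch, not from the slides):

```python
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # two-dice sample space

X = lambda w: w[0]          # first roll
Y = lambda w: w[1]          # second roll
Z = lambda w: w[0] + w[1]   # sum

# P(Z = 4) by counting outcomes (each of the 36 is equally likely)
p_z4 = sum(1 for w in omega if Z(w) == 4) / len(omega)
assert p_z4 == 3 / 36       # the outcomes (1,3), (2,2), (3,1)
```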

Page 12

Discrete Random Variables

A discrete random variable X assumes values in a discrete subset S of R.

The distribution of a discrete random variable is completely described by a probability mass function pX : R → [0, 1] such that

P(X = x) = pX(x)

e.g.
- [Bernoulli] X ∼ Ber(p) if X ∈ {0, 1} and
  P(X = 1) = 1 − P(X = 0) = p, i.e.,
  pX(1) = p and pX(0) = 1 − p

- [Binomial] X ∼ Bin(n, p) if X ∈ {0, 1, . . . , n} and
  pX(k) = C(n, k) p^k (1 − p)^(n−k)
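The binomial pmf above translates directly into code (a sketch, not from the slides; `binom_pmf` is an illustrative name):

```python
from math import comb

def binom_pmf(n, p, k):
    """pX(k) = C(n, k) p^k (1 - p)^(n - k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.25
pmf = [binom_pmf(n, p, k) for k in range(n + 1)]
assert abs(sum(pmf) - 1.0) < 1e-12   # a pmf sums to 1 over its support
assert binom_pmf(1, p, 1) == p       # Bin(1, p) is exactly Ber(p)
```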

Page 13

Continuous Random Variables

A continuous random variable X assumes values in R.

The distribution of a continuous random variable is completely described by a probability density function fX : R → R+ such that

P(a ≤ X ≤ b) = ∫_a^b fX(x) dx

e.g.
- [Uniform] X ∼ Unif(a, b), a < b, if
  fX(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise

- [Gaussian/Normal] X ∼ N(µ, σ²), µ ∈ R, σ² > 0, if
  fX(x) = (1/√(2πσ²)) e^(−(x−µ)²/(2σ²))
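To connect the density with probabilities, the defining integral can be checked numerically for the standard normal (a sketch, not from the slides; the midpoint rule and tolerance are arbitrary choices):

```python
from math import exp, pi, sqrt, erf

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density fX(x) from the slide."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

# P(-1 <= X <= 1) for X ~ N(0, 1) via a midpoint Riemann sum,
# compared with the closed form erf(1/sqrt(2)) ~ 0.6827.
n = 100_000
h = 2.0 / n
approx = sum(normal_pdf(-1 + (i + 0.5) * h) for i in range(n)) * h
exact = erf(1 / sqrt(2))
assert abs(approx - exact) < 1e-6
```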

Page 14

Probability Distribution*

Each random variable induces another probability PX : 2^R → [0, 1] on the real line through the following:

PX((−∞, x]) := P(X ≤ x)

We often denote the distribution function by FX:

FX(x) := P(X ≤ x)

[NOTATION] The right-hand sides of the previous displays are shorthand notation for the following:

P(X ≤ x) := P({ω ∈ Ω : X(ω) ≤ x})

Page 15

Note: distributions can be identical even if the supporting probability spaces are different.

e.g. (a coin flip vs. a die roll)

X(H) = 1, X(T) = −1

Y(i) = 1 if i is odd, −1 if i is even

Page 16

Joint Distribution

Two random variables X and Y induce a probability PX,Y on R²:

PX,Y((−∞, x] × (−∞, y]) = P(X ≤ x, Y ≤ y)

A collection of random variables X1, X2, . . . , Xn induces a probability PX1,…,Xn on Rⁿ:

PX1,…,Xn((−∞, x1] × · · · × (−∞, xn]) = P(X1 ≤ x1, . . . , Xn ≤ xn)

Page 17

The joint distribution of two discrete random variables X and Y assuming values in SX and SY can be completely described by a joint probability mass function pX,Y : R × R → [0, 1] such that

P(X = x, Y = y) = pX,Y(x, y)

The joint distribution of two continuous random variables X and Y can be completely described by a joint probability density function fX,Y : R × R → R+ such that

P(X ≤ x, Y ≤ y) = ∫_{−∞}^x ∫_{−∞}^y fX,Y(u, v) dv du
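A joint pmf can be tabulated explicitly; as a sketch (not from the slides), the joint pmf of the first roll X and the sum Z of two fair dice, with the marginal of X recovered by summing out z:

```python
from itertools import product
from collections import defaultdict

# Joint pmf of X = first roll and Z = sum of two fair dice.
p_joint = defaultdict(float)
for i, j in product(range(1, 7), repeat=2):
    p_joint[(i, i + j)] += 1 / 36

# Summing the joint pmf over z recovers the marginal pX(x) = 1/6.
p_x = defaultdict(float)
for (x, z), p in p_joint.items():
    p_x[x] += p

assert all(abs(p_x[x] - 1 / 6) < 1e-12 for x in range(1, 7))
```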

Page 19

Expectation

For a discrete random variable X, the expectation of X is

E[X] = Σ_{x∈S} x pX(x)

For a continuous random variable Y, the expectation of Y is

E[Y] = ∫_{−∞}^∞ y fY(y) dy
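The discrete formula is a weighted sum over the support; for a fair die roll it gives 3.5 (a sketch, not from the slides):

```python
# E[X] = sum of x * pX(x) for a fair die roll.
pmf = {x: 1 / 6 for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())
assert abs(mean - 3.5) < 1e-12  # (1 + 2 + ... + 6) / 6
```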

Page 20

Computation of Expectation

We can also compute the expectation of g(X) and g(Y) as follows:

E[g(X)] = Σ_{x∈S} g(x) pX(x)

and

E[g(Y)] = ∫_{−∞}^∞ g(y) fY(y) dy
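The discrete rule applies to any g; for g(x) = x² and a fair die roll, E[X²] = 91/6 (a sketch, not from the slides):

```python
# E[g(X)] = sum of g(x) * pX(x), with g(x) = x**2 and X a fair die roll.
pmf = {x: 1 / 6 for x in range(1, 7)}
second_moment = sum(x**2 * p for x, p in pmf.items())
assert abs(second_moment - 91 / 6) < 1e-12  # (1 + 4 + ... + 36) / 6
```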

Page 21

Properties of Expectation

- Linearity: E[aX + bY] = aE[X] + bE[Y]
- Monotonicity: X ≤ Y =⇒ E[X] ≤ E[Y]

Page 22

Probability as an Expectation

[NOTATION] We denote the indicator function of A by IA(·):

IA(ω) = 1 if ω ∈ A, 0 if ω ∉ A

Probability can be written as an expectation:

PX(A) = E[IA(X)]

More generally,

P(A) = E[IA]
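The identity P(A) = E[IA] can be checked on a finite space (a sketch, not from the slides): averaging the indicator of "the die shows an odd number" against the uniform pmf gives exactly P(A) = 1/2.

```python
# P(A) = E[I_A] on a fair die roll, with A = "odd number".
omega = range(1, 7)
indicator = lambda w: 1 if w % 2 == 1 else 0
p_a = sum(indicator(w) * (1 / 6) for w in omega)  # E[I_A]
assert abs(p_a - 0.5) < 1e-12
```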

Page 23

Summary Statistics

- Mean: E[X]

- Variance:
  var(X) = E[(X − EX)²] = E[X²] − (EX)²

- Standard deviation:
  σ(X) = √var(X)
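The two variance formulas agree, as a quick check on the fair die roll shows (a sketch, not from the slides; var(X) = 35/12 here):

```python
from math import sqrt, isclose

# var(X) two ways for a fair die roll.
pmf = {x: 1 / 6 for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())                  # 3.5
var1 = sum((x - mean)**2 * p for x, p in pmf.items())      # E[(X - EX)^2]
var2 = sum(x**2 * p for x, p in pmf.items()) - mean**2     # E[X^2] - (EX)^2
assert isclose(var1, var2) and isclose(var1, 35 / 12)
sigma = sqrt(var1)  # standard deviation
```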

Page 25

Conditional Probability

The conditional probability of A given B (with P(B) > 0) is defined as

P(A|B) = P(A ∩ B) / P(B)
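The definition can be exercised on the two-dice space (a sketch, not from the slides): conditioning the event "sum is 4" on "first roll is 1" gives (1/36)/(6/36) = 1/6.

```python
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # two fair dice
A = {w for w in omega if w[0] + w[1] == 4}     # sum is 4
B = {w for w in omega if w[0] == 1}            # first roll is 1

P = lambda E: len(E) / len(omega)              # uniform probability
p_a_given_b = P(A & B) / P(B)                  # P(A | B)
assert abs(p_a_given_b - 1 / 6) < 1e-12
```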

Page 26

Conditional Probability Mass and Density

If X and Y are both discrete random variables with joint probability mass function pX,Y(x, y),

P(X = x|Y = y) = pX|Y(x|y) := pX,Y(x, y) / pY(y)

If X and Y are both continuous random variables with joint density function fX,Y(x, y),

P(a ≤ X ≤ b|Y = y) = ∫_a^b fX|Y(x|y) dx

where

fX|Y(x|y) = fX,Y(x, y) / fY(y)

Page 27

Independence

Two events A and B are independent if

P(A ∩ B) = P(A)P(B)

Two random variables X and Y are independent if, for all x and y,

P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y)
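The product rule for events can be verified by enumeration (a sketch, not from the slides): the parities of two dice are independent, while "first roll odd" and "sum is 4" are not.

```python
from itertools import product

omega = list(product(range(1, 7), repeat=2))
P = lambda E: len(E) / len(omega)

A = {w for w in omega if w[0] % 2 == 1}   # first roll odd
B = {w for w in omega if w[1] % 2 == 1}   # second roll odd
# Independent: P(A and B) = P(A) P(B) = 1/4
assert abs(P(A & B) - P(A) * P(B)) < 1e-12

C = {w for w in omega if w[0] + w[1] == 4}  # sum is 4
# Not independent of A: the product rule fails
assert abs(P(A & C) - P(A) * P(C)) > 1e-3
```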

Page 28

Conditional Expectation: Discrete Random Variables

For discrete random variables X and Y, the conditional expectation of X given Y = y is

E[X|Y = y] = Σ_{x∈S} x pX|Y(x|y) = Σ_{x∈S} x · P(X = x, Y = y) / P(Y = y)
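On a uniform finite space this reduces to averaging X over the outcomes consistent with the conditioning event (a sketch, not from the slides): conditioning the first roll on "sum is 4" gives E[X | Z = 4] = (1 + 2 + 3)/3 = 2.

```python
from itertools import product

omega = list(product(range(1, 7), repeat=2))
# E[X | Z = 4], X = first roll, Z = sum; outcomes are equally likely,
# so the pmf-weighted sum is a plain average over the matching outcomes.
matches = [w for w in omega if w[0] + w[1] == 4]   # (1,3), (2,2), (3,1)
e_x_given_z4 = sum(w[0] for w in matches) / len(matches)
assert e_x_given_z4 == 2.0
```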

Page 29

Conditional Expectation: Continuous Random Variables

For continuous random variables X and Y, the conditional expectation of X given Y = y is

E[X|Y = y] = ∫_{−∞}^∞ x fX|Y(x|y) dx = ∫_{−∞}^∞ x · fX,Y(x, y)/fY(y) dx

Page 30

Properties of Conditional Expectation

- Linearity: E[aX + bY|Z] = aE[X|Z] + bE[Y|Z]
- Monotonicity: X ≤ Y =⇒ E[X|Z] ≤ E[Y|Z]

Page 32

Almost Sure Convergence

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges almost surely to X∞ as n → ∞ if

P(Xn → X∞ as n → ∞) = 1

We use the notation Xn →a.s. X∞ to denote almost sure convergence, or convergence with probability 1.

Page 33

Lp Convergence

[NOTATION] For p > 0, we denote the p-norm of X by ∥X∥p:

∥X∥p := (E|X|^p)^(1/p)

Let X1, X2, . . . be a sequence of random variables. For p > 0, we say that Xn converges to X∞ in pth mean if

∥Xn − X∞∥p → 0 as n → ∞.

We use the notation Xn →Lp X∞ to denote convergence in pth mean, or Lp convergence.

Page 34

Convergence in Probability

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges in probability to X∞ if, for each ϵ > 0,

P(|Xn − X∞| > ϵ) → 0 as n → ∞.

We use the notation Xn →p X∞ to denote convergence in probability.

Page 35

Weak Convergence

Let X1, X2, . . . be a sequence of random variables. We say that Xn converges weakly to X∞ if

P(Xn ≤ x) → P(X∞ ≤ x) as n → ∞

for each x at which P(X∞ ≤ ·) is continuous.

We use the notation Xn ⇒ X∞ or Xn →D X∞ to denote weak convergence, also called convergence in distribution.

Page 36

Implications

Almost sure convergence =⇒ convergence in probability
Lp convergence =⇒ convergence in probability
Convergence in probability =⇒ weak convergence

Page 38

Weak Law of Large Numbers

Theorem (Weak Law of Large Numbers)
Suppose that X1, X2, · · · is a sequence of i.i.d. random variables such that E|X1| < ∞. Then,

(1/n)(X1 + · · · + Xn) →p E[X1] as n → ∞
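The law of large numbers is easy to see in simulation (a sketch, not from the slides; the seed and tolerance are arbitrary): the sample mean of i.i.d. Ber(1/2) flips settles near E[X1] = 1/2.

```python
import random

random.seed(0)

# Sample mean of n i.i.d. Ber(0.5) variables approaches E[X1] = 0.5.
n = 100_000
xs = [1 if random.random() < 0.5 else 0 for _ in range(n)]
sample_mean = sum(xs) / n
assert abs(sample_mean - 0.5) < 0.01
```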

Page 39

Strong Law of Large Numbers

Theorem (Strong Law of Large Numbers)
Suppose that X1, X2, · · · is a sequence of i.i.d. random variables such that E[X1] exists. Then,

(1/n)(X1 + · · · + Xn) →a.s. E[X1] as n → ∞

Page 41

Central Limit Theorem

Theorem
Suppose that the Xi's are i.i.d. random variables with common finite variance σ². Then, if Sn = X1 + · · · + Xn,

(Sn − nE[X1]) / √n ⇒ σ N(0, 1) as n → ∞.

From here, we can deduce the following approximation:

(1/n)Sn − E[X1] ≈D (σ/√n) N(0, 1)
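The theorem can be illustrated numerically (a sketch, not from the slides; the sample sizes, seed, and tolerance are arbitrary choices): standardized sums of i.i.d. Unif(0, 1) variables have an empirical CDF close to the standard normal CDF.

```python
import random
from math import sqrt, erf

random.seed(0)

# Standardized sums of i.i.d. Unif(0,1): (S_n - n*mu) / (sigma*sqrt(n)) -> N(0,1).
mu, sigma = 0.5, sqrt(1 / 12)
n, reps = 100, 5_000
zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n * mu) / (sigma * sqrt(n)))

# Empirical CDF at 1 should be near Phi(1) = 0.5*(1 + erf(1/sqrt(2))) ~ 0.84.
phi1 = 0.5 * (1 + erf(1 / sqrt(2)))
frac = sum(1 for z in zs if z <= 1) / reps
assert abs(frac - phi1) < 0.03
```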