

STA347 - week 5 1

More on Distribution Function

• The distribution of a random variable X can be determined directly from its cumulative distribution function FX .

• Theorem: Let X be any random variable with cumulative distribution function FX, and let B be any subset of the real numbers. Then P(X ∈ B) can be determined solely from the values of FX(x).

• Proof:


STA347 - week 5 2

Expectation

• In the long run, what average result do you expect when rolling a die repeatedly?

• In 6,000,000 rolls expect about 1,000,000 1's, 1,000,000 2's, etc.

Average is (1·1,000,000 + 2·1,000,000 + … + 6·1,000,000) / 6,000,000 = 3.5.

• For a random variable X, the Expectation (or expected value or mean) of X is the expected average value of X in the long run.

• Symbols: μ, μX, E(X) and EX.
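As a quick illustration (my own sketch, not from the slides), simulating a large number of die rolls shows the long-run average settling near 3.5:

```python
# Simulating many die rolls: the running average settles near E(X) = 3.5.
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 600_000
average = sum(random.randint(1, 6) for _ in range(n)) / n

assert abs(average - 3.5) < 0.05  # law of large numbers in action
```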


STA347 - week 5 3

Expectation of Discrete Random Variable

• For a discrete random variable X with pmf pX(x),

E(X) = Σx x · pX(x)

whenever the sum converges absolutely.


STA347 - week 5 4

Examples

1) Roll a die. Let X = outcome on 1 roll. Then E(X) = 3.5.

2) Bernoulli trials: X ~ Bernoulli(p). Then E(X) = p.

3) X ~ Binomial(n, p). Then E(X) = np.

4) X ~ Geometric(p). Then E(X) = 1/p.

5) X ~ Poisson(λ). Then E(X) = λ.
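These closed-form means can be checked numerically by summing x·pX(x) term by term. The parameter values below (n = 10, p = 0.3, λ = 4) are arbitrary choices for illustration:

```python
# Check E(X) = np, 1/p and lam for Binomial, Geometric and Poisson by
# summing x * pmf(x) directly (tails truncated where they are negligible).
from math import comb, exp

n, p, lam = 10, 0.3, 4.0
q = 1 - p

# Binomial(n, p): finite sum, exact up to rounding.
e_binom = sum(x * comb(n, x) * p**x * q**(n - x) for x in range(n + 1))

# Geometric(p) on 1, 2, ...: truncate at 500; q^499 is astronomically small.
e_geom = sum(x * q ** (x - 1) * p for x in range(1, 500))

# Poisson(lam): terms built iteratively (term_{x+1} = term_x * lam / (x+1)).
term, e_pois = exp(-lam), 0.0
for x in range(100):
    e_pois += x * term
    term *= lam / (x + 1)

assert abs(e_binom - n * p) < 1e-9
assert abs(e_geom - 1 / p) < 1e-9
assert abs(e_pois - lam) < 1e-9
```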


STA347 - week 5 5

Expectation of Continuous Random Variable

• For a continuous random variable X with density fX(x),

E(X) = ∫_{−∞}^{∞} x fX(x) dx

whenever this integral converges absolutely.


STA347 - week 5 6

Examples

1) X ~ Uniform(a, b). Then E(X) = (a + b)/2.

2) X ~ Exponential(λ). Then E(X) = 1/λ.

3) X is a random variable with density

(i) Check if this is a valid density.

(ii) Find E(X)


STA347 - week 5 7

4) X ~ Gamma(α, λ). Then E(X) = α/λ.

5) X ~ Beta(α, β). Then E(X) = α/(α + β).
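A numeric cross-check of two of these means (my own sketch using simple Simpson integration; the parameter values are arbitrary):

```python
# Composite Simpson's rule, then E(X) for Uniform(2, 5) and Exponential(1.5).
from math import exp

def simpson(f, lo, hi, n=10_000):
    # n must be even; standard 1-4-2-...-4-1 weights
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    s += 4 * sum(f(lo + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(lo + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3

a, b, lam = 2.0, 5.0, 1.5

# Uniform(a, b): integrate x * 1/(b - a) over [a, b]; expect (a + b)/2 = 3.5
e_unif = simpson(lambda x: x / (b - a), a, b)

# Exponential(lam): integrate x * lam * e^(-lam x); the tail past 40 is negligible
e_expo = simpson(lambda x: x * lam * exp(-lam * x), 0.0, 40.0)

assert abs(e_unif - (a + b) / 2) < 1e-6
assert abs(e_expo - 1 / lam) < 1e-6
```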


STA347 - week 5 8

Theorem

For g: R → R,

• If X is a discrete random variable, then E[g(X)] = Σx g(x) pX(x).

• If X is a continuous random variable, then E[g(X)] = ∫ g(x) fX(x) dx.

• Proof:


STA347 - week 5 9

Examples

1. Suppose X ~ Uniform(0, 1). Let Y = X²; then E(Y) = ∫₀¹ x² dx = 1/3.

2. Suppose X ~ Poisson(λ). Let Y = e^X; then E(Y) = Σx e^x e^{−λ} λ^x / x! = e^{λ(e−1)}.
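Both examples can be verified numerically against the theorem, summing or integrating g(x) against the pmf or density (a sketch; λ = 2 is an arbitrary choice):

```python
# Numeric check of E[g(X)] for Y = X^2 (Uniform) and Y = e^X (Poisson).
from math import e, exp

# 1) X ~ Uniform(0, 1), Y = X^2: midpoint rule for the integral of x^2 on [0, 1].
n = 100_000
h = 1.0 / n
e_y = sum(((i + 0.5) * h) ** 2 for i in range(n)) * h

# 2) X ~ Poisson(lam), Y = e^X: sum e^x * e^{-lam} lam^x / x!, which equals
#    exp(lam * (e - 1)). Terms built iteratively to avoid factorial overflow.
lam = 2.0
term = exp(-lam)  # x = 0 term
e_y2 = 0.0
for x in range(60):
    e_y2 += term
    term *= lam * e / (x + 1)

assert abs(e_y - 1 / 3) < 1e-8
assert abs(e_y2 - exp(lam * (e - 1))) < 1e-8
```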


STA347 - week 5 10

Properties of Expectation

For X, Y random variables and constants a, b ∈ R:

• E(aX + b) = aE(X) + b

Proof: Continuous case

• E(aX + bY) = aE(X) + bE(Y)

Proof to come…

• If X is a non-negative random variable, then E(X) = 0 if and only if X = 0 with probability 1.

• If X is a non-negative random variable, then E(X) ≥ 0.

• E(a) = a


STA347 - week 5 11

Moments

• The kth moment of a distribution is E(X^k). We are usually interested in the 1st and 2nd moments (sometimes in the 3rd and 4th).

• Some second moments:

1. Suppose X ~ Uniform(0, 1); then E(X²) = ∫₀¹ x² dx = 1/3.

2. Suppose X ~ Geometric(p); then E(X²) = Σx x² q^{x−1} p = (1 + q)/p², where q = 1 − p.


STA347 - week 5 12

Variance

• The expected value of a random variable, E(X), is a measure of the "center" of a distribution.

• The variance is a measure of how closely concentrated about the center (µ) the probability is. It is also called the 2nd central moment.

• Definition: The variance of a random variable X is

Var(X) = E[(X − E(X))²].

• Claim: Var(X) = E(X²) − (E(X))². Proof:

• We can use the above formula for convenience of calculation.

• The standard deviation of a random variable X is denoted by σX; it is the square root of the variance, i.e. σX = √Var(X).
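The claim Var(X) = E(X²) − (E(X))² can be confirmed exactly for a small example, a fair die, using exact rational arithmetic (my own sketch):

```python
# Exact check of the variance shortcut for X = outcome of one fair die roll.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())                       # E(X) = 7/2
second_moment = sum(x**2 * p for x, p in pmf.items())           # E(X^2) = 91/6
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_shortcut = second_moment - mean**2

assert mean == Fraction(7, 2)
assert var_definition == var_shortcut == Fraction(35, 12)
```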


STA347 - week 5 13

Properties of Variance

For X, Y random variables and a, b constants:

• Var(aX + b) = a²Var(X)

Proof:

• Var(aX + bY) = a²Var(X) + b²Var(Y) + 2abE[(X – E(X))(Y – E(Y))]

Proof:

• Var(X) ≥ 0

• Var(X) = 0 if and only if X = E(X) with probability 1

• Var(a) = 0
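A quick exact check of Var(aX + b) = a²Var(X) for a fair die with a = 2, b = 3 (my own sketch, not from the slides):

```python
# Exact check of Var(aX + b) = a^2 Var(X) for a fair die, a = 2, b = 3.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
a, b = 2, 3

def variance(values):
    # values: list of (value, probability) pairs
    mean = sum(v * p for v, p in values)
    return sum((v - mean) ** 2 * p for v, p in values)

var_x = variance(list(pmf.items()))
var_ax_b = variance([(a * x + b, p) for x, p in pmf.items()])

assert var_x == Fraction(35, 12)
assert var_ax_b == a**2 * var_x == Fraction(35, 3)
```

Note the shift b drops out entirely, as the property predicts.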


STA347 - week 5 14

Examples

1. Suppose X ~ Uniform(0, 1); then E(X) = 1/2 and E(X²) = 1/3, and therefore

Var(X) = 1/3 − (1/2)² = 1/12.

2. Suppose X ~ Geometric(p); then E(X) = 1/p and E(X²) = (1 + q)/p², and therefore

Var(X) = (1 + q)/p² − (1/p)² = q/p².

3. Suppose X ~ Bernoulli(p); then E(X) = p and E(X²) = 0²·q + 1²·p = p, and therefore

Var(X) = p − p² = p(1 − p).
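The geometric and Bernoulli variance formulas can be checked numerically (a sketch; p = 0.25 is an arbitrary choice):

```python
# Check Var(X) = q/p^2 for Geometric(p) and Var(X) = p(1-p) for Bernoulli(p).
p = 0.25
q = 1 - p

# Geometric(p) on 1, 2, ...: truncated sums, the tail beyond 2000 is negligible.
mean = sum(x * q ** (x - 1) * p for x in range(1, 2000))
second = sum(x**2 * q ** (x - 1) * p for x in range(1, 2000))
assert abs(mean - 1 / p) < 1e-9
assert abs(second - (1 + q) / p**2) < 1e-9
assert abs((second - mean**2) - q / p**2) < 1e-9

# Bernoulli(p): two-point sum, exact up to rounding.
bern_var = (0 - p) ** 2 * q + (1 - p) ** 2 * p
assert abs(bern_var - p * (1 - p)) < 1e-12
```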


STA347 - week 5 15

Joint Distribution of Two or More Random Variables

• Sometimes more than one measurement (r.v.) is taken on each member of the sample space. In cases like this there will be several random variables defined on the same probability space, and we would like to explore their joint distribution.

• The joint behavior of two random variables (continuous or discrete), X and Y, is determined by their joint cumulative distribution function

FX,Y(x, y) = P(X ≤ x, Y ≤ y).

• n-dimensional case:

FX1,…,Xn(x1, …, xn) = P(X1 ≤ x1, …, Xn ≤ xn).


STA347 - week 5 16

Properties of Joint Distribution Function

For random variables X, Y, the joint CDF FX,Y : R² → [0, 1] is given by FX,Y(x, y) = P(X ≤ x, Y ≤ y).

• FX,Y(x, y) is non-decreasing in each variable, i.e.

FX,Y(x1, y1) ≤ FX,Y(x2, y2) if x1 ≤ x2 and y1 ≤ y2.

• lim(x,y)→(−∞,−∞) FX,Y(x, y) = 0 and lim(x,y)→(∞,∞) FX,Y(x, y) = 1.

• The marginal CDFs are obtained as limits:

limx→∞ FX,Y(x, y) = FY(y) and limy→∞ FX,Y(x, y) = FX(x).


STA347 - week 5 17

Discrete Case

• Suppose X, Y are discrete random variables defined on the same probability space.

• The joint probability mass function of two discrete random variables X and Y is the function pX,Y(x, y) defined for all pairs of real numbers x and y by

pX,Y(x, y) = P(X = x and Y = y).

• For a joint pmf pX,Y(x, y) we must have pX,Y(x, y) ≥ 0 for all values of x, y and

Σx Σy pX,Y(x, y) = 1.


STA347 - week 5 18

Example for Illustration

• Toss a coin 3 times. Define

X: number of heads on 1st toss, Y: total number of heads.

• The sample space is Ω = {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}.

• We display the joint distribution of X and Y in the following table.

        y = 0   y = 1   y = 2   y = 3
x = 0    1/8     2/8     1/8      0
x = 1     0      1/8     2/8     1/8

• Can we recover the probability mass function for X and Y from the joint table?

• To find the probability mass function of X we sum the appropriate rows of the table of the joint probability function.

• Similarly, to find the mass function for Y we sum the appropriate columns.
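The row/column-summing recipe can be carried out mechanically (a sketch reproducing the coin-toss example):

```python
# Build the joint pmf of X (heads on 1st toss) and Y (total heads) for three
# fair coin tosses, then recover the marginals by summing rows and columns.
from fractions import Fraction
from itertools import product

joint = {}
for outcome in product("HT", repeat=3):
    x = 1 if outcome[0] == "H" else 0
    y = outcome.count("H")
    joint[(x, y)] = joint.get((x, y), 0) + Fraction(1, 8)

# Marginal of X: sum over y (rows); marginal of Y: sum over x (columns).
p_x = {x: sum(p for (x2, _), p in joint.items() if x2 == x) for x in (0, 1)}
p_y = {y: sum(p for (_, y2), p in joint.items() if y2 == y) for y in range(4)}

assert p_x == {0: Fraction(1, 2), 1: Fraction(1, 2)}
assert p_y == {0: Fraction(1, 8), 1: Fraction(3, 8),
               2: Fraction(3, 8), 3: Fraction(1, 8)}
```

As expected, X is a fair Bernoulli variable and Y is Binomial(3, 1/2).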


STA347 - week 5 19

Marginal Probability Function

• The marginal probability mass function for X is

pX(x) = Σy pX,Y(x, y).

• The marginal probability mass function for Y is

pY(y) = Σx pX,Y(x, y).


STA347 - week 5 20

• The case of several discrete random variables is analogous.

• If X1, …, Xm are discrete random variables on the same sample space with joint probability function

pX1,…,Xm(x1, …, xm) = P(X1 = x1, …, Xm = xm),

the marginal probability function for X1 is

pX1(x1) = Σx2,…,xm pX1,…,Xm(x1, …, xm).

• The 2-dimensional marginal probability function for X1 and X2 is

pX1,X2(x1, x2) = Σx3,…,xm pX1,…,Xm(x1, …, xm).


STA347 - week 5 21

Example

• Roll a die twice. Let X: number of 1's and Y: total of the two dice.

There is no simple closed form for the joint mass function of X, Y. We display the joint distribution of X and Y in the following table.

• The marginal probability mass functions of X and Y are found by summing the rows and columns of the table.

• Find P(X ≤ 1 and Y ≤ 4)
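Since the 36 ordered outcomes of the two rolls are equally likely, P(X ≤ 1 and Y ≤ 4) can be found by direct enumeration (my own sketch):

```python
# Enumerate the 36 equally likely outcomes of two die rolls to compute
# P(X <= 1 and Y <= 4), where X = number of 1's and Y = total of the two dice.
from fractions import Fraction
from itertools import product

favorable = 0
for d1, d2 in product(range(1, 7), repeat=2):
    x = (d1 == 1) + (d2 == 1)
    y = d1 + d2
    if x <= 1 and y <= 4:
        favorable += 1

prob = Fraction(favorable, 36)
assert prob == Fraction(5, 36)  # (1,2),(2,1),(1,3),(3,1),(2,2); (1,1) is excluded
```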


STA347 - week 5 22

The Joint Distribution of Two Continuous R.V.'s

• Definition: Random variables X and Y are (jointly) continuous if there is a non-negative function fX,Y(x, y) such that

P((X, Y) ∈ A) = ∫∫_A fX,Y(x, y) dx dy

for any "reasonable" 2-dimensional set A.

• fX,Y(x, y) is called a joint density function for (X, Y).

• In particular, if A = {(X, Y): X ≤ x, Y ≤ y}, the joint CDF of X, Y is

FX,Y(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} fX,Y(u, v) dv du.

• From the Fundamental Theorem of Calculus we have

fX,Y(x, y) = ∂²FX,Y(x, y) / ∂x ∂y.


STA347 - week 5 23

Properties of joint density function

• fX,Y(x, y) ≥ 0 for all (x, y) ∈ R².

• Its integral over R² is ∫∫ fX,Y(x, y) dx dy = 1.


STA347 - week 5 24

Example

• Consider the following bivariate density function:

fX,Y(x, y) = (12/7)(x² + xy) for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and 0 otherwise.

• Check if it's a valid density function.

• Compute P(X > Y).
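A numeric sanity check, assuming the density reads fX,Y(x, y) = (12/7)(x² + xy) on the unit square; under that assumption it integrates to 1 and P(X > Y) = 9/14:

```python
# Midpoint-rule check that f integrates to 1 and that P(X > Y) = 9/14.
def f(x, y):
    # assumed reading of the slide's density: (12/7)(x^2 + x*y) on [0,1]^2
    return 12 / 7 * (x**2 + x * y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

n = 400
h = 1.0 / n
mid = [(i + 0.5) * h for i in range(n)]  # midpoints of a 400 x 400 grid

total = sum(f(x, y) for x in mid for y in mid) * h * h
p_x_gt_y = sum(f(x, y) for x in mid for y in mid if x > y) * h * h

assert abs(total - 1.0) < 1e-4
assert abs(p_x_gt_y - 9 / 14) < 1e-2
```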


STA347 - week 5 25

Marginal Densities and Distribution Functions

• The marginal (cumulative) distribution function of X is

FX(x) = P(X ≤ x) = ∫_{−∞}^{x} ∫_{−∞}^{∞} fX,Y(u, y) dy du.

• The marginal density of X is then

fX(x) = F′X(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy.

• Similarly, the marginal density of Y is

fY(y) = ∫_{−∞}^{∞} fX,Y(x, y) dx.


STA347 - week 5 26

Example

• Consider the following bivariate density function

• Check if it is a valid density function.

• Find the joint CDF of (X, Y) and compute P(X ≤ ½ ,Y ≤ ½ ).

• Compute P(X ≤ 2 ,Y ≤ ½ ).

• Find the marginal densities of X and Y.

fX,Y(x, y) = 6xy² for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and 0 otherwise.
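A numeric sketch, assuming the density reads fX,Y(x, y) = 6xy² on the unit square; under that assumption the marginals are fX(x) = 2x and fY(y) = 3y², and P(X ≤ ½, Y ≤ ½) = 1/32:

```python
# Midpoint-rule checks for the assumed density f(x, y) = 6 x y^2 on [0,1]^2.
def f(x, y):
    return 6 * x * y**2 if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

n = 400
h = 1.0 / n
mid = [(i + 0.5) * h for i in range(n)]

total = sum(f(x, y) for x in mid for y in mid) * h * h
p_half = sum(f(x, y) for x in mid for y in mid if x <= 0.5 and y <= 0.5) * h * h

def marginal_x(x):   # f_X(x) = integral of f(x, y) over y; should equal 2x
    return sum(f(x, y) for y in mid) * h

def marginal_y(y):   # f_Y(y) = integral of f(x, y) over x; should equal 3y^2
    return sum(f(x, y) for x in mid) * h

assert abs(total - 1.0) < 1e-3
assert abs(p_half - 1 / 32) < 1e-3
assert abs(marginal_x(0.7) - 2 * 0.7) < 1e-3
assert abs(marginal_y(0.7) - 3 * 0.7**2) < 1e-3
```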


STA347 - week 5 27

Generalization to higher dimensions

Suppose X, Y, Z are jointly continuous random variables with density f(x,y,z), then

• The marginal density of X is given by

fX(x) = ∫∫ f(x, y, z) dy dz.

• The marginal density of (X, Y) is given by

fX,Y(x, y) = ∫ f(x, y, z) dz.


STA347 - week 5 28

Example

Given the following joint CDF of X, Y:

FX,Y(x, y) = x²y + xy² − x²y², 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.

• Find the joint density of X, Y.

• Find the marginal densities of X and Y and identify them.


STA347 - week 5 29

Example

Consider the joint density

fX,Y(x, y) = λ² e^{−λy} for 0 ≤ x ≤ y, and 0 otherwise,

where λ is a positive parameter.

• Check if it is a valid density.

• Find the marginal densities of X and Y and identify them.


STA347 - week 5 30

Independence of random variables

• Recall the definition of a random variable X: a mapping from Ω to R such that

{ω : X(ω) ≤ x} ∈ F for all x ∈ R.

• By definition of F, this implies that (X > 1.4) is an event, and in the discrete case (X = 2) is an event.

• In general, (X ∈ A) = {ω : X(ω) ∈ A} is an event for any set A that is formed by taking unions / complements / intersections of intervals from R.

• Definition: Random variables X and Y are independent if the events (X ∈ A) and (Y ∈ B) are independent for all such sets A and B.


STA347 - week 5 31

Theorem

• Two discrete random variables X and Y with joint pmf pX,Y(x, y) and marginal mass functions pX(x) and pY(y) are independent if and only if

pX,Y(x, y) = pX(x) pY(y) for all x, y.

• Proof:

• Question: Back to the example of rolling a die 2 times — are X and Y independent?
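For the two-dice question (X = number of 1's, Y = total), the criterion can be tested exhaustively; one counterexample such as p(0, 2) = 0 while pX(0)·pY(2) > 0 already settles it (my own sketch):

```python
# Joint pmf of X (number of 1's) and Y (total) for two fair dice, then an
# exhaustive test of p(x, y) == p_X(x) * p_Y(y).
from fractions import Fraction
from itertools import product

joint = {}
for d1, d2 in product(range(1, 7), repeat=2):
    key = ((d1 == 1) + (d2 == 1), d1 + d2)
    joint[key] = joint.get(key, 0) + Fraction(1, 36)

p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

independent = all(
    joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
    for x in p_x for y in p_y
)

assert not independent           # X and Y are dependent
assert joint.get((0, 2), 0) == 0 # a total of 2 forces two 1's, so (0, 2) is impossible
assert p_x[0] * p_y[2] == Fraction(25, 36 * 36)
```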


STA347 - week 5 32

Theorem

• Suppose X and Y are jointly continuous random variables. X and Y are independent if and only if, given any densities fX and fY for X and Y, their product is a joint density for the pair (X, Y), i.e.

fX,Y(x, y) = fX(x) fY(y).

Proof:

• If X and Y are independent random variables and Z = g(X), W = h(Y), then Z, W are also independent.


STA347 - week 5 33

Example

• Suppose X and Y are discrete random variables whose values are the non-negative integers and whose joint probability function is

pX,Y(x, y) = e^{−2} / (x! y!), x, y = 0, 1, 2, ….

Are X and Y independent? What are their marginal distributions?

• Factorization is enough for independence, but we need to be careful with the constant terms for the factors to be marginal probability functions.


STA347 - week 5 34

Example and Important Comment

• The joint density for X, Y is given by

fX,Y(x, y) = 24xy for x ≥ 0, y ≥ 0, x + y ≤ 1, and 0 otherwise.

• Are X, Y independent?

• Independence requires that the set of points where the joint density is positive be the Cartesian product of the sets of points where the marginal densities are positive, i.e. the set of points where fX,Y(x, y) > 0 must be a (possibly infinite) rectangle. Here the support is a triangle, so X and Y are not independent even though 24xy factors into a function of x times a function of y.