joint distribution
7/28/2019 Joint Distribution
Two Random Variables
W&W, Chapter 5
-
Joint Distributions
So far we have been talking about the probability of a single variable,
or of a variable conditional on another.
We often want to determine the joint probability of two variables,
such as X and Y.
Suppose we are able to determine the following information for
education (X) and age (Y) for all U.S. citizens based on the census.
-
Joint Distributions
Education (X)      Age (Y): 25-35   35-55   55-100
                   y =      30      45      70
None        0               .01     .02     .05
Primary     1               .03     .06     .10
Secondary   2               .18     .21     .15
College     3               .07     .08     .04
-
Joint Distributions
Each cell is the relative frequency (f/N).
We can define the joint probability distribution as:
p(x,y) = Pr(X=x and Y=y)
Example: what is the probability of getting a 30-year-old college graduate?
-
Joint Distributions
p(3,30) = Pr(X=3 and Y=30)
= .07
We can see that:
$p(x) = \sum_y p(x,y)$
p(x=1) = .03 + .06 + .10 = .19
-
Marginal Probability
We call this the marginal probability because it is calculated by
summing across rows or columns and is thus reported in the margins
of the table.
We can calculate this for our entire table.
-
Marginal Probability Distribution
Education (X)      Age (Y): 30      45      70      p(x)
None        0               .01     .02     .05     .08
Primary     1               .03     .06     .10     .19
Secondary   2               .18     .21     .15     .54
College     3               .07     .08     .04     .19
p(y)                        .29     .37     .34     1
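The marginal totals in the table can be reproduced with a few lines of Python. This is only a sketch: the names `joint`, `p_x`, and `p_y` are illustrative choices, not part of the original slides.

```python
# Joint distribution p(x, y) from the table: one row per education level x,
# one column per age value y (30, 45, 70).
joint = [
    [0.01, 0.02, 0.05],  # None      (x = 0)
    [0.03, 0.06, 0.10],  # Primary   (x = 1)
    [0.18, 0.21, 0.15],  # Secondary (x = 2)
    [0.07, 0.08, 0.04],  # College   (x = 3)
]

# p(x): sum each row over y. p(y): sum each column over x.
# Rounding only hides floating-point noise in the printed values.
p_x = [round(sum(row), 2) for row in joint]
p_y = [round(sum(col), 2) for col in zip(*joint)]

print(p_x)  # [0.08, 0.19, 0.54, 0.19]
print(p_y)  # [0.29, 0.37, 0.34]
```

Both marginal distributions sum to 1, as the bottom-right cell of the table shows.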
-
Independence
Two random variables X and Y are independent if the events (X=x) and
(Y=y) are independent, or:
p(x,y) = p(x)p(y) for all x and y
Note that this is similar to the rule for events: event E is
independent of F if:
Pr(E and F) = Pr(E)Pr(F) (Eq. 3-21)
-
Example
Are education and age independent?
Start with the upper left-hand cell:
p(x,y) = .01
p(x) = .08
p(y) = .29
We can see they are not independent because (.08)(.29) = .0232,
which is not equal to .01.
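The same check can be run over every cell rather than just the upper left one. The sketch below assumes the joint table from the earlier slides; the helper `is_independent` and its tolerance `tol` are hypothetical names introduced for illustration.

```python
joint = [
    [0.01, 0.02, 0.05],  # None      (x = 0)
    [0.03, 0.06, 0.10],  # Primary   (x = 1)
    [0.18, 0.21, 0.15],  # Secondary (x = 2)
    [0.07, 0.08, 0.04],  # College   (x = 3)
]

def is_independent(table, tol=1e-9):
    """Return True if p(x,y) == p(x) * p(y) in every cell (within tol)."""
    p_x = [sum(row) for row in table]
    p_y = [sum(col) for col in zip(*table)]
    return all(
        abs(table[i][j] - p_x[i] * p_y[j]) <= tol
        for i in range(len(p_x))
        for j in range(len(p_y))
    )

# Upper left-hand cell: p(x)p(y) versus p(x,y)
product = sum(joint[0]) * sum(row[0] for row in joint)
print(round(product, 4), joint[0][0])  # 0.0232 0.01
print(is_independent(joint))           # False
```

A single mismatching cell is enough to rule out independence, which is why the slide stops after the first one.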
-
Independence
In a table like this, if X and Y are independent, then the rows of the
table p(x,y) will be proportional and so will the columns
(see Example 5-1, page 158).
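To see what proportional rows look like, one can build a hypothetical independent table from the marginals of the education and age example by setting every cell to p(x)p(y). This is an illustration only; it is not the census table and not Example 5-1 from the textbook.

```python
# Marginals taken from the education/age table.
p_x = [0.08, 0.19, 0.54, 0.19]
p_y = [0.29, 0.37, 0.34]

# Hypothetical joint table under independence: p(x,y) = p(x) * p(y).
independent = [[px * py for py in p_y] for px in p_x]

# Rescaling each row by its own total gives the same profile every time,
# namely p(y). That is what "the rows are proportional" means.
profiles = [[round(cell / sum(row), 2) for cell in row] for row in independent]
for profile in profiles:
    print(profile)  # [0.29, 0.37, 0.34] on every line
```

The same argument applied to the columns recovers p(x) in each column.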
-
Covariance
It is useful to know how two variables vary together, or how they
co-vary. We begin with the familiar concept of variance
(E is expectation).
$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 \, p(x)$
$\sigma_{X,Y}$ = Covariance of X and Y
$= E[(X - \mu_X)(Y - \mu_Y)]$
$= \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$
-
Covariance
Let's calculate the covariance for education (X) and age (Y).
First we need to calculate the mean for X and Y:
$\mu_X = \sum_x x \, p(x)$ = (0)(.08)+(1)(.19)+(2)(.54)+(3)(.19) = 1.84
$\mu_Y = \sum_y y \, p(y)$ = (30)(.29)+(45)(.37)+(70)(.34) = 49.15
Now, for each cell in the table, multiply the deviation of x from its
mean by the deviation of y from its mean and by the joint probability,
then sum over all cells.
-
Covariance
$\sigma_{X,Y} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$
= (0-1.84)(30-49.15)(.01) + (0-1.84)(45-49.15)(.02) + (0-1.84)(70-49.15)(.05) +
(1-1.84)(30-49.15)(.03) + (1-1.84)(45-49.15)(.06) + (1-1.84)(70-49.15)(.10) +
(2-1.84)(30-49.15)(.18) + (2-1.84)(45-49.15)(.21) + (2-1.84)(70-49.15)(.15) +
(3-1.84)(30-49.15)(.07) + (3-1.84)(45-49.15)(.08) + (3-1.84)(70-49.15)(.04)
= -3.636
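The twelve-term sum is tedious by hand, so a short script is a convenient way to check it. The sketch below recomputes the means and the covariance from the joint table; the variable names are illustrative.

```python
x_values = [0, 1, 2, 3]   # education codes
y_values = [30, 45, 70]   # age values used on the slides
joint = [
    [0.01, 0.02, 0.05],
    [0.03, 0.06, 0.10],
    [0.18, 0.21, 0.15],
    [0.07, 0.08, 0.04],
]

# Means from the marginal distributions (row sums and column sums).
mu_x = sum(x * sum(row) for x, row in zip(x_values, joint))
mu_y = sum(y * sum(col) for y, col in zip(y_values, zip(*joint)))

# Covariance: sum over all cells of (x - mu_x)(y - mu_y) p(x, y).
cov = sum(
    (x - mu_x) * (y - mu_y) * joint[i][j]
    for i, x in enumerate(x_values)
    for j, y in enumerate(y_values)
)

print(round(mu_x, 2), round(mu_y, 2), round(cov, 3))  # 1.84 49.15 -3.636
```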
-
Covariance and Independence
If X and Y are independent, then they are uncorrelated, or their
covariance is zero:
$\sigma_{X,Y} = 0$
The value for covariance depends on the units in which X and Y are
measured. If X, for example, were measured in inches instead of feet,
each X deviation, and hence $\sigma_{X,Y}$ itself, would be 12 times
larger.
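The factor of 12 follows directly from the definition of covariance. As a short derivation, writing the rescaled variable as 12X (so that its mean is $12\mu_X$):

```latex
\begin{aligned}
\sigma_{12X,\,Y} &= E\big[(12X - 12\mu_X)(Y - \mu_Y)\big] \\
                 &= 12\,E\big[(X - \mu_X)(Y - \mu_Y)\big] \\
                 &= 12\,\sigma_{X,Y}
\end{aligned}
```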
-
Correlation
We can calculate the correlation instead:
$\rho = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y}$
Correlation is independent of the scale it is measured in, and is
always bounded:
$-1 \le \rho \le 1$
-
Correlation
A perfect positive correlation ($\rho = 1$): all x,y coordinate
points fall on a straight line with positive slope.
A perfect negative correlation ($\rho = -1$): all x,y coordinate
points fall on a straight line with negative slope.
A correlation of zero indicates no linear relationship between
X and Y. Independent variables always have zero correlation, but zero
correlation alone does not guarantee independence.
Positive correlations: as X increases, Y tends to increase.
Negative correlations: as X increases, Y tends to decrease.
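One caveat about the zero-correlation case is worth checking numerically: covariance can be zero even when Y is completely determined by X. The example below is hypothetical and not from the slides. X takes the values -1, 0, 1 with equal probability and Y = X squared.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Hypothetical joint distribution: keys are (x, y) pairs with y = x**2.
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Marginal probabilities of X = 0 and Y = 0.
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)

print(cov)                           # 0
print(joint[(0, 0)] == p_x0 * p_y0)  # False, so X and Y are not independent
```

Exact fractions are used so that the zero covariance is not an artifact of rounding.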
-
Example of Correlation
Calculate the correlation between education and age:
$\rho = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y} = \frac{-3.636}{(.8212)(16.14)} = -0.2743$
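The standard deviations .8212 and 16.14 are not derived on the slides. The sketch below computes them from the marginal distributions and then forms the correlation; it assumes the same joint table as before, and the variable names are illustrative.

```python
import math

x_values = [0, 1, 2, 3]
y_values = [30, 45, 70]
joint = [
    [0.01, 0.02, 0.05],
    [0.03, 0.06, 0.10],
    [0.18, 0.21, 0.15],
    [0.07, 0.08, 0.04],
]

p_x = [sum(row) for row in joint]
p_y = [sum(col) for col in zip(*joint)]
mu_x = sum(x * p for x, p in zip(x_values, p_x))
mu_y = sum(y * p for y, p in zip(y_values, p_y))

# Standard deviations: square root of the sum of (value - mean)^2 * p(value).
sigma_x = math.sqrt(sum((x - mu_x) ** 2 * p for x, p in zip(x_values, p_x)))
sigma_y = math.sqrt(sum((y - mu_y) ** 2 * p for y, p in zip(y_values, p_y)))

cov = sum(
    (x - mu_x) * (y - mu_y) * joint[i][j]
    for i, x in enumerate(x_values)
    for j, y in enumerate(y_values)
)
rho = cov / (sigma_x * sigma_y)

print(round(sigma_x, 4), round(sigma_y, 2), round(rho, 4))  # 0.8212 16.14 -0.2743
```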
-
Interpretation
There is a weak, negative correlation between education and age,
which means that older people tend to have less education.
Later on we will learn how to conduct a hypothesis test to determine
if $\rho$ is significantly different from zero.