i - ucsb's department of economicsecon.ucsb.edu/~llad/econ240af04/lecture_12.doc · web...
Nov. 4, 2004 LEC #12 ECON 240A L. Phillips
Bivariate Normal Distribution: Isodensity Curves
I. Introduction
Economists rely heavily on regression to investigate the relationship between a
dependent variable, y, and one or more independent variables, x, w, etc. As we have seen,
graphical analysis often provides insight into these bivariate relationships and can reveal
non-linear dependence, outliers, and other features that may complicate the analysis.
There are other methodologies for examining bivariate relations. We have
examined some of them. For example, correlation analysis, using the correlation
coefficient, $\rho$, is one method, as discussed in Lecture Eight. Another method is
contingency table analysis. We will discuss the latter shortly. First we turn to the
bivariate normal distribution, which provides a useful visual model for bivariate
relationships just as the univariate normal distribution provides a useful probability
model for a single variable.
It is useful to have a mental model in mind for bivariate relationships, and the
isodensity lines, or contour lines, of the bivariate normal provide a visual representation.
The bivariate normal distribution of two variables, y and x, is a joint density function,
f(x, y). If the variables are jointly normal, then the marginal densities, f(x) and f(y), are
each normal. In addition, the conditional densities, e.g. that of y given x, f(y|x), are normal
as well.
The isodensity lines, i.e. the loci where f(x, y) is constant, are circles around the
origin for the bivariate normal if both x and y have mean zero and variance one, i.e. are
standardized normal variates, and are not correlated. If x and y have nonzero means, $\mu_x$
and $\mu_y$, respectively, then these contour lines are circles around the point ($\mu_x$, $\mu_y$).
If x has a larger variance than y, then the contour lines are ellipses with the long
axis in the x direction. If x and y are correlated, then these ellipses are slanted.
II. Bivariate Normal Density
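The density function, f(x, y), for two jointly normal variables, x and y, where x has
mean $\mu_x$ and variance $\sigma_x^2$, y has mean $\mu_y$ and variance $\sigma_y^2$, and the
correlation coefficient is $\rho$, is:
$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right\}. \qquad (1)$$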
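Eq. (1) can be evaluated directly. The following is a minimal sketch, not part of the original notes; the function name `bivariate_normal_pdf` and its argument names are chosen here for illustration only.

```python
import math

def bivariate_normal_pdf(x, y, mu_x=0.0, mu_y=0.0, sigma_x=1.0, sigma_y=1.0, rho=0.0):
    """Joint density f(x, y) of Eq. (1); assumes sigma_x, sigma_y > 0 and |rho| < 1."""
    zx = (x - mu_x) / sigma_x          # standardized x
    zy = (y - mu_y) / sigma_y          # standardized y
    one_minus_rho2 = 1.0 - rho ** 2
    norm_const = 1.0 / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2))
    quad_form = (zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / one_minus_rho2
    return norm_const * math.exp(-0.5 * quad_form)
```

With the default arguments (zero means, unit variances, zero correlation) the density at the origin is 1/(2π), the peak of the standardized case.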
A. Case 1: correlation is zero, means are zero, and variances are one
$$f(x, y) = \frac{1}{2\pi} \exp\left\{-\frac{1}{2}\left[x^2 + y^2\right]\right\} \qquad (2)$$
and for an isodensity, where f(x, y) is a constant, k, taking logarithms,
$$\ln[2\pi f(x, y)] = -\frac{1}{2}\left[x^2 + y^2\right],$$
or
$$x^2 + y^2 = -2\ln[2\pi f(x, y)] = -2\ln[2\pi k]. \qquad (3)$$
Recall that $x^2 + y^2 = r^2$ is the equation of a circle around the origin, (0, 0), with radius
r, as illustrated in Figure 1.
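As a quick numerical check of Eq. (3), the radius of the circle on which the standardized density equals k can be computed and substituted back into Eq. (2). This is a sketch under the Case 1 assumptions; the helper names are hypothetical.

```python
import math

def standard_density(x, y):
    """Standardized, uncorrelated bivariate normal density, Eq. (2)."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def isodensity_radius(k):
    """Radius r with x^2 + y^2 = r^2 = -2 ln(2 pi k), Eq. (3)."""
    peak = 1.0 / (2.0 * math.pi)       # density at the origin
    if not 0.0 < k <= peak:
        raise ValueError("k must lie in (0, 1/(2*pi)]")
    # max() guards against a tiny negative value from rounding when k equals the peak
    return math.sqrt(max(0.0, -2.0 * math.log(2.0 * math.pi * k)))
```

Smaller values of k give larger circles, and any point at distance `isodensity_radius(k)` from the origin has density k.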
--------------------------------------------------------------------------------
Figure 1: Isodensity Circles About the Origin
Note that if x and y are independent, then the correlation coefficient, $\rho$, is zero
and the joint density function, f(x, y), is the product of the marginal density
functions for x and y, i.e.
$$f(x, y) = f(x) f(y) = \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}x^2\right] \cdot \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}y^2\right] \qquad (4)$$
where x and y have mean zero and variance one.
B. Case 2: correlation is zero, variances are one, means $\mu_x$ and $\mu_y$
In this case, the origin is translated to the point of the means, ($\mu_x$, $\mu_y$). The
bivariate density function is:
$$f(x, y) = \frac{1}{2\pi} \exp\left\{-\frac{1}{2}\left[(x-\mu_x)^2 + (y-\mu_y)^2\right]\right\}. \qquad (5)$$
For a density equal to k:
$$(x-\mu_x)^2 + (y-\mu_y)^2 = -2\ln[2\pi f(x, y)] = -2\ln[2\pi k] \qquad (6)$$
This is illustrated in Figure 2.
-------------------------------------------------------------------------------------
Figure 2: Isodensity Lines About the Point of Means, Bivariate Normal
C. Case 3: correlation is zero, variance of x > variance of y
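If the variance of x exceeds the variance of y, then the isodensity lines are ellipses
about the point of the means with the semi-major axis in the x direction:
$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left\{-\frac{1}{2}\left[\left(\frac{x-\mu_x}{\sigma_x}\right)^2 + \left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]\right\} \qquad (7)$$
Note that if x and y are independent, then the correlation coefficient is zero and
the joint density is the product of the marginal densities:
$$f(x, y) = f(x) f(y) = \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2\right] \cdot \frac{1}{\sigma_y\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu_y}{\sigma_y}\right)^2\right]$$
For a constant isodensity, f(x, y) = k, from Eq. (7) we have,
$$\left(\frac{x-\mu_x}{\sigma_x}\right)^2 + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 = -2\ln(2\pi\sigma_x\sigma_y f(x, y)) = -2\ln(2\pi\sigma_x\sigma_y k) \qquad (8)$$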
Recall the equation of an ellipse about the origin with semi-major axis a
and semi-minor axis b is:
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad (9)$$
Elliptical isodensity lines around the point of the means are illustrated for Eq. (7)
in Figure 3.
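As a worked step (a sketch added here, not part of the original notes), the constant-density condition of Eq. (8) can be put in the form of Eq. (9) with the center moved to the point of the means. The symbol c is introduced only for this illustration, and the step assumes 0 < k < 1/(2πσ_xσ_y) so that c² is positive.

```latex
c^2 \equiv -2\ln(2\pi\sigma_x\sigma_y k), \qquad
\frac{(x-\mu_x)^2}{(c\,\sigma_x)^2} + \frac{(y-\mu_y)^2}{(c\,\sigma_y)^2} = 1
\quad\Longrightarrow\quad
a = c\,\sigma_x, \quad b = c\,\sigma_y, \quad \frac{a}{b} = \frac{\sigma_x}{\sigma_y}.
```

So every isodensity ellipse has the same shape, with axis lengths in the ratio of the standard deviations, and lower densities k correspond to larger ellipses.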
D. Case 4: correlation is nonzero
The joint density function is given by Eq. (1) above, and the isodensity lines are
tilted ellipses around the point of the means, as illustrated in Figure 4 for positive
correlation.
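The tilt can be quantified. The major axis of each isodensity ellipse lies along the principal axis of the covariance matrix, whose angle from the x axis is (1/2)·atan2(2ρσ_xσ_y, σ_x² − σ_y²). This standard result is not derived in the notes, so the sketch below is an illustration with a hypothetical function name.

```python
import math

def ellipse_tilt(sigma_x, sigma_y, rho):
    """Angle (radians, measured from the x axis) of the major axis of the
    isodensity ellipses, i.e. the principal axis of the covariance matrix
    [[sigma_x^2, rho*sigma_x*sigma_y], [rho*sigma_x*sigma_y, sigma_y^2]]."""
    cov_xy = rho * sigma_x * sigma_y
    return 0.5 * math.atan2(2.0 * cov_xy, sigma_x ** 2 - sigma_y ** 2)
```

With equal variances and positive correlation the ellipses tilt at 45 degrees; with zero correlation and a larger variance for x the angle is zero, which is Case 3.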
Figure 3: Isodensity Lines About the Point of the Means, Var x > Var y
-----------------------------------------------------------------------------------
Figure 4: isodensity lines, x and y correlated
----------------------------------------------------------------------------------------------
III. Marginal Density Functions
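If x and y are jointly normal, then x and y each have normal marginal density
functions. For example, the marginal density of x, f(x), is:
$$f(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2\right] \qquad (10)$$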
and similarly for y.
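The claim that integrating the joint density over y leaves a normal density for x can be checked numerically. The sketch below uses a simple trapezoid rule over plus or minus ten standard deviations of y; the function names, the integration range and the step count are illustrative assumptions.

```python
import math

def joint_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density, Eq. (1)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / (1.0 - rho ** 2)
    return math.exp(-0.5 * q) / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho ** 2))

def normal_pdf(x, mu, sigma):
    """Univariate normal density, the right-hand side of Eq. (10)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_x_numeric(x, mu_x, mu_y, sigma_x, sigma_y, rho, half_width=10.0, steps=4000):
    """Trapezoid-rule approximation to the integral of f(x, y) over y."""
    lo = mu_y - half_width * sigma_y
    hi = mu_y + half_width * sigma_y
    h = (hi - lo) / steps
    args = (mu_x, mu_y, sigma_x, sigma_y, rho)
    total = 0.5 * (joint_pdf(x, lo, *args) + joint_pdf(x, hi, *args))
    for i in range(1, steps):
        total += joint_pdf(x, lo + i * h, *args)
    return total * h
```

For example, `marginal_x_numeric(1.5, 1.0, 2.0, 2.0, 1.0, 0.6)` should agree closely with `normal_pdf(1.5, 1.0, 2.0)`, whatever the value of the correlation.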
IV. Conditional Density Function
The density of y conditional on a particular value of x, x = x*, is just a vertical slice
through the isodensity curve plot at that value of x (rescaled so that it integrates to one),
and if x and y are jointly normal, it is also normal. It can be obtained by dividing the joint
density function by the marginal density and simplifying:
$$f(y|x) = \frac{f(x, y)}{f(x)} = \frac{1}{\sigma_y\sqrt{2\pi(1-\rho^2)}} \exp\left\{-\frac{\left[y - \mu_y - \rho\frac{\sigma_y}{\sigma_x}(x-\mu_x)\right]^2}{2(1-\rho^2)\sigma_y^2}\right\} \qquad (11)$$
where the mean of the conditional distribution is $\mu_y + \rho(\sigma_y/\sigma_x)(x-\mu_x)$, i.e. this is the
expected value of y for a given value of x, such as x*:
$$E[y|x=x^*] = \mu_y + \rho\frac{\sigma_y}{\sigma_x}(x^* - \mu_x) \qquad (12)$$
So, if x is at its mean, $\mu_x$, then the expected value of y is its mean, $\mu_y$. If x is above its
mean, and the correlation is positive, then the expected value of y conditional on x is
greater than $\mu_y$. This is called the regression of y on x, with intercept $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$ and
slope $\rho(\sigma_y/\sigma_x)$. Of course, if x and y are not correlated, then the slope is zero, and the
intercept is $\mu_y$. The variance of the conditional distribution is:
$$\mathrm{Var}[y|x=x^*] = \sigma_y^2(1-\rho^2) \qquad (13)$$
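Eqs. (12) and (13) translate into a few lines of code. This is a minimal sketch with a hypothetical function name; it returns the conditional mean and variance of y at a chosen value x*.

```python
def conditional_moments(x_star, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Mean and variance of y given x = x_star for jointly normal x and y."""
    slope = rho * sigma_y / sigma_x                    # regression slope
    mean = mu_y + slope * (x_star - mu_x)              # Eq. (12)
    variance = sigma_y ** 2 * (1.0 - rho ** 2)         # Eq. (13), does not depend on x_star
    return mean, variance
```

Note that the conditional variance is never larger than the marginal variance of y, and that it shrinks toward zero as the correlation approaches plus or minus one.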
The isodensity lines and the regression line, the mean of y conditional on x, are
illustrated in Figure 5, for the case where x and y are positively correlated and the
variance of x is greater than the variance of y.
Figure 5: The Expected value of y Conditional on x
V. Example: Rates of Return for a Stock and the Market
In Lab Six we look at the data file XR17-34 for 48 monthly rates of return on
General Electric (GE) stock and the Standard and Poor's Composite Index. Neither of these
variables is significantly different from normal in its marginal distribution. An
example is the histogram and statistics for the rate of return for GE, shown in Figure 6.
The coefficient of skewness, S, is a measure of non-symmetry:
$$S = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^3 \qquad (14)$$
where $\bar{x}$ is the sample mean and s is the sample standard deviation. For the normal
distribution, the coefficient of skewness is zero, since the cubes of the deviations from the
mean sum to zero, with the negative values offset by the positive ones because of symmetry.
Figure 6: Histogram and descriptive statistics for the rate of return on GE
Series: GE; Sample 1993:01 1996:12; Observations 48
Mean 0.022218; Median 0.019524; Maximum 0.117833; Minimum -0.058824; Std. Dev. 0.043669;
Skewness 0.064629; Kurtosis 2.231861; Jarque-Bera 1.213490; Probability 0.545122
The coefficient of kurtosis, K, is a measure of how peaked or how flat the density
is, capturing the weight in the tails.
$$K = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^4 \qquad (15)$$
For the normal distribution, the coefficient of kurtosis is three.
The Jarque-Bera statistic, JB, combines these two coefficients:
$$JB = \frac{n-k}{6}\left[S^2 + \frac{1}{4}(K-3)^2\right] \qquad (16)$$
where k is the number of estimated parameters, such as the sample mean and sample
standard deviation, needed to calculate the statistics. If S is zero and K is 3, then the JB
statistic will be zero. Large values of JB indicate a deviation from normality, and can be
tested using the Chi-Square distribution with two degrees of freedom.
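The reported statistics can be checked against Eq. (16). The sketch below is an illustration with a hypothetical function name. It relies on the fact that the upper-tail probability of a Chi-Square variable with two degrees of freedom is exp(-JB/2). One assumption is flagged: the Jarque-Bera value reported for GE is reproduced with k = 0, so that is the default used here.

```python
import math

def jarque_bera(n, skewness, kurtosis, k=0):
    """Jarque-Bera statistic of Eq. (16) and its Chi-Square(2) p-value.
    k = 0 is an assumption: it is the value that reproduces the reported output."""
    jb = (n - k) / 6.0 * (skewness ** 2 + 0.25 * (kurtosis - 3.0) ** 2)
    p_value = math.exp(-jb / 2.0)      # P(chi-square with 2 d.f. > jb)
    return jb, p_value

# GE: n = 48, S = 0.064629, K = 2.231861 (from the Figure 6 statistics)
jb_ge, p_ge = jarque_bera(48, 0.064629, 2.231861)
```

With these inputs the function gives a statistic of about 1.2135 and a probability of about 0.5451, in line with the reported Jarque-Bera 1.213490 and Probability 0.545122, so normality is not rejected for GE.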
The descriptive statistics for GE and the Index are given in Table 1. The estimated
correlation coefficient is 0.636. These estimates can be used to implement Eq. (12):
$$E[y|x=x^*] = \left[\mu_y - \rho\mu_x(\sigma_y/\sigma_x)\right] + \rho(\sigma_y/\sigma_x)x^*$$
E[GE|Index] = [0.0222 – 0.636*0.0144*(0.0437/0.0254)] + 0.636*1.720*Index
E[GE|Index] = 0.0064 + 1.094*Index (13)
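The arithmetic above can be reproduced from the summary statistics alone. The sketch below (function name hypothetical) uses the unrounded means and standard deviations reported in Table 1 together with the correlation of 0.636 quoted in the text.

```python
def regression_from_moments(mean_y, mean_x, sd_y, sd_x, rho):
    """Intercept and slope of E[y | x] implied by Eq. (12)."""
    slope = rho * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# y = GE, x = Index; means and standard deviations from Table 1
intercept, slope = regression_from_moments(
    mean_y=0.022218, mean_x=0.014361, sd_y=0.043669, sd_x=0.025430, rho=0.636
)
```

This gives an intercept of roughly 0.0065 and a slope of roughly 1.092. The small differences from 0.0064 and 1.094 come from the rounding of the inputs in the hand calculation.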
For comparison, the estimated regression is reported in Table 2. The coefficients are nearly identical. So the
regression can be interpreted as the expected value of y for a given value of x. A plot of the rates of return
for GE and the stock Index is shown in Figure 6.
Table 1
Sample: 1993:01 1996:12
               GE          INDEX
Mean           0.022218    0.014361
Median         0.019524    0.017553
Maximum        0.117833    0.076412
Minimum       -0.058824   -0.044581
Std. Dev.      0.043669    0.025430
Skewness       0.064629   -0.453474
Kurtosis       2.231861    3.222043
Jarque-Bera    1.213490    1.743715
Probability    0.545122    0.418174
Observations   48          48
Table 2
Dependent Variable: GE
Method: Least Squares
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          0.006526      0.005659     1.153229      0.2548
INDEX      1.092674      0.195328     5.594046      0.0000
R-squared            0.404865    Mean dependent var      0.022218
Adjusted R-squared   0.391927    S.D. dependent var      0.043669
S.E. of regression   0.034053    Akaike info criterion  -3.881039
Sum squared resid    0.053341    Schwarz criterion      -3.803072
Log likelihood       95.14493    F-statistic             31.29335
Durbin-Watson stat   2.442439    Prob(F-statistic)       0.000001
Figure 6: Rates of Return for GE Stock and S&P Composite Index (horizontal axis: INDEX; vertical axis: GE)
VI. Discriminating Between Two Populations
As an example, we will use the data file XR18-58 on lottery expenditure as a
percent of income, introduced in Lab Six. Twenty-three individuals did not gamble. The
means for their age, number of children, years of education, and income are shown in
Table 3. For comparison, the means of the 77 individuals who did gamble are shown in
Table 4. The question is whether these explanatory variables can predict who will and who
will not buy lottery tickets.
The means for number of children and age are fairly similar for the two groups. Those who do not
buy lottery tickets are better educated with higher incomes than those who participate in the lottery. The
correlation between education and income is 0.65 for ticket buyers, and 0.74 for the entire sample.
Table 3
Sample: 1 23
               AGE         CHILDREN    EDUCATION   INCOME      LOTTERY
Mean           40.43478    1.782609    15.56522    47.56522    0.000000
Median         41.00000    2.000000    16.00000    42.00000    0.000000
Maximum        54.00000    4.000000    20.00000    95.00000    0.000000
Minimum        23.00000    0.000000    7.000000    18.00000    0.000000
Std. Dev.      8.805092    1.277658    3.368653    22.51631    0.000000
Skewness      -0.446250    0.014659   -0.919721    0.518080    NA
Kurtosis       2.308389    1.985475    3.156800    2.097295    NA
Jarque-Bera    1.221762    0.987199    3.266130    1.809815    NA
Probability    0.542872    0.610425    0.195330    0.404579    NA
Observations   23          23          23          23          23
----------------------------------------------------------------------------------------
Table 4
Sample: 24 100
               AGE         CHILDREN    EDUCATION   INCOME      LOTTERY
Mean           44.19481    1.779221    11.94805    28.54545    7.000000
Median         43.00000    2.000000    11.00000    27.00000    7.000000
Maximum        82.00000    6.000000    17.00000    64.00000    13.00000
Minimum        21.00000    0.000000    7.000000    11.00000    1.000000
Std. Dev.      12.70727    1.343830    2.887797    9.423578    2.695025
Skewness       0.466514    0.506085    0.293006    1.304264   -0.308533
Kurtosis       3.189937    3.149919    1.918891    5.036654    2.741336
Jarque-Bera    2.908734    3.359008    4.851659    35.13888    1.436299
Probability    0.233548    0.186466    0.088405    0.000000    0.487654
Observations   77          77          77          77          77
---------------------------------------------------------------------------
The conceptual framework is provided in Figure 7, which shows isodensity curves
for the two populations for the explanatory variables income and education.
Figure 7: Discriminating Between Those Who Play the Lottery and Those Who Don’t
(axes: X = income, Y = education; isodensity curves for Lottery Players and Lottery Avoiders,
separated by a Decision Rule Line)
---------------------------------------------------------------------------------------
Using a single variable, we could test for a difference in sample means for
education or for a difference in the sample means for income. But why not use both
variables and, instead of a decision rule classifying individuals as gamblers if x < x* or y < y*,
use a decision rule line that separates the two populations? This is called discriminant
function analysis.
Another approach is to use a probability model. A linear probability model can be
estimated with regression, using a dependent variable coded one for those who buy tickets
and zero for those who do not (designated Bern, for Bernoulli), and regressing it against
education and income. The results are shown in Table 7, with a plot of actual, fitted and
residuals following. Since income is very skewed, it is better to use the natural logarithm
of income, which is more bell shaped.
Using the same coding for the dependent variable, non-linear estimation of the
logit probability model is possible using EViews, which avoids some problems that occur
with the linear probability model.
Table 7
Dependent Variable: BERN
Method: Least Squares
Sample: 1 100
Included observations: 100
Variable    Coefficient   Std. Error   t-Statistic   Prob.
EDUCATION   -0.021597     0.016017     -1.348392     0.1807
INCOME      -0.010462     0.003430     -3.049569     0.0030
C            1.390402     0.148465      9.365178     0.0000
R-squared            0.277095    Mean dependent var      0.770000
Adjusted R-squared   0.262190    S.D. dependent var      0.422953
S.E. of regression   0.363299    Akaike info criterion   0.842358
Sum squared resid    12.80264    Schwarz criterion       0.920513
Log likelihood      -39.11792    F-statistic             18.59045
Durbin-Watson stat   0.651758    Prob(F-statistic)       0.000000
--------------------------------------------------------------------------
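One well-known problem with the linear probability model is that its fitted values are not confined to the interval from zero to one. The sketch below evaluates the fitted equation from Table 7 at two hypothetical individuals, chosen at the extremes of the reported sample ranges (these combinations are illustrative and need not match any actual observation in the data file).

```python
def lpm_fitted(education, income, const=1.390402, b_educ=-0.021597, b_income=-0.010462):
    """Fitted value of Bern from the Table 7 linear probability model."""
    return const + b_educ * education + b_income * income

# Hypothetical individuals at the edges of the sample ranges in Tables 3 and 4
low_both = lpm_fitted(education=7, income=11)     # least education, lowest income
high_both = lpm_fitted(education=20, income=95)   # most education, highest income
```

The first fitted value exceeds one and the second is below zero, so neither can be read literally as a probability. A logit model keeps fitted probabilities strictly between zero and one.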
Figure 8: Actual, Fitted and Residuals from the Linear Probability Model
The linear probability model can be interpreted from the perspective of decision
theory, and used to come up with a decision rule or discriminant function. The expected
cost of misclassification is the sum of the expected costs of two kinds of
misclassification: (1) labeling a non-player a player, and (2) labeling a player a
non-player. For example, if we take the cost of labeling a non-player a player, C(P|N),
multiply it by the conditional probability, P(P|N), of incorrectly classifying this
non-player as a player, given this individual’s values for income and education, and multiply by
the probability of observing non-players in the population, P(N), we have the first
component of misclassification: C(P|N)*P(P|N)*P(N). Adding the other expected cost of
misclassification, we have the total expected cost, E(C), of misclassification:
E(C) = C(P|N)*P(P|N)*P(N) + C(N|P)*P(N|P)*P(P). (14)
If the two costs of misclassification are equal, i.e. C(P|N) = C(N|P), then, noting that
there are 23 non-players out of 100, or about one in four in the population, so that
P(N) = 1/4 and P(P) = 3/4, the expected cost is
E(C) = C(P|N)*P(P|N)*(1/4) + C(N|P)*P(N|P)*(3/4). (15)
We could weight the expected costs of the two kinds of misclassification equally by setting
the probability of classifying a non-player as a player to ¾ and the probability of classifying
a player (coded one in the linear probability model) as a non-player to ¼,
i.e. setting P(P|N) = ¾ and P(N|P) = ¼:
E(C) = C(P|N)*(3/4)*(1/4) + C(N|P)*(1/4)*(3/4). (16)
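The expected cost calculation is simple enough to sketch in code. The function and argument names below are hypothetical, and the common cost is set to one purely for illustration.

```python
def expected_misclassification_cost(cost_pn, cost_np, prob_pn, prob_np, prior_n):
    """Total expected cost: C(P|N)*P(P|N)*P(N) + C(N|P)*P(N|P)*P(P).
    Returns the two components and their sum."""
    prior_p = 1.0 - prior_n
    first = cost_pn * prob_pn * prior_n      # non-player labeled a player
    second = cost_np * prob_np * prior_p     # player labeled a non-player
    return first, second, first + second

# Equal costs (set to 1 here), P(N) = 1/4, P(P|N) = 3/4, P(N|P) = 1/4
first, second, total = expected_misclassification_cost(1.0, 1.0, 0.75, 0.25, 0.25)
```

With these values the two components are equal, at 3/16 each, which is the sense in which the two kinds of misclassification are weighted equally.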
This is equivalent to setting the fitted value of Bern to ¾, and classifying an individual as
a player if the individual's fitted probability is greater than ¾, i.e. if $\widehat{\text{Bern}}$ > ¾, where
Bern = ¾ = 1.390 – 0.0216*education – 0.0105*income, (17)
drawing on Table 7. Thus the discriminant function or decision rule line in
education-income space is, rearranging Eq. (17):
Education = 29.63 – 0.486*income, (18)
which is illustrated in Figure 9.
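The rearrangement from Eq. (17) to Eq. (18), and the resulting classification rule, can be sketched as follows. Function names are hypothetical; the coefficients are the rounded Table 7 values used in Eq. (17), and the group means come from Tables 3 and 4.

```python
def decision_rule_line(const, b_educ, b_income, threshold):
    """Solve threshold = const + b_educ*education + b_income*income for education.
    Returns (intercept, slope) of education as a linear function of income."""
    intercept = (threshold - const) / b_educ
    slope = -b_income / b_educ
    return intercept, slope

def is_player(education, income, threshold=0.75,
              const=1.390, b_educ=-0.0216, b_income=-0.0105):
    """Classify as a player when the fitted value of Bern exceeds the threshold."""
    return const + b_educ * education + b_income * income > threshold

intercept, slope = decision_rule_line(1.390, -0.0216, -0.0105, 0.75)
```

This reproduces the intercept of about 29.63 and the slope of about -0.486 in Eq. (18). At the group means, `is_player(15.57, 47.57)` is False for the non-players and `is_player(11.95, 28.55)` is True for the players. Lowering the threshold to 0.5 raises the intercept to about 41.2, which moves the line to the right.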
Note that five non-players are misclassified as well as fourteen players, for a total
of nineteen. You could shift the line to the right, misclassifying fewer players but more
non-players. If Bern were set to 0.5, shifting the line to the right, one player would be
misclassified, but thirteen non-players would be misclassified, for a total of fourteen.
--------------------------------------------------------------------------------
Figure 9: Lottery: Players and Non-Players vs. Education and Income
(horizontal axis: Income ($000); vertical axis: Education (Years);
legend: Non-Players, Players, Mean: Non-Players, Mean: Players;
discriminant function or decision rule: Bern = ¾ = 1.39 – 0.0216*education – 0.0105*income)