Simple Linear Regression
Deterministic Relationship: the value of y (dependent variable) is completely determined by
the value of x (independent variable),
like an equation in the form y = 2x + 10 or f(x) = 5x - 1.
However, in most situations, the variables of interest are not deterministically related!
For example, the value of y = 1st year college GPA is
certainly not determined solely by x = high school
GPA.
Probabilistic Model
A description of the relation between 2 variables x and y that are not deterministically related.
The general form allows y to be larger or smaller than f(x) by a random amount, e.
Let x* denote the value of x….
$y > f(x^*)$ if $e > 0$
$y < f(x^*)$ if $e < 0$
$y = f(x^*)$ if $e = 0$
Without the random deviation e, all observed (x, y) points would fall exactly on the population regression line. The inclusion of e in the model equation recognizes that points will deviate from the line.
Simple Linear Regression Model:
$y = \alpha + \beta x + e$
Assumptions about the distribution of e:
Mean = 0.
St. Dev. ($\sigma_e$) is the same for any value of x.
Distribution of e at any x value is normal.
Random deviations associated with different observations are independent of one another.
Slope
Average change in y associated with a 1 unit increase in x.
Point estimate of the slope is b. (Population is $\beta$.)
Y-intercept's point estimate is a. (Population is $\alpha$.)
$y = \alpha + \beta x + e$, where $y = \alpha + \beta x$ is the population regression line.
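Summary
The point estimates of the slope and y-intercept are $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$,
where $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$
and $S_{xx} = \sum (x - \bar{x})^2$.
The estimated regression line is then just the least-squares line $\hat{y} = a + bx$.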
X* denotes a specified value of the predictor variable x ….
So $a + bx^*$ has 2 different interpretations:
It is a point estimate of the true mean y value when x = x*.
It is a point predictor of an individual y value that would be observed when x = x*.
Find the point estimate of the mean y-value for the following data, where x = mother's age (in years) and y = birth weight (in grams):
Age (x) 15 17 18 15 16 19 17 16 18 20
Weight (y) 2289 3393 3271 2648 2897 3327 2970 2535 3138 3573
So what’s the point estimate for an 18 year old mom?
Point estimate and point prediction are identical – only the interpretation is different.
Prediction – weight of a single baby whose mom is 18
Estimate – average weight of all babies born to 18 year-olds
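The question about the 18 year old mom can be checked away from the calculator. Below is a minimal Python sketch, assuming the standard least-squares formulas (slope = Sxy/Sxx, intercept = y-bar minus slope times x-bar); the function name `least_squares` and the variable names are only illustrative and are not from the slides.

```python
# Data from the slide: mother's age (x, years) and birth weight (y, grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 20]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]


def least_squares(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope estimate
    a = y_bar - b * x_bar    # y-intercept estimate
    return a, b


a, b = least_squares(ages, weights)
y_hat_18 = a + b * 18  # point estimate AND point prediction at x* = 18

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"a + b(18) = {y_hat_18:.1f} g")
```

With this data the slope comes out near 220 grams per year of age and the value at x* = 18 is about 3202 g. The same number serves as the estimate of the mean and as the prediction for one baby.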
Answer the following:
Explain the slope in context of the problem
Explain the y-intercept in context of the problem.
Find SSResid.
On the calculator, every time you calculate a linear regression, it also calculates the residuals. Put them in list 3, square them, and add up the list.
$SSResid = \sum (y - \hat{y})^2$
Point estimate of $\sigma_e$ is $s_e$.
It represents the typical deviation in the y-variable from the least squares line.
$s_e = \sqrt{\dfrac{SSResid}{n - 2}}$
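The calculator steps for SSResid (store the residuals, square them, add them up) and the estimate of the standard deviation of e can be mirrored in a short Python sketch. It assumes the usual formula sqrt(SSResid / (n - 2)) and refits the least-squares line to the age and birth-weight data so that it runs on its own; all variable names are illustrative.

```python
import math

# Data from the slide: mother's age (years) and birth weight (grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 20]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))
     / sum((x - x_bar) ** 2 for x in ages))
a = y_bar - b * x_bar

# Residual = observed y minus predicted y (what the calculator keeps in a list).
residuals = [y - (a + b * x) for x, y in zip(ages, weights)]

# Square the residuals and add them up.
ss_resid = sum(r ** 2 for r in residuals)

# Estimated standard deviation about the line, with n - 2 in the denominator.
s_e = math.sqrt(ss_resid / (n - 2))

print(f"SSResid = {ss_resid:.1f}")
print(f"s_e = {s_e:.1f} g")
```

For this data the typical deviation from the line works out to roughly 205 g.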
Find the residual for a mother who is 19.
Find the probability that a 19 year old mother has a baby that is more than 3000 g.
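One way to work the two questions about the 19 year old mother is sketched below. It relies on assumptions that should be stated openly: the residual uses the one 19 year old in the data (birth weight 3327 g), and the probability treats the fitted value a + b(19) and the estimated standard deviation about the line as if they were the true mean and standard deviation of a normal distribution. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Data from the slide: mother's age (years) and birth weight (grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 20]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))
     / sum((x - x_bar) ** 2 for x in ages))
a = y_bar - b * x_bar

# Estimated standard deviation about the least-squares line.
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, weights))
s_e = math.sqrt(ss_resid / (n - 2))

# Residual for the 19 year old mother: observed minus predicted.
y_hat_19 = a + b * 19
residual_19 = 3327 - y_hat_19

# Assumption: birth weight at age 19 is Normal(mean = y_hat_19, sd = s_e).
prob_over_3000 = 1 - NormalDist(mu=y_hat_19, sigma=s_e).cdf(3000)

print(f"predicted weight at 19 = {y_hat_19:.1f} g")
print(f"residual = {residual_19:.1f} g")
print(f"P(weight > 3000 g) is about {prob_over_3000:.3f}")
```

The negative residual says this baby weighed a little less than the line predicts for a 19 year old mother.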
Coefficient of determination ($r^2$)
It's the proportion of variation in the y-variable that can be explained by the least squares line.
$SSTot = \sum (y - \bar{y})^2 = \sum y^2 - \dfrac{(\sum y)^2}{n}$
$r^2 = 1 - \dfrac{SSResid}{SSTot}$
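As a check on the coefficient of determination for the age and birth-weight data, here is a minimal Python sketch. It assumes the total sum of squares is the sum of squared deviations of y from its mean (computed two equivalent ways) and that r squared is 1 minus SSResid over that total; the variable names are illustrative.

```python
# Data from the slide: mother's age (years) and birth weight (grams).
ages = [15, 17, 18, 15, 16, 19, 17, 16, 18, 20]
weights = [2289, 3393, 3271, 2648, 2897, 3327, 2970, 2535, 3138, 3573]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(weights) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, weights))
     / sum((x - x_bar) ** 2 for x in ages))
a = y_bar - b * x_bar

# Variation left over after fitting the line.
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(ages, weights))

# Total variation in y: definition and computational shortcut.
ss_tot = sum((y - y_bar) ** 2 for y in weights)
ss_tot_shortcut = sum(y ** 2 for y in weights) - sum(weights) ** 2 / n

r_squared = 1 - ss_resid / ss_tot

print(f"SSTot = {ss_tot:.1f} (shortcut: {ss_tot_shortcut:.1f})")
print(f"r^2 = {r_squared:.3f}")
```

Here about 78% of the variation in birth weight is explained by the linear relationship with mother's age.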
Homework
Worksheet