Simple Linear Regression: Linear Regression Model, Prediction, Limitation, Correlation


Post on 22-Dec-2015





  • Slide 1
  • Simple Linear Regression: linear regression model, prediction, limitation, correlation.
  • Slide 2
  • Example: Computer Repair. A company markets and repairs small computers. How fast (Time) an electronic component (Computer Unit) can be repaired is very important to the efficiency of the company. The variables in this example are Time and Units.
  • Slide 3
  • Hmm... How long will it take me to repair this unit? Goal: to predict the length of repair Time for a given number of computer Units.
  • Slide 4
  • Computer Repair Data

        Units  Mins  |  Units  Mins
          1     23   |    6     97
          2     29   |    7    109
          3     49   |    8    119
          4     64   |    9    149
          4     74   |    9    145
          5     87   |   10    154
          6     96   |   10    166
  • Slide 5
  • Graphical Summary of Two Quantitative Variables: a scatterplot of the response variable against the explanatory variable. What is the overall (average) pattern? What is the direction of the pattern? How much do data points vary from the overall (average) pattern? Are there any potential outliers?
  • Slide 6
  • Summary for Computer Repair Data. Scatterplot (Time vs Units), some simple conclusions: Time is linearly related to computer Units. (The length of) Time increases as (the number of) Units increases. Data points are close to the line. There are no potential outliers.
  • Slide 7
  • Numerical Summary of Two Quantitative Variables: regression model, correlation.
  • Slide 8
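  • Linear Regression Model. Y: the response variable; X: the explanatory variable. $Y = b_0 + b_1 X + \text{error}$. [Figure: a line on X-Y axes, with braces marking the intercept $b_0$ and the slope $b_1$ (the change in Y per 1-unit increase in X).]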
  • Slide 9
  • Linear Regression Model. The regression line models the relationship between X and Y on average.
  • Slide 10
  • Prediction. $\hat{Y}$: the predicted value of Y for a given X value. Regression equation: $\hat{Y} = b_0 + b_1 X$. E.g., how long will it take to repair 3 computer units?
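
The prediction question above can be worked through numerically. The following is a minimal sketch that assumes the line is fitted by ordinary least squares (the slides do not name the fitting method); the helper `fit_line` is a hypothetical name introduced here for illustration, and the data are the 14 (Units, Mins) pairs from the Computer Repair Data slide.

```python
# Computer Repair data from the slides: (units, minutes) for 14 repairs.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]


def fit_line(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x.

    Assumption: ordinary least squares; the slides do not state the method.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = fit_line(units, minutes)
predicted_3 = b0 + b1 * 3  # predicted repair time (minutes) for 3 units

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"Predicted time for 3 units: {predicted_3:.1f} minutes")
```

Under these assumptions the fitted line is roughly Time = 4.16 + 15.51 × Units, which gives about 50.7 minutes for 3 units.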
  • Slide 11
  • The Limitation of the Regression Equation. The regression equation cannot be used to predict the Y value for X values that are (far) beyond the range in which the data were observed. E.g., predicting weight (WT) from height (HT): given an HT of 40, the regression equation gives a WT of $-205 + 5 \times 40 = -5$ pounds!
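
One way to guard against this kind of extrapolation is to refuse predictions outside the observed range of X. The sketch below uses the slide's equation WT = -205 + 5 × HT; the function name `predict_weight` and the observed range of 60 to 75 are assumptions made purely for illustration, since the slides do not give the range of the height data.

```python
def predict_weight(height, observed_range=(60.0, 75.0)):
    """Predict WT from HT with the slide's equation WT = -205 + 5 * HT.

    observed_range is a HYPOTHETICAL range of heights in the data (the
    slides do not state it). Heights outside it are rejected rather than
    extrapolated.
    """
    low, high = observed_range
    if not low <= height <= high:
        raise ValueError(
            f"HT={height} is outside the observed range {observed_range}"
        )
    return -205 + 5 * height


print(predict_weight(65))   # inside the assumed range: 120
print(-205 + 5 * 40)        # naive extrapolation from the slide: -5

try:
    predict_weight(40)
except ValueError as err:
    print("Refused:", err)
```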
  • Slide 12
  • The Unpredicted Part. The value $Y - \hat{Y}$ (the observed value minus the predicted value) is the part that the regression equation (model) cannot predict, and it is called the residual.
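
To make the residual concrete, the sketch below computes observed minus predicted Time for each repair in the Computer Repair data. It assumes a least-squares fit (not stated in the slides); all variable names are illustrative.

```python
# Computer Repair data from the slides.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n

# Least-squares slope and intercept (assumed fitting method).
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
sxx = sum((x - x_bar) ** 2 for x in units)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual = observed Y minus predicted Y.
fitted = [b0 + b1 * x for x in units]
residuals = [y - y_hat for y, y_hat in zip(minutes, fitted)]

for x, y, e in zip(units, minutes, residuals):
    print(f"units={x:2d}  observed={y:3d}  residual={e:+.2f}")

# For a least-squares line with an intercept, the residuals sum to zero
# (up to floating-point rounding).
print(f"sum of residuals = {sum(residuals):.6f}")
```

Positive residuals mark repairs that took longer than the line predicts, and negative residuals mark repairs that were faster.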
  • Slide 13
  • [Figure: the residual marked with a brace on the regression plot.]
  • Slide 14
  • Correlation between X and Y. X and Y might be related to each other in many ways: linear or curved.
  • Slide 15
  • Examples of Different Levels of Correlation. r = .98: strong linearity. r = .71: medium linearity.
  • Slide 16
  • Examples of Different Levels of Correlation. r = -.09: nearly uncorrelated. r = .00: curved (nonlinear) relationship.
  • Slide 17
  • (Pearson) Correlation Coefficient of X and Y. A measurement of the strength of the LINEAR association between X and Y. The correlation coefficient of X and Y is denoted r.
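
For reference, a standard way to write the Pearson correlation coefficient is given below. The notation is an assumption and may differ from the slide's: $n$ is the number of observations, $\bar{x}$ and $\bar{y}$ are the sample means, and $s_x$ and $s_y$ are the sample standard deviations (with divisor $n - 1$).

```latex
r \;=\; \frac{1}{n-1} \sum_{i=1}^{n}
        \left( \frac{x_i - \bar{x}}{s_x} \right)
        \left( \frac{y_i - \bar{y}}{s_y} \right)
  \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
             {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \;
                    \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The second form uses only the sums of squared deviations and cross-products, which are the column totals of a hand-computation table.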
  • Slide 18
  • Correlation Coefficient of X and Y. $-1 \le r \le 1$. The magnitude of r measures the strength of the linear association between X and Y. The sign of r indicates the direction of the association: negative (-) or positive (+).
  • Slide 19
  • Correlation Coefficient. When r is almost 0, the best-fitting line is almost horizontal, so the value of X will not change our prediction of Y. When r is almost 1, a line fits the data points almost perfectly.
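
As an illustration of the magnitude and sign of r, the sketch below computes the Pearson correlation for the Computer Repair data and then for the same data with Time negated. The helper name `pearson_r` is introduced here for illustration only.

```python
from math import sqrt

# Computer Repair data from the slides.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(units, minutes)
# Negating Y flips the direction of the association but not its strength.
r_flipped = pearson_r(units, [-m for m in minutes])

print(f"r (Units vs Time)          = {r:.4f}")
print(f"r (Units vs negated Time)  = {r_flipped:.4f}")
```

For these data r is about 0.994, a very strong positive linear association, and negating Time gives the same magnitude with a negative sign.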
  • Slide 20
  • Goodness of Fit of the SLR Model. For a single data point: the residual. For the whole dataset: $R^2$. $R^2$ ($= r^2$) is the proportion of variation in Y explained by (the variation in) X.
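
The identity $R^2 = r^2$ can be checked numerically. The sketch below assumes a least-squares fit and computes $R^2$ as one minus the ratio of residual variation to total variation in Y, then compares it with the squared correlation; the variable names are illustrative.

```python
# Computer Repair data from the slides.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n

sxx = sum((x - x_bar) ** 2 for x in units)
syy = sum((y - y_bar) ** 2 for y in minutes)  # total variation in Y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))

# Least-squares line (assumed fitting method).
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Variation left unexplained by the line: sum of squared residuals.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(units, minutes))

r_squared = 1 - sse / syy          # proportion of variation explained
r = sxy / (sxx * syy) ** 0.5       # Pearson correlation

print(f"R^2 from residuals = {r_squared:.4f}")
print(f"r^2 from r         = {r ** 2:.4f}")
```

Both quantities come out at about 0.987, so roughly 98.7% of the variation in repair Time is explained by Units.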
  • Slide 21
  • Table for Computing Mean, St. Deviation, and Corr. Coef. [Table: one row per observation i = 1, 2, ..., n, followed by a Total row.]
  • Slide 22
  • Example: Computer Repair Time
  • Slide 23
  • Exercise
    (1) Fill in the following table, then compute the mean and st. deviation of Y and X.
    (2) Compute the corr. coef. of Y and X.
    (3) Draw a scatterplot.

        i = 1:   -.3   .09   .1   -.9   .81   .27
        i = 2:   -.2   .04   .4   -.6   .36   .12
        i = 3:   -.1   .01   .7
        Total:    0    *6.0*
  • Slide 24
  • The Influence of Outliers. The slope becomes bigger (toward outliers). The r value becomes smaller (less linear).
  • Slide 25
  • The Influence of Outliers. The slope becomes clearer (toward outliers). The |r| value becomes larger (more linear: from 0.159 to 0.935).
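
Both outlier effects described on the last two slides can be reproduced with small made-up datasets. The sketch below is purely illustrative: the data points are hypothetical (they are not the data behind the slides, so the values 0.159 and 0.935 are not reproduced), and `slope_and_r` is a helper name introduced here.

```python
from math import sqrt


def slope_and_r(points):
    """Return (least-squares slope, Pearson r) for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Case 1 (hypothetical data): a perfectly linear cloud plus one point far
# above the line. The slope is pulled toward the outlier and r drops.
linear = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
slope_a, r_a = slope_and_r(linear)
slope_b, r_b = slope_and_r(linear + [(6, 20)])
print(f"Case 1 without outlier: slope={slope_a:.2f}, r={r_a:.3f}")
print(f"Case 1 with outlier:    slope={slope_b:.2f}, r={r_b:.3f}")

# Case 2 (hypothetical data): a cloud with no linear trend plus one point
# far away in both X and Y. The outlier alone creates a large |r|.
cloud = [(1, 2), (2, 1), (3, 3), (4, 1), (5, 2)]
slope_c, r_c = slope_and_r(cloud)
slope_d, r_d = slope_and_r(cloud + [(20, 20)])
print(f"Case 2 without outlier: slope={slope_c:.2f}, r={r_c:.3f}")
print(f"Case 2 with outlier:    slope={slope_d:.2f}, r={r_d:.3f}")
```

In both cases a single point changes the fitted slope substantially, which is why a scatterplot should be checked for outliers before r or the regression line is interpreted.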