Session 4

Upload: anna-ward

Post on 26-Dec-2015


TRANSCRIPT

  • Slide 1
  • Session 4
  • Slide 2
  • Applied Regression -- Prof. Juran. Outline for Session 4: Summary Measures for the Full Model; Top Section of the Output; Interval Estimation; More Multiple Regression; Movers; Nonlinear Regression; Insurance
  • Slide 3
  • Top Section: Summary Statistics
  • Slide 6
  • Top Section: Summary Statistics
  • Slide 8
  • As stated earlier, $R^2$ is closely related to the correlation between X and Y; indeed, in simple regression, $R^2 = r_{X,Y}^2$. Furthermore, $R^2$, and thus $r_{X,Y}$, is closely related to the slope of the regression line via $b_1 = r_{X,Y} \, s_Y / s_X$, where $s_X$ and $s_Y$ are the sample standard deviations of X and Y. Thus, testing the significance of the slope, testing the significance of $R^2$, and testing the significance of $r_{X,Y}$ are essentially equivalent.
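The equivalence described on this slide can be checked numerically. The following is a minimal sketch in plain Python; the function name `regression_links` and the data are made up for illustration and do not come from the course materials.

```python
import math

def regression_links(x, y):
    """Compute r, R^2, the slope, and two t statistics for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = s_xy / s_xx                       # least-squares slope
    b0 = y_bar - b1 * x_bar                # least-squares intercept
    r = s_xy / math.sqrt(s_xx * s_yy)      # sample correlation
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / s_yy             # coefficient of determination

    s_x = math.sqrt(s_xx / (n - 1))        # sample standard deviations
    s_y = math.sqrt(s_yy / (n - 1))
    sigma_hat = math.sqrt(sse / (n - 2))   # standard error of the regression

    t_slope = b1 / (sigma_hat / math.sqrt(s_xx))            # test of the slope
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # test of the correlation
    return {"b1": b1, "r": r, "r_squared": r_squared,
            "s_x": s_x, "s_y": s_y, "t_slope": t_slope, "t_corr": t_corr}

# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6]
y = [2.3, 3.8, 6.4, 7.7, 10.5, 11.9]
out = regression_links(x, y)
```

In this sketch `out["r_squared"]` matches `out["r"] ** 2`, `out["b1"]` matches `r * s_y / s_x`, and the two t statistics coincide, which is why the three significance tests give the same answer in simple regression.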
  • Slide 15
  • Interval Estimation
  • Slide 16
  • An Image of the Residuals. The observed values: $(x_i, y_i)$. The fitted values: $\hat{y}_i = b_0 + b_1 x_i$. The residuals: $e_i = y_i - \hat{y}_i$. Recall: The regression line passes through the data so that the sum of squared residuals is as small as possible.
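As a rough illustration of the least-squares property recalled above, the sketch below fits a line and compares its sum of squared residuals with that of a shifted line. The helper names and the data are assumptions made up for this example.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up illustrative data (not the course data)
x = [1, 2, 2, 3, 4, 5]
y = [9.0, 14.5, 16.0, 21.0, 30.5, 33.0]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse_fitted = sum_squared_residuals(x, y, b0, b1)
sse_shifted = sum_squared_residuals(x, y, b0 + 0.5, b1)  # any other line does worse
```

With an intercept in the model the residuals sum to zero, and moving the line away from the fitted one (here by shifting the intercept) can only increase the sum of squared residuals.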
  • Slide 17
  • Regression and Prediction. Regression lines are frequently used for predicting future values of Y given future, conjectural, or speculative values of X. Suppose we posit a future value of X, say $x_0$. The predicted value, $\hat{y}_0$, is $\hat{y}_0 = b_0 + b_1 x_0$.
  • Slide 18
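  • Under our assumptions this is an unbiased estimate of Y given that $x = x_0$, regardless of the value of $x_0$. Let $\mu_0 = E(Y(x_0))$ and thus, since the estimate is unbiased, $\hat{\mu}_0 = b_0 + b_1 x_0$. However, be alert to the fact that this estimate (prediction) of a future value has a standard error of $\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$, where $\hat{\sigma}$ is the standard error of the regression. Furthermore, the standard error of the prediction of the expected (mean) value of Y given $x = x_0$ is $\hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$.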
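The two standard errors differ only by the leading 1 under the square root. A short sketch of why, under the usual simple-regression assumptions (independent errors with constant variance $\sigma^2$); the symbol $\varepsilon_0$ for the error of the new observation is introduced here for illustration:

```latex
\begin{align*}
\operatorname{Var}(\hat{\mu}_0)
  &= \sigma^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right), \\
Y(x_0) - \hat{y}_0
  &= \varepsilon_0 - (\hat{\mu}_0 - \mu_0), \\
\operatorname{Var}\bigl(Y(x_0) - \hat{y}_0\bigr)
  &= \operatorname{Var}(\varepsilon_0) + \operatorname{Var}(\hat{\mu}_0)
   = \sigma^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right).
\end{align*}
```

The variances add because the new error $\varepsilon_0$ is independent of the data used to fit the line. Replacing $\sigma$ with its estimate $\hat{\sigma}$ gives the two standard errors: predicting one future value carries the uncertainty in the fitted line plus the scatter of an individual observation around it.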
  • Slide 19
  • From these facts it follows that a 2-sided confidence interval on the expected value of Y given $x = x_0$, $\mu_0$, is given by $\hat{\mu}_0 \pm t_{n-2,\,\alpha/2}\,\hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$.
  • Slide 20
  • A 2-sided prediction interval on future individual values of Y given $x = x_0$, $y_0$, is given by $\hat{y}_0 \pm t_{n-2,\,\alpha/2}\,\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$.
  • Slide 21
  • Confidence Interval on $E(Y(x_0))$; Prediction Interval on $Y(x_0)$
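A minimal sketch of how the two intervals might be computed by hand follows. The function names, the data, and the choice to pass the t critical value in as an argument (rather than compute it) are assumptions for illustration; 2.776 is the standard table value for a 95% two-sided interval with 4 degrees of freedom.

```python
import math

def fit_line(x, y):
    """Least-squares intercept b0 and slope b1, plus x-bar and S_xx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1, x_bar, s_xx

def intervals(x, y, x0, t_crit):
    """Confidence interval for the mean and prediction interval for one value at x0."""
    n = len(x)
    b0, b1, x_bar, s_xx = fit_line(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))          # standard error of the regression
    y0_hat = b0 + b1 * x0                         # point prediction
    core = 1 / n + (x0 - x_bar) ** 2 / s_xx       # term shared by both intervals
    half_ci = t_crit * sigma_hat * math.sqrt(core)
    half_pi = t_crit * sigma_hat * math.sqrt(1 + core)
    return {"fit": y0_hat,
            "ci": (y0_hat - half_ci, y0_hat + half_ci),
            "pi": (y0_hat - half_pi, y0_hat + half_pi)}

# Made-up illustrative data (not the Movers data): n = 6, so n - 2 = 4
rooms = [1, 2, 2, 3, 4, 5]
hours = [9.0, 14.5, 16.0, 21.0, 30.5, 33.0]
T_CRIT_95_DF4 = 2.776  # table value, 95% two-sided, 4 degrees of freedom

at_mean = intervals(rooms, hours, sum(rooms) / len(rooms), T_CRIT_95_DF4)
at_edge = intervals(rooms, hours, 5, T_CRIT_95_DF4)
```

Both intervals are centered on the same fitted value, the prediction interval always contains the confidence interval, and both are narrowest when `x0` equals the mean of the observed x values.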
  • Slide 22
  • Note that both of these intervals widen as $x_0$ moves away from $\bar{x}$ (the squared half-width is a parabolic function of $x_0$), have their minimum interval width at $x_0 = \bar{x}$, and their widths depend on $\hat{\sigma}$ and on $S_{xx}$. The sum of squared x term appears so often in regression equations that it is useful to use the abbreviation $S_{xx} = \sum_i (x_i - \bar{x})^2$. Note that $S_{xx}$ can easily be obtained from the variance as computed in most spreadsheets or statistics packages: $S_{xx} = (n - 1)\,s_X^2$.
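The shortcut for the sum of squared x term can be sketched in a few lines, assuming the package reports the sample variance (the version with n - 1 in the denominator, as `statistics.variance` does). The function names and data are made up for illustration.

```python
import statistics

def s_xx_direct(x):
    """Sum of squared deviations of x from its mean."""
    x_bar = sum(x) / len(x)
    return sum((xi - x_bar) ** 2 for xi in x)

def s_xx_from_variance(x):
    """Same quantity recovered from the sample variance: (n - 1) * s^2."""
    return (len(x) - 1) * statistics.variance(x)

rooms = [1, 2, 2, 3, 4, 5]  # made-up illustrative data
direct = s_xx_direct(rooms)
shortcut = s_xx_from_variance(rooms)
```

If a tool reports the population variance instead (dividing by n), the multiplier would be n rather than n - 1.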
  • Slide 23
  • An Image of the Prediction and Confidence Intervals
  • Slide 27
  • All-Around Movers. The management question here is whether historical data can be used to create a cost estimation model for intra-Manhattan apartment moves. The dependent variable is the number of labor hours used, which is a proxy for total cost in the moving business. There are two potential independent variables: volume (in cubic feet) and the number of rooms in the apartment being vacated.
  • Slide 28
  • Summary Statistics
  • Slide 33
  • The Most Obvious Simple Regression
  • Slide 35
  • An Alternative Simple Regression Model
  • Slide 37
  • A Multiple Regression Model
  • Slide 39
  • Preliminary Observations. Volume is the best single predictor, but perhaps not useful if customers are expected to collect these data and enter them on a web site. Rooms is a pretty good predictor (not as good as Volume), and may be more useful on a practical basis.
  • Slide 40
  • Preliminary Observations. The multiple regression model makes better predictions, but not much better than either of the simple regression models. The multiple regression model has problems with multicollinearity. Notice the lack of significance for the Rooms variable (and the strange coefficient).
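One common way to quantify the multicollinearity noted above is the variance inflation factor (VIF). The slides do not show this diagnostic, so the sketch below is an outside illustration: with exactly two predictors the VIF of each is 1 / (1 - r^2), where r is the correlation between them. The data are made up and are not the Movers data.

```python
import math

def correlation(u, v):
    """Sample correlation between two equal-length lists."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    return s_uv / math.sqrt(s_uu * s_vv)

def vif_two_predictors(u, v):
    """VIF shared by both predictors in a two-predictor regression."""
    r = correlation(u, v)
    return 1 / (1 - r ** 2)

# Made-up illustrative data: bigger apartments have more rooms AND more volume
rooms = [1, 2, 2, 3, 3, 4, 5, 5]
volume = [250, 400, 450, 600, 580, 820, 1000, 1050]  # cubic feet

r_rooms_volume = correlation(rooms, volume)
vif = vif_two_predictors(rooms, volume)
```

When the two predictors are this strongly correlated, the VIF is large, the standard errors of both coefficients are inflated, and one coefficient can lose significance or take an odd sign even though each variable predicts well on its own.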
  • Slide 41
  • Prediction intervals correspond to the estimated number of hours for one specific move, given one specific value for the number of rooms. Confidence intervals correspond to the estimated population average number of hours over a large number of moves, all with the same number of rooms.
  • Slide 42
  • Validity of the Rooms Model
  • Slide 43
  • Analysis of the Residuals
  • Slide 45
  • Comments on the Rooms Model. Good explanatory power. Statistically significant. Points fit the line well. But: small apartments tend to be over-estimated; large apartments tend to be badly estimated, especially on the high side. Maybe could use more data. Maybe nonlinear.
  • Slide 46
  • A Non-linear Model? Note: If $A = e^B$, then $\ln(A) = B$.
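The logarithm identity above is what lets an exponential model be fitted with ordinary linear regression: if y = exp(a + b x), then ln(y) = a + b x. The sketch below assumes that particular model form, which the transcript does not spell out; the function names and data are made up for illustration.

```python
import math

def fit_exponential(x, y):
    """Fit y = exp(a + b*x) by regressing ln(y) on x with least squares."""
    log_y = [math.log(yi) for yi in y]   # requires every y to be positive
    n = len(x)
    x_bar = sum(x) / n
    ly_bar = sum(log_y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))
    b = s_xy / s_xx
    a = ly_bar - b * x_bar
    return a, b

def predict_exponential(a, b, x0):
    """Transform the fitted log-scale prediction back to the original scale."""
    return math.exp(a + b * x0)

# Made-up data generated exactly from y = exp(0.5 + 0.3 x), so the fit is exact
x = [1, 2, 3, 4, 5]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

a, b = fit_exponential(x, y)
y_at_6 = predict_exponential(a, b, 6)
```

With real data the fit is no longer exact, and residuals should be compared on the original scale (hours) when judging the exponential model against the linear one.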
  • Slide 50
  • Residual Analysis. [Figure: histograms of residuals (frequency vs. residual error) for the linear model and the exponential model.]
  • Slide 51
  • [Figure: residual errors (hours) vs. predicted hours, for the linear model and the exponential model.]
  • Slide 52
  • [Figure: residual errors (hours) vs. rooms, for the linear model and the exponential model.]
  • Slide 53
  • Conclusions. Regression analysis is technically easy. Creating a reliable model is subject to creativity and judgment. The Rooms model (either linear or otherwise) is reasonably useful for this managerial application. The most serious estimation problem is when we try to make predictions for large apartments. What about a separate model for very large apartments?
  • Slide 55
  • Insurance Case
  • Slide 57
  • Insurance Case. The regression with the exponential equation has a higher $R^2$. One "real world" explanation: companies that generate very high ROAEs will be rewarded with higher valuation multiples. The relationship might be exponential as opposed to linear because an investment will compound at this higher ROAE. The primary driver for this is that Duck is an outlier in both dimensions: it has a VERY high P/B and ROAE.
  • Slide 60
  • Insurance Case. What is the implied P/B multiple and implied total value of Circle? Using the fitted regression equation to calculate the implied P/B multiple and plugging in 14.2 for x, we get y = 1.387481. The book value of $2.5 billion times the implied P/B multiple of 1.387481 gives an estimated total value of $3.4687 billion.
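The last step of the Circle calculation is a single multiplication, sketched below. The function name is made up for illustration; the inputs are the figures quoted on the slide, and the regression equation that produces the multiple is not reproduced here.

```python
def implied_total_value(book_value_bn, pb_multiple):
    """Implied total value (in $ billions) = book value times implied P/B multiple."""
    return book_value_bn * pb_multiple

# Figures quoted on the slide: book value $2.5 billion, implied P/B of 1.387481
circle_value_bn = implied_total_value(2.5, 1.387481)
print(f"Implied total value: ${circle_value_bn:.4f} billion")  # $3.4687 billion
```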
  • Slide 61
  • Insurance Case. 3. Abe has announced that it will be making an acquisition. It is trying to decide whether to pay in stock or in cash. a. If Abe pays with stock, the pro-forma ROAE of the combined company will be 12.2% and the pro-forma book value will be $16.5 billion. What is the implied P/B multiple and implied total value of the pro-forma company? b. If Abe pays with cash, the pro-forma ROAE of the combined company will be 15.5% and the pro-forma book value will be $11.5 billion. What is the implied P/B multiple and implied total value of the pro-forma company? c. If the goal is to maximize the pro-forma total value of the new company, how should Abe pay for the acquisition?
  • Slide 62
  • Insurance Case. Depending on which version of the equation we use, there are several possible results for the estimated P/B of the new company. Abe should pay in cash, since the total value would be $0.044527 billion higher than if Abe paid in stock.
  • Slide 63
  • Insurance Case. 4. Assume that before the acquisition, Abe has a book value of $11.5 billion and an ROAE of 12.8%. Abe will either issue $5 billion in stock or use $5 billion in cash to complete the acquisition. What incremental value, if any, is created in both the stock and cash scenarios described above?
  • Slide 64
  • Insurance Case. Abe's total value before the acquisition is determined by taking its ROAE of 12.8% and applying the regression equation, to get an implied P/B multiple of 1.189957x. Applying that to the total book value of $11.5 billion, we get an implied total value of $13.68451 billion. Adding in the $5 billion cost of the proposed acquisition, we get an adjusted value for Abe of $18.68451 billion. In both of the scenarios described in question 3 (stock and cash), the pro-forma total value would be LESS than $18.68451 billion. Thus, NO incremental value is created. (The exact result will vary depending on which model you use.)
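The hurdle used in this answer can be reproduced with a few lines of arithmetic. The function names are made up for illustration. The pro-forma values from question 3 are not reproduced in this transcript, so the `hypothetical_pro_forma_bn` figure below is a placeholder chosen only to show the comparison, not a result from the case.

```python
def implied_total_value(book_value_bn, pb_multiple):
    """Implied total value (in $ billions) = book value times implied P/B multiple."""
    return book_value_bn * pb_multiple

def incremental_value(pro_forma_value_bn, hurdle_bn):
    """Value created by the deal: pro-forma value minus (standalone value + deal cost)."""
    return pro_forma_value_bn - hurdle_bn

# Figures quoted on the slide
abe_standalone_bn = implied_total_value(11.5, 1.189957)  # about 13.68451
deal_cost_bn = 5.0
hurdle_bn = abe_standalone_bn + deal_cost_bn             # about 18.68451

# HYPOTHETICAL pro-forma value, for illustration only (not from the case)
hypothetical_pro_forma_bn = 18.0
created_bn = incremental_value(hypothetical_pro_forma_bn, hurdle_bn)
```

Any pro-forma total value below the hurdle gives a negative result, which is the sense in which no incremental value is created.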
  • Slide 68
  • Summary: Summary Measures for the Full Model; Top Section of the Output; Interval Estimation; More Multiple Regression; Movers; Nonlinear Regression; Insurance
  • Slide 69
  • For Session 5. Cigarettes: Do a full multiple regression model of the cigarette data (www.ilr.cornell.edu/~hadi/RABE/Data/P081.txt), and answer the questions. Cars: Do a multiple regression model of the cars data. Use just the quantitative independent variables; we'll talk next time about the qualitative ones.