
  • The methods of regression analysis are the most widely used statistical tools for discovering the relationships among variables. This classic text, with its emphasis on clear, thorough presentation of concepts and applications, offers a complete, easily accessible introduction to the fundamentals of regression analysis. Assuming only a basic knowledge of elementary statistics, Applied Regression Analysis, Third Edition focuses on the fitting and checking of both linear and nonlinear regression models, using small and large data sets, with pocket calculators or computers.

    This Third Edition features separate chapters on multicollinearity, generalized linear models, mixture ingredients, geometry of regression, robust regression, and resampling procedures. Extensive support materials include sets of carefully designed exercises with full or partial solutions and a series of true/false questions with answers. All data sets used in both the text and the exercises can be found on the book’s related FTP site.

    For analysts, researchers, and students in university, industrial, and government courses on regression, this text is an excellent introduction to the subject and an efficient means of learning how to use a valuable analytical tool. It will also prove an invaluable reference resource for applied scientists and statisticians.

    NORMAN R. DRAPER teaches in the Department of Statistics at the University of Wisconsin. HARRY SMITH is a former faculty member of the Mt. Sinai School of Medicine.

    Cover Design: Paul DiNovo


    FTP SITE NOW AVAILABLE

  • Applied Regression Analysis

  • WILEY SERIES IN PROBABILITY AND STATISTICS TEXTS AND REFERENCES SECTION

    Established by WALTER A. SHEWHART and SAMUEL S. WILKS

    Editors: Vic Barnett, Ralph A. Bradley, Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, David G. Kendall, David W. Scott, Bernard W. Silverman, Adrian F. M. Smith, Jozef L. Teugels, Geoffrey S. Watson; J. Stuart Hunter, Emeritus

    A complete list of the titles in this series appears at the end of this volume.

  • Applied Regression Analysis THIRD EDITION

    Norman R. Draper

    Harry Smith

    A Wiley-Interscience Publication

    JOHN WILEY & SONS, INC. New York . Chichester . Weinheim . Brisbane . Singapore . Toronto

  • This book is printed on acid-free paper.

    Copyright © 1998 by John Wiley & Sons, Inc. All rights reserved.

    Published simultaneously in Canada.

    No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008.

    Library of Congress Cataloging-in-Publication Data:

    Draper, Norman Richard.
      Applied regression analysis / N.R. Draper, H. Smith. - 3rd ed.
        p. cm. - (Wiley series in probability and statistics. Texts and references section)
      "A Wiley-Interscience publication."
      Includes bibliographical references (p. - ) and index.
      ISBN 0-471-17082-8 (acid-free paper)
      1. Regression analysis. I. Smith, Harry, 1923- . II. Title. III. Series.
      QA278.2.D7 1998
      519.5'36-dc21                                          97-17969
                                                             CIP

    Printed in the United States of America.

Contents

Preface, xiii

About the Software, xvii

0  Basic Prerequisite Knowledge, 1
   0.1  Distributions: Normal, t, and F, 1
   0.2  Confidence Intervals (or Bands) and t-Tests, 4
   0.3  Elements of Matrix Algebra, 6

1  Fitting a Straight Line by Least Squares, 15
   1.0  Introduction: The Need for Statistical Analysis, 15
   1.1  Straight Line Relationship Between Two Variables, 18
   1.2  Linear Regression: Fitting a Straight Line by Least Squares, 20
   1.3  The Analysis of Variance, 28
   1.4  Confidence Intervals and Tests for β0 and β1, 34
   1.5  F-Test for Significance of Regression, 38
   1.6  The Correlation Between X and Y, 40
   1.7  Summary of the Straight Line Fit Computations, 44
   1.8  Historical Remarks, 45
   Appendix 1A  Steam Plant Data, 46
   Exercises are in "Exercises for Chapters 1-3," 96

2  Checking the Straight Line Fit, 47
   2.1  Lack of Fit and Pure Error, 47
   2.2  Testing Homogeneity of Pure Error, 56
   2.3  Examining Residuals: The Basic Plots, 59
   2.4  Non-normality Checks on Residuals, 61
   2.5  Checks for Time Effects, Nonconstant Variance, Need for Transformation, and Curvature, 62
   2.6  Other Residuals Plots, 67
   2.7  Durbin-Watson Test, 69
   2.8  Reference Books for Analysis of Residuals, 70
   Appendix 2A  Normal Plots, 70
   Appendix 2B  MINITAB Instructions, 76
   Exercises are in "Exercises for Chapters 1-3," 96

3  Fitting Straight Lines: Special Topics, 79
   3.0  Summary and Preliminaries, 79
   3.1  Standard Error of Y, 80
   3.2  Inverse Regression (Straight Line Case), 83
   3.3  Some Practical Design of Experiment Implications of Regression, 86
   3.4  Straight Line Regression When Both Variables Are Subject to Error, 89
   Exercises for Chapters 1-3, 96

4  Regression in Matrix Terms: Straight Line Case, 115
   4.1  Fitting a Straight Line in Matrix Terms, 115
   4.2  Singularity: What Happens in Regression to Make X'X Singular? An Example, 125
   4.3  The Analysis of Variance in Matrix Terms, 127
   4.4  The Variances and Covariance of b0 and b1 from the Matrix Calculation, 128
   4.5  Variance of Y Using the Matrix Development, 130
   4.6  Summary of Matrix Approach to Fitting a Straight Line (Nonsingular Case), 130
   4.7  The General Regression Situation, 131
   Exercises for Chapter 4, 132

5  The General Regression Situation, 135
   5.1  General Linear Regression, 135
   5.2  Least Squares Properties, 137
   5.3  Least Squares Properties When ε ~ N(0, Iσ²), 140
   5.4  Confidence Intervals Versus Regions, 142
   5.5  More on Confidence Intervals Versus Regions, 143
   Appendix 5A  Selected Useful Matrix Results, 147
   Exercises are in "Exercises for Chapters 5 and 6," 169

6  Extra Sums of Squares and Tests for Several Parameters Being Zero, 149
   6.1  The "Extra Sum of Squares" Principle, 149
   6.2  Two Predictor Variables: Example, 154
   6.3  Sum of Squares of a Set of Linear Functions of Y's, 162
   Appendix 6A  Orthogonal Columns in the X Matrix, 165
   Appendix 6B  Two Predictors: Sequential Sums of Squares, 167
   Exercises for Chapters 5 and 6, 169

7  Serial Correlation in the Residuals and the Durbin-Watson Test, 179
   7.1  Serial Correlation in Residuals, 179
   7.2  The Durbin-Watson Test for a Certain Type of Serial Correlation, 181
   7.3  Examining Runs in the Time Sequence Plot of Residuals: Runs Test, 192
   Exercises for Chapter 7, 198

8  More on Checking Fitted Models, 205
   8.1  The Hat Matrix H and the Various Types of Residuals, 205
   8.2  Added Variable Plot and Partial Residuals, 209
   8.3  Detection of Influential Observations: Cook's Statistics, 210
   8.4  Other Statistics Measuring Influence, 214
   8.5  Reference Books for Analysis of Residuals, 214
   Exercises for Chapter 8, 215

9  Multiple Regression: Special Topics, 217
   9.1  Testing a General Linear Hypothesis, 217
   9.2  Generalized Least Squares and Weighted Least Squares, 221
   9.3  An Example of Weighted Least Squares, 224
   9.4  A Numerical Example of Weighted Least Squares, 226
   9.5  Restricted Least Squares, 229
   9.6  Inverse Regression (Multiple Predictor Case), 229
   9.7  Planar Regression When All the Variables Are Subject to Error, 231
   Appendix 9A  Lagrange's Undetermined Multipliers, 231
   Exercises for Chapter 9, 233

10  Bias in Regression Estimates, and Expected Values of Mean Squares and Sums of Squares, 235
    10.1  Bias in Regression Estimates, 235
    10.2  The Effect of Bias on the Least Squares Analysis of Variance, 238
    10.3  Finding the Expected Values of Mean Squares, 239
    10.4  Expected Value of Extra Sum of Squares, 240
    Exercises for Chapter 10, 241

11  On Worthwhile Regressions, Big F's, and R², 243
    11.1  Is My Regression a Useful One?, 243
    11.2  A Conversation About R², 245
    Appendix 11A  How Significant Should My Regression Be?, 247
    Exercises for Chapter 11, 250

12  Models Containing Functions of the Predictors, Including Polynomial Models, 251
    12.1  More Complicated Model Functions, 251
    12.2  Worked Examples of Second-Order Surface Fitting for k = 3 and k = 2 Predictor Variables, 254
    12.3  Retaining Terms in Polynomial Models, 266
    Exercises for Chapter 12, 272

13  Transformation of the Response Variable, 277
    13.1  Introduction and Preliminary Remarks, 277
    13.2  Power Family of Transformations on the Response: Box-Cox Method, 280
    13.3  A Second Method for Estimating λ, 286
    13.4  Response Transformations: Other Interesting and Sometimes Useful Plots, 289
    13.5  Other Types of Response Transformations, 290
    13.6  Response Transformations Chosen to Stabilize Variance, 291
    Exercises for Chapter 13, 294

14  "Dummy" Variables, 299
    14.1  Dummy Variables to Separate Blocks of Data with Different Intercepts, Same Model, 299
    14.2  Interaction Terms Involving Dummy Variables, 307
    14.3  Dummy Variables for Segmented Models, 311
    Exercises for Chapter 14, 317

15  Selecting the "Best" Regression Equation, 327
    15.0  Introduction, 327
    15.1  All Possible Regressions and "Best Subset" Regression, 329
    15.2  Stepwise Regression, 335
    15.3  Backward Elimination, 339
    15.4  Significance Levels for Selection Procedures, 342
    15.5  Variations and Summary, 343
    15.6  Selection Procedures Applied to the Steam Data, 345
    Appendix 15A  Hald Data, Correlation Matrix, and All 15 Possible Regressions, 348
    Exercises for Chapter 15, 355

16  Ill-Conditioning in Regression Data, 369
    16.1  Introduction, 369
    16.2  Centering Regression Data, 371
    16.3  Centering and Scaling Regression Data, 373
    16.4  Measuring Multicollinearity, 375
    16.5  Belsley's Suggestion for Detecting Multicollinearity, 376
    Appendix 16A  Transforming X Matrices to Obtain Orthogonal Columns, 382
    Exercises for Chapter 16, 385

17  Ridge Regression, 387
    17.1  Introduction, 387
    17.2  Basic Form of Ridge Regression, 387
    17.3  Ridge Regression of the Hald Data, 389
    17.4  In What Circumstances Is Ridge Regression Absolutely the Correct Way to Proceed?, 391
    17.5  The Phoney Data Viewpoint, 394
    17.6  Concluding Remarks, 395
    Appendix 17A  Ridge Estimates in Terms of Least Squares Estimates, 396
    Appendix 17B  Mean Square Error Argument, 396
    Appendix 17C  Canonical Form of Ridge Regression, 397
    Exercises for Chapter 17, 400

18  Generalized Linear Models (GLIM), 401
    18.1  Introduction, 401
    18.2  The Exponential Family of Distributions, 402
    18.3  Fitting Generalized Linear Models (GLIM), 404
    18.4  Performing the Calculations: An Example, 406
    18.5  Further Reading, 408
    Exercises for Chapter 18, 408

19  Mixture Ingredients as Predictor Variables, 409
    19.1  Mixture Experiments: Experimental Spaces, 409
    19.2  Models for Mixture Experiments, 412
    19.3  Mixture Experiments in Restricted Regions, 416
    19.4  Example 1, 418
    19.5  Example 2, 419
    Appendix 19A  Transforming k Mixture Variables to k - 1 Working Variables, 422
    Exercises for Chapter 19, 425

20  The Geometry of Least Squares, 427
    20.1  The Basic Geometry, 427
    20.2  Pythagoras and Analysis of Variance, 429
    20.3  Analysis of Variance and F-Test for Overall Regression, 432
    20.4  The Singular X'X Case: An Example, 433
    20.5  Orthogonalizing in the General Regression Case, 435
    20.6  Range Space and Null Space of a Matrix M, 437
    20.7  The Algebra and Geometry of Pure Error, 439
    Appendix 20A  Generalized Inverses M-, 441
    Exercises for Chapter 20, 444

21  More Geometry of Least Squares, 447
    21.1  The Geometry of a Null Hypothesis: A Simple Example, 447
    21.2  General Case H0: Aβ = c: The Projection Algebra, 448
    21.3  Geometric Illustrations, 449
    21.4  The F-Test for H0, Geometrically, 450
    21.5  The Geometry of R², 452
    21.6  Change in R² for Models Nested Via Aβ = 0, Not Involving β0, 452
    21.7  Multiple Regression with Two Predictor Variables as a Sequence of Straight Line Regressions, 454
    Exercises for Chapter 21, 459

22  Orthogonal Polynomials and Summary Data, 461
    22.1  Introduction, 461
    22.2  Orthogonal Polynomials, 461
    22.3  Regression Analysis of Summary Data, 467
    Exercises for Chapter 22, 469

23  Multiple Regression Applied to Analysis of Variance Problems, 473
    23.1  Introduction, 473
    23.2  The One-Way Classification: Standard Analysis and an Example, 474
    23.3  Regression Treatment of the One-Way Classification Example, 477
    23.4  Regression Treatment of the One-Way Classification Using the Original Model, 481
    23.5  Regression Treatment of the One-Way Classification: Independent Normal Equations, 486
    23.6  The Two-Way Classification with Equal Numbers of Observations in the Cells: An Example, 488
    23.7  Regression Treatment of the Two-Way Classification Example, 489
    23.8  The Two-Way Classification with Equal Numbers of Observations in the Cells, 493
    23.9  Regression Treatment of the Two-Way Classification with Equal Numbers of Observations in the Cells, 494
    23.10  Example: The Two-Way Classification, 498
    23.11  Recapitulation and Comments, 499
    Exercises for Chapter 23, 500

24  An Introduction to Nonlinear Estimation, 505
    24.1  Least Squares for Nonlinear Models, 505
    24.2  Estimating the Parameters of a Nonlinear System, 508
    24.3  An Example, 518
    24.4  A Note on Reparameterization of the Model, 529
    24.5  The Geometry of Linear Least Squares, 530
    24.6  The Geometry of Nonlinear Least Squares, 539
    24.7  Nonlinear Growth Models, 543
    24.8  Nonlinear Models: Other Work, 550
    24.9  References, 553
    Exercises for Chapter 24, 553

25  Robust Regression, 567
    25.1  Least Absolute Deviations Regression (L1 Regression), 567
    25.2  M-Estimators, 567
    25.3  Steel Employment Example, 573
    25.4  Trees Example, 575
    25.5  Least Median of Squares (LMS) Regression, 577
    25.6  Robust Regression with Ranked Residuals (rreg), 577
    25.7  Other Methods, 580
    25.8  Comments and Opinions, 580
    25.9  References, 581
    Exercises for Chapter 25, 584

26  Resampling Procedures (Bootstrapping), 585
    26.1  Resampling Procedures for Regression Models, 585
    26.2  Example: Straight Line Fit, 586
    26.3  Example: Planar Fit, Three Predictors, 588
    26.4  Reference Books, 588
    Appendix 26A  Sample MINITAB Programs to Bootstrap Residuals for a Specific Example, 589
    Appendix 26B  Sample MINITAB Programs to Bootstrap Pairs for a Specific Example, 590
    Additional Comments, 591
    Exercises for Chapter 26, 591

Bibliography, 593

True/False Questions, 605

Answers to Exercises, 609

Tables, 684
    Normal Distribution, 684
    Percentage Points of the t-Distribution, 686
    Percentage Points of the χ²-Distribution, 687
    Percentage Points of the F-Distribution, 688

Index of Authors Associated with Exercises, 695

Index, 697

  • Preface to the Third Edition

    The second edition had 10 chapters; this edition has 26. On the whole (but not entirely) we have chosen to use smaller chapters, and so distinguish more between different types of material. The tabulation below shows the major relationships between second edition and third edition sections and chapters.

    Material dropped consists mainly of second edition Sections 6.8 to 6.13 and 6.15, Sections 7.1 to 7.6, and Chapter 8. New to this edition are Chapters 16 on multicollinearity, 18 on generalized linear models, 19 on mixture ingredients, 20 and 21 on the geometry of least squares, 25 on robust regression, and 26 on resampling procedures. Small revisions have been made even in sections where the text is basically unchanged. Less prominence has been given to printouts, which nowadays can easily be generated due to the excellent software available, and to references and bibliography, which are now freely available (either in book or computer form) via the annual updates in Current Index to Statistics. References are mostly given in brief either in situ or close by, at the end of a section or chapter. Full references are in a bibliography but some references are also given in full in sections or within the text or in exercises, whenever this was felt to be the appropriate thing to do. There is no precise rule for doing this, merely the authors' predilection. Exercises have been grouped as seemed appropriate. They are intended as an expansion to the text and so most exercises have full or partial solutions; there are a very few exceptions. One hundred and one true/false questions have also been provided; all of these are in "true" form to prevent readers remembering erroneous material. Instructors can reword them to create "false" questions easily enough. Sections 24.5 and 24.6 have some duplication with work in Chapter 20, but we decided not to eliminate this because the sections contain some differences and have different emphases. Other smaller duplications occur; in general, we feel that duplication is a good feature, and so we do not avoid it.

    Our viewpoint in putting this book together is that it is desirable for students of regression to work through the straight line fit case using a pocket calculator and then to proceed quickly to analyzing larger models on the computer. We are aware that many instructors like to get on to the computer right away. Our personal experience is that this can be unwise and, over the years, we have met many students who enrolled for our courses saying "I know how to put a regression on the computer but I don't understand what I am doing." We have tried to keep such participants constantly in mind.



    We have made no effort to explain any of the dozens of available computing systems. Most of our specific references to these were removed after we received reviews of an earlier draft. Reviewers suggested we delete certain specifics and replace them by others. Unfortunately, the reviewers disagreed on the specifics! In addition, many specific program versions quickly become obsolete as new versions are issued. Quite often students point out to us in class that "the new version of BLANK does (or doesn't!) do that now." For these reasons we have tried to stay away from advocating any particular way to handle computations. A few mild references to MINITAB (used in our University of Wisconsin classes) have been retained but readers will find it easy to ignore these, if they wish.

    We are grateful for help from a number of people, many of these connected with N. R. Draper at the University of Wisconsin. Teaching assistants contributed in many ways, by working new assignments, providing class notes of lectures spoken but not recorded, and discussing specific problems. Former University of Wisconsin student Dennis K. J. Lin, now a faculty member at Pennsylvania State University, contributed most in this regard. More generally, we profited from teaching for many years from the excellent Wiley book Linear Regression Analysis, by George A. F. Seber, whose detailed algebraic treatment has clearly influenced the geometrical presentations of Chapters 20 and 21.

    N. R. Draper is grateful to the University of Wisconsin and to his colleagues there for a timely sabbatical leave, and to Professor Friedrich Pukelsheim of the University of Augsburg, Germany, for inviting him to spend the leave there, providing full technical facilities and many unexpected kindnesses as well. Support from the German Alexander von Humboldt Stiftung is also gratefully acknowledged. N. R. Draper is also thankful to present and former faculty and staff at the University of Southampton, particularly Fred (T. M. F.) Smith, Nye (J. A.) John (now at Waikato University, New Zealand), Sue Lewis, Phil Prescott, and Daphne Turner, all of whom have made him most welcome on annual visits for many years. The enduring influence of R. C. Bose (1901-1987) is also gratefully acknowledged.

    The staff at the Statistics Department, Mary Esser (staff supervisor, retired), Candy Smith, Mary Ann Clark (retired), Wanda Gray (retired), and Gloria Scalissi, have all contributed over the years. Our special thanks go to Gloria Scalissi who typed much of a difficult and intricate manuscript.

    For John Wiley & Sons, the effects of Bea Shube's help and wisdom linger on, supplemented more recently by those of Kate Roach, Jessica Downey, and Steve Quigley. We also thank Alison Bory on the editorial side and Production Editor Lisa Van Horn for their patience and skills in the final stages.

    We are grateful to all of our reviewers, including David Belsley and Richard (Rick) Chappell and several anonymous ones. The reviews were all very helpful and we followed up most of the suggestions made, but not all. We ourselves have often profited by reading varying presentations in different places and so we sometimes resisted changing our presentation to conform to presentations elsewhere.

    Many others contributed with correspondence or conversation over the years. We do not have a complete list, but some of them were Cuthbert Daniel, Jim Durbin, Xiaoyin (Frank) Fan, Conrad Fung, Stratis Gavaris, Michael Haber, Brian Joiner, Jane Kawasaki, Russell Langley, A. G. C. Morris, Ella Munro, Vedula N. Murty, Alvin P. Rainosek, J. Harold Ranck, Guangheng (Sharon) Shen, Jake Sredni, Daniel Weiner, William J. Welch, Yonghong (Fred) Yang, Yuyun (Jessie) Yang, and Lisa Ying. Others are mentioned within the text, where appropriate. We are grateful to them all.


    To notify us of errors or misprints, please e-mail [email protected]. An updated list of such discrepancies will be returned by e-mail, if requested. For a hardcopy of the list, please send a stamped addressed envelope to N. R. Draper, University of Wisconsin Statistics Department, 1210 West Dayton Street, Madison, WI 53706, U.S.A.

    NORMAN R. DRAPER HARRY SMITH

  • Relationships of Second Edition and Third Edition Text Material

    Topic                            Second Edition     Third Edition
    Straight line fit                1.0-1.4            1.0-1.5
    Pure error                       1.5                2.1-2.2
    Correlation                      1.6                1.6
    Inverse regression               1.7                3.2
    Practical implications           1.8                3.3
    Straight line, matrices          2.0-2.5            4
    General regression               2.6                5
    Extra SS                         2.7-2.9            6.1, 6.2, 6A
    General linear hypothesis        2.10               9.1
    Weighted least squares           2.11               9.2, 9.3, 9.4
    Restricted least squares         2.13               9.5
    Inverse regression               2.15               9.6
    Errors in multiple X's           -                  9.7
    Bias in estimates                2.12               10
    Errors in X and Y                2.14               3.4
    Matrix results                   2A                 5A
    E (Extra SS)                     2B                 10.4
    How significant?                 2C                 11
    Lagrange's multipliers           2D                 9A
    Residuals plots                  3.1-3.8            2
    Serial correlation               3.9-3.11           7
    Influential observations         3.12               8.3, 8.4
    Normal plots                     3A                 2A
    Two X's example                  4.0, 4.2           6.3
    Geometry                         4.1                21.7
    Polynomial models                5.1, 5.2           12.1
    Transformations                  5.3                13
    Dummy variables                  5.4                14
    Centering and scaling            5.5                16.2, 16.3
    Orthogonal polynomials           5.6                22.2
    Orthogonalizing X                5.7                16A
    Summary data                     5.8                22.3
    Selection procedures             6.0-6.6, 6.12      15
    Ridge regression                 6.7                17
    Ridge, canonical form            6A                 17A
    Press                            6.8                -
    Principal components             6.9                -
    Latent root regression           6.10               -
    Stagewise regression             6.13               -
    Robust regression                6.14               25
    Data example                     7.0-7.6            -
    Polynomial example               7.7                12.2
    Model building talk              8                  -
    ANOVA models                     9                  23
    Nonlinear estimation             10                 24
    Multicollinearity                -                  16
    GLIM                             -                  18
    Mixtures models                  -                  19
    Geometry of LS                   -                  20
    More geometry                    -                  21.1-21.6
    Robust regression                -                  25
    Resampling methods               -                  26

  • About the Software

    The diskette that accompanies the book includes data files for the examples used in the chapters and for the exercises. These files can be used as input for standard statistical analysis programs. When writing program scripts, please note that descriptive text lines are included above data sections in the files.

    The data files are included in the REGRESS directory on the diskette, which can be placed on your hard drive by your computer operating system's usual copying methods. You can also use the installation program on the diskette to copy the files by doing the following.

    1. Type a:install at the Run selection of the File menu in a Windows 3.1 system or access the floppy drive directory through a Windows file manager and double-click on the INSTALL.EXE file.

    2. After skipping through the introductory screens, select a path for installing the files. The default directory for the file installation is C:\REGRESS. You may edit this selection to choose a different drive or directory. Press Enter when done.

    3. The files will be installed to the selected directory.


  • Applied Regression Analysis

  • CHAPTER 0

    Basic Prerequisite Knowledge

    Readers need some of the knowledge contained in a basic course in statistics to tackle regression. We summarize some of the main requirements very briefly in this chapter. Also useful is a pocket calculator capable of getting sums of squares and sums of products easily. Excellent calculators of this type cost about $25-50 in the United States. Buy the most versatile you can afford.

    0.1. DISTRIBUTIONS: NORMAL, t, AND F

    Normal Distribution

The normal distribution occurs frequently in the natural world, either for data "as they come" or for transformed data. The heights of a large group of people selected randomly will look normal in general, for example. The distribution is symmetric about its mean μ and has a standard deviation σ, which is such that practically all of the distribution (99.73%) lies inside the range μ - 3σ ≤ x ≤ μ + 3σ. The frequency function is

    f(x) = {1/[σ(2π)^(1/2)]} exp{-(x - μ)²/(2σ²)},   -∞ ≤ x ≤ ∞.        (0.1.1)

We usually write that x ~ N(μ, σ²), read as "x is normally distributed with mean μ and variance σ²." Most manipulations are done in terms of the standard normal or unit normal distribution, N(0, 1), for which μ = 0 and σ = 1. To move from a general normal variable x to a standard normal variable z, we set

    z = (x - μ)/σ.                                                      (0.1.2)

A standard normal distribution is shown in Figure 0.1 along with some properties useful in certain regression contexts. All the information shown is obtainable from the normal table in the Tables section. Check that you understand how this is done. Remember to use the fact that the total area under each curve is 1.
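As a quick illustration of Eq. (0.1.2) and the 99.73% statement, here is a small Python sketch (not from the book; the height figures are made-up illustrative values) that standardizes an observation and reads off standard normal probabilities via the error function.

    from math import erf, sqrt

    def std_normal_cdf(z):
        """P(Z <= z) for Z ~ N(0, 1), computed via the error function."""
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    # Suppose x ~ N(mu, sigma^2) with mu = 170 cm, sigma = 10 cm (illustrative values).
    mu, sigma = 170.0, 10.0
    x = 185.0
    z = (x - mu) / sigma                      # standardize, Eq. (0.1.2)

    print(round(z, 2))                        # 1.5
    print(round(std_normal_cdf(z), 4))        # P(X <= 185) ~ 0.9332
    # "Practically all" of the distribution lies within mu +/- 3 sigma:
    print(round(std_normal_cdf(3) - std_normal_cdf(-3), 4))   # ~ 0.9973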

    Gamma Function

The gamma function Γ(q), which occurs in Eqs. (0.1.3) and (0.1.4), is defined in general as an integral:

    Γ(q) = ∫₀^∞ u^(q-1) e^(-u) du,   q > 0.

[Figure 0.1. The standard (or unit) normal distribution N(0, 1) and some of its properties.]

However, it is easier to think of it as a generalized factorial with the basic property that, for any q,

    Γ(q) = (q - 1)Γ(q - 1)
         = (q - 1)(q - 2)Γ(q - 2), and so on. Moreover,

    Γ(1/2) = π^(1/2) and Γ(1) = 1.

So, for the applications of Eqs. (0.1.3) and (0.1.4), where ν, m, and n are integers, the gamma functions are either simple factorials or simple products ending in π^(1/2).

Example 1

    Γ(5) = 4 × Γ(4) = 4 × 3 × Γ(3) = 4 × 3 × 2 × Γ(2)
         = 4 × 3 × 2 × 1 × Γ(1) = 24.

Example 2

    Γ(5/2) = (3/2) × Γ(3/2) = (3/2) × (1/2) × Γ(1/2) = 3π^(1/2)/4.
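A pocket calculator or a few lines of code will confirm these facts. The Python sketch below (ours, not the book's) checks Examples 1 and 2 and the recursion Γ(q) = (q - 1)Γ(q - 1) using the standard library's math.gamma.

    from math import gamma, pi, sqrt, factorial

    # Gamma(q) = (q - 1)! for positive integers q; Example 1: Gamma(5) = 4! = 24
    print(gamma(5), factorial(4))            # 24.0  24

    # Gamma(1/2) = sqrt(pi); Example 2: Gamma(5/2) = (3/2)(1/2)sqrt(pi) = 3*sqrt(pi)/4
    print(gamma(0.5), sqrt(pi))              # both ~ 1.7724539
    print(gamma(2.5), 3 * sqrt(pi) / 4)      # both ~ 1.3293404

    # The recursion Gamma(q) = (q - 1) * Gamma(q - 1) holds for any q > 1
    q = 3.7
    print(abs(gamma(q) - (q - 1) * gamma(q - 1)) < 1e-12)   # True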

t-Distribution

There are many t-distributions, because the form of the curve, defined by

    f(t) = {Γ[(ν + 1)/2] / [Γ(ν/2)(νπ)^(1/2)]} (1 + t²/ν)^(-(ν+1)/2),   -∞ ≤ t ≤ ∞,        (0.1.3)


[Figure 0.2. The t-distributions for ν = 1, 9, ∞; t(∞) = N(0, 1).]

depends on ν, the number of degrees of freedom. In general, the t(ν) distribution looks somewhat like a standard (unit) normal but is "heavier in the tails," and so lower in the middle, because the total area under the curve is 1. As ν increases, the distribution becomes "more normal." In fact, t(∞) is the N(0, 1) distribution, and, when ν exceeds about 30, there is so little difference between t(ν) and N(0, 1) that it has become conventional (but not mandatory) to use the N(0, 1) instead. Figure 0.2 illustrates the situation. A two-tailed table of percentage points is given in the Tables section.
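To see how quickly t(ν) approaches N(0, 1), one can tabulate the upper 2.5% points for a few values of ν. A brief sketch follows (ours, not the book's; it assumes SciPy is available).

    from scipy import stats   # assumes SciPy is installed

    # Upper 2.5% point of t(v) for increasing degrees of freedom v;
    # by v of about 30 it is already close to the N(0, 1) value of 1.96.
    for v in (1, 5, 9, 30, 120):
        print(v, round(stats.t.ppf(0.975, v), 3))
    print("normal", round(stats.norm.ppf(0.975), 3))   # 1.96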

    F-Distribution

    The F-distribution depends on two separate degrees of freedom m and n, say. Its curve is defined by

    f(F) = {Γ[(m + n)/2] (m/n)^(m/2) F^(m/2 - 1)} / {Γ(m/2) Γ(n/2) (1 + mF/n)^((m+n)/2)},   F ≥ 0.        (0.1.4)

The distribution rises from zero, sometimes quite steeply for certain m and n, and reaches a peak, falling off very skewed to the right. See Figure 0.3. Percentage points for the upper tail levels of 10%, 5%, and 1% are in the Tables section.

[Figure 0.3. Some selected F(m, n) distributions.]


The F-distribution is usually introduced in the context of testing to see whether two variances are equal, that is, the null hypothesis H₀: σ₁²/σ₂² = 1, versus the alternative hypothesis H₁: σ₁²/σ₂² ≠ 1. The test uses the statistic F = s₁²/s₂², s₁² and s₂² being statistically independent estimates of σ₁² and σ₂², with ν₁ and ν₂ degrees of freedom (df), respectively, and depends on the fact that, if the two samples that give rise to s₁² and s₂² are independent and normal, then (s₁²/s₂²)/(σ₁²/σ₂²) follows the F(ν₁, ν₂) distribution. Thus if σ₁² = σ₂², F = s₁²/s₂² follows F(ν₁, ν₂). When given in basic statistics courses, this is usually described as a two-tailed test, which it usually is. In regression applications, it is typically a one-tailed, upper-tailed test. This is because regression tests always involve putting the "s² that could be too big, but cannot be too small" at the top and the "s² that we think estimates the true σ² well" at the bottom of the F-statistic. In other words, we are in the situation where we test H₀: σ₁² = σ₂² versus H₁: σ₁² > σ₂².
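The following Python sketch (ours, with made-up variance estimates, assuming SciPy is available) carries out the one-tailed, upper-tailed comparison just described.

    from scipy import stats   # assumes SciPy is installed

    # Two independent normal samples give variance estimates s1^2 and s2^2
    # with v1 and v2 degrees of freedom (illustrative numbers, not from the text).
    s1_sq, v1 = 8.4, 12    # the "could be too big, but cannot be too small" estimate goes on top
    s2_sq, v2 = 3.1, 20    # the estimate believed to estimate sigma^2 well goes on the bottom

    F = s1_sq / s2_sq                        # observed F statistic
    p_upper = stats.f.sf(F, v1, v2)          # one-tailed (upper-tail) probability beyond F

    print(round(F, 3), round(p_upper, 4))
    print(round(stats.f.ppf(0.95, v1, v2), 3))   # 5% upper percentage point of F(v1, v2)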

    0.2. CONFIDENCE INTERVALS (OR BANDS) AND t-TESTS

Let θ be a parameter (or "thing") that we want to estimate. Let θ̂ be an estimate of θ ("estimate of thing"). Typically, θ̂ will follow a normal distribution, either exactly because of the normality of the observations in θ̂, or approximately due to the effect of the Central Limit Theorem. Let σ_θ̂ be the standard deviation of θ̂ and let se(θ̂) be the standard error, that is, the estimated standard deviation, of θ̂ ("standard error of thing"), based on ν degrees of freedom. Typically we get se(θ̂) by substituting an estimate (based on ν degrees of freedom) of an unknown standard deviation into the formula for σ_θ̂.

1. A 100(1 - α)% confidence interval (CI) for the parameter θ is given by

    θ̂ ± t(ν, 1 - α/2) se(θ̂),                                          (0.2.1)

where t(ν, 1 - α/2) is the percentage point of a t-variable with ν degrees of freedom (df) that leaves a probability α/2 in the upper tail, and so 1 - α/2 in the lower tail. A two-tailed table where these percentage points are listed under the heading of 2(α/2) = α is given in the Tables section. Equation (0.2.1) in words is

    {Estimate of thing} ± {t percentage point leaving α/2 in the upper tail, based on ν degrees of freedom} × {standard error of estimate of thing}.        (0.2.2)

2. To test θ = θ₀, where θ₀ is some specified value of θ that is presumed to be valid (often θ₀ = 0 in tests of regression coefficients), we evaluate the statistic

    t = (θ̂ - θ₀)/se(θ̂),                                              (0.2.3)

or, in words,

    t = [{Estimate of thing} - {Postulated or test value of thing}] / {Standard error of estimate of thing}.        (0.2.4)


[Figure 0.4. Two cases for a t-test. (a) The observed t is positive (black dot) and the upper tail area is δ. A two-tailed test considers that this value could just as well have been negative (open "phantom" dot) and quotes "a two-tailed t-probability of 2δ." (b) The observed t is negative; similar argument, with tails reversed.]

    This "observed value of t" (our "dot") is then placed on a diagram of the t(v) distribution. [Recall that v is the number of degrees of freedom on which see 0) is based and that is the number of df in the estimate of a 2 that was used.] The tail probability beyond the dot is evaluated and doubled for a two-tail test. See Figure 0.4 for the probability 28. It is conventional to ask if the 28 value is "significant" or not by concluding that, if 28 < 0.05, t is significant and the idea (or hypothesis) that 8 = 80 is unlikely and so "rejected," whereas if 28 > 0.05, t is nonsignificant and we "do not reject" the hypothesis 8 = 80 , The alternative hypothesis here is 8 =1= 80 , a two-sided alternative. Note that the value 0.05 is not handed down in holy writings, although we sometimes talk as though it is. Using an "alpha level" of a = 0.05 simply means we are prepared to risk a 1 in 20 chance of making the wrong decision. If we wish to go to a = 0.10 (1 in 10) or a = 0.01 (1 in 100), that is up to us. Whatever we decide, we should remain consistent about this level throughout our testing.

However, it is pointless to agonize too much about α. A journal editor who will publish a paper describing an experiment if 2δ = 0.049, but will not publish it if 2δ = 0.051, is placing a purely arbitrary standard on the work. (Of course, it is the editor's right to do that.) Such an attitude should not necessarily be imposed by experimenters on themselves, because it is too restrictive a posture in general.


TABLE 0.1. Example Applications for Formulas (0.2.1)-(0.2.4)

Situation: straight line fit Y = β₀ + β₁X + ε
    θ = β₁,  θ̂ = b₁,  se(θ̂) = s/S_XX^(1/2), where S_XX = Σ(X_i - X̄)²
    θ = β₀,  θ̂ = b₀,  se(θ̂) = s{ΣX_i²/(nS_XX)}^(1/2)

Situation: predicted response Ŷ₀ = b₀ + b₁X₀ at X = X₀
    θ = E(Y) at X₀,  θ̂ = Ŷ₀,  se(θ̂) = s{1/n + (X₀ - X̄)²/S_XX}^(1/2)

Promising experimental leads need to be followed up, even if the arbitrary α standard has not been attained. For example, an α = 0.05 person might be perfectly justified in following up a 2δ = 0.06 experiment by performing more experimental runs to further elucidate the results attained by the first set of runs. To give up an avenue of investigation merely because one experiment did not provide a significant result may be a mistake. The α value should be thought of as a guidepost, not a boundary.

In every application of formulas (0.2.1)-(0.2.4), we have to ask what θ, θ̂, θ₀, se(θ̂), and the t percentage point are. The actual formulas we use are always the same. Table 0.1 contains some examples of θ, θ̂, and se(θ̂) we shall subsequently meet. (The symbol s replaces the σ of the corresponding standard deviation formulas.)
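To make formulas (0.2.1)-(0.2.4) and the first row of Table 0.1 concrete, here is a short Python sketch (not from the book; the six data points are invented for illustration) that fits a straight line by least squares and then forms a 95% confidence interval and a two-sided t-test for β₁, assuming SciPy is available for the t percentage points.

    from math import sqrt
    from scipy import stats   # assumes SciPy is installed

    # Illustrative data: fit Y = b0 + b1*X, then take theta = beta1, theta-hat = b1,
    # and se(b1) = s / sqrt(Sxx) as in Table 0.1.
    X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    Y = [2.1, 2.9, 3.8, 5.2, 5.9, 7.1]
    n = len(X)

    xbar = sum(X) / n
    ybar = sum(Y) / n
    Sxx = sum((x - xbar) ** 2 for x in X)
    Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))

    b1 = Sxy / Sxx
    b0 = ybar - b1 * xbar

    # Residual mean square gives s^2 on v = n - 2 degrees of freedom
    residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
    v = n - 2
    s = sqrt(sum(e ** 2 for e in residuals) / v)

    se_b1 = s / sqrt(Sxx)
    t_crit = stats.t.ppf(0.975, v)                    # t(v, 1 - alpha/2) with alpha = 0.05

    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)   # Eq. (0.2.1)
    t_obs = (b1 - 0.0) / se_b1                        # Eq. (0.2.3) with theta0 = 0
    p_two_sided = 2 * stats.t.sf(abs(t_obs), v)       # the "2*delta" of Figure 0.4

    print(round(b1, 3), [round(c, 3) for c in ci], round(t_obs, 2), round(p_two_sided, 5))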

    0.3. ELEMENTS OF MATRIX ALGEBRA

    Matrix, Vector, Scalar

A p × q matrix M is a rectangular array of numbers containing p rows and q columns, written

    M = [ m11  m12  ...  m1q
          m21  m22  ...  m2q
          ................
          mp1  mp2  ...  mpq ].

For example,

    A = [  4   1   3   7
          -1   0   2   2
           6   5  -2   1 ]

is a 3 × 4 matrix. The plural of matrix is matrices. A "matrix" with only one row is called a row vector; a "matrix" with only one column is called a column vector. For example, if


    a' = [1, 6, 3, 2, 1]   and   b = (-1, 0, 1)',

then a' is a row vector of length five and b is a column vector of length three. A 1 × 1 "vector" is an ordinary number or scalar.

    Equality

    Two matrices are equal if and only if their dimensions are identical and they have exactly the same entries in the same positions. Thus a matrix equality implies as many individual equalities as there are terms in the matrices set equal.

    Sum and Difference

The sum (or difference) of two matrices is the matrix each of whose elements is the sum (or difference) of the corresponding elements of the matrices added (or subtracted). For example,

    [ 7  6  9 ]   [  1  2  4 ]   [ 6  4  5 ]
    [ 4  2  1 ] - [ -1  3 -2 ] = [ 5 -1  3 ]
    [ 6  5  3 ]   [  2  1  4 ]   [ 4  4 -1 ]

    The matrices must be of exactly the same dimensions for addition or subtraction to be carried out. Otherwise the operations are not defined.
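In matrix software the element-by-element rule is built in. A minimal NumPy sketch (ours, not the book's; NumPy assumed available, entries purely illustrative) shows matrix addition and subtraction and the requirement that the dimensions be identical.

    import numpy as np

    A = np.array([[7, 6, 9],
                  [4, 2, 1],
                  [6, 5, 3]])
    B = np.array([[ 1, 2,  4],
                  [-1, 3, -2],
                  [ 2, 1,  4]])

    print(A + B)      # element-by-element sum
    print(A - B)      # element-by-element difference

    # Addition is defined only for matrices of exactly the same dimensions:
    C = np.array([[1, 2], [3, 4]])
    try:
        A + C
    except ValueError as err:
        print("not conformable for addition:", err)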

    Transpose

    The transpose of a matrix M is a matrix M' whose rows are the columns of M and whose columns are the rows of M in the same original order. Thus for M and A as defined above,

    M' = [ m11  m21  ...  mp1
           m12  m22  ...  mp2
           ................
           m1q  m2q  ...  mpq ],

    A' = [  4  -1   6
            1   0   5
            3   2  -2
            7   2   1 ].

    Note that the transpose notation enables us to write, for example,

    b' = (-1, 0, 1) or alternatively b = (-1, 0, 1)'.


    Note: The parentheses around a matrix or vector can be square-ended or curved. Often, capital letters are used to denote matrices and lowercase letters to denote vectors. Boldface print is often used, but this is not universal.

    Symmetry

    A matrix M is said to be symmetric if M' = M.
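A short NumPy sketch (again ours, not the book's) illustrates transposition and a symmetry check; the X'X matrices that arise later in regression are always symmetric.

    import numpy as np

    A = np.array([[ 4, 1,  3, 7],
                  [-1, 0,  2, 2],
                  [ 6, 5, -2, 1]])

    print(A.T)                      # the transpose A': columns of A become rows
    print(A.shape, A.T.shape)       # (3, 4) -> (4, 3)

    # A matrix M is symmetric when M' = M; for example, M = X'X is symmetric.
    X = np.array([[1.0, 2.0],
                  [1.0, 5.0],
                  [1.0, 9.0]])
    M = X.T @ X
    print(np.array_equal(M, M.T))   # True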

    Multiplication

Suppose we have two matrices, A, which is p × q, and B, which is r × s. They are conformable for the product C = AB only if q = r. The resulting product is then a p × s matrix, the multiplication procedure being defined as follows: If

    A = [ a11  a12  ...  a1q        B = [ b11  b12  ...  b1s
          a21  a22  ...  a2q              b21  b22  ...  b2s
          ................                ................
          ap1  ap2  ...  apq ],           bq1  bq2  ...  bqs ],
          (p × q)                         (q × s)

then the product

    AB = C = [ c11  c12  ...  c1s
               c21  c22  ...  c2s
               ................
               cp1  cp2  ...  cps ]
               (p × s)

is such that

    cij = Σ_{l=1}^{q} ail blj;

    that is, the entry in the ith row and jth column of C is the inner product (the element by element cross-product) of the ith row of A with the jth column of B. For example,

    [  1  2  1 ]   [  1  2  3 ]
    [ -1  3  0 ] × [  4  0 -1 ]
       (2 × 3)     [ -2  1  3 ]
                      (3 × 3)

      = [  1(1) + 2(4) + 1(-2)     1(2) + 2(0) + 1(1)     1(3) + 2(-1) + 1(3) ]
        [ -1(1) + 3(4) + 0(-2)    -1(2) + 3(0) + 0(1)    -1(3) + 3(-1) + 0(3) ]

      = [  7   3   4 ]
        [ 11  -2  -6 ].
           (2 × 3)
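The inner-product rule for cij can be written directly as a few lines of Python (ours, not the book's); applied to the 2 × 3 and 3 × 3 matrices above, it reproduces the result just computed.

    def matmul(A, B):
        """Product C = AB, where c_ij is the inner product of row i of A with column j of B."""
        p, q = len(A), len(A[0])
        r, s = len(B), len(B[0])
        if q != r:                       # conformability: (p x q)(r x s) requires q = r
            raise ValueError("A and B are not conformable for the product AB")
        return [[sum(A[i][l] * B[l][j] for l in range(q)) for j in range(s)]
                for i in range(p)]

    A = [[ 1, 2, 1],
         [-1, 3, 0]]                     # 2 x 3
    B = [[ 1,  2,  3],
         [ 4,  0, -1],
         [-2,  1,  3]]                   # 3 x 3

    print(matmul(A, B))                  # [[7, 3, 4], [11, -2, -6]], matching the worked example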