Page 1: [Wiley Series in Probability and Statistics] Methods of Multivariate Analysis (Rencher/Methods) || Front Matter

METHODS OF MULTIVARIATE ANALYSIS

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg

Editors Emeriti: Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels

A complete list of the titles in this series appears at the end of this volume.

METHODS OF MULTIVARIATE ANALYSIS

Third Edition

Alvin C. Rencher

William F. Christensen

Department of Statistics
Brigham Young University
Provo, Utah

WILEY

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Rencher, Alvin C., 1934-
Methods of multivariate analysis / Alvin C. Rencher, William F. Christensen, Department of Statistics, Brigham Young University, Provo, UT. -- Third Edition.
pages cm. -- (Wiley series in probability and statistics)
Includes index.
ISBN 978-0-470-17896-6 (hardback)
1. Multivariate analysis. I. Christensen, William F., 1970- II. Title.
QA278.R45 2012
519.5'35--dc23 2012009793

Printed in the United States of America.

10 9 8 7 6 5 4 3 2 1

CONTENTS

Preface xvii

Acknowledgments xxi

1 Introduction 1

1.1 WHY MULTIVARIATE ANALYSIS? 1
1.2 PREREQUISITES 3
1.3 OBJECTIVES 3
1.4 BASIC TYPES OF DATA AND ANALYSIS 4

2 Matrix Algebra 7

2.1 INTRODUCTION 7
2.2 NOTATION AND BASIC DEFINITIONS 8
2.2.1 Matrices, Vectors, and Scalars 8
2.2.2 Equality of Vectors and Matrices 9
2.2.3 Transpose and Symmetric Matrices 9
2.2.4 Special Matrices 10
2.3 OPERATIONS 11
2.3.1 Summation and Product Notation 11
2.3.2 Addition of Matrices and Vectors 12
2.3.3 Multiplication of Matrices and Vectors 13
2.4 PARTITIONED MATRICES 22
2.5 RANK 23
2.6 INVERSE 25
2.7 POSITIVE DEFINITE MATRICES 26
2.8 DETERMINANTS 28
2.9 TRACE 31
2.10 ORTHOGONAL VECTORS AND MATRICES 31
2.11 EIGENVALUES AND EIGENVECTORS 32
2.11.1 Definition 32
2.11.2 I + A and I - A 34
2.11.3 tr(A) and |A| 34
2.11.4 Positive Definite and Semidefinite Matrices 35
2.11.5 The Product AB 35
2.11.6 Symmetric Matrix 35
2.11.7 Spectral Decomposition 35
2.11.8 Square Root Matrix 36
2.11.9 Square and Inverse Matrices 36
2.11.10 Singular Value Decomposition 37
2.12 KRONECKER AND VEC NOTATION 37
Problems 39

3 Characterizing and Displaying Multivariate Data 47

3.1 MEAN AND VARIANCE OF A UNIVARIATE RANDOM VARIABLE 47
3.2 COVARIANCE AND CORRELATION OF BIVARIATE RANDOM VARIABLES 49
3.2.1 Covariance 49
3.2.2 Correlation 53
3.3 SCATTERPLOTS OF BIVARIATE SAMPLES 55
3.4 GRAPHICAL DISPLAYS FOR MULTIVARIATE SAMPLES 56
3.5 DYNAMIC GRAPHICS 58
3.6 MEAN VECTORS 63
3.7 COVARIANCE MATRICES 66
3.8 CORRELATION MATRICES 69
3.9 MEAN VECTORS AND COVARIANCE MATRICES FOR SUBSETS OF VARIABLES 71
3.9.1 Two Subsets 71
3.9.2 Three or More Subsets 73
3.10 LINEAR COMBINATIONS OF VARIABLES 75
3.10.1 Sample Properties 75
3.10.2 Population Properties 81
3.11 MEASURES OF OVERALL VARIABILITY 81
3.12 ESTIMATION OF MISSING VALUES 82
3.13 DISTANCE BETWEEN VECTORS 84
Problems 85

4 The Multivariate Normal Distribution 91

4.1 MULTIVARIATE NORMAL DENSITY FUNCTION 91
4.1.1 Univariate Normal Density 92
4.1.2 Multivariate Normal Density 92
4.1.3 Generalized Population Variance 93
4.1.4 Diversity of Applications of the Multivariate Normal 93
4.2 PROPERTIES OF MULTIVARIATE NORMAL RANDOM VARIABLES 94
4.3 ESTIMATION IN THE MULTIVARIATE NORMAL 99
4.3.1 Maximum Likelihood Estimation 99
4.3.2 Distribution of y and S 100
4.4 ASSESSING MULTIVARIATE NORMALITY 101
4.4.1 Investigating Univariate Normality 101
4.4.2 Investigating Multivariate Normality 106
4.5 TRANSFORMATIONS TO NORMALITY 108
4.5.1 Univariate Transformations to Normality 109
4.5.2 Multivariate Transformations to Normality 110
4.6 OUTLIERS 111
4.6.1 Outliers in Univariate Samples 112
4.6.2 Outliers in Multivariate Samples 113
Problems 117

5 Tests on One or Two Mean Vectors 125

5.1 MULTIVARIATE VERSUS UNIVARIATE TESTS 125
5.2 TESTS ON μ WITH Σ KNOWN 126
5.2.1 Review of Univariate Test for H0: μ = μ0 with σ Known 126
5.2.2 Multivariate Test for H0: μ = μ0 with Σ Known 127
5.3 TESTS ON μ WHEN Σ IS UNKNOWN 130
5.3.1 Review of Univariate t-Test for H0: μ = μ0 with σ Unknown 130
5.3.2 Hotelling's T²-Test for H0: μ = μ0 with Σ Unknown 131
5.4 COMPARING TWO MEAN VECTORS 134
5.4.1 Review of Univariate Two-Sample t-Test 134
5.4.2 Multivariate Two-Sample T²-Test 135
5.4.3 Likelihood Ratio Tests 139
5.5 TESTS ON INDIVIDUAL VARIABLES CONDITIONAL ON REJECTION OF H0 BY THE T²-TEST 139
5.6 COMPUTATION OF T² 143
5.6.1 Obtaining T² from a MANOVA Program 143
5.6.2 Obtaining T² from Multiple Regression 144
5.7 PAIRED OBSERVATIONS TEST 145
5.7.1 Univariate Case 145
5.7.2 Multivariate Case 147
5.8 TEST FOR ADDITIONAL INFORMATION 149
5.9 PROFILE ANALYSIS 152
5.9.1 One-Sample Profile Analysis 152
5.9.2 Two-Sample Profile Analysis 154
Problems 161

6 Multivariate Analysis of Variance 169

6.1 ONE-WAY MODELS 169
6.1.1 Univariate One-Way Analysis of Variance (ANOVA) 169
6.1.2 Multivariate One-Way Analysis of Variance Model (MANOVA) 171
6.1.3 Wilks' Test Statistic 174
6.1.4 Roy's Test 178
6.1.5 Pillai and Lawley-Hotelling Tests 179
6.1.6 Unbalanced One-Way MANOVA 181
6.1.7 Summary of the Four Tests and Relationship to T² 182
6.1.8 Measures of Multivariate Association 186
6.2 COMPARISON OF THE FOUR MANOVA TEST STATISTICS 189
6.3 CONTRASTS 191
6.3.1 Univariate Contrasts 191
6.3.2 Multivariate Contrasts 192
6.4 TESTS ON INDIVIDUAL VARIABLES FOLLOWING REJECTION OF H0 BY THE OVERALL MANOVA TEST 195
6.5 TWO-WAY CLASSIFICATION 198
6.5.1 Review of Univariate Two-Way ANOVA 198
6.5.2 Multivariate Two-Way MANOVA 201
6.6 OTHER MODELS 207
6.6.1 Higher-Order Fixed Effects 207
6.6.2 Mixed Models 208
6.7 CHECKING ON THE ASSUMPTIONS 210
6.8 PROFILE ANALYSIS 211
6.9 REPEATED MEASURES DESIGNS 215
6.9.1 Multivariate Versus Univariate Approach 215
6.9.2 One-Sample Repeated Measures Model 219
6.9.3 k-Sample Repeated Measures Model 222
6.9.4 Computation of Repeated Measures Tests 224
6.9.5 Repeated Measures with Two Within-Subjects Factors and One Between-Subjects Factor 224
6.9.6 Repeated Measures with Two Within-Subjects Factors and Two Between-Subjects Factors 230
6.9.7 Additional Topics 232
6.10 GROWTH CURVES 232
6.10.1 Growth Curve for One Sample 232
6.10.2 Growth Curves for Several Samples 239
6.10.3 Additional Topics 241
6.11 TESTS ON A SUBVECTOR 241
6.11.1 Test for Additional Information 241
6.11.2 Stepwise Selection of Variables 243
Problems 244

7 Tests on Covariance Matrices 259

7.1 INTRODUCTION 259
7.2 TESTING A SPECIFIED PATTERN FOR Σ 259
7.2.1 Testing H0: Σ = Σ0 260
7.2.2 Testing Sphericity 261
7.2.3 Testing H0: Σ = σ²[(1 - ρ)I + ρJ] 263
7.3 TESTS COMPARING COVARIANCE MATRICES 265
7.3.1 Univariate Tests of Equality of Variances 265
7.3.2 Multivariate Tests of Equality of Covariance Matrices 266
7.4 TESTS OF INDEPENDENCE 269
7.4.1 Independence of Two Subvectors 269
7.4.2 Independence of Several Subvectors 271
7.4.3 Test for Independence of All Variables 275
Problems 276

8 Discriminant Analysis: Description of Group Separation 281

8.1 INTRODUCTION 281
8.2 THE DISCRIMINANT FUNCTION FOR TWO GROUPS 282
8.3 RELATIONSHIP BETWEEN TWO-GROUP DISCRIMINANT ANALYSIS AND MULTIPLE REGRESSION 286
8.4 DISCRIMINANT ANALYSIS FOR SEVERAL GROUPS 288
8.4.1 Discriminant Functions 288
8.4.2 A Measure of Association for Discriminant Functions 292
8.5 STANDARDIZED DISCRIMINANT FUNCTIONS 292
8.6 TESTS OF SIGNIFICANCE 294
8.6.1 Tests for the Two-Group Case 294
8.6.2 Tests for the Several-Group Case 295
8.7 INTERPRETATION OF DISCRIMINANT FUNCTIONS 298
8.7.1 Standardized Coefficients 298
8.7.2 Partial F-Values 299
8.7.3 Correlations Between Variables and Discriminant Functions 300
8.7.4 Rotation 301
8.8 SCATTERPLOTS 301
8.9 STEPWISE SELECTION OF VARIABLES 303
Problems 306

9 Classification Analysis: Allocation of Observations to Groups 309

9.1 INTRODUCTION 309
9.2 CLASSIFICATION INTO TWO GROUPS 310
9.3 CLASSIFICATION INTO SEVERAL GROUPS 314
9.3.1 Equal Population Covariance Matrices: Linear Classification Functions 315
9.3.2 Unequal Population Covariance Matrices: Quadratic Classification Functions 317
9.4 ESTIMATING MISCLASSIFICATION RATES 318
9.5 IMPROVED ESTIMATES OF ERROR RATES 320
9.5.1 Partitioning the Sample 321
9.5.2 Holdout Method 322
9.6 SUBSET SELECTION 322
9.7 NONPARAMETRIC PROCEDURES 326
9.7.1 Multinomial Data 326
9.7.2 Classification Based on Density Estimators 327
9.7.3 Nearest Neighbor Classification Rule 330
9.7.4 Classification Trees 331
Problems 336

10 Multivariate Regression 339

10.1 INTRODUCTION 339
10.2 MULTIPLE REGRESSION: FIXED x's 340
10.2.1 Model for Fixed x's 340
10.2.2 Least Squares Estimation in the Fixed-x Model 342
10.2.3 An Estimator for σ² 343
10.2.4 The Model Corrected for Means 344
10.2.5 Hypothesis Tests 346
10.2.6 R² in Fixed-x Regression 349
10.2.7 Subset Selection 350
10.3 MULTIPLE REGRESSION: RANDOM x's 354
10.4 MULTIVARIATE MULTIPLE REGRESSION: ESTIMATION 354
10.4.1 The Multivariate Linear Model 354
10.4.2 Least Squares Estimation in the Multivariate Model 356
10.4.3 Properties of Least Squares Estimator B 358
10.4.4 An Estimator for Σ 360
10.4.5 Model Corrected for Means 361
10.4.6 Estimation in the Seemingly Unrelated Regressions (SUR) Model 362
10.5 MULTIVARIATE MULTIPLE REGRESSION: HYPOTHESIS TESTS 364
10.5.1 Test of Overall Regression 364
10.5.2 Test on a Subset of the x's 367
10.6 MULTIVARIATE MULTIPLE REGRESSION: PREDICTION 370
10.6.1 Confidence Interval for E(y0) 370
10.6.2 Prediction Interval for a Future Observation y0 371
10.7 MEASURES OF ASSOCIATION BETWEEN THE y's AND THE x's 372
10.8 SUBSET SELECTION 374
10.8.1 Stepwise Procedures 374
10.8.2 All Possible Subsets 377
10.9 MULTIVARIATE REGRESSION: RANDOM x's 380
Problems 381

11 Canonical Correlation 385

11.1 INTRODUCTION 385
11.2 CANONICAL CORRELATIONS AND CANONICAL VARIATES 385
11.3 PROPERTIES OF CANONICAL CORRELATIONS 390
11.4 TESTS OF SIGNIFICANCE 391
11.4.1 Tests of No Relationship Between the y's and the x's 391
11.4.2 Test of Significance of Succeeding Canonical Correlations After the First 393
11.5 INTERPRETATION 395
11.5.1 Standardized Coefficients 396
11.5.2 Correlations between Variables and Canonical Variates 397
11.5.3 Rotation 397
11.5.4 Redundancy Analysis 398
11.6 RELATIONSHIPS OF CANONICAL CORRELATION ANALYSIS TO OTHER MULTIVARIATE TECHNIQUES 398
11.6.1 Regression 398
11.6.2 MANOVA and Discriminant Analysis 400
Problems 402

12 Principal Component Analysis 405

12.1 INTRODUCTION 405
12.2 GEOMETRIC AND ALGEBRAIC BASES OF PRINCIPAL COMPONENTS 406
12.2.1 Geometric Approach 406
12.2.2 Algebraic Approach 410
12.3 PRINCIPAL COMPONENTS AND PERPENDICULAR REGRESSION 412
12.4 PLOTTING OF PRINCIPAL COMPONENTS 414
12.5 PRINCIPAL COMPONENTS FROM THE CORRELATION MATRIX 419
12.6 DECIDING HOW MANY COMPONENTS TO RETAIN 423
12.7 INFORMATION IN THE LAST FEW PRINCIPAL COMPONENTS 427
12.8 INTERPRETATION OF PRINCIPAL COMPONENTS 427
12.8.1 Special Patterns in S or R 427
12.8.2 Rotation 429
12.8.3 Correlations Between Variables and Principal Components 429
12.9 SELECTION OF VARIABLES 430
Problems 432

13 Exploratory Factor Analysis 435

13.1 INTRODUCTION 435
13.2 ORTHOGONAL FACTOR MODEL 437
13.2.1 Model Definition and Assumptions 437
13.2.2 Nonuniqueness of Factor Loadings 441
13.3 ESTIMATION OF LOADINGS AND COMMUNALITIES 442
13.3.1 Principal Component Method 443
13.3.2 Principal Factor Method 448
13.3.3 Iterated Principal Factor Method 450
13.3.4 Maximum Likelihood Method 452
13.4 CHOOSING THE NUMBER OF FACTORS, m 453
13.5 ROTATION 457
13.5.1 Introduction 457
13.5.2 Orthogonal Rotation 458
13.5.3 Oblique Rotation 462
13.5.4 Interpretation 465
13.6 FACTOR SCORES 466
13.7 VALIDITY OF THE FACTOR ANALYSIS MODEL 470
13.8 RELATIONSHIP OF FACTOR ANALYSIS TO PRINCIPAL COMPONENT ANALYSIS 475
Problems 476

14 Confirmatory Factor Analysis 479

14.1 INTRODUCTION 479
14.2 MODEL SPECIFICATION AND IDENTIFICATION 480
14.2.1 Confirmatory Factor Analysis Model 480
14.2.2 Identified Models 482
14.3 PARAMETER ESTIMATION AND MODEL ASSESSMENT 487
14.3.1 Maximum Likelihood Estimation 487
14.3.2 Least Squares Estimation 488
14.3.3 Model Assessment 489
14.4 INFERENCE FOR MODEL PARAMETERS 492
14.5 FACTOR SCORES 495
Problems 496

15 Cluster Analysis 501

15.1 INTRODUCTION 501
15.2 MEASURES OF SIMILARITY OR DISSIMILARITY 502
15.3 HIERARCHICAL CLUSTERING 505
15.3.1 Introduction 505
15.3.2 Single Linkage (Nearest Neighbor) 506
15.3.3 Complete Linkage (Farthest Neighbor) 508
15.3.4 Average Linkage 511
15.3.5 Centroid 514
15.3.6 Median 514
15.3.7 Ward's Method 517
15.3.8 Flexible Beta Method 520
15.3.9 Properties of Hierarchical Methods 521
15.3.10 Divisive Methods 529
15.4 NONHIERARCHICAL METHODS 531
15.4.1 Partitioning 532
15.4.2 Other Methods 540
15.5 CHOOSING THE NUMBER OF CLUSTERS 544
15.6 CLUSTER VALIDITY 546
15.7 CLUSTERING VARIABLES 547
Problems 548

16 Graphical Procedures 555

16.1 MULTIDIMENSIONAL SCALING 555
16.1.1 Introduction 555
16.1.2 Metric Multidimensional Scaling 556
16.1.3 Nonmetric Multidimensional Scaling 560
16.2 CORRESPONDENCE ANALYSIS 565
16.2.1 Introduction 565
16.2.2 Row and Column Profiles 566
16.2.3 Testing Independence 570
16.2.4 Coordinates for Plotting Row and Column Profiles 572
16.2.5 Multiple Correspondence Analysis 576
16.3 BIPLOTS 580
16.3.1 Introduction 580
16.3.2 Principal Component Plots 581
16.3.3 Singular Value Decomposition Plots 583
16.3.4 Coordinates 583
16.3.5 Other Methods 585
Problems 588

Appendix A: Tables 597

Appendix B: Answers and Hints to Problems 637

Appendix C: Data Sets and SAS Files 727

References 728

Index 745

Preface

We have long been fascinated by the interplay of variables in multivariate data and by the challenge of unraveling the effect of each variable. Our continuing objective in the third edition has been to present the power and utility of multivariate analysis in a highly readable format.

Practitioners and researchers in all applied disciplines often measure several variables on each subject or experimental unit. In some cases, it may be productive to isolate each variable in a system and study it separately. Typically, however, the variables are not only correlated with each other, but each variable is influenced by the other variables as it affects a test statistic or descriptive statistic. Thus, in many instances, the variables are intertwined in such a way that when analyzed individually they yield little information about the system. Using multivariate analysis, the variables can be examined simultaneously in order to access the key features of the process that produced them. The multivariate approach enables us to (1) explore the joint performance of the variables and (2) determine the effect of each variable in the presence of the others.

Multivariate analysis provides both descriptive and inferential procedures; we can search for patterns in the data or test hypotheses about patterns of a priori interest. With multivariate descriptive techniques, we can peer beneath the tangled web of variables on the surface and extract the essence of the system. Multivariate inferential procedures include hypothesis tests that (1) process any number of variables without inflating the Type I error rate and (2) allow for whatever intercorrelations the variables possess. A wide variety of multivariate descriptive and inferential procedures is readily accessible in statistical software packages.

Our selection of topics for this volume reflects years of consulting with researchers in many fields of inquiry. A brief overview of multivariate analysis is given in Chapter 1. Chapter 2 reviews the fundamentals of matrix algebra. Chapters 3 and 4 give an introduction to sampling from multivariate populations. Chapters 5, 6, 7, 10, and 11 extend univariate procedures with one dependent variable (including t-tests, analysis of variance, tests on variances, multiple regression, and multiple correlation) to analogous multivariate techniques involving several dependent variables. A review of each univariate procedure is presented before covering the multivariate counterpart. These reviews may provide key insights that the student has missed in previous courses.

Chapters 8, 9, 12, 13, 14, 15, and 16 describe multivariate techniques that are not extensions of univariate procedures. In Chapters 8 and 9, we find functions of the variables that discriminate among groups in the data. In Chapters 12, 13, and 14 (new in the third edition), we find functions of the variables that reveal the basic dimensionality and characteristic patterns of the data, and we discuss procedures for finding the underlying latent variables of a system. In Chapters 15 and 16, we give methods for searching for groups in the data, and we provide plotting techniques that show relationships in a reduced dimensionality for various kinds of data.

In Appendix A, tables are provided for many multivariate distributions and tests. These enable the reader to conduct an exact test in many cases for which software packages provide only approximate tests. Appendix B gives answers and hints for most of the problems in the book.

Appendix C describes an ftp site that contains (1) all data sets and (2) SAS command files for all examples in the text. These command files can be adapted for use in working problems or in analyzing data sets encountered in applications.

To illustrate multivariate applications, we have provided many examples and exercises based on 60 real data sets from a wide variety of disciplines. A practitioner or consultant in multivariate analysis gains insights and acumen from long experience in working with data. It is not expected that a student can achieve this kind of seasoning in a one-semester class. However, the examples provide a good start, and further development is gained by working problems with the data sets. For example, in Chapters 12-14, the exercises cover several typical patterns in the covariance or correlation matrix. The student's intuition is expanded by associating these covariance patterns with the resulting configuration of the principal components or factors.

Although this is a methods book, we have included a few derivations. For some readers, an occasional proof provides insights obtainable in no other way. We hope that instructors who do not wish to use proofs will not be deterred by their presence. The proofs can easily be disregarded when reading the book.

Our objective has been to make the book accessible to readers who have taken as few as two statistical methods courses. The students in our classes in multivariate analysis include majors in statistics and majors from other departments. With the applied researcher in mind, we have provided careful intuitive explanations of the concepts and have included many insights typically available only in journal articles or in the minds of practitioners.

Our overriding goal in preparation of this book has been clarity of exposition. We hope that students and instructors alike will find this multivariate text more comfortable than most. In the final stages of development of each edition, we asked our students for written reports on their initial reaction as they read each day's assignment. They made many comments that led to improvements in the manuscript. We will be very grateful if readers will take the time to notify us of errors or of other suggestions they might have for improvements.

We have tried to use standard mathematical and statistical notation as far as possible and to maintain consistency of notation throughout the book. We have refrained from the use of abbreviations and mnemonic devices. These save space when one is reading a book page by page, but they are annoying to those using a book as a reference.

Equations are numbered sequentially throughout a chapter; for example, (3.75) indicates the 75th numbered equation in Chapter 3. Tables and figures are also numbered sequentially throughout a chapter in the form "Table 3.9" or "Figure 3.1." Examples are not numbered sequentially; each example is identified by the same number as the section in which it appears and is placed at the end of the section.

When citing references in the text, we have used the standard format involving the year of publication. For a journal article, the year alone suffices, for example, Fisher (1936). But for books, we have included a page number, as in Seber (1984, p. 216).

This is the first volume of a two-volume set on multivariate analysis. The second volume is entitled Multivariate Statistical Inference and Applications by Alvin Rencher (Wiley, 1998). The two volumes are not necessarily sequential; they can be read independently. Al adopted the two-volume format in order to (1) provide broader coverage than would be possible in a single volume and (2) offer the reader a choice of approach.

The second volume includes proofs of many techniques covered in the first 13 chapters of the present volume and also introduces additional topics. The present volume includes many examples and problems using actual data sets, and there are fewer algebraic problems. The second volume emphasizes derivations of the results and contains fewer examples and problems with real data. The present volume has fewer references to the literature than the other volume, which includes a careful review of the latest developments and a more comprehensive bibliography. In this third edition, we have occasionally referred the reader to "Rencher (1998)" to note that added coverage of a certain subject is available in the second volume.

We are indebted to many individuals in the preparation of the first two editions of this book. Al's initial exposure to multivariate analysis came in courses taught by Rolf Bargmann at the University of Georgia and D. R. Jensen at Virginia Tech. Additional impetus to probe the subtleties of this field came from research conducted with Bruce Brown at BYU. William's interest and training in multivariate statistics were primarily influenced by Alvin Rencher and Yasuo Amemiya. We thank these mentors and colleagues and also thank Bruce Brown, Deane Branstetter, Del Scott, Robert Smidt, and Ingram Olkin for reading various versions of the manuscript and making valuable suggestions.

We are grateful to the following colleagues and students at BYU who helped with computations and typing: Mitchell Tolland, Tawnia Newton, Marianne Matis Mohr, Gregg Littlefield, Suzanne Kimball, Wendy Nielsen, Tiffany Nordgren, David Whiting, Karla Wasden, Rachel Jones, Lonette Stoddard, Candace B. McNaughton, J. D. Williams, and Jonathan Christensen. We are grateful to the many readers who have pointed out errors or made suggestions for improvements. The book is better for their caring and their efforts.

THIRD EDITION

For the third edition, we have added a new chapter covering confirmatory factor analysis (Chapter 14). We have added new sections discussing Kronecker products and vec notation (Section 2.12), dynamic graphics (Section 3.5), transformations to normality (Section 4.5), classification trees (Section 9.7.4), seemingly unrelated regressions (Section 10.4.6), and prediction for multivariate multiple regression (Section 10.6). Additionally, we have updated and revised the graphics throughout the book and have substantially expanded the section discussing estimation for multivariate multiple regression (Section 10.4).

Many other additions and changes have been made in an effort to broaden the book's scope and improve its exposition. Additional problems have been added to accompany the new material. The new ftp site for the third edition can be found at: ftp://ftp.wiley.com/public/sci_tech_med/multivariate_analysis-3e.

We thank Jonathan Christensen for the countless ways he contributed to the revision, from updated graphics to technical and formatting assistance. We are grateful to Byron Gajewski, Scott Grimshaw, and Jeremy Yorgason, who have provided useful feedback on sections of the book that are new to this edition. We are also grateful to Scott Simpson and Storm Atwood for invaluable editing assistance and to the students of William's multivariate statistics course at Brigham Young University for pointing out errors and making suggestions for improvements in this edition. We are deeply appreciative of the support provided by the BYU Department of Statistics for the writing of the book.

Al dedicates this volume to his wife, LaRue, who has supplied much needed support and encouragement. William dedicates this volume to his wife, Mary Elizabeth, who has provided both encouragement and careful feedback throughout the writing process.

ALVIN C. RENCHER & WILLIAM F. CHRISTENSEN

Department of Statistics

Brigham Young University

Provo, Utah

Acknowledgments

We wish to thank the authors, editors, and owners of copyrights for permission to reproduce the following materials:

• Figure 3.8 and Table 3.2, Kleiner and Hartigan (1981), Reprinted by permission of Journal of the American Statistical Association

• Table 3.3, Colvin et al. (1997), Reprinted by permission of T. S. Colvin and D. L. Karlen

• Table 3.4, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology

• Table 3.5, Reaven and Miller (1979), Reprinted by permission of Diabetologia

• Table 3.6, Timm (1975), Reprinted by permission of Elsevier North-Holland Publishing Company

• Table 3.7, Elston and Grizzle (1962), Reprinted by permission of Biometrics

• Table 3.8, Frets (1921), Reprinted by permission of Genetica

• Table 3.9, O'Sullivan and Mahan (1966), Reprinted by permission of American Journal of Clinical Nutrition

• Table 4.2, Royston (1983), Reprinted by permission of Applied Statistics

• Table 5.1, Beall (1945), Reprinted by permission of Psychometrika

• Table 5.2, Hummel and Sligo (1971), Reprinted by permission of Psychological Bulletin

• Table 5.3, Kramer and Jensen (1969b), Reprinted by permission of Journal of Quality Technology

• Table 5.5, Lubischew (1962), Reprinted by permission of Biometrics

• Table 5.6, Travers (1939), Reprinted by permission of Psychometrika

• Table 5.7, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag

• Table 5.8, Tintner (1946), Reprinted by permission of Journal of the American Statistical Association

• Table 5.9, Kramer (1972), Reprinted by permission of the author

• Table 5.10, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences

• Table 6.2, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag

• Table 6.3, Rencher and Scott (1990), Reprinted by permission of Communications in Statistics: Simulation and Computation

• Table 6.6, Posten (1962), Reprinted by permission of the author

• Table 6.8, Crowder and Hand (1990, pp. 21-29), Reprinted by permission of Routledge Chapman and Hall

• Table 6.12, Cochran and Cox (1957), Timm (1980), Reprinted by permission of John Wiley and Sons and Elsevier North-Holland Publishing Company

• Table 6.14, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company

• Table 6.16, Potthoff and Roy (1964), Reprinted by permission of Biometrika Trustees

• Table 6.17, Baten, Tack, and Baeder (1958), Reprinted by permission of Quality Progress

• Table 6.18, Keuls et al. (1984), Reprinted by permission of Scientia Horticulturae

• Table 6.19, Burdick (1979), Reprinted by permission of the author

• Table 6.20, Box (1950), Reprinted by permission of Biometrics

• Table 6.21, Rao (1948), Reprinted by permission of Biometrika Trustees

• Table 6.22, Cameron and Pauling (1978), Reprinted by permission of National Academy of Sciences

• Table 6.23, Williams and Izenman (1981), Reprinted by permission of Colorado State University

• Table 6.24, Beauchamp and Hoel (1973), Reprinted by permission of Journal of Statistical Computation and Simulation

• Table 6.25, Box (1950), Reprinted by permission of Biometrics

• Table 6.26, Grizzle and Allen (1969), Reprinted by permission of Biometrics

• Table 6.27, Crepeau et al. (1985), Reprinted by permission of Biometrics

• Table 6.28, Zerbe (1979a), Reprinted by permission of Journal of the American Statistical Association

• Table 6.29, Timm (1980), Reprinted by permission of Elsevier North-Holland Publishing Company

• Table 7.1, Siotani et al. (1963), Reprinted by permission of Institute of Statistical Mathematics

• Table 7.2, Reprinted by permission of R. J. Freund

• Table 8.1, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology

• Table 8.3, Reprinted by permission of G. R. Bryce and R. M. Barker

• Table 10.1, Box and Youle (1955), Reprinted by permission of Biometrics

• Tables 12.2, 12.3, and 12.4, Jeffers (1967), Reprinted by permission of Applied Statistics

• Table 13.1, Brown et al. (1984), Reprinted by permission of Journal of Pascal, Ada, and Modula

• Correlation matrix in Example 13.6, Brown, Strong, and Rencher (1973), Reprinted by permission of The Journal of the Acoustical Society of America

• Table 15.1, Hartigan (1975a), Reprinted by permission of John Wiley and Sons

• Table 15.3, Dawkins (1989), Reprinted by permission of The American Statistician

• Table 15.7, Hand et al. (1994), Reprinted by permission of D. J. Hand

• Table 15.12, Sokol and Rohlf (1981), Reprinted by permission of W. H. Freeman and Co.

• Table 15.13, Hand et al. (1994), Reprinted by permission of D. J. Hand

• Table 16.1, Kruskal and Wish (1978), Reprinted by permission of Sage Publications

• Tables 16.2 and 16.5, Hand et al. (1994), Reprinted by permission of D. J. Hand

• Table 16.13, Edwards and Kreiner (1983), Reprinted by permission of Biometrika

• Table 16.15, Hand et al. (1994), Reprinted by permission of D. J. Hand

• Table 16.16, Everitt (1987), Reprinted by permission of the author

• Table 16.17, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag

• Table 16.18, Clausen (1998), Reprinted by permission of Sage Publications

• Table 16.19, Andrews and Herzberg (1985), Reprinted by permission of Springer-Verlag

• Table A.1, Mulholland (1977), Reprinted by permission of Biometrika Trustees

• Table A.2, D'Agostino and Pearson (1973), Reprinted by permission of Biometrika Trustees

• Table A.3, D'Agostino and Tietjen (1971), Reprinted by permission of Biometrika Trustees

• Table A.4, D'Agostino (1972), Reprinted by permission of Biometrika Trustees

• Table A.5, Mardia (1970, 1974), Reprinted by permission of Biometrika Trustees

• Table A.6, Barnett and Lewis (1978), Reprinted by permission of John Wiley and Sons

• Table A.7, Kramer and Jensen (1969a), Reprinted by permission of Journal of Quality Technology

• Table A.8, Bailey (1977), Reprinted by permission of Journal of the American Statistical Association

• Table A.9, Wall (1967), Reprinted by permission of the author, Albuquerque, NM

• Table A.10, Pearson and Hartley (1972) and Pillai (1964, 1965), Reprinted by permission of Biometrika Trustees

• Table A.11, Schuurmann et al. (1975), Reprinted by permission of Journal of Statistical Computation and Simulation

• Table A.12, Davis (1970a,b, 1980), Reprinted by permission of Biometrika Trustees

• Table A.13, Kleinbaum, Kupper, and Muller (1988), Reprinted by permission of PWS-KENT Publishing Company

• Table A.14, Lee et al. (1977), Reprinted by permission of Elsevier North-Holland Publishing Company

• Table A.15, Mathai and Katiyar (1979), Reprinted by permission of Biometrika Trustees