Page 1: [Wiley Series in Probability and Statistics] A User's Guide to Principal Components (Jackson/User's Guide to Principal Components) || A Case History-Hearing Loss II

CHAPTER 9

A Case History-Hearing Loss II

9.1 INTRODUCTION

In Chapter 5 (“Putting It All Together”), a numerical example was presented to illustrate many of the techniques that had been introduced in the book up to that point. The example dealt with a sample of 100 observations from a much larger study of audiometric examinations. In this chapter, the larger study will be discussed in detail. The purpose of this chapter is to present a case history in which the method of principal components played a prominent part and to describe the various modifications and additions to the PCA methodology that were required by the specific nature of this application.

As the reader will recall from Chapter 5, the purpose of this study was to be able to characterize normal hearing loss as a function of age and sex so as to be able to distinguish between normal hearing loss due to aging and abnormal hearing loss due to noise exposure, illness, and so on. Comparisons of this sort might be carried out either on individuals or groups of individuals.

This study is presented here solely for the purpose of illustrating the uses and extensions of PCA and not for the purposes of making specific comments about hearing experience or comparing this with other studies. This particular study was carried out in the 1960s. The norms presented here represent that period of time for a single group of people, employees of the Eastman Kodak Company, and should in no way be construed to represent any other point in time or any other group of people. Changes in instrumentation alone could render these data useless for comparison if one did not have detailed information about these changes. There has been a change in the instrumentation standard since that time (ANSI, 1969). In addition, it is well known that hearing norms do vary in different parts of the world [see, for instance, Taylor et al. (1969), Kell et al. (1970), and Kryter (1973)]. It is also apparent that a portion of the younger population of the United States, and possibly other parts of the world as well, has shown a significant change in average hearing threshold since the 1960s due to increased exposure to loud noise (Mills, 1978).

A User’s Guide to Principal Components. J. Edward Jackson. Copyright © 1991 John Wiley & Sons, Inc.

ISBN: 0-471-62267-2


A number of studies had been carried out prior to the one being described. Riley et al. (1961) discussed a study carried out in the 1950s relating normal hearing loss to age. The variability associated with audiometric examinations, including instrument and operator variability as well as the inherent subject variability, was discussed in Jackson et al. (1962). This last component of variability was obtained by subjecting a number of persons taking preemployment physicals to two examinations, one at the beginning and one at the end of the physical. Some preliminary investigation of the use of multivariate methods indicated that PCA would be a useful technique in the analysis and reporting of audiometric data and that four pc’s would adequately represent the variability of normal individuals from the same age group.

9.2 THE DATA

I expect that most of our readers, at one time or another, have taken an audiometric examination. This generally consists of the subject sitting in a soundproof booth and wearing earphones through which is sent a series of signals of varying frequencies, one ear at a time. The intensity of these signals is increased until the subject indicates that it has been perceived. The data, then, consist of the intensity required for each ear at each frequency. These are known as thresholds and are reported in units of “decibel loss.” These frequencies are usually transmitted in a random order rather than going from the lowest frequency to the highest. At the time this study was conducted, nine frequencies were employed for routine examinations, 250, 500, 1000, 2000, 3000, 4000, 6000, and 8000 Hz, but this study concentrated on only four of them. The reason for this was primarily one of speed and capacity of the first-generation mainframe computers in use at that time. It had been demonstrated that given three frequencies (1000 Hz, 2000 Hz, and 4000 Hz) the other six could be predicted with reasonable precision. However, 500 Hz was also included in the study because this frequency, along with 1000 Hz and 2000 Hz, was used by the New York State Worker’s Compensation Board in determining hearing impairment caused by exposure to occupational noise.

To put things in perspective, 500 Hz corresponds roughly to the C one octave above “middle C” on a piano (about 523 Hz) and the other three frequencies are successive octaves of 500 Hz (i.e., 500 Hz to 1000 Hz is one octave, 1000 Hz to 2000 Hz is another, and so on).

Earlier studies had usually broken the population into five age groups, each representing 10 years, covering the range of age 16-65. One problem with this stemmed from the monitoring of individuals over time. For example, a person aged 25 would be among the oldest in the first age group and, when compared to the average for that group, would generally appear to be worse off. The next year, this individual would be 26 and among the youngest in age group 26-35 and hence would appear to be better than average. The resolution of this problem, which will be discussed in Section 9.4, resulted in the age groups being restructured into 3-year increments, these being the smallest age brackets that would still produce adequate sample sizes, particularly for the older age groups.

The data were obtained from the Company medical records and included everyone who would be considered “normal” from the standpoint of hearing. Excluded from this normal group were:

1. Those who had any significant noise exposure in the present job, in a prior job, or while pursuing noisy avocations

2. Those who were subjected to significant noise while in military service

3. Anyone indicating any history of ear trouble

4. Anyone who exhibited a significantly high hearing threshold or any other evidence of an abnormal audiogram

The screening of individuals for steps 1, 2, and 3 was done from medical records. Step 4 was done by examining the $Q$- and $T^2$-statistics for each individual during the PCA phase of the analysis in the manner illustrated in Chapter 5. After this screening, 10 358 audiograms remained in the male normal group and 7672 in the female normal group. The sample sizes by age groups are shown in Table 9.1.

Because this is a description of statistical techniques rather than a clinical report, the procedures in the following sections will be illustrated for the male data only, but exactly the same steps were carried out for the female data. The results differed somewhat because, for these particular populations, the males had higher average hearing loss than the females for all age levels as well as greater variability at all age levels. The age functions described in Section 9.4 were different in some cases, due primarily to the fact that, owing to insufficient data, the female data for ages 47-64 were omitted and therefore the functions, which were generally nonlinear, were fitted over a considerably shorter range. Both groups used four principal components and these were quite similar.

The means and standard deviations for each male age group are displayed in Tables 9.2 and 9.3. Note that both quantities increase as a function of age.

Table 9.1. Audiometric Case Study. Sample Sizes for Three-Year Age Groups of Normals

                 Sample Size
Age Group     Males   Females
17-19          2215      3173
20-22          2243      1708
23-25          1424       619
26-28           796       388
29-31           515       273
32-34           405       233
35-37           337       294
38-40           300       243
41-43           312       224
44-46           344       180
47-49           323       128
50-52           363        85
53-55           276        45
56-58           232        40
59-61           156        28
62-64           117        11

Table 9.2. Audiometric Case Study. Sample Means-Males

                       Left Ear (Hz)                   Right Ear (Hz)
Age Group       500    1000    2000    4000     500    1000    2000    4000
17-19          -3.6    -4.0    -3.9      .5    -4.0    -4.0    -3.9      .0
20-22          -3.2    -3.3    -3.8     1.3    -3.6    -3.3    -3.5     1.1
23-25          -3.4    -3.3    -3.1     4.5    -3.8    -3.4    -3.4     3.4
26-28          -3.4    -2.7    -2.1     8.7    -3.9    -2.8    -2.5     7.0
29-31          -3.0    -2.3    -1.3    11.7    -3.7    -2.4    -1.9    10.9
32-34          -2.9    -2.4    -1.7    15.3    -3.5    -2.0    -1.7    14.3
35-37          -2.7    -1.8     1.3    17.9    -3.0    -2.0      .1    16.2
38-40          -2.2     -.5     1.4    21.4    -2.7     -.6      .9    19.3
41-43          -2.4     -.3     3.4    24.8    -2.4     -.2     2.0    23.5
44-46          -2.0      .0     3.3    25.2    -2.3     -.4     2.6    23.7
47-49          -1.2      .6     4.6    27.0    -1.6      .9     3.4    25.4
50-52           -.9     2.1     6.9    32.2     -.6     1.9     5.0    27.7
53-55           -.1     2.1     8.8    32.8     -.9     1.6     7.1    29.3
56-58           1.3     4.3    13.3    38.4      .7     4.7    10.4    36.6
59-61           1.8     5.8    14.1    40.1     2.2     5.2    11.0    36.4
62-64           3.9     7.4    17.1    43.6     2.9     7.1    16.8    40.9

Table 9.3. Audiometric Case Study. Sample Standard Deviations-Males

                       Left Ear (Hz)                   Right Ear (Hz)
Age Group       500    1000    2000    4000     500    1000    2000    4000
17-19          5.45    5.46    6.79   10.93    5.27    5.20    5.95    8.89
20-22          5.69    5.77    6.95   11.62    5.44    5.35    6.12    9.68
23-25          5.58    5.66    7.43   12.94    5.37    5.44    6.23   11.18
26-28          5.63    5.81    8.15   15.75    5.70    5.71    6.77   13.65
29-31          5.78    6.12    8.27   17.81    5.57    5.94    6.93   15.86
32-34          5.86    6.21    8.44   17.72    5.56    5.95    7.34   16.60
35-37          6.31    6.72    9.54   17.87    6.18    6.32    8.99   16.91
38-40          6.38    7.19   10.13   19.12    6.60    6.79    8.93   18.14
41-43          6.51    6.96   10.49   18.78    6.43    6.56    9.08   18.94
44-46          6.77    7.49   11.41   18.88    6.37    6.91   10.65   18.01
47-49          6.78    7.07   11.36   19.63    6.99    7.23   10.17   18.89
50-52          7.27    7.67   11.82   18.64    7.56    7.85   10.57   18.78
53-55          7.36    8.43   14.65   19.83    7.29    7.63   11.56   18.75
56-58          7.58    9.11   15.75   20.04    7.25    8.70   13.76   19.73
59-61          8.80    9.93   15.99   18.39    8.71   10.23   13.69   18.90
62-64          9.68   11.40   18.62   19.46    9.20   11.11   16.30   19.88

9.3 PRINCIPAL COMPONENT ANALYSIS

The principal component analysis was carried out for each age group separately. The correlation matrix for each age group was used rather than the covariance matrix because of the large differences in variability between the lower and higher frequencies. Previous studies had indicated that four principal components would leave, as a residual, that variability expected from repeat examinations on individuals. The U-vectors were quite similar to those obtained from the small data set in Chapter 5. The first pc represented overall hearing loss; the second pc represented the difference between the high frequencies (2000 Hz and 4000 Hz) and the low frequencies (500 Hz and 1000 Hz); the third pc was related primarily to the difference between 2000 Hz and 4000 Hz, while the fourth pc represented the left-right ear differences.

In fact, the U-vectors differed very little between age groups. What did change was the proportion of the total variability explained by the associated pc's. Section 16.6 contains a test of the hypothesis that certain unit vectors do not differ significantly among groups, only their roots. The present situation, although relevant, is different to the extent that these groups are a function of a continuous variable, age. (Also, the techniques listed in Section 16.6 were developed nearly 20 years after this study was carried out.) The characteristic roots, $l_i$, are given in Table 9.4 along with the amount unexplained, which is equal to the sum of the four remaining roots.

Since the correlation matrices all have the same trace, $\mathrm{Tr}(\mathbf{R}) = p = 8$, they may be compared directly. While it appears that the older the age group, the better the fit, this is due to the fact that the older the group, the more variability there is. This additional increment in variability is, by and large, being accounted for by the first four pc's and although the inherent variability is also growing, it is doing so more slowly, which accounts for the appearance of Table 9.4. Note that although $l_1$, the root associated with the first pc, overall hearing loss, increases steadily with age, $l_2$ (high-low) is constant while $l_3$ (2000-4000 Hz) and $l_4$ (left-right) decrease. This does not mean, for instance, that left-right differences decrease with age but merely that they do not increase as rapidly as the overall hearing loss. This illustrates one of the difficulties of using correlation matrices but the alternative of using covariance matrices would have yielded vectors heavily dominated by the higher frequencies.


Table 9.4. Audiometric Case Study. Sample Characteristic Roots-Males

                                                   Residual
Age Group        l1        l2        l3        l4     Trace
17-19         3.175     1.523      .898      .665     1.739
20-22         3.262     1.498      .929      .618     1.693
23-25         3.234     1.471      .984      .639     1.672
26-28         3.333     1.643      .898      .609     1.517
29-31         3.567     1.432      .923      .613     1.465
32-34         3.499     1.524      .991      .596     1.390
35-37         3.642     1.655      .900      .576     1.227
38-40         3.689     1.533      .935      .543     1.300
41-43         3.693     1.778      .769      .575     1.185
44-46         3.798     1.663      .877      .494     1.168
47-49         3.901     1.539      .798      .553     1.209
50-52         3.940     1.567      .841      .532     1.120
53-55         4.107     1.670      .671      .487     1.065
56-58         4.189     1.545      .827      .430     1.009
59-61         4.707     1.359      .637      .494      .813
62-64         4.820     1.510      .555      .424      .691
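Because every correlation matrix here has trace 8, the residual trace column of Table 9.4 can be checked from the four retained roots alone. The following is a minimal sketch of that check; the helper name `residual_trace` is an assumption made for illustration, and the roots are copied from two rows of the table.

```python
# Illustrative check of Table 9.4: residual trace = p - (l1 + l2 + l3 + l4).
# `residual_trace` is a hypothetical helper name, not part of the original study.

P = 8  # number of variables (4 frequencies x 2 ears), so Tr(R) = 8


def residual_trace(roots, p=P):
    """Return the variability left unexplained by the retained roots."""
    return p - sum(roots)


# Retained characteristic roots copied from Table 9.4 (males).
roots_by_age = {
    "17-19": [3.175, 1.523, 0.898, 0.665],
    "56-58": [4.189, 1.545, 0.827, 0.430],
}

for group, roots in roots_by_age.items():
    resid = residual_trace(roots)
    explained = sum(roots) / P
    print(f"{group}: residual trace = {resid:.3f}, "
          f"proportion explained = {explained:.3f}")
```

The same subtraction applied to the other rows gives a quick consistency check on any transcribed root.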

9.4 ALLOWANCE FOR AGE

As stated in the introduction to this chapter, the purpose of this study was to be able to characterize normal hearing as a function of age and sex. One would then be able to compare an individual with his or her peers to determine whether there was any evidence of abnormal hearing. One would also wish to be able to compare groups of individuals, say those representing a certain occupation or environment, to determine whether that group as a whole is different from the normal population. In this latter case, the group might cover a large gamut of ages. Since a PCA has been carried out on each of the 3-year age groups, y-scores may be obtained for each individual, and in the case of groups these scores may be combined to obtain information for the group as a whole.

However, when one is working with large databases such as these with over 18000 in the control groups alone, some simplification of the analytical procedures is desirable. For this application, the simplification was done in two parts: (1) simplification of the characteristic vectors and (2) expression of these new vectors directly as a function of age. It was stated in the previous section that, for a given sex, the U-vectors differed little among age groups. Further, quantities such as the mean and standard deviations of the original variables as well as the characteristic roots appeared to be monotonically related to age. The intent, then, is to combine these two results into an efficient procedure for carrying out the desired comparisons.


To do this, three assumptions were made:

Assumption 1. The coefficients for the left and right ears were the same (except for sign on $\mathbf{u}_4$) and could be represented by the average of the two ears for each frequency. It was assumed that this would also hold for the means and standard deviations of each frequency. There did appear to be a small but consistent bias between ears for 2000 Hz and 4000 Hz for the older age groups but, clinically, there was no reason why this should be so.

Assumption 2. The means, standard deviations and characteristic roots could be related to age by fairly simple functions. Clinically, there is no reason why this should not be so.

Figures 9.1, 9.2, and 9.3 show how the original quantities appeared as a function of age for the males and the functions used to relate them. Each age group was weighted equally in fitting these functions rather than taking the sample sizes into account. The means required exponential functions except for 4000 Hz, where a linear relationship was adequate. Exponential functions were also used for the standard deviations although, in the case of 4000 Hz, the sign of the exponential coefficient was negative, reflecting an upper asymptote. Linear functions were used for all four characteristic roots. (The values of $l_1$ for the two highest age groups were not used in fitting this line; it was assumed that their departure from a linear model was due to small sample sizes rather than any fundamental cause.) By using these functions to relate these various parameter estimates to age, it was felt that the precision of the system was probably enhanced over what it would have been using the original estimates for each age group. It was also necessary to obtain the limit for $Q$ as a function of age and this is shown in Figure 9.4.
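As a rough illustration of how such age functions are used, the sketch below evaluates three of the fitted expressions for the means at a single age. The coefficients are those that can be read from Figure 9.1 (500, 2000, and 4000 Hz); the function names are assumptions made for this example only.

```python
import math

# Fitted mean thresholds as functions of age, coefficients as read from
# Figure 9.1. The function names are hypothetical, chosen for illustration.


def predicted_mean_500(age):
    """Exponential fit for the 500 Hz mean."""
    return -4.06 + 0.080 * math.exp(0.072 * age)


def predicted_mean_2000(age):
    """Exponential fit for the 2000 Hz mean."""
    return -6.46 + 1.099 * math.exp(0.048 * age)


def predicted_mean_4000(age):
    """Linear fit for the 4000 Hz mean."""
    return -17.12 + 0.933 * age


age = 39
for label, fn in [("500 Hz", predicted_mean_500),
                  ("2000 Hz", predicted_mean_2000),
                  ("4000 Hz", predicted_mean_4000)]:
    print(f"{label}: predicted mean at age {age} = {fn(age):.1f}")
```

For age 39 these round to -2.7, .7, and 19.3, the predicted means used for the 39-year-old in Table 9.8.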

Assumption 3. The characteristic vectors could be approximated by a set of “streamlined” vectors, in the manner of the examples in Chapter 6. Recalling from that chapter, the criteria for such streamlining were that the approximations be simple, that the new “pc’s” correlate well with the old, and that they be relatively uncorrelated with each other. It can be seen from Table 9.5 that the approximations are simple. Subsequently, it will be seen that the other two criteria are fulfilled as well.

For the applications in Chapter 6, approximations of this sort were sufficient. However, the complexity of this operation requires that these be restated into something equivalent to U-, V-, and W-vectors. These will be designated by $\mathbf{U}^a$, $\mathbf{V}^a$, and $\mathbf{W}^a$, respectively.

To obtain $\mathbf{u}_i^a$, simply divide the elements of each column in Table 9.5 by the square root of the sum of squares of the coefficients. For instance, for $\mathbf{u}_1^a$ the coefficients are all 1’s; hence, the sum of squares is 8, so each coefficient is divided by $\sqrt{8}$ and becomes .3536. The sums of squares for the other three vectors are 84, 28, and 8 respectively and $\mathbf{U}^a$ is displayed in Table 9.6.
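The normalization just described can be sketched in a few lines. This is only an illustration: the integer coefficients are those of Table 9.5, and the helper name `to_unit_length` is an assumption of this example.

```python
import math

# Simplified (integer) vectors from Table 9.5, one list per vector.
# Row order: 500L, 1000L, 2000L, 4000L, 500R, 1000R, 2000R, 4000R.
basic_vectors = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [-3, -2, 2, 5, -3, -2, 2, 5],
    [1, 0, -3, 2, 1, 0, -3, 2],
    [-1, -1, -1, -1, 1, 1, 1, 1],
]


def to_unit_length(vector):
    """Divide each coefficient by the square root of the sum of squares."""
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector]


sums_of_squares = [sum(c * c for c in v) for v in basic_vectors]
unit_vectors = [to_unit_length(v) for v in basic_vectors]

print(sums_of_squares)
for v in unit_vectors:
    print([round(c, 4) for c in v])
```

The printed sums of squares are 8, 84, 28, and 8, and the rounded coefficients correspond to the columns of Table 9.6.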

[Figure 9.1: four panels (500, 1000, 2000, and 4000 cps), each plotting the male sample mean threshold against age (20 to 60) together with its fitted function. Legible fitted functions: 500 cps, $\bar{x} = -4.06 + .080e^{.072(\text{Age})}$, s.e.e. = .230; 1000 cps, s.e.e. = .454 (fitted function not legible); 2000 cps, $\bar{x} = -6.46 + 1.099e^{.048(\text{Age})}$, s.e.e. = .644; 4000 cps, $\bar{x} = -17.12 + .933(\text{Age})$, s.e.e. = 1.23.]

FIGURE 9.1. Audiometric Case Study: male sample means vs. age. Reproduced from Jackson and Hearne (1978) with permission from Biometrie-Praximetrie.

[Figure 9.2: four panels (500, 1000, 2000, and 4000 cps), each plotting the male sample standard deviation against age (20 to 60) together with its fitted function. Legible fitted functions: 500 cps, $s = 5.19 + .098e^{.059(\text{Age})}$, s.e.e. = .246; 1000 cps, constant 5.34 and multiplier .075 with the exponent coefficient not legible, s.e.e. = .258; 2000 cps, $s = 4.54 + .919e^{.041(\text{Age})}$, s.e.e. = .486; 4000 cps, $s = 19.67 - 57.55e^{-.094(\text{Age})}$, s.e.e. = .762.]

FIGURE 9.2. Audiometric Case Study: male sample standard deviations vs. age. Reproduced from Jackson and Hearne (1978) with permission from Biometrie-Praximetrie.

[Figure 9.3 (caption not recovered): the four characteristic roots for males plotted against age (20 to 60) with fitted lines $l_1 = 2.68 + .026(\text{Age})$, s.e.e. = .101; $l_2 = 1.55$, s.e.e. = .104; $l_3 = 1.129 - .007(\text{Age})$, s.e.e. = .078; $l_4 = .744 - .005(\text{Age})$, s.e.e. = .031.]

[Figure 9.4: the .05 critical value of $Q$ for males plotted against age (20 to 60) with fitted line $Q_{.05} = 5.07 - .049(\text{Age})$.]

FIGURE 9.4. Audiometric Case Study: Q vs. age. Reproduced from Jackson and Hearne (1978) with permission from Biometrie-Praximetrie.


Table 9.5. Audiometric Case Study. Simplified Vectors, Basic Form-Males

                    Vectors
Frequency      1     2     3     4
500L           1    -3     1    -1
1000L          1    -2     0    -1
2000L          1     2    -3    -1
4000L          1     5     2    -1
500R           1    -3     1     1
1000R          1    -2     0     1
2000R          1     2    -3     1
4000R          1     5     2     1

Table 9.6. Audiometric Case Study. Simplified U-Vectors-Males

Frequency        u1        u2        u3        u4
500L          .3536    -.3273     .1890    -.3536
1000L         .3536    -.2182         0    -.3536
2000L         .3536     .2182    -.5669    -.3536
4000L         .3536     .5455     .3780    -.3536
500R          .3536    -.3273     .1890     .3536
1000R         .3536    -.2182         0     .3536
2000R         .3536     .2182    -.5669     .3536
4000R         .3536     .5455     .3780     .3536

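Let $\mathbf{D}^a$ be a diagonal matrix made up of the predicted standard deviations, obtained as a function of age, from the relationships displayed in Figure 9.2. Let $\mathbf{L}^a$ be the diagonal matrix of predicted characteristic roots as obtained from Figure 9.3. Then following (1.6.1) and (1.6.2):

$$\mathbf{V}^a = \mathbf{D}^a \mathbf{U}^a [\mathbf{L}^a]^{1/2} \qquad (9.4.1)$$

and

$$\mathbf{W}^a = [\mathbf{D}^a]^{-1} \mathbf{U}^a [\mathbf{L}^a]^{-1/2} \qquad (9.4.2)$$

That is, each element of $\mathbf{V}^a$ is the corresponding element of $\mathbf{U}^a$ multiplied by the standard deviation for its frequency and by the square root of its characteristic root, and each element of $\mathbf{W}^a$ is the element of $\mathbf{U}^a$ divided by those same two quantities.

This means that there are different sets of $\mathbf{V}^a$- and $\mathbf{W}^a$-vectors for each age from 17 through 64, 48 in all, but since the quantities $\mathbf{D}^a$ and $\mathbf{L}^a$ are functionally related to age, these vectors need not be stored but could be computed as required by storing the functions instead. $\mathbf{V}^a$ and $\mathbf{W}^a$ correspond to $\mathbf{V}^*$ and $\mathbf{W}^*$ (Section 3.3) in that they may be used with the data in original rather than standard units.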


9.5 PUTTING IT ALL TOGETHER

To see how this all fits together, let us consider the example of a person aged 39. The $\mathbf{V}^a$- and $\mathbf{W}^a$-vectors for age 39 are given in Table 9.7.

Consider the entries for 500 Hz for the first characteristic vector. $u_{11}^a$, from Table 9.6, is .3536. From Figure 9.2, the standard deviation for 500 Hz for a 39-year-old is $5.19 + .098\exp\{(.059)(39)\} = 6.168$. The characteristic root associated with the first vector is, from Figure 9.3, $2.68 + (.026)(39) = 3.694$, whose square root is 1.922. Then $v_{11}^a = (.3536)(1.922)(6.168) = 4.19$ and $w_{11}^a = (.3536)/\{(1.922)(6.168)\} = .0298$. Later we shall need the $\alpha = .05$ critical value for $Q$, which from Figure 9.4 becomes $5.07 - (.049)(39) = 3.16$. With the large combined sample employed in this study, one would feel quite comfortable in using large-sample limits for the other quantities as well, resulting in limits of $\pm 1.96$ for the pc’s and 9.49 for $T^2$.
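The arithmetic above can be reproduced with a short sketch. The fitted functions are those quoted in the text for the 500 Hz standard deviation, the first characteristic root, and the limit for Q; the function names are assumptions made for illustration.

```python
import math

U_11 = 0.3536  # first-vector coefficient for 500 Hz, from Table 9.6


def sd_500(age):
    """Predicted standard deviation for 500 Hz (Figure 9.2)."""
    return 5.19 + 0.098 * math.exp(0.059 * age)


def root_1(age):
    """Predicted first characteristic root (Figure 9.3)."""
    return 2.68 + 0.026 * age


def q_limit(age):
    """Predicted .05 critical value for Q (Figure 9.4)."""
    return 5.07 - 0.049 * age


age = 39
s = sd_500(age)
root = root_1(age)

v_11 = U_11 * math.sqrt(root) * s    # element of V^a for 500 Hz, first pc
w_11 = U_11 / (math.sqrt(root) * s)  # element of W^a for 500 Hz, first pc

print(f"s = {s:.3f}, l1 = {root:.3f}, sqrt(l1) = {math.sqrt(root):.3f}")
print(f"v11 = {v_11:.2f}, w11 = {w_11:.4f}, Q limit = {q_limit(age):.2f}")
```

Because only the functions of age are stored, the same three helpers give the 500 Hz entries for any age from 17 through 64.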

Table 9.7. Audiometric Case Study. Simplified V- and W-Vectors-Males Age 39

Frequency        v1        v2        v3        v4        w1        w2        w3        w4
500L           4.19     -2.51      1.08     -1.62     .0298    -.0426     .0330    -.0773
1000L          4.38     -1.75         0     -1.69     .0285    -.0272         0    -.0739
2000L          6.17      2.47     -4.78     -2.38     .0203     .0193    -.0673    -.0525
4000L         12.36     12.36      6.38     -4.77     .0101     .0241     .0224    -.0262
500R           4.19     -2.51      1.08      1.62     .0298    -.0426     .0330     .0773
1000R          4.38     -1.75         0      1.69     .0285    -.0272         0     .0739
2000R          6.17      2.47     -4.78      2.38     .0203     .0193    -.0673     .0525
4000R         12.36     12.36      6.38      4.77     .0101     .0241     .0224     .0262

Table 9.8 contains the data for one 39-year-old male. The values for $\bar{x}_{39}$ are obtained from the expressions in Figure 9.1. Also included are the predicted $\hat{x}$ based on the y-scores for this individual ($y_1 = -.52$, $y_2 = .55$, $y_3 = -.92$, and $y_4 = -.11$), and the components that made up the residual sum of squares, $Q$.

Table 9.8. Audiometric Case Study. Sample Data for 39-year-old Male

                                               x̂ =          Res. =
Frequency         x      x̄_39   x - x̄_39   x̄_39 + Vy   (x - x̂)/s_39   (Res.)²
500L            -10      -2.7       -7.3       -7.07          -.47        .22
1000L             0      -1.3        1.3       -4.35           .67        .45
2000L             0        .7        -.7        3.51          -.39        .15
4000L            15      19.3       -4.3       14.33           .04        .00
500R            -10      -2.7       -7.3       -7.43          -.42        .18
1000R            -5      -1.3       -3.7       -4.73          -.04        .00
2000R             5        .7        4.3        2.99           .22        .05
4000R            15      19.3       -4.3       13.28           .09        .01
                                                                    1.06 = Q

Table 9.9 compares these statistics with those for the same individual using characteristic vectors based only on the age group 38-40. These results are fairly typical. Table 9.10 gives the correlations between the principal components using the simplified vectors and those using the original vectors as obtained from each age group. With a few exceptions related to the fourth component, these correlations are quite high. The few low correlations occur quite randomly throughout the age span and it may be reasonable to assume that the approximations might, in fact, be superior to the original vectors in those instances.
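The quantities reported for this individual can be sketched from the tabled vectors as follows. The data, predicted means, and V- and W-vectors are copied from Tables 9.7 and 9.8. The standard deviations are not tabled, so this example recovers them from the first V-vector; that step is an assumption of the sketch rather than the procedure used in the study.

```python
import math

# Row order throughout: 500L, 1000L, 2000L, 4000L, 500R, 1000R, 2000R, 4000R.
x = [-10, 0, 0, 15, -10, -5, 5, 15]                     # observed thresholds
x_bar = [-2.7, -1.3, 0.7, 19.3, -2.7, -1.3, 0.7, 19.3]  # predicted means, age 39

# Simplified V- and W-vectors for age 39, copied from Table 9.7.
V = [
    [4.19, -2.51, 1.08, -1.62],
    [4.38, -1.75, 0.0, -1.69],
    [6.17, 2.47, -4.78, -2.38],
    [12.36, 12.36, 6.38, -4.77],
    [4.19, -2.51, 1.08, 1.62],
    [4.38, -1.75, 0.0, 1.69],
    [6.17, 2.47, -4.78, 2.38],
    [12.36, 12.36, 6.38, 4.77],
]
W = [
    [0.0298, -0.0426, 0.0330, -0.0773],
    [0.0285, -0.0272, 0.0, -0.0739],
    [0.0203, 0.0193, -0.0673, -0.0525],
    [0.0101, 0.0241, 0.0224, -0.0262],
    [0.0298, -0.0426, 0.0330, 0.0773],
    [0.0285, -0.0272, 0.0, 0.0739],
    [0.0203, 0.0193, -0.0673, 0.0525],
    [0.0101, 0.0241, 0.0224, 0.0262],
]

# y-scores: y = W'(x - x_bar); T2 is their sum of squares.
dev = [xi - mi for xi, mi in zip(x, x_bar)]
y = [sum(W[i][j] * dev[i] for i in range(8)) for j in range(4)]
t2 = sum(yj ** 2 for yj in y)

# ASSUMPTION: standard deviations are recovered from the first V-vector,
# s_i = v_i1 / (u_i1 * sqrt(l_1)), with u_i1 = .3536 and l_1 = 3.694.
scale = 0.3536 * math.sqrt(3.694)
s = [row[0] / scale for row in V]

# Predicted thresholds, residuals in standard units, and Q.
x_hat = [x_bar[i] + sum(V[i][j] * y[j] for j in range(4)) for i in range(8)]
res = [(x[i] - x_hat[i]) / s[i] for i in range(8)]
q = sum(r ** 2 for r in res)

print([round(v, 2) for v in y], round(t2, 2), round(q, 2))
```

The y-scores and T² agree with the tabled values to two decimals. Q comes out close to the tabled 1.06, with small differences expected from rounding in the tabled vectors and from the recovered standard deviations.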

Table 9.9. Audiometric Case Study. Comparison of New Method Adjusted for Age with Vectors Obtained for 38-40 Age Group for Data Shown in Table 9.8

             New      Age Group
          Method          38-40
y1          -.52           -.60
y2           .55            .51
y3          -.92           -.84
y4          -.11           -.33
T²          1.42           1.43
Q           1.06            .62

Table 9.10. Audiometric Case Study. Males: Correlations Between pc’s Using Original and Simplified Vectors (Correlations > .995 = .99+)

Age          First    Second     Third    Fourth
Group           PC        PC        PC        PC
17-19          .99      .99+       .99       .98
20-22          .99      .99+       .99       .98
23-25          .99      .99+       .99       3 9
26-28          .99      .99+       .94       .91
29-31         .99+      .99+       .97       .95
32-34          .99       .97       .92       .60
35-37          .99      .99+       .97       .96
38-40          .99      .99+       .97       .95
41-43         .99+      .99+       .97       .90
44-46         .99+      .99+       .96       .54
47-49         .99+      .99+       .97      .99+
50-52         .99+       .99       .98       .96
53-55         .99+       .99       .96       .97
56-58         .99+       .99       .96       .59
59-61         .99+       .99        38       .86
62-64         .99+       .99       .96       .96


Table 9.11. Audiometric Case Study. Males: Correlations Among pc’s Using Simplified Vectors

                              Components
Age Group      1-2      1-3      1-4      2-3      2-4      3-4
17-19          .07     -.08     -.02      .02      .01      .02
20-22          .07     -.09      .00      .00      .01     -.02
23-25          .09     -.09     -.02      .05      .00     -.03
26-28          .09     -.16     -.02      .11      .00     -.01
29-31          .14     -.07     -.02      .09      .01     -.06
32-34          .07     -.11      .02      .16      .04      .03
35-37          .10     -.09     -.04      .09      .01     -.06
38-40          .10     -.12     -.04      .10     -.03      .02
41-43          .15     -.14     -.03      .04     -.01     -.03
44-46          .15     -.13     -.03      .03     -.02      .06
47-49          .13     -.14      .00      .06     -.02     -.02
50-52          .14     -.10     -.02      .08      .06      .02
53-55          .19     -.16      .00     -.03      .04     -.04
56-58          .20     -.14     -.04      .00      .12      .07
59-61          .15     -.20      .01      .19      .12     -.08
62-64          .18     -.16      .01      .01     -.07     -.04

How well the final criterion was handled can be seen in Table 9.11, which gives the correlations among all of the approximate pc’s. These are quite low. It was concluded that this system of approximations had produced a viable procedure for the routine analysis of audiometric data that adjusted for the aging process and, as will be seen in the next section, may also be used in studying groups of individuals.

9.6 ANALYSIS OF GROUPS

Even though a group of individuals may represent several different age groups, the use of PCA and the generalized statistics described in Chapter 6 allows these data to be combined for purposes of significance tests. Some of the generalized statistics can be used directly, in particular $T_0^2$ (equation 6.2.1) and $T_D^2$ (equation 6.2.4). $Q_0$ (equation 6.4.1) can be employed but requires some additional work. Computation of $Q_0$ requires the use of equation 2.7.5 and, to take age into account, $\theta_1, \theta_2, \ldots$ would have to be expressed as a function of age. (These quantities involve the last four characteristic roots and, for this particular example, that information was no longer available when the statistic was developed.) $Q_M$ (equation 6.4.2) can also be employed but since the $Q_{.05}$ used for this test is a function of age, one would have to assume that use of the average age of the group is adequate. $T_M^2$ (equation 6.3.1) is required to obtain $T_D^2$ but must be modified somewhat to be reported on its own. The reason for this is the fourth pc, which represents left-right ear difference. For a group of individuals, one would expect, on the average, as many people with trouble in only the left ear as those with trouble in only the right ear and that these differences would cancel each other out. For a summary statistic, one would want an indication of the overall left-right difficulties. Accordingly, $T_M^2$ must be modified as follows: Equation (6.3.1) is

$$T_M^2 = n\bar{y}_1^2 + n\bar{y}_2^2 + n\bar{y}_3^2 + n\bar{y}_4^2$$

Each of these terms is distributed in the limit as $\chi^2$ with one degree of freedom, so $T_M^2$, as used here, would be distributed as $\chi_4^2$. What is required is the substitution of some function of $|y_4|$ for $n\bar{y}_4^2$. Jackson and Hearne (1979a) showed that the quantity

$$g = n\left[1.6589\left(\sum_{i=1}^{n}|y_{4i}|\right)\Big/n - 1.3236\right]^2 \qquad (9.6.1)$$

is approximately distributed as $\chi^2$ with one degree of freedom. Therefore, $T_M^2$ may be replaced with

$$T_{M'}^2 = n\sum_{i=1}^{3}\bar{y}_i^2 + g \qquad (9.6.2)$$

which is asymptotically distributed as $\chi_4^2$ and life goes on as before.

Table 9.12 shows the data for five individuals, one being the same 39-year-old that was used in the previous section. Also included are the y-scores, $T^2$, $Q$, and the residuals expressed in standard units. Nothing is significant with the exception of $y_2$ for respondent #1.

$T_0^2$ is equal to the sum of the five values of $T^2$, which for this example is 23.05. The limit, $\chi^2_{(4)(5),.05}$, is equal to 31.4 so $T_0^2$ is not significant. From Table 9.12, $T_M^2$ is seen to be 14.42. $T_D^2$, the difference, is equal to 8.63 and is also not significant when compared with its limit, $\chi^2_{(4)(4),.05} = 26.3$.

To obtain $T_{M'}^2$, first obtain $g$ from (9.6.1), which is equal to

$$g = 5[(1.6589)(.236) - 1.3236]^2 = 4.3440$$

and, as seen in Table 9.12, produces a $T_{M'}^2$-value of 18.58. This is significant when compared to $\chi^2_{4,.05} = 9.49$. The sum of squares of the residual averages is .3551, which, when multiplied by 5, is equal to $Q_M = 1.78$. The average age of these five individuals is 37.8, which would produce a value of $Q_{.05} = 3.22$, so that is not significant. $Q_0$, as stated above, cannot be computed for this particular data set.
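The group statistics quoted above can be reproduced with the following sketch. It takes the individual $T^2$ values, the y-score averages, the mean of $|y_4|$, and the residual averages as tabulated in Table 9.12, and the critical values as quoted in the text; the variable names are assumptions made for illustration.

```python
# Group statistics for the five males, using values as tabulated in Table 9.12.
n = 5
ages = [18, 25, 39, 47, 60]
t2_individual = [7.44, 5.00, 1.42, 5.63, 3.56]
y_bar = [1.096, 0.576, -1.146, -0.192]   # averages of y1..y4 as tabulated
mean_abs_y4 = 0.236                      # (sum of |y4|) / 5 as tabulated
res_bar = [-0.174, 0.404, -0.210, -0.178, -0.128, 0.196, -0.064, -0.164]

t2_0 = sum(t2_individual)                # overall statistic
t2_m = n * sum(v ** 2 for v in y_bar)    # statistic for the means
t2_d = t2_0 - t2_m                       # difference

# Equation (9.6.1): replacement for the fourth term, based on |y4|.
g = n * (1.6589 * mean_abs_y4 - 1.3236) ** 2
# Equation (9.6.2): modified statistic for the means.
t2_m_mod = n * sum(v ** 2 for v in y_bar[:3]) + g

q_m = n * sum(r ** 2 for r in res_bar)
mean_age = sum(ages) / n
q_limit = 5.07 - 0.049 * mean_age        # Q limit at the average age

# Critical values quoted in the text.
limits = {"T2_0": 31.4, "T2_D": 26.3, "T2_M'": 9.49, "Q_M": q_limit}
stats = {"T2_0": t2_0, "T2_D": t2_d, "T2_M'": t2_m_mod, "Q_M": q_m}

for name, value in stats.items():
    flag = "significant" if value > limits[name] else "not significant"
    print(f"{name} = {value:.2f} (limit {limits[name]:.2f}): {flag}")
```

Only the modified statistic for the means exceeds its limit, which matches the conclusion drawn in the text.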


Table 9.12. Audiometric Case Study. Statistics for Group of Five Males of Differing Ages

                            Age of Individual
Frequency          18       25       39       47       60     Ave.

Original Data
500L              -10        0      -10       10       10
1000L               0        5        0       15       15
2000L              10       10        0       20       35
4000L              15       15       15       30       55
500R               -5        0      -10        5        5
1000R              -5        5       -5       15       15
2000R               5       10        5       20       40
4000R              10       15       15       25       60

T²               7.44     5.00     1.42     5.63     3.56
Q                1.43      .49     1.06      .89      .45

Principal Components
y1               1.08     1.74     -.52     1.75     1.43    1.096
y2               2.14      .34      .55     -.87      .72     .576
y3              -1.18    -1.37     -.92    -1.26    -1.00   -1.146
y4               -.57        0     -.11     -.50        0    -.192

(Σ|y4|)/5 = .236

Residuals in Standard Units
500L             -.89     -.09     -.47      .24      .34    -.174
1000L             .41      .39      .68      .42      .12     .404
2000L             .05     -.17     -.38     -.27     -.28    -.210
4000L            -.16     -.23      .04     -.16     -.38    -.178
500R              .35     -.09     -.42     -.24     -.24    -.128
1000R            -.16      .39     -.04      .67      .12     .196
2000R            -.39     -.17      .22     -.02      .04    -.064
4000R            -.38     -.23      .09     -.17     -.13    -.164

$T_M^2 = 5[(1.096)^2 + \cdots + (-.192)^2] = 14.42$

$T_{M'}^2 = 5[(1.096)^2 + \cdots + (-1.146)^2] + 4.3440 = 18.58$

$Q_M = 5[(-.174)^2 + \cdots + (-.164)^2] = 1.78$

Overall, nothing is significant except $T_{M'}^2$, and from that one would conclude that the overall average of these individuals is significantly different from the normal group against which they are being compared. A glance at the y-score averages, whose limits are $\pm 1.96/\sqrt{5} = \pm .88$, shows that the first and third pc’s are significant; the overall hearing loss is greater and the differences between 2000 Hz and 4000 Hz are less than would be expected from the normal group.
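The comparison of the y-score averages with their limits can be sketched as follows, using the averages tabulated in Table 9.12; the dictionary keys are labels chosen for this example.

```python
import math

n = 5
limit = 1.96 / math.sqrt(n)  # large-sample limit for an average of n y-scores

# y-score averages as tabulated in Table 9.12.
y_bar = {"y1": 1.096, "y2": 0.576, "y3": -1.146, "y4": -0.192}

significant = {name: abs(value) > limit for name, value in y_bar.items()}

print(f"limit = +/-{limit:.2f}")
for name, value in y_bar.items():
    print(f"{name}: average = {value:.3f}, significant = {significant[name]}")
```

Only the first and third averages fall outside the limits, in line with the interpretation given above.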