

Test (1996) Vol. 5, No. 2, pp. 357-377


A Bayesian Approach to Selection and Ranking Procedures: The Unequal Variance Case

A. J. VAN DER MERWE and J. L. DU PLESSIS

Department of Mathematical Statistics, Faculty of Science, University of the O.F.S., P.O. Box 339, Bloemfontein, 9300, Republic of South Africa

SUMMARY

By reanalyzing a well-known data set on the breaking strength and thickness of starch films, it is concluded that Fong's assumption of an exchangeable prior for the regression coefficients is correct but that the assumption of equal error variances might be wrong. This conclusion is reached by calculating the Intrinsic Bayes factor for various models. The theory and results derived by Fong (1992) are therefore extended to the unequal variance case. This goal is achieved by implementing the Gibbs sampler. The vector of posterior probabilities thus obtained provides an easily understandable answer to the selection problem.

Keywords: BAYESIAN SELECTION; EXCHANGEABLE PRIOR; COVARIATE MODEL; MULTIPLE SLOPES; INTRINSIC BAYES FACTOR; BEHRENS-FISHER PROBLEM; GIBBS SAMPLER.

1. INTRODUCTION

As mentioned by Gibbons et al. (1977), the methods known generally as selection and ranking procedures include techniques appropriate for many different goals, although each different goal requires a careful formulation of the corresponding problem.

Received April 95; Revised January 96.

For any given set of n populations some of the goals that can be accomplished by these methods are

(a) Selecting the one best population.
(b) Ordering all of the n populations from best to worst (or vice versa).
(c) Selecting the t best populations for t ≥ 2, (i) in an ordered manner or (ii) in an unordered manner.
(d) Selecting a random number of populations, say r, that includes the t best populations.
(e) Selecting a fixed number of populations, say r, that includes the t best populations.
(f) Ordering a fixed-size subset of the populations from best to worst (or vice versa).
(g) Selecting a random number of populations such that all populations better than a control population or a standard are included in the selected subset.

Procedures appropriate for the first two goals are the primary subject of this article.

Selection and ranking procedures have been developed in modern statistical methodology over the past 40 years with fundamental papers beginning with Bechhofer (1954) and Gupta (1956). A discussion of their respective differences and the various modifications that have taken place since then can be found in the literature (e.g. Dudewicz (1976), Gibbons, Olkin and Sobel (1977), Gupta and Panchapakesan (1979) and Dudewicz and Koo (1982)).

Bayesian papers in the literature dealing with ranking and selection of normal means for one- and two-way models include Berger and Deely (1988), Deely and Zimmer (1988), Fong (1990), Fong (1992), Fong and Berger (1993) and Fong, Chow and Albert (1994).

Choosing the largest of several means can be a demanding problem, especially in the presence of a covariate. By reanalyzing a well-known data set on the breaking strength and thickness of starch films (Freeman (1942) and Scheffé (1959)) and by assuming equal error variances among the seven starches as well as an exchangeable prior for the regression coefficients, Fong (1992) considered a hierarchical Bayesian approach


to ranking and selection as well as estimation of related means in the presence of a covariate. For the multiple slopes model he computed, in addition to the posterior means and standard deviations of the parameters, the posterior probabilities that each mean, at a given value of the covariate, is the largest.

In the first part of this paper it will be shown that Fong's assumption of exchangeability seems to be correct but that the assumption of equal error variances might be wrong. This conclusion is made by applying the Intrinsic Bayes factor to the example under discussion. For details about the Intrinsic Bayes factor see Berger and Pericchi (1994) and (1996) and Sanso and Pericchi (1994).

The problem of making inferences about means with no assumption of equal error variances is called the Behrens-Fisher problem. The literature devoted to this problem is immense, and there appears to be no satisfactory solution within classical statistics. On the other hand, there is little controversy within the Bayesian community about the Behrens-Fisher problem. Therefore in the second part of this paper the theory and results derived by Fong (1992) will be extended to the unequal variance case. This goal will be achieved by implementing the Gibbs sampler.

Technical difficulties arising in the calculation of the marginal posterior densities needed for Bayesian inference have long served as an impediment to the wider application of the Bayesian framework to real data. The reason for this is that the integration operation plays a fundamental role in Bayesian statistics. In the last few years there have been a number of advances in numerical integration and analytic approximation techniques for such calculations, but implementation of these approaches typically requires sophisticated numerical or analytic approximation expertise and possibly specialist software. While it was possible for Fong (1992) to calculate the posterior probabilities for the equal variance case using numerical integration, this is not so for the unequal variance case. However, due to the work of Gelfand and Smith (1990), Gelfand et al. (1990), Carlin et al. (1992) and Gelfand et al. (1992), the Gibbs sampler has been shown to be a useful tool for applied Bayesian inference in a broad variety of statistical problems. The Gibbs sampler is implicit in the work of Hastings (1970) and was made popular in the image processing context by Geman and Geman (1984). The Gibbs sampler is an adaptive Monte Carlo integration technique. The typical objective of the sampler is to


collect a sufficiently large number of parameter realizations from conditional posterior densities in order to obtain accurate estimates of the marginal posterior densities. The principal requirement of the sampler is that all conditional densities must be available in the sense that random variates can be generated from them.

2. MODEL ASSUMPTIONS AND PRIOR DISTRIBUTION

Table 2.1 is a well-known data set on the breaking strength y in grams and the thickness x in 10^-4 inches from tests on seven types of starch film (Freeman (1942), Scheffé (1959) and Fong (1992)).

Table 2.1. Breaking Strength (y) and Thickness (x) of Starch Films.

Wheat   Rice   Canna   Corn   Potato   Dasheen   Sweet Potato
y  x    y  x   y  x    y  x   y  x     y  x      y  x

[The individual (y, x) readings are omitted here: the column structure of the table body could not be reliably recovered from the extracted text.]


Fong pointed out that although it was imprudent to assume a common regression coefficient for the given problem (the F-test calculated by Freeman was significant at the 5% level but not at the 1% level), it also does not seem wise to take the other extreme and estimate the regression coefficients separately for each individual starch film. Relationships among the regression coefficients, such as a prior belief in exchangeability of the regression coefficients, should be incorporated into the analysis to find improved estimates. By assuming equal error variances and using a hierarchical Bayesian approach, Fong proposed a solution by calculating the posterior probabilities of each starch film, for a fixed thickness, having the greatest breaking strength. The vector of posterior probabilities thus evaluated gave a fairly complete and easily interpretable answer.

A visual inspection of the data, however, shows that the error variances among the seven starches might differ, which makes pooling inappropriate and the assumption of a common error variance invalid.

The single-factor covariate model with multiple slopes can be written as

y_ij = α_i + β_i x_ij + ε_ij,   i = 1, ..., n;  j = 1, ..., m_i,   (2.1)

where y_ij and x_ij represent the j-th observation on the dependent variable and the value of the covariate associated with the j-th study unit for the i-th treatment, respectively. α_i and β_i are unknown parameters. Fong (1992) assumed that the ε_ij ~ N(0, σ²) are independent normal errors with mean zero and common (unknown) variance σ².

As mentioned by Fong (1992) and Fong et al. (1994), when the β_i's are thought to be different but similar, it is inappropriate to assume a common slope or to treat the slopes as totally unrelated quantities. For exchangeable slopes it is most convenient to model the exchangeability through a hierarchical Bayesian approach. Typically the prior distribution is given in two stages, namely, the β_i's are independently and normally distributed as N(β, σ_β²) and π(β, σ_β²) is the prior on the hyperparameters. Should the information at the second stage be vague, then a locally uniform prior like π(β, σ_β²) = 1 will be assumed.

For model comparison purposes the unequal variance case ε_ij ~ N(0, σ_i²) (i = 1, ..., n) must also be considered in this paper, as well as a vague prior on the β_i's, i.e. π(β_1, ..., β_n) = 1. Therefore in the next


section combinations of the following priors will be used in calculating the Intrinsic Bayes factors:

π(α_1, ..., α_n) = 1,
π(β_1, ..., β_n | β, σ_β²) = N(β1, σ_β² I_n),
π(β, σ_β²) = 1,
π(σ²) ∝ σ⁻²,  π(σ_i²) ∝ σ_i⁻²  and
π(β_1, ..., β_n) = 1.

Here 1 is a vector of ones and I_n is the identity matrix of order n.

3. THE INTRINSIC BAYES FACTOR FOR MODEL SELECTION

Suppose that the set of models M_1, ..., M_p are under consideration, with the data y following the density f_i(y | θ_i) under M_i. We wish to choose a model M_i out of this set of models. If noninformative priors π^N(θ_i) are used for the parameter vectors θ_i, which are typically improper, then the resultant Bayes factor is indeterminate. One way to overcome this difficulty is to consider part of the data, y(ℓ), as a so-called training sample, and to compute its marginal m^N(y(ℓ)) with respect to the noninformative prior, together with the posterior π^N(θ_i | y(ℓ)). The Bayes factors can then be computed with the remainder of the data, y(−ℓ), using the π^N(θ_i | y(ℓ)) as priors:

B_ji(ℓ) = ∫ f_j(y(−ℓ) | θ_j, y(ℓ)) π^N(θ_j | y(ℓ)) dθ_j / ∫ f_i(y(−ℓ) | θ_i, y(ℓ)) π^N(θ_i | y(ℓ)) dθ_i.

We can obviously do this only if the training sample marginal m^N(y(ℓ)) is proper. This will be the case if 0 < m^N(y(ℓ)) < ∞. A training sample y(ℓ) is called minimal if its marginal is proper and no subset of y(ℓ) results in a proper marginal. The rest of the data is called a maximum discriminating sample. These considerations led Berger and Pericchi (1994) and (1996) to introduce the Intrinsic Bayes factor. The arithmetic Intrinsic Bayes factor is obtained as the arithmetic average of B_ji(ℓ) over all possible training samples, i.e.

B_ji^AI = (1/L) Σ_{ℓ=1}^{L} B_ji(ℓ)


where L is the number of all possible minimal training samples.

In order to make the Intrinsic Bayes factor computationally feasible in complex problems where the marginals are not available in closed form, Markov Chain Monte Carlo procedures such as Gibbs sampling can be used to obtain approximations. To calculate B_ji(ℓ), expectations of the form E_{π_ℓ}[f_k(y(−ℓ) | θ_k, y(ℓ))], where π_ℓ = π^N(θ_k | y(ℓ)), have to be considered; thus if a sample θ_1, ..., θ_n from π_ℓ is available, then the law of large numbers yields the approximation

E_{π_ℓ}[f_k(y(−ℓ) | θ_k, y(ℓ))] ≈ (1/n) Σ_{j=1}^{n} f_k(y(−ℓ) | θ_j, y(ℓ)).

These calculations can be simplified by making use of the fact that

E_{π_ℓ}[f_k(y(−ℓ) | θ_k, y(ℓ))] = ( E_{π*}[ 1 / f_k(y(−ℓ) | θ_k, y(ℓ)) ] )⁻¹   (3.1)

where π* = π^N(θ_k | y). Equation (3.1) can be approximated by

( (1/n) Σ_{i=1}^{n} 1 / f_k(y(−ℓ) | θ_i, y(ℓ)) )⁻¹.

Whether or not the marginals are available in closed form, if the number of training samples is large, the suggested procedure is to take a random selection of training samples. It was shown by Sanso and Pericchi (1994) that random selection procedures give impressive results for the regression examples they considered.
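As a concrete illustration of identity (3.1), the sketch below estimates the predictive density of the discriminating sample from posterior draws; the function name and the convention of passing log-likelihood values are our own, not the authors'.

```python
import numpy as np

def predictive_density_estimate(loglik_draws):
    """Approximate E_pi[f_k(y(-l) | theta, y(l))] via equation (3.1):
    the reciprocal of the average reciprocal likelihood over draws
    theta_1, ..., theta_n from the full posterior pi^N(theta | y).
    Works on the log scale for numerical stability."""
    neg = -np.asarray(loglik_draws, dtype=float)
    # log of (1/n) * sum_i exp(-loglik_i), via a running log-add-exp
    log_mean_recip = np.logaddexp.reduce(neg) - np.log(neg.size)
    return np.exp(-log_mean_recip)
```

With such estimates of the numerator and denominator marginals, B_ji(ℓ) is their ratio, and the arithmetic Intrinsic Bayes factor is the average of these ratios over the (possibly randomly selected) minimal training samples.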

4. MODEL COMPARISONS AND GIBBS SAMPLING TO CALCULATE THE INTRINSIC BAYES FACTOR

The models that will be considered are

M1: ε_ij ~ N(0, σ²);   π(β_1, ..., β_n | β, σ_β²) = N(β1, σ_β² I_n)
M2: ε_ij ~ N(0, σ_i²);  π(β_1, ..., β_n | β, σ_β²) = N(β1, σ_β² I_n)
M3: ε_ij ~ N(0, σ²);   π(β_1, ..., β_n) = 1
M4: ε_ij ~ N(0, σ_i²);  π(β_1, ..., β_n) = 1

for i = 1, ..., n and j = 1, ..., m_i.


The Gibbs sampling procedure will be illustrated for model M2. The procedure is similar for the other models.

Combining (2.1) and the corresponding priors defined in Section 2, the joint posterior can be written as

p(α_1, ..., α_n, β_1, ..., β_n, β, σ_β², σ_1², ..., σ_n² | y)
∝ [ Π_{i=1}^{n} (σ_i²)^{−(m_i/2 + 1)} ] exp{ −Σ_{i=1}^{n} (1/(2σ_i²)) Σ_{j=1}^{m_i} (y_ij − α_i − β_i x_ij)² }
× (1/σ_β²)^{n/2} exp{ −(1/(2σ_β²)) Σ_{i=1}^{n} (β_i − β)² }

where

−∞ < α_i < ∞,  −∞ < β_i < ∞,  σ_i² > 0,  −∞ < β < ∞,  σ_β² > 0

and y (Σ_{i=1}^{n} m_i × 1) is the vector of breaking strengths.

The marginal posterior densities p(α_i, β_i | y), p(β | y), p(σ_β² | y) and p(σ_i² | y) cannot be derived analytically and, because of the many unknowns, are very difficult to calculate numerically. However, for practical purposes they can be successfully estimated using the Gibbs sampling approach. For the Gibbs sampler we need the complete conditional posterior distributions. These conditional distributions are

β_i | y, β, σ_β², σ_i² ~ N(δ_i b_i, δ_i)   (4.1)

where

b_i = (Σ_{j=1}^{m_i} x_ij y_ij − m_i x̄_i ȳ_i)/σ_i² + β/σ_β²

and

δ_i = σ_i² σ_β² / (σ_i² + σ_β² s_i²).

Also

x̄_i = (1/m_i) Σ_{j=1}^{m_i} x_ij,  ȳ_i = (1/m_i) Σ_{j=1}^{m_i} y_ij  and  s_i² = Σ_{j=1}^{m_i} x_ij² − m_i x̄_i².

β | y, β_1, ..., β_n, σ_β² ~ N(β̄, σ_β²/n)   (4.2)

where β̄ = Σ_{i=1}^{n} β_i / n,

p(σ_β² | y, β_1, ..., β_n, β) ∝ (σ_β²)^{−n/2} exp{ −(1/(2σ_β²)) Σ_{i=1}^{n} (β_i − β)² },  σ_β² > 0,   (4.3)

is an Inverse Gamma density, and

p(σ_i² | y, α_i, β_i) ∝ (σ_i²)^{−(m_i/2 + 1)} exp{ −(1/(2σ_i²)) Σ_{j=1}^{m_i} (y_ij − α_i − β_i x_ij)² },  σ_i² > 0,   (4.4)

is also an Inverse Gamma density.

The Gibbs sampling procedure can now easily be implemented. Standard routines will be used to generate random numbers from the required distributions. The iterative process starts by using arbitrary starting values

α_i^(0), β_i^(0), β^(0), σ_β^2(0) and σ_i^2(0)  (i = 1, ..., n)

to calculate the first iteration using the above conditional posterior densities. After q iterations, in which the conditional densities were updated at each iteration, the Gibbs sampler has generated the values

α_i^(q), β_i^(q), β^(q), σ_β^2(q) and σ_i^2(q)  (i = 1, ..., n).

The process is then repeated m times. The results for this experiment have been obtained using q = 30 and m = 10000.
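To make the steps concrete, the sketch below implements one replication of this Gibbs scheme for model M2, under our reading of the conditionals (4.1)-(4.4); the α_i draw uses the standard normal full conditional implied by the flat prior on the intercepts. Function and variable names are ours, and the data arrays are hypothetical stand-ins for the starch measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_m2(ys, xs, q=30):
    """One Gibbs replication for model M2 (unequal variances, exchangeable
    slopes); returns the q-th draw.  ys, xs: lists of 1-D arrays, one
    (breaking strength, thickness) pair of arrays per treatment."""
    n = len(ys)
    m = np.array([len(y) for y in ys])
    xbar = np.array([x.mean() for x in xs])
    ybar = np.array([y.mean() for y in ys])
    s2 = np.array([(x**2).sum() - mi * xb**2 for x, mi, xb in zip(xs, m, xbar)])
    sxy = np.array([(x*y).sum() - mi*xb*yb
                    for x, y, mi, xb, yb in zip(xs, ys, m, xbar, ybar)])
    # arbitrary starting values: least-squares slopes and unit variances
    beta_i = sxy / s2
    beta, sb2 = beta_i.mean(), beta_i.var() + 1.0
    si2 = np.ones(n)
    for _ in range(q):
        # (4.1): beta_i | y, beta, sb2, si2  (alpha_i marginalised out)
        delta = si2 * sb2 / (si2 + sb2 * s2)
        b = sxy / si2 + beta / sb2
        beta_i = rng.normal(delta * b, np.sqrt(delta))
        # alpha_i | y, beta_i, si2 under the flat prior (standard result)
        alpha_i = rng.normal(ybar - beta_i * xbar, np.sqrt(si2 / m))
        # (4.2): beta | beta_1..beta_n, sb2
        beta = rng.normal(beta_i.mean(), np.sqrt(sb2 / n))
        # (4.3): sb2 is inverse gamma
        S = ((beta_i - beta)**2).sum()
        sb2 = 1.0 / rng.gamma(n/2 - 1, 2.0/S)
        # (4.4): si2 is inverse gamma, one per treatment
        sse = np.array([((y - a - bi*x)**2).sum()
                        for y, x, a, bi in zip(ys, xs, alpha_i, beta_i)])
        si2 = 1.0 / rng.gamma(m/2, 2.0/sse)
    return alpha_i, beta_i, beta, sb2, si2
```

Repeating this replication m times, with fresh starting points, yields the m retained draws used below.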


The convergence of the algorithm is discussed in Gelfand and Smith (1990) and Gelfand et al (1990).

In a further study, which is not included in this paper, we also compared the marginal posterior distributions for q = 40 and 50, as well as for an alternative Markov Chain Monte Carlo (MCMC) procedure in which, after a long run (i.e. k sufficiently large), successive albeit correlated values of the random variable of interest were used to describe the unknown marginal distribution. As "burn-in" period we used q = 500 iterations. The densities obtained indicated that convergence had been achieved in all cases, which means that q = 30 and m = 10000 are appropriate for our problem.

It is less expensive to use successive values in the Gibbs sampling procedure. Gelman and Rubin (1992), however, warned "that particularly during the first tentative examination of a new problem, it can be argued that monitoring the evolutionary behaviour of several runs of the chain starting from a wide range of initial values is necessary". It is for this reason that we have decided to stick to the "traditional" procedure.

The more complex model here is M2, so the minimal training samples are of size 21 (3 from each of the 7 starches) and, as in Sanso and Pericchi (1994), it was found that taking a sample of only 5% of all minimal training samples produces satisfactory results, particularly if 10% trimming is used.

The arithmetic Intrinsic Bayes factor for comparing M2 to M1 is

B_21^AI = 1.9288 × 10^15,

while that for comparing M2 to M4 is

B_24^AI = 1.1 × 10^30,

much in favour of Model 2. The assumption of exchangeability of the regression coefficients therefore seems to be correct but the assumption of equal error variances might be wrong.

We also considered another model,

M5: y_ij = α_i + β_i x_ij + ε_ij^[k],

where the ε_ij^[k] ~ N(0, σ_[k]²) (k = 1, 2, 3) are independent normal errors with mean zero and (unknown) variance σ_[k]². σ_[1]² is the common error variance for starches 1, 2 and 6, σ_[2]² for starches 3 and 4, and σ_[3]² for starches 5 and 7, with prior assumptions π(σ_[k]²) ∝ σ_[k]⁻² (k = 1, 2, 3) and π(β_1, ..., β_n | β, σ_β²) = N(β1, σ_β² I_n). Inspection of the sample error variances indicated that the starches could be classified in these three almost homogeneous groups.

M5 was in fact the best model, only slightly better than M2. The value of the arithmetic Intrinsic Bayes factor for comparing these two models indicates that the improvement of M5 over M2 amounts to a factor of 86.42.

Kass and Raftery (1994) argued that a Bayes factor larger than 150 is decisive evidence against a null hypothesis. Since 86.42 is appreciably smaller, and because grouping can be somewhat controversial, it was decided to stay with Model 2.

5. BAYESIAN SELECTION AND RANKING

Using the m = 10 000 values of each parameter generated by the Gibbs sampler, the required posterior probabilities for any fixed x are given by

p_i = (1/m) Σ_{q=1}^{m} I( α_i^(q) + β_i^(q) x is the largest ),  i = 1, ..., n,   (5.1)

where I(·) is the indicator function.
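Computationally, (5.1) is a frequency count over the stored Gibbs draws. A minimal sketch (our own code, not the authors'; alpha_draws and beta_draws are hypothetical (m × n) arrays of the sampled α_i^(q) and β_i^(q)):

```python
import numpy as np

def ranking_probabilities(alpha_draws, beta_draws, x):
    """Equation (5.1): the fraction of Gibbs draws in which treatment i has
    the largest value of alpha_i + beta_i * x.  Draw arrays are (m, n)."""
    gamma = alpha_draws + beta_draws * x          # (m, n) breaking strengths
    winners = gamma.argmax(axis=1)                # index of the largest, per draw
    return np.bincount(winners, minlength=gamma.shape[1]) / gamma.shape[0]
```

The returned vector sums to one and can be read directly as the posterior probability that each starch is best at the chosen thickness.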

The unconditional posterior densities p(α_i, β_i | y), p(γ_i,x | y), p(β | y), p(σ_β² | y) and p(σ_i² | y) can also easily be estimated using the Rao-Blackwell procedure. For example,

p(γ_i,x | y) ≈ (1/m) Σ_{q=1}^{m} p(γ_i,x | y, β^(q), σ_β^2(q), σ_i^2(q))

where

γ_i,x = α_i + β_i x  and  γ_i,x | y, β, σ_β², σ_i² ~ N(u_i, V_i).

The mean and variance can be obtained from (4.1) and are given by

u_i = ȳ_i + (x − x̄_i)(σ_β² s_i² β̂_i + σ_i² β)/(σ_i² + σ_β² s_i²)

and

V_i = σ_i²/m_i + (x − x̄_i)² σ_i² σ_β²/(σ_i² + σ_β² s_i²)

where

β̂_i = (Σ_{j=1}^{m_i} x_ij y_ij − m_i x̄_i ȳ_i) / s_i².

See also Fong (1992) for the equal error variance case.
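The Rao-Blackwell average above is, computationally, just a mixture of normal densities over the retained draws. A minimal sketch (names ours), evaluated on a grid:

```python
import numpy as np

def rao_blackwell_density(grid, means, variances):
    """Rao-Blackwell estimate of p(gamma_{i,x} | y): the average over Gibbs
    draws of the conditional normal density N(u_i, V_i), evaluated on a grid.
    means, variances: arrays of u_i, V_i, one entry per retained draw."""
    grid = np.asarray(grid, dtype=float)
    means = np.asarray(means, dtype=float)[:, None]
    sd = np.sqrt(np.asarray(variances, dtype=float))[:, None]
    dens = np.exp(-0.5 * ((grid[None, :] - means) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
    return dens.mean(axis=0)
```

Because each term is an exact conditional density rather than a point mass, this estimate is smoother than a histogram of the same draws.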

As in Fong (1992) we will start our analysis by comparing different estimates of the slopes. Table 5.1 gives the least squares estimates of the regression coefficients, the Bayesian estimates obtained by Fong for the equal error variance case and the Bayesian estimates for the unequal variance case calculated by using Gibbs sampling.

Table 5.1. Posterior Estimates and the Least Squares Estimates for the Regression Coefficients.

              Wheat   Rice    Canna   Corn    Potato  Dasheen  Sweet Potato
              (i=1)   (i=2)   (i=3)   (i=4)   (i=5)   (i=6)    (i=7)

Bayes Estimates - Equal Error Variances
E(β_i|y)       87.9    56.8    61.4   158.2    38.2    72.3    114.4
√Var(β_i|y)    21.9    34.2    16.8    38.6    19.3    51.2     45.0

Bayes Estimates - Unequal Error Variances - Gibbs Sampling
E(β_i|y)       88.8    47.5    63.5   173.4    45.1    61.0    116.7
√Var(β_i|y)    10.4    12.6    13.0    27.3    37.0    18.0     46.1

Least Squares Estimates
β̂_i            89.0    45.6    59.3   188.9    32.5    59.0    138.6
s.e.(β̂_i)      23.1    38.9    17.3    37.5    19.5    72.4     55.0


Notice that for the unequal error variance case, the posterior estimates are closer to each other than the least squares estimates, but not as similar as the estimates for the equal variance case. This could be expected because shrinkage is usually less dramatic when the error variances are unequal.

The posterior standard deviations for the unequal error variance case are in general smaller than the corresponding ones for the least squares and equal error variance estimates. Potato and Sweet Potato are, however, exceptions.

In Figure 5.1 the posterior density of the regression coefficient for rice is displayed as well as the mean, mode, standard deviation and 95% credibility interval.

Figure 5.1. Posterior Density of the Regression Coefficient for the Rice Starch Film.
E(β_2|y) = 47.37 gram; Mo(β_2|y) = 46.9 gram; √Var(β_2|y) = 12.57 gram; 95% Credibility Interval = 22.7 gram to 73.0 gram.


It is clear from the figure that the density function is quite symmetric. The 95% credibility interval is 22.7 to 73.0, much smaller than the corresponding one for the equal variance case or for the maximum likelihood estimate.

Table 5.2 gives the posterior ranking probabilities for a thickness of x = 9 for the unequal as well as the equal error variance cases.

Table 5.2. Posterior Ranking Probabilities for Thickness x = 9 (γ_i,9) for the Unequal and Equal Error Variance Cases.

Posterior ranking   Unequal Error Variances   Equal Error Variances
probabilities       Gibbs Sampling            Fong (1992)
p_1                 0.007                     0.015
p_2                 0.000                     0.000
p_3                 0.003                     0.004
p_4                 0.665                     0.507
p_5                 0.300                     0.444
p_6                 0.011                     0.021
p_7                 0.014                     0.007

These p_i's were calculated using equation (5.1), i.e. the probabilities were obtained by taking the number of times the value of (α_i + β_i x) was the largest of the seven values at each Gibbs simulation and then dividing this number by the total number of Gibbs simulations. The posterior ranking probability for Corn (p_4 = 0.665) is the largest. This probability is somewhat larger than the corresponding probability (p_4 = 0.507) for the equal error variance case, computed by Fong (1992).

The corresponding posterior means, modes, standard deviations and 95% credibility intervals of the breaking strengths (for x = 9) are also evaluated; the results for the equal as well as the unequal error variance cases are presented in Table 5.3.


Table 5.3. Posterior Means, Modes, 95% Credibility Intervals and Standard Deviations for Breaking Strengths.

Unequal Error Variances - Gibbs Sampling

              Mean    Mode    95% Credibility Interval   Standard Deviation
Wheat         674.6   674.9   591.3 - 756.9              41.8
Rice          599.1   598.0   561.4 - 638.8              19.4
Canna         727.3   727.4   670.4 - 783.8              28.0
Corn          909.2   917.4   766.0 - 1035.0             67.7
Potato        844.2   844.2   596.7 - 1089.2             122.7
Dasheen       574.4   580.6   464.9 - 668.3              50.2
Sweet Potato  697.0   682.6   563.6 - 868.5              74.2

Equal Error Variances - Fong (1992)

              Mean    Mode    95% Credibility Interval   Standard Deviation
Wheat         669.1   668.0   485.0 - 855.1              93.7
Rice          610.7   611.7   499.5 - 718.6              55.6
Canna         722.2   722.0   644.9 - 798.7              38.7
Corn          873.5   875.7   679.0 - 1072.3             99.1
Potato        863.4   863.0   734.2 - 991.1              64.1
Dasheen       605.6   611.6   313.9 - 899.5              143.1
Sweet Potato  693.8   693.7   574.1 - 813.4              60.0

As mentioned by Fong (1992), the order of the posterior means does not coincide with the order of the ranking probabilities. This occurs because the posterior variances are very different from one another; this is even more so for the unequal error variance case. Except for Potato and Sweet Potato, the posterior standard deviations and credibility intervals are smaller for the unequal error variance case than those calculated for the equal variance case.

Figure 5.2 is a representation of the unconditional posterior distribution γ_2;x | y (x = 9) for the Rice starch film, with mean, mode, standard deviation and credibility interval as given in Table 5.3 for the unequal error variance case. The graph was obtained using the Rao-Blackwell procedure.

Figure 5.2. p(γ_2;x | y) - Posterior Density of the Breaking Strength for the Rice Starch Film (x = 9).
E(γ_2;x | y) = 599.1 gram; Mo(γ_2;x | y) = 598.0 gram; √Var(γ_2;x | y) = 19.4 gram; 95% Credibility Interval = 561.4 gram to 638.8 gram.

As in Laird and Louis (1989), we can extend our approach by treating the ranks of the breaking strengths as the parameters of interest, and developing ranking methods based on the posterior distribution of the ranks rather than the posterior distribution of the breaking strengths. This will be achieved by using the Gibbs sampler. This approach to ranking has the advantage of reporting the posterior means and variances of the ranks. Using these, rather than integer ranks, can give a much clearer picture of what differences there are (if any) among the breaking strengths of the seven starches.
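A sketch of this rank computation (our own code; gamma_draws is a hypothetical (m × n) array of the γ_i,x draws at the chosen thickness), assigning rank 1 to the largest breaking strength within each Gibbs draw:

```python
import numpy as np

def rank_posterior(gamma_draws):
    """Posterior summaries of the ranks (1 = largest breaking strength),
    following the Laird-Louis idea: rank the n treatments within each Gibbs
    draw, then summarise the ranks across draws.  gamma_draws is (m, n)."""
    order = (-gamma_draws).argsort(axis=1)        # descending order per draw
    ranks = np.empty_like(order)
    m, n = gamma_draws.shape
    ranks[np.arange(m)[:, None], order] = np.arange(1, n + 1)
    return ranks.mean(axis=0), ranks.std(axis=0)
```

The full posterior distribution of each treatment's rank is likewise available from the columns of the ranks array.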


Table 5.4. Posterior Means, Modes and Standard Deviations of the Ranks of the Starches for Breaking Strength (x = 9).

              Mean   Mode   Standard Deviation
Wheat         4.50   5      1.01
Rice          6.18   6      0.64
Canna         3.22   3      0.74
Corn          1.34   1      0.57
Potato        2.26   2      1.33
Dasheen       6.44   7      0.82
Sweet Potato  4.07   4      1.27

In Table 5.4 summary statistics of the posterior distributions of the ranks for the seven starches are given, and in Figure 5.3 the posterior distribution of the ranks for Wheat is given.

One of the main reasons for conducting this experiment was to determine whether the breaking strengths of the seven starches differ from each other. In the Bayesian case, the posterior distribution contains all the available information about a parameter. Therefore, one approach to determine whether the breaking strengths of two starches differ from each other is to calculate the posterior distribution of the difference between the two parameters.

Define

δ_1 = γ_2;x | y − γ_1;x | y   (x = 9),

i.e. the difference between the breaking strengths of Rice and Wheat. Since the conditional posterior distribution of δ_1 is normal with mean u_2 − u_1 and variance V_2 + V_1, one can use the Gibbs sampler and the Rao-Blackwell procedure to obtain the unconditional posterior distribution of δ_1. This density is displayed in Figure 5.4.

From the figure it can be seen that the 95% as well as the 90% credibility intervals contain zero, which is a good indication that there is no real difference between Rice and Wheat. We must, however, warn that indiscriminate use of the procedure may be dangerous, for if we have n starches, where n is large, then the probability is high that at least one of the n(n − 1)/2 differences will, due to chance alone, be judged significantly different.

Figure 5.3. Posterior Density of the Ranks for the Wheat Starch Film (x = 9).
Mean = 4.50; Mode = 5; Standard Deviation = 1.01.

Using the Gibbs sampler it is also possible to obtain the posterior distribution of the largest breaking strength value for, say, x = 9. At each Gibbs simulation the largest γ value was put into a vector, and a histogram of this vector's elements was obtained. It was found that a Pearson curve of Type IV can be fitted to this histogram. The estimated density is represented in Figure 5.5.

For more information about the different types of Pearson curves and their applications refer to Elderton and Johnson (1969).

Figure 5.4. p(δ_1 | y) - Posterior Density of δ_1 = γ_2;x | y − γ_1;x | y, the Difference in Breaking Strengths between Rice and Wheat for x = 9.
E(δ_1 | y) = -75.58 gram; Mo(δ_1 | y) = -76.29 gram; √Var(δ_1 | y) = 46.12 gram; 95% Credibility Interval = -166.29 gram to 15.71 gram; 90% Credibility Interval = -155.09 gram to 10.11 gram.

ACKNOWLEDGEMENTS

The authors are grateful to the referees and editor for valuable comments and suggestions. The research was financially supported by the FRD and the University of the Orange Free State Research Fund.

REFERENCES

Bechhofer, R. E. (1954). A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.

Berger, J. O. and Deely, J. J. (1988). A Bayesian approach to ranking and selection of related means with alternatives to analysis-of-variance methodology. J. Amer. Statist. Assoc. 83, 364-373.


Figure 5.5. Posterior Density of the Largest Breaking Strength for x = 9. Mean = 938.43 gram, Mode = 921.87 gram, Standard Deviation = 74.99 gram, 95% Credibility Interval = 807.79 gram to 1107.3 gram

Berger, J. O. and Pericchi, L. R. (1994). The intrinsic Bayes factor for linear models. Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press (with discussion).

Berger, J. O. and Pericchi, L. R. (1996). The intrinsic Bayes factor for model selection and prediction. J. Amer. Statist. Assoc. 91, 109-122.

Carlin, B. P., Gelfand, A. E. and Smith, A. F. M. (1992). Hierarchical Bayesian analysis of changepoint problems. Appl. Statist. 41, 389-405.

Deely, J. J. and Zimmer, W. J. (1988). Choosing a quality supplier - a Bayesian approach. Bayesian Statistics 3 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.). Oxford: University Press, 585-592.

Dudewicz, E. J. (1976). Introduction to Statistics and Probability. New York: Holt, Rinehart and Winston.

Dudewicz, E. J. and Koo, J. O. (1982). The Complete Categorized Guide to Statistical Selection and Ranking Procedures. Columbus, OH: American Science Press.

Freeman, H. A. (1942). Industrial Statistics. New York: Wiley.


Fong, D. K. H. (1990). Ranking and estimation of related means in two-way models - a Bayesian approach. J. Statist. Computation and Simulation 34, 107-117.

Fong, D. K. H. (1992). Ranking and estimation of related means in the presence of a covariate - a Bayesian approach. J. Amer. Statist. Assoc. 87, 1128-1136.

Fong, D. K. H. and Berger, J. O. (1993). Ranking, estimation and hypothesis testing in unbalanced two-way additive models - a Bayesian approach. Statistics and Decisions 11, 1-24.

Fong, D. K. H., Chow, M. and Albert, J. H. (1994). Selecting the normal population with the best regression value - a Bayesian approach. J. Statist. Planning and Inference 40, 97-111.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pat. Anal. and Mach. Intel. 6, 721-741.

Gelfand, A. E., Hills, S. E., Racine-Poon, A. and Smith, A. F. M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Amer. Statist. Assoc. 85, 972-985.

Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85, 398-409.

Gelfand, A. E., Smith, A. F. M. and Lee, T. M. (1992). Bayesian analysis of constrained parameters and truncated data problems using Gibbs sampling. J. Amer. Statist. Assoc. 87, 523-532.

Gelman, A. and Rubin, D. B. (1992). A single series from the Gibbs sampler provides a false sense of security. Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press, 627-635.

Gibbons, J. D., Olkin, I. and Sobel, M. (1977). Selecting and Ordering Populations. New York: Wiley.

Gupta, S. S. (1956). On a Decision Rule for a Problem in Ranking Means. Ph.D. Thesis, University of North Carolina at Chapel Hill.

Gupta, S. S. and Panchapakesan, S. (1979). Multiple Decision Procedures. New York: Wiley.

Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97-109.

Laird, N. M. and Louis, T. A. (1989). Empirical Bayes ranking methods. J. Educational Stat. 14, 29-46.

Sansó, B. and Pericchi, L. R. (1994). Calculating intrinsic Bayes factors using Monte Carlo. Tech. Rep. 94-104, Universidad Simón Bolívar, Caracas.

Scheffé, H. (1959). The Analysis of Variance. New York: Wiley.