
Genetic Resources and Crop Evolution 47: 1–9, 2000. © 2000 Kluwer Academic Publishers. Printed in the Netherlands.


A method for establishing a core collection of Saccharum officinarum L. germplasm based on quantitative-morphological data

R. Balakrishnan*, N.V. Nair & T.V. Sreenivasan
Sugarcane Breeding Institute, Coimbatore 641 007, India (*Author for correspondence. Fax: 42 2472923)

Received 16 November 1998; accepted in revised form 21 December 1998

Key words: core collection, Gower metric, principal components, Saccharum officinarum, Shannon-Weaver Diversity Index

Abstract

A technique is proposed for establishing a representative core collection of S. officinarum accessions from the world collection of sugarcane germplasm maintained at the Sugarcane Breeding Institute-Research Centre, Cannanore, India. In the proposed method, the accessions were first sorted based on their relative contributions to the total variability by means of principal component scores based on a set of quantitative characters. Then, the cumulative proportion of their contributions to the total variance was computed. A logistic regression model was fitted to evaluate the functional relationship between the cumulative proportion of variance and the number of accessions. The size of the core set was decided as the inflection point on that fitted curve, i.e., the point beyond which the rate of increase in cumulative proportion of variability contributed by an accession began to decline. A method for eliminating entries with a high degree of similarity from the selected core set is also proposed.

Abbreviations: SDI – Shannon-Weaver Diversity Index; GSS – Generalized Sum of Squares; PCS – Principal Component Scores.

Introduction

The concept of a core collection was first proposed by Frankel (1984) as a means for improved utilization and accessibility to vast collections of crop germplasm being maintained by large gene banks. Effective use of large collections of genetic material could be greatly facilitated by the establishment of a small, well characterized core collection, which represents with a minimum of duplication most of the genetic diversity found in the entire collection. The genetic diversity of a species may not always be randomly dispersed but may be organized to varying degrees. This means that carefully chosen sub-samples of the species should contain most of the genetic diversity. One common approach for selecting a core set is to stratify the collections and select a representative set by random sampling from each of the classified groups (Frankel & Brown, 1984; Hintum, 1995). The basis of stratification may be the geographical source of the collections or groupings based on morphological and quantitative descriptors by appropriate multivariate methods.

Spagnoletti Zeuli & Qualset (1993) evaluated five sampling strategies for obtaining a core set in durum wheat. The methods they employed mainly classified the collections by geographical source and selected accessions randomly from each group to constitute the core collection. Hamon & Noirot (1990) and Noirot et al. (1996) employed the method of principal component analysis on quantitative data sets for establishing core collections. Their approach differed in the mode of sampling, which was no longer random, but instead was designed to maximize diversity and avoid redundancy or duplicates. Mahajan et al. (1996) proposed a methodology for constituting a core set in okra by using principal component scores as suggested by Hamon & Noirot (1990) and clustering the accessions for further selection of the core set. The size of the core set was decided by comparing the diversity of the core set with that of the original collection by means of the Shannon-Weaver Diversity Index for a selected set of qualitative descriptors.

Roach (1995) emphasized the need for establishing a core collection in sugarcane to improve conservation, documentation and optimal utilization of sugarcane germplasm. However, until now no attempt had been made in this direction in India, which holds one of the two world collections of sugarcane germplasm. Over 750 S. officinarum accessions are maintained in the world collection at the Sugarcane Breeding Institute-Research Centre, Cannanore, India, representing a vast spectrum of variability both in terms of geographic origin and individual plant traits. Of the 750 accessions, 713 accessions have been systematically characterized and documented for botanical, yield and quality attributes and for disease ratings (Sreenivasan & Nair, 1991). In order to make the best use of the information available, we have used this data set to develop a core collection of S. officinarum clones.

Materials and methods

The data used for the present investigation pertains to 713 accessions of Saccharum officinarum, maintained in the world collection of sugarcane germplasm at the Sugarcane Breeding Institute-Research Centre, Cannanore, on the west coast of India (Latitude 11°5′ N; Longitude 75°2′ E). 476 of these accessions have been collected during several expeditions from 1895 to 1977, from the Indonesia–New Guinea area, which is the primary centre of diversity for this species. The remaining accessions represent other geographical areas like India, Fiji, Hawaii, Mauritius etc. The clones were planted in single rows of 2 m length during March, 1987, adopting uniform cultural practices. Morphological characters were recorded during October–November, which coincides with the flowering season. Yield and quality data were recorded at the 10th month after planting. The number of millable canes (NMC) was recorded on a per plot basis. Cane thickness, cane length and single cane weight were recorded as the average of 5 canes. Brix% juice (total soluble solids) was recorded on 5 canes at the 7th and 10th months. Sucrose% juice was estimated using Horne's method as detailed in Meade & Chen (1977). The detailed data set has been presented in the germplasm catalogue on S. officinarum (Sreenivasan & Nair, 1991). Of the 33 morphological descriptors given in this catalogue, 27 were considered for evaluating the phenotypic diversity by means of a diversity index (Table 1a). In addition, 10 quantitative characters were considered for computing the contribution of individual accessions to the overall variability (Table 1b) and also to form clusters based on Euclidean distances. Out of 713 accessions, 690 accessions had complete sets of data and were used in the analyses.

Table 1. (a) List of descriptors used for computing the Shannon-Weaver diversity index

| S. No | Descriptor | No. of descriptor states |
|---|---|---|
| 1 | Ivory or corky marks | 2 |
| 2 | Weather marks/corky patch | 2 |
| 3 | Internode shape | 6 |
| 4 | Internode alignment | 2 |
| 5 | Internode cross section | 7 |
| 6 | Internode waxiness | 5 |
| 7 | Splits/growth cracks | 2 |
| 8 | Stripes on cane | 2 |
| 9 | Node swelling | 3 |
| 10 | Root zone swelling | 3 |
| 11 | Bud shape | 12 |
| 12 | Bud size | 3 |
| 13 | Bud cushion | 3 |
| 14 | Bud germpore | 3 |
| 15 | Bud groove | 3 |
| 16 | Bud hairiness | 2 |
| 17 | Bud extension | 2 |
| 18 | Growth ring swelling | 3 |
| 19 | Leaf upper surface | 2 |
| 20 | Leaf carriage | 3 |
| 21 | Leaf sheath waxiness | 3 |
| 22 | Leaf sheath prickles | 5 |
| 23 | Leaf sheath clasping | 2 |
| 24 | Ligule shape | 12 |
| 25 | Ligule hairiness | 2 |
| 26 | Ligule process symmetry | 2 |
| 27 | Ligular process shape | 15 |

Table 1. (b) List of quantitative descriptors used for principal component analysis

| S. No | Descriptor | Range | Mean |
|---|---|---|---|
| 1 | Leaf length (cm) | 107 | 136.5 |
| 2 | Leaf width (cm) | 7.1 | 5.0 |
| 3 | Number of millable canes/plot | 55 | 18.0 |
| 4 | Cane thickness/diameter (cm) | 2.3 | 2.5 |
| 5 | Cane length/height (cm) | 311 | 281.0 |
| 6 | Single cane weight (kg) | 2.8 | 1.2 |
| 7 | Brix% at 210 days after sowing | 17.2 | 16.8 |
| 8 | Brix% at 300 days after sowing | 13.6 | 16.6 |
| 9 | Sucrose% at 300 days after sowing | 17.6 | 12.6 |
| 10 | Purity% at 300 days after sowing | 58.6 | 75.4 |

Statistical analysis

Data on 10 quantitative characters were subjected to principal component analysis. The principal component scores for all 10 components were computed for all accessions. Using the approach of Noirot et al. (1996), the contribution of the i-th accession to the total variability (or Generalized Sum of Squares) of the system was computed as:

$$P_i = \sum_{j=1}^{t} y_{ij}^{2},$$

where $y_{ij}$ is the component score of the i-th accession on the j-th principal component. The relative contribution of the i-th accession was computed as $CR_i = P_i/(p \times t) \times 100$, where p stands for the total number of accessions and t for the number of principal components. The product (p × t) is called the Generalized Sum of Squares (GSS) and is a measure of the total variance of the system with a set of p individuals in the factor space of t standardized and independent variables.

The relative contribution of each accession to the total GSS was computed, and the accessions were sorted in the descending order of their relative contributions. The cumulative percentage of variance (or cumulative inertia) was also computed by successively adding the contribution of each of the accessions to the GSS.
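The scoring and sorting steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `accession_contributions`, the use of NumPy, and the choice to standardize both the input characters and the component scores to unit variance (so that the squared scores sum to p × t) are assumptions made for the example.

```python
import numpy as np

def accession_contributions(X):
    """Relative contribution CR_i (%) of each accession to the GSS.

    X is a (p accessions) x (t quantitative characters) array.
    Assumption: scores on every component are standardized to unit
    variance, so the squared scores sum to p * t over all accessions.
    """
    X = np.asarray(X, dtype=float)
    p, t = X.shape
    # Standardize each character to mean 0 and (population) variance 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Principal components from the correlation matrix of the characters.
    corr = (Z.T @ Z) / p
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Component scores, rescaled so each component has unit variance.
    Y = (Z @ eigvecs) / np.sqrt(eigvals)
    P = (Y ** 2).sum(axis=1)          # P_i = sum_j y_ij^2
    CR = P / (p * t) * 100.0          # CR_i = P_i / (p * t) * 100
    order = np.argsort(-CR)           # descending contribution
    cumulative = np.cumsum(CR[order]) # cumulative inertia (%)
    return CR, order, cumulative
```

With these conventions the contributions add up to 100%, so the last value of `cumulative` is 100 regardless of the data supplied.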

As the cumulative inertia for successive accessions (which have been sorted in descending order of their individual contribution) follows the typical epidemiological model akin to the one suggested by Van der Plank (1963), a logistic regression model of the form $y/(A - y) = \exp(a + bn)$ was fitted. This can be converted to a linear form by taking natural logarithms on both sides of the above relation as $\log_e (y/(A - y)) = a + bn$, where a and b are the intercept and regression coefficient, respectively; y is the cumulative inertia; n is the serial number of the accession (in the serial order as per the sorted data) and A stands for the asymptote of the curve, which equals 100.

In the above model, the cumulative inertia of successive accessions increases at an accelerating rate up to a certain number of accessions, after which the rate of increase declines. The dy/dn curve describes the rate of progress of inertia by successive accessions, and can be computed as dy/dn = by(A − y). This curve starts from a low value, progressively increases and, after reaching a peak, starts declining. The number of accessions included at the peak, which corresponds to the inflection point of the fitted progress curve, was taken to constitute the core set. This we designate as core set A.
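The fitting step can be illustrated with a short sketch. It assumes that the linearized model is fitted by ordinary least squares and that the rate curve is evaluated on the fitted values; the function name `fit_logistic_core_size` is hypothetical. Differentiating the fitted logistic curve exactly gives dy/dn = (b/A)·y·(A − y); the constant factor 1/A does not move the peak, so the form by(A − y) used in the text is kept here.

```python
import numpy as np

def fit_logistic_core_size(cumulative, A=100.0):
    """Fit log(y / (A - y)) = a + b*n and locate the peak of dy/dn.

    `cumulative` holds the cumulative inertia (%) of the accessions
    after sorting them in descending order of contribution.
    """
    y = np.asarray(cumulative, dtype=float)
    n = np.arange(1, y.size + 1)
    # The logit is undefined at y = 0 and y = A, so drop those points.
    inside = (y > 0) & (y < A)
    logit = np.log(y[inside] / (A - y[inside]))
    # np.polyfit returns the slope first, then the intercept.
    b, a = np.polyfit(n[inside], logit, 1)
    # Fitted progress curve and its rate of increase.
    y_fit = A / (1.0 + np.exp(-(a + b * n)))
    rate = b * y_fit * (A - y_fit)
    core_size = int(n[np.argmax(rate)])
    return a, b, core_size
```

Under this model the rate peaks where the fitted cumulative inertia equals A/2, that is, at n = −a/b rounded to the nearest serial number.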

The quantitative data on all the 690 accessions were subjected to non-hierarchical cluster analysis using the SPSS (PC+) statistical package, and 10 clusters were identified. Analyses of variance for the individual characters were performed to test the differences between the cluster means. In each cluster, the contributions of the accessions to the GSS were computed as explained earlier, and the logistic regression model was used to get the core set from each of these clusters. The accessions thus selected cluster-wise were pooled to form another core set and designated as core set B.

The percentage contribution of the accessions included in core set A was compared to that of samples of the same size drawn at random. This was done by generating 1000 random samples of 185 accessions each (sampling was without replacement in each set). The properties of the sampling distribution were evaluated by computing the minimum, maximum, mean and standard deviation of per cent contribution to the total GSS by the accessions included in these random samples.
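The random-sampling comparison can be sketched as follows. The function name `random_sample_contributions`, the seed and the use of NumPy's `Generator.choice` are assumptions of this example; the input `CR` is taken to be the vector of per-accession relative contributions (%) to the GSS.

```python
import numpy as np

def random_sample_contributions(CR, sample_size, n_samples=1000, seed=0):
    """Summarize the GSS share (%) captured by random samples.

    Each sample draws `sample_size` accessions without replacement
    and sums their relative contributions CR_i.
    """
    rng = np.random.default_rng(seed)
    CR = np.asarray(CR, dtype=float)
    totals = np.array([
        CR[rng.choice(CR.size, size=sample_size, replace=False)].sum()
        for _ in range(n_samples)
    ])
    return {
        "min": float(totals.min()),
        "max": float(totals.max()),
        "mean": float(totals.mean()),
        "sd": float(totals.std(ddof=1)),  # sample standard deviation
    }
```

Because every accession is equally likely to be drawn, the expected share of a random sample is simply its sampling fraction: for 185 of 690 accessions this is 185/690 × 100 ≈ 26.8%.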

To evaluate phenotypic diversity in the original collection in terms of each of the 27 qualitative descriptors, the Shannon-Weaver Diversity Index (SDI) was computed using the formula:

$$\mathrm{SDI} = -\sum_{i=1}^{s} p_i \log_e(p_i),$$

where s is the number of phenotypic classes or descriptor states for a given qualitative descriptor and $p_i$ is the proportion of the total number of accessions in the i-th class (Jain et al., 1975). The index was standardized to keep its value in the range 0 to 1, by dividing the value by $\log_e s$ (Yu Li et al., 1996). SDI was also computed for all 27 qualitative descriptors for the accessions represented in the two core sets for the comparison of their phenotypic diversity with that of the entire collection.
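As a small worked illustration of the standardized index, the sketch below computes the SDI for one qualitative descriptor from a list of recorded states. The function name `standardized_sdi` is hypothetical, and `n_states` is assumed to be the number of descriptor states defined for the descriptor (as listed in Table 1a), which may exceed the number of states actually observed.

```python
import math
from collections import Counter

def standardized_sdi(states, n_states):
    """Shannon-Weaver diversity index divided by log_e(s).

    `states` is the recorded state of one descriptor for each
    accession; `n_states` (s >= 2) is the number of possible states.
    """
    counts = Counter(states)
    total = sum(counts.values())
    # Classes with no accessions contribute nothing to the sum.
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n_states)
```

For a two-state descriptor in which three quarters of the accessions share one state, `standardized_sdi(["a", "a", "a", "b"], 2)` gives about 0.81, while an even split gives exactly 1.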

To identify pairs of accessions which have a high degree of similarity and to cull out probable duplicates from the final core set, a coefficient of similarity $v_{ij}$ (Gower metric) was computed for each pair of accessions i and j by the formula as suggested by Harch et al. (1996):

$$v_{ij} = 1 - \frac{1}{m} \sum_{p=1}^{m} \frac{1}{R_p}\,\lvert x_{ip} - x_{jp} \rvert,$$

where m is the number of descriptors (which includes qualitative, binary and quantitative descriptors considered for the analysis); $R_p$ stands for the range in case of a quantitative descriptor or 1 otherwise; $x_{ip}$ and $x_{jp}$ stand respectively for the values for the p-th descriptor for the i-th and the j-th accessions. In the case of a qualitative or binary descriptor, the absolute difference $\lvert x_{ip} - x_{jp} \rvert$ is taken as 0 if the states are identical or 1 if they are not. The value of the similarity coefficient always lies between 0 and 1 and a value of 1 indicates an exact match or a duplicate. For the present study, Gower metrics were computed for all possible pairs of 185 accessions that formed core set A. Removal of duplicates from the core set was done by eliminating one or more of each pair of accessions that had a similarity coefficient greater than a particular value. This cut off point was decided based on the frequency distribution of the Gower metric computed for all pairs of accessions. From each pair of accessions which were closely related, the accession that had contributed more to the GSS compared to its duplicate was retained in the core set.
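The similarity coefficient can be sketched for mixed data as below. The function name `gower_similarity` and the split of the input into a quantitative array and a qualitative array are choices made for this example. The range $R_p$ is computed here from the accessions supplied, which is an assumption; the text does not state whether the range was taken over the core set or over the whole collection.

```python
import numpy as np

def gower_similarity(quant, qual):
    """Pairwise Gower similarity v_ij for mixed descriptors.

    quant: (n, q) array of quantitative descriptors.
    qual:  (n, c) array of qualitative or binary descriptor states.
    Returns an (n, n) matrix with 1.0 for identical accessions.
    """
    quant = np.asarray(quant, dtype=float)
    qual = np.asarray(qual)
    m = quant.shape[1] + qual.shape[1]
    # R_p: range of each quantitative descriptor in the supplied data.
    ranges = quant.max(axis=0) - quant.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant descriptor: differences are 0
    # Range-scaled absolute differences for quantitative descriptors.
    dq = np.abs(quant[:, None, :] - quant[None, :, :]) / ranges
    # 0 if the states match, 1 otherwise, for qualitative descriptors.
    dc = (qual[:, None, :] != qual[None, :, :]).astype(float)
    return 1.0 - (dq.sum(axis=2) + dc.sum(axis=2)) / m
```

Two accessions that agree on every descriptor score 1, and two that differ maximally on every descriptor score 0, matching the bounds stated above.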

Results

Principal component analysis of the quantitative data indicated that the first 3 components accounted for 70% of the variability. However, for computation of the principal component scores, all the components were considered. By including all components, the original set of correlated variables was completely transformed into a new set of uncorrelated variables. Based on the principal component scores, the contributions of individual accessions to the total GSS were estimated, sorted in descending order and added successively to obtain a progress curve as shown in Figure 1a. The evaluation of linear regression of $\log_e (y/(100 - y))$ on n (y being the cumulative contribution of accessions to the GSS, and n the number of accessions), resulted in a highly significant $R^2$ value of 93.5%. The rate of increase in the cumulative contribution to the GSS was computed for each value of y as dy/dn = by(100 − y), where b (= 0.008637) is the estimated regression coefficient. The GSS accounted for by sampling the accessions based on the principal component scores iteratively is given in Table 2.

Table 2. Percent GSS accounted for by different sampling fractions of the base collection by using the principal component scores based on their ranking

| Sample size (%) | % Variance accounted |
|---|---|
| 10 | 28.0 |
| 20 | 43.0 |
| 30 | 55.1 |
| 40 | 65.6 |
| 50 | 74.7 |
| 60 | 82.3 |
| 70 | 88.7 |
| 80 | 94.0 |
| 90 | 98.2 |
| 100 | 100.0 |

The rate of increase (dy/dn) in the cumulative contribution with respect to the number of accessions reached a maximum when n = 185 and declined thereafter (Figure 1b). Hence this inflection point was taken as the appropriate size of the core sample, and accessions included up to this point formed the core set A. This was roughly 27% of the base collection. This core set of 185 accessions accounted for 51.4% of the total variability.

The non-hierarchical clustering of accessions resulted in 10 clusters with varying number of accessions (Table 3). The analysis of variance of individual characters indicated highly significant differences (P ≤ 0.01) among cluster means. The clustering did not reveal any pattern or grouping of the accessions based on their geographical source. The number of accessions selected from each cluster by using the principal component scores method along with their cumulative contribution to the GSS of each cluster is given in Table 3. By pooling the accessions selected from individual clusters, the core set B was constituted with a total of 213 accessions, which accounted for 47.16% of the total variance. The number of accessions common to both the core sets is also presented in Table 3.

Figure 1. (a). Progress curve of cumulative contribution of accessions to the GSS.

Figure 1. (b). Rate of increase of cumulative contribution to the GSS by the accessions.

Table 3. Number of accessions in different clusters and number selected for the core set from each cluster

| Cluster | No. of accessions | No. selected for the core set B | Accessions common with core set A |
|---|---|---|---|
| I | 68 | 24 (48.7) | 20 |
| II | 122 | 40 (50.0) | 6 |
| III | 84 | 24 (49.0) | 18 |
| IV | 70 | 20 (48.6) | 15 |
| V | 50 | 15 (49.8) | 10 |
| VI | 55 | 18 (49.7) | 10 |
| VII | 46 | 14 (50.2) | 8 |
| VIII | 80 | 24 (49.6) | 16 |
| IX | 55 | 16 (48.1) | 8 |
| X | 60 | 18 (49.3) | 11 |
| Total | 690 | 213 (47.2) | 122 |

Total percentage of GSS explained by: Core set A = 51.43 (185 accessions). Core set B = 47.16 (213 accessions).
Figures in parentheses indicate the percentage of the GSS accounted for by the accessions added to the core set B from each cluster.

Random samples of the same size as that of core set A yielded a sampling distribution with minimum contribution = 23.74%; maximum contribution = 32.63% and mean contribution = 27.14% with a standard deviation = 1.43%, to the total GSS. This was substantially lower than the contribution of 51.4% of core set A and 47.16% of core set B to the total GSS.

Most of the qualitative descriptors that were considered for diversity had fairly high SDI values in the original collection, except a few, namely, ivory or corky marks, weather marks and upper leaf surface (Table 4). Growth cracks, bud cushion, bud hairiness, leaf sheath clasping and leaf sheath waxiness had SDI values of 0.9 or more, indicating high levels of phenotypic diversity for these descriptors. For core set A, the values for the diversity index for all qualitative descriptors are presented in Table 4. It was found that the phenotypic diversity in this core set nearly matched or exceeded that of the entire collection for most of the descriptors. The SDI values presented in Table 4 for core set B indicated that there was good agreement between the two core sets. The SDIs of the two core sets also showed close correspondence with that of the entire collection.

Screening of core set A for accessions that were similar with respect to the descriptors considered for this study yielded 17 020 pairs of similarity coefficients. The frequency table indicated that there were only two pairs with a Gower metric value (similarity coefficient) of 0.9 or greater, and 37 pairs with a Gower metric value of 0.8 or greater. However, nearly 700 pairs comprising a very high proportion of accessions were found to have similarity coefficients in the range of 0.7 to 0.8. Hence, a reasonable value for the cut off point was taken as 0.8 for identification of duplicates in this data set. There were 38 accessions (in non-overlapping and overlapping pairs) yielding 37 pairs with a similarity coefficient of 0.8 or more. Among these, probable duplicates could be identified for elimination from the core set. The results are presented in Table 5. The accessions on the left-hand side of the first 15 rows in this table had paired only once with the corresponding accessions to yield similarity coefficients of 0.8 or more. The remaining accessions (excepting the first 8 on the right-hand side) had paired with other accessions more than once, resulting in high similarity coefficients. Seventeen accessions (underlined in Table 5) that had contributed more to the GSS compared to their duplicates were retained in the core set and the remaining 21 were removed from the core set. The final core set, having 185 − 21 = 164 accessions, contributed 46.7% to the total GSS i.e., a reduction of 4.8% contribution due to the accessions that were removed as probable duplicates.
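The culling step can be sketched as follows. The text does not spell out how accessions that appear in several overlapping pairs were resolved, so the greedy rule used here (visit accessions in descending order of GSS contribution and keep one only if it is not similar to any accession already kept) is an assumption of this example and may not reproduce the 17 retained entries. The names `cull_duplicates` and `n_pairs` are hypothetical.

```python
import numpy as np

def n_pairs(n):
    """Number of distinct pairs among n accessions."""
    return n * (n - 1) // 2

def cull_duplicates(similarity, contribution, cutoff=0.8):
    """Indices retained after removing probable duplicates.

    similarity:   (n, n) matrix of Gower similarity coefficients.
    contribution: length-n contributions of the accessions to the GSS.
    Assumed rule: within any group of similar accessions, the one
    contributing most to the GSS is kept.
    """
    sim = np.asarray(similarity, dtype=float)
    contrib = np.asarray(contribution, dtype=float)
    kept = []
    # Visit accessions from the largest to the smallest contribution.
    for i in np.argsort(-contrib):
        if all(sim[i, j] < cutoff for j in kept):
            kept.append(int(i))
    return sorted(kept)
```

As a check on the pair count quoted above, `n_pairs(185)` evaluates 185 × 184 / 2 = 17 020.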

Discussion

Core collection size and content have long been under debate (Frankel & Brown, 1984). Brown (1989) claimed that 10% of the base collection includes at least 80% of alleles when the base collection is sufficiently large. Noirot et al. (1996), using simulated data, arrived at the conclusion that GSS was always greater when sampling was based on iterative values of the principal component scores in descending order, than when sampling was done randomly. Our suggested core set of 185 accessions (27% of the base collection), the size of which was fixed by the inflection point on the progress curve, accounted for 51.4% variance of the base collection. The core set B of 213 accessions based on cluster analysis contributed marginally less (47%) to the GSS. In contrast, random samples of accessions of the same size had a mean contribution of only 27.14% to the total GSS, with a range of 23.74–32.63%, clearly indicating that random sampling was inadequate for establishing a core set.

Table 4. Shannon-Weaver diversity index for various descriptors in the base collection and the core sets

| S. No | Descriptor | Base collection (1) | Core set A (2) | Core set B (3) |
|---|---|---|---|---|
| 1 | Ivory or corky marks | 0.35 | 0.30 | 0.37 |
| 2 | Weather marks/corky patch | 0.26 | 0.33 | 0.29 |
| 3 | Internode shape | 0.79 | 0.81 | 0.83 |
| 4 | Internode alignment | 0.70 | 0.79 | 0.74 |
| 5 | Internode cross section | 0.63 | 0.61 | 0.63 |
| 6 | Internode waxiness | 0.78 | 0.82 | 0.82 |
| 7 | Splits/growth cracks | 0.97 | 0.98 | 0.98 |
| 8 | Stripes on cane | 0.62 | 0.63 | 0.64 |
| 9 | Node swelling | 0.83 | 0.88 | 0.86 |
| 10 | Root zone swelling | 0.76 | 0.81 | 0.76 |
| 11 | Bud shape | 0.68 | 0.65 | 0.68 |
| 12 | Bud size | 0.84 | 0.82 | 0.85 |
| 13 | Bud cushion | 0.99 | 0.99 | 0.99 |
| 14 | Bud germpore | 0.46 | 0.61 | 0.56 |
| 15 | Bud groove | 0.85 | 0.93 | 0.86 |
| 16 | Bud hairiness | 0.97 | 0.94 | 0.98 |
| 17 | Bud extension | 0.83 | 0.86 | 0.88 |
| 18 | Growth ring swelling | 0.69 | 0.65 | 0.65 |
| 19 | Leaf upper surface | 0.27 | 0.37 | 0.29 |
| 20 | Leaf carriage | 0.78 | 0.68 | 0.74 |
| 21 | Leaf sheath waxiness | 0.97 | 0.98 | 0.97 |
| 22 | Leaf sheath prickles | 0.87 | 0.89 | 0.88 |
| 23 | Leaf sheath clasping | 0.90 | 0.82 | 0.84 |
| 24 | Ligule shape | 0.88 | 0.85 | 0.86 |
| 25 | Ligule hairiness | 0.81 | 0.82 | 0.83 |
| 26 | Ligule process symmetry | 0.71 | 0.75 | 0.74 |
| 27 | Ligular process shape | 0.75 | 0.75 | 0.74 |
| | Mean | 0.74 | 0.75 | 0.75 |

Note: SDI are standardized values: (1) for 690 accessions, (2) for 185 accessions, (3) for 213 accessions.

The approach of Noirot et al. (1996) in establishing the size of a core set involved one of two stopping rules, namely, stop sampling when a desired sample size (say, 10–16% of the base collection) was reached, or stop sampling when a desired percentage of the GSS (say, around 30%) was reached. Mahajan et al. (1996) extended this approach by computing the SDI at each stage after the addition of an accession to the core set and comparing it with that of the entire collection. They suggested that the appropriate sample size could be decided when the SDI for a set of qualitative descriptors for the sub-set is nearly equal to that of the entire collection. This approach, while it makes the best use of the diversity in terms of both quantitative and qualitative characters, involves iterative computations of SDIs after addition of every new entry to the core set. While the main advantage of sampling for principal component scores is to improve the percentage of sampled diversity without modifying the relative intensity of selection, say to around 10% as suggested by Brown (1989), there are chances that many important accessions may still be omitted from the core set while the rate of contribution to the GSS is still increasing, as shown in Figure 1a and Figure 1b.

The method we propose here makes use of the functional relationship between the cumulative contribution to GSS and the number of accessions. The appropriate size of the core set is therefore set at the peak of the dy/dn curve, which is the inflection point of the progress curve. By this, we ensure that sampling is continued until the rate of increase in the cumulative contribution to the GSS reaches its maximum. As such, there is no arbitrariness involved in arriving at the sample size.

Table 5. Pairs of entries with Gower similarity coefficient greater than or equal to 0.8

| Entry | Paired with |
|---|---|
| Pounda | Guam A |
| Hawaii Original 38 | Hawaii Original 26 |
| 28 NG 288 | 28 NG 288 Sport |
| 51 NG 36 | 57 NG 62 |
| 51 NG 89 | 51 NG 92 |
| IJ 76-567 | IS 76-214 |
| NG 77-232 | NG 77-101 |
| NG 77-142 | NG 77-154 |
| IJ 76-313 | Ceram Red |
| Bangadya | China |
| Iscambine | China |
| 28 NG 68 | 28 NG 14 |
| 51 NG 21 | 51 NG 126 |
| 51 NG 111 | 51 NG 114 |
| 51 NG 134 | 28 NG 15 |
| IJ 76-372 | IJ 76-322, IJ 76-317 |
| 51 NG 114 | 51 NG 111, IJ 76-317 |
| Ceram Red | IJ 76-313, IJ 76-322 |
| 57 NG 159 Stripe | 28 NG 15, 51 NG 43 |
| IJ 76-322 | Ceram Red, IJ 76-317, IJ 76-372 |
| 28 NG 14 | 28 NG 15, 28 NG 68, NG 77-171 |
| 28 NG 274 | 51 NG 43, 51 NG 103, 51 NG 115 G |
| 51 NG 21 | 51 NG 43, 51 NG 103, 51 NG 126 |
| 51 NG 115 G | NG 77-171, 28 NG 274, 51 NG 103 |
| 28 NG 15 | 28 NG 14, 51 NG 134, 57 NG 159 Stripe |
| IJ 76-317 | IJ 76-322, NG 77-171, 51 NG 114, IJ 76-372 |
| 51 NG 43 | NG 77-171, 28 NG 274, 51 NG 21, 57 NG 159 Stripe, 51 NG 103 |
| 51 NG 103 | NG 77-171, 28 NG 274, 51 NG 21, 51 NG 43, 51 NG 126 |
| NG 77-171 | 28 NG 14, 51 NG 43, 51 NG 103, 51 NG 115 G, IJ 76-317 |

Note: 17 entries underlined in the table are retained based on their higher contribution to the total GSS as compared to their duplicates; the remaining 21 entries, having lesser contribution to GSS, were dropped from the core set.

When the accessions in the original collection were sampled based on the Principal Component Score method to form core set A, the sample size was about 27%, which is much higher than the sample size of 10–16% suggested by earlier authors. The core set B, developed by a clustering method, resulted in a slightly larger sampling fraction (31%) and a marginally lower contribution to the total GSS. This is at variance with the results of Mahajan et al. (1996), where core sets formed by the clustering method contributed more to the GSS. This may be due to different patterns of variation in the okra data set taken up in their study. The selection of accessions directly from the entire collection using the functional relationship between the cumulative contribution to the GSS and the number of accessions, as suggested in the present study, eliminates iterative selection and verification of the diversity in terms of SDI. It also ensures that the size of the core set depends on the diversity of the entire collection. Again, exceptional types can always be included in the core set, based on the personal judgement of germplasm managers, if they are not included in the core collection by the above methodology. Probable duplicates and closely related types could be excluded from the final core set using an appropriate threshold score of the Gower metric.

It is to be noted that if only a small number of new accessions are added to the base collection, suitable candidates from among them could be included in the core set also, if they are of exceptional type. However, if a fairly large number of them are added to the base collection over a period of time, it would require a fresh analysis of data including all accessions so that the core collection could be reconstituted as per the methodology suggested here. A well maintained database with appropriate computer programs necessary for various computations involved in the methodology would greatly simplify the steps involved in developing a core set.

Conclusions

The present investigation is the first attempt at establishing a core set from the world collection of sugarcane germplasm maintained in India. The proposed method makes use of both quantitative and morphological data for evaluating the diversity in the base collection based on principal component scores and the Shannon-Weaver Diversity Index. The functional relationship between the cumulative contribution of accessions to the GSS and the number of accessions, which was essentially a logistic regression model, was used to directly arrive at an appropriate size for the core set. This method is more objective than earlier methods and avoids iterative selection of accessions.

Acknowledgement

The authors are thankful to the Director of the Sugarcane Breeding Institute, Coimbatore (India) for facilities.

References

Brown, A.H.D., 1989. Core collections: a practical approach to genetic resources management. Genome 31: 818–824.

Frankel, O.H., 1984. Genetic perspectives of germplasm conservation. In: Arber, W.K., K. Llimensee, W.J. Peacock & P. Starlinger (Eds.), Genetic Manipulation: Impact on Man and Society, pp. 161–170. Cambridge University Press, Cambridge.

Frankel, O.H. & A.H.D. Brown, 1984. Plant genetic resources today: a critical appraisal. In: Holden, J.H.W. & J.T. Williams (Eds.), Crop Genetic Resources: Conservation and Evaluation, pp. 249–257. George Allen and Unwin, London.

Hamon, S. & M. Noirot, 1990. Some proposed procedures for obtaining a core collection using quantitative plant characterization data. Paper presented at International Workshop on Okra held at NBPGR, New Delhi, India, October. pp. 8–12.

Harch, B.D., K.E. Basford, I.H. DeLacy, P.K. Lawrence & A. Cruickshank, 1996. Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data. Genet. Resour. Crop Evol. 43: 363–376.

Hintum, T.J.L. van, 1995. Hierarchical approaches to the analysis of diversity of crop plants. In: Hodgkin, T., A.H.D. Brown, J.L.H. van Hintum & E.A.V. Morales (Eds.), Core Collections of Plant Genetic Resources, pp. 23–34. John Wiley & Sons, Chichester, U.K.

Jain, S.K., C.O. Qualset, G.M. Bhat & K.K. Wu, 1975. Geographical patterns of phenotypic diversity in a world collection of durum wheat. Crop Sci. 15: 700–704.

Mahajan, R.K., I.S. Bisht, R.C. Agrawal & R.S. Rana, 1996. Studies on South Asian okra collection: Methodology for establishing a representative core set using characterization data. Genet. Resour. Crop Evol. 43: 249–255.

Meade, G.P. & J.C.P. Chen, 1977. Cane Sugar Handbook. 10th ed. John Wiley & Sons Inc., New York, NY.

Noirot, M., S. Hamon & F. Anthony, 1996. The principal component scoring: a new method of constituting a core collection using quantitative data. Genet. Resour. Crop Evol. 43: 1–6.

Roach, B.T., 1995. Case for a core collection of sugarcane germplasm. In: Proceedings of XXI Congress of International Society of Sugarcane Technologists held at Bangkok, Thailand (5–14 March, 1992). pp. 339–352.

Spagnoletti Zeuli, P.L. & C.O. Qualset, 1993. Evaluation of five strategies for obtaining a core subset from a large genetic resource collection of durum wheat. Theor. Appl. Genet. 87: 295–304.

Sreenivasan, T.V. & N.V. Nair, 1991. Catalogue of Sugarcane Genetic Resources – III. S. officinarum. Sugarcane Breeding Institute, Coimbatore, India.

Van der Plank, J.E., 1963. Plant Disease: Epidemics and Control. Academic Press, New York, NY.

Yu Li, W.K. Shuzhi, Yongsheng Cao & Xianzhen Zhang, 1996. A phenotypic diversity analysis of foxtail millet (Setaria italica L. P. Beauv.) landraces of Chinese origin. Genet. Resour. Crop Evol. 43: 377–384.