planning rice breeding programs for impact multi-environment trials: design and analysis

Planning rice breeding programs for impact

Multi-environment trials:

design and analysis

IRRI: Planning breeding Programs for Impact

Introduction: Problem of individual trials?

The predictive power of an individual trial is low,

SO

multi-environment trials (METs) are used to predict performance in farmers' fields


Introduction: Problem of METs?

METs are very expensive and require much coordination and time,

SO

they must be planned carefully to ensure they are predictive and efficient


Learning objectives

• To clarify the purpose of variety trials

• To introduce linear models for multi-environment trials (METs)

• To describe the structure of the analysis of variance for METs

• To model the variance of a cultivar mean estimated from a MET

• To examine the effect of replication within and across sites and years on measures of precision


Purpose of METs

To predict performance:

• Off-station

• In the future

[Figure: WS 2002, WS 2003, +]

[Figure: yield (t/ha) on an axis from 0 to 6; panels: single trial, mean of 3 trials]

METs reduce SEM for cultivars


The genotype x environment model

The simplest MET model treats trials as “environments”:

$Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$   [7.1]

Where:

• $M$ = mean of all plots

• $E_i$ = effect of trial i

• $R(E)_{j(i)}$ = effect of rep j in trial i

• $G_k$ = effect of genotype k

• $GE_{ik}$ = interaction of genotype k and trial i

• $e_{ijkl}$ = plot residual



Trials and reps are random factors

They sample the target population of environments (TPE)

We do not select varieties for specific trials or reps

Genotypes are fixed factors

We are interested in the performance of the specific lines in the trial



The GE interaction is a random factor

Interactions of fixed and random factors are always random

Random interactions with genotypes are part of the error variance for genotype means


Relationship between GE model and single-trial model:

Single trial: $Y_{ijk} = \mu + R_j + G_i + e_{k(j)}$

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$
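One way to read this relationship, consistent with the later point that G and GE are confounded in a single trial: fixing the trial index i in the GE model and collecting terms gives the single-trial model. The grouping below is a sketch of that correspondence; the primed symbols are notation introduced here for illustration and do not appear in the slides.

$$\mu' = M + E_i, \qquad R'_j = R(E)_{j(i)}, \qquad G'_k = G_k + GE_{ik}$$

The genotype effect estimated from one trial therefore contains the GE interaction for that trial, which cannot be separated from $G_k$ without additional trials.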


ANOVA for GE model

Source                   Mean square   EMS
Environments (E)
Replicates within E
Genotypes (G)            MSG           $\sigma^2_e + r\sigma^2_{GE} + re\sigma^2_G$
G x E                    MSGE          $\sigma^2_e + r\sigma^2_{GE}$
Error (plot residuals)   MSe           $\sigma^2_e$


Variance of a cultivar mean

$\sigma^2_Y = \sigma^2_{GE}/e + \sigma^2_e/(re)$   [7.2]

Where:

• e = number of trials

• r = number of reps per trial


Estimating $\sigma^2_G$, $\sigma^2_{GE}$ and $\sigma^2_e$

$\sigma^2_e = MS_{error}$

$\sigma^2_{GE} = (MS_{GE} - MS_{error})/r$

$\sigma^2_G = (MS_G - MS_{GE})/(re)$
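The three estimators above can be sketched as a small Python function. The function name and the mean squares in the usage example are hypothetical, chosen only for illustration, and truncating negative estimates at zero is a common convention assumed here rather than something stated in the slides.

```python
def ge_variance_components(ms_g, ms_ge, ms_error, r, e):
    """Method-of-moments estimates from the GE-model ANOVA mean squares.

    r = reps per trial, e = number of trials.
    """
    var_e = ms_error
    var_ge = (ms_ge - ms_error) / r
    var_g = (ms_g - ms_ge) / (r * e)
    # Assumption: negative estimates are reported as zero.
    return {"G": max(var_g, 0.0), "GE": max(var_ge, 0.0), "e": var_e}


# Hypothetical mean squares (t/ha)^2 for 4 reps in each of 5 trials.
components = ge_variance_components(
    ms_g=5.65, ms_ge=1.65, ms_error=0.45, r=4, e=5
)
for name, value in components.items():
    print(f"var_{name} = {value:.2f}")
```

With these illustrative inputs the GE and residual components come out at the hypothetical values used in the LSD example (0.30 and 0.45).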


Example: modeling the LSD for a MET program using GE model

Hypothetical values:

$\sigma^2_e = 0.45$ (t/ha)$^2$

$\sigma^2_{GE} = 0.30$ (t/ha)$^2$

$\sigma^2_Y = \sigma^2_{GE}/e + \sigma^2_e/(re)$   [7.2]

Table 1. The effect of trial and replicate number on the standard deviation of a cultivar mean: genotype x environment model

Number of sites   Number of reps/site   SEM (t/ha)   LSD (t/ha)
1                 1                     0.87         2.61
1                 2                     0.72         2.16
1                 4                     0.64         1.92
2                 1                     0.61         1.83
2                 2                     0.51         1.53
2                 4                     0.45         1.35
5                 1                     0.39         1.17
5                 2                     0.32         0.96
5                 4                     0.29         0.87
10                1                     0.27         0.81
10                2                     0.23         0.69
10                4                     0.20         0.60
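The SEM column of Table 1 can be recomputed from equation 7.2 with the hypothetical variance components. The sketch below assumes LSD = 3 x SEM, a multiplier inferred from the tabulated values (for example SEM 0.87 and LSD 2.61) and not stated in the slides; the function and variable names are illustrative.

```python
from math import sqrt

VAR_GE = 0.30  # hypothetical GE variance, (t/ha)^2
VAR_E = 0.45   # hypothetical plot residual variance, (t/ha)^2


def sem(e, r, var_ge=VAR_GE, var_e=VAR_E):
    """Standard error of a cultivar mean over e trials with r reps each."""
    return sqrt(var_ge / e + var_e / (r * e))


rows = []
for e in (1, 2, 5, 10):
    for r in (1, 2, 4):
        s = round(sem(e, r), 2)
        # Assumption: LSD approximated as 3 x the rounded SEM.
        rows.append((e, r, s, round(3 * s, 2)))

print("sites  reps   SEM    LSD")
for e, r, s, lsd in rows:
    print(f"{e:>5} {r:>5} {s:>6.2f} {lsd:>6.2f}")
```

Going from 1 to 4 reps at a single site lowers the SEM far less than going from 1 to 5 sites with one rep each, because only the residual term is divided by r.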


The “real” SEM (with the GE component estimated separately) for a single trial with 4 reps is:

$SEM = (\sigma^2_{GE}/e + \sigma^2_e/(re))^{0.5}$

$= (0.30/1 + 0.45/4)^{0.5}$

$= 0.64$ t/ha

The “apparent” SEM (with the GE and G components confounded) for the same trial is:

$SEM = (\sigma^2_e/r)^{0.5}$

$= (0.45/4)^{0.5}$

$= 0.34$ t/ha
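The gap between the two SEM values can be checked with a small Monte Carlo simulation. This is a simplified sketch, not the analysis used in the slides: it follows one cultivar across many independent 4-rep trials, draws the GE effect and plot residuals from normal distributions with the hypothetical variances, and ignores rep effects and other genotypes. All names are illustrative.

```python
import random
from math import sqrt
from statistics import mean, pstdev, variance

VAR_GE, VAR_E, REPS = 0.30, 0.45, 4


def simulate_single_trials(n_trials=20000, seed=1):
    rng = random.Random(seed)
    deviations = []   # trial mean minus the cultivar's true value (set to 0)
    within_vars = []  # residual variance as seen inside each trial
    for _ in range(n_trials):
        ge = rng.gauss(0.0, sqrt(VAR_GE))  # one GE effect per trial
        plots = [ge + rng.gauss(0.0, sqrt(VAR_E)) for _ in range(REPS)]
        deviations.append(mean(plots))
        within_vars.append(variance(plots))  # GE cancels within a trial
    real_sem = pstdev(deviations)
    apparent_sem = sqrt(mean(within_vars) / REPS)
    return real_sem, apparent_sem


real_sem, apparent_sem = simulate_single_trials()
print(f"real SEM     ~ {real_sem:.2f} t/ha")
print(f"apparent SEM ~ {apparent_sem:.2f} t/ha")
```

Because the GE effect is constant within a trial, it never shows up in the within-trial variance, so the apparent SEM stays close to $(\sigma^2_e/r)^{0.5}$ while the spread of trial means around the true value reflects both components.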


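The genotype x site x year model

A more realistic MET model subdivides the “environment” factor into “years” and “sites”:

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$

GSY model: $Y_{ijklm} = M + Y_i + S_j + YS_{ij} + R(YS)_{k(ij)} + G_l + GY_{il} + GS_{jl} + GYS_{ijl} + e_{ijklm}$

Variance of a cultivar mean, with y years, s sites, and r reps per trial:

$\sigma^2_Y = \sigma^2_{GY}/y + \sigma^2_{GS}/s + \sigma^2_{GYS}/(ys) + \sigma^2_e/(rys)$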

ANOVA for GSY model

Source                    Mean square   EMS
Years (Y)
Sites (S)
Y x S
Replicates within Y x S
Genotypes (G)             MSG           $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY} + ry\sigma^2_{GS} + rys\sigma^2_G$
G x S                     MSGS          $\sigma^2_e + r\sigma^2_{GYS} + ry\sigma^2_{GS}$
G x Y                     MSGY          $\sigma^2_e + r\sigma^2_{GYS} + rs\sigma^2_{GY}$
G x Y x S                 MSGYS         $\sigma^2_e + r\sigma^2_{GYS}$
Plot residuals            MSe           $\sigma^2_e$


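Estimating $\sigma^2_G$, $\sigma^2_{GY}$, $\sigma^2_{GS}$, $\sigma^2_{GYS}$, and $\sigma^2_e$

$\sigma^2_e = MS_{error}$

$\sigma^2_{GYS} = (MS_{GYS} - MS_{error})/r$

$\sigma^2_{GY} = (MS_{GY} - MS_{GYS})/(rs)$

$\sigma^2_{GS} = (MS_{GS} - MS_{GYS})/(ry)$

$\sigma^2_G = (MS_G - MS_{GS} - MS_{GY} + MS_{GYS})/(rsy)$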
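The estimators follow from solving the expected mean squares of the GSY ANOVA for each component. The sketch below checks this with a round trip: it builds mean squares from known components using the EMS expressions, then recovers the components. The value $\sigma^2_G = 0.10$ and the design (r = 2, s = 3, y = 2) are hypothetical; the other components are the NE Thailand values quoted in the example. Function names are illustrative.

```python
def expected_mean_squares(var, r, s, y):
    """Expected mean squares of the GSY ANOVA for given variance components."""
    base = var["e"] + r * var["GYS"]
    return {
        "error": var["e"],
        "GYS": base,
        "GY": base + r * s * var["GY"],
        "GS": base + r * y * var["GS"],
        "G": base + r * s * var["GY"] + r * y * var["GS"] + r * y * s * var["G"],
    }


def gsy_variance_components(ms, r, s, y):
    """Solve the EMS expressions for the variance components."""
    return {
        "e": ms["error"],
        "GYS": (ms["GYS"] - ms["error"]) / r,
        "GY": (ms["GY"] - ms["GYS"]) / (r * s),
        "GS": (ms["GS"] - ms["GYS"]) / (r * y),
        "G": (ms["G"] - ms["GS"] - ms["GY"] + ms["GYS"]) / (r * s * y),
    }


# var_G is hypothetical; the rest are the NE Thailand OYT components.
true_var = {"G": 0.100, "GS": 0.003, "GY": 0.049, "GYS": 0.259, "e": 0.440}
ms = expected_mean_squares(true_var, r=2, s=3, y=2)
recovered = gsy_variance_components(ms, r=2, s=3, y=2)
for name in true_var:
    print(f"var_{name}: true {true_var[name]:.3f}, recovered {recovered[name]:.3f}")
```

With real data the mean squares are estimates, so recovered components can be negative; how to handle that is left out of this sketch.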


Example: Modeling the LSD for a MET program using the GSY model

For NE Thailand OYT:

$\sigma^2_e = 0.440$ (t/ha)$^2$

$\sigma^2_{GS} = 0.003$ (t/ha)$^2$

$\sigma^2_{GY} = 0.049$ (t/ha)$^2$

$\sigma^2_{GYS} = 0.259$ (t/ha)$^2$

(Cooper et al., 1999)


Number of sites   Number of years   Number of replicates/site   LSD (t/ha)
1                 1                 1                           2.45
1                 1                 2                           2.06
1                 1                 4                           1.85
1                 2                 1                           1.79
1                 2                 2                           1.52
1                 2                 4                           1.37
5                 1                 1                           1.10
5                 1                 2                           0.93
5                 1                 4                           0.83
5                 2                 1                           0.81
5                 2                 2                           0.69
5                 2                 4                           0.62
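The LSD for a given allocation of sites, years, and reps can be sketched from the GSY variance of a cultivar mean. The code below assumes the LSD for the difference between two cultivar means is t x sqrt(2) x SEM with t = 2; that choice is an assumption made here, and it reproduces the tabulated 2.45 and 2.06 t/ha for one site and one year with 1 and 2 reps. Function names are illustrative.

```python
from math import sqrt

# NE Thailand OYT variance components, (t/ha)^2 (Cooper et al., 1999)
COMPONENTS = {"GS": 0.003, "GY": 0.049, "GYS": 0.259, "e": 0.440}


def cultivar_mean_variance(s, y, r, c=COMPONENTS):
    """Variance of a cultivar mean over s sites, y years, r reps per trial."""
    return (
        c["GY"] / y
        + c["GS"] / s
        + c["GYS"] / (y * s)
        + c["e"] / (r * y * s)
    )


def lsd(s, y, r, t=2.0):
    """Assumed form: t * sqrt(2 * variance of a cultivar mean), with t = 2."""
    return t * sqrt(2 * cultivar_mean_variance(s, y, r))


# Effect of replication within a single site and year.
for reps in (1, 2, 4):
    print(f"sites=1 years=1 reps={reps}: LSD = {lsd(1, 1, reps):.2f} t/ha")
```

Other allocations can be explored by calling `lsd(s, y, r)` with different arguments.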


Conclusions from error modeling exercise?

• $\sigma^2_{GS}$ was very small in this case: little evidence of specific adaptation to sites

• $\sigma^2_{GYS}$ was very large in this case: much random variation in cultivar performance from site to site and year to year

• $\sigma^2_e$ was very large: methods to reduce plot error are needed

• $\sigma^2_{GYS}$ was very large compared to $\sigma^2_{GY}$ and $\sigma^2_{GS}$: sites and years are equivalent for testing


Deciding whether to divide a TPE

• If the TPE is large and diverse, it may be worthwhile to divide it into sets of more homogeneous sites

• If there is no pre-existing hypothesis about how to group environments, use cluster, AMMI, or pattern analysis

• If a hypothesis can be formed based on geography, soil type, management system, etc., group trials according to this fixed factor


The genotype x subregion model

Environments can be grouped into subregions:

GE model: $Y_{ijkl} = M + E_i + R(E)_{j(i)} + G_k + GE_{ik} + e_{ijkl}$

Subregion model: $Y_{ijklm} = M + S_i + E(S)_{j(i)} + R(E(S))_{k(ij)} + G_l + GS_{il} + GE(S)_{lij} + e_{ijklm}$

• Subregions are fixed

• Trials within subregions are random

• If the GS interaction term is not significant, subdivision is unnecessary, and could be harmful


Expected mean squares for ANOVA of the genotype x subregion model for testing fixed groupings of sites

Source                               Mean square   EMS
Subregions (S)
Locations within subregions (L(S))
Replicates within L(S)
Genotypes (G)                        MSG           $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS} + rls\sigma^2_G$
G x S                                MSGS          $\sigma^2_e + r\sigma^2_{GL(S)} + rl\sigma^2_{GS}$
G x L(S)                             MSGL(S)       $\sigma^2_e + r\sigma^2_{GL(S)}$
Plot residuals                       MSe           $\sigma^2_e$


Example: Are central and southern Laos separate breeding targets?

Should breeders and agronomists in Laos consider the central and southern regions as separate TPEs for RL rice?

22 traditional varieties (TVs) were tested in 4-rep trials at 3 sites in the central region and 3 in the south in WS 2004

ANOVA testing the hypothesis that the central and southern regions of Laos are separate RL breeding targets:

Source                               df    MS         F
Subregions (S)                       1     5459785
Locations within subregions (L(S))   4     17284169
Replicates within L(S)               18    292059
Genotypes (G)                        21    3644949    4.77**
G x S                                21    764412     0.76
G x L(S)                             84    1006974    6.58**
Plot residuals                       378   153101


Are central and southern Laos separate breeding targets?

Genotype x subregion interaction is not significant when tested against variation among locations within subregions

Subdivision is therefore not needed

Subdivision might even be harmful, because it would reduce replication within each subregion
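The F ratios in the Laos ANOVA can be reproduced from the tabulated mean squares, which also shows which error term each effect is tested against. The sketch below additionally derives variance components from the EMS expressions of the genotype x subregion model; truncating a negative estimate at zero is a convention assumed here. Variable names are illustrative.

```python
# Mean squares from the Laos ANOVA (22 varieties, 4 reps, 3 sites per subregion)
ms = {"G": 3644949, "GS": 764412, "GL(S)": 1006974, "error": 153101}
r, l = 4, 3  # reps per trial, locations per subregion

# Each effect is tested against the next random term in its EMS.
f_gs = ms["GS"] / ms["GL(S)"]      # G x S against G x L(S)
f_gls = ms["GL(S)"] / ms["error"]  # G x L(S) against plot residuals
f_g = ms["G"] / ms["GS"]           # the tabulated genotype F equals MSG / MSGS

# Method-of-moments variance components from the EMS expressions.
var_gls = (ms["GL(S)"] - ms["error"]) / r
var_gs_raw = (ms["GS"] - ms["GL(S)"]) / (r * l)
var_gs = max(var_gs_raw, 0.0)  # assumption: negative estimate reported as zero

print(f"F (G x S)    = {f_gs:.2f}")
print(f"F (G x L(S)) = {f_gls:.2f}")
print(f"F (G)        = {f_g:.2f}")
print(f"var GL(S) = {var_gls:.2f}, var GS = {var_gs:.2f} (raw {var_gs_raw:.2f})")
```

An F ratio below 1 for G x S corresponds to a negative raw estimate of the G x S variance, which is another way of seeing that these data give no evidence of genotype x subregion interaction.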


Can anyone briefly clarify the purpose of variety trials?

When should you divide a TPE?


Summary 1

• The purpose of a variety trial is to predict future performance in the TPE

• Random GE interaction (GEI) is large, and reduces the precision with which cultivar means can be estimated

• Variance component estimates for the GSY model can be used to study resource allocation in testing programs

• Within a homogeneous TPE, the GSY variance is usually the largest. If so, strategies that emphasize testing over several sites or over several years are likely to be equally successful


Summary 2

• Little benefit from including more than 3 replicates (and often more than 2) in a MET

• Standard errors and LSDs estimated from single sites are unrealistically low because they do not take into account random GEI

• Fixed-subregion models allow a hypothesis about the existence of genotype x subregion interaction to be tested against the genotype x trial-within-subregion interaction
