coat color genomic selection polledness dna -...


Page 1: Coat color Genomic Selection Polledness DNA - unimi.itamaltea.vete.unimi.it/docenti/bagnato/Disp_Vet_Integrazione_H08/... · 6 Cosa è un QTL • La teoria genetica per le popolazioni

1

Genomic Selection

Alessandro Bagnato

Markers

• Phenotypic markers
  – Coat color
  – Blood type
  – Polledness
• Genetic markers / molecular markers
  – DNA
• Can be visualised thanks to technology
• Closely tied to the available technology

Marker and selection

• Microsatellite markers
  – Marker Assisted Selection (MAS)
• New technologies:
  – Sequencers
  – Genotyping techniques
• Cow genome sequenced
  – SNP markers
  – Genomic selection
• Cattle (dairy)
  – Outbred populations
• Pig, poultry
  – Inbred populations

Progeny testing and MAS

• 1967: Use of markers for genetic improvement (MAS) suggested
• 1980: Human Genome Project
• 1985: Thermal cycler developed
• 1985: QTL identified in dairy cattle
• 1990: Granddaughter design proposed
• 1990s: MAS applied, Bell locus
• 2009: Genomic selection in use

Molecular markers

• Molecular markers
  – Causal mutations
    • Can be studied directly through the effect their mutations have on the phenotypic expression of the trait
    – Example: the caseins in cattle (αs1, αs2, β and κ)
  – "Anonymous", of unknown function
    • Non-causal genomic mutations with Mendelian behaviour
    – Microsatellites, SNP, CNV

Genetic markers

– Microsatellites (Variable Number of Tandem Repeats, VNTR)
  • Short repeats of nucleotides along the DNA
  • Repeats of AC, AT, CAC and GATA are frequent
  • ACACACACACACACAC
  • ATATATATATATAT
  • CACCACCACCACCAC
  • Thousands of microsatellites in the genome
  • Mendelian loci in every respect


Genetic markers

DNA microsatellites

---> ------CACACACACACACACACACACACACACA------ ------GTGTGTGTGTGTGTGTGTGTGTGTGTGT------ <---

[CA]15

---> ------CACACACACACACACACACACACACACACACA------ ------GTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGT------ <---

[CA]17

R. Achmann

Electropherograms

[Figure: electropherogram traces for a homozygous and a heterozygous individual, with the size standard and the primer peak marked.]

Dinucleotide microsatellite (RM 372 on BTA8)

  1 gatcagcctc ggcaattatt atcatctttc
 31 attaatgtca caatttagtt tcatgggttc
 61 aacccaacat ccacttgttt aaacacacac
 91 acacacacac acacacaggt cactcctcag
121 tttcttcttc tgtatttctt ttcttatttt
151 ccaagtcctg ggcttggaaa tctaagtgta
181 ccttaaggat c

Black, unique sequence (flanks); Green, primers; Red, microsatellite tract (here, n=12 repeats).
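The repeat count quoted in the caption can be checked directly on the printed sequence. The sketch below is only an illustration: the helper name `longest_repeat_tract` is hypothetical, and the sequence is the one shown on the slide with its position numbers removed.

```python
import re

# Sequence of the RM 372 locus as printed on the slide (position numbers removed).
SEQ = (
    "gatcagcctcggcaattattatcatctttc"
    "attaatgtcacaatttagtttcatgggttc"
    "aacccaacatccacttgtttaaacacacac"
    "acacacacacacacacaggtcactcctcag"
    "tttcttcttctgtatttcttttcttatttt"
    "ccaagtcctgggcttggaaatctaagtgta"
    "ccttaaggatc"
)


def longest_repeat_tract(seq, motif):
    """Return (0-based start, number of repeats) of the longest uninterrupted run of motif."""
    best_start, best_n = -1, 0
    for match in re.finditer(f"(?:{re.escape(motif)})+", seq):
        n = len(match.group(0)) // len(motif)
        if n > best_n:
            best_start, best_n = match.start(), n
    return best_start, best_n


start, n_repeats = longest_repeat_tract(SEQ, "ac")
print(f"longest (AC)n tract starts at position {start + 1}, n = {n_repeats}")
```

Two alleles of the same microsatellite differ only in this repeat number, which is why they separate by fragment length on an electropherogram.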

Marker maps

• Marker map BTA6
• Linkage maps
  – Based on recombination rate
• Physical maps
  – Based on DNA sequence

The causative DNA change may be

• A small change in DNA sequence
  – A Single Nucleotide Polymorphism (SNP)
      GGATGTGGTCG
      GGATGTGCTCG
  – An insertion or deletion (indel)
      GGATGT-GGTCG
      GGATGTAGGTCG
• Larger scale structural changes


Structural variation

Copy number polymorphisms/variation (CNP/CNV)

Copy Number Variants (CNV)

• DNA segment 1 kb or larger, present at variable copy number in comparison with a reference genome
• Range from kb to Mb
• Deletions, insertions, duplications and complex multi-site variants
• Common CNV (minor allele frequency >5%) termed copy number polymorphism (CNP)
• Functional significance yet to be fully ascertained
• Evidence for specific effects on gene expression/dosage, diseases and complex traits

Distribution of 1447 human CNVs

From 150 apparently healthy individuals in the human HapMap project

(Redon et al., 2006)

Italian Brown Swiss - consensus CNVR map on UMD3.1; y-axis: Bos taurus autosomes; complex CNVRs contain gains and losses

Dolezal et al. 2012, Edinburgh ICQG

Step toward GS

• Understand linkage between loci
  – Mendel's laws
  – Recombination
  – Recombination rate / mapping functions
• What a QTL is
  – Identification of QTLs
• Experimental designs to map a QTL
  – Population structure dependent
    • Cattle = outbred populations
    • Poultry / swine = inbred populations

Mendel

Source: internet


1 - Law of dominance

F0: BB × bb
Gametes: B, b
F1: Bb, Bb

2 - Law of segregation

F1: Bb × Bb
Gametes: B, b
F2: BB, Bb, Bb, bb

Law of independent assortment

(G = yellow, v = green, L = smooth, r = wrinkled)

F0: GG-LL (yellow, smooth) × vv-rr (green, wrinkled)
Gametes: G-L, v-r
F1: Gv-Lr (yellow, smooth)

Law of independent assortment

F1: Gv-Lr (yellow, smooth)
F2:

         G-L     G-r     v-L     v-r
G-L      GGLL    GGLr    GvLL    GvLr
G-r      GGrL    GGrr    GvrL    Gvrr
v-L      vGLL    vGLr    vvLL    vvLr
v-r      vGrL    vGrr    vvrL    vvrr

Phenotypic ratios: 9/16 : 3/16 : 3/16 : 1/16
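The 9/16, 3/16, 3/16, 1/16 proportions follow from enumerating the sixteen gamete combinations of the F1 cross. A minimal sketch, keeping the slide's allele symbols (G dominant over v, L dominant over r); the function names are illustrative only:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """All gametes of a two-locus genotype given as ((a1, a2), (b1, b2))."""
    colour_alleles, shape_alleles = genotype
    return [c + s for c, s in product(colour_alleles, shape_alleles)]


def phenotype(gamete_1, gamete_2):
    """One dominant allele (G or L) at a locus is enough to show the dominant phenotype."""
    colour = "yellow" if "G" in (gamete_1[0], gamete_2[0]) else "green"
    shape = "smooth" if "L" in (gamete_1[1], gamete_2[1]) else "wrinkled"
    return f"{colour}, {shape}"


f1 = (("G", "v"), ("L", "r"))  # F1 genotype Gv-Lr
counts = Counter(phenotype(g1, g2) for g1, g2 in product(gametes(f1), repeat=2))
total = sum(counts.values())
ratios = {ph: Fraction(n, total) for ph, n in counts.items()}

for ph, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{ph}: {ratio}")
```

The enumeration treats the two loci as independent, which is exactly the assumption that fails when loci are linked.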

Linkage

• Two loci on two different chromosomes segregate independently of each other. Their chance of being inherited together (co-inherited) is 0.5. These loci are unlinked.

• Two loci are said to be linked if they are located on the same chromosome and segregate together.
  – Linkage Disequilibrium (LD)

• Because of recombination, two loci on the same chromosome have a chance of not being inherited together.

Recombination

• During meiosis, a chromosome often breaks and rejoins with its homologous chromosome, resulting in new chromosomal combinations (crossovers).

• The greater the distance between two loci on a chromosome, the more likely it is that a recombination occurs between them.


Recombinants

Mapping functions

• The mapping function gives the relationship between the distance between two chromosomal locations on a genetic map and their recombination frequency.

• The distance between two loci is determined by their recombination fraction.

• The mapping unit is the Morgan.
• 1 Morgan is the distance over which, on average, 1 crossover (recombination) occurs per meiosis.
  – 1 cM = 1 recombination in 100 meioses

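Mapping function

Here $r$ is the recombination fraction and $d_M$ the map distance in Morgans.

• Morgan: $d_M = r$

• Haldane mapping function:
  $d_M = -\tfrac{1}{2}\ln(1 - 2r)$
  $r = \tfrac{1}{2}\left[1 - \exp(-2 d_M)\right]$

• Kosambi mapping function:
  $d_M = \tfrac{1}{4}\ln\left[\frac{1 + 2r}{1 - 2r}\right]$
  $r = \frac{1 - \exp(-4 d_M)}{2\left[1 + \exp(-4 d_M)\right]}$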
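The three mapping functions can be compared numerically. The sketch below implements the standard Haldane and Kosambi forms (with the Haldane distance taken as minus one half of ln(1 - 2r), so that distances are positive); function names are illustrative, and distances are in Morgans.

```python
import math


def haldane_distance(r):
    """Map distance (Morgans) from recombination fraction r, for 0 <= r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)


def haldane_recombination(d):
    """Recombination fraction from map distance d (Morgans), Haldane."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi_distance(r):
    """Map distance (Morgans) from recombination fraction r, for 0 <= r < 0.5."""
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_recombination(d):
    """Recombination fraction from map distance d; equal to (1 - e^-4d) / (2 (1 + e^-4d))."""
    return 0.5 * math.tanh(2.0 * d)


for r in (0.01, 0.10, 0.30):
    print(
        f"r = {r:.2f}: Morgan {100 * r:5.1f} cM, "
        f"Haldane {100 * haldane_distance(r):5.1f} cM, "
        f"Kosambi {100 * kosambi_distance(r):5.1f} cM"
    )
```

For small r the three functions nearly coincide; they diverge as r approaches 0.5, where multiple crossovers between the loci become common.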

Haldane function

What is a QTL

• The phenotype of an individual can be described accurately by a quantitative number
  – e.g. kg of milk produced
• Continuous distribution with an associated variability
• h2 = proportion of variability due to genetic differences among individuals


What is a QTL

• Genetic theory for outbreeding populations
  – A moderate number of genes for which alleles can be found that affect the expression of the trait differently
• A single gene that contributes to the variation
  – Quantitative Trait Gene (QTG)
• The combined effect of all the genes
  – Determines the variability of the trait

What is a QTL

• BUT... QTGs can be distinguished by their location on the chromosomes (locus)
• For this reason the elements responsible for the variability of the trait are called
  – Quantitative Trait Loci (QTL)
• We define the chromosomal regions where QTL are found as QTLR

QTL – why map it??

•  To provide knowledge of individual gene actions and interactions.

•  To build a more realistic model of phenotypic variation, response to selection and evolutionary processes.

•  To improve breeding value estimation and selection response / reduce cost of breeding programmes through marker assisted selection.

Steps for MAS

• Identify an association between a marker and the trait of interest (QTL)
  – Choose the markers and the experimental design (daughter / granddaughter design)
  – Carry out sampling
  – DNA extraction
  – Amplification (PCR) / genotyping / sequencing
  – Statistical analysis
  – Use of the results for MAS / genomic selection

MAS - the marker

• The marker "flags" the presence on the genome of a gene of interest for livestock production

Q M

q m

Q = QTL - gene coding for the trait of interest
M = Marker - "flags" the presence of the QTL

MAS - the marker

• The marker "flags" the presence on the genome of a gene of interest for livestock production

PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8

MAT. m1 m2 m3 m4 q1 m5 m6 m7 q2 m8

Q = QTL - gene coding for the trait of interest
M = Marker - "flags" the presence of the QTL


Different Q/M het/hom

PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8

MAT. m1 m2 m3 m4 Q1 m5 m6 m7 q2 m8

PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8

MAT. m1 m2 M3 M4 q1 M5 M6 M7 q2 M8

PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8

MAT. m1 M2 M3 m4 Q1 m5 m6 m7 q2 M8

Mapping QTL in different populations

• Inbred line crosses (e.g. mice, plants)
  – To detect QTL that differ between the lines
  – Backcross (BC)
  – Cross in second generation (F2)
  – F3-F.. (AIL = advanced intercross lines, RIL = recombinant inbred lines)

Mapping QTL in different populations

• Outbred populations
  b) within structured outbred populations
    • search for QTL within populations
    • e.g. daughter design and granddaughter design
    • e.g. dairy cattle, commercial pigs
    • e.g. trees (maternal half-sib families)
  c) within unstructured outbred populations
    • (e.g. humans) complex pedigrees: many generations, different relationships, inbreeding loops

Mapping QTL in cattle

• Dairy cattle
  – Outbreeding populations
• Genome scan
  – The search for QTL across the whole genome
• Experimental designs
  – Daughter Design
  – Grand Daughter Design
  – Selective DNA Pooling
• High resolution mapping
  – A finer search within QTLR

Interval mapping

[Figure: meiosis in a parent heterozygous at the QTL (Q/q) and at two flanking markers (M/m), with example gametes carrying different marker-QTL combinations.]

Haplotype

•  Haplotype ((linkage) phase): A combination of alleles (for different loci) which are located (closely) together on the same chromosome and which tend to be inherited together.

..A b C D ...... r Q S..

.. A b C D ...... R q s..


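The Daughter Design

• Bulls must be heterozygous at the markers
• Large half-sib family groups

Q M
q m

1000 / 4000 daughters

Daughter design

[Figure: a sire heterozygous at the marker (M/m) and at the QTL (Q/q) is used for inseminations. Daughter groups are compared for protein content (3.0% vs 3.2%) according to the sire haplotype inherited (m-q or M-Q). The marker and QTL alleles of the daughters and of the sire's son are unknown (M/m? q/Q?).]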

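Existing studies

[Figure: granddaughter design. A grandsire carries the haplotypes C Q D and c q d; his progeny-tested (PT) sons inherit the marker haplotypes CD, Cd, cD or cd and are ranked from best to worst; the genotypes of the candidate PT grandsons are unknown (??).]

• Use of interval mapping
• Many markers to ensure heterozygosity
• Use of flanking markers to trace the transmission of the QTL

[Figure: markers A B C D E F along a chromosome, with the position of the QTL indicated.]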

Interval mapping: example

[Figure: test statistic (y-axis, 0 to 60) against location on chromosome in cM (x-axis, 0 to 100), for maps with 11, 6 and 3 markers.]

Interval mapping

[Figure: individual sire F values for protein yield on chromosome 1; F value (y-axis, 0 to 12) against position on chromosome in cM (x-axis, 1 to 111), one curve per sire (Sire 1 to Sire 6).]

The Selective DNA pooling

• Identify the individuals in the tails
• Use the correct indicator
  – DYD or corrected EBV

Q M
q m

1000 / 4000 daughters

Low 10%: 2 sub-pools

High 10%: 2 sub-pools


The methodology

• Selective DNA pooling
• Estimation of allele frequency
  – A panel of 150/200 microsatellite markers
  – Shadow band correction

[Figure: a sire with haplotypes Q M / q m and 1000 / 4000 daughters. In one tail pool f(M) = .40 and f(m) = .10; in the other tail pool f(M) = .10 and f(m) = .40. Low 10%: 2 sub-pools; High 10%: 2 sub-pools.]

Identification of QTL

From Tal-Stein et al. 2010 - J. Dairy Sci. 93:4913-4927

Milk Somatic Cell Count – Holstein Friesian Selective DNA pooling

Progeny Test and MAS

[Figure: two parents, each with a value of 1000 and each carrying both + and - alleles; their offspring receive different combinations of + and - alleles.]

IP = (1000 + 1000)/2 = 1000

From Traditional to Genomic Selection

• Variation in phenotypes (e.g. milk production) is determined by:
  – Difference in management (environment)
  – Difference in genetics: each individual is genetically different from others

• Genomic selection principle:
  – Relate phenotypic differences to genetic differences
  – Use this information (knowledge of genetic differences) to select individuals

Measurement (e.g. meat quality) = Genetics + Environment

Combined effect of all influential genes (G1 + G2 + G3 + …)

Useful to know the genes (alleles)

e.g. Calpain and Calpastatin gene tests detect variation in some of the genes involved in (the genetic component of) beef tenderness

Identification of QTL

MSCC - From Tal-Stein et al. 2010 - J. Dairy Sci. 93:4913-4927

What is really controlling the trait?


Technology and Information

• 2003: human genome delivered
  – After many years (US$ 100s of millions)
• 2006: cattle genome delivered
  – Two years (US$ 50M)
• 2008/2010: reality
  – 777,000+ SNP panel for cattle, ~US$ 0.0003/SNP
  – 54,000+ SNP panels for pigs, sheep and chickens, ~US$ 0.003/SNP
• 2011: reality
  – 1 cow sequenced for less than 10,000 US$
• 2012…: whole genome re-sequencing genotyping
  – 1 cow sequenced for less than 1,000 US$
• 2015…: ????

New technologies => new possibilities

• Genomic technology
  – SNP genotyping platforms
  – Second / third generation sequencers
  – Production of genomic information at a very low cost per individual
• Phenotype recording
  – Automation in phenotypic recording
  – E.g. NIR technology for milk quality traits
• Trait ontology
• Reproductive technologies
  – Embryo transfer
  – Semen sexing
• Association of genomic information to phenotypic information
  – Genomic selection

Copyright 2001 by the Genetics Society of America

Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps

T. H. E. Meuwissen,* B. J. Hayes† and M. E. Goddard†,‡

*Research Institute of Animal Science and Health, 8200 AB Lelystad, The Netherlands, †Victorian Institute of Animal Science, Attwood 3049, Victoria, Australia and ‡Institute of Land and Food Resources, University of Melbourne, Parkville 3052, Victoria, Australia

Manuscript received August 17, 2000
Accepted for publication January 17, 2001

ABSTRACT

Recent advances in molecular genetic techniques will make dense marker maps available and genotyping many individuals for these markers feasible. Here we attempted to estimate the effects of ~50,000 marker haplotypes simultaneously from a limited number of phenotypic records. A genome of 1000 cM was simulated with a marker spacing of 1 cM. The markers surrounding every 1-cM region were combined into marker haplotypes. Due to finite population size (Ne = 100), the marker haplotypes were in linkage disequilibrium with the QTL located between the markers. Using least squares, all haplotype effects could not be estimated simultaneously. When only the biggest effects were included, they were overestimated and the accuracy of predicting genetic values of the offspring of the recorded animals was only 0.32. Best linear unbiased prediction of haplotype effects assumed equal variances associated to each 1-cM chromosomal segment, which yielded an accuracy of 0.73, although this assumption was far from true. Bayesian methods that assumed a prior distribution of the variance associated with each chromosome segment increased this accuracy to 0.85, even when the prior was not correct. It was concluded that selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.

SELECTION for economically important quantitative traits in animals and plants is traditionally based on phenotypic records of the individual and its relatives. Estimated breeding values, based on this phenotypic data, are commonly calculated by best linear unbiased prediction (BLUP; Henderson 1984). One justification for molecular genetics research on livestock and crop species is the expectation that information at the DNA level will lead to faster genetic gain than that achieved based on phenotypic data only. The availability of a sparse map of genetic markers has resulted in the detection of some quantitative trait loci (QTL; Georges et al. 1995). The inclusion of marker information into BLUP breeding values was demonstrated by Fernando and Grossman (1989) and is predicted to yield 8–38% extra genetic gain (Meuwissen and Goddard 1996). However, the usefulness of information from a sparse marker map in outbreeding species is limited because the linkage phase between a marker and QTL must be established for every family in which the marker is to be used for selection.

The total number of single nucleotide polymorphisms (SNP) is estimated at many millions (Halushka et al. 1999), and the advent of DNA chip technology may make genotyping of many animals for many of these markers feasible (and perhaps even cost effective). However, the precision of mapping QTL by traditional linkage analysis is little improved by the use of a very dense marker map (Darvasi et al. 1993). Therefore, a different approach is needed to efficiently use all this marker information.

With a dense marker map some markers will be very close to the QTL and probably in linkage disequilibrium with it (e.g., Hastbacka et al. 1992). Therefore, some marker alleles will be correlated with positive effects on the quantitative trait across all families and can be used for selection without the need to establish linkage phase in each family. Close markers can be combined into a haplotype. Chromosome segments that contain the same rare marker haplotypes are likely to be identical by descent (IBD) and hence carry the same QTL allele. Our approach is to estimate the effect on the quantitative trait of small chromosome segments defined by the haplotypes of marker alleles that they carry.

Quantitative traits are usually affected by many genes and consequently the benefit from marker-assisted selection is limited by the proportion of the genetic variance explained by the QTL. It would be desirable to utilize all QTL affecting the trait in marker-assisted selection. However, a dense marker map defines a very large number of chromosome segments and so there will be many effects to be estimated, probably more than there are phenotypic data points from which to estimate them.

The problem is essentially the same if we assume that

Corresponding author: Theo Meuwissen, Department of Animal Breeding and Genetics, DLO-Institute for Animal Science and Health, Box 65, 8200 AB Lelystad, The Netherlands. E-mail: [email protected]

Genetics 157: 1819–1829 (April 2001)

The Principle of ‘Genomic Selection’

Training: measured traits + thousands of markers → prediction equations

Application: thousands of markers + prediction equations → GEBV

Why is EBV (DYD) the phenotype?

P = µ + A + D + I + PE + TE

G = A + D + I (genetic effects), E = PE + TE (environmental effects)

P = phenotype
µ = factor common to all individuals (e.g. management)
A = additive genetic effects
D = dominance genetic effects
I = interaction genetic effects
PE = permanent environmental effects
TE = temporary environmental effects

Phenotype = Management + Chance + Genetics

Phen / Gen Variability

Individual 1
PAT. M1 M2 M3 M4 Q1 M5 M6 M7 Q2 M8 … q/Qn … m/Mn
MAT. m1 m2 m3 m4 Q1 m5 m6 m7 Q2 m8 … q/Qn … m/Mn
EBV => Milk kg = 2340, Protein % = +0.12, SCC = 95

Individual 2
PAT. M1 M2 M3 m4 q1 M5 M6 M7 Q2 M8 … q/Qn … m/Mn
MAT. m1 M2 M3 m4 q1 m5 m6 m7 q2 m8 … q/Qn … m/Mn
EBV => Milk kg = 1870, Protein % = +0.23, SCC = 103

Individual n
PAT. m1 M2 M3 m4 q1 M5 M6 M7 q2 M8 … q/Qn … m/Mn
MAT. m1 m2 M3 M4 Q1 m5 M6 M7 q2 m8 … q/Qn … m/Mn
EBV => Milk kg = 2140, Protein % = -0.03, SCC = 101


E.g. marker 10,720

Genotype              N. individuals   Phenotype
M10,720 M10,720             38            2019
M10,720 m10,720            545            1780
m10,720 m10,720            319            1689

[Figure: phenotype means and breeding values (BV) plotted against the number of M alleles in the genotype (0 = mm, 1 = Mm, 2 = MM; y-axis 1600 to 2100), showing the dominance deviation. Mean alpha = 141.7]
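One way to obtain an allele substitution effect from the genotype means of this marker is the textbook expression alpha = a + d(q - p). The sketch below applies it, taking the three rows of the example as MM, Mm and mm (the order on the chart axis). It is an illustration under that assumption, not necessarily the calculation used for the slide, whose reported value of 141.7 is close to what this gives.

```python
# Genotype counts and mean phenotypes from the example marker
# (rows taken as MM, Mm, mm).
counts = {"MM": 38, "Mm": 545, "mm": 319}
means = {"MM": 2019.0, "Mm": 1780.0, "mm": 1689.0}

n_individuals = sum(counts.values())
p = (2 * counts["MM"] + counts["Mm"]) / (2 * n_individuals)  # frequency of allele M
q = 1.0 - p                                                  # frequency of allele m

midpoint = (means["MM"] + means["mm"]) / 2
a = (means["MM"] - means["mm"]) / 2  # additive effect: half the difference between homozygotes
d = means["Mm"] - midpoint           # dominance deviation of the heterozygote
alpha = a + d * (q - p)              # average effect of substituting m with M

print(f"p(M) = {p:.3f}, a = {a:.1f}, d = {d:.1f}, alpha = {alpha:.1f}")
```

With these numbers the heterozygote lies below the midpoint of the two homozygotes (negative d), and since M is the rarer allele this lowers alpha relative to a.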

0.0 0.0 +0.5 0.0 0.0 -0.07 0.0 +0.15

+0.2 0.0 0.0 0.0 0.0 -0.3 0.0 0.0

Assume all errors cancel each other out

Protein Content (%) = + 0.12 Longevity (years) = + 0.34

0.0 +0.2 0.0 0.0 0.0 0.0 +0.03 0.0

-0.1 0.0 +0.35 0.0 0.0 -0.05 -0.01 0.0

And so on … = ????? 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Most markers will have no effect

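Direct Genomic EBV

• Training (single marker analysis)

  $y = \mu \mathbf{1} + X_j g_j + e$

• Test population

  $DGEBV_i = \sum_{j=1}^{n} x_{ij}\,\hat{g}_j$

• Where
  – $x_{ij}$ is the marker genotype of individual $i$ at marker $j$ ($X_j$ collects these genotypes for the training individuals)
  – $\hat{g}_j$ is the estimated effect of marker $j$
  – $n$ is the number of markers and $\mathbf{1}$ is a vector of ones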
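The application step is a sum of genotype codes weighted by marker effects. In the sketch below, the marker effects, the animal names and the 0/1/2 genotype coding are all hypothetical values chosen for illustration; in practice the effects would come from the training population.

```python
# Hypothetical marker effects g_j, as if estimated in a training population.
marker_effects = [0.5, 0.0, -0.07, 0.15, 0.0]

# Hypothetical genotypes x_ij of selection candidates, coded as the number of
# copies (0, 1 or 2) of the reference allele at each marker.
genotypes = {
    "candidate_1": [2, 1, 0, 1, 2],
    "candidate_2": [0, 1, 2, 0, 1],
    "candidate_3": [1, 0, 1, 2, 0],
}


def dgebv(x, g):
    """Direct genomic EBV: sum over markers of genotype code times marker effect."""
    if len(x) != len(g):
        raise ValueError("genotype and effect vectors must have the same length")
    return sum(x_ij * g_j for x_ij, g_j in zip(x, g))


predictions = {animal: dgebv(x, marker_effects) for animal, x in genotypes.items()}
ranking = sorted(predictions, key=predictions.get, reverse=True)

for animal in ranking:
    print(f"{animal}: DGEBV = {predictions[animal]:+.2f}")
```

No phenotype of the candidate enters the calculation, which is what allows selection of young animals before any records are available.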

Accuracy of GEBV

• Factors influencing the accuracy of genomic selection, i.e. r(DGEBV, TBV)
  – Linkage disequilibrium between QTL and markers
    • Density of markers => whole genome
  – Method of estimation
    • Single marker / haplotypes / IBD
    • BLUP / Bayes etc.
  – Number of records used to estimate the prediction equations, i.e. number of individuals in the training population

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection, r(GEBV, TBV)
  – Linkage disequilibrium between QTL and markers = density of markers
    • Haplotypes or single markers must be in sufficient LD with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.
    • Calus et al. (2007) used simulation to assess the effect of LD between QTL and markers on the accuracy of genomic selection

Accuracy of genomic selection

• Effect of LD on accuracy of selection

[Figure: accuracy of GEBV (y-axis, 0.5 to 1) against average r2 between adjacent marker loci (x-axis, 0.075 to 0.215); series: HAP_IBS.]

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection, r(GEBV, TBV)
  – Linkage disequilibrium between QTL and markers = density of markers
  – In dairy cattle populations, an average r2 of 0.2 between adjacent markers is only achieved when markers are spaced every 100 kb.
  – The bovine genome is approximately 3,000,000 kb
  – This implies that on the order of 30,000 markers are required for genomic selection to achieve accuracies of 0.8

[Figure: average r2 value (y-axis, 0 to 0.8) against distance in kb (x-axis, 0 to 1500) for Australian Holstein, Norwegian Red, Australian Angus, New Zealand Jersey and Dutch Holsteins.]
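The marker count follows from dividing the genome length by the marker spacing. A small sketch of that arithmetic, using the two figures quoted on the slide (the function name is illustrative):

```python
def markers_needed(genome_size_kb, spacing_kb):
    """Number of evenly spaced markers needed to cover a genome at a given spacing."""
    return genome_size_kb // spacing_kb


BOVINE_GENOME_KB = 3_000_000  # approximate bovine genome size from the slide
SPACING_KB = 100              # spacing at which average r2 between adjacent markers reaches 0.2

print(markers_needed(BOVINE_GENOME_KB, SPACING_KB))
```

Halving the spacing doubles the number of markers, so the required panel size is driven entirely by how quickly LD decays with distance in the population.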

Accuracy of genomic selection

• Comparing the accuracy of genomic selection with
  – IBD approach
  – Haplotypes
  – Single markers
  – Calus et al. (2007) used simulated data

'QTL Mapping, MAS, and Genomic Selection'

Course Notes

Taught by Dr. Ben Hayes, Animal Genetics and Genomics group of the Department of Primary Industries Research Victoria (Attwood - Melbourne, Australia)

March 10-14, 2008
The Animal Breeding and Genomics Centre (Animal Sciences Group - Wageningen University and Research Centre), Lelystad, The Netherlands

Courtesy of Ben Hayes, 2008

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)–Linkage disequilibrium between QTL

and markers = density of markers• Haplotypes or single markers be in sufficient LD

with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)–Linkage disequilibrium between QTL

and markers = density of markers• Haplotypes or single markers be in sufficient LD

with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.

• Calus et al. (2007) used simulation to assess effect of LD between QTL and markers on accuracy of genomic selection

Accuracy of genomic selection• Effect of LD on accuracy of

selection

0.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

0.075 0.095 0.115 0.135 0.155 0.175 0.195 0.215

Average r2 between adjacent marker loci

Acc

ura

cy o

f G

EB

V

HAP_IBS

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)– Linkage disequilibrium between QTL and

markers = density of markers– In dairy cattle populations, an average r2

of 0.2 between adjacent markers is only achieved when markers are spaced every 100kb.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500

Distance (kb)

Ave

rag

e r

2 v

alu

e

Australian Holstein

Norwegian Red

Australian Angus

New Zealand Jersey

Dutch Holsteins

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)– Linkage disequilibrium between QTL and

markers = density of markers– In dairy cattle populations, an average r2

of 0.2 between adjacent markers is only achieved when markers are spaced every 100kb.

– Bovine genome is approximately 3000000kb

– Implies that in order of 30 000 markers are required for genomic selection to achieve accuracies of 0.8!!

Accuracy of genomic selection

• Comparing the accuracy of genomic selection with – IBD approach –haplotypes – single markers–Calus et al (2007) used simulated

data

'QTL Mapping, MAS, and Genomic Selection'

Course Notes

Taught by Dr. Ben Hayes, Animal Genetics and Genomics group of the Department of Primary Industries Research Victoria (Attwood - Melbourne, Australia)

March 10-14, 2008The Animal Breeding and Genomics Centre (Animal Sciences Group - Wageningen University and Research Centre), Lelystad, The Netherlands

4/15

Courtesy of Ben Hayes, 2008

Page 12: Coat color Genomic Selection Polledness DNA - unimi.itamaltea.vete.unimi.it/docenti/bagnato/Disp_Vet_Integrazione_H08/... · 6 Cosa è un QTL • La teoria genetica per le popolazioni

12

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)–Linkage disequilibrium between QTL

and markers = density of markers• Haplotypes or single markers be in sufficient LD

with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)–Linkage disequilibrium between QTL

and markers = density of markers• Haplotypes or single markers be in sufficient LD

with the QTL such that the haplotype or single markers will predict the effects of the QTL across the population.

• Calus et al. (2007) used simulation to assess effect of LD between QTL and markers on accuracy of genomic selection

Accuracy of genomic selection• Effect of LD on accuracy of

selection

0.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

0.075 0.095 0.115 0.135 0.155 0.175 0.195 0.215

Average r2 between adjacent marker loci

Acc

ura

cy o

f G

EB

V

HAP_IBS

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)– Linkage disequilibrium between QTL and

markers = density of markers– In dairy cattle populations, an average r2

of 0.2 between adjacent markers is only achieved when markers are spaced every 100kb.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500

Distance (kb)

Ave

rag

e r

2 v

alu

e

Australian Holstein

Norwegian Red

Australian Angus

New Zealand Jersey

Dutch Holsteins

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection r(GEBV,TBV)– Linkage disequilibrium between QTL and

markers = density of markers– In dairy cattle populations, an average r2

of 0.2 between adjacent markers is only achieved when markers are spaced every 100kb.

– Bovine genome is approximately 3000000kb

– Implies that in order of 30 000 markers are required for genomic selection to achieve accuracies of 0.8!!

Accuracy of genomic selection

• Comparing the accuracy of genomic selection with – IBD approach –haplotypes – single markers–Calus et al (2007) used simulated

data

'QTL Mapping, MAS, and Genomic Selection'

Course Notes

Taught by Dr. Ben Hayes, Animal Genetics and Genomics group of the Department of Primary Industries Research Victoria (Attwood - Melbourne, Australia)

March 10-14, 2008The Animal Breeding and Genomics Centre (Animal Sciences Group - Wageningen University and Research Centre), Lelystad, The Netherlands

4/15

Courtesy of Ben Hayes, 2008

Accuracy of genomic selection

• Factors affecting accuracy of genomic selection, r(GEBV,TBV)
  – Linkage disequilibrium between QTL and markers (i.e. density of markers)
    • Haplotypes or single markers must be in sufficient LD with the QTL, such that the haplotype or single marker will predict the effects of the QTL across the population.
• Calus et al. (2007) used simulation to assess the effect of LD between QTL and markers on the accuracy of genomic selection
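The r2 statistic used on these slides to measure LD can be computed from phased haplotypes. The sketch below uses the standard definition r2 = D^2 / (pA(1-pA) pB(1-pB)) with D = pAB - pA pB. The function name `ld_r2` and the toy haplotypes are hypothetical and are not taken from the course notes.

```python
def ld_r2(haplotypes):
    """r2 between two biallelic loci, from phased haplotypes.

    Each haplotype is a pair (a, b) of 0/1 alleles at the two loci.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Toy data (hypothetical): alleles always travel together vs. independently.
complete_ld = [(1, 1)] * 5 + [(0, 0)] * 5
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r2(complete_ld))  # 1.0
print(ld_r2(no_ld))        # 0.0
```

A marker with r2 close to 1 with a QTL carries nearly all the information about the QTL alleles; at r2 close to 0 it carries none.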

Accuracy of genomic selection

• Effect of LD on accuracy of selection

[Figure: accuracy of GEBV (0.5 to 1) against average r2 between adjacent marker loci (0.075 to 0.215); series: HAP_IBS]


Accuracy of genomic selection

[Figure: accuracy of GEBV (0.5 to 1) against average r2 between adjacent marker loci (0.075 to 0.215); series: SNP1, HAP_IBS, HAP_IBD]

Accuracy of genomic selection

• Number of records used to estimate chromosome segment effects
  – Chromosome segment effects g_i are estimated in a reference population
  – How big does this reference population need to be?
  – Meuwissen et al. (2001) evaluated accuracy using LS, BLUP and BayesB, with 500, 1000 or 2200 records in the reference population

Accuracy of genomic selection

• Number of records used to estimate chromosome segment effects (h2 = 0.5)

Method                                   No. of phenotypic records
                                         500     1000    2200
Least squares                            0.124   0.204   0.318
Best linear unbiased prediction (BLUP)   0.579   0.659   0.732
BayesB                                   0.708   0.787   0.848
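Accuracy is written on these slides as r(GEBV,TBV), the correlation between genomic and true breeding values. The sketch below shows how such a figure could be computed in a simulation where true effects are known. It assumes a GEBV is the sum of an animal's marker (or segment) genotype codes multiplied by the estimated effects g_i. All names and numbers are hypothetical toy values, not data from Meuwissen et al. (2001).

```python
from math import sqrt


def breeding_value(genotypes, effects):
    """Sum of genotype codes (0/1/2 allele copies) times per-segment effects."""
    return sum(x * g for x, g in zip(genotypes, effects))


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: 3 segments, 4 selection candidates.
true_effects = [0.5, -0.2, 0.1]       # known only in simulation
estimated_effects = [0.4, -0.1, 0.0]  # as estimated in a reference population
animals = [[0, 1, 2], [2, 0, 1], [1, 1, 1], [2, 2, 0]]

tbv = [breeding_value(a, true_effects) for a in animals]
gebv = [breeding_value(a, estimated_effects) for a in animals]

accuracy = pearson(gebv, tbv)  # r(GEBV, TBV)
print(round(accuracy, 3))
```

With more records in the reference population the estimated effects approach the true ones, which is the trend the table shows across all three methods.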

Accuracy of genomic selection

• Number of records used to estimate chromosome segment effects

[Figure: number of phenotypic records necessary to achieve a given accuracy (0 to 21,000) against heritability (0 to 1); curves: accuracy of GEBV 0.7, accuracy of GEBV 0.5]

Genomic selection

• An IBD approach
• Factors affecting accuracy of genomic selection
• How often to re-estimate the chromosome segment effects?
• Non-additive effects
• Genomic selection with low marker density
• Genomic selection across breeds
• Cost effective genomic selection
• Optimal breeding program design with genomic selection


Technology and information

The SNP Genotyping technology

Why Genomic Selection?

• Where trait measurements are:
  – Expensive
  – Difficult
  – Only measured in one sex
  – Only possible late in life
  – Only possible on relatives of the animals available for selection
• Low heritability traits
  – i.e. low signal:noise ratio
• Where inheritance is complex
• Where incidence is sporadic
• For traits such as:
  – Product composition
  – Fertility
  – Longevity
  – Meat quality
  – Disease resistance


Why Genomic Selection?

• Higher reliability of proofs (EBV)
  – Possibility to use a group of unproven bulls
  – Better choice of bull dams
• Shorter generation interval
  – Use of young bulls
• Lower progeny testing costs
  – Smaller groups of young bulls to progeny test
  – Reduction in generation interval

Key Questions about GS

• Training
  – How many animals?
  – How many markers?
  – How many generations?
• Application
  – How accurate?
  – How stable over time?
  – How robust across populations and environments?
    • E.g. Genotype x Environment interaction

From Muir et al. 2010, Interbull Bulletin 41

How can we improve genomic selection???

Italian Friesian (Frisona Italiana) breed: selection scheme

• 1 million registered cows, 500,000 unregistered
• Top 2%: bull dams
• Top 1% Italian bulls and top 1% international bulls
• 500 (T) / 2000 (G) young bulls
• ANAFI Genetic Centre: 400 (T) / 100 (G) young bulls
• Progeny test (T): 23% of the registered population
• Top 5% (T+G)

Some markers in animal breeding

• Caseins
  – Casein genes isolated and fully sequenced for about 15 years; new variants discovered
• β-lactoglobulin
  – At least 6 genetic variants
    • Effect on the casein content of milk and on cheese-making
• BLAD (Bovine Leukocyte Adhesion Deficiency)
  – Mutation carried by > 2% of Holstein animals

Some markers in animal breeding

• DUMPS (Deficiency of Uridine Monophosphate Synthase)
  – Only in the Holstein breed
    • Healthy carriers produce more milk
    • In homozygotes, increased mortality in the first two months of gestation
• WEAVER (progressive degenerative myeloencephalopathy)
  – Only in the Brown (Bruna) breed
    • Healthy carriers produce more milk (700 kg per year)
    • Gene not yet isolated, but a marker has been identified

Some markers in animal breeding

• PSS-MH (Porcine Stress Syndrome - Malignant Hyperthermia)
  – Compromises the use of the meat by the processing industry
  – Gene identified a few years ago
• MH (muscular hypertrophy)
• Booroola
  – Australian Merino breed
  – FF homozygotes: twice the number of lambs born


Diagnosis with markers

• E.g. "Complex Vertebral Malformation"
• Recessive disorder, identified recently
• Carlin-M Ivanhoe Bell was a carrier and the sire of numerous bulls used for breeding
• Damage to selection