The wheat genome sequence: a foundation for accelerating improvement of bread wheat
Catherine Feuillet, INRA
Genetics, Diversity & Ecophysiology of Cereals, INRA Clermont-Ferrand, France
BGRI 2012 Technical Workshop September 1-4, Beijing
Gene and QTL mapping
Map-based cloning
Candidate genes
Perfect markers
Allele mining
The future is in an integrated toolbox
Training, capacity building -> expertise and critical mass
Improved wheat varieties
Genetics and genomics resources (early 2000)
1. Genetic mapping
• Molecular markers: low-throughput RFLP and SSR
  ✓ Wheat: 1634 RFLPs / 2946 SSRs
  ✓ ESTs: > 1 million wheat ESTs
• Mapping populations: a few references with low marker coverage, numerous biparental populations of small size (100), a few specific high-resolution F2/RILs for map-based cloning projects
2. Physical mapping
• BAC libraries
  ✓ Wheat: 1x ABD-genome (CS), 1x AB-genome, 1x A-genome, 1x D-genome; chromosome-specific libraries: 3B, 1-4-6D
• Physical maps: none (D genome Ae. tauschii, IWGSC chromosome-based roadmap)
3. Genome sequence: none, but the NGS sequencing revolution opened perspectives
Map-based cloning laborious and inefficient
Marker-assisted selection not broadly deployed, not cost-efficient for most breeders
Funding or Scientific Contributors
64 members, 22 countries www.wheatgenome.org
Launched in 2005 on the initiative of Kansas Growers
Sequencing Consortium
23 Sponsors
~ 500 members
40 countries
An integrated and ordered wheat genome sequence
Phenotyping Genetic mapping Physical mapping Sequencing
The bread wheat genome is…
1. Big: 17 Gb (5 x human genome, 40 x rice…)
2. Polyploid: 2n = 6x = 42
3. Full of TEs (>90%)
[Figure: polyploid origin of bread wheat. Species labels: T. urartu, Ae. speltoides (?), Ae. tauschii, T. turgidum, T. aestivum. Time points: 1 MYA, 8-10 KYA]
Dissection of the genome to single chromosomes (arms) representing individual (sub)genomes
Triticum aestivum (2n = 6x = 42), 1C ~ 17,000 Mbp, subgenomes AA, BB, DD
• Chromosomes: 605 - 995 Mbp (3.6 - 5.9% of the genome)
• Chromosome arms: 225 - 585 Mbp (1.3 - 3.4% of the genome)
Doležel et al., Chromosome Res. 15: 51, 2007
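The percentages quoted for chromosomes and chromosome arms follow directly from the 1C genome size. The short sketch below only redoes that arithmetic with the figures from the slide; the function and variable names are illustrative.

```python
# Genome size as quoted on the slide: 1C ~ 17,000 Mbp
GENOME_MBP = 17_000


def percent_of_genome(size_mbp, genome_mbp=GENOME_MBP):
    """Return the share of the genome (in %) represented by a fragment."""
    return 100.0 * size_mbp / genome_mbp


# Size ranges quoted on the slide, in Mbp
size_ranges = {
    "chromosomes": (605, 995),
    "chromosome arms": (225, 585),
}

for label, (smallest, largest) in size_ranges.items():
    low = percent_of_genome(smallest)
    high = percent_of_genome(largest)
    print(f"{label}: {low:.1f} - {high:.1f}% of the genome")
```

Rounded to one decimal, the results match the 3.6 - 5.9% and 1.3 - 3.4% ranges on the slide.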
[Figure: flow-cytometric chromosome sorting. Labels: sheath fluid, flow chamber, laser, excitation light, scattered light, fluorescence emission, deflection plates, waste]
A chromosome-based approach
• Chromosome-specific BAC libraries (end 2012)
• Amplified DNA for chromosome survey (June 2012)
IEB
Combined strategies to establish a wheat reference genome sequence
• Short term: survey sequencing of individual chromosomes -> gene catalog, virtual order, markers
• Long term: physical mapping of individual chromosomes -> MTP sequencing -> anchored and ordered sequence, intergenic regions, markers
A reference sequence anchored to the genetic and phenotypic maps
An international effort
T. aestivum cv Chinese Spring
1A 2A 3A 4A 5A 6A 7A
1B 2B 3B 4B 5B 6B 7B
1D 2D 3D 4D 5D 6D 7D
Physical map of the 1 Gb chromosome 3B
1283 contigs (average size = 749 kb) with FPC
961 Mb coverage (97% of the chromosome)
✓ 4367 molecular markers (SSRs, ISBPs, unigenes…)
✓ Minimal Tiling Path (8448 clones)
http://urgi.versailles.inra.fr/projects/Triticum/index.php
Paux et al, Science 2008; Rustenholz et al, Plant Physiol 2011
http://urgi.versailles.inra.fr/cgi-bin/gbrowse/wheat_FPC_pub/
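The 3B physical map figures are mutually consistent, which is easy to check. The sketch below recomputes the coverage from the contig count and average size, and derives a few secondary values (implied chromosome size, markers and MTP clones per unit). Those derived values are not on the slide and are only back-of-the-envelope estimates.

```python
# Figures quoted on the slide for the chromosome 3B physical map
N_CONTIGS = 1283
AVG_CONTIG_KB = 749
COVERAGE_FRACTION = 0.97   # 97% of the chromosome
N_MARKERS = 4367
N_MTP_CLONES = 8448


def covered_mb(n_contigs=N_CONTIGS, avg_kb=AVG_CONTIG_KB):
    """Total length covered by the contigs, in Mb (1 Mb = 1000 kb)."""
    return n_contigs * avg_kb / 1000


coverage = covered_mb()
# Derived estimates (assumptions, not slide values)
implied_chromosome_mb = coverage / COVERAGE_FRACTION
kb_per_marker = coverage * 1000 / N_MARKERS
clones_per_contig = N_MTP_CLONES / N_CONTIGS

print(f"covered: {coverage:.0f} Mb")
print(f"implied chromosome size: {implied_chromosome_mb:.0f} Mb")
print(f"about one marker every {kb_per_marker:.0f} kb")
print(f"about {clones_per_contig:.1f} MTP clones per contig")
```

The recomputed coverage rounds to the 961 Mb given on the slide.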
Sequencing Survey Initiative
• Amplified sorted DNA (IEB)
• ~50X survey sequence of all individual chromosomes
• Assembly of gene catalog for each chromosome/arm (TGAC)
• Comparative: "Genome Zipper" (MIPS)
• Virtual gene order of the 21 bread wheat chromosomes
Chromosome Survey Sequencing
Amplified DNA / sorted chromosomes (IEB)
Illumina reads (2*108 bp), PE 0.5 kb, min 50x
Assembly (ABySS), k-mer 71
Contigs > 200 bp, N50 = 2.4 kb
• 1,526 genes on average per short arm
• 2,460 genes on average per long arm
• Total: 83,977
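The assembly statistics above rely on N50: the contig length such that contigs of that length or longer contain at least half of the assembled bases. The sketch below is a minimal implementation of that standard definition. The `min_length` filter mirrors the "contigs > 200 bp" cut-off, and the contig lengths are toy values, not survey data.

```python
def n50(lengths, min_length=0):
    """N50 of the contigs strictly longer than min_length.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted((n for n in lengths if n > min_length), reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig passes the length filter")


# Toy contig lengths in bp (hypothetical, for illustration only)
toy_contigs = [150, 300, 500, 900, 2400, 3100]
print(n50(toy_contigs, min_length=200))  # 2400
```

In the toy set, the 150 bp contig is discarded by the filter. The remaining 7,200 bp are half covered once the 3,100 bp and 2,400 bp contigs are added, so N50 is 2,400 bp.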
Map your favorite gene in silico
• Anyone can register to get a login and password by signing the data release policy agreement
• Click on a chromosome to access the survey sequence, with BLAST search and viewers
• BLAST against all or selected surveys
• Download your best hit sequences
http://urgi.versailles.inra.fr/Species/Wheat/Sequence-Repository
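Once BLAST results against the chromosome surveys are available, assigning a gene to a chromosome comes down to keeping the best hit per query. The sketch below assumes the results were saved in the standard 12-column BLAST tabular format. Whether the repository exports that format is an assumption, and the subject identifiers (`3B_contig_17`, etc.) and the query name `YFG` are hypothetical.

```python
import csv
import io

# Standard column order of BLAST tabular output (outfmt 6)
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text):
    """Return, for each query, the hit with the highest bit score."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip comment lines and malformed rows
        if len(row) != len(BLAST6_COLUMNS) or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST6_COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical hits of one gene against three homoeologous chromosome surveys
example = "\n".join("\t".join(fields) for fields in [
    ("YFG", "3A_contig_52", "94.0", "450", "27", "0", "1", "450", "33", "482", "1e-150", "690"),
    ("YFG", "3B_contig_17", "99.1", "450", "4", "0", "1", "450", "120", "569", "1e-180", "810"),
    ("YFG", "3D_contig_08", "93.5", "450", "29", "0", "1", "450", "75", "524", "1e-145", "670"),
])

hits = best_hits(example)
print(hits["YFG"]["sseqid"])
```

In a hexaploid genome the three homoeologous copies usually all produce hits, so it is worth keeping the runner-up scores to see how clearly the best chromosome stands out.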
An unlimited source of markers
Low-copy fraction: 10%
Gene density: 1 / 104 kb
Candidate genes
Repetitive fraction: 90%
ISBP density: 1 / 5 kb
Anonymous markers
Resequencing of 4 European elite wheat lines (Premio, Renan, Robigus and Xi19) x IWGSC chromosome arm survey sequences
Paux et al Plant J 2006; Plant Biotech J 2010
An unlimited source of markers
ISBP-derived SNPs: 3 million
Average polymorphic ISBP density: 1 / 20 kb
Average SNP density: 1.8 SNPs / ISBP
-> High-density isolated anonymous SNPs
Intergenic region-derived SNPs: 2.1 million
Variable density
Average SNP density: 2.2 SNPs / kb
-> Low-density blocks of anonymous SNPs
Gene-derived SNPs: 670,000
Average gene density: 1 / 104 kb
Average SNP density: 2.9 SNPs / gene
-> Low-density blocks of "candidate" SNPs
And integration of 12,175 ESTs, 1,181 DArTs, 38,905 GBS and 7,000 gene SNPs from the 9K Infinium array
3B SEQuencing Project (1 Gb)
• Chr 3B physical map: 1282 BAC-contigs, 8448 BACs, 922 pools
• BAC pool: pool of 10 BACs (Roche 454 GS FLX Titanium, 8 kb MP)
• Sorted chr. 3B: Illumina (82X), (2*108 bp) PE 0.5 kb
• BAC-ends: Sanger (2*600 bp)
• Assembly: 454 scaffolds, Illumina contigs, super-scaffolds
✓ Annotation (TriAnnot)
✓ Anchoring/orientation (ISBP SNPs)
✓ Resequencing and polymorphism analyses
✓ Transcription map (15 RNASeq)
3B sequence automated annotation (Leroy et al., Frontiers in Plant Science 2012)
Assembly v2
• 5109 scaffolds
• 995 Mb
• N50 = 463 kb (max 1.6 Mb)
RNASeq data from 15 samples
7975 non-redundant genes with expression profiles
An integrated and ordered wheat genome sequence
Markers on the 3B consensus map (Nb = number of markers, Seq = markers with sequence information)
      SSR  RFLP  ISBP  SNP  SFP  DArT  GBS  STS  AFLP  Others  Tot.
Nb    348    99    88  373  114   790  108   30     4      19  1973
Seq   293    40    88  373  114    96  108    0     0       0  1112
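The marker table can be checked and summarised in a few lines. The sketch below re-enters the counts from the table, confirms that the per-type counts add up to the stated totals, and reports the share of markers with sequence information. The variable names are illustrative.

```python
# (number of markers, markers with sequence info) per marker type, from the table
markers = {
    "SSR": (348, 293),
    "RFLP": (99, 40),
    "ISBP": (88, 88),
    "SNP": (373, 373),
    "SFP": (114, 114),
    "DArT": (790, 96),
    "GBS": (108, 108),
    "STS": (30, 0),
    "AFLP": (4, 0),
    "Others": (19, 0),
}

total_nb = sum(nb for nb, _ in markers.values())
total_seq = sum(seq for _, seq in markers.values())
share_with_sequence = 100.0 * total_seq / total_nb

print(f"{total_nb} markers, {total_seq} with sequence info "
      f"({share_with_sequence:.1f}%)")

# Marker types that can only be partly placed on the sequence directly
for name, (nb, seq) in markers.items():
    if seq < nb:
        print(f"{name}: {nb - seq} of {nb} markers lack sequence info")
```

The sums reproduce the totals in the table (1973 and 1112), so just over half of the integrated markers carry sequence information.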
• Integration of all known markers into the ordered sequence
• 3B consensus map (in collaboration with the wheat community)
  - Cs x Re as reference map (335 markers)
  - 10 additional maps (>200 populations)
  - 1973 markers (1112 with sequence info)
  - metaQTL analysis underway
  - 40 genes and QTL mapped on 3B…
3B physical map and sequence utilization
• 13 map-based cloning projects underway using 3B resources
  ✓ Disease resistance genes (Sr, Lr, Yr, Stb…)
  ✓ Solid stem (sawfly)
  ✓ Yield
  ✓ Drought tolerance
  ✓ Boron transporter
  ✓ Flowering time
  ✓ NUE
  ✓ Chromosome pairing…
-> 343 scaffolds accounting for 29 Mb, targeting 74 BAC-contigs: sequences provided to collaborators
Map-based cloning
[Figure: map-based cloning of a gene of interest (YFG). Labels: markers A to E, 1-2 cM interval. Time scales shown: 7-8 years, 1-3 years]
R locus: a multiple disease resistance region
R locus, 20 Mb, chromosome 3B
• Sn2: Stagonospora nodorum
• Sr2: Puccinia graminis
• Stb2: Septoria tritici
• Yr: Puccinia striiformis
• Fhb1: Fusarium graminearum
• Sv2: Puccinia triticina
Leaf rust, incited by the biotrophic fungus Puccinia triticina, is one of the most important diseases of wheat worldwide, causing annual yield losses of about 5-10% in Argentina.
Some South American varieties, such as La Prevision 13, Pergamino Gaboto, Sinvalocho MA, Buck Manantial, Buck Poncho and El Gaucho FA, among others, showed durable resistance.
In Sinvalocho, the seedling resistance gene Lr3 on 6BL and two adult plant resistance genes, LrSV1 on 2DS and LrSV2 on 3BS, were identified.
LrSV2: dominant, race-specific, adult plant resistance (APR), subtelomeric 3BS
María José Diéguez
Map-based cloning of LrSv2
• 2 physical contigs (ctg 11 and 344, of >1 Mb) identified with markers flanking SV2
• 48 new markers developed and tested on parents and populations
• 15 new markers at the SV2 locus
• a high resolution genetic map Sinvalocho x G6 (1308 F2s = 2616 gametes)
• Crossover detection Sinvalocho x G6 (3403 F2s = 6806 gametes)
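The gamete counts above set the finest genetic distance each population can resolve. Each F2 plant carries two informative gametes. A single crossover among N gametes is a recombination fraction of 1/N, which for such small distances is close to 100/N cM. That approximation is added here and is not stated on the slide.

```python
def gametes(f2_plants):
    """Each F2 plant results from two meioses, hence two gametes."""
    return 2 * f2_plants


def resolution_cm(f2_plants):
    """Genetic distance (cM) corresponding to one crossover in the population.

    Uses recombination fraction x 100 as the distance in cM, which is a
    reasonable approximation only for very small intervals.
    """
    return 100.0 / gametes(f2_plants)


# Population sizes quoted on the slide (Sinvalocho x G6)
populations = {
    "high-resolution genetic map": 1308,
    "crossover detection": 3403,
}

for label, n_f2 in populations.items():
    print(f"{label}: {n_f2} F2s = {gametes(n_f2)} gametes, "
          f"one crossover ~ {resolution_cm(n_f2):.3f} cM")
```

For the 1308 F2 plants this gives about 0.038 cM per crossover, the same order as the smallest interval (0.04 cM) quoted for the genetic map. The larger population brings the figure down to about 0.015 cM.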
[Figure: LrSV2 region on 3BS. Panels: 3BS physical map (kb, axis 0 to 1850), 3BS genetic map (cM), crossovers. Physical map markers run from swm13 to gwm533 and include wmm1104, cfp1410, the cfb50xx series, Sr2-CAPS, csSr2, cfp41 and cfp37. Genetic map markers: swm13, cfb5010, SV2, wmm1104-cfp41, cfp37, gwm533, stm559, stm560. Genetic intervals shown: 0.26 cM, 0.88 cM, 0.22 cM, 0.04 cM]
www.wheatgenome.org
IWGSC MTP sequencing
Genotyping
Sequencing
Phenotyping
Polygenic traits
Monogenic traits
Some challenges remain…
Acknowledgments
Catherine Feuillet
Patrick Wincker
Etienne Paux
Lise Pingault
Josquin Daron
Adriana Alberti
Julie Poulain Hadi Quenesville
Michael Alaux
Nicolas Guilhot
Pierre Sourdille
Frédéric Choulet
Philippe Leroy Arnaud Couloux
Delphine Boyer
Sébastien Theil
Institute of Experimental Botany Jaroslav Dolezel Hana Simkova Jan Bartos Jan Safar
J. Rogers M. Caccamo J. Wright
Natasha Glover
Valérie Barbe
K. Mayer M. Martis
K. Eversole (Eversole Associates)
María José Diéguez Nanda Pergolesi