

Page 1

An in-silico functional genomics resource:

Targeted re-sequencing of wheat TILLING mutant populations

Cristobal Uauy ([email protected])

JIC / NIAB

Page 2

UC Davis

Ksenia Krasileva

Vince Buffalo

Jorge Dubcovsky

Rothamsted

Andy Phillips


TGAC

Sarah Ayling

Matt Clark

Paul Bailey

Page 3

TILLING: Targeting Induced Local Lesions IN Genomes

Combines:

Population of (EMS) mutagenised plants

High throughput screen to identify mutations in a gene of interest

• reverse-genetics approach,

• requires knowledge of gene sequence,

• non-transgenic,

• best suited to knockout genes.

Reverse Genetics

Page 4

Polyploid TILLING

Screening of ~2,000 mutant plants should lead to at least one knock-out in >90% of wheat genes (stop or splice junction mutation)

However, traditional approaches require genome specific amplification

Wang et al 2012
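
The ~2,000-plant estimate can be read as a simple probability statement. As a back-of-the-envelope sketch, assuming every plant independently carries a knock-out in a given gene with the same probability p (an assumption made here for illustration, not a figure taken from Wang et al 2012), the per-plant probability implied by ">90% of genes with at least one knock-out" can be computed as follows. The function names are hypothetical.

```python
def prob_at_least_one(p: float, n_plants: int) -> float:
    """P(at least one knock-out among n_plants), assuming independent plants."""
    return 1.0 - (1.0 - p) ** n_plants


def min_p_for_target(target: float, n_plants: int) -> float:
    """Smallest per-plant probability p for which P(at least one) >= target."""
    return 1.0 - (1.0 - target) ** (1.0 / n_plants)


if __name__ == "__main__":
    p = min_p_for_target(0.90, 2000)
    print(f"implied per-plant knock-out probability per gene: {p:.5f}")
    print(f"check with 2,000 plants: {prob_at_least_one(p, 2000):.3f}")
```

Under these assumptions, roughly one plant in a thousand needs to carry a knock-out in any given gene for the 90% target to hold.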

Page 5

Can we perform TILLING on thousands of genes?

• High quality TILLING populations: 4x Kronos (Uauy et al 2009); 6x Cadenza (Rakszegi et al 2010)

• PCR or enrichment methods: pooling of multiple PCR products (Tsai et al 2011); exome capture

• “Low cost” sequencing: NGS approaches

Approach:

1. Establish feasibility of genome capture in 6x wheat

2. Define wheat exome (not just transcriptome)

3. Perform Nimblegen capture (multiplex mutants pre-capture)

4. Illumina sequence

5. Identify and assign mutation to gene and the specific homoeologue

6. Make data and germplasm accessible through public database and seed store

Page 6

Design

• 1,846 sequences (RIKEN FL-cDNA and some genes of interest)

• MySelect capture array (solution based hybridization)

• Designed 120-mer probes (60-bp overlap design)

• Exon-intron boundaries were not considered

• Probes based on one homoeologue of each target gene
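
The 120-mer, 60-bp-overlap design above amounts to tiling each target sequence with a 60-bp step. A minimal sketch, assuming a final probe is anchored at the 3' end when the sequence length is not a multiple of the step (the slides do not say how sequence ends were handled) and using a hypothetical function name:

```python
def tile_probes(seq: str, probe_len: int = 120, overlap: int = 60) -> list[tuple[int, str]]:
    """Return (start, probe) pairs tiling seq with fixed-length overlapping probes."""
    step = probe_len - overlap
    if len(seq) < probe_len:
        return []  # too short to carry a full-length probe
    starts = list(range(0, len(seq) - probe_len + 1, step))
    last = len(seq) - probe_len
    if starts[-1] != last:
        starts.append(last)  # assumed: add one probe flush with the 3' end
    return [(s, seq[s:s + probe_len]) for s in starts]


if __name__ == "__main__":
    target = "ACGT" * 75  # 300-bp toy sequence, not real wheat data
    for start, probe in tile_probes(target):
        print(start, len(probe))
```

With a 60-bp step, every internal base is covered by two probes, which is what gives the design its 2x tiling depth.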

Feasibility of exome capture in wheat

Capture

• Three 6x Cadenza EMS mutants were used

• DNA was barcoded and sequenced (Illumina PE, 100 cycles)

Quality control

• BLASTN against 5x 454 raw sequences of Chinese Spring (CS)

• 5.8% of probes were excluded (>60 hits; E-value 1e-50)
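
The quality-control filter can be sketched as a hit count per probe. This is an illustration only: it assumes the BLASTN results are in the standard 12-column tabular layout (query ID in column 1, E-value in column 11), which the slides do not state, and the function name is hypothetical.

```python
from collections import Counter


def repetitive_probes(blast_lines, max_hits=60, max_evalue=1e-50):
    """Return probe IDs with more than max_hits hits at or below max_evalue."""
    hits = Counter()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        probe_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits[probe_id] += 1
    return {probe for probe, n in hits.items() if n > max_hits}


if __name__ == "__main__":
    # Toy input: one probe with 61 strong hits, one with 60.
    row = "{q}\tread{i}\t99.0\t120\t1\t0\t1\t120\t1\t120\t1e-60\t200"
    lines = [row.format(q="probeA", i=i) for i in range(61)]
    lines += [row.format(q="probeB", i=i) for i in range(60)]
    print(sorted(repetitive_probes(lines)))
```

Probes flagged this way are likely to sit in repetitive sequence, where they would pull down off-target reads during capture.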

Page 7

• Expected coverage 0.19x

• Median coverage 250x

Capture achieved a 1,300-fold enrichment of targets
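
The enrichment figure follows directly from the two coverage values on this slide. A short worked check, with a hypothetical function name:

```python
def fold_enrichment(observed_coverage: float, expected_coverage: float) -> float:
    """Ratio of coverage observed after capture to coverage expected without it."""
    return observed_coverage / expected_coverage


if __name__ == "__main__":
    fold = fold_enrichment(250, 0.19)
    print(f"{fold:.0f}-fold")  # about 1,316, quoted on the slide as 1,300-fold
```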

Page 8

Good coverage of most exons using cDNA-based probes

[Figure annotations: "Small exon, no probes"; "Short introns covered"]

• Genome-specific genomic contig as reference improves exon junction coverage and mapping quality for SNP detection

Page 9

Known mutations identified in captured sequence

• G>A mutation in GA20ox1D gene previously identified by HRM

[Figure panels: Mutant 1; Mutant 2]

Page 10

Probes based on one genome capture all three homoeologues but with varying efficiency

• On average, the target genome accounts for 45% of captured reads vs about 25% for each of the two non-target genomes (using the target genome as reference)

Target          Target   Coverage, all SNPs   Average frequency for given SNP (%)
                genome   Average   Median     A       B       D
RFL_Contig1056  D        217       189        16.36   22.88   46.60
RFL_Contig1063  B        32        30         25.68   47.34   26.88
RFL_Contig1673  B        392       408        24.15   48.17   19.93
RFL_Contig1686  D        324       322        25.97   28.34   42.77
RFL_Contig1705  B        444       359        21.30   42.78   26.90
RFL_Contig1851  A        313       272        45.31   25.83   22.35

Weighted average (%): target genome 44.5; non-target genome 24.1

Frequencies do not add to 100% as some reads could not be unambiguously assigned.
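
To make the target vs non-target comparison concrete, the sketch below summarises the six contigs listed on this slide. It uses plain unweighted means over those six rows only, so it is not expected to reproduce the slide's weighted averages (44.5 and 24.1), whose weights and full contig set are not given; the function name is hypothetical.

```python
# (contig, target genome, % of reads assigned to A, B, D) as listed on the slide
ROWS = [
    ("RFL_Contig1056", "D", 16.36, 22.88, 46.60),
    ("RFL_Contig1063", "B", 25.68, 47.34, 26.88),
    ("RFL_Contig1673", "B", 24.15, 48.17, 19.93),
    ("RFL_Contig1686", "D", 25.97, 28.34, 42.77),
    ("RFL_Contig1705", "B", 21.30, 42.78, 26.90),
    ("RFL_Contig1851", "A", 45.31, 25.83, 22.35),
]


def target_vs_nontarget(rows):
    """Unweighted mean read share for the target genome and per non-target genome."""
    target, nontarget = [], []
    for _, genome, a, b, d in rows:
        shares = {"A": a, "B": b, "D": d}
        target.append(shares.pop(genome))
        nontarget.append(sum(shares.values()) / len(shares))
    return sum(target) / len(target), sum(nontarget) / len(nontarget)


if __name__ == "__main__":
    t, n = target_vs_nontarget(ROWS)
    print(f"target genome: {t:.1f}%  each non-target genome: {n:.1f}%")
```

Even this crude summary shows the same pattern as the slide: the genome the probe was designed on captures roughly twice the share of reads of either homoeologue.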

Preliminary conclusions

• Trade off between “generic” probe and efficiency (sequencing costs vs design)

• Exon junctions will be important for future designs (padding exon junctions)

• Using genome-specific genomic contigs to map reads translates into better SNP calling

Next steps?

• Develop genome-specific gene models to design wheat exome

• Combine IWGSC chromosome arm assemblies with transcriptome data to achieve this.

Page 11

Chromosome arm assemblies


Page 13

Integrated transcriptome data-set

• >140k non-redundant transcript sequences from different origins:

• 4x Kronos transcriptome

• Complementary dataset

• RIKEN FL-cDNA

• Re-assembled NCBI Unigenes (4,530)

• Published transcriptomes (Brenchley et al 2012, Cantu et al 2011)

Ksenia Krasileva

Vince Buffalo

• Identified 84,068 ORFs (69.3 Mb)

• 30% had no similarity to any plant protein (BLASTX E-value 1e-3; Pfam E-value 1e-3)

• 13% disrupted by premature termination codon (pseudogenes)
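
The slides do not say how premature termination codons were detected. As one possible illustration, assuming the reading frame starts at the first base and that only a stop in the final codon counts as the normal terminator, a check could look like this (function name hypothetical):

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(orf: str) -> bool:
    """True if an in-frame stop codon occurs before the last complete codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    print(has_premature_stop("ATGTAAGGGTGA"))  # stop in the second codon
    print(has_premature_stop("ATGGGGTGA"))     # stop only at the end
```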

Page 14

Combining transcriptome and chromosome arms

[Figure: a transcript sequence aligned 5´ to 3´ against gDNA contigs (chromosome arms), marking exons, introns and exon splice sites. Predicted exons come from the EXONERATE est2genome model (splice aware).]


Page 16

Combining transcriptome and chromosome arms

Classification                                   Identity (%)   Coverage (%)   Durum and 6x
target homoeologue + all exons                   ≥99            ≥95            48%
inferred homoeologue + all exons                 94-99          ≥95            29%
partial coverage (65-95% cov.)                   ≥94            65-95          11%
no hits to assembly sequence with our criteria   <94            <65            13%

• For ~75% of transcripts, all exons can be identified within the chromosome arm assemblies, in either the target genome or a homoeologous genome

• 13% of transcripts had no hits, or only hits below our criteria, to any of the chromosome arm assemblies
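
The classification criteria can be written as a small decision function. This is a sketch of the thresholds as they appear on the slide; how boundary values and mixed cases (for example high identity with coverage below 65%) were actually handled is an assumption, and the function name is hypothetical.

```python
def classify_hit(identity: float, coverage: float) -> str:
    """Classify a transcript-to-contig alignment by % identity and % coverage."""
    if coverage >= 95 and identity >= 99:
        return "target homoeologue + all exons"
    if coverage >= 95 and identity >= 94:
        return "inferred homoeologue + all exons"
    if coverage >= 65 and identity >= 94:
        return "partial coverage"
    # assumed: anything below either threshold counts as no usable hit
    return "no hit under criteria"


if __name__ == "__main__":
    for ident, cov in [(99.6, 100), (96.2, 98), (97.0, 80), (90.0, 99)]:
        print(ident, cov, "->", classify_hit(ident, cov))
```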

Page 17

Final output

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon1 5358 .. 5151 (208 bp)

ATGATGTACCATGCTAAGAAGTTTTCTGTGCCCTTTGCACCACAGAGGGCTCAGAATAGTGAGCATGTAAGTAACATTGGAGCTTTCGGTGGATCCAACATAAGCAACCCTGCTAATCCTGTAGGGAGTGGCAAACAACGTCTAAGAT

GGACCTCTGATCTCCATAGTCGTTTTGTGGATGCAATCGCCCAACTTGGTGGACCAGATA

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon2 4472 .. 4396 (77 bp)

GAGCAACACCTAAAGGAGTACTGACTGTAATGGGTGTACCGGGGATTACAATTTATCACGTGAAGAGCCATTTGCAG

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon3 4313 .. 4271 (43 bp)

AAGTATCGCCTTGCAAAGTACATACCAGAATCTCCTGCTGAAG

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon4 4168 .. 4111 (58 bp)

GTTCCAAGGACGAAAAGAAGGATTCTAGTGATTCATTCTCTAATGCAGATTCTGCACC

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon5 3979 .. 3910 (70 bp)

GGGTTCACAAATCAATGAAGCATTGAAGATGCAAATGGAAGTTCAGAAGCGGCTCCATGAACAACTCGAG

>AY625680_1 1AL 1AL_3960881 (12201 bp) exon6 3549 .. 3205 (345 bp)

GTTCAAAAGCAGTTGCAGCTGAGAATCGAAGCACAAGGGAAGTACTTGCAGATGATCATAGAGGAGCAGCAAAAGCTTGGTGGCTCACTTGAAGGTTCTGAGGAGAGGAAGCTTTCACATTCACCACCTACCTTAGATGACTACCCTG

ACAGCATACAGCCTTCTCCGAAGAAACCACGGTTGGATGATCTGTCAACAGATGCGGTCCGGGGTGTTACACAGCCAGGGTTTGAATCCCATCTTATTGGCCCATGGGATCAAGAACTCTGTCCGAAGACCAACATATGCGATCCTGC

ATTCCAAGTGGATGAGTTTAAGGCAAACCCTGGTTTGAGCAAGTCATAA

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon1 1193 .. 1400 (208 bp)

ATGATGTACCATGCTAAGAAGTTTTCTGTGCCCTTTGCACCACAGAGGGCTCAGAATAGTGAGCATGTGAGTAATATTGGAGCTTTCGGTGGATCCAACATAAGCAACCCTGCCAATCCTGTAGGGAGTGGGAAACAACGTCTAAGAT

GGACCTCTGATCTTCATAGTCGTTTTGTGGATGCAATCGCCCAACTTGGTGGACCAGATA

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon2 2024 .. 2100 (77 bp)

GAGCAACACCTAAAGGAGTACTGACTGTAATGGGTGTACCGGGGATTACAATTTATCACGTGAAGAGCCATTTGCAG

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon3 2183 .. 2225 (43 bp)

AAGTATCGCCTTGCAAAGTACATACCAGAATCGCCTGCTGAAG

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon4 2329 .. 2386 (58 bp)

GTTCCAAGGACGAAAAGAAGGATTCTAGTGATTCATTCTCTAATGCAGATTCTGCACC

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon5 2516 .. 2585 (70 bp)

GGGTTCACAAATCAATGAAGCATTGAAGATGCAAATGGAAGTTCAGAAGCGGCTCCATGAACAACTCGAG

>AY625680_1 1DL 1DL_2252313 (9898 bp) exon6 2937 .. 3281 (345 bp)

GTTCAAAAGCAGTTGCAGCTGAGAATCGAAGCACAAGGGAAGTACTTGCAGATGATCATAGAGGAGCAGCAAAAGCTTGGTGGCTCACTTGAAGGTTCTGAGGAGAGGAAGCTTTCACATTCACCACCTACCTTAGATGACTACCCTG

ACAGCATACAGCCTTCTCCGAAGAAACCACGGTTGGATGATCTGTCAACAGACGCGGTCCGGGGTGTTGCACAGCCAGGGTTTGAATCCCATCTTGTCGGCCCATGGGATCAAGAACTCTGTCCAAAGACCAACATATGCGATCCCGC

ATTCCAAGTGGATGAGTTTAAGGCAAACCCTGGTTTGAGCAAGTCATAA

• Predicted genome-specific exons, associated chromosome arm, and contig for future mapping of reads. Design submitted to Nimblegen last week.
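
For anyone wanting to work with exon records in the format shown above, the header fields can be pulled apart with a regular expression. This sketch assumes the header layout is exactly as in the example (transcript, chromosome arm, contig, contig length, exon number, coordinates, exon length) and interprets start > end as the reverse strand, which is an inference from the example rather than something the slide states.

```python
import re

HEADER_RE = re.compile(
    r"^>(?P<transcript>\S+)\s+(?P<arm>\S+)\s+(?P<contig>\S+)\s+"
    r"\((?P<contig_len>\d+) bp\)\s+exon(?P<exon>\d+)\s+"
    r"(?P<start>\d+)\s+\.\.\s+(?P<end>\d+)\s+\((?P<exon_len>\d+) bp\)$"
)


def parse_exon_header(line: str) -> dict:
    """Parse one exon header line into a dict of typed fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {line!r}")
    rec = match.groupdict()
    for key in ("contig_len", "exon", "start", "end", "exon_len"):
        rec[key] = int(rec[key])
    # assumed interpretation: descending coordinates mean the minus strand
    rec["strand"] = "-" if rec["start"] > rec["end"] else "+"
    return rec


if __name__ == "__main__":
    hdr = ">AY625680_1 1AL 1AL_3960881 (12201 bp) exon1 5358 .. 5151 (208 bp)"
    print(parse_exon_header(hdr))
```

The exon length in the header can be cross-checked against the coordinates: for the example above, 5358 - 5151 + 1 = 208 bp.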

Page 18

Final thoughts

• Sequencing of ~2,000 mutants should lead to at least one knock-out in >90% of wheat genes

• Genome capture works well in polyploids, and there is a trade-off between “generic” probes and capture efficiency

• Determining exon junctions was important for probe design, and the use of genome-specific genomic contigs to map reads will be critical for mapping and proper SNP calling

• Full-length genome contigs in at least one homoeologue for ~75% of transcripts

Access to mutants

• We plan to hold mirror collections of seeds at UC Davis and the JIC Seed Store

• Mutants will be free from any IP for the mutations people find

• We plan to charge a small fee for 10-15 seeds of each mutant to maintain collections

Page 19

Wheat (and human) genetic diversity

UC Davis

Ksenia Krasileva

Vince Buffalo

Jorge Dubcovsky

Rothamsted

Andy Phillips

TGAC

Sarah Ayling

Matt Clark

Paul Bailey

Sophie Janacek

EBI

Paul Kersey

Dan Bolser

JIC

Mike Bevan

James Simmonds

Lorelei Bilham

IWGSC

Page 20

Exploiting genetic diversity for cereal breeding

4 July (all day), 5 July (am)

• Keynote speakers

• Jorge Dubcovsky (UC Davis and HHMI)

• Pat Schnable (Iowa State University)

• Robbie Waugh (JHI)

• Invited speakers

• Beat Keller (Univ of Zurich)

• Bin Han (SIPPE, CAS)

• Thorsten Schnurbusch (IPK)

• Viktor Korzun (KWS)

• Kentaro Yoshida (TSL)

• Wen Wang (Kunming Institute of Zoology, CAS)

• Woolhouse Lecture: Susan McCouch (Cornell)

http://www.sebiology.org/