FHI Biotechnology Approaches

Post on 07-Jan-2016

Page 1: FHI Biotechnology Approaches

FHI Biotechnology Approaches

Genome sequencing

Clonal testing

Transgenics

GE trees

New varieties

Marker-aided breeding

Page 2: FHI Biotechnology Approaches

Chestnut Genome Research Team

John E. Carlson, PI, Schatz Center, Penn State University

DNA Sequencing

Stephan C. Schuster, Professor of Biochemistry and Molecular Biology, Penn State

Lynn P. Tomsho, Daniela Drautz, and Lindsay Kasson, Sequencing Specialists, Penn State

Tyler Wagner, Research Assistant, Penn State

Bioinformatics and Comparative Genomics

Webb Miller, Professor of Biology and Computer Science & Engineering, Penn State

Charles Addo-Quaye, Postdoctoral Fellow, Penn State

Meg Staton, Stephen Ficklin, and Christopher Saski, Bioinformatics team at Clemson University Genomics Institute

Abdelali Barakat, Research Associate, Clemson University

FHI Cooperators: Bert Abbott, Sandra Anagnostakis, Kathleen Baier, Ali Barakat, Nurul Faridi, Eric Feng, Stephen Ficklin, Fred Hebard, Thomas Kubisiak, Charles Maynard, Scott Merkle, Joseph Nairn, William Powell, Dana Nelson

Page 3: FHI Biotechnology Approaches

Our Goals:

1) Develop a complete reference genome sequence for chestnut

2) Identify all genes in the three blight resistance QTL

3) Deliver candidate genes to the FHI Transgenics group and the FHI Marker-aided breeding group

4) Provide the genome to the research community

5) Demonstrate the potential of genomics to address forest health and ecosystem restoration.

The Chinese Chestnut Genome Sequencing Project

Page 4: FHI Biotechnology Approaches

1. The reference Castanea mollissima cv. Vanuxem genome was sequenced to over 25-fold depth.

2. Preliminary de novo assemblies of the reference genome sequence were conducted.

3. Commenced use of genetic and physical map information (from the FHI genetic technologies group) in genome assembly.

DELIVERABLES FOR YEAR ONE were all achieved

Page 5: FHI Biotechnology Approaches

DELIVERABLES FOR YEAR ONE, the details

• “Shot-gun” sequencing completed by March 2010:

- 18-fold* depth by 454 technology = 14.2 Gigabases

- 47-fold* depth by Illumina technology = 37.6 Gigabases

• Passed QC tests:

- mtDNA < 0.4% and cpDNA < 0.3% of sequence

- microbial DNA negligible

- sequence reads over 350 bp

- repetitive DNA manageable (conserved repeats at 9 to 12%)

• Preliminary assemblies of the genome sequence were promising, totalling approx. 852 Mbp, but in smaller pieces than desired

* assumes a genome size for chestnut of approx. 800 Mbp
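The fold-depth figures on this slide follow from dividing the total bases sequenced by the assumed genome size. A minimal sketch of that arithmetic using the slide's numbers; the function name `fold_depth` is illustrative, and the 900 Mbp value is a purely hypothetical alternative, included only to show how depth drops if the genome is larger than assumed:

```python
def fold_depth(total_bases: float, genome_size_bp: float) -> float:
    """Average sequencing depth: total bases sequenced / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


ASSUMED_GENOME_BP = 800e6  # genome size assumed on the slide (approx. 800 Mbp)

depth_454 = fold_depth(14.2e9, ASSUMED_GENOME_BP)       # reported as 18-fold
depth_illumina = fold_depth(37.6e9, ASSUMED_GENOME_BP)  # reported as 47-fold

# Hypothetical larger genome size, for illustration only (not from the slides)
depth_454_if_900 = fold_depth(14.2e9, 900e6)

print(f"454: {depth_454:.2f}x, Illumina: {depth_illumina:.2f}x")
print(f"454 at a hypothetical 900 Mbp genome: {depth_454_if_900:.2f}x")
```

Because depth scales inversely with genome size, any upward revision of the 800 Mbp estimate lowers every fold-depth figure proportionally.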

Page 6: FHI Biotechnology Approaches

WHAT WE LEARNED IN YEAR ONE

1. “Next Gen” sequencing technologies produce a large amount of high quality data, very quickly.

2. Large amounts of high quality data take a long time to assemble using currently available software.

3. Assembly of the reference genome will require more than just “shot gun” Next Gen sequence data.

4. “Paired end” data are required to pull contigs together into chromosome scaffolds.

5. For assembly purposes, the chestnut genome may be larger than 800 Mbp.

Page 7: FHI Biotechnology Approaches

DELIVERABLES ACHIEVED IN YEAR TWO

1. Produced paired-end sequence data.

2. Covered the physical map with BAC-end sequences.

3. Commenced gene identification and characterization:

- Transcripts aligned to the genome assembly

- Assembly searched for genes

- Preliminary annotations of genes conducted

4. Strategy for resistance gene discovery updated.

Page 8: FHI Biotechnology Approaches

DELIVERABLES FOR YEAR TWO, the details

1. Paired-end sequences from 454 sequencing at 4.5-fold depth (3.6 Gb).

2. 43,143 BAC-end sequences obtained, “tiling” the physical genome map to 1.5-fold depth, anchored to genetic map.

3. New assemblies conducted using the paired-end data:

- 587,208,063 bp assembled into 51,766 scaffolds

- 925,312,071 bp assembled into 1,147,939 contigs
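The assembly totals on this slide can be turned into average piece sizes, which gives a rough sense of how fragmented the assembly still is. A small sketch using only the reported totals; the helper name `mean_length` is illustrative, the 800 Mbp genome size is the assumption stated in the Year One details, and a mean is a much cruder summary than a statistic such as N50, which cannot be derived from these totals alone:

```python
def mean_length(total_bp: int, n_sequences: int) -> float:
    """Mean sequence length in bp for an assembly summary."""
    return total_bp / n_sequences


scaffold_mean = mean_length(587_208_063, 51_766)
contig_mean = mean_length(925_312_071, 1_147_939)

# Cross-check of the paired-end depth: 3.6 Gb over the assumed 800 Mbp genome
paired_end_depth = 3.6e9 / 800e6

print(f"mean scaffold length: {scaffold_mean:,.0f} bp")
print(f"mean contig length:   {contig_mean:,.0f} bp")
print(f"paired-end depth:     {paired_end_depth:.1f}-fold")
```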

Page 9: FHI Biotechnology Approaches

DELIVERABLES FOR YEAR TWO: Gene Identification and Characterization

4. Chinese chestnut unigenes (transcripts) from NSF project aligned well to the current genome assembly:

- 97% of transcripts (46,954) aligned to genome assembly

- 98% identity of transcripts and genome sequences

5. Results of gene search with preliminary assembly: 66,662 gene models predicted in the scaffolds

- certainly an over-estimate of gene number at this point

- mean gene length 2,761 bp, maximum length 43,203 bp

- mean number of genes per scaffold 12.8, maximum 58

6. Candidate gene sequences identified in genome contigs

- Coding sequences delivered to the transgenics team
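Summary figures such as mean gene length and genes per scaffold can be computed directly from a list of predicted gene models. The sketch below shows one way to do that; the records in `gene_models` are hypothetical toy values, not data from the chestnut assembly, and the `summarize` helper is an illustrative name rather than part of any tool used by the project:

```python
from collections import Counter
from statistics import mean

# Hypothetical gene models: (scaffold ID, start, end), 1-based inclusive coordinates
gene_models = [
    ("scaffold_A", 101, 2600),
    ("scaffold_A", 5000, 5999),
    ("scaffold_B", 1, 4000),
]


def summarize(models):
    """Return gene-length and genes-per-scaffold statistics for gene models."""
    lengths = [end - start + 1 for _, start, end in models]
    # Only scaffolds carrying at least one gene model are counted here
    per_scaffold = Counter(scaffold for scaffold, _, _ in models)
    counts = list(per_scaffold.values())
    return {
        "n_genes": len(models),
        "mean_length": mean(lengths),
        "max_length": max(lengths),
        "mean_per_scaffold": mean(counts),
        "max_per_scaffold": max(counts),
    }


summary = summarize(gene_models)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that a per-scaffold mean computed this way ignores scaffolds with no predicted genes, so it depends on which scaffolds are included in the denominator.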

Page 10: FHI Biotechnology Approaches

The largest gene identified in the preliminary Chinese chestnut genome assembly

• Transcript length: 43,203 bases

• Number of exons: 71

• Scaffold ID: scaffold01252

Homolog of AT1G67120 (NP_176883.4), AAA ATPase, von Willebrand factor type A domain-containing protein, with nucleoside-triphosphatase activity.

Page 11: FHI Biotechnology Approaches

Most Arabidopsis single-copy genes have strong matches to the current genome assembly (by BLAST alignment)

[Figure: number of genes (y-axis) by E-value, i.e. strength of match (x-axis); N = 959]
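A distribution of match strengths like the one plotted on this slide can be produced by grouping the best-hit E-values into bins. A minimal sketch, assuming the E-values have already been extracted from the BLAST output; the thresholds and the values in `evalues` are hypothetical and are not the 959 results summarized in the figure:

```python
from collections import Counter


def evalue_bin(evalue: float, thresholds=(1e-100, 1e-50, 1e-10)) -> str:
    """Label an E-value by the strictest threshold it satisfies.

    Thresholds must be ordered from strictest (smallest) to loosest.
    Smaller E-values mean stronger matches; BLAST may report 0.0 for the strongest.
    """
    for threshold in thresholds:
        if evalue <= threshold:
            return f"<= {threshold:g}"
    return f"> {thresholds[-1]:g}"


# Hypothetical best-hit E-values, one per query gene
evalues = [0.0, 3e-120, 2e-60, 5e-12, 1e-3]

histogram = Counter(evalue_bin(e) for e in evalues)
for label, count in histogram.items():
    print(f"{label}: {count}")
```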

Page 12: FHI Biotechnology Approaches

Best matches of proteins from the chestnut genome assembly are to peach and other related species

BLASTx alignments to model plant genomes in Phytozome

Best matches:

• peach, 23%

• rice, 12%

• grapevine, 7%

• Eurosids 1 species, 56%

Only 1% of best matches to Arabidopsis.

The peach genome is best for chestnut gene discovery.

Page 13: FHI Biotechnology Approaches

The predicted chestnut proteins are most similar to those of species in the Eurosids 1 clade, which includes both peach and chestnut.

[Figure with clade labels: eurosids 1, eurosids 2. Source: http://www.phytozome.net/]

Page 14: FHI Biotechnology Approaches

However, the genome assembly is uneven and not as good as needed to assemble all of the blight resistance QTL genes

[Figure: range of coverage among genome scaffolds]

Page 15: FHI Biotechnology Approaches

Our targets are the blight resistance genes. We will sequence the resistance QTL themselves; this work is already in progress:

| QTL by Linkage Group | Physical Map Contig # | Estimated Contig Size | # Clones in Minimum Tiling Path | Estimated Clone Lengths | DNA Pool |
|---|---|---|---|---|---|
| G | 7039 | 4.51 Mb | 40 | 6.22 Mb | A |
| F | 403 | 5.13 Mb | 51 | 7.64 Mb | B |
| B | 9166 | 2.50 Mb | 24 | 3.47 Mb | C |
| B | 4269 | 2.31 Mb | 24 | 3.45 Mb | C |
| B | 3279 | 1.68 Mb | 19 | 2.37 Mb | D |
| B | 11956 | 3.65 Mb | 30 | 5.06 Mb | D |
| TOTALS | | 19.79 Mb | 188 | 28.2 Mb | |

• Sets of BAC contigs covering the QTLs were identified.

• Sequencing of each QTL underway as contig pools.

• Genes will be identified using peach resistance QTL and Chinese chestnut (CC) transcripts.
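The totals row of the QTL table can be recomputed from the individual contigs, along with the amount of clone sequence assigned to each DNA pool. A short sketch using the values as printed on the slide; the variable names are illustrative. The printed contig sizes sum to 19.78 Mb rather than the slide's 19.79 Mb, presumably because the per-contig values were rounded before display:

```python
from collections import defaultdict

# (linkage group, physical map contig #, contig size in Mb,
#  clones in minimum tiling path, clone lengths in Mb, DNA pool)
qtl_contigs = [
    ("G", 7039, 4.51, 40, 6.22, "A"),
    ("F", 403, 5.13, 51, 7.64, "B"),
    ("B", 9166, 2.50, 24, 3.47, "C"),
    ("B", 4269, 2.31, 24, 3.45, "C"),
    ("B", 3279, 1.68, 19, 2.37, "D"),
    ("B", 11956, 3.65, 30, 5.06, "D"),
]

total_contig_mb = sum(row[2] for row in qtl_contigs)
total_clones = sum(row[3] for row in qtl_contigs)
total_clone_mb = sum(row[4] for row in qtl_contigs)

# Clone sequence per Mb of contig: values above 1 reflect overlap between tiled clones
overlap_ratio = total_clone_mb / total_contig_mb

# Clone sequence to be sequenced in each DNA pool
pool_mb = defaultdict(float)
for _, _, _, _, clone_mb, pool in qtl_contigs:
    pool_mb[pool] += clone_mb

print(f"contigs: {total_contig_mb:.2f} Mb, clones: {total_clones}, "
      f"clone sequence: {total_clone_mb:.2f} Mb, ratio: {overlap_ratio:.2f}")
for pool in sorted(pool_mb):
    print(f"pool {pool}: {pool_mb[pool]:.2f} Mb")
```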

Page 16: FHI Biotechnology Approaches

Year 3 - Gene discovery

Genome sequencing

Clonal testing

Transgenics

GE trees

New varieties

Marker-aided breeding

Complete QTL sequences

Markers in QTL genes

Candidate genes from the QTLs

Candidate gene validation