2016 BGI research

Potato Genome Analysis
Xin Liu, Deputy director, BGI research
2016.1.21, WCRTC 2016 @ Nanning


Reference genome construction

[Figure: shotgun sequencing and assembly, illustrated with text. Sequencing: the unknown sequence (????...) is read as short overlapping fragments such as HELLOF, WELCOM, GISHEN and NZHEN. Assemble: overlaps between fragments are merged to recover HELLO FRIENDS WELCOME TO BGISHENZHEN.]
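The fragment-and-reassemble idea on this slide can be sketched as a toy greedy overlap merge. This is only an illustration under stated assumptions: the reads are hypothetical 12-letter windows cut from the slide's phrase, and `overlap` and `greedy_assemble` are made-up helper names, not the assembler used for the potato genome.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the assembly stays fragmented
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical toy reads: overlapping windows of the phrase, given out of order.
phrase = "HELLOFRIENDSWELCOMETOBGISHENZHEN"
reads = [phrase[i:i + 12] for i in (0, 6, 12, 18, 20)][::-1]
contigs = greedy_assemble(reads)
print(contigs)
```

Real genomes contain long repeats and heterozygous regions, which is where a simple greedy merge like this one breaks down.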

Second generation sequencing for assembly

1. Construct libraries with hierarchical insert sizes: 250 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb, 20 kb, 40 kb

2. Sequence the libraries to 60X genome coverage

3. De novo assembly

4. Annotation and evolutionary analysis

Genome survey

1. 30X data

2. K-mer analysis

3. Preliminary assembly

4. Heterozygosity simulation analysis

5. GC depth distribution analysis

Estimated from the survey:

1. Genome size

2. Heterozygosity rate

3. GC content

4. Repeat sequence proportion
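As a rough illustration of the K-mer analysis and GC content steps, the sketch below counts k-mers in a set of reads and summarises how many distinct k-mers occur at each depth. The function names and the toy read are assumptions made for this example; a real survey would run a dedicated k-mer counter over the full 30X dataset.

```python
from collections import Counter


def kmer_depth_histogram(reads, k):
    """Map depth -> number of distinct k-mers observed at that depth."""
    kmer_counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    return Counter(kmer_counts.values())


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical toy read, only to show the shape of the output.
toy_reads = ["ACGTACGT"]
histogram = kmer_depth_histogram(toy_reads, k=4)
print(dict(histogram), gc_content(toy_reads[0]))
```

In a survey, the depth at which this histogram peaks feeds the genome size estimate, and a secondary peak at about half that depth is a common sign of heterozygosity.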

The potato genome

A reference genome would provide an important resource for crop improvement

Information of potato genome

• Autotetraploid (2n=4x=48)
• Highly heterozygous
• Heterozygous diploid available
• Doubled monoploid available
• Multiple datasets available
• Genome size: 850 Mb

Sample selection

DM1-3 516 R44 (DM) resulted from chromosome doubling of a monoploid (1n=1x=12) derived by anther culture of a heterozygous diploid (2n=2x=24) S. tuberosum group Phureja clone (PI 225669).

Heterozygosity affecting genome assembly

Heterozygosity breaks up the assembly: where the two haplotypes diverge, contigs cannot be extended unambiguously.

Rei Kajitani, Kouta Toshimoto, Hideki Noguchi, et al.

Assess the genome

33,761,617,031 bases

Peak at 40

Genome size estimated to be: 844 Mb

S. tuberosum group Phureja DM1-3 516 R44
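The estimate on this slide follows from dividing the sequenced total by the peak depth. A minimal check, assuming the slide's 33,761,617,031 figure is the numerator and 40 is the peak depth of the depth distribution (the variable names are illustrative):

```python
total_bases = 33_761_617_031  # total reported on the slide
peak_depth = 40               # peak of the depth distribution on the slide

genome_size_bp = total_bases / peak_depth
print(f"Estimated genome size: {genome_size_bp / 1e6:.0f} Mb")  # prints 844 Mb
```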

The heterozygous diploid

S. tuberosum group Tuberosum RH89-039-16

Assessing the heterozygosity

Assemble the DM genome: data

The potato genome assembly

a: Chromosome karyotype

b: Gene density

c: Repeats coverage

d: Transcription state

e: GC content

f: Subtelomeric repeats distribution

Assembly: 727 Mb (86% of the genome), 6.1% Ns/gaps

N90: 349 kb, in 443 super scaffolds
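For readers unfamiliar with N90: it is the length of the scaffold at which, counting from the longest scaffold down, 90% of the assembly has been covered. The sketch below uses hypothetical scaffold lengths, since the real scaffold list is not in the slides, and the helper name `nx_statistic` is an assumption. It also checks the 86% figure against the 844 Mb estimate quoted earlier in the talk.

```python
def nx_statistic(lengths, x):
    """Return (Nx length, number of scaffolds needed to reach x% of the total)."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths, in kb.
toy_scaffolds = [100, 80, 60, 40, 20]
print(nx_statistic(toy_scaffolds, 50))  # (80, 2)
print(nx_statistic(toy_scaffolds, 90))  # (40, 4)

# Assembled fraction of the estimated genome, from the slide figures.
assembled_fraction = 727 / 844
print(f"{assembled_fraction:.0%}")
```

Read this way, the slide says that the 443 largest super scaffolds, each at least 349 kb, hold 90% of the assembled sequence.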

Comparing to Sanger sequenced BACs

97.1% of the 181,558 available Sanger-sequenced S. tuberosum ESTs were detected in the assembly

Comparing to Sanger sequenced BACs

Comparing to BAC/fosmid ends

Anchoring to the chromosomes

Anchored 623 Mb (86% of the assembly) to chromosomes

90.3% of the genes lie on the anchored chromosomes

Repeat annotation and assessment

Repeat content

Gene annotation

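[Flowchart: gene annotation pipeline on the genomic sequence. Protein sequences: rough alignment, then precise alignment, giving homology-based genes. cDNA/EST sequences: alignment, giving cDNA/EST genes. Ab initio prediction: ab initio genes. Gene sets combination: the three gene sets are merged into a combined gene set. Gene sets modification: RNA-seq reads (genome mapping), post-filtering of TE proteins, and synteny info. refine the combined set into the final gene set.]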

31.5 Gb of RNA-Seq data from 32 DM and 16 RH samples/tissues

90.2% of 824,621,408 DM reads and 88.6% of 140,375,647 RH reads mapped

39,031 protein-coding genes

9,875 genes (25.3%) had alternative splicing

12.1% derived solely from ab initio gene predictions
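The percentages on this slide can be turned back into counts with simple arithmetic. The sketch below uses only figures from the slide; the derived counts are approximate because the slide's percentages are rounded, and the variable names are illustrative.

```python
total_genes = 39_031
alt_spliced_genes = 9_875

# Share of genes with alternative splicing (slide: 25.3%).
alt_spliced_pct = 100 * alt_spliced_genes / total_genes
print(f"{alt_spliced_pct:.1f}%")

# Approximate number of genes supported only by ab initio prediction (12.1%).
ab_initio_only = round(0.121 * total_genes)
print(ab_initio_only)

# Approximate numbers of mapped RNA-Seq reads.
dm_mapped = round(0.902 * 824_621_408)
rh_mapped = round(0.886 * 140_375_647)
print(dm_mapped, rh_mapped)
```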

Gene annotation result

Genome evolution – gene families

Genomes compared:

Monocots: Oryza sativa, Brachypodium distachyon, Sorghum bicolor, Zea mays

Eudicots: Arabidopsis thaliana, Carica papaya, Populus trichocarpa, Vitis vinifera, Glycine max

Algae, moss: Chlamydomonas reinhardtii, Physcomitrella patens

• 4,479 potato genes clustered in 3,181 families
• 34,051 potato genes clustered with at least one genome
• 2,642 genes are asterid-specific
• 3,372 genes are potato lineage-specific

Genome evolution - synteny

1,811 syntenic blocks involving 10,046 genes

Genome evolution – whole genome duplication

[Figure labels: ~89 MYA; γ event (~185 MYA); ~67 MYA]

Genome evolution – evidence for WGD

Comparing RH and DM

• 1,644 RH BAC clones
• 178 Mb of non-redundant sequence (~10%)
• 99 Mb of RH sequence (55%) aligned to the DM genome
• The aligned regions show 97.5% identity
• Between RH and DM: one SNP every 40 bp and one indel (12.8 bp on average) every 394 bp
• Between the two haplotypes: 6.6 Mb of sequence could be aligned with 96.5% identity, with one SNP per 29 bp and one indel per 253 bp (average length 10.4 bp)
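The SNP spacings and identity figures above are roughly consistent with each other: one SNP every n bp corresponds to about 1/n mismatched positions. The sketch below makes that check explicit. It is an approximation that counts SNPs only and ignores indels, and the helper name is an assumption.

```python
def identity_from_snp_spacing(bp_per_snp):
    """Approximate percent identity implied by one SNP every bp_per_snp bases."""
    return 100 * (1 - 1 / bp_per_snp)


rh_vs_dm = identity_from_snp_spacing(40)         # slide: 97.5% identity
between_haplotypes = identity_from_snp_spacing(29)  # slide: 96.5% identity
print(f"{rh_vs_dm:.1f}% {between_haplotypes:.2f}%")

# Share of the non-redundant RH sequence that aligned to DM (slide: 55%).
aligned_pct = 100 * 99 / 178
print(f"{aligned_pct:.1f}%")
```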

Comparing at the whole genome level

1,118 million NGS reads (84X) from RH

457.3 million reads aligned to 659.1 Mb (90.6%) of the DM genome

Premature stop codons, frameshifts, presence/absence variants

3.67 million SNPs

Inbreeding depression

• 3,018 premature stop codons (606 homozygous and 2,412 heterozygous, 1,760 of which are specific)
• 80 frameshift mutations (49 homozygous and 31 heterozygous)
• 275 PAV genes (246 RH-specific and 29 DM-specific)
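A quick consistency check of the tallies above, confirming that the homozygous/heterozygous and RH/DM breakdowns add up to the stated totals. The dictionary layout and helper name are only one possible way to organise the slide's numbers.

```python
variants = {
    "premature stop codons": {"total": 3018, "homozygous": 606, "heterozygous": 2412},
    "frameshift mutations": {"total": 80, "homozygous": 49, "heterozygous": 31},
}
pav_genes = {"total": 275, "RH-specific": 246, "DM-specific": 29}


def parts_match_total(entry):
    """True if all non-total values sum to the entry's total."""
    parts = [value for key, value in entry.items() if key != "total"]
    return sum(parts) == entry["total"]


for name, entry in {**variants, "PAV genes": pav_genes}.items():
    print(name, parts_match_total(entry))
```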

Inbreeding depression

• One instance of copy number variation
• Five genes with premature stop codons
• Seven RH-specific genes

Tuber biology

15,235 genes expressed in the transition from stolons to tubers

1,217 transcripts with >5-fold expression in stolons versus five RH tuber tissues

333 transcripts upregulated during the transition from stolons to tubers, particularly proteinase inhibitors, e.g. KTI (Kunitz protease inhibitor)

KTI family

28 Kunitz protease inhibitor genes (KTIs)

KTI family

Starch synthesis

Flowering time regulation for tuber induction

Disease resistance

Many NBS-LRR genes (39.4%) are pseudogenes owing to indels, frameshift mutations, or premature stop codons, including R1 and R3a among others. This might be driven by the rapid evolution of effector genes in the potato late blight pathogen, Phytophthora infestans.

Acknowledgement