Insights Into a Dinoflagellate Genome Through Expressed Sequence Tag Analysis

Jeremiah D. Hackett, Todd E. Scheetz, Hwan Su Yoon, Marcello B. Soares, Maria F. Bonaldo, Thomas L. Casavant, and Debashish Bhattacharya

http://www.biomedcentral.com/1471-2164/6/80

Post on 21-Dec-2015

Dinoflagellates

• Marine primary producers & grazers of bacterial & eukaryotic plankton

• ~1/2 contain plastids, although many are mixotrophic (obtain food by both photosynthesis & phagotrophy)

• Many cause toxic “red tides”

Red Tides

• Result of more than 20 million cells/liter of seawater

• Cause a variety of poisonings

http://192.171.163.165/Edu_plankton_bio_indicators_of_change.htm

Genetic Uniqueness of Dinoflagellates

• Chromosomes remain condensed throughout the cell cycle except during DNA replication

• Lack nucleosomes; DNA is associated w/ histone-like proteins (HLPs)

• DNA takes on a crystalline structure due to its high concentration

• Plastid genes located in minicircles w/ few genes per circle (most genes transferred to the nucleus)

Subjects of the Research

• Study gene content

• Investigate dinoflagellate evolution

• Analyze DNA packaging

EST

• Expressed Sequence Tags

• A small piece of DNA sequence (200 – 500 nucleotides)

• Used for sequencing of DNA that represents genes of interest

• Can be generated from the 5’ or 3’ end

How Is an EST Made?

• mRNA is converted to cDNA (complementary DNA) by reverse transcriptase; cDNA is used because mRNA is very unstable outside a cell
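As a rough illustration of the mRNA-to-cDNA step, the sketch below models first-strand cDNA as the reverse complement of the mRNA. It ignores priming, second-strand synthesis, and cloning; the function name and the toy transcript are hypothetical and are not taken from the study.

```python
# Illustrative model only: first-strand cDNA as the reverse complement of mRNA.
# The mapping converts each RNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return first-strand cDNA (written 5'->3') for an mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


# Hypothetical toy transcript with a short poly(A) tail.
toy_mrna = "AUGGCCUGAAAAAA"
print(first_strand_cdna(toy_mrna))  # TTTTTTCAGGCCAT
```

The poly(A) tail of the mRNA shows up as a poly(T) stretch at the start of the first-strand cDNA, which is why oligo(dT) primers are commonly used to start the reaction.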

Application of ESTs

• Discovery of new genes

• Mapping of the genome

• Identification of coding regions

From cDNA to ESTs

• Sequencing from 5’ or 3’ end

• 5’ EST: sequence from the beginning portion of the cDNA; tends to be conserved across species (same gene family)

• 3’ EST: sequence from the ending portion of the cDNA; less cross-species conservation
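The difference between the two read types can be sketched in a few lines. This is a simplified model that treats the cDNA as a sense-strand string and takes a fixed-length read from each end; the function name, the read length, and the toy sequence are assumptions made for illustration.

```python
def est_reads(cdna: str, read_length: int = 500) -> dict:
    """Return single-pass reads from the 5' and 3' ends of a cDNA (sense strand)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return {
        "5_prime": cdna[:read_length],   # start of the transcript, often coding
        "3_prime": cdna[-read_length:],  # end of the transcript, often 3' UTR
    }


# Hypothetical toy cDNA; real ESTs are a few hundred nucleotides long.
toy_cdna = "ATGGCCAAGTTTGACTGAGCTTAATAAA"
reads = est_reads(toy_cdna, read_length=8)
print(reads["5_prime"])  # ATGGCCAA
print(reads["3_prime"])  # TTAATAAA
```

Because a 3’ read often falls largely in the untranslated region, it tends to carry less protein-coding signal than a 5’ read, which is consistent with the lower cross-species conservation noted above.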

Alexandrium tamarense

• Toxic blooms

• Shellfish poisoning

• Peridinin-containing plastid

• Haploid vegetative cells (143 chromosomes)

Library Construction

• RNA extracted using Trizol (GibcoBRL)

• RNA purified using Oligotex mRNA Midi Kit (Qiagen)

• Culture strain produced by isolating a single cyst - a diploid resting stage that produces haploid vegetative cells by meiosis

• A single vegetative cell isolated

• Grown at 20 °C on a 13:11 hour light:dark cycle

• cDNA library obtained, 3’ ESTs sequenced

A. tamarense Life Cycle

http://www.irishscientist.ie/2003/contents.asp?contentxml=03p95.xml&contentxsl=is03pages.xsl

Results: ESTs

• 6,723 unique ESTs (out of 11,171 3’ ESTs)

• Most clones were ~750 bp & singletons

• Largest cluster (46 ESTs) related to HLPs

• Next most abundant: luciferin-binding protein (bioluminescence) & photosynthetic proteins (Rubisco, light-harvesting proteins)

• One EST potentially codes for a protein w/ a plastid-targeting signal (candidate for a dinoflagellate-specific protein)
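The cluster statistics above (unique sequences, singletons, largest cluster) can be derived from any EST-to-cluster assignment. The sketch below assumes such an assignment is already available as a dictionary; the clustering step itself is not shown, and the identifiers and toy data are hypothetical.

```python
from collections import Counter


def summarize_clusters(est_to_cluster: dict) -> dict:
    """Summarize cluster sizes from a mapping of EST id -> cluster id."""
    sizes = Counter(est_to_cluster.values())
    largest = max(sizes.items(), key=lambda item: item[1]) if sizes else None
    return {
        "total_ests": len(est_to_cluster),
        "unique_clusters": len(sizes),
        "singletons": sum(1 for n in sizes.values() if n == 1),
        "largest_cluster": largest,  # (cluster id, number of ESTs)
    }


# Hypothetical toy assignment: five ESTs falling into three clusters.
toy = {"est1": "c1", "est2": "c1", "est3": "c2", "est4": "c3", "est5": "c1"}
print(summarize_clusters(toy))
```

For the numbers on this slide, 11,171 ESTs collapsing to 6,723 unique sequences works out to roughly 1.66 ESTs per unique sequence, which fits the observation that most were singletons.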

BLAST hits

EST Processing

• Each cluster was searched against SwissProt protein database using blastx

• 515 hits had an e-value ≤ 10^-20 and terminated within 10 aa of the end of the SwissProt entry (these were used to characterize the 3’ UTRs)

• 3’ UTRs had lengths btw 25-620 nt (avg ~155 nt)

• 3’ UTRs lack a typical polyA signal (polyadenylation may work by a different mechanism, or the signal is atypical)

• G+C content high: 60.8% in the coding region, 57.6% in UTRs; the codon’s 3rd position is strongly biased towards G & C

• Stop codon TGA favored over TAG & TAA
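The hit selection described on this slide can be sketched as a simple filter. The example assumes the e-value criterion is a cutoff of 1e-20 and that each blastx hit has already been parsed into a record holding the alignment end and the full protein length; the `BlastxHit` class, its field names, and the toy hits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlastxHit:
    query_id: str        # EST cluster identifier
    subject_id: str      # protein database entry
    evalue: float
    subject_end: int     # last aligned residue in the protein (1-based)
    subject_length: int  # full length of the protein entry


def hits_reaching_protein_end(hits, max_evalue=1e-20, max_gap_aa=10):
    """Keep strong hits whose alignment ends within max_gap_aa of the protein's end."""
    return [
        h for h in hits
        if h.evalue <= max_evalue and (h.subject_length - h.subject_end) <= max_gap_aa
    ]


# Hypothetical toy hits.
toy_hits = [
    BlastxHit("cluster1", "P00001", 1e-45, 298, 300),  # kept
    BlastxHit("cluster2", "P00002", 1e-30, 150, 300),  # ends too far from the C-terminus
    BlastxHit("cluster3", "P00003", 1e-5, 299, 300),   # e-value too weak
]
print([h.query_id for h in hits_reaching_protein_end(toy_hits)])  # ['cluster1']
```

Hits that run to the end of the protein mark where the coding region stops in the EST, so the sequence downstream of the alignment can be treated as 3’ UTR.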

Summary: ESTs

• Only 20% of the ESTs had significant hits to GenBank, so A. tamarense genes may be highly diverged and/or encode novel dinoflagellate-specific functions (or the ESTs did not extend far enough into the coding region of the transcript to be recognized)

Codon Usage
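Third-position G+C bias and stop-codon preference, as reported in the EST Processing slide, are straightforward to measure on in-frame coding sequences. The sketch below is a minimal version under that assumption; the function names and the toy sequences are hypothetical, and the third-position calculation includes every codon passed in, including the stop codon.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(cds: str) -> list:
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc3_fraction(cds: str) -> float:
    """Fraction of G and C bases at the third position of each codon."""
    return gc_fraction("".join(codon[2] for codon in codons(cds)))


def stop_codon_usage(cds_list) -> Counter:
    """Count which stop codon terminates each coding sequence."""
    last = [codons(cds)[-1] for cds in cds_list if codons(cds)]
    return Counter(codon for codon in last if codon in STOP_CODONS)


# Hypothetical toy coding sequences.
toy_cds = ["ATGGCCAAGTGA", "ATGCTGGACTGA", "ATGGAGTTCTAA"]
print(gc_fraction(toy_cds[0]))    # 0.5
print(gc3_fraction(toy_cds[0]))   # 0.75
print(stop_codon_usage(toy_cds))  # Counter({'TGA': 2, 'TAA': 1})
```

Comparing `gc3_fraction` with the overall `gc_fraction` is one simple way to see whether the third codon position is more G+C-rich than the sequence as a whole.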

Results: Gene Content

• BLAST showed that 609 out of 6,723 ESTs had hits to P. falciparum proteins (the most highly conserved include many “housekeeping” proteins, e.g. tubulin or heat shock protein 70)

• Close evolutionary relationship, but gene content is substantially different (P. falciparum has lost most genes related to plastid function and other metabolic genes)

Summary: Gene Content

• A. tamarense is most closely related to Plasmodium falciparum (both are members of the alveolate lineage, which includes dinoflagellates and apicomplexans)

Dinoflagellate DNA

• Lack nucleosomes; chromosomal DNA strands appear smooth, and DNA is associated w/ HLPs

• Chromosomes are uniform in size & morphology and remain condensed during the cell cycle; transcription occurs from protruding loops

Results: Histone & HLPs

• Two rare ESTs out of 11,171 encode a partial histone H2A.X. One (169 aa) shares sequence identity w/ eukaryotic histone H2A.X (N-terminus longer than in eukaryotic homologs, but α-helices and histone fold conserved)

• Comparison to Emiliania huxleyi – closely related H2A (same monophyletic group)

• Multiple origins of chromalveolates

Alignment of A. tamarense H2A.X with Eukaryotic Homologs

Results: Histone & HLPs

• Alignment of the HLPs w/ other dinoflagellate HLPs & bacterial HU - moderate sequence similarity (bacterial HLPs have a longer N terminus, but secondary structure predictions are remarkably similar)

(HU protein – histone-like DNA binding protein, necessary for protein-DNA assembly & DNA compaction)

• Proline residue (*) is not conserved, so it is unclear whether the HLPs can interact w/ DNA (bending it) as histones do

Alignment of HLPs from Dinoflagellates (red) and Bacteria (blue) and HU Proteins from Bacteria (black)

• Bordetella pertussis (Bph2) – role in virulence gene expression & shares limited sequence similarity w/ histone H1

Results: Histone & HLPs

• HLP concentration is low (protein:DNA ratio = 1:10 vs. 1:1 in other eukaryotes), thus too low to function in DNA compaction; HLPs may instead act as transcriptional regulators (role in repair of double-strand DNA breaks by non-homologous end-joining)

• HLP gene maintained specifically for DNA repair & conserved for interaction with DNA as H2A

• Structural similarities to HU proteins may be due to intracellular gene transfer from the mitochondrial or plastid endosymbionts

Conclusion

• The most extensive dinoflagellate EST set generated so far; it provides a useful glimpse into the A. tamarense nuclear genome

• This data will be used for future research to understand the unique & complex cell biology & to understand toxin production

• Future:

- serial subtraction of cDNA will be used to improve/maintain the library

- new cDNA libraries will be created under various growth conditions & life history stages to generate a more complex catalog

Questions?