Vizbi2013: Visualising RNA

Visualising RNA. Paul Gardner, University of Canterbury, Christchurch, New Zealand. March 20, 2013.

Upload: paul-gardner

Post on 20-Jun-2015


TRANSCRIPT

Page 1: Vizbi2013: Visualising RNA

Visualising RNA

Paul Gardner

University of Canterbury, Christchurch, New Zealand.

March 20, 2013

Paul Gardner Visualising RNA

Page 2: Vizbi2013: Visualising RNA

Feel free to share

- Feel free to tweet (@ppgardne), Google+, tumblr, ...
- Slides are available from http://www.slideshare.net/ppgardne/.


Page 3: Vizbi2013: Visualising RNA

What is an RNA?

[Figure: a tRNA shown as (A) primary structure, (B) secondary structure and (C) tertiary structure. Labelled regions: acceptor stem, D loop, anticodon loop, variable loop and TΨC loop. Nucleotides are numbered from the 5’ end to the 3’ end.]

Primary structure (5’ to 3’):
GCGGAUUUAGCUCAGDDGGGAGAGCGCCAGACUGAAYA.CUGGAGGUCCUGUGT.CGAUCCACAGAAUUCGCACCA


Page 4: Vizbi2013: Visualising RNA

What is Rfam?

- Sister database to Pfam

- Aims to annotate all ncRNA families

- Consortium headed by Alex Bateman (Wellcome Trust Sanger Institute), Sean Eddy (Janelia, Howard Hughes), Sam Griffiths-Jones (Manchester, BBSRC), Paul Gardner (University of Canterbury, RSNZ)


Page 5: Vizbi2013: Visualising RNA

Rfam: families of ncRNAs

http://rfam.sanger.ac.uk
http://rfam.janelia.org


Page 6: Vizbi2013: Visualising RNA

Building an Rfam family

- A structure from literature

Pollard KS, et al. (2006). An RNA gene expressed during cortical development evolved rapidly in humans. Nature.


Page 7: Vizbi2013: Visualising RNA

Building an Rfam family

- An Rfam family: produced manually from publication figures

# STOCKHOLM 1.0

G.gallus.1       UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUAG
M.musculus.1     UAAAAUGGAGGAGAAAUUACAGCAAUUUAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCCGUGG
M.mulatta.1      UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
G.gorilla.1      UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
H.sapiens.1      UGAAACGGAGGAGACGUUACAGCAACGUGUCAGCUGAAAUGAUGGGCGUAGACGCACGUCAGCGGCGG
P.troglodytes.1  UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
P.abelii.1       UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
C.lupus.1        UGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCGGUGC
T.truncatus.1    CGAAAAGGAGGGGAAAUUACAGCAAUUCAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
B.taurus.1       CGAAAUGGAGGAGAAAUUACAGCAAUUCAUCAGCUGAAAUUAUAGGUGUAGACACAUGUCAGCAGUGG
V.pacos.1        UGAAACAGAGGAGAAAUUACAGCAAUUCAUCAACCGAAAUGAUAGGGAUAGACAUGUGUCGGCAGUGG
M.lucifugus.1    CGAAAUGGAGGAGAAAUUACAGCAAUUUAUCAACUGAAAUUAUAGGUGUAGACACAUGUCAUCCGUGG
O.anatinus.1     UGAAAUGGAGGAUAAAUUACAGCAAUUUAUCAAAUGAAAUUAUAGGUGUAGACACAUGUCAGCAAUGG
#=GC SS_cons     <<<<<<.<<<<<<<<<<<.....>>>>>.....>><<<<<.<<<.<<<....>>>.>>>.........
#=GC RF          uGaaacGGaGGagaaguuAcAGcaacuuAUcAgcuGaaacuaugGGcGUAGACgCAcgucAGcaguGg

G.gallus.1       AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
M.musculus.1     AAAUGGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
M.mulatta.1      AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
G.gorilla.1      AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
H.sapiens.1      AAAUGGUUUCUAUCAAAAUGAAAGUGUUUAGAGAUUUUCCUCAAGUUUCA
P.troglodytes.1  AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
P.abelii.1       AAAUAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
C.lupus.1        AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
T.truncatus.1    GAACACUUUCUAUCAAAAUUAAAGUACUUAGCGAUUUUCCUUAAAUUUCA
B.taurus.1       AAACCGUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUUAAAUUUCA
V.pacos.1        AAACAGUUUCUAUCAAAAUUAAAGUAUUUAGAGACUUUCCUCAAAUUUCA
M.lucifugus.1    AAACAGUUACGAUCAAAAUUAAAGUGUUUAGAGAUUUUCCUC.AAUUUUA
O.anatinus.1     AAACAAUUUCUAUCAAAAUUAAAGUAUUUAGAGAUUUUCCUCAAAUUUCA
#=GC SS_cons     .....>>>>>....<<<<<..............>>>>>>>>>..>>>>>>
#=GC RF          AAAuaguuuCUAUcaaaauuAAAGUAUUUAGAGauuuuCCuCAAguuuCa
//
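The `#=GC SS_cons` line in the alignment above encodes the consensus secondary structure as nested brackets, one symbol per alignment column. As a minimal sketch (not the Rfam codebase), the following Python reads such a string into a list of base-paired columns, which is the input most structure-drawing tools need. The function name `pairs_from_ss_cons` is hypothetical, and the sketch assumes that only `<>`, `()`, `[]` and `{}` mark pairs; every other symbol, including pseudoknot letters, is treated as unpaired.

```python
def pairs_from_ss_cons(ss_cons):
    """Return sorted 0-based (open, close) column pairs from a bracket string."""
    closer_to_opener = {">": "<", ")": "(", "]": "[", "}": "{"}
    # One stack per bracket type, so different bracket types nest independently.
    stacks = {opener: [] for opener in closer_to_opener.values()}
    pairs = []
    for column, symbol in enumerate(ss_cons):
        if symbol in stacks:
            stacks[symbol].append(column)
        elif symbol in closer_to_opener:
            stack = stacks[closer_to_opener[symbol]]
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at column {column}")
            pairs.append((stack.pop(), column))
        # Any other symbol is treated as an unpaired column.
    unclosed = sorted(c for stack in stacks.values() for c in stack)
    if unclosed:
        raise ValueError(f"unclosed brackets at columns {unclosed}")
    return sorted(pairs)


# The two SS_cons blocks from the alignment above, joined end to end.
SS_CONS = (
    "<<<<<<.<<<<<<<<<<<.....>>>>>.....>><<<<<.<<<.<<<....>>>.>>>........."
    ".....>>>>>....<<<<<..............>>>>>>>>>..>>>>>>"
)

if __name__ == "__main__":
    pairs = pairs_from_ss_cons(SS_CONS)
    print(len(pairs), "base pairs; outermost pair:", pairs[0])
```

Because the alignment is split into two blocks, the blocks have to be concatenated before parsing: several helices opened in the first block are only closed in the second.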


Page 8: Vizbi2013: Visualising RNA

Building an Rfam family

- And the Wikipedia entry


Page 9: Vizbi2013: Visualising RNA

Conflicting priorities

- A Curator’s priorities

  1. New families
  2. Accuracy of models
  3. Annotation
  4. Functional codebase
  5. Website
  6. Visualization

- A User’s priorities

  1. FTP (Bioinformaticians)
  2. Website
  3. Visualization
  4. Number of families
  5. Accuracy of models
  6. Annotation

Image credits: www.conflictdynamics.org


Page 10: Vizbi2013: Visualising RNA

2007: challenges

- Quality Control
- Re-write the website and add some bling
- Update codebase
- Export annotation to Wikipedia
- User community input via RNA Biology


Page 11: Vizbi2013: Visualising RNA

Visualisation priorities

SCALE
- Two to two million sequences, 30 to 3,000 nucleotides long, 0 to 1,000 basepairs.
- AUTOMATED: thousands of families.

INFORMATIVE
- Generates biologically relevant hypotheses

INCLUSIVE
- Make the most of our fantastic Bioinformatic & Visualisation community.


Page 12: Vizbi2013: Visualising RNA

Examples

- Caveat: none of the images I am showing is a final solution; everything can be improved upon.

- Secondary Structure

- Taxonomic Distribution

- Alignment

- Genomic contexts & Gene Order


Page 13: Vizbi2013: Visualising RNA

RNA Secondary Structure

[Figure: an RNA secondary structure drawn from the 5’ end to the 3’ end, with consensus nucleotides and a sequence-conservation scale running from 0 to 1.]

Gardner, Bateman & Poole (2010). SnoPatrol: how many snoRNA genes are there? Journal of Biology.
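The figure maps a per-position sequence conservation value between 0 and 1 onto the structure. The slide does not say how that value was computed, so the Python below is only a sketch under an assumed definition: for each alignment column, the fraction of sequences that carry the most common residue, with gaps counting against conservation. The function name `column_conservation` and the gap characters `.` and `-` are assumptions for illustration.

```python
from collections import Counter


def column_conservation(alignment, gap_chars=".-"):
    """Per-column fraction of sequences sharing the most common residue.

    `alignment` is a list of equal-length aligned sequences. Gaps are never
    counted as a residue, but they stay in the denominator, so a gappy
    column scores low.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        residues = [c.upper() for c in column if c not in gap_chars]
        if not residues:
            scores.append(0.0)  # a column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


if __name__ == "__main__":
    toy = ["ACGU", "ACGA", "AC-A"]  # made-up sequences, not Rfam data
    for position, score in enumerate(column_conservation(toy), start=1):
        print(position, round(score, 2))
```

Other definitions, such as information content per column, would give a scale with the same range but a different shape.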


Page 14: Vizbi2013: Visualising RNA

Old Taxonomic distributions: RybB

- Contamination displayed first.


Page 15: Vizbi2013: Visualising RNA

Old Taxonomic distributions: RybB

- After some scrolling


Page 16: Vizbi2013: Visualising RNA

New Taxonomic distributions: RybB

- Sunbursts: concentric “pie charts”; each external ring contains the “children” nodes of the internal ring.
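The sunburst idea can be sketched in a few lines of Python: each node gets an angular slice proportional to the number of sequences beneath it, and its children subdivide that slice one ring further out. This is an illustration only, not the Rfam website code. The tuple-based tree format, the function names and the counts in the example are all made up.

```python
def total(node):
    """Number of sequences under a node: a leaf count, or the sum of its children."""
    _name, payload = node
    if isinstance(payload, list):
        return sum(total(child) for child in payload)
    return payload


def sunburst_layout(node, start=0.0, end=360.0, depth=0):
    """Return (name, ring, start_angle, end_angle) for a node and its descendants.

    A node is (name, count) for a leaf, or (name, [child, ...]) otherwise.
    Ring 0 is the innermost ring, and angles are in degrees.
    """
    name, payload = node
    segments = [(name, depth, start, end)]
    if isinstance(payload, list):
        node_total = total(node)
        cursor = start
        for child in payload:
            share = total(child) / node_total if node_total else 0.0
            width = (end - start) * share
            segments.extend(sunburst_layout(child, cursor, cursor + width, depth + 1))
            cursor += width
    return segments


if __name__ == "__main__":
    # Hypothetical taxonomy with invented sequence counts.
    tree = ("root", [
        ("Bacteria", [("Enterobacteriaceae", 30), ("Other", 10)]),
        ("Eukaryota", 20),
    ])
    for name, ring, a0, a1 in sunburst_layout(tree):
        print(f"ring {ring}: {name:<20} {a0:6.1f} to {a1:6.1f} degrees")
```

A lineage with only a few hits, such as a contaminant, ends up as a thin slice rather than the first row of a long list.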


Page 17: Vizbi2013: Visualising RNA

Alignments

- When we have sequenced everything, how is this view going to look?


Page 18: Vizbi2013: Visualising RNA

Genomic contexts & Gene Order

- How can we display comparative gene-order information in a scalable fashion?

- Think of hundreds to thousands of genomes, tens to hundreds of features.

Barquist L, et al. (2013). A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium. Nucleic Acids Research.


Page 19: Vizbi2013: Visualising RNA

Open problems

- Evolution and RNA structure
- Scalable alignment visualisation (and editing)
  - As alignments grow, we need to be able to partition, compress and summarize groupings of sequences. Thousands of sequences from the same species are not interesting to view, nor is a screen full of gaps.

- Expression and conservation levels

- Genomic context & gene-order
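One very simple way to start on the alignment problem above is sketched here in Python: keep a single representative sequence per species, then drop every column that is left holding only gaps. It is a rough illustration, not a proposed solution. It assumes sequence names of the form `Species.N` (as in `H.sapiens.1`), gap characters `.` and `-`, and hypothetical function names.

```python
def collapse_by_species(records):
    """Keep the first (name, sequence) record seen for each species.

    The species is taken to be the name with its final '.N' suffix removed,
    which is an assumption about the naming scheme.
    """
    representatives = {}
    for name, seq in records:
        species = name.rsplit(".", 1)[0]
        representatives.setdefault(species, (name, seq))
    return list(representatives.values())


def drop_all_gap_columns(records, gap_chars=".-"):
    """Remove alignment columns in which every sequence has a gap."""
    if not records:
        return []
    columns = zip(*(seq for _name, seq in records))
    keep = [i for i, column in enumerate(columns)
            if any(c not in gap_chars for c in column)]
    return [(name, "".join(seq[i] for i in keep)) for name, seq in records]


if __name__ == "__main__":
    # Made-up toy alignment, not Rfam data.
    toy = [
        ("H.sapiens.1", "AC-G"),
        ("H.sapiens.2", "A-UG"),
        ("M.musculus.1", "AC-G"),
    ]
    for name, seq in drop_all_gap_columns(collapse_by_species(toy)):
        print(name, seq)
```

Taking the first sequence per species is arbitrary. A real tool would more likely cluster by sequence identity or by taxonomy level and let the user expand a group on demand.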


Page 20: Vizbi2013: Visualising RNA

Thanks!

- The Rfam Consortium:
  - Alex Bateman, Sean Eddy, Sam Griffiths-Jones, Sarah Burge, Eric Nawrocki, John Tate, Rob Finn, Jennifer Daub, Ruth Eberhardt

- Visualisation Tools:
  - Ivo Hofacker, Yann Ponti, Jim Proctor, Ian Holmes, Irmtraud Meyer, Zasha Weinberg and many others.

PPG is supported by a Rutherford Discovery Fellowship from Government funding, administered by the Royal Society of New Zealand.
