Post on 03-Jan-2016


Page 1: Presenting Results Laura Biggins laura.biggins@babraham.ac.uk v1.0 1

Presenting Results

Laura Biggins
laura.biggins@babraham.ac.uk

v1.0

Page 2:

I have my results in a table… what next?

Plot everything?

Page 3:

Artefacts

Artefacts in the data can arise for many reasons at any stage, from library preparation to the final step of the analysis where the gene lists are produced.

• RNA-seq – transcript length, expression level
– Ribosomal, cytoskeleton, extracellular, secreted
• Multi-mapping reads – multi vs single
– Ribosomal, translation
• Bisulphite – CpG density
• GC content – low and high GC fragments are underrepresented in libraries
• Location, average copy number
• Starting population of cells – remember to include background
• Completely random genes…
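Why the background matters can be shown with a small calculation. The sketch below is a plain hypergeometric upper-tail test written with the standard library only; the function name and all of the numbers are hypothetical and chosen for illustration, not taken from any real dataset or from a specific enrichment tool.

```python
from math import comb

def enrichment_p_value(list_hits, list_total, pop_hits, pop_total):
    """Hypergeometric upper tail: P(X >= list_hits) when list_total genes
    are drawn from a background of pop_total genes, pop_hits of which
    carry the annotation."""
    max_hits = min(list_total, pop_hits)
    numerator = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, max_hits + 1)
    )
    return numerator / comb(pop_total, list_total)

# Hypothetical hit list: 30 of 100 genes carry the term.
# Background 1: whole genome, 400 of 20000 genes carry the term.
p_genome = enrichment_p_value(30, 100, 400, 20000)
# Background 2: only genes detectable in the starting cell population,
# 400 of 1500 genes carry the term.
p_expressed = enrichment_p_value(30, 100, 400, 1500)

print(f"whole-genome background: p = {p_genome:.3g}")
print(f"expressed-gene background: p = {p_expressed:.3g}")
```

The same gene list looks strongly enriched against the whole genome and unremarkable against the genes that could actually have been detected.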

Page 4:

Differential power
• RNA-seq – transcript length, expression level
• Bisulphite – CpG density

• Non-random distribution
– CpG density

Page 5:

• Mapping
– multi-mapping
– genome

• Splice variants
– Analysis at transcript vs gene level

Page 6:

Copy number variation

Page 7:

Categories to be wary of

• ribosomal
• cytoskeleton
• extracellular
• secreted
• translation
• glycoprotein

Page 8:

Beware…

GC < 0.35

Page 9:

GC > 0.6

Page 10:

All genes on chr 2, 8, 13

Page 11:

No of transcripts > 4

Random sets of 1000 genes put through DAVID
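Control lists like the ones on these slides (low GC, high GC, selected chromosomes, many transcripts, random draws) can be built from an annotation table and then submitted to an enrichment tool. The sketch below assumes a hypothetical annotation with `gene`, `chr`, `gc` and `n_transcripts` fields and fills it with simulated values; only the thresholds come from the slides.

```python
import random

# Simulated annotation records. In practice these would be read from a
# genome annotation export; the field names here are assumptions.
sim = random.Random(0)
background = [
    {
        "gene": f"gene{i}",
        "chr": str(sim.randint(1, 19)),
        "gc": sim.uniform(0.25, 0.70),
        "n_transcripts": sim.randint(1, 8),
    }
    for i in range(20000)
]

def random_gene_sets(genes, n_sets, set_size, seed=1):
    """Draw n_sets random gene lists (without replacement within a list)."""
    rng = random.Random(seed)
    names = [g["gene"] for g in genes]
    return [rng.sample(names, set_size) for _ in range(n_sets)]

# Deliberately biased lists using the thresholds shown on the slides.
biased_sets = {
    "low_gc": [g["gene"] for g in background if g["gc"] < 0.35],
    "high_gc": [g["gene"] for g in background if g["gc"] > 0.6],
    "chr_2_8_13": [g["gene"] for g in background if g["chr"] in {"2", "8", "13"}],
    "many_transcripts": [g["gene"] for g in background if g["n_transcripts"] > 4],
}

# Random lists of 1000 genes to use as a negative control.
control_sets = random_gene_sets(background, n_sets=10, set_size=1000)

for name, genes in biased_sets.items():
    print(name, len(genes))
print("random control sets:", len(control_sets), "x", len(control_sets[0]))
```

Any category that comes up as significant for these lists is a candidate artefact rather than biology.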

Page 12:

Artefacts – checking your gene list

• Make sure background is appropriate
• Be suspicious of some ontology categories – ribosomal, cytoskeleton, extracellular, secreted, translation

http://www.bioinformatics.babraham.ac.uk/shiny/gene_screen/

gene_screen – Shiny app to check for obvious differences in target genes compared to background population
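The same kind of check can be roughed out by hand. The sketch below is not the gene_screen implementation; it is a simple permutation test, under the assumption that one numeric property per gene (here simulated GC content) is available for both the target list and the background. All names and values are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(target, background, n_perm=2000, seed=0):
    """Estimate how often a random draw from the background, the same size
    as the target list, has a mean at least as far from the background
    mean as the target list does."""
    rng = random.Random(seed)
    background_mean = mean(background)
    observed = abs(mean(target) - background_mean)
    extreme = 0
    for _ in range(n_perm):
        draw = rng.sample(background, len(target))
        if abs(mean(draw) - background_mean) >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_perm + 1)

# Simulated GC fractions, purely illustrative.
sim = random.Random(42)
background_gc = [sim.gauss(0.45, 0.06) for _ in range(5000)]
biased_targets = sorted(background_gc)[:200]     # the 200 lowest-GC genes
random_targets = sim.sample(background_gc, 200)  # an unbiased list

print("biased list: p =", permutation_p_value(biased_targets, background_gc))
print("random list: p =", permutation_p_value(random_targets, background_gc))
```

A small p-value here says the target genes differ from the background in that property, which is a reason to be cautious about any enrichment result built on the list.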

Page 13:

What next?

Page 14:

Figure examples

Page 15:

Figure examples

Page 16:

GO graph

Genes are often annotated with many functions

Page 17:

Displaying Results

Interpreting and exploring results
• How can the results be displayed so that I can interpret and explore them most easily?
– Understanding the functional terms (incl. GO hierarchy)
– Finding relevant information amongst the masses (GOslim, redundant terms, clustering)

Presenting results
• How should I present my results?
• What information should I include?

Page 18:

Interpreting and Exploring Results
• How can the results be displayed so that I can interpret them most easily?
• Understanding the functional categories
– GOrilla – hierarchical map
– Panther – interactive pie charts
• Reducing redundancy
– DAVID – clusters of similar functions
– REVIGO – semantic similarity
– GOslims

Page 19:

GOrilla

cbl-gorilla.cs.technion.ac.il/

Page 20:

Panther

Page 21:

GOrilla

cbl-gorilla.cs.technion.ac.il/

Page 22:

Exploring Results
• How can the results be displayed so that I can interpret them most easily?
• Understanding the functional categories
– GOrilla – hierarchical map
– Panther – interactive pie charts
• Reducing redundancy
– DAVID – clusters of similar functions
– REVIGO – semantic similarity
– GOslims

Page 23:

GOrilla

cbl-gorilla.cs.technion.ac.il/

Page 24:

Exploring results

Page 25:

Reducing redundancy

http://revigo.irb.hr/

Page 26:

Reducing redundancy

Page 27:

Reducing redundancy

Giraph.jar

genelist3.txt

mouse_genes_seqmonk.txt

Page 28:

Reducing redundancy

• Use a clustering tool
• Use a GOslim – various versions available, may lose the interesting detail
• Select non-redundant terms yourself – be consistent
– P-value filter, top x number of categories, largest categories, most enriched
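Selecting non-redundant terms yourself is easier to do consistently if the rule is written down as code. The sketch below is one hand-rolled rule (FDR cutoff, then greedily drop any term whose gene set overlaps a better-ranked term too much); it is not what DAVID or REVIGO do internally. The function names and the cutoffs are assumptions. The terms, FDR values and gene lists are transcribed from the Exercise 2 table, assuming the digits fused onto the last gene name in each row belong to the List Total column.

```python
def jaccard(a, b):
    """Overlap between two gene sets: shared genes / all genes."""
    return len(a & b) / len(a | b)

def select_non_redundant(terms, fdr_cutoff=0.05, max_overlap=0.5, top_n=10):
    """terms: list of (name, fdr, gene_set). Returns the names kept."""
    kept = []
    for name, fdr, genes in sorted(terms, key=lambda t: t[1]):
        if fdr > fdr_cutoff or len(kept) == top_n:
            break
        # Keep the term only if it is sufficiently different from
        # every better-ranked term already kept.
        if all(jaccard(genes, other) < max_overlap for _, _, other in kept):
            kept.append((name, fdr, genes))
    return [name for name, _, _ in kept]

gtp_binding = set(
    "GBP6 GM12185 EIF2S3Y GBP5 GIMAP7 GBP9 IFI47 IGTP GVIN1 GM4841 "
    "GBP10 IIGP1 TGTP1 TGTP2 GBP4 GBP3 GM4951 GBP2 GM4070".split()
)
gtpase = set(
    "GBP6 IGTP GBP5 EIF2S3Y GBP9 GBP10 IIGP1 TGTP1 TGTP2 GBP4 GBP3 GBP2".split()
)
antigen = set(
    "H2-K1 ICAM1 H2-Q10 H2-EB1 H2-D1 H2-T22 H2-Q6 H2-Q7 CD74 B2M PSMB9".split()
)

terms = [
    ("GTP binding", 1.64e-08, gtp_binding),
    ("guanyl nucleotide binding", 2.44e-08, gtp_binding),  # identical gene list
    ("GTPase activity", 3.75e-06, gtpase),
    ("antigen processing and presentation", 2.94e-06, antigen),
]

selected = select_non_redundant(terms)
print(selected)
```

With these settings the two terms that share most or all of their genes with "GTP binding" are dropped, leaving "GTP binding" and "antigen processing and presentation". Whatever rule is chosen, the point from the slide still holds: apply the same rule to every list.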

Page 29:

What information should be included?

Page 30:

Figure examples

Page 31:

Figure examples

Page 32:

Figure examples

Page 33:

Summary

• Beware of artefacts – if something looks too good to be true it probably is…
• Remember your background population
• Do not try and plot absolutely everything
• Choose a method to deal with redundant terms
• Think about what you’re plotting and whether it makes sense
• Do not be afraid of including tables

Page 34:

Exercise 2

Category | Term | Count | % | PValue | Genes | List Total | Pop Hits | Pop Total | Fold Enrichment | Benjamini | FDR
GOTERM_BP_FAT | GO:0006955~immune response | 30 | 29 | 1.86E-22 | CSF2, C3, LY86, H2-D1, OAS3, OAS2, CD74, B2M, LIF, OASL2, OASL1, GBP10, H2-K1, CIITA, ICAM1, H2-Q10, GBP6, GBP5, GBP9, H2-Q6, H2-Q7, PSMB9, SERPINA3G, H2-EB1, IRF8, H2-T22, TGTP1, TGTP2, OAS1A, GBP4, GBP3, GBP2 | 81 | 471 | | 10.68 | 1.59E-19 | 2.88E-19
GOTERM_MF_FAT | GO:0005525~GTP binding | 18 | 17 | 1.34E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 354 | | 8.662 | 2.32E-09 | 1.64E-08
GOTERM_MF_FAT | GO:0032561~guanyl ribonucleotide binding | 18 | 17 | 2.00E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 363 | | 8.448 | 1.73E-09 | 2.44E-08
GOTERM_MF_FAT | GO:0019001~guanyl nucleotide binding | 18 | 17 | 2.00E-11 | GBP6, GM12185, EIF2S3Y, GBP5, GIMAP7, GBP9, IFI47, IGTP, GVIN1, GM4841, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GM4951, GBP2, GM4070 | 78 | 363 | | 8.448 | 1.73E-09 | 2.44E-08
GOTERM_BP_FAT | GO:0019882~antigen processing and presentation | 10 | 9.5 | 1.90E-09 | H2-K1, ICAM1, H2-Q10, H2-EB1, H2-D1, H2-T22, H2-Q6, H2-Q7, CD74, B2M, PSMB9 | 81 | 87 | | 19.28 | 8.10E-07 | 2.94E-06
GOTERM_BP_FAT | GO:0048002~antigen processing and presentation of peptide antigen | 7 | 6.7 | 4.88E-08 | H2-K1, H2-Q10, H2-EB1, H2-D1, H2-Q6, H2-Q7, CD74, B2M | 81 | 35 | | 33.55 | 1.39E-05 | 7.55E-05
GOTERM_BP_FAT | GO:0001916~positive regulation of T cell mediated cytotoxicity | 4 | 3.8 | 1.61E-05 | H2-K1, P2RX7, H2-Q6, H2-Q7, B2M | 81 | 9 | | 74.56 | 0.002288 | 0.024911
GOTERM_BP_FAT | GO:0006952~defense response | 13 | 12 | 1.12E-05 | CIITA, H2-K1, LYZ2, C3, LY86, H2-D1, IFI47, H2-Q6, H2-Q7, CD74, B2M, P2RX7, CD44, IRF8 | 81 | 448 | | 4.868 | 0.001916 | 0.017378
GOTERM_BP_FAT | GO:0001914~regulation of T cell mediated cytotoxicity | 4 | 3.8 | 3.13E-05 | H2-K1, P2RX7, H2-Q6, H2-Q7, B2M | 81 | 11 | | 61 | 0.003817 | 0.048513
GOTERM_MF_FAT | GO:0032555~purine ribonucleotide binding | 32 | 30 | 5.16E-09 | OAS3, HSPA1A, HSPA1B, OAS2, CKB, IGTP, OASL2, GM4841, OASL1, DDX3Y, GBP10, IIGP1, TOP2A, GM4070, CIITA, GM12185, GBP6, EIF2S3Y, MYO6, GBP5, GIMAP7, GBP9, IFI47, PSMB9, MYO10, P2RX7, GVIN1, TGTP1, TGTP2, OAS1A, GBP4, GM4951, GBP3, GBP2 | 78 | 1796 | | 3.035 | 2.23E-07 | 6.31E-06
GOTERM_MF_FAT | GO:0003924~GTPase activity | 11 | 10 | 3.07E-09 | GBP6, IGTP, GBP5, EIF2S3Y, GBP9, GBP10, IIGP1, TGTP1, TGTP2, GBP4, GBP3, GBP2 | 78 | 128 | | 14.64 | 1.77E-07 | 3.75E-06
GOTERM_CC_FAT | GO:0009897~external side of plasma membrane | 12 | 11 | 3.14E-09 | H2-K1, LY6A, LY6C1, ICAM1, P2RX7, IL12RB1, S1PR1, CD44, CD274, H2-D1, H2-Q6, H2-Q7, CD74 | 61 | 206 | | 11.94 | 3.46E-07 | 3.55E-06
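Before plotting, it can help to see how the choice of metric changes which term comes out on top. The sketch below takes five rows from the table above (term, Count, Fold Enrichment, FDR), converts FDR to -log10(FDR), and ranks the terms by each metric in turn. The variable and function names are illustrative only.

```python
import math

# (term, count, fold enrichment, FDR) copied from the Exercise 2 table.
rows = [
    ("immune response", 30, 10.68, 2.88e-19),
    ("GTP binding", 18, 8.662, 1.64e-08),
    ("antigen processing and presentation", 10, 19.28, 2.94e-06),
    ("positive regulation of T cell mediated cytotoxicity", 4, 74.56, 0.024911),
    ("defense response", 13, 4.868, 0.017378),
]

def neg_log10(p):
    """Transform a p-value or FDR so that smaller values plot as taller bars."""
    return -math.log10(p)

metrics = {
    "Count": lambda r: r[1],
    "Fold Enrichment": lambda r: r[2],
    "-log10(FDR)": lambda r: neg_log10(r[3]),
}

# Rank the same terms by each metric, highest value first.
rankings = {
    label: [r[0] for r in sorted(rows, key=value, reverse=True)]
    for label, value in metrics.items()
}

for label, order in rankings.items():
    print(f"{label}: top term = {order[0]}")
```

Ranking by Fold Enrichment puts a term supported by only 4 genes at the top, whereas Count and -log10(FDR) both favour "immune response", so it is worth deciding which quantity the figure is meant to communicate.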

Page 35:

[Figure: three plots, with axes Count, -log(FDR) and Fold Enrichment]

Page 36:

Page 37:

Panther plots