Strategies & Examples for Functional Modeling COST Functional Modeling Workshop 22-24 April, Helsinki

Upload: aziza

Post on 23-Feb-2016


Page 1: Strategies & Examples for Functional Modeling

Strategies & Examples for Functional Modeling

COST Functional Modeling Workshop, 22-24 April, Helsinki

Page 2: Strategies & Examples for Functional Modeling

Types of data sets and modeling

• Commercial array data – more likely to have tools that support the use of array IDs.
• Custom/USDA array data – problems with updating IDs, linking to function and using array IDs directly in functional modeling tools.
• Proteomics data – larger data sets; need to make background references to determine enrichment.
• RNA-Seq data – larger and more complex data sets; novel transcripts currently can’t be included in modeling (contact AgBase to assign GO).
• Real-time data or quantitative proteomics data – hypothesis testing.

Page 3: Strategies & Examples for Functional Modeling

Functional Modeling Strategies

1. GO summary (using Slim sets)
2. GO enrichment (statistical!)
3. Pathways analysis
4. Interaction or networks analysis
5. Hypothesis testing

Note:
• Functional modeling should be integrated.
• Approaches are complementary, not exclusive.
• Modeling is driven by the biology (not the other way round).

Page 4: Strategies & Examples for Functional Modeling

Modeling Strategy

• Think about using multiple functional approaches.
  • GO, pathways, networks
  • complementary
• What is available for your species?
  • What GO is available?
  • What species does the pathways/network analysis use?
• What resources do you have?
  • at your institute (e.g. commercial pathways analysis)
  • open source (e.g. GO Enrichment analysis)
  • using online vs installed
• Iterative – further functional modeling based on initial results
  • GO hypothesis testing?

Page 5: Strategies & Examples for Functional Modeling

1. GO Functional Summary

• high throughput data sets give us 1000s to 10,000s of gene products
  • can’t know everything about all gene products
  • tendency to ‘cherry pick’ ones you recognize
• instead, can group gene products by function
  • this gives us a manageable number of categories to process
  • enables us to see trends, patterns, etc.
• Use GO Slim sets to ‘summarize’ data
  • Lose details (but can gain perspective).
  • Some GO Slim sets are ageing – not being updated as changes to the GO are made.
  • Different Slim sets have different terms – which is best for your data?
• AgBase GOSlimViewer tool.
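To make the idea of summarizing with a Slim set concrete, here is a minimal Python sketch. It assumes a toy ontology given as a child-to-parents dictionary, a made-up slim set, and invented gene annotations; the term names are placeholders rather than real GO IDs, and nothing here reflects how GOSlimViewer itself is implemented. Real work would load the ontology and annotations from files.

```python
from collections import Counter

# Toy ontology: child term -> parent terms (hypothetical names, not real GO IDs).
PARENTS = {
    "T:apoptosis": ["T:cell_death"],
    "T:cell_death": ["T:biological_process"],
    "T:b_cell_activation": ["T:immune_response"],
    "T:immune_response": ["T:biological_process"],
    "T:biological_process": [],
}

# Hypothetical slim set: the broad categories we want to report.
SLIM = {"T:cell_death", "T:immune_response"}


def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable through 'parents'."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def slim_summary(annotations, parents, slim):
    """Count gene products per slim term (each gene counted once per slim term)."""
    counts = Counter()
    for gene, terms in annotations.items():
        hits = set()
        for term in terms:
            hits |= ancestors(term, parents) & slim
        # Genes whose terms never reach a slim term are reported separately.
        counts.update(hits if hits else {"unmapped"})
    return counts


annotations = {
    "geneA": ["T:apoptosis"],
    "geneB": ["T:b_cell_activation", "T:apoptosis"],
    "geneC": ["T:biological_process"],
}
summary = slim_summary(annotations, PARENTS, SLIM)
print(summary)
```

Changing `SLIM` changes the summary, which is why the choice of Slim set needs to be reported.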

Page 6: Strategies & Examples for Functional Modeling

http://www.agbase.msstate.edu/help/slimviewerhelp.htm

The Slim set you use matters - need to determine which one to use & report it in Methods.

Page 7: Strategies & Examples for Functional Modeling
Page 8: Strategies & Examples for Functional Modeling
Page 9: Strategies & Examples for Functional Modeling

Functional Summary

• Not all GO terms are annotated equally, e.g., metabolism!
  • can slim the complete GO for a species as a background set and then determine which terms in your data are disproportionately represented.
• Can use Slims to compare two data sets (e.g., control vs treatment).
• Use Slims for your own sanity – are you seeing what you expect to see?
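The background comparison can be sketched as a ratio of proportions per slim category. The category names and counts below are invented for illustration, and the ratio is only a descriptive measure, not a statistical test.

```python
def slim_proportions(counts):
    """Convert per-category counts into proportions of the total."""
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def ratio_vs_background(data_counts, background_counts):
    """Proportion in the data set divided by proportion in the background."""
    data = slim_proportions(data_counts)
    background = slim_proportions(background_counts)
    return {
        term: data.get(term, 0.0) / share
        for term, share in background.items()
        if share > 0
    }


# Hypothetical counts: whole-species slim (background) vs. an experimental data set.
background_counts = {"metabolism": 600, "immune response": 100, "cell cycle": 300}
data_counts = {"metabolism": 30, "immune response": 20, "cell cycle": 50}

ratios = ratio_vs_background(data_counts, background_counts)
for term, ratio in sorted(ratios.items()):
    print(f"{term}: {ratio:.2f}")
```

A ratio above 1 means the category takes up a larger share of the data set than of the background. The same function compares control against treatment if one of them is passed as the "background".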

Page 10: Strategies & Examples for Functional Modeling

[Figure: Membrane proteins grouped by GO BP, B-cells vs stroma. Categories: ion/proton transport, cell migration, cell adhesion, cell growth, apoptosis, immune response, cell cycle/cell proliferation, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, protein modification, signal transduction.]

Page 11: Strategies & Examples for Functional Modeling

[Figure: Membrane proteins grouped by GO BP, B-cells vs stroma. Categories shown: cell migration, apoptosis, immune response, cell cycle/cell proliferation, cell-cell signaling, function unknown.]

Page 12: Strategies & Examples for Functional Modeling

BVDV Infection – cytopathic (CP) vs non-cytopathic (NCP) infection (comparing function between 2 different conditions)

Page 13: Strategies & Examples for Functional Modeling
Page 14: Strategies & Examples for Functional Modeling

2. Determining over-represented or under-represented function.

• most typically used functional analysis method
• many, many tools that do this – see: http://www.geneontology.org/GO.tools.microarray.shtml
• very different visualization
• will use some of these tools in practical session
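As a rough illustration of what many of these tools compute, the sketch below runs a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) for a single GO term. The function names and the counts are assumptions made for this example; individual tools differ in the test used, the background set, and multiple-testing correction, none of which is handled here.

```python
from math import comb


def prob_at_least(k, term_total, sample_size, population):
    """P(X >= k): chance of seeing k or more annotated genes in the sample.

    population  - genes in the background set
    term_total  - background genes annotated to the GO term
    sample_size - genes in the list being tested
    """
    upper = min(term_total, sample_size)
    favourable = sum(
        comb(term_total, i) * comb(population - term_total, sample_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, sample_size)


def prob_at_most(k, term_total, sample_size, population):
    """P(X <= k): used to ask whether a term is under-represented."""
    upper = min(k, sample_size)
    favourable = sum(
        comb(term_total, i) * comb(population - term_total, sample_size - i)
        for i in range(0, upper + 1)
    )
    return favourable / comb(population, sample_size)


# Hypothetical numbers: 20 background genes, 5 annotated to the term,
# 4 genes in our list, 3 of which carry the term.
p_over = prob_at_least(3, term_total=5, sample_size=4, population=20)
p_under = prob_at_most(3, term_total=5, sample_size=4, population=20)
print(f"over-representation p = {p_over:.4f}")
print(f"under-representation p = {p_under:.4f}")
```

Because one such test is run per GO term, a real analysis would also need a multiple-testing correction.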

Page 15: Strategies & Examples for Functional Modeling

http://david.abcc.ncifcrf.gov/home.jsp

Page 16: Strategies & Examples for Functional Modeling

Some useful expression analysis tools:

• Database for Annotation, Visualization and Integrated Discovery (DAVID)
  • http://david.abcc.ncifcrf.gov/
• AgriGO -- GO Analysis Toolkit and Database for Agricultural Community
  • http://bioinfo.cau.edu.cn/agriGO/
  • used to be EasyGO
  • chicken, cow, pig, mouse, cereals, dicots
  • adding new species by request
• Onto-Express
  • http://vortex.cs.wayne.edu/projects.htm#Onto-Express
  • can provide your own gene association file
• Ontologizer
  • WebStart widget (requires Java); now on Galaxy
  • http://compbio.charite.de/contao/index.php/ontologizer2.html
  • requires OBO file & GAF (enables users to select their own annotations)
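For tools that accept your own gene association file, it helps to know what is in it. The sketch below reads GAF-style lines (tab-delimited; `!` marks comment lines; column 2 is the object ID, column 4 the qualifier, column 5 the GO ID, column 7 the evidence code) into a gene-to-GO mapping. The function name `read_gaf`, the protein IDs, and the reference placeholders are all hypothetical; treat it as a minimal reader, not a validator of the format.

```python
from collections import defaultdict


def read_gaf(lines, exclude_evidence=frozenset()):
    """Map each DB object ID to its set of GO IDs, skipping NOT annotations."""
    gene_to_go = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # header/comment or blank line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # too few columns to be an annotation row
        object_id, qualifier, go_id, evidence = cols[1], cols[3], cols[4], cols[6]
        if "NOT" in qualifier.split("|"):
            continue  # negative annotation
        if evidence in exclude_evidence:
            continue  # e.g. drop electronically inferred (IEA) annotations
        gene_to_go[object_id].add(go_id)
    return dict(gene_to_go)


def make_row(object_id, go_id, evidence, qualifier=""):
    """Build a 17-column GAF-style row with placeholder values (illustration only)."""
    cols = ["ExampleDB", object_id, "sym", qualifier, go_id, "REF:0", evidence,
            "", "P", "", "", "protein", "taxon:0", "20000101", "ExampleDB", "", ""]
    return "\t".join(cols)


sample = [
    "!gaf-version: 2.0",
    make_row("P00001", "GO:0006915", "IDA"),
    make_row("P00001", "GO:0006955", "IEA"),
    make_row("P00002", "GO:0006955", "IMP", qualifier="NOT"),
]

all_annotations = read_gaf(sample)
no_iea = read_gaf(sample, exclude_evidence={"IEA"})
print(all_annotations)
print(no_iea)
```

Comparing the gene list against the keys of the result also shows which gene products have no GO at all.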

Page 17: Strategies & Examples for Functional Modeling

GO Enrichment tools that support agricultural species.

Page 18: Strategies & Examples for Functional Modeling
Page 19: Strategies & Examples for Functional Modeling
Page 20: Strategies & Examples for Functional Modeling

• structurally and functionally re-annotated a microarray
• quantified the impact of this re-annotation based on GO annotations & pathways represented on the array
• tested using a previously published experiment that used this microarray
• re-annotation allows more comprehensive GO based modeling and improves pathway coverage
• re-annotation resulted in a different model from previously published research findings

Page 21: Strategies & Examples for Functional Modeling
Page 22: Strategies & Examples for Functional Modeling

Evaluating GO tools

Some criteria for evaluating GO Tools:
1. Does it include my species of interest (or do I have to “humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary) and when was it last updated?
4. Does it report the GO evidence codes (and is IEA included)?
5. Does it report which of my gene products has no GO?
6. Does it report both over/under represented GO groups and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it represent my results in a way that facilitates discovery?

Page 23: Strategies & Examples for Functional Modeling

RNASeq GO Enrichment

• RNASeq experiments: longer transcripts and more highly expressed transcripts are more likely to be differentially expressed.
• Current GO enrichment tools do not account for RNASeq platform bias (most based upon arrays).
  • assume that all genes are independent and equally likely to be selected as DE
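One quick way to see whether a DE list shows this length bias is to bin genes by transcript length and compare the DE fraction per bin. The sketch below is only a diagnostic with invented gene names and lengths; it does not correct the enrichment test itself.

```python
def de_fraction_by_length(lengths, de_genes, n_bins=2):
    """Fraction of DE genes in each length bin, shortest bin first.

    lengths  - dict of gene -> transcript length
    de_genes - set of genes called differentially expressed
    """
    ordered = sorted(lengths, key=lengths.get)
    if not ordered:
        return []
    bin_size = -(-len(ordered) // n_bins)  # ceiling division
    fractions = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        fractions.append(sum(gene in de_genes for gene in chunk) / len(chunk))
    return fractions


# Hypothetical transcript lengths (bases) and DE calls.
lengths = {"g1": 500, "g2": 800, "g3": 1000, "g4": 3000, "g5": 4000, "g6": 6000}
de_genes = {"g2", "g4", "g5", "g6"}

fractions = de_fraction_by_length(lengths, de_genes, n_bins=2)
print(fractions)
```

If the DE fraction climbs steadily with length, the assumption that every gene is equally likely to be selected as DE does not hold for that data set.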

Page 24: Strategies & Examples for Functional Modeling

3. Pathway Analysis

• Freely available tools:
  • from public databases, e.g. KEGG & Reactome
  • freely available tools, e.g. Cytoscape
• Commercial pathways analysis tools: e.g., Ingenuity Pathways Analysis (IPA), Pathway Studio, etc.
  • some tools only have limited species – need to “humanize” animal data (and likewise map plant data to Arabidopsis)
  • everything gives you cancer
• Many pathways analysis tools combine pathways analysis and network analysis.
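"Humanizing" a list amounts to replacing each identifier with its human ortholog and keeping track of what could not be mapped. The sketch below assumes the ortholog table is already available as a dictionary; the identifiers are placeholders, and where the mapping comes from is left open.

```python
def humanize(gene_ids, ortholog_map):
    """Split a gene list into (mapped orthologs, identifiers with no ortholog)."""
    mapped = {}
    unmapped = []
    for gene_id in gene_ids:
        human_id = ortholog_map.get(gene_id)
        if human_id is None:
            unmapped.append(gene_id)
        else:
            mapped[gene_id] = human_id
    return mapped, unmapped


# Hypothetical identifiers and ortholog table (illustration only).
ortholog_map = {"cow_gene_1": "HUMAN_GENE_A", "cow_gene_2": "HUMAN_GENE_B"}
gene_list = ["cow_gene_1", "cow_gene_2", "cow_gene_3"]

mapped, unmapped = humanize(gene_list, ortholog_map)
print(f"mapped {len(mapped)} of {len(gene_list)}; unmapped: {unmapped}")
```

Reporting the unmapped identifiers matters because those gene products silently drop out of any pathway analysis run on the humanized list.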

Page 25: Strategies & Examples for Functional Modeling

Reactome Skypainter
http://www.reactome.org/cgi-bin/skypainter2

Page 26: Strategies & Examples for Functional Modeling

KEGG Pathways
http://www.kegg.jp/kegg/download/kegtools.html

Page 27: Strategies & Examples for Functional Modeling

Analysis tools (commercial)

• Ingenuity Pathway Analysis (http://www.ingenuity.com)
  • Networks
  • Pathways
  • functions and diseases
• Pathway Studio (http://www.ariadnegenomics.com/)
  • Gene Ontology (GO) groups
  • GSEA
  • Pathways

IPA analysis included as IPA.txt

Page 28: Strategies & Examples for Functional Modeling

Data Curation

• Ingenuity: Manually curated database by Ph.D level scientists (mining 32 different peer reviewed journals).
• Pathway Studio: Automated curation by Medscan Reader using natural language processing (NLP) technology. Mining PubMed abstracts and peer reviewed journals.
  • users can do their own text mining

Page 29: Strategies & Examples for Functional Modeling

(Comparison by Divya Peddinti)

Comparison Criteria

• Features
• Proportion of proteins involved in modeling
• Data generation
• Display
• Test Dataset: 3,600 bovine spermatozoa proteins

Page 30: Strategies & Examples for Functional Modeling

Feature comparison: Ingenuity Pathway Analysis (IPA) vs Pathway Studio

Input
• IPA: GI number, Microarray ID, Affymetrix ID, GenBank, Swiss Prot Accession, Unigene ID, Name or Alias, HUGO ID
• Pathway Studio: Entrez gene, GenBank, Microarray ID, Swiss Prot Accession, Unigene ID, Name or Alias, HUGO ID

Databases
• IPA: Contains biological interactions data for human, mouse, rat. Orthologous mapping available for dog, cow, chimp, chicken, rhesus macaque monkey, Arabidopsis thaliana, Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Danio rerio.
• Pathway Studio: Contains biological data for human, mouse, rat, bacteria, chicken, zebrafish, frog, cow, bee, dog, Arabidopsis, Drosophila, yeast, and transplantation research etc.

Page 31: Strategies & Examples for Functional Modeling

Feature comparison (continued): Ingenuity Pathway Analysis (IPA) vs Pathway Studio

Statistical test
• IPA: The significance value (p value) assigned to the function/pathways using Fisher’s exact test.
• Pathway Studio: The statistical significance of the overlap between the protein list and a GO group or pathway using Fisher’s exact test.

Updates
• IPA: Quarterly
• Pathway Studio: Quarterly

Networks
• IPA: Builds networks with a maximum of 35 genes/proteins
• Pathway Studio: -

Page 32: Strategies & Examples for Functional Modeling

Proteins involved in modeling

[Figure: stacked bar chart (percentage scale) of proteins involved vs. not involved in modeling, for Ingenuity and Pathway Studio. Data labels: 57.5 and 42.5; 99.85 and 0.15.]

Page 33: Strategies & Examples for Functional Modeling

Data generation

[Figure: bar chart of the number of pathways for Ingenuity Pathway Analysis and Pathway Studio (data labels: 44 and 33), with accompanying counts 37, 7 and 26.]

Page 34: Strategies & Examples for Functional Modeling

Pathway display: EGF signaling pathway

Page 35: Strategies & Examples for Functional Modeling

4. Network Analysis

• IPA & Pathway Studio equally efficient at drawing networks of relationships.
• IPA: simplifies the pathway display and creates a more manageable, user-friendly network for users to analyze.
• Pathway Studio: shows the relations in a table format.
• STRING Database - known and predicted protein interactions.

Page 36: Strategies & Examples for Functional Modeling

http://string-db.org/

Page 37: Strategies & Examples for Functional Modeling

http://www.cytoscape.org/

Page 38: Strategies & Examples for Functional Modeling

5. Hypothesis Testing

• high throughput data sets – ‘fishing expedition’ or hypothesis generation
• but GO also serves as a repository of biological function – can be used for hypothesis testing based on these data sets

Page 39: Strategies & Examples for Functional Modeling

[Figure: mean total lesion score (0 to 18) vs. days post infection (0 to 100), by genotype: Susceptible (L72) and Resistant (L61).]

Non-MHC associated resistance and susceptibility

The critical time point in MD lymphomagenesis

Hypothesis: At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL).

Page 40: Strategies & Examples for Functional Modeling

[Figure: Cytokines and T helper cell differentiation. Elements shown: naive CD4+ T cell, APC, Th-1, Th-2, T reg. Credit: Shyamesh Kumar.]

Page 41: Strategies & Examples for Functional Modeling

[Figure: cytokine diagram. Cells shown: naive CD4+ T cell, APC, macrophage, NK cell, CTL, Th-1, Th-2, T reg. Molecules shown: IFN γ, IL 12, IL 18, IL 4, IL10, TGFβ, Smad 7. Samples: L6 Whole, L7 Whole, L7 Micro.]

Th-1, Th-2, T-reg? Inflammatory?

Page 42: Strategies & Examples for Functional Modeling

Step I. GO-based Phenotype Scoring.

Gene product   Th1    Th2    Treg   Inflammation
IL-2            1     ND      1     -1
IL-4           -1      1      1     ND
IL-6                   1     -1      1
IL-8           ND     ND      1      1
IL-10          -1      1      1      0
IL-12           1     -1     ND     ND
IL-13          -1      1     ND     ND
IL-18           1      1      1      1
IFN-g           1     -1      1      1
TGF-b          -1      0      1     -1
CTLA-4         -1     -1      1     -1
GPR-83         -1     -1      1     -1
SMAD-7          1      1     -1      1

ND = No data

Step II. Multiply by quantitative data for each gene product.

Step III. Inclusion of quantitative data to the phenotype scoring table and calculation of net effect.

Gene product   Th1     Th2     Treg    Inflammation
IL-2            1.58            1.58   -1.58
IL-4            0.00    0.00    0.00    0.00
IL-6            0.00   -1.20    1.20   -1.20
IL-8            0.00    0.00    1.18    1.18
IL-10           0.00    0.00    0.00    0.00
IL-12           0.00    0.00    0.00    0.00
IL-13           1.51   -1.51    0.00    0.00
IL-18           0.91    0.91    0.91    0.91
IFN-g           0.00    0.00    0.00    0.00
TGF-b          -1.71    0.00    1.71   -1.71
CTLA-4         -1.89   -1.89    1.89   -1.89
GPR-83         -1.69   -1.69    1.69   -1.69
SMAD-7          0.00    0.00    0.00    0.00
Net Effect     -1.29   -5.38   10.15   -5.98
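The three scoring steps can be sketched in a few lines of Python. The signs for the four gene products below are transcribed from the slide's phenotype scoring table; the quantitative values are back-calculated from the scored table (for example, IL-6 contributes -1.20 to Th2 with a sign of +1, implying a value of -1.20), so treat them as illustrative. The names `EFFECTS`, `QUANT` and `net_effect` are made up for this example, and "no data" cells are assumed to contribute nothing.

```python
PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: effect of each gene product on each phenotype
# (+1 promotes, -1 opposes, 0 no effect, None = no data or blank cell).
EFFECTS = {
    "IL-2":  (1, None, 1, -1),
    "IL-6":  (None, 1, -1, 1),
    "IL-13": (-1, 1, None, None),
    "TGF-b": (-1, 0, 1, -1),
}

# Quantitative measurement per gene product (back-calculated, illustrative).
QUANT = {"IL-2": 1.58, "IL-6": -1.20, "IL-13": -1.51, "TGF-b": 1.71}


def net_effect(effects, quant):
    """Steps II and III: multiply each sign by the measurement, then sum per phenotype."""
    totals = {phenotype: 0.0 for phenotype in PHENOTYPES}
    for gene, signs in effects.items():
        value = quant.get(gene)
        if value is None:
            continue  # no measurement for this gene product
        for phenotype, sign in zip(PHENOTYPES, signs):
            if sign is not None:
                totals[phenotype] += sign * value
    return totals


totals = net_effect(EFFECTS, QUANT)
for phenotype in PHENOTYPES:
    print(f"{phenotype}: {totals[phenotype]:+.2f}")
```

With only these four gene products the totals differ from the slide's Net Effect row, which sums over all thirteen; the direction (positive for Treg, negative for Th2 and Inflammation) is the part that carries over.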

Page 43: Strategies & Examples for Functional Modeling

[Figure: bar chart of net effect (axis -20 to 60) by phenotype (Th-1, Th-2, T-reg, Inflammation) for L6 (R) and L7 (S). Inset: microscopic lesions, scale bar 5 mm.]

Page 44: Strategies & Examples for Functional Modeling

[Figure: Pro CTL vs. Anti CTL balance diagrams for L7 Susceptible and L6 Resistant. Labels shown: Pro T-reg, Pro Th-1, Anti Th-2; Pro T-reg, Pro Th-2, Anti Th-1.]

Page 45: Strategies & Examples for Functional Modeling
Page 46: Strategies & Examples for Functional Modeling

Concluding thoughts on functional modeling.

“By doing just a little every day, I can gradually let the task overwhelm me.”

Ashleigh Brilliant

Page 47: Strategies & Examples for Functional Modeling

Bringing it all together…

• There is no one “correct” way; there is no “right” answer.

• Using multiple functional modeling strategies (e.g., GO, pathways, networks) can help with insights.

• Need to use biological knowledge to bring these different approaches together.

• Functional modeling is often iterative.

• Need to focus not only on what is known but what is new!

Page 48: Strategies & Examples for Functional Modeling

Overview of Functional Modeling Strategy

[Figure: flowchart. Yellow boxes represent AgBase tools; green boxes are non-AgBase resources.]

• Data types: Microarrays, Proteomics, RNASeq
• Boxes shown: ArrayIDer, Genome2seq, Protein/Gene identifiers, GORetriever, GO annotations, Genes/Proteins with no GO annotations, GOanna, Blast2GO, GOSlimViewer, AutoSlim
• Pathways and network analysis: Ingenuity Pathways Analysis (IPA), Pathway Studio, Cytoscape, DAVID
• GO Enrichment analysis: Ingenuity Pathways Analysis (IPA), Pathway Studio, Cytoscape, DAVID, AgriGO, Onto-tools

Page 49: Strategies & Examples for Functional Modeling

Functional Modeling Considerations

• Should I add my own GO?
  • use GOProfiler to see how much GO is available for your species
  • use GORetriever to find existing GO for your dataset
  • Does analysis tool allow me to add my own GO?
• Should I do GO analysis and pathway analysis and network analysis?
  • different functional modeling methods show different aspects about your data (complementary)
  • is this type of data available for your species (or a close ortholog)?
• What tools should I use?
  • which tools have data for your species of interest?
  • what type of accessions are accepted?
  • availability (commercial and freely available)

Page 50: Strategies & Examples for Functional Modeling

Some Limitations

• Annotation is not complete.
  • not all the data is annotated
  • some gene products have no functional information
• Gene Ontology is only one aspect of functional modeling.
  • anatomy, tissue expression, phenotype, disease, etc.
• Gene nomenclature – need to know what we are annotating!
• Functional modeling tools need to handle larger data sets (& multiple ontologies?).