Daniel Rico, PhD. [email protected] ::: Introduction to Functional Analysis Course on Functional Analysis Bioinformatics Unit CNIO


Post on 26-May-2020


Page 1: Course on Functional Analysis - CNIObioinfo.cnio.es/.../Functional_Analysis_Course/Intro-and-Babelomics_24_02_2009.pdffunctional analysis Study the enrichment in functional terms in

Daniel Rico, PhD. [email protected]

::: Introduction to Functional Analysis

Course on Functional Analysis

Bioinformatics Unit CNIO


::: Schedule.

1.  Biological (Functional) Databases
2.  Threshold-based and threshold-free methods
3.  Threshold-based example: FatiGO.
4.  Threshold-free example 1: FatiScan.


ACKNOWLEDGEMENTS

Many of these slides have been taken and adapted from original slides by Fatima Al-Shahrour from Joaquin Dopazo’s group (Babelomics team).

We are grateful for the material and for the great tools they have developed!


Biological databases

Organisms: Arabidopsis thaliana, Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Gallus gallus, Danio rerio

Genes IDs: HGNC symbol, EMBL acc, RefSeq, PDB, Protein Id, IPI…, UniProt/Swiss-Prot, UniProtKB/TrEMBL, Ensembl IDs, EntrezGene, Affymetrix, Agilent

Functional annotation:

•  Gene Ontology (Biological Process, Molecular Function, Cellular Component)
•  KEGG pathways
•  Biocarta pathways
•  Regulatory elements (miRNA, CisRed, Transcription Factor Binding Sites)
•  InterPro Motifs
•  Bioentities from literature (Diseases terms, Chemical terms)
•  Gene Expression in tissues
•  Keywords Swissprot


Gene Ontology CONSORTIUM http://www.geneontology.org

•  The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products.

•  These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them.

•  The controlled vocabularies of terms are structured 


GO structure

The three categories of GO

Molecular Function

the tasks performed by individual gene products; examples are transcription factor and DNA helicase

Biological Process

broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions

Cellular Component

subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and origin recognition complex

GO tree structure

IS_A relation

PART_OF relation

http://www.genome.ad.jp/kegg/pathway.html

http://www.biocarta.com/genes/index.asp

http://www.reactome.org/

http://www.pathwaycommons.org

http://www.whichgenes.org/

http://www.cisred.org/


Threshold-based functional analysis

Study the enrichment in functional terms in groups of genes defined by the experimental value.

The two-step approach

•  Genes of interest are selected using the experimental value.
•  Selected genes are compared to the background.

Tools: FatiGO, GoMiner, DAVID, Marmite

Threshold-free functional analysis

Select genes taking into account their functional properties.

•  Under a systems biology perspective.
•  Detect blocks of functionally related genes.

Tools: FatiScan, GSEA, MarmiteScan


Threshold-based functional analysis

[Figure: Class1 vs. Class2 comparison with a t-test cut-off; the genes passing FDR<0.05 on each side are selected. Biological meaning?]
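The selection step of the threshold-based approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names, t statistics and FDR-adjusted p-values are made up, the function name `select_genes` is hypothetical, and a positive statistic is assumed to mean higher expression in Class1.

```python
def select_genes(results, fdr_cutoff=0.05):
    """Split genes into Class1, Class2 and background lists by a cut-off."""
    class1, class2, background = [], [], []
    for gene, (statistic, adj_p) in results.items():
        if adj_p < fdr_cutoff and statistic > 0:
            # Assumption: positive statistic = higher in Class1.
            class1.append(gene)
        elif adj_p < fdr_cutoff and statistic < 0:
            class2.append(gene)
        else:
            # Everything that fails the cut-off becomes the background.
            background.append(gene)
    return class1, class2, background


# Hypothetical (t statistic, FDR-adjusted p-value) pairs, for illustration only.
results = {
    "geneA": (5.2, 0.001),
    "geneB": (3.1, 0.030),
    "geneC": (0.4, 0.700),
    "geneD": (-2.9, 0.040),
    "geneE": (-1.0, 0.400),
}

class1, class2, background = select_genes(results)
print(class1)      # ['geneA', 'geneB']
print(class2)      # ['geneD']
print(background)  # ['geneC', 'geneE']
```

The selected lists are what a tool such as FatiGO would then compare against the background; the cut-off itself carries no biological meaning, which is the point of the question on the slide.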


Threshold-free functional analysis

[Figure: genes ranked from Class1 to Class2, with the t-test cut-off marked; Gene Set 1, Gene Set 2 and Gene Set 3 are scored along the ranking with an ES/NES statistic.]

Gene set 2 enriched in Class 1

Gene set 3 enriched in Class 2
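To make the idea of scoring a gene set along the whole ranked list concrete, here is a minimal sketch of an unweighted running-sum enrichment score. It is a simplification chosen for illustration: it is not FatiScan's procedure, nor the full weighted GSEA statistic, and it omits normalization (NES) and permutation testing. The function name and the gene identifiers are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list.

    ranked_genes is ordered from most Class1-like to most Class2-like.
    Returns the running-sum value with the largest absolute deviation.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up at each set member, step down at every other gene.
        running += 1.0 / n_hits if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking: g1 is most Class1-like, g10 most Class2-like.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]

es_top = enrichment_score(ranked, {"g1", "g2", "g3"})
es_bottom = enrichment_score(ranked, {"g8", "g9", "g10"})
print(es_top)     # close to +1: set concentrated at the Class1 end
print(es_bottom)  # close to -1: set concentrated at the Class2 end
```

No gene is selected beforehand: the sign of the score tells at which end of the ranking the set accumulates.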


http://babelomics.bioinfo.cipf.es/


::: How functional profiling should never be done

It is not uncommon to find the following assertion in papers and talks: “then we examined our set of genes selected in this way (whatever) and we discovered that 65% of them were related to metabolism, so we can conclude that our experiment activates metabolism genes”.

Annotation is not a functional result!!!


::: Exercise 1: FatiGO SEARCH

1. Select “FatiGO Search” and “H. sapiens”.
2. Upload the FatiGO_example.txt file.
3. Select “KEGG pathways” and click “Run”.


FatiGO-Search annotations


Testing the distribution of GO terms among two groups of genes

(remember, we have to test hundreds of GOs)

                  Group A   Group B
Biosynthesis        60%       20%
Sporulation         20%       20%

Genes in group A are significantly associated with biosynthesis, but not with sporulation.

Are these two groups of genes carrying out different biological roles?

                  A    B
Biosynthesis      6    2
No biosynthesis   4    8
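A 2x2 table of this kind is the usual input to Fisher's exact test. The sketch below computes a two-sided p-value for the biosynthesis counts using only the Python standard library. The function name is hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention, not necessarily the one FatiGO implements.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability that k of the row-1 genes fall in column 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Counts from the slide: rows = biosynthesis / no biosynthesis,
# columns = group A / group B.
p_value = fisher_exact_two_sided(6, 2, 4, 8)
print(round(p_value, 4))  # 0.1698
```

With only ten genes per group, the 60% versus 20% difference gives a two-sided p-value of about 0.17 under this convention, so the counts on the slide are best read as an illustration of the table layout; the same proportions in larger groups would give a much smaller p-value. Because hundreds of terms are tested, each p-value also needs a multiple-testing adjustment.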


Using FatiGO

•  List1: genes of interest (significantly over- or under-expressed when two classes of experiments are compared, co-located in the chromosomes, etc.)

•  List2: the background (typically the rest of the genes).

•  Select a suitable database, Run...
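Before the two lists are compared, the workflow on this slide removes repeated genes within each list and between both lists. A minimal sketch of that cleaning step follows. The function name and gene symbols are hypothetical, and the sketch assumes that a gene found in both lists is dropped from both, which the slide does not spell out.

```python
def clean_lists(list1, list2):
    """Deduplicate each list, then drop genes shared by both lists."""
    # dict.fromkeys keeps the first occurrence and preserves input order.
    unique1 = list(dict.fromkeys(list1))
    unique2 = list(dict.fromkeys(list2))
    shared = set(unique1) & set(unique2)
    # Assumption: a gene present in both lists is removed from both.
    clean1 = [gene for gene in unique1 if gene not in shared]
    clean2 = [gene for gene in unique2 if gene not in shared]
    return clean1, clean2


# Hypothetical gene symbols, for illustration only.
list1 = ["TP53", "BAX", "TP53", "CASP3"]
list2 = ["GAPDH", "BAX", "ACTB", "ACTB"]

clean1, clean2 = clean_lists(list1, list2)
print(clean1)  # ['TP53', 'CASP3']
print(clean2)  # ['GAPDH', 'ACTB']
```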

Comparing groups of genes

1.  Inputs: List1 and List2.
2.  Remove genes repeated in list1, genes repeated in list2, and genes repeated between both lists, giving a “clean” List1 and a “clean” List2.
3.  Extract functional terms from BABELOMICS (GO, KEGG, Interpro, KW, Bioentities, Gene Expression, TF, Cisred).
4.  Build the matrix of functional terms (011000101010101001 ...).
5.  Fisher's test.
6.  Adjust p-value by FDR.
7.  Significant functional terms.
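The "adjust p-value by FDR" step can be illustrated with the Benjamini-Hochberg procedure. The slides do not say which FDR method FatiGO uses, so Benjamini-Hochberg is an assumption here, chosen because it is a widely used one; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per functional term tested.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adjusted) if p < 0.05]
print([round(p, 3) for p in adjusted])  # [0.02, 0.04, 0.04, 0.02]
print(significant)                      # [0, 1, 2, 3]
```

Terms whose adjusted p-value falls below the chosen level (0.05 in the figures of this course) are reported as significant.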


[Figure: Class1 vs. Class2 comparison with a t-test cut-off at FDR<0.05 on each side; labels: List 1, List 2 (background), List 1b / List 2b.]


::: Exercise 2: FatiGO COMPARE

1. Select “FatiGO Compare” and “H. sapiens”.
2. Upload the FatiGO_example.txt file.
3. Select “Rest of Genome” as background.
4. Select “KEGG pathways” and click “Run”.


Only “Apoptosis” is significant