ester feldmesser bioinformatics unit introduction to deep...
TRANSCRIPT
![Page 1: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/1.jpg)
Ester Feldmesser
Bioinformatics Unit
Introduction to Deep Sequencing Analysis Course
June 2017
![Page 2: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/2.jpg)
Gene Set Enrichment Analysis (GSEA)
http://www.broadinstitute.org/gsea/index.jsp http://www.pnas.org/content/102/43/15545.full
![Page 3: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/3.jpg)
Gene set database
• The gene sets are defined based on prior biological knowledge, e.g., published information about biochemical pathways or co-expression in previous experiments
and more….
![Page 4: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/4.jpg)
Recommended as a starting point. Hallmark gene sets summarize and
represent specific well-defined biological states or processes.
The hallmarks reduce noise and redundancy and provide a better delineated biological space for GSEA.
CGP: chemical and genetic perturbations. CP: Canonical pathways
CP:BIOCARTA CP:KEGG CP:REACTOME
The user can define new gene sets
http://www.broadinstitute.org/gsea/msigdb/collections.jsp#H
![Page 5: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/5.jpg)
GSEA features
• GSEA performs its analysis on a list of ranked genes derived from comparing between two conditions, there is no need of cutoffs to define up or down regulated genes.
• Given a ranked list of differentially expressed genes, the goal of GSEA is to determine whether members of a gene set tend to occur toward the top (or bottom) of the list , in which case the gene set is correlated with the phenotypic class distinction (conditions).
Subramanian et al, 2005, PNAS ; 102(43): 15545–15550, doi: 10.1073/pnas.0506580102
![Page 6: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/6.jpg)
A GSEA overview illustrating the method
• Compute a gene-wise measure for differential expression between A and B and rank the genes according to this measure
• Alternatively a pre-ranked list can be used
• The score is calculated by walking down the list L, increasing a running-sum statistic when we encounter a gene in S and decreasing it when we encounter genes not in S.
• If up regulated genes in group A are enriched with genes from the Gene set S, many of its genes will have high ranks and we will observe a separation in the ordered list
![Page 7: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/7.jpg)
Steps:
1. Calculation of an Enrichment Score (ES): maximum deviation from
zero encountered in the walk
2. Estimation of Significance Level of ES by permutations test
3. Adjustment for Multiple Hypothesis Testing
Gene Set Enrichment Analysis
S1 is significantly enriched in females, S2 is randomly distributed and scores poorly, and S3 is not enriched at the top of the list but is nonrandom. Arrows show the location of the maximum enrichment score and the point where the correlation (signal-to-noise ratio) crosses zero
![Page 8: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/8.jpg)
Advantages
• Not restricted to analysis of differential expression
• Ranking of genes is considered
• No cutoff has to be chosen
Gene Set Enrichment Analysis
![Page 9: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/9.jpg)
GSEA output
![Page 10: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/10.jpg)
INGENUITY Understand complex
'omics data at multiple levels
![Page 11: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/11.jpg)
IPA provides many kinds of biological insights
![Page 12: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/12.jpg)
Quickly analyze your data to identify key insights with IPA Transcriptomics IPA can help with almost any transcriptomics-related question or application Biomarker Discovery Identifies the most promising and relevant biomarker candidates within experimental datasets microRNA Research Combines filtering tools and microRNA-mRNA content to provide insight into the biological effects of microRNAs Toxicogenomics Delivers a focused toxicity and safety assessment of candidate compounds, and provides a more complete understanding of pharmacological response, drug mechanism of action, and mechanism of toxicity Metabolomics Overcomes the metabolomics data analysis challenge by providing the critical context necessary to gain biological insight into cell physiology and metabolism from metabolite data Drug Repositioning Expression profiling of approved drugs and comparison to profiles of diseased tissue can lead to discovery of new uses for these already approved entities Proteomics Perform a comprehensive analysis of your proteomics for a deep understanding of proteins and related biological processes Target Discovery Genes that are shown to be activated in a pathological condition may serve as promising targets for therapeutic development efforts
![Page 13: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/13.jpg)
HIGHLIGHTS
Quickly analyze your experimental data to identify key insights
Easily search the scientific literature and find insights most relevant to your experimental model or question
Build dynamic pathway models to extend your understanding of your research systems
![Page 14: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/14.jpg)
Signaling and Metabolic Pathways Analysis
Cellular and Disease Process Analysis
Molecular Network Analysis
Upstream (Transcription Factor) Analysis
Regulator Analysis
Evaluation of Downstream Effects
THE KEY COMPONENTS OF THE IPA CORE ANALYSIS ARE:
![Page 15: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/15.jpg)
Supported Species: IPA supports the upload and analysis of human, mouse, rat, and canine identifiers, plus chemical identifiers. Additionally, IPA supports analysis of molecular data for the following species through ortholog mapping of Entrez Gene IDs: Arabidopsis thaliana (plant) Bos taurus (bovine) Caenorhabditis elegans (c. elegans) Canis lupus familiaris (canine) Danio rerio (zebrafish) Drosophila melanogaster (fruit fly) Gallus gallus (chicken) Macaca mulatta (Rhesus Monkey) Pan troglodytes (chimpanzee) Saccharomyces cerevisiae (yeast) Schizosaccharo-myces pombe (yeast)
![Page 16: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/16.jpg)
The Ingenuity Knowledge Base includes modeled relationships between: •Chemicals •Proteins and molecular complexes •Genes and mutations •Cells, cellular components, cellular processes and tissues •Drugs, diseases and clinical phenotypes IPA includes relationships that have been manually curated from the literature (Ingenuity® Expert Findings) as well as information that has been manually reviewed and automatically extracted from the literature (Ingenuity® ExpertAssist Findings). IPA also contains information that has been manually curated by Ingenuity scientists (Ingenuity® Expert Knowledge) and that describes: •FDA approved drugs and clinical candidates •Cell Signaling, Metabolic, and Disease Pathways •Toxicity Lists and Pathways •Predicted and experimentally demonstrated microRNA targets
What’s in IPA
![Page 17: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/17.jpg)
IPA includes manually reviewed content from selected third party sources (Ingenuity® Supported Third Party Information), including: Entrez Gene RefSeq OMIM GWAS Database Gene Ontology Human Metabolome Database (HMDB) GNF Tissue Expression Body Atlas NCI-60 Cell Line Expression Atlas KEGG metabolic pathway information LIGAND enzyme/substrate reactions BIND, DIP, MINT, MIPS, BIOGRID, INTACT, COGNIA protein-protein interactions TarBase TargetScan miRecords Clinicaltrials.gov Drugs@FDA Mosby’s Drug Consult Goodman & Gilman’s Pharmacological Basis of Therapeutics DrugBank Hazardous Substance Database (HSDB) Chemical Carcinogenesis Research Information System database (CCRIS)
![Page 18: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/18.jpg)
• ~190,000 new findings (total of ~5.9 million findings), including: • ~65,000 new Expert findings • ~102,000 new mutation-to-disease findings from ClinVar • ~2,900 new disease-to-target findings from ClinicalTrials.gov • ~1,800 new disease-to-drug findings from ClinicalTrials.gov • ~2,300 new cancer mutation disease association findings from
COSMIC • ~1,200 new functional annotations from Gene Ontology • ~500 new mouse knockout-to-phenotype findings in Mouse
Genome Database (JAX Labs) • ~13,000 new protein-protein interactions from the IntAct database • ~2,000 new protein-protein interactions from the BioGRID database • ~300 gene-to-disease findings from OMIM
Content Updates (March 2017)
![Page 19: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/19.jpg)
Using Enrichment Analysis in Practice
• statistical measure
– how likely your differentially regulated genes fall into that category by chance
microarray
1000 genes experiment
100 genes differentially regualted
mitosis – 80/100 apoptosis – 40/100 p. ctrl. cell prol. – 30/100 glucose transp. – 20/100
0
10
20
30
40
50
60
70
80
mitosis apoptosis positive control of
cell proliferation
glucose transport
![Page 20: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/20.jpg)
Using Enrichment Analysis in Practice
• Looking at the distribution of all genes on the microarray:
Process Genes on array # genes expected in occurred 100 random genes mitosis 800/1000 80 80 apoptosis 400/1000 40 40 p. ctrl. cell prol. 100/1000 10 30 glucose transp. 50/1000 5 20
![Page 21: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/21.jpg)
Analysis summary
![Page 22: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/22.jpg)
Networks generation
• Chosen genes are combined into networks that maximize their specific connectivity, according to pre-defined relationships in the Ingenuity Knowledge Base.
• Additional molecules from the database are used to specifically connect two or more smaller networks by merging them into a larger one.
![Page 23: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/23.jpg)
Example of the calculation of specific connectivity
![Page 24: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/24.jpg)
View of a network
![Page 25: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/25.jpg)
Cannonical pathways
• 600 predefined pathways
– HumanCyc Metabolic Pathways
– Signaling Pathways
![Page 26: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/26.jpg)
Downstream Effects Analysis
![Page 27: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/27.jpg)
More Downstream Effects Analysis
![Page 28: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/28.jpg)
Regulation z-score
The purpose of the regulation z‐score is to identify increased or decreased biological functions that are implicated by the observed gene
expression changes
![Page 29: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/29.jpg)
The upstream regulator analysis is based on prior knowledge of expected effects between transcriptional regulators and their target genes stored in the Ingenuity® Knowledge Base.
The analysis examines how many known targets of each transcription regulator are present in the user’s dataset, and also compares their direction of change to what is expected from the literature in order to predict likely relevant transcriptional regulators.
IPA’s definition of upstream transcriptional regulator is quite broad – any molecule that can affect the expression of other molecules, which means that upstream regulators can be almost any type of molecule, from transcription factor, to microRNA, kinase, compound or drug.
OVERLAP P‐VALUE ACTIVATION Z‐SCORE
Upstream Regulator Analysis
![Page 30: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/30.jpg)
Upstream Regulator Analysis
The primary purpose of the activation z‐score is to infer the activation states of predicted transcriptional regulators
![Page 31: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/31.jpg)
Regulator Effectors • Generate a hypothesis for how a phenotype, function or disease
is regulated in your dataset by activated or inhibited upstream regulators
• Explain the biological impact of upstream molecules and the underlying mechanism for a phenotype
• The algorithm goes through one or more iterations to merge upstream and downstream results from the Upstream Regulator Analysis tab and Downstream Effects Analysis tabs
![Page 32: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/32.jpg)
How does the Regulator Effects algorithm work?
![Page 33: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/33.jpg)
How does the Regulator Effects algorithm work?
![Page 34: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/34.jpg)
Custom filters help you identify relevant information
![Page 35: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/35.jpg)
Gain more insight by customizing pathway and network overlays
![Page 36: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/36.jpg)
Quickly visualize canonical pathway scores across dose, time, or other condition using the new Comparison Analysis heat map. Prioritize by score, hierarchical cluster, or trend.
Comparison Analysis
![Page 37: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/37.jpg)
Which microRNA is predicted to target a given mRNA, and how good is the prediction?
Based on my expression data, which microRNAs have regulation that supports the
target prediction?
Which mRNAs participate in a relevant disease, subcellular location, or pathway?
How do certain mRNAs and microRNAs interact, and what’s downstream?
What is the predicted impact of changes in microRNA expression on cellular
processes, pathways, diseases, and phenotypes?
MicroRNA Target Filter
![Page 38: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/38.jpg)
Path Designer
![Page 39: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/39.jpg)
Path Designer 2
![Page 40: Ester Feldmesser Bioinformatics Unit Introduction to Deep ...dors.weizmann.ac.il/course/course2018/Gene Set... · expression between A and B and rank the genes according to this measure](https://reader033.vdocuments.mx/reader033/viewer/2022042923/5f712e803f945d798f11218f/html5/thumbnails/40.jpg)
Batch Upload