Machine Reading for Cancer Panomics - AKBC · 2018. 11. 4.
TRANSCRIPT
Machine Reading for Cancer Panomics
Hoifung Poon
Overview
[Diagram: High-Throughput Data (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …), KB, Cancer Systems Modeling, Disease Genes, Drug Targets, …]
Overview
[Diagram: High-Throughput Data (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …), KB, Extract Pathways from PubMed, Grounded Semantic Parsing, Disease Genes, Drug Targets, …]
Precision Medicine
Vemurafenib on BRAF-V600 Melanoma
[Figure panels: Before Treatment, 15 Weeks, 23 Weeks]
Traditional Biology
[Diagram: Targeted Experiments, One hypothesis, Discovery]
Genomics
[Diagram: High-Throughput Experiments (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …, …), Many hypotheses, ?, Discovery]
Genome-Wide Association Studies (GWAS)
[Diagram: Healthy vs. Disease (e.g., Alzheimer's, Cancer); … ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …]
Timeline: 2000, 2010
“Genetic diagnosis of diseases would be accomplished in 10 years and that treatments would start to roll out perhaps five years after that.”
“A Decade Later, Genetic Maps Yield Few New Cures”, New York Times, June 2010.
Key Challenges
Human genome: 3 billion base pairs
Potential variations: > 10 million variants
Combination: > 10^1000000 (1 million zeros)
Machine learning problem
Atomic features: > 10 million
Feature combination: Too many to enumerate
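The scale of the combination count can be sanity-checked with a short calculation. The sketch below assumes, as a simplification not stated on the slide, that each of the 10 million variants is either present or absent, so the number of combinations is 2^10,000,000. It counts the decimal digits of that number with logarithms rather than building the integer.

```python
import math

n_variants = 10_000_000  # "> 10 million variants" from the slide

# Assumption (not from the slide): each variant is simply present or absent,
# so there are 2**n_variants possible combinations.
# The number of decimal digits of 2**n is floor(n * log10(2)) + 1.
digits = math.floor(n_variants * math.log10(2)) + 1

print(f"2^{n_variants:,} has {digits:,} decimal digits")
print("exceeds 10^1000000:", digits > 1_000_000)
```

Under that assumption the count has roughly three million digits, comfortably above the slide's lower bound.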
Genomics
[Diagram: High-Throughput Experiments (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …, …), Discovery]
How to Scale Discovery?
Cancer
Hundreds of mutations
Most are “passengers”, not drivers
Can we identify likely drivers?
[Diagram: Normal cells vs. Tumor cells; … ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …]
Panomics
[Diagram: Genome (… ATTCGGATATTTAAGGC …), Transcriptome, Epigenome, …]
Pathway Knowledge
Genes work synergistically in pathways
Why Hard to Identify Drivers?
Complex diseases → Perturb multiple pathways
Hanahan & Weinberg [Cell 2011]
Why Does Cancer Come Back?
Subtypes with alternative pathway profiles
Compensatory pathways can be activated
[Diagram: Ovarian Cancer, EphA2, EphB2, X]
Cancer Systems Modeling
Gene A: DNA → (Transcription) → mRNA → (Translation) → Protein → (Activation) → Active Protein
[Diagram labels: … ATTCGGATATTTAAGGC …, Mutation effect, Functional activity, Drug Target, …]
Knowledge Model
Gene A: DNA → mRNA → Protein → Active Protein
Gene B: DNA → mRNA → Protein → Active Protein
Gene C: DNA → mRNA → Protein → Active Protein
[Diagram labels: Transcription Factor, Protein Kinase, ?, !]
Approach: Graph HMM
Gene A: DNA → mRNA → Protein → Active Protein
Gene B: DNA → mRNA → Protein → Active Protein
Gene C: DNA → mRNA → Protein → Active Protein
[Diagram labels: Transcription Factor, Protein Kinase]
Extract Pathways from PubMed
[Diagram: High-Throughput Data (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …), KB, Disease Genes, Drug Targets, …]
PubMed
24 million abstracts
Two new abstracts every minute
Adds over one million every year
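The two growth figures agree with each other, as a quick calculation shows. This sketch assumes a 365-day year and a constant rate of two abstracts per minute; both are simplifications for illustration.

```python
abstracts_per_minute = 2            # "Two new abstracts every minute"
minutes_per_year = 60 * 24 * 365    # assumption: 365-day year, constant rate

abstracts_per_year = abstracts_per_minute * minutes_per_year
print(f"{abstracts_per_year:,} new abstracts per year")
```

That works out to 1,051,200 abstracts per year, matching "over one million every year".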
Machine Reading
PMID: 123: … VDR+ binds to SMAD3 to form …
PMID: 456: … JUN expression is induced by SMAD3/4 …
……
Machine Reading: Semantic Parsing
Involvement of p70(S6)-kinase activation in IL-10
up-regulation in human monocytes by gp41 envelope
protein of human immunodeficiency virus type 1 ...
Entities: IL-10 (PROTEIN), gp41 (PROTEIN), p70(S6)-kinase (PROTEIN), human monocyte (CELL)
Events:
Involvement (REGULATION): Theme = up-regulation, Cause = activation
up-regulation (REGULATION): Theme = IL-10, Cause = gp41, Site = human monocyte
activation (REGULATION): Theme = p70(S6)-kinase
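The nested event structure in this example can be written down as a small data structure. The sketch below is illustrative only: the `Entity` and `Event` classes and the helper functions are hypothetical names, not part of any system described in the talk, and the role assignments follow the Theme, Cause, and Site labels on the slide as read against the sentence.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple, Union


@dataclass
class Entity:
    text: str
    type: str  # "PROTEIN" or "CELL" in this example


@dataclass
class Event:
    trigger: str
    type: str  # "REGULATION" for all three events here
    args: Dict[str, Union[Entity, "Event"]] = field(default_factory=dict)


def depth(node: Union[Entity, Event]) -> int:
    """Nesting depth: an entity is 0, an event is 1 + its deepest argument."""
    if isinstance(node, Entity):
        return 0
    return 1 + max((depth(arg) for arg in node.args.values()), default=0)


def edges(event: Event) -> Iterator[Tuple[str, str, str]]:
    """Yield (trigger, role, argument text) for the event and its nested events."""
    for role, arg in event.args.items():
        if isinstance(arg, Event):
            yield (event.trigger, role, arg.trigger)
            yield from edges(arg)
        else:
            yield (event.trigger, role, arg.text)


il10 = Entity("IL-10", "PROTEIN")
gp41 = Entity("gp41", "PROTEIN")
kinase = Entity("p70(S6)-kinase", "PROTEIN")
monocyte = Entity("human monocyte", "CELL")

up_regulation = Event(
    "up-regulation", "REGULATION",
    {"Theme": il10, "Cause": gp41, "Site": monocyte},
)
activation = Event("activation", "REGULATION", {"Theme": kinase})
involvement = Event(
    "Involvement", "REGULATION",
    {"Theme": up_regulation, "Cause": activation},
)

for edge in edges(involvement):
    print(edge)
print("nesting depth:", depth(involvement))
```

Representing event arguments as either entities or other events is what makes the parse nested: the top-level regulation takes two further regulation events as its Theme and Cause.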
Long Tail of Variations
TP53 inhibits BCL2.
Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
BCL2 transcription is suppressed by P53 expression.
The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
……
Bottleneck: Annotated Examples
GENIA (BioNLP Shared Task 2009-2013)
1,999 abstracts
MeSH: human, blood cell, transcription factor
Challenge for “supervised” machine learning
Can we breach this bottleneck?
Free Lunch #1: Distributional Similarity
Similar context → Probably similar meaning
Annotation as latent variables
Textual expression → Recursive clusters
Unsupervised semantic parsing
Poon & Domingos, “Unsupervised Semantic Parsing”. EMNLP 2009. Best Paper Award.
Recursive Clustering
TP53 inhibits BCL2.
Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
BCL2 transcription is suppressed by P53 expression.
The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
……
Clusters:
Cause: TP53, Tumor suppressor P53, ……
Relation: inhibits, down-regulates, suppresses, inhibition, …
Theme: BCL2, BCL-2 proteins, B-cell CLL/Lymphoma 2, ……
Free Lunch #2: Existing KBs
Many KBs available
Gene/Protein: GenBank, UniProt, …
Pathways: NCI, Reactome, KEGG, BioCarta, …
Annotation as latent variables
Textual expression → Table, column, join, …
Grounded semantic parsing
Entity Extraction
HGNC

| ID | Symbol | Alias |
|---|---|---|
| 990 | BCL2 | B-cell CLL/Lymphoma 2, … |
| 11998 | TP53 | Tumor suppressor P53, … |
| … | … | … |

TP53 inhibits BCL2.
Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
BCL2 transcription is suppressed by P53 expression.
The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
……
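A simple way to see how a gene table can ground mentions in text is exact dictionary matching. The sketch below is a baseline for illustration, not the method used in the talk: `HGNC_ROWS`, `build_lexicon`, and `find_genes` are hypothetical names, and the rows contain only the two entries and aliases visible on the slide (aliases elided with "…" are left out).

```python
import re

# Rows copied from the slide's HGNC excerpt: (ID, symbol, known aliases).
HGNC_ROWS = [
    (990, "BCL2", ["B-cell CLL/Lymphoma 2"]),
    (11998, "TP53", ["Tumor suppressor P53"]),
]


def build_lexicon(rows):
    """Map each lower-cased symbol or alias to its (ID, symbol)."""
    lexicon = {}
    for gene_id, symbol, aliases in rows:
        for name in [symbol, *aliases]:
            lexicon[name.lower()] = (gene_id, symbol)
    return lexicon


def find_genes(sentence, lexicon):
    """Return (matched text, ID, symbol) in order of appearance.

    Longer names are tried first so an alias is not split into shorter hits.
    """
    hits = []
    covered = set()  # character positions already claimed by a match
    for name in sorted(lexicon, key=len, reverse=True):
        pattern = re.compile(
            r"(?<![\w-])" + re.escape(name) + r"(?![\w-])", re.IGNORECASE
        )
        for match in pattern.finditer(sentence):
            span = range(match.start(), match.end())
            if covered.isdisjoint(span):
                covered.update(span)
                hits.append((match.start(), match.group(0), *lexicon[name]))
    return [hit[1:] for hit in sorted(hits)]


lexicon = build_lexicon(HGNC_ROWS)
for sentence in [
    "TP53 inhibits BCL2.",
    "Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.",
    "BCL2 transcription is suppressed by P53 expression.",
    "The inhibition of B-cell CLL/Lymphoma 2 expression by TP53",
]:
    print(sentence, "->", find_genes(sentence, lexicon))
```

With only these two rows, exact matching misses variants such as "BCL-2" and bare "P53", which illustrates why the long tail of textual variation calls for more than a lookup table.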
Relation Extraction
NCI-PID Pathway KB

| Regulation | Theme | Cause |
|---|---|---|
| Positive | A2M | FOXO1 |
| Positive | ABCB1 | TP53 |
| Negative | BCL2 | TP53 |
| … | … | … |

TP53 inhibits BCL2.
Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.
BCL2 transcription is suppressed by P53 expression.
The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …
……
Grounded Learning
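The basic idea of using a pathway KB as a source of training labels can be sketched as plain distant supervision: any sentence that mentions both genes of a KB row is labeled with that row's regulation type. This is a minimal illustration, not the grounded learning method of the talk. `KB`, `SENTENCES`, and `distant_labels` are hypothetical names, and the gene sets attached to each sentence are assumed to come from an earlier entity-normalization step.

```python
# Rows from the slide's NCI-PID excerpt: (regulation, theme, cause).
KB = [
    ("Positive", "A2M", "FOXO1"),
    ("Positive", "ABCB1", "TP53"),
    ("Negative", "BCL2", "TP53"),
]

# Assumption: mentions were already normalized to gene symbols upstream.
SENTENCES = [
    ("TP53 inhibits BCL2.", {"TP53", "BCL2"}),
    ("Tumor suppressor P53 down-regulates the activity of BCL-2 proteins.",
     {"TP53", "BCL2"}),
    ("BCL2 transcription is suppressed by P53 expression.", {"TP53", "BCL2"}),
    ("The inhibition of B-cell CLL/Lymphoma 2 expression by TP53 …",
     {"TP53", "BCL2"}),
]


def distant_labels(kb, sentences):
    """Label a sentence with every KB row whose theme and cause it both mentions."""
    labeled = []
    for text, genes in sentences:
        for regulation, theme, cause in kb:
            if theme in genes and cause in genes:
                labeled.append((text, regulation, theme, cause))
    return labeled


examples = distant_labels(KB, SENTENCES)
for text, regulation, theme, cause in examples:
    print(f"{regulation}(theme={theme}, cause={cause}): {text}")
```

Labels produced this way are noisy, since co-occurrence of two genes does not guarantee that the sentence states the KB relation; they serve as weak supervision rather than gold annotations.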
Question Answering w.r.t. KB

| System | Accuracy | Supervision |
|---|---|---|
| ZC07 | 84.6 | Supervised |
| FUBL | 82.8 | Supervised |
| GUSP | 83.5 | Unsupervised |

Poon, “Grounded Unsupervised Semantic Parsing”. ACL 2013.
Pathway Extraction
Generalize distant supervision:
Nested events in KB likely occur in semantic parse of some sentence
Prior: Favor semantic parse grounded in KB
Outperformed the majority of participants in original GENIA Event Shared Task
Parikh, Poon, Toutanova. In Progress.
Literome
http://literome.azurewebsites.net
Poon et al., “Literome: PubMed-Scale Genomic Knowledge Base in the Cloud”, Bioinformatics 2014.
PubMed-Scale Extraction
Preliminary pass:
2 million instances
13,000 genes, 870,000 unique regulations
Applications:
UCSC Genome Browser, MSR Interactions Track
Expression profile modeling
Validate de novo pathway prediction
Etc.
Poon, Toutanova, Quirk, “Distant Supervision for Cancer Pathway Extraction from Text”. PSB 2015. To appear.
Machine Science
Evans & Rzhetsky, “Machine Science”. Science, Vol. 329, 2010.
[Diagram: Big Data, Rich Knowledge (KB), Deep Model, Hypotheses, Experiments]
Roadmap
Extract richer knowledge:
Cell type, experimental condition, …
Hedging, negation, …
Formulate coherent models:
Supporting evidence, contradiction, …
Intellectual gaps, hypotheses, …
Integrate w. data & experiments:
Cancer panomics → Driver genes / pathways
Single-drug response → Drug combo prioritization
Big Mechanism
$42-million program
Reading, Assembly, Explanation
Domain: Cancer signaling pathways
We are in
PI: Andrey Rzhetsky
Co-PI w. James Evans, Ross King
Berkeley AMP Lab
OHSU
Microsoft Research
We Have Digitized Life
Next: Digitize Medicine
Knock down genes A, B, C → Cure
Summary
Precision medicine is the future
Cancer systems modeling
Graphical model: Pathways + Panomics data
Extract pathways from PubMed
Machine reading by grounded semantic parsing
Literome: KB for genomic medicine
Acknowledgments
U. Chicago: Andrey Rzhetsky, Kevin White
OHSU: Brian Druker, Jeff Tyner
Berkeley AMP Lab: David Patterson
U. Wisconsin: Anthony Gitter
Microsoft Research: Chris Quirk, Kristina Toutanova, David Heckerman, Ankur Parikh, Lucy Vanderwende, Bill Bolosky, Ravi Pandya
Summary
[Diagram: High-Throughput Data (… ATTCGGATATTTAAGGC …, … ATTCGGGTATTTAAGCC …), KB, Disease Genes, Drug Targets, …]