

Cancer Genome Analysis: PARADIGM

Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics, 2010. (C. J. Vaske et al.)

02-715 Advanced Topics in Computational Genomics


Motivation

• Integrative analysis of cancer genome data
  – Copy number variations, gene expression

• Leverage pathway information to find frequently occurring pathway perturbations
  – NCI Pathway Interaction Database, KEGG, etc.


Motivation

• Pathway information describes how genes are supposed to behave


PARADIGM


PARADIGM Model

• Physical entities for variables:
  – Protein-coding genes, small molecules, complexes
  – Gene families: collections of genes in which any single gene is sufficient to perform a specific function
  – Abstract processes: the overall role of the pathway, e.g., apoptosis


PARADIGM Model

• Factor graph representation of various entities corresponding to a single gene


PARADIGM Model: Gene Interactions


PARADIGM Model:

• A factor graph for a pathway


Model Specification

• Convert an NCI pathway into a factor graph
  – NCI pathway to Bayesian network
    • Directed network
    • Each variable takes values of -1 (deactivation), 0 (normal), 1 (activation)
      – mRNA: overexpression for activation
      – Copy number variations: more than two copies for activation
    • Probability distribution of each node
      – Labeled edges for positive/negative interactions
      – Set the value of the child node as weighted votes from its parents
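The "weighted votes" rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expected_state`, the unit default weights, and the use of the sign of the summed vote are hypothetical choices, not details given on the slide.

```python
def expected_state(parent_states, edge_signs, weights=None):
    """Return the child's expected state (-1, 0, or 1) from its parents.

    parent_states: each parent's state, in {-1, 0, 1}
    edge_signs:    +1 for a positive (activating) edge, -1 for a negative one
    weights:       optional per-parent vote weights (assumed 1.0 if omitted)
    """
    if weights is None:
        weights = [1.0] * len(parent_states)
    # Each parent votes with its own state, flipped if the edge is inhibitory.
    vote = sum(w * s * x for w, s, x in zip(weights, edge_signs, parent_states))
    # Collapse the summed vote to its sign: -1, 0, or 1.
    return (vote > 0) - (vote < 0)


# Two active activators and one active inhibitor: the net vote is positive.
state = expected_state([1, 1, 1], [+1, +1, -1])
```

A tie between activating and inhibiting votes leaves the child in the normal state (0) under this rule.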


Model Specification

• Converting the Bayesian network to a factor graph
  – Assign a factor to each group of variables consisting of a node and its parents

• Z: normalization constant

• ε = 0.001
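The factor formula itself did not survive in this transcript, so the sketch below assumes a common reading of it: a factor scores 1 - ε when the child is in the state its parents vote for and ε/2 otherwise, and the joint distribution is the product of factors divided by Z. The toy two-node chain, the uniform treatment of the root, and all function names are illustrative assumptions.

```python
from itertools import product

EPSILON = 0.001       # value given on the slide
STATES = (-1, 0, 1)   # deactivation, normal, activation


def sign(value):
    return (value > 0) - (value < 0)


def factor(child, parents, edge_signs):
    """Assumed form: 1 - eps if the child matches the parents' vote, else eps / 2."""
    expected = sign(sum(s * x for s, x in zip(edge_signs, parents)))
    return 1 - EPSILON if child == expected else EPSILON / 2


# Toy chain A -> B with a positive edge; A has no parents, so only B's factor counts.
unnormalized = {
    (a, b): factor(b, [a], [+1]) for a, b in product(STATES, STATES)
}
Z = sum(unnormalized.values())                      # normalization constant
joint = {k: v / Z for k, v in unnormalized.items()}
```

With three states, the two "unexpected" configurations share ε equally, so each parent configuration contributes exactly 1 to Z in this toy example.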


Inference

• Observed variables: copy number variations, gene expression

• Unobserved variables: protein, protein activity, overall pathway activity state

• Learn models with EM algorithm
  – E step: impute the unobserved variables
  – M step: what are the parameters?
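The slide leaves the M step as an open question. One possible answer, assumed here purely for illustration, is that the parameters are the observation tables P(observed value | hidden state). The sketch below runs EM on a deliberately tiny model with one hidden three-state variable per sample and two discretized observations; the model, the names, and the initialization are all hypothetical.

```python
STATES = (-1, 0, 1)


def e_step(sample, prior, emissions):
    """Posterior over the hidden state given one sample's observed values."""
    post = {}
    for h in STATES:
        p = prior[h]
        for name, value in sample.items():
            p *= emissions[name][h][value]
        post[h] = p
    total = sum(post.values())
    return {h: p / total for h, p in post.items()}


def m_step(samples, posteriors, names, pseudocount=1e-3):
    """Re-estimate each observation table from posterior-weighted counts."""
    emissions = {}
    for name in names:
        table = {}
        for h in STATES:
            counts = {v: pseudocount for v in STATES}
            for sample, post in zip(samples, posteriors):
                counts[sample[name]] += post[h]
            total = sum(counts.values())
            table[h] = {v: c / total for v, c in counts.items()}
        emissions[name] = table
    return emissions


def run_em(samples, names, iterations=20):
    prior = {h: 1.0 / 3.0 for h in STATES}  # hidden-state prior kept fixed
    # Start from tables that mildly favor observations matching the hidden state.
    emissions = {
        n: {h: {v: 0.6 if v == h else 0.2 for v in STATES} for h in STATES}
        for n in names
    }
    for _ in range(iterations):
        posteriors = [e_step(s, prior, emissions) for s in samples]
        emissions = m_step(samples, posteriors, names)
    return emissions
```

In the full model the E step would be inference over the whole factor graph rather than a single hidden node; the toy version only shows how the two steps alternate.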


Log-likelihood Ratio Test

• Test statistic for assessing entity i's activity given data D
  – The probabilities can be obtained by performing inference on the factor graph
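The test statistic's formula is missing from this transcript. A minimal sketch, assuming the statistic is the log-likelihood ratio of the data under "entity i is in state a" versus "entity i is not in state a": by Bayes' rule that ratio equals the posterior log-odds minus the prior log-odds, so it can be computed from the two marginals that inference returns. The function name and the example numbers are hypothetical.

```python
import math


def log_likelihood_ratio(posterior, prior, a):
    """log [ P(D | x_i = a) / P(D | x_i != a) ] from marginals over x_i.

    posterior: dict state -> P(x_i = state | D), from inference with the data
    prior:     dict state -> P(x_i = state), from inference without the data
    """
    posterior_odds = posterior[a] / (1.0 - posterior[a])
    prior_odds = prior[a] / (1.0 - prior[a])
    return math.log(posterior_odds) - math.log(prior_odds)


# Made-up marginals for one entity in one patient.
prior = {-1: 1 / 3, 0: 1 / 3, 1: 1 / 3}
posterior = {-1: 0.1, 0: 0.2, 1: 0.7}
score_up = log_likelihood_ratio(posterior, prior, 1)
```

A positive score means the data make state a more likely than it was a priori; a score of zero means the data are uninformative about it.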


Log-likelihood Ratio Test

• Aggregating over the multiple values entity i takes
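The aggregation rule is not reproduced in this transcript. One plausible rule, stated here as an assumption to be checked against the original paper, keeps the activation score when activation has the strictly largest log-likelihood ratio, keeps the negated deactivation score when deactivation does, and returns zero otherwise. The function name is hypothetical.

```python
def integrated_pathway_activity(scores):
    """Collapse per-state scores into one signed activity value.

    scores: dict mapping state (-1, 0, 1) to that state's log-likelihood ratio
    """
    up, normal, down = scores[1], scores[0], scores[-1]
    if up > down and up > normal:
        return up        # activation is the best-supported state
    if down > up and down > normal:
        return -down     # deactivation wins; report it with a negative sign
    return 0.0           # normal wins, or there is a tie


activity = integrated_pathway_activity({-1: -1.5, 0: -0.7, 1: 1.5})
```

The sign of the result then says whether the entity looks activated or deactivated in that patient, and the magnitude says how strongly the data support it.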


Dataset

• Breast cancer copy number and gene expression data

• TCGA glioblastoma copy number and gene expression data

• Pathways from the NCI Pathway Interaction Database (PID)


EM Convergence

• Original data vs. permuted data

Red: real data; green: permuted data


Top PARADIGM Pathways of Breast Cancer


Top PARADIGM Pathways of Glioblastoma


Glioblastoma Subtypes


Survival Rates for Each Subtype


CircleMap of ErbB2 Pathway

• ER status, IPAs, expression data, and copy-number data


Summary

• PARADIGM integrates different types of data, including gene expression, copy number variation, and pathway databases, to infer pathway activities for individual cancer patients.
  – Factor graph model for representing pathways and modeling datasets
  – Pathway activities inferred by PARADIGM can be used to identify cancer subtypes