françois fages iclp december 2003 symbolic model checking of biochemical systems logic programming...

47
François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt http://contraintes.inria.fr/ Joint work with and Nathalie Chabrier-Rivier Sylvain Soliman In collaboration with ARC CPBIO http://contraintes.inria.fr/cpbio Alexander Bockmayr, Vincent Danos, Vincent Schächter et al.

Upload: aron-shon-smith

Post on 14-Jan-2016

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Symbolic model checking of biochemical systemsLogic programming steps towards formal biology

François Fages, INRIA Rocquencourt http://contraintes.inria.fr/

Joint work with andNathalie Chabrier-Rivier Sylvain Soliman

In collaboration with ARC CPBIO http://contraintes.inria.fr/cpbioAlexander Bockmayr, Vincent Danos, Vincent Schächter et al.

Page 2: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Current revolution in Biology

• Elucidation of high-level biological processes

in terms of their biochemical basis at the molecular level.

• Mass production of genomic and post-genomic data:

ARN expression, protein synthesis, protein-protein interactions,…

• Need for a strong parallel effort on the formal representation of biological processes.

• Need for formal tools for modeling and reasoning about their global behavior.

Page 3: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Formalisms for modeling biochemical systems

• Diagrammatic notation • Boolean networks [Thomas 73]

• Milner’s –calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]

• Concurrent transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 03]

Biochemical abstract machine BIOCHAM [Chabrier-Fages-Soliman 03]

Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]

• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]

• Differential equations • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]

• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]

• Hybrid concurrent constraint languages [Bockmayr-Courtois 01]

Page 4: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Our goal

Beyond simulation, provide formal tools for querying, validating and completing biological models.

Our proposal:

• Use of temporal logic CTL as a query language for models of biological processes;

• Use of concurrent transition systems for their modeling;

• Use of symbolic and constraint-based model checkers for automatically evaluating CTL queries in qualitative and quantitative models.

• Use of inductive logic programming for learning models [EU APRIL 2]

In course, learn and teach bits of biology with logic programs.

Page 5: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Plan of the talk

1. Introduction

2. A simple algebra of cell molecules

3. Concurrent transition systems of biochemical reactions• Example of the mammalian cell cycle control

4. Temporal logic CTL as a query language• Computational results with BIOCHAM

5. Learning models• An experiment with inductive logic programming

6. Quantitative models• Simulation with differential equations

• Constraint-based model checking

7. Conclusion

Page 6: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

References

A wonderful textbook:Molecular Cell Biology. 5th Edition, 1100 pages+CD, Freeman Publ.Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. Nov. 2003.

Genes and signals. Ptashne, Gann. CSHL Press. 2002.

Modeling dynamic phenomena in molecular and cellular biology. Segel. Cambridge Univ. Press. 1987.

Modeling and querying bio-molecular interaction networks. Chabrier, Chiaverini, Danos, Fages, Schächter. To appear in TCS. 2003.

The biochemical abstract machine BIOCHAM. Chabrier, Fages, Soliman. http://contraintes.inria.fr/BIOCHAM

Page 7: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

2. A Simple Algebra of Cell Molecules

Small molecules: covalent bonds (outer electrons shared) 50-200 kcal/mol

• 70% water

• 1% ions

• 6% amino acids (20), nucleotides (5),

fats, sugars, ATP, ADP, …

Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol

Stability and bindings determined by the number of weak bonds: 3D shape

• 20% proteins (50-104 amino acids)

• RNA (102-104 nucleotides AGCU)

• DNA (102-106 nucleotides AGCT)

Page 8: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Structure levels of proteins

1) Primary structure: word of n amino acids residues (20n possibilities)

linked with C-N bonds

ICLP

Isoleucine Cysteine Leucine Proline

2) Secondary: word of m helix, strands, random coils,… (3m-10m)

stabilized by hydrogen bonds H---O

3) Tertiary 3D structure: spatial folding

stabilized by

hydrophobic

interactions

Page 9: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Formal proteins

Cyclin dependent kinase 1 Cdk1

(free, inactive)

Complex Cdk1-Cyclin A Cdk1–CycB

(low activity)

Phosphorylated Cdk1~{thr161}-CycB

at site threonine 161

(high activity)

(BIOCHAM syntax)

Page 10: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Gene expression: DNA RNA protein

DNA: word over 4 nucleotides Adenine, Guanine, Cytosine, Thymine

double helix of pairs A--T and C---G

Replication: DNA synthesis

Genes: parts of DNA

Transcription: RNA copying from a gene

ERCC1-(PRB-JUN-CFOS)

Page 11: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Genome Size

Species Genome size Chromosomes Coding DNA

E. Coli (bacteria) 5 Mb 1 circular 100 %

S. Cerevisae (yeast) 12 Mb 16 70 %

Mouse, Human 3 Gb 20, 23 15 %

… 15 Gb

… 140 Gb

3,200,000,000 pairs of nucleotides

single nucleotide polymorphism 1 / 2kb

Page 12: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Genome Size

Species Genome size Chromosomes Coding DNA

E. Coli (bacteria) 4 Mb 1 100 %

S. Cerevisae (yeast) 12 Mb 16 70 %

Mouse, Human 3 Gb 20, 23 15 %

Onion 15 Gb 8 1 %

… 140 Gb

Page 13: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Genome Size

Species Genome size Chromosomes Coding DNA

E. Coli (bacteria) 4 Mb 1 100 %

S. Cerevisae (yeast) 12 Mb 16 70 %

Mouse, Human 3 Gb 20, 23 15 %

Onion 15 Gb 8 1 %

Lungfish 140 Gb 0.7 %

Page 14: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Algebra of Cell Molecules

E ::= Name|E-E|E~{E,…,E}|(E) S ::= _|E|S+S

Names: proteins, gene binding sites, molecules, abstract processes…

- : binding operator for protein complexes, gene binding sites, …

Non associative, non commutative (could be in most cases)

~{…}: modification operator for phosphorylated sites, …

Associative, Commutative, Idempotent.

+ : solution operator, “soup aspect”, Assoc. Comm. Idempotent, Neutral _

No membranes, no transport formalized. Bitonal calculi [Cardelli 03].

Page 15: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Plan of the talk

1. Introduction

2. A simple algebra of cell molecules

3. Concurrent transition systems of biochemical reactions• Example of the mammalian cell cycle control

4. Temporal logic CTL as a query language• Computational results with BIOCHAM

5. Learning models• An experiment with inductive logic programming

6. Quantitative models• Simulation with differential equations

• Constraint-based model checking

7. Conclusion

Page 16: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

3. Concurrent Transition Syst. of Biochemical Reactions

Enzymatic reactions:

R ::= S=>S | S=[E]=>S | S=[R]=>S | S<=>S | S<=[E]=>S

(where A<=>B stands for A=>B B=>A and A=[C]=>B for A+C=>B+C, etc.)

define a concurrent transition system over integers denoting the multiplicity of the molecules (multiset rewriting).

One can associate a finite abstract CTS over boolean state variables denoting the presence/absence of molecules

which correctly over-approximates the set of all possible behaviors

If we translate a reaction A+B=>C+D by 4 rules for possible consumption:

A+BA+B+C+D A+BA+B +C+D

A+BA+B+C+D A+BA+B+C+D

Page 17: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Four Rule Schemas

Complexation: A + B => A-B

Cdk1+CycB => Cdk1–CycB

Phosphorylation: A =[C]=> A~{p}

Cdk1–CycB =[Myt1]=> Cdk1~{thr161}-CycB

Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

Synthesis: _ =[C]=> A.

_ =[Ge2-E2f13-Dp12]=> CycA

Degradation: A =[C]=> _.

CycE =[UbiPro]=> _ (not for CycE-Cdk2 which is stable)

Page 18: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

An Actin-Myosin Engine with ATP fuel

A two-stroke nano-engine: Myosin + ATP => Myosin-ATP

Myosin-ATP => Myosin + ADP

http://www.sci.sdsu.edu/movies http://www-rocq.inria.fr/sosso/icema2

Page 19: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Cell Cycle: G1 DNA Synthesis G2 Mitosis

G1: CdK4-CycD

Cdk6-CycD

Cdk2-CycE

S: Cdk2-CycA

G2

M: Cdk1-CycA

Cdk1-CycB

Page 20: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Mammalian Cell Cycle Control Map [Kohn 99]

Page 21: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Kohn’s map detail for Cdk2

Complexation with CycA and CycE Phosphorylation sites PY15 and P

Concurrent Transition Rules:

cdk2+cycA => cdk2-cycA.

cdk2~{p2}+cycA => cdk2~{p2}-cycA.

cdk2~{p1}+cycA => cdk2~{p1}-cycA.

cdk2~{p1,p2}+cycA => cdk2~{p1,p2}-cycA.

cdk2+cycE => cdk2-cycE.

cdk2+cycE~{p1} => cdk2-cycE~{p1}.

cdk2~{p2}+cycE => cdk2~{p2}-cycE.

700 rules, 165 proteins and genes, 500 variables, 2500 states.

Page 22: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Translation in Prolog

Encode states with a single predicate p(A,B,C,D,E)

A+BC+D. p(1,1,_,_,E):-p(_,_,1,1,E).

C A. p(_,B,1,D,E):- p(1,B,_,D,E).

Thm. [Delzanno-Podelski 99] Predecessor(S) = TP(S)

Backward analysis by computing lfp(TP{p(x):-s}).

CLP-based Deductive Model Checker DMC [Delzanno-Podelski 99]

More efficient implementation using state-of-the-art symbolic model-checker NuSMV [Cimatti Clarke Giunchiglia Giunchiglia Pistore 02].

Page 23: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Plan of the talk

1. Introduction2. A simple algebra of cell molecules3. Concurrent transition systems of biochemical reactions

• Example of the mammalian cell cycle control4. Temporal logic CTL as a query language

• Computational results with BIOCHAM5. Learning models

• An experiment with inductive logic programming6. Quantitative models

• Simulation with differential equations• Constraint-based model checking

7. Conclusion

Page 24: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

4. Temporal Logic CTL as a Query Language

Computation Tree Logic

Time

Non-determinism E, A

F,G,U EF

EU

AG

Choice

Time

E

exists 

A

always

X

next time

EX() AX()

F

finally

EF()

AG()

AF()

liveness

G

globally

EG()

AF( )

AG()

safety

U

untilE (U ) A (U )

Page 25: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Kripke Structures

A Kripke structure K is a triple (S; R; L) where S is a set of states, and RSxS is a total relation.

s |= if is true in s,

s |= E if there is a path from s such that |= ,

s |= A if for every path from s, |= ,

|= if s |= where s is the starting state of ,

|= X if 1 |= ,

|= F if there exists k >0 such that k |= ,

|= G if for every k >0, k |= ,

|= U iff there exists k>0 such that k |= for all j < k j |= Following [Emerson 90] we identify a formula to the set of states which

satisfy it ~ {sS : s |= }.

Page 26: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Symbolic Model Checking

Model Checking is an algorithm for computing, in a given finite Kripke structure the set of states satisfying a CTL formula: {sS : s |= }.

Basic algorithm: represent K as a graph and iteratively label the nodes with the subformulas of which are true in that node.

Add to the states satisfying Add EF (EX ) to the (immediate) predecessors of states labeled by Add E( U ) to the predecessor states of while they satisfy Add EG to the states for which there exists a path leading to a non

trivial strongly connected component of the subgraph of states satisfying

Symbolic model checking: use OBDDs to represent states and transitions as boolean formulas (S is finite).

Page 27: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Biological Queries (1/3)

About reachability:

• Given an initial state init, can the cell produce some protein P? init EF(P)

• Which are the states from which a set of products P1,. . . , Pn can be produced simultaneously? EF(P1^…^Pn)

About pathways:

• Can the cell reach a state s while passing by another state s2? init EF(s2^EFs)

• Is state s2 a necessary checkpoint for reaching state s? EF(s2U s)

• Is it possible to produce P without using nor creating Q? EF(Q U s)• Can the cell reach a state s without violating some constraints c? init EF(cUs)

Page 28: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Biological Queries (2/3)

About stability:

• Is a certain (partially described) state s a stable state? sAG(s) sAG(s) (s denotes both the state and the formula describing it).

• Is s a steady state (with possibility of escaping) ? sEG(s)

• Can the cell reach a stable state? initEF(AG(s))not a LTL formula.

• Must the cell reach a stable state? initAF(AG(s))

• What are the stable states? Not expressible in CTL [Chan 00].

• Can the system exhibit a cyclic behavior w.r.t. the presence of P ? init EG((P EF P) ^ (P EF P))

Page 29: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Biological Queries (3/3)

About the correctness of the model:

• Can one see the inaccuracies of the model and correct them?

Exhibit a counterexample pathway or a witness. Suggest refinements of the model or biological experiments to validate/invalidate the property of the model.

About durations:

• How long does it take for a molecule to become activated?

• In a given time, how many Cyclins A can be accumulated?

• What is the duration of a given cell cycle’s phase?

CTL operators abstract from durations. Time intervals can be modeled in FO by adding numerical arguments for start times and durations.

Page 30: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Cell to Cell Signaling by Hormones and Receptors

Receptor Tyrosine Kinase RTKRAF + RAFK -> RAF-RAFK

RAFp + RAFPH -> RAFp-RAFPH

MEKp + RAFp -> MEKp-RAFp

RAF-RAFK -> RAF + RAFK.

RAFp-RAFPH -> RAFp + RAFPH.

MEKp-RAFp -> MEKp + RAFp.

RAF-RAFK -> RAFK + RAFp.

RAFp-RAFPH -> RAF + RAFPH.

MEKp-RAFp -> MEKpp + RAFp.

Page 31: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Cell to Cell Signaling by Hormones and Receptors

Receptor Tyrosine Kinase RTKRAF + RAFK -> RAF-RAFK

RAFp + RAFPH -> RAFp-RAFPH

MEKp + RAFp -> MEKp-RAFp

RAF-RAFK -> RAF + RAFK.

RAFp-RAFPH -> RAFp + RAFPH.

MEKp-RAFp -> MEKp + RAFp.

RAF-RAFK -> RAFK + RAFp.

RAFp-RAFPH -> RAF + RAFPH.

MEKp-RAFp -> MEKpp + RAFp.

MEKp is a checkpoint for the cascade (producing MAPKpp)

?- nusmv(!(E(!(MEKp) U MAPKpp))).

true

The PH complexes are only here to "slow down" the cascade

?- nusmv(E(!(MEKp-MEKPH) U MAPKpp)).

true

Page 32: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Cell Cycle: G1 DNA Synthesis G2 Mitosis

G1: CdK4-CycD

Cdk6-CycD

Cdk2-CycE

S: Cdk2-CycA

G2

M: Cdk1-CycA

Cdk1-CycB

Page 33: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Mammalian Cell Cycle Control Benchmark

700 rules, 165 proteins and genes, 500 variables, 2500 states.

BIOCHAM NuSMV model-checker time in seconds:

Initial state G2 Query: Time:

compiling 29

Reachability G1 EF CycE 2

Reachability G1 EF CycD 1.9

Reachability G1 EF PCNA-CycD 1.7

Checkpoint

for mitosis complex

EF ( Cdc25~{Nterm}

U Cdk1~{Thr161}-CycB)

2.2

Cycle EG ( (CycA EF CycA) ( CycA EF CycA))

31.8

Page 34: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Plan of the talk

1. Introduction

2. A simple algebra of cell molecules

3. Concurrent transition systems of biochemical reactions

• Example of the mammalian cell cycle control4. Temporal logic CTL as a query language

• Computational results with BIOCHAM5. Learning models

• An experiment with inductive logic programming6. Quantitative models

• Simulation with differential equations• Constraint-based model checking

7. Conclusion

Page 35: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

5. Learning Models

Basic idea: learn reaction rules from temporal properties of the system.

Learning of yeast cell cycle rules from reachability properties and counterexamples with Progol [Muggleton 00].

reaction([m_CP,m_Y],[m_pM]).

reaction([m_CP],[m_C2]).

% reaction([m_pM],[m_M]).

reaction([m_M],[m_C2,m_YP]).

reaction([m_C2],[m_CP]).

reaction([m_YP],[]).

reaction([],[m_Y]).

pathway(S1,S2) :- same(S1,S2).

pathway(S1,S2) :- reaction(L1,L2), transition(S1,L1,S3,L2), pathway(S3,S2).

Page 36: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Inductive Logic Programming

reaction([m_pM],[m_M]) learned…

6th PCRD APRIL 2 “Applications of Probabilistic Inductive Logic Progr.” Luc de Raedt, Univ. Freiburg, Stephen Muggleton, Univ. London.

pathway([m_CP,m_Y],[m_M]).

pathway([m_CP,m_Y],[m_M,m_pM]).

pathway([m_CP,m_Y],[m_M,m_Y]).

pathway([m_CP,m_Y],[m_M,m_Y,m_pM]).

pathway([m_CP,m_Y],[m_M,m_CP]).

pathway([m_CP,m_Y],[m_M,m_CP,m_Y]).

pathway([m_CP,m_Y],[m_M,m_CP,m_pM]).

pathway([m_CP,m_Y],[m_M,m_CP,m_Y,m_pM]).

pathway([m_pM],[m_C2,m_YP]).

pathway([m_pM],[m_M,m_C2,m_YP]).

pathway([m_pM],[m_pM,m_C2,m_YP]).

pathway([m_pM],[m_M,m_pM,m_C2,m_YP]).

:-pathway([],[m_C2]).

:-pathway([],[m_CP]).

:-pathway([],[m_C2,m_CP]).

:-pathway([],[m_M]).

:-pathway([],[m_YP]).

:-pathway([],[m_YP, m_Y]).

:-pathway([],[m_Y,m_pM]).

:-pathway([],[m_CP,m_pM]).

:-pathway([],[m_Y,m_M]).

:-pathway([m_CP, m_C2],[m_YP]).

:-pathway([m_CP],[m_YP]).

:-pathway([m_C2],[m_YP]).

:-pathway([m_Y],[]).

Page 37: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Plan of the talk

1. Introduction

2. A simple algebra of cell molecules

3. Concurrent transition systems of biochemical reactions

• Example of the mammalian cell cycle control4. Temporal logic CTL as a query language

• Computational results with BIOCHAM5. Learning models

• An experiment with inductive logic programming6. Quantitative models

• Simulation with differential equations• Constraint-based model checking

7. Conclusion

Page 38: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

6. Quantitative Models

Enzymatic reactions with rates k1 k2 k3

E+S k1 C k2 E+P

E+S k3 C

can be compiled by the law of mass action into a system of

Ordinary Differential Equations

dE/dt = -k1ES+(k2+k3)C

dS/dt = -k1ES+k3C

dC/dt = k1ES-(k2+k3)C

dP/dt = k2C

Page 39: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Circadian Cycle Model

C' = -(k1*C)-k4*C-kdC*C +k2*CN+k3*P2*T2

CN' = k1*C-k2*CN-kdN*CN

MP' = (KIP^n*nusP)/(KIP^n+CN^n)

-kd* MP-(numP*MP)/(KmP+MP)

MT' = (KIT^n*nusT)/(KIT^n+CN^n)

-MT[ t]*(kd+numT/(KmT+MT))

P0' = ksP*MP-kd*P0-(V1P*P0)/( K1P+P0)

+(V2P*P1)/(K2P+P1)

P1' = (V1P*P0)/(K1P+P0)-kd*P1 -(V2P*P1)/(K2P+P1)

-(V3P*P1)/( K3P+P1)+(V4P*P2)/(K4P+P2)

P2' = k4*C+(V3P*P1)/(K3P+P1) -kd*P2-(V4P*P2)/(K4P+P2)

-(nudP*P2)/(KdP+P2)-k3*P2*T2

T0' = ksT*MT-kd*T0-(V1T*T0)/( K1T+T0)+(V2T*T1)/(K2T+T1)

T1' = (V1T*T0)/(K1T+T0)-kd*T1 -(V2T*T1)/(K2T+T1)-(V3T*T1)/( K3T+T1)+(V4T*T2)/(K4T+T2)

T2' = k4*C+(V3T*T1)/(K3T+T1) -k3*P2*T2-(V4T*T2)/(K4T+T2) -T2*(kd+nudT/(KdT+T2))

Page 40: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Gene Interaction Networks

Gene interaction example [Bockmayr-Courtois 01]

Hybrid Concurrent Constraint Programming HCC [Saraswat et al.]

2 genes x and y.

dx/dt = 0.01 – 0.02*x if y < 0.8

dx/dt = – 0.02*x if y ≥ 0.8

dy/dt = 0.01*x

Page 41: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Concurrent Transition System

Time discretized using Euler’s method (Runge-Kutta method in HCC):

y < 0.8 x’ = x + dt*(0.01-0.02*x) , y’ = y + dt*0.01*x

y ≥ 0.8 x’ = x + dt*(0.01-0.02*x) , y’ = y + dt*0.01*x

Initial condition: x=0, y=0.

CLP(R) programInit :- X=0, Y=0, p(X,Y).

p(X,Y):-X>=0, Y>=0, Y<0.8,

X1=X-0.02*X+0.01, Y1=Y+0.01*X, p(X1,Y1).

p(X,Y):-X>=0, Y>=0, Y>=0.8,

X1=X-0.02*X, Y1=Y+0.01*X, p(X1,Y1).

Page 42: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Proving CTL properties by computing fixpoints of CLP programs

Theorem [Delzanno Podelski 99]

EF()=lfp(TP{p(x):-),

EG()=gfp(TP).

Safety property AG() iff EF() iff initlfp(TP{)

Liveness property AG(1AF(2)) iff initlfp(TPgfp(TP{)

Prolog-based implementation in CLP(R,B) [Delzanno 00]

Applications to life in silico:

Proof of protocols, cache consistency, etc. [Delzanno 01]

Page 43: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Deductive Model Checker DMC: Gene Interaction

r(init, p(s_s,A,B), {A=0,B=0}).r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0.8,C=A-

0.02*A,D=B+0.01*A}).r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0,B<0.8, C=A-0.02*A+0.01,D=B+0.01*A}).| ?- prop(P,S).P = unsafe, S = p:s*(x>=0.6) | ?- ti.Property satisfied. Execution time 0.0 | ?- ls.s(0, p(s_s,A,_), {A>=0.6}, 1, (0,0)).

Page 44: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Demonstration DMC (continued)

| ?- prop(P,S).P = unsafe, S = p:s*(x>=0.2) ?| ?- ti. Property NOT satisfied. Execution time 1.5| ?- ls.s(0, p(s_s,A,_), {A>=0.2}, 1, (0,0)).s(1, p(s_s,A,B), {B<0.8,B>=-0.0,A>=0.19387755102040816}, 2,

(2,1)).…s(26, p(s_s,A,B), {B>=0.0,A>=0.0, B+0.1982676351105516*A<0.7741338175552753}, 27,

(2,26)). s(27, init, {}, 28, (1,27)).

Page 45: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

7. Conclusion

The great ambition of logic programming is to make of programming a modeling task in the first place, with equations, constraints and logical formulae.

In this respect, computational molecular biology offers numerous challenges to the logic programming community at large.

Besides combinatorial search and optimization problems coming from molecular biology (DNA and protein sequence comparison, protein structure prediction,…) there is a need to model globally the system at hand and automate reasoning on all its possible behaviors.

Page 46: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Conclusion

The biochemical abstract machine BIOCHAM project aims at developing:

Qualitative models of complex biochemical processes:• Intracellular and extracellular signaling, cell-cycle control,…

[http://contraintes.inria.fr/CMBSlib]

• Prolog-based implementation + BDD symbolic model-checking • ILP-based learning of models from temporal properties [6thPCRD APRIL 2]

Membranes and transportation not modeledBitonal algebras [Cardelli et al. 03] BioAmbients, Brane calculi [Cardelli et al. 03]

Page 47: François Fages ICLP December 2003 Symbolic model checking of biochemical systems Logic programming steps towards formal biology François Fages, INRIA Rocquencourt

François Fages ICLP December 2003

Perspectives for LP

Quantitative models:

• Differential equations

• Hybrid discrete-continuous time models• Hybrid concurrent constraint programming [Bockmayr-Courtois-Eveillard 03]

• CLP-based model-checking [Delzanno-Podelski 99] [Chabrier-Fages 03]

Multi-scale molecular-electro-physiological models [Sorine et al. 03]

http://www-rocq.inria.fr/sosso/icema2

http://www.sci.sdsu.edu/movies