françois fages new delhi, dec.. 2007 formal verification and inference of biochemical models...

62
François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA Paris-Rocquencourt, France Main idea: to tackle the complexity of biological systems investigate Programming Theory Concepts Formal Methods of Circuit and Program Verification Automated Reasoning Tools Prototype Implementation in the Biochemical Abstract Machine BIOCHAM modeling environment available at http://contraintes.inria.fr/BIOCHAM

Upload: cecil-maxwell

Post on 18-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Formal Verification and Inference of Biochemical Models

François FagesConstraint Programming project-team,

INRIA Paris-Rocquencourt, France

Main idea: to tackle the complexity of biological systems investigate• Programming Theory Concepts• Formal Methods of Circuit and Program Verification• Automated Reasoning Tools

Prototype Implementation in the Biochemical Abstract Machine BIOCHAMmodeling environment available at http://contraintes.inria.fr/BIOCHAM

Page 2: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Systems Biology ?

“Systems Biology aims at systems-level understanding [which]

requires a set of principles and methodologies that links the

behaviors of molecules to systems characteristics and functions.”

H. Kitano, ICSB 2000

• Analyze (post-)genomic data produced with high-throughput technologies (stored in databases like GO, KEGG, BioCyc, etc.);

• Integrate heterogeneous data about a specific problem;

• Understand and Predict behaviors or interactions in large networks of genes and proteins.

Systems Biology Markup Language (SBML) : exchange format for reaction models

Page 3: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Issue of Abstraction in Systems Biology

Models are built in Systems Biology with two contradictory perspectives :

Page 4: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Issue of Abstraction in Systems Biology

Models are built in Systems Biology with two contradictory perspectives :

1) Models for representing knowledge : the more concrete the better

Page 5: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Issue of Abstraction in Systems Biology

Models are built in Systems Biology with two contradictory perspectives :

1) Models for representing knowledge : the more concrete the better

2) Models for making predictions : the more abstract the better !

Page 6: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Issue of Abstraction in Systems Biology

Models are built in Systems Biology with two contradictory perspectives :

1) Models for representing knowledge : the more concrete the better

2) Models for making predictions : the more abstract the better !

These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions.

To understand a system is not to know everything about it but to know

abstraction levels that are sufficient for answering questions about it

Page 7: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Semantics of Living Processes ?

Formally, “the” behavior of a system depends on our choice of observables.

? ?

Mitosis movie [Lodish et al. 03]

Page 8: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Boolean Semantics

Formally, “the” behavior of a system depends on our choice of observables.

Presence/absence of molecules over time

Boolean transitions

0 1

Page 9: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Continuous Differential Semantics

Formally, “the” behavior of a system depends on our choice of observables.

Concentrations of molecules over time

Rates of reactions

x ý

Page 10: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Stochastic Semantics

Formally, “the” behavior of a system depends on our choice of observables.

Numbers of molecules over time

Probabilities of reaction

n

Page 11: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Temporal Logic Semantics

Formally, “the” behavior of a system depends on our choice of observables.

Presence/absence of molecules over time

Temporal logic formulas

F xF x

F (x ^ F ( x ^ y))

FG (x v y)

Page 12: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Constraint Temporal Logic Semantics

Formally, “the” behavior of a system depends on our choice of observables.

Concentrations of molecules over time

Constraint LTL temporal formulas

F x>1F (x >0.2)

F (x >0.2 ^ F (x<0.1 ^ y>0.2))

FG (x>0.2 v y>0.2)

Page 13: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Language-based Approaches to Cell Systems Biology

Qualitative models: from diagrammatic notation to• Boolean networks [Kaufman 69, Thomas 73]

• Petri Nets [Reddy 93, Chaouiya 05]

• Process algebra π–calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]

• Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]

• Reaction rules [Chabrier-Fages 03] [Chabrier-Chiaverini-Danos-Fages-Schachter 04]

Quantitative models: from ODEs and stochastic simulations to• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]

• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] HCC [Bockmayr-Courtois 01]

• Stochastic π–calculus [Priami et al. 03] [Cardelli et al. 06]

• Reaction rules with continuous time dynamics [Fages-Soliman-Chabrier 04]

Page 14: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Overview of the Talk

1. Rule-based Modeling of Biochemical Systems 1. Syntax of molecules, compartments and reactions

2. Semantics at three abstraction levels: boolean, differential, stochastic

3. Cell cycle control models

2. Temporal Logic Language for Formalizing Biological Properties1. CTL for the boolean semantics

2. LTL with constraints for the differential semantics

3. Automated Reasoning Tools1. Inferring kinetic parameter values from Constraint-LTL specification

2. Inferring reaction rules from CTL specificationL. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.

F. Fages, S. Soliman. CMSB’06.

F. Fages, A. Rizk CMSB’07.

Page 15: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Syntax of proteins

Cyclin dependent kinase 1 Cdk1

(free, inactive)

Complex Cdk1-Cyclin B Cdk1–CycB

(low activity)

Phosphorylated form Cdk1~{thr161}-CycB

at site threonine 161

(high activity)

mitosis promotion factor

Page 16: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Elementary Reaction Rules

Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB

Page 17: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Elementary Reaction Rules

Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB

Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A

Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB

Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

Page 18: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Elementary Reaction Rules

Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB

Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A

Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB

Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _

(not for cycE-cdk2 which is stable)

Page 19: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Elementary Reaction Rules

Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB

Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A

Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB

Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _

(not for cycE-cdk2 which is stable)

Transport: A::L1 => A::L2Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus

Page 20: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

From Syntax to Semantics

R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C

A <=> B stands for A=>B and B=>A, etc.

| kinetic for R

SBML : import/export exchange format, no semantics

Page 21: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

From Syntax to Semantics

R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C

A <=> B stands for A=>B and B=>A, etc.

| kinetic for R

SBML : import/export exchange format, no semantics

BIOCHAM : three abstraction levels

1. Boolean Semantics: presence-absence of molecules 1. Concurrent Transition System (asynchronous, non-deterministic)

2. Differential Semantics: concentration 1. Ordinary Differential Equations or Hybrid system (deterministic)

• Stochastic Semantics: number of molecules • Continuous time Markov chain

Page 22: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Hierarchy of Semantics

Stochastic model

Differential model

Discrete model

abstraction

concretization

Boolean model

Theory of abstract Interpretation

Abstractions as Galois connections

[Cousot Cousot POPL’77]

[Fages Soliman CMSB’06,TCSc’07]

Syntactical

model

Page 23: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Boolean Semantics

Associate to each molecule a Boolean denoting its presence/absence.

Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants:

A+BA+B+C+D

A+BA+B +C+D

A+BA+B+C+D

A+BA+B+C+D

Necessary for over-approximating the possible behaviors under the stochastic/discrete semantics (abstraction N {zero, non-zero})

Only the last transition is considered in Pathway logic [Lincoln et al. 02] or Petri nets [Reddy93, Heiner 05, Chaouiya 05]

Page 24: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Type Checking / Inference by Abstract Interpretation

Stochastic model

Differential model

Discrete model

abstraction

concretization

Boolean model

Syntactical

model

[Fages Soliman CMSB’06]

Influence graph of proteins

Protein functions

(kinase, phosphatase,…)

Compartments topology

Influence graph of proteins

(activation/inhibition)

Page 25: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Hierarchy of Analyses

Stochastic model

Differential model

Discrete model

abstraction

concretization

Boolean model

Syntactical

model

Jacobian circuit analysis

Discrete circuit analysis

Boolean circuit analysisabstraction

abstraction

abstraction

Positive circuits are a

necessary condition

for multistability

[Thomas 81] [Soulé 03]

[Remy Ruet Thieffry 05]

[Kauffman 73]

Page 26: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Example: Cell Cycle Control Model [Tyson 91]

MA(k1) for _ => Cyclin.

MA(k2) for Cyclin => _.

MA(K7) for Cyclin~{p1} => _.

MA(k8) for Cdc2 => Cdc2~{p1}.

MA(k9) for Cdc2~{p1} =>Cdc2.

MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.

MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.

k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for

Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.

MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.

MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.

Page 27: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Interaction Graph

Page 28: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Stochastic Simulation

Page 29: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Differential Simulation

Page 30: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Boolean Simulation

Page 31: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Automatic Generation of True CTL PropertiesEi(reachable(Cyclin)))Ei(reachable(!(Cyclin))))Ai(oscil(Cyclin)))Ei(reachable(Cdc2~{p1})))Ei(reachable(!(Cdc2~{p1}))))Ai(oscil(Cdc2~{p1})))Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))Ei(reachable(Cdc2-Cyclin~{p1,p2})))Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))Ai(oscil(Cdc2-Cyclin~{p1,p2})))Ei(reachable(Cdc2-Cyclin~{p1})))Ei(reachable(!(Cdc2-Cyclin~{p1}))))Ai(oscil(Cdc2-Cyclin~{p1})))Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))Ei(reachable(Cdc2)))Ei(reachable(!(Cdc2))))Ai(oscil(Cdc2)))Ei(reachable(Cyclin~{p1})))Ei(reachable(!(Cyclin~{p1}))))Ai(oscil(Cyclin~{p1})))Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))

Page 32: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Page 33: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Mammalian Cell Cycle Control Map [Kohn 99]

Page 34: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Cell Cycle Model-Checking (with BDD NuSMV)

biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}). Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is truebiocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB). Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is truebiocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB))))). Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is falsebiocham: why.-- Loop starts here cycB-cdk1~{p1,p2,p3} is present cdk7 is present cycH is present cdk1 is present Myt1 is present cdc25C~{p1} is presentrule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}. cycB-cdk1~{p2,p3} is present cycB-cdk1~{p1,p2,p3} is absentrule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}. cycB-cdk1~{p2,p3} is absent cycB-cdk1~{p1,p2,p3} is present

Page 35: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Mammalian Cell Cycle Control Benchmark

165 genes and proteins. 500 variables, 2500 states. 800 rules.

BIOCHAM NuSMV model-checker time in sec. [Chabrier et al. TCS 04]

Initial state G2 Query: Time:

compiling 29 s

Reachability G1 EF CycE 2 s

Reachability G1 EF CycD 1.9 s

Reachability G1 EF PCNA-CycD 1.7 s

Checkpoint

for mitosis complex

EF ( Cdc25~{Nterm}

U Cdk1~{Thr161}-CycB)

2.2 s

Oscillation EG ( (CycA => EF CycA) ^

( CycA => EF CycA))

31.8 s

Page 36: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Temporal Logic with Constraints over the Reals

• Constraints over concentrations and derivatives as FOL formulae over the reals:

• [M] > 0.2

• [M]+[P] > [Q]

• d([M])/dt < 0

Page 37: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Temporal Logic with Constraints over the Reals

• Constraints over concentrations and derivatives as FOL formulae over the reals:

• [M] > 0.2

• [M]+[P] > [Q]

• d([M])/dt < 0

• Linear Time Logic LTL operators for time X, F, U, G• F([M]>0.2)

• FG([M]>0.2)

• F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0))))

• oscil(M,n) defined as at least n alternances of sign of the derivative

• Period(A,75)= t v F(T = t & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)

& F(T = t + 75 & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)))…

Page 38: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Example: Cell Cycle Control Model [Tyson 91]

MA(k1) for _ => Cyclin.

MA(k2) for Cyclin => _.

MA(K7) for Cyclin~{p1} => _.

MA(k8) for Cdc2 => Cdc2~{p1}.

MA(k9) for Cdc2~{p1} =>Cdc2.

MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.

MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.

k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for

Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.

MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.

MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.

Page 39: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

3.1 Inferring Parameters from Temporal Properties

biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,

oscil(Cdc2-Cyclin~{p1},3),150).

Page 40: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

3.1 Inferring Parameters from Temporal Properties

biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,

oscil(Cdc2-Cyclin~{p1},3),150).

First values found :

parameter(k3,10).

parameter(k4,70).

Page 41: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

3.1 Inferring Parameters from Temporal Properties

biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,

oscil(Cdc2-Cyclin~{p1},3) & F([Cdc2-Cyclin~{p1}]>0.15), 150).

First values found :

parameter(k3,10).

parameter(k4,120).

Page 42: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

3.1 Inferring Parameters from LTL Specification

biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,

period(Cdc2-Cyclin~{p1},35), 150).

First values found:

parameter(k3,10).

parameter(k4,280).

Page 43: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Leloup and Goldbeter (1999)

MPF preMPF

Wee1

Wee1P

Cdc25

Cdc25PAPC

APC

....

....

........

Cell cycle

Linking the Cell and Circadian Cycles through Wee1

BMAL1/CLOCK

PER/CRY

Circadian cycle

Wee1 mRNA

L [L. Calzone, S. Soliman 2006]

Page 44: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

PCN

Wee1m

Wee1MPF

BN

Cdc25

Page 45: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

entrainmententrainment

Condition on Wee1/Cdc25 for Period Synchronization

Period synchronization constraint expressed in LTL with the period formula

Page 46: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Inferring Rules from Temporal Properties

Given

• a BIOCHAM model (background knowledge)

• a set of properties formalized in temporal logic

learn

• revisions of the reaction model, i.e. rules to delete and rules to add such that the revised model satisfies the properties

Page 47: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Model Revision from Temporal Properties

• Background knowledge T: BIOCHAM model • reaction rule language: complexation, phosphorylation, …

• Examples φ: biological properties formalized in temporal logic language• Reachability

• Checkpoints

• Stable states

• Oscillations

• Bias R: Reaction rule patterns or parameter ranges• Kind of rules to add or delete

Find a revision T’ of T such that T’ |= φ

Page 48: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Model Revision Algorithm

General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.

Anticipate whether one has to add or remove a rule.

• Positive ECTL formula: if false, remains false after removing a rule• EF(φ) where φ is a boolean formula (pure state description)

• Oscil(φ)

• Negative ACTL formula: if false, remains false after adding a rule• AG(φ) where φ is a boolean formula,

• Checkpoint(a,b): ¬E(¬aUb)

• Remove a rule on the path given by the model checker (why command)

• Unclassified CTL formulae

Page 49: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Theory Revision Algorithm Rules

Initial state: <(0, 0, 0), (E,U,A), R>

E transition: <(E,U,A), (E{e},U,A), R> <(E{e},U,A), (E,U,A),R> if R |= e

E’ transition: <(E,U,A), (E {e},U,A), R> <(E {e},U,A), (E,U,A),R {r}>

if R |≠ e and f {e} E U A, K {r} |= f

Page 50: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Theory Revision Algorithm Rules

Initial state: <(0, 0, 0), (E,U,A), R>E transition: <(E,U,A), (E{e},U,A), R> <(E{e},U,A), (E,U,A),R> if R |= eE’ transition: <(E,U,A), (E {e},U,A), R> <(E {e},U,A), (E,U,A),R {r}> if R |≠ e and f {e} E U A, K {r} |= fU transition: <(E,U,A), (0,U {u},A), R > <(E,U {u},A), (0,U,A),R> if R |= uU’ transition: <(E,U,A), (0,U {u},A), R > <(E,U {u},A), (0,U,A),R {r}> if R|≠u and f {u} E U A, R {r} |= fU” transition: <(E,U,A), (0,U {u},A), R Re > <(E,U {u},A),(0,U,A), R> if K, si|≠u and f {u} E U A, R |= f

Page 51: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Theory Revision Algorithm Rules

Initial state: <(0, 0, 0), (E,U,A), R>

E transition: <(E,U,A), (E{e},U,A), R> <(E{e},U,A), (E,U,A),R> if R |= e

E’ transition: <(E,U,A), (E {e},U,A), R> <(E {e},U,A), (E,U,A),R {r}>

if R |≠ e and f {e} E U A, K {r} |= f

U transition: <(E,U,A), (0,U {u},A), R > <(E,U {u},A), (0,U,A),R> if R |= u

U’ transition: <(E,U,A), (0,U {u},A), R > <(E,U {u},A), (0,U,A),R {r}>

if R|≠u and f {u} E U A, R {r} |= f

U” transition: <(E,U,A), (0,U {u},A), R Re > <(E,U {u},A),(0,U,A), R>

if K, si|≠u and f {u} E U A, R |= f

A transition: <(E,U,A), (0, 0,A {a}), R > <(E,U,A {a}), (Ep,Up,A),R> if R |= a

A’ transition: <(EEp,UUp,A),(0,0,A{a}), RRe><(E,U,A{a}),(Ep,Up,A),R> if R|≠ a, f {u} [ E U A, R |= f and Ep Up is the set of formulae no longer satisfied after the deletion of the rules in Re.

Page 52: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Termination

Proposition The model revision algorithm terminates.

Proof

The termination of the algorithm is proved by considering the lexicographic

ordering over the couple < a, n >

where a is the number of unsatisfied ACTL formulae,

and n is the number of unsatisfied ECTL and UCTL formulae.

Each transition strictly decreases a,

or lets a unchanged and strictly decreases n.

Page 53: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Correctness

Proposition If the terminal configuration is of the form < (E,U,A), (0,0,0), R > then the model R satisfies the initial CTL specification.

Proof

Each transition maintains only true formulae in the satisfied set,

and preserves the complete CTL specification

in the union of the satisfied set and the untreated set.

Page 54: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Incompleteness

Two reasons:

1) The satisfaction of ECTL and UCTL formula is searched by adding only one rule to the model (transition E’ and U’)

2) The Kripke structure associated to a Biocham set of rules adds loops on terminal states. Hence adding or removing a rule may have an opposite deletion or addition of those loops.

Page 55: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Rule Inference in Cell Cycle Control

[Tyson et al. 91] model over 6 variables,

initial state present(cdc2).

_=>Cyclin.

Cyclin=>_.

Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.Cyclin~{p1}=>_.Cdc2=>Cdc2~{p1}.Cdc2~{p1}=>Cdc2.

Page 56: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Automatic Generation of True CTL PropertiesEi(reachable(Cyclin)))Ei(reachable(!(Cyclin))))Ai(oscil(Cyclin)))Ei(reachable(Cdc2~{p1})))Ei(reachable(!(Cdc2~{p1}))))Ai(oscil(Cdc2~{p1})))Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))Ei(reachable(Cdc2-Cyclin~{p1,p2})))Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))Ai(oscil(Cdc2-Cyclin~{p1,p2})))Ei(reachable(Cdc2-Cyclin~{p1})))Ei(reachable(!(Cdc2-Cyclin~{p1}))))Ai(oscil(Cdc2-Cyclin~{p1})))Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))Ei(reachable(Cdc2)))Ei(reachable(!(Cdc2))))Ai(oscil(Cdc2)))Ei(reachable(Cyclin~{p1})))Ei(reachable(!(Cyclin~{p1}))))Ai(oscil(Cyclin~{p1})))Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))

Page 57: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Model Compression

biocham: reduce_model.1: deleting Cyclin=>_2: deleting Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}3: deleting Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}4: deleting Cdc2~{p1}=>Cdc2After reduction, 6 rules remain corresponding to the bias ? => ?Deletion(s):Cyclin=>_.Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.Cdc2~{p1}=>Cdc2.

Page 58: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Rule Deletion

biocham: delete_rules(Cdc2 => Cdc2~{p1}).

biocham: check_all.

First formula not satisfied

Ei(EF(Cdc2-Cyclin~{p1}))

Page 59: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Model Revision from Temporal Properties

biocham: revise_model.1: adding Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}

2: adding Cdc2=>_

2: backtracking on previous add -> deleting Cdc2=>_

2: adding Cdc2=[Cyclin]=>_

2: backtracking on previous add -> deleting Cdc2=[Cyclin]=>_

2: adding Cdc2=[Cdc2-Cdc2~{p1}]=>_

3: adding Cdc2=>Cdc2~{p1}

4: deleting Cdc2=[Cdc2-Cdc2~{p1}]=>_

5: deleting Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}

Modifications found

Deletion(s)

Addition(s): Cdc2=>Cdc2~{p1}.

Page 60: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Search for all Solutions

biocham: learn_one_addition(elementary_interaction_rules).Time: 5.00 sRules tested: 112Solutions found: 3Cdc2=>Cdc2~{p1}Cdc2=[Cdc2]=>Cdc2~{p1}Cdc2=[Cyclin]=>Cdc2~{p1}

Page 61: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Conclusion

• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems

• SBML models are interpreted under three semantics in BIOCHAM :

Boolean semantics CTL formulas (rule learning)

Differential semantics LTL with constraints over reals (parameter search)

Stochastic semantics Probabilistic CTL with integer constraints

• Parameter search from temporal properties proved useful and complementary to bifurcation theory tools (Xppaut)

• Rule inference from temporal properties will benefit from types (e.g. protein functions, influences …) to better prune the search.

Page 62: François Fages New Delhi, Dec.. 2007 Formal Verification and Inference of Biochemical Models François Fages Constraint Programming project-team, INRIA

François Fages New Delhi, Dec.. 2007

Collaborations

EU STREP APrIL2 : Stephen Muggleton, IC, Luc de Raedt, U. Freiburg,…

• Learning in a probabilistic logic setting (finished)

INRIA ARC MOCA :

• modularity, compositionality and abstraction

EU NoE REWERSE : semantic web, François Bry, Münich, R. Backofen,

• Connecting Biocham to gene and protein ontologies (types)

EU STREP TEMPO : Cancer chronotherapies, INSERM Villejuif, F. Lévi; INRIA J. Clairambault, S. Soliman

• Coupled models of cell cycle, circadian cycle, cytotoxic drugs.

INRA Tours : E. Reiter, D. Heitzler, INRIA F. Clément

• Model of FSH signaling.