the calculus of looping sequences for modeling biological...

52
The Calculus of Looping Sequences for Modeling Biological Membranes Roberto Barbuti Andrea Maggiolo–Schettini Paolo Milazzo Angelo Troina Dipartimento di Informatica, Universit` a di Pisa, Italy Thessaloniki – June, 2007 Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 1 / 52

Upload: others

Post on 17-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

The Calculus of Looping Sequences for ModelingBiological Membranes

Roberto Barbuti Andrea Maggiolo–SchettiniPaolo Milazzo Angelo Troina

Dipartimento di Informatica, Universita di Pisa, Italy

Thessaloniki – June, 2007

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 1 / 52

Page 2: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 2 / 52

Page 3: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Cells: complex systems of interactive components

Two classifications of cell:I procaryoticI eucaryotic

Main actors:I membranesI proteinsI DNA/RNA strandsI other molecules

Interaction networks:I metabolic pathwaysI signaling pathwaysI gene regulatory networks

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 3 / 52

Page 4: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Example of interaction networks: the EGF pathway

EGF

EGFR

SHC

CELL MEMBRANE

nucleus

phosphorylations

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 4 / 52

Page 5: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Example of interaction networks: the lac operon

i p o z y a

DNA

mRNA

proteinslac Repressor beta-gal. permease transacet.

R

i p o z y a

R RNAPolime- rase

NO TRANSCRIPTION

a)

i p o z y a

R

RNAPolime- rase

TRANSCRIPTION

b)

LACTOSE

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 5 / 52

Page 6: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 6 / 52

Page 7: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

The Calculus of Looping Sequences (CLS)

We assume an alphabet E . Terms T and Sequences S of CLS are givenby the following grammar:

T ::= S∣∣ (

S)L c T

∣∣ T | TS ::= ε

∣∣ a∣∣ S · S

where a is a generic element of E , and ε is the empty sequence.

The operators are:S · S : Sequencing(S)L

: Looping (S is closed and it can rotate)T1 c T2 : Containment (T1 contains T2)

T |T : Parallel composition (juxtaposition)

Actually, looping and containment form a single binary operator(S)L c T .

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 7 / 52

Page 8: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Example of Terms

(i)(a · b · c

)L c ε

(ii)(a · b · c

)L c(d · e

)L c ε

(iii)(a · b · c

)L c (f · g |(d · e

)L c ε)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 8 / 52

Page 9: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Structural Congruence

The Structural Congruence relations ≡S and ≡T are the leastcongruence relations on sequences and on terms, respectively, satisfyingthe following rules:

S1 · (S2 · S3) ≡S (S1 · S2) · S3 S · ε ≡S ε · S ≡S S

T1 | T2 ≡T T2 | T1 T1 | (T2 | T3) ≡T (T1 | T2) | T3

T | ε ≡T T(ε)L c ε ≡T ε

(S1 · S2

)L c T ≡T

(S2 · S1

)L c T

We write ≡ for ≡T .

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 9 / 52

Page 10: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS Patterns

Let us consider variables of three kinds:

term variables (X ,Y ,Z , . . .)

sequence variables (x , y , z , . . .)

element variables (x , y , z , . . .)

Patterns P and Sequence Patterns SP of CLS extend CLS terms andsequences with variables:

P ::= SP∣∣ (

SP)L c P

∣∣ P | P∣∣ X

SP ::= ε∣∣ a

∣∣ SP · SP∣∣ x

∣∣ x

where a is a generic element of E , ε is the empty sequence, and x , x and Xare generic element, sequence and term variables

The structural congruence relation ≡ extends trivially to patterns

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 10 / 52

Page 11: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Rewrite Rules

Pσ denotes the term obtained by replacing any variable in T with thecorresponding term, sequence or element.

Σ is the set of all possible instantiations σ

A Rewrite Rule is a pair (P,P ′), denoted P 7→ P ′, where:

P,P ′ are patterns

variables in P ′ are a subset of those in P

A rule P 7→ P ′ can be applied to all terms Pσ.

Example: a · x · a 7→ b · x · b

can be applied to a · c · a (producing b · c · b)

cannot be applied to a · c · c · a

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 11 / 52

Page 12: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Formal Semantics

Given a set of rewrite rules R, evolution of terms is described by thetransition system given by the least relation → satisfying

P 7→ P ′ ∈ R Pσ 6≡ ε T ≡ Pσ T ′ ≡ P ′σ

T → T ′

T → T ′

T | T ′′ → T ′ | T ′′T → T ′(

S)L c T →

(S)L c T ′

and closed under structural congruence ≡.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 12 / 52

Page 13: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Biomolecular entities as CLS terms

Biomolecular Entity CLS Term

Gene, protein domain, Alphabet symbolmacro molecule, . . .

DNA strand Sequence of elementsrepresenting genes

RNA strand Sequence of elementsrepresenting transcribed genes

Protein Single alphabet symbols orsequence of elementsrepresenting domains

Molecular population Parallel composition of molecules

Membrane Looping sequence

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 13 / 52

Page 14: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Biomolecular events as CLS rewrite rules

Biomolecular Event Examples of CLS Rewrite Rule

Complexation a | b 7→ c x · a · y | b 7→ x · c · yCatalysis c | P1 7→ c | P2

where P1 7→ P2 is the catalyzed event

Complexation(a · x · b · y

)L c X 7→(c · x · y

)L c X

on membrane a |(b · x

)L c X 7→(c · x

)L c X

Membrane crossing a |(x)L c X 7→

(x)L c (a | X )

x · a · y |(z)L c X 7→

(z)L c (x · a · y | X )

Membrane joining(x)L c (a | X ) 7→

(a · x

)L c X(x)L c (y · a · z | X ) 7→

(y · a · z · x

)L c X

Catalyzed(a · x

)L c (X ) |(b · y

)L c (Y ) 7→membrane fusion

(a · x · b · y

)L c (X | Y )

Catalyzed(a · x · b · y

)L c (X | Y ) 7→membrane division

(a · x

)L c (X ) |(b · y

)L c (Y )

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 14 / 52

Page 15: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: the EGF pathway (1)

EGF

EGFR

SHC

CELL MEMBRANE

nucleus

phosphorylations

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 15 / 52

Page 16: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: the EGF pathway (2)First steps of the EGF signaling pathway up to the binding of thesignal-receptor dimer to the SHC protein

The EGFR,EGF and SHC proteins are modeled as the alphabetsymbols EGFR, EGF and SHC , respectively

The cell is modeled as a looping sequence (representing its externalmembrane):

EGF | EGF |(EGFR · EGFR · EGFR · EGFR

)L c (SHC | SHC )

Rewrite rules modeling the first steps of the pathway:

EGF |(EGFR · x

)L c X 7→(CMPLX · x

)L c X (R1)(CMPLX · x · CMPLX · y

)L c X 7→(DIM · x · y

)L c X (R2)(DIM · x

)L c X 7→(DIMp · x

)L c X (R3)(DIMp · x

)L c (SHC | X ) 7→(DIMpSHC · x

)L c X (R4)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 16 / 52

Page 17: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: the EGFR pathway (2)

A possible evolution of the system:

EGF | EGF |(EGFR · EGFR · EGFR · EGFR

)L c (SHC | SHC )

(R1)−−−→ EGF |(EGFR · CMPLX · EGFR · EGFR

)L c (SHC | SHC )

(R1)−−−→(EGFR · CMPLX · EGFR · CMPLX

)L c (SHC | SHC )

(R2)−−−→(EGFR · DIM · EGFR

)L c (SHC | SHC )

(R3)−−−→(EGFR · DIMp · EGFR

)L c (SHC | SHC )

(R4)−−−→(EGFR · DIMpSHC · EGFR

)L c SHC

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 17 / 52

Page 18: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: gene regulation (1)

i p o z y a

DNA

mRNA

proteinslac Repressor beta-gal. permease transacet.

R

i p o z y a

R RNAPolime- rase

NO TRANSCRIPTION

a)

i p o z y a

R

RNAPolime- rase

TRANSCRIPTION

b)

LACTOSE

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 18 / 52

Page 19: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: gene regulation (2)

Rules for DNA transcription/regulation:

Polym | p · x 7→ Pp · x (R1)

Repr | x · o · y 7→ x · Ro · y (R2)

Pp · o · x 7→ p · Po · x (R3)

x · Po · zya 7→ x · o · Pzya (R4)

x · Pzya 7→ Polym | x · zya | rna (R5)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 19 / 52

Page 20: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS modeling examples: gene regulation (3)

The only possible evolution of a term representing an initial situation inwhich no repressor is present is

Polym | p · o · zya (R1)−−−→ Pp · o · zya (R3)−−−→ p · Po · zya(R4)−−−→ p · o · Pzya

(R5)−−−→ Polym | p · o · zya | rna

When the repressor is present, instead, a possible evolution is

Repr | Polym | p · o · zya (R1)−−−→ Repr | Pp · o · zya (R2)−−−→ Pp · Ro · zya

and it corresponds to a situation in which the repressor stops thetranscription of the gene by hampering the activity of the RNA polymerase.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 20 / 52

Page 21: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Some theoretical results

CLS is Turing complete

A Turing machine encoded into a CLS term and a single rewrite rule

Formalisms capable of describing membranes can be encoded into CLS

Brane Calculi

P Systems

A labeled semantics of CLS is given

Strong and weak bisimulations are defined and proved to becongruences for CLS terms

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 21 / 52

Page 22: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Some variants of CLS

CLS+I More realistic representation of the fluid nature of membranes: the

looping operator can be applied to a parallel composition of sequencesI Can be encoded into CLS

Stochastic CLSI The application of a rule consumes a stochastic quantity of time

LCLS (CLS with Links)I Description of protein–protein interactions at the domain level

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 22 / 52

Page 23: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 23 / 52

Page 24: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

CLS+

The fluid nature of a membrane can be expressed representing the objectson its surface as a multiset

In CLS this corresponds to allowing the application of the looping operatorto a parallel composition of sequences

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 24 / 52

Page 25: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

The Syntax of CLS+Terms T , Branes B and Sequences S of CLS+ are given by thefollowing grammar:

T ::= S∣∣ (

B)L c T

∣∣ T | TB ::= S

∣∣ B | BS ::= ε

∣∣ a∣∣ S · S

where a is a generic element of E .

Patterns P, brane patterns BP and sequence patterns SP of LCLS aregiven by the following grammar:

P ::= SP∣∣ (

BP)L c P

∣∣ P | P∣∣ X

BP ::= SP∣∣ BP | BP

∣∣ x

SP ::= ε∣∣ a

∣∣ SP · SP∣∣ x

∣∣ x

where a is an element of E , X , x , x and x are elements of TV ,BV ,SV andX , respectively.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 25 / 52

Page 26: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

The semantics of CLS+The semantics of CLS+ ensures that on membrane we may have multisetsof objects which cannot be membranes themselves

Given a set of rewrite rules R ⊆ < the semantics of CLS is given by thefollowing rules

(P1,P2) ∈ R P1σ 6≡ ε σ ∈ Σ

P1σ → P2σ

T1 → T2

T | T1 → T | T2

T1 → T2(B

)L c T1 →(B

)L c T2

(BP1,BP2) ∈ RB BP1σ 6≡ ε σ ∈ Σ

BP1σ →B BP2σ

B1 →B B2

B | B1 →B B | B2

B1 →B B2(B1

)L c T →(B2

)L c T

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 26 / 52

Page 27: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Translation of CLS+ into CLS

The CLS+ termc |

(a | b

)L c d

becomes the CLS term

c |(out

)L c (a | b |(in

)L c d)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 27 / 52

Page 28: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 28 / 52

Page 29: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Stochastic CLS (1)

Stochastic CLS incorporates Gillespie’s stochastic framework into thesemantics of CLS

Rewrite rules are extended with kinetic constants (e.g. a | b k7→ c) to bemultiplied by the number of reactants

Two main problems:

What is a reactant in Stochastic CLS?I A subterm of a term T is a term T ′ 6≡ ε such that T ≡ C [T ′] for some

context CI A reactant is an occurrence of a subterm

What happens with variables?I At each step we compute the set of ground rules that can be applied

among those obtained by instantiating variables of the rewrite ruleI We reduce the problem of defining the semantics to the simpler

problem of defining the semantics with ground rules only

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 29 / 52

Page 30: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Stochastic CLS (2)

Given a finite set of rewrite rules R, the semantics of Stochastic CLS isgiven by the following inference rule

R = T1k7→ T2 ∈ AR(R,T ) T ≡ C [T1] T ′ ≡ C [T2]

TR,k·AC(R,T ,T ′)−−−−−−−−−−→ T ′

where:

AR(R,T ) is the set of ground rewrite rules obtained from rules in Rand applicable to T

AC (R,T ,T ′) is the number of reactants in T equivalent to theleft–hand side of the ground rule R and that allows obtaining term T ′

after the application of R

The transition system obtained can be easily transformed into aContinuous Time Markov Chain

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 30 / 52

Page 31: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

A Stochastic CLS model of the gene regulation (1)

i p o z y a

DNA

mRNA

proteinslac Repressor beta-gal. permease transacet.

R

i p o z y a

R RNAPolime- rase

NO TRANSCRIPTION

a)

i p o z y a

R

RNAPolime- rase

TRANSCRIPTION

b)

LACTOSE

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 31 / 52

Page 32: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

A Stochastic CLS model of the gene regulation (2)

The Stochastic CLS model extends the CLS model by including

a kinetic constant in each rewrite rule

two rewrite rules describing the unbinding of the RNA polymerase andof the repressor from the DNA (to obtain a more realistic model).

Polym | p · x 0.17−→ Pp · x (R1)

Pp · x 27−→ Polym | p · x (R1′)

Repr | x · o · y 17−→ x · Ro · y (R2)

x · Ro · y 107−→ Repr | x · o · y (R2′)

Pp · o · x 1007−→ p · Po · x (R3)

x · Po · zya 1007−→ x · o · Pzya (R4)

x · Pg307−→ Polym | rna | x · zya (R5)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 32 / 52

Page 33: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Simulation results

0

100

200

300

400

500

600

700

800

900

0 10 20 30 40 50 60 70 80 90 100

Mol

ecul

es

Seconds

no repressors10 repressors25 repressors50 repressors

RNA produced over time by varying the number of repressors in the system

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 33 / 52

Page 34: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 34 / 52

Page 35: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Modeling proteins at the domain level

In proteins there are places (domains) where bindings to other moleculescan occur

To model a protein at the domain level in CLS it would be natural to use asequence with one symbol for each domain

The binding between two elements of two different sequences cannot beexpressed in CLS

LCLS extends CLS with labels on basic symbols

two symbols with the same label represent domains that are bound toeach other

example: a · b1 · c | d · e1 · f

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 35 / 52

Page 36: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Syntax of LCLS

Terms T and Sequences S of LCLS are given by the following grammar:

T ::= S∣∣ (

S)L c T

∣∣ T | TS ::= ε

∣∣ a∣∣ an

∣∣ S · S

where a is a generic element of E , and n is a natural number.

Patterns P and sequence patterns SP of LCLS are given by thefollowing grammar:

P ::= SP∣∣ (

SP)L c P

∣∣ P | P∣∣ X

SP ::= ε∣∣ a

∣∣ an∣∣ SP · SP

∣∣ x∣∣ x

∣∣ xn

where a is an element of E , n is a natural number and X , x and x areelements of TV ,SV and X , respectively.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 36 / 52

Page 37: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Well–formed LCLS terms and patterns

A1 |(B11 · B22

)L c C1 · C22 · C3

A1 |(B

)L c C 1× A1 | B1 | C 1×Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 37 / 52

Page 38: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Application of rewrite rulesWe would like to ensure that the application of a rewrite rule to awell–formed term preserves well–formedness

not trivial: well–formedness can be easily violated

e.g. the rewrite rule a 7→ a1 applied to(b)L c a produces

(b)L c a1

A compartment safe rewrite rule is such that

it does not add/remove occurrences of variables and single labels

it does not moves variables from one compartment (content of alooping sequence) to another one

The application of a compartment safe rewrite rule preserveswell–formedness

To apply a compartment unsafe rewrite rule we require that

its patterns are CLOSED

its variables are instantiated with CLOSED terms

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 38 / 52

Page 39: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

The semantics of LCLS

Given a set of compartment safe rewrite rules RCS and a set ofcompartment unsafe rewrite rules RCU , the semantics of LCLS is given bythe following rules

(appCS)P1 7→ P2 ∈ RCS P1σ 6≡ ε σ ∈ Σ α ∈ A

P1ασ → P2ασ

(appCU)P1 7→ P2 ∈ RCU P1σ 6≡ ε σ ∈ Σwf α ∈ A

P1ασ → P2ασ

(par)T1 → T ′

1 L(T1) ∩ L(T2) = {n1, . . . , nM} n′1, . . . , n

′M fresh

T1 | T2 → T ′1{

n′1, . . . , n

′M/n1, . . . , nM} | T2

(cont)T → T ′ L(S) ∩ L(T ′) = {n1, . . . , nM} n′

1, . . . , n′M fresh(

S)L c T →

(S)L c T ′{n

′1, . . . , n

′M/n1, . . . , nM}

where α is link renaming, L(T ) the set of links occurring twice in the toplevel compartment of T

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 39 / 52

Page 40: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Main theoretical result

Theorem (Subject Reduction)Given a set of well–formed rewrite rules R and a well–formed term T

T → T ′ =⇒ T ′ well–formed

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 40 / 52

Page 41: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

An LCLS model of the EGF pathway (1)

EGF

EGFR

SHC

CELL MEMBRANE

nucleus

phosphorylations

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 41 / 52

Page 42: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

An LCLS model of the EGF pathway (2)

We model the EGFR protein as the sequence RE1 · RE2 · RI1 · RI2

RE1 and RE2 are two extra–cellular domains

RI1 and RI2 are two intra–cellular domains

The rewrite rules of the model are

EGF |(RE1 ·x

)L c X 7→ EGF 1 |(R1

E1 ·x)L c X (R1)(

R1E1 ·RE2 ·x ·R2

E1 ·RE2 ·y)L c X 7→

(R1

E1 ·R3E2 ·x ·R2

E1 ·R3E2 ·y

)L c X (R2)(R1

E2 ·RI1 ·x)L c X 7→

(R1

E2 ·PRI1 ·x)L c X (R3)(

R1E2 ·PRI1 ·RI2 ·x ·R1

E2 ·PRI1 ·RI2 ·y)L c (SHC | X ) 7→(

R1E2 ·PRI1 ·R2

I2 ·x ·R1E2 ·PRI1 ·RI2 ·y

)L c (SHC 2 | X ) (R4)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 42 / 52

Page 43: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

An LCLS model of the EGF pathway (3)

Let us write EGFR for RE1 · RE2 · RI1 · RI2

A possible evolution of the system is

EGF | EGF |`EGFR ·EGFR ·EGFR

´L c (SHC | SHC)

(R1)−−→ EGF 1 | EGF |`R1

E1 ·RE2 ·RI1 ·RI2 ·EGFR ·EGFR´L c (SHC | SHC)

(R1)−−→ EGF 1 | EGF 2 |`R1

E1 ·RE2 ·RI1 ·RI2 ·EGFR ·R2E1 ·RE2 ·RI1 ·RI2

´L c (SHC | SHC)

(R2)−−→ EGF 1 | EGF 2 |`R1

E1 ·R3E2 ·RI1 ·RI2 ·EGFR ·R2

E1 ·R3E2 ·RI1 ·RI2

´L c (SHC | SHC)

(R3)−−→ EGF 1 | EGF 2 |`R1

E1 ·R3E2 ·PRI1 ·RI2 ·EGFR ·R2

E1 ·R3E2 ·RI1 ·RI2

´L c (SHC | SHC)

(R3)−−→ EGF 1 | EGF 2 |`R1

E1 ·R3E2 ·PRI1 ·RI2 ·EGFR ·R2

E1 ·R3E2 ·PRI1 ·RI2

´L c (SHC | SHC)

(R4)−−→ EGF 1 | EGF 2 |`R1

E1 ·R3E2 ·PRI1 ·R4

I2 ·EGFR ·R2E1 ·R3

E2 ·PRI1 ·RI2

´L c (SHC 4 | SHC)

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 43 / 52

Page 44: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Outline of the talk

1 IntroductionCells are complex interactive systemsThe EGF pathway and the lac operon

2 The Calculus of Looping Sequences (CLS)Definition of CLSCLS as an abstraction for biomolecular systemsThe EGF pathway and the lac operon in CLS

3 CLS variantsCLS+Stochastic CLSLCLS

4 Conclusions

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 44 / 52

Page 45: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Observations

The language does not assume a fixed set of operators on membranes.Quite general operations can be modeled by rewrite rules

Rewrite rules are applied sequentially both in the non–deterministic and inthe stochastic case. Forms of parallelism in rewrite rule application couldbe studied

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 45 / 52

Page 46: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Verification and Simulation

We have defined strong and weak bisimulations for CLS and propertiessuch as causality of events can be proved

We have developed a simulator for Stochastic CLS

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 46 / 52

Page 47: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Current and future work

In order to model cell divisions and differentiations, tissues, etc...

we are developing a spatial extension of CLS in which terms areplaced and can move in a 2D/3D space

Moreover,

we are developing a translation of Kohn Molecular Interaction Mapsinto CLS

As future work:

we plan to use CLS to study (in collaboration with biologists) retinalcell develpment and differentiation

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 47 / 52

Page 48: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

ReferencesR. Barbuti, A. Maggiolo-Schettini, P. Milazzo, P. Tiberi and A. Troina.Stochastic CLS for the Modeling and Simulation of Biological Systems.Submitted for publication.

R. Barbuti, A. Maggiolo-Schettini and P. Milazzo. Extending the Calculus ofLooping Sequences to Model Protein Interaction at the Domain Level. Int.Symposium on Bioinformatics Research and Applications (ISBRA’07), LNBI 4463,pages 638–649, Springer, 2006.

R. Barbuti, A. Maggiolo-Schettini, P. Milazzo and A. Troina. BisimulationCongruences in the Calculus of Looping Sequences. Int. Colloquium onTheoretical Aspects of Computing (ICTAC’06), LNCS 4281, pages 93–107,Springer, 2006.

R. Barbuti, A. Maggiolo-Schettini, P. Milazzo and A. Troina. A Calculus ofLooping Sequences for Modelling Microbiological Systems. FundamentaInformaticae, volume 72, pages 21–35, 2006.

P. Milazzo. Qualitative and Quantitative Formal Modeling of Biological Systems,

PhD Thesis, Universita di Pisa.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 48 / 52

Page 49: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

APPENDIX

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 49 / 52

Page 50: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Background: the kinetics of chemical reactions

Usual notation for chemical reactions:

`1S1 + . . . + `ρSρkk−1

`′1P1 + . . . + `′γPγ

where:

Si ,Pi are molecules (reactants)

`i , `′i are stoichiometric coefficients

k, k−1 are the kinetic constants

The kinetics is described by the law of mass action:

d [Pi ]

dt= `′i k[S1]

`1 · · · [Sρ]`ρ︸ ︷︷ ︸

reaction rate

d [Si ]

dt= `i k−1[P1]

`′1 · · · [Pγ ]`′γ︸ ︷︷ ︸

reaction rate

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 50 / 52

Page 51: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Background: Gillespie’s simulation algorithm

represents a chemical solution as a multiset of molecules

computes the reaction rate aµ by multiplying the kinetic constant bythe number of possible combinations of reactants

Example: chemical solution with X1 molecules S1 and X2 molecules S2

reaction R1 : S1 + S2 → 2S1 rate a1 = X1X2k1

reaction R2 : 2S1 → S1 + S2 rate a2 = X1(X1−1)2 k2

Given a set of reactions {R1, . . . RM} and a current time t

The time t + τ at which the next reaction will occur is randomlychosen with τ exponentially distributed with parameter

∑Mν=1 aν ;

The reaction Rµ that has to occur at time t + τ is randomly chosenwith probability

aµPMν=1 aν

.

At each step t is incremented by τ and the chemical solution is updated.

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 51 / 52

Page 52: The Calculus of Looping Sequences for Modeling Biological ...groups.di.unipi.it/~maggiolo/lucidi_MSV/CLS_2.pdf · CLS as an abstraction for biomolecular systems The EGF pathway and

Well–formedness of LCLS terms and patterns in LCLS

An LCLS term (or pattern) is well–formed if and only if a label occurs nomore than twice, and two occurrences of a label are always in the samecompartment

Type system for well–formedness:

1.(∅, ∅

)|= ε 2.

(∅, ∅

)|= a 3.

(∅, {n}

)|= an

4.(∅, ∅

)|= x 5.

(∅, {n}

)|= xn 6.

(∅, ∅

)|= x 7.

(∅, ∅

)|= X

8.

(N1,N

′1

)|= SP1

(N2,N

′2

)|= SP2 N1 ∩ N2 = N ′

1 ∩ N2 = N1 ∩ N ′2 = ∅(

N1 ∪ N2 ∪ (N ′1 ∩ N ′

2), (N′1 ∪ N ′

2) \ (N ′1 ∩ N ′

2))|= SP1 · SP2

9.

(N1,N

′1

)|= P1

(N2,N

′2

)|= P2 N1 ∩ N2 = N ′

1 ∩ N2 = N1 ∩ N ′2 = ∅(

N1 ∪ N2 ∪ (N ′1 ∩ N ′

2), (N′1 ∪ N ′

2) \ (N ′1 ∩ N ′

2))|= P1 | P2

10.

(N1,N

′1

)|= SP

(N2,N

′2

)|= P N1 ∩ N2 = N ′

1 ∩ N2 = N1 ∩ N ′2 = ∅ N ′

2 ⊆ N ′1(

N1 ∪ N ′2,N

′1 \ N ′

2

)|=

(SP

)L c P

Andrea Maggiolo Schettini (Univ. di Pisa) The Calculus of Looping Sequences Thessaloniki – June, 2007 52 / 52