Bioinformatics

Biological Networks

Academic year 2008-2009

Francesca Cordero

Networks for Wilson

"...The greatest challenge today, not just in cell biology but all science, is the accurate and complete description of complex systems. Scientists have broken down many kinds of systems. They think they know most of the elements and forces. The next task is to reassemble them, at least in mathematical models that capture the key properties of the entire ensembles. ..."

Abstraction:

Simplification → Adoption of a formalism → Simulation

Abstraction - Why?

Biology studies complex systems. In humans:

• ≈ 24,000 genes

• tens of thousands of different proteins

• at least 26 types of post-translational modifications

• ≈ 80 possible subcellular localizations

• an "estimated" average of 10 interactions per protein, with some proteins having hundreds of interactions

• 518 kinases, 218 phosphatases, 80 small GTPases

roughly 300,000-400,000 protein-protein interactions

Abstraction - Why?

What to abstract?

• The set of functional interactions of the biological system

How to abstract?

• The network as an abstraction of biological phenomena

How to represent it?

• The graph, as a symbolic representation

Examples:

• Signal transduction systems = sets of physically connected intracellular molecules devoted to information processing

• Central nervous system = set of physically connected neurons

• Immune system = set of physically and functionally connected cells and molecules
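
As a minimal sketch of the graph as a symbolic representation, an interaction list can be stored as an adjacency map. The molecule names below are hypothetical placeholders, not taken from the lecture, and an undirected graph is assumed.

```python
# Hypothetical toy signalling network: all node names are illustrative only.
interactions = [
    ("receptor", "kinase_A"),
    ("kinase_A", "kinase_B"),
    ("kinase_A", "phosphatase_C"),
    ("kinase_B", "transcription_factor"),
]


def build_graph(edges):
    """Return an undirected graph as a dict mapping node -> set of neighbours."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


graph = build_graph(interactions)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Nodes stand for molecules (or cells, or neurons) and links for physical or functional interactions; a directed or weighted variant would be needed to encode regulation direction or interaction strength.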

High-throughput Technologies

DNA microarrays → Gene expression → Genomics

Protein-mAb arrays, 2D-PAGE + mass spectrometry → Interactome, phosphoproteome, metabolome, etc. → Proteomics

Genomics & Proteomics ⇒ "massive" acquisition of experimental data

⇒ "rapid" and "low-cost" reconstruction of biological networks

Systems Biology

Studies biology through static and dynamic analyses, using systems of differential equations, algorithms, and simulation of the dynamics (change over time).

Dynamic Analysis

[Figure: the glycolysis pathway drawn as a network with transitions T1-T10. Metabolites: glucose, glucose 6-phosphate, fructose 6-phosphate, fructose 1,6-bisphosphate, dihydroxyacetone phosphate, glyceraldehyde 3-phosphate, bisphosphoglycerate, phosphoglycerate, 2-phosphoglycerate, phosphoenolpyruvate, pyruvate. Enzymes: hexokinase, phosphoglucose isomerase, phosphofructokinase, aldolase, triose phosphate isomerase, glyceraldehyde 3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase, pyruvate kinase.]

Static Analysis

Topological analysis: the study of the local and global architecture of the network. It cannot simulate change over time, but it identifies the structural properties of the network that make the dynamics of the system possible. Studying the topological properties helps answer questions such as:

• Is the system robust (i.e. tolerant to perturbations)?

• What are the possible regulators/effectors of a molecule?

• Do cell-type-specific regulatory mechanisms exist?

• What are the side effects of a drug?

• What are the mechanisms of action of a drug?

• What are the general molecular alterations during septic shock?

• What is the molecular mechanism of resistance to therapy in AML?

Network measures

Network biology offers a quantifiable description of the networks that characterize various biological systems. The most basic network measures that allow us to compare and characterize different complex networks are:

• Degree: the number of links $k$ that a node has to other nodes. In directed networks a node has an incoming degree $k_{in}$ and an outgoing degree $k_{out}$. An undirected network with $N$ nodes and $L$ links is characterized by an average degree $\langle k \rangle = 2L/N$.

• Degree distribution: $P(k)$ gives the probability that a selected node has exactly $k$ links. $P(k)$ is obtained by counting the number of nodes $N(k)$ with $k = 1, 2, \ldots$ links and dividing by the total number of nodes $N$.

• Shortest path and mean path length: distance in networks is measured by the path length, which tells us how many links we need to pass through to travel between two nodes. In directed networks, the distance $l_{AB}$ from node A to node B is often different from the distance $l_{BA}$. The mean path length $\langle l \rangle$ represents the average over the shortest paths between all pairs of nodes.
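
The three measures above can be sketched in a few lines of Python. The 7-node, 8-link undirected network used here is a hypothetical example, not one from the lecture, and the mean path length assumes the network is connected.

```python
from collections import Counter, deque

# Hypothetical undirected network with N = 7 nodes and L = 8 links.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("A", "G"), ("B", "C"), ("E", "F"), ("F", "G")]


def adjacency(edge_list):
    """Build node -> set of neighbours for an undirected network."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    """<k> = 2L/N, computed as the mean of the node degrees."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def degree_distribution(adj):
    """P(k) = N(k) / N for every degree k that occurs."""
    counts = Counter(len(nb) for nb in adj.values())
    return {k: counts[k] / len(adj) for k in sorted(counts)}


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of links from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_path_length(adj):
    """<l>: average shortest-path length over all ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in shortest_path_lengths(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs


adj = adjacency(edges)
print("average degree:", average_degree(adj))
print("degree distribution:", degree_distribution(adj))
print("mean path length:", mean_path_length(adj))
```

For a directed network the adjacency map would store only outgoing links, so that the distance from A to B can differ from the distance from B to A.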

Network measures

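• Clustering coefficient: in many networks, if node A is connected to B and B is connected to C, then it is highly probable that A also has a direct link to C. This phenomenon can be quantified using the clustering coefficient $C_I = 2 n_I / (k_I (k_I - 1))$, where $k_I$ is the degree of node I and $n_I$ is the number of links connecting the $k_I$ neighbours of node I to each other. The average clustering coefficient $\langle C \rangle$ characterizes the overall tendency of nodes to form clusters. $C(k)$ is the average clustering coefficient of all nodes with $k$ links.

For example, $C_A = 2/20$ and $C_F = 0$ for nodes A and F of the example network.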
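
A minimal sketch of the clustering coefficient computation follows. The network is a hypothetical one, chosen so that node A has five neighbours with a single link among them (giving 2/20) and node F has no linked neighbours; it is not necessarily the network shown on the slide. Nodes with fewer than two neighbours are assigned a coefficient of 0 by convention here.

```python
from itertools import combinations

# Hypothetical undirected network; chosen so that C_A = 2/20 and C_F = 0.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("A", "G"), ("B", "C"), ("E", "F"), ("F", "G")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def clustering_coefficient(adj, node):
    """C = 2n / (k(k-1)), with n the number of links among the k neighbours."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here: undefined cases count as 0
    n_links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * n_links / (k * (k - 1))


def average_clustering(adj):
    """<C>: mean of the clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)


print("C_A =", clustering_coefficient(adj, "A"))
print("C_F =", clustering_coefficient(adj, "F"))
print("<C> =", average_clustering(adj))
```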

Examples of metabolic networks

Architectural features of cellular networks

Three network models:

• Random networks

– The Erdős-Rényi (ER) model starts with $N$ nodes and connects each pair of nodes with probability $p$.

– The node degrees follow a Poisson distribution, which means that most nodes have approximately the same number of links.

– The clustering coefficient is independent of a node's degree.

– The mean path length is proportional to the logarithm of the network size, $l \sim \log N$.
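
The ER construction can be sketched directly from its definition. The values of `n`, `p` and the random seed below are arbitrary choices for illustration; the check at the end only illustrates that degrees concentrate around the mean $p(N-1)$.

```python
import random


def erdos_renyi_degrees(n, p, seed=0):
    """Connect each pair of n nodes with probability p; return the degree list."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree


n, p = 1000, 0.01  # illustrative values, expected mean degree p * (n - 1)
degree = erdos_renyi_degrees(n, p)
mean_k = sum(degree) / n

# Share of nodes whose degree lies between half and twice the mean.
typical = sum(1 for k in degree if mean_k / 2 <= k <= 2 * mean_k) / n

print("expected mean degree:", p * (n - 1))
print("observed mean degree:", mean_k)
print("share of nodes near the mean:", typical)
print("largest degree:", max(degree))
```

In contrast with the hub-dominated networks discussed next, almost all nodes fall close to the mean and no node has a degree many times larger than it.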

Random networks

• Scale-free networks

– These networks are highly non-uniform: most nodes have only a few links, whereas a few nodes have a very large number of links (HUBS).

– The probability that a node has $k$ links follows $P(k) \sim k^{-\gamma}$, where $\gamma$ is the degree exponent. The probability that a node is highly connected is statistically more significant than in a random graph.

– Scale-free networks are characterized by a power-law degree distribution; such distributions appear as a straight line on a log-log plot.

– $C(k)$ is independent of $k$.

– The average path length follows $l \sim \log \log N$, which is significantly shorter than the $\log N$ that characterizes random small-world networks.
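
The straight line on a log-log plot follows from taking logarithms of the power law. In this short derivation the normalisation constant $A$ is notation introduced here for illustration:

```latex
P(k) = A\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = \log A - \gamma \log k
```

Plotting $\log P(k)$ against $\log k$ therefore gives a line whose slope is $-\gamma$, which is how the degree exponent is usually read off from data.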

Scale-free networks

• Hierarchical networks

– Construction:

1. Start with a small cluster of four densely linked nodes.

2. Three replicas of this module are generated, and the three external nodes of each replicated cluster are connected to the central node of the old cluster, which produces a large 16-node module.

3. Three replicas of this 16-node module are generated and connected to the central node of the old module.

4. The result is a new module of 64 nodes.

– The resulting network has a power-law degree distribution.

– The average clustering coefficient is independent of system size, $\langle C \rangle \approx 0.6$.

– The clustering coefficient scales as $C(k) \sim k^{-1}$, a straight line of slope $-1$ on a log-log plot.
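
The construction steps above can be sketched iteratively. One detail is an assumption made here, since the slide does not spell it out: at every step, the nodes wired to the central node are the external (peripheral) nodes of the three new replicas. Node 0 plays the role of the central node, and the function name is hypothetical.

```python
def hierarchical_network(iterations):
    """Return (number of nodes, set of undirected edges) after the given
    number of replication steps, starting from a fully linked 4-node cluster."""
    # Step 1: four densely linked nodes; node 0 is central, 1-3 are external.
    edges = {(i, j) for i in range(4) for j in range(i + 1, 4)}
    peripheral = [1, 2, 3]
    n = 4
    for _ in range(iterations):
        new_edges = set(edges)
        new_peripheral = []
        for replica in (1, 2, 3):
            offset = replica * n
            # Copy the whole current module with shifted node labels.
            new_edges |= {(u + offset, v + offset) for u, v in edges}
            # Assumption: wire the replica's peripheral nodes to central node 0.
            for node in peripheral:
                new_edges.add((0, node + offset))
                new_peripheral.append(node + offset)
        edges, peripheral, n = new_edges, new_peripheral, 4 * n
    return n, edges


for step in range(3):
    n_nodes, edge_set = hierarchical_network(step)
    hub_degree = sum(1 for u, v in edge_set if 0 in (u, v))
    print(n_nodes, "nodes,", len(edge_set), "links, hub degree", hub_degree)
```

Under this assumption the module sizes are 4, 16 and 64 nodes, matching the steps listed above, and the central node becomes a hub whose degree grows with every replication step.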

Hierarchical networks

Which type are cellular networks?

The analysis of the metabolic networks of 43 different organisms from all three domains of life (eukaryotes, bacteria, and archaea) indicates that cellular metabolism has a scale-free topology.

• MOST metabolic substrates participate in only one or two reactions; a FEW, such as pyruvate or coenzyme A, participate in dozens and function as metabolic hubs.

• Genetic regulatory networks

Which type are cellular networks?

BUT other networks, such as the transcription regulatory networks of S. cerevisiae and Escherichia coli, show MIXED scale-free and exponential features:

• Most transcription factors regulate only a few genes.

• A few general transcription factors interact with many genes.

• Most genes are regulated by one to three transcription factors.

Small-world effect

Any two nodes can be connected by a path of only a few links.

In the cell, the "small-world effect" was documented for metabolism: paths of only three to four reactions can link most pairs of metabolites.

This short path length indicates that local perturbations in metabolite concentrations could reach the whole network very quickly.

Why scale-free architecture?

• Growth: most networks are the result of a growth process, during which new nodes join the system over an extended time period.

• Preferential attachment: nodes prefer to connect to nodes that already have many links.

Duplicated genes produce identical proteins that interact with the same protein partners, so proteins that already have many partners are the most likely to gain further links.
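
Growth and preferential attachment together can be sketched as a simple simulation. This is a generic growth model under stated assumptions (a small fully linked seed, `m` links per new node, arbitrary `n` and seed), not a model of gene duplication itself.

```python
import random


def grow_network(n, m, seed=0):
    """Grow a network to n nodes; each new node attaches m links to existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed: m + 1 fully linked nodes (an assumption of this sketch).
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Every node appears in `ends` once per link end, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    ends = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            edges.append((new, t))
            ends.extend((new, t))
    return edges


n, m = 2000, 2  # illustrative values
edges = grow_network(n, m)

degree = [0] * n
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

mean_k = sum(degree) / n
print("links:", len(edges))
print("mean degree:", mean_k)
print("largest degree (hub):", max(degree))
```

Early nodes accumulate links over the whole growth process, so a few of them end up with degrees far above the mean, which is the hub signature of a scale-free network.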

Motifs and Modules

Modularity refers to a group of physically or functionally linked molecules (nodes) that work together to achieve a (relatively) distinct function.

Biology is full of examples of modularity:

• Protein-protein and protein-RNA complexes (physical modules)

• Temporally coregulated groups of molecules that govern various stages of the cell cycle

• Temporally coregulated groups of molecules that convey extracellular signals in bacterial chemotaxis

In a network representation, a module (or cluster) appears as a highly interconnected group of nodes.

Motifs and Modules

High clustering indicates that networks are locally organized into various subgraphs of highly interlinked groups of nodes.

• By definition, modules imply that there are groups of nodes that are relatively isolated from the rest of the system.

• However, in a scale-free network hubs are in contact with a high fraction of nodes, which makes the existence of relatively isolated modules unlikely.

• Clustering and hubs nevertheless coexist naturally, which indicates that topological modules are not independent but combine to form a hierarchical network.

Robustness

Scale-free networks do not have a critical threshold for disintegration: even if 80% of randomly selected nodes fail, the remaining 20% still form a compact cluster with a path connecting any two nodes. This is because random failure mainly affects the numerous small-degree nodes, whose absence does not disrupt the network's integrity.

• It is increasingly accepted that adaptation and robustness are inherent network properties, and not a result of the fine-tuning of a component's characteristics.

• Robustness is inevitably accompanied by vulnerabilities (to the failure of well-selected network components).

• The ability of a module to evolve also has a key role in developing or limiting robustness ("frozen" modules such as nucleic acid synthesis).

• Modularity and robustness: the weak communication between modules probably limits the effects of local perturbations in cellular networks.
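
The contrast between random failure and the loss of well-selected components can be illustrated with a small simulation. Everything here is an assumption for illustration: the network comes from a generic growth-with-preferential-attachment model, the network size and seed are arbitrary, and only 10% of the nodes are removed (not the 80% quoted above) so that the effect is visible on a small finite network.

```python
import random


def scale_free_network(n, m, rng):
    """Assumed model: growth with degree-proportional attachment."""
    adj = {i: set() for i in range(n)}
    ends = []
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            ends += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            ends += [new, t]
    return adj


def largest_component(adj, removed):
    """Size of the largest connected cluster once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


rng = random.Random(0)
n = 2000
adj = scale_free_network(n, 2, rng)
n_remove = n // 10

random_failure = rng.sample(range(n), n_remove)
hub_attack = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:n_remove]

giant_random = largest_component(adj, random_failure)
giant_attack = largest_component(adj, hub_attack)
print("largest cluster after random failure:", giant_random)
print("largest cluster after removing hubs:", giant_attack)
```

Random failure mostly hits low-degree nodes and leaves a large connected cluster, whereas removing the same number of the most connected nodes fragments the network much more.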

Example

"Virtual" cell (interactome) = GLOBAL network = 14,500 nodes (proteins), 94,600 interactions. Mammals only (> 90% Homo sapiens, the rest mouse and rat). Interactions verified by different experimental approaches.

• The experiment: stimulation of human T lymphocytes with SDF-1alpha

• LIST of proteins phosphorylated by CXCR4 (SDF-1a)

• The list is used to extract information from the global network, i.e. to reconstruct the subnetwork specifically generated by the phosphorylated proteins.

• The phosphorylation level is the attribute of the probe nodes
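
A minimal sketch of how a probe list can extract a subnetwork from a global network follows. The slide does not state the extraction rule, so this sketch assumes the simplest one: keep the interactions whose two endpoints are both in the probe list. All protein names and phosphorylation values are hypothetical placeholders.

```python
# Hypothetical global interactome (undirected interactions between proteins).
global_network = {
    ("P1", "P2"), ("P2", "P3"), ("P3", "P4"),
    ("P4", "P5"), ("P1", "P5"), ("P2", "P4"),
}

# Hypothetical probe: phosphorylated proteins and their phosphorylation level,
# stored as the node attribute.
phospho_level = {"P1": 2.5, "P2": 1.2, "P4": 3.0}


def induced_subnetwork(edges, probe):
    """Keep only the interactions whose two endpoints are both probe nodes."""
    return {(u, v) for u, v in edges if u in probe and v in probe}


subnetwork = induced_subnetwork(global_network, phospho_level)
for u, v in sorted(subnetwork):
    print(u, phospho_level[u], "-", v, phospho_level[v])
```

Other extraction rules, such as also keeping the direct neighbours of the probe nodes, would give a larger subnetwork from the same list.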

Paper - Module Networks

• The complex functions of a living cell are carried out through the concerted activity of many genes and gene products. This activity is often coordinated by the organization of the genome into regulatory modules, or sets of coregulated genes that share a common function.

• Identifying this organization is crucial for understanding cellular responses to internal and external signals.

• Genome-wide expression profiles

Goal

Module networks procedure: a method based on probabilistic graphical models for inferring regulatory modules from gene expression data.

Assumption

The regulators are themselves transcriptionally regulated, so that their expression profiles provide information about their activity level.

Procedure

The algorithm searches simultaneously for a partition of genes into modules and for a regulation program for each module that explains the expression behavior of genes in the module.

The regulation program of a module.

Respiration module: the transcription factor Hap4 is the module's top regulator; in fact, the Hap4 DNA-binding sequence motif is present in 29 of the 55 genes in the module.

All modules: summary of module analysis

To obtain a global perspective on the relationships between different modules:

• Compiled a graph of modules and cis-regulatory motifs

• Connected modules to their significantly enriched motifs

Experimental tests

Regulatory components

A key question regarding the validity of this approach is how regulatory events can be inferred from gene expression data. To identify a regulatory relation in expression data, both the regulator and its targets must be transcriptionally regulated, resulting in detectable changes in their expression.

Failures:

• Regulatory activity by post-transcriptional changes

• Several regulators participate in the same regulatory event
