mathematical modeling in molecular biology - uspjb/lectures/bioinformatics/rio-cnpq.pdf ·...

53
Mathematical Modeling in Molecular Biology Junior Barrera DCC-IME-USP/BIOINFO-USP

Upload: nguyenthien

Post on 11-Oct-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

Mathematical Modeling in Molecular Biology

Junior Barrera

DCC-IME-USP/BIOINFO-USP

Projects

BIOINFO-USP

CAGE

Bioquímica-USP

Cell cycle,

Tripanosoma-cruzi,

Dictyostelium

....

VGDN

ICB-USP

OMS

ICB-USPMalária

ICB-USP

Cancer

Ludwig

Agriculture

ESALQ

Agriculture

Ribeirão-USP

Cancer

NHGRI

Virus

Hemocentro

Layout

• Economics perspective

• Fundamentals of Molecular Biology

• Experimental techniques

• Data mining environments

• Some Biological problems

• Formalizing Biological problems

• A family of Mathematical problems

• Other areas with similar problems

Economics Perspective

• Atomic program (1930-1945)

• Space program (1960-1975)

• Genetics (1990- ...): international investment, public and private

Brazilian investment is proportionally greater than the investment of some developed countries

Fundamentals of Molecular Biology

molecule

cell

tissue

organism

Protein structure and dynamics

DNA, Protein, Gene Expression, Gene Networks

Population

Heredity - Mendel (1866)The phenotypes of an individual depends on genes of his parents.

Chromosome Theory - Morgan (1910)Genes were situated in chromosomes

The molecular structure of chromosomes (Watson and Crick -1953)DNA structure: the double helixFour basis: adenine(A), guanine(G), thymine(T), cytosine(C)genes are sequences of nucleotides

DNA manipulationcut, replication and decoding

Genes control the metabolismMetabolism occurs by sequences of enzyme-catalyzed reactions.Enzymes are specified by one or more genes

Gene expression

species modification, diagnostics, drug production

Experimental techniques

Áreas de atuação

Microarrays

Image Analysis

Signal

gene 1

gene 2

Data Mining Environments

P1

P2

Pn

Pi : analytical and mining procedures (kernel parallel)

Objected oriented database

Sequence module

microarray module

Genbank

database

P1

P2

P3

Pn

P1

P2

P3

Pnquery

query

operational

operational

analitical

analitical

Access control module

Integrated Environment

query

Another

databases

Clinical data

Data Warehouse

M_database

Analysis tools writer SpreadSheet Dataminig

MOLAP/ROLAP server

IMS OthersRDMS

Wrapper Wrapper Wrapper

Integrator

Metadata Data

Mart

Users

1

2

3

4

System Architecture

DB/2DB/2

SybaseSybase

OracleOracle

InformixInformix

SliceSlice NNSliceSlice 11

Appl. n

Appl. 3

Appl. 2

Appl. 1

JavaJava

PBPB

DelphiDelphi

MATLAMATLA

N-2 slices

Consulte rules and libraries

Some Biological problems

• Modification of Sugar Cane, Eucalyptus and chickens

• Cancer diagnostics

• Drug performance against HIV and Malaria

• Understanding of the cell division cycle

• Reconstitution of nervous tissue

• Design of drugs

• Prediction of new virus

Formalizing Biological Problems

Proteome

Transcriptome

Genome

Pathways

ΣWet Lab

What genes regulate the pathway A->B->C->D ?

Choice of adequate clustering technique

Class 1

[low variance]

measurement

Class 2

[high variance]

measurement

Clustering

Clustering

Looser clusters due to large variance

−3.5 −3 −2.5 −2 −1.5 −1 −0.5 0 0.5 1−1.5

−1

−0.5

0

0.5

1

1.5

2

phosphofructokinase, platelet

v−

erb

−b

2 a

via

n e

ryth

rob

lastic le

uke

mia

vira

l o

nco

ge

ne

ho

mo

log

2

LINEAR CLASSIFIER (DISPERSED−GAUSSIAN) w/ σ = 0.600

1−th (0.389593)

7−th (0.416607)

BRCA1 BRCA2/sporadic

the tumor sample from Patient 20

Design classifier

Dimensionality Reduction

x1

x2What is the minimum number of genes that is enough to distinguish two Biological states?

Filter Design

Application

Modeling Dynamical Systems

Cell

peptide

other signals

peptide

other signals

mRNA

PathwaysTranslat.Transcr.

Proteins

Pool

Metabolic

peptide

other signals

mRNAGENES NETWORK

Translat.Transcr.Proteins

Pool Pathways

Metabolic

Division Steps

u1

I

u2

u5

v

vfp

w1

w2

w1fp

w2fp

w1f

w2f

x1

x6

y1

y2

z

x1f

x3f

x4f

x6f

x6fp

x1fp

y1fp

y2fp

.

.

.

.

.

.

z

Forward SignalFeedback to pFeedback to previous layer

y1f

y1f

y2f

y2f

p

Knockout

u1

I

u2

u5

v

vfp

w1

w2

w1fp

w2fp

w1f

w2f

x1

x6

y1

y2

z

x1f

x3f

x4f

x6f

x6fp

x1fp

y1fp

y2fp

.

.

.

.

.

.

z

Forward SignalFeedback to pFeedback to previous layer

y1f

y1f

y2f

y2f

p

Division Steps

System identification

Cell

GENES NETWORK

Translat.Transcr.Proteins

Pool

microarray

Pathways

Metabolic

proteome

peptide

other signals

mRNA

Find the architecture of a gene regulation network from microarray data.

Target Gene Predictor 1

Predictor 2

NMSE 1NMSE 2

Predictor 3

NMSE 3

System dynamics simple

System identified

A family of Mathematical problems

Design of classifier, filter or dynamical system

Optimization

Combinatory or continuos

Statistical Estimation

Algebraic Representation

optψ

Ψ

The constrained estimation problem

Sample size

Error

Other areas with similar problems

• Finances

• Marketing

• Digital TV

• Petrol Industry

• Neuro Sciences

• ...