Bioinformatics Problems

Post on 18-Dec-2015


Page 1: Bioinformatics Problems. 2 3 Main Classes of Problem Areas Central dogma related: sequence, structure or function Data related: storage, retrieval & analysis

Bioinformatics Problems

3 Main Classes of Problem Areas

Central dogma related: sequence, structure, or function

Data related: storage, retrieval, and analysis (driven by the exponential growth of knowledge in molecular biology)

Simulation of biological processes: protein folding (molecular dynamics) and metabolic pathways

Topics in Bioinformatics

Structure analysis
• Protein structure comparison
• Protein structure prediction
• RNA structure modeling

Pathway analysis
• Metabolic pathway
• Regulatory networks

Sequence analysis
• Sequence alignment
• Structure and function prediction
• Gene finding

Expression analysis
• Gene expression analysis
• Gene clustering

Sequence Analysis

• Finding evolutionary relationships
• Finding coding regions of genomic sequences
• Translating DNA to protein
• Finding regulatory regions
• Assembling genome sequences

Finding information and patterns in DNA and protein data
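One of the tasks listed above, translating DNA to protein, can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: it assumes the standard genetic code, reads only the first reading frame of the given strand, and stops at the first stop codon. The names `CODON_TABLE` and `translate` are chosen here for illustration and do not come from the slides.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding strand, frame 0, up to the first stop codon.

    Assumes the input contains only A, C, G, T; any other symbol
    raises KeyError. Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGAACATTCAT"))
```

A full gene finder would also scan the other reading frames and the reverse strand; this sketch leaves that out to keep the codon lookup visible.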

Structure Analysis

The amino acid sequence of a protein determines its 3D conformation

MNIHRSTPITIARYGRSRNKTQDFEELSSIRSAEPSQSFSPNLGSPSPPETPNLSHCVSCIGKYLLLEPLEGDHVFRAVHLHSGEELVCKVFDISCYQESLAPCF

Sequence → Structure → Function

DNA Chip

A slide on which oligonucleotides or cDNA have been attached in an array layout

Oligonucleotide
• A short, single-stranded nucleotide chain

cDNA
• A DNA strand obtained from mRNA by reverse transcription

Hybridization

Complementary base pairing between A-T and G-C

The process in which two single strands of DNA (or RNA) with complementary base sequences bind to each other is called hybridization.
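The pairing rule above can be expressed as a small check. This is a simplified sketch that assumes DNA-only strands written 5' to 3' and treats hybridization as an all-or-nothing perfect match; real hybridization depends on temperature, salt concentration, and strand length. The function names are illustrative.

```python
# Watson-Crick complement for DNA bases (A pairs with T, G pairs with C).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs perfectly with `strand`, read 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def is_perfect_hybrid(strand_a: str, strand_b: str) -> bool:
    """True if the two strands are exact reverse complements of each other.

    Simplifying assumption: any mismatch counts as "no hybridization".
    """
    return strand_a.upper() == reverse_complement(strand_b)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(is_perfect_hybrid("ATGC", "GCAT"))
```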

Applications of Hybridization

Obtain oligonucleotides from mouse coding regions and tag them with a fluorescent dye or a radioactive label

Hybridize these oligonucleotides with human DNA

The positions on the human DNA where the oligonucleotides have bound can then be identified

Because human and mouse coding regions are similar, these positions are likely to be coding regions

DNA Chip

Because a DNA chip carries a very large number of densely packed DNA strands, a single experiment has the effect of running many experiments like the preceding example in parallel.

Gene Expression Analysis

Gene expression
• The expression of a gene as a protein through transcription and translation

The gene expression level provides clues to the function of the gene

A DNA chip makes it possible to measure a cell's gene expression levels efficiently

Gene Expression Analysis Process

Build a DNA chip from known gene sequences
• For a cDNA chip, the whole gene is used to build the chip
• For an oligonucleotide chip, a stretch of roughly 20-25mer is selected from each gene to build the chip

Extract mRNA from the target cells, make cDNA from it, and apply the cDNA to the DNA chip so that hybridization takes place

Analyzing the degree of hybridization reveals the level of gene expression

Gene Expression Analysis

Nature Genetics 21, 10 (1999)

DNA Chip Applications

Gene discovery: gene/mutated gene

Detection of mutations
• On an oligonucleotide chip, hybridization does not occur if even a single base differs

Disease diagnosis

Diagnosing disease by analyzing gene expression levels
• Build a DNA chip that measures the expression levels of genes that may be the cause of the disease
• Diagnose the disease by analyzing the difference in expression levels between normal cells and cancer cells
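The comparison between normal and cancer cells described above is commonly summarized as a log2 ratio of the two measured intensities per gene. The sketch below is one way to do that, under assumptions that are not in the slides: intensities are already background-corrected and normalized, a pseudocount of 1 avoids division by zero, and a gene counts as "different" when its absolute log2 ratio is at least 1 (a two-fold change). The gene names in the example are made up.

```python
import math


def log2_fold_changes(normal, tumor, pseudocount=1.0):
    """Per-gene log2(tumor / normal) for genes present in both samples.

    `normal` and `tumor` map gene name -> hybridization intensity.
    The pseudocount is an illustrative choice, not a standard value.
    """
    return {
        gene: math.log2((tumor[gene] + pseudocount) / (normal[gene] + pseudocount))
        for gene in normal
        if gene in tumor
    }


def differential_genes(normal, tumor, threshold=1.0):
    """Genes whose absolute log2 fold change reaches the assumed threshold."""
    changes = log2_fold_changes(normal, tumor)
    return sorted(gene for gene, lfc in changes.items() if abs(lfc) >= threshold)


if __name__ == "__main__":
    normal_sample = {"geneA": 99.0, "geneB": 100.0}   # hypothetical data
    tumor_sample = {"geneA": 399.0, "geneB": 100.0}   # hypothetical data
    print(log2_fold_changes(normal_sample, tumor_sample))
    print(differential_genes(normal_sample, tumor_sample))
```

In practice a fold-change cutoff alone is a weak criterion; replicate measurements and a statistical test would normally accompany it.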

DNA Chip Applications

Sequencing by Hybridization

Build an oligonucleotide chip that carries every possible sequence of length k (8-10)

Cut the target DNA into pieces of a suitable size and apply them to the chip; hybridization then shows which sequences of length k occur in the target sequence

Combine this information to recover the original sequence
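The final step, combining the detected k-length sequences back into one sequence, can be sketched as finding a path that uses every k-mer exactly once in a graph whose nodes are (k-1)-mers. This is a textbook-style illustration, not the method of any particular chip platform. It assumes an error-free spectrum in which every k-mer of the target occurs exactly once; repeated k-mers cannot be recovered because the chip only reports presence. Even then several sequences may share the same spectrum, so the result is one consistent reconstruction, not necessarily the original.

```python
from collections import defaultdict


def spectrum(sequence: str, k: int) -> set:
    """The set of all length-k substrings, i.e. what an ideal chip would report."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(kmers) -> str:
    """Rebuild a sequence whose k-mers are exactly `kmers` (Eulerian path).

    Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
    Raises ValueError if the k-mers cannot be chained into a single path.
    """
    kmers = sorted(kmers)
    if not kmers:
        return ""
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].append(suffix)
        balance[prefix] += 1
        balance[suffix] -= 1
    # Start at the node with one more outgoing than incoming edge, if any.
    nodes = sorted(balance)
    start = next((n for n in nodes if balance[n] == 1), nodes[0])
    # Hierholzer's algorithm, iterative form.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    if len(path) != len(kmers) + 1:
        raise ValueError("k-mers do not form a single path")
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    target = "ATGGCGTGCA"
    rebuilt = reconstruct(spectrum(target, 3))
    print(rebuilt, spectrum(rebuilt, 3) == spectrum(target, 3))
```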

Pathway Analysis

One of the declarative ways of representing biological knowledge

Metabolic pathway

Database Search

Text based searching

Sequence based searching
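Sequence based searching ranks database entries by how well they align with a query. As a rough sketch of the idea, the code below scores each entry with a local alignment (Smith-Waterman style dynamic programming) and sorts by score. The scoring values (match 2, mismatch -1, gap -2), the function names, and the tiny in-memory "database" are all assumptions made for illustration; production search tools rely on faster heuristics and statistically calibrated scores.

```python
def local_alignment_score(a: str, b: str, match=2, mismatch=-1, gap=-2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def search(query: str, database: dict) -> list:
    """Return (score, name) pairs for every entry, best score first."""
    hits = [(local_alignment_score(query, seq), name) for name, seq in database.items()]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    toy_database = {"seq1": "TTACGTTT", "seq2": "GGGG"}  # hypothetical entries
    print(search("ACGT", toy_database))
```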

Bioinformatics as Information Technology

[Diagram relating Bioinformatics to areas of information technology: Information Retrieval, Database, Hardware, Agent, Machine Learning, and Algorithm. Other labels in the diagram: GenBank, SWISS-PROT; Supercomputing; Information filtering, Monitoring agent; Pattern recognition, Clustering, Rule discovery; Sequence alignment; Biomedical text analysis.]

Bioinformatics on the Web

[Diagram with three stages. The experimental process: sample, array, hybridization, scanner. Data management: relational database. Data analysis and interpretation: web interface, image analysis results and summaries, links to other information resources, download of data to other applications.]