Ch17. Proteomics and Protein Identification

IDB Lab., Seoul National University. Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Third Edition

Upload: evelyn-rodriguez

Post on 03-Jan-2016

Page 1: Ch17. Proteomics and Protein Identification

Ch17. Proteomics and Protein Identification

IDB Lab., Seoul National University

Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Third Edition

Page 2: Ch17. Proteomics and Protein Identification

Contents

Introduction

MS for Protein Analysis

The Major Proteomic Approaches

Data Preprocessing

The Major Protein Identification Programs

Summary

Page 3: Ch17. Proteomics and Protein Identification

Introduction(1/3)

Proteomics
• The term was first used in 1994 by Marc Wilkins, in describing the functional study of proteins using MS.

Figure: Types of proteomics and their applications to biology (from Graves and Haystead, 2002)

Page 4: Ch17. Proteomics and Protein Identification

Introduction(2/3)

Complex protein expression

Page 5: Ch17. Proteomics and Protein Identification

Introduction(3/3)

Difficulties of protein analysis
• DNA and RNA research has PCR, which can amplify a sequence without limit.
• Proteins must be analyzed directly, as molecules present in relatively small amounts in vivo.
• Proteins derived from a single gene take many different forms.

General approach to interpreting disease
• Compare diseased tissue with normal tissue.
• Analyze the proteins that show significant differences ▶ Protein Identification

Page 6: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(1/17)

Mass spectrometry: a method of analyzing molecules on the basis of their mass

Sample -> Ionizer -> Mass Analyzer -> Detector

Ionizer
• MALDI
• Electrospray Ionization (ESI)

Mass Analyzer
• Time-of-Flight (TOF)
• Quadrupole
• FT/MS

Page 7: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(2/17)

Time of Flight MS

Reflector

Page 8: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(3/17)

Mass Spectrum

• x-axis: mass-to-charge ratio (m/z)
• y-axis: the number of ions

Page 9: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(4/17)

Breaking down proteins for mass spectrometry
• Peptide Mass Fingerprinting (PMF)
• Tandem MS, or MS/MS

Page 10: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(5/17)

Peptide Mass Fingerprinting (PMF)
• Perform chemical separation before MS
  - When several proteins are present together, ionization and analysis are difficult
  - Isolate only the single protein to be analyzed
  - Two-dimensional electrophoretic gel separation
  - Liquid chromatography
• Fragment the protein into smaller units with an enzyme
  - When several peptides are present together, ionization and analysis are difficult
  - Trypsin: cleaves after K or R, unless the next residue is P
• Mass comparison
  - Use the spectrum of the fragmented peptides to analyze the protein by mass
  - Compare the calculated masses with the masses of the proteins in the database
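The PMF steps above (digest, compute peptide masses, compare against a database) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular search engine works: the two database sequences are made up, missed cleavages and modifications are ignored, the match criterion is a plain count within a fixed tolerance, and the names `trypsin_digest`, `peptide_mass` and `count_matches` are hypothetical. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added per intact peptide


def trypsin_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, protein, tol=0.5):
    """Number of observed masses within tol Da of a theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in trypsin_digest(protein)]
    return sum(any(abs(o - t) <= tol for t in theoretical)
               for o in observed_masses)


# Hypothetical two-entry database; both sequences are invented for illustration.
database = {"protA": "GVDLKAVGELTKMQPLR", "protB": "HEWAILFKGGPASSDAR"}

# Simulated "observed" fingerprint: the tryptic peptide masses of protA.
observed = [peptide_mass(p) for p in trypsin_digest(database["protA"])]

best = max(database, key=lambda name: count_matches(observed, database[name]))
print(trypsin_digest(database["protA"]), best)
```

Real PMF tools score matches statistically rather than by a raw count, so the count here only stands in for the "mass comparison" step.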

Page 11: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(6/17)

2D Gel Electrophoresis
• Protein separation by
  - Molecular weight (Mw)
  - Isoelectric point (pI)
• Shows the distribution of the proteins.

[Figure: 2D gel, with pI on one axis]

Page 12: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(7/17)

Peptide Mass Fingerprinting(PMF)

Cut out 2D-gel spot

Page 13: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(8/17)

Peptide Mass Fingerprinting(PMF)

Trypsin digest (cleaves after K or R, unless the next residue is P)

Page 14: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(9/17)

Peptide Mass Fingerprinting(PMF)

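[Figure: Trypsin digest. A protein drawn from N-terminus to C-terminus, with K and R sites marked (including R followed by P), is cut into peptides with masses M1, M2, M3, M4 and M5.]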

Page 15: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(10/17)

Tandem MS, or MS/MS

Enzymatic digest and fractionation

Page 16: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(11/17)

Tandem MS, or MS/MS

MS

Page 17: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(12/17)

Tandem MS, or MS/MS

Precursor selection

Page 18: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(13/17)

Tandem MS, or MS/MS

Precursor selection + Collision-induced dissociation (CID)

MS/MS

Page 19: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(14/17)

Peptide Fragmentation with CID

[Figure: backbone of a four-residue peptide (side chains R1 to R4, N-terminus on the left, COOH on the right), with the cleavage sites that produce the a ions (a1, a2, a3), b ions (b2, b3) and y ions (y1, y2, y3), and the neutral-loss ions b2-H2O, y3-H2O, b3-NH3 and y2-NH3.]
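The ion series named on this slide can be computed directly from residue masses. The sketch below assumes singly charged fragments and the standard relations (b = prefix residues + proton, a = b minus CO, y = suffix residues + water + proton); the function names `fragment_ions` and `neutral_losses` are hypothetical, and GVDLK is used only because it is the example peptide on the following slides.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
AMMONIA = 17.02655
CO = 27.99491
PROTON = 1.00728


def fragment_ions(peptide):
    """Singly charged a-, b- and y-ion m/z values, keyed by ion name."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(masses)):
        prefix = sum(masses[:i])    # first i residues (N-terminal side)
        suffix = sum(masses[-i:])   # last i residues (C-terminal side)
        ions[f"b{i}"] = prefix + PROTON
        ions[f"a{i}"] = prefix + PROTON - CO
        ions[f"y{i}"] = suffix + WATER + PROTON
    return ions


def neutral_losses(mz):
    """m/z of the same singly charged ion after losing H2O or NH3."""
    return {"-H2O": mz - WATER, "-NH3": mz - AMMONIA}


ions = fragment_ions("GVDLK")
for name in ("b2", "a2", "y1", "y3"):
    print(name, round(ions[name], 4))
print("b2 losses:", neutral_losses(ions["b2"]))
```

A useful sanity check is that complementary ions add up: b_i + y_(n-i) equals the neutral peptide mass plus two protons.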

Page 20: Ch17. Proteomics and Protein Identification

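MS for Protein Analysis(15/17)

Protein Identification with MS/MS

[Figure: MS/MS spectrum of the peptide G V D L K on a mass axis starting at 0. The spacing between adjacent peaks gives residue masses (57 Da = 'G', 99 Da = 'V'); the sequence is shown read in both directions (G V D L K and K L D V G). Other labels: D, H2O.]

The peaks in the mass spectrum:
• Prefix and suffix fragments
• Fragments with neutral losses (-H2O, -NH3)
• Noise and missing peaks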
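Reading residues off peak spacings (57 Da = 'G', 99 Da = 'V') can be sketched as follows. This is a toy illustration that assumes a clean ladder of prefix masses starting at 0, with no noise peaks; the ladder values are computed for GVDLK and rounded, and `read_ladder` is a hypothetical name. Isoleucine is left out of the table because it has the same mass as leucine and cannot be told apart by mass alone.

```python
# Monoisotopic residue masses in daltons; I is omitted (same mass as L).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(prefix_masses, tol=0.02):
    """Turn a ladder of prefix masses into a sequence, one residue per gap.

    A gap that matches no single residue (for example because a peak is
    missing) is reported as '?'.
    """
    peaks = sorted(prefix_masses)
    sequence = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) <= tol]
        sequence.append(hits[0] if hits else "?")
    return "".join(sequence)


# Prefix masses of GVDLK, starting from 0 (rounded to two decimals).
ladder = [0.0, 57.02, 156.09, 271.12, 384.20, 512.30]
print(read_ladder(ladder))

# With the V and D peaks missing, the second gap spans two residues.
print(read_ladder([0.0, 57.02, 271.12]))
```

The second call shows why missing peaks make the problem harder in practice: a gap of two residues no longer matches any single entry in the table.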

Page 21: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(16/17)

Protein Identification with MS/MS

[Figure: intensity vs. mass spectrum, with the mass axis starting at 0, annotated with the peptide G V D L K]

MS/MS Peptide Identification:

Page 22: Ch17. Proteomics and Protein Identification

MS for Protein Analysis(17/17)

De Novo vs. Database Search

[Figure: MS/MS spectrum. S#: 1708 RT: 54.47 AV: 1 NL: 5.27E6 T: + c d Full ms2 638.00 [165.00 - 1925.00]. x-axis: m/z, 200 to 2000; y-axis: Relative Abundance, 0 to 100. Labeled peaks: 226.9, 326.0, 397.1, 425.0, 489.1, 524.9, 588.1, 589.2, 629.0, 687.3, 850.3, 851.4, 949.4, 1048.6, 1049.6. Candidate residue labels drawn between peaks: W, R, A, C, V, G, E, K, D, L, P, T.]

De Novo

AVGELTK

Database Search

Database of all peptides = 20^n

AAAAAAAA, AAAAAAAC, AAAAAAAD, AAAAAAAE, AAAAAAAG, AAAAAAAF, AAAAAAAH, AAAAAAI,

AVGELTI, AVGELTK, AVGELTL, AVGELTM,

YYYYYYYS, YYYYYYYT, YYYYYYYV, YYYYYYYY

Database of known peptides

MDERHILNM, KLQWVCSDL, PTYWASDL, ENQIKRSACVM, TLACHGGEM, NGALPQWRT,

HLLERTKMNVV, GGPASSDA, GGLITGMQSD, MQPLMNWE,

ALKIIMNVRT, AVGELTK, HEWAILF, GHNLWAMNAC,

GVFGSVLRA, EKLNKAATYIN..

Mass, Score
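The "Mass, Score" idea behind a database search can be sketched as two steps: keep the known peptides whose mass matches the precursor, then score each survivor by how many of its theoretical fragment ions appear in the spectrum. The sketch is an illustration under assumptions: the spectrum is synthetic (b and y ions of AVGELTK with two peaks dropped and one noise peak at 300.0 added), the score is a plain shared-peak count rather than any real engine's score, AVGETLK is a made-up same-mass permutation added to give the scorer something to separate, and all function names are hypothetical. The last line shows how quickly the "all peptides" space grows.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_mz(peptide):
    """Singly charged b- and y-ion m/z values for every backbone split."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: first i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: remaining residues
    return ions


def shared_peak_count(observed, peptide, tol=0.5):
    """Number of observed peaks explained by a theoretical b or y ion."""
    theoretical = fragment_mz(peptide)
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def search(observed, precursor_mass, database, mass_tol=1.0):
    """Filter by precursor mass, then rank candidates by shared peak count."""
    candidates = [p for p in database
                  if abs(peptide_mass(p) - precursor_mass) <= mass_tol]
    return sorted(((shared_peak_count(observed, p), p) for p in candidates),
                  reverse=True)


# Peptides taken from the slide, plus AVGETLK (hypothetical, same mass).
known_peptides = ["MDERHILNM", "KLQWVCSDL", "PTYWASDL", "GGPASSDA",
                  "AVGELTK", "AVGELTI", "AVGELTL", "HEWAILF", "AVGETLK"]

# Synthetic spectrum for AVGELTK: b1 and y6 missing, 300.0 is noise.
spectrum = [171.11, 228.13, 357.18, 470.26, 571.31,
            147.11, 248.16, 361.24, 490.29, 547.31, 300.0]

results = search(spectrum, precursor_mass=716.41, database=known_peptides)
print(results)
print("all 7-residue peptides:", 20 ** 7)
```

Restricting the search to known peptides is what keeps the candidate list short; the same scoring over all 20^n sequences is the reason de novo interpretation is treated as a separate problem.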

Page 23: Ch17. Proteomics and Protein Identification

The Major Proteomic Approaches

PMF, or Tandem MS
• The general method of protein analysis
• A single protein separated on a gel is fragmented by an enzyme and analyzed by mass spectrometry

Bottom-up, or shotgun proteomics
• The proteins in the sample are fragmented by an enzyme
• The fragmented peptides are separated by chromatography
• Analysis by tandem MS
• Because peptide analysis is more accurate, more proteins can be found
• Computing the corresponding protein is difficult

Page 24: Ch17. Proteomics and Protein Identification

Data Preprocessing

• MS measures the mass-to-charge ratio (m/z) of an ion, not its mass
  - Use MALDI as the ionizer, or apply a separate processing algorithm when ESI is used
• Isotope handling
  - Average mass vs. most abundant isotope
• Difficulties in data processing
  - Elements that are hard to ionize
  - Chemical modification of peptides
  - Several proteins present in the sample
  - The protein may not yet be in the database being searched
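Two of the points above lend themselves to a small worked example: converting between m/z and neutral mass for a given charge state, and the gap between monoisotopic and average mass. The sketch assumes positive-mode ions charged by added protons; the helper names are hypothetical, and the average atomic masses are rounded standard atomic weights. The glycine residue (C2H3NO) is used only as a small test formula.

```python
PROTON = 1.00728  # mass of one proton in daltons


def mz_of(mass, charge):
    """m/z of a neutral mass carrying `charge` extra protons."""
    return mass / charge + PROTON


def neutral_mass(mz, charge):
    """Neutral mass recovered from an observed m/z and its charge state."""
    return charge * (mz - PROTON)


# Monoisotopic (lightest isotope) vs. average (abundance-weighted) atomic masses.
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(formula, table):
    """Mass of an elemental formula given as {element: count}."""
    return sum(table[element] * count for element, count in formula.items())


peptide = 716.40683  # neutral monoisotopic mass of AVGELTK
for z in (1, 2, 3):
    print(z, round(mz_of(peptide, z), 4))

glycine_residue = {"C": 2, "H": 3, "N": 1, "O": 1}
print(formula_mass(glycine_residue, MONO), formula_mass(glycine_residue, AVERAGE))
```

The same peptide appears at a different m/z for each charge state, which is why multiply charged ESI spectra need a charge-assignment step before masses can be compared with a database.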

Page 25: Ch17. Proteomics and Protein Identification

The Major Protein Identification Programs

Common steps in the programs
• Compute the possible ion products from each sequence in the database
• Compare the computed ions with the ions observed by MS, and score the match

Differences between the programs

Program               Developer            Supported DB                      PMF / MS/MS   Scoring
MASCOT                Matrix Science       MSDB, NCBInr, SwissProt, dbEST    Both          MOWSE
ALDENTE (PeptIdent)   SIB (ExPASy)         SwissProt, TrEMBL                 PMF           Tunable
ProteinProspector     UCSF                 NCBInr, SwissProt, dbEST          Both          Masses matched, MOWSE
GFS                   Giddings Lab., UNC   15 genomes                        Both          -

Page 26: Ch17. Proteomics and Protein Identification

MASCOT(1/4)

Peak list (masses only):

764.2
1231.0
1284
1944.8
2020.2
2100.35

Or (mass and intensity pairs):

764.2 2010
1231.0 2345
1284 456
1944.8 1012
2020.2 23
2100.35 566

Database

Fixed modifications: for the given residue, the already known modified mass value is used instead of the unmodified one.

Variable modifications: for the given residue, every case that can occur is considered, and the modified mass values are combined.
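The difference between the two settings can be sketched as follows: a fixed modification shifts the residue mass everywhere, while a variable modification doubles the number of cases at each affected site. The choices here are assumptions made for illustration (carbamidomethylation of C as fixed, oxidation of M as variable, both with their standard mass shifts); the function name `candidate_masses` is hypothetical and nothing in it reflects how MASCOT itself is implemented.

```python
from itertools import product

# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Assumed settings for this example only.
FIXED = {"C": 57.02146}     # carbamidomethylation, applied to every C
VARIABLE = {"M": 15.99491}  # oxidation, may or may not be present on each M


def candidate_masses(peptide):
    """All distinct peptide masses implied by the fixed and variable settings."""
    base = sum(RESIDUE_MASS[aa] + FIXED.get(aa, 0.0) for aa in peptide) + WATER
    shifts = [VARIABLE[aa] for aa in peptide if aa in VARIABLE]
    masses = set()
    # Each variable site is either unmodified (0) or modified (1).
    for choice in product((0, 1), repeat=len(shifts)):
        extra = sum(on * shift for on, shift in zip(choice, shifts))
        masses.add(round(base + extra, 5))
    return sorted(masses)


print(candidate_masses("TLACHGGEM"))  # one C (fixed), one M (variable)
print(candidate_masses("MQPLMNWE"))   # two M sites: 4 cases, 3 distinct masses
```

With k variable sites there are 2^k cases to consider per peptide, which is why adding variable modifications enlarges the search space and fixed modifications do not.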

Page 27: Ch17. Proteomics and Protein Identification

MASCOT(2/4)

Significant matches (p < 0.05)

Non-significant matches

Page 28: Ch17. Proteomics and Protein Identification

MASCOT(3/4)

Only values that exceed a set probability threshold, and are therefore significant, are shown in red.

Page 29: Ch17. Proteomics and Protein Identification

MASCOT(4/4)

Page 30: Ch17. Proteomics and Protein Identification

ALDENTE(PeptIdent)(1/3)

Page 31: Ch17. Proteomics and Protein Identification

ALDENTE(PeptIdent)(2/3)

Page 32: Ch17. Proteomics and Protein Identification

ALDENTE(PeptIdent)(3/3)

The scoring can be tuned.

Page 33: Ch17. Proteomics and Protein Identification

ProteinProspector(1/2)

Page 34: Ch17. Proteomics and Protein Identification

ProteinProspector(2/2)

Page 35: Ch17. Proteomics and Protein Identification

GFS(1/2)

Page 36: Ch17. Proteomics and Protein Identification

GFS(2/2)

Page 37: Ch17. Proteomics and Protein Identification

Summary

Problems with the programs used in proteomics
• They are based on heuristics
• They depend on the chosen parameters
• They depend on the data provided

General solutions
• Tune the parameters appropriately
• Try several programs and compare the results