
REGULAR ARTICLE

The human plasma proteome: Analysis of Chinese serum using shotgun strategy

Ping He*1, Hong-Zhi He*1, Jie Dai*2, Ying Wang1, Quan-Hu Sheng2, Lan-Ping Zhou1, Zi-Sen Zhang1, Yu-Lin Sun1, Fang Liu1, Kun Wang3, Jin-Sheng Zhang1, Hui-Xin Wang1, Zhen-Mei Song1, Hai-Rong Zhang3, Rong Zeng2 and Xiaohang Zhao1, 3

1 National Laboratory of Molecular Oncology, Cancer Institute & Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, P. R. China

2 Research Centre for Proteome Analysis, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China

3 Center of Basic Medical Sciences, Beijing Yanjing Hospital, Beijing, P. R. China

We have investigated the serum proteome of Han-nationality Chinese using a shotgun strategy. A complete proteomic analysis was performed on two reference specimens from a total of 20 healthy donors; each specimen was pooled from the sera of ten male or ten female donors, respectively. The methodology encompassed (1) removal of six high-abundance proteins; (2) tryptic digestion of the low- and high-abundance protein fractions of serum; and (3) separation of the peptide mixture by RP-HPLC followed by ESI-MS/MS identification. A total of 944 nonredundant proteins were identified under stringent filter conditions (Xcorr ≥ 1.9, ≥ 2.2, and ≥ 3.75; ΔCn ≥ 0.1; and Rsp ≤ 4.0) in the pooled male and female samples, in which 594 and 622 proteins were found, respectively. Compared with the 3020 protein identifications recently confirmed by more than one laboratory or in more than one specimen among the HUPO Plasma Proteome Project (PPP) participating laboratories, 206 proteins were identified with at least two distinct peptides per protein and 185 proteins were considered high-confidence identifications. Moreover, some lower-abundance serum proteins (ng/mL range) were detected, such as complement C5 and CA125, which is routinely used as an ovarian cancer marker in plasma and serum. The resulting nonredundant list of serum proteins would add significant information to the knowledge base of the human plasma proteome and facilitate the discovery of disease markers.

Received: July 10, 2004
Revised: December 23, 2004
Accepted: March 1, 2005

Keywords:

Biomarkers / Mass spectrometry / Reference specimen / Serum proteome / Shotgun proteomics

Proteomics 2005, 5, 3442–3453

Correspondence: Dr. Xiaohang Zhao, National Laboratory of Molecular Oncology, Cancer Institute & Hospital, CAMS & PUMC, Beijing P. O. Box 2258, Beijing 100021, P. R. China
E-mail: [email protected]
Fax: +86-10-67709015

Abbreviations: HBV, hepatitis B virus; HCV, hepatitis C virus; HIV, human immunodeficiency virus; PPP, Plasma Proteome Project

* These authors contributed equally.

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

DOI 10.1002/pmic.200401301

1 Introduction

Serum/plasma, an amorphous and important component of blood, contains tens of thousands of proteins as well as various small molecules, including salts, lipids, and sugars [1]. As blood flows through most of the tissues and organs of the human body and comes into contact with almost every cell, the origins of serum/plasma proteins are varied. Many studies have shown that quantitative and qualitative changes in serum proteins are related to physiological or pathological states of the human body. Serum has therefore long been used for clinical diagnosis and therapeutic monitoring. With the development of proteomics, more and more proteomic techniques (such as 2-DE, MALDI-TOF, LC-MS/MS, and SELDI-TOF) are used to profile the proteins present in serum and to identify disease-related serum markers, with the aim of early diagnosis and effective treatment of diseases [1–4]. Although the study of serum proteins has benefited greatly from the application of proteomic techniques, a great many proteins in serum remain unknown. The difficulties arise from the wide dynamic range of concentrations (about 9–12 orders of magnitude), individual variation (race, sex, age, etc.), susceptibility to various factors (such as living habits), and so on. Recently, Anderson et al. [5] reviewed the current status of serum/plasma proteome research and summarized 1175 nonredundant serum proteins based on four reported sources.

Historically, 2-DE has been the primary method for separating and comparing complex protein mixtures [3], and it has been used to analyze human serum [6–13]. Despite its prominent advantages, e.g., high resolution and sensitivity, 2-DE requires manual dexterity and precision to reproduce reliably and is thus not well suited as a high-throughput technology [14]. Moreover, it has difficulty detecting proteins with extreme molecular mass or pI [15]. Recently, multidimensional shotgun proteomics has proven to be an alternative technology that can identify hundreds of proteins from a single sample and is complementary to 2-DE-based analysis [16–18]. In this workflow, the proteins are usually digested with a proteolytic enzyme to generate shorter peptides that are more easily analyzed by MS. Shotgun proteomics relies on separation after this digestion step and takes advantage of MS/MS to infer the amino acid sequence of individual peptides [19]. Compared with traditional 2-DE, the shotgun approach is a powerful tool for separating and identifying proteins from complex protein mixtures, with the advantages of high efficiency and savings in time and labor [20].

In this study, we investigated the protein components in sera of healthy Chinese using shotgun proteomics. To eliminate the interference of high-abundance proteins, a multiple affinity removal column system was used to remove six high-abundance proteins, i.e., albumin, transferrin, haptoglobin, α-1-antitrypsin, IgA, and IgG. Trypsin was used to digest the flow-through and bound protein mixtures, which were eluted from the affinity column under different buffer conditions, followed by RP-HPLC. The components of the isolated protein mixtures were then identified by ESI-MS/MS of the peptides. Some of the proteins/peptides identified have not been reported by previous proteomic studies, and some were low-abundance proteins, such as tumor markers.

2 Materials and methods

2.1 Preparation of pooled male and female serum specimens

All the tested sera (the HUPO reference specimen, c1-CAMS) provided by the Chinese Academy of Medical Sciences were prepared according to the BD protocol [21] with some modifications. Twenty healthy donors (ten male and ten female) were strictly selected after a larger group was tested to exclude those infected with human immunodeficiency virus (HIV), hepatitis B virus (HBV), hepatitis C virus (HCV), or syphilis. The pools were prepared after approval by the Institutional Review Board (IRB) and with informed consent from the donors. Donors were required to fast and to avoid taking medicines and drinking alcohol for 12 h before sampling. Human blood was obtained by venipuncture from each donor into evacuated blood collection tubes that contained either no anticoagulant or an appropriate volume of anticoagulant (K-EDTA, sodium citrate, and lithium heparin) to prevent clotting. The specimens were centrifuged at 2600 × g for 15 min at 4°C. The resultant serum and plasma were transferred into secondary centrifuge tubes, which were then centrifuged again at 12 000 × g for 5 min at 4°C to remove all potentially remaining cells. Afterwards, equal volumes (2.0 mL) of serum from each donor were combined to form the pooled male and pooled female serum/plasma, which were analyzed for this paper. Equal volumes of the male and female serum/plasma (20 mL each) were then pooled again to form the healthy Chinese serum/plasma reference specimens (named c1-CAMS). Each specimen was divided into 0.25 mL aliquots in labeled cryovials already on dry ice, stored at −80°C, and delivered to the HUPO PPP participating laboratories on dry ice. Before use, the specimen was thawed and recentrifuged at 12 000 × g for 10 min at 4°C to remove insoluble material.

2.2 Depletion of the highly abundant serum proteins

Crude sera (male and female) were thawed, diluted five-fold with buffer A (product no. 5185–5987, pH 7.4; Agilent Technologies, Palo Alto, CA, USA), and then filtered through 0.22 μm filters (Agilent Technologies) by spinning at 16 000 × g at room temperature for 1.5 min. Diluted serum samples were injected onto a Multiple Affinity Removal System® HPLC column (Agilent Technologies) in 100% buffer A at a flow rate of 0.25 mL/min for 9 min [22]. The bound proteins were eluted in 100% buffer B at a flow rate of 1.0 mL/min for 3.5 min. All chromatographic fractionations were performed at room temperature (22°C) on an HP1100 HPLC system with the automated sample injector set at 4°C. The unbound (low-abundance) and bound (high-abundance) proteins were collected into Eppendorf tubes and stored at −20°C for further analysis.


2.3 Protein concentration

Fractions collected from five injections, respectively, were pooled into two spin concentrators with a 3 kDa molecular weight cutoff (Millipore, Microcon). The fractions were spun at 12 000 × g at 4°C for 2 h. Protein content was estimated with a Bradford protein assay using BSA as the protein standard. Each protein mixture was lyophilized for digestion.

2.4 Trypsin digestion of flow through and bound protein mixtures

Appropriate volumes of protein sample for each fraction were redissolved in reducing solution (6 M guanidine hydrochloride, 100 mM ammonium bicarbonate, pH 8.3) with the protein concentration adjusted to 3 mg/mL. Next, 300 μg of protein sample for each fraction, in a volume of 100 μL, was mixed with 1 μL of 1 M DTT. The protein mixtures were then incubated with trypsin (50:1) at 37°C overnight.
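The digestion itself is a bench procedure, but the peptides it is expected to produce can be illustrated in silico. The following is a minimal sketch that assumes the commonly used trypsin rule (cleavage C-terminal to K or R, except when the next residue is P). The function name, the `missed_cleavages` parameter, and the example sequence are hypothetical and are not taken from the study.

```python
import re


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> list[str]:
    """Return peptides from an idealized tryptic digest of `sequence`.

    Assumed rule: cleave after K or R unless the following residue is P.
    """
    # Split at zero-width positions that follow K/R and do not precede P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i : j + 1]))
    return peptides


# Hypothetical sequence; the R-P bond is not cleaved under the assumed rule.
print(tryptic_digest("MKWVTFISLLRPGKAAR"))
```

Allowing missed cleavages in such a sketch corresponds loosely to the partially digested peptides that a real overnight digest may still contain.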

2.5 RP-HPLC MS/MS shotgun analysis

1-D LC-MS/MS was performed using an LTQ linear IT mass spectrometer (Thermo, San Jose, CA, USA). The system was fitted with a C18 RP column (180 μm × 100 mm, BioBasic® C18, 5 μm; Thermo Hypersil-Keystone). Mobile phase A (0.1% formic acid in water) and mobile phase B (0.1% formic acid in ACN) were used. The tryptic peptide mixtures were eluted using a gradient of 2–98% B over 180 min. The LTQ linear IT mass spectrometer was set so that one full MS scan was followed by ten MS/MS scans on the ten most intense ions from the MS spectrum, with the following Dynamic Exclusion™ settings: repeat count 2, repeat duration 30 s, exclusion duration 90 s. Each sample was analyzed in triplicate.

2.6 Protein identification

The acquired MS/MS spectra were automatically searched against a protein database for human proteins (EMBL-EBI proteome set for Homo sapiens (human), released 5/27/2004) using the TurboSEQUEST program in the BioWorks™ 3.0 software suite. An accepted SEQUEST result had to have a ΔCn score of at least 0.1 (regardless of charge state) and an Rsp cutoff of ≤ 4. A singly charged peptide had to be tryptic, with a cross-correlation score (Xcorr) of at least 1.9. Tryptic or partially tryptic peptides with a charge state of +2 had to have an Xcorr of at least 2.2. Tryptic or partially tryptic peptides with a charge state of +3 were accepted if the Xcorr was ≥ 3.75 [23].
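The acceptance criteria above amount to a small rule set applied per peptide hit. The sketch below is one possible encoding under stated assumptions: the `PeptideHit` record and its field names are hypothetical, the Rsp cutoff is read as "Rsp no greater than 4", and tryptic status is represented as the number of tryptic termini (2 = fully tryptic, 1 = partially tryptic). It is not the filtering code used in the study.

```python
from dataclasses import dataclass

# Minimum Xcorr per charge state, as stated in the text.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}


@dataclass
class PeptideHit:
    """Hypothetical record for one SEQUEST peptide match."""
    sequence: str
    charge: int
    xcorr: float
    delta_cn: float
    rsp: int
    tryptic_termini: int  # 2 = fully tryptic, 1 = partially tryptic, 0 = non-tryptic


def passes_filter(hit: PeptideHit) -> bool:
    """Apply the charge-dependent acceptance criteria described above."""
    if hit.charge not in XCORR_MIN:
        return False
    # Charge-independent cutoffs (Rsp read as "at most 4": an assumption).
    if hit.delta_cn < 0.1 or hit.rsp > 4:
        return False
    if hit.xcorr < XCORR_MIN[hit.charge]:
        return False
    # Singly charged peptides must be fully tryptic; +2/+3 may be partially tryptic.
    required_termini = 2 if hit.charge == 1 else 1
    return hit.tryptic_termini >= required_termini


# Hypothetical hit: doubly charged, fully tryptic, comfortably above all cutoffs.
print(passes_filter(PeptideHit("LVNEVTEFAK", 2, 2.5, 0.2, 1, 2)))
```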

2.7 Bioinformatic analysis

Protein identification results were extracted from the SEQUEST .out files with the in-house software BuildSummary. Functional classifications were performed with GOA (http://www.ebi.ac.uk/GOA/) according to the IPI accession numbers of the proteins.

3 Results

3.1 Data production of the shotgun MS identification

The samples we used were from the Chinese male and female donors from which the c1-CAMS reference specimen was prepared. We employed an antibody column to remove the top six proteins (albumin, IgG, IgA, transferrin, haptoglobin, and antitrypsin). However, we did not discard the fraction bound to the column, which may contain proteins other than the top six. Therefore, four fractions were collected; every fraction was digested and analyzed by LC-MS/MS in duplicate, resulting in eight experimental data sets: male-bound-1, male-bound-2, female-bound-1, female-bound-2, male-unbound-1, male-unbound-2, female-unbound-1, and female-unbound-2 (Fig. 1). Raw spectra were produced by capillary LC linear IT-MS, and .dta and .out files were created by database searching with the SEQUEST engine. We used our in-house tool, BuildSummary, to summarize the .out files into protein lists. BuildSummary combined the peptide sequences into proteins and deleted redundant proteins as described by Wu et al. [24] with minor modifications. For example, if ten different peptides identified a protein and three of the ten were also present in another protein, only the protein with the greater number of peptides was listed, and the subset protein was removed. If all ten peptides were identified in two proteins, both proteins were listed and combined as a single group, similar to the HUPO PPP integration algorithm (Adamski et al., this issue). Because of protein homologs in the database, one or more peptides obtained by shotgun MS/MS methods may be assigned to multiple proteins; we called such a set of proteins a "group", which contains the protein(s) sharing the same identified peptide sequence(s). Thus, the more redundant the database, the greater the number of proteins, but the number of groups would not increase with database redundancy unless nonredundant protein entries were added to the database.

Figure 1. Flowchart of protein fractionation and identification. An antibody column was employed to remove the six proteins (albumin, IgG, IgA, transferrin, haptoglobin, antitrypsin), and the fraction bound to the column was retained. As a result, four fractions were collected, and every fraction was digested and analyzed by LC-MS/MS. All the experiments were done in duplicate, resulting in eight experimental data sets: male-bound-1, male-bound-2, female-bound-1, female-bound-2, male-unbound-1, male-unbound-2, female-unbound-1, and female-unbound-2.
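The grouping rule described in Section 3.1 (proteins supported by identical peptide sets are merged into one group, and a protein whose peptides are a subset of another protein's peptides is removed) might be sketched as follows. This is an illustration only: BuildSummary is an in-house tool whose code is not shown in the paper, so the function name, the input layout (a mapping from protein accession to its identified peptides), and the example accessions are all assumptions.

```python
def group_proteins(protein_peptides: dict[str, set[str]]) -> list[list[str]]:
    """Collapse proteins into groups and drop subset proteins.

    protein_peptides maps a protein accession to the set of peptide
    sequences identified for it (hypothetical input layout).
    """
    # Step 1: proteins with exactly the same peptide evidence share a group.
    by_peptides: dict[frozenset[str], list[str]] = {}
    for protein, peptides in protein_peptides.items():
        by_peptides.setdefault(frozenset(peptides), []).append(protein)

    # Step 2: discard any group whose peptide set is a strict subset of
    # the peptide set of another group.
    kept = [
        sorted(members)
        for peptides, members in by_peptides.items()
        if not any(peptides < other for other in by_peptides)
    ]
    return sorted(kept)


# Hypothetical example mirroring the text: B shares 3 of A's 10 peptides and
# has no others, so B is removed; C and D have identical evidence.
evidence = {
    "A": {f"p{i}" for i in range(1, 11)},
    "B": {"p1", "p2", "p3"},
    "C": {"q1", "q2"},
    "D": {"q1", "q2"},
}
print(group_proteins(evidence))
```

Counting groups rather than proteins in this way is what makes the result insensitive to redundant database entries, as the text notes.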

3.2 Proteins in bound and unbound fractions

Table 1 lists the top-ten proteins identified in the bound and unbound fractions, respectively. Albumin, IgG, transferrin, haptoglobin, and antitrypsin are among the top-ten bound proteins, whereas apolipoprotein B-100 precursor, complement C3 precursor, and alpha-2-macroglobulin precursor become the most abundant proteins in the unbound fraction, indicating efficient depletion of the six proteins. Moreover, antitrypsin was found only in the bound fraction, and few of the other five proteins were also detected in the unbound fraction, which further confirms the good performance of the depletion. Table 2 shows the proteins with at least two peptides identified only in the bound fraction but not in the unbound fraction.

3.3 Data comparison of different fractions

To investigate the reproducibility and overlap of the different fractions, the following strategy was adopted to select common proteins among data sets: (1) combine the .out files of the related fractions and generate the group results; (2) if peptides from both fractions are found in a group, count one common protein. Table 3 shows the protein overlap among fractions. Table 3A shows that the reproducibility of the duplicate experiments reaches only 40.1–52.3%. As expected, the overlap between the bound and unbound fractions is very small (16.0–18.4%). Overall, the overlap between male and female sera reaches 40–50%.
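The text does not state how the overlap percentages in Table 3 are calculated. The reported values are, however, reproduced when the number of common groups is divided by the mean of the two group counts being compared; the short sketch below checks that reading against the Table 3 figures. The formula is therefore an inference from the numbers, not a definition given by the authors, and the function name is hypothetical.

```python
def overlap_percent(n_first: int, n_second: int, n_common: int) -> float:
    """Overlap as a percentage of the mean size of the two data sets (assumed formula)."""
    return 100.0 * n_common / ((n_first + n_second) / 2)


# (label, groups in set 1, groups in set 2, common groups) from Table 3.
table3 = [
    ("male-bound, exp. 1 vs. 2", 154, 150, 61),
    ("male-unbound, exp. 1 vs. 2", 275, 249, 137),
    ("male, bound vs. unbound", 248, 388, 51),
    ("total, male vs. female", 594, 622, 284),
]
for label, n1, n2, common in table3:
    print(f"{label}: {overlap_percent(n1, n2, common):.1f}%")
```

Under this reading, the four duplicate-run values in Table 3A (40.1, 44.7, 52.3, and 45.8%) average to about 45.7%.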

3.4 Protein identification by different databases

In the HUPO PPP, the IPI database was used as the reference for searching. The size and redundancy of the database influence protein identification. We used four human databases, IPI 2.20 (13/June/2003), IPI 2.32 (05/May/2004), Swiss-Prot (sprot43.dat, 03/March/2004), and NCBI (04/December/2003), to test the results from male-bound-1 and male-unbound-1. Table 4 shows the identification results. The numbers of identified groups are similar across the different databases, whereas the NCBI database produced many more proteins because of its large redundancy. The two IPI databases are both acceptably nonredundant and give quite similar numbers of groups and proteins.

Table 1. Top-ten proteins in bound fraction (A) and unbound fraction (B)

A. Bound fraction

| No. | IPI accession number | Protein name | Sequence coverage (%) | Molecular weight | pI |
|---|---|---|---|---|---|
| 1 | IPI00022434.1 | Serum albumin precursor | 74.71 | 69 366.9 | 5.92 |
| 2 | IPI00022463.1 | Serotransferrin precursor | 56.16 | 77 049.89 | 6.81 |
| 3 | IPI00164623.2 | Complement C3 precursor | 23.14 | 187 235.17 | 6.02 |
| 4 | IPI00385332.1 | Hypothetical protein | 41.28 | 51 204.1 | 8.46 |
| 5 | IPI00019571.3 | Haptoglobin precursor | 51.20 | 46 270.44 | 6.23 |
| 6 | IPI00384938.1 | Hypothetical protein DKFZp686N02209 | 36.31 | 52 852 | 8.75 |
| 7 | IPI00305457.3 | Alpha-1-antitrypsin precursor | 44.50 | 46 736.49 | 5.37 |
| 8 | IPI00399007.2 | Ig gamma-2 chain C region | 58.59 | 35 884.58 | 7.66 |
| 9 | IPI00335356.1 | Hypothetical protein | 35.85 | 65 039.16 | 6.34 |
| 10 | IPI00333982.2 | Human full-length cDNA clone CS0DI019YF20 of placenta of Homo sapiens | 29.12 | 57 486.39 | 8.35 |

B. Unbound fraction

| No. | IPI accession number | Protein name | Sequence coverage (%) | Molecular weight | pI |
|---|---|---|---|---|---|
| 1 | IPI00022229.1 | Apolipoprotein B-100 precursor | 41.46 | 515 562.59 | 6.61 |
| 2 | IPI00164623.2 | Complement C3 precursor | 65.99 | 187 235.17 | 6.02 |
| 3 | IPI00032256.1 | Alpha-2-macroglobulin precursor | 55.77 | 163 277.92 | 6.0 |
| 4 | IPI00032258.4 | Complement C4 precursor | 40.77 | 192 771.88 | 6.66 |
| 5 | IPI00029739.3 | Splice isoform 1 of P08603 complement factor H precursor | 38.42 | 139 125.57 | 6.28 |
| 6 | IPI00017601.1 | Ceruloplasmin precursor | 42.16 | 122 205.24 | 5.44 |
| 7 | IPI00022463.1 | Serotransferrin precursor | 54.58 | 77 049.89 | 6.81 |
| 8 | IPI00022418.1 | Splice isoform 1 of P02751 fibronectin precursor | 22.55 | 262 606.29 | 5.45 |
| 9 | IPI00019580.1 | Plasminogen precursor | 46.79 | 90 569 | 7.04 |
| 10 | IPI00294193.3 | Splice isoform 1 of Q14624 inter-alpha-trypsin inhibitor heavy chain H4 precursor | 43.87 | 103 358.45 | 6.51 |

Table 2. Proteins with at least two unique peptides found only in bound fraction

| IPI accession number | Protein name | Peptide hits | Unique peptides | Sequence coverage (%) |
|---|---|---|---|---|
| IPI00305457.3 | Alpha-1-antitrypsin precursor | 977 | 14 | 44.50 |
| IPI00386133.1 | Ig kappa chain V-IV region B17 precursor | 19 | 3 | 24.63 |
| IPI00385253.1 | Ig kappa chain V-III region CLL precursor | 12 | 3 | 33.33 |
| IPI00385985.1 | Ig lambda chain V-III region LOI | 9 | 3 | 37.84 |
| IPI00386839.1 | Amyloid lambda 6 light chain variable region SAR | 4 | 3 | 37.93 |
| IPI00383732.1 | VH3 protein | 38 | 2 | 15.65 |
| IPI00026175.1 | Ig kappa chain V-II region GM607 precursor | 17 | 2 | 31.62 |
| IPI00024138.1 | Ig kappa chain V-III region VH precursor | 10 | 2 | 23.28 |
| IPI00296365.3 | Centromeric protein E | 9 | 2 | 0.94 |
| IPI00015586.1 | Hypothetical protein FLJ21924 | 8 | 2 | 3.04 |
| IPI00217045.2 | Ig heavy chain V-I region HG3 precursor | 6 | 2 | 19.66 |
| IPI00025702.2 | PR-domain zinc finger protein 15 | 6 | 2 | 1.86 |
| IPI00306239.2 | Katanin p80 subunit B 1 | 2 | 2 | 4.73 |
| IPI00177494.1 | Line-1 reverse transcriptase | 2 | 2 | 12.27 |

Table 3. Protein overlapping of different fractions

A. Duplicate experiments

| Fraction | Group numbers in experiment 1 | Group numbers in experiment 2 | Overlapped proteins in both experiments 1 and 2 (number/%) |
|---|---|---|---|
| Male-bound | 154 | 150 | 61/40.1% |
| Female-bound | 213 | 167 | 85/44.7% |
| Male-unbound | 275 | 249 | 137/52.3% |
| Female-unbound | 215 | 261 | 109/45.8% |

B. Bound versus unbound fractions

| Fraction | Group numbers in bound fraction | Group numbers in unbound fraction | Overlapped proteins in both bound and unbound fractions (number/%) |
|---|---|---|---|
| Male | 248 | 388 | 51/16.0% |
| Female | 301 | 372 | 62/18.4% |
| Total | 432 | 583 | 85/16.7% |

C. Male versus female

| Fraction | Group numbers in male | Group numbers in female | Overlapped proteins in both male and female (number/%) |
|---|---|---|---|
| Bound | 248 | 301 | 124/45.2% |
| Unbound | 388 | 372 | 183/48.2% |
| Total | 594 | 622 | 284/46.7% |

Table 4. Protein numbers in male-bound-1 and male-unbound-1 fractions using different databases

| Database | Group/protein numbers in male-bound-1 fraction | Group/protein numbers in male-unbound-1 fraction |
|---|---|---|
| IPI-05/May/2004 | 154/255 | 275/451 |
| IPI-13/June/2003 | 156/248 | 291/457 |
| Swiss-Prot-03/March/2004 | 150/196 | 279/368 |
| NCBI-04/December/2003 | 174/501 | 298/1075 |

3.5 Removal of redundancy in data sets

When multiple data sets were obtained, the issue was how to compare the protein results. The results were grouped, but when the pI and molecular weight were calculated and protein function was classified, it was observed that almost all proteins in each group were highly homologous, generally belonging to the same superfamily or being different alternative splicing isoforms. It is unreasonable to use all proteins in all groups to calculate pI and molecular weight and to classify protein function. In this work, we designed a two-step strategy to remove redundancy among the identified proteins, so that only one protein remained in each group. If a group contained two or more proteins, entries annotated in both Swiss-Prot and RefSeq NP were kept; failing that, entries annotated in Swiss-Prot were kept; if there was no Swiss-Prot entry, entries with RefSeq NP annotation were kept. If more than one entry still remained, the protein with the longest sequence was kept as the unique entry for the group. Figure 2 shows the selection based on database annotation. The protein numbers produced by each step are also indicated in the figure.


Figure 2. Procedure for removal of redundancy in data sets. The selection based on database annotation includes two steps. Step 1: filter protein entries by the following priorities: keep entries built from both Swiss-Prot and RefSeq_NP > keep entries built from Swiss-Prot > keep entries built from RefSeq_NP > keep all entries. Step 2: select the entry with the longest sequence from the kept entries. The protein numbers produced by each step are also indicated in the figure.
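The two-step selection of one representative entry per group, as described in Section 3.5 and Figure 2, can be written compactly. The sketch below is a minimal illustration under stated assumptions: the `Entry` record, its field names, and the example accessions are hypothetical, and ties in sequence length are resolved by input order because the paper does not say how they were handled.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record for one protein entry in a group."""
    accession: str
    sequence: str
    in_swissprot: bool
    in_refseq_np: bool


def pick_representative(group: list[Entry]) -> Entry:
    """Return the single entry that represents a protein group."""
    if not group:
        raise ValueError("group must contain at least one entry")
    # Step 1: annotation priorities, from most to least preferred.
    priorities = [
        lambda e: e.in_swissprot and e.in_refseq_np,
        lambda e: e.in_swissprot,
        lambda e: e.in_refseq_np,
        lambda e: True,  # fall back to keeping all entries
    ]
    kept = group
    for matches in priorities:
        candidates = [e for e in group if matches(e)]
        if candidates:
            kept = candidates
            break
    # Step 2: keep the longest sequence (first one wins on ties: an assumption).
    return max(kept, key=lambda e: len(e.sequence))


# Hypothetical group: the Swiss-Prot entry wins even though another is longer.
group = [
    Entry("IPI_A", "M" * 300, in_swissprot=False, in_refseq_np=True),
    Entry("IPI_B", "M" * 120, in_swissprot=True, in_refseq_np=False),
    Entry("IPI_C", "M" * 500, in_swissprot=False, in_refseq_np=False),
]
print(pick_representative(group).accession)
```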

3.6 Shotgun sequencing of serum proteins from healthy pooled male and female serum

Applying the shotgun sequencing strategy, we identified a total of 944 nonredundant proteins from the two pooled serum samples. Among the 944 proteins, 178 were considered hypothetical proteins according to database searching. We identified 594 proteins, including 102 hypothetical proteins, from the pooled male serum and 622 proteins, including 112 hypothetical proteins, from the pooled female serum.

3.7 Function and localization analysis of pooled serum proteins

Functional classifications were performed with GOA (http://www.ebi.ac.uk/GOA/) according to the IPI accession numbers of the proteins. Next, we divided the proteins into ten groups according to their functions (Fig. 3A). To determine where these serum proteins came from, we additionally searched the IPI, Swiss-Prot, NCBI, and RefSeq protein databases [25–27] and found that the proteins with a definite subcellular localization were mainly secreted proteins (66%), with others localized to the membrane, cytoplasm, nucleus, centrosome, and peroxisome (Fig. 3B).

4 Discussion

4.1 Data quality and sensitivity

In this study, the serum donors were healthy Han-nationality Chinese. With the shotgun strategy of proteomic analysis, a total of 944 distinct proteins were identified (594 in the pooled male and 622 in the pooled female serum specimens). These proteins were screened with stringent search parameters: Xcorr ≥ 1.9, ≥ 2.2, and ≥ 3.75 for singly, doubly, and triply charged peptide ions, respectively, with a ΔCn score of at least 0.1 and an Rsp cutoff of ≤ 4. A singly charged peptide had to be fully tryptic, whereas fully or partially tryptic peptides with a charge state of +2 or +3 were accepted. With multipeptide identifications (identified by more than one peptide), a total of 3020 proteins were identified by the HUPO PPP participating laboratories; these were confirmed by more than one laboratory or in more than one specimen in the HUPO PPP identified protein index (HUPO PPP IPI) [28]. The 3020 proteins include 206 proteins identified by our group with at least two distinct peptides per protein. Additionally, using high-confidence identifications (Xcorr ≥ 1.9, ≥ 2.2, ≥ 3.75; ΔCn ≥ 0.1; Rsp ≤ 4.0; fully tryptic) with at least two distinct peptides per protein from the same experiment and specimen in the same laboratory, a total of 551 serum proteins were identified and considered to be high-quality proteins in the HUPO PPP IPI, of which 185 were found in our lists (33.6%, Table 5). Eighty-six percent (159/185) of the high-quality proteins have been confirmed by other PPP participating laboratories in the HUPO PPP data sets (Fig. 4A). We then compared the 185 high-quality proteins with those of other ethnic groups under the same parameters in the HUPO PPP IPI data sets. There was a 68% (126/185) overlap with the 482 high-quality proteins of the b1 reference specimen (Caucasian-American, Fig. 4B) and a 90% (51/57) overlap with the 57 high-quality proteins of the b3 reference specimen (African-American, Fig. 4C). Moreover, nearly 66% of our high-quality proteins are secreted proteins (Fig. 3B), which are regarded as classical plasma proteins [1]. It is of particular interest that we also identified some low-abundance proteins (ng/mL) that are known to be involved in disease processes. These include complement C5, the ovarian cancer-related marker CA125 (with high-quality identifications), and the breast cancer-related marker estrogen receptor. We therefore conclude that our data are relatively trustworthy.


Figure 3. Classification and subcellular localization of serum proteins. (A) Classification of functional proteins considered from databases. Total nonredundant proteins identified from both male and female sera were analyzed. (B) Subcellular localization of high-quality proteins. High-quality proteins identified from both male and female sera that were found to have a definite subcellular localization were analyzed through database searching. Mem, membrane; Nuc, nucleus; Cyt, cytoplasm; Sec, secreted protein; Ext, extracellular protein; Per, peroxisomal.

4.2 Shotgun identification

Recently, the multidimensional shotgun technique has proven to be a powerful proteomic analysis tool, whose prominent advantages are that it avoids complicated sample prefractionation and efficiently identifies hundreds of proteins in a single run. We compared our nonredundant male (594) and female (622) serum proteins with the analytical results of Pieper et al. [2] and Adkins et al. [3], who focused on male and female sera, respectively. We identified more proteins than Pieper et al. (594 vs. 325) and Adkins et al. (622 vs. 490), revealing the promising prospects of the shotgun strategy. Moreover, as shown in Fig. 5, a substantial fraction of the proteins detected in this study have a molecular weight from 10 to 100 kDa (with over 30% of the proteins having a hypothetical Mr of more than 100 kDa), whereas 2-DE normally detects less than 10% of proteins at the extremes of molecular mass [1]. Although the strategy still has some shortcomings, such as lower resolution of isoforms at the protein level, less accurate quantification of differential expression levels in comparison with 2-DE [29], and relatively low reproducibility (45.7% on average in our results), the


Table 5. Our identified 185 high-quality proteins a)

| IPI accession number | Protein name | Number of peptides matched |
|---|---|---|
| IPI00000138.1 | Alpha-1,3-mannosyl-glycoprotein 2-beta-N-acetylglucosaminyltransferase | 2 |
| IPI00383732.1 | VH3 protein | 2 |
| IPI00002127.2 | Hypothetical protein KIAA1410 | 2 |
| IPI00002255.4 | Lipopolysaccharide-responsive and beige-like anchor protein | 2 |
| IPI00002464.1 | Hypothetical protein KIAA1304 | 2 |
| IPI00003351.1 | Extracellular matrix protein 1 precursor | 3 |
| IPI00003515.1 | Thyroid receptor interacting protein 11 | 2 |
| IPI00399007.2 | Ig gamma-2 chain C region | 13 |
| IPI00004617.1 | Ig gamma-3 chain C region | 8 |
| IPI00004618.1 | Ig gamma-4 chain C region | 8 |
| IPI00004641.1 | Ig alpha-2 chain C region | 9 |
| IPI00006014.1 | Hypothetical protein KIAA0562 | 2 |
| IPI00006114.2 | Pigment epithelium-derived factor precursor | 7 |
| IPI00006662.1 | Apolipoprotein D precursor | 4 |
| IPI00384407.1 | Myosin-reactive immunoglobulin heavy chain variable region | 2 |
| IPI00008556.1 | Splice isoform 1 of P03951 coagulation factor XI precursor | 3 |
| IPI00008558.1 | Plasma kallikrein precursor | 4 |
| IPI00008616.1 | Splice isoform 1 of Q9Y666 Solute carrier family 12 member 7 | 2 |
| IPI00008868.3 | Microtubule-associated protein 1B | 2 |
| IPI00009028.1 | Tetranectin precursor | 4 |
| IPI00009920.1 | Complement component C6 precursor | 7 |
| IPI00010286.2 | KIAA1109 protein | 2 |
| IPI00385985.1 | Ig lambda chain V-III region LOI | 3 |
| IPI00011252.1 | Complement component C8 alpha chain precursor | 7 |
| IPI00011261.1 | Complement component C8 gamma chain precursor | 4 |
| IPI00011264.1 | Complement factor H-related protein 1 precursor | 4 |
| IPI00011268.1 | Splice isoform 2 of Q9UKM9 RNA-binding protein Raly | 2 |
| IPI00013543.2 | | 2 |
| IPI00014845.7 | Dynein, axonemal, heavy polypeptide 8 | 2 |
| IPI00015286.1 | Dedicator of cytokinesis protein 1 | 2 |
| IPI00015586.1 | Hypothetical protein FLJ21924 | 2 |
| IPI00016472.1 | Hypothetical protein KIAA0853 | 2 |
| IPI00017601.1 | Ceruloplasmin precursor | 30 |
| IPI00017696.1 | Complement C1s component precursor | 4 |
| IPI00018305.1 | Insulin-like growth factor binding protein 3 precursor | 2 |
| IPI00019399.1 | Serum amyloid A-4 protein precursor | 3 |
| IPI00019568.1 | Prothrombin precursor | 21 |
| IPI00019571.3 | Haptoglobin precursor | 16 |
| IPI00019580.1 | Plasminogen precursor | 23 |
| IPI00019581.1 | Coagulation factor XII precursor | 3 |
| IPI00019591.1 | Splice isoform 1 of P00751 Complement factor B precursor | 21 |
| IPI00019943.1 | Afamin precursor | 10 |
| IPI00020091.1 | Alpha-1-acid glycoprotein 2 precursor | 7 |
| IPI00020265.2 | Hypothetical protein | 2 |
| IPI00020501.1 | Myosin heavy chain, smooth muscle isoform | 3 |
| IPI00020986.2 | Lumican precursor | 3 |
| IPI00020996.1 | Insulin-like growth factor binding protein complex acid labile chain precursor | 4 |
| IPI00021033.1 | Collagen alpha 1(III) chain precursor | 2 |
| IPI00021364.1 | Properdin precursor | 2 |
| IPI00021727.1 | C4b-binding protein alpha chain precursor | 16 |
| IPI00021841.1 | Apolipoprotein A-I precursor | 13 |
| IPI00021842.1 | Apolipoprotein E precursor | 6 |
| IPI00021854.1 | Apolipoprotein A-II precursor | 6 |
| IPI00021855.1 | Apolipoprotein C-I precursor | 2 |
| IPI00021856.1 | Apolipoprotein C-II precursor | 3 |
| IPI00021857.1 | Apolipoprotein C-III precursor | 3 |
| IPI00022229.1 | Apolipoprotein B-100 precursor | 107 |
| IPI00022371.1 | Histidine-rich glycoprotein precursor | 9 |
| IPI00022391.1 | Serum amyloid P-component precursor | 4 |
| IPI00022392.1 | Complement C1q subcomponent, A chain precursor | 2 |
| IPI00022394.2 | Complement C1q subcomponent, C chain precursor | 2 |
| IPI00022395.1 | Complement component C9 precursor | 8 |
| IPI00022420.2 | Plasma retinol-binding protein precursor | 7 |
| IPI00022426.1 | AMBP protein precursor | 8 |
| IPI00022429.1 | Alpha-1-acid glycoprotein 1 precursor | 7 |
| IPI00022431.1 | Alpha-2-HS-glycoprotein precursor | 7 |
| IPI00022432.1 | Transthyretin precursor | 6 |
| IPI00022434.1 | Serum albumin precursor | 48 |
| IPI00022445.1 | Platelet basic protein precursor | 5 |
| IPI00022446.1 | Platelet factor 4 precursor | 2 |
| IPI00022463.1 | Serotransferrin precursor | 30 |
| IPI00022488.1 | Hemopexin precursor | 17 |
| IPI00385253.1 | Ig kappa chain V-III region CLL precursor | 3 |
| IPI00022937.1 | Coagulation factor V precursor | 2 |
| IPI00023019.1 | Sex hormone-binding globulin precursor | 2 |
| IPI00023339.1 | CREB-binding protein | 2 |
| IPI00023673.1 | Galectin-3 binding protein precursor | 2 |
| IPI00177869.4 | Splice isoform 1 of O14791 Apolipoprotein L1 precursor | 5 |
| IPI00024138.1 | Ig kappa chain V-III region VH precursor | 2 |
| IPI00024825.1 | Megakaryocyte stimulating factor | 2 |
| IPI00025204.1 | CD5 antigen-like precursor | 3 |
| IPI00025327.1 | Plasminogen-related protein B precursor | 2 |
| IPI00025426.1 | Pregnancy zone protein precursor | 6 |
| IPI00025702.2 | PR-domain zinc finger protein 15 | 2 |
| IPI00025862.1 | C4b-binding protein beta chain precursor | 3 |
| IPI00386132.1 | Ig kappa chain V-IV region JI precursor | 3 |
| IPI00026199.1 | Plasma glutathione peroxidase precursor | 4 |
| IPI00026314.1 | Gelsolin precursor, plasma | 13 |
| IPI00027235.1 | Splice isoform 1 of O75882 Attractin precursor | 3 |
| IPI00382844.1 | Complement factor H-related protein 3 precursor | 2 |
| IPI00027507.1 | Aconitase | 2 |
| IPI00028051.1 | Splice isoform Long of P49754 Vacuolar assembly protein VPS41 homolog | 2 |
| IPI00028413.1 | Inter-alpha-trypsin inhibitor heavy chain H3 precursor | 4 |
| IPI00029168.1 | Apolipoprotein(a) precursor | 3 |
| IPI00029739.3 | Splice isoform 1 of P08603 complement factor H precursor | 36 |
| IPI00029863.1 | Alpha-2-antiplasmin precursor | 7 |
| IPI00030205.1 | Ig kappa chain V-III region HAH precursor | 2 |
| IPI00030264.1 | Hypothetical protein FLJ21613 | |
| IPI00030739.1 | Apolipoprotein M | 2 |
| IPI00031410.1 | FKBP-rapamycin associated protein | 2 |
| IPI00032179.1 | Antithrombin-III precursor | 12 |
| IPI00032215.2 | Alpha-1-antichymotrypsin precursor | 13 |
| IPI00032220.1 | Angiotensinogen precursor | 8 |
| IPI00032256.1 | Alpha-2-macroglobulin precursor | 53 |
| IPI00032258.4 | Complement C4 precursor | 50 |
| IPI00032291.1 | Complement C5 precursor | 21 |
| IPI00032328.1 | Splice isoform HMW of P01042 Kininogen precursor | 10 |
| IPI00041065.2 | Hyaluronan binding protein 2 | 2 |
| IPI00044529.3 | Microtubule-actin crosslinking factor 1, isoform 4 | 2 |
| IPI00374862.1 | Splice isoform 1 of Q96PQ7 Kelch-like protein 5 | 2 |
| IPI00044891.1 | Adaptor molecule-1 | 2 |
| IPI00059369.1 | Splice isoform 1 of Q96AA8 hypothetical protein KIAA0555 | 2 |
| IPI00061246.1 | Hypothetical protein | 5 |
| IPI00062247.2 | Tax_Id=9606 | 2 |
| IPI00386839.1 | Amyloid lambda 6 light chain variable region SAR | 2 |
| IPI00064667.1 | Glutamate carboxypeptidase-like protein 2 precursor | 2 |
| IPI00096066.1 | Succinyl-CoA ligase [GDP-forming] beta-chain, mitochondrial precursor | 2 |
| IPI00100399.2 | THAP domain protein 4 | 2 |
| IPI00102670.1 | Formin-binding protein 17 | 2 |
| IPI00103552.2 | Ovarian cancer related tumor marker CA125 | 2 |
| IPI00103595.1 | Centrosome-associated protein 350 | 2 |
| IPI00107113.1 | Similar to KIAA0266 gene product | 2 |
| IPI00402680.1 | Titin isoform N2-A | 4 |
| IPI00154742.1 | Hypothetical protein | 6 |
| IPI00163207.1 | Splice isoform 1 of Q96PD5 N-acetylmuramoyl-L-alanine amidase precursor | 9 |
| IPI00163446.1 | Hypothetical protein | 3 |
| IPI00164623.2 | Complement C3 precursor | 73 |
| IPI00165243.1 | Hypothetical protein | |
| IPI00165927.1 | Hypothetical protein KIAA1608 | 2 |
| IPI00166729.1 | Alpha-2-glycoprotein 1, zinc | 5 |
| IPI00166930.2 | Similar to Carboxypeptidase N 83 kDa chain (carboxypeptidase N regulatory subunit) | 2 |
| IPI00168679.1 | Hypothetical protein | 14 |
| IPI00169294.1 | Protein C12orf2 | 2 |
| IPI00386879.1 | Hypothetical protein FLJ14473 | 13 |
| IPI00178198.1 | | 48 |
| IPI00178482.1 | Hypothetical protein | 21 |
| IPI00178926.1 | Immunoglobulin J chain | 3 |
| IPI00182274.1 | | 2 |
| IPI00182635.1 | | 84 |
| IPI00184715.1 | Hypothetical protein | 29 |
| IPI00215983.1 | Carbonic anhydrase I | 2 |
| IPI00216722.1 | Alpha 1B-glycoprotein | 38 |
| IPI00217045.2 | Ig heavy chain V-I region HG3 precursor | 2 |
| IPI00218192.1 | Splice isoform 2 of Q14624 Inter-alpha-trypsin inhibitor heavy chain H4 precursor | 24 |
| IPI00218732.1 | Paraoxonase 1 | 6 |
| IPI00218746.1 | Complement component 1, q subcomponent, beta polypeptide precursor | 3 |
| IPI00218816.1 | Beta globin | 7 |
| IPI00218845.2 | Nitric oxide synthase 3 (endothelial cell) | 2 |
| IPI00218897.2 | Alcohol dehydrogenase 1B (class I), beta polypeptide | 2 |
| IPI00219168.2 | Spectrin beta chain, brain 4 | 2 |
| IPI00220259.3 | Pericentrin B | 2 |
| IPI00221211.3 | Transmembrane protease, serine 4 | 2 |
| IPI00221325.2 | Ran-binding protein 2 | 2 |
| IPI00239405.3 | Splice isoform 1 of Q8WXH0 Nesprin 2 | 2 |
| IPI00244391.1 | Xanthine dehydrogenase | 2 |
| IPI00246101.1 | Similar to immunoglobulin heavy chain | 2 |
| IPI00261031.3 | Similar to hephaestin | 2 |
| IPI00289301.1 | Predicted testis protein | 2 |
| IPI00289649.3 | Protein tyrosine phosphatase, non-receptor type 3 | 2 |
| IPI00289809.3 | Phosphatidylinositol 3-kinase-related protein kinase | 2 |
| IPI00291262.3 | Clusterin precursor | 7 |
| IPI00291866.1 | Plasma protease C1 inhibitor precursor | 6 |
| IPI00291867.3 | Complement factor I precursor | 6 |
| IPI00292451.1 | Titin fetal isoform | 2 |
| IPI00292530.1 | Inter-alpha-trypsin inhibitor heavy chain H1 precursor | 17 |
| IPI00292946.1 | Thyroxine-binding globulin precursor | 3 |
| IPI00292950.1 | Heparin cofactor II precursor | 11 |
| IPI00293251.2 | Splice isoform 6 of O94833 Bullous pemphigoid antigen 1, isoforms 6/9/10 | 3 |
| IPI00294004.1 | Vitamin K-dependent protein S precursor | 6 |
| IPI00294395.1 | Complement component C8 beta chain precursor | 28 |
| IPI00295832.1 | Oligodendrocyte-myelin glycoprotein precursor | 2 |
| IPI00296099.1 | Thrombospondin 1 precursor | 4 |
| IPI00296165.3 | Complement C1r component precursor | 9 |
| IPI00296365.3 | Centromeric protein E | 2 |
| IPI00296608.4 | Complement component C7 precursor | 3 |
| IPI00334339.2 | Hypothetical protein FLJ39824 | 2 |
| IPI00298828.1 | Beta-2-glycoprotein I precursor | 11 |
| IPI00298971.1 | Vitronectin precursor | 11 |
| IPI00299063.1 | Stromal interaction molecule 1 precursor | 2 |
| IPI00301884.1 | Glycosylphosphatidylinositol-specific phospholipase D precursor | 2 |
| IPI00304273.1 | Apolipoprotein A-IV precursor | 16 |
| IPI00305457.3 | Alpha-1-antitrypsin precursor | 14 |
| IPI00305461.1 | Inter-alpha-trypsin inhibitor heavy chain H2 precursor | 17 |
| IPI00328461.1 | Hypothetical protein DKFZp686H112 | 118 |
| IPI00328696.7 | Hemoglobin alpha chain | 4 |

a) High-confidence multi-peptide identifications.

Figure 4. Comparative analysis of our high-quality proteins with the PPP database. Our high-quality proteins were compared with: (A) the 3020 proteins identified by multi-peptide in PPP data sets, excluding the proteins we submitted; (B) the 482 high-confidence multi-peptide identified proteins from the b1 reference specimen in 13 laboratories; (C) the 57 high-confidence multi-peptide identified proteins from the b3 reference specimen in two laboratories.


Figure 5. The distribution of hypothetical molecular weights of the total 944 proteins. Using the IPI database and the Compute pI/Mw bioinformatics tool, we obtained the hypothetical molecular weights of all 944 identified proteins and counted the number of proteins in each molecular weight range.

shotgun technique still holds great promise as a more valid, high-throughput strategy for proteomics studies once various improvements have been made.
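The molecular-weight tally summarized in Fig. 5 can be illustrated with a minimal Python sketch. The bin edges, the function names, and the example masses below are hypothetical and are not taken from the study's data. The sketch only shows how theoretical masses (in kDa), such as those returned by a pI/Mw calculator, might be counted per range.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges in kDa; the ranges actually used in Fig. 5 may differ.
BIN_EDGES_KDA = [10, 20, 40, 60, 80, 100]


def bin_label(mass_kda, edges=BIN_EDGES_KDA):
    """Return a label for the half-open range [low, high) containing mass_kda."""
    i = bisect_right(edges, mass_kda)
    if i == 0:
        return f"<{edges[0]}"
    if i == len(edges):
        return f">={edges[-1]}"
    return f"{edges[i - 1]}-{edges[i]}"


def mw_distribution(masses_kda, edges=BIN_EDGES_KDA):
    """Count proteins per mass range and return (counts, fractions)."""
    counts = Counter(bin_label(m, edges) for m in masses_kda)
    total = len(masses_kda)
    fractions = {label: n / total for label, n in counts.items()} if total else {}
    return counts, fractions


# Illustrative masses only (kDa), not measurements from this study.
example_masses = [8.5, 15.0, 45.2, 66.4, 99.9, 100.0, 187.1, 515.6]
counts, fractions = mw_distribution(example_masses)
```

With real data, the fraction reported for the highest range would correspond to the share of proteins with a hypothetical Mr above 100 kDa discussed in the text.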

4.3 Parallel analysis of bound and unbound fractions

The presence of higher abundance proteins interferes with the identification and quantification of lower abundance proteins, which are thought to include many disease markers [3]. Removing the top six high-abundance proteins therefore seems vital before further identification of serum proteins. A main concern with depletion, however, is that other proteins can be codepleted if they bind to the six targeted proteins. In this study, 351 proteins were found only in the bound fraction and not in the unbound fraction. Most of them were identified by only one peptide, and they may be proteins that were removed together with the targeted ones during depletion. This suggests that depletion removes some proteins in addition to the six captured ones. It is therefore necessary to analyze the bound and unbound fractions in parallel to obtain a more global profile of the plasma proteome.
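The bound versus unbound comparison described above amounts to simple set operations on protein identifiers. The following Python sketch is an illustration under stated assumptions: the function names, the placeholder accessions, and the peptide counts are hypothetical and do not reproduce the study's identification lists.

```python
def partition_fractions(bound_ids, unbound_ids):
    """Split identifications into bound-only, unbound-only, shared, and combined sets."""
    bound, unbound = set(bound_ids), set(unbound_ids)
    return {
        "bound_only": bound - unbound,      # candidates for codepletion
        "unbound_only": unbound - bound,
        "shared": bound & unbound,
        "combined": bound | unbound,        # profile from the parallel analysis
    }


def single_peptide_share(ids, peptide_counts):
    """Fraction of the given identifications supported by exactly one peptide."""
    ids = list(ids)
    if not ids:
        return 0.0
    return sum(1 for i in ids if peptide_counts.get(i, 0) == 1) / len(ids)


# Placeholder accessions and peptide counts, for illustration only.
bound = ["A", "B", "C", "D"]
unbound = ["C", "D", "E"]
peptides = {"A": 1, "B": 3, "C": 2, "D": 5, "E": 1}

parts = partition_fractions(bound, unbound)
bound_only_single = single_peptide_share(parts["bound_only"], peptides)
```

Here `parts["bound_only"]` plays the role of the proteins found only in the bound fraction, and `bound_only_single` is the share of them supported by a single peptide.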

We used a shotgun strategy to investigate the serum proteome of Han-nationality Chinese. A total of 944 nonredundant proteins were identified with stringent filter criteria. When our data were compared with the HUPO PPP dataset under identical filter parameters, a high degree of overlap was found. The resulting nonredundant list of Chinese serum proteins might enrich the dataset of the human plasma proteome and provide valuable clues for the discovery of disease markers.

We would like to thank Dr. Gilbert S. Omenn for helpful advice on the manuscript, and Dr. Bruce Haywood and Catherine Skobe for suggestions on sample handling. This work was supported by the National Science Fund for Distinguished Young Scholars (30225045), NNSF (30370713; 30171049), the National High-Tech R & D Program of China (2004AA227060; 2002BA711A11), the Basic Research Foundation (2001CB210501; 2002CB713807), and a HUPO small grant (PPP-SP/04-20).

5 References

[1] Anderson, N. L., Anderson, N. G., Mol. Cell. Proteomics 2002, 1, 845–867.

[2] Pieper, R., Gatlin, C. L., Makusky, A. J., Russo, P. S., Schatz, C. R., Miller, S. S., Su, Q. et al., Proteomics 2003, 3, 1345–1364.

[3] Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., Angell, N. H., Smith, R. D., Springer, D. L. et al., Mol. Cell. Proteomics 2002, 1, 947–955.

[4] Petricoin, E. F., Ardekani, A. M., Hitt, B. A., Levine, R. J., Fusaro, V. A., Steinberg, S. M., Mills, G. B. et al., Lancet 2002, 359, 572–577.

[5] Anderson, N. L., Polanski, M., Pieper, R., Gatlin, T., Tirumalai, R. S., Conrads, T. P., Veenstra, T. D. et al., Mol. Cell. Proteomics 2004, 3, 311–326.

[6] Eberini, I., Agnello, D., Miller, I., Villa, P., Fratelli, M., Ghezzi, P., Gemeiner, M. et al., Electrophoresis 2000, 21, 2170–2180.

[7] Eberini, I., Miller, I., Zancan, V., Bolego, C., Puglisi, L., Gemeiner, M., Gianazza, E. et al., Electrophoresis 1999, 20, 846–853.

[8] Haynes, P., Miller, I., Aebersold, R., Gemeiner, M., Eberini, I., Lovati, M. R., Manzoni, C. et al., Electrophoresis 1998, 19, 1484–1492.

[9] Edwards, J. J., Anderson, N. G., Nance, S. L., Anderson, N. L., Blood 1979, 53, 1121–1132.

[10] Anderson, L., Anderson, N. G., Proc. Natl. Acad. Sci. USA 1977, 74, 5421–5425.

[11] Miller, I., Haynes, P., Gemeiner, M., Aebersold, R., Manzoni, C., Lovati, M. R., Vignati, M. et al., Electrophoresis 1998, 19, 1493–1500.


[12] Miller, I., Haynes, P., Eberini, I., Gemeiner, M., Aebersold, R., Gianazza, E. et al., Electrophoresis 1999, 20, 836–845.

[13] Peters, T., Clin. Chem. 1987, 33, 1317–1325.

[14] Stephen, J. F., Peter, M. L., Curr. Opin. Chem. Biol. 2001, 5, 26–33.

[15] Rabilloud, T., Proteomics 2002, 2, 3–10.

[16] McDonald, W. H., Yates, J. R., Dis. Markers 2002, 18, 99–105.

[17] Wolters, D. A., Washburn, M. P., Yates, J. R., Anal. Chem. 2001, 73, 5683–5690.

[18] Wienkoop, S., Glinski, M., Tanaka, N., Tolstikov, V., Fiehn, O., Weckwerth, W., Rapid Commun. Mass Spectrom. 2004, 18, 643–650.

[19] McDonald, W. H., Yates, J. R., Curr. Opin. Mol. Ther. 2003, 5, 302–309.

[20] Wu, S. L., Choudhary, G., Ramstrom, M., Bergquist, J., Hancock, K. S., J. Proteome Res. 2003, 2, 383–393.

[21] Omenn, G. S., Proteomics 2004, 4, 1235–1240.

[22] Zhang, K., Zolotarjova, N., Nicol, G., Martosella, J., Yang, L. S., Szafranski, C., Bailey, J. et al., Agilent Technologies, publication 5988–9813EN.

[23] Washburn, M. P., Wolters, D., Yates, J. R., Nat. Biotechnol. 2001, 19, 242–247.

[24] Wu, C. C., MacCoss, M. J., Howell, K. E., Yates, J. R., Nat. Biotechnol. 2003, 21, 532–538.

[25] ftp://ftp.ebi.ac.uk/pub/databases/IPI/current/

[26] ftp://ftp.ebi.ac.uk/pub/databases/swissprot/

[27] ftp://ftp.ncbi.nih.gov/refseq/

[28] http://www.bioinformatics.med.umich.edu/app1/sqlwebaccess

[29] Lee, C. L., Hsiao, H. H., Lin, C. W., Wu, S. P. et al., Proteomics 2003, 3, 2472–2486.
