Discovery and Qualification of Serum Protein Biomarker Candidates for Cholangiocarcinoma Diagnosis

Kassaporn Duangkumpha,†,‡ Thomas Stoll,¥ Jutarop Phetcharaburanin,†,‡ Puangrat Yongvanit,‡ Raynoo Thanan,† Anchalee Techasen,‡,ǁ Nisana Namwat,†,‡ Narong Khuntikeo,‡,€ Nittaya Chamadol,‡,Ꜫ Sittiruk Roytrakul,Ω Jason Mulvenna,¥ Ahmed Mohamed,¥ Alok K. Shah,¥ Michelle M. Hill,¥,* and Watcharin Loilome†,‡,*

†Department of Biochemistry, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand

‡Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand

¥QIMR Berghofer Medical Research Institute, Queensland, Australia

ǁFaculty of Associated Medical Sciences, Khon Kaen University, Khon Kaen, Thailand

€Department of Surgery, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand

ꜪDepartment of Radiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand

ΩProteomics Research Laboratory, Genome Institute, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathum Thani, Thailand

Corresponding authors:

Associate Professor Watcharin Loilome, Email: [email protected]

Associate Professor Michelle Hill, Email: [email protected]

ABSTRACT

Cholangiocarcinoma (CCA) is a major health problem in northeastern Thailand. The majority of CCA cases are clinically silent and difficult to detect at an early stage. Although abdominal ultrasonography (US) can detect pre-malignant periductal fibrosis (PDF), this method is not suitable for screening populations in remote areas. With the goal of developing a blood test for detecting CCA in the at-risk population, we carried out serum protein biomarker discovery and qualification. Label-free shotgun proteomics was performed on depleted serum samples from 30 participants (n=10 each for the US-normal, US-PDF and CCA groups). Of 40 protein candidates selected and measured using multiple reaction monitoring on 90 additional serum samples (n=30 per group), 11 discriminatory proteins were obtained using supervised multivariate statistical analysis. We further evaluated 3 candidates using ELISA and immunohistochemistry (IHC). S100A9, thioredoxin (TRX) and cadherin-related family member 2 (CDHR2) were significantly different between the CCA and normal groups, and between the CCA and PDF groups, when measured in an additional 247 serum samples (p<0.0001). By IHC, TRX and CDHR2 were detected in the cytoplasm and nucleus of CCA and inflammatory cells. S100A9 was detected in the infiltrating immune cells of the tumor stroma. Proteomics discovery and qualification in depleted sera revealed promising biomarker candidates for CCA diagnosis.

KEYWORDS: Cholangiocarcinoma, Proteomics, Mass spectrometry, Multiple reaction

monitoring, Serum biomarker discovery pipeline

INTRODUCTION

Cholangiocarcinoma (CCA) is an aggressive cancer of the bile duct epithelium with a

poor survival rate due to lack of specific clinical symptoms leading to late diagnosis.1 CCA is

a major public health problem in the northeast of Thailand where it shows the highest incidence

in the world.2 Chronic inflammation of the biliary tract caused by the liver fluke (Opisthorchis

viverrini, Ov) is the principal mechanism that drives cholangiocarcinogenesis in the Mekong

area of Southeast Asia.2-3 Oxidative stress induced by Ov infection leads to DNA damage,

abnormal tissue remodeling and the alteration of gene expression, all of which have been

implicated in carcinogenesis.4-5 Interestingly, a number of molecules have been reported to be

differentially abundant during Ov-associated cholangiocarcinogenesis that could, therefore, be

used as biomarkers for the assessment and chemoprevention of liver fluke-associated

cholangiocarcinoma.6 Moreover, the chronic injury affecting the bile duct epithelial cells

during Ov infection leads to periductal fibrosis (PDF), which is believed to be an intermediate

pathological condition leading to CCA. We recently reported histological confirmation using

ultrasound-based diagnosis of PDF and CCA in a cohort of CCA patients.7 Although

ultrasonography is a potentially useful strategy to screen and follow up at-risk populations, this

technique is not easily accessible for remote populations. Therefore, as a first step towards

developing a specific and economic diagnostic test for CCA, we utilized our cohort to discover

blood-based protein biomarkers.

In the post-genomic era, the field of proteomics promises the discovery of new

molecular targets for therapy, biomarkers for early detection, and new endpoints for therapeutic

efficacy and toxicity.8 Protein expression fingerprints in body fluids such as serum or plasma

or in tissue biopsies from patients have been investigated for the potential to diagnose cancers

such as breast9, colorectal10, prostate11, lung12, as well as CCA.13-14 A previous discovery

proteomics study for plasma CCA biomarkers analysed sera from 10 CCA patients and 10

control subjects using two-dimensional gel electrophoresis (2-DE) and mass spectrometry. The

results showed that elevated α1-antitrypsin (AP1), together with three previously established

tumor markers (CA19-9, AP1 and α-fetoprotein, AFP) in plasma from CCA patients, could be

used to obtain a prediction accuracy of greater than 80% for CCA diagnosis.13 In another

approach, tissues from CCA and control samples were analysed for differentially expressed

proteins including protein S100A9, chaperonin-containing TCP1, subunit 3 (CCTγ), 14-3-3

proteins, periostin and α-smooth muscle actin (α-SMA).14-15 A major limitation of these

previous studies is the small sample size and lack of replication/validation.

One reason for the lack of replication may be the feasibility of performing discovery

proteomics on a large number of samples. Multiple reaction monitoring (MRM), also known

as selected reaction monitoring (SRM), is a targeted mass spectrometry approach to protein

quantitation that is emerging to bridge the gap between biomarker discovery and clinical

validation.16-17 Due to the higher sensitivity and throughput of MRM over shotgun proteomics,

this technique has been successfully applied for biomarker qualification by several research

groups, including ours.18-20 For instance, Brock and colleagues recently targeted seven high

abundance serum proteins for predicting colorectal cancer.21 Shah and co-workers have

developed a pipeline for glycoprotein biomarker discovery in serum for esophageal

adenocarcinoma using MRM in the biomarker qualification phase.22

In this study, we report the biomarker discovery workflow focusing on the discovery

of potential biomarkers for CCA diagnosis in serum of the at-risk population. We established

MRM-based quantitative assays for selected biomarker candidates and performed the

qualification phase on an independent cohort. The top three biomarker candidates were then

further evaluated using orthogonal methods in independent cohorts, including enzyme-linked

immunosorbent assay (ELISA) for serum protein measurement, and immunohistochemistry

(IHC) to evaluate CCA tissue expression.

EXPERIMENTAL SECTION

Study subjects and collection procedure

All subjects in these studies submitted their written, informed consent and the studies

were approved by the Human Ethics Committee of Khon Kaen University, Thailand based on

the ethics of human specimen experimentation of the National Research Council of Thailand

(HE531320 and HE571283). All serum samples were obtained from the Cholangiocarcinoma

Research Institute (CARI). The diagnoses of CCA patients employed clinical data, imaging

analysis and pathological diagnosis. Normal and periductal fibrosis groups were obtained from

the cholangiocarcinoma screening and care program (CASCAP) at CARI. CASCAP is a cross-

sectional study with ultrasonography diagnosis, in an endemic area for liver fluke infection in

Khon Kaen province. For the discovery phase, shotgun proteomics was performed on 30

age- and sex-matched serum samples from three groups (10 normal, 10 PDF and 10

CCA), and the qualification phase with MRM-MS analysis was carried out on

90 age- and sex-matched serum samples (30 normal, 30 PDF and 30 CCA). All

serum specimens were prepared using the same protocol within 3 h of blood collection into 2

tubes. After centrifugation at 1,000 x g for 10 min, 1 mL of serum was aliquoted into an

Eppendorf tube and stored at -80 °C prior to proteomic analysis. Cadaveric donor liver tissues

and CCA tumor tissue microarrays (TMA) obtained from 5 and 208 individuals, respectively,

who had undergone surgery at Srinagarind Hospital, Khon Kaen University, Thailand were

used for immunohistochemical staining (IHC).

Serum sample processing for mass spectrometry

The 12 most abundant serum proteins were depleted using immunodepletion kits

(Thermo Fisher Scientific, MA, United States) according to the manufacturer’s instructions.

Details of protein depletion and concentration are provided in the Supplementary Methods.

The protein concentration of each depleted serum sample was measured using a BCA assay

kit (Pierce Biotechnology, Rockford, United States). Post-depletion serum (30 µg) was spiked

with internal standards (10 pmol chicken ovalbumin for discovery samples; 10 pmol

ovalbumin plus indexed retention time (iRT) peptides for qualification samples),

denatured using 2% SDS in 100 mM triethylammonium bicarbonate (TEAB, pH 8.5) at 95 °C

for 5 min and cooled on ice for 5 min; reduced in 10 mM tris(2-carboxyethyl)phosphine

(TCEP) at 60 °C for 30 min; and alkylated in the dark at 37 °C for 30 min with 40 mM

2-chloroacetamide (CAA). Trypsin digestion was performed using the methanol co-precipitation

method as previously described.23 Samples were acidified to 1% formic acid and desalted using

C18 cartridges (Phenomenex, NSW, Australia). Digested peptide mixtures were dried and re-

suspended in 1% formic acid.

LC-MS/MS for discovery phase

Tryptic peptides (500 ng) were analyzed on a nanoACQUITY UPLC system (Waters,

Milford, US) coupled to a TripleTOF 5600 mass spectrometer (AB SCIEX) equipped with a

nano electrospray ion source. The peptides were loaded onto an M-Class 5 µm

Symmetry C18 180 µm x 20 mm trap column (Waters) before separation on an M-Class 1.7 µm BEH130

C18 75 µm x 200 mm LC column (Waters) at a flow rate of 300 nL/min and a column

temperature of 35 °C. MaxQuant software (version 1.6.0.16) was used for quantitative

label-free analysis of LC-MS/MS data.24-25 LC-MS/MS and data analysis details are available in

the Supplementary Methods.

MRM analysis for qualification phase

All samples were processed and analysed in a randomised order with an injection

volume of 30 µL (15 µg) on an LCMS-8050 (Shimadzu) triple quadrupole mass spectrometer

coupled with a standard-flow Shimadzu Nexera X2 ultra-high-performance liquid

chromatograph (UHPLC). The UHPLC system was fitted with a reverse-phase chromatographic

column, AdvanceBio Peptide Mapping (150 × 2.1 mm i.d., 2.7 µm, part number 653750–902,

Agilent Technologies), with a 5 mm long guard column. Details of the MRM method

development and LC-MS parameters are described in the Supplementary Methods. The final

method measured 42 proteins (40 biomarker candidates plus the internal standards, chicken

ovalbumin and iRT), 165 peptides and 567 transitions with a retention time window of 2 min

(Table S3).

MRM data analysis was performed using Skyline26 (version 3.7.1.11208; August

2017). All peaks were manually checked for correct integration, and the peak area of each peptide

(sum of all transitions) was exported for further analysis. Normalization was performed at the

peptide level using Microsoft Excel software. The iRT peptide peak intensity was first

normalized to the median of the iRT peptides for each sample. Next, using the normalized intensity of

the iRT peptides, the intensity of all other peptides was normalized (Table S5). To calculate protein

intensity from peptides that matched to the same protein, we filtered for normalized peptides

with a Pearson’s correlation of >0.9. Then, normalized peptide intensities were averaged to give the protein

intensity, and a log2 transformation was performed in the R computing environment to obtain the

near-normal distribution needed for statistical tests (Table S6). For statistical analysis,

missing values were replaced by the minimum detected intensity for each peptide.
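The normalization steps described above can be illustrated with a minimal Python sketch. This is not the authors' Excel/R workflow: the function names, the dictionary layout, the use of a global median as the reference for the iRT scaling factor, and the rule that a peptide is kept when it correlates (r > 0.9) with at least one other peptide of the same protein are all assumptions made for illustration.

```python
import math
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def impute_min(values):
    """Replace missing values (None) with the minimum detected intensity."""
    floor = min(v for v in values if v is not None)
    return [floor if v is None else v for v in values]

def normalize_to_irt(peptides, irt):
    """Scale every sample by its median iRT intensity (assumed scheme).

    peptides, irt: dict of peptide name -> list of intensities, one per sample.
    """
    n = len(next(iter(irt.values())))
    sample_medians = [median(v[i] for v in irt.values()) for i in range(n)]
    overall = median(sample_medians)          # assumed reference level
    factors = [m / overall for m in sample_medians]
    return {p: [x / f for x, f in zip(v, factors)] for p, v in peptides.items()}

def protein_log2(peptides, r_min=0.9):
    """Average well-correlated peptides of one protein, then log2-transform."""
    names = list(peptides)
    keep = [a for a in names
            if any(a != b and pearson(peptides[a], peptides[b]) > r_min
                   for b in names)]
    keep = keep or names                      # assumed fallback: keep all peptides
    n = len(peptides[names[0]])
    return [math.log2(mean(peptides[p][i] for p in keep)) for i in range(n)]

# Hypothetical intensities for one protein (three peptides, three samples).
raw = {"pepA": [100.0, 200.0, 400.0],
       "pepB": [110.0, 190.0, 420.0],
       "pepC": [300.0, 100.0, 200.0]}
irt = {"irt1": [10.0, 20.0, 10.0], "irt2": [12.0, 24.0, 12.0]}
normalized = normalize_to_irt(raw, irt)
protein = protein_log2(normalized)
```

In this toy input the second sample has twice the iRT signal, so all of its peptide intensities are halved before the correlation filter and averaging are applied.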

Bioinformatics and statistical analysis

Protein interaction network analysis was performed using STRING software based on

the STRING database and Gene Ontology (GO) terms. All analyses were performed and

visualized in SPSS 19.0 (IBM, USA), GraphPad Prism 5 and R statistical software. The

ANOVA test was conducted with the Shiny MixOmics online software (http://mixomics-projects.di.uq.edu.au/Shiny).

Principal component analysis (PCA) and orthogonal signal

correction projection to latent structures discriminant analysis (O-PLS-DA) were conducted in

SIMCA 15.0 (Umetrics, Sweden). Details of the statistical analysis are available

in the Supplementary Methods.

Antibody-based methods

The following primary antibodies were used for indirect enzyme-linked

immunosorbent assay (ELISA) and immunohistochemical staining (IHC): S100A9 (Cat.

#ab24111) and TRX (Cat. #ab185329), purchased from Abcam (Cambridge, MA), and

CDHR2/PCLKC (Cat. #orb158119), purchased from Biorbyt (San Francisco, CA). Detailed

methods for ELISA and IHC are available in the Supplementary Methods.

RESULTS

Overview of biomarker workflow and baseline characteristics of samples

We performed a multi-phased biomarker discovery and development study as

illustrated in Figure 1. Participant recruitment, ultrasound, blood sample collection and

biobanking were completed before biomarker discovery. According to the proposed

‘triangular’ biomarker study design for development of early cancer biomarkers,27 we used a

small cohort for discovery using shotgun proteomics to measure a broad range of proteins, then

increased the sample size while reducing the number of biomarker candidates. For the

discovery and qualification phases, we selected age-sex matched participants with ultrasound-

confirmed normal liver and PDF pathology. CCA cases were confirmed by pathology

diagnosis. In the qualification cohort, smoking status and alcohol consumption differed

significantly between groups (Table 1 and Table S9). Orthogonal investigation using antibody-based

methods was conducted in independent samples using ELISA and IHC techniques on serum

samples and tumor tissue microarrays, respectively.

Figure 1 Generalized workflow diagram for serum protein biomarker discovery. A; Serum

samples from the respective patient groups were stored at -80 °C until analysis. B; Discovery

samples (n=30) were depleted of the top 12 serum proteins and spiked with internal standard

protein. Tryptic peptides were analysed by label-free proteomics and MaxQuant software.

Biomarker candidates were selected after analysis with Shiny MixOmics. C; A custom multiple

reaction monitoring-mass spectrometry (MRM-MS) assay was developed for biomarker verification

in an independent cohort of 90 participants. Data processing and analysis used Skyline, R,

SIMCA and SPSS. D; Antibody-based assays were used to validate peptide level MS data for

selected candidates at the protein level in additional independent cohorts, n=247 for ELISA,

n=208 for IHC.

Table 1 Baseline characteristics of serum samples in the discovery and qualification phases

| Characteristic | Discovery (N=30): Normal (US) | Discovery: PDF (US) | Discovery: CCA | Discovery: p-value | Qualification (N=90): Normal (US) | Qualification: PDF (US) | Qualification: CCA | Qualification: p-value |
|---|---|---|---|---|---|---|---|---|
| Sample size | 10 | 10 | 10 | | 30 | 30 | 30 | |
| Gender, Male/Female | 5/5 | 5/5 | 5/5 | 1.000 | 15/15 | 15/15 | 15/15 | 1.000 |
| Age in years (Median ± SD) | 60 ± 11 | 60 ± 12 | 61 ± 11 | 1.000 | 63 ± 5 | 65 ± 6 | 64 ± 5 | 0.497 |
| Diagnosed with diabetes: Yes | 1 (10%) | 2 (20%) | - | 0.376 | 6 (20%) | 4 (13%) | 6 (20%) | 0.738 |
| Diagnosed with diabetes: No | 5 (50%) | 8 (80%) | 9 (90%) | | 24 (80%) | 26 (87%) | 18 (60%) | |
| Diagnosed with diabetes: Unknown* | 4 (40%) | - | 1 (10%) | | - | - | 6 (20%) | |
| Smoking status: Yes | 1 (10%) | 3 (30%) | 4 (40%) | 0.649 | 13 (43%) | 6 (20%) | 14 (47%) | **0.042** |
| Smoking status: No | 4 (40%) | 7 (70%) | 5 (50%) | | 13 (43%) | 22 (73%) | 10 (33%) | |
| Smoking status: Unknown* | 5 (50%) | - | 1 (10%) | | 4 (14%) | 2 (7%) | 6 (20%) | |
| Alcohol consumption: Yes | 2 (20%) | 4 (40%) | 6 (60%) | 0.449 | 18 (60%) | 8 (27%) | 21 (70%) | **P<0.001** |
| Alcohol consumption: No | 3 (30%) | 6 (60%) | 3 (30%) | | 8 (27%) | 20 (67%) | 3 (10%) | |
| Alcohol consumption: Unknown* | 5 (50%) | - | 1 (10%) | | 4 (13%) | 2 (6%) | 6 (20%) | |
| Histotype: Non-papillary | | | 4 (40%) | | | | 8 (27%) | |
| Histotype: Papillary | | | 6 (60%) | | | | 22 (73%) | |
| Metastatic status: No | | | 7 (70%) | | | | 16 (53%) | |
| Metastatic status: Yes | | | 3 (30%) | | | | 14 (47%) | |

Abbreviations: PDF, Periductal fibrosis; CCA, Cholangiocarcinoma; US, Ultrasound. P-values were calculated using Pearson’s chi-square test and the P-value of significantly different comparisons is in bold. * indicates data missing from the questionnaire.

Discovery of serum biomarker candidates using shotgun proteomics

For the discovery phase, we selected serum samples from well-matched participants,

10 per group (Table 1). Following abundant protein depletion, we added an internal standard

protein, chicken ovalbumin, for quality control purposes. LC-MS/MS and database searching

identified a total of 238 proteins (Table S1A) and 3,600 peptides (Table S1B), with more than 80%

of the proteins identified across all 3 groups (Figure 2A). To confirm the quality of the shotgun

dataset for label-free quantitation, we examined the coefficient of variation (CV) of chicken

ovalbumin, which was added at an equal amount across all samples post-depletion. The CV

for chicken ovalbumin label-free quantification (LFQ) was below 20% (Table S1, row 209,

column JH), which is considered acceptable for clinical studies.18, 28 LFQ intensity was used

as the normalized intensity of identified proteins for quantitative analysis.
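As an illustration of the quality check described above, the coefficient of variation of a spiked-in standard is the sample standard deviation divided by the mean, expressed as a percentage. The sketch below uses invented intensities, not values from Table S1, and the function name is hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(intensities):
    """Return the CV in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(intensities) / mean(intensities)

# Hypothetical ovalbumin LFQ intensities across samples (not study data).
ovalbumin_lfq = [9.8e6, 1.05e7, 1.12e7, 9.1e6, 1.0e7]
cv = coefficient_of_variation(ovalbumin_lfq)
print(f"Ovalbumin CV: {cv:.1f}% (threshold used in the text: below 20%)")
```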

To prioritize a list of the most promising candidates for the qualification phase, we

developed the following criteria for pairwise evaluation: 1) significantly different by ANOVA

test, 2) two or more unique peptides, 3) score of protein identification more than five, and 4)

involvement in cancer according to literature review (Figure 2B). Out of 238 measured serum

proteins, 40 proteins passed the candidate biomarker selection criteria for at least one

comparison (Table S2). Principal component analysis (PCA) showed the separation of CCA

from healthy and PDF, but the latter two groups overlapped (Figure 2C). Application of

hierarchical cluster analysis (HCA) showed more distinct separation and grouping of the

samples based on each phenotype; the first and fourth principal components (PC1 and PC4)

explained 84% and 0% of the variance, respectively, in the HCA scores plot (Figure 2D).
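The four candidate selection criteria listed earlier in this section can be read as a simple filter over per-protein results. The sketch below is only an illustration: the record layout, the field names and the significance threshold of 0.05 are assumptions, since the text does not state the ANOVA cut-off, and the literature flag would be set by manual review.

```python
from dataclasses import dataclass

@dataclass
class ProteinResult:
    gene: str
    anova_p: float         # p-value from the ANOVA test
    unique_peptides: int   # number of unique peptides identified
    id_score: float        # protein identification score
    cancer_related: bool   # set manually from the literature review

def passes_criteria(p, alpha=0.05):
    """Apply the four selection criteria; alpha is an assumed threshold."""
    return (p.anova_p < alpha
            and p.unique_peptides >= 2
            and p.id_score > 5
            and p.cancer_related)

# Hypothetical records (values invented for illustration).
results = [
    ProteinResult("GENE_A", 0.001, 4, 32.0, True),
    ProteinResult("GENE_B", 0.200, 6, 50.0, True),   # fails ANOVA
    ProteinResult("GENE_C", 0.010, 1, 12.0, True),   # too few unique peptides
]
candidates = [r.gene for r in results if passes_criteria(r)]
```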

Figure 2 The biomarker discovery phase including the overlap of the 238 identified proteins,

candidate selection criteria and sample clustering with multivariate analysis. A; Venn diagram

showing the overlap of the 238 identified proteins. B; Candidate selection criteria. C; PCA

analysis and D, hierarchical PCA analysis of samples: normal pathology (green), PDF (blue)

and CCA group (red).

Biomarker qualification by targeted proteomic analysis or MRM analysis.

For biomarker qualification, a multiple reaction monitoring (MRM) mass spectrometry

method was developed for 40 biomarker candidates. An independent cohort of 90 participants

was then selected for qualification using the optimised MRM assay (Table 1 and Figure 1).

Normalized peptides with correlations of >0.9 (Pearson’s correlation test) were selected and

converted to protein intensities and log2 transformed. The qualification data set was then

subjected to multivariate statistical analysis (Figure 3). PCA with Pareto scaling showed a

clear separation of the CCA group from the normal and PDF groups (Figure 3A). The first two

principal components explained 35.9% (PC1) and 21.9% (PC2) of the variance in the PCA model (Figure

3B). Orthogonal signal correction projection to latent structures discriminant analysis

(O-PLS-DA) was applied to conduct pairwise comparisons of normal versus PDF, normal versus CCA,

and PDF versus CCA (Figure 3C-H). Although the normal and PDF groups could not be

separated, the CCA group showed clear separation from both the normal and PDF groups. The

O-PLS-DA regression model confirmed the above visual observations, with significant

CV-ANOVA for the normal versus CCA and PDF versus CCA comparisons, but not for

the normal versus PDF comparison (Table 2).
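The Pareto scaling applied before PCA mean-centers each variable and divides it by the square root of its standard deviation, which reduces the dominance of high-variance proteins without flattening them as much as unit-variance scaling. A minimal sketch follows; the analysis itself was run in SIMCA, so this function is illustrative only and its name is hypothetical.

```python
import math
from statistics import mean, stdev

def pareto_scale(column):
    """Mean-center one variable and divide by the square root of its sample SD."""
    m, s = mean(column), stdev(column)
    return [(x - m) / math.sqrt(s) for x in column]

# Hypothetical log2 protein intensities for one protein across four samples.
scaled = pareto_scale([20.0, 22.0, 24.0, 26.0])
```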

The loadings of the pairwise O-PLS-DA models identified significantly higher

normalized protein intensities for 11 proteins: Haptoglobin (HP), Alpha-1-antichymotrypsin

(A1AC/SERPINA3), Complement component C9 (C9), Intercellular adhesion molecule 1

(ICAM1), Protein S100A9 (S100A9), Thioredoxin (TRX/TXN), Aminopeptidase N (ANPEP),

Fumarylacetoacetase (FAH), Lipopolysaccharide-binding protein (LBP), Inter-alpha-trypsin

inhibitor heavy chain H3 (ITIH3) and Cadherin-related family member 2 (CDHR2/PCLKC) in the

CCA group when compared with the normal and PDF groups (Table 2). Data for all measured

proteins are provided in Table S7.

Figure 3 Biomarker qualification and multivariate analysis. A and B; PCA score plot and

loading plot of MRM results of candidates that shows sample differentiation. The three

significant candidate proteins are shown in red circles. Scores plots of PCA models (panel

C-E) and O-PLS-DA models (panel F-H) of the pairwise comparison between normal versus

PDF groups (C, F), normal versus CCA groups (D, G) and PDF versus CCA groups (E, H).

Table 2 Summary of O-PLS-DA qualified serum biomarker candidate proteins

O-PLS-DA model statistics: Normal (-) vs PDF (+): R2X = 50.9%; Q2Y = 0.039; p = 0.6934. CCA (-) vs Normal (+): R2X = 56.7%; Q2Y = 0.645; p <0.00001. CCA (-) vs PDF (+): R2X = 48.3%; Q2Y = 0.6333; p <0.00001.

| Gene Name | Uniprot ID | Protein Name | Normal (-) vs PDF (+): p(corr) | p-value | CCA (-) vs Normal (+): p(corr) | p-value | CCA (-) vs PDF (+): p(corr) | p-value |
|---|---|---|---|---|---|---|---|---|
| HP | P00738 | Haptoglobin | -0.3947 | | -0.3241 | * | -0.4775 | ** |
| SERPINA3 / A1AC | P01011 | Alpha-1-antichymotrypsin | -0.6007 | * | -0.2155 | | -0.4803 | ** |
| C9 | P02748 | Complement component C9 | -0.2722 | | -0.2303 | | -0.4402 | ** |
| ICAM1 | P05362 | Intercellular adhesion molecule 1 | -0.2584 | | -0.4944 | * | -0.5633 | ** |
| S100A9 | P06702 | Protein S100A9 | -0.5408 | ** | -0.8199 | **** | -0.8600 | **** |
| TRX/TXN | P10599 | Thioredoxin | -0.4283 | * | -0.6224 | **** | -0.7594 | **** |
| ANPEP | P15144 | Aminopeptidase N | -0.5230 | * | -0.4012 | * | -0.6384 | *** |
| FAH | P16930 | Fumarylacetoacetase | -0.3643 | * | -0.2515 | | -0.5482 | ** |
| LBP | P18428 | Lipopolysaccharide-binding protein | -0.4946 | * | -0.5286 | ** | -0.6619 | **** |
| ITIH3 | Q06033 | Inter-alpha-trypsin inhibitor heavy chain H3 | -0.5046 | ** | -0.1813 | | -0.4822 | ** |
| CDHR2 | Q9BYE9 | Cadherin-related family member 2 | -0.1396 | | -0.7539 | **** | -0.7707 | **** |

P-values were calculated using a Mann-Whitney U-test for pairwise group comparison (* P<0.05, ** P<0.01, *** P<0.001, **** P<0.0001). The p(corr) value is a correlation coefficient (ranging from -1.0 to 1.0) for each model. The P-value of all O-PLS-DA models was derived from permutation tests (n = 500).

Protein interaction network and inter-correlation of serum candidate proteins

Next, we determined whether the 11 qualified CCA serum protein biomarker

candidates were correlated in function or abundance. Using the STRING software29 and

functional gene ontology (GO) enrichment analysis (Table S8), two main protein-protein

interaction networks were identified (Figure 4A): immune system and inflammatory processes

(red circle) and redox regulatory processes (blue circle). To investigate the inter-correlation of

abundance among qualified biomarker candidates from the serum samples between the normal

and CCA groups, we constructed a correlogram using Spearman correlation coefficient values

between significant serum candidates with correlation values >0.62 and p-values <0.001

(Figure 4B). Significant positive correlations were detected among the immune response

system-, inflammatory processes- and the redox regulatory system-related proteins, including

ITIH3, SERPINA3, TRX, S100A9, ANPEP, ICAM1 and FAH in both groups. The inter-

correlation of these candidate proteins, including ANPEP and SERPINA3, ANPEP and

ICAM1, ANPEP and FAH, and between ITIH3 and SERPINA3 showed a similar strength of

positive correlation coefficients in both groups. However, we also found that the inter-

correlation of candidate proteins, including TRX and ICAM1, TRX and ANPEP, TRX and

FAH, and ITIH3 and ICAM1, showed greater levels of positive correlation in the normal group

when compared with the CCA group. Moreover, CDHR2 also showed altered inter-correlations

with other candidate proteins in CCA when compared with the normal group (Figure 4B).
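The Spearman coefficient underlying the correlogram is the Pearson correlation of the rank-transformed abundances, computed separately within each group. A minimal sketch follows, with ties given average ranks; the function names and the example values are hypothetical and do not come from the study data.

```python
import math
from statistics import mean

def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 intensities of two proteins within one group.
rho = spearman([20.1, 21.5, 19.8, 22.3], [15.0, 16.2, 14.1, 18.9])
```

Comparing the coefficient for the same protein pair between the normal and CCA groups gives the kind of contrast shown in the two triangles of Figure 4B.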

The three most significant qualified serum biomarker candidates for CCA were

S100A9, TRX and CDHR2 (Figure 3B and Table 2). All three proteins showed strong,

significant correlations and differed significantly between the CCA group and the other

groups (p<0.0001) (Table 2 and Figure 5A-C). Interestingly, there was an increase in the level

of the inter-correlation between TRX and S100A9 for the CCA group when compared with the

normal group. Although these biological process-related proteins were positively correlated

in both groups, the correlations were higher in the CCA group. The area under the receiver

operating characteristic (AUROC) curve was used to measure the diagnostic performance of

the potential single and multiple marker model for each pairwise comparison. As an individual

marker (Figure 5D-F), S100A9 can significantly discriminate the CCA group from the normal

and PDF groups with AUC 0.871 (95%CI; 0.777-0.965) and 0.939 (95%CI; 0.880-0.980),

respectively. Moreover, TRX can significantly discriminate the CCA group from the normal

and PDF groups with AUC 0.797 (95%CI; 0.686-0.907) and 0.888 (95%CI; 0.809-0.967),

respectively. Finally, CDHR2 distinguishes the CCA group from the normal and PDF groups

with AUC 0.797 (95%CI; 0.759-0.953) and 0.888 (95%CI; 0.782-0.953), respectively.

However, only S100A9 and TRX could distinguish the normal group from the PDF group,

with AUC 0.713 (95%CI; 0.582-0.845) and 0.653 (95%CI; 0.514-0.793),

respectively (Figure 5D). A multiple marker model combining S100A9, TRX and CDHR2 was

used to evaluate the diagnostic performance for CCA diagnosis. The

multiple marker panel performed better than any single marker in

distinguishing the CCA group from the normal and PDF groups, with AUC 0.943

(95%CI; 0.880-1.001) and 0.973 (95%CI; 0.934-1.012), respectively.
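For a single marker, the AUROC reported above equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half; this is the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below illustrates that identity with invented values. It does not reproduce the study's confidence intervals or the multiple marker model, whose construction is not described in this section.

```python
def auroc(cases, controls):
    """AUC as the fraction of case-control pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical log2 marker intensities (not study data).
cca_values = [24.1, 25.3, 23.8, 26.0]
normal_values = [22.5, 23.9, 21.7, 23.0]
auc = auroc(cca_values, normal_values)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.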

Figure 4 Relationship between the qualified candidate biomarkers. A; Protein interaction

networks and gene ontology analysis using STRING. Biomarkers clustered into two main

networks: immune system and inflammatory processes (red circle) and redox regulatory

processes (blue circle). Gene names of the 11 candidates are color labeled according to their

network. Intermediate interactor proteins were added to each network (labeled in black),

including Integrin, alpha L (ITGAL), Integrin, alpha M (ITGAM), Integrin, beta 2 (ITGB2)

and Hemoglobin, beta (HBB) for immune system and inflammatory processes (red circles),

and Peroxiredoxin 1 (PRDX1), Peroxiredoxin 2 (PRDX2), Mitogen-activated protein kinase

kinase kinase 5 (MAP3K5), Thioredoxin reductase 1 (TXNRD1) and Thioredoxin interacting

protein (TXNIP) for the redox regulatory system. Line intensity represents the

confidence of the protein association. B; Inter-correlation analysis of abundance of qualified

candidates among normal (upper triangle) and CCA (lower triangle) groups. The name of the

protein is represented as the gene name and colored according to network relevance. The color

intensity of the dot indicates the size of the correlation coefficient, whereas the size of dot

indicates the significance level. Red denotes a positive correlation and blue a negative

correlation, while blank denotes no significant correlation.

Figure 5 Diagnostic value of serum biomarker candidates in the qualification phase. A, B

and C; the box and whisker plots show the log2 normalized intensity of these serum protein

candidates in the normal, PDF and CCA groups, which were represented by the green, blue

and red boxes, respectively. Data are represented as mean ± SD. Kruskal-Wallis test was used

to determine the differences of candidate proteins between the groups, followed by Dunnett’s

test for pairwise group comparison (* P<0.05, ** P<0.01, *** P<0.001). D, E and F; ROC

analysis of single markers for each pairwise comparison including (D) normal versus PDF, (E)

normal versus CCA and (F) PDF versus CCA. G, H and I; ROC analysis of multiple marker

panels for each pairwise comparison including (G) normal versus PDF, (H) normal versus

CCA and (I) PDF versus CCA (black solid line: theoretically perfect performance of an

efficient biomarker as the reference line). Black, green, red and purple lines: actual

performance of S100A9, TRX, CDHR2 and the combination of all three candidates, respectively.

Antibody-based orthogonal evaluation of the top three candidates

ELISA techniques and immunohistochemical staining (IHC) were additionally

performed to support our findings based on the proteomics-based discovery and qualification

of biomarkers. The indirect ELISA technique was used to determine the relative levels of

S100A9, TRX and CDHR2 proteins in the serum of an independent cohort of normal, PDF and

CCA groups (N=54, 57 and 136, respectively). The clinical information is summarized in Table

S11. Our findings demonstrate that the serum levels of these 3 candidates were significantly

increased in the CCA group compared with the normal and PDF groups (P<0.001) (Figure

6A-C). The ELISA data were also subjected to ROC curve analysis. Diagnostic performance

for the ELISA measurements was poor and lower than the MRM data from the qualification

cohort (Figure S1). This could be due to differences in assay quality or cohort. Nevertheless, a

multi-marker panel composed of S100A9, TRX and CDHR2 demonstrated better performance

than a single marker model for distinguishing the CCA group from the normal and PDF groups

(Figure 6D-F and Figure S1).

Next, we examined the cell types expressing S100A9, TRX and CDHR2 in human

CCA tissues using the same antibodies as for the ELISAs. A tissue microarray comprising

208 cases was used, along with 5 donor tissues from cadaveric liver samples (Table S11). The

results showed that TRX and CDHR2 were expressed in both the cytoplasm and nucleus of

cancer cells and inflammatory cells, whereas S100A9 was only expressed in the infiltrating

immune cells in the tumor stroma (Figure 7). Compared to cadaveric liver, we observed high

TRX expression in human CCA tissues, which needs to be confirmed with a larger sample of

healthy liver tissues. We also determined whether there is a correlation between the tissue

expression levels and various patient/CCA characteristics. No significant correlation was

detected for gender, age, histotype, extra/intraductal status, TNM stage or metastasis (Table

S10).

Figure 6 ELISA measurements of the top 3 candidate biomarkers and ROC curve analysis of

the multiple marker model for CCA diagnosis performance. A, B and C; bar chart (mean±SEM)

show the distribution of the 3 protein levels in the serum of each group: normal (yellow), PDF

(orange) and CCA group (red). The data are represented as the mean ± SD of the OD at 492

nm. A Kruskal-Wallis test was used to determine the difference of candidate proteins between

groups, followed by Dunnett’s test for pairwise group comparison (*P<0.05, **P<0.01,

***P<0.001). D, E and F; ROC analysis of multiple marker for each pairwise comparison

including (D) normal versus PDF, (E) normal versus CCA and (F) PDF versus CCA. Black

solid line: theoretically perfect performance of an efficient biomarker as the reference line. Red

lines: actual performance of the combination S100A9, TRX, CDHR2.
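The omnibus group comparison named in this legend can be sketched as follows. This is a minimal pure-Python illustration of the Kruskal-Wallis H statistic with tie correction, assuming three groups so that the chi-square approximation has two degrees of freedom and the p-value reduces to exp(-H/2). The post hoc pairwise test is not shown, the readings are invented, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum * rank_sum / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2.0)


# Invented OD492 readings for illustration only.
normal = [0.31, 0.35, 0.40, 0.28]
pdf = [0.33, 0.42, 0.38, 0.30]
cca = [0.62, 0.55, 0.71, 0.48]

h_stat = kruskal_wallis_h([normal, pdf, cca])
print(f"H = {h_stat:.3f}, approximate p = {p_value_three_groups(h_stat):.4f}")
```

The chi-square approximation is rough for groups this small; with cohort sizes like those reported for the ELISA set it is usually considered adequate.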


Figure 7 Immunohistochemical staining of the three candidate proteins, Protein S100A9

(S100A9), Thioredoxin (TRX) and Cadherin-related family member 2 (CDHR2), was

performed on cadaveric donor liver tissues (upper panel) and human CCA tissue microarrays

(lower panel).

DISCUSSION

As a first step towards screening/surveillance of the high risk population in the endemic

area in northeastern Thailand, we completed the first serum protein biomarker discovery and

qualification study for differential diagnosis of CCA from healthy controls, and from PDF

which results from Ov infection and is the pre-malignant condition for Ov-associated CCA.2-3

Importantly, we used ultrasonography diagnosis as the gold standard for the normal and PDF

groups.7 The serum protein biomarker pipeline was composed of the discovery phase using

shotgun proteomics (LC-MS/MS), the qualification phase using targeted proteomics

(MRM/MS) and orthogonal evaluation using antibody-based assays (ELISA and IHC). Out of

the 238 serum proteins measured in the discovery phase, 40 (16.8%) were selected as candidate


CCA biomarkers to progress to the qualification phase, from which 11 were qualified (26.2%)

in an independent cohort to have diagnostic value for CCA versus normal or PDF. Network

and GO analyses of the 11 qualified biomarker candidates revealed two biological functions

which are known to be important in carcinogenesis: immune system and inflammatory

processes, and redox regulatory processes. In contrast, although PDF is driven by oxidative

damage and inflammation, there were no serum protein candidates that could distinguish the

PDF from the normal group. This result suggests that development of PDF into CCA is

accompanied by additional changes in secreted proteins.

From the descriptive clinical data on the samples in the qualification set, smoking status

and alcohol consumption were significantly more common in the CCA population when

compared with the PDF group. This is expected as smoking and alcohol consumption increase

the risk of hepatocellular carcinomas (HCC) and CCA.30 Previous investigations found that

smokers show an increased risk of intrahepatic cholangiocarcinoma (ICC) (HR=1.47, 95%

CI: 1.07–2.02) and alcohol consumption is associated with a 68% increased ICC risk

(HR=1.68, 95% CI: 0.99–2.86). To determine if any of the biomarker candidates were

related to smoking or alcohol consumption, a Mann-Whitney U-test was conducted on the

MRM data. This analysis showed a significant association with smoking status for TRX

(p=0.01), CDHR2 (p=0.009), ICAM1 (p=0.003), ANPEP (p=0.008) and FAH (p=0.007)

(Figure S2). Further evaluation is needed to decipher the relationships between these CCA

biomarker candidates and smoking/alcohol.
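A confounder check of this kind can be sketched as below. The example is a minimal pure-Python Mann-Whitney U test that uses the normal approximation with tie correction and no continuity correction; whether the study used an exact or approximate p-value is not stated, so this choice is an assumption. The intensities are invented and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u1, p


# Invented log2 peak intensities for one marker, split by smoking status.
smokers = [14.2, 15.1, 14.8, 15.6, 14.9]
non_smokers = [13.1, 13.9, 14.0, 13.4, 14.3]

u_stat, p_value = mann_whitney_u(smokers, non_smokers)
print(f"U = {u_stat:.1f}, two-sided p = {p_value:.4f}")
```

Running the same test across all 11 candidates would involve multiple comparisons, so some form of correction may be appropriate when interpreting the individual p-values.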

Following the mass spectrometry-based discovery and qualification phases, we further

evaluated the three most significant candidates (S100A9, TRX and CDHR2) using antibodies.

There are major technical differences between the mass spectrometry-based measurements and

the antibody-based techniques. While the former technique measures peptides and then infers

protein levels based on a few peptides, the latter technique directly measures proteins but only


the proteoforms that are recognized by the antibody used. Secondly, due to detection limits of

the mass spectrometer in a complex matrix, serum samples were processed through

immunodepletion, which was not necessary for the antibody-based methods. Finally,

normalization for the mass spectrometry assays was at the level of protein quantity following

immunodepletion, whereas the ELISA was normalized by serum volume. Although both MRM

and ELISA assays can provide absolute quantitation when appropriate external standards are

used in the assay, such reagents were not available/developed for the current biomarker

qualification phases. Despite the numerous technical differences and the independent patient

cohorts, we observed similar trends for the ELISA and MRM data sets. S100A9, TRX and

CDHR2 were significantly elevated in the serum of CCA patients compared with the PDF and

normal groups. Although the MRM data reported high diagnostic values for individual markers

(AUROC range from 0.797 to 0.967) and for multi-marker panels (AUC 0.943 for CCA versus

normal, AUC 0.973 for CCA versus PDF), the diagnostic values obtained from the ELISA

data were markedly lower.

The calcium binding protein S100A9 heterodimerizes with S100A8 to form

calprotectin, and is generally released from neutrophils as an inflammatory mediator.31 In

agreement, we detected S100A9 protein expression in the infiltrating immune cells of CCA

tissues. S100A9 has been reported to be elevated in several cancers, including colorectal cancer,32

cervical cancer33 and lung cancer.34 Although elevated S100A9 expression has not been

reported in CCA, the related calcium-binding protein S100P was recently reported to be

upregulated in CCA.35

Thioredoxin is a small redox-regulating protein that plays crucial roles in maintaining

cellular redox homeostasis and cell survival. TRX is highly expressed in many cancers

including lung, cervical, pancreatic, colorectal, hepatocellular carcinomas (HCC), gastric

carcinomas and breast cancer.36-42 Previous studies have also shown the overexpression of TRX


in both human CCA tissues and liver fluke-induced CCA in the hamster model.43 Moreover,

they suggest that TRX plays a role in the transformation of bile duct epithelial cells and tumor

progression during cholangiocarcinogenesis.43

Cadherin-related family member 2 (CDHR2), also known as the protocadherin liver,

kidney and colon protein (PCLKC) or protocadherin-24 (PCDH24), plays a role in contact

inhibition at the lateral surface of epithelial cells.44 A previous study has suggested that CDHR2

can act as a tumor suppressor that induces contact inhibition in colon cancer cells, thereby

inhibiting tumor formation.44 Although we detected CDHR2 protein expression in CCA tissue,

there was an insufficient number of normal cases for statistical evaluation in this study. Our

serum data using MRM and ELISA showed elevated CDHR2 protein in the CCA compared to

the normal and PDF groups. This may indicate the release of CDHR2 from CCA tissue.

Additional tissue-serum sample comparison will need to be conducted to determine if there is

a significant correlation. Interestingly, although CDHR2 does not have a known role in

inflammatory or redox regulation, its serum levels positively correlated with the other CCA

biomarker candidates in our cohort.

CONCLUSION

In summary, through serum biomarker discovery and qualification steps, we report

eleven potential biomarker candidates for CCA diagnosis. These biomarker candidates have

known functions in immune/inflammatory and redox regulatory processes, supporting the role

of these pathways in carcinogenesis. The main strength of this study is that it was conducted at

a population-based level in an endemic area using ultrasonography diagnosis of PDF and

healthy controls. Future studies should evaluate these serum CCA biomarker candidates in

other independent cohorts, prior to prospective trials.


DATA AVAILABILITY

The raw mass spectrometry proteomics data along with database search results have been

deposited at the publicly accessible platform ProteomeXchange via the PRIDE partner

repository45 for the discovery cohort with the dataset identifier PXD011804, and via

PASSEL46 for the qualification cohort with the dataset identifier PASS01298.

AUTHOR INFORMATION

CORRESPONDENCE:

Associate Professor Watcharin Loilome,

Department of Biochemistry, Faculty of Medicine, Khon Kaen University,

123 Mitraphab Road, Khon Kaen Province, Thailand 40002.

Tel: +66 8 1954 1184, Email: [email protected]

Associate Professor Michelle Hill,

QIMR Berghofer Medical Research Institute,

300 Herston Rd, Herston, QLD, Australia 4006.

Tel: +61 7 3845 3020, Email: [email protected]

AUTHOR CONTRIBUTIONS

W.L., P.Y., R.T., A.T., S.R., J.M. and M.H. participated in the project planning, co-ordination

and the experimental design. N.K. and N.C. contributed to confirmation of the ultrasonography

results. K.D. conducted protein isolation, peptide preparation and proteomics-related

experiments. K.D., T.S., A.S. and A.M. contributed to method development and data analysis.

K.D., A.M., A.S., J.P. and W.L. analyzed and interpreted the data. K.D. drafted the manuscript;

W.L., J.M. and M.H. edited the manuscript. All authors approved the manuscript.

FOOTNOTES

The authors declare that the research was conducted in the absence of any commercial or

financial relationships that could be construed as a potential conflict of interest.

SUPPORTING INFORMATION

Detail of method for protein depletion and concentration; trypsin digest and sample

preparation for MS analysis; LC-MS/MS and data searching for discovery phase; MRM/MS

assay development; MRM-MS analysis of qualification cohort; bioinformatics and statistical

analysis; ELISA and IHC (PDF)

Table S1A. Total protein identification in the discovery phase processed in MaxQuant output.

Some proteins were excluded according to the selection criteria. (XLSX)

Table S1B. Total peptide identification and their sequences in the discovery phase processed

in MaxQuant output. Some proteins were excluded according to the selection criteria. (XLSX)

Table S2. The significant serum candidates list from the discovery phase following criteria

selection for the MRM-MS experiment. (XLSX)

Table S3. All peptides and transition of selected serum candidate proteins for the MRM-MS

experiment. (XLSX)

Table S4. Area under the curve of each peptide (sum of all transitions) of serum candidates

by exporting output from Skyline software. (XLSX)

Table S5. The normalization of peptide area under peak using 1) the median normalized iRT

as normalizing factor and 2) normalization of all peptides in each sample. (XLSX)


Table S6. The log2-transformed data of normalized intensity in each protein that has a

normalized peptide intensity correlation > 0.9 using Pearson correlation. (XLSX)

Table S7. The correlation coefficient values and univariate analysis of all qualified serum

protein candidates by MRM/MS and O-PLS-DA analysis. (XLSX)

Table S8. The functional gene ontology (GO) enrichment analysis of the qualified serum

candidates using STRING software (http://string-db.org) to predict protein-protein

associations with other proteins in different pathways. (XLSX)

Table S9. The statistical analysis of multiple comparison of significant clinical data

differences by sample group comparison in the qualification phase. (PDF)

Table S10. The correlation of clinical pathological data with the expression of the 3

diagnostic biomarkers. (PDF)

Table S11. The clinical information of the validation cohort set for ELISA and IHC assays.

(PDF)

Figure S1. ROC analysis for single marker in each pairwise comparison including normal

versus PDF, normal versus CCA and PDF versus CCA (PDF)

Figure S2. Confounder analysis for smoking status of 11 serum biomarker candidates (PDF)

ABBREVIATIONS

A1AC/SERPIN3 Alpha-1-antichymotrypsin

ANPEP Aminopeptidase N

AUROC Area under receiver operating characteristic


C9 Complement 9

CCA Cholangiocarcinoma

CDHR2 Cadherin-related family member 2

ELISA Enzyme-linked immunosorbent assay

FAH Fumarylacetoacetase

HP Haptoglobin

ICAM1 Intercellular adhesion molecule 1

IDA Information-dependent acquisition

IHC Immunohistochemistry

ITIH3 Inter-alpha-trypsin inhibitor heavy chain H3

LBP Lipopolysaccharide-binding protein

LC-MS/MS Liquid chromatography-tandem mass spectrometry

LFQ Label-free quantification

MRM Multiple reaction monitoring

OPD O-phenylenediamine dihydrochloride

Ov Opisthorchis viverrini

O-PLS-DA Orthogonal partial least squares discriminant analysis

PCA Principal component analysis

PDF Periductal fibrosis

ROC Receiver operating characteristic

S100A9 Protein S100A9

SRM Selected reaction monitoring

TMA Tissue microarray

TRX/TXN Thioredoxin

US Ultrasonography


ACKNOWLEDGMENT

This research was supported by the Thailand Research Fund through the Royal Golden Jubilee Ph.D.

Program and Khon Kaen University (Grant No. PHD/0145/2556) to W.L. and K.D., a grant

from the Cholangiocarcinoma Screening and Care Program (Grant No. CASCAP-09), a grant from the

Faculty of Medicine to K.D. (Grant No. IN 59340), and a grant from the Thailand Research Fund

(Grant No. RSA5980013) allocated to W.L. Mass spectrometry was conducted at the QIMR

Berghofer Medical Research Institute, and the University of Queensland Centre for Clinical

Research. We thank Dr. Sarah Reed and Buddhika Jayakody for the technical support. We

thank Professor Trevor N. Petney for editing the manuscript via the Publication Clinic KKU, Thailand.

REFERENCES

1. Blechacz, B.; Komuta, M.; Roskams, T.; Gores, G. J., Clinical diagnosis and staging of

cholangiocarcinoma. Nat Rev Gastroenterol Hepatol 2011, 8 (9), 512-22.

2. Sripa, B.; Pairojkul, C., Cholangiocarcinoma: lessons from Thailand. Current opinion

in gastroenterology 2008, 24 (3), 349-56.

3. Haswell-Elkins, M. R.; Mairiang, E.; Mairiang, P.; Chaiyakum, J.; Chamadol, N.;

Loapaiboon, V.; Sithithaworn, P.; Elkins, D. B., Cross-sectional study of Opisthorchis viverrini

infection and cholangiocarcinoma in communities within a high-risk area in northeast

Thailand. Int J Cancer 1994, 59 (4), 505-9.

4. Pinlaor, S.; Hiraku, Y.; Ma, N.; Yongvanit, P.; Semba, R.; Oikawa, S.; Murata, M.;

Sripa, B.; Sithithaworn, P.; Kawanishi, S., Mechanism of NO-mediated oxidative and nitrative

DNA damage in hamsters infected with Opisthorchis viverrini: a model of inflammation-

mediated carcinogenesis. Nitric Oxide 2004, 11 (2), 175-83.


5. Thanan, R.; Murata, M.; Pinlaor, S.; Sithithaworn, P.; Khuntikeo, N.; Tangkanakul,

W.; Hiraku, Y.; Oikawa, S.; Yongvanit, P.; Kawanishi, S., Urinary 8-oxo-7,8-dihydro-2'-

deoxyguanosine in patients with parasite infection and effect of antiparasitic drug in relation to

cholangiocarcinogenesis. Cancer Epidemiol Biomarkers Prev 2008, 17 (3), 518-24.

6. Yongvanit, P.; Pinlaor, S.; Loilome, W., Risk biomarkers for assessment and

chemoprevention of liver fluke-associated cholangiocarcinoma. J Hepatobiliary Pancreat Sci

2014, 21 (5), 309-15.

7. Chamadol, N.; Pairojkul, C.; Khuntikeo, N.; Laopaiboon, V.; Loilome, W.;

Sithithaworn, P.; Yongvanit, P., Histological confirmation of periductal fibrosis from

ultrasound diagnosis in cholangiocarcinoma patients. J Hepatobiliary Pancreat Sci 2014, 21

(5), 316-22.

8. Bichsel, V. E.; Liotta, L. A.; Petricoin, E. F., 3rd, Cancer proteomics: from biomarker

discovery to signal pathway profiling. Cancer J 2001, 7 (1), 69-78.

9. Ru, Q. C.; Zhu, L. A.; Silberman, J.; Shriver, C. D., Label-free semiquantitative peptide

feature profiling of human breast cancer and breast disease sera via two-dimensional liquid

chromatography-mass spectrometry. Molecular & cellular proteomics: MCP 2006, 5 (6),

1095-104.

10. de Noo, M. E.; Mertens, B. J.; Ozalp, A.; Bladergroen, M. R.; van der Werff, M. P.;

van de Velde, C. J.; Deelder, A. M.; Tollenaar, R. A., Detection of colorectal cancer using

MALDI-TOF serum protein profiling. European journal of cancer 2006, 42 (8), 1068-76.

11. Adam, B. L.; Qu, Y.; Davis, J. W.; Ward, M. D.; Clements, M. A.; Cazares, L. H.;

Semmes, O. J.; Schellhammer, P. F.; Yasui, Y.; Feng, Z.; Wright, G. L., Jr., Serum protein

fingerprinting coupled with a pattern-matching algorithm distinguishes prostate cancer from

benign prostate hyperplasia and healthy men. Cancer research 2002, 62 (13), 3609-14.


12. Jacot, W.; Lhermitte, L.; Dossat, N.; Pujol, J. L.; Molinari, N.; Daures, J. P.;

Maudelonde, T.; Mange, A.; Solassol, J., Serum proteomic profiling of lung cancer in

high-risk groups and determination of clinical outcomes. Journal of thoracic oncology: official

publication of the International Association for the Study of Lung Cancer 2008, 3 (8), 840-50.

13. Sriwanitchrak, P.; Viyanant, V.; Chaijaroenkul, W.; Srivatanakul, P.; Gram, H. R.;

Eursiddhichai, V.; Na-Bangchang, K., Proteomics analysis and evaluation of biomarkers for

detection of cholangiocarcinoma. Asian Pac J Cancer Prev 2011, 12 (6), 1503-10.

14. Shi, Y.; Deng, X.; Zhan, Q.; Shen, B.; Jin, X.; Zhu, Z.; Chen, H.; Li, H.; Peng, C., A

prospective proteomic-based study for identifying potential biomarkers for the diagnosis of

cholangiocarcinoma. J Gastrointest Surg 2013, 17 (9), 1584-91.

15. Darby, I. A.; Vuillier-Devillers, K.; Pinault, E.; Sarrazy, V.; Lepreux, S.; Balabaud, C.;

Bioulac-Sage, P.; Desmouliere, A., Proteomic analysis of differentially expressed proteins in

peripheral cholangiocarcinoma. Cancer Microenviron 2010, 4 (1), 73-91.

16. Chambers, A. G.; Percy, A. J.; Simon, R.; Borchers, C. H., MRM for the verification

of cancer biomarker proteins: recent applications to human plasma and serum. Expert Rev

Proteomics 2014, 11 (2), 137-48.

17. Parker, C. E.; Domanski, D.; Percy, A. J.; Chambers, A. G.; Camenzind, A. G.; Smith,

D. S.; Borchers, C. H., Mass spectrometry in high-throughput clinical biomarker assays:

multiple reaction monitoring. Top Curr Chem 2014, 336, 117-37.

18. Abbatiello, S. E.; Schilling, B.; Mani, D. R.; Zimmerman, L. J.; Hall, S. C.; MacLean,

B.; Albertolle, M.; Allen, S.; Burgess, M.; Cusack, M. P.; Gosh, M.; Hedrick, V.; Held, J. M.;

Inerowicz, H. D.; Jackson, A.; Keshishian, H.; Kinsinger, C. R.; Lyssand, J.; Makowski, L.;

Mesri, M.; Rodriguez, H.; Rudnick, P.; Sadowski, P.; Sedransk, N.; Shaddox, K.; Skates, S. J.;

Kuhn, E.; Smith, D.; Whiteaker, J. R.; Whitwell, C.; Zhang, S.; Borchers, C. H.; Fisher, S. J.;

Gibson, B. W.; Liebler, D. C.; MacCoss, M. J.; Neubert, T. A.; Paulovich, A. G.; Regnier, F.


E.; Tempst, P.; Carr, S. A., Large-Scale Interlaboratory Study to Develop, Analytically

Validate and Apply Highly Multiplexed, Quantitative Peptide Assays to Measure Cancer-

Relevant Proteins in Plasma. Mol Cell Proteomics 2015, 14 (9), 2357-74.

19. Addona, T. A.; Abbatiello, S. E.; Schilling, B.; Skates, S. J.; Mani, D. R.; Bunk, D. M.;

Spiegelman, C. H.; Zimmerman, L. J.; Ham, A. J.; Keshishian, H.; Hall, S. C.; Allen, S.;

Blackman, R. K.; Borchers, C. H.; Buck, C.; Cardasis, H. L.; Cusack, M. P.; Dodder, N. G.;

Gibson, B. W.; Held, J. M.; Hiltke, T.; Jackson, A.; Johansen, E. B.; Kinsinger, C. R.; Li, J.;

Mesri, M.; Neubert, T. A.; Niles, R. K.; Pulsipher, T. C.; Ransohoff, D.; Rodriguez, H.;

Rudnick, P. A.; Smith, D.; Tabb, D. L.; Tegeler, T. J.; Variyath, A. M.; Vega-Montoto, L. J.;

Wahlander, A.; Waldemarson, S.; Wang, M.; Whiteaker, J. R.; Zhao, L.; Anderson, N. L.;

Fisher, S. J.; Liebler, D. C.; Paulovich, A. G.; Regnier, F. E.; Tempst, P.; Carr, S. A., Multi-

site assessment of the precision and reproducibility of multiple reaction monitoring-based

measurements of proteins in plasma. Nat Biotechnol 2009, 27 (7), 633-41.

20. Shah, A. K.; Hartel, G.; Brown, I.; Winterford, C.; Na, R.; Le Cao, K. A.; Spicer, B.

A.; Dunstone, M. A.; Phillips, W. A.; Lord, R. V.; Barbour, A. P.; Watson, D. I.; Joshi, V.;

Whiteman, D. C.; Hill, M. M., Evaluation of serum glycoprotein biomarker candidates for

detection of esophageal adenocarcinoma and surveillance of Barrett's esophagus. Mol Cell

Proteomics 2018, 17 (12), 2324-2334.

21. Brock, R.; Xiong, B.; Li, L.; Vanbogelen, R. A.; Christman, L., A multiplex serum

protein assay for determining the probability of colorectal cancer. Am J Cancer Res 2012, 2

(5), 598-605.

22. Shah, A. K.; Cao, K. A.; Choi, E.; Chen, D.; Gautier, B.; Nancarrow, D.; Whiteman,

D. C.; Saunders, N. A.; Barbour, A. P.; Joshi, V.; Hill, M. M., Serum Glycoprotein Biomarker

Discovery and Qualification Pipeline Reveals Novel Diagnostic Biomarker Candidates for

Esophageal Adenocarcinoma. Mol Cell Proteomics 2015, 14 (11), 3023-39.


23. Dave, K. A.; Norris, E. L.; Bukreyev, A. A.; Headlam, M. J.; Buchholz, U. J.; Singh,

T.; Collins, P. L.; Gorman, J. J., A comprehensive proteomic view of responses of A549 type

II alveolar epithelial cells to human respiratory syncytial virus infection. Mol Cell Proteomics

2014, 13 (12), 3250-69.

24. Cox, J.; Hein, M. Y.; Luber, C. A.; Paron, I.; Nagaraj, N.; Mann, M., Accurate

proteome-wide label-free quantification by delayed normalization and maximal peptide ratio

extraction, termed MaxLFQ. Mol Cell Proteomics 2014, 13 (9), 2513-26.

25. Cox, J.; Mann, M., MaxQuant enables high peptide identification rates, individualized

p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008,

26 (12), 1367-72.

26. MacLean, B.; Tomazela, D. M.; Shulman, N.; Chambers, M.; Finney, G. L.; Frewen,

B.; Kern, R.; Tabb, D. L.; Liebler, D. C.; MacCoss, M. J., Skyline: an open source document

editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010, 26

(7), 966-8.

27. Rifai, N.; Gillette, M. A.; Carr, S. A., Protein biomarker discovery and validation: the

long and uncertain path to clinical utility. Nat Biotechnol 2006, 24 (8), 971-83.

28. Percy, A. J.; Chambers, A. G.; Yang, J.; Domanski, D.; Borchers, C. H., Comparison

of standard- and nano-flow liquid chromatography platforms for MRM-based quantitation of

putative plasma biomarker proteins. Anal Bioanal Chem 2012, 404 (4), 1089-101.

29. Szklarczyk, D.; Morris, J. H.; Cook, H.; Kuhn, M.; Wyder, S.; Simonovic, M.; Santos,

A.; Doncheva, N. T.; Roth, A.; Bork, P.; Jensen, L. J.; von Mering, C., The STRING database

in 2017: quality-controlled protein-protein association networks, made broadly accessible.

Nucleic Acids Res 2017, 45 (D1), D362-D368.

30. Petrick, J. L.; Campbell, P. T.; Koshiol, J.; Thistle, J. E.; Andreotti, G.; Beane-Freeman,

L. E.; Buring, J. E.; Chan, A. T.; Chong, D. Q.; Doody, M. M.; Gapstur, S. M.; Gaziano, J. M.;


Giovannucci, E.; Graubard, B. I.; Lee, I. M.; Liao, L. M.; Linet, M. S.; Palmer, J. R.; Poynter,

J. N.; Purdue, M. P.; Robien, K.; Rosenberg, L.; Schairer, C.; Sesso, H. D.; Sinha, R.; Stampfer,

M. J.; Stefanick, M.; Wactawski-Wende, J.; Zhang, X.; Zeleniuch-Jacquotte, A.; Freedman,

N. D.; McGlynn, K. A., Tobacco, alcohol use and risk of hepatocellular carcinoma and

intrahepatic cholangiocarcinoma: The Liver Cancer Pooling Project. Br J Cancer 2018, 118

(7), 1005-1012.

31. Shabani, F.; Farasat, A.; Mahdavi, M.; Gheibi, N., Calprotectin (S100A8/S100A9): a

key protein between inflammation and cancer. Inflamm Res 2018, 67 (10), 801-812.

32. Shu, P.; Zhao, L.; Wagn, J.; Shen, X.; Zhang, X.; Shen, S.; Ma, J.; Li, X., [Association

between serum levels of S100A8/S100A9 and clinical features of colorectal cancer patients].

Zhong Nan Da Xue Xue Bao Yi Xue Ban 2016, 41 (6), 553-9.

33. Zhu, H.; Pei, H. P.; Zeng, S.; Chen, J.; Shen, L. F.; Zhong, M. Z.; Yao, R. J.; Shen, H.,

Profiling protein markers associated with the sensitivity to concurrent chemoradiotherapy in

human cervical carcinoma. J Proteome Res 2009, 8 (8), 3969-76.

34. Kawai, H.; Minamiya, Y.; Takahashi, N., Prognostic impact of S100A9 overexpression

in non-small cell lung cancer. Tumour Biol 2011, 32 (4), 641-6.

35. Hamada, S.; Satoh, K.; Hirota, M.; Kanno, A.; Ishida, K.; Umino, J.; Ito, H.; Kikuta,

K.; Kume, K.; Masamune, A.; Katayose, Y.; Unno, M.; Shimosegawa, T., Calcium-binding

protein S100P is a novel diagnostic marker of cholangiocarcinoma. Cancer Sci 2011, 102 (1),

150-6.

36. Kim, H. J.; Chae, H. Z.; Kim, Y. J.; Kim, Y. H.; Hwangs, T. S.; Park, E. M.; Park, Y.

M., Preferential elevation of Prx I and Trx expression in lung cancer cells following hypoxia

and in human lung cancer tissues. Cell Biol Toxicol 2003, 19 (5), 285-98.

37. Hedley, D.; Pintilie, M.; Woo, J.; Nicklee, T.; Morrison, A.; Birle, D.; Fyles, A.;

Milosevic, M.; Hill, R., Up-regulation of the redox mediators thioredoxin and


apurinic/apyrimidinic excision (APE)/Ref-1 in hypoxic microregions of invasive cervical

carcinomas, mapped using multispectral, wide-field fluorescence image analysis. Am J Pathol

2004, 164 (2), 557-65.

38. Han, H.; Bearss, D. J.; Browne, L. W.; Calaluce, R.; Nagle, R. B.; Von Hoff, D. D.,

Identification of differentially expressed genes in pancreatic cancer cells using cDNA

microarray. Cancer Res 2002, 62 (10), 2890-6.

39. Raffel, J.; Bhattacharyya, A. K.; Gallegos, A.; Cui, H.; Einspahr, J. G.; Alberts, D. S.;

Powis, G., Increased expression of thioredoxin-1 in human colorectal cancer is associated with

decreased patient survival. J Lab Clin Med 2003, 142 (1), 46-51.

40. Choi, J. H.; Kim, T. N.; Kim, S.; Baek, S. H.; Kim, J. H.; Lee, S. R.; Kim, J. R.,

Overexpression of mitochondrial thioredoxin reductase and peroxiredoxin III in hepatocellular

carcinomas. Anticancer Res 2002, 22 (6A), 3331-5.

41. Grogan, T. M.; Fenoglio-Prieser, C.; Zeheb, R.; Bellamy, W.; Frutiger, Y.; Vela, E.;

Stemmerman, G.; Macdonald, J.; Richter, L.; Gallegos, A.; Powis, G., Thioredoxin, a putative

oncogene product, is overexpressed in gastric carcinoma and associated with increased

proliferation and increased cell survival. Hum Pathol 2000, 31 (4), 475-81.

42. Cha, M. K.; Suh, K. H.; Kim, I. H., Overexpression of peroxiredoxin I and thioredoxin1

in human breast carcinoma. J Exp Clin Cancer Res 2009, 28, 93.

43. Yoon, B. I.; Kim, Y. H.; Yi, J. Y.; Kang, M. S.; Jang, J. J.; Joo, K. H.; Kim, Y.;

McHugh Law, J.; Kim, D. Y., Expression of thioredoxin during progression of hamster and

human cholangiocarcinoma. Cancer Sci 2010, 101 (1), 281-8.

44. Okazaki, N.; Takahashi, N.; Kojima, S.; Masuho, Y.; Koga, H., Protocadherin LKC, a

new candidate for a tumor suppressor of colon and liver cancers, its association with contact

inhibition of cell proliferation. Carcinogenesis 2002, 23 (7), 1139-48.


45. Vizcaino, J. A.; Csordas, A.; del-Toro, N.; Dianes, J. A.; Griss, J.; Lavidas, I.; Mayer,

G.; Perez-Riverol, Y.; Reisinger, F.; Ternent, T.; Xu, Q. W.; Wang, R.; Hermjakob, H., 2016

update of the PRIDE database and its related tools. Nucleic Acids Res 2016, 44 (D1), D447-

56.

46. Farrah, T.; Deutsch, E. W.; Kreisberg, R.; Sun, Z.; Campbell, D. S.; Mendoza, L.;

Kusebauch, U.; Brusniak, M. Y.; Huttenhain, R.; Schiess, R.; Selevsek, N.; Aebersold, R.;

Moritz, R. L., PASSEL: the PeptideAtlas SRMexperiment library. Proteomics 2012, 12 (8),

1170-5.

GRAPHICAL ABSTRACT

“For TOC Only”
