UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)

UvA-DARE (Digital Academic Repository)

Immunohistochemical prognostic markers in diffuse large B-cell lymphoma: validation of tissue microarray as a prerequisite for broad clinical applications (a study from the Lunenburg Lymphoma Biomarker Consortium)

de Jong, D.; Xie, W.; Rosenwald, A.; Chhanabhai, M.; Gaulard, P.; Klapper, W.; Lee, A.; Sander, B.; Thorns, C.; Campo, E.; Molina, T.; Hagenbeek, A.; Horning, S.; Lister, A.; Raemaekers, J.; Salles, G.; Gascoyne, R.D.; Weller, E.

DOI: 10.1136/jcp.2008.057257
Publication date: 2009
Document Version: Final published version
Published in: Journal of clinical pathology

Citation for published version (APA):
de Jong, D., Xie, W., Rosenwald, A., Chhanabhai, M., Gaulard, P., Klapper, W., Lee, A., Sander, B., Thorns, C., Campo, E., Molina, T., Hagenbeek, A., Horning, S., Lister, A., Raemaekers, J., Salles, G., Gascoyne, R. D., & Weller, E. (2009). Immunohistochemical prognostic markers in diffuse large B-cell lymphoma: validation of tissue microarray as a prerequisite for broad clinical applications (a study from the Lunenburg Lymphoma Biomarker Consortium). Journal of clinical pathology, 62(2), 128-138. https://doi.org/10.1136/jcp.2008.057257

General rights
It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/Complaints regulations
If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.



Immunohistochemical prognostic markers in diffuse large B-cell lymphoma: validation of tissue microarray as a prerequisite for broad clinical applications (a study from the Lunenburg Lymphoma Biomarker Consortium)

D de Jong,1 W Xie,2 A Rosenwald,3 M Chhanabhai,4 P Gaulard,5 W Klapper,6 A Lee,7 B Sander,8 C Thorns,9 E Campo,10 T Molina,11 A Hagenbeek,12 S Horning,13 A Lister,14 J Raemaekers,15 G Salles,16 R D Gascoyne,4 E Weller2

1 The Netherlands Cancer Institute, Amsterdam, The Netherlands; 2 Dana-Farber Cancer Institute, Boston, Massachusetts, USA; 3 Institute of Pathology, University of Wurzburg, Wurzburg, Germany; 4 British Columbia Cancer Agency, University of British Columbia, Vancouver, Canada; 5 Department of Pathology, Inserm U617, Hopital Henri Mondor, Creteil, France; 6 Department of Pathology, Hematopathology Section, University Hospital Schleswig-Holstein, Campus Kiel, Germany; 7 CRUK Medical Oncology Unit, St Bartholomew’s Hospital, London, UK; 8 Karolinska Institutet, Stockholm, Sweden; 9 University Clinic Schleswig-Holstein, Campus Lubeck, Germany; 10 Hospital Clinic, University of Barcelona, Barcelona, Spain; 11 Universite Paris-Descartes; AP-HP, Hotel-Dieu, Paris, France; 12 Academic Medical Center, Amsterdam, The Netherlands; 13 Stanford University Medical Center, Palo Alto, California, USA; 14 St Bartholomew’s Hospital, Department of Medical Oncology, London, UK; 15 University Medical Center Nijmegen, The Netherlands; 16 Hospices Civils de Lyon and Universite Claude Bernard Lyon-1, Lyon, France

Correspondence to: Dr De de Jong, Department of Pathology, The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX Amsterdam, The Netherlands; [email protected]

Accepted 21 May 2008
Published Online First 15 September 2008

ABSTRACT

Background and Aims: The results of class prediction and the determination of prognostic markers in diffuse large B-cell lymphoma (DLBCL) have been variably reported. Apart from biological variations, this may be caused by differences in laboratory techniques, scoring definitions and inter- and intra-observer variation. In this study, an international collaboration of clinical lymphoma research groups has concentrated on validation and standardisation of immunohistochemistry of the currently potentially interesting prognostic markers in DLBCL.

Methods: Sections of a tissue microarray with 36 cases of DLBCL were stained in eight laboratories with antibodies to CD20, CD5, bcl-2, bcl-6, CD10, HLA-DR, MUM-1 and Ki-67 according to local methods. The study was performed in two rounds, firstly focused on the evaluation of laboratory staining variation, and secondly on the scoring variation.

Results: Different techniques resulted in highly variable results and poor reproducibility for almost all markers. Reproducibility of the nuclear markers was highly sensitive to technical variations, including immunological enhancement techniques (agreements 34%). With elimination of variation due to staining and uniformly agreed on scoring criteria, significant improvement was seen; however less so for bcl-6 and Ki-67 (agreement 53–58%). Absence of internal controls that preclude scoring significantly influenced the results for CD10 and bcl-6.

Conclusion: Semi-quantitative immunohistochemistry for subclassification of DLBCL is feasible, but with varying rates of concordance for different markers and only using optimised techniques and strict scoring criteria. These findings may explain the wide variation in prognostic impact reported in the literature. Harmonisation of techniques and centralised consensus review appears mandatory when using immunohistochemical biomarkers for treatment stratification.

The use of immunohistochemical methods has become part of the routine diagnostic procedure in several tumours, and even more essential in lymphoma. In the current classification of non-Hodgkin lymphoma, the immunophenotype constitutes an integral and very important part of the features used for classification, together with morphology, genetic characteristics and clinical findings. Apart from their use in classification, immunohistochemical prognostic and predictive biomarkers have become increasingly important for determining an expectation of survival. In the last 10 years, several markers have been identified that influence patients’ prognosis. This has led to the proposed use of these markers for risk stratification of lymphoma patients and to develop specific therapeutic strategies.

Since the recognition of two biological subtypes of diffuse large B-cell lymphoma (DLBCL) on the basis of gene-expression profiling,1 2 the exploration of the clinical relevance of this sub-typing has been the subject of many studies. The prognostic stratification of the germinal centre B-cell-like (GCB) and activated B-cell like (ABC) has been reproducible in most gene-expression studies by different groups.3 4 Immunohistochemistry using a limited number of markers may be an attractive alternative technique that enables exploration in larger retrospective and prospective studies, in uniformly treated series from clinical trials and in more rare and specific patient populations. Moreover, rather than using a gene-expression based approach, an immunohistochemical method might be highly suitable for implementation in daily patient management in the future. Several groups of authors have used similar approaches to translate the biological information of the GCB-like versus non-GCB-like subtypes in sizable patient series.5–12 Although the markers and algorithms were very similar, the results were remarkably variable. Some of these groups found a significant prognostic value using the immunohistochemical class prediction similar to the gene-expression method, while others did not. The inconsistency in results not only applies to the rather complex biological subtype stratification, but similarly holds true for single immunohistochemical prognostic markers such as bcl-2, bcl-6, survivin, FoxP1 and Ki-67 in DLBCL.13–23

Several biological and technical factors may underlie the inconsistent findings for the prognostic value of markers in DLBCL. These include: (1) selection of specific patient series (specific age groups, relative contribution of nodal versus extra-nodal disease); (2) treatment factors (non-uniform treatment, +/− rituximab); (3) laboratory technical variations (such as different primary antibodies, various antigen retrieval and signal amplification techniques); (4) scoring criteria and definitions; and (5) inter- and intra-observer variations.

In October 2003, the Lunenburg Lymphoma Biomarker Consortium (LLBC) was instituted as an international collaboration of nine leading European, American and Canadian clinical lymphoma research groups to unite their efforts in translational research.24 The main research focus was to determine those biomarkers thought to be important for prognosis in DLBCL. Prior to launching a comprehensive study on biopsy samples from patients treated in clinical trials, the group has concentrated on validation and standardisation of immunohistochemistry of the currently interesting prognostic markers in DLBCL. This effort has focused on the evaluation of technical and inter-observer variation in the context of strict scoring criteria and definitions.

MATERIALS AND METHODS

Tissue microarray construction

Tissue microarrays (TMAs) were prepared at the Department of Pathology of the British Columbia Cancer Agency from 36 representative cases of DLBCL and two tonsil samples with adequate archival formalin-fixed and paraffin-embedded material retrieved from six different laboratories and collected between 1984 and 2004. Representative 0.6 mm cores were taken and re-embedded in duplicate per recipient block. Six identical recipient blocks were constructed from the same donor blocks. All TMAs were checked to be equally representative for each case. Sections (5 µm) were then cut from each TMA in eight laboratories and stained with antibodies to CD20, CD5, bcl-2, bcl-6, CD10, HLA-DR, MUM-1 and Ki-67 according to locally optimised protocols (details in table 1). One aim of the study was to evaluate the variation in the results obtained with staining in different labs as done in daily practice.

Criteria and scoring methods for immunohistochemistry

Each core was evaluated for percentage of tumour cells stained by visual estimation and the maximum of the two cores was recorded. The LLBC pathologists convened twice to determine and refine the criteria for scoring prior to the first rotation round. Table 2 lists the scoring categories. Strong emphasis was put on the presence of internal controls as a prerequisite for scoring. In the absence of an internal control, the case was considered unscorable. For CD20 and CD5, HLA-DR and bcl-2, reactive B-cells and T-cells respectively, which are always present in DLBCL, served as internal controls. For CD10, sporadic granulocytes and stromal fibroblasts could be used as internal controls. These could be rather sparse, however. For bcl-6 and MUM-1, reactive T-cells were adequate as internal controls but often sparse; careful attention to their presence was required. Internal controls for Ki-67 could not be indicated, but since DLBCL always has a sufficient number of tumour cells in cycle, lack of internal control was not anticipated to cause a problem.
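The per-case scoring rule described above (maximum of the two cores, with an internal control required before a negative score is accepted) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the five-step scale is the one listed for bcl-2 and MUM-1 in table 2, and assigning exactly 5% to "no staining" is an assumption, since the published categories overlap at that value.

```python
def percentage_category(percent):
    """Bin a percentage of stained tumour cells into the five-step scale
    listed for bcl-2 and MUM-1 in table 2 (boundary handling is assumed)."""
    if percent <= 5:
        return "no staining"
    if percent <= 25:
        return "5-25%"
    if percent <= 50:
        return "26-50%"
    if percent <= 75:
        return "51-75%"
    return ">75%"


def score_case(core_percentages, internal_control_present):
    """Score one case from its duplicate cores.

    core_percentages: visual estimates for the two cores; None marks a
    core that is missing or cannot be read.
    internal_control_present: whether positive reactive cells were seen.
    """
    readable = [p for p in core_percentages if p is not None]
    if not readable:
        return "not scored"
    # The maximum of the two cores is recorded for the case.
    category = percentage_category(max(readable))
    # A negative result is only accepted when an internal control stained.
    if category == "no staining" and not internal_control_present:
        return "not scored"
    return category


print(score_case([10, 60], internal_control_present=True))
print(score_case([0, 2], internal_control_present=False))
```

Treating a positive case as scorable even without an internal control follows the wording of table 2; a stricter reading of the text would mark every case without a control as not scored.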

The Ki-67 staining from lab 5 and the MUM-1 staining from lab 4 were not considered due to very suboptimal technical results precluding evaluation. Lab 3 did not perform an HLA-DR staining. In the first rotation round, the set of eight immunohistochemical stains from each laboratory was scored by the local pathologist and two other pathologists to assess staining and scoring variation.

Table 1 Primary antibodies and protocols

| | Lab 1 | Lab 2 | Lab 3 | Lab 4 | Lab 5 | Lab 6 | Lab 7 | Lab 8 |
|---|---|---|---|---|---|---|---|---|
| Ki-67 | DAKO MIB1 | DAKO MIB1 | DAKO MIB1 | DAKO MIB1 | Locally produced | DAKO MIB1 | DAKO MIB1 | DAKO MIB1 |
| CD20 | DAKO L26 | DAKO L26 | DAKO L26 | DAKO L26 | Locally produced | DAKO L26 | DAKO L26 | DAKO L26 |
| CD5 | Novacastra 4C7 | Novacastra 4C7 | Novacastra 4C7 | Novacastra 4C7 | MEDAC NCL-CD5 4C7 | Novacastra 4C7 | Novacastra 4C7 | Novacastra 4C7 |
| bcl-2 | DAKO 124 | DAKO 124 | DAKO 124 | DAKO 124 | ZYMED bcl-2-100 | BIOCARTA 100D5 | DAKO 124 | DAKO 124 |
| CD10 | Novacastra 56C6 | Novacastra 56C6 | Novacastra 56C6 | Novacastra 56C6 | MEDAC CD10 270 | Novacastra 56C6 | Novacastra 56C6 | Novacastra 56C6 |
| bcl-6 | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p | DAKO PG-B6p |
| MUM-1 | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p | DAKO MUM-1p |
| HLA-DR | DAKO TAL.1B5 | DAKO TAL.1B5 | Not done | DAKO TAL.1B5 | DAKO TAL.1B5 | DAKO TAL.1B5 | DAKO TAL.1B5 | DAKO TAL.1B5 |
| Retrieval method | Citrate buffer (pH 6.1), LabVision, PT module | Citrate buffer (pH 6.1), pressure cooker | Tris (low pH except bcl-2 and bcl-6 pH 10), pressure cooker | Citrate buffer (pH 6.1) pressure cooker (Ki-67, bcl-2, CD10), pressure cooker (CD5, HLA-DR, microwave (MUM-1) | Citrate buffer (pH 6.1), pressure cooker | Citrate buffer (various pHs), pressure cooker | Citrate, microwave (Ki-67, CD20, bcl-2, CD10, bcl-6), EDTA, 60° overnight (CD5, CD10, bcl-6) | Citrate buffer (pH 6.1), pressure cooker |
| Incubation method | Automatic | Automatic | Automatic | Automatic | Manual | Automatic | Automatic | Manual |
| Incubation system | Labvision 2D | TECAN Robotic Handler | DAKO MS | DAKO TechMate | NA | TECAN Robotic Handler | DAKO TechMate | NA |
| Vertical/horizontal incubation | Horizontal | Vertical | Horizontal | Vertical | Horizontal | Horizontal | Vertical | Horizontal |
| Development | ABC/DAB | Envision*/DAB | Powervision† | Biotin/Streptavidin | APAAP | ABC/DAB | ChemMate*/DAB | ABC/DAB |

*DAKO. †Immunovision Technologies. NA, not applicable.

Based on the results from the first rotation round, which showed significant staining effects, LLBC pathologists (DdJ, AR, MC, PG, WK, BS, CT, EC, TM) convened a meeting to discuss the results and to directly compare all available stained TMA slides. When comparing all results per staining, the optimal staining per marker was selected using the results of the first round as a guideline and according to the presence of minimal artefacts and as most representative of the expected biological range/variation of the marker in DLBCL. Optimal stainings were selected by consensus from the complete collection to a set of 10 slides, one for each marker, with the exception of bcl-6 and CD5 for which two slides were selected. Subsequently, scoring variation was further evaluated in a second rotation round in which all nine pathologists scored these optimal stainings in the same slide set. For CD5 and bcl-6, two stains were selected that differed in staining characteristics, precluding interpretation on expected best scoring reproducibility. For CD5, identical scoring criteria were applied on both stains. For bcl-6, two different sets of scoring criteria were used: one based on cell percentages and one based on staining intensity. A “scoring manual” was constructed as an additional guideline.

Statistical analysis

Measures used to evaluate agreement included overall agreement between pairs of labs as well as the proportion of patients for whom all scores agree. The pair-wise agreement was adjusted for the expected proportion of agreement assuming the scoring laboratories were independent using the generalised κ statistics.25 The level of agreement for the κ statistic was evaluated based on the following ranges: <0 poor, 0.0–0.2 slight, 0.2–0.4 fair, 0.4–0.6 moderate, 0.6–0.8 substantial and 0.8–1.0 almost perfect. The standard error of the generalised κ statistic was estimated using the bootstrap method with 2000 replications.26 Resampling was performed at the patient level to conserve the correlation structure of the scores within a patient. The bootstrap confidence intervals were computed based on the percentiles of the bootstrap distribution of the statistic.26 The agreement measures were evaluated including or excluding the slides that were not scored. In the first round, the average overall and pair-wise agreement percentages across the markers were compared among the staining laboratories using the Kruskal–Wallis test; the coefficient of variation (CV) was reported.
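The agreement measures described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the generalised κ takes the multi-rater form of Fleiss (the text cites a reference without giving the formula), it assumes every case has the same number of raters, and the function names and toy data are invented for the example.

```python
import random
from collections import Counter
from itertools import combinations


def pairwise_agreement(ratings):
    """Mean, over cases, of the share of rater pairs giving the same score."""
    per_case = []
    for row in ratings:
        pairs = list(combinations(row, 2))
        per_case.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(per_case) / len(per_case)


def complete_agreement(ratings):
    """Share of cases on which every rater gave the same score."""
    return sum(len(set(row)) == 1 for row in ratings) / len(ratings)


def generalised_kappa(ratings):
    """Fleiss-type kappa; assumes the same number of raters for every case."""
    p_obs = pairwise_agreement(ratings)
    counts = Counter(score for row in ratings for score in row)
    total = sum(counts.values())
    p_exp = sum((n / total) ** 2 for n in counts.values())
    if p_exp == 1.0:  # only one category used: kappa is undefined
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


def bootstrap_percentile_ci(ratings, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole cases with replacement,
    which keeps all scores of one patient together."""
    rng = random.Random(seed)
    n = len(ratings)
    values = []
    for _ in range(n_boot):
        resample = [ratings[rng.randrange(n)] for _ in range(n)]
        value = statistic(resample)
        if value == value:  # drop undefined (NaN) replicates
            values.append(value)
    values.sort()
    lower = values[int(alpha / 2 * len(values))]
    upper = values[int((1 - alpha / 2) * len(values)) - 1]
    return lower, upper


# Hypothetical toy data: four cases, each scored by three pathologists.
toy = [
    ["pos", "pos", "pos"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
    ["neg", "pos", "neg"],
]
print(pairwise_agreement(toy), complete_agreement(toy), generalised_kappa(toy))
print(bootstrap_percentile_ci(toy, generalised_kappa))
```

Returning NaN when a single category accounts for every score mirrors the situation noted for CD20 in the tables, where expected agreement is so high that the κ denominator vanishes.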

We further evaluated whether combining biologically homogeneous categories could improve agreement. This was performed for bcl-6, CD5 and Ki-67 in the second rotation round. Using the most generally applied algorithm to distinguish GCB versus non-GCB DLBCL on the basis of immunohistochemistry for CD10, bcl-6 and MUM-1, agreement and generalised κ statistics among labs was performed for data from both rotation rounds. A cut-off level of 25% was used for each marker as a positive score in the GCB versus non-GCB classification.
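The text does not spell out the classification algorithm. The sketch below assumes the widely used sequential decision tree based on CD10, bcl-6 and MUM-1 (CD10 first, then bcl-6, then MUM-1), applied with the 25% cut-off stated above. The function names, the strict "greater than" reading of the cut-off and the handling of unscorable stains are illustrative assumptions.

```python
def is_positive(percent_stained, cutoff=25.0):
    """True if more than `cutoff` percent of tumour cells stain
    (a strict inequality is assumed here)."""
    return percent_stained > cutoff


def classify_gcb(cd10, bcl6, mum1, cutoff=25.0):
    """Return 'GCB', 'non-GCB' or 'not scored' from three percentages.

    A value of None marks a stain that could not be scored. A case is
    only reported as not scored when the missing stain is actually
    needed to reach a decision (an assumption of this sketch).
    """
    if cd10 is None:
        return "not scored"
    if is_positive(cd10, cutoff):
        return "GCB"  # CD10-positive cases are GCB regardless of the rest
    if bcl6 is None:
        return "not scored"
    if not is_positive(bcl6, cutoff):
        return "non-GCB"  # CD10-negative and bcl-6-negative
    if mum1 is None:
        return "not scored"
    # CD10-negative, bcl-6-positive: MUM-1 decides
    return "non-GCB" if is_positive(mum1, cutoff) else "GCB"


print(classify_gcb(cd10=80, bcl6=0, mum1=0))
print(classify_gcb(cd10=0, bcl6=60, mum1=60))
```

Because the tree stops at the first decisive marker, an unscorable bcl-6 or MUM-1 stain only matters for CD10-negative cases, which is one way in which missing internal controls for a single marker can propagate into the combined classification.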

RESULTS

Agreement across staining and scoring laboratories

Agreement results are summarised for the first rotation round in table 3 and fig 1, and for the second rotation round in table 4. The agreement results are also presented excluding the slides which could not be scored. The reasons slides could not be scored varied for different markers and are summarised for each marker below. In the first round, the average pair-wise agreement percentage between the three pathologists across markers was 71, 67, 66, 75, 67, 65, 64, and 77 for staining labs 1–8, respectively (p = 0.9, Kruskal–Wallis test). Similarly, the average complete agreement was 67, 60, 56, 56, 67, 50, 51 and 52 for staining labs 1–8, respectively (p = 0.9). This indicates that the eight staining laboratories produced the same degree of average agreement. However, for some markers the agreement was low and the estimated standard deviation was relatively high compared to the mean (CV 21–31 for all markers except CD20 and HLA-DR, with agreement for one lab different from the other seven; fig 1), resulting in further detailed evaluation of the individual stains. The majority of the markers show a difference in agreement between both rounds with improvement in the second round compared to the first round. However, the agreement remained low to moderate for some markers. Technical and staining aspects that could influence the scoring are discussed individually for each marker below.

CD20

Only CD20 stands out as a uniformly reliable marker to score for almost all stains and for all pathologists (fig 3A), resulting in complete agreement in the second round. However, it should also be noted that for CD20 some staining variation was present in the first round in terms of overall staining intensities, background levels and artefacts, resulting in incidental unscorable cores (up to 9% in the first round due to high background in the staining performed in lab 5) and a single discordant case, which was negative in only 1/8 CD20-stained TMA slides. The presence of nucleolar artefacts was laboratory dependent (fig 2A).

Table 2 Scoring criteria for immunohistochemistry in the second rotation round

| Marker | Scoring criteria |
|---|---|
| CD20 | Score as positive/negative in tumour cells |
| CD5 | No staining, 1–25%, 26–50%, 51–75%, >75%. For the score “no staining” an internal staining control must be present. T-cells serve as internal controls |
| HLA-DR | Score as positive/negative in tumour cells. For the score “negative” an internal staining control must be present. Reactive small B- and T-cells and accessory cells serve as internal controls |
| CD10 | Score as positive/negative in tumour cells. For the score “negative” an internal control must be present. Granulocytes and stromal fibroblasts serve as internal controls |
| bcl-2 | No staining (= 0–5%), 5–25%, 26–50%, 51–75%, >75%. For the score “no staining” an internal staining control must be present. Staining intensity is scored as weak (= weaker than internal T-cells) and strong (= equal or stronger than internal T-cells) |
| Ki-67 | No staining, 1–25%, 26–50%, 51–75%, 76–95%, >95%. For the score “no staining” an internal staining control must be present (mitoses and at least sporadic tumour cells) |
| MUM-1 | No staining (= 0–5%), 5–25%, 26–50%, 51–75%, >75%. For the score “no staining” an internal staining control must be present. Activated T-cells serve as internal controls |
| bcl-6, lab 1 | No staining (= 0–5%), 5–25%, 26–50%, 51–75%, >75%. For the score “no staining” an internal staining control must be present. Internal controls may be very sparse and consist of T-cells |
| bcl-6, lab 7 | “Strong” as saturated, “strong variable” as variable with strong to moderate staining variation, “variable weak” as variation between weak and moderate staining intensities and “weak” as negative with sporadically staining cells. For the score “no staining” an internal staining control must be present. Internal controls may be very sparse and consist of T-cells |

HLA-DR

HLA-DR was also reproducible in terms of staining or scoring, with agreement of 86–92% for all staining laboratories except one in the first round, and of 95% (95% CI 86% to 99%) in the second round. One single lab (lab 7) stood out, with up to 34% unscorable cases due to high background staining in the first round. This lab routinely used an immunological amplification step for immunohistochemical visualisation (ChemMate, DAKO), resulting in a very strong signal, but at the cost of high background staining in some markers, including HLA-DR.

CD10

CD10 may be considered as a rather reliably scored marker with an agreement for pairs of labs of 87% in the first round and 95% (95% CI 88% to 99%) in the second round for interpretable cases. However, in suboptimal stains, the sparsely present internal controls (stromal fibroblasts and granulocytes, fig 3C) may be lost, resulting in a high percentage of unscorable cases (up to 39% in the first round), as also reflected by the poor agreement results with the inclusion of the not scored cases in the first round (pair-wise agreement of 65% vs 87% with and without the not scored cases, respectively). With the optimal staining in the second round, results were excellent (pair-wise agreement of 87% vs 95% with 3.7% of non-scorable cases due to lack of internal controls).

Figure 1 Pair-wise agreement percentages among three pathologists by each staining laboratory for the first rotation round for markers with higher levels of agreement (A) and markers with lower levels of agreement (B). Results are shown with the not scored slides included and excluded in the agreement calculations. The coefficients of variation (CV) across scores for a staining laboratory are computed with (CV1) and without the not scored category (CV2).

CD5CD5 staining was strongly influenced by technical variationdespite the use of the same antibody clone in all labs (4C7 fromdifferent manufacturers). The use of standard ABC/DABvisualisation produced dramatically different results (pair-wiseagreement 91–98%) than visualisation with maximised

enhancement systems (ChemMate, DAKO; Powervision,Immunovision Technologies; pair-wise agreement 68–69%)with far stronger membranous staining at the cost of increasedintracytoplasmic background staining (fig 3B). Experience withflow-results indicates that this staining is actually specific andshould be considered as positive (B Sander and RD Gascoyne,personal communication). Therefore, for the second rotationround both maximised stains were included. The second roundresults show, however, that extreme enhancement introducesan unacceptable level of background staining and therefore isachieved at the expense of a high percentage of unscorable cases(11% for lab 7, fig 2). CD5 was initially considered in fivescoring categories. The distribution of cases over thesecategories showed at most 9% of the scores in each of thethree intermediate categories (1–25%, 26–50%, 51–75%) (fig 2B).This suggested that the biological dichotomy could be placed ata single higher level (>75%), with all cases with less than 75%

Table 3 Agreement percentages and the generalised k statistic from the first rotation round combinedacross staining laboratories

Marker

Including the not scored category Excluding the not scored category

% agreement in

Generalised k

% agreement in

Generalised k24 labs Pairs of labs 24 labs Pairs of labs

bcl-2 0 47 (7) 0.23 (0.07) 12 56 (8) 0.29 (0.09)

bcl-6 0 34 (3) 0.17 (0.03) 0 36 (3) 0.15 (0.03)

CD10 3 65 (4) 0.39 (0.08) 45 87 (4) 0.69 (0.09)

CD20* 79 95 (3) – 94 99 (1) –

CD5 18 79 (4) 0.25 (0.16) 30 83 (4) 0.26 (0.20)

HLA-DR{ 0 77 (4){ 0.32 (0.09) 61 90 (4) 0.51 (0.15)

Ki-67{ 0 35 (3) 0.14 (0.04) 0 42 (3) 0.16(0.04)

MUM-1{ 0 34 (4) 0.16 (0.05) 3 37 (5) 0.18 (0.06)

GCB 3 57 (5) 0.36 (0.07) 30 81 (5) 0.62 (0.11)

The bootstrap standard errors are in parentheses.*k statistic not computed for CD20 because 99.6% of the cases scored are positive, resulting in high expected agreement andsmall denominator of the k statistic.{Exclude one staining lab for Ki-67, one staining lab for MUM-1 and GCB, and one staining lab for HLA-DR.{Pair-wise agreement was high for 7 of the 8 staining labs (86–92%) and low for one staining lab (34%) as shown in fig 1.

Table 4 Agreement percentages and the generalised κ statistic across nine scoring pathologists in the second rotation round

| Marker | Including not scored: % agreement in 9/9 labs | Including not scored: % agreement in pairs of labs | Including not scored: generalised κ | Excluding not scored: % agreement in 9/9 labs | Excluding not scored: % agreement in pairs of labs | Excluding not scored: generalised κ |
|---|---|---|---|---|---|---|
| bcl-2 | 46 | 70 (8) | 0.45 (0.12) | 46 | 72 (8) | 0.46 (0.11) |
| bcl-2 intensity | 33 | 74 (5) | 0.51 (0.11) | 49 | 80 (5) | 0.58 (0.12) |
| bcl-6, lab 1 | 12 | 53 (7) | 0.42 (0.07) | 30 | 57 (7) | 0.43 (0.09) |
| bcl-6, lab 7 | 6 | 54 (5) | 0.37 (0.07) | 3 | 54 (4) | 0.34 (0.06) |
| bcl-6, lab 7, weak vs strong | 42 | 80 (6) | 0.58 (0.11) | 44 | 81 (5) | 0.56 (0.11) |
| CD10 | 70 | 87 (6) | 0.72 (0.12) | 82 | 95 (3) | 0.88 (0.08) |
| CD20* | 100 | 100 (0) | – | 100 | 100 (0) | – |
| CD5, lab 7 | 46 | 73 (8) | 0.43 (0.13) | 61 | 82 (7) | 0.48 (0.16) |
| CD5, lab 3 | 49 | 71 (8) | 0.44 (0.12) | 49 | 75 (7) | 0.49 (0.11) |
| CD5, lab 3, ≤75% vs >75% | 67 | 86 (5) | 0.45 (0.18) | 73 | 90 (5) | 0.55 (0.18) |
| HLA-DR | 88 | 95 (4) | 0.75 (0.13) | 88 | 96 (3) | 0.74 (0.14) |
| Ki-67 | 6 | 58 (4) | 0.39 (0.07) | 6 | 58 (4) | 0.35 (0.06) |
| Ki-67, 4 categories | 61 | 83 (6) | 0.58 (0.11) | 66 | 84 (6) | 0.53 (0.12) |
| MUM-1 | 6 | 54 (6) | 0.41 (0.07) | 9 | 56 (6) | 0.42 (0.07) |
| GCB | 3 | 77 (7) | 0.62 (0.11) | 73 | 89 (6) | 0.77 (0.12) |

The bootstrap standard errors are in parentheses.
*κ statistic cannot be computed for CD20 because 100% of the cases scored are positive, resulting in 100% expected agreement and denominator of the κ statistic equal to 0.
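The bootstrap standard errors reported in parentheses can be illustrated with a minimal sketch. It assumes that cases (tissue cores) are resampled with replacement and that the statistic is recomputed on each resample; the resampling unit, the number of replicates and all names below are assumptions made for illustration, not details taken from the study.

```python
import random
from itertools import combinations


def pairwise_agreement(ratings):
    """Mean fraction of agreeing rater pairs, averaged over cases."""
    per_case = []
    for case in ratings:
        pairs = list(combinations(case, 2))
        per_case.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(per_case) / len(per_case)


def bootstrap_se(ratings, statistic, n_boot=1000, seed=0):
    """Bootstrap standard error of `statistic`, resampling cases.

    Assumption: cases are the resampling unit; raters stay fixed.
    """
    rng = random.Random(seed)
    n = len(ratings)
    estimates = []
    for _ in range(n_boot):
        # Draw n cases with replacement and recompute the statistic.
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


# Hypothetical toy data: three raters per case.
mixed = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "pos"],
]
uniform = [["pos", "pos", "pos"]] * 5

se_mixed = bootstrap_se(mixed, pairwise_agreement)
se_uniform = bootstrap_se(uniform, pairwise_agreement)
```

With only a handful of cases the standard error is large relative to the estimate, which is one reason small κ differences between markers in the tables should be read with caution.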


positive tumour cells considered as negative. This approach improved the inter-observer agreement from 71% (95% CI 56% to 88%) with five categories to 86% (95% CI 75% to 96%) with two categories, but remained at moderate level per the κ statistic (κ = 0.43 vs 0.45, including not scored cases).

bcl-2

bcl-2 showed only fair agreement in the first rotation round due to staining variation (pair-wise agreement of 47%, κ = 0.23) and a relatively high percentage of not scored cases that stood out in two of the stains (13% and 15%). While all other labs used the same antibody, these two used a different clone from another manufacturer that seemed to produce more non-specific background staining, precluding any reliable scoring. Moreover, one of the labs used a very different visualisation technique (APAAP). In the second round with the optimal stain, only 1% of samples could not be scored and moderate agreement could be reached (70% pair-wise agreement, 95% CI 56% to 87%, κ = 0.45). As an alternative approach, the intensity of cytoplasmic staining in tumour cells compared to reactive T-cells in the same sample was considered (fig 3D). This could be performed with somewhat better reproducibility (74% pair-wise agreement, 95% CI 65% to 86%, κ = 0.51). Whether this feature is of biological and prognostic relevance remains to be studied in the context of a clinical series.

MUM-1

The nuclear markers, bcl-6, MUM-1 and Ki-67, proved to be the markers most influenced by laboratory variation. For MUM-1, the problem was not the scorability, since internal controls were virtually always present. In some labs, non-specific cytoplasmic background staining and “target-like” artefacts were a major problem, precluding adequate scoring (up to 58%). When all pathologists scored the same optimal stain, only moderate agreement (pair-wise agreement of 54%, 95% CI 45% to 68%, κ = 0.41) could be reached and especially classification in the higher frequency classes was highly variable between pathologists (9–46% with scores >75%).

Ki-67

The reproducibility of scoring for Ki-67 was poor in the first rotation round (35% pair-wise agreement, κ = 0.14). Several artefacts were encountered as illustrated in fig 4. Once the laboratory staining variation was eliminated, the reproducibility of the scoring for Ki-67 improved in the second round (58% pair-wise agreement, 95% CI 54% to 69%, κ = 0.39) and a high proportion of slides were scorable (94%). However, the distribution of the scores across the categories was skewed towards the higher scores (fig 2). Also in the second round, the percentage of patients with scores in the two higher categories (76–95% and >95%) was highly variable (the percentage of patients scored >95% ranged from 15% to 52% for the nine scoring labs). Indeed, when considering these two categories together in a four-category scoring scheme, a moderate agreement (83% pair-wise agreement, 95% CI 71% to 94%, κ = 0.58) could be reached and the percentage of patients with scores >75% was less variable (66% for one lab and 73–79% for eight scoring labs).

Figure 2 Distribution of the scores of the nine pathologists from the second rotation round for Ki-67 (A) and CD5 (B) from a lab using amplification protocols (x-axis is the percentage of the scores in each category).

bcl-6

bcl-6 was found to be the most variable and most difficult marker to score (fig 5). Despite use of the same primary antibody, the staining results varied dramatically, resulting in pair-wise agreement of 34% and κ = 0.17. Different laboratory techniques were found to strongly influence the level of sensitivity of the staining. Two labs (lab 3 and lab 7) produced positive staining with bcl-6 in virtually all cases of DLBCL when using immunological amplification techniques (ChemMate Detection Kit, DAKO and Powervision, Immunovision Technologies), while all other labs obtained some negative cases and a range of positive cases. Uniform expression of bcl-6 in DLBCL is in line with expression data using RNA-based techniques in which all DLBCL are shown to produce bcl-6 to some extent. Therefore, the staining variation reflects the sensitivity of the different techniques and the gradients of expression. The essentially different staining characteristics precluded full comparison in the first round.

Figure 3 (A) CD20 generally shows strong membranous staining (A1). However, in rare cases, a more variable and weak staining may be seen with a nucleolar artefact, which should not be considered in scoring and not be regarded as positive in the absence of membranous staining (A2). (B) Two different CD5 stainings on the same case in the same tissue microarray show strong enhancement with the use of Powervision (B2) compared to a standard ABC/DAB technique. (C) Internal controls for CD10 can be very sparse and consist of stromal fibroblasts (C1) and granulocytes (C2). (D) bcl-2 can be uniformly strong in diffuse large B-cell lymphoma, but also stain positive in a minority of the tumour cells and with varying intensity, or stronger (D1) or weaker (D2) than internal control T-cells.

In the second round, two different scoring systems were used for the two patterns; based on intensity and relative percentages of positive cells. The classical method based on percentages resulted in moderate agreement (pair-wise agreement of 53%, 95% CI 49% to 67%, κ = 0.42), but at the cost of an unacceptably high percentage of not scored cases (15%) due to non-representative staining (absence of internal controls). Intensity scoring also yielded moderate agreement (pair-wise agreement of 80%, 95% CI 69% to 91%, κ = 0.58) when dichotomised in very simplified “weak” versus “strong” categories, but with a lower percentage of not scored cases (4%).

Reproducibility of GCB versus non-GCB DLBCL class-assignment

The combined analysis of CD10, bcl-6 and MUM-1 according to set algorithms may be a surrogate for the gene-expression signatures of the prognostically relevant classes of GCB- versus non-GCB-like DLBCL. Staining and scoring variations may have a direct effect on the reproducibility of the immunological class distinction. The overall results in the first round were dominated by the large proportion of unscorable cases for CD10 (up to 39%) and the variation in staining characteristics due to signal amplification for bcl-6. Therefore, pair-wise agreement only reached 57% (95% CI 50% to 68%, κ = 0.36).

For the second rotation round with the optimal stains, pair-wise agreement increased to 77% (95% CI 64% to 90%, κ = 0.62). Exclusion of cases that could not be scored for one or more of the relevant markers according to the set criteria, however, resulted in pair-wise agreement of 89% (95% CI 77% to 98%, κ = 0.77).
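How a single unscorable or discordant marker propagates into the class call can be sketched with the decision tree of Hans et al (reference 7), which is assumed here to be the set algorithm meant above. Inputs are already dichotomised marker results; the function name, the use of `None` for a not scored marker, and the rule that a case is only left unclassified when a marker actually needed for the decision is missing are illustrative assumptions.

```python
def classify_cell_of_origin(cd10, bcl6, mum1):
    """Assign GCB / non-GCB from dichotomised markers (Hans-type tree).

    Each argument is True (positive), False (negative) or None (not scored).
    Assumption: a missing marker only blocks the call if the tree needs it.
    """
    if cd10 is None:
        return "not scored"
    if cd10:
        # CD10-positive cases are GCB regardless of the other markers.
        return "GCB"
    if bcl6 is None:
        return "not scored"
    if not bcl6:
        # CD10-negative and bcl-6-negative.
        return "non-GCB"
    if mum1 is None:
        return "not scored"
    # CD10-negative, bcl-6-positive: MUM-1 decides.
    return "non-GCB" if mum1 else "GCB"


# Hypothetical example: two pathologists disagree only on bcl-6,
# which flips the class for a CD10-negative, MUM-1-negative case.
call_a = classify_cell_of_origin(cd10=False, bcl6=True, mum1=False)
call_b = classify_cell_of_origin(cd10=False, bcl6=False, mum1=False)
```

Because bcl-6 sits in the middle of the tree, the poor reproducibility of bcl-6 scoring feeds directly into the class assignment for every CD10-negative case.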

DISCUSSION

Modern approaches to cancer treatment are increasingly driven by biological insights aimed at tailoring therapy. Therefore, more demand than ever is put on the pathologist to provide reproducible and reliable information on biomarkers and biological subclassification of distinct tumour types. In this study we explored variations introduced by laboratory techniques, inter-observer variations and scoring reproducibility using a set of established and potentially important immunohistochemical markers for DLBCL. Despite the fact that all stains were performed in experienced laboratories with a special focus on haematopathology, laboratory variations had a major impact on levels of agreement. When eliminating the staining variation, scoring proved to be highly reproducible between pathologists for several markers such as CD20, CD10 and HLA-DR and reasonably so for MUM-1, bcl-2 and CD5. However, for other markers including Ki-67 and bcl-6, the reproducibility was at a lower level even with exclusion of the staining variation. The results are generally in line with the immunohistochemical validation study performed by the Lymphoma/Leukemia Molecular Profiling Project [27]. However, this study also provides a detailed assessment of the variation sources per marker, the possible consequences and recommendations for future studies and daily practice. Obviously, it is a matter of debate what level of reproducibility, both in terms of technical results and scoring performance, can be considered as acceptable for immunohistochemical studies.

Technical variation resulted in dramatically different and unexpected results in some situations. The use of different antibody clones for bcl-2, Ki-67 and CD10 increased the variation and mostly resulted in different proportions of unscorable cases due to non-specific background staining and artefacts. Even when the same clones were used, staining intensities were quite variable, but this generally had a minor influence on the reproducibility of the scoring. However, for CD5 and bcl-6, two labs obtained results that stood out as remarkably different from all others. Only these two labs routinely included immunohistochemical enhancement techniques in their staining protocols. This resulted in essentially different staining characteristics. Up to four times more CD5-positive DLBCL were recognised with these more sensitive techniques. Moreover, the pathologists who were used to these results in daily practice adhered to a different internal standard and more often considered weak staining in non-amplified techniques as positive and thereby showed a significantly lower level of agreement with the other contributors on these markers.

Figure 4 Artefacts in Ki-67 that preclude adequate scoring consist of very weak and heterogeneous staining with disproportionately strong staining of mitoses (A1) compared to an adequate stain of the same core in the same tissue microarray (A2). Staining of apoptotic fragments (A3) and cytoplasmic background (A4) should not be considered as positive. Note the negative mitosis in A4 indicative of an inadequate staining.

The amplification effect was most impressive for bcl-6. In our series, maximal enhancement of bcl-6 staining showed that actually all DLBCL expressed bcl-6 protein to some extent. This is in line with gene-expression results which show that indeed virtually all DLBCL do express bcl-6 to some extent and the distribution of bcl-6 levels is more or less linear. This implies that the distribution of percentage classes fully depends on the sensitivity of the techniques used and that the choice of cut-off levels and the percentage of positive and negative cases that follows from these cut-offs is very much arbitrary. When the poor reproducibility of class assignment is taken into consideration (53–80% agreement and 0.37–0.58 for κ), it is obvious that comparison of studies from different centres is not straightforward and that published studies harbour unpredictable technical and interpretation differences. Indeed, the strong prognostic value of even a minor component of bcl-6 positive tumour cells in patients with DLBCL treated with CHOP chemotherapy only is quite unexpected [22]. Therefore, although in general the increase of the detection levels by modern immunohistochemical enhancement techniques (standard citrate retrieval, Powervision, Envision) has been of great benefit, it certainly hampers the comparison of published literature and may show unexpected results.

Figure 5 The results for bcl-6 on the same cores on the same tissue microarrays are compared in panels (A), (B) and (C), respectively, with a standard ABC/DAB technique and with a protocol that includes an immunohistochemical amplification (panels 1 and 2, respectively). Corresponding decreased intensity is seen in the less sensitive technique with increasing numbers of bcl-6 staining tumour cells falling below the detection level of the staining. In panel (C), no positive tumour cells are seen despite weak, but distinctly positive staining in the more sensitive technique (C2). Enhancement techniques may result in stronger staining at the cost of morphological artefacts (D1). Increased cytoplasmic background staining may preclude reliable scoring (D2).

Inability to assess a staining result in an individual case may be an underestimated problem. Apart from the TMA-specific problem of missing cores, reasons for excluding a case for a specific marker included different types of technical artefacts, especially high background staining and absence of an internal control. This played a major role in the first rotation, but also in the optimised staining of the second rotation. The aspect of the internal controls was mostly encountered for bcl-6 and CD10, in which the internal controls can be very sparse. For all other stains, admixed T-cells are always present and can serve as internal control. For Ki-67, mitoses that are always present in DLBCL may serve as internal controls and should be looked for. Since there is always a sizable percentage of tumour cells in cycle in DLBCL, completely negative results can be considered as not reliable.

This study shows that agreement was better for markers scored with only two categories compared to multiple categories. This would form a strong argument against overly refined scoring and for omitting essentially non-reproducible cut-off points in situations that would be acceptable from a biological point of view. For daily practice, this may have implications for the distinction of DLBCL and (atypical) Burkitt lymphoma in which a very high proliferation rate is one of the defining parameters. We could not reproducibly score Ki-67, however, in categories of 75–95% versus >95% (κ = 0.39), while combining the highest categories dramatically improved reproducibility (κ = 0.58), showing that the problem indeed lies in the upper ranges and that 95% of positive cells may not be a reliable cut-off point. Taken together, Ki-67 may therefore not be a marker of choice for the classification of Burkitt lymphoma in daily practice.
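The effect of merging the two highest categories can be illustrated with a small worked example. The ordinal category indices (0 to 4, with 3 and 4 standing in for the two highest Ki-67 classes), the toy scores and the function names are hypothetical; only the idea of collapsing the top classes before recomputing pair-wise agreement comes from the text.

```python
from itertools import combinations


def pairwise_agreement(ratings):
    """Mean fraction of agreeing rater pairs, averaged over cases."""
    per_case = []
    for case in ratings:
        pairs = list(combinations(case, 2))
        per_case.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(per_case) / len(per_case)


def merge_top_category(ratings, top):
    """Collapse ordinal category `top` into `top - 1` for every score."""
    return [[min(score, top - 1) for score in case] for case in ratings]


# Hypothetical scores from three raters on five cases.
# Most disagreement sits between the two highest categories (3 vs 4).
five_category = [
    [4, 3, 4],
    [3, 4, 4],
    [4, 4, 3],
    [2, 2, 2],
    [1, 2, 1],
]
four_category = merge_top_category(five_category, top=4)

agreement_five = pairwise_agreement(five_category)  # 7/15
agreement_four = pairwise_agreement(four_category)  # 13/15
```

In this toy set, agreement rises from 7/15 to 13/15 after merging, because the disagreement that remains is confined to the lower categories. The gain says nothing about whether the merged class is still clinically useful, which is the point raised above for Burkitt lymphoma.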

All described technical issues that have an effect on staining and scoring variation may have consequences for class assignment of DLBCL into GCB and non-GCB phenotypes. Using the most generally applied algorithm, proportions vary between 65%/35% [5] and 51%/49% [9] in the published series reported thus far. Selected series fall significantly outside these ranges, however, suggesting also a true biological spectrum (non-GCB/GCB 68%/32% in refractory and relapsed patients [28], 17%/83% in paediatric patients [29], 58%/42% in high risk patients [12]). When excluding “not scored” cases and those with dichotomised markers (considered positive at a cut-off level of 25% for bcl-6 and MUM-1) in our study, a very high agreement across nine labs was seen (κ = 0.77, 89% agreement). Agreement dropped considerably, however, when the “not scored” category was considered in the analysis (κ = 0.62, 77% agreement). Therefore, pathologists are only moderately good at consistent class assignment according to the standard algorithm, particularly on the basis of TMAs. In comparison, optimised central staining and scoring for HER2 in breast cancer patients reaches agreement levels of 81.6%, and 75.0% for non-optimised staining, in a large validation study [30]. The fact that it is only possible to reach an acceptable level of inter-observer reproducibility in this highly controlled scoring protocol is a strong argument to limit treatment stratification on the basis of GCB versus non-GCB features to clinical trials with central pathology review support, as is also recommended for HER2 testing in breast cancer patients.

Only when the technical aspects and results are harmonised can biologically relevant cut-off levels be determined and can we start to define the relevant factors that determine the prognostic value of immunohistochemical markers in DLBCL, including in different biological subgroups or treatment groups (e.g. rituximab) [31, 32].

Taken together, this study shows that semi-quantitative immunohistochemistry for prognostic stratification of DLBCL is feasible in a reproducible way, but with varying rates of success for different markers. Lack of harmonisation of techniques and interpretation is likely to explain in part the wide variability of published results. At this stage, it is recommended that clinical decisions based on immunohistochemical stratification be performed in the context of clinical trials with centralised consensus review and validated assessment of biomarkers, and not on results of individual local centres. For individual centres, it is highly recommended to retain a very critical approach towards their own results, to be conscious of the technical reproducibility of the markers at the local setting, to be involved in local and national quality control rotations, and to build up experience with the results to develop sufficient insights in the range of staining variations of cases of DLBCL.

Acknowledgements: Contributors: The Lunenburg Lymphoma Biomarker Consortium is a collaboration of nine international lymphoma collaborative groups, each represented by a clinical investigator and one or more haematopathologists and supported by a team of statisticians.
EORTC Lymphoma Group: Daphne de Jong, Dennis Veldhuizen, John Raemaekers
HOVON: Daphne de Jong, Marie Jose Kersten, Anton Hagenbeek
GELA: Philippe Gaulard, Thierry Molina, Josette Briere, Gilles Salles
British Columbia Cancer Center: Randy Gascoyne, Mukesh Chhanabhai, Laurie Sehn
ECOG: Randy Gascoyne, Sandra Horning
DSHNHL (German High Grade non-Hodgkin Lymphoma Group): Christoph Thorns, Andreas Rosenwald, Wolfram Klapper, German Ott, Sylvia Hoeller, Heinz-Wolfram Bernd, Michael Pfreundschuh
NLSG: Birgitta Sander, Eva Kimby
St Bartholomew’s Hospital: Abigail Lee, Andrew Norton, Andrew Clear, Andrew Lister
Independent pathology advisor: Elias Campo, Barcelona, Spain; local support: Antoni Martinez
Dana-Farber Cancer Institute, Boston, USA: Edie Weller

Funding: The project is initiated and financially supported by the van Vlissingen Lymphoma Foundation. In addition, unrestricted grants were received from Genentech, Millennium Pharmaceuticals Inc., Roche International and Schering AG by the van Vlissingen Lymphoma Foundation.

Competing interests: None.

Take-home messages

- Immunohistochemical markers in lymphoma should only be scored in the presence of adequate staining of cell populations as internal controls.
- Due to high levels of technical and scoring variations, comparison of published series of immunohistochemical markers in lymphoma should be done with caution.
- At this stage, clinical decisions based on immunohistochemical stratification should only be performed in the context of clinical trials with centralised consensus review and validated assessment of biomarkers, and not on results of individual local centres.

REFERENCES

1. Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000;403:503–11.

2. Rosenwald A, Wright G, Chan WC, et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 2002;346:1937–47.

3. Monti S, Savage KJ, Kutok JL, et al. Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response. Blood 2005;105:1851–61.

4. Feuerhake F, Kutok JL, Monti S, et al. NFkappaB activity, function, and target-gene signatures in primary mediastinal large B-cell lymphoma and diffuse large B-cell lymphoma subtypes. Blood 2005;106:1392–9.

5. Barrans SL, Carter I, Owen RG, et al. Germinal center phenotype and bcl-2 expression combined with International Prognostic Index improves patient risk stratification in diffuse large B-cell lymphoma. Blood 2002;99:1136–43.

6. Colomo L, Lopez-Guillermo A, Perales M, et al. Clinical impact of the differentiation profile assessed by immunophenotyping in patients with diffuse large B-cell lymphoma. Blood 2003;101:78–84.

7. Hans CP, Weisenburger DD, Greiner TC, et al. Confirmation of the molecular classification of diffuse large B-cell lymphoma by immunohistochemistry using a tissue microarray. Blood 2004;103:275–82.

8. Chang CC, McClintock S, Cleveland RP, et al. Immunohistochemical expression patterns of germinal center and activation B-cell markers correlate with prognosis in diffuse large B-cell lymphoma. Am J Surg Pathol 2004;28:464–70.

9. de Paepe P, Achten R, Verhoef G, et al. Large cleaved and immunoblastic lymphoma may represent two distinct clinicopathological entities within the group of diffuse large B-cell lymphomas. J Clin Oncol 2005;23:1–9.

10. Berglund M, Thunberg U, Amini R-M, et al. Evaluation of immunophenotype in diffuse large B-cell lymphoma and its impact on prognosis. Mod Pathol 2005;18:1113–20.

11. Zinzani PL, Dirnhofer S, Sabatini E, et al. Identification of outcome predictors in diffuse large B-cell lymphoma. Immunohistochemical profiling of homogeneously treated de novo tumors with nodal presentation on tissue microarrays. Haematologica 2005;90:341–7.

12. van Imhoff GW, Boerma EJG, van der Holt B, et al. Prognostic impact of germinal center-associated proteins and chromosomal breakpoints in poor-risk diffuse large B-cell lymphoma. J Clin Oncol 2006;24:4135–42.

13. Kramer MH, Hermans J, Parker J, et al. Clinical significance of bcl2 and p53 protein expression in diffuse large B-cell lymphoma: a population-based study. J Clin Oncol 1996;14:2131–8.

14. Hill ME, MacLennan KA, Cunningham DC, et al. Prognostic significance of bcl-2 expression and bcl-2 major breakpoint region rearrangement in diffuse large cell non-Hodgkin’s lymphoma: a British National Lymphoma Investigation Study. Blood 1996;88:1046–51.

15. Hermine O, Haioun C, Lepage E, et al. Prognostic significance of bcl-2 protein expression in aggressive non-Hodgkin’s lymphoma. Blood 1996;87:265–72.

16. Gascoyne RD, Adomat SA, Krajewski S, et al. Prognostic significance of bcl-2 protein expression and bcl-2 gene rearrangement in diffuse aggressive non-Hodgkin’s lymphoma. Blood 1997;90:244–51.

17. Iqbal J, Sanger WG, Horsman DE, et al. Bcl-2 translocation defines a unique tumor subset within the germinal center B-cell-like diffuse large B-cell lymphoma. Am J Pathol 2004;165:159–66.

18. Lossos IS, Jones CD, Warnke R, et al. Expression of a single gene, BCL-6, strongly predicts survival in patients with diffuse large B-cell lymphoma. Blood 2001;98:945–51.

19. Adida C, Haioun C, Gaulard P, et al. Prognostic significance of survivin expression in diffuse large B-cell lymphomas. Blood 2000;96:1921–5.

20. Sagaert X, de Paepe P, Libbrecht L, et al. Forkhead box protein P1 expression in mucosa-associated lymphoid tissue lymphomas predicts poor prognosis and transformation to diffuse large B-cell lymphoma. J Clin Oncol 2006;24:2490–7.

21. Banham AH, Connors JM, Brown PJ, et al. Expression of the FOXP1 transcription factor is strongly associated with inferior survival in patients with diffuse large B-cell lymphoma. Clin Cancer Res 2005;11:1065–72.

22. Winter JN, Weller EA, Horning SJ, et al. Prognostic significance of bcl-6 protein expression in DLBCL treated with CHOP or R-CHOP: a prospective correlative study. Blood 2006;107:4207–13.

23. Barrans SL, Fenton JA, Banham A, et al. Strong expression of FOXP1 identifies a distinct subset of diffuse large B-cell lymphoma (DLBCL) patients with poor outcome. Blood 2004;104:2933–5.

24. Kersten MJ, de Jong D, Raemaekers, et al. Beyond the International Prognostic Index: new prognostic factors in follicular lymphoma and diffuse large-cell lymphoma. A meeting report of the Second International Lunenburg Lymphoma Workshop. Hematol J 2004;5:202–8.

25. Woolson RF, Clarke WR. Statistical methods for the analysis of biomedical data. Hoboken, NJ: Wiley, 2002.

26. Efron B, Tibshirani RJ. An introduction to the bootstrap. London: Chapman & Hall, 1993.

27. Zu Y, Steinberg SM, Campo E, et al. Validation of tissue microarray immunohistochemistry staining and interpretation in diffuse large B-cell lymphoma. Leuk Lymph 2005;46:693–701.

28. Moskowitz CH, Zelenetz AD, Kewalramani T, et al. Cell of origin, germinal center versus nongerminal center, determined by immunohistochemistry on tissue microarray, does not correlate with outcome in patients with relapsed and refractory DLBCL. Blood 2005;106:3383–5.

29. Oschlies I, Klapper W, Zimmerman M, et al. Diffuse large B-cell lymphoma in paediatric patients predominantly belong to the germinal-center type B-cell lymphomas. Blood 2006;107:4047–52.

30. Perez EA, Suman VJ, Davidson NE, et al. HER2 testing by local, central and reference laboratories in specimens from the North Central Cancer Treatment Group N9831 intergroup adjuvant trial. J Clin Oncol 2006;24:3032–8.

31. Mounier N, Briere J, Gisselbrecht C, et al. Rituximab plus CHOP (R-CHOP) overcomes bcl-2-associated resistance to chemotherapy in elderly patients with diffuse large B-cell lymphoma (DLBCL). Blood 2003;101:4279–84.

32. Iqbal J, Neppalli VT, Wright G, et al. Bcl-2 expression is a prognostic marker for activated B-cell-like type of diffuse large B-cell lymphoma. J Clin Oncol 2006;24:961–8.