THE BOTTOM LINE

You Cannot Improve What You Do Not Measure

Nelson Chao

From the Duke University School of Medicine, Durham, North Carolina.
Financial disclosure: See Acknowledgments on page 1468.
Correspondence and reprint requests: Nelson Chao, MD, MBA, Donald D. and Elizabeth G. Cooke Professor, Chief, Division of Hematologic Malignancies and Cellular Therapy/BMT, 2400 Pratt St, Suite 9100, Box 3961, Durham, NC 27710 (e-mail: [email protected]).
Received June 28, 2012; accepted July 2, 2012
© 2012 American Society for Blood and Marrow Transplantation
1083-8791/$36.00
http://dx.doi.org/10.1016/j.bbmt.2012.07.001
Biol Blood Marrow Transplant 18:1467-1470, 2012

Graft-versus-host disease (GVHD) has been classically divided into acute and chronic disease based upon the time of onset, with chronic GVHD (cGVHD) occurring after day 100. However, this division at day 100 is artificial. There has been a shift toward defining acute GVHD (aGVHD) and cGVHD based upon clinical manifestations rather than the arbitrary cutoff of a particular date posttransplantation. A major step forward was taken in 2005, when a National Institutes of Health (NIH) consensus conference was held to refine methods for research in cGVHD, including proposed objective response measures and a provisional algorithm for calculating organ-specific and overall response. For unclear cases, the NIH consensus diagnostic criteria for GVHD include an overlap syndrome in which diagnostic or distinctive features of cGVHD and aGVHD appear together. Patients with cGVHD have skin involvement resembling lichen planus or the cutaneous manifestations of scleroderma, dry oral mucosa with ulcerations and sclerosis of the gastrointestinal tract, and a rising serum bilirubin concentration. The presenting symptoms are highly variable and in some ways similar to those found in other well-established autoimmune syndromes.

The variable presentation of cGVHD makes clinical therapeutic studies difficult, because responses may differ depending on the severity of involvement as well as the target organ involved. The difficulty is further compounded by whether one is a “lumper” or a “splitter,” that is, whether one regards all cGVHD as a single disease process with variable presentations or as possibly different diseases, as different, for example, as scleroderma is from rheumatoid arthritis. The concern here is that newer medications may work for one and not the other, increasing the risk that a good drug fails to produce a positive signal because the wrong patients were studied. So why has it been so very difficult to move the area of cGVHD therapeutics forward?

The authors of “Poor Agreement between Clinician Response Ratings and Calculated Response Measures in Patients with Chronic Graft-versus-Host Disease” [1] provide important insight into this conundrum. Inamoto et al. [2] used weighted kappa statistics to evaluate the level of agreement between clinician response ratings and calculated response categories in 290 patients with cGVHD who had paired enrollment and follow-up visits. Based on a set of objective measures, 37% of the patients had an overall complete or partial response, whereas clinicians reported an overall complete or partial response rate of 71% (slight to fair agreement, weighted kappa 0.20). Agreement between calculated organ-specific responses and clinician-reported changes in skin, mouth, and eyes was fair to moderate (weighted kappa 0.28-0.54). They conclude that for both overall and organ-specific comparisons, clinician response ratings did not agree well with calculated response categories.

Why the lack of agreement? The easiest answer is that there is no clear way to measure the extent of disease or the response. Remember that cGVHD is a clinical diagnosis. We can tell when it is black or white, but we are not very good at distinguishing shades of gray. We do not have a biomarker for cGVHD, and so we rely on our clinical acumen to make the determination. Clinical acumen, by definition, changes with practice and experience; hopefully, the more experience (probably to a certain point), the better the intraobserver correlation. We are in the business of the practice of medicine and not the assembly line of medicine, just checking boxes (one can always have hope), but it is not so simple. We as treating physicians want our patients to improve, and perhaps this also clouds our judgment, leading us to score improvements more generously than we should. On the other hand, it is important to realize that there may be limits as to what is really important. One specific area missing from this article was the grading of performance status and quality of life (being prepared for a separate publication). I would argue that it is not possible to complete the assessment of cGVHD without including a patient's performance status and quality of life. I can easily envision a physician (me in this case) declaring a complete remission in a patient who has only trace lichenoid changes, an excellent performance status, and complete resolution of extensive cGVHD, and who is off all immunosuppressive drugs. Or, for example, in a 65-year-old patient who has indeed achieved a complete remission, I may be reluctant to wean that patient off the last 5 mg of prednisone. In many ways, perhaps we need a very good partial response level for cGVHD, similar to what has been suggested for aGVHD [3].
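For readers less familiar with the weighted kappa statistic cited earlier, the following is a minimal sketch of how agreement between two sets of ordered response ratings could be computed. It assumes linear disagreement weights and a four-level response scale; the category labels, the function names, and the ten paired ratings are hypothetical, made up purely for illustration, and are not data from the study under discussion or the authors' actual method.

```python
from collections import Counter

# Hypothetical ordered response categories, worst to best (an assumption).
CATEGORIES = ["progression", "stable", "partial", "complete"]


def weighted_kappa(ratings_a, ratings_b, categories=CATEGORIES):
    """Cohen's kappa with linear disagreement weights for paired ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be paired and non-empty")
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(ratings_a)

    # Joint and marginal counts of the category indices.
    pairs = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    observed = 0.0  # weighted disagreement actually observed
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed += w * pairs[(i, j)] / n
            expected += w * (marg_a[i] / n) * (marg_b[j] / n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed / expected


def response_rate(ratings):
    """Fraction of ratings that are a partial or complete response."""
    return sum(r in ("partial", "complete") for r in ratings) / len(ratings)


# Made-up paired ratings for ten imaginary patients (illustration only).
CLINICIAN = ["partial", "partial", "complete", "stable", "partial",
             "stable", "partial", "progression", "partial", "complete"]
CALCULATED = ["stable", "partial", "partial", "stable", "progression",
              "stable", "stable", "progression", "partial", "stable"]

if __name__ == "__main__":
    print("clinician response rate: ", response_rate(CLINICIAN))
    print("calculated response rate:", response_rate(CALCULATED))
    print("weighted kappa:          ", round(weighted_kappa(CLINICIAN, CALCULATED), 2))
```

A kappa of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. The sketch also shows that two raters can report very different overall response rates on the same patients, which is the pattern the editorial describes.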

It would also have been helpful to know whether the differences in response were influenced by whether a patient was on a particular study drug or on prednisone alone. One obvious concern is physician bias in calling responses for patients on an investigational drug [4]. The authors are to be commended for calling our attention to this, as conclusions from prior literature reporting high overall response rates based on clinician judgment would not be supported if the provisional algorithm had been applied to calculate response. These data again highlight the importance of research rigor in cGVHD studies. There are other factors that make cGVHD difficult to study, not the least of which is that many patients live far from the medical center, making both evaluation and certain therapies (eg, photopheresis) difficult.

This analysis highlights the need to prospectively define an overall response measure that incorporates both patient-reported and objective measures and accurately reflects the outcome in patients. This is especially true when there is a mixed response, where one organ or site improves while another shows new involvement. These data demonstrate that we very much need validation of the NIH consensus criteria; such studies are ongoing. Our field would benefit considerably from something similar to the Rodnan score for systemic sclerosis. I would argue that we are in a situation similar to that of rheumatologists before the acceptance of the Rodnan score. Despite some concerns regarding this instrument, it has been pivotal in allowing for a uniform language, measuring the disease, and facilitating progress in the choice of therapies.

In summary, we are reminded of the curmudgeon H.L. Mencken, who stated, “For every complex problem, there is a solution that is simple, neat, and wrong.” The authors of this article have provided a very important framework for clinical trials going forward. We must measure with greater rigor and precision if we want to help our patients. We as a community must also show our commitment to our patients by participating in large, well-designed studies such as those led by the Clinical Trials Network. We need clear definitions that can be understood by all, studies with sufficient patients to allow for robust statistics, and a measure of thought regarding the biology of this disease process.

ACKNOWLEDGMENTS

Financial disclosure: The author has nothing to disclose.

REFERENCES

1. Palmer JM, Lee SJ, Chai X, et al. Poor agreement between clinician response ratings and calculated response measures in patients with chronic graft-versus-host disease. Biol Blood Marrow Transplant. 2012 [Epub ahead of print].

2. Inamoto Y, Martin PJ, Chai X, et al. Clinical benefit of response in chronic graft-versus-host disease. Biol Blood Marrow Transplant. 2012;18:1517-1524.

3. Martin PJ, Bachier CR, Klingemann HG, et al. Endpoints for clinical trials testing treatment of acute graft-versus-host disease: a joint statement. Biol Blood Marrow Transplant. 2009;15:777-784.

4. Martin PJ, Inamoto Y, Carpenter PA, Lee SJ, Flowers ME. Treatment of chronic graft-versus-host disease: past, present and future. Korean J Hematol. 2011;46:153-163.

Toward a More Rational Policy for Autologous Hematopoietic Stem Cell Mobilization

Steven M. Devine

From the Ohio State University Comprehensive Cancer Center, Columbus, Ohio.
Financial disclosure: See Acknowledgments on page 1470.
Correspondence and reprint requests: Steven M. Devine, MD, The Ohio State University Comprehensive Cancer Center, B316 Starling Loving Hall, 320 W 10th Avenue, Columbus, OH 43210 (e-mail: [email protected]).
Received August 4, 2012; accepted August 7, 2012
© 2012 American Society for Blood and Marrow Transplantation
1083-8791/$36.00
http://dx.doi.org/10.1016/j.bbmt.2012.08.001

After years of development culminating in Food and Drug Administration approval in the wake of 2 successful phase 3 studies, there is little doubt as to the biological and clinical activity of the first-in-class CXCR4 antagonist plerixafor [1,2]. The benefit of adding plerixafor to granulocyte-colony stimulating factor (G-CSF) for optimal mobilization of a sufficient CD34+ cell product was demonstrated convincingly in phase 3 studies conducted separately in patients with multiple myeloma (MM) and non-Hodgkin