

The Journal of Foot & Ankle Surgery 53 (2014) 252–253


Investigators’ Corner

Anything You Can Do I Can Do Better

Daniel C. Jupiter, PhD
Associate Professor of Surgery, Department of Surgery, Texas A&M Health Science Center, College of Medicine; and Research Scientist I, Scott and White Memorial Clinic and Hospital, Temple, TX

Article info

Keywords: equivalence testing; non-inferiority testing

Financial Disclosure: None reported.
Conflict of Interest: None reported.
Address correspondence to: Daniel C. Jupiter, PhD, 2401 South 31st Street, Temple, TX 76508. E-mail address: [email protected]
1067-2516/$ - see front matter © 2014 by the American College of Foot and Ankle Surgeons. All rights reserved.
http://dx.doi.org/10.1053/j.jfas.2013.06.003

Abstract

Newly introduced drugs or treatments may not be substantively more effective than current therapies, but these drugs or treatments may have distinct advantages in terms of lower costs or fewer or less severe side effects. Demonstrating the utility of a novel treatment is thus unlike usual hypothesis testing, in which researchers seek to prove that treatments differ (i.e., that one treatment is better than another). Instead, researchers must prove that the treatments are equivalent in effectiveness (i.e., that the treatments do not differ). I discuss here how to execute this type of study: the non-inferiority study.

© 2014 by the American College of Foot and Ankle Surgeons. All rights reserved.

When we think of clinical trials of drugs, we usually think of the comparison of a drug to a placebo. When we think of trials of novel surgical techniques or of postoperative patient management protocols, we usually think of the comparison between 2 techniques where we expect to see a noticeable difference between them. Most of our statistical techniques are designed to study such differences. In fact, the entire machinery of the p value is phrased in terms of discovering differences. Researchers set up the null hypothesis of no difference in order to reject it, thus establishing the presence of a difference.

There may be times, however, when researchers wish to compare drugs or treatments, expecting and hoping not to see a difference. As an example, consider the case in which a drug is currently on the market, but it is rather expensive or has some unpleasant side effects. When a cheaper drug or a drug with fewer side effects comes along that is effective, we would prefer to use that drug. It need not be more effective than the current drug. Indeed, we might even tolerate a minimal loss of efficacy, given that it is much cheaper or has fewer deleterious effects. Similarly, if we develop a less invasive surgical technique, we would prefer to use the less invasive technique, as long as the postoperative results are essentially the same as those obtained with the current, more invasive, technique.

This type of comparison is entirely different than what we are used to. We have no desire to prove that there are differences; rather, we want to show sameness! And here, the machinery of the p value seems to fall apart. Look at the example of the novel drug mentioned earlier and assume that it is, indeed, an effective treatment. If we carry out our usual statistical tests that look for a difference in treatment effect, we will simply fail to reject the null hypothesis. At this point, we might be tempted to claim that we have shown that there is no difference between the 2 treatments, declare victory, and start prescribing the new drug. If we did so, we would be making a logical error, such as that discussed in an earlier Investigators’ Corner (1), of misinterpreting non-rejection of the null hypothesis. In short, not seeing a difference does not prove that there is no difference.

What are we to do, then, if our arsenal of tools appears to be ill-equipped to address this type of problem? The key lies in rethinking not how we use the p value but how we set up our null hypothesis. Imagine that this drug is used to lower systolic blood pressure (SBP) from stage 1 hypertensive levels to low prehypertensive levels. The currently available drug lowers SBP to an average of 120 mm Hg, but it has side effects and is expensive. The new proposed drug has many fewer side effects and is much cheaper, but it only reduces SBP to an average of 125 mm Hg. Essentially, our new drug is as effective as the old, perhaps a little worse, but worse by an amount that doctors, patients, and insurance companies may be willing to accept, given the positive aspects of the new drug.

What if, rather than trying to show that our new drug is better than the old, which it is not, we try to show that our new drug is not worse than the old? In other words, we set up the null hypothesis as follows:

Null hypothesis: Novel drug reduces SBP to a level at least 10 mm Hg higher than the level to which the currently available drug reduces SBP.

Given this null hypothesis, the alternative hypothesis, which we prove if we reject the null hypothesis above, is this:

Alternative hypothesis: Novel drug reduces SBP to a level less than 10 mm Hg higher than the level to which the currently available drug reduces SBP.

In other words, the new drug is not that much worse than the currently available drug. We are back on solid ground, using the p value properly and not relying on a misinterpretation of the non-rejection of the null hypothesis to prove our point.
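
To make the mechanics concrete, here is a minimal sketch of the non-inferiority test just described, written in Python. The data are simulated for illustration; the sample sizes, the standard deviation of 12 mm Hg, and the 0.05 significance level are my own assumptions, not values from this article. Only the 10 mm Hg margin and the group means of roughly 120 and 125 mm Hg come from the example in the text. The trick is to shift the new-drug measurements down by the margin, which turns the non-inferiority question into an ordinary one-sided 2-sample t test.

```python
# Minimal sketch of a non-inferiority test for the hypothetical SBP example.
# Sample size, spread, and alpha are illustrative assumptions; only the
# 10 mm Hg margin and the 120 vs. 125 mm Hg means mirror the text.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
current = rng.normal(loc=120, scale=12, size=100)  # SBP on the current drug
new = rng.normal(loc=125, scale=12, size=100)      # SBP on the new drug
margin = 10  # largest tolerable excess in SBP (mm Hg) for the new drug

# Null hypothesis:        mean(new) - mean(current) >= margin  (new drug is inferior)
# Alternative hypothesis: mean(new) - mean(current) <  margin  (new drug is non-inferior)
# Shifting the new-drug sample down by the margin reduces this to a
# one-sided two-sample t test.
t_stat, p_value = stats.ttest_ind(new - margin, current, alternative="less")

print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null: the new drug is non-inferior within the 10 mm Hg margin.")
else:
    print("Fail to reject: non-inferiority has not been demonstrated.")
```

Note that rejecting this null is the positive finding about the new drug; failing to reject simply means non-inferiority has not been demonstrated, exactly as with any other hypothesis test.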

I have just outlined the logic of the non-inferiority test, which is the test to show that 1 thing is not substantively worse than another. I could just as easily consider non-superiority testing or, by combining these 2, consider equivalence testing. Equivalence testing shows that the 2 treatments are not substantively different. In the setting of the hypothetical drugs discussed earlier, this means that the new drug reduces SBP to a level no more than 10 mm Hg higher than the current drug and that the current drug reduces SBP to a level no more than 10 mm Hg higher than the new drug. In short, the difference between the drugs is no more than 10 mm Hg.
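
One common way to carry out such an equivalence test is the two one-sided tests (TOST) procedure, sketched below with the same simulated data and assumptions as in the previous sketch; the procedure itself is standard, but it is not taken from this article.

```python
# Minimal sketch of equivalence testing via two one-sided tests (TOST).
# The simulated data and the 10 mm Hg margin match the earlier sketch.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
current = rng.normal(loc=120, scale=12, size=100)  # SBP on the current drug
new = rng.normal(loc=125, scale=12, size=100)      # SBP on the new drug
margin = 10  # equivalence margin in mm Hg

# Test 1: the new drug is not worse than the current drug by more than the margin.
_, p_upper = stats.ttest_ind(new - margin, current, alternative="less")
# Test 2: the current drug is not worse than the new drug by more than the margin.
_, p_lower = stats.ttest_ind(new + margin, current, alternative="greater")

# Equivalence is claimed only if both one-sided tests reject their nulls,
# so the overall p value is the larger of the two.
p_tost = max(p_upper, p_lower)
print(f"TOST p = {p_tost:.4f}")
print("Equivalent within the margin" if p_tost < 0.05 else "Equivalence not demonstrated")
```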

I conclude with 2 remarks. My first remark is about the effect difference that we should look for in designing this type of study. Clearly, if I had hypothesized that my new drug reduced SBP levels to within 50 mm Hg of those of the current drug, clinicians would have been less than impressed. That difference is far too large. In general, in looking for non-inferiority or equivalence, the allowable difference between treatments should be smaller than a clinically significant difference. In this way we ensure that, although we are not achieving exact equivalence, we are close to doing so, at least from a patient perspective. My second remark is simply that now that we understand the mechanics of non-inferiority and equivalence testing, there is little excuse not to use them in our own study designs. With this and a previous Investigators’ Corner (1) in hand, we should never again be fooled by “proofs” using non-rejection of the null.
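
As a practical footnote to the first remark, the chosen margin also drives the size of a non-inferiority study: the closer the expected difference is to the margin, the more patients are needed. The sketch below shows one common way to size such a study; the expected difference, standard deviation, power, and significance level are my assumptions, with only the 10 mm Hg margin taken from the example above.

```python
# Minimal sketch of sample-size planning for a non-inferiority study.
# Expected difference, standard deviation, power, and alpha are
# illustrative assumptions; the 10 mm Hg margin comes from the example.
from statsmodels.stats.power import TTestIndPower

margin = 10.0        # non-inferiority margin (mm Hg)
expected_diff = 5.0  # expected excess SBP on the new drug (mm Hg)
sd = 12.0            # assumed common standard deviation (mm Hg)

# Under the alternative, the one-sided test must detect a standardized
# effect of (margin - expected difference) / sd.
effect_size = (margin - expected_diff) / sd
n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, ratio=1.0, alternative="larger"
)
print(f"Approximately {n_per_group:.0f} patients per group")
```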

Reference

1. Jupiter D. Turning a negative into a positive. J Foot Ankle Surg 52(4):556–557, 2013.