Counting Processes and Survival Analysis

Page 1: [Wiley Series in Probability and Statistics] Counting Processes and Survival Analysis (Fleming/Counting) || Front Matter

THOMAS R. FLEMING

DAVID P. HARRINGTON

WILEY-INTERSCIENCE

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 1991, 2005 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008 or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication is available.

ISBN-13 978-0-471-76988-0 ISBN-10 0-471-76988-6

Printed in the United States of America.

10 9 8 7 6 5 4 3 2 1

To Joli and Anne

Preface

In recent years, there has been considerable activity in the applied and theoretical statistical literature in methods for analyzing data on events observed over time, and on the study of factors associated with the occurrence rates for those events. This literature is now a rich and important subset of applied statistical work. These useful statistical methods can be cast within a unifying counting process framework, providing an elegant martingale based approach to understanding their properties.

This text explores the martingale approach to the statistical analysis of counting processes, with an emphasis on the application of those methods to censored failure time data. This approach was introduced in the 1970s, and has proved remarkably successful in yielding results about statistical methods for many problems arising in censored data. In 1978 Odd Aalen introduced the multiplicative intensity model for counting processes, and his ideas have led to important advances in statistical methods for counting process data and for the censored data found in many biomedical studies. Martingale methods can be used to obtain simple expressions for the moments of complicated statistics, to calculate and verify asymptotic distributions for test statistics and estimators, to examine the operating characteristics of nonparametric testing methods and semiparametric censored data regression methods, and even to provide a basis for graphical diagnostics in model building with counting process data.
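As a brief illustration of the notation involved, the multiplicative intensity model can be sketched as follows. The symbols N, A and M match the chapter title "The Martingale M = N - A" in the contents; the symbols \alpha, Y and \lambda, and the specific form shown, are assumptions taken from the usual textbook statement of Aalen's model rather than from this preface.

```latex
% N(t): number of events observed up to time t (the counting process)
% Y(t): number of subjects observed to be at risk just before time t
% \alpha(t): an unknown deterministic hazard function
\[
  \lambda(t) = \alpha(t)\, Y(t), \qquad
  A(t) = \int_0^t \alpha(s)\, Y(s)\, ds, \qquad
  M(t) = N(t) - A(t).
\]
% Under the model (and suitable integrability conditions) M is a martingale
% with respect to the observed history, so that
\[
  E\, M(t) = 0 \quad \text{for all } t \ge 0 .
\]
```

Read this way, the "simple expressions for the moments" mentioned above come from the mean-zero property of M and of integrals of predictable processes with respect to M.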

It has been our experience that the interaction between applied problems in censored failure time data and the more theoretical perspective of martingale theory has provided important results for both statistical theory and practice, and that students and researchers benefit from an understanding of both the martingale methods used in counting processes, and the results that become available. Since martingale methods for counting processes are powerful, it is not surprising that it takes some time to establish the background material needed for their use. Until recently, most of the results from the calculus of martingales used in the study of counting process methods have appeared in only the probability research literature, and students of statistics often find this literature a formidable challenge. The recent texts by Karr (1986) and Bremaud (1981) have improved the situation, but those books, along with the more theoretical treatments of Liptser and Shiryayev (1977, 1978) and Elliott (1982), are oriented to engineering applications.

We have tried to give a thorough treatment of both the calculus of martingales needed for the study of counting processes and the most important applications of these methods to censored data. We explore both classical problems in asymptotic distribution theory for counting process methods and some newer methods for graphical analyses and diagnostics of censored data. There are already some excellent accounts of the older likelihood approach to failure time data, especially the books by Kalbfleisch and Prentice (1980) and Cox and Oakes (1984), and we do not try to duplicate those efforts here. Gill's research monograph (1980) and Jacobsen's lecture notes (1982) illustrate martingale methods for counting processes, but we have tried to make the theoretical development here more nearly self-contained, and we have given much more emphasis to the regression analysis of censored data.

It is our expectation that students with one year of graduate study in statistics will find the presentation to be at an acceptable level. The prerequisite for this book is a familiarity with a measure theoretic treatment of probability that may be found, for instance, in Chung (1974, Chapters 1-4 and 9). The development of the theory of counting processes, stochastic integrals and martingales is provided, but only to the extent required for applications in survival analysis. In technical parts of the book, such as in Chapter 2, a summary of main results is provided for those who wish to skip the detailed development and proceed directly to applications in later chapters.

Chapter 0 provides motivation for the types of inference and estimation procedures commonly encountered in the analysis of censored failure time data. Appendix A briefly reviews some measure theory concepts, and Chapters 1 and 2 introduce the martingale and counting process framework and indicate how the data analysis methods of Chapter 0 can be reformulated in counting process notation. Chapter 3 considers the small sample moments and large sample consistency of standard test statistics and estimators. Chapter 4 presents censored data regression models and corresponding likelihood methods for inference and estimation, and illustrates the application of these methods. The chapter also introduces regression diagnostics and illustrates the use of martingale-based residuals. Convergence in distribution for stochastic processes is introduced in Appendix B. Chapter 5 then discusses the martingale central limit theorems used to derive large sample distribution results for many of the survival analysis methods. Large sample properties of the Kaplan-Meier estimator are presented in Chapter 6, while Chapter 7 considers the large sample null distribution, consistency and efficiency of a class of linear rank statistics. The large sample distribution properties for the regression methods in Chapter 4 are established in Chapter 8.

Data sets are provided in Appendix D and a collection of exercises appears in Appendix E. Exercises have been selected both to provide the reader with practice in the application of martingale methods and to give insight into the martingale calculus itself. Some of the exercises guide the reader through details omitted from the text, and some indicate extensions of results.

We appreciate the help provided by so many people during the development of this book. Elaine Nasco showed tireless dedication while typing the many preliminary drafts with uncommon speed and accuracy. Margaret Sullivan Pepe provided many scientific insights and contributions during the early development of this material, especially that in Chapter 2. Michael Parzen prepared the figures, and his graphical expertise and attention to detail are responsible for their high quality.

We thank all those who generously provided careful review and helpful comments on drafts of this book, including the students and faculty at the University of Washington and Harvard University, Robert Smythe and his students at George Washington University, Robert Wolfe and his students at the University of Michigan, the Wiley reviewers, as well as Peter Sasieni, Jon Wellner, Myrto Lefkopoulou, Luc Watelet, and Janet Andersen. We are indebted to Scott Emerson, who provided considerable help, especially through his development of special routines in the Unix statistical language S for survival analysis methods discussed in Chapters 4 and 6; to Jennifer Thomas and Karen Abbett, who provided support in typing parts of the manuscript; to Herman Callaert, who provided a setting for extended writing during a sabbatical (DPH) in Belgium; and to E. Rolland Dickson at Mayo Clinic, and Howard Jaffe and Alan Izu at Genentech, who provided permission to publish the PBC and CGD data sets.

Portions of the manuscript were supported by grants from the National Cancer Institute, and we thank Marthana Hjortland for her administrative support at the NCI.

Our deepest gratitude is reserved for two loving families, whose support and gentle encouragement prevented this project from stopping short of its goal many times. Joli (for TRF) and Anne (for DPH) have given the world a new definition of patience, and have taught us much about the value of balance during a demanding time.

THOMAS R. FLEMING

DAVID P. HARRINGTON

Seattle, Washington
Boston, Massachusetts
September 1990

Contents

Preface

0. The Applied Setting
   0.1 Introduction
   0.2 A Data Set and Some Examples

1. The Counting Process and Martingale Framework
   1.1 Introduction
   1.2 Stochastic Processes and Stochastic Integrals
   1.3 The Martingale M = N - A
   1.4 The Doob-Meyer Decomposition: Applications to Quadratic Variation
   1.5 The Martingale Transform ∫ H dM
   1.6 Bibliographic Notes

2. Local Square Integrable Martingales
   2.1 Introduction
   2.2 Localization of Stochastic Processes and the Doob-Meyer Decomposition
   2.3 The Martingale N - A Revisited
   2.4 Stochastic Integrals with Respect to Local Martingales
   2.5 Continuous Compensators
   2.6 Compensators with Discontinuities
   2.7 Summary
   2.8 Bibliographic Notes

3. Finite Sample Moments and Large Sample Consistency of Tests and Estimators
   3.1 Introduction
   3.2 Nonparametric Estimation of the Survival Distribution
   3.3 Some Finite Sample Properties of Linear Rank Statistics
   3.4 Consistency of the Kaplan-Meier Estimator
   3.5 Bibliographic Notes

4. Censored Data Regression Models and Their Application
   4.1 Introduction
   4.2 The Proportional Hazards and Multiplicative Intensity Models
   4.3 Partial Likelihood Inference
   4.4 Applications of Partial Likelihood Methods
   4.5 Martingale Residuals
   4.6 Applications of Residual Methods
   4.7 Bibliographic Notes

5. Martingale Central Limit Theorem
   5.1 Preliminaries and Motivation
   5.2 Convergence of Martingale Difference Arrays
   5.3 Weak Convergence of the Process U^(n)
   5.4 Bibliographic Notes

6. Large Sample Results of the Kaplan-Meier Estimator
   6.1 Introduction
   6.2 A Large Sample Result for Kaplan-Meier and Weighted Logrank Statistics
   6.3 Confidence Bands for the Survival Distribution
   6.4 Bibliographic Notes

7. Weighted Logrank Statistics
   7.1 Introduction
   7.2 Large Sample Null Distribution
   7.3 Consistency of Tests of the Class K
   7.4 Efficiencies of Tests of the Class K
   7.5 Some Versatile Test Procedures
   7.6 Bibliographic Notes

8. Distribution Theory for Proportional Hazards Regression
   8.1 Introduction
   8.2 The Partial Likelihood Score Statistic
   8.3 Estimators of the Regression Parameters and the Cumulative Hazard Function
   8.4 The Asymptotic Theory for Simple Models
   8.5 Asymptotic Relative Efficiency of Partial Likelihood Inference in the Proportional Hazards Model
   8.6 Bibliographic Notes

Appendix A. Some Results from Stieltjes Integration and Probability Theory

Appendix B. An Introduction to Weak Convergence

Appendix C. The Martingale Central Limit Theorem: Some Preliminaries

Appendix D. Data

Appendix E. Exercises

Bibliography

Notation

Author Index

Subject Index