Predicting the Risk of Traumatic Lumbar Punctures in Children with Acute Lymphoblastic Leukemia:

A Retrospective Cohort Study using Repeated-Measures Analyses

by

Furqan Shaikh

A thesis submitted in conformity with the requirements for the degree of Master of Science, Clinical Epidemiology and Health Care Research

Institute of Health Policy, Management & Evaluation University of Toronto

© Copyright by Furqan Shaikh (2012)

Predicting the Risk of Traumatic Lumbar Punctures in Children with Acute Lymphoblastic Leukemia:

A Retrospective Cohort Study using Repeated-Measures Analyses

Furqan Shaikh

Master of Science

Institute of Health Policy, Management and Evaluation University of Toronto

2012

ABSTRACT

Traumatic lumbar punctures (TLPs) in children with acute lymphoblastic leukemia are

associated with a poorer prognosis. The objective of this study was to determine risk factors

for TLPs using a retrospective cohort. We compared and contrasted three different regression

methods for the analysis of repeated-measures data. In the multivariable model using

generalized estimating equations, variables significantly associated with TLPs were age < 1

year or ≥ 10 years; body mass index percentile ≥ 95; platelet counts < 100 × 10³/µL; fewer

days since previous LP; and a preceding TLP. The same variables, with similar estimates and

confidence-intervals, were identified by the random-effects model. In a fixed-effects model

where each patient was used as their own control, days since prior LP and the effect of using

image-guidance were significant. Random-effects and GEE lead to similar conclusions,

whereas fixed-effects discards between-subject comparisons and leads to different estimates

and interpretation of results.

ACKNOWLEDGMENTS

This work would not have been possible without the support of my thesis supervisor, Dr. Lillian

Sung, who has been the greatest mentor that anyone could have asked for. She has been instantly

available and constantly helpful. She has been a role model of the scientific process through her own

superb work ethic and intellectual rigor. It has been an indescribable privilege to have been a graduate

student under her tutelage.

I am extremely grateful to my thesis committee members. It was through casual conversations

with Dr. Sarah Alexander at a medical conference that this project was born, and she has inspired and

helped me every step of the way. She has kept me on track and held me to high standards, in both my

research and clinical work. Dr. Teresa To and Dr. Andrea Doria have provided continual insights,

perspective, and encouragement. Their invaluable contributions have been crucial to the development and

completion of this thesis.

Thank you to the Pediatric Oncology Group of Ontario (POGO). Their financial and moral

support provides young investigators like myself with the wonderful opportunity to pursue dedicated

training in health research methods, allowing us to become useful citizens in the community of childhood

cancer researchers. None of this could have happened without that opportunity.

I’d like to thank my appraisers, Dr. Kuckarczyk and Dr. Ringash, and my defense chair, Dr.

Laupacis, for their generosity with their time and ideas. I’d also like to thank the many wonderful staff at

the Institute for Health Policy, Management and Evaluation for organizing and delivering such an

enlightening and valuable degree program.

Finally, a special thank you to my parents and to my wife Safia for their limitless patience,

understanding and love. Safia has no doubt at times wondered whether MSc stands for Monster Spouse.

We look forward to the post-thesis era and to a well-deserved vacation!

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF APPENDICES
LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION
  1.1 Traumatic Lumbar Punctures in Children with Acute Lymphoblastic Leukemia
    1.1.1 The Concept of CNS-Directed Therapy
    1.1.2 Risk Factors for CNS Relapse
    1.1.3 The Lumbar Puncture in Pediatric Oncology
    1.1.4 Traumatic Lumbar Punctures
    1.1.5 Consequences of Traumatic Lumbar Punctures
    1.1.6 Risk Factors for Traumatic Lumbar Punctures
    1.1.7 Limitations of the Previous Literature
  1.2 The Statistical Analysis of Repeated Measures Data
    1.2.1 Clustered and Longitudinal Data
    1.2.2 Advantages and Types of Longitudinal Studies
    1.2.3 Alternatives to Longitudinal Data Analyses
    1.2.4 Notation
    1.2.5 Random Effects Methods
    1.2.6 Fixed Effects Methods
    1.2.7 Hybrid Random-Fixed Effects Method
    1.2.8 Marginal Models (Generalized Estimating Equations)
    1.2.9 Strengths and Limitations of Each Method
  1.3 Study Objectives
    1.3.1 Primary objective
    1.3.2 Secondary objectives
  1.4 Study rationale

CHAPTER 2: METHODS
  2.1 Study Design
  2.2 Study Population
    2.2.1 Inclusion Criteria
    2.2.2 Exclusion Criteria
    2.2.3 Study Timeline
  2.3 Variables
    2.3.1 Primary Outcome Variable
    2.3.2 Secondary Outcome Variable
    2.3.3 Predictor Variables
  2.4 Data Sources and Measurement
  2.5 Sample Size
  2.6 Statistical Analyses
    2.6.1 Descriptive Statistics
    2.6.2 Model Building
    2.6.3 Conventional Logistic Regression
    2.6.4 Generalized Estimating Equations
    2.6.5 Random-Effects (Generalized Linear Mixed Models)
    2.6.6 Fixed-Effects (Conditional Logistic Regression)
    2.6.7 A Hybrid Method
    2.6.8 Kaplan-Meier Survival Curves

CHAPTER 3: RESULTS
  3.1 Descriptive Statistics
    3.1.1 Participants and Procedures
    3.1.2 Characteristics
    3.1.3 Primary Outcome: Traumatic Lumbar Punctures
  3.2 Inferential Statistics
    3.2.1 Collinearity and Correlations
    3.2.2 Model Building
    3.2.3 Secondary Outcome: Event-Free Survival

CHAPTER 4: DISCUSSION
  4.1 Summary of Main Findings
  4.2 Factors Predictors of TLP
  4.3 Pertinent Negative Findings
  4.4 Survival Analysis
  4.5 Comparison of Repeated-Measures Analysis Methods
  4.6 Study Limitations
  4.7 Study Strengths
  4.8 Future Research

REFERENCES

APPENDICES

LIST OF TABLES

Table 1. Incidence of CNS status and effect on event-free survival

Table 2. Comparison of repeated-measures analysis methods

Table 3. List of predictor variables and rationale

Table 4. Patient characteristics

Table 5. Pearson correlation coefficients

Table 6. Conventional logistic regression

Table 7. Generalized estimating equations

Table 8. Random effects with generalized linear mixed models

Table 9. Fixed effects with conditional logistic regression

Table 10. The hybrid method

Table 11. Comparison of multivariable models

LIST OF FIGURES

Figure 1. Timeline of study inclusion and follow-up

Figure 2. Flow of patients and lumbar punctures through the study

Figure 3. CNS status and proportion of patients with traumatic lumbar punctures at first LP

Figure 4. Kaplan-Meier survival curves by CNS status

LIST OF APPENDICES

Appendix 1. Approach to Classifying TLP+ Status Varies Between SJCRH and COG Systems

Appendix 2. ALL risk stratification

LIST OF ABBREVIATIONS

CI      Confidence interval
ALL     Acute lymphoblastic leukemia
AML     Acute myeloid leukemia
BLP     Bloody lumbar puncture
CDC     Center for Disease Control
CNS     Central nervous system
COG     Children’s Oncology Group
CSF     Cerebrospinal fluid
EFS     Event-free survival
ER      Emergency room
FE      Fixed effects
FLP     Failed lumbar puncture
GEE     Generalized estimating equations
HR      Hazard ratio
IGT     Image-guided therapy
IQR     Interquartile range
KM      Kaplan-Meier
LMWH    Low molecular weight heparin
N/A     Not available
OR      Odds ratio
OS      Overall survival
RBC     Red blood cell
RE      Random effects
Ref     Reference
SD      Standard deviation
SE      Standard error
SJCRH   St. Jude Children’s Research Hospital
TLP     Traumatic lumbar puncture
TLP+    Traumatic lumbar puncture with blast cells
TLP-    Traumatic lumbar puncture without blast cells
WBC     White blood cell
χ²      Chi-square
µL      microliter

CHAPTER 1: INTRODUCTION

1.1 Traumatic Lumbar Punctures in Children with Acute Lymphoblastic

Leukemia

1.1.1 The Concept of CNS-Directed Therapy

Acute lymphoblastic leukemia (ALL) is a cancer of the white blood cells. It is the most

common childhood malignancy, accounting for approximately 25% of all pediatric cancers. At

an incidence of 4 per 100,000 child-years, there are an estimated 250 new cases of childhood

ALL every year in Canada.1

Prior to the 1970s, approximately 50-80% of children with ALL who achieved remission

subsequently experienced a relapse of leukemia within the central nervous system (CNS).2 This

led to the recognition of the CNS as a sanctuary site for leukemia cells. A major advance in the

treatment of childhood leukemia was the addition of pre-symptomatic CNS-directed therapy for

all patients with ALL.3 CNS-directed therapy could include intrathecal chemotherapy injected

directly into the cerebrospinal fluid (CSF) via a lumbar puncture (LP)4,5 or rarely via an Ommaya

reservoir;6 systemic oral or intravenous chemotherapy that crosses the blood-brain barrier;7

and/or cranial radiation.8 The development of such approaches has dramatically improved the

prognosis of children with ALL, reducing CNS relapses to less than 6% of patients.9 However,

more intensive CNS-directed therapy, in particular the use of cranial radiation, is associated with

more long-term sequelae of therapy in survivors, including neuro-cognitive defects,

endocrinopathies, and secondary malignancies.10-12

1.1.2 Risk Factors for CNS Relapse

Several factors for predicting an increased risk of CNS relapse have been identified. The

most important factor is the presence of leukemia blast cells in the CSF at diagnosis. In some

regimens, the risk was greater if the presence of blast cells was accompanied by a white blood

cell (WBC) count over 5 cells/µL.13 Therefore, a trichotomous risk-classification for “CNS

status” was proposed and remains in current use.14 CNS1 denotes the absence of any leukemia

blast cells in the CSF; CNS2 denotes the presence of blast cells in a sample that contains less

than 5 WBCs/µL; and CNS3 denotes the presence of blast cells in a sample that contains ≥5

WBCs/µL. The latter category, considered to be overt CNS leukemia, is present in 2-5% of

children with newly diagnosed ALL.2
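The three-level classification described above can be sketched as a small decision rule. This is only an illustration of the definitions in the text: the function name and arguments are hypothetical, and the sketch assumes a non-traumatic CSF sample (traumatic samples are classified separately, as discussed later in this chapter).

```python
def classify_cns_status(blasts_present: bool, wbc_per_ul: float) -> str:
    """Return CNS1, CNS2, or CNS3 for a non-traumatic CSF sample.

    Hypothetical helper that follows the definitions in the text:
      CNS1: no leukemia blast cells in the CSF
      CNS2: blast cells present and fewer than 5 WBCs/µL
      CNS3: blast cells present and 5 or more WBCs/µL
    """
    if wbc_per_ul < 0:
        raise ValueError("WBC count cannot be negative")
    if not blasts_present:
        return "CNS1"
    if wbc_per_ul < 5:
        return "CNS2"
    return "CNS3"


# Example: blasts seen in a sample with 3 WBCs/µL
print(classify_cns_status(True, 3))
```

Note that the boundary case of exactly 5 WBCs/µL with blasts falls into CNS3 under this rule, matching the "≥5" wording above.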

Risk-stratification allows the treatment intensity to be tailored to the individual. For

example, many regimens recommend the addition of cranial radiation for patients with CNS3

status.7,15 Therefore, accurate diagnosis of CNS status is essential. Misclassification of CNS

status could lead to a child receiving under-treatment and a consequent increased risk of relapse,

or receiving over-treatment and a greater risk of long-term sequelae.

1.1.3 The Lumbar Puncture in Pediatric Oncology

The determination of CNS status is achieved by means of the first (“diagnostic”) lumbar

puncture (LP), prior to initiation of chemotherapy. CSF is collected and microscopically

examined for blast cells.16 The correctly-performed LP is therefore a vital procedure for CNS

staging. At the time of the first LP and with each of multiple subsequent LPs, chemotherapy is

delivered directly into the intrathecal space.4,17 On contemporary treatment protocols, children

receive between 14 and 30 LPs over the entire course of ALL treatment, the exact number being

determined by the sum of risk factors and the type of additional CNS-directed therapies.14

Therefore, the LP is the most commonly performed procedure for all pediatric oncologists. As

the focus of research shifts to trying to avoid cranial radiation for as many children as

possible,15,18-20 the importance of intrathecal chemotherapy further increases.

1.1.4 Traumatic Lumbar Punctures

Cerebrospinal fluid is normally a clear and colorless liquid containing no red blood cells

(RBCs). A traumatic lumbar puncture (TLP) occurs when RBCs from neighboring blood vessels

leak into the CSF. In most cases, the RBCs are thought to have entered the CSF as a result of

needle laceration of the vessels during the performance of the LP.

The definition of a TLP varies across the medical literature. In the general pediatrics and

emergency medicine literature, a TLP is generally defined as the presence of ≥ 400 RBCs/µL of

CSF on microscope examination.21 This is approximately the cell count at which clear CSF

begins to develop a red or pink color, but does not yet appear grossly bloody. In the pediatric

oncology literature, however, a TLP is defined at a much stricter threshold of ≥ 10 RBCs/µL, for

reasons that will be examined below. At this level, no redness is visible except on microscopy,

and the CSF appears clear.
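To make the difference between the two definitions concrete, the following minimal sketch applies both thresholds to the same RBC count. The constant and function names are illustrative assumptions, not terms used in the literature cited here.

```python
# Thresholds taken from the definitions in the text (RBCs per µL of CSF).
ONCOLOGY_TLP_THRESHOLD = 10    # pediatric oncology literature
GENERAL_TLP_THRESHOLD = 400    # general pediatrics / emergency medicine


def is_traumatic(rbc_per_ul: float, threshold: float = ONCOLOGY_TLP_THRESHOLD) -> bool:
    """Return True if the RBC count meets or exceeds the chosen TLP threshold."""
    return rbc_per_ul >= threshold


# A sample with 50 RBCs/µL looks clear to the eye, yet it counts as a TLP
# under the oncology definition and not under the general definition.
sample = 50
print(is_traumatic(sample), is_traumatic(sample, GENERAL_TLP_THRESHOLD))
```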

1.1.5 Consequences of Traumatic Lumbar Punctures

There are several negative consequences of a TLP. First, the presence of RBCs in the

CSF can obscure the diagnostic information that was sought from the procedure.22 If the first

diagnostic LP for a patient with ALL is a TLP with blasts (TLP+), it is unclear whether the blasts

were present in the CNS to begin with, or whether they were introduced from the bloodstream by

the needle laceration. Different pediatric oncology groups approach this uncertainty in different

ways, which further adds to the complexity of interpreting a TLP+. At the St. Jude Children’s

Research Hospital (SJCRH), a TLP+ is considered to be its own CNS group without further

sub-classification.22 Theoretically, in such a system, children with actual CNS1 status or actual CNS3

status could both be classified as a TLP+, thus potentially leading to either over- or

under-treatment. In contrast, other treatment groups such as the Children’s Oncology Group (COG)

utilize a ratio-based mathematical calculation that attempts to determine whether the white blood

cells and blasts are likely to have been present in the CNS or to have been introduced from the

bloodstream (see Appendix 1).14 However, to our knowledge, the recommended formula has not

been derived from published empirical research and has never been evaluated for its diagnostic

properties or ability to correctly predict outcome. Therefore, it is unknown whether the formula

is able to consistently and accurately classify CNS status that occurs after a TLP+.

Second, and most important, multiple studies have shown that the presence of a TLP+ is

a risk factor for leukemia relapse.22-24 These studies observed an absolute decrease of 7 to 17 percentage points in event-free

survival (EFS) in children who had a TLP+ compared to those who had CNS1 status. A TLP+ in

each of these studies was defined as ≥ 10 RBCs/µL in the presence of leukemia blasts,

suggesting that even microscopic trauma during the first LP carried the associated risk. The

results of these studies are summarized in Table 1.
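The range quoted above can be checked directly against the EFS point estimates reported in Table 1. The sketch below simply subtracts the TLP+ estimate from the CNS1 estimate for each study; the dictionary layout and function name are illustrative assumptions. Note that the first two studies report 5-year EFS and the third reports 10-year EFS, so the differences are not strictly comparable across studies.

```python
# EFS point estimates (%) for CNS1 and TLP+ patients, as listed in Table 1.
efs_by_study = {
    "Gajjar et al. (2000), 5y-EFS": {"CNS1": 77, "TLP+": 60},
    "Burger et al. (2003), 5y-EFS": {"CNS1": 80, "TLP+": 73},
    "te Loo et al. (2006), 10y-EFS": {"CNS1": 73, "TLP+": 58},
}


def efs_differences(table):
    """Absolute difference in EFS (percentage points), CNS1 minus TLP+."""
    return {study: row["CNS1"] - row["TLP+"] for study, row in table.items()}


differences = efs_differences(efs_by_study)
for study, diff in differences.items():
    print(f"{study}: {diff} percentage points lower with TLP+")
```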

It is worth noting that among all the risk factors for relapse in childhood ALL, a TLP+ is

the only one that may have an iatrogenic component. Apart from treatment itself, it is the only

risk factor that has the potential to be modifiable. All other known risk factors are properties of

the patient or the disease biology (see Appendix 2). They are thus generally fixed

upon the patient’s presentation to the health care system. The iatrogenic nature of this risk factor

increases the burden upon the operating physician to ensure that all reasonable measures to avoid

a TLP are undertaken.

Third, according to current treatment protocols, children with TLP+ receive additional

therapy compared to TLP- or CNS1 patients.14,20 At a minimum, a child with TLP+ will receive

two additional doses of intrathecal chemotherapy compared to one who is CNS1. Patients with

TLP+ may also receive increased intensity of systemic chemotherapy.

Fourth, although it is the first LP that influences outcomes directly, the subsequent LPs

that a child receives are still important for the proper instillation of intrathecal chemotherapy and

for screening CSF for signs of relapse. A study of radionuclide imaging of the CNS observed

that in 11% of intrathecal injections, a radioisotope was inadvertently placed outside the

subarachnoid space.25 Furthermore, TLPs often occur as a sign of difficult LPs or multiple

attempts. The latter are associated with more pain and discomfort after the procedure.26-28

Lastly, when a child with ALL is observed to have repeatedly difficult, bloody, or

unsuccessful LPs, the usual course of action at many institutions is to refer him or her to

interventional radiology, where LPs are performed under fluoroscopic guidance. While

fluoroscopy does allow for the acquisition of otherwise difficult LPs,29,30 this recourse has some

obvious disadvantages. Fluoroscopy is not available at many centers. When available, its use

leads to scheduling delays, increased cost, more time under anesthesia, and decreased availability

of the equipment for other procedures. Moreover, it leads to radiation exposure which can have

long term consequences with cumulative exposure.31,32 A lateral abdominal fluoroscope provides

an estimated dose of 0.26 to 1.1 mSv of radiation per minute, varying by age and magnification

level. By comparison, a standard 2-view chest X-ray provides 0.08 mSv of radiation.33
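As a rough way to read these figures side by side, the per-minute fluoroscopy dose can be expressed in chest X-ray equivalents. This is simple arithmetic on the two numbers quoted above; the names are illustrative, and no assumption is made here about how many minutes of fluoroscopy an LP actually requires.

```python
# Dose figures quoted in the text.
FLUORO_DOSE_RANGE_MSV_PER_MIN = (0.26, 1.1)  # lateral abdominal fluoroscopy
CHEST_XRAY_DOSE_MSV = 0.08                   # standard 2-view chest X-ray


def chest_xray_equivalents_per_minute(dose_msv_per_min: float) -> float:
    """Number of 2-view chest X-rays matching one minute of fluoroscopy."""
    return dose_msv_per_min / CHEST_XRAY_DOSE_MSV


low, high = (chest_xray_equivalents_per_minute(d) for d in FLUORO_DOSE_RANGE_MSV_PER_MIN)
print(f"One minute of fluoroscopy is about {low:.0f} to {high:.0f} chest X-rays")
```

On these figures, each minute of fluoroscopy corresponds to roughly 3 to 14 chest X-rays, depending on age and magnification level.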

TABLE 1. Incidence of CNS status and Effect on Event-Free Survival

Study (Year)                    | Measure       | CNS1      | CNS2     | CNS3     | TLP-     | TLP+     | Log-rank P Value* | HR (95% CI)
Gajjar et al.22 (2000), N=546   | Number (%)    | 336 (62)  | 80 (15)  | 16 (3)   | 54 (10)  | 60 (11)  |                   |
                                | 5y-EFS (±SE)  | 77 (±2)   | 55 (±6)  | 38 (±11) | 76 (±6)  | 60 (±6)  | 0.026             | N/A
Burger et al.23 (2003), N=2021  | Number (%)    | 1605 (79) | 103 (5)  | 58 (3)   | 111 (6)  | 135 (7)  |                   |
                                | 5y-EFS (±SE)  | 80 (±1)   | 80 (±4)  | 50 (±8)  | 83 (±4)  | 73 (±4)  | 0.003             | 1.5 (1.02-2.2)
te Loo et al.24 (2006), N=526   | Number (%)    | 304 (58)  | 111 (21) | 10 (2)   | 62 (12)  | 39 (7)   |                   |
                                | 10y-EFS (±SE) | 73 (±3)   | 70 (±5)  | 67 (±19) | 82 (±5)  | 58 (±8)  | <0.01             | 3.5 (1.4-8.8)

*P-values refer to log-rank test comparing EFS of TLP+ to CNS1 status, and hazard ratios refer

to comparison of TLP+ to CNS1 in Cox proportional hazards multivariable analyses.

For abbreviations used in all tables, refer to the List of Abbreviations.

1.1.6 Risk Factors for Traumatic Lumbar Punctures

There is only one previous study that investigated risk factors for TLP specifically in

children with ALL.34 Howard et al. assessed all children diagnosed with ALL at SJCRH between

1984 and 1998. They examined the first LP and a median of four subsequent LPs for each child.

To adjust for dependence within repeated measures from the same patient, the study utilized

generalized estimating equations (GEE). In total, the dataset included 956 children undergoing

5609 LPs. The estimated ORs (95% CI) for the identified risk factors included 2.3 (1.7-3.0)

for age younger than 1 year vs 1 year or older; 1.5 (1.2-1.8) for black vs white race; 1.5 (1.2-1.8)

for platelet count less than vs more than 100 × 10³/µL; 1.4 (1.1-1.8) for the least vs the most

experienced operator; 10.8 (7.7-15.2) for short (1 day) vs longer (>15 days) interval since the

previous LP; 1.6 (1.4-1.9) for a previous TLP; and 1.4 (1.2-1.7) for early vs recent era when

sedation was routinely used.

Several other studies have also evaluated risk factors for TLP in contexts other than

childhood ALL, often using a different definition of TLP.35-40 Shah et al defined a TLP as >400

RBC/µL and found a rate of 16% in adults visiting the ER, with a significant risk factor being the

inability to visualize spine landmarks.40 Using the same definitions, Glatstein et al found a TLP

rate of 24% for children, with a risk factor being the need for multiple attempts,41 and Pappano

found a rate of 7% TLP in children for LPs performed by a single experienced operator.21 Lastly,

Nigrovic et al defined TLP as >10,000 RBC/µL, reported a TLP rate of 35%, and identified risk

factors as increased patient movement, less operator experience, local anesthetic not used, and

needle stylet not removed.38,39

1.1.7 Limitations of the Previous Literature

The study by Howard et al34 had several limitations that prompted us to undertake our

study. First, there were important potential predictor variables that were not examined. For

example, the study did not explore the effect of body mass index (BMI) percentile or obesity; the effect

of first LP (and active leukemia) versus subsequent LPs; the effect of being treated with

anticoagulants or having abnormal coagulation tests; or the influence of fluoroscopic

guidance on the rate of TLPs. Also, that study did not describe its handling of

failed LPs, which are very likely to be traumatic due to the repeated insertions and attempts.

Second, there is a concern that the study is not generalizable to the present clinical

setting, as the range of procedural practices and operator experience has changed over time.42

Howard et al included a large number of LPs performed without sedation and LPs performed by

operators including medical students, residents and nurses. In the current setting, nearly all LPs

are performed under deep sedation,43 and procedures are restricted to certified oncology

fellows and staff. Therefore, it would be useful to investigate the incidence of TLP and the

identified risk factors in a more recent setting.

Lastly, the Howard study also had some methodological limitations. First, it examined a

median of four LPs per child, whereas a child with ALL typically undergoes approximately 20 LPs. Thus,

the study could be biased if the risk of TLPs changes across the course of treatment. Second, it

used a repeated measures analysis with generalized estimating equations (GEE). While this is an

appropriate method to adjust for longitudinal data, the study did not describe the rationale for

selecting this method nor determine whether other methods of longitudinal analysis may have

been appropriate for the study.44

1.2 The Statistical Analysis of Repeated Measures Data

Conducting this study requires the correct application of statistical methods for analyzing

repeated measures data. In this section we will review the objectives, methods and differences

among the various statistical methods available for such analyses. This will form the background

for the further discussion of the methods used for the analysis of risk factors for TLP and help

with understanding and interpreting the results.

1.2.1 Clustered and Longitudinal Data

In a traditional cross-sectional study, where unique individuals are measured on a single

occasion, one aims to obtain estimates of between-subject differences in an outcome variable. In

the experimental setting, such as a randomized controlled trial (RCT), comparisons of the

outcome variable are made across sub-populations that differ in a single predictor variable of

interest, such as an assigned treatment. The process of randomization aims to ensure that all

other extraneous variables are, on average, balanced between the two groups. An important

assumption of all statistical tests used for the analysis of such data is that each individual

observation is independent of any other.

The assumption of independence is violated if there is any reason for individual

observations to be correlated,45 and statistical tests that assume independence will lead to

incorrect results.44 Many health science studies give rise to data that are clustered.46 This can

occur, firstly, if individuals are sampled from naturally occurring groups (such as families,

neighborhoods, hospitals, or clinics). Observations within a group are likely to be positively

correlated; that is, two observations from the same group are more likely to be similar than two

observations from different groups.


Second, clustering can occur if each individual is observed at more than one point in

time. In fact, longitudinal data can be understood as a special case of clustered data.46 Instead of

individuals clustered within groups, here observations are clustered within an individual.

Observations within an individual are more likely to be similar to each other than observations

between individuals. Individuals tend to be persistently high or persistently low in measures of

outcomes. Therefore, appropriate statistical techniques must explicitly account for these positive

correlations or else risk spurious results.45

1.2.2 Advantages and Types of Longitudinal Studies

The defining characteristic of all longitudinal studies is that multiple measurements of the

same individual (or the same research unit such as a tumor or a cell line) are taken over time.46

There are two major advantages to longitudinal studies that are not available to cross-sectional

studies, and these correspond broadly to two different longitudinal study types.

First, longitudinal studies allow the investigation of change over time.47 This is because,

in addition to between-subject comparisons, they allow one to study within-subject changes.

Indeed, in most longitudinal studies, these within-subject changes and the factors that influence

heterogeneity are the primary objective. For example, longitudinal studies may allow one to

study the trajectory of a measure over time (e.g. size of a tumor) and how changes depend on a

covariate (e.g. cytogenetics of the tumor). In the experimental setting, groups of subjects can be

defined by exposure category and followed. The objective of statistical testing is to compare the

trajectories of the outcome variable between groups. Notably, each individual is only exposed to

one level of the primary predictor variable throughout the study. This subset of longitudinal


studies is sometimes referred to as “growth curve analysis” and forms the bulk of the

longitudinal study literature.47

Second, a longitudinal study can allow the comparison of an outcome variable of an

individual subject under different conditions.46 That is, the level of the predictor variable that

each individual is exposed to changes across measurements. In this case, one is not interested in

the change over time per se, but in the effect of different circumstances or covariates on the

outcome variable. This subset of longitudinal studies is often referred to as “repeated measures

studies.”44 Similar to time in the previous example, the experimental condition can be treated as

a within-subject factor and the conditions can be compared using within-subject contrasts.

In the non-experimental setting, the advantages of the repeated measures design are

particularly attractive, since bias from the effects of confounding covariates is ubiquitous in

observational studies and randomization is not available as a solution.48 Furthermore, the gain in

power from multiple observations per individual means that a sufficient sample size can be

accrued from a smaller number of locations or eras.49

1.2.3 Alternatives to Longitudinal Data Analyses

Often, longitudinal studies are analyzed with conventional statistical methods that ignore

the issue of positive correlations.50 The procedure will incorrectly assume that all of the

observations are independent. While the resulting effect estimates may be similar, the standard

errors are often very biased (either larger or smaller) and thus so are the statistical tests of

significance.51 In observational studies, ignoring the correlation most often leads to

underestimation of the variability.50 This in turn results in standard errors that are too small and

test statistics that are too large, risking type I errors.
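As a rough illustration of how large this underestimation can be, the following Python sketch uses the standard design-effect approximation, which is an assumption brought in for illustration and not part of the analyses in this thesis. For clusters of equal size m with a single common (exchangeable) within-subject correlation ρ, the variance of a mean, or of a between-subject comparison, is inflated by a factor of 1 + (m − 1)ρ relative to independent data. The values of m and ρ below are hypothetical.

```python
import math

def design_effect(cluster_size, rho):
    """Variance inflation factor 1 + (m - 1) * rho under exchangeable correlation."""
    return 1.0 + (cluster_size - 1) * rho

def naive_se_correction(cluster_size, rho):
    """Factor by which a naive (independence-based) standard error is too small."""
    return math.sqrt(design_effect(cluster_size, rho))

# Hypothetical example: 20 observations per subject, within-subject correlation 0.10
deff = design_effect(20, 0.10)
se_factor = naive_se_correction(20, 0.10)
print(round(deff, 2), round(se_factor, 2))
```

The approximation applies to between-subject comparisons; for within-subject covariates a positive correlation can instead make naive standard errors too large, which is why the bias may go in either direction.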


Another common approach is to reduce the data from repeated measures into a summary

measure that can then allow standard statistical methods to be applied.46 For example, one may

estimate the difference between first and last value of an outcome variable, or calculate the area

under the curve (AUC). However, the drawback here is that it forces the analyst to concentrate

on just a single aspect of the repeated measurements, and this leads to loss of valuable

information that may be present in the totality of data. Individuals with very different trajectories

of response may be reduced to having the same summary measure. Two curves with very

different shapes may have the same slope or the same AUC. Moreover, summary measures do

not allow the inclusion of time-varying covariates, and one loses the advantage of the ability to

use the subject as his or her own control.

For a repeated measures study with a continuous outcome variable and a completely

balanced design, ANOVA or MANOVA may be used. However, these methods cannot handle

unbalanced designs, missing data, or binary outcomes.51

Therefore, the most commonly used methods for the analysis of longitudinal data involve

a number of closely-related techniques based on regression methods.46 These include mixed

models (random effects), conditional models (fixed effects) and marginal models (GEEs). We

will discuss the relative strengths and limitations of these methods in the following sections.

1.2.4 Notation

A basic linear regression model from a cross-sectional study can be written as:

Yi = β0 + βXi + εi

In this notation,46 Yi is the value of the outcome variable for individual i. β0 is the

intercept term in the model. Xi is a vector of covariates that varies between individuals, such as


the platelet count or gender. β is the fixed parameter that is estimated by regression and that

relates Xi to Yi. ε is a random error term. This is the standard notation that is familiar to

researchers from conventional linear regression. The same equation applies to non-linear (or

generalized linear) regression if a link function or transformation of Yi is used.51

In repeated measures data, there is one record for each individual at each time point.

There is an identifying number (or subject variable i) that is the same for all records that come

from the same individual. Additionally, there is also an observation number (or time variable t)

that indicates which time point the record comes from.

In a repeated measures model, the familiar vector of covariates Xi is “partitioned” into

two types of covariates: those whose values do not change for an individual over the duration of

the study (time-invariant or between-subject covariates) and those whose values do change over

time (time-varying or within-subject covariates).46,48 This new model can be written as:

Yit = β0 + βXit + Wi + αi + εit

In this notation, Yit is the value of the outcome variable for individual i at time t. β0 is the

intercept term in the model. Xit is a vector of time-varying variables, such as the platelet count

prior to a procedure. Wi is a vector of measured time-invariant covariates, such as gender or

ethnicity. Finally, αi denotes the vector of all time-invariant characteristics of individuals that are

not otherwise accounted for by the Wi term. Such covariates can include an individual’s genomic

profile, internal anatomy, and all other characteristics that are stable over time. These covariates

can influence the individual’s level of response, but they remain unknown to the investigator.
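The data layout and the partition of covariates described above can be sketched in Python as follows. This is a minimal illustration with made-up records; the field names (subject, time, platelets, gender) and the helper partition_covariates are hypothetical and are not taken from the study dataset.

```python
# Long format: one record per individual (subject i) per time point (t)
records = [
    {"subject": 1, "time": 1, "platelets": 45, "gender": "M"},
    {"subject": 1, "time": 2, "platelets": 120, "gender": "M"},
    {"subject": 2, "time": 1, "platelets": 80, "gender": "F"},
    {"subject": 2, "time": 2, "platelets": 80, "gender": "F"},
]

def partition_covariates(rows, id_key="subject", skip=("subject", "time")):
    """Split covariate names into time-varying (X_it) and time-invariant (W_i)."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            if name in skip:
                continue
            seen.setdefault(name, {}).setdefault(row[id_key], set()).add(value)
    # A covariate is time-varying if it takes more than one value within any subject
    varying = sorted(
        name for name, by_subject in seen.items()
        if any(len(vals) > 1 for vals in by_subject.values())
    )
    invariant = sorted(name for name in seen if name not in varying)
    return varying, invariant

print(partition_covariates(records))
```

The unmeasured stable characteristics αi have no column in such a dataset by definition, which is why the choice of how to treat them must be made in the model rather than in the data.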

Models of this type are known as linear mixed models. The “linear” denotes that they can

be expressed as a linear equation, and “mixed” indicates that they contain both fixed and random


terms. The word “generalized” is added when they are extended to logistic, Poisson or other

regression methods through a link function.

An important choice now arises regarding the nature of αi. It can be treated as either a

random variable or a fixed variable.48 This choice underlies the fundamental difference between

two analysis methods and lends them their names.

1.2.5 Random Effects Methods

αi can be treated as a random variable with a specified probability distribution. In this

case, it is allowed to vary between individuals (and can potentially be estimated for each

individual if desired).51 The underlying premise of this type of model is that some subset of

regression parameters vary from one individual to another, and these account for sources of

natural heterogeneity in the population. That is, individuals in the population are assumed to

have their own subject-specific mean responses and their own subset of regression parameters.

Individuals are either “high-responders” or “low-responders.” The response is modeled as a

combination of population characteristics, β and W, that are assumed to be shared by all

individuals, and subject-specific effects that are unique to a particular individual. The former are

referred to as fixed effects, while the latter are referred to as random effects. The term “mixed

models” in this context therefore indicates that the model contains both fixed effects and random

effects. (Although confusing, common use in the literature refers to these models as simply

“random effects,” though it should be remembered that they contain both types).44

In simplified terms, what the estimation procedure does is calculate a regression equation

for each individual in the study, with unique subject-specific estimates of αi, and then compute a

weighted average of all such equations to arrive at an estimate of β and W. The mixed model


utilizes all within-subject and between-subject comparisons and arrives at an estimate of the

parameter for both time-invariant and time-varying covariates.52 In concrete terms, this is akin to

asking the question “If a child has a traumatic LP on k days, what factors are different about

those days compared to any other day or any other child?”48

1.2.6 Fixed Effects Methods

On the other hand, if αi is treated as a set of fixed parameters, it is no longer easy to

estimate αi by regression. The αi term is now perfectly collinear with the Wi term, and thus both

cannot be estimated. They can however be conditioned out of the estimation process.

To better understand this important concept, let us examine the simplest case of a study

with only two repeated measures (e.g. a crossover trial with an outcome variable measured

before and after starting a treatment). To estimate β, the effect of the treatment, we can calculate

the within-subject changes in the outcome variable, Yi2 – Yi1, as follows:46,48

Yi2 – Yi1 = (β0 + βXi2 + Wi + αi + εi2) – (β0 + βXi1 + Wi + αi + εi1)

          = β (Xi2 – Xi1) + (εi2 – εi1)

Note that in calculating this difference, the intercept terms, the time-invariant covariate

effects and the stable characteristics term αi have all disappeared or cancelled out. Therefore, the

stable individual characteristics can have no effect on the estimate of β, even if αi is correlated with

X or is a confounder of the relationship between X and Y.
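The cancellation can be checked numerically. The following Python sketch builds a tiny, noise-free dataset in which the unmeasured stable term αi is deliberately made larger for subjects with larger X, so that a pooled regression ignoring αi is confounded. All numbers are hypothetical and chosen only for illustration; the error terms are set to zero so that the first-difference estimate recovers β exactly.

```python
BETA0, BETA = 1.0, 2.0  # hypothetical true intercept and treatment effect

# Each subject has two measurements (x1, x2), a measured stable covariate w,
# and an unmeasured stable characteristic alpha that rises with x.
subjects = [
    {"x1": 0.0, "x2": 1.0, "w": 0.5, "alpha": -3.0},
    {"x1": 1.0, "x2": 3.0, "w": 0.0, "alpha": 0.0},
    {"x1": 2.0, "x2": 5.0, "w": 0.5, "alpha": 4.0},
]

def outcome(x, s):
    """Y_it = beta0 + beta * X_it + W_i + alpha_i (error term set to zero)."""
    return BETA0 + BETA * x + s["w"] + s["alpha"]

def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs with an intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Pooled analysis: treats all six observations as independent and ignores alpha
xs = [s[k] for s in subjects for k in ("x1", "x2")]
ys = [outcome(s[k], s) for s in subjects for k in ("x1", "x2")]
pooled = ols_slope(xs, ys)

# First-difference analysis: the intercept, w and alpha cancel within each subject,
# so the differences are regressed through the origin
dx = [s["x2"] - s["x1"] for s in subjects]
dy = [outcome(s["x2"], s) - outcome(s["x1"], s) for s in subjects]
first_difference = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

print(pooled, first_difference)
```

In this constructed example the pooled slope overstates the effect because it absorbs the association between αi and X, whereas the within-subject differences are unaffected by it.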


Seen another way, the fixed effects model performs only within-subject comparisons and

discards all between-subject comparisons. In concrete terms, this is akin to asking the question

“If John had a traumatic LP on k days but not on other days, what was different about John on k

days compared to John on other days?”48 We may check to see if the days with a TLP were

associated with a particularly low platelet count. However, since John was a male on all days, the

effect of gender cannot be estimated and has been conditioned out. It makes no sense to ask what

effect male gender had on some days compared to other days. This is not to say that the effect of

gender is discarded, but rather that it is controlled for or balanced across comparisons. In the

example of the before and after experiment that we outlined above, the only variable that differed

between the two time points was the experimental treatment. There was no possible confounding

by gender, age, race, genetics or any other stable covariate. The essence of the fixed effects

method is captured by saying that each individual serves as his or her own control.48

1.2.7 Hybrid Random-Fixed Effects Method

There is a hybrid method that allows one to “disentangle” a covariate into its fixed effect

and random effect components.46,48 This is done by calculating the means and deviations of the

covariate, and regressing on these parameters separately. The estimate of the deviation variable

corresponds closely to the fixed effect. This method has the advantage of combining the best of

both worlds, providing control for bias from stable characteristics where desired while also

allowing inclusion of time-invariant covariates. However, it has the significant disadvantage of

losing parsimony and being difficult to understand by most health science readers. It may be

equally acceptable to simply report both effects as separate analyses if this is desired.
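The decomposition step of the hybrid method can be sketched in Python as follows. This only illustrates how a time-varying covariate is split into a person-specific mean and a deviation from that mean; the model fitting itself is not shown, and the records, field names and helper function are hypothetical.

```python
from collections import defaultdict

def mean_and_deviation(rows, id_key, var):
    """Return copies of rows with the within-person mean and deviation of `var` added."""
    by_subject = defaultdict(list)
    for row in rows:
        by_subject[row[id_key]].append(row[var])
    means = {subj: sum(vals) / len(vals) for subj, vals in by_subject.items()}
    out = []
    for row in rows:
        m = means[row[id_key]]
        # The mean carries the between-subject information;
        # the deviation carries the within-subject information.
        out.append({**row, var + "_mean": m, var + "_dev": row[var] - m})
    return out

records = [
    {"subject": 1, "platelets": 40},
    {"subject": 1, "platelets": 60},
    {"subject": 1, "platelets": 110},
    {"subject": 2, "platelets": 100},
    {"subject": 2, "platelets": 200},
]
rows = mean_and_deviation(records, "subject", "platelets")
for r in rows:
    print(r)
```

Both new columns would then be entered as separate covariates in a random effects model, with the coefficient of the deviation column playing the role of the fixed effects estimate.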


1.2.8 Marginal Models (Generalized Estimating Equations)

Despite their many differences, both the random and fixed effects models begin from the

linear mixed model paradigm. They are thus usually distinguished from a third method for the

analysis of repeated measures data known as marginal models, which use an entirely different

approach for accounting for the within-subject correlations.

The term “marginal” indicates that the model for the mean response depends only on the

covariates of interest, and not on any random effects or previous responses. There is no vector of

random effects and no subject-specific coefficients. In other words, there is no model constructed

for each individual, but rather one model for the mean response of the whole population.

Individuals are not distinguished as high responders or low responders but as contributing to an

overall average or “generalized” response. Hence, the most common method for estimating

marginal models is known as generalized estimating equations (GEE).

GEE models the mean response and separately models the within-subject associations

among the repeated measures.53 The goal is to make inferences about the former, while the latter

is treated as a nuisance variable that must be accounted for and thus “removed.”50 Therefore,

GEE estimates two different equations, one for the mean relations and one for the covariance

structure.51 The latter is specified as a “working” correlation matrix. The correlation matrix is a

two-dimensional array of conditional variances and covariances. The covariance structure can be

specified by the analyst to allow an expected pattern.50 An unstructured matrix imposes no

particular assumption about the covariance structure. The exchangeable (or compound-

symmetry) covariance structure specifies a single correlation that applies to all pairs of


observations. An independent structure assumes zero correlations. An autoregressive structure

assumes that observations are only related to their own past values and correlations decline with

time. While specifying the correct covariance structure may increase efficiency,54 overall the

GEE estimation is quite robust to misspecifications.55
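To make the working correlation structures concrete, the following Python sketch constructs three of them for a subject with n repeated observations. The first-order autoregressive form, in which the correlation between observations j and k is ρ raised to the power |j − k|, is the usual definition and is assumed here; the values of n and ρ are hypothetical.

```python
def independent(n):
    """Independence: zero correlation between different observations."""
    return [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]

def exchangeable(n, rho):
    """Exchangeable (compound symmetry): one correlation for every pair."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]

def ar1(n, rho):
    """First-order autoregressive: correlation declines as rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]

for row in ar1(4, 0.5):
    print(row)
```

An unstructured matrix has no such formula, since each pairwise correlation is estimated separately from the data.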

1.2.9 Strengths and Limitations of Each Method

Recall that the major advantage of RCTs is their ability to balance confounding

covariates across groups, even if they are unmeasured or unknown. In non-experimental studies,

researchers try to approximate a randomized experiment by statistically controlling for potential

confounding covariates using methods such as multivariable regression or propensity scores.

However, the researcher can only control for confounding covariates that are known, thought

about, and accurately measured. Some important covariates are almost always omitted. As a

result, estimates from non-experimental studies are always at risk of confounding bias.

In the fixed effects model outlined above, the same advantage of an RCT becomes

available to an observational study due to the availability of repeated measures and within-

subject comparisons. Confounding covariates (that are stable over time) are always balanced

across comparisons. The estimate of the effect of the predictor variable of interest is free of any

potential confounding effect of stable covariates, even those that have not been measured or

identified. This eliminates potentially large sources of bias.

However, there are three important disadvantages to fixed effects methods. First, as we

have already seen, fixed effects methods do not estimate a coefficient for time-invariant

characteristics, such as gender or ethnicity.44 If these covariates are of primary importance to the


researcher, then fixed effects methods would not be preferred. For similar reasons, fixed effects

may not be useful for variables that do change over time but only to a small degree. For example,

a fixed effects analysis may show a significant effect for height in children followed over a

decade, but is unlikely to do so for adults since height changes are minimal. Second, fixed effects

methods discard all information for subjects that have no variability in the outcome throughout

the study.48 If John never has a TLP, we cannot ask what factors put him at risk for TLP, and his

data drop out of the analysis. Thus, the effective sample size for fixed effects methods is usually

smaller relative to random effects methods.44 If a significant portion of subjects show no

variability in the outcome, the loss of information may become substantial. Third, by discarding

the between-person comparison, fixed effects methods in observational studies have more sampling

variability and thus yield standard errors that are considerably larger when compared to methods

that utilize both between- and within-subject comparisons. In summary, fixed effects methods offer

researchers a trade-off: reduced bias at the expense of increased variability and loss of

information.48

The relative strengths and weaknesses of random effects methods are a mirror image of

those for fixed effects. Random effects perform both within-subject and between-subject

comparisons and so typically have less sampling variability. They do allow for estimation of the

effects of stable characteristics such as gender and ethnicity. They do not discard information for

individuals whose outcome variable does not change. However, they do not control for

unmeasured stable characteristics of the individual.

Despite the differences in how they are derived, GEE models are similar to those

estimated by the random effects method, especially when an exchangeable covariance structure

is specified.50 Both perform within-subject and between-subject contrasts, provide an estimate for


time-invariant covariates, but do not control for bias from stable characteristics. If the latter

factor is of primary interest, then once again fixed effects are to be preferred.

However, the separation of the modeling of the mean response and the correlations has

important implications for the interpretation of the regression parameters. While the parameters

from random and fixed effects methods are referred to as “subject-specific,” those from GEE are

referred to as “population-averaged.”50 For the former, the target of inference is the individual.

For the latter, it is the population. The regression parameters β are said to have population-

averaged interpretations. Therefore, the choice between random effects and GEE is not made on

statistical but rather on subject-matter grounds. If the researcher is interested in determining

subject-specific coefficients, such as the expected benefit of treatment for an individual patient,

then random effects are preferred. If the researcher is interested in the potential reduction in

morbidity for a population if the new treatment is universally implemented, GEE is

preferred.44,50,55

In health science studies, however, the subject-specific responses for individuals are often

not shown in published reports. The random effects method does produce a “marginal” mean

estimate of the β coefficients by averaging over the distribution of the random effects.50 The

estimates may be nearly identical to those produced by GEE. However, when there is significant

unobserved heterogeneity, the estimates produced by GEE are usually smaller, as they undergo

“heterogeneity shrinkage,” or an attenuation towards zero in the presence of heterogeneity.55

Some authors state that the heterogeneity shrinkage is corrected by the random or fixed effects

models and therefore the latter are preferred,50 while others believe that the random effects

estimates are biased upwards, and that if the goal is to make an inference about the population-

averaged mean of Yi, GEE should be adopted. Either model may be equally acceptable, and


Fitzmaurice et al. write that the controversy of GEE versus random effects “has generated more

heat than light.”46
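The attenuation of population-averaged coefficients can be illustrated numerically. The Python sketch below assumes a logistic model with a normally distributed random intercept, which is one common form of unobserved heterogeneity but not the only one. It averages the subject-specific probabilities over the distribution of the random intercept and converts the result back to a log odds ratio. The intercept, coefficient and standard deviation are hypothetical values chosen for illustration.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n=4001):
    """Average expit(eta + a) over a ~ Normal(0, sigma^2) on an evenly spaced grid."""
    if sigma == 0:
        return expit(eta)
    lo = -8.0 * sigma
    step = 16.0 * sigma / (n - 1)
    num = den = 0.0
    for k in range(n):
        a = lo + k * step
        w = math.exp(-0.5 * (a / sigma) ** 2)  # unnormalised normal density
        num += w * expit(eta + a)
        den += w
    return num / den

def population_averaged_log_or(b0, b1, sigma):
    """Log odds ratio for a one-unit change in x after averaging over subjects."""
    return logit(marginal_prob(b0 + b1, sigma)) - logit(marginal_prob(b0, sigma))

B0, B1 = -1.5, 1.0  # hypothetical subject-specific intercept and log odds ratio
pa_no_heterogeneity = population_averaged_log_or(B0, B1, 0.0)
pa_heterogeneity = population_averaged_log_or(B0, B1, 2.0)
print(round(pa_no_heterogeneity, 3), round(pa_heterogeneity, 3))
```

With no heterogeneity the two interpretations coincide; as the spread of the random intercept grows, the population-averaged log odds ratio moves towards zero even though every individual's subject-specific coefficient is unchanged.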

The following table summarizes the features of each method and offers guidance on how

to choose the best method for the data.

TABLE 2. Comparison of Repeated-Measures Analysis Methods for Binary Outcomes

| | Marginal Model | Random Effects | Fixed Effects | Hybrid Model |
|---|---|---|---|---|
| Estimation procedure | Generalized estimating equations | Maximum likelihood | Conditional likelihood | Maximum likelihood |
| SAS Proc | Genmod with repeated statement | Glimmix with random statement | Logistic with strata statement | Glimmix with random statement |
| Within-subject contrasts? | Yes | Yes | Yes | Yes |
| Between-subject contrasts? | Yes | Yes | No | Yes |
| Controls for bias | No | No | Yes | Yes |
| Interpretation of coefficients | Population-averaged | Subject-specific | Subject-specific | Subject-specific |
| Heterogeneity shrinkage | Yes | No | No | No |
| Loss of information | No | No | Yes | No |
| Provides estimate of time-invariant factors | Yes | Yes | No | Yes |
| Effect on coefficient estimates, relative to random effects | May be decreased in the presence of unobserved heterogeneity | - | May be decreased if covariate values do not change significantly within each person | - |
| Effect on standard errors, relative to random effects | May be decreased in the presence of unobserved heterogeneity | - | May be increased due to sampling variability of within-person contrasts and loss of data | - |
| Other factors | Requires user to specify covariance matrix | - | - | Model is difficult to comprehend by most readers |
| Choose if: | Population-averaged estimates are desired, and subject-specific estimates are not needed. | Subject-specific estimates are desired, and time-invariant variables are important, or fixed effects model not feasible. May use in identical situations as GEE model | Control of bias from unmeasured stable characteristics is the primary objective, time-invariant variables are not important to estimate, and loss of information is not substantial. | Control of bias from unmeasured stable characteristics is the primary objective, but time-invariant variables are also important to estimate. |


1.3 Study Objectives

1.3.1 Primary objective:

To determine risk factors for TLPs in children with ALL

1.3.2 Secondary objectives:

1) To determine whether the EFS for patients who are TLP+ is significantly lower than that for those

who are CNS1 treated on contemporary regimens

2) To compare and contrast three different methods for the analysis of repeated-measures data:

GEE, random effects, and fixed effects methods.


1.4 Study rationale

Knowing the risk factors for TLPs in children with ALL would allow clinicians to frame

guidelines to optimize modifiable factors. This would allow pediatric oncologists to determine

the optimal platelet threshold, the minimal level of experience for suitable operators, and the

utility of image-guidance. Knowing which children are at highest risk of TLP would be very

important in directing appropriate prospective interventions to those who would most benefit

from such interventions. Some examples of potential interventions that could be tested in future

studies may include non-traumatic needles, improved guidelines or training programs, or

ultrasound-guidance for procedures.


CHAPTER 2: METHODS

2.1 Study Design

We conducted a retrospective cohort repeated-measures study utilizing hospital-based

health care records. The study received institutional and research ethics board approval from The

Hospital for Sick Children (SickKids) and the University of Toronto.

2.2 Study Population

2.2.1 Inclusion Criteria

The study population included children with ALL newly diagnosed at SickKids between

January 1, 2005 and December 31, 2009 who were between 0 and 18 years of age at diagnosis.

2.2.2 Exclusion Criteria

Children with relapsed ALL, secondary ALL, Burkitt leukemia (L3), and children who

did not undergo any LPs were excluded.

2.2.3 Study Timeline

The timeline for the inclusion of children in the study and the follow-up of the cohort is

shown in Figure 1. The last date of data extraction and the study end-date was March 1, 2012.


FIGURE 1. Timeline of study inclusion and follow-up

[Figure: time axis showing the accrual window (accrual start date: Jan 1, 2005; accrual end date: Dec 31, 2009) and the observation window (maximum follow-up date: March 1, 2012).]


2.3 Variables

2.3.1 Primary Outcome Variable

The primary outcome variable was a TLP, defined as an LP that contained at least 10

RBCs per microliter of CSF.

2.3.2 Secondary Outcome Variable

The secondary outcome variable was EFS, defined as the time interval from date of

diagnosis to relapse, second malignancy, treatment-related mortality, death, or date last seen

(whichever occurred first).
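The EFS definition above can be sketched in Python as follows. The function returns the interval in days together with a flag indicating whether an event was observed or the child was censored at the date last seen; the flag and the function name are assumptions added for illustration, following the usual convention for time-to-event data, and the dates are made up.

```python
from datetime import date

def efs_days(diagnosis, event_dates, last_seen):
    """Return (days from diagnosis to first event or date last seen, event_observed).

    event_dates holds the dates of relapse, second malignancy and death,
    with None for events that did not occur.
    """
    observed = [d for d in event_dates if d is not None and d <= last_seen]
    if observed:
        return (min(observed) - diagnosis).days, True
    return (last_seen - diagnosis).days, False

# Hypothetical child with a relapse one year after diagnosis
print(efs_days(date(2005, 1, 1), [date(2006, 1, 1), None, None], date(2010, 1, 1)))
# Hypothetical child with no events, censored at the date last seen
print(efs_days(date(2008, 3, 1), [None, None, None], date(2008, 3, 31)))
```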

2.3.3 Predictor Variables

The potential predictor variables (covariates) included patient, disease, operator, and

procedure-related variables. Furthermore, as is usual for repeated-measures studies, covariates were also

classified as being either time-varying (if they could have different values at different

observations, such as age at LP), or time-invariant (if they could not have different values at

different observations, such as age at diagnosis or gender).

All continuous variables were categorized according to cut-offs determined from

previous literature or the clinical judgment of study investigators. The list of variables, their

categories, the rationale for selected categorizations, and comments related to their importance or

analysis are shown in Table 3.


TABLE 3. List of Predictor Variables and Rationale

| Variable | Categories | Comments |
|---|---|---|
| A. Patient-Related Variables | | |
| Age at LP | <1 year; 1-<10 year; ≥10 year | National Cancer Institute/Rome criteria for ALL risk stratification |
| Gender | Male; Female | |
| Ethnicity | White; Black; Other | Categorized as per Howard et al.34 Race was a risk factor for TLP, possibly due to variations in lumbar lordosis by race. |
| BMI percentile | 0-<95; ≥95 | BMI percentile ≥ 95 is a standard definition for obesity, but not previously investigated for association with TLP. |
| B. Disease-Related Variables | | |
| Initial WBC (×10³/µL) | 0-100; ≥100 | Risk factor in Gajjar et al.22 High WBC may be a marker of inflammation. Note that this variable is only meaningful for the first LP. |
| Platelets (×10³/µL) | 0-50; 51-75; 76-100; >100 | Categorized as per Howard et al.34 Platelet count below 100 ×10³/µL was a risk factor for TLP. |
| INR or PTT | Normal; Abnormal | INR and PTT are tests of coagulation and abnormal values may lead to increased bleeding risk. |
| C. Operator-Related Variables | | |
| Position | Oncology Fellow; Oncology Staff; Radiologist | Radiologists only conducted LPs for referred patients using fluoroscopic-guidance. Association with TLP not previously investigated. Note that the comparison of radiologists versus oncologists also represents the effect of using image-guidance. |
| D. Procedure-Related Variables | | |
| Phase of treatment | First LP; Pre-maintenance; Maintenance | To determine if rate of TLPs varies between first LP and subsequent LPs. The pre-maintenance phase is a time of high treatment intensity, usually lasting 6 months, followed by a 2-3 year maintenance phase. |
| Use of Anticoagulation | None; Prophylactic dose; Treatment dose | Effect of anticoagulation on bleeding during LPs has not been previously investigated. |
| Days since previous LP | 0-3; 4-7; 8-15; ≥16 | Categorized as per Howard et al.34 This and next 2 variables related to previous LP, to allow adjustment for the potential residual or “lag” effects of previous LP. Theoretically, red blood cells from a previous TLP may be seen at the next one. |
| Previous TLP | Yes; No | Identified as a risk factor by Howard et al.34 |
| Platelet count at previous LP (×10³/µL) | 0-50; 51-75; 76-100; >100 | Identified as a risk factor by Howard et al.34 |
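The three variables that relate to the previous LP can be derived from a child's ordered sequence of procedures. The Python sketch below shows one way this might be done; it is not the code used in the study. The field names are hypothetical, the first LP is given missing values for the lag variables as an assumption, and the TLP threshold of at least 10 RBCs per microliter follows the primary outcome definition.

```python
def add_lag_variables(lps):
    """Derive the outcome and previous-LP variables for one child's procedures.

    lps: list of dicts with 'day' (days since diagnosis), 'platelets' and 'rbc_per_ul'.
    """
    ordered = sorted(lps, key=lambda r: r["day"])
    out, prev = [], None
    for r in ordered:
        row = dict(r)
        row["tlp"] = r["rbc_per_ul"] >= 10
        if prev is None:
            # First LP: no previous procedure to refer to
            row["days_since_prev"] = None
            row["prev_tlp"] = None
            row["prev_platelets"] = None
        else:
            row["days_since_prev"] = r["day"] - prev["day"]
            row["prev_tlp"] = prev["rbc_per_ul"] >= 10
            row["prev_platelets"] = prev["platelets"]
        out.append(row)
        prev = r
    return out

def days_category(days):
    """Categories for days since previous LP, as listed in Table 3."""
    if days is None:
        return None
    if days <= 3:
        return "0-3"
    if days <= 7:
        return "4-7"
    if days <= 15:
        return "8-15"
    return ">=16"

lps = [
    {"day": 8, "platelets": 120, "rbc_per_ul": 2},
    {"day": 1, "platelets": 40, "rbc_per_ul": 15},
]
rows = add_lag_variables(lps)
for r in rows:
    print(r["day"], r["tlp"], days_category(r["days_since_prev"]), r["prev_tlp"])
```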


2.4 Data Sources and Measurement

Children were identified through querying a prospective database maintained by

information coordinators in the Division of Hematology/Oncology at SickKids. For each

included child, data were recorded for all LPs from diagnosis until end of therapy, relapse, bone

marrow transplant, death, or date last seen. For children who experienced a relapse, second

malignancy or received a bone marrow transplant, LP variables were only collected preceding

and not after these events.

Each of the specified predictor variables for this study was documented as part of routine

clinical care. The principal data source was the electronic patient chart (EPC) which includes

scanned copies of all hand-written or typed clinical documentation at SickKids. Data on the

outcome variable as well as all laboratory tests were collected from the hospital’s electronic

repository of laboratory results (Sunrise Kidcare).

Data on the identity of the operator were abstracted in a hierarchical manner from multiple

sources. This allowed double-data verification in the large majority of cases. First, we searched

for a progress note or signature for the operator in the EPC. The signatures were compared to a

master-sheet containing signatures of all staff and fellows in Hematology/Oncology. Second, we

compared these names with the most recently-updated monthly schedule for the procedure room.

When procedures were performed instead in the operating room (OR) or image-guided therapy

(IGT) department, the name of the operator was found in the OR log. Third, we obtained paper

copies of the procedure room records maintained by the nurse in charge, which also included the

name of the operator. Lastly, as a method of confirmation, we sorted our dataset by the date of

procedure. In doing so, all operator identifiers for a particular date would be expected to be

identical, and any discrepancies were further investigated. If results were discrepant, the order of


reliability was considered to be signatures (most reliable), OR logs, nursing paper records, and

then monthly schedule (least reliable).
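The rule for resolving discrepant operator identities can be expressed as a simple priority lookup. The Python sketch below is an illustration only; the source labels and operator names are hypothetical, and in the study discrepancies were investigated manually before this ordering was applied.

```python
# Sources listed from most reliable to least reliable
RELIABILITY_ORDER = ["signature", "or_log", "nursing_record", "monthly_schedule"]

def resolve_operator(sources):
    """Return (operator, source) from the most reliable source that has a value."""
    for name in RELIABILITY_ORDER:
        value = sources.get(name)
        if value:
            return value, name
    return None, None

def is_discrepant(sources):
    """True if the available sources name more than one operator."""
    return len({v for v in sources.values() if v}) > 1

sources = {
    "signature": None,
    "or_log": "Dr. A",
    "nursing_record": "Dr. B",
    "monthly_schedule": "Dr. B",
}
print(resolve_operator(sources), is_discrepant(sources))
```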

Height and weight were abstracted from the clinical visit on the day of or most proximate

to the day of the LP. As height was not recorded at each visit, the process of last observation

carried forward (LOCF) was used. Since clinic visits for ALL occur frequently and height is

unlikely to change significantly between visits, this is a reasonable approach. Using height and

weight, BMI was calculated as weight (in kilograms) divided by height (in meters) squared. BMI

at each time point was compared to age and sex references to determine BMI percentile based on

the 2000 growth charts of the Centers for Disease Control and Prevention (CDC), using a SAS

macro.56 Cut-offs were made as per the definition of childhood obesity as a BMI percentile ≥ 95

for children 2 years of age or older, or as a weight-for-height percentile ≥ 95 for children less

than 2 years old.57
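The height imputation and BMI calculation can be sketched in Python as follows. The conversion of BMI to an age- and sex-specific percentile relied on the CDC SAS macro and is not reproduced here; the measurements below are made up for illustration.

```python
def locf(values):
    """Last observation carried forward: fill each missing value with the last known one."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2

# Hypothetical visits: height recorded at the first and last visit only
heights_m = locf([1.10, None, None, 1.12])
weights_kg = [19.0, 19.4, 19.8, 20.5]
print([round(bmi(w, h), 1) for w, h in zip(weights_kg, heights_m)])
```

A value that is missing before any height has been recorded stays missing under this approach, since there is nothing to carry forward.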

2.5 Sample Size

At SickKids, there are approximately 50-60 children with newly diagnosed ALL each

year. Each child receives an average of 20 LPs. By including a five-year period, we

expected to have 250 patients and 5,000 procedures in our cohort. The expected baseline rate of

TLPs, based on our preliminary analysis, was around 16%. Therefore, we expected to have

approximately 800 events in our study. Simulation studies have shown that a minimum of 10

events per variable is required in a logistic regression model, and that below this number

coefficient estimates are biased and variance estimates are inefficient. If we were to use the

approach of 10 events per variable for this study,58,59 we would have sufficient power to examine

up to 80 variables. However, in a repeated-measures study, the number of events needed per

covariate also depends on the degree of dependency among observations.49 Sample size can thus

be challenging to estimate and presently requires simulation techniques.51 Nevertheless, a sample

size of 800 events would be conservatively expected to provide sufficient power for the intended

analyses.
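The arithmetic behind these expectations can be laid out explicitly. The sketch below simply restates the figures given in this section; the variable names are illustrative, and the calculation ignores the dependency among repeated observations noted above.

```python
# Figures taken from the text of this section.
children_per_year = 50        # lower end of the 50-60 range
years = 5
lps_per_child = 20
baseline_tlp_rate = 0.16
events_per_variable = 10      # conventional rule of thumb

patients = children_per_year * years                  # expected patients
procedures = patients * lps_per_child                 # expected LPs
events = round(procedures * baseline_tlp_rate)        # expected TLPs
max_variables = events // events_per_variable         # variables supportable

print(patients, procedures, events, max_variables)
```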

2.6 Statistical Analyses

All analyses were conducted using SAS version 9.3 for Windows (SAS Institute, Cary,

NC). Statistical significance was defined as a p-value <0.05. Given that the primary objective

was to identify predictor variables and not to test particular hypotheses, no adjustment for

multiple comparisons was made. We reported odds ratios (ORs) with 95% CIs. An OR over 1.0

indicated that the predictor variable was associated with an increased risk of TLP relative to the

reference level.
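For a single binary predictor, an unadjusted OR and its Wald 95% CI can be computed directly from a 2x2 table. The sketch below is illustrative only: it uses the counts reported for the predictor "previous TLP" in the Results (250 TLPs among 931 LPs with a preceding TLP, 693 among 4336 without), treats every LP as independent, and is not the SAS procedure used for the analyses.

```python
import math


def odds_ratio_ci(events_exposed, n_exposed, events_ref, n_ref, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    a, b = events_exposed, n_exposed - events_exposed
    c, d = events_ref, n_ref - events_ref
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Previous TLP: yes = 250/931 TLPs, no = 693/4336 TLPs (reference level).
or_, lo, hi = odds_ratio_ci(250, 931, 693, 4336)
print(round(or_, 1), round(lo, 1), round(hi, 1))
```

Rounded to one decimal place, this gives 1.9 (1.6-2.3), the unadjusted estimate reported for this predictor; an OR above 1.0 indicates higher odds of TLP than at the reference level.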

2.6.1 Descriptive Statistics

Demographic and disease characteristics were calculated for the presentation of

descriptive statistics. Normality of data was tested for continuous variables using histograms and

the Kolmogorov-Smirnov test. Non-normally distributed variables were presented with median

and interquartile range. Categorical variables were presented with number and percentage.

2.6.2 Model Building

Collinearity was tested for all covariates. A variance inflation factor (VIF) >2.5 and

tolerance of <0.4 were used to define collinearity. Correlations were tested using Pearson

correlations, with a value of 0.2-0.4 considered a moderate correlation and a value above 0.4

considered a high correlation. Variables that were either collinear or highly correlated were not

both included within the same multivariable model.
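The two screening criteria can be sketched as follows. VIF and tolerance are reciprocals, so a VIF above 2.5 and a tolerance below 0.4 describe the same threshold. The code is a simplified illustration in Python, not the SAS output used in the study; the label "low" for coefficients under 0.2 is an assumption added for completeness, and the R-squared shortcut shown applies only when a covariate is regressed on a single other covariate.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_from_r2(r_squared):
    """Variance inflation factor from the R-squared of regressing one
    covariate on the remaining covariates."""
    return 1.0 / (1.0 - r_squared)


def tolerance(vif):
    """Tolerance is the reciprocal of the VIF."""
    return 1.0 / vif


def correlation_label(r):
    """Classification used in the text: 0.2-0.4 moderate, above 0.4 high."""
    r = abs(r)
    if r > 0.4:
        return "high"
    if r >= 0.2:
        return "moderate"
    return "low"  # label assumed for this sketch


# With only two covariates, R-squared equals the squared correlation.
r = 0.49
print(correlation_label(r), round(vif_from_r2(r ** 2), 2))
```

A pairwise correlation of 0.49 is classified as high under the stated rule even though the corresponding two-variable VIF (about 1.32) is well below 2.5, which is why the two criteria are applied separately.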

To compare different methods available for the analysis of longitudinal data, univariate

and multivariable models were constructed using five different methods. Within each method,

the final multivariable model was developed using a backward selection strategy. From among

the five models, the final model to be presented was based on a combination of clinical judgment

and the suitability of the method to the data as summarized in Table 2.

2.6.3 Conventional Logistic Regression

First, univariate and multivariable models were constructed using conventional logistic

regression methods by utilizing SAS Proc LOGISTIC.55 This method assumes independence of

all observations and thus is an incorrect method for analyzing longitudinal data. It is knowingly

included here only for instructional purposes, in order to compare the resulting effect estimates,

standard errors, p-values and final models between methods that do and do not adjust for the

dependence among repeated measures.

2.6.4 Generalized Estimating Equations

Second, we fit a marginal model with GEE using SAS Proc GENMOD with an

exchangeable (compound symmetry) working correlation matrix.55 This method produces

population-averaged coefficients and conducts both within-person and between-person

comparisons.
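An exchangeable working correlation assumes that any two LPs from the same child are correlated to the same degree, regardless of how far apart they occurred. A minimal sketch of that structure is shown below; the value of alpha is hypothetical, since in practice it is estimated from the data by the GEE procedure.

```python
def exchangeable_correlation(n_visits, alpha):
    """Working correlation matrix with 1 on the diagonal and a common
    correlation alpha between every pair of observations."""
    return [
        [1.0 if i == j else alpha for j in range(n_visits)]
        for i in range(n_visits)
    ]


# Hypothetical child with 4 LPs and an assumed within-child correlation of 0.05.
for row in exchangeable_correlation(4, 0.05):
    print(row)
```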

2.6.5 Random-Effects (Generalized Linear Mixed Models)

Third, we fit a generalized linear mixed model with maximum likelihood estimation

using SAS Proc GLIMMIX.48 Although the name includes “linear”, the procedure can be used to

model a dichotomous outcome through a link function. This method produces subject-specific

coefficients by incorporating a term for a random intercept variable representing unobserved

heterogeneity. It conducts both within-person and between-person comparisons. The resulting

coefficients can be interpreted as answering “what is the probability of a patient with this

variable having a TLP when compared to patients without this variable?”

2.6.6 Fixed-Effects (Conditional Logistic Regression)

Next, we fit a fixed effects model using SAS Proc LOGISTIC with a STRATA statement

for the subject term.48 This method conditions out all stable characteristics of subjects and

examines only time-varying covariates. It conducts only within-subject comparisons, thereby

allowing each patient to be used as his or her own control. The resulting coefficients can be

interpreted as answering “what is the probability of this patient having a TLP when exposed to

this variable as compared to the same patient when not exposed to this variable?”

2.6.7 A Hybrid Method

Lastly, we fit a hybrid method that combines the fixed-effects and random-effects

approach, allowing us to embed the fixed effects of interest within a random-effects model. This

is accomplished by incorporating means and deviations for each time-varying predictor within

the model specified by Proc GLIMMIX.48 The deviation estimate represents the fixed effects.
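The decomposition into means and deviations can be sketched as follows. This is an illustrative Python example with hypothetical records and column names, not the SAS data step used in the study: each time-varying predictor is split into the patient's own mean (a between-patient component) and the deviation of each observation from that mean (a within-patient component).

```python
from collections import defaultdict


def decompose(records, key):
    """Add per-patient mean and deviation-from-mean columns for `key`."""
    totals = defaultdict(lambda: [0.0, 0])
    for r in records:
        totals[r["patient"]][0] += r[key]
        totals[r["patient"]][1] += 1
    out = []
    for r in records:
        total, n = totals[r["patient"]]
        mean = total / n
        out.append({**r, key + "_mean": mean, key + "_dev": r[key] - mean})
    return out


# Hypothetical data: patient A had 2 of 4 LPs image-guided, patient B none.
records = [
    {"patient": "A", "image_guided": 0},
    {"patient": "A", "image_guided": 0},
    {"patient": "A", "image_guided": 1},
    {"patient": "A", "image_guided": 1},
    {"patient": "B", "image_guided": 0},
    {"patient": "B", "image_guided": 0},
]
for row in decompose(records, "image_guided"):
    print(row)
```

Both derived columns would then enter the random-effects model as predictors; a patient whose predictor never changes contributes deviations of zero and therefore informs only the mean term.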

2.6.8 Kaplan-Meier Survival Curves

For the secondary objective, to determine the association between CNS status and EFS,

the Kaplan-Meier product-limit estimator method was used.60 CNS status was classified based on

the results of the first LP using the SJCRH system. EFS was stratified by CNS status, and a

two-sided log-rank test was used to compare TLP+ with CNS1 status.
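For readers unfamiliar with the product-limit estimator, the following is a minimal sketch of how the survival curve is built: at each event time, the running survival probability is multiplied by the fraction of children at risk who did not have an event. The follow-up times below are hypothetical and the code is illustrative Python, not the SAS procedure used in the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times  -- follow-up time for each child
    events -- True if the event occurred at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # everyone leaving at time t
        n_events = sum(tied)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Hypothetical follow-up (years) for five children; False marks censoring.
times = [1.0, 2.0, 2.0, 3.0, 4.0]
events = [True, False, True, True, False]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 2))
```

Children censored at a given time are counted as at risk for events occurring at that same time, which is the usual convention.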

CHAPTER 3: RESULTS

3.1 Descriptive Statistics

3.1.1 Participants and Procedures

Of the 268 children diagnosed with ALL during the 5-year period, 2 were excluded

because of early death before the first LP, 1 due to a diagnosis of secondary ALL, and 1 due to a

subsequent diagnosis of Burkitt leukemia. The remaining 264 children underwent 5,435 LPs. Of

these, 121 LP observations were excluded as they occurred after a diagnosis of relapse. Sixteen

LPs were documented as being successful but did not have an available CSF RBC count. A

further 31 LPs were failed procedures (0.6%) that also did not have an available CSF RBC count.

Failed LPs were those in which documented attempts did not yield evaluable CSF. These 47

LPs with missing CSF values could not be used in analyses involving the outcome variable of

TLP. They were, however, kept within the dataset as complete exclusion would have affected

variables determined from the previous procedure (such as days since previous LP).

In total, therefore, 264 children undergoing 5267 LPs were included in the final analysis.

The mean number of evaluable LPs per child was 20.0 (range 1 to 31). The flow of

patients and procedures through the study is shown in Figure 2.

3.1.2 Characteristics

The demographic and disease characteristics for 264 children are shown in Table 4.

These characteristics include time-invariant variables or the value of time-varying covariates at

the day of the first LP. As all continuous variables were non-normally distributed, they are

presented with median and interquartile range.

3.1.3 Primary Outcome: Traumatic Lumbar Punctures

Among all 5267 evaluable LPs, there were 943 (17.9%) TLPs. Among the 264 first LPs,

there were 52 (19.7%) TLPs. There were 26 (9.8%) TLP+ containing blasts, and 26 TLP-

containing no blasts. The distribution of CNS status using the SJCRH system is shown in

Figure 3. Of the 26 TLP+ patients, 16 required the treating physician to use the Steinherz-Bleyer

formula to determine the COG CNS status, and 10 could be classified as CNS2 without the

formula.

FIGURE 2. Flow of patients and lumbar punctures through the study

268 patients diagnosed with ALL from 2005 to 2009 inclusive

4 patients excluded: 2 early deaths prior to first LP; 1 secondary ALL; 1 Burkitt leukemia

264 patients underwent 5,435 LPs

LPs excluded from analysis: 121 LPs conducted after relapse; 47 LPs with no evaluable CSF

264 patients undergoing 5,267 LPs included in analysis

TABLE 4. Patient Characteristics (N=264 Patients)

| Characteristic | Category | Median (IQR) | N (%) |
|---|---|---|---|
| Age at diagnosis | | 4.3 (5.2) | |
| | <1 year | | 10 (3.8) |
| | 1-<10 years | | 203 (76.9) |
| | ≥10 years | | 51 (19.3) |
| Gender | Male | | 152 (57.6) |
| | Female | | 112 (42.4) |
| BMI%ile at diagnosis | | 61.4 (55.8) | |
| | 0-<95 | | 231 (87.5) |
| | ≥95 | | 33 (12.5) |
| Ethnicity | White or Caucasian | | 63 (23.9) |
| | Black or African | | 7 (2.7) |
| | Other | | 103 (39.0) |
| | Not documented or known | | 91 (34.5) |
| Presenting WBC (x 10³/µL) | | 7.9 (19.0) | |
| | <50 | | 225 (86) |
| | ≥50 | | 38 (14) |
| Presence of circulating blasts prior to first LP | Yes | | 237 (89.8) |
| | No or unknown | | 27 (10.2) |
| Patient ever received anticoagulation? | Yes | | 20 (7.6) |
| | No | | 244 (92.4) |
| Patient ever received image-guided LP | Yes | | 27 (10.2) |
| | No | | 237 (89.8) |
| Leukemia treatment group | Standard risk | | 152 (57.6) |
| | High risk | | 64 (24.2) |
| | Very high risk | | 17 (6.4) |
| | Infant leukemia | | 9 (3.4) |
| | T-cell leukemia | | 22 (8.3) |

FIGURE 3. CNS status and proportion of patients with traumatic lumbar punctures at

first LP (N=264)

3.2 Inferential Statistics

3.2.1 Collinearity and Correlations

No two covariates were collinear as per VIF and tolerance criteria. Predictor variables

that had the highest Pearson correlation coefficients included phase of treatment and time since

previous LP (0.49); and platelet count and previous platelet counts (0.25), as shown in Table 5

below.

TABLE 5. Pearson Correlation Coefficients*

| | Age | BMI | Phase | Days | Prior TLP | Platelet | Prior Platelet |
|---|---|---|---|---|---|---|---|
| Age | - | -0.07 | 0.016 | 0.012 | -0.15 | 0.10 | 0.09 |
| BMI | -0.07 | - | -0.06 | -0.03 | -0.07 | -0.04 | -0.02 |
| Phase | 0.016 | -0.06 | - | 0.49 (<0.001) | -0.10 | -0.04 | 0.21 |
| Days | 0.012 | -0.03 | 0.49 (<0.001) | - | -0.09 | 0.19 | 0.17 |
| Prior TLP | -0.15 | -0.07 | -0.10 | -0.09 | - | -0.01 | -0.06 |
| Platelet | 0.10 | -0.04 | -0.04 | 0.19 | -0.01 | - | 0.25 (<0.001) |
| Prior Platelet | 0.09 | -0.02 | 0.21 | 0.17 | -0.06 | 0.25 (<0.001) | - |

*P-values shown if coefficient >0.20

3.2.2 Model Building

From the 13 potential predictor variables listed in Table 3, nine were tested in the

repeated-measures multivariable regression analyses. Initial WBC count and coagulation tests

(INR/PTT) were excluded as they were relevant only to first LPs and were not significantly

associated with first LPs being traumatic. Phase of treatment, although significant in univariate

analyses, was excluded due to its high correlation with days since previous LP. The covariate

ethnicity was difficult to measure. In about one-third of our cases, we could not find any

documentation of ethnicity, and in many other situations found it difficult to classify. Therefore

ethnicity was not further examined as a predictor variable in regression modeling.

The results for univariate and multivariable regression modeling using each of the five

methods are shown in Tables 8 to 12. Variables to be included in multivariable models were

selected using backward elimination. The final multivariable models are displayed comparatively

in Table 13. For models that utilized between-subject comparisons (all except fixed effects), the

same five variables were selected using automated or manual backward elimination procedures.

These variables included age at LP (1-<10 years versus < 1 or ≥ 10 years), BMI percentile (≥ 95

versus < 95), platelet count at LP (over 100 x 10³/µL versus 0-50, 51-75 or 76-100 x 10³/µL), days

since the prior LP (≥16 vs 0-3, 4-7 or 8-15 days), and a preceding TLP.

Both the effect estimates and the confidence intervals produced by each of these methods

were very similar across all covariates, even when accounting for repeated measures was ignored

as in the conventional logistic regression model. The confidence intervals for the conventional

model were only slightly larger than for the GEE model. The effect estimates and CIs produced

by GEE and random effects were slightly different in univariate analysis but became nearly

equivalent in multivariable analysis.

In the fixed effects univariate model, platelet count at LP, platelet count at previous LP,

days since previous LP and operator position (radiologist versus oncologist) were significant. In

multivariable analysis, only days since prior LP and operator (radiologist versus oncologist)

remained, and the effect of radiology was seen to reduce the odds of TLP. In total, 27 children

underwent 233 LPs under radiologic guidance. This latter variable was not selected in any other

model. In general, the fixed effects method produced much larger confidence intervals than the

other methods. As expected, it did not produce any estimate for the time-invariant variable, sex.

TABLE 8. Odds Ratios (OR) and 95% CI Using Conventional Logistic Regression

| Variable | Category | No of LPs | TLPs (%) | Univariate OR (95% CI) | Multivariable OR (95% CI) |
|---|---|---|---|---|---|
| N | | 5267 | | 5267 | 5214 |
| Intercept β estimate | | | | | -1.69 |
| Age at LP | 0-1y | 82 | 33 (40) | 3.9 (2.5-6.1) | 3.46 (2.16-5.53) |
| | 1-10y | 3965 | 586 (15) | 1.0 | 1.0 |
| | >10y | 1220 | 324 (27) | 2.1 (1.8-2.4) | 2.04 (1.74-2.39) |
| BMI%ile at LP | 0-95% | 4209 | 711 (17) | 1.0 | 1.0 |
| | ≥95% | 1057 | 232 (22) | 1.4 (1.2-1.6) | 1.49 (1.25-1.77) |
| Sex | Male | 3156 | 564 (18) | 1.0 | - |
| | Female | 2111 | 379 (18) | 1.0 (0.9-1.2) | - |
| Phase of treatment | First LP | 264 | 52 (20) | 1.0 | - |
| | Pre-Maintenance | 2621 | 514 (20) | 1.0 (0.7-1.4) | - |
| | Maintenance | 2382 | 377 (16) | 0.8 (0.6-1.1) | - |
| Platelets | 0-50 | 59 | 13 (22) | 1.4 (0.7-2.6) | 1.26 (0.67-2.39) |
| | 51-75 | 216 | 57 (26) | 1.8 (1.3-2.4) | 1.47 (1.04-2.01) |
| | 76-100 | 222 | 61 (28) | 1.9 (1.4-2.5) | 1.48 (1.08-2.05) |
| | >100 | 4770 | 812 (17) | 1.0 | 1.0 |
| Operator | Fellow | 3146 | 574 (18) | 1.0 | - |
| | Staff Oncologist | 1874 | 292 (16) | 0.8 (0.71-0.97) | - |
| | Radiologist | 233 | 75 (32) | 2.1 (1.6-2.8) | - |
| Days since previous LP | 0-3 | 29 | 15 (52) | 5.5 (2.6-11.4) | 5.18 (2.38-11.27) |
| | 4-7 | 866 | 188 (22) | 1.4 (1.2-1.7) | 1.35 (1.08-1.68) |
| | 8-15 | 431 | 96 (22) | 1.5 (1.2-1.9) | 1.33 (1.01-1.74) |
| | ≥16 | 3941 | 644 (16) | 1.0 | 1.0 |
| Platelets at previous LP | 0-50 | 56 | 16 (29) | 1.9 (1.1-3.5) | - |
| | 51-75 | 209 | 53 (25) | 1.7 (1.2-2.3) | - |
| | 76-100 | 219 | 57 (26) | 1.7 (1.3-2.3) | - |
| | >100 | 4783 | 817 (17) | 1.0 | - |
| Previous TLP | Yes | 931 | 250 (27) | 1.9 (1.6-2.3) | 1.62 (1.36-1.93) |
| | No | 4336 | 693 (16) | 1.0 | 1.0 |
| Recent anticoagulation | None | 5043 | 906 (18) | 1.0 | - |
| | Prophylactic dose | 88 | 13 (15) | 0.79 (0.44-1.43) | - |
| | Treatment dose | 98 | 21 (21) | 1.25 (0.77-2.03) | - |
| Coagulation Tests | Normal | 735 | 164 (22) | 1.0 | - |
| | Abnormal (INR>1.3, PTT>40) | 58 | 14 (24) | 1.11 (0.59-2.07) | - |
| White Blood Cell | 0-50 | 225 | 44 (20) | 1.0 | - |
| | >50 | 38 | 8 (21) | 1.10 (0.48-2.56) | - |

TABLE 9. Odds Ratios (OR) and 95% CI Using Generalized Estimating Equations

| Variable | Category | Univariate OR (95% CI) | Multivariable OR (95% CI) |
|---|---|---|---|
| N | | 5267 | 5214 |
| Intercept β estimate | | | -1.64 |
| Age at LP | <1y | 3.92 (2.45-6.28) | 3.45 (2.25-5.29) |
| | 1-10y | 1.0 | 1.0 |
| | >10y | 2.01 (1.63-2.48) | 1.96 (1.62-2.38) |
| BMI%ile at LP | 0-95% | 1.0 | 1.0 |
| | ≥95% | 1.26 (1.02-1.56) | 1.43 (1.19-1.72) |
| Sex | Male | 1.0 | - |
| | Female | 0.99 (0.81-1.22) | - |
| Phase of treatment | First LP | 1.0 | - |
| | Pre-Maintenance | 0.97 (0.71-1.32) | - |
| | Maintenance | 0.76 (0.55-1.06) | - |
| Platelets | 0-50 | 1.31 (0.71-2.44) | 1.23 (0.66-2.29) |
| | 51-75 | 1.63 (1.17-2.28) | 1.43 (1.02-2.00) |
| | 76-100 | 1.69 (1.20-2.38) | 1.47 (1.05-2.06) |
| | >100 | 1.0 | 1.0 |
| Operator | Fellow | 1.0 | - |
| | Staff Oncologist | 0.85 (0.73-1.00) | - |
| | Radiologist | 1.46 (0.90-2.36) | - |
| Days since previous LP | 0-3 | 5.83 (2.80-12.15) | 5.23 (2.47-11.06) |
| | 4-7 | 1.37 (1.13-1.65) | 1.33 (1.10-1.61) |
| | 8-15 | 1.47 (1.17-1.85) | 1.33 (1.05-1.69) |
| | ≥16 | 1.0 | 1.0 |
| Platelets at previous LP | 0-50 | 1.84 (1.06-3.19) | - |
| | 51-75 | 1.54 (1.12-2.12) | - |
| | 76-100 | 1.60 (1.14-2.23) | - |
| | >100 | 1.0 | - |
| Previous TLP | Yes | 1.53 (1.28-1.81) | 1.45 (1.22-1.73) |
| | No | 1.0 | 1.0 |
| Recent anticoagulation | None | 1.0 | - |
| | Prophylactic dose | 0.86 (0.51-1.43) | - |
| | Treatment dose | 1.21 (0.70-2.11) | - |

TABLE 10. Odds Ratios (OR) and 95% CI Using Random Effects with Generalized Linear Mixed Models

| Variable | Category | Univariate OR (95% CI) | Multivariable OR (95% CI) |
|---|---|---|---|
| N | | 5267 | 5266 |
| Intercept β estimate | | | -1.70 |
| Age at LP | <1y | 4.01 (2.32-6.95) | 3.46 (2.06-5.81) |
| | 1-10y | 1.0 | 1.0 |
| | >10y | 2.05 (1.68-2.51) | 2.00 (1.66-2.4) |
| BMI%ile at LP | 0-95% | 1.0 | 1.0 |
| | ≥95% | 1.28 (1.04-1.58) | 1.44 (1.19-1.75) |
| Sex | Male | 1.0 | - |
| | Female | 0.99 (0.81-1.22) | - |
| Phase of treatment | First LP | 1.0 | - |
| | Pre-Maintenance | 0.98 (0.71-1.36) | - |
| | Maintenance | 0.76 (0.55-1.06) | - |
| Platelets | 0-50 | 1.27 (0.66-2.43) | 1.15 (0.60-2.20) |
| | 51-75 | 1.65 (1.19-2.29) | 1.42 (1.01-1.98) |
| | 76-100 | 1.73 (1.26-2.39) | 1.49 (1.08-2.07) |
| | >100 | 1.0 | 1.0 |
| Operator | Fellow | 1.0 | - |
| | Staff Oncologist | 0.85 (0.72-0.99) | - |
| | Radiologist | 1.39 (0.94-2.08) | - |
| Days since previous LP | 0-3 | 6.25 (2.84-13.78) | 5.13 (2.34-11.25) |
| | 4-7 | 1.40 (1.16-1.69) | 1.35 (1.11-1.63) |
| | 8-15 | 1.46 (1.14-1.89) | 1.31 (1.02-1.70) |
| | ≥16 | 1.0 | 1.0 |
| Platelets at previous LP | 0-50 | 1.86 (1.00-3.44) | - |
| | 51-75 | 1.56 (1.11-2.18) | - |
| | 76-100 | 1.59 (1.14-2.21) | - |
| | >100 | 1.0 | - |
| Previous TLP | Yes | 1.52 (1.27-1.83) | 1.43 (1.19-1.73) |
| | No | 1.0 | 1.0 |
| Recent anticoagulation | None | 1.0 | - |
| | Prophylactic dose | 0.83 (0.42-1.66) | - |
| | Treatment dose | 1.21 (0.68-2.17) | - |

TABLE 11. Odds Ratios (OR) and 95% CI Using Fixed Effects with Conditional Logistic Regression

N: 5214

| Variable | Category | Univariate OR (95% CI) | Multivariable OR (95% CI) |
|---|---|---|---|
| Age at LP | 0-2y | 2.52 (0.74-8.64) | - |
| | 2-9y | 1.0 | - |
| | >9y | 1.09 (0.63-1.88) | - |
| BMI%ile at LP | 0-95% | 1.0 | - |
| | ≥95% | 0.99 (0.74-1.32) | - |
| Sex | Male / Female | N/A | - |
| Phase of treatment | First LP | 1.0 | - |
| | Pre-Maintenance | 0.99 (0.71-1.37) | - |
| | Maintenance | 0.79 (0.56-1.11) | - |
| Platelets | 0-50 | 1.20 (0.61-2.35) | - |
| | 51-75 | 1.45 (1.03-2.03) | - |
| | 76-100 | 1.54 (1.10-2.15) | - |
| | >100 | 1.0 | - |
| Operator | Fellow | 1.0 | 1.0 |
| | Staff Oncologist | 0.90 (0.76-1.06) | 0.90 (0.77-1.07) |
| | Radiologist | 0.54 (0.31-0.94) | 0.55 (0.32-0.95) |
| Days since previous LP | 0-3 | 7.02 (2.92-16.88) | 7.34 (3.05-17.67) |
| | 4-7 | 1.33 (1.10-1.61) | 1.31 (1.08-1.58) |
| | 8-15 | 1.41 (1.08-1.84) | 1.40 (1.07-1.82) |
| | ≥16 | 1.0 | 1.0 |
| Platelets at previous LP | 0-50 | 1.78 (0.94-3.37) | - |
| | 51-75 | 1.43 (1.01-2.02) | - |
| | 76-100 | 1.47 (1.04-2.06) | - |
| | >100 | 1.0 | - |
| Previous TLP | Yes | 1.10 (0.92-1.31) | - |
| | No | 1.0 | - |
| Recent anticoagulation | None | 1.0 | - |
| | Prophylactic dose | 0.90 (0.37-2.18) | - |
| | Treatment dose | 1.18 (0.55-2.51) | - |

TABLE 12. Odds Ratios (OR) and 95% CI Using The Hybrid Method

| Variable | Category | Multivariable OR (95% CI) |
|---|---|---|
| N | | 5249 |
| Intercept β estimate | | -1.68 |
| Age at LP | <1y | 3.48 (2.10-5.77) |
| | 1-10y | 1.0 |
| | >10y | 1.80 (1.49-2.18) |
| BMI%ile at LP | 0-95% | 1.0 |
| | ≥95% | 1.35 (1.11-1.63) |
| Sex | Male / Female | - |
| Phase of treatment | First LP / Pre-Maintenance / Maintenance | - |
| Platelets | 0-50 | 1.13 (0.59-2.18) |
| | 51-75 | 1.38 (0.99-1.94) |
| | 76-100 | 1.43 (1.03-1.98) |
| | >100 | 1.0 |
| Operator (Oncologist vs Radiologist) | Fixed Effects | 0.56 (0.33-0.96) |
| | Random Effects | 2.19 (1.38-3.47) |
| Days since previous LP | 0-3 | 5.19 (2.38-11.36) |
| | 4-7 | 1.32 (1.09-1.60) |
| | 8-15 | 1.32 (1.02-1.71) |
| | ≥16 | 1.0 |
| Platelets at previous LP | 0-50 / 51-75 / 76-100 / >100 | - |
| Previous TLP | Yes | 1.44 (1.19-1.73) |
| | No | 1.0 |
| Recent anticoagulation | None / Prophylactic dose / Treatment dose | - |

TABLE 13. Comparison of Models

| Variable | Category | No of LPs | TLPs (%) | Unadjusted Univariate OR | Logistic Regression | GEE | Random Effects | Fixed Effects | Hybrid |
|---|---|---|---|---|---|---|---|---|---|
| Age at LP | <1y | 82 | 33 (40) | 3.9 (2.5-6.1) | 3.46 (2.16-5.53) | 3.45 (2.25-5.29) | 3.46 (2.06-5.81) | - | 3.48 (2.10-5.77) |
| | 1-10y | 3965 | 586 (15) | 1.0 | 1.0 | 1.0 | 1.0 | - | 1.0 |
| | >10y | 1220 | 324 (27) | 2.1 (1.8-2.4) | 2.04 (1.74-2.39) | 1.96 (1.62-2.38) | 2.00 (1.66-2.4) | - | 1.80 (1.49-2.18) |
| BMI at LP | 0-95%ile | 4209 | 711 (17) | 1.0 | 1.0 | 1.0 | 1.0 | - | 1.0 |
| | >95%ile | 1057 | 232 (22) | 1.4 (1.2-1.6) | 1.49 (1.25-1.77) | 1.43 (1.19-1.72) | 1.44 (1.19-1.75) | - | 1.35 (1.11-1.63) |
| Platelets | 0-50 | 59 | 13 (22) | 1.4 (0.7-2.6) | 1.26 (0.67-2.39) | 1.23 (0.66-2.29) | 1.15 (0.60-2.20) | - | 1.13 (0.59-2.18) |
| | 51-75 | 216 | 57 (26) | 1.8 (1.3-2.4) | 1.47 (1.04-2.01) | 1.43 (1.02-2.00) | 1.42 (1.01-1.98) | - | 1.38 (0.99-1.94) |
| | 76-100 | 222 | 61 (28) | 1.9 (1.4-2.5) | 1.48 (1.08-2.05) | 1.47 (1.05-2.06) | 1.49 (1.08-2.07) | - | 1.43 (1.03-1.98) |
| | >100 | 4770 | 812 (17) | 1.0 | 1.0 | 1.0 | 1.0 | - | 1.0 |
| Operator | Fellow | 3146 | 574 (18) | 1.0 | - | - | - | 1.0 | |
| | Staff | 1874 | 292 (16) | 0.8 (0.71-0.97) | - | - | - | 0.90 (0.77-1.07) | |
| | Radiologist | 233 | 75 (32) | 2.1 (1.6-2.8) | - | - | - | 0.55 (0.32-0.95) | FE: 0.56 (0.33-0.96); RE: 2.19 (1.38-3.47) |
| Days since prior LP | 0-3 | 29 | 15 (52) | 5.5 (2.6-11.4) | 5.18 (2.38-11.3) | 5.23 (2.47-11.1) | 5.13 (2.34-11.3) | 7.34 (3.05-17.7) | 5.19 (2.38-11.36) |
| | 4-7 | 866 | 188 (22) | 1.4 (1.2-1.7) | 1.35 (1.08-1.68) | 1.33 (1.10-1.61) | 1.35 (1.11-1.63) | 1.31 (1.08-1.58) | 1.32 (1.09-1.60) |
| | 8-15 | 431 | 96 (22) | 1.5 (1.2-1.9) | 1.33 (1.01-1.74) | 1.33 (1.05-1.69) | 1.31 (1.02-1.70) | 1.40 (1.07-1.82) | 1.32 (1.02-1.71) |
| | ≥16 | 3941 | 644 (16) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Previous TLP | Yes | 931 | 250 (27) | 1.9 (1.6-2.3) | 1.62 (1.36-1.93) | 1.45 (1.22-1.73) | 1.43 (1.19-1.73) | - | 1.44 (1.19-1.73) |
| | No | 4336 | 693 (16) | 1.0 | 1.0 | 1.0 | 1.0 | - | 1.0 |

3.2.3 Secondary Outcome: Event-Free Survival

The mean length of follow-up was 3.9 years (SD 1.6 years). For 188 children with CNS1

status, the 5-year EFS was 93% (SE ±2). For 26 children with TLP+ status, the 5-year EFS was

77% (SE ±8). The difference between these two groups was statistically significant (log-rank

p = 0.002). Kaplan-Meier survival curves are shown in Figure 4.

FIGURE 4. Kaplan-Meier survival curves by CNS status

CHAPTER 4: DISCUSSION

4.1 Summary of Main Findings

In our retrospective cohort study of 5267 LPs, we observed an overall TLP rate of 17.9%

and a first TLP rate of 19.7%. In terms of identifying predictors of TLP, we found that in both

the random effects model and GEE, the variables independently associated with TLP were age

less than 1 or over 10 years, BMI percentile over 95, platelet count less than 100 x 10³/µL, fewer

days since the previous LP, and a previous TLP. However, in the fixed effects model, the

variables independently associated with TLP were fewer days since the previous LP, and the use

of image-guidance by a radiologist.

The overall proportion of TLPs in our study is lower than that reported by Howard et al

(29.3%).34 This is likely due to the fact that Howard et al included only the first few LPs per

child, and therefore had a greater proportion of LPs in the intensive phases of treatment where

LPs are performed closer together. It may also be related to the study being conducted during an

era when procedural sedation was not routinely used, and when junior trainees such as medical

students and residents also performed LPs on children with ALL. Nevertheless, given the

significant consequences of TLPs in ALL, we believe that our proportion of TLPs is still high

and that further attempts at minimizing this complication are still needed.

4.2 Predictors of TLP

The proportion of TLP was higher among infants than children 1-<10 years of age in our

study. This may be due to the technical challenges of performing the procedure within a smaller

anatomic space. Due to the shallower distances between the subarachnoid space and the posterior

venous plexus, the infant spine offers a narrower margin of error. Thus even small

miscalculations in estimating distance or angle of needle insertion, coupled with the larger

relative size of the LP needle to the anatomic structures, may increase the chance of blood vessel

laceration. The proportion of TLP was also significantly higher among children older than 10

years compared to those between 1-10 years of age. A different set of technical challenges

probably accounts for this observation. As children grow older and larger, the distance between

the skin and the spinous processes increases, which makes the spinous processes and thus the

optimal site of needle insertion more difficult to visualize and palpate. With increased distance,

any deviation from the ideal angle at the skin puncture site becomes amplified. Lastly, older

children often require longer needles which are more technically challenging to use. Longer

needles can curve or bend on insertion and are therefore harder to direct in a straight trajectory.

Obesity was also significantly associated with TLP. The reasons for this may be similar

to those for older age. Procedure landmarks are harder to palpate in the presence of obesity, and needle

distances are greater. Shah et al found that being unable to visualize (let alone palpate) the

spinous processes for adults in the emergency room was a risk factor for traumatic LPs.40 This

variable was not assessed by Howard et al.34

Age and obesity are both patient-related factors. They are not modifiable at the time of

the LP. However, they are useful as predictor variables in that they can identify those subsets of

patients with a higher risk of TLP. Such risk stratification would allow interventions that

reduce TLP to be directed to the patients who are most likely to benefit.

A lower platelet count before the LP was a modifiable risk factor, as children with

thrombocytopenia can be given a platelet transfusion prior to their procedure. We found that the

risk of a TLP is significantly associated with all platelet counts under 100 x 10³/µL.

Interestingly, patients in all categories with platelet counts less than 100 x 10³/µL had an

increased risk, and the risk was not substantially different across the categories below this

threshold. These results are remarkably similar to those identified by Howard et al. The authors

of that study recommended that “in those settings in which traumatic LP is particularly

undesirable and the benefit of transfusion outweighs the disadvantages, such as the diagnostic LP

in a child with ALL and circulating leukemic cells, platelet transfusion for a count of 100 x

10³/µL or lower is warranted.”34 Therefore, at SJCRH all children with newly diagnosed ALL

are transfused if their platelet count is below this threshold prior to their first LP.61 Our findings

support this practice. This same threshold was not included as a recommendation in a recent

Canadian platelet transfusion guideline for pediatric cancer patients.62

The proportion of TLPs was significantly higher when a prior LP was performed within

the preceding 15 days, and also if the prior LP was itself traumatic. Both these variables can be

regarded as “lag” variables. Although they are not themselves clinically useful, they are

important to include as potential confounders in a repeated-measures study. By adjusting for

them within the multivariable model, the results obtained from sequential LPs can be more

meaningfully applied to understanding predictors of TLPs.

4.3 Pertinent Negative Findings

In addition to these five identified risk factors, there were two pertinent negative findings

of our study worth highlighting.

First, although TLP rates varied widely among different operators, we found no

significant difference in the rate of TLPs between oncology fellows and staff. Therefore, rather

than restrict fellows from performing first LPs, institutions should consider identifying their best

performers within all positions who can be called upon to perform first LPs, especially in

children with higher risk of TLP.

Secondly, our study is the first to look for an association between the use of recent

anticoagulation therapy and the proportion of TLPs. No significant association was found. This

result is important because approximately 5% of all children with ALL will experience a

thrombotic event requiring anticoagulation with either heparin or LMWH.63 Large bleeds within

the spinal canal, known as spinal hematomas, are a rare but devastating complication of LPs that

can lead to paraplegia or death.64,65 Consequently, at SickKids, our institutional practice is to

hold unfractionated heparin for 4 hours and LMWH for 24 hours prior to an LP without requiring

coagulation testing. Our results suggest that this practice is sufficient to prevent even

microscopic bleeding.

4.4 Survival Analysis

Consistent with previous studies,22-24 we found a significant reduction in EFS for children

with TLP+ compared to those who had CNS1 status. This shows that TLP+ continues to be a

prognostic factor even in the context of contemporary therapy with additional treatments given to

these children. This lends further importance to the need to reduce first TLPs to as low as

possible in children with ALL.

4.5 Comparison of Repeated-Measures Analysis Methods

The five risk factors highlighted above were all identified as being the most significant

factors in models fitted using either conventional logistic regression, GEE or random effects

mixed models. Indeed, both the effect estimates and the confidence intervals produced by each of

these methods were very similar across all covariates. The similarity of findings across the

different regression methods means that our results are robust to the choice of statistical method.

It also suggests that within-patient correlations and unobserved heterogeneity were not

significant factors within this dataset.

In going from a model that incorrectly ignored the dependence among repeated observations within individuals to those

that accounted for it, no substantial change in results was seen. Generally speaking, regression

models that ignore correlation tend to overestimate the standard errors of time-varying

covariates, while the effect on the size of the coefficient estimates is less pronounced.50,55 While

the confidence intervals may be slightly wider for the conventional logistic regression model

compared to GEE in our study, the differences are very subtle. This might suggest that residual

intra-individual correlation was not a strong factor in this setting. Lastly, it may perhaps imply

that TLP risk does not tend on average to cluster within individuals, and hence that there is either

not much variation in TLP risk between individuals and/or that most variation in TLP proportion

is for reasons extrinsic to the individual.

The effect estimates and CIs produced by GEE and random effects were again equivalent.

This reinforces the similarities between these two models in spite of the very different underlying

estimation methods.46 We did not observe any heterogeneity shrinkage in the GEE model relative

to the random effects model, which suggests that unobserved heterogeneity was minimal. Indeed,

the GEE estimates tended to be slightly smaller than the corresponding random effects estimates

in univariate analysis, but more similar in multivariable analysis as the unobserved heterogeneity

was reduced by the addition of covariates. This again leads us to conclude that

variation in subject-specific effects is small. Indeed, while TLP proportion between individuals

does vary, it tends to do so mainly within the range of 10% to 30%.

In contrast, the fixed effects model led to very different results. As might have been

expected, no significant association for age at LP or BMI percentile was seen. This is because

these variables do not change much over the course of 3 years of observation, especially when

they are categorized at cut-points. They in effect behaved as time-invariant variables when

restricted to just one individual. The number of days since prior LP remained a strong predictor,

but had much larger confidence intervals, as would be expected. Since fixed effects discards data

for any patient whose outcome variable does not change across predictor categories, it makes

inefficient use of the data and thus works with effectively smaller sample sizes.
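The loss of information in the fixed effects method can be illustrated with a small sketch. In a conditional logistic model stratified by patient, a patient whose LPs were all traumatic or all non-traumatic contributes nothing to the conditional likelihood. The cohort below is hypothetical and the function name is illustrative.

```python
def informative_patients(outcomes_by_patient):
    """Keep only patients whose TLP outcome (1 = traumatic, 0 = not)
    varies across their LPs; the rest drop out of the conditional likelihood."""
    return {
        pid: ys
        for pid, ys in outcomes_by_patient.items()
        if 0 < sum(ys) < len(ys)
    }


# Hypothetical cohort of four patients with 13 LPs in total.
cohort = {
    "A": [0, 0, 0, 0],  # never traumatic: uninformative
    "B": [0, 1, 0, 0],
    "C": [1, 1],        # always traumatic: uninformative
    "D": [1, 0, 1],
}
kept = informative_patients(cohort)
lps_used = sum(len(ys) for ys in kept.values())
lps_total = sum(len(ys) for ys in cohort.values())
print(sorted(kept), lps_used, lps_total)
```

In this hypothetical cohort only 7 of 13 LPs inform the estimates, which is the mechanism behind the wider confidence intervals.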

Furthermore, in the fixed effects method one additional factor was noted to be significant:

the use of fluoroscopic image-guidance by an interventional radiologist. In our cohort, 27

children underwent 233 LPs under fluoroscopy. These were all children who were referred to the

interventional radiologist by the patient’s oncology team after either failed or repeatedly difficult

LPs. Moreover, these procedures often occurred within a few days after a prior LP attempt. One

would therefore expect to see susceptibility bias and/or confounding-by-indication in this group

of children. Indeed, it is instructive to observe how the estimated effect of this variable changed

across the different analytic methods. In unadjusted analysis, the image-guided LPs were

traumatic twice as often as all other LPs, with an overall rate of 32% and an OR of 2.1. In

univariate GEE adjusting only for correlations, the OR declined to 1.39 and lost statistical

significance. It further declined to 1.16 in multivariable GEE accounting for all other significant

covariates, likely due mostly to the added adjustment for days since prior LP (data not shown).

Lastly, in the fixed effects analysis, the effect of image-guidance reversed direction and was seen

to be significantly protective against TLP, with an OR of 0.55. The fixed effects method balances

unobserved but stable covariates and uses each patient as their own control.48 This suggests that

among the small group of patients selected for a referral to image-guidance, there is an intrinsic

but unmeasured variable that predisposes them to a higher rate of TLP. Within these patients,

the use of image-guidance can reduce the risk of TLP.

After comparing the strengths and limitations of all models, we selected the GEE model

for final presentation. The differences between GEE and the random effects model are minimal,

and we based our decision on the fact that we are most interested in population-averaged rather

than subject-specific effects. The population-averaged coefficients allow users to estimate the

average change in the proportion of the outcome if modification of a risk factor were applied

across all patients.50
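
The relationship between subject-specific and population-averaged odds ratios can be sketched numerically. Assuming a logistic model with a normally distributed random intercept, the population-averaged OR is obtained by averaging each patient's predicted probability over the random-intercept distribution. The baseline TLP proportion of 20%, the subject-specific OR of 2.0, and the standard deviations below are illustrative assumptions, not estimates from this study, and the function names are hypothetical.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(linear_predictor, sd_intercept, n_grid=2001, width=8.0):
    """Average P(outcome) over a Normal(0, sd^2) random intercept.

    Uses a simple grid approximation to the integral (an assumption of
    this sketch; statistical software typically uses quadrature rules).
    """
    step = 2.0 * width / (n_grid - 1)
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(n_grid):
        z = -width + i * step              # standard-normal grid point
        weight = math.exp(-0.5 * z * z)    # unnormalized normal density
        weighted_sum += weight * expit(linear_predictor + sd_intercept * z)
        weight_total += weight
    return weighted_sum / weight_total


def odds(p):
    return p / (1.0 - p)


def population_averaged_or(intercept, log_or_subject, sd_intercept):
    """Marginal OR implied by a subject-specific log-OR and intercept SD."""
    p_unexposed = marginal_probability(intercept, sd_intercept)
    p_exposed = marginal_probability(intercept + log_or_subject, sd_intercept)
    return odds(p_exposed) / odds(p_unexposed)


baseline = math.log(0.2 / 0.8)   # illustrative 20% baseline TLP proportion
for sd in (0.0, 0.5, 2.0):       # illustrative random-intercept SDs
    print(sd, round(population_averaged_or(baseline, math.log(2.0), sd), 3))
```

Under these assumptions the two odds ratios coincide when there is no between-patient heterogeneity and diverge as the heterogeneity grows, with the population-averaged OR moving toward 1. This is in keeping with the observation that GEE and random effects estimates were similar when variation in subject-specific effects was small.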

4.6 Study Limitations

Our study had several limitations. As a retrospective study, recording of covariates was

limited to those which were documented as part of routine clinical care. Therefore, several

potential predictor variables could not be assessed. For example, we could not assess the

potential impact of depth of anesthesia or degree of patient movement, the positioning of the

patient during the procedure, the width or length of needle used for the LP, or particular operator

practices such as removing the stylet. However, these particular variables would not be expected

to vary much within our cohort, as most procedures are performed within consistent parameters.

Unlike Shah et al,40 we could not assess whether operators were able to visualize or palpate the

spinous landmarks or whether this variable correlated with obesity. Furthermore, unlike Glatstein

et al,41 we could not assess whether multiple attempts increased the risk of TLP, as operators did

not document their number of attempts except in rare cases.

As with all non-experimental research, it is always possible that other important

explanatory variables were omitted as a result of not having been considered at all. Indeed, the

results of our fixed effects analysis on image-guidance do strongly suggest that some

unmeasured variable(s) account for a high proportion of TLP in at least a subgroup of children.

Lastly, a limitation of our study is that SickKids is a large tertiary-care referral center.

Therefore our results may not be generalizable to smaller or non-academic centers.

4.7 Study Strengths

A strength of our study is its large sample size. Furthermore, the covariates that were

included in the study had very low rates of missing data. Since nearly all LPs for childhood

cancer patients at SickKids are performed under deep sedation, documentation for procedures is

generally of a high quality. Only 47 (0.9%) LPs had missing outcome data (no CSF RBC count)

and only another 53 LPs (1.0%) had data missing on covariate values for the final model.

Another strength of this study was our use and comparison of multiple regression

methods for repeated-measures, which allowed us to gain better insights into the data and present

the most suitable multivariable model. Since results were consistent across all models that

performed within-person comparisons, the results of our analysis were robust to the choice of

method.

Our study had broad inclusion criteria and included all LPs for all children with any type

of ALL. Therefore, the study results should be generalizable to other large pediatric oncology

institutions where the routine practices regarding deep sedation and operator experience are

similar to ours.

4.8 Future Research

The findings from this study provide a foundation for future research. Image-guidance

may have the potential to reduce TLPs in children with particular susceptibility, and this

hypothesis should be further investigated. We are currently planning a study to determine

whether ultrasound-guidance performed in the oncology procedure room can reduce the

proportion of TLPs. Another future direction could be to determine prospectively whether

implementation of the clinical recommendations from this study, such as platelet transfusion

thresholds, identification of selected operators for first LPs, and/or image-guidance, are able to

reduce the proportion of first TLPs over time.

Lastly, the dataset in this study led to similar conclusions regardless of whether

conventional logistic regression, random effects, or GEE was used, and thus the relative

strengths and weaknesses of each method were not specifically highlighted. Future research

could therefore utilize simulations with varying degrees of intra-individual correlation and

unobserved heterogeneity in order to determine the relative strengths, bias, and efficiency of

each method under specific conditions. This could provide further guidance to researchers on

which method to use under different circumstances.

REFERENCES

1. SEER Cancer Statistics Review, 1975-2008. National Cancer Institute, 2011. (Accessed 2011, at http://seer.cancer.gov/csr/1975_2008/.)
2. Pui CH, Thiel E. Central nervous system disease in hematologic malignancies: historical perspective and practical applications. Semin Oncol 2009;36:S2-S16.
3. Pui CH, Howard SC. Current management and challenges of malignant disease in the CNS in paediatric leukaemia. Lancet Oncol 2008;9:257-68.
4. Hvizdala E, Berry DH, Chen T, et al. Impact of the timing of triple intrathecal therapy on remission induction in childhood acute lymphoblastic leukemia: a Pediatric Oncology Group study. Med Pediatr Oncol 1984;12:173-7.
5. Pullen J, Boyett J, Shuster J, et al. Extended triple intrathecal chemotherapy trial for prevention of CNS relapse in good-risk and poor-risk patients with B-progenitor acute lymphoblastic leukemia: a Pediatric Oncology Group study. J Clin Oncol 1993;11:839-49.
6. Iacoangeli M, Roselli R, Pagano L, et al. Intrathecal chemotherapy for treatment of overt meningeal leukemia: comparison between intraventricular and traditional intralumbar route. Ann Oncol 1995;6:377-82.
7. Moghrabi A, Levy DE, Asselin B, et al. Results of the Dana-Farber Cancer Institute ALL Consortium Protocol 95-01 for children with acute lymphoblastic leukemia. Blood 2007;109:896-904.
8. Cherlow JM, Sather H, Steinherz P, et al. Craniospinal irradiation for acute lymphoblastic leukemia with central nervous system disease at diagnosis: a report from the Children's Cancer Group. Int J Radiat Oncol Biol Phys 1996;36:19-27.
9. Pui CH. Central nervous system disease in acute lymphoblastic leukemia: Prophylaxis and treatment. Hematology 2006.
10. Meadows AT, Gordon J, Massari DJ, Littman P, Fergusson J, Moss K. Declines in IQ scores and cognitive dysfunctions in children with acute lymphocytic leukaemia treated with cranial irradiation. Lancet 1981;2:1015-8.
11. Uruena M, Stanhope R, Chessells JM, Leiper AD. Impaired pubertal growth in acute lymphoblastic leukaemia. Arch Dis Child 1991;66:1403-7.
12. MacLean WE Jr, Noll RB, Stehbens JA, et al. Neuropsychological effects of cranial irradiation in young children with acute lymphoblastic leukemia 9 months after diagnosis. The Children's Cancer Group. Arch Neurol 1995;52:156-60.
13. Mahmoud HH, Rivera GK, Hancock ML, et al. Low leukocyte counts with blast cells in cerebrospinal fluid of children with newly diagnosed acute lymphoblastic leukemia. N Engl J Med 1993;329:314-9.
14. Children's Oncology Group. 2011. (Accessed December, 2011, at www.childrensoncologygroup.org.)
15. Schrappe M, Reiter A, Henze G, et al. Prevention of CNS recurrence in childhood ALL: results with reduced radiotherapy combined with CNS-directed chemotherapy in four consecutive ALL-BFM trials. Klin Padiatr 1998;210:192-9.
16. Practice parameters: lumbar puncture (summary statement). Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 1993;43:625-7.

17. Rech A, de Carvalho GP, Meneses CF, Hankins J, Howard S, Brunetto AL. The influence of traumatic lumbar puncture and timing of intrathecal therapy on outcome of pediatric acute lymphoblastic leukemia. Pediatr Hematol Oncol 2005;22:483-8.
18. Conter V, Arico M, Valsecchi MG, et al. Extended intrathecal methotrexate may replace cranial irradiation for prevention of CNS relapse in children with intermediate-risk acute lymphoblastic leukemia treated with Berlin-Frankfurt-Munster-based intensive chemotherapy. The Associazione Italiana di Ematologia ed Oncologia Pediatrica. J Clin Oncol 1995;13:2497-502.
19. Nachman J, Sather HN, Cherlow JM, et al. Response of children with high-risk acute lymphoblastic leukemia treated with and without cranial irradiation: a report from the Children's Cancer Group. J Clin Oncol 1998;16:920-30.
20. Pui CH, Campana D, Pei D, et al. Treating childhood acute lymphoblastic leukemia without cranial irradiation. N Engl J Med 2009;360:2730-41.
21. Pappano D. "Traumatic tap" proportion in pediatric lumbar puncture. Pediatric Emergency Care 2010;26:487-9.
22. Gajjar A, Harrison PL, Sandlund JT, et al. Traumatic lumbar puncture at diagnosis adversely affects outcome in childhood acute lymphoblastic leukemia. Blood 2000;96:3381-4.
23. Burger B, Zimmermann M, Mann G, et al. Diagnostic cerebrospinal fluid examination in children with acute lymphoblastic leukemia: significance of low leukocyte counts with blasts or traumatic lumbar puncture. J Clin Oncol 2003;21:184-8.
24. te Loo DM, Kamps WA, van der Does-van den Berg A, et al. Prognostic significance of blasts in the cerebrospinal fluid without pleiocytosis or a traumatic lumbar puncture in children with acute lymphoblastic leukemia: experience of the Dutch Childhood Oncology Group. J Clin Oncol 2006;24:2332-6.
25. Larson S, Schall G, Di Chiro G. The influence of previous lumbar puncture and pneumoencephalography on the incidence of unsuccessful radioisotope cisternography. Journal of Nuclear Medicine 1971;12.
26. Chordas C. Post-dural puncture headache and other complications after lumbar puncture. J Pediatr Oncol Nurs 2001;18:244-59.
27. Holdsworth MT, Raisch DW, Winter SS, et al. Pain and distress from bone marrow aspirations and lumbar punctures. Ann Pharmacother 2003;37:17-22.
28. Ebinger F, Kosel C, Pietz J, Rating D. Headache and backache after lumbar puncture in children and adolescents: A prospective study. Pediatrics 2004;113:1588-92.
29. Eskey CJ, Ogilvy CS. Fluoroscopy-guided lumbar puncture: Decreased frequency of traumatic tap and implications for the assessment of CT-negative acute subarachnoid hemorrhage. American Journal of Neuroradiology 2001;22:571-6.
30. Yu SD, Chen MY, Johnson AJ. Factors associated with traumatic fluoroscopy-guided lumbar punctures: a retrospective review. American Journal of Neuroradiology 2009;30:512-5.
31. Sidhu M, Coley B, Goske M, et al. Image Gently, Step Lightly: increasing radiation dose awareness in pediatric interventional radiology. Pediatric Radiology 2009;39:1135-8.
32. Chong AL, Grant R, Ahmed B, Thomas K, Connolly BL, Greenberg M. Imaging in pediatric patients: Time to think again about surveillance. Pediatr Blood Cancer 2010;55:407-13.

33. Miksys N, Gordon CL, Thomas K, Connolly BL. Estimating effective dose to pediatric patients undergoing interventional radiology procedures using Anthropomorphic Phantoms and MOSFET dosimeters. Am J Roentgenol 2010;194:1315-22.
34. Howard SC, Gajjar AJ, Cheng C, et al. Risk factors for traumatic and bloody lumbar puncture in children with acute lymphoblastic leukemia. JAMA 2002;288:2001-7.
35. Shah K, Richard KM, Nicholas S, Edlow J. Incidence of traumatic lumbar puncture. Academic Emergency Medicine 2003;10:151-4.
36. Molina A, Fons J. Factors associated with lumbar puncture success. Pediatrics 2006;118:842-4; author reply 4.
37. Kaushal HS, Daniel M, Jeffrey S, Jonathan AE. Predicting difficult and traumatic lumbar punctures. The American Journal of Emergency Medicine 2007;25:608-11.
38. Nigrovic L, Kuppermann N, Neuman M. Risk Factors for Traumatic or Unsuccessful Lumbar Punctures in Children. Annals of Emergency Medicine 2007;49:762-71.
39. Nigrovic LE, McQueen AA, Neuman MI. Lumbar Puncture Success Rate Is Not Influenced by Family-Member Presence. Pediatrics 2007;120:e777-82.
40. Shah K, McGillicuddy D, Spear J, Edlow J. Predicting difficult and traumatic lumbar punctures. The American Journal of Emergency Medicine 2007;25:608-11.
41. Glatstein MM, Zucker-Toledano M, Arik A, Scolnik D, Oren A, Reif S. Incidence of traumatic lumbar puncture: experience of a large, tertiary care pediatric hospital. Clin Pediatr (Phila) 2011;50:1005-9.
42. Airhart A, Doyle J, Airhart C, Abla O, Alexander S. Assessment of pediatric haematology/oncology fellows' training in the performance of lumbar punctures [abstract]. Pediatr Blood Cancer 2009;52:721.
43. Ljungman G, Gordh T, Sorensen S, Kreuger A. Lumbar puncture in pediatric oncology: conscious sedation vs. general anesthesia. Med Pediatr Oncol 2001;36:372-9.
44. Gardiner JC, Luo Z, Roman LA. Fixed effects, random effects and GEE: What are the differences? Statistics in Medicine 2009;28:221-39.
45. Edwards LJ. Modern statistical techniques for the analysis of longitudinal data in biomedical research. Pediatric Pulmonology 2000;30:330-44.
46. Fitzmaurice GM, Laird NM, Ware JH. Applied Longitudinal Analysis. New Jersey: Wiley; 2011.
47. Singer JD, Willett JB. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Madison Avenue, New York: Oxford University Press, Inc; 2003.
48. Allison PD. Fixed Effects Regression Methods for Longitudinal Data: Using SAS. Cary, NC: SAS Institute Inc; 2005.
49. Lipsitz RS, Fitzmaurice GM. Sample size for repeated measures studies with binary responses. Statistics in Medicine 1994;13:1233-9.
50. Hu FB, Goldberg J, Hedeker D, Flay BR, Pentz MA. Comparison of population-averaged and subject-specific approaches for analyzing repeated binary outcomes. American Journal of Epidemiology 1998;147:694-703.
51. Locascio JJ, Atri A. An overview of longitudinal data analysis methods for neurological research. Dement Geriatr Cognitive Disorders Extra 2011;1:330-57.

52. Cheng J, Edwards LJ, Maldonado-Molina MM, Komro KA, Muller KE. Real longitudinal data analysis for real people: Building a good enough mixed model. Statistics in Medicine 2010;29:504-20.
53. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika 1986;73:13-22.
54. Pan W. Akaike's information criterion in generalized estimating equations. Biometrics 2001;57:120-5.
55. Allison PD. Logistic Regression Using the SAS System: Theory and Application. Cary, NC: SAS Institute Inc; 1999.
56. A SAS Program for the CDC Growth Charts. Centers for Disease Control and Prevention, 2011. (Accessed March 2012, at http://www.cdc.gov/nccdphp/dnpao/growthcharts/resources/sas.htm.)
57. Barlow SE. Expert committee recommendations regarding the prevention, assessment, and treatment of child and adolescent overweight and obesity: Summary report. Pediatrics 2007;120:S164-S92.
58. Peduzzi P, Concato J, Kemper E, Holford T, Feinstein A. A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology 1996;49:1373-9.
59. Concato J, Peduzzi P, Holford T, et al. Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy. Journal of Clinical Epidemiology 1995;48:1495-501.
60. Allison PD. Survival Analysis Using SAS: A Practical Guide. Second ed. Cary, NC: SAS Institute Inc; 2010.
61. Howard SC, Gajjar A, Ribeiro RC, et al. Safety of lumbar puncture for children with acute lymphoblastic leukemia and thrombocytopenia. JAMA 2000;284:2222-4.
62. Guideline for platelet transfusion thresholds for pediatric hematology/oncology patients: Complete reference guide. The C17 Guidelines Committee, 2010. (Accessed March 2011, at http://www.c17.ca/index.php?cID=86.)
63. Caruso V, Iacoviello L, Di Castelnuovo A, et al. Thrombotic complications in childhood acute lymphoblastic leukemia: a meta-analysis of 17 prospective studies comprising 1752 pediatric patients. Blood 2006;108:2216-22.
64. Kreppel D, Antoniadis G, Seeling W. Spinal hematoma: a literature survey with meta-analysis of 613 patients. Neurosurgical Review 2003;26:1-49.
65. van Veen JJ, Nokes TJ, Makris M. The risk of spinal haematoma following neuraxial anaesthesia or lumbar puncture in thrombocytopenic individuals. British Journal of Haematology 2010;148:15-25.

APPENDICES

APPENDIX I. Approach to Classifying TLP+ Status Varies Between SJCRH and COG Systems

In the SJCRH system, a TLP+ is defined as its own CNS group without further attempt to

distinguish the possible underlying status:

CNS1 Absence of blasts in CSF, and <10 RBC/µL

CNS2 <5 WBC/µL, <10 RBC/µL, blasts

CNS3 ≥5 WBC/µL, <10 RBC/µL, blasts

TLP- Absence of blasts in CSF, and ≥10 RBC/µL

TLP+ ≥10 RBC/µL, blasts
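
The five categories above can be expressed as a small decision function. The following is a minimal sketch that encodes only the thresholds listed in this appendix; the function name and argument names are hypothetical, and cell counts are assumed to be reported per µL of CSF.

```python
def classify_csf_sjcrh(wbc_per_ul, rbc_per_ul, blasts_present):
    """Classify a CSF sample using the SJCRH categories listed above.

    Hypothetical helper for illustration; it encodes only the thresholds
    stated in this appendix (10 RBC/µL and 5 WBC/µL).
    """
    if rbc_per_ul >= 10:
        # Traumatic LP: split only by presence of blasts.
        return "TLP+" if blasts_present else "TLP-"
    if not blasts_present:
        return "CNS1"
    # Non-traumatic with blasts: WBC count separates CNS2 from CNS3.
    return "CNS3" if wbc_per_ul >= 5 else "CNS2"


# Made-up example values
print(classify_csf_sjcrh(wbc_per_ul=2, rbc_per_ul=0, blasts_present=False))
print(classify_csf_sjcrh(wbc_per_ul=3, rbc_per_ul=250, blasts_present=True))
```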

In contrast, the COG system uses the following Steinherz/Bleyer algorithm for all patients with

TLP+ to try to distinguish between underlying CNS2 and CNS3 disease:

\[
\frac{\text{CSF WBC}}{\text{CSF RBC}} > 2 \times \frac{\text{Blood WBC}}{\text{Blood RBC}}
\]

A patient whose CSF WBC/RBC ratio is more than twice the blood WBC/RBC ratio is considered to

have CNS3 disease at diagnosis.
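
As an illustration, the ratio comparison can be written as a short function. This is a sketch under stated assumptions: the function and argument names are hypothetical, the example values are made up, and WBC and RBC counts are assumed to share the same units within each fluid so that the ratios are comparable.

```python
def meets_steinherz_bleyer(csf_wbc, csf_rbc, blood_wbc, blood_rbc):
    """Return True if the CSF WBC/RBC ratio exceeds twice the blood WBC/RBC ratio.

    Hypothetical helper for illustration. Intended for traumatic samples,
    where the CSF RBC count is non-zero.
    """
    if csf_rbc <= 0 or blood_rbc <= 0:
        raise ValueError("RBC counts must be positive to form the ratios")
    csf_ratio = csf_wbc / csf_rbc
    blood_ratio = blood_wbc / blood_rbc
    return csf_ratio > 2 * blood_ratio


# Made-up example: CSF 20 WBC and 500 RBC per µL;
# blood 10,000 WBC and 4,000,000 RBC per µL.
print(meets_steinherz_bleyer(20, 500, 10_000, 4_000_000))
```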

APPENDIX II. Risk Group Definitions for B-cell Precursor ALL

Risk Group               Definition

Standard Risk            Age 1.0 to 9.99 years
                         WBC < 50,000/µL

Standard risk-low        Not CNS2 or CNS3 or testicular disease
                         Favourable genetics
                         Day 8 peripheral blood MRD < 0.01%
                         Day 29 bone marrow MRD < 0.01%
                         No steroid pretreatment

Standard-risk average    No unfavourable genetics
                         Day 8 peripheral blood MRD ≥ 0.01% or CNS2 status
                         Day 29 bone marrow MRD < 0.01%
                         No CNS3 or testicular disease

High-Risk                Age < 1 or ≥ 10 years
                         WBC ≥ 50,000/µL
                         CNS3 or testicular disease
                         Steroid pretreatment

Very High-Risk ALL       BCR-ABL fusion
                         Hypodiploidy
                         Induction failure
                         MLL rearrangement and slow-early response after induction