age at any birth and breast cancer risk



*Correspondence to: Dr. Chris Robertson, Division of Epidemiology and Biostatistics, European Institute of Oncology, via Ripamonti 435, 20141 Milano, Italy. E-mail: crobert@ieo.cilea.it

CCC 0277-6715/98/040431-15$17.50. Received September 1996; Revised May 1997. © 1998 John Wiley & Sons, Ltd.

STATISTICS IN MEDICINE, VOL. 17, 431-445 (1998)

AGE AT ANY BIRTH AND BREAST CANCER RISK

CHRIS ROBERTSON1,2* AND PETER BOYLE1

1 Division of Epidemiology and Biostatistics, European Institute of Oncology, via Ripamonti 435, 20141 Milano, Italy

2 Department of Statistics and Modelling Science, Strathclyde University, 26 Richmond Street, Glasgow G1 1XH, Scotland

SUMMARY

This paper reviews previously published models of the effect of parity and age at any birth on breast cancer risk. It is shown that these models are conceptually similar and can be written within a general model. Various restrictions on the parameters of the general model yield the specific models. The models are applicable in case control studies and are illustrated using data from a case control study of breast cancer in Italy. © 1998 John Wiley & Sons, Ltd.

1. INTRODUCTION

Parity and age at first birth have long been identified as risk factors for breast cancer; relative to nulliparous women, parous women have a decreased risk. The risk of developing breast cancer is greater among women with a later age at first birth compared with women who had their first delivery at an earlier age.1

Recently some attention has been paid to the effects of the age at which women have their second and subsequent deliveries to investigate if the later deliveries also affect the risk of breast cancer. There is still no consensus about the effect of subsequent pregnancies although some studies indicate that multiparity has an independent protective effect against breast cancer.2

In one of the first studies of the effect of age at any delivery on breast cancer risk, Trichopoulos et al.3 demonstrated that age at first birth was the most important effect but that age at any birth had an independent and statistically significant effect. Furthermore the estimated effects of the second and subsequent births on breast cancer risk were similar to each other. Also, the parity effect was determined by the age of occurrence of the component pregnancies.

Some authors have suggested that age at last pregnancy may be even more important than age at the first one for breast cancer risk4,5 but this finding has been questioned lately.6,7 In these latter studies no significant effect of the age at last birth was found after adjustment for the effects of parity and age at first birth. Several studies have demonstrated that a full-term pregnancy exerts a short-term adverse and a long-term beneficial influence on breast cancer risk.2,8,9 This transient adverse effect is most pronounced for the first birth but is also present with subsequent births; the long-term benefit is also greater for the first birth but is also present in subsequent births.10

In most of the above studies there have been attempts to estimate the joint effects of age at any birth on breast cancer risk as well as age at last birth. In this paper we review and adapt previously published models for estimating the effect of age at any birth on breast cancer risk within case control studies. Subsequently, we derive a general model and show that the previously published models can all be considered submodels of this general one.

The interpretation of these models is illustrated with the Italian Breast Cancer Case Control Study.11-13 The study was carried out in seven centres in Italy between 1991 and 1994. The cases were women with incident, histologically confirmed breast cancer, and with no previously diagnosed cancer in any site. Overall, 2569 women were included with an age range of 23 to 74 and a median of 55 years. The controls were women with no history of cancer admitted to the major teaching hospitals and general hospitals in the same catchment areas as the cases. The controls were admitted for acute, non-neoplastic, non-gynaecological conditions which were not related to hormonal or digestive tract diseases or long-term dieting. There were 2588 controls, aged between 20 and 74 with a median age of 56 years. The cases and controls were not individually matched but the distributions of ages within the study centres were comparable.

There are 5157 women in the study but 35 are not included in this example because there is missing or inconsistent information on the ages at births, age at menarche or age at menopause. Fifteen per cent of women in the study are nulliparous. Age at first birth ranges from 14 to 44 years with a median of 25 years. Of parous women, 38 per cent had their first birth aged 23 or younger. The range of ages at second birth is from 15 to 49 years with a median of 28 years. Sixty-eight per cent of women in the study had two or more births.

The aim of the modelling analysis is to separate out the effects of the second and subsequent births on the odds of breast cancer once the effect of the first birth has been taken into account. In all of this work we use a logistic regression model where the dependent variable $Y$ represents the case control status, with $Y = 1$ for a case. This is a linear model for the log odds of being a case, $\ln(p/(1-p))$, where $p$ is the probability of being a case. We also assume that the cases and controls are not individually matched.

2. MODELS

Trichopoulos et al.3 published the first model for estimating the effects of age at any birth on breast cancer risk. The age at diagnosis for cases and age at interview for controls is denoted by $t$. We let $t_i$ represent the age at the time of the $i$th birth, and assume that there are $s$ births; $t_i$ is not defined if there are fewer than $i$ births, $i = 1, \ldots, s$. The model can be written as

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 t + \sum_{i=1}^{s} \beta_i p_{i+} + \sum_{i=1}^{s} \alpha_i t_i p_{i+}. \tag{1}$$

The variables $p_{i+}$ are indicator variables taking the value 1 if there are at least $i$ births and zero otherwise. The term $t_i p_{i+}$ just serves to ensure that there is no contribution to the model for $\alpha_i$ from women with fewer than $i$ births. Also, if $s = 0$, as it will be for nulliparous women, the two summation terms do not contribute to the model.
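As a minimal sketch of how the indicator terms in model (1) operate, the function below evaluates the linear predictor for one woman. The function name and every coefficient value are hypothetical and chosen only for illustration; they are not estimates from the study.

```python
from typing import Sequence


def log_odds_model1(t: float, birth_ages: Sequence[float],
                    mu: float, gamma1: float,
                    beta: Sequence[float], alpha: Sequence[float]) -> float:
    """Linear predictor of model (1) for a woman aged t with the given ages at birth."""
    eta = mu + gamma1 * t
    # p_{i+} = 1 exactly for the births the woman has had, so looping over her
    # births adds beta_i + alpha_i * t_i only where the indicator is 1.
    # zip stops at the shortest input, so births beyond s are ignored.
    for beta_i, alpha_i, t_i in zip(beta, alpha, birth_ages):
        eta += beta_i + alpha_i * t_i
    return eta


# Hypothetical coefficients (assumptions, not study estimates).
mu, gamma1 = -0.5, 0.01
beta = [-1.2, -0.2]
alpha = [0.05, 0.01]

nulliparous = log_odds_model1(50, [], mu, gamma1, beta, alpha)
uniparous = log_odds_model1(50, [24], mu, gamma1, beta, alpha)
biparous = log_odds_model1(50, [24, 28], mu, gamma1, beta, alpha)
```

For a nulliparous woman the loop body never runs, which mirrors the statement that the two summation terms do not contribute when $s = 0$.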


For parities 0, 1 and 2 the model is

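$$\ln\left(\frac{p}{1-p}\right) = \begin{cases} \mu + \gamma_1 t & (s = 0) \\ \mu + \gamma_1 t + \beta_1 + \alpha_1 t_1 & (s = 1) \\ \mu + \gamma_1 t + \beta_1 + \beta_2 + \alpha_1 t_1 + \alpha_2 t_2 & (s = 2). \end{cases}$$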

In this model $\beta_i$ represents the log relative risk of breast cancer for the $i$th birth relative to the $(i-1)$th. Thus $\beta_1$ represents the log relative risk of the first birth relative to no births and $\beta_2$ represents the log relative risk of two or more births relative to one birth at $t_1 = 0$ and $t_2 = 0$. An increase in age at first birth of one year is associated with an increase in the log odds of being a case of $\alpha_1$ units, and $\alpha_2$ represents the effect of the age at second birth given the age at first birth.

This is not the way the model was parameterized when it was first introduced3 though it is used by Decarli et al.13 Trichopoulos et al. wrote

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 t + \sum_{i=1}^{s} \beta'_i p_i + \sum_{i=1}^{s} \alpha_i t_i p_{i+} \tag{2}$$

where $p_i$ are indicator variables taking the value one if there are exactly $i$ births and zero otherwise. These two representations of the model are equivalent and it is easy to see that

$$\beta'_j = \sum_{i=1}^{j} \beta_i$$

for $j = 1, \ldots, s$. The equivalence holds provided there are dummy variables for all $s$ parities in the data. Within this representation $\beta'_j$ is the log relative risk of the $j$th birth relative to no births.
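The link between the two parameterizations is a cumulative sum, and it can be inverted by differencing. The sketch below shows both directions; the function names and the numerical values are hypothetical.

```python
from itertools import accumulate


def cumulative_parity_effects(beta):
    """beta'_j = beta_1 + ... + beta_j, the 'exactly j births' effects of (2)."""
    return list(accumulate(beta))


def incremental_parity_effects(beta_prime):
    """Inverse map: beta_1 = beta'_1 and beta_j = beta'_j - beta'_{j-1}."""
    previous = [0.0] + list(beta_prime[:-1])
    return [b - p for b, p in zip(beta_prime, previous)]


beta = [-1.2, -0.2, 0.1]                       # hypothetical beta_1..beta_3
beta_prime = cumulative_parity_effects(beta)   # effects relative to no births
recovered = incremental_parity_effects(beta_prime)
```

The round trip only works when every parity from 1 to $s$ has its own dummy variable, which is the condition stated in the text.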

In the uncentred version the effect of a birth at age $t_i$ is $\beta_i + \alpha_i t_i$ and strictly $\beta_i$ is the effect of the $i$th birth at age zero and so does not have a reasonable interpretation. If the ages at the births are centred about their means then the interpretation of the $\beta_i$ parameters will change. In the centred version, $\beta_i$ is the log relative risk of the $i$th birth relative to birth $(i-1)$ when the birth occurs at the mean age for the $i$th birth.

In model (1) the relative risk of breast cancer for a uniparous woman with a birth at age $t_1$ compared with a nulliparous woman of the same age is

$$e^{\beta_1 + \alpha_1 t_1}.$$

Compared with a uniparous woman, aged $t$, with a birth at $t_1$, a biparous woman with two births at $t'_1$ and $t'_2$ and the same calendar age has a relative risk of

$$e^{\beta_2 + \alpha_1 (t'_1 - t_1) + \alpha_2 t'_2}.$$

Hsieh and Lan14 investigated the effect of the age at which women gave birth through a time-dependent model of disease risk. They also developed their model through a stratification to compare parity 1 with parity 0, and parity 2 with parity 1. For nulliparous versus uniparous women, their model for estimating the effect of age at any birth on breast cancer risk is

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 t + \beta_1 p_1 + \alpha_1 t_1 p_{1+} + \delta_1 t p_{1+} + \rho_1 t t_1 p_{1+}. \tag{3}$$

The model includes differential terms for the age at diagnosis for cases, interview for controls, for nulliparous and uniparous women, and this is the means by which time-dependent effects are included. In the comparison of uniparous with biparous women the model is extended to:

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 t + \beta_2 p_{2+} + \alpha_1 t_1 p_{1+} + \alpha_2 t_2 p_{2+} + \delta_2 t p_{2+} + \rho_1 t t_1 p_{1+} + \rho_2 t t_2 p_{2+}. \tag{4}$$

Within the above two stratified models, (3) and (4), there are common terms, $\gamma_1 t$, $\alpha_1 t_1 p_{1+}$ and $\rho_1 t t_1 p_{1+}$, which will have different estimated values but are measuring essentially the same thing. By combining the two models and assuming common values over the two strata a single model can be derived which can be extended to $s$ births.

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 t + \sum_{i=1}^{s} \beta_i p_{i+} + \sum_{i=1}^{s} \alpha_i t_i p_{i+} + \sum_{i=1}^{s} \delta_i t p_{i+} + \sum_{i=1}^{s} \rho_i t t_i p_{i+}. \tag{5}$$

This model implies that the relative risk of breast cancer for a uniparous woman with a birth at age $t_1$ compared with a nulliparous woman of the same age is:

$$e^{\beta_1 + \alpha_1 t_1 + \delta_1 t + \rho_1 t t_1}.$$

Compared with a uniparous woman with a birth at $t_1$, a biparous woman of the same age with two births at $t'_1$ and $t'_2$ has a relative risk of

$$e^{\beta_2 + \alpha_1 (t'_1 - t_1) + \alpha_2 t'_2 + \delta_2 t + \rho_1 t (t'_1 - t_1) + \rho_2 t t'_2}.$$

Algebraically these are the same equations as would be derived from (3) and (4). Their estimated values may differ as the general model assumes that some effects take common values across the parity strata.
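The two relative risks implied by model (5) depend on current age $t$ as well as on the ages at birth. The sketch below writes them as functions so that this age dependence can be explored; the function names and the coefficient values in the example calls are hypothetical, not estimates from the study.

```python
import math


def log_rr_first_birth(t, t1, beta1, alpha1, delta1, rho1):
    """Log relative risk: uniparous (birth at t1) vs nulliparous, both aged t."""
    return beta1 + alpha1 * t1 + delta1 * t + rho1 * t * t1


def log_rr_second_birth(t, t1, t1p, t2p,
                        beta2, alpha1, alpha2, delta2, rho1, rho2):
    """Log relative risk: biparous (births at t1p, t2p) vs uniparous (birth at t1)."""
    return (beta2 + alpha1 * (t1p - t1) + alpha2 * t2p + delta2 * t
            + rho1 * t * (t1p - t1) + rho2 * t * t2p)


# Hypothetical coefficients, for illustration only.
first = log_rr_first_birth(t=50, t1=25,
                           beta1=-5.0, alpha1=0.2, delta1=0.07, rho1=-0.003)
second = log_rr_second_birth(t=50, t1=25, t1p=25, t2p=28,
                             beta2=0.8, alpha1=0.2, alpha2=0.02,
                             delta2=-0.02, rho1=-0.003, rho2=-0.0001)
rr_first, rr_second = math.exp(first), math.exp(second)
```

When both women had their first birth at the same age ($t'_1 = t_1$), the $\alpha_1$ and $\rho_1$ terms cancel and only the second-birth terms remain.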

Rosner et al.10 published a modification of a mathematical model for breast cancer incidence which was originally published by Pike.15 Subsequently, they developed a log incidence model.16 These models are more involved than the ones of Hsieh and Lan14 and that of Trichopoulos et al.3 in that they are based on the concept of breast tissue ageing. Both of these models were developed for cohort studies where estimates of incidence can be obtained. We propose to adapt the form of the Rosner et al.10 model for use in case control studies.

Within a cohort study the model for breast cancer incidence at age $t$, $I(t)$, is written as $I(t) = [d(t)]^k$, where $d(t)$ denotes the breast tissue age at calendar age $t$ and $k$ is an exponent determined by the rate of increase of breast cancer incidence with breast tissue age. Breast tissue age is written as a linear function

$$d(t) = \gamma_1 (t - t_0) + \beta_1 p_{1+} + \alpha_1 (t - t_1) p_{1+} + \beta_2 p_{2+} + \alpha'_2 \sum_{i=2}^{s} (t - t_i) p_{i+} + \gamma_2 p_m + \gamma_3 (t - t_m) p_m.$$

The age at menarche is denoted $t_0$ and in nulliparous women breast tissue is assumed to age at a rate $\gamma_1$ per year since menarche. There is no ageing of the breast tissue before menarche. The variable $p_m$ is an indicator of menopausal status and takes the value one for post-menopausal women, zero otherwise. The age at menopause is denoted $t_m$ and is defined for post-menopausal women only.

We write

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_1 (t - t_0) + \sum_{i=1}^{2} \beta_i p_{i+} + \alpha_1 (t - t_1) p_{1+} + \alpha'_2 \sum_{i=2}^{s} (t - t_i) p_{i+} + \gamma_2 p_m + \gamma_3 (t - t_m) p_m. \tag{6}$$


This is not an attempt to model the odds of being a case as a function of breast tissue age but an attempt to model breast cancer risk at a particular age as a function of the number of years since the births. Essentially, we propose to use the functional form of the model to relate the number of years since an event to the log odds of breast cancer.

If we take three women of the same age and with the same ages at menarche and at menopause, with one nulliparous, the second uniparous with a birth at age $t_1$ and the third biparous with births at $t'_1$ and $t'_2$, then the relative risk of breast cancer of the uniparous woman to the nulliparous one is given by

$$e^{\beta_1 + \alpha_1 (t - t_1)}$$

and the relative risk when comparing the biparous woman with the uniparous one is

$$e^{\beta_2 + \alpha_1 (t_1 - t'_1) + \alpha'_2 (t - t'_2)}.$$

The log incidence model16 is very similar to (6) and in a case control setting is

$$\ln\left(\frac{p}{1-p}\right) = \mu + \gamma_0 t_0 + \gamma_1 (t^* - t_0) + \alpha_1 (t - t_1) p_{1+} + \alpha'_2 \sum_{i=1}^{s} (t^* - t_i) p_{i+} + \gamma_3 \sum_{i=1}^{s} (t^* - t_i) p_{i+} (t - t_m) p_m. \tag{7}$$

This has many similarities to their previous model but one area of difference is the introduction of $t^* = \min(t, t_m)$. A second difference is the interaction term modifying the slope among post-menopausal women, while a third is in the term for age at menarche $t_0$. These points of difference mean that the log incidence model does not fit into the general framework of the first three models. For this reason we will not attempt to include it as part of the general framework but will return to it later once the relationships among the other three models have been identified and an example discussed.

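Expanding (6) leads to

$$\ln\left(\frac{p}{1-p}\right) = \mu - \gamma_1 t_0 + \left(\gamma_1 + \alpha_1 p_{1+} + \gamma_3 p_m + \alpha'_2 \sum_{i=2}^{s} p_{i+}\right) t + \sum_{i=1}^{2} \beta_i p_{i+} - \alpha_1 t_1 p_{1+} - \alpha'_2 \sum_{i=2}^{s} t_i p_{i+} + \gamma_2 p_m - \gamma_3 t_m p_m. \tag{8}$$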

This has the same general pattern as the models of Trichopoulos et al.3 and Hsieh and Lan14 with terms in $t$, $t_i$, $p_{i+}$ plus extra terms in $t_0$, $p_m$ and $t_m$.

The models (1), (5) and (6) have a number of elements in common and a general model can be defined:

$$\ln\left(\frac{p}{1-p}\right) = \mu^g + \gamma^g_0 t_0 + \gamma^g_1 t + \sum_{i=1}^{s} \beta^g_i p_{i+} + \sum_{i=1}^{s} \alpha^g_i t_i p_{i+} + \sum_{i=1}^{s} \delta^g_i t p_{i+} + \sum_{i=1}^{s} \rho^g_i t t_i p_{i+} + \gamma^g_2 p_m + \gamma^g_3 t_m p_m. \tag{9}$$

In this general model, which is essentially the modified Hsieh and Lan model (5), the terms for age at menarche, age at menopause and menopausal status are separate from those concerned with the presence or absence of births and age at these births. While neither the Trichopoulos et al. model3 nor the Hsieh and Lan model14 has terms associated with menarche and menopause, this does not preclude the inclusion of these terms. While discussing the equivalence of the models these terms will not be touched on specifically.

Within model (9), setting $\delta^g_i = 0$ and $\rho^g_i = 0$, for $i = 1, \ldots, s$, yields the Trichopoulos et al. model (1) with the addition of the terms for menarche and menopause. With restrictions only on $\gamma^g_0 = 0$, $\gamma^g_2 = 0$ and $\gamma^g_3 = 0$ we have the extended model of Hsieh and Lan, model (5). The Rosner et al. model is derived by setting

(i) $\beta^g_i = 0$, for $i = 3, \ldots, s$;

(ii) $\delta^g_i = -\alpha^g_i$, for $i = 1, \ldots, s$;

(iii) $\alpha^g_2 = \cdots = \alpha^g_s$;

(iv) $\rho^g_i = 0$, for $i = 1, \ldots, s$.

This gives a framework for investigating the differences between the models with reference to data from a case control study. By fitting the general model and testing which of the various restrictions apply it will be possible to assess the relative merits of the different models. The generalized model of Hsieh and Lan, model (5), is the most involved of the three component models with interaction terms between age and age at any birth and between age and parity. The modification of the Rosner et al. model, (6), is less involved as it does not have any interaction between age and age at any birth. Also the effects of births other than the first are assumed equal. The model of Trichopoulos et al., model (1), is simply an additive model with no interactions between parity or age at any birth with age.

3. ITALIAN BREAST CANCER STUDY

In the Italian Breast Cancer Case Control Study,11,12 and many case control studies of breast cancer risk, the age distributions of the cases and controls are matched. Thus the coefficients of age, $t$, in the models will not reflect the true risk of breast cancer with increasing age and so will not have a ready interpretation. All that we will be able to achieve is the estimation of the effect of age as a modifier to the effects of age at any birth and the number of births. Specifically, this means that we can interpret the coefficients of the interaction terms involving age, in terms of odds ratios at a given age, but not the main effect of age.17

The estimates of all the logistic regression models were obtained using SAS. The study was a multi-centre study and the number of cases and controls were balanced in the centres, with matching age distributions. Consequently, dummy variables for the centres were included in all models. These were the only variables included, other than those specified in the models. The data are used solely to illustrate the models and the relationships between them. Consequently, this is not a definitive analysis of the effect of age at any birth for this case control study.13

3.1. Women with 0, 1 or 2 Births Only

Initially we fit the models to all women with fewer than three births. This uses 3720 out of the 5122 women in the study and focuses on the area in which most information is available. Also, the stratifications involved in the Hsieh and Lan14 models are nulliparous and uniparous, and uniparous and biparous, which together involve all women with fewer than three births. The estimates from models (1), (5), (6) and (9) are all presented in Table I. Where appropriate, terms involving age at menarche and menopause have been incorporated in the models; dummy variables were also included for the centres.

Table I. Parameter estimates and standard errors: women with parity 0, 1 or 2 only. Entries are estimate (standard error); n/a marks a term that is not in the model.

Parameter   Variable                        Model (9)          Model (1)          Model (1)          Model (5)          Model (6)
$\gamma_1$  Age                             0·027 (0·006)      -0·002 (0·003)     0·013 (0·005)      0·016 (0·005)      -0·001 (0·022)
$\gamma_0$  Age at menarche                 -0·066 (0·021)     n/a                -0·067 (0·021)     n/a                0·059 (0·021)
$\alpha_1$  Age Birth 1*Parity 1+           0·223 (0·052)      0·050 (0·010)      0·049 (0·010)      0·217 (0·052)      -0·024 (0·006)
$\alpha_2$  Age Birth 2*Parity 2+           0·023 (0·064)      0·006 (0·012)      0·007 (0·012)      0·018 (0·064)      -0·014 (0·007)
$\beta_1$   Parity 1+                       -5·171 (1·445)     -1·227 (0·289)     -1·244 (0·291)     -4·895 (1·434)     0·722 (0·195)
$\beta_2$   Parity 2+                       0·665 (1·965)      -0·204 (0·380)     -0·217 (0·383)     0·797 (1·954)      0·247 (0·170)
$\delta_1$  Age*Parity 1+                   0·070 (0·026)      n/a                n/a                0·065 (0·025)      n/a
$\delta_2$  Age*Parity 2+                   -0·020 (0·036)     n/a                n/a                -0·022 (0·035)     n/a
$\rho_1$    Age*Age Birth 1*Parity 1+       -0·003 (0·001)     n/a                n/a                -0·003 (0·001)     n/a
$\rho_2$    Age*Age Birth 2*Parity 2+       -2×10^-4 (0·001)   n/a                n/a                -1×10^-4 (0·001)   n/a
$\gamma_3$  Age Menopause*Post-menopause    0·039 (0·009)      n/a                0·035 (0·009)      n/a                -0·041 (0·007)
$\gamma_2$  Post-menopause                  -2·256 (0·411)     n/a                -2·147 (0·410)     n/a                -0·283 (0·115)

The principal interest here is the assessment of the effect of the second birth over and above the effect of the first birth. Also we wish to test if any of the restrictions on the general model are valid.

The estimates for the general model suggest that while age at menarche, menopausal status and age at menopause are all important determinants of the odds of being a case they do not confound the effect of the births and ages at these births. This is most clearly seen for model (1), where the estimates of $\beta_1$, $\beta_2$, $\alpha_1$ and $\alpha_2$ are virtually identical with and without the addition of age at menarche and age at menopause. The same patterns can be seen in the estimates from the extended Hsieh and Lan model, model (5), and the general model, model (9), though there are slight discrepancies here.

Within the general model the estimated effects of the interactions of age at diagnosis/interview with parity 1+ and age at first birth are clearly highly significant and this would suggest that the simple Trichopoulos et al. model is not valid. There is evidence, at least, that $\delta^g_1 \neq 0$ and $\rho^g_1 \neq 0$. The Rosner et al. model is derived from the general model by, among others, setting $\alpha^g_i = -\delta^g_i$, $i = 1, \ldots, s$. While this is plausible for the second birth as the estimates from the general model are $\hat\alpha^g_2 = 0.023$ and $\hat\delta^g_2 = -0.020$, it is not so for the first birth. A second requirement is that $\rho^g_i = 0$ and again this is plausible for the second birth but not, as already noted, for the first birth. This suggests that the extended Hsieh and Lan model is the most appropriate one for these data. This is the case principally through the influence of the interaction between age at diagnosis/interview and age at first birth among parous women.

There are four terms in the general model which are associated with the second birth and omitting them results in a likelihood ratio test statistic of 12·09 on 4 d.f. This suggests that among those women with 0, 1 or 2 births there is a contribution to the odds of being a case from the second birth, conditional on the first. Adopting a backwards selection procedure results in dropping the age at diagnosis interaction with age at second birth ($\rho^g_2 = 0$). Thereafter, one can either drop age at second birth (set $\alpha^g_2 = 0$) or equate $\alpha^g_2 = -\delta^g_2$, to get a $(t - t_2) p_{2+}$ term as in the Rosner et al. model, with the latter being the slightly better option.

The parameter estimates for the Hsieh and Lan models (3) and (4) are presented in Table II. Model (3) is fitted to the 1855 women with zero or one birth, while uniparous and biparous women (2942 in total) are used for the estimates of the parameters of model (4). Although age is common to both models we will not compare the similarity of the estimate of $\gamma^g_1$ over the two data sets as the data were frequency matched on age. There are two parameters where it is valid to assess equality of effects as the extended model (5) is based upon this assumption. They are $\alpha_1$ and $\rho_1$. Using the nulliparous and uniparous women the estimates are 0·27 (0·06) and -0·0038 (0·0011), respectively, while for the uniparous and biparous women they are 0·22 (0·05) and -0·0030 (0·0009), respectively. There is no formal way of testing the equality of these two coefficients as the models are not nested. Also as the uniparous women are common to both data sets the estimates are correlated. However the correlation would have to be in excess of 0·9 for there to be evidence of significant differences in the coefficients in the two models using t-tests.
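The remark about the correlation can be checked with a short calculation. The sketch below assumes a two-sided test at the 5 per cent level with the normal critical value 1·96 standing in for the t quantile, and uses the rounded estimates and standard errors quoted in the text; the function name is hypothetical.

```python
def min_correlation_for_significance(b1, se1, b2, se2, z_crit=1.96):
    """Smallest correlation r between two estimates at which their difference
    reaches |z| = z_crit, using Var(b1 - b2) = se1^2 + se2^2 - 2*r*se1*se2."""
    diff = b1 - b2
    max_var = (diff / z_crit) ** 2          # variance at which |z| equals z_crit
    return (se1 ** 2 + se2 ** 2 - max_var) / (2 * se1 * se2)


# Age at first birth, models (3) and (4): 0.27 (0.06) versus 0.22 (0.05).
r_alpha = min_correlation_for_significance(0.27, 0.06, 0.22, 0.05)

# Age by age-at-first-birth interaction: -0.0038 (0.0011) versus -0.0030 (0.0009).
r_rho = min_correlation_for_significance(-0.0038, 0.0011, -0.0030, 0.0009)
```

Under these assumptions both thresholds come out slightly above 0·9, in line with the statement in the text.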

Table II. Parameter estimates and standard errors for Hsieh and Lan model: women with parity 0, 1 or 2 only. Entries are estimate (standard error); n/a marks a term that is not in the model.

Parameter   Variable                        Model (3)          Model (4)          Model (5)
                                            Parity 0 or 1      Parity 1 or 2      Extended
$\gamma_1$  Age                             0·016 (0·006)      0·080 (0·025)      0·016 (0·005)
$\alpha_1$  Age Birth 1*Parity 1+           0·270 (0·065)      0·217 (0·052)      0·217 (0·052)
$\alpha_2$  Age Birth 2*Parity 2+           n/a                0·017 (0·064)      0·018 (0·064)
$\beta_1$   One plus deliveries             -6·302 (1·755)     n/a                -4·895 (1·434)
$\beta_2$   Two plus deliveries             n/a                0·825 (1·953)      0·797 (1·954)
$\delta_1$  Age*Parity 1+                   0·087 (0·031)      n/a                0·065 (0·025)
$\delta_2$  Age*Parity 2+                   n/a                -0·023 (0·035)     -0·022 (0·035)
$\rho_1$    Age*Age Birth 1*Parity 1+       -0·004 (0·001)     -0·003 (0·001)     -0·003 (0·001)
$\rho_2$    Age*Age Birth 2*Parity 2+       n/a                -1×10^-4 (0·001)   -1×10^-4 (0·001)

Within the extended Hsieh and Lan model (5), it is possible to assess the contribution of those with exactly two births to the effect of $t_1 p_{1+}$. This is achieved by constructing an interaction term $t_1 p_{1+} p_{2+}$. To an extent this is a test of the equality of the $\alpha_1$ parameters in the two separate models (3) and (4). If the coefficient of the interaction term is non-zero then this suggests that the effect of age at first birth is different for those with two births compared with those with only one birth. Biologically this does not seem plausible. A similar interaction term can be constructed for $t t_1 p_{1+}$. This assesses the equality of $\rho_1$ in models (3) and (4). The likelihood ratio test statistic for the two interaction terms is 3·1 on 2 degrees of freedom and we conclude that the common parameters take equal values over the two models and thus the extended Hsieh and Lan model (5) is valid.
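The likelihood ratio statistics quoted in this section can be converted to p-values by referring them to a chi-square distribution. The following is a minimal sketch using only the standard library; the function `chi2_sf` is a hypothetical helper written for this illustration (the original analysis used SAS), and it handles integer degrees of freedom only.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{k=0}^{df/2 - 1} h^k / k!
        return math.exp(-h) * sum(h ** k / math.factorial(k)
                                  for k in range(df // 2))
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{k=1}^{(df-1)/2} h^(k-1/2) / Gamma(k+1/2)
    tail = math.erfc(math.sqrt(h))
    tail += math.exp(-h) * sum(h ** (k - 0.5) / math.gamma(k + 0.5)
                               for k in range(1, (df - 1) // 2 + 1))
    return tail


p_second_birth = chi2_sf(12.09, 4)   # four second-birth terms omitted
p_common = chi2_sf(3.1, 2)           # two interaction terms in model (5)
```

The first statistic gives a p-value a little below 0·02 and the second a p-value above 0·2, which matches the conclusions drawn in the text.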

3.2. All Women

Within this study the maximum number of births to any one woman was 8 and only 37 women had more than 6 births (11 out of 2551 cases and 26 out of 2571 controls). Consequently we restrict the general model where individual ages at any birth are considered to a maximum of $s = 6$ births. This means that the ages at births 7 and 8 are ignored.

This model has a very large number of parameters, 4 for the $\gamma^g$ terms, 6 each for the $\alpha^g$, $\delta^g$ and $\rho^g$ terms, and 8 for the $\beta^g$ terms. The -2 log-likelihood statistic takes the value 6880 and 36 parameters, including 6 for the centres, are estimated in addition to the intercept. Restricting attention to only the first two births, by omitting all parameters with subscripts greater than 2, but using all 5122 women and comparing the likelihood with that of the general model (9), results in a test statistic of 14·3 on 18 degrees of freedom. This suggests that it is not necessary to take into account births other than the first two.
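The parameter count and the degrees of freedom can be checked by simple bookkeeping. The sketch below assumes that the model restricted to the first two births keeps two $\beta^g$ terms alongside $\alpha^g_i$, $\delta^g_i$ and $\rho^g_i$ for $i = 1, 2$; the function name is hypothetical.

```python
def count_general_model(s, n_beta, centres):
    """Number of parameters in model (9), excluding the intercept."""
    gammas = 4                 # gamma_0, gamma_1, gamma_2, gamma_3
    age_at_birth_terms = 3 * s  # alpha_i, delta_i and rho_i for i = 1..s
    return gammas + age_at_birth_terms + n_beta + centres


full = count_general_model(s=6, n_beta=8, centres=6)       # all women, s = 6
first_two = count_general_model(s=2, n_beta=2, centres=6)  # subscripts <= 2 only
df = full - first_two   # degrees of freedom of the likelihood ratio test
```

The counts reproduce the 36 parameters and the 18 degrees of freedom given in the text.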

Using all the data is a more powerful test of the adequacy of the Rosner et al. model (6) than the reduced data above. The test of the global restrictions (i)-(iv) at the end of Section 2 has a log-likelihood ratio test statistic of $X^2 = 56.2$, on 22 d.f., which indicates that these restrictions are not valid. The assumption that $\alpha^g_i = -\delta^g_i$ for $i = 1, \ldots, 6$ is not tenable, $X^2 = 37.4$, on 6 d.f., though it is reasonable for $i = 2, \ldots, 6$, $X^2 = 1.6$, on 5 d.f. The assumption that $\rho^g_i = 0$ for $i = 2, \ldots, 6$ is valid, $X^2 = 1.7$, on 5 d.f., but not $\rho^g_1 = 0$, $X^2 = 14.1$, on 1 d.f. Another assumption about the Rosner et al. model is that $\alpha_2 = \cdots = \alpha_6$. The test statistic has a p-value of 0·042 which suggests that the time from later births may not contribute an equal amount. However, testing $\alpha_i = 0$, $i = 3, \ldots, 6$ gives $X^2 = 5.8$, on 4 d.f.

The above comparisons suggest that the assumptions inherent in a Rosner et al. type model are not completely valid here. The main areas in which they break down are in the interaction between age and age at first birth, $\rho^g_1$, and in the use of years since first birth rather than an interaction between age at diagnosis and parity, $\alpha^g_1 \neq -\delta^g_1$. To some extent these are the areas which are covered in the use of the log incidence model,16 model (7), where interactions with age at menopause are also considered.

Table III. Parameter estimates and standard errors for extended Hsieh and Lan model: all women

Parameter   Variable                          Model (9)
                                              Estimate    Standard error
$\gamma_1$  Age                               0·025       0·006
$\gamma_0$  Age at menarche                   -0·050      0·018
$\alpha_1$  Age Birth 1*Parity 1+             0·249       0·051
$\alpha_2$  Age Birth 2*Parity 2+             0·034       0·055
$\beta_1$   One plus deliveries               -5·839      1·397
$\beta_2$   Two plus deliveries               0·609       1·665
$\delta_1$  Age*Parity 1+                     0·084       0·025
$\delta_2$  Age*Parity 2+                     -0·028      0·030
$\rho_1$    Age*Age Birth 1*Parity 1+         -0·004      0·001
$\rho_2$    Age*Age Birth 2*Parity 2+         -1×10^-4    0·001
$\gamma_3$  Age Menopause*Post-menopause      0·038       0·007
$\gamma_2$  Post-menopause                    -2·175      0·345

The value of -2 log-likelihood for model (9) with the global restrictions (i)-(iv) for the Rosner et al. model at the end of Section 2 is 6940 with 14 parameters estimated whereas the value of -2 log-likelihood for model (6) is 6920 again with 14 parameters. The differences are due to the use of age, age at menarche, ages at the births of any children and age at menopause in (9) compared with age, years since menarche, years since the births of any children and years since menopause in (6) and the matching on age in the case control study. This means that there is one more variable included in the Rosner et al. model (6), albeit with a constrained parameter, namely the interaction between age and menopausal status, $t p_m$. Inclusion of this variable in the general model explains the discrepancy in the log-likelihoods.

The parameter estimates of model (5), restricted to $s = 2$, are presented in Table III. These estimates were obtained from all women. The odds of breast cancer decrease with increasing age at menarche, decrease after menopause but increase with later menopause. Relative to a nulliparous woman with the same age at menarche and menopause the odds of breast cancer for a woman with one birth at age $t_1$ is

$$e^{-5.8389 + 0.2491 t_1 + 0.0841 t - 0.0036 t t_1} = e^{-0.0036 (t - 69.2)(t_1 - 23.4) - 0.02}.$$
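The factored form on the right-hand side follows from completing the product in $t$ and $t_1$. The sketch below recovers the two centring ages and the constant from the four coefficients shown in the equation; the function name is hypothetical.

```python
def factor_bilinear(beta, alpha, delta, rho):
    """Rewrite beta + alpha*t1 + delta*t + rho*t*t1 as rho*(t - a)*(t1 - b) + c."""
    a = -alpha / rho                 # age t at which the slope in t1 vanishes
    b = -delta / rho                 # age at first birth at which the slope in t vanishes
    c = beta - alpha * delta / rho   # constant left over after factoring
    return a, b, c


a, b, c = factor_bilinear(beta=-5.8389, alpha=0.2491, delta=0.0841, rho=-0.0036)

# Both forms give the same log odds ratio at an arbitrary point, here t = 50, t1 = 30.
direct = -5.8389 + 0.2491 * 30 + 0.0841 * 50 - 0.0036 * 50 * 30
factored = -0.0036 * (50 - a) * (30 - b) + c
```

The centring ages explain the pattern described for Figure 1: the log odds ratio changes sign around a first birth at about 23 years and around a current age of about 69 years.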

The predicted odds ratios of breast cancer are presented in Figure 1 for selected ages at first birth. For women who have their first birth at an early age, less than 23, the odds of breast cancer will be lower than for nulliparous women at younger ages though the odds ratio will approach one and will be greater than one at older ages. If there is one late birth then there will be an increased odds of breast cancer and the ratio to nulliparous women will decrease to one and will go below it at ages over 68. The interpretation at older ages does conflict with current theories2,8–10 and the crossing over of the lines is partly a consequence of the linearity assumptions and the relatively large interaction effects with age and age at first birth. It is also an interpretation at the upper end of the observed age distribution, which ranges from 20 to 74, and one should be cautious about extrapolations of a linear model outside the observed range of the data.

Figure 1. Predicted odds of breast cancer for uniparous women compared with nulliparous women
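As a rough numerical check of the first-birth expression, the following Python sketch evaluates both printed forms of the exponent and the resulting odds ratio. The coefficients are copied from the text; the function names and the example ages are illustrative assumptions and not part of the original analysis.

```python
import math

# Coefficients of the printed first-birth term (copied from the text).
INTERCEPT = -5.8389
COEF_T1 = 0.2491      # age at first birth, t1
COEF_T = 0.0841       # current age, t
COEF_T_T1 = -0.0036   # interaction between t and t1


def log_or_expanded(t, t1):
    """Log odds ratio, uniparous vs nulliparous, in the expanded form."""
    return INTERCEPT + COEF_T1 * t1 + COEF_T * t + COEF_T_T1 * t * t1


def log_or_factored(t, t1):
    """The same quantity in the factored form printed in the text."""
    return -0.0036 * (t - 69.2) * (t1 - 23.4) - 0.02


def odds_ratio(t, t1):
    """Odds ratio relative to a nulliparous woman of the same age."""
    return math.exp(log_or_expanded(t, t1))


if __name__ == "__main__":
    for t1 in (20, 35):        # hypothetical early and late first births
        for t in (40, 74):     # ages inside the observed range 20 to 74
            print(t1, t, round(odds_ratio(t, t1), 2))
```

Because the printed coefficients are rounded, the two forms agree only approximately, to within about 0.01 on the log scale over ages 20 to 74.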

Relative to a woman with the same age at menarche, menopause and age at first birth the odds of breast cancer for a woman with a second birth at age $t_2$ are given by

$$e^{0.6091 + 0.0338\,t_2 - 0.0281\,t - 0.0001\,t\,t_2}.$$

The second birth is associated with an increase in the odds ratio of breast cancer at all ages relative to a uniparous woman (Figure 2). Women with a second birth have an increased odds of breast cancer in the years immediately following the birth which then decreases and passes through 1 about 20–25 years after the second birth.
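The age at which the second-birth odds ratio passes through 1 can be obtained by setting the printed exponent to zero and solving for age. The sketch below does this in Python; the coefficients are copied from the text, while the function names and the range of ages at second birth are assumptions made for illustration. The interaction coefficient is printed to only one significant figure, so the results are approximate.

```python
# Coefficients of the printed second-birth term (copied from the text).
INTERCEPT = 0.6091
COEF_T2 = 0.0338       # age at second birth, t2
COEF_T = -0.0281       # current age, t
COEF_T_T2 = -0.0001    # interaction between t and t2


def log_or_second_birth(t, t2):
    """Log odds ratio, biparous vs uniparous, at age t."""
    return INTERCEPT + COEF_T2 * t2 + COEF_T * t + COEF_T_T2 * t * t2


def crossing_age(t2):
    """Age t at which the log odds ratio is zero for a second birth at t2."""
    return (INTERCEPT + COEF_T2 * t2) / -(COEF_T + COEF_T_T2 * t2)


if __name__ == "__main__":
    for t2 in (25, 30, 35):    # hypothetical ages at second birth
        years = crossing_age(t2) - t2
        print(t2, round(crossing_age(t2), 1), round(years, 1))
```

For second births between ages 20 and 40 the computed interval falls between 20 and 25 years, in line with the statement in the text.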

4. INTERACTIONS WITH AGE AT MENOPAUSE

The final term in the log incidence model (7) is $\sum_{i=1}^{s} (t^* - t_i)\,p_{i+}\,(t - t_m)\,p_m$. This will only be in the model for post-menopausal women and for them will lead to interaction terms between age and age at any birth through $t\,t_i\,p_{i+}$ terms, between age and age at menopause among parous women through $t^*\,p_{i+}\,t = t_m\,p_{i+}\,t$, between age at any birth and age at menopause through $t_i\,p_{i+}\,t_m$. There will also be a quadratic term involving age at menopause $t^*\,t_m = t_m^2$.
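The origin of these terms can be seen by expanding one summand. As a sketch, write $t_m$ for age at menopause, $p_m$ for the post-menopausal indicator and $p_{i+}$ for the indicator of having had at least $i$ births (this reading of the subscripts is an assumption), and take $t^*$ to be the minimum of age and age at menopause. For a post-menopausal woman $t^* = t_m$ and $p_m = 1$, so that

$$(t_m - t_i)(t - t_m)\,p_{i+} = t\,t_m\,p_{i+} - t_m^2\,p_{i+} - t\,t_i\,p_{i+} + t_i\,t_m\,p_{i+}.$$

The four terms on the right are, in order, the interaction between age and age at menopause, the quadratic term in age at menopause, the interaction between age and age at birth $i$, and the interaction between age at birth $i$ and age at menopause.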

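If the definition of this model is changed by replacing $t^*$ with $t$ then we get

$$\ln\left(\frac{p}{1-p}\right) = \lambda + \gamma_0 t_0 + \gamma_1 (t - t_0) + \alpha_1 (t - t_1)\,p_{1+} + \alpha'_2 \sum_{i=1}^{s} (t - t_i)\,p_{i+} + \gamma_3 \sum_{i=1}^{s} (t - t_i)\,p_{i+}\,(t - t_m)\,p_m. \qquad (10)$$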


Figure 2. Predicted odds of breast cancer for biparous women compared with uniparous women

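We can see that there is a logical connection between the Rosner and Colditz model16 and the extended model of Hsieh and Lan.14 Model (10) can be rewritten as

$$\begin{aligned}
\ln\left(\frac{p}{1-p}\right) = {} & \lambda + (\gamma_0 - \gamma_1)\,t_0 + \gamma_1 t + \alpha_1 t\,p_{1+} + \alpha'_2 \sum_{i=1}^{s} t\,p_{i+} - \alpha_1 t_1\,p_{1+} - \alpha'_2 \sum_{i=1}^{s} t_i\,p_{i+} \\
& + \gamma_3 \sum_{i=1}^{s} t^2\,p_{i+}\,p_m - \gamma_3 \sum_{i=1}^{s} t\,p_{i+}\,t_m\,p_m - \gamma_3 \sum_{i=1}^{s} t_i\,p_{i+}\,t\,p_m + \gamma_3 \sum_{i=1}^{s} t_i\,p_{i+}\,t_m\,p_m. \qquad (11)
\end{aligned}$$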

Apart from the term in $t^2$ all the others are part of the general model. The main difference is that in the general model (9) the interactions between $t$ and $t_i$ are not modified by menopausal status. Either way, using $t$ or $t^*$, there will be a term $t\,t_i\,p_{i+}$ in the model. This is one of the reasons why the Hsieh and Lan extended model is preferred to the Rosner et al. model adapted to case control studies. In the log incidence model,16 again adapted to case control studies, the interaction term between age and age at any birth is only present for post-menopausal women through $\sum_{i=1}^{s} t_i\,p_{i+}\,t\,p_m$.
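The step from model (10) to model (11) is purely algebraic, and this can be checked numerically. The Python sketch below codes both forms and compares them at arbitrary inputs. The argument names (`lam`, `g0`, `g1`, `a1`, `a2p`, `g3` for the six coefficients, `tm` and `pm` for age at menopause and the post-menopausal indicator) and the helper `indicators` are hypothetical labels chosen for illustration; no fitted values from the paper are used.

```python
import random


def indicators(births, s):
    """p[i] = 1 if the woman has had at least i+1 births; tb[i] = age at that birth."""
    p = [1 if i < len(births) else 0 for i in range(s)]
    tb = [births[i] if i < len(births) else 0.0 for i in range(s)]
    return p, tb


def model10(t, t0, tm, pm, births, lam, g0, g1, a1, a2p, g3, s=2):
    """Log odds written in terms of years since each event, as in (10)."""
    p, tb = indicators(births, s)
    years = [(t - tb[i]) * p[i] for i in range(s)]
    return (lam + g0 * t0 + g1 * (t - t0) + a1 * years[0]
            + a2p * sum(years)
            + g3 * sum(y * (t - tm) * pm for y in years))


def model11(t, t0, tm, pm, births, lam, g0, g1, a1, a2p, g3, s=2):
    """The same log odds written in terms of ages, as in (11)."""
    p, tb = indicators(births, s)
    idx = range(s)
    return (lam + (g0 - g1) * t0 + g1 * t
            + a1 * t * p[0] + a2p * sum(t * p[i] for i in idx)
            - a1 * tb[0] * p[0] - a2p * sum(tb[i] * p[i] for i in idx)
            + g3 * sum(t * t * p[i] * pm for i in idx)
            - g3 * sum(t * p[i] * tm * pm for i in idx)
            - g3 * sum(tb[i] * p[i] * t * pm for i in idx)
            + g3 * sum(tb[i] * p[i] * tm * pm for i in idx))


if __name__ == "__main__":
    rng = random.Random(1)
    coefs = [rng.uniform(-1, 1) for _ in range(6)]   # arbitrary coefficients
    args = (60.0, 13.0, 50.0, 1, [24.0, 29.0])       # hypothetical woman
    print(model10(*args, *coefs), model11(*args, *coefs))
```

The two functions return the same value, up to floating-point error, for nulliparous, uniparous and biparous women and for either menopausal status.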

With an extension to model (9) it is possible to investigate if it is reasonable that the term $t\,t_1\,p_{1+}$ is only required for post-menopausal women. Age at first birth is used as it is the most important birth. Also, we use the model where $s = 2$ but using all the data. The inclusion of $t\,t_1\,p_{1+}\,p_m$ into model (9) yields $X^2 = 0.6$ on 1 d.f., while the omission of $t\,t_1\,p_{1+}$ with $t\,t_1\,p_{1+}\,p_m$ still included yields $X^2 = 15.1$ on 1 d.f. Thus the interaction of age and age at first birth should not be confined to post-menopausal women only.
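To put the two test statistics on a probability scale, the sketch below converts them to approximate p-values. It assumes that each statistic is referred to a chi-squared distribution with 1 d.f., as is usual for a likelihood ratio test of a single parameter; the function name is illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variate with 1 d.f.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    for label, stat in (("add t*t1*p1+*pm", 0.6), ("drop t*t1*p1+", 15.1)):
        print(label, stat, chi2_sf_1df(stat))
```

Under this assumption the first statistic corresponds to a p-value well above 0.05 and the second to one below 0.001, which matches the conclusion drawn in the text.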


5. DISCUSSION

This paper has demonstrated that the Hsieh and Lan14 stratified models for the effect of age at first and second births on the odds of breast cancer can be amalgamated into one model. This can then be extended to a general model which has the models of Trichopoulos et al.3 and Rosner et al.10 as special cases. The Trichopoulos et al. model simply has additive effects of a birth and the age at which that birth occurs, while in the extended Hsieh and Lan model both of these effects are modified by the age at diagnosis. The Rosner et al. model lies between these two models and one of the differences is that age at diagnosis modifies the effect of a birth but not the effect of the age at which that birth occurs.

Within this general framework it is possible to test the components of the extended model with a view to establishing which features of the models are important. With the data from the Italian Breast Cancer Case Control Study it would appear that one crucial aspect is the interaction between age at diagnosis and age at the first birth, which leads one to prefer the Hsieh and Lan framework over the other two.

This is not to say that the Rosner et al.10 or Rosner and Colditz16 models are not applicable in cohort studies. What we have done here is to take the Hsieh and Lan model and note the similarity in its structure to the models of Rosner et al. and so propose the use of the functional form of the latter model as a means of modelling the effect of age at any birth on the odds of developing breast cancer. In view of the results obtained here there is a case for extending the Rosner and Colditz16 log incidence model of breast cancer risk in cohort studies to include terms representing the interaction between age and age at first birth for all women and not just post-menopausal women.

Originally, the Rosner et al.10 model was developed for the rate of breast cancer incidence in a cohort. For rare diseases the effects estimated from a logistic regression for the odds are similar to those estimated from a Poisson regression for the rate18 and so the interpretations of the effects of the ages at the births should carry over. One difficulty with the interpretation of the effects estimated from model (6) is that if $t$ and $t - t_0$ are both included in the model, as in a case control study matched for age, then for a fixed age at menarche, $t_0$, the coefficients of both are associated with age and are not interpretable. This does not happen in the general model (9).

The functional form of the Rosner and Colditz log incidence model is not compatible with the extended model of Hsieh and Lan in view of the dependency of the effect of the age at births on the minimum of age or age at menopause. For pre-menopausal women the log incidence model has a very similar functional form to the Rosner et al. model and so would be a special case of the extended framework. The differences occur in post-menopausal women only. Here, there is a constant value to summarize the contribution of the age at any birth which does not change as a post-menopausal woman ages. This term modifies the effect of the number of years since menopause and so leads to the introduction of quadratic effects of age at menopause, interactions between age at the births and age at menopause and between age and age at menopause as well as the important interaction from the Hsieh and Lan model between age and age at any birth.

The log incidence model is a more complex model than the ones fitted here and does not fit into the extended framework. It would be possible to modify the extended model to base it on the minimum of age or age at menopause. However, in view of the importance, over all ages, of the interactions of age with both first birth and age at the first birth, the use of the minimum of age and age at menopause in the extended model would not appear to be warranted.


Within the Italian breast cancer study the evidence points to effects of first and second births on the odds of breast cancer. These effects are both in terms of the birth itself and also the age at which this birth takes place. Subsequent births do not appear to play any further role in the determination of the odds of breast cancer.

The determination of the effect of age at any birth is associated with tests for interactions in complex models. These tests do not have a great deal of power in small studies. This means that it is unlikely that it will be possible to distinguish between the models in small studies. Also it may be that there are residual effects of age at third and subsequent births but that there is little power in this study to detect them.

Also with case control studies there is usually matching on age between cases and controls, which means that it is impossible to estimate the influence of age on the odds of breast cancer. However, interactions with age can still be interpreted and all of the parameters in model (5) with the exception of cg1 have an interpretation. In particular, it is valid to interpret the interactions with age and births. Such difficulties of interpretation are not present in cohort studies. Also, if there is exact matching then the use of conditional regression models would normally be required. If the matching was carried out on variables which were measured then unconditional logistic regression could still be used provided these matching variables were included in the model.

ACKNOWLEDGEMENTS

The data for the Italian Breast Cancer Case Control Study were kindly supplied by Professor Carlo La Vecchia and Dr. Sylvia Franceschi. We also thank the referees for their assistance. This work was conducted within the framework of support from the Italian Association for Cancer Research.

REFERENCES

1. MacMahon, B., Cole, P., Lin, T. M., Lowe, C. R., Mirra, A. P., Ravnihar, B., Salber, E. J., Valaoras, V. G. and Yuasa, S. ‘Age at first birth and breast cancer risk’, Bulletin of the World Health Organization, 43, 209–221 (1970).

2. Kelsey, J. L., Gammon, M. D. and John, E. M. ‘Reproductive factors and breast cancer’, Epidemiologic Reviews, 15, 36–47 (1993).

3. Trichopoulos, D., Hsieh, C-C., MacMahon, B., Lin, T. M., Lowe, C. R., Mirra, A. P., Ravnihar, B., Salber, E. J., Valaoras, V. G. and Yuasa, S. ‘Age at any birth and breast cancer risk’, International Journal of Cancer, 31, 701–704 (1983).

4. Kalache, A., Maguire, A. and Thompson, S. G. ‘Age at last full term pregnancy and risk of breast cancer’, Lancet, 341, 33–36 (1993).

5. Kvale, G. and Heuch, I. ‘A prospective study of reproductive factors and breast cancer. II. Age at first and last birth’, American Journal of Epidemiology, 126, 842–850 (1987).

6. Hsieh, C-C., Chan, H-W., Lambe, M., Ekbom, A., Adami, H.-O. and Trichopoulos, D. ‘Does age at the last birth affect breast cancer risk?’, European Journal of Cancer, 32A, 118–121 (1996).

7. Cummings, P., Stanford, J. L., Daling, J. R., Weiss, N. S. and McKnight, B. ‘Risk of breast cancer in relation to the interval since last full term pregnancy’, British Medical Journal, 308, 1672–1674 (1994).

8. Lambe, M., Hsieh, C-C., Trichopoulos, D., Ekbom, A., Pavia, M. and Adami, H.-O. ‘Transient increase in the risk of breast cancer after giving birth’, New England Journal of Medicine, 331, 5–9 (1994).

9. Hsieh, C-C., Pavia, M., Lan, S.-J., Colditz, G. A., Ekbom, A., Adami, H.-O., Trichopoulos, D. and Willett, W. C. ‘Dual effect of parity on breast cancer risk’, European Journal of Cancer, 30A, 969–973 (1994).

10. Rosner, B., Colditz, G. A. and Willett, W. C. ‘Reproductive risk factors in a prospective study of breast cancer: the Nurses’ Health Study’, American Journal of Epidemiology, 139, 819–835 (1994).


11. Franceschi, S., Favero, A., La Vecchia, C., Negri, E., Dal Maso, L., Salvini, S., Decarli, A. and Giacosa, A. ‘Influence of food groups and food diversity on breast cancer risk in Italy’, International Journal of Cancer, 65, 140–144 (1996).

12. La Vecchia, C., Negri, E., Franceschi, S., Favero, A., Nanni, O., Filiberti, R., Conti, E., Montella, M., Veronesi, A., Ferraroni, M. and Decarli, A. ‘Hormone replacement treatment and breast cancer risk: a cooperative Italian study’, British Journal of Cancer, 72, 244–248 (1995).

13. Decarli, A., La Vecchia, C., Negri, E. and Franceschi, S. ‘Age at any birth and breast cancer in Italy’, International Journal of Cancer, 67, 187–189 (1996).

14. Hsieh, C-C. and Lan, S-J. ‘Assessment of post partum time dependent risk in case control studies: an application for examining age specific effect estimates’, Statistics in Medicine, 15, 1545–1556 (1996).

15. Pike, M. C. ‘Age related factors in cancers of the breast, ovary and endometrium’, Journal of Chronic Diseases, 40, 59S–69S (1987).

16. Rosner, B. and Colditz, G. A. ‘Nurses’ Health Study: Log-incidence mathematical model of breast cancer incidence’, Journal of the National Cancer Institute, 88, 359–364 (1996).

17. Breslow, N. E. and Day, N. E. Statistical Methods in Cancer Research. Volume 1. The Design and Analysis of Case Control Studies, International Agency for Research on Cancer, Lyon, France, 1980.

18. Clayton, D. and Hills, M. Statistical Models in Epidemiology, Oxford University Press, Oxford, 1993.
