longitudinal logistic regression longitudinal poisson...

Statistical Consulting Topics

Generalized Estimating Equations (GEEs)

• GEEs are generally used as a method to handle correlated data under the generalized linear model framework.

– Longitudinal logistic regression

– Longitudinal Poisson regression

• GEEs utilize a quasi-likelihood rather than a formal likelihood approach.

Quasi-likelihood

A quasi-likelihood does not fully specify a distribution (unlike common exponential families such as the normal or binomial, which have a known distributional ‘shape’).


For independent data, a quasi-likelihood models the mean as a function of the covariates and assumes the variance is a function of the mean (akin to the idea of first and second moments).

Quasi-likelihoods are often associated with the phrase overdispersion. For example, take the Poisson distribution where the variance and mean are equal ($v_i = \mu_i$). In a quasi-Poisson, the variance need only be proportional to the mean ($v_i = \phi\mu_i$).
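As a rough illustration of overdispersion, the sketch below estimates the dispersion φ for an intercept-only count model as the Pearson statistic divided by its degrees of freedom. The counts are made up for illustration, and the function name `pearson_dispersion` is hypothetical.

```python
from statistics import mean

def pearson_dispersion(counts):
    """Estimate phi for an intercept-only quasi-Poisson model.

    With only an intercept, the fitted mean is the sample mean, so
    phi_hat = sum((y - mu)^2 / mu) / (n - 1).
    """
    mu = mean(counts)
    pearson = sum((y - mu) ** 2 / mu for y in counts)
    return pearson / (len(counts) - 1)

# Hypothetical counts (not taken from the depression example).
counts = [0, 1, 1, 2, 3, 5, 8, 12]
phi_hat = pearson_dispersion(counts)
print(f"mean = {mean(counts):.2f}, phi_hat = {phi_hat:.2f}")
```

An estimate well above 1 suggests the variance exceeds the mean, which is the situation the quasi-Poisson variance is meant to absorb.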

For correlated data, the same quasi-likelihood framework is employed, but we also specify a “working” correlation matrix for the repeated observations on each subject.

• Why use GEEs? Even in the binary case where likelihood analysis is possible, computation is difficult (Stiratelli et al., 1984).


Generalized Linear Models (short intro)

• Generalized linear models (GLMs) for exponential families were introduced by Nelder and Wedderburn (1972) and popularized by the McCullagh and Nelder monograph (2nd edition, 1989).

• GLMs have a wide range of uses, but one common use is to model a response variable that is dichotomous (Bernoulli or binomial) or non-negative discrete (Poisson) with a regression model.

• GLMs connect the response variable to the predictors through a link function, often denoted as $g$. The mean $\mu_i$ depends on the independent variables in the following manner...

$$g(\mu_i) = g(E[Y_i \mid x_i]) = x_i'\beta$$


– Poisson regression with log link

$$\ln(\lambda_i) = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$$

– Logistic regression with a logit link

$$\ln\left(\frac{P(Y_i = 1)}{P(Y_i = 0)}\right) = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki}$$
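To make the two links concrete, here is a minimal sketch of the log and logit links with their inverses. The function names and the coefficient values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def log_link(mu):
    """g(mu) = ln(mu), used for Poisson regression."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse of the log link: maps the linear predictor to a mean count."""
    return math.exp(eta)

def logit_link(mu):
    """g(mu) = ln(mu / (1 - mu)), used for logistic regression."""
    return math.log(mu / (1.0 - mu))

def inv_logit_link(eta):
    """Inverse of the logit link: maps the linear predictor to P(Y = 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and a single covariate value.
beta0, beta1, x = -1.0, 0.5, 2.0
eta = beta0 + beta1 * x                 # linear predictor, here 0.0

poisson_mean = inv_log_link(eta)        # lambda_i on the count scale
bernoulli_mean = inv_logit_link(eta)    # P(Y_i = 1) on the probability scale
print(poisson_mean, bernoulli_mean)
```

The same linear predictor gives a mean count under the log link and a probability under the logit link; only the link changes.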

• The method of GEEs was developed to extend the GLM framework to accommodate correlation between observations using a quasi-likelihood approach.

• Why? Because when the responses are not approximately multivariate normal (e.g. binary with correlation), likelihood-based methods are less tractable.


• GEEs assume that the mean and variance are characterized as in the GLM (i.e. the same as if there were independent observations), but the covariance between observations is also modeled.

• GEEs don’t assume a specific exponential family for the dependent variable (because the method utilizes quasi-likelihood).

• Seminal paper:

Zeger and Liang (Biometrics, 1986).

Their equations are extensions of those used in quasi-likelihood (Wedderburn, 1974, Biometrika) methods.


Zeger and Liang write:

“We specify that a known function of the marginal expectation of the dependent variate is a linear function of the covariates, and assume that the variance is a known function of the mean.”

• GEEs are meant to characterize marginal expectations of the response as a function of independent variables.

“The average response for observations sharing the same covariates.”

• GEEs are used as a ‘means to an end’. We’re still trying to get the ‘best’ parameter estimates because it is these parameters that connect our covariates to our mean structure.


• The method of GEEs is robust to misspecification of the ‘working’ correlation matrix and provides consistent and asymptotically normal parameter estimates.

• Most software that I’m familiar with for fitting a GLM with correlation uses GEEs to estimate parameters in a meaningful way.

– PROC GENMOD in SAS with the REPEATED statement invokes GEEs.

– geepack package in R.


• Example: Binary longitudinal data

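Let $Y_{ij}$ be the observation for the $i$th subject at the $j$th timepoint for $j = 1, \ldots, t$ (longitudinal)...

$$g(E[Y_{ij} \mid x_{ij}]) = x_{ij}'\beta$$

$$\ln\left(\frac{E[Y_{ij} \mid x_{ij}]}{1 - E[Y_{ij} \mid x_{ij}]}\right) = x_{ij}'\beta$$

which implies

$$E[Y_{ij} \mid x_{ij}] = \mu_{ij} = \frac{\exp(x_{ij}'\beta)}{1 + \exp(x_{ij}'\beta)}$$

and if we assume the binomial distribution...

$$\mathrm{var}[Y_{ij} \mid x_{ij}] = v_{ij} = \frac{\exp(x_{ij}'\beta)}{(1 + \exp(x_{ij}'\beta))^2}$$

(the mean and variance are tied together)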
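The tie between the mean and the variance can be checked numerically. The sketch below evaluates both formulas for one observation and confirms that the variance equals µ(1 − µ). The covariate vector, coefficients, and function names are hypothetical.

```python
import math

def linear_predictor(x, beta):
    """Compute x'beta for one observation."""
    return sum(xk * bk for xk, bk in zip(x, beta))

def mean_and_variance(x, beta):
    """Return (mu_ij, v_ij) under the logit link and binomial variance."""
    e = math.exp(linear_predictor(x, beta))
    mu = e / (1.0 + e)
    v = e / (1.0 + e) ** 2
    return mu, v

# Hypothetical covariates (intercept, one predictor) and coefficients.
x_ij = [1.0, 1.0]
beta = [-0.5, 1.0]

mu_ij, v_ij = mean_and_variance(x_ij, beta)
print(f"mu = {mu_ij:.4f}, v = {v_ij:.4f}, mu*(1-mu) = {mu_ij * (1 - mu_ij):.4f}")
```

Because the variance is a function of the mean alone, specifying the mean model fixes the variance function as well.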


In addition to specifying the mean and variance, we specify the covariance of the $t$ observations on a subject.

Letting $V_i$ represent the $t \times t$ variance-covariance matrix for subject $i$, GEEs use the following structure...

$$V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$$

where $A_i$ is a diagonal matrix of variance functions $v(\mu_{ij})$, $R_i(\alpha)$ is the working correlation matrix (not covariance matrix) indexed by a vector of parameters $\alpha$, and $\phi$ is a dispersion parameter.

$R_i(\alpha)$ can take on a common form, such as the AR(1), unstructured, or even a diagonal matrix suggesting independence, if so chosen.
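A minimal sketch of how the pieces combine into a subject's working covariance matrix is given below. It assumes the binomial variance function and an AR(1) working correlation. The means, the value of ρ, and the function names are hypothetical.

```python
import math

def ar1_correlation(rho, t):
    """AR(1) working correlation: entry (j, k) is rho^|j - k|."""
    return [[rho ** abs(j - k) for k in range(t)] for j in range(t)]

def working_covariance(variances, corr, phi=1.0):
    """V_i = phi * A^{1/2} R A^{1/2}, with A = diag(variances)."""
    sd = [math.sqrt(v) for v in variances]
    t = len(sd)
    return [[phi * sd[j] * corr[j][k] * sd[k] for k in range(t)]
            for j in range(t)]

# Hypothetical subject with t = 3 visits.
mu = [0.2, 0.5, 0.8]
variances = [m * (1.0 - m) for m in mu]   # binomial variance function v(mu)
R = ar1_correlation(0.5, len(mu))
V = working_covariance(variances, R)

for row in V:
    print(["%.4f" % entry for entry in row])
```

The diagonal of the result reproduces the variance function, and the off-diagonal entries scale the working correlations by the two standard deviations involved.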


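• Consider 4 observations over time. One option for the correlation...

$$R_i(\rho) = \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 \\ \rho & 1 & \rho & \rho^2 \\ \rho^2 & \rho & 1 & \rho \\ \rho^3 & \rho^2 & \rho & 1 \end{pmatrix}$$

In this case, if there was constant variance, $v(\mu_{ij}) = \sigma^2$, and $\phi = 1$, then

$$V_i = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 \\ \rho & 1 & \rho & \rho^2 \\ \rho^2 & \rho & 1 & \rho \\ \rho^3 & \rho^2 & \rho & 1 \end{pmatrix}$$

and the ‘overall’ variance-covariance structure would look like a block diagonal (with blocks of $V_i$ down the diagonal), assuming we have independence between subjects.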
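Written out, the block-diagonal structure would look as follows. The symbols $V$ for the overall matrix and $N$ for the number of subjects are assumptions introduced here; the slides do not name them.

$$V = \begin{pmatrix} V_1 & 0 & \cdots & 0 \\ 0 & V_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & V_N \end{pmatrix}$$

The zero blocks encode the assumption that observations from different subjects are uncorrelated.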


• Zeger and Liang refer to $R_i(\alpha)$ as the “working” correlation matrix because it is not required to be correctly specified to get consistent estimators of the regression parameters (or of the variance of the parameter estimates).

If your working correlation matrix is closer to the truth, you’ll have more efficient estimators.

• A set of estimating equations is solved through an iterative process to get β̂ (as is also the case for GLMs with exponential families, which use iteratively reweighted least squares (IRLS) to find the MLEs).

– Get initial estimates of β (perhaps via GLM)

– Compute working correlations Ri(α)

– Compute estimate of covariance matrix Vi

– Update β

– Iterate until convergence


• An empirical estimator called the “sandwich” or “robust” estimator is used to estimate the covariance matrix for β̂, or var(β̂).
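The iteration and the sandwich estimator can be sketched in a deliberately simplified setting: a logistic model with an intercept and one subject-level covariate, fitted under an independence working correlation. With that choice the working correlation step is trivial (R_i is the identity), so the sketch shows the update and the robust variance rather than the estimation of α. The data and all function names are hypothetical, and this is not how PROC GENMOD or geepack is implemented.

```python
import math

def score_bread_meat(clusters, b):
    """Score vector, 'bread' sum X_i'A_iX_i, and 'meat' sum u_i u_i'."""
    s = [0.0, 0.0]
    bread = [[0.0, 0.0], [0.0, 0.0]]
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for x, ys in clusters:
        mu = 1.0 / (1.0 + math.exp(-(b[0] + b[1] * x)))
        v = mu * (1.0 - mu)
        row = (1.0, x)                      # design row shared within a subject
        r_sum = sum(y - mu for y in ys)     # summed residuals for subject i
        u = [r_sum * row[0], r_sum * row[1]]
        for a in range(2):
            s[a] += u[a]
            for c in range(2):
                bread[a][c] += len(ys) * v * row[a] * row[c]
                meat[a][c] += u[a] * u[c]
    return s, bread, meat

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul2(p, q):
    return [[sum(p[a][k] * q[k][c] for k in range(2)) for c in range(2)]
            for a in range(2)]

def fit_gee_independence(clusters, max_iter=50, tol=1e-10):
    b = [0.0, 0.0]                                   # initial estimates
    for _ in range(max_iter):
        s, bread, _ = score_bread_meat(clusters, b)
        binv = inv2(bread)
        step = [binv[0][0] * s[0] + binv[0][1] * s[1],
                binv[1][0] * s[0] + binv[1][1] * s[1]]
        b = [b[0] + step[0], b[1] + step[1]]         # update beta
        if max(abs(step[0]), abs(step[1])) < tol:    # iterate until convergence
            break
    _, bread, meat = score_bread_meat(clusters, b)
    binv = inv2(bread)
    cov = matmul2(matmul2(binv, meat), binv)         # sandwich: B^-1 M B^-1
    return b, [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]

# Hypothetical data: (group indicator, three binary responses per subject).
clusters = [(0, [0, 0, 0]), (0, [1, 0, 0]), (0, [1, 1, 0]), (0, [0, 1, 0]),
            (1, [1, 1, 1]), (1, [1, 1, 0]), (1, [0, 1, 0]), (1, [1, 0, 1])]
beta_hat, robust_se = fit_gee_independence(clusters)
print(beta_hat, robust_se)
```

A non-identity working correlation would add a step inside the loop that re-estimates α from the current residuals and uses the inverse of V_i in the update; the sandwich form of the variance stays the same.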

Fitting models using SAS:

[Graphic found in Cerrito (2006).]


Example: Depression scores (high/low) over time for two groups of women (placebo/treatment).

Placebo group (0, n=27)

Treatment group (1, n=34)

Observations taken every month for 6 months.

Pre-dichotomized data:

[Figure: pairwise plot matrix of the pre-dichotomized scores, with panels dep1 through dep6.]


> cor(dp,use="complete")

       dep1   dep2   dep3   dep4   dep5   dep6
dep1 1.0000 0.4982 0.5258 0.3933 0.3674 0.2795
dep2 0.4982 1.0000 0.8672 0.7357 0.7500 0.6900
dep3 0.5258 0.8672 1.0000 0.7831 0.8520 0.7967
dep4 0.3933 0.7357 0.7831 1.0000 0.8449 0.7894
dep5 0.3674 0.7500 0.8520 0.8449 1.0000 0.9014
dep6 0.2795 0.6900 0.7967 0.7894 0.9014 1.0000

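proc genmod data=dp descending;                 /* descending: reverse the response level ordering */
  class Group Subject Visit;
  model DepCat = Group Visit / d=binomial covb; /* binomial response; covb prints cov. of estimates */
  repeated subject=Subject / corrw type=AR(1);  /* GEE, AR(1) working correlation; corrw prints it */
  lsmeans Group / pdiff;                        /* group means (logit scale) and their difference */
run;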

The SAS System

The GENMOD Procedure

Model Information

Data Set              WORK.DP
Distribution          Binomial
Link Function         Logit
Dependent Variable    DepCat


Number of Observations Read    366
Number of Observations Used    295
Number of Events               157
Number of Trials               295
Missing Values                  71

Criteria For Assessing Goodness Of Fit

Criterion              DF       Value    Value/DF
Deviance              288    377.0009      1.3090
Scaled Deviance       288    377.0009      1.3090
Pearson Chi-Square    288    295.1153      1.0247
Scaled Pearson X2     288    295.1153      1.0247
Log Likelihood              -188.5004
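The Value/DF column is each statistic divided by its degrees of freedom, and for these binary responses the deviance matches −2 times the log likelihood. The short check below reproduces both from the reported numbers; the variable names are hypothetical.

```python
# Values copied from the goodness-of-fit output above.
deviance = 377.0009
pearson_chi2 = 295.1153
df = 288
log_lik = -188.5004

deviance_per_df = deviance / df         # reported as 1.3090
pearson_per_df = pearson_chi2 / df      # reported as 1.0247
minus_two_log_lik = -2.0 * log_lik      # agrees with the deviance up to rounding

print(f"{deviance_per_df:.4f} {pearson_per_df:.4f} {minus_two_log_lik:.4f}")
```

A Pearson chi-square per degree of freedom close to 1 is usually read as little evidence of overdispersion relative to the binomial variance.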

Working Correlation Matrix

          Col1      Col2      Col3      Col4      Col5      Col6
Row1    1.0000    0.5975    0.3570    0.2133    0.1274    0.0761
Row2    0.5975    1.0000    0.5975    0.3570    0.2133    0.1274
Row3    0.3570    0.5975    1.0000    0.5975    0.3570    0.2133
Row4    0.2133    0.3570    0.5975    1.0000    0.5975    0.3570
Row5    0.1274    0.2133    0.3570    0.5975    1.0000    0.5975
Row6    0.0761    0.1274    0.2133    0.3570    0.5975    1.0000
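The printed matrix has the AR(1) pattern: each entry is the lag-1 correlation raised to the power of the lag. The sketch below rebuilds the matrix from the reported lag-1 value and compares its first row with the output. The function name `ar1_matrix` is hypothetical, and small gaps are expected because the printed value of ρ is rounded to four decimals.

```python
def ar1_matrix(rho, t):
    """AR(1) correlation matrix: entry (j, k) is rho^|j - k|."""
    return [[rho ** abs(j - k) for k in range(t)] for j in range(t)]

rho_hat = 0.5975    # lag-1 entry of the reported working correlation matrix
R = ar1_matrix(rho_hat, 6)

# First row as printed by PROC GENMOD.
reported_row1 = [1.0000, 0.5975, 0.3570, 0.2133, 0.1274, 0.0761]
max_gap = max(abs(a - b) for a, b in zip(R[0], reported_row1))

print(["%.4f" % entry for entry in R[0]])
print("largest gap vs. reported row:", max_gap)
```

Compared with the sample correlations of the raw scores, this structure forces the correlation to decay geometrically with the number of visits between observations.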



Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                         Standard    95% Confidence
Parameter     Estimate      Error        Limits             Z    Pr > |Z|
Intercept      -0.1373     0.3003   -0.7259    0.4513   -0.46      0.6476
Group 0         1.2970     0.4175    0.4787    2.1153    3.11      0.0019
Group 1         0.0000     0.0000    0.0000    0.0000     .         .
Visit 1         0.2439     0.3107   -0.3651    0.8530    0.78      0.4325
Visit 2        -0.2590     0.3113   -0.8691    0.3511   -0.83      0.4054
Visit 3        -0.2630     0.2737   -0.7995    0.2735   -0.96      0.3366
Visit 4        -0.0382     0.3448   -0.7140    0.6377   -0.11      0.9119
Visit 5        -0.1409     0.2933   -0.7158    0.4340   -0.48      0.6310
Visit 6         0.0000     0.0000    0.0000    0.0000     .         .

Least Squares Means

                              Standard            Chi-
Effect    Group    Estimate      Error    DF    Square    Pr > ChiSq
Group     0          1.0835     0.3411     1     10.09        0.0015
Group     1         -0.2135     0.2348     1      0.83        0.3633

Differences of Least Squares Means

                                        Standard            Chi-
Effect    Group    _Group    Estimate      Error    DF    Square    Pr > ChiSq
Group     0        1           1.2970     0.4175     1      9.65        0.0019
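The group difference is reported on the log-odds scale, so exponentiating it and its confidence limits gives an odds ratio. The sketch below does that arithmetic with the reported values. Reading the result as the odds of a high depression score assumes that the modeled level of DepCat is the "high" category, which the output shown here does not state.

```python
import math

# Group 0 (placebo) versus Group 1 (treatment), copied from the GEE estimates.
estimate = 1.2970
lower, upper = 0.4787, 2.1153

odds_ratio = math.exp(estimate)
ci = (math.exp(lower), math.exp(upper))

print(f"odds ratio = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the interval for the log-odds difference excludes 0, the interval for the odds ratio excludes 1, which matches the reported p-value of 0.0019.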


• References:

Zeger, S.L. and Liang, K.-Y. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42:121-130.

Cerrito, P.B. (2006). From GLM to GLIMMIX-which model to choose? Paper SP10 in Proceedings of the SAS Users Group (PharmaSUG).

Stiratelli, R., Laird, N. and Ware, J.H. (1984). Random-effects models for serial observations with binary response. Biometrics, 40:961-971.
