correlated data - kupublicifsv.sund.ku.dk/~lts/correlatedmeasurements/lectures/... · correlated...

University of Copenhagen, Department of Biostatistics

Faculty of Health Sciences

Correlated data
Introduction

Julie Lyng Forman & Lene Theil Skovgaard
November 25, 2013

1 / 80


Introduction

- The idea of the course
- Comparing two types of measurement
- Logarithmic transformation
- Linear regression
- The general linear model

Home page: http://staff.pubhealth.ku.dk/~lts/CorrelatedMeasurements
E-mail: [email protected]

2 / 80


Aim of the course

To make the participants able to:
- understand and interpret advanced statistical analyses
- judge the assumptions behind the use of various methods of analysis
- perform their own analyses using SAS
- understand output from a statistical program package in general, i.e. other than SAS
- present results from a statistical analysis, numerically and graphically

To create a better platform for communication between ’users’ of statistics and statisticians, to benefit subsequent collaboration

3 / 80


We expect students to . . .

Be interested

Be motivated
- ideally from your own (future) research project

Have basic knowledge of statistical concepts such as:
- mean, average
- variance, standard deviation, standard error
- distribution
- correlation, regression, anova
- t-test, χ2-test, F-test

4 / 80


Topics for the course

Quantitative data (normal distribution):
- Analysis of variance
  - Variance component models
- General linear models / regression analysis
  - Linear mixed models

Non-normal outcome (binary data or count data):
- Logistic or Poisson regression
  - Generalized linear mixed models

Not covered:
- Multivariate data (several outcomes at once)
- Censored data (survival analysis)

5 / 80


Recommended reading

- The lecture notes (can be downloaded from the course webpages).
- Brief notes about SAS-programming (can be downloaded from the course webpages).
- B.T. West, K.B. Welch and A.T. Galecki: Linear mixed models: a practical guide using statistical software, Chapman & Hall/CRC, 2007

We teach SAS programming.
- . . . but the book also covers SPSS, R, Stata, and HLM.

6 / 80


Teaching activities

Lectures:
- Mornings (9.15–12.00)
- Copies of overheads must be downloaded in advance
- Coffee break around 10-10.30

Computer labs:
- In the afternoon (13.00-15.45) following each lecture
- Coffee, tea, and cake will be served
- Exercises will be handed out
- Solutions can be downloaded after classes

7 / 80


Course diploma

To pass the course 80% attendance is required.
- It is your responsibility to sign the list each morning and each afternoon.
- Note: 5 × 2 = 10 lists, 80% equals 8 half days.

There is no compulsory homework . . .
- but to benefit from the course you need to work with the material at home
- We expect you to do so!

8 / 80


What are repeated measurements?

Repeated measurements refer to data where the same outcome has been measured in different situations (or at different spots) on the same individuals.

- Special case: longitudinal means repeatedly over time.

Repeated measurements are termed clustered data when the same outcome is measured on groups of individuals from the same families/workplaces/school classes/villages/etc.

9 / 80


Paired data

The simplest example of clustered or repeated measurements.
- Two replicates or two subjects per cluster

Examples of paired data:
- Same person with treatment and placebo (cross-over studies)
- Baseline-follow up studies
- Twin studies
- Comparison of two measurement methods
- Reliability of a measurement method

Quantitative outcome analysed with the paired t-test BUT often the test is not in focus, rather estimation/quantification

10 / 80


Statistical analysis

The usual assumption is that observations are independent.

If you have clustered or repeated measurements the assumption of independence is violated.

- Your analyses must account for the repetitions/clustering.
- In this course we will teach you how to do it.

Warning: Ignoring the repetitions/clustering and doing a standard analysis most often leads to:

- P-values that are too small or too large.
- Confidence intervals that are too wide or too narrow.

11 / 80


Example: MF vs SV

Two measurement methods, expected to give the same result:

MF: Transmitral volumetric flow, determined by Doppler echocardiography

SV: Left ventricular stroke volume, determined by cross-sectional echocardiography

subject    MF    SV
1          47    43
2          66    70
3          68    72
4          69    81
5          70    60
...       ...   ...
18        105    98
19        112   108
20        120   131
21        132   131

12 / 80


Comparison of measurement methods

Usually a comparison of a new experimental method with an established method (the reference)

- How well do the two measurements agree?
- Is the new method biased compared to the reference?

The data is paired
- The subjects act as their own controls
- Hence we look at differences within subjects

Set up a statistical model to:
- Describe the typical size of the differences
- Test if the bias (i.e. the mean difference) is zero

13 / 80


Description of the data

Graphical description

- Scatterplot
- Sample paths
- Bland-Altman plot
- Histogram

Numerical description

Variable     Mean    Std.Dev
----------------------------
MF          86.05      20.32
SV          85.81      21.19
DIF          0.24       6.96
AVERAGE     85.93      20.46
----------------------------

14 / 80


Statistical model for paired data

$x_i$: MF-measurement for the $i$'th subject
$y_i$: SV-measurement for the $i$'th subject

Look at the differences:

$$d_i = x_i - y_i, \quad i = 1, \ldots, 21$$

The model assumes that the differences are:
- independent
- normally distributed, $d_i \sim N(\delta, \sigma_d^2)$

Note: No assumptions are made about the distribution of the individual flow measurements.

15 / 80


The normal distribution

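[Figure: density curves of two normal distributions, $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, with $\mu_1 \pm \sigma_1$ and $\mu_2 \pm \sigma_2$ marked on the x-axis; axes: x, Density]

$N(\mu, \sigma^2)$

The mean is often denoted $\mu$ or $\alpha$.

The standard deviation is often denoted $\sigma$ or $\omega$.

The variance is $\sigma^2$.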

16 / 80


Paired t-test in SAS

Can be performed in two different ways:
1. as a paired two-sample test

    PROC TTEST;
      PAIRED mf*sv;
    RUN;

The TTEST Procedure

Statistics

                   Lower CL             Upper CL   Lower CL              Upper CL
Difference    N        Mean      Mean       Mean    Std Dev    Std Dev    Std Dev
mf - sv      21      -2.932    0.2381     3.4078     5.3275     6.9635     10.056

Difference    Std Err    Minimum    Maximum
mf - sv        1.5196        -13         10

T-Tests

Difference    DF    t Value    Pr > |t|
mf - sv       20       0.16      0.8771

17 / 80


One-sample tests in SAS, for differences

2. as a one-sample test on the differences:

    PROC UNIVARIATE NORMAL;
      VAR dif;
    RUN;

The UNIVARIATE Procedure
Variable: dif

Tests for Location: Mu0=0

Test           -Statistic-       -----p Value------
Student's t    t    0.156687     Pr > |t|     0.8771
Sign           M    2.5          Pr >= |M|    0.3593
Signed Rank    S    8            Pr >= |S|    0.7603

Moments

N                 21            Sum Weights         21
Mean              0.23809524    Sum Observations    5
Std Deviation     6.96351034    Variance            48.4904762
...               ...           ...                 ...

18 / 80


About the paired t-test

Test of the null hypothesis H0 : δ = 0 (no bias)

The t-statistic is given by:

$$t = \frac{\bar{d} - 0}{\mathrm{SEM}} = \frac{0.24 - 0}{6.96/\sqrt{21}} = 0.158 \sim t(20)$$

which gives P = 0.88, i.e. no significant bias.
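The arithmetic behind the t-statistic can be re-checked outside SAS. The following is a minimal sketch in Python (not the course software), using only the rounded summary values quoted on the slides; the function name `paired_t_statistic` is made up for illustration.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n):
    """Return (SEM, t) for testing H0: mean difference = 0."""
    sem = sd_diff / math.sqrt(n)   # standard error of the mean difference
    return sem, mean_diff / sem    # t has n - 1 degrees of freedom


# Rounded summary values for the MF - SV differences (from the slides).
sem, t_value = paired_t_statistic(mean_diff=0.24, sd_diff=6.96, n=21)
print(f"SEM = {sem:.2f}, t = {t_value:.3f}, df = {21 - 1}")
```

Because the inputs are rounded, the result agrees with the SAS output only to about two decimals.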

Does this mean that the measurement methods are equally good?

19 / 80


Estimation of bias

The estimated mean difference is given by

$$\bar{d} = 0.24 \text{ cm}^3$$

The estimate is our best guess, but repeating the experiment would give us a somewhat different result

The estimate has a distribution, with an uncertainty called the standard error of the estimate.

- The standard error of the mean is given by

$$\mathrm{SEM} = \frac{s_d}{\sqrt{n}} = \frac{6.96}{\sqrt{21}} = 1.52 \text{ cm}^3$$

20 / 80


General confidence intervals

Confidence intervals tell us what the parameter is likely to be
- An interval that ’catches’ the true mean with 95% probability is called a 95% confidence interval
- 95% is called the coverage

The usual construction is:
- Average $\pm\ t_{97.5\%}(n-1) \cdot \mathrm{SEM}$
- Often a good approximation, even if data are not normally distributed (due to the central limit theorem)

The t-quantile $t_{97.5\%}$ may be looked up in a table or computed by a program (e.g. R, see http://mirrors.dotsrc.org/cran/).

21 / 80


Confidence limits for the bias

For the differences mf-sv, we get the confidence interval:

$$\bar{d} \pm t_{97.5\%}(20) \cdot \mathrm{SEM} = 0.24 \pm 2.086 \cdot 6.96/\sqrt{21} = (-2.93,\ 3.41)$$

If there is a bias, it is likely (i.e. with 95% certainty) within the limits (−2.93 cm3, 3.41 cm3)

Conclusion:
We cannot rule out a bias of approx. 3 cm3 in either direction
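As a cross-check of the interval, here is a small Python sketch (again not SAS). It takes the t-quantile 2.086 for 20 degrees of freedom as given on the slide rather than computing it; the helper name `mean_confidence_interval` is an assumption made for this example.

```python
import math


def mean_confidence_interval(mean, sd, n, t_quantile):
    """Confidence interval: mean +/- t_quantile * sd / sqrt(n)."""
    half_width = t_quantile * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Summary values for the MF - SV differences; 2.086 is t_97.5%(20) from the slide.
lower, upper = mean_confidence_interval(mean=0.24, sd=6.96, n=21, t_quantile=2.086)
print(f"95% CI for the bias: ({lower:.2f}, {upper:.2f})")
```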

22 / 80


P-values and confidence intervals

Tests and confidence intervals are equivalent in a certain sense
- They agree on ’reasonable’ values for the mean
- The confidence interval contains the values $\delta_0$ for which $H_0: \delta = \delta_0$ would be accepted

But the P-value is less informative than the confidence interval
- If the study is large a tiny bias may be significant
- If the study is small a large bias may be insignificant
- Better use the confidence interval to judge the clinical implications of the bias!

23 / 80


Note the difference

Standard error (of the mean), SE(M)
- tells us something about the uncertainty of the estimate of the mean
- $\mathrm{SEM} = \mathrm{SD}/\sqrt{n}$ is the standard deviation in the distribution of the estimate
- is used for comparisons, relations etc.

Standard deviation, SD
- tells us something about the variation in our sample,
- and presumably in the population
- is used when describing the data

24 / 80


Normal regions

The normal region is an interval containing 95% of the ’typical’ observations, i.e. the midrange of the population:

2.5%-quantile to 97.5%-quantile

If the distribution is normal $N(\mu, \sigma^2)$, then
- 2.5%-quantile to 97.5%-quantile is $\mu \pm 1.96\sigma$

An estimated normal region is given by:

Average± 2× SD

But this does not account for parameter uncertainty!

25 / 80


Prediction intervals

A prediction interval has to ’catch’ future observations with high probability, say 95%.

$\bar{x} \pm 2s$ is a good prediction interval if the sample is large.
But if the sample is small the coverage will be too low.

95% coverage is attained by the prediction interval:

$$\left(\bar{x} - s\sqrt{1 + 1/n} \cdot t_{97.5\%},\ \ \bar{x} + s\sqrt{1 + 1/n} \cdot t_{97.5\%}\right)$$

where $t_{97.5\%}$ is the quantile of the t-distribution with $n-1$ degrees of freedom.

I.e. the probability that a randomly chosen subject from the population has a value in this interval is 95% if the data is normal

26 / 80


Limits of agreement

Limits-of-agreement is the prediction interval for the difference between two measuring methods

- important for deciding whether or not two measurement methods may replace each other.

Limits-of-agreement for mf-sv are given by:

$$0.24 \pm 2.086 \cdot \sqrt{1 + 1/21} \cdot 6.96 = (-14.62,\ 15.10)$$

While "$\bar{x} \pm 2s$" is too narrow / has too low coverage:

$$\bar{d} \pm 2 \cdot s_d = 0.24 \pm 2 \cdot 6.96 = (-13.68,\ 14.16)$$

27 / 80


Derivation of the prediction interval

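Assume that $d_{new}$ is a new observation, then

$$d_{new} - \bar{d} \sim N\left(0,\ \sigma_d^2 \cdot \left(1 + \frac{1}{n}\right)\right)$$

$$\frac{d_{new} - \bar{d}}{s_d \cdot \sqrt{1 + 1/n}} \sim t(n-1)$$

implying that with 95% probability:

$$t_{2.5\%} < \frac{d_{new} - \bar{d}}{s_d \cdot \sqrt{1 + 1/n}} < t_{97.5\%}$$

$$\bar{d} + s_d \sqrt{1 + 1/n} \cdot t_{2.5\%} < d_{new} < \bar{d} + s_d \sqrt{1 + 1/n} \cdot t_{97.5\%}$$

$$\bar{d} - s_d \sqrt{1 + 1/n} \cdot t_{97.5\%} < d_{new} < \bar{d} + s_d \sqrt{1 + 1/n} \cdot t_{97.5\%}$$

since $t_{2.5\%} = -t_{97.5\%}$ by symmetry of the t-distribution.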

28 / 80


Assumptions for the paired comparison

The differences:
- are independent, i.e. the subjects are unrelated
- are normally distributed: judged graphically or numerically
  - by inspection of histograms or QQ-plots
  - by formal tests (e.g. PROC UNIVARIATE NORMAL in SAS)
- have identical variances: judged using the ’Bland-Altman plot’ of differences vs. averages

Sometimes it is necessary to transform the data in order to fulfill the assumptions

29 / 80


Checking normality: the QQ-plot

Observed quantiles against theoretical normal quantiles

If the data is normal, the points will be close to the line

30 / 80


Model assumption: Normality?

Assumption: the differences follow a normal distribution.

We can check the assumption by e.g. looking at the histogram or the QQ-plot.

But with large samples the assumption is not always necessary:
- The validity of the t-test and the confidence intervals only relies on the distribution of the average $\bar{d}$ . . .
- and averages tend to be normal due to the CLT.

However: Normal regions (e.g. limits of agreement) require a normal distribution.

31 / 80


The central limit theorem (CLT)
Averages of rolls of dice are more normal than a single roll

[Figure: four histograms of the average of 1, 2, 10, and 50 dice rolls; x-axis: Average]
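The dice illustration can be imitated by simulation. The sketch below is in Python (not SAS), and the number of repetitions and the seed are arbitrary choices made for this example. It estimates the standard deviation of the average of n dice, which should shrink roughly like 1.71/√n as the distribution of the average becomes narrower and more bell-shaped.

```python
import random
import statistics


def sd_of_dice_averages(n_dice, n_repeats=5000, seed=1):
    """Simulate n_repeats averages of n_dice fair dice and return their SD."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean([rng.randint(1, 6) for _ in range(n_dice)])
        for _ in range(n_repeats)
    ]
    return statistics.stdev(averages)


# Same numbers of dice as in the four panels of the figure.
sds = {n_dice: sd_of_dice_averages(n_dice) for n_dice in (1, 2, 10, 50)}
for n_dice, sd in sds.items():
    print(f"{n_dice:2d} dice: SD of the average = {sd:.3f}")
```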

32 / 80


Classical two-sample (unpaired) comparison

If the two treatments were applied to separate groups of subjects, we have independent samples.

Traditional model assumptions:

$$x_{11}, \ldots, x_{1n_1} \sim N(\mu_1, \sigma^2)$$

$$x_{21}, \ldots, x_{2n_2} \sim N(\mu_2, \sigma^2)$$

- All observations are independent
- Observations follow a normal distribution within each group
- Both groups have the same variance, $\sigma^2$
- The mean values, $\mu_1$ and $\mu_2$, may differ

33 / 80


Paired or unpaired comparison?

Note the consequences for the difference between MF and SV:

Estimated mean difference
- 0.24, CI: (-2.93, 3.41) according to the paired t-test
- 0.24, CI: (-12.71, 13.19) according to the unpaired t-test

i.e. same estimate but a much wider confidence interval
- The latter is wrong!

You have to respect your design.
- Do not forget to take advantage of a subject serving as its own control (higher power with fewer individuals)
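To see where the wide unpaired interval comes from, the following Python sketch (not SAS) recomputes it from the group standard deviations given in the numerical description. The t-quantile 2.021 for 40 degrees of freedom is a standard table value and does not appear on the slides; the function name `unpaired_ci` is likewise made up for this example.

```python
import math


def unpaired_ci(mean_diff, sd1, sd2, n1, n2, t_quantile):
    """CI for a difference in means, wrongly treating the samples as independent."""
    # With equal group sizes this equals the pooled-variance standard error.
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return mean_diff - t_quantile * se, mean_diff + t_quantile * se


# SDs of MF and SV from the numerical description; t_97.5%(40) = 2.021 is assumed.
lower, upper = unpaired_ci(0.24, sd1=20.32, sd2=21.19, n1=21, n2=21, t_quantile=2.021)
print(f"Unpaired 95% CI: ({lower:.2f}, {upper:.2f})")
```

The unpaired standard error is driven by the variation between subjects (SD about 20), whereas the paired analysis only uses the variation of the differences within subjects (SD about 7).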

34 / 80


Comparing measurement methods

When comparing two measurement methods:
- We have to determine the proper scale before carrying out the statistical analysis

Is the precision of the measurements approximately the same over the entire range?

- In that case look at differences on an absolute scale
- Use the differences between the raw measurements

Or does the variation increase with the size of the quantity being measured?

- In that case look at differences on a relative scale
- Make a logarithmic transformation

35 / 80


Another comparison: REFE vs TEST

Two methods for determining concentration of glucose:

- REFE: Colour test, may be ’polluted’ by uric acid
- TEST: Enzymatic test, more specific for glucose

Ref: R.G. Miller et al. (eds): Biostatistics Casebook. John Wiley & Sons, 1980.

nr.        REFE     TEST
1           155      150
2           160      155
3           180      169
...         ...      ...
44           94       88
45          111      102
46          210      188

average   144.1    134.2
SD         91.0     83.2

36 / 80


The usual analysis - the naive approach

Do we see a systematic difference?
Test ’δ=0’ assuming $d_i = \mathrm{REFE}_i - \mathrm{TEST}_i \sim N(\delta, \sigma_d^2)$

$$\bar{d} = 9.89,\quad s_d = 9.70 \ \Rightarrow\ t = \frac{\bar{d}}{\mathrm{SEM}} = \frac{\bar{d}}{s_d/\sqrt{n}} = 6.92 \sim t(45)$$

hence P < 0.0001, i.e. strong indication of bias.

Limits of agreement tell us that the typical differences are

$$9.89 \pm t_{97.5\%}(45) \cdot \sqrt{1 + 1/46} \cdot 9.70 = (-9.85,\ 29.64)$$
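The limits of agreement follow the prediction-interval formula, which can be sketched in Python (not SAS) as below. The quantile $t_{97.5\%}(45) \approx 2.014$ is a standard table value that the slide does not print, so it is an assumption here, as is the helper name `limits_of_agreement`.

```python
import math


def limits_of_agreement(mean_diff, sd_diff, n, t_quantile):
    """Prediction interval for a new difference: mean +/- t * sqrt(1 + 1/n) * sd."""
    half_width = t_quantile * math.sqrt(1 + 1 / n) * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# REFE - TEST on the raw scale; t_97.5%(45) = 2.014 is an assumed table value.
lower, upper = limits_of_agreement(mean_diff=9.89, sd_diff=9.70, n=46, t_quantile=2.014)
print(f"Limits of agreement: ({lower:.2f}, {upper:.2f})")
```

Since the inputs are rounded, the result matches the limits on the slide only up to the second decimal.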

Is this a valid analysis?!?

37 / 80


Plots of the raw data

Scatter plot and Bland Altman plot:

The variance of the differences increases with the level; so the model assumptions of the usual analysis are violated!

38 / 80


Plots of the log-transformed data

Precision seems to be relative, hence we do a log-transformation

- The plots look better except for an outlier

39 / 80


Close up

Following a logarithmic transformation (and omission of the outlier) the Bland Altman plot looks OK

40 / 80


Notes on the log-transformation

- It is the original measurements that have to be transformed with the logarithm, not the differences!

- Never make a logarithmic transformation on data that might be negative!

- It does not matter which logarithm you choose (i.e. which base of the logarithm) since they are all proportional

- The procedure with construction of limits of agreement is now repeated for the transformed observations

- The result can be transformed back to the original scale with the anti-logarithm (exp for the natural logarithm)

41 / 80


The correct analysis

Do we see a systematic difference?
Test ’δ=0’ assuming $d_i = \log(\mathrm{REFE}_i) - \log(\mathrm{TEST}_i) \sim N(\delta, \sigma_d^2)$

$$\bar{d} = 0.066,\quad s_d = 0.042 \ \Rightarrow\ t = \frac{\bar{d}}{\mathrm{SEM}} = \frac{\bar{d}}{s_d/\sqrt{n}} = 10.66 \sim t(45)$$

P < 0.0001, i.e. strong indication of bias.

Limits of agreement tell us that the typical differences are

$$0.066 \pm t_{97.5\%}(45) \cdot \sqrt{1 + 1/46} \cdot 0.042 = (-0.020,\ 0.152)$$

. . . on log-scale!

42 / 80


Back transformation

Limits of agreement on log-scale are (−0.020, 0.152), meaning that for 95% of the subjects we will have:

$$-0.020 < \log(\mathrm{REFE}) - \log(\mathrm{TEST}) < 0.152$$

$$\text{i.e.}\quad -0.020 < \log\left(\frac{\mathrm{REFE}}{\mathrm{TEST}}\right) < 0.152$$

Back transforming (using the exponential function):

$$0.980 = \exp(-0.020) < \frac{\mathrm{REFE}}{\mathrm{TEST}} < \exp(0.152) = 1.164$$

$$\text{or reversed:}\quad 0.859 = \frac{1}{1.164} < \frac{\mathrm{TEST}}{\mathrm{REFE}} < \frac{1}{0.980} = 1.02$$

So TEST will typically lie 14% below to 2% above REFE.

43 / 80
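The back transformation is plain arithmetic and can be verified with a few lines of Python (not SAS). The sketch starts from the rounded log-scale limits quoted above, so the third decimal of the ratios may differ slightly from values computed with unrounded limits; the variable names are chosen for this example only.

```python
import math

# Limits of agreement for log(REFE) - log(TEST), as quoted on the slide.
log_lower, log_upper = -0.020, 0.152

# Ratio REFE/TEST: exponentiate both limits.
ratio_lower, ratio_upper = math.exp(log_lower), math.exp(log_upper)

# Ratio TEST/REFE: take reciprocals, which swaps the two limits.
inv_lower, inv_upper = 1 / ratio_upper, 1 / ratio_lower

print(f"REFE/TEST between {ratio_lower:.3f} and {ratio_upper:.3f}")
print(f"TEST/REFE between {inv_lower:.3f} and {inv_upper:.3f}")
print(f"TEST lies {100 * (1 - inv_lower):.0f}% below to {100 * (inv_upper - 1):.0f}% above REFE")
```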


Limits of agreement on the original scale

44 / 80


Non-normal data

If the normal distribution is not a good description:
- Tests and confidence intervals are valid if the sample is sufficiently large (due to the central limit theorem).

- To judge the reliability for a given sample:
  - Use resampling techniques
  - Or check with a statistician

- Normal regions and limits of agreement become untrustworthy!

45 / 80


Example: Fertility and aging

Cross-sectional study: 527 women aged 22–42.

Objective: How does fertility decline with age?

Outcomes: Physiological markers of fertility
- Menstrual cycle length
- Reproductive hormones (FSH, AMH, . . . )
- Ovarian volume
- Antral follicle count (AFC)

46 / 80


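Simple linear regression for AFC
$\mathrm{AFC} = \alpha + \beta \cdot \mathrm{age} + \varepsilon$: is this a good model?

[Figure: scatter plot of AFC against age with fitted regression line]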

47 / 80


Log-linear regression
A more plausible model is exponential decay, implying a linear model on logarithmic scale: $\log(\mathrm{AFC}) = \alpha + \beta \cdot \mathrm{age} + \varepsilon$

48 / 80


Regression with SAS

    PROC GLM DATA=menopause;
      MODEL logafc = age / SOLUTION CLPARM;
    RUN;

The GLM Procedure

R-Square    Coeff Var    Root MSE    logafc Mean
0.053070     21.53554    0.622772       2.891832

Source    DF    Type III SS    Mean Square    F Value    Pr > F
AGE        1    11.41154527    11.41154527      29.42    <.0001

Parameter        Estimate     Std.Error    t Value    Pr > |t|    95% Confidence Limits
Intercept     4.066684811    0.21828311      18.63      <.0001     3.637869196     4.495500427
AGE          -0.035958049    0.00662907      -5.42      <.0001    -0.048980815    -0.022935284

Note: We could have used PROC REG instead.

49 / 80


Regression equation and estimates

The estimates for the linear regression on logarithmic scale are:

Intercept $\hat{\alpha} = 4.07$ (95% CI 3.64 to 4.50)
- The "expected value for age = 0"!

Regression coefficient $\hat{\beta} = -0.036$ (95% CI -0.049 to -0.023)
- The expected decrease in log(AFC) with one year of aging.

50 / 80


Rate of decline

We see exponential decay on the natural scale.

The expected AFC for age x (median or geometric mean) is

$$\mathrm{AFC}(x) = \exp(\alpha + \beta x)$$

- With one year of aging, $x \to x + 1$
- $\mathrm{AFC}(x+1) = \exp(\alpha + \beta(x+1)) = \exp(\beta) \cdot \mathrm{AFC}(x)$
- Annual rate of change is the factor $\exp(\beta)$, corresponding to the decline $\{1 - \exp(\beta)\} \cdot 100\%$.
- Estimated by $\exp(\hat{\beta}) = 0.9646$, i.e. a decline of 3.5%.
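The conversion from a slope on log-scale to an annual percentage decline can be sketched in Python (not SAS). The estimate and confidence limits are copied from the PROC GLM output of the simple regression; the function name `annual_decline_percent` is invented for this example.

```python
import math


def annual_decline_percent(beta):
    """Percentage decline per year implied by a slope beta on the log scale."""
    return (1 - math.exp(beta)) * 100


# Slope for AGE and its 95% confidence limits from the simple regression output.
beta_hat, beta_lower, beta_upper = -0.035958049, -0.048980815, -0.022935284

decline = annual_decline_percent(beta_hat)
# A more negative slope means a larger decline, so the limits swap places.
decline_ci = (annual_decline_percent(beta_upper), annual_decline_percent(beta_lower))
print(f"Decline per year: {decline:.1f}% (95% CI {decline_ci[0]:.1f}% to {decline_ci[1]:.1f}%)")
```

Applying the same transformation to the confidence limits gives a confidence interval for the rate of decline, as is done later for the adjusted analysis.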

51 / 80


Multiple regression

The regression could be biased by possible confounders:
- Use of oral contraceptives (yes, no)
- Smoking (current, former or never)
- Prenatal smoking exposure (yes, no)
- BMI (under weight, normal weight, over weight, obese)

Adjust for these in a multiple regression (general linear model):

$$Y_i = \alpha + \beta \cdot \mathrm{age}_i + \beta_1 X_{i,1} + \ldots + \beta_k X_{i,k} + \varepsilon_i$$

with k additional covariates.
- Some of these are dummy variables coding for relevant groups

52 / 80


SAS-program

    PROC GLM DATA=menopause;
      CLASS oc smoking prenatsmoke bmigrp;
      MODEL logafc = oc smoking prenatsmoke bmigrp age
            / SOLUTION CLPARM;
      OUTPUT OUT=diagnostics p=fitted r=residual student=stres;
    RUN;

The GLM Procedure

                              Sum of
Source              DF       Squares    Mean Square    F Value    Pr > F
Model                8    22.7349809      2.8418726       7.58    <.0001
Error              497   186.2490086      0.3747465
Corrected Total    505   208.9839894


53 / 80


SAS-output

Source         DF    Type III SS    Mean Square    F Value    Pr > F
OC              1     8.38447592     8.38447592      22.37    <.0001
SMOKING         2     0.04472481     0.02236240       0.06    0.9421
PRENATSMOKE     1     1.74079772     1.74079772       4.65    0.0316
BMIGRP          3     0.68550681     0.22850227       0.61    0.6089
AGE             1    15.39698818    15.39698818      41.09    <.0001

                                               Standard
Parameter                    Estimate             Error    t Value    Pr > |t|
Intercept                 4.007665017 B      0.29614093      13.53      <.0001
OC          no            0.313390980 B      0.06625480       4.73      <.0001
OC          yes           0.000000000 B       .                .         .
SMOKING     never        -0.023610470 B      0.07174410      -0.33      0.7422
SMOKING     previous     -0.023529734 B      0.08113255      -0.29      0.7719
SMOKING     smoker        0.000000000 B       .                .         .
PRENATSMOKE no-smoke      0.130881971 B      0.06072597       2.16      0.0316
PRENATSMOKE smoke         0.000000000 B       .                .         .
BMIGRP      normal        0.153602199 B      0.18013313       0.85      0.3942
BMIGRP      over25        0.084779529 B      0.19228883       0.44      0.6595
BMIGRP      over30        0.050248838 B      0.21702144       0.23      0.8170
BMIGRP      under18.5     0.000000000 B       .                .         .
AGE                      -0.047386837        0.00739279      -6.41      <.0001

Adjusted $\hat{\beta} = -0.047$, i.e. rate of decline by 4.6%.

54 / 80


SAS-output

Parameter                 95% Confidence Limits

Intercept                 3.425822534     4.589507500
OC          no            0.183216961     0.443565000
OC          yes            .               .
SMOKING     never        -0.164569584     0.117348643
SMOKING     previous     -0.182934802     0.135875335
SMOKING     smoker         .               .
PRENATSMOKE no-smoke      0.011570705     0.250193237
PRENATSMOKE smoke          .               .
BMIGRP      normal       -0.200314124     0.507518523
BMIGRP      over25       -0.293019675     0.462578732
BMIGRP      over30       -0.376143740     0.476641415
BMIGRP      under18.5      .               .
AGE                      -0.061911820    -0.032861855

. . . with 95% confidence interval (-0.062, -0.033), corresponding to a decline between 3.2% and 6.0%.

55 / 80


Interpretation of regression coefficients

Simple regression $Y = \alpha + \beta \cdot \mathrm{age} + \varepsilon$

- β is the expected change in log(AFC) when age increases by one year.

Multiple regression $Y = \alpha + \beta \cdot \mathrm{age} + \beta_1 X_1 + \ldots + \beta_k X_k + \varepsilon$

- β is the expected change in log(AFC) when age increases by one year and all other covariates are held fixed.

Similarly for the other covariates:
- e.g. $\exp(0.154) \approx 1.166$ or 16.6% higher AFC for normal BMI compared to < 18.5 and all other covariates held fixed.

56 / 80


Hypothesis tests

Does AFC decline with age?

T-test for $H_0: \beta = 0$:
- $\hat{\beta} = -0.0474$, $\mathrm{s.e.}(\hat{\beta}) = 0.0074$, $t = \hat{\beta}/\mathrm{s.e.}(\hat{\beta}) = -6.41$.
- P < 0.0001 in t-distribution with 497 degrees of freedom.

Equivalent to F-test:
- Mean Square(Age)/Mean Square(Error) = 41.09
- P < 0.0001 in F-distribution with (1,497) degrees of freedom

Note: In case of a categorical covariate with more than two levels, only the F-test is generally applicable.
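The equivalence of the two tests for a covariate with one degree of freedom can be checked numerically: the squared t-value equals the F-value. The following Python sketch (not SAS) uses the estimate, standard error and mean squares from the SAS output shown earlier; the variable names are chosen for this example.

```python
# Estimate and standard error for AGE from the adjusted model.
beta_hat = -0.047386837
se_beta = 0.00739279

# Type III mean square for AGE and the error mean square from the ANOVA table.
mean_square_age = 15.39698818
mean_square_error = 0.3747465

t_value = beta_hat / se_beta
f_value = mean_square_age / mean_square_error

print(f"t = {t_value:.2f}, t squared = {t_value**2:.2f}, F = {f_value:.2f}")
```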

57 / 80


Tests of type I and type III

Mind the difference!

Type I: Test the effect of each covariate after adjustment for all other covariates above it on the list.

- Sequential tests to be read bottom-up.

Type III: Test the effect of each covariate after adjustment for all other covariates on the list.

- Non-sequential tests, pick the one that you like.

58 / 80


Predictions (fitted values)

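$$\widehat{\log(\mathrm{AFC})} = \hat{\alpha} + \hat{\beta} \cdot \mathrm{age} + \hat{\beta}_1 \cdot I(\text{no prenatal smoking}) + \hat{\beta}_2 \cdot I(\text{never smoker}) + \hat{\beta}_3 \cdot I(\text{previous smoker}) + \hat{\beta}_4 \cdot I(\text{normal BMI}) + \ldots + \hat{\beta}_6 \cdot I(\text{BMI} > 30) + \hat{\beta}_7 \cdot I(\text{no use of oral contraceptives})$$

Expected log(AFC) of a 30 year old woman, no smoking (never smoker, no prenatal exposure), normal weight, non-user of oral contraceptives:

$$\widehat{\log(\mathrm{AFC})} = 4.008 - 0.047 \cdot 30 - 0.024 + 0.131 + 0.154 + 0.313 = 3.172$$

I.e. we expect an AFC of $\exp(3.172) \approx 24$.

59 / 80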
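The prediction is a sum of the relevant coefficients followed by back transformation. Below is a minimal Python sketch (not SAS) using the rounded estimates from the slide; the dictionary keys are labels invented for this example.

```python
import math

# Rounded estimates from the adjusted model (reference levels contribute 0).
intercept = 4.008
age_slope = -0.047
effects = {
    "never smoker": -0.024,
    "no prenatal smoking": 0.131,
    "normal BMI": 0.154,
    "no oral contraceptives": 0.313,
}

age = 30
log_afc = intercept + age_slope * age + sum(effects.values())
afc = math.exp(log_afc)  # back-transform to the original scale

print(f"Expected log(AFC) = {log_afc:.3f}, expected AFC = {afc:.1f}")
```

The back-transformed value is a median (geometric mean) on the original scale, in line with the interpretation given for the simple regression.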


Model assumptions

The general linear model assumes that:
1. The observations are independent
2. The linear model for the mean is correct
3. Error terms ($\varepsilon_i$'s) are normally distributed with zero mean and equal variances

Use the residuals for model diagnostics:

$$R_i = Y_i - \hat{Y}_i$$

- "Observed value - Predicted value"
- Standardized values are preferred for diagnostics (because of varying estimation uncertainty in the predicted values)

60 / 80


Residual plot

Should be fairly symmetric around zero and with no systematic patterns.

61 / 80


Residuals against covariates
Similar plot, looking for a non-linear relation with a covariate.

62 / 80


Checking normality: the QQ-plot

63 / 80


Example: Maternal age at menopause

64 / 80



Does the decline in fertility depend on hereditary factors?

Three groups according to maternal age at menopause:
- Early, ≤ 45 years of age
- Normal, 46 to 54 years of age
- Late, ≥ 55 years of age

We have a log-linear model for each group.
- Is the rate of decline the same in all three groups?

65 / 80


Analysis of covariance

Another name for a general linear model with one quantitative covariate and one categorical covariate

- We have one regression line for each group

Are the lines parallel?
- If not we have an interaction between the two covariates

Are the lines identical?
- If not we have differences among the groups

66 / 80



In the late-group fertility seems to increase with age???

67 / 80


Estimating regression lines
Model: $\log(\mathrm{AFC})_{ij} = \alpha_j + \beta_j \cdot \mathrm{age}_{ij}$, $j = 1, 2, 3$

- One set of regression parameters per group
- Re-set the intercept at age = 22 for interpretability

    data menopause;
      set menopause;
      age22 = age - 22;
    run;

    proc glm data=menopause;
      class menogrp;
      model logafc = menogrp age22*menogrp
            / noint solution clparm;
    run;

68 / 80


ANCOVA-output

The GLM Procedure

Dependent Variable: logafc


                                            Standard
Parameter                     Estimate         Error    t Value    Pr > |t|    95% Confidence Limits

MENOGRP early              3.328468294    0.20237671      16.45      <.0001     2.930893639     3.726042949
MENOGRP late               2.744304704    0.24071674      11.40      <.0001     2.271409991     3.217199417
MENOGRP normal             3.334785604    0.08572241      38.90      <.0001     3.166381562     3.503189646
AGE22*MENOGRP early       -0.052377545    0.01703328      -3.08      0.0022    -0.085839902    -0.018915188
AGE22*MENOGRP late         0.022007035    0.01998117       1.10      0.2712    -0.017246526     0.061260596
AGE22*MENOGRP normal      -0.042078492    0.00764074      -5.51      <.0001    -0.057088940    -0.027068044

Increasing rate in the late maternal menopause group is insignificant (P=0.27).

69 / 80


Rates of decline

When the slopes are back-transformed, they become estimated rates of decline, with 95%-confidence intervals:

Maternal menopause       Rate of change in AFC per year (95% CI)
Early (≤ 45 years)       -5.1% (-8.2% to -1.9%)
Normal (46-54 years)     -4.1% (-5.5% to -2.7%)
Late (≥ 55 years)        +2.2% (-1.7% to +6.3%)

Increasing rate in the late-group might as well be a chance finding.

70 / 80


Re-parametrisation
Same model, other parameters:

$$\log(\mathrm{AFC})_i = \alpha + \beta \cdot \mathrm{age} + \delta_1 \cdot I(\text{group}=1) + \delta_2 \cdot I(\text{group}=2) + \gamma_1 \cdot I(\text{group}=1) \cdot \mathrm{age} + \gamma_2 \cdot I(\text{group}=2) \cdot \mathrm{age}$$

- Group 3 is reference with regression parameters α and β.
- δ's and γ's are differences in regression parameters with respect to the reference.
- Allows for testing differences among the groups.

    title1 'ANCOVA';
    proc glm data=menopause;
      class menogrp;
      model logafc = menogrp age22 age22*menogrp / solution;
    run;

71 / 80


ANCOVA-output

The GLM Procedure


Source            DF    Type III SS    Mean Square    F Value    Pr > F
MENOGRP            2     2.05075422     1.02537711       2.71    0.0676
AGE22              1     2.65777726     2.65777726       7.02    0.0083
AGE22*MENOGRP      2     3.77690717     1.88845358       4.99    0.0072

                                              Standard
Parameter                     Estimate           Error    t Value    Pr > |t|
Intercept                  3.334785604 B    0.08572241      38.90      <.0001
MENOGRP early             -0.006317310 B    0.21978322      -0.03      0.9771
MENOGRP late              -0.590480900 B    0.25552472      -2.31      0.0212
MENOGRP normal             0.000000000 B     .                .         .
AGE22                     -0.042078492 B    0.00764074      -5.51      <.0001
AGE22*MENOGRP early       -0.010299053 B    0.01866852      -0.55      0.5814
AGE22*MENOGRP late         0.064085527 B    0.02139224       3.00      0.0029
AGE22*MENOGRP normal       0.000000000 B     .                .         .

Regression coefficients differ significantly, intercepts do not.

72 / 80


Missing data problem?

We have missing data . . .
- among younger women whose mothers aren’t yet menopausal
- i.e. missing not at random
- data from some of the potentially most fertile tend to be missing

This may cause bias
- Particularly in the late-group.

73 / 80


Assuming identical intercepts

Leave out the main effect of menogrp.

    title1 'ANCOVA with same intercept at age 22';
    proc glm data=menopause;
      class menogrp;
      model logafc = age22 age22*menogrp / solution clparm;
    run;

Output:

Source            DF      Type I SS    Mean Square    F Value    Pr > F
AGE22              1    11.41154527    11.41154527      29.94    <.0001
AGE22*MENOGRP      2     4.30076782     2.15038391       5.64    0.0038

Rates of decline still differ significantly between groups (P=0.004).

74 / 80


A prettier picture

75 / 80


Estimated rates of decline

. . . when assuming identical intercepts (at age 22).

Estimated rates of decline with 95%-confidence intervals:

Maternal menopause       Rate of decline in AFC per year (95% CI)
Early (≤ 45 years)       4.7% (3.1% to 6.3%)
Normal (46-54 years)     3.7% (2.3% to 4.9%)
Late (≥ 55 years)        2.0% (0.4% to 3.6%)

76 / 80


Summary statistics

Numerical description of quantitative variables:

Location, center
- average (mean value) $\bar{x} = (x_1 + \cdots + x_n)/n$
- median (middle observation, 50% above and 50% below)

Variation
- variance, $s^2 = \sum (x_i - \bar{x})^2/(n-1)$ (quadratic units)
- standard deviation, $s = \sqrt{\text{variance}}$ (units as outcome)
- quantiles, e.g. interquartile range (25% to 75% quantile)
- standard error, $\mathrm{SE} = s/\sqrt{n}$ (uncertainty of mean estimate)

77 / 80


The summary statistics for ’MF vs SV’ are made using the code:

Note: the data is read in from the file 'mf_sv.txt' (text file with two columns and 21 observations)

    DATA mydata;
      INFILE 'mf_sv.txt' FIRSTOBS=2;   /* skip the header line */
      INPUT mf sv;
      dif = mf - sv;                   /* within-subject difference */
      average = (mf + sv)/2;           /* within-subject average */
    RUN;

    PROC MEANS DATA=mydata MEAN STD;
    RUN;

78 / 80


The pictures for ’MF vs SV’ are made using the code:

    /* Scatter plot of MF against SV */
    proc gplot;
      plot mf*sv / haxis=axis1 vaxis=axis2 frame;
      axis1 value=(H=2) minor=NONE label=(H=2);
      axis2 value=(H=2) minor=NONE label=(A=90 R=0 H=2);
      symbol1 v=circle i=none c=BLACK l=1 w=2;
    run;

    /* Sample paths: one line per subject (data in long format) */
    proc gplot;
      plot flow*method=subject / nolegend haxis=axis1 vaxis=axis2 frame;
      axis1 value=(H=2) minor=NONE label=(H=2);
      axis2 value=(H=2) minor=NONE label=(A=90 R=0 H=2);
      symbol1 v=circle i=join l=1 w=2 r=21;
    run;

79 / 80


    /* Bland Altman plot: differences against averages */
    proc gplot;
      plot dif*average / vref=0 lv=1 vref=0.24 15.5 -15.0 lv=2
                         haxis=axis1 vaxis=axis2 frame;
      axis1 value=(H=2) minor=NONE label=(H=2 'average');
      axis2 order=(-16 to 16 by 4) value=(H=2) minor=NONE
            label=(A=90 R=0 H=2 'difference MF-SV');
      symbol1 v=circle i=none l=1 w=2;
      title h=3 'Bland Altman plot';
    run;

    /* Histogram of the differences */
    title;
    proc gchart;
      vbar dif;
    run;

80 / 80
