
Page 1: Biostatistics & SAS programmingfacstaff.bloomu.edu/dzhang/lecturenotes/biostat/Lecture10/Lecture... · Biostatistics & SAS programming. Errors ... in SAS is used for power analysis


Kevin Zhang

April 18, 2017 Determine Sample Size and Power

Biostatistics & SAS programming


Errors


In practice

When you design a study, you first need to decide how many units to include, i.e. the sample size: 10, 100, 1000, or more?

Which one would you trust?
A sample with 10 observations
A sample with 10,000 observations


Power

The power of a hypothesis test measures the sensitivity of the test: the probability of rejecting the null hypothesis when it is actually false. It tells us whether the conclusion is reliable.

Power function

The power is a function of the sample size: we can increase the power by taking a larger sample.
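
As a sketch of what a power function looks like, take the simplest textbook case. This case is an assumption chosen for illustration and is not taken from the slides: a one-sided z test of $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$, with known standard deviation $\sigma$ and significance level $\alpha$. Writing $\Phi$ for the standard normal distribution function and $z_{1-\alpha}$ for its $(1-\alpha)$ quantile, the power at a true mean $\mu_1$ is

$$\text{Power}(n) = \Phi\!\left(\frac{(\mu_1 - \mu_0)\sqrt{n}}{\sigma} - z_{1-\alpha}\right)$$

Setting the power equal to a target $1-\beta$ and solving for $n$ gives

$$n = \left(\frac{(z_{1-\alpha} + z_{1-\beta})\,\sigma}{\mu_1 - \mu_0}\right)^{2}$$

rounded up to the next integer. The power grows with $n$, which is why a larger sample makes the test more sensitive. Tests based on the t distribution follow the same pattern but usually need a slightly larger sample than this normal approximation suggests.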


POWER proc

PROC POWER in SAS is used for power analysis. You can compute the power for a given sample size, or determine the sample size needed to reach a desired power.

PROC POWER needs to know what kind of problem you want to solve:
MULTREG -- Tests of one or more coefficients in multiple linear regression
ONECORR -- Fisher's Z test and T test of (partial) correlation
ONESAMPLEFREQ -- Tests, confidence interval precision, and equivalence tests of a single binomial proportion
ONESAMPLEMEANS -- One-sample test, confidence interval precision, or equivalence test
ONEWAYANOVA -- One-way ANOVA including single-degree-of-freedom contrasts
PAIREDMEANS -- Paired T test, confidence interval precision, or equivalence test
PLOT -- Displays plots for previous sample size analysis
TWOSAMPLEMEANS -- Two-sample T test, confidence interval precision, or equivalence test
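
The general calling pattern can be sketched as follows. The numbers are hypothetical placeholders chosen only for illustration; what matters is the shape: one analysis statement from the list, the design parameters, and a missing value (.) for the one quantity PROC POWER should solve for.

```sas
proc power;
  /* analysis statement: one of the keywords listed above */
  onesamplemeans
    mean   = 5      /* hypothetical mean under the alternative (null mean defaults to 0) */
    stddev = 10     /* hypothetical standard deviation */
    ntotal = .      /* missing value: PROC POWER solves for the sample size */
    power  = 0.8;   /* desired power */
run;
```

Swapping the roles, with ntotal set to a number and power = ., asks for the power of a fixed sample size instead.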


Example 1

A clinical dietician wants to compare two different diets, A and B, for diabetic patients. She hypothesizes that diet A (Group 1) will be better than diet B (Group 2), in terms of lower blood glucose.

She plans to get a random sample of diabetic patients and randomly assign them to one of the two diets. At the end of the experiment, which lasts 6 weeks, a fasting blood glucose test will be conducted on each patient.

She also expects that the average difference in blood glucose between the two groups will be about 10 mg/dl. Furthermore, she assumes the standard deviation of the blood glucose distribution to be 15 for diet A and 17 for diet B.

The dietician wants to know the number of subjects needed in each group, assuming equal-sized groups.


Analysis
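
The value stddev = 16.03 that appears in the SAS code for this example is consistent with pooling the two assumed standard deviations (15 for diet A, 17 for diet B) under equal group sizes. The step below is offered as a likely reconstruction, not as the author's stated derivation:

$$\sigma_{\text{pooled}} = \sqrt{\frac{15^2 + 17^2}{2}} = \sqrt{\frac{225 + 289}{2}} = \sqrt{257} \approx 16.03$$

Together with the expected mean difference of 10 mg/dl, this gives the two inputs a pooled two-sample t test needs: the difference to detect and a common standard deviation.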


SAS code explanation

proc power;
  twosamplemeans test=diff
    groupmeans = 0 | 10
    stddev = 16.03
    npergroup = .
    power = 0.8;
run;

twosamplemeans test=diff: a two-sample test of means; we test the difference between them.

groupmeans = 0 | 10: the averages of the two groups. Here we simply set 0 and 10 to describe the desired difference.

npergroup = . : left blank so that SAS calculates the size of each group.

power = 0.8: the desired power, 80%.
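
If the two standard deviations should not be pooled into a single value, one alternative is the Satterthwaite unpooled t test, which PROC POWER exposes as test=diff_satt with a separate standard deviation per group. This variant is a sketch that goes beyond the slides; it reuses the 15 and 17 from the problem statement.

```sas
proc power;
  twosamplemeans test=diff_satt    /* unpooled (Satterthwaite) t test */
    groupmeans   = 0 | 10          /* same desired difference of 10 */
    groupstddevs = 15 | 17         /* diet A and diet B standard deviations */
    npergroup    = .               /* solve for the size of each group */
    power        = 0.8;
run;
```

With standard deviations this close to each other, the result would be expected to be close to the pooled analysis.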


Your settings

We achieve 80% power with 42 patients in each group.


Evaluate the power of a given sample

What happens if we only have 30 patients in each group?

proc power;
  twosamplemeans test=diff
    groupmeans = 0 | 10
    stddev = 16.03
    npergroup = 30
    power = .;
run;

npergroup = 30: 30 patients in each group.

power = . : what is the power?
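
Instead of checking one group size at a time, a list of sizes can be evaluated in a single call. The range 20 to 50 below is an arbitrary choice for illustration, not a value from the slides.

```sas
proc power;
  twosamplemeans test=diff
    groupmeans = 0 | 10
    stddev     = 16.03
    npergroup  = 20 to 50 by 10    /* candidate group sizes: 20, 30, 40, 50 */
    power      = .;                /* report the power at each size */
run;
```

The output table has one row per group size, which makes it easy to see where the power crosses 80%.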


In practice, how do we evaluate the power of an unbalanced design?
More patients are assigned to Diet A, say 40
Only 20 patients are willing to take Diet B

proc power;
  twosamplemeans test=diff
    groupmeans = 0 | 10
    stddev = 16.03
    groupns = (40 20)
    power = .;
run;
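
The reverse question for an unbalanced design is how many patients are needed in total when the allocation ratio is fixed. A sketch, assuming the same 2:1 ratio as the 40 versus 20 split and a target power of 80%:

```sas
proc power;
  twosamplemeans test=diff
    groupmeans   = 0 | 10
    stddev       = 16.03
    groupweights = (2 1)    /* allocation ratio Diet A : Diet B */
    ntotal       = .        /* solve for the total sample size */
    power        = 0.8;
run;
```

Here groupweights replaces groupns, because the group sizes are no longer known in advance; only their ratio is.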


Small simulation study

How does the required sample size change as the mean difference changes?

proc power;
  twosamplemeans test=diff
    meandiff = .2 to 1.2 by .2
    stddev = 1
    power = .8
    npergroup = .;
run;

meandiff = .2 to 1.2 by .2: checks the differences 0.2, 0.4, 0.6, 0.8, 1.0, 1.2.

A larger difference is easier to detect, so a smaller sample size is needed.


Power chart

A plot to show the trend of the sample size

proc power;
  twosamplemeans test=diff
    meandiff = .2 to 1 by .2
    stddev = 1
    power = .9
    ntotal = .;
  plot x = power min = .5 max = .95;
run;


Correlation Examples

A researcher is interested in seeing whether a significant positive correlation exists between reading speed and IQ in adolescents. Before beginning the study, the researcher would like to know what sample size would be required to detect a positive correlation of 0.5 with a power of 80%.

Correlation analysis
Hypothesis test about the significance of the correlation: $H_0: \rho = 0$ vs $H_1: \rho > 0$
Assumed correlation is 0.5


proc power;
  onecorr
    alpha = 0.05
    sides = 1
    corr = 0.5
    ntotal = .
    power = 0.8;
run;


Proportion Example

A survey claims that 90% of dentists recommend a particular brand of toothpaste for their patients with sensitive teeth. A researcher decides to test this claim by taking a random sample of 80 dentists, but first wants to find out whether this sample size is large enough to achieve 80% power.

Hypothesis test about the proportion (i.e. percentage)


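proc power;
  onesamplefreq
    /* the test= value is garbled in the source ("testsides=2"); */
    /* with test= omitted, PROC POWER uses its default test */
    sides = 2
    nullproportion = 0.9
    proportion = 0.05 to 0.85 by 0.05
    alpha = 0.05
    ntotal = 80
    power = .;
run;

nullproportion = 0.9: the proposed 90%.

proportion = 0.05 to 0.85 by 0.05: assume any proportion that is different from the proposed 90%. Here we check the power for a list of different possible proportions in the sample.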


One sample T test of mean

A researcher is planning a pharmaceutical study on a new formulation of a drug. The current formulation has an average elimination rate of 0.06. The researcher hypothesizes that the elimination rate for the new formulation is higher than 0.06. Wanting to be confident, the researcher would like to see how large the sample size must be to achieve 90% power. A standard deviation of 0.02 will be used, based on studies of the original formulation of the drug.

Hypothesis test comparing the average to 0.06
One-tailed test


proc power;
  onesamplemeans
    sides = 1
    nullmean = 0.06
    mean = 0.01 to 0.1 by 0.01
    stddev = 0.02
    ntotal = .
    power = 0.9;
run;

sides = 1: the test structure (one-tailed).

nullmean = 0.06: the null hypothesis.
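
The list mean = 0.01 to 0.1 by 0.01 includes 0.06 itself, where the effect is zero and no finite sample size can reach 90% power, as well as values below 0.06, which lie on the opposite side of the stated hypothesis. A narrower variant, sketched here as a suggestion and not as the author's code, keeps only alternatives above the null and names the direction explicitly with sides=u:

```sas
proc power;
  onesamplemeans
    sides    = u                     /* upper one-sided test: H1 is mean > nullmean */
    nullmean = 0.06                  /* elimination rate of the current formulation */
    mean     = 0.07 to 0.10 by 0.01  /* alternatives above the null only */
    stddev   = 0.02
    ntotal   = .                     /* solve for the sample size at each alternative */
    power    = 0.9;
run;
```

The particular range 0.07 to 0.10 is an illustrative choice; any set of plausible rates above 0.06 would serve the same purpose.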


Paired T test

A researcher is interested in investigating whether BMI changes in males aged 55-65 years after spending four weeks on a novel diet and exercise program. The researcher plans to take BMI measurements on a random sample of men before and after the intervention and see whether there was a change. An 80% level of power is desired, and a standard deviation of 2.0, based on past studies of weight loss and BMI change, is used for the calculations.

Comparison between two readings: T test
The SAME sample is measured twice (Before vs After): paired design


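proc power;
  pairedmeans
    sides = 2
    nulldiff = 0
    meandiff = 0.5 to 3 by 0.5
    corr = 0.5
    stddev = 2.0
    npairs = .
    power = 0.8;
run;

nulldiff = 0: the null hypothesis assumes no difference.

meandiff = 0.5 to 3 by 0.5: possible differences in the sample.

corr = 0.5: the correlation between the Before and After readings.

npairs = . : note that we use npairs instead of ntotal.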
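
The correlation matters because a paired test works on the Before minus After differences, and their spread depends on how strongly the two readings move together. As a worked step, assuming stddev = 2.0 is the common standard deviation $\sigma$ of each reading and corr = 0.5 is the correlation $\rho$ between them, the standard deviation of the differences is

$$\sigma_d = \sqrt{\sigma^2 + \sigma^2 - 2\rho\,\sigma^2} = \sigma\sqrt{2(1-\rho)} = 2.0 \times \sqrt{2 \times 0.5} = 2.0$$

A higher correlation gives a smaller $\sigma_d$, so fewer pairs are needed to detect the same mean difference.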


Example of ANOVA

A researcher is interested in investigating the effects of three different diets on percent weight loss when implemented along with a 5-day per week cardio exercise program. The diets include a low carbohydrate diet, a high protein diet, and a control diet (just exercise). Before beginning the study, sample size determinations must be made. The researcher would like to achieve a power of 80%. From a previous study, the average percent weight loss values are: 9 for Low, 12 for High and 8 for Control. Assume the standard deviation is 3.0.

Comparing 3 groups (Low, High, Control): One-way ANOVA


proc power;
  onewayanova test=overall
    groupmeans = 9 | 12 | 8
    stddev = 3.0
    npergroup = .
    power = 0.8;
run;

Here we have 3 groups, so we need to know how many subjects are needed in each. A balanced design is assumed.
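
The overall test only asks whether the three means differ at all. If the question of interest is a specific comparison, ONEWAYANOVA also supports single-degree-of-freedom contrasts. The sketch below, which is an extension and not part of the original example, sizes the study for comparing the high protein diet with the control diet; the contrast coefficients follow the order of groupmeans (Low, High, Control).

```sas
proc power;
  onewayanova test=contrast
    contrast   = (0 1 -1)      /* High minus Control; Low gets weight 0 */
    groupmeans = 9 | 12 | 8    /* Low, High, Control */
    stddev     = 3.0
    npergroup  = .             /* solve for the size of each group */
    power      = 0.8;
run;
```

Choosing which contrast to power the study for is a design decision; the High versus Control comparison here is only one possibility.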