Sample Size Determination
Bandit Thinkhamrop, PhD (Statistics)
Dept. of Biostatistics & Demography
Khon Kaen University

Essentials of sample size calculation
No one accepts a “magic number”
Too large vs. too small
To justify the sample size to the sponsor and the Ethics Committee
To ensure:
  – adequate power to test a hypothesis
  – desired precision to obtain an estimate

Two main approaches
Hypothesis-based sample size calculation
  – Involves “power” or beta error
  – Targets a statistically significant finding, which may not be clinically conclusive
  – Easy and widely available
Confidence interval methods of sample size calculation
  – Involve the precision of the estimate
  – Ensure a clinically conclusive finding, as this method directly estimates the magnitude of effect
  – Difficult and not widely available

Overall steps
Identify the primary outcome
Identify and review the magnitude of effect and its variability that will be used as the basis of the conclusion of the research.
Identify the statistical method that will be used to obtain the main magnitude of effect.
Calculate the sample size
Describe how the sample size is calculated in enough detail that the calculation can be reproduced.

Steps in the calculation
Base sample size calculation
Design effect (for correlated outcomes)
Contingency (increase to account for non-response or dropout)
Rounding up to the nearest (and comfortable) number
Evaluate whether this sample size would provide a precise and conclusive answer to the research question by analyzing the data as if they turn out as expected.
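
The adjustment steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's procedure: the slides do not say how each adjustment is applied, so the code multiplies by the design effect and divides by (1 − dropout rate), which are common conventions. The function name `adjust_sample_size` and the example inputs (design effect 1.5, 10% dropout) are illustrative only.

```python
import math


def adjust_sample_size(base_n, design_effect=1.0, dropout_rate=0.0, round_to=1):
    """Inflate a base sample size for design effect and expected dropout.

    Assumed conventions (not taken from the slides):
      - design effect multiplies the base sample size
      - contingency divides by (1 - dropout_rate)
      - the result is rounded UP to a multiple of `round_to`
    """
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    if design_effect <= 0 or round_to < 1:
        raise ValueError("design_effect must be > 0 and round_to >= 1")

    inflated = base_n * design_effect            # correlated outcomes
    recruited = inflated / (1.0 - dropout_rate)  # non-response or dropout
    return math.ceil(recruited / round_to) * round_to


# Illustrative inputs: base n = 385, design effect 1.5, 10% dropout.
print(adjust_sample_size(385, design_effect=1.5, dropout_rate=0.10))               # 642
print(adjust_sample_size(385, design_effect=1.5, dropout_rate=0.10, round_to=10))  # 650
```

The final step in the list, checking that the rounded number still answers the research question, is a separate calculation and is not covered by this helper.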

Suggested approaches
For unknown parameters in the formula, try to find existing evidence or use your best “GUESSTIMATE”, a.k.a. educated guess.
Do not use only one scenario or rely on only one reference for the calculation. It is highly recommended that all key parameters be varied to see how they affect the sample size.
Always evaluate its sufficiency by estimating the main magnitude of effect and its 95% CI and checking whether it provides a conclusive finding.
Consult with the statistician early
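
One way to "evaluate sufficiency" is to compute the 95% CI that would be obtained if the data turned out exactly as expected. The sketch below does this for a difference between two means, using a normal approximation; the function name `expected_ci_for_difference` is hypothetical, and the inputs (difference 15, SDs 20 and 25, 37 per group) are borrowed from the two-group mean example later in these slides.

```python
import math
from statistics import NormalDist


def expected_ci_for_difference(diff, sd1, sd2, n_per_group, conf_level=0.95):
    """Expected CI for a difference in means if the data come out as assumed.

    Uses a normal (Z) approximation with equal group sizes; this is an
    assumption of the sketch, not something specified in the slides.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    se = math.sqrt(sd1 ** 2 / n_per_group + sd2 ** 2 / n_per_group)
    return diff - z * se, diff + z * se


low, high = expected_ci_for_difference(diff=15, sd1=20, sd2=25, n_per_group=37)
print(f"Expected 95% CI: {low:.1f} to {high:.1f}")
```

With these inputs the interval runs from roughly 4.7 to 25.3. It excludes zero, but whether a lower limit near 5 is clinically conclusive depends on the minimum meaningful difference chosen for the study.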

Common pitfalls
Unjustified sample size by specifying a “magic” number
Based on a simplified formula or a sample size table without understanding its limitations
“A previous study in this area recruited 50 subjects and found highly significant results (p=0.001), and therefore a similar sample size should be sufficient.” Never do it like this.
Inconsistent with the protocol
Relying too much on previous findings in the sample size calculation

Examples of common calculations
Mean – one group
Mean – two independent groups
Proportion – one group
Proportion – two independent groups
Get some ideas from these
Practice with your own research

Mean – one group: Formula

\[ n = \frac{Z_{\alpha/2}^{2}\, s^{2}}{d^{2}} \]

Where:
n = the sample size
Z_{α/2} = the standard normal coefficient, typically 1.96 for a 95% CI
s = the standard deviation
d = the desired precision level, expressed as half of the maximum acceptable confidence interval width

Mean – one group: Calculations (fixed α = 0.05)

Expected standard deviation   Precision (half width)   n
25                            5                        99
30                            5                        141
25                            10                       27
30                            10                       38

Mean – one group: Descriptions
A sample size of 38 would be able to estimate a mean with a precision of 10 assuming a standard deviation of 30 according to a study by <Reference>. That is, based on the expected mean of 55 <Reference>, the 95% confidence interval of the estimated mean would be between 45 and 65.
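
A minimal Python sketch of the normal-approximation calculation for one mean, n = (Z·s/d)², follows. The function name `n_for_mean` is hypothetical. Note that this Z-based version gives 97, 139, 25 and 35 for the four scenarios in the table, slightly below the tabulated 99, 141, 27 and 38. The tabulated values look consistent with software that uses the t distribution rather than Z, but the slides do not say which method produced them.

```python
import math
from statistics import NormalDist


def n_for_mean(sd, half_width, conf_level=0.95):
    """Sample size to estimate one mean to within +/- half_width.

    Normal approximation: n = (z * sd / half_width)^2, rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # about 1.96 for 95%
    return math.ceil((z * sd / half_width) ** 2)


# Vary both inputs, as in the calculations table.
for sd in (25, 30):
    for d in (5, 10):
        print(f"SD={sd}, half width={d}: n={n_for_mean(sd, d)}")
```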

Mean – two independent groups: Formula

\[ n = \frac{2\sigma^{2}\,\left(Z_{\alpha/2} + Z_{\beta}\right)^{2}}{\Delta^{2}} \]

Where:
n = sample size in each group (assumes equal sized groups)
Z_{α/2} = represents the desired level of statistical significance (typically 1.96 for α = 0.05)
Z_{β} = represents the desired power (typically 0.84 for 80% power)
σ² = a measure of variability (this is a variance, i.e. the square of the standard deviation)
Δ = minimum meaningful difference or effect size

Mean – two independent groups: Calculations (fixed α = 0.05)
H0: M1 − M2 = 0. H1: M1 − M2 = D1 ≠ 0. Test statistic: Z test with pooled variance (SD1 = 20; SD2 = 25)

Power   Mean in control grp.   Minimum meaningful difference   n1    n2
90%     30                     10                              109   109
80%     30                     10                              82    82
90%     30                     20                              28    28
80%     30                     20                              22    22
90%     35                     5                               432   432
80%     35                     5                               322   322
90%     35                     15                              49    49
80%     35                     15                              37    37

Mean – two independent groups: Descriptions
A sample size of 37 in group one and 37 in group two would have a power of 80% to detect a difference between groups of 15, assuming a mean of 35 in the control group with estimated group standard deviations of 20 and 25, respectively, according to a study by <Reference>.
The test statistic used is the two-sided two-sample t-test. The significance level of the test was targeted at 0.05.
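
The two-group calculation can be sketched with the normal approximation. Because the example uses two different standard deviations, the sketch replaces 2σ² with σ1² + σ2², which reduces to 2σ² when the SDs are equal. The function name `n_per_group_means` is hypothetical. For the tabulated scenarios it returns values at or just below the table (for example 36 rather than 37, and 108 rather than 109). A difference of about one per group is what a t-test correction would typically add, though the slides do not state how the table was produced.

```python
import math
from statistics import NormalDist


def n_per_group_means(sd1, sd2, diff, power=0.80, alpha=0.05):
    """Per-group sample size to detect `diff` between two independent means.

    Normal approximation, two-sided test, equal group sizes:
        n = (sd1^2 + sd2^2) * (z_alpha/2 + z_beta)^2 / diff^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    n = (sd1 ** 2 + sd2 ** 2) * (z_alpha + z_beta) ** 2 / diff ** 2
    return math.ceil(n)


# Vary power and the minimum meaningful difference (SD1 = 20, SD2 = 25).
for power in (0.90, 0.80):
    for diff in (5, 10, 15, 20):
        n = n_per_group_means(20, 25, diff, power=power)
        print(f"power={power:.0%}, difference={diff}: n per group={n}")
```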

Proportion – one group: Formula

\[ n = \frac{Z_{\alpha/2}^{2}\, p\,(1 - p)}{d^{2}} \]

Where:
n = the sample size
Z_{α/2} = the standard normal coefficient, typically 1.96 for a 95% CI
p = the expected proportion, expressed as a decimal (e.g., 0.45)
d = the desired precision level, expressed as half of the maximum acceptable confidence interval width

Proportion – one group: Calculations (fixed α = 0.05)

Expected prevalence   Precision (half width)   n
15%                   2%                       1,225
20%                   2%                       1,537
15%                   4%                       307
20%                   4%                       385

Proportion – one group: Descriptions
A sample size of 400 would have a 95% confidence interval of 16% to 24% assuming a prevalence of 20% according to a study by <Reference>.
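
The one-proportion calculation, n = Z²·p(1 − p)/d², is short enough to sketch directly. The function name `n_for_proportion` is hypothetical. Under the normal approximation it reproduces the four values in the calculations table.

```python
import math
from statistics import NormalDist


def n_for_proportion(p, half_width, conf_level=0.95):
    """Sample size to estimate a proportion p to within +/- half_width.

    Normal approximation: n = z^2 * p * (1 - p) / half_width^2, rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Vary prevalence and precision, as in the calculations table.
for p in (0.15, 0.20):
    for d in (0.02, 0.04):
        print(f"prevalence={p:.0%}, half width={d:.0%}: n={n_for_proportion(p, d)}")
```

For a prevalence of 20% and a half width of 4% this gives 385, which the description above rounds up to a comfortable 400.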

Proportion – two independent groups: Formula

\[ n = \frac{2\,\bar{p}\,(1 - \bar{p})\,\left(Z_{\alpha/2} + Z_{\beta}\right)^{2}}{\Delta^{2}} \]

Where:
n = sample size in each group (assumes equal sized groups)
Z_{α/2} = represents the desired level of statistical significance (typically 1.96 for α = 0.05)
Z_{β} = represents the desired power (typically 0.84 for 80% power)
p̄(1 − p̄) = a measure of variability (similar to standard deviation), where p̄ is the average of the two group proportions
Δ = minimum meaningful difference or effect size

Proportion – two independent groups: Calculations (fixed α = 0.05)
H0: P1 − P2 = 0. H1: P1 − P2 = D1 ≠ 0. Test statistic: Z test with pooled variance

Power   Proportion in control grp.   Minimum meaningful difference   n1      n2
90%     40%                          5%                              2,053   2,053
80%     40%                          5%                              1,534   1,534
90%     50%                          5%                              2,095   2,095
80%     50%                          5%                              1,565   1,565
90%     40%                          10%                             519     519
80%     40%                          10%                             388     388
90%     50%                          10%                             519     519
80%     50%                          10%                             388     388

Proportion – two independent groups: Descriptions
A sample size of 388 in group one and 388 in group two would have a power of 80% to detect a difference between groups of 10%, assuming a prevalence of 50% in the control group according to a study by <Reference>.
The test statistic used is the two-sided Z test. The significance level of the test was targeted at 0.05.
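
The sketch below implements a common sample size formula for the two-sided Z test with pooled variance, which uses the pooled proportion under H0 and the separate group proportions under H1. Two things here are assumptions rather than statements from the slides: that this is the formula behind the table, and that the second group's proportion is the control proportion plus the difference. Under those assumptions the function (hypothetical name `n_per_group_proportions`) reproduces the tabulated values.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p_control, diff, power=0.80, alpha=0.05):
    """Per-group sample size for a two-sided pooled-variance Z test.

    n = [z_a * sqrt(2 * pbar * (1 - pbar))
         + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))]^2 / diff^2
    Assumes equal group sizes and p2 = p_control + diff.
    """
    p1 = p_control
    p2 = p_control + diff
    p_bar = (p1 + p2) / 2  # pooled proportion under H0

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((null_term + alt_term) ** 2 / diff ** 2)


for p_control in (0.40, 0.50):
    for diff in (0.05, 0.10):
        for power in (0.90, 0.80):
            n = n_per_group_proportions(p_control, diff, power=power)
            print(f"control={p_control:.0%}, diff={diff:.0%}, power={power:.0%}: n={n}")
```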

Other considerations
Sampling design affects the calculation of sample size
  – Simple random sampling / assignment
  – Stratified random sampling / assignment
  – Clustered random sampling / assignment
Complex study designs affect the calculation of sample size
  – Matching
  – Multiple stages of sampling
  – Repeated measures
Usually the sample size calculation is based on the method of analysis
  – Correlation, Agreement, Diagnostic performance
  – Z-test
  – Regression – multiple linear, logistic
  – Multivariate analyses such as principal component or factor analysis
  – Survival analyses
  – Multilevel models

Other considerations
Demonstrate superiority
  – Sample size sufficient to detect a difference between treatments
  – Requires specifying the “minimum meaningful” difference
Demonstrate non-inferiority or equal effectiveness
  – Sample size required to demonstrate equivalence is larger than that required to demonstrate superiority
  – Requires specifying the “non-inferiority margin” or “equivalence range”

Precision or Power Estimation
Equivalent to sample size calculation – do it in the planning phase of the study
Do it when the number of available subjects is known
Wrong: “There are around 50 patients per year, of whom 10% may refuse to take part in the study. Therefore over the 2 years of the study, the sample size will be 90 patients.”
Correct: “It is estimated that there will be 90 patients in the clinic. This will give a precision of the prevalence estimation of 20% assuming a prevalence of 65%.”
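
When the sample size is fixed in advance, the one-proportion formula can be turned around to give the precision instead. A minimal sketch follows, with the hypothetical function name `half_width_for_proportion` and a normal approximation. For n = 90 and a prevalence of 65% it gives a half width of roughly 9.9 percentage points, i.e. a full interval width of about 20 points, which appears to be what the "precision of 20%" in the example refers to.

```python
import math
from statistics import NormalDist


def half_width_for_proportion(p, n, conf_level=0.95):
    """Half width of the CI for a proportion p given a fixed sample size n.

    Normal approximation: half width = z * sqrt(p * (1 - p) / n).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z * math.sqrt(p * (1 - p) / n)


hw = half_width_for_proportion(p=0.65, n=90)
print(f"Half width: {hw:.1%}; expected 95% CI: {0.65 - hw:.1%} to {0.65 + hw:.1%}")
```

The same function checks the earlier one-proportion description: with n = 400 and a prevalence of 20%, the half width is about 3.9 points, giving an interval of roughly 16% to 24%.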

Suggested learning resources
WWW: Statistics Guide for Research Grant Applicants at St George’s University of London (maintained by Martin Bland):
  – http://www-users.york.ac.uk/~mb55/guide/size.htm
Software: PASS2008, nQuery, EpiTable, SeqTrial, PS, etc.
Q & A