statistics pres 3.31.2014

ADLT 673 : TEACHING AS SCHOLARSHIP IN MEDICAL EDUCATION

MONDAY, MARCH 31 , 2014

An Overview of Quantitative Data Analysis

Outline of Today’s Class

Analytic Methods
  Summary Measures
  Hypothesis Testing
  Statistical Methodologies
  Group Discussion

Sample Size Determination
  Group Discussion

Additional Resources

Analytic Methods: Summary Measures

Representative Measures

Reflect the most “typical” or “average” data value.

Continuous Measurements: Mean (Average), Median and Mode

Categorical Measurements: Frequencies and Proportions


Measures of Variability

Reflect how much the values differ from one another.

Continuous Measurements: Standard deviation, range, interquartile range

Categorical Measurements: None that are meaningful (sorry!)
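As a rough illustration of the summary measures above, here is a minimal sketch using only Python's standard library. The scores and pass/fail results are hypothetical values made up for this example; they do not come from the slides.

```python
import statistics
from collections import Counter

# Hypothetical continuous measurements (e.g., exam scores); not from the slides.
scores = [72, 85, 85, 90, 64, 78, 85, 93, 70, 88]

# Representative measures
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability
std_dev = statistics.stdev(scores)              # sample standard deviation
value_range = max(scores) - min(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
iqr = q3 - q1                                   # interquartile range

# Hypothetical categorical measurements: frequencies and proportions
results = ["pass", "pass", "fail", "pass", "fail", "pass"]
frequencies = Counter(results)
proportions = {k: v / len(results) for k, v in frequencies.items()}

print(mean, median, mode, std_dev, value_range, iqr)
print(dict(frequencies), proportions)
```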

[Figure: histograms and normal quantile plots for “Normally” Distributed Data and “Skewed” Data]


Measures of Association

Continuous Measures: Correlation Coefficient (ρ): -1 ≤ ρ ≤ 1

Correlations close to 1 indicate two measurements are highly predictive and “track” with one another.

Correlations close to -1 indicate two measurements are highly predictive and have an inverse relationship.

Correlations close to 0 indicate little linear association.

Categorical Measures: Odds Ratio (OR): 0 < OR < ∞

OR greater than 1 indicates outcome (e.g., passed test) more likely in test group than in control.

OR less than 1 indicates outcome less likely in test group than in control.

OR ≈ 1 indicates little difference in outcomes between groups.
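The two measures of association can be sketched as follows. This is only an illustration: the helper names `pearson_r` and `odds_ratio` and all of the data are hypothetical, and the correlation computed is the sample (Pearson) estimate of ρ.

```python
from math import sqrt
from statistics import fmean

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def odds_ratio(test_yes, test_no, control_yes, control_no):
    """Odds of the outcome in the test group divided by the odds in control."""
    return (test_yes / test_no) / (control_yes / control_no)

# Hypothetical paired measurements: hours studied and exam score.
hours = [2, 4, 5, 7, 9, 10]
score = [60, 65, 70, 78, 85, 91]
r = pearson_r(hours, score)

# Hypothetical 2x2 table: 40 of 50 pass in the test group, 30 of 50 in control.
or_value = odds_ratio(40, 10, 30, 20)

print(r, or_value)
```

Here the odds ratio is above 1, so passing is more likely in the hypothetical test group than in control.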

Analytic Methods: Hypothesis Testing

Most commonly accepted format of providing quantitative evidence.

Consists of 5 Steps:

1. Translate research question into a set of testable hypotheses.

2. Select most appropriate statistical test for your hypotheses.

3. Collect your data.

4. Calculate test statistic and/or p-value.

5. Make decision.


Translating Research Question into Testable Hypotheses

Identify parameter: population mean (μ), proportion (p) or difference (e.g., μ1 - μ2).

Identify statements made about that parameter. Should be in the form of: <, ≤, >, ≥, = or ≠

Write research question in symbolic form, and find its opposite.

Opposite of “<” is “≥”

Opposite of “>” is “≤”

Opposite of “=” is “≠”


Example: Does an active learning curriculum improve the proportion of students passing their board examinations compared to students receiving the standard curriculum?

Parameter: proportion passing board exams, p

Statement: p_active is greater than p_standard

Symbolic Form: p_active > p_standard or p_active - p_standard > 0

Opposite of Symbolic Form: p_active ≤ p_standard or p_active - p_standard ≤ 0


Testable Hypotheses:

Null Hypothesis: Statement that parameter (or difference) is equal to zero.

Any statement in symbolic form with a ≤, ≥ or = is automatically the null (note: we replace ≤ or ≥ with =).

Alternative Hypothesis: Statement that parameter (or difference) is somehow different from zero.

Any statement in symbolic form with a <, > or ≠ is automatically the alternative.

Example:

p_active - p_standard > 0 becomes the alternative (HA)

p_active - p_standard ≤ 0 becomes the null (H0)


Make Decision

Based on the statistical methodology you use, you get a p-value.

Probability of observing outcomes at least as extreme as the data you actually observed, given the null hypothesis is true.

Plain English: If your intervention was ineffective, the p-value is the probability of observing results at least as extreme as what you observed.

If this probability is high, then your results are consistent with the null hypothesis, and you fail to reject the null (no evidence that the intervention worked).

If this probability is low, then your results do not seem to match the null hypothesis, and you reject the null (intervention likely worked).

In practice: we compare p-value to significance level (α = 0.05).

If p-value ≥ 0.05, we fail to reject the null.

If p-value < 0.05, we reject the null.
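To make the decision rule concrete, here is a sketch of the active learning example as a one-sided two-proportion z-test (normal approximation). The pass counts are hypothetical, the function name is made up for this illustration, and the z-test is one reasonable choice of test rather than the method prescribed by the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_two_proportion_p_value(pass_active, n_active, pass_standard, n_standard):
    """p-value for H0: p_active - p_standard <= 0 vs HA: p_active - p_standard > 0."""
    p_active = pass_active / n_active
    p_standard = pass_standard / n_standard
    # Pooled proportion: the best estimate of a common pass rate under H0.
    pooled = (pass_active + pass_standard) / (n_active + n_standard)
    se = sqrt(pooled * (1 - pooled) * (1 / n_active + 1 / n_standard))
    z = (p_active - p_standard) / se
    # Upper-tail probability: results at least this extreme if H0 is true.
    return 1 - NormalDist().cdf(z)

alpha = 0.05
# Hypothetical data: 90 of 100 pass with active learning, 78 of 100 with standard.
p_value = one_sided_two_proportion_p_value(90, 100, 78, 100)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(p_value, decision)
```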

Analytic Methods: Continuous Data

                 # of Measurements
# of Samples     Single                          Pre/Post         Repeated Measures
1 Sample         t-test                          Paired t-test    Repeated Measures ANOVA (RMA) / Linear Mixed Model (LMM)*
2 Samples        Two-sample t-test               RMA / LMM*       RMA / LMM*
“k” Samples      Analysis of Variance (ANOVA)    RMA / LMM*       RMA / LMM*

Adjusting for Covariates: Multiple Linear Regression*, Analysis of Covariance (ANCOVA)*, Linear Mixed Models*

*Will likely require statistical assistance
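As one example from the table (1 sample, pre/post), a paired t-test statistic can be sketched with the standard library alone. The pre/post scores are hypothetical, and `paired_t_statistic` is a name invented for this sketch. Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; for 7 degrees of freedom the two-sided 5% critical value is about 2.365.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(pre, post):
    """t statistic for H0: mean(post - pre) = 0, with n - 1 degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores for the same 8 learners before and after an intervention.
pre = [62, 70, 68, 75, 80, 66, 71, 73]
post = [68, 74, 67, 82, 85, 70, 78, 75]

t_stat, df = paired_t_statistic(pre, post)
print(t_stat, df)
```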

Analytic Methods: Categorical Data

                 # of Measurements
# of Samples     Single             Pre/Post          Repeated Measures
1 Sample         z-test             McNemar’s Test    Generalized Linear Mixed Models (GLMM)*
2 Samples        Chi-square Test    GLMM*             GLMM*
“k” Samples      Chi-square Test    GLMM*             GLMM*

Adjusting for Covariates: Multiple Logistic Regression*, Generalized Linear Mixed Models*

*Will likely require statistical assistance
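For the 2 samples, single measurement cell, a chi-square test on a 2x2 table can be sketched as below. The counts are hypothetical, the function name is invented for this sketch, and no continuity correction is applied. The p-value line relies on the fact that with 1 degree of freedom the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts if group and outcome were independent.
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

# Hypothetical table: rows are groups, columns are pass / fail.
stat, p_value = chi_square_2x2(90, 10, 78, 22)
print(stat, p_value)
```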

Analytic Methods: Group Discussion

Please break into groups by table

For the next 10-15 minutes, take turns discussing what analytic approaches are appropriate for your proposed study.

What are your null and alternative hypotheses?

Is your outcome continuous or categorical?

How many groups and measurements?

If your study is qualitative, discuss how statistical methodologies could be used (e.g. data summary, association).

Sample Size Determination

As a general rule, larger sample sizes:

Lead to more representative samples

Lead to better estimation of parameters (e.g., representative measures)

Provide estimators with lower variability

[Figure: panels for N=9, N=36 and N=100]


Averages over 10,000 Simulations

Sample Size    Sample Mean    Sample Std. Dev.    Standard Error*
9              204.4          36.5                12.3
16             204.3          37.1                9.5
25             204.2          37.2                7.8
36             204.1          37.5                6.5
49             204.1          37.6                5.5
64             204.2          37.7                4.9
81             204.1          37.7                4.2
100            204.1          37.7                3.9
1000           204.1          37.7                1.2

*SE: explains variability in estimator; not the sample data
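A simulation along these lines can be sketched as follows. The slides do not state which population was sampled, so this sketch assumes a normal population with mean 204 and standard deviation 37.7 (values chosen to resemble the table) and uses fewer simulations to keep it quick. It will not reproduce the table exactly.

```python
import random
from statistics import fmean, stdev

def simulate(n, n_sims=2000, mu=204.0, sigma=37.7, seed=0):
    """Return (average sample mean, average sample SD, SD of the sample means)."""
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(fmean(sample))
        sds.append(stdev(sample))
    # The spread of the sample means is the (empirical) standard error.
    return fmean(means), fmean(sds), stdev(means)

results = {n: simulate(n) for n in (9, 36, 100)}
for n, (avg_mean, avg_sd, std_error) in results.items():
    print(n, round(avg_mean, 1), round(avg_sd, 1), round(std_error, 1))
```

The sample standard deviation stays roughly constant as N grows, while the standard error shrinks roughly in proportion to 1/√N.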


Possible Decisions

                     True State
Decision             H0 is “True”         HA is True
Reject H0            Type I Error (α)     Correct Decision (Power = 1 - β)
Fail to Reject H0    Correct Decision     Type II Error (β)


Determinants of Required Sample Size

Significance Level (α): probability of rejecting H0 when it is true.

Power (1-β): probability of rejecting H0 when it is false.

These values are selected during design phase: α = 5%, 1-β = 80% (sometimes 90%).
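Both quantities can be seen as rejection rates over many repeated studies. The sketch below simulates two-group studies and counts how often H0 is rejected; the group size, effect, and number of simulations are hypothetical, and the standard deviation is treated as known so that a simple z-test can be used.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_diff, n=30, sigma=1.0, alpha=0.05, n_sims=4000, seed=1):
    """Fraction of simulated two-group studies in which H0: mu1 - mu2 = 0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n)  # sigma treated as known, so a z-test applies
    rejections = 0
    for _ in range(n_sims):
        group1 = [rng.gauss(true_diff, sigma) for _ in range(n)]
        group2 = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (fmean(group1) - fmean(group2)) / se
        rejections += abs(z) > z_crit
    return rejections / n_sims

type_i_rate = rejection_rate(true_diff=0.0)   # H0 true: should land near alpha
power = rejection_rate(true_diff=0.75)        # HA true: estimated power
print(type_i_rate, power)
```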



Measure of variability (usually standard deviation) inherent in study population.

As data become more variable… standard error of test statistic increases… p-value increases… ability to reject H0 decreases… power decreases.

Controlling variability:

Better measurement methodology

Homogeneous samples



Effect Size: smallest difference or change in outcome that you are hoping to find.

As difference you want to observe decreases… test statistic decreases… p-value increases… ability to reject H0 decreases… power decreases.

Considerations:

Clinical significance

Clinical possibility (larger differences are easier to detect statistically but harder to achieve in practice)
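The effect of variability and effect size on power can be sketched with a normal approximation for a two-sided, two-sample comparison of means with equal group sizes. The function name and every number below are hypothetical, chosen only to show the direction of each effect.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with equal group sizes."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n_per_group)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf(abs(effect_size) / se - z_crit)

baseline = approx_power(effect_size=10, sigma=20, n_per_group=50)
noisier = approx_power(effect_size=10, sigma=30, n_per_group=50)        # more variability
smaller_effect = approx_power(effect_size=5, sigma=20, n_per_group=50)  # smaller difference
print(baseline, noisier, smaller_effect)
```

With the sample size held fixed, both the noisier data and the smaller effect give lower power than the baseline.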


Calculating Required Sample Size

Equations exist (involving α, β, variability and effect size) for simple analytic methods (t-test, chi-square, etc.).

Advanced methods require professional assistance.

Where do you find variability and effect size?

Previous literature of similar populations

Pilot study

Guess-timates
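One such equation, for comparing two group means with equal group sizes, is n per group = 2 (z_{1-α/2} + z_{1-β})² σ² / δ², where σ is the standard deviation and δ is the effect size. The sketch below applies it with hypothetical inputs. It uses the normal approximation, so a t-test would need slightly more subjects, and it is not a substitute for a full power analysis.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, sigma, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided, two-sample comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / effect_size) ** 2)

# Hypothetical inputs: detect a 10-point difference when the SD is 20 points.
n_80 = n_per_group(effect_size=10, sigma=20)              # 80% power
n_90 = n_per_group(effect_size=10, sigma=20, power=0.90)  # 90% power
print(n_80, n_90)
```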


What if required sample size is too large?

Consider a different outcome

Continuous measures generally require smaller sample sizes than categorical measures

Consider multiple sections or sites

Will require more sophisticated analytic methods

Reconfigure study as a “pilot”

Emphasis switches from “hypothesis testing” to “estimation” and “data summary”

Goal is to provide data summaries and estimate confidence intervals

Summaries can be used to power larger study

Sample Size Determination: Group Discussions

Please break into groups by table.

For the next 10-15 minutes, take turns discussing: Whether you will be able to power your study.

Where to find information to perform power analysis.

Your options if you are unable to adequately power your study.


VCU Department of Biostatistics

18 full-time faculty

Can assist with: study design, sample size determination, interim and final analyses, dissemination

Grant funding (or prospects of funding) usually required.

BIOS 516 Biostatistical Consulting: graduate students available for FREE consultations

Contact Russ Boyle ([email protected]) and provide a protocol.



VCU Center for Clinical and Translational Research

Research Incubator: study design, sample size determination, and other resources (e.g. grant writing)

Contact: Pam Dillon ([email protected])

Biomedical Informatics: data management and storage (e.g. REDCap)

Support requested online: http://www.cctr.vcu.edu/informatics/index.html



Textbooks (i.e., shameless plug):

Statistical Research Methods: A Guide for Non-Statisticians

Sabo and Boone, Springer, 2013

Available on the web ($45-$65):

http://www.springer.com/statistics//life+sciences,+medicine+%26+health/book/978-1-4614-8707-4

http://www.amazon.ca/Statistical-Research-Methods-Guide-Non-Statisticians/dp/1461487072



