Statistics for Librarians, Session 1: What is Statistics & Why is it Important?
DESCRIPTION
First of 4 sessions introducing statistics to librarians and library staff.
TRANSCRIPT
Why is it important?
WHAT IS STATISTICS?
Goals of Series
Comfort
Fears
Series Objectives
Foundations
Descriptive Statistics
Inferential Statistics
Reading & Interpreting Statistics
Comfort Level
What is Statistics?
• Study of Data
• Collecting
• Organizing
• Summarizing
• Analyzing
• Presenting
• Storing & Sharing
Why is it Important?
• Make sense of the data
• Explain what happens and (possibly) why
• Make sound decisions
• To know how close we are to the truth.
Results
Bias?
Sampling Error?
Invalid Measures?
Random Error?
Other Factors?
Purpose of Statistics
Thinking about Data in your Research Project
Start with your Research Question
How do users differ when (searching, finding, selecting) (articles, books, Web sites)?
What are the effects of ___________ on ____________?
Which is better at improving _________?
How are people (finding, selecting, using) _______?
What are factors associated with ___________?
Example of Research Question
PACS
• Low LibQUAL+ Ratings
Collections
• Is it our collections?
Do we have what they use?
• Based on citations
Variables
Independent
Subjects
Factors
Effects of…
Dependent
Objects
Outcomes
Effects on…
Example of Variables
Faculty
• Department
• Years at UNT
Published
• # published by type
Cited
• # cited by type
• UNT accessible
IV
DV
Scales of Data (NOIR)
Nominal
• Counts by category
• Binary (Yes/No)
• No meaning between the categories (Blue is not better than Red)
Ordinal
• Ranks
• Scales
• Space between ranks is subjective
Interval
• Integers
• No baseline
• Space between values is equal and objective, but discrete
Ratio
• Interval data with a baseline
• Space between is continuous
Likert-Type Scale?
Arbitrary
Few Levels
Individual Questions
Ordinal?
Symmetrical
Many Levels
Composite Score
Interval?
Example of Variable Types
Faculty
• Department
• Years at UNT
Published
• # published by type
Cited
• # cited by type
• UNT accessible
N
N
NN
I
Compared to What?
Book Circulations
180,354
Compared by…
Time Periods
Other Libraries
National Surveys
Patron Types
Material Types
Research Question
Data Type
Comparison Group
Statistical
Methods Used
VALIDITY OF MEASURES
Are you actually measuring what you are trying to
measure?
Selecting Measures
Measures
• Counts
• Survey responses
• Grades/Scores
• Ranks
• Scales (e.g. Likert)
• Age, Length of Time
• Frequency
Units of Analysis
• People
• Books
• Articles
• Uses
• Levels of Analysis
• What is the object (DV)?
• What is the subject (IV)?
Use a tool with established validity
Approaches and Study Skills Inventory for Students (ASSIST)
User Engagement Scale (UES)
Establish Validity of Measures
Reliability
• Consistency
Content Validity
• Corresponds with expectations
• Common understandings
Construct Validity
• Corresponds with other variables based on theory
Criterion Validity
• Corresponds with other measures
Image: © Nevit Dilmen found at Wikimedia commons
Results
Bias?
Invalid Measures?
Sampling Error?
Random Error?
Other Factors?
ROLE OF SAMPLING
All members of population
Hard to measure
The Truth
Census
A selection of the population
Easier to measure
An estimate of the truth
Sample
When to Use Which: Research Question?
Census
• Book usage at UNT Libraries
• Effects of IL instruction on English 1100 students
Sample
• Book usage at all libraries
• Effects of IL instruction on all students
Example - Census or Sample?
All journal articles cited
All Items Published by PACS Faculty
All journal articles published by PACS faculty
Random Samples
• Every Unit of Analysis has an equal and known chance of being included.
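The idea of an equal and known chance of inclusion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 1,200 article IDs, the sample size of 100, and the function name `draw_simple_random_sample` are all hypothetical and not taken from the study described in these slides.

```python
import random

# Hypothetical sampling frame: one ID per unit of analysis (e.g., a cited article).
population = [f"article-{i:04d}" for i in range(1, 1201)]

def draw_simple_random_sample(units, sample_size, seed=None):
    """Return a simple random sample drawn without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(units, sample_size)

sample = draw_simple_random_sample(population, sample_size=100, seed=42)

# Every unit has the same, known chance of being selected: n / N.
inclusion_probability = len(sample) / len(population)
print(f"Sampled {len(sample)} of {len(population)} units "
      f"(inclusion probability = {inclusion_probability:.3f})")
```

Recording the seed alongside the sample lets someone else reproduce exactly which units were drawn.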
Importance of Randomness
Random Samples
Random, Weighted, etc.
Should be representative of population
Can use inferential statistics
Most useful for testing hypotheses
Non-Random Samples
Convenience, Purposive, etc.
May or may not be
representative of population
Use descriptive
statistics only
Most useful for generating hypotheses
Results
Bias?
Invalid Measures?
Sampling Error?
Random Error?
Other Factors?
ROLE OF DATA COLLECTION IN STATISTICS
Goal of Data Collection in Statistics
Reliability
Bias
Bias: Systematic (not random) deviation from the true value (Statistics.com)
Selection Bias
Measurement
• Observer Bias
• Non-response Bias
Analysis Bias
Data Collection Forms
Many or Complex Variables
Surveys
1 Unit Per Form
Fewer Variables
Collected all at once
Bibliometric
Space Surveys
Spreadsheet
Data Input
Have a data entry plan
Train the inputters
Use data validation tricks
Double-entry
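One way to picture double-entry is as a comparison of two independent keyings of the same records, flagging every field where they disagree. The sketch below assumes each pass is stored as a dictionary keyed by record ID; the function name `find_entry_mismatches` and the sample records are hypothetical.

```python
def find_entry_mismatches(first_pass, second_pass):
    """Compare two independent keyings of the same records.

    Both arguments map a record ID to a dict of field values.
    Returns a list of (record_id, field, first_value, second_value).
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        row_a = first_pass.get(record_id, {})
        row_b = second_pass.get(record_id, {})
        for field in sorted(set(row_a) | set(row_b)):
            if row_a.get(field) != row_b.get(field):
                mismatches.append(
                    (record_id, field, row_a.get(field), row_b.get(field))
                )
    return mismatches

# Hypothetical data keyed by two different people.
entry_a = {1: {"department": "History", "citations": 12},
           2: {"department": "Sociology", "citations": 7}}
entry_b = {1: {"department": "History", "citations": 21},
           2: {"department": "Sociology", "citations": 7}}

for record_id, field, a, b in find_entry_mismatches(entry_a, entry_b):
    print(f"Record {record_id}, field '{field}': {a!r} vs {b!r}")
```

Each flagged field is then checked against the original source rather than resolved by guessing which keying is right.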
Organizing Data
One Unit of Analysis per Row
Example Spreadsheets
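As a rough illustration of the one-unit-per-row layout, the sketch below reads a small table in which each row is one cited item and each column is one variable. The column names and values are invented for the example and do not come from the original spreadsheets.

```python
import csv
import io

# Hypothetical layout: each row is one cited item (the unit of analysis),
# and each column is one variable recorded about that item.
raw = """item_id,faculty_department,material_type,unt_accessible
1,Public Administration,journal,yes
2,Public Administration,book,no
3,Criminal Justice,journal,yes
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Because every row is exactly one unit, counting rows counts units.
accessible = sum(1 for row in rows if row["unt_accessible"] == "yes")
print(f"{accessible} of {len(rows)} cited items are accessible")
```

Keeping to this layout means most spreadsheet and statistics tools can filter, count, and cross-tabulate the data without restructuring it first.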
Results
Bias?
Invalid Measures?
Sampling Error?
Random Error?
Other Factors?
STATISTICAL ANALYSIS
Central Tendency
Error
Spread
Elements of Statistical Analysis
Inferential
• Infer associations
Descriptive
• Describe
Descriptive Analysis
Just the Facts, Ma’am
Summarizes
Tables
Charts
Univariate
One variable at a time
Comparison with
Population
Demonstrates how random the sample is
Measures of Central Tendency
• Average
Mean
• Middle
Median
• Most Common
Mode
Central Tendency by Scales
Interval or Ratio
Mean
Median
Nominal or Rank
Median (ranked data only)
Mode
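The three measures can be tried out with Python's standard `statistics` module. The numbers and categories below are made up for illustration; the point is only which measure suits which scale of data.

```python
import statistics

# Hypothetical ratio-scale data: citations per published article.
citations = [2, 3, 3, 5, 8, 13, 40]
print("Mean:  ", statistics.mean(citations))    # arithmetic average
print("Median:", statistics.median(citations))  # middle value, less affected by the 40

# Hypothetical nominal data: material type of each cited item.
material_types = ["journal", "book", "journal", "website", "journal"]
print("Mode:  ", statistics.mode(material_types))  # most common category
```

With one very large value in the list, the mean is pulled well above the median, which is one reason to report both for skewed data.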
Spread
Interval & Ratio
• Range
• Quartiles or Quintiles
• Standard Deviation
Nominal & Rank
• Distribution Tables
• Bar Graphs
How variable is the data?
Range & Quartiles
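Range and quartiles can be computed with the standard library, as in the sketch below. The checkout counts are hypothetical. Note that several quartile conventions exist; this example uses the default "exclusive" method of `statistics.quantiles`, so other tools (Excel, for instance) may report slightly different cut points.

```python
import statistics

# Hypothetical data: number of checkouts for ten books.
checkouts = [1, 2, 2, 3, 4, 5, 7, 9, 12, 30]

# Range: distance between the smallest and largest value.
data_range = max(checkouts) - min(checkouts)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(checkouts, n=4)

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(f"Range: {data_range}")
print(f"Quartiles: Q1={q1}, Q2 (median)={q2}, Q3={q3}")
print(f"Interquartile range: {iqr}")
```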
Standard Deviation
• Measure of dispersion of data
• Square root of the average squared deviation from the mean
What does the Standard Deviation tell you?
Greater variation, less certainty
Lower variation, more certainty
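To make the calculation concrete, the sketch below computes a standard deviation step by step for two hypothetical data sets that share the same mean. It uses the population form (dividing by n); the sample form, available as `statistics.stdev`, divides by n - 1 instead. The function name `population_sd` is just a label for this example.

```python
import math
import statistics

def population_sd(values):
    """Square root of the average squared deviation from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))

# Two hypothetical sets of daily visit counts with the same mean (50).
steady = [48, 50, 50, 52]
erratic = [10, 30, 70, 90]

print(f"Steady:  mean={statistics.mean(steady)}, SD={population_sd(steady):.2f}")
print(f"Erratic: mean={statistics.mean(erratic)}, SD={population_sd(erratic):.2f}")
```

The means are identical, but the much larger standard deviation of the second set signals that its mean is a less dependable summary of any single day.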
Presentation of Spread
• Box plots
• Mean
• Upper & lower quintiles
• Outliers
• Cross-tabulations
• Bar graphs
Spread of Nominal data
Bar graphs & plots
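For nominal data, a distribution table is simply a count per category, and a bar graph is the same counts drawn as bars. A minimal text-only sketch, using invented material types:

```python
from collections import Counter

# Hypothetical nominal data: material type of each cited item.
material_types = ["journal", "book", "journal", "website", "journal"]

counts = Counter(material_types)
total = sum(counts.values())

# Distribution table with a simple text bar for each category.
for category, count in counts.most_common():
    share = count / total
    bar = "#" * count
    print(f"{category:<8} {count:>3} {share:>5.0%}  {bar}")
```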
Inferential Statistics
Tests of hypotheses
• Associations
• Expectations
Accounts for uncertainty
• Random error
• Confidence interval
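As one illustration of accounting for uncertainty, the sketch below computes an approximate 95% confidence interval for a proportion estimated from a random sample. It uses the simple normal-approximation (Wald) formula, which is an assumption of this example and works best with large samples and proportions not too close to 0 or 1. The counts (160 of 200) are hypothetical.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation).

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n                               # sample proportion
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)  # random error of p_hat
    margin = z * standard_error
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical sample: 160 of 200 sampled citations were accessible.
low, high = proportion_confidence_interval(successes=160, n=200)
print(f"Estimated proportion: {160 / 200:.0%}")
print(f"Approximate 95% confidence interval: {low:.1%} to {high:.1%}")
```

A larger sample shrinks the standard error, so the interval around the same estimate becomes narrower.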
Hypotheses
Your Hypothesis (H1)
Null Hypothesis (H0)
Example Hypothesis
>=75%* <75%*
*…of journal articles cited by UNT PACS faculty in journal articles published between 2008-2011.
UNT Libraries provides access to…
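A hypothesis of this shape could be checked with a one-sided binomial test, sketched below. This is an assumed approach for illustration, not necessarily the method used in the study: it supposes a random sample of cited articles, uses invented counts (160 accessible out of 200 sampled), and asks how likely a result at least that high would be if the true rate sat at the 75% boundary.

```python
from math import comb

def binomial_upper_tail(successes, n, p0):
    """P(X >= successes) when X follows a Binomial(n, p0) distribution."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical sample of cited articles (not real study data).
n_sampled = 200
n_accessible = 160
threshold = 0.75   # boundary between the two hypotheses
alpha = 0.05       # significance level chosen before looking at the data

p_value = binomial_upper_tail(n_accessible, n_sampled, threshold)
print(f"Observed proportion: {n_accessible / n_sampled:.1%}")
print(f"One-sided p-value: {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis at the chosen significance level.")
else:
    print("Cannot reject the null hypothesis at the chosen significance level.")
```

The p-value depends on both the observed proportion and the sample size, so the same 80% observed in a larger sample would give stronger evidence.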
Hypothesis Testing
p
Sample Size
Central Tendency
Spread
Distribution
Significance Level
Statistical Analysis
Noise
Signal
Results
Bias?
Sampling Error?
Invalid Measures?
Random Error?
Other Factors?
Purpose of Statistics
Valid
• Measures
• Data Collection
• Sample Selection
• Statistical Methods
Valid
• Data
• Sample
• Statistical Analysis
Valid
• Results
Role of Validity in Research
Resources
Rice Virtual Lab in Statistics
Excel Tutorials for Statistical Analysis
Khan Academy - videos
Basic Research Methods for Librarians - ebook
Descriptive Statistical Techniques for Librarians - ebook