Chapter 1 Introduction to Statistics
1-1 Review and Preview
1-2 Statistical and Critical Thinking
1-3 Types of Data
1-4 Collecting Sample Data
Preview
Polls, studies, surveys and other data collecting tools collect data from a small part of a larger group so that we can learn something about the larger group.
This is a common and important goal of statistics: Learn about a large group by examining data from some of its members.
Preview
In this context, the terms sample and population have special meaning. Formal definitions for these and other basic terms will be given here.
In this chapter, we will look at some of the ways to describe data.
Data
Data - Collections of observations, such as measurements, genders, or survey responses
Statistics
Statistics - The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data
Population
Population - The complete collection of all measurements or data that are being considered
Census versus Sample
Census - Collection of data from every member of a population
Sample - Subcollection of members selected from a population
Example
The Gallup corporation collected data from 1013 adults in the United States. Results showed that 66% of the respondents worried about identity theft.
The population consists of all 241,472,385 adults in the United States.
The sample consists of the 1013 polled adults.
The objective is to use the sample data as a basis for drawing a conclusion about the whole population.
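The relationship between a population and a sample can be mimicked with a small simulation. The sketch below is illustrative only: the population is synthetic, and both its size (100,000) and its 66% rate are assumptions chosen for the demonstration, not Gallup's data.

```python
import random

def sample_proportion(population, n, seed=None):
    """Draw a simple random sample of size n and return the share of True values."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    return sum(sample) / n

# ASSUMPTION: hypothetical population of 100,000 adults, 66% of whom
# worry about identity theft (True = worried).
population = [True] * 66_000 + [False] * 34_000

# The population proportion describes everyone; the sample proportion
# describes only the 1013 members that were polled.
population_proportion = sum(population) / len(population)
sample_result = sample_proportion(population, n=1013, seed=1)

print(f"population proportion: {population_proportion:.3f}")
print(f"sample proportion:     {sample_result:.3f}")
```

The sample proportion will usually land close to, but not exactly on, the population proportion, which is why a well-chosen sample of about a thousand people can say something useful about a much larger group.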
1-2 Statistical and Critical Thinking
This section provides an overview of the process involved in conducting a statistical study:
• Prepare
• Analyze
• Conclude
Prepare - Context
What do the data mean? What is the goal of the study?
Prepare - Source of the Data
Is the source objective? Is the source biased? Be vigilant and skeptical of studies from sources that may be biased.
Prepare - Sampling Method
Does the method chosen greatly influence the validity of the conclusion?
Voluntary response (or self-selected) samples often have bias (those with special interest are more likely to participate).
Other methods are more likely to produce good results.
Analyze – Graph and Explore
Every analysis should begin with appropriate graphs (Chapter 2).
Analyze – Apply Statistical Methods
Later chapters describe important statistical methods.
With technology, good analysis does not require strong computational skills, but it does require using common sense and paying attention to sound statistical methods.
Conclude – Statistical Significance
Statistical significance is achieved in a study when we get a result that is very unlikely to occur by chance.
Conclude - Practical Significance
State practical implications of the results.
Common sense might suggest that the finding does not make enough of a difference to justify its use or to be practical.
Example
In a test of the Atkins weight loss program, 40 subjects had a mean weight loss of 4.6 pounds after one year.
Using formal methods of statistical analysis, we can conclude the diet appears to be effective.
Example - continued
However, although 4.6 pounds is statistically significant, using common sense, it does not seem very worthwhile.
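The distinction between statistical and practical significance can be sketched numerically. The slides give only the sample size (40) and the mean loss (4.6 pounds); the standard deviation and the "worthwhile" threshold below are hypothetical values assumed purely for illustration, and the one-sample t statistic is one plausible choice for the "formal methods" mentioned above, not necessarily the method used in the original study.

```python
import math

def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """One-sample t statistic: distance of the mean from the hypothesized value, in standard errors."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

n = 40                  # subjects (from the example)
mean_loss = 4.6         # mean pounds lost after one year (from the example)
assumed_sd = 10.0       # ASSUMPTION: hypothetical standard deviation, not given in the slides
critical_t = 1.685      # one-tailed 5% critical value for 39 degrees of freedom (t table)
worthwhile_loss = 15.0  # ASSUMPTION: hypothetical threshold for a worthwhile result

t = t_statistic(mean_loss, 0.0, assumed_sd, n)
statistically_significant = t > critical_t
practically_significant = mean_loss >= worthwhile_loss

print(f"t = {t:.2f}")
print("statistically significant:", statistically_significant)
print("practically significant:  ", practically_significant)
```

Under these assumed values the result clears the statistical threshold while falling short of the practical one, which is the situation the example describes.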
Potential Pitfalls – Misleading Conclusions
Concluding that one variable causes the other variable when in fact the variables are only correlated or associated together.
Two variables that may seem linked are smoking and pulse rate.
We cannot conclude that one causes the other. Correlation does not imply causality.
Potential Pitfalls - Small Samples
Conclusions should not be based on samples that are far too small.
Example: Basing a school suspension rate on a sample of only three students
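A short simulation can show why very small samples are unreliable. This is a sketch under assumptions: the 10% suspension rate is a made-up value, and each student is modeled as an independent draw.

```python
import random

def proportion_spread(true_rate, n, trials=2000, seed=0):
    """Return (min, max) of the sample proportions seen over repeated samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(trials):
        hits = sum(rng.random() < true_rate for _ in range(n))
        proportions.append(hits / n)
    return min(proportions), max(proportions)

TRUE_RATE = 0.10  # ASSUMPTION: hypothetical suspension rate

small = proportion_spread(TRUE_RATE, n=3)    # samples of three students
large = proportion_spread(TRUE_RATE, n=300)  # samples of three hundred students

print("n = 3:   min/max sample rate", small)
print("n = 300: min/max sample rate", large)
```

With three students the only possible sample rates are 0, 1/3, 2/3, and 1, so the estimate swings wildly; with three hundred students the sample rates cluster much more tightly around the assumed rate.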
Potential Pitfalls - Loaded Questions
If survey questions are not worded carefully, the results of a study can be misleading.
97% yes: “Should the President have the line item veto to eliminate waste?”
57% yes: “Should the President have the line item veto, or not?”
Potential Pitfalls - Order of Questions
Questions are unintentionally loaded by such factors as the order of the items being considered.
Would you say traffic contributes more or less to air pollution than industry? Results: traffic - 45%; industry - 27%
When the order is reversed. Results: industry - 57%; traffic - 24%
Potential Pitfalls - Percentages
Misleading or unclear percentages are sometimes used.
Example – Continental Airlines ran an ad claiming “We’ve already improved 100% in the last six months” with respect to lost baggage.
Does this mean Continental made no mistakes?
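The arithmetic behind the question can be checked directly. The sketch below reads "improved 100%" as a 100% reduction in lost bags, which is only one possible interpretation of the ad; the bag counts are hypothetical.

```python
def percent_reduction(before, after):
    """Percentage by which a count fell from `before` to `after`."""
    return (before - after) / before * 100

# ASSUMPTION: hypothetical counts of lost bags, for illustration only.
print(percent_reduction(200, 100))  # 50.0
print(percent_reduction(200, 0))    # 100.0
```

Under this reading, a 100% improvement is reached only when the number of lost bags falls to zero, which is why the claim is unclear.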
Potential Pitfalls - Nonresponse
Occurs when someone either refuses to respond to a survey question or is unavailable.
People who refuse to talk to pollsters have a view of the world around them that is markedly different from that of those who will let pollsters into their homes.
Potential Pitfalls - Missing Data
Can dramatically affect results.
Subjects may drop out for reasons unrelated to the study.
Example - People with low incomes are less likely to report their incomes.
Example – U.S. Census suffers from missing people (tend to be homeless or low income).
Potential Pitfalls - Precise Numbers
Because a figure is precise, many people incorrectly assume that it is also accurate.
A precise number can be an estimate, and it should be referred to that way.
1-3 Types of Data
The subject of statistics is largely about using sample data to make inferences about an entire population.
It is essential to know and understand the definitions that follow.
Parameter
Parameter - a numerical measurement describing some characteristic of a population.
Statistic
Statistic - a numerical measurement describing some characteristic of a sample.
Quantitative Data
Quantitative (or numerical) data consist of numbers representing counts or measurements.
Example: The weights of supermodels
Example: The ages of respondents
Categorical Data
Categorical (or qualitative or attribute) data consist of names or labels (representing categories).
Example: The gender (male/female) of professional athletes
Example: Shirt numbers on professional athletes' uniforms - substitutes for names.
Working with Quantitative Data
Quantitative data can be further described by distinguishing between discrete and continuous types.
Discrete Data
Discrete data result when the number of possible values is either a finite number or a ‘countable’ number (i.e., the number of possible values is 0, 1, 2, 3, ...).
Example: The number of eggs that a hen lays
Continuous Data
Continuous (numerical) data result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions, or jumps.
Example: The amount of milk that a cow produces; e.g. 2.343115 gallons per day
Levels of Measurement
Another way to classify data is to use levels of measurement.
Nominal Level
Nominal level of measurement - characterized by data that consist of names, labels, or categories only, and the data cannot be arranged in an ordering scheme (such as low to high).
Example: Survey responses yes, no, undecided
Ordinal Level
Ordinal level of measurement - involves data that can be arranged in some order, but differences between data values either cannot be determined or are meaningless.
Example: Course grades A, B, C, D, or F
Interval Level
Interval level of measurement - involves data that can be arranged in order and the difference between any two data values is meaningful. However, there is no natural zero starting point (where none of the quantity is present).
Example: Years 1000, 2000, 1776, and 1492
Ratio Level
Ratio level of measurement - the interval level with the additional property that there is also a natural zero starting point (where zero indicates that none of the quantity is present); for values at this level, differences and ratios are meaningful.
Example: Prices of college textbooks ($0 represents no cost, a $100 book costs twice as much as a $50 book)
Summary - Levels of Measurement
Nominal - categories only
Ordinal - categories with some order
Interval - differences but no natural zero point
Ratio - differences and a natural zero point
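The difference between the interval and ratio levels can be illustrated with a quick calculation. Temperature is used here as an extra interval-level illustration that does not appear in the slides; the textbook prices follow the ratio-level example above.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval level: 0 degrees does not mean "no temperature", so ratios
# depend on the arbitrary choice of scale.
c_low, c_high = 10.0, 20.0
f_low, f_high = celsius_to_fahrenheit(c_low), celsius_to_fahrenheit(c_high)

ratio_celsius = c_high / c_low     # 20 / 10 = 2.0
ratio_fahrenheit = f_high / f_low  # 68 / 50 = 1.36

# Ratio level: $0 means no cost, so ratios survive a change of units.
ratio_dollars = 100 / 50                  # 2.0
ratio_cents = (100 * 100) / (50 * 100)    # 2.0

print(ratio_celsius, ratio_fahrenheit)
print(ratio_dollars, ratio_cents)
```

Because the temperature ratio changes when the scale changes, calling 20 degrees "twice as hot" as 10 degrees is not meaningful, whereas a $100 book costs twice as much as a $50 book in any currency unit.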
1-4 Collecting Sample Data
If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.
The method used to collect sample data influences the quality of the statistical analysis.
Of particular importance is the simple random sample.
Basics of Collecting Data
Statistical methods are driven by the data that we collect. We typically obtain data from two distinct sources: observational studies and experiments.
Observational Study
Observational study - observing and measuring specific characteristics without attempting to modify the subjects being studied.
Experiment
Experiment - apply some treatment and then observe its effects on the subjects (subjects in experiments are called experimental units)
Example
The Pew Research Center surveyed 2252 adults and found that 59% of them go online wirelessly. This is an observational study because the adults had no treatment applied to them.
Example
In the largest public health experiment ever conducted, 200,745 children were given the Salk vaccine, while another 201,229 children were given a placebo. The vaccine injections constitute a treatment that modified the subjects, so this is an example of an experiment.
Simple Random Sample
Simple Random Sample - A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
Random Sample
Random Sample - Members from the population are selected in such a way that each individual member in the population has an equal chance of being selected.
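One way to draw a simple random sample in practice is with Python's standard `random.sample`, which selects without replacement so that every subset of the requested size is equally likely. The sketch below is a minimal illustration; the population of member IDs 1 to 100 and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Select n distinct members so that every size-n subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# ASSUMPTION: hypothetical population of member IDs 1..100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)

print(sample)
```

The `seed` argument is there only to make the draw reproducible; leaving it as `None` gives a different sample on each run.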
Systematic Sampling
Select some starting point and then select every kth element in the population.
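Systematic sampling maps naturally onto list slicing. In this sketch the starting point is chosen at random among the first k elements, which is an assumption; the slide only says to select some starting point. The population of IDs is hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then take every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

# ASSUMPTION: hypothetical ordered population of member IDs 1..100.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=7)

print(sample)
```

With 100 members and k = 10, the sample always contains 10 members spaced exactly 10 positions apart.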
Convenience Sampling
Use results that are easy to get.
Stratified Sampling
Subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics, then draw a sample from each subgroup.
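Stratified sampling can be sketched as "group first, then sample within each group". The population, the age-bracket labels, and the choice of an equal-sized simple random sample per stratum are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, per_stratum, seed=None):
    """Group members into strata, then draw a random sample from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)
    return {name: rng.sample(group, per_stratum) for name, group in strata.items()}

# ASSUMPTION: hypothetical population of (id, age_bracket) pairs.
members = [(i, "under 40" if i % 2 else "40 and over") for i in range(1, 101)]
sample = stratified_sample(members, stratum_of=lambda m: m[1], per_stratum=5, seed=3)

for name, chosen in sample.items():
    print(name, [member_id for member_id, _ in chosen])
```

Every stratum is guaranteed to be represented, which a simple random sample of the whole population does not promise.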
Cluster Sampling
Divide the population area into sections (or clusters). Then randomly select some of those clusters. Now choose all members from selected clusters.
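Cluster sampling reverses the logic of stratified sampling: whole clusters are chosen at random and everyone inside a chosen cluster is kept. The precinct and voter names below are hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters, then keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {name: list(clusters[name]) for name in chosen}

# ASSUMPTION: hypothetical population of 10 precincts with 20 voters each.
clusters = {
    f"precinct_{i}": [f"voter_{i}_{j}" for j in range(20)]
    for i in range(10)
}
sample = cluster_sample(clusters, num_clusters=3, seed=5)

print(sorted(sample))
```

The contrast with stratified sampling is that here only some groups appear in the sample, but each selected group appears in full.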
Multistage Sampling
Collect data by using some combination of the basic sampling methods.
In a multistage sample design, pollsters select a sample in different stages, and each stage might use different methods of sampling.
Methods of Sampling - Summary
Random
Systematic
Convenience
Stratified
Cluster
Multistage
Beyond the Basics of Collecting Data
Different types of observational studies and experimental designs.
Types of Studies
Cross-sectional study
Data are observed, measured, and collected at one point in time.
Retrospective (or case control) study
Data are collected from the past by going back in time (examine records, interviews, and so on ...).
Prospective (or longitudinal or cohort) study
Data are collected in the future from groups sharing common factors (called cohorts).
Design of Experiments
Randomization is used when subjects are assigned to different groups through a process of random selection. The logic is to use chance as a way to create two groups that are similar.
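Random assignment can be sketched by shuffling the subjects and splitting the shuffled list in half. The subject labels and the equal split into a treatment group and a control group are assumptions for illustration.

```python
import random

def randomize_groups(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# ASSUMPTION: hypothetical list of 20 subjects.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize_groups(subjects, seed=11)

print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who lands in which group, neither the experimenter nor the subjects can steer particular kinds of subjects toward the treatment.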
Design of Experiments
Replication is the repetition of an experiment on more than one subject.
Samples should be large enough so that the erratic behavior that is characteristic of very small samples will not disguise the true effects of different treatments.
Replication is used effectively when there are enough subjects to recognize the differences from different treatments.
Design of Experiments
Replication
Use a sample size that is large enough to let us see the true nature of any effects, and obtain the sample using an appropriate method, such as one based on randomness.
Design of Experiments
Blinding is a technique in which the subject doesn’t know whether he or she is receiving a treatment or a placebo.
Blinding allows us to determine whether the treatment effect is significantly different from a placebo effect, which occurs when an untreated subject reports improvement in symptoms.
Design of Experiments
Double-Blind - Blinding occurs at two levels:
(1) The subject doesn’t know whether he or she is receiving the treatment or a placebo.
(2) The experimenter does not know whether he or she is administering the treatment or placebo.
Design of Experiments
Confounding occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors.
Try to plan the experiment so that confounding does not occur.
Summary
Three very important considerations in the design of experiments are the following:
1. Use randomization to assign subjects to different groups.
2. Use replication by repeating the experiment on enough subjects so that effects of treatment or other factors can be clearly seen.
3. Control the effects of variables by using such techniques as blinding and a completely randomized experimental design.
Errors
No matter how well you plan and execute the sample collection process, there is likely to be some error in the results.
Sampling error - the difference between a sample result and the true population result; such an error results from chance sample fluctuations.
Nonsampling error - sample data incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).
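Sampling error can be seen directly by comparing sample means with the mean of the full population. This is a sketch under assumptions: the population is synthetic (10,000 values drawn from a normal distribution with mean 100 and standard deviation 15), and the sample size of 50 is arbitrary.

```python
import random
import statistics

rng = random.Random(2)

# ASSUMPTION: hypothetical population of 10,000 measurements.
population = [rng.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(n):
    """Difference between the mean of one random sample of size n and the population mean."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) - population_mean

# Five correctly drawn samples still miss the population mean by chance.
errors = [sampling_error(50) for _ in range(5)]

for error in errors:
    print(f"sampling error: {error:+.2f}")
```

Each sample here is collected properly, so the differences printed are due to chance fluctuation alone and involve no nonsampling error.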
Errors
Nonrandom sampling error - result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample.