prob & stats slides on unit

Upload: unsrat

Post on 24-Nov-2023

Bell Work 1/16

1) In your own words, tell me what you think statistics is about. Give examples of how you think statistics is applied or how it is relevant to your own life.

Objective(s):

1. Identify the Who, What, When, Where, Why and How of data, or recognize when some of this information has not been provided.

2. Identify the cases and variables in any data set.

3. Identify the population from which a sample was chosen.

Unit 1: Introduction to Statistics

What is Statistics?

Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Examples of Statistical Applications

Search engines and websites that you frequent collect and store data on your internet usage to tailor your online experience

Data is collected on each of you to chart your progress throughout your high school experience

Data

Data - Systematically recorded information, whether numbers or labels, together with its context

Who: Whom the data is being collected on, or whom we are recording characteristics of

What: What characteristics are being recorded about each individual case (variables)

Why: Why are we gathering this data? What is its intended use

Where: Where was the study conducted

When: When was the information collected

How: How was the information gathered, or how the data are collected

Data is meaningless without context

Data Tables

Data table – an arrangement of data in which each row represents a case and each column represents a variable

Data collected by Amazon

Important Terms

• Population: The collection of all responses, measurements, or counts that are of interest.

• Sample: A portion or subset of the population.

Which of the following Venn diagrams shows the relationship between population data and sample data?

[Figure: four Venn diagrams, labeled a) through d), each relating the population (P) and the sample (S)]

Ian Walker, a psychologist at the University of Bath, wondered whether drivers treat bicycle riders differently when they wear helmets. He rigged his bicycle with an ultrasonic sensor that could measure how close each car was that passed him. He then rode on alternating days with and without a helmet. Out of 2500 cars passing him, he found that when he wore his helmet, motorists passed 3.35 inches closer to him, on average, than when his head was bare.

Who:

What:

Why:

When:

Where:

How:

Contextualizing Data: Identifying the 5 W’s and the How

Population of Interest:

Ian Walker, a psychologist at the University of Bath, wondered whether drivers treat bicycle riders differently when they wear helmets. He rigged his bicycle with an ultrasonic sensor that could measure how close each car was that passed him. He then rode on alternating days with and without a helmet. Out of 2500 cars passing him, he found that when he wore his helmet, motorists passed 3.35 inches closer to him, on average, than when his head was bare.

Who: 2500 motorists

What: the distance between the cars and the bicycle riders

Why: to determine if wearing a helmet influenced how drivers treated bicycle riders

When: Not specified

Where: Not specified

How: He rigged his bicycle with an ultrasonic sensor that could measure how close each car was that passed him

Contextualizing Data: Identifying the 5 W’s and the How

Population of Interest: all motorists passing cyclists

Some companies offer 401(k) retirement plans to employees, permitting them to shift part of their before-tax salaries into investments such as mutual funds. Employers typically match 50% of the employees’ contributions up to about 6% of salary. One company, concerned with what it believed was a low employee participation rate in its 401(k) plan, sampled 30 other companies with similar plans and asked for their 401(k) participation rates.

Who:

What:

Why:

When:

Where:

How:

Contextualizing Data: Identifying the 5 W’s and the How

Population of Interest:

Some companies offer 401(k) retirement plans to employees, permitting them to shift part of their before-tax salaries into investments such as mutual funds. Employers typically match 50% of the employees’ contributions up to about 6% of salary. One company, concerned with what it believed was a low employee participation rate in its 401(k) plan, sampled 30 other companies with similar plans and asked for their 401(k) participation rates.

Who: 30 similar companies

What: 401(k) participation rates

Why: concern over low employee participation rates in its 401(k) plan

When: Not specified

Where: Not specified

How: by sampling 30 similar companies

Contextualizing Data: Identifying the 5 W’s and the How

Population of Interest: All similar companies

Contextualizing Data: Identifying the 5 W’s and the How

Because of the difficulty of weighing a bear in the woods, researchers caught and measured 54 bears, recording their weight, neck size, length, and sex. They hoped to find a way to estimate weight from the other, more easily determined quantities.

Who:

What:

Why:

When:

Where:

How:

Contextualizing Data: Identifying the 5 W’s and the How

Because of the difficulty of weighing a bear in the woods, researchers caught and measured 54 bears, recording their weight, neck size, length, and sex. They hoped to find a way to estimate weight from the other, more easily determined quantities.

Who: 54 bears

What: weight, neck size, length, and sex

Why: to find an easier way of estimating weight

When: Not specified

Where: Not specified

How: Researchers collected data from 54 bears they were able to catch

One of the reasons that the Monitoring the Future (MTF) project was started was “to study changes in the beliefs, attitudes, and behavior of young people in the United States.” Data are collected from 8th, 10th, and 12th graders each year. To get a representative nationwide sample, surveys are given to a randomly selected group of students. In Spring 2004, students were asked about alcohol, illegal drug, and cigarette use. Describe the W’s, if the information is given. If the information is not given, state that it is not specified.

Who:

What:

When:

Where:

How:

Why:

In June 2003 Consumer Reports published an article on some sport-utility vehicles they had tested recently. They reported some basic information about each of the vehicles and the results of some tests conducted by their staff. Among other things, the article told the brand of each vehicle, its price, and whether it had a standard or automatic transmission. They reported the vehicle’s fuel economy, its acceleration (number of seconds to go from zero to 60 mph), and its braking distance to stop from 60 mph. The article also rated each vehicle’s reliability as much better than average, better than average, average, worse, or much worse than average. Describe the W’s and the How, if the information is given:

Who:

What:

When:

Where:

How:

Why:

A listing posted by Arby’s restaurant chain gives, for each of the sandwiches it sells, the type of meat in the sandwich, the number of calories, and the serving size in ounces. The data might be used to assess the nutritional value of the different sandwiches. Describe the W’s and the How, if the information is given:

Who:

What:

When:

Where:

How:

Why:

Bell Work 1/18

1) For the description of data below, identify the Who and the What that were investigated, and the population of interest.

Some motion pictures are profitable and others are not. Understandably, the movie industry would like to know what makes movies successful. Data from 120 first-run movies released in 2005 suggest that longer movies actually make less profit.

2) From the Venn diagram, identify the population and the sample.

Objectives

• Classify a variable as categorical (qualitative) or quantitative.

• Identify whether variables are quantitative or categorical based on the context of the data

• For any quantitative variable, identify the units in which the variable has been measured.

Some variables have units that tell how each value has been measured and tell the scale of the measurement.

Categorical and Quantitative Variables

Variable - an attribute or characteristic of an individual or object whose value varies from case to case

Categorical variables consist of attributes, labels, or non-numerical entries.

Examples:

Quantitative variables consist of numerical measurements or counts. (Quantitative variables always have units.)

Examples:

The suggested retail prices of several Ford vehicles are shown in the table. Which data are qualitative data and which are quantitative data? Explain your reasoning.

Categorical and Quantitative Variables

The populations of several U.S. cities are shown in the table. Which data are qualitative data and which are quantitative data?

Determine whether the data are qualitative or quantitative. Explain your reasoning.

a) telephone numbers in a directory
b) body temperatures of patients
c) lengths of songs on MP3 player
d) Student ID numbers
e) heights of hot air balloons
f) eye colors of models
g) carrying capacities of pickups
h) age

Categorical and Quantitative Variables

A February 2007 Gallup Poll question asked, “In politics, as of today, do you consider yourself a Republican, a Democrat, or an Independent?” The possible responses were “Democrat”, “Republican”, “Independent”, “Other”, and “No Response”. What kind of variable is the response?

A pharmaceutical company conducts an experiment in which a subject takes 100 mg of a substance orally. The researchers measure how many minutes it takes for half of the substance to exit the bloodstream. What kind of variable is the company studying?

Age and party. The Gallup Poll conducted a representative telephone survey of 1180 American voters during the first quarter of 2007. Among the reported results were the voter’s region (Northeast, South, etc.), age, party affiliation, and whether or not the person had voted in the 2006 midterm congressional election.

For each description of data, identify the W’s, name the variables, specify for each variable whether its use indicates that it should be treated as categorical or quantitative, and, for any quantitative variable, identify the units in which it was measured (or note that they were not provided).

Categorical and Quantitative Variables

Who- How-

What-

Where-

Why-

When-

Variables:

Schools. The State Education Department requires local school districts to keep these records on all students: age, race or ethnicity, days absent, current grade level, standardized test scores in reading and mathematics, and any disabilities or special educational needs.

For each description of data, identify the W’s, name the variables, specify for each variable whether its use indicates that it should be treated as categorical or quantitative, and, for any quantitative variable, identify the units in which it was measured (or note that they were not provided).

Categorical and Quantitative Variables

Who- How-

What-

Where-

Why-

When-

Variables:

The Kentucky Derby is a horse race that has been run every year since 1875 at Churchill Downs, Louisville, Kentucky. The race started as a 1.5-mile race, but in 1896, it was shortened to 1.25 miles because experts felt that 3-year-old horses shouldn’t run such a long race that early in the season. (It has been run in May every year but one—1901—when it took place on April 29). Here are the data for the first four and several recent races.

Categorical and Quantitative Variables

The Gallup Poll conducted a representative telephone survey of 1180 American voters during the first quarter of 2007. Among the reported results were the voter’s region (Northeast, South, etc.), age, party affiliation, and whether or not the person had voted in the 2006 midterm congressional election.

Who- How-

What-

Where-

Why-

When-

Variables:

Identify the variables as quantitative or categorical. Explain your reasoning.

a) player numbers for a soccer team
b) student ID numbers
c) wait times at a grocery store
d) species of trees in a forest
e) weights of infants at a hospital
f) responses on an opinion poll

In November 2003 Discover published an article on the colonies of ants. They reported some basic information about many species of ants and the results of some discoveries found by myrmecologist Walter Tschinkel of the University of Florida. Information included the scientific name of the ant species, the geographic location, the depth of the nest (in feet), the number of chambers in the nest, and the number of ants in the colony. The article documented how new ant colonies begin, the ant-nest design, and how nests differ in shape, number, size of chambers, and how they are connected, depending on the species. It reported that nest designs include vertical, horizontal, or inclined tunnels for movement and transport of food and ants.

Categorical and Quantitative Variables

1. Describe the W’s, if the information is given:

• Who: Colonies of ants. “Many species of ants,” but no indication of exactly how many.
• What: scientific name, geographic location, average nest depth, average number of chambers, average colony size, how new ant colonies begin, the ant-nest design, and how nests differ in architecture.
• When: November 2003
• Where: not specified
• How: The results of some discoveries found by myrmecologist Walter Tschinkel of the University of Florida
• Why: Information of interest to readers of the magazine

The 2.5-mile Indianapolis Motor Speedway has been the home to a race on Memorial Day nearly every year since 1911. Here are the data for the first five races and five recent Indianapolis 500 races. Included also are the pole winners (the winners of the trial races, when each driver drives alone to determine the position on race day). Identify the W’s, name the variables, specify for each variable whether its use indicates that it should be treated as categorical or quantitative, and, for any quantitative variable, identify the units in which it was measured.

What can go wrong?

• Don’t label a variable as categorical or quantitative without thinking about the question you want to answer. The same variable can sometimes take on different roles.
• Just because a variable’s values are numbers, don’t assume that it’s quantitative.
• Always be skeptical.

What have we learned?

• Data are information in a context.
– The W’s help with context.
– We must know the Who (cases), What (variables), and Why to be able to say anything useful about the data.

We treat variables as categorical or quantitative.
• Categorical variables identify a category for each case.
• Quantitative variables record measurements or amounts of something and must have units.
• Some variables can be treated as categorical or quantitative depending on what we want to learn from them.
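The summary above can be sketched as a tiny data table in Python. This is only an illustration: the variable names echo the school records example, but every value and the `roles` mapping are made-up assumptions, not data from the slides. Each dictionary is one case (a row), each key is one variable (a column), and the role of a variable is assigned by the analyst rather than inferred from whether its values are numbers.

```python
from statistics import mean

# Each dict is one case (row); each key is one variable (column).
# All values below are hypothetical, chosen only for illustration.
students = [
    {"student_id": 1001, "age": 16, "days_absent": 3},
    {"student_id": 1002, "age": 17, "days_absent": 0},
    {"student_id": 1003, "age": 15, "days_absent": 6},
]

# The analyst assigns the role of each variable; quantitative ones carry units.
roles = {
    "student_id": ("categorical", None),      # a numeric label: averaging it is meaningless
    "age": ("quantitative", "years"),
    "days_absent": ("quantitative", "days"),
}

# Summaries such as a mean only make sense for the quantitative variables.
quantitative = [name for name, (role, _) in roles.items() if role == "quantitative"]
for name in quantitative:
    values = [case[name] for case in students]
    units = roles[name][1]
    print(f"mean {name}: {mean(values)} {units}")
```

Note that `student_id` is stored as a number yet is treated as categorical, which is the pitfall the "What can go wrong?" slide warns about.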

Bell Work 1/22

1) From the diagram, identify the population and the sample

2) From the description, identify the variables, and for each variable tell whether it should be treated as categorical or quantitative.

Business analysts hoping to provide information helpful to American grape growers compiled these data about vineyards: size (acres), number of years in existence, state, varieties of grapes grown, average case price, gross sales, and percent profit.

Designing a Statistical Study

1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is representative of the population.
3. Collect the data.
4. Describe the data, using descriptive statistics techniques.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

Identifying Populations and Samples

Determine whether each data set is a population or a sample.

a) The height of each player on a school’s basketball team

b) The amount of energy collected from every wind turbine on a wind farm

c) A survey of 500 spectators from a stadium with 42,000 spectators

d) The annual salary of each pharmacist at a pharmacy

e) The cholesterol levels of 20 patients in a hospital with 100 patients

Identifying Populations and Samples

Identify the Sample and the Population of Interest

a) A survey of 1000 U.S. adults found that 59% think buying a home is the best investment a family can make. (Source: Rasmussen Reports)

b) A study of 33,043 infants in Italy was conducted to find a link between a heart rhythm abnormality and sudden infant death syndrome. (Source: New England Journal of Medicine)

c) A survey of 1442 U.S. adults found that 36% received an influenza vaccine for the current flu season. (Source: Zogby International)

d) A survey of 1600 people found that 76% plan on using the Microsoft Windows 7™ operating system at their businesses. (Source: Information Technology Intelligence Corporation and Sunbelt Software)

e) A survey of 800 registered voters found that 50% think economic stimulus is the most important issue to consider when voting for Congress.

Methods of Collecting Data: Sample Surveys

Objectives:

Identify population parameters and sample statistics

Identify sampling techniques as simple random, stratified, cluster, systematic, or convenience.

Identify the sampling frame, sample and any potential biases.

Sample Surveys

We would like to gather information on an entire population of individuals.

Sample survey - a study designed to ask questions of a small group of people in the hope of learning something about the entire population

We examine a smaller group of individuals – a sample

We could perform a census.

Population Parameters and Sample Statistics

Definition

Parameter - a numerical description of a population characteristic

Statistic - a numerical description of a sample characteristic
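To make the distinction concrete, here is a minimal Python sketch. It assumes a made-up population of 200 test scores (the scores, the sample size of 20, and the seed are all illustrative choices, not part of the slides). The mean of every score is a parameter; the mean of a random sample drawn from those scores is a statistic that estimates it.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population: one test score for each of 200 students.
population = [rng.randint(50, 100) for _ in range(200)]

# Parameter: a numerical description computed from the whole population.
parameter = mean(population)

# Statistic: the same description computed from a sample of that population.
sample = rng.sample(population, 20)
statistic = mean(sample)

print(f"population mean (parameter): {parameter:.1f}")
print(f"sample mean (statistic):     {statistic:.1f}")
```

The two values will usually be close but not equal, and a different sample would give a different statistic while the parameter stays fixed.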

Distinguishing between Statistics and Parameters: Examples

1. A recent survey of 200 college career centers reported that the average starting salary for petroleum engineering majors is $83,121. (Source: National Association of Colleges and Employers)

2. The 2182 students who accepted admission offers to Northwestern University in 2009 have an average SAT score of 1442. (Source: Northwestern University)

3. In a random check of a sample of retail stores, the Food and Drug Administration found that 34% of the stores were not storing fish at the proper temperature.

Population Parameters and Sample Statistics

Determine whether the numerical value is a parameter or a statistic.

In 2009, Major League Baseball teams spent a total of $2,655,395,194 on players’ salaries.

Sixty-two of the 97 passengers aboard the Hindenburg airship survived its explosion.

In January 2010, 52% of the governors of the 50 states in the United States were Democrats.

In a survey of 300 computer users, 8% said their computers had malfunctions that needed to be repaired by service technicians.

In a recent year, the interest category for 12% of all new magazines was sports.

Bias

Sampling methods that, by their nature, tend to over- or underemphasize some characteristic of the population are said to be biased.

Biased sampling methods tend to over- or underestimate parameters.

Random Sample: Each member of the population has an equal chance of being selected.

Simple Random Sample: All samples of the same size are equally likely.

Simple Random Sample (SRS)

Sampling frame: A list of individuals from whom the sample is drawn

Assign a number to each member of the population. Random numbers can be generated by a random number table, software program, or a calculator. Data from members of the population that correspond to these numbers become members of the sample.

Books: Appendix G pg. A-101

Bell Work 1/22

1) Tell whether the value given describes a parameter or a statistic

a) The 2009 team payroll of the Philadelphia Phillies was $113,004,046. (Source: USA Today)

b) In a survey of 752 adults in the United States, 42% think there should be a law that prohibits people from talking on cell phones in public places. (Source: University of Michigan)

2) Using the sequence of random numbers below and reading from left to right, generate a set of 4 numbers between 1 and 60.

71622 35940 81807 59225 18192

71|62|23|59|40|81|80|75|92|25 18192

{23, 59, 40, 25}
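The table-reading procedure used here can be sketched in Python. The function name `pick_from_digits` and its parameters are illustrative assumptions, not something from the slides; it reads the digit row in fixed-width chunks from left to right, skips values outside the allowed range, skips repeats, and stops once enough numbers have been chosen.

```python
def pick_from_digits(digits, low, high, count, width=2):
    """Read fixed-width numbers left to right; keep distinct values in [low, high]."""
    stream = "".join(ch for ch in digits if ch.isdigit())  # ignore the spaces between groups
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        value = int(stream[i:i + width])
        # Discard out-of-range values and repeats, as when reading a table by hand.
        if low <= value <= high and value not in chosen:
            chosen.append(value)
            if len(chosen) == count:
                break
    return chosen

row = "71622 35940 81807 59225 18192"
print(pick_from_digits(row, low=1, high=60, count=4))  # [23, 59, 40, 25]
```

The pairs read are 71, 62, 23, 59, 40, 81, 80, 75, 92, 25; only 23, 59, 40, and 25 fall between 1 and 60, which matches the set shown above.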

Sammy’s Salsa, a small local company, produces 240 jars of salsa a day. Each jar is imprinted with a code indicating the date and batch number. To help maintain consistency, at the end of each day, Sammy selects three jars of salsa, weighs the contents, and tastes the product. Help Sammy select the sample jars. Today’s jars are coded 07N61 through 07N300.

a) Describe how you might set this up using a simple random sample.

b) Show how to use random numbers to pick 3 jars.

20639 28642 06962 08710 84395

Simple Random Sample (SRS)

Stratified Sampling

Stratified Random Sampling- a sampling design where the population is divided into subpopulations, or strata, and random samples are then drawn from each stratum.

Strata are homogeneous groups (groups sharing some common characteristic)

Examples of strata:

Cluster Sampling

Cluster sample – A sampling design in which entire groups or clusters are chosen at random.

Each cluster should be representative of the population, so all clusters should be heterogeneous

Example:

Systematic Sample

Choose a starting value at random. Then choose sample members at regular intervals.

When there is no relationship between the order of the sampling frame and the variables of interest, a systematic sample can be representative

Bad or Biased Sampling Methods: Voluntary Response Sample and Convenience Sample

Voluntary Response Sample – a large group of individuals is invited to respond and all who respond are counted

Convenience Sample - readily available members of the population are chosen for the sample

Identify the Sampling Method Used

You divide the student population with respect to majors and randomly select and question some students in each major.

You assign each student a number and generate random numbers. You then question each student whose number is randomly selected.

Using random digit dialing, researchers call 1400 people and ask what obstacles (such as childcare) keep them from exercising.

Questioning students as they leave a university library, a researcher asks 358 students about their drinking habits.

After a hurricane, a disaster area is divided into 200 equal grids. Thirty of the grids are selected, and every occupied household in the grid is interviewed to help focus relief efforts on what residents require the most.

Every tenth person entering a mall is asked to name his or her favorite store.

2) The Web site www.gamefaqs.com asked, as their question of the day to which visitors to the site were invited to respond, “Do you ever use emoticons when you type online?” Of the 87,262 respondents, 27% said that they did not use emoticons.

a) What kind of sample was this?

b) How much confidence would you place in using 27% as an estimate of the fraction of people who use emoticons?

Bell Work 1/24

1) From the sequence of random numbers, select 3 distinct numbers (no repeats) between 1 and 40, reading from left to right

56282 69928 14125 38872

Simple Random Sample (SRS)

A small sampling frame of administrators and teachers who work at CCHS is given in the box. Using the sequence of random numbers given, perform an SRS (Simple Random Sample) to select a sample of size 3.

1 Mr. Warren
2 Dr. Crosby
3 Mr. Locklair
4 Mrs. Shipp
5 Mr. Mambou
6 Dr. Dixon
7 Mr. Allen
8 Mrs. Stroble
9 Mr. Thomas
10 Mr. Bordieanu

83010 97601 89105 98803

Sample: Mr. Warren, Mr. Thomas, Mr. Bordieanu

Stratified Random Samples

Divide the population into groups (strata) and select a random sample from each group. Strata could be age groups, genders or levels of education, for example.

Sample

Stratified Sampling

Working with the same sampling frame, let’s say that 20% of our school’s employees are administrators and 80% are teachers. Perform a Stratified Random Sample on Administrators and Teachers, with a sample size of 5.

Administrators
1 Mr. Warren
2 Dr. Crosby
3 Mr. Locklair
4 Dr. Dixon

Teachers
1 Mr. Allen
2 Mrs. Stroble
3 Mrs. Shipp
4 Mr. Thomas
5 Mr. Mambou
6 Mr. Bordieanu

Sequence of Random Numbers: 96299 07196

Sequence of Random Numbers: 98642 20639 23185

Our sample: Dr. Crosby, Mr. Bordieanu, Mr. Thomas, Mrs. Stroble, Mrs. Shipp
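A minimal Python sketch of how this sample can be reproduced. Two things are assumptions rather than statements from the slide: that the first sequence of random numbers is read for the administrators and the second for the teachers (inferred from the posted answer), and that digits are read one at a time from left to right. The helper name `read_single_digits` is hypothetical. The 20%/80% split of a sample of 5 gives 1 administrator and 4 teachers.

```python
def read_single_digits(digits, low, high, count):
    """Read one digit at a time, left to right; keep distinct values in [low, high]."""
    chosen = []
    for ch in digits:
        if not ch.isdigit():
            continue  # skip the spaces between groups of digits
        value = int(ch)
        if low <= value <= high and value not in chosen:
            chosen.append(value)
            if len(chosen) == count:
                break
    return chosen

# stratum -> (numbered members, share of the sample, random digits assumed for that stratum)
strata = {
    "Administrators": (["Mr. Warren", "Dr. Crosby", "Mr. Locklair", "Dr. Dixon"],
                       0.20, "96299 07196"),
    "Teachers": (["Mr. Allen", "Mrs. Stroble", "Mrs. Shipp", "Mr. Thomas",
                  "Mr. Mambou", "Mr. Bordieanu"],
                 0.80, "98642 20639 23185"),
}

sample_size = 5
sample = []
for stratum, (members, share, digits) in strata.items():
    n = round(share * sample_size)  # proportional allocation: 1 administrator, 4 teachers
    picks = read_single_digits(digits, 1, len(members), n)
    sample.extend(members[p - 1] for p in picks)  # list positions start at 1 on the slide

print(sample)
```

Under these assumptions the administrators' digits give 2 (Dr. Crosby) and the teachers' digits give 6, 4, 2, 3 (Mr. Bordieanu, Mr. Thomas, Mrs. Stroble, Mrs. Shipp), which agrees with the sample listed above.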

Cluster Samples

Divide the population into individual units or groups and randomly select one or more units. The sample consists of all members from selected unit(s).
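As a rough illustration of this definition, the Python sketch below picks whole groups at random and keeps every member of each chosen group. The homerooms, the names, and the function `cluster_sample` are all hypothetical, introduced only for the example.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical population: students grouped by homeroom (each homeroom is one cluster).
clusters = {
    "Room 101": ["Ana", "Ben", "Cal"],
    "Room 102": ["Dee", "Eli", "Fay"],
    "Room 103": ["Gus", "Hal", "Ivy"],
    "Room 104": ["Jon", "Kim", "Lee"],
}

rng = random.Random(1)  # fixed seed so the run is repeatable
chosen, members = cluster_sample(clusters, 2, rng)
print("clusters selected:", chosen)
print("sample:", members)
```

The randomness is applied to the groups, not to the individuals: nobody inside a selected cluster is left out, and nobody outside one is included.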

Cluster Sample:

Systematic Samples

Choose a starting value at random. Then choose sample members at regular intervals.

We say we choose every kth member. In this example, k = 5. Every 5th member of the population is selected.
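A short Python sketch of the every-kth-member rule with k = 5. The frame of 100 numbered members and the function name `systematic_sample` are assumptions made for illustration.

```python
import random

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k members, then take every k-th member."""
    start = rng.randrange(k)  # index 0 .. k-1, i.e. one of the first k members
    return frame[start::k]

# Hypothetical sampling frame: members numbered 1 through 100.
frame = list(range(1, 101))

rng = random.Random(2)  # fixed seed so the run is repeatable
sample = systematic_sample(frame, 5, rng)
print(sample)
```

Only the starting point is random; after that the spacing is fixed, which is why the order of the sampling frame must be unrelated to the variable of interest.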

Management at a retail store is concerned about the possibility of drug abuse by people who work there. They decide to check on the extent of the problem by having a random sample of the employees undergo a drug test. Several plans for choosing the sample are proposed. Name the sampling strategy in each.

a. Randomly select a store location and test all the people who work in that store: supervisors, full-time clerks, part-time clerks, and maintenance staff.

b. Choose the fourth person that arrives to work for each shift.

c. There are four employee classifications: supervisors, full-time clerks, part-time clerks, and maintenance staff. Randomly select ten people from each category.

d. Each employee has a three-digit identification number. Randomly choose 40 numbers.

Identifying Sampling Methods

Causes of Biases

Under-coverage – part of the population is given less representation than it has in the population

Non-Response Bias

Voluntary Response Bias

Sampling from a bad or incomplete Sampling Frame

Response Bias

Response Bias: Examples of Biased Questions

Determine whether the survey question is biased. If the question is biased, suggest a better wording.

Why does eating whole-grain foods improve your health?

Why does text messaging while driving increase the risk of a crash?

How much do you exercise during an average week?

Why do you think the media have a negative effect on teen girls’ dieting habits?

Do you think high school students should be required to wear uniforms?

Given humanity’s great tradition of exploration, do you favor continued funding for space flights?

Bell Work 1/24

Identify whether the numerical value is a statistic or a parameter.

a) The class average for a Probability and Statistics test was 81%.
b) In a survey of 880 students, 76% said they enjoyed having music played in class.
c) A sample conducted on U.S. voters found that President Obama had a 50% approval rating.
d) Voter turnout in the 2012 election was 57.5%.

1) From the sampling frame given in the box, perform an SRS (Simple Random Sample) of sample size 3, using the sequence of random numbers.

1 Franklin
2 Elizabeth
3 Enrique
4 Steven
5 William
6 Jordan
7 Dean
8 Tishia
9 Ray Kwan
10 Chardesia
11 James
12 Sandra
13 Craig
14 Debra

07196 08607 41081 34125 38872

Our Sample: Dean, Tishia, Craig

a) We want to know what percentage of local doctors accept Medicaid patients. We call the offices of 50 doctors randomly selected from local Yellow Page listings.

b) We want to know what percentage of local businesses anticipate hiring additional employees in the upcoming month. We randomly select a page in the Yellow Pages and call every business listed there.

c) We want to know if there is neighborhood support to turn a vacant lot into a playground. We spend a Saturday afternoon going door-to-door in the neighborhood, asking people to sign a petition.

d) We want to know if students at our college are satisfied with the selection of food available on campus. We go to the largest cafeteria and interview every 10th person in line.

Identifying Sampling Methods

Identifying Parameters and Statistics

Occasionally, when I fill my car with gas, I figure out how many miles per gallon my car got. I wrote down those results after 6 fill-ups in the past few months. Overall, it appears my car gets 28.8 miles per gallon.
a) What statistic have I calculated?
b) What is the parameter I’m trying to estimate?
c) When the Environmental Protection Agency (EPA) checks a car like mine to predict its fuel economy, what parameter is it trying to estimate?

During the course of your Statistics class, you are given 6 equally weighted tests. You have taken three tests so far, and calculated your average score to be 89%.
a) What statistic have you calculated?
b) What is the parameter you are trying to estimate?
c) If you ask five of your friends in class how they did on the test, what parameter are you trying to estimate?

For the following reports on statistical studies, identify the following:

a) The population
b) The population parameter of interest
c) The sampling frame
d) The sampling method, including whether or not randomization was employed
e) Potential sources of bias

a) Consumers Union asked all subscribers whether they had used alternative medical treatments and, if so, whether they had benefited from them. For almost all of the treatments, approximately 20% of those responding reported cures or substantial improvement in their condition.

b) Researchers waited outside a bar they had randomly selected from a list of such establishments. They stopped every 10th person who came out of the bar and asked whether he or she thought drinking and driving was a serious problem.

a)
Population – all U.S. adults
Parameter – proportion who have used and benefited from alternative medical treatments
Sampling Frame – all Consumers Union subscribers
Sample – those subscribers who responded
Method – not specified, but probably a questionnaire mailed to all subscribers
Bias – nonresponse bias, specifically voluntary response bias. Those who respond may have strong feelings one way or another.

b)
Population – adults
Parameter – proportion who think drinking and driving is a serious problem
Sampling Frame – bar patrons
Sample – every 10th person leaving the bar
Method – systematic sampling
Bias – undercoverage. Those interviewed had just left a bar, and may have opinions about drinking and driving that differ from the opinions of the population in general.

a) The populationb) The population parameter of interestc) The sampling framed) The sampling Method, including whether or not randomization was employede) Potential sources of bias

For the following reports on statistical studies, identify the following:

1) A question posted on the Lycos Web site on 18 June 2000 asked visitors to the site to say whether they thought that marijuana should be legally available for medicinal purposes. (www.lycos.com)

2) A company packaging snack foods maintains quality control by randomly selecting 10 cases from each day’s production and weighing the bags. Then they open one bagfrom each case and inspect the contents.

1)
Population – all U.S. adults
Parameter – proportion that feels marijuana should be legalized for medicinal purposes
Sampling Frame – none given; potentially all people with access to the web site
Sample – those visiting the web site who responded
Method – voluntary response (no randomization employed)
Bias – voluntary response sample. Those who visit the web site and respond may be predisposed to a particular answer.

2)
Population – snack food bags
Parameter – proportion passing inspection
Sampling Frame – all bags produced each day
Sample – 10 bags, one from each of 10 randomly selected cases
Method – multistage sampling. Presumably, they take a simple random sample of 10 cases, followed by a simple random sample of one bag from each case.
Bias – no indication of bias
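The multistage design described for the snack food company can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the production data (200 cases of 24 bags each), the labels, and the function name `multistage_sample` are all hypothetical and do not come from the study.

```python
import random

def multistage_sample(cases, n_cases=10, seed=None):
    """Stage 1: simple random sample of cases.
    Stage 2: one randomly chosen bag from each selected case."""
    rng = random.Random(seed)
    chosen_cases = rng.sample(sorted(cases), n_cases)
    return [(case, rng.choice(cases[case])) for case in chosen_cases]

# Hypothetical day's production: 200 cases, 24 bags per case.
production = {
    f"case-{c:03d}": [f"bag-{c:03d}-{b:02d}" for b in range(24)]
    for c in range(200)
}

sample = multistage_sample(production, n_cases=10, seed=1)
for case, bag in sample:
    print(case, bag)
```

Because both stages use random selection, every bag has a chance of being inspected, which is why the answer reports no indication of bias.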

Identifying Sampling Method and Potential Sources of Bias

In a large city school system with 20 elementary schools, the school board is considering the adoption of a new policy that would require elementary students to pass a test in order to be promoted to the next grade. The PTA wants to find out whether parents agree with this plan. Listed below are some of the ideas proposed for gathering data. For each, indicate what kind of sampling strategy is involved and what (if any) biases might result.

a) Put a big ad in the newspaper asking people to log their opinions on the PTA Web site.

b) Randomly select one of the elementary schools and contact every parent by phone.

c) Send a survey home with every student, and ask parents to fill it out and return it the next day.

d) Randomly select 20 parents from each elementary school. Send them a survey, and follow up with a phone call if they do not return the survey within a week.

a) This is a voluntary response sample. Only those who see the ad, feel strongly about the issue, and have web access will respond.

b) This is cluster sampling, but probably not a good idea. The opinions of parents in one school may not be typical of the opinions of all parents.

c) This is an attempt at a census, and will probably suffer from nonresponse bias.

d) This is stratified sampling. If the follow-up is carried out carefully, the sample should be unbiased.
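The difference between the cluster sample in part b and the stratified sample in part d can be made concrete with a short Python sketch. The roster here is hypothetical (20 schools with 150 parents each), and the function names are illustrative assumptions, not part of the exercise.

```python
import random

def cluster_sample(parents_by_school, seed=None):
    """Part b: pick one school at random and take every parent in it."""
    rng = random.Random(seed)
    school = rng.choice(sorted(parents_by_school))
    return school, list(parents_by_school[school])

def stratified_sample(parents_by_school, per_school=20, seed=None):
    """Part d: take a simple random sample of parents within every school."""
    rng = random.Random(seed)
    return {
        school: rng.sample(parents, per_school)
        for school, parents in parents_by_school.items()
    }

# Hypothetical roster: 20 schools, 150 parents per school.
roster = {
    f"school-{s:02d}": [f"parent-{s:02d}-{p:03d}" for p in range(150)]
    for s in range(1, 21)
}

school, cluster = cluster_sample(roster, seed=0)
strata = stratified_sample(roster, per_school=20, seed=0)
print(school, len(cluster))
print(len(strata), sum(len(v) for v in strata.values()))
```

The cluster sample hears from a single school only, while the stratified sample guarantees that all 20 schools are represented.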

Identifying Sampling Method and Potential Sources of Bias

Four new sampling strategies have been proposed to help the PTA determine whether parents favor requiring elementary students to pass a test in order to be promoted to the next grade. For each, indicate what kind of sampling strategy is involved and what (if any) biases might result.

a) Run a poll on the local TV news, asking people to dial one of two phone numbers to indicate whether they favor or oppose the plan.

b) Hold a PTA meeting at each of the 20 elementary schools, and tally the opinions expressed by those who attend the meetings.

c) Randomly select one class at each elementary school and contact each of those parents.

d) Go through the district’s enrollment records, selecting every 40th parent. PTA volunteers will go to those homes to interview the people chosen.

a) This sampling method suffers from voluntary response bias. Only those who see the show and feel strongly will call.

b) Although this method may result in a more representative sample than the method in part a, this is still a voluntary response sample. Only strongly motivated parents attend PTA meetings.

c) This is multistage sampling, stratified by elementary school and then clustered by class. This is a good design, as long as the parents in the selected classes respond. There should be follow-up to get the opinions of parents who do not respond.

d) This is systematic sampling. As long as a starting point is randomized, this method should produce reliable data.
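Systematic sampling with a randomized starting point, as in part d, might look like the following Python sketch. The enrollment list of 2,000 parents and the function name `systematic_sample` are hypothetical and used only for illustration.

```python
import random

def systematic_sample(records, step=40, seed=None):
    """Choose a random start among the first `step` records,
    then take every `step`-th record after it."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return records[start::step]

# Hypothetical enrollment records: 2,000 parents in list order.
enrollment = [f"parent-{i:04d}" for i in range(2000)]

chosen = systematic_sample(enrollment, step=40, seed=3)
print(len(chosen), chosen[:3])
```

Randomizing the start is what gives every parent on the list the same chance of selection. Without it, the same fixed positions would always be chosen.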