ug statistics help

FINAL EXAM PREP FOR UNDERGRADUATE STATISTICS © BN Heard Not to be posted on any websites, etc. Intended to be downloaded by students for their own personal use

Upload: brent-heard

Post on 16-Jul-2015


Page 1: Ug statistics help

FINAL EXAM PREP FOR UNDERGRADUATE

STATISTICS

© BN Heard

Not to be posted on any websites, etc.

Intended to be downloaded by students for their own personal use

Page 2: Ug statistics help

PREPARING FOR YOUR STATS FINAL

First of all, know your basic terms and definitions.

This is just an overall picture of the types of questions generally found on undergraduate statistics final exams.

There is no way to guess exactly what will be on your exam; I just try to hit a few of the common question types.

In some cases, I give only one example, but be prepared for other types of questions.

For example, I may give an example of a Poisson problem, but you should be ready to answer a Binomial or Geometric distribution question if your course covered those.

Also, with confidence intervals there are different types of situations.

Hopefully, what I have put together will help you remember what you learned in your class and do better on your Final Exam.

I reference Minitab in most of the problem examples because most of my classes use Minitab. However, I give the answers so that you can check your work using the technology of your choice.

Page 3: Ug statistics help

Levels of Measurement

The following come from the records of a football team's trainer. Identify the level of measurement of each (Nominal, Ordinal, Interval or Ratio).

a) Number of visits to the whirlpool since the season started

b) General feeling after practice chosen from a scale of (Too Tired, Normally Tired, Still Ready to Go)

c) Player's pre-practice body temperature.

d) Player's state of residence for insurance purposes.

Page 4: Ug statistics help

Answers to previous question

a) Ratio, because there is a true zero: 0 is the lowest number of visits you could have

b) Ordinal, because they can only be logically ranked

c) Body Temperature is actually Interval because temperature can go below zero (although we know that we could not survive)

d) Nominal, a state is just a label like a color

Page 5: Ug statistics help

Types of Sampling

What type of sampling? Stratified, Cluster, Convenience, Systematic, or Simple Random.

a) A university asks every 10th student in line what they are majoring in.

b) 4 girls from each of 30 Girl Scout Troops are surveyed

c) One high school choir out of the 12 choirs in the district is randomly chosen and all choir members are surveyed

d) The 17 members of the first grade class have all their names put in a hat and one is drawn out to see who gets to sit next to the teacher

e) I ask everyone sitting at my lunch table who will win the Super Bowl

Page 6: Ug statistics help

Answers to previous question

What type of sampling? Stratified, Cluster, Convenience, Systematic, or Simple Random.

a) Systematic

b) Stratified

c) Cluster

d) Simple Random

e) Convenience

Page 7: Ug statistics help

Descriptive Statistics

Find the range, mean, variance and standard deviation for the following set of sample data.

12 13 15 11 19 9 21 19 14

Give the:

Range

Mean (xbar)

Variance (s^2)

Standard Deviation (s)

Page 8: Ug statistics help

Answer to previous question

On this one, I simply used Minitab. Some students use the formulas, Excel or other software.

In Minitab, I input the data in Column C1, labeling it "Data" (the name doesn't matter).

Then I went to the Stat tab, chose Basic Statistics, then chose Descriptive Statistics.

I put my cursor in the Variables box, then double-clicked on the Data column on the left that I wanted to analyze.

I clicked the Statistics button and unchecked the things I didn't need and Made Sure Range, Mean, Variance and Standard Deviation were checked.

I clicked Ok (for both windows) and got the following results in the Session Window

Descriptive Statistics: Data

Variable   Mean  StDev  Variance  Range
Data      14.78   4.09     16.69  12.00
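If you want to check these numbers without Minitab, here is a minimal sketch, assuming Python 3 with only its standard library. The variable names are just illustrative. Note that statistics.variance and statistics.stdev use the sample formulas (dividing by n - 1), which is what this question asks for.

```python
import statistics

# Sample data from the question
data = [12, 13, 15, 11, 19, 9, 21, 19, 14]

data_range = max(data) - min(data)      # largest value minus smallest value
mean = statistics.mean(data)            # sample mean (xbar)
variance = statistics.variance(data)    # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(data)        # sample standard deviation s

print(f"Range    = {data_range}")
print(f"Mean     = {mean:.2f}")
print(f"Variance = {variance:.2f}")
print(f"StDev    = {std_dev:.2f}")
```

Rounded to two decimal places, these should agree with the Minitab session window values above.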

Page 9: Ug statistics help

Understanding the difference between populations and samples

In a poll, 1300 American first graders were asked whether they liked the smell of crayons or did not like the smell of crayons. Among the respondents, 79% said they liked the smell of crayons. Identify the population and the sample.

Page 10: Ug statistics help

Answer to the previous question

The population would be all American first graders.

The sample would be the 1300 who were asked the question

Also remember, populations give us parameters, samples give us statistics.

Page 11: Ug statistics help

Understanding the difference between Qualitative and Quantitative data

Determine whether the following variables are quantitative or qualitative.

a) Color of a car

b) Pressure in the car’s tires

c) Your zip code

d) Number of whole potato chips in a bag

Page 12: Ug statistics help

Answers to previous question

a) Color of a car - Qualitative

b) Pressure in the car’s tires - Quantitative

c) Your zip code - Qualitative

d) Number of whole potato chips in a bag - Quantitative

Page 13: Ug statistics help

Problems involving Pivot Tables/Contingency Tables

Consider the following data from an ice cream truck operator, specifically the information of the area of town of the customer and the number of Choco Tacos ordered when the truck stopped.

(See Pivot Table on chart that follows)

If you choose a customer at random, then find the probability that the customer is

a. from the South Side.

b. from the South Side and ordered 2 or more Choco Tacos.

c. East Side, given that the customer ordered only 1 Choco Taco.

Page 14: Ug statistics help

(Pivot table chart; not reproduced in this transcript.)

Page 15: Ug statistics help

Answers to previous question

a. from the South Side. 25/131 or 0.1908

b. from the South Side and ordered 2 or more Choco Tacos. Go within the table because this is an “AND”. 3/131 or 0.0229

c. East Side, given that the customer ordered only 1 Choco Taco. These are tricky. Any time you have a “given that”, your denominator changes to only those you are dealing with, or the “given that” total. So your answer would be 7/23 or 0.3043.

Please note that if the question was worded “Ordered 1 Choco Taco, given that the customer was from the East Side”, the answer would be 7/57 or 0.1228.
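The way the denominator changes for a “given that” question can be sketched in a few lines of Python. This is only an illustration: the pivot table itself is not reproduced here, so the counts below are just the ones quoted in the answers, and the variable names are my own.

```python
from fractions import Fraction

# Counts quoted in the answers (the full pivot table is not available here)
total_customers = 131       # everyone in the table
south_side = 25             # all South Side customers
south_and_two_or_more = 3   # South Side AND ordered 2 or more
east_and_one = 7            # East Side AND ordered exactly 1
ordered_one = 23            # everyone who ordered exactly 1
east_side = 57              # all East Side customers

# Plain and "AND" probabilities use the grand total as the denominator
p_south = Fraction(south_side, total_customers)
p_south_and_2plus = Fraction(south_and_two_or_more, total_customers)

# "Given that" probabilities use the total of the given group as the denominator
p_east_given_one = Fraction(east_and_one, ordered_one)
p_one_given_east = Fraction(east_and_one, east_side)

for label, p in [("P(South)", p_south),
                 ("P(South and 2+)", p_south_and_2plus),
                 ("P(East | ordered 1)", p_east_given_one),
                 ("P(ordered 1 | East)", p_one_given_east)]:
    print(f"{label} = {p} = {float(p):.4f}")
```

Notice that the same count of 7 gives two different answers depending on which group is the “given that” group.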

Page 16: Ug statistics help

Normal Distribution Problems

The mean amount of money spent on groceries by a family of four in a given week is found to be $327 with a standard deviation of $84.25. The distribution is known to be normally distributed.

a) Based on this information, find the probability the family would spend less than $300

b) Based on this information, find the probability the family would spend between $275 and $390

c) Based on this information, find the probability the family would spend more than $400

Page 17: Ug statistics help

Answers to previous question (I use Minitab, but you can use Excel or Standard Normal tables/z scores etc.)

Using Minitab.

Go to Graph >> Probability Distribution Plot

Click View Probability (last one on the right)

Click OK

Make sure the distribution is set to “Normal”

Input 327 for the mean and 84.25 for the standard deviation

(Continued Next Page)

Page 18: Ug statistics help

Continued

Click the Shaded Area tab

a) For “less than $300”, click the radio button next to x-values, click the Left Tail icon because we want to know the probability that it is less than a value, input 300 for the x value, and click OK.

We see our answer is 0.3743 on the graph.

b) For “between $275 and $390”, click the radio button next to x-values, click the Middle icon because we want to know the probability that it is between two values, input 275 for x1 and 390 for x2, and click OK.

We see our answer is 0.5042 on the graph.

c) For “more than $400”, click the radio button next to x-values, click the Right Tail icon because we want to know the probability that it is greater than a value, input 400 for the x value, and click OK.

We see our answer is 0.1931 on the graph.
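If Minitab is not handy, the same three areas can be checked with a minimal sketch like the one below, assuming Python 3.8 or later (that is when statistics.NormalDist was added to the standard library). The variable names are illustrative.

```python
from statistics import NormalDist

# Weekly grocery spending: normal with mean $327 and standard deviation $84.25
groceries = NormalDist(mu=327, sigma=84.25)

# a) Left tail: P(X < 300)
p_less_300 = groceries.cdf(300)

# b) Middle: P(275 < X < 390) is the area below 390 minus the area below 275
p_between = groceries.cdf(390) - groceries.cdf(275)

# c) Right tail: P(X > 400) is 1 minus the area below 400
p_more_400 = 1 - groceries.cdf(400)

print(f"a) P(X < 300)       = {p_less_300:.4f}")
print(f"b) P(275 < X < 390) = {p_between:.4f}")
print(f"c) P(X > 400)       = {p_more_400:.4f}")
```

The cdf method always gives the area to the left of a value, so the middle and right tail answers are built from left tail areas, just like you would do with a table.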

Page 19: Ug statistics help

Other Normal Distribution type problems (concerning z scores, Standard Normal Distribution, etc.)

For the standard normal distribution, find the probability of z being between -1.26 and 1.15. You might even be given a graph with coloring shown between these two z values on the bottom of the graph.

Page 20: Ug statistics help

Answer to previous question

As I have noted in lectures, these questions pop up from time to time because of the use of tables in statistics, in particular the standard normal distribution, z scores, etc.

I use Minitab, but you can easily do these using tables.

This is easy to solve with Minitab.

Go to Graph >> Probability Distribution Plot

Click View Probability (last one on the right)

Click OK

Make sure the distribution is set to “Normal”

Input 0 for the mean and 1 for the standard deviation (because we are dealing with the Standard Normal and that is where z scores come from)

Click the Shaded Area tab (Next Chart)

Page 21: Ug statistics help

Answer Continued

After clicking the Shaded Area tab

Click the radio button next to x-values (we ARE NOT putting in probabilities or percentages etc.).

Click the Middle icon because we want to know the probability that it is between two values

Input -1.26 for x1 and 1.15 for x2

Click OK

We get the attached graph (next chart) which shows us the probability is 0.7711 or 0.771 rounded to three decimal places.

Page 22: Ug statistics help

Continued

(Minitab probability distribution plot; not reproduced in this transcript.)
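As a quick check on the standard normal answer, here is a minimal sketch assuming Python 3.8 or later with the standard library. NormalDist() with no arguments is the standard normal (mean 0, standard deviation 1); the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
z = NormalDist()

# P(-1.26 < z < 1.15) = area left of 1.15 minus area left of -1.26
p_between_z = z.cdf(1.15) - z.cdf(-1.26)

print(f"P(-1.26 < z < 1.15) = {p_between_z:.4f}")
```

This is the same subtraction you would do by hand with a standard normal table.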

Page 23: Ug statistics help

Poisson Distribution Problems

Assume the Poisson Distribution applies. Use the given mean to find the indicated probability.

Find P(4) when mu = 3.

Page 24: Ug statistics help

Answer to previous question

I used Minitab, but you can use Excel, tables, etc.

In Minitab, go to the Calc tab, then Probability Distributions, and choose Poisson

When the window comes up, click button next to "Probability"

Input Mean of 3

Click button next to "Input Constant" Enter 4 and hit ok

In the session window you see the answer is 0.168031 or 0.1680 rounded to four decimals or 0.168 rounded to three decimals
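For those who prefer the formula, P(x) = e^(-mu) * mu^x / x!, here is a minimal Python sketch using only the standard library. The function name poisson_pmf is my own, not something from Minitab.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean number of events is mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)

# Find P(4) when mu = 3
p4 = poisson_pmf(4, 3)

print(f"P(4) = {p4:.6f}")
```

To answer a "4 or fewer" style question instead, you would add up poisson_pmf(0, 3) through poisson_pmf(4, 3).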

Page 25: Ug statistics help

Confidence Intervals

A random sample of 57 kitchen mixers have a mean price of $109 and standard deviation of $19.20. Find the 90% confidence interval for the population mean.

Page 26: Ug statistics help

Answer to the previous question (You can use Minitab, formulas, other technology, etc.)

I used Minitab

Go to Stat >> Basic Statistics >> 1 Sample Z (because your sample size was more than 30)

Input Sample Size of 57, Mean of 109 and Standard Deviation of 19.20.

Click Options and input 90 for your Confidence Level.

Click your OK buttons.

In the session window, you see that your 90% Confidence Interval is ($104.82, $113.18)

Note: if your sample size had been, say, 15 (less than 30), you would have used a 1 Sample t in Minitab.
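The 1 Sample Z interval is xbar plus or minus zC times s divided by the square root of n. Here is a minimal sketch of that arithmetic, assuming Python 3.8 or later with the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 57            # sample size
xbar = 109.0      # sample mean price in dollars
s = 19.20         # standard deviation in dollars
confidence = 0.90

# Critical value zC: leaves (1 - confidence)/2 in each tail, about 1.645 for 90%
z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error E = zC * s / sqrt(n)
margin = z_c * s / math.sqrt(n)

lower = xbar - margin
upper = xbar + margin

print(f"zC = {z_c:.3f}, E = {margin:.2f}")
print(f"90% confidence interval: (${lower:.2f}, ${upper:.2f})")
```

Changing confidence to 0.95 or 0.99 widens the interval, because zC gets larger.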

Page 27: Ug statistics help

Confidence Intervals (proportions)

In a survey of 9000 dog owners, 5400 say they walk their dogs at least once per week. Construct a 95% confidence interval of the population proportion of dog owners who walk their dogs at least once per week.

Page 28: Ug statistics help

Answer to previous question

I used Minitab

In Minitab, Go to Stat >> Basic Statistics >> 1 Proportion

Click the radio button next to summarized data

Input number of events 5400

Input number of trials 9000

Click Options button, set confidence level to 95%

Click Ok on both windows

In the session window, you see your answer is (0.5898, 0.6101) rounded to four decimal places
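If you are working from the textbook formula, p-hat plus or minus zC times the square root of p-hat(1 - p-hat)/n, a minimal Python sketch looks like this (standard library only, Python 3.8 or later assumed, names illustrative). This is the normal approximation; as far as I know Minitab's 1 Proportion can use a different (exact) method, so the last decimal place of the lower limit may differ slightly from the session window value above.

```python
import math
from statistics import NormalDist

x = 5400          # number of "events" (owners who walk their dogs weekly)
n = 9000          # number of trials (owners surveyed)
confidence = 0.95

p_hat = x / n     # sample proportion

# Critical value zC, about 1.96 for 95% confidence
z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error using the normal approximation
margin = z_c * math.sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - margin
upper = p_hat + margin

print(f"p-hat = {p_hat:.4f}, E = {margin:.4f}")
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```

With a sample this large the two methods are extremely close, so either answer should match the same multiple choice option.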

Page 29: Ug statistics help

Sample Size Questions (Here is just one example)

People were surveyed on how many trips they made to the bank the previous year. How many are needed to estimate the number of trips they made the previous year to within one trip with 95% confidence? Initial survey results indicate the population standard deviation sigma is equal to 21.4 trips.

Page 30: Ug statistics help

Answer to previous question

The formula is

N = (zC s/E)^2

We find zC using Minitab or a table (the slide's table of common values is not reproduced here). For 95% confidence, zC = 1.96.

s is the standard deviation (sigma in the question), which was given to be 21.4

E is 1 because the question said “within one trip”

N = (1.96 x 21.4/1)^2

N = (41.944)^2

N = 1759.2991 (BUT WE ALWAYS ROUND SAMPLE SIZES UP)

SO THE ANSWER IS WE NEED 1760
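The same calculation, including the "always round up" step, can be sketched in Python with the standard library. The variable names are illustrative.

```python
import math

z_c = 1.96      # critical value for 95% confidence
sigma = 21.4    # population standard deviation (trips)
E = 1           # desired margin of error ("within one trip")

# N = (zC * sigma / E)^2
n_raw = (z_c * sigma / E) ** 2

# Sample sizes are always rounded UP to the next whole number
n_needed = math.ceil(n_raw)

print(f"Unrounded N = {n_raw:.4f}")
print(f"Sample size needed = {n_needed}")
```

math.ceil is used instead of round because rounding down would leave you just short of the required sample size.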

Page 31: Ug statistics help

Correlation/Regression Type Questions

Find the equation of the regression line of the given data. It gives study hours (x) vs grade on final exam (y). Also give the correlation coefficient “r.” In addition, predict scores for someone who studies 100 hours, 150 hours and 300 hours. Data Below.

Study Hrs, x    Grade, y
160             98
170             95
130             82
120             76
80              65
190             99

Page 32: Ug statistics help

Answer to previous question (I used Minitab, you can use the technology of your choice)

This is a fun problem because being able to do regression is very valuable.

Get your data into Minitab. It copies and pastes nicely as always using the little icon in the corner (above right of data, I always choose Make tab delimited copy because it works well for me)

On this one, you simply go to Stat >> Regression >> Fitted Line Plot

You put your cursor in the box next to Response, then double click on your “y” variable to the left. In our case this is the Grade. Now put your cursor in the box next to Predictor and then double click on your “x” variable to the left. In our case, this is the Study hours.

After clicking OK, you get a nice graph that you can use to compare to the given choices if you are asked to do so.

(Continued on next chart)

Page 33: Ug statistics help

Answer continued

It also gives you the regression equation. On this one we got y = 38 + 0.3376x. Be careful to round to the correct number of decimal places. For example, if they said to round to three, this would be y = 38.000 + 0.338x. Minitab gives you the r^2 value in percentage form (94.5%). To get r, take the square root of the decimal form: the square root of 0.945 is 0.972 rounded to three decimal places. (r is positive here because the slope is positive.)

To answer the other questions, you just “plug and chug” being careful to see if the given value is within the range of data you were given.

So suppose we want to predict scores for someone who studies 100 hours, 150 hours and 300 hours.

For 100 hours we have y = 38.000 + 0.338(100) = 38 + 33.8 = 71.8 or 72 rounded to the nearest whole number

For 150 hours we have y = 38.000 + 0.338(150) = 38 + 50.7 = 88.7 or 89 rounded to the nearest whole number

For 300 hours, we would say that this would not be very meaningful because our study hours ranged from 80 to 190. 300 is well outside of that range.
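To see where the slope, intercept and r come from, here is a minimal least squares sketch in Python using only the standard library. The names (predict, in_data_range and so on) are my own. It uses unrounded coefficients, so the predictions can differ in the first decimal place from the hand calculation with the rounded equation, but they round to the same whole numbers.

```python
import math

hours = [160, 170, 130, 120, 80, 190]   # study hours, x
grades = [98, 95, 82, 76, 65, 99]       # final exam grade, y

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(grades) / n

# Sums of squares and cross products about the means
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in grades)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, grades))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)          # correlation coefficient

def predict(x):
    """Predicted grade for x study hours using the fitted line."""
    return intercept + slope * x

def in_data_range(x):
    """True if x is inside the range of study hours we actually observed."""
    return min(hours) <= x <= max(hours)

print(f"y = {intercept:.3f} + {slope:.4f}x, r = {r:.3f}")
for x in (100, 150, 300):
    if in_data_range(x):
        print(f"{x} hours: predicted grade {predict(x):.1f}")
    else:
        print(f"{x} hours: outside the data range, prediction not meaningful")
```

The range check is the code version of the warning above: 300 hours is well outside 80 to 190, so the line should not be used there.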

I'm attaching the Minitab graph result I got on the next page....

Page 34: Ug statistics help

(Minitab fitted line plot; not reproduced in this transcript.)

Page 35: Ug statistics help

Probability Type Questions

A cupcake company receives shipments of plastic decorations from three different suppliers in quantities of 100, 200 and 500. Three times a plastic decoration is selected at random, each time without replacement. Find the probability that a) all three products came from the 2nd supplier (assume suppliers are in order), b) none of the three products came from the second supplier.

Page 36: Ug statistics help

Answer to previous question

a) (200/800)(199/799)(198/798) = 0.015 (rounded to three decimal places)

b) (600/800)(599/799)(598/798) = 0.421 (rounded to three decimal places)

Remember your total is going down….
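The "total is going down" idea can be sketched as a small loop in Python (standard library only). The function name prob_all_from_group is my own invention for this illustration.

```python
from fractions import Fraction

def prob_all_from_group(group_size, total, draws):
    """Probability that every draw comes from the group, without replacement."""
    p = Fraction(1)
    for i in range(draws):
        # After i successful draws, both the group and the total have shrunk by i
        p *= Fraction(group_size - i, total - i)
    return p

total = 100 + 200 + 500     # all decorations from the three suppliers
second_supplier = 200

# a) all three came from the 2nd supplier
p_all_second = prob_all_from_group(second_supplier, total, 3)

# b) none came from the 2nd supplier, so all three came from the other 600
p_none_second = prob_all_from_group(total - second_supplier, total, 3)

print(f"a) {float(p_all_second):.3f}")
print(f"b) {float(p_none_second):.3f}")
```

If the question said "with replacement", the numerators and denominators would stay the same on every draw instead of going down.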

Page 37: Ug statistics help

Hypothesis Testing (Basics)

Write the null and alternative hypotheses and identify the claim.

A recent survey showed the percentage of dogs that bite is equal to 45%.

Ho p = 0.45

Ha p ≠ 0.45

So the claim is the null in this case. However you have to remember that the claim can be either the null or the alternative.

Page 38: Ug statistics help

More on the Basics of Hypothesis Testing

A recent survey of cab drivers showed more than 45% work more than 5 days a week

Ho p ≤ 0.45

Ha p > 0.45

The claim would be the alternative in this case.

Or

A recent survey of nurses showed at least 45% work more than 5 days a week

Ho p ≥ 0.45

Ha p < 0.45

The claim would be the null in this case because “at least” means 45% or 0.45 or more

Page 39: Ug statistics help

Hypothesis Testing Questions (There are many types, be familiar with what your course has covered)

For a study on the number of bones that dogs consume in a given week, you randomly select ten dog owners who give their dogs bones. The results are listed at the right. At a = 0.05, is there enough evidence to support the claim that dogs eat fewer than four bones per week? Assume the population is normally distributed.

Note I use a for alpha and m for mu.

4.4

3.2

3.2

3.7

2.3

4.7

3.8

4.8

4.6

2.2

Page 40: Ug statistics help

Answer to previous question (I USED Minitab)

Input data into Minitab, use copy and paste function as I’ve shown in lectures.

I labeled my column “Bones” (It doesn’t matter).

Go to Stat >> Basic Statistics >> 1 sample t (because we only had 10 sample data points)

Put cursor under Samples in Columns, then double click on Bones on the left or whatever you called it…

Check box for “Perform Hypothesis Test”

Input 4 for Hypothesized Mean

Click Options button

Input 95 for confidence level (because our a is 0.05, the confidence is 1 – 0.05 or 0.95 or 95%)

Make sure you choose “Less than” as your alternative because our alternative hypothesis is m < 4

Click your ok buttons… (Next Chart)

Page 41: Ug statistics help

Answer continued

In the session window you will see

Variable N Mean StDev SE Mean Bound T P

Bones 10 3.690 0.956 0.302 4.244 -1.03 0.166

What does this mean?

Well, our p value of 0.166 is greater than our a of 0.05 which means we Fail to reject our null hypothesis.

Remember the null hypothesis was that m >= 4 (the null always contains equality, in this instance, the null was that the mean was greater than or equal to 4)

This means we do not accept the claim which was the alternative hypothesis that m < 4

So we would say, “There is not sufficient evidence to support the claim that dogs eat less than four bones per week.”

Again above I used a for alpha and m for mu.
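Here is a minimal sketch of the same one sample t test done by formula in Python (standard library only, names illustrative). Two assumptions to flag: I read the first line of the data as the two values 4.4 and 3.2, which gives the N of 10 and mean of 3.690 shown in the session window, and because the standard library has no t distribution, the sketch compares the test statistic to the table critical value of 1.833 (one tail, a = 0.05, 9 degrees of freedom) instead of computing a p value.

```python
import math
import statistics

# Bones per week for the ten dogs (first two values read as 4.4 and 3.2)
bones = [4.4, 3.2, 3.2, 3.7, 2.3, 4.7, 3.8, 4.8, 4.6, 2.2]

mu_0 = 4                          # hypothesized mean from the null hypothesis
n = len(bones)
xbar = statistics.mean(bones)     # sample mean
s = statistics.stdev(bones)       # sample standard deviation
se = s / math.sqrt(n)             # standard error of the mean

# Test statistic t = (xbar - mu_0) / (s / sqrt(n))
t_stat = (xbar - mu_0) / se

# Critical value from a t table: one tail, a = 0.05, n - 1 = 9 degrees of freedom
t_crit = 1.833

# Left tailed test (Ha: m < 4): reject the null only if t is below -t_crit
reject_null = t_stat < -t_crit

print(f"N = {n}, Mean = {xbar:.3f}, StDev = {s:.3f}, SE Mean = {se:.3f}")
print(f"T = {t_stat:.2f}, critical value = {-t_crit}")
print("Reject the null" if reject_null else "Fail to reject the null")
```

The critical value approach and the p value approach always lead to the same decision; here the test statistic is not far enough into the left tail, which matches the p value of 0.166 being greater than 0.05.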

Page 42: Ug statistics help

I do this as a service to my statistics students and those who attend my online lectures.

These charts are not to be posted by other instructors, schools, etc. without my permission.

Want to thank me?

Never speak ill of statistics – Never say “Statistics Lie” – Because when done properly, Statistical Analysis does not lie… Selective data, partial results and people might be untruthful or skew results…

Other Ways to thank me…

Encourage children to do well in math and stress the importance…

Come see me at www.CranksMyTractor.com where I write about the good people, places and things I find in life.

Like my CMT page on Facebook at www.facebook.com/cranksmytractor

Catch me on stage telling stories about the good things in life…

My best to you all!