jenboucher.weebly.comjenboucher.weebly.com/.../5/8/3/4/58340953/unit_1_notes.docx · web viewthe...



Post on 10-Apr-2020


Unit 1.1 Notes: Analyzing Data

Statistics is the study, analysis, and interpretation of data. A measure of central tendency indicates the __________________ of the data set. The ______________, ______________, and ______________ are the most common measures of central tendency.

Measure | Definition | Example
Mean | (sum of the data values) / (number of data values) | (1 + 2 + 3 + 3 + 4 + 5 + 5 + 9) / 8 = 4
Median | |
Mode* | the most frequently occurring value(s) |

*If a data set has more than two modes, then the modes are probably not statistically useful. If no value occurs more frequently than any other, then there is no mode.

Find the measures of central tendency:

1. The frequency table shows the number of trees in the yard of each house on one street. What are the mean, median, and mode for the trees per yard?

Data set in order:

Mean: Median: Mode:

2. Find the mean, median, and mode of each set of values.

a. Time spent on the Internet per day (in minutes):

Data set in order:

Mean: Median: Mode:

b.

Data set in order:

Median: For a data set listed in order, the middle value for an odd number of data values; the mean of the two middle values for an even number of data values.

Median example: For 1, 2, 3, 3, 4, 5, 5, 9, the middle two values are 3 and 4. The median is their mean, (3 + 4) / 2 = 3.5.

Mode example: Two modes. In 1, 2, 3, 3, 4, 5, 5, 9, both 3 and 5 occur twice.
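The three measures can be checked with a short Python sketch. It assumes Python 3.8 or later and uses the standard `statistics` module on the example set 1, 2, 3, 3, 4, 5, 5, 9 from these notes; the variable names are illustrative only.

```python
import statistics

# Example data set used in these notes
data = [1, 2, 3, 3, 4, 5, 5, 9]

mean = statistics.mean(data)        # sum of the values / number of values
median = statistics.median(data)    # mean of the two middle values (even count)
modes = statistics.multimode(data)  # every value tied for most frequent

print(mean, median, modes)
```

`multimode` returns a list, so a set with two modes (here 3 and 5) reports both.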

75 68 43 120 65 180 95 225 140

Trees | 3 | 4 | 5 | 6 | 7 | 8
Yards | 1 | 5 | 7 | 4 | 1 | 2
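One way to work with a frequency table is to expand it back into the ordered data set. The sketch below assumes the Trees row holds the values and the Yards row holds how many yards have that many trees, as the problem statement describes; the variable names are illustrative.

```python
import statistics

# Frequency table: trees per yard -> number of yards
trees = [3, 4, 5, 6, 7, 8]
yards = [1, 5, 7, 4, 1, 2]

# Expand the table into the ordered data set (one entry per yard)
data = [t for t, count in zip(trees, yards) for _ in range(count)]

# Weighted mean: each value counts once per yard that has it
mean = sum(t * count for t, count in zip(trees, yards)) / sum(yards)
median = statistics.median(data)
modes = statistics.multimode(data)

print(data)
print(mean, median, modes)
```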

Mean: Median: Mode:

An outlier is a value that is substantially ______________ from the rest of the data in a set. If the data is in one variable, outliers can occur at the “ends”. They can be ________________ parts of data because they can affect measures of central tendency.

Identifying an Outlier:

Suppose the values 56, 65, 73, 59, 98, 65, and 59 are the data for the situations below. Would you discard the outlier? Explain.

a. Water temperature of a lake at seven locations

b. The number of customers in a restaurant each night in one week
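To see how an outlier can affect measures of central tendency, the following sketch compares the mean and median of these seven values with and without 98. Treating 98 as the outlier is an assumption made for illustration; whether it should be discarded still depends on the situation.

```python
import statistics

values = [56, 65, 73, 59, 98, 65, 59]
without_outlier = [v for v in values if v != 98]  # 98 treated as the outlier

# Compare both measures with and without the suspected outlier
for label, data in (("with 98", values), ("without 98", without_outlier)):
    print(label, round(statistics.mean(data), 2), statistics.median(data))
```

In this data set the mean moves more than the median when 98 is removed.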

Identify the outlier of each set of values.

a. 3.4, 4.5, 2.3, 5.9, 9.8, 3.3, 2.1, 3.0, 2.9

b. 17, 21, 19, 10, 15, 19, 14, 0, 11, 16

The range of a set of data is the ___________ between the greatest and least values. If you order data from least value to greatest value, the median ___________ the data into two parts. The median of each part divides the data further and you have ______ parts in all. The values separating the four parts are quartiles. The interquartile range is the difference between the _____ and ______ quartiles.
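The quartile idea described here can be sketched in Python. The helper `quartiles` is a hypothetical name, and it assumes the median-of-halves convention in which the median itself is left out of both halves when the number of values is odd; other quartile conventions exist and can give slightly different results.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                   # part below the median position
    upper = ordered[len(ordered) - half:]    # part above the median position
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

data = [1, 2, 3, 3, 4, 5, 5, 9]
q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)   # greatest value minus least value
iqr = q3 - q1                        # third quartile minus first quartile

print(q1, q2, q3, data_range, iqr)
```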

Comparing Data Sets:

The table shows average monthly temperatures of two cities. How can you compare the temperatures?

Jacksonville Texas

Mean:

Mode:

Range:

Q1:

Median:

Q3:

Interquartile range:

41 54 61 65 67 73 74 77 77 77 79

80 82 88 89 93 97 98 98 100

Using a Box-and-Whisker Plot:

A box-and-whisker plot uses __________________ and __________________ values, the _____________, and the ______ and _______ quartiles to display the spread, or variability, in a data set.

Entering a box-and-whisker plot in the graphing calculator:

Step 1: Use STAT EDIT to enter the data into L1

Step 2: In STAT PLOT, select a box-and-whisker plot. Enter L1 for the data you just entered.

Step 3: Adjust the WINDOW values to fit the data

Step 4: Hit GRAPH

Step 5: Use TRACE to find the quartiles Q1 = , Q2 = , and Q3 = .

Make a box-and-whisker plot for each set of values and use TRACE to find the quartiles Q1, Q2, and Q3:

a) 12 11 15 12 19 20 19 14 18 15 16

b) 120 145 133 105 117 150 130 136 128

Finding Percentiles:

A percentile is a number from ______ to ______ that you can associate with a value x from a data set. It shows the ______________________ of the data that are less than or equal to x. If x is at the 63rd percentile, then 63% of the data are less than or equal to x.

Here is an ordered list of midterm test scores for a Spanish class. What value is at the 65th percentile?

Find the values at the 30th and 90th percentiles for each data set.

a) 6283 5700 6381 6274 5700 5896 5972 6075 5993 5581

b) 7 12 3 14 17 20 5 3 17 4 13 2 15 9 15 18 16 9 1 6

[Four blank work tables, each with columns x, x̄, x − x̄, (x − x̄)²]
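The percentile definition (the percent of the data less than or equal to x) can be turned into a small sketch. Both function names are hypothetical, and rounding the position up when p% of n is not a whole number is an assumption, since the notes do not say how to handle that case.

```python
def value_at_percentile(values, p):
    """Return the smallest data value with at least p% of the data <= it."""
    ordered = sorted(values)
    n = len(ordered)
    k = max(1, -(-p * n // 100))   # ceil(p * n / 100) using floor division
    return ordered[k - 1]

def percent_at_or_below(values, x):
    """Percent of data values less than or equal to x."""
    return 100 * sum(v <= x for v in values) / len(values)

# Data set a) from the practice problems
data_a = [6283, 5700, 6381, 6274, 5700, 5896, 5972, 6075, 5993, 5581]
p30 = value_at_percentile(data_a, 30)
p90 = value_at_percentile(data_a, 90)

print(p30, percent_at_or_below(data_a, p30))
print(p90, percent_at_or_below(data_a, p90))
```

Checking the result with `percent_at_or_below` confirms that the stated percent of the data really is less than or equal to the value found.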

Unit 1.2 Notes: Standard Deviation

Standard deviation, represented by sigma (σ), and variance, represented by sigma squared (σ²), are measures showing how much data values deviate from the ___________. Measures of variation describe how the data in a data set are spread out. ________________ and ____________________ are two examples of measures of variation.

Find the mean, variance, and standard deviation of the following:

a) 6.9 8.7 7.6 4.8 9.0

b) 60 40 35 45 39

c) 52 63 65 77 80 82

d) 12 3 2 4 5 7

Finding Variance and Standard Deviation

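*Find the mean, $\bar{x}$, of the n values in a data set.

*Find the difference, $x - \bar{x}$, between each value x and the mean.

*Square each difference, $(x - \bar{x})^2$.

*Find the average (mean) of these squares. This is the variance:

$\sigma^2 = \dfrac{\sum (x - \bar{x})^2}{n}$

*Take the square root of the variance. This is the standard deviation:

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$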
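The procedure can be followed step by step in Python. This is a minimal sketch that reuses the example set 1, 2, 3, 3, 4, 5, 5, 9 from Unit 1.1 and divides by n, as the variance formula in these notes does; the variable names are illustrative.

```python
import math

data = [1, 2, 3, 3, 4, 5, 5, 9]   # example set from Unit 1.1
n = len(data)

mean = sum(data) / n                        # find the mean
differences = [x - mean for x in data]      # difference between each value and the mean
squares = [d ** 2 for d in differences]     # square each difference
variance = sum(squares) / n                 # mean of the squares
std_dev = math.sqrt(variance)               # square root of the variance

print(mean, variance, round(std_dev, 2))
```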

Finding the Standard Deviation using a Calculator:

Step 1: Use STAT EDIT to enter the data in list L1

Step 2: In STAT CALC select the 1-Var Stats option. In the 1-Var Stats results:

x̄ = mean

σx = standard deviation
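For comparison, the same two results can be computed in Python. The sketch assumes Python 3.8 or later and that σx means the population standard deviation (dividing by n), which is what `statistics.pstdev` computes; `statistics.stdev` divides by n − 1 instead and gives a slightly larger value. The data are practice set a) from this unit.

```python
import statistics

data = [6.9, 8.7, 7.6, 4.8, 9.0]   # practice set a)

mean = statistics.fmean(data)
sigma_x = statistics.pstdev(data)  # population standard deviation (divide by n)
s_x = statistics.stdev(data)       # sample standard deviation (divide by n - 1)

print(round(mean, 2), round(sigma_x, 2), round(s_x, 2))
```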

Practice: Find the mean and standard deviation using your graphing calculator

a) The Dow Jones Industrial average for the first 12 weeks of 1922:

1911.31 1956.07 1903.51 1958.22 1910.48 1983.26 2014.59 2023.21 2057.86 2034.98 2087.37 2067.14

b) The Dow Jones Industrial average for the first 12 weeks of 2008:

12800.18 12606.3 12099.3 12207.17 12743.19 12182.13 12348.21 12381.02 12266.39 11893.69 11951.09 11972.25

Using Standard Deviation to Describe Data:

A ________ ___________ diagram helps you understand the normal distribution of data in a population. About ___% of data values are within 1 standard deviation of the mean, about ___% are within 2 standard deviations, and about ____% are within 3 standard deviations of the mean. In a data list, every value falls within some number of standard deviations of the mean. For example, if the mean is 50 and the standard deviation is 10, then a value x with 40 ≤ x ≤ 60 is within _____ standard deviation of the ___________.

Finding standard deviations from the mean:

1. Know the data values, their mean, and their standard deviation
2. Need the number of standard deviations from the mean that include all the data
3. Draw a number line
4. Plot the data values and the mean
5. Mark off intervals of the deviation on either side of the mean.
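The number-line steps above amount to finding the value farthest from the mean and counting how many standard deviations are needed to reach it. A minimal sketch, using the symphony lengths and the stated mean and standard deviation from practice problem b) below; `std_devs_needed` is a hypothetical helper name.

```python
import math

def std_devs_needed(values, mean, std_dev):
    """Smallest whole number k with every value inside mean ± k * std_dev."""
    farthest = max(abs(v - mean) for v in values)
    return math.ceil(farthest / std_dev)

# Symphony lengths in minutes, with the mean and standard deviation as stated
lengths = [27, 30, 47, 35, 30, 40, 35, 22, 65]
k = std_devs_needed(lengths, mean=37, std_dev=12)

print(k)
```

The largest length, 65 minutes, is 28 minutes from the mean, which is more than two but fewer than three standard deviations of 12 minutes.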

Example: The table displays the number of hurricanes in the Atlantic Ocean from 1992 to 2006.

a) What are the mean and standard deviation?

b) Within how many standard deviations from the mean do all the values fall? Sketch a bell curve to explain the normal distribution.

Practice: Determine the whole number of standard deviations from the mean that include all data values.

a) The mean price of the nonfiction books on a bestsellers list is $25.07; the standard deviation is $2.62. $26.95, $22.95, $24.00, $24.95, $29.95, $19.95, $24.95, $24.00, $27.95, $25.00

b) The mean length of Beethoven’s nine symphonies is 37 minutes; the standard deviation is 12 minutes. 27 min, 30 min, 47 min, 35 min, 30 min, 40 min, 35 min, 22 min, 65 min

c) One of your friends says that the data below fall within three standard deviations from the mean. Your other friend disagrees, saying the data fall within six standard deviations from the mean. Who is correct? Explain.

Unit 1.3 Notes: Samples and Surveys

A population is all the members of a set. A sample is part of a population. If you determine a sample carefully, the statistics for the sample can be used to make ______________ ____________________ about the larger population. Samples vary in how well they reflect a population. A sample has a bias when a part of the population is ___________________ or __________________________.

Analyzing Sampling Methods:

A newspaper wants to find out what percent of the city population favors a property tax increase to raise money for local parks. What is the sampling method used for each situation? Does the sample have a bias? Explain.

a) A newspaper article on the tax increase invites readers to express their opinions on the newspaper's website.

b) A reporter interviews people leaving the city’s largest park.

c) A survey service calls every 50th listing from the local phone book.

Identify the sampling method and then identify any bias in each method.

Sampling Types and Methods:

1. Convenience sample – select any members of the population who are conveniently and readily available.

2. Self-selected sample – select only members of the population who volunteer for the sample.

3. Systematic sample – order the population in some way, and then select from it at regular intervals.

4. Random sample – all members of the population are equally likely to be chosen.
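The difference between a systematic sample and a random sample can be sketched in Python. The population here is hypothetical (1,000 numbered phone-book listings), and the fixed seed is only there so the sketch gives the same result each time it runs.

```python
import random

# Hypothetical population: 1,000 numbered phone-book listings
population = list(range(1, 1001))

# Systematic sample: every 50th listing (50, 100, ..., 1000)
systematic = population[49::50]

# Random sample of the same size: every listing is equally likely to be chosen
rng = random.Random(0)   # fixed seed only so the sketch is repeatable
simple_random = rng.sample(population, k=len(systematic))

print(systematic)
print(sorted(simple_random))
```

Convenience and self-selected samples cannot be generated this way, because who ends up in them depends on who is nearby or who volunteers.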

a) A supermarket wants to find the percent of shoppers who use coupons. A manager interviews every shopper entering the greeting card aisle.

b) A maintenance crew wants to estimate how many of 3000 air filters in a 30-story office building need replacing. The crew examines five filters chosen at random on each floor of the building.

c) To survey the eating habits of the community, employees of a local television station interview people visiting a food court in the mall.

Analyzing Study Methods:

Which type of study method is described in each situation? Should the sample statistics be used to make a general conclusion about the population?

a) Researchers randomly choose two groups from 10 volunteers. Over a period of 8 weeks, one group eats ice cream before going to sleep, and the other does not. Volunteers wear monitoring devices while sleeping, and researchers record dream activity.

b) Students in a science class record the height of bean plants as they grow.

Study Methods:

1. Observational study – measure or observe members of a sample in such a way that they are not affected by the study.

2. Controlled experiment – divide the sample into two groups. You impose a treatment on one group but not on the other “control” group. Then compare the effect on the treated group to the control group.

3. Survey – ask every member of the sample a set of questions.

c) Student council members ask every tenth student in the lunch line if they like the cafeteria food.

Identify the type of study method described and explain whether the sample statistics could be used to make a general conclusion about the population.

a) A list of students is randomly generated from the school database. Information for every student is entered into the database, and each student has an equally likely chance of being selected. The students selected are asked how much time they spend on household chores each week.

b) A gardener tests a new plant food by planting seeds from the same package in the same soil and location. Each plant is given the same amount of water, but half of the plants are given food and the other half are given no food at all. He records the growth and flowering rate of each plant.

Designing a Survey:

a) During the 2008 Olympic Games, a US swimmer won more medals than any other swimmer in history. What sampling method could you use to find the percent of students in your school who recognize that swimmer from a photograph? What is an example of a survey question that is likely to yield information that has no bias?

b) What sampling method could you use to find the percent of residents in your neighborhood who recognize the governor of your state by name? What is an example of a survey question that is likely to yield information that has no bias?

c) What sampling method could you use to find the percent of adults in your community who support building more nuclear power plants? What is an example of a survey question that is likely to yield unbiased information?

Unit 1.4 Notes: Normal Distribution

Normal distributions occur often in real life, for example in standardized test scores, heights of adults, weight, and blood pressure. A normal distribution curve has the following characteristics:

*it has a maximum at the center

*it is symmetric about the mean

*the mean, mode, and median are equal

*about 68% of the values fall within one standard deviation of the mean: 34% fall within one standard deviation above the mean and 34% fall within one standard deviation below the mean

*about 95% of the values fall within two standard deviations of the mean: 47.5% fall within two standard deviations above the mean and 47.5% fall within two standard deviations below the mean

*about 99.7% of the values fall within three standard deviations of the mean: 49.85% fall within three standard deviations above the mean and 49.85% fall within three standard deviations below the mean.
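These percentages can be checked against an exact normal curve. The sketch below assumes Python 3.8 or later, which provides `statistics.NormalDist`; the helper `percent_within` is a hypothetical name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

for k in (1, 2, 3):
    print(k, round(percent_within(k), 2))
```

The computed values are close to 68, 95, and 99.7, which is why the notes say "about" for each percentage.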

Analyzing Normally Distributed Data:

The heights of men in a survey are normally distributed about the mean. Use the graph to answer the following

a) About what percent of men aged 25-34 are 69-71 inches tall?

b) Suppose the survey included data on 100 men. About how many would you expect to be 69-71 inches tall?

c) The mean of the data is 70, and the standard deviation is 2.5. Approximately what percent of men are within one standard deviation of the mean height?

PRACTICE:

a) For a population of male European eels, the mean body length is 15.7 inches, with one standard deviation above the mean at 18.5 inches and one standard deviation below the mean at 12.9 inches. Sketch a normal curve showing the eel lengths at one, two, and three standard deviations from the mean.

b) For a population of female European eels, the mean body length is 21.1 inches. The standard deviation is 4.7 inches. Sketch a normal curve showing eel lengths at one, two, and three standard deviations.

c) Mean = 45, standard deviation =5

Sketching a Normal Curve:

1. Need to know the mean and the standard deviation of the population
2. Need the values that are one, two, and three standard deviations from the mean
3. Multiply the standard deviation by 1, 2, 3.
4. Draw vertical lines at the mean ± these values.
5. Sketch the normal curve
6. Label the mean
7. Divide the graph into standard deviations
8. Label the percentages for each section
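Steps 2 through 4 come down to computing the mean ± 1, 2, and 3 standard deviations. A minimal sketch using the values from practice problem c) (mean 45, standard deviation 5); `curve_marks` is a hypothetical helper name.

```python
def curve_marks(mean, std_dev):
    """Values at 1, 2, and 3 standard deviations below and above the mean."""
    return [(mean - k * std_dev, mean + k * std_dev) for k in (1, 2, 3)]

marks = curve_marks(mean=45, std_dev=5)

# Each pair is (value below the mean, value above the mean)
for k, (low, high) in enumerate(marks, start=1):
    print(k, low, high)
```

Each pair gives the positions of the two vertical lines for that number of standard deviations.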

d) Mean = 45, standard deviation = 3.5

Analyzing a Normal Distribution:

The scores on the Algebra 2 final are approximately normally distributed with a mean of 150 and a standard deviation of 15.

a) What percentage of the students who took the test scored above 180?

b) If 250 students took the final exam, approximately how many scored above 135?

c) If 13.6% of the students received a B on the final, how can you describe their scores? Explain.
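Questions like these can be approached by splitting the curve into bands one standard deviation wide. The sketch below builds the band percentages from the 68, 95, and 99.7 figures in these notes (for example, 47.5 − 34 = 13.5); the names `BANDS` and `percent_above` are hypothetical, and the results are approximations, not exact normal-curve values.

```python
# Approximate percent of data in each band, listed left to right:
# below -3, -3 to -2, -2 to -1, -1 to 0, 0 to 1, 1 to 2, 2 to 3, above 3
# (all measured in standard deviations from the mean)
BANDS = [0.15, 2.35, 13.5, 34.0, 34.0, 13.5, 2.35, 0.15]

def percent_above(k):
    """Approximate percent of data above the mean + k standard deviations.

    k is a whole number from -3 to 3; negative k means below the mean.
    """
    return sum(BANDS[k + 4:])

mean, std_dev, students = 150, 15, 250

k_180 = (180 - mean) // std_dev   # 180 is 2 standard deviations above the mean
k_135 = (135 - mean) // std_dev   # 135 is 1 standard deviation below the mean

pct_above_180 = percent_above(k_180)
count_above_135 = students * percent_above(k_135) / 100

print(round(pct_above_180, 2), round(count_above_135))
```

The two 13.5 entries are the bands between one and two standard deviations from the mean (scores from 120 to 135, or from 165 to 180), which is close to the 13.6% mentioned in part c).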

Practice: A set of data has a normal distribution with a mean of 50 and a standard deviation of 8. Find the percent of data within each interval.

a) From 42 to 58 b) Greater than 34