applying the normal distribution: z-scores chapter 3.5 – tools for analyzing data mathematics of...

Post on 25-Dec-2015


Applying the Normal Distribution: Z-Scores

Chapter 3.5 – Tools for Analyzing Data

Mathematics of Data Management (Nelson)

MDM 4U

Comparing Data

Consider the following two students:

Student 1: MDM 4U, Mr. Lieff, Semester 1, 2004-2005, Mark = 84%, class distribution $\bar{x} = 74$, $\sigma = 8$

Student 2: MDM 4U, Mr. Lieff, Semester 2, 2005-2006, Mark = 83%, class distribution $\bar{x} = 70$, $\sigma = 9.8$

Can we compare the two students fairly when the mark distributions are different?

Mark Distributions for Each Class

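[Figure: mark distribution for each class. Semester 1, 2004-05: horizontal axis marked at 50, 58, 66, 74, 82, 90, 98. Semester 2, 2005-06: horizontal axis marked at 40.6, 50.4, 60.2, 70, 79.8, 89.6, 99.4.]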

Comparing Distributions

It is difficult to compare two distributions when they have different characteristics

For example, the two histograms have different means and standard deviations

z-scores allow us to make the comparison

[Figure: two histograms, each titled "Collection 1 Histogram". The first plots Count against a (values 1 to 8); the second plots Count against b (values 4 to 11).]

The Standard Normal Distribution

A distribution with a mean of zero and a standard deviation of one: X~N(0,1²)

Each element of any normal distribution can be translated to the same place on a Standard Normal Distribution using the z-score of the element

The z-score is the number of standard deviations the piece of data is below or above the mean

If the z-score is positive, the data lies above the mean; if negative, below

$z = \frac{x - \bar{x}}{\sigma}$

Standardizing

The process of reducing the normal distribution to a standard normal distribution N(0,1²) is called standardizing

Remember that a standardized normal distribution has a mean of 0 and a standard deviation of 1
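As a minimal sketch of standardizing, the following Python snippet converts each value in a small data set to a z-score. The data values and the helper name `standardize` are made up for illustration, and the use of the population standard deviation is an assumption about how the standard deviation is calculated.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return the z-score of every value: (x - mean) / standard deviation."""
    mean = fmean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    return [(x - mean) / sd for x in values]

marks = [62, 70, 74, 78, 86]  # illustrative data, not from the course
z_scores = standardize(marks)

# After standardizing, the mean is (numerically) 0 and the standard deviation is 1
print(round(fmean(z_scores), 6), round(pstdev(z_scores), 6))
```

Whatever the original mean and spread, the standardized values always end up with a mean of 0 and a standard deviation of 1, which is what makes different distributions comparable.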

Example 1

For the distribution X~N(10,2²) determine the number of standard deviations each value lies above or below the mean:

a. x = 7

z = (7 – 10)/2 = -1.5

7 is 1.5 standard deviations below the mean

b. x = 18.5

z = (18.5 – 10)/2 = 4.25

18.5 is 4.25 standard deviations above the mean (anything beyond 3 is an outlier)

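Example continued…

[Figure: normal curve for X~N(10,2²) with 6, 8, 10, 12, 14 and 16 on the horizontal axis and the values 7 and 18.5 marked. Each band within one standard deviation of the mean holds 34% of the data, each band between one and two standard deviations holds 13.5%, and each band between two and three holds 2.35%. 95% of the data lies within two standard deviations and 99.7% within three.]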

Standard Deviation

A recent math quiz offered the following data

The z-scores offer a way to compare scores among members of the class, find out how many had a mark greater than yours, indicate position in the class, etc.

mean = 68.0, standard deviation = 10.9

[Figure: Test 1 Histogram, plotting Count (0 to 10) against marks (40 to 90).]

Example 2:

Suppose your mark was 64

Compare your mark to the rest of the class

z = (64 – 68.0)/10.9 = -0.37

Using the z-score table on page 398, we get 0.3557 or 35.6%

So 35.6% of the class has a mark less than or equal to yours

Example 3: Percentiles

The kth percentile is the data value that is greater than k% of the population

If another student has a mark of 75, what percentile is this student in?

z = (75 - 68)/10.9 = 0.64

From the table on page 398 we get 0.7389 or 73.9%, so the student is in the 74th percentile: their mark is greater than 74% of the others
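The table lookups in Examples 2 and 3 can be cross-checked with Python's standard library. This is only a sketch, assuming the quiz marks are normally distributed with the mean and standard deviation given above. `NormalDist` comes from the `statistics` module (Python 3.8 or later), and the helper name `percent_below` is hypothetical. Because the z-score is not rounded to two decimal places first, the results can differ slightly from the table values.

```python
from statistics import NormalDist

# Quiz marks modelled as a normal distribution (an assumption)
quiz = NormalDist(mu=68.0, sigma=10.9)

def percent_below(mark):
    """Percent of the class expected to score at or below the given mark."""
    return 100 * quiz.cdf(mark)

print(round(percent_below(64), 1))  # Example 2: close to the table's 35.6%
print(round(percent_below(75), 1))  # Example 3: close to the table's 73.9%
```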

Example 4: Ranges

Now find the percent of data between a mark of 60 and 80

For 60: z = (60 – 68)/10.9 = -0.73 gives 23.3%

For 80: z = (80 – 68)/10.9 = 1.10 gives 86.4%

86.4% - 23.3% = 63.1%

So 63.1% of the class is between a mark of 60 and 80
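The range calculation amounts to subtracting two cumulative percentages. A short sketch of that step, again assuming normally distributed marks and using the hypothetical helper name `percent_between`:

```python
from statistics import NormalDist

# Quiz marks modelled as a normal distribution (an assumption)
quiz = NormalDist(mu=68.0, sigma=10.9)

def percent_between(low, high):
    """Percent of the class expected to score between low and high."""
    return 100 * (quiz.cdf(high) - quiz.cdf(low))

# Close to the 63.1% found from the table; the small gap comes from
# the table method rounding each z-score to two decimal places
print(round(percent_between(60, 80), 1))
```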

Back to the two students...

Student 1: z = (84 – 74)/8 = 1.25

Student 2: z = (83 – 70)/9.8 = 1.327

Student 2 has the lower mark, but a higher z-score!
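The comparison of the two students can be reproduced in a few lines of Python. The marks, means and standard deviations are those given for the two classes (mean 74 with standard deviation 8, and mean 70 with standard deviation 9.8); the function name `z_score` is just an illustrative choice.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z1 = z_score(84, 74, 8)    # Student 1, Semester 1, 2004-2005
z2 = z_score(83, 70, 9.8)  # Student 2, Semester 2, 2005-2006

print(round(z1, 3), round(z2, 3))
print("Student 2" if z2 > z1 else "Student 1", "is further above their class mean")
```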

Exercises

read through the examples on pages 180-185

try page 186 #2-5, 7, 8, 10

Mathematical Indices

Chapter 3.6 – Tools for Analyzing Data

Mathematics of Data Management (Nelson)

MDM 4U

What is an Index?

An index is an arbitrarily defined number that provides a measure of scale

Indices are used to indicate a value so that we can make comparisons, but they do not represent an actual measurement or quantity

1) BMI – Body Mass Index

A mathematical formula created to determine whether a person’s mass puts them at risk for health problems

$\text{BMI} = \frac{m}{h^2}$, where m = mass (kg), h = height (m)

Standard / Metric BMI Calculator: http://nhlbisupport.com/bmi/bmicalc.htm

Underweight: Below 18.5

Normal: 18.5 - 24.9

Overweight: 25.0 - 29.9

Obese: 30.0 and Above
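A small Python sketch of the BMI calculation and the category table above. The function names are hypothetical, the sample mass and height are made up, and values that fall between the listed ranges (for example 24.95) are assigned to the lower category, which is an assumption the table itself does not settle.

```python
def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kilograms divided by height in metres squared."""
    return mass_kg / height_m ** 2

def bmi_category(value):
    """Map a BMI value onto the four categories listed above."""
    if value < 18.5:
        return "Underweight"
    if value < 25.0:  # assumption: 24.9 to 25.0 counts as Normal
        return "Normal"
    if value < 30.0:  # assumption: 29.9 to 30.0 counts as Overweight
        return "Overweight"
    return "Obese"

value = bmi(70, 1.75)  # illustrative person: 70 kg, 1.75 m
print(round(value, 1), bmi_category(value))
```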

2) Slugging Percentage

Baseball is one of the most statistically analyzed sports in the world

A number of indices are used to measure the value of a player

Batting Average (AVG) measures how often a player gets a hit (hits / at bats)

Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats)

$\text{SLG} = \frac{TB}{AB}$, where $TB = 1B + 2 \times 2B + 3 \times 3B + 4 \times HR$

and 1B = singles, 2B = doubles, 3B = triples, HR = home runs, AB = at bats

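Slugging Percentage Example

e.g. DH Frank Thomas, Toronto Blue Jays

http://sports.espn.go.com/mlb/players/stats?playerId=2370

2006 Statistics: 466 AB, 126 H, 11 2B, 0 3B, 39 HR

Hits already count one base for every single, double, triple and home run (H = 1B + 2B + 3B + HR), so TB = H + 2B + 2*3B + 3*HR

SLG = (H + 2B + 2*3B + 3*HR) / AB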

= (126 + 11 + 2*0 + 3*39) / 466

= 254 / 466

= 0.545 (3 decimal places)
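The same calculation as a short Python sketch, using the 2006 statistics quoted above. The function name and parameter names are illustrative; singles are derived from hits because the stat line does not list them separately.

```python
def slugging_percentage(hits, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at bats, with TB = 1B + 2*2B + 3*3B + 4*HR."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats

slg = slugging_percentage(hits=126, doubles=11, triples=0, home_runs=39, at_bats=466)
print(f"{slg:.3f}")  # 254 total bases over 466 at bats
```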

Moving Average

Used when time-series data show a great deal of fluctuation (e.g. long term trend of a stock)

takes the average of the most recent n values

e.g. 5-Day Moving Average

cannot calculate until the 5th day

value for Day 5 is the average of Days 1-5

value for Day 6 is the average of Days 2-6

e.g. Look up a stock symbol at http://ca.finance.yahoo.com

Click Charts > Technical chart > n-Day Moving Average
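A minimal Python sketch of an n-day moving average as described above. The prices are made-up values for illustration, and the function name `moving_average` is hypothetical.

```python
def moving_average(values, n):
    """Return the n-value moving averages.

    The first result belongs to the nth value (for n = 5, Day 5),
    because no average can be calculated before then.
    """
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]

prices = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 13.0]  # Days 1-7, made-up data
averages = moving_average(prices, 5)

# averages[0] covers Days 1-5, averages[1] covers Days 2-6, and so on
for day, value in enumerate(averages, start=5):
    print(f"Day {day}: {value:.2f}")
```

Seven daily values yield only three 5-day averages, which is why a moving-average line on a chart starts later than the price line it smooths.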

Exercises

read pp. 189-192

1a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2007), 8, 9, 11

