das20502 chapter 1 descriptive statistics


Upload: rozainita-rosley

Post on 27-Jan-2015


Category:

Education



Page 1: Das20502 chapter 1 descriptive statistics

CHAPTER 1: Descriptive Statistics

Objectives:

1. To study the basic introductory concepts of statistics, including the branches of statistics, the basic terms of statistics, and types of variables.

2. To be able to use graphical and numerical methods to describe a data set.

3. To be able to find the mean, median, mode and standard deviation for grouped data and ungrouped data.

Page 2: Das20502 chapter 1 descriptive statistics

Descriptive Statistics

• Ungrouped Data
• Grouped Data

Page 3: Das20502 chapter 1 descriptive statistics

Ungrouped Data

Measurement of Central Tendency: Mode, Median, Mean

Measurement of Dispersion: Variance, Standard Deviation

Page 4: Das20502 chapter 1 descriptive statistics

Grouped Data

Measurement of Central Tendency: Mode, Median, Mean

Measurement of Dispersion: Variance, Standard Deviation

Page 5: Das20502 chapter 1 descriptive statistics

Definition of basic terms

a) Population consists of all items or elements of interest for a particular decision or investigation. E.g.: all married staff over the age of 25 in UTHM.

b) Sample is a certain number of elements chosen from a population; a sample is a subset of the population. E.g.: a list of married staff over the age of 25 in the Registrar's Office would be a sample from the population of all married staff over the age of 25 in UTHM.

c) Random sample is a sample drawn in such a way that each element of the population has a chance of being selected.

d) Simple random sample is a sample drawn in such a way that any particular sample of a specified size has the same chance of being selected as any other sample of that size.

Page 6: Das20502 chapter 1 descriptive statistics

e) Element / member is a specific subject or individual about which the information is collected.

f) Variable is a characteristic of the individuals within the sample or population.

g) Observation / measurement is the value of a variable for an element.

h) Data set is a collection of values of one or more variables.

i) Ungrouped data set contains information on each member of a sample or population.

j) Grouped data set is a collection of data which are grouped in classes.

k) Raw data is data recorded in the sequence in which they are collected and before they are processed or ranked.

Page 7: Das20502 chapter 1 descriptive statistics

l) Population parameter is a descriptive measure computed from population data.

m) Sample statistic is a descriptive measure computed from sample data.

n) Outliers / extreme values are values that are very small or very large relative to the majority of the values in a data set.

Page 8: Das20502 chapter 1 descriptive statistics

Example

1. The following table gives the number of sales of A4 paper in 8 shops in Melaka.

Shop    Number of A4 Paper (in reams)
1       2000
2       2500
3       3000
4       5000
5       7000
6       5000
7       4000
8       5500

In this table, the shops are the elements (or members), the number of A4 paper sold is the variable, and each value in the second column is an observation (or measurement).

Page 9: Das20502 chapter 1 descriptive statistics

Measures of central tendency are statistical measures which describe the position (centre) of a distribution.

They are also called statistics of location, and are the complement of statistics of dispersion, which provide information concerning the spread of the observations.

In the univariate context, the mean, median and mode are the most commonly used measures of central tendency.

Page 10: Das20502 chapter 1 descriptive statistics

Mean
- The average of the data values

Median
- The middle value in a ranked list
- The data must be arranged in increasing or decreasing order
- Defined for both ungrouped data and grouped data

Mode
- The value that occurs most frequently

Page 11: Das20502 chapter 1 descriptive statistics

Sample vs. Population

[Figure: Population and Sample]

Page 12: Das20502 chapter 1 descriptive statistics


Page 13: Das20502 chapter 1 descriptive statistics

Median for Ungrouped Data

$$
M = \text{Median} =
\begin{cases}
x_{(n+1)/2}, & \text{when } n \text{ is odd} \\[4pt]
\dfrac{x_{n/2} + x_{(n/2)+1}}{2}, & \text{when } n \text{ is even}
\end{cases}
$$

Page 14: Das20502 chapter 1 descriptive statistics

Mode for Ungrouped Data

First obtain the frequency of each value in the data set.

• If no value occurs more than once, then the data set has no mode.

• Otherwise, any value that occurs with the greatest frequency is a mode of the data set.
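The rule above can be sketched in Python as follows. This is only an illustration: the function name `modes` is made up here, and the two data sets are borrowed from the mean exercise later in this chapter.

```python
from collections import Counter


def modes(values):
    """Return all modes of an ungrouped data set, or [] if there is no mode."""
    if not values:
        return []
    counts = Counter(values)           # frequency of each distinct value
    highest = max(counts.values())     # greatest frequency in the data set
    if highest == 1:                   # no value occurs more than once
        return []
    return sorted(v for v, c in counts.items() if c == highest)


data_a = [14, 11, -10, 8, 8, -16]
data_b = [23, 14, 6, -7, -2, 9, 16]
print(modes(data_a))  # [8]
print(modes(data_b))  # []
```

Returning a list lets the same function report a data set with two or more modes.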

Page 15: Das20502 chapter 1 descriptive statistics

Exercise

1. Find the mean for the price of pens (in RM) below:
2.00 2.50 3.00 3.50 2.50

2. A sample of six students in UTHM is selected and their heights are measured, resulting in the following data:
150.2 cm 1.592 m 149.4 cm 152.7 cm 1.533 m 1.510 m
Find the sample mean.

3. Calculate the mean for the following data:
a) 14, 11, -10, 8, 8, -16
b) 23, 14, 6, -7, -2, 9, 16
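One way to check the answers to this exercise is the short Python sketch below. The helper names `mean` and `to_cm` are illustrative. Reporting the heights in centimetres is an assumption; the point is only that all six values must share one unit before they are averaged.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def to_cm(value, unit):
    """Convert a height to centimetres (only 'cm' and 'm' are handled)."""
    if unit == "cm":
        return value
    if unit == "m":
        return value * 100
    raise ValueError(f"unsupported unit: {unit}")


pen_prices = [2.00, 2.50, 3.00, 3.50, 2.50]                      # in RM
heights = [(150.2, "cm"), (1.592, "m"), (149.4, "cm"),
           (152.7, "cm"), (1.533, "m"), (1.510, "m")]
heights_cm = [to_cm(value, unit) for value, unit in heights]     # mixed units made uniform
data_a = [14, 11, -10, 8, 8, -16]
data_b = [23, 14, 6, -7, -2, 9, 16]

print(round(mean(pen_prices), 2))   # mean pen price in RM
print(round(mean(heights_cm), 2))   # mean height in cm
print(round(mean(data_a), 2))
print(round(mean(data_b), 2))
```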

Page 16: Das20502 chapter 1 descriptive statistics

Example

1. Find the median of the following examination scores:
80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90

2. The following data represent the number of home runs hit by all teams in the Indian League in 2004:
157 133 189 215 208 139 152 167 202 197 124 239 191 169
Find the median of this data set.

3. The data below represent the length (in seconds) of a random sample of songs released in the 90's:
198 255 287 207 176 224 215 208 241
Find the median of the data given.
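The median rule for ungrouped data (middle value for odd n, average of the two middle values for even n) can be sketched in Python and applied to the three examples above. The function name `median` is illustrative. The code uses 0-based list positions, so the statistical position (n+1)/2 becomes index n // 2.

```python
def median(values):
    """Median of ungrouped data: rank the data, then take the middle value(s)."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]                      # x_{(n+1)/2} in 1-based positions
    return (ranked[mid - 1] + ranked[mid]) / 2  # average of x_{n/2} and x_{(n/2)+1}


scores = [80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90]
home_runs = [157, 133, 189, 215, 208, 139, 152, 167, 202, 197, 124, 239, 191, 169]
song_lengths = [198, 255, 287, 207, 176, 224, 215, 208, 241]

print(median(scores))        # 67
print(median(home_runs))     # 179.0
print(median(song_lengths))  # 215
```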

Page 17: Das20502 chapter 1 descriptive statistics


Example

Page 18: Das20502 chapter 1 descriptive statistics


Example

Page 19: Das20502 chapter 1 descriptive statistics

Sample variance, s², for a sample of n data values:

Page 20: Das20502 chapter 1 descriptive statistics

The variance of the n observations is

$$
s^2 = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \dots + (y_n - \bar{y})^2}{n - 1} = \frac{\sum (y_i - \bar{y})^2}{n - 1}
$$

The standard deviation s is the square root of the variance,

$$
s = \sqrt{s^2}
$$

Page 21: Das20502 chapter 1 descriptive statistics

Computing the Variance

Formula for a population:

$$
\sigma^2 = \frac{\sum (x - \mu)^2}{N}
$$

Formula for a sample:

$$
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}
$$

Page 22: Das20502 chapter 1 descriptive statistics

Example: Find the sample variance for the given data:

6.1 5.7 5.8 6.0 5.8 6.3

Find the variance and standard deviation of the following data:

5 2 1 7 6 9
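A minimal Python sketch of the sample formula (divisor n − 1) applied to the two data sets above. The function names are illustrative. Treating the second data set as a sample rather than a population is an assumption, since the example does not say which is meant.

```python
import math


def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


first = [6.1, 5.7, 5.8, 6.0, 5.8, 6.3]
second = [5, 2, 1, 7, 6, 9]      # assumed to be a sample, not a population

print(round(sample_variance(first), 4))
print(round(sample_variance(second), 4), round(sample_std(second), 4))
```

For a population, the same sum of squared deviations would be divided by N instead of n − 1.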

Page 23: Das20502 chapter 1 descriptive statistics

Compute the sample variance and standard deviation of the heights of the starting players on Team I.

Page 24: Das20502 chapter 1 descriptive statistics

Organizing Data

Variable: A characteristic that varies from one person or thing to another.

• Qualitative: A non-numerically valued variable.

• Quantitative: A numerically valued variable.

  - Discrete: A quantitative variable whose possible values can be listed.

  - Continuous: A quantitative variable whose possible values form some interval of numbers.

Page 25: Das20502 chapter 1 descriptive statistics

Organizing Data

Grouped frequency distribution
- Obtained by giving the classes or intervals together with the number of data values in each class.

Cumulative frequency
- The frequency of a class that includes all values in a data set that fall below the upper boundary of that class.

Class midpoint or mark
- The number halfway between the lower and upper class limits of a class:

$$
\text{Class midpoint} = \frac{\text{lower limit} + \text{upper limit}}{2}
$$

Class width
- Upper boundary – lower boundary

Page 26: Das20502 chapter 1 descriptive statistics

Organizing Data

Example: Given the data below:

Construct the frequency distribution table with class limits 42 – 45, 46 – 49, 50 – 53 and so on.

Page 27: Das20502 chapter 1 descriptive statistics

Organizing Data

The ages of the employees in a company:

Age        No. of Employees
20 – 29    30
30 – 39    35
40 – 49    20
50 – 59    10
60 – 69    5

Construct the frequency distribution table and find the class midpoint and class width.
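The class midpoint and class width for the age table can be worked out with the Python sketch below. It assumes that each class boundary lies halfway between the upper limit of one class and the lower limit of the next (so 20 – 29 has boundaries 19.5 and 29.5). The slides use class boundaries without defining them, so that convention is an assumption.

```python
class_limits = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]
frequencies = [30, 35, 20, 10, 5]

# Gap between the upper limit of one class and the lower limit of the next
# (1 here); the boundaries are assumed to sit halfway across this gap.
gap = class_limits[1][0] - class_limits[0][1]

midpoints = []
widths = []
for lower, upper in class_limits:
    midpoints.append((lower + upper) / 2)          # halfway between the class limits
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    widths.append(upper_boundary - lower_boundary)  # upper boundary - lower boundary

for (lower, upper), f, mid, width in zip(class_limits, frequencies, midpoints, widths):
    print(f"{lower} - {upper}: frequency={f}, midpoint={mid}, width={width}")
```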

Page 28: Das20502 chapter 1 descriptive statistics

The Ministry of Health Malaysia for Health Statistics publishes data on weights and heights by age and sex in Vital and Health Statistics. The weights shown in Table 6a, given to the nearest tenth of a pound, were obtained from a sample of 18 – 24-year-old males. Construct a grouped data table for these weights. Use a class width of 20 and a first cutpoint of 120.

Table 6a: Weights of 37 males, aged 18-24 years

129.2 185.3 218.1 182.5 142.8
155.2 170.0 151.3 187.5 145.6
167.3 161.0 178.7 165.0 172.5
191.1 150.7 187.0 173.7 178.2
161.7 170.1 165.8 214.6 136.7
278.8 175.6 188.7 132.1 158.5
146.4 209.1 175.4 182.0 173.6
149.9 158.6
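A Python sketch of how the grouped table for these weights could be built. It assumes each class runs from its lower cutpoint up to, but not including, the next cutpoint (120 to under 140, 140 to under 160, and so on). The exercise does not spell out that convention, and the variable names are illustrative.

```python
weights = [
    129.2, 185.3, 218.1, 182.5, 142.8,
    155.2, 170.0, 151.3, 187.5, 145.6,
    167.3, 161.0, 178.7, 165.0, 172.5,
    191.1, 150.7, 187.0, 173.7, 178.2,
    161.7, 170.1, 165.8, 214.6, 136.7,
    278.8, 175.6, 188.7, 132.1, 158.5,
    146.4, 209.1, 175.4, 182.0, 173.6,
    149.9, 158.6,
]
FIRST_CUTPOINT = 120
CLASS_WIDTH = 20

# Enough classes to cover the largest weight.
num_classes = int((max(weights) - FIRST_CUTPOINT) // CLASS_WIDTH) + 1
counts = [0] * num_classes
for w in weights:
    # Index of the class whose lower cutpoint is at or below w.
    counts[int((w - FIRST_CUTPOINT) // CLASS_WIDTH)] += 1

cumulative = 0
for k, f in enumerate(counts):
    lower = FIRST_CUTPOINT + k * CLASS_WIDTH
    cumulative += f
    print(f"{lower} - under {lower + CLASS_WIDTH}: frequency={f}, cumulative={cumulative}")
```

Two classes come out empty because a single weight (278.8) lies far above the rest, which is the kind of value the earlier definition calls an outlier.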

Page 29: Das20502 chapter 1 descriptive statistics

Grouped Data

Sample Mean

• The sample mean of grouped data is:

$$
\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}
$$

Page 30: Das20502 chapter 1 descriptive statistics

Grouped Data

The following data show the number of mistakes that Redza made when he typed 100 pages. Find the mean.

No. of mistakes    0    1    2    3    4    5
No. of pages       60   21   10   5    3    1

Page 31: Das20502 chapter 1 descriptive statistics

Grouped Data

Find the mean for the data below, which refer to the number of bicycles owned by 27 families at Taman Permata.

No. of bicycles    No. of families
0                  2
1                  6
2                  13
3                  4
4                  2
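The grouped sample mean (sum of f·x divided by sum of f) can be sketched in Python for the two frequency tables above, the typing mistakes and the bicycles. The function name `grouped_mean` is illustrative. The bicycle frequencies are read as 2, 6, 13, 4, 2, which is the split that adds up to the stated 27 families.

```python
def grouped_mean(values, freqs):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    total_freq = sum(freqs)
    if total_freq == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total_freq


mistakes = [0, 1, 2, 3, 4, 5]
pages = [60, 21, 10, 5, 3, 1]        # 100 pages in total

bicycles = [0, 1, 2, 3, 4]
families = [2, 6, 13, 4, 2]          # 27 families in total

print(round(grouped_mean(mistakes, pages), 2))     # mean mistakes per page
print(round(grouped_mean(bicycles, families), 2))  # mean bicycles per family
```

When the classes are intervals rather than single values, the same function applies with the class midpoints passed as `values`.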

Page 32: Das20502 chapter 1 descriptive statistics

CENTRAL TENDENCY MEASUREMENT

Mean, M

The mean is the average of the data values.

Ungrouped: the sample mean for raw data. Let x1, x2, ..., xn be a sample of size n.

Grouped: the sample mean for grouped data. Suppose we have a sample of size n grouped into m groups or cells.

Page 33: Das20502 chapter 1 descriptive statistics

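CENTRAL TENDENCY MEASUREMENT

Mean, M

Mean of sample data:

a) Ungrouped data

$$
\bar{x} = \frac{\sum x_i}{n}
$$

b) Grouped data

$$
\bar{x} = \frac{\sum f_i x_i}{\sum f_i}
$$

Mean of population data:

a) Ungrouped data

$$
\mu = \frac{\sum x_i}{N}
$$

b) Grouped data

$$
\mu = \frac{\sum f_i x_i}{\sum f_i}
$$

where, for grouped data, x_i = class midpoint / mark = (lower limit + upper limit) / 2 and f_i = frequency of x_i.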

Page 34: Das20502 chapter 1 descriptive statistics

CENTRAL TENDENCY MEASUREMENT

Median, M

The median is the middle value in a ranked list. The data must be arranged in increasing or decreasing order. There are two types of median: the median for ungrouped data and the median for grouped data.

Ungrouped data:

a) When n is odd: the median is the value of the ((n + 1)/2)th term in the ranked list.

b) When n is even: the median is the average of the values of the two middle terms.

Median of sample data:

a) Ungrouped data

Odd:

$$
M = x_{\frac{n+1}{2}}
$$

Even:

$$
M = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}
$$

b) Grouped data

$$
\text{Median} = L_M + \left(\frac{\frac{n}{2} - F}{f_m}\right) \cdot C
$$

where

L_M = lower boundary of the median class, C = class size / width,

F = cumulative frequency of the classes below the median class,

f_m = frequency of the median class, n = number of data values

Page 35: Das20502 chapter 1 descriptive statistics

Median

• The median for grouped data is:

$$
M = L_M + C\left(\frac{\frac{n}{2} - F}{f_m}\right)
$$

Page 36: Das20502 chapter 1 descriptive statistics

A study of sulphur oxide production over 80 days produced the distribution in the following table. Find the median.

Sulphur oxide (tonne)    Frequency
5.0 – 8.9                3
9.0 – 12.9               10
13.0 – 16.9              14
17.0 – 20.9              25
21.0 – 24.9              17
25.0 – 28.9              9
29.0 – 32.9              2

Page 37: Das20502 chapter 1 descriptive statistics

Find the median for the data below, which show the number of visits to the library made by all the 100 international students in one year.

Number of visits    No. of students
0 – 4               17
5 – 9               41
10 – 14             22
15 – 19             11
20 – 24             8
25 – 29             1
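The grouped-median formula M = L_M + C·((n/2 − F)/f_m) can be sketched in Python and applied to the sulphur oxide and library-visit tables. The function name is illustrative. The lower boundary is assumed to sit halfway between adjacent class limits, giving 16.95 for the class 17.0 – 20.9 and 4.5 for the class 5 – 9.

```python
def grouped_median(limits, freqs):
    """Median of grouped data: L_M + C * ((n/2 - F) / f_m)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    gap = limits[1][0] - limits[0][1]      # distance between adjacent class limits
    width = limits[1][0] - limits[0][0]    # class width C
    cumulative = 0                         # F: cumulative frequency below the current class
    for (lower, _upper), f in zip(limits, freqs):
        if cumulative + f >= n / 2:        # first class that reaches n/2 is the median class
            lower_boundary = lower - gap / 2
            return lower_boundary + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


sulphur_limits = [(5.0, 8.9), (9.0, 12.9), (13.0, 16.9), (17.0, 20.9),
                  (21.0, 24.9), (25.0, 28.9), (29.0, 32.9)]
sulphur_freqs = [3, 10, 14, 25, 17, 9, 2]          # 80 days

visit_limits = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
visit_freqs = [17, 41, 22, 11, 8, 1]               # 100 students

print(round(grouped_median(sulphur_limits, sulphur_freqs), 2))
print(round(grouped_median(visit_limits, visit_freqs), 2))
```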

Page 38: Das20502 chapter 1 descriptive statistics

[email protected] 38

MODE

The mode is the value that occurs most frequently (the value with the highest frequency in a data set).

Note: Data with two modes are called bimodal, and data with more than two modes are called multimodal.

Mode for grouped data:

$$
\text{Mode}, M_o = L_M + \left(\frac{d_b}{d_a + d_b}\right) \cdot C
$$

Page 39: Das20502 chapter 1 descriptive statistics

Find the mode of the following data.

Class      Frequency
11 – 15    6
16 – 20    10
21 – 25    18
26 – 30    24
31 – 35    16
36 – 40    12
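A Python sketch of the grouped-mode formula applied to the table above. The slides do not define d_b and d_a. The usual reading, assumed here, is that d_b is the modal-class frequency minus the frequency of the class before it, and d_a is the modal-class frequency minus the frequency of the class after it. The lower boundary is likewise assumed to lie halfway between adjacent class limits.

```python
def grouped_mode(limits, freqs):
    """Mode of grouped data: L + (d_b / (d_a + d_b)) * C for the modal class."""
    i = freqs.index(max(freqs))                      # modal class (first one if tied)
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d_b = freqs[i] - before                          # assumed meaning of d_b
    d_a = freqs[i] - after                           # assumed meaning of d_a
    if d_a + d_b == 0:
        raise ValueError("mode is undefined when all frequencies are equal")
    gap = limits[1][0] - limits[0][1]                # distance between adjacent class limits
    width = limits[1][0] - limits[0][0]              # class width C
    lower_boundary = limits[i][0] - gap / 2
    return lower_boundary + d_b / (d_a + d_b) * width


class_limits = [(11, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]
frequencies = [6, 10, 18, 24, 16, 12]

print(round(grouped_mode(class_limits, frequencies), 2))
```

Here the modal class is 26 – 30, so the estimate falls between its boundaries 25.5 and 30.5.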

Page 40: Das20502 chapter 1 descriptive statistics

A Global Warming Awareness Exhibition was held by a state government. The table below records the number of visitors to the exhibition and the number of days with that number of visitors. Find the mode of the number of visitors.

Number of visitors Number of days

0 – 99, 100 – 199, 200 – 299, 300 – 399, 400 – 499, 500 – 599

1023167224211107

Page 41: Das20502 chapter 1 descriptive statistics

Find the mean, median and mode for the following data:

Age        Number of people
17 – 21    2
22 – 26    3
27 – 31    5
32 – 36    6
37 – 41    8
42 – 46    7
47 – 51    2
52 – 56    3

Page 42: Das20502 chapter 1 descriptive statistics

Sample Variance for Grouped Data

The formula for the sample variance for grouped data is:

$$
S^2 = \frac{1}{\sum f - 1}\left[\sum f_i x_i^2 - \frac{\left(\sum f_i x_i\right)^2}{\sum f}\right]
$$

where f is the class frequency and x is the class midpoint.

Page 44: Das20502 chapter 1 descriptive statistics

x_i    3.0 – 3.4    3.4 – 3.8    3.8 – 4.2    4.2 – 4.6    4.6 – 5.0
f_i    4            8            11           9            6

Find the variance and standard deviation.
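A Python sketch of the grouped sample variance, using the computational form with sum(f·x²) and sum(f·x) over class midpoints, applied to the table above. The function name is illustrative, and treating the data as a sample (divisor sum(f) − 1) is an assumption.

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """Sample variance of grouped data from class midpoints and frequencies."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need a total frequency of at least 2")
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


classes = [(3.0, 3.4), (3.4, 3.8), (3.8, 4.2), (4.2, 4.6), (4.6, 5.0)]
freqs = [4, 8, 11, 9, 6]
midpoints = [(lower + upper) / 2 for lower, upper in classes]   # 3.2, 3.6, 4.0, 4.4, 4.8

variance = grouped_sample_variance(midpoints, freqs)
std_dev = math.sqrt(variance)
print(round(variance, 4), round(std_dev, 4))
```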

Page 45: Das20502 chapter 1 descriptive statistics

Population variance, σ²

The formula for the population variance is:

$$
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{N \sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2}
$$

Page 46: Das20502 chapter 1 descriptive statistics

Given the data below:

23.3 12.4 58.1 38.2 14.0 58.2 75.4 23.9 23.9 18.3

22.0 37.1 31.4 8.5 1.0 15.5 6.9 5.2 28.7 26.3

13.9 25.9 26.8 26.9 16.8 37.7 10.6 21.9 31.6 30.1

42.4 16.5 21.1 32.9 8.8 10.6 28.6 40.7 12.9 13.8

a) Construct the frequency distribution table with class boundaries -0.5 – 9.5, 9.5 – 19.5, 19.5 – 29.5, and so on.

b) Find
i) Mean
ii) Median
iii) Mode
iv) Standard deviation
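The whole exercise can be sketched end to end in Python: build the frequency table from the stated class boundaries, then apply the grouped-data formulas from this chapter. Two points are assumptions. In the mode formula, d_b and d_a are taken as the modal-class frequency minus the frequency of the class before and after it. The standard deviation is the sample version (divisor n − 1).

```python
import math

data = [
    23.3, 12.4, 58.1, 38.2, 14.0, 58.2, 75.4, 23.9, 23.9, 18.3,
    22.0, 37.1, 31.4, 8.5, 1.0, 15.5, 6.9, 5.2, 28.7, 26.3,
    13.9, 25.9, 26.8, 26.9, 16.8, 37.7, 10.6, 21.9, 31.6, 30.1,
    42.4, 16.5, 21.1, 32.9, 8.8, 10.6, 28.6, 40.7, 12.9, 13.8,
]
LOWEST, WIDTH = -0.5, 10          # first lower boundary and class width

# a) Frequency table over the boundaries -0.5 - 9.5, 9.5 - 19.5, ...
num_classes = int((max(data) - LOWEST) // WIDTH) + 1
freqs = [0] * num_classes
for x in data:
    freqs[int((x - LOWEST) // WIDTH)] += 1
lower_bounds = [LOWEST + WIDTH * k for k in range(num_classes)]
midpoints = [lb + WIDTH / 2 for lb in lower_bounds]

# b) i) Mean and iv) sample standard deviation from class midpoints
n = sum(freqs)
sum_fx = sum(f * m for f, m in zip(freqs, midpoints))
sum_fx2 = sum(f * m * m for f, m in zip(freqs, midpoints))
mean = sum_fx / n
std_dev = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

# ii) Median: first class whose cumulative frequency reaches n/2
cumulative = 0
for lb, f in zip(lower_bounds, freqs):
    if cumulative + f >= n / 2:
        median = lb + (n / 2 - cumulative) / f * WIDTH
        break
    cumulative += f

# iii) Mode: class with the highest frequency (d_b, d_a as assumed above)
i = freqs.index(max(freqs))
d_b = freqs[i] - (freqs[i - 1] if i > 0 else 0)
d_a = freqs[i] - (freqs[i + 1] if i < num_classes - 1 else 0)
mode = lower_bounds[i] + d_b / (d_a + d_b) * WIDTH

print(freqs)
print(round(mean, 2), round(median, 2), round(mode, 2), round(std_dev, 2))
```

These are grouped-data estimates, so they will generally differ a little from the values computed directly from the 40 raw observations.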

Page 47: Das20502 chapter 1 descriptive statistics

Find the mean, median, mode and standard deviation.

Class limit    f
20 – 29        30
30 – 39        35
40 – 49        20
50 – 59        10
60 – 69        5

Page 48: Das20502 chapter 1 descriptive statistics