Descriptive Statistics for one Variable

Upload: myra-hudson

Post on 17-Jan-2018

Page 1: Descriptive Statistics for one Variable. Variables and measurements A variable is a characteristic of an individual or object in which the researcher

Descriptive Statistics for one Variable

Variables and measurements

• A variable is a characteristic of an individual or object in which the researcher is interested. For example, the SAT score of a college student.

• For a particular individual or object, the variable takes a value called a measurement. For example, John’s SAT score is 720.

Different Types of Variables

• Some variables are quantitative, like the time it takes a person to finish a task or the person’s age.

• Other variables are qualitative, such as the person’s nationality or the person’s preferred sport.

• In these notes we will work with quantitative variables.

• All the measurements collected from individuals for a particular variable are referred to as “data”.

• Our data will contain the measurements for only one variable.

Statistics has two major chapters:

• Descriptive Statistics

• Inferential statistics

Statistics

Descriptive Statistics

• Provides numerical and graphic procedures to summarize the information of the data in a clear and understandable way

Inferential Statistics

• Provides procedures to draw inferences about a population from a sample

Population and Samples

The population under study is the set of all individuals of interest for the research.

We will see that, in practice, the variable is measured only for a part of the population.

The part of the population for which we collect measurements is called a sample.

The number of individuals in a sample is denoted by n.

In these notes and examples we will assume that our data correspond to a sample of the population under study.

Descriptive Measures

• Central Tendency measures. They are computed in order to give a “center” around which the measurements in the data are distributed.

• Variation or Variability measures. They describe “data spread” or how far away the measurements are from the center.

• Relative Standing measures. They describe the relative position of a specific measurement in the data.

Measures of Central Tendency

• Mean: Sum of all measurements in the data divided by the number of measurements.

• Median: A number such that at most half of the measurements are below it and at most half of the measurements are above it.

• Mode: The most frequent measurement in the data.

Example of Mean

Measurements (x)    Deviation (x - mean)
3                   -1
5                    1
5                    1
1                   -3
7                    3
2                   -2
6                    2
7                    3
0                   -4
4                    0
Sum: 40              0

• MEAN = 40/10 = 4

• Notice that the sum of the “deviations” is 0.

• Notice that every single observation enters into the computation of the mean.
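The computation above can be reproduced in a few lines. This is only a minimal sketch in Python; the variable names (data, mean, deviations) are chosen here for illustration and do not come from the slides.

```python
# Measurements from the example above.
data = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]

# Mean: sum of all measurements divided by the number of measurements.
mean = sum(data) / len(data)

# Deviation of each measurement from the mean.
deviations = [x - mean for x in data]

print(mean)             # 4.0
print(sum(deviations))  # 0.0
```

The deviations always add up to zero, because the mean is exactly the value that balances the positive and negative deviations.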

Example of Median

Measurements (x)    Measurements ranked
3                   0
5                   1
5                   2
1                   3
7                   4
2                   5
6                   5
7                   6
0                   7
4                   7
Sum: 40             40

• Median: (4+5)/2 = 4.5

• Notice that only the two central values are used in the computation.

• The median is not sensitive to extreme values.
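A short sketch of the median computation, assuming the usual convention of averaging the two central values when the number of measurements is even. The function name median is illustrative, and the data set with the value 700 is made up here to show the effect of an extreme value.

```python
def median(values):
    """Middle value of the ranked data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]
print(sorted(data))  # [0, 1, 2, 3, 4, 5, 5, 6, 7, 7]
print(median(data))  # 4.5

# Replacing one of the largest values with an extreme one leaves the median unchanged.
print(median([3, 5, 5, 1, 700, 2, 6, 7, 0, 4]))  # 4.5
```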

Example of Mode

Measurements (x): 3, 5, 5, 1, 7, 2, 6, 7, 0, 4

• In this case the data have two modes: 5 and 7

• Both measurements are repeated twice

Example of Mode

Measurements (x): 3, 5, 1, 1, 4, 7, 3, 8, 3

• Mode: 3

• Notice that it is possible for a data set not to have any mode.
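The mode examples can be sketched with a frequency count. The helper below is hypothetical (its name and its convention of returning an empty list when no measurement is repeated are assumptions made here), and the list [1, 2, 3, 4] is invented only to illustrate data without a mode. It assumes the data are not empty.

```python
from collections import Counter

def modes(values):
    """Return the most frequent measurement(s); an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every measurement appears once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([3, 5, 5, 1, 7, 2, 6, 7, 0, 4]))  # [5, 7]
print(modes([1, 2, 3, 4]))                    # []
```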

Measures of Variability

• Range

• Variance

• Standard Deviation

The Range

• Definition: The range of a data set is the difference between the largest and the smallest measurements in the data.

• To find the range, first order the data from least to greatest. Then subtract the smallest value from the largest value in the set.

• Example: A marathon race was completed by 7 participants. What is the range of the times given in hours below? 2.3 hr, 8.7 hr, 3.5 hr, 5.1 hr, 4.9 hr, 7.1 hr, 4.2 hr

Ordering the data from least to greatest, we get: 2.3, 3.5, 4.2, 4.9, 5.1, 7.1, 8.7. So highest - lowest = 8.7 hr - 2.3 hr = 6.4 hr. Answer: The range of marathon times is 6.4 hr.
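As a quick check of the marathon example, here is a minimal sketch (the variable names are illustrative). Sorting helps when working by hand, but in code the largest and smallest values can be found directly.

```python
# Marathon times in hours, from the example above.
times_hr = [2.3, 8.7, 3.5, 5.1, 4.9, 7.1, 4.2]

# Range: largest measurement minus smallest measurement.
data_range = max(times_hr) - min(times_hr)

# Rounded to one decimal to hide floating-point noise.
print(round(data_range, 1))  # 6.4
```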

The Range is not Enough

Consider the following examples of data:

1, 1, 1, 1, 8
1, 2, 4, 6, 8
1, 8, 1, 8, 1

In the three cases the Range is the same: Range = 7

However, the three series exhibit completely different distributions of values along the range of values
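The point can be made concrete with a small sketch. The labels A, B and C are added here for convenience, and statistics.variance from the Python standard library computes the sample variance, the measure these notes define next.

```python
import statistics

# The three series from the slide; the labels A, B, C are illustrative.
series = {
    "A": [1, 1, 1, 1, 8],
    "B": [1, 2, 4, 6, 8],
    "C": [1, 8, 1, 8, 1],
}

def value_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)

for name, values in series.items():
    print(name, value_range(values), round(statistics.variance(values), 1))
# A 7 9.8
# B 7 8.2
# C 7 14.7
```

All three ranges are 7, yet the sample variances differ, which is why the range alone is not enough to describe the spread.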

The sample variance

The variance takes into account the deviations around the mean of the data. The formula for the sample variance is as follows:

$$ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} $$

The Standard Deviation consists of the square root of the Variance:

$$ s = \sqrt{\text{Variance}} = \sqrt{s^2} $$

Notice that the mean and the standard deviation are expressed in the same units as the measurements.

Variance (for a sample)

• Steps:
– Compute each deviation
– Square each deviation
– Sum all the squares
– Divide by the data size (sample size) minus one: n-1

Example of Variance

Measurements (x)    Deviations (x - mean)    Square of deviations
3                   -1                       1
5                    1                       1
5                    1                       1
1                   -3                       9
7                    3                       9
2                   -2                       4
6                    2                       4
7                    3                       9
0                   -4                       16
4                    0                       0
Sum: 40              0                       54

• Variance = 54/9 = 6

• It is a measure of “spread”.

• Notice that the larger the deviations (positive or negative) the larger the variance
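The steps for the sample variance can be followed one by one in code. This is a sketch with illustrative variable names; the final line cross-checks the result against statistics.variance from the Python standard library, which also divides by n - 1.

```python
import statistics

data = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]
n = len(data)
mean = sum(data) / n

# Step 1: compute each deviation.
deviations = [x - mean for x in data]
# Step 2: square each deviation.
squares = [d ** 2 for d in deviations]
# Step 3: sum all the squares.
sum_of_squares = sum(squares)
# Step 4: divide by the sample size minus one.
variance = sum_of_squares / (n - 1)

print(sum_of_squares)  # 54.0
print(variance)        # 6.0
print(statistics.variance(data) == variance)  # True
```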

The standard deviation

• It is defined as the square root of the variance

• In the previous example

• Variance = 6

• Standard deviation = Square root of the variance = Square root of 6 = 2.45

• The standard deviation summarizes the deviations in one number
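A brief sketch of the same calculation, using statistics.stdev from the Python standard library (it takes the square root of the sample variance, the one that divides by n - 1). The variable names are illustrative.

```python
import math
import statistics

data = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]

# Sample standard deviation: square root of the sample variance.
sd = statistics.stdev(data)

print(round(sd, 2))                    # 2.45
print(math.isclose(sd, math.sqrt(6)))  # True
```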

Percentiles

• The p-th percentile is a number such that at most p% of the measurements are below it and at most (100 – p)% of the measurements are above it.

• For example, if the 85th percentile of a certain data set is 340, then at most 15% of the measurements in the data are above 340 and at most 85% of the measurements are below 340.

• Notice that the median is the 50th percentile
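The definition can be checked directly by counting. The function below is a hypothetical helper written for these notes: it only tests whether a given number satisfies the definition, and it is not how statistical packages compute percentiles (they use various interpolation rules).

```python
def is_percentile(value, p, data):
    """Check the definition: at most p% below and at most (100 - p)% above."""
    n = len(data)
    below = sum(1 for x in data if x < value)
    above = sum(1 for x in data if x > value)
    # Integer arithmetic avoids rounding problems with percentages.
    return below * 100 <= p * n and above * 100 <= (100 - p) * n

data = [3, 5, 5, 1, 7, 2, 6, 7, 0, 4]
print(is_percentile(4.5, 50, data))  # True: the median is the 50th percentile
print(is_percentile(6, 50, data))    # False: 70% of the data are below 6
print(is_percentile(7, 90, data))    # True
```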

Tchebichev’s Rule

The standard deviation can be used to construct an interval enclosing a large percentage of the data. In fact, this rule says that for any data set:

• At least 75% of the measurements differ from the mean by less than twice the standard deviation.

• At least 8/9 (about 89%) of the measurements differ from the mean by less than three times the standard deviation.

Note: This is a general property called Tchebichev’s Rule: for any k > 1, at least 1 - 1/k² of the observations fall within k standard deviations of the mean. It is true for every data set.

Example of Tchebichev’s Rule

Suppose that for a certain data set:

• Mean = 20

• Standard deviation = 3

Then:

• At least 75% of the measurements are between 14 and 26

• At least 89% of the measurements are between 11 and 29
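The intervals in this example follow from the rule with k = 2 and k = 3. A minimal sketch, in which the function name and its return format are assumptions made here for illustration:

```python
def chebyshev_interval(mean, sd, k):
    """Return (lower, upper, minimum_fraction) for k standard deviations, k > 1."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return mean - k * sd, mean + k * sd, 1 - 1 / k**2

for k in (2, 3):
    low, high, fraction = chebyshev_interval(20, 3, k)
    print(f"k={k}: at least {fraction:.1%} between {low} and {high}")
# k=2: at least 75.0% between 14 and 26
# k=3: at least 88.9% between 11 and 29
```

The bound for k = 3 is 8/9, which the slides round to 89%.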

Further Notes

• When the Mean is greater than the Median the data distribution is typically skewed to the Right.

• When the Median is greater than the Mean the data distribution is typically skewed to the Left.

• When Mean and Median are very close to each other the data distribution is approximately symmetric.
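These notes can be turned into a rough check. The sketch below is only a rule of thumb: the function name and the tolerance parameter (how close counts as "very close") are assumptions introduced here, and the second and third data sets are made up for illustration.

```python
import statistics

def skew_hint(values, tolerance=0.0):
    """Compare mean and median; tolerance is an arbitrary closeness threshold."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if abs(mean - median) <= tolerance:
        return "approximately symmetric"
    return "skewed to the right" if mean > median else "skewed to the left"

print(skew_hint([3, 5, 5, 1, 7, 2, 6, 7, 0, 4]))  # skewed to the left (mean 4, median 4.5)
print(skew_hint([1, 2, 2, 3, 3, 4, 30]))          # skewed to the right
print(skew_hint([1, 2, 3, 4, 5]))                 # approximately symmetric
```

With small samples the comparison is only suggestive, so a histogram is still worth inspecting.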

Empirical Rule (68-95-99.7 Rule)

For “Normal Distributions” (data sets whose histograms are bell- or mound-shaped):

• Approx. 68% of values are within 1 standard deviation of the mean

• Approx. 95% of values are within 2 standard deviations of the mean

• Approx. 99.7% of values are within 3 standard deviations of the mean

Example of Empirical Rule

Suppose that the hourly wages of a certain type of worker have a “normal distribution” (bell-shaped histogram). Assume also that the mean is $16 with a standard deviation of $1.50.

Then we have:

• 1 standard deviation = $1.50
• 2 standard deviations = $3.00
• 3 standard deviations = $4.50

What does the empirical rule allow us to say?

Solution

The empirical rule allows us to say that:

• Approx. 68% of workers in this occupation earn wages that are within 1 standard deviation of the mean:
– Between 16 – 1.5 and 16 + 1.5
– Between $14.50 and $17.50

• Approx. 95% of workers in this occupation earn wages that are within 2 standard deviations of the mean:
– Between 16 – 3 and 16 + 3
– Between $13.00 and $19.00

• Approx. 99.7% of workers in this occupation earn wages that are within 3 standard deviations of the mean:
– Between 16 – 4.5 and 16 + 4.5
– Between $11.50 and $20.50
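The intervals can be recomputed from the mean of $16 and the standard deviation of $1.50 stated in the example. This is a sketch only; the function name and the way the percentages are stored are choices made here for illustration.

```python
def empirical_intervals(mean, sd):
    """Intervals of 1, 2 and 3 standard deviations around the mean."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # approximate percentages from the rule
    return [(pct, mean - k * sd, mean + k * sd) for k, pct in coverage.items()]

for pct, low, high in empirical_intervals(16.0, 1.5):
    print(f"approx. {pct}% between ${low:.2f} and ${high:.2f}")
# approx. 68.0% between $14.50 and $17.50
# approx. 95.0% between $13.00 and $19.00
# approx. 99.7% between $11.50 and $20.50
```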