
5/8/2014


Descriptive Statistics

PSYC 381
Arlo Clark-Foos

• Descriptive Statistics
  – Organize
  – Summarize
  – Graphing
• New Terms:
  – Raw scores
  – Distribution

• Frequency Tables
  – Visual depiction of data that shows how often each value occurred.
    • Typically used for discrete variables.
  – Steps in Creating…
    • Find the highest and lowest scores
    • Create 2 columns, one for the value and one for the frequency
    • List the full range of values, including those with a frequency of 0
    • Count the # of scores at each value and record that in the frequency column.
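The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the sample responses are hypothetical, and the sketch assumes whole-number scores so that the full range of values can be listed.

```python
from collections import Counter

def frequency_table(scores):
    """Return (value, frequency) rows from the highest value to the lowest."""
    counts = Counter(scores)                      # count the scores at each value
    lowest, highest = min(scores), max(scores)    # find lowest and highest scores
    # List the full range of values; Counter reports 0 for values that never occur.
    return [(value, counts[value]) for value in range(highest, lowest - 1, -1)]

# Hypothetical responses: days per week with 2+ hours of TV
tv_days = [0, 2, 2, 3, 5, 7, 7, 7]
print("Value  Frequency")
for value, freq in frequency_table(tv_days):
    print(f"{value:>5}  {freq:>9}")
```

Values 6, 4, and 1 appear in the output with a frequency of 0, which is the point of listing the full range.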

• Frequency Tables
  – Example
  – On the board, record the number of days each week that you spend watching 2+ hours of television

• Frequency Tables

Days with 2+ hours of TV | Frequency
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |

• Expanding Frequency Tables to Percentages
  – Cumulative Percentage
    • The percentage of individuals who have scores at a given value or lower.
    • Calculating: Count how many scores fall at or below a given value. Divide this number by the total number of scores, then multiply by 100.
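As a rough sketch of this calculation, the Python below adds percent and cumulative percent columns to a frequency table. The function name `cumulative_percent_table` and the sample data are hypothetical, and whole-number scores are assumed. Dividing by the total gives a proportion, so the sketch multiplies by 100 to express it as a percentage.

```python
from collections import Counter

def cumulative_percent_table(scores):
    """Return (value, frequency, percent, cumulative percent) rows, highest value first."""
    counts = Counter(scores)
    total = len(scores)
    rows = []
    at_or_below = 0
    for value in range(min(scores), max(scores) + 1):
        at_or_below += counts[value]              # scores at this value or lower
        percent = 100 * counts[value] / total
        cumulative = 100 * at_or_below / total
        rows.append((value, counts[value], percent, cumulative))
    return rows[::-1]                             # highest value on top

# Hypothetical responses: days per week with 2+ hours of TV
for row in cumulative_percent_table([0, 2, 2, 3, 5, 7, 7, 7]):
    print(row)
```

The highest value always ends up with a cumulative percent of 100, which is a quick check that the counts are right.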


• Expanding Frequency Tables to Percentages

Days with 2+ hours of TV | Frequency | Percent | Cumulative Percent
7 | | |
6 | | |
5 | | |
4 | | |
3 | | |
2 | | |
1 | | |
0 | | |

• Grouped Frequency Tables
  – Reports the frequencies within a given interval rather than the frequencies for a specific value.
  – When to use instead of frequency tables…
    • Continuous, interval/ratio (scale) variables
    • When data cover a huge range
  – Determining the # of intervals…(5-10?)
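One possible way to build a grouped frequency table is sketched below. The function name `grouped_frequency_table`, the interval width, and the sample ages are all hypothetical. The sketch assumes a whole-number interval width and intervals that start at multiples of that width, each covering start up to (but not including) start + width.

```python
def grouped_frequency_table(scores, width):
    """Return (interval start, frequency) rows, highest interval first."""
    lowest_start = int(min(scores) // width) * width
    highest_start = int(max(scores) // width) * width
    # Start every interval at 0 so that empty intervals still appear in the table.
    counts = {start: 0 for start in range(lowest_start, highest_start + width, width)}
    for score in scores:
        counts[int(score // width) * width] += 1
    return sorted(counts.items(), reverse=True)

# Hypothetical ages, grouped into intervals of width 5 (gives 7 intervals)
ages = [18, 19, 19, 21, 22, 25, 31, 34, 47]
for start, freq in grouped_frequency_table(ages, 5):
    print(f"{start}-{start + 4}: {freq}")
```

A width of 5 was picked here only because it lands within the 5 to 10 intervals suggested on the slide; a different data set would need a different width.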

• Grouped Frequency Tables, example.

Number of Siblings | Frequency
5 |
4 |
3 |
2 |
1 |
0 |

• Histograms
  – Typically used to depict interval data with the values of the variable on the x-axis and the frequencies on the y-axis (similar to a bar graph).
  – Constructing from a Frequency Table
    • Draw x-axis and label with variable of interest and full range of values
    • Draw y-axis and label it “frequency”
    • Draw a bar for each value as high as the frequency for that value, as represented on the y-axis
• Histogram
  – Example
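To make the construction steps concrete, here is a small text-only sketch that draws a histogram with values along the x-axis and frequencies up the y-axis. The function name `text_histogram` and the sample data are hypothetical, and the layout assumes single-digit, non-negative whole-number values so that the columns line up.

```python
from collections import Counter

def text_histogram(scores):
    """Draw one column of '#' per value, as tall as that value's frequency."""
    counts = Counter(scores)
    values = list(range(min(scores), max(scores) + 1))   # full range on the x-axis
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):                  # y-axis, top row first
        bars = " ".join("#" if counts[v] >= level else " " for v in values)
        rows.append(f"{level:>2} | {bars}")
    rows.append("   +" + "-" * (2 * len(values)))        # x-axis line
    rows.append("     " + " ".join(str(v) for v in values))
    return "\n".join(rows)

# Hypothetical responses: days per week with 2+ hours of TV
print(text_histogram([0, 2, 2, 3, 5, 7, 7, 7]))
```

Placing a dot at the top of each column and connecting the dots would turn the same picture into a frequency polygon.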

• Frequency Polygons
  – Line graphs with the x-axes representing values (or midpoints of intervals) and the y-axes representing frequencies.
    • Start the same as a Histogram
    • Instead of drawing bars to represent frequency, place a dot at the appropriate frequency for each value or interval of values
    • Connect the dots


• Frequency Polygons
• Normal Distribution
  – A specific frequency distribution in the shape of a bell-shaped, symmetric, unimodal curve
    • Similar to a frequency polygon with infinite observations, thus a smooth curve instead of connected dots

• Skewness
  – How much one of the tails of the distribution is pulled away from the center
  – Floor effects & ceiling effects

• Kurtosis
  – The degree to which a curve’s width and the thickness of its tails deviate from the normal curve
    • Mesokurtic (normal), Leptokurtic (tall and thin), Platykurtic (short and fat)
  – (Figure: platykurtic and leptokurtic curves)

• Central Tendency
  – The value that best represents the center of a data set, the particular value that all the other data seem to be gathering around. Usually at the high point of the histogram.
  – Mean, Median, Mode
• There is no best measure for all data!

• Mean
  – Arithmetic average of a group of scores
  – M = ΣX/N
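The formula M = ΣX/N translates directly into code. The sketch below is illustrative only; the function name `mean` is a choice made here, and the scores are the nine raw scores used in this deck's median example.

```python
def mean(scores):
    """M = ΣX / N: add up every score, then divide by the number of scores."""
    return sum(scores) / len(scores)

scores = [5, 3, 6, 9, 11, 28, 3, 1, 15]
print(sum(scores), len(scores), mean(scores))   # 81 9 9.0
```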


• Median (mdn)
  – The middle score of all the scores in a sample when the scores are arranged in ascending order.
    • Odd number of scores: 5 3 6 9 11 28 3 1 15 → 1 3 3 5 6 9 11 15 28 (mdn = 6)
    • Even number of scores: 9 45 32 27 16 3 89 12 → 3 9 12 16 27 32 45 89 (mdn = 21.5, the mean of the two middle scores, 16 and 27)
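A minimal sketch of the median calculation, covering both the odd and the even case shown in the two examples. The function name `median` is a choice made here; the data are the slide's own scores.

```python
def median(scores):
    """Middle score after sorting; with an even count, the mean of the two middle scores."""
    ordered = sorted(scores)            # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd N: a single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even N: average the middle pair

print(median([5, 3, 6, 9, 11, 28, 3, 1, 15]))    # 6
print(median([9, 45, 32, 27, 16, 3, 89, 12]))    # 21.5
```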

• Mode
  – The most common score of all the scores in a sample
    • Unimodal, Bimodal, Multimodal
  – Example: 5 3 6 9 11 28 3 1 15 → 1 3 3 5 6 9 11 15 28 (mode = 3)
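Because a sample can be bimodal or multimodal, one reasonable sketch returns every value tied for the highest frequency rather than a single number. The function name `modes` and the second data set are hypothetical; the first data set is the slide's example.

```python
from collections import Counter

def modes(scores):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, freq in counts.items() if freq == highest)

print(modes([5, 3, 6, 9, 11, 28, 3, 1, 15]))   # [3]     unimodal
print(modes([1, 1, 2, 3, 3]))                  # [1, 3]  bimodal
```

Note that when every score occurs exactly once, this sketch returns all of them, which in practice means the sample has no useful mode.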

• Outlier: An extreme score that is either very high or very low in comparison with the rest of the scores in the distribution.

• Which is the best measure?
  – Raw scores: 3 9 12 2 16 2 17 5 11 45 89 32 1 96
  – In order: 1 2 2 3 5 9 11 12 16 17 32 45 89 96
  – Mode = 2
  – Median = 11.5
  – Mean = 24.29
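The three values for this data set can be checked with Python's standard `statistics` module. The helper name `summarize` is hypothetical, and treating the two highest scores (89 and 96) as the extreme ones is an assumption made only to show how each measure reacts when they are dropped.

```python
import statistics

def summarize(scores):
    """Mode, median, and mean (rounded to 2 decimals) for a list of scores."""
    return {
        "mode": statistics.mode(scores),
        "median": statistics.median(scores),
        "mean": round(statistics.mean(scores), 2),
    }

slide_scores = [3, 9, 12, 2, 16, 2, 17, 5, 11, 45, 89, 32, 1, 96]
trimmed = sorted(slide_scores)[:-2]   # assumption: drop the two highest scores

print(summarize(slide_scores))   # {'mode': 2, 'median': 11.5, 'mean': 24.29}
print(summarize(trimmed))        # {'mode': 2, 'median': 10.0, 'mean': 12.92}
```

Dropping two scores cuts the mean nearly in half while the median moves by only 1.5 and the mode does not move at all, which is one way to see why no single measure is best for all data.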

• Example from book (1e):
  – Tuition Increases & Outliers


• Range = X_highest - X_lowest
  – Does not tell us much, other than absolute spread

• How close to the mean?

• How far from the mean is the typical score?

• Variance & Standard Deviation
  – The typical amount that the scores in a sample vary, or deviate, from the mean.
  – Variance: SD² or s² or σ²
    • SD² = Σ(X - M)² / N
  – Standard Deviation: SD or s or σ
    • SD = √( Σ(X - M)² / N )
  – Sum of Squares (SS): The sum of squared deviations from the mean

• Variance & Standard Deviation
  – Steps in Calculating
    1. Find the mean of the data
    2. Subtract the mean from each individual score
    3. Square each of these numbers
    4. Sum all of these squared numbers (Sum of Squares)
    5. Divide the resulting number by the number of scores (N)
    6. If calculating SD, take the square root of this number
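The six steps map one-to-one onto the lines of the sketch below. The function name `variance_and_sd` and the sample scores are hypothetical. Because step 5 divides by N, the result matches `statistics.pvariance` and `statistics.pstdev` in Python's standard library rather than the N - 1 versions.

```python
import math

def variance_and_sd(scores):
    """Return (sum of squares, variance, standard deviation), dividing by N."""
    n = len(scores)
    m = sum(scores) / n                       # 1. find the mean
    deviations = [x - m for x in scores]      # 2. subtract the mean from each score
    squared = [d ** 2 for d in deviations]    # 3. square each deviation
    ss = sum(squared)                         # 4. sum of squares (SS)
    variance = ss / n                         # 5. divide by the number of scores
    sd = math.sqrt(variance)                  # 6. square root gives the SD
    return ss, variance, sd

# Hypothetical scores with a mean of 5
print(variance_and_sd([2, 4, 4, 4, 5, 5, 7, 9]))   # (32.0, 4.0, 2.0)
```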

• Interquartile Range (IQR)
  – The difference between the first and third quartiles of a data set.
  – 1st Quartile: 25th percentile
    • Median of lower half of distribution
  – 3rd Quartile: 75th percentile
    • Median of upper half of distribution
  – Steps in Calculating
    1. Calculate the median
    2. For the lower half, calculate another median (Q1).
    3. For the upper half, calculate another median (Q3).
  – IQR = Q3 - Q1
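A sketch of the median-of-halves procedure is shown below. The function name `iqr` is a choice made here. One detail the slide leaves open is what to do with the middle score when N is odd; this sketch assumes it is left out of both halves, which is one common convention. At least two scores are needed.

```python
import statistics

def iqr(scores):
    """Return (Q1, Q3, IQR) using the median of the lower and upper halves."""
    ordered = sorted(scores)
    half = len(ordered) // 2        # with odd N, the middle score joins neither half
    lower = ordered[:half]
    upper = ordered[-half:]
    q1 = statistics.median(lower)   # median of the lower half
    q3 = statistics.median(upper)   # median of the upper half
    return q1, q3, q3 - q1

# The 14 scores from the "Which is the best measure?" slide
print(iqr([3, 9, 12, 2, 16, 2, 17, 5, 11, 45, 89, 32, 1, 96]))   # (3, 32, 29)
```

Statistical software often uses interpolation rules for quartiles, so packaged functions may give slightly different values than this hand method.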

• Ages of everyone in class

• Decide the best way to portray this data graphically, then graph it neatly (use a ruler if you have to)
  – Describe the shape of this distribution
• Calculate Measures of Central Tendency
  – Which is the best for this data set?

• Calculate Measures of Variability (including IQR)

• Show all of your work!