Data Summaries

Upload: ayden-kipps

Post on 14-Dec-2015


Page 1

Data Summaries

Page 3

Summary Statistics

• Given a large set of numbers, we often want to describe, or summarize, the data with a few revealing numbers.

• Example: Yearly sales of two brands of peanut butter

Year    1992  1993  1994  1995  1996  1997  1998  1999
Skippy    12    10    15     9    12     8    11    11
Jif       12     9    11    12    10    12    11    11

Page 4

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Year    1992  1993  1994  1995  1996  1997  1998  1999
Skippy    12    10    15     9    12     8    11    11
Jif       12     9    11    12    10    12    11    11

• Measurements of Center

Arithmetic Mean: The Average

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Median: The data point in the center

Page 5

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Year    1992  1993  1994  1995  1996  1997  1998  1999
Skippy    12    10    15     9    12     8    11    11
Jif       12     9    11    12    10    12    11    11

Skippy Mean: $\bar{x} = \frac{88}{8} = 11$

Jif Mean: $\bar{x} = \frac{88}{8} = 11$
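The mean calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `arithmetic_mean` and the variable names are assumptions, and the data are the yearly sales figures from the table.

```python
# Yearly sales, 1992 through 1999, copied from the table on this page.
skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]


def arithmetic_mean(values):
    """Add up the observations and divide by how many there are."""
    return sum(values) / len(values)


skippy_mean = arithmetic_mean(skippy)  # 88 / 8
jif_mean = arithmetic_mean(jif)        # 88 / 8
print(skippy_mean, jif_mean)           # 11.0 11.0
```

Both brands have the same mean, so the mean alone does not separate them.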

Page 7

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Median: Order the data.
If the number of data points is even, average the two center numbers.
If the number of data points is odd, report the center number.

        Smallest                    Largest
Skippy   8   9  10  11  11  12  12  15
Jif      9  10  11  11  11  12  12  12

For both brands the two center numbers (the 4th and 5th values) are 11 and 11.

Skippy and Jif Median = 11
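The ordering rule for the median can be written out as a short Python sketch. The helper name `median_of` is an assumption made for illustration; the data are the yearly sales figures.

```python
def median_of(values):
    """Order the data, then report the center value (odd count)
    or the average of the two center values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]

skippy_median = median_of(skippy)  # average of the 4th and 5th ordered values
jif_median = median_of(jif)
print(skippy_median, jif_median)   # 11.0 11.0
```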

Page 11

Why Use A Median?

• Example: Sales Force Compensation

          Group 1   Group 2
          $60 K     $86 K
          $60 K     $88 K
          $60 K     $90 K
          $60 K     $92 K
          $210 K    $94 K

Mean      $90 K     $90 K
Median    $60 K     $90 K
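A minimal sketch of this comparison, using Python's standard `statistics` module. The variable names are assumptions, and compensation is entered in thousands of dollars.

```python
from statistics import mean, median

# Compensation in thousands of dollars, copied from the slide.
group_1 = [60, 60, 60, 60, 210]
group_2 = [86, 88, 90, 92, 94]

for name, pay in (("Group 1", group_1), ("Group 2", group_2)):
    print(f"{name}: mean = {mean(pay)} K, median = {median(pay)} K")
```

The single $210 K salary pulls the Group 1 mean up to $90 K, while the median stays at the $60 K that four of the five people earn.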

Page 12

Summary Statistics

• Measurements of Variation

Range: Largest - Smallest

$R = \max_i x_i - \min_i x_i$

Variance: Average Squared Difference

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Standard Deviation: Square Root of Variance

$s = \sqrt{s^2}$

Page 13

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Range: Largest - Smallest

        Smallest                    Largest
Skippy   8   9  10  11  11  12  12  15
Jif      9  10  11  11  11  12  12  12

Skippy: $R = 15 - 8 = 7$

Jif: $R = 12 - 9 = 3$
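As a quick check, the range is a one-line calculation in Python. The helper name `data_range` is an assumption made for illustration.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]

skippy_range = data_range(skippy)  # 15 - 8
jif_range = data_range(jif)        # 12 - 9
print(skippy_range, jif_range)     # 7 3
```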

Page 16

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Variance: Average Squared Difference: Skippy Only

        Smallest                    Largest
Skippy   8   9  10  11  11  12  12  15
Jif      9  10  11  11  11  12  12  12

$s^2 = \frac{(8 - 11)^2 + (9 - 11)^2 + \dots + (15 - 11)^2}{7}$

$= \frac{(-3)^2 + (-2)^2 + \dots + 4^2}{7}$

$= \frac{9 + 4 + \dots + 16}{7}$

$\approx 4.57$
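The variance steps above can be followed in a short Python sketch. The function name `sample_variance` is an assumption; it divides by n - 1, which matches the divisor of 7 used for the 8 Skippy values.

```python
def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_diffs = [(x - x_bar) ** 2 for x in values]
    return sum(squared_diffs) / (n - 1)


skippy = [12, 10, 15, 9, 12, 8, 11, 11]

skippy_variance = sample_variance(skippy)  # 32 / 7
print(round(skippy_variance, 2))           # 4.57
```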

Page 17

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Standard Deviation: Square Root of Variance

        Smallest                    Largest
Skippy   8   9  10  11  11  12  12  15
Jif      9  10  11  11  11  12  12  12

Skippy: $s = \sqrt{4.57} = 2.14$

Jif: $s = \sqrt{1.14} = 1.07$
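These values can be reproduced with the standard library's `statistics.variance` and `statistics.stdev`, which both use the n - 1 divisor. The variable names below are assumptions made for illustration.

```python
from statistics import stdev, variance

skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]

# Sample variance and its square root for each brand.
skippy_var, skippy_sd = variance(skippy), stdev(skippy)
jif_var, jif_sd = variance(jif), stdev(jif)

print(round(skippy_var, 2), round(skippy_sd, 2))  # 4.57 2.14
print(round(jif_var, 2), round(jif_sd, 2))        # 1.14 1.07
```

The two brands share the same mean and median, but Skippy's standard deviation is about twice Jif's.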

Page 18

Graphical Summary

• A Picture is Worth a Thousand Words (Bar Chart)

[Figure: bar chart of SKIPPY and JIF sales by YEAR, 1992 to 1999; vertical axis "Mean", from 6 to 16]

Page 19

Summary Statistics

• A Year Worth of Weekly Sales Figures

Week   Skippy   Jif
 1     10.25    10.53
 2      7.81     9.86
 3     11.61    11.09
 4     14.19    10.93
 .       .        .
 .       .        .
 .       .        .
51      4.56    10.56
52     14.62    11.52

Page 20

Summary Statistics

• Summary Statistics: Using SPSS

• Skippy Range = 16.94 - 4.56 = 12.38

• Jif Range = 14.07 - 9.06 = 5.01

Descriptive Statistics

                      N   Minimum   Maximum   Mean      Std. Deviation
SKIPPY               52    4.56     16.94     10.6885   3.0282
JIF                  52    9.06     14.07     11.0368    .9475
Valid N (listwise)   52
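A table of this shape can be approximated outside SPSS. The sketch below is not SPSS output: `describe` is a hypothetical helper, and because the full weekly data are not listed in these slides, the yearly Skippy figures stand in as input.

```python
from statistics import mean, stdev


def describe(values):
    """Return the same five columns as the table above for one variable."""
    return {
        "N": len(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Mean": mean(values),
        "Std. Deviation": stdev(values),  # n - 1 divisor
    }


# Stand-in data: yearly Skippy sales, not the 52 weekly figures.
skippy_yearly = [12, 10, 15, 9, 12, 8, 11, 11]

summary = describe(skippy_yearly)
for column, value in summary.items():
    print(f"{column}: {value}")
```

The range follows from the table as Maximum minus Minimum, which is how the two range bullets above are obtained.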

Page 21

Graphical Summary

• Bar Chart

[Figure: bar chart of SKIPPY and JIF weekly sales by WEEK, 1 to 52; vertical axis "Mean", from 2 to 18]

Page 22

Graphical Summary

• Line Chart

[Figure: line chart of SKIPPY and JIF weekly sales by WEEK, 1 to 52; vertical axis "Mean", from 2 to 18]

Page 23

Graphical Summary

• Histogram

[Figure: histogram of SKIPPY weekly sales; horizontal axis from 5.0 to 17.0, frequency axis from 0 to 14; Std. Dev = 3.03, Mean = 10.7, N = 52]

[Figure: histogram of JIF weekly sales; horizontal axis from 9.00 to 14.00, frequency axis from 0 to 14; Std. Dev = .95, Mean = 11.04, N = 52]

Page 24

Graphical Summary

• The Box and Whisker Plot

[Figure: box and whisker plots of weekly sales for SKIPPY and JIF, N = 52 each; vertical axis from 2 to 18; one outlying case is labeled 35]

Page 25

Antidepressant Survey

• Questionnaire administered to 178 physicians randomly selected from 100,000 physicians who prescribe antidepressant drugs

• Investigating physician usage of antidepressant medication

Page 26

Questionnaire

Antidepressant Survey

Physician and Practice Characteristics

1. What is your primary medical specialty? (circle one only)

Adult Psychiatry (1)              General Psychiatry (6)
Child/Adolescent Psychiatry (2)   Internal Medicine (7)
Family Practitioner (3)           Neurology (8)
Forensic Psychiatry (4)           Other (9)
General Practitioner (5)

2. How many years have you been in practice, post residency?

Number years in practice: ________ (raw #)

Page 27

Questionnaire

Drug Profile and Utilization

3. Please indicate approximately how many prescriptions you write for each of the following products in a typical month.

# of Rx’s in an average month

a. Celexa (raw #)

b. Effexor (raw #)

c. Luvox (raw #)

d. Paxil (raw #)

e. Prozac (raw #)

f. Serzone (raw #)

g. Wellbutrin (raw #)

h. Zoloft (raw #)

Page 28

Summary Statistics

• Frequency Data (0/1 or 1 From Many)

                                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Adult Psychiatry                 24        13.5        13.8             13.8
          Child/Adolescent Psychiatry       7         3.9         4.0             17.8
          Family Practitioner              85        47.8        48.9             66.7
          General Practitioner              1          .6          .6             67.2
          General Psychiatry               10         5.6         5.7             73.0
          Internal Medicine                47        26.4        27.0            100.0
          Total                           174        97.8       100.0
Missing   System                            4         2.2
Total                                     178       100.0
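The three percentage columns differ only in their denominators, which a short sketch makes concrete. The function name `frequency_table` and the tuple layout are assumptions made for illustration; the counts are copied from the table.

```python
# Counts per specialty, in table order; 4 respondents gave no answer.
counts = {
    "Adult Psychiatry": 24,
    "Child/Adolescent Psychiatry": 7,
    "Family Practitioner": 85,
    "General Practitioner": 1,
    "General Psychiatry": 10,
    "Internal Medicine": 47,
}
missing = 4


def frequency_table(counts, missing):
    """Return (label, frequency, percent, valid percent, cumulative percent).

    Percent divides by all respondents; valid percent and cumulative
    percent divide by respondents who answered the question.
    """
    valid_total = sum(counts.values())
    grand_total = valid_total + missing
    rows = []
    running = 0
    for label, count in counts.items():
        running += count
        rows.append((
            label,
            count,
            100 * count / grand_total,
            100 * count / valid_total,
            100 * running / valid_total,
        ))
    return rows


rows = frequency_table(counts, missing)
for label, freq, pct, valid_pct, cum_pct in rows:
    print(f"{label:<28} {freq:>4} {pct:6.1f} {valid_pct:6.1f} {cum_pct:6.1f}")
```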

Page 29

Graphical Summary

• Pie Chart

[Figure: pie chart of primary medical specialty, with slices for Internal Medicine, General Psychiatry, General Practitioner, Family Practitioner, Child/Adolescent Psychiatry, Adult Psychiatry, and Missing]

Page 30

Prescription Rates

Descriptive Statistics

                       N   Minimum   Maximum   Mean      Std. Deviation
CELEXA               178     .00      60.00     6.0646    8.9755
EFFEXOR              178     .00     100.00     7.2455   12.5082
LUVOX                177     .00      35.00     2.0650    4.4136
PAXIL                178     .00     100.00    11.2725   13.2902
PROZAC               178     .00     100.00    15.6264   17.8469
SERZONE              177     .00      80.00     4.7345    8.7678
WELLBUTR             178     .00      75.00     7.4876   11.2028
ZOLOFT               178     .00     100.00    10.7079   13.9228
Valid N (listwise)   177

Page 31

Prescription Rates

[Figure: box plots of prescriptions per month for CELEXA, EFFEXOR, LUVOX, PAXIL, PROZAC, SERZONE, WELLBUTR, and ZOLOFT, N = 177 each; vertical axis from -20 to 120; outlying cases are labeled with case numbers]

Page 32

Prozac Rates by Physician Type

• First, Box Plot Summaries by Physician Type

• Second, ReCode Data: High/Average/Low Prescription Rates

Page 33

Prozac Rates by Physician Type

• Box Plot Summaries by Physician Type

[Figure: box plots of PROZAC prescriptions per month by D.TYPE; vertical axis from -20 to 120; N = 4 (Missing), 24 (Adult Psychiatry), 7 (Child/Adolescent Psychiatry), 85 (Family Practitioner), 1 (General Practitioner), 10 (General Psychiatry), 47 (Internal Medicine); outlying cases are labeled with case numbers]

Page 34

Prozac Rates by Physician Type

• ReCode Data into three levels: Low, Average, High

– Low Rate = 0 to 10 prescriptions per month

– Average Rate = 10 to 20 prescriptions per month

– High Rate = 20+ prescriptions per month
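The recoding rule can be sketched as a small function. The slide's ranges share their endpoints (10 appears in both Low and Average, 20 in both Average and High), so the code below assumes each range includes its lower bound and excludes its upper bound. That boundary choice and the name `recode_rate` are assumptions, not something the slide states.

```python
def recode_rate(prescriptions_per_month):
    """Map a raw monthly prescription count to Low / Average / High.

    Assumed boundaries: Low is [0, 10), Average is [10, 20), High is 20+.
    """
    if prescriptions_per_month < 0:
        raise ValueError("prescription counts cannot be negative")
    if prescriptions_per_month < 10:
        return "Low"
    if prescriptions_per_month < 20:
        return "Average"
    return "High"


for raw in (0, 9, 10, 19, 20, 100):
    print(raw, recode_rate(raw))
```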

Page 35

Cross Tabulating Data

• Create a Table Which Summarizes Number in Each Level

D.TYPE * PROZACLV Crosstabulation

Count

                                       PROZACLV
D.TYPE                           Low   Average   High   Total
Adult Psychiatry                   4       9       11      24
Child/Adolescent Psychiatry        1       4        2       7
Family Practitioner               58      14       13      85
General Practitioner                                1       1
General Psychiatry                 4       2        4      10
Internal Medicine                 33       7        7      47
Total                            100      36       38     174
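A cross tabulation is a count of how many records fall into each combination of two categories. The sketch below shows the idea with `collections.Counter`; the function name `crosstab` is an assumption, and the five records are hypothetical values chosen for illustration, not the survey data.

```python
from collections import Counter


def crosstab(records):
    """Count (row, column) pairs and the row and column totals."""
    cells = Counter(records)
    row_totals = Counter(row for row, _ in records)
    col_totals = Counter(col for _, col in records)
    return cells, row_totals, col_totals


# Hypothetical (physician type, Prozac level) records, NOT the survey data.
records = [
    ("Adult Psychiatry", "High"),
    ("Adult Psychiatry", "Average"),
    ("Family Practitioner", "Low"),
    ("Family Practitioner", "Low"),
    ("Internal Medicine", "Low"),
]

cells, row_totals, col_totals = crosstab(records)
for (row, col), count in sorted(cells.items()):
    print(f"{row:<22} {col:<8} {count}")
```

Row totals, column totals, and the grand total all come from the same records, so each set of totals must add up to the number of records.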

Page 36

Graphing the Cross Tabulation

• Same Information Can be Summarized Using a Bar Plot

[Figure: bar plot of Count (0 to 70) by D.TYPE (Internal Medicine, General Psychiatry, General Practitioner, Family Practitioner, Child/Adolescent Psychiatry, Adult Psychiatry, Missing), with separate bars for PROZACLV levels Low, Average, and High]

Page 37

Next Class Period in Computer Lab

• Don’t forget: Next period is in 11&14 BAB, from 7:15 p.m. to 9:00 p.m. We will not meet during the regular class time during the day.

• Also, please bring a floppy disk to class to save your work.