
Post on 14-Dec-2015


Data Summaries

Summary Statistics

• Given a large set of numbers, we often want to describe, or summarize, the data with a few revealing numbers.

• Example: Yearly sales of two brands of peanut butter

Year     1992  1993  1994  1995  1996  1997  1998  1999
Skippy     12    10    15     9    12     8    11    11
Jif        12     9    11    12    10    12    11    11

Summary Statistics

• Measurements of Center

Arithmetic Mean: The Average

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Median: The data point in the center

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Skippy Mean: $\bar{x} = \frac{88}{8} = 11$

Jif Mean: $\bar{x} = \frac{88}{8} = 11$
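The two means can be checked with a few lines of Python. This is only a sketch: the helper name `arithmetic_mean` is an illustrative choice, and the data are the yearly sales from the table.

```python
# Yearly sales, 1992 to 1999, copied from the peanut butter table.
skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]


def arithmetic_mean(values):
    """Add up the values and divide by how many there are."""
    return sum(values) / len(values)


skippy_mean = arithmetic_mean(skippy)
jif_mean = arithmetic_mean(jif)
print("Skippy mean:", skippy_mean)
print("Jif mean:", jif_mean)
```

Both brands total 88 over 8 years, so both means come out to 11.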

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Median: Order the data.
If there is an even number of data points, average the two center numbers.
If there is an odd number of data points, report the center number.

         Smallest                          Largest
Skippy      8    9   10   11   11   12   12   15
Jif         9   10   11   11   11   12   12   12

The two center numbers are 11 and 11 for both brands.

Skippy and Jif Median = 11
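The median rule (order the data, then take the center value or average the two center values) can be sketched as follows. The function name `median` is an illustrative choice, not something from the slides.

```python
def median(values):
    """Order the data, then report the center (or average the two centers)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of points: a single center value exists.
        return ordered[mid]
    # Even number of points: average the two center values.
    return (ordered[mid - 1] + ordered[mid]) / 2


skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]

skippy_median = median(skippy)
jif_median = median(jif)
print("Skippy median:", skippy_median)
print("Jif median:", jif_median)
```

With eight yearly values per brand, the even-count branch is the one that runs here.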

Why Use A Median?

• Example: Sales Force Compensation

          Group 1   Group 2
          $60 K     $86 K
          $60 K     $88 K
          $60 K     $90 K
          $60 K     $92 K
          $210 K    $94 K

Mean      $90 K     $90 K

Median    $60 K     $90 K
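A short check with Python's standard `statistics` module shows how the single $210 K salary pulls Group 1's mean up while leaving its median alone. The variable names are illustrative; the salaries are the ones in the example, in thousands of dollars.

```python
import statistics

# Salaries in thousands of dollars, from the compensation example.
group_1 = [60, 60, 60, 60, 210]
group_2 = [86, 88, 90, 92, 94]

group_1_mean = statistics.mean(group_1)
group_2_mean = statistics.mean(group_2)
group_1_median = statistics.median(group_1)
group_2_median = statistics.median(group_2)

print("Group 1: mean", group_1_mean, "median", group_1_median)
print("Group 2: mean", group_2_mean, "median", group_2_median)
```

The means agree at 90, but only the median reveals that a typical Group 1 salesperson earns 60.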

Summary Statistics

• Measurements of Variation

Range: Largest - Smallest

$$R = \max_i x_i - \min_i x_i$$

Standard Deviation: Square Root of Variance

$$s = \sqrt{s^2}$$

Variance: Average Squared Difference

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Range: Largest - Smallest

Skippy: $R = 15 - 8 = 7$

Jif: $R = 12 - 9 = 3$

         Smallest                          Largest
Skippy      8    9   10   11   11   12   12   15
Jif         9   10   11   11   11   12   12   12

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Variance: Average Squared Difference: Skippy Only

         Smallest                          Largest
Skippy      8    9   10   11   11   12   12   15
Jif         9   10   11   11   11   12   12   12

$$s^2 = \frac{(8 - 11)^2 + (9 - 11)^2 + \dots + (15 - 11)^2}{7} = \frac{(-3)^2 + (-2)^2 + \dots + 4^2}{7} = \frac{9 + 4 + \dots + 16}{7} = 4.57$$

Summary Statistics

• Example: Yearly sales of two brands of peanut butter

Standard Deviation: Square Root of Variance

Skippy: $s = \sqrt{4.57} = 2.14$

Jif: $s = \sqrt{1.14} = 1.07$

         Smallest                          Largest
Skippy      8    9   10   11   11   12   12   15
Jif         9   10   11   11   11   12   12   12
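The range, variance, and standard deviation for both brands can be reproduced with Python's standard `statistics` module. This is a sketch rather than the slides' own method (the slides use SPSS later on). `statistics.variance` and `statistics.stdev` divide by n - 1, which matches the denominator of 7 used for the eight yearly values. The helper name `summarize` is illustrative.

```python
import statistics

skippy = [12, 10, 15, 9, 12, 8, 11, 11]
jif = [12, 9, 11, 12, 10, 12, 11, 11]


def summarize(values):
    """Return (range, sample variance, sample standard deviation)."""
    value_range = max(values) - min(values)
    variance = statistics.variance(values)  # sum of squared deviations / (n - 1)
    std_dev = statistics.stdev(values)      # square root of the sample variance
    return value_range, variance, std_dev


skippy_summary = summarize(skippy)
jif_summary = summarize(jif)

for name, (r, var, sd) in (("Skippy", skippy_summary), ("Jif", jif_summary)):
    print(f"{name}: range={r}, variance={var:.2f}, std dev={sd:.2f}")
```

Both brands share a mean and median of 11, so the spread is what separates them: Skippy's sales vary about twice as much as Jif's.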

Graphical Summary

• A Picture is Worth a Thousand Words (Bar Chart)

[Bar chart: mean sales by YEAR, 1992 to 1999, with one bar each for SKIPPY and JIF]

Summary Statistics

• A Year Worth of Weekly Sales Figures

Week   Skippy     Jif
1       10.25   10.53
2        7.81    9.86
3       11.61   11.09
4       14.19   10.93
...       ...     ...
51       4.56   10.56
52      14.62   11.52

Summary Statistics

• Summary Statistics: Using SPSS

• Skippy Range = 16.94 - 4.56 = 12.38

• Jif Range = 14.07 - 9.06 = 5.01

Descriptive Statistics

                      N   Minimum   Maximum      Mean   Std. Deviation
SKIPPY               52      4.56     16.94   10.6885           3.0282
JIF                  52      9.06     14.07   11.0368            .9475
Valid N (listwise)   52

Graphical Summary

• Bar Chart

[Bar chart: mean sales by WEEK, weeks 1 to 52, for SKIPPY and JIF]

Graphical Summary

• Line Chart

[Line chart: mean sales by WEEK, weeks 1 to 52, for SKIPPY and JIF]

Graphical Summary

• Histogram

[Histogram of SKIPPY weekly sales: Std. Dev = 3.03, Mean = 10.7, N = 52]

[Histogram of JIF weekly sales: Std. Dev = .95, Mean = 11.04, N = 52]

Graphical Summary

• The Box and Whisker Plot

[Box and whisker plots of weekly sales for SKIPPY and JIF, N = 52 each]

Antidepressant Survey

• Questionnaire administered to 178 physicians randomly selected from 100,000 physicians who prescribe antidepressant drugs

• Investigating Physician Usage of Antidepressant medication

Questionnaire

Antidepressant Survey

Physician and Practice Characteristics

1. What is your primary medical specialty? (circle one only)
   Adult Psychiatry (1)
   Child/Adolescent Psychiatry (2)
   Family Practitioner (3)
   Forensic Psychiatry (4)
   General Practitioner (5)
   General Psychiatry (6)
   Internal Medicine (7)
   Neurology (8)
   Other (9)

2. How many years have you been in practice, post residency?
   Number years in practice: ________ (raw #)

Questionnaire

Drug Profile and Utilization

3. Please indicate approximately how many prescriptions you write for each of the following products in a typical month.

# of Rx’s in an average month

a. Celexa (raw #)

b. Effexor (raw #)

c. Luvox (raw #)

d. Paxil (raw #)

e. Prozac (raw #)

f. Serzone (raw #)

g. Wellbutrin (raw #)

h. Zoloft (raw #)

Summary Statistics

• Frequency Data (0/1 or 1 From Many)

                                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Adult Psychiatry                     24      13.5            13.8                 13.8
          Child/Adolescent Psychiatry           7       3.9             4.0                 17.8
          Family Practitioner                  85      47.8            48.9                 66.7
          General Practitioner                  1        .6              .6                 67.2
          General Psychiatry                   10       5.6             5.7                 73.0
          Internal Medicine                    47      26.4            27.0                100.0
          Total                               174      97.8           100.0
Missing   System                                4       2.2
Total                                         178     100.0
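The three percentage columns differ only in their denominators: Percent divides by all 178 respondents, Valid Percent divides by the 174 who answered, and Cumulative Percent is a running total of Valid Percent. A minimal Python sketch, using the counts from the frequency table (variable names are illustrative):

```python
# Frequencies by specialty, in the order shown in the frequency table.
counts = {
    "Adult Psychiatry": 24,
    "Child/Adolescent Psychiatry": 7,
    "Family Practitioner": 85,
    "General Practitioner": 1,
    "General Psychiatry": 10,
    "Internal Medicine": 47,
}
missing = 4

valid_total = sum(counts.values())    # respondents who answered
grand_total = valid_total + missing   # all respondents

rows = []
running = 0
for specialty, n in counts.items():
    running += n
    percent = round(100 * n / grand_total, 1)
    valid_percent = round(100 * n / valid_total, 1)
    cumulative_percent = round(100 * running / valid_total, 1)
    rows.append((specialty, n, percent, valid_percent, cumulative_percent))

for row in rows:
    print(row)
```

Accumulating the raw counts before rounding, rather than summing the rounded Valid Percent values, keeps the last cumulative figure at exactly 100.0.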

Graphical Summary

• Pie Chart

[Pie chart of primary specialty. Slices: Missing, Adult Psychiatry, Child/Adolescent Psychiatry, Family Practitioner, General Practitioner, General Psychiatry, Internal Medicine]

Prescription Rates

Descriptive Statistics

                       N   Minimum   Maximum      Mean   Std. Deviation
CELEXA               178       .00     60.00    6.0646           8.9755
EFFEXOR              178       .00    100.00    7.2455          12.5082
LUVOX                177       .00     35.00    2.0650           4.4136
PAXIL                178       .00    100.00   11.2725          13.2902
PROZAC               178       .00    100.00   15.6264          17.8469
SERZONE              177       .00     80.00    4.7345           8.7678
WELLBUTR             178       .00     75.00    7.4876          11.2028
ZOLOFT               178       .00    100.00   10.7079          13.9228
Valid N (listwise)   177

Prescription Rates

[Box plots of prescriptions per month for CELEXA, EFFEXOR, LUVOX, PAXIL, PROZAC, SERZONE, WELLBUTR, and ZOLOFT, N = 177 each, with individual outlying cases labeled by case number]

Prozac Rates by Physician Type

• First, Box Plot Summaries by Physician Type

• Second, ReCode Data - High/Average/Low Prescription Rates

Prozac Rates by Physician Type

• Box Plot Summaries by Physician Type

[Box plots of PROZAC prescriptions per month by D.TYPE: Missing (N = 4), Adult Psychiatry (N = 24), Child/Adolescent Psychiatry (N = 7), Family Practitioner (N = 85), General Practitioner (N = 1), General Psychiatry (N = 10), Internal Medicine (N = 47), with outlying cases labeled by case number]

Prozac Rates by Physician Type

• ReCode Data

– Low Rate = 0 to 10 prescriptions per month

– Average Rate = 10 to 20 prescriptions per month

– High Rate = 20+ prescriptions per month
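The recoding step can be sketched in Python. The slide's ranges share their endpoints (10 and 20), so the cut-offs below are an assumption: a rate of exactly 10 is treated as Average and exactly 20 as High. The function name `prozac_level` and the sample rates are hypothetical and are not survey data.

```python
def prozac_level(rx_per_month):
    """Recode a monthly prescription count into Low / Average / High.

    Assumed boundaries (the slide leaves them ambiguous):
    Low is below 10, Average is 10 up to but not including 20, High is 20 or more.
    """
    if rx_per_month < 10:
        return "Low"
    if rx_per_month < 20:
        return "Average"
    return "High"


# Hypothetical monthly counts, chosen only to exercise each branch.
sample_rates = [0, 5, 10, 15, 20, 100]
levels = [prozac_level(rate) for rate in sample_rates]
print(list(zip(sample_rates, levels)))
```

If the course's recode puts the boundaries on the other side, only the two comparison operators need to change.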

Cross Tabulating Data

• Create a Table Which Summarizes Number in Each Level

D.TYPE * PROZACLV Crosstabulation

Count

                                        PROZACLV
D.TYPE                          Low   Average   High   Total
Adult Psychiatry                  4         9     11      24
Child/Adolescent Psychiatry       1         4      2       7
Family Practitioner              58        14     13      85
General Practitioner                               1       1
General Psychiatry                4         2      4      10
Internal Medicine                33         7      7      47
Total                           100        36     38     174
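Raw counts are hard to compare across specialties of very different sizes, so one natural follow-up is to convert each row of the crosstabulation to percentages. The sketch below uses the counts from the table. The single General Practitioner is placed in the High column because that is the only placement consistent with the column totals (100, 36, 38). Variable names are illustrative.

```python
# Counts per specialty as (Low, Average, High), from the crosstabulation.
crosstab = {
    "Adult Psychiatry": (4, 9, 11),
    "Child/Adolescent Psychiatry": (1, 4, 2),
    "Family Practitioner": (58, 14, 13),
    "General Practitioner": (0, 0, 1),  # placement inferred from column totals
    "General Psychiatry": (4, 2, 4),
    "Internal Medicine": (33, 7, 7),
}

# Column totals: add up each level across specialties.
column_totals = tuple(sum(column) for column in zip(*crosstab.values()))

# Row percentages: share of each specialty falling in each level.
row_percent = {}
for specialty, row in crosstab.items():
    row_total = sum(row)
    row_percent[specialty] = tuple(round(100 * n / row_total, 1) for n in row)

print("Column totals (Low, Average, High):", column_totals)
for specialty, shares in row_percent.items():
    print(specialty, shares)
```

On these counts, about 46% of adult psychiatrists fall in the High group, against about 15% of family practitioners.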

Graphing the Cross Tabulation

• Same Information Can be Summarized Using a Bar Plot

[Clustered bar plot: Count by D.TYPE (Missing, Adult Psychiatry, Child/Adolescent Psychiatry, Family Practitioner, General Practitioner, General Psychiatry, Internal Medicine), with one bar per PROZACLV level: Low, Average, High]

Next Class Period in Computer Lab

• Don’t forget: Next Period 11&14 BAB, from 7:15 p.m. to 9:00 p.m. We will not meet at the regular class time during the day.

• Also, please bring a floppy disk to class, to save your work.
