Chapter-2: Descriptive Statistics


Upload: oihane

Post on 19-Jan-2016


TRANSCRIPT

Page 1: Chapter-2: Descriptive Statistics:

© University of Minnesota-Duluth, Econ-2030 (Dr. Tadesse)

Chapter-2: Descriptive Statistics: Tabular and Graphical Presentations

Part A

Exploratory Data Analysis

Cross-tabulations; Scatter Diagrams

Page 2: Chapter-2: Descriptive Statistics:

Tabular and Graphical Procedures

Data
  Qualitative Data
    Tabular Methods
      • Frequency Distribution
      • Rel. Freq. Dist.
      • Percent Freq. Distribution
      • Crosstabulation
    Graphical Methods
      • Bar Graph
      • Pie Chart
  Quantitative Data
    Tabular Methods
      • Frequency Distribution
      • Rel. Freq. Dist.
      • Cum. Freq. Dist.
      • Cum. Rel. Freq. Distribution
      • Stem-and-Leaf Display
      • Crosstabulation
    Graphical Methods
      • Dot Plot
      • Histogram
      • Ogive
      • Scatter Diagram

Page 3: Chapter-2: Descriptive Statistics:

Stem-and-Leaf Display

A stem-and-leaf display is an exploratory data analysis method that shows both the rank order and the shape of the distribution of the data.

It is similar to a histogram, but it has the advantage of showing the actual data values.

Page 4: Chapter-2: Descriptive Statistics:

Example: Hudson Auto Repair

Sample of Parts Cost for 50 Tune-ups

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Page 5: Chapter-2: Descriptive Statistics:

Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Page 6: Chapter-2: Descriptive Statistics:

Steps to Construct a Stem-and-Leaf Display

Arrange the leading digit(s) of each data value to the left of a vertical line.

To the right of the vertical line, record the last digit for each item in rank order.

Each line in the display is referred to as a stem.

Each digit on a stem is a leaf.
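The construction steps can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes whole-number data with a leaf unit of 1, and the function name `stem_and_leaf` is made up for this example. Stems with no data values are simply omitted.

```python
from collections import defaultdict

# Parts costs for the 50 tune-ups in the Hudson Auto Repair example.
PARTS_COST = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Return display lines 'stem | leaves' for whole numbers (leaf unit = 1)."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):        # sorting puts the leaves in rank order
        stem, leaf = divmod(value, 10)  # leading digit(s) and last digit
        leaves_by_stem[stem].append(leaf)
    width = len(str(max(leaves_by_stem)))  # right-align stems of different length
    return [
        f"{stem:>{width}} | {' '.join(map(str, leaves_by_stem[stem]))}"
        for stem in sorted(leaves_by_stem)
    ]


for line in stem_and_leaf(PARTS_COST):
    print(line)
```

Run on the Hudson Auto Repair sample, the sketch prints six stems (5 through 10), matching the display shown for this example.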

Page 7: Chapter-2: Descriptive Statistics:

Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each line of the display (for example, 5 | 2 7) is a stem; each digit to the right of the vertical line is a leaf.

Page 8: Chapter-2: Descriptive Statistics:

Stretched Stem-and-Leaf Display

If we believe the display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).

When the display is stretched, the first value of a stem corresponds to leaf values of 0 to 4, and the second value of a stem corresponds to leaf values of 5 to 9.

Page 9: Chapter-2: Descriptive Statistics:

Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
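A stretched display can be sketched the same way by splitting each stem into a lower half (leaves 0 to 4) and an upper half (leaves 5 to 9). The code below is an illustrative sketch under the same assumptions as before (whole-number data, leaf unit of 1); the name `stretched_stem_and_leaf` is hypothetical, and half-stems with no data are omitted rather than printed empty.

```python
# Parts costs for the 50 tune-ups in the Hudson Auto Repair example.
PARTS_COST = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Return display lines using two stems per leading digit(s)."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1  # 0: leaves 0 to 4, 1: leaves 5 to 9
        rows.setdefault((stem, half), []).append(leaf)
    width = len(str(max(stem for stem, _ in rows)))
    return [
        f"{stem:>{width}} | {' '.join(map(str, rows[(stem, half)]))}"
        for stem, half in sorted(rows)
    ]


for line in stretched_stem_and_leaf(PARTS_COST):
    print(line)
```

For the tune-up data every half-stem contains at least one value, so the sketch prints twelve lines, two for each of the stems 5 through 10.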

Page 10: Chapter-2: Descriptive Statistics:

Stem-and-Leaf Display: Some notes

Leaf Units

• A single digit is used to define each leaf.

• In the preceding example, the leaf unit was 1.

• Leaf units may be 100, 10, 1, 0.1, and so on.

• Where the leaf unit is not shown, it is assumed to equal 1.

Page 11: Chapter-2: Descriptive Statistics:

Example: Leaf Unit = 0.1

If we have data with values such as

8.6   11.7   9.4   9.1   10.2   11.0   8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Page 12: Chapter-2: Descriptive Statistics:

Example: Leaf Unit = 10

If we have data with values such as

1806   1717   1974   1791   1682   1910   1838

a stem-and-leaf display of these data will be

Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
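The effect of the leaf unit can be made concrete with a short sketch. It is an illustration only: the helper names `split_stem_leaf` and `stem_and_leaf` are invented here, the data are assumed to be non-negative, and digits beyond the leaf position are dropped (truncated), which matches the treatment of 1682 in the example. `Decimal` is used so that values such as 8.6 divide cleanly by 0.1.

```python
from decimal import Decimal


def split_stem_leaf(value, leaf_unit):
    """Split a non-negative value into (stem, leaf) for the given leaf unit."""
    # Number of whole leaf units; anything smaller is truncated (1682 -> 168).
    units = int(Decimal(str(value)) / Decimal(str(leaf_unit)))
    return divmod(units, 10)  # last digit is the leaf, the rest is the stem


def stem_and_leaf(values, leaf_unit=1):
    """Return display lines, starting with a 'Leaf Unit = ...' header."""
    rows = {}
    for stem, leaf in sorted(split_stem_leaf(v, leaf_unit) for v in values):
        rows.setdefault(stem, []).append(leaf)
    width = len(str(max(rows)))
    lines = [f"Leaf Unit = {leaf_unit}"]
    lines += [
        f"{stem:>{width}} | {' '.join(map(str, leaves))}"
        for stem, leaves in rows.items()
    ]
    return lines


# The two leaf-unit examples from the slides.
print("\n".join(stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)))
print("\n".join(stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)))
```

With a leaf unit of 10, the value 1682 becomes stem 16 and leaf 8, as described in the note on the slide.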

Page 13: Chapter-2: Descriptive Statistics:

Crosstabulations and Scatter Diagrams

Page 14: Chapter-2: Descriptive Statistics:

Crosstabulations and Scatter Diagrams

Thus far we have focused on methods that are used to summarize the data on one variable at a time.

A crosstabulation and a scatter diagram are methods for summarizing the data on two (or more) variables simultaneously.

They help us understand the relationship between two variables.

Page 15: Chapter-2: Descriptive Statistics:

Crosstabulation

A crosstabulation is a tabular summary of data for two variables.

Crosstabulation can be used when:
• one variable is qualitative and the other is quantitative,
• both variables are qualitative, or
• both variables are quantitative.

The left and top margin labels define the classes for the two variables.

Page 16: Chapter-2: Descriptive Statistics:

Cross-tabulation

Example: The number of homes sold by style and price for the past two years is shown below.

                          Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
≤ $99,000         18         6      19        12        55
> $99,000         12        14      16         3        45
Total             30        20      35        15       100

Price Range is the quantitative variable; Home Style is the qualitative variable.

Page 17: Chapter-2: Descriptive Statistics:

Cross-tabulation

Insights Gained from Preceding Cross-tabulation

• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.

• Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Page 18: Chapter-2: Descriptive Statistics:

Cross-tabulation: Row or Column Percentages

Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Page 19: Chapter-2: Descriptive Statistics:

Cross-tabulation

                          Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
≤ $99,000         18         6      19        12        55
> $99,000         12        14      16         3        45
Total             30        20      35        15       100

The Total column (55, 45) is the frequency distribution for the price variable.

The Total row (30, 20, 35, 15) is the frequency distribution for the home style variable.
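The two marginal frequency distributions are just the row sums and column sums of the cell counts. The sketch below shows one way to compute them in plain Python; the variable names and the dictionary layout are choices made for this illustration, not something taken from the slides.

```python
# Cell counts from the home sales cross-tabulation (rows: price range,
# columns: home style, in the order listed in STYLES).
STYLES = ["Colonial", "Log", "Split", "A-Frame"]
COUNTS = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row sums: frequency distribution for the price variable.
row_totals = {price: sum(cells) for price, cells in COUNTS.items()}

# Column sums: frequency distribution for the home style variable.
column_totals = {
    style: sum(column) for style, column in zip(STYLES, zip(*COUNTS.values()))
}

grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

Both sets of totals add up to the same grand total of 100 homes, which is a quick consistency check on any cross-tabulation.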

Page 20: Chapter-2: Descriptive Statistics:

Cross-tabulation: Row Percentages

                          Home Style
Price Range    Colonial    Log      Split    A-Frame    Total
≤ $99,000        32.73    10.91    34.55     21.82      100
> $99,000        26.67    31.11    35.56      6.67      100

(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67
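The row percentages can be reproduced from the cell counts, and column percentages follow the same pattern with column totals in the denominator. The sketch below is illustrative: the function names are invented, results are rounded to two decimals to match the slide, and the column percentages are computed here rather than quoted from the slides.

```python
# Cell counts from the home sales cross-tabulation
# (columns: Colonial, Log, Split, A-Frame).
COUNTS = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}


def row_percentages(counts):
    """Each cell as a percentage of its row total."""
    return {
        row: [round(100 * cell / sum(cells), 2) for cell in cells]
        for row, cells in counts.items()
    }


def column_percentages(counts):
    """Each cell as a percentage of its column total."""
    column_totals = [sum(column) for column in zip(*counts.values())]
    return {
        row: [
            round(100 * cell / total, 2)
            for cell, total in zip(cells, column_totals)
        ]
        for row, cells in counts.items()
    }


print(row_percentages(COUNTS))
print(column_percentages(COUNTS))
```

Because each entry is rounded separately, a row of rounded percentages may add up to 100.01 rather than exactly 100.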

Page 21: Chapter-2: Descriptive Statistics:

Simpson's Paradox

Often, data in two or more cross-tabulations are aggregated to produce a summary cross-tabulation.

We must be careful in drawing conclusions about the relationship between the two variables in the aggregated cross-tabulation.

Simpson's Paradox: In some cases the conclusions based upon an aggregated cross-tabulation can be completely different from the conclusions based upon the un-aggregated data.
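A small numerical sketch shows how the reversal can happen. The counts below are illustrative only and do not come from the slides: two options, A and B, are compared within two groups, and then the groups are aggregated. The names `TABLES`, `rate`, and `aggregate` are made up for this example.

```python
# Illustrative counts (NOT from the slides): (successes, trials) per option.
TABLES = {
    "Group 1": {"A": (81, 87), "B": (234, 270)},
    "Group 2": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials


def aggregate(tables):
    """Add the counts of all groups into one summary table."""
    totals = {}
    for table in tables.values():
        for option, (successes, trials) in table.items():
            s, n = totals.get(option, (0, 0))
            totals[option] = (s + successes, n + trials)
    return totals


for group, table in TABLES.items():
    print(group, {option: round(rate(*counts), 3) for option, counts in table.items()})

summary = aggregate(TABLES)
print("Aggregated", {option: round(rate(*counts), 3) for option, counts in summary.items()})
```

In these illustrative counts, option A has the higher success rate within each group, yet option B has the higher rate in the aggregated table. The reversal arises because the two options are spread very unevenly across the groups, which is why conclusions from an aggregated cross-tabulation should be checked against the un-aggregated tables.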

Page 22: Chapter-2: Descriptive Statistics:

Scatter Diagram and Trend Line

Page 23: Chapter-2: Descriptive Statistics:

Scatter Diagram and Trend Line

A scatter diagram is a graphical presentation of the relationship between two quantitative variables.

One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.

The general pattern of the plotted points suggests the overall relationship between the variables.

A trend line is an approximation of the relationship.

Page 24: Chapter-2: Descriptive Statistics:

Scatter Diagram

A Positive Relationship

[Figure: scatter diagram of y against x]

Page 25: Chapter-2: Descriptive Statistics:

Scatter Diagram

A Negative Relationship

[Figure: scatter diagram of y against x]

Page 26: Chapter-2: Descriptive Statistics:

Scatter Diagram

No Apparent Relationship

[Figure: scatter diagram of y against x]

Page 27: Chapter-2: Descriptive Statistics:

Example: Panthers Football Team

The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30

Page 28: Chapter-2: Descriptive Statistics:

Scatter Diagram

[Figure: scatter diagram of the Panthers data. Horizontal axis (x): Number of Interceptions, scaled 0 to 4. Vertical axis (y): Number of Points Scored, scaled 0 to 35.]

Page 29: Chapter-2: Descriptive Statistics:

Example: Panthers Football Team

Insights Gained from the Preceding Scatter Diagram

• The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.

• Higher points scored are associated with a higher number of interceptions.

• The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.
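These insights can be checked numerically. The slides do not say how a trend line is obtained, so the sketch below assumes an ordinary least-squares line and also computes the sample correlation coefficient as a measure of how close the points are to a straight line. The function names are invented for this illustration.

```python
from math import sqrt

# Panthers data from the example: interceptions (x) and points scored (y).
INTERCEPTIONS = [1, 3, 2, 1, 3]
POINTS = [14, 24, 18, 17, 30]


def least_squares_line(x, y):
    """Slope and intercept of the least-squares line (an assumed trend line)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


slope, intercept = least_squares_line(INTERCEPTIONS, POINTS)
print(f"trend line: y = {intercept:.2f} + {slope:.2f} x")
print(f"correlation: {correlation(INTERCEPTIONS, POINTS):.2f}")
```

Under these assumptions the slope is positive (about 5.75 points per interception), consistent with a positive relationship, and the correlation is roughly 0.90: strong, but below 1, consistent with the points not all lying on a straight line.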