section 3e how numbers deceive pages 195-202. simpson’s paradox 3-e since shaq has the better...

Section 3E How numbers deceive Pages 195-202

Upload: norah-hudson

Post on 29-Dec-2015


TRANSCRIPT

Page 1: Section 3E How numbers deceive Pages 195-202. Simpson’s Paradox 3-E Since Shaq has the better shooting percentages in both the first half and second half

Section 3E: How numbers deceive

Pages 195-202

Page 2:

Simpson’s Paradox

Since Shaq has the better shooting percentages in both the first half and second half of the game, can he claim that he has the ‘better game’ compared to Vince?

Page 3:

Simpson’s Paradox

Shaq overall %: 7 / 14 = 50%
Vince overall %: 8 / 14 = 57.1%

Page 4:

Simpson’s Paradox

Simpson’s Paradox occurs when something appears better in each of two or more comparison groups, but is actually worse overall.

It occurs because the numbers/counts in each comparison group are so unequal.
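
A minimal sketch of how unequal counts produce the reversal. The half-by-half shooting table did not survive in this transcript, so the per-half numbers below are assumed values, chosen only so that they agree with the stated game totals (7/14 and 8/14). The names `shaq`, `vince`, `pct` and `overall` are illustrative.

```python
# Hypothetical per-half (made, attempted) shots; only the game totals
# 7/14 and 8/14 come from the slides.
shaq = {"first half": (4, 10), "second half": (3, 4)}
vince = {"first half": (1, 4), "second half": (7, 10)}


def pct(made, attempted):
    """Shooting percentage, rounded to one decimal place."""
    return round(100 * made / attempted, 1)


def overall(player):
    """Pool both halves: total made / total attempted."""
    made = sum(m for m, _ in player.values())
    attempted = sum(a for _, a in player.values())
    return pct(made, attempted)


for half in shaq:
    print(half, "Shaq", pct(*shaq[half]), "Vince", pct(*vince[half]))
print("overall", "Shaq", overall(shaq), "Vince", overall(vince))
```

With these assumed numbers Shaq takes most of his shots in his weaker half and Vince takes most of his in his stronger half, which is why pooling the halves reverses the comparison.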

Page 5:

Simpson’s Paradox – A Famous Example

University of California – Berkeley
Graduate Admissions, 1973

         Men                           Women
         Applied  Admitted  Percent    Applied  Admitted  Percent
Total    2691     1198      44.5%      1835     557       30.4%

Gender Discrimination??

Page 6:

             Men                        Women
Department   Applied  Admitted  %       Applied  Admitted  %
A            825      512       62%     108      89        82%
B            560      353       63%     25       17        68%
C            325      120       37%     593      202       34%
D            417      138       33%     375      131       35%
E            191      53        28%     393      94        24%
F            374      22        6%      341      24        7%
Total        2691     1198      44.5%   1835     557       30.4%
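
The department table can be checked with a short script. This is only a sketch: the counts are copied from the rows above, and the names `men`, `women`, `rate` and `pooled_rate` are illustrative. One caveat: the six department rows for men add up to 2,692 applicants rather than the 2,691 on the Total line, so one entry on the slide is probably off by one. The rounded overall rate is the same either way.

```python
# (applied, admitted) per department, copied from the table above.
men = {"A": (825, 512), "B": (560, 353), "C": (325, 120),
       "D": (417, 138), "E": (191, 53), "F": (374, 22)}
women = {"A": (108, 89), "B": (25, 17), "C": (593, 202),
         "D": (375, 131), "E": (393, 94), "F": (341, 24)}


def rate(applied, admitted):
    """Admission rate as a percentage."""
    return 100 * admitted / applied


def pooled_rate(group):
    """Admission rate over all departments combined."""
    applied = sum(a for a, _ in group.values())
    admitted = sum(d for _, d in group.values())
    return rate(applied, admitted)


# Departments where women were admitted at a higher rate than men.
women_ahead = [d for d in men if rate(*women[d]) > rate(*men[d])]

print("women ahead in:", women_ahead)
print("overall: men %.1f%%, women %.1f%%" % (pooled_rate(men), pooled_rate(women)))
```

Women have the higher admission rate in four of the six departments, yet the pooled rate favors men, because far more women applied to the departments that admitted few applicants of either sex.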

Page 7:

Does Smoking Make You Live Longer?

Early 1970’s – Medical Study in England

Involved adult residents from Whickham

20 years later, a follow-up study looked at survival rates of people from the original study.

Among adult smokers, 24% died during that 20 year period.

Among adult non-smokers, 31% died during that period.

Page 8:

Does Smoking Make You Live Longer?

Turns out – in the original study, nonsmokers were older (on average) than the smokers.

Thus the higher death rate among the non-smokers simply reflected the fact that death rates tend to increase with age.

When the results were broken into age groups, they showed that for any given age group, non-smokers had a higher survival rate than smokers.

Page 9:

Conditional Probability

Does a positive mammogram always mean cancer?

• Positive test (+)
  • True positive – identify malignant tumors as malignant
  • False positive – identify benign tumors as malignant

• Negative test (-)
  • True negative – identify benign tumors as benign
  • False negative – identify malignant tumors as benign

Page 10:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                               Cancer   No Cancer   Total
Mammogram + Test (malignant)
Mammogram – Test (benign)
Total                                               10,000

Page 11:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                               Cancer   No Cancer   Total
Mammogram + Test (malignant)
Mammogram – Test (benign)
Total                          100      9,900       10,000

Page 12:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                               Cancer            No Cancer             Total
Mammogram + Test (malignant)   .85 × 100 = 85
Mammogram – Test (benign)                        .85 × 9,900 = 8,415
Total                          100               9,900                 10,000

Page 13:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                               Cancer   No Cancer   Total
Mammogram + Test (malignant)   85       1,485       1,570
Mammogram – Test (benign)      15       8,415       8,430
Total                          100      9,900       10,000

Page 14:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                               Cancer         No Cancer        Total
Mammogram + Test (malignant)   85 (True +)    1,485 (False +)  1,570
Mammogram – Test (benign)      15 (False -)   8,415 (True -)   8,430
Total                          100            9,900            10,000

Page 15:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

So, the chance of actually having cancer if the mammogram is positive is

85 / 1570 = 5.4%

If your mammogram is negative – what is the chance it’s a false negative? (You have cancer in spite of being told you don’t?)

Page 16:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

                          Cancer         No Cancer        Total
Mammogram + (malignant)   85 (True +)    1,485 (False +)  1,570
Mammogram - (benign)      15 (False -)   8,415 (True -)   8,430
Total                     100            9,900            10,000

Page 17:

Mammography (85% accurate): results for 10,000 mammograms

Cancer rate = 1%

If your mammogram is negative – what is the chance it’s a false negative?

15 / 8430 = .0018 = .18%
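
The whole mammogram table can be rebuilt from the three given numbers. The sketch below assumes, as the slides do, that "85% accurate" applies equally to women with and without cancer. The function name `screening_table` and the variable names are illustrative.

```python
def screening_table(population, base_rate, accuracy):
    """Return counts (true_pos, false_pos, false_neg, true_neg).

    Assumes the test is equally accurate for people with and
    without the condition.
    """
    sick = population * base_rate
    healthy = population - sick
    true_pos = accuracy * sick
    true_neg = accuracy * healthy
    return true_pos, healthy - true_neg, sick - true_pos, true_neg


tp, fp, fn, tn = screening_table(10_000, 0.01, 0.85)

# Of all positive results, what share really have cancer?
share_cancer_given_pos = tp / (tp + fp)
# Of all negative results, what share have cancer anyway?
share_cancer_given_neg = fn / (fn + tn)

print(f"{share_cancer_given_pos:.1%}  {share_cancer_given_neg:.2%}")
```

The positive-test figure is small because the 1,485 false positives come from the 9,900 women without cancer, who far outnumber the 100 women with it.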

Page 18:

Polygraphs & Drug Tests

Suppose a polygraph is 90% accurate

1% of job applicants lie

1000 applicants – so 10 are lying

How many of those applicants who were accused of lying were actually telling the truth?

Page 19:

                            Lie   Tell Truth   Total
Polygraph Test + (Lie)
Polygraph Test - (Truth)
Total                                          1,000

Page 20:

                            Lie   Tell Truth   Total
Polygraph Test + (Lie)
Polygraph Test - (Truth)
Total                       10    990          1,000

Page 21:

                            Lie   Tell Truth   Total
Polygraph Test + (Lie)      9
Polygraph Test - (Truth)          891
Total                       10    990          1,000

Page 22:

                            Lie   Tell Truth   Total
Polygraph Test + (Lie)      9     99           108
Polygraph Test - (Truth)    1     891          892
Total                       10    990          1,000

Of those applicants that failed the polygraph, 99 out of 108 or 99/108 = .917 = 91.7% were actually telling the truth.

[Of those applicants that passed the polygraph, 1 out of 892 or 1/892 = .0011 = .11% were actually lying.]

Page 23:

Tree Diagram for Polygraphs (90% accurate)

[Figure: tree diagram]

Page 24:

Tree Diagram for Polygraphs

So 99/108 = 91.7% of those who are accused of lying are not actually lying.
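
The tree-diagram calculation can be sketched by multiplying probabilities along each branch instead of counting applicants. The diagram itself is not reproduced in this transcript, so the branch layout below is an assumption based on the stated 1% lying rate and 90% accuracy. The variable names are illustrative.

```python
from fractions import Fraction

# Branch probabilities from the slides: 1% of applicants lie,
# and the polygraph is right 90% of the time either way.
p_lie = Fraction(1, 100)
p_correct = Fraction(90, 100)

# Multiply along each branch of the tree that ends in "accused of lying".
liar_accused = p_lie * p_correct                  # lies, test says lie
truthful_accused = (1 - p_lie) * (1 - p_correct)  # truthful, test says lie

# Share of accused applicants who were telling the truth.
share_truthful = truthful_accused / (liar_accused + truthful_accused)
print(share_truthful, float(share_truthful))
```

Using exact fractions avoids rounding along the way. The result equals 99/108, the same ratio obtained from the table of 1,000 applicants.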

Page 25:

Athletic Drug Testing

Drug tests are about 95% accurate
Assume 4% of athletes use drugs
Assume 1000 athletes at some event.

What percentage of the athletes are falsely suspended from the team?

Page 26:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                          Drug User   Clean   Total
Drug Test + (drugs)
Drug Test – (no drugs)
Total                                         1,000

Page 27:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                          Drug User   Clean   Total
Drug Test + (drugs)                           86
Drug Test – (no drugs)                        914
Total                     40          960     1,000

Page 28:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                 Drug User        Clean              Total
Drug Test +      .95 × 40 = 38
Drug Test -                       .95 × 960 = 912
Total            40               960                1,000

Page 29:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                          Drug User   Clean   Total
Drug Test + (drugs)       38          48      86
Drug Test – (no drugs)    2           912     914
Total                     40          960     1,000

Page 30:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                          Drug User      Clean           Total
Drug Test + (drugs)       38 (True +)    48 (False +)    86
Drug Test – (no drugs)    2 (False -)    912 (True -)    914
Total                     40             960             1,000

So 48/86 = 55.8% of those accused of using drugs were actually clean and falsely suspended.

Page 31:

Drug test 95% accurate, 1,000 athletes
Drug use rate = 4%

                          Drug User      Clean           Total
Drug Test + (drugs)       38 (True +)    48 (False +)    86
Drug Test – (no drugs)    2 (False -)    912 (True -)    914
Total                     40             960             1,000

Of those athletes that passed the drug test, 2 out of 914 or 2/914 = .0022 = .22% were drug users.
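
The question posed earlier, "What percentage of the athletes are falsely suspended?", can be read with two different denominators, and the answers differ widely. The sketch below uses only counts from the table. The variable names are illustrative.

```python
# Counts from the completed drug-test table.
athletes = 1_000
false_positives = 48   # clean athletes who tested positive
all_positives = 86     # everyone who tested positive

# Reading 1: share of ALL athletes who are falsely accused.
share_of_all = false_positives / athletes
# Reading 2: share of ACCUSED athletes who are actually clean.
share_of_accused = false_positives / all_positives

print(f"{share_of_all:.1%} of all athletes, {share_of_accused:.1%} of those accused")
```

The slides answer with the second reading (48/86). The first reading divides the same 48 athletes by the full field of 1,000.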

Page 32:

Political Mathematics

Republicans: Tax cut would benefit all families and the middle class would receive slightly greater benefits.

Democrats: Tax cut would send disproportionate benefits to the rich.

Which side was being more fair?

Page 33:

Republicans calculated the average tax cut that would be received per family in each group.

The last bar shows that families with incomes over $200,000 would get an average tax cut of 2.9%

Someone paying $100,000 in taxes would reduce their taxes by $2900 while someone paying $1000 in taxes would reduce their taxes by only $29.

Page 34:

Democrats calculated the percentage of total benefits that would be received by families in each group.

Families with incomes over $200,000 would receive 28.1% of the total benefits from the tax cut.

Because such families pay more than ¼ of the total income taxes collected, they would see (as a group) more than ¼ of the total benefit of any across the board cut.
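
A small sketch of why both descriptions can be true of the same cut. It reuses the two example taxpayers from the Republican slide ($100,000 and $1,000 in taxes) and applies the same 2.9% rate to both, as that slide does. Treating these two people as the whole tax base is a simplification made only for illustration. The variable names are illustrative.

```python
RATE = 0.029  # the 2.9% average cut quoted for the top income group

# Taxes currently paid by the two example taxpayers from the slides.
taxes_paid = {"high": 100_000, "low": 1_000}

# Dollar value of the same percentage cut for each taxpayer.
savings = {who: RATE * paid for who, paid in taxes_paid.items()}

# Share of total taxes paid vs. share of total savings received.
tax_share = taxes_paid["high"] / sum(taxes_paid.values())
savings_share = savings["high"] / sum(savings.values())

print(savings, round(tax_share, 3), round(savings_share, 3))
```

Under an equal percentage cut, each taxpayer's share of the savings matches their share of the taxes paid. The same cut therefore looks even-handed when reported as a percentage and lopsided when reported in dollars.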

Page 35:

Which side was being more fair?

Neither!

The Republicans neglect the fact that most of the total tax savings would go to the wealthy.

Democrats neglect the fact that the wealthy already pay most of the taxes.

Page 36:

A Cut or an Increase?

Government spending for a popular education program was $100 million last year. When Congress prepares its budget for next year, spending for the program is slated to rise to $102 million.

The Consumer Price Index is expected to rise by 3% over the next year.

Is spending on this program being increased or cut?

Page 37:

A Cut or an Increase?

Absolute change:

$102 million - $100 million = $2 million

This is an increase in spending.

Relative change:

$2 million / $100 million = 2%

This is a decrease in spending relative to the inflation rate (3%).
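
The comparison with inflation can be made explicit by converting next year's budget into last year's dollars. This sketch assumes the 3% CPI forecast holds exactly. The variable names are illustrative.

```python
last_year = 100.0   # spending last year, in $ millions
next_year = 102.0   # budgeted spending, in $ millions
inflation = 0.03    # expected rise in the Consumer Price Index

nominal_change = (next_year - last_year) / last_year

# Deflate next year's budget into last year's dollars.
next_year_real = next_year / (1 + inflation)
real_change = (next_year_real - last_year) / last_year

print(f"nominal {nominal_change:+.1%}, real {real_change:+.2%}")
```

In last year's dollars the program receives about $99.03 million, so its purchasing power falls by roughly 1% even though the dollar amount rises by 2%.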

Page 38:

Homework for Wednesday:

Pages 204-207: # 14, 15, 17, 18, 20, 23, 26