Post on 04-Jan-2016
Communicating Quantitative Information
Inflation, Election district
Polling, predictions, confidence intervals, margin of error
Homework: Identify topic for Project 1. Postings. Prepare for Midterm
Inflation
• Inflation is when goods and services cost more over time
– money is worth less
• Government agencies analyze a 'shopping cart' of goods and services and calculate (and publish) a number
• If annual inflation is 2% = .02, it means that something that cost $100 last year would cost $102 this year (on average)
new_cost = old_cost * (1 + inflation_rate)
Hint
• Need to change the percentage into a fraction
– 2% becomes .02
• Need to add 1
• Multiply old by 1.02
• Hint: if inflation is positive (if goods and services are increasing in price), then new must be more than old: need to multiply by something that increases…
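The three steps in the hint (percentage to fraction, add 1, multiply) can be sketched in a few lines of Python. The function name new_cost and its parameter names are illustrative choices, not part of the slides; the numbers are the 2% example above.

```python
def new_cost(old_cost, inflation_percent):
    """Price after one year of inflation; the rate is given in percent."""
    rate = inflation_percent / 100      # 2% becomes .02
    return old_cost * (1 + rate)        # add 1, then multiply the old cost


# The slide's example: $100 at 2% annual inflation.
print(round(new_cost(100, 2), 2))
```

Because the multiplier is 1 + rate, any positive rate gives a result larger than the old cost, which is the point of the hint.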
Exercises
• If inflation is 4%, what would the new price be for something that cost
– $50
– $10
• If inflation is 12%, what would the new price be for something that cost
– $50
– $10
History
• Mostly, there is inflation, though deflation is possible (and generally not good for the economy)
• Central banks ('the Fed') try to regulate inflation by changes in the interest rates
• Calculation is complex
– Consider computers
– digital cameras
What is meant by Grade Inflation?
• ?
Dental expenses
• Yes, expenses have gone up, but have they gone up faster than inflation, that is, faster than everything else?
• Look at the graph
– Gray line versus blue line
– NOTE: both are increases
Pie chart versus Bar graph
• Pie is to show parts of a whole
– For example, different categories of spending
• Bar graphs can show categories, also.
– Better than pie charts if categories are not everything
• Bar graphs good for showing different time periods
– Horizontal (x-axis) typically holds the time
• Clustered bar good for comparisons
• Stacked bar good for parts of a whole
On graphs
• Graphs and diagrams are for showing context… telling a story (the relevant story)
• Complexity is okay
– Want to encourage AND reward study
Remember: definitions, denominator, distribution, difference (context), dimension
Dimension: may be axis in graph
– gapminder uses color, size of 'dot', and timing
– Napoleon marching to/from Moscow: color, thickness of line, geography, temperature
On re-districting
• One technique is to concentrate [known] voters of one type in a single district, to remove them from other districts
• Are voters so predictable?
• Do the qualities of the individual representatives count?
New topic(s)
• Measurement
• Polling and sampling
Measurements
• Measuring something can require defining a system / process
– Competitive figure skating
• 'operational' definition
– 'likely voter'
• someone who voted in x% of last general elections and/or y% of primaries
• And knows the voting place
• Fixed place and time
• For surveys: answered a specific question in the context of other questions, …
Source
• The Cartoon Guide to Statistics by Larry Gonick and Woollcott Smith, HarperResource
Caution
• Procedures (formulas) presented without proof, though, hopefully, motivated
• Go over process different ways
• Next class: models of population, subpopulations in sample
Task
• Want to know the percentage (proportion) of some large group
– adults in USA
– television viewers
– web users
• For a particular thing
– think the president is doing a good job
– watched specific program
– viewed specific commercial
– visited specific website
Strategy: Sampling
• Ask a small group
– phone
– solicitation at a mall
– other?
• Monitor actions of a small group, group defined for this purpose
• Monitor actions of a panel chosen ahead of time
Quality of sample
• Recall discussion on students who 'took the bait' to take special survey
• More on quality of sample later
• More on adjusting data from panel for statement about total population later
Two approaches
• Estimating with confidence interval: proportion p in general population based on proportion phat in sample
• Hypothesis testing: H0 (null hypothesis) p = p0 versus Ha p > p0
Estimation process
• Construct a sample of size n and determine phat
– Ask who they are voting for (for now, let this be binomial choice)
• Use this as estimate for actual proportion p.
• … but the estimate has a margin of error. This means: the actual value is within a range centered at phat … UNLESS the sample was really strange.
• The confidence value specifies what the chances are of the sample being that strange.
Statement
• I'm 95% sure that the actual proportion is in the following range….
• phat – m <= p <= phat + m
• Notice: if you want to claim more confidence, you need to make the margin bigger.
Image from Cartoon book
• You are standing behind a target.
• An arrow is shot at the target, at a specific point in the target. The arrow comes through to your side.
• You draw a circle (more complex than +/- error) and say: Chances are the target point is in this circle unless shooter was 'way off'. Shooter would only be way off X percent of the time. (Typically X is 5% or 1%.)
Mathematical basis
• The phat values from samples are themselves (approximately) normally distributed…
– if sample and p satisfy certain conditions.
• Most samples produce phat values that are close to the p value of the whole population.
• Only a small number of samples produce values that are way off.
– Think of outliers of normal distribution
Actual (mathematical) process
• Can use these techniques when n*p >= 5 and n*(1-p) >= 5
– Sample size must be this big
• The phat values are distributed close to normal distribution with standard deviation sd(p) = sqrt(p*(1-p)/n)
• Can estimate this using phat in place of p in formula!
• Choose the level of confidence you want (again, typically 95% or 99%, that is, a 5% or 1% chance of being off). For 95% confident, look up (or learn by heart) the value 1.96: this is the number of standard deviations such that 95% of values fall in this area. So .95 is P(-1.96 <= (p-phat)/sd(p) <= 1.96)
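As a rough sketch, the sample-size condition and the standard deviation formula can be written as two small Python functions. The function names and the second example (n = 20, p = .1) are hypothetical and chosen only to show a case that fails the condition.

```python
from math import sqrt


def normal_approximation_ok(n, p):
    """Rule of thumb from the slide: n*p >= 5 and n*(1-p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def sd_of_phat(n, p):
    """Standard deviation of phat: sqrt(p*(1-p)/n)."""
    return sqrt(p * (1 - p) / n)


print(normal_approximation_ok(1000, 0.55))  # large sample, p near .5
print(normal_approximation_ok(20, 0.1))     # small sample, p far from .5
print(round(sd_of_phat(1000, 0.55), 4))
```

In practice p is unknown, so phat is plugged in for p in both functions.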
Notes
• p is less than 1 so (1-p) is positive.
• Margin of error decreases as p varies from .5 in either direction. (Check using Excel).
– if sample produces a very high (close to 1) or very low value (close to 0), p * (1-p) gets smaller
– (.9)*(.1) = .09; (.8)*(.2) = .16; (.6)*(.4) = .24; (.5)*(.5) = .25
sd(p) = sqrt(p*(1-p)/n)
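The same check can be done in Python instead of Excel. This is a minimal sketch; the helper name spread is an illustrative choice, and the p values are the ones listed in the note.

```python
def spread(p):
    """The p*(1-p) factor inside the square root of the sd formula."""
    return p * (1 - p)


# The factor is largest at p = .5 and shrinks toward 0 or 1.
for p in (0.5, 0.6, 0.8, 0.9):
    print(p, round(spread(p), 2))
```

Since the margin of error is proportional to the square root of this factor, p = .5 is the worst case for a given sample size.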
Notes
• Need to quadruple the n to halve the margin of error, because n sits under the square root.
sd(p) = sqrt(p*(1-p)/n)
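A quick numerical check of the quadrupling rule, as a sketch. The values phat = .55 and n = 1000 are used only for illustration; the ratio comes out the same for any phat, since only n changes.

```python
from math import sqrt


def margin(phat, n, z=1.96):
    """Margin of error: z-score times sqrt(phat*(1-phat)/n)."""
    return z * sqrt(phat * (1 - phat) / n)


m_small = margin(0.55, 1000)
m_large = margin(0.55, 4000)   # four times the sample size
print(round(m_large / m_small, 3))
```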
Formula
• Use a value called the z transform (z-score)
– for 95% confidence, the value is 1.96
Level of confidence   1-leg or 2-leg (tail area)   Standard deviations (z-score)
80%                   .10 or .20                   1.28
90%                   .05 or .10                   1.64
95%                   .025 or .05                  1.96
99%                   .005 or .01                  2.58
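The table values do not have to be memorized. As a sketch, they can be recomputed with the standard normal distribution in Python's statistics module (Python 3.8 or later); the function name z_score is an illustrative choice.

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided z-score: 95% confidence leaves .025 in each tail."""
    one_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - one_tail)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}  {z_score(level):.2f}")
```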
Mechanics
Process is
• Gather data (get phat and n)
• Choose confidence level
• Using table, calculate margin of error.
Book example: 55% (.55 of sample of 1000) said they backed the politician
sd(phat) = square_root((.55)*(.45)/1000) = .0157
• Multiply by z-score (e.g., 1.96 for a 95% confidence) to get margin of error
So p is within the range .550 - (1.96)*(.0157) to .550 + (1.96)*(.0157), that is, .519 to .581 or 51.9% to 58.1%
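The mechanics can be sketched as one Python function. The name confidence_interval is hypothetical; the inputs are the book example (phat = .55, n = 1000, z = 1.96).

```python
from math import sqrt


def confidence_interval(phat, n, z=1.96):
    """Return (lower, upper) for the proportion, given sample data and z-score."""
    sd = sqrt(phat * (1 - phat) / n)   # standard deviation of phat
    m = z * sd                         # margin of error
    return phat - m, phat + m


low, high = confidence_interval(0.55, 1000)
print(f"{low:.3f} to {high:.3f}")
```

Passing a different z (for example 2.58 for 99%) widens the range, which matches the earlier point that more confidence requires a bigger margin.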
Example, continued
51.9% to 58.1%
may round to 52% to 58%
or
may say 55% plus or minus 3 percent.
What is typically left out is that there is a 1/20 chance that the actual value is NOT in this range.
95% confident means
• 95/100 probability that this is true
• 5/100 chance that this is not true
• 5/100 is the same as 1/20 so,
• There is only a 1/20 chance that this is not true.
• Only 1/20 truly random samples would give an answer that deviated more from the real value
– ASSUMING NO INTRINSIC QUALITY PROBLEMS
– ASSUMING IT IS RANDOMLY CHOSEN
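One way to see the "1 in 20" claim is a small simulation: draw many truly random samples from a population whose real proportion is known, build the 95% range from each, and count how often the range contains the real value. This is only a sketch; the true proportion (.55), sample size, number of trials, and seed are all assumptions made for the demonstration.

```python
import random
from math import sqrt


def coverage(p_true=0.55, n=1000, trials=2000, z=1.96, seed=1):
    """Fraction of simulated random samples whose range contains p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each simulated respondent says "yes" with probability p_true.
        yes = sum(rng.random() < p_true for _ in range(n))
        phat = yes / n
        m = z * sqrt(phat * (1 - phat) / n)
        if phat - m <= p_true <= phat + m:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should land near .95. The simulation says nothing about samples that are not randomly chosen, which is exactly the caveat in capital letters above.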
99% confidence means
• [Give fraction positive]
• [Give fraction negative]
Why
• Confidence intervals given mainly for 95% and 99%??
• History, tradition, doing others required more computing….
Let's ask a question
• How many of you watched the last Super Bowl? World Cup?
– Sample is whole class
• How many registered to vote?
– Sample size is number in class 18 and older
• ????
Excel: columns A & B
Row   A                     B
1     students              (count entered)
2     watchers              (count entered)
3     psample               =B2/B1
4     sd                    =SQRT(B3*(1-B3)/B1)
5     Ztransform for 95%    =1.96
6     margin                =B5*B4
7     lower                 =MAX(0,B3-B6)
8     upper                 =MIN(B3+B6,1)
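For anyone working outside Excel, the spreadsheet can be mirrored in Python. This is a sketch: the function name poll_summary is hypothetical, and the class counts (25 students, 24 watchers) are made-up numbers chosen so that the MIN(…,1) clamp actually matters.

```python
from math import sqrt


def poll_summary(students, watchers, z=1.96):
    """Mirror of the spreadsheet column B, with cell references in comments."""
    psample = watchers / students                    # B3 = B2/B1
    sd = sqrt(psample * (1 - psample) / students)    # B4 = SQRT(B3*(1-B3)/B1)
    margin = z * sd                                  # B6 = B5*B4
    lower = max(0, psample - margin)                 # B7 = MAX(0,B3-B6)
    upper = min(psample + margin, 1)                 # B8 = MIN(B3+B6,1)
    return psample, margin, lower, upper


psample, margin, lower, upper = poll_summary(25, 24)
print(round(psample, 2), round(margin, 3), round(lower, 3), upper)
```

With counts like these, n*(1-p) is below 5, so the normal approximation is rough; the clamping keeps the range inside 0 to 1 but does not fix that.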
Variation of book problem
• Say sample was 300 (not 1000).
• sd(phat) = square_root((.55)*(.45)/300) = .0287
– Divisor smaller
Bigger number. The circle around the arrow is larger. The margin is larger because it was based on a smaller sample. Multiplying by 1.96 get .056; subtracting from and adding to the .55 get .494 to .606
You/we are 95% sure that true value is in this range.
• Oops: may be better, but may be worse. The fact that the lower end is below .5 is significant for an election!
Exercise
Determine / choose / read
• size of sample n
• proportion in sample (phat)
• claimed confidence level (and consult table).
• Hint: go back to Mechanics slide and Table slide and plug in the numbers!
Exercise
• size of sample is n
• proportion in sample is phat
• confidence level produces factor called the z-score
– Can be anything but common values are [80%], 90%, 95%, 99%
– Use table. For example, 95% value is 1.96; 99% is 2.58
• Calculate margin of error m
– m = zscore * sqrt((phat)*(1-phat)/n)
• Actual value is >= phat – m and <= phat + m
Hypothesis testing
• Pre-election polling
• Repeat example
• Source (again): The Cartoon Guide to Statistics by Gonick and Smith
– See also for jury selection, product inspection, etc.
Hypothesis testing
• Null hypothesis: p = p0
• Alternate hypothesis: p > p0
• Do a test and decide if there is evidence to reject the Null hypothesis. (Need more evidence to reject than to keep).
– Similar analysis (not giving proof!)
Hypothesis testing, continued
• Test statistic is
Z = (phat - p0) / sqrt(p0*(1-p0)/n)
Z = (.55-.50)/(sqrt(.5*.5)/sqrt(1000)) = 3.16
Use Excel =1-normsdist(3.16)
P(z>=3.16) = .0008
Reject Null hypothesis. If it were true (if p = p0), the chances of getting a sample value this high would be only .0008.
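The test statistic and the tail probability can also be computed in Python rather than Excel. A minimal sketch, assuming Python 3.8 or later for statistics.NormalDist; the function name one_sided_test is hypothetical, and the inputs are the example values phat = .55, p0 = .50, n = 1000.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_test(phat, p0, n):
    """Z statistic and upper-tail probability for H0: p = p0 versus Ha: p > p0."""
    z = (phat - p0) / sqrt(p0 * (1 - p0) / n)
    tail = 1 - NormalDist().cdf(z)     # same role as =1-normsdist(z) in Excel
    return z, tail


z, tail = one_sided_test(0.55, 0.50, 1000)
print(f"Z = {z:.2f}, P(z >= Z) = {tail:.4f}")
```

Note that the standard deviation here uses p0, the value assumed under the null hypothesis, not phat as in the confidence interval calculation.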
Project I
• Paper or presentation on news story involving mathematics and/or quantitative reasoning
– Involving the audience is good
– Everybody be ready with paper or ready to present. Some presentations may go to next class.
• Use multiple sources
• Explain the mathematics!!!
Ways to get topic
• Topic, assignment in other course that involves quantitative information
– Double dipping
• Alternative: compare how two different newspapers/writers/media treat the same topic. There must be real differences.
– Variant (special case): election polling. Talk about similarities and differences, perhaps definition of 'toss-up', how they describe sources, ?
• Paulos TV series: http://abcnews.go.com/Technology/WhosCounting/
Homework
• Topic for project 1 due by October 20
– You can re-use any topic you or anyone else posted
– You can re-use spreadsheet or diagram topics
– You can use topics I suggested
– You can use topics from another class
– YOU MUST post your proposal even if it is a topic I suggested.
• Midterm is October 18
• Presentation and project 1 paper due Nov. 4
• (Guide to midterm is on-line. Reviewing will assume you have studied the guide.)