Estimation and Uncertainty 12-706/73-359 Original lecture by H. Scott Matthews, CMU Sept 24, 2003


Estimation and Uncertainty

12-706/73-359
Original lecture by H. Scott Matthews, CMU
Sept 24, 2003


Fermi Problems

Estimating an unknown quantity is sometimes called a “Fermi problem,” after physicist Enrico Fermi
He wanted to show students they had the power to do estimation
His first problem: “How many piano tuners are there in Chicago?”


Sample Fermi Problems

How much tea is there in China?
How many pounds of human hair are cut every day?
How many leaves are there on all the trees in the world?
If you got a penny for each time someone said “Damn!” in the United States, how long would it take you to become a billionaire?
What area of the Earth would it take to supply the U.S. with all its energy needs if solar energy could be converted with 1% efficiency? Solar energy at Earth is about 1 kW/m².


Cobblers in the US – Method 1

Cobblers repair shoes
  On average, assume 20 min/task
  Thus about 20 jobs/day, or ~5000/yr at 250 working days
How many jobs are needed overall for US?
  I get shoes fixed once every 5 years
  About 280M people in US
  Thus 280M/5 = 56M shoes fixed/year
56M/5000 ~ 11,000 => 10^4 cobblers in US
Sensitivity:
  Am I representative?
  Are all shoe repairs done by cobblers?
  Do cobblers work 8 hours per day?
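The Method 1 arithmetic can be sketched in a few lines, with a sweep over the repair interval to probe the "Am I representative?" question. This is a minimal sketch, not the lecturer's code: the function and parameter names are hypothetical, and the 250 working days per year is an assumption inferred from 20 jobs/day ~ 5000/yr.

```python
def cobblers_needed(population, years_between_repairs,
                    jobs_per_day=20, work_days_per_year=250):
    """Estimate how many cobblers are needed to meet annual repair demand.

    work_days_per_year=250 is an assumption implied by the slide's
    20 jobs/day ~ 5000 jobs/yr.
    """
    repairs_per_year = population / years_between_repairs
    jobs_per_cobbler_per_year = jobs_per_day * work_days_per_year
    return repairs_per_year / jobs_per_cobbler_per_year


US_POPULATION = 280e6  # from the slide

# Sweep the repair interval to see how sensitive the answer is to it.
for years in (3, 4, 5, 6):
    estimate = cobblers_needed(US_POPULATION, years)
    print(f"one repair every {years} yr -> about {estimate:,.0f} cobblers")
```

Across this range the estimate runs from roughly 9,000 to 19,000, so the order-of-magnitude conclusion of 10^4 does not hinge on the exact repair interval.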


Cobblers in the US – Method 2

Greater Pittsburgh Yellow Pages has 36 entries under “Shoe Repairing”
Assume each repair shop has two employees: 72 cobblers in greater Pittsburgh
Population of greater Pittsburgh = 2.3 million (2000 Census) = 0.82% of U.S.
Number of cobblers in U.S. = 72/0.0082 = 8780
Sensitivity:
  Is Pittsburgh representative?
  Is “greater Pittsburgh” the right area for the Yellow Pages?
  Average number of employees of a shoe repair shop
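The scale-up in Method 2 is a proportional extrapolation from a local sample. The sketch below reproduces it and varies the employees-per-shop assumption, which the sensitivity list flags as uncertain. Function and variable names are hypothetical; the 280M US population is carried over from the Method 1 slide.

```python
def scale_up_from_sample(local_count, local_population, total_population):
    """Scale a local count to the whole population by population share."""
    share = local_population / total_population
    return local_count / share


SHOPS_LISTED = 36       # Yellow Pages entries, from the slide
PITTSBURGH_POP = 2.3e6  # from the slide (2000 Census)
US_POP = 280e6          # from the Method 1 slide

# Vary the assumed staffing per shop (the slide assumes 2).
for employees_per_shop in (1, 2, 3):
    local_cobblers = SHOPS_LISTED * employees_per_shop
    us_cobblers = scale_up_from_sample(local_cobblers, PITTSBURGH_POP, US_POP)
    print(f"{employees_per_shop} per shop -> about {us_cobblers:,.0f} cobblers in the US")
```

With two employees per shop this gives about 8,765; the slide's 8780 comes from rounding the population share to 0.82% before dividing. The result is directly proportional to the staffing assumption, so that input deserves the most scrutiny.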


Cobblers in the US

Methods 1 and 2 give “close” answers: 11,000 vs. 8,780
Actual: Census Bureau says 5,120 in US
Depends on accuracy of job counting in Census:
  Listing of occupations
  Full-time vs. part-time
  Number of responses received
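One way to quantify how “close” the estimates are is to compare each with the Census figure as a ratio and as a difference in orders of magnitude. This is an illustrative sketch; the `compare` helper is hypothetical and the log10 measure is one common convention, not something the lecture prescribes.

```python
import math

CENSUS_COUNT = 5120  # reference value from the slide
estimates = {"Method 1": 11_000, "Method 2": 8_780}  # from the slides


def compare(estimate, reference):
    """Return (ratio, absolute difference in orders of magnitude)."""
    ratio = estimate / reference
    return ratio, abs(math.log10(ratio))


for name, value in estimates.items():
    ratio, decades = compare(value, CENSUS_COUNT)
    print(f"{name}: {ratio:.2f}x the Census count, "
          f"{decades:.2f} orders of magnitude off")
```

Both estimates come out roughly a factor of two high, which is well under half an order of magnitude.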


Problem of Unknown Numbers

If we need a piece of data, we can:
  Look it up in a reference source
  Collect the number through survey/investigation
  Guess it ourselves
  Get experts to help guess it
Often only a ‘ballpark’, ‘back of the envelope’, or ‘order of magnitude’ estimate is needed
  Situations when the actual number is unavailable or where rough estimates are good enough
  E.g. 100s, 1000s, … (10^2, 10^3, etc.)


Methodology

First develop an upper bound and a lower bound. This allows a “sanity check” on the answer.
Use at least two independent methods of estimation and compare the answers.
Identify sensitivity to errors in the data. For sensitive data, be sure you have good values.
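The first two steps can be sketched using the cobbler estimates from the earlier slides. The bounds here are illustrative assumptions, not values from the lecture (at least one cobbler per million people, at most one per thousand), and the `sanity_check` helper is hypothetical.

```python
def sanity_check(estimates, lower, upper):
    """Map each estimate name to True if it lies within [lower, upper]."""
    return {name: lower <= value <= upper for name, value in estimates.items()}


US_POP = 280e6  # from the cobbler slides

# Illustrative bounds, NOT from the lecture:
lower = US_POP / 1_000_000  # at least one cobbler per million people
upper = US_POP / 1_000      # at most one cobbler per thousand people

estimates = {"Method 1": 11_000, "Method 2": 8_780}  # from the cobbler slides
print(sanity_check(estimates, lower, upper))

# Agreement between the two independent methods, expressed as a ratio.
ratio = max(estimates.values()) / min(estimates.values())
print(f"methods differ by a factor of {ratio:.2f}")
```

Both estimates pass the bounds check and agree with each other to within about 25%, which is the kind of cross-check the two-method rule is meant to provide.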


In the absence of “Real Data”

Are there similar or related values that we know or can guess? (proxies)
  Example: registered voters vs. population
Are there ‘rules of thumb’ in the area?
  E.g. ‘Rule of 72’ for compound interest
  r*t = 72 (r in percent per year, t in years): investment at 6% doubles in 12 yrs
Set up a ‘model’ to estimate the unknown
  Linear, product, etc. functional forms
  Divide and conquer
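To see how good the Rule of 72 is as a rule of thumb, it can be compared with the exact doubling time under annual compounding, which solves (1 + r)^t = 2. The function names below are hypothetical, and annual compounding is an assumption.

```python
import math


def doubling_time_rule_of_72(rate_percent):
    """Rule-of-thumb doubling time in years: t = 72 / r."""
    return 72 / rate_percent


def doubling_time_exact(rate_percent):
    """Exact doubling time under annual compounding: (1 + r)^t = 2."""
    return math.log(2) / math.log(1 + rate_percent / 100)


for rate in (2, 6, 12):
    approx = doubling_time_rule_of_72(rate)
    exact = doubling_time_exact(rate)
    print(f"{rate}%: rule of 72 = {approx:.1f} yr, exact = {exact:.1f} yr")
```

At 6% the rule gives 12 years against an exact value of about 11.9, so the shortcut is accurate to within about 1% at that rate.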


Methods

Similarity – do we have data that might apply to our problem?

Stratification – segment the population into subgroups, estimate each group

Triangulation – create models with different approaches and compare results


‘How much disk space to store every word you hear in a lifetime?’

How many words per day can you hear?
  12 hours per day, 120 words per minute = 86,400 words/day, or about 31.5 million per year
How much disk space to store them?
  Average word < 10 characters, so < 330 MB/year
Average lifetime? 75 years?
Answer: < 25 GB, less than the capacity of a laptop disk
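The chain of multiplications can be written out explicitly. This sketch assumes a 365-day year and one byte per character (neither is stated on the slide) and treats 10 characters per word as an upper bound; the constant names are hypothetical.

```python
HOURS_PER_DAY = 12       # from the slide
WORDS_PER_MINUTE = 120   # from the slide
BYTES_PER_WORD = 10      # upper bound from the slide, assuming 1 byte/character
LIFETIME_YEARS = 75      # from the slide
DAYS_PER_YEAR = 365      # assumption

words_per_day = HOURS_PER_DAY * 60 * WORDS_PER_MINUTE
words_per_year = words_per_day * DAYS_PER_YEAR
bytes_per_year = words_per_year * BYTES_PER_WORD
lifetime_bytes = bytes_per_year * LIFETIME_YEARS

print(f"words/day:  {words_per_day:,}")
print(f"words/year: {words_per_year:,}")
print(f"MB/year:    {bytes_per_year / 1e6:,.0f}")
print(f"GB/life:    {lifetime_bytes / 1e9:,.1f}")
```

Multiplying through gives about 31.5 million words per year, a little under 330 MB/year at 10 bytes per word, and just under 24 GB over 75 years.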


‘How much energy used by lighting in US residences?’

Assume 25 light fixtures per house
Assume each in use avg 2 hours per day
Assume average fixture is 50 W
Thus each fixture uses 100 Wh/day
Each house uses 2500 Wh/day
100 million households would use 250 million kWh/day
91,300 million kWh/yr
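The lighting estimate is a straight product of the stated assumptions. The sketch below reproduces it and varies hours of use, since that input is the hardest to know. The function name is hypothetical, and 365.25 days per year is an assumption chosen because it reproduces the slide's 91,300 figure.

```python
def annual_lighting_kwh(households, fixtures_per_house, hours_per_day,
                        watts_per_fixture, days_per_year=365.25):
    """Annual residential lighting energy in kWh from a simple product model."""
    wh_per_house_per_day = fixtures_per_house * hours_per_day * watts_per_fixture
    kwh_per_day = households * wh_per_house_per_day / 1000
    return kwh_per_day * days_per_year


# Slide assumptions: 100M households, 25 fixtures, 50 W each; vary hours of use.
for hours in (1, 2, 3):
    kwh = annual_lighting_kwh(100e6, 25, hours, 50)
    print(f"{hours} h/day -> {kwh / 1e6:,.0f} million kWh/yr")
```

Because the model is a pure product, halving or doubling any single input halves or doubles the answer.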


‘How much energy used by lighting in US residences?’

Our guess: 91,300 million kWh/yr
DOE: “lighting is 5-10% of household elec”
  http://www.eren.doe.gov/erec/factsheets/eelight.html
2000 US residential demand ~ 1.2 million million kWh (source below)
  10% is 120,000 million kWh
  5% is 60,000 million kWh
2000 demand source:
  http://www.eia.doe.gov/cneaf/electricity/epm/epmt44p1.html
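The check against the DOE range amounts to converting the 5-10% share into kWh and seeing where the estimate falls. A minimal sketch using only the figures on this slide; the variable names are hypothetical.

```python
OUR_ESTIMATE_KWH = 91_300e6      # our guess, from the slide
RESIDENTIAL_DEMAND_KWH = 1.2e12  # 2000 US residential demand, from the slide
LOW_SHARE, HIGH_SHARE = 0.05, 0.10  # DOE range for lighting's share

low_kwh = LOW_SHARE * RESIDENTIAL_DEMAND_KWH
high_kwh = HIGH_SHARE * RESIDENTIAL_DEMAND_KWH
implied_share = OUR_ESTIMATE_KWH / RESIDENTIAL_DEMAND_KWH
within_range = low_kwh <= OUR_ESTIMATE_KWH <= high_kwh

print(f"DOE range: {low_kwh / 1e6:,.0f} to {high_kwh / 1e6:,.0f} million kWh")
print(f"Implied lighting share: {implied_share:.1%}")
print(f"Estimate within DOE range: {within_range}")
```

The estimate implies a lighting share of about 7.6%, which sits near the middle of the DOE range.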


How many TV sets in the US?

Can this be calculated?
Estimation approach #1: survey/similarity
  How many TV sets owned by class?
  Scale up by number of people in the US
  Should we consider the class a representative sample? Why not?


TV Sets in US – Method 2

Segmenting: work from # households and # TVs per household (may survey for one input)
Assume x households in US
Assume z segments of ownership (i.e. what % owns 0, owns 1, etc.)
Then estimated number of television sets in US = x*(4*z_5 + 3*z_4 + 2*z_3 + 1*z_2 + 0*z_1),
where z_1 through z_5 are the fractions of households owning 0 through 4 sets


TV Sets in US – By Segmentation

Assume 50 million households in US
Assume 19% have 4, 30% have 3, 35% have 2, 15% have 1, and 1% have 0 television sets
Then 50,000,000*(4*0.19 + 3*0.30 + 2*0.35 + 1*0.15) = 125.5M television sets
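The segmentation estimate is a weighted sum over ownership segments. A minimal sketch follows, with a check that the shares add up to 100%; the function name and the dictionary layout are hypothetical choices, not part of the lecture.

```python
def total_sets(households, ownership_shares):
    """Total TV sets given a mapping {sets owned: fraction of households}."""
    total_share = sum(ownership_shares.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError(f"shares sum to {total_share}, expected 1.0")
    sets_per_household = sum(n * share for n, share in ownership_shares.items())
    return households * sets_per_household


# Segment shares from the slide.
shares = {4: 0.19, 3: 0.30, 2: 0.35, 1: 0.15, 0: 0.01}
estimate = total_sets(50e6, shares)
print(f"Estimated TV sets: {estimate / 1e6:.1f}M")
```

The shares imply an average of 2.51 sets per household, so the total scales linearly with whatever household count is assumed.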


TV Sets in US – Method 3

Estimation approach #3: published data
Source: Statistical Abstract of the US
  Gives many basic statistics such as population, areas, etc.


How well did we do?

Most recent data = 1997
  But ‘recently’ increasing < 3% per year
TV/HH estimate: 125.5M TVs; StatAb: 229M TVs
  % error: (229M - 125.5M)/125.5M ~ 82%
What assumptions are crucial in determining our answer?
  Were we right?
What other data on this table validate our models?
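The error calculation can be sketched with two conventions: relative to our estimate, as on the slide, and relative to the published value, which is the more common definition. The rerun with 100 million households (the figure used elsewhere in this lecture) is offered as one candidate for a crucial assumption, not as the lecture's stated answer; the variable names are hypothetical.

```python
OUR_ESTIMATE = 125.5e6  # segmentation estimate, from the slides
STAT_AB = 229e6         # Statistical Abstract figure, from the slide

gap = STAT_AB - OUR_ESTIMATE
error_vs_estimate = gap / OUR_ESTIMATE  # the slide's convention
error_vs_reference = gap / STAT_AB      # relative to the published value

# Same ownership shares, but with 100M households instead of 50M.
SETS_PER_HOUSEHOLD = 4 * 0.19 + 3 * 0.30 + 2 * 0.35 + 1 * 0.15
revised = 100e6 * SETS_PER_HOUSEHOLD
revised_error = (revised - STAT_AB) / STAT_AB

print(f"Error relative to our estimate: {error_vs_estimate:.0%}")
print(f"Error relative to StatAb:       {error_vs_reference:.0%}")
print(f"Revised estimate: {revised / 1e6:.0f}M ({revised_error:+.0%} vs StatAb)")
```

With 100 million households the estimate becomes 251M, about 10% above the published figure, which suggests the household count dominates the error.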


Some handy/often used data

Population of US: 275-300 million
Number of households: ~100 million
Average personal income: ~$30,000


Good Assumptions

Justify and document your assumptions
Have some basis in known facts or experience
Do not allow bias toward the answer to affect your assumptions
Example: what will the inflation rate be next year?
  Is past inflation a good predictor?
  Can I find current inflation?
  Should I assume change from current conditions?
  We typically use history to guide us


Notes on Estimation

Move from abstract to concrete, identifying assumptions

Draw from experience and basic data sources

Use statistical techniques/surveys if needed

Be creative, BUT be logical and able to justify

Find answer, then learn from it

Apply a reasonableness test