19 inference

27
Hadley Wickham Stat310 Introduction to inference Saturday, 27 March 2010

Upload: hadley-wickham

Post on 13-May-2015

586 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 19 Inference

Hadley Wickham

Stat310Introduction to inference

Saturday, 27 March 2010

Page 2: 19 Inference

Continuity correction

Applies only to sums, not means!

Important only where n smaller.

Saturday, 27 March 2010

Page 3: 19 Inference

Homework

Learn the change of variable steps!

If X and Y are independent, and you know f(x) and f(y), what is f(x, y)?

Common pattern

Saturday, 27 March 2010

Page 4: 19 Inference

What proportion of Rice students smoke

marijuana?

Saturday, 27 March 2010

Page 5: 19 Inference

1. Introduction to inference

2. A survey

3. Estimators and their properties

4. Feedback

Saturday, 27 March 2010

Page 6: 19 Inference

Data vs. Distributions

Random experiments produce data.

A repeatable random experiment has some underlying distribution.

We want to go from the data to say something about the underlying distribution. So far we have been doing the opposite.

Saturday, 27 March 2010

Page 7: 19 Inference

Inference

Up to now: Given a sequence of random variables with distribution F, what can we say about the mean of a sample?

What we really want: Given the mean of a sample, what can we say about the underlying sequence of random variables?

Saturday, 27 March 2010

Page 8: 19 Inference

Marijuana

Up to now: If I told you that smoking pot was Binomially distributed with probability p, you could tell me what I would expect to see if I sampled n people at random.

What we want: If I sample n people at random and find out m of them smoke pot, what does that tell me about p?

Saturday, 27 March 2010

Page 9: 19 Inference

Your turn

Let’s I selected 10 people at random from the Rice campus and asked them if they’d smoked pot in the last month. Two said yes.

Let Yi be 1 if person i smoked pot and 0 otherwise. What’s a good guess for distribution of Yi?

Saturday, 27 March 2010

Page 10: 19 Inference

p

prob

0.0e+00

5.0e−09

1.0e−08

1.5e−08

2.0e−08

2.5e−08

3.0e−08

3.5e−08

0.0 0.2 0.4 0.6 0.8 1.0

Saturday, 27 March 2010

Page 11: 19 Inference

Plug-in principle

A good first guess at the true mean, variance, or any other parameter of a distribution is simply to estimate it from the data.

We’ll learn more sophisticated ways after the break.

Saturday, 27 March 2010

Page 12: 19 Inference

Problems

If I picked someone at random out of the Rice campus directory, and asked them if they’d smoked pot in the last month, how likely do you think I am to get an honest answer?

How can we get around this?

Saturday, 27 March 2010

Page 13: 19 Inference

Your turn

Compare your survey to someone with the other colour survey. What is the difference? How could you use the answers to estimate how many people smoke pot?

Saturday, 27 March 2010

Page 14: 19 Inference

Questions1. Have you smoked pot in the last month?

2. Flip a coin. If it’s heads, write yes. Otherwise write yes if you’ve smoked pot in the last month, and no otherwise.

3. In the last month, how many of the following activities have you done? One list contains smoking pot, the other doesn’t.

Saturday, 27 March 2010

Page 15: 19 Inference

Results

1. The proportion of yeses in Q1.

2. The proportion of yeses in (Q2 - 0.5) * 2

3. (Q3a/4 - Q3b/5) / n.

Saturday, 27 March 2010

Page 16: 19 Inference

Your turn

Fill in the survey. Make sure no one else can see your answers!

Saturday, 27 March 2010

Page 17: 19 Inference

Definitions

Parameter space: set of all possible parameter values

Estimator: process/function which takes data and gives best guess for parameter

Point estimate: estimator for a single value

Saturday, 27 March 2010

Page 18: 19 Inference

Your turn

With a partner, brainstorm as many different properties that you could use to decide which estimator is best.

Remember that there is some true unknown proportion that we are trying to estimate.

Saturday, 27 March 2010

Page 19: 19 Inference

Important properties of an estimator

Unbiased

Small variance

Consistent

Sufficient

Saturday, 27 March 2010

Page 20: 19 Inference

Unbiased

V ar(θ̂1) < V ar(θ̂2)

Minimum variance

Common problem is to find UMVE (unbiased minimum variance estimator) across all possible estimators

E(θ̂) = θ

Saturday, 27 March 2010

Page 21: 19 Inference

Low bias, low variance Low bias, high variance

High bias, low variance High bias, high variance

Saturday, 27 March 2010

Page 22: 19 Inference

n=1

n=5

n=10

Consistent

Saturday, 27 March 2010

Page 23: 19 Inference

Results

Saturday, 27 March 2010

Page 24: 19 Inference

Your turn

How can we describe this as a random experiment? Do you have any concerns about applying the results of this survey to the entire Rice campus?

Saturday, 27 March 2010

Page 25: 19 Inference

Experiments

Remember, the context is always the random experiment. For this case the randomness comes from picking the person at random.

Repeating the experiment does not mean asking each person again (the answer would be the same) but asking a different set of people.

Saturday, 27 March 2010

Page 26: 19 Inference

Next time

(After the test and spring recess)

Method of moments

Maximum likelihood

Saturday, 27 March 2010

Page 27: 19 Inference

Feedback

Saturday, 27 March 2010