# stat 110 midterm review, fall 2011 - projects at harvard ?· stat 110 midterm review, fall 2011...

Post on 01-Sep-2018

214 views

Category:

## Documents

Embed Size (px)

TRANSCRIPT

• Stat 110 Midterm Review, Fall 2011Prof. Joe Blitzstein (Department of Statistics, Harvard University)

1 General Information

The midterm will be in class on Wednesday, October 12. There is no alternate timefor the exam, so please be there and arrive on time! Cell phones must be o, so itis a good idea to bring a watch. No books, notes, or calculators are allowed, exceptthat you may bring two sheets of standard-sized paper (8.5 x 11) with whateveryou want written on it (two-sided): notes, theorems, formulas, information aboutthe important distributions, etc.

There will be 4 problems, weighted equally. Many of the parts can be donequickly if you have a good understanding of the ideas covered in class (e.g., seeingif you can use Bayes Rule or understand what independence means), and for manyyou can just write down an answer without needing to simplify. None will requirelong or messy calculations. They are not arranged in order of increasing diculty.Since it is a short exam, make sure not to spend too long on any one problem.

Suggestions for studying: review all the homeworks and read the solutions, studyyour lecture notes (and possibly relevant sections from either book), the strategicpractice problems, and this handout. Solving practice problems (which means tryinghard to work out the details yourself, not just trying for a minute and then lookingat the solution!) is extremely important.

2 Topics

Combinatorics: multiplication rule, tree diagrams, binomial coecients, per-mutations and combinations, inclusion-exclusion, story proofs.

Basic Probability: sample spaces, events, axioms of probability, equally likelyoutcomes, inclusion-exclusion, unions, intersections, and complements.

Conditional Probability: definition and meaning, writing P (A1

\A2

\ \An

)as a product, Bayes Rule, Law of Total Probability, thinking conditionally,independence vs. conditional independence.

1

• Random Variables: definition, meaning of X = x, stories, indicator r.v.s, prob-ability mass functions (PMFs), probability density functions (PDFs), cumula-tive distribution functions (CDFs), independence, Poisson approximation.

Expected Value and Variance: definitions, linearity, standard deviation, Lawof the Unconscious Statistician (LOTUS).

Important Discrete Distributions: Bernoulli, Binomial, Geometric, NegativeBinomial, Hypergeometric, Poisson.

Important Continuous Distributions: Uniform, Normal.

General Concepts: stories, symmetry, discrete vs. continuous, conditional prob-ability is the soul of statistics, checking simple and extreme cases.

Important Examples: birthday problem, matching problem, Newton-Pepysproblem, Monty Hall problem, testing for a rare disease, elk problem (capture-recapture), gamblers ruin, Simpsons paradox, St. Petersburg paradox.

2

• 3 Important Distributions

The eight most important distributions we have discussed so far are listed below,each with its PMF/PDF, mean, and variance. This table will be provided on the lastpage of the midterm. As usual, we let 0 < p < 1 and q = 1 p. Each of thesedistributions is important because it has a natural, useful story, so understandingthese stories (and recognizing equivalent stories) is crucial. It is also important toknow how these distributions are related to each other. For example, Bern(p) is thesame as Bin(1, p), and Bin(n, p) is approximately Pois() if n is large, p is small and = np is moderate.

Name Param. PMF or PDF Mean Variance

Bernoulli p P (X = 1) = p, P (X = 0) = q p pq

Binomial n, pn

k

p

k

q

nk, for k 2 {0, 1, . . . , n} np npq

Geometric p qkp, for k 2 {0, 1, 2, . . . } q/p q/p2

NegBinom r, pr+n1r1

p

r

q

n

, n 2 {0, 1, 2, . . . } rq/p rq/p2

Hypergeom w, b, n(wk

)( bnk)

(w+bn

), for k 2 {0, 1, . . . , n} = nw

w+b

(w+bnw+b1 )n

n

(1 n

)

Poisson e

k

k!

, for k 2 {0, 1, 2, . . . }

Uniform a < b 1ba , for x 2 (a, b)

a+b

2

(ba)212

Normal , 2 1

p2

e

(x)2/(22)

2

3

• 4 Some Useful Formulas

4.1 De Morgans Laws

(A1

[ A2

[ An

)c = Ac1

\ Ac2

\ Acn

(A1

\ A2

\ An

)c = Ac1

[ Ac2

[ Acn

4.2 Complements

P (Ac) = 1 P (A)

4.3 Unions

P (A [ B) = P (A) + P (B) P (A \ B)

P (A1

[ A2

[ [ An

) =nX

i=1

P (Ai

), if the Ai

are disjoint

P (A1

[ A2

[ [ An

) nX

i=1

P (Ai

)

P (A1

[A2

[ [An

) =nX

k=1

(1)k+1

X

i1

• 4.6 Bayes Rule

P (A|B) = P (B|A)P (A)P (B)

Often the denominator P (B) is then expanded by the Law of Total Probability.

4.7 Expected Value and Variance

Expected value is linear: for any random variables X and Y and constant c,

E(X + Y ) = E(X) + E(Y )

E(cX) = cE(X)

It is not true in general that Var(X + Y ) = Var(X) + Var(Y ). For example, letX be Bernoulli(1/2) and Y = 1 X (note that Y is also Bernoulli(1/2)). ThenVar(X) + Var(Y ) = 1/4 + 1/4 = 1/2, but Var(X + Y ) = Var(1) = 0 since X + Y isalways equal to the constant 1. (We will see later exactly when the variance of thesum is the sum of the variances.)

Constants come out from variance as the constant squared:

Var(cX) = c2Var(X)

The variance of X is defined as E(X EX)2, but often it is easier to compute usingthe following:

Var(X) = E(X2) (EX)2

4.8 Law of the Unconscious Statistician (LOTUS)

LetX be a discrete random variable and h be a real-valued function. Then Y = h(X)is a random variable. To compute E(Y ) using the definition of expected value, wewould need to first find the PMF of Y , and then use E(Y ) =

Py

yP (Y = y). TheLaw of the Unconscious Statistician says we can use the PMF of X directly:

E(h(X)) =X

x

h(x)P (X = x),

where the sum is over all possible values of X. Similarly, for X a continuous r.v. withPDF f

X

, we can find the expected value of Y = h(X) using the PDF of X, withouthaving to find the PDF of Y :

E(h(X)) =

Z 1

1h(x)f

X

(x)dx

5

• 5 Stat 110 Midterm from 2007

1. Alice and Bob have just met, and wonder whether they have a mutual friend.Each has 50 friends, out of 1000 other people who live in their town. They thinkthat its unlikely that they have a friend in common, saying each of us is only friendswith 5% of the people here, so it would be very unlikely that our two 5%s overlap.

Assume that Alices 50 friends are a random sample of the 1000 people (equally likelyto be any 50 of the 1000), and similarly for Bob. Also assume that knowing whoAlices friends are gives no information about who Bobs friends are.

(a) Compute the expected number of mutual friends Alice and Bob have (simplify).

(b) Let X be the number of mutual friends they have. Find the PMF of X.

(c) Is the distribution of X one of the important distributions we have looked at andif so, which one? Note: even without solving (b), you can get credit by giving clearreasons for or against each of the important distributions.

6

• 2. Two coins are in a hat. The coins look alike, but one coin is fair (with probability1/2 of Heads), while the other coin is biased, with probability 1/4 of Heads. One ofthe coins is randomly pulled from the hat, without knowing which of the two it is.Call the chosen coin Coin C.

(a) Coin C is tossed twice, showing Heads both times. Given this information, whatis the probability that Coin C is the fair coin? (Simplify.)

(b) Are the events first toss of Coin C is Heads and second toss of Coin C isHeads independent? Explain briefly.

(c) Find the probability that in 10 flips of Coin C, there will be exactly 3 Heads.(The coin is equally likely to be either of the 2 coins; do not assume it already landedHeads twice as in (a). Do not simplify.)

7

• 3. Five people have just won a \$100 prize, and are deciding how to divide the \$100up between them. Assume that whole dollars are used, not cents. Also, for example,giving \$50 to the first person and \$10 to the second is dierent from vice versa.

(a) How many ways are there to divide up the \$100, such that each gets at least \$10?

Hint: there aren+k1

k

ways to put k indistinguishable balls into n distinguishable

boxes; you can use this fact without deriving it.

(b) Assume that the \$100 is randomly divided up, with all of the possible allocationscounted in (a) equally likely. Find the expected amount of money that the firstperson receives (justify your reasoning).

(c) Let Aj

be the event that the jth person receives more than the first person(for 2 j 5), when the \$100 is randomly allocated as in (b). Are A

2

and A3

independent? (No explanation needed for this.) Express IA2\A3 and IA2[A3 in terms

of IA2 and IA3 (where IA is the indicator random variable of any event A).

8

• 4. (a) Let X Pois(), with > 0. Find E(X!), the average factorial of X.(Simplify, and specify what condition on is needed to make the expectation finite.)

(b) The owner of a certain website is studying the distribution of the number ofvisitors to the site. Every day, a million people independently decide whether to visitthe site, with probability p = 2106 of visiting. Give a good, simple approximationof the probability of getting at least two visitors on a particular day (simplify; youranswer should not involve series).

(c) In the scenario of (b), approximately how many days will it take

Recommended