eecs 70 discrete mathematics and probability theory spring ... · eecs 70 discrete mathematics and...

EECS 70 Discrete Mathematics and Probability TheorySpring 2014 Anant Sahai Final exam

PRINT your student ID:

PRINT AND SIGN your name: Sahai , Anant(last) (first) (signature)

PRINT your Unix account login: cs70-

PRINT your discussion section and GSI (the one you attend):

Name of the person to your left:

Name of the person to your right:

Name of someone in front of you:

Name of someone behind you:

Section 0: Pre-exam questions (3 points)1. What is your favorite thing about math/EECS? (1 pt)

Solution: I love how different things come together in surprising ways. How the material that we teach youin 20 and 70 (along with linear algebra) is intimately connected with so many practical things that in 121,we can push to bring you guys to nearly the cutting edge of research in some places.

2. What are you most looking forward to this summer? (2pts)Solution: Spending a lot of time in my garden (70 = too many weeds!) and maybe actually going to somemore concerts, and reading some books. As far as research goes, I am most looking forward to workingclosely with Kate and Gireeja to bring them closer to their PhD dissertations. (Plus Kate and I never reallygot to celebrate her Best Paper award at IEEE DySpAN at the beginning of April.)

Quite frankly for myself, I am also looking forward to spending at least one solid afternoon relaxing onthe beach in Curaçao during the Communication Theory Workshop that will be held immediately after wesubmit the final grades for 70! Then in June, attending the wedding of two of my former students (one ofwhom is now a Prof at CMU). And then at the beginning of July, spending some time in Hawaii (whereGireeja is going to be presenting a paper coauthored with myself and Govind) for the Information TheorySymposium, where I’ll meet up with my brother who will also be coming from UCLA.

Do not turn this page until the proctor tells you to do so.

EECS 70, Spring 2014, Final exam 1

PRINT your name and student ID:

SOME APPROXIMATIONS AND OTHER USEFUL TRICKS THAT MAY OR MAY NOT COME IN HANDY:

n!≈√

2πn(ne)n

(nk

)≤ (

nek)k lim

n→∞(1+

xn)n = ex

When x is small, ln(1+ x)≈ x

When x is small, (1+ x)n ≈ 1+nx

The Golden Rule of 70 (and Engineering generally) applies: if you can’t solve the problem in frontof you, state and solve a simpler one that captures at least some of its essence. You might get partialcredit for doing so, and maybe you’ll find yourself on a path to the solution.

Source: http://www.math.unb.ca/~knight/utility/NormTble.htm



Section 1: Straightforward questions (24 points)You get one drop: do 4 out of the following 5 questions. Bonus for getting all five perfectly. No partial creditwill be given.

3. Birthday

What is the probability that at least two of your EECS70 classmates (there are 530 of you all) have the samebirthday (years don’t matter; only month and day do)?

Solution: N = 530,D= 365, so there are more students than days (even if you consider D= 366); therefore,there will be at least one day with more than two students (consider balls thrown into bins), so Pr = 1.

4. Stirling

Use Stirling’s approximation to estimate the probability of flipping 50 fair coins and getting exactly 25heads. Is this bigger or smaller than 0.1?

Solution: Each coin has probability 0.5 of coming up heads. Let X denote the total number of heads after50 fair coin tosses. From our knowledge of the binomial distribution, we get that:

Pr[X = 25] =(

5025

)0.5250.550−25 =

(5025

)0.550

Now, in order to make use of Stirling’s approximation, we want to rewrite the choose term in terms offactorials, then use Stirling’s:

Pr[X = 25] =(

5025

)0.550

=50!

25!25!250

≈√

2π(50)(50e )

50√2π(25)(25

e )25√

2π(25)(25e )

25250

Cancelling the e’s and splitting up 50 into 2∗25:

=

√2π√

25√

2 ·2550 ·250√

2π√

2π√

25√

25 ·2550 ·250

Cancelling terms:

=1√

25√

π

=1

5√

π

Noting that π < 4, we realize that 15√

π> 1

10 = 0.1. Hence, it is bigger than 0.1.



5. RSA

Let p = 11, q = 17 in standard RSA. Let e = 3.

a) Compute the private key that Bob should use.Solution: d = e−1 mod (p−1)(q−1)d = 3−1 mod (11−1)(17−1)d = 3−1 mod 160d = 107We verify this with 3∗107 = 321 = 1 mod 160

b) Let x = 20 be the message that Alice wants to send to Bob. What is the encrypted message E(x)?Solution: E(x) = x3 mod NE(x) = 203 mod 11∗17E(x) = 203 mod 187E(x) = 200∗10∗2∗2 mod 187 We reduce 200 mod 187E(x) = 13∗10∗2∗2 mod 187E(x) = 260∗2 mod 187 We reduce 260 mod 187E(x) = 73∗2 mod 187E(x) = 146 mod 187E(x) = 146



6. Count It

For each of the following collections, determine whether it is finite, countably infinite (like the naturalnumbers), or uncountably infinite (like the reals):

No explanations are required, just circle the correct choice. Only 6 out of 9 correct answers are required forfull credit.

1. Even numbers:Circle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: COUNTABLY INFINITE. They are obviously a subset of the natural numbers, thereforethey are at most countably infinite. But they are obviously infinite (i.e. not finite), so this set iscountably infinite.

2. The number of distinct files that are 1GB in lengthCircle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: FINITE. Each file can be viewed as a binary number with at most 109×8 bits. So once thelength of the file is fixed to some k each file can be viewed as a number between 0 and 2109×8. Thetotal number of possible values for length is also finite. So the total number of different files is finite.

3. Computer programs that haltCircle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: COUNTABLY INFINITE. The total number of programs is countably infinite, since eachcan be viewed as a string of characters (so for example if we assume each character is one of the 256possible values, then each program can be viewed as number in base 257, and we know these numbersare countably infinite. So the number of halting programs, which is a subset of all programs, can beeither finite or countably finite. But there are an infinite number of halting programs, for examplefor each number i the program that just prints i is different for each i. So the total number of haltingprograms is countably infinite.

4. The functions from N to {0,1,2,3}Circle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: UNCOUNTABLY INFINITE. Each real number between 0 and 1 can be written in thebinary base and then we can extract a function by mapping i to the i-th digit after the decimal point.This mapping is obviously one-to-one, so the set of functions must be uncountably infinite.

5. The number of points in the unit square [0,1]× [0,1]Circle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: UNCOUNTABLY INFINITE. This set contains {0}× [0,1] which is in one-to-one corre-spondence with [0,1] which we know is uncountably infinite.

6. The functions from R to {0,1,2,3,4,5}Circle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: UNCOUNTABLY INFINITE. This set is larger than R because we can map each r ∈ R toone of these functions by creating a function that is 1 at r and 0 everywhere else. So this set containsat least R and is therefore uncountably infinite.

7. Computer programs that always correctly tell if a program halts or notCircle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: FINITE. There is no such program because the halting problem cannot be solved.


8. Numbers that are the roots of nonzero polynomials with integer coefficienctsCircle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: COUNTABLY INFINITE. Polynomials with integer coefficients themselves are countablyinfinite. So let us list all polynomials with integer coefficients as P1,P2, . . . . We can label each root by apair (i, j) which means take the polynomial Pi and take its j-th root (we can have an arbitrary orderingon the roots of each polynomial). This means that the roots of these polynomials can be mapped in aninjective manner to N×N which we know is countably infinite. So this set is either finite or countablyinfinite. But every natural number n is in this set (it is the root of x−n). So this set is countably infinite.

9. Computer programs that correctly return the product of their two integer argumentsCircle one: FINITE / COUNTABLY INFINITE / UNCOUNTABLY INFINITESolution: COUNTABLY INFINITE. As said previously the set of all programs is countably infinite.So this set is either finite or countably infinite. But it is not finite. Consider a program that multipliesits two integer arguments and add a simple for loop that runs for i steps to its beginning. It does notchange the behavior of the program, but it will give us a different program for each i. Therefore thisset is countably infinite.



7. Induction

Prove by induction that for every natural number n≥ 1, we have:

n

∑i=1

i2 =n(n+1)(2n+1)

6

Solution:

Base case: Set n = 1, ∑1i=1 i2 = 12 = 1 = 1(2)(3)

6 .

Inductive hypothesis: Assume that the proposition is true for n. That is, ∑ni=1 i2 = n(n+1)(2n+1)

6 .

Induction step: Consider n+1,

n+1

∑i=1

i2 =n

∑i=1

i2 +(n+1)2

=n(n+1)(2n+1)

6+(n+1)2 (by inductive hypothesis)

=n(n+1)(2n+1)+6(n+1)2

6

=(n+1)(n(2n+1)+6(n+1))

6

=(n+1)(2n2 +n+6n+6)

6

=(n+1)(2n2 +7n+6)

6

=(n+1)(n+2)(2n+3)

6

=(n+1)((n+1)+1)(2(n+1)+1)

6.

By the principle of induction, the claim is proved.



Section 2: Additional straightforward questions (40 points)You get two drops: do 4 out of the following 6 questions. Bonus for getting more than that perfectly. Verylittle partial credit will be given.

8. Frying PanRapunzel and Flynn decide to get Elsa a special gift for her coronation. They decide to order it from Oaken’sTrading Company.

Pretend there are just 3 warehouses, one in Corona, one in Wesselton, and one in Arendelle. When someoneorders a product, the Corona warehouse is first checked to see if the product is in stock there. The probabilitythat the product is in stock here is 50%. If it is in stock, it is shipped for delivery in either 1, 2, or 3 weeks,all equally likely.

If the product is not in stock in Corona, the next closest warehouse (Wesselton) is checked, which has an80% probability of having the product in stock. Again, if the product is in stock here, it is shipped, but thistime, the product will arrive in 2, 3, or 4 weeks, all equally likely.

If the product is not in stock in Wesselton either, it will be shipped from Arendelle (which is assumed tohave all tradeable goods), and the delivery will be made in either 3 weeks (with 10% probability) or 4 weeks(with 90% probability).

Rapunzel ordered a frying pan (“who knew!”) from Oaken’s Trading Company, and it was deliveredin 3 weeks. What are the chances that it was shipped from Arendelle?Solution: The answer to the problem is Pr (Arendelle | 3 weeks), which we need to calculate. We have:

Pr (Arendelle | 3 weeks) =Pr (Arendelle ∩ 3 weeks)

Pr (3 weeks)

=Pr (3 weeks | Arendelle) × Pr (Arendelle)

Pr (3 weeks)

=0.1×Pr (Arendelle)

Pr (3 weeks)=

0.1×Pr (Not Corona ∩ Not Wesselton)Pr (3 weeks)

=0.1× (1−0.5)× (1−0.8)

Pr (3 weeks)=

0.01Pr (3 weeks)

=0.01

Pr ((3 weeks ∩ Corona) ∪ (3 weeks ∩ Wesselton) ∪ (3 weeks ∩ Arendelle))

=0.01

Pr (3 weeks ∩ Corona)+Pr (3 weeks ∩ Wesselton)+Pr (3 weeks ∩ Arendelle)

=0.01(

Pr (3 weeks | Corona) Pr (Corona)+Pr (3 weeks |Wesselton) Pr (Wesselton)+Pr (3 weeks | Arendelle) Pr (Arendelle)

)=

0.0113 ×0.5+ 1

3 ×0.5×0.8+0.1×0.5×0.2

=131

= 0.032258 (approx.)



9. Combinatorial identity

Use a combinatorial argument to prove why the following combinatorial identity is true (i.e., provide a storyfor why the identity should hold).

(2nn

)=

(n0

)2

+

(n1

)2

+ · · · +(

nn

)2

Solution: Assume we have a bag of n blue marbles and a bag of n red marbles and we need to pick nmarbles in total out of those two bags. One way to count the number of ways to pick n marbles out of those2n marbles is

(2nn

). The other way is to take i marbles from the blue bag and n− i from the red bag where i

can take values anywhere from 0 to n. Thus(2nn

)=

n

∑i=0

(ni

)(n

n− i

)As we know (

nk

)=

(n

n− k

)We proved (

2nn

)=

n

∑i=0

(ni

)2



10. Doubles

You are playing a game of chance where the rules are as follows. You start out with $1. You roll a pair ofindependent and fair 6-faced dice. If you roll “doubles” (both dice agree on a single number), you doubleyour money and get to roll the dice again. If the two dice are different, the game is over and you walk awaywith your money. Basically: you keep rolling the dice and doubling your money until you don’t roll doubles.

What is the expectation and variance of the final money at the end of the game?

Solution: The probability that we succeed at each trial is 1/6. Therefore the probability that we succeedfor exactly n trials and then fail is (1/6)n(5/6). If we succeed for n trials and fail at the next, the amountof money we get is 2n. So the random variable X which shows our final money has this distribution:Pr[X = 2n] = (1/6)n(5/6).

In order to compute the expectation, we have

E[X ] = ∑x

xPr[X = x] =∞

∑n=0

2n(1/6)n(5/6) = (5/6)∞

∑n=0

(2/6)n

Recall that for a geometric sum we have ∑∞n=0 α i = 1

1−α. So we have

E[X ] = (5/6)1

1−2/6= (5/6)(3/2) =

54

To compute Var[X ] we use the formula E[X2]−E[X ]2. We have

E[X2] = ∑x

x2 Pr[X = x] =∞

∑n=0

4n(1/6)n(5/6) = (5/6)∞

∑n=0

(4/6)n

But again using the formula for geometric sums we get

E[X2] = (5/6)1

1−4/6= (5/6)×3 =

52

Now we haveVar[X ] = E[X2]−E[X ]2 =

52− (

54)2 =

52− 25

16=

1516



11. Seating Senators

Assume there are 100 galactic senators, each one from a different star system, to be seated in a 100-seat row(not a circle. Seats 1 and 100 are not next to each other) in the galactic senate. You can leave your answersas expressions involving factorials and choose notation.:

a) In how many ways can the senators be seated, assuming any senator can be assigned any seat?Solution: This is just a count of the permutations of 100 senators. So this is 100!

b) In how many of the ways above will the senator from Naboo and the senator from the Trade Federationbe seated next to each other?Solution: There are 99 places where the pair can sit (1,2), through (99,100). Within this, thereare 2 orderings of the two senators. (Naboo, TF) and (TF, Naboo). There are 98 remaining seatswith 98 senators. So there are 98! different ways of arranging those. Putting these together we have2∗99∗98! = 2∗99! ways.

c) In how many of the ways will the senators from Naboo and the Trade Federation be seated next to eachother while the senators from Alderaan and the Techno Union are also seated next to each other?Solution: By parallel reasoning, we get 4 ∗ (98!). To say it explicitly: Start with Naboo and the TF.There are 99 places where they can be seated, but two of them are special — the ends of the table.Positions (1,2) and (99,100).For these two ends, there are 97 choices for where the Alderaan and TU sit. e.g. (1,2) through (97,98)if Naboo and TF are at the end. There are clearly 2 ways of arranging the Naboo/TF pair and 2 waysof arranging the Alderaan/TU pair. And there are 96! ways of arranging the others. That gives 2* 4 *97 * 96!.For the 97 other choices that are not the end, there are only 96 choices for where Alderaan and TU sit.This is because the Naboo/TF pair end up blocking 3 possible sites for the Alderaan/TU pair. The onethat they are in as well as the one overlapping before and the one overlapping after. Putting all thistogether we have another 97 * 4 * 96 * 96!.Combining the two, we see 2 ∗ 4 ∗ 97 ∗ (96!) + 97 ∗ 4 ∗ 96 ∗ (96!) = 4 ∗ 2 ∗ (97!) + 4 ∗ 96 ∗ (97!) =4∗ (2+96)∗ (97!) = 4∗ (98!).


d) Use the inclusion exclusion principle, along with your answers to the sub-problems above, to findthe probability that, when a seating arrangement is picked uniformly at random, neither thesenators from Naboo and the Trade Federation, nor the senators from Alderaan and the TechnoUnion, are seated next to each other. (No need to simplify your expression. You can just use A forthe answer to part (a), B for the answer to part (b), and C for the answer to part (c) to avoid having torewrite those expressions.)Solution: A−2B+C

A .If you wanted to simplify this: 100!−4∗99!+4∗98!

100! = 100∗99−4∗99+4100∗99 = 100∗99−4∗98

100∗99 = 25∗99−9825∗99 = 2377

2475 ≈ 96%.That last part probably requires a calculator or someone skilled at the abacus.



12. No Soulmates

Here are some partial preferences for 4 men and 4 women. Fill in the missing preferences so that there are4 different stable pairings possible. Give a brief justification for why there are 4 different stable pairings.

Man Women1 A B C D2 B D3 C D A B4 D B

Woman MenA 4B 4C 2D 2

Solution:

Man Women1 A B C D2 B A C D3 C D A B4 D C A B

Woman MenA 2 1 3 4B 1 2 3 4C 4 3 1 2D 3 4 1 2

The title gives a hint. We know that soulmates must be paired together in all possible stable pairings, so wedon’t want there to be any soulmates in order to increase the number of possible stable pairings.

This exact example was given in lecture by a student working it out on the board. We learned that one wayto show an exponential number of pairings is to create groups of 2 men and women that have two possibleconfigurations such that they will not be any rogue couples — the anti-soulmates configuration. So fill outthe preferences so { (A,1), (B,2) } is a possible stable pairing and so is { (A,2), (B,1) }. Do the same for Cand D and now there are two choices for each pair and thus 4 stable pairings overall.



13. Time Sink

In Fall, every EECS70 student (assume there are 625 = 252 of them) will be asked to do a supposedly easyvirtual lab in the second week of class. Let Xi be the number of hours it takes for student i to complete thelab. Assume that the {Xi} are independent geometric random variables each with expectation 1

p and variance1−pp2 .

a) If the average time for students to complete the first virtual lab is greater than 4 hours, the morale inEECS70 will crash. Use the central limit theorem to find an approximate value for p (i.e. the labdifficulty parameter) such that the probability of a decline in morale is about 15.87%.(It is fine to leave the answer in the form of a solution to an equation.)Solution: Let the average time for students to complete the virtual lab be X , so

X =1n

n

∑i=1

Xi,

where n = 625. We want:

P(X > 4) = 1−P(X ≤ 4)≈ 0.1587

⇒ P(X ≤ 4)≈ 0.8413

To use the CLT, we need to subtract off the expectation of X and divide by the standard deviation of X .We know

E[X ] =1p

and Var(X) =1− pnp2 .

Therefore we want

P

X− 1p√

1−pnp2

≤4− 1

p√1−pnp2

≈ 0.8413.

Looking up 0.8413 in the z table at the beginning of the exam, we see that it corresponds to a z valueof 1. So we need

4− 1p√

1−pnp2

≈ 1 ⇒ 16np2 +n−8np≈ 1− p.

Plugging in n = 625 and rearranging terms, we get that p should be a solution to the equation

10,000p2−4,999p+624≈ 0.

It is fine to leave the answer like this. Applying the quadratic formula and solving for p we get4999+

√49992−4∗10000∗624

20000 = 4999+√

3000120000 ≈ 0.259. Why did we pick the positive sign for the ± in the

quadratic formula? Because this is the root that makes sense.Notice that this is close to a quarter, which is what we expect. If we had just used expectations, wewould have immediately set p = 1

4 . We want to be safe, so the lab has to be easier. This means that weneed a number bigger than 0.25.


A reality check is to notice that this is very close to a perfect square, i.e.

10,000p2−5,000p+625 = (100p−25)2.

Setting the above expression equal to 0, we obtain

p≈ 14

Which confirms the intuition above. But saying p = 14 is not enough to account for the finite size of

the class.

b) BONUS: What is the approximate distribution for the number of students who will spend more than10 hours on the lab?)Solution: This distribution should be binomial, since each student either spends more than 10 hourson the lab, or they don’t. For student i, the probability they spend more than 10 hours is equal to theprobability that they don’t finish the lab in the first 10 hours (after that it doesn’t matter which hourthey finish the lab). Therefore

P(Xi > 10) = (1− p)10,

so the distribution of the number of students who will spend more than 10 hours on the lab is

Binomial(625,(1− p)10)

Thinking about this, we need to know what (0.75)10 is like. This is 590491088576 ≈ 0.06. This is arguably

close enough to zero to also approximate this distribution by a Poisson distribution with parameterλ = 625∗ (3

4)10 ≈ 36.



Section 3: True/False (42 points)For each question in this section, determine whether the given statement is TRUE or FALSE. If TRUE, provethe statement. If FALSE, provide a counterexample.

14. Variance

If n > 1 and Var(∑ni=1 Xi) = nVar(X1), then the {Xi} are iid.

Mark one: TRUE or FALSE .

Solution: iid stands for “identically independently distributed”, so we can provide a lot of counter-examples. The one here is just for reference.

If we assume that X1,X2 are independent, then we have Var(X1 +X2) = 2Var(X1)→Var(X2) =Var(X1).

So we can use the following counter-example, n= 2,X1∼Binomial(8,1/2),X2∼Geometric(1/2) for whichclearly, X1,X2 are not iid but Var(X1) =Var(X2) = 2.

Alternative counter-examples exist which break independence.



15. Déjà Vu?

Let n≥ 1 be a positive integer. Let r ≥ 1 be a positive integer. Consider a polynomial based code in whichn character messages are encoded into polynomials of degree less than or equal to n− 1. The codewordsare generated by evaluating these polynomials at n+ r distinct points (assume the underlying finite field hasmore than n+ r elements).

Then any two codewords corresponding to two different messages must differ in at least r+1 places.

Mark one: TRUE or FALSE.

Solution: TRUE.

Proof: We want to show that any two codewords corresponding to different messages will differ in at leastr+1 places.

Consider 2 different messages M1 and M2, each with n characters. Let the corresponding polynomials be P1and P2 respectively. Note that P1 and P2 are two distinct polynomials, each with degree at most n−1.

Let the transmitted codewords be C1 and C2 respectively. Each of these codewords has length n+ r.

Suppose for the sake of contradiction that these codewords differ in fewer than r + 1 places. Then thecodewords have to be identical in at least n places. But this means the polynomials P1 and P2 (each withdegree at most n−1) pass through at least n common points. But we know that, for any set of n points, thereis a unique polynomial of degree at most n− 1 that passes through these points. Therefore P1 and P2 mustbe the same polynomial, which contradicts the fact that they are distinct polynomials. Thus, our assumptionthat the codewords differ in fewer than r+1 places must be false. Therefore the codewords must differ in atleast r+1 places. �



16. Bound

Let X be a random variable with expectation µX and variance σ2X . Suppose µX ≤ µbound, and σ2

X ≤ σ2bound.

Then, for every a≥ µbound, we have:

Pr (X ≥ a)≤σ2

bound(a−µbound)2

Mark one: TRUE or FALSE.

TRUE.

P(X ≥ a) = P(X−µX ≥ a−µX)

≤ P(|X−µX | ≥ a−µX)

≤ σ2X

(a−µX)2 (Chebyshev’s Inequality)

≤σ2

bound(a−µbound)2 .

Replacing σ2X with σ2

bound ≥ σ2 clearly increases the numerator. Also, since a ≥ µbound , replacing µX withµbound will decrease the denominator, overall increasing the expression.



Section 4: Free-form Problems (65 + 35 points)17. Cathy’s Coupons (20 points)

Cathy is trying to collect two coupons. She pays $1 to get a coupon, but does not know which coupon shewill get before paying. There are two stores (A and B) selling the coupons. In store A, the probabilities ofgetting Coupon 1 and Coupon 2 are both 1

2 . In store B, the probabilities of getting Coupon 1 and Coupon 2are 1

3 and 23 respectively.

a) (10 points) If Cathy can pick only one store to buy all her coupons from, which store should she pickso as to minimize her expected cost?Solution: She should pick store A.The cost X is X1 +X2 where X1 is the cost to get the first new coupon (it can be either Coupon 1 orCoupon 2), and X2 is the cost to get the other coupon after getting the first new coupon. Because eachcoupon costs $1, X is also the number of coupons that Cathy buys (X1 is the number of coupons thatCathy buys to get the first new coupon; X2 is he number of coupons that Cathy buys to get the othercoupon after getting the first new coupon). Clearly, X1 is always 1, and X2 is related to geometricdistributions. Note that the expectation of a geometric distribution Geom(p) is 1

p .

• Cathy picks store A: if she gets Coupon 1 first (the probability is 12 ), X2 is Geom

(12

); if she gets

Coupon 2 first (the probability is 12 ), X2 is also Geom

(12

). Therefore, the expectation of X2 is

12 ×2+ 1

2 ×2 = 2, and the expectation of X is 1+2 = 3.• Cathy picks store B: if she gets Coupon 1 first (the probability is 1

3 ), X2 is Geom(2

3

); if she

gets Coupon 2 first (the probability is 23 ), X2 is Geom

(13

). Therefore, the expectation of X2 is

13 ×

32 +

23 ×3 = 5

2 , and the expectation of X is 1+ 52 = 7

2 .

Since 3 < 72 , she should pick store A.



b) (10 points) If Cathy can switch stores at any time, what strategy should she follow to collect bothcoupons while minimizing her expected cost? Why?Calculate the expectation of the total cost when following this strategy.

Solution: She should go to store A first. If she gets Coupon 1 there, she switches to store B; if she getsCoupon 2 there, she stays at store A.

It is trivial that, if she has gotten Coupon 1, she should go to store B because the probability of gettingCoupon 2 is higher. Similarly, if she has gotten Coupon 2, she should go to store A because the probabilityof getting Coupon 1 is higher. With this strategy, two scenarios can be analyzed as follows.

• Cathy goes to store A first: if she gets Coupon 1 there (the probability is 12 ), she should switch to store

B, and X2 is Geom(2

3

); if she gets Coupon 2 there (the probability is 1

2 ), she should stay at store A,and X2 is Geom

(12

). Therefore, the expectation of X2 is 1

2 ×32 +

12 ×2 = 7

4 , and the expectation of X is1+ 7

4 = 114 .

• Cathy goes to store B first: if she gets Coupon 1 there (the probability is 13 ), she should stay at store

B, and X2 is Geom(2

3

); if she gets Coupon 2 there (the probability is 2

3 ), she should switch to store A,and X2 is Geom

(12

). Therefore, the expectation of X2 is 1

3 ×32 +

23 ×2 = 11

6 , and the expectation of Xis 1+ 11

6 = 176 .

Since 114 < 17

6 , she should go to store A first and follow the strategy above.



18. Miley’s Money (30 + 20 points)

Miley Cyrus decides to get into angel investing in casual gaming startups. The process is simple.

At the start of each quarter, Miley invests some money in one startup and gives them some publicity. Atthe end of the quarter, Miley sells her stake. If the venture is successful (this happens with 50% probability,independent of other ventures), Miley sells her stake for 10 times her initial investment (paid immediately,in full, in cash). Otherwise, Miley’s investment is worthless and she gets back nothing.

Each quarter, Miley invests a fraction 0≤ f ≤ 1 of her investment pool in a startup, as described above. Theremaining fraction (1− f ) is kept as cash and not invested.

So Miley’s investment multiplier in quarter k is iid distributed as Xk with probabilities:

P(Xk = (1− f )+10 f ) =12

P(Xk = (1− f )) =12

She hires you as an adviser to help her choose her strategy and allocates B0 =$10M to start investing.

a) (5 points) What is the expectation and variance of Xk?Solution: Expectation: E[Xk] =

12(1+9 f )+ 1

2(1− f ) = 1+4 f

Var[Xk] = E[X2k ]−E[Xk]

2

= (12(1+9 f )2 +

12(1− f )2)− (1+4 f )2

= (12(1+18 f +81 f 2)+

12(1−2 f + f 2))− (1+4 f )2

= 1+8 f +41 f 2− (1+8 f +16 f 2)

= 25 f 2



b) (5 points) Find an expression for Bk — the total value of Miley’s investments at the end of quarterk — in terms of B0 and the random variables X1,X2, . . . ,Xk.

Solution: Bk = B0k∏i=1

Xi

c) (5 points) Let t be total number of quarters that Miley is going to dedicate to investing/promotingcasual games. What choice of f maximizes E[Bt ]? What is the resulting expectation?Solution:

E[Bt ] = B0E[t

∏i=1

Xi]

By independence of the Xi:

= B0

t

∏i=1

E[Xi]

= B0(1+4 f )t

This is maximized by f = 1. The resulting expectation is B0 ·5t .


PRINT your name and student ID:d) (5 Bonus points) What is the variance of the resulting Bt in the previous part?

Solution: From the previous part, we have that f = 1 is the value we are interested in calculating thevariance for:

Var(Bt) = E[B2t ]−E[Bt ]

2

= B20E[(

t

∏i=1

Xi)2]−E[Bt ]

2

= B20E[

t

∏i=1

(X2i )]−E[Bt ]

2

Noting that the X2i are independent:

= B20(

t

∏i=1

E[X2i ])−E[Bt ]

2

= B20(1+8 f +41 f 2)t −B2

0(1+4 f )2t

= B20(50t −25t)

Alternatively, we can realize that with this strategy, Miley has a 12t chance of walking away 10t factor

richer and will otherwise have nothing. So this is a Bernoulli random variable with p = 12t times B010t .

That gives a variance B20102t(2t−1

22t ) = B2025t(2t −1) = B2

0(50t −25t).

e) (15 points) Miley understands that investing is risky. She is willing to accept a 10% probability of abad outcome overall. So Miley is interested in a number mt for which P(Bt ≤mt)≈ 0.1. Approximatemt in terms of f assuming that Miley invests for t quarters. (Feel free to assume that t is large.)

Solution: Note that Bt = B0t

∏i=1

Xi from the previous part. We can define the random variables Yi such

that:

Yi =

{1 if Xi = 1+9 f0 if Xi = 1− f

Note that the Yi are iid Ber(1/2). We can substitute into the definition of Bt with these Yi, getting

Bt = B0(1+9 f )t∑

i=1Yi(1− f )

t−t∑

i=1Yi

We note that the bigger thatt∑

i=1Yi is, the bigger Bt is, as f > 0. Hence we want to say that the probability

oft∑

i=1Yi being small is fairly small. Since t is large, we can apply the CLT. mt is a value such that Bt is

smaller than it with probability 0.1. Instead of focusing on Bt , we will focus on finding a value y such

thatt∑

i=1Yi is smaller than it with probability at most 0.1, and then plugging that into our formula for Bt

to get a value for mt .

Pr[t

∑i=1

Yi < y] = Pr[

t∑

i=1Yi−0.5t

0.5√

t<

y−0.5t0.5√

t] < 0.1


Using the lookup table and the symmetry of a Gaussian distribution, we see that we want y−0.5t0.5√

t ≈−1.3.

Simplifying, we get that y =−0.65√

t +0.5t. Plugging in the formula for Bt , we get that

Pr[Bt ≤ B0(1+9 f )−0.65√

t+0.5t(1− f )0.5t+0.65√

t ]≤ 0.1

So mt ≈ B0(1+9 f )−0.65√

t+0.5t(1− f )0.5t+0.65√

t

f) (15 Bonus points) What choice of f maximizes the mt that you have calculated above? What doesf tend to when t gets large?Solution: We will use the fact that the value of f that maximizes mt also maximizes logmt .

lnmt = lnB0 +(−0.65√

t +0.5t) ln(1+9 f )+(0.5t +0.65√

t) ln(1− f )

Taking the derivative with respect to f and setting it to 0, we get:

9(−0.65√

t +0.5t)1+9 f

− 0.5t +0.65√

t1− f

= 0

−(0.5t +0.65√

t)(1+9 f )+(9(0.5t−0.65√

t))(1− f ) = 0

(−9(0.5t +0.65√

t)−9(0.5t−0.65√

t)) f = 0.5t +0.65√

t−9(0.5)t +9(0.65)√

t

f =0.5t +0.65

√t−9(0.5)t +9(0.65)

√t

−9(0.5t +0.65√

t)−9(0.5t−0.65√

t)

f =4t−6.5

√t

9t−18∗0.65√

t.

As t→ ∞: The terms that are significant are the terms that have t, not√

t. So we get that f → 49 .



19. Median (15 + 15 points)

Given a list of numbers with an odd length, the median is obtained by sorting the list and seeing whichnumber occupies the middle position. (e.g. the median of (1,0,1,0,2,1,2) is 1 because the list sorts to(0,0,1,1,1,2,2) and there is a 1 in the middle. Meanwhile, the median of (1,0,2,0,0,1,0) is 0 because thelist sorts to (0,0,0,0,1,1,2) and there is a 0 in the middle.)

a) (15 points) Consider an iid sequence of random variables Xi with probability mass function PX(0) =13 , PX(1) = 1

4 , PX(2) = 512 . Let Mi be the random variable that is the median of the random list

(X1,X2, . . . ,X2i+1). Show that P(Mi 6= 1) goes to zero exponentially fast in i.(HINT: What has to happen with the {X j} for Mi = 0 or Mi = 2 ?)Solution: The idea for this is as follows. Intuitively, as we sample more and more X j, the fractionof X j that turn up 0 is approaching 1

3 , and the fraction of X j that turn up 2 is approaching 512 . Since

these are both numbers less than 1/2, it is increasingly likely that the list will have the number 1 in themiddle when sorted. Note that this is using the fact that the median of the list is not 1 if and only ifeither at least half the Xi turned up 0 or at least half the X j turned up 2.We can formalize this intuition with a Chernoff bound. Define the random variables Y 0

j , Y 1j , and Y 2

jsuch that

Y kj =

{1 if X j = k0 if X j 6= k

Then E[Y 0j ] =

13 for all i, and E[Y 2

j ] =5

12 for all i. From the Chernoff bound for Bernoulli r.v.’s, whichwas worked out in class, we get that

Pr[1

2i+1

2i+1

∑j=1

Y 0j ≥

12]≤ e−(2i+1)D( 1

2 ||13 )

Pr[1

2i+1

2i+1

∑j=1

Y 2j ≥

12]≤ e−(2i+1)D( 1

2 ||512 )

We note from class that these are both exponentially decaying values, since D(a||p) = 0i f f a = p. As

mentioned above, we know that Mi 6= 1 iff 12i+1

2i+1∑j=1

Y 0j ≥ 1

2 or 1n

2i+1∑j=1

Y 2j ≥ 1

2 . By the union bound, the

probability of either of these events happening is bounded by the sum of the probabilities. Hence weget:

Pr[Mi 6= 1]≤ e−(2i+1)D( 12 ||

13 )+ e−(2i+1)D( 1

2 ||512 )

This is decaying exponentially in i.



b) (15 Bonus points) Generalize the argument you have made above to state a law of large numbers forthe median of a sequence of iid discrete-valued random variables. Sketch a proof for this law.(HINT: Let Mid(X) be that value t for which P(X < t)< 1

2 and P(X > t)< 12 . If such a t doesn’t exist,

then don’t worry about that case at all.)Solution: As the number of samples increases, the median of a sequence of iid discrete-valued randomvariables {Xi} converges to the median of the PMF of Xi, which is defined in the hint above.This is basically the same as the last part, except now we do not even have to necessarily worry aboutexponential convergence. Let

Y X<ti =

{1 if Xi < t0 if Xi > t

Y X>ti =

{1 if Xi > t0 if Xi < t

Then by a similar argument to the previous part, the averages of the Y X>ti and Y X<t

i are going toconverge to their means, Pr[Xi < t] and Pr[Xi > t] respectively, as the number of samples approachesinfinity. Since both these values are less than 1/2, this means that as the number of samples approachesinfinity, the probability of the frequency exceeding 1

2 is going to go to zero exponentially.Consequently, with high probability, the middle of the sequence will be t. This is because P(Xi >t)+P(Xi = t)+P(Xi < t) = 1 and so the probability that Xi takes value t is nonzero. Hence, if we letMn be the median of n trials, Pr[Mn 6= Mid(X)]→ 0, so Pr[Mn = Mid(X)]→ 1.



[Doodle page! Draw us something if you want or give us suggestions or complaints.]

Enjoy your summer!

Bees’ll buzz, kids’ll blow dandelion fuzz. . .


eecs 70 discrete mathematics and probability theory spring ... · eecs 70 discrete mathematics and...

Documents