exercises of random variables - docencia.ac.upc.edu
![Page 1: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/1.jpg)
1
Exercises of Random Variables
![Page 2: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/2.jpg)
2
Exercise
• Show that a necessary and sufficient condition for a random variable on $\mathbb{N}$ to have a geometric distribution is that it has the property
$$P(X > n+m \mid X > m) = P(X > n)$$
– for all natural numbers n and m.
![Page 3: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/3.jpg)
3
geometric distribution
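• Random variable that models the number of trials until the first success (or the first failure).
• Requirements:
– number of trials is potentially infinite
– two outcomes per trial: success and failure
– outcomes statistically independent
– trials have the same probability of success
$$P(X = i) = p^{i-1}(1-p) \quad \text{for } i = 1, 2, 3, \ldots$$
– In this form p is the probability of the outcome that is repeated i−1 times before the sequence stops, so that $P(X > n) = p^{n}$.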
![Page 4: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/4.jpg)
4
Exercise
• Meaning of:
$$P(X > n+m \mid X > m) = P(X > n)$$
• The probability of waiting n more minutes, given that you have already waited m, is independent of m.
– Applications:
• Queue at the bus stop (relate to the Poisson rv)
• Queue at a hub or a relay (is the model correct?)
• Expected survival time
– Illness, or protocol design.
Like its continuous analogue (the exponential distribution), the geometric distribution is memoryless.
![Page 5: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/5.jpg)
5
Exercise
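• Property to be shown:
$$P(X > n+m \mid X > m) = P(X > n)$$
• Definition: Geometric Random Variable:
$$P(X = i) = p^{i-1}(1-p) \quad \text{for } i = 1, 2, 3, \ldots$$
• The distribution function is
$$P(X > n) = \sum_{i=n+1}^{\infty} p^{i-1}(1-p) = \sum_{k=n}^{\infty} p^{k}(1-p) = (1-p)\,p^{n} \sum_{k=0}^{\infty} p^{k} = (1-p)\,p^{n}\,\frac{1}{1-p} = p^{n}$$
(change of variable $k = i-1$, then the geometric series $\sum_{k=0}^{\infty} p^{k} = \frac{1}{1-p}$), so
$$P(X > n) = p^{n}$$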
![Page 6: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/6.jpg)
6
Exercise
• If A then B: if X is geometric, so that $P(X > n) = p^{n}$, then
$$P(X > n+m \mid X > m) = \frac{P(X > n+m)}{P(X > m)} = \frac{p^{n+m}}{p^{m}} = p^{n} = P(X > n)$$
[Figure: two plots (horizontal axis 1 to 61, vertical axis 0 to 0.1) with the points m and n+m marked, illustrating $P(X > n) = p^{n}$ and $P(X > n+m \mid X > m) = P(X > n)$.]
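The memoryless property can also be checked numerically. The following is a minimal sketch, assuming the pmf $P(X = i) = p^{i-1}(1-p)$ used in these slides; the function names and the values of p, n and m are illustrative choices, not part of the original exercise.

```python
def geometric_tail(p: float, n: int, terms: int = 5000) -> float:
    """Approximate P(X > n) by summing the pmf p**(i-1) * (1-p) for i > n.

    The infinite series is truncated after `terms` terms, which is harmless
    as long as p**terms is negligible (an assumption of this sketch).
    """
    return sum(p ** (i - 1) * (1 - p) for i in range(n + 1, n + 1 + terms))


def conditional_tail(p: float, n: int, m: int) -> float:
    """P(X > n + m | X > m), computed as the ratio of two tails."""
    return geometric_tail(p, n + m) / geometric_tail(p, m)


# Hypothetical values chosen only for illustration.
p, n, m = 0.9, 5, 12
print("P(X > n)           =", geometric_tail(p, n))
print("p**n               =", p ** n)
print("P(X > n+m | X > m) =", conditional_tail(p, n, m))
```

The three printed values should agree up to floating-point error, whatever m is chosen.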
![Page 7: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/7.jpg)
7
Exercise
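• On the other hand, if B then A:
Suppose that $P(X > n)$ has the property
$$P(X > n+m \mid X > m) = \frac{P(X > n+m)}{P(X > m)} = P(X > n)$$
Write $a_n = P(X > n)$; then
$$a_{n+m} = a_n a_m$$
and then
$$a_2 = a_1 a_1 = a_1^{2}, \quad \ldots, \quad a_m = a_{m-1} a_1 = a_1^{m-1} a_1 = a_1^{m}$$
then
$$P(X = m) = P(X > m-1) - P(X > m) = a_1^{m-1} - a_1^{m} = a_1^{m-1}(1 - a_1)$$
which is the geometric distribution with $p = a_1$.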
![Page 8: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/8.jpg)
8
Example of a rv with memory
• A Pareto distribution, when used to model a queue, has memory:
$$P(X > n+m \mid X > m) > P(X > n)$$
– For all natural numbers n and m.
– Meaning:
• The probability of waiting n more minutes, given that you have already waited m, is greater than on arrival.
• The rich get richer: the "80-20 rule", which says that 20% of the population owns 80% of the wealth.
• The more you wait, the more you are expected to wait.
![Page 9: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/9.jpg)
9
Example of a rv with memory
• Examples of uses of the Pareto Distribution:
– Frequencies of words in longer texts (a few words are used often, lots of words are used infrequently)
– The sizes of human settlements (few cities, many hamlets/villages)
– File size distribution of Internet traffic which uses the TCP protocol (many smaller files, few larger ones)
– Clusters of Bose-Einstein condensate near absolute zero
– The values of oil reserves in oil fields (a few large fields, many small fields)
– The length distribution in jobs assigned to supercomputers (a few large ones, many small ones)
– The standardized price returns on individual stocks
– Sizes of sand particles
– Sizes of meteorites
– Numbers of species per genus (there is subjectivity involved: the tendency to divide a genus into two or more increases with the number of species in it)
– Areas burnt in forest fires
![Page 10: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/10.jpg)
10
Cities and firms
• Zipf distribution of U.S. firm sizes
Axtell, R. L. (2001), "Zipf distribution of U.S. firm sizes", Science
![Page 11: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/11.jpg)
11
Web sites visits
• Distribution of AOL users' visits to various sites on a December day in 1997
Zipf, Power-laws, and Pareto - a ranking tutorial Lada A. Adamic
Comments from B.A. Huberman
![Page 12: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/12.jpg)
12
Word frequencies in a text
![Page 13: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/13.jpg)
13
Speculative Prices
• Mandelbrot’s paper on long tail densities
![Page 14: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/14.jpg)
14
Speculative Prices
• Mandelbrot’s paper on long tail densities– An interesting result
http://classes.yale.edu/fractals/Panorama/ManuFractals/Internet/Internet4.html
![Page 15: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/15.jpg)
15
Burstiness property
• Burstiness in cities & internet traffic
The image below (composed of several satellite pictures) gives an idea of the degree of economic agglomeration in the world economy.
An introduction to geographical economics
Steven Brakman, Harry Garretsen, and Charles van Marrewijk
![Page 16: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/16.jpg)
16
Analysis of the Pareto distribution
• We will compute the value:
$$P(X > n+m \mid X > m)$$
• Remember the definition:
$$P(X > m) = \left(\frac{m_0}{m}\right)^{\alpha} \quad \text{with } \alpha > 0$$
• The conditioned probability is:
$$P(X > n+m \mid X > m) = \frac{P(X > n+m)}{P(X > m)}$$
![Page 17: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/17.jpg)
17
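Analysis of the Pareto distribution
• We will compute the value:
$$P(X > n+m \mid X > m)$$
• The conditioned probability is:
$$P(X > n+m \mid X > m) = \frac{P(X > n+m)}{P(X > m)} = \frac{\left(\frac{m_0}{n+m}\right)^{\alpha}}{\left(\frac{m_0}{m}\right)^{\alpha}} = \left(\frac{m}{n+m}\right)^{\alpha}$$
• Compared with $P(X > n) = \left(\frac{m_0}{n}\right)^{\alpha}$, the conditioned probability is the larger of the two whenever $m\,n > m_0\,(n+m)$.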
![Page 18: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/18.jpg)
18
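Analysis of the Pareto distribution
• Simulation:
– Message: the longer you wait, the more you will wait
[Figure: P(X>n+10/X>10) and P(X>n) plotted against the value of n, for n = 1 to 12; vertical axis: probability, 0 to 1.2.]
$$P(X > n+m_0 \mid X > m_0) = \left(\frac{m_0}{n+m_0}\right)^{\alpha}, \qquad P(X > n) = \left(\frac{n_0}{n}\right)^{\alpha}, \qquad m_0 = 10 \text{ and } n_0 = 1$$
– Here $m_0 = 10$ is the time already waited and $n_0 = 1$ is the scale of the distribution.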
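The two curves of the simulation can be tabulated with a few lines of code. This is a minimal sketch under stated assumptions: the waiting time already elapsed is 10 and the scale of the Pareto distribution is 1, as on the slide, while the exponent `ALPHA = 1.5` and the function names are hypothetical, since the slide does not state the value of α.

```python
def pareto_tail(x: float, scale: float, alpha: float) -> float:
    """P(X > x) = (scale / x)**alpha for x >= scale, and 1 below the scale."""
    if x <= scale:
        return 1.0
    return (scale / x) ** alpha


def pareto_conditional_tail(n: float, waited: float, scale: float, alpha: float) -> float:
    """P(X > n + waited | X > waited), as a ratio of two tails."""
    return pareto_tail(n + waited, scale, alpha) / pareto_tail(waited, scale, alpha)


ALPHA = 1.5    # hypothetical exponent, not given on the slide
SCALE = 1.0    # scale of the distribution (n_0 = 1 on the slide)
WAITED = 10.0  # time already waited (m_0 = 10 on the slide)

for n in range(1, 13):
    conditional = pareto_conditional_tail(n, WAITED, SCALE, ALPHA)
    unconditional = pareto_tail(n, SCALE, ALPHA)
    print(n, round(conditional, 4), round(unconditional, 4))
```

With these values the conditional column stays above the unconditional one from n = 2 onwards; at n = 1 the unconditional probability is still 1 because n equals the scale.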
![Page 19: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/19.jpg)
19
Negative Binomial distribution
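• Generalization of the Geometric distribution.
• Def.: probability that the r-th success occurs at trial N of a sequence of Bernoulli trials. Trials independent and identically distributed.
• r = 1: N−1 failures (T) followed by the success (H), T T … T H:
$$\binom{N-1}{0}\, p\,(1-p)^{N-1} = p\,(1-p)^{N-1}$$
• r = 2: one success among the first N−1 trials, and the second success at trial N:
$$\binom{N-1}{1}\, p^{2}\,(1-p)^{N-2} = (N-1)\, p^{2}\,(1-p)^{N-2}$$
• General case: r−1 successes among the first N−1 trials, and the r-th success at trial N:
$$\binom{N-1}{r-1}\, p^{r}\,(1-p)^{N-r}$$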
![Page 20: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/20.jpg)
20
Negative Binomial distribution
• General expression:
– Probability that the r-th success occurs at trial N of a sequence of Bernoulli trials. Trials independent and identically distributed.
$$P(X = r) = \binom{N-1}{r-1}\, p^{r}\,(1-p)^{N-r}$$
• Examples:
– Disk redundancies
– Coding theory. Error correction
– Banach Matches.
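The general expression is easy to evaluate directly. The following is a minimal sketch, assuming that p is the success probability of each Bernoulli trial and that N is the trial at which the r-th success occurs; the function name and the numeric values are illustrative only.

```python
from math import comb


def neg_binomial_pmf(N: int, r: int, p: float) -> float:
    """Probability that the r-th success occurs exactly at trial N."""
    if N < r:
        return 0.0  # r successes need at least r trials
    # r-1 successes among the first N-1 trials, then a success at trial N.
    return comb(N - 1, r - 1) * p ** r * (1 - p) ** (N - r)


# Hypothetical values: third success, success probability 0.3.
r, p = 3, 0.3
for N in range(r, r + 6):
    print(N, round(neg_binomial_pmf(N, r, p), 5))
```

For r = 1 the function reduces to the geometric form $p(1-p)^{N-1}$ of the first case.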
![Page 21: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/21.jpg)
21
Banach’s Matches
• Example: A pipe-smoking mathematician carries, at all times, 2 matchboxes, 1 in his left-hand pocket and 1 in his right-hand pocket. Each time he needs a match he is equally likely to take it from either pocket. Consider the moment when the mathematician first discovers that one of his matchboxes is empty. If it is assumed that both matchboxes initially contained N matches, what is the probability that there are exactly k matches in the other box, k = 0, 1, ..., N?
See Feller
![Page 22: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/22.jpg)
22
Banach’s Matches
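• Note that it is a negative binomial: the box found empty must have been chosen N+1 times (N+1 "successes"), the last choice being the moment the mathematician discovers it is empty.
• Success number N+1 occurs at trial (N+1)+(N−k) = 2N−k+1.
• Either box can be the empty one, hence the factor 2, with p = 1/2:
$$\text{Prob}(k) = 2\,P(X = N+1) = 2\binom{2N-k}{N}\, p^{N+1}\,(1-p)^{N-k}$$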
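The matchbox probabilities can be tabulated for a given N. This is a minimal sketch assuming p = 1/2 (each pocket equally likely), in which case the expression simplifies to $\binom{2N-k}{N} (1/2)^{2N-k}$; the function name and the value N = 50 are illustrative choices.

```python
from math import comb


def banach_pmf(k: int, N: int) -> float:
    """Probability that the other box holds exactly k matches (p = 1/2).

    Equivalent to 2 * C(2N-k, N) * (1/2)**(N+1) * (1/2)**(N-k).
    """
    if k < 0 or k > N:
        return 0.0
    return comb(2 * N - k, N) * 0.5 ** (2 * N - k)


N = 50  # hypothetical initial number of matches per box
for k in range(0, 11):
    print(k, round(banach_pmf(k, N), 5))
```

Summing the values over k = 0, ..., N gives 1, which is a quick sanity check on the trial count 2N−k+1.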
![Page 23: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/23.jpg)
23
Banach’s Matches
• Applications:
– Allocations of files in a disk system.
– Heap management.
$$\text{Prob}(k) = 2\,P(X = N+1) = 2\binom{2N-k}{N}\, p^{N+1}\,(1-p)^{N-k}$$
![Page 24: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/24.jpg)
24
Hypergeometric Random
• Models the number of successes k in a sequence of n draws from a finite population without replacement.
– Size of the population: m
– Observed successes: k
– Favorable objects: r
– Number of draws: n
[Figure: urn containing white (Wht) and black (Blck) balls.]
![Page 25: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/25.jpg)
25
Hypergeometric Random
• Random Variable Y=k
– Size of the population: m
– Observed successes: k
– Favorable objects: r
– Number of draws: n
[Figure: urn containing white (Wht) and black (Blck) balls.]
$$P(Y = k) = \frac{\binom{r}{k}\binom{m-r}{n-k}}{\binom{m}{n}}$$
![Page 26: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/26.jpg)
26
Application: capture-recapture problem
• Lake containing m fish where m is unknown. We capture r of the fish, tag them, and return them to the lake.
• Next we capture n of the fish and observe Y, the number of tagged fish in the sample.
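• Equating the tagged fraction of the sample with the tagged fraction of the lake,
$$\frac{Y}{n} = \frac{r}{m}$$
so the population size is estimated as $m \approx \frac{r\,n}{Y}$.
– Size of the population: m
– Observed successes: k
– Favorable objects: r
– Number of draws: n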
![Page 27: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/27.jpg)
27
Application: capture-recapture problem
• Caveat:
– Diffusion problem
– $\frac{Y}{n} = \frac{r}{m}$ takes for granted that the observed value is the mean
$$P(Y = k) = \frac{\binom{r}{k}\binom{m-r}{n-k}}{\binom{m}{n}}$$
![Page 28: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/28.jpg)
28
Application: capture-recapture problem
• Caveat:
– Variability around the most probable value
– $\frac{Y}{n} = \frac{r}{m}$ takes for granted that the observed value is the mean
• Observation:
– $p^{*} = P(\text{white observation} \mid \text{composition of the urn})$
– $P(\text{composition of the urn} \mid \text{white observation})$
[Figure: urn with 3 white and 7 black balls, and the density $f_{\hat{p}}(\hat{p})$ with $p^{*}$ marked.]
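The variability caveat can be made concrete by listing, for a lake of known size, how likely each observed count Y is and which estimate of m it would produce. This is a minimal sketch: the lake size, number of tagged fish and sample size are hypothetical values, and the function names are illustrative.

```python
from math import comb


def hypergeom_pmf(k: int, m: int, r: int, n: int) -> float:
    """P(Y = k): k tagged fish in a sample of n, from m fish of which r are tagged."""
    if k < 0 or n - k < 0:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one.
    return comb(r, k) * comb(m - r, n - k) / comb(m, n)


def estimate_population(r: int, n: int, y: int) -> float:
    """Estimate of m obtained from Y / n = r / m (requires y > 0)."""
    return r * n / y


# Hypothetical lake: 1000 fish, 100 tagged, 50 recaptured.
m, r, n = 1000, 100, 50
for y in range(1, 11):
    print(y, round(hypergeom_pmf(y, m, r, n), 4), round(estimate_population(r, n, y), 1))
```

Only Y = 5 returns the true size of 1000; the neighbouring counts are also likely and give noticeably different estimates.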
![Page 29: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/29.jpg)
29
Example
• A computer cluster of 24 machines, at a given moment has 3 with high load processes. What is the probability of getting k loaded machines if 5 are selected at random?
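$$P(Y = k) = \frac{\binom{3}{k}\binom{21}{5-k}}{\binom{24}{5}}$$
$$P(Y = 0) = \frac{\binom{3}{0}\binom{21}{5}}{\binom{24}{5}} = \frac{19 \cdot 18 \cdot 17}{24 \cdot 23 \cdot 22} = 0.4787$$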
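The whole distribution for this cluster example can be computed in a few lines. This is a minimal sketch using the numbers of the example (24 machines, 3 loaded, 5 selected); the constant and function names are illustrative.

```python
from math import comb

MACHINES, LOADED, SELECTED = 24, 3, 5  # values from the example


def prob_loaded(k: int) -> float:
    """P(Y = k): exactly k loaded machines among the 5 selected."""
    return comb(LOADED, k) * comb(MACHINES - LOADED, SELECTED - k) / comb(MACHINES, SELECTED)


for k in range(LOADED + 1):
    print(k, round(prob_loaded(k), 4))
```

The value for k = 0 should match the 0.4787 worked out by hand, and the four probabilities add up to 1.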
![Page 30: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/30.jpg)
30
Combinatorial Methods. Lotto 6/49
• Lotto 6/49: 6 numbers + 1 complementary are selected from 49. A multiple bet means selecting r from the 49 numbers.
– Probability of guessing k from the winning combination.
– Probability of guessing k AND the complementary
– Probability of guessing k AND NOT the complementary
$$\{b_1, b_2, b_3, \ldots, b_{48}, b_{49}\} \rightarrow \{i_1, i_2, \ldots, i_7\} \rightarrow \{b_{i_1}, b_{i_2}, \ldots, b_{i_7}\}$$
[Figure: the drawn numbers $\{b_{i_1}\}, \{b_{i_2}\}, \ldots, \{b_{i_6}\}$ and the complementary $\{b_{i_7}\}$.]
Example taken from VÉLEZ , HERNÁNDEZ, Cálculo de Probabilidades
![Page 31: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/31.jpg)
31
Combinatorial Methods. Lotto 6/49
• Number of ways of guessing k results:
$$\binom{r}{k}\binom{49-r}{6-k}$$
– $\binom{r}{k}$: different sets with the winning numbers $\{i_1, i_2, \ldots, i_k\}$ among the r marked.
– $\binom{49-r}{6-k}$: different sets with the non-selected winning numbers.
• Probability of guessing k:
$$\Pr(k) = \frac{\binom{r}{k}\binom{49-r}{6-k}}{\binom{49}{6}}$$
![Page 32: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/32.jpg)
32
Combinatorial Methods. Lotto 6/49
• Probability of guessing k AND the complementary
$$\Pr(k \text{ and the complementary}) = \frac{\binom{r}{k}\binom{49-r}{6-k}\binom{r-k}{1}}{43\binom{49}{6}} = \frac{\binom{r}{k}\binom{49-r}{6-k}\,(r-k)}{43\binom{49}{6}}$$
– $\binom{49-r}{6-k}$: different sets with the non-selected winning numbers.
– The complementary can be any of the remaining r−k marked numbers, out of the 43 numbers left once the 6 winning ones are drawn.
[Figure: the drawn numbers $\{b_{i_1}\}, \ldots, \{b_{i_6}\}$ and the complementary $\{b_{i_7}\}$.]
33
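Combinatorial Methods. Lotto 6/49
• Probability of guessing k AND NOT the complementary
$$\Pr(k \text{ and not the complementary}) = \frac{\binom{r}{k}\binom{49-r}{6-k}\binom{49-(r+6-k)}{1}}{43\binom{49}{6}} = \frac{\binom{r}{k}\binom{49-r}{6-k}\,\bigl(43-(r-k)\bigr)}{43\binom{49}{6}}$$
[Figure: the drawn numbers $\{b_{i_1}\}, \ldots, \{b_{i_6}\}$ and the complementary $\{b_{i_7}\}$.]
• The complementary cannot be
– in the marked r,
– nor in the (6−k) non-marked but winning numbers.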
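The three Lotto 6/49 probabilities can be evaluated and cross-checked against each other. This is a minimal sketch, assuming the reading of the formulas in which the complementary is drawn from the 43 numbers left after the 6 winning ones; the function names and the choice r = 8 are illustrative.

```python
from math import comb

TOTAL, DRAWN = 49, 6
REMAINING = TOTAL - DRAWN  # 43 numbers left for the complementary


def p_guess(r: int, k: int) -> float:
    """Probability that exactly k of the 6 winning numbers are among the r marked."""
    return comb(r, k) * comb(TOTAL - r, DRAWN - k) / comb(TOTAL, DRAWN)


def p_guess_with_complementary(r: int, k: int) -> float:
    """k winning numbers guessed AND the complementary among the r-k other marked numbers."""
    return p_guess(r, k) * (r - k) / REMAINING


def p_guess_without_complementary(r: int, k: int) -> float:
    """k winning numbers guessed AND the complementary not marked."""
    return p_guess(r, k) * (REMAINING - (r - k)) / REMAINING


r = 8  # hypothetical multiple bet with 8 marked numbers
for k in range(DRAWN + 1):
    print(k, p_guess(r, k), p_guess_with_complementary(r, k), p_guess_without_complementary(r, k))
```

For every k the last two columns add up to the first one, which is the consistency one expects between the three slides.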
![Page 34: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/34.jpg)
34
Binomial Random Variables
• Most important discrete probability distribution.
• Model:
– Two possible outcomes: Success/Failure
– Probabilities: Success=p / Failure=1-p
– We compound n independent Bernoulli trials.
– Define the random variable:
X = Total number of successes in n indep. Bernoulli trials
![Page 35: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/35.jpg)
35
Binomial Random Variables
• Model:
– Two possible outcomes: Success/Failure
– Probabilities: Success=p / Failure=1-p
– We compound n independent Bernoulli trials.
X = Total number of successes in n indep. Bernoulli trials
• Distribution:
$$P(X = k) = \binom{n}{k}\, p^{k}\,(1-p)^{n-k}, \quad k = 0, 1, 2, 3, \ldots, n$$
[Figure: a sequence of n trials (H/T) containing k successes.]
![Page 36: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/36.jpg)
36
Binomial Random Variables. Example
• Overbooking:
– An aircraft has a capacity of 150 seats. The airline management sells 160 tickets in order to protect themselves against no-show passengers.
– Experience shows that the probability of a passenger being a no-show is 0.1. The booked passengers act independently of each other.
– Given this overbooking strategy, what is the probability that some passengers will be left out?
Taken from H.Tijms, understanding probability
![Page 37: Exercises of Random Variables - docencia.ac.upc.edu](https://reader031.vdocuments.mx/reader031/viewer/2022021408/6209b0631ea8160b220fa18e/html5/thumbnails/37.jpg)
37
Binomial Random Variables. Example
• Overbooking:
– The problem can be seen as 160 independent trials of a Bernoulli experiment with a success rate of 9/10, where a passenger who shows up for the flight is counted as a success.
– We define X = number of passengers that show up.
– X is binomially distributed with parameters n=160 and p=9/10.
– The probability is P(X>150):
$$P(X > 150) = \sum_{k=151}^{n} \binom{n}{k}\, p^{k}\,(1-p)^{n-k} = 0.0359$$
[Figure: a sequence of 160 trials (H/T) with more than 150 successes.]
Taken from H.Tijms, understanding probability
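The overbooking probability is a finite sum and can be evaluated directly. This is a minimal sketch using the parameters of the example (n = 160, p = 9/10, capacity 150); the function names are illustrative, and the slide reports the result as 0.0359.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X binomial with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_more_than(threshold: int, n: int, p: float) -> float:
    """P(X > threshold): sum of the pmf for k = threshold + 1, ..., n."""
    return sum(binomial_pmf(k, n, p) for k in range(threshold + 1, n + 1))


# Parameters of the overbooking example.
TICKETS, SHOW_UP, SEATS = 160, 0.9, 150
print("P(X > 150) =", round(prob_more_than(SEATS, TICKETS, SHOW_UP), 4))
```

Changing `TICKETS` shows how quickly the risk of leaving passengers out grows with each extra ticket sold.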