Probability inequalities
--- Law of Large Numbers
May 15, 2019
来嶋 秀治 (Shuji Kijima)
Dept. Informatics,
Graduate School of ISEE
Today’s topics
• expectation,
• Markov’s inequality
• variance, covariance, moment
• Chebyshev’s inequality
• Law of large numbers
確率統計特論 (Probability & Statistics)
Lesson 4
Expectation, variance, moment
Today’s topic 2
Expectation
Expectation (期待値) of a discrete random variable X is defined by
E[X] = \sum_{x \in \Omega} x \cdot f(x),
only when the right-hand side converges absolutely (絶対収束),
i.e., \sum_{x \in \Omega} |x| \cdot f(x) < \infty holds.
If that is not the case, we say “the expectation does not exist.”
Expectation (期待値) of a continuous random variable X is defined by
E[X] = \int_{-\infty}^{+\infty} x \cdot f(x) \, dx.
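As a quick numerical illustration of the discrete definition, here is a minimal Python sketch; the pmf (a fair die) and the helper name `expectation` are arbitrary choices for illustration.

```python
def expectation(pmf):
    """E[X] = sum over x of x * f(x), for a pmf given as a dict {x: f(x)}."""
    assert abs(sum(pmf.values()) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # fair six-sided die
print(expectation(die))                # 3.5
```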
Compute expectations of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1, p).
(*ii) Binomial distribution B(n, p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po(λ).
Continuous
(v) Exponential distribution Ex(α).
(vi) Normal distribution N(μ, σ^2).
Ex. Expectation of Binomial distr.
Thm.
The expectation of X ∼ B(n, p) is np.
Proof.
\sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}
= \sum_{k=0}^{n} k \frac{n!}{k!\,(n-k)!} p^k (1-p)^{n-k}
= \sum_{k=1}^{n} k \frac{n!}{k!\,(n-k)!} p^k (1-p)^{n-k}
= \sum_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!} p^k (1-p)^{n-k}
= \sum_{k=1}^{n} np \frac{(n-1)!}{(k-1)!\,(n-k)!} p^{k-1} (1-p)^{n-k}
= np \sum_{k'=0}^{n-1} \binom{n-1}{k'} p^{k'} (1-p)^{n-1-k'}
= np.
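The closing step can also be checked numerically; a minimal sketch that evaluates the finite sum directly and compares it with np (the values of n and p are arbitrary):

```python
from math import comb

def binom_mean_by_sum(n, p):
    # sum_{k=0}^{n} k * C(n, k) * p^k * (1 - p)^(n - k)
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

n, p = 20, 0.3
print(binom_mean_by_sum(n, p), n * p)  # both equal 6.0 up to rounding
```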
Ex. Expectation of Geom. distr.
Thm.
The expectation of X ∼ Ge(p) is (1-p)/p.
Proof.
    E[X]        = 0·p + 1·(1-p)p + 2·(1-p)^2 p + 3·(1-p)^3 p + ⋯
 −) (1-p)·E[X]  =       0·(1-p)p + 1·(1-p)^2 p + 2·(1-p)^3 p + ⋯
 ─────────────────────────────────────────────────────────
    p·E[X]      = (1-p)p + (1-p)^2 p + (1-p)^3 p + ⋯
                = (1-p)p / (1 − (1-p)) = 1 − p.
Thus E[X] = (1-p)/p.
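A minimal simulation sketch of the same fact, reading Ge(p) as the number of failures before the first success, as above (the parameter p, seed, and sample size are arbitrary):

```python
import random

def geometric_failures(p, rng):
    """Number of failures before the first success (support 0, 1, 2, ...)."""
    k = 0
    while rng.random() >= p:  # each draw succeeds with probability p
        k += 1
    return k

rng = random.Random(0)
p, trials = 0.25, 200_000
empirical = sum(geometric_failures(p, rng) for _ in range(trials)) / trials
print(empirical, (1 - p) / p)  # both close to 3.0
```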
Properties of Expectations
Thm.
For an arbitrary constant c,
E[c] = c,   E[cX] = c·E[X],   E[X + c] = E[X] + c.
Linearity of expectations (discrete random variables)
Thm. (linearity of expectation; 期待値の線形性)
E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i]
Proof (for n = 2; the general case follows by induction).
E[X + Y]
= \sum_x \sum_y (x + y) Pr[X = x ∩ Y = y]
= \sum_x \sum_y (x + y) f(x, y)
= \sum_x \sum_y x f(x, y) + \sum_x \sum_y y f(x, y)
= \sum_x x \sum_y f(x, y) + \sum_y y \sum_x f(x, y)
= \sum_x x f(x) + \sum_y y f(y)
= E[X] + E[Y].
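Note that the proof nowhere uses independence; a minimal sketch illustrating that linearity holds even for strongly dependent variables (the particular dependence Y = 2X − 1 is an arbitrary example):

```python
import random

rng = random.Random(1)
xs = [rng.randint(0, 9) for _ in range(100_000)]
ys = [2 * x - 1 for x in xs]  # Y depends deterministically on X

mean = lambda v: sum(v) / len(v)
print(mean([x + y for x, y in zip(xs, ys)]), mean(xs) + mean(ys))  # nearly equal
```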
Linearity of expectations (continuous random variables)
Thm. (linearity of expectation; 期待値の線形性)
E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i]
Proof (for n = 2).
E[X + Y]
= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x + y) f(x, y) \, dx \, dy
= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x f(x, y) \, dx \, dy + \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} y f(x, y) \, dx \, dy
= \int_{-\infty}^{+\infty} x \Big( \int_{-\infty}^{+\infty} f(x, y) \, dy \Big) dx + \int_{-\infty}^{+\infty} y \Big( \int_{-\infty}^{+\infty} f(x, y) \, dx \Big) dy
= \int_{-\infty}^{+\infty} x f(x) \, dx + \int_{-\infty}^{+\infty} y f(y) \, dy
= E[X] + E[Y].
Application of linearity of expectation
Thm.
The expectation of X ∼ B(n, p) is np.
Proof.
Suppose X_1, …, X_n are i.i.d. B(1, p); then Y := X_1 + ⋯ + X_n follows B(n, p).
E[X_i] = 1·p + 0·(1 − p) = p
E[Y] = E[\sum_i X_i] = \sum_i E[X_i] = \sum_i p = np.
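A minimal Monte Carlo sketch of the same argument, building B(n, p) as a sum of i.i.d. Bernoulli draws (the values of n, p, the seed, and the sample size are arbitrary):

```python
import random

def binomial_sample(n, p, rng):
    # Y = X_1 + ... + X_n with X_i i.i.d. B(1, p)
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(2)
n, p, trials = 50, 0.3, 20_000
empirical = sum(binomial_sample(n, p, rng) for _ in range(trials)) / trials
print(empirical, n * p)  # both close to 15
```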
Moment & Variance
Today’s topic 2
Motivation
Consider the following three distributions.
Distr. 1.
• Pr[X = 0] = 1/3
• Pr[X = 1] = 1/3
• Pr[X = 2] = 1/3
Distr. 2.
• Pr[X = k] = 1/2^{k+1} for k = 0, 1, 2, …
Distr. 3.
• Pr[X = 0] = 2/3
• Pr[X = 1] = 0
• Pr[X = 2^k] = 1/4^k for k = 1, 2, …
E[X] = 1 for each of the three distributions.
Motivation
Consider the following three distributions.
Distr. 1.
• Pr[X = 0] = 1/3
• Pr[X = 1] = 1/3
• Pr[X = 2] = 1/3
Distr. 2.
• Pr[X = k] = 1/2^{k+1} for k = 0, 1, 2, …
Distr. 3.
• Pr[X = 0] = 2/3
• Pr[X = 1] = 0
• Pr[X = 2^k] = 1/4^k for k = 1, 2, …
Distr. 1:  E[X] = 1,  Pr[X > 1] = 1/3,  Pr[X > 2] = 0,     Pr[X > 8] = 0
Distr. 2:  E[X] = 1,  Pr[X > 1] = 1/4,  Pr[X > 2] = 1/8,   Pr[X > 8] = 1/512
Distr. 3:  E[X] = 1,  Pr[X > 1] = 1/3,  Pr[X > 2] = 1/12,  Pr[X > 8] = 1/192
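A minimal sketch that evaluates these means and tail probabilities numerically (the infinite supports of Distr. 2 and Distr. 3 are truncated at a large cutoff, which changes the results only negligibly):

```python
CUTOFF = 200  # truncation point for the infinite supports

d1 = {0: 1/3, 1: 1/3, 2: 1/3}
d2 = {k: 1 / 2**(k + 1) for k in range(CUTOFF)}
d3 = {0: 2/3, **{2**k: 1 / 4**k for k in range(1, CUTOFF)}}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def tail(pmf, a):  # Pr[X > a]
    return sum(p for x, p in pmf.items() if x > a)

for name, d in [("Distr. 1", d1), ("Distr. 2", d2), ("Distr. 3", d3)]:
    print(name, mean(d), tail(d, 1), tail(d, 2), tail(d, 8))
```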
Definitions
k-th moment (k次の積率) of X:   E[X^k]
variance (分散) of X:   Var[X] := E[(X − E[X])^2]
standard deviation (標準偏差) of X:   σ(X) := \sqrt{Var[X]}
covariance (共分散) of X and Y:   Cov[X, Y] := E[(X − E[X])(Y − E[Y])]
Compute the variances of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1, p).
(*ii) Binomial distribution B(n, p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po(λ).
Continuous
(v) Exponential distribution Ex(α).
(vi) Normal distribution N(μ, σ^2).
Properties of variance and covariance
Thm.
Var[X] = E[X^2] − (E[X])^2
Cov[X, Y] = E[XY] − E[X]·E[Y]
Var[X + Y] = Var[X] + Var[Y] + 2·Cov[X, Y]
Proof (first two identities).
E[(X − E[X])^2] = E[X^2 − 2X·E[X] + (E[X])^2]
= E[X^2] − 2E[X]·E[X] + (E[X])^2
= E[X^2] − (E[X])^2.
Cov[X, Y] = E[(X − E[X])(Y − E[Y])]
= E[XY − X·E[Y] − Y·E[X] + E[X]·E[Y]]
= E[XY] − 2E[X]·E[Y] + E[X]·E[Y]
= E[XY] − E[X]·E[Y].
Properties of variance and covariance
Thm.
Var[X] = E[X^2] − (E[X])^2
Cov[X, Y] = E[XY] − E[X]·E[Y]
Var[X + Y] = Var[X] + Var[Y] + 2·Cov[X, Y]
Proof (third identity).
Var[X + Y] = E[(X + Y)^2] − (E[X + Y])^2
= E[X^2 + 2XY + Y^2] − (E[X] + E[Y])^2
= (E[X^2] − (E[X])^2) + (E[Y^2] − (E[Y])^2) + 2(E[XY] − E[X]·E[Y])
= Var[X] + Var[Y] + 2·Cov[X, Y].
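A minimal numerical check of these identities on correlated samples (the particular joint distribution of X and Y below is an arbitrary example):

```python
import random

rng = random.Random(3)
xs = [rng.gauss(0, 1) for _ in range(100_000)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]  # Y correlated with X

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return mean([(a - m) ** 2 for a in v])

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])

lhs = var([x + y for x, y in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(lhs, rhs)  # agree up to floating-point error
```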
Properties of var and cov (for independent X and Y)
Thm. If X and Y are independent,
E[XY] = E[X]·E[Y]
Cov[X, Y] = 0
Var[X + Y] = Var[X] + Var[Y]
Proof.
E[XY] = \sum_x \sum_y x y Pr[X = x ∧ Y = y]
= \sum_x \sum_y x y Pr[X = x]·Pr[Y = y]   (since X and Y are independent)
= \Big( \sum_x x Pr[X = x] \Big) \Big( \sum_y y Pr[Y = y] \Big)
= E[X]·E[Y].
Cov[X, Y] = E[XY] − E[X]·E[Y] = 0.
The third claim then follows from Var[X + Y] = Var[X] + Var[Y] + 2·Cov[X, Y].
Properties of Var and Cov
Thm. If X_1, …, X_n are mutually independent,
Var[X_1 + ⋯ + X_n] = Var[X_1] + ⋯ + Var[X_n]
Additivity of variance for independent variables: binomial distr.
Thm.
The variance of X ∼ B(n, p) is np(1 − p).
Proof.
Suppose X_1, …, X_n are independent and identically distributed B(1, p);
then Y := X_1 + ⋯ + X_n follows B(n, p).
E[X_i^2] = 1^2·p + 0^2·(1 − p) = p
Var[X_i] = E[X_i^2] − (E[X_i])^2 = p − p^2 = p(1 − p)
Var[Y] = Var[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} Var[X_i] = \sum_{i=1}^{n} p(1 − p) = np(1 − p),
since X_1, …, X_n are mutually independent.
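A minimal simulation sketch of this result (the values of n, p, the seed, and the sample size are arbitrary):

```python
import random

rng = random.Random(4)
n, p, trials = 50, 0.3, 50_000
samples = [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]

m = sum(samples) / trials
empirical_var = sum((s - m) ** 2 for s in samples) / trials
print(empirical_var, n * p * (1 - p))  # both close to 10.5
```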
Expectation (contd.)
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
• Bikkuriman stickers
• Pokémon cards
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
• Bikkuriman stickers
• Pokémon cards
Suppose you have already drawn k − 1 distinct kinds of coupon, and let X_k denote the number of additional draws needed to obtain the k-th distinct kind.
Each draw yields a new kind with probability p_k := (n − (k − 1))/n,
so X_k is geometrically distributed (counting the successful draw as well), and the expected number of draws is
E[X_k] = 1/p_k = n/(n − k + 1).
Thm.
n ln n ≤ E[X] ≤ n(1 + ln n)
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
• Bikkuriman stickers
• Pokémon cards
E[X] = E[\sum_{i=1}^{n} X_i] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{n}{n − i + 1} = n \sum_{i'=1}^{n} \frac{1}{i'}   (harmonic number)
Bounding the harmonic number:
ln n = \int_1^n \frac{1}{x} dx ≤ \sum_{k=1}^{n} \frac{1}{k} = 1 + \sum_{k=2}^{n} \frac{1}{k} ≤ 1 + \int_1^n \frac{1}{x} dx = 1 + ln n.
Hence n ln n ≤ E[X] ≤ n(1 + ln n).
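A minimal simulation sketch comparing the empirical mean with the exact value n·H_n and the two bounds (the values of n, the number of trials, and the seed are arbitrary):

```python
import math
import random

def draws_to_complete(n, rng):
    """Draw uniform random coupons until all n kinds have appeared; return the count."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

rng = random.Random(5)
n, trials = 100, 2_000
empirical = sum(draws_to_complete(n, rng) for _ in range(trials)) / trials
exact = n * sum(1 / i for i in range(1, n + 1))  # E[X] = n * H_n
print(n * math.log(n), exact, empirical, n * (1 + math.log(n)))
# lower bound ~460.5, exact ~518.7 ~ empirical mean, upper bound ~560.5
```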
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Markov’s inequality
Today’s topic 1
Markov’s inequality
Thm. (Markov’s inequality)
Let X be a nonnegative random variable. Then
Pr[X ≥ a] ≤ E[X]/a
holds for any a > 0.
Markov’s inequality
Thm. (Markov’s inequality)
Let X be a nonnegative random variable. Then Pr[X ≥ a] ≤ E[X]/a holds for any a > 0.
Proof.
\frac{E[X]}{a} = \int_0^{\infty} \frac{x}{a} f(x) \, dx = \int_0^{a} \frac{x}{a} f(x) \, dx + \int_a^{\infty} \frac{x}{a} f(x) \, dx
≥ \int_a^{\infty} \frac{x}{a} f(x) \, dx ≥ \int_a^{\infty} f(x) \, dx = Pr[X ≥ a].
Thus Pr[X ≥ a] ≤ E[X]/a.
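A minimal empirical check of the inequality (the exponential distribution with mean 1 is an arbitrary choice of nonnegative X):

```python
import random

rng = random.Random(6)
xs = [rng.expovariate(1.0) for _ in range(200_000)]
ex = sum(xs) / len(xs)

for a in [1, 2, 5, 10]:
    tail = sum(x >= a for x in xs) / len(xs)
    print(a, tail, ex / a, tail <= ex / a)  # Pr[X >= a], Markov bound, bound holds
```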
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Using Markov’s inequality,
Pr[X ≥ m] ≤ E[X]/m ≤ n(1 + ln n)/m.
e.g., n = 100, m = 1000:  Pr[completion] ≥ 1 − Pr[X ≥ 1001] ≥ 0.44
e.g., n = 100, m = 10000: Pr[completion] ≥ 1 − Pr[X ≥ 10001] ≥ 0.94
Too loose?
rem. n ln n ≤ E[X] ≤ n(1 + ln n)
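The numbers above can be reproduced with a few lines (a sketch; it uses the upper bound n(1 + ln n) in place of the unknown E[X]):

```python
import math

n = 100
upper_ex = n * (1 + math.log(n))  # E[X] <= n(1 + ln n), about 560.5

for m in [1000, 10000]:
    markov = upper_ex / (m + 1)   # bound on Pr[X >= m + 1]
    print(m, round(markov, 2), "Pr[completion] >=", round(1 - markov, 2))
# m = 1000  -> Pr[completion] >= 0.44
# m = 10000 -> Pr[completion] >= 0.94
```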
Chebyshev’s inequality
Today’s topic 3
Chebyshev’s inequality
Thm. (Chebyshev’s inequality)
For any a > 0,
Pr[|X − E[X]| ≥ a] ≤ Var[X]/a^2.
Proof.
Remark that
Pr[|X − E[X]| ≥ a] = Pr[(X − E[X])^2 ≥ a^2].
Using Markov’s inequality,
Pr[(X − E[X])^2 ≥ a^2] ≤ E[(X − E[X])^2]/a^2 = Var[X]/a^2.
Chebyshev’s inequality
Cor. (Chebyshev’s inequality)
For any t > 0 (assuming E[X] > 0),
Pr[X ≥ (1 + t)·E[X]] ≤ Var[X]/(t·E[X])^2.
Proof.
Pr[X ≥ (1 + t)·E[X]] = Pr[X − E[X] ≥ t·E[X]]
≤ Pr[|X − E[X]| ≥ t·E[X]]
≤ Var[X]/(t·E[X])^2.
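A minimal empirical check of the corollary (X here is a sum of 20 Bernoulli(0.3) variables; the distribution, the values of t, and the sample size are arbitrary choices):

```python
import random

rng = random.Random(7)
samples = [sum(rng.random() < 0.3 for _ in range(20)) for _ in range(200_000)]

n = len(samples)
ex = sum(samples) / n
var = sum((s - ex) ** 2 for s in samples) / n

for t in [0.5, 1.0, 1.5]:
    tail = sum(s >= (1 + t) * ex for s in samples) / n
    print(t, tail, var / (t * ex) ** 2)  # empirical tail vs. Chebyshev bound
```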
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Using Markov’s inequality,
Pr[X ≥ m] ≤ E[X]/m ≤ n(1 + ln n)/m.
e.g., n = 100, m = 1000:  Pr[completion] ≥ 1 − Pr[X ≥ 1001] ≥ 0.44
e.g., n = 100, m = 10000: Pr[completion] ≥ 1 − Pr[X ≥ 10001] ≥ 0.94
Too loose?
rem. n ln n ≤ E[X] ≤ n(1 + ln n)
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Using Chebyshev’s inequality,
Pr[X ≥ (1 + t)·E[X]] ≤ Var[X]/(t·E[X])^2
rem. n ln n ≤ E[X] ≤ n(1 + ln n)
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Var[X] = \sum_{i=1}^{n} Var[X_i] = \sum_{i=1}^{n} \frac{1 − p_i}{p_i^2}   (the X_i are mutually independent)
≤ \sum_{i=1}^{n} \frac{1}{p_i^2} = \sum_{i=1}^{n} \Big( \frac{n}{n − i + 1} \Big)^2
= n^2 \sum_{i=1}^{n} \frac{1}{i^2} ≤ n^2 · \frac{π^2}{6}
(cf. Ex. 2: the variance of the geometric distribution)
Ex. Coupon collector
There are n kinds of coupons.
How many coupons do you need to draw, in expectation, before having drawn each coupon at least once?
What is the probability of completion after m trials?
• Bikkuriman stickers
• Pokémon cards
Using Chebyshev’s inequality,
Pr[X ≥ (1 + t)·E[X]] ≤ Var[X]/(t·E[X])^2 ≤ \frac{n^2 π^2 / 6}{t^2 (n ln n)^2} = \frac{π^2}{6 t^2 (ln n)^2}
rem. n ln n ≤ E[X] ≤ n(1 + ln n)
e.g., n = 100, m = 1000 (t ≃ m/(n ln n) − 1 ≃ 1.1):
Pr[Completion] ≥ 1 − Pr[X ≥ 1000] ≳ 0.94
Still loose? → Chernoff’s bound
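The Chebyshev-based estimate above, for n = 100 and m = 1000, can be reproduced as follows (a sketch; as on the slide, n ln n stands in for the unknown E[X]):

```python
import math

n, m = 100, 1000
t = m / (n * math.log(n)) - 1                       # about 1.17 (slide rounds to 1.1)
bound = math.pi ** 2 / (6 * t ** 2 * math.log(n) ** 2)
print(round(t, 2), round(bound, 3), "Pr[completion] >=", round(1 - bound, 2))
# roughly: Pr[X >= 1000] <= 0.06, so Pr[completion] >= 0.94
```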
Law of Large Numbers
Law of large numbers (大数の法則)
Def.
A sequence {Y_n} converges to Y in probability (Yに確率収束する) if
∀ε > 0, \lim_{n→∞} Pr[|Y_n − Y| < ε] = 1.
Thm. (law of large numbers; 大数の法則)
Let the r.v. X_1, …, X_n be i.i.d. (independent and identically distributed; 独立同一分布) with expectation μ and variance σ^2.
Then Y_n := (X_1 + ⋯ + X_n)/n converges to μ in probability; i.e.,
∀ε > 0, \lim_{n→∞} Pr[|(X_1 + ⋯ + X_n)/n − μ| < ε] = 1.
Thm. (law of large numbers; 大数の法則)
Let the r.v. X_1, …, X_n be i.i.d. with expectation μ and variance σ^2.
Then Y_n := (X_1 + ⋯ + X_n)/n converges to μ in probability; i.e.,
∀ε > 0, \lim_{n→∞} Pr[|(X_1 + ⋯ + X_n)/n − μ| < ε] = 1.
E[Y_n] = E[(X_1 + ⋯ + X_n)/n] = (E[X_1] + ⋯ + E[X_n])/n = μ
Var[Y_n] = Var[(X_1 + ⋯ + X_n)/n] = (Var[X_1] + ⋯ + Var[X_n])/n^2 = σ^2/n
Thm. (law of large numbers; 大数の法則)
Recall: let the r.v. X_1, …, X_n be i.i.d. with expectation μ and variance σ^2;
then (X_1 + ⋯ + X_n)/n converges to μ in probability, i.e.,
∀ε > 0, \lim_{n→∞} Pr[|(X_1 + ⋯ + X_n)/n − μ| < ε] = 1.
Recall also that E[Y_n] = μ and Var[Y_n] = σ^2/n, and Chebyshev’s inequality:
for any a > 0, Pr[|X − E[X]| ≥ a] ≤ Var[X]/a^2.
Using Chebyshev’s inequality,
Pr[|(X_1 + ⋯ + X_n)/n − μ| ≥ ε] ≤ \frac{σ^2}{n ε^2} → 0 as n → ∞,
which proves the theorem.
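A minimal simulation sketch of the law of large numbers and of the Chebyshev bound used in the proof (fair six-sided dice, with μ = 3.5 and σ^2 = 35/12, are an arbitrary choice of i.i.d. variables):

```python
import random

rng = random.Random(8)
mu, sigma2, eps = 3.5, 35 / 12, 0.1
reps = 500  # independent experiments per value of n

for n in [10, 100, 1000, 10000]:
    deviations = sum(
        abs(sum(rng.randint(1, 6) for _ in range(n)) / n - mu) >= eps
        for _ in range(reps)
    )
    print(n, deviations / reps, min(1.0, sigma2 / (n * eps ** 2)))
    # empirical Pr[|mean - mu| >= eps] vs. Chebyshev bound sigma^2 / (n eps^2)
```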