
© Stanley Chan 2019. All Rights Reserved.

ECE 302: Chapter 04: Continuous Random Variables

Fall 2019

Prof Stanley Chan

School of Electrical and Computer Engineering, Purdue University


1. Continuous Random Variable


Continuous Random Variable

Sample space becomes continuous

E.g., time, area

Characterized by histogram too!

Not PMF, but Probability Density Function (PDF)


Continuous Random Variable

Definition

The probability density function (PDF) of a random variable X is a function which, when integrated over an interval [a, b], yields the probability of obtaining a ≤ X(ξ) ≤ b. We denote the PDF of X as f_X(x), and

P[a ≤ X ≤ b] = ∫_a^b f_X(x) dx.  (1)


Continuous and discrete unified!

If X is continuous,

P[a ≤ X ≤ b] = ∫_a^b f_X(x) dx.

If X is discrete with a single mass point x₀ in [a, b],

P[a ≤ X ≤ b] = P[X = x₀] = p_X(x₀) = ∫_a^b p_X(x₀) δ(x − x₀) dx,

where the integrand p_X(x₀) δ(x − x₀) plays the role of f_X(x).


Property

A PDF f_X(x) should satisfy

∫_{−∞}^{∞} f_X(x) dx = 1.  (2)

Example. Let f_X(x) = c(1 − x²) for −1 ≤ x ≤ 1, and 0 otherwise. Find c.
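Worked check (my addition, not part of the original slide): normalization forces ∫_{−1}^{1} c(1 − x²) dx = 4c/3 = 1, so c = 3/4. A minimal SymPy sketch:

```python
# Solve for the normalization constant c of f_X(x) = c(1 - x^2) on [-1, 1].
from sympy import symbols, integrate, solve, Eq

x = symbols('x')
c = symbols('c', positive=True)

total = integrate(c * (1 - x**2), (x, -1, 1))   # = 4*c/3
print(solve(Eq(total, 1), c))                   # [3/4]
```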


Expectation

Definition (Expectation)

The expectation of a continuous random variable X is

E[X] = ∫_{−∞}^{∞} x f_X(x) dx.  (3)


Expectation

Definition (Expectation of Function)

The expectation of a function g of a continuous random variable X is

E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx.  (4)

Definition (Moment)

The kth moment of a continuous random variable X is

E[X^k] = ∫_{−∞}^{∞} x^k f_X(x) dx.  (5)


Variance

Definition (Variance)

The variance of a continuous random variable X is

Var[X] = E[(X − µ_X)²] = ∫_{−∞}^{∞} (x − µ_X)² f_X(x) dx,

where µ_X := E[X].

Remark: It also holds that

Var[X] = E[X²] − E[X]².
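Illustrative sketch (my addition): the moments of the earlier example density f_X(x) = (3/4)(1 − x²) on [−1, 1], checking Var[X] = E[X²] − E[X]² directly:

```python
# Moments of the example density f_X(x) = (3/4)(1 - x^2) on [-1, 1].
from sympy import symbols, integrate, Rational

x = symbols('x')
f = Rational(3, 4) * (1 - x**2)

mean = integrate(x * f, (x, -1, 1))          # E[X]   = 0 (by symmetry)
m2   = integrate(x**2 * f, (x, -1, 1))       # E[X^2] = 1/5
var  = integrate((x - mean)**2 * f, (x, -1, 1))
print(mean, m2, var, var == m2 - mean**2)    # 0 1/5 1/5 True
```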


2. Common Continuous Random Variables


Uniform Distribution

Definition (Uniform Distribution)

Let X be a continuous uniform random variable. The PDF of X is

f_X(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise,  (6)

where [a, b] is the interval on which X is defined. We write

X ∼ Uniform(a, b)

to say that X is drawn from a uniform distribution on an interval [a, b].


Mean and Variance

Proposition (Mean/Variance of Uniform Distribution)

If X ∼ Uniform(a, b), then

E[X] = (a + b)/2, and Var[X] = (b − a)²/12.
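Quick Monte Carlo sanity check (my addition; the endpoints a = 2, b = 5 are arbitrary):

```python
# Sample mean/variance of Uniform(a, b) vs. the closed-form (a+b)/2 and (b-a)^2/12.
import numpy as np

a, b = 2.0, 5.0
rng = np.random.default_rng(0)
x = rng.uniform(a, b, size=1_000_000)

print(x.mean(), (a + b) / 2)           # both ≈ 3.5
print(x.var(), (b - a) ** 2 / 12)      # both ≈ 0.75
```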


Application of Uniform Distribution

Analysis of Uniform Quantizer

Assumption: X[n] is a random signal.

Quantization: partition the amplitude of X[n] into a discrete set of levels.


Application of Uniform Distribution

We can model the quantization error as uniformly distributed. If we let ∆ be the height (step size) of the quantization interval, then

E_q[n] ∼ Uniform[−∆/2, ∆/2].

The mean and variance of E_q[n] are

E[E_q[n]] = 0,  Var[E_q[n]] = ∆²/12.
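A small simulation sketch (my addition): rounding a signal to the nearest quantizer level with step size ∆ produces an error whose variance is close to ∆²/12. The input signal model below is an arbitrary assumption for illustration.

```python
# Uniform quantizer: quantization error vs. the Uniform[-Δ/2, Δ/2] model.
import numpy as np

rng = np.random.default_rng(0)
delta = 0.1                               # quantization step size Δ (arbitrary)
x = rng.uniform(-1, 1, size=1_000_000)    # X[n]: an assumed random signal model

xq = delta * np.round(x / delta)          # quantize to the nearest level
e = xq - x                                # quantization error E_q[n]

print(e.mean())                           # ≈ 0
print(e.var(), delta**2 / 12)             # both ≈ 8.33e-4
```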


Application of Uniform Distribution

Knowing the distribution of Eq[n] is important:

It helps us design error compensation algorithms

It helps us understand the limit of data compression

It helps us generalize the concept to more advanced coding schemes

R. Gray, Source Coding Theory, Kluwer Academic Publishers, 1990.


Exponential distribution

Definition (Exponential Distribution)

Let X be an exponential random variable. The PDF of X is

f_X(x) = λe^{−λx} for x ≥ 0, and 0 otherwise,  (7)

where λ > 0 is a parameter. We write

X ∼ Exponential(λ)

to say that X is drawn from an exponential distribution of parameter λ.

Example. The inter-arrival time of a Poisson process.


Effect of λ

Proposition (Mean/Variance of Exponential Distribution)

If X ∼ Exponential(λ), then

E[X] = 1/λ, and Var[X] = 1/λ².
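Quick check (my addition) using scipy.stats.expon; note that SciPy parameterizes the exponential by scale = 1/λ rather than by λ:

```python
# Mean/variance of Exponential(λ): closed form vs. SciPy (scale = 1/λ).
from scipy.stats import expon

lam = 2.0
dist = expon(scale=1 / lam)
print(dist.mean(), 1 / lam)      # 0.5 0.5
print(dist.var(), 1 / lam**2)    # 0.25 0.25
```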


Neighbor of Exponential Distribution

A distribution closely related to the exponential distribution is the Laplace distribution:

f_X(x) = (λ/2) e^{−λ|x|}.

Example: Image statistics.


Neighbor of Exponential Distribution

• Instead of looking at the image intensity I directly, we can look at the gradient of the image, [∇_x I, ∇_y I]ᵀ.

• Image gradients are sparse.


3. Cumulative Distribution Function


Cumulative Distribution Function

Definition

The cumulative distribution function (CDF) of a continuous random variable X is

F_X(x) := P[X ≤ x] = ∫_{−∞}^{x} f_X(x′) dx′.  (8)

Example. Let f_X(x) = c(1 − x²) for −1 ≤ x ≤ 1, and 0 otherwise. Find F_X(x).
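Worked sketch (my addition): with c = 3/4 from the earlier normalization, F_X(x) = 0 for x < −1, F_X(x) = (3x − x³ + 2)/4 for −1 ≤ x ≤ 1, and F_X(x) = 1 for x > 1. A SymPy check of the middle piece:

```python
# CDF of f_X(x) = (3/4)(1 - x^2) on [-1, 1]: integrate the density from -1 up to x.
from sympy import symbols, integrate, Rational, simplify

x, t = symbols('x t')
f = Rational(3, 4) * (1 - t**2)

F = integrate(f, (t, -1, x))         # valid for -1 <= x <= 1
print(simplify(F))                   # -x**3/4 + 3*x/4 + 1/2
print(F.subs(x, -1), F.subs(x, 1))   # 0 1
```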


Properties of CDF

1. F_X(−∞) = 0.

2. F_X(+∞) = 1.

3. F_X(x) is a non-decreasing function of x.

4. 0 ≤ F_X(x) ≤ 1.

5. P[a ≤ X ≤ b] = F_X(b) − F_X(a) (for a continuous random variable).


Properties of CDF

Before we discuss Properties 6-7, we need the following terms.

(i) F_X(b): the value of F_X(x) at x = b.

(ii) lim_{h→0} F_X(b − h): the limit of F_X(x) from the left-hand side of x = b.

(iii) lim_{h→0} F_X(b + h): the limit of F_X(x) from the right-hand side of x = b.


Properties of CDF

We say that FX (x) is

Left-continuous at x = b if lim_{h→0} F_X(b − h) = F_X(b);

Right-continuous at x = b if lim_{h→0} F_X(b + h) = F_X(b);

Continuous at x = b if it is both, i.e., lim_{h→0} F_X(b − h) = lim_{h→0} F_X(b + h) = F_X(b).


Properties of CDF

6. F_X(x) is right-continuous. That is,

lim_{h→0} F_X(b + h) = F_X(b).

7. P[X = b] is determined by

P[X = b] = F_X(b) − lim_{h→0} F_X(b − h).


Theorem (Fundamental theorem of calculus)

If a function f is continuous, then

f(x) = (d/dx) ∫_a^x f(t) dt

for any constant a.

Theorem

The probability density function (PDF) is the derivative of the cumulative distribution function (CDF):

f_X(x) = dF_X(x)/dx = (d/dx) ∫_{−∞}^{x} f_X(x′) dx′,  (9)

provided F_X is differentiable at x.


Example. Consider a CDF

F_X(x) = 1 − (1/4)e^{−2x} for x ≥ 0, and 0 for x < 0.

Find f_X(x).
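A worked sketch (my addition, not the slide's official solution): for x > 0, f_X(x) = dF_X(x)/dx = (1/2)e^{−2x}. In addition, F_X jumps from 0 to 3/4 at x = 0, so there is a point mass P[X = 0] = 3/4, i.e., a (3/4)δ(x) component in the PDF. A SymPy check of the smooth part:

```python
# Differentiate the smooth part of F_X(x) = 1 - exp(-2x)/4 for x > 0.
from sympy import symbols, exp, diff

x = symbols('x', positive=True)
F = 1 - exp(-2 * x) / 4

print(diff(F, x))    # exp(-2*x)/2: the density for x > 0
print(F.subs(x, 0))  # 3/4: the jump at x = 0, i.e., the point mass P[X = 0]
```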


Example. Consider a CDF

F_X(x) = 0.2 for 0 ≤ x < 1; 0.7 for 1 ≤ x < 2; 0.9 for 2 ≤ x < 4; and 1 for x ≥ 4.

Find f_X(x).


Mean / Mode / Median

Given a random variable X, can we define its mean/mode/median?

From the PDF:

Mean: E[X] = ∫_{−∞}^{∞} x f_X(x) dx.

Mode: the point where f_X(x) attains its maximum, argmax_x f_X(x).

Median: the point t such that ∫_{−∞}^{t} f_X(x) dx = 1/2.


Mean / Mode / Median

From the CDF:

Mean:

E[X] = ∫_0^∞ (1 − F_X(x′)) dx′ − ∫_{−∞}^{0} F_X(x′) dx′.  (10)

Mode: the point where the slope of F_X(x) is steepest.

Median: the point t such that F_X(t) = 1/2.


Application of CDF

Q-Q plot: a tool to check how good your model is.

Example. Consider a dataset containing N data points. The histogram (empirical PDF) and empirical CDF are as follows:

Is it a Gaussian distribution?


QQ-Plot


QQ-Plot

Why does it work?

Assume x₁, . . . , x_N are samples of a random variable X.

Hypothesis: these data points are generated from a certain random variable X̂. Let F_X̂ be its CDF.

Let y₁, . . . , y_N be equally spaced probability levels along F_X̂. Then the z_i's are

z_i = F_X̂⁻¹(y_i).

Testing: if X = X̂, then for large N we must have

z_i = F_X̂⁻¹(y_i) ≈ x_i,

where the samples x_i are sorted in increasing order. Therefore, if we plot the x_i against the z_i, we should see approximately a straight line.
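A minimal sketch (my addition) of this recipe against a standard Gaussian hypothesis: sort the samples, take equally spaced probability levels y_i, map them through Φ⁻¹ to get model quantiles z_i, and plot the sorted x_i against the z_i. The sample size and generating distribution are arbitrary choices.

```python
# Q-Q plot against a Gaussian hypothesis: sorted samples x_i vs. z_i = Phi^{-1}(y_i).
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

rng = np.random.default_rng(0)
N = 500
x = np.sort(rng.standard_normal(N))     # samples (here truly Gaussian, so expect a line)

y = (np.arange(1, N + 1) - 0.5) / N     # equally spaced probability levels in (0, 1)
z = norm.ppf(y)                         # model quantiles z_i = Phi^{-1}(y_i)

plt.scatter(z, x, s=5)
plt.plot(z, z, 'r--')                   # reference line for a perfect fit
plt.xlabel('model quantiles z_i')
plt.ylabel('empirical quantiles x_i')
plt.show()
```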


QQ-Plot

Figure: Left: Poor fit. In fact, the empirical data is generated from a t-distribution. Right: Good fit.


4. Gaussian Distribution


Gaussian Distribution

Definition (Gaussian Distribution)

Let X be a Gaussian random variable. The PDF of X is

f_X(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)},  (11)

where (µ, σ²) are the parameters of the distribution. We write

X ∼ N(µ, σ²)

to say that X is drawn from a Gaussian distribution with parameters (µ, σ²).


Gaussian Distribution

Figure: Gaussian distribution

Proposition (Mean/Variance of Gaussian Distribution)

If X ∼ N(µ, σ²), then

E[X] = µ, and Var[X] = σ².


Gaussian Distribution

Proof.
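As a numerical stand-in for the proof (my addition), E[X] = µ and Var[X] = σ² can be verified by integrating x f_X(x) and (x − µ)² f_X(x) directly; the parameter values are arbitrary:

```python
# Numerical check that the N(mu, sigma^2) density has mean mu and variance sigma^2.
import numpy as np
from scipy.integrate import quad

mu, sigma = 1.5, 2.0
pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

mean, _ = quad(lambda x: x * pdf(x), -np.inf, np.inf)
var, _ = quad(lambda x: (x - mu)**2 * pdf(x), -np.inf, np.inf)
print(mean, var)    # ≈ 1.5, ≈ 4.0
```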


Percentile of Gaussian Distribution


Standard Gaussian

Definition (Standard Gaussian)

A standard Gaussian (or standard Normal) random variable X has a PDF

f_X(x) = (1/√(2π)) e^{−x²/2}.  (12)

That is, X ∼ N(0, 1) is a Gaussian with µ = 0 and σ² = 1.

Definition (CDF of Standard Gaussian)

The Φ(·) function of the standard Gaussian is

Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx.  (13)


Standardize Random Variable

If X ∼ N(µ, σ²), then

Z = (X − µ)/σ ∼ N(0, 1).

Proof. Key: change of variable.

F_X(x) = ∫_{−∞}^{x} f_X(x′) dx′
       = ∫_{−∞}^{x} (1/√(2πσ²)) e^{−(x′−µ)²/(2σ²)} dx′
       = ∫_{−∞}^{(x−µ)/σ} (1/√(2π)) e^{−x′²/2} dx′   (substituting x′ ← (x′ − µ)/σ)
       = Φ((x − µ)/σ).


Standard Gaussian

Figure: Definition of Φ(y).

Example. Let X ∼ N(µ, σ²). Find P[X ≤ b] and P[a ≤ X ≤ b].


Standard Gaussian

Example. X ∼ N (5, 16), find

(a) P[X > 3]

(b) If P[X < a] = 0.7910, find a.

(c) If P[X > b] = 0.1635, find b.
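A solution sketch (my addition) using Φ and Φ⁻¹, equivalently scipy.stats.norm with loc = 5 and scale = 4 (since σ² = 16):

```python
# X ~ N(5, 16), i.e., mu = 5 and sigma = 4.
from scipy.stats import norm

X = norm(loc=5, scale=4)

print(1 - X.cdf(3))        # (a) P[X > 3] = 1 - Phi((3 - 5)/4) ≈ 0.6915
print(X.ppf(0.7910))       # (b) a such that P[X < a] = 0.7910, ≈ 8.24
print(X.ppf(1 - 0.1635))   # (c) b such that P[X > b] = 0.1635, ≈ 8.92
```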


Example: Find the Outlier!

Find the outlier of this set of data:
[0.25, 0.31, 0.33, 0.32, 0.36, 0.28, 0.29, 0.26, 0.7, 0.34].

Compute the statistics: µ = 0.344, σ = 0.129.

Standardize: Z = (X − µ)/σ. The z-values are
−0.72, −0.26, −0.10, −0.18, 0.12, −0.49, −0.41, −0.64, 2.74, −0.03.

The probabilities P[Z < z] are
0.23, 0.39, 0.45, 0.42, 0.54, 0.31, 0.33, 0.25, 0.9969, 0.48.
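A compact sketch (my addition) that reproduces these numbers up to rounding; note the slide uses the sample standard deviation (ddof = 1):

```python
# Z-scores and tail probabilities for the outlier example.
import numpy as np
from scipy.stats import norm

data = np.array([0.25, 0.31, 0.33, 0.32, 0.36, 0.28, 0.29, 0.26, 0.7, 0.34])
mu, sigma = data.mean(), data.std(ddof=1)   # 0.344, ≈ 0.129

z = (data - mu) / sigma
print(np.round(z, 2))                       # the value 2.74 flags 0.7 as the outlier
print(np.round(norm.cdf(z), 4))             # P[Z < z]; ≈ 0.997 for the outlier
```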


Linear Transform of Gaussian

If X is Gaussian, and if we let

Y = aX + b,

then Y is also Gaussian.

Why? Assume X ∼ N(0, 1) and a > 0 (otherwise, standardize Z = (X − µ)/σ first; for a < 0 the inequality below flips, but the same argument goes through).

F_Y(y) = P[Y ≤ y]
       = P[aX + b ≤ y]
       = P[X ≤ (y − b)/a]
       = ∫_{−∞}^{(y−b)/a} (1/√(2π)) e^{−x²/2} dx.


Linear Transform of Gaussian

Therefore, by the Fundamental Theorem of Calculus,

f_Y(y) = (d/dy) F_Y(y)
       = (d/dy) ∫_{−∞}^{(y−b)/a} (1/√(2π)) e^{−x²/2} dx
       = (d((y−b)/a)/dy) · (d/d((y−b)/a)) ∫_{−∞}^{(y−b)/a} (1/√(2π)) e^{−x²/2} dx   (chain rule)
       = (1/a) · (1/√(2π)) e^{−((y−b)/a)²/2} = (1/√(2πa²)) e^{−(y−b)²/(2a²)}.

So Y is also Gaussian, with mean E[Y] = b and variance Var[Y] = a².

In general: if X is Gaussian but not N(0, 1), then

E[Y] = aE[X] + b,  Var[Y] = a²Var[X].
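A quick simulation sketch (my addition) of the general statement: transform Gaussian samples by Y = aX + b and compare the sample mean and variance with aE[X] + b and a²Var[X]. The parameter values are arbitrary.

```python
# Y = aX + b for Gaussian X: check mean and variance by simulation.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
a, b = 3.0, -4.0

x = rng.normal(mu, sigma, size=1_000_000)
y = a * x + b

print(y.mean(), a * mu + b)          # both ≈ -1.0
print(y.var(), a**2 * sigma**2)      # both ≈ 36.0
```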


Detection

Problem: Consider two clusters of data points. You want to build a simple classifier to determine whether a point belongs to N(µ₁, σ₁²) or N(µ₂, σ₂²).

Solution: Given the data point x, check whether one probability density is larger than the other!


Detection

Write down the two PDFs and compare:

(1/√(2πσ₁²)) e^{−(x−µ₁)²/(2σ₁²)}  ≷  (1/√(2πσ₂²)) e^{−(x−µ₂)²/(2σ₂²)}.

Simplified case: when σ₁ = σ₂ = σ, this becomes

e^{−(x−µ₁)²/(2σ²)} ≷ e^{−(x−µ₂)²/(2σ²)}
−(x − µ₁)²/(2σ²) ≷ −(x − µ₂)²/(2σ²)
(x − µ₁)² ≶ (x − µ₂)²
x² − 2µ₁x + µ₁² ≶ x² − 2µ₂x + µ₂²
x ≶ (µ₁ + µ₂)/2   (assuming µ₁ < µ₂).

Therefore, if x < (µ₁ + µ₂)/2, then it is more likely that the point belongs to class 1. Otherwise, it is more likely that it belongs to class 2.
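A minimal sketch (my addition) of this detector: compare the two class densities at x; with equal variances this reduces to thresholding at (µ₁ + µ₂)/2. All parameter values below are arbitrary.

```python
# Two-class Gaussian detector: pick the class whose density is larger at x.
from scipy.stats import norm

mu1, sigma1 = 0.0, 1.0    # class 1 model (arbitrary)
mu2, sigma2 = 4.0, 1.0    # class 2 model (arbitrary)

def classify(x):
    return 1 if norm.pdf(x, mu1, sigma1) > norm.pdf(x, mu2, sigma2) else 2

# With equal variances the rule is equivalent to thresholding at (mu1 + mu2)/2 = 2.
for x in [1.0, 1.9, 2.1, 3.0]:
    print(x, classify(x))    # 1, 1, 2, 2
```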


5. Function of Random Variable


Function of Random Variable

Problem:

Given X .

Let Y = g(X ).

Want to find f_Y(y) and F_Y(y).

Example 1. Let X ∼ Uniform(0, 1). Let Y = 2X + 3. Find f_Y(y).

Example 2. Let X ∼ N(0, 1). Let Y = X². Find f_Y(y).

Why should we care about this?

Needed by problem. E.g., power and voltage: P = V²/R.

Needed by analysis. E.g., random phase cos(ωt + Θ).

Needed by design. E.g., variance stabilizing transform.


Examples

Example 1. Let X ∼ N(0, 1). Let Y = 2X + 3. Find f_Y(y) and F_Y(y).


Examples

Example 2. Let X ∼ Uniform(−1, 1). Suppose Y = X². Find f_Y(y) and F_Y(y).


Examples

Example 3. Let X ∼ Uniform(0, 2π). Suppose Y = cos X. Find f_Y(y) and F_Y(y). Hint: (d/dy) cos⁻¹ y = −1/√(1 − y²).


General Procedure

As shown in the previous examples, the basic steps are

F_Y(y) = P[Y ≤ y].

P[Y ≤ y] = P[g(X) ≤ y] = P[X ≤ g⁻¹(y)], if g is increasing. Otherwise, pay attention to the direction of the inequality sign.

P[X ≤ g⁻¹(y)] = F_X(g⁻¹(y)).

f_Y(y) = (d/dy) F_Y(y) = (d/dy) F_X(g⁻¹(y)).

The Fundamental Theorem of Calculus is useful here:

(d/dy) F_X(g⁻¹(y)) = (d/dy) ∫_{−∞}^{g⁻¹(y)} f_X(x′) dx′.

Chain rule:

(d/dy) ∫_{−∞}^{g⁻¹(y)} f_X(x′) dx′ = (dg⁻¹(y)/dy) · (d/dg⁻¹(y)) ∫_{−∞}^{g⁻¹(y)} f_X(x′) dx′.
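A sketch (my addition) of this procedure applied to Example 2 above (X ∼ Uniform(−1, 1), Y = X²). Here g is not monotone, so both tails are kept: F_Y(y) = P[−√y ≤ X ≤ √y] = √y for 0 ≤ y ≤ 1, hence f_Y(y) = 1/(2√y). The code compares this density with a Monte Carlo histogram.

```python
# CDF method for Y = X^2 with X ~ Uniform(-1, 1): f_Y(y) = 1/(2*sqrt(y)) on (0, 1].
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=1_000_000)
y = x**2

grid = np.linspace(0.01, 1, 200)
plt.hist(y, bins=200, density=True, alpha=0.5, label='Monte Carlo histogram of Y')
plt.plot(grid, 1 / (2 * np.sqrt(grid)), 'r', label='f_Y(y) = 1/(2*sqrt(y))')
plt.legend()
plt.show()
```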


Why Study Function of Random Variable?

Variance Stabilizing Transform

Most denoising algorithms are

Designed for Gaussian noise

Assume variance is constant throughout the image

Easy to analyze, easy to implement

But, most photon shot noise is

Poisson

If X ∼ Poisson(λ), then E[X ] = λ and Var[X ] = λ

Variance changes as pixel intensity changes.

Variance stabilizing transform:

Let Y = √(X + 3/8).

Var[Y] ≈ 1/4, constant throughout the image.

Anscombe, F. J. (1948), “The transformation of Poisson, binomial and negative-binomial data”, Biometrika, 35 (3/4), pp. 246–254.
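A simulation sketch (my addition): applying Y = √(X + 3/8) to Poisson(λ) samples gives a variance near 1/4 across a range of λ (the approximation improves as λ grows). The λ values below are arbitrary.

```python
# Variance stabilization: Var[sqrt(X + 3/8)] ≈ 1/4 for X ~ Poisson(lambda).
import numpy as np

rng = np.random.default_rng(0)
for lam in [2, 5, 10, 50, 100]:
    x = rng.poisson(lam, size=1_000_000)
    y = np.sqrt(x + 3 / 8)
    print(lam, round(x.var(), 2), round(y.var(), 4))   # raw var ≈ lam; stabilized var ≈ 0.25
```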


Variance Stabilizing Transform

Figure: Noisy input X with variance Var[X] (before); direct denoising vs. transform-then-denoise, with stabilized variance Var[Y] (after).
