# Random variables, vectors, and processes


EE 278 Lecture Notes # 3 Winter 2010–2011

Random variables, vectors, and processes

EE278: Introduction to Statistical Signal Processing, Winter 2010–2011, © R.M. Gray 2011

Random Variables

Given a probability space (Ω, F, P), a (real-valued) random variable is a real-valued function defined on Ω, subject to a technical condition (to be stated).

Common to use upper-case letters: e.g., a random variable X is a function X : Ω → R; Y, Z, U, V, Θ, · · · are also common. A random variable may take on values only in some subset ΩX ⊂ R (sometimes called the alphabet of X; AX and X are also common notations).

Intuition: the randomness is in the experiment, which produces an outcome ω according to the probability P ⇒ the random variable's outcome is X(ω) ∈ ΩX ⊂ R.

Examples

Consider (Ω,F , P) with Ω = R, P determined by uniform pdf on [0, 1)

Coin flip from earlier: X : R → {0, 1} defined by

X(r) = 0 if r ≤ 0.5, and X(r) = 1 otherwise.

We observe X, but not the outcome of the fair spin itself.

Lots of possible random variables, e.g., W(r) = r², Z(r) = e^r, V(r) = r, L(r) = −r ln r (defined for r > 0), Y(r) = cos(2πr), etc.
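As a sanity check, these random variables can be sketched numerically. The following is a minimal Python illustration (the notes use no particular language); it treats the uniform outcome r on [0, 1) as the underlying experiment and checks the quantizer's distribution by Monte Carlo.

```python
import math
import random

random.seed(0)

# Underlying experiment: a uniform outcome r on [0, 1) (the fair
# spinner).  Each function of r below is a random variable.
def X(r):               # the binary quantizer of the slides
    return 0 if r <= 0.5 else 1

def W(r):               # W(r) = r**2
    return r * r

def L(r):               # L(r) = -r * ln r, defined for r > 0
    return -r * math.log(r)

samples = [random.random() for _ in range(100_000)]

# Empirical check: X^{-1}({0}) = [0, 0.5], which has probability 0.5
# under the uniform pdf on [0, 1), so Pr(X = 0) should be about 0.5.
p0 = sum(X(r) == 0 for r in samples) / len(samples)
assert abs(p0 - 0.5) < 0.01
```

The same sampling loop works for any of the other functions listed above, since each is just a deterministic map applied to the random outcome r.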

Can think of rvs as observations or measurements made on an underlying experiment.


Functions of random variables

Suppose that X is a rv defined on (Ω,F , P) and suppose that g : ΩX → R is another real-valued function.

Then the function g(X) : Ω→ R defined by g(X)(ω) = g(X(ω)) is also a real-valued mapping of Ω, i.e., a real-valued function of a random variable is a random variable

Can express the previous examples as W = V², Z = e^V, L = −V ln V, Y = cos(2πV)

Similarly, 1/W, sinh(Y), and L³ are all random variables
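The defining identity g(X)(ω) = g(X(ω)) is pointwise composition, which a short Python sketch (an illustration, not from the notes) makes concrete:

```python
# A real-valued function of a random variable is again a random
# variable: g(X) is the map omega -> g(X(omega)).  Here V(r) = r
# on the uniform space of the earlier slides, and g(x) = x**2,
# so g(V) is the random variable W = V**2.
def V(r):
    return r

def g(x):
    return x * x

def W(r):               # W defined directly
    return r * r

# For every outcome r, g(V)(r) and W(r) agree pointwise.
outcomes = [0.1, 0.25, 0.5, 0.9]
assert all(g(V(r)) == W(r) for r in outcomes)
```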


Random vectors and random processes

A finite collection of random variables (defined on a common probability space (Ω, F, P)) is a random vector

E.g., (X,Y), (X0, X1, · · · , Xk−1)

An infinite collection of random variables (defined on a common probability space) is a random process

E.g., {Xn, n = 0, 1, 2, · · · }, {X(t); t ∈ (−∞,∞)}

So the theory of random vectors and random processes mostly boils down to the theory of random variables.


Derived distributions

In general: “input” probability space (Ω, F, P) + random variable X ⇒ “output” probability space, say (ΩX, B(ΩX), PX), where ΩX ⊂ R and PX is the distribution of X: PX(F) = Pr(X ∈ F)

Typically PX is described by a pmf pX or a pdf fX

For the binary quantizer special case we derived PX.

The idea generalizes, and it forces a technical condition on the definition of a random variable (and hence also on random vectors and random processes)


Inverse image formula

Given (Ω, B(Ω), P) and a random variable X, find PX. Basic method: PX(F) is the probability, computed using P, of all the original sample points that are mapped by X into the subset F:

PX(F) = P({ω : X(ω) ∈ F})

A shorthand way to write the formula uses the inverse image of an event F ∈ B(ΩX) under the mapping X : Ω → ΩX, X−1(F) = {ω : X(ω) ∈ F}:

PX(F) = P(X−1(F))

Written informally as PX(F) = Pr(X ∈ F) = P{X ∈ F} = “probability that random variable X assumes a value in F”
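For the binary quantizer, the inverse image formula can be worked exactly. The Python sketch below (illustrative; events are handled as finite unions of intervals, not a general measure-theoretic machine) computes PX(F) = P(X−1(F)) under the uniform probability on [0, 1):

```python
# Inverse-image formula P_X(F) = P(X^{-1}(F)) for the binary
# quantizer X(r) = 0 if r <= 0.5, 1 otherwise, on the uniform
# probability space over [0, 1).

def inverse_image(F):
    """Return X^{-1}(F) as a list of intervals inside [0, 1)."""
    pieces = []
    if 0 in F:              # X(r) = 0 exactly when r <= 0.5
        pieces.append((0.0, 0.5))
    if 1 in F:              # X(r) = 1 exactly when r > 0.5
        pieces.append((0.5, 1.0))
    return pieces

def P(intervals):
    """Uniform probability on [0, 1): total length of the intervals."""
    return sum(b - a for a, b in intervals)

def P_X(F):
    return P(inverse_image(F))

print(P_X({0}), P_X({1}), P_X({0, 1}), P_X(set()))  # -> 0.5 0.5 1.0 0
```

These are exactly the values derived for the quantizer's pmf later in the notes.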


[Figure: the mapping X : Ω → ΩX carries the inverse image X−1(F) ⊆ Ω to the output event F ⊆ ΩX]

Inverse image method: Pr(X ∈ F) = P({ω : X(ω) ∈ F}) = P(X−1(F)). The inverse image formula is fundamental to probability, random processes, and signal processing.

It shows how to compute probabilities of output events in terms of the input probability space. But does the definition make sense?

I.e., is PX(F) = P(X−1(F)) well-defined for all output events F?

Yes, if we include a requirement in the definition of a random variable:


Careful definition of a random variable

Given a probability space (Ω,F , P), a (real-valued) random variable X is a function X : Ω→ ΩX ⊂ R with the property that

if F ∈ B(ΩX), then X−1(F) ∈ F

Notes:

• In English: X : Ω→ ΩX ⊂ R is a random variable iff the inverse image of every output event is an input event and therefore PX(F) = P(X−1(F)) is well-defined for all events F.

• Another name for a function with this property: measurable function


• Nearly every function we encounter is measurable, but the calculus of probability rests on this property, and advanced courses prove measurability of important functions.

In the simple binary quantizer example, X is measurable (easy to show since F = B([0, 1)) contains the intervals). Recall:

PX({0}) = P({r : X(r) = 0}) = P(X−1({0})) = P({r : 0 ≤ r ≤ 0.5}) = P([0, 0.5]) = 0.5

PX({1}) = P(X−1({1})) = P((0.5, 1.0]) = 0.5, and PX(ΩX) = PX({0, 1}) = P(X−1({0, 1})) = P([0, 1)) = 1

PX(∅) = P(X−1(∅)) = P(∅) = 0. In general, find PX by computing the pmf or pdf, as appropriate. There are many shortcuts, but the basic approach is the inverse image formula.


Random vectors

All the theory, calculus, and applications of individual random variables are useful for studying random vectors and random processes, since random vectors and processes are simply collections of random variables.

One k-dimensional random vector = k 1-dimensional random variables defined on a common probability space.

Earlier example: two coin flips; k coin flips (the first k binary coefficients of the fair spinner)

Several notations are used, e.g., Xk = (X0, X1, . . . , Xk−1) is shorthand for Xk(ω) = (X0(ω), X1(ω), . . . , Xk−1(ω))

or X or {Xn; n = 0, 1, . . . , k − 1} or {Xn; n ∈ Zk}

Can be discrete (described by a multidimensional pmf) or continuous (e.g., described by a multidimensional pdf) or mixed

Recall that a real-valued function of a random variable is a random variable.

Similarly, a real-valued function of a random vector (several random variables) is a random variable. E.g., if X0, X1, . . . Xn−1 are random variables, then

Sn = (1/n) ∑_{k=0}^{n−1} Xk

is a random variable defined by

Sn(ω) = (1/n) ∑_{k=0}^{n−1} Xk(ω)
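The sample mean Sn is itself a random variable: each run of the underlying experiment yields one number. A short Python sketch (illustrative; the notes do not specify a language) with Xk taken as i.i.d. fair coin flips:

```python
import random

random.seed(1)

# S_n = (1/n) * sum_{k=0}^{n-1} X_k is a random variable: for each
# underlying outcome it returns a single number.  Sketch with the
# X_k modeled as i.i.d. fair coin flips (Bernoulli(1/2)).
def sample_mean(n):
    flips = [random.randint(0, 1) for _ in range(n)]
    return sum(flips) / n

s = sample_mean(10_000)
print(round(s, 2))   # concentrates near E[X_k] = 0.5 for large n
```

Each call to `sample_mean` simulates one outcome ω and returns Sn(ω); repeating the call samples the distribution of Sn.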


Inverse image formula for random vectors

PX(F) = P(X−1(F)) = P({ω : X(ω) ∈ F}) = P({ω : (X0(ω), X1(ω), . . . , Xk−1(ω)) ∈ F})

where the various forms are equivalent and all stand for Pr(X ∈ F). Technically, the formula holds for suitable events F ∈ B(Rk), the Borel field of Rk (or some suitable subset). See book for discussion.

One multidimensional event of particular interest is a Cartesian product of 1-D events (called a rectangle): F = ×_{i=0}^{k−1} Fi = {xk : xi ∈ Fi; i = 0, . . . , k − 1}

PX(F) = P({ω : X0(ω) ∈ F0, X1(ω) ∈ F1, . . . , Xk−1(ω) ∈ Fk−1})
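For the coin-flip vector built from the fair spinner, a rectangle probability can be checked by Monte Carlo. The Python sketch below (illustrative) takes X0, X1 to be the first two binary digits of a uniform outcome r ∈ [0, 1) and estimates PX({0} × {1}); since the digits are independent fair bits, the answer is 0.5 · 0.5 = 0.25:

```python
import random

random.seed(2)

# (X0, X1): the first two binary digits of a uniform outcome
# r in [0, 1) -- the two-coin-flip vector of the earlier slides.
def bits(r, k):
    """Return the first k binary digits of r in [0, 1)."""
    out = []
    for _ in range(k):
        r *= 2
        b = int(r)
        out.append(b)
        r -= b
    return out

# Estimate the rectangle probability P_X(F0 x F1) with F0 = {0},
# F1 = {1}: the event that X0 = 0 and X1 = 1, i.e. r in [0.25, 0.5).
n = 100_000
hits = sum(bits(random.random(), 2) == [0, 1] for _ in range(n))
print(hits / n)    # should be near 0.25
```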

Random processes

A random vector is a finite collection of rvs defined on a common probability space

A random process is an infinite family of rvs defined on a common probability space. Many types:

{Xn; n = 0, 1, 2, . . .} (discrete-time, one-sided)

{Xn; n ∈ Z} (discrete-time, two-sided)

{Xt; t ∈ [0,∞)} (continuous-time, one-sided)

{Xt; t ∈ R} (continuous-time, two-sided)

Also called stochastic process


In general: {Xt; t ∈ T } or {X(t); t ∈ T }

Other notations: {X(t)}, {X[n]} (for discrete-time)

Sloppy but common: write X(t), where context indicates a random process rather than a single random variable

Discrete-time random processes are also called time series

Always: a random process is an indexed family of random variables; T is the index set

For each t, Xt is a random variable, and all the Xt are defined on a common probability space

The index is usually time; in some applications it is space. E.g., a random field {X(t, s); t, s ∈ [0, 1)} models a random image, and {V(x, y, t); x, y ∈ [0, 1); t ∈ [0,∞)} models analog video.

Keep in mind the suppressed argument ω— e.g., each Xt is Xt(ω), a function defined on the sample space

X(t) is X(t, ω); it can be viewed as a function of two arguments

Have seen one example — fair coin flips, a Bernoulli random process

Another, simpler, example:

Random sinusoids. Suppose that A and Θ are two random variables with a joint pdf fA,Θ(a, θ) = fA(a) fΘ(θ). For example, Θ ∼ U([0, 2π)) and A ∼ N(0, σ²). Define a continuous-time random process X(t) for all t
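The slide's definition of X(t) is cut off here. A standard form, assumed for the sketch below (both the formula X(t) = A cos(2π f0 t + Θ) and the frequency f0 are this sketch's assumptions, not from the notes), is a sinusoid with random amplitude and phase:

```python
import math
import random

random.seed(3)

# Assumed form of the random sinusoid (the notes' definition is cut
# off): X(t) = A * cos(2*pi*f0*t + Theta), with A ~ N(0, sigma^2)
# and Theta ~ Uniform[0, 2*pi) independent, and f0 a hypothetical
# fixed frequency.
sigma, f0 = 1.0, 5.0

def sample_path(t_values):
    """One realization: draw (A, Theta) once, then evaluate X(t)."""
    A = random.gauss(0.0, sigma)
    theta = random.uniform(0.0, 2 * math.pi)
    return [A * math.cos(2 * math.pi * f0 * t + theta) for t in t_values]

# Ensemble average at a fixed t: E[X(t)] = 0, since E[A] = 0 and
# A is independent of Theta.
n = 100_000
mean_at_0 = sum(sample_path([0.0])[0] for _ in range(n)) / n
print(abs(mean_at_0) < 0.02)
```

Note the two roles of the arguments: fixing ω (one draw of A and Θ) and varying t gives a sample path, while fixing t and varying ω gives the random variable X(t).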
