
Page 1:

INSTITUTO POLITÉCNICO NACIONAL
CENTRO DE INVESTIGACION EN COMPUTACION

Probability, Random Processes and Inference

Dr. Ponciano Jorge Escamilla Ambrosio
[email protected]

http://www.cic.ipn.mx/~pescamilla/

Laboratorio de Ciberseguridad

Page 2:

Course Content

1.4. General Random Variables

1.4.1. Continuous Random Variables and PDFs

1.4.2. Cumulative Distribution Function

1.4.3. Normal Random Variables

1.4.4. Joint PDFs of Multiple Random Variables

1.4.5. Conditioning

1.4.6. The Continuous Bayes’ Rule

1.4.7. The Strong Law of Large Numbers

Page 3:

❑ Continuous random variables
➢ Example: the velocity of a vehicle traveling along the highway.

❑ Continuous random variables can take on any real value in an interval,
➢ possibly of infinite length, such as (0, ∞) or the entire real line.

❑ In this section, the concepts and methods used for discrete r.v.s, such as expectation, PMF, and conditioning, are introduced for their continuous counterparts.

General Random Variables

Page 4:

❑ Continuous random variable. A random variable X is called continuous if there exists a nonnegative function fX, called the probability density function of X, or PDF, such that

P(X ∈ B) = ∫_B fX(x) dx

for every subset B of the real line.

Probability Density Function

Page 5:

❑ The probability that the value of X falls within an interval is

P(a ≤ X ≤ b) = ∫_a^b fX(x) dx,

which can be interpreted as the area under the graph of the PDF.
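As a quick numerical sketch of this interpretation (not part of the slides; it assumes NumPy and SciPy are available), the area under an exponential PDF over [a, b] matches the corresponding CDF difference:

```python
import numpy as np
from scipy import stats, integrate

# P(a <= X <= b) as the area under the PDF, for an exponential example with lambda = 0.5.
lam, a, b = 0.5, 1.0, 3.0
pdf = lambda x: lam * np.exp(-lam * x)            # f_X(x) for x >= 0

area, _ = integrate.quad(pdf, a, b)               # numerical area under the PDF over [a, b]
exact = stats.expon(scale=1 / lam).cdf(b) - stats.expon(scale=1 / lam).cdf(a)
print(area, exact)                                # both approximately 0.3834
```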

Probability Density Function

Page 6:

Probability Density Function (figure)

Page 7:

❑ For any single value a, we have P(X = a) = ∫_a^a fX(x) dx = 0.

❑ For this reason, including or excluding the endpoints of an interval has no effect on its probability:

P(a ≤ X ≤ b) = P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b).

Probability Density Function

Page 8:

❑ To qualify as a PDF, a function fX must:
o be nonnegative, i.e., fX(x) ≥ 0 for every x,
o have the normalisation property ∫_{−∞}^{∞} fX(x) dx = 1.

❑ Graphically, this means that the entire area under the graph of the PDF must be equal to 1.

Probability Density Function

Page 9:

Discrete vs. continuous r.v.s.

Recall that for a discrete r.v., the CDF jumps at every point in the support, and is flat everywhere else. In contrast, for a continuous r.v. the CDF increases smoothly.

Page 10:

❑ For a continuous r.v. X with CDF FX(x), the probability density function (PDF) of X is the derivative fX(x) of the CDF, given by fX(x) = F′X(x). The support of X, and of its distribution, is the set of all x where fX(x) > 0.

❑ The PDF represents the “density” of probability at the point x.

Discrete vs. continuous r.v.s.

Page 11:

❑ To get from the PDF back to the CDF we apply

FX(x) = ∫_{−∞}^{x} fX(t) dt.

❑ Thus, analogous to how we obtained the value of a discrete CDF at x by summing the PMF over all values less than or equal to x, here we integrate the PDF over all values up to x, so the CDF is the accumulated area under the PDF.
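A minimal sketch of this relation (my own example, assuming SciPy): integrating the standard normal PDF up to x reproduces the value of the CDF at x.

```python
import numpy as np
from scipy import stats, integrate

# F_X(x) as the accumulated area under the PDF, standard normal example.
for x in (-1.0, 0.0, 1.5):
    acc, _ = integrate.quad(stats.norm.pdf, -np.inf, x)   # integral of the PDF up to x
    print(x, acc, stats.norm.cdf(x))                      # the two values agree
```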

Probability Density Function

Page 12:

❑ Since we can freely convert between the PDF and the CDF using the inverse operations of integration and differentiation, both the PDF and CDF carry complete information about the distribution of a continuous r.v.

❑ Thus the PDF completely specifies the behavior of continuous random variables.

Probability Density Function

Page 13:

❑ For an interval [x, x + δ] with very small length δ, we have

P([x, x + δ]) = ∫_x^{x+δ} fX(t) dt ≈ fX(x) · δ,

so we can view fX(x) as the “probability mass per unit length” near x.

Probability Density Function

Even though a PDF is used to calculate event probabilities, fX(x) is not the probability of any particular event. In particular, it is not restricted to be less than or equal to one.
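A small illustration of this remark (my own example, not the slides'): the Uniform(0, 0.5) density equals 2 on its support, yet every probability it produces is still at most 1.

```python
from scipy import stats

X = stats.uniform(loc=0.0, scale=0.5)   # Uniform(0, 0.5): f_X(x) = 2 on (0, 0.5)
print(X.pdf(0.25))                      # 2.0 -- a PDF value larger than 1
print(X.cdf(0.5) - X.cdf(0.0))          # 1.0 -- probabilities never exceed 1
```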

Page 14:

❑ An important way in which continuous r.v.s differ from discrete r.v.s is that for a continuous r.v. X, P(X = x) = 0 for all x. This is because P(X = x) is the height of a jump in the CDF at x, but the CDF of X has no jumps! Since the PMF of a continuous r.v. would just be 0 everywhere, we work with a PDF instead.

Probability Density Function

Page 15:

❑ The PDF is analogous to the PMF in many ways, but there is a key difference: for a PDF fX, the quantity fX(x) is not a probability, and in fact it is possible to have fX(x) > 1 for some values of x. To obtain a probability, we need to integrate the PDF.

❑ In summary:
➢ To get a desired probability, integrate the PDF over the appropriate range.

Probability Density Function

Page 16:

❑ The Logistic distribution has CDF

F(x) = e^x / (1 + e^x), for all real x.

❑ To get the PDF, we differentiate the CDF, which gives

f(x) = e^x / (1 + e^x)^2.

❑ Example:

Examples of PDFs

Page 17:

Examples of PDFs (figure)

Page 18:

❑ The Rayleigh distribution has CDF

F(x) = 1 − e^{−x²/2}, for x > 0.

❑ To get the PDF, we differentiate the CDF, which gives

f(x) = x e^{−x²/2}, for x > 0.

❑ Example:

Examples of PDFs

Page 19:

Examples of PDFs (figure)

Page 20:

❑ A continuous r.v. X is said to have Uniform distribution on the interval (a, b) if its PDF is

f(x) = 1/(b − a) for a < x < b, and f(x) = 0 otherwise.

❑ The CDF is the accumulated area under the PDF:

F(x) = 0 for x ≤ a, F(x) = (x − a)/(b − a) for a < x < b, and F(x) = 1 for x ≥ b.

Examples of PDFs

Page 21:

❑ We denote this by X ∼ Unif(a, b).

❑ The Uniform distribution that we will most frequently use is the Unif(0, 1) distribution, also called the standard Uniform.

❑ The Unif(0, 1) PDF and CDF are particularly simple: f(x) = 1 and F(x) = x for 0 < x < 1.

❑ For a general Unif(a, b) distribution, the PDF is constant on (a, b), and the CDF is ramp-shaped, increasing linearly from 0 to 1 as x ranges from a to b.
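A short sketch of the Unif(a, b) PDF and its ramp-shaped CDF, with a = 2 and b = 5 chosen arbitrarily (the helper functions below are mine, not from the slides):

```python
import numpy as np

def unif_pdf(x, a, b):
    """PDF of Unif(a, b): 1/(b - a) on (a, b), 0 elsewhere."""
    return np.where((x > a) & (x < b), 1.0 / (b - a), 0.0)

def unif_cdf(x, a, b):
    """CDF of Unif(a, b): 0 below a, linear ramp on (a, b), 1 above b."""
    return np.clip((x - a) / (b - a), 0.0, 1.0)

a, b = 2.0, 5.0
xs = np.array([1.0, 3.5, 6.0])
print(unif_pdf(xs, a, b))   # [0.         0.33333333 0.        ]
print(unif_cdf(xs, a, b))   # [0.  0.5 1. ]
```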

Examples of PDFs

Page 22:

Examples of PDFs (figure)

For Uniform distributions, probability is proportional to length.

Page 23:

PDF Properties (figure)

Page 24:

❑ The expected value or expectation or mean of a continuous r.v. X is defined by

E[X] = ∫_{−∞}^{∞} x fX(x) dx.

❑ This is similar to the discrete case except that the PMF is replaced by the PDF, and summation is replaced by integration.

❑ Its mathematical properties are similar to the discrete case.
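A minimal numerical sketch of this definition (assuming SciPy): E[X] = ∫ x fX(x) dx computed by quadrature for an exponential r.v. with λ = 1.

```python
from scipy import stats, integrate

X = stats.expon()                                                 # exponential with lambda = 1
mean, _ = integrate.quad(lambda x: x * X.pdf(x), 0, float("inf"))
print(mean, X.mean())                                             # both 1.0
```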

Expected Value and Variance of a Continuous r.v.

Page 25:

❑ If X is a continuous random variable with given PDF, then any real-valued function Y = ɡ(X) of X is also a random variable.
➢ Note that Y can be a continuous r.v., but Y can also be discrete, e.g., ɡ(x) = 1 for x > 0 and ɡ(x) = 0 otherwise.

❑ In either case, the mean of ɡ(X) satisfies the expected value rule:

E[ɡ(X)] = ∫_{−∞}^{∞} ɡ(x) fX(x) dx.

Expected Value and Variance of a Continuous r.v.

Page 26:

❑ The nth moment of a continuous r.v. X is defined as E[X^n], the expected value of the random variable X^n.

❑ The variance of X, denoted var(X), is defined as the expected value of the random variable (X − E[X])²:

var(X) = E[(X − E[X])²] = ∫_{−∞}^{∞} (x − E[X])² fX(x) dx.

Expected Value and Variance of a Continuous r.v.

Page 27:

❑ Example. Consider a uniform PDF over an interval [a, b]; its expectation is given by

E[X] = ∫_a^b x · 1/(b − a) dx = (a + b)/2.

Expected Value and Variance of a Continuous r.v.

Page 28:

❑ Its variance is given as

var(X) = ∫_a^b (x − (a + b)/2)² · 1/(b − a) dx = (b − a)²/12.

Expected Value and Variance of a Continuous r.v.

Page 29:

❑ The exponential continuous random variable has PDF

fX(x) = λe^{−λx} for x ≥ 0, and fX(x) = 0 otherwise,

where λ is a positive parameter characterising the PDF, with ∫_0^∞ λe^{−λx} dx = 1.

Expected Value and Variance of a Continuous r.v.

Page 30:

❑ The probability that X exceeds a certain value decreases exponentially. That is, for any a ≥ 0, we have

P(X ≥ a) = ∫_a^∞ λe^{−λx} dx = e^{−λa}.

❑ An exponential random variable can be a good model for the amount of time until an incident of interest takes place,
➢ e.g., a message arriving at a computer, some equipment breaking down, a light bulb burning out, etc.
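A small simulation sketch of the tail property (λ = 0.5 and a = 2 are arbitrary example values): the empirical frequency of {X ≥ a} matches e^(−λa).

```python
import numpy as np

rng = np.random.default_rng(0)
lam, a = 0.5, 2.0
x = rng.exponential(scale=1 / lam, size=200_000)   # samples of X ~ Exp(lambda)

print((x >= a).mean())        # empirical P(X >= a), approximately 0.37
print(np.exp(-lam * a))       # e^{-lambda * a} = 0.3679
```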

Expected Value and Variance of a Continuous r.v.

Page 31:

Expected Value and Variance of a Continuous r.v. (figure)

Page 32:

❑ The mean of the exponential r.v. X is calculated by

E[X] = ∫_0^∞ x λe^{−λx} dx = 1/λ.

Expected Value and Variance of a Continuous r.v.

Page 33:

❑ The variance of the exponential r.v. X is calculated by

var(X) = E[X²] − (E[X])² = 2/λ² − 1/λ² = 1/λ².

Expected Value and Variance of a Continuous r.v.

Page 34:

❑ The cumulative distribution function, CDF, of a random variable X is denoted as FX and provides the probability P(X ≤ x). In particular, for every x we have

FX(x) = P(X ≤ x) = Σ_{k ≤ x} pX(k) if X is discrete, and FX(x) = ∫_{−∞}^{x} fX(t) dt if X is continuous.

Cumulative Distribution Functions

The CDF FX(x) “accumulates” probability “up to” the value of x.

Page 35:

❑ Any random variable associated with a given probability model has a CDF, regardless of whether it is discrete or continuous.
➢ {X ≤ x} is always an event and therefore has a well-defined probability.

Cumulative Distribution Functions

Page 36:

Cumulative Distribution Functions (figure)

Page 37:

Cumulative Distribution Functions (figure)

Page 38:

Cumulative Distribution Functions (figure)

Page 39:

Cumulative Distribution Functions (figure)

Page 40:

Normal Random Variables

❑ A continuous random variable X is normal or Gaussian or normally distributed if it has PDF of the form

fX(x) = (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)},

where μ and σ are two scalar parameters characterising the PDF (abbreviated N(μ, σ²), and referred to as the normal density function), with σ assumed positive.

Page 41:

Normal Random Variables

❑ It can be verified that the normalisation property holds:

∫_{−∞}^{∞} (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)} dx = 1.

(Figure: the N(1, 1) density.)
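As a numerical sketch (assuming SciPy), the normalisation property, together with E(X) = μ and Var(X) = σ², can be checked by integrating the N(1, 1) density:

```python
import numpy as np
from scipy import integrate

mu, sigma = 1.0, 1.0
pdf = lambda x: np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

total, _ = integrate.quad(pdf, -np.inf, np.inf)                             # -> 1.0
mean, _ = integrate.quad(lambda x: x * pdf(x), -np.inf, np.inf)             # -> mu
var, _ = integrate.quad(lambda x: (x - mu) ** 2 * pdf(x), -np.inf, np.inf)  # -> sigma**2
print(total, mean, var)
```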

Page 42:

Normal Random Variables

❑ If X is N(μ, σ²), then E(X) = μ.
Proof: The PDF is symmetric about x = μ.

❑ If X is N(μ, σ²), then Var(X) = σ².
Proof:

Page 43:

Normal Random Variables

❑ Its maximum value occurs at the mean value of its argument.
❑ It is symmetrical about the mean value.
❑ The points of maximum absolute slope occur at one standard deviation above and below the mean.
❑ Its maximum value is inversely proportional to its standard deviation.
❑ The limit as the standard deviation approaches zero is a unit impulse.

Page 44:

Normal Random Variables (figure)

Page 45:

Linear Function of a Normal Random Variable

❑ If X is a normal r.v. with mean μ and variance σ², and if a ≠ 0 and b are scalars, then the random variable

Y = aX + b

is also normal, with mean and variance

E[Y] = aμ + b,  var(Y) = a²σ².

Page 46:

Standard Normal Random Variables

❑ A normal random variable Y with zero mean and unit variance, N(0, 1), is said to be a standard normal. Its PDF and CDF are denoted by φ and Φ, respectively:

φ(y) = (1/√(2π)) e^{−y²/2},  Φ(y) = P(Y ≤ y) = ∫_{−∞}^{y} φ(t) dt.

Page 47:

Standard Normal Random Variables

❑ The PDF of a normal r.v. cannot be integrated in terms of the common elementary functions, and therefore the probabilities of X falling in various intervals are obtained from tables or by computer.

❑ Example: the Standard Normal Table.

❑ The table only provides the values of Φ(y) for y ≥ 0, because the omitted values can be calculated using the symmetry of the PDF.

Page 48:

Standard Normal Random Variables (figure)

Page 49:

Standard Normal Random Variables (figure)

Page 50:

Standard Normal Random Variables

❑ It would be overwhelming to construct tables for all μ and σ values required in applications.
➢ Standardise the r.v.

❑ Let X be a normal (Gaussian) random variable with mean μ and variance σ². We standardise X by defining a new random variable Y given by

Y = (X − μ)/σ.

Page 51:

Standard Normal Random Variables

❑ Since Y is a linear function of X, it is normal. This means

E[Y] = (E[X] − μ)/σ = 0,  var(Y) = var(X)/σ² = 1.

❑ Thus, Y is a standard normal random variable.
➢ This allows us to calculate the probability of any event defined in terms of X by redefining the event in terms of Y, and then using the standard normal table.

Page 52:

Standard Normal Random Variables

❑ Example 1:

Page 53:

Standard Normal Random Variables

❑ Example 2: The annual snowfall at a particular geographic location is modelled as a normal random variable with mean μ = 60 inches and a standard deviation of σ = 20 inches. What is the probability that this year's snowfall will be at least 80 inches?

Page 54:

Standard Normal Random Variables

❑ Solution: Let X be the annual snowfall. Then Y = (X − 60)/20 is standard normal, so

P(X ≥ 80) = P(Y ≥ 1) = 1 − Φ(1) = 1 − 0.8413 = 0.1587.
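The same computation sketched with SciPy, which here plays the role of the standard normal table (an illustration, not part of the slides):

```python
from scipy import stats

mu, sigma = 60.0, 20.0
# P(X >= 80) = P((X - mu)/sigma >= 1) = 1 - Phi(1)
print(1 - stats.norm.cdf((80 - mu) / sigma))    # ~ 0.1587
print(stats.norm(loc=mu, scale=sigma).sf(80))   # same value via the survival function
```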

Page 55:

Standard Normal Random Variables

❑ Example 3: (Height Distribution of Men). Assume that the height X, in inches, of a randomly selected man in a certain population is normally distributed with μ = 69 and σ = 2.6. Find

1. P(X < 72),
2. P(X > 72),
3. P(X < 66),
4. P(|X − μ| < 3).

Page 56:

Standard Normal Random Variables

❑ The table gives Φ(z) only for z ≥ 0, and for z < 0 we need to make use of the symmetry of the normal distribution. This implies that, for any z, P(Z < −z) = P(Z > z). Thus, solution:

Page 57:

Standard Normal Random Variables (figure)

Page 58:

Standard Normal Random Variables

❑ Normal r.v.s. are often used in signal processing and communications engineering to model noise and unpredictable distortions of signals.

❑ Example:

Page 59:

Standard Normal Random Variables (figure)

Page 60:

Standard Normal Random Variables

❑ Solution:

Page 61:

Standard Normal Random Variables

❑ Three important benchmarks for the Normal distribution are the probabilities of falling within one, two, and three standard deviations of the mean. The 68-95-99.7% rule tells us that these probabilities are what the name suggests.

❑ (68-95-99.7% rule). If X ∼ N(μ, σ²), then

P(|X − μ| < σ) ≈ 0.68,  P(|X − μ| < 2σ) ≈ 0.95,  P(|X − μ| < 3σ) ≈ 0.997.

Standardising, these follow from Φ(1) − Φ(−1), Φ(2) − Φ(−2), and Φ(3) − Φ(−3).
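A quick check of the rule with SciPy (after standardising, the probabilities do not depend on μ and σ):

```python
from scipy import stats

for k in (1, 2, 3):
    p = stats.norm.cdf(k) - stats.norm.cdf(-k)   # P(|X - mu| < k*sigma) after standardising
    print(k, round(p, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```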

Page 62:

Standard Normal Random Variables (figure)

Page 63:

Standard Normal Random Variables (figure)

Page 64:

Joint PDF of Multiple Random Variables

❑ Two continuous random variables associated with the same experiment are jointly continuous and can be described in terms of a joint PDF fX,Y if fX,Y is a nonnegative function that satisfies

P((X, Y) ∈ B) = ∫∫_{(x,y)∈B} fX,Y(x, y) dx dy

for every subset B of the two-dimensional plane.

❑ The double-integral notation means that the integration is carried out over the set B.

Page 65:

Joint PDF of Multiple Random Variables

❑ In the particular case where B is a rectangle of the form B = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d}, we have

P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_c^d ∫_a^b fX,Y(x, y) dx dy.

❑ If B is the entire two-dimensional plane, then we obtain the normalisation property

∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1.
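A numerical sketch of the rectangle formula (my own example, assuming SciPy): for two independent standard normals, the double integral over the rectangle factorises into a product of CDF differences.

```python
from scipy import stats, integrate

f = lambda y, x: stats.norm.pdf(x) * stats.norm.pdf(y)   # joint PDF of two independent N(0, 1)s

a, b, c, d = -1.0, 1.0, 0.0, 2.0
p, _ = integrate.dblquad(f, a, b, lambda x: c, lambda x: d)   # P(a<=X<=b, c<=Y<=d)
exact = (stats.norm.cdf(b) - stats.norm.cdf(a)) * (stats.norm.cdf(d) - stats.norm.cdf(c))
print(p, exact)   # both approximately 0.326
```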

Page 66:

Joint PDF of Multiple Random Variables (figure)

Page 67:

Joint PDF of Multiple Random Variables

❑ To interpret the joint PDF, we let δ be a small positive number and consider the probability of a small rectangle. Then we have

P(a ≤ X ≤ a + δ, c ≤ Y ≤ c + δ) = ∫_c^{c+δ} ∫_a^{a+δ} fX,Y(x, y) dx dy ≈ fX,Y(a, c) · δ²,

so we can view fX,Y(a, c) as the probability per unit area in the vicinity of (a, c).

Page 68:

Joint PDF of Multiple Random Variables (figure)

Page 69:

Joint PDF of Multiple Random Variables

❑ The joint PDF contains all relevant probabilistic information on the random variables X, Y, and their dependencies.

❑ Therefore, the joint PDF allows us to calculate the probability of any event that can be defined in terms of these two random variables.

Page 70:

Marginals

❑ Marginal PDF. For continuous r.v.s X and Y with joint PDF fX,Y, the marginal PDF of X is

fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy.

❑ Similarly, the marginal PDF of Y is

fY(y) = ∫_{−∞}^{∞} fX,Y(x, y) dx.

Page 71:

Marginals

❑ Marginalisation works analogously with any number of variables. For example, if we have the joint PDF of X, Y, Z, W but want the joint PDF of X, W, we just have to integrate over all possible values of Y and Z:

fX,W(x, w) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y,Z,W(x, y, z, w) dy dz.

➢ Conceptually this is very easy: just integrate over the unwanted variables to get the joint PDF of the wanted variables. Computing the integral, however, may or may not be difficult.
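A grid-based sketch of marginalisation (my own toy example with two variables; integrating out several unwanted variables works the same way):

```python
import numpy as np
from scipy import stats

# Toy joint PDF on a grid: X and Y independent N(0, 1), so f_{X,Y}(x, y) = phi(x) * phi(y).
x = np.linspace(-6, 6, 601)
y = np.linspace(-6, 6, 601)
fxy = stats.norm.pdf(x)[:, None] * stats.norm.pdf(y)[None, :]

# Marginal of X: integrate the joint PDF over all y (trapezoidal rule on the grid).
fx = np.trapz(fxy, y, axis=1)
print(np.max(np.abs(fx - stats.norm.pdf(x))))   # ~ 0, i.e. the N(0, 1) marginal is recovered
```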

Page 72:

Marginals

❑ Example 1.

Page 73:

Marginals

❑ Example 1.

Page 74:

Joint CDFs

❑ If X and Y are two random variables associated with the same experiment, their joint CDF is defined by

FX,Y(x, y) = P(X ≤ x, Y ≤ y).

❑ The joint CDF is the joint probability of the two events {X ≤ x} and {Y ≤ y}.

❑ If X and Y are described by a joint PDF fX,Y, then

FX,Y(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} fX,Y(s, t) dt ds.

Page 75:

Joint PDF of Multiple Random Variables

❑ Conversely, if X and Y are continuous with joint CDF FX,Y, their joint PDF is the derivative of the joint CDF with respect to x and y:

fX,Y(x, y) = ∂²FX,Y(x, y) / (∂x ∂y).

Page 76:

Joint CDF of Multiple Random Variables

❑ Let X and Y be described by a uniform PDF on the unit square. The joint CDF is given by

FX,Y(x, y) = P(X ≤ x, Y ≤ y) = xy, for (x, y) in the unit square.

❑ It can be verified that

fX,Y(x, y) = ∂²FX,Y(x, y) / (∂x ∂y) = 1

for all (x, y) in the unit square.

Page 77:

Expectation

❑ If X and Y are jointly continuous random variables and ɡ is some function, then Z = ɡ(X, Y) is also a random variable. Thus the expected value rule applies:

E[ɡ(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ɡ(x, y) fX,Y(x, y) dx dy.

❑ As an important special case, for any scalars a, b, and c, we have

E[aX + bY + c] = aE[X] + bE[Y] + c.

Page 78:

More than Two Random Variables

❑ The joint PDF of three random variables X, Y, and Z is defined in analogy with the case of two random variables. For example,

P((X, Y, Z) ∈ B) = ∫∫∫_{(x,y,z)∈B} fX,Y,Z(x, y, z) dx dy dz

for any set B. We also have relations such as

fX,Y(x, y) = ∫_{−∞}^{∞} fX,Y,Z(x, y, z) dz,  fX(x) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y,Z(x, y, z) dy dz.

Page 79:

More than Two Random Variables

❑ The expected value rule takes the form

E[ɡ(X, Y, Z)] = ∫∫∫ ɡ(x, y, z) fX,Y,Z(x, y, z) dx dy dz.

❑ If ɡ is linear, of the form aX + bY + cZ, then

E[aX + bY + cZ] = aE[X] + bE[Y] + cE[Z].

Page 80:

More than Two Random Variables (figure)

Page 81:

More than Two Random Variables (figure)

Page 82:

Conditioning

❑ The conditional PDF of a continuous random variable X, given an event A with P(A) > 0, is defined as a nonnegative function fX|A that satisfies

P(X ∈ B | A) = ∫_B fX|A(x) dx

for any subset B of the real line.

Page 83:

Conditioning

❑ In particular, by letting B be the entire real line, we obtain the normalisation property

∫_{−∞}^{∞} fX|A(x) dx = 1,

so that fX|A is a legitimate PDF.

Page 84:

Conditioning

❑ In the important special case where we condition on an event of the form {X ∈ A}, with P(X ∈ A) > 0, the definition of conditional probabilities yields

P(X ∈ B | X ∈ A) = P(X ∈ B, X ∈ A) / P(X ∈ A).

❑ By comparing with the earlier formula, it gives

fX|{X∈A}(x) = fX(x) / P(X ∈ A) if x ∈ A, and 0 otherwise.

Page 85:

Conditioning (figure)

Page 86:

Joint Conditional PDF

❑ Suppose that X and Y are jointly continuous random variables, with joint PDF fX,Y. If we condition on a positive probability event of the form C = {(X, Y) ∈ A}, we have

fX,Y|C(x, y) = fX,Y(x, y) / P(C) if (x, y) ∈ A, and 0 otherwise.

❑ In this case, the conditional PDF of X, given this event, can be obtained from the formula

fX|C(x) = ∫_{−∞}^{∞} fX,Y|C(x, y) dy.

Page 87:

Joint Conditional PDF

❑ These two formulas provide one possible method for obtaining the conditional PDF of a random variable X when the conditioning event is not of the form {X ∈ A}, but is instead defined in terms of multiple random variables.

Page 88:

Joint Conditional PDF

❑ A version of the total probability theorem which involves conditional PDFs is given as follows: if the events A1, …, An form a partition of the sample space, then

fX(x) = Σ_{i=1}^{n} P(Ai) fX|Ai(x).

❑ Using the total probability theorem,

P(X ≤ x) = Σ_{i=1}^{n} P(Ai) P(X ≤ x | Ai).

Page 89:

Joint Conditional PDF

❑ Finally, the formula can be written as

∫_{−∞}^{x} fX(t) dt = Σ_{i=1}^{n} P(Ai) ∫_{−∞}^{x} fX|Ai(t) dt.

❑ We then take the derivative of both sides, with respect to x, and obtain the desired result.

Page 90:

Conditioning (figure)

Page 91:

Conditioning (figure)

Page 92:

Conditioning (figure)

Page 93:

Joint Conditional PDF

❑ To interpret the conditional PDF, let us fix some small positive numbers δ1 and δ2, and condition on the event B = {y ≤ Y ≤ y + δ2}. We have

P(x ≤ X ≤ x + δ1 | y ≤ Y ≤ y + δ2) ≈ fX,Y(x, y) δ1 δ2 / (fY(y) δ2) = fX|Y(x|y) δ1.

Page 94:

Joint Conditional PDF

❑ Therefore, fX|Y(x|y)δ1 provides us with the probability that X belongs to a small interval [x, x + δ1], given that Y belongs to a small interval [y, y + δ2]. Since fX|Y(x|y)δ1 does not depend on δ2, we can think of the limiting case where δ2 decreases to zero and write

P(x ≤ X ≤ x + δ1 | Y = y) ≈ fX|Y(x|y) δ1,

❑ and more generally,

P(X ∈ A | Y = y) = ∫_A fX|Y(x|y) dx.

Page 95:

Joint Conditional PDF

❑ The conditional PDF fX|Y(x|y) can be seen as a description of the probability law of X, given that the event {Y = y} has occurred.

❑ As in the discrete case, the conditional PDF fX|Y, together with the marginal PDF fY, is sometimes used to calculate the joint PDF.

➢ This approach can also be used for modelling: instead of directly specifying fX,Y, it is often natural to provide a probability law for Y, in terms of a PDF fY, and then provide a conditional PDF fX|Y(x|y) for X, given any possible value y of Y.

Page 96:

Joint Conditional PDF

❑ Example. The speed of a typical vehicle that drives past a police radar is modelled as an exponentially distributed random variable X with mean 50 miles per hour. The police radar's measurement Y of the vehicle's speed has an error which is modelled as a normal random variable with zero mean and standard deviation equal to one tenth of the vehicle's speed. What is the joint PDF of X and Y?

Page 97:

Joint Conditional PDF

❑ Solution. We have fX(x) = (1/50)e^{−x/50}, for x ≥ 0. Also, conditioned on X = x, the measurement Y has a normal PDF with mean x and variance x²/100. Therefore,

fY|X(y|x) = (10/(x√(2π))) e^{−50(y−x)²/x²}.

❑ Thus, for all x ≥ 0 and all y,

fX,Y(x, y) = fX(x) fY|X(y|x) = (1/50)e^{−x/50} · (10/(x√(2π))) e^{−50(y−x)²/x²}.
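A simulation sketch of this model, using the numbers from the example (the sampling code itself is mine, not the slides'): X is drawn from the exponential prior and then Y is drawn from N(x, (x/10)²) given X = x.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.exponential(scale=50.0, size=n)    # X: true speed, exponential with mean 50 mph
y = rng.normal(loc=x, scale=x / 10.0)      # Y | X = x  ~  N(x, (x/10)^2)

print(x.mean())                            # ~ 50, the mean of the exponential prior
print(np.mean(np.abs(y - x) / x))          # average relative measurement error, ~ 0.08
```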

Page 98:

Conditional PDF for More Than Two r.v.s.

❑ Conditional PDFs can also be defined for more than two random variables, for example

fX,Y|Z(x, y | z) = fX,Y,Z(x, y, z) / fZ(z),  fX|Y,Z(x | y, z) = fX,Y,Z(x, y, z) / fY,Z(y, z).

❑ The analogous multiplication rule is given as

fX,Y,Z(x, y, z) = fX|Y,Z(x | y, z) fY|Z(y | z) fZ(z).

Page 99:

Conditional Expectation

❑ For a continuous random variable X, we define the conditional expectation E[X|A] given an event A similarly to the unconditional case, except that we now need to use the conditional PDF fX|A.

❑ Let X and Y be jointly continuous random variables, and let A be an event with P(A) > 0. The conditional expectation of X given the event A is defined by

E[X | A] = ∫_{−∞}^{∞} x fX|A(x) dx.

Page 100:

Conditional Expectation

❑ The conditional expectation of X given that Y = y is defined by

E[X | Y = y] = ∫_{−∞}^{∞} x fX|Y(x|y) dx.

❑ The expectation rule, for a function ɡ(x):

E[ɡ(X) | A] = ∫_{−∞}^{∞} ɡ(x) fX|A(x) dx, and E[ɡ(X) | Y = y] = ∫_{−∞}^{∞} ɡ(x) fX|Y(x|y) dx.

Page 101:

Conditional Expectation

❑ Total expectation theorem: Let A1, A2, …, An be disjoint events that form a partition of the sample space, and assume that P(Ai) > 0 for all i. Then

E[X] = Σ_{i=1}^{n} P(Ai) E[X | Ai].

❑ Similarly,

E[X] = ∫_{−∞}^{∞} E[X | Y = y] fY(y) dy.

Page 102:

Conditional Expectation

❑ There are natural analogues for the case of functions of several random variables. For example,

E[ɡ(X, Y) | Y = y] = ∫_{−∞}^{∞} ɡ(x, y) fX|Y(x|y) dx,

❑ and

E[ɡ(X, Y)] = ∫_{−∞}^{∞} E[ɡ(X, Y) | Y = y] fY(y) dy.

Page 103:

Independence

❑ Two continuous random variables X and Y are independent if their joint PDF is the product of the marginal PDFs:

fX,Y(x, y) = fX(x) fY(y) for all x, y.

❑ Comparing with the formula fX,Y(x, y) = fX|Y(x|y)fY(y), we see that independence is the same as the condition

fX|Y(x|y) = fX(x) for all x and all y with fY(y) > 0,

or, symmetrically,

fY|X(y|x) = fY(y) for all y and all x with fX(x) > 0.

Page 104:

Independence

❑ Independence extends to more than two random variables. For example, we say that three random variables X, Y, and Z are independent if

fX,Y,Z(x, y, z) = fX(x) fY(y) fZ(z) for all x, y, z.

Page 105:

Independence

❑ Example. Independent Normal Random Variables. Let X and Y be independent normal random variables with means μx, μy and variances σx², σy², respectively. Their joint PDF is of the form

fX,Y(x, y) = (1/(2πσxσy)) e^{−(x−μx)²/(2σx²) − (y−μy)²/(2σy²)}.

❑ This joint PDF has the shape of a bell centred at (μx, μy), whose width in the x and y directions is proportional to σx and σy, respectively.

Page 106:

Independence

❑ Additional insight into the form of the PDF can be gained by considering its contours,
➢ i.e., sets of points at which the PDF takes a constant value.

❑ These contours are described by an equation of the form

(x − μx)²/σx² + (y − μy)²/σy² = constant,

and are ellipses whose two axes are horizontal and vertical. If σx = σy, then the contours are circles.

Page 107:

Independence (figure)

Page 108:

Independence (figure)

Page 109:

Independence

❑ If X and Y are independent, then any two events of the form {X ∈ A} and {Y ∈ B} are independent:

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

Page 110:

Independence

❑ Independence implies that

FX,Y(x, y) = FX(x) FY(y) for all x and y.

❑ The property

FX,Y(x, y) = FX(x) FY(y) for all x and y

can be used to provide a general definition of independence between two random variables, e.g., if X is discrete and Y is continuous.

Page 111:

Independence

❑ Similarly to the discrete case, if X and Y are independent, then

E[ɡ(X)h(Y)] = E[ɡ(X)] E[h(Y)]

for any two functions ɡ and h.

❑ The variance of the sum of independent random variables is equal to the sum of their variances:

var(X + Y) = var(X) + var(Y).

Page 112:

Summary of Independence (figure)

Page 113:

Summary of Independence (figure)

Page 114:

The continuous Bayes' rule: Inference problem (figure)

Page 115:

The continuous Bayes' rule

❑ Inference problem:

❑ We have an unobserved random variable X with known PDF, and we obtain a measurement Y according to a conditional PDF fY|X. Given an observed value y of Y, the inference problem is to evaluate the conditional PDF fX|Y(x|y).

Page 116:

The continuous Bayes' rule

❑ Thus, whatever information is provided by the event {Y = y} is captured by the conditional PDF fX|Y(x|y). It thus suffices to evaluate this PDF. From the formula fX fY|X = fX,Y = fY fX|Y, it follows that

fX|Y(x|y) = fX(x) fY|X(y|x) / fY(y).

Page 117:

The continuous Bayes' rule

❑ Based on the normalisation property

fY(y) = ∫_{−∞}^{∞} fX(t) fY|X(y|t) dt,

an equivalent expression is

fX|Y(x|y) = fX(x) fY|X(y|x) / ∫_{−∞}^{∞} fX(t) fY|X(y|t) dt.
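A numerical sketch of the continuous Bayes' rule on a grid (my own example, reusing the radar model: exponential prior with mean 50 and Y | X = x ∼ N(x, (x/10)²); the grid and the observed value y = 65 are assumptions of the sketch):

```python
import numpy as np
from scipy import stats

x = np.linspace(0.01, 300, 6000)                          # grid over the unknown speed X
prior = stats.expon(scale=50).pdf(x)                      # f_X(x)
y_obs = 65.0                                              # observed radar measurement
likelihood = stats.norm(loc=x, scale=x / 10).pdf(y_obs)   # f_{Y|X}(y_obs | x)

posterior = prior * likelihood
posterior /= np.trapz(posterior, x)                       # divide by f_Y(y_obs)

print(np.trapz(x * posterior, x))   # posterior mean E[X | Y = 65], somewhat below 65
```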

Page 118:

The continuous Bayes' rule (figure)

Page 119:

The continuous Bayes' rule (figure)

Page 120:

The Bayes' rule – discrete unknown, continuous measurement (figure)

Page 121:

The Bayes' rule – continuous unknown, discrete measurement (figure)

Page 122:

Sums of Independent Random Variables

Convolution

❑ Let Z = X + Y, where X and Y are independent integer-valued random variables with PMFs pX and pY, respectively. Then, for any integer z,

pZ(z) = P(X + Y = z) = Σ_x pX(x) pY(z − x).

❑ The resulting PMF pZ is called the convolution of the PMFs of X and Y.
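A short sketch of the convolution formula using two fair dice as an example (np.convolve carries out the sum over x):

```python
import numpy as np

p_die = np.full(6, 1 / 6)               # PMF of one fair die on the values 1..6
p_sum = np.convolve(p_die, p_die)       # PMF of Z = X + Y on the values 2..12

for z, p in zip(range(2, 13), p_sum):
    print(z, round(p, 4))               # e.g. P(Z = 7) = 0.1667
```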

Page 123:

Covariance and Correlation

❑ The covariance of two random variables X and Y, denoted by cov(X, Y), is defined as

cov(X, Y) = E[(X − E[X])(Y − E[Y])].

❑ When cov(X, Y) = 0, we say X and Y are uncorrelated.
➢ A positive or negative covariance indicates that the values of X − E[X] and Y − E[Y] obtained in a single experiment “tend” to have the same or the opposite sign, respectively.

Page 124:

Covariance and Correlation (figure)

Page 125:

Covariance and Correlation

❑ Multiplying this out and using linearity, we have an equivalent expression:

cov(X, Y) = E[XY] − E[X]E[Y].

❑ Covariance has the following key properties:

1. Cov(X, X) = Var(X).
2. Cov(X, Y) = Cov(Y, X).
3. Cov(X, c) = 0 for any constant c.
4. Cov(aX, Y) = aCov(X, Y) for any constant a.

Page 126:

Covariance and Correlation

5. Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z).
6. Cov(X + Y, Z + W) = Cov(X, Z) + Cov(X, W) + Cov(Y, Z) + Cov(Y, W).
7. Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y). For n r.v.s X1, …, Xn,

Var(X1 + … + Xn) = Σ_{i=1}^{n} Var(Xi) + 2 Σ_{i<j} Cov(Xi, Xj).
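A simulation sketch checking properties 5 and 7 with NumPy (X, Y, Z below are my own toy variables, with Y deliberately correlated with X):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)          # correlated with x
z = rng.normal(size=n)              # independent of x and y

cov = lambda u, v: np.mean((u - u.mean()) * (v - v.mean()))

print(cov(x + y, z), cov(x, z) + cov(y, z))                    # property 5: both ~ 0
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov(x, y))    # property 7: both ~ 5
```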

Page 127:

Covariance and Correlation

❑ The correlation coefficient ρ(X, Y) of two random variables X and Y that have nonzero variances is defined as

ρ(X, Y) = cov(X, Y) / √(var(X) var(Y)).

❑ It may be viewed as a normalised version of the covariance cov(X, Y).

❑ ρ ranges from −1 to 1.

Page 128:

Covariance and Correlation

❑ If ρ > 0 (or ρ < 0), then the values of X − E[X] and Y − E[Y] “tend” to have the same (or opposite, respectively) sign.
➢ The size of |ρ| provides a normalised measure of the extent to which this is true.
➢ Always assuming that X and Y have positive variances, it can be shown that ρ = 1 (or ρ = −1) if and only if there exists a positive (or negative, respectively) constant c such that

Y − E[Y] = c(X − E[X]).

Page 129:

Covariance and Correlation (figure)

Page 130:

Covariance and Correlation (figure)