Ch 2, Part 1

TRANSCRIPT


  • Introduction

    Statistical decision theory deals with situations where decisions have to be made under a state of uncertainty, and its goal is to provide a rational framework for dealing with such situations.

    The Bayesian approach is a particular way of formulating and dealing with statistical decision problems. More specifically, it offers a method of formalizing a priori beliefs and of combining them with the available observations.


  • Introduction (cont.)

    The sea bass/salmon example: state of nature, prior.

    The state of nature ω is a random variable.

    The catch of salmon and sea bass is equiprobable:

    P(ω1) = P(ω2) (uniform priors)

    P(ω1) + P(ω2) = 1

    [Figure: a salmon and a sea bass]


  • Decision Rule From Only Priors

    The a priori or prior probability reflects our knowledge of how likely each state of nature is before we can actually observe it.

    What is a reasonable decision rule if

    the only available information is the prior, and

    the cost of any incorrect classification is equal?

    Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.

    What can we say about this decision rule? It seems reasonable, but it will always choose the same fish.

    If the priors are uniform, this rule will behave poorly.
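    A minimal sketch of this prior-only rule in Python; the 0.7/0.3 priors below are illustrative assumptions, not values from the slides:

    ```python
    # Prior-only classifier: with no features, always pick the most probable class.
    P_w1, P_w2 = 0.7, 0.3  # illustrative priors P(w1), P(w2)

    decision = "w1" if P_w1 > P_w2 else "w2"

    # The rule always returns the same class, so its error rate is the
    # probability of the other class: P(error) = min(P_w1, P_w2).
    print(decision)                       # w1
    print("P(error) =", min(P_w1, P_w2))  # 0.3
    ```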


  • Class-Conditional Density

    The class-conditional probability density function is the probability density function for x, our feature, given that the state of nature is ωj: P(x | ωj).

    P(x | ω1) and P(x | ω2) describe the difference in lightness between populations of sea bass and salmon.


  • Class-Conditional Probabilities

    [Figure: class-conditional densities P(x | ω1) and P(x | ω2) as functions of the lightness feature x]


  • Posterior Probability: Bayes Formula

    Combine the prior and class-conditional probabilities. The posterior probability is the probability of a certain state of nature given our observables:

    P(ωj | x) = P(x | ωj) P(ωj) / P(x), where the evidence P(x) = Σj P(x | ωj) P(ωj)
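    A quick numeric sketch of the formula; the priors and the likelihood values at the observed x are made-up numbers for illustration:

    ```python
    # Bayes formula: P(w_j | x) = P(x | w_j) P(w_j) / P(x),
    # where the evidence P(x) = sum_j P(x | w_j) P(w_j) normalizes the posteriors.
    priors = [0.5, 0.5]        # P(w1), P(w2): illustrative uniform priors
    likelihoods = [0.6, 0.2]   # P(x | w1), P(x | w2) at some observed x

    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]

    print(posteriors)  # [0.75, 0.25] -> decide w1
    ```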


  • Posterior Probability: Bayes Formula (cont.)

    [Figure: posterior probabilities P(ω1 | x) and P(ω2 | x) as functions of the feature x]


  • Likelihood Ratio Test (LRT)

    The rule for a 2-class problem:

    if P(ω1 | x) > P(ω2 | x) choose ω1, else choose ω2

    Or, in a more compact form:

    P(ω1 | x) ≷ P(ω2 | x)

    Applying Bayes rule, the evidence P(x) cancels and the rule becomes a likelihood ratio test:

    Λ(x) = P(x | ω1) / P(x | ω2) ≷ P(ω2) / P(ω1)
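    A small sketch of the test as a function; the argument values in the example call are illustrative assumptions:

    ```python
    def lrt_decision(px_w1, px_w2, P_w1, P_w2):
        """Two-class likelihood ratio test: decide w1 if
        P(x|w1)/P(x|w2) > P(w2)/P(w1). The evidence P(x) has already
        cancelled out, so only likelihoods and priors are needed."""
        return "w1" if px_w1 / px_w2 > P_w2 / P_w1 else "w2"

    # With equal priors the threshold is 1, so the larger likelihood wins:
    print(lrt_decision(px_w1=0.3, px_w2=0.1, P_w1=0.5, P_w2=0.5))  # w1
    ```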


  • Probability of Error

    For a given observation x, we would be inclined to let the posterior govern our decision: decide ω1 if P(ω1 | x) > P(ω2 | x), and ω2 otherwise.

    The probability of error is:

    P(error | x) = P(ω1 | x) if we decide ω2, and P(ω2 | x) if we decide ω1.



  • How Good Is the LRT Decision Rule?

    To answer this question, it is convenient to express P[error] in terms of the posterior:

    P[error] = ∫ P(error | x) p(x) dx

    The optimal decision rule will minimize P(error | x) at every value of x in feature space, so that the integral above is minimized.
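    One way to make this concrete is to approximate the integral numerically. The sketch below assumes two made-up unit-variance Gaussian class-conditionals with means 0 and 2 and equal priors; for that setup the exact Bayes error is Φ(-1) ≈ 0.1587, which the Riemann sum should approach:

    ```python
    import math

    def gaussian(x, mu, sigma):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    P_w1, P_w2 = 0.5, 0.5                 # assumed equal priors
    p1 = lambda x: gaussian(x, 0.0, 1.0)  # assumed P(x | w1)
    p2 = lambda x: gaussian(x, 2.0, 1.0)  # assumed P(x | w2)

    # P[error] = integral of P(error | x) p(x) dx, where the Bayes rule
    # gives P(error | x) = min[P(w1|x), P(w2|x)].
    dx, err, x = 0.001, 0.0, -10.0
    while x < 12.0:
        evidence = p1(x) * P_w1 + p2(x) * P_w2  # p(x)
        post1 = p1(x) * P_w1 / evidence         # P(w1 | x)
        err += min(post1, 1.0 - post1) * evidence * dx
        x += dx

    print(f"Bayes error ~ {err:.4f}")  # ~0.1587
    ```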


  • Minimizing the Probability of Error

    Decide ω1 if P(ω1 | x) > P(ω2 | x);

    otherwise decide ω2.

    Therefore:

    P(error | x) = min [P(ω1 | x), P(ω2 | x)]

    (Bayes decision)


  • Bayesian Decision Theory: Continuous Features

    Generalization of the preceding ideas:

    Use more than one feature (e.g., length and lightness)

    Use more than two states of nature

    Allow actions, and not only decisions on the state of nature

    Introduce a loss function which is more general than the probability of error (e.g., errors are not equally costly)

    Allowing actions other than classification primarily allows the possibility of rejection.

    The loss function states how costly each action taken is.


  • Loss Functions

    A loss function states exactly how costly each action is.

    Let {ω1, ω2, …, ωc} be the set of c states of nature (or categories).

    Let {α1, α2, …, αa} be the set of possible actions.

    Let the loss function λ(αi | ωj) be the loss incurred for taking action αi when the state of nature is ωj.

    A general decision rule is a function α(x) that tells us which action to take for every possible observation.


  • Overall Risk

    The expected loss, or conditional risk, from taking action αi is:

    R(αi | x) = Σ(j=1..c) λ(αi | ωj) P(ωj | x),  for i = 1, …, a

    The overall risk is the expected loss of the decision rule over all x: R = ∫ R(α(x) | x) p(x) dx.

    Minimizing R implies minimizing R(αi | x) at every x.
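    A sketch of evaluating these conditional risks, with a hypothetical third "reject" action added to illustrate the rejection option mentioned a few slides back; the loss matrix and posteriors are assumed values:

    ```python
    # R(a_i | x) = sum_j lambda(a_i | w_j) P(w_j | x)
    loss = [
        [0.0, 1.0],    # a1: decide w1
        [1.0, 0.0],    # a2: decide w2
        [0.25, 0.25],  # a3: reject, at a small fixed loss either way
    ]
    posteriors = [0.6, 0.4]  # P(w1 | x), P(w2 | x) at some observed x

    risks = [sum(l * p for l, p in zip(row, posteriors)) for row in loss]
    print(risks)  # [0.4, 0.6, 0.25] -> rejecting is the minimum-risk action
    ```

    Here neither posterior is large enough to beat the cheap reject action, so the minimum-risk choice is to withhold a decision.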


  • The Minimum Overall Risk

    The Bayes decision rule gives us a method for minimizing the overall risk.

    Select the action that minimizes the conditional risk:

    α* = argmin(i) R(αi | x)

    R is then minimum, and R in this case is called the Bayes risk: the best performance that can be achieved!


  • Two-Category Classification Examples

    Consider two classes and two actions:

    α1: deciding ω1; α2: deciding ω2

    λij = λ(αi | ωj) is the loss incurred for deciding ωi when the true state of nature is ωj.

    Conditional risk:

    R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)

    R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)
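    A direct numeric sketch of these two risks; the loss matrix and the posteriors are illustrative assumptions. Note how a large λ12 can overturn the class favored by the posteriors alone:

    ```python
    # lam[i][j] = loss for deciding w_(i+1) when the true class is w_(j+1)
    lam = [[0.0, 3.0],
           [1.0, 0.0]]
    P1_x, P2_x = 0.7, 0.3  # posteriors P(w1 | x), P(w2 | x)

    R1 = lam[0][0] * P1_x + lam[0][1] * P2_x  # R(a1 | x) = 0.9
    R2 = lam[1][0] * P1_x + lam[1][1] * P2_x  # R(a2 | x) = 0.7

    print("decide w1" if R1 < R2 else "decide w2")  # decide w2
    ```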


  • Two-Category Classification Examples (cont.)

    Our rule is the following: if R(α1 | x) < R(α2 | x), action α1 (decide ω1) is taken.

    In terms of posteriors, decide ω1 if:

    (λ21 - λ11) P(ω1 | x) > (λ12 - λ22) P(ω2 | x)

    and decide ω2 otherwise.

    Or, expanding via Bayes rule, decide ω1 if:

    (λ21 - λ11) P(x | ω1) P(ω1) > (λ12 - λ22) P(x | ω2) P(ω2)

    and decide ω2 otherwise.


  • Likelihood Ratio

    The preceding rule is equivalent to the following rule:

    if P(x | ω1) / P(x | ω2) > [(λ12 - λ22) / (λ21 - λ11)] · [P(ω2) / P(ω1)]

    then take action α1 (decide ω1); otherwise take action α2 (decide ω2).
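    The same illustrative numbers as in the risk-comparison sketch above, now pushed through the threshold form; with equal priors the posteriors equal the normalized likelihoods, so the decision agrees:

    ```python
    def decide(px_w1, px_w2, P_w1, P_w2, l11=0.0, l12=3.0, l21=1.0, l22=0.0):
        """Decide w1 iff the likelihood ratio exceeds the loss- and
        prior-dependent threshold (the default losses are assumed values)."""
        threshold = ((l12 - l22) / (l21 - l11)) * (P_w2 / P_w1)
        return "w1" if px_w1 / px_w2 > threshold else "w2"

    # Likelihoods 0.7 and 0.3 at some x, equal priors (assumed numbers):
    print(decide(0.7, 0.3, 0.5, 0.5))  # ratio 2.33 vs threshold 3.0 -> w2
    ```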


  • Optimal Decision Property

    If the likelihood ratio exceeds a threshold value that is independent of the input pattern x, we can take optimal actions.


  • Likelihood Ratio Test: An Example


  • Likelihood Ratio Test: Example 2


  • Example 2: Answer


  • Likelihood Ratio Test: Example 3

    Select the optimal decision where:

    Ω = {ω1, ω2}

    P(x | ω1) = N(2, 0.5)

    P(x | ω2) = N(1.5, 0.2)

    P(ω1) = 2/3

    P(ω2) = 1/3
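    A worked sketch of Example 3, under two stated assumptions: a zero-one loss, so the rule reduces to comparing prior-weighted likelihoods, and N(μ, σ) read as mean and standard deviation:

    ```python
    import math

    def gaussian(x, mu, sigma):
        """Normal density with mean mu and standard deviation sigma."""
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    P_w1, P_w2 = 2.0 / 3.0, 1.0 / 3.0

    def decide(x):
        # Zero-one loss: decide w1 iff P(x | w1) P(w1) > P(x | w2) P(w2).
        g1 = gaussian(x, 2.0, 0.5) * P_w1
        g2 = gaussian(x, 1.5, 0.2) * P_w2
        return "w1" if g1 > g2 else "w2"

    for x in (1.0, 1.5, 1.8, 2.5):
        print(x, "->", decide(x))  # w1, w2, w1, w1
    ```

    With these assumptions, ω2 wins only on a narrow band around x = 1.5, where its sharp, low-variance density overcomes the 2:1 prior in favor of ω1.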