bigdata edu lecture slide pdfs w002v001

Upload: jimmy-alejo

Post on 04-Jun-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    1/28

    Detector Confidence

    Week 2 Video 1

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    2/28

    Classification

    ! There is something you want to predict (the label)! The thing you want to predict is categorical

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    3/28

    It can be useful to know yes or no

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    4/28

    It can be useful to know yes or no

    ! The detector says you dont have PtarmigansDisease!

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    5/28

    It can be useful to know yes or no

    ! But its even more useful to know how certain theprediction is

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    6/28

    It can be useful to know yes or no

    ! But its even more useful to know how certain theprediction is

    ! The detector says there is a 50.1% chance that youdont have Ptarmigans disease!

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    7/28

    Uses of detector confidence

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    8/28

    Uses of detector confidence

    ! Gradated intervention! Give a strong intervention if confidence over 60%! Give no intervention if confidence under 60%! Give fail-soft intervention if confidence 40-60%

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    9/28

    Uses of detector confidence

    ! Decisions about strength of intervention can bemade based on cost-benefit analysis

    ! What is the cost of an incorrectly appliedintervention?

    ! What is the benefit of a correctly appliedintervention?

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    10/28

    Example

    ! An incorrectly applied intervention will cost thestudent 1 minute

    ! Each minute the student typically will learn 0.05%of course content

    ! A correctly applied intervention will result in thestudent learning 0.03% more course content thanthey would have learned otherwise

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    11/28

    Expected Value of Intervention

    ! 0.03*Confidence 0.05 * (1-Confidence)

    -0.05

    -0.04

    -0.03

    -0.02

    -0.01

    0

    0.01

    0.02

    0.03

    0.04

    0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

    ExpectedGain

    Detector Confidence

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    12/28

    Adding a second intervention

    ! 0.05*Confidence 0.08 * (1-Confidence)

    -0.12

    -0.1

    -0.08

    -0.06

    -0.04

    -0.02

    0

    0.02

    0.04

    0.06

    0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

    Ex

    pectedGain

    Detector Confidence

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    13/28

    Intervention cut-points

    -0.12

    -0.1

    -0.08

    -0.06

    -0.04

    -0.02

    0

    0.02

    0.04

    0.06

    0.3 0.4 0.5 0.6 0.7 0.8 0.9

    ExpectedGain

    Detector Confidence

    FAIL SOFTSTRONGER

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    14/28

    Uses of detector confidence

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    15/28

    Uses of detector confidence

    ! Discovery with models analyses! When you use this model in further analyses! Well discuss this later in the course! Big idea: keep all of your information around

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    16/28

    Not always available

    ! Not all classifiers provide confidence estimates

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    17/28

    Not always available

    ! Not all classifiers provide confidence estimates

    ! Some, like step regression, provide pseudo-confidences

    ! do not scale nicely from 0 to 1! but still show relative strength that can be used in

    comparing two predictions to each other

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    18/28

    Some algorithms give it to you in

    straightforward fashion

    ! Confidence = 72%

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    19/28

    With others, you need to parse it out of

    software output

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    20/28

    With others, you need to parse it out of

    software output

    C = Y / (Y+N)

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    21/28

    With others, you need to parse it out of

    software output

    C = 2 / (2+1)

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    22/28

    With others, you need to parse it out of

    software output

    C = 66.6667%

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    23/28

    With others, you need to parse it out of

    software output

    C = 100%

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    24/28

    With others, you need to parse it out of

    software output

    C = 2.22%

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    25/28

    With others, you need to parse it out of

    software output

    C = 2.22%

    (or NO with

    97.88%)

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    26/28

    Confidence can be lumpy

    ! Previous tree only had values! 100%, 66.67%, 50%, 2.22%

    ! This isnt a problem per-se! But some implementations of standard metrics (like A)

    can behave oddly in this case

    !Well discuss this later this week

    ! Common in tree and rule based classifiers

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    27/28

    Confidence

    ! Almost always a good idea to use it when itsavailable

    ! Not all metrics use it, well discuss this later this week

  • 8/13/2019 Bigdata Edu Lecture Slide PDFs W002V001

    28/28

    Thanks!