Section 1 Introduction Combined


TRANSCRIPT

  • Slide 1/40

    Probabilistic Graphical Models: Introduction

    The PGM Course

  • Slide 2/40

    Probabilistic Models

  • Slide 3/40

    Course Structure

    10 weeks + final
    Videos + quizzes
    Problem sets: 25% of score
    Multiple submissions allowed

  • Slide 4/40

    Course Structure

    9 programming assignments (e.g., genetically inherited diseases, optical character recognition)
    9 × 7% = 63% of score
    Final exam: 12% of score

  • Slide 5/40

    Background

    Required: basic probability theory, some programming
    Recommended: machine learning, simple optimization, Matlab or Octave

  • Slide 6/40

    Other Issues

    Honor code
    Time management (10-15 hrs/week)
    Discussion forums, study groups

  • Slide 7/40

    What you'll learn

    Fundamental methods
    Real-world applications
    How to use these methods in your work

  • Slide 8/40

    Motivation and Overview
    Probabilistic Graphical Models: Introduction
    (Daphne Koller)

  • Slide 9/40

    [Motivating examples: medical diagnosis, with predisposing factors, symptoms, test results, diseases, and treatment outcomes; and image segmentation, with millions of pixels or thousands of superpixels, each of which needs to be labeled from {grass, sky, water, cow, horse, ...}.]

  • Slide 10/40

    Probabilistic Graphical Models

  • Slide 11/40

    Models

    [Diagram: a model is a declarative representation; it can be constructed by elicitation from a domain expert or by learning from data, and the same model can be used by multiple different algorithms.]

  • Slide 12/40

    Uncertainty

    Partial knowledge of the state of the world
    Noisy observations
    Phenomena not covered by our model
    Inherent stochasticity

  • Slide 13/40

    Probability Theory

    Declarative representation with clear semantics
    Powerful reasoning patterns
    Established learning methods

  • Slide 14/40

    Complex Systems

    [Example variables: predisposing factors, symptoms, test results, diseases, treatment outcomes; class labels for thousands of superpixels.]

    Random variables X1, ..., Xn
    Joint distribution P(X1, ..., Xn)

  • Slide 15/40

    Graphical Models

    [Example Bayesian network over Intelligence, Difficulty, Grade, SAT, and Letter; example Markov network over nodes A, B, C, D.]

    Bayesian networks
    Markov networks

  • Slide 16/40

    Graphical Models

    (Network figure from M. Pradhan, G. Provan, B. Middleton, M. Henrion, UAI '94)

  • Slide 17/40

    Graphical Representation

    Intuitive & compact data structure
    Efficient reasoning using general-purpose algorithms
    Sparse parameterization: feasible elicitation, learning from data

  • Slide 18/40

    Many Applications

    Medical diagnosis
    Fault diagnosis
    Natural language processing
    Traffic analysis
    Social network models
    Message decoding
    Computer vision: image segmentation, 3D reconstruction, holistic scene analysis
    Speech recognition
    Robot localization & mapping

  • Slide 19/40

    Image Segmentation

  • Slide 20/40

    Medical Diagnosis
    (Thanks to: Eric Horvitz, Microsoft Research)

  • Slide 21/40

    Textual Information Extraction

    Mrs. Green spoke today in New York. Green chairs the finance committee.

  • Slide 22/40

    Multi-Sensor Integration: Traffic

    Learned model trained on historical data
    Learns to predict current & future road speed, including on unmeasured roads
    Dynamic route optimization
    Inputs: multiple views on traffic, incident reports, weather
    I-95 corridor experiment: accurate to 5 MPH in 85% of cases
    Fielded in 72 cities
    (Thanks to: Eric Horvitz, Microsoft Research)

  • Slide 23/40

    Biological Network Reconstruction

    [Figure: reconstructed causal signaling network over phospho-proteins and phospho-lipids (PKC, PKA, Raf, Mek, Erk, Plc, Akt, Jnk, P38, PIP2, PIP3), with the nodes perturbed in the data marked.]

    Edges vs. the known network: known 15/17, supported 2/17, reversed 1, missed 3
    Subsequently validated in the wet lab

    Causal protein-signaling networks derived from multiparameter single-cell data (Sachs et al., Science 2005)
    (This figure may be used for non-commercial and classroom purposes only. Any other uses require the prior written permission from AAAS.)

  • Slide 24/40

    Overview

    Representation: directed and undirected; temporal and plate models
    Inference: exact and approximate; decision making
    Learning: parameters and structure; with and without complete data

  • Slide 25/40

    Preliminaries: Distributions
    Probabilistic Graphical Models: Introduction

  • Slide 26/40

    Joint Distribution

    Intelligence (I): i0 (low), i1 (high)
    Difficulty (D): d0 (easy), d1 (hard)
    Grade (G): g1 (A), g2 (B), g3 (C)

    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d0  g2  0.168
    i0  d0  g3  0.126
    i0  d1  g1  0.009
    i0  d1  g2  0.045
    i0  d1  g3  0.126
    i1  d0  g1  0.252
    i1  d0  g2  0.0224
    i1  d0  g3  0.0056
    i1  d1  g1  0.06
    i1  d1  g2  0.036
    i1  d1  g3  0.024
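Not part of the original slides: a minimal Python sketch that stores this joint distribution as a plain lookup table (the name P_IDG and the tuple-key encoding are illustrative choices, not course code), just to make concrete that the distribution is a table of 12 numbers summing to 1.

```python
# Joint distribution P(I, D, G) from the table above, keyed by (i, d, g) assignments.
P_IDG = {
    ("i0", "d0", "g1"): 0.126,  ("i0", "d0", "g2"): 0.168,  ("i0", "d0", "g3"): 0.126,
    ("i0", "d1", "g1"): 0.009,  ("i0", "d1", "g2"): 0.045,  ("i0", "d1", "g3"): 0.126,
    ("i1", "d0", "g1"): 0.252,  ("i1", "d0", "g2"): 0.0224, ("i1", "d0", "g3"): 0.0056,
    ("i1", "d1", "g1"): 0.06,   ("i1", "d1", "g2"): 0.036,  ("i1", "d1", "g3"): 0.024,
}

# A joint distribution assigns a probability to each of the 2 * 2 * 3 = 12
# assignments, and those probabilities sum to 1.
assert abs(sum(P_IDG.values()) - 1.0) < 1e-9
```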

  • Slide 27/40

    Conditioning

    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d0  g2  0.168
    i0  d0  g3  0.126
    i0  d1  g1  0.009
    i0  d1  g2  0.045
    i0  d1  g3  0.126
    i1  d0  g1  0.252
    i1  d0  g2  0.0224
    i1  d0  g3  0.0056
    i1  d1  g1  0.06
    i1  d1  g2  0.036
    i1  d1  g3  0.024

    Condition on G = g1

  • Slide 28/40

    Conditioning: Reduction

    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d1  g1  0.009
    i1  d0  g1  0.252
    i1  d1  g1  0.06

  • Slide 29/40

    Conditioning: Renormalization

    P(I, D, g1):
    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d1  g1  0.009
    i1  d0  g1  0.252
    i1  d1  g1  0.06

    Normalization constant: 0.126 + 0.009 + 0.252 + 0.06 = 0.447

    P(I, D | g1):
    I   D   Prob.
    i0  d0  0.282
    i0  d1  0.02
    i1  d0  0.564
    i1  d1  0.134
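The reduce-and-renormalize step can be sketched in a few lines of Python (editor's illustration; the names P_ID_g1, Z, and P_ID_given_g1 are not from the course):

```python
# Unnormalized measure P(I, D, g1): the rows of the joint consistent with G = g1.
P_ID_g1 = {("i0", "d0"): 0.126, ("i0", "d1"): 0.009,
           ("i1", "d0"): 0.252, ("i1", "d1"): 0.06}

Z = sum(P_ID_g1.values())                        # normalization constant, 0.447 on the slide
P_ID_given_g1 = {k: v / Z for k, v in P_ID_g1.items()}

print(round(Z, 3))                               # 0.447
print({k: round(v, 3) for k, v in P_ID_given_g1.items()})
# {('i0', 'd0'): 0.282, ('i0', 'd1'): 0.02, ('i1', 'd0'): 0.564, ('i1', 'd1'): 0.134}
```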

  • Slide 30/40

    Marginalization

    P(I, D | g1):
    I   D   Prob.
    i0  d0  0.282
    i0  d1  0.02
    i1  d0  0.564
    i1  d1  0.134

    Marginalize I:
    D   Prob.
    d0  0.846
    d1  0.154
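Marginalizing I out of P(I, D | g1), as an illustrative Python sketch (variable names assumed, not from the course):

```python
from collections import defaultdict

# P(I, D | g1) from the previous slide; marginalize I by summing it away.
P_ID_given_g1 = {("i0", "d0"): 0.282, ("i0", "d1"): 0.02,
                 ("i1", "d0"): 0.564, ("i1", "d1"): 0.134}

P_D_given_g1 = defaultdict(float)
for (i, d), p in P_ID_given_g1.items():
    P_D_given_g1[d] += p

print({d: round(p, 3) for d, p in P_D_given_g1.items()})   # {'d0': 0.846, 'd1': 0.154}
```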

  • Slide 31/40

    Preliminaries: Factors
    Probabilistic Graphical Models: Introduction

  • Slide 32/40

    Factors

    A factor φ(X1, ..., Xk): Val(X1, ..., Xk) → R
    Scope = {X1, ..., Xk}
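One way to make the definition concrete is a small table-backed class; the Factor class below is an editor's illustrative sketch, not an API from the course:

```python
class Factor:
    """A factor phi with scope (X1, ..., Xk): a table mapping every joint
    assignment in Val(X1, ..., Xk) to a real number (not necessarily in [0, 1])."""

    def __init__(self, scope, table):
        self.scope = tuple(scope)   # variable names, e.g. ("A", "B")
        self.table = dict(table)    # assignment tuple -> real value

    def __call__(self, *assignment):
        return self.table[assignment]

# The general factor phi(A, B) that appears a few slides later:
phi = Factor(("A", "B"),
             {("a0", "b0"): 30, ("a0", "b1"): 5,
              ("a1", "b0"): 1,  ("a1", "b1"): 10})

print(phi("a1", "b1"))   # 10
```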

  • Slide 33/40

    Joint Distribution
    P(I, D, G)

    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d0  g2  0.168
    i0  d0  g3  0.126
    i0  d1  g1  0.009
    i0  d1  g2  0.045
    i0  d1  g3  0.126
    i1  d0  g1  0.252
    i1  d0  g2  0.0224
    i1  d0  g3  0.0056
    i1  d1  g1  0.06
    i1  d1  g2  0.036
    i1  d1  g3  0.024

  • Slide 34/40

    Unnormalized Measure
    P(I, D, g1)

    I   D   G   Prob.
    i0  d0  g1  0.126
    i0  d1  g1  0.009
    i1  d0  g1  0.252
    i1  d1  g1  0.06

  • Slide 35/40

    Conditional Probability Distribution (CPD)

    P(G | I, D):
             g1     g2     g3
    i0, d0   0.3    0.4    0.3
    i0, d1   0.05   0.25   0.7
    i1, d0   0.9    0.08   0.02
    i1, d1   0.5    0.3    0.2
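An illustrative check of the defining property of a CPD: each row (one assignment of I and D) sums to 1 over the values of G. The dictionary layout and names below are assumptions for the sketch, not course code:

```python
# P(G | I, D) as tabulated above: one distribution over G per assignment of (I, D).
cpd = {
    ("i0", "d0"): {"g1": 0.3,  "g2": 0.4,  "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9,  "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5,  "g2": 0.3,  "g3": 0.2},
}

# A CPD is a factor whose entries sum to 1 for every assignment of the conditioning variables.
for parents, row in cpd.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9, parents
```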

  • Slide 36/40

    General Factors

    A   B   φ(A, B)
    a0  b0  30
    a0  b1  5
    a1  b0  1
    a1  b1  10

  • Slide 37/40

    Factor Product

    φ1(A, B):
    A   B   Value
    a1  b1  0.5
    a1  b2  0.8
    a2  b1  0.1
    a2  b2  0
    a3  b1  0.3
    a3  b2  0.9

    φ2(B, C):
    B   C   Value
    b1  c1  0.5
    b1  c2  0.7
    b2  c1  0.1
    b2  c2  0.2

    φ1 × φ2 (A, B, C):
    A   B   C   Value
    a1  b1  c1  0.5 × 0.5 = 0.25
    a1  b1  c2  0.5 × 0.7 = 0.35
    a1  b2  c1  0.8 × 0.1 = 0.08
    a1  b2  c2  0.8 × 0.2 = 0.16
    a2  b1  c1  0.1 × 0.5 = 0.05
    a2  b1  c2  0.1 × 0.7 = 0.07
    a2  b2  c1  0 × 0.1 = 0
    a2  b2  c2  0 × 0.2 = 0
    a3  b1  c1  0.3 × 0.5 = 0.15
    a3  b1  c2  0.3 × 0.7 = 0.21
    a3  b2  c1  0.9 × 0.1 = 0.09
    a3  b2  c2  0.9 × 0.2 = 0.18
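The same factor product, sketched in Python over the tables above (the names phi1, phi2, and phi_product are illustrative):

```python
# Factor product: combine phi1(A, B) and phi2(B, C) into phi(A, B, C) by
# multiplying entries that agree on the shared variable B.
phi1 = {("a1", "b1"): 0.5, ("a1", "b2"): 0.8, ("a2", "b1"): 0.1,
        ("a2", "b2"): 0.0, ("a3", "b1"): 0.3, ("a3", "b2"): 0.9}
phi2 = {("b1", "c1"): 0.5, ("b1", "c2"): 0.7, ("b2", "c1"): 0.1, ("b2", "c2"): 0.2}

phi_product = {}
for (a, b) in phi1:
    for (b2, c) in phi2:
        if b == b2:
            phi_product[(a, b, c)] = phi1[(a, b)] * phi2[(b2, c)]

print(phi_product[("a1", "b1", "c1")])   # 0.5 * 0.5 = 0.25
print(len(phi_product))                  # 12 entries, as in the table above
```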

  • Slide 38/40

    Factor Marginalization

    φ(A, B, C):
    A   B   C   Value
    a1  b1  c1  0.25
    a1  b1  c2  0.35
    a1  b2  c1  0.08
    a1  b2  c2  0.16
    a2  b1  c1  0.05
    a2  b1  c2  0.07
    a2  b2  c1  0
    a2  b2  c2  0
    a3  b1  c1  0.15
    a3  b1  c2  0.21
    a3  b2  c1  0.09
    a3  b2  c2  0.18

    Sum out B:
    A   C   Value
    a1  c1  0.33
    a1  c2  0.51
    a2  c1  0.05
    a2  c2  0.07
    a3  c1  0.24
    a3  c2  0.39
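Summing out B from the product factor, as an illustrative Python sketch (names assumed):

```python
from collections import defaultdict

# phi(A, B, C): the product factor from the previous slide.
phi = {("a1", "b1", "c1"): 0.25, ("a1", "b1", "c2"): 0.35,
       ("a1", "b2", "c1"): 0.08, ("a1", "b2", "c2"): 0.16,
       ("a2", "b1", "c1"): 0.05, ("a2", "b1", "c2"): 0.07,
       ("a2", "b2", "c1"): 0.0,  ("a2", "b2", "c2"): 0.0,
       ("a3", "b1", "c1"): 0.15, ("a3", "b1", "c2"): 0.21,
       ("a3", "b2", "c1"): 0.09, ("a3", "b2", "c2"): 0.18}

# Marginalize B: add up the entries that agree on (A, C).
tau = defaultdict(float)
for (a, b, c), v in phi.items():
    tau[(a, c)] += v

print(round(tau[("a1", "c1")], 2))   # 0.25 + 0.08 = 0.33
```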

  • Slide 39/40

    Factor Reduction

    φ(A, B, C):
    A   B   C   Value
    a1  b1  c1  0.25
    a1  b1  c2  0.35
    a1  b2  c1  0.08
    a1  b2  c2  0.16
    a2  b1  c1  0.05
    a2  b1  c2  0.07
    a2  b2  c1  0
    a2  b2  c2  0
    a3  b1  c1  0.15
    a3  b1  c2  0.21
    a3  b2  c1  0.09
    a3  b2  c2  0.18

    Reduce to C = c1:
    A   B   C   Value
    a1  b1  c1  0.25
    a1  b2  c1  0.08
    a2  b1  c1  0.05
    a2  b2  c1  0
    a3  b1  c1  0.15
    a3  b2  c1  0.09
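Reducing the same factor to the context C = c1, as an illustrative Python sketch (names assumed):

```python
# Factor reduction: restrict phi(A, B, C) to C = c1 by keeping only the matching rows.
phi = {("a1", "b1", "c1"): 0.25, ("a1", "b1", "c2"): 0.35,
       ("a1", "b2", "c1"): 0.08, ("a1", "b2", "c2"): 0.16,
       ("a2", "b1", "c1"): 0.05, ("a2", "b1", "c2"): 0.07,
       ("a2", "b2", "c1"): 0.0,  ("a2", "b2", "c2"): 0.0,
       ("a3", "b1", "c1"): 0.15, ("a3", "b1", "c2"): 0.21,
       ("a3", "b2", "c1"): 0.09, ("a3", "b2", "c2"): 0.18}

reduced = {(a, b): v for (a, b, c), v in phi.items() if c == "c1"}

print(reduced[("a3", "b1")])   # 0.15
```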

  • Slide 40/40

    Why factors?

    Fundamental building block for defining distributions in high-dimensional spaces
    Set of basic operations for manipulating these probability distributions