Post on 20-Feb-2020

CS839: Probabilistic Graphical Models

Lecture 12: Variational Inference and Mean Field

Theo Rekatsinas

Summary

• Variational inference (approximate inference):
  • Loopy BP (Bethe free energy)
  • Mean-field approximation

• What is common in the two?

• Loopy BP: outer approximation of the marginal polytope
• Mean-field: inner approximation of the marginal polytope

Variational Methods

• Variational means: optimization-based formulation
  • Represent a quantity of interest as the solution to an optimization problem
  • Approximate the desired solution by relaxing/approximating the intractable optimization problem

• Example:
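One standard illustration of a variational formulation (an assumption here, not necessarily the example on the original slide) is solving a linear system A x = b with A symmetric positive definite by minimizing f(x) = ½ xᵀA x − bᵀx, whose unique minimizer is x* = A⁻¹b. The sketch below uses plain gradient descent; the matrix, right-hand side, step size, and function name are all hypothetical.

```python
# Variational view of a linear system: instead of inverting A, minimize
# f(x) = 0.5 * x^T A x - b^T x, whose gradient is A x - b.

def solve_by_minimization(A, b, step=0.1, iters=2000):
    """Plain gradient descent on f; assumes A is symmetric positive definite."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        x = [x[i] - step * grad[i] for i in range(n)]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]  # hypothetical SPD matrix
b = [1.0, 2.0]                # hypothetical right-hand side
x_opt = solve_by_minimization(A, b)
print(x_opt)  # close to the exact solution [1/11, 7/11]
```

For large systems the same idea motivates iterative solvers: the optimization problem can be attacked approximately even when the direct solution is too expensive.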

Inference Problems in Graphical Models

• Consider an undirected graphical model (MRF)

• The quantities of interest (for inference)

  • Marginal distributions

  • Normalization constant Z
• How to represent these quantities in a variational form?
  • Exponential families and convex analysis

Exponential Families

• Canonical parameterization

• Log normalization constant

  • This is a convex function
• Space of canonical parameters
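In standard notation (assumed here, since the slide's formulas are not in the transcript), the three objects named above can be written as follows, with φ the vector of sufficient statistics, ν a base measure, and Ω the set of valid canonical parameters:

```latex
% Canonical parameterization
p_\theta(x) = \exp\bigl\{ \langle \theta, \phi(x) \rangle - A(\theta) \bigr\}

% Log normalization constant (log partition function), convex in \theta
A(\theta) = \log \int \exp\bigl\{ \langle \theta, \phi(x) \rangle \bigr\} \, \nu(dx)

% Space of canonical parameters
\Omega = \bigl\{ \theta \in \mathbb{R}^d : A(\theta) < +\infty \bigr\}
```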

Graphical Models as Exponential Families

• Undirected graphical model (MRF)

• MRF in exponential form:

Example: Gaussian MRF

• Zero-mean multivariate Gaussian distribution that respects the Markov property of a graph

• Gaussian MRF in exponential form

Example: Discrete MRF

Why Exponential Families

• Computing the expectation of sufficient statistics (mean parameters) given the canonical parameters yields the marginals

• Computing the normalizer yields the log partition function (or log likelihood function)

Computing Mean Parameter: Bernoulli

• A single Bernoulli random variable

• Inference = computing the mean parameter

• In a variational manner: cast the procedure of computing the mean as an optimization-based formulation

Conjugate Dual Function

• Given any function f(θ), its conjugate dual function is

• The conjugate dual is always a convex function: point-wise supremum of a class of linear functions

Dual of the Dual is the Original

• Under some technical conditions on f (convex and lower semi-continuous), the dual of the dual is f itself.

• For the log partition function
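The conjugate dual and the dual-of-the-dual identity referred to above take the following standard form (a reconstruction in assumed notation, with μ the dual variable):

```latex
% Conjugate dual of f
f^*(\mu) = \sup_{\theta} \bigl\{ \langle \theta, \mu \rangle - f(\theta) \bigr\}

% For convex, lower semi-continuous f, the dual of the dual recovers f
f(\theta) = \sup_{\mu} \bigl\{ \langle \theta, \mu \rangle - f^*(\mu) \bigr\}

% Applied to the log partition function
A(\theta) = \sup_{\mu} \bigl\{ \langle \theta, \mu \rangle - A^*(\mu) \bigr\}
```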

Computing Mean Parameter: Bernoulli

• The conjugate

• Stationary condition

• If

• If

• We have:
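The steps listed above can be filled in with the standard Bernoulli calculation. This is a sketch under the assumption that the slide uses p(x; θ) = exp{θx − A(θ)} with x ∈ {0, 1}; the two "If" cases are taken to be μ inside and outside the unit interval.

```latex
% Log partition function of a single Bernoulli variable
A(\theta) = \log\bigl(1 + e^{\theta}\bigr)

% The conjugate
A^*(\mu) = \sup_{\theta \in \mathbb{R}} \bigl\{ \mu\theta - \log(1 + e^{\theta}) \bigr\}

% Stationary condition
\mu = \frac{e^{\theta}}{1 + e^{\theta}}

% If \mu \in (0, 1): the stationary point exists
\theta(\mu) = \log\frac{\mu}{1 - \mu}, \qquad
A^*(\mu) = \mu \log \mu + (1 - \mu) \log(1 - \mu)

% If \mu \notin [0, 1]: the supremum is unbounded
A^*(\mu) = +\infty
```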

Computing Mean Parameter: Bernoulli

• The conjugate

• Stationary condition

• We have:

• The variational form:

• The optimum is achieved at
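The Bernoulli variational form can be checked numerically. The sketch below assumes the standard identities A(θ) = log(1 + e^θ) and A*(μ) = μ log μ + (1 − μ) log(1 − μ), and maximizes μθ − A*(μ) over a grid on [0, 1]; the function names and the value of θ are hypothetical.

```python
import math

def log_partition(theta):
    """A(theta) = log(1 + exp(theta)) for a single Bernoulli variable."""
    return math.log1p(math.exp(theta))

def neg_entropy(mu):
    """A*(mu) on [0, 1], using the convention 0 * log(0) = 0."""
    return sum(p * math.log(p) for p in (mu, 1.0 - mu) if p > 0.0)

def variational_value(theta, grid=100001):
    """Grid search for max over mu in [0, 1] of mu * theta - A*(mu)."""
    best_val, best_mu = -math.inf, None
    for k in range(grid):
        mu = k / (grid - 1)
        val = mu * theta - neg_entropy(mu)
        if val > best_val:
            best_val, best_mu = val, mu
    return best_val, best_mu

theta = 0.7  # hypothetical canonical parameter
best_val, best_mu = variational_value(theta)
sigmoid = 1.0 / (1.0 + math.exp(-theta))
print(best_val, log_partition(theta))  # the two values should agree closely
print(best_mu, sigmoid)                # the maximizer is the mean parameter
```

The optimal value recovers the log partition function and the maximizer recovers the mean parameter, which is the pattern the later slides generalize.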

Remarks

• The last few identities rely on a deep theory for general exponential families:
  • The dual function is the negative entropy function
  • The mean parameter is restricted
  • Solving the optimization returns the mean parameter and the log partition function

• Extend this to general exponential families/graphical models.

• However,
  • Computing the conjugate dual (the entropy) is in general intractable
  • The constraint set of mean parameters is hard to characterize
  • We need to approximate

Compute the Conjugate Dual

• Given an exponential family

• The dual function
• Stationary condition

• Derivatives of A yield the mean parameters
• The stationary condition becomes
• For which μ do we have a solution θ(μ)?

Compute the Conjugate Dual

• Let’s assume there is a solution θ(μ) such that

• The dual has the form

• The entropy is defined as

• So the dual is the negative entropy of the distribution with parameter θ(μ), when there is a solution θ(μ)
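Written out in standard notation (assumed here; ν denotes the base measure), the derivation on the last two slides runs as follows:

```latex
% The dual function
A^*(\mu) = \sup_{\theta \in \Omega} \bigl\{ \langle \mu, \theta \rangle - A(\theta) \bigr\}

% Stationary condition
\mu - \nabla A(\theta) = 0

% Derivatives of A yield the mean parameters
\nabla A(\theta) = \mathbb{E}_{\theta}[\phi(X)]
\quad\Longrightarrow\quad
\mu = \mathbb{E}_{\theta}[\phi(X)]

% With a solution \theta(\mu) of the stationary condition
A^*(\mu) = \langle \theta(\mu), \mu \rangle - A(\theta(\mu))
         = \mathbb{E}_{\theta(\mu)}\bigl[ \log p_{\theta(\mu)}(X) \bigr]

% Entropy
H(p) = - \int p(x) \log p(x) \, \nu(dx)

% Hence
A^*(\mu) = - H\bigl(p_{\theta(\mu)}\bigr)
```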

Complexity of Computing Conjugate Dual

• The dual function is implicitly defined:

  • Solving the inverse mapping is non-trivial
  • Evaluating the negative entropy requires high-dimensional integration (summation)
  • For which μ does it have a solution θ(μ)? What is the domain of A*(μ)?

Marginal Polytope

• For any distribution p(x) and a set of sufficient statistics φ(x), define a vector of mean parameters

• p(x) is not necessarily an exponential family

• The set of all realizable mean parameters is a convex set

• For discrete exponential families this is called the marginal polytope.

Convex Polytope

• Convex hull representation

• Half-plane representation
  • Minkowski-Weyl Theorem: any non-empty convex polytope can be characterized by a finite collection of linear inequality constraints
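The set of realizable mean parameters and its two representations can be written as follows. This is a reconstruction in standard notation; the symbols M, a_j, b_j, and the index set J are assumptions.

```latex
% Mean parameters of an arbitrary distribution p
\mu = \mathbb{E}_p[\phi(X)]

% Set of all realizable mean parameters
\mathcal{M} = \bigl\{ \mu \in \mathbb{R}^d : \exists\, p \text{ such that } \mathbb{E}_p[\phi(X)] = \mu \bigr\}

% Convex hull representation (discrete case)
\mathcal{M} = \operatorname{conv}\bigl\{ \phi(x) : x \in \mathcal{X}^m \bigr\}

% Half-plane representation with finitely many constraints
\mathcal{M} = \bigl\{ \mu \in \mathbb{R}^d : a_j^\top \mu \ge b_j, \; j \in \mathcal{J} \bigr\},
\qquad |\mathcal{J}| < \infty
```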

Example: Two-node Ising Model

• Sufficient statistics

• Mean parameters

• Two-node Ising model
  • Convex hull representation

  • Half-plane representation
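For the two-node example, both representations are small enough to check by hand. The sketch below assumes binary variables in {0, 1} with sufficient statistics (x1, x2, x1·x2), so the mean parameters are (μ1, μ2, μ12), and assumes the usual four half-plane constraints; the helper names are hypothetical.

```python
import itertools
import random

STATES = list(itertools.product((0, 1), repeat=2))

def sufficient_stats(x):
    """phi(x) = (x1, x2, x1 * x2) for the two-node Ising model."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

def mean_parameters(p):
    """Expectation of phi under a distribution p given as {state: probability}."""
    return tuple(sum(p[x] * sufficient_stats(x)[k] for x in STATES) for k in range(3))

def in_marginal_polytope(mu, tol=1e-12):
    """Half-plane test: mu12 >= 0, mu1 >= mu12, mu2 >= mu12, 1 + mu12 - mu1 - mu2 >= 0."""
    mu1, mu2, mu12 = mu
    return (mu12 >= -tol and mu1 - mu12 >= -tol
            and mu2 - mu12 >= -tol and 1 + mu12 - mu1 - mu2 >= -tol)

rng = random.Random(0)

def random_distribution():
    """A random distribution over the four joint states."""
    w = [rng.random() for _ in STATES]
    z = sum(w)
    return {x: wi / z for x, wi in zip(STATES, w)}

# Vertices of the convex hull are the sufficient statistics of the four states.
vertices = [sufficient_stats(x) for x in STATES]

# Every realizable mean vector should satisfy the half-plane constraints.
all_inside = all(in_marginal_polytope(mean_parameters(random_distribution()))
                 for _ in range(1000))
print(vertices, all_inside)
```

A vector such as (0.5, 0.5, 0.6) fails the test because μ12 cannot exceed μ1: no joint distribution produces it.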

Marginal Polytope for General Graphs

• Still doable for connected binary graphs with 3 nodes: 16 constraints

• For tree graphical models, the number of half-planes grows only linearly in the graph size

• General graphs?
  • Extremely hard to characterize the marginal polytope.

Variational Principle

• The dual function takes the form

• The log partition function has the variational form

• For all θ, the optimum of the above problem is attained uniquely at the μ(θ) that satisfies
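In standard notation (assumed here: M is the set of realizable mean parameters, M° its interior, and the bar denotes closure), the three statements above read:

```latex
% The dual function
A^*(\mu) =
\begin{cases}
  - H\bigl(p_{\theta(\mu)}\bigr) & \text{if } \mu \in \mathcal{M}^{\circ} \\
  + \infty & \text{if } \mu \notin \overline{\mathcal{M}}
\end{cases}

% Variational form of the log partition function
A(\theta) = \sup_{\mu \in \mathcal{M}} \bigl\{ \langle \theta, \mu \rangle - A^*(\mu) \bigr\}

% The supremum is attained at the mean parameters of p_\theta
\mu(\theta) = \mathbb{E}_{\theta}[\phi(X)]
```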

Example: Two-node Ising Model

• The distribution
• Sufficient statistics

• The marginal polytope is characterized by

• The dual has an explicit form

• The variational problem is
• The optimum is attained at

Variational Principle

• Exact variational formulation

• Mean field method: non-convex inner bound and exact form of entropy

• Bethe approximation and loopy BP: polyhedral outer bound and non-convex Bethe approximation

Belief Propagation Algorithm

• Message passing rule:

• Marginals

• Exact for trees but approximate for loopy graphs
• How does this relate to the variational principle? For trees/generic graphs?
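As a minimal sketch of the message passing rule and the marginal computation, the code below runs sum-product with parallel updates on a small tree and compares the result against brute-force enumeration. The tree, the potentials, and all names are hypothetical; the update is the standard sum-product rule with messages normalized for numerical convenience.

```python
import itertools

# Hypothetical 4-node tree over binary variables.
N = 4
EDGES = [(0, 1), (1, 2), (1, 3)]
node_pot = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0], [1.0, 0.5]]

def edge_pot(xs, xt):
    """Symmetric pairwise potential that favors agreement."""
    return 2.0 if xs == xt else 1.0

neighbors = {s: [] for s in range(N)}
for s, t in EDGES:
    neighbors[s].append(t)
    neighbors[t].append(s)

def run_bp(iters=10):
    """Parallel sum-product; msgs[(t, s)] is the message from t to s."""
    msgs = {(t, s): [1.0, 1.0] for t in range(N) for s in neighbors[t]}
    for _ in range(iters):
        new = {}
        for (t, s) in msgs:
            out = []
            for xs in (0, 1):
                total = 0.0
                for xt in (0, 1):
                    prod = node_pot[t][xt] * edge_pot(xs, xt)
                    for u in neighbors[t]:
                        if u != s:
                            prod *= msgs[(u, t)][xt]
                    total += prod
                out.append(total)
            z = sum(out)
            new[(t, s)] = [v / z for v in out]
        msgs = new
    marginals = []
    for s in range(N):
        belief = list(node_pot[s])
        for t in neighbors[s]:
            belief = [belief[x] * msgs[(t, s)][x] for x in (0, 1)]
        z = sum(belief)
        marginals.append([v / z for v in belief])
    return marginals

def brute_force_marginals():
    """Exact single-node marginals by enumerating all 2^N joint states."""
    weights = {}
    for x in itertools.product((0, 1), repeat=N):
        w = 1.0
        for s in range(N):
            w *= node_pot[s][x[s]]
        for s, t in EDGES:
            w *= edge_pot(x[s], x[t])
        weights[x] = w
    z = sum(weights.values())
    return [[sum(w for x, w in weights.items() if x[s] == v) / z for v in (0, 1)]
            for s in range(N)]

bp_marginals = run_bp()
exact_marginals = brute_force_marginals()
print(bp_marginals)
print(exact_marginals)
```

On a tree the two agree once messages have propagated across the diameter of the graph. Adding an edge that closes a cycle keeps the updates well defined, but the beliefs are then only approximations.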

Tree Graphical Models

• Discrete variables on a tree T = (V, E)

• Sufficient statistics

• Exponential representation of distribution?
• Mean parameters are marginal probabilities:

Marginal Polytope for Trees

• Marginal polytope for general graphs

• By the junction tree theorem we have:

• If then

Decomposition of Entropy for Trees

• For trees, the entropy decomposes as (this is also our dual!):
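The decomposition referred to above is, in standard notation (assumed here, with μ_s and μ_st the node and edge marginals on the tree T = (V, E)):

```latex
% Entropy of a tree-structured distribution with marginals \mu
H(p_\mu) = \sum_{s \in V} H_s(\mu_s) - \sum_{(s,t) \in E} I_{st}(\mu_{st})

% Single-node entropy
H_s(\mu_s) = - \sum_{x_s} \mu_s(x_s) \log \mu_s(x_s)

% Mutual information on an edge
I_{st}(\mu_{st}) = \sum_{x_s, x_t} \mu_{st}(x_s, x_t)
  \log \frac{\mu_{st}(x_s, x_t)}{\mu_s(x_s)\, \mu_t(x_t)}

% The dual is the negative of this entropy
A^*(\mu) = - H(p_\mu)
```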

Exact Variational Principle for Trees

• Variational formulation

• Assign a Lagrange multiplier for the normalization constraint and each marginalization constraint

Lagrangian Derivation

• Taking the derivatives of the Lagrangian with respect to μ_s and μ_st

• Setting them to zero yields

BP on Arbitrary Graphs

• Two main difficulties of the variational formulation

• The marginal polytope is hard to characterize, so let’s use the tree-based outer bound

• Exact entropy lacks explicit form, so let’s approximate it using the exact expression for trees

Bethe Variational Problem

• Combining the two gives us the Bethe variational problem

• What is happening?
  • Tree-based outer bound
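Combining the tree-based outer bound with the tree entropy expression gives the following problem. This is a reconstruction in standard notation; τ denotes pseudo-marginals and L(G) the set of locally consistent pseudo-marginals, both assumed symbols.

```latex
% Tree-based (locally consistent) outer bound on the marginal polytope
\mathbb{L}(G) = \Bigl\{ \tau \ge 0 \;:\;
  \sum_{x_s} \tau_s(x_s) = 1, \;\;
  \sum_{x_t} \tau_{st}(x_s, x_t) = \tau_s(x_s), \;\;
  \sum_{x_s} \tau_{st}(x_s, x_t) = \tau_t(x_t) \Bigr\}

% Bethe variational problem
\max_{\tau \in \mathbb{L}(G)} \Bigl\{ \langle \theta, \tau \rangle
  + \sum_{s \in V} H_s(\tau_s)
  - \sum_{(s,t) \in E} I_{st}(\tau_{st}) \Bigr\}
```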

Mean Field Approximation

Tractable Subgraphs

• For an exponential family with sufficient statistics φ defined on graph G, the set of realizable mean parameters is

• Idea: restrict p to a subset of distributions associated with a tractable subgraph

Mean Field Methods

• For a given tractable subgraph F, a subset of canonical parameters is

• Inner approximation

• Mean field solves the relaxed problem

• is the exact dual function restricted to

Example: Naïve Mean Field for Ising Model

• Ising model in {0,1} representation

• Mean parameters

• For fully disconnected graph F

• The dual decomposes into a sum, with one term for each node

Example: Naïve Mean Field for Ising Model

• Mean field problem

• The same objective function as in the free-energy-based approach

• The naïve mean field update equations

• Lower bound on log partition function
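The naïve mean field updates and the lower bound can be sketched on a tiny model. The code assumes the {0,1} Ising parameterization p(x) ∝ exp{Σ θ_s x_s + Σ θ_st x_s x_t}, the standard coordinate update μ_s ← σ(θ_s + Σ_t θ_st μ_t), and the objective Σ θ_s μ_s + Σ θ_st μ_s μ_t + Σ H(μ_s); the parameter values and names are hypothetical.

```python
import itertools
import math

# Hypothetical 3-node Ising model on a cycle, in {0, 1} representation.
THETA_NODE = [0.5, -0.3, 0.2]
THETA_EDGE = {(0, 1): 1.0, (1, 2): -0.8, (0, 2): 0.6}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_entropy(mu):
    """Entropy of a Bernoulli(mu) variable, with 0 * log(0) = 0."""
    return -sum(p * math.log(p) for p in (mu, 1.0 - mu) if p > 0.0)

def mean_field_objective(mu):
    """<theta, mu> plus the entropy of the fully factorized distribution."""
    val = sum(th * m for th, m in zip(THETA_NODE, mu))
    val += sum(w * mu[s] * mu[t] for (s, t), w in THETA_EDGE.items())
    val += sum(binary_entropy(m) for m in mu)
    return val

def naive_mean_field(sweeps=200):
    """Coordinate ascent: each update maximizes the objective in one mu_s."""
    mu = [0.5] * len(THETA_NODE)
    for _ in range(sweeps):
        for s in range(len(mu)):
            field = THETA_NODE[s]
            for (a, b), w in THETA_EDGE.items():
                if a == s:
                    field += w * mu[b]
                elif b == s:
                    field += w * mu[a]
            mu[s] = sigmoid(field)
    return mu

def exact_log_partition():
    """Brute-force A(theta) by enumerating all joint states."""
    total = 0.0
    for x in itertools.product((0, 1), repeat=len(THETA_NODE)):
        energy = sum(th * xi for th, xi in zip(THETA_NODE, x))
        energy += sum(w * x[s] * x[t] for (s, t), w in THETA_EDGE.items())
        total += math.exp(energy)
    return math.log(total)

mu_mf = naive_mean_field()
lower_bound = mean_field_objective(mu_mf)
log_z = exact_log_partition()
print(mu_mf, lower_bound, log_z)
```

The objective is concave in each single μ_s but not jointly, so the result can depend on the initialization. Whatever fixed point is reached, its objective value stays below the exact log partition function.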

Geometry of Mean Field

• Mean field optimization is always non-convex for any exponential family in which the state space is finite

• Marginal polytope is a convex hull

• contains all the extreme points (if it is a strict subset then it must be non-convex)
• Example: two-node Ising

• Parabolic cross section along τ1 = τ2

Summary

• Variational methods in general turn inference into an optimization problem via exponential families and convex duality

• The exact variational principle is intractable to solve; two approximations:
  • Either inner or outer bound to the marginal polytope
  • Various approximations to the entropy function

• Mean-field: non-convex inner bound and exact form of entropy
• BP: polyhedral outer bound and non-convex Bethe approximation
