CSE 473: Artificial Intelligence
Bayes’ Nets
Luke Zettlemoyer, University of Washington
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]



Probabilistic Models

§  Models describe how (a portion of) the world works
§  Models are always simplifications
    §  May not account for every variable
    §  May not account for all interactions between variables
    §  “All models are wrong; but some are useful.” (George E. P. Box)
§  What do we do with probabilistic models?
    §  We (or our agents) need to reason about unknown variables, given evidence
    §  Example: explanation (diagnostic reasoning)
    §  Example: prediction (causal reasoning)
    §  Example: value of information

Independence

§  Two variables are independent if:
    $\forall x, y : P(x, y) = P(x) P(y)$
§  This says that their joint distribution factors into a product of two simpler distributions
§  Another form:
    $\forall x, y : P(x \mid y) = P(x)$
§  We write: $X \perp\!\!\!\perp Y$
§  Independence is a simplifying modeling assumption
    §  Empirical joint distributions: at best “close” to independent
    §  What could we assume for {Weather, Traffic, Cavity, Toothache}?

Example: Independence?

T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

T     W     P
hot   sun   0.3
hot   rain  0.2
cold  sun   0.3
cold  rain  0.2

T     P
hot   0.5
cold  0.5

W     P
sun   0.6
rain  0.4
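The question on this slide can be checked mechanically: compute both marginals from a joint table and compare each entry against their product. The following is a minimal sketch; the function name `is_independent`, the labels `first` and `second`, and the tolerance are illustrative choices, not part of the slides.

```python
from itertools import product


def is_independent(joint, tol=1e-9):
    """Return True if joint[(t, w)] equals P(t) * P(w) for every pair (t, w)."""
    t_values = sorted({t for t, _ in joint})
    w_values = sorted({w for _, w in joint})
    # Marginals are obtained by summing out the other variable.
    p_t = {t: sum(joint[(t, w)] for w in w_values) for t in t_values}
    p_w = {w: sum(joint[(t, w)] for t in t_values) for w in w_values}
    return all(
        abs(joint[(t, w)] - p_t[t] * p_w[w]) <= tol
        for t, w in product(t_values, w_values)
    )


# The two joint tables over (T, W) from the slide.
first = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}
second = {("hot", "sun"): 0.3, ("hot", "rain"): 0.2,
          ("cold", "sun"): 0.3, ("cold", "rain"): 0.2}

first_independent = is_independent(first)
second_independent = is_independent(second)
```

Both tables have the same marginals (hot 0.5, sun 0.6), but only the second equals their product, so only the second describes independent variables.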

Example: Independence

§  N fair, independent coin flips:

H  0.5      H  0.5      H  0.5
T  0.5      T  0.5      T  0.5

Conditional Independence

§  P(Toothache, Cavity, Catch)
§  If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
    §  P(+catch | +toothache, +cavity) = P(+catch | +cavity)
§  The same independence holds if I don’t have a cavity:
    §  P(+catch | +toothache, -cavity) = P(+catch | -cavity)
§  Catch is conditionally independent of Toothache given Cavity:
    §  P(Catch | Toothache, Cavity) = P(Catch | Cavity)
§  Equivalent statements:
    §  P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
    §  P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
    §  One can be derived from the other easily

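Conditional Independence

§  Unconditional (absolute) independence is very rare (why?)
§  Conditional independence is our most basic and robust form of knowledge about uncertain environments.
§  X is conditionally independent of Y given Z, written $X \perp\!\!\!\perp Y \mid Z$,
    if and only if:
    $\forall x, y, z : P(x, y \mid z) = P(x \mid z) P(y \mid z)$
    or, equivalently, if and only if
    $\forall x, y, z : P(x \mid z, y) = P(x \mid z)$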

Conditional Independence

§  What about this domain:
    §  Traffic
    §  Umbrella
    §  Raining

Conditional Independence

§  What about this domain:
    §  Fire
    §  Smoke
    §  Alarm

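Conditional Independence and the Chain Rule

§  Chain rule:
    $P(X_1, X_2, \ldots, X_n) = P(X_1) P(X_2 \mid X_1) P(X_3 \mid X_1, X_2) \cdots$
§  Trivial decomposition:
    P(Traffic, Rain, Umbrella) = P(Rain) P(Traffic | Rain) P(Umbrella | Rain, Traffic)
§  With assumption of conditional independence:
    P(Traffic, Rain, Umbrella) = P(Rain) P(Traffic | Rain) P(Umbrella | Rain)
§  Bayes’ nets / graphical models help us express conditional independence assumptions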

Ghostbusters Chain Rule

§  Each sensor depends only on where the ghost is
§  That means, the two sensors are conditionally independent, given the ghost position
§  T: Top square is red
    B: Bottom square is red
    G: Ghost is in the top
§  Givens:
    P(+g) = 0.5         P(-g) = 0.5
    P(+t | +g) = 0.8    P(+t | -g) = 0.4
    P(+b | +g) = 0.4    P(+b | -g) = 0.8

P(T, B, G) = P(G) P(T | G) P(B | G)

T    B    G    P(T,B,G)
+t   +b   +g   0.16
+t   +b   -g   0.16
+t   -b   +g   0.24
+t   -b   -g   0.04
-t   +b   +g   0.04
-t   +b   -g   0.24
-t   -b   +g   0.06
-t   -b   -g   0.06
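The joint table above follows directly from the givens and the factorization P(T, B, G) = P(G) P(T | G) P(B | G). The sketch below rebuilds it; the dictionary and function names are illustrative, and the last lines additionally check that T and B, although conditionally independent given G, are not independent once G is summed out.

```python
from itertools import product

# Givens from the slide; the complementary entries follow by subtraction from 1.
P_G = {"+g": 0.5, "-g": 0.5}
P_TOP_RED = {"+g": 0.8, "-g": 0.4}      # P(+t | g)
P_BOTTOM_RED = {"+g": 0.4, "-g": 0.8}   # P(+b | g)


def joint(t, b, g):
    """P(t, b, g) = P(g) * P(t | g) * P(b | g)."""
    p_t = P_TOP_RED[g] if t == "+t" else 1.0 - P_TOP_RED[g]
    p_b = P_BOTTOM_RED[g] if b == "+b" else 1.0 - P_BOTTOM_RED[g]
    return P_G[g] * p_t * p_b


table = {
    (t, b, g): joint(t, b, g)
    for t, b, g in product(("+t", "-t"), ("+b", "-b"), ("+g", "-g"))
}

# Marginalize out G to compare P(+t, +b) with P(+t) * P(+b).
p_t_and_b = sum(table[("+t", "+b", g)] for g in P_G)
p_t = sum(p for (t, _, _), p in table.items() if t == "+t")
p_b = sum(p for (_, b, _), p in table.items() if b == "+b")
marginally_independent = abs(p_t_and_b - p_t * p_b) < 1e-9
```

With these numbers P(+t, +b) is 0.32 while P(+t) P(+b) is 0.36, so the sensors are only independent when the ghost position is given.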

Bayes’ Nets: Big Picture

§  Two problems with using full joint distribution tables as our probabilistic models:
    §  Unless there are only a few variables, the joint is WAY too big to represent explicitly
    §  Hard to learn (estimate) anything empirically about more than a few variables at a time
§  Bayes’ nets: a technique for describing complex joint distributions (models) using simple, local distributions (conditional probabilities)
    §  More properly called graphical models
    §  We describe how variables locally interact
    §  Local interactions chain together to give global, indirect interactions
    §  For about 10 min, we’ll be vague about how these interactions are specified

Example Bayes’ Net: Insurance

Example Bayes’ Net: Car

Graphical Model Notation

§  Nodes: variables (with domains)
    §  Can be assigned (observed) or unassigned (unobserved)
§  Arcs: interactions
    §  Indicate “direct influence” between variables
    §  Formally: encode conditional independence (more later)
§  For now: imagine that arrows mean direct causation (in general, they don’t!)

Example: Coin Flips

§  N independent coin flips
§  No interactions between variables: absolute independence

[Figure: nodes X1, X2, ..., Xn with no arcs]

Example: Traffic

§  Variables:
    §  R: It rains
    §  T: There is traffic
§  Model 1: independence (nodes R and T, no arc)
§  Model 2: rain causes traffic (arc R → T)
§  Why is an agent using model 2 better?

Example: Traffic II

§  Let’s build a causal graphical model!
§  Variables
    §  T: Traffic
    §  R: It rains
    §  L: Low pressure
    §  D: Roof drips
    §  B: Ballgame
    §  C: Cavity

Example: Alarm Network

§  Variables
    §  B: Burglary
    §  A: Alarm goes off
    §  M: Mary calls
    §  J: John calls
    §  E: Earthquake!

Bayes’ Net Semantics

§  A set of nodes, one per variable X
§  A directed, acyclic graph
§  A conditional distribution for each node
    §  A collection of distributions over X, one for each combination of parents’ values: $P(X \mid a_1, \ldots, a_n)$
    §  CPT: conditional probability table
    §  Description of a noisy “causal” process

[Figure: parent nodes A1, ..., An, each with an arc into X]

A Bayes net = Topology (graph) + Local Conditional Probabilities

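Probabilities in BNs

§  Bayes’ nets implicitly encode joint distributions
    §  As a product of local conditional distributions
    §  To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
        $P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{parents}(X_i))$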

§  Example:

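Probabilities in BNs

§  Why are we guaranteed that setting
    $P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{parents}(X_i))$
    results in a proper joint distribution?
§  Chain rule (valid for all distributions):
    $P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})$
§  Assume conditional independences:
    $P(x_i \mid x_1, \ldots, x_{i-1}) = P(x_i \mid \mathrm{parents}(X_i))$
    → Consequence:
    $P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{parents}(X_i))$
§  Not every BN can represent every joint distribution
    §  The topology enforces certain conditional independencies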

Example: Coin Flips

[Figure: nodes X1, X2, ..., Xn with no arcs, each with its own table]

h  0.5      h  0.5      h  0.5
t  0.5      t  0.5      t  0.5

Only distributions whose variables are absolutely independent can be represented by a Bayes’ net with no arcs.

Example: Traffic

§  Causal direction (arc R → T)

R    P(R)
+r   1/4
-r   3/4

R    T    P(T|R)
+r   +t   3/4
+r   -t   1/4
-r   +t   1/2
-r   -t   1/2

R    T    P(T,R)
+r   +t   3/16
+r   -t   1/16
-r   +t   6/16
-r   -t   6/16

Example: Reverse Traffic

§  Reverse causality? (arc T → R)

T    P(T)
+t   9/16
-t   7/16

T    R    P(R|T)
+t   +r   1/3
+t   -r   2/3
-t   +r   1/7
-t   -r   6/7

R    T    P(T,R)
+r   +t   3/16
+r   -t   1/16
-r   +t   6/16
-r   -t   6/16
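The reversed tables can be derived from the causal ones: build the joint from P(R) and P(T | R), marginalize to get P(T), and divide to get P(R | T). The sketch below does this with exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction as F

# Causal-direction tables from the slides: P(R) and P(T | R).
p_r = {"+r": F(1, 4), "-r": F(3, 4)}
p_t_given_r = {
    ("+t", "+r"): F(3, 4), ("-t", "+r"): F(1, 4),
    ("+t", "-r"): F(1, 2), ("-t", "-r"): F(1, 2),
}
t_values = ("+t", "-t")

# Joint P(r, t) = P(r) * P(t | r).
joint = {(r, t): p_r[r] * p_t_given_r[(t, r)] for r in p_r for t in t_values}

# Reverse direction: P(t) by summing out R, then P(r | t) = P(r, t) / P(t).
p_t = {t: sum(joint[(r, t)] for r in p_r) for t in t_values}
p_r_given_t = {(r, t): joint[(r, t)] / p_t[t] for r in p_r for t in t_values}

# The reversed net P(t) * P(r | t) encodes exactly the same joint.
reverse_joint = {(r, t): p_t[t] * p_r_given_t[(r, t)] for r in p_r for t in t_values}
same_joint = reverse_joint == joint
```

Both arc directions represent the same distribution, which is why the arrows alone cannot be read as causal claims.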

Causality?

§  When Bayes’ nets reflect the true causal patterns:
    §  Often simpler (nodes have fewer parents)
    §  Often easier to think about
    §  Often easier to elicit from experts
§  BNs need not actually be causal
    §  Sometimes no causal net exists over the domain (especially if variables are missing)
    §  E.g. consider the variables Traffic and Drips
    §  End up with arrows that reflect correlation, not causation
§  What do the arrows really mean?
    §  Topology may happen to encode causal structure
    §  Topology really encodes conditional independence

Example: Alarm Network

B    P(B)
+b   0.001
-b   0.999

E    P(E)
+e   0.002
-e   0.998

B    E    A    P(A|B,E)
+b   +e   +a   0.95
+b   +e   -a   0.05
+b   -e   +a   0.94
+b   -e   -a   0.06
-b   +e   +a   0.29
-b   +e   -a   0.71
-b   -e   +a   0.001
-b   -e   -a   0.999

A    J    P(J|A)
+a   +j   0.9
+a   -j   0.1
-a   +j   0.05
-a   -j   0.95

A    M    P(M|A)
+a   +m   0.7
+a   -m   0.3
-a   +m   0.01
-a   -m   0.99

[Figure: B → A ← E, A → J, A → M]
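To see what probability this network assigns to a full assignment, multiply one entry from each CPT. The following minimal sketch encodes the tables above with booleans standing for the + and - values; the helper `bern` and the chosen example assignment are illustrative, not from the slides.

```python
from itertools import product

# CPTs from the alarm network; each entry stores the probability of the "+" value.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(+a | b, e)
P_J = {True: 0.9, False: 0.05}                        # P(+j | a)
P_M = {True: 0.7, False: 0.01}                        # P(+m | a)


def bern(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(b, e, a, j, m) = P(b) P(e) P(a | b, e) P(j | a) P(m | a)."""
    return (P_B[b] * P_E[e] * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))


# Example: burglary, no earthquake, alarm sounds, both John and Mary call.
p_example = joint(True, False, True, True, True)

# Summing over all 32 full assignments should give 1 (a proper distribution).
total = sum(joint(*world) for world in product((True, False), repeat=5))
```

The sum over all assignments coming out to 1 is the numerical counterpart of the chain-rule argument that the product of CPT entries is a proper joint distribution.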


Size of a Bayes’ Net

§  How big is a joint distribution over N Boolean variables?
    $2^N$
§  How big is an N-node net if nodes have up to k parents?
    $O(N \cdot 2^{k+1})$
§  Both give you the power to calculate $P(X_1, X_2, \ldots, X_N)$
§  BNs: Huge space savings!
§  Also easier to elicit local CPTs
§  Also faster to answer queries (coming)
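The two size formulas are easy to compare numerically. The sketch below is only arithmetic on the formulas from this slide; the function names and the example values N = 30, k = 5 are illustrative assumptions.

```python
def joint_table_size(n):
    """Number of entries in a full joint table over n Boolean variables."""
    return 2 ** n


def bayes_net_size_bound(n, k):
    """Upper bound on CPT entries: n nodes, each with at most k Boolean parents."""
    return n * 2 ** (k + 1)


# Hypothetical example: 30 Boolean variables, at most 5 parents per node.
full_joint_entries = joint_table_size(30)
bayes_net_entries = bayes_net_size_bound(30, 5)

# The alarm network has 5 nodes with at most 2 parents each.
alarm_joint_entries = joint_table_size(5)
alarm_net_bound = bayes_net_size_bound(5, 2)
```

For the alarm network the bound is loose: its five CPTs have 2 + 2 + 8 + 4 + 4 = 20 rows, fewer than the 32 entries of the full joint, and the gap grows exponentially with N when k stays small.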

Bayes’ Nets

§  So far: how a Bayes’ net encodes a joint distribution
§  Next: how to answer queries about that distribution
    §  Today:
        §  First assembled BNs using an intuitive notion of conditional independence as causality
        §  Then saw that key property is conditional independence
    §  Main goal: answer queries about conditional independence and influence
§  After that: how to answer numerical queries (inference)

Bayes’ Nets

§  Representation
§  Conditional Independences
§  Probabilistic Inference
§  Learning Bayes’ Nets from Data

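Conditional Independence

§  X and Y are independent if
    $\forall x, y : P(x, y) = P(x) P(y)$, written $X \perp\!\!\!\perp Y$
§  X and Y are conditionally independent given Z if
    $\forall x, y, z : P(x, y \mid z) = P(x \mid z) P(y \mid z)$, written $X \perp\!\!\!\perp Y \mid Z$
§  (Conditional) independence is a property of a distribution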

§  Example:

Bayes Nets: Assumptions

§  Assumptions we are required to make to define the Bayes net when given the graph:
    $P(x_i \mid x_1, \ldots, x_{i-1}) = P(x_i \mid \mathrm{parents}(X_i))$
§  Beyond above “chain rule → Bayes net” conditional independence assumptions
    §  Often additional conditional independences
    §  They can be read off the graph
§  Important for modeling: understand assumptions made when choosing a Bayes net graph

Example

[Figure: Bayes’ net over X, Y, Z, W]

§  Conditional independence assumptions directly from simplifications in chain rule:
§  Additional implied conditional independence assumptions?

Independence in a BN

§  Important question about a BN:
    §  Are two nodes independent given certain evidence?
    §  If yes, can prove using algebra (tedious in general)
    §  If no, can prove with a counterexample
    §  Example: X → Y → Z
    §  Question: are X and Z necessarily independent?
    §  Answer: no. Example: low pressure causes rain, which causes traffic.
    §  X can influence Z, Z can influence X (via Y)
    §  Addendum: they could be independent: how?

D-separation: Outline

§  Study independence properties for triples
§  Analyze complex cases in terms of member triples
§  D-separation: a condition / algorithm for answering such queries

Causal Chains

§  This configuration is a “causal chain”

    X: Low pressure → Y: Rain → Z: Traffic

§  Guaranteed X independent of Z? No!
    §  One example set of CPTs for which X is not independent of Z is sufficient to show this independence is not guaranteed.
    §  Example:
        §  Low pressure causes rain causes traffic, high pressure causes no rain causes no traffic
        §  In numbers: P(+y | +x) = 1, P(-y | -x) = 1, P(+z | +y) = 1, P(-z | -y) = 1

Causal Chains

§  This configuration is a “causal chain”

    X: Low pressure → Y: Rain → Z: Traffic

§  Guaranteed X independent of Z given Y? Yes!
§  Evidence along the chain “blocks” the influence

Common Cause

§  This configuration is a “common cause”

    X: Forums busy ← Y: Project due → Z: Lab full

§  Guaranteed X independent of Z? No!
    §  One example set of CPTs for which X is not independent of Z is sufficient to show this independence is not guaranteed.
    §  Example:
        §  Project due causes both forums busy and lab full
        §  In numbers: P(+x | +y) = 1, P(-x | -y) = 1, P(+z | +y) = 1, P(-z | -y) = 1

Common Cause

§  This configuration is a “common cause”

    X: Forums busy ← Y: Project due → Z: Lab full

§  Guaranteed X and Z independent given Y? Yes!
§  Observing the cause blocks influence between effects.

Common Effect

§  Last configuration: two causes of one effect (v-structures)

    X: Raining → Z: Traffic ← Y: Ballgame

§  Are X and Y independent?
    §  Yes: the ballgame and the rain cause traffic, but they are not correlated
    §  Still need to prove they must be (try it!)
§  Are X and Y independent given Z?
    §  No: seeing traffic puts the rain and the ballgame in competition as explanation.
§  This is backwards from the other cases
    §  Observing an effect activates influence between possible causes.
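The "competition as explanation" effect can be seen numerically on a small v-structure. The slides give no CPTs for this example, so every number below is an assumed value chosen only for illustration, as are the helper names `joint` and `prob`.

```python
from itertools import product

# Assumed (hypothetical) CPTs for X: Raining, Y: Ballgame, Z: Traffic.
P_X = {True: 0.3, False: 0.7}
P_Y = {True: 0.2, False: 0.8}
P_Z_TRUE = {(True, True): 0.95, (True, False): 0.8,
            (False, True): 0.7, (False, False): 0.1}   # P(+z | x, y)

WORLDS = list(product((True, False), repeat=3))


def joint(x, y, z):
    """P(x, y, z) = P(x) P(y) P(z | x, y) for the v-structure X -> Z <- Y."""
    p_z = P_Z_TRUE[(x, y)]
    return P_X[x] * P_Y[y] * (p_z if z else 1.0 - p_z)


def prob(query, given=lambda x, y, z: True):
    """Conditional probability P(query | given) by enumerating all worlds."""
    numerator = sum(joint(*w) for w in WORLDS if query(*w) and given(*w))
    denominator = sum(joint(*w) for w in WORLDS if given(*w))
    return numerator / denominator


p_rain = prob(lambda x, y, z: x)
p_rain_given_game = prob(lambda x, y, z: x, lambda x, y, z: y)
p_rain_given_traffic = prob(lambda x, y, z: x, lambda x, y, z: z)
p_rain_given_traffic_and_game = prob(lambda x, y, z: x, lambda x, y, z: z and y)
```

Under these assumed numbers, learning about the ballgame alone leaves the probability of rain unchanged, but once traffic is observed, also learning about the ballgame lowers it: the ballgame "explains away" the traffic.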

The General Case

§  General question: in a given BN, are two variables independent (given evidence)?
§  Solution: analyze the graph
§  Any complex example can be broken into repetitions of the three canonical cases

Reachability

§  Recipe: shade evidence nodes, look for paths in the resulting graph
§  Attempt 1: if every undirected path between two nodes is blocked by a shaded node, they are conditionally independent
§  Almost works, but not quite
    §  Where does it break?
    §  Answer: the v-structure at T doesn’t count as a link in a path unless “active”

[Figure: Bayes’ net over L, R, B, D, T]

Active / Inactive Paths

§  Question: Are X and Y conditionally independent given evidence variables {Z}?
    §  Yes, if X and Y “d-separated” by Z
    §  Consider all (undirected) paths from X to Y
    §  No active paths = independence!
§  A path is active if each triple is active:
    §  Causal chain A → B → C where B is unobserved (either direction)
    §  Common cause A ← B → C where B is unobserved
    §  Common effect (aka v-structure) A → B ← C where B or one of its descendants is observed
§  All it takes to block a path is a single inactive segment

[Figure: active triples and inactive triples]

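D-Separation

§  Query: $X_i \perp\!\!\!\perp X_j \mid \{X_{k_1}, \ldots, X_{k_n}\}$ ?
§  Check all (undirected!) paths between $X_i$ and $X_j$
    §  If one or more active, then independence not guaranteed:
        $X_i \not\perp\!\!\!\perp X_j \mid \{X_{k_1}, \ldots, X_{k_n}\}$
    §  Otherwise (i.e. if all paths are inactive), then independence is guaranteed:
        $X_i \perp\!\!\!\perp X_j \mid \{X_{k_1}, \ldots, X_{k_n}\}$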
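The path-checking recipe can be sketched directly: enumerate undirected paths from one node to the other and prune as soon as a triple is inactive. This is a minimal illustration, not an efficient algorithm. The function names are hypothetical, the arcs in `EDGES` are an assumed structure for the traffic variables (L, R, D, B, T, with `T2` standing in for T’) since the transcript does not preserve the figure, and the query nodes are assumed not to be in the evidence set.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following arcs forward."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def triple_active(edges, children, a, b, c, evidence):
    """Is the consecutive triple a - b - c active given the evidence?"""
    if (a, b) in edges and (c, b) in edges:
        # Common effect a -> b <- c: active if b or a descendant is observed.
        return b in evidence or bool(descendants(children, b) & evidence)
    # Causal chain or common cause: active only if b is unobserved.
    return b not in evidence


def d_separated(edges, x, y, evidence=()):
    """True if every undirected path between x and y is inactive."""
    edges, evidence = set(edges), set(evidence)
    children, neighbors = {}, {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    def active_path_exists(path):
        last = path[-1]
        if last == y:
            return True
        for nxt in neighbors.get(last, ()):
            if nxt in path:
                continue  # only simple paths are considered
            if len(path) >= 2 and not triple_active(
                    edges, children, path[-2], last, nxt, evidence):
                continue  # a single inactive triple blocks this path
            if active_path_exists(path + [nxt]):
                return True
        return False

    return not active_path_exists([x])


# Assumed arcs (parent, child) for illustration only.
EDGES = [("L", "R"), ("R", "D"), ("R", "T"), ("B", "T"), ("T", "T2")]

l_b = d_separated(EDGES, "L", "B")
l_b_given_t = d_separated(EDGES, "L", "B", ["T"])
l_b_given_t2 = d_separated(EDGES, "L", "B", ["T2"])
l_b_given_t_r = d_separated(EDGES, "L", "B", ["T", "R"])
l_t2_given_t = d_separated(EDGES, "L", "T2", ["T"])
```

On this assumed graph, L and B are d-separated with no evidence, become connected once T or its descendant `T2` is observed, and are separated again when R is also observed.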

Example

[Figure: Bayes’ net over R, B, T, T’, with d-separation queries; answer shown: Yes]

Example

[Figure: Bayes’ net over L, R, B, D, T, T’, with d-separation queries; answers shown: Yes, Yes, Yes]

Example

§  Variables:
    §  R: Raining
    §  T: Traffic
    §  D: Roof drips
    §  S: I’m sad
§  Questions:

[Figure: Bayes’ net over R, T, D, S, with d-separation queries; answer shown: Yes]

Structure Implications

§  Given a Bayes net structure, can run d-separation algorithm to build a complete list of conditional independences that are necessarily true of the form
    $X_i \perp\!\!\!\perp X_j \mid \{X_{k_1}, \ldots, X_{k_n}\}$
§  This list determines the set of probability distributions that can be represented

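Computing All Independences

[Figure: three-node Bayes’ nets over X, Y, Z]

$\{X \perp\!\!\!\perp Y,\ X \perp\!\!\!\perp Z,\ Y \perp\!\!\!\perp Z,\ X \perp\!\!\!\perp Z \mid Y,\ X \perp\!\!\!\perp Y \mid Z,\ Y \perp\!\!\!\perp Z \mid X\}$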

Topology Limits Distributions

§  Given some graph topology G, only certain joint distributions can be encoded
§  The graph structure guarantees certain (conditional) independences
§  (There might be more independence)
§  Adding arcs increases the set of distributions, but has several costs
§  Full conditioning can encode any distribution

[Figure: three-node Bayes’ nets over X, Y, Z, grouped by the independences they guarantee, including $\{X \perp\!\!\!\perp Z \mid Y\}$ and $\{\}$]

Bayes Nets Representation Summary

§  Bayes nets compactly encode joint distributions
§  Guaranteed independencies of distributions can be deduced from BN graph structure
§  D-separation gives precise conditional independence guarantees from graph alone
§  A Bayes’ net’s joint distribution may have further (conditional) independence that is not detectable until you inspect its specific distribution

Bayes’ Nets

§  Representation
§  Conditional Independences
§  Probabilistic Inference
    §  Enumeration (exact, exponential complexity)
    §  Variable elimination (exact, worst-case exponential complexity, often better)
    §  Probabilistic inference is NP-complete
    §  Sampling (approximate)
§  Learning Bayes’ Nets from Data