bayes’ nets - university of washington · cse 473: ar+ficial intelligence bayes’ nets luke...
TRANSCRIPT
CSE473:Ar+ficialIntelligence
Bayes’Nets
LukeZe?lemoyer---UniversityofWashington[TheseslideswerecreatedbyDanKleinandPieterAbbeelforCS188IntrotoAIatUCBerkeley.AllCS188materialsareavailableath?p://ai.berkeley.edu.]
Probabilis+cModels
§ Modelsdescribehow(apor+onof)theworldworks§ Modelsarealwayssimplifica+ons
§ Maynotaccountforeveryvariable§ Maynotaccountforallinterac+onsbetweenvariables§ “Allmodelsarewrong;butsomeareuseful.”
–GeorgeE.P.Box
§ Whatdowedowithprobabilis+cmodels?§ We(orouragents)needtoreasonaboutunknown
variables,givenevidence§ Example:explana+on(diagnos+creasoning)§ Example:predic+on(causalreasoning)§ Example:valueofinforma+on
§ Twovariablesareindependentif:
§ Thissaysthattheirjointdistribu+onfactorsintoaproducttwosimplerdistribu+ons
§ Anotherform:
§ Wewrite:
§ Independenceisasimplifyingmodelingassump2on
§ Empiricaljointdistribu+ons:atbest“close”toindependent
§ Whatcouldweassumefor{Weather,Traffic,Cavity,Toothache}?
Independence
Example:Independence?
T W P
hot sun 0.4
hot rain 0.1
cold sun 0.2
cold rain 0.3
T W P
hot sun 0.3
hot rain 0.2
cold sun 0.3
cold rain 0.2
T P
hot 0.5
cold 0.5
W P
sun 0.6
rain 0.4
Condi+onalIndependence§ P(Toothache,Cavity,Catch)
§ IfIhaveacavity,theprobabilitythattheprobecatchesinitdoesn'tdependonwhetherIhaveatoothache:§ P(+catch|+toothache,+cavity)=P(+catch|+cavity)
§ ThesameindependenceholdsifIdon’thaveacavity:§ P(+catch|+toothache,-cavity)=P(+catch|-cavity)
§ Catchiscondi2onallyindependentofToothachegivenCavity:§ P(Catch|Toothache,Cavity)=P(Catch|Cavity)
§ Equivalentstatements:§ P(Toothache|Catch,Cavity)=P(Toothache|Cavity)§ P(Toothache,Catch|Cavity)=P(Toothache|Cavity)P(Catch|Cavity)§ Onecanbederivedfromtheothereasily
Condi+onalIndependence
§ Uncondi+onal(absolute)independenceveryrare(why?)
§ Condi2onalindependenceisourmostbasicandrobustformofknowledgeaboutuncertainenvironments.
§ Xiscondi+onallyindependentofYgivenZ
ifandonlyif:or,equivalently,ifandonlyif
Condi+onalIndependenceandtheChainRule
§ Chainrule:
§ Trivialdecomposi+on:
§ Withassump+onofcondi+onalindependence:
§ Bayes’nets/graphicalmodelshelpusexpresscondi+onalindependenceassump+ons
GhostbustersChainRule
§ Eachsensordependsonlyonwheretheghostis
§ Thatmeans,thetwosensorsarecondi+onallyindependent,giventheghostposi+on
§ T:TopsquareisredB:Bo?omsquareisredG:Ghostisinthetop
§ Givens:
P(+g)=0.5P(-g)=0.5P(+t|+g)=0.8P(+t|-g)=0.4P(+b|+g)=0.4P(+b|-g)=0.8
P(T,B,G)=P(G)P(T|G)P(B|G)
T B G P(T,B,G)
+t +b +g 0.16
+t +b -g 0.16
+t -b +g 0.24
+t -b -g 0.04
-t +b +g 0.04
-t +b -g 0.24
-t -b +g 0.06
-t -b -g 0.06
Bayes’Nets:BigPicture
§ Twoproblemswithusingfulljointdistribu+ontablesasourprobabilis+cmodels:§ Unlessthereareonlyafewvariables,thejointisWAYtoo
bigtorepresentexplicitly§ Hardtolearn(es+mate)anythingempiricallyaboutmore
thanafewvariablesata+me
§ Bayes’nets:atechniquefordescribingcomplexjointdistribu+ons(models)usingsimple,localdistribu+ons(condi+onalprobabili+es)§ Moreproperlycalledgraphicalmodels§ Wedescribehowvariableslocallyinteract§ Localinterac+onschaintogethertogiveglobal,indirect
interac+ons§ Forabout10min,we’llbevagueabouthowthese
interac+onsarespecified
GraphicalModelNota+on
§ Nodes:variables(withdomains)§ Canbeassigned(observed)orunassigned
(unobserved)
§ Arcs:interac+ons§ Indicate“directinfluence”betweenvariables§ Formally:encodecondi+onalindependence
(morelater)
§ Fornow:imaginethatarrowsmeandirectcausa+on(ingeneral,theydon’t!)
Example:CoinFlips
§ Nindependentcoinflips
§ Nointerac+onsbetweenvariables:absoluteindependence
X1 X2 Xn
Example:Traffic
§ Variables:§ R:Itrains§ T:Thereistraffic
§ Model1:independence
§ Whyisanagentusingmodel2be?er?
R
T
R
T
§ Model2:raincausestraffic
§ Let’sbuildacausalgraphicalmodel!§ Variables
§ T:Traffic§ R:Itrains§ L:Lowpressure§ D:Roofdrips§ B:Ballgame§ C:Cavity
Example:TrafficII
Example:AlarmNetwork§ Variables
§ B:Burglary§ A:Alarmgoesoff§ M:Marycalls§ J:Johncalls§ E:Earthquake!
Bayes’NetSeman+cs
§ Asetofnodes,onepervariableX
§ Adirected,acyclicgraph
§ Acondi+onaldistribu+onforeachnode§ Acollec+onofdistribu+onsoverX,oneforeach
combina+onofparents’values
§ CPT:condi+onalprobabilitytable
§ Descrip+onofanoisy“causal”process
A1
X
An
ABayesnet=Topology(graph)+LocalCondi2onalProbabili2es
Probabili+esinBNs
§ Bayes’netsimplicitlyencodejointdistribu+ons
§ Asaproductoflocalcondi+onaldistribu+ons
§ ToseewhatprobabilityaBNgivestoafullassignment,mul+plyalltherelevantcondi+onalstogether:
§ Example:
Probabili+esinBNs
§ Whyareweguaranteedthatseyng
resultsinaproperjointdistribu+on?
§ Chainrule(validforalldistribu+ons):
§ Assumecondi+onalindependences:
àConsequence:
§ NoteveryBNcanrepresenteveryjointdistribu+on§ Thetopologyenforcescertaincondi+onalindependencies
Onlydistribu2onswhosevariablesareabsolutelyindependentcanberepresentedbyaBayes’netwithnoarcs.
Example:CoinFlips
h 0.5
t 0.5
h 0.5
t 0.5
h 0.5
t 0.5
X1 X2 Xn
Example:Traffic
§ Causaldirec+on
R
T
+r 1/4
-r 3/4
+r +t 3/4
-t 1/4
-r +t 1/2
-t 1/2
+r +t 3/16
+r -t 1/16
-r +t 6/16
-r -t 6/16
Example:ReverseTraffic
§ Reversecausality?
T
R
+t 9/16
-t 7/16
+t +r 1/3
-r 2/3
-t +r 1/7
-r 6/7
+r +t 3/16
+r -t 1/16
-r +t 6/16
-r -t 6/16
Causality?
§ WhenBayes’netsreflectthetruecausalpa?erns:§ O{ensimpler(nodeshavefewerparents)§ O{eneasiertothinkabout§ O{eneasiertoelicitfromexperts
§ BNsneednotactuallybecausal§ Some+mesnocausalnetexistsoverthedomain
(especiallyifvariablesaremissing)§ E.g.considerthevariablesTrafficandDrips§ Endupwitharrowsthatreflectcorrela+on,notcausa+on
§ Whatdothearrowsreallymean?§ Topologymayhappentoencodecausalstructure§ Topologyreallyencodescondi+onalindependence
Example:AlarmNetworkB P(B)
+b 0.001
-b 0.999
E P(E)
+e 0.002
-e 0.998
B E A P(A|B,E)
+b +e +a 0.95
+b +e -a 0.05
+b -e +a 0.94
+b -e -a 0.06
-b +e +a 0.29
-b +e -a 0.71
-b -e +a 0.001
-b -e -a 0.999
A J P(J|A)
+a +j 0.9
+a -j 0.1
-a +j 0.05
-a -j 0.95
A M P(M|A)
+a +m 0.7
+a -m 0.3
-a +m 0.01
-a -m 0.99
B E
A
MJ
Example:AlarmNetworkB P(B)
+b 0.001
-b 0.999
E P(E)
+e 0.002
-e 0.998
B E A P(A|B,E)
+b +e +a 0.95
+b +e -a 0.05
+b -e +a 0.94
+b -e -a 0.06
-b +e +a 0.29
-b +e -a 0.71
-b -e +a 0.001
-b -e -a 0.999
A J P(J|A)
+a +j 0.9
+a -j 0.1
-a +j 0.05
-a -j 0.95
A M P(M|A)
+a +m 0.7
+a -m 0.3
-a +m 0.01
-a -m 0.99
B E
A
MJ
SizeofaBayes’Net
§ Howbigisajointdistribu+onoverNBooleanvariables?
2N
§ HowbigisanN-nodenetifnodeshaveuptokparents?
O(N*2k+1)
§ Bothgiveyouthepowertocalculate
§ BNs:Hugespacesavings!
§ AlsoeasiertoelicitlocalCPTs
§ Alsofastertoanswerqueries(coming)
Bayes’Nets
§ Sofar:howaBayes’netencodesajointdistribu+on
§ Next:howtoanswerqueriesaboutthatdistribu+on§ Today:
§ FirstassembledBNsusinganintui+veno+onofcondi+onalindependenceascausality
§ Thensawthatkeypropertyiscondi+onalindependence§ Maingoal:answerqueriesaboutcondi+onal
independenceandinfluence
§ A{erthat:howtoanswernumericalqueries(inference)
Bayes’Nets
§ Representa+on
§ Condi+onalIndependences
§ Probabilis+cInference
§ LearningBayes’NetsfromData
Condi+onalIndependence
§ XandYareindependentif
§ XandYarecondi+onallyindependentgivenZ
§ (Condi+onal)independenceisapropertyofadistribu+on
§ Example:
BayesNets:Assump+ons
§ Assump+onswearerequiredtomaketodefinetheBayesnetwhengiventhegraph:
§ Beyondabove“chainruleàBayesnet”condi+onalindependenceassump+ons
§ O{enaddi+onalcondi+onalindependences
§ Theycanbereadoffthegraph
§ Importantformodeling:understandassump+onsmadewhenchoosingaBayesnetgraph
P (xi|x1 · · ·xi�1) = P (xi|parents(Xi))
Example
§ Condi+onalindependenceassump+onsdirectlyfromsimplifica+onsinchainrule:
§ Addi+onalimpliedcondi+onalindependenceassump+ons?
X Y Z W
IndependenceinaBN
§ Importantques+onaboutaBN:§ Aretwonodesindependentgivencertainevidence?§ Ifyes,canproveusingalgebra(tediousingeneral)§ Ifno,canprovewithacounterexample§ Example:
§ Ques+on:areXandZnecessarilyindependent?§ Answer:no.Example:lowpressurecausesrain,whichcausestraffic.§ XcaninfluenceZ,ZcaninfluenceX(viaY)§ Addendum:theycouldbeindependent:how?
X Y Z
D-separa+on:Outline
§ Studyindependenceproper+esfortriples
§ Analyzecomplexcasesintermsofmembertriples
§ D-separa+on:acondi+on/algorithmforansweringsuchqueries
CausalChains
§ Thisconfigura+onisa“causalchain”
X:LowpressureY:RainZ:Traffic
§ GuaranteedXindependentofZ?No!
§ OneexamplesetofCPTsforwhichXisnotindependentofZissufficienttoshowthisindependenceisnotguaranteed.
§ Example:
§ Lowpressurecausesraincausestraffic,highpressurecausesnoraincausesnotraffic
§ Innumbers:P(+y|+x)=1,P(-y|-x)=1,P(+z|+y)=1,P(-z|-y)=1
CausalChains
§ Thisconfigura+onisa“causalchain” § GuaranteedXindependentofZgivenY?
§ Evidencealongthechain“blocks”theinfluence
Yes!
X:LowpressureY:RainZ:Traffic
CommonCause
§ Thisconfigura+onisa“commoncause” § GuaranteedXindependentofZ?No!
§ OneexamplesetofCPTsforwhichXisnotindependentofZissufficienttoshowthisindependenceisnotguaranteed.
§ Example:
§ Projectduecausesbothforumsbusyandlabfull
§ Innumbers:P(+x|+y)=1,P(-x|-y)=1,P(+z|+y)=1,P(-z|-y)=1
Y:Projectdue
X:Forumsbusy Z:Labfull
CommonCause
§ Thisconfigura+onisa“commoncause” § GuaranteedXandZindependentgivenY?
§ Observingthecauseblocksinfluencebetweeneffects.
Yes!
Y:Projectdue
X:Forumsbusy Z:Labfull
CommonEffect§ Lastconfigura+on:twocausesofone
effect(v-structures)
Z:Traffic
§ AreXandYindependent?§ Yes:theballgameandtheraincausetraffic,but
theyarenotcorrelated
§ S+llneedtoprovetheymustbe(tryit!)
§ AreXandYindependentgivenZ?
§ No:seeingtrafficputstherainandtheballgameincompe++onasexplana+on.
§ Thisisbackwardsfromtheothercases
§ Observinganeffectac+vatesinfluencebetweenpossiblecauses.
X:Raining Y:Ballgame
TheGeneralCase
§ Generalques+on:inagivenBN,aretwovariablesindependent(givenevidence)?
§ Solu+on:analyzethegraph
§ Anycomplexexamplecanbebrokenintorepe++onsofthethreecanonicalcases
Reachability
§ Recipe:shadeevidencenodes,lookforpathsintheresul+nggraph
§ A?empt1:iftwonodesareconnectedbyanundirectedpathnotblockedbyashadednode,theyarecondi+onallyindependent
§ Almostworks,butnotquite§ Wheredoesitbreak?§ Answer:thev-structureatTdoesn’tcount
asalinkinapathunless“ac+ve”
R
T
B
D
L
Ac+ve/Inac+vePaths
§ Ques+on:AreXandYcondi+onallyindependentgivenevidencevariables{Z}?§ Yes,ifXandY“d-separated”byZ§ Considerall(undirected)pathsfromXtoY
§ Noac+vepaths=independence!
§ Apathisac+veifeachtripleisac+ve:§ CausalchainA→B→CwhereBisunobserved(eitherdirec+on)§ CommoncauseA←B→CwhereBisunobserved§ Commoneffect(akav-structure)
A→B←CwhereBoroneofitsdescendentsisobserved
§ Allittakestoblockapathisasingleinac+vesegment
Ac+veTriples Inac+veTriples
§ Query:
§ Checkall(undirected!)pathsbetweenand§ Ifoneormoreac+ve,thenindependencenotguaranteed
§ Otherwise(i.e.ifallpathsareinac+ve),thenindependenceisguaranteed
D-Separa+on
Xi �� Xj |{Xk1 , ..., Xkn}
Xi �� Xj |{Xk1 , ..., Xkn}
?
Xi �� Xj |{Xk1 , ..., Xkn}
StructureImplica+ons
§ GivenaBayesnetstructure,canrund-separa+onalgorithmtobuildacompletelistofcondi+onalindependencesthatarenecessarilytrueoftheform
§ Thislistdeterminesthesetofprobabilitydistribu+onsthatcanberepresented
Xi �� Xj |{Xk1 , ..., Xkn}
XY
Z
{X �� Y,X �� Z, Y �� Z,
X �� Z | Y,X �� Y | Z, Y �� Z | X}
TopologyLimitsDistribu+ons
§ GivensomegraphtopologyG,onlycertainjointdistribu+onscanbeencoded
§ Thegraphstructureguaranteescertain(condi+onal)independences
§ (Theremightbemoreindependence)
§ Addingarcsincreasesthesetofdistribu+ons,buthasseveralcosts
§ Fullcondi+oningcanencodeanydistribu+on
X
Y
Z
X
Y
Z
X
Y
Z
{X �� Z | Y }
X
Y
Z X
Y
Z X
Y
Z
X
Y
Z X
Y
Z X
Y
Z
{}
BayesNetsRepresenta+onSummary
§ Bayesnetscompactlyencodejointdistribu+ons
§ Guaranteedindependenciesofdistribu+onscanbededucedfromBNgraphstructure
§ D-separa+ongivesprecisecondi+onalindependenceguaranteesfromgraphalone
§ ABayes’net’sjointdistribu+onmayhavefurther(condi+onal)independencethatisnotdetectableun+lyouinspectitsspecificdistribu+on