Page 1: BOA (Bayesian Optimization Algorithm)

BOA (Bayesian Optimization Algorithm) for Dummies

Hsuan Lee

Page 2: BOA (Bayesian Optimization Algorithm)

Hsuan Lee @ NTUEE

References

Martin Pelikan: Hierarchical Bayesian Optimization Algorithm, StudFuzz 170, 31–48 (2005) //BOA

Martin Pelikan and D. E. Goldberg: Hierarchical Bayesian Optimization Algorithm, Studies in Computational Intelligence (SCI) 33, 63–90 (2006) //hBOA

Cooper, G. F. and Herskovits, E. H. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347.

Heckerman, D., Geiger, D., and Chickering, D. M. (1994). Learning Bayesian networks: The combination of knowledge and statistical data. Technical Report MSR-TR-94-09, Microsoft Research, Redmond, WA.

Friedman, N., and Goldszmidt, M. (1999). Learning Bayesian networks with local structure. In Jordan, M. I. (Ed.), Graphical models, pp. 421–459. MIT, Cambridge, MA.

2010.10.07

Page 3: BOA (Bayesian Optimization Algorithm)

Generating Offspring

Mutate: Asexual Reproduction
Use ONE fit chromosome. Change it slightly to form an offspring.
E.g. ES

Crossover: Sexual Reproduction
Use a PAIR of fit chromosomes. Take parts of each to form an offspring.
E.g. sGA, DSMGA

EDA: Group Reproduction
Use a GROUP of fit chromosomes to build a model. Sample the model to generate an offspring.
E.g. DSMGA(?) for SpinGlass, BOA

Page 4: BOA (Bayesian Optimization Algorithm)

Bayesian Optimization Algorithm Pseudo Code

Bayesian Optimization Algorithm (BOA)
  t ← 0;
  generate initial population P(0);
  while (not done) {
    SELECT population of promising solutions S(t);
    BUILD Bayesian network (BN) B(t) from S(t);
    SAMPLE B(t) to generate O(t);
    incorporate O(t) into P(t);  // REPLACEMENT
    t ← t+1;
  }
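The loop in the pseudocode can be sketched in Python as follows. This is only a skeleton under stated assumptions: the BUILD step is replaced by an edge-free network (one independent marginal per bit), so a real BOA would learn edges here instead. The fitness function (OneMax), truncation selection, the "replace the worst half" rule, and all function names are illustrative choices, not taken from the slides.

```python
import random

def onemax(bits):
    # Toy fitness (an assumption): number of ones in the chromosome.
    return sum(bits)

def build_model(selected):
    # Placeholder for BUILD: an edge-free network, i.e. P(bit_i = 1) per position.
    n = len(selected[0])
    return [sum(ind[i] for ind in selected) / len(selected) for i in range(n)]

def sample_model(model, count, rng):
    # SAMPLE: draw each bit independently from its marginal.
    return [[1 if rng.random() < p else 0 for p in model] for _ in range(count)]

def boa_skeleton(n_bits=20, pop_size=100, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []  # best fitness after each generation
    for _ in range(generations):
        population.sort(key=onemax, reverse=True)
        selected = population[: pop_size // 2]                    # SELECT
        model = build_model(selected)                             # BUILD
        offspring = sample_model(model, pop_size - len(selected), rng)  # SAMPLE
        population = selected + offspring                         # REPLACEMENT
        history.append(max(onemax(ind) for ind in population))
    return max(population, key=onemax), history
```

Because the selected half always survives, the best fitness recorded in `history` can never decrease from one generation to the next.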

Page 5: BOA (Bayesian Optimization Algorithm)

Bayesian Optimization Algorithm

[Flowchart of the BOA loop. Boxes: Initialization, Selection, Learning Bayesian Network, Sampling Bayesian Network, Evaluation, Replacement; the loop repeats until termination.]

Page 6: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Bayesian Network

A BN is a directed acyclic graph (DAG).

An edge A → B in a Bayesian network implies that the occurrence of A has an effect on the probability of B's occurrence. A is a parent of B. B is conditionally dependent on A.

Two nodes are assumed to be conditionally independent if there is no edge between them.

$$p(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \prod_{i=1}^{n} p\left(X_i = x_i \mid X_j = x_j \text{ for each } X_j \text{ which is a parent of } X_i\right)$$

Page 7: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Bayesian Network

[Figure: network with nodes Rain, Sprinkler, Wet Grass]

Rain
  T      F
  0.2    0.8

Sprinkler, given Rain (R)
  R    T      F
  F    0.4    0.6
  T    0.01   0.99

Wet Grass, given Sprinkler (S) and Rain (R)
  S    R    T      F
  F    F    0.0    1.0
  F    T    0.8    0.2
  T    F    0.9    0.1
  T    T    0.99   0.01
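As a small worked example, the factorization of the joint probability can be evaluated with the Rain, Sprinkler and Wet Grass tables from this slide. The variable and function names below are illustrative, and the parent assignment (Rain is a parent of Sprinkler; Sprinkler and Rain are parents of Wet Grass) is read off the table headers.

```python
# Conditional probability tables transcribed from the slide.
# Each entry is the probability that the node is True.
P_RAIN = 0.2
P_SPRINKLER = {False: 0.4, True: 0.01}        # indexed by Rain
P_WET = {                                     # indexed by (Sprinkler, Rain)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}

def bern(p_true, value):
    # Probability of `value` for a binary variable with P(True) = p_true.
    return p_true if value else 1.0 - p_true

def joint(rain, sprinkler, wet):
    # p(R, S, W) = p(R) * p(S | R) * p(W | S, R)
    return (bern(P_RAIN, rain)
            * bern(P_SPRINKLER[rain], sprinkler)
            * bern(P_WET[(sprinkler, rain)], wet))
```

For instance, `joint(True, False, True)` multiplies 0.2, 0.99 and 0.8, and the eight joint probabilities sum to 1.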

Page 8: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Learning a Bayesian Network from Data

Structure (B)
To learn the structure of a BN, we need:
  A scoring metric (or a set of scoring metrics) on structures
  A search procedure

Parameters (Θ, θ)
Given the structure of a BN, learning the parameters is straightforward: Maximum Likelihood (ML).

Learning parameters is easy, but learning the best BN structure is NP-complete.

Page 9: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Scoring Metrics: evaluations of a BN structure

Bayesian Metrics
Determine the likelihood of a structure given the observed data and some prior knowledge.
E.g. Bayesian Dirichlet Metric (BD)

Minimum Description Length Metrics
Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model.
E.g. Bayesian Information Criterion (BIC)

We'll come back to the scoring metrics later.

Page 10: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

The Search Procedure for a Good Bayesian Network

It can be shown that finding the best Bayesian network is NP-complete. But BOA does not require the best BN; a good BN is enough.

A greedy algorithm can be used to find a good BN.

Greedy algorithm of network construction
  initialize the network B (an empty network or the network of the last generation);
  done ← false;
  while (not done) {
    O ← all simple graph operations applicable to B;
    IF there exists an operation in O that improves score(B) THEN
      op ← operation from O that improves score(B) the most;
      apply op to B;
    ELSE
      done ← true;
  }
  return B;
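A minimal Python sketch of this greedy search is given below. It assumes that a network is represented as a frozenset of directed edges over nodes 0..n-1 and that the scoring metric is supplied by the caller as a function `score(edges)`. These representation choices and all names are assumptions made for illustration.

```python
from itertools import permutations

def is_acyclic(n, edges):
    # Kahn's algorithm: the graph is a DAG iff every node can be removed.
    indeg = [0] * n
    for _, b in edges:
        indeg[b] += 1
    stack = [v for v in range(n) if indeg[v] == 0]
    removed = 0
    while stack:
        v = stack.pop()
        removed += 1
        for a, b in edges:
            if a == v:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    return removed == n

def neighbors(n, edges):
    # All networks reachable by one edge addition, removal, or reversal.
    for a, b in permutations(range(n), 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                 # removal
            yield (edges - {(a, b)}) | {(b, a)}    # reversal
        elif (b, a) not in edges:
            yield edges | {(a, b)}                 # addition

def greedy_search(n, score, start=frozenset()):
    current = frozenset(start)
    current_score = score(current)
    while True:
        candidates = [frozenset(e) for e in neighbors(n, current)
                      if is_acyclic(n, e)]
        best = max(candidates, key=score, default=None)
        if best is None or score(best) <= current_score:
            return current          # no operation improves the score
        current, current_score = best, score(best)
```

The search stops at the first network that no single operation can improve, so it returns a local optimum of the score rather than the best network.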

Page 11: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Simple Graph Operations on a Bayesian Network

Edge Addition
Edge Removal
Edge Reversal

[Figure: example network with nodes Rain, Radar, Wet Road, Speed, Car Crash]

Page 12: BOA (Bayesian Optimization Algorithm)

Learning Bayesian Network

Learning Parameters

Maximum Likelihood (ML)

[Figure: example network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
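For binary chromosomes, maximum-likelihood parameter learning amounts to counting relative frequencies. The sketch below assumes the selected population is a list of bit tuples and that a node's parents are given as a list of bit positions; the function name is hypothetical.

```python
from collections import Counter

def ml_cpt(data, child, parents):
    """Estimate P(child = 1 | parent values) by relative frequency."""
    totals, ones = Counter(), Counter()
    for row in data:
        key = tuple(row[p] for p in parents)   # observed parent configuration
        totals[key] += 1
        ones[key] += row[child]
    # Only parent configurations that occur in the data get an entry.
    return {key: ones[key] / totals[key] for key in totals}
```

With `data = [(0, 0), (0, 1), (1, 1), (1, 1)]`, the call `ml_cpt(data, 1, [0])` gives probability 0.5 for bit 1 when bit 0 is 0, and 1.0 when bit 0 is 1.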

Page 13: BOA (Bayesian Optimization Algorithm)

Sampling Bayesian Network

Generate Offspring with a Bayesian Network

1. Given a Bayesian network with structure and parameters.
2. Perform a topological sort on the Bayesian network, which is a directed acyclic graph (DAG).
3. Assign values to the new chromosome bit by bit in topologically sorted order, according to the parameters.

[Figure: example network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
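The three steps can be sketched with the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later). The data layout is an assumption: `parents` maps each node to a tuple of its parents, and `cpt` maps each node to a dictionary from parent values to the probability that the node is 1.

```python
import random
from graphlib import TopologicalSorter

def sample_individual(parents, cpt, rng):
    """Draw one offspring by visiting nodes so that parents come first."""
    values = {}
    # TopologicalSorter takes node -> predecessors, which is exactly `parents`.
    for node in TopologicalSorter(parents).static_order():
        key = tuple(values[p] for p in parents[node])
        values[node] = 1 if rng.random() < cpt[node][key] else 0
    return values
```

Because every node is visited after its parents, the parent values needed to look up the conditional probability are always available.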

Page 14: BOA (Bayesian Optimization Algorithm)

Bayesian Optimization Algorithm

[Flowchart of the BOA loop. Boxes: Initialization, Selection, Learning Bayesian Network, Sampling Bayesian Network, Evaluation, Replacement; the loop repeats until termination.]

Page 15: BOA (Bayesian Optimization Algorithm)

Scoring Metrics Revisited

Minimum Description Length Metrics
Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model.

Bayesian Information Criterion (BIC)

B: Bayesian structure
H(A|B): conditional entropy of A given B
N: population size
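The BIC formula itself did not survive in this transcript. The form usually given in the Pelikan references for binary variables, reproduced here from memory and therefore to be checked against those papers, is shown below. The symbols n (number of bits) and Π_i (the set of parents of X_i) are notation introduced for this sketch.

```latex
\mathrm{BIC}(B) = \sum_{i=1}^{n} \left( -H(X_i \mid \Pi_i)\, N \;-\; 2^{|\Pi_i|}\, \frac{\log_2 N}{2} \right)
```

The first term rewards structures that compress the data well; the second term charges for the number of parameters that each node's conditional probability table requires.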

Page 16: BOA (Bayesian Optimization Algorithm)

Scoring Metrics Revisited

Bayesian Metrics
Determine the likelihood of a structure given the observed data and some prior knowledge.

Bayesian Dirichlet Metric (BD)

B: Bayesian structure
D: observed data
ξ: prior information
N_ijk: number of observed data points that have value k on bit i with the parent string j
N'_ijk: prior knowledge
Γ: Gamma function
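The BD formula is missing from this transcript. The standard form from the Heckerman, Geiger and Chickering reference, written with the symbols defined on this slide and reproduced from memory (so it should be checked against the source), is given below. The totals N_ij and N'_ij are notation added here.

```latex
p(D, B \mid \xi) = p(B \mid \xi) \prod_{i=1}^{n} \prod_{j}
  \frac{\Gamma(N'_{ij})}{\Gamma(N'_{ij} + N_{ij})}
  \prod_{k} \frac{\Gamma(N'_{ijk} + N_{ijk})}{\Gamma(N'_{ijk})},
\qquad N_{ij} = \sum_{k} N_{ijk}, \quad N'_{ij} = \sum_{k} N'_{ijk}
```

Here j ranges over the parent strings of bit i and k over the values that bit i can take.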

Page 17: BOA (Bayesian Optimization Algorithm)

Scoring Metrics Revisited

Bayesian Dirichlet Metric (BD)

In BOA, N'_ijk is set to 1 and N'_ij = Σ_k N'_ijk. This reduced form of the BD metric is called the K2 metric.
Physical meaning: all outcomes k of a given parental setup have the same probability at the beginning.

The prior term p(B|ξ) can be set either to a constant or to favor simpler structures.
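As an illustration, the contribution of a single bit to the logarithm of the K2 metric can be computed with `math.lgamma`. The sketch assumes binary bits, so that the prior counts are 1 per outcome and 2 per parent string, and it ignores the structure prior. The function name and data layout (a list of bit tuples) are assumptions.

```python
from collections import Counter
from math import lgamma

def log_k2_local(data, child, parents):
    """log of node `child`'s factor in the K2 metric, for binary data."""
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    totals = Counter()
    for (j, _k), c in counts.items():
        totals[j] += c                      # N_ij: rows with parent string j
    score = 0.0
    for j, n_ij in totals.items():
        # Gamma(N'_ij) / Gamma(N'_ij + N_ij) with N'_ij = 2
        score += lgamma(2) - lgamma(n_ij + 2)
        for k in (0, 1):
            # Gamma(1 + N_ijk) / Gamma(1); lgamma(1) is 0
            score += lgamma(counts[(j, k)] + 1)
    return score
```

Parent strings that never occur in the data contribute a factor of 1, so they can be skipped. Working in log space avoids overflow of the Gamma function for realistic population sizes.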

Page 18: BOA (Bayesian Optimization Algorithm)

Scoring Metrics Revisited

Decomposability of scoring metrics

In both metrics, the score of a structure changes only locally after a simple graph operation is performed (by the greedy search).

Only one particular term (one particular i) of the entire metric changes.

This greatly simplifies the computation in the greedy search.
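Decomposability can be checked numerically. The sketch below uses a per-node BIC term of the form −H(X_i | parents)·N − 2^|parents|·log2(N)/2, which is the form recalled from the Pelikan references and is an assumption here, and shows that the score change from adding one edge equals the change in the child node's term alone. All names are illustrative.

```python
from collections import Counter
from math import log2

def bic_local(data, child, parents):
    """Per-node BIC term for binary data given as a list of bit tuples."""
    n = len(data)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    # -N * H(X | parents) = sum over (j, k) of N_jk * log2(N_jk / N_j)
    neg_entropy_times_n = sum(c * log2(c / marg[j]) for (j, _k), c in joint.items())
    return neg_entropy_times_n - 2 ** len(parents) * log2(n) / 2

def bic(data, parent_sets):
    # The full score is a sum of independent per-node terms.
    return sum(bic_local(data, i, ps) for i, ps in parent_sets.items())

def delta_add_edge(data, parent_sets, a, b):
    """Score change from adding edge a -> b: only node b is recomputed."""
    old = parent_sets[b]
    return bic_local(data, b, old + (a,)) - bic_local(data, b, old)
```

A greedy search can therefore cache one term per node and re-evaluate a single term per candidate addition or removal, instead of rescoring the whole network.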

Page 19: BOA (Bayesian Optimization Algorithm)

Scoring Metrics Revisited

Problems exist in both scoring metrics

In BIC, the model-complexity term limits the complexity of the Bayesian structure, resulting in oversimplified structures.

In BD, maximizing the marginal probability leads to over-fitting, resulting in overly complicated structures.

A combination of both can produce favorable results.

Page 20: BOA (Bayesian Optimization Algorithm)

hBOA

Hierarchical Bayesian Optimization Algorithm

Page 21: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

The hierarchical version of BOA, used to solve nearly decomposable and hierarchical problems.

Three important challenges must be considered in the design of solvers for difficult hierarchical problems:

Decomposition
  Bayesian network

Chunking
  Representing partial solutions at each level compactly, to enable the algorithm to effectively process partial solutions of higher order.
  Using local structures

Diversity Maintenance
  RTR (restricted tournament replacement)
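The slides name RTR without describing it. As commonly described, restricted tournament replacement lets each new offspring compete against the most similar member of a random subset of the population. The sketch below is written under that assumption; the window size, the Hamming distance measure and the function names are illustrative rather than taken from the slides.

```python
import random

def hamming(a, b):
    # Number of positions at which two bit strings differ.
    return sum(x != y for x, y in zip(a, b))

def rtr_insert(population, offspring, fitness, window, rng):
    """Try to insert `offspring`; return True if it replaced someone."""
    idxs = rng.sample(range(len(population)), window)   # random subset
    closest = min(idxs, key=lambda i: hamming(population[i], offspring))
    if fitness(offspring) > fitness(population[closest]):
        population[closest] = offspring                 # replace the similar loser
        return True
    return False
```

Because an offspring can only displace an individual that resembles it, dissimilar solutions are not crowded out, which is how this scheme helps maintain diversity.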

Page 22: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Local Structure

Full Table

  ABC
  000   0.4   0.6
  001   0.4   0.6
  010   0.4   0.6
  011   0.4   0.6
  100   0.5   0.5
  101   0.6   0.4
  110   0.3   0.7
  111   0.6   0.4

Decision Tree, in hBOA

[Figure: decision tree with branches A=1 and A=0; the A=1 branch splits into C=1 and C=0; the C=0 branch splits into B=1 and B=0]

Page 23: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Benefits of building local structure

Simplifies the model
In the case shown, 8 parameters have to be maintained for the full conditional probability table, but only 4 for the decision tree.

Generalizes the parental condition
In the case shown, with the full table, an occurrence of ABCX=1010 contributes nothing to predicting ABCX=1110 in the future; with the local structure, 1010 DOES predict 1110.

[Figure: decision tree with branches A=1 and A=0; the A=1 branch splits into C=1 and C=0; the C=0 branch splits into B=1 and B=0]
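The parameter saving can be verified directly. The sketch below encodes the first probability column of the full table for parents A, B, C and a four-leaf decision tree consistent with it. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
# First probability column of the full table, keyed by (A, B, C).
FULL_TABLE = {
    (0, 0, 0): 0.4, (0, 0, 1): 0.4, (0, 1, 0): 0.4, (0, 1, 1): 0.4,
    (1, 0, 0): 0.5, (1, 0, 1): 0.6, (1, 1, 0): 0.3, (1, 1, 1): 0.6,
}

# Tree nodes are (variable index, subtree if 0, subtree if 1); leaves are floats.
A, B, C = 0, 1, 2
TREE = (A, 0.4, (C, (B, 0.5, 0.3), 0.6))

def tree_lookup(tree, assignment):
    # Walk down the tree until a leaf (a probability) is reached.
    while isinstance(tree, tuple):
        var, if_zero, if_one = tree
        tree = if_one if assignment[var] else if_zero
    return tree

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])
```

Looking up ABC=101 and ABC=111 reaches the same leaf (A=1, C=1), which is why an observation of one also informs the prediction for the other.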

Page 24: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Scoring Metrics: evaluations of a local structure Bi

Bayesian Metrics

In hBOA, the prior probability of the structure is set to favor simpler models.

Page 25: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Scoring Metrics: evaluations of a local structure Bi

Minimum Description Length Metrics

Page 26: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Search procedure for local structure (decision tree)

Greedy algorithm of local structure (decision tree) construction
  initialize the structure Bi (a one-node tree that represents all parental strings);
  // top-down
  Branch (Bi, Πi);
  return Bi;

Branch (T, P)
  IF there exist elements in P THEN
    choose π ∈ P that best splits the decision tree T;
    Left Child = Branch (Tπ=1, P − π);
    Right Child = Branch (Tπ=0, P − π);
    // bottom-up
    IF the score given by Tπ=1 and Tπ=0 is worse than T THEN
      merge Tπ=1 and Tπ=0 back into T;
  ELSE
    Left Child = Right Child = NIL;
  return T;

Page 27: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Search procedure for local structure (decision tree): demonstration

[Figure: demonstration of the search on decision trees over A, B and C, with branch labels A=1, A=0, B=1, B=0, C=1, C=0]

Page 28: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Modified network construction for hBOA

Greedy algorithm of construction of a network with local structure
  initialize the network B (an empty network or the network of the last generation);
  done ← false;
  while (not done) {
    O ← all simple graph operations applicable to B;
    optimize every structure in O with local structure;
    IF there exists an operation in O that improves score(B) THEN
      op ← operation from O that improves score(B) the most;
      apply op to B;
    ELSE
      done ← true;
  }
  return B;

Page 29: BOA (Bayesian Optimization Algorithm)

Hierarchical BOA (hBOA)

Sampling a Bayesian network with local structure

1. Topological sort
2. Assign values according to local structures, instead of full conditional probability tables

[Figure: decision tree with branches A=1, A=0, C=1, C=0, B=1, B=0, shown with the example network with nodes Rain, Radar, Wet Road, Speed, Car Crash]

Page 30: BOA (Bayesian Optimization Algorithm)

Some Thoughts about BOA/hBOA

Use a causal Bayesian network to solve an acausal problem

Are arrows really needed?
"Markovian" Optimization Algorithm, MOA?
Adopt the idea of the Bayesian Dirichlet Metric.

[Figure: example network with nodes Rain, Radar, Wet Road, Speed, Car Crash]

Page 31: BOA (Bayesian Optimization Algorithm)

End of Presentation

Thank You! Thank Yu!