GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Modeling XCS in Class Imbalances: Population Size and Parameter Settings. Albert Orriols-Puig (1,2), David E. Goldberg (2), Kumara Sastry (2), Ester Bernadó-Mansilla (1). (1) Research Group in Intelligent Systems, Enginyeria i Arquitectura La Salle, Ramon Llull University. (2) Illinois Genetic Algorithms Laboratory, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign.

Upload: albert-orriols-puig

Post on 24-Jan-2015


Page 1: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Albert Orriols-Puig (1,2), David E. Goldberg (2), Kumara Sastry (2), Ester Bernadó-Mansilla (1)

(1) Research Group in Intelligent Systems, Enginyeria i Arquitectura La Salle, Ramon Llull University

(2) Illinois Genetic Algorithms Laboratory, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign

Page 2: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Framework

[Diagram: Domain (consisting of examples and counter-examples) → Learner → Model. Labels: information based on experience; knowledge extraction. A new instance is given to the Model, which returns the predicted output.]

In real-world domains, typically:
– Higher cost to obtain examples of the concept to be learnt
– So, distribution of examples in the training dataset is usually imbalanced

Applications:
– Fraud detection
– Medical diagnosis of rare illnesses
– Detection of oil spills in satellite images

Slide 2, GRSI Enginyeria i Arquitectura la Salle

Page 3: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Framework

Do learners suffer from class imbalances?
– Methods that do global optimization

[Diagram: Training Set → Learner → Minimize the global error]

$\text{error} = \dfrac{\text{num. errors}_{c_1} + \text{num. errors}_{c_2}}{\text{number of examples}}$

Biased towards the majority (overwhelming) class

Maximization of the majority class accuracy, to the detriment of the minority class.
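A minimal sketch of why minimizing the global error favours the majority class. The data set sizes below are hypothetical and chosen only for illustration; the error function follows the ratio of total errors to total examples described on this slide.

```python
def global_error(errors_c1, errors_c2, n_examples):
    """Global error: errors on both classes divided by the number of examples."""
    return (errors_c1 + errors_c2) / n_examples


# Hypothetical training set: 990 majority instances, 10 minority instances.
n_majority, n_minority = 990, 10

# A degenerate model that always predicts the majority class makes no error
# on the majority class and misclassifies every minority instance.
error_always_majority = global_error(0, n_minority, n_majority + n_minority)

# Its accuracy on the minority class is zero even though the global error is tiny.
minority_accuracy = 0.0

print(error_always_majority, minority_accuracy)
```

Under these assumptions the global error is only 1%, although no minority instance is ever classified correctly.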

Page 4: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Motivation

And what about incremental learning?

Instances of the minority class are sampled less frequently

Rules that match instances of the minority class are poorly activated

Rules of the minority class would receive fewer genetic opportunities (Orriols & Bernadó, 2006)

Page 5: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Aim

Facetwise analysis of XCS for class imbalances

Impact of class imbalances on the initialization process

How can XCS create rules of the minority class if the covering process fails?

Population size bound with respect to the imbalance ratio

Up to which imbalance ratio would XCS be able to learn from the minority class?

Page 6: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Outline

1. Description of XCS

2. Facetwise Analysis

3. Design of test Problems

4. XCS on the one-bit Problem

5. Analysis of Deviations

6. Results

7. Conclusions

Page 7: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Description of XCS

In single-step tasks:

[Diagram: XCS learning cycle. The Environment provides a problem instance (a minority class instance or a majority class instance). Each classifier in the Population [P] holds the fields C, A, P, ε, F, num, as, ts, exp. Match set generation selects the classifiers of [P] that match the instance into the Match Set [M]. A Prediction Array over the classes c1, c2, …, cn is built and an action is selected (Random Action). The classifiers of [M] that advocate the selected action form the Action Set [A]. The Environment returns a REWARD of 1000/0, the classifier parameters are updated, and the Genetic Algorithm (Selection, Reproduction, Mutation) and Deletion act on the population. The animation marks starved niches and nourished niches.]

Problem niche: the schema defines the relevant attributes for a particular problem niche. E.g.: 10**1*
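The matching step can be sketched as follows. This is an illustrative simplification, not the XCS implementation used by the authors: classifiers are reduced to hypothetical (condition, action) pairs, and `*` is the don't-care symbol as in the schema example 10**1*.

```python
def matches(condition: str, instance: str) -> bool:
    """True if every specified position of the condition equals the instance bit."""
    return len(condition) == len(instance) and all(
        c == "*" or c == x for c, x in zip(condition, instance)
    )


def match_set(population, instance):
    """Classifiers whose condition matches the instance: the match set [M]."""
    return [cl for cl in population if matches(cl[0], instance)]


def action_set(matched, action):
    """Classifiers of [M] that advocate the selected action: the action set [A]."""
    return [cl for cl in matched if cl[1] == action]


# Hypothetical population of (condition, action) pairs.
population = [("10**1*", 1), ("0*****", 0), ("1*****", 1), ("10**1*", 0)]

m = match_set(population, "101110")
a = action_set(m, 1)
print(len(m), len(a))
```

A niche that is rarely activated, because its instances are rarely sampled, rarely forms an action set and therefore rarely receives parameter updates or genetic events.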


Page 9: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Facetwise Analysis

Study XCS capabilities to provide representatives of starved niches:
– Population initialization
– Generation of correct representatives of starved niches
– Time of extinction of these correct classifiers

Derive a bound on the population size

Start from the theory developed for XCS:
– (Butz, Kovacs, Lanzi & Wilson, 04): Model of generalization pressures of XCS
– (Butz, Goldberg & Lanzi, 04): Learning time bound
– (Butz, Goldberg, Lanzi & Sastry, 07): Population size bound to guarantee niche support
– (Butz, 2006): Rule-Based Evolutionary Online Learning Systems: A Principled Approach to LCS Analysis and Design.

Page 10: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Facetwise Analysis

Assumptions
– Problems consisting of n classes
– One class sampled with a lower frequency: minority class

$ir = \dfrac{\text{num. instances of any class other than the minority class}}{\text{num. instances of the minority class}}$

– Probability of sampling an instance of the minority class and of the majority class:

$P_s(\text{min}) = \dfrac{1}{1 + ir}, \qquad P_s(\text{maj}) = \dfrac{ir}{1 + ir}$
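As a quick numerical check of these definitions, the sketch below computes both sampling probabilities from the imbalance ratio ir. The function name is illustrative only.

```python
def sampling_probabilities(ir: float):
    """Return (Ps(min), Ps(maj)) for an imbalance ratio ir."""
    p_min = 1.0 / (1.0 + ir)
    p_maj = ir / (1.0 + ir)
    return p_min, p_maj


for ir in (1, 9, 64, 1024):
    p_min, p_maj = sampling_probabilities(ir)
    print(ir, round(p_min, 6), round(p_maj, 6))
```

The two probabilities always sum to one, and Ps(min) shrinks roughly as 1/ir for large imbalance ratios.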

Page 11: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Facetwise Analysis

– Population initialization

– Generation of correct representatives of starved niches

– Time of extinction of these correct classifiers

– Population size bound

Page 12: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Population Initialization

Covering procedure
– Covering: Generalize over the input with probability P#
– P# needs to satisfy the covering challenge (Butz et al., 01)

Would I trigger covering on minority class instances?
– Probability that one instance is covered by at least one rule (Butz et al., 01). The expression itself was not recovered in this transcript; its annotated terms are:
– Input length
– Population specificity (initially 1 – P#)
– Population size
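A hedged sketch of a cover probability built from the three quantities named on this slide. It is derived here under stated assumptions and should not be read as the slide's exact expression: inputs are uniformly random binary strings, each of the N classifiers is generated independently, and each condition position is specified with probability equal to the population specificity. Under those assumptions a single position matches with probability 1 − specificity/2. The function and parameter names are illustrative.

```python
def cover_probability(length: int, specificity: float, pop_size: int) -> float:
    """Probability that a random instance is matched by at least one classifier.

    Assumptions (not taken from the slide): uniform binary inputs, independent
    random classifiers, each position specified with probability `specificity`.
    """
    # One position matches if it is a don't-care, or if it is specified
    # and happens to agree with the input bit (probability 1/2).
    p_position = (2.0 - specificity) / 2.0
    p_one_classifier = p_position ** length
    return 1.0 - (1.0 - p_one_classifier) ** pop_size


# Initially the specificity is 1 - P#; P# = 0.6 is the value listed in the
# experimental configuration of this deck.
p_hash = 0.6
for n in (10, 100, 1000):
    print(n, cover_probability(length=20, specificity=1.0 - p_hash, pop_size=n))
```

If this probability is close to one, covering is rarely triggered, so minority class instances are unlikely to obtain rules through covering.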

Page 13: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Population Initialization


Page 15: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Creation of Representatives of Starved Niches

Assumptions
– Covering has not provided any representative of starved niches
– Simplified model: only consider mutation in our model.

How can we generate representatives of starved niches?
– Specifying correctly all the bits of the schema that represents the starved niche

Page 16: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Creation of Representatives of Starved Niches

Possible cases:
– Sample a minority class instance, $P_s(\text{min}) = \dfrac{1}{1 + ir}$
• Activate a niche of the minority class
• Activate a niche of another class
– Sample a majority class instance, $P_s(\text{maj}) = \dfrac{ir}{1 + ir}$
• Activate a niche of the minority class
• Activate a niche of another class

μ: Mutation probability

Km: Order of the schema

Page 17: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Creation of Representatives of Starved Niches

Summing up, time to get the first representative of a starved niche (the expression was not recovered in this transcript):

n: number of classes

μ: Mutation probability

Km: Order of the schema

It increases:
– Linearly with the number of classes
– Exponentially with the order of the schema

It does not depend on the imbalance ratio
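The closed form on this slide is missing from the transcript. The sketch below uses one expression that is consistent with the three properties stated above (linear in n, exponential in Km, independent of ir). It is an assumption made for illustration: a correct representative is taken to be created with probability (1/n)·(μ/2)^Km per step, so the expected waiting time is the inverse of that probability.

```python
def time_to_first_representative(n_classes: int, mu: float, k_m: int) -> float:
    """Expected steps until mutation creates a starved-niche representative.

    Assumed model (not recovered from the slide): the right class is obtained
    with probability 1/n and each of the Km schema bits is set correctly with
    probability mu/2.
    """
    p_create = (1.0 / n_classes) * (mu / 2.0) ** k_m
    return 1.0 / p_create


for k_m in (1, 2, 3):
    print(k_m, time_to_first_representative(n_classes=2, mu=0.4, k_m=k_m))
```

Note that the imbalance ratio does not appear as an argument, which mirrors the last statement on the slide.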


Page 19: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Bounding the Population Size

Time to extinction

– Consider random deletion (the expression was not recovered in this transcript)
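A minimal sketch of extinction time under random deletion. The number of deletions per step is a hypothetical parameter, since the slide's own expression is missing: if d distinct classifiers are removed uniformly at random from a population of size N at every step, a given classifier is removed with probability d/N per step, and its expected survival time is N/d steps.

```python
def deletion_probability(pop_size: int, deletions_per_step: int = 1) -> float:
    """Per-step probability that one particular classifier is deleted."""
    return deletions_per_step / pop_size


def expected_time_to_extinction(pop_size: int, deletions_per_step: int = 1) -> float:
    """Mean of the geometric waiting time with the probability above."""
    return pop_size / deletions_per_step


print(expected_time_to_extinction(1000))     # one deletion per step
print(expected_time_to_extinction(1000, 2))  # two deletions per step
```

Under this assumption the survival time grows linearly with the population size, which is why a larger population gives starved niches more time.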


Page 21: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Bounding the Population Size

Population size bound to guarantee that there will be representatives of starved niches

– Require that: (expression not recovered in this transcript)

– Bound: (expression not recovered in this transcript)

Page 22: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Bounding the Population Size

Population size bound to guarantee that representatives of starved niches will receive a genetic opportunity:
– Consider θGA = 0

– We require that the best representative of a starved niche receive a genetic event before being removed

– Time to receive the first genetic event (expression not recovered in this transcript)
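A hedged sketch of the waiting time for the first genetic event. It assumes that with θGA = 0 a niche receives a genetic event every time it is activated, and that a starved niche is activated only when a minority class instance is sampled. The optional `p_action` parameter (the probability that the matching action is the one selected) is hypothetical and not taken from the slide.

```python
def time_to_first_genetic_event(ir: float, p_action: float = 1.0) -> float:
    """Expected steps until a starved niche is activated for the first time.

    Activation probability per step is Ps(min) * p_action = p_action / (1 + ir),
    so the geometric waiting time has mean (1 + ir) / p_action.
    """
    return (1.0 + ir) / p_action


for ir in (1, 64, 256, 1024):
    print(ir, time_to_first_genetic_event(ir))
```

Because this waiting time is linear in ir, any requirement that the classifier survive at least that long translates into a population size that also grows linearly with ir under the random-deletion assumption.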

Page 23: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Bounding the Population Size

Population size bound to guarantee that representatives of starved niches will receive a genetic opportunity:

The population size to guarantee that the best representatives of starved niches will receive at least one genetic opportunity increases linearly with the imbalance ratio


Page 25: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Design of Test Problems

One-bit problem

– Example instance: 000110 : 0 (condition of length l; the class is the value of the left-most bit)

– Only two schemas of order one: 0***** and 1*****

– Undersampling instances of the class labeled as 1:

$P_s(\text{min}) = \dfrac{1}{1 + ir}$
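A minimal sketch of an imbalanced one-bit instance generator, written under the assumption that the class is drawn first with Ps(min) = 1/(1 + ir) and the remaining bits are uniform. The function names and the generation procedure are illustrative and are not taken from the authors' code.

```python
import random


def one_bit_class(condition: str) -> int:
    """Class of a one-bit instance: the value of the left-most bit."""
    return int(condition[0])


def sample_one_bit_instance(length: int, ir: float, rng: random.Random):
    """Draw (condition, class), undersampling class 1 with Ps(min) = 1/(1 + ir)."""
    label = 1 if rng.random() < 1.0 / (1.0 + ir) else 0
    # The left-most bit carries the class; the other bits are irrelevant noise.
    rest = "".join(rng.choice("01") for _ in range(length - 1))
    return str(label) + rest, label


rng = random.Random(0)
samples = [sample_one_bit_instance(6, ir=64, rng=rng) for _ in range(1000)]
print(one_bit_class("000110"), samples[0])
```

With ir = 64, roughly one instance in 65 belongs to class 1, so the niche 1***** is the starved one.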

Page 26: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Design of Test Problems

Parity problem

– Example instance: 01001010 : 1 (condition of length l; the class is the number of 1s in the k relevant bits, mod 2)

– The k bits of parity form a single building block

– Undersampling instances of the class labeled as 1:

$P_s(\text{min}) = \dfrac{1}{1 + ir}$
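A sketch of the parity labeling and an imbalanced sampler. The slide does not say which positions are the relevant ones, so taking the left-most k bits is an assumption, as is the sampling procedure (draw the class first, then flip one relevant bit if the parity does not match).

```python
import random


def parity_class(condition: str, k: int) -> int:
    """Number of 1s among the k relevant bits, modulo 2.

    Assumption: the relevant bits are the left-most k positions.
    """
    return condition[:k].count("1") % 2


def sample_parity_instance(length: int, k: int, ir: float, rng: random.Random):
    """Draw (condition, class), undersampling class 1 with Ps(min) = 1/(1 + ir)."""
    label = 1 if rng.random() < 1.0 / (1.0 + ir) else 0
    bits = [rng.choice("01") for _ in range(length)]
    # Flipping one relevant bit toggles the parity, so this maps a uniform
    # string onto a uniform string of the requested class.
    if parity_class("".join(bits), k) != label:
        bits[0] = "1" if bits[0] == "0" else "0"
    return "".join(bits), label


rng = random.Random(1)
samples = [sample_parity_instance(8, k=3, ir=16, rng=rng) for _ in range(500)]
print(parity_class("01001010", 8), samples[0])
```

Unlike the one-bit problem, no single bit is informative here: all k relevant bits must be specified together, which is why they behave as one building block.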


Page 28: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

XCS on the one-bit Problem

XCS configuration

α=0.1, ν=5, ε0=1, θGA=25, χ=0.8, μ=0.4, θdel=20, θsub=200, δ=0.1, P#=0.6
selection=tournament, mutation=niched, [A]sub=false, N = 10,000 ir

Evaluation of the results:
– Minimum population size to achieve: TP rate * TN rate > 95%

– Results are averages over 25 seeds
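The evaluation criterion can be sketched as below. Treating the minority class (labeled 1) as the positive class is an assumption, and the function name and toy data are illustrative only.

```python
def tp_tn_product(y_true, y_pred, positive=1):
    """Product of true-positive rate and true-negative rate.

    Assumes both classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    n_pos = sum(1 for t, _ in pairs if t == positive)
    n_neg = len(pairs) - n_pos
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return (tp / n_pos) * (tn / n_neg)


# Toy data with ir = 64: 64 majority instances and one minority instance.
y_true = [0] * 64 + [1]
always_majority = [0] * 65
perfect = list(y_true)

print(tp_tn_product(y_true, always_majority), tp_tn_product(y_true, perfect))
```

A model that ignores the minority class scores zero on this product even though its plain accuracy would be above 98%, which is what makes the measure suitable for imbalanced problems.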

Page 29: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

XCS on the one-bit Problem

N remains constant up to ir = 64

N increases linearly from ir=64 to ir=256

N increases exponentially from ir=256 to ir=1024

Higher ir could not be solved


Page 31: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Analysis of the Deviations

Niched Mutation vs. Free Mutation
– Classifiers can only be created if minority class instances are sampled

Inheritance Error of Classifiers' Parameters
– New promising representatives of starved niches are created from classifiers that belong to nourished niches

– These new promising rules inherit parameters from these classifiers. This is especially delicate for the action set size (as).

– Approach: initialize as=1.

Page 32: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Analysis of the Deviations

Subsumption
– An overgeneral classifier of the majority class may receive ir positive rewards before receiving the first negative reward

– Approach: set θsub > ir

Stabilizing the population before testing
– Overgeneral classifiers are poorly evaluated

– Approach: introduce some extra runs at the end of learning with the GA switched off.

We gather all these little tweaks in XCS+PCM
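The subsumption argument can be checked with a short calculation. The sketch assumes an overgeneral classifier that matches every instance and predicts the majority class, with instances drawn independently with Ps(maj) = ir/(1 + ir). The function names are illustrative.

```python
def expected_positive_rewards_before_error(ir: float) -> float:
    """Mean number of majority instances seen before the first minority one."""
    p_min = 1.0 / (1.0 + ir)
    # Mean number of failures before the first success of a geometric variable.
    return (1.0 - p_min) / p_min


def prob_no_error_in(ir: float, m: int) -> float:
    """Probability that the first m instances are all majority instances."""
    return (ir / (1.0 + ir)) ** m


print(expected_positive_rewards_before_error(64))  # close to ir
print(prob_no_error_in(64, 20), prob_no_error_in(64, 200))
```

The expected run of positive rewards equals ir, so an experience threshold well below ir lets an overgeneral classifier look accurate long enough to act as a subsumer.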


Page 34: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

XCS+PCM in the one-bit Problem

N remains constant up to ir = 128

For higher ir, N slightly increases

We only have to guarantee that a representative of the starved niche will be created

Page 35: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

XCS+PCM in the Parity Problem

Building blocks of size 3 need to be processed

Empirical results agree with the theory

Population size bound to guarantee that a representative of the niche will receive a genetic event


Page 37: GECCO'2007: Modeling XCS in Class Imbalances: Population Size and Parameter Settings

Conclusions and Further Work

We derived models that analyzed the representatives of starved niches provided by covering and mutation

A population size bound was derived

We saw that the empirical observations met the theory if four aspects were considered:

– Type of mutation

– as initialization

– Subsumption

– Stabilization of the population

Further analysis of the covering operator
