An Organizational Co-evolutionary Algorithm For Classification
Developed By: Badar Munir
National University of Computer & Emerging Sciences, Islamabad
Index
1. Abstract
2. Introduction
3. Reference Techniques
4. Proposed Technique
5. Results
6. Conclusion
7. Future idea

Abstract
- OCEC is inspired by the process of human interaction.
- It uses the concept of multiple populations.
- It evolves the individuals of a population and arranges individuals that have the same class into organizations.
- It determines the fitness of each organization from:
  - the significance of each attribute
  - the number of attributes in it
- Rules are extracted when the evolutionary process ends.
- Generalized rules are obtained by merging rules.
- OCEC performs better than other EA-based classification algorithms and has lower computational complexity.

Co-evolutionary Algorithm
- Evolutionary algorithms (EAs) are based on the process of natural selection.
- When applied to engineering problems, they give satisfactory results.
- A co-evolutionary algorithm is a multi-population EA.
- In co-evolutionary algorithms, individuals of species I compete or cooperate with individuals of species II. The best individuals from both are selected and copied to the next generation.
- There are two types of co-evolutionary algorithm:
  - Competitive
  - Cooperative

Classification
Classification is a technique in which the following are already known:
- the number of possible inputs and the number of attributes per input
- the range of each attribute's values
- the output classes

NAME   RANK             YEARS   TENURED
Mike   Assistant Prof   3       no
Mary   Assistant Prof   7       yes
Bill   Professor        2       yes
Jim    Associate Prof   7       yes
Dave   Assistant Prof   6       no
Anne   Associate Prof   3       no

Classification
- We divide the input data into training data and test data.
- A classification algorithm is applied to the training data (the faculty table above) and produces a classifier (model), for example:
  IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

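The classifier rule on this slide can be checked against the six training rows. The following is a minimal sketch, not part of the original deck; the names `TRAINING_DATA`, `classify` and `accuracy` are made up for illustration.

```python
# Training data copied from the slide: (name, rank, years, tenured).
TRAINING_DATA = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]


def classify(rank, years):
    """Apply the rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"


def accuracy(rows):
    """Fraction of rows whose predicted label matches the recorded label."""
    correct = sum(classify(rank, years) == tenured for _, rank, years, tenured in rows)
    return correct / len(rows)


if __name__ == "__main__":
    print(accuracy(TRAINING_DATA))
```

On this small table the rule labels every row correctly; a separate test set would still be needed to judge how well it generalizes.
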
Classification
- Our aim in classification is to develop generalized rules rather than specific ones.
Classification
- Two cases can arise when a classifier is evaluated: underfitting and overfitting.
Reference Techniques
1. Michigan Approach
2. Pittsburgh Approach
3. Chonnei Algorithm
4. GABIL Approach
5. COGIN
6. JOINGA
7. REGAL
8. G-Net
9. XCS
10. GEP
11. DMEL
12. EVOPROL
13. SIA
14. ESIA
15. EENCL
16. EPNET

Michigan Approach
- Maintains a population of individual rules that compete with each other for space and priority in the population.
- It is a poor fit for complex problems: it converges rapidly and so fails to find the best solution.

Pittsburgh Approach
- Maintains a population of variable-length rule sets that compete with each other with respect to performance on a domain task.
- Its computational cost on complex problems is too high.

GABIL Approach
- GABIL continually learns and refines classification rules by interacting with the environment.
- It uses a genetic algorithm (GA) to refine the rules.

COGIN Approach
- COGIN is an inductive approach that uses a GA.
- It promotes competitive (predator-type) co-evolution (COE) between classification niches.

JOINGA Approach
- JOINGA is an inductive approach that uses a GA.
- It uses cooperative (symbiotic) COE between classification niches.
- It is used for multi-modal classification.

REGAL Approach
- REGAL is a distributed GA-based approach designed for learning first-order logic concept descriptions from examples.

G-NET Approach
- G-NET is a descendant of REGAL that consistently achieves better performance.

Organizational Co-evolutionary (OCEC)
- OCEC adopts the multi-population model of COE.
- It organizes individuals into sets called organizations.
- It focuses on extracting rules from individuals and organizations.
- Its emphasis is not on forming organizations but on simulating the interaction among organizations.
- It is a bottom-up approach.
- OCEC is based on organizations.
[Figure: Organization 1, Organization 2, Organization 3, Organization 4]

Organization?
- An organization is a set of instances that have the same class.
- The intersection of any two organizations is empty: Org1 ∩ Org2 = ∅
- Each instance in an organization is called a member of that organization.

Outlook    Temp   Humidity   Wind    Play
Sunny      Hot    High       False   No
Sunny      Hot    High       True    No
Overcast   Hot    High       False   Yes
Rainy      Mild   High       False   Yes
Rainy      Cool   Normal     False   Yes

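To make the definition concrete, the table's rows can be grouped into same-class sets and the sets checked for disjointness. This is only an illustrative sketch: it builds one organization per class, which is the coarsest grouping the definition allows, and the names `EXAMPLES` and `group_by_class` are assumptions, not part of the deck.

```python
from collections import defaultdict

# Rows of the table: (Outlook, Temp, Humidity, Wind, Play).
EXAMPLES = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
]


def group_by_class(examples):
    """Return {class label: set of row indices}; every set holds one class only."""
    organizations = defaultdict(set)
    for index, example in enumerate(examples):
        organizations[example[-1]].add(index)
    return dict(organizations)


if __name__ == "__main__":
    orgs = group_by_class(EXAMPLES)
    print(orgs)
    # No row index appears in both organizations.
    print(orgs["No"].isdisjoint(orgs["Yes"]))
```

Storing row indices rather than rows keeps members distinct even if two examples happen to have identical attribute values.
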
Organization?
- If all members of org have the same value for attribute A, then A is a fixed-value attribute.
- If A' is a fixed-value attribute that satisfies the conditions required for rule extraction, then A' is a useful attribute.
- The fixed-value attribute set of org is written Forg, and the useful attribute set is written Uorg.
- Useful attributes are significant because rules are extracted from them.

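The fixed-value attribute set can be computed directly from the members of an organization. The sketch below covers only Forg; the slides do not spell out the conditions that turn a fixed-value attribute into a useful one, so Uorg is left out. The function name and the two example organizations are illustrative assumptions.

```python
ATTRIBUTES = ("Outlook", "Temp", "Humidity", "Wind")

# Two example organizations built from the weather table (class column dropped).
ORG_NO = [
    ("Sunny", "Hot", "High", "False"),
    ("Sunny", "Hot", "High", "True"),
]
ORG_YES = [
    ("Overcast", "Hot", "High", "False"),
    ("Rainy", "Mild", "High", "False"),
    ("Rainy", "Cool", "Normal", "False"),
]


def fixed_value_attributes(members, attribute_names=ATTRIBUTES):
    """Return {attribute: value} for every attribute on which all members agree."""
    fixed = {}
    for position, name in enumerate(attribute_names):
        values = {member[position] for member in members}
        if len(values) == 1:
            fixed[name] = values.pop()
    return fixed


if __name__ == "__main__":
    print(fixed_value_attributes(ORG_NO))
    print(fixed_value_attributes(ORG_YES))
```

For these two organizations, the "No" members agree on Outlook, Temp and Humidity, while the "Yes" members agree only on Wind.
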
Organization?
Attribute        Sets
Wind             Forg1 & Uorg1
Outlook (Org2)   Uorg2
Temp (Org2)      Forg2 & Uorg2
Humidity         Forg2 & Uorg2
Classification of Organizations
Organizations are classified as:
- Normal organization
- Trivial organization
- Abnormal organization

Normal Organization
- It has more than one member.
- It has a non-empty useful attribute set.
- It is denoted ORGN.

Outlook    Temp   Humidity   Wind    Play
Sunny      Hot    High       False   No
Sunny      Hot    High       True    No
Overcast   Hot    High       False   Yes
Rainy      Mild   High       False   Yes
Rainy      Cool   Normal     False   Yes

Trivial Organization
- It has only one member.
- All attributes of that member are useful.
- It is denoted ORGT.

Outlook    Temp   Humidity   Wind    Play
Sunny      Hot    High       True    No
Overcast   Hot    High       False   Yes

Abnormal Organization
- It is an organization with an empty useful attribute set.
- It is denoted ORGA.

Outlook    Temp   Humidity   Wind    Play
Sunny      Hot    High       False   No
Sunny      Hot    High       True    No
Overcast   Hot    High       False   Yes
Rainy      Mild   High       True    Yes
Rainy      Cool   Normal     False   Yes

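The three organization types can be summarized in one small function. This is a sketch under one assumption that goes beyond the slides: a single-member organization is always treated as trivial, since the slides state that all attributes of its member are useful. The function name and return strings are illustrative.

```python
def organization_type(member_count, useful_attributes):
    """Classify an organization as 'trivial', 'abnormal' or 'normal'.

    member_count: number of members in the organization (at least 1).
    useful_attributes: collection of the organization's useful attributes.
    """
    if member_count < 1:
        raise ValueError("an organization needs at least one member")
    if member_count == 1:
        # Assumption: one member means every attribute is useful (ORGT).
        return "trivial"
    if not useful_attributes:
        # More than one member but nothing to build a rule from (ORGA).
        return "abnormal"
    # More than one member and a non-empty useful attribute set (ORGN).
    return "normal"


if __name__ == "__main__":
    print(organization_type(2, {"Outlook": "Sunny"}))
    print(organization_type(1, {"Outlook": "Overcast", "Wind": "False"}))
    print(organization_type(3, {}))
```
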
Organization Records
Each organization keeps a record of its:
- Member list
- Attribute type
- Organization type
- Member class
- Fitness

Fitness of Organization
The fitness of an organization is calculated from:
- the number of members
- the number of useful attributes

Data Representation
- OCEC can handle both nominal and continuous data.

Knowledge Representation
- A is a set of attributes.
- Each attribute has a range of values.
- The instance space I is the Cartesian product of the attribute value ranges.
- C is a set of classes.
- Each member is

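The instance space can be enumerated for the weather example. This sketch assumes that the values appearing in the slides' table are the complete range of each attribute; `ATTRIBUTE_VALUES` and `instance_space` are illustrative names.

```python
from itertools import product

# Assumed value range of each attribute, read off the weather table.
ATTRIBUTE_VALUES = {
    "Outlook": ("Sunny", "Overcast", "Rainy"),
    "Temp": ("Hot", "Mild", "Cool"),
    "Humidity": ("High", "Normal"),
    "Wind": ("False", "True"),
}


def instance_space(attribute_values):
    """Return every combination of attribute values as a list of dicts."""
    names = list(attribute_values)
    return [dict(zip(names, combo)) for combo in product(*attribute_values.values())]


if __name__ == "__main__":
    space = instance_space(ATTRIBUTE_VALUES)
    # 3 * 3 * 2 * 2 combinations.
    print(len(space))
```

The five rows in the table are therefore a small sample of a 36-point instance space.
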
Rule Representation
Rules are represented in the form:
IF <condition> THEN <class>
Each term in the condition is a triple:
(attribute, operator, value)
* Rules are extracted when the evolutionary process ends.

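One possible way to hold such rules in code is sketched below. The `Term` and `Rule` classes and the operator table are assumptions made for illustration; the slides only say that a term is an (attribute, operator, value) triple, and the operators "<" and ">" are included on the guess that continuous attributes need them.

```python
import operator
from dataclasses import dataclass

# Assumed operator set; the slides show "=" in their examples.
OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


@dataclass(frozen=True)
class Term:
    attribute: str
    op: str
    value: object


@dataclass(frozen=True)
class Rule:
    terms: tuple
    predicted_class: str

    def matches(self, instance):
        """True when the instance satisfies every term of the condition."""
        return all(
            OPERATORS[term.op](instance[term.attribute], term.value)
            for term in self.terms
        )

    def __str__(self):
        condition = " AND ".join(f"{t.attribute} {t.op} {t.value}" for t in self.terms)
        return f"IF {condition} THEN {self.predicted_class}"


if __name__ == "__main__":
    rule = Rule((Term("Temp", "=", "Mild"), Term("Outlook", "=", "Sunny")), "Play Tennis")
    print(rule)
    print(rule.matches({"Temp": "Mild", "Outlook": "Sunny"}))
```
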
Working of OCEC
- During the COE process OCEC produces sets of examples; at the end of COE it produces a set of rules, for example:
  if Temp = Mild and Outlook = Sunny then Class = Play Tennis
- Whether an attribute is included in or excluded from a rule depends on the significance of the attribute.
- An EA method is devised for determining the significance of each attribute.
- The fitness function of an organization is defined on the basis of attribute significance.

Evolutionary Operators (OCEC)
- Migrating operator
- Exchanging operator
- Merging operator
Traditional operators such as mutation and crossover are not used.

Migrating Operator (OCEC)
- Two parent organizations are selected.
- n members are selected from either parent and migrated to the child organizations.
[Figure: parent organizations 1 2 3 4 and 5 6 7 8; child organizations 5 1 2 3 and 1 2 3 4]

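The slide leaves the details of migration open, so the sketch below shows just one reading: n randomly chosen members move from the first parent to the second, giving two children. The function name, the `rng` parameter and the requirement that the donor keeps at least one member are all assumptions.

```python
import random


def migrate(parent1, parent2, n, rng=random):
    """Move n randomly chosen members from parent1 to parent2.

    Returns the two child organizations as new sets; the parents are not modified.
    """
    if not 1 <= n < len(parent1):
        raise ValueError("n must satisfy 1 <= n < |parent1|")
    movers = set(rng.sample(sorted(parent1), n))
    return parent1 - movers, parent2 | movers


if __name__ == "__main__":
    child1, child2 = migrate({1, 2, 3, 4}, {5, 6, 7, 8}, 1, random.Random(0))
    print(child1, child2)
```
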
Exchanging Operator (OCEC)
- Two organizations, orgp1 and orgp2, are randomly selected from the population.
- n members are randomly selected from each parent and exchanged.
- This produces two child organizations, orgc1 and orgc2.
- Preconditions: |orgp1| > 1 and |orgp2| > 1, with 1 ≤ n < MIN{|orgp1|, |orgp2|}
[Figure: parent organizations 1 2 3 4 and 5 6 7 8; child organizations 1 6 7 8 and 5 1 2 3]

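The exchanging operator and its preconditions can be sketched as follows. Organizations are modelled as disjoint Python sets of member ids; the function name and the `rng` parameter are illustrative assumptions.

```python
import random


def exchange(parent1, parent2, n, rng=random):
    """Swap n randomly chosen members between two parent organizations.

    Enforces |parent1| > 1, |parent2| > 1 and 1 <= n < min(|parent1|, |parent2|).
    Returns the two children as new sets.
    """
    if len(parent1) <= 1 or len(parent2) <= 1:
        raise ValueError("both parents need more than one member")
    if not 1 <= n < min(len(parent1), len(parent2)):
        raise ValueError("n must satisfy 1 <= n < min(|parent1|, |parent2|)")
    leaving1 = set(rng.sample(sorted(parent1), n))
    leaving2 = set(rng.sample(sorted(parent2), n))
    child1 = (parent1 - leaving1) | leaving2
    child2 = (parent2 - leaving2) | leaving1
    return child1, child2


if __name__ == "__main__":
    print(exchange({1, 2, 3, 4}, {5, 6, 7, 8}, 3, random.Random(0)))
```

Because the same number of members moves in each direction, both children keep the size of their parent.
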
Merging Operator (OCEC)
- Two organizations, orgp1 and orgp2, are randomly selected from the population.
- n members from each parent are randomly selected and merged.
- This produces one child organization, orgc1.
[Figure: parent organizations 1 2 3 4 and 5 6 7 8; child organization 1 2 7 8]

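A sketch of the merging step as the slide describes it: n members are drawn from each parent and combined into a single child. What happens to the members that are not drawn is not stated on the slide, so this sketch simply leaves them out of the child; the function name and `rng` parameter are assumptions.

```python
import random


def merge(parent1, parent2, n, rng=random):
    """Build one child from n randomly chosen members of each parent."""
    if not 1 <= n <= min(len(parent1), len(parent2)):
        raise ValueError("n must satisfy 1 <= n <= min(|parent1|, |parent2|)")
    chosen1 = set(rng.sample(sorted(parent1), n))
    chosen2 = set(rng.sample(sorted(parent2), n))
    return chosen1 | chosen2


if __name__ == "__main__":
    print(merge({1, 2, 3, 4}, {5, 6, 7, 8}, 2, random.Random(0)))
```

With n equal to the size of both parents, the child is the full union of the two organizations.
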
Selection Operator (OCEC)
- A tournament selection mechanism is used.

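Tournament selection in its generic form can be sketched in a few lines. The tournament size `k` and the fitness function passed in are assumptions; the slides do not give the values OCEC uses, and `len` in the usage below is only a stand-in for the real organization fitness.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Pick k distinct contestants at random and return the fittest one."""
    if not 1 <= k <= len(population):
        raise ValueError("k must satisfy 1 <= k <= len(population)")
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


if __name__ == "__main__":
    organizations = [frozenset({1}), frozenset({2, 3}), frozenset({4, 5, 6})]
    # Stand-in fitness: the number of members.
    print(tournament_select(organizations, len, k=2, rng=random.Random(1)))
```
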
Rule Extraction From Organization
- Rules are extracted from organizations when the evolutionary process ends.
- Rules are extracted on the basis of useful attributes.
- Each useful attribute becomes a term (part of the condition), for example:
  if temp = hot then play = yes

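Turning a useful attribute set into a rule can be sketched as below. The function name and argument layout are assumptions; the sketch only shows the step "each useful attribute becomes a term" and refuses to build a rule when the useful attribute set is empty.

```python
def extract_rule(useful_attributes, class_attribute, class_value):
    """Build an IF ... THEN ... rule string with one term per useful attribute.

    useful_attributes: dict mapping attribute name to its fixed value.
    """
    if not useful_attributes:
        raise ValueError("no rule can be extracted without useful attributes")
    terms = [f"{name} = {value}" for name, value in useful_attributes.items()]
    return f"IF {' AND '.join(terms)} THEN {class_attribute} = {class_value}"


if __name__ == "__main__":
    print(extract_rule({"temp": "hot"}, "play", "yes"))
```
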
Performance Evaluation of OCEC
- OCEC is evaluated on the multiplexer problem and the radar target recognition problem.
- All results show that OCEC has:
  - higher prediction accuracy
  - low computational cost

Scalability Evaluation of OCEC
- The scalability of OCEC is evaluated on synthetic datasets.
- The number of training examples increases from 100,000 (1 lakh) to 10 million.
- The number of attributes increases from 9 to 400.
- The results show that OCEC achieves good scalability.

EVALUATION OF OCEC’S EFFECTIVENESS
A. Multiplexer Problems
- Multiplexer problems were introduced to the machine learning community by Wilson in 1987 and have often been used to evaluate the performance of learning classifier systems.

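For readers unfamiliar with the benchmark, a Boolean multiplexer with k address bits has 2^k data bits, and the class of an input is the data bit that the address selects. That gives 4 + 16 = 20 inputs for the 20-multiplexer and 5 + 32 = 37 for the 37-multiplexer. The bit ordering below (address bits first, most significant bit on the left) is an assumed convention, and the function name is illustrative.

```python
def multiplexer(bits, address_bits):
    """Return the data bit selected by the leading address bits.

    bits: sequence of 0/1 values of length address_bits + 2 ** address_bits.
    """
    data_bits = 2 ** address_bits
    if len(bits) != address_bits + data_bits:
        raise ValueError("expected address_bits + 2 ** address_bits inputs")
    # Read the address as a binary number, most significant bit first.
    address = 0
    for bit in bits[:address_bits]:
        address = address * 2 + bit
    return bits[address_bits + address]


if __name__ == "__main__":
    # 6-multiplexer: address 10 (binary) selects the third data bit.
    print(multiplexer([1, 0, 0, 0, 1, 0], 2))
```
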
EVALUATION OF OCEC’S EFFECTIVENESS
B. Experimental Results
- The 20- and 37-multiplexer problems are used.
- The training set of the 20-multiplexer problem has 3000 examples, and that of the 37-multiplexer problem has 15 000 examples.
- The test set of each problem has 100 000 examples.
- The parameter N is set to 10% of the size of the training set, and n

EVALUATION OF OCEC’S EFFECTIVENESS
[Figure: The evolutionary process of OCEC for the 20-multiplexer problem]
[Figure: The evolutionary process of OCEC for the 37-multiplexer problem]

Coding Output

Comparison between OCEC & EA
- OCEC is based on organizations, while traditional EAs are based on individuals.
- OCEC has a bottom-up search mechanism, while EAs have a top-down search mechanism.
- The benefit of using organizations is that OCEC does not generate meaningless rules.
- OCEC has higher prediction accuracy and lower computational cost.

Conclusion
- OCEC is an effective tool for data mining.
- It has low computational cost.
- It performs well on complex, very large datasets.
- At large scale it performs better than other techniques.

Future Idea
- If a floating-point fitness function is used, OCEC should give better results in scientific applications.