learning system

Upload: samaher-hussein

Post on 30-May-2018


  • 8/14/2019 Learning System

    1/30

    1


    One central element of intelligent behavior is the ability to learn from experience.

    Understand the principles of clustering, classification, and prediction.

    Explain the relationship between learning and data mining.

    Give examples of supervised learning (back-propagation neural network and decision tree), unsupervised learning (Kohonen map), and learning in rule-based systems using genetic algorithms.

    Examine the concept of information theory in decision trees.


    1. Classification: Definition

    * Given a collection of records (the training set), each record contains a set of attributes, one of which is the class.

    * Find a model for the class attribute as a function of the values of the other attributes.

    * Goal: previously unseen records should be assigned a class as accurately as possible.

    * A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
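The train/test procedure above can be sketched in a few lines. This is only an illustration: the toy records, the "hold out every fourth record" split, and the 1-nearest-neighbour model are assumptions chosen for brevity, not something the slides prescribe.

```python
# Hypothetical toy data: (attributes, class). Values are illustrative only.
records = [
    ((1.0, 1.2), "A"), ((0.8, 1.0), "A"), ((1.1, 0.9), "A"), ((0.9, 1.1), "A"),
    ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.9, 3.0), "B"), ((3.2, 3.1), "B"),
]

def split(data, test_every=4):
    """Hold out every `test_every`-th record as the test set (assumed scheme)."""
    train = [r for i, r in enumerate(data) if (i + 1) % test_every != 0]
    test = [r for i, r in enumerate(data) if (i + 1) % test_every == 0]
    return train, test

def classify(train, attributes):
    """Assumed model: 1-nearest neighbour, i.e. the class of the closest training record."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(train, key=lambda r: sq_dist(r[0], attributes))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test records whose predicted class matches the true class."""
    correct = sum(1 for attrs, cls in test if classify(train, attrs) == cls)
    return correct / len(test)

train_set, test_set = split(records)
print(len(train_set), len(test_set), accuracy(train_set, test_set))
```

The model only ever sees `train_set`; `test_set` plays the role of the previously unseen records.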


    2. Illustrating the Classification Task


    Clustering is the process of grouping the data into classes or clusters, so that objects within a cluster have high similarity to one another but are very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. Often, distance measures are used.

    As a result, clustering means:

    Finding groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.
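As a rough sketch of distance-based grouping, the snippet below assigns each object to the nearest of two reference points using Euclidean distance. The objects and the starting centers are hypothetical, and a single assignment pass like this is only one step of a full clustering algorithm.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_to_clusters(objects, centers):
    """Group each object with its nearest center (one assignment pass)."""
    clusters = [[] for _ in centers]
    for obj in objects:
        idx = min(range(len(centers)), key=lambda i: euclidean(obj, centers[i]))
        clusters[idx].append(obj)
    return clusters

# Hypothetical objects described by two numeric attributes.
objects = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers = [(1.0, 1.0), (9.0, 9.0)]  # assumed starting centers
clusters = assign_to_clusters(objects, centers)
print(clusters)
```

Objects that land in the same list are close to each other and far from the objects in the other list, which is the similarity property the definition describes.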


    Supervised learning is the most common form of learning and is sometimes called programming by example. The learning agent is trained by showing it examples of the problem state or attributes along with the desired output or action. The learning agent makes a prediction based on the inputs, and if the output differs from the desired output, the agent is adjusted or adapted to produce the correct output. This process is repeated over and over until the agent learns to make accurate classifications or predictions. Examples of this type of learning are the back-propagation neural network and the decision tree.
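The predict, compare, adjust cycle can be sketched with a single threshold unit trained on the logical AND function. The perceptron-style update, the integer learning rate, and the AND data are assumptions made for illustration; the slides themselves name back propagation and decision trees as the examples.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net > 0 else 0

def train(examples, max_epochs=100):
    """Repeat over the examples, adjusting only when the output is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, desired in examples:
            error = desired - predict(weights, bias, inputs)
            if error != 0:
                # Adapt the agent toward the desired output.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
                mistakes += 1
        if mistakes == 0:  # every example is now classified correctly
            break
    return weights, bias

and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_examples)
print(weights, bias)
```

Training stops as soon as a full pass over the examples produces no corrections.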


    Unsupervised learning is used when the learning agent needs to recognize similarities between inputs or to identify features in the input data. The data is presented to the agent, and it adapts so that it partitions the data into groups. An example is the self-organizing map.

    Reinforcement learning is a special case of supervised learning where the exact desired output is unknown. It is based only on the information of whether or not the actual output is correct. Reinforcement learning can therefore be considered a middle stage between supervised learning and unsupervised learning.


    On-line learning means that the agent is sent out to perform its tasks and that it can learn or adapt after each transaction is processed. On-line learning is like on-the-job training and places severe requirements on the learning algorithms: they must be very fast and very stable.

    Off-line learning is more like a business seminar. You take your salespeople off the floor and place them in an environment where they can focus on improving their skills without distractions. After a suitable training period, they are sent out to apply their newfound knowledge and skills.


    1. Data mining is the process of extracting hidden knowledge from large volumes of raw data.

    2. Data mining automates the process of finding relationships and patterns in raw data and delivers results that can be either utilized in an automated decision support system or assessed by a human analyst.

    3. Data mining is a process for discovering data relationships hidden in large databases.

    So learning, as applied to data mining, can be thought of as a way for intelligent agents to automatically discover knowledge rather than having it predefined using predicate logic, rules, or some other representation.


    Artificial neural network: Definition

    An artificial neural network is an information-processing system that is able to acquire, store, and utilize experiential knowledge; this ability is related to the network's capabilities and performance.

    Neural networks provide an easy way to add learning ability to agents.


    Decision trees are one of the fundamental techniques used in data mining. They are tree-like structures used for classification, clustering, feature selection, and prediction. They have the following features:

    1. Decision trees are easily interpreted and understood by humans.

    2. They are well suited for high-dimensional applications.

    3. Decision trees are fast and usually produce high-quality solutions.

    4. Decision tree objectives are consistent with the goals of data mining and knowledge discovery.

    A decision tree consists of a root and internal nodes. The root and the internal nodes are labeled with questions in order to find a solution to the problem under consideration.


    The root node is the first state of a decision tree (DT). All of the examples from the training data are assigned to this node. If all examples belong to the same group, no further decisions need to be made to split the data set. If the examples in this node belong to two or more groups, a test is made at the node that results in a split. A DT is binary if each node is split into two parts, and it is non-binary (multi-branch) if each node is split into three or more parts.

    A decision tree model consists of two parts: creating the tree and applying the tree to the database. To achieve this, decision trees use several different algorithms. The algorithms most widely used by computer scientists are ID3, C4.5, and C5.0.

    In general, decision trees are based on information theory, and their leaf nodes represent the final classification of a record.
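The "applying the tree" half can be sketched as a walk from the root to a leaf. The tree below is hand-built for illustration only: the attribute names, thresholds, and class labels are hypothetical, and building the tree (as ID3 or C4.5 would) is not shown.

```python
# A hand-built binary tree; attribute names, thresholds, and labels are hypothetical.
tree = {
    "question": ("age", 30),              # asks: is age <= 30?
    "yes": {"leaf": "low risk"},
    "no": {
        "question": ("income", 50000),    # asks: is income <= 50000?
        "yes": {"leaf": "high risk"},
        "no": {"leaf": "low risk"},
    },
}

def apply_tree(node, record):
    """Answer the question at each node until a leaf (final class) is reached."""
    while "leaf" not in node:
        attribute, threshold = node["question"]
        node = node["yes"] if record[attribute] <= threshold else node["no"]
    return node["leaf"]

print(apply_tree(tree, {"age": 45, "income": 40000}))
```

Each internal node holds one question, and each leaf holds the classification assigned to any record that reaches it.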


    Example:


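    There are 1000 examples: 500 men and 500 women. Each example is either a positive example (P) or a negative example (N).

    Men = 500

    Positive examples (P): 90%, 90/100 * 500 = 450

    Negative examples (N): 10%, 10/100 * 500 = 50

    Women = 500

    Positive examples (P): 70%, 70/100 * 500 = 350

    Negative examples (N): 30%, 30/100 * 500 = 150

    $$I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n} \qquad (1)$$

    $$\text{Remainder} = \sum_i \frac{p_i+n_i}{p+n}\, I\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right) \qquad (2)$$

    $$\text{Gain} = I - \text{Remainder} \qquad (3)$$

    Here p and n are the total numbers of positive and negative examples, and p_i and n_i are the numbers in branch i. In this example the total information is taken to be I = 1.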
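A minimal sketch of the information measure in equation (1), assuming the usual convention that a term with a zero count contributes nothing (0 log2 0 = 0). The function name `information` is an illustrative choice.

```python
import math

def information(p, n):
    """I(p/(p+n), n/(p+n)) in bits; p and n are counts of positive and negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count > 0:  # convention: 0 * log2(0) is treated as 0
            fraction = count / total
            result -= fraction * math.log2(fraction)
    return result

print(information(500, 500))  # evenly mixed node
print(information(450, 50))   # the 90% / 10% branch
print(information(333, 0))    # pure node
```

An evenly mixed node carries the maximum of one bit, while a node containing only one class carries none.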


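    $$\text{Gain} = 1 - \left[\frac{500}{1000}\, I\left(\frac{450}{500}, \frac{50}{500}\right) + \frac{500}{1000}\, I\left(\frac{350}{500}, \frac{150}{500}\right)\right] = 1 - \left[0.5\, I(0.9, 0.1) + 0.5\, I(0.7, 0.3)\right]$$

    $$I(0.9, 0.1) = -\left[0.9 \log_2 0.9 + 0.1 \log_2 0.1\right] = 0.46899$$

    $$I(0.7, 0.3) = -\left[0.7 \log_2 0.7 + 0.3 \log_2 0.3\right] = 0.8812$$

    $$\text{Gain} = 1 - \left[0.5\,(0.46899) + 0.5\,(0.8812)\right] = 0.324857$$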

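    Second split: the 1000 examples are divided into three groups of 333 examples each.

    G1 = 333

    Positive examples (P): 50%, 50/100 * 333 = 166

    Negative examples (N): 50%, 50/100 * 333 = 166

    G2 = 333

    Positive examples (P): 90%, 90/100 * 333 = 300

    Negative examples (N): 10%, 10/100 * 333 = 33

    G3 = 333

    Positive examples (P): 100%, 100/100 * 333 = 333

    Negative examples (N): 0%, 0 * 333 = 0

    $$\text{Gain} = 1 - \left[0.333\, I(0.5, 0.5) + 0.333\, I(0.9, 0.1) + 0.333\, I(1, 0)\right]$$

    With I(0.5, 0.5) = 1, I(0.9, 0.1) = 0.46899, and I(1, 0) = 0 (taking 0 log2 0 = 0):

    $$\text{Gain} = 1 - \left[0.333 + 0.333\,(0.46899) + 0.333\,(0)\right] \approx 0.5108$$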
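The two gain calculations can be checked with a short script. It is a sketch under two assumptions: the total information is fixed at 1, as the example does, and a pure branch contributes zero information (0 log2 0 = 0). Each branch is written as a hypothetical tuple `(weight, positive fraction, negative fraction)`.

```python
import math

def information(p, n):
    """I(p, n) in bits, normalising p and n and treating 0 * log2(0) as 0."""
    total = p + n
    result = 0.0
    for value in (p, n):
        if value > 0:
            fraction = value / total
            result -= fraction * math.log2(fraction)
    return result

def gain(branches, total_information=1.0):
    """Gain = I - Remainder, where Remainder is the weighted branch information."""
    remainder = sum(weight * information(p, n) for weight, p, n in branches)
    return total_information - remainder

# Split by sex: two branches of 500 out of 1000 examples.
gain_sex = gain([(0.5, 0.9, 0.1), (0.5, 0.7, 0.3)])
# Split into three groups of 333 out of 1000 examples.
gain_groups = gain([(0.333, 0.5, 0.5), (0.333, 0.9, 0.1), (0.333, 1.0, 0.0)])
print(round(gain_sex, 6), round(gain_groups, 6))
```

Under these assumptions the first split evaluates to about 0.3249 and the three-group split to about 0.5108; the pure group G3 adds nothing to the remainder.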


    1. Back propagation is a systematic method for training multilayer artificial neural networks.

    2. Its learning rule is generalized from the Widrow-Hoff rule for multilayer networks: $W = w + c\,(d - f(net))\,x_j$, where c is the learning rate, d is the desired output, f(net) is the actual output, and x_j is the j-th input.

    3. It is a very popular supervised model in neural networks.

    4. It does not have feedback connections, but errors are back-propagated during training. The least mean squared error is used.
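One application of the rule W = w + c (d - f(net)) x_j can be sketched as follows. Taking f to be the unipolar sigmoid is an assumption here, and the weights, inputs, and learning rate are made-up values.

```python
import math

def sigmoid(net):
    """Unipolar sigmoid, used here as the assumed activation f."""
    return 1.0 / (1.0 + math.exp(-net))

def widrow_hoff_update(weights, inputs, desired, c):
    """Return the weights after one update W = w + c * (d - f(net)) * x_j."""
    net = sum(w * x for w, x in zip(weights, inputs))
    error = desired - sigmoid(net)
    return [w + c * error * x for w, x in zip(weights, inputs)]

# Hypothetical values: zero weights, so net = 0 and f(net) = 0.5.
new_w = widrow_hoff_update([0.0, 0.0], [1.0, 0.5], desired=1.0, c=0.5)
print(new_w)  # each weight moves in proportion to its own input
```

The weight attached to the larger input receives the larger correction, since the same error term is scaled by x_j.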


    Step 1: Set the initial values of the learning rate, the maximum acceptable network error (Emax), the maximum number of training epochs (Epochmax), and the momentum rate.

    Step 2: Set the network error (MSE) to zero and the current training pattern counter (p) to one.

    Step 3: Compute the activity of the hidden neurons using the unipolar sigmoid function with a steepness parameter of 1, $f(net) = 1 / (1 + e^{-net})$.

    Step 4: The hidden neuron outputs become the inputs to the output neurons, which apply the same sigmoid function to their activity.
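Steps 3 and 4 amount to a forward pass through two layers. The sketch below assumes the standard unipolar sigmoid 1 / (1 + e^(-lam * net)) with lam = 1; the layer sizes and weight values are hypothetical.

```python
import math

def sigmoid(net, lam=1.0):
    """Unipolar sigmoid with steepness lam (assumed form)."""
    return 1.0 / (1.0 + math.exp(-lam * net))

def layer_output(inputs, weights):
    """weights[j] holds the incoming weights of neuron j in the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

x = [1.0, 0.0]
w_hidden = [[0.0, 0.0], [2.0, -1.0]]  # hypothetical input-to-hidden weights
w_output = [[0.0, 0.0]]               # hypothetical hidden-to-output weights

hidden = layer_output(x, w_hidden)       # Step 3: hidden activity
output = layer_output(hidden, w_output)  # Step 4: hidden outputs feed the output layer
print(hidden, output)
```

The same `layer_output` function serves both layers because both apply the same activation to a weighted sum.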


    Step 5: Compute the error signal of the output neurons for pattern p.

    Step 6: Compute the error signal of the hidden neurons, which depends on the error of the output neurons.

    Step 7: Adjust the weights between the hidden layer and the output layer.


    Step 8: Adjust the weights between the input layer and the hidden layer.

    Step 9: Increase p by one to present the next pattern. If p has not exceeded the number of training patterns, return to step 3 to train the network on that pattern; otherwise go to step 10.

    Step 10: After all training patterns have been presented to the network, compute the cost function value.

    Step 11: Test the termination criterion. Training stops if the total network error becomes less than the maximum acceptable error (Emax) or the current epoch (t) exceeds the maximum number of learning epochs (Epochmax). Otherwise, return to step 2.
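The eleven steps can be sketched as one training loop. Several things here are assumptions rather than the slides' own choices: a 2-2-1 network with bias weights, fixed starting weights, the logical OR patterns, no momentum term, and mean squared error as the cost function.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def train(patterns, eta=0.5, e_max=0.01, epoch_max=5000):
    # Step 1: parameters come in as arguments; starting weights are fixed,
    # hypothetical values (the last entry of each row is a bias weight).
    w_hidden = [[0.1, -0.2, 0.05], [-0.15, 0.2, -0.05]]
    w_output = [0.1, -0.1, 0.05]
    history = []
    for epoch in range(1, epoch_max + 1):
        sq_error = 0.0                                    # Step 2
        for inputs, desired in patterns:                  # Step 9: next pattern p
            x = list(inputs) + [1.0]
            hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                      for row in w_hidden]                # Step 3
            h = hidden + [1.0]
            out = sigmoid(sum(w * hi for w, hi in zip(w_output, h)))  # Step 4
            delta_out = (desired - out) * out * (1.0 - out)           # Step 5
            delta_hidden = [delta_out * w_output[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(len(hidden))]              # Step 6
            w_output = [w + eta * delta_out * hi
                        for w, hi in zip(w_output, h)]                # Step 7
            w_hidden = [[w + eta * delta_hidden[j] * xi
                         for w, xi in zip(w_hidden[j], x)]
                        for j in range(len(w_hidden))]                # Step 8
            sq_error += (desired - out) ** 2
        mse = sq_error / len(patterns)                    # Step 10
        history.append(mse)
        if mse < e_max:                                   # Step 11
            break
    return w_hidden, w_output, history

or_patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
_, _, mse_history = train(or_patterns)
print(len(mse_history), mse_history[0], mse_history[-1])
```

The hidden error signals in step 6 are computed from the output weights before step 7 changes them, which is why the two weight updates follow the two error computations.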


    A Kohonen map is a single-layer neural network, comprised of an input layer and an output layer.

    Unlike back propagation, which is a supervised learning paradigm, feature maps perform unsupervised learning.

    Each time an input vector is presented to the network, its distance to each unit in the output layer is computed. Various distance measures have been used. The most common, and the one used here, is the Euclidean distance.

    The output unit with the smallest distance to the input vector is declared the winner.
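Winner selection can be sketched as a minimum over Euclidean distances. The output-unit weight vectors and the input vector below are hypothetical, and the weight update that normally follows in a Kohonen map is not shown.

```python
import math

def find_winner(input_vector, output_weights):
    """Return the index of the closest output unit and all distances."""
    distances = [
        math.sqrt(sum((x - w) ** 2 for x, w in zip(input_vector, unit)))
        for unit in output_weights
    ]
    winner = min(range(len(distances)), key=distances.__getitem__)
    return winner, distances

# Hypothetical weight vectors of three output units.
units = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.9]]
winner, distances = find_winner([0.6, 1.0], units)
print(winner, distances)
```

No desired output is involved: the winner is decided by the input vector and the current weights alone.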


    Classifier systems are rule systems that use genetic algorithms to modify the rule base.

    Genetic algorithms are particularly suited to optimization problems because they are, in essence, performing a parallel search in the state space.

    A. The crossover operator injects large amounts of noise into the process to make sure that the entire search space is covered.

    B. The mutation operator allows fine-tuning of fit individuals in a manner similar to hill-climbing search techniques.

    C. Control parameters are used to determine how large the population is, how individuals are selected from the population, and how often mutation and crossover are performed.
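The two operators can be sketched on bit-string chromosomes. One-point crossover and bit-flip mutation are assumed variants; the slides do not say which forms they have in mind, and the parent strings are made up.

```python
import random

def crossover(parent_a, parent_b, point):
    """One-point crossover: swap the tails of two parents at `point`."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

def mutate(chromosome, pm, rng):
    """Flip each bit independently with probability pm."""
    return [1 - gene if rng.random() < pm else gene for gene in chromosome]

child_1, child_2 = crossover([1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0], point=2)
rng = random.Random(0)  # seeded generator so runs are repeatable
mutated = mutate(child_1, pm=0.1, rng=rng)
print(child_1, child_2, mutated)
```

Crossover recombines large blocks from both parents in one step, while a small `pm` changes only the occasional gene, which matches the coarse versus fine roles described in A and B.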


    1. Population: represents the number of individuals in the environment. Usually 50 to 100 individuals are used, and each one consists of L genes:

    Chromosome i = gene_1 gene_2 ... gene_L


    Initialization(population);
    Evaluation(population);
    Gen = 0;
    Do
        Selection(population, selected parents);
        Crossover(selected parents, created offspring, Pc);
        Mutation(created offspring, Pm);
        Evaluation(created offspring);
        Gen = Gen + 1;
    While (not stop_criteria);
    End SSGA

    END
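The SSGA pseudocode can be turned into a small runnable sketch. Several details are assumptions that the pseudocode leaves open: the OneMax fitness (count of 1 bits), binary tournament selection, one-point crossover, bit-flip mutation, and a steady-state replacement rule in which an offspring replaces the current worst individual only if it is fitter.

```python
import random

def fitness(chromosome):
    """Hypothetical toy objective (OneMax): the number of 1 genes."""
    return sum(chromosome)

def ssga(pop_size=50, length=20, pc=0.8, pm=0.05, max_gen=200, seed=0):
    rng = random.Random(seed)
    # Initialization(population) and Evaluation(population)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    scores = [fitness(c) for c in population]
    initial_best = max(scores)
    gen = 0
    # Assumed stop criteria: generation limit reached or a perfect string found.
    while gen < max_gen and max(scores) < length:
        # Selection: two binary tournaments pick the parents.
        parents = []
        for _ in range(2):
            a, b = rng.sample(range(pop_size), 2)
            parents.append(population[a] if scores[a] >= scores[b] else population[b])
        # Crossover with probability pc, otherwise copy the parents.
        if rng.random() < pc:
            point = rng.randint(1, length - 1)
            offspring = [parents[0][:point] + parents[1][point:],
                         parents[1][:point] + parents[0][point:]]
        else:
            offspring = [parents[0][:], parents[1][:]]
        # Mutation: flip each gene with probability pm.
        offspring = [[1 - g if rng.random() < pm else g for g in child]
                     for child in offspring]
        # Evaluation of the offspring, then steady-state replacement of the worst.
        for child in offspring:
            worst = min(range(pop_size), key=scores.__getitem__)
            child_score = fitness(child)
            if child_score > scores[worst]:
                population[worst], scores[worst] = child, child_score
        gen += 1
    return initial_best, max(scores), gen

initial_best, final_best, generations = ssga()
print(initial_best, final_best, generations)
```

Because only the worst individual is ever replaced, and only by something fitter, the best fitness in the population cannot decrease from one generation to the next.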