lecture3 - machine learning

Introduction to Machine Learning

Lecture 3

Albert Orriols i Puig

[email protected]

Artificial Intelligence – Machine Learning

Enginyeria i Arquitectura La Salle

Universitat Ramon Llull

Recap of Lecture 2

Machine learning

Learning = improving with experience at some task

Improve over task T

With respect to a performance measure P

Based on experience E

Three special niches

Data mining: extract information from historical data to help decision making

Software applications that are too complex to build a hard-wired solution for

Self-customizing programs


Today’s Agenda

Characteristics Desired for ML Methods

General issues

Concepts that will be used throughout the lectures

Summary of the Paradigms that We Won’t Study

Summary of the Problems that We Will Study


Characteristics Desired ML

We would like our ML techniques to have the following properties

Be able to generalize, but not too much

Be robust

Be reliable

Learn models of high quality

Be scalable and efficient

Be explicative

Be deterministic


Characteristics Desired ML

Be able to generalize, but not too much

We learn from a set of examples

Imagine that we are doing data regression

[Figure: examples (observations), the real domain (dashed line), and the learned function]

We only know the examples {e1, e2, e3, e4, e5, e6, e7, e8, e9}

We do not know the real distribution

So, does the learned function fit the real distribution?


Characteristics Desired ML

Be able to generalize, but not too much

[Figure: examples (observations), the real domain (dashed line), and the learned function]

What could have happened?

The example set I may not be a good representation of the original distribution

The ML method may not work well (overfitting)

So, what should we do?

Assume that I is a good representative of the original distribution

Go for the simplest solution
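The trade-off above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "real" function (2x + 1), the noise level, the nine training examples, and the choice of a 1-nearest-neighbour memorizer as a stand-in for an overly flexible learner. It compares the memorizer with a simple least-squares line on the training examples and on fresh examples drawn from the same distribution.

```python
import random

def fit_line(xs, ys):
    """Least-squares straight line: the 'simplest solution'."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """1-nearest-neighbour: returns the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on a set of examples."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng, noise=0.5):
    """Hypothetical real domain: y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + rng.gauss(0, noise) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = make_data(9, rng)    # the examples e1..e9 we know
test_x, test_y = make_data(200, rng)    # unseen examples from the real domain

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizer reproduces every training example exactly, so its training error is zero, but that says nothing about how well it fits the real distribution. Comparing the two test errors is what tells us whether the learned function generalizes.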


Characteristics Desired ML

Be robust

The real world is imperfect, and our measurements of the real world may be even more imperfect

Therefore, we will deal with domains with

Noise

Uncertainty

Vagueness

We have to keep this in mind when designing our algorithms


Characteristics Desired ML

Learn models of high quality

How do we evaluate learning quality?

[Diagram: Dataset (information based on experience) → Learner (knowledge extraction) → Model. The dataset is split into a training set and a test set; a new instance given to the model yields a predicted output]

More advanced validation methods:

Holdout

k-fold cross-validation
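As a rough sketch of the two validation schemes named above, the following code only produces index splits; it does not train anything. The function names, the 30% test fraction, and the fixed seed are illustrative assumptions, not part of the lecture.

```python
import random

def holdout_split(n, test_fraction=0.3, seed=0):
    """Holdout: shuffle the n example indices once and reserve a fraction for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]   # (train, test)

def k_fold_splits(n, k=5, seed=0):
    """k-fold cross-validation: every example is used for testing exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

train, test = holdout_split(10)
print("holdout:", sorted(train), sorted(test))
for fold_train, fold_test in k_fold_splits(10, k=5):
    print("fold test indices:", sorted(fold_test))
```

With holdout, the quality estimate depends on a single split. With k-fold, the learner is trained k times and the k test results are averaged, which costs more but uses every example for both training and testing.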


Characteristics Desired ML

Be reliable

What do you prefer?

Not predicting something that you are in doubt about?

Or just betting on an option?

Are classes cost-sensitive?

What happens if I say that a patient who actually has cancer is healthy?

What happens if I say that a patient who is actually healthy has cancer?

Do I prefer to model one class as opposed to the other?

Fraud detection (0.1% of fraudulent transactions)

Geez, I modeled the non-fraudulent transactions perfectly!

Am I successful?
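The fraud question can be made concrete with a small sketch. It assumes 1000 transactions of which 1 is fraudulent (0.1%), a model that always answers "not fraud", and hypothetical misclassification costs (100 for a missed fraud, 1 for a false alarm); none of these numbers beyond the 0.1% rate come from the lecture.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fn, fp, tn

# 1 = fraudulent, 0 = legitimate; 0.1% of the transactions are fraudulent.
y_true = [1] * 1 + [0] * 999
y_pred = [0] * 1000          # a model that only learned the majority class

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)      # fraction of frauds actually detected

# Hypothetical costs: missing a fraud is assumed far worse than a false alarm.
COST_FN, COST_FP = 100, 1
total_cost = fn * COST_FN + fp * COST_FP

print("accuracy:", accuracy)      # high, yet misleading
print("recall on fraud:", recall)
print("total cost:", total_cost)
```

Accuracy alone says the model is almost perfect, while recall on the class of interest is zero. Whether the model is successful depends on which errors matter and how much each one costs.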


Characteristics Desired ML

Be scalable and efficient

Huge amount of data

Information hidden in these data

I need to process them quickly!

Two types of costs:

Cost to build the model

Cost to classify new test examples


Characteristics Desired ML

Be explicative

Should I care about giving an explanation?

Text/speech recognition

Things happen too fast. If errors are not too large, I do not care if I read “a” instead of “e”

Medical diagnosis

I really care about obtaining an accurate explanation, since the diagnosis may determine whether or not a patient undergoes surgery


Characteristics Desired ML

Be deterministic

If my data does not change

The learned model should always be the same

The answer for a given test instance should always be the same

If my data changes

I should adapt to the changes


Paradigms in ML

Typically, techniques in ML have been divided into different paradigms

Inductive learning

Explanation-based learning

Analogy-based learning

Evolutionary learning

Connectionist learning


Inductive Learning

Induce rules, trees or, in general, patterns from a set of examples

Start from a specific experience

Draw inferences or generalizations from it

That is

Initial state: Original data

State: Symbolic description of the data with a certain degree of generalization/specialization

Final state: Model with maximum generalization that implies the input data


Explanation-Based Learning

Deduce information from a set of observations

Humans learn a lot from few examples

Machine: use results from one example to solve the next problem

[Diagram: EBL takes as input the domain theory for the problem, the goal concept, and a training example, and produces a new domain theory]


Explanation-Based Learning

Domain:

R1: striped(x) ^ feline(x) → tiger(x)

R2: runs(x) → feline(x)

R3: carnivorous(x) ^ has_tail(x) → feline(x)

R4: eats_meat(x) → carnivorous(x)

R5: teeth(x) ^ mammal(x) → carnivorous(x)

R6: hairy(x) → mammal(x)

R7: feeds_milk(x) → mammal(x)

R8: warm_blood(x) → mammal(x)

Goal: TIGER

Example:

feeds_milk(Flare)

has_tail(Flare)

striped(Flare)

teeth(Flare)

[Proof tree: tiger(Flare) follows from striped(Flare) and feline(Flare); feline(Flare) from runs(Flare), or from carnivorous(Flare) and has_tail(Flare); carnivorous(Flare) from eats_meat(Flare), or from teeth(Flare) and mammal(Flare); mammal(Flare) from feeds_milk(Flare), hairy(Flare) or warm_blood(Flare)]
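The explanation step of this example can be reproduced with a short forward-chaining sketch over rules R1 to R8. This is only an illustration of how the proof for tiger(Flare) is built from the four observed facts; it does not perform the generalization step that EBL would apply afterwards, and the data layout (rules as name, premises, conclusion) is an assumption made here.

```python
# Each rule: (name, set of premises, conclusion), all about the same individual x.
RULES = [
    ("R1", {"striped", "feline"}, "tiger"),
    ("R2", {"runs"}, "feline"),
    ("R3", {"carnivorous", "has_tail"}, "feline"),
    ("R4", {"eats_meat"}, "carnivorous"),
    ("R5", {"teeth", "mammal"}, "carnivorous"),
    ("R6", {"hairy"}, "mammal"),
    ("R7", {"feeds_milk"}, "mammal"),
    ("R8", {"warm_blood"}, "mammal"),
]

def forward_chain(facts, rules=RULES):
    """Apply rules until nothing new can be derived; return facts and fired rules."""
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                fired.append(name)
                changed = True
    return known, fired

# The training example: what we observe about Flare.
flare = {"feeds_milk", "has_tail", "striped", "teeth"}
known, fired = forward_chain(flare)

print("tiger derived:", "tiger" in known)
print("rules used:", fired)
```

Only four of the eight rules are needed to explain why Flare is a tiger; the unused branches of the proof tree (runs, eats_meat, hairy, warm_blood) play no part in the explanation.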


Explanation-based Learning

Example

Goal: Get to Brecon

Training data

Near (Cardiff, Brecon)

Airport (Cardiff)

Domain Knowledge

Near(x,y) ^ holds( loc(x), s ) → holds( loc(y), result(drive(x,y),s) )

Airport(z) → loc(z), result( fly(z), s )

Operational criterion: We must express concept definition in pure description language syntax

Our goal can be expressed as

holds( loc(Brecon), s )


Learning Based on Analogy

A is similar to A’ according to α

If I have B, can I get B’?

Learn the causality relationship β

Transform α to α’

Get B’ according to B and α’

[Diagram: A relates to A’ through α, and B relates to B’ through α’; β links A to B, and β’ links A’ to B’]

Where is the trick?

In learning α’ and β

[Diagram: a partial mapping relates the new problem to a previously solved problem; the solution of the new problem is obtained from the solution to the known problem by transformation or derivation]

Evolutionary Learning

Nature as problem solver

Nature evolved adapted solutions to life

Let’s use these concepts to learn from experience
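To give a flavour of the idea, here is a minimal sketch of a (1+1) evolutionary algorithm, one of the simplest members of this family. The lecture does not prescribe any particular algorithm; the toy fitness function (count of 1s in a bit string), the mutation rate, and the number of generations are all assumptions chosen for illustration.

```python
import random

def one_max(bits):
    """Toy fitness: the number of 1s in the bit string."""
    return sum(bits)

def evolve(n_bits=20, generations=200, seed=0):
    """(1+1) EA: mutate the parent, keep the child if it is at least as fit."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    history = [one_max(parent)]
    for _ in range(generations):
        # Variation: flip each bit independently with probability 1/n_bits.
        child = [1 - b if rng.random() < 1.0 / n_bits else b for b in parent]
        # Selection: survival of the fitter individual.
        if one_max(child) >= one_max(parent):
            parent = child
        history.append(one_max(parent))
    return parent, history

best, history = evolve()
print("initial fitness:", history[0], "final fitness:", history[-1])
```

The two ingredients borrowed from nature are random variation and selection. Because a child only replaces its parent when it is no worse, the fitness of the surviving individual never decreases.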


Connectionist Learning

Mimic brain structure to build machines that are able to learn

A brain consists of

Connected neurons that behave in a specific way

Let’s assume that this behavior can be coded functionally
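One common way to code a neuron functionally is as a weighted sum of its inputs followed by a threshold. The sketch below uses that model together with the classic perceptron learning rule on the logical AND function. The slides do not name a specific neuron model or training rule, so both are assumptions made for illustration.

```python
def neuron(weights, bias, inputs):
    """A neuron as a function: fire (1) if the weighted input sum exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: nudge weights by the error on each example."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - neuron(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Experience: the truth table of logical AND.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)

for inputs, target in AND_SAMPLES:
    print(inputs, "->", neuron(weights, bias, inputs), "(expected", target, ")")
```

A single neuron of this kind can only separate classes with a straight line (or hyperplane), which is why AND is learnable here; connecting many such units is what gives connectionist models their expressive power.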


Problems That We’ll Study

Typical ML courses go through the different families

Structured courses

Big picture of the different learning paradigms

However

Emergence of hybrid intelligent systems

Concepts all come mixed together

We are engineers. We need to solve problems

So, we propose to go problem-oriented

Techniques from different paradigms will come our way


Problems That We’ll Study

1. Data classification: C4.5, kNN, Naïve Bayes …

2. Statistical learning: SVM

3. Association analysis: Apriori

4. Link mining: PageRank

5. Clustering: k-means

6. Reinforcement learning: Q-learning, XCS

7. Regression

8. Genetic Fuzzy Systems


Next Class

How I Would Like my Problem to Look Like?

Summary of the Paradigms that we Won’t Study

