Page 1: Machine Learning - What, Where and How

Machine Learning: What, Where and How

Narinder Kumar ([email protected])

Mercris Technologies (www.mercris.com)

Page 2: Machine Learning - What, Where and How

Agenda

Definition

Types of Machine Learning

Under the Hood

Languages & Libraries

Page 3: Machine Learning - What, Where and How

What is Machine Learning ?

Page 4: Machine Learning - What, Where and How

Definition

Field of study that gives computers the ability to learn without being explicitly programmed. -- Arthur Samuel

A more mathematical one

A computer program is said to learn from experience E with respect to some task T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. -- Tom M. Mitchell

Page 5: Machine Learning - What, Where and How

Related Disciplines

Sub-Field of Artificial Intelligence

Deals with Design and Development of Algorithms

Closely related to Data Mining

Uses techniques from Statistics, Probability Theory and Pattern Recognition

Not new but growing fast because of Big Data

Page 6: Machine Learning - What, Where and How

Types of Machine Learning

Supervised Machine Learning

Provide the right answers for a set of example questions

Underlying algorithm learns/infers from them over a period of time

Tries to return correct answers for similar questions

Unsupervised Machine Learning

Provide data and let the underlying algorithm find some structure
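The contrast between the two types can be sketched in a few lines of Python. This is only an illustration under assumptions: the data values, the labels and the function names `predict_nearest` and `two_means` are made up, and 1-nearest-neighbour and a two-cluster k-means loop are just two simple representatives of supervised and unsupervised learning.

```python
# Supervised: labelled examples (question -> known right answer). Made-up data.
labelled = [(1.0, "small"), (1.2, "small"), (3.8, "large"), (4.1, "large")]


def predict_nearest(x, examples):
    """Return the label of the training example closest to x (1-nearest neighbour)."""
    _, nearest_label = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label


# Unsupervised: no labels, the algorithm looks for structure on its own.
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k = 2)."""
    centres = [min(values), max(values)]  # simple deterministic starting point
    clusters = ([], [])
    for _ in range(iterations):
        clusters = ([], [])
        for v in values:
            idx = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[idx].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters


print(predict_nearest(1.1, labelled))
print(two_means([1.0, 1.2, 3.8, 4.1]))
```

The supervised function needs the labels to answer; the unsupervised one only ever sees the raw values.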

Page 7: Machine Learning - What, Where and How

Popular Use Cases

Recommendation Systems

Amazon, Netflix, iTunes Genius, IMDb...

Up-Selling & Churn Analysis

Customer Sentiment Analysis

Market Segmentation

...

Page 8: Machine Learning - What, Where and How

Understanding Regression

Page 9: Machine Learning - What, Where and How

Problem Context

Page 10: Machine Learning - What, Where and How

Typical Machine Learning Algorithm

Training Set -> Learning Algorithm -> Hypothesis

Input Features -> Hypothesis -> Expected Output
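The flow on this slide (a training set goes into a learning algorithm, which produces a hypothesis that maps input features to an expected output) can be sketched as a function that returns a function. This is a minimal sketch under assumptions: the name `learn` and the data are made up, and the learning algorithm used here is closed-form least squares for a single feature, chosen only because it is short.

```python
def learn(training_set):
    """Learning algorithm: takes (feature, output) pairs, returns a hypothesis."""
    xs = [x for x, _ in training_set]
    ys = [y for _, y in training_set]
    n = len(training_set)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares fit of a straight line (assumes xs are not all equal).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in training_set) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x

    def hypothesis(x):
        """Hypothesis: maps an input feature to a predicted output."""
        return intercept + slope * x

    return hypothesis


training_set = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # made-up points on y = 2x + 1
h = learn(training_set)
print(h(4.0))
```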

Page 11: Machine Learning - What, Where and How

Let's Simplify a bit

[Figure: scatter plot "House Sizes vs Prices"; x-axis: House Sizes (Sq Yards), y-axis: Prices (1000 USD)]

➢ Goal is to draw a straight line which fits our data set reasonably well

➢ Our hypothesis can be $h_\theta(x) = \theta_0 + \theta_1 x$

➢ Such that $h_\theta(x) \simeq y$
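As a rough illustration, the straight-line hypothesis is a one-line function. The sizes, prices and parameter values below are made up for the example and are not taken from the plot.

```python
def h(theta0, theta1, x):
    """Straight-line hypothesis: predicted price for a house of size x."""
    return theta0 + theta1 * x


# Made-up data: house sizes (sq yards) and prices (1000 USD).
sizes = [100, 200, 300]
prices = [1000, 2000, 3000]

# Hypothetical parameter values; a learning algorithm would normally find these.
theta0, theta1 = 0.0, 10.0

predictions = [h(theta0, theta1, x) for x in sizes]
for size, price, predicted in zip(sizes, prices, predictions):
    print(size, price, predicted)
```

A good choice of the two parameters makes each prediction land close to the known price.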

Page 12: Machine Learning - What, Where and How

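In Mathematical Terms

➢ Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

➢ Parameters: $\theta_0, \theta_1$

➢ Cost Function: $J(\theta_0, \theta_1)$

➢ We would like to minimize $J(\theta_0, \theta_1)$ over $\theta_0, \theta_1$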
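The slide does not spell out the cost function. A common choice for straight-line regression, assumed here rather than taken from the slide, is the mean squared error over the $m$ training examples $(x^{(i)}, y^{(i)})$:

```latex
\[
J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
\]
```

The factor $\frac{1}{2}$ is a convention that simplifies the derivative; it does not change where the minimum lies.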

Page 13: Machine Learning - What, Where and How

Solution: Gradient Descent

➢ Start with initial values of $\theta_0, \theta_1$

➢ Keep changing $\theta_0, \theta_1$ until we end up at a minimum
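The two steps can be sketched in Python as follows. Several things here are assumptions, not taken from the slides: the squared-error cost, the learning rate `alpha`, the made-up data, and the use of a fixed number of iterations in place of a real convergence test.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit theta0 + theta1 * x to the data by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0  # initial values
    m = len(xs)
    for _ in range(iterations):
        # Prediction error for every training example with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the (assumed) squared-error cost.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # made-up points on y = 2x + 1
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 4), round(theta1, 4))
```

If `alpha` is too large the updates overshoot and diverge; if it is too small the loop needs many more iterations.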

Page 14: Machine Learning - What, Where and How

Mathematically

Repeat Until Convergence

For Our Scenario

Generic Formula
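The formulas on this slide did not survive extraction. The standard gradient descent updates are given below as a reference sketch. They assume a learning rate $\alpha$, $m$ training examples $(x^{(i)}, y^{(i)})$ and the squared-error cost $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$, none of which is confirmed by the transcript.

```latex
% Generic formula: repeat until convergence, updating all \theta_j simultaneously
\[
\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1), \qquad j = 0, 1
\]

% For our scenario, with h_\theta(x) = \theta_0 + \theta_1 x
\[
\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
\]
\[
\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
\]
```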

Page 15: Machine Learning - What, Where and How

Let's see all this in Action

Page 16: Machine Learning - What, Where and How

Extending Regression

➢ Quadratic Model

➢ Cubic Model

➢ Square Root Model

➢ We can create multiple new features like $X_2 = X^2$, $X_3 = X^3$, $X_4 = \sqrt{X}$
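Creating the extra features is a plain transformation of the original input, after which the same linear machinery applies. A minimal sketch, with a made-up function name `expand_features`:

```python
import math


def expand_features(x):
    """Return [X, X^2, X^3, sqrt(X)] for a single non-negative input value."""
    return [x, x ** 2, x ** 3, math.sqrt(x)]


print(expand_features(4.0))
```

The derived features differ hugely in magnitude (4, 16, 64, 2 for an input of 4), which is one reason feature scaling matters when they are combined.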

Page 17: Machine Learning - What, Where and How

Additional Pointers

➢ Mean Normalization

➢ Feature Scaling

➢ Learning Rate

➢ Gradient Descent vs Others
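Mean normalization and feature scaling are often applied together: subtract the mean of a feature, then divide by its range. The sketch below assumes that particular variant (dividing by the standard deviation is another common choice) and uses made-up values.

```python
def mean_normalize(values):
    """Shift values to zero mean and scale them by their range (max - min)."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        # All values are identical, so there is nothing to scale.
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]


house_sizes = [100.0, 200.0, 300.0]  # made-up sizes
print(mean_normalize(house_sizes))
```

Features brought into a similar small range like this usually let gradient descent converge in fewer iterations.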

Page 18: Machine Learning - What, Where and How

HOW-TO: Languages & Libraries

Page 19: Machine Learning - What, Where and How

Languages

Page 20: Machine Learning - What, Where and How

Libraries, Tools and Products

Page 21: Machine Learning - What, Where and How

A Short Introduction

Page 22: Machine Learning - What, Where and How

What is WEKA?

Developed by the Machine Learning Group, University of Waikato, New Zealand

Collection of Machine Learning Algorithms

Contains tools for Data Pre-Processing, Classification & Regression, Clustering and Visualization

Can be embedded inside your application

Implemented in Java

Page 23: Machine Learning - What, Where and How

Main Components

Explorer

Experimenter

Knowledge Flow

CLI

Page 24: Machine Learning - What, Where and How

Terminology

Training DataSet == Instances

Each Row in DataSet == Instance

Instance is a Collection of Attributes (Features)

Types of Attributes

Nominal (True, False, Malignant, Benign, Cloudy...)

Real values (6, 2.34, 0...)

String (“Interesting”, “Really like it”, “Hate It” ...)

...

Page 25: Machine Learning - What, Where and How

Sample DataSets

@RELATION house

@ATTRIBUTE houseSize real
@ATTRIBUTE lotSize real
@ATTRIBUTE bedrooms real
@ATTRIBUTE granite real
@ATTRIBUTE bathroom real
@ATTRIBUTE sellingPrice real

@DATA
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
4032,10150,5,0,1,197900
2397,14156,4,1,0,189900
2200,9600,4,0,1,195000
3536,19994,6,1,1,325000
2983,9365,5,0,1,230000

@RELATION CPU

@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
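To show how the pieces of such a file fit together (a relation name, attribute declarations, then one comma-separated instance per line), here is a deliberately simplified reader in Python. It is a sketch, not WEKA's own loader: the function name `parse_arff` is made up, and quoted values, sparse rows and typed conversion are not handled.

```python
def parse_arff(text):
    """Parse a simple ARFF document into (relation, attributes, rows)."""
    relation = None
    attributes = []  # list of (name, type) pairs
    rows = []        # each instance as a list of strings
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


sample = """@RELATION house
@ATTRIBUTE houseSize real
@ATTRIBUTE lotSize real
@ATTRIBUTE bedrooms real
@ATTRIBUTE granite real
@ATTRIBUTE bathroom real
@ATTRIBUTE sellingPrice real
@DATA
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
"""

relation, attributes, rows = parse_arff(sample)
print(relation, len(attributes), len(rows))
```

Each parsed row corresponds to one instance, and each position in the row to one declared attribute.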

Page 26: Machine Learning - What, Where and How

WEKA Demo

Page 27: Machine Learning - What, Where and How

Page 28: Machine Learning - What, Where and How

Apache Mahout

➢ Collection of Machine Learning Algorithms

➢ Map-Reduce Enabled (most cases)

➢ DataSources

    ➢ Database

    ➢ File-System

    ➢ Lucene Integration

➢ Very Active Community

➢ Apache License

Page 29: Machine Learning - What, Where and How

WEKA vs Apache Mahout

WEKA

➢ Lot of Algorithms

➢ Tools for

    ➢ Modeling

    ➢ Comparison

    ➢ Data-Flow

➢ May need work for running on large data-sets

➢ License Issues

Apache Mahout

➢ Fewer algorithms, but growing

➢ Lack of tools for Modeling

➢ Ready by Design for Large Scale

➢ Vibrant Community

➢ Apache License

Page 30: Machine Learning - What, Where and How

An Overview

Page 31: Machine Learning - What, Where and How

Google Prediction API 101

➢ Cloud Based Web Service for Machine Learning

➢ Exposed as REST API

➢ Does not require any Machine Learning knowledge

➢ Capabilities

    ➢ Categorical

    ➢ Regression

Page 32: Machine Learning - What, Where and How

Working with Google Prediction API

Page 33: Machine Learning - What, Where and How

Let's see in Action

Page 34: Machine Learning - What, Where and How

Analysis

Very Promising Concept

Can be a powerful tool for SMEs

Not configurable

Data Security

Not Yet Production Ready (IMHO)

Page 35: Machine Learning - What, Where and How

Recap

➢ Very vast

➢ Huge demand

➢ Has an Initial Steep Learning Curve

➢ Several libraries available

➢ Lot of Innovative work going on currently

Page 36: Machine Learning - What, Where and How

[email protected]

@kumar_narinder

www.mercris.com

http://mercris.wordpress.com

Page 37: Machine Learning - What, Where and How

Resources

➢ Online Machine Learning Course - Prof. Andrew Ng, Stanford University

➢ WEKA Wiki and API docs

➢ Apache Mahout Wiki

➢ IBM developerWorks Articles

➢ Google Prediction API Web Site

➢ Data Mining: Practical Machine Learning Tools & Techniques - Ian H. Witten, Eibe Frank, Mark Hall

➢ Machine Learning Forums

