Machine Learning with R as a Service · download.microsoft.com/download/6/5/0/65023338-ae...


29/10/2015


Data Platform Airlift

21 October \\ Microsoft Lisbon Experience

Machine Learning with R as a Service

Manuel Dias

Business Analytics Lead, Microsoft


Predictive Analytics

Sales and marketing

Finance and risk

Customer and channel

Operations and workforce

Utilities, Oil & Gas

Agent Allocation

Warehouse Efficiency

Smart buildings

Predictive Maintenance

Supply chain optimization

User Segmentation

Personalized Offers

Product Recommendation

Fraud Detection

Credit risk management

Sales Forecasting

Demand Forecasting

Sales Lead Scoring

Marketing mix optimization

Energy Forecasting

Grid Optimization

Theft Prevention

Predictive maintenance

Demand Response

Customer Profiling

Credit Scoring

Revenue Forecasting


Harvard Business Review, Thomas H. Davenport, October 2012

What is Machine Learning?

Predictive computing systems become smarter with experience

Supervised learning: we want to learn a mapping from the input to the output; the correct values are provided by a supervisor:

• Fraud Detection

• Image Recognition

• SPAM Filter

• Sales Forecast

Unsupervised learning: we want to find regularities in the data. The class labels of the training data are unknown.

• Customer Segmentation

• Movie Recommendation Engine
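The contrast between the two settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy numbers, the `predict_1nn` and `two_means_1d` helper names, and the choice of a nearest-neighbour rule and a two-group k-means are not from the slides. The first function uses labels supplied by a "supervisor"; the second only looks for groupings in unlabeled values.

```python
# Illustrative sketch only: toy data and helper names are assumptions.

def predict_1nn(labeled, x):
    """Supervised: return the label of the training input closest to x."""
    _, label = min(labeled, key=lambda pair: abs(pair[0] - x))
    return label


def two_means_1d(values, iterations=10):
    """Unsupervised: split unlabeled numbers into two groups (k-means, k=2).

    Requires at least two values and iterations >= 1.
    """
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Supervised: each score comes with a label provided in advance.
labeled_scores = [(1.0, "ham"), (2.0, "ham"), (8.0, "spam"), (9.0, "spam")]
print(predict_1nn(labeled_scores, 7.5))   # spam

# Unsupervised: the same kind of numbers, but no labels at all.
spend = [1.0, 2.0, 8.0, 9.0]
print(two_means_1d(spend))                # [[1.0, 2.0], [8.0, 9.0]]
```

In the supervised case the output vocabulary ("ham", "spam") is fixed by the labels; in the unsupervised case the groups have no names until someone interprets them.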


Azure ML

• Classification: predict what class a case belongs to

• Regression: predict numerical outcomes

• Anomaly Detection: identify outliers on the running data

• Recommenders: explore associations between cases

• Clustering: discover natural groupings of cases

Supervised Learning

Unsupervised Learning

R and Python

Mathematical Programming

Online analytical processing

Graph analytics

Text analytics

Support Vector Machines

Boosted Decision Trees

Time series processing

In the future: support for extensibility by enabling users to add their own algorithms as modules

Associative rule mining

Neural networks

Regression analysis

Clustering

Nearest-neighbor


Azure Machine Learning

DATA

HDInsight

SQL Server VM

SQL Database

Blobs & Tables

Desktop files

Excel spreadsheet

Other data files on PC

Azure Machine Learning

ML Studio

Azure Machine Learning

ML Marketplace

Devices & Applications

Publish API

• Get Historical Data

• Feature Engineering

• Train/Test Split

• Define Model

• Train Model

• Score Model

• Evaluate Model

Iterate until the test metrics are satisfactory
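The modelling loop named on this slide can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the synthetic "historical" data, the single rescaled feature, the threshold classifier, and all function names. It is not how ML Studio implements these steps, only a minimal picture of how they fit together.

```python
import random

# Illustrative sketch only: data, model and names are assumptions.

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Train/Test Split: shuffle reproducibly, then cut into two parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def score(rows, threshold):
    """Score Model: predict 1 when the feature reaches the threshold."""
    return [1 if x >= threshold else 0 for x, _ in rows]


def accuracy(rows, threshold):
    """Evaluate Model: fraction of rows whose prediction matches the label."""
    predictions = score(rows, threshold)
    hits = sum(p == y for p, (_, y) in zip(predictions, rows))
    return hits / len(rows)


def train_threshold(train_rows):
    """Train Model: pick the cut-off that maximizes training accuracy."""
    candidates = sorted({x for x, _ in train_rows})
    return max(candidates, key=lambda t: accuracy(train_rows, t))


# Get Historical Data (synthetic): a raw value and a 0/1 outcome.
history = [(v, 1 if v >= 50 else 0) for v in range(0, 100, 5)]

# Feature Engineering: rescale the raw value to the range [0, 1).
features = [(v / 100, y) for v, y in history]

train, test = train_test_split(features)
threshold = train_threshold(train)        # Define + Train Model
test_accuracy = accuracy(test, threshold)  # Score + Evaluate on held-out rows

print(f"threshold={threshold:.2f} test accuracy={test_accuracy:.2f}")
# If the test metric is unsatisfactory, revisit the features or the model and repeat.
```

Keeping the test rows out of training is the point of the split: the metric that drives the iteration is measured on data the model has not seen.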


                 Predicted 1                Predicted 0

Actual 1    506 TRUE POSITIVE (TP)    112 FALSE NEGATIVE (FN)

Actual 0    169 FALSE POSITIVE (FP)   420 TRUE NEGATIVE (TN)

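$\text{Accuracy} = \frac{TP + TN}{P + N}$, where $P = TP + FN$ and $N = FP + TN$

Sometimes a better model may have lower Accuracy!

$\text{Precision} = \frac{TP}{TP + FP}$

How many of the returned documents are correct

$\text{Recall} = \frac{TP}{TP + FN} = \frac{TP}{P}$

How many of the actual positives are correctly returned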
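As a worked check, the three formulas can be applied to the confusion-matrix counts on this slide (TP = 506, FN = 112, FP = 169, TN = 420). The `metrics` helper below is a hypothetical name, and returning 0.0 when a denominator is zero is a convention chosen for this sketch. The second pair of calls uses made-up counts for an imbalanced data set to illustrate the remark that a better model may have lower accuracy.

```python
# Illustrative sketch only: the helper name and the second example are assumptions.

def metrics(tp, fn, fp, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    p = tp + fn                      # all actual positives
    n = fp + tn                      # all actual negatives
    accuracy = (tp + tn) / (p + n)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # convention when nothing is returned
    recall = tp / p if p else 0.0
    return accuracy, precision, recall


# Counts taken from the slide.
acc, prec, rec = metrics(tp=506, fn=112, fp=169, tn=420)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
# accuracy=0.767 precision=0.750 recall=0.819

# Hypothetical imbalanced case: 10 positives among 1000 cases.
always_negative = metrics(tp=0, fn=10, fp=0, tn=990)   # never flags anything
detector = metrics(tp=8, fn=2, fp=30, tn=960)          # finds 8 of the 10 positives
print(always_negative[0], detector[0])  # 0.99 0.968
print(always_negative[2], detector[2])  # 0.0 0.8
```

The model that never flags a positive scores higher on accuracy (0.99 versus 0.968) yet has zero recall, which is why precision and recall are reported alongside accuracy.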

Demo 1: Building a model to predict Population Income


A Language Platform…

A Community…

Tools & Resources: http://www.rstudio.com/

Core R: http://cran.r-project.org/

R was ranked no. 1 in the KDnuggets 2014 poll on Top Languages for analytics, data mining, data science

Classification

Decision trees: rpart, party

Random forest: randomForest, party

SVM: e1071, kernlab

Neural networks: nnet, neuralnet, RSNNS

Performance evaluation: ROCR

Clustering

k-means: kmeans(), kmeansruns()

k-medoids: pam(), pamk()

Hierarchical clustering: hclust(), agnes(), diana()

DBSCAN: fpc

BIRCH: birch

Cluster validation: packages clv, clValid, NbClust

Association Rules

Association rules: apriori(), eclat() in package arules

Sequential patterns: arulesSequences

Visualisation of associations: arulesViz

Text Mining

Text mining: tm

Topic modelling: topicmodels, lda

Word cloud: wordcloud

Twitter data access: twitteR


Execute R Script

Create R Model


Demo 2: Building a model to predict Population Income in R and deploying it in Azure ML


Predictive Analysis

Usage files

Data Sources

Ingest & Pre-processing

Data Preparation (normalize, clean, etc.)

Analyze & Score (build Predictive Model)

Publish for Consumption

Consume

Cloud Storage

Batch Processing Engine

Machine Learning

Data Cleanup

Relational DW/DM

BI Tools

- Volume per move is typically ~100GB or

- Data is most commonly collected nightly or hourly

- Common pre-processing steps: scrub for compliance purposes & partition for long-term storage

- HDI & customer code used in this step as a transformation/cleaning tool

- E.g. enrich, normalize

ADF: Move Data, Orchestrate, Schedule & Monitor

- Generate BI-ready results (e.g. dims or facts, aggregated big data, etc.)

- Create result set to drive app or business process (e.g. list of customers likely to churn next month)

- In this scenario it is used as a queryable storage system for information workers and analysts to connect their BI tools to.

- BI Tools: Power BI, Tableau, etc.

- Apps here means any programmatic consumption

Business Apps

Customer Info

Real time