build your strategy and projections with azure machine learning (sergey poplavskiy technology...

Microsoft Azure Machine Learning Sergiy Poplavskiy DX, Microsoft Ukraine [email protected]

Upload: lviv-it-arena

Post on 07-Jan-2017



“I need our systems to think. I need them to learn and I need them to present issues and problems and anomalies to the employees, to the managers.”
Adam Coffey, President and CEO, WASH Laundry Systems

What is Machine Learning?
Computing systems that become smarter with experience
“Experience” = past data + human input


Microsoft & Machine Learning
Answering questions with experience

• 1991: Microsoft Research formed
• 1997: Hotmail launches. Which email is junk?
• 2008: Bing maps launches. What’s the best way home?
• 2009: Bing search launches. Which searches are most relevant?
• 2010: Kinect launches. What does that motion “mean”?
• 2014: Skype Translator launches. What is that person saying?
• 2015: Azure Machine Learning GA. What will happen next?

Machine learning is pervasive throughout Microsoft products.

How Are We Different? Enable custom predictive analytics solutions at the speed of the market

“The main benefit we have experienced is that everything is in one place. Data is stored in the same place that hosts computations on the data.”
Corey Coscioni, West Monroe

The old Machine Learning landscape
No improvement in generations

• Expensive: Huge set-up costs of tools, expertise, and compute/storage capacity
• Siloed data: Siloed and cumbersome data management restricts access to data
• Disconnected tools: Complex and fragmented tools limit participation in exploring data and building models
• Deployment complexity: Many models never achieve business value due to difficulties with deploying to production

Differentiation

Model Your Way [Data Scientist]
• All Skill Levels
• Business-tested Algorithms
• R & Python

Deploy in Minutes [Data Scientist, IT & Developers]
• One Click Deployment
• Manage via Cloud Portal
• Model accessed as a Web Service

Expand your Reach [Ecosystem & Developers]
• Product Gallery
• Azure Marketplace

The Data Science “Inside”

Accessibility

Azure Machine Learning Service
Data -> Predictive model -> Operational web API in minutes

• Data: Blobs and Tables, Hadoop (HDInsight), Relational DB (Azure SQL DB)
• ML STUDIO: Integrated development environment for Machine Learning
• API: Model is now a web service that is callable
• Clients
• Monetize the API through our marketplace
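To make "model as a callable web service" concrete, here is a minimal sketch of how a client might prepare a scoring request. The URL, API key, and column names are placeholders. The JSON layout ("Inputs", "input1", "ColumnNames", "Values", "GlobalParameters") and the bearer-token header are assumptions modeled on typical request-response samples, so check the API help page of your own published service before relying on them.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) an HTTP POST for a published scoring service.

    The payload layout is an assumption, not a documented contract.
    """
    payload = {
        "Inputs": {
            "input1": {
                "ColumnNames": list(column_names),
                # Values are sent as strings, one inner list per row.
                "Values": [[str(v) for v in row] for row in rows],
            }
        },
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint, key, and features (all hypothetical).
request = build_scoring_request(
    url="https://example.invalid/score",
    api_key="YOUR-API-KEY",
    column_names=["temperature", "hour"],
    rows=[[21.5, 8]],
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response; that step is left out here because the endpoint is a placeholder.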

Machine Learning in action
Competitive differentiation through accessibility

“We wanted to go beyond the industry standard of preventative maintenance, to offer predictive and even preemptive maintenance.”
Andreas Schierenbeck, CEO, ThyssenKrupp Elevator

Personalized offers
Retailer Pier 1 Imports wanted to offer a connected, personal experience both online and in store
“At Pier 1 Imports, we’ve embraced the cloud. It helps us operationalize technology quickly and react to our ever-evolving business needs.”
Andrew Laudato, Pier 1 Imports

Smart buildings
CMU wanted to use sensor data for more than reactive repair and diagnosis
“We see Azure ushering in an era of self-service predictive analytics for the masses. We can only imagine the possibilities.”
Bertrand Lasternas, Carnegie Mellon

What can Azure ML do for you…?

Social network analysis

Weather forecasting

Healthcare outcomes

Predictive maintenance

Targeted advertising

Natural resource exploration

Fraud detection

Telemetry data analysis

Buyer propensity models

Churn analysis

Life sciences research

Web app optimization

Network intrusion detection

Smart meter monitoring

Azure Machine Learning Algorithms

What makes for a good ML problem?

Labeled* examples
• Customer churn: records of current customers (loyal) + former customers who left (churn)
• Fraud detection: examples of fraud and not fraud

Relevant features
• Customer information: age, sex, zip code, historic spending patterns, etc.
• Transaction information: amount, previous transactions, customer information, etc.

Uncertainty/error tolerance
• Identifying some customers who are leaving is better than current system
• Automated fraud identification still verified by human user

Why are labels important?

Supervised learning examples
• This customer will like House of Cards
• This network traffic indicates a denial of service attack

Unsupervised learning examples
• These customers are similar
• This network traffic is unusual

Most commonly used machine learning algorithms are supervised (they require labels)
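The difference can be sketched in a few lines of plain Python. The supervised half learns from labeled examples (a nearest-centroid rule, chosen here only for brevity); the unsupervised half flags unusual values with no labels at all (a z-score rule). The data, function names, and threshold are made up for illustration.

```python
from statistics import mean, pstdev


def fit_nearest_centroid(values, labels):
    """Supervised: learn one centroid per label from labeled examples."""
    return {
        label: mean(v for v, l in zip(values, labels) if l == label)
        for label in set(labels)
    }


def predict_label(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def find_unusual(values, z_threshold=2.0):
    """Unsupervised: flag values far from the mean; no labels are used."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if sigma > 0 and abs(v - mu) / sigma > z_threshold]


# Labeled data: monthly spend of customers who churned or stayed (hypothetical).
spend = [20, 25, 30, 90, 100, 110]
labels = ["churn", "churn", "churn", "loyal", "loyal", "loyal"]
centroids = fit_nearest_centroid(spend, labels)
prediction = predict_label(centroids, 95)

# Unlabeled data: requests per second on a network link (hypothetical).
traffic = [10, 11, 9, 10, 12, 10, 11, 50]
unusual = find_unusual(traffic)
```

The supervised model can answer "which label?" only because someone supplied the labels; the unsupervised rule can only say "this looks different from the rest".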

Regression versus Classification

Regression problems
• Estimate household power consumption
• Estimate customer’s income

Classification problems
• Power station will/will not meet demand
• Customer will respond to advertising

Does your customer want to predict/estimate a number (regression) or apply a label/categorize (classification)?

Binary versus Multiclass Classification

Binary examples
• click prediction
• yes/no
• over/under
• win/loss

Multiclass examples
• kind of tree
• kind of network attack
• type of heart disease

Does your customer want a yes/no answer?

Review
• Is your customer’s problem suitable for machine learning? Lots of data, lots of features, uncertainty is ok
• Does your customer want to predict something? Labels are essential
• Does your customer want a number or a category?

Machine Learning Algorithms

Bottom Line: Most algorithms can be applied to a variety of problems

Algorithm | Binary Classification in Azure ML | Multiclass Classification in Azure ML | Regression in Azure ML
Logistic Regression | Two-class logistic regression | Multiclass logistic regression | -
Linear Regression | - | - | Linear regression
Support Vector Machine | Two-class support vector machine | One-vs-all + support vector machine | -
Decision Tree | Two-class boosted decision tree | One-vs-all + boosted decision tree | Boosted decision tree regression
Neural Network | Two-class neural network | Multiclass neural network | Neural network regression
Random Forest | Two-class decision forest | Multiclass decision forest | Decision forest regression
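The "One-vs-all" entries refer to a general reduction: train one two-class model per label (that label versus everything else) and pick the label whose model scores highest. A minimal sketch follows; the toy two-class learner (a centroid-distance scorer) and the tree data are stand-ins chosen for brevity, not what Azure ML uses internally.

```python
def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_binary(points, is_positive):
    """Toy two-class learner: returns a scorer, higher means 'more positive'."""
    pos = mean_point([p for p, flag in zip(points, is_positive) if flag])
    neg = mean_point([p for p, flag in zip(points, is_positive) if not flag])
    return lambda x: sq_dist(x, neg) - sq_dist(x, pos)


def train_one_vs_all(points, labels):
    """Train one binary scorer per label: that label versus all the others."""
    return {
        label: train_binary(points, [l == label for l in labels])
        for label in sorted(set(labels))
    }


def predict_one_vs_all(scorers, x):
    """Pick the label whose binary scorer is most confident."""
    return max(scorers, key=lambda label: scorers[label](x))


# Hypothetical "kind of tree" data: two measurements per tree.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (9, 0), (10, 1), (0, 10), (1, 10), (0, 9)]
labels = ["oak"] * 3 + ["pine"] * 3 + ["birch"] * 3
scorers = train_one_vs_all(points, labels)
```

Any two-class learner that produces a score can be plugged in where `train_binary` is used, which is why the table pairs one-vs-all with both support vector machines and boosted decision trees.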

Linear Regression
What it does: Predicts a numeric value, y, based on your input variables
How it does it: Fits a linear function to your data to minimize the error across all training data
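For a single input variable, the fit has a closed form (ordinary least squares). The sketch below assumes one feature and uses made-up numbers; it is not the bike share dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: temperature (x) versus bikes rented in an hour (y).
temps = [10, 15, 20, 25, 30]
rentals = [120, 170, 220, 270, 320]
slope, intercept = fit_line(temps, rentals)
predicted = slope * 22 + intercept  # estimate for a 22-degree hour
```

With several input variables the same idea applies, but the solution is computed with matrix algebra or gradient descent rather than these two sums.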

Bike share demand prediction
How many bikes will be used in a particular hour?
Data:
• 13 features including weather, day of the week, and temperature
• 17,000+ examples
• Most data is numeric, e.g., temperature
Source: UCI repository and Capital Bikeshare

Model in Azure ML

Support Vector Machine (Classification)
What it does: Predicts binary class label Y based on input variables X
How it does it: Finds the line/hyperplane that optimally separates the classes to minimize classification error; the support vectors are the training points closest to that boundary
Note: It is often not possible to find a separating line without classification errors
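One common way to train a linear SVM is sub-gradient descent on the hinge loss with a small L2 penalty. The sketch below uses that approach as an assumption (Azure ML's actual solver is not described in the slides), with hypothetical hyperparameters and toy data.

```python
def train_linear_svm(points, labels, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM via sub-gradient descent on hinge loss. Labels are +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is misclassified or inside the margin: move the boundary.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is safely classified: only apply the L2 shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def svm_predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable data.
points = [(2, 2), (3, 3), (-2, -2), (-3, -3)]
labels = [1, 1, -1, -1]
w, b = train_linear_svm(points, labels)
train_predictions = [svm_predict(w, b, p) for p in points]
```

When the classes overlap, the same loop still runs; the hinge loss simply tolerates some points on the wrong side, which is the situation the note above describes.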

Hospital readmission
Which patients treated for diabetes related illness will be readmitted within 30 days?
Data:
• 35 features including age, gender, race, test results, treatment, etc.
• 100,000+ samples
• Contains several categorical features
Source: UCI repository and Virginia Commonwealth University

Model in Azure ML

Decision Trees - Classification
What it does: Predicts class label Y based on input variables X
How it does it: Uses a series of decision rules in a binary tree where each leaf is a class label. Rules are determined by optimizing to reduce error
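A small recursive sketch of that idea: at each node, try every threshold on every feature, keep the rule that lowers the weighted Gini impurity the most, and stop when no rule helps or a depth limit is reached. Gini impurity is one common choice of "error" and is an assumption here; the churn data is made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when all labels in the group agree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))


def classify(tree, row):
    """Follow the decision rules from the root down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical customers: (age, monthly spend) with string labels.
rows = [(25, 20), (30, 25), (45, 22), (50, 90), (35, 100), (28, 95)]
labels = ["churn", "churn", "churn", "loyal", "loyal", "loyal"]
tree = build_tree(rows, labels)
```

On this data the tree needs a single rule on monthly spend, which illustrates why decision trees are easy to read back as plain-language rules.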

Decision Trees – Regression
What it does: Predicts scalar y based on input variables X
How it does it: Uses a series of decision rules in a binary tree where each leaf is a number. Rules are determined by optimizing to reduce error
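The regression variant can be sketched the same way, with two changes: the "error" being reduced is the sum of squared errors around each group's mean (an assumed but common choice), and each leaf stores that mean as its prediction. The household data is invented for illustration.

```python
def sse(values):
    """Sum of squared errors around the mean of the group."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def build_regression_tree(rows, targets, depth=0, max_depth=2):
    """Internal nodes are (feature, threshold, left, right); leaves are numbers."""
    best, best_err = None, sse(targets)
    if depth < max_depth:
        for f in range(len(rows[0])):
            values = sorted(set(r[f] for r in rows))
            for lo, hi in zip(values, values[1:]):
                t = (lo + hi) / 2
                left = [y for r, y in zip(rows, targets) if r[f] <= t]
                right = [y for r, y in zip(rows, targets) if r[f] > t]
                err = sse(left) + sse(right)
                if err < best_err:
                    best, best_err = (f, t), err
    if best is None:
        return sum(targets) / len(targets)  # leaf: the mean target value
    f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_regression_tree([rows[i] for i in li], [targets[i] for i in li],
                                  depth + 1, max_depth),
            build_regression_tree([rows[i] for i in ri], [targets[i] for i in ri],
                                  depth + 1, max_depth))


def predict_value(tree, row):
    """Follow the decision rules from the root down to a numeric leaf."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical data: household size versus daily power consumption (kWh).
rows = [(1,), (2,), (3,), (6,), (7,), (8,)]
targets = [10.0, 10.0, 10.0, 30.0, 30.0, 30.0]
reg_tree = build_regression_tree(rows, targets)
```

Because every leaf is a constant, a single regression tree predicts a step function; that coarseness is part of why the ensembles discussed later tend to be more accurate.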

Which algorithm should you pick?
Depends on your problem – what does your data look like?
• Logistic regression, support vector machines are good for linear problems
• Decision trees are intuitive and easy to interpret
• Decision trees naturally handle categorical data, data scaling not required
• Support vector machines are effective for high dimensional data
• Basic decision trees are not as accurate…

Advanced decision trees: Boosted Trees
What it does: Predicts class label (classification) or value (regression) Y based on variables X
How it does it: Uses a series of many trees to fit data. Each successive tree is built to reduce the estimated error of the previous trees, resulting in increasingly accurate models
Caution: If the number of trees is too large, the model can fit noise instead of good data, known as overfitting
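A minimal sketch of boosting for regression, assuming squared error and one-split trees ("stumps"): each round fits a stump to the current residuals and adds a damped version of it to the ensemble. The learning rate, round counts, and data are illustrative choices, not Azure ML defaults.

```python
def fit_stump(xs, residuals):
    """Best one-split regression tree on a single numeric feature."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_trees, learning_rate=0.5):
    """Each new stump is fit to what the current ensemble still gets wrong."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical curved data that a single stump cannot fit well.
xs = list(range(10))
ys = [x * x for x in xs]
error_1_tree = mse(boost(xs, ys, 1), xs, ys)
error_20_trees = mse(boost(xs, ys, 20), xs, ys)
```

Training error keeps falling as trees are added, which is exactly why the caution matters: the number of trees should be chosen by checking error on data the model has not seen, not on the training set.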

Advanced decision trees: Random Forest
What it does: Predicts class label (classification) or value (regression) Y based on variables X
How it does it: Builds many different trees using different random inputs (training data and features), then aggregates those trees to determine the final prediction. If trees are independent, having more trees does not result in overfitting
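The two sources of randomness can be sketched as follows: each tree sees a bootstrap sample of the rows and a random subset of the features, and the forest predicts by majority vote. To keep the example short, each "tree" is a single split; the tree count, seed, and data are arbitrary illustrative values.

```python
import random
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(rows, labels, features):
    """One-split classification tree restricted to the given feature indices."""
    majority = Counter(labels).most_common(1)[0][0]
    best_score, best_rule = gini(labels), None
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best_score = score
                best_rule = (f, t,
                             Counter(left).most_common(1)[0][0],
                             Counter(right).most_common(1)[0][0])
    if best_rule is None:
        return lambda row: majority  # no useful split: always predict the majority
    f, t, left_label, right_label = best_rule
    return lambda row: left_label if row[f] <= t else right_label


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap sample of rows
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def forest_predict(forest, row):
    """Aggregate the trees by majority vote."""
    return Counter(tree(row) for tree in forest).most_common(1)[0][0]


# Hypothetical two-feature data with two well-separated classes.
rows = [(i * 0.1, i * 0.1) for i in range(10)] + [(5 + i * 0.1, 5 + i * 0.1) for i in range(10)]
labels = [0] * 10 + [1] * 10
forest = fit_forest(rows, labels)
```

For regression, the vote would be replaced by an average of the trees' numeric predictions.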

Advantages to each set of algorithms

Linear algorithms (SVM, logistic regression)
• easy to understand/interpret results
• computationally efficient to train and score, even for large datasets
• difficult to overfit data

Nonlinear (boosted decision tree, random forest)
• can find irregular boundaries
• natively handles categorical features
• can achieve superior accuracy for many problems

Neural Networks
What it does: Predicts class label (classification) or value (regression) Y based on variables X
How it does it: Is inspired by biological systems, i.e., neurons in the brain
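The basic building block is an artificial neuron: a weighted sum of inputs passed through an activation function. The sketch below wires three such neurons into a tiny two-layer network with hand-picked weights (no training involved) that computes XOR, a function no single linear model can represent. The weights and the step activation are illustrative assumptions.

```python
def step(z):
    """Step activation: the neuron 'fires' (1) only when its input is positive."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def tiny_network(x1, x2):
    """Two hidden neurons (OR, AND) feeding one output neuron: computes XOR."""
    h_or = neuron((x1, x2), (1.0, 1.0), -0.5)    # fires if at least one input is 1
    h_and = neuron((x1, x2), (1.0, 1.0), -1.5)   # fires only if both inputs are 1
    return neuron((h_or, h_and), (1.0, -1.0), -0.5)  # OR but not AND


outputs = [tiny_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

In practice the weights are learned from data rather than set by hand, and smooth activations replace the step function so that training by gradient descent is possible.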

Final thoughts on algorithm selection
Do what data scientists do: try multiple algorithms and select the best model for your data
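In code, "try multiple algorithms" usually means fitting each candidate on training data and comparing them on held-out data. The sketch below does that for three deliberately simple candidates; the candidate set, the even/odd split, and the data are all illustrative assumptions.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the average training target."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Least-squares line through the training data."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_nearest_neighbor(xs, ys):
    """Predict the target of the closest training example."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def validation_mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_best(candidates, train, valid):
    """Fit every candidate on train, score on valid, return the winner."""
    scores = {name: validation_mse(fit(*train), *valid)
              for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical data with a linear trend; even rows train, odd rows validate.
xs = list(range(20))
ys = [3 * x + 2 for x in xs]
train = (xs[0::2], ys[0::2])
valid = (xs[1::2], ys[1::2])
candidates = {
    "mean": fit_mean,
    "linear": fit_linear,
    "nearest neighbor": fit_nearest_neighbor,
}
best_name, scores = select_best(candidates, train, valid)
```

Scoring on data the models were not trained on is the important part: comparing training error alone would favor whichever model memorizes best.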

Model Your Way: Open source/our source
• Script with R, SQLite or Python
• CPython 2.7 support from inside AML Studio
• numpy/scipy/pandas/scikit-learn/etc.
• Anaconda distro pre-installed

Python client library
• Analyze data using Python and its libraries
• Use IPython, PTVS, Eclipse to edit/debug

Big learning with counts
• TB scale datasets
• Modular: tune/monitor/replace in isolation
• Monitorable and debuggable

Deploy in Minutes
• One click to production
• Publish as a Web Service or to Gallery
• Continuous updates to streamline process
• Stay tuned to our blog for more

New in-product Gallery
• Discover what others have built
• Learn by dropping these into your workspace
• Share your work with others

Expand your Reach

Get started for free with only a Microsoft Account ID

azure.com/ml

Check out finished APIs and solutions or put your own on the Machine Learning marketplace at datamarket.azure.com
Find videos and tutorials under Documentation at azure.com/ml
Stay tuned to our blog https://aka.ms/mlblog

© 2015 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.