Azure Machine Learning Intro

+49 (0) 69 24 24 08 00 [email protected] http://www.daenet.de/ #AzureML [Olivia Klose] [Damir Dobric]

Upload: damir-dobric

Post on 15-Jul-2015



+49 (0) 89 3176 6796

[email protected]

http://www.azure.com/

http://www.aka.ms/oliviaklose

http://www.twitter.com/oliviaklose

Software Engineering WS 2014-2015

Big Data & Machine Learning (Olivia Klose)

Session dates: 13 Oct, 20 Oct, 27 Oct, 03 Nov, 10 Nov, 17 Nov, 24 Nov, 01 Dec, 08 Dec, 15 Dec, 22 Dec, 29 Dec, 12 Jan, 19 Jan, 26 Jan, 02 Feb, 09 Feb

From 15:00 to 18:00 h, Microsoft Office, Bad Homburg

https://www.yammer.com/uni-se-ws-2014-2015

Azure Machine Learning

Damir Dobric

[email protected]

[email protected]

Olivia Klose

[email protected]


What is Machine Learning?

Finding patterns in data

Replacing human-written code with supplied data

Why Learn?

1. Learn it when you can’t code it (e.g. recognizing speech/images/gestures)

2. Learn it when you can’t scale it (e.g. recommendations, spam & fraud detection)

3. Learn it when you have to adapt/personalize (e.g. predictive typing)

4. Learn it when you can’t track it (e.g. AI gaming, robot control)

Can Machine Learning Help Me?

Yes

• Automated prediction is core

• Lots of past data already available

• Magic numbers in current prediction system

No

• Prediction is small part of experience

• No past data available

• Many business rules govern the experience

• Predictions do not have a predictable pattern

Learning Problem: find ‘f’

Finding patterns in data

X Y

1 FALSE

9 TRUE

6 TRUE

4 FALSE

8 TRUE

2 FALSE

7 TRUE

3 FALSE

4 FALSE

5 FALSE

10 TRUE

15 TRUE

12 TRUE

X Y

1 FALSE

2 FALSE

3 FALSE

4 FALSE

4 FALSE

5 FALSE

6 TRUE

7 TRUE

8 TRUE

9 TRUE

10 TRUE

12 TRUE

15 TRUE

Writing code


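public bool Y(int feature) {
    // Hand-written rule: values of X above 5 map to TRUE.
    if (feature > 5) return true;
    else return false;
}

// The same rule, written more concisely.
public bool Y(int feature) {
    return feature > 5;
}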

Supplying data

X     Y
1     FALSE
2     FALSE
3     FALSE
4     FALSE
4     FALSE
5     FALSE
6     TRUE
7     TRUE
8     TRUE
9     TRUE
10    TRUE
12    TRUE
15    TRUE
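The idea of supplying data instead of writing the rule by hand can be sketched in a few lines of Python. This is only a minimal illustration, not how Azure ML trains a model: the function names `learn_threshold` and `predict` are made up for this example, and the learner assumes the rule has the form `x > t`.

```python
# Training data from the slide: feature X and label Y.
DATA = [
    (1, False), (2, False), (3, False), (4, False), (4, False), (5, False),
    (6, True), (7, True), (8, True), (9, True), (10, True), (12, True),
    (15, True),
]


def learn_threshold(samples):
    """Return the threshold t for which the rule `x > t` makes the fewest errors.

    Candidate thresholds are the observed feature values themselves.
    """
    best_t, best_errors = None, None
    for t in sorted({x for x, _ in samples}):
        # Count the samples where the rule disagrees with the label.
        errors = sum((x > t) != y for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


threshold = learn_threshold(DATA)


def predict(x):
    """Apply the learned rule to a new feature value."""
    return x > threshold


print(threshold)    # 5
print(predict(11))  # True
```

The learned threshold matches the hand-written `feature > 5`, but here it comes from the table rather than from the programmer.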

Decisions

Binary classification: true/false, male/female, high/low, black/white

Multi-class classification: {1, 2, 3, 4}, {A, B, C, D}, {0.1 €, 0.5 €, 1 €, 2 €}

Regression: 1.0 to 100.00, any real value.

Machine Learning approaches

Supervised learning: predicting the future

• learn from past examples to predict future

Unsupervised learning: understanding the past

• making sense of data

• learning structure of data

• compressing data for consumption

Supervised vs. Unsupervised

[Figure: two plots with axes x1 and x2, one labeled Supervised Learning and one labeled Unsupervised Learning]

Supervised Learning

• Classification

• Regression

• Recommendation

• Anomaly Detection

Unsupervised Learning

• Clustering

• Principal Component Analysis
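Clustering, the first unsupervised technique listed above, works without any labels. The following is a rough sketch of a two-cluster k-means loop on one-dimensional data. The function name `two_means_1d`, the iteration count and the sample values are all assumptions chosen for illustration.

```python
def two_means_1d(values, iterations=10):
    """Split non-empty 1-D data into two clusters with a basic k-means loop (k = 2)."""
    low, high = min(values), max(values)  # initial cluster centers
    cluster_low, cluster_high = list(values), []
    for _ in range(iterations):
        # Assignment step: each value joins the nearer center.
        cluster_low = [v for v in values if abs(v - low) <= abs(v - high)]
        cluster_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not cluster_high:  # all values identical, nothing to split
            break
        # Update step: move each center to the mean of its cluster.
        low = sum(cluster_low) / len(cluster_low)
        high = sum(cluster_high) / len(cluster_high)
    return cluster_low, cluster_high


# Hypothetical unlabeled measurements with two visible groups.
groups = two_means_1d([1.0, 1.5, 2.0, 8.0, 8.5, 9.0])
print(groups)
```

No TRUE/FALSE column is supplied here: the grouping comes from the structure of the data alone.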

Supervised vs. Unsupervised

Supervised or Unsupervised? Classification or Regression?

You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.

Given email labeled as spam/not spam, learn a spam filter.

Given a set of news articles found on the web, group them into sets of articles about the same story.

Supervised or Unsupervised? Classification or Regression?

You’d like software to examine individual customer accounts, and for each account decide if it has been hacked/compromised.

Given a database of customer data, automatically discover market segments and group customers into different market segments.

Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.

Confusion Matrix

                   Truth: true       Truth: false
Guess: positive    true positive     false positive
Guess: negative    false negative    true negative

\[ \text{precision} = \frac{tp}{tp + fp} \]

\[ \text{recall} = \frac{tp}{tp + fn} \]

\[ \text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn} \]
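As a small worked example, precision, recall and accuracy can be computed from a list of true labels and a list of guesses. The helper names and the ten sample labels below are made up for illustration. Returning 0.0 when a denominator is zero is a convention chosen here, not something the slides prescribe.

```python
def confusion_counts(truth, guess):
    """Count tp, fp, fn, tn for two equally long lists of booleans."""
    tp = fp = fn = tn = 0
    for t, g in zip(truth, guess):
        if g and t:
            tp += 1
        elif g and not t:
            fp += 1
        elif not g and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Hypothetical labels: 4 true cases and 6 false cases.
truth = [True, True, True, True, False, False, False, False, False, False]
guess = [True, True, False, False, True, False, False, False, False, False]

tp, fp, fn, tn = confusion_counts(truth, guess)
precision = safe_ratio(tp, tp + fp)
recall = safe_ratio(tp, tp + fn)
accuracy = safe_ratio(tp + tn, tp + tn + fp + fn)

print(tp, fp, fn, tn)  # 2 1 2 5
print(precision, recall, accuracy)
```

Here accuracy (0.7) looks better than recall (0.5) because most cases are true negatives, which is why the three measures are reported together.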

Good to know

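Mean: the sum of all values divided by the number of values.

Median: separates the higher half of a data set from the lower half:

{3, 3, 5, 9, 11} => 5

{3, 5, 7, 9} => (5 + 7) / 2 = 6

Variance: the average of the squared deviations from the mean.

Standard deviation: the square root of the variance.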
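These quantities can be checked with Python's standard `statistics` module, using the two sets from the median example. The sketch assumes the population form of the variance (dividing by n). The slide does not say whether the population or the sample form (dividing by n - 1) is meant.

```python
import statistics

values = [3, 3, 5, 9, 11]

mean = statistics.mean(values)                # (3 + 3 + 5 + 9 + 11) / 5
median_odd = statistics.median(values)        # middle value of the sorted data
median_even = statistics.median([3, 5, 7, 9])  # average of the two middle values

# Population variance: average squared deviation from the mean (an assumption,
# see the note above); the standard deviation is its square root.
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)

print(mean, median_odd, median_even)
print(variance, std_dev)
```

For the sample form, `statistics.variance` and `statistics.stdev` can be used instead.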

Delivering Advanced Analytics

Data to model to web services in minutes

Process: Business challenge -> Modeling -> Deployment -> Business value

Data

• Cloud: HDInsight, SQL Server VM, SQL DB, Blobs & Tables

• Local: desktop files, Excel spreadsheets, other data files on PC

Microsoft Azure Machine Learning

• Storage space

• ML Studio: integrated development environment for Machine Learning (http://studio.azureml.net)

• API: the model is now a web service; monetize this API

Clients

• Web, devices, applications, dashboards

• Business users access results from anywhere, on any device

Azure Machine Learning

Why Azure ML - Recap

Model Evaluation

• Easy to test and compare different models. (+)

• Powerful toolset to deal with data. (+)

• Poor integration of custom tooling, with the exception of R modules. (-)

Web Service API

• Results can be easily published as a web service. (+)

• Easy to consume a model as a REST web service API. (+)

• Cannot download as API. (-)

• Right now no support for CORS. (-)
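Consuming a published model over REST could look roughly like the sketch below, which builds the HTTP request without sending it. The endpoint URL, the API key, the input name `input1` and the JSON payload layout are all assumptions for illustration. The exact schema should be taken from the API help page of the published web service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a published scoring endpoint.

    The payload layout is an assumed request/response shape, not a verified
    schema for any particular service.
    """
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder endpoint and key: replace with the values shown for your service.
request = build_scoring_request(
    "https://example.invalid/execute?api-version=2.0",
    "YOUR_API_KEY",
    ["X"],
    [["7"]],
)

# To actually call the service:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Because of the missing CORS support noted above, a call like this would typically be made from server-side code rather than directly from a browser page.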

http://aka.ms/azureml-mva

http://aka.ms/DataScienceReport

References

Machine Learning

• Yaser S. Abu-Mostafa, California Institute of Technology (Caltech): https://work.caltech.edu/

• Caltech ML course: http://www.youtube.com/watch?v=MEG35RDD7RA&list=PLD63A284B7615313A

• Stanford video or Coursera course

Azure ML

• Intro, Damir Dobric: Episode I, Episode II

• ML Blog: http://blogs.technet.com/b/machinelearning/

• Azure ML Getting Started video

DreamSpark

The latest Microsoft software free of charge for developers, IT professionals & designers. Visual Studio, SQL Server, Windows Server and much more.

www.dreamspark.de

Student Partners

The support program for students who are enthusiastic about technology. Become a Student Partner at your university!

www.studentpartners.de

Imagine Cup

Submit your app and secure cool prizes and the trip to the international finals in Seattle!

www.imaginecup.de

www.techstudent.de // facebook.com/MSFTImagine.DE

Imagine Cup 2015

Whether for Windows, Windows Phone or based on Microsoft Azure: submit your project idea to Imagine Cup 2015 and take the chance to win the German final and take part in the worldwide Imagine Cup Finals in Seattle.

• Develop an exciting game for the Microsoft platform

• Build an application that impresses with its degree of innovation

• Program an app that improves people's lives

Get yourselves:

1. lots of experience and fun

2. a valuable network

3. comprehensive know-how and experienced mentors

4. a trip to Seattle

5. great prizes and prize money of up to 50,000 USD

6. start-up funding and a successful start-up career

All information at: www.imaginecup.de

DEADLINE FOR THE NATIONAL FINAL: 31 March 2015

DreamSpark

The latest Microsoft tools for developers, designers and IT professionals, free of charge!

www.dreamspark.de

DreamSpark Standard

For all departments at schools, universities, vocational schools and vocational colleges.

Visual Studio Professional, Windows Server, Windows Embedded, SQL Server, XNA Game Studio, Expression Studio, Microsoft Mathematics, Kinect for Windows SDK and much more.

Free developer account for app developers.

Download via www.dreamspark.com or the online store of the educational institution.

DreamSpark Premium

For STEM departments at universities, vocational schools and vocational colleges.

Visual Studio Ultimate, Windows, Windows Server, Windows Embedded, SQL Server, Exchange Server, SharePoint Server, Visio, Project, Access, OneNote and much more.

Free developer account for app developers.

Download via www.dreamspark.com or the online store of the educational institution.
