adam filion application engineer mathworks€¦ · hadoop. java.net. matlab. compiler. matlab....

27
1 © 2015 The MathWorks, Inc. Data Analytics with MATLAB Adam Filion Application Engineer MathWorks

Upload: others

Post on 30-Aug-2020

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

1© 2015 The MathWorks, Inc.

Data Analytics with MATLAB

Adam FilionApplication EngineerMathWorks

Page 2: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

2

Goal:– Implement a tool for easy and accurate computation of day-

ahead system load forecast

Requirements:– Acquire and clean data from

multiple sources– Accurate predictive model– Easily deploy to production

environment

Case Study: Day-Ahead Load Forecasting

Page 3: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

3

Challenges with Data Analytics

Aggregating data from multiple sources

Cleaning data

Choosing a model

Moving to production

Page 4: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

4

NYISO Energy Load Data

Page 5: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

5

Techniques to Handle Missing Data

List-wise deletion– Unbiased estimates – Reduces sample size

Implementation options– Built in to many

MATLAB functions– Manual filtering

Page 6: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

6

Techniques to Handle Missing Data

Substitution – replace missing data points with a reasonable approximation

Easy to model

Too important to exclude

Page 7: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

7

Merge Different Sets of Data

Join along a common axis

Popular Joins:– Inner– Full Outer– Left Outer– Right Outer

Inner Join

Full Outer Join

Left Outer Join

Page 8: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

8

Full Outer Join

X Y Z1 0.1 0.23 0.3 0.45 0.5 0.67 0.7 0.8

Key B Y Z134579

First Data Set

A B1 1.14 1.47 1.79 1.9

Second Data Set

Key

Key

1.1

1.4

1.7

1.9

0.1

0.3

0.7

0.5

0.2

0.4

0.8

0.6

NaN

NaN

NaN

NaN

NaN

NaN

Joined Data Set

Page 9: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

9

Learn More: Big Data with MATLAB

www.mathworks.com/discovery/big-data-matlab.htmlwww.mathworks.com/discovery/matlab-mapreduce-hadoop.html

Page 10: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

10

Challenges with Data Analytics

Aggregating data from multiple sources

Cleaning data

Choosing a model

Moving to production

Page 11: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

11

Machine LearningCharacteristics and Examples

Characteristics– Lots of variables– System too complex to know

the governing equation(e.g., black-box modeling)

Examples– Pattern recognition (speech, images)

– Financial algorithms (credit scoring, algo trading)

– Energy forecasting (load, price)

– Biology (tumor detection, drug discovery)

Page 12: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

12

Overview – Machine Learning

MachineLearning

SupervisedLearning

Classification

Regression

UnsupervisedLearning Clustering

Group and interpretdata based only

on input data

Develop predictivemodel based on bothinput and output data

Type of Learning Categories of Algorithms

Page 13: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

13

Supervised Learning

Regression

Non-linear Reg.(GLM, Logistic)

LinearRegressionDecision Trees Ensemble

MethodsNeural

Networks

Classification

NearestNeighbor

Discriminant Analysis Naive BayesSupport Vector

Machines

Page 14: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

14

Unsupervised Learning

Clustering

k-Means,Fuzzy C-Means

Hierarchical

Neural Networks

GaussianMixture

Hidden Markov Model

Page 15: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

16

Learn More: Machine Learning with MATLAB

mathworks.com/machine-learning

Page 16: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

17

Challenges with Data Analytics

Aggregating data from multiple sources

Cleaning data

Choosing a model

Moving to production

Page 17: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

18

Deployment Highlights

Share with others who may not have MATLAB

Royalty-free deployment

Encryption to protect your intellectual property

Application Servers

Database Servers

Web Applications

Client Applications

SpreadsheetsHadoop / Big Data

Desktop Applications

Batch/Cron Jobs

Page 18: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

19

MATLAB

MATLABCompiler SDK

C/C++ExcelAdd-in JavaHadoop .NET

MATLABCompiler

MATLABProduction

ServerStandaloneApplication

Deploying Applications with MATLAB

MATLAB Compiler for sharing MATLAB programs without integration programming

MATLAB Compiler SDK provides implementation and platform flexibility for software developers

MATLAB Production Server provides the most efficient development path for secure and scalable web and enterprise applications

Page 19: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

20

Application Author

End User

1

2

Sharing Standalone Applications

MATLAB

ExcelAdd-in Hadoop

StandaloneApplication

Toolboxes

MATLAB Compiler

MATLABRuntime3

Page 20: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

21

1

2

Integrating MATLAB-based Components

MATLABToolboxes

MATLABRuntime

C/C++ Java .NETMATLAB

ProductionServer

MATLAB Compiler SDK

Application Author

Software Developer

43

Application author and software developer might be same person

Page 21: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

22

MATLAB Production Server

Directly deploy MATLAB analytic programs into production– Centrally manage multiple MATLAB programs & MCR versions– Automatically deploy updates without server restarts

Scalable & reliable– Service large numbers of concurrent requests– Add capacity or redundancy with additional servers

Use with web, database & application servers– Lightweight client library isolates MATLAB processing– Access MATLAB programs using native data types– Integrates with Java, .NET, C and Python

MATLAB Production Server(s)

HTMLXML

Java ScriptWeb

Server(s)

Page 22: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

23

MATLABDesktop

Deployed AnalyticsMATLAB Production Server

MATLABProduction

Server

WebApplication

ServerMATLAB

Production Server

Req

uest

Bro

ker

CTF

Apache Tomcat

Web Server/Webservice

Weather Data

Energy Data

Predictive Models

Train in MATLAB

Page 23: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

24

Learn More: Application Deployment with MATLAB

www.mathworks.com/solutions/desktop-web-deployment/

Page 24: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

25

Learn More: MATLAB Application Deployment

Also … www.mathworks.com/solutions/desktop-web-deployment/

Page 25: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

26

Data Analytics Products

Develop Predictive Models

Access and Explore Data Preprocess Data

Integrate Analytics with Systems

MATLAB

MATLAB Production Server

Statistics and Machine Learning ToolboxDatabase Toolbox

Neural Network ToolboxData Acquisition Toolbox

Image Processing Toolbox

Signal Processing Toolbox Computer Vision System Toolbox

Curve Fitting Toolbox

MATLAB Compiler

MATLAB Compiler SDK

Parallel Computing Toolbox, MATLAB Distributed Computing Server

Mapping Toolbox

Image Acquisition Toolbox

OPC Toolbox

Econometrics Toolbox Used in today’s demo

Additional Data Analytics products

Page 26: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

27

Key Takeaways

Data preparation can be a big job; leverage built-in MATLAB tools and spend more time on the analysis

Rapidly iterate through different predictive models, and find the one that’s best for your application

Leverage parallel computing to scale-up your analysis to large datasets

Eliminate the need to recode by deploying your MATLAB algorithms into production

Page 27: Adam Filion Application Engineer MathWorks€¦ · Hadoop. Java.NET. MATLAB. Compiler. MATLAB. Production. Server. Standalone. Application. Deploying Applications with MATLAB. MATLAB

28© 2015 The MathWorks, Inc.