big data and machine learning using matlab data and machine learning using matlab seth deland &...
TRANSCRIPT
![Page 1: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/1.jpg)
1© 2015 The MathWorks, Inc.
Big Data and Machine Learning
Using MATLAB
Seth DeLand & Amit Doshi
MathWorks
![Page 2: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/2.jpg)
2
Data Analytics
Turn large volumes of complex data into actionable information
source: Gartner
![Page 3: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/3.jpg)
3
Customer Example: Gas Natural Fenosa
Energy Production Optimization
Opportunity
• Allocate demand among power plants to minimize
generation costs
Analytics Use
• Data: Central database for historical power consumption
and price data, weather forecasts, and parameters for each
power plant
• Machine Learning: Develop price simulation scenarios
• Optimization: minimize production cost
Benefit
• Reduced generation costs
• White-box solution for optimizing power generation
User Story
![Page 4: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/4.jpg)
4
Prescriptive Analytics
Predictive Analytics
Unit CommitmentPredictive and Prescriptive Analytics
Schedule
Generator
Parameters
Unit
Commitment
Load Forecast
Historical
Weather Data
Historical
Load Data
![Page 5: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/5.jpg)
5
Big Data Analytics Workflow
Integrate Analytics with
Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
![Page 6: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/6.jpg)
6
Example: Working with Big Data in MATLAB
▪ Objective: Create a model to predict the cost of a taxi ride in New York City
▪ Inputs:
– Monthly taxi ride log files
– The local data set is small (~20 MB)
– The full data set is big (~21 GB)
▪ Approach:
– Access Data
– Preprocess and explore data
– Develop and validate predictive model (linear fit)
▪ Work with subset of data for prototyping and then run on spark enabled hadoop with full data
– Integrate analytics into a webapp
![Page 7: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/7.jpg)
7
Example: Working with Big Data in MATLAB
![Page 8: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/8.jpg)
8
Demo: Taxi Fare Predictor Web App
![Page 9: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/9.jpg)
9
Big Data Analytics Workflow: Data Access and Pre-process
Integrate Analytics with
Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
![Page 10: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/10.jpg)
10
Data Access and Pre-processing – Challenges
▪ Data aggregation
– Different sources (files, web, etc.)
– Different types (images, text, audio, etc.)
▪ Data clean up
– Poorly formatted files
– Irregularly sampled data
– Redundant data, outliers, missing data etc.
▪ Data specific processing
– Signals: Smoothing, resampling, denoising,
Wavelet transforms, etc.
– Images: Image registration, morphological
filtering, deblurring, etc.
▪ Dealing with out of memory data (big data)
Challenges
Data preparation accounts for about 80% of the work of data
scientists - Forbes
![Page 11: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/11.jpg)
11
Data Analytics Workflow: Big Data Access and Pre-processing
![Page 12: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/12.jpg)
12
Next: Access Big Data from MATLAB
▪ datastore
– Tabular text files
– Images
– Excel spreadsheets
– (SQL) Databases
– HDFS (Hadoop)
– S3 - Amazon
![Page 13: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/13.jpg)
13
Get data in MATLAB
![Page 15: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/15.jpg)
15
Or Data is stored in a Database
![Page 16: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/16.jpg)
16
Data Access: Summary
C Java Fortran Python
Hardware
Software
Servers and Databases
▪ Repositories – SQL, NoSQL, etc.
▪ File I/O – Text, Spreadsheet, etc.
▪ Web Sources – RESTful, JSON, etc.
Business and Transactional Data
Engineering, Scientific and Field
Data▪ Real-Time Sources – Sensors,
GPS, etc.
▪ File I/O – Image, Audio, etc.
▪ Communication Protocols – OPC
(OLE for Process Control), CAN
(Controller Area Network), etc.
![Page 17: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/17.jpg)
17
Process data which doesn't fit into memory
Integrate Analytics with
Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
![Page 18: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/18.jpg)
18
Pre-processing Big Data
▪ New data type designed for data that doesn’t fit into memory
▪ Lots of observations (hence “tall”)
▪ Looks like a normal MATLAB array
– Supports numeric types, tables, datetimes, strings, etc…
– Supports several hundred functions for basic math, stats, indexing, etc.
– Statistics and Machine Learning Toolbox support
(clustering, classification, etc.)
tall arrays in
![Page 19: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/19.jpg)
19
tall arraySingle
Machine
Memory
tall arrays
▪ Automatically breaks data up into
small “chunks” that fit in memory
▪ Tall arrays scan through the
dataset one “chunk” at a time
▪ Processing code for tall arrays is
the same as ordinary arrays
Single
Machine
MemoryProcess
![Page 20: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/20.jpg)
20
tall array
Cluster of
Machines
Memory
Single
Machine
Memory
tall arrays
▪ With Parallel Computing Toolbox,
process several “chunks” at once
▪ Can scale up to clusters with
MATLAB Distributed Computing
Server
Single
Machine
MemoryProcess
Single
Machine
MemoryProcess
Single
Machine
MemoryProcess
Single
Machine
MemoryProcess
Single
Machine
MemoryProcess
Single
Machine
MemoryProcess
![Page 21: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/21.jpg)
21
Demo: Working with Tall Arrays
![Page 22: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/22.jpg)
22
Data Access and pre-processing – challenges and solution
▪ Data aggregation
– Different sources (files, web, etc.)
– Different types (images, text, audio, etc.)
▪ Data clean up
– Poorly formatted files
– Irregularly sampled data
– Redundant data, outliers, missing data etc.
▪ Data specific processing
– Signals: Smoothing, resampling, denoising,
Wavelet transforms, etc.
– Images: Image registration, morphological
filtering, deblurring, etc.
▪ Dealing with out of memory data (big data)
Challenges
▪ Point and click tools to access
variety of data sources
▪ High-performance environment
for big data
Files
Signals
Databases
Images
▪ Built-in algorithms for data
preprocessing including sensor,
image, audio, video and other
real-time data
MATLAB makes it easy to
work with business and
engineering data
1
![Page 23: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/23.jpg)
23
Data Analytics Workflow: Develop Predictive Models using Big Data
Integrate Analytics with
Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
![Page 24: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/24.jpg)
24
Machine Learning
Machine learning uses data and produces a program to perform a task
Standard Approach Machine Learning Approach
𝑚𝑜𝑑𝑒𝑙 = <𝑴𝒂𝒄𝒉𝒊𝒏𝒆𝑳𝒆𝒂𝒓𝒏𝒊𝒏𝒈𝑨𝒍𝒈𝒐𝒓𝒊𝒕𝒉𝒎
>(𝑠𝑒𝑛𝑠𝑜𝑟_𝑑𝑎𝑡𝑎, 𝑎𝑐𝑡𝑖𝑣𝑖𝑡𝑦)
Computer
Program
Machine
Learning
𝑚𝑜𝑑𝑒𝑙: Inputs → OutputsHand Written Program Formula or Equation
If X_acc > 0.5
then “SITTING”
If Y_acc < 4 and Z_acc > 5
then “STANDING”
…
𝑌𝑎𝑐𝑡𝑖𝑣𝑖𝑡𝑦= 𝛽1𝑋𝑎𝑐𝑐 + 𝛽2𝑌𝑎𝑐𝑐+ 𝛽3𝑍𝑎𝑐𝑐 +
…
Task: Human Activity Detection
![Page 25: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/25.jpg)
25
Consider Machine/Deep Learning When
update as more data becomes available
learn complex non-linear relationships
learn efficiently from very large data sets
Problem is too complex for hand written rules or equations
Speech Recognition Object Recognition Engine Health Monitoring
Program needs to adapt with changing data
Weather Forecasting Energy Load Forecasting Stock Market Prediction
Program needs to scale
IoT Analytics Taxi Availability Airline Flight Delays
Because algorithms can
![Page 26: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/26.jpg)
26
Different Types of Learning
Machine
Learning
Supervised
Learning
Classification
Regression
Unsupervised
LearningClustering
Discover an internal representation from
input data only
Develop predictivemodel based on bothinput and output data
Type of Learning Categories of Algorithms
• No output - find natural groups and
patterns from input data only
• Output is a real number
(temperature, stock prices)
• Output is a choice between classes
(True, False) (Red, Blue, Green)
![Page 27: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/27.jpg)
27
Different Types of Learning
Machine
Learning
Supervised
Learning
Classification
Regression
Unsupervised
LearningClustering
Discover an internal representation from
input data only
Develop predictivemodel based on bothinput and output data
Type of Learning Categories of Algorithms
Linear
Regression
GLM
Decision
Trees
Ensemble
Methods
Neural
Networks
SVR,
GPR
Nearest
Neighbor
Discriminant
AnalysisNaive Bayes
Support
Vector
Machines
kMeans, kmedoids
Fuzzy C-MeansHierarchical
Neural
Networks
Gaussian
Mixture
Hidden Markov
Model
![Page 28: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/28.jpg)
28
Machine Learning with Big Data
• Descriptive statistics (skewness,
tabulate, crosstab, cov, grpstats, …)
• K-means clustering (kmeans)
• Visualization (ksdensity, binScatterPlot;
histogram, histogram2)
• Dimensionality reduction (pca, pcacov,
factoran)
• Linear and generalized linear regression
(fitlm, fitglm)
• Discriminant analysis (fitcdiscr)
• Linear classification methods for SVM
and logistic regression (fitclinear)
• Random forest ensembles of
classification trees (TreeBagger)
• Naïve Bayes classification (fitcnb)
• Regularized regression (lasso)
• Prediction applied to tall arrays
![Page 29: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/29.jpg)
29
Demo: Training a Machine Learning Model
![Page 30: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/30.jpg)
30
Demo: Training a Machine Learning Model
![Page 31: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/31.jpg)
31
Regression Learner
![Page 32: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/32.jpg)
32
Regression LearnerApp to apply advanced regression methods to your data
▪ Added to Statistics and Machine Learning
Toolbox in R2017a
▪ Point and click interface – no coding
required
▪ Quickly evaluate, compare and select
regression models
▪ Export and share MATLAB code or
trained models
![Page 33: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/33.jpg)
33
Classification LearnerApp to apply advanced classification methods to your data
▪ Added to Statistics and Machine Learning
Toolbox in R2015a
▪ Point and click interface – no coding
required
▪ Quickly evaluate, compare and select
classification models
▪ Export and share MATLAB code or
trained models
![Page 34: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/34.jpg)
34
and Many More MATLAB Apps for Data Analytics
Distribution Fitting
System Identification
Signal Analysis
Wavelet Design and Analysis
Neural Net Fitting
Neural Net Pattern Recognition
Training Image Labeler
and many more…
![Page 35: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/35.jpg)
35
Tuning Machine Learning ModelsGet more accurate models in less time
Automatically select best
machine leaning “features”
NCA: Neighborhood Component Analysis
Select best “features”
to keep in model from
over 400 candidates
Automatically fine-tune
machine learning parameters
Hyperparameter Tuning
![Page 36: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/36.jpg)
36
Machine Learning Hyperparameters
Hyperparameters
Tune a typical set of
hyperparameters for this model
Tune all
hyperparameters for this model
![Page 37: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/37.jpg)
37
Bayesian Optimization in Action
![Page 38: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/38.jpg)
38
Challenges
▪ Lack of data science expertise
▪ Feature Extraction – How to transform
data to best represent the system?
– Requires subject matter expertise
– No right way of designing features
▪ Feature Selection – What attributes or
subset of data to use?
– Entails a lot of iteration – Trial and error
– Difficult to evaluate features
▪ Model Development
– Many different models
– Model Validation and Tuning
▪ Time required to conduct the analysis
MATLAB enables domain experts to
do Data Science
2
Apps Language
▪ Easy to use apps
▪ Wide breadth of tools to facilitate
domain specific analysis
▪ Examples/videos to get started
▪ Automatic MATLAB code
generation
▪ High speed processing of large
data sets
Big Data Analytics Workflow: Developing
Predictive models
![Page 39: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/39.jpg)
39
Back to our example: Working with Big Data in MATLAB
▪ Objective: Create a model to predict the cost of a taxi ride in New York City
▪ Inputs:
– Monthly taxi ride log files
– The local data set is small (~20 MB)
– The full data set is big (~25 GB)
▪ Approach:
– Acecss Data
– Preprocess and explore data
– Develop and validate predictive model (linear fit)
▪ Work with subset of data for prototyping
▪ Scale to full data set on a cluster
![Page 40: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/40.jpg)
40
Data Analytics Workflow: Develop Predictive Models using Big Data
Integrate Analytics with
Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
![Page 41: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/41.jpg)
41
Demo: Taxi Fare Predictor Web App
![Page 42: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/42.jpg)
42
MATLAB Production Server
▪ Server software
– Manages packaged MATLAB
programs and worker pool
▪ MATLAB Runtime libraries
– Single server can use runtimes
from different releases
▪ RESTful JSON interface
▪ Lightweight client libraries
– C/C++, .NET, Python, and Java
MATLAB Production Server
MATLAB
Runtime
Request Broker
&
Program
ManagerApplications/
Database
Servers RESTful
JSON
Enterprise
Application
MPS Client
Library
![Page 43: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/43.jpg)
43
Integrate analytics with systems
MATLAB
Runtime
C, C++ HDL PLC
Embedded Hardware
C/C++ ++ExcelAdd-in Java
Hadoop/
Spark.NET
MATLABProduction
Server
StandaloneApplication
Enterprise Systems
Python
MATLAB Analytics run anywhere
3
![Page 44: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/44.jpg)
44
YARN
Product Support for Spark
Web & Mobile
Applications
Enterprise
Applications DEVELOPMENT TOOLS
MATLAB
Compiler
MATLAB
Runtime
MATLAB Distributed
Computing Server
From MATLAB desktop:
• Access data from HDFS
• Run “tall” functions on
Spark/Hadoop using MDCS
Spark
Integrate with applications:
• Deploy MATLAB programs using “tall”
• Develop deployable applications for
Spark using MATLAB API for Spark
![Page 45: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/45.jpg)
45
YARN : Data Operating System
Spark
Deployment Offerings
▪ Deploy “tall” programs
– Create Standalone Applications: MATLAB Compiler
▪ MATLAB API for Spark
– Create Standalone Applications: MATLAB Compiler
– Functionality beyond tall arrays
– For advanced programmers familiar with Spark
– Local install of Spark to run code in MATLAB
▪ Installed on same machine as MATLAB – single node, Linux
StandaloneApplication
Edge Node
MATLAB
Runtime
MATLAB
Compiler
Program using tall
Program using
MATLAB API for Spark
Since the Standalone
must run on a Linux
Edge Node, you must
compile on Linux
![Page 46: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/46.jpg)
46
Data Analytics Workflow
Integrate Analytics
with Systems
Desktop Apps
Enterprise Scale
Systems
Embedded Devices
and Hardware
Files
Databases
Sensors
Access and Explore
Data
Develop Predictive
Models
Model Creation e.g.
Machine Learning
Model
Validation
Parameter
Optimization
Preprocess Data
Working with
Messy Data
Data Reduction/
Transformation
Feature
Extraction
MATLAB Analytics work
with business and
engineering data
1 MATLAB enables domain experts to do
Data Science
2 3MATLAB Analytics run anywhere
![Page 47: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/47.jpg)
47
Resources to learn and get started mathworks.com/machine-learning
eBook
mathworks.com/big-data
![Page 48: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/48.jpg)
48
MathWorks Services
▪ Consulting– Integration
– Data analysis/visualization
– Unify workflows, models, data
▪ Training
– Classroom, online, on-site
– Data Processing, Visualization, Deployment, Parallel Computing
www.mathworks.com/services/consulting/
www.mathworks.com/services/training/
![Page 49: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/49.jpg)
49
MathWorks Training Offerings
http://www.mathworks.com/services/training/
![Page 50: Big Data and Machine Learning Using MATLAB Data and Machine Learning Using MATLAB Seth DeLand & Amit Doshi MathWorks 2 Data Analytics Turn large volumes of complex data into actionable](https://reader033.vdocuments.mx/reader033/viewer/2022051803/5b004fa07f8b9a256b8fcc10/html5/thumbnails/50.jpg)
50
Speaker Details
Email:
LinkedIn:
https://in.linkedin.com/in/amit-doshi
https://www.linkedin.com/in/seth-deland
Contact MathWorks India
Products/Training Enquiry Booth
Call: 080-6632-6000
Email: [email protected]
Your feedback is valued.
Please complete the feedback form provided to you.