enterprise-scale matlab applications - mathworks · enterprise-scale matlab applications sylvain...
TRANSCRIPT
1© 2018 The MathWorks, Inc.
Enterprise-Scale MATLAB Applications
Sylvain Lacaze
Rory Adams
2
Preprocess DataDevelop Predictive
Models
Integrate Analytics
with Systems
MATLAB
MDCS
Request
Broker
MATLAB Production Server
Azure
IoT Hub
PI System
AWS
Kinesis
Enterprise Integration
Azure
Blob
PI System
Databases
Cloud Storage
Cosmos DB
Big Data / OT
Access and Explore
Data
Streaming
OT Platforms
Dashboards
3
MATLAB at Scale
Scale with increasing
▪ Access
▪ Computational Complexity
▪ Data Volume and Velocity
4
Key Takeaways
1. Share applications and algorithms with anyone
2. Integrate MATLAB functions into existing workflows and
development platforms.
3. Deploy MATLAB applications within enterprise systems in a
scalable manner
5
Write Once Then Share To Different Targets
MATLAB
C/C++
ExcelAdd-in Java
Hadoop
.NET
MATLAB
Compiler
MATLABProduction
ServerStandaloneApplication
MATLAB
Compiler SDK
Apps Files
Custom Toolbox
Python
WebApps
MATLAB
CoderGPU
Coder
6
Easily share apps with your team using Web Apps
▪ Use apps in browser
▪ Easy deployment
▪ No required knowledge of web
technologies
7
1
2
Integrate MATLAB-based Components With Your Own
Software
MATLAB
Toolboxes
MATLAB
Runtime
Application Author
Software Developer
43
MATLABProduction
ServerC/C++
Java
.NET
Python
MATLAB Compiler SDK
• Royalty-free Sharing
• IP Protection via Encryption
8
MATLAB
MATLAB
Compiler SDK
Customer examples: Financial customer advisory service
MATLAB Production Server
Request
BrokerAlgorithm
Developers
Request
Broker
Request
Broker
o Saved € 2 million
annually for an
external system
o Quicker
implementation of
adjustments in source
code by the
quantitative analysts
o Knowledge + MATLAB
= Build your own
systems
Global
financial
institution with
European HQ
9
Scaling up: Asset Allocation Demo
MATLAB Production Server(s)
HTML
XML
Java Script
Web Server(s)
10
MATLAB Production Server with Visualization Platforms
MATLAB analytics for use with desktop,
browser, and mobile visualization dashboards
▪ TableauAccess models published on MATLAB Production
Server inside Tableau calculated fields
▪ SpotfireAccess models published on MATLAB Production
Server inside Spotfire workbooks
MATLAB Production Server
MATLABAnalytics
TABLEAU INTERFACE
TABLEAU SERVER
11
Example: Travelling Salesman Problem with Tableau
12
MATLAB and MATLAB Production Server
▪ The easiest and most productive environment to take your enterprise
analytics solution from idea to a scalable production solution
Idea Production
13
Development
Enterprise
Application
Developer
Production
MATLAB
Developer
Production Deployment Workflow
MATLAB Production Server
MATLAB Production Server
Deployable
Archive
Web
Application
...
Function Call
MATLAB
Algorithm
MATLAB
Compiler SDK
Initial Test
Application
Debug Algorithm
Verify data
handling and
initial behavior
Web
ApplicationDeployable
Archives
Function Calls
14
Scale Up with MATLAB Production Server™
▪ Scalable and reliable– Service large numbers of concurrent requests
– Add capacity or redundancy with additional
servers
▪ Directly deploy MATLAB programs
into production– Automatically deploy updates without server
restarts
– Most efficient path for creating enterprise
applications
MATLAB Production
Server(s)
HTML
XML
Java ScriptWeb
Server(s)
15
MATLAB Production ServerEnterprise Class Framework For Running Packaged MATLAB Programs
▪ Server software
– Manages packaged MATLAB
programs and worker pool
▪ MATLAB Runtime libraries
– Single server can use runtimes from
different releases
▪ RESTful JSON interface and
lightweight client library
– Isolates the MATLAB processing
– Access using native data types
MATLAB Production Server
MATLAB
Runtime
Request Broker
&
Program
ManagerEnterprise
ApplicationRESTful
JSON
Enterprise
Application
MPS Client
Library
16
MATLAB at Scale
Scale with increasing
▪ Access
▪ Computational Complexity
▪ Data Volumes
17
Key Takeaways
1. Leverage parallel computing
2. Seamlessly scale to clusters
or the cloud
18
“… can complete urgent change requests
ourselves with MATLAB, often on the same
day…Testing time has also been reduced, …
load data 8 times faster than we could do
before.”
Julian Zenglein
Commerzbank
Commerzbank
Compute a variety of derived market data
from raw market data
Aberdeen Asset Management
Improve asset allocation strategies with
machine learning techniques
“… can develop prototypes to test machine learning
techniques quickly… get rapid, reliable results by
running the algorithms with large financial data sets on
a distributed computing cluster.”
Emilio Llorente-Cano
Aberdeen Asset Management
Commerzbank headquarters in
Frankfurt.
19
Accelerating MATLAB Applications
Parallel-enabled toolboxes
Simple programming constructs
Advanced programming constructs
Ease o
f U
se
Gre
ate
r Co
ntro
l
20
Parallel-enabled Toolboxes (MATLAB® Product Family)Enable acceleration by setting a flag or preference
Optimization
Estimation of gradients
Statistics and Machine Learning
Resampling Methods, k-Means
clustering, GPU-enabled functions
Neural Networks
Deep Learning, Neural Network
training and simulation
Image Processing
Batch Image Processor, Block
Processing, GPU-enabled functions
Computer VisionBag-of-words workflow
Signal Processing and Communications GPU-enabled FFT filtering, cross
correlation, BER simulations
Other Parallel-enabled Toolboxes
21
Independent Tasks or IterationsSimple programming constructs: parfor, parfeval
▪ Examples: parameter sweeps, Monte Carlo simulations
▪ No dependencies or communications between tasks
MATLAB client
MATLAB workers
Time Time
22
Parallel ComputingMulticore Desktops
Core 3
Core 1 Core 2
Core 4
Worker Worker
Worker Worker
MATLAB multicore
23
Parallel Computing – Scaling UpClusters/Cloud
Worker Worker Worker Worker Worker Worker
Worker Worker Worker Worker Worker Worker
WorkerWorkerWorkerWorkerWorkerWorker
Worker Worker Worker Worker Worker Worker
Parallel Computing Toolbox MATLAB Distributed Computing Server
24
Summary - Scale your applications beyond the desktop
Learn More: Parallel Computing on the Cloud
Option Parallel Computing Toolbox
Description Explicit desktop scaling
Maximum
workersNo limit
Hardware Desktop
Availability Worldwide
MATLAB Distributed
Computing Server
for Amazon EC2
Scale to EC2 with some
customization
256
Amazon EC2
United States, Canada and other
select countries in Europe
MATLAB Distributed
Computing Server
for Custom Cloud
Scale to custom cloud
No limit
Amazon EC2,
Microsoft Azure,Others
Worldwide
MATLAB Distributed
Computing Server
Scale to clusters
No limit
Any
Worldwide
MATLAB Distributed
Computing Server
for Hadoop + Spark
Scale to custom cloud
No limit
Hadoop + Spark
Worldwide
25
MATLAB at Scale
Scale with increasing
▪ Access
▪ Computational Complexity
▪ Data Volume and Velocity
26
▪ Batch Processing applies computation to a finite sized historical data set
that was acquired in the past
▪ Stream Processing applies computation to an unbounded data set that is
produced continuously
Messaging Service
• Reporting
• Real Time
Decision Support
Dashboards
Alerts
Storage
Historical Data
Storage
Files
Configure Resources Schedule and Run Job Output Data
Storage
Files
• Reporting
• Data Exploration
• Training Models
Connected
Devices
Continuous Data
f(x)
Stream Analytics
Big Data – Batch vs Stream Processing
27
Large Data Options
Data fits in memory of pool
▪ Distributed arrays
– Look like normal MATLAB
variables
Data does not fit in memory (Big Data)
▪ Tall arrays
▪ Looks like normal MATLAB variables
▪ Custom map-reduce functions
▪ Can be painful to learn
28
Typical Workflow – Big Data
Integrate Analytics
with Systems
Access and
Explore Data
Develop Predictive
ModelsPreprocess Data
1 2 3 4
12
23
“Package” MATLAB code &
run as automated batch
Data
MATLAB Developer
HDFS
Spark or
MapReduce
job
29
tall arrays
▪ Data doesn’t fit into memory
▪ Lots of observations - “tall”
▪ Looks like a normal MATLAB array
– Numeric types, tables, datetimes, strings, etc…
– Basic math, stats, indexing, etc.
– Statistics and Machine Learning Toolbox
(clustering, classification, etc.)
tall array
30
Example: Taxi Data
31
▪ Tall arraysMATLAB
▪ 100’s of functions supportedMATLABStatistics and Machine Learning Toolbox
▪ Run in parallelParallel Computing Toolbox
Using Tall Arrays
Local diskShared folders
DatabasesHDFS
▪ Run in parallel on Spark clustersMATLAB Distributed Computing Server
▪ Deploy MATLAB applications as standalone applications on Spark clustersMATLAB Compiler
Spark + Hadoop
▪ Run in parallel on compute clustersMATLAB Distributed Computing Server
32
Stream Processing: Fleet Analytics
MATLAB
Analytics Development
MATLAB Production Server
MATLAB
Analytics
Business Decisions
MATLAB
Compiler
SDK
Algorithm
Developers
Storage
Layer
End Users
API
Gateway
AWS
Lambda
Kafka
Connector
Business
Systems
Edge
Devices
Production System
33
▪ Kafka client for MATLAB Production Server
feeds topics to functions deployed on the server
▪ Configurable batch of messages passed as a
MATLAB Timetable
▪ Each consumer process feeds one topic to a
specified function
▪ Drive everything from a simple config file
– No programming outside of MATLAB!MATLAB Production Server
Request
Broker
&
Program
Manager
Consumer
Process feeds
Topic-1
Async Java
Client
Topic-0
Partition
Partition
Partition
Topic-1
Partition
Partition
Partition
Kafka Cluster
Publisher Publisher Publisher
Consumer
Process feeds
Topic-0
Async Java
Client
Connecting MATLAB Production Server to Kafka
34
Develop a Stream Processing Function in MATLAB
35
Event
Time
Vehicle RPM Torque Fuel
Flow
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
… … … … …
MATLAB
Function
State
State
18:01:10 55a3fd 1975 100 110
18:10:30 55a3fe 2000 109 115
18:05:20 55a3fd 1980 105 105
18:10:45 55a3fd 2100 110 100
18:30:10 55a419 2000 100 110
18:35:20 55a419 1960 103 105
18:20:40 55a3fe 1970 112 104
18:39:30 55a419 2100 105 110
18:30:00 55a3fe 1980 110 113
18:30:50 55a3fe 2000 100 110
MATLAB
Function
State
MATLAB
Function
State
Input Table
Time window Vehicle Score
… … …
18:00:00 18:10:00 55a3fd …
55a3fe …
55a419 …
18:10:00 18:20:00 55a3fd …
55a3fe …
55a419 …
18:20:00 18:30:00 55a3fd …
55a3fe …
55a419 …
18:30:00 18:40:00 55a3fd …
55a3fe …
55a419 …
Output Table
5
7
3
9
4
5
8
Streaming Data: Treated as an Unbounded Timetable
36
Typical Workflow
Integrate Analytics
with Systems
Access and
Explore Data
Develop Predictive
ModelsPreprocess Data
1 2 3 4
12
23
MATLAB Production Server
MATLAB
Analytics
• Kafka
• Azure IoT Hub
• Other
Application Server:
• Streaming
• Ad-hoc analysis via
Spotfire, Tableau, …
• Other…
“Package” MATLAB code &
run as automated batch
Data
MATLAB Developer
HDFS
Spark or
MapReduce
job
37
Batch ProcessingStream ProcessingEdge
Processing with
Generated
Code, C/C++
Time critical decisions Big Data processing on historical data Near Real time decisions
Va
lue
of d
ata
to d
ecis
ion
ma
kin
g
Time
to
React
Historical
Reactive
Actionable
Pre
ve
nti
ve
/
Pre
dic
tive
Real-
TimeSeconds Minutes Hours Days Months
Kinesis
Event Hub
Time versus Value in decision making
38
Preprocess DataDevelop Predictive
Models
Integrate Analytics
with Systems
MATLAB
MDCS
Request
Broker
MATLAB Production Server
Azure
IoT Hub
PI System
AWS
Kinesis
Enterprise Integration
Azure
Blob
PI System
Databases
Cloud Storage
Cosmos DB
Big Data / OT
Access and Explore
Data
Streaming
OT Platforms
Dashboards