CUSTOMER
Dirk Miethe, SAP
September 2018
Leonardo Framework – SAP’s Innovation Platform for Machine Learning
SAP Data Framework as Foundation for Data Innovation
CONFIDENTIAL © 2018 SAP SE or an SAP affiliate company. All rights reserved.
“You can spend the whole day waiting on the ocean without catching a wave”
SAP Leonardo
New Technology
Design Services
Natural Language
Image Recognition
IoT
Deep Learning
Connected Data
Blockchain
APIs
Machine Learning
Business Process Innovation
Big Data
Real-time Analyses
Workshop
Experience
Start Ups
Discover
Innovation Center
Prototype
The Machine Learning Academy
Innovation Center
Potsdam
Microservices
Data Hub
Berlin
Innovation: getting ideas into production
Scaling the last mile is much harder than developing MVPs
Chart: # Projects, # People, # Users over time
PIONEER (DATA LAB): PILOT, EASY DATA ACCESS, NEW, INDEPENDENT
SETTLER (DATA FACTORY): PRODUCTIVE, STABLE, AUTOMATE & RE-USE, INTEGRATED
CITY BUILDER (INTELLIGENT ENTERPRISE): INDUSTRIALIZED IT, DATA GOVERNANCE, SECURITY, PLATFORM STRATEGY, ALL AT ONCE
Our startup found an ML algorithm!
Data Lab Data Factory Intelligent Enterprise
Move beyond initial innovation successes
PIONEER SETTLER CITY BUILDER
Manual data extraction
Feature Matrix
<Transform>
Data ready for consumption
<ML>
<Stats>
5% *
95% *
Technical Debt in ML Systems
* Google Research Paper on ML Systems
Plant Protection Force
Plant Control Station
Pipeline
Base Material Processing
Data Scientist
IT Operations
BW & BI Expert
Data Engineer
IT Developer
Move beyond initial innovation successes
PIONEER SETTLER CITY BUILDER
Manual data extraction
Features
<Transform>
Data ready for consumption
<ML>
<Stats>
Train & deploy ML Models
Data Pipelines
ETL
<ML>
Data Catalogue
Push-Down Execution
Monitoring
Connections
Dashboard App
Data Lab Data Factory Intelligent Enterprise
Example: Typical Machine Learning System
PIONEER SETTLER CITY BUILDER
Manual data extraction
Features
<Transform>
Data ready for consumption
<ML>
<Stats>
Train & deploy ML Models
Dashboard App
Data Pipelines
ETL
<ML>
Data Catalogue
Push-Down Execution
Monitoring
Connections
App
API
Microservice
API
SAP Leonardo
Data Lab Data Factory Intelligent Enterprise
Data Scientist
Business (User / Analyst)
Data Engineer
Operations
Accelerate transformation towards Intelligent Enterprise
Do you prefer an architect & building contractor … or integrate / coordinate yourself?
Building the foundation of the Intelligent Enterprise
In-memory database
Flexible data access
Machine Learning Tools & Capabilities
Application Development Environment
ETL
Process Orchestration
Authority Management
Compliance
User Interface
IT Operations
Visualization
Automation
Required building blocks to develop Machine Learning content
Application
Data
Engines for specific data types
Library
Coding Tools
› Actionable, business-critical insights
Data Preparation
Component mapping
Data Scientist
BI Consumer
Business Analyst
Business User & Customer
Data Apps
Ad-hoc analysis
BI
Data Lab
Cloud Infrastructure
On Premise Infrastructure
SAP HANA
SAP Data Hub
ML Foundation
SAP Enterprise Integration
SAP Analytics Cloud
SAP Agile Data Preparation
HANA (XSA)
Fiori, etc.
Any Tool (e.g. Matlab, R-Studio, Jupyter, Zeppelin)
PAL, APL
Geo, Text, Graph
pre-trained / training / bring your own model
Example ML service: facial recognition
https://budgetapiwebsdciot.hana.ondemand.com/budgetapiweb/#
Example of a Shopping App
Shop Recommendation with HANA Geo-Spatial Engine
Product Recommendation with HANA PAL
Image Recognition with TensorFlow
Image Recognition with HANA-TensorFlow Integration
TensorFlow Libraries
› TensorFlow
Image Recognition with TensorFlow
HANA
Product Recommendation with HANA PAL or APL
PAL/APL Libraries
› HANA Predictive Analysis Library (PAL)
› HANA Automated Predictive Library (APL)
HANA
Product Recommendation with HANA PAL
Different algorithms for different types of questions - Examples
Grouping customers
› E.g. K-Means Clustering
Probability to buy a certain product
› E.g. Logistic Regression, Decision Tree
Predict the value spent
› E.g. Multiple Linear Regression
Seasonality
› E.g. Triple Exponential Smoothing
Product Recommendation with HANA PAL
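To make the first question on this slide ("Grouping customers") concrete, here is a minimal sketch of K-Means clustering in plain Python. It is not the PAL implementation and does not use any HANA API. The customer data, the function name `k_means`, and the choice of starting centroids are assumptions made for illustration only.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its old centroid).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per year, average basket value)
customers = [(2, 20.0), (3, 25.0), (2, 22.0), (40, 210.0), (38, 190.0), (42, 200.0)]
labels, centroids = k_means(customers, centroids=[customers[0], customers[3]])
print(labels, centroids)
```

The sketch separates occasional buyers from frequent buyers. In practice the features would usually be scaled first, because K-Means is sensitive to differences in units.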
SAP HANA Predictive Analysis Library (PAL) – Recent Enhancements

Classification Analysis
▪ CART
▪ C4.5 Decision Tree Analysis
▪ CHAID Decision Tree Analysis
▪ K Nearest Neighbour
▪ Logistic Regression Elastic Net
▪ Back-Propagation (Neural Network)
▪ Naïve Bayes
▪ Support Vector Machine
▪ Random Forests
▪ Gradient Boosting Decision Tree (GBDT)*
▪ Linear Discriminant Analysis (LDA)*
▪ Confusion Matrix
▪ Area Under Curve (AUC)
▪ Parameter Selection / Model Evaluation

Regression
▪ Multiple Linear Regression Elastic Net
▪ Polynomial, Exponential, Bi-Variate Geometric, Bi-Variate Logarithmic Regression
▪ Generalized Linear Model (GLM)*
▪ Cox Proportional Hazards Model*

Association Analysis
▪ Apriori, Apriori Lite
▪ FP-Growth
▪ KORD – Top K Rule Discovery
▪ Sequential Pattern Mining*

Probability Distribution
▪ Distribution Fit / Weibull Analysis
▪ Cumulative Distribution Function
▪ Quantile Function
▪ Kaplan-Meier Survival Analysis

Outlier Detection
▪ Inter-Quartile Range Test (Tukey’s Test)
▪ Variance Test
▪ Anomaly Detection
▪ Grubbs Outlier Test

Recommender Systems
▪ Factorized Polynomial Regression Models**

Link Prediction
▪ Common Neighbors, Jaccard’s Coefficient, Adamic/Adar, Katz β

Statistical Functions
▪ Mean, Median, Variance, Standard Deviation, Kurtosis, Skewness
▪ Covariance Matrix
▪ Pearson Correlation Matrix
▪ Chi-squared Tests:
  - Test of Quality of Fit
  - Test of Independence
▪ F-test (variance equal test)
▪ Data Summary*
▪ ANOVA**
▪ One-sample Median Test**
▪ T Test**
▪ Wilcoxon Signed Rank Test**

Data Preparation
▪ Sampling, Binning, Scaling, Partitioning
▪ Principal Component Analysis (PCA) / PCA Projection
▪ Factor Analysis
▪ Multidimensional Scaling

Other
▪ Weighted Scores Table
▪ Substitute Missing Values

Cluster Analysis
▪ ABC Classification
▪ DBSCAN
▪ K-Means / Accelerated K-Means**
▪ K-Medoid Clustering
▪ K-Medians
▪ Kohonen Self Organized Maps
▪ Agglomerate Hierarchical
▪ Affinity Propagation
▪ Latent Dirichlet Allocation (LDA)
▪ Gaussian Mixture Model (GMM)
▪ Cluster Assignment

Time Series Analysis
▪ Single / Double / Brown / Triple Exp. Smoothing
▪ Forecast Smoothing
▪ Auto-ARIMA / Seasonal ARIMA
▪ Croston Method
▪ Forecast Accuracy Measure
▪ Linear Regression with Damped Trend and Seasonal Adjust
▪ Test for White Noise, Trend, Seasonality
▪ Fast Fourier Transform (FFT)*
▪ Correlation Function*

* New in HANA 2 SPS0
** New in HANA 2 SPS01
*** New in HANA 2 SPS02
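As an illustration of one entry from the list, the Inter-Quartile Range Test (Tukey's Test) for outlier detection can be sketched in a few lines of plain Python. This is not the PAL procedure itself. The function name `tukey_outliers`, the sample data, and the conventional multiplier of 1.5 are assumptions, and quartile conventions differ between tools, so borderline values may be classified differently elsewhere.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical daily order counts with one suspicious spike.
daily_orders = [10, 12, 11, 13, 12, 11, 95]
outliers = tukey_outliers(daily_orders)
print(outliers)
```

Raising `k` widens the fences, so fewer points are flagged.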
Integrated non-SAP libraries
Other Libraries
› R library
› Python library
› others
HANA
Nearest shop search with Geo-Spatial engine in HANA
Pre-Processing of specific data types
Shop Recommendation with HANA Geo-Spatial Engine
Geo Spatial
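The nearest-shop search named on this slide comes down to computing distances between a customer position and each shop, then taking the minimum. The sketch below shows that computation in plain Python using the haversine formula. It does not use the HANA spatial engine, which would run such a query inside the database. The shop names, coordinates, and function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_shop(customer, shops):
    """Return (name, distance_km) of the shop closest to the customer position."""
    lat, lon = customer
    return min(
        ((name, haversine_km(lat, lon, s_lat, s_lon)) for name, (s_lat, s_lon) in shops.items()),
        key=lambda item: item[1],
    )


# Hypothetical shop locations (approximate city-centre coordinates).
shops = {
    "Berlin": (52.5200, 13.4050),
    "Potsdam": (52.3906, 13.0645),
    "Budapest": (47.4979, 19.0402),
}
name, distance_km = nearest_shop((52.40, 13.10), shops)
print(name, round(distance_km, 1))
```

A linear scan like this is fine for a handful of shops. For large shop catalogues a spatial index would be the usual choice.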
Other engines in HANA
Pre-Processing of specific data types
Text Analysis, Graph Analysis, Geo Spatial
Demo SAP Data Hub
Thank you.
Contact information:
Dirk Miethe & Karsten Haldenwang
Center of Excellence Database and Data Management
Middle & Eastern Europe