Presentation: Open Source Integration with SAS, SAS Forum 2016
TRANSCRIPT
Company Confidential - For Internal Use Only. Copyright © 2016, SAS Institute Inc. All rights reserved.
OPEN SOURCE INTEGRATION WITH SAS®
PATRICK HALL, SAS
AGENDA
• R, PMML, SAS® Enterprise Miner™, and SAS/IML®
• SAS® BI Web Services
• The Base SAS® Java Object
MAKING PREDICTIONS ON NEW DATA
R: scored_new_data <- predict.glm(model, newdata=new_data)
Python (scikit-learn style): scored_new_data = clf.predict(new_data)
SAS (SCORE statement, e.g. in PROC LOGISTIC): score data=new_data out=scored_new_data;
Identify/Formulate Problem
Data Preparation/Exploration
Model Building
Deploy Model
Evaluate/Monitor Model
ESTIMATION VS. PREDICTION: DIFFERENT MINDSETS
Statistics
• Regression
• Assumptions
• Parsimony
• Interpretation
• What happened? Why?
Machine Learning
• Production Deployment
• Predictive Accuracy
• What will happen?
(SIMPLE) MATH
HOW DO WE TURN OUR INSIGHTS INTO A PRODUCTION SYSTEM?
All data manipulation and modeling logic must be encapsulated to run in an operational database system or in a web service. R and Python runtimes may be unavailable!
Score code
• Code generation
• Scoring executable
Database
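A minimal sketch of the code generation idea, assuming a logistic regression whose coefficients are already known. The coefficient values, the column names (age, income), the table name new_data, and the helper names are all hypothetical. The model is rendered as a plain SQL expression, so scoring needs no R or Python runtime; SQLite stands in for the operational database only to check the generated code.

```python
import math
import sqlite3

# Hypothetical coefficients standing in for a trained logistic regression.
INTERCEPT = -1.5
COEFFICIENTS = {"age": 0.03, "income": -0.00002}


def generate_sql_score_code(intercept, coefficients, table):
    """Render the model as one SQL SELECT that adds a p_event column."""
    terms = [repr(float(intercept))]
    for column, weight in sorted(coefficients.items()):
        terms.append(f"({weight!r} * {column})")
    linear = " + ".join(terms)
    return f"SELECT *, 1.0 / (1.0 + EXP(-({linear}))) AS p_event FROM {table}"


def score_in_python(row):
    """Reference scoring in the modeling language, for comparison."""
    linear = INTERCEPT + sum(w * row[c] for c, w in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-linear))


def score_in_database(rows):
    """Run the generated score code inside an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Register EXP explicitly because not every SQLite build ships math functions.
    conn.create_function("EXP", 1, math.exp)
    conn.execute("CREATE TABLE new_data (age REAL, income REAL)")
    conn.executemany("INSERT INTO new_data VALUES (:age, :income)", rows)
    sql = generate_sql_score_code(INTERCEPT, COEFFICIENTS, "new_data")
    scores = [record[-1] for record in conn.execute(sql)]
    conn.close()
    return scores


if __name__ == "__main__":
    print(generate_sql_score_code(INTERCEPT, COEFFICIENTS, "new_data"))
```

Comparing score_in_database against score_in_python on the same rows is a quick check that the generated code reproduces the model.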
WHAT IS PMML?
• A declarative, XML-based, open standard for representing predictive models
• Allows predictive models to be transferred between different languages and products
• What analytical tools can create PMML? R, SAS, Python, and others
• What scoring tools can consume PMML? Teradata, Zementis, SAS, IBM
More info: http://www.dmg.org/products.html
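To make "declarative" concrete, here is a hedged sketch: a small hand-written PMML-style regression document and a toy interpreter for it. The field names (age, income, spend) and coefficients are invented, the fragment has not been validated against the DMG schema, and a real scoring engine handles far more than this one model type.

```python
import xml.etree.ElementTree as ET

# Hand-written illustration of a PMML linear regression; values are made up.
PMML_TEXT = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="hand-written illustration"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.001"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_2"}


def score(pmml_text, row):
    """Toy interpreter: intercept plus coefficient * value for each predictor."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    total = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", NS):
        total += float(predictor.get("coefficient")) * row[predictor.get("name")]
    return total


if __name__ == "__main__":
    print(score(PMML_TEXT, {"age": 40.0, "income": 50000.0}))
```

The model is only data: any tool that can read the XML can reproduce the score, which is what lets the producing and consuming products differ.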
WILL PMML SOLVE ALL MY DEPLOYMENT PROBLEMS?
• In simple scenarios, yes
• Difficulties with character data (MBCS) and finite precision
• Can require post-processing
• Requires re-validation whenever the producing software, the PMML standard version, or the consuming software version changes
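A sketch of what re-validation can look like, and of how finite precision bites. It assumes a hypothetical linear model whose coefficients are exported as text with a limited number of significant digits and then re-imported by the scoring side; the coefficient values, the rows, and the 1e-8 tolerance are illustrative choices, not recommendations.

```python
# Hypothetical model: the coefficient values are made up for illustration.
INTERCEPT = 2.0
COEFFICIENTS = {"x1": 0.123456789012, "x2": -4.5}


def linear_score(intercept, coefficients, row):
    """Score one row with a plain linear model."""
    return intercept + sum(w * row[name] for name, w in coefficients.items())


def export_as_text(value, digits):
    """Simulate writing a number to a model file with limited precision."""
    return f"{value:.{digits}g}"


def reimport(coefficients, digits):
    """Simulate the consuming tool reading the exported coefficients back."""
    return {name: float(export_as_text(w, digits)) for name, w in coefficients.items()}


def revalidate(reference, deployed, tolerance=1e-8):
    """Return (passed, worst absolute difference) between two score lists."""
    worst = max(abs(a - b) for a, b in zip(reference, deployed))
    return worst <= tolerance, worst


ROWS = [{"x1": 1_000_000.0, "x2": 1.0}, {"x1": 10.0, "x2": 3.0}]
REFERENCE = [linear_score(INTERCEPT, COEFFICIENTS, r) for r in ROWS]

if __name__ == "__main__":
    for digits in (6, 17):
        deployed_coefficients = reimport(COEFFICIENTS, digits)
        deployed = [linear_score(INTERCEPT, deployed_coefficients, r) for r in ROWS]
        passed, worst = revalidate(REFERENCE, deployed)
        print(f"{digits} significant digits: passed={passed}, worst={worst:.3g}")
```

Running the same comparison on a fixed validation set after every software or standard upgrade is one way to catch silent score drift.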
LET’S ASSUME YOU KNOW WHAT SAS IS …
SAS/IML enables you to execute R code from within SAS (supported since 9.22)
SAS Enterprise Miner can train R models and convert them to SAS score code (supported since SAS 9.4M1 / Enterprise Miner 13.1)
*Note: you can just use SAS analytics to generate score code directly
Train model
Generate PMML with R PMML package
Score new data with PMML Interpreter
Production Environment
Train model
Generate PMML with R PMML package
Convert PMML to SAS code using PROC PSCORE
Score new data using SAS tools or native process
SAS Enterprise Miner Open Source Integration node
SAS/IML
Base SAS
Production Environment
SAS BI WEB SERVICES
Among other things, enables SAS analytics to be executed from a URL, which surfaces SAS production analytics to nearly any language or application
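A rough sketch of calling SAS analytics from a URL in Python, using only the standard library. Everything specific here is an assumption: the host, the endpoint path, the stored process path, and the parameter names (including _program) are placeholders, and the real URL layout and authentication depend on how the SAS installation is configured.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint and stored process path; replace with site-specific values.
BASE_URL = "http://sas.example.com/SASStoredProcess/do"
PROGRAM = "/Shared/score_model"


def build_stored_process_url(base_url, program, parameters):
    """Encode the program path and its inputs as URL query parameters."""
    query = {"_program": program, **parameters}
    return f"{base_url}?{urlencode(query)}"


def call_stored_process(url, timeout=30):
    """Issue the HTTP request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only the URL is printed; the request itself needs a reachable SAS server.
    print(build_stored_process_url(BASE_URL, PROGRAM, {"age": 40, "income": 50000}))
```

Because the interface is just HTTP, the same call could be made from R, Java, or any other language with a web client.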
Client
SAS Stored Process Server
SAS High Performance Analytics: Train and Score in an MPP or Hadoop Environment
HTML Results
SAS BI WEB SERVICES
SAS Server
SAS Proprietary Software layer
• SAS PROCs (SAS license required)
• Exposed only to certain SAS technologies
SAS Stored Process Server layer
• (Potentially) open source macros that wrap specific SAS procedures
• Exposes SAS macro wrappers to any web call
Python coders
• (Potentially) open source Python package to wrap web calls in native Python
R coders
• (Potentially) open source R package to wrap web calls in native R
Other non-SAS coders
• (Potentially) open source packages in other languages to wrap web calls
THE BASE SAS JAVA OBJECT
• DATA step component
• Essentially a SAS language API into Java: instantiates Java classes and calls Java methods with arguments from the DATA step (supported since 9.1.3)
Environment
Base SAS DATA Step
Data
SAS system Java runtime
Instantiate classes, call methods, return data
BASE SAS JAVA OBJECT BY EXAMPLE
• Instantiate the Java class from the DATA step
• A Java method to receive data from SAS
• Pass each row of data to Java object
• Do something useful with data received from SAS
• Execute the Java class at the end of the DATA step
BASE SAS JAVA OBJECT NEXT STEPS
Call nearly anything through Java from Base SAS
https://github.com/sassoftware/enlighten-integration/tree/master/SAS_Base_OpenSrcIntegration
OVERALL SUMMARY
Technology | Best Purpose | Communication Method
SAS Enterprise Miner Open Source Integration node | Predictive modeling | SAS calls R
SAS BI Web Services | Calling SAS from another language or using SAS to backend an application | Java, R, Python etc. make HTTP calls to SAS
Base SAS Java object | Adding Java functionality to your SAS programs | SAS calls Java
RESOURCES
PMML, R, and SAS
• SAS Enterprise Miner 14.1 documentation
• Open Source Integration node installation cheat sheet: https://communities.sas.com/t5/SAS-Data-Mining-Library/The-Open-Source-Integration-node-installation-cheat-sheet/ta-p/223470
• GitHub examples: https://github.com/sassoftware/enlighten-integration
Base SAS Java Object
• Base SAS documentation: http://support.sas.com/documentation/cdl/en/lrcon/68089/HTML/default/viewer.htm#n0swy2q7eouj2fn11g1o28q57v4u.htm
• "Connecting Java to SAS Data Sets" white paper: http://support.sas.com/resources/papers/proceedings12/008-2012.pdf
You might also like …
• The Preoccupation With Test Error in Applied Machine Learning: https://www.oreilly.com/ideas/the-preoccupation-with-test-error-in-applied-machine-learning
• The Evolution of Analytics: http://www.oreilly.com/data/free/the-evolution-of-analytics.csp
Stay in touch …
• Quora: www.quora.com/Patrick-Hall-4
• GitHub: github.com/jphall663 and github.com/sassoftware
• Twitter: @jpatrickhall
• LinkedIn: https://www.linkedin.com/in/jpatrickhall
Thanks!