Presentation: Open Source Integration with SAS®, SAS Forum 2016



Company Confidential - For Internal Use Only. Copyright © 2016, SAS Institute Inc. All rights reserved.

OPEN SOURCE INTEGRATION WITH SAS®

PATRICK HALL, SAS


AGENDA

• R, PMML, SAS® Enterprise Miner™, and SAS/IML®

• SAS® BI Web Services

• The Base SAS® Java Object


MAKING PREDICTIONS ON NEW DATA

R: scored_new_data <- predict.glm(model, newdata=new_data)

Python: scored_new_data = clf.predict(new_data)

SAS: score out=scored_new_data;

Identify/Formulate Problem → Data Preparation/Exploration → Model Building → Deploy Model → Evaluate/Monitor Model

ESTIMATION VS. PREDICTION: DIFFERENT MINDSETS

• Statistics: Regression, Assumptions, Parsimony, Interpretation. What happened? Why?

• Machine Learning: Production Deployment, Predictive Accuracy. What will happen?


(SIMPLE) MATH


HOW DO WE TURN OUR INSIGHTS INTO A PRODUCTION SYSTEM?


All data manipulation and modeling logic must be encapsulated to run in an operational database system or in a web service. R and Python runtimes may be unavailable!

Score code

• Code generation

• Scoring executable
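One way to read "code generation": once a model is trained, its fitted parameters can be written out as standalone scoring logic in the target system's own language, so no R or Python runtime is needed at scoring time. A minimal sketch, assuming a logistic regression with made-up coefficients and hypothetical column and table names (none of these come from the talk), that emits a SQL expression:

```python
import math

# Hypothetical fitted logistic regression: an intercept plus one coefficient
# per input column. The values are illustrative, not from a real model.
INTERCEPT = -1.5
COEFFICIENTS = {"age": 0.03, "income": 0.00002}


def generate_sql_score_code(intercept, coefficients, table):
    """Emit a SQL SELECT that computes the predicted probability in-database."""
    terms = [repr(float(intercept))]
    terms += [f"{repr(float(beta))} * {column}" for column, beta in coefficients.items()]
    linear_predictor = " + ".join(terms)
    return (
        f"SELECT *, 1.0 / (1.0 + EXP(-({linear_predictor}))) AS p_event "
        f"FROM {table}"
    )


def score_in_python(intercept, coefficients, row):
    """Reference scorer for checking the generated logic on sample rows."""
    eta = intercept + sum(beta * row[column] for column, beta in coefficients.items())
    return 1.0 / (1.0 + math.exp(-eta))


sql = generate_sql_score_code(INTERCEPT, COEFFICIENTS, "new_data")
print(sql)
```

The generated statement carries the whole model, so it could be handed to a database without shipping the training environment. Data preparation steps would need to be generated the same way.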

Database


WHAT IS PMML?

• A declarative, XML-based, open standard for representing predictive models

• Allows predictive models to be transferred between different languages and products

• What analytical tools can create PMML? R, SAS, Python, and others

• What scoring tools can consume PMML? Teradata, Zementis, SAS, IBM

More info: http://www.dmg.org/products.html
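To make "declarative, XML-based" concrete, here is a minimal sketch that writes a small linear regression as a PMML document using only the Python standard library. The model, field names, and coefficients are invented for illustration, and the output has not been validated against the PMML schema; in practice a dedicated exporter would produce the file.

```python
import xml.etree.ElementTree as ET


def linear_model_to_pmml(target, intercept, coefficients):
    """Build a minimal PMML RegressionModel document and return it as text."""
    root = ET.Element("PMML", {"version": "4.2", "xmlns": "http://www.dmg.org/PMML-4_2"})
    ET.SubElement(root, "Header", {"description": "Hand-built illustrative model"})

    # The data dictionary declares every field the model refers to.
    dictionary = ET.SubElement(root, "DataDictionary")
    for name in [*coefficients, target]:
        ET.SubElement(dictionary, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(root, "RegressionModel", {"functionName": "regression"})
    schema = ET.SubElement(model, "MiningSchema")
    for name in coefficients:
        ET.SubElement(schema, "MiningField", {"name": name})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})

    # The model itself is just declared parameters: no executable code.
    table = ET.SubElement(model, "RegressionTable", {"intercept": repr(float(intercept))})
    for name, beta in coefficients.items():
        ET.SubElement(table, "NumericPredictor",
                      {"name": name, "coefficient": repr(float(beta))})
    return ET.tostring(root, encoding="unicode")


pmml_text = linear_model_to_pmml("y", 2.0, {"x1": 0.5, "x2": -1.25})
print(pmml_text)
```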


WILL PMML SOLVE ALL MY DEPLOYMENT PROBLEMS?

• In simple scenarios, yes

• Difficulties with character data (multi-byte character sets, MBCS) and finite precision

• Can require post-processing

• Requires re-validation whenever the producing software, the PMML standard, or the consuming software version changes
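Re-validation can be as simple as scoring the same records on both sides of the exchange and comparing the results within a tolerance. The sketch below also illustrates the finite-precision point: it assumes a hypothetical linear model whose parameters lose digits when written to text. The model, the validation rows, and the tolerance are all illustrative assumptions.

```python
def score(intercept, coefficients, row):
    """Linear score: intercept plus the coefficient-weighted inputs."""
    return intercept + sum(beta * row[name] for name, beta in coefficients.items())


def max_abs_difference(reference_scores, candidate_scores):
    """Largest absolute disagreement between two lists of scores."""
    return max(abs(a - b) for a, b in zip(reference_scores, candidate_scores))


# Hypothetical "producer" model held at full double precision.
intercept = 0.123456789
coefficients = {"x1": 1.23456789012, "x2": -0.000987654321}

# Hypothetical "consumer" view: the same parameters after being written
# to text with only 6 significant digits.
exported_intercept = float(f"{intercept:.6g}")
exported_coefficients = {k: float(f"{v:.6g}") for k, v in coefficients.items()}

validation_rows = [{"x1": 1.0, "x2": 10.0}, {"x1": 1000.0, "x2": 5000.0}]
reference = [score(intercept, coefficients, r) for r in validation_rows]
candidate = [score(exported_intercept, exported_coefficients, r) for r in validation_rows]

worst = max_abs_difference(reference, candidate)
tolerance = 1e-9  # assumed acceptance threshold; pick one that fits the use case
print(f"largest score difference: {worst:.3e}")
print("re-validation passed" if worst <= tolerance else "re-validation FAILED")
```

With these numbers the truncated parameters fail the check, which is the kind of silent drift a re-validation step is meant to catch.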


LET’S ASSUME YOU KNOW WHAT SAS IS …

SAS/IML enables you to execute R code from within SAS (supported since 9.22)

SAS Enterprise Miner will train R models and convert them to SAS score code (supported since 9.4m1/13.1)

*Note: you can just use SAS analytics to generate score code directly


Train model → Generate PMML with the R PMML package → Score new data with a PMML interpreter

(Diagram labels: Environment; Production Environment)
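The last step of this workflow, scoring with a PMML interpreter, can be sketched in a few lines for the simplest case. This toy interpreter only understands a single RegressionTable with numeric predictors; the embedded PMML document and its field names are invented for illustration, and real interpreters handle far more of the standard.

```python
import xml.etree.ElementTree as ET

# Hand-written PMML for a hypothetical model: y = 2.0 + 0.5*x1 - 1.25*x2
PMML_TEXT = """
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Hand-built illustrative model"/>
  <DataDictionary>
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="x2"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="2.0">
      <NumericPredictor name="x1" coefficient="0.5"/>
      <NumericPredictor name="x2" coefficient="-1.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def load_regression_table(pmml_text):
    """Read the intercept and coefficients out of the PMML document."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score_record(intercept, coefficients, record):
    """Apply the declared linear model to one input record."""
    return intercept + sum(beta * record[name] for name, beta in coefficients.items())


intercept, coefficients = load_regression_table(PMML_TEXT)
new_data = [{"x1": 4.0, "x2": 2.0}, {"x1": 0.0, "x2": 0.0}]
scored_new_data = [score_record(intercept, coefficients, r) for r in new_data]
print(scored_new_data)  # [1.5, 2.0]
```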

Train model → Generate PMML with the R PMML package → Convert PMML to SAS code using PROC PSCORE → Score new data using SAS tools or a native process

(Diagram labels: Environment; SAS Enterprise Miner Open Source Integration node; SAS/IML; Base SAS; Production Environment)


SAS BI WEB SERVICES

Among other things, enables SAS analytics to be executed from a URL, which surfaces SAS production analytics to nearly any language or application.
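"Executed from a URL" means a client only needs to be able to build a query string and issue an HTTP request. A minimal sketch with the Python standard library follows. The endpoint address and the parameter names are hypothetical; the real URL depends on how the SAS server and its services are deployed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only to show the shape of the call.
BASE_URL = "https://sas.example.com/hypothetical/score_model"


def build_request_url(base_url, **parameters):
    """Encode the analytic inputs as query-string parameters."""
    return f"{base_url}?{urlencode(parameters)}"


def call_service(url, timeout=30):
    """Issue the HTTP GET and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_request_url(BASE_URL, age=42, income=55000, segment="small business")
print(url)
# call_service(url) would perform the request against a live server.
```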

(Diagram labels: Client; HTML; SAS Stored Process Server; SAS High Performance Analytics: train and score in an MPP or Hadoop environment; Results)

SAS BI WEB SERVICES

SAS Server

• SAS proprietary software layer: SAS PROCs (SAS license required), exposed only to certain SAS technologies

• SAS Stored Process Server layer: (potentially) open source macros that wrap specific SAS procedures; exposes the SAS macro wrappers to any web call

Clients

• Python coders: (potentially) open source Python package to wrap web calls in native Python

• R coders: (potentially) open source R package to wrap web calls in native R

• Other non-SAS coders: (potentially) open source packages in other languages to wrap web calls
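A client-side wrapper of the kind described for Python coders could hide the web call behind an ordinary method. The sketch below is purely illustrative: the class name, the URL layout, and the assumption that the service answers in JSON are all hypothetical, and a stand-in transport replaces the network so the example runs without a SAS server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get(url):
    """Default transport: fetch a URL and return the body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


class HypotheticalSasClient:
    """Makes web calls to a SAS server look like ordinary Python methods."""

    def __init__(self, base_url, transport=http_get):
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def run(self, service_name, **parameters):
        url = f"{self.base_url}/{service_name}?{urlencode(parameters)}"
        # Assumes a JSON reply; a real service may return HTML or XML instead.
        return json.loads(self.transport(url))


def fake_transport(url):
    """Stand-in for the network: echoes the URL and returns a canned score."""
    return json.dumps({"requested": url, "p_event": 0.25})


client = HypotheticalSasClient("https://sas.example.com/hypothetical/",
                               transport=fake_transport)
result = client.run("score_model", age=42)
print(result["p_event"])
```

Passing the transport in as an argument keeps the wrapper testable offline; swapping `fake_transport` for the default `http_get` would send the same request over HTTP.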


THE BASE SAS JAVA OBJECT

• DATA step component

• Essentially a SAS language API into Java: instantiates Java classes and calls Java methods with arguments from the DATA step (supported since 9.1.3)

(Diagram: within one environment, the Base SAS DATA step passes data to the SAS system Java runtime: instantiate classes, call methods, return data)

BASE SAS JAVA OBJECT BY EXAMPLE

• Instantiate the Java class from the DATA step

• A Java method to receive data from SAS

• Pass each row of data to the Java object

• Do something useful with the data received from SAS

• Execute the Java class at the end of the DATA step

https://github.com/sassoftware/enlighten-integration/tree/master/SAS_Base_OpenSrcIntegration

BASE SAS JAVA OBJECT NEXT STEPS

Call nearly anything through Java from Base SAS

OVERALL SUMMARY

Technology | Best Purpose | Communication Method
SAS Enterprise Miner Open Source Integration node | Predictive modeling | SAS calls R
SAS BI Web Services | Calling SAS from another language or using SAS to backend an application | Java, R, Python, etc. make HTTP calls to SAS
Base SAS Java object | Adding Java functionality to your SAS programs | SAS calls Java

RESOURCES

PMML, R, and SAS

• SAS Enterprise Miner 14.1 documentation

• Open Source Integration node installation cheat sheet: https://communities.sas.com/t5/SAS-Data-Mining-Library/The-Open-Source-Integration-node-installation-cheat-sheet/ta-p/223470

• GitHub examples: https://github.com/sassoftware/enlighten-integration

Base SAS Java Object

• Base SAS documentation: http://support.sas.com/documentation/cdl/en/lrcon/68089/HTML/default/viewer.htm#n0swy2q7eouj2fn11g1o28q57v4u.htm

• "Connecting Java to SAS Data Sets" white paper: http://support.sas.com/resources/papers/proceedings12/008-2012.pdf


You might also like …

• The Preoccupation With Test Error in Applied Machine Learning: https://www.oreilly.com/ideas/the-preoccupation-with-test-error-in-applied-machine-learning

• The Evolution of Analytics: http://www.oreilly.com/data/free/the-evolution-of-analytics.csp

Stay in touch …

• Quora: www.quora.com/Patrick-Hall-4

• GitHub: github.com/jphall663

• Twitter: @jpatrickhall

• LinkedIn: https://www.linkedin.com/in/jpatrickhall

github.com/sassoftware

Thanks!