Machine Learning Services with SQL Server 2017
MARK TABLADILLO PH.D.
LEAD DATA SCIENTIST, MICROSOFT
JULY 31, 2017
Microsoft
https://marketrealist.imgix.net/uploads/2017/07/Microsoft-Shares-Are-at-an-All-Time-High-2017-07-24.jpg?w=660&fit=max&auto=format
Microsoft and Open Source
SQL Server 2017 on Linux
Nearly 1/3 of Virtual Machines (IAAS) on Azure are Linux https://news.microsoft.com/bythenumbers/azure-virtual
Purchase of RevolutionR
R Distribution Microsoft R Client
R inside Azure Machine Learning, Power BI, SQL Server, Jupyter
Python inside Azure Machine Learning, SQL Server, Jupyter
Cloud Shell In Azure (preview): yes, we mean Bash
https://azure.microsoft.com/en-us/features/cloud-shell/
Microsoft now the leading contributor on GitHub
Focus
1) to describe major features of this technology for technology
managers;
2) to outline use cases for architects; and
3) to provide demos for developers and data scientists.
SQL Server 2017
MAJOR FEATURES
Gartner Review October 2016
SQL Server on Linux
Possible with Drawbridge
Over 1M Docker Downloads
Whitepaper on Linux
https://info.microsoft.com/SQL-Server-on-Linux-Open-source-enterprise-environment.html
Video – Overview of SQL Server on Linux
https://channel9.msdn.com/events/connect/2016/101
Microsoft Release Acronyms
CTP: Community Technology Preview
RC: Release Candidate
Versions of Microsoft SQL Server
https://docs.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2017
Enterprise
Many data scientists will use the free developer version (not
intended for production)
Since we are still at RC (Release Candidate):
Free 180-day evaluation version (Enterprise equivalent)
Windows Docker image
Linux Docker image
https://www.microsoft.com/en-us/evalcenter/evaluate-sql-server-2017-ctp
Data Science & AI Certifications
https://borntolearn.mslearn.net/b/weblog/posts/microsoft-introduces-several-new-data-management-amp-analytics-certifications
Team Data Science Process https://github.com/Azure/Microsoft-TDSP
What is R?
• A statistics programming language
• Data analysis and visualization capabilities
• Majority of data scientists use R
• Thriving user groups worldwide
• Vibrant open source community
• 10,000+ free algorithms in CRAN
• New and recent grads use it
• Strong ties to academia feed ever-growing machine learning capabilities
• Constantly innovating
Callouts: #1 language for advanced analytics; 2.5M+ users; open, biggest ecosystem
R usage growth (Rexer Data Miner Survey, 2007-2015):
76% of analytic professionals report using R
36% select R as their primary tool
But open source R is not enterprise class:
• Inadequate modeling performance
• Lack of commercial support
• Complex deployment processes
• Limited data scale
Our data science tool that allows you to do high performance analytics on production data, running locally on
your computer.
https://microsoft.github.io/r-server-loan-chargeoff/index.html
https://docs.microsoft.com/en-us/sql/advanced-analytics/getting-started-with-machine-learning-services
O16N: OPERATIONALIZATION
Classified as Microsoft Confidential
• Turn R analytics into web services in one line of code
• Swagger-based REST APIs, easy to consume from any programming language, including R
• Deploy the web service server to any platform: Windows, SQL, Linux/Hadoop
• On-prem or in the cloud
• Fast scoring, real time and batch
• Scaling to a grid for powerful computing with load balancing
• Diagnostic and capacity evaluation tools
• Enterprise authentication: AD/LDAP or AAD
• Secure connection: HTTPS with SSL/TLS 1.2
• Enterprise-grade high availability
[Diagram: operationalization workflow between Data Scientist, Developer, and R Server]
Easy Setup: Microsoft R Server configured for operationalizing R analytics
▪ In-cloud or on-prem
▪ Adding nodes to scale
▪ High availability & load balancing
▪ Remote execution server
Easy Deployment: Data Scientist, Microsoft R Client (mrsdeploy package), publishService
Easy Integration: Developer
Easy Consumption: Data Scientist, Microsoft R Client (mrsdeploy package)
Build the model first, then deploy it as a web service instantly
Function            Description
publishService      Publish a predictive function as a web service
deleteService       Delete a web service
getService          Get a web service
listServices        List the different published web services
serviceOption       Retrieve, set, and list the different service options
updateService       Update a web service
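The build-then-publish flow in the function list above can be sketched in R as follows. This is a minimal sketch, not the presenter's demo: the `mtcars` model, the service name `mtService`, and the environment variables `MRS_ENDPOINT`, `MRS_USER`, and `MRS_PASSWORD` are illustrative assumptions. The server calls only run when the mrsdeploy package is installed and an endpoint is configured, so the modeling part can be tried on any R installation.

```r
# Build the model first (base R only; mtcars ships with R).
model <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Prediction function to expose as a service:
# returns the probability of a manual transmission.
manualTransmission <- function(hp, wt) {
  newdata <- data.frame(hp = hp, wt = wt)
  unname(predict(model, newdata, type = "response"))
}

# Local sanity check before publishing.
local_answer <- manualTransmission(120, 2.8)

# Publishing needs an R Server configured for operationalization.
# MRS_ENDPOINT, MRS_USER and MRS_PASSWORD are hypothetical variable
# names chosen for this sketch.
endpoint <- Sys.getenv("MRS_ENDPOINT")
if (nzchar(endpoint) && requireNamespace("mrsdeploy", quietly = TRUE)) {
  library(mrsdeploy)
  remoteLogin(endpoint,
              username = Sys.getenv("MRS_USER"),
              password = Sys.getenv("MRS_PASSWORD"),
              session = FALSE)

  # Deploy the function and model as a versioned web service.
  api <- publishService(
    "mtService",
    code = manualTransmission,
    model = model,
    inputs = list(hp = "numeric", wt = "numeric"),
    outputs = list(answer = "numeric"),
    v = "v1.0.0"
  )

  # Test the published service right away.
  result <- api$manualTransmission(120, 2.8)
  print(result$output("answer"))

  remoteLogout()
}
```

Keeping the prediction logic in a plain function means the same code can be checked locally and then handed to `publishService` unchanged.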
Data Scientist: Generate Swagger docs for web services
# Run the following code in R
swagger <- api$swagger()
cat(swagger, file = "swagger.json", append = FALSE)
Developer: Run Swagger tools to generate code
Popular Swagger tools: AutoRest or Code Generator
AutoRest.exe -CodeGenerator CSharp -Modeler Swagger -Input swagger.json -Namespace Mynamespace
Developer: Write a few lines of code to consume the service
Share / reuse R code and functions
• Not just models: a data scientist can share any functional code as a service.
• Other data scientists can explore the repository to reuse those functions.
Enable model management capabilities
• A predictive web service = "model" + "prediction script"
• R Server hosts all those services: central repository of models
• Each service has a version tag: model version control
• All versions are active: model rollback (to any version)
• A service can be accessed by any authorized user
• Model reuse
• Model validation and monitoring by QA team
After the service is published, I can test right away whether it works as expected
R Client (mrsdeploy package)
▪ Built-in remote execute functions in R Client/R Server
▪ Generate diff report to reconcile local and remote
▪ Execute .R script or interactive R commands
▪ Results come back to local
▪ Generate working snapshots for resume and reuse
▪ IDE agnostic
R Server configured to remote execute R scripts (supports Windows Server, Linux Server, Hadoop)
[Diagram: remote execution cycle]
▪ Log in to remote server
▪ Generate diff report
▪ Reconcile environment
▪ Execute R scripts
▪ Snapshot remote environment
▪ Log out of remote server
Snapshot Functions
createSnapshot      Create a snapshot of the remote session (workspace and working directory)
loadSnapshot        Load a snapshot from the server into the remote session (workspace and working directory)
listSnapshots       Get a list of snapshots for the current user
downloadSnapshot    Download a snapshot from the server
deleteSnapshot      Delete a snapshot from the server
Remote Objects Management
listRemoteFiles     Get a list of files in the working directory of the remote session
deleteRemoteFile    Delete a file from the working directory of the remote R session
getRemoteFile       Copy a file from the working directory of the remote R session
putLocalFile        Copy a file from the local machine to the working directory of the remote R session
getRemoteObject     Get an object from the remote R session
putLocalObject      Put an object from the local R session and load it into the remote R session
getRemoteWorkspace  Take all objects from the remote R session and load them into the local R session
putLocalWorkspace   Take all objects from the local R session and load them into the remote R session
Remote Connection
remoteLogin         Remote login to the R Server with AD or admin credentials
remoteLoginAAD      Remote login to the R Server using Azure AD
remoteLogout        Log out of the remote session on the DeployR Server
Remote Execution
remoteExecute       Remote execution of either R code or an R script
remoteScript        Wrapper function for remote script execution
diffLocalRemote     Generate a 'diff' report between local and remote
pause               Pause the remote connection and return to the local session
resume              Return the user to the 'REMOTE >' command prompt
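A few of the functions listed above can be combined into a short remote session, sketched below. This is an illustration under stated assumptions rather than the presenter's demo: the environment variables `MRS_ENDPOINT`, `MRS_USER`, and `MRS_PASSWORD`, the snapshot name, and the R snippet itself are all hypothetical. The remote calls are skipped unless mrsdeploy is installed and an endpoint is set.

```r
# R code to run on the server, kept as a string for remoteExecute().
remote_code <- "x <- rnorm(100); m <- mean(x)"

# Cheap local check that the snippet parses before sending it anywhere.
parsed <- parse(text = remote_code)

# MRS_ENDPOINT, MRS_USER and MRS_PASSWORD are hypothetical variable
# names chosen for this sketch.
endpoint <- Sys.getenv("MRS_ENDPOINT")
if (nzchar(endpoint) && requireNamespace("mrsdeploy", quietly = TRUE)) {
  library(mrsdeploy)

  # commandline = FALSE stays at the local prompt instead of
  # switching to the REMOTE> prompt.
  remoteLogin(endpoint,
              username = Sys.getenv("MRS_USER"),
              password = Sys.getenv("MRS_PASSWORD"),
              session = TRUE,
              commandline = FALSE)

  # Run the snippet in the remote R session.
  remoteExecute(remote_code)

  # Copy the remote object `m` back into the local session.
  getRemoteObject("m")

  # Save the remote workspace and working directory for later reuse.
  snapshot_id <- createSnapshot("demo-snapshot")
  print(listSnapshots())

  remoteLogout()
}
```

A snapshot created this way can later be restored with `loadSnapshot`, which is what makes the "resume and reuse" workflow possible.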
[Diagram: Prepare, Model, Operationalize pipelines, on-prem and cloud. Labels: SQL 2016; R & ScaleR models; CRAN R models; AzureML web services; R Server VMs; T-SQL/stored procedure; R Server; deploy to SQL Server 2016; deploy to Hadoop / Linux Server / Windows Server.]
[Diagram: on-prem Prepare, Model, Operationalize pipeline. Labels: R & ScaleR models; R models.]
[Diagram: on-prem Prepare, Model, Operationalize pipeline. Labels: SQL, HDFS; R & ScaleR models; R Server; T-SQL/stored procedure.]
Product               Platform                     Modeling                        Operationalization
R Server for Windows  Windows Server               2012 - 2016                     Same as modeling
R Server for Linux    Red Hat Enterprise Linux     6.x and 7.x                     7.x
R Server for Linux    SUSE Enterprise              SLES 11                         Will be supported in a future release
R Server for Linux    Ubuntu                       14.04 LTS, 16.04 LTS            Same as modeling
R Server for Linux    CentOS                       6.x and 7.x                     7.x
R Server for Hadoop   Red Hat and SUSE Enterprise  RHEL 6.x and 7.x, SUSE SLES 11  RHEL 7.x
• Easily scale up a single server to a grid to handle more concurrent requests
• Load balancing across compute nodes
• A shared pool of warmed-up R shells to improve scoring performance
• Health check node configuration
• Get system status
• Trace R code execution
• Trace service execution
• Evaluate grid capacity
• Simulate traffic per service
• Configure with number of concurrent threads or latency thresholds
• Seamless integration with authentication solutions: LDAP/AD/AAD
• Secure connection: HTTPS encrypted with SSL/TLS 1.2
• Compliance with the Microsoft Security Development Lifecycle
• Server-level HA: introduce multiple web nodes for active-active backup/recovery behind a load balancer
• Data store HA: leverage the HA capabilities of an enterprise-grade database (SQL Server or PostgreSQL)
Connect
SlideShare
Twitter @marktabnet
Abstract
SQL Server 2017 introduces Machine Learning Services with two
independent technologies: R and Python. The purpose of this
presentation is 1) to describe major features of this technology
for technology managers; 2) to outline use cases for architects; and
3) to provide demos for developers and data scientists.