the database pro’s - team data... · 2018-10-25 · the one platform for physical, virtual, and...

The Database Pro’s Guide to the Team Data Science Process

Post on 20-May-2020

Page 1: The Database Pro’s - Team Data... · 2018-10-25 · The One Platform for Physical, Virtual, and Cloud Performance. Free e-books In these books, you will find useful, hand-picked

The Database Pro’s Guide to the Team Data Science Process

Why SentryOne?

Deep Functionality
• Fastest Diagnosis
• Automation
• Highest granularity
• Jump-to navigation
• Highly configurable

Unrivaled Scalability
• Proven
• Leading Edge Technology
• End-to-end full environmental view
• Lowest overhead

Unmatched Expertise
• Microsoft-only focus
• 8 Microsoft MVPs
• Community Commitment
• Built by experts

Top-Rated Customer Support
• 100% customer satisfaction on tickets
• 50% of tickets resolved in <1 hr
• 8.8/10 TrustRadius score
• NPS 50% higher than the industry

Visibility Across the Microsoft Data Platform

• Monitor, diagnose, and optimize SQL Server performance.
• Find and fix SSAS performance problems.
• Monitor resource utilization (memory, CPU, network, storage) of VM environments.
• Boost Windows Server and Hyper-V virtualized environment performance.
• See performance for Microsoft Analytics Platform System, data movement and queries.
• Accelerate Azure SQL Data Warehouse performance with visibility into workload impacts.
• Keep Azure SQL Databases running at peak efficiency with metrics and DTU usage.
• Drive productivity with 60+ SSIS components for connectivity, execution performance, and security.
• Inspect and compare schemas, objects, and data.
• Automatic documentation, object lineage, impact analysis, data dictionary and environment compare.
• SSIS and SSRS monitoring and package deployment in Visual Studio.
• SSIS script development and sharing.
• Automated data-centric application testing and data verification.
• SQL plan analysis and advanced query tuning.

Development Operations BI Administration

Free Resources

The One Platform for Physical, Virtual, and Cloud Performance.

Free e-books
In these books, you will find useful, hand-picked articles that will help give insight into some of your most vexing performance problems. These articles were written by several of the SQL Server industry's leading experts, including Aaron Bertrand, Paul White, Paul Randal, Jonathan Kehayias, Erin Stellato, Glenn Berry, and Joe Sack.

http://www.sentryone.com/ebooks/

Websites
SQLPerformance.com provides innovative and practical solutions for improving SQL Server performance.

SentryOne.com/Resources offers an inside look into the world of SentryOne with videos on query tuning and product demos, white papers, ebooks, and tech briefs.

Blogs.SentryOne.com is where you can find all of our team members’ blogs as well as important information about the latest updates to SentryOne software, SQL Server and server performance issues.

Your Presenter

Kevin Kline
Principal Program Manager, SentryOne
Microsoft Data Platform MVP since 2003

Contact Info
Twitter: @Kekline
Email: [email protected]
Blog: http://blogs.SentryOne.com/author/kevinkline/
LinkedIn: https://www.linkedin.com/in/kekline/

Agenda

• What to expect:
  • How to get started with Machine Learning and the Cortana AI Suite
  • Good practices
  • What to study next

• What NOT to expect:
  • Enough skills to get a job with ML or AI
  • Full depth and breadth of SQL Server ML Services
  • Statistics knowledge

What’s Machine Learning?

http://dilbert.com/strip/2013-02-02

Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions. –Wikipedia.org/wiki/Machine_learning

Ok, Really?

• ML will answer questions like:
  • Is this A or B?
  • Is this weird?
  • How much – or – How many?
  • How is this organized?
  • What should I do next?

Great write-up at https://www.sqlmelody.com/tdsp-lifecycle-business-understanding/

Generally Speaking…

• Traditional programming combines code + data running on CPU(s) to produce output.

• ML combines data + output running on CPU(s) to produce a program.
  • The program is the trained model, which then becomes an input on future iterations.

• Remember the goal is to make predictions and forecasts.
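
The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the function names (`handwritten_rule`, `fit_line`), the sample data, and the choice of a least-squares line are all assumptions made for the example.

```python
# Traditional programming: a person writes the rule, then applies it to data.
def handwritten_rule(x):
    return 2.0 * x + 1.0  # rule chosen by the programmer (hypothetical)


# Machine learning: start from data plus known outputs and derive the rule.
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # The returned function is the "program" that training produces.
    def model(x):
        return slope * x + intercept

    return model


xs = [0.0, 1.0, 2.0, 3.0, 4.0]          # hypothetical example inputs
ys = [handwritten_rule(x) for x in xs]   # known outputs for those inputs

learned_model = fit_line(xs, ys)         # data + output -> program
prediction = learned_model(10.0)         # use the program on unseen input
print(prediction)
```

Here the learned function recovers the same rule the programmer wrote by hand, which is the point of the comparison: training turns examples into a reusable predictor.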

The Team Data Science Lifecycle

The Team Data Science Lifecycle

Play the game “Then what would you do?” to its conclusion.

The Team Data Science Lifecycle

Lambda Architecture

20% of Job / 80% of Work

Azure baby!

The Team Data Science Lifecycle

The Team Data Science Lifecycle

Microsoft and the community give away the store to make this as easy as possible!

The Team Data Science Lifecycle

Common Roles in TDSP

Applying Algorithms to TDSP

Walkthrough of the Process

Input Data

Preprocess Data

Split Data
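
The slide names this step without detail. One common way to do it is a shuffled hold-out split, sketched below under assumptions of our own: the helper name `train_test_split`, the 70/30 ratio, and the fixed seed are illustrative choices, not values taken from the presentation.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))                     # hypothetical dataset of 10 records
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

The model is trained only on `train_rows`; `test_rows` is held back so that evaluation reflects data the model has not seen.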

Algorithm Selection

Train the Model

https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/

Check out the Azure AI Gallery to make life much easier!

In-database Analytics

Train the Model

Train the Model

• https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234
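
As a rough companion to the linked article, the sketch below computes four widely used classification metrics from a confusion matrix. The function names and the sample labels are hypothetical, and which metrics matter will depend on the problem.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive and a != positive:
            fp += 1
        elif p != positive and a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, recall, and F1 (0.0 when undefined)."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


actual = [1, 1, 1, 0, 0, 0, 1, 0]      # hypothetical true labels
predicted = [1, 0, 1, 0, 1, 0, 1, 0]   # hypothetical model output
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Accuracy alone can mislead on imbalanced data, which is why precision and recall are usually reported alongside it.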

Tips and Tricks

• When installing…

• DON’T
  • Run R / Python script as-is
  • Embed secrets in scripts
  • Do data transformations that can be achieved in SQL
  • Access network resources
  • Process/transform files as part of the stored procedure call
  • Embed the R/Python code directly in applications

• DO
  • Develop/Test from RTVS, PTVS, RStudio or other IDE
  • SQL Compute Context from client
  • Data processing & transformations in SQL Server
  • Data integration using SQL Server features
  • Model management in database
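
The "model management in database" item can be illustrated with a minimal sketch: serialize the trained model to bytes and store it as a row in a table. Everything specific here is an assumption for the example. An in-memory SQLite database stands in for SQL Server so the code runs anywhere, the `models` table and its columns are hypothetical, and a plain dictionary stands in for a real trained model. In SQL Server the payload would typically live in a binary column such as VARBINARY(MAX).

```python
import pickle
import sqlite3

# Hypothetical stand-in for a trained model; any picklable object works.
model = {"slope": 2.0, "intercept": 1.0}

# In-memory SQLite is only a stand-in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models (name TEXT PRIMARY KEY, payload BLOB NOT NULL)"
)

# Store the serialized model under a versioned name.
conn.execute(
    "INSERT INTO models (name, payload) VALUES (?, ?)",
    ("line_v1", pickle.dumps(model)),
)
conn.commit()

# Later: load the model back by name and score a new input.
row = conn.execute(
    "SELECT payload FROM models WHERE name = ?", ("line_v1",)
).fetchone()
restored = pickle.loads(row[0])
score = restored["slope"] * 10.0 + restored["intercept"]
print(score)
```

Keeping models in a table means they are backed up, secured, and versioned with the rest of the data. Note that unpickling runs arbitrary code, so only load payloads from a table you trust.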

Tools and Resources

• Get the project templates:
  • https://github.com/Azure/Azure-TDSP-ProjectTemplate

• Microsoft wants this to be easy!
  • https://github.com/Azure/Azure-TDSP-Utilities includes:
    • IDEAR – Interactive Data Exploration, Analysis, and Reporting tool
    • AMAR – Automated Modeling and Reporting tool
  • Lots of docker containers that are pre-assembled

• Learning resources:
  • The Team Data Science Process - https://aka.ms/tdsp
  • Learn AI - http://learnanalytics.microsoft.com/
  • A Plethora of learning resources! https://blogs.technet.microsoft.com/machinelearning/2017/01/25/a-plethora-of-microsoft-training-options-on-ai-machine-learning-data-science-including-moocs/

More Goodies

• Getting started SQL + ML tutorials: http://aka.ms/sqldev

• SSMS Reports for R Services: http://bit.ly/2r525gu

• Build 2017: Serving AI with Data (SQL Server 2017): http://bit.ly/2rbEDlZ

• SQL Server Samples on GitHub:
  • R Services http://bit.ly/2s7tBNV
  • ML Services http://bit.ly/2rbLNGz

• Using RTVS: http://bit.ly/2rbrXLU

• Build 2017 session recordings on Channel 9: http://bit.ly/2puyVun

• Microsoft Virtual Academy online: http://aka.ms/mva

Tools and Resources, Non-Microsoft

• What’s your ML test score? A rubric for ML production systems
  • https://ai.google/research/pubs/pub45742

• Lessons learned turning machine learning models into real products and services
  • https://www.oreilly.com/ideas/lessons-learned-turning-machine-learning-models-into-real-products-and-services

Summary

Download Plan Explorer!
http://sentryone.com/plan-explorer

Like our products?
• Please consider giving us a referral and/or evaluation on TrustRadius!

Let’s connect!
• Facebook, LinkedIn, Twitter at KEKLINE.
