Microsoft Technologies for Data Science, SQL Saturday, May 2015

Posted on 20-Jul-2015

Microsoft Technologies for Data Science

Mark Tabladillo, Ph.D.

Senior Data Scientist

LogicBlox/Predictix

Networking

Interactive

http://www.bizjournals.com/atlanta/subscriber-only/2014/07/11/top-employers-in-metro-atlanta.html

http://www.kdnuggets.com/polls/2014/analytics-data-mining-data-science-software-used.html

http://products.office.com/en-us/excel

http://www.microsoft.com/en-us/server-cloud/products/sql-server/

http://pytools.codeplex.com/

http://azure.microsoft.com/en-us/services/hdinsight/

http://www.revolutionanalytics.com/

SQL Server Data Mining: Analysis Services

http://sqlserverdatamining.com

SS

SQL

AS

NoSQL

Database Services: SQL Server*, SQL Azure*, Replication, SQL Azure Data Sync*, Full Text & Semantic Search*

Data Integration Services: Integration Services*, Master Data Services*, Data Quality Services*, StreamInsight*, Project “Austin”*

Analytical Services: Analysis Services*, Data Mining, PowerPivot*

Reporting Services: Reporting Services*, SQL Azure Reporting*, Report Builder, Power View*

Data Mining

SSMS SSIS PowerShell .NET

Data mining add-in for business analysts

• Ease of use

• Rich data mining

• Scalable

Rowset Output with Scores

Varchar, NVarchar

Office, PDF Documents

Full-Text Keyword Index “FTI”

iFilters

Semantic Document Similarity Index “DSI”

Semantic Database

Semantic Key Phrase Index (Tag Index “TI”)

Simplified Chinese

British English

Portuguese

Chinese (Hong Kong SAR, PRC)

Spanish

Chinese (Singapore)

Chinese (Macau SAR)

Time in Seconds vs. Number of Documents

(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)

Video Intro

Difference in Proportions Test

Lexicon Based Sentiment Analysis

Forecasting-Exponential Smoothing

Forecasting - ETS+STL

Forecasting-AutoRegressive Integrated Moving Average (ARIMA)

Normal Distribution Quantile Calculator

Normal Distribution Probability Calculator

Normal Distribution Generator

Binomial Distribution Probability Calculator

Binomial Distribution Quantile Calculator

Binomial Distribution Generator

Multivariate Linear Regression

Survival Analysis

Binary Classifier

Cluster Model
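
The first statistical entry in the list above, the difference in proportions test, can be illustrated with a short sketch. This is not the code behind any hosted service; it is a minimal standard-library Python version of the textbook two-sided, pooled two-proportion z-test. The function name and its parameters are hypothetical and chosen only for illustration.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions.

    Hypothetical helper for illustration; assumes samples large enough
    for the normal approximation and a pooled proportion strictly
    between 0 and 1.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(60, 100, 40, 100)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the two groups' underlying rates differ; with identical sample proportions the statistic is zero and the p-value is one.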

datamarket.azure.com

http://azure.microsoft.com/en-us/pricing/details/machine-learning/

                          Mutable            Immutable
Classic Open Source       Java               Scala
.NET, Now Open Source     C#, C++, VB.NET    F#

http://channel9.msdn.com/posts/Erik-Meijer-Functional-Programming-From-First-Principles

http://www.kdd.org/

http://blogs.technet.com/b/machinelearning/

http://social.msdn.microsoft.com/forums/azure/en-US/home?forum=MachineLearning

http://sqlserverdatamining.com

http://marktab.net

http://curah.microsoft.com/342704/azure-machine-learning-videos-february-2015

http://www.inside-r.org/

http://datascience.sqlpass.org