Page 1: Building BI Solutions with SQL Server PDW AU3 Ruwen Hess Senior Program Manager Microsoft Corporation DBI321

Building BI Solutions with SQL Server PDW AU3
Ruwen Hess
Senior Program Manager
Microsoft Corporation

DBI321

Agenda

• Trends in the DW space
• How does SQL Server PDW fit in?
• SQL Server PDW AU3 – What’s new?
• Building BI Solutions with SQL Server PDW
    • Customer Successes
    • Using SQL Server PDW with Microsoft BI solutions
    • Using SQL Server PDW with third party BI solutions
    • BI solutions leveraging Hadoop integration
• What’s coming next in SQL Server PDW?

Trends in the Data Warehousing Space
Understanding the Opportunity

Figure: Approximate data volume managed by DW (Source: TDWI Report – Next Generation DW)

Data volume       Today   In 3 years
Less than 1 TB    41%     17%
1 - 3 TB          21%     18%
3 - 10 TB         19%     25%
More than 10 TB   17%     34%
Don't Know        2%      6%

Performance at scale: ability to analyze massive amounts of data

DW systems continue to grow at a fast pace. Scalability is a key concern: growing a system from 10s of TB, to 100s of TB, to PBs.

Data Warehousing has shifted almost entirely towards the appliance model due to speed of the balanced appliance and scalability of scale out (MPP) solutions.

Jim Kobielus, Forrester Research

Trends in the Data Warehousing Space
Understanding the Opportunity

• Appliances are the key trend in the next 4 years (4 Billion market by ‘15)
• Cloud DW longer-term
• Box is a slow decline

Figure: DW Software License Revenue, US$ Billions (Source: MS internal analysis, DBSMIT Cloud Market Opportunity Forecast)

Segment         FY10   FY11   FY12   FY13   FY14   FY15   CAGR    Share (‘15)
Traditional     7.9    8      8.2    8.2    8.1    7.7    -0.3%   60.4%
Appliances/RA   1.1    1.5    1.9    2.4    3      3.8    26.2%   30.0%
Private Cloud   (yearly values not labeled)                       5.0%
Public Cloud    (yearly values not labeled)                       4.6%

Overall CAGR: 7.1%


What is Parallel Data Warehouse (PDW)?
SQL Server Data Warehousing in Appliance Model

• Scale out
• Scalable
• Standards Based
• Flexible
• Cost Effective

SQL Server PDW Hardware Architecture

[Diagram: a CONTROL RACK holding the Control Node (query submitted here), Management Node, Landing Zone and Backup Node, next to a DATA RACK of SQL Server compute nodes.]

• Query is executed on all nodes
• Multiple queries are simultaneously executed across all nodes
• PDW supports querying while data is loading

PDW Data Example

[Diagram: a star schema stored across four PDW Compute Nodes (SQL Server).]

Sales Facts: Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp Id, Qty Sold, Dollars Sold
Time Dim: Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day
Store Dim: Store Dim ID, Store Name, Store Mgr, Store Size
Product Dim: Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc
Mktg Campaign Dim: Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End

PDW Data Example

Smaller Dimension Tables are Replicated on Every Compute Node

[Diagram: the same star schema; each of the four compute nodes holds its own copy of PD, TD, MD and SD (Product, Time, Mktg Campaign and Store Dim).]

PDW Data Example

Larger Fact Table is Hash Distributed Across All Compute Nodes

[Diagram: Sales Facts is split into SF-1, SF-2, SF-3 and SF-4, one slice per compute node, alongside the replicated PD, TD, MD and SD copies.]
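The placement idea on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the choice of `prod_dim_id` as distribution column and the CRC32 hash are made up for the example and are not PDW's actual hash function or data.

```python
import zlib

NUM_NODES = 4  # the slides show four compute nodes


def node_for(key, num_nodes=NUM_NODES):
    """Map a distribution-column value to a node with a stable hash (assumed: CRC32)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def distribute(rows, column, num_nodes=NUM_NODES):
    """Hash-distribute fact rows: each row is stored on exactly one node."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[column], num_nodes)].append(row)
    return nodes


def replicate(rows, num_nodes=NUM_NODES):
    """Replicate a small dimension table: every node gets a full copy."""
    return [list(rows) for _ in range(num_nodes)]


# Hypothetical sample data shaped like the star schema on the slide.
store_dim = [{"store_dim_id": s, "store_name": f"Store {s}"} for s in range(1, 4)]
sales_facts = [
    {"store_dim_id": 1 + i % 3, "prod_dim_id": i, "qty_sold": 1 + i % 5}
    for i in range(20)
]

fact_slices = distribute(sales_facts, "prod_dim_id")
dim_copies = replicate(store_dim)

# Each node joins its own fact slice to its local dimension copy; no rows move.
local_join_counts = [
    sum(1 for f in facts for d in dims if f["store_dim_id"] == d["store_dim_id"])
    for facts, dims in zip(fact_slices, dim_copies)
]
print(local_join_counts, sum(local_join_counts))
```

Because every node holds the whole Store Dim, the per-node join counts add up to the full fact row count without any data movement between nodes.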

SQL Server Parallel Data Warehouse
A quick look at MPP query execution

[Diagram: inside the SQL Server PDW Appliance, a Client talks to the Control Node, which talks to Compute Node 1, Compute Node 2, ... Compute Node N.]

• The user connects to ‘the appliance’ as to a ‘normal’ SQL Server and sends a request.
• The control node handles global query execution and generates a distributed execution plan.
• The actual user data resides on compute nodes, and steps of the global execution plan are executed on each compute node.
• SQL Server PDW is a shared nothing MPP system, meaning user data is distributed across the nodes*. The Data Movement Service is responsible for moving data around so that individual nodes can satisfy queries that need data from other nodes.

Dealing with Distributions - Shuffling

Example:
Select [color], SUM([qty]) from [Store Sales] group by [color];

Distributed Table: Store Sales

Compute Node 1
Ss_id   color    qty
1       Red      5
3       Blue     11
5       Red      12
7       Green    7

Compute Node 2
Ss_id   color    qty
2       Red      8
4       Blue     10
6       Yellow   12

Shuffle Movement: DMS redistributes the data by color values in parallel (Hash).

Temp_1 (one per compute node after the shuffle)
color    qty
Red      5
Red      12
Red      8
Green    7

color    qty
Blue     11
Yellow   12
Blue     10

Parallel Merge and Aggregate, then Return
color    qty
Blue     21
Red      25
Green    7
Yellow   12
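The shuffle on this slide can be replayed in Python with the same rows. This is a minimal sketch: CRC32 stands in for whatever hash DMS really uses, so a color may land on a different node than in the slide's picture, but the aggregated result is the same.

```python
import zlib
from collections import defaultdict

# Rows of [Store Sales] as (ss_id, color, qty), per compute node, from the slide.
node_rows = [
    [(1, "Red", 5), (3, "Blue", 11), (5, "Red", 12), (7, "Green", 7)],
    [(2, "Red", 8), (4, "Blue", 10), (6, "Yellow", 12)],
]


def target_node(color, num_nodes):
    """Stable stand-in for the hash applied to the GROUP BY column (an assumption)."""
    return zlib.crc32(color.encode("utf-8")) % num_nodes


def shuffle(nodes):
    """Redistribute (color, qty) so all rows of one color land on one node."""
    temp = [[] for _ in nodes]
    for rows in nodes:
        for _ss_id, color, qty in rows:
            temp[target_node(color, len(nodes))].append((color, qty))
    return temp


def aggregate(rows):
    """Local SUM(qty) GROUP BY color on a single node."""
    totals = defaultdict(int)
    for color, qty in rows:
        totals[color] += qty
    return dict(totals)


temp_1 = shuffle(node_rows)
partials = [aggregate(rows) for rows in temp_1]

# After the shuffle no color spans two nodes, so the final merge is a plain union.
result = {}
for partial in partials:
    result.update(partial)
print(result)
```

The totals match the slide's Return table: Red 25, Blue 21, Green 7, Yellow 12.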

SQL Server Parallel Data Warehouse
Overall Architecture

Control Rack
• Control Node: Client Interface (JDBC, ODBC, OLE-DB, ADO.NET), PDW Engine, DMS Manager, PDW Agent
• Landing Zone Node: Bulk Data Loader (ETL Interface), PDW Agent
• Management Node: Active Directory, PDW Agent

Data Rack (up to 4)
• Compute Node 1, Compute Node 2, ... Compute Node 10: each with DMS Core and PDW Agent

Legend: DMS = Data Movement Service; PDW = Parallel Data Warehouse; PDW service


SQL Server Parallel Data Warehouse AU3
Release Themes

SQL Server Compatibility
• Broader functionality
• Full Alignment

Performance At Scale
• Less work for the same results
• Do the same work more efficiently

BI, Analytics, & ETL Integration
• Native Support for Analysis Services, Reporting Services, PowerPivot
• Lay the foundation for broad connectivity support

SQL Server PDW Architecture
How did it work before?

[Diagram: Control Node]

Problem
• Basic RDBMS functionality that already exists in SQL Server was re-built in PDW

Challenge for PDW AU3 release
• Can we leverage SQL Server and focus on MPP related challenges?

SQL Server PDW AU3 Architecture
PDW AU3 Architecture with Shell Appliance and Cost-Based Query Optimizer

[Diagram: on the Control Node, the Engine Service and the Shell Appliance (SQL Server); a SELECT becomes plan steps that run on each Compute Node (SQL Server).]

• SQL Server runs a ‘Shell Appliance’
• Every database exists as an empty ‘shell’
    • All objects, no user data
• DDL executes against both the shell and the compute nodes
• Large parts of basic RDBMS functionality now provided by the shell
    • Authentication and authorization
    • Schema binding
    • Metadata catalog

SQL Server PDW AU3 Architecture
PDW AU3 Architecture with Shell Appliance and Cost-Based Query Optimizer

1. User issues a query
2. Query is sent to the Shell through sp_showmemo_xml stored procedure
    • SQL Server performs parsing, binding, authorization
    • SQL optimizer generates execution alternatives
3. MEMO containing candidate plans, histograms, data types is generated
4. Parallel execution plan generated
5. Parallel plan executes on compute nodes
6. Result returned to the user

[Diagram: on the Control Node, the Shell Appliance (SQL Server) passes the MEMO to the Engine Service; plan steps run on each Compute Node (SQL Server) and the result is returned.]

PDW Cost-Based Optimizer
Optimizer lifecycle…

1. Simplification and space exploration
    • Query standardization and simplification (e.g. column reduction, predicates push-down)
    • Logical space exploration (e.g. join re-ordering, local/global aggregation)
    • Space expansion (e.g. bushy trees – dealing with intermediate resultsets)
    • Physical space exploration
    • Serializing MEMO into binary XML (logical plans)
    • De-serializing binary XML into PDW Memo
2. Parallel optimization and pruning
    • Injecting data move operations (expansion)
    • Costing different alternatives
    • Pruning and selecting lowest cost distributed plan
3. SQL Generation
    • Generating SQL Statements to be executed
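One of the logical alternatives named above, local/global aggregation, can be illustrated with a small Python sketch. It reuses the Store Sales rows from the shuffling slide; the function names are hypothetical and the row counts only stand in for the amount of data a movement step would carry.

```python
from collections import Counter

# (color, qty) rows per compute node, taken from the Store Sales example.
node_rows = [
    [("Red", 5), ("Blue", 11), ("Red", 12), ("Green", 7)],
    [("Red", 8), ("Blue", 10), ("Yellow", 12)],
]


def global_only(nodes):
    """Move every row, then aggregate once."""
    moved = [row for rows in nodes for row in rows]
    totals = Counter()
    for color, qty in moved:
        totals[color] += qty
    return dict(totals), len(moved)


def local_then_global(nodes):
    """Aggregate on each node first, then move only the partial sums."""
    partials = []
    for rows in nodes:
        local = Counter()
        for color, qty in rows:
            local[color] += qty
        partials.extend(local.items())
    totals = Counter()
    for color, qty in partials:
        totals[color] += qty
    return dict(totals), len(partials)


plain, rows_moved_plain = global_only(node_rows)
split, rows_moved_split = local_then_global(node_rows)
print(plain == split, rows_moved_plain, rows_moved_split)  # True 7 6
```

Both alternatives return the same totals, but the local/global variant moves 6 partial rows instead of 7 raw rows. On real fact tables with few distinct group values the gap would be far larger, which is why an optimizer would want to cost both.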

PDW Cost-Based Optimizer
… And Cost Model Details

PDW cost model assumptions:
• Costing only data movement operations (relational operations excluded)
• Sequential step execution (no pipelined and independent parallelism)
• Data movement operation costs modeled in detail
    • Each movement consists of multiple tasks
    • Each task has Fixed and Variable overhead
• Uniform data distribution assumed (no data skew)
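The assumptions above translate into a very small cost function. The following Python sketch is a toy model only: the task list, the overhead constants, the node count and the row counts are all invented for illustration and are not PDW's real cost parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One task inside a data movement. Units are arbitrary, not PDW's."""
    fixed: float    # overhead paid once per task
    per_row: float  # variable overhead per row the task handles


def movement_cost(tasks, rows_per_node):
    """Sum fixed + variable overhead over the tasks of one movement."""
    return sum(t.fixed + t.per_row * rows_per_node for t in tasks)


def plan_cost(movements):
    """Steps run one after another, so movement costs simply add up.
    Joins, scans and other relational operators are not costed at all."""
    return sum(movement_cost(tasks, rows) for tasks, rows in movements)


NODES = 8  # hypothetical appliance size
TASKS = [Task(1.0, 0.001), Task(1.0, 0.004), Task(1.0, 0.002)]  # read, send, write


def shuffle(total_rows):
    # No skew assumed: every node handles an equal share of the rows.
    return (TASKS, total_rows / NODES)


def broadcast(total_rows):
    # Simplification: every node ends up handling the full row set.
    return (TASKS, total_rows)


# Two hypothetical alternatives for the same join; row counts are invented.
plans = {
    "shuffle large table": [shuffle(6_000_000)],
    "broadcast small filtered table": [broadcast(2_000)],
}
costs = {name: plan_cost(steps) for name, steps in plans.items()}
best = min(costs, key=costs.get)
print(best, costs)
```

With these made-up numbers, broadcasting the small filtered table is far cheaper than shuffling the large one, so pruning keeps the broadcast plan. Changing the row counts or constants can flip the decision, which is the point of costing alternatives instead of hard-coding one.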

Distributed Query Cost Based Optimizer
Operator trees for a sample query

SELECT * from orders JOIN lineitem on (o_orderkey = l_orderkey) JOIN part on (l_partkey = p_partkey) WHERE p_name like '%smoke%';

[Diagram: the PDW AU2 operator tree and the PDW AU3 operator tree for this query, built over O (o_o), LI (l_o) and P (p_pk), with join nodes (l_o = o_o) and (l_pk = p_pk) and the data movement operations shuffle (l_pk) and broadcast.]

PDW Sales Test Workload
AU2 to AU3

[Chart: elapsed Seconds (0 to 80) for Queries 1 to 39, AU2 vs. AU3]

5x improvement in terms of total elapsed time out of the box

Theme: Performance at Scale
Zero data conversions in data movement

[Chart: DMS CPU Utilization - TPCH, CPU (%) from 0 to 60 for Q1 to Q21, AU2 vs. AU3]
[Chart: Throughput improvement for data movements, 0% to 600%, for Broadcast, Trim, Replicate, Shuffle and Repl Table Load]

Goal
• Eliminate CPU utilization spent on data conversions
• Further parallelize operations during data moves

Functionality
• Using ODBC instead of ADO.NET for reading and writing data
• Minimizing appliance resource utilization for data moves

Benefits
• Better resource (CPU) utilization
• 6x or more faster move operations
• Increased concurrency
• Mixed workload (loads + queries)

Theme: SQL Server Compatibility
SQL Server Security and Metadata

Security
• SQL Server security syntax and semantics
• Same underlying authorization model and code
• Supporting user, roles and logins
• Fixed database roles
• Allows script re-use
• Allows well-known security procedures/processes

Metadata
• PDW metadata stored in SQL Server
• Existing SQL Server metadata tables/views (e.g. security views)
• PDW distribution info as extended properties in SQL Server metadata
• Existing means and technology for persisting metadata
• Improved 3rd party tool compatibility (BI, ETL)

Theme: SQL Server Compatibility
Support for SQL Server (Native) Client

[Diagram: SQL Server Clients (ADO.NET, ODBC, OLE-DB, JDBC) connect over TDS to Server: 10.217.165.13, 17001; SQL PDW Clients (ODBC, OLE-DB, ADO.NET) connect over SequeLink to Server: 10.217.165.13, 17000.]

Goal
• ‘Look’ just like a normal SQL Server
• Better integration with other BI tools

Functionality
• Use existing SQL Server drivers to connect to SQL Server PDW
• Implement SQL Server TDS protocol
• Named Parameter support
• SQLCMD connectivity to PDW

Benefits
• Use known tools and proven technology stack
• Existing SQL Server ’eco-system’
• 2x performance improvement for return operations
• 5x reduction of connection time
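Since the goal is to look like a normal SQL Server, a client would address the appliance the usual way, with the port appended to the server address. The Python sketch below only builds an ODBC connection string and does not connect. The address and port 17001 are the TDS endpoint shown on the slide; the driver name, database, user and password are placeholder assumptions, and the slide does not say which driver versions a given appliance accepts.

```python
def pdw_odbc_connection_string(host, port, database, user, password,
                               driver="SQL Server Native Client 11.0"):
    """Build an ODBC connection string; host and port are joined with a comma."""
    parts = {
        "Driver": "{" + driver + "}",   # assumed driver name, adjust as needed
        "Server": f"{host},{port}",
        "Database": database,
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


conn_str = pdw_odbc_connection_string(
    host="10.217.165.13", port=17001,   # TDS endpoint from the slide
    database="SalesDW",                 # hypothetical database name
    user="report_user",                 # hypothetical login
    password="example-password",
)
print(conn_str)
```

The resulting string could be handed to any ODBC client library. Nothing here is specific to PDW beyond the port, which is the benefit the slide is describing.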

Theme: SQL Server Compatibility
Stored Procedure Support (Subset)

Goal
• Support common scenarios of code encapsulation and reuse in Reporting and ETL

Functionality
• System and user-defined stored procedures
• Invocation using RPC or EXECUTE
• Control flow logic, input parameters

Benefits
• Enables common logic re-use
• Big impact for Reporting Services scenarios
• Allows porting existing scripts
• Increases compatibility with SQL Server

Syntax
CREATE { PROC | PROCEDURE } [dbo.]procedure_name [ { @parameter data_type } [ = default ] ] [ ,...n ] AS { [ BEGIN ] sql_statement [;] [ ...n ] [ END ] } [;]

ALTER { PROC | PROCEDURE } [dbo.]procedure_name [ { @parameter data_type } [ = default ] ] [ ,...n ] AS { [ BEGIN ] sql_statement [;] [ ...n ] [ END ] } [;]

DROP { PROC | PROCEDURE } { [dbo.]procedure_name } [;]

[ { EXEC | EXECUTE } ] { { [database_name.][schema_name.]procedure_name } [{ value | @variable }] [ ,...n ] } [;]

{ EXEC | EXECUTE } ( { @string_variable | [ N ]'tsql_string' } [ + ...n ] ) [;]

Unsupported Functionality
• Stored Proc Nesting
• Output Params
• Return
• Try-Catch

Theme: SQL Server Compatibility
Collations

Goal
• Support local and international data

Functionality
• Fixed server level collation
• User-defined column level collation
• Supporting all Windows collations
• Allow COLLATE clauses in Queries and DML

Benefits
• Store all the data in PDW w/ additional querying flexibility
• Existing T-SQL DDL and Query scripts
• SQL Server alignment and functionality

Syntax
CREATE TABLE T ( c1 varchar(3) COLLATE traditional_Spanish_ci_ai, c2 varchar(10) COLLATE …)

SELECT c1 COLLATE Latin1_General_Bin2 FROM T

SELECT * FROM T ORDER BY c1 COLLATE Latin1_General_Bin2

Unsupported Functionality
• Cannot specify DB collation during DB creation
• Cannot alter column collations for existing tables

Theme: Improved Integration
SQL Server PDW Connectors

Connector for Hadoop
• Bi-directional (import/export) interface between MSFT Hadoop and PDW
• Delimited file support
• Adapter uses existing PDW tools (bulk loader, dwsql)
• Low cost solution that handles all the data: structured and unstructured
• Additional agility, flexibility and choice

Connector for Informatica
• Connector providing PDW source and target (mappings, transformations)
• Informatica uses PDW bulk loader for fast loads
• Leverage existing toolset and knowledge

Connector for Business Objects


PDW Retail POS Workload
Original Customer SMP solution vs. PDW AU3 (with cost-based query optimizer)

[Chart: elapsed Seconds (0 to 1600) for Queries Q1 to Q7, comparing the old SMP POS ODS with AU3]

Customer Successes
How are customers using PDW & BI ?

CUSTOMER EXAMPLE: Stock Exchange in the US

Data Volume
• 80 TB data warehouse analyzing data from exchanges
• Existing system based on SQL SMP farm
    • 2 different clusters of 6 servers each

Requirement
• Linear scalability with additional hardware
• Support hourly loads with SSIS – 300GB/day
• BI Integration: SSRS, SSAS and PowerPivot

AU3 Feedback
• SP and increased T-SQL support was great
• Migrating SMP SSRS to PDW was painless
• 142x for scan heavy queries & no summary tables
• Enabled queries that do not run on existing system

[Diagram: Operational DB’s, ETL, PDW, Portal (Reports, Dashboards, Scorecards)]

Customer Successes – cont’d
How are customers using PDW & BI ?

CUSTOMER EXAMPLE: Major Retailer in the US

Data Volume
• 36 TB data warehouse analyzing data from transactional and clickstream sources
• Business need to expand to 7 year data window (currently 1 year data)

Requirements
• Scalability - growing data volume does not affect performance
• Performance and ad-hoc analysis for interactive querying by users
• BI Integration with Microsoft BI stack - SSAS and SSRS

AU3 Feedback
• SSAS cubes worked ‘out-of-box’
• Performance an order of magnitude faster than existing system (~30x on an expanded data set)

[Diagram: Nielsen, OLTP system and Click-Stream sources, PDW, SSAS]


Role of PDW within the BI stack

PDW role as fast ‘data hub’
• Fast and parallel feeding of data marts (DMs) via Infiniband
    • CREATE REMOTE TABLE AS SELECT
• Aggregation abilities avoid ETL overhead in existing systems
    • No need for indexes
    • No need to maintain indexed/materialized views (summary tables)

[Diagram: PDW feeds several DMs, each serving SSAS / SSRS, plus 3rd party BI; link labels: Infiniband, GBit link]

SSAS with SQL Server PDW
Understanding the differences compared to ‘SMP world’

Specific to the nature of large data
• Parallel cube processing/deployment has its limits
    • Be cautious about parallel loads of SSAS (query timeout settings)
• Query design crucial: only include required data
• BI tools traditionally not designed for handling huge amounts of data

Specific to PDW
• PDW does not support foreign key constraints
• Shared nothing model requires careful data design and retrieval planning
• Design cubes for parallel processing – via MOLAP & ROLAP storage model

Demo: PowerPivot with SQL Server PDW

… just like any other SQL Server


Supported Third Party BI Solutions

AU3 T-SQL compatibility allows for common access for multiple tools

Current support on PDW drivers includes
• MicroStrategy
• SAP BusinessObjects
• Informatica

Other tools have ‘mixed experience’
• Cognos support required: CURRENT_TIMESTAMP, @@DATEFIRST, SET OPTION …
• Core connectivity enhancements planned for the next 2 releases


New Challenges for Business Analytics

• Huge amount of data born ‘unstructured’
• Increasing demand for (near) real-time business analytics
• Pre-filtering of important from less relevant raw data required

Applications
• Sensor networks & RFID
• Social networks & Mobile Apps
• Biological & Genomics

[Diagram: Sensor/RFID Data, Blogs, Docs and Web Data flowing into HADOOP]

Hadoop as a Platform Solution
In the context of ETL, BI, and DW

[Diagram: HADOOP surrounded by Fast ETL processing, Active Archive, Fast Refinery and Cost-Optimal storage]

• Platform to accelerate ETL processes (not competing with current ETL software tools!)
• Flexible and fast development of ‘hand-written’ refining requests of raw data
• Active & cost effective data archive to let (historical) data ‘live forever’
• Co-existence with a relational DW

Importing HDFS data into PDW for advanced BI

[Diagram: Sensor/RFID Data, Blogs, Docs and Web Data flow into HADOOP; SQOOP moves data into SQL Server PDW for Interactive BI/Data Visualization. Roles shown: Application Programmers, DBMS Admin, Power BI Users.]

Hadoop - PDW Integration via SQOOP (export)

1. SQOOP export with source (HDFS path) & target (PDW DB & table)
2. Read HDFS data via mappers
3. FTP Server: copies incoming data on Landing Zone
4. Telnet Server: invokes ‘DWLoader’

[Diagram: on the Linux/Hadoop side, HDFS, the PDW Hadoop Connector and a PDW-configuration file; on the Windows/PDW side, the Landing Zone, the Control Node and Compute Node 1 to Compute Node 8. Steps are numbered 1 to 5 in the diagram; step 5 carries no text label.]

Demo: Hadoop Sqoop Connector with SQL Server PDW

… integrating unstructured data into your end-to-end DW/BI solution


SQL Server PDW Roadmap
What is coming next?

[Timeline: CALENDAR YEAR 2011 (Q1 to Q4) and CALENDAR YEAR 2012 (Q1, Q2)]

Appliance Update 1 (Shipped)
• Improved node manageability
• Better performance and reduced overhead
• OEM requests

Appliance Update 2 (Shipped)
• Programmability
• Batches
• Control flow
• Variables
• Temp tables
• QDR infiniband switch
• Onboard Dell

Appliance Update 3 (Shipped)
• Cost based optimizer
• Native SQL Server drivers, including JDBC
• Collations
• More expressive query language
• Data Movement Services performance
• SCOM pack
• Stored procedures (subset)
• Half-rack
• 3rd party integration (Informatica, MicroStrategy, Business Objects, HADOOP)

V-Next
• Columnar store index
• Stored procedures
• Integrated Authentication
• PowerView integration
• Workload management
• LZ/BU redundancy
• Windows 8
• SQL Server 2012
• Hardware refresh

In Review

Session Objectives
• Provide an overview of SQL Server PDW
• Introduce PDW AU3 and share details regarding the new features and their impact on BI scenarios

Key Takeaways
• PDW is the SQL Server DW Appliance for 10s to 100s of TB
• AU3 enables you to use your existing BI solutions on Microsoft & 3rd Party BI Tools
• Expect at least 5x performance improvements over PDW AU2
    • Specific workloads can see much more


Related Content

DBI209 – Big Data, Big Deal

Lots of BI Tool Specific Related Sessions (PowerPivot, Analysis services, Etc.)

Breakthrough Insights: Big Data Analytics & Data Warehousing Demo Station

PDW Deep Dive Session Online from TechEd 2010


Resources

Connect. Share. Discuss.

http://europe.msteched.com

Learning

Microsoft Certification & Training Resources

www.microsoft.com/learning

TechNet

Resources for IT Professionals

http://microsoft.com/technet

Resources for Developers

http://microsoft.com/msdn


Evaluations

http://europe.msteched.com/sessions

Submit your evals online

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
