SQL Server 2008 R2 Parallel Data Warehouse

SQL Server and Data Warehousing: SQL Server 2008 R2 Parallel Data Warehouse Appliance. Speaker: Phil Hummel of WinWire Technologies. Presentation developed by: Bruce Campbell, Western Region Data Warehouse Specialist, Microsoft. Silicon Valley SQL Server User Group, February 16, 2009. Mark Ginnebaugh, User Group Leader, [email protected]

Upload: mark-ginnebaugh

Post on 28-Nov-2014


Category:

Technology



DESCRIPTION

Presentation by Bruce Campbell of Microsoft. Learn about a new capability in SQL Server 2008 R2: Parallel Data Warehouse, formerly known as Project Madison.

TRANSCRIPT

Page 1: SQL Server 2008 R2 Parallel Data Warehouse

SQL Server and Data Warehousing

SQL Server 2008 R2 Parallel Data Warehouse Appliance

Speaker: Phil Hummel of WinWire Technologies

Presentation developed by: Bruce Campbell

Western Region Data Warehouse Specialist, Microsoft

Silicon Valley SQL Server User Group

February 16, 2009

Mark Ginnebaugh, User Group Leader,

[email protected]

Page 2: SQL Server 2008 R2 Parallel Data Warehouse

Agenda

• SQL 2008 R2 Parallel DW Appliance

– Hardware and Software Architecture

– Case Study

– Customer Experience Opportunities

• Next Steps

Page 3: SQL Server 2008 R2 Parallel Data Warehouse

SQL Server Parallel Data Warehouse (formerly Project Madison)

[Diagram: Project Madison MPP layer on reference hardware platforms built from industry-standard networking, servers, and storage]

Page 4: SQL Server 2008 R2 Parallel Data Warehouse

Parallel DW Appliance Experience

• All hardware from a single vendor

• Multiple vendors to choose from

• Orderable at the rack or cluster

• Vendor will

– Assemble appliances

– Image appliances with OS, SQL Server and Madison software

• Appliance installed in less than a day

• Support

– Vendor provides hardware support

– Microsoft provides software support

Page 5: SQL Server 2008 R2 Parallel Data Warehouse

SQL Server Parallel DW Node

Page 6: SQL Server 2008 R2 Parallel Data Warehouse

Parallel DW - MPP Example

[Diagram labels: ODBC/JDBC; SQL92 with analytical extensions; query rewritten into steps that run efficiently on database servers; database servers; dual Infiniband; dual Fiber Channel]

SELECT location, year,
       sum(b.sales_amt)
FROM customer a, sales b
WHERE b.sales > 500 and
      a.custid = b.custid
GROUP BY location, year
ORDER BY 1,2
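The flow on this slide, where one query is rewritten into steps that each database server runs on its own data before the results are combined, can be sketched in plain Python. This is a minimal illustration and not PDW's actual planner: the sample rows, the helper names `node_partial_sums` and `merge_partials`, the assumption that `customer` is copied to every node, and the reading of the filter as "sales amount above 500" are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: customer is copied to every node,
# sales is split across two nodes as (custid, year, sales_amt).
CUSTOMER = {1: "East", 2: "West"}  # custid -> location
SALES_BY_NODE = [
    [(1, 2009, 600), (2, 2009, 400), (1, 2010, 700)],  # node 0
    [(2, 2009, 900), (1, 2009, 800), (2, 2010, 100)],  # node 1
]

def node_partial_sums(sales_rows, customer, min_amt=500):
    """Step run on each node: filter, join to the local customer copy,
    and compute a partial GROUP BY location, year."""
    partial = defaultdict(int)
    for custid, year, amt in sales_rows:
        if amt > min_amt and custid in customer:
            partial[(customer[custid], year)] += amt
    return dict(partial)

def merge_partials(partials):
    """Final step: add up the per-node partial sums and apply ORDER BY 1,2."""
    total = defaultdict(int)
    for partial in partials:
        for key, amt in partial.items():
            total[key] += amt
    return sorted((loc, year, amt) for (loc, year), amt in total.items())

result = merge_partials(node_partial_sums(rows, CUSTOMER) for rows in SALES_BY_NODE)
for row in result:
    print(row)
```

Each node only ever touches its own slice of `sales`; the merge step sees one small partial result per node rather than the raw rows.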

Page 7: SQL Server 2008 R2 Parallel Data Warehouse

Database Servers

• A SQL Server 2008 instance

• SQL as primary interface

• Each MPP node is a highly tuned SMP node with standard interfaces

• DB engine nodes autonomous on local data

[Diagram: database server]

Page 8: SQL Server 2008 R2 Parallel Data Warehouse

Ultra Shared Nothing

• An extension of traditional shared nothing design

– Push shared nothing architecture into SMP node

• IO and CPU affinity within SMP nodes

– Eliminate contention per user query

– Use full PDW Node resources for each user query

– Multiple physical instances of tables

• Distribute large tables

• Replicate small tables

– Re-Distribute rows “on-the-fly” when necessary
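Re-distributing rows "on the fly" matters when a distributed table has to be joined on a column that is not its distribution key. The following is a minimal sketch, assuming hash distribution and using hypothetical names (`NODE_COUNT`, `node_for`, `redistribute`) and made-up rows; it does not show how PDW itself moves data.

```python
NODE_COUNT = 4  # hypothetical appliance size

def node_for(key, node_count=NODE_COUNT):
    """Map a distribution-key value to a node number.
    hash() of a small non-negative int is the int itself in CPython,
    which keeps this example deterministic."""
    return hash(key) % node_count

def redistribute(rows_by_node, key_index, node_count=NODE_COUNT):
    """Move every row to the node that owns the value in column key_index."""
    shuffled = [[] for _ in range(node_count)]
    for rows in rows_by_node:
        for row in rows:
            shuffled[node_for(row[key_index], node_count)].append(row)
    return shuffled

# Orders were distributed on order_id (column 0), but the join is on custid (column 1).
orders_by_node = [[] for _ in range(NODE_COUNT)]
for order in [(0, 7), (1, 5), (2, 7), (3, 6), (4, 5), (5, 4)]:  # (order_id, custid)
    orders_by_node[node_for(order[0])].append(order)

orders_by_custid = redistribute(orders_by_node, key_index=1)

# After the shuffle, all orders for one customer sit on the same node, so each
# node can join them against its own slice of a table distributed on custid.
for node, rows in enumerate(orders_by_custid):
    print(node, rows)
```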

Page 9: SQL Server 2008 R2 Parallel Data Warehouse

Control Node & Client Drivers

• Client connections always go through the control node

– Clustered to a passive node to support High Availability

• Processes SQL requests

• Prepares execution plan

• Orchestrates distributed execution

• Local SQL Server to do final query plan processing / result aggregation

• Drivers

– ODBC

– OLE-DB

– ADO.NET client drivers

Page 10: SQL Server 2008 R2 Parallel Data Warehouse

Landing Zone

• Provides high capacity storage for data files from ETL processes

• Supports division of workload dedicated to ETL processes

• SSIS available on the landing zone

• Connected to PDW internal network

• Available as sandbox for other applications and scripts that run on internal network

[Diagram: Source, Landing Zone files, Data Loader, Compute Nodes]

Page 11: SQL Server 2008 R2 Parallel Data Warehouse

Backup Node

• Builds on SQL Server native backup/restore facility

• Executes at Infiniband network speeds

• Database-level backup

• Subsequent Back Ups are Optimized

• Coordinated backup across the nodes

• Quiesce write activity to synchronize
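The last two bullets describe a general pattern: pause writes on every node, capture all nodes at the same logical point, then resume. The toy sketch below shows only that pattern. The `ComputeNode` class and `coordinated_backup` function are hypothetical stand-ins, not PDW or SQL Server APIs.

```python
class ComputeNode:
    """Toy stand-in for one node; not a real PDW or SQL Server API."""

    def __init__(self, name):
        self.name = name
        self.rows = []
        self.writes_enabled = True

    def write(self, row):
        if not self.writes_enabled:
            raise RuntimeError(f"{self.name}: writes are quiesced")
        self.rows.append(row)

    def snapshot(self):
        return list(self.rows)  # copy of the node's data at this instant

def coordinated_backup(nodes):
    """Quiesce writes on every node, snapshot each one, then resume writes."""
    for node in nodes:
        node.writes_enabled = False
    try:
        return {node.name: node.snapshot() for node in nodes}
    finally:
        for node in nodes:
            node.writes_enabled = True

nodes = [ComputeNode("node0"), ComputeNode("node1")]
nodes[0].write("a")
nodes[1].write("b")
backup = coordinated_backup(nodes)
nodes[0].write("c")  # allowed again after the backup; not part of the snapshot
print(backup)
```

Because no node accepts writes while the snapshots are taken, the per-node copies are consistent with each other.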

Page 12: SQL Server 2008 R2 Parallel Data Warehouse

Software Architecture

[Diagram: software architecture

Nodes: Control Node, Compute Nodes (Database Server), Landing Zone, Backup Node, Management Node

Components: PDW Services, DSQL Core Engine Services, DMS Manager, DMS, Loader, Loader Client, IIS, Admin Console, SQL Server, SQL OS, User Data, DW Authentication, DW Configuration, DW Queue, DW Schema, HPC, AD

Clients: Nexus Query Tool, MS BI (AS, RS), SQL SSIS, other 3rd party tools, via JDBC, OLE-DB, ODBC, ADO.NET

Legend: built by DWPU, existing MS software, 3rd party]

Page 13: SQL Server 2008 R2 Parallel Data Warehouse

Data Distribution supports even distribution of data across PDW nodes
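How evenly rows spread across nodes depends on the column chosen for distribution. The sketch below assumes rows are placed by hashing that column, and compares a high-cardinality column with a low-cardinality one. The SHA-256-based `node_for` function, the node count, and the sample columns are illustrative assumptions, not the hash PDW uses.

```python
import hashlib
from collections import Counter

def node_for(value, node_count):
    """Stable hash of the distribution-column value, reduced to a node number."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count

def rows_per_node(values, node_count):
    """Count how many rows each node would receive."""
    counts = Counter(node_for(v, node_count) for v in values)
    return [counts.get(n, 0) for n in range(node_count)]

NODE_COUNT = 8                        # hypothetical appliance size
custids = range(100_000)              # high-cardinality column: many distinct values
regions = ["East", "West"] * 50_000   # low-cardinality column: two distinct values

even = rows_per_node(custids, NODE_COUNT)
skewed = rows_per_node(regions, NODE_COUNT)

print("distributed on custid:", even)
print("distributed on region:", skewed)
```

Distributing on `custid` gives every node a similar share of the rows, while distributing on `region` can use at most two of the eight nodes, because only two distinct values are ever hashed.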

Page 14: SQL Server 2008 R2 Parallel Data Warehouse

Data Replication
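Replication trades storage for locality: a full copy of the table on every node means joins against it need no data movement, but the table's footprint grows with the node count. A small arithmetic sketch follows, using hypothetical table sizes and a hypothetical `storage_gb` helper, and ignoring indexes and compression.

```python
def storage_gb(table_gb, node_count, replicated):
    """Total appliance storage used by one table.
    A replicated table is stored once per node; a distributed table once overall."""
    return table_gb * node_count if replicated else table_gb

NODE_COUNT = 10   # hypothetical appliance size
DIM_GB = 2        # hypothetical small dimension table
FACT_GB = 5000    # hypothetical large fact table

dim_replicated = storage_gb(DIM_GB, NODE_COUNT, replicated=True)
fact_replicated = storage_gb(FACT_GB, NODE_COUNT, replicated=True)
fact_distributed = storage_gb(FACT_GB, NODE_COUNT, replicated=False)

print(dim_replicated, fact_replicated, fact_distributed)
```

With these numbers, replicating the dimension table costs 20 GB in total, while replicating the fact table would cost 50,000 GB instead of 5,000 GB, which is why only small tables are candidates for replication.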

Page 15: SQL Server 2008 R2 Parallel Data Warehouse

SQL Server Parallel DW Architecture - HP

[Diagram labels: control nodes (active/passive); database servers; spare database server; client drivers; ETL load interface; corporate backup solution; data center monitoring; corporate network; private network; dual Infiniband; dual Fiber Channel]

MPP Architecture

HA Built In

Linear Scalability

Page 16: SQL Server 2008 R2 Parallel Data Warehouse

Hub and Spoke – Flexible Business Alignment

Parallel database copy technology enables rapid data integration and consistency between hub and spokes

Support user groups with very different SLAs; hot, warm and cold data; different requirements on data loading, etc.

A Hub and Spoke solution gives you the flexibility to add/change diverse workloads/user groups, while maintaining data consistency across the enterprise

Create SQL Server Parallel Data Warehouse, SQL Server 2008, Fast Track Data Warehouse, and SQL Server Analysis Services spokes

Page 17: SQL Server 2008 R2 Parallel Data Warehouse

Parallel DW and Fast Track Hub and Spoke

[Diagram: Central EDW Hub with ETL Tools and spokes for Regional Reporting, Departmental Reporting, and High Performance HQ Reporting]

Page 18: SQL Server 2008 R2 Parallel Data Warehouse

Microsoft Released First Technology Preview for Parallel Data Warehouse

• First Technology Preview released on August 14

• DATAllegro’s MPP engine is now ported to SQL Server 2008 and Windows Server 2008

• 10 customers from 7 industries signed up

– First Premier BankCard was the first customer to enlist on Madison

– Internally – ICE, MSIT, ADCenter, XBOX

• Appliances with 8 to 20 nodes now ready to host customer test drives

Early Results

• Data Loading rates of 1 TB per hour

• Query executions at over 1.5 TB per minute

• Madison running 5 times faster than DATAllegro with Ingres DBMS before acquisition!
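To put the quoted rates in perspective, the arithmetic below applies them to a 30 TB data set. The 30 TB figure is only an illustrative size, the helper names are hypothetical, and treating both rates as sustained throughput is an assumption; the slide reports them as early results, not guarantees.

```python
LOAD_TB_PER_HOUR = 1.0     # from the slide: data loading rates of 1 TB per hour
SCAN_TB_PER_MINUTE = 1.5   # from the slide: query executions at over 1.5 TB per minute

def load_hours(data_tb, rate=LOAD_TB_PER_HOUR):
    """Hours needed to load data_tb terabytes at a sustained load rate."""
    return data_tb / rate

def scan_minutes(data_tb, rate=SCAN_TB_PER_MINUTE):
    """Minutes needed for a query to read data_tb terabytes at a sustained scan rate."""
    return data_tb / rate

DATA_TB = 30  # illustrative data set size, an assumption

print(f"load: {load_hours(DATA_TB):.0f} hours")
print(f"full scan: {scan_minutes(DATA_TB):.0f} minutes")
```

Under those assumptions, an initial load of 30 TB takes about 30 hours, and a query that reads all 30 TB takes about 20 minutes.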

Launch of Parallel Data Warehouse:

• Next Technology Preview due early CY2010

• Technology Adoption Program (TAP) due early CY2010

• Nominations now open

• Parallel Data Warehouse to launch in summer 2010

Page 19: SQL Server 2008 R2 Parallel Data Warehouse

Parallel DW Beta Programs

• Two Programs

– MTP – Madison Technology Preview

• 20 – 30 participants

• Duration of 4 to 6 weeks

– TAP – Beta production implementation

• 6 – 8 customers

• First iteration 9 to 12 weeks

Page 20: SQL Server 2008 R2 Parallel Data Warehouse

Parallel DW Beta Programs

• Requirements

– Focus on EDW and large data marts

– Migration projects, not green field

– Open to customers & prospects

– 30+ TB of data…at least 4 100+ TB

– Hub-and-spoke in only a select few cases

Page 21: SQL Server 2008 R2 Parallel Data Warehouse

Case Study: First Premier Bankcard

Existing Environment

• Hardware

– 16 CPU HP 8620 Itanium

– Hitachi Storage 27TB Raw

– SATA 21 LUNS

• Software

– Windows 2003 SP2

– SQL Server 2008

– SSIS/SSRS

• Data Warehouse

– 18 Terabytes

– Star Schema

– 80 Fact Tables

– 500+ Dimensions

Current Challenges

• Data Load Speeds

• Analytic Capacity

• Analytic Speed

• Mixed Workload

• Total Cost of Ownership

Madison Highlights

• Improved by 300%

• 30TB/160 Cores

• Query Speeds 70X Improvement

• Concurrency Mixed Workload

• TCO Lowered by 50%

Page 22: SQL Server 2008 R2 Parallel Data Warehouse

Microsoft Commitment

• MTP

– High touch Support

– MS or partner will provide HW and will host the MTP

– Customer may have opportunity to engage with TAP

– MS will work with customer to define scope and success criteria

– MS will perform the bulk of MTP work (2-3 resources)

• TAP

– Customer must procure the Madison reference architecture and conduct the TAP in their own data center

– Premier support will be provided

– MSFT Services will be provided

– Training / mentoring will be provided

– MS will work with customer to define scope and success criteria

Page 23: SQL Server 2008 R2 Parallel Data Warehouse

Customer Commitment

• MTP

– Customer to provide data, queries, concurrency model, existing data model, etc.

– Customer to provide SME and DBA to answer questions of MTP team

– Customer to provide existing benchmarks

– Customer to define priorities for testing and areas of interest

– Customer to attend 2-3 day MTP interactive session and review

• TAP

– Customer to provide data, queries, concurrency model, existing data model, etc.

– Customer to provide SME, DBA and other resources to work with MS TAP team

– For onsite – customer to provide building access, internet access, etc.

– Customer to provide PDW Reference Hardware

Page 24: SQL Server 2008 R2 Parallel Data Warehouse

MTP & TAP Schedule

• MTP 1 – Completed

• MTP 2 – Q1 2010

• TAP – Q2 2010

• RTM – Summer 2010

Page 25: SQL Server 2008 R2 Parallel Data Warehouse

Next Steps

Proof Steps

• Quick Start DW Roadmap Service

• Architectural Design Session

• Madison Technology Preview (MTP)

• Review Madison, SQL Server Classic or Fast Track DW HW/SW configurations and pricing

Page 26: SQL Server 2008 R2 Parallel Data Warehouse

www.bayareasql.org

To attend our meetings or inquire about speaking opportunities, please contact:

Mark Ginnebaugh, User Group Leader [email protected]

Page 27: SQL Server 2008 R2 Parallel Data Warehouse

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.