sql - parallel data warehouse (pdw)

Upload: karan-gulati

Post on 24-Dec-2014

Category: Technology

DESCRIPTION

In this presentation we will figure out what SQL PDW is, SMP vs. MPP, and the world of appliances...

TRANSCRIPT

Page 1: SQL - Parallel Data Warehouse (PDW)

Karan Gulati (SSAS Maestro)

About Presenter

Karan Gulati is a SQL Server Analysis Services Maestro (MCM).

He has worked as a Support Escalation Engineer at Microsoft for the last five years.

He currently focuses on SQL BI and SQL PDW. He is an active blogger and has contributed to multiple whitepapers published on MSDN or TechNet. He has also written tools that are available on CodePlex.

Page 2: SQL - Parallel Data Warehouse (PDW)

SQL - Parallel Data Warehouse (PDW)

Let’s figure out…

Page 3: SQL - Parallel Data Warehouse (PDW)

What are we covering

• World of Appliance
• Introducing SQL Parallel Data Warehouse (PDW)
• Different Kinds of Nodes in PDW
• Hub and Spoke Architecture

Page 4: SQL - Parallel Data Warehouse (PDW)

What’s an Appliance?

Are we talking about a refrigerator or an oven?

Page 5: SQL - Parallel Data Warehouse (PDW)

Appliance World…

An appliance is simply a preconfigured machine dedicated to a specific use, in contrast to general use.

In the computer world, an appliance comes with hardware, a pre-installed OS, and software, all built with best practices and guidelines in mind.

What does this mean to users? Just plug and play: it is ready to use, just like a refrigerator or an oven.

Page 6: SQL - Parallel Data Warehouse (PDW)

Have you heard about SQL PDW?

Microsoft SQL Server Parallel Data Warehouse (SQL Server PDW) is:

• Massively Parallel Processing Appliance (MPP)
• Simple to deploy
• Pre-built Appliance with software, hardware and networking components
• Highly scalable data storage, and high-speed data transfer
• One answer to largest data warehouse workloads

Page 7: SQL - Parallel Data Warehouse (PDW)

Symmetric Multi Processing

First, let's understand symmetric multiprocessing (SMP).

In SMP, each CPU core can work with any section of memory or disk, and all memory and all disks are available to every core.

The problem starts when too many CPUs request data over the system bus at the same time. This creates a traffic jam: requests queue up and processing slows down. As usage grows, the shared system bus limits how much processing an SMP system can do.

Page 8: SQL - Parallel Data Warehouse (PDW)

Solution to SMP Problem lies in MPP

Massively Parallel Processing (MPP) architecture refers to the use of a large number of separate computers to perform a single job in parallel.

In simple words, MPP is multiple boxes, each with its own CPUs, memory and other resources, working on a given task; this way we use the power of all machines / nodes in one go.
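
As a rough illustration of the idea, here is a minimal Python sketch, not PDW code. The function names are hypothetical, and threads merely stand in for the separate boxes: each "node" works only on the slice of data it owns, and the partial answers are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split_across_nodes(rows, node_count):
    """Give each hypothetical node its own slice of the rows (round-robin)."""
    return [rows[i::node_count] for i in range(node_count)]


def node_sum(local_rows):
    """Work done by one node, using only the rows it owns."""
    return sum(local_rows)


def mpp_sum(rows, node_count=4):
    """Run the same task on every node at once, then combine the partial results."""
    partitions = split_across_nodes(rows, node_count)
    # Threads are only a stand-in: a real MPP system runs each partition
    # on a separate machine with its own CPU, memory and disk.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_sum, partitions))
    return partials, sum(partials)


partials, total = mpp_sum(list(range(1, 101)), node_count=4)
print(partials, total)
```

No node ever touches another node's rows, which is why the shared bus of an SMP box is not in the picture here.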

Page 9: SQL - Parallel Data Warehouse (PDW)

SQL PDW: Flow of Query Execution

• The query hits the Control node.
• The Control node breaks the query into multiple parallel operations and distributes them to the Compute nodes, where the actual data resides.
• DMS, the Data Movement Service, coordinates any needed data movement among nodes.
• When the Compute nodes are finished, the Control node handles post-processing and re-integration of the result sets for delivery back to the user.
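
The flow above can be sketched in a few lines of Python. This is a simplified illustration, not how PDW is implemented: the node names, the sample rows and both function names are assumptions. The query is imagined as "total amount per region"; each Compute node aggregates its local rows and the Control node re-integrates the partial results.

```python
from collections import Counter

# Hypothetical data, already spread over three compute nodes.
COMPUTE_NODES = {
    "node1": [("east", 10), ("west", 5)],
    "node2": [("east", 7), ("north", 3)],
    "node3": [("west", 8), ("north", 2)],
}


def compute_node_step(rows):
    """Operation run on each compute node: aggregate only the local rows."""
    local = Counter()
    for region, amount in rows:
        local[region] += amount
    return local


def control_node_query(nodes):
    """Control node: send the step to every node, then merge the partial results."""
    partials = [compute_node_step(rows) for rows in nodes.values()]
    final = Counter()
    for partial in partials:
        final.update(partial)  # Counter.update adds the amounts per region
    return dict(final)


result = control_node_query(COMPUTE_NODES)
print(result)
```

This query needs no data movement because partial sums can be merged safely; queries that need rows from different nodes to meet are where DMS comes in.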

Page 10: SQL - Parallel Data Warehouse (PDW)

SQL PDW: Nodes and Services

Control Node

Compute Node

Administrative Service Nodes

Data Movement Services

Page 11: SQL - Parallel Data Warehouse (PDW)

Control Node

The Control node is the central point of control for processing queries on the SQL Server PDW appliance. It receives the user query, creates a distributed query plan, communicates the relevant plan operations and data to the Compute nodes, receives the Compute node results, performs any necessary aggregation of results, and then returns the query results to the user.

Page 12: SQL - Parallel Data Warehouse (PDW)

Compute Node

The Compute node is the basic unit of scalability and storage. Each Compute node in the SQL Server PDW appliance uses its own user data and computing resources to perform a portion of each parallel query.
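
One common way to give each Compute node its own share of user data is to hash a chosen column, so that rows with the same key always land on the same node. The Python sketch below illustrates that idea only; the CRC32 hash, the function names and the sample orders are assumptions and do not reflect PDW's internal hash function.

```python
import zlib


def node_for_key(key, node_count):
    """Pick a node from a stable hash of the distribution key (CRC32 is an assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key_index, node_count):
    """Place every row on exactly one node, chosen by its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for_key(row[key_index], node_count)].append(row)
    return nodes


# Hypothetical (customer_id, amount) rows, distributed on customer_id.
orders = [(101, 20), (102, 35), (101, 15), (103, 50), (102, 5)]
nodes = distribute(orders, key_index=0, node_count=3)
for number, rows in enumerate(nodes):
    print(f"compute node {number}: {rows}")
```

Adding Compute nodes means each one holds and scans a smaller share of the rows, which is the sense in which the node is the unit of scalability.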

Page 13: SQL - Parallel Data Warehouse (PDW)

Administrative Service Nodes

• Landing Zone node: An appliance node that provides temporary storage and processing for loading data onto the appliance.

• Management node: An appliance node that performs multiple functions related to managing the hardware and software in the appliance. This node is the hub for software deployment and servicing, authentication within the appliance (not login authentication), and monitoring of system health and performance.

• Backup Node: The Backup Node provides high-speed integrated backup at the database level. This is tied to the organization’s overall backup strategy and systems.

Page 14: SQL - Parallel Data Warehouse (PDW)

Data Movement Services

DMS

• When a query is submitted to the Control node, the PDW Engine determines what the query plan will be on each individual Compute node, then submits the query to all the Compute nodes through DMS.

• DMS also coordinates any data movement needed between nodes and handles any functions that need to be resolved centrally.

• In simple words, DMS is the brain that ties all the nodes together.
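
To see why data movement is sometimes needed, suppose rows are stored by customer but a query groups by region: rows for the same region sit on different nodes and must be brought together first. The Python sketch below shows such a redistribution under stated assumptions. The function names, the CRC32 hash and the sample rows are all hypothetical, and this is not the DMS implementation.

```python
import zlib


def target_node(key, node_count):
    """Choose the destination node for a key (CRC32 is an illustrative assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def shuffle(nodes, key_index):
    """Re-home every row onto the node chosen by hashing a new key."""
    node_count = len(nodes)
    moved = [[] for _ in range(node_count)]
    for rows in nodes:
        for row in rows:
            moved[target_node(row[key_index], node_count)].append(row)
    return moved


# Hypothetical (customer_id, region, amount) rows, originally placed by customer.
before = [
    [(1, "east", 10), (4, "west", 5)],
    [(2, "east", 7), (5, "north", 3)],
    [(3, "west", 8), (6, "north", 2)],
]
after = shuffle(before, key_index=1)  # move rows so each region lives on one node
for number, rows in enumerate(after):
    print(f"node {number}: {rows}")
```

After the shuffle, every region is held by a single node, so each node can finish its part of the grouping locally.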

Page 15: SQL - Parallel Data Warehouse (PDW)

Hub and Spoke Architecture

A data warehousing architecture with a central hub data warehouse that provides a flexible, high-speed way to move or copy enterprise data warehouse (EDW) data to spokes.

A spoke is typically a data mart held in physical storage optimized for a particular user group or organization.

A data mart is usually a much smaller subset of the data in the EDW and specific to the reporting and analytic needs of a specific user community.
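
The hub and spoke relationship can be pictured as a filtered copy. In this minimal Python sketch the table contents, the department names and the `build_spoke` helper are all hypothetical: the hub holds every row and column, and a spoke receives only the subset one user community needs.

```python
# Hypothetical hub (EDW) table with data for every department.
HUB_SALES = [
    {"region": "east", "dept": "retail", "amount": 10, "cost": 6},
    {"region": "west", "dept": "retail", "amount": 5, "cost": 3},
    {"region": "east", "dept": "online", "amount": 7, "cost": 2},
    {"region": "north", "dept": "online", "amount": 3, "cost": 1},
]


def build_spoke(hub_rows, dept, columns):
    """Copy only the rows and columns that one user community needs."""
    return [
        {column: row[column] for column in columns}
        for row in hub_rows
        if row["dept"] == dept
    ]


retail_mart = build_spoke(HUB_SALES, dept="retail", columns=["region", "amount"])
print(retail_mart)
```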

Page 16: SQL - Parallel Data Warehouse (PDW)

SQL PDW – Acts as Hub

Using a true hub-and-spoke architecture, all enterprise data can be maintained on a SQL Server 2008 R2 Parallel Data Warehouse hub while departments or business units keep their existing data marts to suit their needs. High-speed data transfer relieves traditional barriers to hub and spoke. Power users can even deploy a dedicated MPP appliance as a spoke so they can autonomously manage resources, while IT can enforce enterprise standards across all data.

Page 18: SQL - Parallel Data Warehouse (PDW)

Thanks

Contact Speaker -

http://karanspeaks.com

http://blogs.msdn.com/karang

https://twitter.com/karangspeaks

http://in.linkedin.com/in/karanspeaks