DELL Reference Configuration Microsoft SQL Server 2008 Fast Track Data Warehouse

A Dell Technical Configuration Guide

Database Solutions Engineering

Dell Product Group

Anthony Fernandez

Jisha J

Executive Summary

Data Warehouses provide the foundation for Business Intelligence systems, such as Analysis and Reporting Services, which depend on scanning very large amounts of data as efficiently as possible. Data Warehouse configurations often suffer from design principles that were originally intended for online transaction processing (OLTP) systems. As the amount of data grows, so do scan times, which lengthens the time the business needs to gather critical information.

Dell and Microsoft provide a set of guidelines and design principles called Data Warehouse Fast Track (DWFT) to help customers design and implement balanced configurations specifically for Data Warehouse databases. The result is a hardware-balanced approach with predictable out-of-box performance.

This reference configuration describes the architecture design principles to achieve a balanced configuration for Dell™ PowerEdge™ R710 and Dell | EMC AX4-5 Fibre Channel Storage.

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL

ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR

IMPLIED WARRANTIES OF ANY KIND.

© 2010 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without

the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.

Dell, the DELL logo, and the DELL badge, PowerEdge, and PowerConnect are trademarks of Dell Inc.

EMC, CLARiiON, and PowerPath are registered trademarks of EMC Corporation. Microsoft and SQL

Server are registered trademarks of Microsoft Corporation in the United States and/or other countries.

Intel and Xeon are registered trademarks of Intel Corporation in the U.S. and/or other countries. Other

trademarks and trade names may be used in this document to refer to either the entities claiming the

marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade

names other than its own.

July 2010

Contents

Executive Summary
Introduction
Audience and Scope
Microsoft Fast Track Data Warehouse Overview
Dell Microsoft Fast Track Data Warehouse Reference Architecture
Recommendations and Best Practices
Conclusion
References

Tables

Table 1. Microsoft Fast Track Reference Architecture List
Table 2. Dell PowerEdge R710 Configuration Details

Figures

Figure 1. Dell Microsoft Fast Track Reference Architecture
Figure 2. Dell Microsoft Fast Track Reference Architecture With High Availability
Figure 3. Reference Architecture Redundant Fibre Channel Switches
Figure 4. Reference Architecture With Clustering
Figure 5. Maximum Bandwidth Memory Configuration
Figure 6. Balanced Performance Memory Configuration
Figure 7. Internal Storage Configuration
Figure 8. Dell | EMC AX4-5 Chassis
Figure 9. Disk Group Layout
Figure 10. Disk LUN Layout
Figure 11. Storage Controller LUN Assignment
Figure 12. Switch and Zone Configuration

Introduction

This reference configuration describes a reference architecture for implementing a Microsoft Fast Track Data Warehouse on the Dell PowerEdge R710 and Dell | EMC AX4-5 storage. The goal of the Microsoft Fast Track program is to define a methodology for building a balanced and optimized hardware and software configuration specifically for Microsoft® SQL Server® Data Warehouse deployments.

Using a building block approach, a Microsoft Fast Track solution offers a cost-effective and proven platform that has been tested and optimized so that customers can deploy and configure a data warehouse infrastructure faster. A data warehouse is the central component in a Business Intelligence solution and stores large quantities of data and information. As data continues to grow at a fast pace, optimizing data retrieval is crucial for organizations to maintain service level agreements (SLAs). Microsoft Fast Track provides a framework that lets customers select the reference architecture, designed with these principles in mind, that best fits their needs, amount of data, and performance requirements.

Audience and Scope

This reference configuration is intended for customers, partners, solution architects, storage administrators, and database administrators who are evaluating, planning, or deploying a balanced Data Warehouse solution. The scope is limited to the Data Warehouse itself, that is, the main repository of data, and its balanced configuration. Other systems that use the Data Warehouse as a source of data, such as Analysis Services or Reporting Services, are not part of the scope of Microsoft Fast Track.

Microsoft Fast Track Data Warehouse Overview

The Microsoft Fast Track Data Warehouse initiative provides a framework to build and deploy a balanced configuration for a SQL Server Data Warehouse. The initiative provides guidelines and best practices to configure software and hardware to achieve optimal cost and performance. Microsoft Fast Track utilizes a building block approach focused on balanced configurations that have been tested and validated for Data Warehouse workloads.

Dell Microsoft Fast Track Data Warehouse Reference Architecture

This reference configuration provides detailed configuration information to set up and deploy a Data Warehouse utilizing Microsoft Fast Track guidelines on the Dell PowerEdge R710 and Dell | EMC AX4-5 Fibre Channel Storage.

Table 1 lists the recommended hardware that has been tested and selected for a balanced configuration, as published by Microsoft in the Fast Track Reference Configuration Guide listed in the “References” section of this reference configuration. This reference configuration focuses on the Dell PowerEdge R710 with two Dell | EMC AX4-5 Fibre Channel arrays for a capacity range of 4TB to 8TB of compressed data.

Table 1. Microsoft Fast Track Reference Architecture List

Server:            Dell PowerEdge R710
CPU:               (2) Intel® Xeon® Nehalem Quad Core @ 2.66GHz
CPU Cores:         8
SAN:               (2) EMC® AX4-5
Data Drive Count:  (16) 300GB 15K FC
Initial Capacity:  4TB
Maximum Capacity:  8TB

Figure 1 shows the hardware components that comprise the Data Warehouse solution published by Microsoft. Note that components such as Fibre Channel HBAs, Fibre Channel switches, and networking switches are not listed in the Microsoft Fast Track Reference Configuration Guide. The following configuration was deployed in the Dell Labs.

Figure 1. Dell Microsoft Fast Track Reference Architecture
[Figure labels: Dell PowerEdge R710; Two Dual-Port 8Gb Fibre Channel HBAs; Brocade SW200E Fibre Channel Switch; Dell PowerConnect™ 54XX Series; Dell | EMC AX4-5 Fibre Channel Storage]

Figure 2 shows the reference architecture configured with High Availability.

Figure 2. Dell Microsoft Fast Track Reference Architecture With High Availability
[Figure labels: Dell PowerEdge R710 Cluster Node 1; Dell PowerEdge R710 Cluster Node 2; Dual Brocade SW200E Fibre Channel Switches; Dell | EMC AX4-5 Fibre Channel Storage; Dell PowerConnect Series Private Switch; Dell PowerConnect Series Public Switch]

Table 2. Dell PowerEdge R710 Configuration Details

Server:                       PowerEdge R710
CPU:                          (2) Quad Core Intel Xeon X5550 2.66GHz, 8MB Cache, 6.4GT/s Intel QPI, Turbo, HT, 95W
Number of Cores:              8
Memory:                       48GB RAM (6 x 8GB DIMMs @ 1333MHz)
PCI-E Slots:                  PCI-E Riser with two x4 Gen 2 slots (Slots 1 and 2)
Internal Storage Controller:  PERC 6/i or H700 512MB Cache
Internal Drives:              (2) 73GB 6Gbps SAS 15K HDD; (6) 300GB 6Gbps SAS 15K HDD*
Network Adapters:             Two Embedded Broadcom 5709C Dual-Port Gigabit Ethernet (four ports total)
Network Switch:               Dell PowerConnect 54XX Series
FC HBA:                       (2) Emulex LPe-12002-E (8Gb Dual-Port FC HBA) or (2) QLogic QLE2562 (8Gb Dual-Port FC HBA)
FC Switch:                    (2) Brocade SW200E (8 ports enabled each)**
FC Storage:                   (2) Dell | EMC AX4-5 Fibre Channel Arrays

* Various sizes can be configured.
** High Availability configuration requires two FC switches.

Recommendations and Best Practices

This section details the recommendations and best practices for implementing a high-performing Fast Track data warehouse on the Dell PowerEdge R710 server and the Dell | EMC AX4-5F storage enclosure. The section emphasizes hardware-based optimizations over software and database parameters.

Database Availability

Based on business requirements, it may be necessary to have a highly reliable and available database

configuration with high performance for data warehousing. Dual Fibre Channel Switches provide path

redundancy and availability. In addition, they provide the flexibility to add clustered nodes with

Microsoft Cluster Service (MSCS) to provide server redundancy.

Figure 3 shows the complete Dell Reference Architecture for single-node configurations with redundant Fibre Channel switches.

Figure 3. Reference Architecture Redundant Fibre Channel Switches
[Figure labels: Dell PowerEdge R710; Dual Emulex 8Gb HBAs; Dual Brocade SW200E Fibre Channel Switches; Dell | EMC AX4-5 FC Storage with storage processors SP A and SP B, ports 0 and 1]

Figure 4 depicts the Dell Microsoft Fast Track architecture with high availability through MSCS.

Figure 4. Reference Architecture With Clustering

Figure 4 shows an active-passive SQL Server configuration using Microsoft Clustering technology. Microsoft clustering enables the passive database node to host the database service if the primary database server (the active node) fails. This provides a highly available configuration at the database, HBA, network, and SAN layers.

Server

Server selection for Microsoft Fast Track implementations is based on the principles for building a balanced system from the storage to the server. Given the multicore capabilities of today’s servers, the starting point for a balanced configuration is the maximum CPU core consumption rate (MCR), that is, the rate at which a system can process data. Based on test results, Microsoft suggests 200MB/s per CPU core as a starting point for building a Fast Track configuration.

The Dell PowerEdge R710 is a dense 2U server with dual socket Intel Xeon 5500 series Quad-Core

processors, Intel’s 5520 I/O Hub (IOH) with QuickPath Architecture, DDR3 memory, DIMM thermal

sensors, PCI Express Generation 2, and dual-port embedded Gigabit Ethernet controllers.

CPU

The Microsoft Fast Track reference describes the balanced performance for the R710 with the Intel

Xeon 5500 Series X5550 2.66GHz Quad-Core processor.

Assuming 200MB per second per core of page-compressed data, the 8-core R710 yields an MCR of 1.6GB/s. The MCR provides a starting point for determining the bandwidth the storage must deliver to make full use of the CPU capacity. This methodology helps define a performance watermark when designing a balanced configuration that is free of bottlenecks throughout the data path, for example, storage spindles, storage controllers, SAN fabric, FC HBAs, and memory.
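The MCR arithmetic above can be written out as a short Python sketch. The constants come from this document; the per-drive figure is only an illustration that assumes reads spread evenly across the 16 data drives listed in Table 1, and the function name is hypothetical.

```python
# Illustrative arithmetic only; constants are taken from the text and Table 1.
MCR_PER_CORE_MB_S = 200   # Microsoft's suggested starting point per CPU core
CORES = 8                 # two quad-core Xeon X5550 processors
DATA_DRIVES = 16          # data drive count from Table 1


def system_mcr_mb_s(cores: int, per_core_mb_s: int = MCR_PER_CORE_MB_S) -> int:
    """Return the system-wide maximum consumption rate in MB/s."""
    return cores * per_core_mb_s


mcr = system_mcr_mb_s(CORES)
# Assumption: the scan load is spread evenly over all data drives.
per_drive = mcr / DATA_DRIVES

print(f"System MCR: {mcr} MB/s ({mcr / 1000:.1f} GB/s)")
print(f"Even per-drive share: {per_drive:.0f} MB/s")
```

Every component in the data path can be checked against the same system-wide number in this way.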

[Figure 4 (Reference Architecture With Clustering) labels: MSCS clustered nodes, Active Node and Passive Node; storage processors SP A and SP B, ports 0 and 1]

Memory

Memory sizing for a Fast Track Data Warehouse configuration depends on the data access characteristics of the workload. Microsoft recommends a minimum of 4GB of RAM per core to drive the MCR of 200MB/s per core. For an 8-core system, the minimum is therefore 32GB.

The R710 has three memory channels per socket and up to three DIMMs per channel. For configurations that require maximum bandwidth but do not necessarily benefit from more capacity, it is recommended to configure the memory with 8GB RDIMMs as follows. Figure 5 shows two sockets, each with three channels and three DDR3 DIMM slots per channel. With the 8GB DIMMs, only the first slot per channel is populated. This configuration provides maximum bandwidth with a total of 48GB of RAM.
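The memory sizing rules can be checked with a few lines of Python. This is a minimal sketch based on the figures in this document (4GB per core, two sockets, three channels per socket, three slots per channel); the function name is hypothetical, and the second population corresponds to the 12 x 4GB layout labelled in Figure 6.

```python
GB_PER_CORE_MIN = 4        # minimum RAM per core suggested in the text
CORES = 8
SOCKETS = 2
CHANNELS_PER_SOCKET = 3
SLOTS_PER_CHANNEL = 3


def total_ram_gb(dimm_gb: int, dimms_per_channel: int) -> int:
    """Total RAM when every channel holds the same number of equal DIMMs."""
    if not 0 <= dimms_per_channel <= SLOTS_PER_CHANNEL:
        raise ValueError("the R710 has three DIMM slots per channel")
    return SOCKETS * CHANNELS_PER_SOCKET * dimms_per_channel * dimm_gb


minimum_gb = GB_PER_CORE_MIN * CORES   # floor needed to drive the MCR
max_bandwidth_gb = total_ram_gb(8, 1)  # 6 x 8GB, one DIMM per channel
balanced_gb = total_ram_gb(4, 2)       # 12 x 4GB, two DIMMs per channel

for label, size in [("maximum bandwidth", max_bandwidth_gb),
                    ("balanced", balanced_gb)]:
    print(f"{label}: {size}GB, meets minimum: {size >= minimum_gb}")
```

Both populations reach the same capacity; they differ in how many DIMMs share each channel.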

Figure 5. Maximum Bandwidth Memory Configuration
[Figure labels: two quad-core sockets; DIMM slots A1 through A9 and B1 through B9; 3 memory channels per socket; 6 x 8GB DDR3 1333MHz; total RAM 48GB; 10.6GB/s per memory channel]

For a balanced configuration that does not require higher capacities, 4GB DDR3 DIMMs offer a cost-effective solution, as shown in Figure 6.

Figure 6. Balanced Performance Memory Configuration

Internal Storage

Customers can choose between two R710 chassis options: one with up to eight 2.5” 6Gbps SAS drives and one with up to six 3.5” 6Gbps SAS drives.

Figure 7 shows both chassis configured with the first two drives in a RAID 1 configuration for Operating System installation. The remaining drives are used for staging or backup space. Depending on the space required, customers can configure the drives in the following two configurations.

Figure 7. Internal Storage Configuration

Space configured (two Operating System drives plus six staging/backup drives):
  Operating System: 73GB
  Staging/Backup*:
    (6) 600GB Drives: 2.8TB
    (6) 300GB Drives: 1.4TB
    (6) 146GB Drives: 680GB

Space configured (two Operating System drives plus four staging/backup drives):
  Operating System: 143GB
  Staging/Backup*:
    (4) 600GB Drives: 1.7TB
    (4) 450GB Drives: 1.3TB
    (4) 300GB Drives: 900GB
    (4) 146GB Drives: 400GB

* Space is configured with RAID 5; other RAID types can be configured depending on space or performance requirements.

[Figure 6 (Balanced Performance Memory Configuration) labels: two quad-core sockets; DIMM slots A1 through A9 and B1 through B9; 3 memory channels per socket; 12 x 4GB DDR3 1333MHz; total RAM 48GB; 8.5GB/s per memory channel]

It is recommended to utilize the PERC 6/i or PERC H700 internal storage controller. These controllers provide internal hardware RAID capabilities (RAID 0, 1, 5, 6, 10, 50, and 60) and support mixed RAID configurations, for example, RAID 1 for Operating System drives and RAID 5 or RAID 10 for staging/backup.

For configurations in which RAID 1 is sufficient for the staging/backup area, the SAS 6/iR or H200 offers RAID 0 and RAID 1 capabilities. Note that the SAS 6/iR and H200 support a maximum of two RAID 1 sets.
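When choosing a RAID level for the staging/backup drives, it can help to compare raw usable capacity. The following Python sketch uses the standard capacity rules for each RAID level; the function name is hypothetical, and the results are raw decimal gigabytes, so they will be somewhat higher than formatted capacities reported by the controller or listed elsewhere in this document.

```python
def usable_gb(level: str, drives: int, drive_gb: float) -> float:
    """Raw usable capacity of one RAID set built from equal-sized drives."""
    if level == "RAID0" and drives >= 1:
        return drives * drive_gb            # striping, no redundancy
    if level == "RAID1" and drives == 2:
        return drive_gb                     # one mirrored pair
    if level == "RAID5" and drives >= 3:
        return (drives - 1) * drive_gb      # one drive's worth of parity
    if level == "RAID6" and drives >= 4:
        return (drives - 2) * drive_gb      # two drives' worth of parity
    if level == "RAID10" and drives >= 4 and drives % 2 == 0:
        return drives / 2 * drive_gb        # striped mirrored pairs
    raise ValueError(f"unsupported combination: {level} with {drives} drives")


# Operating System pair and two staging/backup options for six 600GB drives.
print(usable_gb("RAID1", 2, 73))
print(usable_gb("RAID5", 6, 600))
print(usable_gb("RAID10", 6, 600))
```

RAID 5 favors capacity, while RAID 10 trades capacity for mirroring.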

Database

Microsoft provides specific configuration guidelines and best practices for deploying a Data Warehouse using the Fast Track approach. These include minimum indexing, table partitioning, and table compression considerations. The major database settings that optimize SQL Server for a sequential workload include the following:

Database startup parameters
  -E: Increases the number of extents allocated for each file in a filegroup
  -T1117: Ensures the even growth of all files in the filegroup

For the complete configuration guidelines, please refer to the Microsoft Fast Track Reference

Configuration Guide listed in the “References” section of this reference configuration.
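As a hedged illustration, the following Python sketch checks whether a startup parameter string contains both options named above. It assumes the semicolon-separated form in which SQL Server 2008 Configuration Manager shows startup parameters; the file paths in the example string and the function name are hypothetical. The comparison is case-sensitive on purpose, because the lowercase -e option (error log path) is a different parameter from -E.

```python
REQUIRED_FLAGS = {"-E", "-T1117"}


def missing_flags(startup_parameters: str) -> set:
    """Return the required startup flags that are absent from the string."""
    present = {part.strip() for part in startup_parameters.split(";")
               if part.strip()}
    # Exact, case-sensitive match: "-eC:\..." must not count as "-E".
    return REQUIRED_FLAGS - present


# Hypothetical parameter string; the paths are placeholders.
example = r"-dC:\Data\master.mdf;-eC:\Log\ERRORLOG;-lC:\Data\mastlog.ldf;-E"
print(sorted(missing_flags(example)))
```

An empty result means both options are already present.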

Storage Area Network

The storage area network (SAN) configuration plays a major role in database performance. This section discusses the storage Disk Group, Logical Unit (LUN), and data file layout, switch and HBA cabling, and Multipath I/O configuration.

Storage

The Microsoft Fast Track Reference Configuration Guide provides strict guidelines for creating the RAID configurations and storage LUNs on the external storage array. In addition, the distribution of the storage LUNs between the storage processors of the array plays a significant role in getting optimum throughput from the array.

Figure 8 shows the front and back of the chassis. Note that Disk 0 through Disk 3 contain the FLARE software, which is installed on the MBR.

Figure 8. Dell | EMC AX4-5 Chassis
[Figure panels: Front; Back]

Figure 9 shows the physical Disk Group layout across both storage arrays. Microsoft specifies utilizing RAID 1 mirrored sets for database data and log files. Each AX4-5 is divided into four Data Disk Groups and one Log Disk Group.

Figure 9. Disk Group Layout

Figure 10, as follows, shows the LUN layout recommendations. Two volumes (LUNs) are carved out of

each of the RAID groups created for data files, and a single LUN utilizes the entire disk group for a Log

file. Lab tests showed that the EMC storage processor is able to maximize performance per spindle

within the mirrored set by dedicating a LUN to a spindle.

Figure 10. Disk LUN Layout

In this case, the storage processor (SP) is able to manage the requests to the backend disks in a more

streamlined manner to reduce the overall seek time of both the mirrored member disks. This

streamlining results in an overall improvement in the disk throughput.

[Figure 9 and Figure 10 labels: each array shows one Log disk group, four Data disk groups (Data1 through Data4 on one array, Data5 through Data8 on the other), and hot spares. Data LUNs 1 through 8 appear with Log LUN 1, and Data LUNs 9 through 16 appear with Log LUN 2.]

Figure 11. Storage Controller LUN Assignment

To ensure maximum performance, both LUNs from the same Disk Group should be assigned to the same

Storage Processor. The Log volumes in each of the arrays may be assigned to either SP A or SP B.
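The layout rules described in this section (two arrays, four data disk groups per array, two LUNs per data disk group, one log LUN per array, and both LUNs of a disk group on the same storage processor) can be enumerated in a short Python sketch. Alternating storage processors by disk group and placing the log LUNs on SP A are assumptions made here for illustration; they are not stated in this document.

```python
from collections import defaultdict

ARRAYS = 2
DATA_GROUPS_PER_ARRAY = 4
LUNS_PER_DATA_GROUP = 2
STORAGE_PROCESSORS = ("SP A", "SP B")


def build_lun_layout():
    """Return one dict per LUN with its array, disk group, and owning SP."""
    luns = []
    data_lun = 1
    for array in range(1, ARRAYS + 1):
        for group in range(1, DATA_GROUPS_PER_ARRAY + 1):
            group_id = (array - 1) * DATA_GROUPS_PER_ARRAY + group
            # Assumption: alternate ownership by disk group to balance SPs.
            owner = STORAGE_PROCESSORS[(group - 1) % len(STORAGE_PROCESSORS)]
            for _ in range(LUNS_PER_DATA_GROUP):
                luns.append({"name": f"Data LUN {data_lun}", "array": array,
                             "disk_group": f"Data{group_id}", "owner": owner})
                data_lun += 1
        # Either SP is allowed for the log volume; SP A is an arbitrary pick.
        luns.append({"name": f"Log LUN {array}", "array": array,
                     "disk_group": f"Log{array}",
                     "owner": STORAGE_PROCESSORS[0]})
    return luns


layout = build_lun_layout()
data_luns = [lun for lun in layout if lun["name"].startswith("Data")]

# Verify the rule that both LUNs of a disk group share one storage processor.
owners_by_group = defaultdict(set)
for lun in layout:
    owners_by_group[lun["disk_group"]].add(lun["owner"])

print(len(data_luns), "data LUNs,", len(layout) - len(data_luns), "log LUNs")
```

A check like the one on `owners_by_group` can be run against any proposed assignment before the LUNs are created.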

Storage Cache Configuration

The Dell | EMC AX4-5 has a total of 2GB of cache, 1GB per storage processor. Some performance gains may be achieved by configuring 100% of the cache for writes during data load windows. Temporary files may also benefit from storage cache configuration. However, because tempdb files share the same spindles as database files, it is recommended to increase the available memory so that sorts happen in memory rather than on disk, where they may affect overall performance. For a more in-depth discussion of storage cache configuration recommendations, refer to “Configuring cache memory for DW/DSS workloads” in the white paper at http://www.emc.com/collateral/hardware/white-papers/h5548-deploying-clariion-dss-workloads-wp.pdf.

Switch and HBA Configuration

The Microsoft Fast Track Reference Configuration Guide describes a basic approach for HBA-SAN configuration. The Dell reference architecture extends the SAN configuration to add availability to the Fast Track architecture.

Several tests were conducted in Dell labs to determine the optimized SAN configuration to complement the Fast Track concepts. The proposed Dell Microsoft Fast Track SAN configuration is shown in Figure 12.

[Figure 11 (Storage Controller LUN Assignment) labels: Data LUNs 9 through 16, Log LUN 2, and hot spares, assigned to SP A and SP B]

Figure 12. Switch and Zone Configuration

Using at least two switches in the setup adds high availability at the switch layer. Switch zoning should be configured so that all the HBA ports are able to access the storage ports. In the proposed configuration, high availability is ensured at the HBA layer and the switch layer. In other words, the failure of a single HBA or switch will not result in downtime of the configuration.
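The single-failure claim can be reasoned about with a small path model. The Python sketch below assumes that each dual-port HBA has one port cabled to each switch; the exact cabling is shown only in Figure 12, so this topology and the helper names are assumptions for illustration.

```python
HBAS = ("HBA 1", "HBA 2")
SWITCHES = ("SW1", "SW2")

# Assumption: each dual-port HBA has one port cabled to each switch.
DUAL_SWITCH_PATHS = [(hba, switch) for hba in HBAS for switch in SWITCHES]
# Counter-example: both HBAs cabled to a single switch.
SINGLE_SWITCH_PATHS = [(hba, "SW1") for hba in HBAS]


def surviving_paths(paths, failed_component):
    """Paths that do not pass through the failed HBA or switch."""
    return [path for path in paths if failed_component not in path]


def tolerates_any_single_failure(paths):
    """True if at least one path survives the loss of any one component."""
    components = {component for path in paths for component in path}
    return all(surviving_paths(paths, c) for c in components)


print("dual switch:", tolerates_any_single_failure(DUAL_SWITCH_PATHS))
print("single switch:", tolerates_any_single_failure(SINGLE_SWITCH_PATHS))
```

In the single-switch case, losing SW1 removes every path, which is why the second switch is needed for availability.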

Multipathing Options and Settings

Customers have the choice of implementing Microsoft Multipath I/O (MPIO) or EMC PowerPath® for

storage connectivity and multipathing, depending on their preference and expertise. Both solutions

are fully tested and supported by Dell Fast Track Solutions.

Microsoft MPIO

Microsoft references MPIO as the multipathing solution for Fast Track implementations because it is a fully integrated solution for external SAN connectivity. Note that EMC PowerPath is required to initialize and format the external LUNs presented to the Operating System as disks. When the disks have been fully configured, Microsoft MPIO can be utilized for multipathing.

Microsoft recommends Round Robin with Subset as the preferred load balancing policy.

EMC PowerPath

EMC PowerPath provides a number of load balancing policies for routing I/O requests. The default setting is the CLARiiON® Optimized (ClarOpt) setting. ClarOpt processes I/O requests across multiple paths based on reads and writes. ClarOpt also allows users to define priority on specific paths.

In lab testing, a 4-5% improvement was observed with the load balancing policy set to Least I/Os compared with the ClarOpt setting. The Least I/Os setting assigns I/Os to the paths with the fewest requests in the queue.

Dell recommends experimenting with various settings for a specific workload to find optimal performance.
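The idea behind a least-queued policy can be sketched in a few lines of Python. This is a conceptual model of "send each request to the path with the fewest queued requests", not PowerPath's actual implementation; the path names, queue depths, and function names are hypothetical.

```python
def pick_least_io(queue_depths: dict) -> str:
    """Return the path with the fewest queued requests (first one on ties)."""
    return min(queue_depths, key=queue_depths.get)


def dispatch(requests: int, queue_depths: dict) -> dict:
    """Queue each new request on the currently least-loaded path."""
    depths = dict(queue_depths)  # work on a copy
    for _ in range(requests):
        depths[pick_least_io(depths)] += 1
    return depths


# Hypothetical starting queue depths on four paths.
start = {"path0": 3, "path1": 0, "path2": 1, "path3": 0}
after = dispatch(4, start)
print(after)
```

New requests flow to the idle paths first, so the queues even out instead of piling up on an already busy path.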

[Figure 12 (Switch and Zone Configuration) labels: Database Server with HBA 1 and HBA 2, each with ports A and B; switches SW1 and SW2; StorageArray 1 and StorageArray 2; Zone1 and Zone2]

Conclusion

The Microsoft Fast Track architecture presents a balanced configuration of processor core and disk performance capabilities, optimized for a sequential workload. Dell, as a hardware partner, adds value to the configuration by providing best practices and recommendations at all the hardware layers. Following these recommendations helps ensure a completely balanced configuration with optimized performance specifically designed for a sequential workload.

To summarize, the Dell Microsoft Fast Track Reference Architecture provides the following advantages along with a powerful, high-performing configuration:

  Best practices at all the hardware and software layers
  Tested and validated configuration with proven methodology
  High availability at every level of the Fast Track configuration

Dell and Microsoft together provide expertise at the hardware and software layers to design and build balanced reference architectures for Data Warehouse Fast Track deployments, ensuring better out-of-box performance.

References

Dell SQL Server Solutions
www.dell.com/sql

Dell Services
www.dell.com/services

Dell Support
www.dell.com/support

Microsoft Fast Track Data Warehouse
www.microsoft.com/fasttrack

An Introduction to Fast Track Data Warehouse Architectures
http://msdn.microsoft.com/en-us/library/dd459146.aspx

Implementing a SQL Server Fast Track Data Warehouse
http://msdn.microsoft.com/en-us/library/dd459178.aspx

Microsoft Fast Track Reference Configuration Guide
http://download.microsoft.com/download/D/B/D/DBDE7972-1EB9-470A-BA18-58849DB3EB3B/FTRARefConfigGuide.docx

Microsoft Dell Data Sheet
http://download.microsoft.com/download/D/F/A/DFAAD98F-0F1B-4F8B-988F-22C3F94B08E0/Dell%20Fast%20Track%202.0%20Datasheet.pdf

Introduction to New Data Warehouse Scalability Features in SQL Server 2008
http://msdn2.microsoft.com/en-us/library/cc278097(SQL.100).aspx

Best Practices for Data Warehousing with SQL Server 2008
http://msdn.microsoft.com/library/cc719165.aspx