stephen archbold. working with sql server for 6+ years former production dba for 24/7 high volume...

Stephen Archbold

About me
- Working with SQL Server for 6+ years
- Former Production DBA for a 24/7 high-volume operation
- Currently SQL Server consultant at Prodata
- Specialising in Performance Tuning and Consolidation
- Blog at http://blogs.Prodata.ie and http://simplesql.blogspot.com
- Get me on Twitter @StephenArchbold

Agenda
- Data Filegroup/File Fundamentals
- Storage Design Patterns
  - OLTP
  - Data Warehousing, Fast Track style
  - Data Warehousing on a SAN
- What other go faster buttons have we got?
- Case Study: The unruly fact table
- How do we make the changes?

Filegroup/File Fundamentals
General filegroup recommended practices:
- Nothing but system tables in PRIMARY
- Separate filegroups for:
  - I/O patterns
  - Different volatility
  - Data age
If using multiple files in a filegroup:
- Files must be equally sized
- Files must be equally full
- SQL does not redistribute data when adding more files
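Why "equally sized" and "equally full" matter can be sketched with a deliberately simplified model of proportional fill, in which each file receives new writes in proportion to its free space. The function name and the sizes are hypothetical, and the real engine is more involved; this only illustrates how a newly added, mostly empty file ends up taking most of the new data.

```python
def expected_allocation(free_mb, write_mb):
    """Split write_mb across files in proportion to each file's free space.

    Simplified model (assumption): weights are fixed for the whole write.
    """
    total_free = sum(free_mb)
    if write_mb > total_free:
        raise ValueError("not enough free space in the filegroup")
    return [write_mb * free / total_free for free in free_mb]


# Two equally sized, equally full files: writes are spread evenly.
equal = expected_allocation([1000, 1000], 400)

# An older, nearly full file plus a newly added empty one: the new file
# takes most of the writes, so I/O is no longer balanced.
skewed = expected_allocation([200, 1000], 600)

print(equal)
print(skewed)
```

In the second case five sixths of the new data lands in the new file, which is the imbalance the slide warns about.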

Pattern 1 - OLTP
- Transactional processing is all about speed
- You want to get the transaction recorded and the user out as quickly as possible
- The metric for throughput becomes less about MB/sec and more about transactions and I/Os per second

Challenges of OLTP
- Solid State Disk is becoming more commonplace
- These thrive on random I/O
- As the databases can be small, file/filegroup layout can suffer
- Faster disk brings different challenges

[Diagram: filegroup to file layout. PRIMARY: MyDB.MDF; Transactions: File1.NDF, File2.NDF; Reference: Ref.NDF; Volatile: Volatile.NDF. Callout: PAGELATCH!]

Facts and Figures

Behind the scenes
Wait Type              Single File %   Two Files %
SOS_SCHEDULER_YIELD    55              66
PAGEIOLATCH_EX         17              12
PAGELATCH_SH           15              10
ASYNC_IO_COMPLETION    5               7
PAGELATCH_UP           5               -
SLEEP_BPOOL_FLUSH      -               2

What can we take away from this?
- Resolving in-memory contention lies with the file layout
- This is actually nothing new; TempDB has been tuned this way for years!
- Keep in mind, files are written to in a round robin fashion

Data Warehousing

Pattern 2 - Fast Track Scenario
- Large volume
- Star schema
- Need to optimize for sequential throughput
- Scanning entire table
- Not shared storage

[Diagram: large partitioned fact table, one file/LUN per partition (MyFact_part1.NDF to MyFact_part12.NDF for Partition 1 to Partition 12). Enclosure 1 holds partitions 1 to 4 behind HBA 1, Enclosure 2 holds partitions 5 to 8 behind HBA 2, and a further enclosure holds partitions 9 to 12 behind HBA 3. Each enclosure is split across Controller 1 and Controller 2 and feeds its own CPU.]

Fast Track Pros and Cons
Pros:
- Easy to figure out your needs
- Simple, cheap and fast
- In-depth guidance available from Microsoft
Cons:
- Not recommended for pinpoint queries
- Only really for processing entire data sets
- Need a VERY understanding infrastructure team

Pattern 3 - Data Warehouse on SAN
- Large volume
- Star schema
- Cannot optimize for sequential throughput
- Shared storage
- More mixed workload

Goal - Large Request Size
- We need Read Ahead
- Enterprise edition is capable of issuing a request for 512KB on a single read ahead request (on Standard you're stuck at 64K)
- It can issue several of these (outstanding I/O) at a time, up to 4MB
- But you may not even be close to 512KB
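The request sizes above translate into pages and outstanding requests with a little arithmetic. The sketch below is illustrative only: the 8KB page size is standard for SQL Server, the 512KB and 4MB figures come from the slide, and the function and constant names are made up for this example. The two sample averages (181K and 511K) are the Avg IO figures reported later in the case study.

```python
PAGE_KB = 8                     # SQL Server data page size
ENTERPRISE_REQUEST_KB = 512     # largest single read-ahead request (from the slide)
STANDARD_REQUEST_KB = 64        # Standard edition limit (from the slide)
MAX_OUTSTANDING_KB = 4 * 1024   # read-ahead I/O in flight at once (from the slide)

# How many pages one read-ahead request covers on each edition.
pages_enterprise = ENTERPRISE_REQUEST_KB // PAGE_KB
pages_standard = STANDARD_REQUEST_KB // PAGE_KB

# How many full-size Enterprise requests fit in the 4MB ceiling.
requests_in_flight = MAX_OUTSTANDING_KB // ENTERPRISE_REQUEST_KB


def fraction_of_target(avg_io_kb):
    """Observed average read size as a fraction of the 512KB target."""
    return round(avg_io_kb / ENTERPRISE_REQUEST_KB, 2)


print(pages_enterprise, pages_standard, requests_in_flight)
print(fraction_of_target(181), fraction_of_target(511))
```

An average read of 181K is only about a third of the target, while 511K is effectively the full request size.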

How close are you to the 512k Nirvana?
- Run something like: [query shown on the slide was not captured]
- And watch this guy: [counter shown on the slide was not captured]

Fragmentation - Party Foul Champion
- The #1 killer of read ahead
- Read ahead size will be reduced if the pages being requested aren't in logical order
- Being a diligent type, you rebuild your indexes
- Because SQL is awesome, it does this using parallelism!
- So what's the catch? A parallel rebuild can itself leave pages out of logical order
- If Read Ahead is your goal, use MAXDOP 1 to rebuild your indexes!
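A serial rebuild is requested with the MAXDOP index option on ALTER INDEX. As a rough sketch, a small Python helper could render the statement for a given index; the helper names and the schema, table and index names (dbo, MyFact, IX_MyFact_Date) are hypothetical and not taken from the talk.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def rebuild_serial(schema, table, index):
    """Render an ALTER INDEX ... REBUILD statement limited to one thread."""
    return (
        f"ALTER INDEX {quote_name(index)} "
        f"ON {quote_name(schema)}.{quote_name(table)} "
        "REBUILD WITH (MAXDOP = 1);"
    )


# Hypothetical index on a hypothetical fact table.
statement = rebuild_serial("dbo", "MyFact", "IX_MyFact_Date")
print(statement)
```

The trade-off is rebuild time: a single-threaded rebuild of a large index takes longer, so it is worth scheduling accordingly.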

[Diagram: Enclosure 1. Filegroup PRIMARY with a single file/LUN, MyDB.MDF.]

[Diagram: Enclosure 1, filegroup to file/LUN layout. Primary: MyDB.MDF; Dimensions: Dimensions.NDF; Volatile: Staging.NDF; Facts: Facts.NDF, Fact.NDF; Large Fact: Partition1.NDF to Partition8.NDF, one file for each of Partition 1 to Partition 8.]

Getting data out of your Data Warehouse for Analysis Services
- How does Analysis Services pull in data?

Do we have any go faster buttons?
- On read-heavy workloads and Enterprise Edition: Compression
- If storing multiple tables in a filegroup: -E
  - For Data Warehouses
  - This allocates 64 extents (4MB) per table, per file, rather than the standard 1 (64K)
- If using multiple files in a filegroup: -T1117
  - For all
  - This ensures that if auto growth occurs on one file, it occurs on all the others, so round robin remains in place
- In general, on dedicated SQL servers: evaluate -T834
  - Requires Lock Pages in Memory to be enabled
  - This enables large page allocations for the Buffer Pool (2MB to 16MB)
  - Can cause problems if memory is fragmented by other apps

Case Study - The Unruly Fact Table
- 3 terabyte Data Warehouse
- Table scan was topping out at 300 MB/sec
- Storage was capable of 1.7 GB/sec
- Table partitioning was in place
- All tables were in a single filegroup
- Had to get creative on enhancing throughput

Test Conditions
- 16 core server, Hyper-Threaded to 32 cores
- 128 GB of memory
- SQLIO fully sequential: storage gives 2.2 GB/sec
- 32 range scans started up to simulate the workload
- Page compression was enabled, and the T834 trace flag was enabled
- MAXDOP of 1 on the server to ensure the number of threads was controlled

Facts and Figures

Other metrics
Scenario                            Time (secs)   Avg IO (K)   Avg MB/sec   Max MB/sec
Baseline                            70            181          575          622
Post index rebuild with MAXDOP 1    68            181          615          668
Single FG for large table(s)        56            511          942          1,196
Multiple FG, one per partition      42            505          1,150        1,281
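To put the scenarios side by side, the figures can be expressed relative to the baseline. This is only a quick sketch over the numbers reported on the slide (70, 68, 56 and 42 seconds; 575, 615, 942 and 1,150 MB/sec on average); the variable and function names are illustrative.

```python
# (time in seconds, average MB/sec) per scenario, as reported on the slide.
results = {
    "Baseline": (70, 575),
    "Post index rebuild with MAXDOP 1": (68, 615),
    "Single FG for large table(s)": (56, 942),
    "Multiple FG, one per partition": (42, 1150),
}

base_secs, base_mb = results["Baseline"]


def relative_to_baseline(secs, avg_mb):
    """Return (speedup in elapsed time, gain in average throughput)."""
    return round(base_secs / secs, 2), round(avg_mb / base_mb, 2)


comparison = {name: relative_to_baseline(*row) for name, row in results.items()}

for name, (speedup, gain) in comparison.items():
    print(f"{name}: {speedup}x faster, {gain}x throughput")
```

The index rebuild alone barely moves the needle, while one filegroup per partition doubles average throughput and cuts the elapsed time by about 40%.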

How do we make the changes?
- Thankfully easy: index rebuilds!
- For non-partitioned tables, drop and re-create the index on the new filegroup
- For partitioned tables, alter the partition scheme to point to the new filegroup
- For heaps, create a clustered index on the table on the new filegroup, then drop it!

Summary
- File and filegroup considerations can yield huge gains
- Know your workload and optimise for it
- If you have a hybrid workload, then have a hybrid architecture!
- Don't neglect your SQL settings
- Code changes and indexes aren't the only way to save the day!

Useful links
- Paul Randal, multi file/filegroup testing on Fusion IO: http://www.sqlskills.com/blogs/paul/benchmarking-multiple-data-files-on-ssds-plus-the-latest-fusion-io-driver/
- Fast Track Configuration Guide: http://msdn.microsoft.com/en-us/library/gg605238.aspx
- Resolving Latch contention: http://www.microsoft.com/en-us/download/details.aspx?id=26665
- Maximizing Table Scan Speed on SQL 2008 R2: http://henkvandervalk.com/maximizing-sql-server-2008-r2-table-scan-speed-from-dsi-solid-state-storage
- Specifying storage requirements (Find that sweet spot!): http://blogs.prodata.ie/post/How-to-Specify-SQL-Storage-Requirements-to-your-SAN-Dude.aspx
- Fragmentation in Data Warehouses: http://sqlbits.com/Sessions/Event9/The_Art_of_War-Fast_Track_Data_Warehouse_and_Fragmentation
- Partial Database Availability and Piecemeal restores: http://technet.microsoft.com/en-US/sqlserver/gg545009.aspx

Thank you!
