
René Balzano
Technology Solution Professional Data Platform
Microsoft Switzerland

SQL Server Performance Programming

This Session

is about
  How to design databases and T-SQL code in a way that helps achieve good performance
  How to monitor and analyze what might decrease the performance of your database and application

does not contain
  C#, EF, ODBC, Visual Studio, German, French, etc.

Help your DBA
  The DBA who runs the database you have programmed can compensate for many design flaws and improve performance without touching your code
  You don’t want to depend on a DBA’s skill and attention when it comes to defining your application’s performance
  Design your database and the interaction of your application with it in the best possible way, so that your database performs well even without a DBA’s intervention
  To design for optimal and less DBA-dependent performance, it helps to understand what goes on under the hood of SQL Server
  So let’s have a look…

Demo…

Scenario #1

Choosing the right keys

Page Splits and Fragmentation

Why a Clustered Index?

Heap: Unordered
Record identified by RID (file#:page#:position#)

This example (simplified):
4000 records, 550 bytes/record, 14 records/page
286 pages (#121 - #407)
(8096 bytes data/page)

file #1:page #121 (1st page of this table)
RID     ID  Name        …  Date
121:1   1   Huber       …  20070502
121:2   5   Meier       …  20010219
121:3   6   Meier       …  19880502
121:4   3   Oberst      …  20110107
121:5   …   …           …  …
…
121:11  …   …           …  …
121:12  15  Glauser     …  19620522
121:13  22  Keller      …  19811111
121:14  9   Zurbriggen  …  19910414

file #1:page #122
RID     ID  Name        …  Date
122:1   2   Amsler      …  20080502
122:2   5   Kern        …  20010319
122:3   4   Zorbas      …  20080511
122:4   7   Klaus       …  20010108
122:5   …   …           …  …
…
122:11  …   …           …  …
122:12  49  Straub      …  20020722
122:13  18  Djuric      …  19811121
122:14  15  Dankner     …  19890212

…

file #1:page #407 (286th page of this table)
RID     ID  Name        …  Date
407:1   …   …           …  …
…
407:14  …   …           …  …

Nonclustered (secondary) Index on a Heap:
Ordered by index key: Date
Record pointer is RID (file#:page#:position#)

This example (simplified):
4000 index records, 16 bytes/record, 506 records/page
9 index pages: 1 index b-tree page (#2131) + 8 index leaf pages (#2132-#2139)
(8096 bytes data/page)

file #1:page #2131 (1st page of this index)
Date      Page
19620522  1:2132
…
20010319  1:2139

file #1:page #2132 (2nd page of this index)
Date      RID
19620522  1:121:1 …
19811111  1:121:13
19811121  1:122:13
19880502  1:121:3
…
19890212  1:122:14
19910414  1:121:14
20010108  1:122:4
20010219  1:121:2

…

file #1:page #2139 (9th page of this index)
Date      RID
20010319  1:122:2
…
20020722  1:122:12
20070502  1:121:1
…
20080511  1:122:3
…
20110107  1:121:4
…
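The page counts in the two simplified examples above follow from plain arithmetic. The sketch below reproduces them, reading the record size as roughly 550 bytes (which is what 14 records per 8096-byte page implies) and ignoring per-row and per-page overhead. The function names are hypothetical and only serve this illustration.

```python
import math

PAGE_DATA_BYTES = 8096  # usable data bytes per page, as stated in the example


def records_per_page(record_bytes):
    """How many fixed-size records fit on one page (no row overhead modelled)."""
    return PAGE_DATA_BYTES // record_bytes


def pages_needed(record_count, record_bytes):
    """Number of pages needed to store record_count records."""
    return math.ceil(record_count / records_per_page(record_bytes))


heap_pages = pages_needed(4000, 550)       # data records of the heap
index_leaf_pages = pages_needed(4000, 16)  # index records (date key + RID)

print(records_per_page(550), heap_pages)
print(records_per_page(16), index_leaf_pages)
```

With one additional b-tree page on top of the leaf pages, this gives the 9 index pages quoted for the nonclustered index.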

When a record in a heap moves to a different disk location, its entry has to be updated in ALL secondary indexes, resulting in increased disk activity and reduced performance for other tasks.


Clustered Index:
Ordered by clustering key: ID
Record identified by clustering key

This example (simplified):
4000 data records, 550 bytes/record, 14 records/page
286 leaf (data) pages (#121 - #407)
4000 index records, 16 bytes/record, 506 records/page
9 index pages: 1 index b-tree page (#112) + 8 index leaf pages (#113-#120)
Total of 295 pages for this clustered index
(8096 bytes data/page)

file #1:page #112 (1st page of this clustered table)
ID    Page
1     1:113
…
22    1:120

file #1:page #113 (2nd page of this clustered table)
ID    Position
1     1:121:1
2     1:121:2
3     1:121:3
4     1:121:4
5.1   1:121:5
5.2   1:121:6
6     1:121:7
7     1:121:13
9     1:122:4
…
15.1  1:122:12
15.2  1:122:13

…

file #1:page #120 (9th page of this clustered table)
ID    Position
22    1:407:1
…
49    1:407:5
…

file #1:page #121 (10th page of this table)
Position  ID  Name        …  Date
121:1     1   Huber       …  20070502
121:2     2   Amsler      …  20080502
121:3     3   Oberst      …  20110107
121:4     4   Zorbas      …  20080511
121:5     5   Meier       …  20010219
121:6     5   Kern        …  20010319
121:7     6   Meier       …  19880502
…
121:13    7   Klaus       …  20010108
121:14    …   …           …  …

file #1:page #122
Position  ID  Name        …  Date
122:1     …   …           …  …
122:2     …   …           …  …
122:3     …   …           …  …
122:4     9   Zurbriggen  …  19910414
122:5     …   …           …  …
122:6     …   …           …  …
…
122:12    15  Dankner     …  19890212
122:13    15  Glauser     …  19620522
122:14    18  Djuric      …  19811121

…

file #1:page #407 (295th page of this table)
Position  ID  Name        …  Date
407:1     22  Keller      …  19811111
407:2     …   …           …  …
407:3     …   …           …  …
…
407:5     49  Straub      …  20020722
…
407:14    …   …           …  …


Nonclustered (secondary) Index on a Clustered Table (Index):
Ordered by index key: Date
Record pointer is Clustering Key (ID)

Same size as in previous example.

file #1:page #2131 (1st page of this index)
Date      Page
19620522  1:2132
…
20010319  1:2139

2nd page of this index
Date      ID
19620522  15
19811111  22
19811121  18
19880502  6
…
19890212  15
19910414  9
20010108  7
20010219  5.1

…

9th page of this index
Date      ID
20010319  5.2
…
20020722  49
20070502  1
…
20080511  4
…
20110107  3
…


When a record in a clustered table moves to a different disk location, its entry only has to be updated in the ONE clustered index; no secondary index has to be touched, and no extensive disk activity results.

Clustered Table (Index)

Avoid secondary updates when changing data

Why a short Clustering Key?

Secondary index on Date with a GUID as clustering key (index pages):

19620522  0813A496-817E-43DB-B01B-B7C5B0EDFA70 | Glauser | Peter
19811111  AEE9226F-0796-4D02-9C62-44B0FBCFB15B | Keller | Klara
…
19811121  9E2EF8A9-C1F5-4824-96B8-3669CF8FC875 | Djuric | Vladimir
…

20010319  036B285D-6D20-4BA2-BA3D-A4AC40B6AD8E | Kern | Beat
…
20020722  2D498FB1-8085-436E-A783-CB4E800F9AF7 | Straub | Trudi
…

Large clustering keys result in every secondary index being larger, thus increasing disk activity (number of pages to read from disk). Eventually this leads to additional levels in the B-tree, adding one or more extra IOs to EVERY read or update operation in EVERY secondary index.
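To put rough numbers on this effect, here is a minimal sketch that compares a 4-byte int with a 16-byte GUID as clustering key inside a secondary index. The 8-byte index key, the 8-byte per-entry overhead, the row count and the function names are all assumptions chosen for illustration, and upper B-tree levels are assumed to pack entries like the leaf level. Real SQL Server row formats differ in detail.

```python
import math

PAGE_DATA_BYTES = 8096   # usable bytes per page, as in the examples above
INDEX_KEY_BYTES = 8      # assumed size of the secondary index key (a date value)
ROW_OVERHEAD_BYTES = 8   # assumed per-entry overhead, purely illustrative


def entry_bytes(clustering_key_bytes):
    """Size of one secondary index entry: index key + clustering key + overhead."""
    return INDEX_KEY_BYTES + clustering_key_bytes + ROW_OVERHEAD_BYTES


def leaf_pages(row_count, clustering_key_bytes):
    """Leaf pages needed to hold row_count index entries."""
    per_page = PAGE_DATA_BYTES // entry_bytes(clustering_key_bytes)
    return math.ceil(row_count / per_page)


def btree_levels(row_count, clustering_key_bytes):
    """B-tree levels, assuming upper levels pack entries like the leaf level."""
    per_page = PAGE_DATA_BYTES // entry_bytes(clustering_key_bytes)
    pages = math.ceil(row_count / per_page)
    levels = 1
    while pages > 1:
        pages = math.ceil(pages / per_page)
        levels += 1
    return levels


ROWS = 50_000_000  # hypothetical table size
for name, key_bytes in (("int", 4), ("GUID", 16)):
    print(name, leaf_pages(ROWS, key_bytes), btree_levels(ROWS, key_bytes))
```

Under these assumptions the GUID-keyed index needs noticeably more leaf pages and one more B-tree level than the int-keyed one, which is the extra IO per lookup that the slide warns about.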

Avoid large secondary indexes and large numbers of disk IOs

Why a monotonically growing Clustering Key?

Full page, sorted by Date:
Date      ID
19620522  15
19811111  22
19811121  18
19880502  6
…
19890212  15
19910414  9
20010108  7
20010219  5.1

Inserting 19850101 leads to page split:

First page after the split:
Date      ID
19620522  15
19811111  22
19811121  18
19850101  66

Second page after the split:
Date      ID
19880502  6
…
19890212  15
19910414  9
20010108  7
20010219  5.1

Page splits occur when new data has to be inserted into ordered, full pages. A page split results in increased disk activity. An index (including the clustered table) in which many page splits have occurred is fragmented: pages with consecutive ordered data are spread over the disk, resulting in slower IO operations.

If a clustering key’s values don’t grow monotonically, page splits occur on the base table, having a large negative impact on IO performance during writes and reads (fragmentation).
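The difference between monotonically growing and randomly distributed key values can be made visible with a small simulation. This is a simplified model, not SQL Server's actual allocation logic: pages hold 14 records as in the example above, an insert at the very end of the index into a full last page simply starts a new page, and any other insert into a full page splits it in half. The function name is hypothetical.

```python
import bisect
import random

PAGE_CAPACITY = 14  # records per page, as in the simplified example


def count_page_splits(keys):
    """Insert keys one by one into ordered pages and count page splits."""
    pages = [[]]  # pages in key order, each page kept sorted
    splits = 0
    for key in keys:
        # Target page: the last page whose first key is <= key.
        lows = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(lows, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Append at the end of the index: new page, no split.
            pages.append([key])
        else:
            # Full page in the middle of the key range: split it in half.
            half = PAGE_CAPACITY // 2
            left, right = page[:half], page[half:]
            pages[idx:idx + 1] = [left, right]
            splits += 1
            bisect.insort(left if key < right[0] else right, key)
    return splits


monotonic = list(range(1000))
shuffled = monotonic[:]
random.Random(0).shuffle(shuffled)

print("monotonic keys:", count_page_splits(monotonic))
print("random keys:   ", count_page_splits(shuffled))
```

In this model monotonically growing keys never trigger a split, while the same keys in random order do.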

Avoid Page Splits

Indexes and Performance

To minimize disk activity when inserting and updating data and to reduce the number of disk IOs when reading data (= keep fragmentation low):

Always have a clustered index

Define clustering keys
  small
  monotonically growing
  with unchanging values

Be prepared for rebuilding indexes as they show fragmentation

Clustering Keys

Great
  int with IDENTITY clause (SEQUENCE in Denali)
  date or datetime, e.g. a timestamp value

To be avoided
  GUID, UNIQUEIDENTIFIER (or at least use NEWSEQUENTIALID)
  varchar fields, at least those that aren’t tiny
  composite keys with multiple fields

Monitoring Fragmentation

Check fragmentation of your indexes (tables):
  select * from sys.dm_db_index_physical_stats(db_id(),null,null,null,null)
  Goal: As low as possible, reorg above 10%, rebuild above 30%

Check page density:
  DBCC SHOWCONTIG
  Goal: As high as possible (also depends on record size and fillfactor)
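The rule of thumb above (reorganize above 10%, rebuild above 30%) can be written down as a small decision helper. This is only a sketch of that rule; the function name is hypothetical, and the input is assumed to be a fragmentation percentage such as the avg_fragmentation_in_percent column returned by sys.dm_db_index_physical_stats.

```python
def maintenance_action(fragmentation_percent):
    """Map a fragmentation percentage to the action suggested by the rule of thumb."""
    if fragmentation_percent > 30:
        return "rebuild"
    if fragmentation_percent > 10:
        return "reorganize"
    return "none"


for pct in (4.0, 17.5, 62.0):
    print(pct, maintenance_action(pct))
```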

Demo…

Scenario #2

Don’t be afraid of the CXPACKET

Parallelism

SQL Server tries to parallelize over all available cores (minus 1) by default
Parallelism is generally great for querying, but not necessarily so in OLTP settings

Be careful:
  Seeing CXPACKET wait stats often leads programmers to use MAXDOP 1 to avoid parallelization
  CXPACKET waits are not necessarily bad; they occur in most ‘healthy’ parallelization settings too
  If SQL Server parallelizes wrongly (so that you would see high numbers for CXPACKET and use MAXDOP 1), this could also be due to bad indexing or outdated statistics

Still:
  Parallelizing generates overhead for splitting up the workload and later recombining the results
  In certain settings (usually OLTP with many writes) MAXDOP 1 improves performance (e.g. recommended server-wide setting for SharePoint’s SQL Server configuration)
  Since as a developer you won’t know what the DBA sets with sp_configure, consider using the MAXDOP clause in your code when you know that parallelism isn’t useful

Monitoring Parallelism

If you think that parallelism might be the source of a performance bottleneck:

  See if your query plan is a parallel one
  Check CXPACKET values in sys.dm_os_wait_stats

Demo…

Scenario #3

From Cubes to Columnstore: Life gets easier

Don’t denormalize

Denormalization
Many techniques exist to denormalize a technical data model for improved performance

  Cubes with pre-calculated aggregates
  Temporary tables with redundant copies of values from related tables and pre-calculated aggregates
  Denormalized technical models with redundant values

Technologies exist and evolve that make denormalization less necessary

  Indexed views can replace temporary tables
  Compressed tables and indexes improve performance per se
  Light indexing based on columnar storage can further improve performance without touching the base tables or indexing them

Indexed Views
  An index on a view persists the view’s data content to disk
  Using indexed views instead of data ‘manually’ copied to redundant temporary tables or replicated columns relieves you from maintaining the redundant objects
  Using temporary objects in Stored Procedures can lead to increased recompiles
  Since updates represent only a single-digit percentage of all operations in the majority of database applications, the update overhead for an additional index (on the view) is very often negligible
  Mind the schema-binding requirements for indexed views

Compression

  Database compression is a feature of SQL Server 2008’s Enterprise Edition and above
  Compressing indexes and tables improves performance substantially, since a smaller number of disk pages have to be accessed
  For medium and large databases even processor load goes down, since fewer pages have to be maintained, reducing management overhead
  Compression is transparent for any application; you don’t have to touch any code when you start using compression

Columnstore Indexes
SQL Server 2012 introduces columnstore indexes, based on Vertipaq technology (PowerPivot)
Columnstore indexes speed up data access hugely through

  A new storage architecture (columnar instead of row-wise)
  Much higher compression than previously
  Highly improved access algorithms

A single columnstore index on a table covers any query that is run against that table

  You no longer need to create and maintain a separate covering index for every important query, making you less dependent on your DBA to rebuild them etc.
  The original technical data model performs much better, without denormalization and without the use of extensive indexing (-> light indexing, see PDW)

Caveat in first release: The base table under a columnstore index will be read-only (workarounds exist)

How columnstore speeds up queries

ID  Name  City      State
1   John  Seattle   WA
2   Jane  Redmond   WA
3   Jill  Redmond   OR
4   Jane  Bellevue  WA

Row Store
1 John Seattle WA
2 Jane Redmond WA
3 Jill Redmond OR
4 Jane Bellevue WA

Column Store
1 2 3 4
John Jane Jill Jane
Seattle Redmond Redmond Bellevue
WA WA OR WA

Fetches only needed columns from disk
  Less IO
  Better buffer hit rates

[Figure: columns C1 to C6 stored separately; example query: SELECT region, sum (sales) …]
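The four-row example above can be used to sketch why a column store touches less data. The snippet below holds the same rows once row-wise and once column-wise and counts how many values each layout has to read to answer a question that only needs the State column. The counting is a stand-in for disk IO, and the function names are hypothetical.

```python
COLUMN_NAMES = ("ID", "Name", "City", "State")

# Row store: one tuple per record.
rows = [
    (1, "John", "Seattle", "WA"),
    (2, "Jane", "Redmond", "WA"),
    (3, "Jill", "Redmond", "OR"),
    (4, "Jane", "Bellevue", "WA"),
]

# Column store: one list per column, built from the same data.
columns = {name: [row[i] for row in rows] for i, name in enumerate(COLUMN_NAMES)}


def count_by_state_row_store(rows):
    """Every row is read in full, although only State is needed."""
    counts, values_read = {}, 0
    for row in rows:
        values_read += len(row)
        counts[row[3]] = counts.get(row[3], 0) + 1
    return counts, values_read


def count_by_state_column_store(columns):
    """Only the State column is read."""
    counts, values_read = {}, 0
    for state in columns["State"]:
        values_read += 1
        counts[state] = counts.get(state, 0) + 1
    return counts, values_read


print(count_by_state_row_store(rows))
print(count_by_state_column_store(columns))
```

Both layouts return the same counts per state; the row layout reads all 16 stored values, the column layout only the 4 values of the State column.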


Advanced query processing technology
Batch mode execution of some operations

  Processes column data in batches
  Groups of batch operations in query plan

Compact data representation
Highly efficient algorithms
Better parallelism


Column Segment
  Segment contains values from one column for a set of rows
  Segments for the same set of rows comprise a row group
  Segments are compressed
  Each segment stored in a separate LOB
  Segment is unit of transfer between disk and memory

[Figure: columns C1 to C6, each divided into segments; the segments covering the same set of rows form a row group]

By the way: Recompilation

Stored Procedures will be recompiled automatically
  if relevant information was not available to the optimizer when they were last compiled
  or when such information has changed in the meantime (e.g. the structure of a referenced table)

Recompiles have a negative impact on performance, increasing processor load and blocking (locked objects during compilation)
Make sure that you place DML in a block at the beginning of a SP and use temporary objects and SET statements defensively
Use Profiler (SP:Recompile Event Class) to analyze the reasons for recompilations
SQL Server 2008 and 2012 reduce recompiles considerably too

Demo…

Scenario #4

Manage your isolation levels actively

Locking and Blocking

Transaction Isolation
The transaction isolation level defines

  how you want to access data that is in use by others
  how you want others to be restricted when accessing the same data that you are using

The transaction isolation level is a property of your database connection
It is usually defined as a default of the client application or library (e.g. Tools-Options in SSMS)

The default setting generally is READ COMMITTED
With this setting,

  you prevent some operations for others while you’re in a transaction
  you often wait unnecessarily even if you just read uncritical and unchanging data (e.g. for reporting and data warehousing)

Frequently waiting for locks to be lifted (being blocked) makes applications slow; they’re just waiting all the time...
Consider

  setting your isolation level to READ UNCOMMITTED for reads in uncritical situations
  using SQL Server 2008’s READ COMMITTED SNAPSHOT ISOLATION mode (with this you depend on your DBA; it also implies increased load for tempdb)

Monitoring for Blocking

Via SSMS
  sp_lock
  sys.dm_tran_locks (allows WHERE)
  sys.dm_os_wait_stats (allows WHERE)

Via Performance Monitor
  MSSQL$yourinstance:Locks
  Lock Waits/sec etc.

Demo…

Scenario #5

Beware of growing their content

Varchar fields

As long as their content is short, varchar fields are placed on the same disk page as the rest of their record
If an existing varchar field value is updated to a longer value that no longer fits on the same page, it is offloaded to a separate disk area, with a link remaining on the original page
This operation creates additional disk IOs that will impact the database’s performance (fragmentation could stay beneath the common 30% threshold for rebuilds)
If your application follows the habit of

  first creating a new record with empty or default values
  then reading back the default values and doing some additional stuff on the client
  and finally updating the new record to its final values

you might be doing just that by default…
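A back-of-the-envelope model shows how many records can be affected by this pattern. The sketch assumes hypothetical sizes (100 bytes for a record created with empty varchar fields, 300 bytes once the final values are written) and simply compares how many records were packed onto a page at insert time with how many still fit after they have grown. It does not model how SQL Server actually offloads the values.

```python
PAGE_DATA_BYTES = 8096  # usable data bytes per page, as in the earlier examples


def records_not_fitting_after_growth(initial_bytes, final_bytes):
    """Records per page that no longer fit once every record has grown."""
    placed_at_insert = PAGE_DATA_BYTES // initial_bytes
    fit_after_update = PAGE_DATA_BYTES // final_bytes
    return max(0, placed_at_insert - fit_after_update)


# Insert empty, then update to the final values (assumed 100 -> 300 bytes).
print(records_not_fitting_after_growth(100, 300))

# Insert with the final values right away (assumed 300 bytes from the start).
print(records_not_fitting_after_growth(300, 300))
```

In this model, creating records at their final size leaves nothing to move, while the create-then-update habit leaves most records of a page without room for their grown values.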

Watch out for additional performance topics at http://blogs.technet.com/b/swisssql

There is more to say

Review
  Page Splits and Fragmentation: Choosing the right keys
  Parallelism: Don’t fear the CXPACKET
  Don’t denormalize: Indexed views, compression and columnstore
  Locking and Blocking: Manage your isolation levels actively
  Varchar fields: Beware of growing their content

The Tools

DMV
  select sys.dm_ (IntelliSense will help you further)

SSMS Settings
  SET STATISTICS IO ON

Profiler
  Watch out for Extended Events

Performance Monitor

Please help us make TechDays even better by evaluating this session. Thank you!

Give us your feedback!

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
