sql query performance analysis

Post on 20-Jan-2015

DESCRIPTION

SQL Server query optimization

TRANSCRIPT

SQL Query Performance Analysis

Topics To Cover

• Query Optimizer

• Ad hoc queries

• Execution Plan

• Statistics Analysis

Query Optimizer

The query optimizer in SQL Server is cost-based. The cost it estimates covers:

1. The cost of using different resources (CPU and I/O)

2. Total execution time

It determines the cost by using:

• Cardinality: the total number of rows processed at each level of a query plan, estimated with the help of histograms, predicates and constraints.

• Cost model of the algorithm: the cost of the operations the plan performs, such as sorting, searching and comparisons.
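To make the role of histograms concrete, here is a minimal Python sketch of estimating the row count for an equality predicate from histogram steps. The step layout (upper boundary key, rows equal to the key, rows inside the range, distinct values inside the range) and every number in it are assumptions made for illustration; the real estimator in SQL Server is more involved.

```python
from bisect import bisect_left

# Hypothetical histogram steps for an integer column (made-up numbers):
# (range_hi_key, eq_rows, range_rows, distinct_range_rows)
#   eq_rows             rows equal to range_hi_key
#   range_rows          rows strictly between the previous key and this key
#   distinct_range_rows distinct values among those range rows
HISTOGRAM = [
    (100, 10, 0, 0),
    (200, 50, 400, 80),
    (300, 5, 90, 30),
]


def estimate_equality_rows(value):
    """Estimate how many rows satisfy `column = value`."""
    keys = [step[0] for step in HISTOGRAM]
    i = bisect_left(keys, value)
    if i == len(keys):
        return 0.0  # value lies above the last step
    hi_key, eq_rows, range_rows, distinct_range_rows = HISTOGRAM[i]
    if value == hi_key:
        return float(eq_rows)  # exact hit on a step boundary
    if distinct_range_rows == 0:
        return 0.0  # no rows recorded inside this range
    # Inside a range: assume rows are spread evenly over the distinct values.
    return range_rows / distinct_range_rows


print(estimate_equality_rows(200))  # boundary value
print(estimate_equality_rows(150))  # value inside the second step's range
```

The estimate for a boundary value comes straight from the step, while a value inside a range gets the average rows per distinct value, which is why skewed data can lead to poor estimates.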

Ad hoc queries

Any non-parameterized query is called an ad hoc query. For example:

SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 100

When SQL Server executes a query, the query goes through two steps, as in other programming languages:

1. Compilation

2. Execution

Properties of ad hoc queries

• Case sensitive

• Space sensitive

• Parameter sensitive

SQL Server treats two queries that are identical except for their parameter values as different SQL statements. For example:

• SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 1

• SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 2
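The sensitivity described above can be pictured with a toy model in Python. The class `ToyPlanCache` is hypothetical and only mimics the idea that a cached plan is found by the exact statement text; it is not how SQL Server is implemented.

```python
class ToyPlanCache:
    """Toy model: cached plans are looked up by the exact statement text."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def execute(self, sql_text):
        if sql_text not in self.plans:
            self.compilations += 1  # cache miss: pay the compilation cost
            self.plans[sql_text] = "plan #%d" % self.compilations
        return self.plans[sql_text]


cache = ToyPlanCache()
base = "SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = 1"
statements = [
    base,
    base,                               # identical text: the plan is reused
    base.replace("SELECT", "select"),   # differs only in case
    base.replace(" WHERE", "  WHERE"),  # differs only in spacing
    base.replace("= 1", "= 2"),         # differs only in the literal value
]
for text in statements:
    cache.execute(text)

print(len(statements), "executions,", cache.compilations, "compilations")
```

Only the second execution reuses a plan; every other variation counts as a new statement in this model.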

Effect of faulty C# code

• SQL Server takes an extra n * (compilation time) ms to display the records, because each of the n distinct query texts is compiled separately.

• Extra time is spent inserting records into the cached plans.

• SQL Server has to run a job frequently to delete cached plans, since the cache reaches its maximum limit very quickly.

• This degrades the performance not only of this query but of the queries of every other application, since the faulty code forces the cached plans of other SQL statements to be deleted.

Prepared queries

Example:

(@Msgid int)SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = @Msgid

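• The statement text is fixed and the value is passed separately, so the case, space and parameter sensitivity of ad hoc queries no longer causes extra compilations. This is our goal.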
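A minimal sketch of the same idea from application code, using Python's built-in sqlite3 module as a stand-in (SQLite is not SQL Server, and the Severity values are made up). The `?` placeholder plays the role of @Msgid: the statement text stays constant and only the bound value changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SqlMessage (MsgID INTEGER, Severity INTEGER)")
conn.executemany(
    "INSERT INTO SqlMessage (MsgID, Severity) VALUES (?, ?)",
    [(1, 10), (2, 16), (100, 20)],  # made-up sample rows
)

# One constant statement text; the value is bound separately at execution.
QUERY = "SELECT MsgID, Severity FROM SqlMessage WHERE MsgID = ?"


def fetch_message(msg_id):
    """Run the parameterized query for one MsgID value."""
    return conn.execute(QUERY, (msg_id,)).fetchall()


results = {msg_id: fetch_message(msg_id) for msg_id in (1, 2, 100)}
print(results)
```

Building the text by string concatenation instead would produce a different statement for every value, which is the ad hoc pattern to avoid.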

Stored procedure:

• It is a named set of SQL statements whose execution plan is compiled on first use and then reused from the plan cache, so every call follows a common execution plan.

Execution Plan

• What is an index in SQL Server?

An index is a way of organizing the data in a table to make operations such as searching, sorting and grouping faster. In other words, we need an index when a query has:

• a WHERE clause (searching)

• an ORDER BY clause (sorting)

• a GROUP BY clause (grouping), etc.

Table scan:

RollNo  Name     Country     Age
101     Greg     UK          23
102     Sachin   India       21
103     Akaram   Pakistan    22
107     Miyabi   China       18
108     Marry    Russia      27
109     Scott    USA         31
110     Benazir  Bangladesh  17
111     Miyabi   Japan       24
112     Rahul    India       27
113     Nicolus  France      19

SELECT * FROM Student WHERE RollNo = 111

Time complexity of a table scan: O(n)

Clustered index

• When we create a clustered index on a table, the physical organization of the table changes.

• The table's data is now stored as a balanced search tree (B-tree), ordered by the index key.
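The difference between an O(n) scan and a search over ordered keys can be sketched in Python with the RollNo values from the table above. Binary search over a sorted list is used here as a simplified stand-in for descending a B-tree; a real B-tree has wide nodes and is not a binary tree.

```python
ROLL_NUMBERS = [101, 102, 103, 107, 108, 109, 110, 111, 112, 113]


def scan_probes(keys, target):
    """Table-scan style: examine every key, since nothing says where matches are."""
    probes = 0
    found = False
    for key in keys:
        probes += 1
        if key == target:
            found = True
    return found, probes


def sorted_search_probes(keys, target):
    """Binary search over sorted keys, counting how many keys are examined."""
    lo, hi = 0, len(keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] == target:
            return True, probes
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes


print("scan:  ", scan_probes(ROLL_NUMBERS, 111))
print("search:", sorted_search_probes(ROLL_NUMBERS, 111))
```

The scan touches all ten keys, while the ordered search needs a number of probes that grows only logarithmically with the number of rows.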

Types of scanning

• Table scan: It is very slow and is used only if the table has no clustered index.

• Index scan: It is also slow. It is used when the table has a clustered index and either the WHERE clause contains non-key columns, or the query is not covered (discussed later), or both.

• Index seek: It is very fast. Our goal is to achieve this.
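The move from a scan to a seek can be observed in a small experiment. The sketch below uses Python's sqlite3 module as a stand-in, so the plan wording is SQLite's (SCAN and SEARCH), not SQL Server's, and SQLite has no direct equivalent of the clustered index scan. The index name IX_Student_RollNo is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (RollNo INTEGER, Name TEXT, Country TEXT, Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Greg", "UK", 23), (111, "Miyabi", "Japan", 24), (112, "Rahul", "India", 27)],
)


def plan_details(sql):
    """Return the human-readable detail text of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


QUERY = "SELECT * FROM Student WHERE RollNo = 111"

before = plan_details(QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX IX_Student_RollNo ON Student (RollNo)")
after = plan_details(QUERY)   # the index allows a direct search

print("before:", before)
print("after: ", after)
```

In SQL Server the corresponding check is to compare the operators shown in the execution plan before and after creating the index.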

Terms of execution plan

• Predicate: A condition in the WHERE clause on a column that is either a non-key column or not covered.

• Object: The name of the source the data is read from. It can be a table, a clustered index or a non-clustered index.

• Output list: The names of the columns read from the object.

• Seek Predicate: A condition in the WHERE clause on a column that is either a key column or fully covered.

Non-clustered index

• It is a logical organization of the table's data. A non-clustered index can be of two types:

1. Based on a heap

2. Based on a clustered index

• If the table has a clustered index, the leaf nodes of the non-clustered index hold the key columns of the clustered index.

• If the table has no clustered index, the leaf nodes of the non-clustered index hold a RID (row identifier), which is unique for each row of the table.

[Figure: non-clustered index based on a clustered index]

[Figure: non-clustered index based on a heap]
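The two kinds of leaf entry can be modelled with plain Python dictionaries. This is only a toy picture under stated assumptions: the clustered index is on RollNo, the non-clustered index is on Country, and a list position stands in for the RID.

```python
# Rows: (RollNo, Name, Country, Age), taken from the Student table.
ROWS = [
    (101, "Greg", "UK", 23),
    (102, "Sachin", "India", 21),
    (112, "Rahul", "India", 27),
]

# Case 1: table with a clustered index on RollNo (assumed).
clustered = {row[0]: row for row in ROWS}  # clustering key -> full row
by_country_clustered = {}                  # Country -> clustering keys
for row in ROWS:
    by_country_clustered.setdefault(row[2], []).append(row[0])

# Case 2: heap, where the list position stands in for the RID.
heap = list(ROWS)
by_country_heap = {}                       # Country -> RIDs
for rid, row in enumerate(heap):
    by_country_heap.setdefault(row[2], []).append(rid)


def lookup_clustered(country):
    """Find clustering keys in the index, then fetch rows by key."""
    return [clustered[key] for key in by_country_clustered.get(country, [])]


def lookup_heap(country):
    """Find RIDs in the index, then fetch rows by position."""
    return [heap[rid] for rid in by_country_heap.get(country, [])]


print(by_country_clustered["India"], lookup_clustered("India"))
print(by_country_heap["India"], lookup_heap("India"))
```

Both lookups return the same rows; what differs is the pointer stored in the index and therefore the second step needed to reach the full row.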

Covering of queries

• We can specify a maximum of 16 column names in the index key.

• The combined size of the key columns cannot exceed 900 bytes.

• All columns must belong to the same table.

• The data type of a column cannot be ntext, text, varchar(max), nvarchar(max), varbinary(max), xml or image.

• A column cannot be a non-deterministic computed column.
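What it means for a query to be covered can be shown with a short experiment, again using Python's sqlite3 module as a stand-in for SQL Server. The index name and column choice are assumptions, and the plan text is SQLite's own wording.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (RollNo INTEGER, Name TEXT, Country TEXT, Age INTEGER)"
)
# Hypothetical index whose key holds both Country and Age.
conn.execute("CREATE INDEX IX_Student_Country_Age ON Student (Country, Age)")


def plan_detail(sql):
    """Return the detail text of the first plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][-1]


# Every referenced column is in the index: the table itself is never read.
covered = plan_detail("SELECT Country, Age FROM Student WHERE Country = 'India'")

# Name is not in the index: each match needs an extra lookup into the table.
not_covered = plan_detail("SELECT Name FROM Student WHERE Country = 'India'")

print(covered)
print(not_covered)
```

A query is covered when the index alone can answer it, which is why adding one missing column to an index can remove the lookup step.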

Statistics Analysis

• The query optimizer uses statistics to create query plans that improve query performance.

• Accurate statistics lead to high-quality query plans.

• Automatic creation of statistics applies strictly to single-column statistics; automatic updates also cover statistics created for indexes.

• The query optimizer determines when statistics might be out-of-date by counting the number of data modifications since the last statistics update and comparing the number of modifications to a threshold.
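The threshold comparison can be sketched as a small Python function. The slides do not give the threshold itself; the rule used here (500 modifications for tables of up to 500 rows, otherwise 500 plus 20% of the rows) is an assumption based on the commonly documented behaviour of older SQL Server versions, and newer versions lower it for large tables.

```python
def modification_threshold(row_count):
    """Assumed legacy rule: 500 changes for small tables, else 500 + 20% of rows."""
    if row_count <= 500:
        return 500
    return 500 + row_count // 5  # integer 20% keeps the arithmetic exact


def statistics_look_stale(row_count, modifications_since_update):
    """Compare the modification counter against the assumed threshold."""
    return modifications_since_update >= modification_threshold(row_count)


for rows, changes in [(100, 499), (10000, 2499), (10000, 2500)]:
    print(rows, changes, statistics_look_stale(rows, changes))
```

Under this rule a large table can absorb many changes before its statistics are considered out of date, which is one reason to update statistics manually after bulk loads.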

To improve cardinality estimates

• If possible, simplify expressions that contain constants.

• If a predicate relates columns to each other, use a computed column.

• Rewrite the query to use a parameter instead of a local variable.

• Avoid changing a parameter's value inside the stored procedure before using it in the query.

Goal

• Should we use a subquery or an inner join?

• Should we use a temp table or a table variable?

Other tools:

• SQL Server Profiler

• Database Engine Tuning Advisor

• Resource Governor

THANK YOU
