Common mistakes developers make in SQL Server
“Amateurs work until they get it right. Professionals work until they can't get it wrong.”
About me
• MCT 2011
• MCTS and MCITP 2011
• SQL Server MVP 2009
• MCP and MCDBA 2001
• Technical reviewer Team Based Development
• TR/TE Exam 70-462 Administering SQL Server 2012
• Co-author of SQL Server MVP Deep Dives 2
• Twitter @SwePeso
• Active as SwePeso
• Blog at SQLTopia.com
Agenda
• Faster hardware
• Row By Agonizing Row (pronounced Ree-bar)
• Cursors
• Bad Use Of Indexes
• Triggers
• Indeterministic functions
• Database Normalization
Before we start
Before you make any change on a production SQL Server, be sure you first try it in a test environment
• NO EXCEPTIONS!
• I mean it!
• Really!
• No kidding
• I wouldn’t lie to you
• You’d be crazy not to listen to this advice
• Never forget, DBAs and developers are the protectors of the organization’s data
• You took this oath when you accepted the job
FASTER HARDWARE
The golden bullet
Faster hardware
Faster hardware - Overview
Query      Index  Parallelism  Reads      CPU     Duration
Old query  No     No           1,805,000  216.00  217.00
Old query  No     Yes          1,845,000  221.00  113.00
Old query  Yes    No              59,000    3.00    4.00
Old query  Yes    Yes             59,000    4.00    3.00
New query  No     No                 120    0.10    1.00
New query  Yes    No                 120    0.05    1.00
What have we learned?
“The difference between fast code and just good enough code is that just good enough leads to sloppy techniques.
And while one or two sloppy procedures probably won't bring down your servers or department, sooner or later everything is written with poor standards and things are not working as efficiently as they should be.
Then we need to invest in more powerful hardware, memory upgrades and larger disks to compensate for problems that could be avoided with better code and design.”
http://www.sqlservercentral.com/Forums/Topic965572-263-4.aspx
ROW BY AGONIZING ROW
Loops, cursors and triangular joins
Row By Agonizing Row
• Term coined by Jeff Moden
• Good friend and fellow SQL Server MVP from SQLServerCentral.
• http://www.sqlservercentral.com/articles/T-SQL/61539/
• Watch out for "Hidden RBAR"
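Row By Agonizing Row
Triangular joins: internal rows generated by the join condition
• A <= B: $\frac{a(b+1)}{2}$
• A < B: $\frac{(a-1)b}{2}$
• Both grow as roughly $\frac{n^2}{2}$

Total rows    Internal rows
10            55
100           5 050
1 000         500 500
10 000        50 005 000
100 000       5 000 050 000

[Chart: internal rows (logarithmic axis, 1 to 10 000 000 000) plotted against total rows (10 to 100 000)]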
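The row counts for a triangular join can be checked with a small experiment. The following is a minimal sketch, not taken from the slides: the Numbers CTE and the use of sys.all_columns as a row source are assumptions made only to generate @n rows.

```sql
-- Hypothetical demo: count the internal rows a triangular self-join produces.
DECLARE @n INT = 100;

WITH Numbers(n) AS (
    -- Generate the integers 1..@n; sys.all_columns is only a convenient row source.
    SELECT TOP (@n) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
)
SELECT COUNT(*) AS InternalRows      -- 5050 for @n = 100, i.e. n(n+1)/2
FROM Numbers AS a
INNER JOIN Numbers AS b ON b.n <= a.n;
```

Changing the join condition to b.n < a.n should give n(n-1)/2 instead, which is the second formula.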
Row By Agonizing Row
Cursor-based solution Set-based solution
Row By Agonizing Row
• Good examples from the Phil Factor Speed Phreak competitions
• http://ask.sqlservercentral.com/questions/92/the-subscription-list-sql-problem
• Cursor 780 seconds (13 minutes).
• Set-based 0.3 seconds - 2,500 times faster!
• http://ask.sqlservercentral.com/questions/826/the-fifo-stock-inventory-sql-problem
• Cursor 2,400 seconds (40 minutes).
• Set-based 1.3 seconds – 1,800 times faster!
• http://ask.sqlservercentral.com/questions/6529/the-ssn-matching-sql-problem
• Cursor 4,000 – 4,500 seconds (70 - 75 minutes).
• Set-based 0.5 seconds – 9,000 times faster!
• SQLCLR 0.4 seconds – 11,000 times faster!
Row By Agonizing Row
• Kathi Kellenberger, former SQL Server MVP, provides in-depth descriptions and analyses of the different solutions here
• http://www.simple-talk.com/sql/performance/writing-efficient-sql-set-based-speed-phreakery/
• http://www.simple-talk.com/sql/performance/set-based-speed-phreakery-the-fifo-stock-inventory-sql-problem/
• http://www.simple-talk.com/sql/performance/ssn-matching-speed-phreakery/
Cursors
• A loop (cursor)
• Works row-by-agonizing-row.
• Overrides the natural abilities of the optimizer.
• Only operates on a single row instead of a set of rows.
• Most explicit loops are not set based programming and they just crush performance.
Cursors vs set-based
Cursor:
• Go to the store
• Buy one beer
• Bring it home
• Drink it
Set-based:
• Go to the store
• Buy a case of beer
• Bring it home
• Get one from the case
• Drink it
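In T-SQL terms, the two shopping trips might look like the sketch below. The #Orders table, its columns and the 10 % discount are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE #Orders (OrderID INT PRIMARY KEY, Amount MONEY NOT NULL, Discounted MONEY NULL);
INSERT #Orders (OrderID, Amount) VALUES (1, 100), (2, 250), (3, 40);

-- Cursor: one trip to the store per row.
DECLARE @OrderID INT;
DECLARE curOrders CURSOR LOCAL STATIC FOR
    SELECT OrderID FROM #Orders;
OPEN curOrders;
FETCH NEXT FROM curOrders INTO @OrderID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE #Orders SET Discounted = Amount * 0.9 WHERE OrderID = @OrderID;
    FETCH NEXT FROM curOrders INTO @OrderID;
END;
CLOSE curOrders;
DEALLOCATE curOrders;

-- Set-based: bring home the whole case with a single statement.
UPDATE #Orders SET Discounted = Amount * 0.9;

DROP TABLE #Orders;
```

Both versions leave the same data behind; the cursor simply issues one UPDATE per row where the set-based version issues one in total.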
BAD USE OF INDEXES
The hidden performance crusher
Bad Use Of Indexes
SELECT ent.entEngagementID
      ,thd.thdPlaceOfSale
      ,COUNT(*)
FROM dbo.Engagement ent
INNER LOOP JOIN dbo.TransactionHead thd
        ON thd.entEngagementID = ent.entEngagementID
INNER JOIN dbo.EngagementParameter epr
        ON epr.tepTypeEngagementParameterID = (SELECT tepTypeEngagementParameterID
                                               FROM TypeEngagementParameter
                                               WHERE tepCode = 'StorePeriodLen')
       AND DATEDIFF(MM, DATEADD(MM, -CONVERT(INT, eprParameterValue), GETDATE()), thd.thdTransactionDate) > 0
       AND epr.tetTypeEngagementID = ent.tetTypeEngagementID
INNER JOIN dbo.TypeTransactionHead tth
        ON tth.tthTypeTransactionHeadID = thd.tthTypeTransactionHeadID
       AND tth.tthCode IN ('Purchase', 'Return', 'Mixed')
GROUP BY ent.entEngagementID
        ,thdPlaceOfSale
Bad Use Of Indexes
SELECT ent.entEngagementID,
       thd.thdPlaceOfSale,
       COUNT(*)
FROM dbo.TypeEngagementParameter AS tep
INNER JOIN dbo.EngagementParameter AS epr
        ON epr.tepTypeEngagementParameterID = tep.tepTypeEngagementParameterID
INNER JOIN dbo.Engagement AS ent
        ON ent.tetTypeEngagementID = epr.tetTypeEngagementID
INNER JOIN dbo.TransactionHead AS thd
        ON thd.entEngagementID = ent.entEngagementID
INNER JOIN dbo.TypeTransactionHead AS tth
        ON tth.tthTypeTransactionHeadID = thd.tthTypeTransactionHeadID
       AND tth.tthCode IN ('Purchase', 'Return', 'Mixed')
WHERE thd.thdTransactionDate >= DATEADD(MONTH, DATEDIFF(MONTH, '19000101', GETDATE())
                                        - CONVERT(INT, epr.eprParameterValue), '19000201')
  AND tep.tepCode = 'StorePeriodLen'
GROUP BY ent.entEngagementID,
         thd.thdPlaceOfSale
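The date arithmetic in the rewritten WHERE clause is easier to follow with fixed values. This is a small worked sketch: @Now stands in for GETDATE(), @Months for eprParameterValue, and both values are made up for the example.

```sql
DECLARE @Now DATETIME = '20120515';   -- assumed stand-in for GETDATE()
DECLARE @Months INT = 3;              -- assumed stand-in for eprParameterValue

-- The cutoff is the first day of the month after (@Now minus @Months months).
SELECT DATEADD(MONTH, DATEDIFF(MONTH, '19000101', @Now) - @Months, '19000201') AS Cutoff;
-- Cutoff = 2012-03-01

-- Old and new predicates agree on both sides of the boundary.
SELECT d.TransactionDate,
       CASE WHEN DATEDIFF(MM, DATEADD(MM, -@Months, @Now), d.TransactionDate) > 0
            THEN 1 ELSE 0 END AS OldPredicate,
       CASE WHEN d.TransactionDate >= DATEADD(MONTH, DATEDIFF(MONTH, '19000101', @Now) - @Months, '19000201')
            THEN 1 ELSE 0 END AS NewPredicate
FROM (VALUES (CAST('20120229' AS DATETIME)),
             (CAST('20120301' AS DATETIME))) AS d(TransactionDate);
-- 2012-02-29 gives 0 and 0, 2012-03-01 gives 1 and 1
```

The difference is that the new form leaves thdTransactionDate bare on one side of the comparison, so an index on that column can be used for a range seek.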
Bad Use Of Indexes - Overview
Bad Use Of Indexes - Graphs
Bad Use Of Indexes
-- Query 1
SELECT *
FROM dbo.TestIndexes
WHERE YEAR(MyDate) = 2005

-- Query 2
SELECT *
FROM dbo.TestIndexes
WHERE MyDate >= '20050101'
  AND MyDate < '20060101'

-- Query 3
SELECT *
FROM dbo.TestIndexes
WHERE CONVERT(CHAR(4), MyDate, 112) = '2005'
Bad Use Of Indexes
Query    Records     Scan Count  Logical Reads  CPU     Duration
Query 1       1,000           1              5       0         0
Query 2       1,000           1              2       0         0
Query 3       1,000           1              5       0         0
Query 1      10,000           1             25       0         3
Query 2      10,000           1              2       0         0
Query 3      10,000           1             25       0         8
Query 1     100,000           1            226      31        30
Query 2     100,000           1              3       0         0
Query 3     100,000           1            226      78        81
Query 1   1,000,000           1          2,250     312       300
Query 2   1,000,000           1             12       0         1
Query 3   1,000,000           1          2,250     811       819
Query 1  10,000,000          17         22,679   3,620       295
Query 2  10,000,000           1             85       0       180
Query 3  10,000,000          17         22,679  10,329       739
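A comparison along these lines can be reproduced with a sketch like the one below. Only the MyDate column comes from the slides; the temp table, its ID column, the index name and the random test data are assumptions, and the reads reported will differ from the figures above.

```sql
-- Hypothetical stand-in for dbo.TestIndexes.
CREATE TABLE #TestIndexes (ID INT IDENTITY PRIMARY KEY, MyDate DATETIME NOT NULL);
CREATE INDEX IX_TestIndexes_MyDate ON #TestIndexes (MyDate);

-- Fill with random dates spread over roughly ten years from 2000-01-01.
INSERT #TestIndexes (MyDate)
SELECT TOP (100000) DATEADD(DAY, ABS(CHECKSUM(NEWID())) % 3650, '20000101')
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

SET STATISTICS IO ON;

-- Function wrapped around the column: the whole index typically has to be scanned.
SELECT COUNT(*) FROM #TestIndexes WHERE YEAR(MyDate) = 2005;

-- Bare column compared with constants: a range seek on MyDate is possible.
SELECT COUNT(*) FROM #TestIndexes WHERE MyDate >= '20050101' AND MyDate < '20060101';

SET STATISTICS IO OFF;
DROP TABLE #TestIndexes;
```

Both queries return the same count; the logical reads shown on the Messages tab are where the difference should appear.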
TRIGGERS
To batch or not to batch?
Triggers
Create TRIGGER [dbo].[Fault_Parked_UPDATE] ON [dbo].[Detail]
FOR INSERT, UPDATE
AS
Declare @CalliD Varchar(8), @Park1Start datetime, @Park1End datetime, @Park1 INT, @TotalParkedTime INT
Declare @FaultEnd datetime, @FaultTotal INT, @FaultMinusParked INT, @FaultStart datetime
Select @callid=Callid FROM inserted WHERE isdate(FaultStart)=1 and isdate(FaultEnd)=1
IF @Callid IS NOT NULL Begin
    Select @Park1Start = CASE WHEN ISDATE(@Park1Start)= 1 THEN Cast([ParkedDate1]+' '+[ParkedTime1] as datetime)
                              ELSE CAST('1900-01-01 00:00:00' as DateTime) END
    from detail
    Where callid = @Callid
    Select @Park1End = CASE WHEN ISDATE(@Park1End)= 1 THEN Cast([UnParkedDate1]+' '+[UnParkedTime1] as datetime)
                            ELSE CAST('1900-01-01 00:00:00' as DateTime) END
    from detail
    Where callid = @Callid
    Select @Park1 = datediff(ss,@Park1Start ,@Park1End)/60
    Select @TotalParkedTime = @Park1
    Select @FaultStart = Cast([FaultStart]+' '+[FaultStartTime] as datetime) from detail Where callid = @Callid
    Select @FaultEnd = Cast([FaultEnd]+' '+[FaultEndTime] as datetime) from detail Where callid = @Callid
    Select @FaultTotal = datediff(ss,@FaultStart ,@FaultEnd)/60
    Select @FaultMinusParked = @FaultTotal - @TotalParkedTime
    UPDATE CallLog
    set Park1 = @Park1, TotalParkedTime = @TotalParkedTime,
        FaultMinusParked = @FaultMinusParked, FaultTotal = @FaultTotal
    Where calllog.CallID = @CallID
END
Triggers
CREATE TRIGGER dbo.Fault_Parked_UPDATE
ON dbo.Detail
FOR INSERT,
    UPDATE
AS
;WITH cteValid(CallID, Park1, FaultTotal)
AS (
    SELECT CallID,
           CASE
               WHEN Park1End > Park1Start AND ISDATE(ParkedDate1) = 1 THEN DATEDIFF(SECOND, Park1Start, Park1End) / 60
               ELSE 0
           END AS Park1,
           DATEDIFF(SECOND, FaultStart, FaultEnd) / 60 AS FaultTotal
    FROM (
             SELECT CallID,
                    ParkedDate1,
                    CAST(ParkedDate1 + ' ' + ParkedTime1 AS DATETIME) AS Park1Start,
                    CAST(UnParkedDate1 + ' ' + UnParkedTime1 AS DATETIME) AS Park1End,
                    CAST(FaultStart + ' ' + FaultStartTime AS DATETIME) AS FaultStart,
                    CAST(FaultEnd + ' ' + FaultEndTime AS DATETIME) AS FaultEnd
             FROM inserted
             WHERE ISDATE(FaultStart) = 1
               AND ISDATE(FaultEnd) = 1
         ) AS d
)
UPDATE cl
SET cl.Park1 = v.Park1,
    cl.TotalParkedTime = v.Park1,
    cl.FaultMinusParked = v.FaultTotal - v.Park1,
    cl.FaultTotal = v.FaultTotal
FROM dbo.CallLog AS cl
INNER JOIN cteValid AS v ON v.CallID = cl.CallID
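Why batching matters: a trigger fires once per statement, not once per row, so the inserted table can hold many rows. The sketch below makes this visible. All object names (DemoDetail, DemoLog, trgDemoDetail_Insert) are hypothetical, and GO is the batch separator used by client tools such as SSMS and sqlcmd.

```sql
CREATE TABLE dbo.DemoDetail (CallID VARCHAR(8) NOT NULL PRIMARY KEY);
CREATE TABLE dbo.DemoLog (Source VARCHAR(10) NOT NULL, CallID VARCHAR(8) NOT NULL);
GO
CREATE TRIGGER dbo.trgDemoDetail_Insert ON dbo.DemoDetail
FOR INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Variable style: keeps a single CallID no matter how many rows arrived.
    DECLARE @CallID VARCHAR(8);
    SELECT @CallID = CallID FROM inserted;
    INSERT dbo.DemoLog (Source, CallID) VALUES ('variable', @CallID);

    -- Set-based style: one log row per inserted row.
    INSERT dbo.DemoLog (Source, CallID)
    SELECT 'set-based', CallID FROM inserted;
END;
GO
-- One statement, three rows: the trigger fires once.
INSERT dbo.DemoDetail (CallID) VALUES ('A'), ('B'), ('C');

SELECT Source, COUNT(*) AS LoggedRows
FROM dbo.DemoLog
GROUP BY Source;   -- 'variable' has 1 row, 'set-based' has 3

DROP TABLE dbo.DemoDetail, dbo.DemoLog;
```

The variable-based branch silently ignores two of the three calls, which is the same failure mode as the first trigger above when more than one row is inserted or updated at once.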
INDETERMINISTIC FUNCTIONS
I didn’t see that coming…
Simple indeterministic function
select top(10) case abs(checksum(newid()))%4 when 0 then 0 when 1 then 1 when 2 then 2 when 3 then 3 end from sys.objects
The simple CASE above is expanded to:
CASE WHEN abs(checksum(newid())) % 4 = 0 THEN 0
ELSE CASE WHEN abs(checksum(newid())) % 4 = 1 THEN 1
ELSE CASE WHEN abs(checksum(newid())) % 4 = 2 THEN 2
ELSE CASE WHEN abs(checksum(newid())) % 4 = 3 THEN 3
ELSE NULL
END
END
END
END
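Because each WHEN branch calls newid() again, every comparison can see a different random value, and the expression can fall through all four branches and return NULL. One way around this, sketched here under the assumption that a temp table is acceptable, is to store the random value first and apply CASE to the stored column. The #Rand table and its column r are hypothetical names.

```sql
-- Materialize one random value per row; after this, r is ordinary stored data.
SELECT TOP (10) ABS(CHECKSUM(NEWID())) % 4 AS r
INTO #Rand
FROM sys.objects;

-- CASE now compares the same stored value in every branch, so it cannot fall through to NULL.
SELECT CASE r
           WHEN 0 THEN 0
           WHEN 1 THEN 1
           WHEN 2 THEN 2
           WHEN 3 THEN 3
       END AS Value
FROM #Rand;

DROP TABLE #Rand;
```

The same reasoning applies to any function that is expensive or returns a different value on each call: every WHEN branch is a separate invocation.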
Advanced indeterministic function
CASE
WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 1 THEN 'Jan'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 2 THEN 'Feb'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 3 THEN 'Mar'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 4 THEN 'Apr'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 5 THEN 'May'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 6 THEN 'Jun'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 7 THEN 'Jul'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 8 THEN 'Aug'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 9 THEN 'Sept'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 10 THEN 'Oct'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 11 THEN 'Nov'
ELSE CASE WHEN datepart(month, dbo.fnGetOrderDate(SalesOrderID)) = 12 THEN 'Dec'
ELSE NULL
END
END
END
END
END
END
END
END
END
END
END
END
DATABASE NORMALIZATION
When not following
Typical scenario
Background
• Got a call from an architect that a tab in an application was running slower than usual (5-6 seconds vs 1 second)
• Architect wanted some common advice on what to do, such as adding an index.
Steps taken
• Started SQL Profiler and activated the trace
• Architect worked through the application
• Stopped the trace
Action plan
• Set up SQL Profiler to monitor RPC and SP
• Set up SQL Profiler to measure CPU, duration, reads and writes
Result
• usp_FilterCard
  • 1.5 seconds and 1.5 million reads
  • No rows returned
• usp_FilterPurchase_ByFilterID
  • 1.5 seconds and 1.5 million reads
  • No rows returned
• usp_FilterPurchase_ByKeys
  • 3 seconds and 2.2 million reads
  • No rows returned
A total of 6 seconds, 5.2 million reads and no rows returned!
Database Normalization
SELECT prtProductID, prtDesc
FROM Product
WHERE CHARINDEX(',' + CONVERT(VARCHAR(12), prtProductID) + ',', ','
      + (
            SELECT REPLACE(fprValue, '''', '')
            FROM FilterParameter
            INNER JOIN TypeFilterParameter
                    ON TypeFilterParameter.tfpTypeFilterParameterID = FilterParameter.tfpTypeFilterParameterID
            INNER JOIN TypeFilterParameterGroup
                    ON TypeFilterParameterGroup.tfgTypeFilterParameterGroupID = TypeFilterParameter.tfgTypeFilterParameterGroupID
            WHERE FilterParameter.firFilterID = @firFilterID
              AND TypeFilterParameterGroup.tfgCode = 'Purchase'
              AND TypeFilterParameter.tfpCode = 'ProductList'
        ) + ',' ) > 0
Database Normalization
CREATE TABLE #Products
(
    prtProductID INT PRIMARY KEY CLUSTERED
)

INSERT #Products
(
    prtProductID
)
SELECT f.Data
FROM (
         SELECT REPLACE(fpr.fprValue, '''', '') AS fprValue
         FROM FilterParameter AS fpr
         INNER JOIN TypeFilterParameter AS tfp
                 ON tfp.tfpTypeFilterParameterID = fpr.tfpTypeFilterParameterID
                AND tfp.tfpCode = 'ProductList'
         INNER JOIN TypeFilterParameterGroup AS tfg
                 ON tfg.tfgTypeFilterParameterGroupID = tfp.tfgTypeFilterParameterGroupID
                AND tfg.tfgCode = 'Purchase'
         WHERE fpr.firFilterID = @firFilterID
     ) AS d
CROSS APPLY dbo.fnParseList(',', d.fprValue) AS f

SELECT prt.prtProductID,
       prt.prtDesc
FROM Product AS prt
INNER JOIN #Products AS p ON p.prtProductID = prt.prtProductID

DROP TABLE #Products
3 seconds and 2.2 million reads before
0.1 seconds and 300 reads after
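The slides do not show the body of dbo.fnParseList. As a rough, self-contained sketch of the same split-then-join idea, the built-in STRING_SPLIT (SQL Server 2016 and later, compatibility level 130 or higher) can stand in for it. The #DemoProduct table, its rows and the @List value are made up for this example, and STRING_SPLIT takes its arguments in the opposite order to fnParseList.

```sql
-- Hypothetical product table and comma-separated list.
CREATE TABLE #DemoProduct (prtProductID INT PRIMARY KEY, prtDesc VARCHAR(50) NOT NULL);
INSERT #DemoProduct (prtProductID, prtDesc)
VALUES (1, 'Apples'), (2, 'Pears'), (3, 'Plums'), (4, 'Grapes');

DECLARE @List VARCHAR(100) = '2,4';

-- Turn the list into rows once, keyed so the join below can seek.
CREATE TABLE #Products (prtProductID INT PRIMARY KEY CLUSTERED);
INSERT #Products (prtProductID)
SELECT CAST(s.value AS INT)
FROM STRING_SPLIT(@List, ',') AS s;

-- An ordinary equality join replaces the CHARINDEX test against every product row.
SELECT prt.prtProductID,
       prt.prtDesc
FROM #DemoProduct AS prt
INNER JOIN #Products AS p ON p.prtProductID = prt.prtProductID;   -- returns Pears and Grapes

DROP TABLE #Products, #DemoProduct;
```

Storing the product IDs as rows in a proper table in the first place would remove the need for the split step altogether, which is the normalization point of this section.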
Want to know more?
• Married with children
  • Girls: 8, 4 and 1 year old
  • Boys: 2 years old
• Live outside Helsingborg, Sweden.
• Blog at
  • http://weblogs.sqlteam.com/peterl/
  • http://www.sqltopia.com
• Phil Factor Speed Phreak challenges
  • 3 time winner
• Simple Talk article series
  • The Road To Professional Database Developer