Breaking Data

Terry Bunio (@Tbunio)

Uploaded 03-Jul-2015


DESCRIPTION

Stress-testing SQL Server with OStress.

TRANSCRIPT

Page 1: Breaking data

Thank you to our Sponsors

Breaking Data – Terry Bunio

@Tbunio

Media Sponsor:

Page 2: Breaking data

When Good Data Goes Bad…

Page 3: Breaking data

Who Am I?

• Terry Bunio

• Database Administrator

- Oracle, SQL Server 6, 6.5, 7, 2000, 2005, 2012, Informix, ADABAS

• SharePoint fan

• Data Modeler/Architect

- Investors Group, LPL Financial, Manitoba Blue Cross, Assante Financial, CI Funds, Mackenzie Financial

- Normalized and Dimensional

• Agilist

- Innovation Gamer, Team Member, SQL Developer, Test Writer, Sticky Sticker, Project Manager, PMO on SAP Implementation

Page 4: Breaking data
Page 5: Breaking data
Page 6: Breaking data
Page 7: Breaking data
Page 8: Breaking data
Page 9: Breaking data

My Blog – www.agilevoyageur.com

Page 10: Breaking data

Breaking Data

Page 11: Breaking data

SQL Saturday Winnipeg

• November 22nd – Red River Community College

- Downtown Campus

• First SQL Saturday ever in Winnipeg

• 3rd in Canada after Toronto and Vancouver

• 20 Sessions

• 4 Tracks

- Business Intelligence

- DBA

- Developer

- New Database Technology

Page 12: Breaking data

March 2 – 3, 2015

Call for Speakers Open!

www.prairiedevcon.com

Page 13: Breaking data

Question?

• What is broken data?

• How do we fix it?

Page 14: Breaking data

Objectives

• Mine

- Hopefully introduce a couple of ideas you can take back and improve on

• Yours?

Page 15: Breaking data
Page 16: Breaking data

Three types of broken data

• Inconsistent - Easy

• Incoherent - Moderate

• Ineffectual - Hard

Page 17: Breaking data

Inconsistent

Page 18: Breaking data

Inconsistent

• All Data must have a structure

• Domain

- “In data management and database analysis, a data domain refers to all the values which a data element may contain.” – Wikipedia

Page 19: Breaking data

Inconsistent

• Domains are a simple way to ensure data consistency

• Many times this is overlooked due to tools that don’t promote it

- Hand-rolled SQL DDL and scripts

• Use tools that require you to define Data Domains

- Erwin

- Oracle SQL Data Modeler

• FREE!

• http://www.oracle.com/technetwork/developer-tools/datamodeler/overview/index.html
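SQL Server has no CREATE DOMAIN statement, but the same consistency can be sketched with a CHECK constraint in the generated DDL; the table and value list below are purely hypothetical examples, not from the slides:

```sql
-- Hypothetical example: restrict Province to a fixed domain of values
CREATE TABLE Client (
    ClientId INT IDENTITY(1,1) PRIMARY KEY,
    Province CHAR(2) NOT NULL,
    CONSTRAINT CK_Client_Province
        CHECK (Province IN ('MB', 'ON', 'BC', 'AB', 'SK'))
);
```

Modeling tools like Erwin or Oracle SQL Data Modeler can generate this kind of DDL automatically once the domain is defined in the model.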

Page 20: Breaking data

Inconsistent

Page 21: Breaking data

Inconsistent

• You try to find inconsistencies in that model!

• Luckily I have used Oracle Data Modeler and defined the following Domains

Page 22: Breaking data
Page 23: Breaking data

Incoherent

Page 24: Breaking data
Page 25: Breaking data

Incoherence

Page 26: Breaking data

Incoherent

• Many databases remain coherent by using Foreign Key Constraints

- These constraints ensure records can’t be stored in one table unless the row they refer to in another table already exists

- These constraints are usually enabled all the time

- They can slow down performance and cause the data to be Ineffectual
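As a minimal sketch (with hypothetical table names), a Foreign Key constraint of the kind described above looks like:

```sql
-- An Orders row can't be inserted unless the referenced Client row exists
ALTER TABLE Orders
    ADD CONSTRAINT FK_Orders_Client
    FOREIGN KEY (ClientId) REFERENCES Client (ClientId);
```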

Page 27: Breaking data

Incoherent

• Most databases create the constraints and leave them enabled all the time

• Constraint Double-Whammy

- Slows down actual insertion/modification of data

- Further slows down code as you validate the code values before you try to insert/update to avoid throwing a database exception

Page 28: Breaking data

Incoherent

• Alternative approach

- Leave constraints disabled

- Attempt to re-enable them periodically to report on any invalid data – daily or weekly

• You can then correct that data

- Disable constraints again

• In the past this process wasn’t practical due to the length of time it would take

• It now only takes 75 minutes to re-enable 616 Foreign Key constraints on a 1.1 Terabyte MSSQL 2012 database. Thanks Microsoft!
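The disable/validate/re-enable cycle above can be sketched in T-SQL like this (the table and constraint names are hypothetical):

```sql
-- Disable the constraint so day-to-day inserts/updates aren't validated
ALTER TABLE Orders NOCHECK CONSTRAINT FK_Orders_Client;

-- Periodically (daily or weekly): report any rows that violate the
-- currently disabled constraints on the table
DBCC CHECKCONSTRAINTS ('Orders');

-- After correcting the invalid data, re-enable the constraint and
-- validate all existing rows (WITH CHECK makes it trusted again)
ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT FK_Orders_Client;
```

The WITH CHECK CHECK step is the one that takes the time on a large database; running it on a schedule is what makes the periodic-validation approach workable.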

Page 29: Breaking data

Incoherent

• Demo

Page 30: Breaking data

Ineffectual

• There are three types of database Performance Tuning that you can do to make your data less ineffectual

- Execution Plan / Statistics IO

- SQL Profiler

- OStress

Page 31: Breaking data

Execution Plan

• Demo

Page 32: Breaking data

Execution Plan

• 1.sql

Page 33: Breaking data
Page 34: Breaking data

Execution Plan

• You then get an Execution Plan tab

• The Execution Plan process has become very good at recommending indexes

• Anyone remember MSSQL Index Wizard?

Page 35: Breaking data
Page 36: Breaking data
Page 37: Breaking data
Page 38: Breaking data

How to read Execution Plan

• Index Seek >> Index Scan >> Table Scan

• Look for steps that are a large percentage of the overall query

- See if those steps are using the right indexes

• Hover over each step to get details
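As a hedged illustration of the Seek >> Scan ordering above (the table, column, and index are hypothetical):

```sql
-- With an index on LastName, an equality predicate can use an Index Seek
SELECT ClientId FROM Client WHERE LastName = 'Bunio';

-- Wrapping the indexed column in a function makes the predicate
-- non-sargable, typically forcing an Index Scan instead
SELECT ClientId FROM Client WHERE UPPER(LastName) = 'BUNIO';
```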

Page 39: Breaking data

How to read Execution Plan

• Cached plan size – how much memory the plan generated by this query will take up in the stored procedure cache. This is a useful number when investigating cache performance issues because you'll be able to see which plans are taking up more memory.

• Estimated Operator Cost – overall percentage cost of the step

Page 40: Breaking data

How to read Execution Plan

• Estimated Subtree Cost – tells us the accumulated optimizer cost assigned to this step and all previous steps, but remember to read from right to left. This number is meaningless in the real world, but is a mathematical evaluation used by the query optimizer to determine the cost of the operator in question; it represents the amount of time that the optimizer thinks this operator will take.

• Estimated Number of Rows – calculated based on the statistics available to the optimizer for the table or index in question.

Page 41: Breaking data

SET STATISTICS IO ON

Page 42: Breaking data

SET STATISTICS IO ON

• DBCC FREEPROCCACHE
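A typical way to combine the two statements when measuring a query (the query itself is a hypothetical placeholder). Note that DBCC FREEPROCCACHE clears the entire plan cache, so only do this on a dev/test server:

```sql
-- Clear cached plans so the measurement isn't skewed (dev/test only!)
DBCC FREEPROCCACHE;

-- Report logical/physical reads per table for each statement that follows
SET STATISTICS IO ON;

SELECT COUNT(*) FROM Orders;  -- hypothetical query under test

SET STATISTICS IO OFF;
```

The reads appear on the Messages tab in Management Studio; logical reads are usually the number to watch when comparing index strategies.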

Page 43: Breaking data
Page 44: Breaking data

SQL Profiler

• Demo

Page 45: Breaking data

SQL Profiler

Page 46: Breaking data
Page 47: Breaking data

SQL Profiler

Page 48: Breaking data

SQL Profiler

Page 49: Breaking data

SQL Profiler

Page 50: Breaking data
Page 51: Breaking data

SQL Profiler

• You can save traces and replay those traces to simulate load

• There are some limitations though – SQL Profiler can’t:

- Replay RPC events as remote procedure calls

- Replay attention events

- Replay DTC transactions

- Replay as part of automated scripts – which is what would make it a SCALABLE tool

Page 52: Breaking data
Page 53: Breaking data

OStress

• Comprised of two utilities:

- Read80Trace

• Required in order to convert trace files into RML files

- OSTRESS

• A multithreaded ODBC-based query utility. The OSTRESS utility reads input from a command-line parameter. The command-line parameter can be an RML file that is produced by the Read80Trace utility or a standard go-delimited .SQL script file. In stress mode, one thread is created for each connection, and all threads run as fast as possible without synchronization among the threads. You can use this mode to generate a specific type of stress load on the server.

Page 54: Breaking data

OStress

• First we need to download and install the tools on the server where we want to run our trace files

- http://www.microsoft.com/en-us/search/Results.aspx?q=ostress&form=DLC

Page 55: Breaking data

OStress

• Demo

Page 56: Breaking data
Page 57: Breaking data
Page 58: Breaking data

Convert Trace Files to RML files

• DOS Command

- CD "c:\Program Files\Microsoft Corporation\RMLUtils"

- ReadTrace -Ic:\TraceFiles\TraceSample.trc -oc:\RMLFiles -T28

• T28 flag

- Important as it allows you to replay the RML file against SQL Server

Page 59: Breaking data

OStress

• Now you can simply run OSTRESS with those RML files

- OSTRESS -creplay.ini -mreplay -T88 -ic:\RMLFiles\*.rml -oc:\RMLFiles\ReplayResult

Page 60: Breaking data

Specific use

• You can run and compare OStress results when you upgrade SQL Server or other system software and hardware!

• You can compare them using the following command:

- ReadTrace -Ic:\TraceFiles\*.trc -oc:\TraceFiles\ReplayResult\ -dods -f

• You can answer confidently whether the new server can handle the current production load and stress

Page 61: Breaking data
Page 62: Breaking data

Review

• The power of this structure is that we can now automate hundreds of threads to replay loads on the database

• This can now also become part of automated testing/continuous integration processes

Page 63: Breaking data

Whew…

Page 64: Breaking data

Questions?

Page 65: Breaking data