Copyright © 2006 Quest Software Systematic Oracle performance tuning Guy Harrison, Chief Architect, Database Solutions March 2007


Page 1: Copyright © 2006 Quest Software Title slide Copyright: 8 pt. Arial Systematic Oracle performance tuning Guy Harrison, Chief Architect, Database Solutions

Copyright © 2006 Quest Software

Systematic Oracle performance tuning

Guy Harrison,

Chief Architect, Database Solutions

March 2007

Page 2:

Agenda

• A brief history of Oracle tuning

• Limitations of common approaches

• Systematically tuning by levels

– Application demand

– Database contention

– Reducing physical IO

– Tuning physical IO

Page 3:

The story so far…

• First, the earth cooled…

• Then, the dinosaurs came

• Then, Oracle 5 was released

• The Rules of Thumb:

– Buffer cache hit rate > 90%

– Every segment in a single extent

– Index everything

– Trial and error

• Limited instrumentation

– Tkprof (6.0)

– V$sysstat

• No WWW

Page 4:

The performance tuning renaissance

• Oracle 7-8, 1992+

• An empirical, profiling-based approach

• Many champions, but notably:

– Anjo Kolk

– Cary Millsap

• An Oracle performance community emerges (Oak Table, etc.)

• Yet Another Performance Profiling Methodology (YAPP): ResponseTime = ServiceTime + WaitTime
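The YAPP identity can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the event names chosen, and all timings are assumptions, not figures from the presentation.

```python
def response_time_breakdown(service_time_s, wait_times_s):
    """Return total response time and each component's share of it.

    ResponseTime = ServiceTime + WaitTime, where WaitTime is the sum
    of the time spent in each wait event.
    """
    response_time = service_time_s + sum(wait_times_s.values())
    if response_time == 0:
        return 0.0, {}
    shares = {"service (CPU)": service_time_s / response_time}
    for event, seconds in wait_times_s.items():
        shares[event] = seconds / response_time
    return response_time, shares


# Hypothetical figures for one session, in seconds.
response, shares = response_time_breakdown(
    service_time_s=20.0,
    wait_times_s={"db file sequential read": 60.0, "log file sync": 20.0},
)
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {share:6.1%}")
```

Sorting by share puts the largest contributor first, which is the component a profiling-based approach would examine first.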

Page 5:

YAPP and wait interface based tuning

• Based primarily on the existence of the “wait interface” and related Oracle instrumentation

• When Oracle sessions wait for a resource (IO, lock, latch, buffer, etc.) they record wait durations:

– V$SYSTEM_EVENT, V$SESSION_EVENT, V$SESSION_WAIT

– SQL trace files created with the ‘10046 event’

• Largest wait categories represent greatest opportunities for optimization

• Average wait times can reveal contention points
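As a rough illustration of the last two bullets, the sketch below ranks wait events by total time and derives the average wait for each. The input rows are assumed to have already been fetched from V$SYSTEM_EVENT (the dictionary keys mirror its EVENT, TOTAL_WAITS and TIME_WAITED_MICRO columns); the sample values are invented.

```python
def rank_waits(rows):
    """Sort events by total time waited; add share of wait time and average wait (ms)."""
    total_us = sum(r["time_waited_micro"] for r in rows)
    ranked = []
    for r in sorted(rows, key=lambda r: r["time_waited_micro"], reverse=True):
        ranked.append({
            "event": r["event"],
            "pct_of_wait": 100.0 * r["time_waited_micro"] / total_us,
            # microseconds per wait, converted to milliseconds
            "avg_wait_ms": r["time_waited_micro"] / r["total_waits"] / 1000.0,
        })
    return ranked


# Invented sample data; a real run would query V$SYSTEM_EVENT instead.
sample_rows = [
    {"event": "log file sync", "total_waits": 50_000, "time_waited_micro": 150_000_000},
    {"event": "db file sequential read", "total_waits": 100_000, "time_waited_micro": 800_000_000},
    {"event": "latch: cache buffers chains", "total_waits": 10_000, "time_waited_micro": 50_000_000},
]

ranked = rank_waits(sample_rows)
for row in ranked:
    print(f'{row["event"]:30s} {row["pct_of_wait"]:5.1f}%  avg {row["avg_wait_ms"]:.1f} ms')
```

The percentage column points at the largest opportunity; the average wait column is the one to compare against what the hardware should deliver.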

Page 6:

The path to enlightenment

• Problems to overcome:

– Many waits were/are mysterious

• Exactly what does “PMON to cleanup pseudo-branches at svc stop time” mean?

– Message had yet to reach all in the community

• Many still busy tweaking “hit ratios”

– Service time – especially CPU time – was hard to accurately measure

• Oracle’s internal CPU counters notoriously inaccurate

• No breakdown within CPU time

• Tuning less effective for CPU-bound systems

– Response time included components outside of Oracle: especially as client-server gave way to 3/N tier systems

• No guarantee that tuning would result in end user improvement

• In 10g, most of the technical issues have been resolved:

– Documentation and external resources (e.g., the WWW)

– Time model

– Ms timings

– Extended V$SQL timings

Page 7:

Wait-based tuning (WBT) pitfalls: an example

Single block (e.g. indexed) I/Os represent 83% of DB elapsed time.

Average time for an IO is about 10x expected.

Therefore it would be reasonable to assume insufficient IO bandwidth (e.g., distinct disk drives) to support the workload.

Page 8:

But before you go upgrading that disk array…

• A single missing index can cause huge increases in logical IO demand.

• This magnifies disk IO demand, internal contention (latches, etc.) and CPU utilization.

• Furthermore, IO demands from a missing index can increase faster than hardware upgrades can be applied.
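A back-of-the-envelope comparison shows why a missing index scales so badly. This is a simplified model, not a description of Oracle internals: the rows-per-block figure and the index height are assumptions, and caching is ignored.

```python
import math


def full_scan_block_reads(table_rows, rows_per_block):
    """A full table scan visits every block that holds rows."""
    return math.ceil(table_rows / rows_per_block)


def index_lookup_block_reads(index_height, matching_rows=1):
    """Root-to-leaf index descent plus one table block per matching row (pessimistic)."""
    return index_height + matching_rows


# Assumed: 100 rows per block and a 3-level index; the table doubles twice.
for rows in (1_000_000, 2_000_000, 4_000_000):
    scan = full_scan_block_reads(rows, rows_per_block=100)
    lookup = index_lookup_block_reads(index_height=3)
    print(f"{rows:>9,} rows: full scan {scan:>6,} blocks, index lookup {lookup} blocks")
```

Under these assumptions the scan cost grows linearly with the table while the indexed lookup stays almost flat, so data growth alone can outpace a hardware upgrade.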

Page 9:

Making a distinction between symptoms and causes

• When faced with an obviously IO-bound database, it’s tempting to deal with the IO subsystem immediately.

• However, as often as not, IO problems are symptoms of problems arising at other levels.

– Typically SQL tuning or schema design

• Likewise, eliminating a contention point might actually increase physical IO demand

– Lock waits can reduce the demand on the IO subsystem

• Tuning SQL might increase transaction rates

– Which then results in latch contention

• You are best advised to ignore symptoms in the lower levels until you have optimized the performance of the higher levels.

Page 10:

Interactions between layers

[Diagram: four layers, each passing requests to the next layer and receiving results back]

• Application: application DB access code (parse, execute, single fetch, array fetch)

– Sends SQL/PLSQL; receives rows/return code

• Oracle software: parse SQL, check security, acquire locks, latches, buffer space, etc.

– Sends block read/write requests; receives data blocks

• Memory (SGA and PGA): cache data blocks for re-use; sort data as required; create hash tables for hash join

– Sends disk reads/writes; receives data/return code

• IO subsystem: read/write data from disk devices

• Labels on the diagram: Application demand, Logical IO, Physical IO, Concurrency Mgt

Page 11:

What it means (key slide)

• Problems in one layer can be caused or cured by configuration in the higher layer.

• The logical steps in Oracle tuning are therefore:

1. Reduce application demand to its logical minimum by tuning SQL, optimizing physical design (partitioning, indexing) and tuning PL/SQL

2. Maximize concurrency by minimizing contention for locks, latches, buffers and so on in the Oracle code layer

3. Having normalized logical IO demand by the preceding steps, minimize the resulting physical IO by optimizing Oracle memory

4. Now that the physical IO demand is realistic, configure the IO subsystem to meet that demand by providing adequate bandwidth and evenly distributing the resulting load

Page 12:

Examining the time model

• High end tuning tools offer big returns on investment, but basic Oracle instrumentation serves well for most stages.

• A lot of insight can be gained by looking at the time model combined with the wait interface

Page 13:

Why it doesn’t all add up…

• Items in the time model are “nested”: some categories incorporate times from other categories

• Wait time contributes to DB time and background elapsed time

• Does parse time include CPU time, etc?

1) background elapsed time
    2) background cpu time
    2) background wait time
1) DB time
    2) DB CPU
    2) User wait time
    2) connection management call elapsed time
    2) sequence load elapsed time
    2) sql execute elapsed time
    2) parse time elapsed
        3) hard parse elapsed time
            4) hard parse (sharing criteria) elapsed time
                5) hard parse (bind mismatch) elapsed time
        3) failed parse elapsed time
            4) failed parse (out of shared memory) elapsed time
    2) PL/SQL execution elapsed time
    2) inbound PL/SQL rpc elapsed time
    2) PL/SQL compilation elapsed time
    2) Java execution elapsed time
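The nesting can be demonstrated with a few rows of the hierarchy. In the sketch below the levels follow the listing above, but the timings are invented and the helper function is hypothetical; it only shows that neither a grand total nor a sum of siblings needs to match the parent.

```python
# (level, name, seconds): levels follow the listing above; the timings are made up.
TIME_MODEL = [
    (1, "DB time", 100.0),
    (2, "DB CPU", 40.0),
    (2, "sql execute elapsed time", 85.0),
    (2, "parse time elapsed", 10.0),
    (3, "hard parse elapsed time", 8.0),
    (4, "hard parse (sharing criteria) elapsed time", 2.0),
]


def children_of(rows, parent_name):
    """Return the direct children of parent_name in a level-ordered listing."""
    result, parent_level = [], None
    for level, name, seconds in rows:
        if parent_level is None:
            if name == parent_name:
                parent_level = level
            continue
        if level <= parent_level:
            break  # we have left the parent's subtree
        if level == parent_level + 1:
            result.append((name, seconds))
    return result


naive_total = sum(seconds for _, _, seconds in TIME_MODEL)
db_time_children = children_of(TIME_MODEL, "DB time")
children_total = sum(seconds for _, seconds in db_time_children)

print(f"Sum of every statistic:   {naive_total:.0f} s (double counts nested items)")
print(f"Sum of DB time children:  {children_total:.0f} s (siblings overlap, e.g. CPU spent executing SQL)")
print("DB time itself:           100 s")
```

With these made-up numbers the children of DB time sum to more than DB time, because CPU consumed while executing SQL is counted under both DB CPU and sql execute elapsed time.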

Page 14:

Reduce the application demand

• If you can, tune the application code

– “The best optimized SQL is the SQL you didn’t do” (Anjo)

• Reduce the amount of logical IO generated by SQL statements

– This in turn reduces CPU and IO, although those fluctuate less predictably.

– Rewrite SQL or use stored outlines/profiles

– Index wisely

– Consider physical design changes

• Partitioning

• Denormalization

• Don’t forget other application demand factors

– Parse CPU consumption (can also cause latch contention)

– PL/SQL execution time

Page 15:

Searching for specific plan steps

Table and index scans might be amenable to better indexing

We could also look for row counts relative to the object being scanned

Procedurally, we can even look for nested table scans and other “anti-patterns”, even before they cause problems
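One way to picture such a search is as a filter over cached plan steps. The sketch below works on rows shaped like V$SQL_PLAN (ID, PARENT_ID, OPERATION, OPTIONS, OBJECT_NAME) that are assumed to have been fetched already. The sample plan and object names are hypothetical, and treating the child with the higher ID as the inner row source of a nested loop is a simplifying assumption.

```python
FULL_SCAN_STEPS = {
    ("TABLE ACCESS", "FULL"),
    ("INDEX", "FULL SCAN"),
    ("INDEX", "FAST FULL SCAN"),
}


def full_scans(plan_rows):
    """Return plan steps that are full table scans or full index scans."""
    return [r for r in plan_rows if (r["operation"], r["options"]) in FULL_SCAN_STEPS]


def nested_full_scans(plan_rows):
    """Return full table scans that are the inner (second) child of a NESTED LOOPS step."""
    hits = []
    for parent in plan_rows:
        if parent["operation"] != "NESTED LOOPS":
            continue
        kids = sorted(
            (r for r in plan_rows if r["parent_id"] == parent["id"]),
            key=lambda r: r["id"],
        )
        if len(kids) == 2 and (kids[1]["operation"], kids[1]["options"]) == ("TABLE ACCESS", "FULL"):
            hits.append(kids[1])
    return hits


# Hypothetical plan for a single statement.
plan = [
    {"id": 0, "parent_id": None, "operation": "SELECT STATEMENT", "options": None, "object_name": None},
    {"id": 1, "parent_id": 0, "operation": "NESTED LOOPS", "options": None, "object_name": None},
    {"id": 2, "parent_id": 1, "operation": "INDEX", "options": "RANGE SCAN", "object_name": "ORDERS_PK"},
    {"id": 3, "parent_id": 1, "operation": "TABLE ACCESS", "options": "FULL", "object_name": "ORDER_LINES"},
]

print([r["object_name"] for r in full_scans(plan)])
print([r["object_name"] for r in nested_full_scans(plan)])
```

A full scan on the inner side of a nested loop is repeated once per outer row, which is why it is worth flagging before the outer row count grows.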

Page 16:

PL/SQL tuning workflow

Page 17:

Contention – the proverbial bottleneck

[Diagram: a bottleneck between application demand and the lower layers]

• Application demand for DB services

• Contention for limited or serialized resources causes waits and/or queuing

• Apparent demand at lower layers is reduced

Page 18:

Types of contention

• Locks

– Mostly application, but occasionally system (ST in legacy tablespaces)

• Latches

– Often a side effect of excessive application demand

– Lack of bind variables

– But sometimes the final constraint on DB throughput

• Buffers

– Buffer cache, redo buffer, etc.

– Hot blocks (buffer busy)

– Slow “lazy” writer processes (DBWR, LGWR, RVWR)

• Sequences

• Redo/Archive logs

– Contention between the LGWR and the DBWR or ARCH

• Shared servers/PQO servers

• Global Cache (RAC) contention

– Most significant for “hot block” types of contention (cbc latch, buffer busy)
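To see how much wait time falls into each of these categories, wait events can be bucketed by name. The sketch below is a rough illustration: the event names follow Oracle 10g naming, but the prefix rules, the grouping and the timings are assumptions, and a real classification would need a fuller rule set.

```python
# Prefix-to-category rules; the grouping is illustrative, not exhaustive.
RULES = [
    ("enq:", "locks"),
    ("latch", "latches"),
    ("buffer busy", "buffers"),
    ("free buffer", "buffers"),
    ("log buffer space", "buffers"),
    ("log file switch", "redo/archive logs"),
    ("gc ", "global cache (RAC)"),
]


def contention_type(event):
    """Map a wait event name to a contention category, or 'other'."""
    for prefix, category in RULES:
        if event.startswith(prefix):
            return category
    return "other"


def time_by_type(waits):
    """Sum seconds waited per contention category."""
    totals = {}
    for event, seconds in waits.items():
        key = contention_type(event)
        totals[key] = totals.get(key, 0.0) + seconds
    return totals


# Invented timings, in seconds.
waits = {
    "enq: TX - row lock contention": 120.0,
    "latch: cache buffers chains": 30.0,
    "latch free": 5.0,
    "buffer busy waits": 40.0,
    "free buffer waits": 10.0,
    "log file switch (checkpoint incomplete)": 15.0,
    "gc buffer busy": 25.0,
    "db file sequential read": 300.0,
}

totals = time_by_type(waits)
for category, seconds in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:20s} {seconds:7.1f} s")
```

Ordinary IO waits land in "other" here because they reflect demand rather than contention.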

Page 19:

Locks

Page 20:

Latch contention & spin count

[Chart: CPU utilization, time waited on latch, and execution rate (%, 0 to 120), each with a best-fit line, plotted against spin count (0 to 20,000)]

Page 21:

IO is increasingly expensive

[Chart: % change (axis ticks at 10, 100, 1,000 and 10,000) in IO rate, disk capacity, IO/GB, CPU, and IO/CPU]

Page 22:

Reducing physical IO

• Memory is used to cache data and to perform sorting and hashing (ORDER/GROUP BY, joins).

• Oracle 10g manages allocations within PGA/SGA well enough

• Determining the trade-offs between these areas is the most important IO reduction method

[Diagram: Oracle session; buffer pools; Program Global Area (PGA) with sort area and hash area; data (USER) tablespace; temporary tablespace]

Page 23:

A good buffer cache hit ratio doesn’t help much

• A full table scan (FTS) can only generate so much IO

• A disk sort can generate many times the IO required to read the data from disk

• Some of this sort IO appears to be hidden from the wait interface

[Chart: elapsed time (microseconds, 0 to 70,000,000) against sort area size (KB, 100 to 25,000), broken down into DB CPU, User I/O, Direct I/O, and Other]
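The claim that a disk sort can generate many times the IO of the original read can be illustrated with a textbook external merge sort model. This is a generic approximation, not a description of Oracle's sort implementation; the merge width and all sizes are assumptions.

```python
def sort_temp_io(data_mb, sort_area_mb, merge_width):
    """Temp IO (MB written plus MB read) for a textbook external merge sort.

    Run generation writes the data once; each merge pass reads it once and
    every pass except the last writes it again, so temp IO is
    2 * merge_passes * data size. An in-memory sort needs no temp IO.
    """
    if data_mb <= sort_area_mb:
        return 0
    runs = -(-data_mb // sort_area_mb)  # ceiling division: number of sorted runs
    merge_passes = 0
    while runs > 1:
        runs = -(-runs // merge_width)  # each pass merges merge_width runs into one
        merge_passes += 1
    return 2 * merge_passes * data_mb


# Assumed: 1,000 MB to sort and a 10-way merge; only the sort area varies.
for area in (10, 100, 1000):
    io = sort_temp_io(data_mb=1000, sort_area_mb=area, merge_width=10)
    print(f"sort area {area:>5} MB -> temp IO {io:>5} MB")
```

In this model a full table scan reads the 1,000 MB once, whereas the smallest sort area adds four times that amount in temporary IO, which no buffer cache hit ratio would reveal.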

Page 24:

Advisories

• Advisories exist that provide estimates of the workload impact if a memory area were a different size

– In essence, Oracle is maintaining LRU chains that are longer than actual allocations

• Key advisories:

– V_$PGA_TARGET_ADVICE

– V_$DB_CACHE_ADVICE

– V_$SHARED_POOL_ADVICE

– V_$SGA_TARGET_ADVICE (10g)

• Unfortunately, Oracle suffered from “split-brain” during implementation:

– PGA reports in bytes read/written

– DB cache reports in IOs saved

– Shared pool reports in parse time

• When comparing PGA and SGA advisories, you need to convert them to common units (wait time)
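A minimal sketch of that conversion follows. It assumes the extra bytes figure comes from ESTD_EXTRA_BYTES_RW in the PGA advisory and the extra reads figure from ESTD_PHYSICAL_READS in the DB cache advisory. The average IO time, the temp IO size and the sample values are all assumptions that would have to be measured on the system in question.

```python
def pga_advice_seconds(extra_bytes_rw, io_size_bytes, avg_io_seconds):
    """Estimated time cost of extra temp-segment bytes read/written."""
    return (extra_bytes_rw / io_size_bytes) * avg_io_seconds


def cache_advice_seconds(extra_physical_reads, avg_io_seconds):
    """Estimated time cost of extra physical reads from a smaller buffer cache."""
    return extra_physical_reads * avg_io_seconds


AVG_IO_SECONDS = 0.008   # assumed average time for one IO
TEMP_IO_BYTES = 65536    # assumed size of one temp-segment IO

# Invented advisory figures for two ways of shrinking memory by the same amount.
pga_cost = pga_advice_seconds(8_192_000_000, TEMP_IO_BYTES, AVG_IO_SECONDS)
cache_cost = cache_advice_seconds(200_000, AVG_IO_SECONDS)

print(f"Shrinking the PGA target costs about   {pga_cost:,.0f} s of IO time")
print(f"Shrinking the buffer cache costs about {cache_cost:,.0f} s of IO time")
```

Once both estimates are in seconds they can be compared directly; the shared pool advisory already reports time, so it needs no conversion.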

Page 25:

Page 26:

IO management

• The demand for physical IO should be about right now

– You’ve reduced the application logical IO demand

– You’ve eliminated contention that might mask that demand

– You’ve minimized the amount of logical IO that turns into physical IO by effective memory management

• Now optimize the IO subsystem for the amount of physical IO you observe

– Buy enough disks to meet IO demand and storage demand

• Remember also that sparse disks are more efficient

– Oracle’s SAME (Stripe and Mirror Everything) is a good default approach

– Separating log/FRA and datafile IO is arguably the only split up you should have to make

• Dedicated disks for sequential redo IO OR

• Separate wide, fine-grained stripe for redo and FRA

– ASM makes this very easy, but be careful: intolerant of poor underlying configurations
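The "enough disks for IO demand and storage demand" rule amounts to taking the larger of two disk counts. The sketch below is a simplification under stated assumptions: the per-disk IO rate, disk size and fill factor are invented, and mirroring or RAID overhead is ignored.

```python
import math


def disks_required(io_per_sec, storage_gb, disk_iops, disk_gb, max_fill=1.0):
    """Disks needed to satisfy both IO rate and capacity.

    max_fill below 1.0 keeps each disk only partly full ("sparse").
    """
    for_io = math.ceil(io_per_sec / disk_iops)
    for_storage = math.ceil(storage_gb / (disk_gb * max_fill))
    return max(for_io, for_storage)


# Assumed disk characteristics: 150 IO/s and 300 GB per disk.
io_bound = disks_required(io_per_sec=3000, storage_gb=1000, disk_iops=150, disk_gb=300)
space_bound = disks_required(io_per_sec=300, storage_gb=6000, disk_iops=150, disk_gb=300)
sparse = disks_required(io_per_sec=300, storage_gb=6000, disk_iops=150, disk_gb=300, max_fill=0.5)

print(f"IO-bound workload:       {io_bound} disks")
print(f"Capacity-bound workload: {space_bound} disks")
print(f"Same, disks half full:   {sparse} disks")
```

In the IO-bound case capacity alone would suggest far fewer disks than the IO rate requires, which is the trap of sizing storage by gigabytes only.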

Page 27:

ASM makes it very easy

• Only need to provision for total storage for all files and total IO demand

• Balancing within the diskgroup is (partially) automatic

• Can implement redundancy within the group

• DBA gets more control – Sys admin involvement is minimal

• BUT:

– Still a version 1.0 technology

– Simplistic algorithm does not work well if underlying disks are of different sizes/characteristics

– Hard to map segments to spindles, especially if implemented on top of hardware storage appliance

– Auto-rebalance has not been observed

– ASM instance is a vulnerability

Page 28:

ASM disk groups

Page 29:

Q&A

Page 30: