understanding oracle logicalio issues

Misunderstandings About Oracle InternalsThe Cost of Oracle Logical I/O Calls

Agenda

• The true cost of a block visit• Operational measurements• You probably don’t need a hardware upgrade• What you should do instead• Conclusion

Do you believe?

• “Retrieving information from memory is over 10,000 times faster than retrieving it from disk.”

• “To tune SQL, simply eliminate disk I/O.”

• “Solid-state disk devices will make our applications a lot faster!”

• “When we have terabytes of memory for our SGAs, there’ll be no more need for tuning!”

Part I

Reading from the buffer cache is more expensive than you might think.

The relative speeds of disk and memory accesses are less relevant than you’re taught to believe.

Storage medium Typical access latency Relative performance

Memory 0.000 000 004 seconds 1

Disk 0.0044 seconds 1000000

Retrieval method Typical access latency Relative performance

Oracle LIO 0.000 053 seconds 1

Oracle LIO + PIO 0.001 966 seconds 37

Definitions: LIO and PIO…

• Oracle Logical I/O (LIO)– Oracle requests a block from the database buffer cache

• Oracle Physical I/O (PIO)– Oracle requests a block from the operating system– Might be “physical,” might not be

for x in ( select rowid from emp ) --- CONSISTENT GETS loop delete from emp where rowid = x.rowid; --- CURRENT MODE GETS

end loop;

An Oracle LIO is not just “a memory access.”

function LIO(dba, mode, ...)# dba is the data block address (file#, block#) of the desired block# mode is either ‘consistent’ or ‘current’address = buffer_cache_address(dba, ...);if no address was found

address = PIO(dba, …); # potentially a multi-block pre-fetch[1]update the LRU chain if necessary; # necessary less often in 8.1.6if mode is ‘consistent’

construct read-consistent image if necessary, by cloning the block and calling LIO for the appropriate undo blocks;

increment ‘cr’ statistic in trace data and ‘consistent gets’ statistic in v$ data;else (mode is ‘current’)

increment ‘cu’ statistic in trace data and ‘db block gets’ statistic in v$ data;parse the content of the block;return the relevant row source;

end

How does the Oracle kernel “know” whether a block is in the database buffer cache or not?

Latch serialization impacts LIO latency.

• Oracle8i– Scanners, modifiers serialize on a CBC latch

• => Oracle9i– Scanners can share a CBC latch– But modifiers still serialize

Oracle’s latch acquisition algorithm attempts the fine balance between busy-waiting and sleeping.

function get_latch(latch, …) # multi-CPU implementationif fastget_latch(latch) return true;for try = 0 to +infinityfor spin = 1 to _spin_countif fastget_latch(latch) return true;sleep for min(f(try), _max_exponential_sleep) centiseconds;

end

function fastget_latch(latch, …)if test(latch) shows that latch is availableif test_and_set(latch) is successfulreturn true;return false;

end

1.92502841810

1.92502841898

1.92502841AB8

1.92502841C50

0.44202770FD0

1.2680284F950

0.49902772E18

2.51402850F18

0.61310282DE18

0.51070280BE18

0.4083027E9E18

0.152302794E18

1.649902841CD8

1.649902841920

1.649902841B40

_db_block_hash_buckets – 1

0

1

2

3

4

To find long chainsselect hladdr, count(*)

from x$bh group by hladdr order by 2;

To find hot blocks :

select sid, p1raw, p2, p3, seconds_in_wait, wait_time, state from v$session_wait where event = ’latch free’ order by p2, p1raw;

Part II

How to measure LIO and PIO latencies operationally.

Tracing your own session •alter session set timed_statistics = true;•alter session set max_dump_file_size = 20M;•alter session set tracefile_identifier = <id>;•alter session set events '10046 trace name context forever, level 8';…. ….•alter session set events '10046 trace name context off‘

Tracing someone else’s session •exec dbms_system.set_bool_param_in_session(sid,serial#,'timed_statistics',

true);•exec dbms_system.set_ev (sid,serial#,10046,8,'');…..….. •exec dbms_system.set_ev (sid,serial#,10046,0,'');

How to measure per-block LIO durations operationally…

FETCH #1:c=3334,e=3919,p=58369,cr=586601,cu=0,mis=0,r=1,dep=0,og=4,tim=3736344566

• Logical reads– Consumed 3,334 quanta of user-mode CPU time

• 33.34s in Oracle8i and 0,003334s in 9I,10G– Retrieved 586,601 (586,601+0) blocks from db buffer cache– 0.000 057 seconds per block (33.34s/586,601b)

• Note risk of overestimating LIO cost

How to measure per-block PIO durations operationally…

WAIT #491: nam='db file scattered read' ela= 0 p1=142 p2=14523 p3=3WAIT #491: nam='db file sequential read' ela= 1 p1=142 p2=9218 p3=1WAIT #491: nam='db file sequential read' ela= 1 p1=142 p2=9223 p3=1WAIT #491: nam='db file scattered read' ela= 1 p1=142 p2=9231 p3=4WAIT #491: nam='db file scattered read' ela= 1 p1=142 p2=9237 p3=8

• “Physical” reads– Consumed ela=4 quanta of elapsed time

• 0.04s in Oracle8i or 0,00004 s in 9i and 10G– Retrieved p3=17 database blocks– 0.002 353 seconds per block (0.04s/17b)

How to measure latch sleep durations operationally…

WAIT #92: nam='latch free' ela= 0 p1=17184432736 p2=66 p3=0WAIT #91: nam='latch free' ela= 0 p1=17184432736 p2=66 p3=0WAIT #98: nam='latch free' ela= 1 p1=17184656544 p2=66 p3=0WAIT #96: nam='latch free' ela= 1 p1=17184458272 p2=66 p3=0WAIT #96: nam='latch free' ela= 2 p1=17184458272 p2=66 p3=0

• Sleeps on latch acquisition attempts– Consumed ela=4 quanta of elapsed time

• 0.04s in Oracle8i– Latch number is p2=66 (cache buffers chains on this system)

Part III

You probably don’t need faster disk or more memory.

“Increasing the parameter associated with latches” is usually an ineffective remedy.

• Consider the relative impact…– Increase db_block_buffers by 10– Increase _db_block_hash_buckets by 10– Increase _db_block_hash_latches by 10– Reduce LIO count by 100,000

PIOs Might Not Be Your Bottleneck

Do you have a physical I/O bottleneck on your system? Chances are that if any of the following is true, then somebody at your business probably thinks that you do:

• Your disk utilization figures are high.• Your disk queue lengths are long.• Your average read or write latency is high.If you suspect that you have a physical I/O bottleneck for any of

these reasons, do not upgrade your disk subsystem until you figure out how much impact your Oracle PIO latencies have upon user response time.

For example, what impact would a disk upgrade or more memory have upon the application with the following resource profile?

Oracle Kernel Event Duration Calls Avg------------------------------ ------------------ -------- ----------CPU service 1,527.51s 60.8% 158,257 0.009652sdb file sequential read 432.03s 17.2% 62,495 0.006913sunaccounted-for 209.56s 8.3% global cache lock s to x 99.87s 4.0% 3,434 0.029083sglobal cache lock open s 85.93s 3.4% 3,507 0.024502sglobal cache lock open x 57.88s 2.3% 1,930 0.029990slatch free 26.77s 1.1% 1,010 0.026505sSQL*Net message from client 19.11s 0.8% 6,714 0.002846swrite complete waits 11.13s 0.4% 155 0.071806srow cache lock 11.10s 0.4% 485 0.022887senqueue 11.09s 0.4% 330 0.033606slog file switch completion 7.31s 0.3% 15 0.487333slog file sync 3.31s 0.1% 39 0.084872swait for DLM latch 2.95s 0.1% 91 0.032418s...------------------------------ ------------------ -------- ----------Total 2,510.50s 100.0%

This statement went undetected for several years because, by traditional measures, it is “efficient.”

update po_requisitions_interface set requisition_header_id=:b0where (req_number_segment1=:b1 and request_id=:b2)

----- Response Time -------Action Count Rows Elapsed CPU Waits LIO Blks PIO Blks------- ----- ---- --------- --------- ------- --------- ---------Parse 0 0 0.00 0.00 0.00 0 0Execute 1,166 0 1,454.98 1,066.43 388.55 8,216,887 3,547Fetch 0 0 0.00 0.00 0.00 0 0------- ----- ---- --------- --------- ------- --------- ---------Total 1,166 0 1,454.98 1,066.43 388.55 8,216,887 3,547Per Exe 1 0 1.25 0.91 0.33 7,047 3Per Row 1,166 1 1,454.98 1,066.43 388.55 8,216,887 3,547

db buffer cache hit ratio = 99.956833%

SQL> select /*+ use_nl(o,n)*/ 2 o.object_name 3 from objects_t o, 4 nom_owner n 5 where o.owner=n.owner1;222888 rows selected.

SQL> select /*+ use_hash(n,o)*/ 2 o.object_name 3 from objects_t o, 4 nom_owner n 5 where o.owner=n.owner1;222888 rows selected.

Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=690 Card=186632 By tes=17170144)

1 0 NESTED LOOPS (Cost=690 Card=186632 Bytes=17170144) 2 1 TABLE ACCESS (FULL) OF 'OBJECTS_T' (TABLE) (Cost=656 Car d=186632 Bytes=14743928)

3 1 INDEX (UNIQUE SCAN) OF 'OWNER1_UK' (INDEX (UNIQUE)) (Cos t=0 Card=1 Bytes=13)

Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 35408 consistent gets 2954 physical reads 0 redo size 6581693 bytes sent via SQL*Net to client 163957 bytes received via SQL*Net from client 14861 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 222888 rows processed cash hit ratio =0.92 execution for 8.17 sec.

Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=661 Card=186632 By tes=17170144)

1 0 HASH JOIN (Cost=661 Card=186632 Bytes=17170144) 2 1 INDEX (FULL SCAN) OF 'OWNER1_UK' (INDEX (UNIQUE)) (Cost= 1 Card=121 Bytes=1573)

3 1 TABLE ACCESS (FULL) OF 'OBJECTS_T' (TABLE) (Cost=656 Car d=186632 Bytes=14743928)

Statistics---------------------------------------------------------- 7 recursive calls 0 db block gets 17718 consistent gets 2961 physical reads 0 redo size 6581693 bytes sent via SQL*Net to client 163957 bytes received via SQL*Net from client 14861 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 222888 rows processed cash hit ratio =0.83 execution for 3,48 sec.

Part IV

How to reduce LIO call frequency.

Focus on LIO reduction first.

• When you’ve eliminated PIOs, you’re still not done yet• If you begin with LIO reduction, you’ll get more benefit

– Most PIOs are motivated by LIOs• However, eliminating LIOs requires actual thought

Use the most efficient execution plan, not necessarily the one that your rules of thumb say to use.

• A statement’s buffer cache hit ratio is an illegitimate measure of its efficiency

• Any “x% of rows returned” index rule of thumb is illegitimate– Not all index range scans are good– Not all full-table scans are bad– Not all nested loops plans are good.

Efficient SQL is necessary SQL that puts as little load upon the database as possible.

• Eliminate unnecessary work– Filter early– Use arrays– Generate redo only when needed

Part V

Conclusion: Excessive LIO frequency is a major scalability barrier.

When do you need faster disk?

• Buy faster disk only when...– Disk latency significantly impacts an important program’s

response time– And you’ve exhausted less expensive workload reduction

opportunities

• Invalid motives– Device average I/O latency is higher than you want– Device utilization is higher than you want– PIO count is non-zero

When do you need more memory?

• Buy more memory only when…– The lack of memory significantly impacts an important

program’s response time• E.g., user PGAs need so much memory you page/swap

– And you’ve exhausted less expensive workload reduction opportunities

• Invalid motives– Your db buffer cache hit ratio is low– Your db buffer cache hit ratio is high

A multi-terabyte SGA won’t reduce LIO latencies, or eliminate LIO calls.

• LIOs are expensive– User-mode CPU time

• Spinning for latch• Inspecting cache buffers chain• Interpreting and filtering block content• Executing data type conversions

– Other response time• Sleeping for “latch free”

– Additional LIOs required to build ‘cr’ blocks

understanding oracle logicalio issues

Documents