tuning oracle rac wednesday, june 04, 2014 guy peleg president maklee engineering...

Post on 30-Mar-2015

226 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Tuning Oracle RAC

Monday, April 10, 2023

Guy PelegPresident

Maklee Engineeringguy.peleg@maklee.com

Agenda

RAC overview & Performance ExpectationsPerformance Tips SQL Tuning

Typical RAC configuration

Public Network

Private Network(Interconnect)

Storage Network

AppServer1

AppServer2

Client

Node 1

DatabaseInstance 1

Node 2

DatabaseInstance 2

Local Storage Local StorageShared

Database

Oracle RAC

Oracle RAC provides two main features:Availability

Scalability

May operate in two modes:All nodes are active (load distributed between nodes)

Active/Passive

RAC scaling/performance considerations are similar to OpenVMS clustering scaling/performance considerations

Interconnect

Locks

Sharing

RAC Scaling – Maklee’s Golden Rules

Application that does not scale on a standalone node – will not scale on RAC

Start with single instance tuningshutdown all nodes measure scalingtest scaling by adding CPUs

Add one node at a time to measure scalability

Scalability Benchmark

2 nodes cluster1.3 Ghz rx2600, running OpenVMS V8.3-1H1

Oracle 10gR2 RACLatest set of patches

Test database contains information about 50,000 customers

200,000 customer orders

200,000 ordered items

Scalability Benchmark

PL/SQL procedure to fetch data about 2000 random customers

Read only test

All data in SGA

No I/O

CPU Bound

Scalability Benchmark

0

200

400

600

800

1000

1200

1400

1 Job

6 Parallel jobs -one instance

3 Parallel jobsinstance 1

3 Parallel jobsinstance 2

Elapsed time (seconds per job) to complete the testLess is better

Elapsed time (seconds per job) to complete the testLess is better

RAC Proof Of Concept

MAKLEE Engineering recently performed a RAC proof of concept installation at a large chain of department stores in Switzerland.

Benchmarked a single Alpha GS1280 (production node) vs. a RAC cluster running 2 Integrity servers rx6600.

The goals were:Install RAC Get hands on experience with RACPerform RAC scaling testsMake a go/no go decision on implementing RAC in production

Hardware & Software Configuration

Oracle RAC Configuration:2 nodes OpenVMS ClusterEach node is rx6600 with 8 coresOpenVMS V8.3-1H1EVA8000 storage

Products installed:Oracle CRS (Cluster Ready Services)Oracle 10g R2DBCA executed for configuring RAC enabled databaseDatabase patches

27 Parallel Database Import Jobs

Minutes to complete database import

less is better

0

20

40

60

80

100

120

140

160Standalone GS1280

rx6600RAC/standaloneItanium

Database Import

Itanium outperformed Alpha

Operating in RAC environment does not increase the throughput of the import operation

Spreading the jobs across two nodes or running all jobs on one node yields identical performance/throughput

No performance degradation witnessed

Batch Processing Benchmark

0

5

10

15

20

25

30

35

40

4554 jobs - SingleAlpha GS1280

54 jobs - Singlerx6600

54 jobs - Spreadacross the RAC

Minutes to complete batch processing cycle

Less is better

Minutes to complete batch processing cycle

Less is better

Batch Processing Benchmark

Itanium outperformed Alpha

RAC allows scaling outside of the box

Second RAC node adds 40% more throughput

Another Example – European Bank

European Bank migrating from Alpha to Itanium

2 nodes AlphaServer ES47 -> 2 nodes rx7640

Migrating to Oracle 10gR2 RAC

Availability is main concernInteractive users will be distributed between nodesNo plans to distributed batch load between nodesNeeded to verify that RAC does not degrade performance

Another Example – European Bank

Benchmarked various batch jobs – focusing on one specific batch job.

Initial results did not favor Itanium.

Batch Processing Benchmark

0

5

10

15

20

25

30

35Alpha ES47

rx7640 out of thebox

rx7640 after tuning

Minutes to complete selected batch job

Less is better

Minutes to complete selected batch job

Less is better

European Bank - Summary

Tuning is critical for achieving optimal performance

Don’t run “out of the box”.66% improvement after (minimal) tuning

The specific benchmark is running 52% faster on Itanium comparing to Alpha.

European Bank - Summary

All other batch jobs/applications witnessed similar improvement.

RAC increases availability and does not degrade performance.

RAC will go into production in few weeks

Performance Tips

CRS Base Priority

CRS is running in batch

Usually, runs in a dedicated batch queue

By default, base priority of a batch queue is 4

On a system with thousands of processes, CRS may need to compete (and sometimes lose) for CPU resources

CRS should be given high priority

Set base priority of CRS queue to 12

RAC Cluster Interconnect

The performance of the cluster interconnect is critical to the performance of the RAC.

Interconnect used forCluster managementLocksCache Fusion

Oracle requires (at least one) dedicated cluster interconnect

Gigabit Ethernet is highly recommendedEnable Jumbo FramesTransfer rate of ~ 25MB per second (faster than some disks ;-)

Cluster interconnect Performance

Latency is CRITICAL for RAC performance

Measure the latency of the interconnect:

set numwidth 20

column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9

select

b1.inst_id,

b2.value "GCS CR BLOCKS RECEIVED",

b1.value "GCS CR BLOCK RECEIVE TIME",

((b1.value/b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"

from gv$sysstat b1,

gv$sysstat b2

where b1.name='gc cr block receive time'

and b2.name='gc cr blocks received'

and b1.inst_id=b2.inst_id;

Cluster interconnect Performance

Latency should be lower than 15ms

OpenVMS achieved 0.5ms onblades RAC (BL860)V8.3-1H1Gigabit EthernetJumbo Frames enabled

Load distribution between instances

set pagesize 60 space 2 numwidth 8 linesize 132 verify off feedback offcolumn service_name format a20 truncated heading 'Service'column instance_name heading 'Instance' format a10column service_time heading 'Service Time|mSec/Call' format 999999999select service_name,  instance_name,  elapsedpercall service_time,  cpupercall cpu_time,  dbtimepercall db_time,  callspersec throughput  from gv$instance gvi,   gv$active_services gvas,   gv$servicemetric gvsm  where gvas.inst_id=gvsm.inst_id  and gvas.name_hash=gvsm.service_name_hash   and gvi.inst_id=gvsm.inst_id   and gvsm.group_id=10  order by   service_name,   gvi.inst_id;

Standalone Database Import

Minutes to complete database import

less is better

0

1

2

3

4

5

6

7

8

rx6600 beforetuning

rx6600 after tuning

37% Improvement

Database import

Install imp.exe as resident image with shared address space$ install add imp.exe/resident/share=addr

Increase default quotas for BEQ’s mailboxes$ define/sys ORA_BEQ_MBXSIZ 64000$ define/sys ORA_BEQ_MBXSBFQ 64000

Set DEFMBXBUFQUO to 64000Set DEFMBXMXMSG to 64000

DBMS_STATS.GATHER_SCHEMA_STATS

0

10

20

30

40

50

60

70

80

90

100

rx6600 beforetuning

rx6600 after tuning

Minutes to gather database statistics (350GB database)

Less is better

Minutes to gather database statistics (350GB database)

Less is better

DBMS_STATS.GATHER_SCHEMA_STATS

Calling gather_schema_stats results in a database server process being created

The server process in not multithreaded

Typically consumes 100% of one CPU

Performance improvement achieved by affinitizing the server process to one CPU and increasing QUANTUM to 20.

SORT

Analyze the efficiency of sort operations

Determine the number of optimal, one pass and multipass operationsSELECT optimal_count, round(optimal_count*100/total, 2) optimal_perc, onepass_count, round(onepass_count*100/total, 2) onepass_perc, multipass_count, round(multipass_count*100/total, 2) multipass_percFROM (SELECT decode(sum(total_executions), 0, 1, sum(total_executions)) total, sum(OPTIMAL_EXECUTIONS) optimal_count, sum(ONEPASS_EXECUTIONS) onepass_count, sum(MULTIPASSES_EXECUTIONS) multipass_count FROM v$sql_workarea_histogramWHERE low_optimal_size > 64*1024);

Sizing the SGA

Reserve memory for the SGA (SYSMAN)

Avoid automatic memory management in the SGA whenever possible.

The following query will help properly size the SGA

select sga_size, sga_size_factor as size_factor,estd_physical_reads as estimated_physical_reads from v$sga_target_advice order by sga_size_factor;

Sizing the SGA

SQL> select sga_size, sga_size_factor as size_factor,

2 estd_physical_reads as estimated_physical_reads

3 from v$sga_target_advice order by sga_size_factor;

SGA_SIZE SIZE_FACTOR ESTIMATED_PHYSICAL_READS

---------- ----------- ------------------------

4356 ,75 44485808

5808 1 24659539

7260 1,25 24659539

8712 1,5 24659539

10164 1,75 24659539

SQL>

What’s wrong in this picture?

$ show memory

System Memory Resources on 1-APR-2008 15:32:35.62

Physical Memory Usage (bytes): Total Free In Use Modified

Main Memory (GB) 64.00 58.27 5.69 0.02

Extended File Cache (Time of last reset: 31-MAR-2008 15:14:46.99)

Allocated (MBytes) 397.03 Maximum size (MBytes) 32768.00

Free (MBytes) 17.82 Minimum size (MBytes) 3.12

In use (MBytes) 379.20 Percentage Read I/Os 77%

Read hit rate 99% Write hit rate 0%

Read I/O count 5368075 Write I/O count 1578011

Read hit count 5315683 Write hit count 0

Reads bypassing cache 79 Writes bypassing cache 241954

Files cached open 739 Files cached closed 2255

Vols in Full XFC mode 0 Vols in VIOC Compatible mode 52

Vols in No Caching mode 0 Vols in Perm. No Caching mode 0

....

Of the physical memory in use, 8.52 GB are permanently allocated to OpenVMS.

$

SQL Tuning

The next step in improving performance

SQL Tuning !

With previous “Alpha Vs. Itanium” benchmarks we had to play it fare

Not a single SQL statement was changed.

SQL tuning may improve performance by magnitudes

SQL Tuning

All the tools that are required for SQL tuning are shipping with the database:

Automatic Workload Repository (AWR)Endless amount of performance related informationEnhanced version of statpak

Active Session History (ASH)Automatic Database Diagnostic Monitor (ADDM)SQL Access AdvisorSQL Tuning Advisor

Statspack analyzer (not part of the DB but available for free)

The power of SQL tuning

AWR was used to analyze the “scalability benchmark”97% of the time was spent executing single SQL statementAfter SQL tuning – elapsed time of the benchmark was reduced from 411 seconds to 3.18 seconds !

0

50

100

150

200

250

300

350

400

450

Untunedversion

Tuned version

130 times

faster!!!!

The power of SQL tuning

“Real life” examplerx6600, Oracle 10g, DWH DBSingle SQL statement required 140 minutes to completeBy biasing the optimizer, elapsed time reduced to 10 minutes

0

20

40

60

80

100

120

140

Untunedversion

Tuned version

See us at www.maklee.com for:

• Performance improvements

• Oracle Tuning

• Platform Migration

• Custom Engineering solutions

• Custom Training

Questions?

top related