Mastering Oracle Real Application Clusters Performance Tuning at Verizon Wireless
S311852
Ian Remedios, Ph.D.
Director, Global Product Management
Oracle Advanced Customer Services
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Oracle Advanced Customer Services
Dedicated to the continual operational improvement of Oracle solutions and to maximizing the value of Oracle investments.
• Solution Lifecycle Management Services
• Database and Application Management Services
• Industry-specific Solution Support Centers
• Remote and On-Site Expert Services
What Differentiates Advanced Customer Services
Broad Industry Presence
• Financial Services
• Telecommunications
• Aerospace and Defense
• Public Sector
Breadth of Products & Services
• Custom Support Packages
• 2,500+ World Class Experts
• Remote and Onsite Services for unique and complex environments
High Customer Loyalty
• High Renewal Rates
• Existing customers expand services
• Strong References
Industry Leadership
• Industry/Analyst/Media recognition for operational excellence
Advanced Customer Services
Four levels of annual services to meet specific business and budget requirements
Service levels: Advanced Support Assistance, Priority Service, Business Critical Assistance, Solution Support Center
Service features shown across the levels:
• Service Delivery Manager
• Escalation Management
• Personalized Portal
• Prioritized Service Requests
• Access to team of service engineers
• Proactive Onsite Support
• Dedicated Hotline
• Virtual Center of Excellence with designated experts
• Performance Optimization Services
Benefits: Personalized Support, Faster Problem Resolution, Proactive Problem Avoidance, Continual Operational Improvement
Customers can customize their solutions by choosing from more than 50 individual products, tools and expert services.
RAC: The Cluster Database
Drive and Exploit Industry Advances in Clustering
[Diagram labels: Users; Network; Centralized Management Console; Clustered Database Servers; High Speed Switch or Interconnect; Shared Cache; Hub or Switch Fabric; Storage Area Network; Mirrored Disk Subsystem; No Single Point Of Failure]
Oracle RAC Architecture
[Diagram labels: public network; VIP1 ... VIPn; Service; Listener; Node 1 ... Node n, each with instance, ASM, Oracle Clusterware and Operating System; cluster interconnect; shared storage with Redo / Archive logs of all instances, Database / Control files, and OCR and Voting Disks; Managed by ASM; RAW Devices]
RAC Deployment Cycle
[Diagram labels: Testing and Staging; Production; Create reference RAC systems; Stage gold images; Test users; Create Production clusters; Scale up RAC cluster; Scale down]
RAC Related Service Offerings by Advanced Customer Services
Assessments
• Configuration
• Performance
• Patch
Assisted Services
• Upgrade Assistance
• Patching Assistance
• Onsite Assistance
• Backup & Recovery Review
Managed Services
• Remote Monitoring
• Escalation Management
Exadata V1 and V2
• Data Migration Advisory
• Storage Server Configuration
• Database Machine
• Assisted and Managed Services
New: Oracle Customer Success Assessment
Get more value from your Oracle investment with Customer Services
• 15min Online Survey on 5 Domains
  • Strategy
  • Process
  • Technology
  • People
  • Governance
• Personalized Benchmark Study
  • Compare your results to peers
  • Advice on 25 good practice areas
  • Recommended actions to take
  • Oracle services to assist in practice improvements
• Navigate Oracle’s service catalog
  • Complete portfolio of services mapped to IT lifecycle (ITIL) on oracle.com
Assessment URL: http://www.oracle.com/us/products/035498.htm
For More Information
Search for "Advanced Customer Services" at http://search.oracle.com
or visit http://oracle.com/goto/acs
Mastering Oracle Real Application Clusters Performance Tuning At Verizon Wireless
Session Id: S311852
Syamal Bandyopadhyay
Agenda:
• Verizon Wireless Business Requirements for RAC
• RAC Implementation / Deployment Architecture
• Methodology To Proactively Detect Performance Issues
• Techniques / Tips in Resolving Performance Issues
• Application Performance Score Card: A 360 Degree View
• Conclusions
• Q/A
Verizon Wireless Business Requirements for RAC
• High Availability
• Scalability
• Reduced IT cost
• Application performance meets or beats non-RAC deployment performance
RAC Implementation / Deployment Architecture
RAC Deployment
Middleware - Websphere
Application:
• Business Critical
• Customer Facing
• 24 X 7 X 365
• 30000 Concurrent Users
Database: VZWPROD1
DB CONFIGURATION
• V10.2.0.4 (64 Bit)
• DB Size 1.5 TB
• 4 Oracle Instances
• Data Guard
• Flashback
Platform Components
• Symantec SFRAC 5.0
• Hitachi Storage
• Shareplex
SERVER CONFIGURATION
• Sun M5000
• 8 Quad Core CPUs
• 64 GB RAM
• Solaris 10 (64 Bit)
Disaster Recovery Strategy
• Runs on 2 Data Centers
• 2 databases with identical structure: VZWPROD1 and VZWPROD2
• Oracle’s Data Guard for DR
• Physical Data Guard with Maximum Availability: VZWDR1 and VZWDR2
[Diagram labels: Middleware; VZWPROD1; VZWPROD2; VZWDR1; VZWDR2; Oracle – Data Guard (physical)]
2-Way Data Replication using Shareplex (Quest Software)
Real Time Data Replication
• Bi-directional real time data replication
• More than 100 Tables
• Replicate data using Shareplex
[Diagram labels: User; Data center 1; Data center 2; VZWPROD1; VZWPROD2]
Application Load Distribution
Database VZWPROD1, Nodes 1 to 4, behind the middleware tier:
• Application-1 – Preferred Node 1 & 2
• Application-2 – Preferred Node 3 & 4
• Application-3 – Preferred Node 3 & 4
• Shareplex – Preferred Node 3
• Data Guard – Preferred Node 4
Deployment Architecture
• 2 Data Centers
• 2 Production Databases – 4 Instances
• 2 Disaster Recovery Databases – 4 Instances
• Real Time Data Replication using Shareplex
Components:
• Oracle 10.2.0.4
• Oracle Data Guard
• SUN M5000 Servers
• Symantec SFRAC 5.0
• Hitachi Storage
[Diagram labels: Middleware; VZWPROD1; VZWPROD2; VZWDR1; VZWDR2; Oracle – Data Guard (physical); Shareplex]
Methodology To Proactively Detect Performance Issues
Performance Challenges For RAC Implementation
• Concern regarding meeting current level of application response time
• Inserts are taking significantly longer time
• Increased response time for both selects and updates
SQL Response Time – Key Components
SQL Response Time Equation:
Non-RAC: Response time = CPU time + Wait time (IO wait + Queue time)
RAC: Response time = CPU time + Wait time (IO wait + Queue time + Cluster Wait Time)
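As a worked illustration of the RAC equation, the sketch below splits one measured elapsed time into its components. The figures are the per-execution values (in milliseconds) reported for SQL bh9kgy575a8j7 on instance 1 in this deck's RAC test sample; the helper name `other_wait_ms` is hypothetical, and the sketch assumes elapsed time is exactly CPU time plus wait time.

```python
def other_wait_ms(elapsed_ms: float, cpu_ms: float, cluster_wait_ms: float) -> float:
    """Wait time left over (IO wait + queue time) once CPU time and
    cluster wait time are subtracted from the elapsed time."""
    return elapsed_ms - cpu_ms - cluster_wait_ms


# Per-execution values in ms for bh9kgy575a8j7, instance 1 (RAC test sample).
elapsed, cpu, cwt = 34.6, 1.3, 25.9

other = other_wait_ms(elapsed, cpu, cwt)
print(f"CPU time:             {cpu:.1f} ms")
print(f"Cluster wait time:    {cwt:.1f} ms")
print(f"IO wait + queue time: {other:.1f} ms")
```

Read this way, most of the elapsed time for this statement is cluster wait time, which is the term that does not exist in the non-RAC equation.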
Cluster Wait Time
• Time to access the blocks/data from the cache of the partner instance(s)
• The more blocks a SQL has to access, the greater the wait time
• Inefficient SQLs make cluster wait time worse
• Inadequate indexes increase cluster wait time
• Avoid / minimize block transfer among RAC instances
• Access paths causing a high rate of block transfer:
  • Full Table Scans
  • Index Full Scans
  • Index Fast Full Scans
  • Index Skip Scans
Methodology to Improve Performance
• Collect non-RAC production performance stats for all SQLs
• Collect RAC test performance stats for all SQLs
• Collect performance stats from the GV$SQL view
• Compare RAC test performance stats with pre-RAC stats
• Identify SQLs and database objects having performance issues
• Analyze the performance data to detect the root cause of unacceptable performance
• Tune the database objects
Table To Collect Performance Data
CREATE TABLE SQL_PERF_DATA (
  COLLECTION_TS DATE,               --- data collection timestamp
  INDICATOR     VARCHAR2(12 BYTE),  --- test type, e.g. test1
  SQLID         VARCHAR2(13 BYTE),  --- sql_id of the SQL
  OWNER         VARCHAR2(30 BYTE),  --- parsing schema name of the sql
  INSTID        NUMBER,             --- RAC instance id
  EXCNT         NUMBER,             --- # of executions of the SQL
  ELAPT         NUMBER,             --- elapsed time (ms) of the sql
  CPUT          NUMBER,             --- cpu time (ms) of the sql
  CWT           NUMBER,             --- cluster wait time (ms) of the sql
  LIO           NUMBER,             --- # of buffer gets per execution of the sql
  PHYIO         NUMBER,             --- # of disk reads per execution of the sql
  ROWCNT        NUMBER,             --- # rows in result set per execution
  SQLFULLTEXT   CLOB                --- SQL full text
) STORAGE...........
SQL Script To Collect Performance Data
insert into sql_perf_data
  (collection_ts, indicator, sqlid, owner, instid, excnt, elapt, cput, cwt, lio, phyio, rowcnt, sqlfulltext)
select sysdate, 'ractst01', sql_id, parsing_schema_name, inst_id, executions,
       elapsed_time/1000/executions,
       cpu_time/1000/executions,
       cluster_wait_time/1000/executions,
       buffer_gets/executions,
       disk_reads/executions,
       rows_processed/executions,
       sql_fulltext
from gv$sql
where executions > 0
  and (elapsed_time/1000)/(decode(executions,0,1,executions)) > 1
  and parsing_schema_name in ('APPL1', 'APPL2')
Sample Performance Data Collection
Note: Times are in milliseconds per execution

Non-RAC Production:
indicator  sqlid          owner  instid  excnt      elapt     cput   cwt       lio       phyio  rowcnt  sqlfulltext
prode      bh9kgy575a8j7  APPL1  -       7,231,254  7.3       0.3    0.0       32.1      2.4    17.3    INSERT into..
prode      2hacpd8v3fbsa  APPL3  -       3,952,181  4.6       1.1    0.0       26.7      0.9    1.0     UPDATE …..
prode      88xdp6maq6pvs  APPL1  -       15,981     1793.3    397.9  0.0       13183.0   111.7  1.0     SELECT …..

RAC Test:
indicator  sqlid          owner  instid  excnt      elapt     cput   cwt       lio       phyio  rowcnt  sqlfulltext
ractst01   bh9kgy575a8j7  APPL1  1       1,250,058  34.6      1.3    25.9      32.1      2.4    13.2    INSERT into..
ractst01   bh9kgy575a8j7  APPL1  2       927,829    33.0      1.3    24.6      29.6      2.3    18.5    INSERT into..
ractst01   2hacpd8v3fbsa  APPL3  4       2,290,539  28.4      2.3    23.4      26.7      1.0    1.0     UPDATE …..
ractst01   88xdp6maq6pvs  APPL1  1       5,005      19,118.2  517.2  14,186.1  10,414.8  319.3  1.0     SELECT …..
Performance Data Comparison SQL Script
Compares the performance stats between any 2 collections of data (e.g. production vs. test; test1 vs. test2, etc.)
select a.owner, a.sqlid, a.instid Rinstid, a.excnt Rexcnt, a.elapt Relapt,
       (a.cput) Rcput, (a.lio) Rlio, (a.phyio) Rphyio, a.cwt Rcwt, (a.rowcnt) Rrowcnt,
       (a.elapt - b.elapt) "RAC - PROD",
       (a.elapt / b.elapt) "RAC over PROD",
       b.excnt Pexcnt, (b.elapt) Pelapt, (b.cput) Pcput, (b.lio) Plio, (b.phyio) Pphyio,
       b.cwt Pcwt, (b.rowcnt) Prowcnt, b.sqlid Psqlid, b.sqlfulltext
from sql_perf_data a, sql_perf_data b
where a.indicator = 'ractst01'
  and b.indicator = 'prode'
  and a.sqlid = b.sqlid
order by (a.elapt / b.elapt) desc
Performance Data Comparison Result Set
Note: Times are in milliseconds per execution

RAC Performance Stats:
OWNER  SQLID          RINSTID  REXCNT     RELAPT    RCPUT  RCWT      RLIO      RPHYIO
APPL1  bh9kgy575a8j7  1        1,250,058  34.6      1.3    25.9      32.1      2.4
APPL1  bh9kgy575a8j7  2        927,829    33.0      1.3    24.6      29.6      2.3
APPL3  2hacpd8v3fbsa  4        2,290,539  28.4      2.3    23.4      26.7      1.0
APPL1  88xdp6maq6pvs  1        5,005      19,118.2  517.2  14,186.1  10,414.8  319.3

Comparison columns (same row order):
RAC - PROD  RAC over PROD
27.3        4.76
25.7        4.54
23.8        6.13
17,324.9    10.66

Non-RAC Performance Stats (same row order):
PEXCNT     PELAPT  PCPUT  PCWT  PLIO     PPHYIO  PROWCNT  PSQLID
7,231,254  7.3     0.3    0.0   32.1     2.4     17.27    bh9kgy575a8j7
7,231,254  7.3     0.3    0.0   32.1     2.4     17.27    bh9kgy575a8j7
3,952,181  4.6     1.1    0.0   26.7     0.9     1.00     2hacpd8v3fbsa
15,981     1793.3  397.9  0.0   13183.0  111.7   0.99     88xdp6maq6pvs

RAC - PROD = RAC Elapsed time – Production Elapsed time
RAC over PROD = RAC Elapsed time / Production Elapsed time
Is RAC over PROD > 1? Yes: RAC performs worse. No: RAC performs better.
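The comparison logic can also be sketched outside the database. The example below is a minimal illustration in Python, not the presenter's tooling: it joins RAC test and production rows on sqlid, computes the difference and the ratio, and sorts by ratio descending, mirroring the query. The `SqlStat` class and `compare` function are hypothetical names; the elapsed times are the rounded per-execution figures from the sample data, so the computed ratios can differ slightly in the second decimal from the values on the slide.

```python
from dataclasses import dataclass


@dataclass
class SqlStat:
    sqlid: str
    elapt_ms: float  # elapsed time per execution, in milliseconds


def compare(rac, prod):
    """Return (sqlid, rac - prod, rac / prod) tuples, worst ratio first."""
    prod_by_id = {p.sqlid: p for p in prod}
    rows = []
    for r in rac:
        p = prod_by_id.get(r.sqlid)
        if p is None or p.elapt_ms == 0:
            continue  # no production baseline to compare against
        rows.append((r.sqlid, r.elapt_ms - p.elapt_ms, r.elapt_ms / p.elapt_ms))
    return sorted(rows, key=lambda row: row[2], reverse=True)


prod = [
    SqlStat("bh9kgy575a8j7", 7.3),
    SqlStat("2hacpd8v3fbsa", 4.6),
    SqlStat("88xdp6maq6pvs", 1793.3),
]
rac = [
    SqlStat("bh9kgy575a8j7", 34.6),   # instance 1
    SqlStat("2hacpd8v3fbsa", 28.4),   # instance 4
    SqlStat("88xdp6maq6pvs", 19118.2),  # instance 1
]

result = compare(rac, prod)
for sqlid, diff, ratio in result:
    verdict = "RAC performs worse" if ratio > 1 else "RAC performs better"
    print(f"{sqlid}  diff={diff:.1f} ms  ratio={ratio:.2f}  {verdict}")
```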
Cluster Wait Time Data Collection Methodology
• V$SQL view contains cluster_wait_time for every sql
• Calculate cluster_wait_time per execution of the sql
• Calculate percentage of cluster_wait_time over elapsed_time per execution
• Focus on SQLs having:
  • High cluster_wait_time per execution
  • High % of cluster_wait_time over elapsed_time
  • High execution frequency
Cluster Wait Time Data Collection Script
select sql_id sqlid, parsing_schema_name appl, inst_id instid, executions excnt,
       cluster_wait_time/1000/executions cwt,
       ((cluster_wait_time/1000/executions)/(elapsed_time/1000/executions)) "CWT over ELAPT",
       elapsed_time/1000/executions elapt,
       cpu_time/1000/executions cput
from gv$sql
where executions > 0
  and (elapsed_time/1000)/(decode(executions,0,1,executions)) > 1
  and parsing_schema_name in ('APPL1')
order by ((cluster_wait_time/1000/executions)/(elapsed_time/1000/executions)) desc
Sample Cluster Wait Time Data
Note: Times are in milliseconds per execution
SQLID          APPL   INSTID  EXCNT      CWT       CWT over ELAPT  ELAPT     CPUT
2hacpd8v3fbsa  APPL3  4       2,290,539  23.4      82.35%          28.4      2.3
bh9kgy575a8j7  APPL1  1       1,250,058  25.9      74.97%          34.6      1.3
bh9kgy575a8j7  APPL1  2       927,829    24.6      74.54%          33.0      1.3
88xdp6maq6pvs  APPL1  1       5,005      14,186.1  74.20%          19,118.2  517.2
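The three focus criteria (cluster wait time per execution, its share of elapsed time, and execution frequency) can rank the same statements differently. The sketch below, using the sample rows above, ranks once by share and once by an estimated total cluster wait time (executions multiplied by per-execution cluster wait). The function names and the "total" metric are illustrative assumptions, not part of the original scripts.

```python
# (sqlid, instid, executions, cwt_ms_per_exec, elapsed_ms_per_exec)
# copied from the sample cluster wait time data.
rows = [
    ("2hacpd8v3fbsa", 4, 2_290_539, 23.4, 28.4),
    ("bh9kgy575a8j7", 1, 1_250_058, 25.9, 34.6),
    ("bh9kgy575a8j7", 2, 927_829, 24.6, 33.0),
    ("88xdp6maq6pvs", 1, 5_005, 14_186.1, 19_118.2),
]


def cwt_share(row):
    """Fraction of elapsed time spent in cluster wait, per execution."""
    return row[3] / row[4]


def total_cwt_ms(row):
    """Estimated total cluster wait time: executions x per-execution wait."""
    return row[2] * row[3]


by_share = sorted(rows, key=cwt_share, reverse=True)
by_total = sorted(rows, key=total_cwt_ms, reverse=True)

print("Ranked by share of elapsed time:")
for row in by_share:
    print(f"  {row[0]} inst {row[1]}: {cwt_share(row):.1%}")

print("Ranked by estimated total cluster wait:")
for row in by_total:
    print(f"  {row[0]} inst {row[1]}: {total_cwt_ms(row) / 1000:.0f} s")
```

In this sample the rarely executed SELECT ranks last by share but first by estimated total wait, which is why looking at only one of the criteria can be misleading.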
SQL To Capture Database Objects Having Full Table Scans
select b.table_name, b.num_rows, a.frequency, b.owner
from dba_tables b,
     (select object_name, count(*) frequency
      from gv$sql_plan
      where operation = 'TABLE ACCESS'
        and options like '%FULL%'
        and object_owner in ('SCHEMA_NAME1', 'SCHEMA_NAME2')
      group by object_name) a
where a.object_name = b.table_name
order by 4, 2 desc
Sample Data:
TABLE_NAME NUM_ROWS FREQUENCY OWNER
ORDER_DETAILS 24512171 4 SCHEMA_NAME1
ORDER_HEADER 19217318 6 SCHEMA_NAME1
SQL to Capture SQL_ID doing Full Table Scan
select object_name, sql_id
from gv$sql_plan
where operation = 'TABLE ACCESS'
  and options like '%FULL%'
  and object_owner in ('SCHEMA_NAME1', 'SCHEMA_NAME2')
order by object_name
Sample data:
OBJECT_NAME SQL_ID
ORDER_DETAILS 4a6svn2r2wamc
ORDER_DETAILS 88sqhdvtp0fsd
ORDER_HEADER ggtf49yp69p2t
Oracle Automatic Workload Repository (AWR)
• Analyzed AWR
• AWR contains a significant amount of RAC-related stats
• Each RAC Instance has its own AWR
• Very helpful performance stats for Global Cache blocks sent and received
• Interconnect Traffic Volume – for each instance
• Database Objects incurring Global Cache Buffer Busy Waits
• Database Objects having Consistent Read (CR) blocks received waits
• Database Objects with Current Blocks received waits
Additional Oracle Dynamic Performance Views
• GV$ACTIVE_SESSION_HISTORY
• GV$SESSION_WAIT
• These views contain a tremendous amount of performance-related stats
• Identified the database objects with frequent GLOBAL CACHE (gc_%) waits
Techniques & Tips for Resolving Performance Issues
Cluster Wait Time Reduction Techniques
• Table / Index Changes:
  • Introduce partition / sub-partition – Hash Partition where feasible
  • Add: freelist / freelist group
  • Increase INITRANS
• Index Changes:
  • Eliminate if possible
  • Modify inefficient indexes
  • Global Hash partition
  • Make local where feasible
  • Use Multiple Block Size, especially for indexes
• Reduce full table scans, index fast full scans, index full scans, index skip scans
• Tune SQLs / database objects to reduce the # of logical / physical IO
Table / Index Change: Example
Pre-RAC
CREATE TABLE TABLEA ( LOGIN_ID …)
PARTITION BY RANGE (TIME_STAMP) ……
CREATE INDEX IND2 ON TABLEA (ACCOUNT_NUMBER, TIME_STAMP, MTN)
======================================================================
In RAC
CREATE TABLE TABLEA .. INITRANS 10 .. FREELISTS 12 FREELIST GROUPS 4
PARTITION BY RANGE (TIME_STAMP) …
SUBPARTITION BY HASH (LOGIN_ID)........
SUBPARTITIONS 64 …
CREATE INDEX IND2 ON TABLEA (ACCOUNT_NUMBER, TIME_STAMP, MTN) TABLESPACE ACSS_2K_IDX03
INITRANS 10 FREELISTS 12 FREELIST GROUPS 4
GLOBAL PARTITION BY HASH (ACCOUNT_NUMBER, TIME_STAMP, MTN)
PARTITIONS 64
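The RAC version of the example adds hash subpartitioning on LOGIN_ID with 64 subpartitions. The sketch below illustrates the general idea of hash distribution: rows that would all land in the same time-range partition are spread across 64 buckets by key. It is only a conceptual model. SHA-256 stands in for Oracle's internal hash function, and the login ids are made-up values, so the bucket numbers do not correspond to real subpartitions.

```python
import hashlib
from collections import Counter

SUBPARTITIONS = 64  # same count as SUBPARTITIONS 64 in the example DDL


def bucket_for(login_id: str, buckets: int = SUBPARTITIONS) -> int:
    """Map a key to a bucket. SHA-256 is a stand-in hash, not Oracle's."""
    digest = hashlib.sha256(login_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


# Hypothetical login ids inserted during the same time range.
login_ids = [f"user{n:05d}" for n in range(10_000)]
counts = Counter(bucket_for(login_id) for login_id in login_ids)

print(f"buckets used: {len(counts)} of {SUBPARTITIONS}")
print(f"rows per bucket: min={min(counts.values())}, max={max(counts.values())}")
```

With range partitioning alone, concurrent inserts for the current time range target one segment; spreading them by hash is one way to reduce contention on the same blocks across instances.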
Application Performance Score Card: A 360 Degree View
Performance Data Collection : By Applications
• Collect performance stats for each Application
• Compare RAC test results with non-RAC
• Compare non-RAC to RAC production stats
• SQL to capture the performance stats:
select parsing_schema_name "appluser",
       sum(executions) "exec cnt",
       sum(elapsed_time/1000)/sum(executions) "elap",
       sum(cluster_wait_time/1000)/sum(executions) "cwt",
       sum(cpu_time/1000)/sum(executions) "cpu",
       sum(buffer_gets)/sum(executions) "log io",
       sum(disk_reads)/sum(executions) "phy io",
       sum(rows_processed)/sum(executions) "row/exec"
from gv$sql
where executions > 0
  and (elapsed_time/1000)/(decode(executions,0,1,executions)) > 0
  and parsing_schema_name in ('APPL1', 'APPL2', 'APPL3')
group by parsing_schema_name
order by parsing_schema_name
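The query above divides summed totals by summed executions, which yields an execution-weighted average per application rather than an average of per-SQL averages. The short sketch below shows the difference with two made-up statements (the numbers are hypothetical, not from the deck): one frequent and fast, one rare and slow.

```python
def weighted_avg_ms(rows):
    """rows: (executions, total_elapsed_ms) per SQL.
    Mirrors sum(elapsed_time/1000)/sum(executions) in the query."""
    total_exec = sum(execs for execs, _ in rows)
    total_elapsed = sum(elapsed for _, elapsed in rows)
    return total_elapsed / total_exec


# Hypothetical workload: 1,000,000 executions averaging 2 ms,
# and 10 executions averaging 5,000 ms.
rows = [(1_000_000, 2_000_000.0), (10, 50_000.0)]

weighted = weighted_avg_ms(rows)
naive = sum(elapsed / execs for execs, elapsed in rows) / len(rows)

print(f"execution-weighted average: {weighted:.2f} ms")
print(f"average of per-SQL averages: {naive:.2f} ms")
```

The weighted figure stays close to what a typical execution experiences, which suits an application-level score card; the per-SQL comparison earlier in the deck remains the place to spot individual slow statements.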
Performance Data Comparison: Final Score Card
Pre RAC Production
appluser  exec cnt     elap  cwt   cpu   log io  phy io  row/exec
APPL1     189,032,993  2.30  0.00  1.01  23.56   0.27    7.71
APPL2     16,696,364   8.26  0.00  2.36  231.13  1.08    1.65
APPL3     226,087,216  1.97  0.00  0.66  10.19   0.40    5.76

RAC Test
appluser  exec cnt     elap  cwt   cpu   log io  phy io  row/exec
APPL1     7,738,861    1.82  0.32  1.20  31.13   0.15    9.88
APPL2     2,299,352    6.63  0.39  0.66  36.45   1.01    1.62
APPL3     1,955,154    1.86  0.33  0.59  13.52   1.13    5.13

Post RAC Production
appluser  exec cnt     elap  cwt   cpu   log io  phy io  row/exec
APPL1     39,738,861   2.17  0.31  1.03  27.91   0.19    7.58
APPL2     3,299,352    8.13  0.49  1.96  217.90  0.87    1.81
APPL3     41,955,154   2.03  0.42  0.69  10.13   0.38    5.83
Met Goals of RAC Implementation at Verizon Wireless
Achieved the goals:
• High Availability
• Scalability
• Reduced IT cost
• Application performance meets or beats pre-RAC performance
• Concern regarding meeting current level of application response time… Met the required performance
• Inserts are taking significantly longer time… Dramatic reduction of the elapsed time for inserts
• Increased response time for both selects and updates… Improved to meet the goals
Conclusions
• Identify SQLs and Database Objects having High Cluster Wait Time
• Cluster Wait Time must be reduced
• Use the Techniques to reduce Cluster Wait Time
  • Partition / Sub-partition tables
  • Partition indexes (preferably Hash)
  • Use Freelist Group
  • Use Freelist
  • Increase initrans
  • Remove unnecessary Indexes
  • Reduce Full Table Scans, Index Full Scans, Index Fast Full Scans, Index Skip Scans
  • Use Multiple block size (2K, 4K, 8K, etc.)
Q&A
THANK YOU!!!