“This presentation is for informational purposes only and may not be incorporated into a contract or agreement.”
This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development,
release, and timing of any features or functionality described in this document remains at the sole discretion of Oracle. This document in any form, software or printed matter, contains proprietary information
that is the exclusive property of Oracle. This document and information contained herein may not be disclosed, copied,
reproduced or distributed to anyone outside Oracle without prior written consent of Oracle. This document is not part of your license agreement nor can it be incorporated into any contractual agreement
with Oracle or its subsidiaries or affiliates.
Lawrence To & Joe Meeks, Oracle
Jeffrey McCormick, The Hartford
What They Didn't Print in the Doc
HA Best Practices by Gurus from Oracle’s Maximum Availability Architecture Team
Agenda
• Maximum Availability Architecture (MAA)
• The Hartford and MAA
• HA Best Practices, Tips and Results
  • Turbocharged Data Guard
  • Oracle Snapshots and Clones
  • More Uptime for Planned Downtime
  • Transparent Client Failover for Disaster Recovery
Maximum Availability Architecture - MAA
• Oracle recommended architecture and best practices for High Availability
  • Database, Application Server, Enterprise Manager, Collaboration Suite and Oracle Applications
• Improved and validated with new Oracle versions, features and product suites
• Focused on reducing unplanned and planned downtime
• Focused on making customers successful
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
Our Approach
• Develop HA solutions and features
• Work closely with different development teams
  • Provide feedback early in the development cycles
  • Integrate features and test before and after release
• Deploy MAA on internal production systems
• Design and influence future solutions and features
• Partner with strategic infrastructure providers
• Document in best practice books and white papers
35 Person Years of Effort & Growing
Strategic MAA Partners
• Servers: Dell, HP
• Network: F5, Qlogic, Foundry Networks, Emulex
• Storage: Apple, Engenio, NetApp, HP, EMC
Our success – measured by the response from customers like you . . .
Jeff McCormick, Senior Data Architect, The Hartford
• $22.7 billion in revenue
• Leading provider of investment products, life insurance, employee benefits, auto, homeowner & business insurance
• Largest seller of individual annuities in U.S.
• 11,000 agencies, 100,000 broker/dealers
• 30,000 employees
Architecture Review
• Focus on Business Continuity
• Assess information technology architectures
• Minimize/avoid planned & unplanned downtime
• Rapid recovery/failover to remote location
• Provide excellent service at lowest cost
• Retain flexibility to incorporate new technology
The Hartford – Future State
[Diagram: Primary Site, Secondary Site and Tertiary Site. Labels in the diagram: Primary (x2), Real Application Cluster, Application Access, Database REDO (x2), Data Guard Standby (x4), Storage Array (x3), RMAN (x2), Media Server (x2), Tape Drive (x2).]
The Value of MAA to The Hartford
• Simple . . .
Implement a High Availability solution that offers considerable savings in cost, resources, and time.
MAA Best Practices
Lawrence To, Oracle
Turbocharged Data Guard
Disaster Recovery Solution for Oracle Databases
Data Guard Best Practices
• Test results show significant out-of-the-box improvements with Data Guard Release 10.2
  • Reduction of failover times, potential data loss and primary database impact
  • More efficient redo transport
• A Data Guard SYNC implementation has less impact than a remote mirroring implementation
New Data Guard Feature: Fast-Start Failover
• Automatic and fast
  • Logical standby: failover achieved in < 20 seconds
  • Physical standby: failover achieved in < 20 seconds
• Old primary is reinstated automatically once connectivity is reestablished between observer and primary database
Attend Session 937, “Best Practices for Automatic Failover Using Oracle Data Guard 10g Release 2”
Data Guard Best Practices: Switchover for Planned Maintenance
• For fastest switchover (< 1 minute)
  • Prior to switchover
    • Restart a physical standby that is transitioning from read-only back to Redo Apply
    • Disconnect all sessions and stop job processing
    • Shutdown abort all secondary RAC instances
    • Enable real-time apply on the standby database and ensure the standby is synchronized or caught up with the primary database
  • For manual switchovers, open the new primary directly from the mount state
  • Or, simulate a Fast-Start Failover: complete transactions and shutdown abort all primary instances
Data Guard Best Practices: Faster Redo Transport
• Set SDU=32K
• Tune network parameters that affect network buffer sizes and queue lengths
• Ensure sufficient network bandwidth for maximum database redo rate + other activities
Note: Please refer to MAA paper, “Oracle9i Data Guard: Primary Site and Network Configuration Best Practices”
http://www.oracle.com/technology/deploy/availability/pdf/MAA_DG_NetBestPrac.pdf
Oracle 10g Release 2 paper coming soon
Data Guard Best Practices: Tune Network Parameters
• Send and receive buffer size = 3 x bandwidth delay product (BDP)
  BDP = 1,000 Mbps * 25 ms (0.025 secs)
      = 1,000,000,000 bits/sec * 0.025 secs
      = 25,000,000 bits / 8
      = 3,125,000 bytes
  Buffer size = 3 x 3,125,000 = 9,375,000 bytes
• Tune network device queues to eliminate packet losses and waits. Set device queues to a minimum of 10,000 (default 100)
* BDP = the product of the estimated minimum bandwidth and the round trip time between the primary and standby server
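The buffer sizing rule on this slide can be checked with a few lines of Python. This is a minimal sketch that only reproduces the slide's arithmetic; the function names are illustrative and not part of any Oracle tool, and integer arithmetic is used so the result matches the slide exactly.

```python
def bdp_bytes(bandwidth_bits_per_sec: int, rtt_ms: int) -> int:
    """Bandwidth delay product in bytes: bandwidth x round trip time / 8."""
    bits_in_flight = bandwidth_bits_per_sec * rtt_ms // 1000
    return bits_in_flight // 8


def socket_buffer_bytes(bandwidth_bits_per_sec: int, rtt_ms: int, multiplier: int = 3) -> int:
    """Send/receive buffer size using the slide's rule of 3 x BDP."""
    return multiplier * bdp_bytes(bandwidth_bits_per_sec, rtt_ms)


if __name__ == "__main__":
    # Slide example: 1,000 Mbps link with a 25 ms round trip time.
    bandwidth = 1_000_000_000
    rtt_ms = 25
    print("BDP (bytes):        ", bdp_bytes(bandwidth, rtt_ms))
    print("Buffer size (bytes):", socket_buffer_bytes(bandwidth, rtt_ms))
```

Substituting the estimated minimum bandwidth and measured round trip time between your own primary and standby servers gives the corresponding buffer size for that link.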
Impact of Network Tuning
[Bar chart, Oracle MAA test result: network throughput in Mbits/sec. Tuned: 937; Default: 10.8.]
Data Guard Release 10.2 Redo Transport Improvements
• Increased network write sizes to 10 MB to better utilize network capacity for both ARCH and LNS
• Full decoupling of LGWR and LNS processes
  • No more waits during log switches
  • No more waits when LNS buffer is full
• Intra-file parallelism support for ARCH
  • Up to 29 parallel remote archive processes
  • Dedicated local ARCH
Faster ASYNC Transport
[Bar chart: time to transfer 1 GB of redo (secs) by network latency (0 ms, 10 ms, 50 ms, 100 ms), comparing 10gR2 with previous versions. Bar values as extracted: 52, 63, 54, 77, 74, 155, 102, 264.]
ARCH Performance Gains
[Bar chart, ARCH Intrafile Parallelism: effective transfer rate (MB/sec) by number of parallel ARCH processes. 1 process: 12.8; 2: 17.1; 3: 23.0; 4: 24.4; 5: 27.8.]
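To see how the gain tails off as ARCH processes are added, the charted rates can be turned into speedup and per-process efficiency figures. This is a rough sketch: it assumes the five charted values (12.8, 17.1, 23.0, 24.4, 27.8 MB/sec) correspond to 1 through 5 parallel ARCH processes in that order, and the helper names are illustrative only.

```python
# Effective transfer rate (MB/sec) by number of parallel ARCH processes,
# taken from the chart; the 1..5 ordering is an assumption.
ARCH_RATE_MB_PER_SEC = {1: 12.8, 2: 17.1, 3: 23.0, 4: 24.4, 5: 27.8}


def speedup(processes: int) -> float:
    """Transfer rate relative to a single ARCH process."""
    return ARCH_RATE_MB_PER_SEC[processes] / ARCH_RATE_MB_PER_SEC[1]


def efficiency(processes: int) -> float:
    """Speedup divided by process count (1.0 would be perfect scaling)."""
    return speedup(processes) / processes


def seconds_to_ship(gap_mb: float, processes: int) -> float:
    """Estimated time to ship an archive gap of gap_mb at the charted rate."""
    return gap_mb / ARCH_RATE_MB_PER_SEC[processes]


if __name__ == "__main__":
    for n in sorted(ARCH_RATE_MB_PER_SEC):
        print(f"{n} ARCH: {ARCH_RATE_MB_PER_SEC[n]:.1f} MB/sec, "
              f"speedup {speedup(n):.2f}x, efficiency {efficiency(n):.0%}, "
              f"1 GB gap in {seconds_to_ship(1024, n):.0f} s")
```

The rates come from one MAA test configuration, so treat the output as a way to reason about diminishing returns rather than as a prediction for another network.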
Data Guard Best Practices: Gap Resolution and Data Loss
• For fastest gap resolution
  • Leverage intra-file archive parallelism
  • Follow tips for tuning redo transport to improve network utilization
• To minimize data loss
  • Use SYNC transport with a low latency and high bandwidth network
  • For ASYNC transport, follow tips for tuning redo transport
  • Example: Less than 7 seconds of data loss exposure for high redo rates of 2-12 MB/sec with <=25 ms latency in our tests
Data Guard Best Practices: Reduce Overhead on Primary
New Data Guard 10g Release 2 ASYNC Transport
• Less primary overhead across different latencies and throughput
• NEW: LNS reads directly from the Online Redo Logs
Best Practice
• Allocate additional I/O bandwidth for Online Redo Log Files
Performance Gains
• For redo rates less than 2 MB/sec, there is less than 5% impact on the primary database across different latencies
• For very high redo rates of 20 MB/sec, less than 10% impact on primary database even with latencies of 50 and 100 ms
• Overall, Oracle 10g Release 2 database throughput (redo rate) was 2-3 times faster than 10gR1 at high redo rates and latencies
Data Guard Best Practices: Reduce Overhead on Primary
Offload Backups to Standby Database
• Eliminate backup overhead on primary database
• RMAN enables hot backups of the standby database
Best Practices
• Use Redo Apply (Physical Standby)
• For simplicity, use identical directory structures on the primary and standby databases
  • Directory structures can be different – see best practice paper for details
• Use RMAN Recovery Catalog so that backups taken on one database server can be restored on another
  • Use a catalog server physically separate from primary and standby sites
• Reference MAA RMAN/Data Guard best practices paper
  • http://www.oracle.com/technology/deploy/availability/pdf/RMAN_DataGuard_10g_wp.pdf
Data Guard Sync Transport Less Overhead than Remote Mirroring
RTT      Data Guard       Remote Mirroring
0 ms     4% DB Impact     3% DB Impact
10 ms    4% DB Impact     26% DB Impact
15 ms    No Data          39% DB Impact
20 ms    10% DB Impact    No Data
Actual Customer Test Data
Data Guard Advantage Because …
• DG only transmits the redo, in contrast to all the DB writes
  • DBWR – database writer
  • LGWR – log writer
  • ARCH – archiver
  • RVWR – flashback log writer
• Higher wait times for DBWR (db file parallel writes) result in
  • Contention for free buffers
  • Increase in buffer busy waits
Oracle Snapshots and Clones
An alternative to third-party snapshots and database clones
Database Restore Points: Database Snapshots
• Business Needs
  • Database snapshots for quick backups and restores
  • Fast, instantaneous snapshots with little overhead
• Oracle Solution
  • Database [guaranteed] restore points
    Create restore point <snap1> [guarantee flashback database]
  • A restore point captures only one before-image block for every changed block, regardless of how many times the block has been changed
  • The time to flash back to a restore point is proportional to the time needed to copy the changed blocks back and apply a small amount of recovery
  • Not appropriate as a replacement for full backups
Database Restore Points: An Alternative for Snapshots
• Database Restore Points
  • No additional cost from a different vendor
  • Creating a restore point is instantaneous
  • No hot backup is required
  • Database is consistent after flashback
  • Fewer system resources than a full backup if flashback is disabled
  • Leverage as a fallback or checkpoint mechanism to protect from logical failures or for quick restores in test environments
• Best Practice: Monitor space and I/O performance
  • Monitor space utilization from v$restore_point and v$flashback_database_stat
  • Monitor for high flashback buffer free wait events
• More benefits for larger databases
Database Restore Points: Use Cases
• Fast fallback for database patches, upgrades, application changes or batch jobs
  • Upgrade from 10.1.0.4 to 10.2.0.1 ==> 1+ hours
  • Flashback prior to upgrade ==> 2 minutes, compared to hours to restore the database
• Quick restore of test environments to original state
  • Change 1% (5 GB), Flashback ==> 10 minutes, compared to 100 minutes to restore the 500 GB database
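The test-environment numbers above suggest a simple way to estimate when flashback stops being faster than a full restore. The sketch below is a back-of-the-envelope model only: it assumes flashback time scales linearly with the volume of changed blocks and restore time scales linearly with database size, with both rates derived from the single data point on this slide. Real rates depend on I/O bandwidth and workload, and the function names are illustrative.

```python
# Rates implied by the slide's test-environment example (assumed linear).
FLASHBACK_MIN_PER_CHANGED_GB = 10 / 5   # 5 GB changed -> 10 minutes
RESTORE_MIN_PER_GB = 100 / 500          # 500 GB database -> 100 minutes


def estimate_flashback_minutes(changed_gb: float) -> float:
    """Estimated time to flash back, proportional to changed data."""
    return changed_gb * FLASHBACK_MIN_PER_CHANGED_GB


def estimate_restore_minutes(database_gb: float) -> float:
    """Estimated time for a full restore, proportional to database size."""
    return database_gb * RESTORE_MIN_PER_GB


def break_even_changed_gb(database_gb: float) -> float:
    """Changed volume at which flashback and full restore take equally long."""
    return estimate_restore_minutes(database_gb) / FLASHBACK_MIN_PER_CHANGED_GB


if __name__ == "__main__":
    db_gb = 500
    for changed in (5, 25, 50):
        print(f"{changed} GB changed: flashback ~{estimate_flashback_minutes(changed):.0f} min, "
              f"full restore ~{estimate_restore_minutes(db_gb):.0f} min")
    print(f"Break-even at ~{break_even_changed_gb(db_gb):.0f} GB changed")
```

Under these assumptions the break-even point for the 500 GB example is reached once roughly a tenth of the database has changed, which is consistent with using restore points for short-lived fallback rather than as a replacement for backups.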
Data Guard, Flashback and RMAN: Database Clones
• Business need
  • Users need copies or clones of their primary database for testing, development, reporting
  • Typically clones are refreshed daily or weekly
• Oracle has all the tools to create a clone without the need for third-party products
Data Guard, Flashback and RMAN: Creating and Resynching a Clone
[Diagram: cycle between a Physical Standby Database and a Read/Write Clone of Primary.
1. Create restore point
2. Activate standby for testing (Standby >> Clone)
3. Flashback clone to restore point (Clone >> Standby)
4. Resync with incremental backup or archives from primary]
Steps to Clone and Resync
• Step 1: Activate Clone
  • Create Restore Point Guarantee Flashback Database (instantaneous)
  • Activate Standby Database (clone)
• Step 2: Use Clone for Read-Write Testing
• Step 3: Resync Clone
  • Flashback to Restore Point
  • Create Incremental Backup from the Primary containing all changes since the time of the restore point
  • Apply Incremental to the clone
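Since these steps need to be scripted, it can help to lay them out as an ordered plan before automating them. The sketch below only assembles the statements as text; it does not connect to a database. The restore point name, SCN and backup directory are hypothetical placeholders, the convert-to-standby and catalog statements are assumptions not shown on the slide, and supporting steps (cancelling Redo Apply, mounting, opening or restarting instances) are omitted. Check the exact statements against the Data Guard and RMAN documentation for your release.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    phase: str    # slide step this command belongs to
    target: str   # database where the command runs
    tool: str     # SQL*Plus or RMAN
    command: str


def clone_resync_plan(restore_point: str, from_scn: int, backup_dir: str) -> List[Step]:
    """Ordered command plan for activating and later resynching a clone.

    All three arguments are illustrative placeholders supplied by the caller.
    """
    activate, resync = "Step 1: Activate Clone", "Step 3: Resync Clone"
    return [
        Step(activate, "standby", "SQL*Plus",
             f"CREATE RESTORE POINT {restore_point} GUARANTEE FLASHBACK DATABASE;"),
        Step(activate, "standby", "SQL*Plus",
             "ALTER DATABASE ACTIVATE STANDBY DATABASE;"),
        # Step 2 (read-write testing on the clone) has no fixed commands.
        Step(resync, "clone", "SQL*Plus",
             f"FLASHBACK DATABASE TO RESTORE POINT {restore_point};"),
        # Assumption: the flashed-back clone is converted back to a standby.
        Step(resync, "clone", "SQL*Plus",
             "ALTER DATABASE CONVERT TO PHYSICAL STANDBY;"),
        Step(resync, "primary", "RMAN",
             f"BACKUP INCREMENTAL FROM SCN {from_scn} DATABASE FORMAT '{backup_dir}/resync_%U';"),
        # Assumption: backup pieces are copied to the standby host and cataloged.
        Step(resync, "standby", "RMAN", f"CATALOG START WITH '{backup_dir}/';"),
        Step(resync, "standby", "RMAN", "RECOVER DATABASE NOREDO;"),
    ]


if __name__ == "__main__":
    for step in clone_resync_plan("before_test", 750983, "/tmp/clone_resync"):
        print(f"[{step.phase}] {step.target} ({step.tool}): {step.command}")
```

Keeping the plan as data makes it straightforward to review the order of operations before wiring each command into whatever scheduler or wrapper script runs it.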
Clone Performance: Resync vs Recreate
[Bar chart: time in minutes. Resync Clone (Parallel): 18.47; Resync Clone (Serial): 23.68; Recreate from Primary: 97.]
Data Guard, Flashback and RMAN: Database Clones
• Oracle clones can be used as an alternative to third-party database cloning solutions
  • No additional cost from a different vendor
  • All features are present in Oracle to create and resync a clone
  • Steps need to be scripted and automated
  • Targeting Enterprise Manager wizard for the future
• Best Practices:
  • Compare performance between Oracle and current approaches
  • Sufficient I/O bandwidth and storage implies faster flashback and resync performance
  • Enable block change tracking on the primary
  • Use RMAN parallelism
More Uptime During Planned Downtime
Reducing Planned Downtime
Best Practice:
• Pick the right strategy
• Test, test, test and automate
Reducing Planned Downtime Best Practices
• Dynamic resource provisioning, online redefinition and online reorganization reduce planned downtime
  • Detect new processors from an SMP server
  • Dynamically grow, shrink and tune memory
  • Table and index modifications
  • How?: Automatic Shared Memory Management
  • How?: Online physical and logical table changes
  • How?: Online index operations
Reducing Planned Downtime Best Practices
• ASM eliminates downtime for
  • Storage maintenance and storage migration
  • How?: Automatic data rebalance
• RAC rolling upgrade eliminates downtime for
  • Patching and system maintenance
  • How?: Rolling upgrade with qualified patches
  • How?: Service relocation
Reducing Planned Downtime Best Practices
• Data Guard SQL Apply minimizes downtime for
  • Node, system, cluster, and site maintenance
  • Database upgrades
  • How?: Fast switchover < 5 minutes and no additional downtime for upgrade steps
  • How?: Rolling upgrades (starting w/ 10.1.0.3)
• Best upgrade approach if RAC rolling upgrade is not possible and there are no data type restrictions
Reducing Planned Downtime Best Practices
• Streams approach eliminates or minimizes downtime for
  • Database upgrades
  • Platform migration (e.g. Windows to Linux)
  • Character set migration
  • How?: Support heterogeneous versions in active/active mode
  • How?: Support heterogeneous platforms
  • How?: Automatic conversion between character sets
• Best upgrade approach for customers that are currently using Streams
Data Type Restrictions: Data Guard SQL Apply and Streams
• Unsupported data types
  • BFILE, ROWID, user-defined types
  • Collections and VARRAYs
  • XML types
  • Multimedia types
• With Streams, you can work around some data type restrictions:
  • Use triggers to capture changes from an unsupported table to a “shadow” table that has supported data types
  • Replicate the “shadow” table changes
  • Use customized apply to apply the changes to the original tables on the target database
Reducing Planned Downtime Best Practices
• Transportable Tablespace reduces planned downtime for
  • Platform migration (e.g. Windows to Linux)
  • Database upgrade
  • How?: Cross-platform datafile conversion
  • How?: Transport tablespaces to new version
Transportable Tablespace to Minimize Downtime for Upgrades
When to use
• Logical standby and Streams are not best-fit solutions
• Time to run the upgrade or migration scripts is greater than the time to export and import the metadata
Phase 1: Preparation
1. Create shell of target database using new version
2. Create schemas in target database
3. Create physical standby if source and target hosts are different
Phase 2: Transport database
Source database
1. Remove transport violations, if any
2. Make user tablespaces read-only
3. Export tablespace metadata
Target database
1. Recover standby and shut down
2. Use datafiles for target database
3. Import tablespace metadata
4. Make user tablespaces read-write
Transportable Tablespace to Minimize Downtime for Upgrades
• Customer Example: AMADEUS
  • Upgrade electronic ticketing system from Oracle 9.2.0.3 on HP N Class to 10.1.0.4 on HP Superdome
  • Total downtime 8 minutes (compared with 25 minutes for a normal upgrade)
http://www.oracle.com/technology/deploy/availability/pdf/AmadeusProfile_TTS.pdf
http://www.oracle.com/technology/deploy/availability/pdf/AmadeusProfile.pdf
Transparent Client Failover for Disaster Recovery
Client Failover: Oracle Database 10g Release 2
• Fast Application Notification Prerequisites
  • Oracle 10g Release 2 OCI clients
    • Server-side TAF enabled with AQ_HA_NOTIFICATION=TRUE
    • FAN OCI event using AQ notifies OCI mid-tier clients automatically
  • Oracle 10g JDBC clients
    • Fast Connection Failover enabled
Client Failover with Data Guard 10g Release 2: Validated Solution
• Data Guard failover can complete in seconds
• DB_ROLE_CHANGE database trigger can be configured to automatically . . .
  1. Enable production database services using DBMS_SERVICE
  2. Change LDAP or DNS or some naming service to ensure that clients reconnect to the new available primary database
  3. Call any other application pre-failover steps
  4. Notify JDBC clients with external program to publish FAN ONS events
• FAN OCI event using AQ notifies OCI mid-tier clients automatically (10gR2)
MAA Best Practice Home Page
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
HA Best Practices for Oracle Database
• Oracle Database High Availability Overview 10g Release 2 - Documentation
• Oracle Database High Availability Architecture and Best Practices 10g Release 1 - Documentation
• Oracle Database 10g Best Practices: Data Guard Redo Apply and Media Recovery
• Oracle Database 10g Best Practices: Data Guard SQL Apply
• Oracle Database 10g Best Practices: Data Guard Role Transitions and Streams
• Using Recovery Manager with Oracle Data Guard in Oracle Database 10g
• Oracle Database 10g Best Practices: Migration to Automatic Storage Management (ASM)
• Best Practices for Creating a Low-Cost Storage Grid for Oracle Databases
• Oracle9i Data Guard: Primary Site and Network Configuration Best Practices
• Oracle9i Fast-Start Checkpointing Best Practices
HA Best Practices for Oracle Application Server
• OracleAS 10g Infrastructure Highly Available Architectures
• Highly Available Distributed Identity Management
• Highly Available Identity Management Deployment Example - Rack Mounted Identity Management
• Highly Available Identity Management Deployment Example - Cold Failover Cluster Identity Management
• Configuring Highly Available OracleAS Infrastructure With F5's BIG-IP Load Balancer
• Oracle9i Application Server Cold Failover Cluster Infrastructure Upgrade to Oracle Application Server 10g Cold Failover Cluster
• Transformation From A Single Host Oracle Application Server Infrastructure To An Oracle Application Server 10g Cold Failover Cluster
HA Best Practices for Oracle Applications & Oracle Collaboration Suite
• Configuring Oracle Applications Release 11i with 10g RAC and 10g ASM
• E-Business Suite 11i on RAC: Configuring Database Load balancing & Failover
• Oracle E-Business Suite Release 11i with 9i RAC: Installation and Configuration using AutoConfig
• Business Continuity for Oracle Applications Release 11i
• Oracle Collaboration Suite High Availability Configuration Release 2 (9.0.4) for UNIX and Linux
HA Best Practices for Oracle Grid Control
• Configuring Enterprise Manager for High Availability
• Enterprise Manager 10g Backup, Recovery and Disaster Recovery Considerations
High Availability Demos/Sessions From Oracle Development
Sessions - Monday, Sep 19
• 1:30-2:30 pm, Room 303 - Optimizing Linux I/O
• 3:00-4:00 pm, Room 104 - The Future of Database Information Technology
• 4:30-5:30 pm, Room 103 - What They Didn't Print in the DOC - HA Best Practices by Gurus from Oracle's Maximum Availability Architecture Team
Sessions - Tuesday, Sep 20
• 3:00-4:00 pm, Room 104 - Logical Standby Unleashed
• 4:30-5:30 pm, Room 104 - Best Practices for Oracle Database 10g Backup and Recovery
Demogrounds - Monday, Sep 19 – Thursday, Sep 22
• Oracle Data Guard
• ILM and Storage
• Oracle Secure Backup
• RMAN, Flashback, and Online Redefinition
High Availability Sessions From Oracle Development
Sessions - Wednesday, Sep 21
• 11:00 am-12:00 pm, Room 104 - Improve Your Tape Backup Results with Oracle Secure Backup
• 3:00-4:00 pm, Room 304 - Implementing Information Lifecycle Management (ILM) using the Oracle Database
Sessions - Thursday, Sep 22
• 1:00-2:00 pm, Room 104 - Minimizing Application Development Time Using Flashback: A Customer Case Study
• 2:30-3:30 pm, Room 104 - Best Practices To Achieve Business Continuity Using Oracle Applications and Oracle Database Technology
• 4:00-5:00 pm, Room 104 - Best Practices for Automatic Failover Using Oracle Data Guard 10g Release 2
Questions & Answers