Page 1: Mma 10g r2_936

“This presentation is for informational purposes only and may not be incorporated into a contract or agreement.”

Page 2: Mma 10g r2_936

This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in this document remains at the sole discretion of Oracle. This document in any form, software or printed matter, contains proprietary information that is the exclusive property of Oracle. This document and information contained herein may not be disclosed, copied, reproduced or distributed to anyone outside Oracle without prior written consent of Oracle. This document is not part of your license agreement nor can it be incorporated into any contractual agreement with Oracle or its subsidiaries or affiliates.

Page 4: Mma 10g r2_936

Lawrence To & Joe Meeks, Oracle

Jeffrey McCormick, The Hartford

Page 5: Mma 10g r2_936

What They Didn't Print in the Doc

HA Best Practices by Gurus from Oracle’s Maximum Availability Architecture Team

Page 6: Mma 10g r2_936

Agenda

• Maximum Availability Architecture (MAA)
• The Hartford and MAA
• HA Best Practices, Tips and Results
  • Turbocharged Data Guard
  • Oracle Snapshots and Clones
  • More Uptime for Planned Downtime
  • Transparent Client Failover for Disaster Recovery

Page 7: Mma 10g r2_936

Maximum Availability Architecture - MAA

• Oracle recommended architecture and best practices for High Availability
• Database, Application Server, Enterprise Manager, Collaboration Suite and Oracle Applications
• Improved and validated with new Oracle versions, features and product suites
• Focused on reducing unplanned and planned downtime
• Focused on making customers successful

http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

Page 8: Mma 10g r2_936

Our Approach

• Develop HA solutions and features
• Work closely with different development teams
  • Provide feedback early in the development cycles
  • Integrate features and test before and after release
• Deploy MAA on internal production systems
• Design and influence future solutions and features
• Partner with strategic infrastructure providers
• Document in best practice books and white papers

35 Person Years of Effort & Growing

Page 9: Mma 10g r2_936

Strategic MAA Partners

• Servers: Dell, HP
• Network: F5, Qlogic, Foundry Networks, Emulex
• Storage: Apple, Engenio, NetApp, HP, EMC

Page 10: Mma 10g r2_936

Our success – measured by the response from customers like you . . .

Page 11: Mma 10g r2_936

Jeff McCormick, Senior Data Architect, The Hartford

• $22.7 billion in revenue
• Leading provider of investment products, life insurance, employee benefits, auto, homeowner & business insurance
• Largest seller of individual annuities in U.S.
• 11,000 agencies, 100,000 broker/dealers
• 30,000 employees

Page 12: Mma 10g r2_936

Architecture Review

• Focus on Business Continuity
• Assess information technology architectures
• Minimize/avoid planned & unplanned downtime
• Rapid recovery/failover to remote location
• Provide excellent service at lowest cost
• Retain flexibility to incorporate new technology

Page 13: Mma 10g r2_936

The Hartford – Future State

[Diagram: Primary Site, Secondary Site and Tertiary Site, each with a storage array. Labeled components: Real Application Cluster primary databases with application access, Data Guard standby databases receiving database redo, and media servers running RMAN with tape drives.]

Page 14: Mma 10g r2_936

The Value of MAA to The Hartford

• Simple . . .

Implement a High Availability solution that offers considerable savings in cost, resources, and time.

Page 15: Mma 10g r2_936

MAA Best Practices

Lawrence To, Oracle

Page 16: Mma 10g r2_936

Turbocharged Data Guard

Disaster Recovery Solution for Oracle Databases

Page 17: Mma 10g r2_936

Data Guard Best Practices

• Test results show significant out-of-the-box improvements with Data Guard Release 10.2
  • Reduction of failover times, potential data loss and primary database impact
  • More efficient redo transport
• A Data Guard SYNC implementation has less impact than a remote mirroring implementation

Page 18: Mma 10g r2_936

New Data Guard Feature: Fast-Start Failover

• Automatic and fast
• Logical standby: failover achieved in < 20 seconds
• Physical standby: failover achieved in < 20 seconds
• Old primary is reinstated automatically once connectivity is reestablished between the observer and the primary database

Attend Session 937, “Best Practices for Automatic Failover Using Oracle Data Guard 10g Release 2”

Page 19: Mma 10g r2_936

Data Guard Best Practices: Switchover for Planned Maintenance

• For fastest switchover (< 1 minute)
  • Prior to switchover
    • A physical standby transitioning from read only back to Redo Apply should be restarted
    • Disconnect all sessions and stop job processing
    • Shutdown abort for all secondary RAC instances
    • Enable real-time apply on the standby database and ensure the standby is synchronized or caught up with the primary database
  • For manual switchovers, open the new primary directly from the mount state
  • Or, simulate a Fast-Start Failover: complete transactions and shutdown abort all primary instances

Page 20: Mma 10g r2_936

Data Guard Best Practices: Faster Redo Transport

• Set SDU=32K
• Tune network parameters that affect network buffer sizes and queue lengths
• Ensure sufficient network bandwidth for maximum database redo rate + other activities

Note: Please refer to the MAA paper “Oracle9i Data Guard: Primary Site and Network Configuration Best Practices”:
http://www.oracle.com/technology/deploy/availability/pdf/MAA_DG_NetBestPrac.pdf

Oracle 10g Release 2 paper coming soon

Page 21: Mma 10g r2_936

Data Guard Best Practices: Tune Network Parameters

• Send and receive buffer size = 3 x bandwidth delay product (BDP)

BDP = 1,000 Mbps * 25 ms (.025 secs) = 1,000,000,000 bits/sec * .025 secs = 25,000,000 bits / 8 = 3,125,000 bytes

• Tune network device queues to eliminate packet losses and waits. Set device queues to a minimum of 10,000 (default 100)

* BDP = the product of the estimated minimum bandwidth and the round trip time between the primary and standby server
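
The BDP arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original presentation: the function names are hypothetical, and the multiplier of 3 is simply the slide's rule of thumb for send and receive buffer sizes.

```python
def bdp_bytes(bandwidth_mbps, rtt_ms):
    """Bandwidth delay product in bytes.

    1 Mbps over 1 ms carries 1,000 bits, so bits = Mbps * ms * 1000.
    """
    bdp_bits = bandwidth_mbps * rtt_ms * 1000
    return round(bdp_bits / 8)


def socket_buffer_bytes(bandwidth_mbps, rtt_ms, multiplier=3):
    """Send/receive buffer size using the slide's '3 x BDP' rule of thumb."""
    return multiplier * bdp_bytes(bandwidth_mbps, rtt_ms)


if __name__ == "__main__":
    # Slide example: 1,000 Mbps link with a 25 ms round trip time.
    print("BDP (bytes):", bdp_bytes(1000, 25))
    print("Buffer size (bytes):", socket_buffer_bytes(1000, 25))
```

For the slide's example link, the sketch gives a BDP of 3,125,000 bytes and therefore a suggested buffer size of 9,375,000 bytes.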

Page 22: Mma 10g r2_936

Impact of Network Tuning

[Chart: network throughput in Mbits/sec. Tuned: 937; Default: 10.8. Oracle MAA test result.]

Page 23: Mma 10g r2_936

Data Guard Release 10.2 Redo Transport Improvements

• Increased network write sizes to 10 MB to better utilize network capacity for both ARCH and LNS

• Full decoupling of LGWR and LNS processes
  • No more waits during log switches
  • No more waits when LNS buffer is full
• Intra-file parallelism support for ARCH
  • Up to 29 parallel remote archive processes
  • Dedicated local ARCH

Page 24: Mma 10g r2_936

Faster ASYNC Transport

[Chart: time to transfer 1 GB of redo (secs) at network latencies of 0 ms, 10 ms, 50 ms and 100 ms, comparing 10gR2 with previous versions. Data labels as extracted: 52, 63, 54, 77, 74, 155, 102, 264.]

Page 25: Mma 10g r2_936

ARCH Performance Gains

[Chart: ARCH intra-file parallelism. Effective transfer rate (MB/sec) by number of parallel ARCH processes: 1 = 12.8, 2 = 17.1, 3 = 23.0, 4 = 24.4, 5 = 27.8.]

Page 26: Mma 10g r2_936

Data Guard Best Practices: Gap Resolution and Data Loss

• For fastest gap resolution
  • Leverage intra-file archive parallelism
  • Follow tips for tuning redo transport to improve network utilization
• To minimize data loss
  • Use SYNC transport with a low-latency, high-bandwidth network
  • For ASYNC transport, follow tips for tuning redo transport
  • Example: Less than 7 seconds of data loss exposure for high redo rates of 2-12 MB/sec with <=25 ms latency in our tests
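
To put the "7 seconds of exposure" figure in terms of redo volume, the following sketch multiplies redo rate by exposure time. It assumes a constant redo rate over the exposure window, which is a simplification of ours and not something the slide states; the function name is hypothetical.

```python
def redo_at_risk_mb(redo_rate_mb_per_sec, exposure_secs):
    """Upper bound on unshipped redo (MB), assuming a constant redo rate."""
    return redo_rate_mb_per_sec * exposure_secs


if __name__ == "__main__":
    # Slide example: redo rates of 2-12 MB/sec with under 7 seconds of exposure.
    for rate in (2, 12):
        print(f"{rate} MB/sec -> at most {redo_at_risk_mb(rate, 7)} MB of redo at risk")
```

Under that assumption, the quoted range corresponds to at most roughly 14 MB to 84 MB of redo not yet received by the standby at the time of a failure.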

Page 27: Mma 10g r2_936

Data Guard Best Practices: Reduce Overhead on Primary

New Data Guard 10g Release 2 ASYNC Transport
• Less primary overhead across different latencies and throughput
• NEW: LNS reads directly from the Online Redo Logs

Best Practice
• Allocate additional I/O bandwidth for Online Redo Log Files

Performance Gains
• For redo rates less than 2 MB/sec, there is less than 5% impact on the primary database across different latencies
• For very high redo rates of 20 MB/sec, less than 10% impact on primary database even with latencies of 50 and 100 ms
• Overall, Oracle 10g Release 2 database throughput (redo rate) was 2-3 times faster than 10gR1 at high redo rates and latencies

Page 28: Mma 10g r2_936

Data Guard Best Practices: Reduce Overhead on Primary

Offload Backups to Standby Database
• Eliminate backup overhead on primary database
• RMAN enables hot backups of the standby database

Best Practices
• Use Redo Apply (Physical Standby)
• For simplicity, use identical directory structures on the primary and standby databases
  • Directory structures can be different; see best practice paper for details
• Use RMAN Recovery Catalog so that backups taken on one database server can be restored on another
  • Use a catalog server physically separate from primary and standby sites
• Reference MAA RMAN/Data Guard best practices paper
  • http://www.oracle.com/technology/deploy/availability/pdf/RMAN_DataGuard_10g_wp.pdf

Page 29: Mma 10g r2_936

Data Guard Sync Transport Less Overhead than Remote Mirroring

RTT      Data Guard       Remote Mirroring
0 ms     4% DB impact     3% DB impact
10 ms    4% DB impact     26% DB impact
15 ms    No data          39% DB impact
20 ms    10% DB impact    No data

Actual Customer Test Data

Page 30: Mma 10g r2_936

Data Guard Advantage Because …

• Data Guard transmits only the redo, in contrast to remote mirroring, which transmits all database writes
  • DBWR – database writer
  • LGWR – log writer
  • ARCH – archiver
  • RVWR – flashback log writer
• Higher wait times for DBWR (db file parallel writes) result in
  • Contention for free buffers
  • Increase in buffer busy waits

Page 31: Mma 10g r2_936

Oracle Snapshots and Clones

An alternative to third-party snapshots and database clones

Page 32: Mma 10g r2_936

Database Restore Points: Database Snapshots

• Business Needs
  • Database snapshots for quick backups and restores
  • Fast, instantaneous snapshots with little overhead
• Oracle Solution
  • Database [guaranteed] restore points
    CREATE RESTORE POINT <snap1> [GUARANTEE FLASHBACK DATABASE]
  • A restore point captures only one before-image for every changed block, regardless of how many times the block has been changed
  • Time to flash back to a restore point is proportional to copying the changed blocks over and applying a small amount of recovery
  • Not appropriate as a replacement for full backups

Page 33: Mma 10g r2_936

Database Restore Points: An alternative for snapshots

• Database Restore Points
  • No additional cost from a different vendor
  • Creating restore point is instantaneous
  • No hot backup is required
  • Database consistent after flashback
  • Fewer system resources than a full backup if flashback is disabled
  • Leverage as a fallback or checkpoint mechanism to protect from logical failures or for quick restores in test environments
• Best Practice: Monitor space and I/O performance
  • Monitor space utilization from v$restore_point and v$flashback_database_stat
  • Monitor for high flashback buffer free wait events
• More benefits for larger databases

Page 34: Mma 10g r2_936

Database Restore Points: Use Cases

• Fast fallback for database patches, upgrades, application changes or batch jobs
  • Upgrade from 10.1.0.4 to 10.2.0.1 ==> 1+ hours
  • Flashback prior to upgrade ==> 2 minutes, compared to hours to restore the database
• Quick restore of test environments to original state
  • Change 1% (5 GB), Flashback ==> 10 minutes, compared to 100 minutes to restore the 500 GB database
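
The test-environment figures can be turned into a rough estimator. This is only a sketch under two assumptions of ours: flashback time scales linearly with the volume of changed data, and restore time scales linearly with database size. The per-GB rates are derived from the single data point on this slide, and the function names are hypothetical.

```python
# Rates derived from the slide: 5 GB changed -> 10 minutes of flashback,
# 500 GB database -> 100 minutes of restore. Linear scaling is an assumption.
FLASHBACK_MIN_PER_CHANGED_GB = 10 / 5
RESTORE_MIN_PER_GB = 100 / 500


def flashback_minutes(changed_gb):
    """Estimated flashback time for a given volume of changed data."""
    return changed_gb * FLASHBACK_MIN_PER_CHANGED_GB


def restore_minutes(db_size_gb):
    """Estimated full-restore time for a database of the given size."""
    return db_size_gb * RESTORE_MIN_PER_GB


def break_even_changed_gb(db_size_gb):
    """Changed volume at which flashback would take as long as a full restore."""
    return restore_minutes(db_size_gb) / FLASHBACK_MIN_PER_CHANGED_GB


if __name__ == "__main__":
    print("Flashback, 5 GB changed (min):", flashback_minutes(5))
    print("Restore, 500 GB database (min):", restore_minutes(500))
    print("Break-even changed volume (GB):", break_even_changed_gb(500))
```

Under these assumptions, flashback stays faster than a full restore of the 500 GB database until roughly 50 GB (10%) of it has changed. Real systems will differ, so measure before relying on the estimate.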

Page 35: Mma 10g r2_936

Data Guard, Flashback and RMAN: Database Clones

• Business need
  • Users need copies or clones of their primary database for testing, development, reporting
  • Typically clones are refreshed daily or weekly
• Oracle has all the tools to create a clone without the need of third party products

Page 36: Mma 10g r2_936

Data Guard, Flashback and RMAN: Creating and Resynching a Clone

1. Create restore point on the physical standby database
2. Activate standby for testing (Standby >> Clone): read/write clone of primary
3. Flashback clone to restore point (Clone >> Standby): physical standby database again
4. Resync with incremental backup or archives from primary

Page 37: Mma 10g r2_936

Steps to Clone and Resync

• Step 1: Activate Clone
  • Create Restore Point Guarantee Flashback Database (instantaneous)
  • Activate Standby Database (clone)
• Step 2: Use Clone for Read-Write Testing
• Step 3: Resync Clone
  • Flashback to Restore Point
  • Create Incremental Backup from the Primary containing all changes since the time of the restore point
  • Apply Incremental to the clone
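
The slide notes elsewhere that these steps need to be scripted. The sketch below only assembles an ordered command plan; it does not connect to a database. The statement text reflects our reading of Oracle 10g Release 2 SQL and RMAN syntax and should be checked against the documentation before use. The restore point name, SCN and staging directory are hypothetical, and prerequisite steps such as stopping Redo Apply or mounting the database are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    where: str    # database on which the command runs
    tool: str     # "SQL" or "RMAN"
    command: str  # statement text (assumed syntax, verify before use)


def clone_and_resync_plan(restore_point, from_scn, stage_dir):
    """Return the ordered commands for Step 1 (activate) and Step 3 (resync)."""
    return [
        # Step 1: activate the clone
        Step("standby", "SQL",
             f"CREATE RESTORE POINT {restore_point} GUARANTEE FLASHBACK DATABASE;"),
        Step("standby", "SQL", "ALTER DATABASE ACTIVATE STANDBY DATABASE;"),
        # Step 2 (read-write testing on the clone) happens here, outside this plan.
        # Step 3: resync the clone
        Step("clone", "SQL", f"FLASHBACK DATABASE TO RESTORE POINT {restore_point};"),
        Step("clone", "SQL", "ALTER DATABASE CONVERT TO PHYSICAL STANDBY;"),
        Step("primary", "RMAN",
             f"BACKUP INCREMENTAL FROM SCN {from_scn} DATABASE "
             f"FORMAT '{stage_dir}/resync_%U';"),
        Step("standby", "RMAN", f"CATALOG START WITH '{stage_dir}/';"),
        Step("standby", "RMAN", "RECOVER DATABASE NOREDO;"),
    ]


if __name__ == "__main__":
    # All three arguments are illustrative values, not from the presentation.
    for number, step in enumerate(clone_and_resync_plan("before_clone", 1234567, "/stage"), 1):
        print(f"{number}. [{step.where}/{step.tool}] {step.command}")
```

Keeping the plan as data rather than executing it directly makes it easy to review the sequence, log it, or hand each command to whatever execution tool is already in use.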

Page 38: Mma 10g r2_936

Clone Performance:Resync vs Recreate

[Chart: time in minutes. Resync Clone (Parallel): 18.47; Resync Clone (Serial): 23.68; Recreate from Primary: 97.]

Page 39: Mma 10g r2_936

Data Guard, Flashback and RMAN: Database Clones

• Oracle clones can be used as an alternative to third party database cloning solutions
  • No additional cost from a different vendor
  • All features are present in Oracle to create and resync a clone
  • Steps need to be scripted and automated
  • Targeting Enterprise Manager wizard for the future
• Best Practices:
  • Compare performance between Oracle and current approaches
  • Sufficient I/O bandwidth and storage implies faster flashback and resync performance
  • Enable block change tracking on the primary
  • Use RMAN parallelism

Page 40: Mma 10g r2_936

More Uptime During Planned Downtime

Page 41: Mma 10g r2_936

Reducing Planned Downtime

Best Practice:
• Pick the right strategy
• Test, test, test and automate

Page 42: Mma 10g r2_936

Reducing Planned Downtime Best Practices

• Dynamic Resource Provisioning, Online redefinition and reorganization reduce planned downtime
  • Detect new processors from an SMP server
  • Dynamically grow, shrink and tune memory
  • Table and index modifications
  • How?: Automatic Shared Memory Management
  • How?: Online physical and logical table changes
  • How?: Online index operations

Page 43: Mma 10g r2_936

Reducing Planned Downtime Best Practices

• ASM eliminates downtime for
  • Storage maintenance and storage migration
  • How?: Automatic data rebalance
• RAC rolling upgrade eliminates downtime for
  • Patching and system maintenance
  • How?: Rolling upgrade with qualified patches
  • How?: Service relocation

Page 44: Mma 10g r2_936

Reducing Planned Downtime Best Practices

• Data Guard SQL Apply minimizes downtime:
  • Node, system, cluster, and site maintenance
  • Database upgrades
  • How?: Fast switchover < 5 minutes and no additional downtime for upgrade steps
  • How?: Rolling upgrades (starting w/ 10.1.0.3)
• Best upgrade approach if RAC rolling upgrade is not possible and there are no data type restrictions

Page 45: Mma 10g r2_936

Reducing Planned Downtime Best Practices

• Streams approach eliminates or minimizes downtime for
  • Database upgrades
  • Platform migration (e.g. Windows to Linux)
  • Character set migration
  • How?: Support heterogeneous versions in active/active mode
  • How?: Support heterogeneous platforms
  • How?: Automatic conversion between character sets
• Best upgrade approach for customers that are currently using Streams

Page 46: Mma 10g r2_936

Data Type Restrictions: Data Guard SQL Apply and Streams

• Unsupported data types
  • BFILE, ROWID, User defined types
  • Collections and VARRAYs
  • XML types
  • Multimedia types
• With Streams, you can work around some data type restrictions by
  • Using triggers to capture changes from an unsupported table into a “shadow” table that has supported data types
  • Replicating the “shadow” table changes
  • Using customized apply to apply the changes to the original tables on the target database

Page 47: Mma 10g r2_936

Reducing Planned Downtime Best Practices

• Transportable Tablespace reduces planned downtime
  • Platform migration (e.g. Windows to Linux)
  • Database upgrade
  • How?: Cross-platform datafile conversion
  • How?: Transport tablespaces to new version

Page 48: Mma 10g r2_936

Transportable Tablespace to Minimize Downtime for Upgrades

When to use
• Logical standby and Streams are not best fit solutions
• Time to run the upgrade or migration scripts is greater than the time to export and import the metadata

Phase 1: Preparation
1. Create shell of target database using new version
2. Create schemas in target database
3. Create physical standby if source and target hosts are different

Phase 2: Transport database

Source database
1. Remove transport violations, if any
2. Make user tablespaces read-only
3. Export tablespace metadata

Target database
1. Recover standby and shutdown
2. Use datafiles for target database
3. Import tablespace metadata
4. Make user tablespaces read-write

Page 49: Mma 10g r2_936

Transportable Tablespace to Minimize Downtime for Upgrades

• Customer Example: AMADEUS
  • Upgrade electronic ticketing system from Oracle 9.2.0.3 on HP N Class to 10.1.0.4 on HP Superdome
  • Total downtime 8 minutes (compared with 25 minutes for normal upgrade)

http://www.oracle.com/technology/deploy/availability/pdf/AmadeusProfile_TTS.pdf
http://www.oracle.com/technology/deploy/availability/pdf/AmadeusProfile.pdf

Page 50: Mma 10g r2_936

Transparent Client Failover for Disaster Recovery

Page 51: Mma 10g r2_936

Client Failover: Oracle Database 10g Release 2

• Fast Application Notification Prerequisites
  • Oracle 10g Release 2 OCI clients
    • Server-side TAF enabled with AQ_HA_NOTIFICATIONS=TRUE
    • FAN OCI event using AQ notifies OCI mid-tier clients automatically
  • Oracle 10g JDBC clients
    • Fast Connection Failover enabled

Page 52: Mma 10g r2_936

Client Failover with Data Guard 10g Release 2: Validated Solution

• Data Guard failover can complete in seconds
• DB_ROLE_CHANGE database trigger can be configured to automatically . . .
  1. Enable production database services using DBMS_SERVICE
  2. Change LDAP or DNS or some naming service to ensure that clients reconnect to the new available primary database
  3. Call any other application pre-failover steps
  4. Notify JDBC clients with external program to publish FAN ONS events
• FAN OCI event using AQ notifies OCI mid-tier clients automatically (10gR2)

Page 53: Mma 10g r2_936

MAA Best Practice Home Page
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

HA Best Practices for Oracle Database
• Oracle Database High Availability Overview 10g Release 2 - Documentation
• Oracle Database High Availability Architecture and Best Practices 10g Release 1 - Documentation
• Oracle Database 10g Best Practices: Data Guard Redo Apply and Media Recovery
• Oracle Database 10g Best Practices: Data Guard SQL Apply
• Oracle Database 10g Best Practices: Data Guard Role Transitions and Streams
• Using Recovery Manager with Oracle Data Guard in Oracle Database 10g
• Oracle Database 10g Best Practices: Migration to Automatic Storage Management (ASM)
• Best Practices for Creating a Low-Cost Storage Grid for Oracle Databases
• Oracle9i Data Guard: Primary Site and Network Configuration Best Practices
• Oracle9i Fast-Start Checkpointing Best Practices

HA Best Practices for Oracle Application Server
• OracleAS 10g Infrastructure Highly Available Architectures
• Highly Available Distributed Identity Management
• Highly Available Identity Management Deployment Example - Rack Mounted Identity Management
• Highly Available Identity Management Deployment Example - Cold Failover Cluster Identity Management
• Configuring Highly Available OracleAS Infrastructure With F5's BIG-IP Load Balancer
• Oracle9i Application Server Cold Failover Cluster Infrastructure Upgrade to Oracle Application Server 10g Cold Failover Cluster
• Transformation From A Single Host Oracle Application Server Infrastructure To An Oracle Application Server 10g Cold Failover Cluster

HA Best Practices for Oracle Applications & Oracle Collaboration Suite
• Configuring Oracle Applications Release 11i with 10g RAC and 10g ASM
• E-Business Suite 11i on RAC: Configuring Database Load balancing & Failover
• Oracle E-Business Suite Release 11i with 9i RAC: Installation and Configuration using AutoConfig
• Business Continuity for Oracle Applications Release 11i
• Oracle Collaboration Suite High Availability Configuration Release 2 (9.0.4) for UNIX and Linux

HA Best Practices for Oracle Grid Control
• Configuring Enterprise Manager for High Availability
• Enterprise Manager 10g Backup, Recovery and Disaster Recovery Considerations

Page 54: Mma 10g r2_936

High Availability Demos/Sessions From Oracle Development

Sessions - Monday, Sep 19
• 1:30-2:30 pm, Room 303 - Optimizing Linux I/O
• 3:00-4:00 pm, Room 104 - The Future of Database Information Technology
• 4:30-5:30 pm, Room 103 - What They Didn't Print in the DOC - HA Best Practices by Gurus from Oracle's Maximum Availability Architecture Team

Sessions - Tuesday, Sep 20
• 3:00-4:00 pm, Room 104 - Logical Standby Unleashed
• 4:30-5:30 pm, Room 104 - Best Practices for Oracle Database 10g Backup and Recovery

Demogrounds - Monday, Sep 19 – Thursday, Sep 22
• Oracle Data Guard
• ILM and Storage
• Oracle Secure Backup
• RMAN, Flashback, and Online Redefinition

Page 55: Mma 10g r2_936

High Availability Sessions From Oracle Development

Sessions - Wednesday, Sep 21
• 11:00 am-12:00 pm, Room 104 - Improve Your Tape Backup Results with Oracle Secure Backup
• 3:00-4:00 pm, Room 304 - Implementing Information Lifecycle Management (ILM) using the Oracle Database

Sessions - Thursday, Sep 22
• 1:00-2:00 pm, Room 104 - Minimizing Application Development Time Using Flashback: A Customer Case Study
• 2:30-3:30 pm, Room 104 - Best Practices To Achieve Business Continuity Using Oracle Applications and Oracle Database Technology
• 4:00-5:00 pm, Room 104 - Best Practices for Automatic Failover Using Oracle Data Guard 10g Release 2

Page 56: Mma 10g r2_936

Questions and Answers
