rac sig rds webcast


Post on 11-Feb-2018


7/23/2019 Rac Sig Rds Webcast

http://slidepdf.com/reader/full/rac-sig-rds-webcast 1/39

Next-Generation RAC Interconnect Protocol: InfiniBand and Reliable Datagram Sockets (RDS)

Paul Tsien, Oracle
William Song, JDA Software Group, Inc. (formerly Manugistics, Inc.)


Agenda

• What is RDS (Reliable Datagram Sockets)?
• Open Source RDS for Linux
• Beta Customer Experience
• JDA’s Oracle RDS Project


What is RDS (Reliable Datagram Sockets)?


Oracle RAC IPC

• RAC IPC
  • Thousands of processes
  • 200K+ associations (not connections)
  • 64 nodes
• Oracle IPC Usage
  • New grid aware applications will significantly increase IPC utilization
  • Approach database I/O rates
  • Very large messages


RDS Vision Statement

• A low overhead, low latency, high bandwidth, ultra reliable, supportable, IPC protocol and transport system
• Which matches Oracle’s existing IPC models for RAC communication
• Optimized for transfers from 200 bytes to 8 MB


Goal and Objective

• Support for a reliable datagram IPC
• Based on Socket API
• Minimal code change / testing for Oracle
• Runs over InfiniBand, 10 Gig Ethernet, and iWARP
• 6 month validation / certification for RAC


Goal and Objective

• Leverage InfiniBand’s built-in availability and load balancing features
  • Port fail-over on the same HCA
  • HCA fail-over on the same system
  • Automatic load balancing


Reliable Datagram IPC

• UDP – Oracle adds reliable delivery via user mode wire protocol engine
• Two sockets per process, thousands of messages on wire
• Slow send times (windowing, acks, retrans)
• Holds together, but degenerates under CPU load
• Well tested!
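
To make the cost of user-mode reliability in the list above concrete, here is a minimal sketch of reliable delivery layered on an unreliable datagram channel. It is a generic stop-and-wait scheme over a simulated lossy channel, not Oracle's wire protocol, which the slides do not describe. `LossyChannel` and `send_reliably` are hypothetical names chosen for illustration.

```python
class LossyChannel:
    """In-memory stand-in for UDP (an assumption for illustration).

    Every datagram gets a running index; indices in `drop` are lost.
    """

    def __init__(self, drop=()):
        self.drop = set(drop)
        self.count = 0

    def deliver(self, datagram):
        index = self.count
        self.count += 1
        return None if index in self.drop else datagram


def send_reliably(messages, channel, max_tries=5):
    """Stop-and-wait: resend each message until its ack arrives.

    Returns (messages accepted by the receiver, data transmissions used).
    """
    received = []       # receiver side: in-order, de-duplicated payloads
    transmissions = 0   # sender side: data datagrams put on the wire
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            arrived = channel.deliver((seq, payload))
            if arrived is None:
                continue  # data lost: sender times out and retransmits
            if arrived[0] == len(received):
                received.append(arrived[1])  # new message, accept it
            # duplicates are discarded but still acknowledged
            if channel.deliver(("ack", seq)) is not None:
                break  # ack got through, move on to the next message
        else:
            raise RuntimeError(f"message {seq} not acknowledged")
    return received, transmissions


# Datagram 1 (an ack) and datagram 4 (a data packet) are dropped.
print(send_reliably(["a", "b", "c"], LossyChannel(drop={1, 4})))
```

With two losses, three messages take five data transmissions plus the acks, and all of that bookkeeping runs in the application process. A reliable datagram transport provided by the O/S moves this work out of user mode.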


Available Options

• uDAPL / ITAPI – not supporting
• IP over IB – high CPU overhead
• SDP – connection oriented
• We want to take our existing well tested UDP module, shut off most of it to run over an O/S provided RD IPC


RDS IPC over InfiniBand

• RDS – Reliable Datagram IPC over IB co-developed by Oracle and QLogic (former SilverStorm Technologies)
• Minimal Oracle code change
• Previously certified SilverStorm InfiniBand stack (UDP/IPoIB)
• Stable code, easily passed all Oracle regression tests
• Supports fail-over across and within HCAs
• Oracle internal interconnect test tool shows
  • 50% less CPU than IP over IB, UDP
  • ½ latency of UDP (no user-mode acks)
  • 50% faster cache-to-cache Oracle block throughput


Open Source RDS

• SilverStorm/QLogic RDS contributed to OpenFabrics (Industry Consortium)
• Oracle is building interconnect-agnostic Open Source RDS for Linux
  http://oss.oracle.com/projects/rds/
• Oracle will support RDS on Linux
• Oracle RDS will be pulled into OFED
• Oracle RDS will support InfiniBand, 10 Gig Ethernet, and iWARP


RDS Status

• Oracle support for SilverStorm/QLogic RDS GA in 10.2.0.3
• RDS beta testing completed, excellent performance and stability
• Open Source RDS
• Oracle is developing/testing Open Source RDS on InfiniBand
• All tier one Unix system vendors are developing/testing RDS


Current 1TB TPC-H Results (as of 10/23/06)

System                                   | QphH   | Price/QphH | Availability | Database                                             | Operating System                       | Cluster
-----------------------------------------+--------+------------+--------------+------------------------------------------------------+----------------------------------------+--------
HP Integrity Superdome Enterprise Server | 68,100 | 59.00 US $ | 01/18/06     | Oracle Database 10g R2 Enterprise Edt w/Partitioning | HP UX 11.i V2 64 bit                   | N
PANTAmatrix 32P (64-core) Cluster        | 59,356 | 24.94 US $ | 10/23/06     | Oracle 10g R2 RAC with Partitioning                  | Red Hat Enterprise Linux AS 4 Update 3 | Y
IBM eServer xSeries 346                  | 53,451 | 32.80 US $ | 02/14/05     | IBM DB2 UDB 8.2                                      | SUSE Linux Ent Server 9                | Y
HP ProLiant DL585 Cluster 48P            | 35,141 | 59.93 US $ | 10/21/04     | Oracle 10g RAC with Partitioning                     | Red Hat Enterprise Linux AS 3          | Y

World Record clustered TPC-H Performance and Price/Performance

Large Price/Performance Improvement over current #1 TPC-H performance entry (HP Superdome)


Beta Customer Experience


Customer Requirements

• Improve application performance (throughput and latency)
• Maintain data availability
• Lower TCO through commodity hardware and improve performance/scalability
• Want to implement Grid and Utility computing


Results

• RDS/IB shows significant real world application performance gains for certain workloads: DW, DSS and mixed Batch/OLTP workloads
  • Throughput and latency
• Customers are interested in unified fabric for cost and manageability reasons
  • Reservation/QoS


JDA Software Group, Inc.

Application Test Participants:

JDA Software Group, Inc.
QLogic Corporation (former SilverStorm Technologies)
Oracle Corporation
Intel Corporation


Overview

• Fundamental Benefits of Oracle RAC

• JDA Grid Computing Architecture

• Characteristics of JDA Grid Computing

• Why RAC, InfiniBand and RDS

• RAC Gigabit Ethernet Test

• RAC RDS InfiniBand Test

• SMP Test

• Commodity RAC vs. SMP Comparison


JDA Reasons for Testing RAC

• Performance
  • Deterministic database performance in a clustered environment
• Scalability
  • Scale a RAC database by adding instances to the cluster database
• Fault Tolerance
  • A RAC database is made up of multiple instances. While performance may degrade, loss of an instance does not bring down the entire database
• Cost
  • Remove the barrier to entry by reducing the cost of the initial implementation
  • Provide incremental scalability by allowing RAC instances to be added to the cluster without losing value in the initial investment of servers
  • Reduce the total cost of implementation through lower capital and ongoing maintenance costs
• Complete our Grid Computing Architecture by bringing it to the database tier


Why JDA Applications?

• JDA’s Strategic Supply and Demand Management (SSDM) applications are rigorous, intense, and demanding, especially at the database tier, solving very large-scale planning, scheduling, and revenue optimization problems – Enterprise DSS.
• We employ a Grid Computing Architecture at the application tier, while using Oracle as the data store for client input data and algorithm solution output.
• We enable our application scalability and performance by regulating the number of grid computing nodes running across a network of distributed commodity servers at the application tier.


JDA Grid Computing Architecture

• Originally named Service Request Environment (SRE)
• SRE framework is written in PL/SQL – wrapped and resides inside the database
• SRE Computing Nodes are written in Java
  • Autonomous, no single master node, self-sustaining, kill failed nodes, spawn new nodes
  • Multithreaded, multiple concurrent database connections
• The database is the reliable persistent communication layer, media, and channel for all grid computing nodes.
  • Leverage all the advantages of Oracle’s database technology – performance, fault tolerance and scalability
• Runs on any platform: Windows, Sun, AIX, and HP
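
The idea of using the database as the communication channel between master-less grid nodes can be sketched as a shared request table from which each node claims work. This is only an illustration under stated assumptions: SQLite stands in for Oracle, the `requests` table and its columns are hypothetical, and the real SRE is PL/SQL plus Java rather than Python.

```python
import sqlite3


def make_queue(conn, payloads):
    """Create a hypothetical request table and load pending work into it."""
    conn.execute(
        "CREATE TABLE requests ("
        "id INTEGER PRIMARY KEY, payload TEXT, "
        "status TEXT DEFAULT 'pending', owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO requests (payload) VALUES (?)", [(p,) for p in payloads]
    )
    conn.commit()


def claim_next(conn, node_name):
    """Claim one pending request for a node; return (id, payload) or None.

    The conditional UPDATE is the coordination point: if another node
    claimed the row first, no row is updated and this node tries again.
    """
    while True:
        row = conn.execute(
            "SELECT id, payload FROM requests "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to do
        cur = conn.execute(
            "UPDATE requests SET status = 'running', owner = ? "
            "WHERE id = ? AND status = 'pending'",
            (node_name, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row


conn = sqlite3.connect(":memory:")
make_queue(conn, ["plan-sku-batch-1", "plan-sku-batch-2"])
print(claim_next(conn, "node-a"))
print(claim_next(conn, "node-b"))
print(claim_next(conn, "node-c"))
```

No node acts as a dispatcher in this sketch. The table is the only shared state, which is why the database tier becomes the place where grid traffic concentrates.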


JDA Grid Computing Architecture

[Diagram] Ecosystem of Grid Computing Nodes on Commodity Servers: multiple SRE JVMs, each linked by JDBC Thin Connections to the Oracle Database, which holds the PL/SQL SRE.


CPU Utilization at Grid Application Tier

CPU Saturated at 100% Load Throughout The Entire Run.


Database I/O

SAN Disk Storage Utilization Saturated at 100%


Shifting Trend in Deployment Paradigm

Past: Monolithic SMP
• Application
• Database
  Application and Database on Same SMP Server

Present: Mixed Configuration
• Commodity Application Servers
• SMP Database Servers
  Application Tier on Commodity servers; Database Tier

Future: Grid Computing
• All Commodity Servers
  Application Tier on Commodity servers; Database Tier on Commodity Servers


Gig-E Test Configuration

[Diagram] Components shown:
• Oracle RAC 10g Database Tier (Intel Jarrell)
• Private Gig-E Interconnect
• Public Ethernet network to Grid Computing Nodes
• QLogic 5000 InfiniBand switch with FC gateway
• Dell EMC CX500 FC Storage (SP A, SP B)


Gig-E Interconnect Bound

Private Interconnect Cluster Contention

66 minutes


RDS InfiniBand Test Configuration

[Diagram] Components shown:
• Oracle RAC 10g Database Tier (Intel Jarrell)
• Public Ethernet network to Grid Computing Nodes
• QLogic 5000 InfiniBand switch with FC gateway
• Dell EMC CX500 FC Storage (SP A, SP B)
• Link labels: RDS, SRP

• RDS InfiniBand as RAC private interconnect
• QLogic IB VFx direct connect to SAN Storage
• Disk I/O to servers through same IB HCA
• Eliminate need for Fiber Channel HBAs (savings)
• SAN switch optional (more savings)

IB Connection Handles RAC Private Interconnect Traffic and SAN Disk I/O


InfiniBand and RDS Scale

Oracle 10g RAC Scales with InfiniBand & RDS

25 minutes


RDS InfiniBand vs. Gigabit Ethernet

Time to Plan 1 Million SKU

• 66 min. on Gigabit Ethernet
• 25 min. on InfiniBand with RDS

[Bar chart: minutes (0 to 70) to plan 1 Million SKU, Gigabit Ethernet vs. RDS InfiniBand]

62% Improvement with QLogic InfiniBand & RDS
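
The 62% figure follows directly from the two run times on this slide. A short check, using only those numbers (the function name is an illustrative choice):

```python
def percent_reduction(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100


gige_minutes = 66  # Gigabit Ethernet interconnect
rds_minutes = 25   # InfiniBand with RDS

improvement = percent_reduction(gige_minutes, rds_minutes)
speedup = gige_minutes / rds_minutes
print(f"{improvement:.1f}% less time, {speedup:.2f}x faster")
```

Note that the improvement is a reduction in elapsed time relative to the Gigabit Ethernet run. Expressed as throughput, the same result is roughly a 2.6x speedup.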


Gigabit Ethernet vs. InfiniBand Costs

• Ethernet network & Fiber Channel SAN $19,721
  • Popular brand of enterprise class GE switch
  • Popular brand of enterprise class FC switch & HBAs
• InfiniBand unified fabric for RAC $12,825
  • QLogic 5000 multi-protocol InfiniBand switch with FC gateway
  • InfiniBand HCAs

35% cost reduction with QLogic InfiniBand network consolidation
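
The cost reduction can be checked from the two totals on this slide. The slide gives no per-item prices, so the sketch below works only with the totals:

```python
ethernet_fc_cost = 19_721  # Ethernet network & Fiber Channel SAN, in dollars
infiniband_cost = 12_825   # InfiniBand unified fabric for RAC, in dollars

savings = ethernet_fc_cost - infiniband_cost
savings_pct = savings / ethernet_fc_cost * 100
print(f"${savings:,} saved, {savings_pct:.0f}% of the Ethernet + FC total")
```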


SMP Test Configuration

[Diagram] Components shown:
• 8 CPU SMP Server
• Public Ethernet network to Grid Computing Nodes
• Storage (SP A, SP B)


SMP vs. Commodity RAC Performance

Time to Plan 1 Million SKU

• 100 min. on 8 CPU SMP Server
• 25 min. on 4 Commodity Server RAC with RDS InfiniBand

[Bar chart: minutes (0 to 100) to plan 1 Million SKU, 8 CPU SMP Server vs. 4 @ 2 CPU Commodity Server]

75% Performance Improvement on Intel Commodity RAC Servers with InfiniBand & RDS


SMP vs. Commodity Cost

• $120,000: 8 CPU SMP Server
• $20,000: 4 @ 2 CPU Intel EM64T, 1 @ QLogic 5000 InfiniBand Switch

83% Cost Reduction in hardware for Intel Commodity RAC Servers and InfiniBand & RDS vs. SMP Server


Price/Performance of SMP vs. Commodity

• 8 CPU SMP
  • Dollar cost to process 1M SKU = 0.200 [$ hr/SKU]*
  • 1M SKU/100min = 600,000 SKU/hr
  • $120,000
• 4 @ 2 CPU Commodity RAC
  • Dollar cost to process 1M SKU = 0.008 [$ hr/SKU]*
  • 1M SKU/25min = 2,400,000 SKU/hr
  • $20,000

(* Similar to $/TPC-H as applied to the Strategic Supply and Demand Management industry)

96% Price/Performance Improvement on Intel Commodity RAC Servers with InfiniBand & RDS
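
The price/performance metric on this slide is hardware cost divided by throughput. The sketch below reproduces the slide's figures from its own inputs; the function and variable names are illustrative choices, not taken from the presentation.

```python
def dollars_per_throughput(cost_usd, skus, minutes):
    """Hardware cost divided by throughput, in $ hr/SKU."""
    skus_per_hour = skus * 60 / minutes
    return cost_usd / skus_per_hour


smp = dollars_per_throughput(120_000, 1_000_000, 100)  # 8 CPU SMP
rac = dollars_per_throughput(20_000, 1_000_000, 25)    # 4 @ 2 CPU commodity RAC

improvement = (smp - rac) / smp * 100
print(f"SMP: {smp:.3f}  RAC: {rac:.3f}  improvement: {improvement:.0f}%")
```

The 96% combines both effects: the commodity cluster costs one sixth as much and finishes in one quarter of the time, so its cost per unit of throughput is 1/24 of the SMP server's.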


[Summary chart: Speed, Performance, and Lower C… InfiniBand & RDS and Oracle 10g RAC on Commodity Serv…]

Time to plan 1 Million SKU:
• 66 min. Gigabit Ethernet
• 25 min. RDS InfiniBand
• 100 min. 8 CPU SMP Server
• 25 min. 4 Commodity Server RAC, RDS InfiniBand

Hardware cost:
• $120,000 8 CPU SMP Server
• $20,000 4 Commodity Server RAC, RDS InfiniBand


Summary of Improvements

• 62% Speed Improvement on InfiniBand with RDS vs. Gigabit Ethernet
• 75% Performance Improvement on Intel Commodity RAC Servers with InfiniBand vs. SMP Server
• 83% Cost Reduction in Hardware for Intel Commodity RAC Servers with InfiniBand vs. SMP Server
• 96% Price/Performance Improvement on Intel Commodity RAC Servers with InfiniBand


Complete Grid Computing Architecture

Mixed Configuration
• Commodity Application servers
• SMP Database servers
  Application Tier on Commodity servers; Database Tier

Complete Grid Computing Solution
• All Commodity servers
  Application Tier on Commodity servers; Database Tier on Commodity servers

JDA Completes Grid Computing Architecture Solution


For More Information…

• Oracle: Paul Tsien, [email protected], (650) 633-6711
• JDA Software: William Song, [email protected], (301) 255-8353
• QLogic: Gunnar Gunnarsson, [email protected], (301) 529-1811