Extreme Performance Data Warehousing - Oracle Exadata
Saint Kim
Director, Technology Sales Consulting, Oracle Korea
Nov, 2008
© 2008 Oracle Corporation – Proprietary and Confidential


Data Warehouses Growing Rapidly

Tripling In Size Every Two Years

• The challenge is to maintain or improve performance in spite of the exponential increase in size

[Chart: Data Warehouse Size (TB), 0 to 1000, for the years 2000 to 2012; actual and projected]

Source: Winter TopTen Survey, Winter Corporation, Waltham MA, 2008.

– 2 –

Data Warehouse Approaches

• “Brainy Software” approach - use smart database software to minimize the need for hardware
  • OLAP, Bitmap Indexing, Join Indexing, Materialized Views, Result Caches, Range Partitioning, etc.

• “Brawny Hardware” approach - use powerful hardware to perform brute-force scans and joins
  • Brawny hardware approach only works if the rate of scanning data scales with size of data

– 3 –

The Brawny Hardware Challenge
Storage Data Bandwidth

• Current warehouse deployments often have bottlenecks limiting the movement of data from disks to servers
  • Storage Array internal bottlenecks on processors and Fibre Channel Loops
  • Limited Fibre Channel host bus adapters in servers
  • Under-configured and complex SANs

• Pipes between disks and servers are often 10x too slow for data size

– 4 –

The Effect Of Data Bandwidth Bottleneck
The Larger The Warehouse, The Slower The Performance

[Chart: Table Scan Time (1 Hour, 5 Hour, 10 Hour) versus Table Size (1TB, 10 TB, 100TB)]

– 5 –

Business Impact of Increasing Data Bandwidth

• Business queries if 10 TB of data must be scanned

Data Bandwidth    Query Rate            Business Impact
1 GB/sec          3 queries/work day    Don’t Ask
10 GB/sec         3 queries/hour        Ask Tomorrow
100 GB/sec        35 queries/hour       Ask Anything
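
The query rates on this slide follow from dividing the data volume by the bandwidth. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB) and an 8-hour work day, neither of which the slide states; the function names are illustrative only:

```python
def scan_seconds(data_tb: float, bandwidth_gb_per_sec: float) -> float:
    """Seconds needed to scan data_tb terabytes at the given bandwidth.

    Assumes decimal units (1 TB = 1000 GB).
    """
    return data_tb * 1000 / bandwidth_gb_per_sec


def queries_per_hour(data_tb: float, bandwidth_gb_per_sec: float) -> float:
    """Full-scan queries that fit in one hour when run back to back."""
    return 3600 / scan_seconds(data_tb, bandwidth_gb_per_sec)


if __name__ == "__main__":
    WORK_DAY_HOURS = 8  # assumption: the slide does not define a work day
    for bandwidth in (1, 10, 100):
        per_hour = queries_per_hour(10, bandwidth)
        print(f"{bandwidth:>3} GB/sec: {per_hour:.2f} queries/hour, "
              f"{per_hour * WORK_DAY_HOURS:.1f} queries/work day")
```

Under these assumptions the results come to roughly 2.9 queries per work day, 3.6 per hour, and 36 per hour, close to the rounded figures on the slide.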

– 6 –

Solutions To Data Bandwidth Bottleneck

• Ship less data through pipes

• Add more pipes

• Make pipes bigger

– 7 –

Smart Scans

• Exadata cells implement smart scans to greatly reduce the data that needs to be processed by database
  • Only return relevant rows and columns to database
  • Offload predicate evaluation

• Data reduction is usually very large
  • Column and row reduction often decrease data to be returned to the database by 10x
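
The idea of returning only relevant rows and columns can be illustrated with a small sketch. This is not Exadata's implementation or API; `smart_scan`, `full_block_read`, and the sample rows are hypothetical names and data used only to show row filtering and column projection happening on the storage side before anything is shipped:

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def full_block_read(rows: Iterable[Row]) -> List[Row]:
    """Conventional storage: every row and every column goes to the database."""
    return [dict(row) for row in rows]


def smart_scan(rows: Iterable[Row],
               predicate: Callable[[Row], bool],
               columns: List[str]) -> Iterator[Row]:
    """Hypothetical cell-side scan: filter rows and project columns
    before anything is sent to the database server."""
    for row in rows:
        if predicate(row):  # predicate evaluated in storage
            # keep only the requested columns
            yield {name: row[name] for name in columns}


if __name__ == "__main__":
    # Made-up rows standing in for a sales table held on one cell.
    table = [
        {"id": i, "date": "24-Sept" if i % 10 == 0 else "23-Sept",
         "sales": 100 + i, "customer": f"c{i}", "notes": "x" * 20}
        for i in range(1000)
    ]
    shipped = list(smart_scan(table, lambda r: r["date"] == "24-Sept", ["sales"]))
    total = sum(r["sales"] for r in shipped)  # the database only aggregates
    print(len(table), "rows stored,", len(shipped), "rows shipped, sum =", total)
```

In this toy data set one row in ten matches and one column in five is kept, so the reduction comes from both directions, as the slide describes.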

– 8 –

Simple Query Example

[Diagram: a query flowing between the Oracle Database Grid and the Exadata Storage Grid]

• What were my sales yesterday?
• Select sum(sales) where Date=’24-Sept’
• Retrieve sales amounts from Sept 24
• SUM

– 9 –

Shared Disk vs. Shared Nothing

• The Shared Disk vs. Shared Nothing debate has raged for years
• Exadata brings a fresh perspective to this debate
• Basic differences between approaches are simple
  • Shared disk ships data to where code is
  • Shared nothing ships code to where data is

– 10 –

Shared Disk with Benefits of Shared Nothing

• Exadata delivers Shared Disk with the benefits of Shared Nothing

• Exadata ships code to data for scan operations
  • Processes bulk data directly as it comes off disk

• Exadata ships data to code for OLTP operations
  • Efficiently runs any application with no changes, including ultra-complex ones like ERP, CRM, HR, etc.

– 11 –

Massively Parallel Storage Grid

• Storage cells are organized into a massively parallel storage grid

• Scalable
  • Scales to hundreds of storage cells
  • Data automatically distributed across cells by ASM
  • Transparently redistributed when cells are added or removed
  • Data bandwidth scales linearly with capacity

• Available
  • Data is mirrored across cells
  • Failure of disk or cell transparently tolerated

• Simple
  • Works transparently - no application changes

[Figure: Exadata bandwidth scales linearly with capacity; grids shown at 4 GB/sec, 8 GB/sec, and 16 GB/sec]
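
A toy sketch of what distributing and mirroring data across cells means in practice. The round-robin placement below is an illustrative assumption and is not ASM's actual allocation algorithm; the function names and cell names are hypothetical:

```python
from collections import Counter
from typing import Dict, List, Tuple

Layout = Dict[int, Tuple[str, str]]  # extent number -> (primary cell, mirror cell)


def place_extents(num_extents: int, cells: List[str]) -> Layout:
    """Give each extent a primary cell and a mirror on a different cell.

    Round-robin placement is an assumption made for illustration only.
    """
    if len(cells) < 2:
        raise ValueError("mirroring needs at least two cells")
    layout: Layout = {}
    for extent in range(num_extents):
        primary = cells[extent % len(cells)]
        mirror = cells[(extent + 1) % len(cells)]  # next cell, never the same one
        layout[extent] = (primary, mirror)
    return layout


def primaries_per_cell(layout: Layout) -> Dict[str, int]:
    """Count how many primary extents each cell holds."""
    return dict(Counter(primary for primary, _ in layout.values()))


def survives_failure(layout: Layout, failed_cell: str) -> bool:
    """True if every extent still has a copy on a surviving cell."""
    return all(primary != failed_cell or mirror != failed_cell
               for primary, mirror in layout.values())


if __name__ == "__main__":
    cells = ["cell1", "cell2", "cell3", "cell4"]
    layout = place_extents(12, cells)
    print(primaries_per_cell(layout))
    print(all(survives_failure(layout, c) for c in cells))
    # Recomputing with one more cell spreads the same extents over five cells.
    print(primaries_per_cell(place_extents(12, cells + ["cell5"])))
```

Because every extent lives on two different cells, losing any single cell leaves a readable copy, and each added cell takes a share of the extents, which is the property the slide relies on for bandwidth to grow with capacity.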

– 12 –

Storage Array Bandwidth Limitations

• Conventional Storage Arrays use Fibre Channel Loops to connect storage processors to disk trays

• Data Bandwidth is limited by FC Loops and storage processors

• Storage bandwidth does not scale with storage capacity

[Diagram: Storage Array with Storage Processors & Cache connected to disk trays by Fibre Channel Loops; FC Loop bandwidth 400 MB/sec max]

– 13 –

Infiniband Network

• Infiniband is the interconnect supported by Exadata
  • Provides highest performance available - 20 Gb/sec
  • Widely used in high performance computing since 2002

• Infiniband looks like normal Ethernet to host software
  • All IP based tools work transparently - tcp/ip, udp, http, ssh, …

• Infiniband has efficiency of SAN
  • Zero copy and buffer reservation abilities of Storage network

• Unified Network Fabric
  • Infiniband used for both storage and RAC interconnect
  • Less configuration, lower cost, higher performance

• Uses high performance ZDP Infiniband protocol (RDS V3)
  • Zero-copy Zero-loss Datagram protocol
  • Available as Linux Open Source
  • Very low CPU overhead
    • Transfers 1GB/sec of data with 2% CPU utilization

– 14 –

Infiniband Throughput

[Chart: Single Connection Throughput (MB/sec, 0 to 1400) for Gigabit Ethernet, 4Gb Fibre, and 20Gb Infiniband, with the annotations "12x slower" and "3x slower"]

• Graph shows throughput achieved in real-world deployments
• Infiniband is held back by PCIe 1.0 x8 bus on typical host systems

– 15 –

Exadata – A New Architecture
Breaks Data Bandwidth Bottleneck

• Exadata Ships Less Data Through Pipes
  • Query processing is moved into storage to dramatically reduce data sent to servers while offloading server CPUs

• Exadata has More Pipes
  • Modular storage “cell” building blocks organized into Massively Parallel Grid
  • Bandwidth scales with capacity

• Exadata has Bigger Pipes
  • InfiniBand interconnect transfers data 5x faster than Fibre Channel

Exadata Moves a Lot Less Data a Lot Faster
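
The "5x faster" figure can be reproduced from nominal link rates. A rough sketch, assuming 8b/10b line encoding on the 20 Gb/sec InfiniBand link (standard for 4X DDR, though not stated in the slides) and taking the 400 MB/sec Fibre Channel figure from the storage array slide. These are nominal rates, not measured throughput:

```python
def payload_mb_per_sec(signal_gbit_per_sec: int) -> int:
    """Nominal payload rate of a link that uses 8b/10b encoding.

    8b/10b puts 10 bits on the wire for every 8 data bits.
    """
    signal_mbit = signal_gbit_per_sec * 1000
    data_mbit = signal_mbit * 8 // 10  # remove the encoding overhead
    return data_mbit // 8              # convert bits to bytes


INFINIBAND_MB = payload_mb_per_sec(20)  # 20 Gb/sec link from the slides
FIBRE_CHANNEL_MB = 400                  # "400 MB/sec max" from the slides

if __name__ == "__main__":
    print("InfiniBand nominal payload:", INFINIBAND_MB, "MB/sec")
    print("Ratio to Fibre Channel:", INFINIBAND_MB / FIBRE_CHANNEL_MB)
```

Real-world throughput is lower than the nominal rate; the throughput slide notes that Infiniband is held back by the PCIe 1.0 x8 bus on typical host systems.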

– 16 –

HP Exadata Storage Server Hardware

• Building block of massively parallel Exadata Storage Grid
  • Up to 1GB/sec data bandwidth per cell

• HP DL180 G5
  • 2 Intel quad-core processors
  • 8GB RAM
  • Dual-port 4X DDR InfiniBand card
  • 12 SAS or SATA disks

• Software pre-installed
  • Oracle Exadata Storage Server Software
  • Oracle Enterprise Linux
  • HP Management Software

• Hardware Warranty
  • 3 YR Parts/3 YR Labor/3 YR On-site
  • 24X7, 4 Hour response

[Pictures: Exadata Storage Server; Racked Exadata Storage Servers]

– 17 –

HP Exadata Storage Server Hardware Details

[Diagram: labeled components of the storage server]

• Redundant 110/220V Power Supplies
• 8 GB DRAM
• P400 Smart Array Disk Controller card - 512M battery backed cache
• 12 x 3.5” Disk Drives
• 2 Intel Xeon Quad-core Processors
• InfiniBand DDR dual port card
• LO100c Management Card

Included Software:

• Oracle Exadata Storage Server Software
• Oracle Enterprise Linux
• HP Management Software

– 18 –

SAS or SATA Disks in Exadata Servers

• Choice of either
  • 450 GB 15,000 RPM SAS disks
  • 1 TB 7,200 RPM SATA disks

• Choose SAS Based Servers for High Performance

SAS Advantages                       SAS      SATA     Advantage
Throughput (MB/s)                    1,000    750      1.33X
Average Seek Time (ms)               3.6      7.4      2.05X
Disk level read errors (per year)    6.3      63       10.00X
Years to disk failure                15.2     11.4     1.33X

• Choose SATA Based Servers for High Capacity

SATA Advantages                      SAS      SATA     Advantage
Capacity (TB)                        5.4      12       2.22X

– 19 –

Scalable

• Scale to 18 cells in one rack
• Add racks to scale further
• Each cell connects to 2 InfiniBand switches for redundancy
• This delivers 4x the bandwidth
• InfiniBand links across racks for full connectivity

• SAS raw capacity per rack: 97TB
• SATA raw capacity per rack: 216TB
• Peak throughput per rack: >18GB/s

– 20 –
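
The per-rack figures can be cross-checked against the per-cell specifications given in the deck (12 disks per cell, 450 GB SAS or 1 TB SATA disks, up to 1 GB/sec per cell, 18 cells per rack). A minimal sketch, assuming decimal units (1 TB = 1000 GB); the constant and function names are illustrative:

```python
CELLS_PER_RACK = 18
DISKS_PER_CELL = 12
CELL_BANDWIDTH_GB_PER_SEC = 1              # "up to 1GB/sec" per cell
DISK_SIZE_GB = {"SAS": 450, "SATA": 1000}  # assumes 1 TB = 1000 GB


def raw_capacity_tb(disk_type: str, cells: int = CELLS_PER_RACK) -> float:
    """Raw (unmirrored) capacity in TB for the given number of cells."""
    return cells * DISKS_PER_CELL * DISK_SIZE_GB[disk_type] / 1000


def peak_bandwidth_gb_per_sec(cells: int = CELLS_PER_RACK) -> int:
    """Aggregate bandwidth if every cell delivers its stated 1 GB/sec."""
    return cells * CELL_BANDWIDTH_GB_PER_SEC


if __name__ == "__main__":
    print("SAS per cell: ", raw_capacity_tb("SAS", cells=1), "TB")
    print("SAS per rack: ", raw_capacity_tb("SAS"), "TB")
    print("SATA per rack:", raw_capacity_tb("SATA"), "TB")
    print("Bandwidth per rack:", peak_bandwidth_gb_per_sec(), "GB/sec")
```

The capacities match the 5.4 TB, 97 TB, and 216 TB figures on the slides. The bandwidth product is 18 GB/sec, while the slide quotes ">18GB/s"; the slides do not say where the extra headroom comes from.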

Exadata Performance Scales

• Exadata delivers brawny hardware for use by Oracle’s brainy software

• Performance scales with size

• Result
  • More business insight
  • Better decisions
  • Improved competitiveness

[Chart: Table Scan Time (1 Hour, 5 Hour, 10 Hour) versus Table Size (1TB, 10 TB, 100TB) for a Typical Warehouse and for Exadata]
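
Why scan time stays flat when bandwidth grows with capacity can be shown with a short comparison. The numbers are illustrative assumptions, not the values plotted in the chart: a fixed 1 GB/sec pipe for the conventional case, and 1 TB of table data plus 1 GB/sec per cell for the grid case:

```python
import math


def scan_hours_fixed(table_tb: float, bandwidth_gb_per_sec: float) -> float:
    """Hours to scan a table when storage bandwidth does not grow with it."""
    return table_tb * 1000 / bandwidth_gb_per_sec / 3600


def scan_hours_grid(table_tb: float, tb_per_cell: float,
                    gb_per_sec_per_cell: float) -> float:
    """Hours to scan a table when each added cell brings its own bandwidth."""
    cells = max(1, math.ceil(table_tb / tb_per_cell))
    return table_tb * 1000 / (cells * gb_per_sec_per_cell) / 3600


if __name__ == "__main__":
    for size_tb in (1, 10, 100):
        fixed = scan_hours_fixed(size_tb, bandwidth_gb_per_sec=1)
        grid = scan_hours_grid(size_tb, tb_per_cell=1, gb_per_sec_per_cell=1)
        print(f"{size_tb:>3} TB: fixed pipe {fixed:6.2f} h, cell grid {grid:.2f} h")
```

With a fixed pipe the scan time grows in proportion to the table, while in the grid case the cell count grows with the table and the scan time stays constant.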

– 21 –

Simeon Dimitrov
Enterprise Resources Manager, M-Tel

European Telecommunications Provider

“Every query was faster on Exadata compared to our current systems. The smallest performance improvement was 10x and the biggest one was an incredible 72x.”

– 22 –

M-Tel Exadata Speedup – 10X to 72X

[Bar chart: speedup per workload, axis 0 to 80; callout: 28x Average Speedup]

• Tablespace Creation
• Index Creation
• CRM Customer Discount Report
• Handset to Customer Mapping Report
• CRM Service Order Report
• CDR Full Table Scan
• Warehouse Inventory Report

– 23 –

Grant Salmon
CEO, LGR Telecommunications

Telecomms Business Intelligence Solutions

“Call Data Record queries that used to run for over 30 minutes now complete in under 1 minute. That's extreme performance.”

– 24 –

Walt Litzenberger
Director Enterprise Database Systems, CME Group

World’s Largest Futures Exchange

“Oracle Exadata outperforms anything we’ve tested to date by 10 to 15 times.

This product flat-out screams.”

– 25 –

Giant Eagle Exadata Speedup – 3X to 50X

[Bar chart: speedup per workload, axis 0 to 50.0]

• Merchandising Level 1 Detail: Period Ago
• Supply Chain Vendor - Year - Item Movement
• Merchandising Level 1 Detail: Current - 52 weeks
• Materialized Views Rebuild
• Merchandising Level 1 Detail by Week
• Prompt04 Clone for ACL audit
• Date to Date Movement Comparison - 53 weeks
• Gift Card Activations
• Sales and Customer Counts
• Recall Query

– 26 –

Exadata Changes the Game

• Exadata delivers brawny hardware for use by Oracle’s brainy software
  • Delivers database intelligence and massively parallel scaling in the storage tier
  • Industry standard hardware

• Eliminates all bottlenecks preventing high performance query processing

• Dramatic performance improvement for data warehousing

• Simplicity of appliance seamlessly integrated with the database

– 27 –

– 28 –