TRANSCRIPT
Extreme Performance Data Warehousing - Oracle Exadata
Saint Kim
Director, Technology Sales Consulting, Oracle Korea
Nov, 2008
© 2008 Oracle Corporation – Proprietary and Confidential
Data Warehouses Growing Rapidly
Tripling In Size Every Two Years
[Chart: Data Warehouse Size (TB), 0 to 1000, for 2000 to 2012, Actual and Projected]
• The challenge is to maintain or improve performance in spite of the exponential increase in size
Source: Winter TopTen Survey, Winter Corporation, Waltham MA, 2008.
– 2 –
Data Warehouse Approaches
• “Brainy Software” approach - use smart database software to minimize the need for hardware
  • OLAP, Bitmap Indexing, Join Indexing, Materialized Views, Result Caches, Range Partitioning, etc.
• “Brawny Hardware” approach - use powerful hardware to perform brute-force scans and joins
  • Brawny hardware approach only works if the rate of scanning data scales with size of data
– 3 –
The Brawny Hardware Challenge
Storage Data Bandwidth
• Current warehouse deployments often have bottlenecks limiting the movement of data from disks to servers
  • Storage Array internal bottlenecks on processors and Fibre Channel Loops
  • Limited Fibre Channel host bus adapters in servers
  • Under-configured and complex SANs
• Pipes between disks and servers are often 10x too slow for data size
– 4 –
The Effect Of Data Bandwidth Bottleneck
The Larger The Warehouse, The Slower The Performance
[Chart: Table Scan Time (1 Hour, 5 Hour, 10 Hour) versus Table Size (1 TB, 10 TB, 100 TB)]
– 5 –
Business Impact of Increasing Data Bandwidth
• Business queries if 10 TB of data must be scanned
Data Bandwidth
• 1 GB/sec - 3 queries/work day - Don’t Ask
• 10 GB/sec - 3 queries/hour - Ask Tomorrow
• 100 GB/sec - 35 queries/hour - Ask Anything
– 6 –
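The query rates on this slide follow from dividing table size by data bandwidth. A minimal Python sketch of that arithmetic is below. The 1 TB = 1024 GB conversion and the 8-hour work day are assumptions, not figures from the slide; they were chosen because they land close to the slide's rounded numbers.

```python
# Back-of-the-envelope model: one full scan takes size / bandwidth.
# Assumptions (not stated on the slide): 1 TB = 1024 GB, 8-hour work day.

GB_PER_TB = 1024
WORK_DAY_HOURS = 8


def scan_seconds(table_tb: float, bandwidth_gb_per_sec: float) -> float:
    """Seconds needed to stream the whole table through the pipe once."""
    return table_tb * GB_PER_TB / bandwidth_gb_per_sec


def queries_per_hour(table_tb: float, bandwidth_gb_per_sec: float) -> float:
    """Number of back-to-back full scans that fit in one hour."""
    return 3600 / scan_seconds(table_tb, bandwidth_gb_per_sec)


for bandwidth in (1, 10, 100):
    per_hour = queries_per_hour(10, bandwidth)
    print(
        f"{bandwidth:>3} GB/sec: {scan_seconds(10, bandwidth):8.1f} s per scan, "
        f"{per_hour:5.2f} queries/hour, "
        f"{per_hour * WORK_DAY_HOURS:6.1f} queries/work day"
    )
```

Under these assumptions a 10 TB scan at 1 GB/sec takes a little under three hours, which is why only about three such queries fit in a work day.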
Solutions To Data Bandwidth Bottleneck
• Ship less data through pipes
• Add more pipes
• Make pipes bigger
– 7 –
Smart Scans
• Exadata cells implement smart scans to greatly reduce the data that needs to be processed by database
  • Only return relevant rows and columns to database
  • Offload predicate evaluation
• Data reduction is usually very large
  • Column and row reduction often decrease data to be returned to the database by 10x
– 8 –
Simple Query Example
Question: What were my sales yesterday?
Oracle Database Grid: Select sum(sales) where Date=’24-Sept’ …
Exadata Storage Grid: Retrieve sales amounts from Sept 24
Oracle Database Grid: SUM
– 9 –
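The flow in this example can be illustrated with a toy Python model. This is only a sketch of the idea of filtering and projecting in storage; `SaleRow`, `conventional_scan`, and `smart_scan` are hypothetical names, the sample rows are made up, and none of this reflects the actual Exadata software interface.

```python
# Toy model of a smart scan. Every name and data value here is hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleRow:
    sale_date: str
    store: str
    customer: str
    sales: float


def conventional_scan(cell_rows):
    """Storage ships every row and column; the database filters afterwards."""
    return list(cell_rows)


def smart_scan(cell_rows, predicate, column):
    """Storage evaluates the predicate and returns only the requested column."""
    return [getattr(row, column) for row in cell_rows if predicate(row)]


def is_sept_24(row):
    return row.sale_date == "24-Sept"


# Two storage cells, each holding a slice of the sales table.
cells = [
    [SaleRow("23-Sept", "A", "c1", 10.0), SaleRow("24-Sept", "A", "c2", 25.0)],
    [SaleRow("24-Sept", "B", "c3", 40.0), SaleRow("22-Sept", "B", "c4", 5.0)],
]

# Conventional path: all rows cross the network, then the database filters and sums.
shipped_rows = [row for cell in cells for row in conventional_scan(cell)]
total_conventional = sum(row.sales for row in shipped_rows if is_sept_24(row))

# Smart scan path: each cell returns only the matching sales amounts.
shipped_values = [v for cell in cells for v in smart_scan(cell, is_sept_24, "sales")]
total_smart = sum(shipped_values)

print(total_conventional, total_smart)               # same answer either way
print(len(shipped_rows) * 4, len(shipped_values))    # field values moved by each path
```

Both paths compute the same sum; the difference is how many values leave the storage cells before the database performs the final SUM.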
Shared Disk vs. Shared Nothing
• The Shared Disk vs. Shared Nothing debate has raged for years
• Exadata brings a fresh perspective to this debate
• Basic differences between approaches are simple
  • Shared disk ships data to where code is
  • Shared nothing ships code to where data is
– 10 –
Shared Disk with Benefits of Shared Nothing
• Exadata delivers Shared Disk with the benefits of Shared Nothing
• Exadata ships code to data for scan operations
  • Processes bulk data directly as it comes off disk
• Exadata ships data to code for OLTP operations
  • Efficiently runs any application with no changes, including ultra-complex ones like ERP, CRM, HR, etc.
– 11 –
Massively Parallel Storage Grid
• Storage cells are organized into a massively parallel storage grid
• Scalable
  • Scales to hundreds of storage cells
  • Data automatically distributed across cells by ASM
  • Transparently redistributed when cells are added or removed
  • Data bandwidth scales linearly with capacity
• Available
  • Data is mirrored across cells
  • Failure of disk or cell transparently tolerated
• Simple
  • Works transparently - no application changes
[Figure: Exadata bandwidth scales linearly with capacity (4 GB/sec, 8 GB/sec, 16 GB/sec, …)]
– 12 –
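One consequence of bandwidth scaling linearly with capacity is that full-scan time stays flat as the grid grows. The sketch below illustrates this in Python. The per-cell figures (up to 1 GB/sec, 12 TB raw SATA) come from the hardware slides later in this deck; treating them as sustained, usable numbers and using decimal units are assumptions made only for illustration.

```python
# Sketch: when every cell adds both capacity and bandwidth,
# the time to scan the whole grid does not depend on the number of cells.
# Per-cell values are taken from later slides; sustained rates are assumed.

CELL_BANDWIDTH_GB_PER_SEC = 1.0
CELL_CAPACITY_TB = 12.0
GB_PER_TB = 1000  # decimal units, an assumption


def grid_bandwidth(cells: int) -> float:
    """Aggregate scan bandwidth in GB/sec."""
    return cells * CELL_BANDWIDTH_GB_PER_SEC


def grid_capacity_tb(cells: int) -> float:
    """Aggregate raw capacity in TB."""
    return cells * CELL_CAPACITY_TB


def full_scan_hours(cells: int) -> float:
    """Hours to scan everything the grid can hold."""
    return grid_capacity_tb(cells) * GB_PER_TB / grid_bandwidth(cells) / 3600


for n in (4, 8, 16):
    print(n, grid_bandwidth(n), grid_capacity_tb(n), round(full_scan_hours(n), 2))
```

The cell count cancels out of `full_scan_hours`, which is the property the slide describes; a storage array with a fixed bandwidth ceiling would instead show scan time growing with capacity.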
Storage Array Bandwidth Limitations
• Conventional Storage Arrays use Fibre Channel Loops to connect storage processors to disk trays
• Data Bandwidth is limited by FC Loops and storage processors
• Storage bandwidth does not scale with storage capacity
[Figure: Storage Array with Storage Processors & Cache and Fibre Channel Loops; FC Loop bandwidth 400 MB/sec max]
– 13 –
Infiniband Network
• Infiniband is the interconnect supported by Exadata
  • Provides highest performance available – 20 Gb/sec
  • Widely used in high performance computing since 2002
• Infiniband looks like normal Ethernet to host software
  • All IP based tools work transparently – tcp/ip, udp, http, ssh, …
• Infiniband has efficiency of SAN
  • Zero copy and buffer reservation abilities of Storage network
• Unified Network Fabric
  • Infiniband used for both storage and RAC interconnect
  • Less configuration, lower cost, higher performance
• Uses high performance ZDP Infiniband protocol (RDS V3)
  • Zero-copy Zero-loss Datagram protocol
  • Available as Linux Open Source
  • Very low CPU overhead
  • Transfers 1GB/sec of data with 2% CPU utilization
– 14 –
Infiniband Throughput
[Chart: Single Connection Throughput (MB/sec, 0 to 1400) for Gigabit Ethernet, 4Gb Fibre, and 20Gb Infiniband, with bars labeled "12x slower" and "3x slower"]
• Graph shows throughput achieved in real-world deployments
• Infiniband is held back by PCIe 1.0 x8 bus on typical host systems
– 15 –
Exadata – A New Architecture
Breaks Data Bandwidth Bottleneck
• Exadata Ships Less Data Through Pipes
  • Query processing is moved into storage to dramatically reduce data sent to servers while offloading server CPUs
• Exadata has More Pipes
  • Modular storage “cell” building blocks organized into Massively Parallel Grid
  • Bandwidth scales with capacity
• Exadata has Bigger Pipes
  • InfiniBand interconnect transfers data 5x faster than Fibre Channel
Exadata Moves a Lot Less Data a Lot Faster
– 16 –
HP Exadata Storage Server Hardware
• Building block of massively parallel Exadata Storage Grid
  • Up to 1GB/sec data bandwidth per cell
• HP DL180 G5
  • 2 Intel quad-core processors
  • 8GB RAM
  • Dual-port 4X DDR InfiniBand card
  • 12 SAS or SATA disks
• Software pre-installed
  • Oracle Exadata Storage Server Software
  • Oracle Enterprise Linux
  • HP Management Software
• Hardware Warranty
  • 3 YR Parts/3 YR Labor/3 YR On-site
  • 24X7, 4 Hour response
[Photos: Exadata Storage Server; Racked Exadata Storage Servers]
– 17 –
HP Exadata Storage Server Hardware Details
Redundant 110/220V Power Supplies
8 GB DRAM
P400 Smart Array Disk Controller card - 512M battery backed cache
12 x 3.5” Disk Drives
2 Intel Xeon Quad-core Processors
InfiniBand DDR dual port card
LO100c Management Card
Included Software:
• Oracle Exadata Storage Server Software
• Oracle Enterprise Linux
• HP Management Software
– 18 –
SAS or SATA Disks in Exadata Servers
• Choice of either
  • 450 GB 15,000 RPM SAS disks
  • 1 TB 7,200 RPM SATA disks
• Choose SAS Based Servers for High Performance

SAS Advantages                      SAS     SATA    Advantage
Throughput (MB/s)                   1,000   750     1.33X
Average Seek Time (ms)              3.6     7.4     2.05X
Disk level read errors (per year)   6.3     63      10.00X
Years to disk failure               15.2    11.4    1.33X

• Choose SATA Based Servers for High Capacity

SATA Advantages                     SAS     SATA    Advantage
Capacity (TB)                       5.4     12      2.22X
– 19 –
Scalable
Scale to 18 cells in one rack
Add racks to scale further
Each cell connects to 2 InfiniBand switches for redundancy
This delivers 4x the bandwidth
InfiniBand links across racks for full connectivity
SAS raw capacity per rack: 97TB
SATA raw capacity per rack: 216TB
Peak throughput per rack: >18GB/s
– 20 –
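The per-rack figures can be cross-checked from the per-cell numbers on the earlier hardware slides (18 cells per rack, 12 disks per cell, 450 GB SAS or 1 TB SATA disks, up to 1 GB/sec per cell). The Python sketch below is only that arithmetic; the function and constant names are illustrative.

```python
# Cross-check of the rack figures using per-cell numbers from earlier slides.

CELLS_PER_RACK = 18
DISKS_PER_CELL = 12
SAS_DISK_TB = 0.45    # 450 GB 15,000 RPM SAS disk
SATA_DISK_TB = 1.0    # 1 TB 7,200 RPM SATA disk
CELL_BANDWIDTH_GB_PER_SEC = 1.0  # "up to" figure quoted per cell


def raw_capacity_tb(disk_tb: float, cells: int = CELLS_PER_RACK) -> float:
    """Raw (unmirrored) capacity of a group of cells in TB."""
    return cells * DISKS_PER_CELL * disk_tb


def peak_throughput_gb_per_sec(cells: int = CELLS_PER_RACK) -> float:
    """Aggregate scan bandwidth if every cell delivers its quoted rate."""
    return cells * CELL_BANDWIDTH_GB_PER_SEC


print(round(raw_capacity_tb(SAS_DISK_TB, cells=1), 1))   # capacity of one SAS cell
print(round(raw_capacity_tb(SATA_DISK_TB, cells=1), 1))  # capacity of one SATA cell
print(round(raw_capacity_tb(SAS_DISK_TB), 1))            # SAS rack
print(round(raw_capacity_tb(SATA_DISK_TB), 1))           # SATA rack
print(peak_throughput_gb_per_sec())                      # rack throughput
```

The products agree with the 5.4 TB and 12 TB per-cell capacities and the 97 TB and 216 TB rack capacities quoted in the deck. Multiplying the per-cell bandwidth gives 18 GB/sec, while the slide quotes slightly more than that as the rack peak.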
Exadata Performance Scales
• Exadata delivers brawny hardware for use by Oracle’s brainy software
• Performance scales with size
• Result
  • More business insight
  • Better decisions
  • Improved competitiveness
[Chart: Table Scan Time (1 Hour, 5 Hour, 10 Hour) versus Table Size (1 TB, 10 TB, 100 TB), Typical Warehouse compared with Exadata]
– 21 –
Simeon Dimitrov
Enterprise Resources Manager, M-Tel
European Telecommunications Provider
“Every query was faster on Exadata compared to our current systems. The smallest performance improvement was 10x and the biggest one was an incredible 72x.”
– 22 –
M-Tel Exadata Speedup – 10X to 72X
[Chart: speedup per workload on a 0 to 80 scale, 28x Average Speedup. Workloads: Tablespace Creation; Index Creation; CRM Customer Discount Report; Handset to Customer Mapping Report; CRM Service Order Report; CDR Full Table Scan; Warehouse Inventory Report]
– 23 –
Grant Salmon
CEO, LGR Telecommunications
Telecomms Business Intelligence Solutions
“Call Data Record queries that used to run for over 30 minutes now complete in under 1 minute. That's extreme performance.”
– 24 –
Walt Litzenberger
Director Enterprise Database Systems, CME Group
World’s Largest Futures Exchange
“Oracle Exadata outperforms anything we’ve tested to date by 10 to 15 times.
This product flat-out screams.”
– 25 –
Giant Eagle Exadata Speedup – 3X to 50X
[Chart: speedup per workload on a 0 to 50.0 scale. Workloads: Merchandising Level 1 Detail: Period Ago; Supply Chain Vendor - Year - Item Movement; Merchandising Level 1 Detail: Current - 52 weeks; Materialized Views Rebuild; Merchandising Level 1 Detail by Week; Prompt04 Clone for ACL audit; Date to Date Movement Comparison - 53 weeks; Gift Card Activations; Sales and Customer Counts; Recall Query]
– 26 –
Exadata Changes the Game
• Exadata delivers brawny hardware for use by Oracle’s brainy software
• Delivers database intelligence and massively parallel scaling in the storage tier
  • Industry standard hardware
• Eliminates all bottlenecks preventing high performance query processing
• Dramatic performance improvement for data warehousing
• Simplicity of appliance seamlessly integrated with the database
– 27 –