Creating Scalable and High Performance Databases with ioMemory


Post on 23-Jun-2020




W W W . F U S I O N I O . C O M ©2011 Fusion-io, Inc. All rights reserved. ioDrive® is a registered trademark of Fusion-io in the United States and/or other countries. All other product and company names and marks mentioned in this document are property of their respective owners. 2

Achieving Peak Performance with Your Database

Fusion-io is doing what disk drives and SANs cannot: speeding performance, increasing efficiency, and guaranteeing reliability. Fusion's ioMemory works directly with the server to not only accelerate performance, but save customers money on a day-to-day basis.

Our customers speak for themselves:

Figure 1. Actual benefits to customers.

“The adapter helped my group by reducing the number of servers we expected to manage during the holiday season from 9–12 down to 3.”

-Kris Ongbongan, System Administration Manager for Zappos.com

“Our rapid growth requires massive scalability. The ioDrive’s linear scaling allows us to double our performance capabilities with a single card.”

-Dave McCabe, Director of Data Center Operations for Datalogix

Jump ahead to learn how databases can be re-architected to dramatically improve performance: see the Architecting ioMemory-based Database Systems section.

If you prefer to read about the benefits of ioMemory first, turn to the How ioMemory Helps section.

[Figure 1 panels: query processing 9x on 75% less hardware; query processing 3x on 50% less hardware; query processing 5x on 50% less hardware.]


Introduction

Data flows through our digital world at an ever-increasing rate. Databases, whose role is to cull this data to provide valuable and usable insights and information, face a daunting challenge. Database hardware and software must continually innovate just to keep pace with rapid data growth and information processing demands.

CPU processing power has kept pace with rapid data growth, thanks to Moore’s law, under which processor performance doubles roughly every eighteen months. However, database applications have not been able to exploit these advancements because of technology barriers in two fundamental hardware components: DRAM and hard disks. DRAM offers applications extremely low latencies and high performance, but its cost is prohibitive, and thermal density thresholds limit in-server scalability. High-capacity DRAM systems are only possible via expensive clustered systems. These clusters suffer from network latency and from the reduced performance of high-density DRAM (>128GB), and the combination of high cost and lower performance discourages market adoption.

Similarly, magnetic hard drive density has grown prodigiously, but even the most cleverly engineered arrays have produced negligible performance improvements. A system that overcomes these obstacles must improve three characteristics that affect storage device performance: latency, IOPS, and bandwidth.

This document discusses the problems traditional systems create for the enterprise and how ioMemory addresses the fundamental weaknesses of these systems. It then describes architectures implementing ioMemory that can be used to address the needs of various database environments.

Latency

Latency (usually measured in milliseconds) is the biggest bottleneck in most databases. Because the gap in response time between modern CPUs and the fastest disk array is still several orders of magnitude (see figure 2), powerful processors typically sit idle the majority of the time, waiting on disks (see figure 3). Even at its best, a disk array is thousands of times slower than a CPU.

Figure 2. Access delay between DRAM and SSDs is several orders of magnitude.

[Figure 2 axis: access delay in time, from nanoseconds (10^-9 s) to milliseconds (10^-3 s). Tiers shown: L1, L2, L3, DRAM, SSDs, SAN, NAS, RAIDed DAS. Annotated gap: 5-6 orders of magnitude.]


Figure 3. The majority of CPUs are utilized at less than 50 percent. (Source: IDC)

IOPS and Bandwidth

The simplest performance metric for any storage device is the time it takes the CPU to input and output data. This is measured by the number of read and/or write operations performed per second (IOPS). When measuring a storage device’s performance, however, one must also consider bandwidth. IOPS and bandwidth are functions of each other: the data block size affects IOPS performance and IOPS performance potentially limits bandwidth saturation.

Let us consider two examples that calculate bandwidth for a random-access database and IOPS for a sequential access database.

Bandwidth is simply the number of IOPS multiplied by the block size. One hard drive can achieve a rough maximum of 200 IOPS for random-access patterns, and yet can achieve a rough maximum of 2,400 IOPS for sequential access patterns. If it performs the random operations with 8 KB block sizes (standard for most databases), it can achieve 1.56 MB/s bandwidth, calculated as follows:

200 IOPS x 8 KB = 1,600 KB/s; 1,600 KB/s / 1,024 = 1.56 MB/s (1 MB = 1,024 KB)

Conversely, let’s calculate the IOPS for a database with sequential access patterns. Hard drives perform much better with sequential access and can achieve a sustained bandwidth of 150 MB/s. Per MS SQL best practices, the block size is set to 64 KB:

150 MB/s x 1,024 KB/MB / 64 KB = 2,400 IOPS
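The two calculations above can be expressed as a short Python sketch. The function names are illustrative only and are not part of any Fusion-io tool; the sketch assumes, as the text does, that 1 MB = 1,024 KB.

```python
KB_PER_MB = 1024  # the text's convention: 1 MB = 1,024 KB


def bandwidth_mb_per_s(iops: float, block_kb: float) -> float:
    """Bandwidth = IOPS x block size, converted from KB/s to MB/s."""
    return iops * block_kb / KB_PER_MB


def iops_from_bandwidth(bandwidth_mb_s: float, block_kb: float) -> float:
    """IOPS = bandwidth / block size, with bandwidth converted to KB/s."""
    return bandwidth_mb_s * KB_PER_MB / block_kb


# Random access: 200 IOPS at 8 KB blocks.
print(round(bandwidth_mb_per_s(200, 8), 2))  # 1.56
# Sequential access: 150 MB/s at 64 KB blocks.
print(iops_from_bandwidth(150, 64))          # 2400.0
```

Keeping the unit conversion in one constant makes it easy to rerun the arithmetic for other block sizes.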

[Figure 3 axes: CPU utilization (1-20%, 21-30%, 31-50%, 51-100%) versus server units per year (0 to 2,500,000).]


The Problem

The conventional way to increase IOPS and bandwidth is to aggregate disks into arrays. However, doing this greatly increases the primary application performance bottleneck: latency. Increased latency leads to deep queuing and severely underutilized CPU cores (figure 2). It also results in diminishing marginal value and exponentially increasing costs for incremental IOPS (figure 3).

The average HDD storage array in the SPC-1 Benchmark (2008 results or later) achieved fewer than 90,000 IOPS before unacceptable latency levels prevented further IOPS scaling. The average cost for this level of performance was $1.2 million, far exceeding the average cost of External Storage Systems reported by IDC (<$50K).

Disk arrays with an intelligent DRAM cache can reduce sequential access latency to around 1 to 2 ms. However, most modern databases have very random I/O access patterns, and random access is much more problematic because it is often impractical to store the target block or larger datasets in DRAM, forcing the database to go to the magnetic disk. Random reads can quickly climb above 10 to 20 ms of latency, and this problem becomes worse when the number of competing applications increases (over-utilized disk arrays). The end result is poor database and application performance.
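To see why random access hurts, consider a rough Python sketch of average read latency as a function of how often the array's DRAM cache is hit. The weighted-average model is a simplifying assumption (it ignores queuing and contention), and the 1.5 ms and 15 ms values are simply midpoints of the ranges quoted above, not measurements.

```python
def effective_latency_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Average latency when a fraction of reads is served from the array cache.

    Assumed model: a simple weighted average of cache and disk latency.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Illustrative hit ratios, from mostly sequential to mostly random workloads.
for hit_ratio in (0.9, 0.5, 0.1):
    latency = effective_latency_ms(hit_ratio, cache_ms=1.5, disk_ms=15.0)
    print(f"cache hit ratio {hit_ratio:.0%}: about {latency:.2f} ms per read")
```

Under this model, average latency approaches the disk figure as the hit ratio falls, which is the behavior the paragraph above describes for random workloads.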

The following illustration depicts how performance decreases as transaction volume increases.

Figure 4. Performance decreases as transaction volume increases.

Additional unfortunate side effects of disk-aggregating solutions are increased system complexity, reduced reliability, and permanently expanded power, cooling, floor space, and administration overhead.

All of this makes it difficult for traditional storage systems to guarantee database performance levels under varying or increasing loads. Just keeping up with simple data growth is a significant challenge. IDC estimates that the volume of data stored doubles every 18 months (serendipitously similar to processing performance and Moore’s law). Additionally, many database environments experience large spikes in transaction volumes (such as over holidays or at financial end-of-year). The cost to accommodate these spikes with today’s inefficient systems can make it impractical to purchase performance capacity, and DBAs may be forced to endure periods when they simply cannot meet their SLA or QoS commitments.

In addition to spikes in transaction volume, many enterprises, such as those in the e-commerce industry, see rapid increases in transaction volume as part of organic growth. Keeping up with performance requirements is a cost of business, but with traditional, inefficient systems, this cost is high.
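As a rough illustration of the growth rate cited above, the following Python sketch projects stored data volume under an 18-month doubling period. The starting volume and the function name are hypothetical; only the doubling period comes from the text.

```python
def projected_volume_tb(initial_tb: float, months: float,
                        doubling_months: float = 18.0) -> float:
    """Project data volume assuming it doubles every `doubling_months` months."""
    return initial_tb * 2 ** (months / doubling_months)


# Hypothetical starting point: 10 TB of stored data today.
for months in (18, 36, 54):
    print(f"after {months} months: {projected_volume_tb(10, months):.0f} TB")
```

At this rate a database grows eightfold in four and a half years, which is why capacity planning for performance is an ongoing cost.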

[Figure 4 axes: response time (good performance to bad performance) versus I/O workload activity, with an acceptable SLA response time threshold. Curves: HDD storage array, typical SSD-based system, ioDrives.]


Poorly performing databases produce symptoms that negatively affect businesses in the following ways:

Figure 5. Consequences of poorly performing databases.

Symptom: Slow e-commerce websites
Results:
• Poor conversion rates and lost revenue as customers lose patience
• Bad customer experience harming brand and leading to negative word-of-mouth advertising

Symptom: Slow enterprise application
Results:
• Staff and customer complaints due to application performance
• Staff and customer complaints due to stale data on read/slave servers making application “appear” to have problems
• Lost productivity as end users wait on poorly performing applications

Symptom: Slow reporting and Business Intelligence (BI) analysis
Results:
• Reported complexity limited by performance constraints
• Slow analysis translates to lost time-sensitive business opportunities

Symptom: Slow back-up/recovery/maintenance windows
Results:
• Backup processes impact core application performance when run simultaneously
• Backups not run as frequently as desired due to impact on core applications
• Long recovery window can lead to revenue and opportunity loss

How ioMemory Helps

ioMemory is a NAND flash-based, in-server memory tier that approaches flash as an extension of the memory hierarchy and as a new building block for computer hardware and software architecture, rather than confining it only to traditional storage paradigms.

The result is an elegant cut-through architecture that provides near-linear performance scaling with very little software/hardware overhead, unprecedented flash reliability and endurance, customer flexibility in formatting, software development opportunities, and future-proof field upgradeability. It balances database systems in the following ways:

Reduces DRAM requirements. ioMemory provides massive capacities of DRAM-like flash memory at a much lower cost and without the volatility. This makes it ideal as a replacement for DRAM cache. ioMemory improves system efficiency by reducing the DRAM footprint, which in turn improves both DRAM and overall system performance.

Improves processor utilization. As has been shown, ioMemory uses CPUs much more productively and effectively than devices that offload memory management.

Eliminates network thrashing. ioMemory allows most or all active data to be stored on the servers that applications are running on. This reduces the need for applications to traverse the network for data.

Improves disk capacity utilization. Because active data can be stored on servers, storage infrastructure does not need to be wasted on racks of wide-striped arrays. Instead, disk technologies can focus on what they do best: transferring and storing high-capacity, sequentially stored files.


Architecting ioMemory-based Database Systems

ioMemory products make large capacities of Fusion-optimized NAND flash available to databases on the server. This allows organizations to re-architect database systems to dramatically improve performance by eliminating network latency associated with going to backend data stores. Additionally, organizations that maintain database cache farms can use ioMemory products to dramatically reduce the number of servers in their cache farms.

This section provides information about the three primary ioMemory architectures: All ioMemory, ioMemory with Data Store, and ioMemory Cache. Each of the following sections provides an overview of the architecture, summarizes the benefits, and closes with considerations to help the reader determine if the architecture is suitable for their requirements.

ALL IOMEMORY

This architecture places the entire database on ioMemory within a database server. A second failover server ensures availability through redundancy.

Figure 6. Placing the entire database on ioMemory greatly reduces infrastructure needs.


Figure 7. MySQL distributed master/slave database architecture (failover), supported by rapid replication, ensures availability through redundancy.

Benefits

• 3-10x expected performance improvement. Delivers the highest IOPS, highest throughput, and lowest latency to the entire database.

• Maximum Server Consolidation. Just two servers provide a fully redundant database platform.

• Easy implementation. ioMemory’s high throughput enables a very fast transition to the new architecture.

• 2-5x performance improvement for distributed master/slave environments.

Considerations

• Size limitations. The number of PCI Express slots within the database server limits the database size this architecture supports. See Architecture Limitations Based on Size.

• Read/Write Profile. ioMemory can be configured to keep write performance high. However, doing this reduces ioMemory’s usable capacity. See Sizing Databases Based on Write Load for further information.


• RAID configuration. Organizations can configure ioMemory products in RAID within the server to provide redundancy (RAID 1) or to improve performance (RAID 0). RAID configuration reduces the maximum database size this architecture can support.

• High availability. The “All ioMemory” architecture provides full database functionality from a single server. However, best practices recommend a second failover server for high availability. See Implementing High Availability in this document.

IOMEMORY WITH DATA STORE

This architecture is useful for organizations whose databases exceed the capacity of ioMemory that can be fitted in their current server model. In this architecture, ioMemory becomes an in-server tier for the I/O-intensive components of the database while serving the other areas via traditional disk media, either DAS or SAN attached.

The Data Store (DAS/SAN/array) is used to house the following:

• Less I/O-intensive components

• Data that is accessed by multiple sources

• Historical data that is accessed less frequently

• Backups

Figure 8. Storing I/O-intensive components on ioMemory non-invasively improves performance.


Benefits

• 50-300 percent expected performance improvement.

• Server consolidation and slow scale out. ioMemory eliminates the need for SAN/DAS systems to scale out for performance. Existing performance arrays can be repurposed.

• Easy implementation. Add ioMemory to database servers, similar to installing a graphics card, and place high I/O components on ioMemory.

Considerations

• PCI Express slot constraints. The number of PCI Express slots within the database server limits the number of I/O-intensive files that can be moved to ioMemory (typically 4TB). See Architecture Limitations Based on Size.

• RAID configuration. While ioMemory devices support RAID 1 configurations, their design makes RAID 1 unnecessary in the same way that RAIDing DRAM chips together is unnecessary. Where LUN sizes greater than the current capacity of ioMemory are required, RAID 0 devices can be used to scale capacity and performance linearly without increasing latency.

• High availability. Use a failover design to facilitate patching, upgrades, and other operational tasks on the operating systems and database. Many current database applications support failover configurations that do not require a shared disk model.

IOMEMORY CACHE

Organizations frequently use the MemcacheD application to create caching server farms that relieve the pressure from database systems like MySQL. Typically, caching layers consist of clusters of servers loaded up with as much DRAM as they can hold. However, because DRAM is so expensive and because DRAM slots are limited, these server farms can grow quite large. By having applications like MemcacheD referencing ioMemory, each server’s caching capabilities increase from gigabytes of DRAM to terabytes of high performance memory, greatly improving a server’s capabilities. [1]


Figure 9. ioMemory can provide terabytes of cache capacity per server, improving performance and reducing the number of servers needed.

Benefits

• Server Consolidation. ioMemory adds significant capacities of high performance memory to each server, allowing organizations to achieve higher levels of performance with far fewer servers.

• Improved performance by enabling more data to be cached, increasing the cache hit ratio. [2]

Considerations

• Only available for supported Linux and Unix operating systems.
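The consolidation argument can be sketched with simple arithmetic in Python. Every number below is hypothetical and chosen only to illustrate the "gigabytes of DRAM versus terabytes of ioMemory" comparison; real sizing also depends on hit ratio, throughput, and redundancy needs.

```python
import math


def cache_servers_needed(working_set_gb: float, cache_per_server_gb: float) -> int:
    """Number of cache servers required to hold the whole working set."""
    return math.ceil(working_set_gb / cache_per_server_gb)


WORKING_SET_GB = 4000      # hypothetical cached working set
DRAM_PER_SERVER_GB = 64    # hypothetical DRAM-only cache server
FLASH_PER_SERVER_GB = 1000 # hypothetical ioMemory-backed cache server

print(cache_servers_needed(WORKING_SET_GB, DRAM_PER_SERVER_GB))   # 63
print(cache_servers_needed(WORKING_SET_GB, FLASH_PER_SERVER_GB))  # 4
```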

ARCHITECTURE LIMITATIONS BASED ON SIZE

Databases realize the greatest performance improvements when the entire database is stored on ioMemory. However, the number and size of PCI Express slots within a database server limit the number of ioMemory products that can be installed, and therefore limit the size of the database for this configuration. [3] If the database size exceeds the usable capacity of ioMemory, significant performance gains can still be achieved by moving the most I/O-intensive components to ioMemory.

The following table shows expected improvements for an “All ioMemory” architecture compared to moving I/O intensive components.


Figure 10. Expected improvement from different ioMemory architectures.

Use case                       Expected improvement [4]
Entire database on ioMemory    400% to 1000%
Tempdb on ioMemory             50% to 200%
Log files on ioMemory          50% to 200%

The following table lists high I/O components for several popular databases.

Figure 11. High I/O components for several popular databases.

Database platform    Specific components
MS SQL Server        Log files, tempdb, indexes, partition tables, and frequently accessed tables
Oracle               Redo logs, indexes, temporary tablespace, undo tablespace, and frequently accessed tables
Sybase               Log files, tempdb, indexes and frequently accessed tables
MySQL                Binary log, application logs, and frequently accessed tables

DETERMINING A SERVER’S IOMEMORY CAPACITY

Consult the PCI Express specification for your server and the ioMemory device interface requirements to determine the capacity limits of a server. Contact your Fusion-io representative if you have any questions.

SIZING DATABASES BASED ON WRITE LOAD

The vast majority of deployments require no change to the ioMemory device. However, databases with heavy write patterns should format the device for optimal high-write performance. This operation is enabled via the GUI or CLI and is undertaken before deployment. The level of write optimization is based on read/write ratio, number of IOPS, and duration of high load.

The table below provides formatting recommendations.

Figure 12. Formatting recommendations for high-write performance.

Write percentage        Recommended formatting and impact
Up to 30% writes        Default
Up to 50% writes        Improved write performance: reduces usable capacity about 30%
70% writes and above    Maximum write performance: reduces usable capacity about 50%
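The recommendations in Figure 12 can be combined with the RAID considerations discussed earlier into a rough capacity estimate. The Python sketch below is illustrative only: the profile names, the 640 GB device size, and the handling of the 50% to 70% range (which the table does not cover) are assumptions, not Fusion-io settings.

```python
# Usable-capacity reductions from Figure 12, as whole percentages.
# The profile names are hypothetical labels, not Fusion-io settings.
CAPACITY_REDUCTION_PCT = {"default": 0, "improved_write": 30, "maximum_write": 50}


def recommended_profile(write_pct: float) -> str:
    """Map a workload's write percentage to a formatting profile."""
    if not 0 <= write_pct <= 100:
        raise ValueError("write_pct must be between 0 and 100")
    if write_pct <= 30:
        return "default"
    if write_pct <= 50:
        return "improved_write"
    # Figure 12 gives no recommendation between 50% and 70% writes;
    # choosing the maximum-write profile there is an assumption.
    return "maximum_write"


def usable_capacity_gb(device_gb: float, devices: int, profile: str,
                       mirrored: bool = False) -> float:
    """Estimate usable capacity across several devices.

    RAID 0 (or no RAID) sums device capacity; RAID 1 mirroring, assumed
    here to pair devices, halves it.
    """
    raw_gb = device_gb * devices
    if mirrored:
        raw_gb /= 2  # RAID 1 keeps two copies of every block
    return raw_gb * (100 - CAPACITY_REDUCTION_PCT[profile]) / 100


# Hypothetical server: four 640 GB devices, workload with 45% writes.
profile = recommended_profile(45)
print(profile, usable_capacity_gb(640, 4, profile))
print(profile, usable_capacity_gb(640, 4, profile, mirrored=True))
```

Working in whole percentages keeps the arithmetic exact for round device sizes; a production sizing exercise would start from the vendor's published usable capacities instead.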

IMPLEMENTING HIGH AVAILABILITY

ioMemory protects data from being lost or corrupted in the event of a server failure. However, it cannot prevent server failures. Additionally, while rare, it is possible for hardware components on the ioMemory device to fail. Organizations still need to implement server redundancy schemes at the level they require, including providing a failover server, backup power, separate power sources, etc. The following table describes recommended options for implementing high availability with ioMemory:


Figure 13. Recommended options for implementing high availability with ioMemory.

Database platform    Recommended option
MS SQL Server        Database Mirroring, Multi-site Clustering
Oracle               Data Guard
MySQL                Master-slave Replication
Sybase               Database Mirroring

Conclusion

ioMemory is architected like no other solid-state technology today. Its advanced design over SSDs enables the performance benefits of memory through the familiar disk interfaces of modern operating systems, eliminating the complexity of traditional storage architecture that creates unnecessary context switching, deep queuing, and I/O storms. ioMemory dramatically improves database performance, out-performing every disk storage array on the market, even when they are SSD-equipped.

ioMemory performance allows organizations to leverage database architectures to create higher performing, more scalable, and simpler environments that require much lower upfront (CAPEX) and operational (OPEX) costs than were previously possible using legacy technology.

[1] Notably, because ioMemory acts like DRAM rather than disks, its performance and scalability in caching layers are much greater than SCSI, SATA, and FC SSDs.
[2] Performance improvement depends on how much data is added to the ioMemory cache and how frequently the data is accessed.
[3] Optimizing devices for higher write loads requires reducing usable capacity of drives. See “Sizing Databases Based on Write Load.”
[4] The impact of moving multiple I/O-intensive components to ioMemory is cumulative, but not linearly so.