Persistence of Memory: In-Memory Is Not Often the Answer
On the Persistence of Memory (in Database Systems) i
© 2012 Hired Brains Inc. All Rights Reserved
On the Persistence of Memory…
In Database Systems
By Neil Raden Hired Brains, Inc.
December, 2012
Table of Contents
Executive Summary
The Basics
Database Memory and Processing Models
In-Memory Database
Why is in-memory, a fairly old concept, interesting again?
Limitations of iMDB
  Cost
  Persistence
  Volume
  Dual-Purpose OLTP and Analytics
  Not so "green"
The Hybrid DBMS
Compare and Contrast
Conclusion
About the Author
Executive Summary
A recent drop in computer memory prices, together with the introduction of early in-memory database solutions, has raised the level of interest in in-memory databases, but the topic is not new. In fact, there are literally dozens of in-memory database products, some in production for decades, but due to the prohibitive cost differential between memory-based and disk-based systems, none found a place beyond certain niche markets. The drastic drop in the cost of memory, combined with an equally remarkable growth in density and capacity, is now driving the discussion into the mainstream of computing architectures.
For the purposes of discussion, we refer to in-memory database systems as iMDB and to current relational database systems incorporating large memory models with attached storage (including traditional magnetic disk and solid-state devices) as hybrid-DBMS. Though the discussion is occasionally technical, our conclusions are that:
• iMDB leverage lower-cost RAM for storage but still lack persistence and data scalability, which limits the types of solutions the iMDB architecture can support.
• Hybrid-DBMS is a proven technology that provides high performance and a flexible architecture to support a variety of analytics applications.
The Basics
All database management systems (DBMS), in fact virtually all programs in conventional computing environments, behave exactly the same way. A central processing unit (CPU) performs a single, very low-level instruction on a single piece of data. While complex application programs like a DBMS have many layers of functionality and can be described logically as a set of higher-level interworking pieces, the CPU has utterly no insight into this; it just chugs along one instruction at a time. If you were to sit inside a CPU and watch its stream of sequential operations, you would be unable to determine what the controlling program was doing. Database software, or really any software, is just a logical structure that encapsulates all of the smaller steps. When things get calculated, they bear no resemblance to the whole; a CPU does not know what a join or an index is. How those bits of work are presented to the CPU is the heart of application design. In other words, though there is no difference in how CPUs execute from one application to another, the order of those instructions is the key to performance.
Each step in execution is composed of a single instruction and a single piece of data (though today's CPUs are composed of multiple "cores," essentially multiple CPUs on a single chip). The instruction and the data have to be presented to the CPU through memory, either system RAM or a memory cache on the CPU itself. It makes no difference whether the application is "in-memory" or disk-based; the CPU still has to be presented with the instruction (actually, the instruction set is burned into the CPU; what is presented to it is a directive for which instruction to execute). For this reason, an in-memory architecture, where all instructions and data are in RAM, should in theory provide superior performance compared to a DBMS that must fetch data from remote mechanical disk drives.
Solid-state drives (SSD), mentioned above, use solid-state memory chips, typically flash memory (NAND), instead of spinning magnetic disks. Flash memory is less expensive and slower than RAM/SRAM, but it is non-volatile, meaning it retains data
without power. It does not lose data in the case of a system shutdown. RAM is volatile: it must be powered continuously and requires backup, typically to conventional disk drives, for reliability.
One could say that a DBMS with SSD instead of traditional disks is an in-memory device, but there is a fundamental difference: the memory chips of an SSD are part of a disk drive "card" or assembly that uses the same block addressing as the disks it replaces. In other words, even though the seek time for finding data on an SSD is at least an order of magnitude shorter than on a spinning magnetic disk (this is a generalization), there is still a call for external data, handled by the disk controller and passed to RAM. An interesting arrangement, typically used for add-on accelerators rather than primary database operations, is SSDs constructed from SRAM, which further reduces seek time. This is a special-purpose, very expensive architecture and is not considered further here.
Database Memory and Processing Models
To clear up confusion among the various models for memory in databases, it is useful to describe the predominant versions. The two predominant memory models for the most common database systems are shared memory and shared nothing. In both, memory is used only for processing, not for persistent storage. This is the essential difference between today's iMDBs and more conventional on-disk or hybrid systems.
In the shared memory model, all database operations use the same single aggregation of memory, and the system allocates its memory and processing tasks. All memory is available to every processor. In a shared nothing system, each separate node of processors and memory does its own work in parallel and is, typically, controlled by a
master node (which can be physical or virtual). In reality, nodes in a shared nothing environment may themselves operate as independent shared memory nodes. But in neither case is data stored in memory until it is called for. The exception is when data is cached (frequently used data is "pinned" in memory), but even cached data is volatile and can be flushed at any time.
iMDB operate more or less like shared memory systems, but everything, including the operating system, software programs (executables), workspace, indexes, and data, is stored in RAM. When these systems are scaled out with multiple nodes connected by a network, they operate more like a grid or distributed network than like a true MPP-engineered system. However, the concepts of shared memory versus shared nothing are a little obsolete now, as CPUs themselves are multi-core, meaning the processors are capable of parallel processing, provided the software (the DBMS) has been designed to take advantage of it.
This description is a simplification and there are many exceptions, but in general, no
database management system stores data persistently in memory except, of course,
iMDB. The difference between the various memory models described above is how
memory is used for processing data.
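The shared-nothing pattern described above can be illustrated with a toy scatter-gather sketch (the names, the four-node layout, and the hash-partitioning scheme are invented for illustration; real MPP systems add query optimizers, interconnects, and data redistribution):

```python
# Toy shared-nothing sketch: rows are hash-partitioned across nodes,
# each node scans only its own partition, and a master gathers results.
NODES = 4

partitions = [[] for _ in range(NODES)]

def insert(row):
    # Hash-partition on the row key so each node owns a disjoint slice.
    partitions[hash(row["id"]) % NODES].append(row)

def parallel_count(predicate):
    # Each "node" works only on its own partition (no shared memory);
    # the master simply sums the partial results.
    return sum(
        sum(1 for row in part if predicate(row))
        for part in partitions
    )

for i in range(100):
    insert({"id": i, "amount": i * 10})

print(parallel_count(lambda r: r["amount"] >= 500))  # 50 rows match
```

The point of the sketch is only that no node ever reads another node's memory; all coordination happens through the gathering step.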
In-Memory Database
It is an unassailable truth that processing data from memory is orders of magnitude faster than retrieving it from a disk drive, but that is only a small part of the story. Historically, CPUs have been "I/O bound," meaning they spent a significant amount of time waiting for requested data to arrive, requiring extreme countermeasures in software design to minimize the latency. With data streaming to processors at the speed of random-access memory (RAM), just the opposite situation
can occur: the CPUs may become flooded with data and unable to process it as quickly as it is presented. The point cannot be stressed enough: merely boosting the available RAM does not guarantee smooth, faster execution of existing programs. This turn of events calls for careful engineering and balance. In other words, the performance of complex applications is rarely resolved by changing one thing; it usually requires rethinking the whole approach. The result is that software migration to in-memory usually requires a great deal of rework; it is not just move and drop.
Even the notion of iMDB is a bit of a misnomer, as there is still a requirement for separate conventional storage devices to mirror everything for persistence and to keep the iMDB refreshed and reliable. Systems can fail, which means in-memory systems still have to maintain multiple copies of the data and perform a complete reload if the system fails. Adding all of these factors together can make the effort quite expensive despite the seemingly reasonable price of memory today (though at multiple terabytes, you will feel the pinch). In addition, to make maximum use of RAM, all database systems use compression of data to one degree or another. iMDBs typically employ aggressive compression algorithms to maximize the amount of data that can be put in working memory. The backup of an iMDB is usually lightly compressed or uncompressed so it can be read by other processes, among other reasons. Assuming a realistic 3.5x compression for an iMDB (not all RAM is available for the data), the backup drives will need to be 5x the size of RAM, and there may be multiple archives, and the backups themselves will likely be mirrored. With even average-sized analytical data warehouses today running about 50 terabytes (there are, of course, much larger ones), an iMDB to accommodate those
would need 75-100 TB of separate disk drives to handle backups, snapshots, logs, and staging areas.
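The sizing arithmetic above can be checked with a quick back-of-the-envelope calculation (the RAM overhead factor and the snapshot allowance are illustrative assumptions, not figures from any vendor):

```python
# Back-of-the-envelope backup sizing for an iMDB (illustrative figures).
warehouse_tb = 50        # uncompressed warehouse size, per the text
compression = 3.5        # assumed in-memory compression ratio
ram_overhead = 1.4       # assumed factor for OS, workspace, indexes in RAM

# RAM needed to hold the compressed data plus working overhead:
ram_tb = warehouse_tb / compression * ram_overhead    # ~20 TB of RAM

# Backups are lightly compressed or uncompressed, so the disk footprint
# starts at the full uncompressed size, then grows with snapshots,
# logs, and staging areas:
snapshot_factor = 1.5    # assumed allowance for snapshots, logs, staging
backup_tb = warehouse_tb * snapshot_factor            # ~75 TB of disk

print(round(ram_tb), round(backup_tb))
```

Under these assumptions a 50 TB warehouse already lands at the low end of the 75-100 TB disk estimate before mirroring the backups themselves.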
Another thing to consider is that a database still has to perform all of the database functions, from loading data to presenting it as the result of a query. Conventional relational database technology, including platforms that are designed specifically for data warehousing and analytical work (as opposed to transactional processing), must employ a host of services to be useful to an enterprise, including:
• Workload management for efficient use of resources
• Security
• Reliability
• High availability
• Use of performance statistics for query optimization.
They must also support, in addition to traditional row-based schema, columnar organization of the data, which is particularly effective for wide tables with many attributes but less effective with more normalized schema, and which has some serious drawbacks for updating the database in real time. But columnar orientation is not a feature limited to iMDBs; most analytical database systems incorporate columnar mode or even operate solely in it.
Why is in-memory, a fairly old concept, interesting again?
iMDBs have been used for quite some time, but they have always been limited by three factors: the cost of memory, the size of the database, and the persistence of data. Today, a dollar buys 500 to 1,000 times as much memory as it did in 1995, and the capacity per square inch of the chips has increased by a similar factor. Memory speeds have increased as well, though not as dramatically. If the amount of data that could
be stored in early in-‐memory systems was too small for most applications, 1000 times
more memory might be enough for in-‐memory to be feasible.
This extremely simplified diagram depicts the essential (but certainly not all) differences between an iMDB and a hybrid-DBMS. The iMDB maximizes the use of RAM but uses essentially the same hardware architecture of two CPUs with levels of on-board cache, with RAM holding the entire database, the database software, working space, caches, and embedded functionality. The only difference in the hybrid-DBMS is less reliance on RAM and the ability to address vastly greater amounts of data from the storage subsystem. The hybrid-DBMS has documented databases of greater than a petabyte.
iMDB typically scale out to 16 servers with up to 1 terabyte of RAM each, but with a significant amount of that RAM taken up by the operating system, working memory, and so on. Therefore, even with 5x compression, the maximum amount of uncompressed data per
system is no more than 40 TB. Given the expense of these large iMDB systems, scaling out to the sizes that are needed today is difficult.
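That ceiling follows from simple arithmetic (the 50% usable-RAM share is an assumption for illustration; the text says only that a "significant amount" of RAM goes to the operating system and working memory):

```python
# Capacity ceiling for a scaled-out iMDB cluster (illustrative).
servers = 16            # typical maximum scale-out, per the text
ram_per_server_tb = 1.0
usable_fraction = 0.5   # assumed share of RAM left for data after overhead
compression = 5.0       # optimistic in-memory compression ratio

uncompressed_capacity_tb = (
    servers * ram_per_server_tb * usable_fraction * compression
)
print(uncompressed_capacity_tb)  # 40.0 TB for the whole system
```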
Limitations of iMDB
In-memory databases are constrained by several key limitations:
• No matter how inexpensive RAM is today compared to its historical cost, it is still considerably more expensive than its alternatives, limiting its usefulness for enterprise-level systems.
• Data cannot persist in memory indefinitely. It is inevitable that something will fail, which requires mechanisms to protect the data that can erode the value proposition.
• With today's data volumes, it is still not practical to use an in-memory approach for a data warehouse.
• iMDB rely on the system being up 24/7.
Cost
Though RAM is 10,000 times faster to read than a mechanical disk drive, data volumes today are enormous and growing. A petabyte-sized in-memory database would cost more than $5 million, perhaps twice that. SSD of that capacity would cost 1/5 to 1/10 the price, and a hybrid-DBMS with a hot/warm/cold hierarchical storage architecture would cost far less than that.
Persistence
In-memory architecture still requires conventional storage. RAM is volatile, and if something fails, or even just hiccups, there can be data loss. Therefore, everything in memory has to have a copy on less volatile storage devices. Updating the memory requires log files, "snapshots," and "checkpoints," which can slow down processing.
Volume
In-memory cannot economically, or even practically, scale to the volumes of today's data warehouses. Ten years ago, a terabyte-sized data warehouse was remarkable, but today there are dozens, perhaps even more than a hundred, greater than a petabyte, one thousand times larger. Projections are that this growth rate is not diminishing.
Dual-Purpose OLTP and Analytics
Some iMDB products promise the ability to perform OLTP and analytical processing on the same platform, with the same data. This would be a real advantage, as it would alleviate the need to extract and transform data from operational systems and provide analytical support without additional infrastructure. Unfortunately, this is currently impossible.
iMDB platforms generally cannot support OLTP because they have to wait for a transaction to complete on disk to be ACID compliant. When data is updated in memory, it is held in log files, usually stored on SSD drives. iMDB platforms use this disk-based "persistent" layer to weather a node failure, which, in a narrow sense, suggests they have ACID properties. When the iMDB node comes back up (after the failed part is replaced or a cold standby node takes over), the data resident on the disk "persistent" layer is reloaded back into memory. This can be done in one of two ways: "lazy," where the data is reloaded as queries enter the system and request a specific table (which doesn't really make sense, since the iMDB appears in memory as a single dimensional table), or "full," where queries must wait until all the data is reloaded. In both cases, the log files stored on disk or flash have to be read and applied.
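The two recovery strategies can be sketched abstractly (a toy model, not any vendor's implementation; the table names and row counts are invented):

```python
# Toy model of "full" vs. "lazy" reload after an iMDB node restart.
# The persistent layer is modeled as a dict of table -> rows on disk.
persistent_layer = {
    "orders":    [{"id": 1}, {"id": 2}],
    "customers": [{"id": 10}],
}

class Node:
    def __init__(self, strategy):
        self.strategy = strategy
        self.memory = {}          # tables currently loaded in RAM
        if strategy == "full":
            # Full reload: queries wait until everything is back in RAM.
            for table, rows in persistent_layer.items():
                self.memory[table] = list(rows)

    def query(self, table):
        if table not in self.memory:
            if self.strategy != "lazy":
                raise RuntimeError("table not loaded")
            # Lazy reload: fault the table in on first access.
            self.memory[table] = list(persistent_layer[table])
        return self.memory[table]

full_node = Node("full")
lazy_node = Node("lazy")
print(len(full_node.memory))          # 2: everything preloaded
print(len(lazy_node.memory))          # 0: nothing loaded yet
print(len(lazy_node.query("orders"))) # 2: faulted in on demand
```

In either case, the real systems must also replay the logs against the reloaded data, which the toy model omits.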
There are features to handle different kinds of failure, though. Both the SSD area and the disk persistent layer have RAID capability to cover for a disk failure. So if a node has a problem but keeps power, then all may be okay; it is an error-dependent issue. If there is a problem with a memory chip, it is unlikely the data will survive, requiring a
total reload. If a node loses power, then a total reload of all the data that was on that node is required.
Not so "green"
At a time when most vendors are formulating a "green" message, it turns out that iMDBs require a lot of power, considerably more than spinning drives and significantly more than solid-state drives. RAM is volatile and must be powered 24/7 if the data is to persist.
The Hybrid DBMS
iMDB vendors often portray disk-based systems as dinosaurs that have outlived their usefulness, but in fact they are the result of 30 years of research and development by some of the most brilliant minds in the technology industry, and they have hardly been standing still. In the same way that relational database technology gradually gained new hardware capabilities and evolved to become the hybrid-DBMS, it seems likely that the major database vendors will continue to evolve to leverage the advantages of more memory over disk drives. The dramatic cost reductions of memory have benefits that accrue to hybrid-DBMSs too: solid-state drives are replacing traditional magnetic drives, with improvements in I/O speed. Teradata Virtual Storage, for example, automatically manages the movement of hot and cold data. Large memory models are common, too, even if the persistent data remains on attached storage instead of completely in memory.
Another consideration is that for most database applications there is a clear difference between hot and cold data; in other words, data that is in use at the moment as opposed to data that is used less frequently. This tilts the decision between disk-only and in-memory toward an in-between alternative: a hybrid scheme with large memory, SSD drives, and less expensive, slower HDD for warm or cold data. Hybrid-DBMS leverage the speed of SSD to reduce query response times by cutting the painful delays introduced by lengthy I/O queues in HDD storage. A query requires many I/O
operations to complete, so the time spent by I/O requests in storage queues has a direct impact. Not only does the speed and parallel-channel capability of SSD result in 40x faster I/O completions, but the queues in the HDD are shortened by aiming 80% of I/O at the SSD; this can result in up to a 60x improvement in average response times.
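One way to see why the improvement can exceed the raw 40x device speedup is a toy M/M/1 queueing model (the arrival rate and the 80/20 split are illustrative assumptions, not vendor measurements):

```python
# Toy M/M/1 queueing sketch of why routing hot I/O to SSD can improve
# average response time by more than the raw device speedup.

def mm1_response_time(service_rate, arrival_rate):
    """Mean response time W = 1 / (mu - lambda) for an M/M/1 queue."""
    assert arrival_rate < service_rate, "queue must be stable"
    return 1.0 / (service_rate - arrival_rate)

arrival = 0.95   # total I/O arrival rate (HDD service time = 1 unit)

# Baseline: every I/O queues for the HDD (service rate 1).
baseline = mm1_response_time(1.0, arrival)

# Hybrid: 80% of I/O goes to SSD (40x faster); the HDD queue shrinks.
ssd_part = mm1_response_time(40.0, 0.8 * arrival)
hdd_part = mm1_response_time(1.0, 0.2 * arrival)
hybrid = 0.8 * ssd_part + 0.2 * hdd_part

print(round(baseline / hybrid, 1))  # well above the raw 40x device speedup
```

Because queueing delay grows sharply as a device nears saturation, relieving the HDD of 80% of its load shortens its queue, and the blended improvement lands above the device speedup alone.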
A hybrid scheme requires not only a physical assemblage of devices but also an intelligent data manager that continually and transparently optimizes the architecture by moving data to its best location. The figure below represents Teradata's version of such a system.2
Notice that in this scheme, each node is balanced with a combination of CPUs and their characteristics, the amount of RAM, and the storage devices. This provides an optimum balance between processing, memory, and addressable storage, which leads to optimal performance. It does, however, somewhat limit configuration flexibility, as the drives and CPUs are fixed.
2 Teradata is working on extending this data management to the memory layer.
Compare and Contrast
Today there are two ways to store data electronically: on arrays of solid-state memory chips (on either a memory bus or an SSD) or on magnetic disk drives. Solid-state chips are obviously faster than magnetic drives (although in some cases the differential can be overcome with good platform design and workload management). Solid-state chips are considerably more expensive than magnetic drives, and volatile RAM chips are considerably more expensive (and faster) than non-volatile flash. We can't see the future with perfect clarity, but it is likely that for the foreseeable future this stratification of memory and storage will not change, even as the price/performance of each continues to improve. The faster RAM chips will remain volatile, making full in-memory databases impractical for most uses.
iMDB lack the balance of CPU and storage, which can lead to flooding of the CPUs. iMDB trade the potential for I/O latency for the very real possibility of RAM out-performing the processors: without an I/O bottleneck, processors can become saturated. This is something that software developers should be aware of, and design for, but given the relative recency of certain iMDBs, these features may not be well developed. It may be the case that client applications need to be rewritten not only to take advantage of the memory resources but to keep them from bogging down.
iMDB rely on large banks of very fast, expensive RAM, but also on other types of memory and storage for high availability and backup. Hybrid-DBMS rely on the same collection of memory and storage types, but in different proportions. A hybrid system uses solid-state memory judiciously and attempts to keep as much data pinned in memory as possible for active work, but relies on only one mechanism for persistent storage.
Conclusion
iMDB vendors claim that in-memory will replace the traditional hybrid-DBMS, but unless there are new laws of physics, holding persistent data for months or years simply isn't feasible without resorting to a hybrid in-memory and disk-based system. In a way, one can think of an iMDB as merely an accelerator for a conventional database, because it cannot meet the requirements of durability on its own.
On the other hand, hybrid-DBMS are based on proven data warehousing technologies, offer flexible architectures, and deliver high performance with automatic storage management.
It would be easy to predict that iMDBs, including DBMS with all-SSD drives, will eventually overtake disk-based systems. However, the cost of memory, whatever it becomes, will still be greater than that of disk drives, and though it is impossible to predict, the amount of data captured and analyzed will likely continue to grow at a rate faster than the price per GB of memory falls.
ABOUT THE AUTHOR
Neil Raden, based in Santa Fe, NM, is an industry analyst and active consultant, a widely published author and speaker, and the founder of Hired Brains, Inc., http://www.hiredbrains.com. Hired Brains provides consulting, systems integration, and implementation services in Data Warehousing, Business Intelligence, "big data," Decision Automation, and Advanced Analytics for clients worldwide. Hired Brains Research provides consulting, market research, product marketing, and advisory services to the software industry.
Neil was a contributing author of one of the first (1995) books on designing data warehouses, and he is more recently the co-author of Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions (Prentice-Hall, 2007). He welcomes your comments at [email protected] or at his blog, Competing on Decisions.