In-Memory Data Grids, Demystified!

Uri Cohen
Head of Product @ GigaSpaces
@uri1803
github.com/uric

Agenda

• Why IMDG?
• Brief History
• How It Works
  – Data model & placement
  – HA and fault tolerance
  – Consistency
  – Internals

Why IMDG?

Today, more than ever, there are many choices when it comes to storing your data

But There Are Many Solutions

Just A Few Years Back

So Why Indeed??

The Need for Speed, in Real Time…

Some Facts

Memory will always be faster than disk (usually by orders of magnitude)

Recent Survey

The ratio of IT managers who think that real-time analysis is the biggest challenge for big data implementations

• Plan to use in-memory technologies for big data projects
• Only 32% mentioned Hadoop

Stream Processing

Hell, Even Gartner Thinks So

“In memory computing (IMC) … provides transformational opportunities. The execution of certain-types of hours-long batch processes can be squeezed into minutes or even seconds … Millions of events can be scanned in a matter of a few tens of millisecond to detect correlations and patterns pointing at emerging opportunities and threats ‘as things happen.’”

And nowadays

HW and SW just make it a whole lot cheaper

Some Common Use Cases

Fast, Transactional Data Access

• Inventory management
• Financial reference data
• Real time transactional data

Real Time Stream Processing

• Fraud Detection
• Click Stream Analysis
• Real time analytics
• Continuous calculation

Heavyweight Offline Calculations

• Trade Reconciliation
• Pattern analysis and detection
• Number crunching

Caching

• Database offloading
• Content heavy websites

The Evolution of Data Grids

First There Were Local Caches

• Cache: in-process caching of a Key->Value data structure
• Distributed Cache: partitioned cache
• IMDG: partitioned system of record
• IMDG.next()

Good for repetitive-data reads

Limited in capacity

Doesn’t handle write-heavy scenarios

Reads are only part of the latency path

Then Came Distributed Caches

Increased Capacity

Still no support for write-heavy scenarios

Limited to ID-based reads

Reads are only part of the latency path

In Memory Data Grids

Increased capacity

Write scalability

Can serve as system of record with querying & transaction semantics

Still limited in capacity

Latency can come from other parts of your app

How It Works

Data Models

Data Placement – Fixed Hashing

hash(key) % #nodes
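The formula above can be sketched in a few lines. This is only an illustration, not how any particular product implements it: the `stable_hash` and `partition_for` names and the `order-N` keys are made up for the example, and MD5 is used simply because Python's built-in `hash()` is salted per process and would give different placements on each run.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def partition_for(key: str, num_nodes: int) -> int:
    """Fixed hashing: hash(key) % #nodes."""
    return stable_hash(key) % num_nodes


# Place ten illustrative keys on a 4-node grid.
keys = [f"order-{i}" for i in range(10)]
placement = {key: partition_for(key, 4) for key in keys}

for key, node in placement.items():
    print(f"{key} -> node {node}")
```

Every client that knows the node count can compute the owner of a key locally, with no lookup service in the read or write path.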

Fixed Hashing – HA

hash(key) % #nodes
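One way to picture HA on top of fixed hashing is to give every key a primary and a backup copy on different nodes. The sketch below assumes the backup lives on the next node (`primary + 1`, wrapping around); that placement rule and all function names are assumptions made for illustration, and real grids often pair each primary partition with a dedicated backup instance instead.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def primary_and_backup(key: str, num_nodes: int) -> tuple:
    """Primary is hash(key) % #nodes; backup is assumed to be the next node."""
    primary = stable_hash(key) % num_nodes
    backup = (primary + 1) % num_nodes  # assumption: simple neighbor placement
    return primary, backup


def owner_after_failure(key: str, num_nodes: int, failed_node: int) -> int:
    """Serve the key from the backup if its primary node has failed."""
    primary, backup = primary_and_backup(key, num_nodes)
    return backup if primary == failed_node else primary


keys = [f"order-{i}" for i in range(10)]
for key in keys:
    primary, backup = primary_and_backup(key, 4)
    print(f"{key}: primary={primary} backup={backup} "
          f"served by {owner_after_failure(key, 4, failed_node=2)} if node 2 fails")
```

Because the node count in the formula does not change when a node fails, keys whose primary is still alive keep their placement; only the failed node's keys are redirected.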

Fixed Hashing – Scaling

Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
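The scaling problem with fixed hashing is easy to measure: when the node count changes, `hash(key) % #nodes` changes for most keys. The sketch below counts how many of 10,000 made-up keys change partition when a grid grows from 4 to 5 nodes; the helper names and key format are illustrative assumptions.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose partition changes when the node count changes."""
    moved = sum(
        1 for key in keys
        if stable_hash(key) % old_nodes != stable_hash(key) % new_nodes
    )
    return moved / len(keys)


keys = [f"order-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_nodes=4, new_nodes=5)
print(f"{fraction:.0%} of keys change partition when growing from 4 to 5 nodes")
```

For a uniform hash, a key keeps its partition when going from n to n+1 nodes only with probability 1/(n+1), so roughly 80% of the data has to move in the 4-to-5 case. That cost is what motivates consistent hashing.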

Data Placement – Consistent Hashing
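A minimal consistent-hashing sketch, under a few assumptions: nodes are placed on a ring at many hashed positions ("virtual nodes", 100 per node here), a key belongs to the first node position clockwise from its own hash, and the `HashRing` class, node names, and key format are invented for the example rather than taken from any specific product.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes: int = 100):
        # Each node appears at `vnodes` positions to even out the load.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._points)
        return self._ring[idx][1]


keys = [f"order-{i}" for i in range(10_000)]
before = HashRing(["node-a", "node-b", "node-c", "node-d"])
after = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"])

moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved) / len(keys):.0%} of keys moved after adding node-e")
```

Adding a node only takes over the ring segments that now precede its positions, so every key that moves goes to the new node and the rest stay where they were. With modulo-based fixed hashing, the same change would relocate most keys.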

Data Consistency

Since we’re dealing with distributed data, consistency cannot be taken for granted

• Read after write
• Read after read
• Write-write consistency

Solution 1: Single Master

Solution 2: Read/Write Quorums
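The quorum idea can be sketched as follows: with N replicas, a write must reach W of them and a read must consult R of them, and choosing W + R > N guarantees that every read set overlaps the latest successful write set. The `QuorumStore` class is a toy model written for this example; in particular, the single in-process counter used for versions is an assumption standing in for the timestamps or vector clocks a real system would need.

```python
class QuorumStore:
    """Toy model of N replicas with write quorum W and read quorum R."""

    def __init__(self, n: int, w: int, r: int):
        if w + r <= n:
            raise ValueError("need W + R > N so read and write sets overlap")
        self.n, self.w, self.r = n, w, r
        self.replicas = [dict() for _ in range(n)]
        self._clock = 0  # assumption: one global version counter

    def write(self, key, value, reachable) -> bool:
        """Apply the write only if at least W replicas are reachable."""
        if len(reachable) < self.w:
            return False
        self._clock += 1
        for i in reachable:
            self.replicas[i][key] = (self._clock, value)
        return True

    def read(self, key, reachable):
        """Ask R or more replicas and return the value with the highest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas for a read quorum")
        answers = [self.replicas[i].get(key, (0, None)) for i in reachable]
        return max(answers, key=lambda answer: answer[0])[1]


store = QuorumStore(n=3, w=2, r=2)
store.write("k", "v1", reachable=[0, 1])
store.write("k", "v2", reachable=[1, 2])

# Replica 0 still holds the stale "v1", but the read quorum also includes
# replica 2, which saw the latest write.
latest = store.read("k", reachable=[0, 2])
print(latest)
```

Lowering W makes writes cheaper and raising R makes reads more expensive, and vice versa; the single-master approach is roughly the special case where one designated replica sees every write.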

Some More Concerns

• Transactions
• Querying
• Failure detection
• Leader election
• Persistency
• Interoperability

IMDG.next()

Using IMDG for messaging and business logic (BL)

IMDG.next()

SSD FTW!

Thank You!

docs.gigaspaces.com
