Fit for Purpose: The New Database Revolution (Findings Webcast)
TRANSCRIPT
One Size Doesn’t Fit All: The Database Revolution
April 25, 2012
Mark R. Madsen, http://ThirdNature.net
Robin Bloor, http://Bloorgroup.com
Wednesday, April 25, 12
Your Hosts

Analyst hosts: Robin Bloor and Mark Madsen
Introduction
Significant and revolutionary changes are taking place in database technology.
In order to investigate and analyze these changes and where they may lead, The Bloor Group has teamed up with Third Nature to launch an Open Research project.
This is the final webinar in a series of webinars and research activities that have comprised part of the project.
All published research will be made available through our web site: Databaserevolution.com
Sponsors of This Research
General Webinar Structure
Market Changes, Database Changes (Some Of The Findings)
Let’s Talk About Performance
How to Select A Database
Market Changes, Database Changes
Database Performance Bottlenecks
CPU saturation
Memory saturation
Disk I/O channel saturation
Locking
Network saturation
Parallelism – inefficient load balancing
Multiple Database Roles
[Diagram: database roles across the enterprise. Transactional systems (apps backed by files or DBMSs) feed BI and analytics systems: staging area, data warehouse, operational data store, data marts, OLAP cubes, and personal data stores, spanning both structured and unstructured data (content DBMS). Now there are more...]
The Origin of Big Data
+ Embedded Systems Data
+ Social Network Data
+ Web Data
+ Supply Chain & Cust. Data
+ Personal Data
+ Unstructured Data
Corporate Databases
Big Data = Scale Out
[Diagram: the columnar database scales up and out by adding more servers. Data is compressed and partitioned on disk by column and by range. The query is decomposed into a sub-query for each node; each server has its own CPUs, common memory, and cache over its share of the table data.]
Let’s Stop Using the Term NoSQL
As the graph indicates, it’s just not helpful. In fact it’s downright confusing.

[Chart: “nosql”, “newsql”, and “oldsql” products plotted against data volume and data model complexity: single table, star schema, snowflake, TNF schema, nested data, graph data, complex data, OLAP.]
NoSQL Directions

Some NDBMS do not attempt to provide all ACID properties (Atomicity, Consistency, Isolation, Durability).
Some NDBMS deploy a distributed scale-out architecture with data redundancy.
XML DBMS using XQuery are NDBMS.
Some document stores are NDBMS (OrientDB, Terrastore, etc.)
Object databases are NDBMS (Gemstone, Objectivity, ObjectStore, etc.)
Key-value stores = schema-less stores (Cassandra, MongoDB, Berkeley DB, etc.)
Graph DBMS (DEX, OrientDB, etc.) are NDBMS
Large data pools (BigTable, HBase, Mnesia, etc.) are NDBMS
The Joys of SQL?
SQL: very good for set manipulation. Works for OLTP and many query environments.
Not good for nested data structures (documents, web pages, etc.)
Not good for ordered data sets
Not good for data graphs (networks of values)
The “Impedance Mismatch”
The RDBMS stores data organized according to table structures
The OO programmer manipulates data organized according to complex object structures, which may have specific methods associated with them.
The data does not simply map to the structure it has within the database
Consequently a mapping activity is necessary to get and put data
Basically: hierarchies, types, result sets, crappy APIs, language bindings, tools
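To make the mismatch concrete, here is a minimal sketch of the mapping activity the slide describes, with a hypothetical `Employee` class and `emp` table (not from the webcast). The object carries behavior; the database only sees flat column values, so the application needs explicit get/put mapping code.

```python
# A minimal sketch of hand-written object-relational mapping.
# The Employee class and emp table are illustrative assumptions.
import sqlite3
from dataclasses import dataclass

@dataclass
class Employee:
    emp_id: int
    name: str
    salary: int

    def give_raise(self, pct: float) -> None:
        # Behavior lives on the object; the database never sees this method.
        self.salary = int(self.salary * (1 + pct))

def load_employee(conn: sqlite3.Connection, emp_id: int) -> Employee:
    # "Get": map a flat table row back into an object.
    row = conn.execute(
        "SELECT id, name, salary FROM emp WHERE id = ?", (emp_id,)
    ).fetchone()
    return Employee(*row)

def save_employee(conn: sqlite3.Connection, e: Employee) -> None:
    # "Put": flatten the object back into table columns.
    conn.execute(
        "UPDATE emp SET name = ?, salary = ? WHERE id = ?",
        (e.name, e.salary, e.emp_id),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO emp VALUES (1, 'Marge Inovera', 150000)")

emp = load_employee(conn, 1)
emp.give_raise(0.10)
save_employee(conn, emp)
```

An ORM automates the two mapping functions, but the translation work (and its cost) does not go away.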
The SQL Barrier
SQL has: DDL (for data definition) and DML (for Select, Project and Join)
But it has no MML (for math) or TML (for time)
Usually result sets are brought to the client for further analytical manipulation, but this creates problems
Alternatively doing all analytical manipulation in the database creates problems
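The first pattern can be sketched as follows: SQL fetches the ordered rows, and the math (here a 3-point moving average, something plain SQL of the era had no operator for) runs client-side. The `sales` table and numbers are invented for illustration.

```python
# Sketch of the "bring the result set to the client" pattern.
# The sales table and values are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)])

# DML can deliver the ordered result set...
rows = conn.execute("SELECT amount FROM sales ORDER BY day").fetchall()
amounts = [r[0] for r in rows]

# ...but the analytical manipulation happens outside the database,
# which is where the problems start once the result set gets large.
window = 3
moving_avg = [sum(amounts[i - window + 1 : i + 1]) / window
              for i in range(window - 1, len(amounts))]
print(moving_avg)
```

Pushing the same moving average into the database instead (the alternative the slide mentions) trades network shipping for awkward, engine-specific SQL.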
Hadoop/MapReduce
Hadoop is a parallel processing environment
Map/Reduce is a parallel processing framework
HBase turns Hadoop into a database of a kind
Hive adds an SQL capability
Pig adds analytics
[Diagram: a scheduler dispatches mapping processes on HDFS nodes 1..i and reducing processes on nodes i+1..k, each with backup/recovery. Stages: Map, Partition, Combine, Reduce.]
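The stages in the diagram can be mimicked in a few functions. This is a toy single-process word count, not Hadoop code: real MapReduce distributes each stage across nodes, but the data flow (map emits pairs, a partitioner routes keys to reducers, reducers aggregate) is the same.

```python
# Toy map / partition / reduce pipeline mirroring the diagram's stages.
from collections import defaultdict

def map_phase(lines):
    # Map: emit (word, 1) pairs from each input line.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def partition_phase(pairs, n_reducers):
    # Partition: route each key to a reducer bucket by hash.
    buckets = [defaultdict(list) for _ in range(n_reducers)]
    for key, value in pairs:
        buckets[hash(key) % n_reducers][key].append(value)
    return buckets

def reduce_phase(bucket):
    # Reduce: sum the values collected for each key.
    return {key: sum(values) for key, values in bucket.items()}

lines = ["big data big hype", "big databases"]
buckets = partition_phase(map_phase(lines), n_reducers=2)
result = {}
for bucket in buckets:
    result.update(reduce_phase(bucket))
print(result)
```

A combiner would simply run the reduce function on each mapper's local output first to cut the data shipped between stages.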
Market Forces
A new set of products appears
They include some fundamental innovations
A few are sufficiently popular to last
Fashion and marketing drive greater adoption
Product defects begin to be addressed
They eventually challenge the dominant products
Let’s Talk About Performance
Performance and Scalability

Scalability and performance are not the same thing.

Throughput: the number of tasks completed in a given time period. A measure of how much work is or can be done by a system in a set amount of time, e.g. TPM or data loaded per hour.

It’s easy to increase throughput without improving response time.
Performance measures

Response time: the speed of a single task.

Response time is usually the measure of an individual’s experience using a system.

Response time = time interval / throughput
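The distinction is easy to see with made-up numbers: a system that completes 120 tasks in 60 seconds has a throughput of 2 tasks/second, and the slide's formula gives the mean time per task.

```python
# Throughput vs. (mean) response time, with illustrative numbers.
tasks_completed = 120            # tasks finished in the interval
interval_seconds = 60.0

throughput = tasks_completed / interval_seconds          # tasks per second
mean_response_time = interval_seconds / tasks_completed  # the slide's formula

print(throughput, mean_response_time)
```

Doubling the number of concurrent workers could double throughput while leaving each individual task just as slow, which is the point of the previous slide.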
Scalability vs. throughput vs. response time

Scalability = consistent performance for a task over an increase in a scale factor.

Three possible scale factors:
▪ Number of users
▪ Computations
▪ Amount of data
Scale: Data Volume

The different ways people count make establishing rules of thumb for sizing hard.

How do you measure it?
▪ Row counts
▪ Transaction counts
▪ Data size
▪ Raw data vs. loaded data
▪ Schema objects

People still have trouble scaling for databases as large as a single PC hard drive.
Scale: Concurrency (active and passive)

Scalability relationships

As concurrency increases, response time (usually) suffers.
This can be addressed somewhat via workload management tools.
When a system hits a bottleneck, response time and throughput will often get worse, not just level off.
“Linear Scalability”

This is the part of the chart most vendors show.
If you’re lucky they leave the bottom axis on so you know where their system flatlines.
Scale: Computational Complexity

A key point worth remembering:
Performance over size <> performance over complexity.
Analytics performance is about the intersection of both.
Database performance for BI is mostly related to size and query complexity.
SOME TECHNOLOGY STUFF
Large Memories and Large Databases

Not as fast as you expect because of how databases were designed (optimized for small memories and disk access). For example: sequential scans and cache serialization.

512 GB DB buffer cache; 1B rows at 100 rows/block = 640 GB of table unread; LRU overwrites older blocks.
In-Memory Databases Today

1. Maybe not as fast as you think. Depends entirely on the database (e.g. VectorWise).
2. Applied mainly to shared-everything systems.
3. Very large memories are more applicable to shared-nothing than shared-memory systems.
4. Still an expensive way to get performance:
   Box-limited (e.g. 2 TB max), or limited by node scaling (e.g. 16 nodes at 512 GB each = 8 TB).
Hardware changes enable new software models

The extra CPU allows us to do things in software that we avoided in the past because of scarce resources.
Compression techniques and columnar database architectures that once consumed too much CPU are now possible.
Improving Query Performance: Columnar Databases

In a row-store model these rows would be stored in sequential order, packed into a block. In a column store they would be divided into columns and stored in different blocks.

ID | Name           | Salary   | Position
1  | Marge Inovera  | $150,000 | Statistician
2  | Anita Bath     | $120,000 | Sewer inspector
3  | Ivan Awfulitch | $160,000 | Dermatologist
4  | Nadia Geddit   | $36,000  | DBA
Inserting data into a columnar database

Each column is stored in its own set of blocks, written to disk separately.
Extra work for writes over a row store: update complexity, delete complexity.

[Diagram: the same four-row table split into separate column blocks for ID, name, salary, and position.]
Reading from a columnar database

SELECT * FROM emp WHERE ID = 1
4 reads, extract & stitch

[Diagram: retrieving one row requires a read from each of the four column blocks, then stitching the values back together into a row.]
Column elimination and I/O

SELECT AVG(salary) FROM emp
1 read

[Diagram: only the salary column’s blocks are read; the ID, name, and position columns are eliminated from the scan.]
How do we scale performance for queries?

Three options: make the CPU faster, parallelize query execution, or add CPUs.
Faster CPUs mean quicker response time and increased throughput.
Parallel query execution improves response time but consumes more resources, reducing concurrency and possibly throughput.
More CPUs mean more throughput.
Early query performance scaling: table partitioning

Table partitioning distributes rows across table partitions by range, hash, or round robin when you insert or load the data.

[Diagram: a partitioning function fn routes rows into Q1, Q2, Q3, and Q4 sales tables.]
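The diagram's range partitioning can be sketched as a routing function over the sale date's quarter. The rows and dates are made up for illustration.

```python
# Sketch of range partitioning: fn routes each row to a quarterly table.
from datetime import date

partitions = {"Q1": [], "Q2": [], "Q3": [], "Q4": []}

def fn(row):
    # Range partitioning on the sale date's quarter.
    quarter = (row["sale_date"].month - 1) // 3 + 1
    return f"Q{quarter}"

def insert(row):
    partitions[fn(row)].append(row)

insert({"sale_date": date(2012, 2, 14), "amount": 100})
insert({"sale_date": date(2012, 4, 25), "amount": 250})
insert({"sale_date": date(2012, 11, 3), "amount": 75})

# A query restricted to Q2 touches one partition instead of the whole table.
q2_total = sum(r["amount"] for r in partitions["Q2"])
print(q2_total)
```

Partition pruning like this was the early win: the optimizer skips whole partitions when the predicate matches the partitioning key.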
Scale-up vs. Scale-out Parallelism

Uniprocessor environments required chip upgrades.
SMP servers can grow to a point, then it’s a forklift upgrade to a bigger box.
MPP servers grow by adding more nodes.

(a) Scaling up with a larger server. (b) Scaling out with many small servers.

Copyright Third Nature, Inc.
Sharding, aka Partitioning at the Node Level

Sharding is basically horizontal partitioning applied across multiple database servers.
Each node holds a (hopefully) self-consistent portion of the database.
Good as long as queried data lives on a single node.

One large database = several smaller databases, fronted by a query redirect.
Sharding, Databases and Queries

What happens when you need to scan a full table or join tables across nodes? Multiple queries and stitching at the application level.
Sharding works well for fixed access paths, uniform query plans, and data sets that can be isolated. Mainly this describes an OLTP-style workload.
Cloud Hardware Architecture

It’s a scale-out model: uniform virtual node building blocks.
This is the future of software deployments, albeit with increasing node sizes, so paying attention to early adopters today will pay off.
This implies that an MPP database architecture will be needed for scale.
MPP Database Architecture

Worker nodes connected by a high-speed interconnect; leader node(s) used by some; some use separate loader nodes.
Some databases are symmetric (all nodes are the same). Some allow mixed worker node sizes. Some are leaderless.
Leaders and loaders bring some problems, e.g. less automated management of the environment and treating bottlenecks.
Key to MPP: data distribution

Table data is evenly spread across all nodes, presenting a single logical view of a table.
The good: scalability to the petabyte range; much faster filtering and selection on scans.
The bad: data skew (values, not rowcounts), aggregate function bottlenecks, concurrency challenges, complex multi-table joins with unlike distributions.
MPP challenges mostly hinge on data distribution

Imagine fact & dim tables spread across all nodes.
You need to get dim data to each node to join with the fact rows stored there.
Cross-node joins result in data shipping. This is where inter-node latency, data skew, and node skew can bog down query performance.
The real test of an MPP database is not how fast it can scan data. That’s easy. Test joins in a PoC.

[Diagram: fact and dim tables spread across Node 1 and Node 2.]
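One common form of the data shipping is a broadcast join: fact rows stay put on their nodes and the (small) dimension table is shipped to every node so each can join locally. A sketch with an invented two-node layout:

```python
# Sketch of a broadcast join in an MPP layout: dim data ships, fact data stays.
nodes = {
    1: {"fact": [("sale1", "p1", 100), ("sale2", "p2", 50)]},
    2: {"fact": [("sale3", "p1", 200)]},
}
dim_products = {"p1": "Widget", "p2": "Gadget"}  # small dimension table

shipped_rows = 0
results = []
for node in nodes.values():
    # Broadcast: ship the whole dim table over the interconnect to this node.
    local_dim = dict(dim_products)
    shipped_rows += len(dim_products)
    # Each node joins its local fact rows against the shipped dim data.
    for sale_id, product_id, amount in node["fact"]:
        results.append((sale_id, local_dim[product_id], amount))

print(shipped_rows, sorted(results))
```

Shipping grows with node count times dim size; when neither table is small, engines must instead redistribute both tables by the join key, and that repartitioning is where skew and latency bite.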
MATCHING PROBLEMS TO TECHNOLOGIES

Solving the Problem Depends on the Diagnosis
Three General Workloads

Online Transaction Processing
▪ Read, write, update
▪ User concurrency is the common performance limiter
▪ Low data and compute complexity

Business Intelligence / Data Warehousing
▪ Assumed to be read-only, but really read-heavy and write-heavy, usually separated in time
▪ Data size is the common performance limiter
▪ High data complexity, low compute complexity

Analytics
▪ Read, write
▪ Data size and complexity of algorithm are the limiters
▪ Moderate data, high compute complexity
Three General Workloads

But…
BI is not read-only
OLTP is not write-only
Analytics is not purely computation
Types of workloads

Write-biased:
▪ OLTP
▪ OLTP, batch
▪ OLTP, lite
▪ Object persistence
▪ Data ingest, batch
▪ Data ingest, real-time

Read-biased:
▪ Query
▪ Query, simple retrieval
▪ Query, complex
▪ Query: hierarchical / object / network
▪ Analytic

Mixed:
▪ Inline analytic execution, operational BI
What you need depends on workload & need

Optimizing for:
▪ Response time?
▪ Throughput?
▪ Both?

Concerned about rapid growth in data?
Unpredictable spikes in use?
Bulk loads, or incremental inserts and/or updates?
Important workload parameters to know

• Read-intensive vs. write-intensive
• Mutable vs. immutable data
• Immediate vs. eventual consistency
• Short vs. long data latency
• Predictable vs. unpredictable data access patterns
• Simple vs. complex data types
You"must"understand"your"workload"mix"?"throughput"and"response"5me"requirements"aren’t"enough."▪ 100"simple"queries"accessing"month?to?date"data"
▪ 90"simple"queries"accessing"month?to?date"data"and"10"complex"queries"using"two"years"of"history"
▪ Hazard"calcula5on"for"the"en5re"customer"master"
▪ Performance"problems"are"rarely"due"to"a"single"factor.""
Two useful concepts to characterize queries

Selectivity: the restrictiveness of a query when accessing data. A highly selective query filters out most rows. A low-selectivity query reads most of the rows.

High: SELECT SUM(salary) FROM emp WHERE ID = 1
Low:  SELECT SUM(salary) FROM emp
Two useful concepts to characterize queries

Retrieval: the restrictiveness of a query when returning data. High retrieval brings back most of the rows. Low retrieval brings back relatively few rows.

High: SELECT name, salary FROM emp
Low:  SELECT SUM(salary) FROM emp
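The two concepts can be made crudely numeric against the toy emp table: selectivity as the fraction of rows a query touches, retrieval as how many rows come back. The fractions are illustrative, not a formal definition.

```python
# Crude numeric versions of selectivity and retrieval for the slides' queries.
emp = [(1, "Marge Inovera", 150000), (2, "Anita Bath", 120000),
       (3, "Ivan Awfulitch", 160000), (4, "Nadia Geddit", 36000)]

# SELECT SUM(salary) FROM emp WHERE ID = 1: highly selective, low retrieval.
rows_read = [r for r in emp if r[0] == 1]
selectivity_fraction = len(rows_read) / len(emp)   # touches 1 of 4 rows
rows_returned = 1                                  # a single SUM value

# SELECT name, salary FROM emp: low selectivity, high retrieval.
scan = [(name, salary) for _, name, salary in emp]
print(selectivity_fraction, rows_returned, len(scan))
```

The workload tables that follow classify each workload by where its typical queries fall on these two axes.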
Selectivity and number of columns queried

Row store or column store, indexed or not?

Chart from “The Mimicking Octopus: Towards a one-size-fits-all Database Architecture”, Alekh Jindal
Characteristics of query workloads

Workload                  | Selectivity | Retrieval       | Repetition  | Complexity
Reporting / BI            | Moderate    | Low             | Moderate    | Moderate
Dashboards / scorecards   | Moderate    | Low             | High        | Low
Ad-hoc query and analysis | Low to high | Moderate to low | Low         | Low to moderate
Analytics (batch)         | Low         | High            | Low to high | Low*
Analytics (inline)        | High        | Low             | High        | Low*
Operational / embedded BI | High        | Low             | High        | Low

* Low for retrieving the data, high if doing analytics in SQL
Characteristics of read-write workloads

Workload           | Selectivity     | Retrieval        | Repetition | Complexity
Online OLTP        | High            | Low              | High       | Low
Batch OLTP         | Moderate to low | Moderate to high | High       | Moderate to high
Object persistence | High            | Low              | High       | Low
Bulk ingest        | Low (write)     | n/a              | High       | Low
Real-time ingest   | High (write)    | n/a              | High       | Low

With ingest workloads we’re dealing with write-only, so selectivity and retrieval don’t apply in the same way; instead it’s write volume.
Workload parameters and DB types at data scale

[Matrix: workload parameters (write-biased, read-biased, updateable data, eventual consistency OK?, unpredictable query path, compute-intensive) against database types (standard RDBMS, parallel RDBMS, NoSQL (kv, dht, obj), Hadoop*, streaming database).]

You see the problem: it’s an intersection of multiple parameters, and this chart only includes the first tier of parameters. Plus, workload factors can completely invert these general rules of thumb.
Workload parameters and DB types at data scale

[Matrix: workload parameters (complex queries, selective queries, low-latency queries, high concurrency, high ingest rate) against database types (standard RDBMS, parallel RDBMS, NoSQL (kv, dht, obj), Hadoop, streaming database).]

You have to look at the combination of workload factors: data scale, concurrency, latency & response time, then chart the parameters.
Problem: Architecture Can Define Options

A general rule for the read-write axes

As workloads increase in both intensity and complexity, we move into a realm of specialized databases adapted to specific workloads.

[Chart: OldSQL, NewSQL, and NoSQL positioned along axes of write intensity and read intensity.]
In general…

Relational row-store databases for conventionally tooled low- to mid-scale OLTP.
Relational databases for ACID requirements.
Parallel databases (row or column) for unpredictable or variable query workloads.
Specialized databases for complex-data query workloads.
NoSQL (KVS, DHT) for high-scale OLTP.
NoSQL (KVS, DHT) for low-latency, read-mostly data access.
Parallel databases (row or column) for analytic workloads over tabular data.
NoSQL / Hadoop for batch analytic workloads over large data volumes.
How To Select A Database
How To Select A Database - (1)

1. What are the data management requirements and policies (if any) in respect of:
- Data security (including regulatory requirements)?
- Data cleansing?
- Data governance?
- Deployment of solutions in the cloud?
- If a deployment environment is mandated, what are its technical characteristics and limitations? (Best of breed with no standards for anything, i.e. “polyglot persistence”, means silos on steroids: data integration challenges and shifting data movement architectures.)
2.What kind of data will be stored and used?
- Is it structured or unstructured?
- Is it likely to be one big table or many tables?
How To Select A Database - (2)

3. What are the data volumes expected to be?
- What is the expected daily ingest rate?
- What will the data retention/archiving policy be?
- How big do we expect the database to grow to? (estimate a range)

4. What are the applications that will use the database?
- Estimate by user numbers and transaction numbers
- Roughly classify transactions as OLTP, short query, long query, long query with analytics.
- What are the expectations in respect of growth of usage (per user) and growth of user population?
5.What are the expected service levels?
- Classify according to availability service levels
- Classify according to response time service levels
- Classify on throughput where appropriate
How To Select A Database - (3)

6. What is the budget for this project and what does that cover?
7. What is the outline project plan?
- Timescales
- Delivery of benefits
- When are costs incurred?
8.Who will make up the project team?
- Internal staff
- External consultants
- Vendor consultants
9.What is the policy in respect of external support, possibly including vendor consultancy for the early stages of the project?
How To Select A Database - (4)

10. What are the business benefits?
- Which ones can be quantified financially?
- Which ones can only be guessed at (financially)?
- Are there opportunity costs?
A random selection of databases

Sybase IQ, ASE; Teradata, Aster Data; Oracle, RAC; Microsoft SQLServer, PDW; IBM DB2s, Netezza; Paraccel; Kognitio; EMC/Greenplum; Oracle Exadata; SAP HANA; Infobright; MySQL; MarkLogic; Tokyo Cabinet; EnterpriseDB; LucidDB; Vectorwise; MonetDB; Exasol; Illuminate; Vertica; InfiniDB; 1010 Data; SAND; Endeca; Xtreme Data; IMS; Hive; Algebraix; Intersystems Caché; Streambase; SQLStream; Coral8; Ingres; Postgres; Cassandra; CouchDB; Mongo; Hbase; Redis; RainStor; Scalaris

And a few hundred more…
Product selection options

The Subtraction Model
▪ Start with a full set, remove what’s bad, evaluate the remainder
▪ Conventional analyst model
▪ Works best with a stable market

The Addition Model
▪ Start with an empty set, add what’s good, evaluate the results
▪ The designer model
▪ Works best in an emerging or changing market
Product Selection
Preliminary investigation
Short-list (usually arrived at by elimination)
Be sure to set the goals and control the process.
Evaluation by technical analysis and modeling
Evaluation by proof of concept.
Do not be afraid to change your mind
Negotiation
Conclusion
Wherein all is revealed, or ignorance exposed
Thank You For Your Attention