Deploying Massive Scale Graphs for Realtime Insights

Post on 08-Jan-2017

TRANSCRIPT

Page 1: Deploying Massive Scale Graphs for Realtime Insights

Deploying massive scale graphs for realtime insights

B Brech

CTO POWER Solutions

[email protected]

Page 2: Deploying Massive Scale Graphs for Realtime Insights

© 2016 International Business Machines Corporation

Business relies more on data than ever before

Drive Efficiency
- Time Reduction
- Cost Reduction
- Consistency

Better Insights
- Broader Scope
- Learning Models
- Speed & Accuracy

Better Business
- Innovation
- Customer Care
- Reactivity

Page 3: Deploying Massive Scale Graphs for Realtime Insights

While Data is Exploding

[Chart: Data Volume (Giga, Tera, Peta, Exa) by decade (1990's, 2000's, 2010's, 2020's). Data types shown: structured data, text, audio, image, video. Other axes: Computational Needs, Sophistication of Analysis, Expressiveness (Low, Med, High).]

Callouts:
- Digital Marketing: 10+% of video views
- Wide Area Imagery: 100's TB per day
- Media: 72 video hrs/minute
- Safety / Security: 10s millions cameras
- Healthcare: 1B medical images/yr
- Customer: 1B camera phones
- Enterprise Video: used by 1/3 of enterprises

Dimensions: Data Volume, Data Velocity, Data Authenticity, Data Complexity, Data Variability, Data Variety

Source: IBM Market Insights based on composite sources

Page 4: Deploying Massive Scale Graphs for Realtime Insights

Time is money, and insights are critical

Ingest Analyze Act Measure Learn

Optimize

Decision time is shrinking

Page 5: Deploying Massive Scale Graphs for Realtime Insights

Recommendation engines: used in a variety of industries

Network intrusion prevention

Fraud prevention

Financial Services

BioMedical - Genomics

Combination of Scale & Speed is critical in many use cases

Extreme Scale Example:
- 30TB and growing DB
- 25 GB/s ingress
- over 400K updates / sec
- 60B+ relationships
- Query response < 200ms
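As a rough sanity check on the extreme-scale figures, the short Python sketch below derives a few per-unit numbers from them. It reads the ingress figure as 25 GB/s, uses decimal units (1 TB = 10^12 bytes), and attributes all ingress to updates. None of these assumptions is stated on the slide, and the function names are illustrative only.

```python
# Back-of-envelope arithmetic on the slide's extreme-scale example.
# Assumptions (not from the slide): decimal units, ingress read as 25 GB/s,
# and all ingress traffic attributed to updates.

DB_BYTES = 30e12            # 30 TB database
INGRESS_BYTES_PER_S = 25e9  # 25 GB/s ingress (assumed reading of the slide)
UPDATES_PER_S = 400_000     # "over 400K updates / sec"
RELATIONSHIPS = 60e9        # "60B+ relationships"
QUERY_BUDGET_S = 0.2        # "Query response < 200ms"


def avg_bytes_per_update() -> float:
    """Average ingress bytes per update if all ingress were updates."""
    return INGRESS_BYTES_PER_S / UPDATES_PER_S


def avg_bytes_per_relationship() -> float:
    """Database size spread evenly over the relationship count."""
    return DB_BYTES / RELATIONSHIPS


def updates_during_one_query() -> float:
    """Updates that arrive while a single query uses its full latency budget."""
    return UPDATES_PER_S * QUERY_BUDGET_S


if __name__ == "__main__":
    print(f"avg bytes per update:       {avg_bytes_per_update():,.0f}")
    print(f"avg bytes per relationship: {avg_bytes_per_relationship():,.0f}")
    print(f"updates during one query:   {updates_during_one_query():,.0f}")
```

Under these assumptions, tens of thousands of updates land during a single 200 ms query window, which is one way to read the slide's point that scale and speed have to be met together.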

Page 6: Deploying Massive Scale Graphs for Realtime Insights

Data repositories are changing also

Traditional DBs going in-memory:
- DB2 > DB2 BLU
- SAP > SAP HANA
- Oracle > 12c
- CICS
- EnterpriseDB
- Etc.

Designed as in-memory repositories (NoSQLs):
- Memcached, Redis, Neo4j, Cassandra, Maria, Mongo, Orient, Couch, etc.

[Diagram labels: Ingest, Analyze, Decision, Act, Innovation]

But in-memory has some constraints and limits.

Page 7: Deploying Massive Scale Graphs for Realtime Insights

IBM POWER8: Designed for Big Data

Built with open innovation to put your data to work across the enterprise

- Designed for Big Data
- Open Innovation Platform
- Superior Cloud Economics

Page 8: Deploying Massive Scale Graphs for Realtime Insights

IBM POWER8 brings performance and scale

Optimized for a broad range of big data & analytics workloads: unstructured, in-memory, structured

- Flash for extreme performance
- Massive IO bandwidth
- Continuous data load
- Parallel processing
- Large-scale memory processing

Processors: flexible, fast execution of analytics algorithms
- 4X threads per core vs. x86 (up to 1536 threads per system)

Memory: large, fast workspace to maximize business insight
- 4X memory bandwidth vs. x86 (up to 16TB of memory)

Cache: ensure continuous data load for fast responses
- 4X more cache vs. x86 (up to 800MB cache per socket)

Page 9: Deploying Massive Scale Graphs for Realtime Insights

POWER Ecosystem

Themes: Designed for Big Data; Workload Acceleration; Defined by Software; Open and Collaborative; Technology & Price/Perf Leadership

Industries: Retail, Healthcare, Banking, Government, Telecom

[Diagram labels: Watson, Linux, Hadoop, Spark, Streams, POWER8, Hypervisor, Virt I/O Server, Shared I/O, Single SMP Hardware System, Built-in Virtualization, Leading Performance, Processor Innovation, Foundations, Suzhou PowerCore Technology, Virtualization Offerings]

Key solutions:
+ Open Source Tools
+ Middleware
+ Industry Solutions
+ Social / Mobile / Analytics / Cloud

Page 10: Deploying Massive Scale Graphs for Realtime Insights

Fundamental forces are accelerating industry change

- IT innovation can no longer come from just the processor
- Full system stack innovation required
- Solution innovation and acceleration is a key to the future

[Chart: Price/Performance (lower is better), 2000 to 2020. Curves: Moore's Law (Technology and Processors) and Full Stack Acceleration (Firmware / OS, Accelerators, Software, Storage, Network).]

The OpenPOWER Foundation is an open ecosystem, using the POWER Architecture to serve the evolving needs of customers.

Page 11: Deploying Massive Scale Graphs for Realtime Insights

Solution Acceleration is a key to the future

[Diagram labels: NVLINK, GPU, FPGA, Flash, NIC, MRAM, PCM]

Page 12: Deploying Massive Scale Graphs for Realtime Insights

NVLINK

GPU

Flash

Graphics – CAE - EDA

Weather

Defense

Financial Services

Bio-Sciences

General: Compression

Encryption

DataBases: Flash

Finance: Algorithms, Facial

Genomics : Algorithms

Decision Support

Data Analytics

Financial Simulations

Genomic Analysis

Network Data Forensics

Facial Recognition

Solution Acceleration is a key to the future

Page 13: Deploying Massive Scale Graphs for Realtime Insights

IBM Data Engine for NoSQL: Cost Savings for In-Memory NoSQL Data Stores

IBM Data Engine for NoSQL is an integrated platform for large and fast-growing NoSQL data stores. It builds on the CAPI capability of POWER8 systems and provides super-fast access to large flash storage capacity. It delivers high-speed access to both RAM and flash storage, which can result in significantly lower cost and higher workload density for NoSQL deployments than a standard RAM-based system. The solution offers superior performance and price-performance compared with scale-out x86 server deployments that are either limited in available memory per server or have flash memory with limited data access latency.

External Flash Configuration (Power S822L / S812L with FlashSystem 900):
- Up to 56TB of extended memory with one POWER8 server + CAPI-attached flash

Integrated Flash Configuration (NEW; Power S822L / S812L / S822LC):
- Up to 8TB of super-fast storage tier on one POWER8 server

Page 14: Deploying Massive Scale Graphs for Realtime Insights

CAPI Unlocks the Next Level of Performance for Flash

Identical hardware (IBM POWER S822L with FlashSystem) with 3 different paths to data: Conventional I/O (FC), CAPI - E, and CAPI - I.

IBM's CAPI NVMe Flash Accelerator is almost 5X more efficient in performing IO vs traditional storage.

Relative CAPI vs. NVMe Instruction Counts per IO (kernel instructions + user instructions)

Path to data                        Relative instructions per IO
CAPI NVMe                           21%
Traditional NVMe                    35%
Traditional Storage - Direct IO     56%
Traditional Storage - Filesystem    100%
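The "almost 5X" claim can be checked against the chart values with a few lines of Python. This is only a sketch: it assumes the four percentages (21%, 35%, 56%, 100%) pair with the four chart labels in the order they appear, and the dictionary and function names are illustrative.

```python
# Relative instruction counts per IO, as read from the chart
# (assumed pairing of values to labels, in the order shown).
RELATIVE_INSTRUCTIONS_PER_IO = {
    "CAPI NVMe": 21,
    "Traditional NVMe": 35,
    "Traditional Storage - Direct IO": 56,
    "Traditional Storage - Filesystem": 100,
}


def efficiency_gain(baseline: str, path: str) -> float:
    """How many times fewer instructions per IO `path` needs than `baseline`."""
    return RELATIVE_INSTRUCTIONS_PER_IO[baseline] / RELATIVE_INSTRUCTIONS_PER_IO[path]


if __name__ == "__main__":
    for baseline in RELATIVE_INSTRUCTIONS_PER_IO:
        if baseline != "CAPI NVMe":
            gain = efficiency_gain(baseline, "CAPI NVMe")
            print(f"CAPI NVMe vs {baseline}: {gain:.2f}x fewer instructions per IO")
```

Against the filesystem path the ratio is 100 / 21, or about 4.8, which matches the "almost 5X" wording. Against traditional NVMe the gain is smaller, about 1.7X.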

Page 15: Deploying Massive Scale Graphs for Realtime Insights

Efficient IO Enables True Utilization of Storage Bandwidth

Under heavy load, IOPs per thread becomes a critical metric for sustaining throughput in a storage system. As throughput increases, more CPU is required to maintain performance.

CAPI NVMe flash leverages improved path length, architectural improvements, and hardware built into POWER8 to greatly improve the relative IOPs per CPU thread.

At high levels of IO (sustained millions of IOPs), more data can be processed more efficiently, radically changing the amount of CPU required to “feed the (IO) beast.”

Average Relative IOPs per CPU Thread

Path to data            Relative IOPs per CPU thread
Fibre Channel           0.6X
NVMe                    1X
CAPI Fibre Channel      2.6X
CAPI NVMe               3.7X

CAPI-accelerated NVMe Flash can issue 3.7X more IOs per CPU thread than regular NVMe flash.
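To see what the relative IOPs-per-thread figures mean for CPU sizing, the Python sketch below estimates how many CPU threads each path would need to sustain a target IO rate. The relative factors come from the chart. The baseline of 50,000 IOPs per thread for regular NVMe and the 1,000,000 IOPs target are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
import math

# Relative IOPs per CPU thread from the chart (regular NVMe = 1X).
RELATIVE_IOPS_PER_THREAD = {
    "Fibre Channel": 0.6,
    "NVMe": 1.0,
    "CAPI Fibre Channel": 2.6,
    "CAPI NVMe": 3.7,
}


def threads_needed(target_iops: float, baseline_iops_per_thread: float, path: str) -> int:
    """CPU threads required to sustain `target_iops` over the given path.

    `baseline_iops_per_thread` is what one thread achieves on regular NVMe;
    it is a hypothetical input, not a figure from the slides.
    """
    per_thread = baseline_iops_per_thread * RELATIVE_IOPS_PER_THREAD[path]
    return math.ceil(target_iops / per_thread)


if __name__ == "__main__":
    target = 1_000_000   # hypothetical sustained IOPs target
    baseline = 50_000    # hypothetical IOPs per thread on regular NVMe
    for path in RELATIVE_IOPS_PER_THREAD:
        print(f"{path:20s} {threads_needed(target, baseline, path):3d} threads")
```

With these illustrative inputs, regular NVMe needs 20 threads and CAPI NVMe needs 6, leaving the remaining threads free for application work.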

Page 16: Deploying Massive Scale Graphs for Realtime Insights

Neo4j + IBM POWER8: Unparalleled Scale and Performance

Neo4j on IBM POWER8
- The strength and tooling of Neo4j
- The performance of POWER8
- The scalability of POWER8 & CAPI Flash

Unrivaled graph application scalability and performance

Page 17: Deploying Massive Scale Graphs for Realtime Insights

Real-world mixed graph transaction workload running Neo4j on POWER8 delivers 1.82X better performance than Intel Xeon E5-2650 v4 Broadwell

Representative mixed workload throughput

System                              Throughput
IBM Power S822LC (20c/160t)         711
x86 Broadwell Server (24c/48t)      390

82% more throughput

• POWER8 delivers 1.82X more query throughput for a representative mixed sample workload than x86
– POWER8 (20 cores / 256 GB): 711
– x86 system with Broadwell processor (24 cores / 256 GB): 390

• Based on IBM internal testing of a single system and OS image running a real-world mixed graph transaction workload based on the LDBC benchmark. Conducted under laboratory conditions; individual results can vary based on workload size, use of storage subsystems & other conditions.
• IBM Power System S822LC; 20 cores (2 x 10c chips) / 160 threads, POWER8; 256 GB memory, Neo4j, Ubuntu 16. Competitive stack: HP ProLiant DL380 Gen9; 24 cores (2 x 12c chips) / 48 threads; Intel E5-2650 v4; 256 GB memory, Neo4j, RHEL 7.2.

Pricing is based on bundled pricing for S822LC with Integrated CAPI Flash card.
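The headline ratio follows directly from the two chart values, and the same numbers can be normalized per core. The Python sketch below does both. It assumes 711 belongs to the POWER8 system and 390 to the x86 system (the order in which they appear in the chart). The per-core figures are a derived illustration, not an IBM claim, and the names used are hypothetical.

```python
# Chart values and core counts as read from the slide
# (assumed pairing: 711 = POWER8, 390 = x86).
SYSTEMS = {
    "IBM Power S822LC": {"throughput": 711, "cores": 20, "threads": 160},
    "x86 Broadwell Server": {"throughput": 390, "cores": 24, "threads": 48},
}


def throughput_ratio(a: str, b: str) -> float:
    """Whole-system throughput of `a` relative to `b`."""
    return SYSTEMS[a]["throughput"] / SYSTEMS[b]["throughput"]


def per_core_throughput(name: str) -> float:
    """Throughput divided by physical core count."""
    return SYSTEMS[name]["throughput"] / SYSTEMS[name]["cores"]


if __name__ == "__main__":
    p8, x86 = "IBM Power S822LC", "x86 Broadwell Server"
    print(f"system ratio:   {throughput_ratio(p8, x86):.2f}x")
    print(f"per-core ratio: {per_core_throughput(p8) / per_core_throughput(x86):.2f}x")
```

The system-level ratio is 711 / 390, about 1.82, which matches the 82% figure. Because the POWER8 system has fewer cores, the per-core ratio comes out higher, at roughly 2.19.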

Page 18: Deploying Massive Scale Graphs for Realtime Insights

Performance and Scale as YOU Need

Scale up and/or out based on your application requirements

• Out-of-order, super-scalar design for exploiting instruction-level parallelism, leading to low CPI
• Larger caches and 99.94% data-cache hit rate
• SMT design to improve core efficiency and increase throughput capability

Use the paradigm shift to realize your imagination

CAPI-Flash

Page 19: Deploying Massive Scale Graphs for Realtime Insights

Open innovation to put data to work across the enterprise

Thanks!


Page 20: Deploying Massive Scale Graphs for Realtime Insights

© Copyright International Business Machines Corporation 2016

Printed in the United States of America September 2016

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.

The following terms are trademarks or registered trademarks licensed by Power.org in the United States and/or other countries: Power ISA. Information on the list of U.S. trademarks licensed by Power.org may be found at www.power.org/about/brand-center/. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.

All information contained in this document is subject to change without notice. The products described in this document are NOT intended for use in applications such as implantation, life support, or other hazardous uses where malfunction could result in death, bodily injury, or catastrophic property damage. The information contained in this document does not affect or change IBM product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of IBM or third parties. All information contained in this document was obtained in specific environments, and is presented as an illustration. The results obtained in other operating environments may vary.

While the information contained herein is believed to be accurate, such information is preliminary, and should not be relied upon for accuracy or completeness, and no representations or warranties of accuracy or completeness are made.

Note: This document contains information on products in the design, sampling and/or initial production phases of development. This information is subject to change without notice. Verify with your IBM field applications engineer that you have the latest version of this document before finalizing a design.

You may use this documentation solely for developing technology products compatible with Power Architecture®. You may not modify or distribute this documentation. No license, express or implied, by estoppel or otherwise to any intellectual property rights is granted by this document.

THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS IS” BASIS. In no event will IBM be liable for damages arising directly or indirectly from any use of the information contained in this document.

IBM Systems and Technology Group
2070 Route 52, Bldg. 330
Hopewell Junction, NY 12533-6351

The IBM home page can be found at ibm.com®.

Version 1.1
January, 2016