How to Choose the Right IMC Solution for Your App
© 2018 GridGain Systems, Inc. GridGain Company Confidential How to Choose the Best In Memory Solution for Your Apps

Post on 28-May-2020

TRANSCRIPT

Page 1: How to Choose the Best In-Memory Solution for Your Apps

Page 2: Agenda

• IMC Introduction
• IMC Myths
• IMC Product Categories
• Q & A

Page 3: Apache Ignite: We Are Hiring!

• Very Active Community
• Great Way to Learn Distributed Computing
• How To Contribute:
  – https://ignite.apache.org/community/contribute.html#contribute
  – https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute

Page 4: In-Memory Computing

In-memory computing uses high-performance, distributed memory systems to compute and transact on large-scale data sets in real time, orders of magnitude faster than disk-based systems.

Page 5: Memory Centric vs. Disk Centric

• Disk First Architecture
  – Disk as primary storage, memory for caching
  – Client-Server processing
  – Latency: milliseconds
• Memory First Architecture
  – Memory as primary storage, disk for backup
  – Collocated processing
  – Latency: nanoseconds to microseconds

Page 6: Myth #1: Too Expensive

• Facts:
  – Memory prices have declined over the last 30 years
    • went up slightly in the past 2 years, but not significantly
  – Memory can be used as a caching layer
    • disk is a superset of memory
  – Memory can be extended to disk with a swap store
    • disk only for cold data

Page 7: Myth #2: Not Durable

• Facts:
  – IMC systems have durable backups and disk storage
    • active or passive replicas
    • transactional read-through and write-through
  – Mature IMC systems provide tiered storage
    • disk to store a superset of the data, or cold data
    • memory to store hot data
  – Operational vs. historical datasets
    • 99% of operational datasets < 10TB
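
Read-through and write-through can be illustrated with a minimal sketch. It is not how any particular IMC product implements them: `BackingStore` and `ThroughCache` are hypothetical names, the "database" is a plain dict, and the transactional part mentioned on the slide is left out. The point is only that every write reaches the durable store before it is acknowledged, so losing the memory tier loses no data.

```python
class BackingStore:
    """Stand-in for the durable database (an assumption: a dict, not a real DB)."""

    def __init__(self):
        self._rows = {}
        self.reads = 0
        self.writes = 0

    def load(self, key):
        self.reads += 1
        return self._rows.get(key)

    def save(self, key, value):
        self.writes += 1
        self._rows[key] = value


class ThroughCache:
    """In-memory cache that reads through and writes through to a store."""

    def __init__(self, store):
        self._store = store
        self._memory = {}

    def get(self, key):
        if key in self._memory:
            return self._memory[key]        # served from memory
        value = self._store.load(key)       # read-through on a miss
        if value is not None:
            self._memory[key] = value
        return value

    def put(self, key, value):
        self._store.save(key, value)        # write-through: durable copy first
        self._memory[key] = value


if __name__ == "__main__":
    store = BackingStore()
    cache = ThroughCache(store)
    cache.put("order:1", {"total": 42})

    # Simulate losing the memory tier: a fresh cache over the same store.
    restarted = ThroughCache(store)
    print(restarted.get("order:1"))
```

After the simulated restart the value is reloaded from the store on the first `get` and served from memory afterwards.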

Page 8: Myth #3: Flash Is Fast Enough

• Facts:
  – Flash on PCI-E is still... a block device.
    • Still going through OS I/O, I/O controller, etc.
  – DRAM: nanoseconds
  – 10GbE: microseconds (~50)
  – Flash or SSD: microseconds (between 20 and 500+)
  – Spinning disk: milliseconds (between 4 and 7)

Page 9: Mapping nanoseconds to our universe
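
One way to make these latencies tangible is to rescale them so that 1 ns reads as 1 second. The sketch below does that for the figures on the Myth #3 slide. The 100 ns DRAM value is an assumption (the slide only says "nanoseconds"), the flash and disk entries use the lower and upper figures of the slide's ranges, and all names are illustrative.

```python
from datetime import timedelta

NS = 1
US = 1_000 * NS
MS = 1_000 * US

# Latencies in nanoseconds. Only the DRAM figure is assumed;
# the rest are the numbers quoted on the Myth #3 slide.
LATENCIES_NS = {
    "DRAM (assumed 100 ns)": 100 * NS,
    "10GbE (~50 us)": 50 * US,
    "Flash/SSD, lower figure (20 us)": 20 * US,
    "Flash/SSD, upper figure (500 us)": 500 * US,
    "Spinning disk, lower figure (4 ms)": 4 * MS,
    "Spinning disk, upper figure (7 ms)": 7 * MS,
}


def human_scale(latency_ns):
    """Rescale so that 1 ns of real latency reads as 1 second."""
    return timedelta(seconds=latency_ns)


def slowdown_vs_dram(latency_ns, dram_ns=100 * NS):
    """How many times slower than the assumed DRAM access."""
    return latency_ns / dram_ns


if __name__ == "__main__":
    for name, ns in LATENCIES_NS.items():
        scaled = str(human_scale(ns))
        print(f"{name:36s} {scaled:>20s}  x{slowdown_vs_dram(ns):,.0f}")
```

On this scale a DRAM access takes under two minutes, a 10GbE hop most of a day, and a spinning-disk seek several weeks.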

Page 10: Myth #4: Only For Caching

• Facts:
  – Caching is an important use case, but a limited one
    • Easiest adoption and a “low-hanging fruit”
  – In-Memory Data Grids & Databases for today
    • Main systems of record are in-memory
  – Memory-Centric Systems for tomorrow
    • Memory and disk are tightly integrated
    • Can store more data than fits in memory
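
The last point, storing more data than fits in memory, can be sketched as a two-tier store: disk holds the full data set and memory holds a bounded hot subset. This is an illustration under stated assumptions, not Ignite's or GridGain's storage engine: `TieredStore` is a hypothetical name, the disk tier is a dict stand-in, and least-recently-used eviction is just one possible policy.

```python
from collections import OrderedDict


class TieredStore:
    """Disk holds the superset of the data; memory holds the hot subset."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()    # memory tier, ordered by recency of use
        self.disk = {}              # stand-in for the disk tier (assumption)

    def put(self, key, value):
        self.disk[key] = value      # disk always receives every entry
        self._promote(key, value)

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)   # mark as most recently used
            return self.hot[key]
        value = self.disk[key]          # cold read; KeyError if key is unknown
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)    # evict the least recently used


if __name__ == "__main__":
    store = TieredStore(hot_capacity=2)
    for i in range(5):
        store.put(i, i * i)
    print(len(store.disk), list(store.hot))   # all 5 on disk, 2 in memory
    print(store.get(0), list(store.hot))      # cold key is promoted
```

Evicting an entry from memory never loses it, because the disk tier already holds a copy.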

Page 11: IMC Product Categories

• In-Memory “Options”
  – Oracle Database 12c, Microsoft SQL Server, Cassandra
• In-Memory Caches
  – Redis, Memcached
• In-Memory RDBMS
  – VoltDB, MemSQL, Apache Ignite (use case)
• In-Memory Data Grids
  – Hazelcast, Coherence, Geode, Apache Ignite (use case)
• Memory-Centric Platforms
  – Apache Ignite, GridGain

Page 12: Category: In-Memory “Options”

© 2015 GridGain Systems, Inc.

• Feature added onto an EXISTING database
• Ideal when only a configuration change is possible:
  – No API changes
  – No code changes
  – No data migration
• Limited benefits
  – “marketing” for basic caching
  – not distributed
  – not horizontally scalable

Page 13: Fast Data & Big Data

• Big Data
  – OLAP mostly
  – Larger Historical Data Set
  – Read-Mostly
  – Throughput Not Important
  – Low Query Latencies
  – Good enough for interactive analytics
• Fast Data
  – OLTP mostly
  – Smaller Operational Data Set
  – High Throughput (ops/sec)
  – Low Latencies
  – Consistent or Transactional

Page 14: Fast Data & Big Data

• Big Data
  – Apache Hadoop
    • MapReduce
    • HDFS
    • HBase
  – Apache Spark
    • Machine Learning
    • Graph Processing
    • SQL
  – Warehouse/DB Vendors
• Fast Data
  – Streaming
    • Apache Flink
    • Apache Kafka
    • Apache Apex
  – In-Memory Data Grids
    • Apache Ignite
    • Apache Geode
  – In-Memory Database
    • VoltDB
    • MemSQL
  – NoSQL
    • MongoDB
    • Apache Cassandra

Page 15: Category: In-Memory Caches

• Distributed In-Memory Cache
  – Redis
  – Memcached
• Main Features
  – Shared Cache
  – Beyond Local RAM Capacity
  – Ease of maintenance

Page 16: Where Do Distributed Caches Fail?

Page 17: Category: In-Memory Databases

• In-Memory Databases
  – MemSQL
  – VoltDB
• Main Features
  – High Throughput
  – Low Latencies
  – Full SQL Support
    • However, SQL is the only API
  – Disk Persistence
    • Disk is just a copy of memory
• Complete replacement of existing databases! Good or Bad?

Page 18: Category: In-Memory Data Grids

• In-Memory Data Grids
  – Apache Geode, Hazelcast, Oracle Coherence
  – Apache Ignite (use case)
• Main Features
  – High Throughput & Low Latencies
  – Transactions
  – Collocated Processing
  – Data Querying Capability
  – Disk Persistence
    • Read & Write-through to databases
    • Keep your existing database
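
Collocated processing means sending the computation to the nodes that hold the data instead of pulling the data to the caller. The sketch below contrasts the two styles on a toy in-process "grid". `Node`, `Grid`, `fetch_all`, and `broadcast` are hypothetical names, not the API of any product listed above, and the modulo-hash partitioning is a simplification of what real data grids do.

```python
from zlib import crc32


class Node:
    """One grid member holding a partition of the data in memory."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def run(self, task):
        # The task sees only this node's local data.
        return task(self.data)


class Grid:
    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def node_for(self, key):
        # Stable hash so a given key always maps to the same node.
        index = crc32(str(key).encode()) % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self.node_for(key).data[key] = value

    def fetch_all(self):
        """Client-server style: every record travels to the caller."""
        records = {}
        for node in self.nodes:
            records.update(node.data)
        return records

    def broadcast(self, task):
        """Collocated style: the task travels, only small results return."""
        return [node.run(task) for node in self.nodes]


if __name__ == "__main__":
    grid = Grid(["node-1", "node-2", "node-3"])
    for i in range(1000):
        grid.put(f"acct:{i}", i)

    moved = grid.fetch_all()                      # 1000 records moved
    print(sum(moved.values()))

    partials = grid.broadcast(lambda local: sum(local.values()))
    print(sum(partials), len(partials))           # 3 numbers moved
```

Both approaches compute the same total, but the collocated version returns one number per node rather than every record.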

Page 19: Apache Ignite – Memory-Centric Platform

Diagram labels:
• IoT, Financial Services, Pharma & Healthcare, E-Commerce, Travel & Logistics, Telco
• SQL, Transactions, Compute, Services, ML, Streaming, Key/Value
• Memory-Centric Storage: Scale to 1000s of Nodes & Store TBs of Data
• Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)
• Third-Party Persistence: Keep Your Own DB (RDBMS, HDFS, NoSQL)
• Security & Auditing
• Monitoring & Management
• Data Snapshots & Recovery
• Data Center Replication

Page 20: Memory-centric distributed database, caching, and processing platform

Page 21: GridGain In-Memory Computing Platform

Diagram labels:
• IoT, Financial Services, Pharma & Healthcare, E-Commerce, Travel & Logistics, Telco
• SQL, Transactions, Compute, Services, ML, Streaming, Key/Value
• Memory-Centric Storage: Scale to 1000s of Nodes & Store TBs of Data
• Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)
• Third-Party Persistence: Keep Your Own DB (RDBMS, HDFS, NoSQL)
• Security & Auditing
• Monitoring & Management
• Data Snapshots & Recovery
• Data Center Replication

Page 22: How Ignite Compares

Feature                RDBMS  NoSQL  IMDG  Ignite
Scale Out              X      ✓      ✓     ✓
Availability           X      ✓      ✓     ✓
Consistency            ✓      X      ✓     ✓
In-Memory              ✓      X      ✓     ✓
Persistence            ✓      ✓      X     ✓
SQL                    ✓      X      X     ✓
Key-Value              X      ✓      ✓     ✓
Collocated Processing  X      X      ✓     ✓
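
The comparison can be turned into a simple shortlist helper. The sketch below encodes the slide's table as data (True for a check mark, False for an X) and returns the categories that satisfy every required feature. The table reflects the presenter's own comparison; `TABLE`, `CATEGORIES`, and `candidates` are illustrative names.

```python
CATEGORIES = ("RDBMS", "NoSQL", "IMDG", "Ignite")

# Transcribed from the "How Ignite Compares" slide:
# one tuple per feature, in CATEGORIES order.
TABLE = {
    "Scale Out":             (False, True,  True,  True),
    "Availability":          (False, True,  True,  True),
    "Consistency":           (True,  False, True,  True),
    "In-Memory":             (True,  False, True,  True),
    "Persistence":           (True,  True,  False, True),
    "SQL":                   (True,  False, False, True),
    "Key-Value":             (False, True,  True,  True),
    "Collocated Processing": (False, False, True,  True),
}


def candidates(required_features):
    """Return the categories the slide marks as supporting every feature."""
    return [
        category
        for index, category in enumerate(CATEGORIES)
        if all(TABLE[feature][index] for feature in required_features)
    ]


if __name__ == "__main__":
    print(candidates(["SQL", "Persistence"]))
    print(candidates(["Scale Out", "Key-Value"]))
    print(candidates(["Collocated Processing", "Persistence"]))
```

A requirement list that no single category satisfies returns an empty list, which is a hint to relax a requirement or combine products.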

Page 23: Any Questions?

Follow the conversation.
http://www.gridgain.com

#apacheignite #gridgain #dmagda