Supercharge Your RDBMS with Elasticsearch Arthur Gimpel, Director of DataZone

Post on 16-Apr-2017

TRANSCRIPT

Page 1: Supercharge Your RDBMS with Elasticsearch

Arthur Gimpel, Director of DataZone

Page 2: About Me

Name: Arthur Gimpel

Position: Technology Evangelist, Solutions Architect, Trainer

Tech Stack: Elastic Stack, SQL Server, MongoDB, Couchbase, Redis, Kafka, StreamSets, Python, .NET…

Free Time: Motorcycles, Skydiving…

Page 3: Relational Database Management Systems

• The first RDBMSs were introduced in the late 1970s

• They exist in all possible flavors but share one thing: ACID

• They still dominate the database market

Page 4: RDBMS in Theory - ACID

• Atomicity: All-or-nothing approach; a transaction either completes fully or has no effect

• Consistency: Hard state; every transaction moves the database from one valid state to another

• Isolation: Concurrent transactions cannot interfere with each other

• Durability: Every committed transaction is persisted
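The atomicity bullet can be illustrated with a small sketch. It assumes Python's built-in `sqlite3` module as a stand-in for any RDBMS; the `accounts` table and the `transfer` helper are hypothetical names invented for this example. The second statement of the failing transfer violates a CHECK constraint, so the first statement is rolled back with it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so that a later failure has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls only the first transfer is visible: the credit to `bob` from the failed transfer was undone together with the rejected debit.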

Page 5: ACID Is Not Perfect

• Everything is persisted synchronously, so write throughput is limited by I/O performance

• All data is bound to a tabular schema, which is hard to change in big databases

• ACID makes horizontal scaling nearly impossible

• A complex schema slows down aggregations and queries drastically

Page 6: NoSQL - New Kid in Town

• Distributed / horizontal scalability

• Mostly open source

• Mostly schema-less:

  • Key-Value

  • Document

  • Graph

• Each serves a specific purpose
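To make the three schema-less models concrete, here is a minimal sketch that stores the same order in each shape using plain Python structures. It does not use any real NoSQL product; the store names, keys, and the `neighbours` helper are hypothetical.

```python
import json

# Key-value: the store only understands the key; the value is an opaque blob.
kv_store = {"order:1001": json.dumps({"customer": "alice", "total": 42.5})}

# Document: a nested, self-describing record whose fields can be queried.
doc_store = [
    {
        "_id": 1001,
        "customer": {"name": "alice"},
        "items": [{"sku": "A1", "qty": 2}],
        "total": 42.5,
    },
]

# Graph: entities are nodes, relationships are typed edges.
nodes = {
    "alice": {"type": "customer"},
    "order:1001": {"type": "order"},
    "A1": {"type": "product"},
}
edges = [("alice", "PLACED", "order:1001"), ("order:1001", "CONTAINS", "A1")]

def neighbours(node, relation):
    """Follow edges of one relation type out of a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

# Key-value: fetch by key, then decode the blob in the application.
kv_total = json.loads(kv_store["order:1001"])["total"]

# Document: filter on a nested field.
alice_orders = [d["_id"] for d in doc_store if d["customer"]["name"] == "alice"]

# Graph: two-hop traversal, customer -> orders -> products.
alice_products = [
    product
    for order in neighbours("alice", "PLACED")
    for product in neighbours(order, "CONTAINS")
]
```

Each shape makes one access pattern cheap (lookup by key, query by field, traversal by relationship), which is one way to read the "specific purposes" bullet.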

Page 7: NoSQL - Challenges

• Every data store has its purpose. There is no single solution to all database needs

• NoSQL does not implement all of an RDBMS’s abilities (CDC, jobs, stored procedures, triggers)

• Every data store has its own language and APIs. There is no equivalent of ANSI SQL

Page 8: NoSQL = Not Only SQL | Polyglot Persistence

Page 9: Elasticsearch

• Search platform and data store based on Apache Lucene

• Supports various search types: filtered, full-text, geographic, aggregations (facet, nested, pipeline), graph

• Distributed: every index is split into shards, each (potentially) residing on a different node

• Document store: JSON

• “Optimistic” schema-less architecture

• Supports replication by nature

• Supports unsupervised machine learning by nature (Prelert, in beta)
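The sharding and schema-less bullets can be sketched roughly as follows. This is a simplified model, not Elasticsearch code: it assumes the basic "hash of the routing value modulo the number of primary shards" scheme, uses `zlib.crc32` as a stand-in for Elasticsearch's own hash function, and places shards on nodes with a naive round-robin. The node names, shard count, and documents are hypothetical.

```python
import zlib

NUM_PRIMARY_SHARDS = 3          # hypothetical index setting
NODES = ["node-a", "node-b"]    # hypothetical cluster members

def shard_for(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Pick a shard from the document id (crc32 is only a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def node_for(shard, nodes=NODES):
    """Naive round-robin shard placement, for illustration only."""
    return nodes[shard % len(nodes)]

# JSON-style documents; the two need not share the same fields.
docs = {
    "1": {"title": "Red motorcycle", "price": 4500},
    "2": {"title": "Skydiving lesson", "location": {"lat": 32.1, "lon": 34.8}},
    "3": {"title": "Helmet", "tags": ["safety", "motorcycle"]},
}

# Map each document id to the shard and node that would hold it.
placement = {}
for doc_id in docs:
    shard = shard_for(doc_id)
    placement[doc_id] = (shard, node_for(shard))
```

Because the shard is derived from the id alone, the same document always maps to the same shard, which is also why the number of primary shards cannot be changed freely after documents have been routed.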

Page 10: Search != SQL Querying
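One way to read this slide title is sketched below: a SQL `LIKE` filter matches an exact substring and returns an unranked set, while a search engine tokenizes text, looks terms up in an inverted index, and ranks the results. The sample texts are made up, and the scoring is a raw term count chosen for brevity; it is not the relevance model Lucene uses.

```python
import re
import sqlite3
from collections import Counter, defaultdict

DOCS = {
    1: "Fast red motorcycle for sale",
    2: "Red motorcycle helmet, fast shipping",
    3: "Skydiving lessons",
}

# SQL-style querying: substring filter, no notion of relevance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO ads VALUES (?, ?)", DOCS.items())
sql_hits = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM ads WHERE body LIKE ? ORDER BY id", ("%motorcycle fast%",)
    )
]

# Search-style querying: tokenize, build an inverted index, score matches.
def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

index = defaultdict(Counter)  # term -> {doc_id: term frequency}
for doc_id, body in DOCS.items():
    for term in tokenize(body):
        index[term][doc_id] += 1

def search(query):
    """Return (doc_id, score) pairs, best score first."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

search_hits = search("motorcycle fast")
```

The `LIKE` query finds nothing because no row contains the literal phrase "motorcycle fast", while the index-based search returns both motorcycle ads regardless of word order.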

Page 11: Reference Architecture #1

Page 12: Reference Architecture #2

Page 13: Architecture Comparison

Criterion                       Architecture #1                   Architecture #2
Data distribution strategy      Data store based                  Application based
Data distribution component     Data pipeline (StreamSets)        Message queue (Kafka)
Implementation team             Data Engineers / DevOps           DevOps / Developers
Implementation complexity       Low: data pipeline development    High: data access layer refactor
Potential additional licensing  Elasticsearch, StreamSets         None
Scalability                     Limited to RDBMS scale            Fully scalable regardless of RDBMS
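The two distribution strategies in the comparison can be sketched in miniature. The architecture diagrams did not survive in this transcript, so this is an interpretation of the table rather than the presenter's design. Everything here is a stand-in: `sqlite3` plays the RDBMS, a plain dict plays the Elasticsearch index, a Python list plays the Kafka topic, and all function and table names are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    return conn

# Architecture #1 (data store based): the application writes only to the
# RDBMS; a separate pipeline copies rows it has not seen into the index.
def pipeline_sync(conn, index, last_id):
    rows = conn.execute(
        "SELECT id, name, price FROM products WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    for row_id, name, price in rows:
        index[row_id] = {"name": name, "price": price}
        last_id = row_id
    return last_id  # checkpoint for the next run (catches inserts only)

# Architecture #2 (application based): the application appends events to a
# log; each store has its own consumer with its own offset.
def rdbms_consumer(conn, log, offset):
    for event in log[offset:]:
        conn.execute(
            "INSERT OR REPLACE INTO products VALUES (?, ?, ?)",
            (event["id"], event["name"], event["price"]),
        )
    conn.commit()
    return len(log)

def search_consumer(index, log, offset):
    for event in log[offset:]:
        index[event["id"]] = {"name": event["name"], "price": event["price"]}
    return len(log)

# Architecture #1 in action: the RDBMS is the source of truth.
db1, index1 = make_db(), {}
db1.execute("INSERT INTO products VALUES (1, 'helmet', 120.0)")
db1.execute("INSERT INTO products VALUES (2, 'gloves', 35.0)")
db1.commit()
checkpoint = pipeline_sync(db1, index1, last_id=0)

# Architecture #2 in action: the log is the source of truth.
db2, index2, log = make_db(), {}, []
log.append({"id": 1, "name": "helmet", "price": 120.0})
log.append({"id": 2, "name": "gloves", "price": 35.0})
rdbms_offset = rdbms_consumer(db2, log, 0)
search_offset = search_consumer(index2, log, 0)
row_count = db2.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

In the first variant the index can only be as fresh and as fast as reads from the RDBMS allow, which matches the "Limited to RDBMS scale" row. In the second the application's data access layer has to publish events instead of writing to the database directly, which matches the "data access layer refactor" row.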

Page 14: Thank You!