presentation: mongo db & elasticsearch & membase

MongoDB ElasticSearch Couchbase Prepared by: Shalkarbayuly A. (PhD student) Presentation was prepared for “Database Management” course as a part of PhD program in Turgut Ozal University

Upload: ardak-shalkarbayuli

Post on 15-Jul-2015

1. NoSQL databases

2. MongoDB

3. ElasticSearch

4. Couchbase

What is NoSQL?

NoSQL databases

NoSQL is sometimes read as “Not Only SQL”

NoSQL is a mechanism for storing and retrieving data that is modeled in ways other than relational tables

Motivations: simplicity of design, horizontal scaling, control over availability

Problems of relational databases (RD) solved by NoSQL

★ A relational database will not scale to your traffic at an acceptable cost

★ NoSQL makes it easy to develop new features

★ NoSQL suits local data transactions that do not have to be very durable, e.g. “liking” items on websites

Avoid NoSQL

➢ If the application requires run-time flexibility

➢ If the application requires ACID transactions

➢ If the application requires complicated queries

➢ If the application requires a query language

➢ If consistency is mandatory and there will be no drastic changes in data volume, a relational database is the better option

What is MongoDB?

★ MongoDB is cross-platform

★ MongoDB is a document-oriented database

★ MongoDB is a NoSQL database

★ MongoDB stores data in JSON-like documents

MongoDB philosophy

Keep functionality when we can

Being non-relational makes horizontal scaling practical

Document models are good

Database technology should run anywhere: VMs, cloud, bare metal, etc.

Use cases for MongoDB

Need for horizontal scaling: data is stored across many regular servers

Iterative development: the structure of the database changes regularly

Document-oriented logic: the web page is more important than the data

Functionality vs Scalability & Performance

Goal: create a web application for e-commerce

Products: there is no fixed set of products; different types of products may be sold in the web app

Problem: design the database schema

E-commerce: sample problem

Product_Book{id, name, shipping_info, price, description, ………., author, title, publisher, edition, ISBN}

Product_Media{id, name, shipping_info, price, description, ………., artist, title, track_listing, label, format}

Simple solution: create a new table for each specific product type

Problem: the code becomes very complex; creating tables and rewriting the app takes time and causes errors; items of different types cannot be treated as one kind of item

Relational database approach

Set of fields with value and name:

Product{id, name, shipping_info, price, description, field1_name, field1_value, field2_name, field2_value, ...}

Problems: what if there are many fields? How do you find all books?

All types of attributes in one table:

Product{id, name, shipping_info, price, description, type, author, artist, track_listing, ISBN, ...}

Problems: every new kind of item causes changes in the code and in the table

Creating polymorphic tables:

Product{id, name, shipping_info, price, description, type (whether book or media)}

Book{title, author, ISBN, ...}

Problems: extra JOINs are needed, which slows queries down

MongoDB to the rescue

★ Flexible schema

★ Easily searchable

★ Easily accessible

★ Fast

{title: "Matrix", price: 3500, details: {actors: ["Keanu Reeves", "K. Zeta Jones"]}}

{title: "Sherlock Holmes", price: 2100, details: {ISBN: "33002A", author: "Conan Doyle"}}

★ MongoDB stores data in document form

○ No schema is needed

○ Documents are stored in JSON form

○ Data is edited in the application

★ No JOINs

○ Data is loaded in linear time
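The flexible-schema idea on the slides above can be tried out without a database. The following is only a plain-Java sketch: it models the two sample documents as nested maps and finds all books by checking for an ISBN field. The class name and the helpers `doc`, `withDetail` and `sampleProducts` are hypothetical and exist only for this illustration; none of this is MongoDB driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of schema-free documents; no MongoDB driver is involved.
final class FlexibleSchemaSketch {

    // Hypothetical helper: builds a document from alternating key/value arguments.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            d.put((String) keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    // Returns every document whose nested "details" map contains the given field.
    static List<Map<String, Object>> withDetail(List<Map<String, Object>> products, String field) {
        List<Map<String, Object>> hits = new ArrayList<>();
        for (Map<String, Object> p : products) {
            Object details = p.get("details");
            if (details instanceof Map && ((Map<?, ?>) details).containsKey(field)) {
                hits.add(p);
            }
        }
        return hits;
    }

    // The two sample documents from the slides: a film and a book with different fields.
    static List<Map<String, Object>> sampleProducts() {
        List<Map<String, Object>> products = new ArrayList<>();
        products.add(doc("title", "Matrix", "price", 3500,
                "details", doc("actors", Arrays.asList("Keanu Reeves", "K. Zeta Jones"))));
        products.add(doc("title", "Sherlock Holmes", "price", 2100,
                "details", doc("ISBN", "33002A", "author", "Conan Doyle")));
        return products;
    }

    public static void main(String[] args) {
        // Books are simply the documents that carry an ISBN; no table had to be altered.
        for (Map<String, Object> book : withDetail(sampleProducts(), "ISBN")) {
            System.out.println(book.get("title"));
        }
    }
}
```

Running `main` prints `Sherlock Holmes`. A new product type with its own fields could be added to the same list without touching the existing documents, which is the point the slides make about schema changes.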

CRUD operations

Creating, retrieving, updating and deleting operations are done on the application side

There are no SQL queries

If the software has many apps (on different platforms), this is BAD, because you have to write the logic again each time

If the software has one app, this is GOOD, because you don’t have to mess with SQL code

MongoDB: java example

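// Legacy MongoDB Java driver (2.x API); collection is assumed to be a DBCollection obtained elsewhere
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;

// Create: parse a JSON string into a document and insert it
String json = "{'name':'Mike','surname':'Smith'}";
DBObject dbObject = (DBObject) JSON.parse(json);
collection.insert(dbObject);

// Retrieve: in the mongo shell this is db.products.find({'title': "The Matrix"}); with the Java driver:
DBCursor cursor = collection.find(new BasicDBObject("title", "The Matrix"));

// Update: set clients to 110 on the document whose name is "Mike"
BasicDBObject newD = new BasicDBObject();
newD.append("$set", new BasicDBObject().append("clients", 110));
BasicDBObject sq = new BasicDBObject().append("name", "Mike");
collection.update(sq, newD);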

ElasticSearch

★ Real-time data

★ Real-time analytics

★ Distributed

★ High-availability

★ Multitenancy

★ Full-text search

★ Document-oriented

★ Schema-free

★ RESTful API

★ Built on top of Apache Lucene

Elasticsearch: important features

Elasticsearch is based on Lucene

Elasticsearch is suited to searches of all types

➔ Elasticsearch is a search engine

➔ Elasticsearch is a document database

What is Lucene? (small explanation)

Lucene is an information retrieval library, which takes documents and makes them easily searchable, through:

● indexing

● advanced analysis

● tokenization and normalization (e.g. indexing “mice” as “mouse”)

Lucene creates an inverted index, so a search looks the term up in the index instead of scanning the text of every document

Indexing

Searching for a term in documents without an index costs on the order of doc.size × no.documents

Don’t do this
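To make the difference concrete, here is a toy inverted index in plain Java. It is only a sketch of the general idea, not how Lucene is implemented; the class name, the naive tokenizer and the three sample titles are assumptions made for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index; Lucene's real data structures are far more elaborate.
final class InvertedIndexSketch {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Naive analysis step: lowercase, then split on anything that is not a letter or digit.
    static String[] tokenize(String text) {
        return text.toLowerCase().split("[^\\p{L}\\p{Nd}]+");
    }

    // Indexing: each document is scanned exactly once, at the moment it is added.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Searching: one map lookup; no document text is read again.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    // Three illustrative documents (sample data chosen for this sketch).
    static InvertedIndexSketch sample() {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "The Godfather");
        index.add(2, "Kill Bill");
        index.add(3, "To Kill a Mockingbird");
        return index;
    }

    public static void main(String[] args) {
        System.out.println(sample().search("kill"));
    }
}
```

Running `main` prints `[2, 3]`. The work proportional to doc.size × no.documents is paid once at indexing time; each later search touches only the entry for the requested term.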

Elasticsearch: use case

Elasticsearch is used as:

★ a search engine

★ a search mechanism for web apps alongside the main database

○ e.g. an e-commerce site stores its data in MongoDB, while the search data is stored in Elasticsearch

★ a document database

Elasticsearch: REST

Elasticsearch can be accessed through a REST API over HTTP

# Inserting data
PUT http://localhost:9200/movies/movie/1
{"title": "The Godfather", "director": "Francis Ford Coppola"}

# Getting data
GET http://localhost:9200/movies/movie/1

# Deleting data
DELETE http://localhost:9200/movies/movie/1

# Searching data
POST http://localhost:9200/_search
{"query": {"query_string": {"query": "kill"}}}
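The same four calls can be issued from Java with the JDK's built-in HTTP client (Java 11 or later). This is a minimal sketch that assumes a node listening on localhost:9200 and reuses the index/type/id path shape from the slide; newer Elasticsearch releases may expect a different path shape. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Builds the slide's REST calls as HttpRequest objects; only main() needs a running server.
final class ElasticsearchRestSketch {

    // Assumed address of a local node, as on the slide.
    static final String BASE = "http://localhost:9200";

    static HttpRequest insert(String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    static HttpRequest get(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id)).GET().build();
    }

    static HttpRequest delete(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id)).DELETE().build();
    }

    // Note: queryString is pasted into the JSON body unescaped, which is fine only for simple words.
    static HttpRequest search(String queryString) {
        String body = "{\"query\": {\"query_string\": {\"query\": \"" + queryString + "\"}}}";
        return HttpRequest.newBuilder(URI.create(BASE + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                insert("1", "{\"title\": \"The Godfather\", \"director\": \"Francis Ford Coppola\"}"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```

Because the API is plain HTTP with JSON bodies, no Elasticsearch-specific client library is required for simple use; any language with an HTTP client can do the same.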

Couchbase

Couchbase history

Couchbase was created by combining two NoSQL databases

Membase + CouchOne (the principal players behind CouchDB) = Couchbase

Couchbase

● Written in: Erlang & C

● Main point: memcached-compatible, but with persistence and clustering

● Protocol: memcached + extensions

● Very fast (200k+/sec) access to data by key

● Provides memcached-style in-memory caching buckets

Couchbase

Best used: any application where low-latency data access, high concurrency support and high availability are requirements.

For example: low-latency use cases like ad targeting, or highly concurrent web apps like online gaming (e.g. Zynga).

Couchbase

Couchbase stores data in key-value or in document form

Couchbase is a key-value store: every document has a key and a value

A key can be up to 250 characters long

Keys are unique: within a bucket a given key can occur only once

Values can be JSON, strings, numbers, binary blobs, or a special positive number (used as a counter)
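The key and value rules above can be mimicked with an in-memory map. The sketch below is a toy model and not the Couchbase client: `BucketSketch` and its methods are hypothetical names, and the 250-character limit is taken directly from the slide.

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy in-memory "bucket" that follows the rules listed on the slide.
final class BucketSketch {

    // Key length limit as stated on the slide.
    static final int MAX_KEY_LENGTH = 250;

    // One entry per key, so keys are unique within the bucket.
    private final ConcurrentHashMap<String, Object> data = new ConcurrentHashMap<>();

    private static void checkKey(String key) {
        if (key.isEmpty() || key.length() > MAX_KEY_LENGTH) {
            throw new IllegalArgumentException("key must be 1.." + MAX_KEY_LENGTH + " characters");
        }
    }

    // Storing under an existing key overwrites the previous value.
    void set(String key, Object value) {
        checkKey(key);
        data.put(key, value);
    }

    // Returns the stored value, or null when the key is absent.
    Object get(String key) {
        return data.get(key);
    }

    // Counter value: starts at `initial` when the key is missing, otherwise adds `by` atomically.
    // Throws ClassCastException if the key already holds a non-counter value.
    long incr(String key, long by, long initial) {
        checkKey(key);
        Object result = data.merge(key, initial, (old, unused) -> ((Long) old) + by);
        return (Long) result;
    }

    public static void main(String[] args) {
        BucketSketch bucket = new BucketSketch();
        long id = bucket.incr("user_counter", 1, 1);
        bucket.set("user:" + id, "{\"name\":\"Mike\"}");
        System.out.println(bucket.get("user:" + id));
    }
}
```

Using a counter to generate the next id and then storing the document under a key such as `user:1` is a common way to get unique keys when the store itself offers no auto-increment column.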

Key-value NoSQL databases

Performance: high

Scalability: high

Flexibility: high

Complexity: none

Advantage: high speed of response

Disadvantage: all logic is located in the application

Couchbase java example

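// Legacy Couchbase Java client (1.x); baseURI is assumed to be a List<URI> of cluster nodes
CouchbaseClient c = new CouchbaseClient(baseURI, "default", "");

// Increment the counter (it starts at 1 if missing) and use it to build a unique key
long userCounter = c.incr("user_counter", 1, 1);
// json is assumed to be a JSON serializer (e.g. Gson) defined elsewhere; 0 means no expiry
c.set("user:" + userCounter, 0, json.toJson(user));

// Store a plain string and read it back
c.set("key", 0, "Hello World");
System.out.println(c.get("key"));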