front range php nosql databases

NoSQL Databases

Jon [email protected]

Introduce

Disclose work for Basho

Working on Dynamo clone for the last couple of years

What isn't NoSQL?

NOT a standard.

NOT a product.

NOT a single technology.

Well, what is it?

It's a buzzword.

A banner for non-relational databases to organize under.

Mostly created in response to scaling and reliability problems.

'NoSQL' systems differ hugely, but they have elements in common.

Where did it come from?

They've been around for a while

Local key/value stores

Object databases

Graph databases

XML databases

New problems are emerging

Internet search

e-commerce

Social networking

Where did it come from?

Some efforts came from scaling the web...

Several papers published

2006 Google BigTable

2007 Dynamo Paper

In 2008 - explosion of data storage projects

All shambling under the NoSQL banner.

Really, why not use an RDBMS?

I need to perform arbitrary queries

My application needs transactions

Data needs to be nicely normalized

I have replication for scalability/reliability

Data Mapping Woes

Relational databases divide data into tables made up of columns.

Programmers use complex nested data structures

Hashes

Sets

Arrays

Things of things

Have to map between the two

Data Mapping Woes (2)

Data in systems evolves over time, which means changes to the schema.

Upgrade/rollback scripts have to operate on the whole database, which could be millions of rows.

Doing phased rollouts is hard: the application needs to do work.

Alternative!

Let the application do it

Use convenient language features

PHP serialize/unserialize

or use standards for mixed platforms

JSON: very popular and well supported

Google's protocol buffers

even XML

Design for forward compatibility

Preserve unknown fields

Version objects
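A minimal sketch of "let the application do it", assuming JSON as the storage format. The record layout, the `version` field name and the v1-to-v2 change (one `name` field split into `first`/`last`) are hypothetical, chosen only to show upgrading on read and preserving unknown fields.

```python
import json

CURRENT_VERSION = 2  # hypothetical schema version understood by this code


def load_profile(raw: str) -> dict:
    """Decode a stored record, upgrading old versions in the application."""
    record = json.loads(raw)
    version = record.get("version", 1)
    if version < 2:
        # Hypothetical v1 -> v2 change: a single "name" became "first"/"last".
        first, _, last = record.pop("name", "").partition(" ")
        record["first"], record["last"] = first, last
    # Never downgrade: a record written by newer code keeps its version.
    record["version"] = max(version, CURRENT_VERSION)
    return record


def save_profile(record: dict) -> str:
    """Encode the whole dict, so fields this code does not know about survive."""
    return json.dumps(record, sort_keys=True)


# "theme" is a field this code never touches; it must round-trip unchanged.
old = '{"name": "Ada Lovelace", "version": 1, "theme": "dark"}'
profile = load_profile(old)
stored = save_profile(profile)
```

Records are upgraded one at a time as they are read, so no script has to rewrite the whole database during a rollout.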

Scalability and Availability

Scalability

How many requests you can process

Availability

How does your service degrade as things break?

RDBMS solutions - replication and sharding

Scaling RDBMS - Replication

Master-Slave replication is easiest

Every change on the master happens on the slave.

Slaves are read-only. Does not scale INSERT, UPDATE, DELETE queries.

Application responsible for distributing queries to correct server.

Scaling RDBMS - Replication

Multi-master ring replication

Can update any master

Updates travel around the ring

What happens when it fails?

Reconfigure the ring

What happens on return?

Synchronize the master

Add back in to the ring

Replication

Replication is usually asynchronous for performance: you don't want to wait for the slowest slave on each update.

Replication takes time: there is a time lag between the first and last server to see an update.

You may not read your writes: you are not getting aCid properties any more.

Scaling RDBMS - Sharding

Do application level splitting of data

Split large table into N smaller tables

Use Id modulo N to find the right table

Tables could be spread across multiple database servers

But the application needs to know where to query
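The "Id modulo N" routing above can be sketched as follows. The table prefix, the shard count and the server names are hypothetical; the point is only that the lookup lives in application code.

```python
from typing import Tuple

N_SHARDS = 4  # hypothetical: the large table was split into 4 smaller tables
SERVERS = ["db0.example.internal", "db1.example.internal"]  # hypothetical hosts


def locate(user_id: int) -> Tuple[str, str]:
    """Return the (server, table) that holds the row for this id."""
    shard = user_id % N_SHARDS
    table = "users_{}".format(shard)
    # Spread the N tables across the available servers, round-robin.
    server = SERVERS[shard % len(SERVERS)]
    return server, table


server, table = locate(10)  # shard 2 lives on the first server
```

Changing `N_SHARDS` moves most ids to a different table, which is one reason resharding a live system is painful.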

Availability

If you want availability you need multiple servers, maybe even multiple sites.

In the real world you get network partitions

Just because you can't see your other data center doesn't mean users can't.

What should you do if you can't see the other data center?

Availability

Degrade one site to read-only

Defeats availability

If you allow both sites to operate

There's a chance two users could modify the same data.

The application needs to know how to resolve it

The bottom line...

Building systems that are...

Scalable...

...Available...

...Maintainable...

with an RDBMS requires a large effort by application developers and operational staff

It's hard because...

Significant work for developers.

App needs to convert data to table/columns

App needs to know data location

App needs to handle failover

App needs to handle inconsistency

Work for operational staff

Fixing replication topologies and synchronizing servers is fiddly work.

Last decade's bleeding edge is here

Organizations with big problems started experimenting with alternatives

Developed internal systems during the mid 2000s

Distributed by design

Different data models

Published details in 2006/2007

Amazon

Huge e-commerce vendor.

Amazon cares about customer experience

Availability

Latency at the 99th percentile

Built as an SOA: pages are built from hundreds of services.

Amazon runs multiple data centers.

Hardware failure is their normal state

Network partitions common

Amazon Requirements

Shopping cart service must always be available

Customers should be able to view and add to their carts (in their words)

If disks are failing

Network routes are flapping

Data centers are being destroyed by tornadoes

Amazon Observations

Many services just stored state.

Access by primary key

No queries

Examples

Shopping carts

Best seller lists

Customer profiles

Hard to scale out relational databases

Amazon Solution: Dynamo

Primary key access only

Fault tolerant: Keeps N copies of the data

Designed for inconsistency

Totally decentralized: nodes 'gossip' state

Self-healing

Eventual Consistency 1

Brewer's CAP Theorem

Consistency

Availability

Partition tolerance

Pick two out of three!

Amazon chose A-P over C


N copies of each value

Read operations (get) require 'R' nodes to respond

Write operations (put) require 'W' nodes to respond

If R+W > N, nodes will read their writes (if no failure)

N, R and W tune the cluster; typically (3,2,2)
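A brute-force check of why R+W > N matters: when it holds, every set of R replicas that answers a read shares at least one node with every set of W replicas that acknowledged a write. The function name is made up for this sketch, and it ignores failures, as the slide does.

```python
from itertools import combinations


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True if every group of r readers overlaps every group of w writers."""
    nodes = range(n)
    return all(
        set(readers) & set(writers)  # empty intersection means a stale read
        for writers in combinations(nodes, w)
        for readers in combinations(nodes, r)
    )


typical = reads_see_latest_write(3, 2, 2)  # the (3,2,2) setting
fast_but_stale = reads_see_latest_write(3, 1, 1)  # R+W <= N
```

Lowering R or W buys latency and availability at the cost of this overlap.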


Consequence of availability: Conflicts

Conflicts can come from

Network partitions

Applications themselves: no transactions or locking

Applications must handle conflicts

Dynamo minimizes conflicts with vector clocks

Vector Clocks
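A minimal sketch of the idea, assuming a vector clock is a plain dict mapping a node name to an update counter. The function names are invented for illustration and are not Dynamo's or Riak's API.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with one more update recorded by node."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    newer_a, newer_b = descends(a, b), descends(b, a)
    if newer_a and newer_b:
        return "equal"
    if newer_a:
        return "a is newer"
    if newer_b:
        return "b is newer"
    return "conflict"  # concurrent updates: the application must resolve


base = increment({}, "x")      # first write, handled by node x
left = increment(base, "y")    # later update handled by node y
right = increment(base, "z")   # concurrent update handled by node z
```

Here `compare(left, base)` reports that `left` supersedes `base`, so the old value can be dropped silently, while `compare(left, right)` reports a conflict that only the application can merge.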

Partitioning

Example: Shopping Cart

User browses site, adds 3 widgets

{ user_id:fd44dbb4-2e1e-4fb9-9ea3-b5c672e04ac5, items: [ {id:widget, quantity:3, per_unit:5.43} ] }

Shopping Cart - Conflict

{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33} ] }

{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:stuff, quantity:7, per_unit:9.32} ] }

Network Failure

Shopping Cart - Merge

{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33} ] }

{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:stuff, quantity:7, per_unit:9.32} ] }

{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33}, {id:stuff, quantity:7, per_unit:9.32} ] }
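One way an application might perform this merge is to take the union of the items in the conflicting carts. This is a sketch, not Amazon's code: the function name is hypothetical, and keeping the larger quantity when both carts hold the same item is an assumption made for the example.

```python
def merge_carts(siblings: list) -> dict:
    """Union the items of conflicting carts that belong to the same user."""
    merged = {}
    for cart in siblings:
        for item in cart["items"]:
            seen = merged.get(item["id"])
            # Assumption: on a clash, keep the copy with the larger quantity.
            if seen is None or item["quantity"] > seen["quantity"]:
                merged[item["id"]] = item
    return {"user_id": siblings[0]["user_id"], "items": list(merged.values())}


widget = {"id": "widget", "quantity": 3, "per_unit": 5.43}
cart_a = {"user_id": "fd44...", "items": [widget, {"id": "thing", "quantity": 1, "per_unit": 0.33}]}
cart_b = {"user_id": "fd44...", "items": [widget, {"id": "stuff", "quantity": 7, "per_unit": 9.32}]}
merged_cart = merge_carts([cart_a, cart_b])
```

A union never loses an added item, which fits the "always able to add to cart" requirement, but an item removed on one side of the partition can reappear after the merge.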

Open Source Dynamo

Dynamo is internal to Amazon

Open source options

Riak from Basho

Project Voldemort

Google BigTable

Used internally at Google

Indexing the web

Google Earth

Finance

Distributed storage system for structured data

Data representation

Data stored in tables.

Table indexed by {key,timestamp} and a variable number of sparse columns

Columns are grouped into column families. Columns in a family are stored together.

Each table is broken into tablets.

Tablets are stored on a cluster file system (GFS).
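The data representation above can be pictured as a nested map. The sketch below is an in-memory toy, not BigTable's API: the class and method names are hypothetical, and the `family:qualifier` column naming and the example row are assumptions made for illustration.

```python
from collections import defaultdict


class SparseTable:
    """Toy model: row key -> "family:qualifier" -> list of (timestamp, value)."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        # Rows only store the columns they actually use (sparse).
        self.rows[row].setdefault(column, []).append((timestamp, value))

    def get(self, row, column, timestamp=None):
        """Newest value at or before timestamp, or newest overall if None."""
        versions = self.rows.get(row, {}).get(column, [])
        candidates = [
            (ts, value) for ts, value in versions
            if timestamp is None or ts <= timestamp
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda pair: pair[0])[1]


table = SparseTable()
table.put("com.example.www", "contents:html", 1, "<html>v1</html>")
table.put("com.example.www", "contents:html", 2, "<html>v2</html>")
table.put("com.example.www", "anchor:news.example", 1, "Example")
latest = table.get("com.example.www", "contents:html")
as_of_1 = table.get("com.example.www", "contents:html", timestamp=1)
```

Each lookup names a row key and a column and may pin a timestamp, which matches the {key,timestamp} indexing described above.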

BigTable Column Families

Copyright Google

Map/Reduce

Processing framework that sits on top of BigTable.

Programmers write two functions: map() and reduce().

Table is mapped, then reduced.

Job control system monitors and resubmits.
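The two-function model can be sketched in a single process as a word count. `run_job` is a stand-in for the real framework and is hypothetical; a real job runs the map and reduce phases in parallel across many machines and resubmits failed work.

```python
from collections import defaultdict


def map_fn(key, value):
    """Called once per table row; emits (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Called once per distinct key with every value emitted for it."""
    return key, sum(values)


def run_job(table: dict) -> dict:
    groups = defaultdict(list)
    for key, value in table.items():             # map phase
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)    # group by emitted key
    return dict(reduce_fn(k, vs) for k, vs in groups.items())  # reduce phase


docs = {"doc1": "the quick fox", "doc2": "The lazy dog the end"}
counts = run_job(docs)
```

Because map_fn sees one row at a time and reduce_fn sees one key at a time, the framework is free to split both phases across servers.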

Map/Reduce

Source: institutes.lanl.gov

BigTable has inspired...

Hadoop/HBase

Cassandra

Riak

CouchDB

Map/Reduce

Explosion of NoSQL DBs

Too many projects

Two good resources

http://nosql.mypopescu.com/

http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html

So many projects!

Dynamo, BigTable, Redis

Riak, Voldemort, CouchDB, PNUTS

Hadoop/HBase, Cassandra, Hypertable

MongoDB, Terrastore, Scalaris, Berkeley DB

MemcacheDB, Dynomite, Neo4j, Tokyo Cabinet and more

NoSQL Characteristics

Broad types

Key/Value

Sparse Column Family

Document oriented

Persistence

In memory

On disk

Distribution

Replicated

Decentralized

Riak from Basho
http://riak.basho.com

Dynamo clone written in Erlang

RESTful HTTP interface

Fully distributed

Clients for multiple languages

Multiple storage backends

In-memory

Filesystem

Embedded InnoDB

I work there now!

Redis 1.2

http://code.google.com/p/redis/

Key/Value Store with structured values

Written in C

Memcache-like protocol

In use at

GitHub

Engine Yard

VideoWiki

Redis 1.2 (cont)

Values can be strings, sets, ordered sets, lists

Operations like increment, decrement, intersection, push, pop

In-memory (can be backed by disk)

Auto-sharding in client libraries

No fault tolerance (coming after 2.0)

Example: retwis, a Twitter clone in PHP

Cassandra

http://incubator.apache.org/cassandra/

BigTable ColumnFamily data model

Dynamo data distribution

Written in Java

Thrift based interface

In use at

Facebook

Twitter

CouchDB

Document oriented database

All JSON documents

Written in Erlang

Used by Ubuntu One

HTTP interface

Uses JavaScript for indexing/mapreduce

Incremental replication

Berkeley DB

Sleepycat, now owned by Oracle

Key/Value Store

Multi-threaded

Multi-process

Replicated

Transactional

Alternative: Tokyo Cabinet

I'm out of time

MongoDB

Neo4j Graph Database

PNUTS (Yahoo)

This is all great but...

Relational databases provide a lot of functionality.

Giving up queries

Even range queries are hard for distributed hash systems.

No transactions rules out some classes of applications.

The space is still evolving

Conclusion

NoSQL systems give applications the tools they need for scalability/availability

They force you to think about distributed design issues like consistency.

Play with them!
