front range php nosql databases
TRANSCRIPT
NoSQL Databases
Introduce
Disclose work for Basho
Working on Dynamo clone for the last couple of years
What isn't NoSQL?
NOT a standard.
NOT a product.
NOT a single technology.
Well, what is it?
It's a buzzword.
A banner for non-relational databases to organize under.
Mostly created in response to scaling and reliability problems.
'NoSQL' systems differ hugely from one another but have elements in common.
Where did it come from?
They've been around for a while
Local key/value stores
Object databases
Graph databases
XML databases
New problems are emerging
Internet search
e-commerce
Social networking
Where did it come from?
Some efforts came from scaling the web...
Several papers published
2006 Google BigTable
2007 Dynamo Paper
In 2008 - explosion of data storage projects
All shambling under the NoSQL banner.
Really, why not use an RDBMS?
I need to perform arbitrary queries
My application needs transactions
Data needs to be nicely normalized
I have replication for scalability/reliability
Data Mapping Woes
Relational databases divide data into tables made up of rows and columns.
Programmers use complex nested data structures
Hashes
Sets
Arrays
Things of things
Have to map between the two
Data Mapping Woes (2)
Data in systems evolves over time, which means changes to the schema.
Upgrade/rollback scripts have to operate on the whole database, which could be millions of rows.
Doing phased rollouts is hard: the application needs to do work to cope with old and new schemas.
Alternative!
Let the application do it
Use convenient language features
PHP serialize/unserialize
or use standards for mixed platforms
JSON very popular and well supported
Google's protocol buffers
even XML
Design for forward compatibility
Preserve unknown fields
Version objects
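The two forward-compatibility points above can be sketched roughly as follows, using JSON. The field names (`version`, `currency`, `gift_wrap`) and the upgrade step are hypothetical, invented only for this illustration. The idea is that the application upgrades old documents when it reads them and leaves alone any field it does not recognize.

```python
import json

CURRENT_VERSION = 2  # hypothetical schema version understood by this code


def load_cart(raw: str) -> dict:
    """Decode a stored document and upgrade old versions in memory."""
    doc = json.loads(raw)
    version = doc.get("version", 1)
    if version < 2:
        # Hypothetical upgrade step: version 1 documents had no "currency".
        doc.setdefault("currency", "USD")
    # Never downgrade a document that newer code has written.
    doc["version"] = max(version, CURRENT_VERSION)
    return doc


def add_item(doc: dict, item_id: str, quantity: int) -> dict:
    """Change only the fields this code understands; keep everything else."""
    updated = dict(doc)  # shallow copy carries unknown fields along
    updated["items"] = list(doc.get("items", [])) + [
        {"id": item_id, "quantity": quantity}
    ]
    return updated


def save_cart(doc: dict) -> str:
    """Encode the document for storage."""
    return json.dumps(doc, sort_keys=True)


# A document written by a newer release, with a field this code does not know.
stored = '{"version": 3, "items": [], "gift_wrap": true}'
cart = add_item(load_cart(stored), "widget", 3)
round_tripped = json.loads(save_cart(cart))
```

Because each document is upgraded when it is read, no script has to touch every row at once, and old and new application versions can run side by side during a phased rollout.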
Scalability and Availability
Scalability
How many requests you can process
Availability
How does your service degrade as things break?
RDBMS solutions - replication and sharding
Scaling RDBMS - Replication
Master-Slave replication is easiest
Every change on the master happens on the slave.
Slaves are read-only. Does not scale INSERT, UPDATE, DELETE queries.
Application responsible for distributing queries to correct server.
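As a rough sketch of what "distributing queries to the correct server" means for the application, the router below sends writes to the master and spreads reads across the slaves. The server names and the `QueryRouter` class are made up for this example. A real application would also have to deal with transactions and replication lag.

```python
import itertools

# Statements that change data must go to the master.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


class QueryRouter:
    """Pick a server for each SQL statement (illustrative only)."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin over slaves

    def server_for(self, sql):
        # The first word of the statement decides where it goes.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in WRITE_VERBS:
            return self.master
        return next(self._slaves)


# Hypothetical host names.
router = QueryRouter("db-master", ["db-slave-1", "db-slave-2"])
```

Adding slaves increases read capacity only; every write still lands on the single master.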
Scaling RDBMS - Replication
Multi-master ring replication
Can update any master
Updates travel around the ring
What happens when it fails?
Reconfigure the ring
What happens on return?
Synchronize the master
Add back in to the ring
Replication
Replication is usually asynchronous for performance: you don't want to wait for the slowest slave on each update.
Replication takes time: there is a time lag between the first and last server to see an update.
You may not read your writes: you are not getting aCid properties any more.
Scaling RDBMS - Sharding
Do application level splitting of data
Split large table into N smaller tables
Use Id modulo N to find the right table
Tables could be spread across multiple database servers
But the application needs to know where to query
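A minimal sketch of the "Id modulo N" lookup, assuming four shard tables spread over two servers. The table prefix and host names are placeholders chosen for the example.

```python
N_SHARDS = 4  # the large table is split into this many smaller tables
SERVERS = ["db1.example", "db2.example"]  # hypothetical database hosts


def shard_for(user_id: int):
    """Return the (server, table) pair that holds the row for user_id."""
    shard = user_id % N_SHARDS
    table = f"users_{shard}"
    # Shard tables are assigned to servers in rotation.
    server = SERVERS[shard % len(SERVERS)]
    return server, table
```

For example, `shard_for(10)` selects table `users_2` on `db1.example`. Changing `N_SHARDS` later moves most ids to a different table, which is part of why resharding is painful.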
Availability
If you want availability you need multiple servers, maybe even multiple sites.
In the real world you get network partitions
Just because you can't see your other data center doesn't mean users can't.
What should you do if you can't see the other data center?
Availability
Degrade one site to read-only
Defeats availability
If you allow both sites to operate
There's a chance two users could modify the same data.
The application needs to know how to resolve it
The bottom line...
Building systems that are...
...Scalable...
...Available...
...Maintainable...
with an RDBMS requires large efforts by application developers and operational staff
It's hard because...
Significant work for developers.
App needs to convert data to table/columns
App needs to know data location
App needs to handle failover
App needs to handle inconsistency
Work for operational staff
Fixing replication topologies and synchronizing servers is fiddly work.
Last decade's bleeding edge is here
Organizations with big problems started experimenting with alternatives
Developed internal systems during the mid 2000s
Distributed by design
Different data models
Published details in 2006/2007
Amazon
Huge e-commerce vendor.
Amazon cares about customer experience
Availability
Latency at the 99th percentile
Built as an SOA: pages are built from hundreds of services.
Amazon runs multiple data centers.
Hardware failure is their normal state
Network partitions common
Amazon Requirements
Shopping cart service must always be available
Customers should be able to view and add to their carts (in their words):
If disks are failing
Network routes are flapping
Data centers are being destroyed by tornadoes
Amazon Observations
Many services just stored state.
Access by primary key
No queries
Examples
Shopping carts
Best seller lists
Customer profiles
Hard to scale out relational databases
Amazon Solution: Dynamo
Primary key access only
Fault tolerant: Keeps N copies of the data
Designed for inconsistency
Totally decentralized: nodes 'gossip' state
Self-healing
Eventual Consistency 1
Brewer's CAP Theorem
Consistency
Availability
Partition tolerance
Pick two out of three!
Amazon chose A-P over C
Eventual Consistency 2
N copies of each value
Read operations (get) require 'R' nodes to respond
Write operations (put) require 'W' nodes to respond
If R+W > N, every read overlaps the latest write, so clients read their own writes (if no failure)
N, R and W tune the cluster; typically (3,2,2)
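The R+W > N rule can be checked by brute force. The sketch below is not part of Dynamo; it only tests whether every possible set of R responding nodes shares at least one node with every possible set of W responding nodes.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r intersects every write set of size w."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

With the typical (3,2,2) setting the sets always overlap, so a read is guaranteed to reach at least one node that saw the latest write. With (3,1,1) they may not, which is faster but gives weaker guarantees.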
Eventual Consistency 3
Consequence of availability: Conflicts
Conflicts can come fromNetwork partitions
Applications themselves: no transactions or locking
Applications must handle conflicts
Dynamo minimizes conflicts with vector clocks
Vector Clocks
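A minimal sketch of how vector clocks tell an ordinary update apart from a real conflict. Each clock maps a node name to a counter. The node names and function names are made up for this example and are not taken from any particular implementation.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify the relationship between two versions of a value."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"  # neither version has seen the other: siblings


base = increment({}, "node_a")      # first write, handled by node_a
left = increment(base, "node_b")    # update made on one side of a partition
right = increment(base, "node_c")   # concurrent update on the other side
```

Here `compare(left, base)` reports that `left` simply replaces `base`, while `compare(left, right)` reports a conflict that the application has to resolve.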
Partitioning
Example: Shopping Cart
User browses site, adds 3 widgets
{ user_id:fd44dbb4-2e1e-4fb9-9ea3-b5c672e04ac5, items: [ {id:widget, quantity:3, per_unit:5.43} ] }
Shopping Cart - Conflict
{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33} ] }
{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:stuff, quantity:7, per_unit:9.32} ] }
Network Failure
Shopping Cart - Merge
{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33} ] }
{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:stuff, quantity:7, per_unit:9.32} ] }
{ user_id:fd44..., items: [ {id:widget, quantity:3, per_unit:5.43}, {id:thing, quantity:1, per_unit:0.33}, {id:stuff, quantity:7, per_unit:9.32} ] }
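The merge on this slide could be written by the application roughly as below. Taking the union of items comes from the slide. Keeping the larger quantity when both versions contain the same item is an assumption made for this sketch.

```python
def merge_carts(siblings: list) -> dict:
    """Merge conflicting versions of one cart by taking the union of items."""
    merged = {}
    for cart in siblings:
        for item in cart["items"]:
            existing = merged.get(item["id"])
            # Assumption: if both versions hold the item, keep the larger quantity.
            if existing is None or item["quantity"] > existing["quantity"]:
                merged[item["id"]] = item
    return {"user_id": siblings[0]["user_id"], "items": list(merged.values())}


cart_a = {"user_id": "fd44...", "items": [
    {"id": "widget", "quantity": 3, "per_unit": 5.43},
    {"id": "thing", "quantity": 1, "per_unit": 0.33},
]}
cart_b = {"user_id": "fd44...", "items": [
    {"id": "widget", "quantity": 3, "per_unit": 5.43},
    {"id": "stuff", "quantity": 7, "per_unit": 9.32},
]}
merged_cart = merge_carts([cart_a, cart_b])
```

A union never loses an added item, but as the Dynamo paper notes, an item the customer deleted on one side can reappear after the merge.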
Open Source Dynamo
Dynamo is internal to Amazon
Open source options
Riak from Basho
Project Voldemort
Google BigTable
Used internally at Google
Indexing the web
Google Earth
Finance
Distributed storage system for structured data
Data representation
Data stored in tables.
Each table is indexed by {key,timestamp} and holds a variable number of sparse columns
Columns are grouped into column families. Columns in a family are stored together.
Each table is broken into tablets.
Tablets are stored on a cluster file system (GFS).
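One way to picture this data model is as nested maps: row key, then column (named `family:qualifier`), then timestamp. The toy class below is an in-memory illustration only and says nothing about how BigTable is actually implemented. The row and column names follow the web-page example in the BigTable paper.

```python
from collections import defaultdict


class SparseTable:
    """Toy model: row key -> 'family:qualifier' -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        """Store one timestamped version of a cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column):
        """Return the most recent value, or None if the cell was never written."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def family(self, row, family):
        """List the columns a row actually has in one column family."""
        prefix = family + ":"
        return sorted(c for c in self._rows.get(row, {}) if c.startswith(prefix))


table = SparseTable()
table.put("com.cnn.www", "contents:", 1, "<html>v1")
table.put("com.cnn.www", "contents:", 2, "<html>v2")
table.put("com.cnn.www", "anchor:cnnsi.com", 1, "CNN")
```

Rows only store the columns they use, which is what "sparse" means here: another row can have thousands of `anchor:` columns or none at all.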
BigTable Column Families
Copyright Google
Map/Reduce
Processing framework that sits on top of BigTable.
Programmers write two functions map() and reduce().
Table is mapped, then reduced.
Job control system monitors and resubmits.
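The two functions can be illustrated with a word count run in a single process. This sketch ignores distribution, job control and resubmission entirely. The `run_job` helper and the sample pages are invented for the example.

```python
from collections import defaultdict


def map_fn(key, value):
    """Emit (word, 1) for every word in one row of the table."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Combine all values emitted for one key."""
    return sum(values)


def run_job(table, map_fn, reduce_fn):
    """Map every row, group the output by key, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in table.items():
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


pages = {"page1": "the cart is empty", "page2": "The cart has widgets"}
counts = run_job(pages, map_fn, reduce_fn)
```

In a real framework the map calls run in parallel close to the data and the grouping step moves data between machines, but the programmer still writes only the two functions.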
Map/Reduce
Source: institutes.lanl.gov
BigTable has inspired...
Hadoop/HBase
Cassandra
Riak
CouchDB
Map/Reduce
Explosion of NoSQL DBs
Too many projects
Two good resources
http://nosql.mypopescu.com/
http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html
So many projects!
Dynamo, BigTable, Redis
Riak, Voldemort, CouchDB, PNUTS
Hadoop/HBase, Cassandra, Hypertable
MongoDB, Terrastore, Scalaris, BerkeleyDB
MemcacheDB, Dynomite, Neo4j, Tokyo Cabinet
and more
NoSQL Characteristics
Broad types
Key/Value
Sparse Column Family
Document oriented
Persistence
In memory
On disk
Distribution
Replicated
Decentralized
Riak from Basho
http://riak.basho.com
Dynamo clone written in Erlang
RESTful HTTP interface
Fully distributed
Clients for multiple languages
Multiple storage backends
In-memory
Filesystem
Embedded InnoDB
I work there now!
Redis 1.2
http://code.google.com/p/redis/
Key/Value Store with structured values
Written in C
Memcache-like protocol
In use at
GitHub
Engine Yard
VideoWiki
Redis 1.2 (cont)
Values can be strings, sets, ordered sets, lists
Operations like increment, decrement, intersection, push, pop
In-memory (can be backed by disk)
Auto-sharding in client libraries
No fault tolerance (coming after 2.0)
Example: retwis, a Twitter clone in PHP
Cassandra
http://incubator.apache.org/cassandra/
BigTable ColumnFamily data model
Dynamo data distribution
Written in Java
Thrift based interface
In use at
Facebook
CouchDB
Document oriented database
All JSON documents
Written in Erlang
Used by Ubuntu One
HTTP interface
Uses JavaScript for indexing/mapreduce
Incremental replication
BerkeleyDB
From Sleepycat, now owned by Oracle
Key/Value Store
Multi-threaded
Multi-process
Replicated
Transactional
Alternative: Tokyo Cabinet
I'm out of time
MongoDB
Neo4J Graph Database
PNUTS (Yahoo)
This is all great but...
Relational databases provide a lot of functionality.
Giving up queries
Even range queries are hard for distributed hash systems.
No transactions rules out some classes of applications.
Space is still evolving
Conclusion
NoSQL systems give applications the tools they need for scalability/availability
They force you to think about distributed design issues like consistency.
Play with them!