Cloud Computing Clase 8 - NoSQL Miguel Saez @masaez Johnny Halife @johnnyhalife Matias Woloski @woloski

Post on 13-Jan-2016

Cloud Computing
Class 8 - NoSQL

Miguel Saez (@masaez)
Johnny Halife (@johnnyhalife)
Matias Woloski (@woloski)

NoSQL

• What does it mean?
• RDBMS legacy and rise of NoSQL
• NoSQL classification
• Pros and Cons
• Possible use cases
• Real-world examples
• What next?

What does it mean?

• Movement, not a specification
• Subjective term (like Web 2.0)
  – Originally used in 1998
  – Reintroduced at Rackspace to refer to non-RDBMS
• NoSQL != No SQL
• NoSQL == Not Only SQL ?

NoSQL Comment

RDBMS Legacy

• Efficient data storage
• Powerful querying capabilities (SQL)
• Support ACID Transactions
• Mature, well supported
• Ubiquitous
• Bottom-up design

• Storage is cheap
• O/R Impedance
• Complex to manage
• Always the bottleneck
• Who really needs transactions?

Rise of NoSQL

• Internet
• Google
• 2006 Bigtable whitepaper (Google)
  – “a sparse, distributed multi-dimensional sorted map”
• 2007 Dynamo whitepaper (Amazon)
• 2008 Cassandra released (Facebook)
  – “a BigTable data model running on an Amazon Dynamo-like infrastructure”
• 2009 Voldemort released (LinkedIn)
  – “a big, distributed, persistent, fault-tolerant hash table”
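
The Bigtable description quoted above, a sparse multi-dimensional sorted map, can be pictured with a toy in-memory structure. This is only a sketch under stated assumptions: the class name `SparseSortedMap` and its methods are made up for illustration, it runs in a single process (nothing is distributed or persisted), and the three dimensions are taken to be row, column and timestamp.

```python
from bisect import bisect_left, insort


class SparseSortedMap:
    """Toy model of a (row, column, timestamp) -> value map (hypothetical API)."""

    def __init__(self):
        self._keys = []   # sorted list of (row, column, -timestamp)
        self._cells = {}  # same key -> value; absent cells take no space

    def put(self, row, column, timestamp, value):
        # Negate the timestamp so the newest version of a cell sorts first.
        key = (row, column, -timestamp)
        if key not in self._cells:
            insort(self._keys, key)
        self._cells[key] = value

    def get(self, row, column):
        """Return the newest value stored for a cell, or None if it is absent."""
        i = bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._cells[self._keys[i]]
        return None

    def scan_row(self, row):
        """Yield (column, timestamp, value) for one row, in sorted key order."""
        i = bisect_left(self._keys, (row,))
        while i < len(self._keys) and self._keys[i][0] == row:
            _, column, neg_ts = self._keys[i]
            yield column, -neg_ts, self._cells[self._keys[i]]
            i += 1
```

Because the keys are kept sorted, reading one row is a contiguous range scan rather than a lookup per column, which is the property the "sorted map" wording points at.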

No-SQL Offering

Windows Azure

Rise of NoSQL – Amazon

“There are many services on Amazon’s platform that only need primary-key access to a data store. For many services, such as those that provide best seller lists, shopping carts, customer preferences, session management, sales rank, and product catalog, the common pattern of using a relational database would lead to inefficiencies and limit scale and availability. Dynamo provides a simple primary-key only interface to meet the requirements of these applications.”
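
A "primary-key only interface" of the kind the quote describes might look roughly like the sketch below. It is not Dynamo's API: `PrimaryKeyStore`, the key format and the JSON encoding are assumptions chosen for illustration, and versioning, replication and persistence are left out.

```python
import json
from typing import Dict, Optional


class PrimaryKeyStore:
    """In-memory stand-in for a store that only supports access by key."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The store treats the value as an opaque blob.
        self._items[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # The only way to read data back is by its primary key.
        return self._items.get(key)


# Hypothetical usage: a shopping cart saved and loaded as one blob.
store = PrimaryKeyStore()
cart = {"items": [{"sku": "B001", "qty": 2}]}
store.put("cart:customer-17", json.dumps(cart).encode("utf-8"))
loaded = json.loads(store.get("cart:customer-17").decode("utf-8"))
```

There is deliberately no way to ask for "all carts containing sku B001"; a service that needs that kind of query is a worse fit for this interface.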

NoSQL Data Store Classifications

• Key-Value store
  – Amazon SimpleDB, Amazon Dynamo (Amazon), Tokyo Cabinet, Voldemort (Gilt Groupe)
• Wide-column (sparse) store
  – Hadoop/HBase (Yahoo, eBay), Cassandra (Facebook), Bigtable (Google!), Azure Table Storage (MSFT), Excel(!)
• Document database
  – MongoDB, CouchDB (BBC), RavenDB
• Graph database
  – Neo4j, InfoGrid
• Object database
  – db4o, Versant, Perst, Caché
• Data Grids
  – Infinispan, GigaSpaces, Terracotta
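
To make the first four categories more concrete, the sketch below writes the same small data set in each shape using plain Python structures. The user names, key formats and the `followers_of` helper are invented for illustration and do not reflect the API of any product listed above.

```python
import json

# Key-value: the store sees only an opaque value under a key.
key_value = {"user:42": json.dumps({"name": "Ana", "follows": ["user:7"]})}

# Wide-column (sparse): each row holds only the columns it actually has.
wide_column = {
    "user:42": {"profile:name": "Ana", "follows:user:7": "1"},
    "user:7": {"profile:name": "Luis"},
}

# Document: nested, self-describing records that can be filtered by field.
documents = [
    {"_id": 42, "name": "Ana", "follows": [7]},
    {"_id": 7, "name": "Luis", "follows": []},
]

# Graph: nodes plus explicit, labelled edges between them.
nodes = {42: {"name": "Ana"}, 7: {"name": "Luis"}}
edges = [(42, "FOLLOWS", 7)]


def followers_of(node_id):
    """Return the ids of nodes with a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node_id]
```

The question "who follows Luis?" is a one-line edge filter in the graph shape, a field filter in the document shape, and requires decoding every value in the key-value shape, which is one way to see why the categories suit different workloads.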

Why NoSQL

Good
• Flexible (schema-less)
• Very scalable
• Scales over cheap hardware
• Reduces the need for a DBA
• Simple to use and operate
• Eventually consistent
• Cheap
• Suited to Web applications

Bad
• Immature
• No common standards
• No support
• No standard
• Poor transaction support
• Poor query support
• New mindset required
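
"Eventually consistent" in the Good list can be illustrated with two replicas that accept writes independently and converge once they exchange state. The sketch assumes a last-write-wins rule based on a caller-supplied timestamp; that rule, the `Replica` class and the `sync` function are illustrative choices, and real stores may resolve conflicts differently.

```python
class Replica:
    """One copy of the data; keeps the newest (timestamp, value) per key."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Last-write-wins (an assumption): keep only the newer entry.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions; the newer timestamp wins."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)
```

Between a write and the next `sync`, a read from the other replica can return stale data or nothing at all; that window is the trade-off behind the "Poor transaction support" item in the Bad list.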

NoSQL Use Cases

Good Examples
• Logging data
• Shopping carts
• Favourites
• Preferences
• Session data
• Mock data providers
• Temporary / working data
• Variable schema data

Stick with RDBMS
• Transactions (orders etc.)
• LOB applications
• Anything involving $$$
• Business-critical data
• Reporting

Real-world Examples

“As I described in an earlier blog post, the new BBC homepage has been built on a whole new technical architecture. Since launching we’ve found an issue with the service we use to save users’ customisation settings. Although we ran a public beta for more than 2 months, this problem only became apparent when we moved the whole audience across to the new site, increasing the load on the platform 20 times. Despite thorough load testing before launch we were unable to accurately predict the type and combination of customisations that users would perform, and as a result we now need to re-architect the way we save your homepage customisation settings in a more efficient way.”

Summary

• NoSQL is not a replacement for RDBMS
• No two scenarios are the same
• Use the best tool for the job
• Experiment

Not only SQL