"Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB
DESCRIPTION
Webinar presentation delivered by Dr. Michael Stonebraker and Scott Jarr of VoltDB on December 11, 2012. www.voltdb.com
The design decisions you make today will have a huge performance impact down the line. Until recently, when it came to databases, the choice was easy. Essentially, you had one option: the RDBMS. Today, there's a new universe of databases being thrown into production, and not always with the greatest success. How do you make the right choice for your next application? Database pioneer Dr. Michael Stonebraker and VoltDB co-founder Scott Jarr have some thoughts.
TRANSCRIPT
![Page 1: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/1.jpg)
Navigating the Database Universe
Dr. Michael Stonebraker and Scott Jarr
![Page 2: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/2.jpg)
About Our Presenters
Mike Stonebraker
Co-founder & CTO, VoltDB
A pioneer of database research and technology for more than a quarter of a century, and the main architect of the Ingres relational DBMS and the object-relational DBMS PostgreSQL
Scott Jarr
Co-founder & Chief Strategy Officer, VoltDB
More than 20 years of experience building, launching and growing technology companies from inception to market leadership in the search, mobile, security, storage and virtualization markets
![Page 3: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/3.jpg)
Agenda
• The (proper) design of DBMSs (presented by Dr. Michael Stonebraker)
• The database universe
• Where the future value comes from
![Page 4: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/4.jpg)
We Believe…
• “Big Data” is a rare, transformative market
• Velocity is becoming the cornerstone
• Specialized databases (working together) are the answer
• Products must provide tangible customer value... Fast
![Page 5: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/5.jpg)
THE (PROPER) DESIGN OF THE DBMS
Dr. Michael Stonebraker
![Page 6: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/6.jpg)
Lessons from 40 Years of Database Design
1. Get the user interaction right
– Bet on a small number of easy-to-understand constructs
– Plus standards
2. Get the implementation right
– Bet on a small number of easy-to-understand constructs
3. One size does not fit all
– At least not if you want fast, big or complex
“Those who don’t learn from history are destined to repeat it.”
-Winston Churchill
![Page 7: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/7.jpg)
#1: Get the User Interaction Right
Historical Lesson: RDBMS vs. CODASYL vs. OODB
Winner: RDBMS
• Simple data model (tables)
• Simple access language (SQL)
• ACID (transactions)
• Standards (SQL)
Loser: CODASYL
• Complicated data model (records; participate in “sets”; set has one owner and, perhaps, many members, etc.)
• Messy access language (sea of “cursors”; some, but not all, move on every command; navigation programming)
Loser: OODBs
• Complex data model (hierarchical records, pointers, sets, arrays, etc.)
• Complex access language (navigation, through this sea)
• No standards
![Page 8: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/8.jpg)
Interaction Take Away − Simple is Good
• ACID was easy for people to understand
• SQL provided a standard, high-level language and made people productive (transportable skills)
![Page 9: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/9.jpg)
#2: Get the Implementation Right
Historical Winners
• Leverage a few simple ideas: Early relational implementations
– System R storage system dropped links
– Views (protection, schema modification, performance)
– Cost-based optimizer
• Leverage a few simple ideas: Postgres
– User-defined data types and functions (adopted by most everybody)
– Rules/triggers
– No-overwrite storage
• Leverage a few simple ideas: Vertica
– Store data by column
– Compressed up the ging gong
– Parallel load without compromising ACID
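The Vertica bullets ("store data by column", heavy compression) can be illustrated with a toy example. This is only a sketch: the sample rows are made up, and run-length encoding is used here as one illustrative compression scheme, not as a description of how any particular product lays out its storage.

```python
from itertools import groupby

# Hypothetical sample data: (date, country, quantity) rows.
rows = [
    ("2012-12-11", "US", 5),
    ("2012-12-11", "US", 7),
    ("2012-12-11", "UK", 3),
    ("2012-12-12", "UK", 4),
]

# Row layout -> column layout: one list per attribute.
columns = [list(col) for col in zip(*rows)]


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


encoded_dates = rle_encode(columns[0])
encoded_countries = rle_encode(columns[1])
```

Storing each attribute contiguously puts similar values next to each other, which is what lets a simple scheme like this shrink the date column from four entries to two pairs.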
![Page 10: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/10.jpg)
#3: One Size Does NOT Fit All
• OSFA is an old technology with hundreds of bags hanging off it
• It breaks 100% of the time when under load
• Load = size or speed or complexity
• Load is increasing at a startling rate
• Purpose-built will exceed by 10x to 100x
• History has not been completely written yet… but let’s look at VoltDB as an example
“…specialized systems can each be a factor of 50 faster than the single ‘one size fits all’ system… A factor of 50 is nothing to sneeze at.”
-My Top 10 Assertions About Data Warehouses, 2010
![Page 11: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/11.jpg)
Example: VoltDB
• Get the interface right
– SQL
– ACID
• Implementation: Leverage a few simple ideas
– Main memory
– Stored procedures
– Deterministic scheduling
• Specialization
– OLTP focus allowed for above implementation choices
![Page 12: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/12.jpg)
Proving the Theory
• Challenge: OLTP performance
– TPC-C CPU cycles
– On the Shore DBMS prototype
– Elephants should be similar
CPU cycle breakdown:
• Recovery: 24%
• Latching: 24%
• Buffer pool: 24%
• Locking: 24%
• Useful work: 4%
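The arithmetic behind this breakdown is worth spelling out. The sketch below uses only the percentages from the slide; the function name is hypothetical, and the result is an upper bound on CPU-cycle savings under the assumption that a component's cost disappears entirely, not a measured speedup.

```python
# Percentages of CPU cycles as listed on the slide (TPC-C on the Shore prototype).
breakdown = {
    "recovery": 24,
    "latching": 24,
    "buffer pool": 24,
    "locking": 24,
    "useful work": 4,
}

total = sum(breakdown.values())

# Upper bound if every overhead component vanished and only useful work remained.
max_speedup = total / breakdown["useful work"]


def speedup_if_removed(components):
    """Speedup from removing only the named overhead components."""
    remaining = total - sum(breakdown[name] for name in components)
    return total / remaining


# Removing a single component helps far less than removing all four.
buffer_pool_only = speedup_if_removed(["buffer pool"])
all_four = speedup_if_removed(["recovery", "latching", "buffer pool", "locking"])
```

Removing one component alone yields roughly 1.3x, while removing all four yields 25x, which is why the implementation constructs on the following slides target every source of overhead rather than just one.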
![Page 13: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/13.jpg)
Implementation Construct #1: Main Memory
• Main memory format for data
– Disk format gets you buffer pool overhead
• What happens if data doesn’t fit?
– Return to disk-buffer pool architecture (slow)
– Anti-caching:
    • Keep the main memory format for data
    • When memory fills up, bundle together the oldest (“elderly”) tuples and write them out
    • Run a transaction in “sleuth mode” to find the required records, move them to main memory, and pin them
    • Then run the transaction normally
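The anti-caching steps above can be sketched as a toy in-memory store. Everything here is an assumption made for illustration (the class name, the `capacity` and `block_size` parameters, a dict standing in for disk, and oldest-access-first eviction); it is not how any real engine implements the idea.

```python
from collections import OrderedDict


class AntiCache:
    """Toy main-memory store that writes its coldest tuples out to a 'disk' dict."""

    def __init__(self, capacity, block_size=2):
        self.capacity = capacity
        self.block_size = block_size
        self.memory = OrderedDict()  # key -> value, least recently used first
        self.disk = {}               # stand-in for the on-disk anti-cache
        self.pinned = set()

    def put(self, key, value):
        self.disk.pop(key, None)
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict_if_full()

    def _evict_if_full(self):
        while len(self.memory) > self.capacity:
            # Bundle the oldest unpinned tuples and write them out together.
            victims = [k for k in self.memory if k not in self.pinned]
            victims = victims[: self.block_size]
            if not victims:
                break
            for k in victims:
                self.disk[k] = self.memory.pop(k)

    def run(self, keys, txn):
        # "Sleuth" pass: find the required records, bring them in, and pin them.
        self.pinned.update(keys)
        for k in keys:
            if k not in self.memory:
                self.memory[k] = self.disk.pop(k)
            self.memory.move_to_end(k)
        self._evict_if_full()
        try:
            return txn(self.memory)  # run the transaction normally
        finally:
            self.pinned.difference_update(keys)


store = AntiCache(capacity=3)
for i in range(5):
    store.put(i, i * 10)
result = store.run([0, 4], lambda mem: mem[0] + mem[4])
```

In this run, tuples 0 and 1 are written out once memory fills; the transaction then pulls tuple 0 back in, pins it alongside tuple 4, and executes entirely against main memory.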
![Page 14: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/14.jpg)
Implementation Construct #2: Stored Procedures
• Round trip to the DBMS is expensive
– Do it once per transaction
– Not once per command
– Or even once per cursor move
• Ad-hoc queries supported
– Turn them into dynamic stored procedures
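The round-trip argument can be made concrete with a small simulation. The `CountingServer` class and the `transfer` procedure below are hypothetical stand-ins for a DBMS and a stored procedure; the only thing the sketch measures is how many client/server round trips each style needs.

```python
class CountingServer:
    """Stand-in for a DBMS that counts client/server round trips."""

    def __init__(self):
        self.round_trips = 0
        self.balances = {"alice": 100, "bob": 50}
        self.procedures = {}

    def execute(self, op, *args):
        """One command per call, so each call costs one round trip."""
        self.round_trips += 1
        if op == "read":
            return self.balances[args[0]]
        if op == "write":
            self.balances[args[0]] = args[1]
            return None
        raise ValueError(op)

    def register(self, name, fn):
        self.procedures[name] = fn

    def call(self, name, *args):
        """The whole transaction runs server-side: one round trip in total."""
        self.round_trips += 1
        return self.procedures[name](self.balances, *args)


def transfer(balances, src, dst, amount):
    balances[src] -= amount
    balances[dst] += amount


# Command-at-a-time: four round trips for one transfer.
chatty = CountingServer()
a = chatty.execute("read", "alice")
b = chatty.execute("read", "bob")
chatty.execute("write", "alice", a - 30)
chatty.execute("write", "bob", b + 30)

# Stored procedure: one round trip for the same transfer.
batched = CountingServer()
batched.register("transfer", transfer)
batched.call("transfer", "alice", "bob", 30)
```

Both servers end in the same state, but the command-at-a-time client pays for four round trips where the stored procedure pays for one.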
![Page 15: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/15.jpg)
Implementation Construct #3: Deterministic and Non-deterministic Scheduling
• Non-deterministic (can’t tell order until commit time)
– MVCC
– Dynamic locking
• Deterministic
– Time stamp order
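A minimal sketch of the deterministic case: if transactions execute serially in timestamp order, the outcome is fixed before execution starts, so any node given the same set of transactions reaches the same state. The function names and the sample transactions are assumptions for illustration, not VoltDB's actual scheduler.

```python
def run_in_timestamp_order(initial_state, transactions):
    """Apply (timestamp, function) pairs serially in timestamp order."""
    state = dict(initial_state)
    for _, txn in sorted(transactions, key=lambda pair: pair[0]):
        txn(state)
    return state


def deposit(amount):
    def txn(state):
        state["balance"] += amount
    return txn


def double(state):
    state["balance"] *= 2


# The same three transactions, arriving in different orders at two replicas.
txns = [(2, double), (1, deposit(10)), (3, deposit(5))]

replica_a = run_in_timestamp_order({"balance": 100}, txns)
replica_b = run_in_timestamp_order({"balance": 100}, list(reversed(txns)))
```

Arrival order differs between the two replicas, yet both apply deposit, double, deposit in timestamp order and end with the same balance, with no locks taken along the way.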
![Page 16: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/16.jpg)
Result of Design Principles: VoltDB Example
• Good interface decisions – made developers more productive
– SQL & ACID
• Leveraging a few simple implementation ideas – made VoltDB wicked fast
– Main memory
– Stored procedures
– Deterministic scheduling
![Page 17: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/17.jpg)
Proving the Theory
• Answer: OLTP performance
– 3 million transactions per second
– 7x Cassandra
– 15 million SQL statements per second
– 100,000+ transactions per commodity server
“…we are heading toward a world with at least 5 (and probably more) specialized engines and the death of the ‘one size fits all’ legacy systems.”
-The End of an Architectural Era (It’s Time for a Complete Rewrite), 2007
![Page 18: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/18.jpg)
THE DATABASE UNIVERSE
Scott Jarr
![Page 19: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/19.jpg)
Technology Meets the Market
We believe
– “Big Data” is a rare, transformative market
– Velocity is becoming the cornerstone
– Specialized databases (working together) are the answer
– Products must provide tangible customer value… Fast
Observations
– Noisy, crowded and new – kinda like Christmas shopping at the mall
– Everyone wants to understand where the pieces fit
– Analysts build maps on technology NOT use cases
What we need is…
![Page 20: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/20.jpg)
Data Value Chain
Stages, ordered by age of data (typical response time, example tasks):
• Interactive (milliseconds): place trade, serve ad, enrich stream, examine packet, approve trans.
• Real-time Analytics (hundredths of seconds): calculate risk, leaderboard, aggregate, count
• Record Lookup (second(s)): retrieve click stream, show orders
• Historical Analytics (minutes): backtest algo, BI, daily reports
• Exploratory Analytics (hours): algo discovery, log analysis, fraud pattern match
![Page 21: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/21.jpg)
Data Value Chain
[Figure: the same value chain (Interactive through Exploratory Analytics, milliseconds through hours) with a vertical “Data Value” axis and a horizontal “Age of Data” axis. Plot labels: “Value of Individual Data Item” and “Aggregate Data Value”.]
![Page 22: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/22.jpg)
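The Database Universe
[Figure: vertical axis “Application Complexity”, labeled Simple / Slow / Small at one end and Fast / Complex / Large at the other; horizontal axis “Data Value”, running from “Value of Individual Data Item” to “Aggregate Data Value”. Stages along the horizontal axis: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics, spanning Transactional to Analytic. Region label: Traditional RDBMS.]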
![Page 23: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/23.jpg)
The Database Universe
[Figure: the same axes and stages as the previous slide (“Application Complexity” vs. “Data Value”, Transactional to Analytic), with region labels: Traditional RDBMS, NewSQL, NoSQL, Data Warehouse, and Hadoop, etc. An additional label reads “Velocity”.]
![Page 24: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/24.jpg)
Closed-loop Big Data
[Diagram: event sources (logins, sensors, impressions, orders, authorizations, clicks, trades) and three stages: Interactive & Real-time Analytics, Historical Reports & Analytics, and Exploratory Analytics.]
![Page 25: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/25.jpg)
Closed-loop Big Data
• Make the most informed decision every time there is an interaction
• Real-time decisions are informed by operational analytics and past knowledge
[Diagram: event sources (logins, sensors, impressions, orders, authorizations, clicks, trades) and three stages: Interactive & Real-time Analytics, Historical Reports & Analytics, and Exploratory Analytics, with an added “Knowledge” label.]
![Page 26: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/26.jpg)
The Velocity Use Case
What’s it look like?
– High throughput, relentless data feeds
– Fast decisions on high-value data
– Real-time, operational analytics present immediate visibility
What’s the big deal?
– Batch converts to real time = efficiency
– Decisions made at time of event = better decisions
– Ability to micro-segment/target/personalize/etc. = conversion, satisfaction
More data is coming at you; use it to improve your business.
![Page 27: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/27.jpg)
Next Up
QUESTIONS AND ANSWERS
![Page 28: "Navigating the Database Universe" by Dr. Michael Stonebraker and Scott Jarr, VoltDB](https://reader034.vdocuments.mx/reader034/viewer/2022052522/54b6cf724a79596f468b4616/html5/thumbnails/28.jpg)
THANK YOU
www.voltdb.com