augmenting mysql with nosql options - data lifecycles
Augmenting MySQL with Big Data and NoSQL options
The Data Lifecycle
Lead DBA @ Data Services / ObjectRocket by Rackspace
15+ years in data and information systems, ranging from application development
to data architecture, system design, and more.
Primary focus: helping businesses focus on using data, not managing
and storing it.
David Murphy
@davidmurphy_data
www.linkedin.com/in/davidbmurphy/
True genius resides in the capacity for
evaluation of uncertain, hazardous,
and conflicting information.
- Winston Churchill
EVERYONE’S GOT TO HAVE A GREAT DATA QUOTE, RIGHT?!
Outcomes
We want you to leave here understanding:
• Lifecycle, say what?
• Where are the technologies
• Why one isn't enough
• How to fit them together
This is NOT…
• a deep dive on any technology
• a comprehensive list
• a roadmap discussion
• the end of the journey
What We’ll Cover
Concepts
• What are the lifecycle stages
• How to classify your workloads
• Terminology
Actions
• What technologies are there
• When to use them
• Fitting them together
• Why is this better
What are the lifecycle stages
Transient
• Sessions
• Logins
• Shopping Cart
Short - Medium
• Feeds
• E-Commerce
• Video Game Stats
Analytics
• Reports
• Summary Data
• Dashboards
Archival
• Cold Storage
• Seldom Accessed
• Governance
Lifecycle
Workloads - Transient
IS
• Updated frequently
• Ultra-fast retrieval
• OK if missing
IS NOT
• Richly queryable
• Durable
• Point of truth
Workloads - Short to Medium
IS
• Some to many updates
• Richly queryable
• Durable + point of truth
IS NOT
• Built for short term
• 99% writes, 1% reads
• Heavy aggregations
Workloads - Analytics
IS
• Heavy aggregations
• More latency
• Massively parallelized
IS NOT
• Richly queryable
• Good for many updates
• Point of truth
Workloads - Archival
IS
• High / extreme latency
• Ultra cheap
• Built for retention
IS NOT
• Richly queryable
• Updateable
• Short-term storage
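One way to turn the four workload profiles into a quick classification check is sketched here. The function name, the three yes/no questions, and the order in which they are asked are assumptions made for illustration; the slides only list what each stage is and is not, so treat this as a rough starting point rather than a rule.

```python
from enum import Enum


class Stage(Enum):
    TRANSIENT = "Transient"
    SHORT_MEDIUM = "Short - Medium"
    ANALYTICS = "Analytics"
    ARCHIVAL = "Archival"


def classify_workload(loss_tolerable: bool,
                      point_of_truth: bool,
                      heavy_aggregations: bool) -> Stage:
    """Map three yes/no answers onto a lifecycle stage.

    The decision order is an assumption, not something the slides specify.
    """
    # Transient data is fine to lose and is never the point of truth.
    if loss_tolerable and not point_of_truth:
        return Stage.TRANSIENT
    # Short-to-medium data is durable and acts as the point of truth.
    if point_of_truth:
        return Stage.SHORT_MEDIUM
    # Analytics data is defined by heavy aggregations, not by updates.
    if heavy_aggregations:
        return Stage.ANALYTICS
    # Whatever remains is kept for retention: archival.
    return Stage.ARCHIVAL


if __name__ == "__main__":
    print(classify_workload(True, False, False).value)   # sessions
    print(classify_workload(False, True, False).value)   # orders
    print(classify_workload(False, False, True).value)   # reports
    print(classify_workload(False, False, False).value)  # audit logs
```

A workload that answers "yes" to more than one question (for example, a point of truth that also needs heavy aggregations) is a hint that its data probably spans more than one stage.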
Terminology:
• Documents
• Columns
• Rows
• Partitions
• Geo & DR
• Scaling
• Backups
The dreaded polyglot persistence
Technologies
Transient
• Memcached
• Couchbase
• Redis
• SQLite
Medium
• MySQL
• MariaDB
• PostgreSQL
• MongoDB
• XtraCluster
• NDB
Analytics
• Hadoop
• Infobright
• Cassandra
• Teradata
Archival
• Hadoop + External
• Hadoop Snapshots
• Cassandra using S3
Fitting it together
• What are the fewest technologies we can use?
• What will work for new requests?
• Do I have plans to handle each stage of data?
• If not, can the technologies do a decent job on the odd case?
• Do I have the talent now? Can I get a service or a person easily?
Fitting it together - tools
Build a matrix with
• Feature needs (Transactions, Persistence, Geo, …)
• Importance (1-5)
• Current or Attainable Talent (1-5)
• Does its licensing work for this project? (0 or 1)
(Features * Importance * Talent * License) = Combined Rank
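The matrix can be sketched in a few lines of Python. The slide does not say how "Features" is scored, so this sketch assumes each candidate gets a 0-5 fit score per feature need, weighted by that need's importance and summed before multiplying by talent and license. The candidate names (`store_a`, `store_b`, `store_c`) and all scores are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Candidate:
    name: str
    feature_fit: Dict[str, int]  # assumed scale: 0-5 fit per feature need
    talent: int                  # current or attainable talent, 1-5
    license_ok: int              # 1 if licensing works for the project, else 0


def combined_rank(candidate: Candidate, importance: Dict[str, int]) -> int:
    """(Features * Importance) summed over needs, then * Talent * License."""
    weighted_features = sum(
        candidate.feature_fit.get(need, 0) * weight
        for need, weight in importance.items()
    )
    return weighted_features * candidate.talent * candidate.license_ok


# Hypothetical feature needs with importance 1-5.
importance = {"transactions": 5, "persistence": 4, "geo": 2}

# Hypothetical candidates; none of these scores describe a real product.
candidates = [
    Candidate("store_a", {"transactions": 5, "persistence": 5, "geo": 2}, 4, 1),
    Candidate("store_b", {"transactions": 1, "persistence": 3, "geo": 5}, 2, 1),
    Candidate("store_c", {"transactions": 5, "persistence": 5, "geo": 5}, 5, 0),
]

ranking = sorted(candidates,
                 key=lambda c: combined_rank(c, importance),
                 reverse=True)

if __name__ == "__main__":
    for c in ranking:
        print(c.name, combined_rank(c, importance))
```

Because License is a 0-or-1 multiplier, a technology whose licensing does not fit drops to a rank of zero no matter how well it scores elsewhere, which is what happens to `store_c` here.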
Klout’s great example, but it’s polyglot!
Appboy getting better!
How it should be…
How to scale – focus on what you know
You scale your app by letting someone else:
• Build the hardware
• Know the Ops side for the technology
• Make the technologies pass data along as it ages instead of
duplicating the data
• Be the experts
You just focus on the features of your app and make $$$
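The idea of passing data along as it ages, rather than duplicating it, can be sketched as follows. Plain dictionaries stand in for the real stores, and the function name, the `created_at` field, and the 30-day threshold are assumptions for illustration only; a real pipeline would use each technology's own export and import tooling.

```python
from datetime import datetime, timedelta


def age_out(source: dict, target: dict,
            max_age: timedelta, now: datetime) -> int:
    """Move records older than max_age from source to target.

    Records are moved, not copied, so each record lives in exactly one
    tier at a time. Returns the number of records moved.
    """
    expired = [
        key for key, record in source.items()
        if now - record["created_at"] > max_age
    ]
    for key in expired:
        target[key] = source.pop(key)
    return len(expired)


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    # Hypothetical tiers: an operational store and an analytics store.
    operational = {
        "order-1": {"created_at": datetime(2023, 11, 1), "total": 40},
        "order-2": {"created_at": datetime(2024, 1, 30), "total": 15},
    }
    analytics = {}
    moved = age_out(operational, analytics, timedelta(days=30), now)
    print(moved, sorted(operational), sorted(analytics))
```

The same function can be chained from tier to tier (transient to short-medium, short-medium to analytics, analytics to archival) with a different threshold at each hop.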
Questions?
WE ARE HIRING! ( DBA, DevOps, and more)
https://rackertalent.com
https://www.objectrocket.com/careers
Twitter: @dmurphy_data @rackspace @objectrocket
Email: [email protected]
Github: https://github.com/dbmurphy
SlideDeck: https://github.com/dbmurphy/presentations