![Page 1: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/1.jpg)
![Page 2: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/2.jpg)
SCHEMA ON READ
![Page 3: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/3.jpg)
Index everything, one query type, low latency, high concurrency
Index nothing, queries as programs, high latency, low concurrency
![Page 4: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/4.jpg)
![Page 5: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/5.jpg)
IT’S POPULAR, BUT WHY?
![Page 6: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/6.jpg)
![Page 7: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/7.jpg)
Diverse operational workloads are common
| | Top 5 Marketing Firm | Government Agency | Top 5 Investment Bank |
|---|---|---|---|
| Data | Key / Value | 10+ fields, arrays, nested documents | 20+ fields, arrays, nested documents |
| Queries | Key-based; 1-100 docs / query; 80/20 read/write | Compound queries; range queries; MapReduce; 20/80 read/write | Compound queries; range queries; 50/50 read/write |
| Servers | ~250 | ~50 | 4 |
| Ops / Sec | 1,200,000 | 500,000 | 30,000 |
![Page 8: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/8.jpg)
Some deployments are large
| | Cluster Scale | Performance Scale | Data Scale |
|---|---|---|---|
| Entertainment Company | 1,400 servers | 250 Million Ticks / Sec | Petabytes |
| Asian Internet Company | 1,000+ servers | 300k Ops / Sec | 10s of billions of objects |
| Federal Agency | 250+ servers | 500k Ops / Sec | 13 billion documents |
![Page 9: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/9.jpg)
Multiple indicators suggest adoption is strong
| Rank | DBMS | Model | Score | Growth (20 mo) |
|---|---|---|---|---|
| 1 | Oracle | Relational DBMS | 1,442 | -5% |
| 2 | MySQL | Relational DBMS | 1,294 | 2% |
| 3 | Microsoft SQL Server | Relational DBMS | 1,131 | -10% |
| 4 | MongoDB | Document Store | 277 | 172% |
| 5 | PostgreSQL | Relational DBMS | 273 | 40% |
| 6 | DB2 | Relational DBMS | 201 | 11% |
| 7 | Microsoft Access | Relational DBMS | 146 | -26% |
| 8 | Cassandra | Wide Column | 107 | 87% |
| 9 | SQLite | Relational DBMS | 105 | 19% |
Source: DB-engines database popularity rankings; May 2015
![Page 10: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/10.jpg)
Source: Stack Overflow via Stackoverkill.com
![Page 11: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/11.jpg)
Source: Stack Overflow via Stackoverkill.com
![Page 12: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/12.jpg)
TO ME, THREE THINGS DRIVE THIS ADOPTION
![Page 13: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/13.jpg)
We asked users why; here’s what they told us
[Diagram: Application ({ CODE }) / Object Relational Mapping (XML config) / Relational Database (DB schema)]
![Page 14: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/14.jpg)
![Page 15: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/15.jpg)
#1 The data model

| RDBMS | MongoDB |
|---|---|
| Database | Database |
| Table | Collection |
| Index | Index |
| Row | Document |
| Join | Embedding & Linking |
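The "Embedding & Linking" entry, which takes the place of a join, can be illustrated with plain Python dictionaries standing in for documents. This is only a sketch: the sample data, the `publisher_id` field, and the `resolve_publisher` helper are hypothetical names chosen for illustration and are not taken from the talk or from any driver API.

```python
# Plain dicts stand in for MongoDB documents; no database is involved.

# Embedding: the related data lives inside the parent document.
book_embedded = {
    "title": "Example Book",
    "publisher": {"name": "Example Press", "founded": 1980},
}

# Linking: the parent stores a reference, and the application
# resolves it with a second lookup.
publishers = {
    "pub1": {"_id": "pub1", "name": "Example Press", "founded": 1980},
}
book_linked = {"title": "Example Book", "publisher_id": "pub1"}


def resolve_publisher(book, publishers_by_id):
    """Follow the (hypothetical) publisher_id link to the publisher document."""
    return publishers_by_id[book["publisher_id"]]


embedded_year = book_embedded["publisher"]["founded"]
linked_year = resolve_publisher(book_linked, publishers)["founded"]
print(embedded_year, linked_year)
```

Embedding answers the question with one read of one document; linking needs a second lookup but avoids copying the publisher into every book.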
![Page 16: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/16.jpg)
Documents are rich data structures
Each field has a typed value:

    {
      first_name: 'Paul',                            // String
      surname: 'Miller',
      cell: 447557505611,                            // Number
      city: 'London',
      location: [45.123, 47.232],                    // Geo-Location
      Profession: ['banking', 'finance', 'trader'],  // Fields can contain arrays
      cars: [                                        // Fields can contain an array of sub-documents
        { model: 'Bentley', year: 1973, value: 100000 },
        { model: 'Rolls Royce', year: 1965, value: 330000 }
      ]
    }
![Page 17: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/17.jpg)
Documents are self-describing
    { product_name: 'Acme Paint', color: ['Red', 'Green'],
      size_oz: [8, 32], finish: ['satin', 'eggshell']
    }

    { product_name: 'T-shirt', size: ['S', 'M', 'L', 'XL'], color: ['Heather Gray' … ],
      material: '100% cotton', wash: 'cold', dry: 'tumble dry low'
    }

    { product_name: 'Mountain Bike', brake_style: 'mechanical disc', color: 'grey',
      frame_material: 'aluminum', no_speeds: 21, package_height: '7.5x32.9x55',
      weight_lbs: 44.05, suspension_type: 'dual', wheel_size_in: 26
    }
Documents in the same product catalog collection in MongoDB
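A rough way to picture schema on read for this catalog: the collection accepts differently shaped documents, and the reading code decides how to interpret each field. The sketch below uses Python dictionaries abbreviated from the three products above; the `colors_of` and `fields_seen` helpers are hypothetical names invented for illustration.

```python
# Three differently shaped documents in one list, standing in for a
# single product catalog collection (values abbreviated from the slide).
catalog = [
    {"product_name": "Acme Paint", "color": ["Red", "Green"], "size_oz": [8, 32]},
    {"product_name": "T-shirt", "size": ["S", "M", "L", "XL"], "material": "100% cotton"},
    {"product_name": "Mountain Bike", "color": "grey", "no_speeds": 21},
]


def colors_of(doc):
    """Interpret the optional 'color' field at read time.

    The field may be missing, a single string, or a list of strings.
    """
    value = doc.get("color")
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def fields_seen(docs):
    """Collect every field name that appears in any document."""
    names = set()
    for doc in docs:
        names.update(doc)
    return sorted(names)


for doc in catalog:
    print(doc["product_name"], colors_of(doc))
print(fields_seen(catalog))
```

The cost of this flexibility is visible in `colors_of`: the reader has to cope with every shape the writers were allowed to store.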
![Page 18: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/18.jpg)
#2 Idiomatic drivers & frameworks
Morphia
MEAN Stack
![Page 19: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/19.jpg)
    // Java: maps
    DBObject query = new BasicDBObject("publisher.founded", 1980);
    Map m = collection.findOne(query).toMap();
    Date pubDate = (Date) m.get("published_date");

    // JavaScript: objects
    m = collection.findOne({"publisher.founded": 1980});
    pubDate = m.published_date;        // ISODate
    year = pubDate.getUTCFullYear();

    # Python: dictionaries
    m = coll.find_one({"publisher.founded": 1980})
    year = m["published_date"].year    # published_date is a datetime.datetime
Documents map to language constructs
![Page 20: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/20.jpg)
#3 It’s easy…and fun
• Easy to acquire – AGPL license
• Easy to install and configure – up and running in <5 min
• Easy to get high performance – no black magic for millisecond latency, scale-out architecture
• Easy to deliver “always on” – replication and automatic failover built in
• Easy to add, query data – no complex modeling, no DDL
![Page 21: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/21.jpg)
BUT WHAT ABOUT
• Data governance?
• Referential integrity?
• Analytics?
![Page 22: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/22.jpg)
DOCUMENT VALIDATION
![Page 23: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/23.jpg)
Data governance: document validation
Implement data governance without sacrificing the
agility that comes from schema on read
![Page 24: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/24.jpg)
Document validation gives you flexible control
• Use the familiar MongoDB Query Language
• Automatically tests each insert/update; delivers a warning or error if a rule is broken
• You choose which keys to validate and how
    db.runCommand({
      collMod: "contacts",
      validator: {
        $and: [
          { year_of_birth: { $lte: 1994 } },
          { $or: [
            { phone: { $type: "string" } },
            { email: { $type: "string" } }
          ] }
        ]
      }
    })
![Page 25: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/25.jpg)
Example validation failure
    db.contacts.insert({
      name: "Fred",
      email: "[email protected]",
      year_of_birth: 2012
    })

    Document failed validation
    WriteResult({
      "nInserted": 0,
      "writeError": {
        "code": 121,
        "errmsg": "Document failed validation"
      }
    })
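To see why this insert is rejected, the `contacts` rule can be evaluated by hand. The sketch below is a toy matcher in Python that understands only `$and`, `$or`, `$lte`, and `$type`; it is not MongoDB's implementation and only approximates its semantics. The `matches` and `type_name` helpers, the placeholder email address, and the second contact "Ann" are assumptions made for illustration.

```python
def type_name(value):
    """Map a Python value to the type aliases used in this sketch."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    return "other"


def matches(doc, expr):
    """Return True if doc satisfies expr (toy subset of the query language)."""
    for key, cond in expr.items():
        if key == "$and":
            if not all(matches(doc, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        else:
            # key is a field name; cond maps operators to arguments.
            if key not in doc:
                return False
            value = doc[key]
            for op, arg in cond.items():
                if op == "$lte":
                    ok = value <= arg
                elif op == "$type":
                    ok = type_name(value) == arg
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
    return True


# The rule from the collMod example.
validator = {"$and": [
    {"year_of_birth": {"$lte": 1994}},
    {"$or": [{"phone": {"$type": "string"}},
             {"email": {"$type": "string"}}]},
]}

# Placeholder email; the address on the slide is not recoverable.
fred = {"name": "Fred", "email": "fred@example.com", "year_of_birth": 2012}
# Hypothetical contact that satisfies the rule.
ann = {"name": "Ann", "phone": "555-0100", "year_of_birth": 1990}

print(matches(fred, validator))
print(matches(ann, validator))
```

Fred has a string email, so the `$or` branch passes; it is the `year_of_birth` bound that fails, which is enough to fail the enclosing `$and`.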
![Page 26: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/26.jpg)
Many ways to validate, no foreign keys yet
• Can check most things that work with a find expression
  – Existence
  – Non-existence
  – Data type of values
  – <, <=, >, >=, ==, !=
  – AND, OR
  – Regular expressions
  – Some geospatial operators (e.g. $geoWithin & $geoIntersects)
• Validate existing data by wrapping expression in $not
![Page 27: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/27.jpg)
Where MongoDB validation excels (vs. RDBMS)
• Simple
  – Use familiar search expressions (MQL)
  – No need for stored procedures
• Flexible
  – Only enforced on mandatory parts of the schema
  – Can start adding new data at any point and then add validation later if needed
• Practical to deploy
  – Simple to roll out new rules across thousands of production servers
• Lightweight
  – Negligible impact on performance
![Page 28: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/28.jpg)
Controlling validation
| validationAction | validationLevel: off | validationLevel: moderate | validationLevel: strict |
|---|---|---|---|
| warn | No checks | Warn on validation failure for inserts and for updates to existing valid documents. Updates to existing invalid docs OK. | Warn on any validation failure for any insert or update. |
| error | No checks | Reject invalid inserts and invalid updates to existing valid documents. Updates to existing invalid docs OK. | Reject any violation of validation rules for any insert or update. (DEFAULT) |
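The level/action matrix can be read as a small decision function. The Python sketch below restates the table; it is not MongoDB's code. The `decide` function and its parameter names are hypothetical, and the string results "accept", "warn", and "reject" are labels chosen for this illustration.

```python
def decide(level, action, new_doc_valid, is_update=False, old_doc_valid=True):
    """Return 'accept', 'warn', or 'reject' for a single write.

    level:  'off', 'moderate', or 'strict'
    action: 'warn' or 'error'
    new_doc_valid: whether the document, as written, passes the validator
    old_doc_valid: for updates, whether the stored document passed before
    """
    if level not in ("off", "moderate", "strict"):
        raise ValueError(f"unknown validationLevel: {level}")
    if action not in ("warn", "error"):
        raise ValueError(f"unknown validationAction: {action}")

    # No checks at all, or nothing to complain about.
    if level == "off" or new_doc_valid:
        return "accept"

    # moderate: updates to documents that were already invalid are let through.
    if level == "moderate" and is_update and not old_doc_valid:
        return "accept"

    # The write breaks a rule: warn or reject depending on the action.
    return "warn" if action == "warn" else "reject"


# Default combination from the table: strict + error.
print(decide("strict", "error", new_doc_valid=False))
# Legacy invalid document being updated under moderate.
print(decide("moderate", "error", new_doc_valid=False,
             is_update=True, old_doc_valid=False))
```

The `moderate` level is what makes it practical to add rules to a collection that already holds non-conforming documents, since those documents can still be updated.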
![Page 29: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/29.jpg)
Versioning of validators (optional)
• Application can lazily update documents with an older version or with no version set at all
    db.runCommand({
      collMod: "contacts",
      validator: { $or: [
        { version: { "$exists": false } },
        { version: 1, Name: { "$exists": true } },
        { version: 2, Name: { "$type": "string" } }
      ] }
    })
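One way to picture the lazy upgrade is as a pair of functions: one that mirrors the versioned rule, and one the application calls when it next touches an old document. This Python sketch works on plain dictionaries; `is_valid` restates the `$or` validator above, while `upgrade` and its conversion policy (coercing `Name` to a string) are assumptions made for illustration.

```python
def is_valid(doc):
    """Mirror the versioned rule: unversioned docs pass, version 1 needs a
    Name field, version 2 needs Name to be a string."""
    if "version" not in doc:
        return True
    if doc["version"] == 1:
        return "Name" in doc
    if doc["version"] == 2:
        return isinstance(doc.get("Name"), str)
    return False  # unknown versions match none of the $or branches


def upgrade(doc):
    """Hypothetical lazy upgrade: return a version 2 copy of the document."""
    upgraded = dict(doc)
    upgraded["Name"] = str(doc.get("Name", ""))
    upgraded["version"] = 2
    return upgraded


old = {"version": 1, "Name": 42}           # valid under the version 1 rule
print(is_valid(old))
print(is_valid({"version": 2, "Name": 42}))  # same data fails the version 2 rule
print(is_valid(upgrade(old)))
```

Because each version carries its own branch in the validator, old and new documents can coexist while the application migrates them at its own pace.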
![Page 30: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/30.jpg)
SCHEMA DISCOVERY
![Page 31: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/31.jpg)
![Page 32: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/32.jpg)
FUTURE DECISIONS
![Page 33: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/33.jpg)
Still lots of hard problems to solve
• Schema evolution
• Specialized storage engines
  – WORM
  – Blockchain
  – Proprietary hardware
  – Integrated data warehouse
• Complex transactions
![Page 34: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/34.jpg)
One surface fits all
[Diagram, layers from top to bottom]
Applications: Content Repo, IoT Sensor Backend, Ad Service, Customer Analytics, Archive
MongoDB Query Language (MQL) + Native Drivers
MongoDB Document Data Model
Storage engines: BTree, LSM, In-memory, WORM, Archive
Alongside all layers: Management, Security
![Page 35: SCHEMA ON READ · RANK DBMS MODEL SCORE GROWTH (20 MO) 1. Oracle Relational DBMS 1,442 -5% 2. MySQL Relational DBMS 1,294 2% 3. Microsoft SQL Server Relational DBMS 1,131 -10% 4](https://reader033.vdocuments.mx/reader033/viewer/2022042221/5ec71617226d9e5351785ed5/html5/thumbnails/35.jpg)