![Page 1: MongoDB - University of Scrantonbi/2013s-html/se521/MongoDB.pdfMongoDB is awesome for non-relational data Self-contained documents MongoDB is awesome for loosely structured data Each](https://reader033.vdocuments.mx/reader033/viewer/2022052718/5f04f3bf7e708231d41085fc/html5/thumbnails/1.jpg)
MongoDB
Danny Jackowitz
SE521
4/10/13
What is MongoDB?
● NoSQL database management system (DBMS)
● Humongous
○ => Intended for large datasets
● Document-oriented
● Developed by 10gen
● Started in 2007
● Open-sourced in 2009
● Production-ready as of version 1.4 (now 2.4)
NoSQL or NoSQL?
● NoSQL is a popular buzzword
● "No SQL" or "Not only SQL"?
○ Many NoSQL DBMSs let you execute SQL (or close to it) commands
■ Ex. Cassandra Query Language
○ MongoDB does NOT!
■ Takes a completely different approach
DBMS Showdown
RDBMS vs. MongoDB
Round 1: Schemas
● RDBMS
○ Explicitly define schema before inserting data

    CREATE TABLE stuff (
        id int PRIMARY KEY,
        some_data varchar(64)
    );

● MongoDB
○ Collection implicitly created on first insert, with no schema to declare
○ "_id" primary key automatically generated if not specified
○ Just throw data at Mongo, it can handle it!
Round 2: Tables
● RDBMS
○ Tables store rows of data
○ Data is organized by column
■ All rows in a table have same column structure
● MongoDB
○ Collections store documents of data
○ Data is organized by fields
■ Documents in a collection need not have identical fields
Round 3: Joins
● RDBMS
○ Returns a (logical) single table

    table_1 JOIN table_2 ON table_1.a = table_2.b

● MongoDB
○ No such concept
○ Manual linking
■ Store _id of document within other document
■ "Join" on the client
○ Embedded documents
■ Denormalized data to remove need for join
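A minimal sketch of manual linking, written in plain JavaScript so that it runs without a database. The `authors` and `books` arrays, the `author_id` field, and the `joinBooksToAuthors` helper are hypothetical stand-ins; in a real application each array would come from its own query.

```javascript
// Hypothetical in-memory stand-ins for two collections.
const authors = [
  { _id: 1, name: "Ada" },
  { _id: 2, name: "Alan" }
];
const books = [
  { _id: 10, title: "Notes", author_id: 1 },
  { _id: 11, title: "Machines", author_id: 2 },
  { _id: 12, title: "Tests", author_id: 2 }
];

// "Join" on the client: index authors by _id, then resolve each reference.
function joinBooksToAuthors(bookDocs, authorDocs) {
  const byId = new Map(authorDocs.map(a => [a._id, a]));
  return bookDocs.map(b => ({
    title: b.title,
    author: byId.get(b.author_id).name
  }));
}

const joined = joinBooksToAuthors(books, authors);
console.log(joined);
```

The embedded alternative would store a copy of the author inside each book document, so no lookup is needed at read time.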
Round 4: Transactions
● RDBMS

    BEGIN;
    -- Do some stuff
    COMMIT;

● MongoDB
○ Atomic operations within a single document
○ No multi-document commit with rollback
MongoDB Query Language
● No SQL!
● BSON
○ Binary JSON
○ JSON == JavaScript Object Notation
■ Key-value pairs

    {
        _id: ObjectId("5099803df3f4948bd2f98391"),
        name: { first: "Alan", last: "Turing" },
        birth: new Date('Jun 23, 1912'),
        death: new Date('Jun 07, 1954'),
        contribs: ["Turing machine", "Turing test"],
        views: NumberLong(1250000)
    }
Inserting Documents

    record = { _id : 1, name : "mongo" }
    db.records.insert( record )

    db.records.insert({_id : 2, name : "mongo"})

    // batch insert using JavaScript
    for (var i = 1; i <= 20; i++) {
        db.records.insert( { x : i } )
    }
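Retrieving Documents

    // find all
    db.records.find()
    // find specific (WHERE)
    db.records.find( { name : "mongo" } )

    // iterate over the results with a cursor
    var cursor = db.records.find()
    while ( cursor.hasNext() ) {
        printjson( cursor.next() )
    }

    // or index a fresh cursor like an array
    var cursor = db.records.find()
    printjson( cursor[0] )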
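Updating Documents

    // set a field on the matching document
    db.records.update({_id : 1}, { $set : { name : "mongodb" }})

    // remove a field; the value given to $unset is ignored
    db.records.update({_id : 2}, { $unset : { name : "ignored" }})

    // modify a document on the client, then save it back
    var r = db.records.find({name : "mongodb"})
    r[0]["name"] = "mongo"
    db.records.save(r[0])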
Deleting Documents

    // delete specific documents
    db.records.remove({name:"mongo"})

    // delete all documents (an empty query matches everything)
    db.records.remove({})

    // delete collection
    db.records.drop()
Aggregation Framework
● db.collection.aggregate(...)
● Uses a pipeline system
○ Works like the UNIX pipeline
○ ls | grep "text" | more

    db.collection.aggregate(
        { $op1 : val1 },
        { $op2 : val2 },
        { $op3 : val3 }
    );
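The pipeline idea can be sketched in plain JavaScript, with no database involved. This is only an illustration of the concept, not how MongoDB implements it: the `match`, `limit`, `project`, and `runPipeline` helpers and the sample `zips` documents are all hypothetical.

```javascript
// Each stage takes an array of documents and returns a new array.
const match = predicate => docs => docs.filter(predicate);
const limit = n => docs => docs.slice(0, n);
const project = fields => docs =>
  docs.map(doc => {
    const out = {};
    for (const f of fields) out[f] = doc[f];
    return out;
  });

// Feed the output of each stage into the next, left to right.
function runPipeline(docs, ...stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Hypothetical sample documents.
const zips = [
  { city: "A", state: "PA", pop: 81000 },
  { city: "B", state: "NY", pop: 500 },
  { city: "C", state: "PA", pop: 80500 },
  { city: "D", state: "OH", pop: 81999 }
];

const result = runPipeline(
  zips,
  match(d => d.pop > 80000 && d.pop <= 82000),
  limit(2),
  project(["city", "state"])
);
console.log(result);
```

As with a shell pipeline, the order of the stages matters: limiting before matching would generally give a different result than matching before limiting.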
Aggregation Framework
● $project
○ Include fields from the original document
○ Insert computed fields
○ Rename fields
○ Create and populate fields that hold sub-documents

    db.zips.aggregate(
        { $project : { city : 1, state : 1, _id : 0 } }
    )
Aggregation Framework
● $match
○ Can work with implied equality or any of the comparison operators
■ ==, !=, >, <, >=, <=

    db.zips.aggregate( { $match : { pop : 8000 } } )
    db.zips.aggregate( { $match : { pop : { $gt : 80000, $lte : 82000 } } } )
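In query documents the comparisons are spelled as operators: implied equality for ==, then `$ne`, `$gt`, `$lt`, `$gte`, and `$lte`. A simplified model of how such a query document is evaluated against one document might look like the following. The `ops` table and `matches` function are hypothetical, and the sketch ignores arrays, nested documents, and BSON type ordering.

```javascript
// Comparison operators, modeled as plain functions.
const ops = {
  $ne: (a, b) => a !== b,
  $gt: (a, b) => a > b,
  $lt: (a, b) => a < b,
  $gte: (a, b) => a >= b,
  $lte: (a, b) => a <= b
};

// True when every field condition in the query holds for the document.
function matches(doc, query) {
  return Object.keys(query).every(field => {
    const cond = query[field];
    if (cond !== null && typeof cond === "object") {
      // Operator form, e.g. { $gt : 80000, $lte : 82000 }: all must hold.
      return Object.keys(cond).every(op => ops[op](doc[field], cond[op]));
    }
    // Implied equality, e.g. { pop : 8000 }.
    return doc[field] === cond;
  });
}

const range = { pop: { $gt: 80000, $lte: 82000 } };
console.log(matches({ pop: 81000 }, range));
console.log(matches({ pop: 80000 }, range));
console.log(matches({ pop: 8000 }, { pop: 8000 }));
```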
Aggregation Framework
● $limit
○ Restricts the number of documents that pass through pipeline at this point

    db.zips.aggregate(
        { $match : { pop : { $gt : 80000, $lte : 82000 } } },
        { $limit : 2 }
    )
Aggregation Framework
● $unwind
○ Peels off the elements of an array individually
○ Returns one document for every member of the unwound array

    db.zips.aggregate(
        { $limit : 1 },
        { $project : { city : 1, state : 1, loc : 1, _id : 0 } },
        { $unwind : "$loc" }
    )
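To make the effect of unwinding concrete, here is a plain JavaScript sketch of the idea. The `unwind` helper and the sample document are hypothetical and only model the behavior described above for a field that holds an array.

```javascript
// Emit one copy of the document per array element, with the array field
// replaced by that single element.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    for (const value of doc[field]) {
      out.push(Object.assign({}, doc, { [field]: value }));
    }
  }
  return out;
}

// Hypothetical document with a two-element location array.
const input = [{ city: "Sampleville", state: "PA", loc: [-75.66, 41.41] }];
const unwound = unwind(input, "loc");
console.log(unwound);
```

One input document with a two-element `loc` array becomes two output documents that differ only in `loc`.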
Aggregation Framework
● $group
○ Groups documents together for the purpose of calculating aggregate values based on a collection of documents

    db.zips.aggregate(
        { $group : {
            _id : "$state",
            totalPop : { $sum : "$pop" },
            avgPop : { $avg : "$pop" }
        } }
    )
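The grouping above can be pictured as the following plain JavaScript, which computes the same two aggregates per state. The `groupByState` helper and the sample population figures are hypothetical; this is a sketch of the idea, not of MongoDB's implementation.

```javascript
// Accumulate a running sum and count per state, then derive the average.
function groupByState(docs) {
  const groups = new Map();
  for (const d of docs) {
    const g = groups.get(d.state) || { _id: d.state, totalPop: 0, count: 0 };
    g.totalPop += d.pop;
    g.count += 1;
    groups.set(d.state, g);
  }
  return Array.from(groups.values()).map(g => ({
    _id: g._id,
    totalPop: g.totalPop,
    avgPop: g.totalPop / g.count
  }));
}

// Hypothetical sample documents.
const sample = [
  { state: "PA", pop: 100 },
  { state: "PA", pop: 300 },
  { state: "NY", pop: 50 }
];
const grouped = groupByState(sample);
console.log(grouped);
```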
Aggregation Framework
● $sort
○ Orders documents by the given fields
○ 1 ascending, -1 descending

    db.zips.aggregate(
        { $sort : { state : 1, pop : -1 } }
    )
More complex queries?
● MongoDB provides MANY other functions that allow for complex queries to be executed efficiently.
● Craigslist
○ Archiving (still RDBMS for active listings)
○ 2+ billion listings!
● SourceForge
○ All project and download pages
● Lots of gaming back ends
○ Disney, EA
○ Storing scores, stats, achievements, etc.
What's the catch?
● MongoDB is designed for non-relational data
● Faking relational loses efficiency
○ "Joining" on the client is slow
● Embedded documents to preserve speed
○ De-normalizes data
■ Consider books written by authors
■ Each book document has own embedded copy of author
■ Author changes contact info
■ Must update ALL books written by author!
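The cost of the embedded copy can be shown with a small plain JavaScript sketch. The `books` array, the `email` field, and the `updateAuthorEmail` helper are hypothetical; the point is only that one logical change has to touch every document that carries a copy.

```javascript
// Hypothetical book documents, each with its own embedded copy of the author.
const books = [
  { title: "Book A", author: { name: "Jane Doe", email: "jane@old.example" } },
  { title: "Book B", author: { name: "Jane Doe", email: "jane@old.example" } },
  { title: "Book C", author: { name: "John Roe", email: "john@site.example" } }
];

// One change to the author means rewriting every book that embeds her.
function updateAuthorEmail(bookDocs, authorName, newEmail) {
  let touched = 0;
  for (const book of bookDocs) {
    if (book.author.name === authorName) {
      book.author.email = newEmail;
      touched += 1;
    }
  }
  return touched;
}

const touched = updateAuthorEmail(books, "Jane Doe", "jane@new.example");
console.log(touched + " documents rewritten for one change");
```

With manual linking the same change would touch a single author document, at the price of a client-side "join" on every read.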
Conclusion
● MongoDB is awesome for non-relational data
○ Self-contained documents
● MongoDB is awesome for loosely structured data
○ Each document in collection can have different format
● MongoDB is awesome for (mostly) static data
○ Throw all the data at it
○ Normalization not as much of a concern
○ Super fast queries with indices, etc.
● MongoDB is NOT a replacement for RDBMSs
Configuring a MongoDB Cluster
● MongoDB intended as a distributed system
○ Different components run on different machines
● Three components
○ mongod
■ --configsvr
■ --replSet
■ --shardsvr
○ mongos
○ mongo
mongod
● "MongoDB Daemon"
● Primary daemon process
● Runs on every machine acting as data store
● Comparable to postgresql-server
● Defaults to port 27017
● Configuration server
○ Started with --configsvr
○ Special instance that stores all metadata for cluster
○ Defaults to port 27019
Replication
● Exact same data stored on multiple instances
● Primary vs. Secondary
○ Only primary accepts writes - propagates to secondaries
○ Fully Consistent (by default)
■ All reads and writes go through single primary
○ Asynchronous replication
● Failover
○ If primary fails, secondaries elect new primary
○ Election needs a majority of voting members, so run at least 3 (e.g., primary + 2 secondaries)
● --replSet [name]
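The failover rule comes down to majority arithmetic: a new primary can be elected only while a strict majority of the voting members can still reach each other. A short sketch, with hypothetical helper names `majority` and `canElectPrimary`:

```javascript
// Smallest number of votes that is more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// An election can succeed only if a majority of members is reachable.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

console.log(canElectPrimary(3, 2)); // 3-member set, primary lost
console.log(canElectPrimary(2, 1)); // 2-member set, primary lost
```

This is why a two-member set cannot fail over on its own: the lone survivor holds one vote out of two, which is not a majority.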
Sharding
● Partitions collections
○ Based on shard key
● Stores different portions on different machines
○ Ex. Storing transaction records
■ 1/1/10 - 12/31/10 -> server1
■ 1/1/11 - 12/31/11 -> server2
■ ...
● Easy scaling - add more racks!
● --shardsvr
○ Switches to port 27018
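The transaction-records example can be modeled as a lookup from shard key to server. This plain JavaScript sketch is only an illustration of range partitioning: the `chunks` table and `routeByDate` function are hypothetical, and dates are written as ISO strings so that string comparison follows date order.

```javascript
// Hypothetical range table: each range of the shard key lives on one server.
// Ranges include `from` and exclude `to`.
const chunks = [
  { from: "2010-01-01", to: "2011-01-01", shard: "server1" },
  { from: "2011-01-01", to: "2012-01-01", shard: "server2" }
];

// Find the server responsible for a given transaction date.
function routeByDate(isoDate) {
  const chunk = chunks.find(c => isoDate >= c.from && isoDate < c.to);
  return chunk ? chunk.shard : null;
}

console.log(routeByDate("2010-06-15"));
console.log(routeByDate("2011-12-31"));
```

Adding capacity then amounts to adding a server and assigning it a range of the key.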
mongos
● "MongoDB Shard"
● Not a data store
● Routing service for shards
○ Knows what data on what shard
○ Directs request to appropriate shard
● To user/application looks same as single mongod instance
○ Same interface as mongod
○ Same default port (27017)
○ Connect in same way
mongo
● Interactive shell interface
● Comparable to psql
● JavaScript
○ Can use loops, conditionals, etc. in queries
Our Architecture
● 4 machine cluster
○ server1
■ mongod --configsvr (27019)
■ mongod --shardsvr (27018)
■ mongos (27017)
○ server2
■ mongod --shardsvr --replSet rs0 (27018)
○ server3
■ mongod --shardsvr --replSet rs0 (27018)
○ server4
■ mongod --shardsvr --replSet rs0 (27018)
Starting Everything Up...

    server1:
        sudo -u mongodb mongod --configsvr
        sudo -u mongodb mongod --shardsvr
    server2, server3, server4:
        sudo -u mongodb mongod --shardsvr --replSet rs0
    server1:
        mongos --configdb 134.198.169.41
Setting Up Replication & Sharding

    server2 (or 3 or 4):
        mongo --port 27018
        rs.initiate()
        rs.add("134.198.169.43:27018")
        rs.add("134.198.169.44:27018")
        rs.conf()

    server1:
        mongo
        sh.addShard("rs0/134.198.169.42:27018")
        sh.addShard("134.198.169.41:27018")
... And Watching It Work

    sh.enableSharding("test")
    sh.shardCollection("test.shardtest", { _id : 1 })

    for (var i = 1; i <= 2000000; i++) {
        db.shardtest.insert( { _id : i, junk : "Some reasonably long text that will make this take up more space in the database and better illustrate sharding" })
    }