When?, Why? and What? of MongoDB
DESCRIPTION
I've been using MongoDB for two years, and many times I've found myself asking "Why should I use it for this?" or "When should I really use MongoDB?", and many other times "What did I do wrong?". Experience, examples and real use cases often say things that benchmarks and technical documentation don't. Therefore, I'll be presenting the When, Why and What of MongoDB. In the end, that's what really matters.
TRANSCRIPT
Flavio [FlaPer87] Percoco [email protected]
twitter: @flaper87
What? Why?
When?
Sunday, May 8, 2011
When?
Logging!
Statistics!
Dictionaries!
Spidering!
Queues!
Why?
* Lots of writes! (Logging, Statistics, Queues)
* Lots of reads! (Dictionaries, Queues)
* Unstructured data! (Spidering)
* [JB]SON-like, document-oriented API (All)
What?
* Make sure you create the right indexes

import pymongo

# let's get our collection (`connection` is an already-open pymongo connection)
collection = connection['dictionaries']['it']

# let's ensure there's an index for the key word
collection.ensure_index([("word", pymongo.ASCENDING)])

def insert_word(word, data):
    # upsert: update the document for this word, or insert it if missing
    collection.update({'word': word}, data, upsert=True)
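The slide uses the 2011-era PyMongo calls `ensure_index` and `update`. A minimal sketch of the same idea with the method names current PyMongo releases provide (`create_index` and `replace_one`) might look like this. The helper name `ensure_word_index`, passing the collection as a parameter, and copying `word` into the stored document are assumptions made for illustration, not part of the original talk.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING, so no driver import is needed here

def ensure_word_index(collection):
    """Create an index on 'word' so lookups by word avoid a full scan."""
    return collection.create_index([("word", ASCENDING)])

def insert_word(collection, word, data):
    """Insert the entry for `word`, or fully replace it if it already exists."""
    # Assumption: store the lookup key inside the document as well,
    # so the entry can still be found by word after a full replacement.
    document = dict(data, word=word)
    return collection.replace_one({"word": word}, document, upsert=True)
```

With a real connection, `collection` would be something like `client['dictionaries']['it']`; any object exposing `create_index` and `replace_one` works for trying the functions out.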
What?
* Make sure you save what you really need

import time
import urlparse  # Python 2 module; it lives in urllib.parse on Python 3

def parse(response):
    url_netloc = urlparse.urlsplit(response.url).netloc
    crawled = {
        "url": response.url,
        "base_url": url_netloc,
        "content": response.body_as_unicode(),
        "status": response.status,
        "encoding": response.encoding,
        "headers": response.headers,
        "lastcrawl": time.time(),
    }
    # upsert: keep one document per URL, replaced on every crawl
    collection.update({'url': response.url}, crawled, upsert=True)
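One possible reading of "save what you really need" is to trim the record before it reaches the database. The sketch below builds the same document shape from plain values and keeps only a few headers. The whitelist `KEPT_HEADERS` and the function `build_crawl_record` are hypothetical names chosen for this example; which fields a crawler really needs depends on the application.

```python
import time
from urllib.parse import urlsplit

# Hypothetical whitelist: only the headers this crawler would later query on.
KEPT_HEADERS = ("content-type", "last-modified", "etag")

def build_crawl_record(url, status, encoding, headers, body):
    """Return the document to store for one crawled page."""
    # Lower-case header names and drop everything outside the whitelist.
    wanted = {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() in KEPT_HEADERS
    }
    return {
        "url": url,
        "base_url": urlsplit(url).netloc,
        "content": body,
        "status": status,
        "encoding": encoding,
        "headers": wanted,
        "lastcrawl": time.time(),
    }
```

The resulting dictionary can be upserted by `url` exactly as in the slide, so the storage code does not change, only the size of what gets stored.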
What?
* Make sure you understand that schemaless != mess

logs = [
    {'url': "http://www.google.com", "time": 1304336526.011287},
    {'address': "http://www.yahoo.com", "time": 1304336551.0424709},
]

def insert_log():
    # inserts every log as-is, so documents end up with inconsistent keys
    for log in logs:
        collection.insert(log)
What?
* Make sure you understand that schemaless != mess

logs = [
    {'url': "http://www.google.com", "time": 1304336526.011287},
    {'address': "http://www.yahoo.com", "time": 1304336551.0424709},
]

def insert_log():
    for log in logs:
        # normalize the key names before inserting
        log_to_insert = {
            "url": log.get('url', log.get('address')),
            "time": log.get('time'),
        }
        collection.insert(log_to_insert)
What?
* “Relate” what you occasionally need, “Embed” what you always need

message = {
    'msg': "This is a test message",
    'time': time.time(),
    'user': {
        'username': 'flaper87',
        'email': '[email protected]',
    },
}
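To make the embed/relate distinction concrete, here is a small sketch that uses plain dictionaries instead of a live database. The `profiles` mapping stands in for a separate collection, and `build_message`, `load_profile` and the `profile_id` field are hypothetical names introduced only for this illustration.

```python
import time

# Hypothetical stand-in for a separate "profiles" collection, keyed by user id.
profiles = {
    "flaper87": {"bio": "Example bio", "website": "http://example.com"},
}

def build_message(text, username):
    """Embed what every reader of the message needs; reference the rest."""
    return {
        "msg": text,
        "time": time.time(),
        # Embedded: displayed every time the message is shown.
        "user": {"username": username},
        # Related: resolved only when someone opens the profile.
        "profile_id": username,
    }

def load_profile(message):
    """Follow the reference with a second lookup, only when it is needed."""
    return profiles.get(message["profile_id"])

message = build_message("This is a test message", "flaper87")
```

Reading a message costs a single lookup because the username travels with it, while the profile costs a second lookup that is paid only on the occasions it is wanted.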
What?
* ObjectIDs have an embedded datetime

def _get(self, queue):
    try:
        # sorting by _id removes and returns the oldest message first
        msg = self.client.database.command(
            "findandmodify", "messages",
            query={"queue": queue},
            sort={"_id": pymongo.ASCENDING},
            remove=True)
    except errors.OperationFailure as exc:
        if "No matching object found" in exc.args[0]:
            raise Empty()
        raise
    return deserialize(msg["value"]["payload"])
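The embedded datetime can be read without any driver: the first 4 of an ObjectId's 12 bytes hold the creation time as big-endian seconds since the Unix epoch. The sketch below decodes it from the 24-character hex form using only the standard library. The function name `objectid_time` is an assumption made for this example.

```python
from datetime import datetime, timezone

def objectid_time(oid_hex):
    """Return the creation time encoded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # First 4 bytes (8 hex digits): big-endian seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

This is why sorting by `_id` in ascending order, as the queue code on the slide does, returns documents in roughly the order they were created. With the driver installed, the `generation_time` attribute of `bson.ObjectId` exposes the same value.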
Thanks!!
Let's talk about MongoDB!!
Thanks 10gen!!