MongoDB and Python


Upload: norberto-leite

Post on 18-Jul-2015


Page 1: MongoDB and Python

MongoDB + Python

Norberto Leite Technical Evangelist [email protected]


Agenda

• Introduction to MongoDB
• pymongo
• CRUD
• Aggregation
• GridFS
• Indexes
• ODMs


Ola, I'm Norberto

Norberto Leite
Technical Evangelist
Madrid, Spain
@nleite
[email protected]
http://www.mongodb.com/norberto


MongoDB


MongoDB

• General purpose
• Document database
• Open-source


Fully Featured


MongoDB Features

JSON Document Model with Dynamic Schemas

Auto-Sharding for Horizontal Scalability

Text Search

Aggregation Framework and MapReduce

Full, Flexible Index Support and Rich Queries

Built-In Replication for High Availability

Advanced Security

Large Media Storage with GridFS


MongoDB Inc.

400+ employees
2,000+ customers
Over $311 million in funding
13 offices around the world


THE LARGEST ECOSYSTEM

9,000,000+ MongoDB Downloads

250,000+ Online Education Registrants

35,000+ MongoDB User Group Members

35,000+ MongoDB Management Service (MMS) Users

750+ Technology and Services Partners

2,000+ Customers Across All Industries


pymongo


pymongo

• MongoDB Python official driver
• Rockstar developer team
  • Jesse Jiryu Davis, Bernie Hackett
• One of the oldest and best-maintained drivers
• Python and MongoDB are a natural fit
  • BSON is very similar to dictionaries
  • (everyone likes dictionaries)
• http://api.mongodb.org/python/current/
• https://github.com/mongodb/mongo-python-driver
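To make the "BSON is very similar to dictionaries" point concrete, here is a minimal hand-rolled sketch that serialises a flat Python dict into BSON bytes using only the standard library. It is not pymongo's encoder: the function name `encode_document` is an assumption made for illustration, and only string and 32-bit integer values are handled.

```python
import struct


def encode_document(doc):
    """Encode a flat dict of str / int32 values as BSON bytes (illustrative sketch only)."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"  # element names are NUL-terminated
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # 0x02 = UTF-8 string: int32 byte length (including the NUL), then the bytes
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # 0x10 = 32-bit signed integer
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError("type not supported by this sketch: %r" % type(value))
    # Total size = 4-byte length prefix + elements + trailing NUL byte
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(encoded)
```

Each dict key becomes an element name and each value a typed element, which is why documents map so directly onto Python dictionaries.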


pymongo 2.8

• Support for upcoming MongoDB 3.0
  • New get collections and get indexes commands for WiredTiger
• Backward compatible w/ 2.6
• Future releases of pymongo (3.0)
  • Server discovery spec
  • Monitoring spec
  • Faster client startup when connecting to Replica Set
  • Faster failover
  • More robust replica set connections
  • API clean up


Connecting


Connecting

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()  # client instance


Connecting

    #!/bin/python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    mc = MongoClient(uri)


Connecting

    #!/bin/python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    # max_pool_size is the pymongo 2.x option name; pymongo 3.0+ spells it maxPoolSize
    mc = MongoClient(host=uri, max_pool_size=10)


Connecting to Replica Set

    #!/bin/python
    from pymongo import MongoClient

    # options follow a "/" after the host list
    uri = 'mongodb://127.0.0.1/?replicaSet=MYREPLICA'
    mc = MongoClient(uri)


Connecting to Replica Set

    #!/bin/python
    from pymongo import MongoClient

    uri = 'mongodb://127.0.0.1'
    mc = MongoClient(host=uri, replicaSet='MYREPLICA')
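A replica set connection string usually lists more than one seed host. The following standard-library sketch assembles such a URI; the helper name `build_mongodb_uri` and the second host and port are assumptions for illustration, not part of the slides or of pymongo.

```python
from urllib.parse import urlencode


def build_mongodb_uri(hosts, **options):
    """Join seed hosts and URI options into a mongodb:// connection string."""
    uri = "mongodb://" + ",".join(hosts) + "/"
    if options:
        # Options go after "/?", as key=value pairs
        uri += "?" + urlencode(options)
    return uri


# Hypothetical seed list: two members of the same replica set on one machine
uri = build_mongodb_uri(["127.0.0.1:27017", "127.0.0.1:27018"], replicaSet="MYREPLICA")
print(uri)
```

The resulting string can be passed to `MongoClient(uri)` exactly as in the slides above.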


Database Instance

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()

    db = mc['zurich_pug']  # database instance
    # or
    db = mc.zurich_pug


Collection Instance

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()

    coll = mc['zurich_pug']['testcollection']  # collection instance
    # or
    coll = mc.zurich_pug.testcollection


CRUD

http://www.ferdychristant.com/blog//resources/Web/$FILE/crud.jpg


Operations

• Insert
• Remove
• Update
• Query
• Aggregate
• Create Indexes
• …


Insert

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    coll.insert({'field_one': 'some value'})


Find

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    cur = coll.find({'field_one': 'some value'})
    for d in cur:
        print(d)


Update

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    result = coll.update(
        {'field_one': 'some value'},
        {"$set": {'field_one': 'new_value'}}
    )
    print(result)


Remove

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    result = coll.remove({'field_one': 'some value'})
    print(result)
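The slides use the pymongo 2.x method names (`insert`, `update`, `remove`). From pymongo 3.0 onwards the same operations are also available as `insert_one`, `find_one`, `update_one` and `delete_one`. The sketch below strings them together; the function name `crud_roundtrip` is an assumption for illustration, and `coll` is expected to be a pymongo `Collection` connected to a running server.

```python
def crud_roundtrip(coll):
    """Insert, read, update and delete one document using the pymongo 3.0+ method names."""
    # insert_one returns a result object carrying the generated _id
    inserted_id = coll.insert_one({"field_one": "some value"}).inserted_id

    # find_one returns a single dict (or None) instead of a cursor
    found = coll.find_one({"_id": inserted_id})

    # update_one applies the $set operator to the first matching document
    modified = coll.update_one(
        {"_id": inserted_id},
        {"$set": {"field_one": "new_value"}},
    ).modified_count

    # delete_one removes the first matching document
    deleted = coll.delete_one({"_id": inserted_id}).deleted_count
    return found, modified, deleted


# Usage against a running server (not executed here):
#   from pymongo import MongoClient
#   coll = MongoClient()["zurich_pug"]["testcollection"]
#   print(crud_roundtrip(coll))
```

Passing the collection in as a parameter keeps the function independent of how the client was created.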


Aggregate

http://4.bp.blogspot.com/-0IT3rIJkAtM/Uud2pTrGCbI/AAAAAAAABZM/-XUK7j4ZHmI/s1600/snowflakes.jpg


Aggregation Framework

• Analytical workload solution
• Pipeline processing
• Several stages
  • $match
  • $group
  • $project
  • $unwind
  • $sort
  • $limit
  • $skip
  • $out
• http://docs.mongodb.org/manual/aggregation/


Aggregation Framework

    #!/bin/python
    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc['zurich_pug']['testcollection']

    cur = coll.aggregate([
        {"$match": {'field_one': {"$exists": True}}},
        {"$project": {"new_label": "$field_one"}}
    ])
    for d in cur:
        print(d)
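To see what the two stages of this pipeline do, here is a pure-Python emulation over a list of dicts. It runs without a server and is only a sketch of the pipeline idea, not how MongoDB executes it; the helper names and the sample documents are assumptions for illustration.

```python
def match_exists(docs, field):
    """Emulate {"$match": {field: {"$exists": True}}}: keep documents that have the field."""
    return (d for d in docs if field in d)


def project_rename(docs, new_name, source_field):
    """Emulate {"$project": {new_name: "$source_field"}}: _id is kept by default."""
    for d in docs:
        out = {"_id": d["_id"]}
        if source_field in d:
            out[new_name] = d[source_field]
        yield out


# Hypothetical sample documents
docs = [
    {"_id": 1, "field_one": "some value"},
    {"_id": 2, "other_field": 42},
    {"_id": 3, "field_one": "new_value"},
]

# Each stage consumes the output of the previous one, like stages in a pipeline
result = list(project_rename(match_exists(docs, "field_one"), "new_label", "field_one"))
for d in result:
    print(d)
```

Chaining generators mirrors the pipeline model: documents stream through `$match` first, so later stages only see the documents that survived it.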


GridFS

http://www.appuntidigitali.it/site/wp-content/uploads/rawdata.png


GridFS

• MongoDB has a 16MB document size limit
• So how can we store data bigger than 16MB?
  • Media files (images, PDFs, long binary files …)
• GridFS
  • Convention more than a feature
  • All drivers implement this convention
• pymongo is no different
  • Very flexible approach
  • Handy out-of-the-box solution


GridFS

    #!/bin/python
    from pymongo import MongoClient
    import gridfs

    mc = MongoClient()
    database = mc.grid_example

    # call gridfs lib w/ database
    gfs = gridfs.GridFS(database)

    # open file for reading (binary mode, since GridFS stores bytes)
    read_file = open('/tmp/somefile', 'rb')

    # call put to store file and metadata
    gfs.put(read_file, author='Norberto', tags=['awesome', 'zurich', 'pug'])


GridFS

    mongo
    nair(mongod-3.1.0-pre-) grid_sample> show dbs
    grid_sample  0.246GB
    local        0.000GB
    nair(mongod-3.1.0-pre-) grid_sample> show collections
    fs.chunks  258.995MB / 252.070MB
    fs.files   0.000MB / 0.016MB

• database created
• 2 collections
  • chunks collection holds binary data
  • files collection holds metadata
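The split between the two collections can be sketched without a server. The snippet below cuts a byte string into chunk documents plus one file document, following the field names of the GridFS convention (`length`, `chunkSize`, `files_id`, `n`, `data`) and its default chunk size of 255 kB. The helper name `to_gridfs_docs` and the uuid-based id (standing in for an ObjectId) are assumptions for illustration; this is not pymongo's implementation.

```python
import uuid

DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def to_gridfs_docs(data, chunk_size=DEFAULT_CHUNK_SIZE, **extra_fields):
    """Split bytes into one fs.files-style document and a list of fs.chunks-style documents."""
    files_id = uuid.uuid4().hex  # stand-in for the ObjectId a driver would generate
    file_doc = {"_id": files_id, "length": len(data), "chunkSize": chunk_size}
    file_doc.update(extra_fields)  # e.g. author, tags
    chunk_docs = [
        {"files_id": files_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return file_doc, chunk_docs


payload = b"x" * (600 * 1024)  # hypothetical 600 kB file
file_doc, chunk_docs = to_gridfs_docs(payload, author="Norberto")
print(file_doc["length"], len(chunk_docs))

# Reading the file back is a matter of concatenating the chunks in order of n
restored = b"".join(c["data"] for c in sorted(chunk_docs, key=lambda c: c["n"]))
```

Because every chunk stays far below the 16MB document limit, the file itself can be arbitrarily large.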


Indexes


Indexes

• Single Field
• Compound
• Multikey
• Geospatial
  • 2d
  • 2dSphere - GeoJSON
• Full Text
• Hash Based
• TTL indexes
• Unique
• Sparse


Single Field Index

    from pymongo import ASCENDING, MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    # (indexed field, indexing order) pair; the second positional
    # argument of ensure_index is cache_for, not the direction
    coll.ensure_index([('some_single_field', ASCENDING)])


Compound Field Index

    from pymongo import ASCENDING, DESCENDING, MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    # list of (indexed field, indexing order) pairs
    coll.ensure_index([('field_ascending', ASCENDING),
                       ('field_descending', DESCENDING)])
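The indexing order matters because it fixes how entries are sorted inside the index. The sketch below reproduces that ordering on plain dicts; the function name `index_order` and the sample documents are assumptions for illustration, and the constants reuse the integer values 1 and -1 that pymongo's `ASCENDING` and `DESCENDING` stand for.

```python
ASCENDING, DESCENDING = 1, -1  # same integer values as the pymongo constants


def index_order(docs, spec):
    """Return docs in the order a compound index with this (field, direction) spec would hold them."""
    ordered = list(docs)
    # Stable sorts applied from the last key to the first yield a multi-key ordering
    for field, direction in reversed(spec):
        ordered.sort(key=lambda d: d[field], reverse=(direction == DESCENDING))
    return ordered


# Hypothetical sample documents
docs = [
    {"field_ascending": 1, "field_descending": 10},
    {"field_ascending": 2, "field_descending": 5},
    {"field_ascending": 1, "field_descending": 20},
]
spec = [("field_ascending", ASCENDING), ("field_descending", DESCENDING)]

ordered = index_order(docs, spec)
for d in ordered:
    print(d)
```

A query that sorts in the same way as the index (or in its exact reverse) can read the entries in order rather than sorting results itself.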


Multikey Field Index

    from pymongo import MongoClient

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    coll.insert({'array_field': [1, 2, 54, 89]})
    coll.ensure_index('array_field')  # indexed field
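A multikey index holds one entry per array element rather than one per document. The following in-memory sketch shows that idea with a plain dict; `build_multikey_index` and the sample documents are assumptions for illustration and say nothing about MongoDB's actual B-tree storage.

```python
from collections import defaultdict


def build_multikey_index(docs, field):
    """Map each array element to the _ids of the documents containing it."""
    index = defaultdict(list)
    for doc in docs:
        if field not in doc:
            continue  # simplification: documents without the field are skipped
        values = doc[field]
        if not isinstance(values, list):
            values = [values]  # scalar values get a single entry
        for value in dict.fromkeys(values):  # one entry per distinct element
            index[value].append(doc["_id"])
    return dict(index)


# Hypothetical sample documents
docs = [
    {"_id": "a", "array_field": [1, 2, 54, 89]},
    {"_id": "b", "array_field": [2, 3]},
    {"_id": "c"},
]
index = build_multikey_index(docs, "array_field")
print(index[2])
```

Looking up a single element, as in a query like `{'array_field': 2}`, then only needs the entries stored under that one key.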


Geospatial Field Index

    from pymongo import GEOSPHERE, MongoClient
    import geojson

    mc = MongoClient()
    coll = mc.zurich_pug.testcollection

    p = geojson.Point([-73.9858603477478, 40.75929362758241])
    coll.insert({'point': p})
    coll.ensure_index([('point', GEOSPHERE)])  # index type
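A 2dsphere index expects GeoJSON, where a point lists longitude first and latitude second. If the `geojson` package is not available, the same document can be built as a plain dict; the helper `geojson_point` below is an assumption for illustration, with a range check that catches out-of-bounds coordinates.

```python
def geojson_point(longitude, latitude):
    """Build a GeoJSON Point dict; coordinates are ordered [longitude, latitude]."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude out of range: %r" % longitude)
    if not -90 <= latitude <= 90:
        raise ValueError("latitude out of range: %r" % latitude)
    return {"type": "Point", "coordinates": [longitude, latitude]}


# Same coordinates as on the slide
point = geojson_point(-73.9858603477478, 40.75929362758241)
document = {"point": point}
print(document)
```

The resulting `document` is an ordinary dict, so it can be inserted into a collection in the same way as the `geojson.Point` object.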


ODM and others


Friends

• mongoengine
  • http://mongoengine.org/
• Motor
  • http://motor.readthedocs.org/en/stable/
  • async driver
  • Tornado
  • Greenlets
• ming
  • http://sourceforge.net/projects/merciless/
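At its core, an ODM maps between class instances and the dicts that the driver stores. The sketch below shows that mapping with a standard-library dataclass. It does not use the mongoengine or ming APIs, and the `Talk` class and its fields are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Talk:
    """Hypothetical model class; an ODM would generate this mapping for you."""
    title: str
    speaker: str
    tags: list = field(default_factory=list)

    def to_document(self):
        """Convert the instance into a dict ready to be inserted."""
        return asdict(self)

    @classmethod
    def from_document(cls, doc):
        """Rebuild an instance from a stored document, ignoring the _id field."""
        return cls(**{k: v for k, v in doc.items() if k != "_id"})


talk = Talk("MongoDB + Python", "Norberto Leite", ["zurich", "pug"])
document = talk.to_document()
print(document)

# A document read back from the database would also carry an _id
restored = Talk.from_document(dict(document, _id=1))
```

Real ODMs add validation, query helpers and connection handling on top of this mapping.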


Let's recap


Recap

• MongoDB is awesome
  • Especially for working with Python
• pymongo
  • super well supported
  • fully in sync with MongoDB server


MongoDB 3.0 is coming


3.0.0-RC8

• https://www.mongodb.org/downloads
• http://www.mongodb.com/lp/misc/norberto-leite
• https://jira.mongodb.org/secure/Dashboard.jspa

Please go and test it!


Obrigado! (Thank you!)

Norberto Leite Technical Evangelist @nleite [email protected]
