Data Analysis and Visualization with MongoDB [MongoDB World 2016]
![Page 1: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/1.jpg)
Data Analysis and Visualization with MongoDB
Alexander C. S. Hendorf @hendorf
MongoDB World 2016, NYC
![Page 2: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/2.jpg)
Alexander C. S. Hendorf
CTO Königsweg GmbH
MongoDB Master 2016, MUG Leader
EuroPython organizer + program chair
Speaker: EuroPython, MongoDB Days, CeBIT, PyCon IT, PyData…
Hobbies: see above
@hendorf
![Page 3: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/3.jpg)
#15
180+ sessions
20 free trainings
interactive sessions
panels, open spaces
social event
5 days of talks & trainings
2 days of sprints
beginners’ day
17th - 24th of July
@EuroPython
![Page 4: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/4.jpg)
![Page 5: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/5.jpg)
2003 2012 2016
![Page 6: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/6.jpg)
2003 2012 2016
![Page 7: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/7.jpg)
2003 2012 2016
![Page 8: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/8.jpg)
2003 2012 2016
Apple launches the iTunes music store in the U.S.
![Page 9: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/9.jpg)
2003 2012 2016
Apple launches the iTunes music store in the U.S.
14
77
122
2x 122
global coverage + Apple Music
x 10 genres x 3 kinds (Single, Album, Video)
![Page 10: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/10.jpg)
![Page 11: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/11.jpg)
![Page 12: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/12.jpg)
RDBMS Data Lake
![Page 13: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/13.jpg)
MongoDB Data Lake
![Page 14: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/14.jpg)
Aggregation Framework
![Page 15: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/15.jpg)
![Page 16: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/16.jpg)
![Page 17: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/17.jpg)
{'_id': ObjectId('56deffde0947000f05fc415a'),
 'adamIds': ['1067854407', '1063750649', '1064007468', '1066300693', ...
             '271232254', '453857235', '377045644'],
 'kinds': {'album': True},
 'title': 'Top Albums',
 'discovered': 1457447797.81184,
 'store-id': '143444',
 'url': 'https://itunes…/viewTop?id=27740&genreId=50'}
adamIds: array; position in the array = rank in the charts (1, 2, 3, … 200)
discovered: Unix timestamp
url: chart id
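A minimal sketch of how these annotations can be read in Python: the rank of a product is its position in `adamIds` plus one, and `discovered` converts to a UTC datetime with the standard library. The helper name `rank_of` and the shortened sample document are assumptions for illustration and do not come from the talk.

```python
from datetime import datetime, timezone

# Shortened sample modelled on the slide; the real array holds up to 200 ids.
chart = {
    "adamIds": ["1067854407", "1063750649", "1064007468", "1066300693"],
    "discovered": 1457447797.81184,
}


def rank_of(doc, adam_id):
    """Return the 1-based chart rank of adam_id, or None if it is not listed."""
    try:
        return doc["adamIds"].index(adam_id) + 1
    except ValueError:
        return None


# Unix timestamp (seconds since the epoch) to a timezone-aware datetime.
seen_at = datetime.fromtimestamp(chart["discovered"], tz=timezone.utc)

print(rank_of(chart, "1064007468"))
print(seen_at.strftime("%Y-%m-%d %H:%M:%S"))
```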
![Page 18: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/18.jpg)
'adamIds': ['1067854407', '1063750649', '1064007468', '1066300693', '296867433', '956751167', '328069028', '505586080', '676328847', '642644496', '271232254', '453857235', '377045644'],
'discovered': 1457447797
'adamIds': ['1067854407', '453857235', '1063750649', '1066300693', '296867433', '328069028', '292372676', '505586080', '676328847', '642644496', '956751167', '271232254', '544816699'],
'discovered': 1457447836
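To make the difference between two consecutive snapshots concrete, here is a plain-Python sketch that compares the two arrays as printed on the slide. The function name `movements` and the convention "positive number = climbed" are assumptions for illustration, not the author's code.

```python
older = ['1067854407', '1063750649', '1064007468', '1066300693', '296867433',
         '956751167', '328069028', '505586080', '676328847', '642644496',
         '271232254', '453857235', '377045644']
newer = ['1067854407', '453857235', '1063750649', '1066300693', '296867433',
         '328069028', '292372676', '505586080', '676328847', '642644496',
         '956751167', '271232254', '544816699']


def movements(before, after):
    """Map each product in `after` to its rank change (None for new entries)."""
    before_rank = {pid: i + 1 for i, pid in enumerate(before)}
    result = {}
    for i, pid in enumerate(after):
        new_rank = i + 1
        old_rank = before_rank.get(pid)
        # Positive value: the product climbed; negative: it fell.
        result[pid] = None if old_rank is None else old_rank - new_rank
    return result


moves = movements(older, newer)
# Products that were in the older snapshot but are missing from the newer one.
dropped = [pid for pid in older if pid not in set(newer)]

print(moves["453857235"], dropped)
```

Comparing snapshots pairwise like this does not scale to a whole time series, which is where the aggregation pipelines come in.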
![Page 19: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/19.jpg)
![Page 20: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/20.jpg)
[Chart: rank (1 to 200) on the y-axis, documents / time on the x-axis]
![Page 21: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/21.jpg)
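pipeline = [
    # select the snapshots of one chart within a time range
    {"$match": {
        "discovered": {"$gte": 1457447797, "$lte": 1457447836},
        "url": "http://the/url/is/a/identifier/the/chart/"}},
    # emit one document per array element
    {"$unwind": "$adamIds"},
    {"$group": …
        "$push": "$adamIds"}
]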
![Page 22: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/22.jpg)
pipeline1 = [
    {"$match": {…}},
    {"$project": {"products": "$chart.adamIds",
                  "discovered": "$downloadinfo.discovered"}},
    # unwind with numbering
    {"$unwind": {"path": "$products",
                 "includeArrayIndex": "arrayIndex"}},
    {"$project": {"product": "$products",
                  # arrayIndex attribute was added by $unwind, is 0-indexed
                  "rank": {"$add": ["$arrayIndex", 1]},
                  "discovered": 1,
                  # any '_id' attribute must be unique for storing, rename
                  "_id": 0,
                  "origin_id": "$_id"}},
    {"$sort": {"origin_id": -1}},
    # save as new collection
    {"$out": "individual_movements"}
]
1
'products': ['1067854407', '1063750649', ... '642644496', '377045644'],'discovered': 1457447797
![Page 23: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/23.jpg)
[ {'_id': ObjectId('572c69bc8651fa448821083b'),
'discovered': 1441110721.19208,
'origin_id': ObjectId('55e59b260947007aef84dccb'),
'rank': 1,
'product': '1032438740'},
{'_id': ObjectId('572c69bc8651fa448821083c'),
'discovered': 1441110721.19208,
'origin_id': ObjectId('55e59b260947007aef84dccb'),
'rank': 2,
'product': '976241375'}, …
]
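What pipeline1 does to a single snapshot can be emulated in plain Python, which may help when reading the stages. This is only an illustrative emulation: the function name `unwind_with_rank` is hypothetical, and the ObjectId is written as a plain string so the sketch runs without a database.

```python
def unwind_with_rank(snapshot):
    """Emulate $unwind with includeArrayIndex followed by the $project stage."""
    rows = []
    for array_index, product in enumerate(snapshot["products"]):
        rows.append({
            "product": product,
            # arrayIndex is 0-indexed, so add 1 to get the chart rank
            "rank": array_index + 1,
            "discovered": snapshot["discovered"],
            # the original _id is kept under a new name
            "origin_id": snapshot["_id"],
        })
    return rows


# Sample snapshot built from the values shown on the slide.
snapshot = {
    "_id": "55e59b260947007aef84dccb",
    "products": ["1032438740", "976241375"],
    "discovered": 1441110721.19208,
}

rows = unwind_with_rank(snapshot)
for row in rows:
    print(row)
```

In the real pipeline the new `_id` values are generated by MongoDB when `$out` writes the collection, which is why the emulated rows carry no `_id`.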
![Page 24: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/24.jpg)
pipeline2 = [
{"$group": {"_id": "$origin_id", "discovered": {"$first": "$discovered"}}},
{"$project": {"discovered": 1, "_id": 1}},
{"$out": "x_axis"}]
2
![Page 25: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/25.jpg)
[{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'discovered': 1435710159.830053},
{'_id': ObjectId('5594ae5f09470044a56f1c61'),
'discovered': 1435807294.457157},
{'_id': ObjectId('5594bcaac419ab6d280b740e'),
'discovered': 1435810952.364217}]
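The `$group` stage with `$first` in pipeline2 reduces the many per-product rows to one document per snapshot. A rough plain-Python equivalent, assuming rows shaped like those in `individual_movements`, could look like this; the function name `build_x_axis` and the shortened string ids are made up for the example.

```python
def build_x_axis(rows):
    """Emulate {"$group": {"_id": "$origin_id", "discovered": {"$first": ...}}}."""
    first_seen = {}
    for row in rows:
        # setdefault keeps the value of the first row seen per origin_id
        first_seen.setdefault(row["origin_id"], row["discovered"])
    return [{"_id": oid, "discovered": ts} for oid, ts in first_seen.items()]


# Hypothetical input: two snapshots ("a" and "b") with two products each.
rows = [
    {"origin_id": "a", "discovered": 1435710159.830053, "rank": 1},
    {"origin_id": "a", "discovered": 1435710159.830053, "rank": 2},
    {"origin_id": "b", "discovered": 1435807294.457157, "rank": 1},
    {"origin_id": "b", "discovered": 1435807294.457157, "rank": 2},
]

x_axis = build_x_axis(rows)
print(x_axis)
```

Python dictionaries keep insertion order, whereas MongoDB does not guarantee the output order of `$group`, so an explicit `$sort` would be needed if the order matters.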
![Page 26: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/26.jpg)
pipeline3 = [ {"$lookup": {"from": "individual_movements", "localField": "_id", "foreignField": "origin_id", "as": "values"}},
{"$unwind": "$values"},
{"$project": {"product": "$values.product", "rank": "$values.rank", "discovered": 1}}
]
3
![Page 27: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/27.jpg)
![Page 28: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/28.jpg)
x_axis collection: documents / time
![Page 29: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/29.jpg)
[Chart: individual_movements rank (1 to 200) on the y-axis; x_axis collection: documents / time on the x-axis]
![Page 30: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/30.jpg)
[{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'discovered': 1435710159.830053,
'rank': 1,
'product': '1000697870'},
{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'discovered': 1435710159.830053,
'rank': 2,
'product': '986637877'},
{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'discovered': 1435710159.830053,
'rank': 3,
'product': '995987630'},…]
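The shape of this result follows from joining each `x_axis` document with its matching `individual_movements` rows and flattening the joined array. As a hedged illustration, the same join can be written in plain Python; `lookup_and_flatten` is a hypothetical name and the ids are plain strings rather than ObjectIds.

```python
def lookup_and_flatten(x_axis, individual_movements):
    """Emulate $lookup on origin_id, then $unwind and $project."""
    by_origin = {}
    for movement in individual_movements:
        by_origin.setdefault(movement["origin_id"], []).append(movement)

    result = []
    for point in x_axis:
        # A point without matches yields nothing, like $unwind on an empty array.
        for movement in by_origin.get(point["_id"], []):
            result.append({
                "_id": point["_id"],
                "discovered": point["discovered"],
                "rank": movement["rank"],
                "product": movement["product"],
            })
    return result


x_axis = [{"_id": "559332e6c419ab6d8b0738f9", "discovered": 1435710159.830053}]
individual_movements = [
    {"origin_id": "559332e6c419ab6d8b0738f9", "rank": 1, "product": "1000697870"},
    {"origin_id": "559332e6c419ab6d8b0738f9", "rank": 2, "product": "986637877"},
    {"origin_id": "some-other-snapshot", "rank": 1, "product": "995987630"},
]

flat = lookup_and_flatten(x_axis, individual_movements)
print(flat)
```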
![Page 31: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/31.jpg)
![Page 32: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/32.jpg)
Data Scientists?
![Page 33: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/33.jpg)
![Page 34: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/34.jpg)
![Page 35: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/35.jpg)
Data Scientists!
• Grant access with the built-in role management
• Data scientists can analyse the data with typical tools such as pandas, R, etc.
• A piece of cake with VIEWs coming in 3.4 :-)
![Page 36: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/36.jpg)
Data
![Page 37: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/37.jpg)
Visualization!
[Chart: y-axis 0 to 100, x-axis April, May, June, July]
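One way to prepare flattened rank records for a line chart is to regroup them into one time-ordered series per product. The sketch below is an assumption about how that could be done in plain Python, not the method used in the talk; `to_series` is a hypothetical helper and the ranks in the second snapshot are made up.

```python
from collections import defaultdict


def to_series(records):
    """Group flat records into {product: [(discovered, rank), ...]} by time."""
    series = defaultdict(list)
    for record in sorted(records, key=lambda r: r["discovered"]):
        series[record["product"]].append((record["discovered"], record["rank"]))
    return dict(series)


records = [
    # second snapshot (ranks invented for the example)
    {"discovered": 1435807294.457157, "rank": 2, "product": "1000697870"},
    {"discovered": 1435807294.457157, "rank": 1, "product": "986637877"},
    # first snapshot, values as on the slides
    {"discovered": 1435710159.830053, "rank": 1, "product": "1000697870"},
    {"discovered": 1435710159.830053, "rank": 2, "product": "986637877"},
]

series = to_series(records)
for product, points in series.items():
    print(product, points)
```

Each series can then be handed to a plotting library as x values (timestamps) and y values (ranks).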
![Page 38: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/38.jpg)
Analysts?
![Page 39: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/39.jpg)
BI Connector
![Page 40: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/40.jpg)
![Page 41: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/41.jpg)
{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'rank': 0,
'product': '1000697870',
'abc': ["a", "b", "c"]
}
![Page 42: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/42.jpg)
schema:
- db: mydatabase
  tables:
  - table: my_mongodb_collection
    collection: my_mongodb_collection
    pipeline: []
    columns:
    - Name: _id
      MongoType: bson.ObjectId
      SqlName: _id
      SqlType: varchar
    - Name: rank
      MongoType: int
      SqlName: rank
      SqlType: numeric
    - Name: product
      MongoType: string
      SqlName: product
      SqlType: varchar
    …
{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
'rank': 0,
'product': '1000697870',
'abc': ["a", "b", "c"]
}
![Page 43: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/43.jpg)
      SqlName: _id
      SqlType: varchar
    - Name: rank
      MongoType: int
      SqlName: rank
      SqlType: numeric
    - Name: product
      MongoType: string
      SqlName: product
      SqlType: varchar
    …
  - table: my_mongodb_collection_abc
    collection: my_mongodb_collection
    pipeline:
    - $unwind:
        includeArrayIndex: abc
        path: $abc
    columns:
    - Name: abc
      MongoType: string
      SqlName: abc
      …
{'_id': ObjectId('559332e6c419ab6d8b0738f9'),
 'rank': 0,
 'product': '1000697870',
 'abc': ["a", "b", "c"]
}
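The schema above maps one document to two relational tables: scalar fields go to the main table, and the array is unwound into a second table with one row per element. The following plain-Python sketch only illustrates that idea and is not how the BI Connector is implemented; the function name `to_tables` and the index column name `abc_idx` are assumptions, and the ObjectId is written as a string.

```python
def to_tables(doc):
    """Split a document into a main-table row and rows for the array table."""
    main_row = {"_id": doc["_id"], "rank": doc["rank"], "product": doc["product"]}
    abc_rows = [
        # one row per array element, linked back to the parent by _id
        {"_id": doc["_id"], "abc_idx": index, "abc": value}
        for index, value in enumerate(doc["abc"])
    ]
    return main_row, abc_rows


doc = {
    "_id": "559332e6c419ab6d8b0738f9",
    "rank": 0,
    "product": "1000697870",
    "abc": ["a", "b", "c"],
}

main_row, abc_rows = to_tables(doc)
print(main_row)
print(abc_rows)
```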
![Page 44: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/44.jpg)
mongobiuser create user mongodb://localhost:27017/myDB  # create a user account
mongodrdl --host localhost -d myDB -o schema.drdl  # create the schema
mongobischema import user schema.drdl  # load the schema
![Page 45: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/45.jpg)
BI Connector
![Page 46: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/46.jpg)
![Page 47: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/47.jpg)
![Page 48: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/48.jpg)
Analysts!
BI Connector
• access MongoDB from BI tools
• mock-up of an RDBMS
• BI user accounts
![Page 49: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/49.jpg)
![Page 50: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/50.jpg)
![Page 51: Data analysis and visualization with mongo db [mongodb world 2016]](https://reader031.vdocuments.mx/reader031/viewer/2022022200/58a727221a28ab0d0d8b5309/html5/thumbnails/51.jpg)
Alexander C. S. Hendorf @hendorf