The Weather of the Century, Part 3: Visualization

Post on 29-Aug-2014

Category:

Technology

DESCRIPTION

MongoDB natively supports geospatial indexing and querying, and it integrates easily with open source visualization tools. In this presentation, learn high-performance techniques for querying and retrieving geospatial data, and how to create a rich visual representation of global weather data using Python, Monary, and Matplotlib.

TRANSCRIPT

A. Jesse Jiryu Davis

#MongoDBWorld

The Weather of the Century, Part III: Visualization

Senior Python Engineer, MongoDB

Serious MongoDB Talk

Database

Serious MongoDB Talk

This Talk

Where’s the data from?

How Much Is There?

Deployment

Visualization

Visualization Pipeline

MongoDB -> PyMongo -> Python dicts -> NumPy -> Matplotlib

SciPy

    import numpy
    import pymongo

    data = []
    db = pymongo.MongoClient().my_database

    # Pull longitude, latitude, and temperature out of each document.
    for doc in db.collection.find(query):
        data.append((
            doc['position']['coordinates'][0],
            doc['position']['coordinates'][1],
            doc['airTemperature']['value']))

    arrays = numpy.array(data)

    # NumPy column access syntax.
    lons = arrays[:, 0]
    lats = arrays[:, 1]
    temps = arrays[:, 2]

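    import numpy
    from scipy.interpolate import griddata
    from matplotlib import pyplot

    # Regular 1-degree grid: xs are longitudes, ys are latitudes.
    xs = numpy.linspace(-180, 180, 361)
    ys = numpy.linspace(-90, 90, 181)
    grid_x, grid_y = numpy.meshgrid(xs, ys)
    # Interpolate the scattered station temperatures onto the grid.
    zs = griddata((lons, lats), temps,
                  (grid_x, grid_y),
                  method='linear')

    pyplot.contour(xs, ys, zs)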

Magic!!

Also magic!!

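    import numpy
    from scipy.interpolate import griddata
    from matplotlib import pyplot

    # Regular 1-degree grid: xs are longitudes, ys are latitudes.
    xs = numpy.linspace(-180, 180, 361)
    ys = numpy.linspace(-90, 90, 181)
    grid_x, grid_y = numpy.meshgrid(xs, ys)
    # Interpolate the scattered station temperatures onto the grid.
    zs = griddata((lons, lats), temps,
                  (grid_x, grid_y),
                  method='linear')

    pyplot.contour(xs, ys, zs)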
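The gridding step can be tried without a database. The sketch below is not from the talk: it assumes NumPy and SciPy are installed and uses made-up station positions and a made-up linear temperature field. Because the field is linear, linear interpolation should reproduce it almost exactly, which gives a simple sanity check.

```python
import numpy
from scipy.interpolate import griddata

# Synthetic "stations": the four corners of the map plus random interior
# points. All positions are invented for illustration.
rng = numpy.random.default_rng(0)
corners = numpy.array([[-180.0, -90.0], [-180.0, 90.0],
                       [180.0, -90.0], [180.0, 90.0]])
interior = numpy.column_stack([rng.uniform(-180, 180, 200),
                               rng.uniform(-90, 90, 200)])
stations = numpy.vstack([corners, interior])
lons, lats = stations[:, 0], stations[:, 1]

# Invented temperature field that varies linearly with position.
temps = 50.0 + 0.1 * lons - 0.3 * lats

# Regular grid kept strictly inside the area covered by the stations.
xs = numpy.linspace(-170, 170, 35)
ys = numpy.linspace(-80, 80, 17)
grid_x, grid_y = numpy.meshgrid(xs, ys)

# Triangulate the stations and interpolate linearly onto the grid.
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')

# Compare against the known field at the grid points.
expected = 50.0 + 0.1 * grid_x - 0.3 * grid_y
max_error = numpy.abs(zs - expected).max()
```

Passing `xs`, `ys`, and `zs` to `pyplot.contour` would then draw the contour lines, as in the slide.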

Triangulation

What temperature?

Triangulation

Barycentric Interpolation

What temperature? (vertex readings: 53, 48, 54)

Weighted Average: 51.1
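The weighted average can be sketched in plain Python. The slides give only the three vertex readings (53, 48, 54) and the result (51.1), not the geometry, so the triangle coordinates and query point below are hypothetical. They were chosen so that the barycentric weights come out to 0.5, 0.4, and 0.1, which reproduces the slide's value.

```python
def barycentric_weights(p, a, b, c):
    """Return the weights (wa, wb, wc) of point p relative to triangle abc."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    # The three weights always sum to 1.
    return wa, wb, 1.0 - wa - wb


def interpolate(p, vertices, values):
    """Average the vertex values, weighted by p's barycentric coordinates."""
    weights = barycentric_weights(p, *vertices)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical geometry; only the readings come from the slides.
vertices = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
readings = [53.0, 48.0, 54.0]
query_point = (4.0, 1.0)

temperature = interpolate(query_point, vertices, readings)
print(round(temperature, 1))  # prints 51.1
```

A point closer to a vertex gets a larger weight for that vertex, so the interpolated value changes smoothly across the triangle and matches the readings exactly at the corners.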

Interpolation

51.1

Interpolation

Contours

    import numpy
    import pymongo

    data = []
    db = pymongo.MongoClient().my_database

    # Pull longitude, latitude, and temperature out of each document.
    for doc in db.collection.find(query):
        data.append((
            doc['position']['coordinates'][0],
            doc['position']['coordinates'][1],
            doc['airTemperature']['value']))

    arrays = numpy.array(data)

Not terrifically fast

Analyzing large datasets

• Querying: 109k documents per second

• (On localhost)

• Can we go faster?

• Enter “Monary”

MongoDB -> PyMongo -> Python dicts -> NumPy -> Matplotlib

MongoDB -> Monary -> NumPy -> Matplotlib

Monary by David Beach

    import monary

    monary_connection = monary.Monary()

    # Each requested field comes back as its own NumPy array.
    arrays = monary_connection.query(
        db='my_database',
        coll='collection',
        query=query,
        fields=['lon', 'lat', 'temp'],
        types=[
            'float32', 'float32', 'float32'])

Monary

• PyMongo: 109k documents per second

• Monary: 817k documents per second

Visualization

    {
      ts: ISODate("1991-01-01T00:00:00Z"),
      position: {
        type: "Point",
        coordinates: [
          -94.6,
          39.117
        ]
      },
      airTemperature: {
        value: 45,
        quality: "1"
      }
    }

Original Schema

    {
      ts: ISODate("1991-01-01T00:00:00Z"),
      lon: -94.6,
      lat: 39.117,
      temp: 45
    }

Target Schema
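The talk does not show how documents were converted from the original schema to the target one. A minimal sketch of one way to do it in plain Python follows; the function name is hypothetical. It relies on the GeoJSON convention that a Point's coordinates are ordered [longitude, latitude].

```python
from datetime import datetime


def flatten_observation(doc):
    """Turn a nested observation into the flat lon/lat/temp layout."""
    # GeoJSON stores a Point as [longitude, latitude].
    lon, lat = doc['position']['coordinates']
    return {
        'ts': doc['ts'],
        'lon': lon,
        'lat': lat,
        'temp': doc['airTemperature']['value'],
    }


# The sample document from the slides, written as a Python dict.
original = {
    'ts': datetime(1991, 1, 1),
    'position': {'type': 'Point', 'coordinates': [-94.6, 39.117]},
    'airTemperature': {'value': 45, 'quality': '1'},
}

flat = flatten_observation(original)
```

With top-level numeric fields, each column can be requested by name in the Monary query, as in the `fields=['lon', 'lat', 'temp']` call.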

Future of Monary

• Author: David Beach

• Interns: Kyle Suarez, Matt Cotter

• Mentors: A. Jesse Jiryu Davis, Jason Carey

Future of Monary

• Subdocuments: "airTemperature.value"

• Aggregation cursor

• Packaging

• Bugfixes

• Python 3

Thanks

• Monary

• NumPy

• SciPy

• Matplotlib

Thank you

#MongoDBWorld

A. Jesse Jiryu Davis, Senior Python Engineer, MongoDB
