Taking Society’s Pulse in Real Time
TweetMap
Ben Lewis, Center for Geographic Analysis, Harvard
TweetMap is…
• Part of the WorldMap platform
• Beginnings of a big data capability in WorldMap
• Interactive analytics for geo-referenced Twitter data
• Testing ground for new ways to scale WorldMap, in terms of visualization, analytics, and search against its large number of layers.
WorldMap is…
• Designed to lower barriers for researchers who would like to use spatial technology
• Web-based, cloud hosted platform
• Made available to the world
• Open source software with a global developer community
WorldMap Stats
In the past year and a half…
7,844 registered users
8,199 layers added or created
2,054 map collections created
350,000 visitors
WorldMap Growth
Traffic by City
TweetMap
What we mean by real-time
• Immediate Updates: MapD can support real-time database updates (in development).
• Interactive Analysis: Response time for query and visualization against very large datasets is fast enough to let users explore them interactively.
Key Features of MapD
• MapD is capable of providing real-time query/visualization access to a billion records
• Visualizations against keywords, geography, and time simultaneously
• Open source and built on cheap hardware
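Querying against keywords, geography, and time simultaneously can be pictured as one combined filter. The following is a minimal sketch in plain Python, not MapD's API: the `Tweet` record, the `filter_tweets` function, the bounding box, and the sample records are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    """Hypothetical geo-referenced tweet record."""
    text: str
    lon: float
    lat: float
    created_at: datetime


def filter_tweets(tweets, keyword, bbox, start, end):
    """Return tweets that match a keyword, fall inside a bounding box,
    and were created in the half-open interval [start, end).

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    needle = keyword.lower()
    return [
        t for t in tweets
        if needle in t.text.lower()              # keyword condition
        and min_lon <= t.lon <= max_lon          # geography condition
        and min_lat <= t.lat <= max_lat
        and start <= t.created_at < end          # time condition
    ]


# Made-up sample records, for illustration only.
tweets = [
    Tweet("Heavy snow in Boston", -71.06, 42.36, datetime(2012, 12, 29, 9, 0)),
    Tweet("Snow day!", -118.24, 34.05, datetime(2012, 12, 29, 10, 0)),
    Tweet("Rain again", -71.10, 42.37, datetime(2012, 12, 29, 11, 0)),
]
new_england = (-73.7, 40.9, -66.9, 47.5)  # rough box, an assumption
hits = filter_tweets(tweets, "snow", new_england,
                     datetime(2012, 12, 1), datetime(2013, 1, 1))
print([t.text for t in hits])
```

A linear scan like this would not be interactive at a billion records; the point is only that all three conditions are applied in a single pass.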
New kind of tool for a new kind of media
5 Million Human Sensors
“snow” (December)
“rain”
“مطر” (“rain” in Arabic)
“Dunkin”
Dunkin
“y’all”
“sick”
“enfermo”
“newtown”
“earthquake”
Bringing in other layers for analysis
“earthquake”
Under development
• Real time streaming from Twitter to map
• Support for larger datasets
• Trend analysis
• Spatial overlay and regression
• Machine learning
Robert Kirkpatrick of UN’s Global Pulse Initiative:
• Our challenge in international development is that we are still using 3-year-old data to make 5-year plans.
• There is an opportunity to harness the phenomenon of big data to provide real-time feedback on where things ARE NOT working so we can take rapid action to fix them.
UN Global Pulse: Big Data for Development
Many forms of big data
• These may be less accurate than official sources
• But are much faster to get
• And cheaper to collect (free?)
• SO… how to leverage speed to change the outcome?
Anonymized cell phone data can provide information on
• Mobility
– Diameter of mobility and social network
– Radius of gyration
– Mobility patterns
• Social
– Degree of the social network
– Weight of the contacts, frequency of communication
• Consumption
– Number of calls, call duration, SMS/MMS/voice
– Size, frequency, total number of airtime purchases
– Handset type and features
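As an illustration of one of the mobility measures listed above, here is a minimal sketch of a radius of gyration calculation. The slides only name the measure; the definition assumed here is the root-mean-square distance of a user's observed locations from their centroid, and the function name, planar coordinates, and sample positions are hypothetical.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid.

    points: list of (x, y) positions in a planar projection (e.g. km).
    Using planar coordinates is a simplifying assumption; real call
    records would carry latitude/longitude of cell towers.
    """
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Hypothetical tower positions (km) for one user's calls:
# half from one location, half from another 10 km away.
two_places = [(0.0, 0.0), (6.0, 8.0), (0.0, 0.0), (6.0, 8.0)]
rg = radius_of_gyration(two_places)
print(rg)
```

A user who only ever calls from one tower has a radius of zero; the value grows as calls spread over a wider area.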
Which can be used in various ways
• Real-time Events: Earlier detection of anomalies and events allows rapid response to crises.
• Real-time Trends: A current analysis of population activities and dynamics supports more effective program planning and implementation.
• Real-time Evaluation: Real-time measurement of impact-related behavior change allows for rapid, adaptive course corrections in programs.
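The "earlier detection of anomalies" idea can be sketched with a simple trailing-window z-score. This is an illustrative assumption, not a method described by Global Pulse: the function name, window size, threshold, and daily counts are all made up.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=7, threshold=3.0):
    """Return indices whose value deviates from the mean of the
    preceding `window` values by more than `threshold` sample
    standard deviations."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale to measure
            # deviation against, so this sketch skips the point.
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up daily counts of a keyword; day 8 is a deliberate spike.
daily_mentions = [12, 15, 11, 14, 13, 12, 16, 14, 95, 13]
print(flag_anomalies(daily_mentions))
```

Note that the spike inflates the standard deviation of later windows, so a production detector would likely use a more robust baseline.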
Coming soon…
Thank you
Telephone Calls
Airplane Flights
What is meant by real-time
Global Pulse definition: Information about a phenomenon available quickly enough to maintain an accurate reflection of its current state, such that effective action may be taken in response.
• Malnutrition -> Months
• Starvation -> Weeks
• Cholera -> Days
• Earthquake -> Hours
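The definition above makes "real-time" relative to the phenomenon. A minimal sketch of that idea follows; the concrete durations assigned to "months", "weeks", "days", and "hours" are assumptions chosen for illustration, as are the names `RESPONSE_WINDOW` and `is_real_time`.

```python
from datetime import timedelta

# Order-of-magnitude response windows taken from the list above.
# The exact durations are illustrative assumptions.
RESPONSE_WINDOW = {
    "malnutrition": timedelta(days=30),  # "months"
    "starvation": timedelta(weeks=1),    # "weeks"
    "cholera": timedelta(days=1),        # "days"
    "earthquake": timedelta(hours=1),    # "hours"
}


def is_real_time(phenomenon, data_age):
    """True if data of this age is still fresh enough to act on
    for the given phenomenon."""
    return data_age <= RESPONSE_WINDOW[phenomenon]


three_days = timedelta(days=3)
print(is_real_time("malnutrition", three_days))
print(is_real_time("earthquake", three_days))
```

The same three-day-old dataset counts as real-time for malnutrition but not for an earthquake.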
Global Pulse
• We need a global, real-time public/private data commons
• Risk of reidentification
• Big data is a human rights issue
• “EU vs Mark Zuckerberg
– Privacy is dead and profit is king
– All reuse is forbidden without my explicit permission”
• “Global Pulse sees big data as a raw public good, but one that needs to be safely transformed into a powerful asset for understanding what is happening in the world”
Create sandbox
• Create safe space to experiment with different types of data for specific purposes
• While in parallel develop ways to protect privacy
Agile Global Development (Global Pulse)
World Economic Forum
Understanding the Dynamics of the Data Ecosystem
Big Data is a Human Rights Issue (Global Pulse)
• Never analyze personally identifiable information
• Never analyze confidential data
• Never seek to re-identify individuals
• The way people buy credits predicts household income
• These patterns can change, and a change can itself be a signal
• When poor people are calling you. Reciprocity index increases with income.
Men and Women use their phones differently
MapD: A Massively Parallel Database
• Vendors can sell phone carriers information that estimates a person's age to within a couple of years.
Predicting cholera outbreaks
• Flowminder
Harvard HealthMap
• Shows that if Twitter had been used to monitor cholera, the outbreak would have been known about two weeks earlier.
• Rumi Chunara, Jason Andrews and John Brownstein
• http://www.ajtmh.org/content/86/1/39.abstract
• Flu trends
• Dengue trends
• How to get institutions to use real-time data?
• The data does not replace their statistics but complements them
• Most of the universe is invisible
• The most valuable real-time data is being gathered by companies as a business asset
• Begun engaging around the concept of data philanthropy: aggregated data shared into a public commons, so your data is shared with competitors
• Why would they? They might help their own customers by helping public policy intervene
Pulse labs
• Form partnerships between public sector and business and academy
• Idea of a dashboard fed by streams of data being run through predictive models
• Socio-economic weather stations
• Marshall McLuhan famously declared that “the medium is the message”.
• I want to tell you about a new kind of media called TweetMap and enlist your help in figuring out what its message is for us.
Niche between desktop and web
(Figure: WorldMap shown alongside web apps and desktop apps; axis label: Ease of Use)
Lowering Barriers to Collaboration
• Open registration
• Open access to data
• Open service protocols (WMS, WFS, ESRI REST)
• Open data formats (Shapefile, GeoTIFF, GeoRSS, KML, JSON, CSV)
• Open source code (GPL on GitHub)
• Runs on open source operating systems (Linux) and could run on Windows
Working Across Disciplines
http://www.architectmagazine.com/technology/meet-the-geodesigner.aspx
Allows researchers to…
• Organize one’s own (large) mapping datasets online
• Discover other people’s (public) data
• Mash up / overlay one’s data with the data of others
• Create maps with complex symbology
• Collaborate by letting several people edit the same map
• Publish data to the world or to just a few collaborators
Picasa Feeds
Place Name Search (Gazetteers)
Mobile: Feature Creation
Built by the community
TweetMap
Big Data Visual
The best data is distributed across many organizations
Universities, schools
Scholars outside the academy
Memory institutions
Government agencies
Businesses