city data fusion: a big data infrastructure to sense the pulse of the city in real-time

18
City Data Fusion http://citydatafusion.org A Big Data Infrastructure to sense the pulse of the city in real-time Emanuele Della Valle [email protected] http://emanueledellavalle.org IBM and Politecnico di Milano bridging industrial and academic excellence 2.10.2013

Upload: emanuele-della-valle

Post on 27-Jan-2015

108 views

Category:

Technology


1 download

DESCRIPTION

Streams of information flow through our cities thanks to their progressive instrumentation with diverse sensors, a wide adoption of smart phones and social networks, and a growing open release of datasets. This research investigates the possibility to feel the pulse of our cities in real-time by fusing and making sense of all those information flows. The expected result is a Big Data infrastructure that exploits: semantic technologies, streaming databases, visual analytics, and crowd-sourcing techniques whose incentives are designed for urban environment and life styles. Early deployments for city scale events offer insights on the kind of services such infrastructure will enable.

TRANSCRIPT

Page 1: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

City Data Fusionhttp://citydatafusion.org

A Big Data Infrastructure to sense the pulse of the city in real-timeEmanuele Della [email protected]://emanueledellavalle.org

IBM and Politecnico di Milano bridging industrial and academic excellence

2.10.2013

Page 2: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Agenda

Context

Goal

An example

Proposed solution

Research hypothesis

Early deployments

On-going collaborations

3

Page 3: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

The digital reflection of reality is sharpening

Streams of information flows through our cities thanks to:

4

the pervasive deployment of sensors in our cities

the wide adoption of smart phones (equipped with sensors)

the usage of (location-based) social networks

the availability of datasets about urban environment

Page 4: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Goal

Advance our ability to feel the pulse of our citiesin order to deliver innovative services

5

fusing all those data sources

making sense of the fused information

Page 5: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

E.g., is Milano Design Week perceivable?

6

Step 1: associate mobile traffic to urban areas

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 6: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

E.g., is Milano Design Week perceivable?

7

Step 2: subtract what is systematic

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 7: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

E.g., is Milano Design Week perceivable?

8

Step 3: Identify interesting areas

Brera

Navigli

PortaRomana

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 8: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

E.g., is Milano Design Week perceivable?

9

Step 4: retrieve the top hashtags

Brera

Navigli

PortaRomana

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 9: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

E.g., is Milano Design Week perceivable?

10

Step 5: exclude what is systematic

Brera

Navigli

PortaRomana

Real data recorded on 13 April 2013 between 13:00 and 00:00

Page 10: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Ingredients of the proposed Big Data infrastructure

semantic technologies - Address "variety" using Ontology Based Data Access- Named Entity recognition and linkage - Knowledge discovery (e.g., detecting systematicy)

streaming algorithms- Address "velocity" of data stream- Address "volume" by being able to process data that

do not fit in main memory

crowd-sourcing techniques- Address "veracity" by cleansing and

enriching data

Visual analytics- Allow no-expert access to data- Tell stories out of data

11

Page 11: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

?

? ?

? ?

Limitation of current systems

Insufficient methods for making sense in real-time of heterogeneous data and social streams w.r.t. the vast collections of (open) data

Lack of crowd-sourcing techniques whose incentives leverage needs of people in the urban environment

Lack of visualization techniques tailored to non-experts

12

Page 12: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Research hypothesis

1. To scale order matters

2. Crowdsourcing needs the urban-centric incentives

3. Visualization must tell stories

13

Page 13: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Research hypothesis: order matters!

Observation: order reflects recency, relevance, trustability …

harnessing orders is key to make sense in real-time of heterogeneous, massive and volatile data

14

Indexes

Recency

Relevance, Trustability, etc.

Combinations

Typ

es o

f o

rder

s

No Yes

Traditional solutions

DSMS/CEP

Top-k Q/A

Continuous top-k Q/A

Scalable reasoning

Stream reasoning

Order-aware reasoning

Top-k Reasoning

Types of reasoning

Page 14: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Research hypothesis: urban-centric incentives!

incentives designed for urban environmentand life styles are key

15

Masl

ow

's h

iera

rchy o

f needs

[source: http://www.behance.net/gallery/Maslows-Hierarchy-of-Needs-Infographic/4376921 ]

Page 15: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Research hypothesis: visualizations must tell stories

16

[Source: http://www.densitydesign.org/2013/04/whatever-the-weather/ ]

Page 16: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

Prototypes already deployed

17

http://streamreasoning.org/demos/bottari

http://streamreasoning.org/demos/london2012

http://twindex.fuorisalone.it/

Page 17: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

http://citydatafusion.org - Emanuele Della Valle

On-going collaborations

Who Semantic techs

Streaming algorithm

s

Crowd- sourcing

Visual analytics

CEFRIEL

Density Design Lab – PoliMi

DISI – University of Trento

KDD Lab – ISTI, CNR, Pisa

ML Group – SIEMENS

Ontology Eng. Group – UPM

Saltlux – Korea

SKIL Lab - Telecom Italia

Studio Labo

Web IS - TU Delft

18

Page 18: City Data Fusion: A Big Data Infrastructure to sense the pulse of the city in real-time

City Data Fusionhttp://citydatafusion.org

Thank you! Emanuele Della [email protected]://emanueledellavalle.org

IBM and Politecnico di Milano bridging industrial and academic excellence

2.10.2013