
VISUALISING REAL TIME TRAFFIC DATA USING ELASTICSEARCH AND C3JS

@jettroCoenradie Trifork Amsterdam

Case Study ANWB (Royal Dutch Automobile Association)

FACT SHEET

Jettro Coenradie, software engineer @ Trifork, specialised in search

Twitter @jettroCoenradie @gridshore

GitHub https://github.com/jettro

LinkedIn https://www.linkedin.com/in/jettro

Blogs http://www.gridshore.nl http://blog.trifork.com/author/jettro/

GOAL

Ideas for combining (open) data

Evaluate options and performance

WHAT IS ANWB?

• Dutch Automobile Driver Assistance

• Founded in 1883 as Algemene Nederlandse Wieler Bond (General Dutch Bicycle Association)

• Sister organisation of:

FDM (Denmark)

ADAC (Germany)

AA (England)

WHAT IS ELASTICSEARCH

• Distributed / Scalable search

• Structured and full-text

• Data analytics

• Log analysis

(OPEN) DATA

Real time traffic data

Weather data

Automobile Assistance data

GOAL FOR THE PROJECT

Number of cars on the roads

Traffic intensity on the roads

Wrong data

FLOW OF THE PROJECT

• Get to know the data: Logstash / Kibana

• Start improving data quality

• Present data using our own charts

TECHNICAL OVERVIEW

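Data view: Tomcat - Spring MVC - c3js

Data integration: Spring Integration (xml / csv)

Data store: elasticsearch

[Diagram: elasticsearch holds indices (Index A, Index B, Index C); an index is split into shards (Shard 1, Shard 2) with replicas (Shard R 1, Shard R 2); each shard is a Lucene index]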

Strings

Numbers

Dates

Geo points

TIME BASED INDICES

• One index per day: NDW-2014-09-15, NDW-2014-09-16, NDW-2014-09-17

• A mapping-template for each new index

• An alias NDW over the daily indices
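The daily-index-plus-alias setup can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and the index prefix is lowercased here because Elasticsearch only accepts lowercase index names. The resulting dictionary has the shape of a request body for the `_aliases` endpoint.

```python
from datetime import date, timedelta

ALIAS = "ndw"  # assumed alias name (lowercase, as Elasticsearch requires)


def daily_index_name(day: date, prefix: str = "ndw") -> str:
    """Build the name of the index that holds one day of measurements."""
    return f"{prefix}-{day.isoformat()}"


def alias_actions(days, alias: str = ALIAS) -> dict:
    """Build an _aliases request body that adds every daily index to one alias."""
    return {
        "actions": [
            {"add": {"index": daily_index_name(d), "alias": alias}} for d in days
        ]
    }


# Three consecutive days, matching the dates on the slide.
days = [date(2014, 9, 15) + timedelta(days=i) for i in range(3)]
body = alias_actions(days)
```

Searching the alias then covers all days at once, while old days can be dropped by deleting a whole index.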

SCHEMA-LESS

• There is always a schema

• The schema can be dynamic

• Often you want to be specific

Dates / Numbers / Geo locations

Dynamic schema
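Being specific about dates, numbers and geo locations could look like the mapping below. It is a sketch only: the field names (`timestamp`, `speed`, `vehicle_count`, `location`) are assumptions made for illustration and are not taken from the NDW data; the field types themselves (`date`, `float`, `integer`, `geo_point`) are standard Elasticsearch types.

```python
# Hypothetical explicit mapping for one traffic measurement.
# Without it, dynamic mapping would guess the types from the first document,
# and a [lon, lat] array would be indexed as plain numbers, not a geo point.
measurement_mapping = {
    "properties": {
        "timestamp": {"type": "date"},
        "speed": {"type": "float"},
        "vehicle_count": {"type": "integer"},
        "location": {"type": "geo_point"},
    }
}


def field_type(mapping: dict, field: str) -> str:
    """Look up the declared type of a field in a mapping."""
    return mapping["properties"][field]["type"]
```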

SEARCH

Full text search versus structured search

STRUCTURED SEARCH

• Can be cached most of the time

• No scoring

• Fast

Filters

FILTERS WE USED

• Range filters

• Term filters

• Composite (bool) filters

Range Filter, Date Range Filter

Term Filter
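The three filter types can be sketched as plain request bodies. The field names (`speed`, `timestamp`, `road`) and helper names are hypothetical. Note the version assumption: at the time of this talk filters were wrapped in a `filtered` query, which later Elasticsearch versions removed; the wrapper below uses the newer `bool` query with a `filter` clause instead.

```python
def range_filter(field, gte=None, lt=None):
    """Range filter; works for numbers and for dates (e.g. '2014-09-15')."""
    bounds = {k: v for k, v in (("gte", gte), ("lt", lt)) if v is not None}
    return {"range": {field: bounds}}


def term_filter(field, value):
    """Exact match on a single, non-analysed value."""
    return {"term": {field: value}}


def bool_filter(*must):
    """Composite filter: every given clause has to match."""
    return {"bool": {"must": list(must)}}


def filtered_search(filter_clause):
    """Wrap a filter so that no scoring takes place (filter context)."""
    return {"query": {"bool": {"filter": [filter_clause]}}}


request = filtered_search(
    bool_filter(
        term_filter("road", "A2"),
        range_filter("speed", gte=0, lt=50),
        range_filter("timestamp", gte="2014-09-15", lt="2014-09-16"),
    )
)
```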

AGGREGATIONS

• Create buckets of data

• Compute Metrics

Two types of aggregations

[Diagram: a set of documents is split by a condition into buckets, each bucket being a smaller set of documents. Example conditions: Term (red, blue, green, yellow) and Range (0-10, 10-20, 20-30, 30-40)]
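The idea of buckets and metrics can be illustrated without Elasticsearch at all. The sketch below groups a handful of made-up documents into range buckets of width 10 and then computes an average per bucket; it mimics the concept only and says nothing about how Elasticsearch implements it.

```python
from collections import defaultdict


def range_buckets(docs, field, width):
    """Bucket aggregation: group documents by the range their value falls in."""
    buckets = defaultdict(list)
    for doc in docs:
        lower = int(doc[field] // width) * width
        buckets[(lower, lower + width)].append(doc)
    return dict(buckets)


def avg(docs, field):
    """Metric aggregation: one number computed over a set of documents."""
    return sum(d[field] for d in docs) / len(docs)


# Made-up sample documents, for illustration only.
docs = [
    {"road": "A1", "speed": 8},
    {"road": "A2", "speed": 12},
    {"road": "A1", "speed": 18},
    {"road": "A4", "speed": 35},
]

buckets = range_buckets(docs, "speed", 10)
averages = {bounds: avg(members, "speed") for bounds, members in buckets.items()}
```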

AGGREGATIONS WE USED

• Date histogram aggregations

• Terms aggregations

• AVG aggregations

Date Histogram Aggregation + AVG metric Aggregation

Terms Aggregation
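A date histogram with a nested average, and the step from its response to a chart, might look like this. Everything named here is an assumption: the fields `timestamp` and `speed`, the aggregation names, and the sample response, which is hand-written to show the response shape and is not real traffic data. The `fixed_interval` parameter belongs to newer Elasticsearch versions; older versions used `interval`. The output follows the `columns` layout that c3js accepts for its `data` option.

```python
def avg_speed_per_interval(interval="1h"):
    """Date histogram buckets, each with an average speed metric."""
    return {
        "size": 0,  # only the aggregation result is needed, no hits
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": "timestamp", "fixed_interval": interval},
                "aggs": {"avg_speed": {"avg": {"field": "speed"}}},
            }
        },
    }


def to_c3_columns(response):
    """Turn the histogram buckets into c3js column arrays (x axis + one series)."""
    buckets = response["aggregations"]["per_interval"]["buckets"]
    x = ["x"] + [b["key_as_string"] for b in buckets]
    y = ["avg speed"] + [b["avg_speed"]["value"] for b in buckets]
    return [x, y]


# Hand-written response in the shape Elasticsearch returns (values are made up).
sample_response = {
    "aggregations": {
        "per_interval": {
            "buckets": [
                {"key_as_string": "2014-09-15T08:00:00.000Z", "doc_count": 3,
                 "avg_speed": {"value": 62.5}},
                {"key_as_string": "2014-09-15T09:00:00.000Z", "doc_count": 5,
                 "avg_speed": {"value": 88.0}},
            ]
        }
    }
}

columns = to_c3_columns(sample_response)
```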

GEO LOCATIONS

Two types of locations

• Using latitude and longitude

• Using geohash (creating a grid)

GEO LAT/LON

• Used for distance based queries

• Used for distance based aggregations
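To make "distance based" concrete, the sketch below computes a great-circle distance with the haversine formula and builds a `geo_distance` filter body. The haversine function is only an illustration of the kind of calculation involved; how Elasticsearch computes distances internally may differ. The field name passed to the filter is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_distance_filter(field, lat, lon, distance):
    """Filter body: documents whose geo point lies within `distance` of a point."""
    return {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}


nearby = geo_distance_filter("location", 52.37, 4.89, "10km")
```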

GEO HASH

• Uses a hash to represent a square

• More characters means more precision

GEOHASH

http://www.bigdatamodeling.org/2013/01/intuitive-geohash.html
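The relation between hash length and precision shows up directly in a small encoder. This is a sketch of the commonly described geohash algorithm (alternately halving the longitude and latitude ranges and writing every five bits as one base-32 character); it is not taken from the talk or from Elasticsearch's code.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=6):
    """Encode a lat/lon point as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            value = value << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


coarse = geohash_encode(57.64911, 10.40744, precision=3)
fine = geohash_encode(57.64911, 10.40744, precision=8)
```

A longer hash always starts with the shorter one, so every extra character narrows the same cell further; that is what makes geohash prefixes usable as a grid.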

PERCOLATOR

“The opposite of executing a query and finding results”

“Match an (existing) document against stored queries.”

Geo polygon filter

Stored queries: Zuid-West (South-West), Noord-West (North-West), Noord-Oost (North-East), Zuid (South)

Document: { location: [ 3.5123, 46.3412 ] }

Match: Zuid-West
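The percolator idea, matching one incoming document against a set of stored region queries, can be imitated in a few lines. This is a conceptual sketch only: it does not use the Elasticsearch percolator API, the region polygons are hypothetical rectangles invented for the example, and the test point is made up. Locations are `[lon, lat]` pairs, the order Elasticsearch uses for geo point arrays.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how often a ray to the east crosses the polygon edges."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Stored "queries": one hypothetical polygon per region, as (lon, lat) corners.
stored_queries = {
    "Zuid-West": [(3.0, 51.0), (5.0, 51.0), (5.0, 52.0), (3.0, 52.0)],
    "Noord-West": [(3.0, 52.0), (5.0, 52.0), (5.0, 53.5), (3.0, 53.5)],
}


def percolate(document):
    """Return the names of all stored queries that the document matches."""
    lon, lat = document["location"]
    return [name for name, polygon in stored_queries.items()
            if point_in_polygon(lon, lat, polygon)]


matches = percolate({"location": [4.3, 51.5]})
```

The direction is reversed compared to a normal search: the queries are stored up front and each new document is tested against them.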

QUESTIONS

@jettroCoenradie
