Page 1
VISUALISING REAL TIME TRAFFIC DATA USING ELASTICSEARCH AND C3JS
@jettroCoenradie Trifork Amsterdam
Case Study ANWB (Royal Dutch Automobile Association)
Page 2
FACT SHEET
Jettro Coenradie Software engineer @ Trifork, specialised in search
Twitter @jettroCoenradie @gridshore
GitHub https://github.com/jettro
LinkedIn https://www.linkedin.com/in/jettro
Blogs http://www.gridshore.nl http://blog.trifork.com/author/jettro/
Page 3
GOAL
Ideas for combining (open) data
Evaluate options and performance
Page 4
WHAT IS ANWB?
• Dutch Automobile Driver Assistance
• Sister organisation of:
FDM (Denmark)
ADAC (Germany)
AA (England)
Page 5
Founded in 1883 as
Algemene Nederlandse Wieler Bond
General Dutch Bicycle Association
Page 6
WHAT IS ELASTICSEARCH
• Distributed / Scalable search
• Structured and full-text
• Data analytics
• Log analysis
Page 7
(OPEN) DATA
Real time traffic data
Weather data
Automobile Assistance data
Page 8
GOAL FOR THE PROJECT
Number of cars on the roads
Traffic intensity on the roads
Wrong data
Page 10
FLOW OF THE PROJECT
• Get to know the data: Logstash / Kibana
• Start improving data quality
• Present data using our own charts
Page 12
TECHNICAL OVERVIEW
Data view: Tomcat - Spring MVC - c3js
Data integration: Spring Integration (xml / csv)
Data store: elasticsearch
Page 15
Index A Index B Index C
Shard 1 Shard 2 Shard R 1 Shard R 2
Lucene Lucene Lucene Lucene
Page 16
TIME BASED INDICES
NDW
Strings
Numbers
Dates
Geo points
Page 17
TIME BASED INDICES
NDW-2014-09-15
NDW-2014-09-16
NDW-2014-09-17
mapping-template
NDW
Alias
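A minimal sketch of the daily-index pattern on this slide, in Python. The lowercase prefix `ndw` is used because Elasticsearch index names must be lowercase. The helper names are hypothetical, and the template body assumes the 1.x-era key names (`template` for the pattern, plus `aliases`); later releases spell the pattern key differently.

```python
from datetime import date

INDEX_PREFIX = "ndw"  # lowercase: Elasticsearch rejects uppercase index names


def index_name_for(day: date) -> str:
    """Return the daily index name for a given day, e.g. ndw-2014-09-15."""
    return f"{INDEX_PREFIX}-{day.isoformat()}"


def index_template() -> dict:
    """Template body applied to each new daily index (assumed 1.x-style keys)."""
    return {
        "template": f"{INDEX_PREFIX}-*",  # matches ndw-2014-09-15, ndw-2014-09-16, ...
        "aliases": {INDEX_PREFIX: {}},    # each daily index joins the "ndw" alias
    }
```

Writes go to the index for the current day, while searches can target the alias and so span all days at once.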
Page 18
SCHEMA-LESS
• There is always a schema
• The schema can be dynamic
• Often you want to be specific
Dates / Numbers / Geo locations
Dynamic schema
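To illustrate "being specific", here is a sketch of an explicit field mapping as a Python dict. The field names (`timestamp`, `speed`, `location`) are assumptions made for this example, not the project's actual schema. A geo point in particular is not detected by dynamic mapping, so it has to be declared up front.

```python
def measurement_properties() -> dict:
    """Explicit field types for a traffic measurement (hypothetical field names)."""
    return {
        "properties": {
            "timestamp": {"type": "date"},       # enables date ranges and histograms
            "speed": {"type": "integer"},        # enables numeric ranges and averages
            "location": {"type": "geo_point"},   # enables distance and geohash features
        }
    }
```

Fields that are not listed can still be picked up by the dynamic schema.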
Page 19
SEARCH
Full text search
versus
Structured search
Page 20
STRUCTURED SEARCH
• Can be cached most of the time
• No scoring
• Fast
Filters
Page 21
FILTERS WE USED
• Range filters
• Term filters
• Composite (bool) filters
Page 22
Range Filter
Date Range Filter
Term Filter
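The filter examples from this slide did not survive as text, so the following is only a sketch of how a term filter and two range filters could be combined in a bool filter. It assumes the 1.x-era `filtered` query (later versions use a `bool` query with a `filter` clause instead), and the field names `road`, `timestamp` and `speed` are hypothetical.

```python
def filtered_search(road: str, start: str, end: str, min_speed: int) -> dict:
    """Build a search body that only filters (no scoring), 1.x-style."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": [
                            # term filter: exact match on a single value
                            {"term": {"road": road}},
                            # date range filter: start inclusive, end exclusive
                            {"range": {"timestamp": {"gte": start, "lt": end}}},
                            # numeric range filter
                            {"range": {"speed": {"gte": min_speed}}},
                        ]
                    }
                },
            }
        }
    }
```

For example, `filtered_search("A2", "2014-09-15", "2014-09-16", 50)` yields a body that can be serialised with `json.dumps` and sent to a search endpoint.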
Page 23
AGGREGATIONS
• Create buckets of data
• Compute Metrics
Two types of aggregations
Page 24
Doc Doc Doc
Set of documents
Condition
Bucket Bucket Bucket Bucket
Term: red, blue, green, yellow
Range: 0-10, 10-20, 20-30, 30-40
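The bucketing idea can be sketched in plain Python, independent of Elasticsearch. This is an illustration of the concept only: the function names and the sample documents in the usage note are made up for this example.

```python
from collections import defaultdict


def terms_buckets(docs, field):
    """One bucket per distinct value of the field (like a terms aggregation)."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def range_buckets(docs, field, edges):
    """One bucket per [low, high) interval between consecutive edges."""
    pairs = list(zip(edges, edges[1:]))
    buckets = {f"{low}-{high}": [] for low, high in pairs}
    for doc in docs:
        for low, high in pairs:
            if low <= doc[field] < high:
                buckets[f"{low}-{high}"].append(doc)
                break
    return buckets


def avg_metric(docs, field):
    """A metric computed over one bucket: the average of a numeric field."""
    return sum(doc[field] for doc in docs) / len(docs)
```

With documents such as `{"color": "red", "value": 5}`, `terms_buckets(docs, "color")` groups by colour and `range_buckets(docs, "value", [0, 10, 20, 30, 40])` groups by value interval; a metric such as `avg_metric` is then computed per bucket.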
Page 25
DD
Set of documents
Page 26
AGGREGATIONS WE USED
• Date histogram aggregations
• Terms aggregations
• AVG aggregations
Page 27
Date Histogram Aggregation + AVG metric Aggregation
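The request shown on this slide is not preserved in the text, so this is a sketch of what a date histogram with a nested AVG metric could look like. It assumes the 1.x-era `interval` key (later versions split it into calendar and fixed intervals) and the hypothetical fields `timestamp` and `speed`.

```python
def avg_speed_per_interval(interval: str = "hour") -> dict:
    """Bucket documents per time interval and average the speed in each bucket."""
    return {
        "size": 0,  # only the aggregation result is wanted, no hits
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": "timestamp", "interval": interval},
                # metric aggregation nested inside each time bucket
                "aggs": {"avg_speed": {"avg": {"field": "speed"}}},
            }
        },
    }
```

Each bucket in the response then carries a timestamp key and one average, which maps naturally onto the x and y values of a line chart.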
Page 28
Terms Aggregation
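As with the previous slide, the original request is not in the text. A terms aggregation could be sketched as follows, assuming a hypothetical `road` field that holds one exact value per document.

```python
def top_roads(size: int = 10) -> dict:
    """Count documents per road and return the most frequent ones."""
    return {
        "size": 0,  # only the aggregation result is wanted, no hits
        "aggs": {
            "roads": {"terms": {"field": "road", "size": size}},
        },
    }
```

The response contains one bucket per road with its document count, largest first by default.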
Page 29
GEO LOCATIONS
Two types of locations
• Using latitude and longitude
• Using geohash (creating a grid)
Page 30
GEO LAT/LON
• Used for distance based queries
• Used for distance based aggregations
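To make "distance based" concrete, here is a plain Python sketch of a great-circle distance check on latitude/longitude pairs. It uses the haversine formula with a mean Earth radius of 6371 km; this illustrates the idea and is not a claim about how Elasticsearch computes distances internally. The function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_distance(points, center, max_km):
    """Keep the (lat, lon) points that lie within max_km of the center."""
    return [p for p in points if haversine_km(p[0], p[1], center[0], center[1]) <= max_km]
```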
Page 31
GEO HASH
• Uses a hash to represent a square
• More characters means more precision
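The two points above can be seen in a small Python sketch of the standard geohash encoding: longitude and latitude bits are interleaved by repeatedly halving the range, and every five bits become one base-32 character. The function name is chosen for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a lat/lon pair as a geohash of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

A shorter hash is always a prefix of a longer hash for the same point, so each extra character narrows the cell down within the previous one.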
Page 32
GEOHASH
http://www.bigdatamodeling.org/2013/01/intuitive-geohash.html
Page 33
PERCOLATOR
“The opposite of executing a query
and finding results”
Page 34
PERCOLATOR
“Match an (existing) document
against stored queries.”
Page 35
PERCOLATOR
Geo polygon filter
Zuid-West Noord-West Noord-Oost Zuid
{ location: [ 3.5123, 46.3412 ] }
Zuid-West
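The percolator idea on this slide can be sketched in plain Python: the stored queries are named polygons, and an incoming document is matched against all of them. This is a conceptual illustration, not the Elasticsearch percolator API. The polygon coordinates are invented rectangles chosen so that the example document lands in Zuid-West, and the location array is read as [longitude, latitude].

```python
# Stored "queries": region name -> polygon as (lon, lat) corners (invented values)
STORED_QUERIES = {
    "Zuid-West": [(3.0, 45.0), (5.0, 45.0), (5.0, 47.0), (3.0, 47.0)],
    "Noord-West": [(3.0, 47.0), (5.0, 47.0), (5.0, 49.0), (3.0, 49.0)],
    "Noord-Oost": [(5.0, 47.0), (7.0, 47.0), (7.0, 49.0), (5.0, 49.0)],
    "Zuid": [(5.0, 45.0), (7.0, 45.0), (7.0, 47.0), (5.0, 47.0)],
}


def point_in_polygon(lon, lat, polygon):
    """Ray casting: count polygon edges crossed by a ray going east from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def percolate(doc, stored_queries=STORED_QUERIES):
    """Return the names of all stored polygon queries that match the document."""
    lon, lat = doc["location"]
    return [name for name, polygon in stored_queries.items()
            if point_in_polygon(lon, lat, polygon)]
```

Calling `percolate({"location": [3.5123, 46.3412]})` with these invented regions returns `["Zuid-West"]`, mirroring the match shown on the slide.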
Page 37
QUESTIONS
@jettroCoenradie