VISUALISING REAL TIME TRAFFIC DATA USING ELASTICSEARCH AND C3JS
@jettroCoenradie Trifork Amsterdam
Case Study ANWB (Royal Dutch Automobile Association)
FACT SHEET
Jettro Coenradie, software engineer @ Trifork, specialised in search
Twitter @jettroCoenradie @gridshore
GitHub https://github.com/jettro
LinkedIn https://www.linkedin.com/in/jettro
Blogs http://www.gridshore.nl http://blog.trifork.com/author/jettro/
GOAL
Ideas for combining (open) data
Evaluate options and performance
WHAT IS ANWB?
• Dutch Automobile Driver Assistance
• Sister organisation of:
FDM (Denmark)
ADAC (Germany)
AA (England)
• Founded in 1883 as Algemene Nederlandse Wieler Bond (General Dutch Bicycle Association)
WHAT IS ELASTICSEARCH
• Distributed / Scalable search
• Structured and full-text
• Data analytics
• Log analysis
(OPEN) DATA
Real time traffic data
Weather data
Automobile Assistance data
GOAL FOR THE PROJECT
Number of cars on the roads
Traffic intensity on the roads
Wrong data
FLOW OF THE PROJECT
• Get to know the data: Logstash / Kibana
• Start improving data quality
• Present data using our own charts
TECHNICAL OVERVIEW
Data view: Tomcat - Spring mvc - c3js
Data integration: Spring Integration - xml / csv
Data store: elasticsearch
Index A Index B Index C
Shard 1 Shard 2 Shard R 1 Shard R 2
Lucene Lucene Lucene Lucene
Strings
Numbers
Dates
Geo points
TIME BASED INDICES
NDW
NDW-2014-09-15
NDW-2014-09-16
NDW-2014-09-17
mapping-template
NDW
Alias
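The daily indices and the alias above can be sketched as follows. This is a minimal sketch, not the project's actual code: the helper names are hypothetical, and the names are lowercased because Elasticsearch requires lowercase index names. The `actions` body follows the shape accepted by the `_aliases` endpoint.

```python
from datetime import date, timedelta
from typing import Dict, List


def daily_index_name(prefix: str, day: date) -> str:
    """Build a per-day index name such as 'ndw-2014-09-15'."""
    return f"{prefix.lower()}-{day.isoformat()}"


def alias_actions(alias: str, index_names: List[str]) -> Dict:
    """Build a request body that adds every daily index to one alias."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in index_names
        ]
    }


start = date(2014, 9, 15)
indices = [daily_index_name("NDW", start + timedelta(days=i)) for i in range(3)]
body = alias_actions("ndw", indices)
```

Searching the alias then covers all daily indices, while old days can be dropped by deleting a whole index.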
SCHEMA-LESS
• There is always a schema
• The schema can be dynamic
• Often you want to be specific
Dates / Numbers / Geo locations
Dynamic schema
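Being specific about dates, numbers and geo locations could look like the mapping below. This is a sketch under assumptions: the field names (`timestamp`, `avgSpeed`, `numCars`, `location`) are hypothetical, and the body uses the typeless mapping layout of recent Elasticsearch versions, whereas versions from 2014 nested the properties under a type name.

```python
import json
from typing import Dict


def traffic_mapping() -> Dict:
    """Explicit mapping for the fields where dynamic guessing is risky."""
    return {
        "mappings": {
            # Unknown fields are still accepted and mapped dynamically.
            "dynamic": True,
            "properties": {
                "timestamp": {"type": "date"},
                "avgSpeed": {"type": "float"},
                "numCars": {"type": "integer"},
                "location": {"type": "geo_point"},
            },
        }
    }


print(json.dumps(traffic_mapping(), indent=2))
```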
SEARCH
Full text search
Versus
Structured search
STRUCTURED SEARCH
• Can be cached most of the time
• No scoring
• Fast
Filters
FILTERS WE USED
• Range filters
• Term filters
• Composite (bool) filters
Range Filter
Date Range Filter
Term Filter
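The three filter types can be sketched as plain request bodies. The field names and values are hypothetical, and the composite uses the `bool` query with a `filter` clause from current Elasticsearch versions; the 2014 versions expressed the same idea with a dedicated filter syntax.

```python
from typing import Any, Dict, Optional


def range_filter(field: str, gte: Optional[Any] = None,
                 lt: Optional[Any] = None) -> Dict:
    """Range filter; works for numbers and for dates alike."""
    bounds = {k: v for k, v in (("gte", gte), ("lt", lt)) if v is not None}
    return {"range": {field: bounds}}


def term_filter(field: str, value: Any) -> Dict:
    """Exact match on a single value, no analysis and no scoring."""
    return {"term": {field: value}}


def bool_filter(*filters: Dict) -> Dict:
    """Composite filter: every clause must match."""
    return {"query": {"bool": {"filter": list(filters)}}}


# Hypothetical fields: speed below 50 on one road during one day.
body = bool_filter(
    range_filter("avgSpeed", gte=0, lt=50),
    range_filter("timestamp", gte="2014-09-15", lt="2014-09-16"),
    term_filter("road", "A2"),
)
```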
AGGREGATIONS
• Create buckets of data
• Compute Metrics
Two types of aggregations
Doc Doc Doc
Set of documents
Condition
Bucket Bucket Bucket Bucket
Term: red, blue, green, yellow
Range: 0-10, 10-20, 20-30, 30-40
D D
Set of documents
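The bucket and metric idea can be illustrated in plain Python, without Elasticsearch. This only mimics the concept on an in-memory list; the documents and field names are made up for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional

docs = [
    {"colour": "red", "speed": 5},
    {"colour": "red", "speed": 15},
    {"colour": "blue", "speed": 12},
    {"colour": "green", "speed": 25},
    {"colour": "yellow", "speed": 38},
]


def terms_buckets(items: List[Dict], field: str) -> Dict[str, List[Dict]]:
    """One bucket per distinct value of the field."""
    buckets = defaultdict(list)
    for doc in items:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def range_buckets(items: List[Dict], field: str, step: int,
                  upper: int) -> Dict[str, List[Dict]]:
    """Fixed-width buckets: 0-10, 10-20, ... up to the upper bound."""
    buckets = {f"{lo}-{lo + step}": [] for lo in range(0, upper, step)}
    for doc in items:
        lo = (doc[field] // step) * step
        key = f"{lo}-{lo + step}"
        if key in buckets:
            buckets[key].append(doc)
    return buckets


def avg_metric(bucket: List[Dict], field: str) -> Optional[float]:
    """Metric computed over the documents of one bucket."""
    return mean(d[field] for d in bucket) if bucket else None
```

A bucket is itself a set of documents, so a metric or another bucketing step can be applied to it again.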
AGGREGATIONS WE USED
• Date histogram aggregations
• Terms aggregations
• AVG aggregations
Date Histogram Aggregation + AVG metric Aggregation
Terms Aggregation
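As request bodies, the two aggregations could look roughly like this. The field names are hypothetical. The `fixed_interval` parameter belongs to recent Elasticsearch versions; older versions, including those from 2014, used `interval` instead.

```python
from typing import Dict


def avg_per_interval(date_field: str, value_field: str,
                     interval: str = "1h") -> Dict:
    """Date histogram buckets with an avg metric inside each bucket."""
    return {
        "size": 0,  # only the aggregation result is needed, no hits
        "aggs": {
            "per_interval": {
                "date_histogram": {
                    "field": date_field,
                    "fixed_interval": interval,
                },
                "aggs": {"avg_value": {"avg": {"field": value_field}}},
            }
        },
    }


def top_terms(field: str, size: int = 10) -> Dict:
    """Terms aggregation: one bucket per most frequent value."""
    return {
        "size": 0,
        "aggs": {"top_terms": {"terms": {"field": field, "size": size}}},
    }


speed_per_hour = avg_per_interval("timestamp", "avgSpeed")
busiest_roads = top_terms("road", size=5)
```

The shape of the first result, one average per time bucket, maps directly onto a time series chart.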
GEO LOCATIONS
Two types of locations
• Using latitude and longitude
• Using geohash (creating a grid)
GEO LAT/LON
• Used for distance based queries
• Used for distance based aggregations
GEO HASH
• Uses a hash to represent a square
• More characters means more precision
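How a geohash gains precision with every character can be shown with a small encoder. This is a sketch of the standard public geohash algorithm (interleaved longitude and latitude bits, base-32 alphabet), not Elasticsearch's internal implementation.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a point; each extra character narrows the cell further."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


coarse = geohash_encode(57.64911, 10.40744, precision=4)
fine = geohash_encode(57.64911, 10.40744, precision=6)
```

The shorter hash is always a prefix of the longer one, which is what makes geohashes usable as a grid for aggregations.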
GEOHASH
http://www.bigdatamodeling.org/2013/01/intuitive-geohash.html
PERCOLATOR
“The opposite of executing a query
and finding results”
“Match an (existing) document
against stored queries.”
Geo polygon filter
Zuid-West Noord-West Noord-Oost Zuid
{ location: [ 3.5123, 46.3412 ] }
Zuid-West
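The percolator idea, matching one document against stored polygon queries, can be simulated in plain Python. This is only an illustration of the concept and does not call Elasticsearch: the rectangles are hypothetical stand-ins for the real region polygons, chosen so that the sample document falls in Zuid-West. The location array is read as [lon, lat], the order Elasticsearch uses for geo point arrays.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat)

# Hypothetical rectangles, NOT the real region boundaries.
STORED_QUERIES: Dict[str, List[Point]] = {
    "Zuid-West": [(3.0, 46.0), (5.0, 46.0), (5.0, 47.0), (3.0, 47.0)],
    "Noord-West": [(3.0, 47.0), (5.0, 47.0), (5.0, 48.0), (3.0, 48.0)],
    "Noord-Oost": [(5.0, 47.0), (7.0, 47.0), (7.0, 48.0), (5.0, 48.0)],
    "Zuid": [(5.0, 46.0), (7.0, 46.0), (7.0, 47.0), (5.0, 47.0)],
}


def point_in_polygon(lon: float, lat: float, polygon: List[Point]) -> bool:
    """Ray casting: count the polygon edges crossed by a ray to the east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def percolate(doc: Dict) -> List[str]:
    """Return the names of all stored queries that match the document."""
    lon, lat = doc["location"]
    return [name for name, polygon in STORED_QUERIES.items()
            if point_in_polygon(lon, lat, polygon)]


matches = percolate({"location": [3.5123, 46.3412]})
```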
QUESTIONS
@jettroCoenradie