Realtime visitor analysis with Couchbase and Elasticsearch

Jeroen Reijn | @jreijn | #nosql13

follow the Hippo trail

NoSQL Matters 2013

About me

Jeroen Reijn

Software engineer

Hippo

@jreijn

http://blog.jeroenreijn.com

About Hippo

OneHippo @ Goto

Visitor Analysis

Journey based Targeting

How we analyse visitors @ Hippo

Registration

Visitor - entity making HTTP requests

Collector - records data about a visitor or his behaviour

Example: location collector (GeoIPCollector)

Targeting Data - all data about a specific visitor

Example: IP address is located in Amsterdam

Matching

Characteristic - a type of fact about visitors

Example: "comes from a city", "experiences a type of weather"

Target Group - the specification of a Characteristic

Example: "comes from a European city", "comes from Amsterdam"

Persona - one or more target groups that describe a certain type of visitor

Example: "Jim, the European urban consumer", "Alice, the Pet owner"
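
The terms above can be sketched as a small data model. This is only an illustration: the class names, the `score` rule (fraction of matched target groups) and the sample values are assumptions, not Hippo's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Targeting data: everything the collectors recorded about one visitor.
TargetingData = Dict[str, str]


@dataclass(frozen=True)
class TargetGroup:
    """One specification of a characteristic, e.g. 'comes from Amsterdam'."""
    name: str
    characteristic: str             # the kind of fact, e.g. "city"
    matches: Callable[[str], bool]  # predicate on the collected value


@dataclass(frozen=True)
class Persona:
    """One or more target groups that describe a type of visitor."""
    name: str
    target_groups: List[TargetGroup]

    def score(self, data: TargetingData) -> float:
        """Fraction of this persona's target groups the visitor matches (assumed rule)."""
        hits = sum(
            1
            for group in self.target_groups
            if group.characteristic in data
            and group.matches(data[group.characteristic])
        )
        return hits / len(self.target_groups)


EUROPEAN_CITIES = {"Amsterdam", "Berlin", "Paris"}  # illustrative only

european = TargetGroup(
    "comes from a European city", "city", lambda city: city in EUROPEAN_CITIES
)
pet_owner = TargetGroup("owns a pet", "pet", lambda pet: pet != "none")

jim = Persona("Jim, the European urban consumer", [european])
alice = Persona("Alice, the Pet owner", [pet_owner])

# Targeting data as a location collector might have produced it.
visitor = {"city": "Amsterdam"}
```

With this data, `jim.score(visitor)` is 1.0 and `alice.score(visitor)` is 0.0, because nothing is known yet about the visitor's pets.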

What do we store?

• Request log

• Targeting data

• Statistics

Averages, e.g. how many visitors became which persona

Real-time analysis

How about YOU?

• Do you analyse your visitors?

• Do you do it ‘real-time’?

Architecture

RDBMS

Hippo Delivery Tier

Hippo Repository

App server

XML, JSON, (X)HTML

Delivery Tier

URL Matching

Fetch content

Compose output

Request

Response

Delivery Tier

URL Matching

Collect data

Compose output

Request

Response

Fetch content

Scoring
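
The request flow on this slide can be sketched as a chain of small functions. The order of the steps (URL matching, collect data, scoring, fetch content, compose output) is inferred from the slide labels, and every name, route and IP address below is hypothetical.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
Collector = Callable[[Request], Dict[str, str]]

ROUTES = {"/": "home", "/news": "news-overview"}  # hypothetical sitemap


def match_url(path: str) -> str:
    """URL matching: map the request path to a page."""
    return ROUTES.get(path, "not-found")


def geo_collector(request: Request) -> Dict[str, str]:
    """Stand-in for a GeoIP lookup; a real collector would query a GeoIP database."""
    known = {"192.0.2.1": "Amsterdam"}
    city = known.get(request["ip"])
    return {"city": city} if city else {}


def collect_data(request: Request, collectors: List[Collector]) -> Dict[str, str]:
    """Collect data: merge what every collector learned about the visitor."""
    data: Dict[str, str] = {}
    for collector in collectors:
        data.update(collector(request))
    return data


def score(data: Dict[str, str]) -> str:
    """Scoring: choose a content variant from the targeting data (toy rule)."""
    return "european-visitor" if data.get("city") == "Amsterdam" else "default"


def fetch_content(page: str, variant: str) -> str:
    """Fetch content: a real delivery tier would read from the repository."""
    return f"{page}:{variant}"


def compose_output(content: str) -> str:
    """Compose output: render the response body."""
    return f"<html><body>{content}</body></html>"


def handle(request: Request) -> str:
    page = match_url(request["path"])
    data = collect_data(request, [geo_collector])
    variant = score(data)
    return compose_output(fetch_content(page, variant))
```

Compared with the previous slide, the only additions are the `collect_data` and `score` steps between URL matching and fetching content.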

Scaling

RDBMS

Hippo Delivery Tier

Hippo Repository

App server

Hippo Delivery Tier

Hippo Repository

App server

Scaling out

RDBMS

Delivery Tier

Repository

App server

Delivery Tier

Repository

App server

Scaling out

Targeting Datastore

What kind of storage?

Writer

Single write

Datastore

Several reads

Typical Data Access Pattern

Analytics Data Access Pattern

Writers

Datastore

Single read

Several writes

CMS user

Targeting Data Access Pattern

Visitors

Datastore

Single read

Several writes

Several reads

CMS user

Distributed Cache

Requirements change!

NoSQL?

Suitable types

• Key-value store

• Document database

• Column oriented store

Assessment Criteria

Maturity

Data model

Consistency model

Performance

Replication

Caching model

Query model

Monitoring

Scalability

Reliability

Support

Selection Criteria

• Performance

• Scalability

• Schema flexibility

• Simplicity

Couchbase

Why Couchbase?

• Drop-in replacement for memcached

• Read/Write-through cache

• High throughput

• Easily scalable

• Schema flexibility

• Low latency

Couchbase

• Open Source

• Document-oriented

• Easily scalable

• Consistent High Performance

• Apache licensed

Performance

• Object managed cache

• Write Queue to disk
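
The two bullets above describe a write-behind design: writes are acknowledged from memory and persisted later. A minimal sketch of that idea follows, assuming a plain dict as the "disk"; it illustrates the pattern only and is not Couchbase's implementation.

```python
from collections import deque


class WriteBehindStore:
    """Toy key-value store: a managed in-memory cache plus a disk write queue."""

    def __init__(self):
        self.cache = {}             # in-memory object cache, serves reads
        self.write_queue = deque()  # mutations waiting to be persisted
        self.disk = {}              # stand-in for persistent storage

    def set(self, key, value):
        """Acknowledge the write once it is in memory and queue it for disk."""
        self.cache[key] = value
        self.write_queue.append((key, value))

    def get(self, key):
        """Serve from memory when possible, otherwise load from disk."""
        if key in self.cache:
            return self.cache[key]
        value = self.disk[key]  # raises KeyError for unknown keys
        self.cache[key] = value
        return value

    def flush(self, max_items=None):
        """Drain (part of) the write queue to disk and return the number written."""
        written = 0
        while self.write_queue and (max_items is None or written < max_items):
            key, value = self.write_queue.popleft()
            self.disk[key] = value
            written += 1
        return written
```

The trade-off is that low-latency writes come at the cost of a short window in which acknowledged data exists only in memory.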

Easily scalable

• Auto sharding

• Cross cluster replication (XDCR)

• Master - Master replication
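
Auto sharding can be pictured as hashing each key to a fixed number of partitions and keeping a partition-to-node map, so that adding a node only changes the map, not the hashing. The sketch below is a simplification: the partition count, the use of a plain CRC32 modulo and the round-robin map are assumptions for illustration, not Couchbase's exact algorithm.

```python
import zlib
from typing import List

NUM_PARTITIONS = 1024  # assumed partition count for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash a document key to a partition number."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_partition_map(nodes: List[str], num_partitions: int = NUM_PARTITIONS) -> List[str]:
    """Assign partitions to nodes round-robin (a real cluster manager balances this)."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def node_for(key: str, partition_map: List[str]) -> str:
    """Look up which node owns the partition a key hashes to."""
    return partition_map[partition_for(key, len(partition_map))]


# Hypothetical three-node cluster.
cluster = build_partition_map(["node-a", "node-b", "node-c"])
owner = node_for("visitor::42", cluster)
```

Because clients hold the partition map, they can talk to the owning node directly without a routing tier in between.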

Flexible data model

• Native JSON support

• Incremental Map Reduce

• Gives power to the developer

How we run Couchbase @ Hippo

Load Balancer

Database cluster

Hippo Delivery Tier

Couchbase cluster

• Request log data

• Targeting data

• Statistics data

Analysis capabilities

• Querying via views


• Secondary indexes via views

• Views based on Map - Reduce

• Limited ad-hoc query capabilities
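
A view is defined up front by a map function and an optional reduce function, which is why ad-hoc questions are hard to answer. The sketch below simulates such a view in plain Python to answer "how many visitors became which persona". Couchbase views themselves are written in JavaScript, and the document fields used here (`type`, `visitor`, `persona`) are assumptions.

```python
from collections import defaultdict

# Hypothetical documents as they might sit in one bucket.
docs = [
    {"type": "targeting", "visitor": "v1", "persona": "jim"},
    {"type": "targeting", "visitor": "v2", "persona": "alice"},
    {"type": "targeting", "visitor": "v3", "persona": "jim"},
    {"type": "requestlog", "visitor": "v1", "path": "/"},
]


def map_doc(doc):
    """Map step: emit (persona, 1) for every targeting document."""
    if doc.get("type") == "targeting" and "persona" in doc:
        yield doc["persona"], 1


def reduce_values(values):
    """Reduce step: count the emitted rows per key."""
    return sum(values)


def query_view(documents):
    """Run map over all documents, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(values) for key, values in sorted(grouped.items())}
```

For the sample documents, `query_view(docs)` returns `{"alice": 1, "jim": 2}`. Any question that was not anticipated in a map function needs a new view, which motivates adding a search engine for ad-hoc queries.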

Elasticsearch

• Apache Lucene

• Designed to be distributed

• Schema free

• Apache license

• RESTful API

Added value

• Unstructured search

• Structured search

• Faceted search

• Geospatial search

• Combine all of the above

• All in (near) real-time
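
Combining these search types in one request might look like the query body built below: a full-text match, a structured term filter, a geo distance filter and a per-city breakdown. The field names (`page_title`, `persona`, `location`, `city`) are hypothetical, and the syntax follows more recent Elasticsearch versions, which use aggregations where the 2013 releases used facets.

```python
import json


def build_visitor_query(text, persona, lat, lon, distance="50km"):
    """Build a search body combining full-text, structured, geo and faceted search."""
    return {
        "query": {
            "bool": {
                # Unstructured search: full-text match on the page title.
                "must": [{"match": {"page_title": text}}],
                "filter": [
                    # Structured search: exact match on the persona.
                    {"term": {"persona": persona}},
                    # Geospatial search: visitors within a radius of a point.
                    {
                        "geo_distance": {
                            "distance": distance,
                            "location": {"lat": lat, "lon": lon},
                        }
                    },
                ],
            }
        },
        # Faceted search: number of matching visitors per city.
        "aggs": {"by_city": {"terms": {"field": "city"}}},
    }


# Visitors matching persona "jim" near Amsterdam who viewed pages about "hippo".
body = build_visitor_query("hippo", "jim", 52.37, 4.89)
payload = json.dumps(body)  # JSON to send to the search endpoint of an index
```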

Couchbase Server Cluster

Elasticsearch Server Cluster

Hippo Delivery Tier

Java API

Write

Read

Couchbase Transport plugin

Replication

XDCR

Read / Query

What’s Next?

Advanced analytics

{ Demo }

Thanks!

@jreijn