couchbase@live person meetup july 22nd

CouchBase @ LivePerson Ido Shilon | 7/22/2013 | [email protected]

Upload: ido-shilon

Post on 27-Jun-2015

DESCRIPTION

Presented at www.meetup.com/ILTechTalks/events/127553252/

TRANSCRIPT

Page 1: Couchbase@live person meetup   july 22nd

CouchBase @ LivePerson

Ido Shilon | 7/22/2013 | [email protected]

Page 2:

Agenda

● Connection before content

● Who LivePerson is

● Data & Technology @ LivePerson

● What is the use case (Real time analytics)

● The Story

● Why CouchBase was chosen

● Architecture & Code

● Production

● What did we learn so far

● Q&A

Page 3:

LivePerson is…

Company

● The leading intelligent customer engagement platform

● 8,500 customers around the globe

● 8 of the top 10 Fortune 500 companies

● Doing SaaS since 1999 (when it was called ASP)

Mission

Creating Meaningful Customer Connections

Page 4:

Typical Engagements

Page 5:

Technology @ LivePerson

● Application Stack

○ JVM heavy

○ Linux on commodity servers

○ Private cloud based on OpenStack

Page 6:

Data @ LivePerson

● 1.8 billion visitor sessions monitored per month

● 20 million connections per month

● ~0.5 TB of compressed data is loaded into Hadoop each day

Kafka, Storm

Page 7:

The LivePerson use case

Visitor List

Real time Analytics for LivePerson's customers

Page 8:

What do we need it for?

● Provide visibility to LivePerson's customers on their online visitors

● Used by thousands of agents (call centers) around the world

Page 9:

The story - once upon a time

[Diagram labels: Agents Console (Java app); Visitor events/data; Web site Visitors; Monitoring Events]

● Stateful web servers (vertical scalability)

● Multi-purpose servers

● Hold all visitors' state in memory

Page 10:

And then the story continues

[Diagram labels: DC1, DC2; Web Agent (Web app); ??????; Web site Visitors; Monitoring Events; Send events; Kafka & Storm (Event bus)]

● Multiple web servers (horizontal scalability)

● SOA architecture

● All visitor states are streamed to the event bus

Page 11:

Possible solutions we considered

Page 12:

Why did we pick Couchbase

● Performance, high throughput, really fast

● Resilient solution

● Linear scale

● Schemaless !!!

● Searchable (Queries)

● Supports both K/V & document store

● Cross data center replication

● Simplicity (quick dev and roll out)

● We needed a solution !!!

Page 13:

System architecture

[Diagram components: Visitor (Browser/Mobile); Kafka; Stream Event Processing; Visitor Feed - Storm Topology; Couchbase; Visitor Feed API; Visitor Monitoring Service; Customer Representative]

(1) Visitor browsing

(2) Visitor events

(3) Analyze relevant events and persist

(4) Write event to user document

(5) Get visitors list every 3 sec

(6) Return relevant visitors

(7) Return relevant visitors
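The "get visitors list every 3 sec" step in the architecture is a periodic poll. The sketch below shows one way such a poller could be written with the JDK's `ScheduledExecutorService`; it is an illustration only, not the code used at LivePerson. The class name `VisitorListPoller`, the `fetchVisitors` supplier and the `render` consumer are hypothetical stand-ins for the call to the Visitor Feed API and for the agent UI.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical poller: fetches the visitor list at a fixed rate and hands it to a renderer.
final class VisitorListPoller {

    // Runs a single poll. Failures are caught and reported, because an exception that
    // escapes a scheduleAtFixedRate task suppresses all subsequent executions.
    static boolean pollOnce(Supplier<List<String>> fetchVisitors, Consumer<List<String>> render) {
        try {
            render.accept(fetchVisitors.get());
            return true;
        } catch (RuntimeException e) {
            System.err.println("Visitor list poll failed: " + e.getMessage());
            return false;
        }
    }

    // Starts polling immediately and then every periodSeconds; the caller shuts the scheduler down.
    static ScheduledExecutorService start(Supplier<List<String>> fetchVisitors,
                                          Consumer<List<String>> render,
                                          long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> pollOnce(fetchVisitors, render),
                0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // The supplier here returns canned data; a real one would call the feed API.
        ScheduledExecutorService scheduler = start(
                () -> Arrays.asList("visitor-1", "visitor-2"),
                visitors -> System.out.println("Visitors: " + visitors),
                3);
        Thread.sleep(7_000);
        scheduler.shutdownNow();
    }
}
```

Catching the failure inside the task keeps one bad poll from silently stopping the refresh loop.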

Page 14:

Data design & numbers

● Document = User

● Document structure:

○ Each document contains 15-20 attributes, plus 3 lists of sub-attributes

○ Each document contains the account id (multi-tenant DB)

Page 15:

Data design & numbers

● Numbers

○ Avg doc size - 10 K

○ Average key size - 10 characters

○ 5 2nd-level indexes

● Throughput (final rollout)

○ ~1 M concurrent documents/visitors

○ ~100K ops/sec (heavy on insert/update)
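These numbers allow a rough estimate of the memory needed to keep every document resident. The sketch below is back-of-the-envelope arithmetic under stated assumptions, not a sizing tool: it reads "10 K" as 10 KiB, treats the per-document metadata overhead (56 bytes here) as an assumed parameter that depends on the Couchbase version, and assumes one replica copy. The class and method names are hypothetical.

```java
// Rough RAM estimate for a fully resident data set. All inputs are assumptions to adjust.
final class WorkingSetEstimate {

    // Bytes needed to hold every active document, its key and its metadata in memory.
    static long activeBytes(long documents, long avgDocBytes, long avgKeyBytes, long metadataBytesPerDoc) {
        return documents * (avgDocBytes + avgKeyBytes + metadataBytesPerDoc);
    }

    // Total across the cluster when each document also has `replicas` extra copies.
    static long withReplicas(long activeBytes, int replicas) {
        return activeBytes * (1 + replicas);
    }

    public static void main(String[] args) {
        // ~1 M documents, 10 KiB each, 10-byte keys, assumed 56 bytes of metadata per document.
        long active = activeBytes(1_000_000L, 10 * 1024L, 10L, 56L);
        long total = withReplicas(active, 1);
        double gib = 1L << 30;
        System.out.printf("active: %.1f GiB, with one replica: %.1f GiB%n", active / gib, total / gib);
    }
}
```

Under these assumptions the active data alone is roughly 9.6 GiB before replicas, indexes and headroom are counted.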

Page 16:

Insert/Update

public Visitor getVisitor(String visitorSessionId) {
    dalMetrics.addCouchbaseReadTotalCount();
    Visitor visitor = null;
    try {
        // The document is stored as a JSON string under the visitor's session id
        String visitorDoc = (String) client.get(visitorSessionId);
        visitor = gson.fromJson(visitorDoc, Visitor.class);
    } catch (Exception e) {
        LOG.debug("Failed to retrieve or convert visitor: " + e.getMessage());
        dalMetrics.addCouchbaseReadErrorCount();
        throw e;
    }
    return visitor;
}

public void setVisitorWithFields(String rtSessionId, Visitor visitor) {
    try {
        // Write the whole document with a TTL so it expires without an explicit delete
        client.set(rtSessionId, defaultTtl, gson.toJson(visitor));
    } catch (Exception e) {
        LOG.error("Error occurred while updating visitor fields: " + e.getMessage());
    }
}

Page 17:

Views/ Design doc

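Use the view to set the keys and sorting

function (doc, meta) {
    var order = ....
    // Emit only documents that carry both key fields; order itself may legitimately be 0
    if (doc.accountId && doc.visitStartTime && doc.visitStartTime.fieldValue) {
        emit([doc.accountId, order, doc.visitStartTime.fieldValue], null);
    }
}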

Page 18:

Retrieve data

.../_view/by_accountid_state_timestamp?limit=10&skip=0&startkey=["qa15020713", 0]&endkey=["qa15020713", 9]

ComplexKey startKey = null;

// Create a new View Query
Query query = new Query();
query.setIncludeDocs(true); // Include the full document

if (startValueToFilterBy != null) {
    startKey = ComplexKey.of(accountId, startValueToFilterBy);
}

if (endValueToFilterBy == null && startKey != null) {
    // Only a start value was given: match that exact key
    query.setKey(startKey);
} else {
    ComplexKey endKey = ComplexKey.of(accountId, endValueToFilterBy);
    query.setRange(startKey, endKey);
}

if (limit > 0) {
    query.setLimit(limit);
}
if (skip > 0) {
    query.setSkip(skip);
}
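To make the startkey/endkey range concrete, the sketch below models a view index as a sorted map keyed by `[accountId, order, timestamp]` and selects rows with an inclusive range. It is a simplified in-memory model, not the Couchbase SDK: the class `CompositeKeyIndex` and its methods are hypothetical, keys may contain only numbers and strings, strings are compared with `String.compareTo` rather than the Unicode collation a real view uses, and each emitted key is assumed to be unique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of a view index with array keys.
final class CompositeKeyIndex {

    // Array keys compare element by element; when one key is a prefix of the other,
    // the shorter key sorts first.
    static final Comparator<List<Object>> KEY_ORDER = (a, b) -> {
        int shared = Math.min(a.size(), b.size());
        for (int i = 0; i < shared; i++) {
            int c = compareElement(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    };

    // Numbers sort before strings; values of the same kind use their natural order.
    static int compareElement(Object x, Object y) {
        if (x instanceof Number && y instanceof Number) {
            return Double.compare(((Number) x).doubleValue(), ((Number) y).doubleValue());
        }
        if (x instanceof String && y instanceof String) {
            return ((String) x).compareTo((String) y);
        }
        return (x instanceof Number) ? -1 : 1;
    }

    private final TreeMap<List<Object>, String> rows = new TreeMap<>(KEY_ORDER);

    static List<Object> key(Object... parts) {
        return Arrays.asList(parts);
    }

    // Equivalent of emit(key, null) in the map function, remembering the document id.
    void emit(String docId, Object... keyParts) {
        rows.put(key(keyParts), docId);
    }

    // Document ids whose keys fall in [startKey, endKey], in key order.
    List<String> range(List<Object> startKey, List<Object> endKey) {
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).values());
    }

    public static void main(String[] args) {
        CompositeKeyIndex index = new CompositeKeyIndex();
        index.emit("v1", "qa15020713", 0, 1000L);
        index.emit("v2", "qa15020713", 3, 900L);
        index.emit("v3", "qa15020713", 9, 500L);
        index.emit("v4", "other", 1, 100L);
        System.out.println(index.range(key("qa15020713", 0), key("qa15020713", 9)));
        System.out.println(index.range(key("qa15020713", 0), key("qa15020713", 10)));
    }
}
```

In this model a two-element end key such as `["qa15020713", 9]` sorts before every three-element key that starts with the same two values, so rows with order 9 fall outside the range. Whether that matters depends on which order values the view actually emits.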

Page 19:

Cross data center replication options

1. Unidirectional replication to replicate the data to our DR data center

2. (Future) Bidirectional replication, where each data center handles a portion of the entire traffic

Insights:

○ Keyspace is the same (in both scenarios) - avoid conflicts

○ Impact on the cluster size - from 5 nodes to 7-8 nodes in each cluster

Page 20:

What did we learn so far?

● Delete docs

○ Use TTL instead of delete

○ Use longer TTL if possible

● In our use case the working set is around 100% - RAM and SSD are key factors in scalability

● Move to production ASAP, even for staging!!
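One detail to keep in mind when choosing a longer TTL: in the memcached-style expiry convention that Couchbase follows, an expiry value of up to 30 days is read as a number of seconds from now, while a larger value is read as an absolute Unix timestamp. The sketch below shows a small helper that converts a desired lifetime into the value to pass as the expiry argument; the class and method names are hypothetical and it is not taken from the talk.

```java
// Hypothetical helper: turns a desired lifetime into a memcached-style expiry value.
final class TtlExpiry {

    // Largest value that is still interpreted as a relative number of seconds.
    static final int THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60;

    // Returns the expiry to send for a document that should live ttlSeconds from now.
    static int toExpiry(long ttlSeconds, long nowEpochSeconds) {
        if (ttlSeconds < 0) {
            throw new IllegalArgumentException("TTL must not be negative: " + ttlSeconds);
        }
        if (ttlSeconds <= THIRTY_DAYS_SECONDS) {
            return (int) ttlSeconds; // relative offset in seconds
        }
        // Longer lifetimes must be sent as an absolute Unix timestamp.
        return Math.toIntExact(nowEpochSeconds + ttlSeconds);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000;
        System.out.println("1 hour  -> " + toExpiry(3_600, now));
        System.out.println("45 days -> " + toExpiry(45L * 24 * 60 * 60, now));
    }
}
```

Passing a raw number of seconds larger than 30 days would be read as a timestamp in the distant past, so the document would expire immediately instead of living longer.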

Page 21:

Couchbase in LP - Additional Use cases

● Session state

● Cross-session state

● Caching layer - Memcached style

Page 22:

Thank You