Eventsourcing with PHP and MongoDB Data Analysis with events

Upload: jacopo-nardiello

Post on 14-Jul-2015

Category:

Software


TRANSCRIPT

Eventsourcing with PHP and MongoDB

Data Analysis with events

Hello!

My name is Jacopo Nardiello

• Programmer, currently located in London

• I care about code quality

• If it's not tested, it doesn't exist.

If you want to follow me: @jnardiello

What is this talk about

A little bit of context

About Onebip

Mobile payments platform. Start-up born in 2005, acquired by Neomobile group in 2011.

Onebip today: 70 countries, 200+ carriers, 5 billion registered users

It all started with a Monolith

LAMP stack

To Distributed Systems

Self-contained services talking via REST

Modern Services

First-class modern NoSQL distributed dbs

But the Monolith is still there

The problem

A reporting horror story

We need three new reports!

- Manager

Sure, no problem!

Deal with the legacy SQL Schema

Deal with MongoDB

A little bit of queries here, a little bit of map/reduce there

1 month later...

Reports are finally ready!

until...

Your queries are killing production!

- SysAdmin

Heavy querying optimization

New indexes

still not enough

Let's re-use data from other reports

(don't do that)

DB is Ok, reports delivered.

but then...

Houston, we have a problem. Reports are not consistent (with other reports)

- business guy

Mistakes were made

#1

Avoid multiple sources of truth

It's hard to compare data from different sources in a distributed system split across multiple domains

#2

Ubiquitous language

Same words, different concepts across domains

#3

Fault tolerance to change

Changing a report shouldn't have side effects

Most common solutions

#1

ETL + Map/Reduce

#2

DWH + Consultants

#3

Mad science

(Yeppa!)

What we wanted

Must have

No downtime in production

Consistent across domains

Nice to have

A system elastic enough to extract any metric

Realtime data

In DDD we found the light

Eventsourcing & CQRS

Eventsourcing

The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied

- Martin Fowler

Unrolling a stream of events

Starting from the beginning of time, you are literally unrolling history to reach the state at a given time
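
To make the "unrolling" idea concrete, here is a minimal JavaScript sketch that replays a stream up to a chosen moment. The event shapes and the `apply` and `stateAt` helpers are hypothetical and only illustrate the principle; they are not taken from the talk's codebase.

```javascript
// Hypothetical event stream, already stored in the order the events were applied.
const events = [
  { type: "user-registered", at: "2014-11-21T00:00:01Z", payload: { user_id: 123 } },
  { type: "user-purchased", at: "2014-11-21T00:10:01Z", payload: { user_id: 123, amount: 20 } },
  { type: "user-purchased", at: "2014-11-23T09:00:00Z", payload: { user_id: 123, amount: 5 } },
];

// Apply a single event to the current state and return the new state.
function apply(state, event) {
  switch (event.type) {
    case "user-registered":
      return { ...state, registered: true };
    case "user-purchased":
      return {
        ...state,
        purchases: state.purchases + 1,
        spent: state.spent + event.payload.amount,
      };
    default:
      return state; // unknown event types leave the state untouched
  }
}

// Unroll history from the beginning of time up to (and including) `until`.
function stateAt(stream, until) {
  const limit = new Date(until).getTime();
  return stream
    .filter((event) => new Date(event.at).getTime() <= limit)
    .reduce(apply, { registered: false, purchases: 0, spent: 0 });
}

// Only the first two events happened before this moment.
console.log(stateAt(events, "2014-11-22T00:00:00Z"));
```

Asking for a later date replays more of the stream, so the same events answer "what was the state then?" for any point in time.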

Idea #1

Every change to the state of your application is captured in an event object

"User Logged In", "Payment Sent", "User Landed"

Idea #2

Events are stored in the sequence they were applied inside an eventstore
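
As a rough illustration of ideas #1 and #2, the sketch below keeps events in an append-only list and hands them back in write order. `InMemoryEventStore` is a hypothetical stand-in for a real eventstore, not the one used in the talk.

```javascript
// A tiny in-memory stand-in for an eventstore (hypothetical, for illustration only).
class InMemoryEventStore {
  constructor() {
    this.events = [];
  }

  // Events are only ever appended; each one records its position in the sequence.
  append(type, payload) {
    const stored = Object.freeze({ sequence: this.events.length + 1, type, payload });
    this.events.push(stored);
    return stored;
  }

  // Read the stream back in exactly the order it was written.
  readAll(fromSequence = 1) {
    return this.events.filter((event) => event.sequence >= fromSequence);
  }
}

const store = new InMemoryEventStore();
store.append("User Landed", { user_id: 123 });
store.append("User Logged In", { user_id: 123 });
store.append("Payment Sent", { user_id: 123, amount: 20 });

console.log(store.readAll().map((event) => `${event.sequence}: ${event.type}`));
```

Nothing is updated or deleted: a change of state is always a new event at the end of the sequence.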

Command Query Responsibility Segregation (CQRS)

Commands

Anything that happens in one of your domains is triggered by a command and generates one or more events.

Order received --> payment sent --> Items queued --> Confirmation email sent
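
The chain above can be sketched as a command handler that takes one command and returns the events it generates. The handler name, the command fields and the event type names are assumptions made for this example.

```javascript
// Hypothetical command handler: one command in, one or more events out.
function handlePlaceOrder(command) {
  if (!Array.isArray(command.items) || command.items.length === 0) {
    throw new Error("An order needs at least one item");
  }
  const base = { order_id: command.order_id };
  return [
    { type: "order-received", payload: { ...base, items: command.items } },
    { type: "payment-sent", payload: { ...base, amount: command.amount } },
    { type: "items-queued", payload: { ...base, items: command.items } },
    { type: "confirmation-email-sent", payload: { ...base, email: command.email } },
  ];
}

const emitted = handlePlaceOrder({
  order_id: "order-1",
  items: ["fluffy cat"],
  amount: 20,
  email: "customer@example.com", // placeholder address
});

console.log(emitted.map((event) => event.type).join(" --> "));
// order-received --> payment-sent --> items-queued --> confirmation-email-sent
```

The handler itself stores no state; the returned events are what gets written to the eventstore.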

Idea #3

Everything is an event. No more state.

Query

Generate read models from events depending on how the data actually needs to be used (by users and other application internals)

Idea #4

One way to store data/events but potentially infinite ways to read them

A practical example

Tech ops, Business control, Monitoring, Accounting: they are all interested in reading data from different views.

Healthy NoSQL

You start with this

{
  "_id": ObjectId("123"),
  "username": "Flash",
  "city": …,
  "phone": …,
  "email": …
}

The more successful your company is, the more people

...The more people, the more views

With document dbs it's magically easy to add new fields to your collections.

Soon you might end up with

{
  "_id": ObjectId("123"),
  "username": "Flash",
  "city": …,
  "phone": …,
  "email": …,
  "created_at": …,
  "updated_at": …,
  "ever_tried_to_purchase_something": …,
  "canceled_at": …,
  "acquisition_channel": …,
  "terminated_at": …,
  "latest_purchase_date": …,
  …
}

A bomb waiting to detonate

It's impossible to keep adding state changes to your documents and then expect to be able to extract them with a single query.

Exploring Tools

EventStore

• Engineered for event sourcing

• Supports projections

• By the father of CQRS

• Great performance

The bad

Windows only. Runs on *nix with Mono, but still too unstable.

LevelWHEN

An eventstore built with Node.js and LevelDB

• Faster than light

• Completely custom, no tool to handle aggregates

The known path

• PHP (any other language would do just fine)

• MongoDB 2.x

Why MongoDB

Events are not relational

Scales well

Awesome aggregation framework

Hands on

Storing Events

The write architecture

Service
        \
         \  [event payload]
          \
Service --- Queue System <------------> API -> MongoDB
          /
         /  [event payload]
        /
Service

Queues

MongoDB RS

A mongod replica set with two logical dbs:

1. Eventstore DB, where we would store events

2. Reporting DB, where we would store aggregates and final reports

Anatomy of an event

{
  '_id' : '3318c11e-fe60-4c80-a2b2-7add681492d9',
  'type': 'an-event-type',
  'data': {
    'meta' : { … },
    'payload' : { … }
  }
}

Anatomy of an event

'meta' : {
  'created_at': ISODate("2014-11-21T00:00:01Z"),
  'save_date': ISODate("2014-11-21T00:00:02Z"),
  'source': 'some-bounded-context',
  'correlation_id': 'a-correlation-id'
},
'payload' : {
  'user_id': '1234',
  'animal': 'unicorn',
  'colour': 'pink',
  'purchase_date': ISODate("2014-11-21T00:00:00Z"),
  'price': '20 fantaeuros'
}

Don't trust the network: Idempotence

{
  '_id' : '3318c11e-fe60-4c80-a2b2-7add681492d9',
  …
}

The _id field is actually defined client-side and ensures idempotence if an event is received twice.
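
A minimal sketch of why a client-defined `_id` gives idempotence: the second delivery of the same event is recognised by its id and ignored. `EventCollection` and `insertOnce` are hypothetical in-memory stand-ins. In MongoDB itself the unique index on `_id` plays this role, so a second insert with the same `_id` fails with a duplicate key error that the API can safely treat as "already stored".

```javascript
// In-memory stand-in for a collection with a unique _id (hypothetical helper).
class EventCollection {
  constructor() {
    this.byId = new Map();
  }

  // Returns true if the event was stored, false if this _id was already present.
  insertOnce(event) {
    if (this.byId.has(event._id)) {
      return false; // duplicate delivery: nothing changes
    }
    this.byId.set(event._id, event);
    return true;
  }

  count() {
    return this.byId.size;
  }
}

const collection = new EventCollection();
const event = {
  _id: "3318c11e-fe60-4c80-a2b2-7add681492d9", // generated by the client, not by the database
  type: "an-event-type",
  data: { meta: {}, payload: {} },
};

const firstDelivery = collection.insertOnce(event);  // stored
const secondDelivery = collection.insertOnce(event); // e.g. a retry after a network timeout

console.log(firstDelivery, secondDelivery, collection.count());
// true false 1
```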

Indexes

• Events collections are HUGE (~100M*N documents)

• Use indexes wisely, as they are necessary yet expensive

• With the suggested event structure: type + data.meta.created_at

Benchmarking

How many events/second can you store?

Our machines were able to store roughly 150 events/sec. This number can be greatly increased with dedicated IOPS, more aggressive inserting policies, etc.
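
To put the 150 events/sec figure in perspective, here is a back-of-the-envelope calculation. It assumes the rate is sustained around the clock, which a real system rarely manages, so treat the results as rough upper bounds rather than measurements.

```javascript
// Rough capacity arithmetic, assuming a constant 150 events/sec (an idealisation).
const eventsPerSecond = 150;
const secondsPerDay = 24 * 60 * 60; // 86400

const eventsPerDay = eventsPerSecond * secondsPerDay;
console.log(eventsPerDay); // 12960000

// Days needed to accumulate 100M events at that rate.
const daysTo100M = 100e6 / eventsPerDay;
console.log(daysTo100M.toFixed(1)); // "7.7"
```

At that pace a collection in the ~100M range fills up in little more than a week, which is why index choices and storage hardware matter so much.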

Final Tips

• Use SSDs on your storage machines

• Pay attention to write concerns

From Events to Meaningful Metrics

The Event Processing Pipeline

Sequential Projector -> Event Mapper -> Projection -> Aggregation

A real life problem

What is the conversion rate of our registered users?

#1 The registration event

{
  '_id' : '3318c11e-fe60-4c80-a2b2-7add681492d9',
  'type': 'user-registered',
  'data': {
    'meta' : {
      'save_date': ISODate("2014-11-21T00:00:09Z"),
      'created_at': ISODate("2014-11-21T00:00:01Z"),
      'source': 'core-domain',
      'correlation_id': 'user-123456'
    },
    'payload' : {
      'user_id': 123,
      'username': 'flash',
      'email': '[email protected]',
      'country': 'IT'
    }
  }
}

#2 The purchase event

{
  '_id' : '3318c11e-fe60-4c80-a2b2-7add681492d9',
  'type': 'user-purchased',
  'data': {
    'meta' : {
      'save_date': ISODate("2014-11-21T00:10:09Z"),
      'created_at': ISODate("2014-11-21T00:10:01Z"),
      'source': 'payment-gateway',
      'correlation_id': 'user-123456'
    },
    'payload' : {
      'user_id': 123,
      'email': '[email protected]',
      'amount': 20,
      'value': 'EUR',
      'payment': 'credit_card',
      'item': 'fluffy cat'
    }
  }
}

Sequential projector (1)

Divides the stream of events into batches, filters events by type and passes those of interest to the mapper

[]->[x]->[]->[x]->[]->[]->[]->[]
|--------------|  |------------|
       |                |
       |                |
       ---> Projector

Sequential projector (2)

• It's a good idea to select fixed-size batches to avoid memory problems when you load your Cursor in memory

• Could be a long-running process selecting events as they arrive in realtime
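
A minimal sketch of the projector described above: it walks the stream in fixed-size batches, keeps only the event types of interest and hands them to a mapper. `inBatches`, `project` and the sample stream are hypothetical. A real projector would read batches from a MongoDB cursor instead of an in-memory array.

```javascript
// Yield fixed-size batches so the whole stream is never held in memory at once.
function* inBatches(stream, batchSize) {
  let batch = [];
  for (const event of stream) {
    batch.push(event);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) {
    yield batch; // last, possibly smaller, batch
  }
}

// Hypothetical projector: processes batches in order and passes
// only the events of interest to the mapper.
function project(stream, typesOfInterest, mapper, batchSize = 2) {
  const mapped = [];
  for (const batch of inBatches(stream, batchSize)) {
    for (const event of batch) {
      if (typesOfInterest.has(event.type)) {
        mapped.push(mapper(event));
      }
    }
  }
  return mapped;
}

const stream = [
  { type: "user-landed" },
  { type: "user-registered" },
  { type: "user-landed" },
  { type: "user-purchased" },
  { type: "user-landed" },
];

const interesting = new Set(["user-registered", "user-purchased"]);
const seen = project(stream, interesting, (event) => event.type);
console.log(seen);
```

The batch size only bounds memory use; the order of events is preserved across batches.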

Event Mapper (1)

Translates event fields to the Read Model domain.

Takes an event as input, applies a bunch of logic and returns a list of Read Model fields.

Event Mapper (2)

Input Event:

user-registered

Output:

$output = [
    'user_id' => 123, // simply copied
    'user_name' => 'flash', // copied from the username field
    'email' => '[email protected]', // simply copied
    'registered_at' => "2014-11-21T00:00:01Z" // From the data.meta.created_at event field
];

Event Mapper (3)

Input Event:

user-purchased

Output:

$output = [
    'user_id' => 123, // simply copied
    'email' => '[email protected]', // simply copied
    'purchased_at' => "2014-11-21T00:10:01Z" // From the data.meta.created_at event field
];
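
The two mappings above could be expressed as one small function per event type. This is a JavaScript sketch under assumptions: the `mappers` table and `mapEvent` are hypothetical names, and the email address is a placeholder.

```javascript
// Hypothetical mappers, one per event type, following the field choices on the slides.
const mappers = new Map([
  ["user-registered", (event) => ({
    user_id: event.data.payload.user_id,       // simply copied
    user_name: event.data.payload.username,    // copied under the read model's name
    email: event.data.payload.email,           // simply copied
    registered_at: event.data.meta.created_at, // from data.meta.created_at
  })],
  ["user-purchased", (event) => ({
    user_id: event.data.payload.user_id,       // simply copied
    email: event.data.payload.email,           // simply copied
    purchased_at: event.data.meta.created_at,  // from data.meta.created_at
  })],
]);

// Returns the Read Model fields for an event, or null when its type is not mapped.
function mapEvent(event) {
  const mapper = mappers.get(event.type);
  return mapper ? mapper(event) : null;
}

const registration = {
  type: "user-registered",
  data: {
    meta: { created_at: "2014-11-21T00:00:01Z", source: "core-domain" },
    payload: { user_id: 123, username: "flash", email: "user@example.com", country: "IT" },
  },
};

console.log(mapEvent(registration));
```

Fields the read model does not need, such as `country` or `source`, are simply left out by the mapper.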

Projection (1)

Essentially it is your read model. The data that the business is interested in.

The Projection, after event #1 (2)

db.users_conversion_rate_projection.findOne()

{
  'user_id': 123,
  'user_name': 'flash',
  'email': '[email protected]',
  'registered_at': "2014-11-21T00:00:01Z"
}

The Projection, after event #2 (3)

{
  'user_id': 123,
  'user_name': 'flash',
  'email': '[email protected]',
  'registered_at': "2014-11-21",
  'purchased_at': "2014-11-21" // Added this field and rewrote others
}

The Projection, collection

{
  'user_id': 123,
  'user_name': 'flash',
  'email': '[email protected]',
  'registered_at': "2014-11-21",
  'purchased_at': "2014-11-21"
}
{
  'user_id': 456,
  'user_name': 'batman',
  'email': '[email protected]',
  'registered_at': "2014-11-21",
  'purchased_at': "2014-11-21"
}
{
  'user_id': 789,
  'user_name': 'superman',
  'email': '[email protected]',
  'registered_at': "2014-12-21",
  'purchased_at': "2014-12-21"
}
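
How a projection row grows as events arrive can be sketched as an upsert keyed by `user_id`: create the row if it is missing, otherwise merge the new fields into it. The `projection` map and `upsert` helper are hypothetical in-memory stand-ins. In MongoDB a similar effect is usually obtained with an update that uses `$set` together with the `upsert: true` option, though the slides do not say which call the talk's code uses.

```javascript
// In-memory stand-in for the projection collection, keyed by user_id.
const projection = new Map();

// Merge the mapped fields into the row for that user, creating it if missing.
function upsert(fields) {
  const current = projection.get(fields.user_id) || {};
  projection.set(fields.user_id, { ...current, ...fields });
}

// After event #1 (registration): the row is created.
upsert({ user_id: 123, user_name: "flash", registered_at: "2014-11-21T00:00:01Z" });

// After event #2 (purchase): the same row gains a purchased_at field.
upsert({ user_id: 123, purchased_at: "2014-11-21T00:10:01Z" });

console.log(projection.get(123));
```

Because each event only adds or overwrites the fields it knows about, replaying the same events in the same order always rebuilds the same row.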

The Projection, a few thoughts (4)

Note that we didn't copy all the available fields from the events to the projection, just the relevant ones.

From these two events we could have generated infinite read models, such as:

• List all purchased products and related amounts for the company buyers

• Map all sales and revenues for our accounting dept

• List transactions for the financial department

One way to write, infinite ways to read!
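
As an illustration of "infinite ways to read", the sketch below builds a different read model, revenue per item, from purchase events shaped like the one shown earlier. The sample events (other than the "fluffy cat" item and its amount) and the `revenueByItem` helper are invented for this example.

```javascript
// Hypothetical purchase events, reduced to the fields this read model needs.
const purchaseEvents = [
  { type: "user-purchased", data: { payload: { user_id: 123, item: "fluffy cat", amount: 20 } } },
  { type: "user-purchased", data: { payload: { user_id: 456, item: "fluffy cat", amount: 20 } } },
  { type: "user-registered", data: { payload: { user_id: 789 } } },
  { type: "user-purchased", data: { payload: { user_id: 789, item: "pink unicorn", amount: 35 } } },
];

// A second read model over the same stream: total revenue per item.
function revenueByItem(events) {
  const totals = new Map();
  for (const event of events) {
    if (event.type !== "user-purchased") {
      continue; // this read model only cares about purchases
    }
    const { item, amount } = event.data.payload;
    totals.set(item, (totals.get(item) || 0) + amount);
  }
  return totals;
}

for (const [item, total] of revenueByItem(purchaseEvents)) {
  console.log(`${item}: ${total}`);
}
```

The write side did not change at all; only a new reader was added.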

The aggregation (1) - Total registered users

var registered = db.users_conversion_rate_projection.aggregate([
  {
    $match: {
      "registered_at": { $gte: ISODate("2014-11-21"), $lte: ISODate("2014-11-22") }
    }
  },
  { $group: { _id: { }, count: { $sum: 1 } } }
]);

The aggregation (2) - Users with a purchase

var purchased = db.users_conversion_rate_projection.aggregate([
  {
    $match: {
      "registered_at": { $gte: ISODate("2014-11-21"), $lte: ISODate("2014-11-22") },
      "purchased_at": { $exists: true }
    }
  },
  { $group: { _id: { }, count: { $sum: 1 } } }
]);
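
The conversion rate is the second count divided by the first. The sketch below reproduces the two `$match` conditions in plain JavaScript over a handful of invented projection rows, so the arithmetic can be checked without a database. The rows and the `conversionRate` helper are assumptions, and dates are stored as `Date` objects so that range comparisons behave as expected.

```javascript
// Invented projection rows; dates are Date objects so range comparisons work.
const rows = [
  { user_id: 123, registered_at: new Date("2014-11-21T00:00:01Z"), purchased_at: new Date("2014-11-21T00:10:01Z") },
  { user_id: 456, registered_at: new Date("2014-11-21T08:00:00Z") },
  { user_id: 321, registered_at: new Date("2014-11-21T12:00:00Z") },
  { user_id: 654, registered_at: new Date("2014-11-21T23:00:00Z") },
  { user_id: 789, registered_at: new Date("2014-12-21T00:00:00Z"), purchased_at: new Date("2014-12-21T01:00:00Z") },
];

// Users registered in [from, to] who have a purchased_at field,
// divided by all users registered in [from, to].
function conversionRate(projectionRows, from, to) {
  const registered = projectionRows.filter(
    (row) => row.registered_at >= from && row.registered_at <= to
  );
  const purchased = registered.filter((row) => "purchased_at" in row);
  return registered.length === 0 ? 0 : purchased.length / registered.length;
}

const rate = conversionRate(
  rows,
  new Date("2014-11-21T00:00:00Z"),
  new Date("2014-11-22T00:00:00Z")
);
console.log(rate); // 0.25
```

Four users registered inside the window and one of them purchased, hence 0.25. The user who registered in December is outside the window and is not counted at all.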

The aggregation (3) - Automate all the things

• You can easily build the aggregation framework statement by composition, abstracting the concept of Column.

• This way you can dynamically aggregate your projections on (for example) an API request.

• If your Projector is a long-running process, your projections will be updated to the second and you automagically get realtime data.
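
One possible reading of "composition by Column" is sketched below: each column is a named set of extra `$match` conditions, and a builder combines it with the date window into a pipeline. The `columns` table and `buildPipeline` are hypothetical; the slides do not show how the talk's own Column abstraction works. The code only builds the pipeline as plain data, so it runs without a database; the result has the same shape as the statements passed to `aggregate()` on the previous slides.

```javascript
// Hypothetical "Column": a name plus the extra $match conditions that define it.
const columns = new Map([
  ["registered", {}],
  ["purchased", { purchased_at: { $exists: true } }],
]);

// Build the aggregation pipeline for one column over a registration window.
function buildPipeline(columnName, from, to) {
  if (!columns.has(columnName)) {
    throw new Error(`Unknown column: ${columnName}`);
  }
  return [
    { $match: { registered_at: { $gte: from, $lte: to }, ...columns.get(columnName) } },
    { $group: { _id: {}, count: { $sum: 1 } } },
  ];
}

const from = new Date("2014-11-21T00:00:00Z");
const to = new Date("2014-11-22T00:00:00Z");

// An API handler could call this with a column name taken from the request.
console.log(JSON.stringify(buildPipeline("purchased", from, to), null, 2));
```

Adding a new metric then means adding one entry to `columns` rather than writing a new query by hand.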

Another events usage: Business & Tech Monitoring

Beware of the beast!

No Silver Bullet

Events are expensive

They require a lot of TIME to be parsed

Events are expensive

You will end up with a billion-document collection (and counting).

Fixing wrong events is painful

Events are complex

Moving events around is horribly painful

Mongo won't help you.

Actually it will make your life incredibly difficult with hidden bugs and lacking documentation.

Thank you!

@jnardiello