building scalable systems: an asynchronous approach

84
/ Building Scalable Systems an Asynchronous Approach pleasure and pain Monday, June 20, 2011

Upload: theo-schlossnagle

Post on 14-Jan-2015

2.119 views

Category:

Technology


1 download

DESCRIPTION

A brief tour through asynchronous systems.

TRANSCRIPT

Page 1: Building Scalable Systems: an asynchronous approach

/

Building Scalable Systemsan Asynchronous Approach

pleasure and pain

Monday, June 20, 2011

Page 2: Building Scalable Systems: an asynchronous approach

Who am I? @postwait on twitter

Author of “Scalable Internet Architectures”Pearson, ISBN: 067232699X

Contributor to “Web Operations”O’Reilly, ISBN:

Founder of OmniTI, Message Systems, Fontdeck, & CirconusI like to tackle problems that are “always on” and “always growing.”

I am an EngineerA practitioner of academic computing.IEEE member and Senior ACM member.On the Editorial Board of ACM’s Queue magazine.

2

Monday, June 20, 2011

Page 3: Building Scalable Systems: an asynchronous approach

/

Some rants about systems.

cutting through the crap

Monday, June 20, 2011

Page 4: Building Scalable Systems: an asynchronous approach

BIG data

• BIG data: doesn’t exist

• but it sure is a good way to market things.

• If you measure things in petabytesas we have for the last several yearsyou have data, not BIG data.

Monday, June 20, 2011

Page 5: Building Scalable Systems: an asynchronous approach

Data stores

• “new” NoSQL systems can scale better than an RDBMS

• yes

• They scale better than sharded RDBMS

• no

• I need NoSQL systems for “BIG” data.

• no

Monday, June 20, 2011

Page 6: Building Scalable Systems: an asynchronous approach

Why NoSQL

• The choice to use a NoSQL system is driven by:

• a more suitable data model

• we desire to shift our CAP theorem constraints

Monday, June 20, 2011

Page 7: Building Scalable Systems: an asynchronous approach

The cloud

• The cloud is not magic.

• The cloud enables engineers to rapidly deploy an architecture to run an application.

• There is nothing wrong with this.

Monday, June 20, 2011

Page 8: Building Scalable Systems: an asynchronous approach

The cloud: gotcha 1

• Provisioning services “the old way” took time due to:

• installing systems

• install packages

• determining and enforcing data security

• determining and enforcing SLA monitoring

• determining and enforcing DR: RPO and RTO

• documenting escalation and remediation efforts

Monday, June 20, 2011

Page 9: Building Scalable Systems: an asynchronous approach

The cloud: gotcha 2

• Quality of service in multi-tenancy environments is a very, very hard problem.

• In fact, no one has solved it.

Monday, June 20, 2011

Page 10: Building Scalable Systems: an asynchronous approach

The cloud: gotcha 3

• Platform as a service provides:

• MySQL? PostgreSQL? Redis? Node.js?

• .. patched.

• .. patched.

• .. patched.

• .. patched.

Monday, June 20, 2011

Page 11: Building Scalable Systems: an asynchronous approach

The cloud: gotcha 4

• Scaling isn’t magic pixie dust.

• “I can run more instances of my app.”

• horribly false sense of security.

• You must have a scalable architecture

Monday, June 20, 2011

Page 12: Building Scalable Systems: an asynchronous approach

/

Design & Implementation Techniques

some say architecture != implementation

Monday, June 20, 2011

Page 13: Building Scalable Systems: an asynchronous approach

Architecture vs. Implementation

Monday, June 20, 2011

Page 14: Building Scalable Systems: an asynchronous approach

Architecture vs. Implementation

Architecture is without specification of the vendor,make, and model of components.

Monday, June 20, 2011

Page 15: Building Scalable Systems: an asynchronous approach

Architecture vs. Implementation

Architecture is without specification of the vendor,make, and model of components.

Implementation is the adaptation of an architectureto embrace available technologies.

Monday, June 20, 2011

Page 16: Building Scalable Systems: an asynchronous approach

Architecture vs. Implementation

Architecture is without specification of the vendor,make, and model of components.

Implementation is the adaptation of an architectureto embrace available technologies.

They are intrinsically tied.Insisting on separation is a metaphysical argument(with no winners)

Monday, June 20, 2011

Page 17: Building Scalable Systems: an asynchronous approach

Respect Engineering Math

Engineering math:

19 + 89 = 110

“Precise” Math:

19 + 89 = 10.8

Monday, June 20, 2011

Page 18: Building Scalable Systems: an asynchronous approach

Respect Engineering Math

Engineering math:

19 + 89 = 110

“Precise” Math:

19 + 89 = 10.8

Ok. Ok. I must have, I must have put a decimal point in the wrong place or something. Shit. I always do that. I always mess up some mundane detail.

- Michael Bolton in Office Space

Monday, June 20, 2011

Page 19: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Monday, June 20, 2011

Page 20: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Monday, June 20, 2011

Page 21: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Monday, June 20, 2011

Page 22: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Monday, June 20, 2011

Page 23: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Alice: How many req/second do you need?

Monday, June 20, 2011

Page 24: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Alice: How many req/second do you need?

Bob: 800 req/second would be good.

Monday, June 20, 2011

Page 25: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Alice: How many req/second do you need?

Bob: 800 req/second would be good.

Alice: Um, Bob, 200 x 8 = 1600... you have 50% headroom on your goal.

Monday, June 20, 2011

Page 26: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Alice: How many req/second do you need?

Bob: 800 req/second would be good.

Alice: Um, Bob, 200 x 8 = 1600... you have 50% headroom on your goal.

Bob: No... 200 / 8 = 25 req/second per server.

Monday, June 20, 2011

Page 27: Building Scalable Systems: an asynchronous approach

Ensure the gods aren’t angry.

Bob: We need to grow our cluster of web servers.

Alice: How many requests per second do they do, how many do you have and what is their current resource utilization?

Bob: About 200 req/second, 8 servers and they have no headroom.

Alice: How many req/second do you need?

Bob: 800 req/second would be good.

Alice: Um, Bob, 200 x 8 = 1600... you have 50% headroom on your goal.

Bob: No... 200 / 8 = 25 req/second per server.

Alice: Bob... the gods are angry.

Monday, June 20, 2011

Page 28: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Monday, June 20, 2011

Page 29: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Most web apps are CPU bound (as I/O happens on a different layer)

Monday, June 20, 2011

Page 30: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Most web apps are CPU bound (as I/O happens on a different layer)

Typical box today: 8 cores are 2.8GHz or 22.4 BILLION instructions per second.

Monday, June 20, 2011

Page 31: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Most web apps are CPU bound (as I/O happens on a different layer)

Typical box today: 8 cores are 2.8GHz or 22.4 BILLION instructions per second.

22x109 instr/s / 25 req/s = 880 MILLION instructions per request.

Monday, June 20, 2011

Page 32: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Most web apps are CPU bound (as I/O happens on a different layer)

Typical box today: 8 cores are 2.8GHz or 22.4 BILLION instructions per second.

22x109 instr/s / 25 req/s = 880 MILLION instructions per request.

This same effort (per-request) provided me with approximately15 minutes enjoying “Might & Magic 2” on my Apple IIe- you’ve certainly pissed me off.

Monday, June 20, 2011

Page 33: Building Scalable Systems: an asynchronous approach

Why you’ve pissed off the gods.

Most web apps are CPU bound (as I/O happens on a different layer)

Typical box today: 8 cores are 2.8GHz or 22.4 BILLION instructions per second.

22x109 instr/s / 25 req/s = 880 MILLION instructions per request.

This same effort (per-request) provided me with approximately15 minutes enjoying “Might & Magic 2” on my Apple IIe- you’ve certainly pissed me off.

No wonder the gods are angry.

Monday, June 20, 2011

Page 34: Building Scalable Systems: an asynchronous approach

Develop a model

Monday, June 20, 2011

Page 35: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Monday, June 20, 2011

Page 36: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Monday, June 20, 2011

Page 37: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

Monday, June 20, 2011

Page 38: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

very hard to develop a complete and accurate model

Monday, June 20, 2011

Page 39: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

very hard to develop a complete and accurate model

Benefits:

Monday, June 20, 2011

Page 40: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

very hard to develop a complete and accurate model

Benefits:

provides insight on architecture capacitance dependencies

Monday, June 20, 2011

Page 41: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

very hard to develop a complete and accurate model

Benefits:

provides insight on architecture capacitance dependencies

relatively easy to understand

Monday, June 20, 2011

Page 42: Building Scalable Systems: an asynchronous approach

Develop a model

Queue theoretic models are for “other people.”

Sorta, not really.

Problems:

very hard to develop a complete and accurate model

Benefits:

provides insight on architecture capacitance dependencies

relatively easy to understand

illustrates opportunities to further isolate work

Monday, June 20, 2011

Page 43: Building Scalable Systems: an asynchronous approach

Rationalize your model

Monday, June 20, 2011

Page 44: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Monday, June 20, 2011

Page 45: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Take measurements and walk through the model to rationalize iti.e. prove it to be empirically correct

Monday, June 20, 2011

Page 46: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Take measurements and walk through the model to rationalize iti.e. prove it to be empirically correct

You should be able to map actions to consequences:

Monday, June 20, 2011

Page 47: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Take measurements and walk through the model to rationalize iti.e. prove it to be empirically correct

You should be able to map actions to consequences:

A user signs up ➙ 4 synchronous DB inserts (1 synch IOPS + 4 asynch writes) 1 AMQP durable, persistent message 1 asynch DB read ➙ 1/10 IOPS writing new Lucene indexes

Monday, June 20, 2011

Page 48: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Take measurements and walk through the model to rationalize iti.e. prove it to be empirically correct

You should be able to map actions to consequences:

A user signs up ➙ 4 synchronous DB inserts (1 synch IOPS + 4 asynch writes) 1 AMQP durable, persistent message 1 asynch DB read ➙ 1/10 IOPS writing new Lucene indexes

In a dev environment, simulate traffic and rationalize your model

Monday, June 20, 2011

Page 49: Building Scalable Systems: an asynchronous approach

Rationalize your model

Draw your model out

Take measurements and walk through the model to rationalize iti.e. prove it to be empirically correct

You should be able to map actions to consequences:

A user signs up ➙ 4 synchronous DB inserts (1 synch IOPS + 4 asynch writes) 1 AMQP durable, persistent message 1 asynch DB read ➙ 1/10 IOPS writing new Lucene indexes

In a dev environment, simulate traffic and rationalize your model

I call this a “data flow causality map”

Monday, June 20, 2011

Page 50: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

Monday, June 20, 2011

Page 51: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

Monday, June 20, 2011

Page 52: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

Monday, June 20, 2011

Page 53: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

Monday, June 20, 2011

Page 54: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

Monday, June 20, 2011

Page 55: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

slight inefficiencies

Monday, June 20, 2011

Page 56: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

slight inefficiencies

promotes lower contention

Monday, June 20, 2011

Page 57: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

slight inefficiencies

promotes lower contention

requires design of systems with less coherency requirements

Monday, June 20, 2011

Page 58: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

slight inefficiencies

promotes lower contention

requires design of systems with less coherency requirements

each isolated service is simpler and safer

Monday, June 20, 2011

Page 59: Building Scalable Systems: an asynchronous approach

Complexity will eat your lunch

there will always be empirical variance from your model

explaining the phantoms leads to enlightenment

service decoupling in complex systems gives:

simplified modeling and capacity planning

slight inefficiencies

promotes lower contention

requires design of systems with less coherency requirements

each isolated service is simpler and safer

SCALES.

Monday, June 20, 2011

Page 60: Building Scalable Systems: an asynchronous approach

/

Asynchronous Systems

it’s likely you have no idea what you’re doing

Monday, June 20, 2011

Page 61: Building Scalable Systems: an asynchronous approach

Asychronous

• of or requiring a form of computer control timing protocol in which a specific operation begins upon receipt of an indication (signal) that the preceding operation has been completed.

• ...or “I’ll act when you tell me you are done”

• ...or a protocol wherein the initiation of a task and the report of its completion are separate operations.

Monday, June 20, 2011

Page 62: Building Scalable Systems: an asynchronous approach

Protocols

• Standards:

• AMQP(impl: ActiveMQ, RabbitMQ, OpenAMQ, etc.)

• Others:

• ZeroMQ

• Gearman

Monday, June 20, 2011

Page 63: Building Scalable Systems: an asynchronous approach

Guarantees

• Queueing protocols can be misleading.

• Are you sure you did what you think you did?

• Let’s use a publish as an example.

Monday, June 20, 2011

Page 64: Building Scalable Systems: an asynchronous approach

Publication

• Imagine a Queue:

• You assume that by calling “publish” thatyour message is placed on the queue andwill eventually be consumed(assuming a consumer).

• Most systems are ‘more’ asynchronous than that.

Monday, June 20, 2011

Page 65: Building Scalable Systems: an asynchronous approach

Publication what you think happens

Kernel Network Stack Network Stack Queue

message framewrite

User Space

publish

SAFE

read

message framewritepublish read

BOOMwritemessage framereaderror

call

return

writemessage framereaderror

call

return

Monday, June 20, 2011

Page 66: Building Scalable Systems: an asynchronous approach

Publication what really happens

Kernel Network Stack Network Stack Queue

message framewrite

User Space

publish

SAFE

read

message framewritepublish

read

BOOMwrite

message framereaderror

call

return

call

return

return

Monday, June 20, 2011

Page 67: Building Scalable Systems: an asynchronous approach

Why?

• Why do queueing protocols use

“silence for success?”

• Simple: performance

• no need for a roundtrip before the next message

• success is common, failure rare

Monday, June 20, 2011

Page 68: Building Scalable Systems: an asynchronous approach

Why?

• AMQP is not alone in this...

• 0MQ as well.

Monday, June 20, 2011

Page 69: Building Scalable Systems: an asynchronous approach

Now what?

• In each component you must decide if you need:

• synchronous system w/ synchronous protocol

• asynchronous system w/ synchronous protocol

• asynchronous system w/ asynchronous protocol

Monday, June 20, 2011

Page 70: Building Scalable Systems: an asynchronous approach

Service decisions

• Knowing you can lose messages is... okay?

• it can be

• there are plenty of uses for unreliable communications

• however... generally,it is much easier to build services that have end-to-end guarantees.

Monday, June 20, 2011

Page 71: Building Scalable Systems: an asynchronous approach

Non-asynchronous: synchronous

Kernel Network Stack Network Stack Database

message framewrite

User Space

publish

SAFE

read

message framewritepublish read

BOOMwritemessage framereaderror

call

return

writemessage framereaderror

call

return

Monday, June 20, 2011

Page 72: Building Scalable Systems: an asynchronous approach

Asynchronous to the purpose

• Why is a Queue “asynchronous”

• and a Database “synchronous”

• I lied... “asynchronous” is “to the purpose.”

• If the ultimate, final goal is: storage in a DB

• and you return the result only after a commit

• then you are synchronous

Monday, June 20, 2011

Page 73: Building Scalable Systems: an asynchronous approach

Simple example: image thumbnailing

• A user uploads an email to a web site

• you need to produce 7 different transformations

• (size, color, etc.)

• Asynchronous system:

• synchronous upload protocol:

• user upload -> thank you we have it

• asynchronous processing

• file -> 7 mutations

Monday, June 20, 2011

Page 74: Building Scalable Systems: an asynchronous approach

/

A (more) complete example.

foursquare-like, untappd.com-like service

Monday, June 20, 2011

Page 75: Building Scalable Systems: an asynchronous approach

Better example: rewards calculation

• A user performs an action on your site

• and you need to reward them based on:

• social network, history, value

• you want to show them their reward “immediately.”

• Step 1: engineer for failure.

Monday, June 20, 2011

Page 76: Building Scalable Systems: an asynchronous approach

Rewards calculation: step 1

• the inability to calculate the rewardshall not prevent the action.(think: beer checkin on untappd)

• I want the reward calculation immediately.

• I need the checkin to be recorded.

Monday, June 20, 2011

Page 77: Building Scalable Systems: an asynchronous approach

Rewards calculation: step 2

• Decouple the rewards calculation:

1. receive user request

2. store(C)

3. queue the checkin(C) on QC

4. wait up to 500ms (reading rewards R from QR)

5. return R witnessed.

Monday, June 20, 2011

Page 78: Building Scalable Systems: an asynchronous approach

Rewards calculation: step 3

• Decouple the rewards calculation:

1. dequeue checkin: C from QC

2. calculate rewards(C) -> R

3. store(R)

4. queue(R) on QR

Monday, June 20, 2011

Page 79: Building Scalable Systems: an asynchronous approach

Rewards calculation: win

• You win big.

• If the rewards calculation system is

• too slow, or

• goes offline

• checkins still proceed and

• responses are served within 500ms

• You have decoupled the service availability requirements of the checkin system from the rewards system: happier users.

Monday, June 20, 2011

Page 80: Building Scalable Systems: an asynchronous approach

/

Final random thoughts

think outside of the box

Monday, June 20, 2011

Page 81: Building Scalable Systems: an asynchronous approach

Things to look at: free your mind

• Node.js

• Javascript? Seriously?

• Yes.

• Forces you to think asynchronously

• Forces you to share nothing

• Forces you to build stateless systems

• These systems scale

Monday, June 20, 2011

Page 82: Building Scalable Systems: an asynchronous approach

unsafe: when to use

• “silence is success” messaging is almost always useful when new, more temporally relevant data is bound to arrive.

• game location data

• performance data

• status data

• the casual observer

Monday, June 20, 2011

Page 83: Building Scalable Systems: an asynchronous approach

be mindful

• Always monitor:

• message rates

• queue depths

• queue counts

• connection concurrency

Monday, June 20, 2011

Page 84: Building Scalable Systems: an asynchronous approach

Thank you.

• Thank you

• Merci beaucoup.

Monday, June 20, 2011