Desert Code Camp 2016.1 - Stateful Distributed Systems

Statefulness in Distributed Systems

Upload: joe-rawlings

Post on 15-Jan-2017

Category:

Engineering


TRANSCRIPT

Page 1: Desert Code Camp 2016.1 - Stateful Distributed Systems

Statefulness in Distributed

Systems

Page 2: Desert Code Camp 2016.1 - Stateful Distributed Systems

Amazon is hiring!

To learn more about our Dev Centers: http://bit.ly/phxdevcenters

To learn more about current opportunities, email: [email protected]

Our mission is to provide the most innovative, scalable, and reliable systems in the world.

Page 3: Desert Code Camp 2016.1 - Stateful Distributed Systems

@JoeRawlings

Page 4: Desert Code Camp 2016.1 - Stateful Distributed Systems

https://goo.gl/O9PAUK

Page 5: Desert Code Camp 2016.1 - Stateful Distributed Systems

Outline

What’s that smell?

What’s the meaning of this?

Pics or it didn’t happen.

Swallowing the red pill.

Challenge accepted!

Page 11: Desert Code Camp 2016.1 - Stateful Distributed Systems

What’s that smell?

Page 12: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Services

DESIGN SMELL

ANTI-PATTERN

Page 17: Desert Code Camp 2016.1 - Stateful Distributed Systems

[Diagram: CLIENT TIER, WEB SERVER TIER, DATABASE TIER]

Page 21: Desert Code Camp 2016.1 - Stateful Distributed Systems

What’s the meaning of this?

Page 22: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Services

You’re already doing it!

Required to do something useful (most of the time)

Stateful processing systems share similar concerns

(This is where the fun is)

Page 28: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Data locality

Data-intensive systems

Strong consistency

High performance

Page 33: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateless Services

Does not care what has happened

Does not care what has changed

Page 36: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Needs to care what has happened

Needs to care what has changed

Divorce process lifecycle from data lifecycle

Page 40: Desert Code Camp 2016.1 - Stateful Distributed Systems

[Diagram: CLIENT TIER, WEB SERVER TIER, DATABASE TIER]

Page 41: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

[Diagram: CLIENT TIER, service, Storage TIER]

Page 52: Desert Code Camp 2016.1 - Stateful Distributed Systems

Pics Or It Didn’t Happen

Page 53: Desert Code Camp 2016.1 - Stateful Distributed Systems

https://github.com/graphite-project/whisper https://github.com/graphite-project/carbon

Page 54: Desert Code Camp 2016.1 - Stateful Distributed Systems

https://www.elastic.co/products/elasticsearch http://lucene.apache.org/

Page 55: Desert Code Camp 2016.1 - Stateful Distributed Systems

Swallowing The Red Pill

Page 56: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Gossip protocols

Page 58: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

[Diagram: CLIENT TIER, service, Storage TIER]

Page 60: Desert Code Camp 2016.1 - Stateful Distributed Systems

Gossip Protocols

Usage

dissemination - event / background

anti-entropy - data repair

aggregation - compute across the network

SWIM

Scalable Weakly-consistent Infection-style process group Membership protocol - membership list (non-faulty processes)
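The "infection-style" dissemination named above can be sketched as a small simulation. This is only an illustration under stated assumptions: push gossip in synchronous rounds, where every informed node forwards the update to one uniformly random peer per round. The function name `gossip_rounds` and its parameters are hypothetical and do not come from the talk.

```python
import random


def gossip_rounds(n, rng, max_rounds=1000):
    """Count rounds until an update that starts at node 0 reaches all n nodes.

    Assumption: each informed node pushes to one random peer per round.
    Returns None if the update has not spread everywhere within max_rounds.
    """
    informed = {0}
    for rnd in range(1, max_rounds + 1):
        newly_informed = set()
        for _node in informed:
            # The chosen peer may already be informed (or be the sender itself);
            # such wasted messages are part of what makes gossip robust.
            newly_informed.add(rng.randrange(n))
        informed |= newly_informed
        if len(informed) == n:
            return rnd
    return None


rounds = gossip_rounds(16, random.Random(42))
print(f"all 16 nodes informed after {rounds} rounds")
```

Because the informed set can at most double per round, 16 nodes need at least 4 rounds; in practice the count grows roughly logarithmically with the cluster size.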

Page 67: Desert Code Camp 2016.1 - Stateful Distributed Systems

SWIM Protocol

Failure detection

1. Ping a random process from the member list
   A. Receives ack: great, end
   B. Does not receive ack: go to 2
2. Send a ping request to k processes from the member list
3. The k processes try to ping the “failed” process
   A. The k processes report the status back to the original process

Dissemination

Multicast failure (and voluntary leave) updates

Multicast new members
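The failure-detection steps above can be sketched in a few lines. This is a minimal in-memory illustration, not the protocol's reference implementation: `swim_probe` and `can_reach` are hypothetical names, `can_reach(src, dst)` stands in for "src pinged dst and received an ack before a timeout", and the probe target is passed in by the caller (rather than chosen at random) so the example is reproducible.

```python
import random


def swim_probe(self_id, target, members, k, can_reach, rng=random):
    """Run one failure-detection round from self_id against target."""
    # Step 1: direct ping.
    if can_reach(self_id, target):
        return "alive"
    # Step 2: pick up to k other members and send them a ping request.
    candidates = [m for m in members if m not in (self_id, target)]
    helpers = rng.sample(candidates, min(k, len(candidates)))
    # Step 3: each helper pings the target and reports back to self_id.
    for helper in helpers:
        if can_reach(self_id, helper) and can_reach(helper, target):
            return "alive"
    return "failed"


members = ["A", "B", "C", "D"]
rng = random.Random(0)

# D is down: nobody can reach it, so the indirect probes fail as well.
d_is_down = lambda src, dst: dst != "D"
result_down = swim_probe("A", "D", members, 2, d_is_down, rng)

# Only the A -> C link is broken: another member can still vouch for C.
broken_link = lambda src, dst: (src, dst) != ("A", "C")
result_link = swim_probe("A", "C", members, 2, broken_link, rng)

print(result_down, result_link)
```

The indirect probe is what separates "the target is down" from "my link to the target is down", which keeps a single bad network path from triggering a false failure report.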

Page 78: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Gossip protocols

consensus

Page 80: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

[Diagram: CLIENT TIER, service, Storage TIER]

Page 82: Desert Code Camp 2016.1 - Stateful Distributed Systems

http://zookeeper.apache.org/

Page 83: Desert Code Camp 2016.1 - Stateful Distributed Systems

Paxos

Proposers, Acceptors, Learners

Proposers:

Ask Acceptors to approve proposals

Ask Acceptors to accept a proposal & version

Acceptors:

Do not have to approve or accept

Only approve/accept what has been proposed

Learners:

A proposed value is chosen when a majority of Acceptors have accepted the value

Page 93: Desert Code Camp 2016.1 - Stateful Distributed Systems

Paxos

Proposers, Acceptors, Learners

Phase 1:

1) Proposers propose a value
2) Acceptors approve the value
   1) do not approve or accept lower versions
   2) send back the highest (accepted) version

Phase 2:

1) Proposer receives approval from a majority
2) Sends accept! to acceptors with the highest version

Phase 3:

1) Acceptors notify all learners
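The phases above can be sketched as single-decree Paxos running in one process. This is a simplified illustration under several assumptions: no message loss, no crashes, and direct method calls instead of network messages. The slide's "approve" corresponds to the acceptor's promise and "version" to the proposal number; the names `Acceptor` and `propose` are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1       # highest version this acceptor has approved
        self.accepted_n = -1     # version of the last accepted proposal
        self.accepted_v = None   # value of the last accepted proposal

    def prepare(self, n):
        """Phase 1: approve version n unless a higher one was approved."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, None, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher version has been approved."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_v = value
            return True
        return False


def propose(acceptors, n, value):
    """Try to get a value chosen; returns the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    # Phase 1: ask every acceptor to approve version n.
    replies = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in replies if ok]
    if len(granted) < majority:
        return None
    # If any acceptor already accepted a value, adopt the one with the
    # highest version instead of our own.
    prev_n, prev_v = max(granted, key=lambda pair: pair[0])
    if prev_n >= 0:
        value = prev_v
    # Phase 2: ask acceptors to accept (n, value).
    accepted = sum(a.accept(n, value) for a in acceptors)
    # Phase 3: with a majority the value is chosen; learners would be told here.
    return value if accepted >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "x")    # chosen: "x"
second = propose(acceptors, 2, "y")   # must re-propose the already chosen "x"
stale = propose(acceptors, 1, "z")    # version too low, rejected
print(first, second, stale)
```

The second call shows the safety property: once a value is chosen, a later proposer with a higher version can only get that same value chosen again.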

Page 104: Desert Code Camp 2016.1 - Stateful Distributed Systems
Page 105: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Gossip protocols

consensus

consistent hashing

Page 107: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

[Diagram: CLIENT TIER, service, Storage TIER]

Page 110: Desert Code Camp 2016.1 - Stateful Distributed Systems

Consistent Hashing

Distributed Hash Tables

Convenient, fast O(1)

Great for lookups given a key

Issue

Horizontal scaling

Fault tolerance

hash(object) (mod n), where n = number of slots

Remapping keys to more slots
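The remapping issue with hash(object) (mod n) can be made concrete with a short experiment. This is a sketch under assumptions: MD5 is used only as a convenient, stable hash, and the helper names `slot` and `moved_fraction` are hypothetical.

```python
import hashlib


def slot(key: str, n: int) -> int:
    """Map a key to one of n slots with hash(key) mod n."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose slot changes when n goes from old_n to new_n."""
    moved = sum(1 for k in keys if slot(k, old_n) != slot(k, new_n))
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys changed slots when n went from 4 to 5")
```

A key keeps its slot only when its hash gives the same remainder mod 4 and mod 5, which happens for 4 of every 20 hash values, so about 80% of keys are expected to move after adding a single slot.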

Page 119: Desert Code Camp 2016.1 - Stateful Distributed Systems

Consistent Hashing

Distributed Hash Tables

Convenient, fast O(1)

Great for key/value lookups

Solution

Consistent hashing

During a resize, only about K / n keys are remapped (K = number of keys, n = number of slots)
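A hash ring is one common way to get the K / n behaviour described above. The following is a minimal sketch, assuming MD5 as the hash and 100 virtual points per node to even out the load; the class `HashRing` and the node names are hypothetical and are not taken from the talk.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted positions on the ring
        self._owner = {}    # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place vnodes points for this node on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        """Return the node owning the first point clockwise from the key.

        Assumes the ring holds at least one node.
        """
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"key-{i}" for i in range(10_000)]
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
before = {k: ring.lookup(k) for k in keys}
ring.add("node-e")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved, all to node-e")
```

Adding a node only takes keys away from existing nodes and hands them to the new one; no key moves between two old nodes. With five nodes, roughly one fifth of the keys are expected to move.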

Page 122: Desert Code Camp 2016.1 - Stateful Distributed Systems

Consistent Hashing

Page 123: Desert Code Camp 2016.1 - Stateful Distributed Systems

Challenge Accepted!

Page 124: Desert Code Camp 2016.1 - Stateful Distributed Systems

Stateful Processing

Challenges

result composition

work distribution

code deployments

unbounded data structures

memory management

persistence strategies

concurrency

Page 132: Desert Code Camp 2016.1 - Stateful Distributed Systems
Page 133: Desert Code Camp 2016.1 - Stateful Distributed Systems

Key Takeaways

Stateful and stateless systems co-exist

Process lifecycle vs. data lifecycle

Stateful services have their place

Interesting opportunities

But not all is rosy

Page 139: Desert Code Camp 2016.1 - Stateful Distributed Systems

Papers We Love

Dynamo: Amazon’s Highly Available Key-value Store - http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

The Chubby lock service for loosely-coupled distributed systems - research.google.com/archive/chubby-osdi06.pdf

Paxos Made Simple - http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf

Time, Clocks, and the Ordering of Events in a Distributed System - http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf

The Google File System - http://research.google.com/archive/gfs-sosp2003.pdf

Page 140: Desert Code Camp 2016.1 - Stateful Distributed Systems

Resources

Caitie McCaffrey’s talks - https://goo.gl/8MSdRz

Apache Foundation - http://apache.org/

graphite - https://graphiteapp.org/

elastic - https://www.elastic.co/

Page 141: Desert Code Camp 2016.1 - Stateful Distributed Systems

Thanks! Q&A
