aws summit benelux 2013 - architecting for high availability

ARCHITECTING FOR HIGH AVAILABILITY Carlos Conde Sr. Mgr. Solutions Architecture

Upload: amazon-web-services

Post on 16-Apr-2017

Category:

Technology


TRANSCRIPT

Page 1: AWS Summit Benelux 2013 - Architecting for High Availability

ARCHITECTING FOR HIGH AVAILABILITY

Carlos Conde Sr. Mgr. Solutions Architecture

Page 2: AWS Summit Benelux 2013 - Architecting for High Availability

“LET’S BUILD A ________ WEB APPLICATION”

Page 3: AWS Summit Benelux 2013 - Architecting for High Availability

“LET’S BUILD A HIGHLY AVAILABLE ________ WEB APPLICATION”

Page 4: AWS Summit Benelux 2013 - Architecting for High Availability

“LET’S BUILD A HIGHLY AVAILABLE AND SCALABLE ________ WEB APPLICATION”

Page 5: AWS Summit Benelux 2013 - Architecting for High Availability

“LET’S BUILD A HIGHLY AVAILABLE, DURABLE AND SCALABLE ________ WEB APPLICATION”

Page 6: AWS Summit Benelux 2013 - Architecting for High Availability

“LET’S BUILD A HIGHLY AVAILABLE, DURABLE, RESILIENT AND SCALABLE ________ WEB APPLICATION”

Page 7: AWS Summit Benelux 2013 - Architecting for High Availability

AWS BUILDING BLOCKS

Inherently Fault-Tolerant Services

Amazon S3

Amazon DynamoDB

Amazon CloudFront

Amazon SWF

Amazon SQS

Amazon SNS

Amazon SES

Amazon Route53

Elastic Load Balancing

AWS IAM

AWS Elastic Beanstalk

Amazon ElastiCache

Amazon EMR

Amazon Redshift

Amazon CloudSearch

Fault-Tolerant with the right architecture

Amazon EC2

Amazon EBS

Amazon RDS

Amazon VPC

Page 8: AWS Summit Benelux 2013 - Architecting for High Availability
Page 9: AWS Summit Benelux 2013 - Architecting for High Availability
Page 10: AWS Summit Benelux 2013 - Architecting for High Availability
Page 11: AWS Summit Benelux 2013 - Architecting for High Availability
Page 12: AWS Summit Benelux 2013 - Architecting for High Availability

1. DESIGN FOR FAILURE

2. USE MULTIPLE AZs

3. BUILD FOR SCALE

4. DECOUPLE COMPONENTS

Page 13: AWS Summit Benelux 2013 - Architecting for High Availability

« Everything fails all the time »

Werner Vogels

CTO of Amazon

Page 14: AWS Summit Benelux 2013 - Architecting for High Availability

YOUR GOAL

APPLICATIONS SHOULD CONTINUE TO FUNCTION

EVEN IF THE UNDERLYING PHYSICAL HARDWARE

FAILS OR IS REMOVED OR REPLACED

Page 15: AWS Summit Benelux 2013 - Architecting for High Availability

#1 DESIGN FOR FAILURE

Page 16: AWS Summit Benelux 2013 - Architecting for High Availability

AVOID SINGLE POINTS OF FAILURE

ASSUME EVERYTHING FAILS, AND WORK BACKWARDS

Page 17: AWS Summit Benelux 2013 - Architecting for High Availability


Page 18: AWS Summit Benelux 2013 - Architecting for High Availability
Page 19: AWS Summit Benelux 2013 - Architecting for High Availability
Page 20: AWS Summit Benelux 2013 - Architecting for High Availability
Page 21: AWS Summit Benelux 2013 - Architecting for High Availability
Page 22: AWS Summit Benelux 2013 - Architecting for High Availability
Page 23: AWS Summit Benelux 2013 - Architecting for High Availability

HEALTH CHECKS
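
A health check usually needs several consecutive probe results before it changes its verdict, so that one slow response does not take an instance out of service. The sketch below illustrates that idea only; the class name `HealthTracker` and its default thresholds are assumptions made for this example and are not taken from the slides or from any AWS API.

```python
from dataclasses import dataclass, field


@dataclass
class HealthTracker:
    """Tracks one instance's health from a stream of probe results (hypothetical helper)."""
    healthy_threshold: int = 2      # consecutive successes needed to become healthy
    unhealthy_threshold: int = 2    # consecutive failures needed to become unhealthy
    healthy: bool = True
    _streak: int = field(default=0, init=False, repr=False)

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current verdict."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state: reset the counter.
            self._streak = 0
            return self.healthy
        # The probe contradicts the current state: count it.
        self._streak += 1
        needed = self.healthy_threshold if probe_ok else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy
```

With the defaults, a single failed probe leaves the instance in service; two failures in a row mark it unhealthy, and two successes in a row bring it back.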

Page 24: AWS Summit Benelux 2013 - Architecting for High Availability
Page 25: AWS Summit Benelux 2013 - Architecting for High Availability
Page 26: AWS Summit Benelux 2013 - Architecting for High Availability
Page 27: AWS Summit Benelux 2013 - Architecting for High Availability
Page 28: AWS Summit Benelux 2013 - Architecting for High Availability
Page 29: AWS Summit Benelux 2013 - Architecting for High Availability

#2 USE MULTIPLE AVAILABILITY ZONES
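
One way to reason about multi-AZ capacity is to size each zone so that the remaining zones can still carry the full load when one zone is lost. The function below is a minimal sketch of that arithmetic; the name `instances_per_zone` and the "survive the loss of exactly one zone" assumption are ours, not the speaker's.

```python
import math


def instances_per_zone(required: int, zones: int) -> int:
    """Instances to run in each zone so that losing any one zone still leaves
    at least `required` instances (assumes identical, interchangeable instances)."""
    if zones < 2:
        raise ValueError("need at least two zones to tolerate a zone failure")
    return math.ceil(required / (zones - 1))
```

For a load that needs 4 instances, two zones require 4 instances each, while three zones require only 2 each, which is why spreading over more zones reduces the cost of the spare capacity.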

Page 30: AWS Summit Benelux 2013 - Architecting for High Availability

US-WEST (N. California)

EU-WEST (Ireland)

ASIA PAC (Tokyo)

ASIA PAC (Singapore)

US-WEST (Oregon)

SOUTH AMERICA (São Paulo)

US-EAST (Virginia)

GOV CLOUD

ASIA PAC (Sydney)

Page 31: AWS Summit Benelux 2013 - Architecting for High Availability
Page 32: AWS Summit Benelux 2013 - Architecting for High Availability

AMAZON RDS MULTI-AZ

Page 33: AWS Summit Benelux 2013 - Architecting for High Availability
Page 34: AWS Summit Benelux 2013 - Architecting for High Availability
Page 35: AWS Summit Benelux 2013 - Architecting for High Availability
Page 36: AWS Summit Benelux 2013 - Architecting for High Availability
Page 37: AWS Summit Benelux 2013 - Architecting for High Availability
Page 38: AWS Summit Benelux 2013 - Architecting for High Availability
Page 39: AWS Summit Benelux 2013 - Architecting for High Availability
Page 40: AWS Summit Benelux 2013 - Architecting for High Availability

#3 BUILD FOR SCALE

Page 41: AWS Summit Benelux 2013 - Architecting for High Availability

AMAZON CLOUDWATCH

MONITORING FOR AWS RESOURCES

Page 42: AWS Summit Benelux 2013 - Architecting for High Availability
Page 43: AWS Summit Benelux 2013 - Architecting for High Availability
Page 44: AWS Summit Benelux 2013 - Architecting for High Availability

AUTO SCALING

SCALE UP/DOWN EC2 CAPACITY
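
The scaling decision driven by a monitored metric can be sketched as a small pure function. This is an illustration of threshold-based scaling in general, not the Auto Scaling API; the function name, the CPU thresholds (70 % and 30 %) and the step size are assumptions chosen for the example.

```python
def desired_capacity(current: int, avg_cpu: float, min_size: int, max_size: int,
                     scale_out_above: float = 70.0, scale_in_below: float = 30.0,
                     step: int = 1) -> int:
    """Return the new instance count for a group, given its average CPU in percent."""
    if avg_cpu > scale_out_above:
        target = current + step      # busy: add capacity
    elif avg_cpu < scale_in_below:
        target = current - step      # idle: remove capacity
    else:
        target = current             # within the comfort band: do nothing
    # Never leave the configured group bounds.
    return max(min_size, min(max_size, target))
```

The gap between the two thresholds keeps the group from oscillating when the metric hovers around a single value.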

Page 45: AWS Summit Benelux 2013 - Architecting for High Availability
Page 46: AWS Summit Benelux 2013 - Architecting for High Availability
Page 47: AWS Summit Benelux 2013 - Architecting for High Availability
Page 48: AWS Summit Benelux 2013 - Architecting for High Availability
Page 49: AWS Summit Benelux 2013 - Architecting for High Availability
Page 50: AWS Summit Benelux 2013 - Architecting for High Availability
Page 51: AWS Summit Benelux 2013 - Architecting for High Availability
Page 52: AWS Summit Benelux 2013 - Architecting for High Availability
Page 53: AWS Summit Benelux 2013 - Architecting for High Availability
Page 54: AWS Summit Benelux 2013 - Architecting for High Availability

HEALTH CHECKS + AUTO SCALING

Page 55: AWS Summit Benelux 2013 - Architecting for High Availability
Page 56: AWS Summit Benelux 2013 - Architecting for High Availability
Page 57: AWS Summit Benelux 2013 - Architecting for High Availability
Page 58: AWS Summit Benelux 2013 - Architecting for High Availability
Page 59: AWS Summit Benelux 2013 - Architecting for High Availability
Page 60: AWS Summit Benelux 2013 - Architecting for High Availability

HEALTH CHECKS + AUTO SCALING = SELF-HEALING
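
Self-healing can be pictured as a reconciliation step: drop the instances that fail their health check, then launch replacements until the desired count is reached again. The sketch below simulates this in memory; `reconcile` and the `launch` callback are hypothetical names, and no real instances are started.

```python
from typing import Callable, Dict


def reconcile(fleet: Dict[str, bool], desired: int,
              launch: Callable[[], str]) -> Dict[str, bool]:
    """Return a fleet in which unhealthy instances are replaced.

    `fleet` maps instance id -> health verdict; `launch` returns the id of a
    freshly started (assumed healthy) instance.
    """
    survivors = [iid for iid, healthy in fleet.items() if healthy]
    missing = max(0, desired - len(survivors))
    replacements = [launch() for _ in range(missing)]
    return {iid: True for iid in survivors + replacements}
```

Running this step on every health-check cycle keeps the group at its desired size without an operator having to intervene.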

Page 61: AWS Summit Benelux 2013 - Architecting for High Availability
Page 62: AWS Summit Benelux 2013 - Architecting for High Availability

#4 DECOUPLE COMPONENTS

Page 63: AWS Summit Benelux 2013 - Architecting for High Availability

BUILD LOOSELY COUPLED SYSTEMS

The looser they are coupled, the bigger they scale, the more fault tolerant they get…

Page 64: AWS Summit Benelux 2013 - Architecting for High Availability
Page 65: AWS Summit Benelux 2013 - Architecting for High Availability
Page 66: AWS Summit Benelux 2013 - Architecting for High Availability

RECEIVE -> TRANSCODE -> PUBLISH & NOTIFY

Page 67: AWS Summit Benelux 2013 - Architecting for High Availability

AMAZON SQS

SIMPLE QUEUE SERVICE
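
Placing a queue between stages means each stage only knows about its inbox and outbox, never about its neighbours. The sketch below uses in-memory `deque` objects as stand-ins for queues; it does not call Amazon SQS, and the helper names `drain`, `transcode` and `publish` are assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque


def drain(inbox: Deque[str], outbox: Deque[str], handler: Callable[[str], str]) -> None:
    """Process every message currently in `inbox` and enqueue the results on `outbox`."""
    while inbox:
        outbox.append(handler(inbox.popleft()))


def transcode(name: str) -> str:
    """Pretend to transcode a file by swapping its extension for .mp4."""
    return name.rsplit(".", 1)[0] + ".mp4"


def publish(name: str) -> str:
    """Pretend to publish a file and notify subscribers."""
    return f"published:{name}"
```

Because the stages share nothing but queues, a stopped transcoder simply lets messages accumulate in its inbox; the receiving stage keeps accepting uploads in the meantime.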

Page 68: AWS Summit Benelux 2013 - Architecting for High Availability


Page 69: AWS Summit Benelux 2013 - Architecting for High Availability


Page 70: AWS Summit Benelux 2013 - Architecting for High Availability

RECEIVE

PUBLISH & NOTIFY

Page 71: AWS Summit Benelux 2013 - Architecting for High Availability


Page 72: AWS Summit Benelux 2013 - Architecting for High Availability
Page 73: AWS Summit Benelux 2013 - Architecting for High Availability
Page 74: AWS Summit Benelux 2013 - Architecting for High Availability
Page 75: AWS Summit Benelux 2013 - Architecting for High Availability
Page 76: AWS Summit Benelux 2013 - Architecting for High Availability
Page 77: AWS Summit Benelux 2013 - Architecting for High Availability
Page 78: AWS Summit Benelux 2013 - Architecting for High Availability
Page 79: AWS Summit Benelux 2013 - Architecting for High Availability

ARCHITECTURE DESIGN PATTERN

Page 80: AWS Summit Benelux 2013 - Architecting for High Availability
Page 81: AWS Summit Benelux 2013 - Architecting for High Availability
Page 82: AWS Summit Benelux 2013 - Architecting for High Availability
Page 83: AWS Summit Benelux 2013 - Architecting for High Availability
Page 84: AWS Summit Benelux 2013 - Architecting for High Availability

SQS VISIBILITY TIMEOUT
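
The visibility timeout hides a received message from other consumers for a fixed period; if the consumer does not delete it in time (for example because it crashed), the message becomes visible again and another consumer can pick it up. The class below is an in-memory simulation of that behaviour, not the SQS client: `VisibilityQueue` is a hypothetical name, message ids stand in for receipt handles, and the clock is passed in explicitly so the example is deterministic.

```python
from typing import Dict, List, Optional, Tuple


class VisibilityQueue:
    """Toy queue with a visibility timeout (simulation only)."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._messages: Dict[int, List] = {}   # id -> [body, invisible_until]
        self._next_id = 0

    def send(self, body: str) -> None:
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now: float) -> Optional[Tuple[int, str]]:
        """Return the first visible message and hide it until now + timeout."""
        for mid, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return mid, entry[0]
        return None

    def delete(self, mid: int) -> None:
        """Acknowledge successful processing; the message is gone for good."""
        self._messages.pop(mid, None)
```

A worker that finishes its job calls `delete`; a worker that dies never does, so the job reappears after the timeout and is not lost.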

Page 85: AWS Summit Benelux 2013 - Architecting for High Availability
Page 86: AWS Summit Benelux 2013 - Architecting for High Availability
Page 87: AWS Summit Benelux 2013 - Architecting for High Availability
Page 88: AWS Summit Benelux 2013 - Architecting for High Availability
Page 89: AWS Summit Benelux 2013 - Architecting for High Availability

BUFFERING

Page 90: AWS Summit Benelux 2013 - Architecting for High Availability
Page 91: AWS Summit Benelux 2013 - Architecting for High Availability
Page 92: AWS Summit Benelux 2013 - Architecting for High Availability
Page 93: AWS Summit Benelux 2013 - Architecting for High Availability
Page 94: AWS Summit Benelux 2013 - Architecting for High Availability
Page 95: AWS Summit Benelux 2013 - Architecting for High Availability
Page 96: AWS Summit Benelux 2013 - Architecting for High Availability

CLOUDWATCH METRICS FOR AMAZON SQS + AUTO SCALING
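
When the queue depth is the scaling signal, the worker count can follow the backlog. The sketch below assumes the backlog figure comes from a queue-depth metric such as SQS's `ApproximateNumberOfMessagesVisible`; the function name, the per-worker throughput and the bounds are illustrative assumptions.

```python
import math


def workers_for_backlog(visible_messages: int, messages_per_worker: int,
                        min_workers: int = 1, max_workers: int = 10) -> int:
    """Workers needed so that each handles at most `messages_per_worker` queued messages."""
    if messages_per_worker <= 0:
        raise ValueError("messages_per_worker must be positive")
    wanted = math.ceil(visible_messages / messages_per_worker)
    # Clamp to the group bounds so an empty queue keeps a minimum and a flood stays capped.
    return max(min_workers, min(max_workers, wanted))
```

The queue absorbs bursts while new workers start, so capacity can lag behind demand without requests being dropped.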

Page 97: AWS Summit Benelux 2013 - Architecting for High Availability
Page 98: AWS Summit Benelux 2013 - Architecting for High Availability
Page 99: AWS Summit Benelux 2013 - Architecting for High Availability
Page 100: AWS Summit Benelux 2013 - Architecting for High Availability
Page 101: AWS Summit Benelux 2013 - Architecting for High Availability

RECEIVE -> TRANSCODE -> PUBLISH & NOTIFY

Page 102: AWS Summit Benelux 2013 - Architecting for High Availability
Page 103: AWS Summit Benelux 2013 - Architecting for High Availability


Page 104: AWS Summit Benelux 2013 - Architecting for High Availability

[Flowchart. Steps: START, CHECK IMAGE, RESIZE IMAGE, CAT CHECK, TRANSCODE, PUBLISH & NOTIFY, STOP, REJECT. Decisions: TOO BIG?, CAT? Branch labels: YES, NO, NO, “OMG, IT’S A CAT!”]

Page 105: AWS Summit Benelux 2013 - Architecting for High Availability

[Flowchart. Steps: START, CHECK IMAGE, RESIZE IMAGE, CAT CHECK, TRANSCODE, PUBLISH & NOTIFY, STOP, REJECT. Decisions: TOO BIG?, CAT? Branch labels: YES, NO, YES, NO]

Page 106: AWS Summit Benelux 2013 - Architecting for High Availability


Page 107: AWS Summit Benelux 2013 - Architecting for High Availability

TASKS

DECISIONS

HISTORY

Page 108: AWS Summit Benelux 2013 - Architecting for High Availability

TASKS

DECISIONS

HISTORY

STATELESS!

Page 109: AWS Summit Benelux 2013 - Architecting for High Availability

STATELESS SCALES HORIZONTALLY

Page 110: AWS Summit Benelux 2013 - Architecting for High Availability

AMAZON SWF ENABLES RESILIENT, SCALABLE, DISTRIBUTED WORKFLOWS

Page 111: AWS Summit Benelux 2013 - Architecting for High Availability

WORKFLOW ACTORS

Page 112: AWS Summit Benelux 2013 - Architecting for High Availability

DECIDERS

COORDINATION LOGIC

1. Poll for work on a decision list (long polling: 60 seconds)

2. Evaluate the workflow execution history (SWF sends the full history in JSON format)

3. Return a decision to Amazon SWF (usually scheduling another task)
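
The three decider steps above boil down to a function from history to decision. The sketch below shows that shape with a deliberately simplified history: each event is a dict with an `eventType` and, as an assumption made for this example, an `activity` name (real SWF history events are richer and link completions back to the scheduling event). `STEPS` and `decide` are hypothetical names.

```python
from typing import Dict, List

# Hypothetical linear workflow used for illustration.
STEPS = ["check_image", "cat_check", "transcode", "publish_and_notify"]


def decide(history: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the next decision given the full execution history.

    The decider keeps no state of its own: everything it needs is in `history`.
    """
    done = [e["activity"] for e in history
            if e["eventType"] == "ActivityTaskCompleted"]
    remaining = [step for step in STEPS if step not in done]
    if not remaining:
        return {"decisionType": "CompleteWorkflowExecution"}
    # A failed or timed-out activity is not in `done`, so it is scheduled again.
    return {"decisionType": "ScheduleActivityTask", "activity": remaining[0]}
```

Because the function is pure, any decider process can handle any decision task, which is what lets deciders scale horizontally.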

Page 113: AWS Summit Benelux 2013 - Architecting for High Availability

WORKERS

EXECUTION LOGIC

1. Poll for work on a specific task list (long polling: 60 seconds)

2. Execute the work and send heartbeats (SWF sends input data from deciders)

3. Return success / failure (detailed data can be provided to deciders)
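
The worker side of the steps above can be sketched as a function that processes its input in chunks, reports progress through a heartbeat callback, and returns either a result or a failure reason. Polling is left out, and `run_activity`, the `heartbeat` callback and the result dict layout are assumptions for this example rather than the SWF API.

```python
from typing import Any, Callable, Dict, List


def run_activity(items: List[Any], process: Callable[[Any], Any],
                 heartbeat: Callable[[str], None], every: int = 2) -> Dict[str, Any]:
    """Process `items` one by one, sending a heartbeat after every `every` items."""
    try:
        results = []
        for done, item in enumerate(items, start=1):
            results.append(process(item))
            if done % every == 0:
                heartbeat(f"{done}/{len(items)}")   # progress detail for the service
        return {"status": "completed", "result": results}
    except Exception as exc:
        # Report the failure instead of crashing, so a decider can react to it.
        return {"status": "failed", "reason": str(exc)}
```

Returning a structured failure gives the decider enough detail to choose between a retry, a mitigation step or a cleanup task.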

Page 114: AWS Summit Benelux 2013 - Architecting for High Availability

SWF IS WATCHING

TRACKING:

Execution tracking: time to start, time to finish, …

Time to finish for overall workflow

Timeouts controlled for each of these (and more)

Heartbeats for long-running activities (optional)

Decider is informed of timeouts: schedule retries, “mitigation” strategies or cleanup tasks

Page 115: AWS Summit Benelux 2013 - Architecting for High Availability

NO NEW LANGUAGE TO LEARN

YOUR CODE IS YOUR WORKFLOW LANGUAGE

AMAZON SWF MAINTAINS STATE

Page 116: AWS Summit Benelux 2013 - Architecting for High Availability
Page 117: AWS Summit Benelux 2013 - Architecting for High Availability
Page 118: AWS Summit Benelux 2013 - Architecting for High Availability

ALL HORIZONTAL SCALING PATTERNS APPLY

Page 119: AWS Summit Benelux 2013 - Architecting for High Availability
Page 120: AWS Summit Benelux 2013 - Architecting for High Availability
Page 121: AWS Summit Benelux 2013 - Architecting for High Availability
Page 122: AWS Summit Benelux 2013 - Architecting for High Availability
Page 123: AWS Summit Benelux 2013 - Architecting for High Availability
Page 124: AWS Summit Benelux 2013 - Architecting for High Availability
Page 125: AWS Summit Benelux 2013 - Architecting for High Availability

CHAINED TASKS WITHOUT DECISIONS?

USE AMAZON SQS

RECEIVE -> TRANSCODE -> PUBLISH & NOTIFY

Page 126: AWS Summit Benelux 2013 - Architecting for High Availability

TASK GRAPH WITH DECISIONS?

USE AMAZON SWF

[Task graph. Nodes: RECEIVE DATA, SANITY CHECK, CHECK FORMAT, ADJUST FORMAT, TRANSCODE, PUBLISH & NOTIFY, REJECT. Edge labels: GOOD, LONG, OK, SPAM]

Page 127: AWS Summit Benelux 2013 - Architecting for High Availability

1. DESIGN FOR FAILURE

2. USE MULTIPLE AZs

3. BUILD FOR SCALE

4. DECOUPLE COMPONENTS

Page 128: AWS Summit Benelux 2013 - Architecting for High Availability

YOUR GOAL

APPLICATIONS SHOULD CONTINUE TO FUNCTION

EVEN IF THE UNDERLYING PHYSICAL HARDWARE

FAILS OR IS REMOVED OR REPLACED

Page 129: AWS Summit Benelux 2013 - Architecting for High Availability

AWS ARCHITECTURE CENTER http://aws.amazon.com/architecture

AWS TECHNICAL ARTICLES http://aws.amazon.com/articles

AWS BLOG http://aws.typepad.com

AWS PODCAST http://aws.amazon.com/podcast

Page 130: AWS Summit Benelux 2013 - Architecting for High Availability
Page 131: AWS Summit Benelux 2013 - Architecting for High Availability

THANK YOU!