AWS Summit Berlin 2013 - Building Web Scale Applications with AWS
Ryan Shuttleworth, Technical Evangelist
Building Web-Scale Applications with AWS
What are we going to cover?
What’s a web scale application?
Three principles to build upon
Layering the cake: Data, Application
Total Jobs Group – their story
What do web scale apps have in common?
[Chart: demand over time, plotting actual demand against predicted demand; the gaps between them are labeled "Customer dissatisfaction" and "Waste"]
Elastic capacity: no need to guess capacity requirements and over-provision
[Chart: elastic capacity plotted against demand over time]
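The two charts can be put into numbers with a small worked example. This is only a sketch: the demand figures, the fixed capacity of 70 and the `provisioning_gap` helper are made up for illustration, and the elastic case assumes capacity tracks demand exactly.

```python
def provisioning_gap(demand, capacity):
    """Return (waste, unmet) totals for paired demand and capacity series."""
    waste = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return waste, unmet


# Made-up demand figures, in servers needed per time slot.
demand = [20, 35, 80, 120, 60, 30]

# Fixed capacity sized from a prediction of 70 servers.
fixed = provisioning_gap(demand, [70] * len(demand))

# Elastic capacity that tracks demand exactly (an idealised assumption).
elastic = provisioning_gap(demand, demand)
```

With these numbers the fixed fleet wastes 135 server-slots and falls 60 short, while the idealised elastic fleet does neither.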
Built on a global footprint
9 Regions
25 Availability Zones
Continuous Expansion
Built across regional availability zones
Relational Database Service: Database-as-a-Service
No need to install or manage database instances
Scalable and fault tolerant configurations
DynamoDB: Provisioned throughput NoSQL database
Fast, predictable performance
Fully distributed, fault tolerant architecture
Use RDS for databases
Use DynamoDB for high performance key-value DB
Architected using services
Amazon SQS: Reliable, highly scalable queue service for storing messages as they travel between instances
[Diagram labels: task/processing trigger, Amazon SQS, Processing, Processing results]
Simple Workflow: Reliably coordinate processing steps across applications
Integrate AWS and non-AWS resources
Manage distributed state in complex systems
[Diagram: Task A, Task B (auto-scaling) and Task C, with steps numbered 1 to 3]
Push inter-process workflows into the cloud with SWF
Reliable message queuing without additional software
Architected using services
CloudSearch: Elastic search engine based upon the Amazon A9 search engine
Fully managed service with sophisticated feature set
Scales automatically
[Diagram labels: Document Server, Search Server, Results]
Don’t install search software, use CloudSearch
Process large volumes of data cost effectively with EMR
Elastic MapReduce: Elastic Hadoop cluster
Integrates with S3 & DynamoDB
Leverage Hive & Pig analytics scripts
Integrates with instance types such as spot
Architected using services
Three principles to build upon…
1. Scale: Elasticity, State, Data
2. Security: Inherent, VPC, Groups
3. Failure: Expected, Automation, Testing
Layering the cake
Data
Web scale data
Object storage
You put it in S3; AWS stores it with 99.999999999% durability
Highly scalable web access to objects
Multiple redundant copies in a region
What is S3?
Highly scalable data storage
Access via APIs
A web store, not a file system
Fast
Highly available & durable
Economical
Case Study
Web scale data
Relational data
Master/Slave Horizontal Scaling
Reasonably simple to implement
Leverage PIOPS for raw performance
Easy to change instance sizes
Has an upper limit
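Master/slave scaling usually means sending writes to the master and spreading reads over the replicas. The sketch below illustrates that split under stated assumptions: `ReplicatedDatabase`, the connection names and the "starts with SELECT" test are all hypothetical, not part of the talk or of any AWS API.

```python
import itertools


class ReplicatedDatabase:
    """Hypothetical router: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas configured, every statement falls back to the master.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def connection_for(self, statement):
        # Naive read detection, good enough for an illustration.
        is_read = statement.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


db = ReplicatedDatabase("master", ["replica-1", "replica-2"])
targets = [
    db.connection_for(sql)
    for sql in ("INSERT INTO jobs VALUES (1)", "SELECT * FROM jobs", "SELECT 1")
]
```

Every write still lands on the single master, which is where the upper limit comes from.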
Sharded Horizontal Scaling
More complex at the application layer
No practical limit on scalability
Operational complexity/sophistication
Shard by function or key space
RDBMS or NoSQL
[Diagram: hash ring with nodes A, B, C, D]
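Sharding by key space over a hash ring with nodes A to D can be sketched as a consistent-hash lookup. This is a minimal illustration rather than the speaker's implementation: the `HashRing` class, the use of MD5 and the virtual-node count are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Map any string to a point on the ring (MD5 is an arbitrary choice here).
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring mapping keys to shard names."""

    def __init__(self, shards, vnodes=64):
        # Each shard gets several virtual points to even out the key space.
        self._points = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def shard_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._points)
        return self._points[idx][1]


ring = HashRing(["A", "B", "C", "D"])
placement = {key: ring.shard_for(key) for key in ("user:1", "user:2", "user:3")}
```

A useful property of this layout is that adding a fifth shard only moves keys onto the new shard; keys never shuffle between the existing four.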
Web scale data
NoSQL
Horizontal Scaling - Fully Managed
DynamoDB: Provisioned throughput NoSQL database
Fast, predictable performance
Fully distributed, fault tolerant architecture
Considerations for non-uniform data
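Why non-uniform data needs consideration can be shown with a toy simulation. It assumes, purely for illustration, that items are placed on partitions by a hash of their key; the transcript does not state the actual placement rule, and `partition_for`, `load_per_partition` and the request mix are all made up.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partitions: int) -> int:
    # Assumed placement rule: hash of the key modulo the partition count.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def load_per_partition(requests, partitions):
    # Count how many requests land on each partition.
    counts = Counter(partition_for(key, partitions) for key in requests)
    return [counts.get(p, 0) for p in range(partitions)]


PARTITIONS = 10
uniform = [f"user:{i}" for i in range(1000)]                      # each key hit once
skewed = ["user:0"] * 900 + [f"user:{i}" for i in range(1, 101)]  # one hot key

uniform_load = load_per_partition(uniform, PARTITIONS)
skewed_load = load_per_partition(skewed, PARTITIONS)
```

Both workloads issue 1,000 requests, but in the skewed case at least 900 of them hit a single partition while the other partitions sit mostly idle.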
DynamoDB: Provisioned read/write performance per table
Predictable high performance scaled via console or API
Dial it up
[Illustrative diagrams only: within a region, a table is spread over more SSD-backed partitions as provisioned throughput rises: 1 partition shown at low, 10 at increased and 40 at high provisioned throughput]
Application
Loose coupling sets you free!
The looser they're coupled, the bigger they scale
Independent components
Design everything as a black box
Decouple interactions
Load-balance clusters
Amazon SQS: Reliable, highly scalable queue service for storing messages as they travel between instances
[Diagram labels: task/processing trigger, Amazon SQS, Processing, Processing results]
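Decoupling through a queue can be sketched without any AWS dependency. In the example below Python's standard `queue.Queue` stands in for Amazon SQS; it is not the SQS API, and the task names and the `worker` function are invented for illustration.

```python
import queue
import threading

tasks = queue.Queue()      # stand-in for the task/processing trigger queue
results = queue.Queue()    # stand-in for the processing results queue


def worker():
    # Consumer: knows only about the queues, never about the producer.
    while True:
        message = tasks.get()
        if message is None:          # sentinel tells the worker to stop
            break
        results.put({"task": message, "result": message.upper()})


thread = threading.Thread(target=worker)
thread.start()

# Producer: enqueue work and move on without waiting for the consumer.
for name in ("resize", "transcode", "index"):
    tasks.put(name)
tasks.put(None)
thread.join()

processed = [results.get() for _ in range(3)]
```

Because producer and consumer share only the queues, either side can be replaced or scaled out without the other noticing.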
[Diagram: Controller A, Controller B and Controller C shown twice, first under Tight Coupling and then under Loose Coupling with queues (Q) added]
Auto Scaling: Automatic resizing of compute clusters based on demand
Trigger auto-scaling policy
Feature: Details
Control: Define minimum and maximum instance pool sizes and when scaling and cool down occur.
Integrated with Amazon CloudWatch: Use metrics gathered by CloudWatch to drive scaling.
Instance types: Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.
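The control features listed above (minimum and maximum pool size, metric-driven scaling, cool down) can be sketched as a single decision function. This is a simplified model, not the Auto Scaling API: `ScalingPolicy`, its field names and the thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy: names and thresholds are illustrative only."""
    min_size: int
    max_size: int
    scale_out_above: float   # metric value that triggers adding an instance
    scale_in_below: float    # metric value that triggers removing an instance
    cooldown_seconds: int


def desired_capacity(policy, current, metric, seconds_since_last_change):
    # During the cool down period, leave the pool alone.
    if seconds_since_last_change < policy.cooldown_seconds:
        return current
    if metric > policy.scale_out_above:
        current += 1
    elif metric < policy.scale_in_below:
        current -= 1
    # Never leave the configured minimum/maximum pool size.
    return max(policy.min_size, min(policy.max_size, current))


policy = ScalingPolicy(min_size=2, max_size=10, scale_out_above=70.0,
                       scale_in_below=30.0, cooldown_seconds=300)
scaled_out = desired_capacity(policy, current=4, metric=85.0,
                              seconds_since_last_change=600)
```

Here `metric` plays the role of a CloudWatch measurement such as average CPU utilisation.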
Where does state reside?
Browser cookies
Framework session handler
Session database
Memory session manager
State store should be:
Performant
Scalable
Reliable
Where should state reside?
[Diagram: the instances behind "Trigger auto-scaling policy" are marked "Not here"; state sits in a separate session state service]
State must reside OUTSIDE the scope of the elements you wish to scale
Where should state reside? Performant, Scalable, Reliable
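Keeping state outside the scaled tier can be sketched as follows. The `SessionStore` here is an in-memory stand-in for an external session state service, and `AppServer` is a hypothetical stateless web instance; neither name comes from the talk.

```python
import uuid


class SessionStore:
    """In-memory stand-in for an external session state service."""

    def __init__(self):
        self._sessions = {}

    def create(self, data):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id):
        return self._sessions.get(session_id)

    def save(self, session_id, data):
        self._sessions[session_id] = dict(data)


class AppServer:
    """Stateless server: keeps no session data of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def add_to_basket(self, session_id, item):
        session = self._store.load(session_id) or {"basket": []}
        session["basket"] = session["basket"] + [item]
        self._store.save(session_id, session)
        return session["basket"]


store = SessionStore()
session_id = store.create({"basket": []})
server_a = AppServer("a", store)
server_b = AppServer("b", store)

server_a.add_to_basket(session_id, "book")
del server_a                                   # instance removed by scale-in
basket = server_b.add_to_basket(session_id, "pen")
```

The basket survives the loss of the first server because neither server ever held it.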
Load Balancing
Elastic Load Balancing: Create highly scalable applications
Distribute load across EC2 instances in multiple availability zones
Feature: Details
Available: Load balance across instances in multiple Availability Zones
Health checks: Automatically checks health of instances and takes them in or out of service
Session stickiness: Route requests to the same instance
Secure sockets layer: Supports SSL offload from web and application servers with flexible cipher support
Monitoring: Publishes metrics to CloudWatch
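Distributing load while taking unhealthy instances out of service can be sketched with a round-robin balancer. This is a toy model rather than how Elastic Load Balancing is implemented: the `LoadBalancer` class, the instance names and the health probe are assumptions.

```python
import itertools


class LoadBalancer:
    """Hypothetical round-robin balancer that skips unhealthy instances."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._healthy = set(self._instances)
        self._cycle = itertools.cycle(self._instances)

    def health_check(self, is_healthy):
        # Take instances in or out of service based on a caller-supplied probe.
        self._healthy = {i for i in self._instances if is_healthy(i)}

    def route(self):
        if not self._healthy:
            raise RuntimeError("no healthy instances in service")
        # Advance round-robin until a healthy instance comes up.
        for instance in self._cycle:
            if instance in self._healthy:
                return instance


lb = LoadBalancer(["az1-web1", "az1-web2", "az2-web1"])
before = [lb.route() for _ in range(3)]
lb.health_check(lambda instance: instance != "az1-web2")
after = [lb.route() for _ in range(4)]
```

After the failed health check, requests keep flowing to the two remaining instances, one in each Availability Zone.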
Distribution
Route53: Global DNS service
[Diagram: a request arrives at Route53, with Region A measured at 16ms and Region B at 92ms; Route53 answers with the Region A DNS entry]
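The routing decision in the diagram, answering with the lowest-latency region, can be sketched in a few lines. The latency figures are the ones on the slide; the function names and endpoint hostnames are made up, and this is not the Route53 API.

```python
def lowest_latency_region(latencies_ms):
    """Return the region name with the smallest measured latency."""
    if not latencies_ms:
        raise ValueError("no regions to choose from")
    return min(latencies_ms, key=latencies_ms.get)


def resolve(latencies_ms, region_endpoints):
    # Answer the DNS query with the endpoint of the closest region.
    region = lowest_latency_region(latencies_ms)
    return region_endpoints[region]


# Latency figures are the ones shown on the slide; endpoint names are made up.
latencies = {"Region A": 16, "Region B": 92}
endpoints = {
    "Region A": "region-a.example.com",
    "Region B": "region-b.example.com",
}
answer = resolve(latencies, endpoints)
```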
CloudFront: World-wide content distribution network
[Diagram: edge locations in London, Paris and NY]
1. Single CNAME: www.mysite.com
2. Served from EC2: *.php
3. Served from S3: /images/*
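Serving one hostname from two origins comes down to matching the request path against patterns. The sketch below uses the two patterns from the slide; the origin names, the first-match-wins order and the default origin are assumptions, and this is not the CloudFront configuration format.

```python
from fnmatch import fnmatchcase

# Ordered behaviours: first matching pattern wins. Origin names are
# placeholders; the patterns are the ones shown on the slide.
BEHAVIOURS = [
    ("/images/*", "s3-origin"),
    ("*.php", "ec2-origin"),
]
DEFAULT_ORIGIN = "ec2-origin"   # assumption: unmatched paths go to EC2


def origin_for(path):
    for pattern, origin in BEHAVIOURS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


routes = {p: origin_for(p) for p in ("/images/logo.png", "/index.php", "/about")}
```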
[Chart: Response Time and Server Load compared across three cases: No CDN, CDN for Static Content, CDN for Static & Dynamic Content]
Management
10 instances: manageable
100 instances: at a push
1,000 instances: not a chance
Automation & management
Web scale enabler
OpsWorks, Elastic Beanstalk, CloudFormation, EC2
[Diagram: the four options positioned between control and convenience]
Summary
Use these techniques (and many others) as appropriate
Awareness of the options is the first step to good design
Scaling is the ability to move the bottlenecks around to the least expensive part of the architecture
AWS makes this easier – so your application is not a victim of its own success
Thank you