AWS Innovate: Running Databases in AWS - Russell Nash
TRANSCRIPT
Logging
Analytics
Webscale Throughput
Rich Search
Hot Reads
Complex Queries and Transactions
The Database
Logging
Analytics
Webscale Throughput
Rich Search
Hot Reads
Complex Queries and Transactions
Amazon DynamoDB
Amazon RDS
Amazon ElastiCache
Amazon S3
Amazon Redshift
Amazon Elasticsearch
The Data Tier
Analytics
Webscale Throughput
Complex Queries and Transactions
Amazon DynamoDB
Amazon RDS
Amazon Redshift
The Data Tier
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
Query Construction
Query Optimisation
Schema Design
Traditional DC
DB on EC2
Amazon RDS
Why Managed Databases?
“In a physical data center, I would need at least 3
administrators to maintain the infrastructure and ensure
similar levels of availability”
- Richard Glew, CTO
• MPP SQL Database
• Optimised for analytics
• Gigabytes to petabytes
• Fully relational
• Fully managed
Amazon Redshift
ID Name
1 John Smith
2 Jane Jones
3 Peter Black
4 Pat Partridge
5 Sarah Cyan
6 Brian Snail
1 John Smith
4 Pat Partridge
2 Jane Jones
5 Sarah Cyan
3 Peter Black
6 Brian Snail
Massively Parallel
JDBC/ODBC
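The row listing above can be read as one six-row table spread across several slices that each work on their own share of the data. The sketch below assumes three slices and round-robin placement; that matches the order in which the rows appear on the slide, but the slide does not state it. `distribute_round_robin` and `parallel_count` are hypothetical names for illustration, not Redshift APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# The six rows shown on the slide.
ROWS = [
    (1, "John Smith"),
    (2, "Jane Jones"),
    (3, "Peter Black"),
    (4, "Pat Partridge"),
    (5, "Sarah Cyan"),
    (6, "Brian Snail"),
]


def distribute_round_robin(rows, n_slices):
    """Deal rows out to slices in turn, like dealing cards."""
    slices = [[] for _ in range(n_slices)]
    for position, row in enumerate(rows):
        slices[position % n_slices].append(row)
    return slices


def parallel_count(slices):
    """Each slice counts its own rows; the partial counts are then summed."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partial_counts = list(pool.map(len, slices))
    return sum(partial_counts)


# Three slices is an assumption made for this sketch.
SLICES = distribute_round_robin(ROWS, 3)

if __name__ == "__main__":
    for number, slice_rows in enumerate(SLICES, start=1):
        print(f"slice {number}: {slice_rows}")
    print("total rows:", parallel_count(SLICES))
```

Each slice only ever touches a third of the rows, so the work per slice shrinks as slices are added; the final step just combines small partial results.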
• Column storage
• Data compression
• Zone maps
ID   Age  State  Amount
123  20   QLD    500
345  25   WA     250
678  40   NSW    125
957  37   WA     375
Reduces I/O
Row storage: have to read the entire row
Column storage: only read the data you need
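To make the row versus column comparison concrete, here is a minimal sketch using the four sample rows from the slide. It counts how many stored values have to be read to average the Amount column under each layout. The in-memory lists stand in for disk blocks, and the function names are hypothetical; this is an illustration of the idea, not how Redshift is implemented.

```python
# Sample table from the slide, held as rows.
COLUMNS = ("ID", "Age", "State", "Amount")
ROW_STORE = [
    (123, 20, "QLD", 500),
    (345, 25, "WA", 250),
    (678, 40, "NSW", 125),
    (957, 37, "WA", 375),
]

# The same data held as one list per column.
COLUMN_STORE = {
    name: [row[index] for row in ROW_STORE]
    for index, name in enumerate(COLUMNS)
}


def average_amount_row_store(rows):
    """Row layout: every field of every row is read to reach Amount."""
    amount_index = COLUMNS.index("Amount")
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)
        total += row[amount_index]
    return total / len(rows), values_read


def average_amount_column_store(columns):
    """Column layout: only the Amount column is read."""
    amounts = columns["Amount"]
    return sum(amounts) / len(amounts), len(amounts)


if __name__ == "__main__":
    print("row store   :", average_amount_row_store(ROW_STORE))
    print("column store:", average_amount_column_store(COLUMN_STORE))
```

Both layouts give the same average, but the column layout reads 4 values where the row layout reads 16. The gap widens as tables gain more columns.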
Zone maps
Block 1: 02-JAN-2016, 04-JAN-2016, 07-JAN-2016, 08-JAN-2016, 09-JAN-2016
MIN: 02-JAN-2016
MAX: 09-JAN-2016
Block 2: 10-JAN-2016, 15-JAN-2016, 21-JAN-2016, 22-JAN-2016, 29-JAN-2016
MIN: 10-JAN-2016
MAX: 29-JAN-2016
SELECT AVG(AMOUNT) WHERE SALES_DATE = 22-JAN-2016
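The MIN and MAX values above are what let the query skip data. A minimal sketch of that check, using the ten dates from the slide split into two blocks of five: the block size and the name `blocks_to_scan` are assumptions for illustration only.

```python
from datetime import date

# Sales dates from the slide, split into two blocks of five values.
BLOCKS = [
    [date(2016, 1, day) for day in (2, 4, 7, 8, 9)],
    [date(2016, 1, day) for day in (10, 15, 21, 22, 29)],
]

# Zone map: the (min, max) of each block, kept separately from the data.
ZONE_MAP = [(min(block), max(block)) for block in BLOCKS]


def blocks_to_scan(zone_map, target):
    """Return indexes of blocks whose min..max range could contain target."""
    return [
        index
        for index, (low, high) in enumerate(zone_map)
        if low <= target <= high
    ]


if __name__ == "__main__":
    wanted = date(2016, 1, 22)
    print("blocks to scan:", blocks_to_scan(ZONE_MAP, wanted))
```

For 22-JAN-2016 only the second block qualifies, so the first block is never read. A date outside every range, such as one in February, would need no block reads at all.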
Traditional SQL Database
Amazon Redshift
Summarise by month 02:08:35 00:35:46 00:00:12
Performance – 2 Billion Rows
160 GB
DC1.L
8XL × 100
2 PB
Scalability
Analytics
Webscale Throughput
Complex Queries and Transactions
Amazon DynamoDB
Amazon RDS
Amazon Redshift
The Data Tier
WRITES
Continuously replicated to 3 AZs
Persisted to disk (custom SSD)
READS
Strongly or eventually consistent
Durability
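The difference between the two read modes can be sketched with a toy model. It assumes one leader copy and two follower copies that catch up only when `replicate()` is called; the class `ToyReplicatedStore` and that replication scheme are hypothetical simplifications for illustration, not a description of the service's internals.

```python
class ToyReplicatedStore:
    """Toy model: one leader copy plus follower copies that lag behind."""

    def __init__(self, n_replicas=3):
        # Index 0 is treated as the leader copy (an assumption of this sketch).
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = []  # writes not yet applied to the followers

    def write(self, key, value):
        """Apply the write to the leader and queue it for the followers."""
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Apply all queued writes to every follower."""
        for key, value in self.pending:
            for follower in self.replicas[1:]:
                follower[key] = value
        self.pending.clear()

    def read(self, key, consistent=False, replica=1):
        """consistent=True reads the leader; otherwise read the given copy."""
        source = self.replicas[0] if consistent else self.replicas[replica]
        return source.get(key)


if __name__ == "__main__":
    store = ToyReplicatedStore()
    store.write("order-1", "shipped")
    print("strong  :", store.read("order-1", consistent=True))
    print("eventual:", store.read("order-1"))
    store.replicate()
    print("eventual after replication:", store.read("order-1"))
```

In this model a strongly consistent read always sees the latest write, while an eventually consistent read may briefly return older data until the copies converge.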
“I was extremely comfortable running
our business-critical systems in an
AWS architecture.”
- Harry Teng, CIO
Online Labs & Training
Gain confidence and hands-on experience with AWS. Watch free Instructional Videos and explore Self-Paced Labs
Instructor Led Classes
Learn how to design, deploy and operate highly available, cost-effective and secure applications on AWS in courses led by qualified AWS instructors
AWS Certification
Validate your technical expertise with AWS and use practice exams to help you prepare for AWS Certification
More info at http://aws.amazon.com/training