

Page 1: Amazon Aurora - Cloud Object Storagetrack/Amazon+Aurora.pdf · MySQL Read Scaling • Replicas must replay logs • Replicas place additional load on master • Replica lag can grow

©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Amazon Aurora Relational databases reimagined.

Ronan Guilfoyle, Solutions Architect, AWS

Brian Scanlan, Engineer, Intercom


Current DB Architectures are Monolithic

Multiple layers of functionality all on a single box

SQL

Transactions

Caching

Logging

Current DB Architectures are Monolithic

Even when you scale it out, you’re still replicating the same stack

Diagram labels: Application; two identical stacks (SQL, Transactions, Caching, Logging); Storage.


This is a problem. For cost. For flexibility. And for availability.


Re-imagining the Relational Database

What if we were inventing the database today?

You wouldn’t design it the way we did in 1970, at least not entirely. You’d build something scale-out and self-healing that leverages existing AWS services.


Relational databases reimagined for the cloud.

speed and availability of high-end commercial databases

simplicity and cost-effectiveness of open source databases

  drop-in compatibility with MySQL

  simple pay as you go pricing

Delivered as a managed service.


Amazon Aurora: applying a service-oriented architecture to the database

•  Moved the logging and storage layer into a multi-tenant, scale-out, database-optimized storage service

•  Integrated with other AWS services like EC2, VPC, DynamoDB, SWF, and Route 53 for control plane operations

•  Integrated with S3 for continuous backup and 99.999999999% durability

Diagram labels: Data Plane (SQL, Transactions, Caching, Logging + Storage); Control Plane; Amazon S3, DynamoDB, Amazon SWF, Amazon Route 53.


Aurora Works with Your Existing Apps


An Established Ecosystem

Business Intelligence, Data Integration, Query & Monitoring, SI & Consulting

“It is great to see Amazon Aurora remains MySQL compatible; we have found our connectors work with Aurora seamlessly. Today, customers can take our drivers and connect to Aurora, MariaDB or MySQL without worrying about compatibility. We look forward to working with the Aurora team in the future to further accelerate innovation within the MySQL ecosystem.” – Rasmus Johansson, VP Engineering


Amazon Aurora is Easy to Use


Aurora Makes it Easy to Run Your Databases

•  Create a database in minutes

•  Automatic patching

•  Push-button scaling

•  Failure detection and failover.

•  Read replicas are available as failover targets, with no data loss

Amazon RDS


Aurora simplifies storage management

•  Instant creation of user snapshots

•  Continuous backups to S3

•  Automatic storage scaling up to 64 TB, with no performance or availability impact

•  Automatic restriping, mirror repair, hot spot management, encryption

Amazon RDS


Aurora simplifies Data Security

•  Encryption to secure data at rest
   –  AES-256; hardware accelerated
   –  All blocks on disk and in Amazon S3 encrypted
   –  Key management via AWS KMS

•  SSL to secure data in transit

•  Network isolation via Amazon VPC by default

•  No direct access to nodes

•  Supports industry standard security and data protection certifications

Diagram labels: Customer VPC, Internal VPC, MySQL App, Primary Instance, Replica Instance, AZ 1, AZ 2, AZ 3, Amazon S3.


Amazon Aurora is Highly Available


Aurora is Highly Available

•  Highly available by default
   –  6-way replication across 3 AZs
   –  4 of 6 write quorum
      •  Automatic fallback to 3 of 4 if an AZ is unavailable
   –  3 of 6 read quorum

•  SSD, scale-out, multi-tenant storage
   –  Seamless storage scalability
   –  Up to 64 TB database size
   –  Only pay for what you use

•  Log-structured storage
   –  Many small segments, each with their own redo logs
   –  Log pages used to generate data pages
   –  Eliminates chatter between database and storage

Diagram labels: SQL, Transactions, Caching; AZ 1, AZ 2, AZ 3; Amazon S3.
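The quorum sizes on this slide can be sanity-checked with a few lines of Python. This is only a sketch of the general quorum-overlap rule, not Aurora's actual implementation; the function names are made up for illustration. It checks that a 4 of 6 write quorum and a 3 of 6 read quorum always share at least one copy, so a read quorum always includes a copy that saw the latest acknowledged write.

```python
from itertools import combinations

COPIES = 6        # 6-way replication, from the slide
WRITE_QUORUM = 4  # 4 of 6 write quorum, from the slide
READ_QUORUM = 3   # 3 of 6 read quorum, from the slide


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Arithmetic form of the rule: every read set meets every write set
    (w + r > n) and any two write sets meet each other (2 * w > n)."""
    return w + r > n and 2 * w > n


def every_read_meets_every_write(n: int, w: int, r: int) -> bool:
    """Brute-force check of the read/write overlap by enumerating all quorums."""
    copies = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(copies, w)
        for read_set in combinations(copies, r)
    )


if __name__ == "__main__":
    print(quorums_overlap(COPIES, WRITE_QUORUM, READ_QUORUM))
    print(every_read_meets_every_write(COPIES, WRITE_QUORUM, READ_QUORUM))
```

With a 3 of 6 write quorum instead, both checks fail, because a write set and a read set could land on disjoint halves of the copies.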


Aurora Performs Consistent, Low Latency Writes

Improvements

•  Consistency: tolerance to outliers

•  Latency: 2 phase commit vs. asynchronous replication

•  Significantly more efficient use of network IO

Diagram (MySQL Multi-AZ with Standby): Primary Instance in AZ 1 and Standby Instance in AZ 2, each with EBS and an EBS mirror; Amazon S3.

Diagram (Amazon Aurora): Primary Instance and Replica Instance across AZ 1, AZ 2 and AZ 3; Amazon S3.

Types of writes shown: log records, binlog, data, doublewrite buffer, FRM files and metadata.

Other diagram labels: async 4/6 quorum, 2 phase commit, PiTR, sequential write, distributed writes.



Self-healing and fault-tolerant

•  Lose 2 copies or an AZ failure without read or write availability impact

•  Lose 3 copies without read availability impact

•  Automatic detection, replication and repair

Diagrams: two views of the SQL, Transaction and Caching stack over AZ 1, AZ 2 and AZ 3, captioned "Read & Write Availability" and "Read Availability".
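The failure cases above follow from the quorum sizes. A minimal sketch, assuming the six copies are spread evenly as two per AZ (the slides only say "6-way replication across 3 AZs") and using a hypothetical `availability` helper:

```python
COPIES_PER_AZ = 2  # assumption: 6 copies spread evenly over 3 AZs
AZ_COUNT = 3
WRITE_QUORUM = 4   # 4 of 6, from the earlier slide
READ_QUORUM = 3    # 3 of 6, from the earlier slide


def availability(lost_copies: int) -> str:
    """Classify what the volume can still do after losing some copies."""
    surviving = COPIES_PER_AZ * AZ_COUNT - lost_copies
    if surviving >= WRITE_QUORUM:
        return "read+write"
    if surviving >= READ_QUORUM:
        return "read-only"
    return "unavailable"


if __name__ == "__main__":
    for lost in range(5):
        print(lost, availability(lost))
```

Under that assumption, losing a whole AZ removes two copies and leaves four, which is still a write quorum. Losing a third copy leaves three, which is enough to read but not to write.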


Aurora Has Instant Crash Recovery

Traditional Databases

•  Have to replay logs since the last checkpoint

•  Single threaded in MySQL; requires a large number of disk accesses

•  Crash at T0 requires a re-application of the SQL in the redo log since last checkpoint

Amazon Aurora

•  Underlying storage replays redo records on demand as part of a disk read

•  Parallel, distributed, asynchronous

•  Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously

Diagram labels: Checkpointed Data, Redo Log, T0.
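The "redo on demand as part of a read" idea can be illustrated with a toy model. This is a sketch only: the `Segment` class, its integer "pages" and its additive "deltas" are invented for illustration and say nothing about Aurora's real storage format. The point is that a write only appends a log record, and the page is brought up to date when something reads it, so there is no up-front replay step after a crash.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, Dict, List


@dataclass
class Segment:
    """Toy storage segment: materialized pages plus redo records not yet applied."""

    pages: Dict[int, int] = field(default_factory=dict)
    pending: DefaultDict[int, List[int]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def append_redo(self, page_id: int, delta: int) -> None:
        # A write only appends a redo record; the page itself is not touched.
        self.pending[page_id].append(delta)

    def read(self, page_id: int) -> int:
        # Redo records for this page (and only this page) are applied
        # on demand, as part of serving the read.
        for delta in self.pending.pop(page_id, []):
            self.pages[page_id] = self.pages.get(page_id, 0) + delta
        return self.pages.get(page_id, 0)


if __name__ == "__main__":
    segment = Segment(pages={1: 10})
    segment.append_redo(1, 5)
    segment.append_redo(2, 7)
    print(segment.read(1))  # page 1 is brought up to date here; page 2 is untouched
```

Because each segment carries its own redo records, many segments could do this independently and in parallel; the sketch is single-threaded for brevity.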


Aurora’s Cache Survives a DB Restart

•  We moved the cache out of the database process

•  Cache remains warm in the event of a database restart

•  Lets you resume fully loaded operations much faster

•  Instant crash recovery + survivable cache = quick and easy recovery from DB failures

Diagram: the Caching layer drawn outside the SQL and Transactions process. Caption: the caching process is outside the DB process and remains warm across a database restart.


Multiple failover targets, without data loss.

MySQL Read Scaling

•  Replicas must replay logs

•  Replicas place additional load on master

•  Replica lag can grow indefinitely

•  Failover results in data loss

Diagram (MySQL): MySQL Master (30% read, 70% write) and MySQL Replica (30% new reads, 70% write), each with its own data volume; single-threaded binlog apply.

Diagram (Aurora): Aurora Master (30% read, 70% write) and Aurora Replica (100% new reads) on shared Multi-AZ storage; page cache invalidation.
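The percentages in the diagram can be turned into a rough capacity estimate. This is a deliberately simplistic sketch with made-up function names. It assumes all instances are the same size and that a replica which replays writes spends the same share of its capacity on them as the master does.

```python
def replica_read_capacity(master_write_fraction: float, replays_writes: bool) -> float:
    """Share of one replica's capacity that is left for new reads."""
    if replays_writes:
        # MySQL-style replica: it must re-apply the master's write load.
        return 1.0 - master_write_fraction
    # Shared-storage replica: no write replay, so all capacity serves reads.
    return 1.0


def cluster_read_capacity(
    replicas: int, master_write_fraction: float, replays_writes: bool
) -> float:
    """Total read capacity, in units of one instance, for a master plus replicas."""
    master_reads = 1.0 - master_write_fraction
    return master_reads + replicas * replica_read_capacity(
        master_write_fraction, replays_writes
    )


if __name__ == "__main__":
    # 70% write / 30% read master, as in the diagram, with two replicas.
    print(cluster_read_capacity(2, 0.7, replays_writes=True))
    print(cluster_read_capacity(2, 0.7, replays_writes=False))
```

With the diagram's 70% write load, each write-replaying replica adds about 0.3 of an instance of read capacity, while a shared-storage replica adds a full instance.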


You Can Simulate Failures Using SQL

•  To cause the failure of a component at the database node:
   ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]

•  To simulate the failure of disks:
   ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index | NODE index] FOR INTERVAL interval

•  To simulate the failure of networking:
   ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval
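For scripted failure tests it can help to build these statements programmatically. The helpers below are hypothetical and only follow the grammar as printed on this slide; the syntax accepted by the released service may differ, and the valid values for `failure_type` and `interval` are not given here, so they are passed through as opaque strings.

```python
from typing import Optional

CRASH_TARGETS = {"INSTANCE", "DISPATCHER", "NODE"}
DISK_SCOPES = {"DISK", "NODE"}


def crash_statement(target: Optional[str] = None) -> str:
    """Build ALTER SYSTEM CRASH with an optional component, per the slide."""
    if target is None:
        return "ALTER SYSTEM CRASH"
    target = target.upper()
    if target not in CRASH_TARGETS:
        raise ValueError(f"unknown crash target: {target}")
    return f"ALTER SYSTEM CRASH {target}"


def disk_failure_statement(
    percent_failure: int, failure_type: str, scope: str, index: int, interval: str
) -> str:
    """Build the disk-failure statement; failure_type and interval are opaque."""
    if not 0 <= percent_failure <= 100:
        raise ValueError("percent_failure must be between 0 and 100")
    scope = scope.upper()
    if scope not in DISK_SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    return (
        f"ALTER SYSTEM SIMULATE {percent_failure} DISK {failure_type} "
        f"IN {scope} {index} FOR INTERVAL {interval}"
    )


if __name__ == "__main__":
    print(crash_statement("node"))
```

The resulting strings would then be sent over an ordinary MySQL connection to a test cluster, never to production.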


Intermission…  


Intercom

[email protected]


Intercom’s technology

-  Largely monolithic Ruby on Rails application. Strong culture around Continuous Deployment and general DevOps best practices. Some adoption of SOA/Microservices.

-  Built using MySQL and Ruby on Rails’ ActiveRecord ORM for rapid development. Unstructured customer data stored in MongoDB, however all messages in MySQL.

-  Heavy use of AWS services and other SaaS services (New Relic, Code Climate, CodeShip, LogEntries).

-  Custom infrastructure/code orchestration & deployment system.


Recent MySQL woes

-  Highly sensitive to MySQL performance and started experiencing regular inexplicable performance degradation. Could not vertically scale our way out of the problem. Engaged RDS Support & MySQL consultants.

-  Adjusted parameters e.g. lock wait timeouts, transaction read isolation levels and txn_flush_at_commit, etc. Instrumented and reduced number of long-running transactions in an attempt to reduce lock contention.

-  Greatly increased the number of MySQL metrics being collected, built application level fingerprinting and automated data collection during outages. Got our hands dirty with MySQL’s performance schema. Reducing read throughput stabilised.


Why we’re interested in Aurora

-  While we bought ourselves time with improved caching, Aurora gives us more options and a lot more vertical scaling opportunities.

-  Our operational experience of using read replicas with RDS/MySQL means we don’t trust them for customer-facing queries. The eventual consistency guarantees of Aurora look good enough for our application to use for practically all read queries.


intercom.io

blog.intercom.io

[email protected]

[email protected]


Enterprise grade features and performance at open source prices


Aurora Pricing

Simple pricing

•  No licenses

•  No lock-in

•  Pay only for what you use

Discounts

•  44% with a 1 year RI

•  63% with a 3 year RI

Instance         vCPU   Mem (GiB)   Hourly Price
db.r3.large         2   15.25       $0.29
db.r3.xlarge        4   30.5        $0.58
db.r3.2xlarge       8   61          $1.16
db.r3.4xlarge      16   122         $2.32
db.r3.8xlarge      32   244         $4.64

•  Storage consumed, up to 64 TB, is $0.10/GB/month

•  IOs consumed are billed at $0.20 per million IOs

•  Prices are for Virginia
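A rough monthly estimate can be put together from these figures. The sketch below uses only the prices on this slide. The 730-hour month, the `monthly_cost` helper, and the choice to apply the RI discount as a flat reduction on instance-hours only are all assumptions made for illustration.

```python
HOURLY_PRICE_USD = {
    "db.r3.large": 0.29,
    "db.r3.xlarge": 0.58,
    "db.r3.2xlarge": 1.16,
    "db.r3.4xlarge": 2.32,
    "db.r3.8xlarge": 4.64,
}
STORAGE_USD_PER_GB_MONTH = 0.10
IO_USD_PER_MILLION = 0.20
RI_DISCOUNT = {"none": 0.0, "1yr": 0.44, "3yr": 0.63}


def monthly_cost(
    instance_class: str,
    storage_gb: float,
    io_millions: float,
    hours: float = 730.0,  # assumption: average hours in a month
    ri_term: str = "none",
) -> float:
    """Estimate one instance's monthly bill in USD from the slide's prices."""
    # Assumption: the RI discount applies to instance-hours only.
    instance = HOURLY_PRICE_USD[instance_class] * hours * (1.0 - RI_DISCOUNT[ri_term])
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    io = io_millions * IO_USD_PER_MILLION
    return instance + storage + io


if __name__ == "__main__":
    print(round(monthly_cost("db.r3.large", storage_gb=100, io_millions=50), 2))
```

For a db.r3.large with 100 GB stored and 50 million IOs, this comes to about $231.70 a month on demand: $211.70 for the instance plus $10 each for storage and IO.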


Aurora – Enterprise Grade. Open Source Prices

•  Expanding to unlimited preview

•  Adding preview support for US West (Oregon) and EU (Ireland)

•  Sign up for preview access at: https://aws.amazon.com/rds/aurora/preview

•  Full service launch in the coming months


LONDON