cost optimization at scale
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Mario Thomas, Senior Consultant EMEA
AWS Professional Services
7 July 2016
Cost Optimisation at Scale: Reducing the cost of using AWS
What We’ll Cover Today
• A quick overview of TCO
• Does “at scale” apply to you?
• Applying “architecting for cost” principles to your TCO analysis and your AWS account(s)
• Customer case study
• Recap
What is TCO?
Definition: Comparative total cost of ownership analysis (acquisition and operating costs) for running an infrastructure environment end-to-end, on-premises vs. on AWS.
Used for:
1) Comparing the costs of running an entire infrastructure environment or specific workload on-premises or in a co-location facility vs. on AWS
2) Budgeting and building the business case for moving to AWS
AWS Includes Everything in Its Price
Hardware in a Data Center
Cost is Rarely the Only Reason to Choose AWS
Benefit                               AWS   On-Premise / Co-Location
Pay-as-you-go Model                   ✔     X
Lower Overall Costs                   ✔     X
Stop Guessing Capacity                ✔     X
Agility / Speed / Innovation          ✔     X
Avoid Undifferentiated Heavy Lifting  ✔     X
Go Global in Minutes                  ✔     X
So how do we do it?
≠
TCO = acquisition costs + operations costs
1. Server Costs
• Hardware: Server, Rack Chassis, PDUs, ToR Switches (+ Maintenance)
• Software: OS, Virtualization Licenses (+ Maintenance)
• Facilities Cost: Space, Power, Cooling
2. Storage Costs
• Hardware: Storage Disks, SAN/FC Switches
• Storage Admin costs
• Facilities Cost: Space, Power, Cooling
3. Network Costs
• Network Hardware: LAN Switches, Load Balancer
• Bandwidth costs
• Network Admin costs
• Facilities Cost: Space, Power, Cooling
4. IT Labor Costs
• Server Admin / Virtualization Admin
Diagram doesn’t include every cost item. For example, software costs can include database, management, and middle-tier software costs. Facilities cost can include costs associated with upgrades, maintenance, building security, taxes, etc. IT labor costs can include security admin and application admin costs.
The TCO Window
[Chart: cost ($) over time (months) for the Current / Do Nothing option versus the AWS Environment, marking the Migration Cost, the Payback Period, and the Cost Optimising / BAU phase that make up the TCO window.]
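As a rough illustration of the payback period in the TCO window, the sketch below finds the first month in which cumulative AWS spend, including a one-off migration cost, is no higher than the cumulative cost of doing nothing. The function name and every figure are hypothetical and are not taken from the presentation.

```python
from typing import Optional


def payback_month(migration_cost: float,
                  current_monthly: float,
                  aws_monthly: float,
                  horizon_months: int = 60) -> Optional[int]:
    """Return the first month where cumulative AWS cost (migration included)
    is at or below the cumulative do-nothing cost, or None if that never
    happens within the horizon."""
    if aws_monthly >= current_monthly:
        # No monthly saving, so the migration cost is never recovered.
        return None
    for month in range(1, horizon_months + 1):
        aws_total = migration_cost + aws_monthly * month
        do_nothing_total = current_monthly * month
        if aws_total <= do_nothing_total:
            return month
    return None


# Hypothetical example: 120,000 migration cost, 50,000 per month today,
# 40,000 per month on AWS.
print(payback_month(120_000, 50_000, 40_000))
```

Anything that lowers the AWS monthly figure (the cost optimising phase) shortens the payback period for the same migration cost.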
Cost Optimisation
Paying for what you use
Paying for what you need
The four pillars of cost optimisation
• Right sizing
• Reserved Instances
• Increase elasticity
• Measure, monitor, and improve
Cost Optimisation
Simple Stimulating Stretching
• Consolidated Billing
• Permissions
• Tagging
• Idle Resources
• Design for Elasticity
• Instance Right Sizing
• Storage
• Purchasing Options
• OS Licensing
• Offloading Architecture
• Higher Level Services
Consolidated Billing
• Receive a single bill for all charges incurred across all linked accounts
  ‒ Share RI discounts
  ‒ Combine tiering benefits
• View & manage linked accounts
• Add additional accounts
Permissions
How are you controlling the provisioning of resources?
Control who can provision resources
Tagging
• Use tagging to identify resources
• Use tags to audit resources
• Turn off untagged resources
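A tag audit can be sketched as a simple filter over an inventory of resources. The example below is a minimal sketch that assumes the inventory has already been exported into plain dictionaries; the `id` and `tags` field names and the `CostCode` tag key are illustrative assumptions, not an AWS API shape.

```python
from typing import Dict, List


def find_untagged(resources: List[Dict], required_tag: str = "CostCode") -> List[str]:
    """Return the ids of resources whose required tag is missing or empty."""
    untagged = []
    for resource in resources:
        tags = resource.get("tags") or {}
        if not tags.get(required_tag):
            untagged.append(resource["id"])
    return untagged


# Hypothetical inventory; in practice this would come from an export of
# your account's resources.
inventory = [
    {"id": "i-0001", "tags": {"CostCode": "web-123"}},
    {"id": "i-0002", "tags": {}},
    {"id": "i-0003"},
    {"id": "i-0004", "tags": {"CostCode": ""}},
]

print(find_untagged(inventory))
```

The resulting list is the set of candidates to review and, if nobody claims them, to turn off.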
Idle Resources
• Dev/Test (Non-Prod) instances
• Use simple instance start/stop, or
• Tear down/build up altogether
• Instances are disposable
Instance Right Sizing
Right sizing
• Selecting the cheapest instance available while meeting performance requirements
• Looking at CPU, RAM, storage, and network utilization to identify potential instances that can be downsized
• Leveraging Amazon CloudWatch metrics and setting up custom RAM metrics
Rule of thumb: Right size, then reserve. (But if you’re in a pinch, reserve first.)
Use Amazon CloudWatch to collect and track metrics
CloudWatch
Match resources to workload
Amazon CloudWatch monitoring
Basic
• 7 metrics for Amazon EC2, including:
  ‒ CPU utilization
  ‒ Data transfer
  ‒ Disk usage and more
• 5-minute frequency
• Metrics for Amazon EBS, Amazon DynamoDB, Amazon RDS, etc.
Detailed
• 1-minute frequency
• Aggregation by instance type and AMI
Instance families: Example
Instance     vCPU  Mem (GiB)  Monthly price (3-yr AURI)  Ideal use case
c4.2xlarge   8     15         $125.44                    Best price-compute performance
m4.2xlarge   8     32         $137.25                    Balanced
r3.2xlarge   8     61         $179.25                    Lowest cost per GiB RAM
r3.xlarge    4     30.5       $89.61                     Lowest cost per GiB RAM
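Right sizing, "the cheapest instance available while meeting performance requirements", can be sketched as a filter plus a minimum over the four instances in the table above. The prices are the 2016 figures from the slide; the function name and the idea of expressing requirements as minimum vCPU and memory are assumptions made for illustration.

```python
from typing import Optional

# (instance, vCPU, memory in GiB, monthly price in USD) from the slide's table.
CATALOG = [
    ("c4.2xlarge", 8, 15.0, 125.44),
    ("m4.2xlarge", 8, 32.0, 137.25),
    ("r3.2xlarge", 8, 61.0, 179.25),
    ("r3.xlarge", 4, 30.5, 89.61),
]


def cheapest_fit(min_vcpu: int, min_mem_gib: float) -> Optional[str]:
    """Return the cheapest catalog instance meeting both minimums, or None."""
    candidates = [
        row for row in CATALOG
        if row[1] >= min_vcpu and row[2] >= min_mem_gib
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda row: row[3])[0]


print(cheapest_fit(8, 8))    # compute-bound workload
print(cheapest_fit(4, 30))   # memory-bound workload
```

In practice the minimums would come from observed utilization (for example CloudWatch metrics) rather than from the size of the instance currently in use.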
t2.medium
4 GiB RAM
Baseline performance: 40%
Bursts beyond this based on CPU credits
Less than $18 per month
Details: http://amzn.to/1sl2bKa
Web servers, dev, small databases
t2 vs m3.medium or c4.large
Instance    vCPU  Mem (GiB)  Monthly price (3-yr AURI)  Ideal use case
m3.medium   1     3.75       $19.08                     Always available, balanced
c4.large    2     3.75       $31.36                     Always available, compute
t2.medium   2     4          $16.86                     Bursty workloads
Savings potential
If t2 works for you
• 11%: Switch from m3.medium to t2.medium
• 46%: Switch from c4.large to t2.medium
Instance optimization
• 9%: Switch to c4 for compute-intensive apps (m4.2xl -> c4.2xl)
• 35%: Switch to r3 for memory-intensive apps (m4.2xl -> r3.xlarge)
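These percentages can be reproduced from the monthly prices in the two instance tables. The sketch below does that arithmetic; it uses the c4.large price for the 46% line, since that is the instance the comparison table lists. The results are approximate matches, because the slide rounds its figures.

```python
# Monthly prices (3-yr AURI, USD) as listed in the instance tables.
PRICES = {
    "m3.medium": 19.08,
    "c4.large": 31.36,
    "t2.medium": 16.86,
    "m4.2xlarge": 137.25,
    "c4.2xlarge": 125.44,
    "r3.xlarge": 89.61,
}


def saving_pct(from_type: str, to_type: str) -> float:
    """Percentage saved on the monthly price by switching instance type."""
    old, new = PRICES[from_type], PRICES[to_type]
    return (old - new) / old * 100


for src, dst in [
    ("m3.medium", "t2.medium"),
    ("c4.large", "t2.medium"),
    ("m4.2xlarge", "c4.2xlarge"),
    ("m4.2xlarge", "r3.xlarge"),
]:
    print(f"{src} -> {dst}: {saving_pct(src, dst):.1f}%")
```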
Storage
[Diagram: On-premises Data Center (Block, File) connected through a Gateway Appliance / AWS Storage Gateway to the AWS Cloud (Amazon EBS, Amazon S3, Amazon Glacier); use cases shown: Archive, Backup, Disaster Recovery.]
• Amazon S3 Standard Infrequent Access
  ‒ Lower cost of storage
• Amazon S3 Reduced Redundancy
  ‒ 99.99% durability vs. 99.999999999%
  ‒ Up to 20% savings
  ‒ Great for files that are easy to reproduce
• Amazon Glacier
  ‒ Same durability as S3
  ‒ 3 to 5 hours restore time
  ‒ Up to 89% savings
  ‒ Great for archiving, long-term backups and old data
Instance Purchasing Options
On-Demand
Pay for compute capacity by the hour with no long-term commitments
For spiky workloads, or to define needs
Reserved
Make a low, one-time payment and receive a significant discount on the hourly charge
For committed utilization
Spot
Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand
For time-insensitive or transient workloads
Reserved Instances
Commitment Level
• 1 Year
• 3 Year
AWS Services Offering RIs
• Amazon EC2
• Amazon RDS
• Amazon DynamoDB
• Amazon Redshift
• Amazon ElastiCache
* Dependent on specific AWS service, size/type, and region
Reserved Instances
Step 1: RI Coverage
• Cover always on resources
Step 2: RI Utilization
• Leverage RI flexibility to increase utilization
• Merge and split RIs as needed
Rule of thumb: Target 70-80% always on coverage and 95% RI utilization rate.
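The two targets in the rule of thumb can be tracked as simple ratios. The sketch below assumes one plausible definition of each (coverage as RI-covered hours over always-on hours, utilization as RI hours actually used over RI hours purchased); the slide does not define them, and the function names and sample figures are hypothetical.

```python
def ri_coverage(always_on_hours: float, ri_covered_hours: float) -> float:
    """Share of always-on instance hours that were billed against an RI."""
    if always_on_hours <= 0:
        raise ValueError("always_on_hours must be positive")
    return ri_covered_hours / always_on_hours


def ri_utilization(purchased_ri_hours: float, used_ri_hours: float) -> float:
    """Share of purchased RI hours that were actually consumed."""
    if purchased_ri_hours <= 0:
        raise ValueError("purchased_ri_hours must be positive")
    return used_ri_hours / purchased_ri_hours


def meets_rule_of_thumb(coverage: float, utilization: float) -> bool:
    """Check the slide's targets: 70-80% coverage and at least 95% utilization."""
    return 0.70 <= coverage <= 0.80 and utilization >= 0.95


# Hypothetical month: 10,000 always-on hours, 7,800 RI hours bought,
# 7,500 of them matched to running instances.
coverage = ri_coverage(10_000, 7_500)
utilization = ri_utilization(7_800, 7_500)
print(f"coverage={coverage:.0%} utilization={utilization:.1%} "
      f"ok={meets_rule_of_thumb(coverage, utilization)}")
```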
Spot Instances
• Be Fault Tolerant
• Workloads should be Stateless
• Loosely Coupled workloads preferred
• If possible, deploy to Multiple AZs
• Instant Flexibility is king
• Take advantage of the 2-minute warning
• There is always Spot capacity available
Save up to 90% compared to On-Demand
What could your team do with a 10,000 core data center that costs $100 per hour, with one click?
Spot Instances
Pricing
• Up to 90% discount
Elastic
• Minimum Commitment
• Commit to 1 hour
• Tradeoff
• Potential for interruption
Picking the right Spot Bid Price: tolerance for interruptions, % likelihood of termination
Mixed Use Instances
[Chart: y-axis 0 to 10, x-axis 1 to 24, with series On Demand, Spot, and RI.]
Reserved Instances enable you to reserve capacity for one or three years by paying a low, one-time fee for the capacity reservation and receiving a significant discount on the hourly charge for your instances.
AWS Services Offering RIs
• Amazon EC2
• Amazon RDS
• Amazon DynamoDB
• Amazon Redshift
• Amazon ElastiCache
OS Licensing
• Utilise Amazon Linux
• Supported by Amazon
• No licence fees
• Regularly updated
• Savings of up to 30% over other options*
Offload Your Architecture
Standard Setup
• 4 x Medium Instances: $485
• AWS Data Transfer 1 TB: $194
• Total = $679
Optimized
• 1 x Medium Instance: $121
• CloudFront Data 1 TB: $168
• CloudFront Requests: $1.89
• Total = $291
57% Cheaper
6X Faster
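The totals and the 57% figure follow directly from the line items on the slide. The sketch below simply redoes that arithmetic; the dictionary keys are descriptive labels, and the prices are the slide's 2016 figures.

```python
# Line items (USD) as shown on the slide.
standard = {
    "4 x medium instances": 485.00,
    "AWS data transfer, 1 TB": 194.00,
}
optimized = {
    "1 x medium instance": 121.00,
    "CloudFront data, 1 TB": 168.00,
    "CloudFront requests": 1.89,
}

total_standard = sum(standard.values())
total_optimized = sum(optimized.values())
saving_pct = (total_standard - total_optimized) / total_standard * 100

print(f"Standard:  ${total_standard:.2f}")
print(f"Optimized: ${total_optimized:.2f}")
print(f"Saving:    {saving_pct:.0f}%")
```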
Design for Elasticity
Turn off non-production instances
• Look for dev/test, non-prod instances that are running always-on and turn them off
Autoscale production
• Use Auto Scaling to scale up and down based on demand and usage (e.g., spikes)
Rule of thumb: Shoot for 20-30% of EC2 instances running on demand to be able to handle elasticity needs.
Use Auto Scaling or queue-based approaches to add resources when needed, and turn them off when not
Amazon SQS Queue
Auto Scaling
Automatic resizing of compute clusters based on demand
Trigger autoscaling policy
Feature                            Details
Control                            Define minimum and maximum instance pool sizes and when scaling and cool down occurs.
Integrated with Amazon CloudWatch  Use metrics gathered by CloudWatch to drive scaling.
Instance types                     Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.

    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name MyGroup \
        --launch-configuration-name MyConfig \
        --min-size 4 \
        --max-size 200 \
        --availability-zones us-west-2c
Use Auto Scaling
Amazon CloudWatch
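To make the minimum and maximum pool sizes concrete, the sketch below shows how a desired capacity derived from a demand signal would be clamped to the group's bounds (4 and 200, as in the command above). The queue-depth sizing rule and the `msgs_per_instance` parameter are hypothetical; this is not how the Auto Scaling service itself computes capacity.

```python
import math


def desired_capacity(queue_depth: int,
                     msgs_per_instance: int,
                     min_size: int = 4,
                     max_size: int = 200) -> int:
    """Instances needed for the backlog, clamped to the group's bounds."""
    if msgs_per_instance <= 0:
        raise ValueError("msgs_per_instance must be positive")
    needed = math.ceil(queue_depth / msgs_per_instance)
    return max(min_size, min(max_size, needed))


# Hypothetical backlog sizes, assuming one instance handles 100 messages.
for depth in (0, 2_500, 1_000_000):
    print(depth, "->", desired_capacity(depth, msgs_per_instance=100))
```

The lower bound keeps a baseline running when the queue is empty, and the upper bound caps spend during an unexpected spike.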
Utilization and Auto-Scaling: Granularity
More small instances vs. fewer large instances
29 Large @ $0.32/hr = $9.28
59 Small @ $0.08/hr = $4.72
Higher Level Services
AWS Lambda
Elastic Load Balancing
Amazon ECR
Amazon CloudFront
Amazon EFS
Amazon Glacier
Amazon S3
Amazon DynamoDB
Amazon ElastiCache
Amazon RDS
Amazon Redshift
Amazon VPC
Amazon Route 53
AWS KMS
AWS WAF
Amazon Elasticsearch Service
Amazon EMR
Amazon Kinesis
Amazon Machine Learning
AWS IoT
Amazon WorkSpaces
Higher Level Services
Amazon RDS Multi-AZ
High availability with a mouse click
Do-it-yourself MySQL replication
Potentially ~100+ manual steps:
• Set up primary and standby instances
• Set up identical volumes
• Create synchronous replication
• Create and manage DNS entries
• Detect instance failure conditions
• Detect network failure conditions
• Detect storage failure conditions
• …
• Decide when to fail over
• Re-establish primary secondary connections
Reviewing & Reporting
Detailed billing reports and cost insights
Detailed Billing Reports
http://amzn.to/1swNwLV
Cost Explorer
Forecasting
Budgeting
AWS Well Architected
• Security
• Reliability
• Performance
• Cost Optimisation
https://d0.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf
Creating a culture of cost transparency
Targets and Metrics
Cloud Competency Center
AWS Enterprise Support
Metrics & targets
Set up metrics to define success and track progress
• % Instances turned off daily
• % of Instances Right-Sized
• % Always on Resources Covered by RIs
• % RI Utilization
Smart Cloud Use: Financial Times approach to Cloud efficiency
Greg Cope
Agenda
What I am going to cover
1. Intro
2. FT’s Approach in detail
3. Challengettes
4. Summary
Intro
Financial Times
Brexit slide ...
Me ...
Head of Platform Architecture and Security
Been at the FT a while
AWS certified
Lower barriers
Before you start - Find a way to ID costs
<AWSTag>CostCode</AWSTag>
Lots of AWS Accounts
Democratise the information
FT’s strategy
Democratise the Costs
You can have a central “Cloud” team and task them with cost control…
We chose to show back the costs to technical managers and engineers
Anyone at the FT can get a view
Which needs to be simple/appropriate/accessible
We came up with simple, measurable metrics
FT’s Approach in detail ...
Before we start
This is much easier having dedicated resources ...
… So big “Thanks” to the Cloud Enablement Team @FT
… As well as the tech teams who have implemented much of this
FT’s strategy
1. Dedicated resources
2. Use AWS Rich services rather than “instances”
3. Switch off non-prod
4. Review Operating Systems costs
5. Use newer instance types
6. Reserved Instances
Use AWS Rich services
Metric: on-demand as a % of the overall bill
Switch off non-production
Metric: non-production less than 16 instance hrs / day ~ weekends 2 hrs
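One reading of this metric is a cap of 16 instance hours per weekday and 2 per weekend day for each non-production instance. Under that assumption (the slide does not spell it out), the sketch below compares the resulting weekly hours with an always-on instance; the function names are illustrative.

```python
HOURS_ALWAYS_ON = 24 * 7  # 168 instance hours per week


def weekly_instance_hours(weekday_hours: float = 16.0,
                          weekend_hours: float = 2.0) -> float:
    """Instance hours per week for a given daily running schedule."""
    return 5 * weekday_hours + 2 * weekend_hours


def saving_vs_always_on(weekday_hours: float = 16.0,
                        weekend_hours: float = 2.0) -> float:
    """Fraction of instance hours avoided compared with running 24x7."""
    return 1 - weekly_instance_hours(weekday_hours, weekend_hours) / HOURS_ALWAYS_ON


print(weekly_instance_hours())           # hours per week at the target
print(f"{saving_vs_always_on():.0%}")    # share of hours no longer billed
```

A tighter schedule, such as 12 hours on weekdays and none at weekends, can be evaluated by passing different arguments.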
Consider OS
How much does your OS cost you?
Simple CPU cost vs $?
Licence/subscription cost overheads?
With microservices and 12-factor design, applications become less coupled to their OS
Metric: Less expensive OS instances as % of whole
Newer instances
New instances are often more performant, and around 25% cheaper
Previous Gen.  Current Gen.  % Saving  Saving Per Hour
t1.micro       t2.micro      7.5%      $0.006
m1.large       m4.large      23.2%     $0.058
m1.xlarge      m4.xlarge     26.2%     $0.115
Metric: 0 old instance types
Reserved instance types
FT’s 1 year Reserved Instances (RIs) savings estimate of ~34%
Only useful after all of the previous steps are done... otherwise you are buying the wrong RIs
Metric: X number of RIs applied
Roll out approach
Work with a team to demonstrate
That it is possible
That it works (saves $$$)
Act as techy evangelists
Challengettes
Fear of the unknown
Our stuff will not work
How to switch it on/off
Jenkins Jobs
Lambda Cronjobs
Other challengettes ...
IP Firewall rules broken when hosts come up with new IP
Monitoring noise
Schedulers broke
Differences between similar OSes
Summary
But… much higher instance productivity + $$$ saved
Obligatory recruitment slide
https://www.linkedin.com/company/financial-times/career
Thanks for listening ...
AWS Cost optimisation; http://goo.gl/M9hLC3;
https://github.com/Financial-Times/ec2-powercycle
Recap
Simple Stimulating Stretching
• Consolidated Billing
• Permissions
• Tagging
• Idle Resources
• Design for Elasticity
• Instance Right Sizing
• Storage
• Purchasing Options
• OS Licensing
• Offloading Architecture
• Higher Level Services
If you need extra help
AWS Professional Services
• A global team of consultants across EMEA, APAC and the Americas
• Work with Enterprise and mid-market customers across many industries
• Specialists in IT Transformation, Security, DevOps, Big Data and Infrastructure
• Medium to long-term engagements of one month or more
• Provide guidance, advice, and best practice, and accelerate adoption
Partners
Resources to get you started
AWS TCO Calculator: https://awstcocalculator.com
AWS Economics Centre: http://aws.amazon.com/economics/
Case Studies & Research: http://aws.amazon.com/economics/
Please remember to rate this session under My Agenda on awssummit.london
Email me if you have any questions about today’s presentation: [email protected]
Thank You!