NHGRI cloud computing talk
Post on 12-May-2015
Scientific Computing with Amazon Web Services
Deepak Singh
NHGRI Cloud Computing Meeting, Baltimore, 2010
AWS + science = win
scale has implications
data management
data processing
data sharing
Image: Chris Dagdigian
amazon web services
the cloud
has_many :definitions
infrastructure as a service
Compute: Amazon Elastic Compute Cloud (EC2)
- Elastic Load Balancing
- Auto Scaling
Storage: Amazon Simple Storage Service (S3)
- AWS Import/Export
Your Custom Applications and Services
Content Delivery: Amazon CloudFront
Messaging: Amazon Simple Queue Service (SQS)
Payments: Amazon Flexible Payments Service (FPS)
On-Demand Workforce: Amazon Mechanical Turk
Parallel Processing: Amazon Elastic MapReduce
Monitoring: Amazon CloudWatch
Management: AWS Management Console
Tools: AWS Toolkit for Eclipse, AWS Toolkit for .NET
Isolated Networks: Amazon Virtual Private Cloud
Database: Amazon RDS and SimpleDB
• Lower pricing tiers for CloudFront • AWS Management Console
• New SimpleDB Features • FPS General Availability
• EC2 Reserved Instances • EC2 with Windows • EC2 in EU • AWS Toolkit for Eclipse
• Reserved Instances in EU • Elastic MapReduce • SQS in EU
• AWS Import/Export • Monitoring, Auto Scaling, and Elastic Load Balancing • CloudFront adds access logging
• AWS Security Center • Console support for CloudFront
• Elastic MapReduce in EU
• AWS Multi Factor Authentication • Virtual Private Cloud private beta • Lower Reserved Instance Pricing • Console Support for CloudWatch
• EBS Shared Snapshots • SimpleDB in EU • Monitoring in EU • Auto Scaling in EU • Elastic Load Balancing in EU • AWS Solutions Provider program
• RDS Launched • High Memory Instances • Reduced EC2 Pricing • EMR Apache Hive support
• SAS 70 Type II Audit • AWS SDK for .NET • CloudFront Private Content • APAC announced
• Boot from EBS • US West Region • VPC Unlimited Beta • ELB Support in Console • CloudFront streaming • EC2 Spot Instances • Windows 2008 Support • Lowered Prices • AWS Economics Center
elasticity
3000 CPUs for one firm’s risk management application
[Chart: Number of EC2 Instances per day, Wednesday 4/22/2009 through Tuesday 4/28/2009; axis marked at 300 and 3000; annotation: 300 CPUs on weekends]
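A rough worked example of what this usage pattern implies, sketched in Python. The 3000 and 300 CPU figures come from the slide and its chart labels; the flat five-weekday, two-weekend-day profile is an assumption made here for illustration, not something the slide states.

```python
# Assumption: a flat 3000 CPUs on each weekday and 300 on each weekend day.
PEAK_CPUS = 3000
WEEKEND_CPUS = 300
WEEKDAYS, WEEKEND_DAYS = 5, 2

def weekly_cpu_days(weekday_cpus, weekend_cpus):
    """CPU-days consumed in one week for the given daily CPU counts."""
    return weekday_cpus * WEEKDAYS + weekend_cpus * WEEKEND_DAYS

elastic = weekly_cpu_days(PEAK_CPUS, WEEKEND_CPUS)  # scale down on weekends
fixed = weekly_cpu_days(PEAK_CPUS, PEAK_CPUS)       # provision for peak all week
savings = 1 - elastic / fixed                       # share of peak capacity left unused

print(f"elastic: {elastic} CPU-days, fixed: {fixed} CPU-days")
print(f"capacity not needed: {savings:.1%}")
```

Under these assumptions, roughly a quarter of a peak-sized fixed cluster would sit idle each week.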
scale
> 1PB of data in S3
high availability
Image: Chris Dagdigian
“Everything fails, all the time” -- Werner Vogels
“Things will crash. Deal with it” -- Jeff Dean
2-4% of servers will die annually
Source: Jeff Dean, LADIS 2009
1-5% of disk drives will die every year
Source: Jeff Dean, LADIS 2009
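To make these rates concrete, here is a small Python sketch that turns them into expected failures per year. The rate ranges are the ones quoted on the slides; the fleet sizes are hypothetical, chosen only for illustration.

```python
# Annual failure-rate ranges quoted on the slides (Jeff Dean, LADIS 2009).
SERVER_RATE = (0.02, 0.04)
DISK_RATE = (0.01, 0.05)

def expected_failures(count, rate_range):
    """Return (low, high) expected failures per year for `count` devices."""
    low, high = rate_range
    return count * low, count * high

# Hypothetical fleet sizes, not taken from the talk.
servers, disks = 1000, 4000

print("servers/year:", expected_failures(servers, SERVER_RATE))
print("disks/year:  ", expected_failures(disks, DISK_RATE))
```

Even a modest fleet of this size would lose hardware several times a month, which is why failure has to be treated as routine.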
human errors
~20% of admin issues have unintended consequences
Source: James Hamilton
scalable & available
assume sw/hw failure
design apps to be resilient
automation & alarming
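One common way to act on "assume failure" in application code is to retry transient errors with exponential backoff. The sketch below is a generic illustration, not code from the talk or from any AWS library; `call_with_retries` and `flaky` are hypothetical names.

```python
import random
import time

def call_with_retries(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying with exponential backoff and jitter on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure so it can be alarmed on
            # Base wait doubles each attempt (0.1s, 0.2s, 0.4s, ...), then gets jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay * random.uniform(0.5, 1.5))

# Hypothetical flaky operation: fails twice, then succeeds.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"

# Pass a no-op sleep so the demo does not actually wait.
result = call_with_retries(flaky, sleep=lambda seconds: None)
print(result, "after", calls["count"], "attempts")
```

The jitter spreads retries out so that many clients recovering from the same failure do not all retry at the same instant.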
US East Region
Availability Zone A
Availability Zone B
Availability Zone C
Availability Zone D
elastic load balancing
CloudWatch
auto scaling
elastic block store
elastic IP
SQS
flexibility
on-demand instances
reserved instances
spot instances
some implications
computing platforms
http://cyclecomputing.com
http://wiki.github.com/documentcloud/cloud-crowd
sudo gem install cloud-crowd
Input S3 bucket
Output S3 bucket
Amazon S3
Hadoop
Amazon EC2 Instances
Input dataset
output results
Deploy Application
Web Console, Command line tools
End
Notify
Get Results
Input Data
Amazon Elastic MapReduce
Elastic MapReduce
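The flow in this diagram (input data in, a Hadoop job across EC2 instances, results out) follows the MapReduce model. Below is a minimal in-process Python sketch of that model using word counting. It is not the Elastic MapReduce or Hadoop API; the function names and the sample records are hypothetical.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, 1) for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

# Hypothetical input records standing in for objects in the input bucket.
records = ["the cloud", "The data", "data in the cloud"]

pairs = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

In a real cluster the map and reduce calls would run on separate instances, with the framework handling the shuffle and rerunning tasks that fail.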
application platforms
Image: O’Reilly Radar
software distribution
http://bitbucket.org/galaxy/galaxy-central/wiki/Home
data distribution
http://aws.amazon.com/publicdatasets/
to conclude
built for scale
built for availability
shared dataspaces
common namespaces
task-based resources
new software architectures
new computing platforms
Data Platform
App Platform
available today
deesingh@amazon.com
Twitter: @mndoci
Presentation ideas from James Hamilton, @mza, and @lessig
Thank you!