hpc on aws
DESCRIPTION
Overview of HPC on Amazon Web Services

TRANSCRIPT
HPC on Amazon Web Services
Deepak Singh, Amazon Web Services
Dec 17, 2010
Image: Simon Cockell under CC-BY
the new reality
lots and lots and lots and lots and lots of data
lots and lots and lots and lots and lots of compute
lots and lots and lots and lots and lots of people
lots and lots and lots and lots and lots of places
constant change
goal
innovate
innovate in a new reality
optimize the most valuable resource
compute, storage, workflows, memory,
transmission, algorithms, cost, …
people drive innovation
Credit: Pieter Musterd, under a CC-BY-NC-ND license
make people productive
Credit: Pieter Musterd, under a CC-BY-NC-ND license
challenges
Your Idea → Successful Product
Great Idea, Not Prioritized
Resource Contention
Tight Budgets
Shared Resources
enter the cloud
infrastructure services
building blocks
Undifferentiated Heavy Lifting
pay as you go
pay for what you use
on demand
programmable
import boto
import boto.emr
from boto.emr.step import StreamingStep
from boto.emr.bootstrap_action import BootstrapAction
import time

# set your aws keys and S3 bucket, e.g. from environment or .boto
AWSKEY =
SECRETKEY =
S3_BUCKET =
NUM_INSTANCES = 1

conn = boto.connect_emr(AWSKEY, SECRETKEY)

bootstrap_step = BootstrapAction("download.tst",
    "s3://elasticmapreduce/bootstrap-actions/download.sh", None)

step = StreamingStep(name='Wordcount',
    mapper='s3n://elasticmapreduce/samples/wordcount/wordSplitter.py',
    cache_files=["s3n://" + S3_BUCKET + "/boto.mod#boto.mod"],
    reducer='aggregate',
    input='s3n://elasticmapreduce/samples/wordcount/input',
    output='s3n://' + S3_BUCKET + '/output/wordcount_output')

jobid = conn.run_jobflow(name="testbootstrap",
    log_uri="s3://" + S3_BUCKET + "/logs",
    steps=[step],
    bootstrap_actions=[bootstrap_step],
    num_instances=NUM_INSTANCES)

print "finished spawning job (note: starting still takes time)"

state = conn.describe_jobflow(jobid).state
print "job state = ", state
print "job id = ", jobid
while state != u'COMPLETED':
    print time.localtime()
    time.sleep(30)
    state = conn.describe_jobflow(jobid).state
    print "job state = ", state
    print "job id = ", jobid

print "final output can be found in s3://" + S3_BUCKET + "/output" + TIMESTAMP
print "try: $ s3cmd sync s3://" + S3_BUCKET + "/output" + TIMESTAMP + " ."
Connect to Elastic MapReduce
Install packages
Set up mappers & reducers
job state
elastic
Capacity
Time
Real demand
Elastic capacity
On demand → Faster to market
Pay as you go → Maintain focus
Pay to play → Efficiency
Elastic resources → Capacity planning
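The elasticity point can be made concrete with a toy cost comparison: fixed capacity must be provisioned for peak demand and billed around the clock, while elastic capacity tracks real demand hour by hour. The demand curve below is hypothetical, and the instance-hour price is simply the cc1 on-demand rate quoted elsewhere in the talk.

```python
# Toy cost comparison: fixed capacity vs. elastic, pay-as-you-go.
# All demand numbers here are hypothetical, for illustration only.

hourly_demand = [2, 2, 4, 10, 40, 10, 4, 2]  # instances needed each hour
price_per_instance_hour = 1.60               # e.g. the cc1 on-demand rate

# Fixed capacity is sized for the peak and paid for every hour,
# whether it is used or not.
fixed_capacity = max(hourly_demand)
fixed_cost = fixed_capacity * len(hourly_demand) * price_per_instance_hour

# Elastic capacity follows real demand hour by hour.
elastic_cost = sum(hourly_demand) * price_per_instance_hour

print("fixed:   $%.2f" % fixed_cost)
print("elastic: $%.2f" % elastic_cost)
```

With a spiky demand curve like this, the elastic bill is a fraction of the fixed one; with flat, fully utilized demand, the two converge.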
Computing with Amazon EC2
Credit: Angel Pizzaro, U. Penn
Credit: Tom Fifield: U. Melbourne
standard “m1”
high cpu “c1”
high memory “m2”
http://aws.amazon.com/ec2/instance-types/
EC2 instance types
listening to customers
new EC2 instance type
cluster compute instances
http://aws.amazon.com/ec2/instance-types/
2 * Xeon 5570 (“Intel Nehalem”)
23 GB RAM
10 gbps Ethernet
1690 GB local disk
HVM-based virtualization
$1.60 / hr
10 gbps
Placement Group
full bisection bandwidth
HPC on EC2 =
EC2 instance +
high bandwidth, low latency networking
http://aws.amazon.com/ec2/hpc-applications/
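In practice, that combination comes from launching cluster compute instances into a placement group. A minimal boto sketch, assuming the EC2 API of this era (`create_placement_group` and the `placement_group` argument to `run_instances`); the AMI id, group name, and instance count below are placeholders, not values from the talk.

```python
# Sketch: launch cluster compute instances into a placement group
# so they share the 10 gbps, full-bisection-bandwidth fabric.

def cluster_launch_params(ami, group, count):
    # keyword arguments for EC2Connection.run_instances: every
    # instance joins the same placement group
    return dict(image_id=ami,
                min_count=count, max_count=count,
                instance_type="cc1.4xlarge",
                placement_group=group)

def launch_hpc_cluster():
    # requires AWS credentials (environment or ~/.boto); defined
    # here but not invoked
    import boto
    conn = boto.connect_ec2()
    conn.create_placement_group("my-hpc-group")
    return conn.run_instances(
        **cluster_launch_params("ami-XXXXXXXX",  # a cluster compute (HVM) AMI
                                "my-hpc-group", 8))
```

The AMI must be an HVM image, since cluster compute instances use HVM-based virtualization.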
Linpack benchmark
880-instance CC1 cluster
Performance: 41.82 TFlops*
*#231 in the most recent Top 500 rankings
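A rough back-of-envelope on that Linpack number, assuming the Xeon 5570's 2.93 GHz clock and 4 double-precision flops per cycle per core (Nehalem figures that are not stated on the slide):

```python
# Back-of-envelope on the 880-instance Linpack run. Clock speed and
# flops/cycle are assumed Nehalem figures, not from the slides.
instances = 880
sockets, cores, ghz, flops_per_cycle = 2, 4, 2.93, 4

peak_per_instance = sockets * cores * ghz * flops_per_cycle  # GFlops
peak_total = instances * peak_per_instance / 1000.0          # TFlops

measured = 41.82                                             # TFlops, from the slide
efficiency = measured / peak_total

hourly_cost = instances * 1.60                               # $/hr at the cc1 rate

print("theoretical peak: %.1f TFlops" % peak_total)
print("Linpack efficiency: %.0f%%" % (100 * efficiency))
print("cluster cost: $%d/hour" % hourly_cost)
```

That works out to roughly half of theoretical peak, and a Top 500-class cluster billing at about $1,400 per hour with no capital outlay.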
CFD
Molecular Modeling
Sequence Analysis
Engineering Design
Energy Trading
…
high I/O applications
standard “m1”
high cpu “c1”
high memory “m2”
http://aws.amazon.com/ec2/instance-types/
cluster compute “cc1”
EC2 instance types
HPC is evolving
cluster GPU instances
http://aws.amazon.com/ec2/instance-types/
HPC on EC2 =
EC2 instance +
high bandwidth, low latency networking
+ GPU
http://aws.amazon.com/ec2/hpc-applications/
2 * Xeon 5570 (“Intel Nehalem”)
22 GB RAM
10 gbps Ethernet
1690 GB local disk
HVM-based virtualization
$2.10 / hr
2 * Tesla M2050 GPU
standard “m1”
high cpu “c1”
high memory “m2”
http://aws.amazon.com/ec2/instance-types/
cluster compute “cc1”
EC2 instance types
cluster GPU “cg1”
CFD
Molecular Dynamics
Financial Modeling
Rendering
Video Processing
…
What is your interest?
“90 percent scaling efficiency on clusters of up to 128 GPUs”
-- Mental Images iRay
Getting Started
http://aws.amazon.com/hpc-applications
4 steps
15 minutes
ecosystem
ISV ecosystem
MathWorks
mental images
RevUp Render
Elemental Technologies...
HPC with AWS
EC2 instance +
high bandwidth, low latency networking
+ Tesla GPU*
*optional
On demand → Faster to market
Pay as you go → Maintain focus
Pay to play → Efficiency
Elastic resources → Capacity planning
make people productive
Credit: Pieter Musterd, under a CC-BY-NC-ND license
Your Idea → Successful Product
Great Idea, Not Prioritized
http://aws.amazon.com/hpc-applications
[email protected]
Twitter: @mndoci
http://slideshare.net/mndoci
http://mndoci.com
Inspiration and ideas from Matt Wood, James Hamilton
& Larry Lessig
Credit: Oberazzi, under a CC-BY-NC-SA license