NEW LAUNCH! Introducing AWS Batch: Easy and Efficient Batch Computing on Amazon Web Services

Upload: amazon-web-services

Post on 06-Jan-2017

TRANSCRIPT

Page 1: NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amazon Web Services

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Dougal Ballantyne and Jamie Kinney

Principal Product Managers, AWS Batch

December 2016

CMP323

NEW LAUNCH!

Introducing AWS Batch: Easy and Efficient Batch Computing on Amazon Web Services

Page 2

Agenda

• A brief history of batch computing

• AWS Batch overview and concepts

• Use cases

• Let’s take it for a spin!

• Q&A

Page 3

What is batch computing?

Run jobs asynchronously and automatically across one or more computers.

Jobs may have dependencies, making the sequencing and scheduling of multiple jobs complex and challenging.

Page 4

Batch computing has been around for a while…

Images from: history.nasa.gov

Page 5

Early Batch APIs

Page 6

CRAY-1: 1976

• First commercial supercomputer

• 167 million calculations/second

• US$8.86 million ($7.9 million plus $1 million for disk)

CRAY-1 on display in the hallways of the EPFL in Lausanne. https://commons.wikimedia.org/wiki/File:Cray_1_IMG_9126.jpg

Page 7

Early Batch on AWS: NY Times TimesMachine

aws.amazon.com/blogs/aws/new-york-times/

In 2007 the New York Times processed 130 years of archives in 36 hours.

11 million articles & 4 TB of data

AWS services used: Amazon S3, SQS, EC2, and EMR

Total cost (in 2007): $890 ($240 compute + $650 storage)

http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/

Page 8

Batch Computing On-Premises

Page 9

[Diagram: blocks labeled CPU, RAM, and I/O]

Page 10

[Diagram: blocks labeled CPU, RAM, and I/O]

Page 11

[Diagram: blocks labeled RAM, I/O, GPU, Storage, CPU, and FPGA]

Page 12

How does this work in the cloud?

Page 13

[Diagram: resource blocks paired with EC2 instance families: RAM (R4), I/O (I3), GPU (P2), Storage (D2), CPU (C5), FPGA (F1)]

Page 14

However, batch computing could be easier…

AWS Components:

• EC2

• Spot Fleet

• Auto Scaling

• SNS

• SQS

• CloudWatch

• AWS Lambda

• S3

• DynamoDB

• API Gateway

• …

Page 15

Introducing AWS Batch

Fully managed

No software to install or servers to manage. AWS Batch provisions, manages, and scales your infrastructure.

Integrated with AWS

Natively integrated with the AWS platform, AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition.

Cost-optimized resource provisioning

AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and Spot.

Page 16

Introducing AWS Batch

• Fully managed batch primitives

• Focus on your applications (shell scripts, Linux executables, Docker images) and their resource requirements

• We take care of the rest!

Page 17

AWS Batch Concepts

• Jobs

• Job definitions

• Job queues

• Compute environments

• Scheduler

Page 18

Job Queues

Jobs are submitted to a job queue, where they reside until they can be scheduled onto a compute resource.

$ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order ...
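
The same request can also be assembled in code. The following is a minimal Python sketch, not the presenters' own tooling: the field names follow the boto3 Batch `create_job_queue` call as I understand it, and the compute environment names `genomics-spot` and `genomics-on-demand` are hypothetical stand-ins, since the slide elides the `--compute-environment-order` value.

```python
def build_job_queue_request(name, priority, compute_environments):
    """Return keyword arguments describing a job queue.

    compute_environments is a list of names or ARNs in preference order;
    the first entry gets order 1, the second order 2, and so on.
    """
    if not compute_environments:
        raise ValueError("a job queue needs at least one compute environment")
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": env}
            for position, env in enumerate(compute_environments, start=1)
        ],
    }


# Hypothetical compute environment names, used only for illustration.
request = build_job_queue_request(
    "genomics", 500, ["genomics-spot", "genomics-on-demand"]
)

# With credentials configured, the request could be sent with boto3:
#   import boto3
#   boto3.client("batch").create_job_queue(**request)
```

Keeping the request builder separate from the API call makes it easy to inspect or log the request before anything is created.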

Page 19

AWS Batch Concepts

Job queues are mapped to one or more compute environments containing the EC2 instances used to run containerized batch jobs.

Managed compute environments enable you to describe your business requirements (instance types, min/max/desired vCPUs, and Spot bid as a percentage of the On-Demand price) and we launch and scale resources on your behalf.

Alternatively, you can launch and manage your own resources within an unmanaged compute environment. Your instances need to include the ECS agent and run supported versions of Linux and Docker.
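
As a rough illustration of the managed case, the Python sketch below collects the requirements named above (instance types, min/max/desired vCPUs, and an optional Spot bid percentage) into one request. The field names follow the boto3 Batch `create_compute_environment` call as I understand it; the environment name and instance types are hypothetical, and a real request would also need networking and IAM settings (subnets, security groups, instance and service roles), which are omitted here.

```python
def build_managed_environment_request(name, instance_types, min_vcpus,
                                      max_vcpus, desired_vcpus,
                                      spot_bid_percentage=None):
    """Describe a managed compute environment as keyword arguments."""
    if not 0 <= min_vcpus <= desired_vcpus <= max_vcpus:
        raise ValueError("expected 0 <= min <= desired <= max vCPUs")
    resources = {
        "type": "EC2",  # On-Demand unless a Spot bid is given below
        "instanceTypes": list(instance_types),
        "minvCpus": min_vcpus,
        "maxvCpus": max_vcpus,
        "desiredvCpus": desired_vcpus,
    }
    if spot_bid_percentage is not None:
        if not 1 <= spot_bid_percentage <= 100:
            raise ValueError("Spot bid must be 1-100 percent of On-Demand")
        resources["type"] = "SPOT"
        resources["bidPercentage"] = spot_bid_percentage
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": resources,
    }


# Hypothetical name and instance types, for illustration only.
environment = build_managed_environment_request(
    "genomics-spot", ["c4.large", "r4.large"],
    min_vcpus=0, max_vcpus=256, desired_vcpus=16,
    spot_bid_percentage=40,
)
```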

Page 20

Job Definitions

Similar to ECS task definitions, AWS Batch job definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden.

Some of the attributes specified in a job definition:

• IAM role associated with the job

• vCPU and memory requirements

• Mount points

• Container properties

• Environment variables

$ aws batch register-job-definition --job-definition-name gatk --container-properties ...
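
To make the relationship between a job definition and per-job overrides concrete, here is a small Python sketch. The container fields (`image`, `vcpus`, `memory`, `command`, `jobRoleArn`, `environment`) follow the boto3 Batch `register_job_definition` call as I understand it; the image name, command, and memory figures are hypothetical, and `effective_container` is a simplified illustration of overriding, not a description of how the service merges values internally.

```python
def build_job_definition(name, image, vcpus, memory_mib, command,
                         job_role_arn=None, environment=None):
    """Return keyword arguments describing a container job definition."""
    container = {
        "image": image,
        "vcpus": vcpus,
        "memory": memory_mib,
        "command": list(command),
        "environment": [
            {"name": key, "value": value}
            for key, value in sorted((environment or {}).items())
        ],
    }
    if job_role_arn:
        container["jobRoleArn"] = job_role_arn
    return {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": container,
    }


def effective_container(definition, overrides):
    """Show which settings a job would run with after overrides.

    Simplification: an overridden key replaces the default wholesale.
    """
    merged = dict(definition["containerProperties"])
    merged.update(overrides)
    return merged


# Hypothetical image and command, for illustration only.
definition = build_job_definition(
    "gatk", "example.com/gatk:latest", vcpus=4, memory_mib=8192,
    command=["run-gatk.sh"], environment={"REFERENCE": "hg38"},
)

# One job needs more memory than the definition's default.
settings = effective_container(definition, {"memory": 16384})
```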

Page 21

Jobs

Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2.

Containerized jobs can reference a container image, command, and parameters, or users can simply provide a .zip containing their application and we will run it on a default Amazon Linux container.

$ aws batch submit-job --job-name variant-calling --job-definition gatk --job-queue genomics

Page 22

Easily run massively parallel jobs

Today, users can submit a large number of independent “simple jobs.” In the coming weeks, we will add support for “array jobs” that run many copies of an application against an array of elements.

Array jobs are an efficient way to run:

• Parametric sweeps

• Monte Carlo simulations

• Processing a large collection of objects
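
Until array jobs are available, a parametric sweep can be expressed as many independent simple jobs. The Python sketch below generates one submit request per combination of parameter values, passing each combination to the container as environment variables. The field names follow the boto3 Batch `submit_job` call as I understand it; the parameter names `TEMPERATURE` and `SEED` and the `sweep` name prefix are hypothetical.

```python
from itertools import product


def sweep_requests(job_queue, job_definition, name_prefix, parameters):
    """Yield one submit-job request per combination of parameter values.

    parameters maps a variable name to the list of values to sweep over.
    """
    names = sorted(parameters)
    combinations = product(*(parameters[name] for name in names))
    for index, values in enumerate(combinations):
        yield {
            "jobName": f"{name_prefix}-{index}",
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            "containerOverrides": {
                "environment": [
                    {"name": name, "value": str(value)}
                    for name, value in zip(names, values)
                ]
            },
        }


# Hypothetical sweep: 2 seeds x 3 temperatures = 6 independent jobs.
requests = list(sweep_requests(
    "genomics", "gatk", "sweep",
    {"TEMPERATURE": [280, 300, 320], "SEED": [1, 2]},
))

# Each request could then be submitted, for example:
#   import boto3
#   batch = boto3.client("batch")
#   for request in requests:
#       batch.submit_job(**request)
```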

Page 23

Workflows and Job Dependencies

Page 24

Workflows, Pipelines, and Job Dependencies

Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job.

Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.

$ aws batch submit-job --depends-on 606b3ad1-aa31-48d8-92ec-f154bfc8215f ...
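
The DAG-based approach can be sketched in a few lines of Python: walk the workflow in topological order so that every step is submitted after its parents, and pass the parents' job IDs as dependencies. This is an illustrative sketch, not a particular workflow engine. The step names are hypothetical, `fake_submit` is a stand-in that returns made-up IDs instead of calling AWS, and the `dependsOn` shape follows the boto3 Batch `submit_job` call as I understand it. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter
from itertools import count


def submit_workflow(dependencies, submit):
    """Submit every step after its parents and return {step: job_id}.

    dependencies maps a step name to the set of steps it depends on.
    submit(step, parent_job_ids) must return the new job's ID.
    """
    job_ids = {}
    for step in TopologicalSorter(dependencies).static_order():
        parents = sorted(dependencies.get(step, ()))
        job_ids[step] = submit(step, [job_ids[parent] for parent in parents])
    return job_ids


# Stand-in for a real submission call: records requests, returns fake IDs.
submitted = []
_fake_ids = count(1)


def fake_submit(step, parent_job_ids):
    job_id = f"job-{next(_fake_ids)}"
    submitted.append({
        "jobName": step,
        "dependsOn": [{"jobId": parent} for parent in parent_job_ids],
        "jobId": job_id,
    })
    return job_id


# Hypothetical genomics workflow.
workflow = {
    "align": set(),
    "call-variants": {"align"},
    "annotate": {"call-variants"},
    "report": {"call-variants", "annotate"},
}
job_ids = submit_workflow(workflow, fake_submit)
```

Because all jobs are submitted up front, the scheduler, rather than the client, holds each job until its dependencies have completed.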

Page 25

AWS Batch Concepts

The Scheduler evaluates when, where, and how to run jobs that have been submitted to a job queue.

Jobs run in approximately the order in which they are submitted as long as all dependencies on other jobs have been met.
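
The ordering rule can be illustrated with a toy simulation. The Python sketch below is a deliberately simplified model, assuming one job runs at a time and every job succeeds; it is not how the AWS Batch scheduler is implemented, and it ignores resources, priorities, and parallelism. It only shows how "submission order, subject to dependencies" plays out.

```python
def next_runnable(queue, succeeded):
    """Return the earliest-submitted job whose dependencies have succeeded.

    queue is a list of (job_id, depends_on) tuples in submission order.
    Returns None when every remaining job is still blocked.
    """
    for job_id, depends_on in queue:
        if all(parent in succeeded for parent in depends_on):
            return job_id
    return None


def run_order(queue):
    """Simulate one-at-a-time execution and return the resulting order."""
    pending = list(queue)
    succeeded = set()
    order = []
    while pending:
        job_id = next_runnable(pending, succeeded)
        if job_id is None:
            raise ValueError("remaining jobs have unmet dependencies")
        order.append(job_id)
        succeeded.add(job_id)  # toy assumption: every job succeeds
        pending = [item for item in pending if item[0] != job_id]
    return order


# "c" was submitted first but must wait for "b", which waits for "a".
queue = [("c", ["b"]), ("a", []), ("d", []), ("b", ["a"])]
order = run_order(queue)
```

In this example the jobs without dependencies ("a", then "d") run in submission order, and "c" runs last even though it was submitted first.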

Page 26

AWS Batch Pricing

There is no charge for AWS Batch; you only pay for the underlying resources that you consume!

Page 27

AWS Batch Availability

• Currently in preview in the US East (Northern Virginia) Region

• Support for array jobs and jobs executed as AWS Lambda functions coming soon!

Page 28

Some ideas for inspiration

Page 29

DNA Sequencing

Page 30

Genomics on Unmanaged Compute Environments

Page 31

Computational Chemistry

Page 32

Media Encoding/Transcoding

Page 33

Animation and Visual Effects Rendering

Page 34

Financial Trading Analytics

Page 35

Would you like to see a demo?

Page 36

Fully managed

Integrated with AWS

Cost-optimized resource provisioning

Page 37

Any questions?

Page 38

Thank you!

Page 39

Remember to complete your evaluations!