Post on 13-Apr-2017


Amazon Simple Work Flow Engine (SWF): How Beamr uses SWF for video optimization in the cloud

Dan Julius, VP R&D, Beamr

There are few times when something new comes along that makes me think the following:

- Really? Nahh...
- Hmm.. Maybe..
- OMG!! How the f*** does it do that?
- GIVE IT TO ME NOW!!

“Lens In The Face

- Optimized ±5B Photos in 2015

- Reduces JPEG file size by 20%-80%

- Community of tens of thousands of photographers

- Free trial available at jpegmini.com

- Enterprise and consumer products

SOURCE: 1080p @ 3.5 Mbps

BEAMR VIDEO: 1080p @ 2.1 Mbps

*Images are courtesy of Universal Studios

REDUCED BY 40%

BEAMR KEY FEATURES

©2016 Beamr Imaging Ltd. All Rights Reserved

- Bitrate reduction: Significant reduction in bitrate or file size
- Always safe: Retains original image or video quality
- Fully automatic: No additional QC steps required
- Standard compliant: No modification needed for existing players

Agenda


1. Why workflows?

2. SWF Concepts

3. Tips and Gotchas

4. Q&A

For Python → Ask me later

Java/Ruby → Check out the Flow Framework

No Code

Workflows

A Simple Workflow

But…

Reality isn’t always simple

Simple Workflow Service (SWF)

The Players


Activity Workers

DECIDERS

SWF

Activity Workers

Responsible for doing the “work”

Typically a long-running process

Poll task-list → Do work

While True:
    Poll an SWF task-list
    Process the task
    Return result to SWF
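The worker loop can be sketched in Python roughly as follows. This is a minimal sketch, not the code used in the talk: `client` is assumed to be a boto3 SWF client (or anything exposing the same `poll_for_activity_task` and `respond_activity_task_completed` methods), and `handler`, `identity="worker-1"` and `max_polls` are hypothetical names added for illustration.

```python
def run_activity_worker(client, domain, task_list, handler, max_polls=None):
    """Poll an SWF task-list, process each task, return the result to SWF."""
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Long poll: a response without a taskToken means no task arrived in time.
        task = client.poll_for_activity_task(
            domain=domain,
            taskList={"name": task_list},
            identity="worker-1",  # hypothetical worker name, shown in the history
        )
        token = task.get("taskToken")
        if not token:
            continue
        # handler is assumed to take the task input string and return a string.
        result = handler(task.get("input", ""))
        client.respond_activity_task_completed(taskToken=token, result=result)
```

With `max_polls=None` the loop runs forever, which matches the "long-running process" description; the bound exists only to make the sketch easy to try out.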

Deciders

Responsible for making workflow decisions

Typically a long-running process

While True:
    Poll task-list for a decision
    Analyze execution history
    Make a decision on next step(s)
    Return result to SWF
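A decider loop has the same shape. The sketch below assumes `client` behaves like a boto3 SWF client and that `decide` is a hypothetical function mapping a list of history events to a list of decisions. It only reads the first page of history, which is enough for short executions.

```python
def run_decider(client, domain, task_list, decide, max_polls=None):
    """Poll for decision tasks, analyze the history, return decisions to SWF."""
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        task = client.poll_for_decision_task(
            domain=domain, taskList={"name": task_list}
        )
        token = task.get("taskToken")
        if not token:
            continue  # long poll ended without a decision task
        # The decider holds no state of its own: everything comes from the history.
        decisions = decide(task.get("events", []))
        client.respond_decision_task_completed(taskToken=token, decisions=decisions)
```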

SWF Responsibilities

Coordinate system components

Schedule “decision tasks” (a queue?)

Schedule “activity tasks” (a queue?)

Maintain workflow state

Catch errors and timeouts

Provide an API to track workflow-execution progress

Does No work. Makes No decisions.

Example of Two-Task Workflow Life Cycle

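Participants: SWF, Decider, Worker A, Worker B

Workflow: START → Task A → Task B → END

- Start execution → “Starting...”
- SWF to Decider: “New execution. What to do?” → “Schedule A”
- SWF to Worker A: “A task for you...” → “Completed. Result is...”
- SWF to Decider: “New history. What now?” → “Schedule B”
- SWF to Worker B: “A task for you...” → “Completed. Result is...”
- SWF to Decider: “New history. What now?” → “Close. Result is...”
- Meanwhile: “Are we done?” → “No”, and later “Are we done?” → “Yes”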
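The decider side of this life cycle can be written as a pure function over the history. This is a sketch under assumptions: the activity names `TaskA` and `TaskB`, the version `"1.0"` and the activity IDs are hypothetical, and the activity types are assumed to be registered with a default task list and default timeouts so the decisions need no further attributes. It also assumes decision tasks arrive only at start and after each completion, as in the diagram.

```python
def schedule(name, activity_id, data=""):
    """Build a ScheduleActivityTask decision."""
    return {
        "decisionType": "ScheduleActivityTask",
        "scheduleActivityTaskDecisionAttributes": {
            "activityType": {"name": name, "version": "1.0"},
            "activityId": activity_id,
            "input": data,
        },
    }


def decide_two_tasks(events):
    """Schedule A, then B with A's result, then close with B's result."""
    by_id = {e["eventId"]: e for e in events}
    completed = {}
    for e in events:
        if e["eventType"] == "ActivityTaskCompleted":
            attrs = e["activityTaskCompletedEventAttributes"]
            # The completion event points back at the event that scheduled it.
            sched = by_id[attrs["scheduledEventId"]]
            name = sched["activityTaskScheduledEventAttributes"]["activityType"]["name"]
            completed[name] = attrs.get("result", "")
    if "TaskA" not in completed:
        return [schedule("TaskA", "task-a")]
    if "TaskB" not in completed:
        return [schedule("TaskB", "task-b", completed["TaskA"])]
    return [{
        "decisionType": "CompleteWorkflowExecution",
        "completeWorkflowExecutionDecisionAttributes": {"result": completed["TaskB"]},
    }]
```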

Tips for Activity Workers


Do only one thing

Be Stateless

Catch all exceptions and return failure

Send heartbeats (and check responses)

While True:
    Poll an SWF task-list
    Process the task
    Return result to SWF
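The tips about exceptions and heartbeats might look like this for a single task. It is a sketch, assuming `client` behaves like a boto3 SWF client; `steps` is a hypothetical list of functions that each transform the running result, with a heartbeat sent after every step.

```python
def process_task(client, task, steps):
    """Run one activity task, heartbeating between steps and reporting failures."""
    token = task["taskToken"]
    try:
        result = task.get("input", "")
        for step in steps:
            result = step(result)
            beat = client.record_activity_task_heartbeat(
                taskToken=token, details=step.__name__
            )
            # Check the response: SWF uses it to say a cancel was requested.
            if beat.get("cancelRequested"):
                client.respond_activity_task_canceled(
                    taskToken=token, details="cancel requested"
                )
                return "canceled"
        client.respond_activity_task_completed(taskToken=token, result=str(result))
        return "completed"
    except Exception as exc:  # catch everything so SWF learns about the failure
        client.respond_activity_task_failed(
            taskToken=token, reason=type(exc).__name__, details=str(exc)[:1000]
        )
        return "failed"
```

Without the `except` branch a crashed task would only surface once its timeout expires, which is usually much later than the failure itself.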

Tips for Deciders

Focus on Decision Making. Avoid doing any work.

Be Stateless

Decide based on entire* execution history

While True:
    Poll SWF for an execution
    Analyze execution history
    Make a decision on next step(s)
    Return result to SWF

Parallelize tasks when appropriate by returning multiple decisions.

Catch all exceptions and always return a valid decision

Expect the unexpected... (e.g. a decision may be rejected)
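One way to "always return a valid decision" is to wrap the decision logic. In this sketch `decide` is a hypothetical function from history events to a list of decisions; failing the whole workflow on an unexpected error is an assumed policy, not something the talk prescribes.

```python
def safe_decide(decide, events):
    """Run the decision logic; on any error, return a FailWorkflowExecution decision."""
    try:
        decisions = decide(events)
        if not isinstance(decisions, list):
            raise TypeError("decider must return a list of decisions")
        return decisions
    except Exception as exc:
        return [{
            "decisionType": "FailWorkflowExecution",
            "failWorkflowExecutionDecisionAttributes": {
                "reason": type(exc).__name__[:256],
                "details": str(exc)[:1000],
            },
        }]
```

Returning an empty list is also valid and simply means "nothing to do yet".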

The Execution History

[
  {
    "eventId": 11,
    "eventTimestamp": 1326671603.102,
    "eventType": "WorkflowExecutionTimedOut",
    ...
  },
  ...
  {
    "activityTaskScheduledEventAttributes": {
      "activityId": "verification-27",
      "activityType": { "name": "activityVerify", "version": "1.0" },
      ...
      "input": "5634-0056-4367-0923,12/12,437",
      ...
    },
    "eventId": 8,
    ...
    "eventType": "ActivityTaskScheduled"
  },
  ...
  {
    ...
    "eventId": 2,
    "eventTimestamp": 1326668003.094,
    "eventType": "DecisionTaskScheduled"
  }
]

The workflow-execution “state”

It’s just JSON

- Every event is noted
- Decisions are noted
- Scheduled tasks are noted
- Start and Finish are noted
- …

How to make a decision?
- Read the execution history
- Apply some “logic”
- Return one or more decisions
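Because the history is plain JSON, "reading" it is ordinary list processing. The helper below is a hypothetical illustration that sorts events by `eventId` and summarizes them; a real decider would apply its own logic at this point.

```python
from collections import Counter


def summarize_history(events):
    """Summarize a list of SWF history events (dicts with eventId and eventType)."""
    ordered = sorted(events, key=lambda e: e["eventId"])
    counts = Counter(e["eventType"] for e in ordered)
    last = ordered[-1]["eventType"] if ordered else None
    return {"count": len(ordered), "by_type": dict(counts), "last_event": last}


history = [
    {"eventId": 11, "eventType": "WorkflowExecutionTimedOut"},
    {"eventId": 8, "eventType": "ActivityTaskScheduled"},
    {"eventId": 2, "eventType": "DecisionTaskScheduled"},
]
summary = summarize_history(history)
```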


DEMO

CLI Demo


A distributed “Image Processing” workflow

Prerequisites:

- AWS Account, IAM Credentials
- AWS CLI
- bash / jq / ImageMagick / JPEGmini

CLI Demo - Registration

Register Domain / Workflow / Activity Types

We use “code” to configure SWF. No CloudFormation available.

LOFT_Domain

demo (Domain)

Workflow: Process Media

Activities: Ingest, Optimize, Thumbnail, Cleanup
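The talk's demo does the registration with the AWS CLI. A rough Python equivalent is sketched here, assuming `client` behaves like a boto3 SWF client. The domain name `demo`, the workflow name `ProcessMedia`, the version `"1.0"` and the 7-day retention period are illustrative assumptions.

```python
ACTIVITIES = ("Ingest", "Optimize", "Thumbnail", "Cleanup")


def register_demo(client, domain="demo", version="1.0"):
    """Register the domain, one workflow type and the four activity types."""
    client.register_domain(
        name=domain,
        workflowExecutionRetentionPeriodInDays="7",  # the API expects a string
    )
    client.register_workflow_type(domain=domain, name="ProcessMedia", version=version)
    for activity in ACTIVITIES:
        client.register_activity_type(domain=domain, name=activity, version=version)
```

Registering a domain or type that already exists raises an "already exists" fault, so a script meant to be re-run would want to catch and ignore that error.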

CLI Demo - Process Media Workflow

START → Ingest → Optimize and Thumbnail → Cleanup → END

- Ingest: Download from Web to EFS
- Optimize, Thumbnail: Process files on EFS, upload results to S3
- Cleanup: Remove files from EFS
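The ordering logic for this workflow can be sketched as a pure function, assuming Optimize and Thumbnail run in parallel between Ingest and Cleanup. The function name and its set-based inputs are hypothetical; a decider would derive the two sets from the execution history and turn each returned name into a schedule decision.

```python
def next_activities(scheduled, completed):
    """Return activity names to schedule now, [] to wait, or None when finished.

    scheduled / completed are sets of activity names taken from the history.
    """
    if "Ingest" not in scheduled:
        return ["Ingest"]
    if "Ingest" not in completed:
        return []
    # Fan out: both processing steps are returned in one decision batch.
    fan_out = [a for a in ("Optimize", "Thumbnail") if a not in scheduled]
    if fan_out:
        return fan_out
    # Fan in: Cleanup must wait until both processing steps have completed.
    if not {"Optimize", "Thumbnail"} <= completed:
        return []
    if "Cleanup" not in scheduled:
        return ["Cleanup"]
    return None if "Cleanup" in completed else []
```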


Tips and Gotchas

Some Advanced Topics


Child Workflows
- Simplify large workflows
- Reuse child-workflows

Signals
- Sent from somewhere (external / activity)
- SWF will schedule a Decision-Task


[
  ...
  {
    "eventId": 153,
    "eventTimestamp": 1326671603.102,
    "eventType": "WorkflowExecutionSignaled",
    "workflowExecutionSignaledEventAttributes": { ... }
    ...
  },
  ...
]
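Both sides of a signal can be sketched briefly. `send_signal` assumes `client` behaves like a boto3 SWF client; `signals_since` is a hypothetical helper a decider could use to pick out signal events it has not yet handled.

```python
def send_signal(client, domain, workflow_id, name, payload=""):
    """Send a signal to a running execution; SWF then schedules a decision task."""
    client.signal_workflow_execution(
        domain=domain, workflowId=workflow_id, signalName=name, input=payload
    )


def signals_since(events, last_handled_event_id=0):
    """Return (signalName, input) pairs for signal events after the given eventId."""
    found = []
    for e in events:
        if e["eventType"] == "WorkflowExecutionSignaled" and e["eventId"] > last_handled_event_id:
            attrs = e["workflowExecutionSignaledEventAttributes"]
            found.append((attrs["signalName"], attrs.get("input", "")))
    return found
```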

High Availability


REST API

At least two deciders, and two workers of each type, across multiple zones

The Execution History

Beware of the “LastEvent” samples

Note the previousStartedEventId

When the history gets long, note the nextPageToken.

Pages are small (100 items). Beware of rate limits.

- Cache history pages
- Use ChildWorkflows or ContinueAsNewWorkflow
- Use short names in input and results fields
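Following `nextPageToken` and using `previousStartedEventId` might look like this. It is a sketch assuming `client` behaves like a boto3 SWF client and `first_page` is the response of the initial `poll_for_decision_task` call; both helper names are hypothetical.

```python
def full_history(client, domain, task_list, first_page):
    """Collect all history events for a decision task by following nextPageToken."""
    events = list(first_page.get("events", []))
    token = first_page.get("nextPageToken")
    while token:
        # Each extra page is another API call, so long histories hit rate limits.
        page = client.poll_for_decision_task(
            domain=domain, taskList={"name": task_list}, nextPageToken=token
        )
        events.extend(page.get("events", []))
        token = page.get("nextPageToken")
    return events


def new_events(events, previous_started_event_id):
    """Keep only events recorded after the previously processed decision task."""
    return [e for e in events if e["eventId"] > previous_started_event_id]
```

Looking at everything after `previousStartedEventId` rather than only the last event avoids missing events that arrived together in one batch.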


Monitoring and Scaling


Processes to Monitor
- Monitor each decider process
- Monitor each worker process

Metrics to Monitor
- Activity task list sizes
- Decider task list sizes
- DecisionTaskScheduleToStartTime
- ActivityTaskScheduleToStartTime

Scale Workers with Spot instances
- Use short timeouts
- Heartbeats
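Task list sizes can be read through the SWF API. The sketch below assumes `client` behaves like a boto3 SWF client; the scaling rule in `workers_needed` (5 pending tasks per worker, never fewer than 2 workers) is a made-up heuristic for illustration only.

```python
def backlog(client, domain, activity_lists, decider_list):
    """Return the number of pending tasks per task list."""
    report = {}
    for name in activity_lists:
        resp = client.count_pending_activity_tasks(
            domain=domain, taskList={"name": name}
        )
        report[name] = resp["count"]
    resp = client.count_pending_decision_tasks(
        domain=domain, taskList={"name": decider_list}
    )
    report[decider_list] = resp["count"]
    return report


def workers_needed(pending, per_worker=5, minimum=2):
    """Hypothetical rule: one worker per `per_worker` pending tasks, rounded up."""
    return max(minimum, -(-pending // per_worker))
```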

Things Will Break...

API calls may fail, therefore back off and retry
- Rate limiting
- Network errors

Expect the unexpected:
- Decisions or Responses may be rejected
- Closing a workflow execution could fail due to pending signals
- An Activity Worker response may fail due to cancelled execution
- Events may be aggregated, or delivered out of order
- Task lists are only mostly FIFO
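"Back off and retry" is commonly done with exponential backoff plus jitter. This is a generic sketch, not tied to any SWF client; the parameter names and default values are assumptions, and a real worker would narrow `retry_on` to throttling and network errors.

```python
import random
import time


def call_with_backoff(fn, retries=5, base=0.5, cap=30.0,
                      sleep=time.sleep, retry_on=(Exception,)):
    """Call fn(); on failure wait a random, exponentially growing delay and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: let the caller see the error
            # Full jitter: pick a delay between 0 and the capped exponential bound.
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            sleep(delay)
```

A call such as `call_with_backoff(lambda: client.respond_activity_task_completed(taskToken=token, result=result))` would wrap a single API request, with `client`, `token` and `result` standing in for the caller's own values.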


Q&A

Dan Julius, VP R&D, Beamr

Backup Slides

Workflows vs. Messages


A complex workflow requires more than just passing messages from task to task…

Workflows have “state”

Tasks might need to be synchronized

Tasks can fail or timeout

How SWF Works (maybe)

[Diagram: SWF holds the execution history and a task-list for the Decider, Worker A and Worker B; each of them polls its task-list.]

The Manual Worker


The worker process:

While True:
    Poll an SWF task-list
    Send email to manual-worker

The employee:

When an email is received,
    Process task, and submit completion form

[Diagram: SWF → Worker Process → Email → Manual Labour → Submit Form → Web Server → Respond to SWF]

Timeouts


More Advanced Topics


Markers
- Added by deciders
- Simplify understanding history
- “Milestones”

Timers
- Set by deciders
- SWF will schedule a Decision-Task when timer fires

Priorities
- SWF can re-order tasks based on Priority

[
  ...
  {
    "eventId": 153,
    "eventTimestamp": 1326671603.102,
    "eventType": "TimerFired",
    "timerFiredEventAttributes": { ... }
    ...
  },
  ...
]

[
  ...
  {
    "eventId": 153,
    "eventTimestamp": 1326671603.102,
    "eventType": "MarkerRecorded",
    "markerRecordedEventAttributes": { ... }
    ...
  },
  ...
]
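Timers and markers are both requested through decisions. The helpers below are a small sketch that builds the decision dictionaries and reads fired timers back out of the history; the function names and the example timer and marker names are hypothetical.

```python
def start_timer(timer_id, seconds):
    """Decision asking SWF to fire a timer (and schedule a decision task) later."""
    return {
        "decisionType": "StartTimer",
        "startTimerDecisionAttributes": {
            "timerId": timer_id,
            "startToFireTimeout": str(seconds),  # the API expects a string
        },
    }


def record_marker(name, details=""):
    """Decision that writes a named milestone into the execution history."""
    return {
        "decisionType": "RecordMarker",
        "recordMarkerDecisionAttributes": {"markerName": name, "details": details},
    }


def fired_timers(events):
    """Return the timerId of every TimerFired event in the history."""
    return [
        e["timerFiredEventAttributes"]["timerId"]
        for e in events
        if e["eventType"] == "TimerFired"
    ]


decisions = [start_timer("retry-ingest", 60), record_marker("ingest-done")]
```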

Some Concepts


Workflow, Activity, Task, State, Logical Flow, Activity Workers, Asynchronous processing, Distributed processing, Scalability, Domain, Activity Type, Activity Version, Activity Timeouts, Results, Decider, Input Data, Events, Workflow execution, Execution History, Polling, workflowId + runId, Workflow Starters, Decisions, Activity Task, Lambda Task, Decision Task, Task Lists