througputer parallel computing paas · 2020-01-14 · existing parallel computing tools mainly...

23
THROUGHPUTER PaaS for creating and executing concurrent cloud applications

Upload: others

Post on 28-May-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

THROUGHPUTER

PaaS for creating and executing

concurrent cloud applications

Page 2: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

OVERVIEW

1) Fundamental transformation in computing:

Concurrent apps on dynamically shared resources Micro-services: unpredictable bursts of short-lived transactions

2) Industry-wide, must-solve challenge:

Calls for an open-source platform approach

3) Enabling performance-critical cloud computing:

Dynamic parallel execution environment (DPEE)

4) Call for collaboration:

Open-source development environment for creating concurrent apps for effective cloud execution

2

Page 3: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

1) Fundamental Transformation in Computing

A. What higher clock rates were needed for (=everything), concurrent processing will be needed for:

For decades, application program speed-up was automatic through higher processor clock rates

• Processor clock rate increases no longer feasible/economical

Going forward performance gain demands concurrent execution

• Including at intra-application level

→ Very complex and distracting!

3

Page 4: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

1) Fundamental Transformation in Computing

B. Computing is increasingly taking place on dynamically shared cloud infrastructure

A. (concurrency) and B. (cloud computing) together:

→ Need for dynamic parallel computing

Must-solve challenge for application developers, who however are ill-equipped to address it

4

Fundamental challenge for software industry calling for a solution

Page 5: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved. 5

1) Concurrent Apps on Dynamically Shared Resources

Master (App #A)

Parallel worker #1

Parallel worker #2

Parallel worker #n

.

.

.

Stage #1

Stage #2

Stage #n

Stage #2’ Stage #2’’’

Alternative task #1

Alternative task #2

Alternative task #n

.

.

.

App instances/tasks increasingly short-lived microservices/actors Need for very rapid dynamic resource management Would increase system software overhead No longer tolerable as hardware is not getting faster!!!

Need to accommodate: • Apps with variable inter-task topologies • Dynamic # of active instances per app • Dynamic # of active tasks per app instance • Inter-task communication among

dynamically scheduled and placed tasks

Master (App #B)

Master (App #C)

. . .

Page 6: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

1) Overhead of Optimal Resource Management

• Optimizing usage of a pool of resources (e.g. 1000s of cores) among: – multiple (10s) of apps, with their respective resource entitlements,

– dynamic number (10s) of active instances per app, and

– dynamic selections of (10s) active tasks per app instance ..

• where the dynamic sets of active instances and tasks vary at network packet / inter-task communications (ITC) message transmission time granularity (i.e. at ~ 1kb/1Gpbs = microsecond granularity) ..

• requires identifying for each of the 1000s of cores the optimal app instance task among the 1000s of alternatives 1M times per second – and carry out the actual resource access re-configurations etc. ..

• i.e. performing Billions of complex computations + re-conf:s / second

If done in software, optimal resource management itself would demand most of the resources, defeating its purpose! – Static allocation neither a viable approach, given the increasingly

dynamic & unpredictable load variations for enterprise/web cloud apps

6

Page 7: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

0

1

2

3

4

5

6

7

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Th

ro

ug

hp

ut,

ove

rhe

ad

/

no

rma

lize

d u

nic

ore

th

rou

gh

pu

t

Scaling factor: # apps = # tasks/app = # cores

Scaling parallel cloud computing with software operating system

System throughput capacity

Overhead rate

Throughput rate/core

1) Scalability Problem in Parallel Cloud Computing =

SYSTEM SOFTWARE OVERHEAD

7

Cost

Rev

enu

e

Concurrency overhead increases according to f1(#apps) * f2(#tasks/app) * f3(#cores)

Page 8: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

0

5

10

15

20

25

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Th

ro

ug

hp

ut,

ove

rhe

ad

/

no

rma

lize

d u

nic

ore

th

rou

gh

pu

t

Scaling factor: # apps = # tasks/app = # cores

Scaling parallel cloud computing with hardware operating system

System throughput capacity

Overhead rate

Throughput rate/core

1) Scalability Solution for Parallel Cloud Computing =

AUTOMATE SYSTEM FUNCTIONS IN HARDWARE

8

Eliminating parallelization software overhead by hardware OS allows near-linear application on-time throughput scaling

Cost

Rev

enu

e

Page 9: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

2) Industry-wide, Must-solve Challenge

Existing computing tech suppliers offer point-products or half-solutions:

• parallel programming languages (extensions), frameworks, tools, middleware, manycore processors, etc.

Burden of cost-efficient concurrent cloud computing left for individual application developers (SaaS vendors)

• .. or worse yet, to the app users (SaaS customers)!

→ Status-quo not sustainable

9

A holistic platform solution needed for parallel cloud computing

Page 10: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

3) Parallel Cloud Computing Platform

Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge

Any parallel execution technologies etc. designed for acceleration on dedicated machines, not on the cloud

→ Parallel execution in cloud i.e. dynamic parallel execution, though critical for cloud computing cost-efficiency, left unaddressed by legacy vendors

10

Dynamic parallel execution critical piece of performance-critical cloud computing

Page 11: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

3) ThroughPuter: Parallel Computing PaaS

ThroughPuter introduces a critical technology enabling effective parallel cloud computing platform:

• Dynamic parallel execution environment (DPEE) • Implemented in HDL code

• User-friendly PaaS business model • Incentives to maximize computing on-time throughput performance and

resource efficiency as well as user’s productivity

• Intellectual property rights for dynamic parallel execution • 44 patents issued and pending worldwide, incl. 26 granted US/UK patents

11

Performance and efficiency of dynamic parallel execution with high development productivity and deployment flexibility

Page 12: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

Development environment • software • open sourced

Development environment • software • open sourced

. . .

Application tasks, residing in their dedicated memory segments

•From each application: core demand expressions, task priority lists •For each app: sets of tasks for execution

app0 task1

. . .

. . .

app1 task0

app0 task2 appN

taskM

Parallelized application tasks

(executables)

3) ThroughPuter: Platform Overview

Execution environment • hardware automated, including dynamic resource mgmt

Execution environment • hardware automated, including dynamic resource mgmt

Open standard interface

User interface and development tools: • Represent the client apps as concurrently executable tasks

core

. . .

Hardware operating system and dynamic on-chip network: •Switch app tasks to cores based on processing loads and contractual policies

•Dynamically, connect tasks of any given app, rather the cores statically

Simpler implementation, higher performance

core

core Core array

© ThroughPuter, Inc. All rights reserved. 12

Page 13: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

3) ThroughPuter: Open Parallel Computing PaaS

Reasons for open-source collaboration for parallel computing PaaS to leverage ThroughPuter’s dynamic parallel execution environment (DPEE):

1) Less low-level work: DPEE automates dynamic resource management etc. parallel execution OS functions in (programmable) hardware, providing higher level interface (API) for the development environment software

2) Higher performance due to minimum-overhead hardware automation of system tasks such as optimally allocating processing capacity, scheduling and placing application tasks for execution, inter-task communications, billing etc.

3) Built-in cloud computing security: mechanisms for unauthorized interactions between different applications simply non-existent in the hardware

4) Open standard interface between development and execution environment

Dynamic parallel capacity allocation example and the enabling cloud processor architecture block diagram on next slides

Open parallel computing PaaS needed -- ThroughPuter execution environment goes a long way toward reaching that goal

13

Page 14: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

time X us

app1

app2

app3

app4 3) Dynamic Core Allocation Example

Core Allocation Period (CAP)

All tasks continuing on

consecutive CAPs stay on their

existing core, and continue

processing uninterruptedly

through CAP boundaries

Allocation of cores re-

optimized among applications for each new CAP based on core demands and

entitlements of the applications

X+1 us

Any application can burst even up to full capacity of the shared core pool, as long as

actually materialized core demands by other apps are met up to their entitlements

X+2 us

Any task can communicate with all other

tasks of its app instance without

knowledge of whether or at

which core any given task is

running

X+3 us

Core (in 16-core array)

14

Page 15: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

Example:

Average core demands by

applications sharing a 16-core processor

app1

12.5%

app2

25%

app3

12.5%

app4

50%

1 2 3 4 5 6 7 8 9 10

0

2

4

6

8

10

12

14

16

cores demanded

time / microsecond

Actual core demands by applications over 10us period

app1

app2

app3

app4

12

34

0

1

2

3

4

5

6

7

8

9

Cores

Application #

Actual and average core demands at t = 6 microseconds

cores demanded

static allocation

Idle capacity

Blocked demand

3) Concurrent cloud computing challenge - Technical

15

Page 16: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

• The actual, momentary processing capacity demand by any given individual application program hardly ever equals its ‘average’ demand

Non-adaptive capacity partitioning leads to wasting of resources and blocking on-time throughput

• Capacity being held statically in reserve for idling applications should have been allocated to other applications on the manycore processor that at that time would have been able to use it

12

34

0

1

2

3

4

5

6

7

8

9

Cores

Application #

Actual and average core demands at t = 6 microseconds

cores demanded

static allocation

Blocked demand

= lost revenue

Idle capacity

= wasted cost

16

3) Concurrent cloud computing challenge - Economical

Page 17: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

0

2

4

6

8

10

12

14

16

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31

Time / microsecond

Co

res d

em

an

ded

by a

n a

pp

licati

on

Realtime demand

of the application

Average demand

of the application

Dynamic allocation of

otherwise idle capacity

to other app:s that can use it

..to enable on-demand

bursting for maximized

on-time throughput

per given cost

17

3) Concurrent cloud computing challenge - Solution

Dynamic Parallel Execution Environment (DPEE) of ThroughPuter PaaS enables realtime application load adaptive,

resource efficient and high performance cloud computing

Page 18: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

Hardware operating system

STEP 1 Once per a

core allocation period (e.g.

microsecond) : Allocate cores to

applications

Core demand figures

from applications

Manycore processing fabric

Ready-task priority ordered lists from applications, along with the core types demanded by each task

STEP 3

For each application:

Map selected tasks to available cores

For each core: Active application task ID

For each application: Number of cores allocated

For each task: Processing core ID and core type

For each application: List of selected tasks, along with their demanded core types

Billing

subsystem

For each application:

Core entitlements For each application: Billables

To/from contract management system

STEP 2

For each application:

Select to-be-

executing tasks

time tick

core slot

core slot

core slot

Fabric network and memories

core slot

core slot

. . .

Patents issued and pending. 18

3) DPEE: App load adaptive manycore architecture

Page 19: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved. Patents issued and pending. 19

ITC

Input ports

(from load

balancers)

Output ports

(to load

balancers)

ITC

packet

switch

Inter-task

communications

(ITC)

App. load

adaptive

manycore

processing

system

(worker

stage #0)

ITC (worker

stage # 2)

ITC (worker

stage # T-1)

App. load

adaptive

manycore

processing

system

(entry

stage)

App. load

adaptive

manycore

processing

system

(exit

stage)

ITC

ITC

.

.

.

3) DPEE: Multi-stage manycore architecture

• Real hardware resources for ITC, memory/network I/O, load balancing No software abstractions (=overhead) needed

• Hardware automated support for any mixes/matches of data flow topologies among apps sharing the multi-stage manycore cloud processors

• Further scalability via load balanced parallel arrays of the cloud processors

Page 20: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved. Patents issued and pending. 20

3) Example application data flow topologies

Master (App #A)

Parallel worker #1

Parallel worker #2

Parallel worker #n

.

.

.

Stage #1

Stage #2

Stage #n

Stage #2’ Stage #2’’’

Alternative task #1

Alternative task #2

Alternative task #n

.

.

.

The hardware automatically adapts to (potentially dynamic) data flow topology of any given application, without any system software (re)configurations

• Scatter-gather • Pipelined • Hybrids of above • Any parallel mixes/matches of above

Master (App #B)

Master (App #C)

. . .

Page 21: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

3) Summary of Advantages

• PERFORMANCE and COST-EFFICIENCY: – Architecturally maximized application processing on-time

throughput per unit cost • Hardware operating system and on-chip network optimized for execution

of multiple concurrent apps on dynamically shared manycore processors

• SECURITY: – Full isolation, right from hardware level up, among client

applications dynamically sharing a pool of cores

• PRODUCTIVITY: – PaaS automates concurrent cloud app development + deployment

• OPEN SYSTEM: – PaaS software and processor-core hardware to be open-sourced

– Host anywhere; ThroughPuter commercial hosting an option

21

Page 22: ThrougPuter Parallel Computing PaaS · 2020-01-14 · Existing parallel computing tools mainly limited to parallel programming aspect of parallel computing challenge Any parallel

© ThroughPuter, Inc. All rights reserved.

4) Call for Collaboration

• The need for parallel processing an emerging, MAJOR industry and profession wide challenge Open-source collaboration a natural approach

• Need for architectural optimization across traditional application, system and hardware layer boundaries

SOLUTION: Open-source PaaS reaching all the way to parallel cloud computing optimized hardware – ThroughPuter’s contribution: Hardware architecture designed

for dynamically shared multi-tenant parallel cloud computing • Secure hardware OS for manycore fabric with on-chip network, taking

care of dynamic capacity allocation, parallel program execution mgmt

– Collaboration opportunities: • Development environment and tools • Extensions of the PaaS for specific user domains: channel partnerships • Processor-core IP for the cloud processors • Physical hardware supply, physical hosting (IaaS) etc.

22