isvlsi 2012

Upload: flextiles-team

Post on 10-May-2015

Page 1: ISVLSI 2012

KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association
www.itiv.kit.edu

Institute for Information Processing Technology (ITIV)
Prof. Dr.-Ing. K. D. Müller-Glaser · Prof. Dr.-Ing. J. Becker · Prof. Dr. rer. nat. W. Stork

FlexTiles

Self-adaptive heterogeneous many-core based on Flexible Tiles

Dr. Gabriel Marchesan Almeida

Page 2: ISVLSI 2012

Institute for Information Processing Technology (ITIV) 2Gabriel Marchesan Almeida

Self-adaptive heterogeneous manycore based on Flexible Tiles

Motivation

11.04.2023

Architectures are designed in highly customized ways to deal with a specific problem:
- a well-defined set of applications;
- pre-defined budgets (power/energy + area + time-to-market);
- several requirements that must be met during application execution:
  - power/energy consumption;
  - performance (application throughput / deadlines).

The complexity of applications is increasing; parallelization is the solution.

Source: http://baldmike2004.xanga.com/

Page 3: ISVLSI 2012

Motivation

Issues for industry:
- "Let's be as conservative as possible and keep everything under control!"
- Why take so many risks with many-core architectures?

Applications often exhibit time-varying workloads, which makes mapping decisions sub-optimal.

Source: http://shop.cafepress.com/old-school-conservative

Proposition

novel many-core architecture based on reconfigurable devices (FPGAs), DSPs and GPPs with a clever virtualization layer;

Cognitive Radio · Smart Camera

Page 4: ISVLSI 2012

Who are we?

Source: http://thefreeman.net

Page 5: ISVLSI 2012

Project Consortium

Partners:

Project Goals:

- Propose novel adaptive techniques for many-core architectures;
- Provide an autonomous decision-making mechanism;
- Provide an innovative virtualization layer and dedicated tool-flow to:
  - improve programming efficiency;
  - reduce the impact on time to market.

Budget: 3.67M €
Period: 15.10.2011 – 14.10.2014
Duration: 36 months
Coordinator: Fabrice Lemmonier

Page 6: ISVLSI 2012

TILEPro64™ (Tilera)

8 x 8 grid of general-purpose processor cores (tiles);

Programmable in ANSI standard C and C++;

Up to 443 BOPS (billion operations per second);

Supports SMP Linux with a 2.6 kernel;

Targets compute-intensive applications such as advanced networking, digital multimedia, telecom and wireless infrastructure.

Page 7: ISVLSI 2012

Fermi Architecture (Nvidia)

3 billion transistors;

up to 512 CUDA cores, organized as 16 SMs (Streaming Multiprocessors) of 32 cores each;

a CUDA core executes one floating-point or integer instruction per clock for a thread;

SFU (Special Function Unit): executes transcendental instructions (sine, cosine, square root, etc.);

CUDA parallel programming model.

Page 8: ISVLSI 2012

Homogeneous architectures

replicas of the same processing element;

intended to be more flexible;

ease of programming makes such architectures good candidates for future scalable systems;

intended to cope better with faults that may appear in the system;

Source: http://www.starwarsreport.com/

Page 9: ISVLSI 2012

… but

… this is not enough!

Source: http://knowyourmeme.com/

Page 10: ISVLSI 2012

Customization is needed to raise the efficiency of applications

Source: http://saxonyfineclothing.com/

Page 11: ISVLSI 2012

Qualcomm MSM7200

MPSoC + specialized ISPs + HW accelerators;

Heterogeneous 4-core, shared-memory design plus a number of accelerators;

4 differentiated CPUs:

- ARM 11 (application processor)
- ARM 9 (modem)
- 2 DSPs (audio + modem)

2D/3D and Java accelerators;

Dataflow: message passing.

[Figure: HYBRID MODEL. Four PEs, each with L1 / L2 / L3 caches, share the MAIN MEMORY and the I/O SYSTEM; TASK 1 and TASK 2 exchange messages.]

TASK 1:

Process1() {
    a = 1;
    send(a, task2);
    …
    a = receive(task2);
    a = a + 5;
}

TASK 2:

Process2() {
    …
    a = receive(task1);
    a = a * 2;
    …
    send(a, task1);
}
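The message exchange between Process1 and Process2 can be tried out on a desktop machine. The following is only a sketch, assuming two Python threads stand in for the two tasks and two `queue.Queue` objects stand in for the message channels; the function and variable names are illustrative and not part of the slides.

```python
import queue
import threading


def process1(to_task2, from_task2, results):
    """Task 1: send a value, wait for the reply, then add 5."""
    a = 1
    to_task2.put(a)          # send(a, task2)
    a = from_task2.get()     # a = receive(task2), blocks until a message arrives
    a = a + 5
    results["task1"] = a


def process2(from_task1, to_task1):
    """Task 2: wait for a value, double it, send it back."""
    a = from_task1.get()     # a = receive(task1)
    a = a * 2
    to_task1.put(a)          # send(a, task1)


def run_example():
    """Run both tasks concurrently and return the final value held by task 1."""
    t1_to_t2 = queue.Queue()  # channel for messages from task 1 to task 2
    t2_to_t1 = queue.Queue()  # channel for messages from task 2 to task 1
    results = {}
    workers = [
        threading.Thread(target=process1, args=(t1_to_t2, t2_to_t1, results)),
        threading.Thread(target=process2, args=(t1_to_t2, t2_to_t1)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results["task1"]
```

Because each task touches only its own local variable and the two channels, no lock on shared data is needed; the blocking `get()` calls provide the synchronization.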

Page 12: ISVLSI 2012

Heterogeneous architectures

dedicated to a specific domain of applications;

efficient architectures:

- low power consumption;
- high processing power;

Source: http://www.bripblap.com/

What is the price to pay?

reduced flexibility;

poor scalability;

hard programmability;

Page 13: ISVLSI 2012

Challenge

How to get the best of both worlds?

Page 14: ISVLSI 2012

Challenge

How can complex applications be mapped efficiently to many-core architectures with a limited budget (power, performance, …)?

[Figure: APPLICATIONS on one side, PROCESSORS, FPGA and DSP resources on the other, under a LIMITED BUDGET.]

Sources: http://www.gamearenaph.com, http://www.vision.caltech.edu, http://www.funtoosh.com, http://www.lnci.org.au

Page 15: ISVLSI 2012

FlexTiles – Architecture Overview

[Figure: architecture overview showing two tiles (TILE, TILE).]

AI (Accelerator Interface): interprets requests from the GPP.

NI (Network Interface): interfaces a node with the NoC.

Page 16: ISVLSI 2012

FlexTiles – Tool-flow

[Figure: tool-flow diagram; the only surviving label is "Adaptive Techniques".]

Source: http://www.psdgraphics.com

Page 17: ISVLSI 2012

Adaptation

An adaptive system is a set of interacting entities able to respond to environmental changes or changes in the interacting parts.

ARCHITECTURE LEVEL: dynamic frequency scaling

SYSTEM LEVEL: dynamic mapping, task migration

ADAPT AS FAST AS POSSIBLE

IMPROVE OVERALL PERFORMANCE

Source: http://www.stjohns.edu

Page 18: ISVLSI 2012

Information Management

Information Management and Decision Making Mechanisms:

- Monitoring
- Diagnosis, O = F(L)
- Action

[Figure: loop connecting SYSTEM, MONITORING, DIAGNOSIS and ACTION.]

Reference: Gabriel Marchesan Almeida. Adaptive Multiprocessor Systems-on-Chip Architectures: Principles, Methods and Tools, 124p. LAP LAMBERT, ISBN 978-3848424282, 2012.

Page 19: ISVLSI 2012

FlexTiles – Architecture

A 3D stacked chip based on:

A many-core layer

An FPGA layer

Page 20: ISVLSI 2012

FlexTiles – Techniques

Static Mapping: applications are mapped at design time according to a given heuristic.

NPU grid (addresses):

0000 0100 0200
0001 0101 0201
0002 0102 0202

The DESIGN-TIME TASK MAPPING ALGORITHM produces the TASK MAPPING TABLE:

APP  TASK  NPU
1    1     0x0000
1    2     0x0100
1    3     0x0001
1    4     0x0001
1    5     0x0001
2    1     0x0002
2    2     0x0102
2    3     0x0101
2    4     0x0102
2    5     0x0202

[Figure: task graphs of APP 1 and APP 2, each with tasks 1 to 5.]
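The task mapping table can be held as a small lookup structure. The sketch below assumes, from the layout of the grid on the slide, that the high byte of an NPU address is the column and the low byte is the row; the slides do not state this encoding, and the helper names are hypothetical.

```python
# Task mapping table from the slide: (application, task) -> NPU address.
TASK_MAPPING = {
    (1, 1): 0x0000, (1, 2): 0x0100, (1, 3): 0x0001, (1, 4): 0x0001, (1, 5): 0x0001,
    (2, 1): 0x0002, (2, 2): 0x0102, (2, 3): 0x0101, (2, 4): 0x0102, (2, 5): 0x0202,
}


def decode_npu(address):
    """Split an NPU address into (column, row).

    Assumption: high byte = column, low byte = row, as the grid layout suggests.
    """
    return (address >> 8) & 0xFF, address & 0xFF


def tasks_per_npu(mapping):
    """Group the (application, task) pairs by the NPU they are mapped to."""
    placement = {}
    for app_task, address in mapping.items():
        placement.setdefault(address, []).append(app_task)
    return placement
```

Grouping the table by NPU shows at a glance that three tasks of application 1 share NPU 0x0001, which is the kind of imbalance a run-time technique can later correct.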

Page 21: ISVLSI 2012

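Information Management

Monitoring, Diagnosis, Action (MDA)

[Figure: loop connecting SYSTEM, MONITORING, DIAGNOSIS (O = F(L)) and ACTION, next to a task graph with tasks 1 to 5.]

MONITORING: FIFO filling

FIFO (producer → consumer)  Filling
1 → 2                       40%
1 → 3                       80%
3 → 4                       60%
2 → 4                       60%
4 → 5                       20%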
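The FIFO filling figures on this slide can be turned into a simple monitoring check. This is a sketch under two assumptions that the slides do not confirm: that each pair of task numbers names the producer and consumer of one FIFO, and that a filling level above some threshold (70% here, chosen arbitrarily) marks a FIFO worth reporting to the diagnosis step.

```python
# FIFO filling levels in percent, keyed by (producer task, consumer task).
FIFO_FILLING = {(1, 2): 40, (1, 3): 80, (3, 4): 60, (2, 4): 60, (4, 5): 20}


def congested_fifos(filling, threshold=70):
    """Return the FIFOs whose filling level reaches the (assumed) threshold."""
    return sorted(edge for edge, percent in filling.items() if percent >= threshold)


def fullest_fifo(filling):
    """Return the FIFO with the highest filling level."""
    return max(filling, key=filling.get)
```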

Page 22: ISVLSI 2012

Information Management

Monitoring, Diagnosis, Action (MDA)

MONITORING: CPU workload

Page 23: ISVLSI 2012

Information Management

Monitoring, Diagnosis, Action (MDA)

MONITORING: application throughput

[Figure: task graph with tasks 1 to 5 and FIFOs 1 → 2, 1 → 3, 3 → 4, 2 → 4, 4 → 5; throughput reading shown: 3.58 MB/s.]

Page 24: ISVLSI 2012

Information Management

Monitoring, Diagnosis, Action (MDA)

DIAGNOSIS, O = F(L): draw conclusions based on monitored information, for example:

- the CPU is getting overloaded;
- the CPU is in idle mode most of the time;
- application throughput is decreasing.

Page 25: ISVLSI 2012

Information Management

Monitoring, Diagnosis, Action (MDA)

ACTION: decisions are made based on both monitored information and diagnosis.

ALA (Architecture Level Adaptation):

1. Reduce processor frequency whenever the CPU is running in idle mode or no high-speed processing is required;

2. Increase processor frequency in order to meet application performance requirements;

SLA (System Level Adaptation):

3. Migrate a task whenever the CPU becomes overloaded.
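The three decision rules can be written as a small diagnosis-and-action pair. This is a minimal sketch rather than the FlexTiles mechanism itself: the `Sample` fields, both thresholds, the order in which the conditions are checked, and the action names are all assumptions made for illustration.

```python
from dataclasses import dataclass

IDLE_THRESHOLD = 0.2      # assumed: below this load the CPU counts as idle
OVERLOAD_THRESHOLD = 0.9  # assumed: above this load the CPU counts as overloaded


@dataclass
class Sample:
    """One monitoring sample (hypothetical structure)."""
    cpu_load: float             # fraction of time the CPU is busy, 0.0 to 1.0
    throughput: float           # measured application throughput, e.g. in MB/s
    required_throughput: float  # throughput the application needs


def diagnose(sample):
    """Diagnosis step: map monitored values to a conclusion."""
    if sample.cpu_load >= OVERLOAD_THRESHOLD:
        return "overloaded"
    if sample.throughput < sample.required_throughput:
        return "too_slow"
    if sample.cpu_load <= IDLE_THRESHOLD:
        return "idle"
    return "ok"


def decide(diagnosis):
    """Action step: map a conclusion to one of the three rules (or no action)."""
    actions = {
        "overloaded": "migrate_task",      # rule 3, system level
        "too_slow": "increase_frequency",  # rule 2, architecture level
        "idle": "reduce_frequency",        # rule 1, architecture level
        "ok": "no_action",
    }
    return actions[diagnosis]
```

Checking for overload first reflects one possible priority: raising the frequency of a CPU that is already saturated would not remove the cause, whereas moving a task away might.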

Page 26: ISVLSI 2012

FlexTiles – Techniques

Task Migration: tasks are migrated at run-time according to certain criteria.

NPU grid (addresses):

0000 0100 0200
0001 0101 0201
0002 0102 0202

[Figure: placement of tasks 1 to 5 on the grid before and after migration.]

CPU gets overloaded → task is migrated = improved load balancing and performance.
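One possible migration criterion is load-based. The sketch below assumes each task has a known load in percent of one NPU, that an NPU is overloaded above 90%, and that the lightest task on the busiest NPU moves to the least-loaded NPU; none of these choices come from the slides, and the function names are hypothetical.

```python
def npu_loads(placement, task_load, npus):
    """Sum the load (in percent) of the tasks placed on each NPU."""
    loads = {npu: 0 for npu in npus}
    for task, npu in placement.items():
        loads[npu] += task_load[task]
    return loads


def migrate_if_overloaded(placement, task_load, npus, threshold=90):
    """Move one task away from the busiest NPU if it exceeds the threshold.

    Returns (task, source, target) when a task was migrated, otherwise None.
    The placement dictionary is updated in place.
    """
    loads = npu_loads(placement, task_load, npus)
    source = max(loads, key=loads.get)
    if loads[source] <= threshold:
        return None  # nothing is overloaded
    target = min(loads, key=loads.get)
    candidates = [task for task, npu in placement.items() if npu == source]
    task = min(candidates, key=lambda t: task_load[t])
    if loads[target] + task_load[task] > threshold:
        return None  # migrating would only move the overload elsewhere
    placement[task] = target
    return task, source, target
```

For example, with tasks 3, 4 and 5 of one application all placed on NPU 0x0001 and loads of 40, 40 and 30 percent, a call moves task 5 to an empty NPU, and a second call finds nothing left to do.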

Page 27: ISVLSI 2012

FlexTiles – Techniques

eFPGA: reconfigurable resources are seen as a homogeneous set of resources (to be allocated at run-time);

this leads to better resource sharing across the many-core SoC;

this enables the implementation of large accelerators if required;

Page 28: ISVLSI 2012

FlexTiles – Platforms

[Figure: platforms. Labels: PCRUN Model, EMULATION, CompOSe, Low-Level Model, FPGA PROTOTYPE, High-Level Model, SIMULATION.]

Page 29: ISVLSI 2012

FlexTiles – Simplify

Simplify Framework (http://simplify.itiv.kit.edu)

(1) Architecture Modeling
• Number of processors
• Interconnection type (bus, NoC)

(2) Processing Element Configuration
• Memory size
• Processor type (microBlaze, MIPS32, ARM7, OpenRISC (OR1K) and PowerPC)

(3) Application Description
• C programming language

(4) Application Compilation and Model Execution
• Cross-compilers
• Operating system (Windows, Linux)
• Architecture (32, 64 bits)

(5) Execution Reports
• Application traces
• MIPS per processor and total MIPS
• Number of simulated instructions
• Simulation time
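The quantities in the execution reports are related to each other. As a rough sketch, assuming that MIPS here means simulated instructions divided by simulation time in seconds (the slides do not define it), the per-processor and total figures could be derived as follows; the function names and processor labels are illustrative.

```python
def mips(instructions, seconds):
    """Millions of instructions per second for one instruction count."""
    if seconds <= 0:
        raise ValueError("simulation time must be positive")
    return instructions / seconds / 1e6


def execution_report(instruction_counts, seconds):
    """Return (MIPS per processor, total MIPS) for one simulation run."""
    per_processor = {
        name: mips(count, seconds) for name, count in instruction_counts.items()
    }
    total = mips(sum(instruction_counts.values()), seconds)
    return per_processor, total
```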

Page 30: ISVLSI 2012

FlexTiles – Simplify

Page 31: ISVLSI 2012

FlexTiles – Simplify

Page 32: ISVLSI 2012

FlexTiles – Simplify

Version 2.0

OS support:
- round-robin scheduler;
- semaphores;
- mutexes;
- multi-tasking;

Communication API for applications;

Web framework:
- new design;
- improved performance;

Application profiling (instruction counter per application);

Version 1.0

Processors: MIPS32, microBlaze, ARM7, OpenRISC, PowerPC;

Interconnect: bus;

Web framework:
- architecture modeling;
- PE (processing element) configuration;
- application description, compilation and execution;
- execution reports;

Automatic generation of OVP platforms;

Limitations: no OS support; single application (one per core); no API for application communication;

Page 33: ISVLSI 2012

FlexTiles – Simplify

Page 34: ISVLSI 2012

FlexTiles – Simplify

Page 35: ISVLSI 2012

Closing Remarks

FlexTiles is a novel architecture that incorporates several adaptive techniques, mainly used for:
- improving application performance;
- reducing energy/power consumption;
- decreasing temperature hot-spots.

The tool-flow eases the programmability of heterogeneous many-core platforms.

Application-driven frequency scaling:
- performance requirements;
- power consumption budget.

Feedback to application designers.

Source: http://www.charlesphoenix.com/

Page 36: ISVLSI 2012

Thank you for your attention

Dr. Gabriel Marchesan Almeida
Institute for Information Processing Technology (ITIV)

Karlsruhe Institute of Technology

Page 37: ISVLSI 2012

Backup Slides

Page 38: ISVLSI 2012

Programming Model

An application is a set of static clusters;

A cluster is described using Synchronous Data Flow (SDF) or Cyclo-Static Data Flow (CSDF);

Within a data flow, each consumer/producer of tokens is called an actor;

Actors consist of nested loops that implement the operators and the rules of token consumption/production;

Two actors communicate through FIFOs of tokens;
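For an SDF cluster, the fixed token rates allow a periodic schedule to be computed ahead of time: each FIFO must receive as many tokens as it gives up over one period. The sketch below solves these balance equations for a connected graph. It is a generic textbook-style illustration, not the FlexTiles tool-flow; the function name and the edge format (producer, consumer, tokens produced, tokens consumed) are assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Smallest positive firing counts that balance every FIFO.

    edges: list of (producer, consumer, produced_per_firing, consumed_per_firing).
    Raises ValueError if the rates are inconsistent or the graph is not connected.
    """
    rate = {actors[0]: Fraction(1)}  # relative firing rate of each actor
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, dst, produced, consumed = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
            elif src in rate and dst in rate:
                # Balance equation: tokens produced must equal tokens consumed.
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent token rates")
            else:
                continue  # neither end is known yet, retry in a later pass
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the fractional rates to the smallest integers.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    return {actor: int(r * scale) for actor, r in rate.items()}
```

For a chain in which A produces 2 tokens per firing, B consumes 3 and produces 1, and C consumes 2, the result is 3 firings of A, 2 of B and 1 of C per period.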

Page 39: ISVLSI 2012

Programming Model