ISVLSI 2012
TRANSCRIPT
![Page 1: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/1.jpg)
KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association www.itiv.kit.edu
Institute for Information Processing Technology (ITIV)
Prof. Dr.-Ing. K. D. Müller-Glaser · Prof. Dr.-Ing. J. Becker · Prof. Dr. rer. nat. W. Stork
FlexTiles
Self-adaptive heterogeneous many-core based on Flexible Tiles
Dr. Gabriel Marchesan [email protected]
![Page 2: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/2.jpg)
Institute for Information Processing Technology (ITIV) · Gabriel Marchesan Almeida
Self-adaptive heterogeneous manycore based on Flexible Tiles
Motivation
11.04.2023
architectures are designed in highly customized ways to address a specific problem:
  well-defined set of applications;
  pre-defined budgets (power/energy + area + time-to-market);
several requirements must be met during application execution:
  power/energy consumption;
  performance (application throughput / deadlines);
application complexity is increasing; parallelization is the solution;
Source: http://baldmike2004.xanga.com/
![Page 3: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/3.jpg)
Motivation
issues for industry: "let's be as conservative as possible and keep everything under control!" Why take so many risks with many-core architectures?
applications often exhibit time-varying workloads, which makes design-time mapping decisions sub-optimal;
Source: http://shop.cafepress.com/old-school-conservative
Proposition
novel many-core architecture based on reconfigurable devices (FPGAs), DSPs and GPPs with a clever virtualization layer;
Cognitive Radio Smart Camera
![Page 4: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/4.jpg)
Who are we?
Source: http://thefreeman.net
![Page 5: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/5.jpg)
Project Consortium
Partners:
Project Goals:
Propose novel adaptive techniques for many-core architectures;
Autonomous decision making mechanism;
Provide an innovative virtualization layer and dedicated tool-flow to:
improve programming efficiency;
reduce the impact on time to market;
Budget: 3.67M €
Period: 15.10.2011 – 14.10.2014
Duration: 36 months
Coordinator: Fabrice Lemmonier
![Page 6: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/6.jpg)
8 x 8 grid of general-purpose processor cores (tiles);
ANSI standard C and C++;
Up to 443 BOPS (billion operations per second);
Supports SMP Linux with a 2.6 kernel;
Compute-intensive applications such as advanced networking, digital multimedia and telecom, wireless infrastructure
TILEPro64™ (Tilera)
![Page 7: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/7.jpg)
3 billion transistors;
up to 512 CUDA cores;
a CUDA core executes a floating point or integer instruction per clock for a thread;
16 SMs (Streaming Multiprocessors) of 32 cores each;
CUDA parallel programming model
Fermi Architecture (Nvidia)
SFU (Special Function Unit): transcendental instructions (sine, cosine, square root, etc.)
![Page 8: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/8.jpg)
Homogeneous architectures
replica of the same processing element;
intended to be more flexible;
ease of programming makes such architectures good candidates for future scalable systems;
intended to better deal with faults that may appear in the system;
Source: http://www.starwarsreport.com/
![Page 9: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/9.jpg)
… but
… this is not enough!
Source: http://knowyourmeme.com/
![Page 10: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/10.jpg)
Customization is needed to raise the efficiency of applications
Source: http://saxonyfineclothing.com/
![Page 11: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/11.jpg)
Qualcomm MSM7200
MPSoC + specialized ISPs + HW accelerators;
4-core heterogeneous / shared-memory design + a number of accelerators;
4 differentiated CPUs:
- ARM 11 (Application proc.)
- ARM 9 (Modem)
- 2 DSPs (Audio + Modem)
2D/3D, Java Accelerators
Dataflow: message passing;
TASK 1
Process1(){
  a = 1;
  send(a,task2);
  …
  a = receive(task2);
  a = a + 5;
}
TASK 2
Process2(){
  …
  a = receive(task1);
  a = a * 2;
  …
  send(a,task1);
}
HYBRID MODEL
[Figure: four PEs, each with L1 / L2 / L3 caches, sharing MAIN MEMORY and the I/O SYSTEM; TASK 1 and TASK 2 exchange messages]
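The Process1/Process2 exchange on this slide can be mimicked on a workstation. The following is a minimal Python sketch, not the MSM7200 programming interface: the `channels` dictionary and the `send`/`receive` helpers are hypothetical stand-ins for the slide's primitives, and each task runs as a thread instead of on its own processing element.

```python
import queue
import threading

# One FIFO channel per direction. The names are illustrative assumptions.
channels = {
    ("task1", "task2"): queue.Queue(),
    ("task2", "task1"): queue.Queue(),
}

def send(value, src, dst):
    """Hypothetical send: enqueue a message from src to dst."""
    channels[(src, dst)].put(value)

def receive(src, dst):
    """Hypothetical receive: block until a message from src arrives at dst."""
    return channels[(src, dst)].get()

def process1(result):
    a = 1
    send(a, "task1", "task2")
    a = receive("task2", "task1")
    a = a + 5
    result["task1"] = a

def process2():
    a = receive("task1", "task2")
    a = a * 2
    send(a, "task2", "task1")

def run():
    """Run both tasks concurrently and return task 1's final value of a."""
    result = {}
    t1 = threading.Thread(target=process1, args=(result,))
    t2 = threading.Thread(target=process2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return result["task1"]
```

Because both receives block, the two tasks synchronize purely through the messages; no shared variable is touched by both sides.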
![Page 12: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/12.jpg)
Heterogeneous architectures
dedicated to a specific domain of applications;
efficient architectures:
low power consumption;
high processing power;
Source: http://www.bripblap.com/
What is the price to pay?
reduced flexibility;
poor scalability;
hard programmability;
![Page 13: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/13.jpg)
Challenge
How to get the best of both worlds?
![Page 14: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/14.jpg)
Challenge
PROCESSORS
FPGA
DSP
Source: http://www.gamearenaph.com
Source: http://www.vision.caltech.edu
APPLICATIONS
Source: http://www.funtoosh.com
How to efficiently map complex applications to many-core architectures with a limited budget (power, performance, …)?
LIMITED BUDGET
Source: http://www.lnci.org.au
![Page 15: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/15.jpg)
FlexTiles – Architecture Overview
TILE TILE
AI (Accelerator Interface): interprets requests from the GPP
NI (Network Interface): interfaces a node with the NoC
![Page 16: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/16.jpg)
FlexTiles – Tool-flow
Adaptive Techniques
Source: http://www.psdgraphics.com
![Page 17: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/17.jpg)
Adaptation
ADAPTATION
ARCHITECTURE LEVEL: DYNAMIC FREQUENCY SCALING
SYSTEM LEVEL: DYNAMIC MAPPING, TASK MIGRATION
An adaptive system is a set of interacting entities able to respond to environmental changes or changes in the interacting parts.
ADAPT AS FAST AS POSSIBLE
IMPROVE OVERALL PERFORMANCE
Source: http://www.stjohns.edu
![Page 18: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/18.jpg)
Information Management
MONITORING DIAGNOSIS
O = F(L)
ACTION
SYSTEM
Information Management and Decision Making Mechanisms: Monitoring, Diagnosis, Action
Reference: Gabriel Marchesan Almeida. Adaptive Multiprocessor Systems-on-Chip Architectures: Principles, Methods and Tools, 124p. LAP LAMBERT Academic Publishing, ISBN 978-3848424282, 2012.
![Page 19: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/19.jpg)
FlexTiles – Architecture
A 3D stacked chip based on:
A many-core layer
An FPGA layer
![Page 20: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/20.jpg)
FlexTiles – Techniques
Static Mapping: applications are mapped at design-time according to a given heuristic
DESIGN-TIME TASK MAPPING ALGORITHM
NPU grid:
0000 0100 0200
0001 0101 0201
0002 0102 0202
TASK MAPPING TABLE
APP  TASK  NPU
1    1     0x0000
1    2     0x0100
1    3     0x0001
1    4     0x0001
1    5     0x0001
2    1     0x0002
2    2     0x0102
2    3     0x0101
2    4     0x0102
2    5     0x0202
[Figure: task graphs of APP 1 and APP 2]
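The task mapping table on this slide is, in effect, a lookup from (application, task) to an NPU address. The sketch below holds it as a Python dictionary and groups tasks by NPU. The address decoding (high byte as column, low byte as row) is an assumption inferred from the grid layout on the slide, and the function names are illustrative, not part of the FlexTiles tool-flow.

```python
# Task mapping table copied from the slide: (app, task) -> NPU address.
TASK_MAPPING = {
    (1, 1): 0x0000, (1, 2): 0x0100, (1, 3): 0x0001,
    (1, 4): 0x0001, (1, 5): 0x0001,
    (2, 1): 0x0002, (2, 2): 0x0102, (2, 3): 0x0101,
    (2, 4): 0x0102, (2, 5): 0x0202,
}

def npu_coords(addr):
    """Decode an NPU address into (column, row).

    Assumed encoding: high byte = column, low byte = row.
    """
    return (addr >> 8) & 0xFF, addr & 0xFF

def tasks_per_npu(mapping):
    """Group (app, task) pairs by the NPU they are mapped to."""
    placement = {}
    for key, addr in mapping.items():
        placement.setdefault(addr, []).append(key)
    return placement
```

Grouping by NPU makes the imbalance of this static mapping visible: three tasks of APP 1 share NPU 0x0001, while NPUs 0x0200 and 0x0201 host no task at all.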
![Page 21: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/21.jpg)
DIAGNOSIS
O = F(L)
ACTION
SYSTEM
MONITORING
FIFO Filling
1 → 2: 40%
1 → 3: 80%
3 → 4: 60%
2 → 4: 60%
4 → 5: 20%
[Figure: task graph 1 → {2, 3} → 4 → 5]
Monitoring, Diagnosis, Action (MDA)
Information Management
![Page 22: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/22.jpg)
Monitoring, Diagnosis, Action (MDA)
DIAGNOSIS
O = F(L)
ACTION
SYSTEM
MONITORING
CPU Workload
Information Management
![Page 23: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/23.jpg)
Monitoring, Diagnosis, Action (MDA)
DIAGNOSIS
O = F(L)
ACTION
SYSTEM
MONITORING
Application Throughput
[Figure: task graph with FIFOs 1 → 2, 1 → 3, 3 → 4, 2 → 4, 4 → 5; reported throughput: 3.58 MB/s]
Information Management
![Page 24: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/24.jpg)
Monitoring, Diagnosis, Action (MDA)
ACTION
SYSTEM
MONITORING DIAGNOSIS
O = F(L)
Draw conclusions based on monitored information
CPU is getting overloaded
CPU is idle most of the time
Application throughput is decreasing
Information Management
![Page 25: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/25.jpg)
Monitoring, Diagnosis, Action (MDA)
SYSTEM
MONITORING DIAGNOSIS
O = F(L)
ACTION
Decisions are made based on both monitored information and diagnosis
1. Reduce processor frequency whenever the CPU is idle or no high-speed processing is required;
2. Increase processor frequency in order to meet application performance requirements;
3. Migrate a task whenever the CPU becomes overloaded;
ALA: Architecture Level Adaptation
SLA: System Level Adaptation
Information Management
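The three decision rules on this slide can be read as one pass of a monitoring, diagnosis, action loop. The sketch below is one possible encoding, not the FlexTiles implementation: the `Sample` fields, the thresholds `IDLE_LOAD` and `OVERLOAD`, and the priority order among the rules are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    """Monitored values for one processor (field names are hypothetical)."""
    cpu_load: float             # fraction of time busy, 0.0 to 1.0
    throughput: float           # measured application throughput, e.g. MB/s
    required_throughput: float  # application performance requirement

# Hypothetical thresholds; the slides do not give numeric values.
IDLE_LOAD = 0.2
OVERLOAD = 0.9

def diagnose(sample):
    """Diagnosis: draw conclusions from the monitored information."""
    findings = set()
    if sample.cpu_load >= OVERLOAD:
        findings.add("overloaded")
    if sample.cpu_load <= IDLE_LOAD:
        findings.add("idle")
    if sample.throughput < sample.required_throughput:
        findings.add("throughput_low")
    return findings

def decide(findings):
    """Action: choose one adaptation (assumed priority: migrate, speed up, slow down)."""
    if "overloaded" in findings:
        return "migrate_task"        # system level adaptation (rule 3)
    if "throughput_low" in findings:
        return "increase_frequency"  # architecture level adaptation (rule 2)
    if "idle" in findings:
        return "decrease_frequency"  # architecture level adaptation (rule 1)
    return "no_action"
```

A call such as `decide(diagnose(Sample(0.95, 3.58, 3.0)))` selects task migration, because the overload check takes precedence in this sketch.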
![Page 26: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/26.jpg)
Task Migration: tasks are migrated at run-time according to certain criteria
FlexTiles – Techniques
0000 0100 0200
0001 0101 0201
0002 0102 0202
[Figure: placement of tasks 1 to 5 on the NPU grid before and after migration]
CPU gets overloaded
Task is migrated = improved load balancing / performance
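A run-time migration step of this kind can be sketched as follows. This is an illustrative greedy policy, not the FlexTiles migration criterion: the per-task load figures, the threshold, and the choice of moving the lightest task from the most loaded NPU to the least loaded one are all assumptions.

```python
# NPU addresses of the 3 x 3 grid shown on the slide.
GRID = [0x0000, 0x0100, 0x0200,
        0x0001, 0x0101, 0x0201,
        0x0002, 0x0102, 0x0202]

# Initial placement of one application (task -> NPU), taken from the
# APP 1 rows of the static mapping table earlier in the deck.
APP1_MAPPING = {1: 0x0000, 2: 0x0100, 3: 0x0001, 4: 0x0001, 5: 0x0001}

# Hypothetical CPU load of each task, in percent of one NPU.
APP1_LOAD = {1: 40, 2: 30, 3: 50, 4: 40, 5: 30}

def migrate_once(mapping, task_load, npus, threshold=100):
    """Move one task off the most loaded NPU if it exceeds the threshold.

    Returns (task, source, destination), or None if nothing was migrated.
    The mapping dictionary is updated in place.
    """
    load = {npu: 0 for npu in npus}
    for task, npu in mapping.items():
        load[npu] += task_load[task]
    src = max(npus, key=lambda n: load[n])
    if load[src] <= threshold:
        return None
    candidates = [t for t, n in mapping.items() if n == src]
    if len(candidates) < 2:
        return None  # a single oversized task cannot be balanced by moving it
    dst = min(npus, key=lambda n: load[n])
    task = min(candidates, key=lambda t: task_load[t])
    mapping[task] = dst
    return task, src, dst
```

With the example data, NPU 0x0001 carries 120 percent, so the first call moves task 5 to the first idle NPU in grid order; a second call finds no NPU above the threshold and returns `None`.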
![Page 27: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/27.jpg)
FlexTiles – Techniques
eFPGA – reconfigurable resources are seen as a homogeneous set of resources (to be allocated at run-time);
this leads to better resource sharing across the many-core SoC;
enables the implementation of large accelerators if required;
![Page 28: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/28.jpg)
FlexTiles – Platforms
PCRUN Model
EMULATION
CompOSe
Low-Level Model
FPGA PROTOTYPE
High-Level Model
SIMULATION
![Page 29: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/29.jpg)
FlexTiles – Simplify
Simplify Framework (http://simplify.itiv.kit.edu)
(1) Architecture Modeling
• Number of processors
• Interconnection type (bus, NoC)
(2) Processing Element Configuration
• Memory size
• Processor type (MicroBlaze, MIPS32, ARM7, OpenRISC (OR1K) and PowerPC)
(3) Application Description
• C programming language
(4) Application Compilation and Model Execution
• Cross-compilers
• Operating system (Windows, Linux)
• Architecture (32, 64 bits)
(5) Execution Reports
• Application trace
• MIPS per processor and total MIPS
• Number of simulated instructions
• Simulation time
![Page 30: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/30.jpg)
FlexTiles – Simplify
![Page 31: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/31.jpg)
FlexTiles – Simplify
![Page 32: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/32.jpg)
FlexTiles – Simplify
Version 2.0
OS support: Round-robin scheduler;
Semaphores;
Mutexes;
Multi-task;
Communication API for applications;
Web framework: New design;
Improved performance;
Application profiling (instruction counter per application);
Version 1.0
Processors: MIPS32, MicroBlaze, ARM7, OpenRISC, PowerPC;
Interconnect: Bus;
Web framework: Architecture modeling;
PE (processing element) configuration;
Application description, compilation and execution;
Execution reports;
Automatic generation of OVP platforms;
No OS support; single application (1 per core); no API for inter-application communication;
![Page 33: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/33.jpg)
FlexTiles – Simplify
![Page 34: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/34.jpg)
FlexTiles – Simplify
![Page 35: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/35.jpg)
Closing Remarks
FlexTiles is a novel architecture that incorporates several adaptive techniques, mainly used for:
  improving application performance;
  reducing energy/power consumption;
  decreasing temperature hot-spots;
The tool-flow eases the programming of many-core heterogeneous platforms;
Application-driven frequency scaling:
  performance requirements;
  power consumption budget;
Feedback to application designers;
Source: http://www.charlesphoenix.com/
![Page 36: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/36.jpg)
Thank you for your attention
Dr. Gabriel Marchesan Almeida
Institute for Information Processing Technology (ITIV)
[email protected]
Karlsruhe Institute of Technology
![Page 37: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/37.jpg)
Backup Slides
![Page 38: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/38.jpg)
Programming Model
Application is a set of static clusters;
A cluster is described using Synchronous Data Flow (SDF) or Cyclo-Static Data Flow (CSDF);
Within a data flow, each consumer/producer of tokens is called an actor;
Actors are characterized by nested loops that implement the operators and the token consumption/production rules;
Two actors communicate through FIFOs of tokens;
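The programming model above can be illustrated with the smallest possible Synchronous Data Flow graph: one producer actor and one consumer actor joined by a FIFO of tokens. The sketch below is a generic SDF illustration in Python, not FlexTiles code: the function names, the token values, and the "sum" operator inside the consumer are assumptions. It solves the balance equation for the two firing counts and then simulates one full iteration.

```python
from collections import deque
from math import gcd

def repetition_vector(produce, consume):
    """Smallest firing counts (q_producer, q_consumer) that balance the FIFO.

    Solves the SDF balance equation q_producer * produce == q_consumer * consume.
    """
    g = gcd(produce, consume)
    return consume // g, produce // g

def run_iteration(produce, consume):
    """Simulate one SDF iteration of a producer -> FIFO -> consumer graph.

    Returns (consumer outputs, peak FIFO filling, tokens left in the FIFO).
    """
    q_p, q_c = repetition_vector(produce, consume)
    fifo = deque()
    fired_p = fired_c = 0
    next_token = 0
    peak = 0
    outputs = []
    while fired_p < q_p or fired_c < q_c:
        if fired_c < q_c and len(fifo) >= consume:
            # Consumer fires: removes exactly `consume` tokens.
            tokens = [fifo.popleft() for _ in range(consume)]
            outputs.append(sum(tokens))
            fired_c += 1
        else:
            # Producer fires: appends exactly `produce` tokens.
            for _ in range(produce):
                fifo.append(next_token)
                next_token += 1
            fired_p += 1
            peak = max(peak, len(fifo))
    return outputs, peak, len(fifo)
```

With a producer emitting 2 tokens per firing and a consumer taking 3, the producer fires three times and the consumer twice per iteration, and the FIFO returns to empty, which is what makes a static schedule and a bounded FIFO size possible.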
![Page 39: ISVLSI 2012](https://reader034.vdocuments.mx/reader034/viewer/2022042700/554f6309b4c905c8088b4b82/html5/thumbnails/39.jpg)
Programming Model