evaluating application performance between dsp processor ... · group of architectures and...
TRANSCRIPT
Evaluating Application Performance BetweenDSP Processor and GPP Using Reconfigurable
Hardware
Nicola, Eduardo V.Ruzicki, Julio C.M.Martins, Luis H.J.Mattos, Julio C.B.
Group of Architectures and Integrated Circuits - GACIFederal University of Pelotas - UFPel
Pelotas - Brasil
1 / 30
Introduction
The industry and market of electronic devices grows rapidlyeach day
Most products now have builtin microprocessors to handlespecific tasks
This is made possible due to the fast technology development,in response to market demands
General purpose processors (GPPs)ASICs (Application Specific Integrated Circuits)
3 / 30
Introduction
However, it is possible to perform those tasks in other kinds ofprocessors:
GPPs were used before the idealization of DSP processorsASIC were another option of reduced costs and flexibility
4 / 30
Motivation
Flexible Specialization
A field that is growing in research is ReconfigurableArchitecturesThis area of research aims to improve processor performancewithout increasing its clock rateThis is made using specialized hardware structures to performportions of program code
5 / 30
Methodology
The work was divided in stages:
1 Choice of tools to simulate DSP processors and simulate theMips32 processor;
2 Selection of typical DSP applications;
3 Use of simulators to generate execution traces from the DSPapplications;
4 Comparison of results between the DSP processor and theMIPS32, with and without the Reconfigurable Hardware.
6 / 30
Methodology
Three commercial tools were analysed as possible DSP processorsimulators for the work:
7 / 30
Methodology
Symphony Studio
Works with DSP563xx/DSP567xx processors
Features
Can be executed on Eclipse environmentGDB (GNU Project Debugger)Project Manager
9 / 30
Methodology
VisualDSP++
Features
Statistical performance measurementsEmulationSimulationCompilation
10 / 30
Methodology
OVPsim
Features
Describing platforms with one or more processorsModels are available for many standard processorsCompilationTrace Generator
11 / 30
The Reconfigurable Hardware
The Binary Translator works parallel with MIPS pipeline
At DI stage, BT reads the PC indexed address
19 / 30
The Reconfigurable Hardware
The Binary Translator works parallel with MIPS pipeline
At DI stage, BT reads the PC indexed addressAfter decoding the instruction, BT sends the instruction toRU, when possible, as a configuration to be made on RU
20 / 30
The Reconfigurable Hardware
The Binary Translator works parallel with MIPS pipeline
At DI stage, BT reads the PC indexed addressAfter decoding the instruction, BT sends the instruction toRU, when possible, as a configuration to be made on RUBT also stores the instruction configuration or group ofinstruction configurations on RCWhen PC points to an existing address on RC, RU takes theexecution from CPU.
21 / 30
Case Study
A generic and customized benchmark was madewith DSP-like operations
1 Performs a multiplication of two real vectors2 Two matrix addition3 Matrix subtraction4 Matrix scalar multiplication5 Calculates the mean of the input array6 Calculates the RMS of the elements in the input
array7 Accumulates the two vector product8 Two matrix addition
22 / 30
Case Study
9 Executes the Discrete Fourier Transform
10 Executes the Inverse Discrete Fourier Transform
11 Executes the Fast Fourier Transform
12 Executes the Inverse Fast Fourier Transform
13 Executes the Finite Impulse Response
23 / 30
Case Study: Results
1 2 3 4 5 6 7 8
2
3
4
·105
Clo
ckC
ycle
s
ADSP-BF504 MIPS32-pure MIPS32-array
25 / 30
Case Study: Results
11 12 13
0
2
4
6
·105
Clo
ckC
ycle
s
ADSP-BF504 MIPS32-pure MIPS32-array
27 / 30
Case Study: Results
MIPS32 had less clock cycles with the Reconfigurable System
Blackfin ADSP-BF504 is a specialized DSP processor,comparing to MIPS-pure(without array), but MIPS-array hadless clock cycles in most applications
28 / 30
Conclusions
This work presented the simulation of DSP applications tocompare Blackfin and MIPS32
Even Blackfin ADSP-BF504 being a specialized DSPprocessor, MIPS-array had less clock cycles in mostapplications
A GPP processor with Reconfigurable System can optimiseprogram execution, compared to DSP processors
For future works: Other kinds of simulations
1 Power estimates2 Energy consumption
29 / 30