Mahesh Sukumar
Subramanian Srinivasan

Introduction
Embedded system products keep arriving in the market.
There is a continuously growing demand for more functional and more complex appliances.
Designing embedded system applications is a great challenge.
These systems must have enough processing power to handle these tasks.

Java
Java is becoming increasingly popular in embedded environments.
More than 721 million devices are shipped with Java each year.
Furthermore, it is predicted that 80% of mobile phones will support Java.

Current Design Goals
Current design goals must include a careful look at embedded Java architectures.
Embedded systems must have low power dissipation.
They must support a huge software library to cope with stringent design times.
There is a need for architectures that can support all the software development effort currently required.

Java Compliant Architecture
Binary Translation Unit.
Reconfiguration Cache.
Reconfigurable Array.
Speeds up the system and reduces energy consumption.
Results in extra area.

Binary Translation Unit
A separate unit is responsible for dynamic analysis of the instructions.
It finds the sequences that can be executed in the array.
The binary translation (BT) unit saves the configuration for each potential sequence of instructions in a reconfiguration cache.
There is a delay involved with the reconfiguration.
If the sequence of instructions is going to be repeated, the performance and energy gains are meaningful.
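
The detect-then-cache flow on this slide can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware described in the presentation: the class name, the use of the start address as the cache key, the three "array-supported" opcodes, and the rule that only sequences longer than one instruction are cached are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical software model of the binary translation (BT) unit. */
final class BinaryTranslationSketch {
    // Reconfiguration cache: start address of a sequence -> saved configuration.
    private final Map<Integer, List<Integer>> reconfigurationCache = new HashMap<>();
    // Sequence currently being collected while instructions are fetched.
    private final List<Integer> current = new ArrayList<>();
    private int startAddress = -1;

    /** Assumption: only these JVM opcodes can be mapped onto the array. */
    static boolean arraySupports(int opcode) {
        return opcode == 0x60 /* iadd */
            || opcode == 0x64 /* isub */
            || opcode == 0x78 /* ishl */;
    }

    /**
     * Observes one fetched instruction.
     * Returns true if a saved configuration already starts at this address,
     * meaning the sequence could be executed in the array this time.
     */
    boolean observe(int address, int opcode) {
        boolean hit = reconfigurationCache.containsKey(address);
        if (arraySupports(opcode)) {
            if (current.isEmpty()) {
                startAddress = address;
            }
            current.add(opcode);
        } else {
            flush(); // an unsupported instruction ends the candidate sequence
        }
        return hit;
    }

    /** Saves the collected sequence (assumption: only if longer than one instruction). */
    void flush() {
        if (current.size() > 1) {
            reconfigurationCache.put(startAddress, new ArrayList<>(current));
        }
        current.clear();
    }

    int cachedSequences() {
        return reconfigurationCache.size();
    }
}
```

In this model the first pass over a sequence only fills the cache; a later call to `observe` with the same start address reports a hit, which mirrors the point that gains appear when the sequence is repeated.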

Reconfiguration Cache List
A write command for the reconfiguration cache is sent.
This command saves the content of the buffer to this cache.
The list is built in real time, as the instructions are fetched from memory.
The buffer is 20 eight-bit registers long.
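
The buffer and its write command can be modelled roughly as below. The 20 eight-bit registers come from the slide; everything else is an assumption made for illustration, including the class and method names, what each byte encodes, and the choice to reject appends once the buffer is full.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the buffer that feeds the reconfiguration cache. */
final class ConfigurationBufferSketch {
    static final int BUFFER_REGISTERS = 20; // from the slide: 20 eight-bit registers

    private final byte[] buffer = new byte[BUFFER_REGISTERS];
    private int used = 0;
    // Stand-in for the reconfiguration cache: one entry per write command.
    private final List<byte[]> reconfigurationCache = new ArrayList<>();

    /** Stores one eight-bit value as an instruction is fetched; returns false when full. */
    boolean append(int value) {
        if (used == BUFFER_REGISTERS) {
            return false;
        }
        buffer[used++] = (byte) value;
        return true;
    }

    /** Models the write command: copies the buffer content to the cache and empties the buffer. */
    void writeToCache() {
        reconfigurationCache.add(Arrays.copyOf(buffer, used));
        used = 0;
    }

    int cacheEntries() {
        return reconfigurationCache.size();
    }

    /** Returns a copy of the cache entry at the given index. */
    byte[] entry(int index) {
        return reconfigurationCache.get(index).clone();
    }
}
```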

Reconfigurable Array
The coarse-grained reconfigurable array used is tightly coupled to the processor.
The array is divided into blocks, called cells.
The operand block (a sequence of Java bytecodes) previously detected is fitted into one or more of these cells in the array.

Cell of the Reconfigurable Array
The initial part of the cell is composed of three functional units (ALU, shifter, ld/st).
After the first part, six identical parts follow in sequence.
Each cell of the array has just one multiplier and takes exactly one processor cycle to complete execution.
For each cell in the array, 327 reconfiguration bits are needed.
Consequently, if the array is formed by 3 cells, 981 bits in the reconfiguration cache are necessary.
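
The cache size per configuration scales linearly with the number of cells, so it can be checked with a one-line calculation. The sketch below uses the 327 bits per cell stated on the slide; the class and method names are hypothetical.

```java
/** Hypothetical helper: reconfiguration bits needed for an array of a given size. */
final class ReconfigurationBits {
    static final int BITS_PER_CELL = 327; // from the slide

    static int cacheBitsPerConfiguration(int cells) {
        if (cells < 0) {
            throw new IllegalArgumentException("cells must be non-negative");
        }
        return Math.multiplyExact(cells, BITS_PER_CELL);
    }

    public static void main(String[] args) {
        // 3 cells * 327 bits per cell
        System.out.println(cacheBitsPerConfiguration(3)); // prints 981
    }
}
```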

Run-Time Detection and Analysis
The detection is performed at run time.
The next time that the sequence of instructions is detected, it can be executed in the array.
This prevents the loss of cycles for execution.

Results
The tool used to provide data on energy consumption, memory usage, and performance is a configurable, compiled-code, cycle-accurate simulator.
We compare the processor coupled with the reconfigurable array against VLIW versions with the same instruction set.

Conclusion
In this paper we showed the implementation of a Java-compliant architecture that works with a coarse-grained array in a native Java processor.
It boosts performance and reduces energy consumption.
The search for potential sequences of instructions is done at run time.
Furthermore, we demonstrated that there is no need for huge available parallelism in the application, as there is in VLIW and low-power architectures, to achieve good results.

Future Work
More algorithms for optimizations aimed at the reconfigurable array can be evaluated.
Furthermore, we could use another Java processor for the analysis of instructions instead of dedicated hardware.
Thank you.