Compiler-Based Register Name Adjustment for Low-Power Embedded Processors
Peter Petrov; Alex Orailoglu; ICCAD’03
2/19
Agenda
Introduction
Mathematical Formulation
Heuristic Solutions for RNA
  Register PermuTation (RPT)
  Register PerturBation (RPB)
Experimental Results
Conclusions
3/19
Introduction
Objective: low power
Key point: reduce bit-transition activity on the register index streams.
Concept: Register Name Adjustment (RNA)
4/19
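Example
Original code         After RNA
add r3, r2, r4        add r6, r2, r4
sub r6, r3, r5        sub r7, r6, r5
sub r3, r2, r6        sub r6, r2, r7
mul r4, r4, r5        mul r4, r4, r5

Register index streams (one column per operand field):
              Original            After RNA
              op1  op2  op3       op1  op2  op3
add           011  010  100       110  010  100
sub           110  011  101       111  110  101
sub           011  010  110       110  010  111
mul           100  100  101       100  100  101
Transitions   7    4    5         3    4    3

Total bit transitions: 7 + 4 + 5 = 16 (original), 3 + 4 + 3 = 10 (after RNA)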
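The transition counts on this slide can be checked mechanically. The sketch below is illustrative only: the function names are made up here, and it assumes each instruction is reduced to its three register indices.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two register indices differ."""
    return bin(a ^ b).count("1")


def column_transitions(column: list[int]) -> int:
    """Total bit transitions along one register-index stream."""
    return sum(hamming(x, y) for x, y in zip(column, column[1:]))


def field_transitions(program: list[tuple[int, int, int]]) -> list[int]:
    """Per-field transition counts for a list of three-register instructions."""
    return [column_transitions([ins[k] for ins in program]) for k in range(3)]


# Register indices of the four instructions, before and after renaming.
original = [(3, 2, 4), (6, 3, 5), (3, 2, 6), (4, 4, 5)]
renamed = [(6, 2, 4), (7, 6, 5), (6, 2, 7), (4, 4, 5)]

print(field_transitions(original), sum(field_transitions(original)))  # [7, 4, 5] 16
print(field_transitions(renamed), sum(field_transitions(renamed)))    # [3, 4, 3] 10
```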
6/19
Cost Function
f_c(reg_a, reg_b): the Hamming distance between reg_a and reg_b.
l: the l-th column in an instruction.
M(P_{i,j}): a bijective mapping function from the original register P_{i,j} to a new register index.
$$\text{cost} = \sum_{l=1}^{3} \sum_{i=1}^{n-1} f_c\big(M(P_{i,l}),\, M(P_{i+1,l})\big)$$
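Read with the definitions on this slide (f_c as Hamming distance, l ranging over the three register columns, M a bijection on register indices), the cost is the sum of Hamming distances between consecutive rows of each column after renaming. The following is a minimal sketch under that reading. Representing M as a tuple indexed by the original register, and the exhaustive search at the end, are illustrative choices and not the authors' method.

```python
from itertools import permutations


def f_c(reg_a: int, reg_b: int) -> int:
    """Hamming distance between two register indices."""
    return bin(reg_a ^ reg_b).count("1")


def cost(P, M) -> int:
    """Sum f_c over consecutive rows of each of the three register columns."""
    n = len(P)
    return sum(
        f_c(M[P[i][l]], M[P[i + 1][l]])
        for l in range(3)
        for i in range(n - 1)
    )


# Register indices of the add/sub/sub/mul example (one row per instruction).
P = [(3, 2, 4), (6, 3, 5), (3, 2, 6), (4, 4, 5)]

identity = tuple(range(8))
# r3 -> r6, r6 -> r7, r7 -> r3; every other register keeps its index.
M = (0, 1, 2, 6, 4, 5, 7, 3)

print(cost(P, identity))  # 16
print(cost(P, M))         # 10

# Exhaustive search over all 8! bijections; feasible only for tiny register files.
best_M = min(permutations(range(8)), key=lambda m: cost(P, m))
print(best_M, cost(P, best_M))
```

The exhaustive search grows factorially with the number of registers, which is why heuristic solutions are of interest.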
7/19
Literals
Literals: unchangeable fields in an instruction, such as an opcode or an immediate operand.
L(i, j): records the literal positions.
$$M'(P_{i,j}) = \begin{cases} M(P_{i,j}) & \text{if } L_{i,j} = 0 \\ P_{i,j} & \text{if } L_{i,j} = 1 \end{cases}$$
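A small sketch of how the literal mask could be applied when renaming: fields flagged in L pass through unchanged and all other fields go through M. The function name `apply_mapping`, the two-row fragment, and the particular mapping are hypothetical values chosen for illustration.

```python
def apply_mapping(P, L, M):
    """Rename every non-literal field of P through M; keep literals as they are."""
    return [
        [p if is_literal else M[p] for p, is_literal in zip(row_p, row_l)]
        for row_p, row_l in zip(P, L)
    ]


# Hypothetical fragment: the third field of the first row is an immediate 0.
P = [[5, 1, 0], [3, 2, 5]]
L = [[0, 0, 1], [0, 0, 0]]

# Hypothetical bijection on the indices 0..7 (M[r] is the new name of r).
M = {0: 1, 1: 4, 2: 6, 3: 7, 4: 0, 5: 2, 6: 3, 7: 5}

print(apply_mapping(P, L, M))  # [[2, 4, 0], [7, 6, 2]]
```

The immediate 0 stays 0 even though M would send register 0 to index 1.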
8/19
Example
Code:
  ld  r5, (r1) 0
  add r3, r2, r5
  add r4, r3, r2
  mul r3, r4, r3
  st  r3, r7 (10)

P =
  v5  v1  0
  v3  v2  v5
  v4  v3  v2
  v3  v4  v3
  v3  v7  10

L =
  0  0  1
  0  0  0
  0  0  0
  0  0  0
  0  0  1

Pair occurrences:
  (v3, v4): 3    (v4, v7): 1    (v1, v2): 1
  (v2, v3): 2    (v5, v2): 1    (v5, v3): 1
  ( 0, v5): 1    (v3, v3): 1    (10, v3): 1
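The pair occurrences listed on this slide are consistent with counting, column by column, how often two entries of P appear in consecutive rows, ignoring order. A sketch of that counting follows, with `pair_counts` as an illustrative name and each pair stored as a sorted tuple.

```python
from collections import Counter


def pair_counts(P):
    """Count unordered pairs of vertically adjacent entries in each column of P."""
    counts = Counter()
    for column in zip(*P):
        for a, b in zip(column, column[1:]):
            counts[tuple(sorted((a, b)))] += 1
    return counts


P = [
    ["v5", "v1", "0"],
    ["v3", "v2", "v5"],
    ["v4", "v3", "v2"],
    ["v3", "v4", "v3"],
    ["v3", "v7", "10"],
]

for pair, count in pair_counts(P).most_common():
    print(pair, count)
```

The pair (v3, v4) is counted three times and (v2, v3) twice; every other pair, including the self-pair (v3, v3), occurs once.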
10/19
Flow
RPB: maximize the distribution skew of register pair occurrences.
1. Select Vi and Vj that maximize f(eij) + f(eji).
2. Pick names for Vi and Vj and compute the cost.
3. All unassigned indices tried? No: back to step 2. Yes: continue.
4. Name Vi and Vj with the minimum cost.
5. All registers named? No: back to step 1. Yes: finish.
Flowchart annotation: brute-force, time-consuming.
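The naming loop in this flow can be sketched as a greedy procedure: visit register pairs from most to least frequent, try every unassigned index for the endpoints that are still unnamed, and keep the cheapest choice. The cost used below (pair frequency times Hamming distance, summed over edges to registers that already have names) is a simplifying assumption for illustration and is not the cost function from the slides. All names in the code are hypothetical.

```python
from itertools import permutations


def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


def greedy_names(weights: dict[tuple[str, str], int], num_indices: int) -> dict[str, int]:
    """Assign register indices greedily, heaviest pair first.

    weights maps an unordered pair (u, v) to how often u and v appear in
    consecutive rows of the same column; it stands in for f(e_ij) + f(e_ji).
    """
    regs = {r for pair in weights for r in pair}
    if len(regs) > num_indices:
        raise ValueError("more registers than available indices")
    name: dict[str, int] = {}
    free = set(range(num_indices))

    def local_cost(trial: dict[str, int]) -> int:
        # Weighted Hamming distance over edges that touch a trial register
        # and whose endpoints are both named once the trial is applied.
        merged = {**name, **trial}
        return sum(
            w * hamming(merged[u], merged[v])
            for (u, v), w in weights.items()
            if u in merged and v in merged and (u in trial or v in trial)
        )

    for (u, v), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        todo = [r for r in dict.fromkeys((u, v)) if r not in name]
        if not todo:
            continue
        # Try every way of giving the unnamed endpoints unassigned indices.
        best = min(
            (dict(zip(todo, combo)) for combo in permutations(sorted(free), len(todo))),
            key=local_cost,
        )
        name.update(best)
        free -= set(best.values())
    return name


# Register-to-register pair counts from the earlier example (literals omitted).
weights = {
    ("v3", "v4"): 3, ("v2", "v3"): 2, ("v4", "v7"): 1,
    ("v1", "v2"): 1, ("v5", "v2"): 1, ("v5", "v3"): 1,
}
names = greedy_names(weights, num_indices=8)
print(names)
```

Trying every pair of unassigned indices for each selected pair is the brute-force inner loop that the flowchart annotation flags as time-consuming.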
11/19
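Cost Function of RPT
C_ij is composed of the following terms:
  f(e_ij) + f(e_ji) and H(c_i, c_j), for the edges e_ij and e_ji between Vi and Vj
  H(c_i, c_k) and H(c_k, c_i), for k in L_i and R_i
  H(c_j, c_k) and H(c_k, c_j), for k in L_j and R_j
[Figure: Vi and Vj connected by the edges e_ij and e_ji; Vi is also connected to Literal_i and Reg_i nodes, Vj to Literal_j and Reg_j nodes.]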
12/19
Register PerturBation
Number of higher utilization frequency ↓: Performance ↑
Number of self-transitions ↑: Performance ↑
13/19
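Cost Function of RPB
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
\hat{D}: D normalized using P and N.
C: a combination of \hat{\sigma} and \hat{D} with weights \lambda and 1 - \lambda.
Maximize σ to maximize the distribution skew of register pair occurrences.
D: the number of self-transitions.
Question: does a larger σ imply larger skewness?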
Register PerturBation
Commutativity Transformation
  Before            After
  r1 <- r2, r3      r1 <- r2, r3
  r4 <- r1, r2      r4 <- r2, r1
  Note: these instructions must be commutable.
Dead Register Reassignment
  Before            After
  r1 <- r2, r3      r1 <- r2, r3
  r4 <- r1, r2      r2 <- r1, r2
  r2 <- r3, r4      r2 <- r3, r2
  Note: r4 must be dead after the third instruction.
Question: would the data dependency increase?
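The commutativity transformation can be illustrated with a simple local rule: swap the two source operands of a commutable instruction when that lowers the Hamming distance to the source fields of the previous instruction. This is a sketch under assumptions. The opcode set, the tuple layout (opcode, dest, src1, src2), and the choice to look only at the preceding instruction are all made up here for illustration.

```python
def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


# Assumed set of commutable opcodes; a real ISA would define its own.
COMMUTATIVE = {"add", "mul", "and", "or", "xor"}


def swap_sources(program):
    """Swap src1/src2 of commutable instructions when it reduces transitions
    against the source fields of the instruction emitted just before."""
    out = list(program[:1])
    for op, dest, s1, s2 in program[1:]:
        _, _, p1, p2 = out[-1]
        keep = hamming(p1, s1) + hamming(p2, s2)
        swap = hamming(p1, s2) + hamming(p2, s1)
        if op in COMMUTATIVE and swap < keep:
            s1, s2 = s2, s1
        out.append((op, dest, s1, s2))
    return out


# The slide's two-instruction example, with "add" assumed as the opcode.
before = [("add", 1, 2, 3), ("add", 4, 1, 2)]
print(swap_sources(before))  # [('add', 1, 2, 3), ('add', 4, 2, 1)]
```

With a non-commutable opcode such as a subtract, the same rule leaves the operands in place.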
15/19
Dead Register Reassignment
[Figure: two diagrams placing items 1 to 8 under the registers r1, r2, r3; a self-transition is marked.]
17/19
Experimental Results
RPB columns are indexed by the weight λ of the RPB cost C.

                     RPT               RPB
Circuit   Total      Total    Impr%    λ(0.0)   λ(0.25)  λ(0.5)   λ(0.75)  λ(1.0)   Impr%
fdct      70         58       18.09    47       46       46       46       46       34.55
ej        73,837     63,169   14.45    49,203   48,933   48,934   48,934   45,224   38.75
mmul      7,613      6,463    15.11    4,710    4,460    4,460    4,460    4,593    41.41
tri       5,929      5,400    8.92     3,490    3,489    3,489    3,489    3,335    43.76
sor       1,440      1,142    20.69    1,004    1,003    1,043    1,043    1,004    30.30
adpcm_e   20,513     15,338   25.23    15,897   15,144   15,144   15,144   14,750   28.10
adpcm_d   17,212     13,689   20.46    13,393   12,655   12,655   12,655   11,404   33.74
Does larger σ imply larger skewness?
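The Impr% columns appear to be the relative reduction with respect to the original total, with the RPB figure taken at the best λ for each circuit. That reading is an assumption, not something the slide states. The sketch below recomputes three rows under it. The results agree with the table to within about 0.01 percentage points. Not every row reproduces exactly from the totals shown (fdct, for example, does not).

```python
def improvement(baseline: int, optimized: int) -> float:
    """Relative reduction in bit transitions, in percent."""
    return 100.0 * (baseline - optimized) / baseline


# circuit: (original total, RPT total, best RPB total across the λ columns)
rows = {
    "ej": (73837, 63169, 45224),
    "mmul": (7613, 6463, 4460),
    "tri": (5929, 5400, 3335),
}

for circuit, (total, rpt, rpb_best) in rows.items():
    print(f"{circuit}: RPT {improvement(total, rpt):.2f}%  RPB {improvement(total, rpb_best):.2f}%")
```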
19/19
Conclusions
Minimizing bit transitions on the register index streams reduces power consumption.
RPT reduces bit transitions by up to 25%; RPB by up to 44%.