Introduction | Decoupled LM Management | Experiments | Conclusion
Optimizing Local Memory Allocation and Assignment Through a Decoupled Approach
Boubacar Diouf 1, Ozcan Ozturk 2, Albert Cohen 1
1 INRIA Saclay-Ile de France & Universite Paris-Sud 11
2 Bilkent University
October 11, 2009 / LCPC’09
B Diouf, O Ozturk, A Cohen Decoupled Local Memory Allocation
Outline
1 Introduction
2 Decoupled LM Management
3 Experiments
4 Conclusion
- Many processors have local memories
  - Digital signal processors
  - Stream-processing units (GPUs) and network processors
  - Cell Broadband Engine's Synergistic Processing Units (SPUs)
- Why?
  - Fast
  - Predictability
  - Power efficiency
- Array allocation on Local Memory (LM)?
  - Static: the allocation decision is fixed for the entire execution
  - Dynamic: the allocation depends on the program point
Motivation

Register Allocation
- Allocation phase
  - Rely on Maxlive
  - Choose register residents
- Assignment phase
  - Which register for which variable
  - Polynomial under SSA
- Decoupling: isolate the hard problem of allocation (spilling)

For more, please attend the SSA tutorial
Example code:

    d = ...
    b = load ...
    b = b * d
    a = load ...
    a = d / a
    c = a / b
    a = b + c
    store c
    store a
[Figure: the code shown next to a stack and 2 available registers. d is placed on the stack; a, b and c occupy the registers.]
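Live variables between consecutive instructions of the code, with 2 available registers:

    live:   d | b,d | b,d | a,b,d | a,b | b,c | a,c | a
    count:  1 | 2   | 2   | 3     | 2   | 2   | 2   | 1

Maxlive is the largest count, here 3, which exceeds the 2 available registers.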
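The live sets and counts listed for this example can be reproduced with a short backward liveness pass. The sketch below is illustrative only: the tuple encoding of instructions and the name `live_out_sets` are assumptions, not part of the talk, and the loads are modeled as having no register operands.

```python
def live_out_sets(instrs):
    """Backward liveness over straight-line code.

    Each instruction is (defined_var_or_None, [used_vars]).
    Returns the set of variables live just after each instruction.
    """
    live = set()
    out = []
    for dest, uses in reversed(instrs):
        out.append(set(live))   # live-out of this instruction
        live.discard(dest)      # a definition kills the variable
        live.update(uses)       # uses make their operands live
    out.reverse()
    return out


# The example code from the slide, in the assumed tuple encoding.
CODE = [
    ("d", []),           # d = ...
    ("b", []),           # b = load ...
    ("b", ["b", "d"]),   # b = b * d
    ("a", []),           # a = load ...
    ("a", ["d", "a"]),   # a = d / a
    ("c", ["a", "b"]),   # c = a / b
    ("a", ["b", "c"]),   # a = b + c
    (None, ["c"]),       # store c
    (None, ["a"]),       # store a
]

live = live_out_sets(CODE)
counts = [len(s) for s in live]
maxlive = max(counts)
```

Nothing is live after the final store, so the last count is 0; the first eight counts are the ones shown on the slide.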
[Figure, next step: after spilling (d goes to the stack in the figure), the counts become 1, 2, 2, 2, 2, 2, 2, 1, so Maxlive is 2 and fits the 2 available registers.]
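The same code in SSA form:

    d  = ...
    b1 = load ...
    b2 = b1 * d
    a1 = load ...
    a2 = d / a1
    c1 = a2 / b2
    a3 = b2 + c1
    store c1
    store a3

Live variables (printed on the slide without SSA indices), with 2 available registers:

    live:   d | b,d | b,d | a,b,d | a,b | b,c | a,c | a
    count:  1 | 2   | 2   | 3     | 2   | 2   | 2   | 1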
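After spilling, the counts are 1, 2, 2, 2, 2, 2, 2, 1 (Maxlive = 2).

[Figure: the live ranges b1, b2, a1, a2, c1 and a3 drawn against the SSA code and placed in the 2 available registers.]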
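Once Maxlive fits the register count, assignment on straight-line SSA code amounts to coloring intervals, which a greedy scan handles. A minimal sketch, assuming half-open live intervals indexed by instruction number and assuming d stays in memory; both are illustrative choices rather than details from the slides, and `assign_registers` is a hypothetical name.

```python
def assign_registers(intervals, num_regs):
    """Greedy scan over half-open live intervals [start, end).

    intervals: name -> (start, end). Returns name -> register index.
    Raises ValueError when more than num_regs intervals overlap.
    """
    assignment = {}
    active = []                       # (end, name) pairs currently holding a register
    free = list(range(num_regs))
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose interval ended at or before this start.
        for e, n in list(active):
            if e <= start:
                active.remove((e, n))
                free.append(assignment[n])
        if not free:
            raise ValueError("Maxlive exceeds the number of registers")
        free.sort()
        assignment[name] = free.pop(0)
        active.append((end, name))
    return assignment


# Intervals read off the SSA code: [definition index, last use index).
SSA_INTERVALS = {
    "b1": (1, 2),
    "b2": (2, 6),
    "a1": (3, 4),
    "a2": (4, 5),
    "c1": (5, 7),
    "a3": (6, 8),
}

registers = assign_registers(SSA_INTERVALS, 2)
```

Because an interval that ends at an instruction frees its register for the value defined there, no choice made earlier ever has to be revisited, which is why this phase stays cheap once spilling has been decided.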
Motivation

Decoupled Local Memory Allocation
- Allocation phase
  - Rely on Maxlive, revised as the maximal total size of the simultaneously live arrays
  - Choose decision points for splitting
- Assignment phase
  - Which offset for which array
  - Colorability?
  - Complexity?
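The revised Maxlive can be pictured as a size-weighted version of the register case: instead of counting live variables, sum the sizes of the arrays live at each point. The sketch below is a hedged illustration; the array names, sizes, live ranges and the function name `weighted_maxlive` are all made up for the example.

```python
def weighted_maxlive(live_ranges, sizes):
    """Peak total size of simultaneously live arrays.

    live_ranges: name -> (start, end), half-open over decision points.
    sizes: name -> size in bytes (or any consistent unit).
    """
    # The total can only change where some live range starts or ends.
    points = sorted({p for s, e in live_ranges.values() for p in (s, e)})
    peak = 0
    for p in points:
        total = sum(sizes[n] for n, (s, e) in live_ranges.items() if s <= p < e)
        peak = max(peak, total)
    return peak


# Hypothetical arrays, not taken from the talk.
RANGES = {"A": (0, 4), "B": (2, 6), "C": (5, 8)}
SIZES = {"A": 64, "B": 32, "C": 16}

peak = weighted_maxlive(RANGES, SIZES)
```

With a local memory of 80 units, this peak of 96 says that not every array can stay resident over its whole live range, so the allocation phase has to evict or split something.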
Motivation Example

Choice of decision points:
- Points where loads and stores are going to be inserted

    // Nested within outer loops
    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        C[i][j] = /* ... */;

    F[0][0]=1; F[0][1]=2; F[0][2]=1;
    F[1][0]=2; F[1][1]=4; F[1][2]=2;
    F[2][0]=1; F[2][1]=2; F[2][2]=1;

[Figure labels: two points in this code are marked "decision".]
Allocation schemes

1. At every array instruction
   - Finer decision points
   - Excessive complexity (if ILP is used)
2. Every time an array becomes live
   - Similar to SSA-based register allocation
3. For the whole method
   - Spill-everywhere problem (static)
   - We cannot rely on Maxlive

    // Nested within outer loops
    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        C[i][j] = /* ... */;

    F[0][0]=1; F[0][1]=2; ... ... F[2][1]=2; F[2][2]=1;
Preliminary transformations

- Tiling
- Loop distribution
- Strip mining
Original loop nest:

    for (i=0; i<N; i++)
      for (j=0; j<N; j++)
        C[i][j] = /* ... */;

    F[0][0]=1; /* ... */ F[2][2]=1;
After strip mining:

    for (i=0; i<N; i++)
      // Outer strip-mined loop
      for (jj=0; jj<N+B-1; jj+=s)
        // Inner strip-mined loop
        for (j=jj; j<N && j<jj+s; j++)
          C[i][j] = /* ... */;

    F[0][0]=1; /* ... */ F[2][2]=1;
Abstracted as block operations:

    for (i=0; i<N; i++)
      // Outer strip-mined loop
      for (jj=0; jj<N+B-1; jj+=s)
        STORE(C[i][jj..min(jj+B-1,N-1)]);

    STORE(F[0..2][0..2]);
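Strip mining only regroups the iterations of the j loop, so the order of accesses is unchanged while each strip becomes one block that can be moved as a unit. The sketch below checks both points. It is an illustration under assumptions: the strip length is a single parameter `block`, the strip loop stops at `n` rather than using the slide's bound, and all function names are hypothetical.

```python
def original_order(n):
    """Iteration order of the untransformed i/j loop nest."""
    return [(i, j) for i in range(n) for j in range(n)]


def strip_mined_order(n, block):
    """Same nest with the j loop split into strips of `block` iterations."""
    order = []
    for i in range(n):
        for jj in range(0, n, block):                 # loop over strips
            for j in range(jj, min(jj + block, n)):   # loop inside one strip
                order.append((i, j))
    return order


def strip_ranges(n, block):
    """Inclusive [first, last] column range of each strip.

    Each range corresponds to one block operation on a row of C.
    """
    return [(jj, min(jj + block - 1, n - 1)) for jj in range(0, n, block)]


same_order = strip_mined_order(10, 4) == original_order(10)
ranges = strip_ranges(10, 4)
```

The last strip is shorter when the block size does not divide the row length, which is what the `min(...)` in the range expression accounts for.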
Abstracted Model

- Array blocks are like scalar variables in register allocation
- Extension of SSA to operate on array blocks
  - Not array SSA: no dataflow of individual array elements
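Pointer reconciliation

[Figure: control-flow graph with two predecessor blocks and a join.]
- Left block: ... = B; A1 = ... (its LM drawing shows B and A1)
- Right block: ... = C; A2 = ... (its LM drawing shows A2 and C)
- Join block: A3 = Φ(A1,A2)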
[Figure, continued: at the join the LM drawing carries a question mark. Is A3 located where A2 was placed, or where A1 was placed?]
PTR (A1) PTR (A2)
On the incoming edges of the join: PTR(A3) = PTR(A1) and PTR(A3) = PTR(A2)
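
The pointer mechanism on this slide can be sketched in a few lines. This is only an illustration, not the authors' implementation: the LM is modeled as a flat Python list, the offsets 4 and 0 and the stored values are made up, and it assumes that the copy placed on each incoming edge takes the pointer of the Φ operand defined on that path.

```python
LM_SIZE = 8  # hypothetical local-memory size, in words


def run(take_left: bool) -> int:
    """Execute one of the two paths, then read A3 in the join block."""
    lm = [0] * LM_SIZE
    ptr = {}  # ptr["X"] plays the role of PTR(X): base offset of X in LM

    if take_left:
        # Left path (assumed layout): B at offsets 0..3, A1 at offset 4.
        ptr["A1"] = 4
        lm[ptr["A1"]] = 11          # A1 = ...
        ptr["A3"] = ptr["A1"]       # pointer copy on the left incoming edge
    else:
        # Right path (assumed layout): A2 at offset 0, C at offsets 4..7.
        ptr["A2"] = 0
        lm[ptr["A2"]] = 22          # A2 = ...
        ptr["A3"] = ptr["A2"]       # pointer copy on the right incoming edge

    # Join block: A3 = Φ(A1, A2). Every access goes through PTR(A3),
    # so no array data has to be moved to a common location.
    return lm[ptr["A3"]]


print(run(True), run(False))
```

The point of the indirection is that the join block never needs to know which layout was in effect; only a pointer is copied on each edge.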
Allocation
- The allocation problem is solved by Integer Linear Programming (ILP)
- Rely on maxlive to perform allocation
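
To make the role of maxlive concrete, here is a minimal sketch. It assumes maxlive means the largest total size of arrays simultaneously live at any program point, models each array as a hypothetical `(start, end, size)` live interval, and uses an exhaustive search as a stand-in for the ILP solver. The `benefit` values are invented for the example, and the sketch only decides whether a whole array is kept in LM.

```python
from itertools import combinations


def maxlive(intervals):
    """Peak total size live at once; each interval is (start, end, size), live on [start, end)."""
    events = []
    for start, end, size in intervals:
        events.append((start, size))
        events.append((end, -size))
    # At equal program points, process releases (negative deltas) first.
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def allocate(arrays, capacity):
    """arrays: name -> (start, end, size, benefit). Returns the most beneficial
    set of arrays whose maxlive fits in `capacity` (brute force, not ILP)."""
    best, best_gain = frozenset(), 0
    names = sorted(arrays)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if maxlive([arrays[n][:3] for n in subset]) <= capacity:
                gain = sum(arrays[n][3] for n in subset)
                if gain > best_gain:
                    best, best_gain = frozenset(subset), gain
    return best


arrays = {"A": (0, 4, 100, 10), "B": (2, 6, 50, 4), "C": (4, 8, 100, 9)}
print(sorted(allocate(arrays, capacity=100)))
```

Here A and C never overlap in time, so keeping both costs only 100 units of LM even though their sizes sum to 200.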
Assignment
Two options:
- Ignore the fragmentation problem in the allocation step and rely on a fragmentation-avoidance heuristic
- Extend the allocation step to guarantee fragmentation-free assignment: open
Compare with an integrated approach (an ILP problem that does not scale)
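
The fragmentation issue behind the first option can be illustrated with a simple first-fit assignment. This is a generic sketch, not the heuristic used by the authors: arrays are hypothetical `(start, end, size)` live intervals, offsets are assigned in order of start point, and the function returns `None` when no contiguous hole is large enough.

```python
def first_fit(arrays, capacity):
    """arrays: name -> (start, end, size), live on [start, end).
    Returns name -> LM offset, or None if some array cannot be placed."""
    placed = {}  # name -> (start, end, size, offset)
    for name, (start, end, size) in sorted(arrays.items(), key=lambda kv: kv[1][0]):
        # Address ranges held by placed arrays whose lifetime overlaps this one.
        busy = sorted(
            (o, o + s) for (b, e, s, o) in placed.values() if b < end and start < e
        )
        offset = 0
        for lo, hi in busy:
            if offset + size <= lo:
                break               # the hole before this range is big enough
            offset = max(offset, hi)
        if offset + size > capacity:
            return None             # no contiguous hole: fragmentation
        placed[name] = (start, end, size, offset)
    return {n: p[3] for n, p in placed.items()}


# Non-overlapping lifetimes can share the same addresses.
print(first_fit({"A": (0, 4, 100), "C": (4, 8, 100)}, capacity=100))

# At most 100 units are live at any point, yet W (60 units) finds only
# two separate holes of 30 units on either side of Y.
frag = {"X": (0, 5, 30), "Y": (0, 10, 40), "Z": (0, 5, 30), "W": (5, 10, 60)}
print(first_fit(frag, capacity=100))
```

The second case is exactly the situation where an allocation that is feasible by maxlive still needs a spill or a displacement at assignment time.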
Benchmarks
| Benchmark | Brief description | Suite | Data size | Arrays/blocks |
|---|---|---|---|---|
| Edge-Detect | Edge detection in an image | UTDSP | 196644 | 4/385 |
| D-FFT | 256-point complex FFT | UTDSP | 2032 | 7/7 |
| Bmcm | Water molecular dynamics | Perfect Club | 125240 | 10/310 |
| MxM | Matrix multiplication | n.a. | 120000 | 3/300 |

| Constant | Latency |
|---|---|
| latency LM | 8 |
| latency MM | 128 |
| latency move(sv) | 8+2sv |
| latency spill(sv) | 128+4sv |
| latency reload(sv) | 128+4sv |
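
The latency constants above translate directly into a small cost model. The sketch below assumes that `sv` is the size of the block being moved, spilled, or reloaded, and that LM and MM latencies are per-access costs; the slide states neither, and the helper `lm_pays_off` is a hypothetical addition for illustration.

```python
LATENCY_LM = 8     # one access served from local memory
LATENCY_MM = 128   # one access served from main memory


def move(sv):
    """Cost of moving a block of size sv inside LM."""
    return 8 + 2 * sv


def spill(sv):
    """Cost of writing a block of size sv from LM back to main memory."""
    return 128 + 4 * sv


def reload(sv):
    """Cost of loading a block of size sv from main memory into LM."""
    return 128 + 4 * sv


def lm_pays_off(accesses, sv):
    """Hypothetical check: is reload + LM accesses cheaper than MM accesses?"""
    return reload(sv) + accesses * LATENCY_LM < accesses * LATENCY_MM


print(reload(64), lm_pays_off(accesses=100, sv=64), lm_pays_off(accesses=2, sv=64))
```

With these constants, bringing a 64-unit block into LM costs 384, which is recovered after a handful of accesses since each LM access saves 120 over main memory.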
Results
[Figure: four bar charts, one per benchmark (BMCM, MXM, FFT, EDGE_DETECT), comparing the decoupled and integrated approaches. Vertical axis from 0 to 1.2. Horizontal axis, LM size: minsize, +20%, +40%, maxsize-1, maxsize, fitsize. Axis annotations for minsize / fitsize: BMCM 16% / 5166%, MXM 33% / 10000%, FFT 24% / 100%, EDGE_DETECT 24% / 9435%.]
Conclusion
- New bridge between LM management and register allocation
- Validation by experiments
  - Optimal allocation relying on maxlive
  - No fragmentation-induced spills
  - No fragmentation-induced displacements