data reuse analysis for automated synthesis of...

84
DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF CUSTOM INSTRUCTIONS IN SLIDING WINDOW APPLICATIONS Georgios Zacharopoulos Laura Pozzi Università della Svizzera italiana (USI Lugano), Faculty of Informatics Giovanni Ansaloni IMPACT 2017 Jan 23, 2017 Stockholm, Sweden

Upload: others

Post on 03-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF CUSTOM INSTRUCTIONS IN SLIDING WINDOW APPLICATIONS

Georgios Zacharopoulos Laura Pozzi

Università della Svizzera italiana (USI Lugano), Faculty of Informatics

Giovanni Ansaloni

IMPACT 2017

Jan 23, 2017 Stockholm, Sweden

Page 2: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

INTRODUCTION

MOORE’S LAW - DENNARD SCALING

Expectation

▸ Performance should keep increasing

Reality

▸ Performance is NOT increasing anymore

2

Page 3: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

INTRODUCTION

BREAKDOWN OF DENNARD SCALING

▸ Single core —> Multiple cores

▸ Dark Silicon

3

Page 4: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

INTRODUCTION

HETEROGENEOUS ERA

▸ Need for Specialised HW

▸ Multiple cores —> Heterogeneous Computing

http://developer.amd.com/resources/heterogeneous-computing/what-is-heterogeneous-system-architecture-hsa/[1]

[1]

4

Page 5: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

BACKGROUND

SPECIALISATION AND PERFORMANCE

5

GENERAL PURPOSE

CPU

DOMAIN SPECIFIC PROCESSORS

(ADSPS)

APPLICATION SPECIFIC INSTRUCTION-SET

PROCESSORS (ASIPS)

ASICS

CPU + HW ACCELERATORS

PERFORMANCE

SPEC

IALIS

ATIO

N

Page 6: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

RELATED WORK

DATA TRANSFER IS THE BOTTLENECK

▸ Accelerate data transfer between accelerators and memory

▸ Custom storage as an optimisation

➡ Identify Data reuse for building custom storage

[4] G. Gutin, A. Johnstone, J. Reddington, E. Scott, and A. Yeo. An algorithm for finding input-output constrained convex sets in an acyclic digraph. J. Discrete Algorithms, 13:47–58, 2012. [5] E. Giaquinta, A. Mishra, and L. Pozzi. Maximum convex subgraphs under I/O constraint for automatic identification of custom instructions. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(3):483–494, 2015. [6] P. Biswas, N. Dutt, P. Ienne, and L. Pozzi. Automatic identification of application-specific functional units with architecturally visible storage. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages 212–217, Mar. 2006. [7] M.Haaß,L.Bauer,andJ.Henkel.Automatic Custom Instruction Identification in Memory Streaming Algorithms. In Proceedings of the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems, pages 1–9, Oct. 2014.

[4] [5]

[6] [7]

6

Page 7: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

RELATED WORK

DATA TRANSFER IS THE BOTTLENECK

▸ Rely on source-to-source transformations

▸ Limited Parallelism

▸ Large Internal Buffers

[8] W. Meeus and D. Stroobandt. Automating data reuse in high-level synthesis. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages 1–4, Mar. 2014. [9] L.-N. Pouchet, P. Zhang, P. Sadayappan, and J. Cong. Polyhedral-based data reuse optimization for configurable computing. In Proceedings of the 2013 ACM/SIGDA 21st International Symposium on Field Programmable Gate Arrays, pages 29–38, Feb. 2013. [10] M. Schmid, O. Reiche, F. Hannig, and J. Teich. Loop coarsening in C-based high-level synthesis. In Proceedings of the 26th International Conference on Application-specific Systems, Architectures and Processors, pages 166–173, July 2015.

[8] [9] [10]

7

[8] [9]

[10]

Page 8: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

RELATED WORK

DATA TRANSFER IS THE BOTTLENECK

▸ Rely on source-to-source transformations

▸ Limited Parallelism

▸ Large Internal Buffers

[8] W. Meeus and D. Stroobandt. Automating data reuse in high-level synthesis. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages 1–4, Mar. 2014. [9] L.-N. Pouchet, P. Zhang, P. Sadayappan, and J. Cong. Polyhedral-based data reuse optimization for configurable computing. In Proceedings of the 2013 ACM/SIGDA 21st International Symposium on Field Programmable Gate Arrays, pages 29–38, Feb. 2013. [10] M. Schmid, O. Reiche, F. Hannig, and J. Teich. Loop coarsening in C-based high-level synthesis. In Proceedings of the 26th International Conference on Application-specific Systems, Architectures and Processors, pages 166–173, July 2015.

[8] [9] [10]

8

[8] [9]

[10]

➡Multiple Datapaths ➡Area efficient Accelerators

Page 9: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

SW ANALYSIS FOR DATA REUSE

9

void Sobelsystem() {

outer_loop:for(i = 1; i < 100; i++) {

inner_loop:for(j = 1; j < 100; j++) {

Sobelmodule(image[i-1][j-1], image[i][j-1], image[i+1][j+1], image[i-1][j], image[i+1][j], image[i-1][j+1], image[i][j+1], image[i+1][j-1], image[i][j]);

}

}

}

LOADS

Page 10: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

SW ANALYSIS FOR DATA REUSE

10

Page 11: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

SW ANALYSIS FOR DATA REUSE

11

Page 12: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

SW ANALYSIS FOR DATA REUSE

12

Page 13: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 13

WINDOW 1

Page 14: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 14

WINDOW 2

Page 15: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 15

WINDOW 2

WINDOW 1

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

UNROLLING

Page 16: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 16

WINDOW 2

WINDOW 1

UNROLLING

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

PIPELINING

Page 17: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 17

WINDOW 2

WINDOW 1

UNROLLING

PIPELINING

Page 18: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

COMPILERS

▸ Perform Source Code Analysis

▸ Identify Data Reuse in Sliding Window Applications

18

Page 19: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

COMPILERS

▸ Perform Source Code Analysis

▸ Identify Data Reuse in Sliding Window Applications

[2] C. Lattner and V. Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the 2nd International Symposium on Code generation and optimization, Palo Alto, California, Mar 2004. [3] T. Grosser, H. Zheng, R. Aloor, A. Simbürger, A. Größlinger, and L.-N. Pouchet. Polly-polyhedral optimization in llvm. In Proceedings of the First International Workshop on Polyhedral Compilation Techniques (IMPACT), volume 2011, 2011.

19

[2] [3]

Page 20: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

LLVM POLLY SCOP ANALYSIS FOR DATA REUSE

20

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

WINDOW SIZE

Page 21: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

LLVM POLLY SCOP ANALYSIS FOR DATA REUSE

21

STRIDE

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

Page 22: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

LLVM POLLY SCOP ANALYSIS FOR DATA REUSE

22

FRAME SIZE

Page 23: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

Parameters retrieved with LLVM Polly Analysis Pass:

23

▸ Horizontal and Vertical Window Size

▸ Stride

▸ Iteration Domain (Frame Size)

Page 24: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

24

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������

Page 25: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

25

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������AVAILABLE INPUT DATA WIDTH

Page 26: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

26

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������AVAILABLE INPUT DATA WIDTH

Buffer Size:

Horizontal Width = Input Width (INw)INw

Page 27: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

27

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������

Buffer Size:

Horizontal Width = Input Width (INw)

Example:

INw = Horizontal Win. Size + 1

INw

Page 28: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

28

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������

Buffer Size:

Horizontal Width = Input Width (INw)

Example:

INw = Horizontal Win. Size + 1

Vertical Width = Vertical Win. Size

INw

Page 29: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY

HW IMPLEMENTATION

29

����

����

�����������������

���� �������������

���

���

������������

������� � ���������

������������������

��� �������

� ������������

����

���������

Buffer Size:

Horizontal Width = Input Width (INw)

Vertical Width = Vertical Win. Size

Example:

INw = Horizontal Win. Size + 1

INw

Page 30: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 30

WINDOW 2

WINDOW 1

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

BUFFER

Page 31: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 31

WINDOW 2

WINDOW 1

STRIDE - STORE NEW ROW

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

STRIDE - DISCARD TOPMOST ROW

1.1 Divisors and the Abel-Jacobi map

d.

Next, we define the Picard group of X as the sheaf cohomology group

PicX

:= H1(X,O⇤X

) ,

which is well known to be isomorphic to the set of line bundles up to isomorphism, withgroup structure induced by the usual tensor product of coherent sheaves.To any divisor D 2 Div

X

one can associate the line bundle OX

(D) defined over anyopen set U ⇢ X by the prescription

H0(U,OX

(D)) =�f 2 k(X)⇤ | (f)|U +D|U � 0

which can be seen as the line bundle whose global sections are locally controlled by D.In other words, the sections of O

X

(D) are allowed to present poles on any point of thesupport of D, with order bounded by the coefficient of the point appearing in the formalsum.It is an easy consequence of the Riemann-Roch Theorem that every line bundle L onX admits a global section s and, thus, it can be written in the form L = O

X

(D) withD = (s) being the divisor corresponding to such a section. Thus we define the degree

of L to be the degree of the divisor D. We will denote by Pic dX

the set of line bundlesof degree d.Based on the above construction, we give a definition of the so called Abel-Jacobi

map by the assignment

u : DivX

! PicX

D 7! OX

(D)

Remark 1.2. Recall that classically, in the theory of smooth curves over C, the Abel-Jacobi map is defined for a one-point divisor P (and then extended linearly) by meansof the Abelian integrals as

u : Xd

! J(X), P 7!✓ Z

P

P0

!1 , . . . ,

ZP

P0

!g

◆mod ⇤,

where P0 is a fixed point of X, the !i

’s form a basis of the space of Abelian differentialsand ⇤ is a nondegenerate lattice arising from the Riemann bilinear relations.Our definition of the Abel-Jacobi map is simpler, more elegant and does not rely onanalytical tools such as path-integration, nevertheless it turns out to be equivalent toits classical counterpart.

3

Page 32: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 32

WINDOW 2

WINDOW 1

Page 33: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 33

WINDOW 2

WINDOW 1

Page 34: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 34

WINDOW 2

WINDOW 1

Page 35: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 35

WINDOW 2

WINDOW 1

Page 36: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 36

WINDOW 2

WINDOW 1

1.1

Div

isor

san

dth

eA

bel

-Jac

obim

ap

d. Nex

t,w

ede

fine

the

Pic

ard

grou

pof

Xas

the

shea

fcoh

omol

ogy

grou

p

PicX

:=H

1(X

,O⇤ X

),

whi

chis

wel

lkno

wn

tobe

isom

orph

icto

the

set

oflin

ebu

ndle

sup

tois

omor

phis

m,w

ithgr

oup

stru

ctur

ein

duce

dby

the

usua

lten

sor

prod

uct

ofco

here

ntsh

eave

s.To

any

divi

sorD

2Div

X

one

can

asso

ciat

eth

elin

ebu

ndle

OX

(D)

defin

edov

eran

yop

ense

tU

⇢X

byth

epr

escr

iptio

n

H0(U

,OX

(D))

=� f

2k(X

)⇤|(f) |U

+D

|U�

0

whi

chca

nbe

seen

asth

elin

ebu

ndle

who

segl

obal

sect

ions

are

loca

llyco

ntro

lled

byD

.In

othe

rw

ords

,the

sect

ions

ofOX

(D)

are

allo

wed

topr

esen

tpo

les

onan

ypo

int

ofth

esu

ppor

tofD

,with

orde

rbou

nded

byth

eco

effici

ento

fthe

poin

tapp

earin

gin

the

form

alsu

m.

Itis

anea

syco

nseq

uenc

eof

the

Rie

man

n-R

och

The

orem

that

ever

ylin

ebu

ndle

Lon

Xad

mits

agl

obal

sect

ions

and,

thus

,it

can

bew

ritte

nin

the

form

L=

OX

(D)

with

D=

(s)

bein

gth

edi

viso

rco

rres

pond

ing

tosu

cha

sect

ion.

Thu

sw

ede

fine

the

degr

ee

ofL

tobe

the

degr

eeof

the

divi

sorD

.W

ew

illde

note

byPic

d

X

the

set

oflin

ebu

ndle

sof

degr

eed.

Bas

edon

the

abov

eco

nstr

uctio

n,w

egi

vea

defin

ition

ofth

eso

calle

dA

bel

-Jac

obi

map

byth

eas

sign

men

t

u:Div

X

!PicX

D7!

OX

(D)

Rem

ark

1.2.

Rec

allt

hat

clas

sica

lly,i

nth

eth

eory

ofsm

ooth

curv

esov

erC

,the

Abe

l-Ja

cobi

map

isde

fined

for

aon

e-po

int

divi

sorP

(and

then

exte

nded

linea

rly)

bym

eans

ofth

eA

belia

nin

tegr

als

as

u:X

d

!J(X

),P

7!✓ Z

P

P

0

!1,...,

Z P P

0

!g

◆mod⇤,

whe

reP0

isa

fixed

poin

tof

X,t

he!i

’sfo

rma

basi

sof

the

spac

eof

Abe

lian

diffe

rent

ials

and⇤

isa

nond

egen

erat

ela

ttic

ear

isin

gfr

omth

eR

iem

ann

bilin

ear

rela

tions

.O

urde

finiti

onof

the

Abe

l-Jac

obi

map

issi

mpl

er,

mor

eel

egan

tan

ddo

esno

tre

lyon

anal

ytic

alto

ols

such

aspa

th-in

tegr

atio

n,ne

vert

hele

ssit

turn

sou

tto

beeq

uiva

lent

toits

clas

sica

lcou

nter

part

.

3

DISPLACEMENT

Page 37: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 37

WINDOW 2

WINDOW 1

1.1

Div

isor

san

dth

eA

bel

-Jac

obim

ap

d. Nex

t,w

ede

fine

the

Pic

ard

grou

pof

Xas

the

shea

fcoh

omol

ogy

grou

p

PicX

:=H

1(X

,O⇤ X

),

whi

chis

wel

lkno

wn

tobe

isom

orph

icto

the

set

oflin

ebu

ndle

sup

tois

omor

phis

m,w

ithgr

oup

stru

ctur

ein

duce

dby

the

usua

lten

sor

prod

uct

ofco

here

ntsh

eave

s.To

any

divi

sorD

2Div

X

one

can

asso

ciat

eth

elin

ebu

ndle

OX

(D)

defin

edov

eran

yop

ense

tU

⇢X

byth

epr

escr

iptio

n

H0(U

,OX

(D))

=� f

2k(X

)⇤|(f) |U

+D

|U�

0

whi

chca

nbe

seen

asth

elin

ebu

ndle

who

segl

obal

sect

ions

are

loca

llyco

ntro

lled

byD

.In

othe

rw

ords

,the

sect

ions

ofOX

(D)

are

allo

wed

topr

esen

tpo

les

onan

ypo

int

ofth

esu

ppor

tofD

,with

orde

rbou

nded

byth

eco

effici

ento

fthe

poin

tapp

earin

gin

the

form

alsu

m.

Itis

anea

syco

nseq

uenc

eof

the

Rie

man

n-R

och

The

orem

that

ever

ylin

ebu

ndle

Lon

Xad

mits

agl

obal

sect

ions

and,

thus

,it

can

bew

ritte

nin

the

form

L=

OX

(D)

with

D=

(s)

bein

gth

edi

viso

rco

rres

pond

ing

tosu

cha

sect

ion.

Thu

sw

ede

fine

the

degr

ee

ofL

tobe

the

degr

eeof

the

divi

sorD

.W

ew

illde

note

byPic

d

X

the

set

oflin

ebu

ndle

sof

degr

eed.

Bas

edon

the

abov

eco

nstr

uctio

n,w

egi

vea

defin

ition

ofth

eso

calle

dA

bel

-Jac

obi

map

byth

eas

sign

men

t

u:Div

X

!PicX

D7!

OX

(D)

Rem

ark

1.2.

Rec

allt

hat

clas

sica

lly,i

nth

eth

eory

ofsm

ooth

curv

esov

erC

,the

Abe

l-Ja

cobi

map

isde

fined

for

aon

e-po

int

divi

sorP

(and

then

exte

nded

linea

rly)

bym

eans

ofth

eA

belia

nin

tegr

als

as

u:X

d

!J(X

),P

7!✓ Z

P

P

0

!1,...,

Z P P

0

!g

◆mod⇤,

whe

reP0

isa

fixed

poin

tof

X,t

he!i

’sfo

rma

basi

sof

the

spac

eof

Abe

lian

diffe

rent

ials

and⇤

isa

nond

egen

erat

ela

ttic

ear

isin

gfr

omth

eR

iem

ann

bilin

ear

rela

tions

.O

urde

finiti

onof

the

Abe

l-Jac

obi

map

issi

mpl

er,

mor

eel

egan

tan

ddo

esno

tre

lyon

anal

ytic

alto

ols

such

aspa

th-in

tegr

atio

n,ne

vert

hele

ssit

turn

sou

tto

beeq

uiva

lent

toits

clas

sica

lcou

nter

part

.

3

DISPLACEMENT

Horiz. Displacement = INw - Horiz. Win. Size + 1

Page 38: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

METHODOLOGY 38

WINDOW 2

WINDOW 1

1.1

Div

isor

san

dth

eA

bel

-Jac

obim

ap

d. Nex

t,w

ede

fine

the

Pic

ard

grou

pof

Xas

the

shea

fcoh

omol

ogy

grou

p

PicX

:=H

1(X

,O⇤ X

),

whi

chis

wel

lkno

wn

tobe

isom

orph

icto

the

set

oflin

ebu

ndle

sup

tois

omor

phis

m,w

ithgr

oup

stru

ctur

ein

duce

dby

the

usua

lten

sor

prod

uct

ofco

here

ntsh

eave

s.To

any

divi

sorD

2Div

X

one

can

asso

ciat

eth

elin

ebu

ndle

OX

(D)

defin

edov

eran

yop

ense

tU

⇢X

byth

epr

escr

iptio

n

H0(U

,OX

(D))

=� f

2k(X

)⇤|(f) |U

+D

|U�

0

whi

chca

nbe

seen

asth

elin

ebu

ndle

who

segl

obal

sect

ions

are

loca

llyco

ntro

lled

byD

.In

othe

rw

ords

,the

sect

ions

ofOX

(D)

are

allo

wed

topr

esen

tpo

les

onan

ypo

int

ofth

esu

ppor

tofD

,with

orde

rbou

nded

byth

eco

effici

ento

fthe

poin

tapp

earin

gin

the

form

alsu

m.

Itis

anea

syco

nseq

uenc

eof

the

Rie

man

n-R

och

The

orem

that

ever

ylin

ebu

ndle

Lon

Xad

mits

agl

obal

sect

ions

and,

thus

,it

can

bew

ritte

nin

the

form

L=

OX

(D)

with

D=

(s)

bein

gth

edi

viso

rco

rres

pond

ing

tosu

cha

sect

ion.

Thu

sw

ede

fine

the

degr

ee

ofL

tobe

the

degr

eeof

the

divi

sorD

.W

ew

illde

note

byPic

d

X

the

set

oflin

ebu

ndle

sof

degr

eed.

Bas

edon

the

abov

eco

nstr

uctio

n,w

egi

vea

defin

ition

ofth

eso

calle

dA

bel

-Jac

obi

map

byth

eas

sign

men

t

u:Div

X

!PicX

D7!

OX

(D)

Rem

ark

1.2.

Rec

allt

hat

clas

sica

lly,i

nth

eth

eory

ofsm

ooth

curv

esov

erC

,the

Abe

l-Ja

cobi

map

isde

fined

for

aon

e-po

int

divi

sorP

(and

then

exte

nded

linea

rly)

bym

eans

ofth

eA

belia

nin

tegr

als

as

u:X

d

!J(X

),P

7!✓ Z

P

P

0

!1,...,

Z P P

0

!g

◆mod⇤,

whe

reP0

isa

fixed

poin

tof

X,t

he!i

’sfo

rma

basi

sof

the

spac

eof

Abe

lian

diffe

rent

ials

and⇤

isa

nond

egen

erat

ela

ttic

ear

isin

gfr

omth

eR

iem

ann

bilin

ear

rela

tions

.O

urde

finiti

onof

the

Abe

l-Jac

obi

map

issi

mpl

er,

mor

eel

egan

tan

ddo

esno

tre

lyon

anal

ytic

alto

ols

such

aspa

th-in

tegr

atio

n,ne

vert

hele

ssit

turn

sou

tto

beeq

uiva

lent

toits

clas

sica

lcou

nter

part

.

3

DISPLACEMENT

Horiz. Displacement = INw - Horiz. Win. Size + 1

Example:

Horiz. Displacement = 2

INw = Horizontal Win. Size + 1

Page 39: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

39

Page 40: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

40

PIPELINING DATA REUSE

Page 41: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

41

PIPELINING DATA REUSE

WINDOW 2

WINDOW 1

PIPELINING

Page 42: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

42

PIPELINING DATA REUSE

ROCCC

WINDOW 2

WINDOW 1

PIPELINING

Page 43: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

43

PIPELINING DATA REUSE

ROCCC

WINDOW 2

WINDOW 1

VIVADO Manual Rewrite

PIPELINING

Page 44: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

44

PIPELINING DATA REUSE

ROCCC

WINDOW 2

WINDOW 1

OUR

VIVADO

PIPELINING

Manual Rewrite

Page 45: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

45

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

Manual Rewrite

Page 46: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

46

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

UNROLLING

WINDOW 2WINDOW 1

Manual Rewrite

Page 47: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

47

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

UNROLLING

WINDOW 2WINDOW 1

Manual Rewrite

Page 48: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

48

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

UNROLLING

WINDOW 2WINDOW 1

Manual Rewrite

Page 49: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

49

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

UNROLLING

WINDOW 2WINDOW 1

Manual Rewrite

Page 50: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

50

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

Manual Rewrite

Page 51: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

51

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

BUFFER

INPUT WIDTH

Manual Rewrite

Page 52: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

52

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

BUFFER

INPUT WIDTH

Manual Rewrite

Page 53: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

53

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

BUFFER

INPUT WIDTH

Manual Rewrite

Page 54: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ ROCCC

▸ Vivado HLS

▸ Our Approach

EXPERIMENTAL SETUP

APPROACHES COMPARED

54

PIPELINING DATA REUSE

ROCCC

OUR

VIVADO

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

BUFFER

INPUT WIDTH

Manual Rewrite

Page 55: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

55

Page 56: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

56

3 CONFIGURATIONS

Page 57: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

57

INPUT WIDTH # DATAPATHS

3 CONFIGURATIONS

Page 58: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

58

INPUT WIDTH # DATAPATHS

3 CONFIGURATIONS

SAD

SOBEL

MAX FILTER

WINDOW SIZE

4X4

3X3

8X8

Page 59: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

59

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

WINDOW SIZE

4X4

3X3

8X8

INPUT WIDTH # DATAPATHS

Page 60: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

60

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

WINDOW SIZE

4X4

3X3

8X8

INPUT WIDTH # DATAPATHS

4, 1

3, 1

8, 1

Page 61: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

61

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

MULTIPLE DATAPATHS MDPWINDOW SIZE

4, 1

3, 1

8, 1

4X4

3X3

8X8

INPUT WIDTH # DATAPATHS

Page 62: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

62

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

MULTIPLE DATAPATHS MDPWINDOW SIZE

4, 1

3, 1

8, 1

4X4

3X3

8X8

8, 5

8, 6

16, 9

INPUT WIDTH # DATAPATHS

Page 63: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

63

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

MULTIPLE DATAPATHS MDP

EXTRA MULT. DATAPATHS XMDPWINDOW SIZE

4, 1

3, 1

8, 1

4X4

3X3

8X8

8, 5

8, 6

16, 9

INPUT WIDTH # DATAPATHS

Page 64: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

▸ Xilinx Virtex7 FPGA (HDL Simulation)

EXPERIMENTAL SETUP

OUR APPROACH

64

SINGLE DATAPATH SDP

SAD

SOBEL

MAX FILTER

MULTIPLE DATAPATHS MDP

EXTRA MULT. DATAPATHS XMDPWINDOW SIZE

4, 1

3, 1

8, 1

4X4

3X3

8X8

8, 5

8, 6

16, 9

INPUT WIDTH # DATAPATHS

16, 13

16, 14

24, 17

Page 65: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

65

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

Page 66: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

66

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

VIVADO_NOREW

PIPELINING DATA REUSE

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

Page 67: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

67

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

VIVADO_REW

PIPELINING DATA REUSE

UNROLLING DATA REUSE

PARAMETRIC I/O WIDTH

Manual Rewrite

Page 68: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

68

▸ Exec. Time to process a 100x100 frame

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

Page 69: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

69

▸ Exec. Time to process a 100x100 frame

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

Page 70: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

70

▸ Exec. Time to process a 100x100 frame

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

Page 71: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

EXECUTION TIME COMPARISON

71

▸ Exec. Time to process a 100x100 frame

500

5000

50000

500000

Sobel MaxFilter BlockSAD

ROCCCVivado_norevVivado_revCISC=Conf.1CISC=Conf.2CISC=Conf.3

Vivado_norewVivado_rew

Conf.1=========

Conf.3Conf.2

nS

SDP

MDP

XMDP

Page 72: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

72

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 73: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

73

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 74: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

74

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 75: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

75

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 76: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

76

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 77: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

77

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 78: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

AREA COMPARISON

78

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

45666

7,8)9$:;$.&8

7,8)9$:.&8

6<#6=6$;>?!

6<#6=6$;>?@

6<#6=6$;>?A

)B %B

!""

!"""

!""""

#$%&' ()*+,'-&. /'$01#23

6$;>?!=========

6$;>?A

6$;>?@

7,8)9$:;$.&C7,8)9$:.&C

SDP

MDP

XMDP

FLIP FLOPS LUTS

Page 79: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

EXPERIMENTS

SPEEDUP AND AREA

79

!"#$%&''()%''*+(,-"./01 !"#$%2''()%''*+(,-"./01

3&456

737&74

8900-:9

;/0,'"(0/<0,-' =>?@)A

;/0,'"(0/<0,-' =BB)A

SADSOBEL MAX FILTER

SADSOBEL MAX FILTER

MDP vs Vivado_Rew XMDP vs Vivado_Rew

Page 80: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

CONCLUSIONS

DATA REUSE ANALYSIS BASED ON POLLY

Optimize HLS of Accelerators for Sliding Window Applications

Use Polly for Better HW Designs

80

Page 81: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

CONCLUSIONS

DATA REUSE ANALYSIS BASED ON POLLY

Optimize HLS of Accelerators for Sliding Window Applications

Use Polly for Better HW Designs

81

▸ Exploit Data Locality

Page 82: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

CONCLUSIONS

DATA REUSE ANALYSIS BASED ON POLLY

Optimize HLS of Accelerators for Sliding Window Applications

Use Polly for Better HW Designs

82

▸ Exploit Data Locality

▸ Design Efficient Buffers

Page 83: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

CONCLUSIONS

DATA REUSE ANALYSIS BASED ON POLLY

Optimize HLS of Accelerators for Sliding Window Applications

Use Polly for Better HW Designs

83

▸ Exploit Data Locality

▸ Design Efficient Buffers

▸ Better Use of Resources

Page 84: DATA REUSE ANALYSIS FOR AUTOMATED SYNTHESIS OF …impact.gforge.inria.fr/impact2017/papers/impact-17... · RELATED WORK DATA TRANSFER IS THE BOTTLENECK Rely on source-to-source transformations

CONCLUSIONS

DATA REUSE ANALYSIS BASED ON POLLY

Optimize HLS of Accelerators for Sliding Window Applications

Use Polly for Better HW Designs

84

▸ Exploit Data Locality

▸ Design Efficient Buffers

▸ Better Use of Resources

▸ Better Performance