hls chap3 catapult - inriapeople.rennes.inria.fr/olivier.sentieys/teach/hls_chap3_catapult.pdf ·...


Page 1: HLS chap3 Catapult - Inriapeople.rennes.inria.fr/Olivier.Sentieys/teach/HLS_chap3_Catapult.pdf · • The style of your C code will have a great impact on design quality ! Add #pragma

1/31/12  


High-Level Synthesis using Catapult-C

System-on-Chip Design Methodologies

Olivier Sentieys IRISA/INRIA

ENSSAT - Université de Rennes 1

EII3/M2R - 2

Outline
§ Introduction
§ Design Flow and Tool Basics
§ Data Types
§ Writing C++ for Synthesis
§ Optimizing your Design
  • Loops
§ Interface and Memory Synthesis

EII3/M2R - 3

Catapult C Design Methodology
§ Compatible Environments
  • Matlab/Simulink
  • C++
  • SystemC

EII3/M2R - 4

Design Steps with Catapult
§ Algorithm design and H/S partitioning
§ Write C code for hardware
  • technology independent, fast simulation, compact, etc.
§ Analyze your C code inside Catapult
§ Constrain the micro-architecture
  • technology, resource constraints, I/O, frequency, etc.
§ Generate, analyze and validate hardware
  • Gantt chart
  • testbench generation, RTL generation


EII3/M2R - 5

Writing C code for Hardware: basics
§ Big rules:
  • No dynamic memory allocation
  • Pointer restrictions
  • Integer or fixed-point data types: bit accurate, no floats
  • The style of your C code will have a great impact on design quality
§ Add #pragma hls_design top
§ Use pragmas in C++ code

EII3/M2R - 6

C Design Example

    #define num_taps 8
    #pragma hls_design top
    void fir_filter (int *input,
                     int coeffs[num_taps],
                     int *output) {
      static int regs[num_taps];
      short temp = 0;
      for (int i = num_taps-1; i >= 0; i--) {
        if (i == 0)
          regs[0] = *input;
        else
          regs[i] = regs[i-1];
        temp += coeffs[i] * regs[i];
      }
      *output = temp;
    }

Virtual WAIT

Outputs are registered

Inputs are read with handshake
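
Because the design is plain C++, it can be checked on a host compiler before synthesis. The sketch below copies the slide's fir_filter (without the pragma, which a host compiler does not need) and adds a hypothetical helper, impulse_response, that is not part of the slides. It assumes the filter is called from a fresh state, since regs is static.

```cpp
#include <cassert>
#include <vector>

#define num_taps 8

// Design under test, as on the slide. The hls_design pragma is omitted here
// because it only matters to the synthesis tool.
void fir_filter(int *input, int coeffs[num_taps], int *output) {
  static int regs[num_taps];  // delay line, zero-initialized
  short temp = 0;
  for (int i = num_taps - 1; i >= 0; i--) {
    if (i == 0)
      regs[0] = *input;       // newest sample enters the delay line
    else
      regs[i] = regs[i - 1];  // shift older samples
    temp += coeffs[i] * regs[i];
  }
  *output = temp;
}

// Hypothetical helper (not in the slides): feed a unit impulse followed by
// zeros and collect num_taps outputs. Must be called on a fresh delay line.
std::vector<int> impulse_response(int coeffs[num_taps]) {
  std::vector<int> out;
  for (int n = 0; n < num_taps; n++) {
    int in = (n == 0) ? 1 : 0;
    int y = 0;
    fir_filter(&in, coeffs, &y);
    out.push_back(y);
  }
  return out;
}
```

For an FIR filter the impulse response should reproduce the coefficients in order, which gives a quick sanity check of the shift-register logic.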


EII3/M2R - 8

Synthesis Flow
§ Set working directory
§ Add input file(s)
§ Setup design
§ Architectural constraints
§ Resource constraints
§ Schedule
§ Generate RTL
§ Invoke simulation


EII3/M2R - 9

Catapult Window
§ Invoke catapult
§ Load scripts
§ “Crossprobing” between C code and
  • constraints
  • Gantt chart
  • generated HDL
  • reports
  • schematic
  • etc.

EII3/M2R - 10

Setting Up Design

EII3/M2R - 11

Setting Up Design
§ Technology: FPGA, ASIC
§ Synthesis tool
§ IP blocks: RAMs, pipelined multipliers
§ Design frequency constraints
  • Clock cycle frequency
§ Interface
  • One, and only one, clock ;-) I told you so
  • Synchronous or asynchronous reset
  • Enable (optional)
  • Start and Done flags

EII3/M2R - 12

Specifying Architectural Constraints


EII3/M2R - 13

Analysing your design

EII3/M2R - 14

Analysing your design

EII3/M2R - 15

Analysing your design
§ Gantt Chart View of Data Dependencies

EII3/M2R - 16

Schematic Viewer


EII3/M2R - 17

Schematic Viewer


EII3/M2R - 19

Data Types
§ Type constraints
  • signed or unsigned
  • bit-width for integers
  • bit-width for fixed-point types
    o 0111.11001
§ C++ extensions: bitvectors
  • package mc_bitvector.h
  • int5 my_variable;
  • uint5 my_unsigned_variable;
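
To make the fixed-point example concrete: 0111.11001 has 4 integer bits and 5 fractional bits, so its value is the 9-bit integer 011111001 (249) divided by 2^5, i.e. 7.78125. The sketch below is plain C++ and does not use mc_bitvector.h; parse_unsigned_fixed is a hypothetical helper that treats the bit string as unsigned.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: interpret a binary string such as "0111.11001"
// as an unsigned fixed-point number (bits after '.' are fractional).
double parse_unsigned_fixed(const std::string &bits) {
  long raw = 0;        // all bits read as one integer
  int frac_bits = 0;   // number of bits seen after the binary point
  bool seen_point = false;
  for (char ch : bits) {
    if (ch == '.') {
      seen_point = true;
      continue;
    }
    assert(ch == '0' || ch == '1');
    raw = raw * 2 + (ch - '0');
    if (seen_point) frac_bits++;
  }
  // Scale by 2^-frac_bits to place the binary point.
  return static_cast<double>(raw) / static_cast<double>(1L << frac_bits);
}
```

The same scaling view (integer code times a power of two) is what a bit-width constraint on a fixed-point type pins down.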

EII3/M2R - 20

SystemC Datatypes
§ #include "systemc.h"
§ sc_int/sc_bigint
  • sc_int<10> tenbitInt;
§ sc_uint/sc_biguint
§ sc_fixed/sc_ufixed
  • sc_fixed<20,10> a;
  • sc_fixed<20,10,SC_RND,SC_SAT_ZERO> c;


EII3/M2R - 21

AC Datatypes
§ #include <ac_fixed.h>
§ ac_int<int W, bool S>
  • ac_int<10> tenbitInt;
  • ac_int<10,true> tenbitIntSigned;
§ ac_fixed<int W, int I, bool S, ac_q_mode Q, ac_o_mode O>
  • ac_fixed<20,10,true> a;
  • ac_fixed<20,10,true,AC_RND,AC_SAT_ZERO> c; // S must be given before Q and O

EII3/M2R - 22

AC Datatypes

EII3/M2R - 23

AC Datatypes: Quantization Modes

EII3/M2R - 24

AC Datatypes: Overflow Modes
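
As a rough illustration of what quantization and overflow modes do, the sketch below emulates four behaviours in plain C++ on integer codes. It does not use the AC library; the function names are hypothetical, and the mapping to AC_TRN, AC_RND, AC_WRAP and AC_SAT is my reading of those mode names rather than something stated on the slides.

```cpp
#include <cassert>
#include <cmath>

// Quantization: map x onto a grid of step 2^-frac_bits, return the integer code.
long quantize_truncate(double x, int frac_bits) {  // toward minus infinity
  return static_cast<long>(std::floor(std::ldexp(x, frac_bits)));
}
long quantize_round(double x, int frac_bits) {     // nearest, ties toward plus infinity
  return static_cast<long>(std::floor(std::ldexp(x, frac_bits) + 0.5));
}

// Overflow: fit a code into a signed two's complement word of `width` bits.
long overflow_saturate(long code, int width) {
  const long hi = (1L << (width - 1)) - 1;
  const long lo = -(1L << (width - 1));
  return code > hi ? hi : (code < lo ? lo : code);
}
long overflow_wrap(long code, int width) {
  const long span = 1L << width;
  const long r = ((code % span) + span) % span;  // reduce to 0 .. span-1
  return r >= span / 2 ? r - span : r;           // reinterpret the top half as negative
}
```

For example, with 2 fractional bits the value 1.375 has the exact code 5.5, which truncates to 5 and rounds to 6; the code 200 in an 8-bit signed word saturates to 127 but wraps to -56.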


EII3/M2R - 26

C++ File Format
§ Files: .c, .cxx or .cpp
§ C++ parser
§ C++ preprocessor

    #ifndef MY_HEADER_FILE_NAME
    #define MY_HEADER_FILE_NAME
    code goes here ...
    #endif

    #pragma hls_design top
    void my_design (int *input, int array[8], int *output) {
      static int temp;
      short var = *input;
      …
      *output = temp;
    }

EII3/M2R - 27

Storage Types
§ Static variables may only be initialized with constants, in their declaration.
  • static int a = 5; // Correct
  • static int b = x; // Incorrect if x is not a constant declared in the local scope
§ Static variables are set to their initial value during reset
§ Storage types “const”, “extern” and “mutable” have no effect on synthesis

EII3/M2R - 28

Condition Statements
§ All branches of a conditional statement will be balanced to have the same length
§ Every branch of a case statement should have a break
§ Don't write code with conditional looping statements
§ Supported: if, switch

    switch (a) {
      case 1:
        c = a + b;
        break;
      case 12:
        c = a - b;
        break;
    }


EII3/M2R - 29

Loop Statements
§ The variable used to decide if a loop should exit should have a constant start value and increment
§ Conditional break from a loop is preferred
§ Each loop should have only one “exit”
§ Supported: do, for, while
§ Partial unroll of any loop
  • #pragma unroll yes // unroll a loop
  • #pragma unroll no // leave a loop rolled
  • #pragma unroll 5 // unroll the loop 5 times
  • Default is to leave loops rolled

EII3/M2R - 30

Branching and Functions
§ Branching
  • “continue” statement should be avoided
  • “goto” statement is not supported
  • Supported: break, continue, return
§ Functions
  • Should have only one “return” statement at the end
  • Recursive functions are not supported

    int my_addsub (int a, int b, bool c) {
      if ( c )
        return a + b;
      else
        return a - b;
    }

    if ( c )
      return a + b;
    return a - b;
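
One way to follow the single-return guideline is to hold the result in a local variable and return it once at the end. This is a sketch only; the name my_addsub_single_return is hypothetical and does not appear in the slides.

```cpp
#include <cassert>

// Same behaviour as my_addsub, with a single return at the end of the function.
int my_addsub_single_return(int a, int b, bool c) {
  int result;
  if (c)
    result = a + b;  // add when c is true
  else
    result = a - b;  // subtract otherwise
  return result;
}
```

Both branches assign result, so the variable is always defined when the function returns.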

EII3/M2R - 31

Expressions
§ Don't index arrays with signed expressions
§ Don't use the pre- and post-increment expressions (“++” and “--”) on a variable if that variable is used somewhere else in that expression
  • b = 5; a = ++b * --b;
  • a = 6 * 5; or a = 5 * 5;
§ Divide “/” and Modulo “%” should be avoided
§ Keep C integer shifts in their defined range: 0 to 31 for int, 0 to 63 for long

EII3/M2R - 32

Using Parentheses and Common Sub-Expressions
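
A small sketch of what this title usually refers to, under my own assumptions: parentheses decide whether additions form a serial chain or a balanced tree, and a shared sub-expression can be written once and reused. The function names are hypothetical, and whether a given tool rebalances or shares operators on its own is not something this example claims.

```cpp
#include <cassert>

// Chain: ((a + b) + c) + d, each addition waits for the previous one.
int sum_chain(int a, int b, int c, int d) { return a + b + c + d; }

// Balanced tree: (a + b) and (c + d) are independent of each other,
// so only two adder levels separate inputs from the result.
int sum_tree(int a, int b, int c, int d) { return (a + b) + (c + d); }

// Common sub-expression: the product a * b is computed once and used twice.
void scale_pair(int a, int b, int x, int y, int &out1, int &out2) {
  const int ab = a * b;  // shared product
  out1 = ab + x;
  out2 = ab - y;
}
```

Both sums return the same value for inputs that do not overflow; only the shape of the dependency graph differs.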


EII3/M2R - 33

Multiply and Divide by Constants
§ Constant shifts have virtually no cost in hardware
§ Constant multiplications, divisions and modulus are converted into shifts and adds or subtracts
  • y = a * 6;
  • converted into y = (a << 1) + (a << 2);
  • y = (int)a/2;
  • converted into y = ((int)a + ((int)a < 0 ? 1 : 0)) >> 1; // add 1 only when a is negative, so the result rounds toward zero
  • y = (unsigned int)a/2;
  • converted into y = (unsigned int) a >> 1;
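
The rewrites can be checked on a host compiler. In the sketch below the function names are hypothetical. For the signed case, a bias of 1 is added only when a is negative so that the shift rounds toward zero like C division. Two assumptions are labeled in the comments: mul6 is only exercised with non-negative values, and right-shifting a negative int is assumed to be an arithmetic shift (true on mainstream compilers, guaranteed from C++20).

```cpp
#include <cassert>

// a * 6 as 2a + 4a. Assumption: called with non-negative a only, since
// left-shifting a negative int is not defined before C++20.
int mul6(int a) { return (a << 1) + (a << 2); }

// Signed a / 2. Assumption: >> on a negative int is an arithmetic shift.
// The bias makes negative values round toward zero, as C division does.
int div2_signed(int a) { return (a + (a < 0 ? 1 : 0)) >> 1; }

// Unsigned a / 2 is a plain right shift.
unsigned div2_unsigned(unsigned a) { return a >> 1; }
```

Comparing each function against the plain operator over a range of inputs is enough to catch a wrong bias or shift amount.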

EII3/M2R - 34

Loops
§ FOR loop

    FOR_LOOP:for(int i=0;i<4;i++) {
      dout[i] = din[i];
    }

§ WHILE loop

    int i=0;
    WHILE_LOOP:while(i<4){
      dout[i] = din[i];
      i++;
    }

§ DO loop

    int i=0;
    DO_LOOP:do{
      dout[i] = din[i];
      i++;
    }while(i<4);

EII3/M2R - 35

Loops
§ Loop using one iterator

EII3/M2R - 36

Loops
§ Loop using multiple iterators


EII3/M2R - 37

Conditions
§ Conditional “if” Creates Dependency Chain

EII3/M2R - 38

Conditions
§ Conditional “else” Splits Dependency Chain


EII3/M2R - 40

Loop
§ Loop example

    int acc=0;
    ACCUM:for(int i=0;i<4;i++){
      acc += din[i];
    }


EII3/M2R - 41

Partial Loop Unrolling

    int acc=0;
    ACCUM:for(int i=0;i<4;i+=2){
      acc += din[i];
      acc += din[i+1];
    }

EII3/M2R - 42

Fully Unrolled Loop

    int acc=0;
    acc += din[0];
    acc += din[1];
    acc += din[2];
    acc += din[3];
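
The rolled, partially unrolled and fully unrolled forms compute the same sum; unrolling changes how much work is exposed per iteration, not the result. A minimal sketch wrapping the three snippets in hypothetical functions so they can be compared:

```cpp
#include <cassert>

// Rolled: four iterations, one addition each.
int accum_rolled(const int din[4]) {
  int acc = 0;
  ACCUM: for (int i = 0; i < 4; i++) {
    acc += din[i];
  }
  return acc;
}

// Partially unrolled by 2: two iterations, two additions each.
int accum_unroll2(const int din[4]) {
  int acc = 0;
  ACCUM: for (int i = 0; i < 4; i += 2) {
    acc += din[i];
    acc += din[i + 1];
  }
  return acc;
}

// Fully unrolled: no loop control left.
int accum_unrolled(const int din[4]) {
  int acc = 0;
  acc += din[0];
  acc += din[1];
  acc += din[2];
  acc += din[3];
  return acc;
}
```

Note that the partial version relies on the trip count (4) being a multiple of the unroll factor (2).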

EII3/M2R - 43

Loops with Conditional Bounds
  • Loop bound is an input

    #include "accum.h"
    #include <ac_int.h>
    void accumulate(int din[4], int &dout,
                    unsigned int ctrl){
      int acc=0;
      ACCUM:for(int i=0;i<ctrl;i++){
        acc += din[i];
      }
      dout = acc;
    }

EII3/M2R - 44

Optimizing the Loop Control

    #include "accum.h"
    #include <ac_int.h>
    void accumulate(int din[4], int &dout,
                    ac_int<3,false> ctrl){
      int acc=0;
      int i_old=0;
      ACCUM:for(int i=0;i<4;i++){
        acc += din[i];
        if(i_old==ctrl)
          break;
        i_old = i;
      }
      dout = acc;
    }
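
The general idea is to give the loop a constant maximum trip count and leave it early with a conditional break, so the loop counter can be sized from the constant bound. The sketch below is a variant of that idea in plain C++, not the slide's exact formulation (it does not track i_old and uses unsigned instead of ac_int). The function names are hypothetical, and ctrl is assumed to lie in 0..4.

```cpp
#include <cassert>

// Reference: the loop bound is the input ctrl (assumed 0..4).
int accumulate_variable_bound(const int din[4], unsigned ctrl) {
  assert(ctrl <= 4);
  int acc = 0;
  for (unsigned i = 0; i < ctrl; i++) acc += din[i];
  return acc;
}

// Constant bound of 4 iterations, with a conditional break once
// ctrl elements have been accumulated.
int accumulate_fixed_bound(const int din[4], unsigned ctrl) {
  assert(ctrl <= 4);
  int acc = 0;
  ACCUM: for (unsigned i = 0; i < 4; i++) {
    if (i >= ctrl) break;  // early exit, checked before accumulating
    acc += din[i];
  }
  return acc;
}
```

Checking the break before the accumulation keeps the two functions equivalent for every ctrl from 0 to 4, including ctrl = 0.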


EII3/M2R - 45

Nested Loops

    #include "accum.h"
    #include <ac_int.h>
    #define MAX 100000
    void accumulate(int din[2][4],
                    int &dout){
      int acc=0;
      ROW:for(int i=0;i<2;i++){
        if(acc>MAX)
          acc = MAX;
        COL:for(int j=0;j<4;j++){
          acc += din[i][j];
        }
        dout = acc;
      }
    }

EII3/M2R - 46

Nested Loop: Unrolling the Innermost Loop

    ROW:for(int i=0;i<2;i++){
      acc=0;
      acc += din[i][0];
      acc += din[i][1];
      acc += din[i][2];
      acc += din[i][3];
      dout[i] = acc;
    }
    }

EII3/M2R - 47

Nested Loop: Unrolling the Outer Loop

    int acc[2];
    acc[0] = 0;
    COL_0: for(int j=0;j<4;j++){
      acc[0] += din[0][j];
    }
    dout[0] = acc[0];
    acc[1] = 0;
    COL_1:for(int j=0;j<4;j++){
      acc[1] += din[1][j];
    }
    dout[1] = acc[1];

=> Loop Merging

    int acc[2];
    acc[0] = 0;
    acc[1] = 0;
    COL_0_1:for(int j=0;j<4;j++){
      acc[0] += din[0][j];
      acc[1] += din[1][j];
    }
    dout[0] = acc[0];
    dout[1] = acc[1];
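
Merging is legal here because the two column loops have the same bounds and touch disjoint accumulators. A minimal sketch that wraps both forms in hypothetical functions so their outputs can be compared:

```cpp
#include <cassert>

// Outer loop unrolled: two separate column loops, one per row.
void row_sums_separate(const int din[2][4], int dout[2]) {
  int acc[2];
  acc[0] = 0;
  COL_0: for (int j = 0; j < 4; j++) {
    acc[0] += din[0][j];
  }
  dout[0] = acc[0];
  acc[1] = 0;
  COL_1: for (int j = 0; j < 4; j++) {
    acc[1] += din[1][j];
  }
  dout[1] = acc[1];
}

// Merged: one column loop updates both independent accumulators.
void row_sums_merged(const int din[2][4], int dout[2]) {
  int acc[2];
  acc[0] = 0;
  acc[1] = 0;
  COL_0_1: for (int j = 0; j < 4; j++) {
    acc[0] += din[0][j];
    acc[1] += din[1][j];
  }
  dout[0] = acc[0];
  dout[1] = acc[1];
}
```

In the merged form the two additions inside one iteration do not depend on each other, so they can be scheduled side by side.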

EII3/M2R - 48

But loop parallelization is complex!

    for(i=1; i<=n-1; i++)
      for(j=1; j<=n-1; j++)
        a[i][j] = ( a[i-1][j] + a[i][j]
                    + a[i][j-1] ) / 3.0;

§ Is the loop parallel?
§ Apply the following transform
  • t = i + j, p = j, which gives i = t - p, j = p
  • Is it better?

Leslie Lamport, “The Parallel Execution of DO Loops”, Communications of the ACM 17(2), 1974
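
A sketch of the transformed loop nest, under my own assumptions about the data layout (a square std::vector grid of size n, hypothetical function names). Each element depends on its upper and left neighbours, which both lie on the previous anti-diagonal t - 1, so all elements with the same t = i + j are independent and the inner p loop could run in parallel. Here it is still executed sequentially, only in wavefront order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Original order: row by row, as on the slide.
void relax_sequential(Grid &a) {
  const int n = static_cast<int>(a.size());
  for (int i = 1; i <= n - 1; i++)
    for (int j = 1; j <= n - 1; j++)
      a[i][j] = (a[i - 1][j] + a[i][j] + a[i][j - 1]) / 3.0;
}

// Wavefront order: t = i + j is the outer (sequential) loop,
// p = j is the inner loop whose iterations are independent.
void relax_wavefront(Grid &a) {
  const int n = static_cast<int>(a.size());
  for (int t = 2; t <= 2 * (n - 1); t++) {
    const int p_lo = std::max(1, t - (n - 1));  // keeps i = t - p <= n - 1
    const int p_hi = std::min(n - 1, t - 1);    // keeps i = t - p >= 1
    for (int p = p_lo; p <= p_hi; p++) {
      const int i = t - p;
      const int j = p;
      a[i][j] = (a[i - 1][j] + a[i][j] + a[i][j - 1]) / 3.0;
    }
  }
}
```

Both orders evaluate the same expression on the same operand values for every element, so the results match exactly; the price of the transform is a more complicated loop bound and an inner trip count that varies with t.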