lecture 19 performance optimization · lecture 19 performance optimization xuan ‘silvia’ zhang...

26
Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese461/

Upload: others

Post on 18-Jan-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Lecture 19 Performance Optimization

Xuan ‘Silvia’ Zhang Washington University in St. Louis

http://classes.engineering.wustl.edu/ese461/

Page 2: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Project FAQ

•  Correction –  typo in optical flow: Iy(i, j) = I1(i, j+1) – I1(i, j-1) –  I1(i, j+1) might not exist

•  Mid-project report –  behavioral Verilog code and testbench –  show proof of working functional simulation –  ensure synthesizable codes

•  Use of external memory –  instantiate in the test bench –  used for large data array or buffers

2

Page 3: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Arrays, Vectors, and Memories

3

Page 4: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Useful Verilog Features

•  Display tasks –  $display, $displayb (h, o) in binary, hex, and octal –  $write, $strobe, $monitor

•  File I/O tasks –  $fopen, $fclose –  $fdisplay, $fwrite, $fstrobe, $fmonitor –  $readmemb, $readmemh: read a text file into memory

4

Page 5: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Module Partitioning

•  Where possible, register module outputs and keep critical path in one block

•  Design Registering –  pipelining –  restructure a long data path with several levels of logic

and break it up over multiple cycles

5

Page 6: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Pipelining

6

Page 7: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Pipelining

7

Page 8: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Adding Structure

•  Control the structure by using separate assignment and parentheses

•  Example –  32-bit arithmetic shift right –  design 1 –  design 2

8

Page 9: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Arithmetic Shift Right

•  Design 3

9

Page 10: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Arithmetic Shift Right

•  Optimal structured design

10

Page 11: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Arithmetic Shift Right

•  Without specifying the mux instantiations

11

Page 12: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Horizontal Partitioning

•  Break circuit into horizontal slices to minimize maximum fan-in

•  Example –  carry lookahead adder:

32-bit adder broken to eight 4-bit blocks –  32-bit priority encoder

12

Page 13: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Priority Encoder

•  Restructured with four 8-bit blocks

13

Page 14: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Priority-Encoded Logic vs Balanced Logic

•  If-Then-Else vs Case Statement –  redundant priority

14

Page 15: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Hierarchy

•  Collapse hierarchy (flattening) –  more efficient synthesis

•  Add Hierarchy –  benefit results from structure preservation –  example: 32-bit decoder

–  least-efficient implementation

15

Page 16: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Decoder

•  More concise representation

•  A balanced tree decoder is even better

16

Page 17: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

32-Bit Balanced-Tree Decoder

17

Page 18: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Performing Operations in Parallel

•  Example –  linear search

18

Page 19: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Performing Operations in Parallel

•  Example –  binary search

19

Page 20: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Performing Operations in Parallel

•  Example –  parallel search

20

Page 21: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

MUX for Conditional Assignment

•  Example: counter

21

Page 22: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

MUX for Conditional Assignment

•  Example: counter

22

Page 23: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Replication

•  Large fanout –  manual register duplication to reduce congestion

23

Page 24: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Resource Sharing

•  Optimize area but hurt speed –  with resource sharing

24

Page 25: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Resource Sharing

•  Optimize area but hurt speed –  without resource sharing

25

Page 26: Lecture 19 Performance Optimization · Lecture 19 Performance Optimization Xuan ‘Silvia’ Zhang Washington University in St. Louis

Questions?

Comments?

Discussion?

26