pipelining and parallel processing -...

33
1 Pipelining and Parallel Processing Shao-Yi Chien

Upload: others

Post on 30-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

1

Pipelining and

Parallel Processing

Shao-Yi Chien

Page 2: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 2

Introduction

Pipelining and parallel are the most important

design techniques in VLSI DSP systems

Make use of the inherent parallel property of

DSP algorithms

Pipelining: different function units working in parallel

Parallel: duplicated function units working in parallel

Page 3: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 3

An Example:

Car Production Line

mT

T T T

mT

mT

mT

m

n

Cost: 1

Throughput: 1/mT

Latency: mT

Throughput: how

many cars produced in

one hour

Latency: how long will

it take to produce one

carCost: >1

Throughput: 1/T

Latency: >mT

Cost: >n

Throughput: (1/mT)*n

Latency: >mT

Page 4: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 4

Pipelining of Digital Filters (1/3)

FIR filter

Critical path: TM+2TA

Page 5: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 5

Pipelining of Digital Filters (2/3)

Pipelined FIR filter

AMsample TTT AM

sampleTT

f

1

Page 6: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 6

Pipelining of Digital Filters (3/3)

Schedule

Page 7: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 7

Pipeline

Can reduce the critical path to increase

the working frequency and sample rate

TM+2TATM+TA

Page 8: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 8

Drawbacks of Pipelining

Increasing latency (in cycle)

For M-level pipelined system, the number of

delay elements in any path from input to

output is (M-1) greater than the origin one

Increase the number of latches (registers)

Page 9: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 9

How to Do Pipelining?

Put pipelining latches across any feed-forward cutset of the graph

Cutset A cutset is a set of edges of a graph such that if these

edges are removed from the graph, the graph becomes disjoint

Feed-forward cutset The data move in the forward direction on all the

edges of the cutset

Page 10: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 10

How to Do Pipelining?

Feed-forward cutset

Page 11: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 11

Example

In the SFG in Fig. (a), all the computation time for each node is 1 u.t.

(a) Calculate the critical path

(b) The critical path is reduced to 2 u.t. by inserting 3 extra delay elements. Is it a valid pipelining?

Page 12: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 12

Fine-Grain Pipelining

Critical path (TM=10, TA=2) TM+2TA=14

TM+TA=12

TM1=6 or TM2+TA=6

Page 13: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 13

Notes for Pipelining (1/2)

Pipelining is a very simple design technique which can maintain the input output data configuration and sampling frequency

Tclk=Tsample

Supported in many EDA tools

Still has some limitations Pipeline bubbles

Has some problems for recursive system

Introduces large hardware cost for 2-D or 3-D data

Communication bound

Page 14: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 14

Notes for Pipelining (2/2)

Effective pipelining

Put pipelining registers on the critical path

Balance pipelining

10(2+8): critical path=8

10(5+5): critical path=5

Page 15: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 15

Parallel of Digital Filters (1/5) Single-input single-output (SISO) system

Multiple-input multiple-output (MIMO) system

3-Parallel System!

Page 16: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 16

Parallel of Digital Filters (2/5)

Parallel processing, block processing

Block size (L): the number of data to be

processed at the same time

Block delay (L-slow)

A latch is equivalent to L clock cycles at the

sample rate

Page 17: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 17

Parallel of Digital Filters (3/5)

The critical path is the same

Tclk is not equal to Tsample

L-slow

Page 18: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 18

Parallel of Digital Filters (4/5)

Whole system

Page 19: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 19

Parallel of Digital Filters (5/5)

Serial-to-parallel converter Parallel-to-serial converter

Page 20: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 20

Parallel Processing v.s.

Pipelining Parallel processing is superior than pipelining

processing for the I/O bottleneck

(communication bounded)

Pipelining only can increase the clock rate

System clock rate = sample rate for pipelining system

Use parallel processing can further lower the required

working frequency

Page 21: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 21

Pipelining-Parallel Architecture

Page 22: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 22

Notes for Parallel Processing

The input/output data access scheme

should be carefully designed, it will cost a

lot sometimes

Tclk>Tsample, fclk<fsample

Large hardware cost

Combined with pipelining processing

Page 23: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 23

Low Power Issues

Pipelining and parallel processing are also beneficial for low power design

Propagation delay

Power consumption

Assume the sampling frequency is the same

Page 24: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 24

Pipelining for Low Power (1/2)

For M-level pipelining

Critical path is reduced to 1/M

The capacitance is also reduced to Ccharge/M

The supply voltage can be reduced to , and the

propagation delay remains unchanged

Page 25: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 25

Pipelining for Low Power (2/2)

Power consumption:

How about the parameter ?

Solve this equation to get

Page 26: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 26

Example

Parameters TM=10 u.t.

TA=2 u.t.

Tm1=6 u.t.

Tm2=4 u.t.

CM=5CA

Vt=0.6V

Normal Vcc=5V

(a) New supply voltage?

(b) Power saving percentage?

Page 27: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 27

Answer(a)

Origin system:

Pipelined system:

(b)

Invalid value, less

than threshold

voltage

Page 28: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 28

Parallel Processing for Low

Power (1/2) For L-parallel system

Clock period: TseqLTseq

Ccharge remains unchanged

CtotolLCtotal

Have more time to charge the capacitance, the supply

voltage can be lower

Page 29: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 29

Parallel Processing for Low

Power (2/2)

Power consumption

To derive the parameter

Page 30: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 30

Example

Parameters TM=8 u.t.

TA=1 u.t.

Tsample=9 u.t.

CM=8CA

Vt=0.45V

Normal Vcc=3.3V

(a) New supply voltage?

(b) Power saving percentage?

Page 31: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 31

Answer(a)

Origin:

2-parallel system:

(b)

Invalid value, less

than threshold

voltage

Page 32: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 32

Pipelining-Parallel for Low

Power

Page 33: Pipelining and Parallel Processing - 國立臺灣大學media.ee.ntu.edu.tw/courses/dspdesign/16F/slide/3... · Notes for Pipelining (1/2) Pipelining is a very simple design technique

DSP in VLSI Design Shao-Yi Chien 33

Conclusion

Pipelining

Reduce the critical path

Increase the clock speed

Reduce power consumption at the same speed

Parallel

Increase effective sampling rate

Reduce power consumption at the same speed