CMPUT429/CMPE382 Amaral 1/17/01 CMPUT429/CMPE382 Winter 2001 Topic9: Software Pipelining (Some slides from David A. Patterson’s CS252, Spring 2001 Lecture Slides)

Page 1: CMPUT429/CMPE382 Winter 2001

Topic 9: Software Pipelining

(Some slides from David A. Patterson’s CS252, Spring 2001 Lecture Slides)

Page 2: Another possibility: Software Pipelining

• Observation: if the iterations of a loop are independent, then we can get more ILP by scheduling instructions from different iterations together

• Software pipelining: reorganizes loops so that each iteration is made from instructions chosen from different iterations of the original loop

[Figure: Iteration 0 through Iteration 4 of the original loop, with one software-pipelined iteration marked across them]
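The reorganization described above can be sketched in Python. This is a minimal illustration, not code from the slides: the function names `add_scalar_plain` and `add_scalar_pipelined` and the three-step body (load, add, store) are assumptions chosen to mirror the example on the following slides.

```python
def add_scalar_plain(x, s):
    """Original loop: each iteration loads, adds and stores one element."""
    out = list(x)
    for i in range(len(out)):
        loaded = out[i]        # load
        summed = loaded + s    # add
        out[i] = summed        # store
    return out


def add_scalar_pipelined(x, s):
    """Software-pipelined form: one kernel pass works on three iterations."""
    out = list(x)
    n = len(out)
    if n < 2:
        return add_scalar_plain(out, s)  # too short to fill the pipeline
    # Prologue: start original iterations 0 and 1.
    loaded = out[0]
    summed = loaded + s
    loaded = out[1]
    # Kernel: store for iteration i, add for i + 1, load for i + 2.
    for i in range(n - 2):
        out[i] = summed
        summed = loaded + s
        loaded = out[i + 2]
    # Epilogue: drain the two iterations still in flight.
    out[n - 2] = summed
    summed = loaded + s
    out[n - 1] = summed
    return out
```

Both functions return the same list; the pipelined one only changes which original iteration each statement in the loop body belongs to.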

Page 3: Software Pipelining Example

Before: Unrolled 3 times
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    S.D    0(R1),F4
    L.D    F6,-8(R1)
    ADD.D  F8,F6,F2
    S.D    -8(R1),F8
    L.D    F10,-16(R1)
    ADD.D  F12,F10,F2
    S.D    -16(R1),F12
    DSUBUI R1,R1,#24
    BNEZ   R1,LOOP

After: Software Pipelined
    S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,LOOP

• Symbolic Loop Unrolling
  – Maximize result-use distance
  – Less code space than unrolling
  – Fill & drain pipe only once per loop vs. once per each unrolled iteration in loop unrolling

[Figure: overlapped ops vs. time, one plot for SW Pipeline and one for Loop Unrolled]

5 cycles per iteration

Page 4: Software Pipelining Example

Before: Unrolled 3 times
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    S.D    0(R1),F4
    L.D    F6,-8(R1)
    ADD.D  F8,F6,F2
    S.D    -8(R1),F8
    L.D    F10,-16(R1)
    ADD.D  F12,F10,F2
    S.D    -16(R1),F12
    DSUBUI R1,R1,#24
    BNEZ   R1,LOOP

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4
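The prologue, kernel and epilogue above can be checked against the original loop with a small register-level model. This is a sketch under stated assumptions, not the slide's exact code: memory is a Python dict keyed by byte address, the elements sit at addresses 8 to 8N, the kernel is assumed to exit when R1 reaches 16 (the listing does not show how the loop bound is adjusted), and the epilogue offsets are taken relative to R1 after the final DSUBUI. The names `run_original`, `run_pipelined` and `stop` are hypothetical.

```python
def run_original(mem, r1, s):
    """Original loop: one L.D / ADD.D / S.D per element, R1 counts down to 0."""
    mem = dict(mem)
    f2 = s
    while True:
        f0 = mem[r1]          # L.D    F0,0(R1)
        f4 = f0 + f2          # ADD.D  F4,F0,F2
        mem[r1] = f4          # S.D    0(R1),F4
        r1 -= 8               # DSUBUI R1,R1,#8
        if r1 == 0:           # BNEZ   R1,LOOP
            break
    return mem


def run_pipelined(mem, r1, s, stop=16):
    """Software-pipelined loop; needs at least three elements (assumption)."""
    mem = dict(mem)
    f2 = s
    # Prologue: fill the pipeline.
    f0 = mem[r1]              # L.D    F0,0(R1)
    f4 = f0 + f2              # ADD.D  F4,F0,F2
    f0 = mem[r1 - 8]          # L.D    F0,-8(R1)
    # Kernel: store M[i], add for M[i-1], load M[i-2].
    while True:
        mem[r1] = f4          # S.D    0(R1),F4
        f4 = f0 + f2          # ADD.D  F4,F0,F2
        f0 = mem[r1 - 16]     # L.D    F0,-16(R1)
        r1 -= 8               # DSUBUI R1,R1,#8
        if r1 == stop:        # assumed exit test, two elements early
            break
    # Epilogue: drain the pipeline (offsets relative to the final R1).
    mem[r1] = f4
    f4 = f0 + f2
    mem[r1 - 8] = f4
    return mem
```

For example, with five elements at addresses 8 to 40 and R1 = 40, both functions leave every element incremented by s, and the pipelined kernel runs three times.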

Page 5: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[1000], F2 = s, F4; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 6: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[1000], F2 = s, F4 = T1, with the adder (+) shown; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 7: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[999], F2 = s, F4 = T1; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 8: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[999], F2 = s, F4 = T1; memory cells T1, X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 9: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[999], F2 = s, F4 = T2, with the adder (+) shown; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 10: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[998], F2 = s, F4 = T2; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 11: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[998], F2 = s, F4 = T2; memory cells X[1000], X[999], X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 12: Software Pipelining Example

After: Software Pipelined
    L.D    F0,0(R1)
    ADD.D  F4,F0,F2
    L.D    F0,-8(R1)
------------------------------------
L:  S.D    0(R1),F4    ; Stores M[i]
    ADD.D  F4,F0,F2    ; Adds to M[i-1]
    L.D    F0,-16(R1)  ; Loads M[i-2]
    DSUBUI R1,R1,#8
    BNEZ   R1,L
------------------------------------
    S.D    -8(R1),F4
    ADD.D  F4,F0,F2
    S.D    -16(R1),F4

[Diagram: registers F0 = X[998], F2 = s, F4 = T2; memory cells X[1000], T2, X[998], X[997], ... at 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; pointer R1]

Page 13: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 0, p18 = 0; LC = 4; EC = 3; RRB = 0. General registers 32 to 39 (physical and logical): no values shown. Memory: x1 to x5.]

Page 14: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 0, p18 = 0; LC = 4; EC = 3; RRB = 0. Register values shown: x1. Memory: x1 to x5.]

Page 15: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 0, p18 = 0; LC = 4; EC = 3; RRB = 0. Register values shown: x1. Memory: x1 to x5.]

Page 16: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 0, p18 = 0; LC = 4; EC = 3; RRB = 0. Register values shown: x1. Memory: x1 to x5.]

Page 17: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 0, p18 = 0; LC = 4; EC = 3; RRB = -1; new predicate bit: 1. Register values shown: x1. Memory: x1 to x5.]
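The renaming that RRB controls in these diagrams can be sketched as a small mapping function. This is an illustration under assumptions: the rotating region is taken to be the eight registers 32 to 39 drawn on the slides (the real architecture lets software choose the size of this region), and the names `ROT_BASE`, `ROT_SIZE` and `physical_register` are hypothetical.

```python
ROT_BASE = 32   # first rotating register drawn on the slides
ROT_SIZE = 8    # the slides draw registers 32..39 (assumed region size)


def physical_register(logical, rrb):
    """Map a logical register number to a physical one for a given RRB."""
    if not ROT_BASE <= logical < ROT_BASE + ROT_SIZE:
        return logical  # outside the rotating region: no renaming
    return ROT_BASE + (logical - ROT_BASE + rrb) % ROT_SIZE
```

With RRB = -1, `physical_register(33, -1)` is 32, so the value that the load wrote through r32 before the rotation is read as r33 afterwards, which is what the add in the next stage expects.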

Page 18: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 0; LC = 3; EC = 3; RRB = -1. Register values shown: x1. Memory: x1 to x5.]
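The way the stage predicates p16, p17 and p18 switch on and off over the run of the loop can be sketched as a schedule. This is a simplified model, not something stated on the slides: it assumes one original iteration enters the pipeline per pass through the loop body, and the names `active_stages` and `predicate_schedule` are hypothetical.

```python
def active_stages(pass_index, n_iterations, n_stages):
    """Return one 0/1 flag per stage for a given pass through the kernel.

    Stage k works on original iteration pass_index - k, so it is active
    only while that iteration number exists.
    """
    return tuple(
        int(0 <= pass_index - stage < n_iterations)
        for stage in range(n_stages)
    )


def predicate_schedule(n_iterations, n_stages):
    """List the stage flags for every pass, including fill and drain."""
    total_passes = n_iterations + n_stages - 1
    return [active_stages(t, n_iterations, n_stages) for t in range(total_passes)]
```

For the five elements and three stages of this example, `predicate_schedule(5, 3)` has seven entries: two passes that fill the pipeline, three with all stages active, and two that drain it.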

Page 19: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 0; LC = 3; EC = 3; RRB = -1. Register values shown: x1, x2. Memory: x1 to x5.]

Page 20: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 0; LC = 3; EC = 3; RRB = -1. Register values shown: x1, x2, y1. Memory: x1 to x5.]

Page 21: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 0; LC = 3; EC = 3; RRB = -1. Register values shown: x1, x2, y1. Memory: x1 to x5.]

Page 22: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 0; LC = 3; EC = 3; RRB = -1. Register values shown: x1, x2, y1. Memory: x1 to x5.]

Page 23: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 2; EC = 3; RRB = -2; new predicate bit: 1. Register values shown: x1, x2, y1. Memory: x1 to x5.]

Page 24: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 2; EC = 3; RRB = -2. Register values shown: x1, x2, y1, x3. Memory: x1 to x5.]

Page 25: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 2; EC = 3; RRB = -2. Register values shown: y2, x2, y1, x3. Memory: x1 to x5.]

Page 26: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 2; EC = 3; RRB = -2. Register values shown: y2, x2, y1, x3. Memory: x1 to x5, y1.]

Page 27: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 2; EC = 3; RRB = -2. Register values shown: y2, x2, y1, x3. Memory: x1 to x5, y1.]

Page 28: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 1; EC = 3; RRB = -3; new predicate bit: 1. Register values shown: y2, x2, y1, x3. Memory: x1 to x5, y1.]

Page 29: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 1; EC = 3; RRB = -3. Register values shown: y2, x4, x2, y1, x3. Memory: x1 to x5, y1.]

Page 30: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 1; EC = 3; RRB = -3. Register values shown: y2, x4, y3, y1, x3. Memory: x1 to x5, y1.]

Page 31: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 1; EC = 3; RRB = -3. Register values shown: y2, x4, y3, y1, x3. Memory: x1 to x5, y1, y2.]

Page 32: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 1; EC = 3; RRB = -3. Register values shown: y2, x4, y3, y1, x3. Memory: x1 to x5, y1, y2.]

Page 33: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 0; EC = 3; RRB = -4; new predicate bit: 1. Register values shown: y2, x4, y3, y1, x3. Memory: x1 to x5, y1, y2.]

Page 34: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 0; EC = 3; RRB = -4. Register values shown: y2, x5, x4, y3, y1, x3. Memory: x1 to x5, y1, y2.]

Page 35: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 0; EC = 3; RRB = -4. Register values shown: y2, x5, x4, y3, y1, y4. Memory: x1 to x5, y1, y2.]

Page 36: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 0; EC = 3; RRB = -4. Register values shown: y2, x5, x4, y3, y1, y4. Memory: x1 to x5, y1 to y3.]

Page 37: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 1, p17 = 1, p18 = 1; LC = 0; EC = 3; RRB = -4. Register values shown: y2, x5, x4, y3, y1, y4. Memory: x1 to x5, y1 to y3.]

Page 38: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 1, p18 = 1; LC = 0; EC = 2; RRB = -5; new predicate bit: 0. Register values shown: y2, x5, x4, y3, y1, y4. Memory: x1 to x5, y1 to y3.]

Page 39: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 1, p18 = 1; LC = 0; EC = 2; RRB = -5. Register values shown: y2, x5, x4, y3, y1, y4. Memory: x1 to x5, y1 to y3.]

Page 40: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 1, p18 = 1; LC = 0; EC = 2; RRB = -5. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y3.]

Page 41: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 1, p18 = 1; LC = 0; EC = 2; RRB = -5. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y4.]

Page 42: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 1, p18 = 1; LC = 0; EC = 2; RRB = -5. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y4.]

Page 43: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6; new predicate bit: 0. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y4.]

Page 44: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y4.]

Page 45: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y4.]

Page 46: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y5.]

Page 47: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y5.]

Page 48: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 1; LC = 0; EC = 1; RRB = -6. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y5.]

Page 49: Software Pipelining Example in the IA-64

loop: (p16) ldl r32 = [r12], 1
      (p17) add r34 = 1, r33
      (p18) stl [r13] = r35, 1
            br.ctop loop

[Diagram: p16 = 0, p17 = 0, p18 = 0; LC = 0; EC = 0; RRB = -7; new predicate bit: 0. Register values shown: y2, x5, y5, y3, y1, y4. Memory: x1 to x5, y1 to y5.]
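The whole animation can be replayed with a small simulator. This is a simplified sketch under stated assumptions, not an IA-64 reference model: it uses one rotation counter for both register files, an eight-entry rotating general register region (as drawn), word-indexed memory, a non-empty input, and a `br.ctop` that decrements LC while it is positive and otherwise decrements EC, falling through when EC reaches 0. The names `run_ctop_loop`, `gr_index` and `pr_index` are hypothetical.

```python
ROT_GR = 8    # rotating general registers drawn on the slides (r32..r39)
ROT_PR = 48   # rotating predicate registers p16..p63


def gr_index(logical, rrb):
    """Physical slot of a logical general register in the rotating region."""
    return (logical - 32 + rrb) % ROT_GR


def pr_index(logical, rrb):
    """Physical slot of a logical predicate register in the rotating region."""
    return (logical - 16 + rrb) % ROT_PR


def run_ctop_loop(x, ec=3):
    """Replay the three-stage loop y[i] = x[i] + 1 over a non-empty list x."""
    lc = len(x) - 1              # LC starts at iterations - 1 (4 on the slides)
    gr = [None] * ROT_GR
    pr = [False] * ROT_PR
    rrb = 0
    pr[pr_index(16, rrb)] = True  # p16 = 1 starts the first stage
    y = []
    load_ptr = 0
    while True:
        if pr[pr_index(16, rrb)]:             # (p16) ldl r32 = [r12], 1
            gr[gr_index(32, rrb)] = x[load_ptr]
            load_ptr += 1
        if pr[pr_index(17, rrb)]:             # (p17) add r34 = 1, r33
            gr[gr_index(34, rrb)] = 1 + gr[gr_index(33, rrb)]
        if pr[pr_index(18, rrb)]:             # (p18) stl [r13] = r35, 1
            y.append(gr[gr_index(35, rrb)])
        # br.ctop: pick the bit that becomes p16 after rotation, then rotate.
        if lc > 0:
            lc -= 1
            new_bit, taken = True, True
        else:
            new_bit, taken = False, ec > 1
            ec -= 1
        pr[pr_index(63, rrb)] = new_bit
        rrb -= 1
        if not taken:
            break
    preds = tuple(int(pr[pr_index(p, rrb)]) for p in (16, 17, 18))
    return y, lc, ec, rrb, preds
```

Run on five inputs, the model ends in the state of the last slide: LC = 0, EC = 0, RRB = -7, all three stage predicates cleared, and five results stored.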