Review Session - George Mason University
![Page 1: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/1.jpg)
Review Session
CS 465 - Fall 2015, Prof. Daniel A. Menasce
Department of Computer Science
![Page 2: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/2.jpg)
Exercise 2.22
Write a minimal set of MIPS assembly instructions that does the identical operation as the C code below. Assume the base address of C is in $s1 and that A is in $s2. Use the minimum number of registers. Do not destroy the contents of $s1 or $s2.
    A = C[0] << 4;
![Page 3: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/3.jpg)
Solution to Exercise 2.22
    lw  $t1, 0($s1)    # load C[0] into $t1
    sll $t1, $t1, 4    # shift the contents of $t1 left by 4 bits
    sw  $t1, 0($s2)    # store C[0] << 4 into A (this uses $s2 as the address of A)
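The three instructions can be traced with a small Python sketch. The register and memory dictionaries, the addresses 0x1000 and 0x2000, and the sample value 3 are assumptions made up for illustration; they are not part of the exercise.

```python
def run_exercise_2_22(regs, mem):
    """Trace lw / sll / sw on a toy machine state (hypothetical model)."""
    regs = dict(regs)  # copy so the caller's state is left untouched
    mem = dict(mem)
    regs["$t1"] = mem[regs["$s1"] + 0]              # lw  $t1, 0($s1)
    regs["$t1"] = (regs["$t1"] << 4) & 0xFFFFFFFF   # sll $t1, $t1, 4 (32-bit result)
    mem[regs["$s2"] + 0] = regs["$t1"]              # sw  $t1, 0($s2)
    return regs, mem


# Assumed state: C starts at 0x1000, A lives at 0x2000, and C[0] = 3.
regs, mem = run_exercise_2_22(
    {"$s1": 0x1000, "$s2": 0x2000, "$t1": 0},
    {0x1000: 3, 0x2000: 0},
)
print(mem[0x2000])  # 3 << 4 = 48
```

Only $t1 is overwritten, so $s1 and $s2 keep their original contents, as the exercise requires.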
![Page 4: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/4.jpg)
Exercise 2.26.1
Consider the following MIPS code with the following initial values: $t1 = 10 and $s2 = 0.
    LOOP: slt  $t2, $0, $t1
          beq  $t2, $0, DONE
          subi $t1, $t1, 1
          addi $s2, $s2, 2
          j    LOOP
    DONE:
What is the final value of $s2?
![Page 5: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/5.jpg)
Solution to Exercise 2.26.1
    LOOP: slt  $t2, $0, $t1     # if $t1 > 0 then $t2 = 1 else $t2 = 0
          beq  $t2, $0, DONE    # if $t2 = 0 then go to DONE
          subi $t1, $t1, 1      # $t1 = $t1 - 1
          addi $s2, $s2, 2      # $s2 = $s2 + 2
          j    LOOP             # go to LOOP
    DONE:
Number of loop executions: $t1 at top = 10, $t1 at bottom = 9; ...; $t1 at top = 1, $t1 at bottom = 0 → 10 executions → $s2 = 2 × 10 = 20
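The iteration count can be double-checked by mirroring the loop one instruction at a time in Python. This is only a sketch: the function name and the returned iteration counter are additions for illustration.

```python
def run_loop(t1, s2):
    """Mirror the MIPS loop; returns the final $s2 and the number of iterations."""
    iterations = 0
    while True:
        t2 = 1 if 0 < t1 else 0   # slt  $t2, $0, $t1
        if t2 == 0:               # beq  $t2, $0, DONE
            break
        t1 -= 1                   # subi $t1, $t1, 1
        s2 += 2                   # addi $s2, $s2, 2
        iterations += 1           # j    LOOP
    return s2, iterations


final_s2, iterations = run_loop(10, 0)
print(final_s2, iterations)  # 20 10
```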
![Page 6: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/6.jpg)
Exercise 1
Describe what the following MIPS code does.
          addi $s2, $0, 0
          addi $t1, $0, 0
    LOOP: lw   $s1, 0($s0)
          add  $s2, $s2, $s1
          addi $s0, $s0, 4
          addi $t1, $t1, 1
          slti $t2, $t1, 100
          bne  $t2, $0, LOOP
    DONE:
![Page 7: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/7.jpg)
Solution to Exercise 1
          addi $s2, $0, 0       # $s2 = 0
          addi $t1, $0, 0       # $t1 = 0
    LOOP: lw   $s1, 0($s0)      # $s1 = Mem[$s0]
          add  $s2, $s2, $s1    # $s2 = $s2 + Mem[$s0]
          addi $s0, $s0, 4      # $s0 = $s0 + 4
          addi $t1, $t1, 1      # $t1 = $t1 + 1
          slti $t2, $t1, 100    # $t2 = 1 if $t1 < 100; $t2 = 0 otherwise
          bne  $t2, $0, LOOP    # branch to LOOP if $t2 ≠ 0 ($t1 < 100)
    DONE:
Code meaning: store in $s2 the sum of the 100 words stored starting at the address initially held in $s0.
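A quick way to convince yourself of this reading is to run the same steps over a toy memory. The sketch below assumes a word-addressed dictionary as memory, a made-up base address of 0x1000, and sample data 1..100; it also ignores 32-bit overflow.

```python
def sum_words(mem, s0, count=100):
    """Mirror the MIPS loop: returns the final $s2 and the final $s0."""
    s2 = 0                            # addi $s2, $0, 0
    t1 = 0                            # addi $t1, $0, 0
    while True:
        s1 = mem[s0]                  # lw   $s1, 0($s0)
        s2 += s1                      # add  $s2, $s2, $s1
        s0 += 4                       # addi $s0, $s0, 4
        t1 += 1                       # addi $t1, $t1, 1
        t2 = 1 if t1 < count else 0   # slti $t2, $t1, 100
        if t2 == 0:                   # bne  $t2, $0, LOOP falls through
            break
    return s2, s0


base = 0x1000                                        # assumed base address
mem = {base + 4 * i: i + 1 for i in range(100)}      # words 1, 2, ..., 100
total, end_address = sum_words(mem, base)
print(total)  # 5050
```

Note that the loop body runs before the test, so it always executes at least once, and $s0 ends up 400 bytes past its starting value.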
![Page 8: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/8.jpg)
Exercise 2
Consider a multiprocessor with p processors. Assume that 25% of the instructions of a program can be executed in parallel using all p processors. The remaining 75% of the instructions have to be executed sequentially. Assume that the time to execute the program sequentially (i.e., using only one processor) is Ts. Give an expression for S(p), the speedup obtained when using p processors. What is the maximum possible speedup (i.e., lim S(p) as p → ∞)?
![Page 9: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/9.jpg)
Solution to Exercise 2
The 75% sequential portion takes 0.75 Ts regardless of p, and the 25% parallel portion takes 0.25 Ts / p:
S(p) = Ts / (0.75 Ts + 0.25 Ts/p) = 1 / (0.75 + 0.25/p)
lim S(p) as p → ∞ = 1 / 0.75 = 100 / 75 = 4/3 ≈ 1.33
![Page 10: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/10.jpg)
Solution to Exercise 2
Amdahl's Law! The speedup is dominated by the sequential portion of the program.
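Amdahl's law in its general form is S(p) = 1 / ((1 - f) + f/p), where f is the fraction of the work that can run in parallel. The sketch below evaluates it numerically; the function names are hypothetical. With f = 0.25, as in the exercise statement, the bound is 1/0.75 ≈ 1.33, whereas f = 0.75 would give a bound of 4.

```python
def amdahl_speedup(parallel_fraction, p):
    """Speedup on p processors when parallel_fraction of the work scales with p."""
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / p)


def max_speedup(parallel_fraction):
    """Limit of the speedup as the number of processors grows without bound."""
    return 1.0 / (1.0 - parallel_fraction)


# Speedup for the exercise's 25% parallel fraction at a few processor counts.
for p in (1, 2, 4, 16, 1024):
    print(p, round(amdahl_speedup(0.25, p), 3))

print(max_speedup(0.25))  # upper bound for f = 0.25
print(max_speedup(0.75))  # upper bound for f = 0.75
```

However many processors are added, the speedup never exceeds 1 / (1 - f), which is why the sequential portion dominates.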
![Page 11: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/11.jpg)
Exercise 3
Describe in detail the operation of a lw $t1,off($t2) for a single-cycle datapath.
![Page 12: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/12.jpg)
Solution to Exercise 3
[Datapath figure, build step 1 of 6: the instruction fields $t2 (base register), $t1 (destination register), and off (16-bit offset) are labeled, along with control-signal values 0 and 1.]
![Page 13: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/13.jpg)
Solution to Exercise 3
[Datapath figure, build step 2 of 6: adds the value read from register $t2 and the 32-bit off, along with further control-signal values.]
![Page 14: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/14.jpg)
Solution to Exercise 3
[Datapath figure, build step 3 of 6: adds the ALU add operation producing $t2+off.]
![Page 15: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/15.jpg)
Solution to Exercise 3
[Datapath figure, build step 4 of 6: adds further control-signal values (1, 1, 0) after the ALU result $t2+off.]
![Page 16: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/16.jpg)
Solution to Exercise 3
[Datapath figure, build step 5 of 6: adds the data memory output Mem[$t2+off].]
![Page 17: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/17.jpg)
Solution to Exercise 3
[Datapath figure, build step 6 of 6: adds one more control-signal value of 1. Final annotations: $t2, $t1, off, 32-bit off, $t2+off (add), Mem[$t2+off].]
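The sequence of values annotated on the datapath (off, 32-bit off, $t2+off, Mem[$t2+off]) can be followed in a short Python sketch. This is a behavioral model, not the hardware: the function names, the dictionary-based register file and memory, and the sample values are assumptions, and the control-signal names in the comments follow the usual textbook single-cycle datapath rather than anything legible in the slide text.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_lw(regs, mem, rt, rs, off):
    """Model lw rt, off(rs) as the four data movements of the datapath."""
    base = regs[rs]                            # register file read of $t2
    offset32 = sign_extend_16(off)             # off -> 32-bit off (sign extension)
    address = (base + offset32) & 0xFFFFFFFF   # ALU add: $t2 + off (ALUSrc selects the offset)
    data = mem[address]                        # data memory read: Mem[$t2+off] (MemRead)
    new_regs = dict(regs)
    new_regs[rt] = data                        # write back to $t1 (MemtoReg, RegWrite)
    return new_regs


# Assumed state: $t2 = 0x1008 and the word at 0x1004 holds 99.
regs = execute_lw({"$t1": 0, "$t2": 0x1008}, {0x1004: 99}, "$t1", "$t2", -4)
print(regs["$t1"])  # 99
```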
![Page 18: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/18.jpg)
Exercise 4
Consider that the MIPS code below is executed in a 5-stage pipelined architecture as discussed in class. What is the minimum number of cycles needed to execute the code and why?
    lw   $t1,0($s0)
    addi $t1,$t1,10
    sw   $t1,0($s0)
![Page 19: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/19.jpg)
Solution to Exercise 4
    lw   $t1,0($s0)     IF ID EX MEM WB
    addi $t1,$t1,10     IF ID EX MEM WB
    sw   $t1,0($s0)     IF ID EX MEM WB
Eight cycles are needed. Without stalls, three instructions would take 5 + 2 = 7 cycles. The addi instruction needs $t1, which can be forwarded from the end of the MEM stage of the lw instruction. This requires the pipeline to stall for one cycle. The sw instruction can only be fetched after the addi instruction is fetched to avoid a structural hazard.
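The cycle count can be reproduced with a simplified model: a k-stage pipeline finishes n instructions in k + n - 1 cycles, plus one extra cycle for every load whose result is used by the very next instruction. The sketch below assumes full forwarding and no other hazards; the tuple encoding of instructions and the function name are made up for illustration.

```python
def count_cycles(instructions, stages=5):
    """Count cycles for a straight-line program on an idealized pipeline.

    Each instruction is (opcode, destination_register, source_registers).
    Assumes full forwarding, so the only stall is the one-cycle load-use stall.
    """
    stalls = 0
    for prev, curr in zip(instructions, instructions[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "lw" and prev_dest in curr_sources:
            stalls += 1  # value is only available after the load's MEM stage
    return stages + len(instructions) - 1 + stalls


program = [
    ("lw", "$t1", ["$s0"]),          # lw   $t1,0($s0)
    ("addi", "$t1", ["$t1"]),        # addi $t1,$t1,10
    ("sw", None, ["$t1", "$s0"]),    # sw   $t1,0($s0)
]
print(count_cycles(program))  # 8
```

The sw also depends on $t1, but its producer is an addi, so forwarding covers that dependence without a stall in this model.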
![Page 20: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/20.jpg)
Exercise 5
Which of the statements below regarding pipeline registers are correct?
(a) Pipeline registers are needed because each stage of the pipeline needs its own register file to execute instructions that need registers.
(b) A pipeline register at the interface of stages i and i+1 needs to store all the control unit values needed to execute an instruction executing at stage i+1 as well as the outputs of stage i that are needed as input to stage i+1.
(c) The contents of the pipeline registers are updated at each clock cycle.
(d) The pipeline registers store values to be forwarded to previous stages in the pipeline.
![Page 21: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/21.jpg)
Solutions to Exercise 5
(a) Pipeline registers are needed because each stage of the pipeline needs its own register file to execute instructions that need registers.
Incorrect: There is only one register file, which has all registers of the CPU. The pipeline registers hold, among other things, values for rs, rt, and rd.
![Page 22: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/22.jpg)
Solutions to Exercise 5
(b) A pipeline register at the interface of stages i and i+1 needs to store all the control unit values needed to execute an instruction executing at stage i+1 as well as the outputs of stage i that are needed as input to stage i+1.
Correct.
![Page 23: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/23.jpg)
Solutions to Exercise 5
(c) The contents of the pipeline registers are updated at each clock cycle.
Correct.
(d) The pipeline registers store values to be forwarded to previous stages in the pipeline.
Correct: A pipeline register at the interface of stages i and i+1 stores a value generated by an instruction at stage i. This value can be forwarded to an instruction at stage i-1. This instruction was issued after the instruction at stage i.
![Page 24: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/24.jpg)
Exercise 6
Consider the format of R-type instructions given below and the following sequence of instructions.
    I1  add ra,rb,rc
    I2  sub re,rf,rg

| 0 | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|
| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |

Provide an expression, as a function of ra, rb, rc, re, rf, rg and the pipeline registers, that can be used to determine if that sequence causes a data hazard.
![Page 25: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/25.jpg)
Solutions to Exercise 6
I1 writes ra (its rd field), and I2 reads rf (rs) and rg (rt), so a data hazard exists when rf or rg equals ra:
((EX/MEM.RegisterRd = ID/EX.RegisterRs = ra) or (EX/MEM.RegisterRd = ID/EX.RegisterRt = ra)) and EX/MEM.RegWrite and EX/MEM.RegisterRd ≠ 0
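The hazard condition translates almost directly into a boolean function. In the sketch below the pipeline-register fields are passed as plain arguments and registers are represented by their numbers; both function names and the sample register numbers are assumptions for illustration.

```python
def ex_hazard(ex_mem_reg_write, ex_mem_rd, id_ex_rs, id_ex_rt):
    """True when the instruction in EX reads a register that the instruction
    now in MEM is about to write (and that register is not $0)."""
    return bool(
        ex_mem_reg_write
        and ex_mem_rd != 0
        and (ex_mem_rd == id_ex_rs or ex_mem_rd == id_ex_rt)
    )


def sequence_has_hazard(ra, rf, rg):
    """I1: add ra,rb,rc writes ra; I2: sub re,rf,rg reads rf (rs) and rg (rt)."""
    return ex_hazard(True, ra, rf, rg)  # add always sets RegWrite


print(sequence_has_hazard(ra=8, rf=8, rg=9))    # True: rf is the register I1 writes
print(sequence_has_hazard(ra=8, rf=10, rg=11))  # False: no register in common
print(sequence_has_hazard(ra=0, rf=0, rg=9))    # False: writes to $0 are ignored
```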
![Page 26: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/26.jpg)
Exercise 7
Consider the following sequence of instructions. Assume that the register comparison in a branch instruction is done at the instruction decode stage of the pipeline. Can that sequence be executed without stalling?
    add $t1, $t2, $0
    add $t3, $t4, $t5
    beq $t1, $t3, LABEL
![Page 27: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/27.jpg)
Solution to Exercise 7
    add $t1, $t2, $0       IF ID EX MEM WB
    add $t3, $t4, $t5      IF ID EX MEM WB
    beq $t1, $t3, LABEL    IF ID EX MEM WB
No. The pipeline needs to stall for one cycle because beq needs $t1 and $t3 at the beginning of its ID stage. The value of $t3 is only available at the end of the EX stage of the second add, which is the same cycle in which beq would otherwise be in ID.
![Page 28: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/28.jpg)
Exercise 8
Consider a 2-bit branch prediction scheme and the following recent history of branch instructions.

| Branch instruction address | State |
|---|---|
| 1000 | Solid Branch Taken |
| 2500 | Temporary Branch Not Taken |
| 3657 | Solid Branch Not Taken |
| 4512 | Solid Branch Taken |

What is the prediction for a branch at address 1000 and how would the table change if the branch is mispredicted?
![Page 29: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/29.jpg)
Solution to Exercise 8
The prediction would be to take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.
![Page 30: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/30.jpg)
Exercise 9
Consider a 2-bit branch prediction scheme and the following recent history of branch instructions.

| Branch instruction address | State |
|---|---|
| 1000 | Solid Branch Taken |
| 2500 | Temporary Branch Not Taken |
| 3657 | Solid Branch Not Taken |
| 4512 | Solid Branch Taken |

What is the prediction for a branch at address 2500 and how would the table change if the branch is mispredicted?
![Page 31: Review&Session& - George Mason University](https://reader034.vdocuments.mx/reader034/viewer/2022051315/627a55647d3e511af825f9e7/html5/thumbnails/31.jpg)
Solution to Exercise 9
The prediction would be to not take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.
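Both answers (addresses 1000 and 2500) are consistent with a 2-bit saturating counter, where each outcome moves the state one step toward "taken" or "not taken". The sketch below assumes that transition rule, which matches the two solutions but is not spelled out in the slides; the function names are hypothetical.

```python
# States ordered from strongest "not taken" to strongest "taken".
STATES = [
    "Solid Branch Not Taken",
    "Temporary Branch Not Taken",
    "Temporary Branch Taken",
    "Solid Branch Taken",
]


def predict(state):
    """Return True if the predictor says the branch will be taken."""
    return STATES.index(state) >= 2


def update(state, taken):
    """Move one step toward the actual outcome, saturating at both ends."""
    i = STATES.index(state)
    i = min(i + 1, len(STATES) - 1) if taken else max(i - 1, 0)
    return STATES[i]


def after_misprediction(table, address):
    """New state for the branch at address if its prediction turns out wrong."""
    state = table[address]
    actual_outcome = not predict(state)  # a misprediction means the opposite happened
    return update(state, actual_outcome)


table = {
    1000: "Solid Branch Taken",
    2500: "Temporary Branch Not Taken",
    3657: "Solid Branch Not Taken",
    4512: "Solid Branch Taken",
}
print(predict(table[1000]), after_misprediction(table, 1000))
print(predict(table[2500]), after_misprediction(table, 2500))
```

Under this rule a "Solid" state survives one misprediction without flipping its prediction, while a "Temporary" state flips immediately.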