computer systems architecture ipcrowley/cse/560/l22-shakir.pdfcomputer systems architecture i cse...

22
Computer Systems Architecture I CSE 560M Lecture 22 Guest Lecturer: Shakir James

Upload: others

Post on 18-Mar-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Computer Systems

Architecture I

CSE 560M

Lecture 22

Guest Lecturer: Shakir James

Page 2: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

2

Reminders

• Project report due Dec 7th

– Bring hard copy to class

– Email soft copy to Patrick & Shakir

• Course evaluation due Dec 10th

• Final Exam due Dec 14th

– Email to Prof. Crowley by 4PM on Dec 14th

Page 3: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

3

Today’s Discussion

• Performance Monitoring*

– Types of Monitors

– Program Monitors• Instrumentation

• Valgrind/Cachegrind (an example)

* Jain, R., “The Art of Computer Systems Performance Analysis: Techniques for

Experimental Design, Measurement, Simulation, and Modeling.”

Page 4: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Monitoring

• First, key step in performance measurement

– observe the execution of system

• Used for

– improving software

– finding bottleneck

– modeling

4

“If you can not measure it, you can not improve it.” - Lord Kelvin

Page 5: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Types of Monitors

• Based on implementation

– Hardware

• probes, counters, timers (Oprofile & Vtune)

– Software

• Program monitors/profilers (Valgrind)

– Firmware

• Key tradeoff: resolution vs. overhead 5

Page 6: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Program Monitors - 1

• Observe the execution of a program

• Used for

– Tracing

– Timing

– Tuning

– Asserting checking

– Coverage analysis

6

Page 7: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Program Monitors - 2

• Basic steps

7

Instrumented program

Original program

Executionprofiles

Run instrumentedprogram

Add instrumentation

Page 8: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Program Instrumentation

• Modify source

– Static (Gprof)

– Dynamic (Dtrace)

• Rewrite binary

– Static (Etch)

– Dynamic (Valgrind)

8

Page 9: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Valgrind

• Dynamic binary instrumentation (DBI)

– Instruments precompiled binary

– Framework for building other tools

• Mainly used for debugging

– memory errors (Memcheck tool)

9

Page 10: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Valgrind Limitations

• Kernel behavior not counted

• Other processes not counted

• Thread scheduling variations

10

Page 11: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Valgrind Tools

• Memory error detectors (Memcheck)

• 2 thread error detectors (Helgrind & DRD)

• Call-graph profiler (Callgrind)

• Heap profiler (Massif)

• Cache and branch-prediction profiler

– Cachegrind

11

Page 12: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind

• Simulates program behavior

– cache hierarchy

– branch prediction (optional)

12

Page 13: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind: Simulated Cache

• Separate L1 $: instruction (I1) and data (D1)

• Unified, inclusive L2:

• Write allocate

• Cache sizes can be manually set

– default are automatically determined

13

Page 14: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind: Cache Config.

• --I1=<size>,<associativity>,<line size>

• --D1=<size>,<associativity>,<line size>

• --L2=<size>,<associativity>,<line size>

14

Page 15: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind: Branch Predictor

• Correlating 2-bit counter

– Array of 16384 2-bit saturating counters

– Array index computed from the branch address

and the taken/not-taken behavior of the last few

branches

• Target prediction- same as last time

15

Page 16: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind Limitations

• No TLB misses

• No virtual to physical address mappings

16

Page 17: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind Usage

• Compile program

– Preferably using --g to include debugging info

• Run valgrind– valgrind --tool=cachegrind --branch-sim=yes

– Outputs to file cachegrind.out.<pid>

• Annotated source files– cg_annotate --<pid> <sourcefile(s)>

17

Page 18: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind Example

• fast.cint main(void)

{

int h, i, j, a[1024][1024];

for (h=0;h<10;h++)

for(i=0; i<1024; i++)

for(j=0;j<1-24;j++)

a[ i ][ j ]=0

return 0;

}

• slow.cint main(void)

{

int h, i, j, a[1024][1024];

for (h=0;h<10;h++)

for(i=0; i<1024; i++)

for(j=0;j<1-24;j++)

a[ j ][ i ]=0

return 0;

}

Page 19: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Valgrind --tool=cachegrind fast

Page 20: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

Cachegrind Example (cont’d)

• cg_annotate --<pid> fast.c

Page 21: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite
Page 22: Computer Systems Architecture Ipcrowley/cse/560/L22-shakir.pdfComputer Systems Architecture I CSE 560M Lecture22 Guest Lecturer: Shakir James. 2 ... –Dynamic (Dtrace) • Rewrite

22

Assignment

• For Monday

– Turn in final report hard copy

– Pick up final exam