Computer Systems:A Programmer’s Perspective
Computer Systems:A Programmer’s Perspective
Randal E. Bryant, David R. O’HallaronRandal E. Bryant, David R. O’Hallaron
Computer Science and Electrical EngineeringComputer Science and Electrical Engineering
Carnegie Mellon UniversityCarnegie Mellon Universitys1 s2 s3 s4 s5 s6 s7 s8
s9
s10
s11
s12
s13
s14
s15
s16
16m 8m
4m
2m 1024
k
512k 25
6k 128k 64
k 32k 16
k 8k4k
2k 1k .5k0
200
400
600
800
1000
1200
MB/s
StrideSize
– 2 – ICS
OutlineOutline
Introduction to Computer SystemsIntroduction to Computer Systems Course taught at CMU since Fall, 1998 Some ideas on labs, motivations, …
Computer Systems: A Programmer’s PerspectiveComputer Systems: A Programmer’s Perspective Our textbook Ways to use the book in different courses
The Role of Systems Design in CS/Engineering The Role of Systems Design in CS/Engineering CurriculaCurricula
– 3 – ICS
BackgroundBackground1995-1997: REB/DROH teaching computer 1995-1997: REB/DROH teaching computer
architecture course at CMU.architecture course at CMU. Good material, dedicated teachers, but students hate it Don’t see how it will affect there lives as programmers
Course Evaluations
2
2.5
3
3.5
4
4.5
5
1995 1996 1997 1998 1999 2000 2001 2002
CS Average
REB: Computer Architecture
– 4 – ICS
Computer ArithmeticBuilder’s PerspectiveComputer ArithmeticBuilder’s Perspective
How to design high performance arithmetic circuits
32-bitMultiplier
– 5 – ICS
Computer ArithmeticProgrammer’s PerspectiveComputer ArithmeticProgrammer’s Perspective
Numbers are represented using a finite word size Operations can overflow when values too large
But behavior still has clear, mathematical properties
void show_squares(){ int x; for (x = 5; x <= 5000000; x*=10) printf("x = %d x^2 = %d\n", x, x*x);}
x = 5 x2 = 25x = 50 x2 = 2500x = 500 x2 = 250000x = 5000 x2 = 25000000x = 50000 x2 = -1794967296x = 500000 x2 = 891896832x = 5000000 x2 = -1004630016
– 6 – ICS
Memory SystemBuilder’s PerspectiveMemory SystemBuilder’s PerspectiveBuilder’s PerspectiveBuilder’s Perspective
Must make many difficult design decisions Complex tradeoffs and interactions between components
Mainmemory Disk
L1 i-cache
L1 d-cacheRegs L2 unifiedcacheCPU
Write through or
write back?
Direct mapped or
set indexed?
How many lines?
Virtual or physical
indexing?
Synchronousor
asynchronous?
– 7 – ICS
Memory SystemProgrammer’s PerspectiveMemory SystemProgrammer’s Perspective
Hierarchical memory organization Performance depends on access patterns
Including how step through multi-dimensional array
void copyji(int src[2048][2048], int dst[2048][2048]){ int i,j; for (j = 0; j < 2048; j++) for (i = 0; i < 2048; i++) dst[i][j] = src[i][j];}
void copyij(int src[2048][2048], int dst[2048][2048]){ int i,j; for (i = 0; i < 2048; i++) for (j = 0; j < 2048; j++) dst[i][j] = src[i][j];}
59,393,288 clock cycles 1,277,877,876 clock cycles
21.5 times slower!(Measured on 2GHz
Intel Pentium 4)
– 8 – ICS
The Memory MountainThe Memory Mountain
s1
s3
s5
s7
s9
s11
s13
s15
8m 2
m 51
2k
12
8k 32
k 8k 2
k
0
200
400
600
800
1000
1200
Re
ad
th
rou
gh
pu
t (M
B/s
)
Stride (words) Working set size (bytes)
Pentium III Xeon550 MHz16 KB on-chip L1 d-cache16 KB on-chip L1 i-cache512 KB off-chip unifiedL2 cache
L1
L2
Mem
xe
copyij
copyji
– 9 – ICS
Background (Cont.)Background (Cont.)1997: OS instructors complain about lack of 1997: OS instructors complain about lack of
preparationpreparation Students don’t know machine-level programming well
enoughWhat does it mean to store the processor state on the run-
time stack?
Our architecture course was not part of prerequisite stream
– 10 – ICS
Birth of ICSBirth of ICS1997: REB/DROH pursue new idea:1997: REB/DROH pursue new idea:
Introduce them to computer systems from a programmer's perspective rather than a system designer's perspective.
Topic Filter: What parts of a computer system affect the correctness, performance, and utility of my C programs?
1998: Replace architecture course with new course: 1998: Replace architecture course with new course: 15-213: Introduction to Computer Systems
Curriculum ChangesCurriculum Changes Sophomore level course Eliminated digital design & architecture as required
courses for CS majors
– 11 – ICS
15-213: Intro to Computer Systems15-213: Intro to Computer SystemsGoalsGoals
Teach students to be sophisticated application programmers Prepare students for upper-level systems courses
Taught every semester to 150 studentsTaught every semester to 150 students 50% CS, 40% ECE, 10% other.
Part of the 4-course CMU CS core:Part of the 4-course CMU CS core: Data structures and algorithms (Java) Programming Languages (ML) Systems (C/IA32/Linux) Intro. to theoretical CS
– 12 – ICS
ICS FeedbackICS FeedbackStudentsStudents
FacultyFaculty Prerequisite for most upper level CS systems courses Also required for ECE embedded systems, architecture, and
network courses
Course Evaluations
2
2.5
3
3.5
4
4.5
5
1995 1996 1997 1998 1999 2000 2001 2002
REB: Intro. Comp. Systems
CS Average
REB: Computer Architecture
– 13 – ICS
Lecture CoverageLecture Coverage
Data representations [3]Data representations [3] It’s all just bits. int’s are not integers and float’s are not reals.
IA32 machine language [5]IA32 machine language [5] Analyzing and understanding compiler-generated machine
code.
Program optimization [2]Program optimization [2] Understanding compilers and modern processors.
Memory Hierarchy [3]Memory Hierarchy [3] Caches matter!
Linking [1]Linking [1] With DLL’s, linking is cool again!
– 14 – ICS
Lecture Coverage (cont)Lecture Coverage (cont)
Exceptional Control Flow [2]Exceptional Control Flow [2] The system includes an operating system that you must interact with.
Measuring performance [1]Measuring performance [1] Accounting for time on a computer is tricky!
Virtual memory [4]Virtual memory [4] How it works, how to use it, and how to manage it.
I/O and network programming [4]I/O and network programming [4] Programs often need to talk to other programs.
Application level concurrency [2]Application level concurrency [2] Processes, I/O multiplexing, and threads.
Total: 27 lectures, 14 week semester.Total: 27 lectures, 14 week semester.
– 15 – ICS
LabsLabs
Key teaching insight: Key teaching insight: Cool Labs Great Course
A set of 1 and 2 week labs define the course.A set of 1 and 2 week labs define the course.
Guiding principles:Guiding principles: Be hands on, practical, and fun. Be interactive, with continuous feedback from automatic
graders Find ways to challenge the best while providing worthwhile
experience for the rest Use healthy competition to maintain high energy.
– 16 – ICS
Lab ExercisesLab ExercisesData Lab (2 weeks)Data Lab (2 weeks)
Manipulating bits.
Bomb Lab (2 weeks) Bomb Lab (2 weeks) Defusing a binary bomb.
Buffer Lab (1 week) Buffer Lab (1 week) Exploiting a buffer overflow bug.
Performance Lab (2 weeks)Performance Lab (2 weeks) Optimizing kernel functions.
Shell Lab (1 week) Shell Lab (1 week) Writing your own shell with job control.
Malloc Lab (2-3 weeks) Malloc Lab (2-3 weeks) Writing your own malloc package.
Proxy Lab (2 weeks) Proxy Lab (2 weeks) Writing your own concurrent Web proxy.
– 17 – ICS
Bomb LabBomb Lab Idea due to Chris Colohan, TA during inaugural offering
Bomb:Bomb: C program with six C program with six phasesphases..
Each phase expects student to type a specific string.Each phase expects student to type a specific string. Wrong string: bomb explodes by printing BOOM! (- 1/4 pt) Correct string: phase defused (+10 pts) In either case, bomb sends mail to a spool file Bomb daemon posts current scores anonymously and in real time on Web
page
Goal: Defuse the bomb by defusing all six phases.Goal: Defuse the bomb by defusing all six phases. For fun, we include an unadvertised seventh secret phase
The kicker:The kicker: Students get only the binary executable of a unique bomb To defuse their bomb, students must disassemble and reverse engineer
this binary
– 18 – ICS
Properties of Bomb PhasesProperties of Bomb Phases
Phases test understanding of different C constructs Phases test understanding of different C constructs and how they are compiled to machine codeand how they are compiled to machine code Phase 1: string comparison Phase 2: loop Phase 3: switch statement/jump table Phase 4: recursive call Phase 5: pointers Phase 6: linked list/pointers/structs Secret phase: binary search (biggest challenge is figuring
out how to reach phase)
Phases start out easy and get progressively harder Phases start out easy and get progressively harder
– 19 – ICS
Let’s defuse a bomb phase!Let’s defuse a bomb phase!08048b48 <phase_2>: ... # function prologue not shown 8048b50: mov 0x8(%ebp),%edx 8048b53: add $0xfffffff8,%esp 8048b56: lea 0xffffffe8(%ebp),%eax 8048b59: push %eax 8048b5a: push %edx 8048b5b: call 8048f48 <read_six_nums> 8048b60: mov $0x1,%ebx 8048b68: lea 0xffffffe8(%ebp),%esi
8048b70: mov 0xfffffffc(%esi,%ebx,4),%eax 8048b74: add $0x5,%eax 8048b77: cmp %eax,(%esi,%ebx,4) 8048b7a: je 8048b81 <phase_2+0x39> 8048b7c: call 804946c <explode_bomb> 8048b81: inc %ebx 8048b82: cmp $0x5,%ebx 8048b85: jle 8048b70 <phase_2+0x28> ... # function epilogue not shown 8048b8f: ret
# else explode!
# LOOP: eax = num[i-1]
# then goto LOOP:
# edx = &str
# eax = &num[] on stack# push function args
# rd 6 ints from str 2 num
# i = 1# esi = &num[] on stack
# eax = num[i-1] + 5# if num[i-1] + 5 == num[i]# then goto OK:
# OK: i++# if (i <= 5)
# YIPPEE!
– 20 – ICS
Source Code for Bomb PhaseSource Code for Bomb Phase/* * phase2b.c - To defeat this stage the user must enter arithmetic * sequence of length 6 and delta = 5. */void phase_2(char *input){ int ii; int numbers[6];
read_six_numbers(input, numbers);
for (ii = 1; ii < 6; ii++) { if (numbers[ii] != numbers[ii-1] + 5) explode_bomb(); }}
– 21 – ICS
The Beauty of the BombThe Beauty of the Bomb
For the StudentFor the Student Get a deep understanding of machine code in the context of
a fun game Learn about machine code in the context they will encounter
in their professional livesWorking with compiler-generated code
Learn concepts and tools of debuggingForward vs backward debuggingStudents must learn to use a debugger to defuse a bomb
For the InstructorFor the Instructor Self-grading Scales to different ability levels Easy to generate variants and to port to other machines
– 22 – ICS
ICS SummaryICS Summary
ProposalProposal Introduce students to computer systems from the
programmer's perspective rather than the system builder's perspective
ThemesThemes What parts of the system affect the correctness, efficiency,
and utility of my C programs? Makes systems fun and relevant for students Prepare students for builder-oriented courses
Architecture, compilers, operating systems, networks, distributed systems, databases, …
Since our course provides complementary view of systems, does not just seem like a watered-down version of a more advanced course
Gives them better appreciation for what to build
– 23 – ICS
Shameless PlugShameless Plug
http://csapp.cs.cmu.edu Published August, 2002
– 24 – ICS
CS:APPCS:APPVital stats:Vital stats:
13 chapters 154 practice problems (solutions in book), 132 homework
problems (solutions in IM) 410 figures, 249 line drawings 368 C code example, 88 machine code examples
Turn-key course provided with book:Turn-key course provided with book: Electronic versions of all code examples. Powerpoint, EPS, and PDF versions of each line drawing Password-protected Instructors Page, with Instructor’s
Manual, Lab Infrastructure, Powerpoint lecture notes, and Exam problems.
– 25 – ICS
AdoptionsAdoptions
Research universities: Prepare students for advanced courses
Small colleges: Only systems course
AdoptionsMay, 2006
– 26 – ICS
TranslationsTranslations
– 27 – ICS
CoverageCoverage
Material Used by ICS at CMUMaterial Used by ICS at CMU Pulls together material previously covered by multiple
textbooks, system programming references, and man pages
Greater Depth on Some TopicsGreater Depth on Some Topics IA32 floating point Dynamic linking Thread programming
Additional TopicAdditional Topic Computer Architecture Added to cover all topics in “Computer Organization” course
– 28 – ICS
The Evolving CS & Engineering CurriculumThe Evolving CS & Engineering CurriculumProgramming Lies at the Heart of Most Modern SystemsProgramming Lies at the Heart of Most Modern Systems
Computer systems Embedded devices: Cell phones, automobile controls, … Electronics: DSPs, programmable controllers
Programmers Have to Understand Their Machines and Programmers Have to Understand Their Machines and Their LimitationsTheir Limitations Correctness: computer arithmetic, storage allocation Efficiency: memory & CPU performance
Knowing How to Build Systems Is Not the Way to Learn Knowing How to Build Systems Is Not the Way to Learn How to Program ThemHow to Program Them It’s wasteful to teach every computer scientist how to design a
microprocessor Knowledge of how to build does not transfer to knowledge of
how to use