CS 6534: Tech Trends / Intro
Charles Reiss
24 August 2016
Moore’s Law
[Figure: Microprocessor Transistor Counts 1971-2011 & Moore's Law. Transistor count (log scale, 2,300 to 2,600,000,000) versus date of introduction (1971 to 2011); labeled processors run from the 4004 through multi-core Xeon, Itanium, POWER7, and SPARC T3 parts. Source: Wikimedia Commons / Wgsimon]
Good Ol’ Days: Frequency Scaling
Figure 1.11 Growth in clock rate of microprocessors in Figure 1.1. Between 1978 and 1986, the clock rate improved less than 15% per year while performance improved by 25% per year. During the “renaissance period” of 52% performance improvement per year between 1986 and 2003, clock rates shot up almost 40% per year. Since then, the clock rate has been nearly flat, growing at less than 1% per year, while single processor performance improved at less than 22% per year.
Source: H&P. Copyright © 2011, Elsevier Inc. All rights reserved.
The Power Wall
Power ∼ Switching Power + Leakage Power
Switching Power ∼ Capacitance × Voltage² × Frequency
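A rough numeric sketch of the switching-power proportionality above. The function name, the unitless relative inputs, and the two scaling scenarios are assumptions chosen for illustration; the slide gives only the proportionality, not these numbers.

```python
def relative_switching_power(capacitance, voltage, frequency):
    """Switching power up to a constant factor: C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


# All inputs are relative to an arbitrary baseline design (hypothetical values).
baseline = relative_switching_power(1.0, 1.0, 1.0)
# Hypothetical scenario: 20% higher frequency that also needs 10% more voltage.
faster = relative_switching_power(1.0, 1.1, 1.2)
# Hypothetical scenario: half the frequency at 20% lower voltage.
slower = relative_switching_power(1.0, 0.8, 0.5)

print(f"faster / baseline = {faster / baseline:.3f}")
print(f"slower / baseline = {slower / baseline:.3f}")
```

Because voltage enters squared, a frequency increase that also requires a voltage increase raises switching power faster than it raises clock rate.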
Increasing Parallelism: Cores
[Figure: Intel x86 number of cores per package (0 to 25) versus date, 2005 to 2016.]
Increasing Parallelism: Vector width
[Figure: vector register size in bits (0 to 500) versus date, 1995 to 2016, for x86 and ARM.]
Increasing Parallelism: ILP
[Figure: x86 Intel 32-bit adds per cycle (0 to 4) versus date, 1978 to 2016.]
Limits: Parallelism
[Figure: Amdahl’s Law. Speedup (1 = serial) versus degree of parallelism (1 = serial, up to 60), with curves for 0%, 5%, 10%, 25%, and 50% serial.]
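The curves in the Amdahl's Law plot can be reproduced with a few lines. This is a minimal sketch; the function name and the sampled degrees of parallelism (10, 20, 60) are assumptions for illustration, while the serial fractions match the labels on the plot.

```python
def amdahl_speedup(serial_fraction, parallelism):
    """Speedup over serial execution when only the non-serial part scales."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallelism)


for serial in (0.0, 0.05, 0.10, 0.25, 0.50):
    row = ", ".join(f"{amdahl_speedup(serial, n):.2f}" for n in (10, 20, 60))
    print(f"{serial:.0%} serial -> speedup at 10, 20, 60 units: {row}")
```

As the degree of parallelism grows, the speedup approaches 1 / serial_fraction, so a workload that is 25% serial can never run more than 4 times faster.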
Limits: Communication
[Balfour et al, “Operand Registers and Explicit Operand Forwarding”, 2009.]
[Malladi et al, “Towards Energy-Proportional Datacenter Memory with Mobile DRAM”, 2012.]
DDR3 DRAM (32-bit read/write)
full utilization: 2 300 000 fJ (4300×)
low utilization: 7 700 000 fJ (15000×)
Increasing Efficiency: Specialization
Task/workload-specific coprocessors or instructions
Maybe reconfigurable?
Heterogeneous systems
different parts for different types of computation
Interlude: Logistics
Paper reviews — approx 2/class
Homeworks — programming assignments
Exam — end of semester
Textbook?
Primarily paper readings
Classic + some newish papers
Reference: Hennessy and Patterson, Computer Architecture: A Quantitative Approach
Paper Reviews
What was your most significant insight from the paper?
What evidence does the paper have to support this insight?
What is the weakest part of the paper or how could it be improved?
What topic from the paper would you like to see discussed in class (if any)?
Might not be what the authors put in their abstract/conclusion
Paper Discussions
and not paper lectures.
Requires your cooperation.
Homeworks
Individual programming + writing assignments
First: on memory hierarchy, available now.
Second: to be announced, likely on superscalar
Third: to be announced, likely GPU programming
Homework 1
Description on course website (linked off Collab)
Memory system parameters by benchmarking
Example: with a 32K cache, repeatedly accessing 32K of data is faster than repeatedly accessing 128K.
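The 32K versus 128K example can be illustrated with a toy cache model. This is a simulation sketch, not the homework's measurement technique: the direct-mapped organization, the 64-byte line size, the sequential sweep, and the function name are all assumptions, and real caches are more complicated.

```python
def hit_rate(working_set_bytes, cache_bytes=32 * 1024, line_bytes=64, passes=10):
    """Hit rate of a toy direct-mapped cache for repeated sequential sweeps."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines              # block currently held by each line
    hits = accesses = 0
    for _ in range(passes):
        for addr in range(0, working_set_bytes, line_bytes):
            block = addr // line_bytes
            index = block % num_lines      # cache line this block maps to
            if tags[index] == block:
                hits += 1
            else:
                tags[index] = block        # miss: evict whatever was there
            accesses += 1
    return hits / accesses


for size_kib in (16, 32, 64, 128):
    print(f"{size_kib:4d} KiB working set: hit rate {hit_rate(size_kib * 1024):.2f}")
```

In this model a working set that fits in the cache misses only on the first pass, while a 128 KiB working set evicts each line before it is reused. A timing benchmark looks for the matching jump in access time as the working set grows past the cache size.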
Homework 1: Disclaimer
This is probably hard
Modern memory hierarchies are complicated
Documentation is incomplete
Mainly looking for: measurement technique that ‘should’ work
If it doesn’t, try to come up with good reasons why
Exam
There will be a final, probably in-class.
Covers material from papers, homeworks, and discussions in class
Exceptions / etc.
Need accommodations? Please ask.
Disability accommodations: Student Disability Access Center
Asking Questions
Piazza (linked off Collab)
Office Hours:
Instructor: Lecturer Charles Reiss; TA Luowan Wang
Location: Soda 205; TBA
Times: Monday 1PM–3PM; Tuesday 1PM–2PM
Friday 10AM–noon
Email: [email protected]
Survey
linked off Collab
anonymous
please do it
Preview of coming topics
Memory hierarchy
caching — review(?) and advanced techniques
homework 1
Pipelining
different parts of multiple instructions at the same time
more advanced topics: handling exceptions
Image: Wikimedia Commons / Poil
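The benefit of overlapping instructions can be sketched with an idealized cycle count. The assumptions here are that every stage takes one cycle and that there are no stalls or hazards; the function names and the 5-stage, 100-instruction example are hypothetical.

```python
def unpipelined_cycles(num_instructions, num_stages):
    """Each instruction runs all of its stages before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions, num_stages):
    """Ideal pipeline: a new instruction enters every cycle, with no stalls."""
    if num_instructions == 0:
        return 0
    # The first instruction takes num_stages cycles to finish;
    # each later instruction finishes one cycle after the previous one.
    return num_stages + num_instructions - 1


n, k = 100, 5  # hypothetical workload and pipeline depth
print(f"unpipelined: {unpipelined_cycles(n, k)} cycles")
print(f"pipelined:   {pipelined_cycles(n, k)} cycles")
```

For long instruction streams the ideal speedup approaches the number of stages; stalls and exceptions are what keep real pipelines below that bound.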
Increasing Parallelism: ILP
[Figure: x86 Intel 32-bit adds per cycle (0 to 4) versus date, 1978 to 2016.]
Beyond pipelining: Multiple issue
starting multiple instructions at the same time
allows cycles per instruction < 1
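A small sketch of why starting several instructions per cycle pushes cycles per instruction below 1. It assumes fully independent instructions, no stalls, and ignores pipeline fill time; the function name and the issue widths are hypothetical.

```python
import math


def ideal_cpi(num_instructions, issue_width):
    """Best-case cycles per instruction when issue_width instructions start per cycle."""
    cycles = math.ceil(num_instructions / issue_width)
    return cycles / num_instructions


for width in (1, 2, 4):
    print(f"issue width {width}: ideal CPI = {ideal_cpi(100, width):.2f}")
```

Real programs have dependencies between instructions, so the achieved CPI is usually higher than this best case.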
Beyond pipelining: Out-of-order
run next instruction despite stall of prior one (slow cache, read-after-write hazard, ...)
speculation: guess outcome of branch/load/etc.; fix later if wrong
Increasing Parallelism: Cores
[Figure: Intel x86 number of cores per package (0 to 25) versus date, 2005 to 2016.]
Multiprocessor/multicore
connecting processors together
shared memory — multiple threads
synchronization
Increasing Parallelism: Vector width
[Figure: vector register size in bits (0 to 500) versus date, 1995 to 2016, for x86 and ARM.]
Vector/SIMD/GPUs
single instruction/multiple data
started with early supercomputers
basis of GPU programming model
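A plain-Python emulation of the single instruction/multiple data idea: one "vector instruction" operates on several elements at once, and the number of lanes depends on the register width. The function name, the 32-bit element size, and the register widths are assumptions for illustration; this models the instruction count only, not real hardware.

```python
def simd_add(a, b, register_bits=128, element_bits=32):
    """Add two equal-length lists in lane-sized groups, counting vector instructions."""
    lanes = register_bits // element_bits
    result = []
    vector_instructions = 0
    for start in range(0, len(a), lanes):
        # One emulated instruction adds every lane in this group.
        group_a = a[start:start + lanes]
        group_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(group_a, group_b))
        vector_instructions += 1
    return result, vector_instructions


values, count_128 = simd_add(list(range(8)), [10] * 8, register_bits=128)
_, count_256 = simd_add(list(range(8)), [10] * 8, register_bits=256)
print(values, count_128, count_256)
```

Doubling the register width halves the number of vector instructions needed for the same data, which is the motivation for wider vector registers.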
Specialization
using custom chips (or circuits within chips)
reconfigurable processors (e.g. FPGAs)
Miscellaneous topics
hardware security
warehouse-scale computers
. . . depends on time
Suggestions?
Papers for Next Class
Alan Smith’s review of caching in 1982
D. J. Bernstein’s timing attack and suggestions to computer architects in 2005
Note: We’re not reading this to learn about AES