the transdisciplinary responsibility of cs curricula reiner hartenstein tu kaiserslautern san diego,...
TRANSCRIPT
The Transdisciplinary Responsibility
of CS Curricula
Reiner Hartenstein
TU Kaiserslautern
San Diego, CA, USA, June 25 - 30, 2006
THE NINTH WORLD CONFERENCE ON INTEGRATED DESIGN & PROCESS
TECHNOLOGY
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
2
frustrates interdisciplinary education, in CS even between subdisciplines
The Basic Model Paradigm Trap
High performance computing stalled for decades by the von Neuman paradigm trap: the wrong road map.
stolen from Bob Colwell
CPU
The right roadmap kept by another trap for decades !
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
3
Computer Science not preparedTransdisciplinary
Education?
for decades: the Hardware / Software chasm
Lacking intradisciplinary cohesion between the mind sets of:
•Hardware People
•Theoreticians (Math background)
•Software People (Application Development)
turns into: the Configware / Software chasm
•Embedded Syst. Designers•Computer Architects
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
4
xxx
xxx
xxx
|
||
x x
x
x
x
x
x x
x
- -
-
input data stream
xx
x
x
x
x
xx
x
--
-
-
-
-
-
-
-
-
-
-
xxx
xxx
xxx
|
|
|
|
|
|
|
|
|
|
|
|
|
| output data
streams „data
streams“ time
port #
time
time
port #time
port #
define: ... which data item at which time at which port
introducing Data streams,
no instruction streams needed
H. T. Kung paradigm(systolic array)
CS Mathematicians‘ hobby, early 80ies
1980
DPA*(pipe network)*) DataPath Array
(array of DPUs)
execution transport-triggered
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
5
of course, algebraic !Synthesis Method?
Algebraic means linear projection, restricted to uniform arrays, only with linear pipes
useful only for applications with strictly regular data dependencies:
Mathematicians caught by their own paradigm trap for more than a decade
Rainer Kress discarded their algebraic synthesis methods and replaced it by simulated annealing.
Generalization by a transdisciplinary hardware guy:
rDPA:
1995
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
6
no memory wall
DPA
xxx
xxx
xxx
|
||
x x
x
x
x
x
x x
x
- -
-
input data streams
xx
x
x
x
x
xx
x
--
-
-
-
-
-
-
-
-
-
-
xxx
xxx
xxx
|
|
|
|
|
|
|
|
|
|
|
|
|
| output data
streams
DPU operation is transport-triggered
no instruction streamsno message passing
nor thru common memory
where are the supercomputing people ? (took >2 decades)
massively avoiding memory cycles
The right road map to HPC: there ignored for
decades
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
7
The supercomputing paradigm trap
this did not prevent supercomputing from following the wrong rodmap for decades,imprisoned by the von Neumann paradigm trap
No technology transfer from Mathematics: caught by the algebraic paradigm trap (systolic array scene)
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
8
Monstrous Steam Engines of Computing
5120 Processors, 5000 pins eachCrossbar weight: 220 t, 3000 km of cable,
ES 20: TFLOPS
peak or sustained?
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
9
Dead Supercomputer Society
•ACRI •Alliant •American Supercomputer •Ametek •Applied Dynamics •Astronautics •BBN •CDC•Convex•Cray Computer •Cray Research •Culler-Harris •Culler Scientific •Cydrome •Dana/Ardent/ Stellar/Stardent
•DAPP •Denelcor •Elexsi •ETA Systems •Evans and Sutherland•Computer•Floating Point Systems •Galaxy YH-1 •Goodyear Aerospace MPP •Gould NPL •Guiltech •ICL •Intel Scientific Computers •International Parallel . Machines •Kendall Square Research •Key Computer Laboratories
Research 1985 – 1995 [Gordon Bell, keynote ISCA 2000]
•MasPar•Meiko •Multiflow •Myrias •Numerix •Prisma •Tera •Thinking Machines •Saxpy •Scientific Computer•Systems (SCS) •Soviet Supercomputers •Supertek •Supercomputer Systems •Suprenum •Vitesse Electronics
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
10
The Language and Tool Disaster
Software people do not speak VHDL
Hardware people do not speak MPI
Bad quality of the application development tools
End of April a DARPA brainstorming conference
A poll at FCCM’98 revealed, that 86% hardware designers hate their tools
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
11
Escaping the Paradigm Trap
The fastest growing segment of the semiconductor marketMassive speed-up
The underground success story of FPGAs
However, this is not supported by our education systems
Slashing the electricity bill
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
12
X 2/yr
FPGA
some published speed-up factors
1980 1990 2000 2010100
103
106
109
8080
Pentium 4
7%/yr
50%/yr
real-time face detectionreal-time face detection6000
video-rate stereo vision
video-rate stereo vision
900pattern
recognitionpattern
recognition730
SPIHT wavelet-based image compressionSPIHT wavelet-based image compression 457Smith-Waterman pattern matching
Smith-Waterman pattern matching
288
BLASTBLAST52protein identificationprotein identification
40
molecular dynamics simulationmolecular dynamics simulation
88
Reed-Solomon Decoding
Reed-Solomon Decoding2400
Viterbi DecodingViterbi Decoding
400
FFTFFT
100
1000MA
CMA
C
DSP and wirelessImage processing,Pattern matching,
Multimedia
Bioinformatics
GRAPEGRAPE20
AstrophysicsMicroprocessor
rela
tive
perf
orm
anc
e
Memory
cryptocrypto
1000
although the effective integration density of FPGAs is by 4 orders of magnitude behind the Moore curvewiring overheadreconfigurability overheadrouting congestion
The RC paradox
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
13
Pervasiveness of Reconfigurable Computing (RC)
1,620,000
915,000
398,000
272,000
647,000
1,490,000
# of hits by
“FPGA and ….”
Embedded Systems scene not imprisened by the von
Neumann paradigm trap
•Hardware People:•Computer architects•Embedded system des.
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
14
The Pervasiveness of RC
162,000
127,000
158,000113,000
171,000194,000
# of hits by Google
1,620,000
915,000
398,000
272,000
647,000
1,490,000
# of hits by
“FPGA and ….” Math/SW-savvy sceneunqualified for RC ?
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
15
Using FPGAs for scientific computation?
Unqualified for RC ?
hiring a student from the EE dept. ?
application disciplines use their own trick boxes:transdisciplinary fragmentation of methodologyCS is responsible to provide a RC common model•for transdisciplinary education•and, to fix its intradisciplinary fragmentation
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
16
Computing Curricula 2004fully ignores
Reconfigurable Computing
Joint Task Force for
FPGA & synonyma: 0 hits
not even here
(Google: 10 million hits)
Curricula ?
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
17
Upon my complaints the only change: including to the last paragraph of the survey volume:
Curriculum Recommendations, v. 2005
"programmable hardware (including FPGAs, PGAs, PALs,
GALs, etc.)." However, no structural changes at all
v. 2005 intended to be the final version (?) torpedoing the transdisciplinary
responsibility of CS curriculaThis is criminal !This is criminal !
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
18
to develop a roadmap for CS to assume intradisciplinary resonsibility for education
to send delegates to the Joint Task Force for Computing Curricula
to identify intra-disciplinary communication gaps in CS
SIG on CS educationand RC education
We need a
a task force for Reconfigurable Computing education
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
22
datacounter
GAG RAM
ASM: Auto-Sequencing
Memory
ASM: Auto-Sequencing
Memory
use data counters, no program
counter
rDPArDPA
x x
x
x
x
x
x x
x
- -
-
xx
x
x
x
x
xx
x
--
-
-
-
-
-
-
-
-
-
-
ASM Data stream generators
(pipe network)
xxx
xxx
xxx
|
||
xxx
xxx
xxx
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ASM
ASM
ASM
ASM
ASM
ASM
AS
M
AS
M
AS
M
AS
M
AS
M
AS
M
implemented by distributed
memory
implemented by distributed
memory
50 & more on-chip ASM are feasible
50 & more on-chip ASM are feasible
reconfigurablereconfigurable
32 ports, orn x 32 ports
other example non-von-Neumannmachine paradigm
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
23
Liaison to the Organic Computing Initiative ?
http://www.organic-computing.org
German section
http://www.organic-computing.de
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
24
Illustrating the von Neumann paradigm trap
The data-stream-based approach
The instruction-stream-based approach
von Neuman
n bottle-
neck
von Neuman
n bottle-
neck
has no von Neumann bottle-neck
has no von Neumann bottle-neck
the watering pot model [Hartenstein]
many watering pots
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
25
data meeting the processing unit (PU)
by Software
byConfigware
routing the data
placement of the execution locality
We have 2 choices
instruction streams arememory-cycle-hungry
optimize pipe network: place PU in data stream
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
26
Co-Compiler Enabling Technology
is available from academia
only a small team needed for commercial re-implementation
on the road map to the Personal Supercomputer
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
27
Compilation: Software vs. Configware
source program
softwarecompiler
software code
Software Engineeri
ng
Software Engineeri
ng
configware code
mapper
configwarecompiler
scheduler
flowware code
source „program“
Configware
Engineering
Configware
Engineering
placement &
routing
data
C, FORTRANMATHLAB
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
28
configware resources: variable
Nick Tredennick’s Paradigm Shifts explain the differences
2 programming sources needed
flowware algorithm: variable
Configware EngineeringConfigware Engineering
Software EngineeringSoftware Engineering
1 programming source needed
algorithm: variable
resources: fixedsoftware
CPU
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
29
Co-Compilation
softwarecompiler
software code
Software / Configware Co-Compiler
Software / Configware Co-Compiler
configware code
mapperconfigware
compiler
scheduler
flowware code
data
C, FORTRAN, MATHLAB
automatic SW / CW partitionersimulated annealing
simulated annealing
simulated annealing
simulated annealing
© 2006, [email protected] http://hartenstein.de
TU Kaiserslautern
30
Co-Compiler for Hardwired Kress/Kung Machine
[e. g. Brodersen]
softwarecompiler
software code
Software / Flowware
Co-Compiler
Software / Flowware
Co-Compiler
flowwarecompiler
scheduler
flowware code
data
source
automatic SW / CW partitioner