E2/MAPLD 2004 Storaasli
Engineering Applications on NASA’s FPGA*-based Hypercomputers
By
Olaf.O.Storaasli@nasa.gov
Analytical & Computational Methods Branch
NASA Langley Research Center
Hampton, Virginia
7th Military Aerospace Programmable Logic Device (MAPLD) International Conference
Reagan Center, Washington DC
September 10, 2004
*Field-Programmable Gate Array
2 E2/MAPLD 2004 Storaasli
Contents
Background: Hardware, “Gateware”
Current: Algorithms
Applications: CPU-FPGA, FPGA
Future: “New” Spacecraft Hypercomputer
3 E2/MAPLD 2004 Storaasli
NASA Reconfigurable Hypercomputers
‘02: 62K gates/FPGA
‘04: 6M gates/FPGA
4 E2/MAPLD 2004 Storaasli
Computing Faster Without CPUs
GOAL: Explore Engineering Applications on NASA’s FPGA-based Hypercomputers
TEAM: Drs. Olaf Storaasli, Jarek Sobieski & Robert Singleterry, Dave Rutishauser, Joe Rehder, Garry Qualls, Robert Lewis
PARTNERS: Starbridge Systems (FPGA H/W + VIVA S/W), NSA, USAF, MSFC, AlphaStar
Students: MIT, Harvard, VT, Brown, UVA, JPMorgan, Case, Pitt, Governor’s School
5 E2/MAPLD 2004 Storaasli
VIVA: Custom Chip Design
What: Graphically code FPGAs (drag & drop vs. text)
How: Converts icons-transports to FPGA circuit
Why: near-ASIC speed (w/o chip design $$$)
More: System Description ports to any H/W
“write once, run anywhere”
Data: Any type-size-precision (not fixed)
Corelib: Pre-built objects & examples
Traditional Code: 1D
  do i = 1, 1000
    C = A + B
  end do
VIVA Gateware: 3D
  + + + … + (adders side by side)
Parallelism: natural vs. esoteric
[Screenshot: VIVA Menu]
6 E2/MAPLD 2004 Storaasli
FPGA Use
Replace CPUs
Exploit Parallelism Fully
Max {Ops/cycle} => Fill FPGA
VIVA/VHDL/Verilog code
Limit: FPGA(s) gates
CPU + FPGA Accelerator
Exploit Local Parallelism
Max {kernel Ops/cycle}
C/FORTRAN calls VIVA kernel
Limit: FPGA gates + Amdahl’s Law
Example (Ax=b, NASA GPS): 28k lines of FORTRAN on the CPU; a 50-line kernel takes 95% of CPU time => move it to the FPGA
[Diagram: CPU <=> Call <=> FPGA kernel]
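The Amdahl's Law limit quoted for the CPU + FPGA accelerator can be made concrete with a short calculation. This is a minimal sketch rather than anything from the original slides: the function name `amdahl_speedup` and the sample kernel speedups are illustrative assumptions, and only the 95% kernel fraction comes from the slide.

```python
def amdahl_speedup(kernel_fraction, kernel_speedup):
    """Overall speedup when only `kernel_fraction` of the run time is accelerated.

    kernel_fraction: share of CPU time spent in the kernel (0.95 on the slide).
    kernel_speedup:  assumed speedup of the kernel on the FPGA (hypothetical values).
    """
    serial_part = 1.0 - kernel_fraction
    return 1.0 / (serial_part + kernel_fraction / kernel_speedup)


if __name__ == "__main__":
    for s in (10.0, 100.0, float("inf")):
        print(f"kernel speedup {s:>6}: overall {amdahl_speedup(0.95, s):.2f}x")
```

With 95% of the time in the kernel, the remaining 5% that stays on the CPU caps the overall speedup at 1/0.05 = 20x, however fast the FPGA kernel becomes.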
7 E2/MAPLD 2004 Storaasli
GENOA-GPS* “Port”
GENOA Analysis/Design (AlphaStar)
GPS Matrix Equation Solver (NASA)
Structural, EM, acoustic analysis+design
Most Computations in 50-line kernel
kernel coded: VIVA-GPS
VIVA2.4 => large applications ongoing (NASA-AlphaStar-Starbridge)
Progressive Failure, Reliability, Durability, Manufacturing, Virtual Test, Life prediction
Calls GPS
Shuttle re-entry wing damage analysis time: 660 hours => minutes (Goal)
*‘99 NASA Software-of-the-Year
Finite Element Model
8 E2/MAPLD 2004 Storaasli
Columbia Burn-thru Analysis
[Figure: Leading Edge FEM showing Panels 6, 7 and 8 (38 in scale), with fracture times: Insulation Fracture 230 sec, Spar Fracture 500 sec, RCC-Tseal Fracture 503 sec]
9 E2/MAPLD 2004 Storaasli
FPGA Use
Replace CPUs
Exploit Parallelism Fully
Max {Ops/cycle} => Fill FPGA
100% VIVA code
Limit: FPGA(s) gates
CPU + FPGA Accelerator
Exploit Local Parallelism
Max {kernel Ops/cycle}
C/FORTRAN calls VIVA kernel
Limit: FPGA gates + Amdahl’s Law
Maximize Performance via Parallelism
Adds/FPGA       16   32  128  256  512  640
% FPGA used      1    2    8   16   41   51
10^9 Ops/sec     4    8   34   77  154  192
1000+ adds/clock cycle => 10^11 Ops/sec (1 add/cycle on CPUs)
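The adds/FPGA figures above can be checked with simple arithmetic. The sketch below is only an illustration: it copies the three rows from the slide, and the projection assumes, purely for the sake of the estimate, that adder count scales linearly with the fraction of the FPGA used. The function names are hypothetical.

```python
# Rows copied from the slide.
ADDERS = [16, 32, 128, 256, 512, 640]
PERCENT_USED = [1, 2, 8, 16, 41, 51]
GIGA_OPS = [4, 8, 34, 77, 154, 192]  # units of 10^9 operations


def gops_per_adder():
    """Throughput contributed by each adder, in 10^9 ops."""
    return [g / a for g, a in zip(GIGA_OPS, ADDERS)]


def projected_full_chip_adders():
    """Adders that would fit if the last row scaled linearly to 100% of the FPGA.

    Linear scaling is an assumption, not a measured result.
    """
    return ADDERS[-1] * 100 / PERCENT_USED[-1]


if __name__ == "__main__":
    for a, r in zip(ADDERS, gops_per_adder()):
        print(f"{a:4d} adders: {r:.3f} x 10^9 ops per adder")
    print(f"projected adders on a full FPGA: {projected_full_chip_adders():.0f}")
```

Under that linear assumption, 640 adders at 51% utilization extrapolate to about 1250 adders on a full chip, which is in line with the "1000+ adds/clock cycle" figure.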
10 E2/MAPLD 2004 Storaasli
Memory: FPGA & SDRAM
- keep “action” on/near FPGA -
FPGA: 144 x 2KB blocks RAM
SDRAM: 2-8GB (large applications)
11 E2/MAPLD 2004 Storaasli
File I/O
• FileIn/FileOut in Corelib
• Transfers 2 KB blocks (Disk <=> FPGA RAM)
• User can access FPGA RAM 4 Bytes at a time
12 E2/MAPLD 2004 Storaasli
Add Files in Parallel
Read 2 files => Store in FPGA RAM => + files => Write result
[Diagram: blocks labeled R, R, S, S, + and W]
13 E2/MAPLD 2004 Storaasli
Parallel Adds Faster
- same file size -
[Chart: Time in cycles (0-100) vs. Number of FPGA Adders used (0-28) for file sizes 4KB, 8KB and 16KB, each with a logarithmic trend line; CPUs (1 add) shown for comparison; data labels 23, 46, 92]
14 E2/MAPLD 2004 Storaasli
Algorithms Developed
• n! => Probability: Combinations/Permutations
• CORDIC => Transcendentals: sin, log, exp, cosh…
• ∂y/∂x & ∫f(x)dx => Runge-Kutta: CFD, Newmark Beta: CSM
• Matrix Equation Solvers: [A]{x} = {b}, Gauss & Jacobi
• Nonlinear Analysis: reduces NL time
• Matrix Algebra: {V}, [M], {V}^T{V}, [M]x[M], GCD,…
• Dynamic Analysis: [M]{ü} + [C]{u̇} + [K]{u} + NL = {P(t)}
• Structural Design & Optimization
• Analog Computing: digital accuracy
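To illustrate why CORDIC suits gate-level hardware, here is a minimal rotation-mode sketch for sine and cosine. It is not the VIVA implementation, which the slides do not show: the function name `cordic_sin_cos` and the iteration count are assumptions, and floating point is used for readability where hardware would use fixed-point shifts and adds.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC for |theta| <= pi/2; returns (sin, cos).

    Each step rotates by +/- atan(2^-i). In hardware the multiplications by
    2^-i are bit shifts, so only shifts, adds and a small angle table are needed.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Pre-compute the constant gain correction so the final vector has unit length.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0      # rotate toward the remaining angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

Each iteration adds roughly one bit of accuracy, so precision can be traded against gate count by changing the number of stages.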
15 E2/MAPLD 2004 Storaasli
Applications: VIVA Code
Gauss Matrix Solver
Jacobi Matrix Solver
Cellular Automata
Runge-Kutta
16 E2/MAPLD 2004 Storaasli
Gauss-Jordan A x = B Solver
• VIVA code solves n equations.
=> Ex: x0 + x1 + x2 = 0
       x0 – 2x1 + 2x2 = 4
       x0 + 2x1 – x2 = 2
   Solution: x0 = 4, x1 = -2, x2 = -2
• Run on hypercomputer emulator, then FPGA
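The three-equation example can be reproduced with a short Gauss-Jordan elimination. This is a plain software sketch for checking the numbers, not the VIVA gateware; the function name `gauss_jordan` is an assumption, and exact fractions are used so the result is free of rounding.

```python
from fractions import Fraction


def gauss_jordan(a, b):
    """Solve a x = b by Gauss-Jordan elimination on the augmented matrix.

    Raises StopIteration if no non-zero pivot can be found (singular matrix).
    """
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot and swap it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        # Eliminate the column from every other row (above and below).
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    return [row[n] for row in m]


if __name__ == "__main__":
    A = [[1, 1, 1], [1, -2, 2], [1, 2, -1]]
    B = [0, 4, 2]
    print([int(x) for x in gauss_jordan(A, B)])  # [4, -2, -2]
```

The row updates inside each column step are independent of one another, which is the part a hardware version could carry out in parallel.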
17 E2/MAPLD 2004 Storaasli
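Spring-Mass Solver
Method: 4-stage Runge-Kutta
$\dfrac{du}{dt} = f(u, t)$
$k_1 = h\,f(x_n,\; y_n)$
$k_2 = h\,f(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_1)$
$k_3 = h\,f(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_2)$
$k_4 = h\,f(x_n + h,\; y_n + k_3)$
$y_{n+1} = y_n + \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$
$x_{n+1} = x_n + h$
$y(x_0) = y_0$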
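The 4-stage Runge-Kutta scheme can be sketched in a few lines for an undamped spring-mass system. The slide does not give the system parameters, so the unit mass, unit stiffness, step size and initial conditions below are assumptions chosen so that the exact solution is cos(t); the function names are likewise hypothetical.

```python
import math


def rk4_step(f, x, y, h):
    """One classical RK4 step for a system y' = f(x, y), with y a list of floats."""
    k1 = [h * v for v in f(x, y)]
    k2 = [h * v for v in f(x + h / 2, [yi + k / 2 for yi, k in zip(y, k1)])]
    k3 = [h * v for v in f(x + h / 2, [yi + k / 2 for yi, k in zip(y, k2)])]
    k4 = [h * v for v in f(x + h, [yi + k for yi, k in zip(y, k3)])]
    y_next = [yi + (a + 2 * b + 2 * c + d) / 6
              for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return x + h, y_next


def spring_mass(t, state, stiffness=1.0, mass=1.0):
    """m u'' + k u = 0 written as a first-order system in (u, v)."""
    u, v = state
    return [v, -stiffness / mass * u]


def integrate(steps=628, h=0.01):
    """Integrate from u(0) = 1, v(0) = 0; the exact displacement is cos(t)."""
    t, state = 0.0, [1.0, 0.0]
    for _ in range(steps):
        t, state = rk4_step(spring_mass, t, state, h)
    return t, state


if __name__ == "__main__":
    t, (u, v) = integrate()
    print(f"t = {t:.2f}, u = {u:.8f}, exact = {math.cos(t):.8f}")
```

The four stage evaluations within a step depend on each other, but the components of the state vector within each stage can be updated side by side.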
18 E2/MAPLD 2004 Storaasli
Cellular Automata
• Parallel: Stephen Wolfram - A New Kind of Science
• Complexity via simple interactions w/o PDEs
• CFD => Structures
[Figure: two contour plots on a 10 x 10 grid for a beam with load P and thickness d, comparing the FEA solution with the Cellular Automata solution; contour levels range from -25.4057 to 21.4971]
• Cell-neighbors interactions; simple compute/cell
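The idea of cell-neighbor interactions with a simple computation per cell can be shown with a one-dimensional elementary cellular automaton of the kind studied in A New Kind of Science. This is only an illustrative sketch: the structural cellular automaton compared with FEA on the slide is not described in enough detail to reproduce, and the choice of Rule 30, the wrap-around boundary and the function name `step` are assumptions.

```python
def step(cells, rule=30):
    """Advance a 1D binary cellular automaton by one generation.

    Each new cell depends only on itself and its two neighbors, so every cell
    can be updated at the same time. Boundaries wrap around.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right   # neighborhood as a 3-bit number
        out.append((rule >> index) & 1)               # look up that bit of the rule
    return out


if __name__ == "__main__":
    row = [0] * 15 + [1] + [0] * 15
    for _ in range(8):
        print("".join("#" if c else "." for c in row))
        row = step(row)
```

Because each cell reads only its immediate neighbors, the update maps directly onto an array of identical small circuits.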
19 E2/MAPLD 2004 Storaasli
Cantilever Beam Optimization
Find thickness, d, to minimize
$\text{Weight} = \rho \times L \times w \times d$
where
$\text{Stress} = \dfrac{6PL}{w d^{2}} \le \text{Stress}_{\text{allowed}}$
Constants:
L = 24” w = 3” P = 20 lbs
ρ = 0.097 lbs/in^3
Constraint: Stress_allowed = 40K lbs/in^2
20 E2/MAPLD 2004 Storaasli
VIVA FPGA Code Minimizes Beam Weight
VIVA Results: d= 0.156” (0.155 exact)
Minimum weight = 1.09 lbs (1.082 exact)
d chosen 1023 times
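The exact values quoted above follow from the constants on the previous slide. Since weight grows with d, the minimum sits where the stress constraint is active, giving d = sqrt(6PL / (w × Stress_allowed)). The sketch below evaluates that closed form; it is not the search performed by the VIVA code, and the function names are assumptions.

```python
import math


def min_thickness(load, length, width, allowed_stress):
    """Smallest d satisfying 6 P L / (w d^2) <= allowed stress."""
    return math.sqrt(6.0 * load * length / (width * allowed_stress))


def beam_weight(density, length, width, thickness):
    """Weight = rho * L * w * d."""
    return density * length * width * thickness


# Constants from the slide: P = 20 lbs, L = 24 in, w = 3 in,
# rho = 0.097 lbs/in^3, allowed stress = 40,000 lbs/in^2.
d_exact = min_thickness(20.0, 24.0, 3.0, 40_000.0)
w_exact = beam_weight(0.097, 24.0, 3.0, d_exact)

if __name__ == "__main__":
    print(f"d = {d_exact:.3f} in, weight = {w_exact:.3f} lbs")
```

This reproduces the exact figures of 0.155” and 1.082 lbs, against which the VIVA results of 0.156” and 1.09 lbs are compared.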
21 E2/MAPLD 2004 Storaasli
“a bold new course into the cosmos”
Reconfigurable Scalable Computing (RSC) for Space Applications - $14.8M
22 E2/MAPLD 2004 Storaasli
Spirit & Opportunity Rovers
6 Radiation-tolerant FPGAs: 1M gates @ 100kRads
Next: 6M gates @ 200kRads
23 E2/MAPLD 2004 Storaasli
What:     Reconfigurable Scalable Computing (RSC) for Space Applications
Who:      Langley, Goddard, NSA, Starbridge, Jefferson Lab, ASRC, Queensland
When:     4 years (FY ‘05-’08)
How:      $14.8M
Goal:     Effective-affordable processing for moon & Mars missions
Plan:     Design-implement-demonstrate RSC for space applications
Hardware: Stacked scalable FPGAs
Gateware: Conventional (MPI/Linux) + Special (VIVA)
More:
24 E2/MAPLD 2004 Storaasli
Summary
Hardware: Exploiting advanced FPGA-based systems
FPGAs: Rapid growth, inherently parallel, flexible, efficient
VIVA: Powerful & growing (tailored to NASA needs)
Applications: - Many engineering algorithms (VIVA => FPGAs)
              - GPS-VIVA => CPU+FPGA accelerator
Speed: 640 ops/cycle (2x10^11 ops/sec) measured
Future: Reconfigurable Scalable Computing for Space
25 E2/MAPLD 2004 Storaasli
The End