dynamic analysis in the reduceron · dynamic vs. static analysis “create all closures as...
TRANSCRIPT
![Page 1: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/1.jpg)
Dynamic analysis in
the Reduceron
Matthew Naylor and Colin Runciman
University of York
![Page 2: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/2.jpg)
“I wonder how popular Haskell needs to become for
Intel to optimize their processors for my runtime,
rather than the other way around.”
Simon Marlow, 2009
A question
![Page 3: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/3.jpg)
Underlying problem?
Constraints imposed by conventional processors
make efficient implementations of functional
languages sophisticated and limited.
![Page 4: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/4.jpg)
“Current RISC technology will probably have
increased in speed enough by the time a graph-
reduction chip could be designed and fabricated to
make the exercise pointless.”
Koopman, 1990
Build a machine especially to run functional programs.
Possible solution
But now the situation is different…
![Page 5: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/5.jpg)
Block RAMs
FPGAs (Reconfigurable H/W)
Contain a large set of components that can be
connected together in any desired way.
Widely available, quick to program, autonomous.
Logic Cells (Gates & Flip-flops)
![Page 6: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/6.jpg)
not restricted by conventional architectural constraints,
A computer designed to run lazy functional programs,
implemented on an FPGA, using a functional language.
The Reduceron
(Origin: Chris Kania)
![Page 7: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/7.jpg)
Graph reduction
Step by step
![Page 8: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/8.jpg)
Graph reduction, step by step
f ys x xs = g x (h xs ys)
Suppose that function f is defined by
where g and h are functions and the following
machine-state arises during reduction.
f
a
b
c g @2
Stack Templates Heap
f: h @3 @1
![Page 9: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/9.jpg)
Operation: f <- Stack[0]
g <- Code[f]
g -> Heap
Steps: 3
f
a
b
c
Stack Heap
g g @2 f: h @3 @1
Templates
Graph reduction, step by step
![Page 10: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/10.jpg)
Operation: arg <- Code[f+1]
b <- Stack[arg]
b -> Heap
Steps: 6
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1
Templates
Graph reduction, step by step
![Page 11: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/11.jpg)
Operation: ptr <- Code[f+2]
ptr’ -> Heap
Steps: 8
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1
Templates
Graph reduction, step by step
![Page 12: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/12.jpg)
Operation: h <- Code[f+3]
h -> Heap
Steps: 10
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1 h
Templates
Graph reduction, step by step
![Page 13: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/13.jpg)
Operation: arg <- Code[f+4]
c <- Stack[arg]
c -> Heap
Steps: 13
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1 h c
Templates
Graph reduction, step by step
![Page 14: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/14.jpg)
Operation: arg <- Code[f+5]
a <- Stack[arg]
a -> Heap
Steps: 16
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1 h c a
Templates
Graph reduction, step by step
![Page 15: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/15.jpg)
The von Neumann Bottleneck
One word at a time.
Each of the 16 memory
transactions is done
sequentially.
![Page 16: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/16.jpg)
Widening the bottleneck
![Page 17: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/17.jpg)
Widening the bottleneck, again
Many words at a time.
![Page 18: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/18.jpg)
Applying a function “in one go”
f
a
b
c
Stack Heap
g b g @2 f: h @3 @1 h c a
Templates
The function f is applied in a single clock cycle.
![Page 19: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/19.jpg)
Reduceron “instruction set”
Operation Clock cycles
Apply n/2
Unwind 1
Update 1
Primitive Apply 1
Where n = number of applications in function body.
![Page 20: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/20.jpg)
Template instantiation?
Yes, the Reduceron actually does
template instantiation!
But what about the G-machine and the ABC machine?
![Page 21: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/21.jpg)
Template instantiation
“This chapter introduces the simplest possible
implementation of a functional language: a
graph-reducer based on template instantiation”
Simon Peyton Jones, 1992
![Page 22: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/22.jpg)
Dynamic analysis (1)
Update avoidance
![Page 23: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/23.jpg)
Shared pointers
unshared applications, and
possibly-shared applications.
Distinguish between:
Idea: When an unshared application is
reduced to normal form, no update is needed.
![Page 24: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/24.jpg)
Dynamic vs. static analysis
“Create all closures as [unshared], and
dynamically change their tag to [possibly-
shared] if they become shared. We call
this operation dashing.”
“In general we strongly suspect that the
cost of dashing greatly outweighs the
advantages of precision when compared
to the [static analysis] method.”
Peyton Jones, 1988
![Page 25: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/25.jpg)
Dashing when applying
s
f
g
x @1 @3
Stack Templates
s: @2 @3
Heap
When the function s, containing two references
to its 3rd argument, is applied,
![Page 26: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/26.jpg)
Dashing when applying
s
f
g
x @1 @3
Stack
s: @2 @3
When the function s, containing two references
to its 3rd argument, is applied,
the 3rd argument is dashed.
Heap
f x g x
Templates
![Page 27: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/27.jpg)
Dashing when applying
inst :: Reduceron -> Atom -> Atom
inst r a =
pick
[ a!isARG --> dash (a!isArgShared) x
, a!isAP -->
makeAP (a!isShared)
(r!heap!size + a!pointer)
, otherwise --> a
]
where
otherwise = inv (a!isARG <|> a!isAP)
x = pick (zip (a!argIndex!velems)
(r!vstack!tops!velems))
![Page 28: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/28.jpg)
Dashing when unwinding
x
c
Stack Heap
f a x: b
When a pointer x to a shared application appears
on top of the stack,
![Page 29: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/29.jpg)
Dashing when unwinding
c
Stack Heap
f a x: b
When a pointer x to a shared application appears
on top of the stack,
f
a
b
the unwound application is dashed.
![Page 30: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/30.jpg)
Dashing when unwinding unwind :: Reduceron -> Recipe
unwind r =
Seq [
r!newTop <== vhead as
, r!vstack!update n (unwindMask n)
(as!vtail!velems)
, app!hasAlts |> r!astack!push (app!alts)
, upd |> r!ustack!push
(makeUpdate (r!vstack!size)
(r!top!pointer))
]
where
app = r!heap!outputB
n = app!appArity
as = vmap (dash sh) (app!atoms)
sh = r!top!isShared
upd = sh <&> inv (app!isNF)
![Page 31: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/31.jpg)
Dashing when updating
Stack Heap
f a x:
C
a
as b
When a normal-form is reached,
![Page 32: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/32.jpg)
Dashing when updating
Stack Heap
C a x:
C
a
as as
When a normal-form is reached,
it is copied onto the heap, overwriting the original
application, and its arguments are dashed.
![Page 33: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/33.jpg)
Dashing in the Reduceron
In each of the three cases, dashing
is just bit-flipping under some
simple-to-compute conditions.
![Page 34: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/34.jpg)
Normal-form updates avoided
0
10
20
30
40
50
60
70
80
90
100
Mean = 56.2%
![Page 35: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/35.jpg)
Total updates avoided
0
10
20
30
40
50
60
70
80
90
100
Mean = 56.2%
Mean = 87.2%
![Page 36: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/36.jpg)
% runtime saving
0
5
10
15
20
25
30
35
40
45
50
Mean = 21.5%
![Page 37: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/37.jpg)
Dynamic analysis (2)
Speculative evaluation of primitive redexes (PRS)
![Page 38: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/38.jpg)
Primitive redexes
f x xs a = g (x+a) xs
Suppose that function f is defined by
where g is a function, and + is primitive addition.
f
5
xs
2 g @2
Stack Templates Heap
f:
@1 + @3
![Page 39: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/39.jpg)
Primitive redexes
Application of f results in the primitive redex
5+2 being instantiated on the heap.
g @2
Stack Heap
f:
@1 + @3
g xs
5 + 2
f
5
xs
2
Templates
![Page 40: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/40.jpg)
Speculative evaluation
Idea: At instantiation time, look at the arguments
in a primitive application. If they are already
evaluated, apply the primitive speculatively.
g @2
Stack Heap
f:
@1 + @3
f
5
xs
2 g 7 xs
Templates
![Page 41: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/41.jpg)
% runtime saving, due to PRS
-10
0
10
20
30
40
50
Mean = 12%
amount of primitive reduction
![Page 42: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/42.jpg)
Example of PRS success
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum 0 10
=
Function to enumerate a range of integers:
We wish to evaluate: enum 0 10
![Page 43: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/43.jpg)
Example of PRS success
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum 0 10
= if 0 <= 1
then Cons 0 (enum (0+1) 10)
else Nil
Function to enumerate a range of integers:
We wish to evaluate: 0 10 enum
![Page 44: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/44.jpg)
Example of PRS success
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum 0 10
= if True
then Cons 0 (enum 1 10)
else Nil
Function to enumerate a range of integers:
We wish to evaluate: 0 10 enum
![Page 45: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/45.jpg)
Example of PRS success
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum 0 10
= if True
then Cons 0 (enum 1 10)
else Nil
= Cons 0 (enum 1 10)
Function to enumerate a range of integers:
We wish to evaluate: 0 10 enum
![Page 46: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/46.jpg)
Example of PRS failure
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum x 10
=
Function to enumerate a range of integers:
We wish to evaluate: y enum 10 x: f
![Page 47: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/47.jpg)
Example of PRS failure
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum x 10
= if x <= 10
then Cons x (enum (x+1) 10)
else Nil
Function to enumerate a range of integers:
We wish to evaluate: y enum 10 x: f
![Page 48: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/48.jpg)
Example of PRS failure
enum n m = if n <= m
then Cons n (enum (n+1) m)
else Nil
enum x 10
= if x <= 10
then Cons x (enum (x+1) 10)
else Nil
= Cons x (enum (x+1) 10)
Function to enumerate a range of integers:
We wish to evaluate: enum 10 x: 3
![Page 49: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/49.jpg)
Possible solutions
Option 1: Worker/wrapper transformation.
Since enum is strict, introduce enumWrap
which forces n and m before passing them to
enum.
Option 2: Update the stack. Leave evaluated
arguments of conditional primitives on the
stack, to be observed by case alternatives.
![Page 50: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/50.jpg)
Conclusions
The Reduceron enjoys many freedoms.
Dynamic analysis can be very attractive.
There is a lot of low-level parallelism to
exploit even in “sequential” graph reduction.
Still to come: Parallel reduction, PRS is just
the beginning, and compiler optimisation.
![Page 51: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/51.jpg)
Quick progress report
![Page 52: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/52.jpg)
Speed-up factor, since IFL’07
1
3
5
7
9
11
13
15
![Page 53: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/53.jpg)
My desk
Intel Core 2 Duo E8400
* 6M Cache
* 3.00 GHz
* 1333 MHz FSB
* Released in 2008
Xilinx Virtex 5 FPGA
* LX110T (mid-range)
* Speed-grade 1 (lowest)
* Released in 2006
![Page 54: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/54.jpg)
Reduceron synthesis results
Fmax 110Mhz
Slice usage 2284 (13%)
LUTs 5865 (8%)
Flip-flops 1018 (1%)
BRAM usage 4.7 megabits (89%)
NOTE: memory capacity is very small – could be
enhanced by moving the heap into off-chip memory.
![Page 55: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/55.jpg)
Slow-down factor, against PC
1
2
3
4
5
6
7
8
9 GHC -O2
Clean
![Page 56: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/56.jpg)
Acknowledgements
This research is supported by EPSRC grant
EP/G011052/1.
Thanks to the Xilinx University Program for
donating the FPGA and development board.
Thanks to Satnam Singh for his advice and
contacts which enabled us to obtain this
board. The money saved has been used to
extend my contract by three months!
![Page 57: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/57.jpg)
Megabits of block RAM, over time
0
5
10
15
20
25
30
35
40
45
Virtex II (2000) Virtex 4 (2004) Virtex 5 (2006) Virtex 6 (2009)
![Page 58: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/58.jpg)
An off-chip heap
• Heap could be moved into off-chip RAM.
– To cater for memory-hungry programs.
• Should be possible without modifying
existing design significantly.
– Modern memory chips clock much higher
than 110Mhz.
– Example: reduced-latency RAM (4-cycle
access) clocking at over 400Mhz.
![Page 59: Dynamic analysis in the Reduceron · Dynamic vs. static analysis “Create all closures as [unshared], and dynamically change their tag to [possibly-shared] if they become shared](https://reader033.vdocuments.mx/reader033/viewer/2022053114/608d914e2e734b60ba4b512b/html5/thumbnails/59.jpg)
% runtime spent on arithmetic
0
5
10
15
20
25
30
35 Mean = 12%