Page 1: Determinate Imperative Programming:  The CF Model

Determinate Imperative Programming: The CF Model

Vijay Saraswat
IBM TJ Watson Research Center
Joint work with Radha Jagadeesan, Armando Solar-Lezama, Christoph von Praun
http://www.saraswat.org/cf.html

Page 2: Determinate Imperative Programming:  The CF Model

Outline

• Problem
  • Many concurrent imperative programs are determinate.
  • Determinacy is not apparent from the syntax.
• Basic idea
  • A variable is the stream of values written to it by a thread.
• Many examples
• Semantics
• Implementation
• Future work

Page 3: Determinate Imperative Programming:  The CF Model

Background: X10

Five basic themes:
• Partitioned address space
• Pervasive explicit asynchrony (Cilk-style recursive parallelism)
• Java base
• Guaranteed VM invariants
• Explicit, distributed VM

Few language extensions:
• <s> = async <s>
• <s> = finish <s>
• <s> = foreach ( <v>, …, <v> in <e>) <s>
• Multidimensional arrays over distributions

Subsumes MPI, OpenMP, SPMD languages, Cilk …

Page 4: Determinate Imperative Programming:  The CF Model

X10: clocks, clocked final data structures

• Clocks can be created dynamically.
• Activities are registered with clocks.
• An activity may register a newly created activity with one of its clocks.
• “next;” resumes each clock; blocks until each clock advances.
  • This is sufficient for deadlock-freedom.
• Adequate for parallel operations on arrays
  • But not dataflow
• Clock advances when all activities registered on it resume the clock.
• Operations: c.resume(); next; c.drop();
• Clocked final datum
  • In each phase of the clock the datum is immutable.
  • Read gets current value; write updates in next phase.

Clocks do not introduce deadlock; clocked finals are determinate.

Page 5: Determinate Imperative Programming:  The CF Model

Clocked final example: Array relaxation

    int clocked (c) final [0:M-1,0:N-1] G = …;
    // Interior cells only, so that G[i+1,j] and G[i,j+1] stay in bounds.
    finish foreach (int i,j in [1:M-2,1:N-2]) clocked (c) {   // each activity is registered on c
        for (int p in [0:TimeStep-1]) {
            // Reads get the current value of each cell.
            // The write is visible (only) when the clock advances.
            G[i,j] = omega/4*(G[i-1,j]+G[i+1,j]+G[i,j-1]+G[i,j+1])+(1-omega)*G[i,j];
            next;   // wait for clock to advance
        }
    }

• G elements are assigned to at most once in each phase of clock c.
• Takeaway: Each cell is assigned a clocked stream of immutable values.
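The clocked-final discipline can be mimicked sequentially with two buffers: every read in a phase sees the current grid, and every write lands in a next-phase grid that is swapped in when the phase ends. The Python sketch below illustrates that reading only; it is not the X10 implementation. The names `relax` and `reverse` are hypothetical, and `reverse` merely changes the order in which cells are visited, to suggest that the result does not depend on the schedule.

```python
def relax(grid, omega, steps, reverse=False):
    """Relax the interior of `grid` for `steps` phases and return the new grid."""
    rows, cols = len(grid), len(grid[0])
    cells = [(i, j) for i in range(1, rows - 1) for j in range(1, cols - 1)]
    if reverse:
        cells.reverse()  # a different "schedule" for the same phase
    current = [row[:] for row in grid]
    for _ in range(steps):
        nxt = [row[:] for row in current]  # writes go to the next phase
        for i, j in cells:
            nxt[i][j] = (omega / 4 * (current[i - 1][j] + current[i + 1][j]
                                      + current[i][j - 1] + current[i][j + 1])
                         + (1 - omega) * current[i][j])
        current = nxt  # the "clock advances": writes become visible
    return current


grid = [[4.0, 4.0, 4.0, 4.0],
        [4.0, 0.0, 0.0, 4.0],
        [4.0, 0.0, 0.0, 4.0],
        [4.0, 4.0, 4.0, 4.0]]
same = relax(grid, 0.5, 3) == relax(grid, 0.5, 3, reverse=True)
print(same)
```

Because each cell is computed only from the previous phase, visiting the cells in either order yields the same grid.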

Page 6: Determinate Imperative Programming:  The CF Model

Imperative Programming Revisited

Variables
• Value in a Box
• Read: fetch current value
• Write: change value
• Stability condition: Value does not change unless a write is performed

Very powerful
• Permit repeated many-writer, many-reader communication through arbitrary reference graphs

Asynchrony introduces indeterminacy
• Reader-reader, reader-writer, writer-writer conflicts.

    int x = 0;
    async x=1;
    print(x);   // may write out either 0 or 1

Page 7: Determinate Imperative Programming:  The CF Model

Determinate Concurrent Imperative Frameworks

Asynchronous Kahn networks
• Nodes can be thought of as (continuous) functions over streams.
• Pop/peek
• Push
• Node-local state may mutate arbitrarily

Concurrent Constraint Programming
• Tell constraints
• Ask if a constraint is true
• Subsumes Kahn networks (dataflow).
• Subsumes (det) concurrent logic programming, lazy functional programming

Do not support arbitrary mutable variables.

Page 8: Determinate Imperative Programming:  The CF Model

Determinate Concurrent Imperative Frameworks

Safe Asynchrony (Steele 1991)
• Parent may communicate with children.
• Children may communicate with parent.
• Siblings may communicate with each other only through commutative, associative writes (“commuting writes”).

Good:

    int x=0;
    finish foreach (int i in 1:N) {
        x += i;
    }
    print(x); // N*(N+1)/2

Bad:

    int x=0;
    finish foreach (int i in 1:N) {
        x += i;
        async print(x);
    }

Useful but limited. Does not permit dataflow synch.
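The difference between the good and the bad program can be illustrated by enumerating sibling orders. The Python sketch below is a simplified simulation, not Steele's formalism: `run` is a hypothetical helper, and it assumes that each `print(x)` in the bad program observes the value right after its own sibling's write.

```python
from itertools import permutations


def run(order):
    """Apply `x += i` in the given sibling order.

    Returns the final value and the intermediate values a read placed
    after each write could observe.
    """
    x = 0
    observed = []
    for i in order:
        x += i
        observed.append(x)  # what the "bad" program's print could see
    return x, tuple(observed)


n = 4
results = [run(order) for order in permutations(range(1, n + 1))]
finals = {final for final, _ in results}
traces = {trace for _, trace in results}
print(finals)           # a single final value: the writes commute
print(len(traces) > 1)  # intermediate reads depend on the order
```

The final value is the same for every order, which is why the read after `finish` is safe, while the intermediate observations differ from order to order.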

Page 9: Determinate Imperative Programming:  The CF Model

The CF Basic model

A shared variable is a stream of immutable values.

Each activity maintains an index i + clean/dirty bit for every shared variable.
• Initially i=1, v[0] contains initial value.
• Read: If clean, block until v[i] is written and return v[i++], else return v[i-1]. Mark as clean.
• Write: Write into v[i++]. Mark as dirty.
• A read stutters (returns value in last phase) if no activity can write in this phase. E.g. for local variables.

World Map = Collection of indices for an activity.

Index transmission rules
• Activity initialized with current world map of parent activity.
• On finish, world map of activity is lubbed (joined by least upper bound) with world map of finished activities. (clean lub dirty = clean)

All programs are determinate and scheduler independent.
• May deadlock … nexts are not conjunctive.

The clock of clocked final is made implicit.
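The read and write rules can be sketched with threads and a condition variable. This is a minimal illustration, not the authors' implementation: `SharedVar`, `View` and `run_two_activities` are hypothetical names, and the stutter rule and the world-map lub on finish are left out. It also assumes that the bit starts out dirty, as if the initializing write had just happened, because that reproduces the deck's own two-activity example (one activity reads x twice, the other writes 1 and then 2).

```python
import threading


class SharedVar:
    """A shared variable modelled as a stream of immutable values."""

    def __init__(self, initial):
        self.values = {0: initial}  # v[0] holds the initial value
        self.cond = threading.Condition()


class View:
    """One activity's index and clean/dirty bit for one shared variable."""

    def __init__(self, var, index=1, dirty=True):  # dirty start is an assumption
        self.var, self.index, self.dirty = var, index, dirty

    def read(self):
        with self.var.cond:
            if self.dirty:
                self.dirty = False  # mark as clean
                return self.var.values[self.index - 1]
            # Clean: block until v[i] is written, then return v[i++].
            self.var.cond.wait_for(lambda: self.index in self.var.values)
            value = self.var.values[self.index]
            self.index += 1
            return value

    def write(self, value):
        with self.var.cond:
            self.var.values[self.index] = value  # write into v[i++]
            self.index += 1
            self.dirty = True
            self.var.cond.notify_all()


def run_two_activities():
    x = SharedVar(0)
    printed = []

    def reader():
        view = View(x)
        r1 = view.read()
        r2 = view.read()
        printed.extend([r1, r2])

    def writer():
        view = View(x)
        view.write(1)
        view.write(2)

    threads = [threading.Thread(target=f) for f in (reader, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed


print(run_two_activities())
```

Whichever thread runs first, the reader's second read waits for index 1 of the stream, so the printed pair does not depend on the scheduler.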

Page 10: Determinate Imperative Programming:  The CF Model

CF example: Array relaxation

    shared int [0:M-1,0:N-1] G = …;
    // Interior cells only, so that G[i+1,j] and G[i,j+1] stay in bounds.
    finish foreach (int i,j in [1:M-2,1:N-2]) {
        for (int p in [0:TimeStep-1]) {
            G[i,j] = omega/4*(G[i-1,j]+G[i+1,j]+G[i,j-1]+G[i,j+1])+(1-omega)*G[i,j];
        }
    }

All clock manipulations are implicit.

Page 11: Determinate Imperative Programming:  The CF Model

Some simple examples

    shared int x=0;
    finish {
        async {int r1 = x; int r2 = x; println(r1); println(r2);}
        async {x=1;x=2;}
    }

Output:

    0
    1

Only one result, independent of the scheduler!

    i   x   A1        A2
    0   0   read r1
    1   1   read r2   write 1
    2   2             write 2

Page 12: Determinate Imperative Programming:  The CF Model

Some simple examples

    shared int x=0;
    finish {
        async {int r1 = x; int r2 = x; println(r1); println(r2);}
        async {x=1;}
        async {x=1; int r3 = x; async {x=2;}}
    }
    println(x);

Output:

    0
    1
    2

All programs are determinate.

    i   x   A1 (0)    A2 (0)    A3 (0)             A4 (2)
    0   0   read r1
    1   1   read r2   write 1   write 1; read r3
    2   2                                          write 2

Page 13: Determinate Imperative Programming:  The CF Model

Some StreamIt examples

StreamIt:

    void->void pipeline Minimal {
        add IntSource;
        add IntPrinter;
    }
    void->int filter IntSource {
        int x;
        init {x=0;}
        work push 1 { push(x++);}
    }
    int->void filter IntPrinter {
        work pop 1 { print(pop());}
    }

Output begins:

    0
    1

X10/CF:

    shared int x=0;
    async while (true) x++;
    async while (true) println(x);

The communication is through assignment to x, so the same result is obtained with:

    shared int x=0;
    async while (true) ++x;
    async while (true) println(x);

Output begins:

    0
    1

Each shared variable is a multi-reader, multi-writer stream.

Page 14: Determinate Imperative Programming:  The CF Model

Some StreamIt examples: fibonacci

    shared int x=1, y=1;
    async while (true) y=x;    // Activity 1
    async while (true) x+=y;   // Activity 2

    i   y   x
    0   1   1
    1   1   2
    2   2   3
    3   3   5
    …   …   …

Can express any recursive, asynchronous Kahn network.
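The table can be read as a pair of stream recurrences: at index i, Activity 1 writes y[i] = x[i-1] and Activity 2 writes x[i] = x[i-1] + y[i-1]. The Python sketch below simply evaluates those recurrences, which are inferred from the table rather than stated on the slide. `fib_streams` is a hypothetical name.

```python
def fib_streams(phases):
    """Return the first `phases` values of the y and x streams."""
    x, y = [1], [1]  # index 0 holds the initial values
    for i in range(1, phases):
        y.append(x[i - 1])             # Activity 1: y = x
        x.append(x[i - 1] + y[i - 1])  # Activity 2: x += y
    return y, x


y, x = fib_streams(4)
print(y)  # [1, 1, 2, 3]
print(x)  # [1, 2, 3, 5]
```

Each value depends only on earlier indices of the two streams, so no scheduling choice enters the computation.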

Page 15: Determinate Imperative Programming:  The CF Model

StreamIt examples: Moving Average

StreamIt:

    void->void pipeline MovingAverage {
        add IntSource();
        add Averager(10);
        add IntPrinter();
    }
    int->int filter Averager(int n) {
        work pop 1 push 1 peek n {
            int sum=0;
            for (int i=0; i < n; i++)
                sum += peek(i);
            push(sum/n);
            pop();
        }
    }

X10/CF:

    shared int y=0;
    shared int x=0;
    async while (true) x++;
    async while (true) {
        int sum=x;
        for (int i in 1:N-1) sum += peek(x, i);
        y = sum/N;
    }

• peek(x, i) reads the i’th future value, without popping it. Blocks if necessary.
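The pop-one, peek-n pattern amounts to a sliding window over the input stream. The Python sketch below shows that pattern with a generator; it is an analogy, not CF or StreamIt semantics. `moving_average` is a hypothetical name, and `itertools.count(0)` stands in for the stream 0, 1, 2, … produced by `x++`.

```python
from collections import deque
from itertools import count, islice


def moving_average(source, n):
    """Yield the integer mean of every window of n consecutive values."""
    window = deque()
    for value in source:
        window.append(value)        # later entries play the role of peek(x, i)
        if len(window) == n:
            yield sum(window) // n  # integer mean, like sum/N on ints
            window.popleft()        # consume only the oldest value


first = list(islice(moving_average(count(0), 10), 3))
print(first)  # [4, 5, 6]
```

Only one value leaves the window per output, so each input is read by n successive averages, which is what peeking without popping provides.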

Page 16: Determinate Imperative Programming:  The CF Model

StreamIt examples: Bandpass filter

StreamIt:

    float->float pipeline BandPassFilter(float rate, float low, float high, int taps) {
        add BPFCore(rate, low, high, taps);
        add Subtracter();
    }
    float->float splitjoin BPFCore(float rate, float low, float high, int taps) {
        split duplicate;
        add LowPass(rate, low, taps, 0);
        add LowPass(rate, high, taps, 0);
        join roundrobin;
    }
    float->float filter Subtracter {
        work pop 2 push 1 {
            push(peek(1)-peek(0));
            pop(); pop();
        }
    }

X10/CF:

    float bandPassFilter(float rate, float low, float high, int taps, int in) {
        int tmp=in;
        shared int in1=tmp, in2=tmp;
        async while (true) in1=in;
        async while (true) in2=in;
        shared int o1 = lowPass(rate, low, taps, 0, in1),
                   o2 = lowPass(rate, high, taps, 0, in2);
        shared int o = o1-o2;
        async while(true) o = o1-o2;
        return o;
    }

Functions return streams.

Page 17: Determinate Imperative Programming:  The CF Model

Cannon matrix multiplication

    <final int N> void canon (double[N,N] c, double[N,N] a, double[N,N] b) {
        // Initial skew: row i of a shifts left by i, column j of b shifts up by j.
        finish foreach (int i,j in [0:N-1,0:N-1]) {
            a[i,j] = a[i,(j+i)%N];
            b[i,j] = b[(i+j)%N, j];
        }
        for (int k in [0:N-1])
            finish foreach (int i,j in [0:N-1,0:N-1]) {
                c[i,j] = c[i,j] + a[i,j]*b[i,j];
                a[i,j] = a[i,(j+1)%N];
                b[i,j] = b[(i+1)%N, j];
            }
    }

• Local variables in each activity.
• Parameters whose values are finalized.
• The natural sequential program works (for finish foreach).
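The shift-and-multiply structure can be checked with a phase-by-phase sketch in which every step builds new arrays from the previous ones, so reads of neighbouring cells always see the previous phase. This Python version is an illustration under that snapshot assumption, not CF itself. `cannon` is a hypothetical name, the initial skew is the standard one for Cannon's algorithm (row i of a by i, column j of b by j), and c starts from zero instead of being passed in.

```python
def cannon(a, b):
    """Multiply two n-by-n matrices with Cannon's shift-and-multiply scheme."""
    n = len(a)
    # Initial skew: row i of a shifts left by i, column j of b shifts up by j.
    a = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every cell accumulates its local product for this phase...
        c = [[c[i][j] + a[i][j] * b[i][j] for j in range(n)] for i in range(n)]
        # ...then a shifts left by one and b shifts up by one.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


print(cannon([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Building each phase from a snapshot of the previous one is what lets the per-cell body be written as ordinary sequential assignments.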

Page 18: Determinate Imperative Programming:  The CF Model

Histogram

• Permit “commuting” writes to be performed simultaneously in the same phase.
• Phase is completed when all activities that can write have written.

    <int N> [1:N][] histogram([1:N][] A) {
        final int[] B = new int [1:N];
        finish foreach(int i in A) B[A[i]]++;
        return B;   // B’s phase is not yet complete. A subsequent read will complete it.
    }

Page 19: Determinate Imperative Programming:  The CF Model

Cilk programs with races

    int x;
    cilk void foo() {
        x = x + 1;
    }
    cilk int main() {
        x = 0;
        spawn foo();
        spawn foo();
        sync;
        printf("x is %d\n", x);
        return 0;
    }

Determinate: Will always print 1 in CF.

CF smoothly combines Cilk and StreamIt.
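The contrast with ordinary shared memory can be made concrete by enumerating every interleaving of the two `foo` activities. The Python sketch below is a simplified simulation with hypothetical names (`interleavings`, `run_plain`, `run_cf`). It assumes each activity inherits index 1 and a dirty bit from `main` after `x=0`, so its single read returns v[0], and that the read after `sync` returns the latest written value.

```python
from itertools import permutations

STEPS = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]


def interleavings():
    """All orders of the four steps that keep each activity's read before its write."""
    for order in permutations(STEPS):
        if (order.index(("A", "read")) < order.index(("A", "write"))
                and order.index(("B", "read")) < order.index(("B", "write"))):
            yield order


def run_plain(order):
    """Ordinary shared memory: one cell, last write wins."""
    x, local = 0, {}
    for who, op in order:
        if op == "read":
            local[who] = x
        else:
            x = local[who] + 1
    return x


def run_cf(order):
    """CF-style: x is a stream, and each activity has its own index."""
    stream = {0: 0}
    index = {"A": 1, "B": 1}
    local = {}
    for who, op in order:
        if op == "read":
            local[who] = stream[index[who] - 1]  # dirty read: v[i-1]
        else:
            stream[index[who]] = local[who] + 1  # write into v[i++]
            index[who] += 1
    return stream[max(index.values()) - 1]       # value seen after sync


plain = {run_plain(order) for order in interleavings()}
cf = {run_cf(order) for order in interleavings()}
print(plain)  # {1, 2}
print(cf)     # {1}
```

Under these assumptions both activities read index 0 and write the same value into index 1, so no interleaving can change the result.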

Page 20: Determinate Imperative Programming:  The CF Model

Implementation

• Each activity’s world map increases monotonically with time.
• Use garbage collection to erase past unreachable values.
• Programs with no sibling communication may be executed in buffers with unit windows.
• Considering permitting user to specify bounds on variables (cf. push/pop specifications in StreamIt). This will force writes to become blocking as well.
• Scheduling strategy affects size of buffers, not result.

Page 21: Determinate Imperative Programming:  The CF Model

Formalization

MJ/CF
• Very straightforward additions to field read/write.
• Paper contains details.

Surprisingly localized.

Page 22: Determinate Imperative Programming:  The CF Model

Future work

• Paper contains ideas on detecting deadlock (stabilities) at runtime and recovering from them. Programmability being investigated.
• Implementation. Leverage connection with StreamIt, and static scheduling.
• Coarser granularity for indices. Use same clock for many variables. Permits “coordinated” changes to multiple variables.