cs 294-8 abstraction functions cs.berkeley/~yelick/294

CS294, Yelick Abstract. Funs., p1

CS 294-8: Abstraction Functions

http://www.cs.berkeley.edu/~yelick/294


Agenda
• Administrivia
• Review of abstraction functions for the memory example
• History variables
• Prophecy variables
• General discussion


Administrivia
• Dawson Engler speaking Thursday
  – OSDI paper online
• Final projects:
  – Send mail to schedule meeting with me
  – Poster session:
    • Scheduled 12/7. Too early? With 262?
  – Papers due Friday 12/15
• Homework 3


History on Abstraction Functions

• Used since the 1970s for reasoning about datatypes, e.g., Tony Hoare’s paper
  – Used with “representation invariants” by Liskov and others
• Abadi and Lamport formalized their use in concurrent systems:
  – When do abstraction functions exist?
  – Formalized history variables
• Example from Herlihy and Wing paper demonstrated need for prophecy variables


Execution Model Reminder
• A Spec Module defines a state machine
• An execution fragment is a sequence s0 π0 s1 π1 s2 …, alternating states si and actions πi
• An execution starts in an initial state
• Steps are written as (si, πi, si+1)


Some Executions of WBCache

[Figure: an abstract execution of the specification paired with a concrete execution of the implementation. Step labels include init, write(2,a), (read(2),a), write(4,c), and (read(3),a); the memory and cache contents drawn for each state are not recoverable from this text.]


Abstraction Function for WBCache

FUNC AF() -> M = RET (LAMBDA (a) -> D = IF c!a => c(a) [*] m(a) FI)

• Note that the abstraction function maps each state in the previous WBCache execution to a Memory state
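The abstraction function above can be read as "the cache wins wherever it is defined". A minimal Python sketch, assuming the cache c and the memory m are modeled as dictionaries from addresses to data; the function name `abstraction` and the sample contents are illustrative and not taken from the notes:

```python
def abstraction(c, m):
    """Map a WBCache state (cache c, memory m) to an abstract Memory state.

    Mirrors AF: for each address a, use c(a) if c is defined at a, else m(a).
    """
    abstract = dict(m)      # start from main memory
    abstract.update(c)      # cached entries override memory
    return abstract


# Hypothetical state: address 2 holds a dirty value in the cache.
m = {1: "A", 2: "B", 3: "C"}
c = {2: "A"}
abstract_memory = abstraction(c, m)
print(abstract_memory)
```

With an empty cache the abstract state is just the memory, which is why flushing a cache entry back to memory leaves the abstract state unchanged.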


Abstraction Function (Def 1)
• An abstraction function F: T -> S has:
  – If t is any initial state of T, then F(t) is an initial state of S
  – If t is a reachable state of T and (t, π, t’) is a step of T, then there is a step of S from F(t) to F(t’) having the same trace
• Same trace means the same externally visible values.
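For finite machines, the two conditions of Def 1 can be checked mechanically. The sketch below is one way to do it, under the assumption that a machine is given as a set of initial states and a set of (state, action, next_state) triples, with the action label standing in for the externally visible trace of a step. The function name and the mod-4 counter example are hypothetical.

```python
def is_abstraction_function(F, t_init, t_steps, s_init, s_steps):
    """Check Def 1 for a candidate mapping F (a dict from T states to S states)."""
    # Condition 1: initial states map to initial states.
    if any(F[t] not in s_init for t in t_init):
        return False
    # Compute the reachable states of T.
    reachable, frontier = set(t_init), list(t_init)
    while frontier:
        t = frontier.pop()
        for (a, act, b) in t_steps:
            if a == t and b not in reachable:
                reachable.add(b)
                frontier.append(b)
    # Condition 2: every step from a reachable state is matched by a step of S
    # with the same action label.
    return all((F[a], act, F[b]) in s_steps
               for (a, act, b) in t_steps if a in reachable)


# Toy example: T counts modulo 4, S only tracks parity.
t_steps = {(i, "tick", (i + 1) % 4) for i in range(4)}
s_steps = {("even", "tick", "odd"), ("odd", "tick", "even")}
F = {0: "even", 1: "odd", 2: "even", 3: "odd"}
ok = is_abstraction_function(F, {0}, t_steps, {"even"}, s_steps)
bad = is_abstraction_function({**F, 1: "even"}, {0}, t_steps, {"even"}, s_steps)
```

Restricting the second check to reachable states is what lets the mapping be arbitrary on states the implementation never enters.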


Representation Invariants
• The abstraction function need not be defined on every value of the concrete state, only those reachable in the implementation
• Example: implementations of a set
  – A sorted array
  – Or, an unsorted array without duplicates
• Not every array is a legal set representation
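The set example can be sketched as a pair of representation invariants plus one abstraction function. This is an illustration only: the function names are made up, Python lists stand in for arrays, and the sorted variant is assumed to allow repeated elements since the slide does not say otherwise.

```python
def rep_ok_sorted(arr):
    """Representation invariant for the sorted-array implementation."""
    return all(arr[i] <= arr[i + 1] for i in range(len(arr) - 1))


def rep_ok_unsorted(arr):
    """Representation invariant for the unsorted, duplicate-free array."""
    return len(arr) == len(set(arr))


def abstraction(arr):
    """Abstraction function: only meaningful when the chosen invariant holds."""
    return frozenset(arr)


legal_sorted = rep_ok_sorted([1, 3, 5])
illegal_sorted = rep_ok_sorted([3, 1, 5])
legal_unsorted = rep_ok_unsorted([3, 1, 5])
illegal_unsorted = rep_ok_unsorted([3, 1, 3])
same_set = abstraction([3, 1, 5]) == abstraction([1, 3, 5])
```

The array [3, 1, 5] is illegal for one implementation and legal for the other, yet both legal arrays map to the same abstract set.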


Modeling Failures in Spec
• A “crash” can happen between any two atomic actions
  – Volatile state reset
  – Stable state unaffected
• Add a Crash procedure to a module
  – Need not be atomic; invoked when there’s a crash
  – Does a “CRASH” command, which stops current (non-atomic) executions. Nothing else can be invoked until Crash returns.
  – Crash may do other things after CRASH cmd
  – Normal operation resumes after Crash returns


Disk Example
• In the Disk example (H7, p4):
  – Write operations are
    • Ordered
    • Not atomic (although each block write is)
  – There is no global volatile state
    • Crash just executes “CRASH”





Example 1: Statistical DB Type
• Given a “statistical DB” Spec with the operations
  – Add(v): add a new number, v, to the DB
  – Size(): report the number of elements in the DB
  – mean(): report the mean of all elements in the DB
  – variance(): report the variance of all elements in the DB
• Notes have a parameter for the DB element type (V); for simplicity, I’ll use “real.”


Example 1: Statistical DB Type
• Implementation 1:
  – Keep set of all values in the database
  – Mean and variance are computed when needed
• Implementation 2 (optimized):
  – Use only three values:
    • integer count, initially 0     // number of elements
    • float sum, initially 0         // sum of elements
    • float sumSquare, initially 0   // sum of squares of all elements
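The two implementations can be put side by side as a small sketch. It assumes the population variance (mean of squares minus square of the mean), which the slide does not specify, and it does not guard against an empty database; the class names are illustrative.

```python
class StatDB1:
    """Implementation 1: keep every value, compute statistics on demand."""

    def __init__(self):
        self.values = []

    def add(self, v):
        self.values.append(v)

    def size(self):
        return len(self.values)

    def mean(self):
        return sum(self.values) / len(self.values)

    def variance(self):
        mu = self.mean()
        return sum((x - mu) ** 2 for x in self.values) / len(self.values)


class StatDB2:
    """Implementation 2: keep only count, sum, and sum of squares."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0
        self.sum_square = 0.0

    def add(self, v):
        self.count += 1
        self.sum += v
        self.sum_square += v * v

    def size(self):
        return self.count

    def mean(self):
        return self.sum / self.count

    def variance(self):
        # Mean of the squares minus the square of the mean.
        return self.sum_square / self.count - self.mean() ** 2


a, b = StatDB1(), StatDB2()
for v in [2.0, 4.0, 4.0, 6.0]:
    a.add(v)
    b.add(v)
```

Both objects answer every query identically, but only the first can reproduce the individual values that were added.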


History Variables

• Problem: the specification contains more info than the implementation
  – Specifically, one can’t recover the values in the db from the 3 state variables
• Idea: add some phantom variables to the state of the implementation.
  – Only for the proof
  – The operations can update this “phantom” state, but cannot change their behavior based on it.


Proof of 2nd Implementation
• Proof idea:
  – add a variable db to the representation state (for the proof only)
  – the implementation may update db
• Augmented “implementation” has:

VAR count     := 0
    sum       := 0
    sumSquare := 0
    db        : SEQ real := {}

APROC Add(v) = << count +:= 1; sum +:= v; sumSquare +:= v*v; db +:= {v}; RET >>


Proof of 2nd Implementation
• Proof: The abstraction function for the augmented DB maps the db field to the abstract state
• Need to prove the representation invariants:
  – count = db.size
  – sum = sum({x | x in db})
  – sumSquare = sum({x*x | x in db})
• Invariants prove that the operations behave correctly, e.g., Size returns the right value.
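A sketch of the augmented implementation, with the three invariants written as a runtime check. The class and method names are hypothetical; `db` plays the role of the history variable and is written by `add` but never read by any operation that returns a value.

```python
import math


class AugmentedStatDB:
    """Implementation 2 plus a history variable db used only for the proof."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0
        self.sum_square = 0.0
        self.db = []                # history variable (phantom state)

    def add(self, v):
        self.count += 1
        self.sum += v
        self.sum_square += v * v
        self.db.append(v)           # update phantom state only

    def size(self):
        return self.count           # never consults self.db

    def invariant(self):
        """The three representation invariants from the proof."""
        return (self.count == len(self.db)
                and math.isclose(self.sum, sum(self.db))
                and math.isclose(self.sum_square, sum(x * x for x in self.db)))

    def abstraction(self):
        """Abstraction function: the db field is the abstract state."""
        return list(self.db)


s = AugmentedStatDB()
holds_initially = s.invariant()
for v in [1.5, 2.5, -3.0]:
    s.add(v)
holds_after_adds = s.invariant()
```

Deleting every line that mentions `db` leaves the original Implementation 2, which is the sense in which the variable exists only for the proof.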


History Variables
• In general, we can augment an implementation with history variables such that:
  – Every initial state of the original machine has a corresponding state with some initial value for the history variables
  – No existing step is disabled by additional predicates on history variables
  – A value assigned to an existing component must not depend on the value of a history variable (e.g., return values).
• Note: the statDB example is extreme, since the entire spec state is added


Examples of History Variables

• Why do history variables arise?
  – To simplify the specifications
  – To optimize the implementations
• More realistic examples:
  – Web search
    • Spec talks about the state of the web: “search” looks at arbitrary subset
    • Implementation cannot reproduce state of failed nodes, except by “storing” lost state (may be phantom “history” vars)
  – Others?


Abstraction Relations
• An alternative to history variables is to use an “abstraction relation” AR
• AR maps each concrete state to a set of possible spec states
• For this example, AR maps the 3 values to the set of all db’s having the given size, sum, and sum-of-squares.
• It’s a matter of proof style and taste.
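For the statistical DB, membership in the abstraction relation is a simple predicate. The sketch below assumes a db is modeled as a Python list and uses a floating-point tolerance; the function name is illustrative.

```python
import math


def in_abstraction_relation(count, total, total_sq, db):
    """True if db is one of the spec states related to (count, sum, sumSquare)."""
    return (len(db) == count
            and math.isclose(sum(db), total)
            and math.isclose(sum(x * x for x in db), total_sq))


# Two different databases related to the same concrete state (3, 12, 62).
related_1 = in_abstraction_relation(3, 12.0, 62.0, [1, 5, 6])
related_2 = in_abstraction_relation(3, 12.0, 62.0, [2, 3, 7])
unrelated = in_abstraction_relation(3, 12.0, 62.0, [4, 4, 4])
```

Because [1, 5, 6] and [2, 3, 7] share size, sum, and sum of squares, no function of the three concrete values alone could pick out the database that was actually built.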


Stuttering Transitions
• Recall that Lamport and Abadi considered any 2 executions equivalent if they are the same once “no ops” are erased
  – I.e., steps of the form (s, π, s)
• Intuition: a single high level operation (e.g., transaction) may be implemented by several smaller steps (atomic in the impl.)
• A generalized abstraction function allows for 1 step of T to correspond to 0 or more than 1 steps of S


Abstraction Function (Def 2)
• A generalized abstraction function F: T -> S has:
  – If t is any initial state of T, then F(t) is an initial state of S
  – If t is a reachable state of T and (t, π, t’) is a step of T, then there is an execution fragment (0 or more steps) of S from F(t) to F(t’) having the same trace





Example 2: NonDet (Toy)
• Specification:

VAR j := 0
APROC Out() -> Int = << IF j = 0 => BEGIN j := 2 [] j := 3 END; RET 1 [*] RET j FI >>

• Implementation:

VAR j := 0
APROC Out() -> Int = << IF j = 0 => j := 1 [*] j = 1 => BEGIN j := 2 [] j := 3 END [*] SKIP FI; RET j >>

• Both have traces: 1, 2, 2, 2, … and 1, 3, 3, 3, …
• Do we have AF’s in both directions? Notes say:
  • The “spec” implements the “impl” using the identity abstraction function
  • The reverse AF can’t be defined
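The claim that both machines have the same traces can be checked by brute-force enumeration up to a bounded length. The sketch below is one reading of the two Out procedures: the specification chooses 2 or 3 on the first call, the implementation on the second. The function names are hypothetical, and the code says nothing about which abstraction functions exist.

```python
def spec_traces(n):
    """All length-n output traces of the specification (early choice)."""
    traces = set()

    def run(j, trace):
        if len(trace) == n:
            traces.add(tuple(trace))
            return
        if j == 0:
            for choice in (2, 3):           # j := 2 [] j := 3, then RET 1
                run(choice, trace + [1])
        else:
            run(j, trace + [j])             # RET j

    run(0, [])
    return traces


def impl_traces(n):
    """All length-n output traces of the implementation (late choice)."""
    traces = set()

    def run(j, trace):
        if len(trace) == n:
            traces.add(tuple(trace))
            return
        if j == 0:
            run(1, trace + [1])             # j := 1; RET j
        elif j == 1:
            for choice in (2, 3):           # j := 2 [] j := 3; RET j
                run(choice, trace + [choice])
        else:
            run(j, trace + [j])             # SKIP; RET j

    run(0, [])
    return traces


same = spec_traces(4) == impl_traces(4)
```

After the first call the specification already holds 2 or 3 in j while the implementation holds 1, so the two machines agree on traces but not on when the choice is recorded in the state.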


Prophecy Variables
• We can augment an implementation T with prophecy variables to produce TP such that:
  – Every state of T has a corresponding state with some value for the prophecy variables
  – No existing step is disabled in the backward direction by additional predicates on prophecy variables.
    • For each step (t, π, t’) of T and state (t’, p’) of TP, there must be a value p of the prophecy variable(s), such that ((t, p), π, (t’, p’)) is a step of TP.
  – A value assigned to an existing component must not depend on the value of a prophecy variable (e.g., return values).
  – If t is an initial state of T and (t, p) is a state of TP, then (t, p) must be an initial state of TP


Example 3: Reliable Messages

MODULE ReliableMsg [M] EXPORT Put, Get, Crash =
VAR q : SEQ M := {}

APROC Put(m) = << q +:= {m} >>
APROC Get() -> M = << VAR m := q.head | q := q.tail; RET m >>
APROC Crash() = << VAR q’ | subseq(q’, q) => q := q’ >>

• Problem: don’t know which messages will be lost at the time of a crash
  – ensure FIFO delivery
  – eliminate duplicates from retransmission
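The ReliableMsg module can be modeled directly, with the nondeterministic choice in Crash replaced by a random one. This is a sketch under that assumption; the class layout, the `rng` parameter, and the helper `is_subseq` are illustrative, and `get` simply raises on an empty queue.

```python
import random


class ReliableMsg:
    """Model of the ReliableMsg spec: a FIFO queue that may lose messages on crash."""

    def __init__(self):
        self.q = []

    def put(self, m):
        self.q.append(m)

    def get(self):
        m, self.q = self.q[0], self.q[1:]   # head and tail; raises if q is empty
        return m

    def crash(self, rng=random):
        # Keep an arbitrary subsequence of q: each message independently survives or not.
        self.q = [m for m in self.q if rng.random() < 0.5]


def is_subseq(sub, seq):
    """True if sub can be obtained from seq by deleting elements."""
    it = iter(seq)
    return all(x in it for x in sub)


r = ReliableMsg()
for m in ["a", "b", "c", "d"]:
    r.put(m)
before = list(r.q)
r.crash(random.Random(0))
survivors_in_order = is_subseq(r.q, before)
```

Whatever survives the crash is still delivered in FIFO order, but nothing in the state before the crash says which messages those will be.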


Example 4: Queue
• Given a queue with operations
  – Enq: add an element to the back of the queue
  – Deq: remove the element at the front of the queue and return it
• (Abbreviated E and D on the next slide)
• (Two processes, A and B)


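Some Queue Histories
• History 1, acceptable
    A: q.E(x)   q.D(y)   q.E(z)
    B: q.E(y)   q.D(x)
• History 2, not acceptable
    A: q.E(x)   q.D(y)
    B: q.E(y)
• History 3, not acceptable
    A: q.E(x)   q.D(y)
    B: q.E(y)   q.D(y)

[Figure: each history was drawn as a timeline per process; the relative timing and overlap of the operations is not recoverable from this text.]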


Example 4A: Queue with Locks
• Given a queue implementation containing
  – integers, back, front
  – an array of values, items
  – a lock, l

Enq = proc (q: queue, x: item)         // ignoring buffer overflow
    lock(l)
    i: int = q.back++                  // allocate new slot
    q.items[i] = x                     // fill it
    unlock(l)

Deq = proc (q: queue) returns (item) signals empty
    lock(l)
    if (q.back == q.front)
        unlock(l)                      // release the lock before signaling
        signal empty
    else
        ret: item = q.items[q.front]   // read the front slot, then advance
        q.front++
        unlock(l)
        return(ret)


Some Queue Histories
• History 1, acceptable
    A: q.E(x)   q.D(y)   q.E(z)
    B: q.E(y)   q.D(x)
• Process A got the lock first during first Enq’s
• Why didn’t A return immediately after releasing the lock?


Simple Abstraction Function
• The abstraction function maps the elements items[front…back] to the abstract queue value
• Proof is straightforward
• The lock prevents the “interesting” cases
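A Python sketch of the locked queue together with its abstraction function. It assumes that front and back both start at 0, that a slot is read before front advances, and that `threading.Lock` stands in for the lock l; the class and method names are hypothetical.

```python
import threading


class LockedQueue:
    """Queue with locks: every operation runs entirely under one lock."""

    def __init__(self):
        self.items = []
        self.front = 0
        self.back = 0
        self.lock = threading.Lock()

    def enq(self, x):
        with self.lock:
            self.items.append(x)        # fills slot self.back
            self.back += 1

    def deq(self):
        with self.lock:
            if self.back == self.front:
                raise IndexError("empty")
            ret = self.items[self.front]
            self.front += 1
            return ret

    def abstraction(self):
        """Abstract queue value: the slots from front up to (not including) back."""
        with self.lock:
            return self.items[self.front:self.back]


q = LockedQueue()


def worker(tag):
    for i in range(100):
        q.enq((tag, i))


threads = [threading.Thread(target=worker, args=(t,)) for t in "AB"]
for t in threads:
    t.start()
for t in threads:
    t.join()
contents = q.abstraction()
```

Because each operation holds the lock for its whole body, every concrete state seen between operations maps to a definite abstract queue, and each process's elements appear in the order it enqueued them.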


Example 4B: Queue with Atomic Ops

• Given a queue implementation containing
  – an integer, back
  – an array of values, items

Enq = proc (q: queue, x: item)
    i: int = INC(q.back)                    // allocate new slot, atomic
    STORE(q.items[i], x)                    // fill it

Deq = proc (q: queue) returns (item)
    while true do
        range: int = READ(q.back) - 1
        for i: int in 1..range do
            x: item = SWAP(q.items[i], null)
            if x != null then return x


Queue Example Notes
• Several atomic operations are defined
  – STORE, SWAP, INC
  – These may or may not be supported on given hardware, which would change the proof
• The deq operation starts at the front end of the queue
  – slots already dequeued will show up as nulls
  – slots not yet filled will also be nulls
  – picks the first non-empty slot
  – will repeat scan until it finds an element, waiting for an enqueue to happen if necessary
• Many inefficiencies, such as the lack of a head pointer. Example is to illustrate proof technique.


Need for Prophecy Variables
• Abstraction function
  – for this example, a prophecy variable is needed
  – Two processes, A and B (1 implicit queue)

Enq(x)                   A
Enq(y)                   B
INC(q.back)              A
INC(q.back)              B
STORE(q.items[2], y)     B
Enq(y) returns on B

• For this execution, there is no way of defining an abstraction function without predicting the future, i.e., whether x or y will be dequeued first
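The execution above can be replayed step by step to see both possible futures. The sketch below models each atomic operation as a method call and assumes that back starts at 1 and that INC returns the old value, which matches B storing into slot 2. The class, the single-pass `deq_scan`, and the helper `prefix` are illustrative.

```python
class HWQueue:
    """Step-level model of the atomic-operation queue (slots are 1-indexed)."""

    def __init__(self):
        self.back = 1
        self.items = {}                     # slot -> value; a missing slot is null

    def inc_back(self):                     # INC: return old value, then increment
        i = self.back
        self.back += 1
        return i

    def store(self, i, x):                  # STORE
        self.items[i] = x

    def deq_scan(self):
        """One pass of Deq's scan; returns None if every slot is null."""
        for i in range(1, self.back):       # 1 .. READ(back) - 1
            x = self.items.pop(i, None)     # SWAP(items[i], null)
            if x is not None:
                return x
        return None


def prefix():
    """The execution from the slide, up to 'Enq(y) returns on B'."""
    q = HWQueue()
    slot_a = q.inc_back()                   # A: INC(q.back), gets slot 1
    slot_b = q.inc_back()                   # B: INC(q.back), gets slot 2
    q.store(slot_b, "y")                    # B: STORE(q.items[2], y)
    return q, slot_a


# Future 1: a dequeue runs before A's STORE, so y comes out first.
q1, slot_a1 = prefix()
first_1 = q1.deq_scan()

# Future 2: A's STORE happens before the dequeue, so x comes out first.
q2, slot_a2 = prefix()
q2.store(slot_a2, "x")
first_2 = q2.deq_scan()
```

Both futures start from the identical concrete state, so an abstraction function evaluated at that state would have to commit to an order of x and y that only later steps determine.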


Existence of Abstraction Functions

There are three cases that arise in trying to prove that an abstraction function exists in a concurrent system

• The function can be defined directly on the implementation state

• A history variable needs to be added to the implementation to record a past event

• A prophecy variable needs to be added to the implementation to record a future event

• Alternatively, you may use an abstraction relation

[Figure: diagram relating a concrete step from t to t’ to the abstract states A(t) and A(t’).]


General Discussion
• Examples of distributed programs that are challenges to specify
  – Bayou consistency model
  – Oceanstore meta-consistency model
  – Others
• Criteria: when is a spec good enough?
• Examples of algorithms hard to verify
• Examples of programs hard to verify
