making proof-based verified computation almost practical michael walfish the university of texas at...

Making proof-based verified computation almost practical

Michael Walfish

The University of Texas at Austin

The motivation is 3rd party computing: cloud, volunteers, etc.

We desire the following properties in the above exchange:

1. Unconditional, meaning no assumptions about the server

2. General-purpose, meaning not specialized to a particular f

3. Practical, or at least conceivably practical soon

“f”, x

y, aux.client server

check whether y = f(x), without

computing f(x)

By verified computation, we mean the following:

REJECT

ACCEPT

Unfortunately, the constants and proof length are outrageous.

“f”, xy

client servery’

...

...

Theory can supposedly help. Consider the theory of Probabilistically Checkable Proofs (PCPs). [ALMSS92,

AS92]

Using a naive PCP implementation, verifying multiplication of 500×500 matrices would take 500+ trillion CPU years (seriously).

We have reduced the costs of a PCP-based argument system [Ishai et al., CCC07] by 20+ orders of magnitude, with proof.

We have implemented the refinements in a system that is not ready for prime time but is practical in some cases.

We believe that PCP-based machinery will be a key tool for building secure systems in the near future.

S. Setty, B. Braun, V. Vu, A. J. Blumberg, B. Parno, and M. Walfish. Resolving the conflict between generality and plausibility in verified computation. Cryptology ePrint 622, November 2012.

S. Setty, V. Vu, N. Panpalia, B. Braun, A. J. Blumberg, and M. Walfish. Taking proof-based verified computation a few steps closer to practicality. Usenix Security, August 2012.

S. Setty, R. McPherson, A. J. Blumberg, and M. Walfish. Making argument systems for outsourced computation practical (sometimes). Network and Distributed Systems Security Symposium (NDSS), February 2012.

S. Setty, A. J. Blumberg, and M. Walfish. Toward practical and unconditional verification of remote computations. Workshop on Hot Topics in Operating Systems (HotOS), May 2011.

(1) The design of PEPPER

(2) Experimental results, limitations, and outlook

ACCEPT/REJECT

“f”, xclient server

y

...

...

The proof is not drawn to scale: it is far too long to be transferred.

Pepper incorporates PCPs but not like this:

(Even the asymptotically short PCPs [BGHSV CCC05, BGHSV SIJC06, Dinur

JACM07, Ben-Sasson & Sudan SIJC08] seem to have high constants.)

We move out of the PCP setting: we make computational assumptions. (And we allow # of query bits to be superconstant.)

client server...

[IKO CCC07]

...

serverclientcommit request

commit response

q1w q2w q3w

… Pepper uses an efficient argument [Kilian CRYPTO 92,95]:

Instead of transferring the PCP …

q1, q2, q3, …PCPQuery(q){ return <q,w>;}

ACCEPT/REJECT

efficient checks [ALMSS JACM98]

queries

The server’s vector w encodes an execution trace of f(x).

w

f ( )

What is in w? (1)An entry for each wire; and(2)An entry for the product of each pair of wires.

x

x0

x1

xn

…

AND

OR

AND

01

1

10

y0

y1

10

01

1NOT

NOT

1

0

NOT

0

010

[ALMSS JACM98]

[IKO CCC07]...

serverclientcommit request

commit response

q1w q2w q3w

Pepper uses an efficient argument [Kilian CRYPTO 92,95]:

q1, q2, q3, …PCPQuery(q){ return <q,w>;}

ACCEPT/REJECT

[ALMSS JACM98]

queries

This is still too costly (by a factor of 1023), but it is promising.

efficient checks

“f”y

client server

wcommit request

commit response

, x

query vectors: q1, q2, q3, …

response scalars: q1w, q2w,

q3w, …

ACCEPT/REJECT

checks

queries

PEPPER incorporates refinements to [IKO CCC07], with proof.


w1

w2

w3

client server

The client amortizes its overhead by reusing queries over multiple runs. Each run has the same f but different input x.

“f”y

client server


q3w, …

wcommit request

commit response

, x


ACCEPT/REJECT

checks

queries ✔

Boolean circuit

Arithmetic circuit

Arithmetic circuit with

concise gates×

×

×

×

+

+

+ ab abab

something gross

Unfortunately, this computational model does not really handle fractions, comparisons, logical operations, etc.

If the output is 7 …

… there is a solution

An incorrect input/output pair results in unsatisfiable constraints.

0 = X - 6 0 = Y - (X + 1)0 = Y - 7

We now work with (degree-2) constraints over a finite field.

Increment(X) Y=X + 1

0 = X - <input>,0 = Y - (X+1),0 = Y - <output>

As an example, suppose the input above is 6.

If the output is 8 …

… there is no solution

0 = X - 6 0 = Y - (X + 1)0 = Y - 8

This formalism represents general-purpose program constructs concisely: if/else, !=, <, >, ||, &&, floating-point, …

“Z1 != Z2” 1 = (Z1 – Z2 ) M

if (bool X) Y=3else Y=4

0 = X - MY = M3 + (1-M)4

We have a compiler that does this transformation; it is derived from Fairplay [Malkhi et al. Usenix Security04].

This work should apply to other protocols.

Z1=23, Z2=187, …,

w

10

10

1

010

w

1013204

7

18723

219

01

805

The proof vector now encodes the assignment that satisfies the constraints.

Take-away: constraints are general-purpose, (relatively) concise, and accommodate the verification machinery.

01

1

10 1

0

0

1 = (Z1 – Z2 ) M

0 = Z3 − Z4

0 = Z3Z5 + Z6 − 5

The savings from the change are enormous (though the new model is not perfect – loops are still unrolled).

“f”y

client server

wcommit request

commit response

, x


✔


q3w, …

ACCEPT/REJECT

checks

queries ✔

w

server

We (mostly) eliminate the server’s PCP-based overhead.

before: # of entries quadratic in computation

size

after: # of entries linear in computation

size

Now, the server’s overhead is mainly in the cryptographic machinery and the constraint model itself.

The client and server reap tremendous benefit from this change.

|w|=|Z|2

PCP verifier

This resolves a conjecture of Ishai et al. [IKO

CCC07]

For any computation, there exists a linear PCP whose proof vector is (quasi)linear in the computation size.

[Gennaro et al. 2012]

linearity testquad corr.

testcircuit test

commit request

commit

response

queries

responsesπ(q1), …, π(qu)

π()=<,w>

q1, q2, …, qu

(z, z ⊗ z)

(z, h)

serverclient

new quad.test

|w|=|Z|+|C|

w

w

“f”, xyclient server

commit request

commit response

query vectors: q1, q2, q3, … w

✔

✔

✔


q3w, …

checks

queries

ACCEPT/REJECT

linearity test

quad. test

We prove: commitment primitive is (far) stronger than linearity test

commit request

commit

response

consistency test

queries

responses

PCP verifier

client

We eliminate linearity tests.

linearity: π(q1) + π(q2) = π(q1+q2)

Enc(r)

consistency:t = r + α1q1 + … + αuqu

π(t) = π(r) + α1π (q1) + … + αuπ (qu)

? ?

Enc(π(r))

π(t)

π()

q1, q2, …, qu

server

t

π(q1), …, π(qu)

?

We eliminate soundness amplification (repetition to reduce error).

≣

“f”, xyclient server

commit request

commit response

query vectors: q1, q2, q3, … w

✔

✔

✔


q3w, …

checks

queries

ACCEPT/REJECT

✔

✔

(1) The design of PEPPER

(2) Experimental results, limitations, and outlook

✔

Our implementation of the server is massively parallel; it is threaded, distributed, and GPUed.

Some details of our evaluation platform:

It uses a cluster at Texas Advanced Computing Center (TACC)

Each machine runs Linux on an Intel Xeon 2.53 GHz with 48GB of RAM.

For GPU experiments, we use NVIDIA Tesla M2070 GPUs (448 CUDA cores and 6GB of memory)

To get a sense of the cost reduction, consider amortized costs for multiplication of 500×500 matrices:

Under the theory,naively applied

Under PEPPER

client CPU time >100 trillion years

<2.2 seconds

server CPU time >100 trillion years

6.5 hours

However, this assumes a (fairly large) batch.

1. What are the break-even points?

2. At these points, is the server’s latency acceptable?

# instances

CPU time

computation costs

verification costs

break-even point

break-even 4100 inst. 30,500 inst.

24,500 inst.

client CPU time 2.5 hours 27 minutes

25 minutes

server CPU time 3.1 years 10 months 9.5 months

break-even 20 inst. 1570 inst. 990 inst.

client CPU time 44 sec. 83 sec. 60 sec.

server CPU time 5.4 days 16 days 11.9 days

Consider outsourcing in two different regimes:

(1) Verification work performed on CPU

(2) If we had free crypto hardware for verification …

mat. mult.,m=500

insertion sort,m=256

PAM clustering,m=15, d=100

1 c

ore

4 c

ore

s6

0 c

ore

s6

0 c

ore

s(i

deal)

1 c

ore

4 c

ore

s6

0 c

ore

s

60 c

ore

s(i

deal)

1 c

ore

4 c

ore

s

60 c

ore

s(i

deal)

1 c

ore

4 c

ore

s6

0 c

ore

s

60

core

s(i

deal)

matrix multiplication

insertion sort PAM clustering all-pairs shortest paths

60

core

s

Parallelizing the server results in near-linear speedup

polynomial evaluation

Hamming distance root finding Fannkuch benchmark

1 c

ore

4 c

ore

s6

0 c

ore

s6

0 c

ore

s(i

deal)

1 c

ore

4 c

ore

s6

0 c

ore

s

60 c

ore

s(i

deal)

1 c

ore

4 c

ore

s

60 c

ore

s(i

deal)

1 c

ore

4 c

ore

s6

0 c

ore

s

60

core

s(i

deal)

60

core

s

Parallelizing the server results in near-linear speedup

1. The client requires batching to break even.

2. The server’s burden is too high, still.

3. The computational model is still limited (looping is clumsy, etc.)

PEPPER is not ready for prime time:

We describe the landscape in terms of our three goals.

Gives up being unconditional or general-purpose

Replication [Castro & Liskov TOCS02], trusted hardware [Chiesa &

Tromer ICS10, Sadeghi et al. TRUST10], auditing [Haeberlen et al. SOSP07, Monrose et al. NDSS99]

Special-purpose [Freivalds MFCS79, Golle & Mironov RSA01, Sion

VLDB05, Benabbas et al. CRYPTO11, Boneh & Freeman EUROCRYPT11]

Unconditional and general-purpose but not geared toward practice

Use fully homomorphic encryption [Gennaro et al., Chung et al. CRYPTO10]

Proof-based verified computation [GMR85, Ben-Or et al. STOC88, BFLS91, Kilian STOC92, ALMSS92, AS92, Goldwasser et al. STOC08, Ben-Sasson et al. ePrint12]

Proof-based verified computation geared to practice

PEPPER [HotOS11, NDSS12, Usenix Security 12, ePrint12]

[Cormode et al. ITCS12, Thaler et al. HotCloud12] [Gennaro et al. ePrint12]

We have reduced the costs of a PCP-based argument system [Ishai et al., CCC07] by 20+ orders of magnitude, with proof.

The refinements are in the PCP and the cryptography.

We have made the computational model general-purpose.

This leads to cost savings and programmability …

… and yields a practical compiler.

Take-away: Proof-based verified computation is now close to practical and will be a key tool in real systems.

making proof-based verified computation almost practical michael walfish the university of texas at...

Documents

w pepper

almss jacm98 slide

request commit response

outlook slide

austin slide

verified computation

argument systems

practical michael walfish