P4P: A Framework for Practical Server-Assisted Multiparty Computation with Privacy
Yitao Duan, Berkeley Institute of Design, UC Berkeley
Qualifying Exam, April 18, 2005
Outline
Problem and motivation
Privacy issues examined
– Privacy is never a purely tech issue
– Derive some design principles
The P4P framework
Applications
– Practical multiparty arithmetic computation with privacy
– Service provision with privacy
Progress and future work
Problem Scenario
Applications and Motivation
“Next generation search” makes heavy use of personal data for customized search, context-awareness, expertise mining and collaborative filtering
E-commerce vendors (like Amazon) try to build user purchase profiles across markets. And user profiling is moving onto the desktop
Location based services, real-world monitoring …
Legal Perspectives
Privacy issues arise as a tension between two parties: one seeks info about the other
Identity of the seeker leads to different situations and precedents
– E.g. individual vs. the press, vs. the employer
Power imbalance between the two
Loss of privacy often leads to real harm: e.g. loss of job, loss of rights, etc.
[AK95]
Economic Perspectives
Market forces work against customer privacy
– Company has to do extra work to get less info
– Company can benefit from having user info
– So they lack the incentive to adopt PETs
Power imbalance (again!) in e-commerce
– But we, as users, can make a difference by flexing our collective muscles!
Users often underestimate the risk of privacy intrusion and are unwilling to pay for PETs
[FFSS02, ODL02, A04]
Social Science Perspectives
Privacy is NOT minimizing disclosure
– Maintaining a degree of privacy often requires disclosure of personal information [Altman75]
– E.g. faculty members put “Prospective students please read this before you email me …” on their web pages
Sociality requires free exchange of some information
– PETs should not prevent normal exchange
Lessons for Designing Practical Systems
Almost all problems are preserved, or even exaggerated, in computing
– Tension exists but court arbitration is not available
– Power imbalance prevails with no protection of the weak: the client/server paradigm
– Lack of incentive (to adopt PETs, to cooperate, etc.)
Design constraints for practical PETs
– Cost of privacy must be close to 0, and the privacy scheme must not conflict with the powerful actor’s needs
The P4P Philosophy
You can’t wait for privacy to be granted; you have to fight for it.
P4P: Π² Principles
Prevention: not deterrence
Incentive: design should consider the incentives of the participants
Protection: design should incorporate mechanisms that protect the weak parties
Independence: the protection should be effective even if some parties do not cooperate
Topologies
[Diagram: client-server vs. P2P topologies]
Problems With the Two Paradigms
Client-server
– Power imbalance
– Lack of incentive
P2P
– Doesn’t always match all the transaction models (e.g. buying PCs from Dell)
– Hides the heterogeneity
– Many efficient server-based computations are too expensive if done P2P
The P4P Architecture
Privacy Peer (PP)
• A subset of users are elected as “privacy providers” (called privacy peers) within the group
• PPs provide privacy when they are available, but can’t access data themselves
P4P Basics
Server is (almost) always available but PPs aren’t (though they should be periodically) – asynchronous or semi-synchronous protocols
Server provides data archival and synchronizes the protocol
Server only communicates with PPs occasionally (when they are online and lightly loaded, e.g. 2 AM)
Server can often be trusted not to bias the computation – but we have means to verify it
PPs and all other users are completely untrusted
The Half-Full/Half-Empty Glass
In a typical P2P system, 5% of the peers provide 70% of the services [GFS]
P2P: 70% of the users are free riding
P4P: 5+% of the users are serving the community
Enough for P4P to work practically!
Roles of the Privacy Peers
Anonymizing communication
– E.g. Anonymizer.com or Mix
Offloading the server
Sharing information
Participating in computation
Other infrastructure support
Tools and Services
Cryptographic tools: Commitment, VSS, ZKP, Anonymous authentication, eCash, etc
Anonymous Message Routing– E.g. MIX network [CHAUM]
Data protection scheme [PET04]
Λ: the set of users who should have access to X
Anonymous SSL
Y = (E_K(X), Keyset, K_R)
Keyset = { E_{K_u}(K) | u ∈ Λ }
Practical Multiparty Arithmetic Computation with Privacy
Applications
Multiparty Computation
n parties with private inputs wish to compute some joint function of their inputs
Must preserve security properties, e.g. privacy and correctness
Adversary: participants or external
– Semi-honest: follows the protocol but curious
– Malicious: can behave arbitrarily
MPC – Known Results
Computational setting: trapdoor permutations
– Any two-party function can be securely computed in the semi-honest model [Yao]
– Any multiparty function can be securely computed in the malicious model, for any number of corrupted parties [GMW]
Info-theoretic setting: no complexity assumption
– Any multiparty function can be securely computed in the malicious model with > 2n/3 honest parties [BGW, CCD]
– With a broadcast channel, only > n/2 honest parties needed [RB]
A Solved Problem?
Boolean-circuit-based protocols are totally impractical
Arithmetic is better but still expensive: the best protocols have O(n³) complexity to deal with an active adversary
Can’t be used directly in real systems at large scale: 10³ ~ 10⁶ users, each with 10³ ~ 10⁶ data items
Contributions to Practical MPC
P4P provides a setting where generic arithmetic MPC protocols can be run much more efficiently
– Existing protocols (the best ones): O(n³) complexity (malicious model)
– P4P allows us to reduce n without sacrificing security
Enables new protocols that make a whole class of computations practical
Arithmetic: Homomorphism vs. VSS
Homomorphism: E(a)E(b) = E(a+b)
Verifiable Secret Sharing (VSS): a → a₁, a₂, …, aₙ
Addition easy
– E(a)E(b) = E(a+b)
– share(a) + share(b) = share(a+b)
Multiplication more involved for both
– HOMO-MPC: O(n³) w/ big constant [CDN01, DN03]
– VSS-MPC: O(n⁴) (e.g. [GRR98])
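The share(a) + share(b) = share(a+b) identity can be illustrated with plain additive secret sharing over a small field. This is a minimal sketch, not the full verifiable scheme: the modulus and the three-party split are illustrative choices.

```python
import random

# Additive secret sharing over Z_Q: addition of shared values needs
# no interaction -- each party just adds its own shares locally.
Q = 2 ** 31 - 1   # illustrative modulus; real protocols pick the field per application

def share(a, n):
    """Split a into n random additive shares with a = sum(shares) mod Q."""
    parts = [random.randrange(Q) for _ in range(n - 1)]
    parts.append((a - sum(parts)) % Q)
    return parts

def reconstruct(parts):
    return sum(parts) % Q

sa, sb = share(1234, 3), share(5678, 3)
sc = [(x + y) % Q for x, y in zip(sa, sb)]   # purely local addition
assert reconstruct(sc) == 1234 + 5678
```

Each party holds one entry of `sa` and one of `sb`; computing its entry of `sc` is a single field addition, which is why VSS-style addition is essentially free.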
Arithmetic: Homomorphism vs. VSS
HOMO-MPC
– + Can tolerate t < n corrupted players as far as privacy is concerned
– – Uses public-key crypto, 10,000x more expensive than normal arithmetic (even for addition)
– – Requires large fields (e.g. 1024-bit)
VSS-MPC
– + Addition is essentially free
– + Can use any size field
– – Can’t tolerate t > n/2 corrupted players (can’t do two-party multiplication)
Bridging the Two Paradigms
HOMO-MPC → VSS-MPC:
– Inputs: c = E(a) (public)
– Outputs: share_i(a) = D_{SK_i}(c) (private)
VSS-MPC → HOMO-MPC:
– Inputs: share_i(a) (private)
– Outputs: c = Π E(share_i(a)) (public)
A hybrid protocol is possible
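The VSS → HOMO direction can be sketched with a textbook Paillier cryptosystem: each tallier encrypts its additive share, and the product of the ciphertexts is an encryption of the shared value. The tiny primes below are purely illustrative and completely insecure.

```python
import math
import random

# Textbook Paillier with tiny primes -- purely illustrative, NOT secure.
p, q = 1009, 1013
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):
    """Encrypt m under the public key (n, g)."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    """Decrypt with the private key (lam, mu)."""
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# VSS -> HOMO: encrypt each additive share; the ciphertext product
# is an encryption of the shared value a, i.e. c = E(s1) * E(s2) = E(a).
a = 777
s1 = random.randrange(n)
s2 = (a - s1) % n
c = (enc(s1) * enc(s2)) % n2
assert dec(c) == a
```

Neither tallier reveals its share in the clear; only the combined ciphertext (and whoever holds the decryption key) sees a.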
Efficiency & Security Assumptions
Existing protocols: uniform trust assumption
– All players are corrupted with the same probability
– Damage caused by one corrupted player = another
– A common mechanism to protect the weakest link against the most severe attacks
But players are heterogeneous in their trustworthiness, interests, incentives, etc.
– Corporate servers behind firewalls
– Desktops maintained by high-school kids
– The collusion example
Exploiting the Difference
Server is secure against outside attacks
– Companies spend $$$ to protect their servers
– The server often holds much more valuable info than what the protocol reveals
PPs won’t collude with the server
– Conflicts of interest, mutual distrust, laws
– Server can’t trust that clients can keep a conspiracy secret
Server won’t corrupt client machines
– Market forces and laws
Rely on the server for protection against outside attacks, and on PPs for defending against a curious server
How to Compute Any Arithmetic Function – P4P Style
Each player secret-shares her data between the server and one PP using (2, 2)-VSS
Server and PP convert to HOMO-MPC for multiplication; use VSS for addition. Result obtained by threshold decryption or secret reconstruction
Dealing with a malicious adversary: a cheating PP is replaced by another
2 << n!
– Communication independent of n
– Computation on talliers ~ fully distributed version
Addition-Only Algorithms
Although general computation is made more efficient in P4P, multiplication is still way more expensive than addition
A large number of practical algorithms can be implemented with addition-only aggregation
– Collaborative filtering [IEEESP02, SIGIR02]
– HITS, PageRank …
– E-M algorithm, HMM, most linear algebra algorithms …
New Vector-Addition-Based MPC
User i has an m-dimensional vector d_i; want to compute
[y, A′] = F(Σ_{i=1}^{n} d_i, A)
Goals
– Privacy: no one learns d_i except user i
– Correctness: computation should be verified
– Validity: ||d_i||² < L w.h.p.
Cost for Private Computation: Vector Addition Only
Total computation cost = C + P
– C: cost for computation on obfuscated data – O(mn) for both HOMO and VSS
– P: cost for privacy/security – O(n log m)
The hidden constant: HOMO: 10,000; VSS: 1 or 2
Basic Architecture
[Diagram: each user i splits her vector as u_i + v_i = d_i; the server accumulates μ = Σ u_i while the PP accumulates ν = Σ v_i; the tallies are combined to compute [y, A′] = F(μ + ν, A)]
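Assuming each user splits her vector into two random summands mod a public modulus, the addition-only pipeline above can be sketched end to end (modulus, dimensions, and user count are illustrative):

```python
import random

# Sketch of the addition-only P4P flow: each user i splits d_i into
# u_i + v_i = d_i (mod Q); the server sums the u's, the privacy peer
# sums the v's, and only the two aggregates are ever combined.
Q = 2 ** 31 - 1        # illustrative public modulus
m, n_users = 8, 5      # vector dimension and number of users

def split(d):
    """Return (u, v) with u random and u + v = d (mod Q) componentwise."""
    u = [random.randrange(Q) for _ in d]
    v = [(dj - uj) % Q for dj, uj in zip(d, u)]
    return u, v

data = [[random.randrange(100) for _ in range(m)] for _ in range(n_users)]
shares = [split(d) for d in data]

mu = [sum(u[j] for u, _ in shares) % Q for j in range(m)]   # server's tally
nu = [sum(v[j] for _, v in shares) % Q for j in range(m)]   # PP's tally

total = [(a + b) % Q for a, b in zip(mu, nu)]
assert total == [sum(d[j] for d in data) % Q for j in range(m)]
```

Each tallier alone sees only uniformly random vectors; the true sum Σ d_i appears only when μ and ν are combined at the end.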
Adversary Models
Model 1: any number of users can be corrupted by a malicious adversary; both the PP and the server can be corrupted by different semi-honest adversaries
Model 2: any number of users and the PP can be corrupted by a malicious adversary; the server can be corrupted by another malicious adversary who does not stop the computation
An Efficient Proof of Honesty
Show that some random projections of the user’s vector are small
If the user fails T out of the N tests, reject his data
One proof per user vector, with complexity O(log m)
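Ignoring the commitments and zero-knowledge machinery, the projection test behind the proof can be sketched in the clear. The T/N ratio is taken from the success-probability plot; treating a squared projection that reaches L as a failed test is an assumption about how the test is scored.

```python
import random

def passes_l2_check(d, L, N=50, ratio=0.317311, rng=random):
    """Probabilistic check that ||d||^2 < L using N random +/-1 projections.

    A single test 'fails' when the squared projection reaches L; the user
    is rejected if more than T = ratio * N tests fail. The real protocol
    runs these checks on commitments with ZK proofs -- this sketch works
    entirely in the clear.
    """
    T = int(ratio * N)
    failures = 0
    for _ in range(N):
        c = [rng.choice((-1, 1)) for _ in d]          # c_k in {-1, 1}^m
        s = sum(cj * dj for cj, dj in zip(c, d))      # s_k = c_k . d
        if s * s >= L:
            failures += 1
    return failures <= T

# A zero vector always passes; a vector with one huge entry always fails.
assert passes_l2_check([0] * 16, L=100)
assert not passes_l2_check([1000] + [0] * 15, L=100)
```

Because each test only touches the scalar s_k, the prover's work is one proof per vector rather than one per element.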
Success Probability
[Plot: probability of declared success p vs. ||d||², with L = 1 and T/N = 0.317311, for N = 10, 50, 100]
Complexity and Cost
Only one proof for each user vector – no per-element proofs!
Computation size of s_k: O(log m)
m = 10⁶, l = 20, with N = 50: need 1420 exponentiations – ~5 s/user
Benchmark: http://botan.randombit.net/bmarks.html, 1.6 GHz AMD Opteron (Linux, gcc 3.2.2)
Service Provision with Privacy
Existing Service Architecture
Traditional Service Model
Requires or reveals private user info
– Locations, IP addresses, the data downloaded
Requires user authentication
– Subscription verification and billing purposes
The traditional client-server paradigm allows the server to link these two pieces of info
P4P keeps them separate
P4P’s Service Model
• PP authenticates the user and anonymizes communication; the server processes the transaction
• PP knows the user’s identity but not his data
• Server knows the user’s transaction but not his ID
• To the PP: transactions protected w/ crypto
• To the server: transactions unlinkable to each other or to a particular user
Possible Issues
The scheme involves multiple parties; why would they cooperate?
– Server’s concerns and fears: privacy peers are assigned the task of user authentication – how can the server trust the privacy peers?
– Can the server block the PPs?
– How to motivate the privacy peers?
How do we detect and trace any fraud?
Solutions
Mechanism to detect fraud and trace faulty players
PP incentive: rely on altruism or a mechanism to credit the PPs
(An extreme) A fully P2P structure among the users and PPs
– Server cannot isolate the PPs
– Independence!
– A partial P2P structure should work (e.g. 5% PPs)
Billing Resolution
Fraud detection together with billing resolution
Have schemes for a number of billing models (flat-rate, pay-per-use)
No info about users’ transactions (except those of the faulty players) is leaked
An extension: the PP replaced by a commercial privacy provider who does it for a profit
– Now you can use its service and don’t have to be embarrassed by Amazon knowing the DVD title you buy
– http://www.cs.berkeley.edu/~duan/research/qual/submitted/trustbus05.pdf
Conclusions
System design guidelines drawn from legal, economic and social science research
P4P argues for peer involvement, exploits the heterogeneity among the players, and provides a viable framework for practical collaborative computation with privacy
P4P allows for private computation based on VSS – privacy offered in P4P almost for free!
Progress So Far
Published work:
– Data protection – PET ’04
– Link analysis – SIAM Link Analysis Workshop
Submitted:
– Group communication cryptosystem
– Service provision with privacy
In progress:
– Practical vector-addition-based computation
– Hybrid MPC
– Anonymous SSL
Plan and Future Work
Finish the work at hand
Extend the practical computation to support multiplication?
– Hybrid: homomorphism- and VSS-based scheme
– VSS: efficient multiplication possible if we can have 3 non-colluding players (another server? another PP?)
More applications?
Implementation
– A P4P toolkit or lib that developers can use to build their applications
Time to graduate: 12 to 18 months
References
[AK95] Alderman, E., Kennedy, C.: The Right to Privacy. DIANE Publishing Co. (1995)
[Altman75] Altman, I.: The Environment and Social Behavior. Brooks/Cole Pub. Co. (1975)
[DC04] Duan, Y., Canny, J.: Protecting user data in ubiquitous computing: Towards trustworthy environments. In: PET ’04
[PK01] Pfitzmann, A., Köhntopp, M.: Anonymity, unobservability, and pseudonymity: A proposal for terminology. Draft, ver. 0.17 (2001)
[Yao] Yao, A.C.C.: Protocols for secure computations. In: FOCS ’82
[GMW] Goldreich, O., Micali, S., Wigderson, A.: How to play any mental game – a completeness theorem for protocols with honest majority. In: STOC ’87
[CDN01] Cramer, R., et al.: Multiparty computation from threshold homomorphic encryption. In: EUROCRYPT ’01
[GFS] Adar, E., Huberman, B.: Free riding on Gnutella
[A04] Acquisti, A.: Privacy in electronic commerce and the economics of immediate gratification. In: ACM EC ’04
[GRR98] Gennaro, R., et al.: Simplified VSS and fast-track multiparty computations with applications to threshold cryptography. In: PODC ’98
[DN03] Damgård, I., Nielsen, J.: Universally composable efficient multiparty computation from threshold homomorphic encryption. In: CRYPTO ’03
[BGW] Ben-Or, M., Goldwasser, S., Wigderson, A.: Completeness theorems for non-cryptographic fault-tolerant distributed computation. In: STOC ’88
[CCD] Chaum, D., Crépeau, C., Damgård, I.: Multiparty unconditionally secure protocols. In: STOC ’88
[RB] Rabin, T., Ben-Or, M.: Verifiable secret sharing and multiparty protocols with honest majority. In: STOC ’89
[CD98] Cramer, R., Damgård, I.: Zero-knowledge proofs for finite field arithmetic, or: Can zero-knowledge be for free? In: CRYPTO ’98
Thank You!
Protecting the Transactions
[Protocol diagram among user A, privacy peer P, and server S]
Q: the query, h_Q = h(Q)
PP: verifies cert, hash & signature
S: verifies cert, hash & signature
• PP performs authentication, S processes the query
• PP knows the user’s identity but not his data
• S knows the user’s transaction but not his ID
An Efficient Proof of Honesty
Randomly select c_k from {-1, 1}^m, k = 1, 2, …, N
u_i + v_i = d_i; the user computes (x_k, ρ_k, X_k) and (y_k, γ_k, Y_k):
x_k = Σ_{j=1}^{m} u_i[j] c_k[j],  X_k = C(x_k, ρ_k)
y_k = Σ_{j=1}^{m} v_i[j] c_k[j],  Y_k = C(y_k, γ_k)
An Efficient Proof of Honesty
The talliers verify:
x_k =? Σ_{j=1}^{m} u_i[j] c_k[j],  X_k =? C(x_k, ρ_k)
y_k =? Σ_{j=1}^{m} v_i[j] c_k[j],  Y_k =? C(y_k, γ_k)
An Efficient Proof of Honesty
X_k, Y_k are sent up, and Z = X_k Y_k is computed
Z_j, j = 1, …, l, are sent to both sides, where
s_k = x_k + y_k = Σ_{j=1}^{l} s_k[j] 2^{j-1}, with s_k[j] ∈ {0, 1}
Z_j = C(s_k[j] 2^{j-1}, r_j), j = 1, …, l, with r_j ∈_R Z_q and r = Σ_{j=1}^{l} r_j
An Efficient Proof of Honesty
Check Z =? Π Z_j
ZKP: each Z_j contains a bit (i.e. 0 or 1)
Using the bit commitment proof of [CD98]
Effectiveness
c_k[j] is selected from {-1, 1}: a zero-mean, unit-variance random variable
s_k = c_k^T d_i is also a zero-mean R.V.
VAR(s_k) = … = ||d_i||²
The protocol bounds this variance by bounding the R.V. itself
Optimal results by tuning T and N
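The variance identity VAR(s_k) = ||d_i||² can be checked exactly for a small vector by enumerating every sign pattern. This is a brute-force sanity check, not part of the protocol.

```python
from itertools import product

def exact_second_moment(d):
    """E[(c . d)^2] computed exactly over all sign vectors c in {-1,1}^m."""
    total = 0
    for c in product((-1, 1), repeat=len(d)):
        s = sum(cj * dj for cj, dj in zip(c, d))
        total += s * s
    return total / 2 ** len(d)

# Cross terms E[c_i c_j] (i != j) cancel over the full enumeration,
# leaving exactly the squared L2 norm of d.
d = [3, 1, 4, 1, 5]
assert exact_second_moment(d) == sum(x * x for x in d)   # ||d||^2 = 52
```

Since s_k is zero-mean, this second moment is its variance, which is what the projection test bounds.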
Optimizations
Vector commitment and proof of bit-vector commitment reduce the computation by half and the communication for commitments by N
User is allowed to acknowledge up to T failed tests
Disqualify a user on the first failed test she claims to pass
Only need to actually run at most N – T tests (30% more efficient)
Privacy Goals [PK01]
Unobservability: the state of IOIs (items of interest) being indistinguishable from any IOIs at all
Unlinkability: IOIs are no more and no less related than they are related a priori
Anonymity: the state of not being identifiable within a set of subjects, the anonymity set
Pseudonymity: using a pseudonym as ID
Billing Resolution Example – Flat Rate Model
Server charges a flat fee for the service
– But places a limit on the maximum resource a user can consume
User pays directly to the server – no PayPal
Goals
– Server is guaranteed to obtain fair payment for the services it provides
– Fraud detection
– No leaking of info about transactions
Basic Tools – Homomorphic Commitment
Commit: the prover sends A = C(a, r), keeping a and r secret
Basic Tools – Homomorphic Commitment
Open: the prover reveals (a, r); the verifier checks A =? C(a, r)
Homomorphism: C(a₁, r₁) C(a₂, r₂) = C(a₁+a₂, r₁+r₂)
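A toy Pedersen-style instantiation shows the homomorphic property. The modulus and the two "generators" below are illustrative stand-ins: a real scheme needs a prime-order group and independently generated g, h with no known discrete-log relation.

```python
# Toy Pedersen-style commitment C(a, r) = g^a * h^r mod P.
# P, g, h are demo parameters only -- neither hiding nor binding
# is guaranteed with this ad hoc choice.
P = 2 ** 127 - 1
g, h = 3, 5

def commit(a, r):
    return pow(g, a, P) * pow(h, r, P) % P

# Homomorphism: C(a1, r1) * C(a2, r2) = C(a1 + a2, r1 + r2)
assert commit(42, 11) * commit(58, 19) % P == commit(100, 30)
```

This product-of-commitments identity is what lets a verifier check sums of committed values without ever seeing the individual openings.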
Billing Resolution Example – Flat Rate Model
User A sends c_A = C(s_A, r_A), keeping (s_A, r_A)
– s_A: resource used (e.g. # of transactions)
– C: homomorphic commitment
– r_A: randomness
Billing Resolution Example – Flat Rate Model
ZKP: s_A < L
A is a legitimate customer?
Using the protocol to be explained later
Billing Resolution Example – Flat Rate Model
Server checks: c_A =? C(s_A, r_A)? Is s_A consistent with my record?
If so, it returns S_{K_S^{-1}}(c_A): the server’s signature on c_A
Billing Resolution Example – Flat Rate Model
(r, U′) is sent to the server, with r = Σ_{u∈U\U′} r_u
U′: the set of users who failed to submit a valid receipt
s: total number of transactions, s = Σ_{u∈U\U′} s_u
Server checks: C(s, r) =? Π_{u∈U\U′} c_u
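The server's aggregate consistency check can be sketched with the commitment homomorphism: the sums of the openings over U \ U′ must match the product of the corresponding commitments. The group parameters, user data, and the failed-user set below are all hypothetical.

```python
import random

# Sketch of the flat-rate aggregate check: C(sum s_u, sum r_u) must
# equal the product of the published per-user commitments c_u,
# taken over the users U \ U' who submitted valid receipts.
P = 2 ** 127 - 1          # demo modulus, not a secure choice
g, h = 3, 5               # assumed independent generators (demo only)

def commit(a, r):
    return pow(g, a, P) * pow(h, r, P) % P

# Each user u holds (s_u, r_u) and has published c_u = C(s_u, r_u).
users = [(random.randrange(100), random.randrange(P - 1)) for _ in range(10)]
commits = [commit(s, r) for s, r in users]

failed = {3, 7}           # U': hypothetical users without a valid receipt
valid = [i for i in range(len(users)) if i not in failed]

s_total = sum(users[i][0] for i in valid)   # revealed: total transactions
r_total = sum(users[i][1] for i in valid)   # revealed: total randomness

prod = 1
for i in valid:
    prod = prod * commits[i] % P
assert commit(s_total, r_total) == prod
```

The server learns only the totals s and r, never any individual (s_u, r_u), yet any mismatch in the claimed totals makes the check fail.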