Secure cloud database using multiparty computation
Introduction
• Security in cloud environments
  – The service providers are typically third parties
  – Goal: protect sensitive data
• Related work on secure databases
  – NetDB2, IBM (outsourced database)
  – Relational Cloud, CryptDB (MIT, CIDR 2011)
  – TrustedDB using secure hardware (VLDB 2011 demo, Radu Sion)
NetDB2
Value-level encryption
  DB:           Tuple 1: xxx, yyy    Tuple 2: aaa, bbb
  Encrypted DB: Tuple 1: !a4, a3g    Tuple 2: L%j, m*K
  Query rewriting: SELECT * WHERE value = 'xxx'  =>  SELECT * WHERE value = '!a4'
+ Partition information
  Partition: P1: < 'm'; otherwise P2
  Encrypted DB: Tuple 1: P2, P2      Tuple 2: P1, P1
  Query rewriting: SELECT * WHERE value < 'xxx'  =>  SELECT * WHERE value in [P1, P2]
Simple deterministic encryption
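The query rewriting on this slide can be sketched in a few lines of Python. This is a minimal illustration, not NetDB2's actual implementation: `det_token` is a hypothetical keyed-hash stand-in for the deterministic cipher (a real system would use decryptable encryption), and `KEY`, `partition`, and `rewrite_less_than` are names invented for the example.

```python
import hashlib
import hmac

KEY = b"client-side-secret"  # hypothetical key, never sent to the server


def det_token(value: str) -> str:
    """Deterministic stand-in for value-level encryption (same input, same token)."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()[:8]


def partition(value: str) -> str:
    """Partition rule from the slide: P1 if value < 'm', otherwise P2."""
    return "P1" if value < "m" else "P2"


def rewrite_less_than(bound: str) -> list[str]:
    """Partitions that may hold values below `bound` (a superset of the answer)."""
    parts = ["P1"]          # P1 holds values < 'm'; some may be below the bound
    if bound > "m":
        parts.append("P2")  # P2 holds values >= 'm'; some may still be below it
    return parts


plain_rows = [("xxx", "yyy"), ("aaa", "bbb")]

# What the server stores: tokens plus partition labels, no plaintext.
server_rows = [
    {"tokens": [det_token(v) for v in row], "parts": [partition(v) for v in row]}
    for row in plain_rows
]

# Equality: the client sends a token and the server compares tokens.
needle = det_token("xxx")
equality_hits = [i for i, r in enumerate(server_rows) if r["tokens"][0] == needle]

# Range: the client sends partition labels; the server returns candidate rows,
# and the client would drop false positives after decrypting them.
wanted = rewrite_less_than("xxx")
range_candidates = [i for i, r in enumerate(server_rows) if r["parts"][0] in wanted]
```

The range query returns a superset of the true answer, which is the price of revealing only the partition label to the server.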
CryptDB
• Onion encryption: multiple layers of encryption applied to a single data item
Original data: 10
  encrypt => E1(10) = A*65h (OPES: numeric comparisons)
  encrypt => E2(A*65h) = BB647 (deterministic encryption: equality can be done)
  encrypt => E3(BB647) = %j@9G (non-deterministic encryption: no computation is feasible)
If the user needs more computation capability, the data is decrypted down to the desired level (one way!)
Summary
• Mainly based on encryption techniques
  – Provide limited computation capabilities
• Also note that security strength depends on the encryption function
  – For example, deterministic encryption may allow a frequency analysis attack
    • 'Male', 'Female' => '%k9)2', 'Ah475'
    • 'Ah475' x 21; '%k9)2' x 5 in DB group
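The frequency analysis attack can be illustrated with a short Python sketch. Everything here is an assumption made for the example: `det_token` is a keyed hash standing in for a deterministic cipher, the column reuses the 21-versus-5 skew from the slide, and which value is the majority is chosen arbitrarily.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"client-side-secret"  # hypothetical key, unknown to the attacker


def det_token(value: str) -> str:
    """Deterministic stand-in cipher: equal plaintexts give equal tokens."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()[:8]


# Hypothetical plaintext column with a 21-versus-5 skew.
column = ["Female"] * 21 + ["Male"] * 5
stored = [det_token(v) for v in column]  # all that the provider sees

# The attacker counts ciphertexts and ranks them by frequency.
ranked = [token for token, _ in Counter(stored).most_common()]

# Assumed background knowledge: plaintext values ranked by expected frequency.
prior = ["Female", "Male"]
guess = dict(zip(ranked, prior))  # ciphertext -> guessed plaintext
```

The attacker never touches the key; matching the two frequency rankings is enough to recover every value in the column.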
Secret sharing (around 1980)
Secret: 10
Shares: 4 and 6, held by Alice and Bob
6 + 4 = 10
What is the secret value? Alice’s share would be 5? 20? -3?
The secret is recovered only when the two parties exchange their shares
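A minimal sketch of two-party additive sharing, assuming shares live in the integers modulo 2^32 (the slide uses plain integers; the modulus and all function names are assumptions). It also shows why one share alone says nothing: for any candidate secret there is a matching share the other party could hold.

```python
import secrets

MODULUS = 2**32  # assumed share domain


def share_two(secret: int) -> tuple[int, int]:
    """Split a secret into two shares that sum to it modulo MODULUS."""
    alice = secrets.randbelow(MODULUS)
    bob = (secret - alice) % MODULUS
    return alice, bob


def reconstruct(alice: int, bob: int) -> int:
    """Recover the secret once both shares are brought together."""
    return (alice + bob) % MODULUS


def alice_share_if(bob_share: int, candidate_secret: int) -> int:
    """The share Alice would have to hold if `candidate_secret` were the secret."""
    return (candidate_secret - bob_share) % MODULUS


# Bob holds 6. Every candidate secret is consistent with some share for Alice.
bob_share = 6
candidates = {s: alice_share_if(bob_share, s) for s in (10, 11, 26, 3)}
```

With Bob's share fixed at 6, the secrets 10, 11, 26 and 3 correspond to Alice holding 4, 5, 20 and -3 (represented modulo 2^32), so Bob cannot rule any of them out.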
Secret sharing
• General case
  – A secret s is split into shares s1, s2, …, sn
  – The secret can be divided among n parties, for any n
  – s = g(s1, s2, …, sn)
  – Examples of g: sum of all shares (modular), bitwise XOR of all shares, product, string concatenation, etc.
  – Security requirement: given k < n shares, it is hard to recover s
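Two of the reconstruction functions g named above, modular sum and bitwise XOR, can be sketched for any n as follows. The modulus, the bit width, and the function names are assumptions chosen for the example.

```python
import secrets
from functools import reduce
from operator import xor


def share_additive(secret: int, n: int, modulus: int = 2**32) -> list[int]:
    """n shares whose sum modulo `modulus` equals the secret."""
    shares = [secrets.randbelow(modulus) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares


def reconstruct_additive(shares: list[int], modulus: int = 2**32) -> int:
    """g = modular sum of all shares."""
    return sum(shares) % modulus


def share_xor(secret: int, n: int, bits: int = 32) -> list[int]:
    """n shares whose bitwise XOR equals the secret (secret assumed < 2**bits)."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, secret))  # secret ^ s1 ^ ... ^ s(n-1)
    return shares


def reconstruct_xor(shares: list[int]) -> int:
    """g = bitwise XOR of all shares."""
    return reduce(xor, shares, 0)
```

In both schemes the first n - 1 shares are uniformly random and the last one is determined by the secret, so any k < n shares are independent of s.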
Secure multiparty computation
Party 1 holds x1, Party 2 holds x2, …, Party n holds xn
Objective: every party obtains r = f(x1, x2, …, xn) but cannot observe any other information apart from its own data
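For the special case where f is a sum, the objective can be met with the textbook secure-sum construction sketched below. It is a single-process simulation under assumed conditions (honest-but-curious parties, private channels, a modulus larger than the true sum) and not a protocol taken from the slides.

```python
import secrets

MODULUS = 2**61 - 1  # assumed to be larger than any possible sum


def split(value: int, n: int) -> list[int]:
    """Additively share `value` into n pieces modulo MODULUS."""
    pieces = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    pieces.append((value - sum(pieces)) % MODULUS)
    return pieces


def secure_sum(inputs: list[int]) -> int:
    """Simulate n parties computing r = x1 + ... + xn."""
    n = len(inputs)
    # dealt[i][j] is the piece that party i sends to party j.
    dealt = [split(x, n) for x in inputs]
    # Each party j adds up the pieces it received and publishes only that total.
    partial = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Everyone adds the published totals to obtain r.
    return sum(partial) % MODULUS


r = secure_sum([3, 5, 9])
```

Each party sees only random-looking pieces and the published partial totals, so it learns r but not the other inputs beyond what r itself implies.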
Secure multiparty computation
• Any function f that can be expressed as a circuit can be computed securely with SMC
  – Limitation of the generic solution
    • Not efficient
• Many efficient protocols have been developed to support specific operations
Building a secure database system
• To hide the data
  – Secret sharing
• To provide query processing functionality
  – Secure multiparty computation (SMC)
• Done?
Secure Cloud Database = Secret Sharing + SMC?
DB = A + B + C, with shares A, B, C held by Service Providers 1, 2, 3
Queries => SMC among the service providers => Result R (each provider obtains R)
Difference
• Security requirement
  – SMC lets every party obtain the result, whereas a secure database (SDB) lets only the user obtain the result
• Computational model
  – SMC computes a single function, whereas an SDB must support follow-up queries
An adaptation of SMC + secret sharing
• Example: SHAREMIND
  – Outsourced privacy-preserving data mining
DB = A + B + C, with shares A, B, C held by Service Providers 1, 2, 3
An adaptation of SMC + secret sharing
• Example: SHAREMIND
  – Key: the computation result is also shared among the parties
Query => Service Providers 1, 2, 3 (holding data shares A, B, C) => Result
Each provider returns one share of the result (also labelled A, B, C in the figure): A + B + C = Result
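The idea that the result itself stays shared can be sketched for a simple column sum. This is an illustration with additive shares modulo 2^32, not the SHAREMIND protocol; the column values, the modulus, and the function names are assumptions.

```python
import secrets

MODULUS = 2**32  # assumed share domain


def share(value: int, n: int = 3) -> list[int]:
    """Additively share one value among n service providers."""
    pieces = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    pieces.append((value - sum(pieces)) % MODULUS)
    return pieces


# Owner side: share every value of a hypothetical column.
column = [5, 7, 30]
per_row = [share(v) for v in column]
# provider_tables[p] is the list of shares stored at provider p.
provider_tables = [list(t) for t in zip(*per_row)]


def local_sum(shares: list[int]) -> int:
    """Run by one provider on its own shares only; returns a share of the sum."""
    return sum(shares) % MODULUS


# Each provider ends up with one share of the result, never the result itself.
result_shares = [local_sum(t) for t in provider_tables]

# Only the user, who collects all three shares, can reconstruct the result.
result = sum(result_shares) % MODULUS
```

Because `result_shares` has the same form as the stored data, a follow-up query could take it as input without any provider ever seeing the intermediate value.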
SHAREMIND Toolkit
• Provides several basic operations for building data mining applications
  – Arithmetic (add, multiply, divide), bitwise operations (XOR), equality
SHAREMIND – Recursive processing
SMC operations run in workspaces at the different parties
Results are kept in shares
Intermediate results become part of the data for future processing
Example: SELECT * WHERE A > AVERAGE(B)
Query execution:
  SMC 1. Compute AVERAGE(B)
  SMC 2. Filter with the result from SMC 1
Secure DB Model
Owner/User
DB = A + B + C, with shares A, B, C held by Service Providers 1, 2, 3
Before we proceed… Clarifying the security
• Negative result
  – Ideal security:
    • Querying workflow: the user issues a query => the service providers compute the result and return it to the user
    • Knowledge gained by service providers: NONE. Not even anything about the query and the result!
  – A solution achieving ideal security is no more efficient than a non-outsourced solution (not using the cloud)
Knowledge gained by service provider
• Output space of a simple selection query: varies from no tuples to the entire database
  – Even larger space if we consider joins
• Example knowledge gain
  – If the output size is small, the service provider knows that the query does not select the entire table
• To hide the above information, each returned query result would have to be at least as large as the entire table
Security in secure database
• Each service provider can observe
  – Query content
    • The tables that are related to the query
    • Number of conditions, types of conditions, attributes that are related
    • But not other info about the query
  – Query answer
    • The set of shares of tuples in some query answer
    • But not other content
Example query
• SELECT Name FROM Employer WHERE Salary > 6000
• To one service provider, the transformed query may look like
  SELECT ATTRIBUTE_7 FROM TABLE_A WHERE ATTRIBUTE_3 > X WITH SHARE_X = 1000
  – The other two parties may get SHARE_X = 2000 and SHARE_X = 3000
• Answer: Tom, Kitty
  – Answer shares held by the three service providers: (T, Ki), (o, t), (m, ty)
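The numbers on this slide can be checked with a short sketch: the three constant shares add up to the threshold in the original query, and the answer shares combine by string concatenation. The concatenation split is only the slide's illustration (it visibly leaks parts of the names), and the function names are assumptions.

```python
def reconstruct_constant(shares: list[int]) -> int:
    """The query constant X is the sum of the SHARE_X values sent to the providers."""
    return sum(shares)


def reconstruct_names(per_provider: list[list[str]]) -> list[str]:
    """Concatenate, tuple by tuple, the pieces returned by each provider."""
    return ["".join(pieces) for pieces in zip(*per_provider)]


share_x = [1000, 2000, 3000]  # one SHARE_X per service provider
answer_shares = [
    ["T", "Ki"],   # returned by the first provider
    ["o", "t"],    # returned by the second provider
    ["m", "ty"],   # returned by the third provider
]

threshold = reconstruct_constant(share_x)
names = reconstruct_names(answer_shares)
```

Here `threshold` comes out as 6000 and `names` as Tom and Kitty, matching the original query and its answer.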
Building a secure database
• Baseline solution
  – Use the existing SHAREMIND Toolkit
    • Each value is divided into shares
    • Selection using equality operation or greater than (detailed protocol not found ????)
One efficiency problem
• SMC is distributed computing
  – Number of rounds should be as small as possible!
  – Handshaking is expensive
• Naïve compiling of query
  – May result in a series of SMC protocols
  – Example
    • SELECT A+B+C+D
    • 3 sum operations separately? 3X latency
    • Sum in 1 round!
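The A+B+C+D example can be sketched as a tiny plan rewrite that merges nested binary additions into a single n-ary sum. The cost model, in which each SMC operator invocation costs one round, is an assumption for illustration, and the `Col`, `Add`, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    args: tuple  # operands: Col or Add nodes


def naive_plan(names: list[str]):
    """Left-deep plan: ((A + B) + C) + D, one binary operator per '+'."""
    plan = Col(names[0])
    for name in names[1:]:
        plan = Add((plan, Col(name)))
    return plan


def flatten(node):
    """Merge nested additions into one n-ary addition."""
    if isinstance(node, Col):
        return node
    merged = []
    for arg in node.args:
        arg = flatten(arg)
        merged.extend(arg.args if isinstance(arg, Add) else [arg])
    return Add(tuple(merged))


def smc_operators(node) -> int:
    """Number of SMC operator invocations (assumed: one round each)."""
    if isinstance(node, Col):
        return 0
    return 1 + sum(smc_operators(arg) for arg in node.args)


naive = naive_plan(["A", "B", "C", "D"])
fused = flatten(naive)
```

Under this cost model the naive plan needs 3 operator invocations and the fused plan needs 1, which is the 3X latency gap mentioned on the slide.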
Better solution?
• 1. Query execution plan optimization
  – There are different possible ways to translate a query into SMC primitives. How do we optimize the number of rounds of communication? Even better would be a cost model that considers everything
• 2. Shortcut operator
  – Example: (X+Y) mod 5 originally takes two individual SMC operators, but a single SMC operator can replace this combination
• 3. Index
  – How to implement an index efficiently and securely?