privacy-preserving query processing over …lxiong/cs573_f16/share/slides/08...cloud computing...

19
Motivation Two-Cloud Setting PPkNN Classification Conclusion and Future Research Privacy-Preserving Query Processing over Encrypted Data in Cloud CS573 Data Privacy and Security Yousef M. Elmehdwi Department of Mathematics and Computer Science Emory University October 31, 2016 Yousef M. Elmehdwi Privacy-Preserving Query Processing 1 / 57

Upload: others

Post on 23-Apr-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Privacy-Preserving Query Processing overEncrypted Data in Cloud

CS573 Data Privacy and Security

Yousef M. ElmehdwiDepartment of Mathematics and Computer Science

Emory University

October 31, 2016

Yousef M. Elmehdwi Privacy-Preserving Query Processing 1 / 57

Page 2: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Cloud ComputingComputation over Encrypted Data

Cloud Computing

DefinitionType of computing that relies on sharing computing resources rather thanhaving local servers or personal devices to handle applications

Outsourcing

Data owner outsources its data as well as processing functionalities to acloud

Reduced management cost, less overhead of data storage, andimproved quality of service

Key Challenge

Cloud cannot be fully trusted

Protect data confidentiality query privacy, and data access patterns

Yousef M. Elmehdwi Privacy-Preserving Query Processing 2 / 57

Page 3: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Cloud ComputingComputation over Encrypted Data

Cloud Computing 2

How to Ensure Data Confidentiality

Data owners encrypt their data before outsourced to a cloud

Key challenge: query processing over encrypted data without the cloudever decrypting the data

Yousef M. Elmehdwi Privacy-Preserving Query Processing 3 / 57

Page 4: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Cloud ComputingComputation over Encrypted Data

Computing on Encrypted Data

Basic idea

Party P1 sends encrypted data to party P2

Party P2 performs some computation and returns the encrypted resultto party P1

Party P1 decrypts to find out the answer

Ways to perform computations on encrypted data

Fully homomorphic encryption (impractical)

Additive/Multiplicative homomorphic encryption schemes (Additiveadopted in this work)

Yousef M. Elmehdwi Privacy-Preserving Query Processing 4 / 57

Page 5: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Cloud ComputingComputation over Encrypted Data

The Goal of this Work

Develop distributed protocols to allow the cloud to perform queriesdirectly over encrypted data

During query processing, the cloud cannot infer any information aboutthe outsourced data, the user queries, or data access patterns

Such a protocol is termed as privacy-preserving query processing(PPQP)

Yousef M. Elmehdwi Privacy-Preserving Query Processing 5 / 57

Page 6: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

Cloud ComputingComputation over Encrypted Data

Desired Output and Security Guarantee

Basic formulation

PPQP(〈C : T ′〉, 〈Bob : q〉

)→ 〈Bob : qout〉

Input - T ′ denotes the encrypted database and q the user queryOutput- qout denotes set of records that satisfies q

Security requirements1 Data confidentiality and query privacy2 Privacy/Hide data access patterns3 Output security4 Information that can be inferred from input/output is not a security

violation

Other desirable requirements1 End-user efficiency and correctness

Yousef M. Elmehdwi Privacy-Preserving Query Processing 6 / 57

Page 7: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Example: Insurance Company

Yousef M. Elmehdwi Privacy-Preserving Query Processing 8 / 57

Page 8: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Two-Cloud Environment

Basic Assumptions

Existence of two cloud service providers denoted by C1 and C2 (e.g.,Google and Amazon)

Alice owns a database T of n records t1, . . . , tn and m attributes

Alice generates two keys (pk,sk) based on the AH-ENC system

Alice encrypts T attribute-wise, and sends the encrypted database T ′ toC1 and sk to C2

Bob wants to execute his input query q = 〈q1, . . . , qm〉 on T ′ in thecloud in a privacy-preserving manner

C1 is the data host, who stores all uploaded (encrypted) data T ′

C2 is called the key holder since it stores Alice’s private key sk

Yousef M. Elmehdwi Privacy-Preserving Query Processing 9 / 57

Page 9: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Architecture of Two-Cloud Setting Based Solution

Yousef M. Elmehdwi Privacy-Preserving Query Processing 10 / 57

Page 10: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Adopted Security Model

More realistic model: Secure multiparty computation (SMC)

Parties collaboratively compute the functionality in a secure fashionwithout using a trusted third party

In SMC, security means guaranteeing the correctness of the output aswell as the privacy of the parties’ inputs

Yousef M. Elmehdwi Privacy-Preserving Query Processing 11 / 57

Page 11: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Adopted Adversarial Model

Adversarial modelGenerally specifies what an adversary or attacker is allowed to do during anexecution of a secure protocol

Common adversary models under SMC

Semi-honest: follow the protocol faithfully, but can try to infer thesecret information of the other parties from the data they see during theprotocol execution

Malicious: may do anything to infer secret information (e.g., inputmodification, sending the wrong values)

In our work, we adopt the semi-honest adversary model

Yousef M. Elmehdwi Privacy-Preserving Query Processing 12 / 57

Page 12: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Justification of Use of Semi-Honest Model

Two reasons for adopting Semi-Honest Model

Developing protocols under the semi-honest setting is an important firststep towards constructing protocols with stronger security guarantees

Both C1 and C2 were assumed to be two cloud service providers.Today, cloud service providers in the market are legitimate, well-knowncompanies (e.g., Amazon, Google, and Microsoft). These companiesmaintain reputations that are invaluable assets that need to be protectedat all costs. Thus, a collusion between them is highly unlikely as it willdamage their reputation, which, in turn, affects their revenues.

Yousef M. Elmehdwi Privacy-Preserving Query Processing 13 / 57

Page 13: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Additive Homomorphic Probabilistic Encryption

Epk and Dsk be the encryption and decryption functions. Given m1,m2 ∈ ZN ,the AH-ENC system exhibits the following properties.

Homomorphic Addition

Dsk(Epk(m1 + m2)

)= Dsk

(Epk(m1) ∗ Epk(m2)

)Homomorphic Multiplication

Given a constant c and a ciphertext Epk(m1)

Dsk(Epk(c ∗ m1)

)= Dsk

(Epk(m1)

c)Probabilistic

Let c1 = Epk(m1) and c2 = Epk(m2)

Probability for c1 6= c2 is very high even if m1 = m2

Semantic Security

Given Epk(m1), an adversary cannot derive any information about m1

Yousef M. Elmehdwi Privacy-Preserving Query Processing 14 / 57

Page 14: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Sub-protocol

Secure Multiplication (SM)

SM(⟨

C1 : Epk(a),Epk(b)⟩,⟨C2 : sk

⟩)→

(⟨C1 : Epk(a ∗ b)

⟩, 〈C2 : ∅〉

)Input : Epk(a), Epk(b), and private key sk

Output : encryption of a ∗ b

Secure Squared Euclidean Distance (SSED)

SSED(⟨

C1 : Epk(X),Epk(Y)⟩,⟨C2 : sk

⟩)→

(⟨C1 : Epk(|X − Y|2)

⟩, 〈C2 : ∅〉

)Input : X and Y are m-dimensional vectors, and private key sk , whereEpk(X) = 〈Epk(x1), . . . ,Epk(xm)〉, Epk(Y) = 〈Epk(y1), . . . ,Epk(ym)〉

Output : encryption of squared Euclidean distance between X and Y

Yousef M. Elmehdwi Privacy-Preserving Query Processing 15 / 57

Page 15: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Sub-Protocols 2

Secure Bit-OR (SBOR)

SBOR(⟨

C1 : Epk(o1),Epk(o2)⟩,⟨C2 : sk

⟩)→

(⟨C1 : Epk(o1 ∨ o2)

⟩, 〈C2 : ∅〉

)Input : o1 and o2 are two bits, and sk is private key sk

Output : encryption of the boolean OR operation between o1 and o2

Secure Bit-Decomposition (SBD)

SBD(⟨

C1 : Epk(z)⟩,⟨C2 : sk

⟩)→

(⟨C1 : [z]

⟩, 〈C2 : ∅〉

)Input : Epk(z) such that 0 ≤ z < 2l and sk is private key

Output : [z] =⟨Epk(z1), . . . ,Epk(zl)

⟩SBD: Example

Let z = 5, l = 3. SBD(Epk(5), sk

)→ [z] =

⟨Epk(1),Epk(0),Epk(1)

⟩Yousef M. Elmehdwi Privacy-Preserving Query Processing 16 / 57

Page 16: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

New Secure Minimum Sub-Protocol

Secure Minimum (SMIN)

SMIN(〈C1 : (u′, v′)〉, 〈C2 : sk〉)→ (〈C1 : [min(u, v)],Epk(smin(u,v))〉, 〈C2 : ∅〉)

Input: u′ =([u],Epk(su)

), v′ =

([v],Epk(sv)

), and sk is private key

[u] (resp., [v]) denotes the encryption of individual bits of binaryrepresentation of u (resp., v)su (resp., su) denotes the secret corresponding to u (resp., v)

Output:[min(u, v)]: encryptions of individual bits of minimum between u and vEpk(smin(u,v)): encryptions of secret corresponds to minimum of u and v

SMIN: Example

Let u′ =([5],Epk(s5)

)and v′ =

([3],Epk(s3)

), where [5] =

⟨Epk(1),Epk(0),Epk(1)

⟩,

[3] =⟨Epk(0),Epk(1),Epk(1)

⟩⇒SMIN(u′, v′)=

([3],Epk(s3)

)Yousef M. Elmehdwi Privacy-Preserving Query Processing 17 / 57

Page 17: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

New Secure Minimum out of n numbers Sub-Protocol

Secure Minimum out of n numbers (SMINn)

SMINn(〈C1 : (θ1, . . . , θn)〉, 〈C2 : sk〉)→ (〈C1 : [dmin],Epk(sdmin)〉, 〈C2 : ∅〉)

Input: ∀ni=1 θi =

([di],Epk(sdi )

)and sk is private key

[di]: encryption of individual bits of binary representation of di for1 ≤ i ≤ nsdi : secret corresponding to di for 1 ≤ i ≤ n

Output:[min(d1, . . . , dn)] = [dmin]: encryptions of individual bits of globalminimumEpk(smin(d1,...,dn)) = Epk(sdmin ): encryptions of secret corresponds toglobal minimum

SMAX and SMAXn

Similarly, one can design SMAX and SMAXn to compute the global maximum

Yousef M. Elmehdwi Privacy-Preserving Query Processing 18 / 57

Page 18: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

New Secure Frequency Sub-Protocol

Secure Frequency (SF)

SF(〈C1 : (Λ,Λ′)〉, 〈C2 : sk〉)→(〈C1 : Epk(f (c1)

), . . . ,Epk

(f (cw)

)〉, 〈C2 : ∅〉)

Input: Λ =⟨Epk(c1), . . . ,Epk(cw)

⟩, Λ′ =

⟨Epk(c′1), . . . ,Epk(c′k)

⟩, and sk is

private key

Output:Epk

(f (cj)

): encryption of the frequency of cj in the list 〈c′1, . . . , c′k〉, for

1 ≤ j ≤ wcj’s are unique and c′i ∈ {c1, . . . , cw} for 1 ≤ i ≤ k

SF: Example

(i.e., w = 3 and k = 6), Λ = 〈Epk(2),Epk(3),Epk(5)〉 andΛ′ = 〈Epk(3),Epk(2),Epk(3),Epk(2),Epk(5),Epk(2)〉⇒ SF(Λ,Λ′)→

⟨Epk

(f (2)

)= Epk(3),Epk

(f (3)

)= Epk(2),Epk

(f (5)

)= Epk(1)

⟩Yousef M. Elmehdwi Privacy-Preserving Query Processing 19 / 57

Page 19: Privacy-Preserving Query Processing over …lxiong/cs573_f16/share/slides/08...Cloud Computing Computation over Encrypted Data Cloud Computing 2 How to Ensure Data Confidentiality

MotivationTwo-Cloud Setting

PPkNN ClassificationConclusion and Future Research

OverviewSecurity modelPrivacy-preserving primitivesProving Security of SM

Proving Security Under The Semi-Honest Model

Key ideas

Information deduced from the messages in the real execution image ofa protocol should be computationally indistinguishable from theinformation deduced based on the corresponding messages in thesimulated view

For this, all the intermediate messages seen by an adversary during theexecution of a protocol should be either random or pseudo-random

To prove a protocol is secure under semi-honest model, it required toshow that the execution image of a protocol does not leak anyinformation regarding the private inputs of participating parties / weneed to show that the simulated execution image of that protocol iscomputationally indistinguishable from its actual execution image

An execution image generally includes the messages exchanged andthe information computed from these messages

Yousef M. Elmehdwi Privacy-Preserving Query Processing 20 / 57