Secure query processing over encrypted Big Data in public cloud Mohammad Ahmadian www.cs.ucf.edu/~ahmadian Ph.D candidate in Computer Science University of Central Florida February 2016 Fayetteville State University


Page 1: Secure query processing over encrypted Big Data in public cloudcs.ucf.edu/~ahmadian/pubs/secureNoSQL.pdfwith encrypted query processing,” Proc. of the Twenty-Third ACM Symposium

Secure query processing over encrypted Big Data in public cloud

Mohammad Ahmadian
www.cs.ucf.edu/~ahmadian

Ph.D. candidate in Computer Science
University of Central Florida

February 2016
Fayetteville State University

Is it possible to delegate processing of your data without getting your private information revealed?

Outline

Introduction
Big Data
Cloud computing
Security
SecureNoSQL
Information Leakage
Conclusion
References

Introduction

Big Data helps us to:
Feed people
Supply energy
Provide medical care
in an efficient way.

[Figure: progress in data storage, starting from storing data on a disk, 2000 BC]

Big Data

Why do we have this much data?
Datafication
Atomization

Why do traditional database systems fail to support “big data”?
Big data demands faster response times
Big data demands scalability

Properties of Big Data:
Variety
Volume
Velocity

Big Data – Data models

Key-value store: a dictionary data structure in which a key uniquely identifies the value.

Column-family: data are stored in rows; each row has a unique key and a set of columns.

Document store: data are stored in an internal structure (a document) to offer a finer level of granularity. Each document is identified by a unique key.

Graph database: this model is based on graphs and can be used to represent complex structures and highly connected data.
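The four data models can be illustrated with plain in-memory Python structures. This is only a sketch for intuition: the record names and fields below are hypothetical and are not taken from the slides or from any particular NoSQL product.

```python
# Toy, in-memory illustrations of the four NoSQL data models.
# All keys, fields, and values are hypothetical examples.

# Key-value store: a key uniquely identifies a value.
key_value = {"user:1001": "Alice"}

# Column-family: each row has a unique key and its own set of columns.
column_family = {
    "row-1": {"name": "Alice", "city": "Orlando"},
    "row-2": {"name": "Bob"},  # rows need not share the same columns
}

# Document store: each document has internal (possibly nested) structure
# and is identified by a unique key.
documents = {
    "doc-1": {
        "name": "Alice",
        "visits": [{"date": "2016-02-01", "dept": "cardiology"}],
    },
}

# Graph database: nodes plus labeled edges between them.
graph = {
    "nodes": {"alice", "bob", "ucf"},
    "edges": [("alice", "knows", "bob"), ("alice", "studies_at", "ucf")],
}


def neighbors(g, node):
    """Return the nodes reachable from `node` over one outgoing edge."""
    return sorted(dst for src, _label, dst in g["edges"] if src == node)
```

The document store exposes nested fields such as `visits[0]["dept"]` to queries, which is the finer granularity mentioned above, while the key-value store treats the value as a single unit.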

Cloud computing

Compute resources as a utility: customers buy the computing resources they need from a central utility rather than generating them in-house.

Self-service provisioning: end users can spin up computing resources for almost any type of workload on demand.

Elasticity: scale up as computing needs increase, then scale down as they decrease.

Pay per use: computing resources are measured at a granular level, allowing users to pay only for the resources and workloads they use.

Cloud computing

Cloud computing services:

Public: provisioned for open use by the public by a particular organization that also hosts the service.

Private: used by a single organization; can be internally or externally hosted.

Community: shared by several organizations; typically externally hosted, but may be internally hosted by one of the organizations.

Hybrid: a composition of two or more clouds (private or public) that remain unique entities but are bound together, offering the benefits of multiple deployment models; hosted both internally and externally.

Security

Big Data, big security: according to “2015 Global Megatrends in Cybersecurity” [1], security and privacy threats are the leading reason organizations avoid adopting and using cloud services.

Legal issues: storing certain types of unencrypted data, such as medical records, off-site is illegal.

Encryption: data encryption nullifies the benefits of cloud computing (sacrificing convenience) unless the secret key is given to the cloud for decryption (sacrificing privacy).

Modern crypto-systems: with traditional crypto-systems it is impossible to outsource encrypted data to the cloud for processing. Thus, we need a new type of crypto-system.

SecureNoSQL objectives:

Unchanged DBMS: documents are encrypted on the client side and stored in the DBMS while respecting its standard semantics.

Security level-proportional overhead: a higher level of security incurs more overhead.

Open-ended: users can add new cryptographic modules to the system.

Multi-key, multi-level security: each data element can have a different security mechanism.

Descriptive language: key-value pairs describe the security parameters.

Data integrity: protection against unauthorized modification by malicious parties.

Architecture

SecureNoSQL

SecureNoSQL – Crypto-systems

Random (RND): encrypting the same message with the same key yields a different ciphertext each time.

Deterministic (DET): encrypting the same message with the same key always yields the same ciphertext, which allows equality checks on ciphertexts.

Order-preserving encryption (OPE): the order of the plaintexts is preserved in the ciphertexts.

Additive homomorphic encryption (HOM): allows limited operations (addition) over encrypted data.
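The behavioral difference between RND, DET, and OPE can be sketched with the Python standard library alone. The constructions below are toy stand-ins written for illustration (a SHA-256 based keystream and a running sum of keyed gaps). They are assumptions of this sketch, not the ciphers used by SecureNoSQL, and they are not suitable for real use.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand (key, nonce) into `length` pseudo-random bytes using SHA-256."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def rnd_encrypt(key: bytes, msg: bytes) -> bytes:
    """RND: a fresh random nonce makes every encryption look different."""
    nonce = secrets.token_bytes(16)
    return nonce + _xor(msg, _keystream(key, nonce, len(msg)))


def det_encrypt(key: bytes, msg: bytes) -> bytes:
    """DET: the nonce is derived from the message, so equal inputs collide."""
    nonce = hmac.new(key, msg, hashlib.sha256).digest()[:16]
    return nonce + _xor(msg, _keystream(key, nonce, len(msg)))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Both toy modes store the nonce in the first 16 bytes."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, nonce, len(body)))


def ope_encrypt(key: bytes, value: int) -> int:
    """Toy OPE for small non-negative integers.

    The ciphertext is a running sum of strictly positive keyed gaps, so
    a < b always implies ope_encrypt(a) < ope_encrypt(b). Cost is O(value).
    """
    total = 0
    for i in range(value + 1):
        gap = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
        total += 1 + int.from_bytes(gap[:2], "big")
    return total


demo_key = b"k" * 32  # hypothetical demo key
same_det = det_encrypt(demo_key, b"alice") == det_encrypt(demo_key, b"alice")
same_rnd = rnd_encrypt(demo_key, b"alice") == rnd_encrypt(demo_key, b"alice")
order_kept = ope_encrypt(demo_key, 17) < ope_encrypt(demo_key, 42)
```

DET supports equality matching on the server and OPE supports range comparisons, at the price of revealing equality and order respectively; RND reveals neither but supports no server-side predicate.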

SecureNoSQL – Security Schema

JSON schema: shows how the query data should be interpreted and how to extract and apply the security mechanisms for data items.

Sections of the schema:
Collection
Cryptographic modules
Data elements
Mapping of cryptographic modules to the fields

SecureNoSQL – Security Schema

Collection: a collection is a group of NoSQL documents, equivalent to an RDBMS table.

SecureNoSQL – Security Schema

Cryptographic modules: each entry holds the pointer to an item, the encryption key, and the initialization vector.

SecureNoSQL – Security Schema

Data elements: describes all data elements in JSON notation.

SecureNoSQL – Security Schema

Mapping of cryptographic modules to the fields: assigns cryptographic modules to the fields.
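A security schema with the four sections described above might look roughly like the following. The slide figures are not available in this transcript, so every field name, module identifier, and placeholder value here is an assumption made for illustration and does not reproduce the actual SecureNoSQL schema format.

```python
import json

# Hypothetical security schema with the four sections named on the slides.
# Keys and IVs are placeholders, not real key material.
security_schema = {
    "collection": "patients",
    "cryptographic_modules": [
        {"id": "det_1", "type": "DET", "key": "<hex key>", "iv": "<hex iv>"},
        {"id": "ope_1", "type": "OPE", "key": "<hex key>", "iv": None},
        {"id": "rnd_1", "type": "RND", "key": "<hex key>", "iv": "<hex iv>"},
    ],
    "data_elements": {
        "name": "string",
        "age": "integer",
        "diagnosis": "string",
    },
    "mapping": {"name": "det_1", "age": "ope_1", "diagnosis": "rnd_1"},
}


def module_for(schema: dict, field: str) -> dict:
    """Look up the cryptographic module assigned to `field`."""
    module_id = schema["mapping"][field]
    for module in schema["cryptographic_modules"]:
        if module["id"] == module_id:
            return module
    raise KeyError(f"module {module_id!r} is not defined in the schema")


# The schema is plain key-value data, so it serializes directly to JSON.
schema_json = json.dumps(security_schema, indent=2)
```

Keeping the mapping separate from the module definitions lets several fields share one module, and lets a field move to a stronger module without touching the rest of the schema.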

SecureNoSQL

Query encryption: parse the query elements and apply the cryptographic modules to them based on the security schema.

SecureNoSQL

Some sample NoSQL queries: the query elements are parsed and the cryptographic modules are applied to them based on the security schema.
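Query encryption can be sketched as a rewrite step on the client: each value in the query is passed through the module that the schema assigns to its field, and the server then matches on ciphertexts only. In this sketch `det_token` is a keyed-hash stand-in for a deterministic module (one-way, unlike real DET encryption), field names are left in the clear, and the mapping and key are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-key-not-for-real-use"  # hypothetical client-side key


def det_token(value) -> str:
    """Stand-in for a DET module: keyed and deterministic, so equality survives."""
    return hmac.new(KEY, str(value).encode(), hashlib.sha256).hexdigest()


# Hypothetical field-to-module mapping, as a security schema might provide it.
MAPPING = {"name": det_token, "city": det_token}


def encrypt_fields(record: dict) -> dict:
    """Apply the mapped module to each value; unmapped fields pass through."""
    encrypted = {}
    for field, value in record.items():
        module = MAPPING.get(field)
        encrypted[field] = module(value) if module else value
    return encrypted


def server_find(collection: list, encrypted_query: dict) -> list:
    """Equality match performed by the server on ciphertexts only."""
    return [
        doc
        for doc in collection
        if all(doc.get(f) == v for f, v in encrypted_query.items())
    ]


# Documents and queries go through the same transformation on the client.
stored = [
    encrypt_fields({"name": "Alice", "city": "Orlando"}),
    encrypt_fields({"name": "Bob", "city": "Orlando"}),
]
hits = server_find(stored, encrypt_fields({"name": "Alice"}))
```

Because documents and queries are transformed identically, the DBMS can evaluate the equality predicate without modification and without seeing any plaintext value.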

SecureNoSQL

Data integrity: using HMAC, the client generates hash values for all encrypted documents.
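A minimal sketch of this integrity check using Python's standard `hmac` module follows. The choice of SHA-256, the key, and the function names are assumptions of this example; the slides do not specify them.

```python
import hashlib
import hmac


def tag_document(mac_key: bytes, encrypted_doc: bytes) -> str:
    """Client side: compute an HMAC tag over the encrypted document."""
    return hmac.new(mac_key, encrypted_doc, hashlib.sha256).hexdigest()


def verify_document(mac_key: bytes, encrypted_doc: bytes, tag: str) -> bool:
    """Client side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_document(mac_key, encrypted_doc), tag)


mac_key = b"hypothetical-mac-key"
ciphertext = b"\x01\x02\x03 encrypted document bytes"
tag = tag_document(mac_key, ciphertext)

untouched_ok = verify_document(mac_key, ciphertext, tag)
tampered_ok = verify_document(mac_key, ciphertext + b"!", tag)
```

Since only the client holds `mac_key`, a server that modifies a stored document cannot produce a matching tag, so the modification is detected when the document is read back.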

SecureNoSQL – Data-flow

Information leakage

Information leakage from crypto-systems: weaknesses and strengths of crypto-systems.
Poor parameter selection for the crypto-system: short primes for RSA or a bad key for AES.
Intrinsic weakness of the crypto-system: OPE, DES, RC2, MD5.

Information leakage from statistical sampling: Zipf, Gaussian, power law.

Information leakage from access patterns: given a query q and a dataset D, the server easily finds out the set of documents that were touched.
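Statistical leakage can be made concrete with a small frequency-analysis sketch: deterministic ciphertexts keep the frequency histogram of the plaintexts, so an observer who knows the expected ranking of values can guess which token is which. The data, the key, and the keyed-hash stand-in for a deterministic module are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"hypothetical-det-key"


def det_token(key: bytes, value: str) -> str:
    """Keyed, deterministic stand-in for DET encryption."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


def rank_frequencies(values) -> list:
    """Occurrence counts sorted from most to least frequent."""
    return sorted(Counter(values).values(), reverse=True)


def guess_by_rank(tokens, known_ranking) -> dict:
    """Attacker: pair the i-th most frequent token with the i-th likeliest value."""
    ranked_tokens = [tok for tok, _count in Counter(tokens).most_common()]
    return dict(zip(ranked_tokens, known_ranking))


# A skewed (Zipf-like) column of plaintext values and its encrypted form.
plain = ["flu"] * 6 + ["cold"] * 3 + ["asthma"]
tokens = [det_token(KEY, v) for v in plain]

# The histogram shape survives encryption unchanged.
same_shape = rank_frequencies(plain) == rank_frequencies(tokens)

# Public knowledge of the ranking is enough to label the tokens.
guesses = guess_by_rank(tokens, ["flu", "cold", "asthma"])
```

The attacker never touches the key: the ranking of token frequencies alone recovers the mapping, which is why skewed distributions are a concern for deterministic and order-preserving schemes.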

Information leakage

Solution 1: add fake queries alongside each valid query.
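One way to picture this countermeasure is a client that hides each real query inside a shuffled batch of decoys and keeps only the answer to the real one. The slides do not describe how the fake queries are chosen, so the decoy pool, the parameter `k`, and the function name below are assumptions of this sketch.

```python
import random


def pad_with_fake_queries(real_query, decoy_pool, k, rng=None):
    """Return the real query mixed with k decoys in random order.

    Raises ValueError if the pool holds fewer than k distinct decoys.
    """
    rng = rng or random.Random()
    candidates = [q for q in decoy_pool if q != real_query]
    batch = rng.sample(candidates, k) + [real_query]
    rng.shuffle(batch)
    return batch


real = {"name": "token-a"}  # already-encrypted query (hypothetical)
pool = [{"name": f"token-{c}"} for c in "bcdefgh"]

batch = pad_with_fake_queries(real, pool, k=3, rng=random.Random(7))

# The client sends every query in `batch`, then discards the decoy results.
kept = [q for q in batch if q == real]
```

The server sees four access patterns instead of one and cannot tell which belongs to the real query, at the cost of roughly k extra queries of overhead per request.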

Information leakage

Solution 2: hide the statistical model of the encrypted data.

Conclusion

In this investigation, we found that modern cryptographic support is readily available for processing queries over encrypted, very large-scale data stores. Furthermore, we designed a system that provides security and integrity, with an overhead proportional to the required level of security.

Conclusion

The overhead of the crypto primitives in modern cryptography is proportional to the desired security level and to the operations we need to carry out on the ciphertext. Here is a simple comparison of the database size under different security configurations.

DB          Plain   OPE64   OPE128   OPE256   OPE512   HOM
Size (MB)   170     430     508      662      1000     3400
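The HOM configuration is the largest in the comparison above. A toy Paillier implementation gives some intuition for both the additive property and the size cost: ciphertexts live modulo n squared, about twice the bit length of the modulus n. Paillier is used here only as a well-known additive scheme; the slides do not say which scheme SecureNoSQL uses, and the tiny primes below are for demonstration only.

```python
import math
import secrets

# Tiny demo primes (assumption of this sketch); real keys use far larger primes.
P, Q = 293, 433
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # modular inverse, requires Python 3.8+


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N with fresh randomness r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Recover m using L(x) = (x - 1) // N followed by multiplication by MU."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N2


# The server can add two encrypted values without ever decrypting them.
total = decrypt(add_encrypted(encrypt(12), encrypt(30)))

# Ciphertexts are reduced mod N^2, so they need more bits than plaintexts.
plaintext_bits = N.bit_length()
ciphertext_bits = N2.bit_length()
```

Every encrypted value therefore occupies a full ciphertext modulo n squared regardless of how small the plaintext is, which is consistent with HOM being the most storage-hungry column.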

References

[1] Ponemon Institute Research Report, “2015 Global Megatrends in Cybersecurity,” Feb. 2015.

[2] M. Ahmadian, F. Plochan, Z. Roessler, and D. C. Marinescu, “SecureNoSQL: An approach for secure search of encrypted NoSQL databases in the public cloud,” International Journal of Information Management, vol. 37, no. 2, pp. 63–74, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0268401216302262

[3] R. A. Popa, C. M. S. Redfield, N. Zeldovich, and H. Balakrishnan, “CryptDB: Protecting confidentiality with encrypted query processing,” Proc. of the Twenty-Third ACM Symposium on Operating Systems Principles, pp. 85–100, 2011.

[4] H. Hu, J. Xu, C. Ren, and B. Choi, “Processing private queries over untrusted data cloud through privacy homomorphism,” in Data Engineering (ICDE), 2011 IEEE 27th International Conference on. IEEE, 2011, pp. 601–612.

[5] A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill, “Order-preserving symmetric encryption,” Advances in Cryptology-EUROCRYPT, pp. 224–241, 2009.

[6] M. Ahmadian, A. Paya, and D. Marinescu, “Security of applications involving multiple organizations and order preserving encryption in hybrid cloud environments,” IEEE International Conf. on Parallel Distributed Processing Symposium Workshops (IPDPSW), pp. 894–903, May 2014.

[7] C. Gentry et al., “Fully homomorphic encryption using ideal lattices,” in STOC, vol. 9, 2009, pp. 169–178.

[8] M. Ahmadian, “Secure query processing in cloud NoSQL,” in IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, USA, Jan. 2017.

[9] M. Ahmadian, J. Khodabandehloo, and D. Marinescu, “A security scheme for geographic information databases in location based systems,” IEEE SoutheastCon, pp. 1–7, 2015. doi:10.1109/SECON.2015.7132941.

Question & Discussion