monomi: practical analytical query processing over encrypted data
TRANSCRIPT
![Page 1: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/1.jpg)
Monomi: Practical Analytical Query Processing over Encrypted Data
Network Security
Dr. Ali Fanian
Mostafa Arjmand
![Page 2: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/2.jpg)
The Presentation Include The Following :
• INTRODUTION• SPLIT CLIENT/SERVER EXECUTION• OPTIMIZATION TECHNIQUES• DESIGNER AND PLANNER• EVALUATION• CONCLUSION
![Page 3: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/3.jpg)
Monomi
• Securely executing analytical query over sensitive data on an untrusted database server
![Page 4: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/4.jpg)
Problem: Want to run queries over data!
Vulnerable databaseTrusted user
Query
Response
“Give me the # of views of all adults by country”
US 1M
Italy 3K
… …
![Page 5: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/5.jpg)
Approach 1: Fully Homomorphic Encryption
• Run any computation over encrypted data• Prohibitive overheads in practice
Slowdown order 10^9× compared to computation
on plaintext data
![Page 6: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/6.jpg)
Approach 2: Specialized Schemes
• Cryptosystems supporting specific operations:• Equality (deterministic) [AES]• Addition [Paillier 99]• Inequality (order preserving) [Boldyreva 09]• Keyword Search [Song 00]
• These operations common in SQL queries…
![Page 7: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/7.jpg)
Practical state of the art: CryptDB
SELECT country_DET, PAILLIER_SUM(views_HOM) FROM users_ENCRYPTEDWHERE age_OPE > 0xDEADBEEFGROUP BY country_DET
Transformed Query:SELECT country, SUM(views) FROM users WHERE age > 18GROUP BY country
Original Query:Deterministic encryption: EqualityOrder preserving encryption: InequalityPaillier cryptosystem: Addition
0xDEADBEEF = Encrypt_OPE(18)
Under attack
DB Servertransformed queryplain query
Stores encryption keys
Applicationdecrypted results encrypted results
Trusted
Encrypted DB
No client computation: CryptDB requires that all computation in a query are supported by a specialized crypto-system
![Page 8: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/8.jpg)
Problem: OLTP ≠ OLAP
• CryptDB is designed for OLTP queries• We are interested in OLAP queries• Queries typically involve more computation• CryptDB can only support 4/22 TPC-H queries
![Page 9: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/9.jpg)
SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value
What happens when we run this query with CryptDB?
SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value
No efficient additive + multiplicative homomorphic cryptosystem
SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value
No efficient additive + order preserving homomorphic cryptosystem
Problem: OLTP ≠ OLAPSELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value
Most of the query can be executed on the server, except a few parts
CryptDB can handle queries that involve only computation supported by one of these encryption schemes
But few analytical queries fall in this category (only four out of 22 TPC-H), and for those four, CryptDB incurs significant Overhead
![Page 10: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/10.jpg)
Solution
• Monomi: A new system for practical analytical query processing • Split client/server query execution• Pre-computation + other runtime optimizations• Query planner/designer
Monomi: Can run TPC-H with 1.24x median overhead (vs. plaintext) using these three techniques.
![Page 11: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/11.jpg)
Encryption schemes used by MONOMI
![Page 12: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/12.jpg)
Overall architecture of MONOMI
![Page 13: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/13.jpg)
2 ) SPLIT CLIENT/SERVER EXECUTION
• In order to execute queries that cannot be computed on the server alone, MONOMI partitions the execution of each query
![Page 14: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/14.jpg)
TPC-H query 11
Example split query plan for TPC-H query 11
SELECT ps_partkey,SUM(ps_supplycost * ps_availqty) AS valueFROM partsupp JOIN supplier JOIN nationWHERE n_name = :1GROUP BY ps_partkeyHAVING SUM(ps_supplycost * ps_availqty) > (
SELECT SUM(ps_supplycost * ps_availqty) * 0.0001
FROM partsupp JOIN supplier JOIN nationWHERE n_name = :1 )ORDER BY value DESC;
Can execute on server
• Can’t execute on server
• They need decrypt on clien and execute
Pre-computed (ps_supplycost * ps_availqty) (Deterministic)Execute Group by with pre-computed ps_partkey_det
0xabcdef denotes the deterministic encryption of the argument :1
MONOMI’s encryption schemes cannot support such queries directly over encrypted data
![Page 15: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/15.jpg)
Pseudo-code for GENERATEQUERYPLAN
![Page 16: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/16.jpg)
3 ) OPTIMIZATION TECHNIQUES
• MONOMI’s designer and planner automatically use these techniques as appropriate to achieve good performance• Without requiring the application developer to perform manual query
rewriting
![Page 17: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/17.jpg)
Per-row precomputation
• Executing an operator on the client can require downloading a large amount of intermediate data• SUM(ps_supplycost*ps_availqty) • Paillier encryption cannot perform multiplication of two values• MONOMI employs per-row precomputation• Designer materializes an additional column in a table• These materialized expressions can then be used at query runtime
• MONOMI can materialize an additional column storing encryption of ps_supplycost*ps_availqty
• Which allows the server to compute the entire SUM()• This allows the client to download one encrypted aggregate value,
instead of downloading all ps_supplycost and ps_availqty values
![Page 18: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/18.jpg)
Space-efficient encryption
• Database queries that involve table scans—which are common in analytic workloads—are often bottlenecked by I/O
• Precomputed (o_orderdate_det) • To allow grouping by the extracted year, the value must be encrypted
deterministically, but using the standard AES or Blowfish algorithm would produce a 128- or 64-bit ciphertext even for 8-, 16-, or 32-bit.
• To minimize ciphertext expansion for deterministic encryption of such small data types, MONOMI uses the FFX block cipher mode of operation• Which encrypts n-bit plaintexts to n-bit ciphertexts, as long as n
is less than the block cipher width (e.g., 128 bits for AES).
![Page 19: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/19.jpg)
Grouped homomorphic addition
Conservative pre-filtering
![Page 20: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/20.jpg)
3 ) DESIGNER AND PLANNER
• The optimizations do not always apply to every query• In particular, a greedy application of the techniques or a greedy
execution of all parts of the query on the server does not necessarily yield the best performance and can waste space• The goal of MONOMI’s designer is to decide how to encrypt the data
(physical design). • MONOMI’s planner determines how to best execute queries given a
particular physical design. • This planner is also used at runtime to choose a query plan for a new
query from the application
![Page 21: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/21.jpg)
Input and output
• Input : Query workload Q1,Q2, ...,Qn.• Should be representative of the operations that the user is expecting to perform
over the data• Input : A sample of the data that will be loaded into the database • Is used for estimating statistics about the data• Need not be the exact data that will be eventually loaded on the server
• Input : The user can also specify a space constraint factor S• Controls how much space MONOMI’s designer can consume
• Designer returns to the user a physical design for the server• Is a set of (encrypted) columns to materialize for each table, including pre-
computed and Paillier columns
![Page 22: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/22.jpg)
Algorithm without constraints step 1
• The designer considers all of the operations in Qi, including all of the expressions in the WHERE and HAVING clauses, any ORDER BY, etc.• For each operation, the designer determines what expression and
encryption scheme would allow that operation to execute on the server.• The set of these (value, scheme) pairs for Qi is called EncSeti, and
defined E to be the set of all possible such pairs; that is, EncSeti E .• For example, a WHERE x = :1 clause generates a (x,DET) pair, referring to the
x column, and an ORDER BY x+y clause generates a (x+y,OPE) pair, referring to a precomputed x+y value.
![Page 23: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/23.jpg)
Algorithm without constraints step 2
• The designer invokes the planner • The planner computing power set that contains all subsets of EncSeti• It then constructs an execution plan for Qi for each element of the
power set• Where the execution plan describes what parts of the query would be
executed on the server, and what parts would be executed on the client
![Page 24: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/24.jpg)
Algorithm without constraints step 3
• For each of the execution plans, the planner uses a cost model to estimate how much time it would take to execute that plan.
![Page 25: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/25.jpg)
4 ) Evaluation
• How many TPC-H queries can Monomi run?• Can MONOMI efficiently execute an analytical workload over encrypted data on
an untrusted server?• How much do MONOMI’s optimization techniques, designer, and planner matter
in achieving good performance?• What are the overheads for MONOMI in terms of space and client-side CPU time?
• Setup:• TPC-H scale 10• Postgres 8.4 on Linux 2.6
• 8GB RAM, 16 cores, six 7200 RPM HDDs
![Page 26: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/26.jpg)
Most TPC-H queries supported
• Monomi’s approach handles all TPC-H queries• Our prototype handles 19/22 due to missing SQL features (e.g. views, LIKE
’%foo%bar%’, )
• First system we know of that can do this!• CryptDB only supports 4/22
![Page 27: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/27.jpg)
Execution time of TPC-H queries under various systems
min overhead 1.03x, median overhead 1.24x,
max overhead 2.33x
![Page 28: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/28.jpg)
Space and CPU overheads
Amount of disk space used on the server CPU overhead
![Page 29: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/29.jpg)
Sensitivity to designer input
![Page 30: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/30.jpg)
Number of distinct columns in the TPC-H tables encrypted by MONOM
![Page 31: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/31.jpg)
Security
• To understand the level of security that MONOMI provides, it is important to consider the encryption schemes chosen by MONOMI• The worst is OPE, which reveals order, followed by DET• A common use of OPE is for date fields, which may be less sensitive
• The next weakest scheme, DET• Is used in quite a few columns, but reveals only duplicates, which may be less
of a concern
![Page 32: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/32.jpg)
CONCLUSION
• Monomi: analytics on encrypted data can be made practical!• Techniques:• Split client/server execution• Pre-computation + space-efficient encryption + pre-filtering +…• Planner/designer
![Page 33: Monomi: Practical Analytical Query Processing over Encrypted Data](https://reader030.vdocuments.mx/reader030/viewer/2022012914/58a3339e1a28ab9b6d8b5207/html5/thumbnails/33.jpg)
THANK YOU FOR LISTENING