Quoc-Cuong To, Benjamin Nguyen, Philippe Pucheral (SMIS Team)
Privacy-Preserving Query Execution using a Decentralized Architecture and Tamper Resistant Hardware. University of Versailles St-Quentin, INRIA Rocquencourt, CNRS. EDBT 2014, Athens, March 24-28.
![Page 1: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/1.jpg)
PRIVACY-PRESERVING QUERY EXECUTION USING A DECENTRALIZED ARCHITECTURE AND TAMPER RESISTANT HARDWARE
Quoc-Cuong To, Benjamin Nguyen, Philippe Pucheral
SMIS Team
EDBT 2014, Athens, March 24-28
University of Versailles St-Quentin, INRIA Rocquencourt, CNRS
![Page 2: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/2.jpg)
MASS-GENERATION OF (PERSONAL) DATA
Data sources have mostly turned digital:
• Analog processes, e.g., photography, films
• Paper-based interactions, e.g., banking, e-administration
• Communications, e.g., email, SMS, MMS, Skype
Where is your personal data? … In data centers
• 112 new emails per day → Mail servers
• 65 SMS sent per day → Telcos
• 800 pages of social data → Social networks
• Web searches, list of purchases → Google, Amazon
![Page 3: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/3.jpg)
DATA PRODUCED BY SECURE HARDWARE
Secure hardware is “everywhere”
Where is your personal data stored? … In data centers
![Page 4: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/4.jpg)
CENTRALIZED VS DE-CENTRALIZED
Centralized solutions:
• Privacy violation
• Internal & external attacks on server
• Single point of attack
De-centralized solution:
• Get rid of the assumption of a trusted central server
• Distributed secure devices
![Page 5: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/5.jpg)
ASYMMETRIC ARCHITECTURE: SECURE DEVICE
How to compute global queries on a nation-wide dataset over decentralized personal data stores while respecting users’ privacy?
Example: an authorized querier asks for the average energy consumption of France.
Secure Device (Trusted Data Server - TDS) characteristics:
• High security:
  • High cost/benefit ratio of an attack
  • Secure against its owner
• Modest computing resources (~10KB of RAM, 120MHz CPU)
• Low availability: physically controlled by its owner; connects and disconnects at will
![Page 6: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/6.jpg)
OUTLINE
• Generic protocol & variations
• Information exposure analysis
• Experiment
![Page 7: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/7.jpg)
THE GENERIC PROTOCOL
Querier
Supporting Server Infrastructure (SSI)
…
Query syntax:
SELECT <attribute(s) and/or aggregate function(s)>
FROM <Table(s)>
[WHERE <condition(s)>]
[GROUP BY <grouping attribute(s)>]
[HAVING <grouping condition(s)>]
[SIZE <size condition(s)>];
Collection phase
Aggregation phase
Stop condition: min #tuples or max time
Sample tuples: John, 35K; Mary, 43K; Paul, 100K
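The two phases and the stop condition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SizeClause`, `collect` and `aggregate_avg` are hypothetical names, the tuples are kept in plaintext, and an AVG query stands in for the general SELECT form; in the actual protocol the SSI would only see encrypted tuples.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

Row = Tuple[str, int]  # (name, salary in K), as in the slide's sample tuples


@dataclass
class SizeClause:
    """Hypothetical encoding of the SIZE clause: min #tuples or max time."""
    min_tuples: Optional[int] = None
    max_seconds: Optional[float] = None


def collect(tds_stream: Iterable[Row], size: SizeClause,
            clock: Callable[[], float] = time.monotonic) -> List[Row]:
    """Collection phase: gather tuples from TDSs until a stop condition holds."""
    start = clock()
    collected: List[Row] = []
    for row in tds_stream:
        # Stop on the time budget before accepting a new tuple.
        if size.max_seconds is not None and clock() - start >= size.max_seconds:
            break
        collected.append(row)
        # Stop as soon as the requested number of tuples is reached.
        if size.min_tuples is not None and len(collected) >= size.min_tuples:
            break
    return collected


def aggregate_avg(rows: List[Row]) -> float:
    """Aggregation phase for a query such as SELECT AVG(salary)."""
    return sum(salary for _, salary in rows) / len(rows)


rows = collect([("John", 35), ("Mary", 43), ("Paul", 100), ("Zoe", 50)],
               SizeClause(min_tuples=3))
print(rows, aggregate_avg(rows))
```

With `min_tuples=3` the fourth (made-up) tuple is never collected, so the average is computed over the three sample tuples from the slide.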
![Page 8: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/8.jpg)
HYPOTHESIS ABOUT QUERIER & SSI
Querier:
• Shares the secret key with TDSs (to encrypt the query & decrypt the result).
• Access control policy:
  • Cannot get the raw data stored in TDSs (gets only the final result)
  • Can obtain only authorized views of the dataset
Supporting Server Infrastructure:
• Prior knowledge about data distribution.
• Honest-but-curious attacker: frequency-based attack
  • SSI matches the plaintext and ciphertext values of the same frequency.
  • SSI looks at remarkable (very high/low) frequencies in the dataset distribution (e.g., Mr. X with high salary = 1 M€/month and there is only one distinct encrypted salary → Mr. X participates in the dataset).
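The frequency-based attack can be illustrated with a small Python sketch. Everything here is an assumption made for the example: `det_enc` is a toy keyed hash standing in for deterministic encryption, the salaries and counts are invented, and `frequency_attack` simply pairs ciphertexts and plaintexts by frequency rank.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"demo-key"  # hypothetical key shared by querier and TDSs, unknown to the SSI


def det_enc(value: str) -> str:
    """Toy stand-in for deterministic encryption: equal inputs, equal outputs."""
    return hmac.new(KEY, value.encode(), hashlib.sha256).hexdigest()


def frequency_attack(ciphertexts, known_distribution):
    """Pair ciphertexts with plaintexts by frequency rank (SSI's prior knowledge)."""
    cipher_by_freq = [c for c, _ in Counter(ciphertexts).most_common()]
    plain_by_freq = [p for p, _ in
                     sorted(known_distribution.items(), key=lambda kv: -kv[1])]
    return dict(zip(cipher_by_freq, plain_by_freq))


# Invented dataset: one remarkable (very rare) salary among common ones.
salaries = ["30K"] * 5 + ["50K"] * 3 + ["1M"]
observed = [det_enc(s) for s in salaries]          # what the SSI sees
prior = {"30K": 5, "50K": 3, "1M": 1}              # what the SSI already knows

guess = frequency_attack(observed, prior)
print(guess[det_enc("1M")])  # the rare ciphertext is matched to the rare salary
```

The SSI never decrypts anything here; matching frequency ranks is enough to single out the participant with the remarkable salary.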
![Page 9: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/9.jpg)
RELATED WORKS
• Outsourced database services: simple queries or high computing cost
• Statistical Database & Differential privacy: trust the server, produce approximate results
• Secure Multi-party Computation: not scalable
• Secure Data Aggregation in wireless sensor networks: nodes communicate with each other in order to form a network topology
First proposal achieving a fully distributed and secure solution to compute general SQL queries over a large set of participants
![Page 10: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/10.jpg)
CLASSIFICATION OF SOLUTIONS
Which encryption is used, how the SSI constructs the partitions, and what information is revealed to the SSI
• Secure aggregation solution: nDet_Enc
• Noise-based solutions: Det_Enc + fake data
  • random (white) noise
  • noise controlled by the complementary domain
• Histogram-based solution: equi-depth histogram
Performance & Security
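The difference between Det_Enc and nDet_Enc can be made concrete with a toy Python sketch. It is not the cipher used by the authors and is not meant to be secure: the keystream construction, the `KEY` constant and the function names are all assumptions chosen so that the example runs with the standard library only.

```python
import hashlib
import hmac
import os

KEY = b"demo-key"  # hypothetical key shared by the TDSs


def _keystream(nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(KEY + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(plaintext: bytes, deterministic: bool) -> bytes:
    """Det_Enc derives the nonce from the plaintext; nDet_Enc draws it at random."""
    if deterministic:
        nonce = hmac.new(KEY, plaintext, hashlib.sha256).digest()[:16]
    else:
        nonce = os.urandom(16)
    return nonce + _xor(plaintext, _keystream(nonce, len(plaintext)))


def decrypt(ciphertext: bytes) -> bytes:
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(nonce, len(body)))


# Det_Enc: the SSI can group equal values (and count their frequency).
print(encrypt(b"Paris", True) == encrypt(b"Paris", True))
# nDet_Enc: two encryptions of the same value look unrelated to the SSI.
print(encrypt(b"Paris", False) == encrypt(b"Paris", False))
```

This is the trade-off the classification turns on: deterministic ciphertexts let the SSI build per-group partitions but expose frequencies, while non-deterministic ciphertexts hide frequencies but prevent grouping.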
![Page 11: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/11.jpg)
SECURE AGGREGATION
Figure: secure aggregation protocol between the Querier, the Supporting Server Infrastructure (SSI) and the TDSs.
Q: SELECT City, SUM(Energy) GROUP BY City HAVING SUM(Energy) > 50B
Labels recovered from the figure:
• Querier: Qi = <EK1(Q), Credential, Size>
• Each TDS: decrypt Qi, check AC rules
• Each TDS encrypts its data using non-deterministic encryption
  • plaintext samples: (Paris, 35K) (Lyon, 43K) (Nice, 100K)
  • encrypted samples: (#x3Z, aW4r) ($f2&, bG?3) (T?f2, s5@a)
• SSI: form partitions
  • (#x3Z, aW4r)($f2&, bG?3)($&1z, kHa3)…(T?f2, s5@a)
  • (#i3Z, afWE)(T?f2, s!@a)($f2&, bGa3)
  • (?i6Z, af~E)(T?f2, s5@a)(5f2A, bG!3)
• TDSs: hold partial aggregation (Gij, AGGk)
  • (Paris, 35K)(Lyon, 24K)(Lyon, 43K) is aggregated into (Paris, 35K)(Lyon, 67K)
  • encrypted partial aggregation: (F!d2, s7@z)(ZL5=, w2^Z)
• Final Agg: (#f4R, bZ_a)(Ye”H, fw%g)(@!fg, wZ4#)
  • plaintext samples: (Paris, 912300M)(Lyon, 56000M)
• Evaluate HAVING clause
• Final Result: (#f4R, bZ_a)(Ye”H, fw%g)
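The partition / partial-aggregation / final-aggregation flow can be sketched as follows. This is a simplified model under several assumptions: encryption and access-control checks are omitted (tuples are shown in the clear, whereas the SSI would only handle ciphertexts), the names `form_partitions`, `partial_aggregate`, `merge` and `secure_aggregation` are hypothetical, and the `fan_in` parameter for merging partial aggregations is an illustrative choice, not something stated on the slide.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

CityEnergy = Tuple[str, int]  # (City, Energy)


def form_partitions(items: Sequence, size: int) -> List[list]:
    """SSI side: cut a list of opaque items into partitions of at most `size`."""
    if size < 1:
        raise ValueError("partition size must be >= 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def partial_aggregate(partition: List[CityEnergy]) -> Dict[str, int]:
    """TDS side: SUM(Energy) per City within one partition."""
    sums: Dict[str, int] = defaultdict(int)
    for city, energy in partition:
        sums[city] += energy
    return dict(sums)


def merge(partials: List[Dict[str, int]]) -> Dict[str, int]:
    """TDS side: combine several partial aggregations into one."""
    sums: Dict[str, int] = defaultdict(int)
    for partial in partials:
        for city, energy in partial.items():
            sums[city] += energy
    return dict(sums)


def secure_aggregation(tuples: Sequence[CityEnergy], partition_size: int,
                       fan_in: int, having_min: int) -> Dict[str, int]:
    if fan_in < 2:
        raise ValueError("fan_in must be >= 2 so that each round shrinks the input")
    # Round 1: each partition of collected tuples is aggregated by some TDS.
    partials = [partial_aggregate(p) for p in form_partitions(tuples, partition_size)]
    # Next rounds: partial aggregations are merged until a single one remains.
    while len(partials) > 1:
        partials = [merge(group) for group in form_partitions(partials, fan_in)]
    final = partials[0] if partials else {}
    # HAVING is evaluated only on the final aggregation.
    return {city: total for city, total in final.items() if total > having_min}


data = [("Paris", 35), ("Lyon", 24), ("Lyon", 43), ("Nice", 100), ("Paris", 20)]
print(secure_aggregation(data, partition_size=3, fan_in=2, having_min=60))
```

Evaluating HAVING only at the end matters: a group whose partial sum is below the threshold in one partition may still pass once all partitions are merged.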
![Page 12: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/12.jpg)
NOISE-BASED PROTOCOLS
• nDet_Enc on AG → SSI cannot gather tuples belonging to the same group into the same partition.
• Det_Enc on AG → frequency-based attack.
• Add noise (fake tuples) to hide the distribution of AG.
• How many fake tuples (nf) are needed? It depends on the disparity in frequencies among AG:
  • small nf: random noise
  • big nf: white noise
  • nf = n-1: controlled noise (n: AG domain cardinality)
• Efficiency: each TDS handles tuples belonging to one group (instead of a large partial aggregation as in SAgg)
• However, high cost of generating and processing the very large number of fake tuples
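The controlled-noise case (nf = n-1) can be sketched in Python. The example rests on assumptions that are not on the slide: AG is taken to be a single grouping attribute (City), the domain is known to every TDS, and a boolean `is_fake` flag, which would travel inside the encrypted payload, lets TDSs discard fake tuples during aggregation. All names and data are invented.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Sent = Tuple[str, int, bool]  # (AG value, measure, is_fake)


def add_controlled_noise(true_tuple: Tuple[str, int],
                         domain: Sequence[str]) -> List[Sent]:
    """One TDS: its true tuple plus n-1 fake tuples, one per other AG value."""
    group, value = true_tuple
    fakes = [(g, 0, True) for g in domain if g != group]
    return [(group, value, False)] + fakes


def aggregate_real(sent: List[Sent]) -> Dict[str, int]:
    """TDS side: fake tuples are dropped before computing SUM per group."""
    sums: Dict[str, int] = defaultdict(int)
    for group, value, is_fake in sent:
        if not is_fake:
            sums[group] += value
    return dict(sums)


domain = ["Paris", "Lyon", "Nice"]          # n = 3, so nf = 2 per true tuple
tds_data = [("Paris", 35), ("Paris", 20), ("Lyon", 43)]

sent = [t for row in tds_data for t in add_controlled_noise(row, domain)]
print(Counter(group for group, _, _ in sent))  # what the SSI can count per group
print(aggregate_real(sent))
```

Each of the three groups appears exactly three times in what the SSI sees, so the true skew (two Paris tuples, no Nice tuple) is hidden, at the price of tripling the number of tuples to generate and process.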
![Page 13: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/13.jpg)
NEARLY EQUI-DEPTH HISTOGRAM
• Distribution of AG is discovered and distributed to all TDSs.
• TDS allocates its tuple to the corresponding bucket.
• Send to SSI: {h(AG), nDet_Enc(tuple)}, where h(AG) = bucketID
Benefits:
• Does not generate & process too many fake tuples
• Does not handle too large a partial aggregation
Figure: true distribution vs. nearly equi-depth histogram
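One way to obtain a nearly equi-depth mapping h(AG) = bucketID from a known distribution is a greedy assignment, sketched below. The heuristic (place the most frequent values first, always into the currently shallowest bucket) is an assumption made for this example, not necessarily the construction used by the authors; `build_histogram`, `bucket_depths` and the city counts are invented.

```python
import heapq
from typing import Dict


def build_histogram(distribution: Dict[str, int], num_buckets: int) -> Dict[str, int]:
    """Return h: AG value -> bucketID, balancing the number of tuples per bucket."""
    heap = [(0, bucket_id) for bucket_id in range(num_buckets)]  # (depth, bucketID)
    heapq.heapify(heap)
    h: Dict[str, int] = {}
    # Most frequent values first; ties broken by name for reproducibility.
    for value, count in sorted(distribution.items(), key=lambda kv: (-kv[1], kv[0])):
        depth, bucket_id = heapq.heappop(heap)   # shallowest bucket so far
        h[value] = bucket_id
        heapq.heappush(heap, (depth + count, bucket_id))
    return h


def bucket_depths(distribution: Dict[str, int], h: Dict[str, int]) -> Dict[int, int]:
    """Number of tuples that land in each bucket under the mapping h."""
    depths: Dict[int, int] = {}
    for value, count in distribution.items():
        depths[h[value]] = depths.get(h[value], 0) + count
    return depths


distribution = {"Paris": 50, "Lyon": 30, "Nice": 20, "Lille": 10, "Brest": 10}
h = build_histogram(distribution, num_buckets=2)
print(h)
print(bucket_depths(distribution, h))
```

A TDS holding a Lyon tuple would then send `{h["Lyon"], nDet_Enc(tuple)}`: the SSI can route the tuple to the right bucket, but every bucket has roughly the same depth, so the bucket identifiers reveal little about the true distribution.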
![Page 14: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/14.jpg)
INFORMATION EXPOSURE (DAMIANI ET AL. CCS 2003)
![Page 15: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/15.jpg)
INFORMATION EXPOSURE
SAgg: $IC_{i,j} = 1/N_j$ for all $i, j$, hence

$$\epsilon_{S\_Agg} = \frac{1}{n}\sum_{i=1}^{n}\prod_{j=1}^{k}\frac{1}{N_j} = \prod_{j=1}^{k}\frac{1}{N_j}$$

• $n$: the number of tuples
• $k$: the number of attributes
• $IC_{i,j}$: the value in row $i$ and column $j$ in the IC table
• $N_j$: the number of distinct plaintext values in the global distribution of the attribute in column $j$ (i.e., $N_j \le n$)

EDHist:

$$\epsilon_{ED\_Hist} = \prod_{j=1}^{k}\frac{1}{\min(N_j)}$$

EDHist: requires finding all possible partitions of the plaintext values such that the sum of their occurrences is the cardinality of the mapped value: NP-hard multiple subset sum problem.
Noise_based & ED_Hist have a uniform distribution of the AG: $\epsilon_{ED\_Hist} = \epsilon_{Noise\_based}$

Plaintext:

$$\epsilon_{P\_Text} = \frac{1}{n}\sum_{i=1}^{n}\prod_{j=1}^{k} 1 = 1$$

$$\epsilon_{S\_Agg} \le \epsilon_{ED\_Hist} = \epsilon_{Noise\_based} < 1$$
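The exposure coefficient of Damiani et al. is the average, over the n rows of the IC table, of the product of the k cell values in each row. The short Python sketch below evaluates it for the two closed-form cases on this slide (SAgg with ICi,j = 1/Nj, and plaintext with every cell equal to 1). The helper names and the sample values n = 4, N = [2, 5] are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import prod
from typing import List

ICTable = List[List[Fraction]]  # n rows (tuples) by k columns (attributes)


def exposure(ic_table: ICTable) -> Fraction:
    """Average over rows of the product of the IC values in that row."""
    n = len(ic_table)
    return sum(prod(row, start=Fraction(1)) for row in ic_table) / n


def s_agg_table(n: int, distinct_counts: List[int]) -> ICTable:
    """SAgg: every cell of column j equals 1/N_j."""
    return [[Fraction(1, n_j) for n_j in distinct_counts] for _ in range(n)]


def plaintext_table(n: int, k: int) -> ICTable:
    """Plaintext: every value is fully exposed, so every cell equals 1."""
    return [[Fraction(1)] * k for _ in range(n)]


print(exposure(s_agg_table(4, [2, 5])))   # 1/(2*5)
print(exposure(plaintext_table(4, 2)))    # 1
```

Because every row of the SAgg table is identical, the average collapses to the product of the 1/Nj terms, independent of n.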
![Page 16: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/16.jpg)
UNIT TEST
Internal time consumption
• 32-bit RISC CPU: 120 MHz
• Crypto-coprocessor: AES, SHA
• 64KB RAM, 1GB NAND-Flash
• USB full speed: 12 Mbps
![Page 17: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/17.jpg)
METRICS FOR THE EVALUATION: TRADE-OFF BETWEEN CRITERIA
Figure: trade-off between the evaluation criteria (axes: Total Load, Average Time/Load, Query Response Time, Information Exposure, Resource Variation)
![Page 18: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/18.jpg)
WHICH ONE ?
• S_Agg & ED_Hist: best solutions.
• ED_Hist: suited to TDSs that seldom connect and save resources for their own tasks, e.g., medical folders.
• S_Agg: suited to TDSs that are connected all the time, are mostly idle, and do not need to spare resources, e.g., smart meters.
![Page 19: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/19.jpg)
FUTURE WORK
• Support external joins (i.e., joins between several TDSs).
• Extend the threat model to (a small number of) compromised TDSs
![Page 20: Quoc-Cuong To , Benjamin Nguyen, Philippe Pucheral SMIS Team](https://reader035.vdocuments.mx/reader035/viewer/2022081515/56816216550346895dd241a6/html5/thumbnails/20.jpg)