Efficient Secure Query Evaluation over Encrypted XML Databases
Wendy Hui Wang
Laks V.S. Lakshmanan
University of British Columbia, Canada
Wang, Lakshmanan Efficient Secure Query Evaluation over Encrypted XML Databases
Outline
  Introduction
  Design of metadata
  Secure and efficient query processing
Database-as-Service (DAS) Model
  Data owner
    Small business with a limited budget (e.g., an online art gallery owner)
    Owns a large XML database (e.g., one containing information about paintings and customers)
    Cannot afford a suitable database server
    More cost-effective: hosts the database on a third-party remote server
      E.g., the Caspio web database service provider
Security Concerns in DAS Model
  Data owner
    Does NOT trust the server
    Protects the sensitive information in the database
      Individual XML elements with their content (structure of the subelements, data values, etc.)
        E.g., the customer’s financial information
      Associations between data values
        E.g., the customer’s name and the paintings he/she purchased
  [Figure: XML fragment with FinancialAccount elements holding the values visa, mastercard, visa]
Database-as-Service Model (Cont.)
  Data owner
    Stores the encrypted database on the server
    Keeps the decryption keys to himself
  Server
    Provides data storage and a query engine as services
    Doesn’t have the decryption keys
The Queries by Data Owner
  Sent remotely from the data owner’s handheld devices
  The answers are a very small portion of the database
    E.g., “The names of the paintings that Andy bid for”
  The answers are post-processed on the handheld devices
    The devices have a decryptor and a query engine installed
    Limited bandwidth
    Limited memory and processing power
Naïve Method of Query Processing
  Returns the whole encrypted database to the client
  Disadvantages
    High cost of data transfer, decryption, and query post-processing
    May exceed the computational capabilities of handheld devices
  [Figure: the untrusted server holds the encrypted XML database; the client runs the XML decryptor and the query executor to produce the answer of the query]
Another Option for Query Processing
  Encrypts tags and data values in the database individually
    E.g., purchase (cname: Andy, pname: Lily) is stored as purchase (cname: A, pname: Lily)
  Tags and values in the query are encrypted in the same way as in the database
    E.g., purchase[cname=Andy]/pname becomes purchase[cname=A]/pname
  Query processing is more efficient than with the naïve method
  But there is a security breach!
    E.g., the attacker knows that Andy is the biggest customer of the art gallery. Then the encrypted value under “customer” with the largest # of occurrences must correspond to “Andy”.
Our Goals
  Security
    Guarantee no leakage of sensitive information to the untrusted server/disk
  Efficient query evaluation
    The server returns ONLY the portions of the database that are relevant to the data owner’s query
Our Approach
  [Figure: system architecture. On the client, the Query Translator rewrites query Q (“//purchase[//cname=‘Andy’]/pname”) into Qs. On the untrusted server, the Query Executor evaluates Qs using the metadata and the encrypted XML database, and returns the encryption blocks relevant to Q. On the client, the XML Decryptor and the Query Executor process the returned blocks to produce the answer of query Q (“Lily”).]
Our Contributions
  Security constraints (see paper)
  Formal definition of the attack model and security
  Construction of the secure encryption scheme (see paper)
    Finding an optimal secure encryption scheme is NP-hard
  Design of the metadata on the server
  Efficient and secure query processing
Outline
  Introduction
  Design of metadata
    Structural index
    Value index
  Secure and efficient query processing
Structural Index
  Purpose: efficient processing of tags and XPath predicates (/, //, [], siblings, etc.) in the query
  The interval index of an element
    Each element is assigned an interval (start, end)
    For parent u and child v, u.start < v.start < v.end < u.end
    The intervals of adjacent nodes don’t overlap
  The structural index
    Index table entry: <(encrypted) tag, interval index>
    Encryption block table entry: <encryption block ID, interval index>
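The containment rule above can be illustrated with a minimal sketch. The `Node` class, the slot-based numbering in `assign_intervals`, and the helper `is_ancestor` are assumptions made for illustration only; the slides do not say how the intervals are actually chosen.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)
    start: float = 0.0
    end: float = 0.0


def assign_intervals(node, start=0.0, end=1.0):
    """Give `node` the interval (start, end) and nest every child strictly inside it."""
    node.start, node.end = start, end
    n = len(node.children)
    if n == 0:
        return
    # Hypothetical numbering: cut the parent interval into 2n+1 slots and give
    # child i the slot 2i+1, so siblings never overlap and gaps remain.
    width = (end - start) / (2 * n + 1)
    for i, child in enumerate(node.children):
        assign_intervals(child,
                         start + (2 * i + 1) * width,
                         start + (2 * i + 2) * width)


def is_ancestor(u, v):
    """u is an ancestor of v iff u.start < v.start < v.end < u.end."""
    return u.start < v.start and v.end < u.end


root = Node("purchase", [Node("cname"), Node("pname")])
assign_intervals(root)
cname, pname = root.children
print(is_ancestor(root, cname), is_ancestor(cname, pname))
```

With intervals in place, ancestor/descendant tests reduce to comparisons on interval endpoints, which is why the server can evaluate / and // steps without seeing the plaintext tags.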
Attacks on Structural Index
  By accessing the structural index T and an encrypted element E, the attacker constructs the candidates for the original element such that
    they have the same structural index T
    the size of the encrypted candidate is the same as that of E
  [Figure: element tree with root α [0, 1] and leaf nodes β [0.2, 0.25], γ [0.55, 0.75], δ [0.8, 0.82], δ [0.83, 0.84], δ [0.85, 0.9]]
  Index table
    Tag      Index
    Enc(A)   [0, 1]
    Enc(B)   [0.2, 0.25]
    Enc(C)   [0.55, 0.75]
    Enc(D)   [0.8, 0.82], [0.83, 0.84], [0.85, 0.9]
  The # of such candidates is 1, i.e., the attacker can reconstruct the structure of the original element!
More Secure Structural Index
  Grouping of the intervals in the index table
    The intervals of adjacent nodes that have the same tag and are encrypted in the same block are grouped together
  Index table before grouping
    Tag      Index
    Enc(A)   [0, 1]
    Enc(B)   [0.2, 0.25]
    Enc(C)   [0.55, 0.75]
    Enc(D)   [0.8, 0.82], [0.83, 0.84], [0.85, 0.9]
  Index table after grouping
    Tag      Index
    Enc(A)   [0, 1]
    Enc(B)   [0.2, 0.25]
    Enc(C)   [0.55, 0.75]
    Enc(D)   [0.8, 0.9]
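The grouping step can be sketched as follows. This is only an illustration under stated assumptions: entries are taken to be listed in document order, "adjacent" is read as "consecutive in that order", and the function name `group_intervals` and the block ID "b1" are hypothetical.

```python
def group_intervals(entries):
    """Merge runs of consecutive entries that share the same tag and block.

    entries: list of (encrypted_tag, block_id, (start, end)) in document order.
    Returns a new list in which each run is replaced by one covering interval.
    """
    grouped = []
    for tag, block, (start, end) in entries:
        if grouped and grouped[-1][0] == tag and grouped[-1][1] == block:
            # Same tag, same block, adjacent: extend the previous interval.
            _, _, (prev_start, _) = grouped[-1]
            grouped[-1] = (tag, block, (prev_start, end))
        else:
            grouped.append((tag, block, (start, end)))
    return grouped


# The leaf entries of the slide's example, all assumed to lie in one block.
leaves = [
    ("Enc(B)", "b1", (0.2, 0.25)),
    ("Enc(C)", "b1", (0.55, 0.75)),
    ("Enc(D)", "b1", (0.8, 0.82)),
    ("Enc(D)", "b1", (0.83, 0.84)),
    ("Enc(D)", "b1", (0.85, 0.9)),
]
for entry in group_intervals(leaves):
    print(entry)
```

After grouping, the three Enc(D) intervals collapse into [0.8, 0.9], so the index no longer shows how many D nodes the element contains.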
Security Example of Structural Index
  Original element: A [0, 1] with leaf nodes B [0.2, 0.25], C [0.55, 0.75], D D D [0.8, 0.9]
  Candidate 1: A [0, 1] with leaf nodes B [0.2, 0.25], C C [0.55, 0.75], D D [0.8, 0.9]
  Candidate 2: A [0, 1] with leaf nodes B B [0.2, 0.25], C C [0.55, 0.75], D [0.8, 0.9]
  3 intervals on 5 leaf nodes
  # of candidates: $\binom{5-1}{3-1} = \binom{4}{2} = 6$
Technical Result of Security of Structural Metadata
  We prove that there exists a large number of candidate databases (including the true hosted database) such that:
    For any query that is captured by any security constraint, only the true database returns a non-empty answer
    By looking at the structural index, the candidates are pairwise indistinguishable
Related Work of Structural Index
  Efficient Tree Search in Encrypted XML Database [Brinkman et al. 2004]
    Stores a relational table containing structural information of the database on the server
    Compromises the security of structural information
  XML interval index schemes [Al-Khalifa et al. 2002, Chien et al. 2002, etc.]
    Focus only on efficiency; don’t consider security
Outline
  Design of metadata
    Structural index
    Value index
  Secure and efficient query processing
Value Index
  Purpose: efficient processing of value-based constraints in the queries
  Every encrypted data value in the database is indexed in the format <(encrypted) value, block IDs>
  By accessing the value index, the attacker can count the # of occurrences of encrypted values
  [Figure: value index over the encrypted values Enc(20), Enc(30), Enc(50), Enc(70), Enc(90), Enc(100), whose entries point to block ID lists such as {1, 2, 5, 6}, {3, 6}, {3, 4}, {1, 2}]
Attacks on Value Index
  Attacker’s aim: infer the mapping between plaintext values and the corresponding index entries, and consequently crack the associations between data values
    E.g., he wants to find out which paintings Andy has bought. The names of paintings are not encrypted, but the names of customers are.
  His prior knowledge: the # of occurrences of some data values in the original database
    E.g., from the newspaper, he knows that Andy has bought 10 paintings from the art gallery for charity purposes.
Attacks on Value Index (Cont.)
  His approach: map the encrypted values to plaintext values based on their # of occurrences
    E.g., “A” is the only value in the index with ≥ 10 occurrences. Then “A” must map to “Andy”.
    Consequently, the attacker finds out which paintings Andy has purchased
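The frequency-based attack described above can be sketched in a few lines. The index contents and the function name `frequency_attack` are made up for illustration, and the sketch assumes that the index lists one block ID per occurrence of an encrypted value.

```python
def frequency_attack(value_index, known_min_counts):
    """Match plaintext values to encrypted values by occurrence counts.

    value_index: encrypted value -> list of block IDs (assumed one per occurrence).
    known_min_counts: plaintext value -> minimum # of occurrences the attacker knows.
    Returns the plaintext values for which exactly one encrypted value fits.
    """
    observed = {enc: len(blocks) for enc, blocks in value_index.items()}
    guesses = {}
    for plain, min_count in known_min_counts.items():
        candidates = [enc for enc, n in observed.items() if n >= min_count]
        if len(candidates) == 1:
            guesses[plain] = candidates[0]
    return guesses


# Hypothetical value index: "A" occurs 12 times, "B" 5 times, "C" 9 times.
index = {
    "A": list(range(1, 13)),
    "B": [3, 4, 7, 9, 11],
    "C": list(range(20, 29)),
}
print(frequency_attack(index, {"Andy": 10}))
```

Because only "A" reaches 10 occurrences, the attacker's newspaper knowledge is enough to pin "Andy" to "A".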
Our Solution
  Order-preserving encryption with splitting and scaling (OPESS)
    Order preserving: efficient query processing
    Splitting and scaling
      Purpose: make the frequency distribution of the encrypted data values in the value index different from that of the original values
Splitting
  E.g., for data values on attribute “CustomerName”
    Plaintext value   # of occurrences   Encrypted values   # of occurrences
    Andy              12                 KA, KH, KT         3 + 4 + 5 = 12
    Betty             5                  SF                 5 = 5
    Carl              9                  WA, WE             4 + 5 = 9
  Every plaintext value p is encrypted into multiple distinct ciphertext values {v1, v2, ..., vn} by using distinct keys, with ∑|vi| = |p|
    For every encrypted value vi, |vi| ∊ {m-1, m, m+1}
  Order is preserved: encrypted values corresponding to different plaintext values never straddle each other
  The mapping between encrypted values and plaintext values is unique, i.e., splitting alone is not secure!
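One way to obtain chunk sizes that satisfy the constraint |vi| ∊ {m-1, m, m+1} is sketched below. The function `split_count` and its greedy strategy are assumptions for illustration; the slides only state the constraint, not how the chunks are chosen.

```python
import math


def split_count(n, m):
    """Split n occurrences into chunks whose sizes all lie in {m-1, m, m+1}.

    Returns a list of chunk sizes summing to n, or None if no such split
    exists with the smallest possible number of chunks.
    """
    k = math.ceil(n / (m + 1))      # fewest chunks that can hold n occurrences
    if k * (m - 1) > n:             # even the smallest chunks would overshoot
        return None
    parts = [m - 1] * k
    extra = n - k * (m - 1)         # occurrences still to distribute (at most 2k)
    for i in range(k):
        add = min(2, extra)         # each chunk can grow by at most 2 (up to m+1)
        parts[i] += add
        extra -= add
    return parts


# Occurrence counts from the CustomerName example, with m = 4.
for name, count in [("Andy", 12), ("Betty", 5), ("Carl", 9)]:
    print(name, split_count(count, 4))
```

For the counts 12, 5, and 9 with m = 4 this yields the chunk sizes {5, 4, 3}, {5}, and {5, 4}, the same multisets as in the table above. Each chunk would then be encrypted under a distinct key so that the ciphertexts of one plaintext value stay contiguous in the order.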
Scaling
  E.g., for data values on attribute “CustomerName”
    Plaintext value   # of occurrences   Encrypted values   # of occurrences   Scaled to
    Andy              12                 KA, KH, KT         3, 4, 5            6, 6, 6
    Betty             5                  SF                 5                  6
    Carl              9                  WA, WE             4, 5               6, 6
  Every encrypted value is replicated multiple times so that its # of occurrences is scaled up
  By scaling, the mapping between encrypted values and plaintext values is no longer unique!
  To map 6 distinct ciphertext values to 3 distinct plaintext values, # of mappings = $\binom{6-1}{3-1} = \binom{5}{2} = 10$
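The count of mappings can be checked by brute force. The sketch assumes that a mapping is order preserving and assigns every plaintext value at least one ciphertext value, so a mapping is simply a way to cut the ordered ciphertext list into consecutive non-empty groups; `order_preserving_mappings` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb


def order_preserving_mappings(num_cipher, num_plain):
    """List all ways to cut num_cipher ordered ciphertexts into num_plain
    consecutive non-empty groups. Each mapping is a tuple of group sizes."""
    mappings = []
    # Choose num_plain - 1 cut points among the num_cipher - 1 gaps.
    for cuts in combinations(range(1, num_cipher), num_plain - 1):
        bounds = (0, *cuts, num_cipher)
        mappings.append(tuple(b - a for a, b in zip(bounds, bounds[1:])))
    return mappings


mappings = order_preserving_mappings(6, 3)
print(len(mappings), comb(6 - 1, 3 - 1))
print((3, 1, 2) in mappings)  # the true split: Andy 3, Betty 1, Carl 2
```

Once all six ciphertext values have the same number of occurrences, the attacker has no frequency signal to prefer the true split over the other candidates.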
Technical Results of Security of Value Index
  We prove that there exists a large number of candidate databases (including the true hosted database) such that:
    For any query that is captured by any security constraint, only the true database returns a non-empty answer
    By looking at the value index, the candidates are pairwise indistinguishable
Related Work of Value Index
  Efficient processing of queries on encrypted relational databases [Hacigumus et al. 2002]
    Index on the bucket ID, which represents the partition to which the unencrypted value belongs
    Does NOT consider an occurrence-based distribution model
  Order-preserving encryption for numeric data [Agrawal et al. 2004]
    Considers a DIFFERENT, histogram-based distribution model
  Balancing security and efficiency in untrusted relational DBMSs [Damiani et al. 2003]
    Proposes an indexing scheme based on direct encryption and hashing, and measures the information exposure
    For the same occurrence-based distribution model as ours, their probability of information exposure can be HIGH
    The encryption is NOT order-preserving
Outline
  Introduction
  Design of metadata
  Secure and efficient query processing
Example of Query Processing
  [Figure: worked example. On the client, the Query Translator turns query Q, “//purchase[cname=‘Andy’]/pname”, into the translated query Qs, “//α[β≥‘KA’ AND β≤‘KT’]/γ”. On the untrusted server, the structural index maps “//α[β]/γ” to {Block 1, 2}, the value index maps “[β≥‘KA’ AND β≤‘KT’]” to {Block 1}, and the join of the two gives {Block 1}. Block 1 of the encrypted XML database is returned to the client, where the XML Decryptor and the Query Executor produce the answer “Lily”.]
Technical Results of Security of Query Answering
  Let A be any query that is captured by the security constraints, and Bel(B(A)) be the attacker’s belief probability that the hosted database satisfies A
  We prove that answering queries does not increase Bel(B(A))
Experiments
Impact of Optimization
  Compared with the naïve method, our approach achieves > 80% savings!
  [Figure: saving ratio (0.8 to 0.86) versus document size (1MB, 10MB, 50MB, 100MB) for the queries Qs, Qm, Ql]
Conclusion
  We consider the problem of efficient and secure evaluation of XPath queries on encrypted XML databases
  We formally define the attack model and security (see paper)
  We propose
    The security constraints (see paper)
    The secure encryption scheme (see paper)
    The design of secure structural and value indexes
    The secure and efficient query evaluation
Future Work
  More prior knowledge
    Tag distribution
    Query workload distribution
    Correlations between data values
  Updates on the database
    Definition of security
    Secure encryption scheme
    How to design metadata
Thank you !
Questions?
Extra Slides
Similar Application Scenario: Untrusted Disk
  The attacker may install a Trojan on the disk where the databases are stored (possibly locally) and spy on the operations on the databases
  The disk is no longer trusted, which is similar to the untrusted server
More Discussion on Security of Structural Index
  The attacker can still infer the structural relations (e.g., parent/child, siblings, etc.) between the nodes in the encrypted elements
  However, he cannot reconstruct the exact content of the original element
Other Contributions: Security Constraints
  Node type constraint
    For a sensitive XML element with its content
    E.g., //customer//prescription
  Association type constraint
    For sensitive associations between data values
    E.g., //customer: (/name, //purchase//mname)
Security Definitions
  Level 1: Secure Encryption Scheme
  Level 2: Secure Database System
  Level 3: Secure Query Answering
  [Figure: the client (Query Translator, XML Decryptor, Query Executor) sends Qs to the untrusted server (Query Executor, metadata, encrypted XML database); the server returns a set of encryption blocks, from which the client computes the answer of query Q]
XML Encryption
  W3C standard
  Different encryption granularity
  [Figure: example XML tree: Info and purchases at the top, purchase elements with cname and pname children, holding the values (Andy, ‘Lily’) and (Betty, ‘Last supper’)]
XML Encryption (Cont.)
  A tradeoff exists between encryption granularity and efficient query processing
  The next question is: what is the optimal encryption scheme such that
    (1) it is secure, and
    (2) it facilitates query processing?
Secure Encryption Scheme
  Encryption scheme S: every security constraint is enforced
    For every node type constraint c, each node that c binds to is encrypted
      E.g., for the security constraint //customer//prescription, encrypt every “prescription” element
    For every association type constraint p:(q1, q2), the nodes that bind to either p/q1 or p/q2 are encrypted
      E.g., for the security constraint //customer: (/name, //purchase//mname), either //customer/name or //customer//purchase//mname is encrypted
Secure Encryption Scheme (Cont.)
  More protection
    Every leaf element containing data values is encrypted with an encryption decoy
      Effect: every encrypted value is unique, i.e., occurs only once
      E.g., the original values (“AIDS”, “AIDS”, “cold”) are encrypted to (“CCED”, “PACS”, “DAEE”)
    Goal: defense against frequency-based attacks
Secure Encryption Scheme (Cont.)
  Theorem: the encryption scheme is a secure encryption scheme
  Theorem: finding an optimal secure encryption scheme is NP-hard in the size of the security constraints
Unsafety of Continuous Interval Index
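  Original database: A [1, 10] with children B [2, 5] and B [6, 9]; C [3, 4] is the child of the first B, D [7, 8] is the child of the second B
  DSI index table
    Tag      DSI index
    Enc(A)   [1, 10]
    Enc(B)   [2, 9]
    Enc(C)   [3, 4]
    Enc(D)   [7, 8]
  What the index suggests: A [1, 10] with a single child B [2, 9] that has children C [3, 4] and D [7, 8]
  The original structure (B [2, 5], B [6, 9]) is revealed by the gap: positions 5 and 6 are unused between C [3, 4] and D [7, 8]!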
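The leak can be made concrete with a small sketch. It rests on an assumption about the numbering that the slide does not spell out: in a continuous scheme every integer position inside an interval is the start or end of some node. Under that reading, positions inside a grouped interval that no visible entry uses betray hidden node boundaries. The helper name `unused_positions` is hypothetical.

```python
def unused_positions(grouped, nested):
    """Return the integer positions inside `grouped` that are not an endpoint
    of `grouped` itself or of any interval in `nested`.

    Assumes continuous integer numbering, where every position should be
    the start or end of some node.
    """
    low, high = grouped
    used = {low, high}
    for start, end in nested:
        used.update((start, end))
    return [p for p in range(low, high + 1) if p not in used]


# Grouped Enc(B) interval with the visible Enc(C) and Enc(D) intervals inside it.
print(unused_positions((2, 9), [(3, 4), (7, 8)]))
```

The leftover positions 5 and 6 are exactly the end of the first B and the start of the second B, so the attacker can undo the grouping. With discontinuous (real-valued) intervals there is no such expectation of contiguity, so a gap carries no information.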
Safety of Discontinuous Interval Index
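  Original database: A [0, 1] with children B [0.1, 0.4] and B [0.5, 0.9]; C [0.2, 0.25] is the child of the first B, D [0.55, 0.75] is the child of the second B
  DSI index table
    Tag      DSI index
    Enc(A)   [0, 1]
    Enc(B)   [0.1, 0.9]
    Enc(C)   [0.2, 0.25]
    Enc(D)   [0.55, 0.75]
  A fake candidate: A [0, 1] with a single child B [0.1, 0.9] that has children C [0.2, 0.25], D [0.55, 0.6], and D [0.65, 0.75]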
Splitting
  E.g., for data values on attribute “Age”, with ki ∊ {m-1, m, m+1}, m = 6
    Plaintext value   # of occurrences   Ciphertext values    # of occurrences ki
    10                18                 101, 124, 189        7 + 6 + 5 = 18
    20                5                  210                  5 = 5
    30                27                 312, 367, 371, 389   7*3 + 6 = 27
Scaling
  Plaintext value of “price”   # of occurrences   Ciphertext values    # of occurrences   Scaled to
  10                           18                 101, 124, 189        7, 6, 5            14, 14, 14
  20                           5                  210                  5                  14
  30                           27                 312, 367, 371, 389   7, 7, 7, 6         14, 14, 14, 14
  To map 8 distinct ciphertext values to 3 distinct plaintext values, # of mappings = $\binom{8-1}{3-1} = \binom{7}{2} = 21$
Value Metadata (Cont.)
  [Figure: bar chart “Before Splitting: Distinct Plaintext Values”, number of occurrences (0 to 30) for the plaintext values 10, 20, 30]
  [Figure: bar chart “After Splitting: Distinct Ciphertext Values”, number of occurrences (0 to 8) for the ciphertext values 101, 124, 189, 210, 312, 367, 371, 389]
Query Processing at Client
  The tags and values are encrypted
    E.g., original query: //customer[//zipcode=‘12500’]//name
    Tags: customer → α, zipcode → β, name → γ
    Value constraint: zipcode = ‘12500’ → [‘4000’ ≤ β and β ≤ ‘7000’]
    Translated query: //α[‘4000’ ≤ β and β ≤ ‘7000’]//γ
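The client-side translation can be sketched as a pair of lookups. Everything here is illustrative: `TAG_MAP` and `VALUE_RANGES` stand in for the client's secret mappings, the regular expressions only handle the simple equality predicate of the example, and ASCII `<=` is used in place of ≤.

```python
import re

# Hypothetical client-side secrets: tag encryption and the ciphertext range
# that covers all split ciphertexts of a plaintext value.
TAG_MAP = {"customer": "α", "zipcode": "β", "name": "γ"}
VALUE_RANGES = {("zipcode", "12500"): ("4000", "7000")}


def translate(query):
    """Rewrite equality predicates into ciphertext ranges, then encrypt tags."""

    def rewrite_predicate(match):
        tag, value = match.group(1), match.group(2)
        low, high = VALUE_RANGES[(tag, value)]
        enc = TAG_MAP[tag]
        return f"['{low}' <= {enc} and {enc} <= '{high}']"

    # Step 1: [//tag='value'] becomes a range predicate over the encrypted tag.
    query = re.sub(r"\[/*(\w+)='([^']*)'\]", rewrite_predicate, query)
    # Step 2: replace every remaining plaintext tag by its encrypted form.
    return re.sub(r"[A-Za-z]\w*",
                  lambda m: TAG_MAP.get(m.group(0), m.group(0)),
                  query)


print(translate("//customer[//zipcode='12500']//name"))
```

The equality test becomes a range test because splitting maps one plaintext value to several ciphertext values, and order preservation keeps them contiguous.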
Query Processing at Server
  (1) Query tags, //α[β]//γ: the structural index yields a set of encryption block IDs Bs
  (2) Value-based constraints, ‘4000’ ≤ β and β ≤ ‘7000’: the value index yields a set of encryption block IDs Bv
  (3) The blocks corresponding to Bs ∩ Bv are returned to the client. Each returned block contains answers to the original query
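The three server-side steps can be sketched as a set intersection. This is a simplification under stated assumptions: the structural index is modeled as a plain lookup from a tag pattern to block IDs (the real index works on intervals), the index contents are invented, and `blocks_for_query` is a hypothetical name.

```python
def blocks_for_query(structural_index, value_index, tag_pattern, low, high):
    """Return the IDs of the encryption blocks to ship to the client.

    structural_index: encrypted tag pattern -> set of block IDs (Bs).
    value_index: encrypted value -> list of block IDs.
    low, high: bounds of the ciphertext range in the translated query.
    """
    # (1) Blocks that match the encrypted tag structure.
    bs = structural_index.get(tag_pattern, set())
    # (2) Blocks that contain an encrypted value inside [low, high].
    bv = set()
    for enc_value, blocks in value_index.items():
        if low <= enc_value <= high:
            bv.update(blocks)
    # (3) Only blocks satisfying both are returned.
    return bs & bv


# Hypothetical indexes modeled on the purchase example.
structural = {"//α[β]/γ": {1, 2}}
values = {"KA": [1], "KH": [1], "KT": [1], "SF": [2]}
print(blocks_for_query(structural, values, "//α[β]/γ", "KA", "KT"))
```

The server never decrypts anything: both lookups operate on encrypted tags and order-preserved ciphertexts, and the final filtering of answers happens on the client after decryption.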
Experiments (Cont.)
Effects of Various Secure Encryption Schemes
  [Figure: time in seconds (0 to 15) for the encryption schemes Top, Sub, App, and Opt, broken down into query processing on client, decryption on client, and query processing on server]
  1. The optimal encryption scheme always has the best query evaluation performance
  2. The query evaluation time of the approximate scheme is around 1.1-1.3 times that of the optimal encryption scheme