Lecture 6 : 590.03 Fall 12 1
Simulatability
“The enemy knows the system”, Claude Shannon
CompSci 590.03
Instructor: Ashwin Machanavajjhala
Announcements
• Please meet with me at least 2 times before you finalize your project (deadline Sep 28).
Recap – L-Diversity
• The link between identity and attribute value is the sensitive information.
“Does Bob have Cancer? Heart disease? Flu?” “Does Umeko have Cancer? Heart disease? Flu?”
• Adversary knows ≤ L-2 negation statements, e.g. “Umeko does not have Heart Disease.”
– Data Publisher may not know exact adversarial knowledge
• Privacy is breached when identity can be linked to attribute value with high probability:
Pr[ “Bob has Cancer” | published table, adv. knowledge] > t
Recap – 3-Diverse Table
Zip    Age   Nat.  Disease
1306*  <=40  *     Heart
1306*  <=40  *     Flu
1306*  <=40  *     Cancer
1306*  <=40  *     Cancer
1485*  >40   *     Cancer
1485*  >40   *     Heart
1485*  >40   *     Flu
1485*  >40   *     Flu
1305*  <=40  *     Heart
1305*  <=40  *     Flu
1305*  <=40  *     Cancer
1305*  <=40  *     Cancer
L-Diversity Principle: Every group of tuples with the same Q-ID values has ≥ L distinct sensitive values of roughly equal proportions.
Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Simulatable algorithms for anonymization
Query Auditing
Database has numeric values (say, salaries of employees).
Database either truthfully answers a question or denies answering.
MIN, MAX, SUM queries over subsets of the database.
Question: When to allow/deny queries?
[Figure: the Researcher sends a Query to the Database, which checks “Safe to publish?” (Yes / No).]
Why should we deny queries?
• Q1: Ben’s sensitive value?
– DENY
• Q2: Max sensitive value of males?
– ANSWER: 2
• Q3: Max sensitive value of 1st year PhD students?
– ANSWER: 3
• But Q3 + Q2 => Xi = 3
Name  1st year PhD  Gender  Sensitive value
Ben   Y             M       1
Bha   N             M       1
Ios   Y             M       1
Jan   N             M       2
Jian  Y             M       2
Jie   N             M       1
Joe   N             M       2
Moh   N             M       1
Son   N             F       1
Xi    Y             F       3
Yao   N             M       2
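The inference “Q3 + Q2 => Xi = 3” can be replayed mechanically. The sketch below is illustrative only: the table is the one on the slide, while the variable names and the attacker model (query membership known, values unknown) are assumptions made for the example.

```python
# Toy table from the slide: (name, first_year_phd, gender, sensitive_value)
ROWS = [
    ("Ben", True, "M", 1), ("Bha", False, "M", 1), ("Ios", True, "M", 1),
    ("Jan", False, "M", 2), ("Jian", True, "M", 2), ("Jie", False, "M", 1),
    ("Joe", False, "M", 2), ("Moh", False, "M", 1), ("Son", False, "F", 1),
    ("Xi", True, "F", 3), ("Yao", False, "M", 2),
]

# Answers the database released (recomputed here to confirm the slide's numbers).
q2 = max(v for _, _, g, v in ROWS if g == "M")    # max over males
q3 = max(v for _, phd, _, v in ROWS if phd)       # max over 1st year PhD students

# Attacker's view: who falls in each query set is known, the values are not.
males = {n for n, _, g, _ in ROWS if g == "M"}
first_years = {n for n, phd, _, _ in ROWS if phd}

# Q2 bounds every male by q2; if q2 < q3, no male can be the one holding q3.
if q2 < q3:
    candidates = first_years - males
else:
    candidates = set(first_years)

# A single remaining candidate must hold the maximum reported by Q3.
inferred = {name: q3 for name in candidates} if len(candidates) == 1 else {}
print(inferred)
```

Neither answer reveals a value on its own; the breach comes from combining them, which is why the auditor has to look at the whole query history.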
Value-Based Auditing
• Let $a_1, a_2, \ldots, a_k$ be the answers to previous queries $Q_1, Q_2, \ldots, Q_k$.
• Let $a_{k+1}$ be the answer to $Q_{k+1}$.
$a_i = f(c_{i1} x_1, c_{i2} x_2, \ldots, c_{in} x_n), \quad i = 1, \ldots, k+1$
$c_{im} = 1$ if $Q_i$ depends on $x_m$ (and 0 otherwise)
Check if any $x_j$ has a unique solution.
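For MAX queries the “unique solution” check has a simple form. The sketch below assumes real-valued data with no lower bound and mutually consistent answers; the function names are hypothetical. It uses the rule that a value is pinned down when it is the only element of some query that can still attain that query's answer.

```python
def upper_bounds(queries, answers):
    """Tightest upper bound on each x_j implied by the answered MAX queries."""
    bounds = {}
    for members, ans in zip(queries, answers):
        for j in members:
            bounds[j] = min(bounds.get(j, float("inf")), ans)
    return bounds


def uniquely_determined(queries, answers):
    """Return {j: value} for every x_j that the MAX answers pin down."""
    bounds = upper_bounds(queries, answers)
    pinned = {}
    for members, ans in zip(queries, answers):
        # Elements of this query that can still be equal to its maximum.
        extreme = [j for j in members if bounds[j] == ans]
        if len(extreme) == 1:
            pinned[extreme[0]] = ans
    return pinned


# max(x1..x5) = 10 alone pins nothing; adding max(x1..x4) = 8 pins x5.
only_first = uniquely_determined([{1, 2, 3, 4, 5}], [10])
both = uniquely_determined([{1, 2, 3, 4, 5}, {1, 2, 3, 4}], [10, 8])
```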
Value-based Auditing
• Data Values: {x1, x2, x3, x4, x5}, Queries: MAX.
• Allow query if value of xi can’t be inferred.
Query 1: max(x1, x2, x3, x4, x5)
Ans: 10
-∞ ≤ x1, …, x5 ≤ 10
Query 2: max(x1, x2, x3, x4)
True answer: 8, so answering would reveal -∞ ≤ x1, …, x4 ≤ 8 => x5 = 10
Auditor’s response: DENY
Denial means some value can be compromised!
What could max(x1, x2, x3, x4) be?
From first answer, max(x1,x2,x3,x4) ≤ 10
If max(x1, x2, x3, x4) = 10, then there is no privacy breach, so the query would have been answered.
Hence, max(x1,x2,x3,x4) < 10
=> x5 = 10!
Denials leak information.
The attack occurred because the privacy analysis did not assume that the attacker knows the algorithm.
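The leak can be reproduced with a small simulation. This is a sketch under assumptions: the class and function names are hypothetical, and the hidden values of x1..x4 are invented (the slides only fix max(x1..x5) = 10 and max(x1..x4) = 8). The auditor peeks at the true answer before deciding, which is exactly what lets the attacker read information off the denial.

```python
def pinned_values(queries, answers):
    """{j: value} for each x_j that a set of answered MAX queries pins down."""
    bounds = {}
    for members, ans in zip(queries, answers):
        for j in members:
            bounds[j] = min(bounds.get(j, float("inf")), ans)
    pinned = {}
    for members, ans in zip(queries, answers):
        extreme = [j for j in members if bounds[j] == ans]
        if len(extreme) == 1:
            pinned[extreme[0]] = ans
    return pinned


class ValueBasedAuditor:
    """Naive auditor: computes the TRUE answer, then decides whether to deny."""

    def __init__(self, data):
        self.data = data                      # hidden {index: value} store
        self.queries, self.answers = [], []

    def ask_max(self, members):
        true_answer = max(self.data[j] for j in members)
        if pinned_values(self.queries + [members], self.answers + [true_answer]):
            return "DENY"
        self.queries.append(members)
        self.answers.append(true_answer)
        return true_answer


hidden = {1: 3, 2: 8, 3: 5, 4: 1, 5: 10}      # invented values, see lead-in
auditor = ValueBasedAuditor(hidden)
first = auditor.ask_max({1, 2, 3, 4, 5})
second = auditor.ask_max({1, 2, 3, 4})

# Attacker's reasoning: an answer of `first` would have been safe and released,
# so a denial means max(x1..x4) < first, and therefore x5 == first.
leaked_x5 = first if second == "DENY" else None
```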
Simulatable Auditing [Kenthapadi et al PODS ‘05]
• An auditor is simulatable if the decision to deny a query Qk is made based on information already available to the attacker.
– Can use queries Q1, Q2, …, Qk and answers a1, a2, …, ak-1
– Cannot use ak or the actual data to make the decision.
• Denials provably do not leak information
– Because the attacker could equivalently determine whether the query would be denied.
– Attacker can mimic or simulate the auditor.
Simulatable Auditing Algorithm
• Data Values: {x1, x2, x3, x4, x5}, Queries: MAX.
• Allow query if value of xi can’t be inferred.
Query 1: max(x1, x2, x3, x4, x5)
Ans: 10
Query 2: max(x1, x2, x3, x4)
Before computing the answer, consider every possible answer:
– Ans > 10 => not possible
– Ans = 10 => -∞ ≤ x1, …, x4 ≤ 10 (SAFE)
– Ans < 10 => x5 = 10 (UNSAFE)
Some consistent answer is unsafe => DENY
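A minimal sketch of this decision rule for MAX queries follows. It assumes real-valued data with no lower bound; the function names are hypothetical, and trying one representative answer per region (each past answer, a point between neighbours, a point beyond each end) is a simplification chosen for the example rather than the procedure from the paper. The decision function never receives the data or the true answer.

```python
def extreme_sets(queries, answers):
    """For each MAX query, the elements whose upper bound equals its answer."""
    bounds = {}
    for members, ans in zip(queries, answers):
        for j in members:
            bounds[j] = min(bounds.get(j, float("inf")), ans)
    return [[j for j in members if bounds[j] == ans]
            for members, ans in zip(queries, answers)]


def candidate_answers(past_answers):
    """One representative answer per region of the real line."""
    pts = sorted(set(past_answers))
    if not pts:
        return [0.0]
    mids = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    return [pts[0] - 1] + pts + mids + [pts[-1] + 1]


def simulatable_decision(past_queries, past_answers, new_query):
    """Return 'ANSWER' or 'DENY' using only the query history."""
    for guess in candidate_answers(past_answers):
        ext = extreme_sets(past_queries + [new_query], past_answers + [guess])
        if any(len(e) == 0 for e in ext):
            continue          # this guess contradicts earlier answers
        if any(len(e) == 1 for e in ext):
            return "DENY"     # some consistent answer would pin a value
    return "ANSWER"


d1 = simulatable_decision([], [], {1, 2, 3, 4, 5})
d2 = simulatable_decision([{1, 2, 3, 4, 5}], [10], {1, 2, 3, 4})
```

Because the attacker can run the same function on the same history, a DENY tells them nothing they could not have computed themselves. The price is that Query 2 is denied even when its true answer is 10 and releasing it would have been harmless.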
Summary of Simulatable Auditing
• Decision to deny answers must be based on past queries answered in some (many!) cases.
• Denials can leak information if the adversary does not know all the information that is used to decide whether to deny the query.
Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Simulatable algorithms for anonymization
Minimality attack on Generalization algorithms
• Algorithms for K-anonymity, L-diversity, T-closeness, etc. try to maximize utility.
– Find a minimally generalized table in the lattice that satisfies privacy and maximizes utility.
• But … attacker also knows this algorithm!
Example Minimality attack [Wong et al VLDB07]
• Dataset with one quasi-identifier and 2 values q1, q2.
• q1, q2 generalize to Q.
• Sensitive attribute: Cancer – yes/no
• We want to ensure P[Cancer = yes] ≤ ½.
– OK to know if an individual does not have Cancer.
• Published Table:
QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No
Which input datasets could have led to the published table?
Output dataset ({q1, q2} → Q, “2-diverse”):
QID  Cancer
Q    Yes
Q    Yes
Q    No
Q    No
q2   No
q2   No
Possible input datasets, 3 occurrences of q1:
QID  Cancer
q1   Yes
q1   Yes
q1   No
q2   No
q2   No
q2   No

QID  Cancer
q1   Yes
q1   No
q1   No
q2   Yes
q2   No
q2   No
Possible input dataset, 3 occurrences of q1:
QID  Cancer
q1   Yes
Q    No
Q    No
q2   Yes
q2   No
q2   No
This is a better generalization!
Possible input datasets, 1 occurrence of q1:
QID  Cancer
q2   Yes
q1   Yes
q2   No
q2   No
q2   No
q2   No

QID  Cancer
q2   Yes
q2   Yes
q1   No
q2   No
q2   No
q2   No
Possible input dataset, 3 occurrences of q1:
QID  Cancer
q2   Yes
Q    No
Q    No
q2   Yes
q2   No
q2   No
This is a better generalization!
There must be exactly two tuples with q1
Possible input datasets, 2 occurrences of q1:
QID  Cancer
q1   Yes
q1   Yes
q2   No
q2   No
q2   No
q2   No

QID  Cancer
q2   Yes
q2   Yes
q1   No
q1   No
q2   No
q2   No

QID  Cancer
q1   Yes
q2   Yes
q1   No
q2   No
q2   No
q2   No
Already satisfies privacy (third dataset)
Of the remaining input datasets with 2 occurrences of q1, consider the one where both q1 tuples have Cancer = No:
QID  Cancer
q2   Yes
q2   Yes
q1   No
q1   No
q2   No
q2   No
Learning Cancer = No is OK, hence this is private.
The only remaining input dataset with 2 occurrences of q1:
QID  Cancer
q1   Yes
q1   Yes
q2   No
q2   No
q2   No
q2   No
This is the ONLY input that results in the output!
P[Cancer = yes | q1] = 1
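The case analysis above can be checked by brute force. The sketch below is an illustration under stated assumptions, not the algorithm from Wong et al.: the publisher is modelled as choosing, among all local recodings in which every group (q1, q2, Q) has P[Cancer = yes] at most 1/2, one that generalizes the fewest tuples to Q. All names are hypothetical.

```python
from itertools import product


def ok(yes, no):
    """Privacy test for one group: P[Cancer = yes] <= 1/2 (an empty group passes)."""
    return yes <= no


def minimal_outputs(a, b, c, d):
    """All cheapest private recodings of an input with
    a = #(q1, Yes), b = #(q1, No), c = #(q2, Yes), d = #(q2, No).
    An output is (q1_yes, q1_no, q2_yes, q2_no, Q_yes, Q_no)."""
    best_cost, best = None, set()
    for ga, gb, gc, gd in product(range(a + 1), range(b + 1),
                                  range(c + 1), range(d + 1)):
        out = (a - ga, b - gb, c - gc, d - gd, ga + gc, gb + gd)
        if not (ok(out[0], out[1]) and ok(out[2], out[3]) and ok(out[4], out[5])):
            continue
        cost = ga + gb + gc + gd          # number of tuples generalized to Q
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, {out}
        elif cost == best_cost:
            best.add(out)
    return best


# Published table: q2 has (No, No); Q has (Yes, Yes, No, No); no q1 tuples remain.
PUBLISHED = (0, 0, 0, 2, 2, 2)

# The Cancer column is published as-is, so every candidate input has 2 Yes, 4 No.
consistent_inputs = [
    (a, b, 2 - a, 4 - b)
    for a in range(3) for b in range(5)
    if PUBLISHED in minimal_outputs(a, b, 2 - a, 4 - b)
]
print(consistent_inputs)
```

Under this model the only surviving input has two q1 tuples, both with Cancer = yes, which matches the slide's conclusion that P[Cancer = yes | q1] = 1.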
Outline
• Simulatable Auditing
• Minimality Attack in anonymization
• Transparent Anonymization: Simulatable algorithms for anonymization
Transparent Anonymization
• Assume that the adversary knows the algorithm that is being used.
O: Output table
I(O, A): Input tables that result in O due to algorithm A
I: All possible input tables
Transparent Anonymization
• Privacy must be guaranteed with respect to I(O, A).
– Probability must be computed assuming I(O, A) is the actual set of all possible input tables.
• What is an efficient algorithm for Transparent Anonymization?
– For L-diversity?
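One way to write the first bullet down, under the added assumption (not stated on the slide) that the adversary treats every table in I(O, A) as equally likely, is:

```latex
\Pr[\, t \text{ has sensitive value } s \mid O, A \,]
  = \frac{\bigl|\{\, T \in I(O, A) : t \text{ has value } s \text{ in } T \,\}\bigr|}
         {|I(O, A)|}
```

In the minimality example I(O, A) contains a single table in which both q1 tuples have Cancer = yes, so this ratio is 1 for them.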
Ace Algorithm [Xiao et al TODS’10]
Step 1: Assign
Based only on the sensitive values, construct (in a randomized fashion) an intermediate L-diverse generalization.
Step 2: Split
Based only on the quasi-identifier values (and without looking at sensitive values), deterministically refine the intermediate solution to maximize utility.
Step 1: Assign
• Input Table
Step 1: Assign
• St is the set of all tuples (grouped by sensitive value)
• Iteratively,
– Remove α tuples each from the β (≥L) most frequent sensitive values
– 1st iteration β=2, α=2
– 2nd iteration β=2, α=1
– 3rd iteration β=2, α=1
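The flavour of the iteration can be sketched with a much simpler greedy variant. This is not the Ace Assign step: it fixes α = 1 and β = L, breaks ties deterministically rather than at random, and makes no claim about transparency. The tuple ids t1..t8 and their pairing with diseases are invented; only the multiset of diseases is taken from the slides.

```python
from collections import defaultdict


def assign_sketch(records, L):
    """Greedy bucketization: each round takes one tuple (alpha = 1) from each of
    the L most frequent remaining sensitive values (beta = L)."""
    by_value = defaultdict(list)
    for tuple_id, disease in records:
        by_value[disease].append(tuple_id)
    buckets = []
    while True:
        # Sensitive values that still have tuples, most frequent first.
        live = sorted((v for v in by_value if by_value[v]),
                      key=lambda v: (-len(by_value[v]), v))
        if len(live) < L:
            break
        buckets.append([(by_value[v].pop(), v) for v in live[:L]])
    # Tuples that could not be placed in a bucket with L distinct values.
    leftovers = [(t, v) for v in by_value for t in by_value[v]]
    return buckets, leftovers


records = [("t1", "Dyspepsia"), ("t2", "Dyspepsia"), ("t3", "Flu"), ("t4", "Flu"),
           ("t5", "Bronchitis"), ("t6", "Gastritis"), ("t7", "Diabetes"),
           ("t8", "Gastritis")]
buckets, leftovers = assign_sketch(records, L=2)
```

Each bucket produced this way holds L distinct sensitive values. Handling leftovers and choosing α and β per iteration, as on the slides, is left out of the sketch.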
Intermediate Generalization
Name  Age  Zip    | Disease
Ann   21   10000  | Dyspepsia
Bob   27   18000  | Dyspepsia
Gill  60   63000  | Flu
Ed    54   60000  | Flu
Don   32   35000  | Bronchitis
Fred  60   63000  | Gastritis
Hera  60   63000  | Diabetes
Cate  32   35000  | Gastritis
Step 2: Split
• If a bucket contains α>1 tuples of each sensitive value, split it into two buckets, Ba and Bb s.t.,
– Pick 1 ≤ αa < α tuples from each sensitive value in bucket B, and put them in bucket Ba. The remaining tuples go to Bb.
– The division (Ba, Bb) is optimal in terms of utility.
Name  Age  Zip
Ann   21   10000
Bob   27   18000
Gill  60   63000
Ed    54   60000
Don   32   35000
Fred  60   63000
Hera  60   63000
Cate  32   35000
Why does the Ace algorithm satisfy Transparent L-Diversity?
• Privacy must be guaranteed with respect to I(O, A).
– Probability must be computed assuming I(O, A) is the actual set of all possible input tables.
O: Output table
I(O, A): Input tables that result in O due to algorithm A
I: All possible input tables
Ace algorithm analysis
Lemma 1: The assign step satisfies transparent L-diversity.
Proof (sketch):
• Consider an intermediate output Int
• Suppose there is some input table T such that Assign(T) = Int
• Any other table T’ where the sensitive values of 2 individuals in the same group are swapped, also leads to the same intermediate output Int.
Ace algorithm analysis
Both tables result in the same intermediate output.
• The set of input tables I(Int,A) contains all possible assignments of diseases to individuals within each group of Int.
Ace algorithm analysis
Lemma 1: The assign step satisfies transparent L-diversity.
Proof (sketch):
• The set of tables I(Int, A) contains all possible assignments of diseases to individuals in each group of Int.
• P[Ann has dyspepsia | I(Int, A) and Int] = 1/2
Name  Age  Zip    | Disease
Ann   21   10000  | Dyspepsia
Bob   27   18000  | Dyspepsia
Gill  60   63000  | Flu
Ed    54   60000  | Flu
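The value 1/2 can be confirmed by enumerating I(Int, A) for this one bucket. The sketch assumes every assignment of the bucket's diseases to its members is equally likely; variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

people = ["Ann", "Bob", "Gill", "Ed"]
diseases = ["Dyspepsia", "Dyspepsia", "Flu", "Flu"]

# Every distinct way to hand the bucket's diseases to its members;
# position i of an assignment is the disease given to people[i].
assignments = set(permutations(diseases))

ann = people.index("Ann")
ann_dyspepsia = sum(1 for a in assignments if a[ann] == "Dyspepsia")
prob = Fraction(ann_dyspepsia, len(assignments))
print(prob)
```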
Ace algorithm analysis
Lemma 2: The split phase also satisfies transparent L-diversity.
Proof (sketch):
• I(Int, Assign) contains all tables where an individual is assigned to an arbitrary sensitive value within the same group in Int.
• Suppose some input table T ∈ I(Int, Assign) results in the final output O after Split.
Ace algorithm analysis
• Split does not depend on the sensitive values.
[Figure: two versions of the bucket {Ann, Gill, Bob, Ed} with sensitive values {dyspepsia, flu}, differing only in who holds which value; each results in the same split into buckets {Ann, Bob} and {Gill, Ed}.]
Ace algorithm analysis
If T ∈ I(Int, Assign), and it results in O after split, then T’ ∈ I(Int, Assign), and it results in O after split.
[Figure: Table T and Table T’]
Ace algorithm analysis
• Lemma 2: The split phase also satisfies transparent L-diversity.
Proof (sketch)
• Let T’ be generated by “swapping diseases” in some bucket.
• If T ∈ I(Int, Assign), and it results in O after split, then T’ ∈ I(Int, Assign), and it results in O after split.
• For any individual it is equally likely that the sensitive value is one of ≥ L choices.
• Therefore, P[individual has disease | I(O, Ace)] ≤ 1/L
Summary
• Many systems rely on the adversary not knowing the algorithm to guarantee privacy/security.
– This is bad …
• Simulatable algorithms avoid this problem
– Ideally choices made by the algorithm should be simulatable by the adversary.
• Anonymization algorithms are also susceptible to adversaries who know the algorithm or the objective function.
• Transparent anonymization limits the inference an attacker (who knows the algorithm) can make about sensitive values.
Next Class
• Composition of privacy
• Differential Privacy
References
A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, “L-Diversity: Privacy beyond k-anonymity”, ICDE 2006
K. Kenthapadi, N. Mishra, K. Nissim, “Simulatable Auditing”, PODS 2005
R. Wong, A. Fu, K. Wang, J. Pei, “Minimality attack in privacy preserving data publishing”, PVLDB 2007
X. Xiao, Y. Tao, N. Koudas, “Transparent Anonymization: Thwarting adversaries who know the algorithm”, TODS 2010