current trends

48
1 Current Trends in Data Security Dan Suciu Joint work with Gerome Miklau

Upload: demis

Post on 13-Apr-2016

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Current Trends

1

Current Trends in Data Security

Dan Suciu

Joint work with Gerome Miklau

Page 2: Current Trends

2

Data Security

Dorothy Denning, 1982:

• Data Security is the science and study of methods of protecting data (...) from unauthorized disclosure and modification

• Data Security = Confidentiality + Integrity

Page 3: Current Trends

3

Data Security

• Distinct from systems and network security– Assumes these are already secure

• Tools:– Cryptography, information theory, statistics, …

• Applications:– An enabling technology

Page 4: Current Trends

4

Outline

• Traditional data security

• Two attacks

• Data security research today

• Conclusions

Page 5: Current Trends

5

Traditional Data Security

• Security in SQL = Access control + Views

• Security in statistical databases = Theory

Page 6: Current Trends

6

Access Control in SQL

GRANT privileges ON object TO users [WITH GRANT OPTIONS]

privileges = SELECT | INSERT | DELETE | . . .

object = table | attribute

REVOKE privileges ON object FROM users [CASCADE ]

[Griffith&Wade'76, Fagin'78]

Page 7: Current Trends

7

Views in SQL

A SQL View = (almost) any SQL query

• Typically used as:

GRANT SELECT ON pmpStudents TO DavidRispoli

CREATE VIEW pmpStudents AS SELECT * FROM Students WHERE…

Page 8: Current Trends

8

Summary of SQL Security

Limitations:• No row level access control• Table creator owns the data: that’s unfair !

… or spectacular failure:• Only 30% assign privileges to users/roles

– And then to protect entire tables, not columns

Access control = great success story of the DB community...

Page 9: Current Trends

9

Summary (cont)

• Most policies in middleware: slow, error prone:– SAP has 10**4 tables– GTE over 10**5 attributes– A brokerage house has 80,000 applications– A US government entity thinks that it has 350K

• Today the database is not at the center of the policy administration universe

[Rosenthal&Winslett’2004]

Page 10: Current Trends

10

Security in Statistical DBs

Goal:• Allow arbitrary aggregate SQL queries• Hide confidential data

SELECT count(*)FROM PatientsWHERE age=42 and sex=‘M’ and diagnostic=‘schizophrenia’

OK

SELECT nameFROM PatientWHERE age=42 and sex=‘M’ and diagnostic=‘schizophrenia’

Not OK

[Adam&Wortmann’89]

Page 11: Current Trends

11

Security in Statistical DBsWhat has been tried:• Query restriction

– Query-size control, query-set overlap control, query monitoring– None is practical

• Data perturbation– Most popular: cell combination, cell suppression– Other methods, for continuous attributes: may introduce bias

• Output perturbation– For continuous attributes only

[Adam&Wortmann’89]

Page 12: Current Trends

12

Summary on Security in Statistical DB

• Original goal seems impossible to achieve

• Cell combination/suppression are popular, but do not allow arbitrary queries

Page 13: Current Trends

13

Outline

• Traditional data security

• Two attacks

• Data security research today

• Conclusions

Page 14: Current Trends

14

Search claims by:

SQL InjectionYour health insurance company lets you see the claims online:

Now search through the claims :

Dr. Lee

First login: User:

Password:

fred

********

SELECT…FROM…WHERE doctor=‘Dr. Lee’ and patientID=‘fred’

[Chris Anley, Advanced SQL Injection In SQL]

Page 15: Current Trends

15

SQL InjectionNow try this:

Search claims by: Dr. Lee’ OR patientID = ‘suciu’; --

Better:

Search claims by: Dr. Lee’ OR 1 = 1; --

…..WHERE doctor=‘Dr. Lee’ OR patientID=‘suciu’; --’ and patientID=‘fred’

Page 16: Current Trends

16

SQL InjectionWhen you’re done, do this:

Search claims by: Dr. Lee’; DROP TABLE Patients; --

Page 17: Current Trends

17

SQL Injection

• The DBMS works perfectly. So why is SQL injection possible so often ?

• Quick answer:– Poor programming: use stored procedures !

• Deeper answer:– Move policy implementation from apps to DB

Page 18: Current Trends

18

Latanya Sweeney’s Finding

• In Massachusetts, the Group Insurance Commission (GIC) is responsible for purchasing health insurance for state employees

• GIC has to publish the data:

GIC(zip, dob, sex, diagnosis, procedure, ...)

Page 19: Current Trends

19

Latanya Sweeney’s Finding

• Sweeney paid $20 and bought the voter registration list for Cambridge Massachusetts:

GIC(zip, dob, sex, diagnosis, procedure, ...)VOTER(name, party, ..., zip, dob, sex)

Page 20: Current Trends

20

Latanya Sweeney’s Finding

• William Weld (former governor) lives in Cambridge, hence is in VOTER

• 6 people in VOTER share his dob• only 3 of them were man (same sex)• Weld was the only one in that zip• Sweeney learned Weld’s medical records !

zip, dob, sex

Page 21: Current Trends

21

Latanya Sweeney’s Finding

• All systems worked as specified, yet an important data has leaked

• How do we protect against that ?

Some of today’s research in data security address breachesthat happen even if all systems work correctly

Page 22: Current Trends

22

Summary on Attacks

SQL injection:• A correctness problem:

– Security policy implemented poorly in the application

Sweeney’s finding:• Beyond correctness:

– Leakage occurred when all systems work as specified

Page 23: Current Trends

23

Outline

• Traditional data security

• Two attacks

• Data security research today

• Conclusions

Page 24: Current Trends

24

Research Topics in Data Security

Rest of the talk:• Information Leakage• Privacy• Fine-grained access control• Data encryption• Secure shared computation

Page 25: Current Trends

25

First Last Age RaceHarry Stone 34 Afr-AmJohn Reyser 36 Cauc

Beatrice Stone 47 Afr-amJohn Ramos 22 Hisp

First Last Age Race* Stone 30-50 Afr-Am

John R* 20-40 ** Stone 30-50 Afr-am

John R* 20-40 *

Information Leakage:k-Anonymity

Definition: each tuple is equal to at least k-1 others

Anonymizing: through suppression and generalization

Hard: NP-complete for supression onlyApproximations exists

[Samarati&Sweeney’98, Meyerson&Williams’04]

Page 26: Current Trends

26

Information Leakage:Query-view Security

Secret Query View(s) Disclosure ?S(name) V(name,phone)

S(name,phone) V1(name,dept)V2(dept,phone)

S(name) V(dept)S(name)

where dept=‘HR’V(name)

where dept=‘RD’

TABLE Employee(name, dept, phone)Have data:

total

big

tiny

none

[Miklau&S’04, Miklau&Dalvi&S’05,Yang&Li’04]

Page 27: Current Trends

27

Summary on Information Disclosure

• The theoretical research:– Exciting new connections between databases

and information theory, probability theory, cryptography

• The applications: – many years away

[Abadi&Warinschi’05]

Page 28: Current Trends

28

Privacy

• “Is the right of individuals to determine for themselves when, how and to what extent information about them is communicated to others”

• More complex than confidentiality

[Agrawal’03]

Page 29: Current Trends

29

Privacy

Involves:• Data• Owner• Requester• Purpose• Consent

Example: Alice gives her email to a web service

[email protected]

Privacy policy: P3P

Page 30: Current Trends

30

Hippocratic Databases

DB support for implementing privacy policies.• Purpose specification• Consent• Limited use• Limited retention• …

[Agrawal’03, LeFevrey’04]

[email protected]

Privacy policy: P3P

Hippocratic DB

Protection against: Sloppy organizations Malicious organizations

Page 31: Current Trends

31

Privacy for Paranoids

• Idea: rely on trusted agents

[email protected]

Agent

[email protected]

[email protected]

foreign keys ?

[Aggarwal’04]

Protection against: Sloppy organizations Malicious attackers

Page 32: Current Trends

32

Summary on Privacy

• Major concern in industry– Legislation– Consumer demand

• Challenge:– How to enforce an organization’s stated

policies

Page 33: Current Trends

33

Fine-grained Access Control

Control access at the tuple level.

• Policy specification languages• Implementation

Page 34: Current Trends

34

Policy Specification Language

CREATE AUTHORIZATION VIEW PatientsForDoctors AS SELECT Patient.* FROM Patient, Doctor WHERE Patient.doctorID = Doctor.ID and Doctor.login = %currentUser

Contextparameters

No standard, but usually based on parameterized views.

Page 35: Current Trends

35

ImplementationSELECT Patient.name, Patient.ageFROM PatientWHERE Patient.disease = ‘flu’

SELECT Patient.name, Patient.ageFROM Patient, DoctorWHERE Patient.disease = ‘flu’ and Patient.doctorID = Doctor.ID and Patient.login = %currentUser

e.g. Oracle

Page 36: Current Trends

36

Two Semantics• The Truman Model = filter semantics

– transform reality– ACCEPT all queries– REWRITE queries– Sometimes misleading results

• The non-Truman model = deny semantics– reject queries– ACCEPT or REJECT queries– Execute query UNCHANGED– May define multiple security views for a user

[Rizvi’04]

SELECT count(*)FROM PatientsWHERE disease=‘flu’

Page 37: Current Trends

37

Summary of Fine Grained Access Control

• Trend in industry: label-based security• Killer app: application hosting

– Independent franchises share a single table at headquarters (e.g., Holiday Inn)

– Application runs under requester’s label, cannot see other labels

– Headquarters runs Read queries over them• Oracle’s Virtual Private Database

[Rosenthal&Winslett’2004]

Page 38: Current Trends

38

Data Encryption for Publishing

• Users and their keys:

• Complex Policies:

All authorized users: Kuser

Patient: Kpat

Doctor: Kdr

Nurse: Knu

Administrator : Kadmin

What is the encryption granularity ?

Doctor researchers may access trials Nurses may access diagnosticEtc…

Scientist wants to publish medical research data on the Web

Page 39: Current Trends

39

Data Encryption for PublishingAn XML tree protection:

<patient>

<privateData>

<name> <age>

<diagnostic>

JoeDoe 28

<address>

Seattle

<trial>

<drug>

flu

<placebo>

Kuser

Kpat (KnuKadm) Knu KdrKdr

Kpat Kmaster Kmaster

Tylenol Candy

[Miklau&S.’03]

Doctor: Kuser, Kdr

Nurse: Kuser, Knu

Nurse+admin: Kuser, Knu, Kadm

Page 40: Current Trends

40

Summary on Data Encryption

• Industry:– Supported by all vendors:

Oracle, DB2, SQL-Server– Efficiency issues still largely unresolved

• Research:– Hard theoretical security analysis

[Abadi&Warinschi’05]

Page 41: Current Trends

41

Secure Shared Processing

• Alice has a database DBA

• Bob has a database DBB

• How can they compute Q(DBA, DBB), without revealing their data ?

• Long history in cryptography• Some database queries are easier than general case

Page 42: Current Trends

42

Secure Shared Processing

[Agrawal’03]

Alice Bob

a b c d c d e

h(a) h(b) h(c) h(d) h(c) h(d) h(e)

Compute one-way hash

Exchange

h(c) h(d) h(e) h(a) h(b) h(c) h(d)What’s wrong ?

Task: find intersectionwithout revealing the rest

Page 43: Current Trends

43

Secure Shared ProcessingAlice Bob

a b c d c d e

EB(c) EB(d) EB(e) EA(a) EA(b) EA(c) EA(d)

commutative encryption:h(x) = EA(EB(x)) = EB(EA(x))

EA(a) EA(b) EA(c) EA(d) EB(c) EB(d) EB(e)

EA EB

h(c) h(d) h(e) h(a) h(b) h(c) h(d)EA EB

h(a) h(b) h(c) h(d) h(c) h(d) h(e)

[Agrawal’03]

Page 44: Current Trends

44

Summary on Secure Shared Processing

• Secure intersection, joins, data mining

• But are there other examples ?

Page 45: Current Trends

45

Outline

• Traditional data security

• Two attacks

• Data security research today

• Conclusions

Page 46: Current Trends

46

Conclusions

• Traditional data security confined to one server– Security in SQL– Security in statistical databases

• Attacks possible due to:– Poor implementation of security policies: SQL

injection– Unintended information leakage in published data

Page 47: Current Trends

47

Conclusions• State of the industry:

– Data security policies: scattered throughout applications– Database no longer center of the security universe– Needed: automatic means to translate complex policies into

physical implementations

• State of research: data security in global data sharing– Information leakage, privacy, secure computations, etc.– Database research community has an increased appetite for

cryptographic techniques

Page 48: Current Trends

48

Questions ?