csc321/545: summary of database techniques

CSC321/545: Summary of Database Techniques Dr. Zhen Jiang Computer Science Department West Chester University West Chester, PA 19383

Post on 24-Feb-2016

Page 1: CSC321/545:  Summary of  Database Techniques

CSC321/545: Summary of Database Techniques
Dr. Zhen Jiang
Computer Science Department
West Chester University
West Chester, PA 19383

Page 2: CSC321/545:  Summary of  Database Techniques

Outline
Overview
◦ Non-relational DB system
◦ NonSQL DB system
Injection
Inference
◦ Role access control (UML)
◦ Perturbation
Design

Page 3: CSC321/545:  Summary of  Database Techniques

Database System Overview
[Figure labels: query request, DBMS, data, database]

Page 4: CSC321/545:  Summary of  Database Techniques

Integration
Administration
Security & encryption
Privacy & inference
Transaction & injection
Sketching & hashing

Page 5: CSC321/545:  Summary of  Database Techniques

Application Programming Interface (API)

integration

Page 6: CSC321/545:  Summary of  Database Techniques

Traditional Database
The relation of key vs. non-key
The relation between key and foreign key
◦ Intra-table relation
◦ Inter-table relation
E-R diagram
◦ http://www.cs.wcupa.edu/~zjiang/ER.pdf
◦ Any regularity?
Arbitrary & Abrupt
◦ Ambiguity
Sample of such ambiguity in normalization process caused by the lack of background
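The key, non-key, and foreign-key relations above can be sketched with Python's built-in sqlite3 module. The department and employee tables and their columns are hypothetical, chosen only to show one intra-table relation (a key determining a non-key column) and one inter-table relation (a foreign key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Intra-table relation: the key (dept_id) determines the non-key column (name).
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Inter-table relation: employee.dept_id is a foreign key into department.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (1, 'Computer Science')")
conn.execute("INSERT INTO employee VALUES (10, 'Alice', 1)")

def insert_orphan():
    """Try to reference a department that does not exist; return True if rejected."""
    try:
        conn.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
    except sqlite3.IntegrityError:
        return True
    return False

orphan_rejected = insert_orphan()
# The join follows the foreign key from employee to department.
joined = conn.execute(
    "SELECT e.name, d.name FROM employee e JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
```

The rejected insert shows the inter-table relation being enforced; the join shows it being used.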

Page 7: CSC321/545:  Summary of  Database Techniques

Non-Relational Database
Data does not relate in the true sense
◦ e.g., Mongo, which handles document stores or other content and/or metadata stores

Page 8: CSC321/545:  Summary of  Database Techniques

NonSQL Database
A clearer structure
e.g., Kobo, Playtika (mobile service)
Distributed database system
◦ No need and not possible for a "join" operator
◦ Fast third-party data aggregation
◦ Fast caching for application objects
◦ Globally distributed data repository
◦ E-commerce and internet burstiness
◦ Game (data intensive applications)
◦ Ad targeting (social networks)

Page 9: CSC321/545:  Summary of  Database Techniques

[Figure labels: query request, "Ok?", API, DBMS]

Page 10: CSC321/545:  Summary of  Database Techniques

Injection
Direct DB injection
◦ http://www.youtube.com/watch?v=v6bphRHH4sM
Indirect DB injection
◦ http://www.irongeek.com/i.php?page=videos/webgoat-sql-injection
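A minimal sketch of direct SQL injection, assuming a hypothetical users table and login check (neither comes from the slides or the linked videos). It contrasts a query built by string concatenation with a parameterized one, using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Plain-text passwords are used here only to keep the illustration short.
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name, password):
    # User input is pasted into the SQL text, so it can rewrite the query.
    sql = f"SELECT COUNT(*) FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(sql).fetchone()[0] > 0

def login_parameterized(name, password):
    # Placeholders keep the input as data; it is never parsed as SQL.
    sql = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(sql, (name, password)).fetchone()[0] > 0

# Classic payload: closes the string literal and appends a condition that is always true.
attack = "' OR '1'='1"
```

With the payload as the password, login_vulnerable("alice", attack) accepts the login because the WHERE clause becomes always true, while login_parameterized("alice", attack) rejects it.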

Page 11: CSC321/545:  Summary of  Database Techniques

You need a tool to trace transactions

Page 12: CSC321/545:  Summary of  Database Techniques

Interrupt each transaction as you debug, and trace the record of each transaction
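The slides do not name the tracing tool. As one possible stand-in at the database level, the sketch below records every statement a connection executes, using the trace callback of Python's built-in sqlite3 module. The account table is hypothetical.

```python
import sqlite3

trace = []  # every SQL statement the connection executes, in order

conn = sqlite3.connect(":memory:")
# Ask SQLite to call trace.append with the text of each statement it runs.
conn.set_trace_callback(trace.append)

conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (?, ?)", (1, 100))
conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (30, 1))
conn.commit()
```

Printing the trace list after a suspicious request shows exactly which statements reached the database, which is the record needed when debugging an injection.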

Page 13: CSC321/545:  Summary of  Database Techniques

Authorization
◦ Restrict access to data and restrict the actions that people may take (when they access data).
Encryption
◦ Scramble data so that the data cannot be read.
Authentication
◦ Password check
◦ Key protection, not to protect everything!
https://www.youtube.com/watch?v=3QnD2c4Xovk
Role based access control
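Role based access control can be sketched as two lookups: user to roles, and role to permissions. The roles, users, and resources below are hypothetical and serve only to show the authorization check.

```python
# Hypothetical roles and permissions, chosen only for illustration.
ROLE_PERMISSIONS = {
    "clerk": {("orders", "read")},
    "manager": {("orders", "read"), ("orders", "write"), ("salaries", "read")},
}
USER_ROLES = {"smith": {"clerk"}, "jones": {"clerk", "manager"}}

def is_allowed(user, resource, action):
    """Grant access only if one of the user's roles carries the permission."""
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

Permissions attach to roles rather than to individual users, so changing what a manager may do means editing one entry instead of every manager's account.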

Page 14: CSC321/545:  Summary of  Database Techniques

Inference (aggregation)
Basically, inference occurs when users are able to piece together (aggregate) information to determine a fact that should be protected.
Role cheating

Page 15: CSC321/545:  Summary of  Database Techniques

Flight ID  Cargo Hold  Contents     Classification
1254       A           Boots        Unclassified
1254       B           Atomic bomb  Top secret
1254       C           Butter       Unclassified

General Jones (who has a top security clearance) requests information and would see all three.

Civilian Smith (who has no security clearance) requests the data and would see the following data:

Flight ID  Cargo Hold  Contents  Classification
1254       A           Boots     Unclassified
1254       C           Butter    Unclassified

Page 16: CSC321/545:  Summary of  Database Techniques

When Smith sees that nothing is scheduled for hold B on flight 1254, he might attempt to insert the record, and his insertion will fail due to the unique constraint on cargo space availability.

He now has all the data he needs to infer that there is a secret shipment on flight 1254.

He could then cross-reference the flight information table to find out the source and destination of the secret shipment and various other information.
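Smith's probe can be reproduced with Python's built-in sqlite3 module. The table layout and the clearance filter below are assumptions; the slides describe a unique constraint on cargo space but do not give a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per (flight, hold), enforced by the primary key.
conn.execute(
    "CREATE TABLE cargo ("
    " flight_id INTEGER, hold TEXT, contents TEXT, classification TEXT,"
    " PRIMARY KEY (flight_id, hold))"
)
conn.executemany(
    "INSERT INTO cargo VALUES (?, ?, ?, ?)",
    [
        (1254, "A", "Boots", "Unclassified"),
        (1254, "B", "Atomic bomb", "Top secret"),
        (1254, "C", "Butter", "Unclassified"),
    ],
)

def visible_holds(clearance):
    """Holds a user may see: all rows with top-secret clearance, else unclassified rows only."""
    if clearance == "Top secret":
        sql = "SELECT hold FROM cargo ORDER BY hold"
    else:
        sql = "SELECT hold FROM cargo WHERE classification = 'Unclassified' ORDER BY hold"
    return [row[0] for row in conn.execute(sql)]

def smith_probe(hold):
    """Smith tries to book a hold; a constraint failure reveals a hidden row."""
    try:
        conn.execute("INSERT INTO cargo VALUES (1254, ?, 'Socks', 'Unclassified')", (hold,))
    except sqlite3.IntegrityError:
        return "hidden shipment inferred"
    return "insert succeeded"
```

Smith's view lists only holds A and C, yet smith_probe("B") fails on the key constraint. The error message itself is the leak.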

Page 17: CSC321/545:  Summary of  Database Techniques

Poly-instantiation: allows different records for the same key (hold B) to exist in the same table, one per classification level.

Overbooking!
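One common way to model poly-instantiation, assumed here rather than taken from the slides, is to make the classification part of the key. The sketch reuses a hypothetical cargo table with Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: polyinstantiation is modelled by adding the classification to the key.
conn.execute(
    "CREATE TABLE cargo ("
    " flight_id INTEGER, hold TEXT, contents TEXT, classification TEXT,"
    " PRIMARY KEY (flight_id, hold, classification))"
)
conn.execute("INSERT INTO cargo VALUES (1254, 'B', 'Atomic bomb', 'Top secret')")
# Smith's insert for the same hold now succeeds, so a failure no longer leaks anything.
conn.execute("INSERT INTO cargo VALUES (1254, 'B', 'Socks', 'Unclassified')")

rows_for_hold_b = conn.execute(
    "SELECT contents, classification FROM cargo"
    " WHERE flight_id = 1254 AND hold = 'B' ORDER BY classification"
).fetchall()
```

Two records now claim the same physical hold, which is the overbooking cost noted on the slide.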

Page 18: CSC321/545:  Summary of  Database Techniques

Other causes such as:
◦ Count of highly preferred customers
◦ Average salary
The problem is difficult
◦ Information?
  Content: what is critical?
◦ Path?
  Hold A-C, Hold B? Total space? Probing!
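The average-salary case can be illustrated with a differencing probe: two permitted aggregate queries whose difference exposes one protected value. The names and salaries are invented for the example.

```python
from statistics import mean

# Hypothetical salaries, invented for the example.
salaries = {"ann": 50_000, "bob": 62_000, "cat": 71_000, "dan": 90_000}

def average_salary(names):
    """The only query the user may run: an aggregate over a set of employees."""
    return mean(salaries[n] for n in names)

everyone = list(salaries)
without_dan = [n for n in everyone if n != "dan"]

# Two permitted aggregates pin down one protected value.
total_all = average_salary(everyone) * len(everyone)
total_rest = average_salary(without_dan) * len(without_dan)
inferred_dan = total_all - total_rest
```

Neither query returns an individual salary, but together they reveal Dan's. This is why limiting access to aggregates alone does not stop inference.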

Page 19: CSC321/545:  Summary of  Database Techniques

Existing solutions
◦ Limit access
  Role access control
  Too many restrictions could seriously hinder the functionality

Page 20: CSC321/545:  Summary of  Database Techniques

◦ Perturbation
  Alter the data so that individual details are inaccurate but overall generalizations remain accurate.
  Include dummy data in the results returned to an unauthorized query.
  Protect sensitive data, but also preserve the properties of the dataset.
  Sketching with a probability of p:
    With probability p, use the original data
    With probability (1-p), use a replacement
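A minimal sketch of the probability-p perturbation. The slides do not say how the replacement is chosen; drawing it uniformly from the value domain is an assumption made here, and the function and variable names are hypothetical.

```python
import random

def perturb(values, p, domain, rng):
    """Keep each value with probability p; otherwise replace it.

    Assumption: the replacement is drawn uniformly from the domain.
    """
    low, high = domain
    return [v if rng.random() < p else rng.uniform(low, high) for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = [42.0] * 10_000
released = perturb(original, p=0.8, domain=(0.0, 100.0), rng=rng)

# Fraction of rows that still carry the original value; it should be close to p.
kept_fraction = sum(1 for v in released if v == 42.0) / len(released)
```

No single released row can be trusted, yet about a p fraction of the rows is unchanged, which is what makes the dataset's properties recoverable.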

Page 21: CSC321/545:  Summary of  Database Techniques

Preservation
Given each query f in the original table T with n rows, build a re-constructible query f' in the revised table T' (with n rows), so that the result difference can be controlled in a limited range with a probability of p.

In other words, the expected number of rows that get perturbed is n(1-p). For a domain ∆C, n(1-p)k rows are expected to lie within the queried value range (k∆C), where k ∈ [0, 1]. Among the total nr rows observed from T' in the value range (k∆C), subtracting the n(1-p)k rows gives the estimate for the number of unperturbed rows. Scaled up by 1/p, we get the total number of original rows (n0), as only a p fraction of rows were retained: n0 = (nr - n(1-p)k) / p.
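The reconstruction step can be checked numerically. The sketch below assumes replacements are drawn uniformly over the domain, so that a fraction k of them lands in the queried range. The toy data, the range, and all names are hypothetical.

```python
import random

def estimate_original_count(n, n_r, p, k):
    """Estimate n0: remove the expected n(1-p)k replaced rows from n_r, then scale by 1/p."""
    return (n_r - n * (1 - p) * k) / p

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n, p = 100_000, 0.7
low, high = 0.0, 100.0        # the domain, of width 100
q_low, q_high = 20.0, 40.0    # queried range, of width 20
k = (q_high - q_low) / (high - low)  # fraction of the domain covered by the query

# Skewed toy data in [0, 100], so the true count differs from n * k.
original = [rng.uniform(low, high) ** 0.5 * 10 for _ in range(n)]
true_n0 = sum(q_low <= v < q_high for v in original)

# Perturb: keep with probability p, otherwise replace uniformly over the domain (assumption).
released = [v if rng.random() < p else rng.uniform(low, high) for v in original]
n_r = sum(q_low <= v < q_high for v in released)

estimate = estimate_original_count(n, n_r, p, k)
```

Only n, p, k, and the observed count n_r are needed; the estimate lands close to true_n0 without access to the original table.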

Page 22: CSC321/545:  Summary of  Database Techniques

Security and Privacy
f' = n0/n
[n-n0, n0]
A = [n-nr, nr]
a = Pr(row in T) vs. b = Pr(row in perturbed table T')
Privacy breach, security threshold > a / b
b ≈ b' (sketch does not help to distinguish the cases)
Server storage (with a) vs. client retaining (with b)

Page 23: CSC321/545:  Summary of  Database Techniques

OO Design for DB Systems
Injection, inference
◦ RBAC (role based access control)
◦ Use case
  http://www.cs.wcupa.edu/~zjiang/intro_uc.ppt
◦ Class design is needed for better maintaining the data ownership
  http://www.cs.wcupa.edu/~zjiang/csc545_oo_design.htm
Non-relational DB
◦ Activity pattern – prediction of future relation, e.g., credit card security
NonSQL DB
◦ Relations in structure for the use.