PRiSM Lab. - UMR 8144
Personal Data Management with Secure Hardware: The Advantage of Keeping your Data at Hand
Nicolas Anciaux, Benjamin Nguyen & Iulian Sandu Popa
INRIA Paris-Rocquencourt & University of Versailles St-Quentin
IEEE MDM’13 Advanced Seminar, 4th June 2013
Mass-generation of personal data
Data sources have mostly turned to digital
Paper-based interactions, e.g., banking, e-administration
Analog processes, e.g., photography
Mechanical interactions, e.g., opening a door
[Figure: people recording and people listening, St Peter's Square, Rome]
Where is your personal data? … In data centers
112 new emails per day → Mail servers
800 pages of social data → Social networks
Daily interaction data → Search engines, Telco, Transport, etc.
List of purchases → Central purchasing organizations
All this opens the way to exhilarating economic perspectives…
“Personal data is the new oil” (quoting WEF)
Good news (for the economy)
$2 billion a year spent by US companies on third-party data about individuals (Forrester Report)
(oil exploration and production is $400 billion)
$44.25 is the estimated return on $1 invested in email marketing (oil is up to $0.5/yr)
States (e.g., France in 2013) investigate the idea of taxing personal data (personal data are resources collected for free, escaping value-added tax)
Facebook: value / #accounts ≈ $50
Google: $38 billion business sells ads based on how people search the Web
Not only for Google & Facebook but also, e.g., Amazon (knows purchase intent), mail order companies, loyalty program sailors, banks & insurance, employment market (LinkedIn, Viadeo), travel & transportation (voyages-sncf), the « love » market (Meetic), etc.
“Personal data is the new oil” (quoting WEF) … or bad news? (for the fields)
How would oil producers behave?
They would offer to exploit your oil field for free
They would offer free services to you
… which would cost them only a few cents (e.g., HW/SW to manage emails)
and would provide services which may not be for you (and not advertised)
… which would yield healthy returns (e.g., advertisement and profiling, location tracking and spying, …)
In other words: your personal data would be processed by sophisticated data refineries …
Many reasons to get anxious…
Even the most defended servers are successfully attacked
Including those of the Pentagon, FBI and NASA
E.g., Feb. 2013: 1 TB hacked daily; victims include US military facilities
Personal data cannot avoid being subject to negligence
+1000 data leak incidents, +100 million records affected per year
(reported by Open Security Foundation, Privacy Rights Clearinghouse & Ponemon Institute)
Personal data is often regulated by loose privacy policies
Obscure policies, assumed accepted when using the service
Ill-intentioned scrutinization flourishes
Justified by business interests, governmental pressures and inquisitiveness among people
Intelius.com, which makes scrutinization its business: “Live in the know”
… is recipient of the 2011 “5000-Fastest Growing Private Companies” Award.
Only a few actors hold most of the data…
E.g., Google: “We have YouTube, Gmail, Google Docs, Google Calendar, Google+, Google Wallet, Chrome browser, Android mobile platform, etc.”
… and users cannot escape this without being excluded from necessary services…
=> The risk of a backlash is growing
Yet data centers openly assume offenses against privacy laws
Privacy policies of dominant actors are invalid vs. EU & US standards
Too vague about purposes for which personal data is collected
Security principle is violated
E.g., Facebook does not guarantee any level of data security (2013): “We do our best to keep Facebook safe, but we cannot guarantee it”
Consent principle is not respected
Personal data is collected, transferred, used without user’s consent (Microsoft, Pandora, Yelp get personal information from Facebook)
Personal data of non-users are collected (non-user profiles)
Policies change (frequently) without requesting users’ consent
Openness (view & correct false data) is not provided in practice
E.g., EU versus Facebook affair: 40000 users still waiting (2013)…
Data retention limits are not applied
Data retention is far too long w.r.t. collection purposes (e.g., Google)
Question: After all, is privacy really required?
Great untruth #1: Being privacy aware has limited effects for business
Argument:
Users do not switch to non-invasive systems (Ixquick 3% market)
Alternative answer:
MORE privacy gives same number of users, but MORE personal data (study of +5000 Facebook users)
Great untruth #2: Privacy is an old(-fashioned) concept
Argument:
Young people expose their personal life online more readily than adults
“Privacy is no longer the social norm” (by M. Zuckerberg)
Alternative answer:
Household is the adult’s private sphere, but for a teen it is not
The online sphere is their private sphere, far from parents’ prying eyes
Answer: YES, privacy is really required
Vulnerable citizens remain under threat
A study conducted on 1000 pupils shows that 72% received unpleasant contact from strangers via online profiles
The current practice “Accept the policy or quit” is not a good option
Blatant failures of emblematic applications due to privacy concerns
National EHRs failed in many countries because doctors feel spied on and patients fear being discriminated against. Prejudice is economic & social
A new digital divide: applications wishing to follow UN charters
Organizations like NGOs, healthcare companies, etc., must build their applications on infrastructures complying with worldwide privacy laws
Citizens & governments are more and more concerned
More privacy complaints (+30% in France in 2011), more citizens feel that their privacy online is not sufficiently protected (18-24 year-olds become the dominant category with 78%)
WEF: high risk of locking the value of personal data if privacy is not provided (2012)
Is the current centralised model good w.r.t. privacy protection?
Intrinsic problem #1: personal data is exposed to sophisticated attacks
Cost of attacks proportional to benefits (high on a centralized system)
One person's negligence may affect millions
E.g., hackers who cracked Sony’s PlayStation 3 game system last year got 12 million credit and debit card numbers …
Intrinsic problem #2: personal data is hostage of sudden privacy changes
Centralised administration of data means delegation of control
This leads to regular changes, with application (and business) evolution, with mergers and acquisitions, etc.
Increasing security does not solve those intrinsic limitations
E.g., TrustedDB [VLDB11] proposes tamper-resistant hardware to secure outsourced centralized databases.
Alternative solutions?
Alternative solution: for the W.E.F. it would be “a data platform that allows individuals to manage the collection, usage and sharing of data in different contexts and for different types and sensitivities of data”
Alternative privacy-preserving technical solutions are flourishing
Based on decentralized & user-centric principles
E.g., FreedomBox, ProjectVRM, Personal data servers
Goal of this presentation: catch a glimpse of the Holy Grail
A Personal Data Ecosystem…
… built around user-centricity and trust
Transparency: what data is captured, how, for what purpose
Trust: security, integrity, availability, reliability
Control: over the use and sharing of personal data
Value: assess the value created by the use of the data
Outline of the seminar
PART I. Decentralized architectures
Review of privacy-oriented decentralized solutions
An interesting attempt or a panacea?
Abstract architecture with secure hardware
A sea change?
PART II. Resource constrained data management
Hardware description, constraints and problem statement
Existing data management techniques for constrained HW
Representative structures and evaluation strategies
PART III. Global processing
Review of existing solutions
Distributed processing on the asymmetric architecture
CONCLUSION. A view of expected instances
PART I: Decentralized Architectures
Outline of Part I
Review of privacy-preserving decentralized solutions
Infomediaries
Vendor Relationship Management
FreedomBox
Decentralized Social Networks
Personal Data Server (PDS) architecture
A trusted, secure and decentralized architecture for personal data management
Infomediaries (since late 1990s)
Infomediary: trusted third party helping consumers to take control over the personal information used by marketers
Personal information is the property of individuals, not of the one who gathers it
Personal data has value → provide users with means to monetize and profit from their information profiles
Trust: separate the control over personal data from the service provider
AllAdvantage, Bynamite, Mydex, Adnostic, Lumeria, …
Source: www.identitywoman.net/mass-educational-databases-wrong-architecture
Vendor Relationship Management (VRM, projectvrm.org, since 2006)
VRM: software tools for customers to provide them independence from vendors
VRM is a software implementation of an infomediary
Observations
No privacy implemented in the Internet, which mainly works as a Master-Slave system
Customer Relationship Management (CRM), a $14 billion market in 2013, but the customers are not involved
“Big Data is turning into Big Brother” (Washington Post)
(Some of) VRM principles
Give the customer independence and a way to engage
Specify your own terms of service
Be able to gather, examine and control the use of your own data
VRM tools to do all that either on your own or with the help of a “fourth party” (a third party that works for you)
A dozen open source and commercial development projects in 2012
Privowny (example of VRM software)
Source: http://cyber.law.harvard.edu/interactive/events/2012/06/searls
FreedomBox (freedomboxfoundation.org/, since 2010)
Personal plug servers running open software to regain privacy and control
Return the Internet to its intended P2P architecture (dehierarchicalization)
Keep your data in your home
Base hardware requirements
Cheap (around $30 for a plug server)
Power consumption < 15W
RAM > 256MB, Flash storage for file system > 512MB
Communication interfaces: network, serial, JTAG
Storage interfaces: SATA, USB, SD
Noise level < 20dB
FreedomBox
Software stack covering a wide range of applications:
Secure and anonymous communications
Distributed Social Networks
Personal Cloud
VRM
Trust: secure and anonymous communications, open software, distribution
Decentralized Social Networks (DSN)
Distributed SN (P2P) or Federated SN (interoperable client-server implementations)
Main challenges of privacy-preserving DSN
Secure message hosting
Secure and anonymous message transfer
Message hosting
Encryption and distributed hash table (Lotusnet, PeerSoN), encryption and trusted contacts (Safebook)
Attribute-based encryption for fine-grained access control (Persona)
Self-hosting (FreedomBox)
Message transfer in DSNs
Message transfer: communication privacy optimized on the social graph and physical network topology
Hop-by-hop encryption among trusted users (Freenet)
Anonymous routing (Safebook, FreedomBox)
[Figure: Matryoshka, anonymous routing in Safebook]
Source: Safebook: A Privacy-Preserving Online Social Network Leveraging on Real-Life Trust
Diaspora* DSN
Diaspora* (https://joindiaspora.com/, since 2010, more than 400 thousand users in 2013, cf. Wikipedia): appeared as a response to the many privacy issues engendered by Facebook/Google
“...our distributed design means no big corporation will ever control Diaspora. Diaspora* will never sell your social life to advertisers, and you won’t have to conform to someone’s arbitrary rules or look over your shoulder before you speak.”
Trust: distribution, open software, users own their data
Summary of Distributed Solutions
Common main objective: privacy-preserving services
Different types of decentralized architectures
Three-tier architecture (Infomediary)
Two-tier architecture (VRM)
P2P (FreedomBox, Decentralized Social Networks)
Hybrid architecture (Decentralized Social Networks, Personal Cloud-FreedomBox, Personal Data Store)
Built on common principles
User-centricity and trust (transparency, security, control)
Critique of Decentralized Approaches
The Good: do not exhibit the intrinsic limitations of centralized solutions (privacy, security, etc.)
The Bad: yet, they’ve generally known little success (the privacy paradox)
… and the Challenging: raise important, but interesting challenges
Economic: viable business models compatible with privacy
Technical: design a secure Personal Data Server
1 - Secure storage of personal data (i.e., local requirements)
2 - Provide the same level of functionality, responsiveness and availability as a centralized solution (i.e., global requirements)
1. Secure storage with a Personal Data Server
Secure storage under user’s control
Data must be made highly available, resilient to failure and protected against confidentiality and integrity attacks
Cryptographic keys must be secured and only accessible by the user
Accessing data from anywhere without privacy breaches
Data integration/aggregation
Aggregate user’s data in a single location: better usage, privacy, value
Personal data is heterogeneous
Structured/unstructured data, text, images, sound, video …
Records of transactions, clickstream data, bookmarks, bills, profiles, projects, preferences …
Data modeling, data integration, querying
Privacy policy definition
Intuitive, simple ways for users to define access control rules
Existing attempts at a Personal Data Server
Many recent initiatives (Mydex, the Locker Project, Personal.com, data.fm, Qiy Foundation, …)
Personal data stores, personal data lockers/vaults, personal cloud
Focus on secure storage and data aggregation
Managed locally by the user (The Locker Project) or outsourced to a trusted third party (Mydex, Personal.com)
Federate data from different sources (The Locker Project)
Weaknesses of existing solutions
Important security breaches related to the data storage
Data is stored encrypted in the Cloud (Mydex, Personal.com)
But the cryptographic keys are under the control of the service provider
Data is stored locally by the users on their personal computers (The Locker Project)
Raises several problems related to security, durability and availability
Many functionalities required to obtain a complete Personal Data Ecosystem are not provided
E.g., global querying, anonymous data publishing, secure sharing, secure usage and accountability
Personal Data Server: functional architecture
Database engine to securely manage personal data
Facilitate the development of applications: data structuring, integrity, queries, transactions
Facilitate the definition/enforcement of access and usage control rules
[Figure: PDS functional architecture. Labels: HMI / Applications, Sensors, Administration, Data Sharing Manager, Query Manager, Recovery, Anonymizer; CONTROL: Access & Usage Control, Context Manager; DATA MODEL: Key Value Store, Relational DBMS, Files, Spatio-temporal; RAW ACCESS: Log Containers, File System, Remote Files; the cloud]
2. Required global functionalities of a Personal Data Server
Global querying
Personal data is essential to the development of societal applications (smart cities, transport, energy, healthcare …)
Transparently query many PDSs as with a centralized database
Anonymous data publishing
PDS must allow users to anonymously participate in global treatments
Distributed secure sharing
Users must get a proof of legitimacy for the credentials exposed by the participants of a data exchange
Secure usage and accountability
Users must not lose control over their data through data sharing
KuppingerCole, a security analyst company, promotes Life Management Platforms: “a new approach for privacy-aware sharing of sensitive information, without the risk of losing control of that information”
Privacy principles must be enforced for the externalized data
Personal Data Server: complete functional architecture
Provide global processing facilities (e.g., global queries, production of anonymized releases, data sharing) similar to those of a centralized database
[Figure: complete PDS functional architecture. Labels: HMI / Applications, Sensors, Data Sharing, Administration, Data Sharing Manager, Query Manager, Recovery, Anonymizer; CONTROL: Access & Usage Control, Context Manager; DATA MODEL: Key Value Store, Relational DBMS, Files, Spatio-temporal; RAW ACCESS: Log Containers, File System, Remote Files; the cloud]
How to enforce the security of the PDS architecture
Advent of secure hardware at the edges of the Internet
Secure portable tokens: Secure MCU + Flash storage
A sea change for personal data services
Offer privacy guarantees ( >> Trust )
[Figure: Secure Portable Token = Secure MCU + FLASH (GB size). Form factors: SIM card (two chips superposed), USB form factor (MicroSD Flash), USB form factor (with SIM card), Contactless + USB (8GB Flash), Secure MicroSD (4GB Flash)]
Why trust personal secure HW solutions?
Users store their own data → minimize abusive usage
Self (user) managed platform → no DBA attack
Tamper-resistance + certified code/secure execution + single user → ratio cost/benefit of an attack is very high
Enforce privacy principles for externalized (shared) data, provided the recipient of the data is another PDS
Observation: a user does not have all the privileges over the data in her PDS
From a local functional architecture to a global distributed architecture
[Figure: PDS functional architecture. Upper layers (HMI / Applications, Sensors, Administration, External Data Manager, Query Manager, Recovery, Anonymizer, Access & Usage Control, Context Manager): implementation depending on the distributed architecture model. Lower layers (DATA MODEL: Key Value Store, Relational DBMS, Files, Spatio-temporal; RAW ACCESS: Log Containers, File System, Remote Files; the cloud): device dependent implementation]
Global PDS Architectures: a spectrum of solutions
Durability
Secure sharing
Global querying
PDS asymmetric architecture
Built on Secure Portable Tokens
Challenges
Embedded data management (Part II of the seminar)
Global querying (Part III of the seminar)
Other configurations of global architectures are presented in the Conclusion and Perspectives
PART II: Resource Constrained Data Management
Resource constrained data management
Goal: manage our own data in a secure & personal device
Personal folders can be large
E-mails, medical records, official documents (admin., bank, etc.), e-bills (telecom, Amazon, IP, etc.), digital traces (transport, geo-localized services, etc.), …
A query engine must be embedded (to extract authorized results)
Outline
Target hardware platforms & constraints
Existing techniques & problem statement
The “small RAM – NAND FLASH” paradox
A general framework to solve the problem
Representative proposals for search engine & relational DBMS
Target hardware: secure personal devices
Different form factors
SIM card … on which a GB flash chip is superposed
Memory devices … in which a secure chip is implanted: USB MicroSD reader; Contactless + USB, 8GB Flash; Secure MicroSD, 4GB Flash
Common architecture
GBs of memory: NAND FLASH (dense, robust, low cost)
Tamper resistant microcontroller [SC02]
Miniaturization,
Protective layers (carrying signal),
Multi-layering (hide sensitive lines),
Sensors (light/temp/power/freq.)
⇒ Highly costly to attack
& communication interfaces (USB, contactless, APDU)
[Figure: NAND FLASH and (Secure) MCU connected by a BUS]
Strong hardware constraints … with a big impact on data management
Microcontrollers
Small RAM (<128 KB) ⇒ favor pipeline evaluation, which requires (many) indexes
RAM is not dense
Security is linked with size
NAND FLASH
High cost of random writes ⇒ data structures and strategies must minimize random writes
Pages are erased before write
Erase by block vs. write by page
How do existing techniques deal with these constraints?
Existing Techniques
Light & embedded databases
Embedded DBMS, e.g., SQLite, BerkeleyDB
Light DBMS, e.g., DB2 Everyplace, Oracle Database Mobile Server
Target small but powerful devices (e.g., smart phones, set top boxes)
⇒ Small RAM & NAND Flash constraints are not supported
FLASH-aware indexing techniques
B+-Tree adaptation: BFTL [TECS07], LA-Tree [VLDB09], FD-Tree [VLDB10]
Store index updates in a Flash-resident log, itself indexed in RAM
Updates are committed to the B+-Tree in batch mode (amortize write cost)
Vary in the way log/RAM index are managed
⇒ Not compliant with small RAM
Small RAM ⇒ small index in RAM ⇒ high commit frequency ⇒ low gains
Existing Techniques (cont.)
Flash-aware implementation of key-value stores
SkimpyStash [SIG11], LogBase [VLDB12], SILT [SOSP11]
Store the key-value pairs in a log structure in FLASH
An index is maintained in RAM (relatively large size, ~1B per key-value pair)
⇒ Incompatible with small RAM
Data management techniques dedicated to MCU (SoC)
Pioneering proposals in the area of DBMS for smartcards
PicoDBMS [VLDBJ01], VSDB [TOIS03]
Exploit byte-level write accesses (EEPROM, NOR) not available in NAND FLASH
Data management techniques on-chip (details next)
RDBMS: GhostDB [SIG07], PBFilter [IS12], MiloDB [DAPD13]
Search engines: MAX [TSN08], Snoogle [TPDS10], Microsearch [TECS10]
Problem statement
Problem: execute queries with a small RAM on large volumes of data stored in NAND FLASH
The “small RAM – NAND FLASH” combination… leads to paradoxical solutions!
Evaluating queries with a small RAM → pipeline strategy → build indexes → index maintenance → many random writes … unacceptable costs in NAND Flash
Compensate → increase RAM consumption
How do recent works resolve the problem?
General framework to solve the problem
Identify/design the needed indexes for a pipeline evaluation
Organize them into Sequentially Written Structures (SWS)
… structures which satisfy Flash constraints by construction:
Allocation & de-allocation are only made on a BLOCK basis
Partial garbage collection never occurs (avoids costly GC)
Pages are written sequentially (and never updated nor moved)
Proscribes random writes by construction
If more scalability is needed: reorganize the SWSs
Transform an SWS into a more efficient SWS
NB: transformation itself must only rely on (temporary) SWSs…
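The SWS rules above can be sketched in a few lines of Python. This is only an illustrative model, not code from the cited systems: the class name `SWS`, the parameters `page_size` and `pages_per_block`, and the in-memory lists standing in for Flash blocks are all assumptions made for the example.

```python
class SWS:
    """Toy model of a Sequentially Written Structure (hypothetical API).

    Pages are appended in order and never rewritten; space is allocated
    and released one whole block at a time.
    """

    def __init__(self, page_size=4, pages_per_block=2):
        self.page_size = page_size              # records per page (assumed unit)
        self.pages_per_block = pages_per_block  # pages per Flash block (assumed)
        self.blocks = []                        # each block is a list of pages
        self._buffer = []                       # the single RAM page being filled

    def append(self, record):
        """Buffer a record in RAM; write the page once it is full."""
        self._buffer.append(record)
        if len(self._buffer) == self.page_size:
            self.flush()

    def flush(self):
        """Write the buffered page at the end of the structure."""
        if not self._buffer:
            return
        if not self.blocks or len(self.blocks[-1]) == self.pages_per_block:
            self.blocks.append([])              # allocation on a block basis
        # A written page is stored as an immutable tuple: never updated or moved.
        self.blocks[-1].append(tuple(self._buffer))
        self._buffer = []

    def scan(self):
        """Read all records back in write order."""
        for block in self.blocks:
            for page in block:
                yield from page

    def drop(self):
        """De-allocate the whole structure, block by block (no partial GC)."""
        freed = len(self.blocks)
        self.blocks = []
        self._buffer = []
        return freed
```

There is deliberately no `update` or `delete` method: changing the content of an SWS means writing a new SWS and dropping the old one, which is what the reorganization step relies on.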
How do recent works implement this methodology?
First illustration: embedded search engine
Information retrieval queries
From a set of terms, retrieve the K most relevant documents (according to a weight function like TF-IDF)
$\text{TF-IDF}(doc) = \sum_{t_i \in Q} value_{t_i,doc} \times \log\left( \frac{|\{doc\}|}{|\{doc \text{ containing } t_i\}|} \right)$, where $Q = \{t_i\}$ is the set of query terms
Use an inverted index
Stores triples (term, docid, value)
Retrieves all triples corresponding to a given term
Classical search algorithm
Inverted index access for each term of the query
In RAM: one container is allocated per retrieved docid (too much!), used to aggregate the values of the different triples for that docid
Sort results and return the K docids with the highest TF-IDF
How to store the index sequentially? How to answer in pipeline?
How to store the inverted index sequentially (SWS)? [TECS10]
Index triples (term, value, docid) are stored in hash buckets
Buckets are chained in flash
[Figure: a hash table in RAM (entries H1, H2, H3) points to the most recent bucket of each chain; the hash buckets of index triples and the documents (e.g., doc2, doc4; docid=7, 9, 21, 23) are Sequentially Written Structures (SWS) in FLASH]
How to evaluate the query in pipeline? [TECS10]
The base for the pipeline evaluation strategy:
In each hash list, docids are sorted
[Figure: HASH table (in RAM) with entries such as H1 → 56, H2 → 40, H3 → 43, each pointing into a chain of hash buckets (SWS in FLASH); a bucket holds triples such as (t1,3,21), (t1,1,23) followed by the address of the next bucket (e.g., Addr 17); docids sorted (desc.) along the chain for hash value H3]
How to compute the query in pipeline:
1 page is allocated in RAM for each hash list containing a query term
In practice the number of terms ≤ 3 or 4…
The triples from the hash lists are “merged” on docid
⇒ Triples with an equal docid arrive in RAM at the same time…
… the TF-IDF of each docid can be computed (directly)
The K docids with the highest TF-IDF are kept in RAM
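The merge-on-docid strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [TECS10] as implemented: each term's list is modelled as a Python iterable of `(docid, value)` pairs sorted in ascending docid order (the slides show a descending order along the chain; any common order works), and the document frequencies `doc_freq` and the collection size `n_docs` are assumed to be available. The function name `top_k` is hypothetical.

```python
import heapq
import math
from itertools import groupby


def top_k(streams, doc_freq, n_docs, k):
    """Return the k best (score, docid) pairs, highest score first.

    streams:  {term: iterable of (docid, value) sorted by docid} (assumed layout)
    doc_freq: {term: number of documents containing the term} (assumed known)
    """
    def weighted(term, stream):
        idf = math.log(n_docs / doc_freq[term])
        for docid, value in stream:
            yield docid, value * idf

    # Merge all lists on docid: equal docids come out of the merge together,
    # so only one entry per list has to be held in RAM at a time.
    merged = heapq.merge(
        *(weighted(term, stream) for term, stream in streams.items()),
        key=lambda pair: pair[0],
    )

    best = []  # min-heap of at most k (score, docid) pairs
    for docid, group in groupby(merged, key=lambda pair: pair[0]):
        score = sum(weight for _, weight in group)
        if len(best) < k:
            heapq.heappush(best, (score, docid))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, docid))
    return sorted(best, reverse=True)
```

RAM usage is bounded by one cursor per query term plus the heap of K results, instead of one container per retrieved docid as in the classical algorithm.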
Second illustration: embedded relational database
SQL queries
Evaluate selections, projections, joins, (group by and aggregates)
RDBMS use indexes
Q1: How to store an index sequentially (in a SWS)?
Q2: How to make it scalable?
… and algorithms: join algorithms (HJ, SMJ, …) need lots of RAM
Join indices could be a solution…
… but consecutive joins incur random access or a RAM-hungry sort
Q3: How to compute select-project-join queries in pipeline?
[Figure: σ(CUSTOMER) ⋈ ORDER ⋈ LINEITEM evaluated with two join indices (JI); labels: sorted on CUS.id, sorted on ORD.id]
How to build a selection index sequentially (SWSs)? [IS12]
Example: index on column CITY of table CUSTOMER
SWS 1: «Keys» (vertical partition)
Filled sequentially at tuples’ insertion
SWS 2: «Bloom Filters»
Summary of SWS 1
1 BF summarizes 1 page of «Keys» (consumes ~2B per key)
Written sequentially
Retrieve CUSTOMER.CITY=‘Lyon’
Full scan of «Bloom Filters» (SWS 2)
For each BF: if ‘Lyon’ ∈ BF
Negative ⇒ ignore it
Positive ⇒ access 1 page of «Keys», search ‘Lyon’ & return tuple pointers
[Figure: CUSTOMER tuples (Joe, Jack, Paul, Jim, Tom, …) with the indexed column CITY; the «Keys» pages P2, P16, P68, P78 contain ‘Lyon’ (tuples t20, t30, t50, t70, t90) and are summarized by Sum2, Sum16, Sum68, Sum78 in «Bloom Filters». Table scan: 640 IOs; summary scan: 17 IOs]
Efficient: SWS 2 + 1 IO/result… but how to achieve scalability?
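The summary scan can be illustrated with a small Python sketch. It is not the PBFilter implementation of [IS12]: the `BloomFilter` class, its sizes (`n_bits`, `n_hashes`), the SHA-256 based hashing, and the list-of-lists model of the «Keys» pages are assumptions chosen to keep the example self-contained.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter summarizing one page of «Keys» (illustrative sizes)."""

    def __init__(self, n_bits=64, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive n_hashes bit positions from salted SHA-256 digests.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def build_summaries(key_pages):
    """SWS 2: one Bloom filter per page of SWS 1, built in page order."""
    summaries = []
    for page in key_pages:
        bf = BloomFilter()
        for key in page:
            bf.add(key)
        summaries.append(bf)
    return summaries


def lookup(value, key_pages, summaries):
    """Scan the summaries; read a «Keys» page only when its filter is positive."""
    pointers = []
    for page_no, bf in enumerate(summaries):
        if not bf.might_contain(value):
            continue                      # negative: the page is skipped
        for slot, key in enumerate(key_pages[page_no]):
            if key == value:              # positive: verify in the page itself
                pointers.append((page_no, slot))
    return pointers
```

A false positive only costs one useless page read, since every match is checked against the actual keys; a Bloom filter never yields a false negative, so no result is missed.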
Scalability ⇒ timely reorganize the index… to transform it into a more efficient index [DAPD13]
Reorganization process:
Only uses sequential structures (SWSs)
Background / interruptible
Ex: Summary scan index → B-Tree like index
1) Sort the (key, pointer) pairs
→ Temp. SWSs (sorted “runs”)
→ result is SWS: «Sorted Keys»
2) Build a key hierarchy
→ No need of temporary SWS
→ result is SWS: «Tree»
Result: efficient B-Tree like index
[Figure: the summary scan index («Keys» pages P2, P16, P68, P78 with «Bloom Filters» Sum2, Sum16, Sum68, Sum78) is sorted into temporary runs (Sorted run1, Sorted run2, …), giving the SWS «Sorted keys» (e.g., Lyon → t20, t30, t50, t70, t90) topped by the SWS «Tree» (keys K1, K2, …, Kn)]
… how to evaluate SQL queries in pipeline?
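The two reorganization steps (sort into runs, then build a key hierarchy) follow the pattern of an external merge sort. The sketch below is a simplified illustration under assumptions, not the procedure of [DAPD13]: the function names, the `ram_capacity` and `fanout` parameters, and the use of Python lists for the temporary SWSs are all hypothetical, and a real implementation may need several merge passes when the number of runs exceeds the available RAM.

```python
import heapq


def make_sorted_runs(pairs, ram_capacity):
    """Step 1a: cut the (key, pointer) stream into runs sorted within RAM."""
    runs, buffer = [], []
    for pair in pairs:
        buffer.append(pair)
        if len(buffer) == ram_capacity:
            runs.append(sorted(buffer))   # each run is written sequentially
            buffer = []
    if buffer:
        runs.append(sorted(buffer))
    return runs


def merge_runs(runs):
    """Step 1b: merge the runs into «Sorted Keys», holding one item per run."""
    return list(heapq.merge(*runs))


def build_tree_level(sorted_keys, fanout):
    """Step 2: one level of the key hierarchy, the first key of each group.

    The input is already sorted, so the level is produced in a single
    sequential pass and needs no temporary structure.
    """
    return [sorted_keys[i] for i in range(0, len(sorted_keys), fanout)]
```

Calling `build_tree_level` repeatedly on its own output until a single entry remains gives the upper levels of the B-Tree like index.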
How to evaluate SQL queries in pipeline? [SIG07, DAPD13]
TPCD-like schema; query root table: LIN
SELECT CUS.*, ORD.*, LIN.*, PS.*
FROM CUSTOMER CUS, ORDER ORD, LINEITEM LIN, PARTSUP PS, SUPPLIER SUP
WHERE CUS.CUSkey = ORD.CUSkey AND ORD.ORDkey = LIN.ORDkey AND
LIN.PSkey = PS.PSkey AND PS.SUPkey = SUP.SUPkey AND
CUS.Mktsegment = 'HOUSEHOLD' AND SUP.Name = 'SUPPLIER-1'
Tjoin indexes (generalized join index): each rowid of the root table contains the rowids of the tuples it refers to in the subtree
Tselect indexes: each key of the index contains the rowids of the root table referring to it
NB: Tselect returns sorted row ids!
Execution plan
Tselect access on CUS.Mktsegment (‘HOUSEHOLD’) → {LINid} ↓
Tselect access on SUP.Name (‘SUPPLIER-1’) → {LINid} ↓
Intersect merge → {LINid} ↓
Tjoin access on LIN → {LINid ↓, CUSid, ORDid, PSid}
Project
[Figure: schema tree rooted at LIN (LIN → ORD → CUS; LIN → PS → SUP, PART); Tjoin on LIN stores LINid, ORDid, CUSid, PSid, PARid, SUPid; Tselect on CUS.Mktsegment is a B-Tree like index (keys K1 … Kn) whose entry HOUSEHOLD lists sorted LIN row ids (e.g., t20, t30, t50)]
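Because each Tselect access returns sorted root-table row ids, the intersection and the Tjoin lookup can run in pipeline. The following Python sketch illustrates the idea only; it is not code from [SIG07] or [DAPD13]. The names `intersect_sorted` and `evaluate`, the ascending sort order, and the dictionary standing in for the Tjoin index are assumptions.

```python
def intersect_sorted(a, b):
    """Yield the row ids present in both ascending inputs.

    Only the current element of each input is held in memory.
    """
    it_a, it_b = iter(a), iter(b)
    x, y = next(it_a, None), next(it_b, None)
    while x is not None and y is not None:
        if x == y:
            yield x
            x, y = next(it_a, None), next(it_b, None)
        elif x < y:
            x = next(it_a, None)
        else:
            y = next(it_b, None)


def evaluate(tselect_lists, tjoin):
    """Intersect all Tselect results, then fetch the joined row ids via Tjoin.

    tselect_lists: ascending lists of root-table (LIN) row ids, one per selection
    tjoin: {LINid: (CUSid, ORDid, PSid)}, a stand-in for the Tjoin index
    """
    ids = iter(tselect_lists[0])
    for other in tselect_lists[1:]:
        ids = intersect_sorted(ids, other)
    for lin_id in ids:
        yield (lin_id, *tjoin[lin_id])
```

For example, with `household = [2, 5, 9, 21]` and `supplier1 = [5, 7, 21, 30]`, only LIN row ids 5 and 21 reach the Tjoin access, and each result tuple is produced without materializing any intermediate join.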
Conclusion
Encouraging results…
A good methodology
To tackle the conflicting small RAM – NAND Flash constraints
Efficient search engines
Efficient SQL queries
… a whole DBMS (indexes, tables, updates, buffers) can fit into SWS
… and many remaining challenges
Extend those principles to other data models
XML, time series, spatial-temporal data, noSQL & key-value stores, etc.
A general co-design approach is still missing
How to benefit from additional RAM?
How to adapt to dynamic variations of the HW parameters?
PART III: Secure Global Computations
Secure Global Computation and Anonymous Data Publishing
The example of secure computation of Privacy Preserving Data Publishing algorithms using tokens
PART III: OUTLINE
Problem Statement
Current Solutions to Secure Global Computation
Generic Approach
Toolkits for Secure Computation
Using Trusted Hardware to Achieve Generic Computation
Taking on Privacy Preserving Data Publishing
Perspectives
PROBLEM STATEMENT
Part II of the talk showed how to query local data on PDSs.
Part III of the talk is going to discuss how to compute aggregate data using many PDSs.
Secure Global Computation on Tokens: Problem Statement
OBJECTIVE:
Maintain the functionalities of traditional database servers managing private data (availability, durability, scalability of the system, etc.) while increasing privacy protection (by using secure tokens and distributing computation)
PROBLEM:
The use of secure portable tokens must not jeopardize traditional data intensive applications, in particular applications aggregating data, e.g., Privacy Preserving Data Publishing, SQL processing
THREAT MODEL:
Infrastructure (SSI) can be:
Honest but curious
Weakly malicious (covert adversary)
Token can be:
Unbreakable (honest)
A subset can be broken (weakly malicious)
The « classical » problem of Secure Global Computation is more general and makes no trust assumption
Is this a new problem?
Several approaches are possible to securely perform global computations:
1. Use only an untrusted server/cloud/P2P and use generic (and costly) algorithms (e.g., Secure Multi-Party Computing, fully homomorphic encryption) → Problem = COST
2. Use only an untrusted server/cloud/P2P and develop a specific algorithm for each specific class of queries or applications (e.g., DataMining Toolkit), using low cost primitives → Problem = GENERICITY
3. Introduce a tangible element of trust, through the use of a trusted component, and develop a generic methodology to execute any centralized algorithm in this context (e.g., PDSs) → Problem = TRUST
CURRENT SOLUTIONS TO SECURE GLOBAL QUERYING
APPROACH I: GENERIC AND SECURE GLOBAL COMPUTATION
Generic Secure Multi-Party Computing (SMC)
Truly generic SMC is exponential in the number of inputs and therefore does not scale. See [Yao82, Yao86].
Other solutions such as [GMW87] do not provide specific generics to compute a solution (i.e. they need a zero-knowledge proof to work).
• Cost is impractical: the resolution of the millionaire problem proposed in ’82 is proportional to the size of the values compared.
• Generalization to m different parties requires taking into account cheating (extra cost).
• [CKL06] have shown that in fact if there is not an honest majority, then only trivial functions can be computed.
There are (more or less) complicated cryptographic protocols.
Protocols are generic in the sense that they compute values of mathematical functions.
Protocols are far too costly.
Algebraic approach : Homomorphic Encryption
Homomorphic Encryption is a characteristic of several crypto-systems such as RSA, Paillier, ElGamal, etc.
Example : Consider RSA. Given the RSA public key (e, m), the encryption of a message p is given by :
E(p) = p^e mod m
The homomorphic property is :
E(p1) x E(p2) = p1^e x p2^e mod m = (p1 x p2)^e mod m = E(p1 x p2)
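The multiplicative property of RSA can be checked numerically. The sketch below assumes textbook (unpadded) RSA with toy parameters; the primes, the exponent and the function names are illustrative choices, not values from the talk, and such small keys offer no security.

```python
# Textbook (unpadded) RSA with toy parameters, for illustration only.
P, Q = 61, 53                      # hypothetical small primes
M = P * Q                          # modulus m = 3233
E_PUB = 17                         # public exponent e, coprime with (P-1)*(Q-1)
D_PRIV = pow(E_PUB, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def encrypt(p: int) -> int:
    """E(p) = p^e mod m."""
    return pow(p, E_PUB, M)


def decrypt(c: int) -> int:
    """Inverse of encrypt, using the private exponent."""
    return pow(c, D_PRIV, M)


p1, p2 = 12, 7
# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(p1) * encrypt(p2)) % M

# The product of ciphertexts is the ciphertext of the product.
assert c_product == encrypt((p1 * p2) % M)
assert decrypt(c_product) == p1 * p2
```

Only multiplication is preserved here: adding two RSA ciphertexts does not yield the encryption of the sum, which is why RSA is homomorphic but not fully homomorphic.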
Fully Homomorphic Encryption
Fully Homomorphic Encryption means that all ring operators are homomorphic (this means + and x).
Example : we say that E is a fully homomorphic encryption from ({0,1}, +, x) to (D, ⊕, ⊗) if for all c1, c2 in D, such that c1 = E(p1) and c2 = E(p2) :
E^-1(c1 ⊕ c2) = p1 + p2
E^-1(c1 ⊗ c2) = p1 x p2
Or more generally E^-1(fD(c1,…,cn)) = f{0,1}(p1,…,pn)
Why is this a solution ?
• Any program with bounded input can be transformed into a Boolean circuit
• Any circuit can be transformed into a polynomial modulo 2
• Secure computation of a polynomial equates to securely computing any program
• To securely compute a polynomial, it is necessary and sufficient to securely compute + and x operations.
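The step from Boolean circuits to polynomials modulo 2 can be made concrete on plaintext bits. The sketch below involves no encryption at all: it only shows that the usual gates can be written with + and x over {0,1}. The gate and function names are illustrative assumptions.

```python
from itertools import product


def add(x: int, y: int) -> int:
    """Addition in ({0,1}, +, x): same truth table as XOR."""
    return (x + y) % 2


def mul(x: int, y: int) -> int:
    """Multiplication in ({0,1}, +, x): same truth table as AND."""
    return (x * y) % 2


def not_gate(x: int) -> int:
    """NOT x = x + 1 (mod 2)."""
    return add(x, 1)


def or_gate(x: int, y: int) -> int:
    """x OR y = x + y + x*y (mod 2)."""
    return add(add(x, y), mul(x, y))


def majority(a: int, b: int, c: int) -> int:
    """Majority of three bits = a*b + b*c + a*c (mod 2)."""
    return add(add(mul(a, b), mul(b, c)), mul(a, c))


# Exhaustive check against the ordinary Boolean definitions.
for a, b, c in product((0, 1), repeat=3):
    assert not_gate(a) == 1 - a
    assert or_gate(a, b) == (a | b)
    assert majority(a, b, c) == int(a + b + c >= 2)
```

If add and mul were replaced by ⊕ and ⊗ on ciphertexts, the same gate definitions would evaluate the circuit on encrypted bits, which is the argument made in the bullets above.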
Fully Homomorphic Encryption
For a long time, it was unclear whether fully homomorphic encryption was possible.
A first result was proposed using ideal lattice cryptography in [Gent09], and has been a hot topic since.
Keys to cipher only a couple of bits are gigabytes in size …
→ The cost to have good security is (incredibly) high.
APPROACH 2: TOOLKITS FOR SECURE COMPUTATIONS
Data Mining Toolkit
Toolkit for Data Mining : [CKV+02]
Primitives :
– Secure Sum,
– Secure Set Union,
– Secure Size of Set Intersection,
– Scalar Product.
Can compute : Association Rules, Clusters. (Also : efficiency drops when some participants are dishonest).
Not usable for other applications (such as PPDP or SQL)
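Figure: Secure Sum Primitive (sites holding the values 5, 7, 9 and 2; random mask R = 32; arithmetic modulo 40; running totals 37, 4, 13, 15; result 15 - 32 [40] = 23)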
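A minimal sketch of the ring-based secure sum, replaying the numbers that appear on the slide (R = 32, arithmetic modulo 40, final step 15 - 32 [40] = 23). Reading the site values as 5, 7, 9 and 2 is an assumption about the figure, and the function name and signature are illustrative. The protocol assumes honest-but-curious sites that do not collude, and a modulus larger than the true sum.

```python
import secrets


def secure_sum(values, modulus, mask=None):
    """Ring-based secure sum.

    The initiator draws a random mask R, adds its own value and passes the
    running total to the next site; each site adds its value modulo
    `modulus`. When the total comes back, the initiator subtracts R.
    Returns (sum modulo modulus, list of totals seen on the ring).
    """
    if mask is None:
        mask = secrets.randbelow(modulus)   # R, known to the initiator only
    running = mask
    totals_on_ring = []
    for value in values:
        running = (running + value) % modulus
        totals_on_ring.append(running)      # what the next site observes
    return (running - mask) % modulus, totals_on_ring


result, seen = secure_sum([5, 7, 9, 2], modulus=40, mask=32)
assert seen == [37, 4, 13, 15]
assert result == 23                         # 15 - 32 [40] = 23 = 5 + 7 + 9 + 2
```

Each site only sees a masked running total, so it learns nothing about the previous values as long as R stays secret and its two neighbours do not collude.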
Queries on encrypted databases
• Similar questions appear in outsourced databases (DaaS), which have been a hot research topic for over 10 years. WHAT ARE THE PROBLEMS ?
– Same attack model : SSI (DaaS provider) can be untrusted.
– Attacks considered are inference attacks based on the frequency of ciphertexts.
e.g. Select department, count(distinct salary) from emp group by department
might leak some information on individual salaries, given background knowledge of their distribution in the company
• Some Solutions :
– [HILM02, HIM04] propose techniques to execute various SQL and SQL aggregation queries over encrypted data.
– [HMT04] propose a specific index for range queries
– Many works (such as [ABG+04, AGB+05, SAP03]) use a trusted third party to compute queries, which makes things simple
– [AEW12] give a good overview of the problem of securing DaaS in the cloud
In the approach envisioned here, by contrast, no data is outsourced, but some data will be exported to SSI.
Some techniques proposed in these articles could be useful in the PDS context !
APPROACH 3 : USING TRUSTED HARDWARE TO ACHIEVE GENERIC COMPUTATION
A new trend : SMC Using Tokens
• Using tokens to improve the speed of computations : [JKSS10]
• New foundations of SMC [Katz07, GIS+10]
• Limited to Secure Intersect (Oblivious Search): [HL08, FPS+11]
→ The primitives used are not « data intensive » primitives. Complex processing using tokens is a new topic !
→ These processes involve initializing and sending one or more smart cards, that may or may not be trusted. (PDSs would be an alternative).
→ Smart cards cannot compute everything themselves (this is not introducing a trusted third party)
The general idea when using Secure Hardware :
Use cheap secure hardware to obtain substantial complexity class gains with SMC algorithms.
What of Complicated Data Intensive Computations … ?
One of the classical multi-user data intensive applications that require private computations is Privacy Preserving Data Publishing (an example of aggregate queries).
We will give some insight on the global framework proposed in [ANP13].
Adapting this type of approach to SQL is ongoing work.
EXAMPLE
Taking on Privacy Preserving Data Publishing…
(or more generally aggregation operations)
…using Secure Portable Tokens
Privacy Preserving Data Publishing (PPDP)
Figure: Individuals → Raw data → Publisher (trusted) → Anonymized (or sanitized …) data → Recipients (no assumption of trust)
Is the process known in advance ?
What is anonymized data ? [Sweeney02]
Quasi-identifiers ! (QID)
It is feasible to de-anonymize some parts of an anonymized dataset based on quasi-identifiers, i.e., sets of attributes that take unique values over a given dataset.
These quasi-identifiers can then be used to cross information with other databases or simply to deduce (private/sensitive) information from the data published.
Concepts used
Traditional PPDP considers 2 classes of attributes :
– Those part of the QID
– The others, assumed sensitive
PPDP Techniques
Range from anything trivial and simple (pseudonymisation) to complex and provable (differential privacy [Dwo06])
And other ad hoc techniques …
– k-anonymity
– l-diversity
– m-invariance (figure: releases at Time t1 and Time t2; there are two fake tuples)
ADVANTAGE : Global Queries are direct
DISADVANTAGE : Differential Privacy does not support all types of queries
/!\ Computing a k-anonymous release means computing an AGGREGATION (as in SQL Group By)
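The link between k-anonymity and aggregation can be sketched as a group-by on the quasi-identifier followed by a count. The attribute names and the toy release below are hypothetical; the check itself follows the usual definition (every QID combination appears at least k times).

```python
from collections import Counter


def group_sizes(records, qid):
    """GROUP BY the quasi-identifier attributes and COUNT(*) each group."""
    return Counter(tuple(record[attr] for attr in qid) for record in records)


def is_k_anonymous(records, qid, k):
    """True if every QID combination is shared by at least k records."""
    return all(size >= k for size in group_sizes(records, qid).values())


# Hypothetical generalized release: QID = (age, zip), sensitive = diagnosis.
release = [
    {"age": "20-29", "zip": "780**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "780**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "781**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "781**", "diagnosis": "diabetes"},
    {"age": "30-39", "zip": "781**", "diagnosis": "flu"},
]

assert is_k_anonymous(release, ("age", "zip"), k=2)
assert not is_k_anonymous(release, ("age", "zip"), k=3)  # first group has 2 rows
```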
Overview : Generic Protocol
Computing a query on such an architecture follows 3 steps
1. The querier broadcasts a (credentials, query) couple
2. Each PDS decides locally whether to participate or not in the query depending on privacy models and rules and local opt-in/opt-out choices. → Collection Phase
3. A distributed protocol is established between participating PDSs and SSI such that the final result can be delivered to the querier.
a) Construction Phase (secure computation of the query)
b) Sanitization Phase (sending the results to the querier)
/!\ Depending on the complexity of the query, the SSI may only store intermediate results or may play a more active role in the computation
Mondrian Algorithm [LeFevre06]
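The slide only shows a figure. As a reminder of the idea, here is a simplified, centralized sketch of Mondrian-style partitioning: recursively cut the data at the median of a quasi-identifier attribute as long as both halves keep at least k records, then generalize each partition to its value ranges. It assumes numeric attributes and uses raw (not normalized) ranges to pick the attribute; function names and data are illustrative, and this is not the token-based version discussed in the talk.

```python
def mondrian(records, qid, k):
    """Split records into partitions of at least k records each."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def span(part, attr):
        values = [r[attr] for r in part]
        return max(values) - min(values)

    def split(part):
        # Try attributes from the widest range to the narrowest.
        for attr in sorted(qid, key=lambda a: span(part, a), reverse=True):
            values = sorted(r[attr] for r in part)
            median = values[len(values) // 2]
            left = [r for r in part if r[attr] < median]
            right = [r for r in part if r[attr] >= median]
            if len(left) >= k and len(right) >= k:   # allowable cut
                return split(left) + split(right)
        return [part]                                # no allowable cut left

    return split(list(records))


def generalize(part, qid):
    """Replace each QID value by the (min, max) range of its partition."""
    ranges = {a: (min(r[a] for r in part), max(r[a] for r in part)) for a in qid}
    return [{**r, **ranges} for r in part]


ages = [25, 27, 31, 35, 42, 47, 53, 58]
zips = [78000, 78010, 78000, 78020, 78100, 78110, 78100, 78120]
data = [{"age": a, "zip": z} for a, z in zip(ages, zips)]

partitions = mondrian(data, ("age", "zip"), k=2)
assert all(len(p) >= 2 for p in partitions)
assert sum(len(p) for p in partitions) == len(data)
```

Each generalized partition forms one equivalence class of the k-anonymous release, which is why the construction boils down to an aggregation over the collected tuples.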
Parallelizing (and securing) the Approach
Collection phase is naturally parallel !
Construction phase is algorithm dependent. To remain as generic as possible, this task is delegated to the central server, while disclosing an amount of information compatible with the privacy requirements. → BREAK UP THE TUPLES !
– Encrypted Data (e)
– Construction Information (c) / Sanitization information (s)
– Safety Information (ζ)
Sanitization is parallelized on the tokens by sending them batches of information.
And what of malicious adversaries … ?
• First solution : clustering to reduce impact of cracked PDS
• Attacks launched in the case of malicious adversaries can be the creation, deletion and copy of tuples. (Active attacks)
• Several generic safety properties can be defined, and a supporting meta-protocol in order to support current PPDP models against such adversaries.
• In the case of covert adversaries, detection is probabilistic.
→ These counter measures are generic
The MetaP Meta-Algorithm
Future work
The immediate idea is to work on generic evaluation of SQL Queries !
– Can be simple in the case of SFW queries without joins
– Is harder with joins or with group by
– Techniques used in the PPDP context will probably be useful
Other types of queries (No-SQL) could also be supported
– The difficult part will often be the aggregate part.
PERSPECTIVES
Instances of alternative global architectures relying on secure hardware
Personal Social-Medical Folder (Field experiment)
– A personal folder available at home to ease care coordination
– Each patient owns her medical-social folder in a secure token
– The folder is archived (encrypted) and shared using central services
– Local and central copies are synchronized without Internet connection
Human Powered Information Systems
– Secure and low cost PDS for personal data services in Least Developed Countries
– A delay tolerant network (no infrastructure) is established
Trusted Cells
– Based on secure personal tokens to regulate personal data at home
– The token is connected and regulates data sharing in the cloud
Personal social-medical folder: architecture elements
Figure (architecture diagram), labels: Patient's personal server; FLASH; Secure chip; JDBC API; Health records; DBMS; UI web app; Synchro. web app; Central server (data durability, availability, querying); Practitioner's smart badge; File System; Sync. files; FLASH; Secure chip
Availability at patient’s home
EHR on a personal server
Access from a browser by patient's visitors (doctors & social workers, family…)
Figure labels: Personal Server (patient); Smart Badge; Disconnected access to Personal Servers
Care coordination between practitioners
EHRs on a central server
Web access & exchange
Sync. via Smart Badges
No data re-entered
No network link required
EHR on a personal server
Access from a browser by patient's visitors (doctors & social workers, family…)
Sync. with central server via Smart Badges
Figure labels: Personal Server; External IS; Smart Badge (practitioner)
Human Powered Information Systems (HPIS)
Trusted Cells Vision Architecture (credit: Gi-De)
ARM TrustZone
THANK YOU
REFERENCES
PART I: Distributed architecture (1/3)
The World Economic Forum. Rethinking Personal Data: Strengthening Trust. May 2012
A. Pentland et al. Personal Data: The Emergence of a New Asset Class. World Economic Forum. January 2011
H. Nissenbaum. Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford Law Books, 2010
J. Catlett. Panel on infomediaries and negotiated privacy techniques. In Proceedings of the tenth conference on Computers, freedom and privacy: challenging the assumptions, CFP ’00, pages 155–156, New York, NY, USA, 2000
Mass-Educational Databases = Wrong Architecture, www.identitywoman.net/mass-educational-databases-wrong-architecture
VRM project, http://blogs.law.harvard.edu/vrm/projects/
A. Mitchell, I. Henderson, and D. Searls. Reinventing direct marketing — with vrm inside. Journal of Direct Data and Digital Marketing Practice, 10(1):3–15, 2008
FreedomBox: http://freedomboxfoundation.org/
Wikipedia. Freedombox, Vendor Relationship Management, Distributed Social Networks
PART I: Distributed architecture (2/3)
L. Cutillo, R. Molva, and T. Strufe. Safebook: A privacy-preserving online social network leveraging on real-life trust. IEEE Communications Magazine, 47(12):94–101, 2009
L. M. Aiello and G. Ruffo. Lotusnet: tunable privacy for distributed online social network services. Computer Communications, In Press, 2010
I. Clarke, S. G. Miller, T. W. Hong, O. Sandberg, and B. Wiley. Protecting free with freenet. Internet Computing IEEE, 6(February):40–49, 2002
Diaspora*, https://joindiaspora.com/
R. Baden, A. Bender, N. Spring, B. Bhattacharjee, and D. Starin. Persona: An online social network with user-defined privacy. Computer, 39(4):135–146, 2009
S. Buchegger, D. Schioberg, L. H. Vu, and A. Datta. PeerSoN: P2P Social Networking - Early Experiences and Insights. In Proceedings of the Second ACM Workshop on Social Network Systems 2009, co-located with Eurosys 2009, Nurnberg, Germany, March 31 2009
A. Narayanan, V. Toubiana, S. Barocas, H. Nissenbaum, D. Boneh: A Critical Look at Decentralized Personal Data Architectures. CoRR abs/1202.4503 (2012)
M. Mun, S. Hao, N. Mishra, K. Shilton, J. Burke, D. Estrin, M. Hansen, and R. Govindan. Personal Data Vaults: a locus of control for personal data streams. 2010
PART I: Distributed architecture (3/3)
Mydex, http://mydex.org/
Mydex. The case for personal information empowerment: The rise of the personal data store, 2010
The Locker Project, http://lockerproject.org/
Qiy Foundation, www.qiyfoundation.org/
Personal, www.personal.com
KuppingerCole, http://www.kuppingercole.com/report/advisorylifemanagementplatforms7060813412
T. Allard et al.: Secure Personal Data Servers: a Vision Paper. PVLDB 3(1): 25-35 (2010)
Giesecke & Devrient, “Portable Security Token”, http://www.gd-sfs.com/portable-security-token
Eurosmart. Smart USB token. White paper, Eurosmart, 2008, (10p)
ARM-TrustZone, http://www.arm.com/products/processors/technologies/trustzone.php
N. Anciaux, P. Bonnet, L. Bouganim, B. Nguyen, I. Sandu Popa, P. Pucheral. Trusted Cells: A Sea Change for Personal Data Services, in 6th Biennial Conference on Innovative Database Research (CIDR), Asilomar, USA, 2013
PART II: Resource constrained data management (1/4)
Smart card security
[SC02] Witteman, M. (2002). Advances in smartcard security. Information Security Bulletin, 7(2002), 11-22.
Flash aware indexes
[TECS07] Wu, C. H., Kuo, T. W., & Chang, L. P. (2007). An efficient B-tree layer implementation for flash-memory storage systems. ACM Transactions on Embedded Computing Systems (TECS), 6(3), 19.
[VLDB09] Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., & Singh, S. (2009). Lazy-adaptive tree: An optimized index structure for flash devices. Proceedings of the VLDB Endowment, 2(1), 361-372.
[VLDB10] Li, Y., He, B., Yang, R. J., Luo, Q., & Yi, K. (2010). Tree indexing on solid state drives. Proceedings of the VLDB Endowment, 3(1-2), 1195-1206.
PART II: Resource constrained data management (2/4)
Flash aware key-value stores
[SIG11] Debnath, B., Sengupta, S., & Li, J. (2011, June). SkimpyStash: RAM space skimpy key-value store on flash-based storage. In Proceedings of the 2011 international conference on Management of data (pp. 25-36). ACM.
[VLDB12] Vo, H. T., Wang, S., Agrawal, D., Chen, G., & Ooi, B. C. (2012). LogBase: a scalable log-structured database system in the cloud. Proceedings of the VLDB Endowment, 5(10), 1004-1015.
[SOSP11] Lim, H., Fan, B., Andersen, D. G., & Kaminsky, M. (2011, October). SILT: A memory-efficient, high-performance key-value store. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (pp. 1-13). ACM.
PART II: Resource constrained data management (3/4)
DBMS on-chip
[VLDBJ01] Pucheral, P., Bouganim, L., Valduriez, P., & Bobineau, C. (2001). PicoDBMS: Scaling down database techniques for the smartcard. The VLDB Journal, 10(2-3), 120-132.
[TOIS03] Bolchini, C., Salice, F., Schreiber, F. A., & Tanca, L. (2003). Logical and physical design issues for smart card databases. ACM Transactions on Information Systems (TOIS), 21(3), 254-285.
[SIG07] Anciaux, N., Benzine, M., Bouganim, L., Pucheral, P., & Shasha, D. (2007, June). GhostDB: querying visible and hidden data without leaks. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (pp. 677-688). ACM.
[IS12] Yin, S., & Pucheral, P. (2012). PBFilter: A flash-based indexing scheme for embedded systems. Information Systems.
PART II: Resource constrained data management (4/4)
DBMS on-chip (cont.)
[DAPD13] Anciaux, N., Bouganim, L., Pucheral, P., Guo, Y., Le Folgoc, L., & Yin, S. (2013). MILo-DB: a personal, secure and portable database machine. Distributed and Parallel Databases, 1-27.
Search engines on-chip
[TSN08] Yap, K. K., Srinivasan, V., & Motani, M. (2008). Max: Wide area human-centric search of the physical world. ACM Transactions on Sensor Networks (TOSN), 4(4), 26.
[TPDS10] Wang, H., Tan, C. C., & Li, Q. (2010). Snoogle: A search engine for pervasive environments. Parallel and Distributed Systems, IEEE Transactions on, 21(8), 1188-1202.
[TECS10] Tan, C. C., Sheng, B., Wang, H., & Li, Q. (2010). Microsearch: A search engine for embedded devices used in pervasive computing. ACM Transactions on Embedded Computing Systems (TECS), 9(4).
PART III: references (incomplete)
[ANP13] Allard, T., Nguyen, B., Pucheral, P.: MetaP: Revisiting Privacy-Preserving Data Publishing using Secure Devices, in DAPD, 55p, to appear.
[CKV+02] Clifton, C., Kantarcioglu, M., Vaidya, J., Lin, X., Zhu, M.Y.: Tools for privacy preserving distributed data mining. SIGKDD Explor. Newsl., vol. 4, pages 28-34, ACM, New York, NY, USA, (2002)
[FPS+11] Fischlin, M., Pinkas, B., Sadeghi, A-R., Schneider, T., Visconti, I.: Secure set intersection with untrusted hardware tokens. In CT-RSA, (2011).
[Gent09] Gentry, C.: Fully Homomorphic Encryption Using Ideal Lattices. In STOC, (2009)
[GIS+10] Goyal, V., Ishai, Y., Sahai, A., Venkatesan, R., Wadia, A.: Founding Cryptography on Tamper-Proof Hardware Tokens. Theory of Cryptography, pp 308-326, (2010)
[GMW87] Goldreich, O., Micali, S., Wigderson, A.: How to play ANY mental game. In ACM STOC, pp 218-229, New York, NY, USA, (1987)
[HILM02] Hacigumus, H., Iyer, B., Li, C., Mehrotra, S.: Executing SQL over encrypted data in database service provider model. ACM SIGMOD, pp. 216-227. Wisconsin (2002)
[HIM04] Hacigumus, H., Iyer, B. R., Mehrotra, S.: Efficient execution of aggregation queries over encrypted relational databases. DASFAA, pp. 125-136. Korea (2004)
[HL08] Hazay, C., Lindell, Y.: Constructions of truly practical secure protocols using standard smartcards. In ACM CCS, New York, NY, USA (2008)
PART III: references
[JKSS10] Jarvinen, K., Kolesnikov, V., Sadeghi, A-R., Schneider, T.: Embedded SFE: Offloading Server and Network Using Hardware Tokens. In Financial Cryptography and Data Security (2010)
[Katz07] Katz, J.: Universally Composable Multi-party Computation Using Tamper-Proof Hardware. In Advances in Cryptology, EUROCRYPT '07, pp 115-128, (2007)
[Yao82] Yao, A.C.: Protocols for secure computations. In Annual Symposium on Foundations of Computer Science, FOCS, pp 160-164, Washington, DC, USA, (1982)
[Yao86] Yao, A.C.: How to generate and exchange secrets. In Annual Symposium on Foundations of Computer Science, FOCS, pp 162-167, Washington, DC, USA, (1986)