

Remembrance: The Unbearable Sentience of Being Digital

Ragib Hasan*, Radu Sion, and Marianne Winslett

University of Illinois at Urbana-Champaign

Stony Brook University

4th Biennial Conference on Innovative Data Systems Research (CIDR)

January 4-7, 2009


What is the difference between …

Data, the android from Star Trek

and …

Data, stored in databases or file cabinets

• Data, the robot, can remember and has sentience
• A file, or a database tuple, is a dumb container of values

4th Conference on Innovative Data Systems Research (CIDR) 2009


Our data objects suffer from amnesia

Since the early days, our data processing model has assumed data containers (tuples, files, variables) to be oblivious of their past.

We assume a data object knows only its present value and cannot recall its old values, states, or context information.


Our current data objects are not sentient

• Database tuples returned in query results only show their latest values
• Data processing and evaluation are based only on the current state of variables
• Data objects do not tell us their historical states, or how they were created, processed, or transmitted
• We cannot pick a system, turn a knob, and time-travel to 5 minutes in the past


Exploring remembrance in various forms

Databases
• Time-travel / transaction-time databases
• Temporal SQL
• Checkpointing

Scientific computing
• Provenance
• Lineage

File systems
• Versioning file systems
• CVS and other SCMs
• WORM storage

Web
• Wayback archive
• Gmail

Systems & languages
• Reflective systems, self-managed systems
• Time-traveling virtual machines


However …

• Most of these systems are, in essence, versioning systems
  – Memory / history is not an intrinsic property of data
  – The association between a data value and its history is kept externally
• These solutions are also isolated, piecemeal, and glued together by our original single-valued, oblivious data paradigm


Remembrant Computing

• We propose a new data paradigm, where
  – Data objects retain their memories as an intrinsic property
  – History, context, and temporal events can be recalled
  – Past (memory) and present (value) are considered an atomic unit of data

Examples:
• Files recall their past contents
• Variables remember their past values and context (x = 5, x = 10)
• Hard disk blocks recall past content
• Queries return tuple objects which remember their past context, values, and states
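
To make the idea of "past and present as one atomic unit" concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions, not the authors' design: the names `Memory`, `Remembrant`, `write`, and `recall`, and the use of a simple integer tick as the clock, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class Memory:
    """One remembered state: the value plus the context it was written in."""
    tick: int      # logical time of the write (hypothetical clock)
    value: Any
    context: str


@dataclass
class Remembrant:
    """A variable whose history is part of the same object as its value."""
    memories: List[Memory] = field(default_factory=list)

    def write(self, value: Any, context: str = "") -> None:
        # A write appends a memory instead of overwriting the old state.
        self.memories.append(Memory(len(self.memories), value, context))

    @property
    def value(self) -> Any:
        # The present value is simply the most recent memory.
        return self.memories[-1].value if self.memories else None

    def recall(self, tick: int) -> Any:
        # Time-travel: the value as of logical time `tick`.
        past = [m for m in self.memories if m.tick <= tick]
        return past[-1].value if past else None


x = Remembrant()
x.write(5, context="initial assignment")
x.write(10, context="updated by caller")
print(x.value)      # 10
print(x.recall(0))  # 5
```

Because the history lives inside the object, passing `x` to another function also passes its memories; no external version store has to be consulted.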


Remembrant Computing

• When data objects are transferred, they retain their old memories
• Copies retain the memory of the original, along with the copying context
• Deletions remove the value from the container, but the memories may live on
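
The copy and deletion rules above can be sketched as follows. This is a hedged illustration only: the class `RemembrantObject`, its methods `copy_to` and `delete`, and the representation of memories as `(event, value)` pairs are hypothetical choices made for this example.

```python
import copy
from typing import Any, List, Tuple


class RemembrantObject:
    """Hypothetical container: a present value plus a log of (event, value) memories."""

    def __init__(self, value: Any, origin: str) -> None:
        self.value = value
        self.deleted = False
        self.memories: List[Tuple[str, Any]] = [(f"created at {origin}", value)]

    def copy_to(self, destination: str) -> "RemembrantObject":
        # The copy carries the original's memories and adds the copying context.
        clone = copy.deepcopy(self)
        clone.memories.append((f"copied to {destination}", self.value))
        return clone

    def delete(self) -> None:
        # Deletion empties the container, but the memories are kept.
        self.memories.append(("deleted", self.value))
        self.value = None
        self.deleted = True


original = RemembrantObject("draft v1", origin="laptop")
backup = original.copy_to("server")
original.delete()

print(original.value)      # None
print(original.memories)   # creation and deletion events survive
print(backup.memories)     # creation event plus the copying context
```

Whether memories should really outlive a deletion is exactly the kind of policy question the later slides on privacy and legal retention raise; this sketch simply keeps them.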


But, what’s the point of remembering?

• “Time-aware knowledge”
• Associative memories
• More expressivity in data processing
  – Compute based on not only the present value, but also historic information, derivation, lineage, and provenance
  – Mine useful patterns
• Taint analysis / information flow checking
• Recover from transient errors at arbitrary granularities
• Time-travel seamlessly to any point in an application, system, or website
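
As one example of the uses listed above, taint analysis becomes a simple lookup once values carry their lineage. The sketch below is an assumption-laden illustration, not a method from the talk: `Traced`, `is_tainted`, and the source labels are hypothetical, and lineage is modeled as a plain set of source names.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class Traced:
    """A number that remembers which sources it was derived from (its lineage)."""
    value: float
    lineage: FrozenSet[str] = field(default_factory=frozenset)

    def __add__(self, other: "Traced") -> "Traced":
        # A derived value inherits the lineage of every input.
        return Traced(self.value + other.value, self.lineage | other.lineage)


def is_tainted(item: Traced, untrusted: FrozenSet[str]) -> bool:
    # Taint check: does any untrusted source appear in the lineage?
    return bool(item.lineage & untrusted)


sensor = Traced(2.5, frozenset({"sensor_a"}))
web_form = Traced(1.0, frozenset({"web_form"}))
constant = Traced(4.0, frozenset({"config"}))

UNTRUSTED = frozenset({"web_form"})
total = sensor + web_form
safe = sensor + constant

print(is_tainted(total, UNTRUSTED))  # True
print(is_tainted(safe, UNTRUSTED))   # False
```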


Is this possible, viable, desirable?

• Physical limitations
  – Only a limited amount of “memory” is possible in primary and secondary data storage
  – Not all memories can be retained forever
• Problem of recursion
  – Will the system that stores memories also have its own memory objects?
• Performance
  – Handling a large amount of history for every data object can cause a performance bottleneck


Is this possible, viable, desirable?

• Security / privacy
  – How do we control access to old memories?
  – Would remembering states/values violate privacy?
• Legal issues
  – Various regulations limit how long data can be retained
  – Privacy laws limit the contextual information that can be recorded


Is this possible, viable, desirable?

• Scalability: How do we recall, and when do we forget?
  – Remembering everything can be undesirable
    • Some humans suffer from hyperthymesia, or total recall
    • Too many unimportant details can overwhelm functionality
  – How to decide what “memories” to forget is an issue
  – Management, searching, and indexing all need to scale with a large number of memories
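
One simple way to think about "when do we forget?" is a bounded history with an age limit. The sketch below is purely illustrative: the talk does not propose a specific forgetting policy, and `ForgetfulHistory`, `max_items`, and `max_age` are hypothetical names. It assumes memories arrive in non-decreasing tick order.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class ForgetfulHistory:
    """Hypothetical history store with two forgetting rules:
    keep at most `max_items` memories, and drop any older than `max_age` ticks."""

    def __init__(self, max_items: int, max_age: int) -> None:
        self.max_age = max_age
        # deque(maxlen=...) discards the oldest entry when a new one arrives at capacity.
        self.entries: Deque[Tuple[int, Any]] = deque(maxlen=max_items)

    def remember(self, tick: int, value: Any) -> None:
        self.entries.append((tick, value))
        self.forget_older_than(tick - self.max_age)

    def forget_older_than(self, cutoff: int) -> None:
        # Entries are stored in tick order, so expired ones sit at the left end.
        while self.entries and self.entries[0][0] < cutoff:
            self.entries.popleft()

    def recall_all(self) -> List[Any]:
        return [value for _, value in self.entries]


history = ForgetfulHistory(max_items=3, max_age=10)
for tick, value in [(0, "a"), (1, "b"), (2, "c"), (3, "d")]:
    history.remember(tick, value)
print(history.recall_all())  # ['b', 'c', 'd']  (capacity rule dropped 'a')

history.remember(20, "e")
print(history.recall_all())  # ['e']  (age rule dropped everything before tick 10)
```

A real policy would likely weigh importance, legal retention limits, and access patterns rather than age and count alone.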


Where we are today …

• MyLifeBits:
  – Recording all of Gordon Bell’s personal interactions requires 18 GB/year, or 1.1 TB over a lifetime
• Provenance:
  – 16% overhead in recording all information flows for files (PASS [Seltzer et al., USENIX Technical '06])
  – 3%-15% overhead in secure, tamper-evident provenance for files [Hasan et al., USENIX FAST '09]


Epilogue

• The ability to recall past memories and contextual information differentiates sentient beings from simpler organisms
• Augmenting data objects with memory as an intrinsic property will introduce sentience for digital objects
