DOCUMENTS WITH DATABASES INSIDE THEM
David Karger
MIT
THE WEB PAGE AS A WYSIWYG END-USER CUSTOMIZABLE DATABASE-BACKED INFORMATION MANAGEMENT APPLICATION (UIST 2009)
David Karger
MIT
SMALL DATA
David Karger
MIT
Databases for Plain Folks
Database community:
  Defined key primitives for data management
  Knows building apps over databases offers simplicity, power, and flexibility
Plain Folks:
  Have data to manage
  Think databases are black magic
  Manage their data by editing documents
Hide databases in plain sight inside documents plain folks can edit
Conclusion
People should be able to create or customize applications (data, visualization, interaction) for their own information management tasks
The web has evolved a standard metaphor of AJAX-y “active documents” as interfaces to (web) databases
People know how to edit documents
  So we can turn them into database engineers by helping them edit web documents like the ones they already use
Conclusion
DIDO is a Data Interactive DOcument
A standalone HTML document that contains
  Some structured data in a database
  An AJAX-y WYSIWYG interface to view/edit the data
  A WYSIWYG “metaeditor” to edit the interface
Persistence simply by saving the document
A broad class of Create/Read/Update/Delete content management applications can be authored (not programmed) using DIDO
Thank You
http://bit.ly/didodido
Google “dido exhibit”
Customizable Applications
Applications bring together the data, specialized views, and interactions necessary to perform tasks
User wants to “stretch” the app to their task
  hide irrelevant data
  incorporate new kinds of data
  change how data is presented or manipulated
Can’t, because apps are rigid
  Developer hard-wires “right” data model
  And “right” visualizations and interactions with data
Migration to the Web
Many applications are now web sites
  Amazon, YouTube, CNET, Epicurious, LinkedIn, Flickr
  Still just as rigid/uncustomizable as applications
Common architecture
  Database backed
  Entity-Relation Model
Common interface paradigms
  Templates
  Sortable lists
  Faceted browsing, text search
  Maps, timelines, thumbnails, tables
Migration to the Web
[A series of slides, each annotated with the same four labels: Text search, Faceted Browsing, Sorting by Properties, Templated Items]
Nothing New
Tremendous similarity in many web sites
  Much more uniform than applications
  Helps users know what to expect
Boring? Perhaps
But it creates an opportunity to standardize/package
  Define authorable standard vocabulary of widgets
EXHIBIT
Exhibit
HTML vocabulary extension (new tags) for making interactive pages “like these”
  ER data model (items + properties)
  Lenses (templates) for rendering individual items
  Views of an object collection: list, thumbnails, tabular, scatter plot, map, timeline
  Facets to filter the collection based on properties: enumerated list, tag cloud, slider/numeric range
Exhibit JavaScript library interprets the new tags
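To make the "items + properties" model and the lens idea concrete, here is a minimal sketch in plain JavaScript. It is not Exhibit's actual API or tag vocabulary: the `{{property}}` placeholder syntax, the function names, and the sample items are all made up for illustration.

```javascript
// Hypothetical sketch, NOT Exhibit's real API.
// Items are plain objects (entities with properties); the sample data is invented.
const items = [
  { label: "Ada Lovelace", field: "Mathematics", born: 1815 },
  { label: "Alan Turing", field: "Computer Science", born: 1912 },
];

// A lens is an HTML template whose blanks name the properties to fill in.
// The {{property}} syntax is an assumption made for this sketch.
const lens = '<div class="person"><b>{{label}}</b> ({{born}}), {{field}}</div>';

// Minimal HTML escaping so property values cannot inject markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Fill each {{property}} blank from the item; missing properties render as empty.
function renderLens(template, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, prop) =>
    prop in item ? escapeHtml(item[prop]) : ""
  );
}

// A "list view" here is simply every item rendered through the same lens.
function renderListView(template, collection) {
  return collection.map((item) => renderLens(template, item)).join("\n");
}

const listHtml = renderListView(lens, items);
```

The point of the sketch is that the lens never mentions a specific item, so the same template renders any collection whose items carry those properties.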
Exhibit Interaction
Simple Architecture
Each facet, view is independently tied to data
Specify which properties should
  fill in which places in the template
  hold latitude/longitude for map
  hold start/end for timeline
  be sortable in the list
  be columns in the tabular view
  provide filtering values for a facet
Then interact through the data
  Clicking value in facet filters data, which changes view
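The "interact through the data" architecture can be sketched as a shared collection that facets restrict and views observe. This is an illustrative assumption about how such wiring could look, not Exhibit's implementation; the `Collection` class, its method names, and the recipe data are hypothetical.

```javascript
// Hypothetical sketch: facets and views never talk to each other directly.
// Both are tied to a shared collection; a facet click changes the filter,
// and every registered view is told about the new visible set.
class Collection {
  constructor(items) {
    this.items = items;
    this.restrictions = new Map(); // property name -> Set of allowed values
    this.listeners = [];
  }

  // A view registers a callback to be re-rendered when the filter changes.
  onChange(listener) {
    this.listeners.push(listener);
  }

  // Called when the user clicks a value in a facet: toggle that value.
  toggleFacetValue(property, value) {
    const allowed = this.restrictions.get(property) || new Set();
    if (allowed.has(value)) allowed.delete(value);
    else allowed.add(value);
    if (allowed.size === 0) this.restrictions.delete(property);
    else this.restrictions.set(property, allowed);
    const visible = this.visibleItems();
    this.listeners.forEach((listener) => listener(visible));
  }

  // An item is visible if it satisfies every active facet restriction.
  visibleItems() {
    return this.items.filter((item) =>
      [...this.restrictions].every(([property, allowed]) =>
        allowed.has(item[property])
      )
    );
  }
}

// Invented sample data and a stand-in "list view" that just records labels.
const recipes = new Collection([
  { label: "Pad Thai", cuisine: "Thai" },
  { label: "Risotto", cuisine: "Italian" },
  { label: "Green Curry", cuisine: "Thai" },
]);
let listViewLabels = recipes.visibleItems().map((item) => item.label);
recipes.onChange((visible) => {
  listViewLabels = visible.map((item) => item.label);
});

recipes.toggleFacetValue("cuisine", "Thai"); // simulated facet click
```

Because each widget depends only on the collection, adding another view or facet in this sketch means registering one more listener or restriction, with no changes to the existing widgets.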
Deployment
Deployed 2007
Open source JavaScript library
  http://simile-widgets.org/exhibit
Browser independent
Scales to ~1000 items, tens of properties
>1500 exhibits on the web
  Hobbyists, scientists, newspapers, merchants
Discussion
Variety of exhibits (> 1500) suggests we’ve approximated “right” interaction vocabulary
Each is just an HTML document (with new tags)
All authored by editing data files and HTML source
How do we make them WYSIWYG?
DIDO
Editing the Data
Data objects are rendered through lenses
Which “fill in the blanks” using object properties
Changes to the rendered object can map back to changes in the underlying object
No additional “editing form” required
Change the rendered data, change the underlying data
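One way to picture "changes to the rendered object map back to the underlying object" is to have the lens tag every filled-in blank with the item and property it came from. The sketch below is an assumption about how that could be done, not Dido's actual code; the `data-item`/`data-property` attributes, function names, and sample record are hypothetical. It avoids the DOM so it can run anywhere; in a browser the edit record would be read from the edited element's attributes.

```javascript
// Hypothetical sketch of mapping an edit in the rendered page back to the data.
const database = new Map([
  // Invented record with a deliberately wrong year, corrected by the edit below.
  ["turing", { label: "Alan Turing", born: 1921 }],
]);

function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render each {{property}} blank as a span that remembers its origin,
// so no separate editing form is needed.
function renderEditable(template, itemId, item) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, prop) =>
    `<span data-item="${itemId}" data-property="${prop}">` +
    `${escapeHtml(prop in item ? item[prop] : "")}</span>`
  );
}

// Apply an edit made on a rendered span to the underlying object.
// Numbers stay numbers: text typed into a numeric field is parsed back.
function applyEdit(db, { itemId, property, newText }) {
  const item = db.get(itemId);
  if (!item) throw new Error(`Unknown item: ${itemId}`);
  if (typeof item[property] === "number") {
    const parsed = Number(newText);
    if (newText.trim() === "" || Number.isNaN(parsed)) {
      throw new Error(`"${newText}" is not a number`);
    }
    item[property] = parsed;
  } else {
    item[property] = newText;
  }
  return item;
}

const rendered = renderEditable("<p>{{label}}, born {{born}}</p>", "turing", database.get("turing"));

// Simulate the user retyping the year inside the rendered span.
applyEdit(database, { itemId: "turing", property: "born", newText: "1912" });
```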
Editing the Interface
Views (of collection) and facets (for filtering) are elements on the page
Add them like other elements---images, media
Must tie to data model
  By specifying which properties are used in view/facet
  Like a chart in a spreadsheet---you specify columns
Editing Lens Templates
Templates say which properties go where
  Just HTML, so any WYSIWYG editor will do
Adding template field spawns new data field
Changing the template changes the schema
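The idea that adding a template field spawns a new data field can be sketched as a small schema-sync step run after each template edit. This is an illustration under assumptions, not Dido's implementation: the `{{property}}` placeholder syntax, the function names, and the sample items are hypothetical.

```javascript
// Hypothetical sketch: the template is the source of truth for the schema.

// Collect the distinct property names a template refers to.
function templateProperties(template) {
  const names = [...template.matchAll(/\{\{(\w+)\}\}/g)].map((m) => m[1]);
  return [...new Set(names)];
}

// After the template is edited, add any newly mentioned property to the
// schema and give every existing item an empty value for it.
function syncSchema(schema, items, template) {
  const added = [];
  for (const prop of templateProperties(template)) {
    if (!schema.includes(prop)) {
      schema.push(prop);
      added.push(prop);
    }
  }
  for (const item of items) {
    for (const prop of added) {
      if (!(prop in item)) item[prop] = "";
    }
  }
  return added;
}

// Invented example: the author adds a "color" blank to the lens.
const schema = ["label", "price"];
const products = [
  { label: "Mug", price: 8 },
  { label: "Poster", price: 12 },
];
const addedFields = syncSchema(
  schema,
  products,
  "<div>{{label}}: ${{price}} ({{color}})</div>"
);
```

In this sketch nothing is ever removed from the schema when a blank is deleted from the template; whether a real tool should drop the data in that case is a design decision the slides do not address.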
Persistence
It’s just a document
  Save it
  Publish it on your web site
  Email it
  Store it in a version control system
  Copy it to create a new, different app
Not anchored to any particular app or web site
Implementation
Everything is in the one HTML file
  Data (serialized JSON)
  Display (HTML, with new tags for view/facet widgets)
  JavaScript
    Exhibit framework to drive the data widgets
    jQuery
    TinyMCE HTML editor
    2000 lines of new JavaScript to wire everything together
No server, no plugin, nothing to install/configure
Changes to data or display change the file
  So saving the file saves the changes
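A rough sketch of "the data lives in the document, so saving the document saves the data": the JSON sits in a data block inside the HTML text, and writing new data means rewriting that block. The `dido-data` id and both function names are assumptions made for this example; how the real Dido lays out or writes its file is not shown on the slides.

```javascript
// Hypothetical sketch: a JSON data block embedded in the page's own HTML.
const DATA_BLOCK =
  /(<script type="application\/json" id="dido-data">)([\s\S]*?)(<\/script>)/;

// Read the embedded database out of the document text.
function readData(html) {
  const match = html.match(DATA_BLOCK);
  if (!match) throw new Error("No embedded data block found");
  return JSON.parse(match[2]);
}

// Return a new document whose data block holds the given data.
// "<" is written as \u003c so a value containing "</script>" cannot
// terminate the block early; JSON.parse decodes it back transparently.
function writeData(html, data) {
  if (!DATA_BLOCK.test(html)) throw new Error("No embedded data block found");
  const json = JSON.stringify(data, null, 2).replace(/</g, "\\u003c");
  return html.replace(
    DATA_BLOCK,
    (_, open, _old, close) => `${open}\n${json}\n${close}`
  );
}

// Invented example document and edit.
const page =
  "<html><body><h1>My books</h1>" +
  '<script type="application/json" id="dido-data">[]</script>' +
  "</body></html>";

const data = readData(page);
data.push({ label: "SICP", note: "contains </script> safely" });
const savedPage = writeData(page, data); // this string is what gets saved
```

Copying `savedPage` to a new file and swapping out the data would give a new application with the same presentation, which is the "authoring by copying" workflow described later in the talk.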
DISCUSSION
Discussion: Document as Interface
“Active Documents” (Xerox PARC) are a great UI
  Add elements that react to the user
  Document becomes a whole user interface
  But still intuitive (it’s just a document)
  And easy to build (extend document editor/viewer)
They have taken over the web
  Every web page with JavaScript is one
  They are the dominant interfaces to web content
Discussion: Authoring Active Docs
Current web active docs are manually scripted
Web has converged on standard active doc widgets
  Sortable lists, maps, facets for browsing,…
  Should become part of HTML standard tags
These can be inserted in document
  Just like an image or other media
And wired to underlying data
  Just like a chart in a spreadsheet
So users can WYSIWYG author these active docs
It’s Just a Document
No application to install or configure
New, specialized display for each data set
  App as light “CSS” wrapper around the data
Create new apps by cloning old
  no need to start from scratch or learn the rules
oops!
Authoring by Copying
Views, lenses in HTML file
Copy it, change the data
(Maybe change the presentation too)
Scientific Publication
Document of the future
  Publish your paper with the data inside it
  Let the reader interact with that data
  They’ll better understand/believe your argument
Easy Implementation
Browser as comprehensive UI toolkit
2000 lines of JS
  Just wiring existing open source libraries together
  Dido = Exhibit + TiddlyWiki + TinyMCE
Even I could do it
A Cloud Computing Contrarian
Popular model:
  Move all functionality to the cloud
  Deliver “thin client” web interface using Ajax
We can support the same interface experience without the cloud
Cloud Costs
Requires connectivity
Leaks your data
  Provider is attack target
  May not care about your data as much as you do
Locks your data
  Trapped if provider shuts down
  Or changes terms of service
Dido avoids all these costs
  Keep working on the plane
  Manage sensitive data that cannot be leaked
  Email to a friend, publish on a web site, version control
Cloud Benefits
Persistence/ubiquity
  Can persist the document (Google Docs)
For Big Data
  Valid reason
  Even Dido does this for Google Maps
  But most data is small
  Don’t make cloud downsides ubiquitous just because sometimes necessary!
Even if need cloud for data, UI approach still works
One DAPI to rule them all
Usually only need cloud for the data
Define a single data access API
  E.g., SQL over HTTP
Build it into the browser
  Native JavaScript invocation
  Maybe even HTML tags
    Define HTML table using SQL query?
Skip the labor of defining/learning a new API for each application
Simplify client AND server side programming
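To give a feel for what a single data access API might look like from the client side, here is a speculative sketch. No such standard exists; the request shape (`{ sql, params }` posted as JSON), the endpoint URL, and both function names are invented for illustration. The code only builds the request and renders result rows, so it runs without a network.

```javascript
// Entirely hypothetical: one generic request shape for "SQL over HTTP",
// instead of a custom API per application.
function buildQueryRequest(endpoint, sql, params = []) {
  return {
    url: endpoint,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Parameters travel separately from the SQL text rather than being
      // spliced into it, so the server can bind them safely.
      body: JSON.stringify({ sql, params }),
    },
  };
}

function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// "Define HTML table using SQL query": turn result rows (an array of
// objects, an assumed response format) into table markup.
function rowsToHtmlTable(rows) {
  if (rows.length === 0) return "<table></table>";
  const columns = Object.keys(rows[0]);
  const head =
    "<tr>" + columns.map((c) => `<th>${escapeHtml(c)}</th>`).join("") + "</tr>";
  const body = rows
    .map(
      (row) =>
        "<tr>" +
        columns.map((c) => `<td>${escapeHtml(row[c])}</td>`).join("") +
        "</tr>"
    )
    .join("");
  return `<table>${head}${body}</table>`;
}

// Invented usage: the endpoint and query are placeholders.
const request = buildQueryRequest(
  "https://example.org/data",
  "SELECT name, price FROM products WHERE price < ?",
  [10]
);
const tableHtml = rowsToHtmlTable([{ name: "Mug", price: 8 }]);
```

In a browser, `request.url` and `request.options` could be handed to `fetch`; the appeal of the slide's proposal is that every application would reuse this one request shape.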
The DB community can lead
Define the right primitives for data access/manipulation in web interfaces
Push for a data access standard in HTML
  In-browser SQL currently bogged down in debates
  High-performance in-browser database implementation
Push for one cloud-data API
  Generic database equivalent of Apache HTTPD?
  Ease of adoption
Bring the database to the document!
Try It!
http://bit.ly/didodido
http://projects.csail.mit.edu/exhibit/Dido/
Remember: it’s a research prototype
  Written by a theory prof.
Thanks to Scott Ostler and Ryan Lee