untangling the web from dns michael walfish hari balakrishnan massachusetts institute of technology...
TRANSCRIPT
Untangling the Web from DNS
Michael Walfish Hari Balakrishnan
Massachusetts Institute of Technology
Scott Shenker UC Berkeley/ICSI
IRIS Project
30 March 2004
Introduction
• The Web depends on linking; links contain references
• Properties of DNS-based references encode administrative domain human-friendly
<A HREF=http://domain_name/path_name>click here</A>
• These properties are problems!
Web Links Should Use Flat Identifiers
<A HREF=http://isp.com/dog.jpg>my friend’s dog</A>
<A HREF=http://isp.com/dog.jpg>my friend’s dog</A>
<A HREF=http://f0120123112/ >my friend’s dog</A>
<A HREF=http://f0120123112/ >my friend’s dog</A>
Current Proposed
• This talk: That we should build
this That we can build this
Outline
I. Argue for flat tags instead of DNS-based URLs
II. Resolution service for flat tags
Status Quo
DNS
IP addr
a.comBrowser
HTTP GET: /dog.jpg
http://http://<A HREF=
http://a.com/
dog.jpg>Spot</A>
<A HREF=
http://a.com/
dog.jpg>Spot</A>
Web Page
Why not DNS?
“Reference Resolution Service”
Goal #1: Stable References
Stable=“reference is invariant when object moves”
• In other words, links shouldn’t break• DNS-based URLs are not stable . . .
Object Movement Breaks Links
isp.com
“dog.jpg”
isp-2.com
“spot.jpg”
“HTTP 404”
HTTP GET:/dog.jpg
Browser
http://http://<A HREF=
http://isp.com/dog.jpg>Spot</A>
<A HREF=
http://isp.com/dog.jpg>Spot</A>
URLs hard-code a domain and a path
Object Movement Breaks Links, Cont’d
isp.com
“dog.jpg”
isp-2.com
“spot.jpg”
“HTTP 404”
HTTP GET:/dog.jpg
Browser
http://http://<A HREF=http://isp.com/dog.jpg>Spot</A>
<A HREF=http://isp.com/dog.jpg>Spot</A>
Today’s solutions not stable:• HTTP redirects
need cooperation of original host• Vanity domains, e.g.: internetjoe.org
now owner can’t change
Goal #2: Supporting Object Replication
• Host replication relatively easy today• But per-object replication requires:
separate DNS name for each object virtual hosting so replica servers recognize names configuring DNS to refer to replica servers
isp.com
“/docs/foo.ps”
mit.edu
“~joe/foo.ps”
http://object26.org
HTTP “GET /”
host: object26.org
HTTP “GET /”host: object26.org
What Should References Encode?
• Observe: if the object is allowed to change administrative domains, then the reference can’t encode an administrative domain
• What can the reference encode? Nothing about the object that might change! Especially not the object’s whereabouts!
• What kind of namespace should we use?
Goal #3: Automate Namespace Management
• Automated management implies no fighting over references
• DNS-based URLs do not satisfy this . . .
DNS is a Locus of Contention
• Used as a branding mechanism tremendous legal combat “name squatting”, “typo squatting”,
“reverse hijacking”, . . .
• ICANN and WIPO politics technical coordinator inventing naming
rights set-asides for misspelled trademarks
• Humans will always fight over names . . .
Separate References and User-level Handles
• “So aren’t you just moving the problem?” Yes. But.
Let people fight over handles, not references
Object Location
Human-unfriendly References
User Handles(AOL Keywords,
New Services, etc.)
tussle space[Clark et al., 2002]
Two Principles for References
1. References should not embed object or location semantics
2. References should be human-unfriendly
Flat tags
Minimal interfaceNatural choice:
Outline
I. Argue for flat tags instead of DNS-based URLs
II. Resolution service for flat tags: SFR
Object Location
Flat TagUser Handle
(AOL Keyword,New Handle, etc.)
<A HREF=http://f012c1d/ >Spot</A>
<A HREF=http://f012c1d/ >Spot</A>
Managed DHT-based
Infrastructure
GET(0xf012c1d)
(10.1.2.3, 80, /pics/dog.gif)
o-record
HTTP GET: /pics/dog.gif 10.1.2.3
Web Server
/pics/dog.gif
SFR in a Nutshell
orecAPI• orec = get(tag);• put(tag, orec);Anyone can put() or get()
Resilient Linking
SFR
<A HREF=http://f012012/pub.pdf>here is a paper</A>
<A HREF=http://f012012/pub.pdf>here is a paper</A>
HTTP GET:
/docs/pub.pdf
10.1.2.3
/docs/
20.2.4.6
HTTP GET:
/~user/pubs/pub.pdf(10.1.2.3,80,/docs/)(20.2.4.6,80,
/~user/pubs/)/~user/pubs/
• tag abstracts all object reachability information• objects: any granularity (files, directories, hosts)
(Doesn’t address massive replication)
Flexible Object Replication
(IP1, port1, path1),(IP2, port2, path2),(IP3, port3, path3),. . .
0xf012012SFR
o-record
• Grass-roots Replication People replicate each other’s content Does not require control over Web servers
Reference Management
• Requirements No collisions, even under network partition References must be human-unfriendly Only authorized updates to o-records
• Approach: randomness and self-certification tag = hash(pubkey, salt) o-record has pubkey, salt, signature anyone can check if tag and o-record match
Latency
• Problem: Lookups should be fast
• Solution: lots of TTL-based caching Clients and DHT nodes cache o-records DHT nodes cache each other’s locations
In Chord, aggressive location caching 2 or 3 hops per lookup
Could also use “one-hop” or Beehive
Outline
I. Argue for flat tags instead of DNS-based URLs
II. Resolution service for flat tags: SFR
III. Related Work / Summary / Conclusion
Related Work
• URN (Universal Resource Name) DOI: an existing URN implementation
• PURL (Permanent URL)• Globe• Open Network Handles• DNS over Chord
Summary
• Should we build flat references for the Web? Yes!
• Can we build flat references for the Web? Yes! (Our implementation is usable.)
• Lots of future work . . .
Conclusion
• DNS-based URLs certainly convenient!
• But flat tags better for linking
• Future: type DNS names, link with flat tags?
Appendix Slides
Implementation
http://http:// HTTPSFR Web
Proxy
SFR Client
SFR Portal
Chord
DHash
SFR Server
Web Client
• Proxy allows: End-users to experience SFR
latency Dynamic population of SFR
infrastructure with o-records
orec = get(tag)put(orec,tag)
HTTP
SFR Portal
GetRequestGetResponse
PlanetLab
Evaluation
SFR data ——— • eight day trace• 390 virtual hosts; 130
nodes in Chord ring on PlanetLab
• latency seen by SFR portal• most queries are two hops• informal feedback:
generally indistinguishable from DNS
Comparison meant to be suggestive not conclusive
DNS data ——— ——— • collected at MIT CSAIL, Feb. 2004
SFR ComponentsPortal
Portal
Relay
Relay
Relay
OrgStore
SFRInfrastructure
Organization
• Infrastructure stores (tag, o-record) pairs• Caching throughout; o-record has TTL field
SFR Client
ApplicationSFR Client
ApplicationSFR Client
Application
SFR Client
Application
Portal
Portal
Relay
Relay
Relay
OrgStore
SFRInfrastructure
Organization
Fate Sharing
put(tag,orec)
get(tag)
• Fate sharing via write-localitySimple case . . .
SFR Client
ApplicationSFR Client
ApplicationSFR Client
Application
SFR Client
Application
Alternate SFR Design: SFR--
• SFR stores only pointers to organizations• Analogous to NS records in DNS
GET(0xf012120)
org ptr: x
User
SFRInfrastructure
x
GET(0xf012120)
meta-data (IP addr, etc.)
Organization
When Files Separate From Directories
SFR
<A HREF=http://f012012/pub3.ps>here is a paper</A>
<A HREF=http://f012012/pub3.ps>here is a paper</A>
HTTP GET:
/doc/pub3.ps
10.1.2.3
/doc/pub1.ps
20.2.4.6HTTP GET:
/abc/pub3.
ps
(10.1.2.3,80,/doc/)
redirect ptr:x
/abc/pub3.ps
/doc/pub2.ps
/doc/pub3.psx
GET(0xf012012)
Location Caching
• Simulate effect of location caching 20 lookups/sec; one failure and one birth per
10 sec. Timeout rate: 4% 1000 nodes