experiments towards reverse linking on the web
Post on 08-Jul-2015
Experiments Toward Reverse Linking on the Web
Yeliz Yesilada, Darren Lunn and Simon Harper
Information Management Group
University of Manchester
Links and Browsing
• Links Allow Movement in Information Space
• Etymology of Browsing
  – To nibble at leaves, tender shoots, or other soft vegetation
• A User Is In Control of What to Read or Examine
Current Web Model
• Closed Hypermedia System
• Links Embedded Within the Document By The Author
• Outbound Uni-Directional Links
• Limits the User's Browsing Experience
[Diagram: pages A, B and C]
Bi-Directional Linking
• Used in Open Hypermedia Systems
• Users Can Travel in Both Directions
• Links Stored in a Separate Link Base
• Links Generated Dynamically
[Diagram: pages A and B, with "?" marks]
Existing Bi-Directional Web Linking
• Back Button
  – Uses the Browser Cache
  – User Only Knows About Pages Previously Visited
• Surfing The Web Backwards (Chakrabarti ’99)
  – Netscape Browser Extension
  – Web Server Extension
• Trackback
  – An Acknowledgement Between Sites that a Link Exists
  – Both Sites Need to Be Trackback Enabled
Our Approach
• Use Web Logs To Establish Who Links To Our Website
• Reduced Spam Threat as Users Must Click on a Link
• Links Available to Any JavaScript Supporting Browser
Architecture
[Diagram: Client-Side (Browser, Web Page +) and Server-Side (WebServer, Log File, Log Processor, Pages.xml, Pages.html)]
1. User Clicks A Link To Request a Web Page
2. Server Records Request
3. Log Processor Parses Log To Create Linkbase
4. Link Base is Added To Page
5. Web Page Plus Reverse Links Sent To User
User Follows Link (1)
Server Creates Web Log (2)
• Web Server Logs HTTP Requests
  – Page Requested
  – Destination Client of the Requested Page
• Also Logs Additional Information
  – The Page Where the User Clicked the Link to Request Page
  – Client Platform
• W3C Extended Log File Format
Example Web Log
01: 130.88.199.206
02: -
03: -
04: [08/Aug/2007:18:30:39 +0000]
05: "GET /ht07/index.php HTTP/1.1"
06: 200
07: 3811
08: "http://markbernstein.org/"
09: "Mozilla/5.0 (Windows NT 5.1; en-GB;) Gecko/20061204 Firefox/2.0.0.1"
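As a rough illustration of how a log entry like the one above could be read, the sketch below parses a single line into its fields. It assumes the nine fields appear space-separated on one line in the common Apache-style "combined" layout and that the referrer field is closed with a quote. The names `parseLogLine` and `LOG_PATTERN` are illustrative and do not come from the slides.

```javascript
// Sketch only: parse one combined-style log line into named fields.
// Assumed layout: client ident user [time] "request" status bytes "referrer" "agent"
const LOG_PATTERN =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+) ([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

function parseLogLine(line) {
  const m = LOG_PATTERN.exec(line.trim());
  if (!m) return null; // line is not in the assumed layout
  return {
    client: m[1],
    timestamp: m[4],
    method: m[5],
    path: m[6], // the page requested
    status: Number(m[8]),
    bytes: m[9] === "-" ? 0 : Number(m[9]),
    referrer: m[10] === "-" ? null : m[10], // the page the user clicked from
    userAgent: m[11],
  };
}

const sample =
  '130.88.199.206 - - [08/Aug/2007:18:30:39 +0000] "GET /ht07/index.php HTTP/1.1" ' +
  '200 3811 "http://markbernstein.org/" ' +
  '"Mozilla/5.0 (Windows NT 5.1; en-GB;) Gecko/20061204 Firefox/2.0.0.1"';

const entry = parseLogLine(sample);
console.log(entry.path, entry.referrer);
```

The requested path and the referrer are the two values the later linkbase step needs; the other fields are kept only for completeness.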
Linkbase Creation (3)
• Parse the Log File for Referrer / Get Request Pairs
• Create Simple XML File
• Each Webpage has a Corresponding XML Linkbase
  – index.php → index.xml
• Individual XML Linkbases Allow
  – Reduced Processing on the Server
  – Reduced Delay on the Client
Example Linkbase (index.xml)
<linkbase>
  <link>
    <title>Home page of Mark Bernstein</title>
    <url>http://markbernstein.org/</url>
  </link>
  <link>
    <title>HCI Conference and Workshops</title>
    <url>http://degraaff.org/hci/conference.html</url>
  </link>
  <link>
    <title>D-Lib Workshops and Conferences: 2007</title>
    <url>http://dlib.org/groups.html</url>
  </link>
  . . .
</linkbase>
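The linkbase creation step can be sketched as follows: group referrer / GET request pairs by requested page, then write one XML document per page in the shape of the example above. This is a minimal sketch, not the authors' log processor. The function names are hypothetical, and because the slides do not say how link titles were obtained, the `titleFor` callback is an assumption that defaults to using the URL itself.

```javascript
// Escape the five XML special characters so URLs and titles are safe to embed.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Group referrers by requested page: Map<path, Set<referrer>>.
// Entries without a referrer (direct visits) are skipped.
function groupReferrers(entries) {
  const byPage = new Map();
  for (const { path, referrer } of entries) {
    if (!referrer) continue;
    if (!byPage.has(path)) byPage.set(path, new Set());
    byPage.get(path).add(referrer);
  }
  return byPage;
}

// Assumed naming rule from the slides: index.php -> index.xml.
// Paths without a file extension are not handled specially here.
function linkbaseName(path) {
  return path.replace(/\.[^./]+$/, "") + ".xml";
}

// Serialise a collection of referrer URLs as a <linkbase> document.
function toLinkbaseXml(referrers, titleFor = (url) => url) {
  const links = [...referrers].map(
    (url) =>
      "  <link>\n" +
      `    <title>${escapeXml(titleFor(url))}</title>\n` +
      `    <url>${escapeXml(url)}</url>\n` +
      "  </link>"
  );
  return ["<linkbase>", ...links, "</linkbase>"].join("\n");
}

const entries = [
  { path: "/ht07/index.php", referrer: "http://markbernstein.org/" },
  { path: "/ht07/index.php", referrer: "http://dlib.org/groups.html" },
  { path: "/ht07/index.php", referrer: null },
];

for (const [path, referrers] of groupReferrers(entries)) {
  console.log(linkbaseName(path));
  console.log(toLinkbaseXml(referrers));
}
```

Using a Set per page means a referrer that appears many times in the log is written to the linkbase only once.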
Links Added To The Page (4)
• Add JavaScript To Each Webpage
• JavaScript Is Supported By Most Browser Software
• When Page is Loaded, Look For Corresponding Linkbase
• Extract Links From Linkbase
• Add Links to Page
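The client-side steps listed above could look roughly like the sketch below. It is an illustration under several assumptions: the linkbase sits next to the page with an .xml extension, the page contains an element with the hypothetical id "reverse-links", and the modern `fetch` and `DOMParser` browser APIs are used in place of whatever the original 2007 script did.

```javascript
// Assumed convention: the linkbase for index.php is index.xml in the same folder.
function linkbaseUrlFor(pagePath) {
  return pagePath.replace(/\.[^./]+$/, "") + ".xml";
}

// Pull { title, url } pairs out of a parsed linkbase document.
function extractLinks(xmlDoc) {
  return Array.from(xmlDoc.getElementsByTagName("link"), (link) => ({
    title: link.getElementsByTagName("title")[0].textContent,
    url: link.getElementsByTagName("url")[0].textContent,
  }));
}

// Browser-only glue: load the linkbase and append the links to the page.
async function showReverseLinks(containerId) {
  const container = document.getElementById(containerId);
  if (!container) return; // page has no slot for reverse links
  const response = await fetch(linkbaseUrlFor(window.location.pathname));
  if (!response.ok) return; // no linkbase exists for this page
  const xmlDoc = new DOMParser().parseFromString(
    await response.text(),
    "application/xml"
  );
  const list = document.createElement("ul");
  for (const { title, url } of extractLinks(xmlDoc)) {
    const item = document.createElement("li");
    const anchor = document.createElement("a");
    anchor.href = url;
    anchor.textContent = title;
    item.appendChild(anchor);
    list.appendChild(item);
  }
  container.appendChild(list);
}

// Only wire up the handler when running inside a browser.
if (typeof window !== "undefined") {
  window.addEventListener("load", () => showReverseLinks("reverse-links"));
}
```

Setting `textContent` rather than building HTML strings keeps titles taken from other sites from being interpreted as markup.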
Displaying Links - Menu (5)
• As Part of the Menu
• Immediately Available For Use
• Menu Size Increases Significantly
Displaying Links - Breadcrumb (5)
• Breadcrumbs Act As Navigation Aids
• They Inform Users Where They Are Within a Website
• Reverse Links Recommend Common Paths To Get To The Current Page
• Add A “Recommender” Extension To The Breadcrumb Trail
Evaluation
• Technical Evaluation
  – In the Lab
  – Live on the Hypertext Website
• No User Evaluation
  – Previous Work Has Shown Reverse Linking Can Enhance Web Browsing [Chakrabarti ’99]
Issues To Address
• How Often Should The Log File be Parsed?
  – Too Frequent - May slow down the server
  – Too Infrequent - Links may be out of date
  – Monthly - Anecdotally this seemed to work OK
• How Do We Manage The Link Box Size?
  – We only added links that occurred more than once
  – Could use time to keep only the most recently followed links
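The two ideas for limiting the link box can be combined in a small filter, sketched below. The hit record shape (`referrer`, `time` in milliseconds), the function name `selectLinks`, and the `maxLinks` cap are all assumptions made for illustration. Only the "more than once" threshold comes from the slides.

```javascript
// hits: [{ referrer: string, time: number }] (hypothetical record shape).
// Keeps referrers seen at least minCount times, most recently followed first.
function selectLinks(hits, { minCount = 2, maxLinks = 10 } = {}) {
  const stats = new Map(); // referrer -> { count, lastSeen }
  for (const { referrer, time } of hits) {
    const s = stats.get(referrer) ?? { count: 0, lastSeen: -Infinity };
    s.count += 1;
    s.lastSeen = Math.max(s.lastSeen, time);
    stats.set(referrer, s);
  }
  return [...stats.entries()]
    .filter(([, s]) => s.count >= minCount) // "occurred more than once"
    .sort(([, a], [, b]) => b.lastSeen - a.lastSeen) // newest first
    .slice(0, maxLinks)
    .map(([referrer]) => referrer);
}

const hits = [
  { referrer: "http://a.example/", time: 1 },
  { referrer: "http://a.example/", time: 5 },
  { referrer: "http://b.example/", time: 10 },
  { referrer: "http://c.example/", time: 7 },
  { referrer: "http://c.example/", time: 8 },
];

console.log(selectLinks(hits));
```

Here b.example is dropped because it was followed only once, and the remaining links are ordered by how recently they were used.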
Issues To Address
• Can Fine Grained Linking Be Achieved?
  – We link to the page
  – Is it possible to link to fragments, e.g. Blogs?
• How Do We Ensure Link Quality?
  – Some referrers were password protected
  – Some pages had been relocated, e.g. Blogs
  – Some pages might be spam
Conclusions
• Reverse Linking Is Possible Using Server Logs
• Our Technique is Platform Independent
• Enhances the User's Browsing Experience
• This Is A First Step - More Investigation Is Required
Questions
http://hcw.cs.manchester.ac.uk/