webquilt: capturing and visualizing the web experience at

WebQuilt Capturing and Visualizing the Web Experience Jason I. Hong James A. Landay Group for User Interface Research EECS Department University of California at Berkeley World Wide Web 10

Upload: carnegie-mellon-university

Post on 08-May-2015


DESCRIPTION

Research I did a while back on using a web proxy to capture web interactions remotely and then visualizing those interactions. WebQuilt is a web logging and visualization system that helps web design teams run usability tests (both local and remote) and analyze the collected data. Logging is done through a proxy, overcoming many of the problems with server-side and client-side logging. Captured usage traces can be aggregated and visualized in a zooming interface that shows the web pages people viewed. The visualization also shows the most common paths taken through the website for a given task, as well as the optimal path for that task as designated by the designer. This paper discusses the architecture of WebQuilt and also describes how it can be extended for new kinds of analyses and visualizations. Authors are Jason Hong and James Landay.

TRANSCRIPT

Page 1: WebQuilt: Capturing and Visualizing the Web Experience at

WebQuilt
Capturing and Visualizing the Web Experience

Jason I. Hong
James A. Landay

Group for User Interface Research
EECS Department
University of California at Berkeley

World Wide Web 10


May 04 2001 2

Motivation

• Many websites have usability problems
  - 62% web shoppers gave up past month (Spool)
  - 39% failed in buying attempts (Creative Good)
• Two problems all web designers face
  - Understanding users' tasks
  - Understanding obstacles in completing tasks
• Many methods for understanding tasks
  - E.g. interviews, ethnographic observations, surveys, focus groups
• Focus here is on understanding obstacles

Understanding Obstacles Today

• Traditional usability tests
  - Extremely useful qualitative information
  - Lots of time, small websites, few people, local
• Server-side logging
  - Easy to collect, remote testing, lots of tools
  - Restricted access, little on tasks and problems
• Client-side logging
  - Can track everything, remote testing
  - Installation, platform-dependent, analysis tools

Streamlining Current Practices

• Fast and easy to deploy on any website
• Compatible with range of OS and browsers
• Better tools for analyzing the data

WebQuilt Approach

• Fast and easy to deploy on any website
• Compatible with range of OS and browsers
• Better tools for analyzing the data

[Diagram: Client Browser sends a Request to the Web Server, which returns a Response]

[Diagram: a Proxy sits between the Client Browser and the Web Server and writes the WebQuilt Log]


WebQuilt Usage

• Set up several tasks, recruit 20–100 people
• Email participants a URL that uses the proxy
• Ask them to complete the predefined tasks
• Collect lots of remote (or local) data
• Aggregate, view, and interact with data
• Find problems, fix, repeat

[Diagram: iterative cycle of Design, Prototype, Evaluate]

Outline

Background and Motivation
WebQuilt Architecture
Usage Experience and Visualizations
Summary and Future Work

Overall Architecture

[Diagram: Proxy Logger → Log Files → Action Inferencer → Graph Merger → Graph Layout → Viz, with regions labeled Online and Offline]

Proxy

• Lies between browser and server
  - http://domain.com/webquilt?replace=http://www.yahoo.com
• One log file per user session
• Currently use Java servlets
  - Important part is log file format
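The proxied URL form above can be sketched in a few lines of Python. This is only an illustration: `make_proxy_url` and `target_of` are hypothetical helper names, and the `encode` option is an assumption added here, since the slide shows the target URL embedded without percent-encoding.

```python
from urllib.parse import parse_qs, quote, urlsplit


def make_proxy_url(proxy_base, target_url, encode=False):
    """Build a URL that routes a request for target_url through the proxy.

    encode=False reproduces the unencoded form shown on the slide;
    encode=True percent-encodes the target (an assumption, not from the slides).
    """
    value = quote(target_url, safe="") if encode else target_url
    return f"{proxy_base}?replace={value}"


def target_of(proxy_url):
    """Recover the target URL from the 'replace' query parameter."""
    return parse_qs(urlsplit(proxy_url).query)["replace"][0]


url = make_proxy_url("http://domain.com/webquilt", "http://www.yahoo.com")
print(url)
print(target_of(url))
```

One reason to consider encoding: a target that carries its own `&`-separated query string would otherwise be split into separate parameters when the proxy parses the request.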

Log File Format

Time (ms)  From TID  To TID  Parent ID  HTTP Response  Frame ID  Link ID  HTTP Method  URL + Query
6062       0         1       -1         200            -1        -1       GET          http://www.google.com
11191      1         2       -1         200            -1        -1       GET          http://www.phish.com/index.htmq=Phish&btnI=I%27m+Feeling+Lucky
167525     2         3       -1         200            -1        1        GET          http://www.phish.com/bios.html
31043      3         4       -1         200            -1        2        GET          https://www.phish.com/bin/catalog.cgi
68772      2         5       -1         200            -1        15       GET          http://www.emusic.com/features/phish
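A minimal sketch of reading one such log entry, assuming the nine fields are separated by whitespace as they appear on the slide (the real on-disk delimiter is not shown). The `Transaction` type and `parse_line` function are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Transaction(NamedTuple):
    """One log entry; field names mirror the slide's column headers."""
    time_ms: int
    from_tid: int
    to_tid: int
    parent_id: int
    http_response: int
    frame_id: int
    link_id: int
    http_method: str
    url: str


def parse_line(line):
    # Split into at most nine fields so the URL + query stays in one piece.
    fields = line.split(None, 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    numbers = [int(f) for f in fields[:7]]
    return Transaction(*numbers, fields[7], fields[8])


entry = parse_line("167525 2 3 -1 200 -1 1 GET http://www.phish.com/bios.html")
print(entry.from_tid, entry.to_tid, entry.link_id, entry.url)
```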


The Proxy at Runtime

[Diagram: Client Browser ↔ WebQuilt Proxy ↔ Web Server. Inside the proxy: WebProxy Servlet, Proxy Editor, and HTTPClient Package, plus two stores, Cached Pages and WebQuilt Logs. Arrows are numbered 1 to 5 for the steps below.]

1. Process Client Request
2. Retrieve Requested Document
3. Process and return the page

Start with:
<A HREF="computers.html">

End up with:
<A HREF="http://tasmania.cs.berkeley.edu/webquilt?replace=http://www.yahoo.com/computers.html&tid=1&linkid=12">

4. Store the page
5. Log the transaction
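The five runtime steps, including the link rewriting shown in the "Start with / End up with" example, might be sketched as follows. This is a simplified Python illustration, not the Java servlet code: `rewrite_links` and `handle_request` are hypothetical names, `fetch` stands in for the HTTP client, links are assumed to be numbered in document order starting from 1, and the time value is assumed to be wall-clock milliseconds.

```python
import re
import time
from urllib.parse import parse_qs, urljoin, urlsplit

# Matches the href attribute of an anchor tag (double-quoted values only).
HREF_RE = re.compile(r'(<a\s[^>]*?href=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(html, page_url, proxy_base, tid):
    """Point every link back through the proxy, tagging it with tid and linkid."""
    link_id = 0

    def through_proxy(match):
        nonlocal link_id
        link_id += 1  # assumption: links numbered in document order from 1
        absolute = urljoin(page_url, match.group(2))
        return (f"{match.group(1)}{proxy_base}?replace={absolute}"
                f"&tid={tid}&linkid={link_id}{match.group(3)}")

    return HREF_RE.sub(through_proxy, html)


def handle_request(proxy_url, proxy_base, fetch, cache, log):
    # 1. Process client request: read the target and the IDs from the query.
    params = parse_qs(urlsplit(proxy_url).query)
    target = params["replace"][0]
    from_tid = int(params.get("tid", ["0"])[0])
    link_id = int(params.get("linkid", ["-1"])[0])
    to_tid = len(log) + 1
    # 2. Retrieve the requested document.
    status, html = fetch(target)
    # 3. Process the page (rewrite its links) so it can be returned.
    page = rewrite_links(html, target, proxy_base, to_tid)
    # 4. Store the original page.
    cache[to_tid] = html
    # 5. Log the transaction.
    log.append((int(time.time() * 1000), from_tid, to_tid,
                status, link_id, "GET", target))
    return page


cache, log = {}, []
proxy = "http://tasmania.cs.berkeley.edu/webquilt"
fake_fetch = lambda url: (200, '<A HREF="computers.html">')
print(handle_request(proxy + "?replace=http://www.yahoo.com/",
                     proxy, fake_fetch, cache, log))
```

Passing `fetch` in as a parameter keeps the sketch runnable without network access; a real proxy would also need to handle frames, forms, and non-HTML content, which the slides do not detail.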

Additional Proxy Functionality

• Handling Cookies
  - Cookies are only sent from the browser back to the web server that set them

User ID  Domain      Cookie
AAA      yahoo.com   xyzzy
AAA      google.com  asdfg
BBB      yahoo.com   abcde


• Handling Secure Sockets Layer (SSL)
  - Encrypts page requests and data
  - E.g. Shopping Carts, Financials
  - Split into two SSL requests

[Diagram: Client Browser ↔ SSL ↔ Proxy ↔ SSL ↔ Web Server]
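The User ID / Domain / Cookie table suggests a lookup keyed by user and domain, so that the proxy can replay each user's cookies to the right site. A minimal sketch of that idea, with the hypothetical class name `CookieTable`; cookie expiry, paths, and multiple cookies per domain are deliberately left out.

```python
class CookieTable:
    """Keeps one cookie value per (user ID, domain) pair."""

    def __init__(self):
        self._cookies = {}

    def store(self, user_id, domain, cookie):
        # Called when a web server's response sets a cookie for this user.
        self._cookies[(user_id, domain)] = cookie

    def lookup(self, user_id, domain):
        # Called before forwarding this user's request to the domain.
        return self._cookies.get((user_id, domain))


table = CookieTable()
table.store("AAA", "yahoo.com", "xyzzy")
table.store("AAA", "google.com", "asdfg")
table.store("BBB", "yahoo.com", "abcde")
print(table.lookup("AAA", "yahoo.com"))
print(table.lookup("BBB", "google.com"))
```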

Action Inferencer

• Takes a single log file and converts it into a list of actions
  - "Clicked on link" or "Hit back button"
• Inference because it still must guess
  - Back and forward actions are local to the browser, so they never reach the proxy

Re-Assembling User Actions

From TID  To TID  URL + Query
0         1       http://www.google.com
1         2       http://www.phish.com/index.htmq=Phish&btnI=I%27m+Feeling+Lucky
2         3       http://www.phish.com/bios.html
3         4       https://www.phish.com/bin/catalog.cgi
2         5       http://www.emusic.com/features/phish

[Diagram, built up one transaction at a time: Start → 1 → 2 → 3 → 4, then a second branch 2 → 5]

Action Inferencer

[Diagram: Start → 1 → 2 → 3 → 4, with a branch 2 → 5]

Case 1: Start → 1 → 2 → 3 → 4 → 3 → 2 → 5
  - Link to 4, Back to 2, then Link to 5

Case 2: Start → 1 → 2 → 3 → 4 → 3 → 2 → 1 → 2 → 5
  - Link to 4, Back to 1, Fwd to 2, then Link to 5

Case 1 by default (shortest path)
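The default "shortest path" choice can be sketched as a stack walk: when a transaction starts from a page other than the current one, assume the user pressed Back just enough times to return to it. This is an illustrative Python sketch, not the WebQuilt implementation; `infer_actions` is a hypothetical name and transaction 0 is assumed to be the start state.

```python
def infer_actions(transactions, start=0):
    """Turn (from_tid, to_tid) pairs into a list of link and back actions.

    Implements Case 1 only: the fewest Back presses that explain the log.
    """
    actions = []
    history = [start]  # pages on the current path, oldest first
    for from_tid, to_tid in transactions:
        if from_tid not in history:
            raise ValueError(f"transaction {from_tid} -> {to_tid} has no known origin")
        # Back up until the page the click came from is the current page.
        while history[-1] != from_tid:
            left = history.pop()
            actions.append(("back", left, history[-1]))
        actions.append(("link", from_tid, to_tid))
        history.append(to_tid)
    return actions


log_pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
for action in infer_actions(log_pairs):
    print(action)
```

For the log on the slides this yields four link actions, two back actions (4 to 3, then 3 to 2), and a final link from 2 to 5, matching Case 1.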

Merger

• Combines multiple log files into a single directed graph
  - Web pages are nodes
  - Actions are edges
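A minimal sketch of the merge step, assuming each session has already been reduced to the ordered list of pages a user visited. `merge_sessions` is a hypothetical name, and counting how many times each edge was followed is an assumption added here to show how traffic could be aggregated.

```python
from collections import Counter


def merge_sessions(sessions):
    """Combine per-user page sequences into one directed graph.

    Returns (nodes, edges): nodes is a set of pages, edges maps
    (source, destination) to the number of times that step was taken.
    """
    nodes = set()
    edges = Counter()
    for path in sessions:
        nodes.update(path)
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return nodes, edges


sessions = [
    ["home", "search", "item", "cart"],
    ["home", "search", "help"],
]
nodes, edges = merge_sessions(sessions)
print(sorted(nodes))
print(edges[("home", "search")])
```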

Graph Layout

• Assign (x,y) to all nodes
• Force-directed placement
  - Keep connected nodes close
  - Push unconnected nodes far apart
• Edge-weighted depth-first
  - Most traffic along top
  - Less followed paths below
  - Grid to help organize and align
• Plug in new algorithms here
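One way to read the edge-weighted depth-first bullet is as a traversal that follows the busiest outgoing edge first and keeps it on the current grid row, pushing less-travelled branches onto new rows below. The Python sketch below is an interpretation under that assumption, not the layout code from WebQuilt; `layout_depth_first` is a hypothetical name.

```python
def layout_depth_first(edges, start):
    """Assign grid (x, y) positions: x is depth from start, y is the row.

    edges maps (source, destination) to a traffic count.
    """
    children = {}
    for (src, dst), weight in edges.items():
        children.setdefault(src, []).append((weight, dst))

    positions = {}
    next_row = 0

    def visit(node, depth, row):
        nonlocal next_row
        positions[node] = (depth, row)
        first = True
        # Heaviest edge first, so the most traffic stays on the same row.
        for _, child in sorted(children.get(node, []), key=lambda wc: -wc[0]):
            if child in positions:
                continue
            if first:
                visit(child, depth + 1, row)
                first = False
            else:
                next_row += 1  # lighter branches drop to a fresh row below
                visit(child, depth + 1, next_row)

    visit(start, 0, 0)
    return positions


edges = {("Start", 1): 3, (1, 2): 3, (2, 3): 2, (3, 4): 2, (2, 5): 1}
for node, xy in layout_depth_first(edges, "Start").items():
    print(node, xy)
```

With these example weights the main path Start, 1, 2, 3, 4 lies along row 0 and page 5 lands one row below, at the depth after page 2.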

Visualization

Future Work

• More sophisticated logging
  - Lower level events (e.g. AT&T WET)
  - Personalized web pages
• More sophisticated visualizations
  - More use of semantic zooming
  - Dynamic filtering
• Continue getting feedback from designers
  - Initiated interviews with web designers
  - Still need to do evaluations

Take Home Ideas

• Need more tools for improving web site usability
• Proxy logging
  - Logging where task is already known
  - Any website, any browser, remote testing
• Visualizing logged data
  - Aggregates large data sets
  - Interact with in a zooming interface
• Pluggable architecture

Acknowledgements

• Special thanks to Jeff Heer, Tim Sohn, and Sarah Waterson

Group for User Interface Research
EECS Department
University of California at Berkeley

Download WebQuilt at: http://guir.berkeley.edu/webquilt

Extra Slides

Berkeley Website A

Casa de Fruta A

Casa de Fruta B


In Case You're Feeling Evil…

• URLs can be of the form: http://userid@domain/page.html

• Most web servers ignore the userid part, but… http://[email protected]…/…

• Can auto-track people's actions once they hit your page without their knowledge or consent