ircdl damico del-bimbo-meoni

Interactive visual representations of complex information structures

Upload: media-integration-and-communication-center

Post on 13-Dec-2014

Page 1: Ircdl damico del-bimbo-meoni

Interactive visual representations of complex information structures

Page 2: Ircdl damico del-bimbo-meoni

The problem

•  in recent years the information available on the Web has increased not only in volume, but also in complexity

•  most of the documents accessible through the internet consist of multimedia data, user generated content, real-time information

•  the complexity of this new information structure has thus become a major issue for user experience and web usability, and there is not yet a standard framework for presenting this information to the user

Page 3: Ircdl damico del-bimbo-meoni

Thinkbase and Thinkpedia

•  interactive visualization tools for exploring the semantic graph of large knowledge spaces

•  these systems are designed only to improve the visual representation of Semantic Web resources, such as the open, shared knowledge bases Freebase and Wikipedia

Page 4: Ircdl damico del-bimbo-meoni

Whatsonweb+

•  systems that use web clustering engines as data sources for the visualization

•  these systems forward the user’s queries to conventional web search engines, collect the results and organize them into categorized groups called clusters, in order to give the user a semantic representation of the information

Page 5: Ircdl damico del-bimbo-meoni

TouchGraph Google Browser

•  a visual search engine that uses Google technology to display the connections between web sites, visualizing the results in an interactive and customizable map

Page 6: Ircdl damico del-bimbo-meoni

Previous work

•  all the previous solutions are built on top of a dedicated resource and cannot be extended to other information repositories, like:
   –  web search engines
   –  multimedia databases (images, video, etc.)
   –  real-time data
   –  traditional databases
   –  simple data sets (csv files, xml, etc.)

•  they do not add content extracted from other resources to improve information quality

•  they do not use advanced techniques for visualization of information

Page 7: Ircdl damico del-bimbo-meoni

The visual interactive framework

The framework consists of a web-based application that lets users perform a query, extracts and merges results from diverse knowledge repositories, and lets users explore the information through an interactive graph-based user interface

•  easily adaptable to all common data repositories

•  related content extraction to enrich information quality

•  Information Visualization techniques to improve user experience

more efficiency in information retrieval

Page 8: Ircdl damico del-bimbo-meoni

The architecture

•  the user performs a query against the following resources:
   –  the main data source: a web search engine
   –  the related sources: two multimedia databases and a social network

•  results are then processed according to a semantic strategy to create descriptions of the results

•  the GUI represents descriptions by specific interactive features and visual elements designed to allow users to explore the information
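
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Description` class, the `run_query` function and the stub sources are hypothetical names, the real services are replaced by placeholder functions, and the "semantic strategy" step is reduced to a plain merge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "source" is modelled as any function from a query string to a list of
# result titles. The real search engine, multimedia databases and social
# network are replaced by stubs (assumption, for illustration only).
Source = Callable[[str], List[str]]


@dataclass
class Description:
    """Merged view of one query: main results plus related content per source."""
    query: str
    main_results: List[str]
    related: Dict[str, List[str]] = field(default_factory=dict)


def run_query(query: str, main_source: Source,
              related_sources: Dict[str, Source]) -> Description:
    """Query the main source, then every related source, and merge the answers."""
    description = Description(query=query, main_results=main_source(query))
    for name, source in related_sources.items():
        description.related[name] = source(query)
    return description


# Hypothetical stub sources standing in for real services.
def stub_search(query: str) -> List[str]:
    return [f"web page about {query}"]


def stub_videos(query: str) -> List[str]:
    return [f"video about {query}"]


def stub_images(query: str) -> List[str]:
    return [f"image of {query}"]


def stub_social(query: str) -> List[str]:
    return [f"post mentioning {query}"]


if __name__ == "__main__":
    result = run_query(
        "ubuntu",
        main_source=stub_search,
        related_sources={"video": stub_videos, "image": stub_images,
                         "social": stub_social},
    )
    print(result.main_results)
    print(sorted(result.related))
```

Treating every repository as a function with the same signature is one way to keep the pipeline adaptable to new repositories: adding a source only means adding one entry to `related_sources`.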

Page 9: Ircdl damico del-bimbo-meoni

Main data source

•  the user performs a query against a web search engine, which returns clustered data extracted from the web

•  we used the clustering engine Carrot2: search results are organized by topic, using the Lingo clustering algorithm

Page 10: Ircdl damico del-bimbo-meoni

Related content sources

•  information extracted from the main source is processed and used to query some of the most popular social networks and image and video sharing platforms

•  the objective is to enrich the information with related content, giving the user more information sources, which improves completeness and increases the user’s knowledge

•  our system uses the publicly available APIs of YouTube as a source of video content, Flickr for images and Technorati for social and real-time content

Page 11: Ircdl damico del-bimbo-meoni

Resources description

•  all the information extracted from the main and related content sources is then processed to create a semantic structure in XML format

•  data are organized into the following elements:
   –  entities (called nodes), which represent the descriptors of the documents extracted from the resources
   –  relations (edges), which represent the connections among documents

•  the structure of the data was designed with two goals:
   –  to be easily extended to all common data repositories or search engines, in order to implement a standardized representation of single elements, clusters, ranking information and semantic relations
   –  to be lightweight, so that it can be easily delivered and processed by the RIA application that runs within a browser plugin
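
A node-and-edge structure of this kind could be serialized along the following lines. The slides do not show the actual schema, so the element names (`graph`, `node`, `edge`) and attribute names (`id`, `source`, `target`) are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple


def build_graph_xml(nodes: Iterable[Tuple[str, str]],
                    edges: Iterable[Tuple[str, str]]) -> str:
    """Serialize (id, label) nodes and (source_id, target_id) edges to XML.

    Element and attribute names are hypothetical; the original schema
    is not given in the slides.
    """
    root = ET.Element("graph")
    for node_id, label in nodes:
        node = ET.SubElement(root, "node", id=node_id)
        node.text = label
    for source_id, target_id in edges:
        ET.SubElement(root, "edge", source=source_id, target=target_id)
    return ET.tostring(root, encoding="unicode")


def parse_graph_xml(xml_text: str) -> Tuple[List[Tuple[str, str]],
                                            List[Tuple[str, str]]]:
    """Read the structure back, as a client-side consumer would."""
    root = ET.fromstring(xml_text)
    nodes = [(n.get("id"), n.text) for n in root.findall("node")]
    edges = [(e.get("source"), e.get("target")) for e in root.findall("edge")]
    return nodes, edges
```

Keeping nodes and edges as flat sibling elements, rather than nesting edges inside nodes, keeps the payload small and lets a client parse it in a single pass, which matches the stated goal of a lightweight format.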

Page 12: Ircdl damico del-bimbo-meoni

Graphical User Interface

•  the user interface uses graph-based representation techniques to maximize data comprehension and the analysis of relations

•  two visual paradigms are proposed, each using a different metaphor to represent information

•  interactive functions are implemented to let users explore the structure of the data

Page 13: Ircdl damico del-bimbo-meoni

Information visualization

•  each node of the interface is a graphically enriched element representing the following group of information:
   –  a cluster returned by the main data source
   –  related content from the multimedia repositories
   –  related content from the social repository

•  each node is shaped according to the relations among all the information of the corresponding group

•  a line connecting two different nodes represents the semantic relation between groups

Page 14: Ircdl damico del-bimbo-meoni

Content access

•  Users can inspect all the information sources related to each node of the graphs that contain the search results, such as web pages, blog entries, images and videos related to the subject of the query

•  Contents can be accessed by clicking on buttons under each node name

Page 15: Ircdl damico del-bimbo-meoni

Geometric paradigm

•  each result is represented by a geometric figure that varies in size, colour and shape type:
   –  size represents the number of results returned by the main data source
   –  colour represents the number of results returned by the multimedia repositories
   –  shape type represents the number of results returned by the social repository (starting from the basic shape of a triangle)
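
One way to picture this encoding is a function from the three result counts to the three visual attributes. The slides do not give the actual scales, so every number below (radius range, colour ramp, one extra polygon side per 10 social results, cap at 8 sides) is an assumption chosen only to make the sketch concrete; `Glyph` and `to_glyph` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    radius: float   # driven by the main data source
    colour: str     # driven by the multimedia repositories, as "#rrggbb"
    sides: int      # driven by the social repository; 3 means a triangle


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def to_glyph(main_count: int, media_count: int, social_count: int) -> Glyph:
    """Map three result counts to visual attributes (all scales are assumed)."""
    # Size: 10 px base radius plus 1 px per main result, capped at 60 px.
    radius = clamp(10 + main_count, 10, 60)
    # Colour: interpolate from light grey (no media) to red (50 or more media).
    t = clamp(media_count / 50, 0.0, 1.0)
    level = round(220 * (1 - t))
    colour = f"#{220:02x}{level:02x}{level:02x}"
    # Shape: start from a triangle and add one side per 10 social results.
    sides = int(clamp(3 + social_count // 10, 3, 8))
    return Glyph(radius=radius, colour=colour, sides=sides)
```

With these assumed scales, a node with no related content comes out as a small grey triangle, while a node rich in all three kinds of content becomes a large red polygon with more sides.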

Page 16: Ircdl damico del-bimbo-meoni

Example of geometric user interface

Page 17: Ircdl damico del-bimbo-meoni

Urban paradigm

•  results are represented by elements of an urban landscape:
   –  building type represents the number of results returned by the main data source
   –  trees represent the number of results returned by the multimedia repositories
   –  people represent the number of results returned by the social repository

Page 18: Ircdl damico del-bimbo-meoni

Example of urban visual interface

Page 19: Ircdl damico del-bimbo-meoni

Experimental analysis

•  testing focused on evaluating the quality of the two visual paradigms and the usability of the system

•  11 users with different cultural backgrounds and expertise were selected:
   –  3 students and researchers of the Media Integration and Communication Center
   –  3 students of the Master in Multimedia Content Design
   –  5 non-technical users

•  the test was conducted in different sessions according to the following methods:
   –  trained testing: participants were first given a brief tutorial (lasting about 10 minutes) about the functions of the system
   –  untrained testing: participants were required to complete the test without any previous knowledge of the system

Page 20: Ircdl damico del-bimbo-meoni

User test methodology

•  each participant was asked to find a document, image or video about a topic using a keyword given by the test supervisor

•  the tasks assigned in the experiments were:
   –  task 1: find an installation guide for the Ubuntu operating system, using the keyword ubuntu
   –  task 2: find a web page describing the climate conditions that can be expected in Italy, using the keyword Italy
   –  task 3: find the name of the founder of the social network Facebook, using the keyword facebook
   –  task 4: find an image of one or more players of American Football, using the keyword football

•  tasks were followed by a short interview in which subjects were asked about their experiences and their understanding of the interface, data representations and visual paradigms

•  parameters used to evaluate the system:
   –  the number of interactions (mouse clicks) used to complete a task
   –  the time (in seconds) spent completing a task
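
The two evaluation parameters lend themselves to a simple per-run record and a per-paradigm summary. The sketch below is only one possible way to organize such measurements; the `TaskRun` fields and the choice of the arithmetic mean as the summary statistic are assumptions, and no values from the study are reproduced here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class TaskRun:
    participant: str   # anonymized identifier (hypothetical field)
    paradigm: str      # e.g. "geometric", "urban" or a baseline interface
    task: int          # task number, 1 to 4
    clicks: int        # mouse clicks used to complete the task
    seconds: float     # time spent completing the task


def summarize(runs: Iterable[TaskRun]) -> Dict[Tuple[str, int], Tuple[float, float]]:
    """Return {(paradigm, task): (mean clicks, mean seconds)}."""
    grouped: Dict[Tuple[str, int], List[TaskRun]] = {}
    for run in runs:
        grouped.setdefault((run.paradigm, run.task), []).append(run)
    return {
        key: (mean(r.clicks for r in group), mean(r.seconds for r in group))
        for key, group in grouped.items()
    }
```

Grouping by paradigm and task keeps the comparison like-for-like: an interface is only compared with another on the same task, since tasks differ in difficulty.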

Page 21: Ircdl damico del-bimbo-meoni

Trained testing results

•  9 users were assigned to this test (3 users for each results presentation paradigm)

•  Google was used as a baseline, to compare results against a traditional visualization interface

Page 22: Ircdl damico del-bimbo-meoni

Untrained testing results

•  2 users were assigned to this test, one for each visualization paradigm

Page 23: Ircdl damico del-bimbo-meoni

Conclusions and future work

•  This paper presented a framework for visualizing heterogeneous information from the World Wide Web

•  Given a query string, the proposed system extracts the results from a web clustering engine and represents them according to a graph-based visualization technique

•  The GUI allows the end user to explore the information space and visualize related content extracted from different resources, such as multimedia databases and social networks

•  Two different visualization paradigms have been developed and tested in usability experiments, to evaluate how effectively they help end users understand the categories and semantic relationships among the search results, and thus retrieve web documents more efficiently

•  Future work will address an extended experimental evaluation with different user interfaces, to overcome the difficulties highlighted in the experiments, as well as an expansion of the methods used for extracting and linking multimedia content related to the textual searches