Folksonomies: Indexing and Retrieval in Web 2.0
By Isabella Peters
Presented by: Curtis Naphan & Shahid Zia Qaisrani
CMN 5150, Fall 2011
On the Author
• Dr. Isabella Peters, M.A.
• Specializes in Information Science
• Researcher and Lecturer at Heinrich-Heine-Universität in Düsseldorf, Germany
Source: http://www.isi2011.de/programm/vortrag.php?id=9
On the Book
• Published in 2009
• Part of the Knowledge & Information book series
• Originally in German
• Thorough and “sober” analysis of folksonomies
• Not casual reading but a good resource for those who need to know
Source: http://www.amazon.com/Folksonomies-Indexing-Retrieval-Knowledge-Information/dp/images/3598251793
How the Book is Structured
Overview of Collaborative Information Services
• Where are folksonomies being used today?
• What are the various characteristics?
Overview of Terminology and Models
• What are some relevant concepts in folksonomies?
• What are the alternatives?
Folksonomies for Knowledge Representation
• How can folksonomies help capture knowledge?
• What are the benefits and drawbacks?
Folksonomies for Information Retrieval
• How can folksonomies help retrieve knowledge?
• How do they compare with traditional methods?
Overview of Folksonomies
A look at how folksonomies are being used today
What are they?
• The use of tags to index and retrieve content
[Figure: “FOLKSONOMY” illustration with example tags: dog, funny, gun, spot]
Why are they used?
• Web 2.0
– User-generated content
– Little formal curation
• Taxonomies too restrictive
– “If hierarchies were a good way to organize links, Yahoo would be king of the hill and Google an also-ran service.” (Shirky, 2004)
• Full-text search not enough
– Non-textual resources
– Collaborative browsing
Where are they used?
• Incorporated into many applications
• Some differences:
– Tag my stuff vs. tag everyone’s stuff
– Content belongs to me vs. content is public
Social Bookmarking
• Users add bookmarks
• Users can tag bookmarks
• A link can be tagged by multiple users
• Tags aid:
– Personal retrieval
– Collaborative browsing
– Search
• Often used for personal knowledge management (PKM)
Examples
• Del.icio.us
• Diigo
• Bibsonomy
• CiteULike
[Screenshot labels: Link, Tags, Recommended Tags]
E-Commerce
• Users can tag products
• Complements search and professional directory
• Example: Amazon.com
Knowledge Bank
• For researchers and engineers
• Tag Widget
• Simple and Advanced Search
• Boolean AND
• Multi-user tagging
• Example: Engineering Village
Streaming Radio
• Example: last.fm
• Songs streamed and played up to 3 times
• Remunerated for playback
• Collaborative rating system
• Taste and listening habits
• Tag-based recommender system
Libraries, Museums
• Tagging real-life objects via web
• Complements traditional indexing methods
Examples
• LibraryThing
• Stevemuseum
[Figure: example object with tags: roman, cicero, marble, bust]
Photosharing
• Users can tag any photo
• Aids search, browsing
Examples
• Flickr
• Tagged and rated blogs
• Search engine and directory
• Tag generator code
• Slightly different implementation
– Tags extracted from #hashed keywords
• Twitter adds:
– Users following users
– Messages linked to @users
[Figure: annotated tweet showing the user, a link to another user, and a hashtag]
Tagging Games
Overall Remarks
• Each application’s implementation of folksonomies is different
• Subject matter is crucial
– Altruism is rare (Wikipedia)
– Personal gain is an important motivation (del.icio.us)
• Implementation is important
– Must be easy to use
– Often few features
• Usefulness tends to increase when alternative indexing and retrieval methods are insufficient
Knowledge Representation
How folksonomies are used to capture knowledge
Overview of Knowledge Representation
• Types of Data
• Broad versus Narrow
• Tag Distribution
• Tag Gardening
• User Behaviour
• Advantages
• Disadvantages
The Tripartite Hypergraph
• 3 types of data
– Users/Identity
– Resources/Object
– Tags/Metadata
• 3 types of graphs
– User-Tag-Resource
– User-Tag-User
– Resource-Tag-Resource
Source: http://www.preoccupations.org/2007/10/thomas-vander-w.html
User-Tag-Resource Graph
• Answers the question “Which resources relate to which user?”
• Useful for PKM and browsing through interesting users’ resources
[Figure: graph linking User nodes to Resource nodes through Tag nodes]
User-Tag-User Graph
• Answers the question “Which users are similar?”
• Useful for finding users with similar interests
• Similarity can be measured by connected edges
[Figure: User nodes linked to one another through Tag nodes; one pair is labeled “Very similar users (e=2)”]
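The "connected edges" measure can be sketched in Python. This is a minimal sketch, assuming the folksonomy is stored as (user, tag, resource) triples and that similarity is simply the number of tags two users share. The sample data and the function names `tags_by_user` and `shared_tag_counts` are illustrative assumptions, not taken from the book.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tagging data: (user, tag, resource) triples.
TAGGINGS = [
    ("ann", "dog", "photo1"),
    ("ann", "funny", "photo1"),
    ("bob", "dog", "photo2"),
    ("bob", "funny", "photo3"),
    ("cat", "dog", "photo2"),
    ("cat", "marble", "photo4"),
]


def tags_by_user(taggings):
    """Collect the set of tags each user has applied."""
    result = defaultdict(set)
    for user, tag, _resource in taggings:
        result[user].add(tag)
    return dict(result)


def shared_tag_counts(taggings):
    """Count the tags shared by each pair of users (paths through a tag node)."""
    profiles = tags_by_user(taggings)
    counts = {}
    for a, b in combinations(sorted(profiles), 2):
        counts[(a, b)] = len(profiles[a] & profiles[b])
    return counts


similarity = shared_tag_counts(TAGGINGS)
for pair, shared in sorted(similarity.items()):
    print(pair, shared)
```

Here "ann" and "bob" share two tags, while each shares only one with "cat". Grouping the triples by resource instead of by user would give the analogous resource-to-resource measure.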
Resource-Tag-Resource Graph
• Answers the question “Which resources are similar?”
• Useful for finding related resources
• Similarity can be measured by connected edges
[Figure: Resource nodes linked to one another through Tag nodes; one pair is labeled “Highly related resources! (e=2)”]
Broad Folksonomies
• A resource can be tagged with the same tag more than once (by different users)
– E.g. del.icio.us, CiteULike, Connotea, Bibsonomy
– Tend to be link-based resources
• Can calculate tag frequency per item
• Can enable tag recommender systems
Source: Thomas vander Wal, 2005
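Per-item tag frequency, and a very simple recommender built on it, can be sketched as follows. This is a minimal sketch, assuming (user, tag, resource) triples as input. The URLs, the sample data, and the functions `tag_frequencies` and `recommend_tags` are hypothetical, and "recommend the most frequent tags" is only one possible strategy.

```python
from collections import Counter, defaultdict

# Hypothetical (user, tag, resource) triples from a bookmarking service.
TAGGINGS = [
    ("ann", "python", "http://example.org/a"),
    ("bob", "python", "http://example.org/a"),
    ("cat", "tutorial", "http://example.org/a"),
    ("ann", "news", "http://example.org/b"),
]


def tag_frequencies(taggings):
    """Count how many times each tag was applied to each resource."""
    freq = defaultdict(Counter)
    for _user, tag, resource in taggings:
        freq[resource][tag] += 1
    return freq


def recommend_tags(freq, resource, limit=3):
    """Suggest the tags most often applied to this resource by other users."""
    return [tag for tag, _count in freq[resource].most_common(limit)]


freq = tag_frequencies(TAGGINGS)
print(recommend_tags(freq, "http://example.org/a"))
```

If each tag could be attached to a resource only once, every count would be 0 or 1 and this ranking would carry no information.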
Narrow Folksonomies
• A resource can be tagged with a certain tag only once
– E.g. Flickr, Amazon, YouTube
– Tend to be non-textual resources
– Resources are inherently unique
– Duplicates cannot be detected easily
• Tag occurrence for a resource is either 0 or 1
Source: Thomas vander Wal, 2005
Tag Distribution
• Tends to follow a power law: a few tags are used very often, followed by a long tail of rarely used tags
• Long Tail tags tend to be either useless (personal, synonyms, general) or high-value discriminators
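A power-law distribution shows up as a roughly straight line when tag frequency is plotted against rank on log-log axes. The sketch below checks this on synthetic data. It assumes a plain least-squares fit on the logged values, which is only a rough diagnostic and not a rigorous estimator. The data and the function names `rank_frequency` and `loglog_slope` are illustrative assumptions.

```python
import math
from collections import Counter


def rank_frequency(tags):
    """Return (rank, frequency) pairs, most frequent tag first."""
    counts = sorted(Counter(tags).values(), reverse=True)
    return list(enumerate(counts, start=1))


def loglog_slope(points):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank, _freq in points]
    ys = [math.log(freq) for _rank, freq in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic example: the tag at rank r is used 120 / r times.
tags = [f"tag{r}" for r in range(1, 7) for _ in range(120 // r)]
points = rank_frequency(tags)
slope = loglog_slope(points)
print(points)
print(round(slope, 3))
```

A slope near -1 on this synthetic data reflects the 1/rank construction. Real tag data would be noisier, and the tail usually needs more careful treatment.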
Tag Gardening
• Is the attempt to address tagging problems, such as:
– Synonyms (dog, doggy, dogs)
– Multilingualism (dog, chien, Hund, perro)
– Homonyms (jaguar[cat], jaguar[car])
– “Spagging” (spam tagging)
– Semantic Enrichment (dog is a mammal, poodle is a type of dog, london and paris are cities)
– Personalisms (toread, willbuy, cmn5150)
– Misspellings and orthographic variation (uottawa, u-ottawa, u_ottawa, uotawa)
• Must be either:
– User-guided and personal
– Community-wide and automatic but invisible
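Two of the problems listed above, orthographic variation and synonyms, lend themselves to a small automatic sketch. This assumes a hand-maintained synonym table and a crude normalization rule (lowercase, strip separators). The table `SYNONYMS` and the functions `normalize` and `garden` are hypothetical, not a method described in the book.

```python
import re

# Hypothetical, hand-maintained table mapping variants to a preferred tag.
SYNONYMS = {
    "doggy": "dog",
    "dogs": "dog",
    "chien": "dog",
    "hund": "dog",
    "perro": "dog",
}


def normalize(tag):
    """Lowercase a tag and drop separators such as '-', '_' and spaces."""
    return re.sub(r"[-_\s]+", "", tag.lower())


def garden(tags):
    """Map raw tags to preferred forms, keeping first-seen order without repeats."""
    seen = []
    for raw in tags:
        tag = normalize(raw)
        tag = SYNONYMS.get(tag, tag)
        if tag not in seen:
            seen.append(tag)
    return seen


cleaned = garden(["Dogs", "doggy", "Hund", "u-ottawa", "u_ottawa", "uottawa", "uotawa"])
print(cleaned)
```

The genuine misspelling "uotawa" survives this pass. Catching it would need something extra, such as an edit-distance comparison against known tags, and homonyms like "jaguar" cannot be resolved from the tag string alone.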
Summary of Advantages
• Authentic Language
• Actuality/Neologisms
• Multiple interpretations
• Cheap indexing – distributed workload
• More taggers, better effect – scales well
• Identify communities and “small worlds”
• Recommendation systems
• Familiarize users with indexing system
• Faster than classifying in a taxonomy
• Good user recollection
Summary of Disadvantages
• Lack of a controlled vocabulary
• The context of indexing is lost
• Languages are mixed
• Hidden relations are unexploited
• Spam tags, user-specific tags, unclear keywords
• Resources are indexed as a whole
• Social character of tags is mostly invisible
• Cold start problem
Information Retrieval
How folksonomies are used to retrieve information
Retrieval with Folksonomies
• Search
– Works much like full-text search
– Puts more weight on tag hits
• Browse
– Filter by tag
– Uses tag clouds and other tools
– Allows for “serendipitous” discovery
• Visualize
– Discover patterns in tags
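The idea of a full-text search that puts more weight on tag hits can be sketched with a toy scoring function. This is a minimal sketch, assuming a simple additive score. The weight of 3.0, the sample documents, and the function `score` are arbitrary assumptions for illustration and do not describe any particular service.

```python
def score(query_terms, text, tags, tag_weight=3.0):
    """Score a resource: one point per text occurrence, plus a bonus per matching tag."""
    words = text.lower().split()
    tag_set = {tag.lower() for tag in tags}
    total = 0.0
    for term in query_terms:
        term = term.lower()
        total += words.count(term)      # plain full-text hits
        if term in tag_set:
            total += tag_weight         # tag hits count for more
    return total


# Hypothetical resources: (text, tags).
DOCS = {
    "a": ("a guide to tagging photos", ["tagging"]),
    "b": ("tagging tagging everywhere", []),
}

scores = {name: score(["tagging"], text, tags) for name, (text, tags) in DOCS.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Resource "a" mentions the term only once in its text but is tagged with it, so it outranks "b", which repeats the term without being tagged.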
Tag Filtering
• Tag filtering is the mechanism for filtering a list of resources by tag
– Scope: my own resources, another person’s, or the community’s
• Usually assumes an AND relation between tags
• Can be implemented with clicks only or with text input
• Could support more advanced filtering
– e.g. newyork & (cats | dogs)
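Both the default AND filter and the more advanced example can be expressed with set operations. This is a minimal sketch, assuming each resource carries a set of tags. The resource names, the sample tags, and the functions `filter_all` and `filter_all_and_any` are hypothetical.

```python
# Hypothetical resources, each with its set of tags.
RESOURCES = {
    "photo1": {"newyork", "cats"},
    "photo2": {"newyork", "dogs", "funny"},
    "photo3": {"paris", "dogs"},
    "photo4": {"newyork", "skyline"},
}


def filter_all(resources, required):
    """AND filter: keep resources that carry every required tag."""
    required = set(required)
    return sorted(name for name, tags in resources.items() if required <= tags)


def filter_all_and_any(resources, required, alternatives):
    """Keep resources with every required tag and at least one alternative tag."""
    required = set(required)
    alternatives = set(alternatives)
    return sorted(
        name
        for name, tags in resources.items()
        if required <= tags and tags & alternatives
    )


print(filter_all(RESOURCES, ["newyork", "dogs"]))
print(filter_all_and_any(RESOURCES, ["newyork"], ["cats", "dogs"]))
```

The second call corresponds to the query newyork & (cats | dogs). A full implementation would need a parser for arbitrary Boolean expressions, which this sketch leaves out.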
Searching on del.icio.us
Browsing on Diigo
Visualizing del.icio.us with Delicious Soup
Topigraphy
Source: Fujimura, 2008
Concluding Remarks
With applications to Knowledge Management
Folksonomies and KM
• Familiarization with tags
• Recommender systems
• Adding tags to resources
• Information retrieval via tags
• Tag clouds, tag search
• Visualization tools
• Tag gardening
• Automatic processing
[Diagram labels: Socialization, Externalization, Combination, Internalization]
Folksonomies and Ba
User Issues
• Insight into community mind via tag clouds, visualizations, recommender systems
• Promotion of “tagiquette”
• Leveraging selfishness
• Integration into traditional taxonomies
Technical Issues
• Intra- and inter-linguistic issues
• Inter-platform issues
• Spam detection
• Fair relevance rankings
• Integrated visualization tools
How can the environment (ba) contribute to the management of knowledge?
Conclusion
• Folksonomies are a powerful, and sometimes necessary, way of managing Web 2.0 content
• A functionality, not an application in itself
• Can complement traditional techniques such as ontologies, hierarchies, and full-text search
• Success depends on:
– Number of users
– Quality of implementation
– Suitability of resource for tagging
– Automatic tag management algorithms
– Unsuitability of alternative classification and retrieval mechanisms
References
• Fujimura, K. “Topigraphy: Visualization for Large-Scale Tag Clouds” (2008), WWW2008.
• Nonaka, I. “The Concept of ‘Ba’: Building a Foundation for Knowledge Creation”, California Management Review, Vol. 40, No. 3, Spring 1998, p. 40-54.
• Peters, I. “Folksonomies: Indexing and Retrieval in Web 2.0” (2009), De Gruyter.
• Peters, I. “Folksonomies Indexing Und Retrieval In Bibliotheken” (2010). Retrieved from http://www.slideshare.net/Isabellapeters/folksonomies-indexing-und-retrieval-in-bibliotheken
• Peters, I. & Weller, K. “Tag Gardening for Folksonomy Enrichment and Maintenance” (2008). Retrieved from http://www.webology.org/2008/v5n3/a58.html
• Smith, G. “Visual Folksonomy Explanation” (2005). Retrieved from http://atomiq.org/archives/2005/01/visual_folksonomy_explanation.html
• Vander Wal, T. “Explaining and Showing Broad and Narrow Folksonomies” (2005). Retrieved from http://personalinfocloud.com/2005/02/explaining_and_.html
Questions?
Source: Larson, 1987