Semantically Enhanced Desktop Search Using Directory-Based
Clustering and Wordnet Knowledge
Ştefania GHIŢĂ
29.10.2004
Hannover 2
Forschungszentrum L3S
Content
Project Overview
Google
Purpose
Structure
Photo Prototype
Offline Content Prototype
Conclusions
Project Overview
Background
  Search: no personalization (user preferences), no context
  Topic classification in DMOZ
Purpose
  Contextualize / personalize search using additional metadata
Advantages
  Precision of search
  Expressiveness of search results
A possible solution – indexing data on the PC (Google):
  Increases search efficiency
  Doesn’t use specific characteristics of the user, such as:
    Folder hierarchies
    Browser caches
Purpose
Finding new solutions for:
  Increasing precision of search according to the user’s profile
  Expressiveness of search results by adding additional information to the search
  Ranking the search results
Metadata as the answer to these problems
Structure
How to characterize and obtain a user profile
Define metadata models for different types of information
Automatically generating such metadata
Enriching data by adding additional information: Wordnet
Extending additional information using file structure and user behaviour
Search engine that uses the metadata
Photo prototype
/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg
<location_info>Holidays</location_info>
…
<location_info>building</location_info>
<lastModified>date</lastModified>
<sizeBytes>XX</sizeBytes>
<resolution>0</resolution>
<sizeX>(pixels)</sizeX>
<sizeY>(pixels)</sizeY>
<colorScheme>X</colorScheme>
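A minimal sketch of how the location_info entries above could be derived from a file path. The function names, the `root` parameter (used here to drop the "My Pictures" prefix, since the slide starts at "Holidays") and the example size and date values are assumptions made for illustration, not the prototype's actual code. The image-specific fields (resolution, sizeX, sizeY, colorScheme) are left out because they would need an image library.

```python
from pathlib import PurePosixPath
from xml.sax.saxutils import escape


def location_terms(path, root):
    """Return one term per folder below `root`, plus the file name without its extension."""
    relative = PurePosixPath(path).relative_to(root)
    *folders, filename = relative.parts
    return list(folders) + [PurePosixPath(filename).stem]


def metadata_lines(path, root, size_bytes, last_modified):
    """Build flat metadata lines in the style shown on the slide (hypothetical helper)."""
    lines = [
        f"<location_info>{escape(term)}</location_info>"
        for term in location_terms(path, root)
    ]
    lines.append(f"<lastModified>{escape(last_modified)}</lastModified>")
    lines.append(f"<sizeBytes>{size_bytes}</sizeBytes>")
    return lines


if __name__ == "__main__":
    # Size and date are made-up placeholder values.
    for line in metadata_lines(
        "/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg",
        "/My Pictures",
        1024,
        "2004-10-26",
    ):
        print(line)
```

Treating every folder name as a term is what lets a query such as "Hannover" find a picture whose file name says nothing about the city.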
Enriching Data with Wordnet
Holidays/Germany/Hannover
RDF
Add Wordnet extensions:
  Synonyms
  Holonyms (Germany is a part of …)
  Meronyms (Germany has part …)
  Hypernyms (Holiday is a kind of …)
  Hyponyms (… is a kind of Holiday)
  Troponyms
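The enrichment step can be illustrated with a small sketch. It does not call Wordnet: `TOY_LEXICON` is a hand-written stand-in with a few illustrative entries, and the names `lookup` and `enrich` as well as the naive plural handling are assumptions. A real implementation would query a Wordnet library for each relation type instead.

```python
# Hand-written toy lexicon standing in for Wordnet; entries are illustrative only.
TOY_LEXICON = {
    "holiday": {"synonyms": ["vacation"], "hypernyms": ["leisure"]},
    "germany": {"holonyms": ["Europe"]},
}


def lookup(term, lexicon):
    """Find a term's relations, trying a naive singular form if needed."""
    key = term.lower()
    if key not in lexicon and key.endswith("s"):
        key = key[:-1]  # crude plural stripping, e.g. "holidays" -> "holiday"
    return lexicon.get(key, {})


def enrich(path_terms, lexicon=TOY_LEXICON):
    """Map each folder term to the related words known for it (empty if unknown)."""
    return {
        term: {relation: list(words) for relation, words in lookup(term, lexicon).items()}
        for term in path_terms
    }


if __name__ == "__main__":
    for term, relations in enrich(["Holidays", "Germany", "Hannover"]).items():
        print(term, relations)
```

Terms with no entry, such as "Hannover" here, simply stay unenriched, so the original folder name remains searchable either way.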
Example
<rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\cat.jpg">
  <j.0:location_info>C:\Stefi\</j.0:location_info>
  <j.0:location_info>C:\Stefi\L3S\</j.0:location_info>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\">
      <j.0:sense>beautiful</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info rdf:resource="file:\\C:\Stefi\L3S\beautiful\home\"/>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\">
      <j.0:sense>plant</j.0:sense>
      <j.0:sense>establish</j.0:sense>
      <j.0:sense>implant</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info>cat</j.0:location_info>
  <j.0:sense>cat</j.0:sense>
  <j.0:sense>kat</j.0:sense>
  <j.0:sense>guy</j.0:sense>
  <j.0:sense>cat-o'-nine-tails</j.0:sense>
  <j.0:sense>big_cat</j.0:sense>
  <j.0:sense>vomit</j.0:sense>
  <j.0:sense>Caterpillar</j.0:sense>
  <j.0:sense>computerized_tomography</j.0:sense>
  <j.0:lastModified>Tue Oct 26 17:36:44 CEST 2004</j.0:lastModified>
  <j.0:sizeBytes>291851</j.0:sizeBytes>
</rdf:Description>
</rdf:RDF>
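To suggest how a search engine might use such a description, here is a hedged sketch that matches a query against both the path terms and the sense terms. The record layout, the second file (`garden\plant.jpg`) and the two weights are hypothetical; the slides do not specify a ranking scheme.

```python
# Arbitrary illustrative weights: a direct folder/file-name hit counts more than a sense hit.
PATH_TERM_WEIGHT = 2
SENSE_WEIGHT = 1

RECORDS = [
    {
        "file": r"C:\Stefi\L3S\beautiful\home\plant\cat.jpg",
        "location_info": ["beautiful", "home", "plant", "cat"],
        "senses": ["beautiful", "plant", "establish", "implant", "cat", "kat", "big_cat"],
    },
    {
        # Hypothetical second file, added only to show the ordering of results.
        "file": r"C:\Stefi\L3S\garden\plant.jpg",
        "location_info": ["garden", "plant"],
        "senses": [],
    },
]


def score(record, query):
    """Score one record: path-term match plus sense match, case-insensitive."""
    q = query.lower()
    total = 0
    if q in (term.lower() for term in record["location_info"]):
        total += PATH_TERM_WEIGHT
    if q in (sense.lower() for sense in record["senses"]):
        total += SENSE_WEIGHT
    return total


def search(records, query):
    """Return matching file paths, best score first."""
    scored = [(score(record, query), record["file"]) for record in records]
    hits = [(s, f) for s, f in scored if s > 0]
    hits.sort(key=lambda pair: -pair[0])
    return [f for _, f in hits]


if __name__ == "__main__":
    print(search(RECORDS, "implant"))
    print(search(RECORDS, "plant"))
```

A query for "implant" finds cat.jpg only through the senses attached to the "plant" folder, which is the kind of match plain file-name search would miss.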
Offline Content Prototype
Additional information for the user’s profile
Browsing behaviour
Relevant results
Additional context for results
Structure:
  ID of the page
  Date of access
  Link from which the user came
  Links accessed on the page
  Other annotations of the content
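The record structure listed above could be modelled roughly as follows. The class name, field names and the `referrer_chain` helper are assumptions chosen for illustration; the URLs are placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class PageVisit:
    page_id: str                      # ID of the page
    accessed_at: datetime             # date of access
    referrer: Optional[str] = None    # link from which the user came
    links_followed: List[str] = field(default_factory=list)   # links accessed on the page
    annotations: Dict[str, str] = field(default_factory=dict)  # other annotations of the content


def referrer_chain(visits, page_id):
    """Walk back through referrers to recover how the user reached a page."""
    chain = []
    seen = set()
    current = page_id
    while current in visits and current not in seen:  # stop on unknown pages or loops
        seen.add(current)
        chain.append(current)
        current = visits[current].referrer
    return chain


if __name__ == "__main__":
    when = datetime(2004, 10, 29)
    visits = {
        "http://example.com/a": PageVisit("http://example.com/a", when),
        "http://example.com/b": PageVisit("http://example.com/b", when, referrer="http://example.com/a"),
        "http://example.com/c": PageVisit("http://example.com/c", when, referrer="http://example.com/b"),
    }
    print(referrer_chain(visits, "http://example.com/c"))
```

The chain of referrers is one way to supply the "additional context for results": a cached page can be shown together with the pages that led to it.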
Conclusion
Metadata models for contextualized search for different types of files
Tools for automatically generating metadata
Tools for enriching metadata
Search engine and algorithms that use the metadata