

Semantically Enhanced Desktop Search Using Directory-Based Clustering and Wordnet Knowledge

Ştefania GHIŢĂ


29.10.2004, Hannover, Forschungszentrum L3S

Content

Project Overview
Google
Purpose
Structure
Photo Prototype
Offline Content Prototype
Conclusions


Project Overview

Background
  Search
    No personalization (user preferences)
    No context
  Topic classification in DMOZ

Purpose
  Contextualize / personalize search using additional metadata

Advantages
  Precision of search
  Expressiveness of search results


Google

A possible solution: indexing data on the PC (Google)
  Increases search efficiency
  Doesn’t use specific characteristics of the user, such as:
    Folder hierarchies
    Browser caches


Purpose

Finding new solutions for:
  Increasing precision of search according to the user’s profile
  Expressiveness of search results by adding additional information to the search
  Ranking the search results

Metadata as the answer to these problems


Structure

How to characterize and obtain a user profile
Define metadata models for different types of information
Automatically generating such metadata
Enriching data by adding additional information: Wordnet
Extending additional information using file structure and user behaviour
Search engine that uses the metadata


Photo prototype

/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg

<location_info>Holidays</location_info>
…
<location_info>building</location_info>
<lastModified>date</lastModified>
<sizeBytes>XX</sizeBytes>
<resolution>0</resolution>
<sizeX>(pixels)</sizeX>
<sizeY>(pixels)</sizeY>
<colorScheme>X</colorScheme>
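A minimal sketch of how the path-derived fields above might be generated. The function names `location_terms` and `file_metadata` are hypothetical, the path is assumed to use forward slashes, and every folder (including "My Pictures") is kept as a term, which the slide does not confirm. The image-specific fields (resolution, sizeX, sizeY, colorScheme) are left out because they would need an image library.

```python
import os
import time
from pathlib import PurePosixPath


def location_terms(path: str) -> list:
    """Return one term per folder on the path, followed by the file name stem."""
    p = PurePosixPath(path)
    # Skip the root "/" so that only real folder names remain.
    folders = [part for part in p.parent.parts if part != "/"]
    return folders + [p.stem]


def file_metadata(path: str) -> dict:
    """Collect the non-image fields from the slide for a file that exists on disk."""
    info = os.stat(path)
    return {
        "location_info": location_terms(path),
        "lastModified": time.ctime(info.st_mtime),
        "sizeBytes": info.st_size,
    }


terms = location_terms("/My Pictures/Holidays/Germany/Hannover/Rathaus/building.jpg")
```

Here `terms` holds the folder names in order, ending with `building`; each entry would become one `location_info` element.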


Enriching Data with Wordnet

Holidays/Germany/Hannover

RDF

Add Wordnet extensions:
  Synonyms
  Holonyms (Germany is a part of …)
  Meronyms (Germany has part …)
  Hypernyms (Holiday is a kind of …)
  Hyponyms (… is a kind of Holiday)
  Troponyms
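The relation types listed above can be sketched as a small data structure plus a lookup step. This is only an illustration: `Relation`, `TOY_LEXICON`, and `expand` are hypothetical names, and the lexicon is a hand-written stand-in with a few geographic entries, not actual Wordnet output. A real system would query Wordnet in place of the dictionary.

```python
from enum import Enum


class Relation(Enum):
    SYNONYM = "synonym"
    HOLONYM = "holonym"    # X is a part of ...
    MERONYM = "meronym"    # X has part ...
    HYPERNYM = "hypernym"  # X is a kind of ...
    HYPONYM = "hyponym"    # ... is a kind of X
    TROPONYM = "troponym"


# Hand-written toy entries (assumption), standing in for a Wordnet lookup.
TOY_LEXICON = {
    ("Germany", Relation.HOLONYM): ["Europe"],
    ("Germany", Relation.MERONYM): ["Hannover"],
    ("Hannover", Relation.HOLONYM): ["Germany"],
}


def expand(terms, lexicon, relations=tuple(Relation)):
    """Map each folder term to the related words found for each relation type."""
    expanded = {}
    for term in terms:
        found = {}
        for relation in relations:
            words = lexicon.get((term, relation), [])
            if words:
                found[relation] = list(words)
        expanded[term] = found
    return expanded


extensions = expand(["Holidays", "Germany", "Hannover"], TOY_LEXICON)
```

Terms with no entry, such as `Holidays` in this toy lexicon, simply map to an empty dictionary.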


Example

<rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\cat.jpg">
  <j.0:location_info>C:\Stefi\</j.0:location_info>
  <j.0:location_info>C:\Stefi\L3S\</j.0:location_info>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\">
      <j.0:sense>beautiful</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info rdf:resource="file:\\C:\Stefi\L3S\beautiful\home\"/>
  <j.0:location_info>
    <rdf:Description rdf:about="file:\\C:\Stefi\L3S\beautiful\home\plant\">
      <j.0:sense>plant</j.0:sense>
      <j.0:sense>establish</j.0:sense>
      <j.0:sense>implant</j.0:sense>
    </rdf:Description>
  </j.0:location_info>
  <j.0:location_info>cat</j.0:location_info>
  <j.0:sense>cat</j.0:sense>
  <j.0:sense>kat</j.0:sense>
  <j.0:sense>guy</j.0:sense>
  <j.0:sense>cat-o'-nine-tails</j.0:sense>
  <j.0:sense>big_cat</j.0:sense>
  <j.0:sense>vomit</j.0:sense>
  <j.0:sense>Caterpillar</j.0:sense>
  <j.0:sense>computerized_tomography</j.0:sense>
  <j.0:lastModified>Tue Oct 26 17:36:44 CEST 2004</j.0:lastModified>
  <j.0:sizeBytes>291851</j.0:sizeBytes>
</rdf:Description>
</rdf:RDF>
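The nesting in this example (a folder with senses becomes a nested description, a folder without senses becomes a plain reference) could be produced along the following lines. This is a sketch, not the prototype's code: the namespace URI `META_NS` and the prefix `meta` are placeholders for the slide's `j.0` vocabulary, whose real URI is not given, and the file URIs use forward slashes for convenience. Only sense lists that appear on the slide are used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
META_NS = "http://example.org/desktop-metadata#"  # hypothetical namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("meta", META_NS)


def q(ns, name):
    """Build the {namespace}name form that ElementTree expects."""
    return "{%s}%s" % (ns, name)


def describe(uri, senses):
    """Create an rdf:Description for uri with one sense element per word."""
    node = ET.Element(q(RDF_NS, "Description"), {q(RDF_NS, "about"): uri})
    for sense in senses:
        ET.SubElement(node, q(META_NS, "sense")).text = sense
    return node


def build_rdf(file_uri, folders, file_senses):
    """folders is a list of (folder_uri, senses) pairs, outermost folder first."""
    root = ET.Element(q(RDF_NS, "RDF"))
    description = describe(file_uri, [])
    root.append(description)
    for folder_uri, senses in folders:
        location = ET.SubElement(description, q(META_NS, "location_info"))
        if senses:
            location.append(describe(folder_uri, senses))  # nested description
        else:
            location.set(q(RDF_NS, "resource"), folder_uri)  # plain reference
    for sense in file_senses:
        ET.SubElement(description, q(META_NS, "sense")).text = sense
    return root


example = build_rdf(
    "file:///C:/Stefi/L3S/beautiful/home/plant/cat.jpg",
    [
        ("file:///C:/Stefi/L3S/beautiful/", ["beautiful"]),
        ("file:///C:/Stefi/L3S/beautiful/home/", []),
        ("file:///C:/Stefi/L3S/beautiful/home/plant/", ["plant", "establish", "implant"]),
    ],
    ["cat", "kat", "guy"],
)
example_xml = ET.tostring(example, encoding="unicode")
```

Printing `example_xml` gives a single-line RDF/XML document with the same shape as the slide's example, minus the literal-text locations and the file attributes.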


Offline Content Prototype

Additional information for the user’s profile
  Browsing behaviour
Relevant results
Additional context for results

Structure:
  ID of the page
  Date of access
  Link from which the user came
  Links accessed on the page
  Other annotations of the content
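The record structure listed above might be modelled as follows. This is a sketch under assumptions: the class name `PageVisit`, its field names, and the helper `referrer_chain` are hypothetical, and the slide does not say how the prototype stores or traverses these records.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class PageVisit:
    page_id: str                    # ID of the page
    accessed: datetime              # date of access
    referrer: Optional[str] = None  # ID of the page the user came from
    followed_links: List[str] = field(default_factory=list)     # links accessed on the page
    annotations: Dict[str, str] = field(default_factory=dict)   # other annotations


def referrer_chain(visits, page_id):
    """Follow referrer links backwards from page_id; stops on unknown pages or cycles."""
    chain = []
    seen = set()
    current = page_id
    while current is not None and current in visits and current not in seen:
        seen.add(current)
        chain.append(current)
        current = visits[current].referrer
    return chain


when = datetime(2004, 10, 26, 17, 36, 44)
visits = {
    "a": PageVisit("a", when, followed_links=["b"]),
    "b": PageVisit("b", when, referrer="a", followed_links=["c"]),
    "c": PageVisit("c", when, referrer="b"),
}
trail = referrer_chain(visits, "c")
```

Here `trail` lists the pages from `c` back to the starting page `a`, which is one way the browsing context of a result could be reconstructed.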


Conclusion

Metadata models for contextualized search for different types of files

Tools for automatically generating metadata

Tools for enriching metadata

Search engine and algorithms that use the metadata