Information Retrieval, Chapter 1: Introduction, February 21, 1999, Berthier Ribeiro-Neto


Page 1

Information Retrieval

Chapter 1: Introduction

February 21, 1999

Berthier Ribeiro-Neto

Page 2

Motivation

IR: representation, storage, organization of, and access to information items

Focus is on the user information need

User information need:

Find all docs containing information on college tennis teams which: (1) are maintained by a USA university and (2) participate in the NCAA tournament.

Emphasis is on the retrieval of information (not data)

Page 3

Motivation

Data retrieval
  which docs contain a set of keywords?
  well-defined semantics
  a single erroneous object implies failure!

Information retrieval
  information about a subject or topic
  semantics is frequently loose
  small errors are tolerated

IR system:
  interpret contents of information items
  generate a ranking which reflects relevance
  notion of relevance is most important
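The contrast between data retrieval and information retrieval can be illustrated with a small sketch. The collection, the query, and the function names `data_retrieval` and `information_retrieval` are made up for illustration; counting matching query terms is only a stand-in for a real notion of relevance, which the slides do not define at this point.

```python
# Toy collection; documents and query are invented for illustration.
DOCS = {
    "d1": "ncaa college tennis team roster",
    "d2": "college tennis tournament results",
    "d3": "professional tennis rankings",
}

def data_retrieval(query_terms, docs):
    """Exact match: return ids of docs that contain every query term."""
    return sorted(
        doc_id for doc_id, text in docs.items()
        if all(term in text.split() for term in query_terms)
    )

def information_retrieval(query_terms, docs):
    """Ranked match: order docs by how many query terms they contain."""
    scores = {}
    for doc_id, text in docs.items():
        words = set(text.split())
        score = sum(1 for term in query_terms if term in words)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

query = ["college", "tennis", "ncaa"]
exact = data_retrieval(query, DOCS)          # only full matches survive
ranked = information_retrieval(query, DOCS)  # partial matches are kept, ranked lower
```

Under these assumptions the exact-match function drops every document that misses a single keyword, while the ranked version tolerates partial matches and simply places them lower in the list.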

Page 4

Motivation

IR at the center of the stage

IR in the last 20 years:
  classification and categorization
  systems and languages
  user interfaces and visualization

Still, the area was seen as of narrow interest

Advent of the Web changed this perception once and for all
  universal repository of knowledge
  free (low cost) universal access
  no central editorial board
  many problems though: IR seen as key to finding the solutions!

Page 5

Basic Concepts

The User Task

Retrieval: information or data; purposeful

Browsing: glancing around (F1; cars, Le Mans, France, tourism)

[Figure: Retrieval and Browsing as the two user tasks over the Database]

Page 6

Basic Concepts

Logical view of the documents

Document representation viewed as a continuum: logical view of docs might shift

[Figure: logical view of the documents. Labels: Docs; structure; accents, spacing; stopwords; noun groups; stemming; manual indexing. Resulting views: structure, full text, index terms]
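A minimal sketch of how text operations shift the logical view from full text toward index terms. The stopword list, the suffix list, and the sample sentence are assumptions chosen for illustration; the suffix stripping below is a crude stand-in and not a real stemming algorithm such as Porter's. Noun-group detection and manual indexing are left out.

```python
import unicodedata

# Illustrative stopword and suffix lists; neither is a standard resource.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "are"}
SUFFIXES = ("ing", "es", "s")  # crude suffix stripping, NOT the Porter algorithm

def strip_accents(text):
    """Drop combining marks after Unicode decomposition (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def tokenize(text):
    """Lowercase, remove accents, and split on non-alphanumeric characters."""
    lowered = strip_accents(text).lower()
    cleaned = "".join(ch if ch.isalnum() else " " for ch in lowered)
    return cleaned.split()

def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Full text -> tokens -> stopword removal -> stemming."""
    return [stem(w) for w in tokenize(text) if w not in STOPWORDS]

full_text = "The résumé of a user: browsing and searching documents"
tokens = tokenize(full_text)    # close to the full-text view
terms = index_terms(full_text)  # reduced view: index terms
```

Each step discards information: `tokens` still follows the original wording, while `terms` is a smaller set of normalized index terms.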

Page 7

The Retrieval Process

[Figure: the retrieval process. Components: User Interface, Text Operations, Query Operations, Indexing, Searching, Ranking, Index, DB Manager Module, Text Database. Flow labels: user need, text, logical view, query, inverted file, retrieved docs, ranked docs, user feedback]
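The retrieval process can be sketched end to end in a few functions. This is a simplified illustration under stated assumptions, not the architecture from the slides: the function names, the toy text database, and the term-frequency score are all hypothetical, and query operations and user feedback are omitted.

```python
from collections import defaultdict

def text_operations(text):
    """Produce a logical view of a text: here, just lowercase terms."""
    return text.lower().split()

def build_inverted_file(docs):
    """Indexing: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text_operations(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def search(index, query_terms):
    """Searching: collect docs that contain at least one query term."""
    retrieved = set()
    for term in query_terms:
        retrieved.update(index.get(term, {}))
    return retrieved

def rank(index, query_terms, retrieved):
    """Ranking: summed term frequency, a placeholder for a relevance model."""
    def score(doc_id):
        return sum(index.get(term, {}).get(doc_id, 0) for term in query_terms)
    # Highest score first; ties broken by document id for determinism.
    return sorted(retrieved, key=lambda d: (-score(d), d))

# Hypothetical text database.
text_database = {
    1: "tennis tennis college",
    2: "college football",
    3: "tennis rackets",
}
inverted_file = build_inverted_file(text_database)
query = text_operations("College Tennis")  # the same text operations apply to the query
retrieved_docs = search(inverted_file, query)
ranked_docs = rank(inverted_file, query, retrieved_docs)
```

The index is built once from the text database, and each query then only touches the inverted file, never the raw text.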