
Artificial Intelligence Approaches for Information Retrieval

Outline

Artificial Intelligence (AI)

AI and IR

AI applied to IR
  Information characterisation
  Information seeking
  System Integration
  Support functions

Conclusion

References

Aims of AI

Building intelligent entities as well as understanding them

Systems that reason with information in some way
  Ex: problem solving, classification, learning, planning

Usually use some explicit representation of information and some means of manipulating information

Four goals of AI

Systems that think like humans

“The exciting new effort to make computers think ... machines with minds, in the full and literal sense” (Haugeland, 1985)

“[The automation of] activities that we associate with human thinking, activities such as decision making, problem-solving, learning ...” (Bellman, 1978)

Systems that think rationally

“The study of mental faculties through the use of computational models” (Charniak and McDermott, 1985)

“The study of the computations that make it possible to perceive, reason, and act” (Winston, 1992)

Systems that act like humans

“The art of creating machines that perform functions that require intelligence when performed by people” (Kurzweil)

“The study of how to make computers do things at which, at the moment, people are better” (Rich and Knight, 1991)

Systems that act rationally

“A field of study that seeks to explain and emulate intelligent behavior in terms of computational processes” (Schalkoff, 1990)

“The branch of computer science that is concerned with the automation of intelligent behavior” (Luger and Stubblefield)

AI and IR

“[AI] This is the use of computers to carry out tasks requiring reasoning on world knowledge, as exemplified by giving responses to questions in situation where one is dealing with only partial knowledge and with indirect connectivity”

Karen Sparck Jones, 1991

“We construe a system to be knowledge-based when its behavior depends largely on accessing or encoding information”

Call for papers - FLAIRS conferences

Areas of AI for IR

Natural language processing

Knowledge representation
  Expert systems
  Ex: Logical formalisms, conceptual graphs, etc.

Machine learning
  Short term: over a single session
  Long term: over multiple searches by multiple users

Computer Vision
  Ex: OCR

Reasoning under uncertainty
  Ex: Dempster-Shafer, Bayesian networks, probability theory, etc.

Cognitive theory
  Ex: User modelling

AI applied to IR

Four main roles investigated

Information characterisation

Search formulation in information seeking

System Integration

Support functions

Information characterisation - Approach 1 (1)

“Strongest” AI approach

Set of documents represented as a single knowledge base
  Directly manipulating the information available => Knowledge-base retrieval

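[Figure: knowledge base derived from document text. BONE "is made of" calcium; HUMERUS, RADIUS and ULNA are each linked to BONE by "is kind of". Source text fragments: “… calcium content of bones …”, “… the humerus bones …”, “… radius and ulna, bones in arm …”]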
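The bone network above can be sketched as a small set of triples with inheritance along "is kind of" links. This is only an illustration of knowledge-base retrieval, not the design of any particular system; the relation names `is_made_of` and `is_kind_of` and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical triples mirroring the bone example: (subject, relation, object).
TRIPLES = [
    ("bone", "is_made_of", "calcium"),
    ("humerus", "is_kind_of", "bone"),
    ("radius", "is_kind_of", "bone"),
    ("ulna", "is_kind_of", "bone"),
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def ancestors(index, concept):
    """Return the concept followed by everything reachable via is_kind_of."""
    seen, stack = [], [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(sorted(index[current]["is_kind_of"]))
    return seen


def lookup(index, concept, relation):
    """Collect values of `relation` for the concept and its is_kind_of ancestors."""
    values = set()
    for node in ancestors(index, concept):
        values |= index[node][relation]
    return values


index = build_index(TRIPLES)
print(lookup(index, "humerus", "is_made_of"))  # the humerus inherits from bone
```

The query is answered by manipulating the knowledge base directly: no document text is consulted, which is exactly what makes this the "strongest" approach.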

Information characterisation - Approach 1 (2)

Query: “does calcium deficiency cause Smith’s disease?”

Criticism: this is a model for a question-answering system (not “traditional IR”)

Replace document text (natural language) with a knowledge base in an artificial language
  Much of the (textual) information is lost
  What will be put in the knowledge base?
  Issue of information extraction

Problems with large collections

Successful in specific domains (SCISOR)

Information characterisation - Approach 2

Weaker view of AI

Keep documents and use knowledge base as access tool (query formulation)
  Semantic-based access, concept-based access
  Interface and presentation

Better classification of document text and better access

Criticism: problems of (automatic) linkages (documents have different style, language and level of discussion)


Even weaker AI approach

Abandon knowledge base but use AI (syntactic level) to characterise document content

Sophisticated matching

Use NLP to derive
  Noun phrases: “The mother of Jane <=> Jane’s mother”
  Sentences: “The boy ate the apple <=> The apple was eaten by the boy”

Normalisation is necessary!

Very little evidence of success (so far)
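As a rough illustration of the normalisation step, the sketch below maps the two noun-phrase forms from the example onto one (head, modifier) pair. It uses two regular expressions, which is an assumption made to keep the example short; a real system would rely on a parser, and the function name is hypothetical.

```python
import re

# Matches "<optional article> <head> of <modifier>", e.g. "The mother of Jane".
OF_PHRASE = re.compile(r"^(?:(?:the|a|an)\s+)?(\w+)\s+of\s+(\w+)$", re.IGNORECASE)
# Matches "<modifier>'s <head>", e.g. "Jane's mother" (straight or curly apostrophe).
POSSESSIVE = re.compile(r"^(\w+)['’]s\s+(\w+)$", re.IGNORECASE)


def normalise_noun_phrase(phrase):
    """Map both surface forms to one lower-cased (head, modifier) pair."""
    text = phrase.strip()
    match = OF_PHRASE.match(text)
    if match:
        head, modifier = match.groups()
        return (head.lower(), modifier.lower())
    match = POSSESSIVE.match(text)
    if match:
        modifier, head = match.groups()
        return (head.lower(), modifier.lower())
    return None  # surface form not covered by this sketch


print(normalise_noun_phrase("The mother of Jane"))
print(normalise_noun_phrase("Jane's mother"))
```

Both calls yield the same pair, so a query phrased one way can match a document phrased the other way.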


Weakest AI approach

Use AI to select good natural language index terms
  Thesaurus construction
  Compound terms

Use world knowledge and a bit of linguistics (e.g. noun vs verb, discourse)

Important caveat or warning: “Most criticisms are valid for cases when we have large-scale and a variety of need on user side. We may have quite a different situation on specialised contexts where an AI approach … may be both justifiable and feasible”

Karen Sparck Jones, 1991

Information seeking

Characterisation of the user’s information need (and not the actual matching)

User modelling

“Automating the intermediary”: giving the user an intelligent front-end

Over iterative searching and dialogue, determine the user’s real information need
  Medical doctor vs medical student

Student and general topic: look for a survey document

Criticism:
  users have difficulty expressing their information need
  difficulty of manually or automatically deriving rules for systems

Possible in limited context

Expert Systems

Expert systems (1)

Designed to simulate an expert in a narrow, specific field

Rule-based systems

  fact: p
  rule: if p then q
  --------------------
  then add fact: q

Ex:

  if (patient has red spots) then (patient has measles)
  patient has red spots
  --------------------
  patient has measles

Rules can be uncertain
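The inference step above (from fact p and rule "if p then q", add fact q) can be sketched as a small forward-chaining loop. This is a minimal illustration with single-premise rules and no uncertainty; the function name and data layout are assumptions made for the example.

```python
def forward_chain(facts, rules):
    """Apply (premise, conclusion) rules until no new fact can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)  # "then add fact: q"
                changed = True
    return known


rules = [("patient has red spots", "patient has measles")]
facts = {"patient has red spots"}
print(sorted(forward_chain(facts, rules)))
```

Uncertain rules would need an extra ingredient, such as a weight attached to each rule, which this sketch leaves out.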

Expert systems (2)

Development of an expert intermediary system to assist with
  query formulation
  search strategy selection

Problem (Brooks 87)
  Evaluation
  Formulation of the expertise (typical problem with expert systems, and not just for IR)

Expert systems for searching (1)

(Khoo and Poo 1993)

Expert interface to online catalog

Expert system
  Use search heuristics derived from humans
  Rules for selecting good heuristics
  Explanation of strategy selection

Expert systems for searching (2)

Relevance feedback

Collect statistics from search
  Number of documents retrieved, precision, …

Use rules to automatically improve content of search
  Query terms used: addition/removal of terms, synonyms
  Connectives used

Rules

3 types of rules

Data abstraction rules (levels: 1 - very low … 5 - very high)
  if precision <= 20% then precision level is 1
  if precision > 80% then precision level is 5
  if retrieval size is 101-200 then retrieval level is 4

Heuristic matching rules
  if precision level is 2 or 3 and retrieval level > 2 then use narrowing strategy

Refinement rules
  if a narrowing strategy is needed then select strategy “use terms that have high frequency in relevant records” with weight 0.8

Application of rules

Forward chaining
  Analyse change in statistics
  Decide which heuristic rules apply
  Choose refinement rules (strategies)

Backward chaining
  Analyse strategies in turn to see if conditions hold

Once strategies are chosen
  Rank strategies by weight
  Implement in turn
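The forward-chaining path through the three rule types might look like the sketch below. Only the rules quoted on the slides are taken from the source: the precision bands between 20% and 80%, the behaviour for other retrieval sizes, and the second narrowing strategy are assumptions added so the example can run and show the ranking step.

```python
def precision_level(precision):
    """Data abstraction: map precision (0.0-1.0) to a level from 1 to 5.

    Only "<= 20% -> 1" and "> 80% -> 5" come from the slides; the
    20-point bands in between are an assumption of this sketch.
    """
    if precision <= 0.20:
        return 1
    if precision <= 0.40:
        return 2
    if precision <= 0.60:
        return 3
    if precision <= 0.80:
        return 4
    return 5


def retrieval_level(size):
    """Data abstraction: only the "101-200 -> 4" rule is given in the source."""
    if 101 <= size <= 200:
        return 4
    return None  # other bands are not specified


def needs_narrowing(p_level, r_level):
    """Heuristic matching rule quoted on the slide."""
    return p_level in (2, 3) and r_level is not None and r_level > 2


# Refinement rules as (strategy, weight). The 0.8 entry is from the slide;
# the 0.5 entry is hypothetical and exists only to show ranking by weight.
NARROWING_STRATEGIES = [
    ("add a restricting term with AND (hypothetical)", 0.5),
    ("use terms that have high frequency in relevant records", 0.8),
]


def choose_strategies(precision, size):
    """Statistics -> levels -> heuristic rule -> strategies ranked by weight."""
    p_level = precision_level(precision)
    r_level = retrieval_level(size)
    if not needs_narrowing(p_level, r_level):
        return []
    ranked = sorted(NARROWING_STRATEGIES, key=lambda item: item[1], reverse=True)
    return [name for name, _weight in ranked]


print(choose_strategies(precision=0.35, size=150))
```

The returned list is the order in which the strategies would be tried ("implement in turn").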

Expert systems

Other expert systems exist for
  Specialised dialogue functions (e.g. building a user model)
  Domain knowledge representation

Labour-intensive to build

Good for helping with complex tasks

System Integration

How to search over different collections, different types of objects, different representations
  Wrappers
  Mediators

Machine-processable semantics of information for B2B, C2B, e-commerce
  XML
  RDF
  Ontologies

Wrapper

Wrappers have been developed as a component in information mediation architectures

Integrated query access to heterogeneous and distributed information sources

Wrapper: intermediate layer that mediates between users and information sources

Mediator: hides from the user the fact that the information sources are distributed, i.e. translates a query into sub-queries

Wrapper: hides from the mediator the fact that the information sources are heterogeneous (protocol, syntax, semantics)

WWW: use HTML-based document structure for heuristic information extraction

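[Figure: mediation architecture (agent-based technology). A Query Agent passes queries to a Mediator, which talks to Wrapper 1, Wrapper 2, …, Wrapper n, each sitting in front of its own source (Source 1, Source 2, …, Source n).]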
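The division of labour between mediator and wrappers can be sketched with two toy sources that store their data differently. All class names, the `search` method and the sample records are assumptions made for this illustration; real wrappers would also deal with protocols and query languages.

```python
class ListWrapper:
    """Wraps a source stored as a list of (title, text) tuples."""

    def __init__(self, records):
        self.records = records

    def search(self, term):
        return [title for title, text in self.records
                if term.lower() in text.lower()]


class DictWrapper:
    """Wraps a source stored as a dict mapping title -> set of keywords."""

    def __init__(self, catalogue):
        self.catalogue = catalogue

    def search(self, term):
        return [title for title, keywords in self.catalogue.items()
                if term.lower() in keywords]


class Mediator:
    """Sends the same sub-query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def search(self, term):
        results = []
        for wrapper in self.wrappers:
            for title in wrapper.search(term):
                if title not in results:  # drop duplicates across sources
                    results.append(title)
        return results


source_1 = [("Bone structure", "Calcium content of bones"),
            ("Muscles", "Fibres and tendons")]
source_2 = {"Diet and health": {"calcium", "vitamins"}}
mediator = Mediator([ListWrapper(source_1), DictWrapper(source_2)])
print(mediator.search("Calcium"))
```

The caller sees a single `search` call: distribution is hidden by the mediator, and the differing storage formats are hidden by the wrappers.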

XML

Unlike HTML, XML tags can be written for specific applications
  Tags define the semantics of the data (HTML tags are mainly used as a layout language)

Example

<Person>

<Name>Mounia Lalmas</Name>

<Email>[email protected]</Email>

</Person>
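Because the tags name the data, a program can read the Person example by tag rather than by position. The sketch below uses Python's standard `xml.etree.ElementTree` module; the e-mail address is a placeholder chosen for the example, since the address on the slide is not recoverable.

```python
import xml.etree.ElementTree as ET

# The e-mail value is a placeholder, not the address from the slide.
DOCUMENT = """
<Person>
  <Name>Mounia Lalmas</Name>
  <Email>someone@example.org</Email>
</Person>
"""

person = ET.fromstring(DOCUMENT.strip())
# Each child element becomes a field whose key is the tag name.
record = {child.tag: child.text for child in person}
print(record["Name"])
```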

XML Schema

Define a grammar and meaningful tags for documents (DTD)
  They are XML documents
  Provide a rich set of datatypes that can be used to define the values of elementary tags
  Provide a namespace mechanism to combine XML documents with heterogeneous vocabularies

Example (DTD syntax)

<!ELEMENT name (title*, (first-name | initial), middle-name?, last-name+)>
<!ELEMENT first-name (#PCDATA)>

<name>
  <title>Dr</title>
  <first-name>Mounia</first-name>
  <last-name>Lalmas</last-name>
</name>

RDF

XML provides semantic information as a by-product of defining the structure of a document

XML prescribes a tree structure for documents, and the different leaves of the tree have a well-defined tag and context with which the information can be understood

Structure and semantics of documents are interwoven

The Resource Description Framework (RDF) provides a means for adding semantics to a document without making any assumptions about the structure of the document, and it provides pre-defined modelling primitives for expressing the semantics of data

RDF

RDF is an XML application (i.e. its syntax is defined in XML) customised for adding meta-information to web documents

Currently under development as a W3C standard for content descriptions of web data

Three types of objects:

Resources (subjects)
  Entities that can be referred to by an address on the web (by a URI); elements that are described by RDF statements

Properties (predicates)
  Define a binary relation between resources and/or atomic values provided by the primitive datatype definitions in XML

Statements (objects)
  Specify a value of a property for a resource; provide the actual characterisation of the web documents

RDF: Simple example

Author(http://www.dcs.qmul.ac.uk/~mounia) = Mounia

States that the author of the named web document is Mounia

Values can also be structured entities:

Author(http://www.dcs.qmul.ac.uk/~mounia) = X

Name(X) = Mounia

Email(X) = [email protected]

where X denotes an actual URI (Uniform Resource Identifier)

The actual syntax of RDF is different!

RDF Schema
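The statements in the example can be held as (resource, property, value) triples and queried by following properties. This is a plain-Python sketch, not RDF syntax and not an RDF library; the identifier `_:x` standing in for X, the placeholder e-mail address and the function names are assumptions.

```python
PAGE = "http://www.dcs.qmul.ac.uk/~mounia"
X = "_:x"  # stand-in identifier for the structured value X (hypothetical)

STATEMENTS = [
    (PAGE, "Author", X),
    (X, "Name", "Mounia"),
    (X, "Email", "someone@example.org"),  # placeholder address
]


def values(statements, resource, prop):
    """Return every value that `prop` takes for `resource`."""
    return [obj for subj, pred, obj in statements
            if subj == resource and pred == prop]


def author_names(statements, page):
    """Follow Author, then Name, to reach the author's name."""
    names = []
    for author in values(statements, page, "Author"):
        names.extend(values(statements, author, "Name"))
    return names


print(author_names(STATEMENTS, PAGE))
```

Note that nothing here depends on where the author information sits inside the page, which is the point of separating semantics from document structure.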

Ontologies

Neither XML nor RDF defines a standard vocabulary for describing the semantics of documents
  Define standard vocabularies

Neither XML nor RDF defines a standard structure for describing the semantics of documents
  Define standard structure or mappings between different structures

Ontologies
  Consensual and formal specifications of conceptualisations
  Provide a shared and common understanding of a domain that can be communicated across people and application systems

Ontologies

Definition of vocabulary (i.e. concepts)

Definition of structure (i.e. attributes and concept hierarchy)

Logical characterisation of concepts and attributes
  Domain and range restrictions
  Properties of relations (symmetry, transitivity)
  General logical axioms

Examples:
  WordNet
  Cyc
  Ontolingua
  DAML, OIL (www.semanticweb.org)
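To make "concept hierarchy" and "domain and range restrictions" concrete, the sketch below checks a statement against a toy ontology. The concepts, the `is_made_of` property and its restriction are invented for the illustration and do not come from WordNet, Cyc or any other ontology named above.

```python
# Hypothetical concept hierarchy: concept -> direct parent concept.
SUBCLASS_OF = {
    "humerus": "bone",
    "bone": "body_part",
    "calcium": "substance",
}

# Hypothetical restrictions: property -> (domain, range).
RESTRICTIONS = {"is_made_of": ("body_part", "substance")}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def statement_is_valid(subject, prop, obj):
    """Check the subject against the domain and the object against the range."""
    domain, range_ = RESTRICTIONS[prop]
    return is_a(subject, domain) and is_a(obj, range_)


print(statement_is_valid("humerus", "is_made_of", "calcium"))
print(statement_is_valid("calcium", "is_made_of", "humerus"))
```

The first statement passes because subclass membership is followed transitively up the hierarchy; the second fails both restrictions.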

Support functions

Information extraction

Abstracting and summarising

Cataloguing (Ontology)

Automatically linking parts of texts (Hypertext)

Thesaurus/dictionary building (Linguistics)

Story telling (News)

Information extraction

Analyse unstructured text
  Extract “relevant” data from a collection of documents
  Application area of computational linguistics

Examples
  News reading program (Jacobs and Rau, 1990)
  Generation of fields from free text for database purposes
  Application to the web (“semantic-based” search engine, e-commerce)

Information extraction

Based on extraction patterns

Concept + linguistic patterns as pre-conditions

Example: We want to extract the target of the terrorist attack from a sentence:

The parliament was bombed by the guerrillas.

Triggering word: bombed (stemming)

Linguistic pattern: <subject><passive-verb>

Subject is extracted as target
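The extraction pattern can be approximated with a regular expression: a triggering word in a passive construction, with the grammatical subject taken as the target. This is a rough sketch; the trigger list contains only the word from the example, the regular expression stands in for real linguistic analysis, and the function name is an assumption.

```python
import re

# Triggering words; the slide names only "bombed".
TRIGGERS = ("bombed",)

# <subject> <passive-verb>: "<subject> was/were <verb> [by <agent>]".
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?:was|were)\s+(?P<verb>\w+)"
    r"(?:\s+by\s+(?P<agent>.+?))?\.?$",
    re.IGNORECASE,
)


def extract_target(sentence):
    """Return the subject of a passive sentence whose verb is a trigger."""
    match = PATTERN.match(sentence.strip())
    if match and match.group("verb").lower() in TRIGGERS:
        return match.group("subject")
    return None  # pattern pre-conditions not met


print(extract_target("The parliament was bombed by the guerrillas."))
```

An active sentence such as "The guerrillas bombed the parliament." does not satisfy the passive pre-condition, so a separate pattern would be needed for it.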

Summarising and abstracting

Summarising: Extracting important sentences

Abstracting: Forming a sequence of sentences that summarises the content of a document

Output of web search engines

Useful for viewing (at a glance) retrieved documents
  Cost in downloading the documents
  Help for relevance feedback and user assessment
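Summarising by extracting important sentences can be sketched with a simple word-frequency score. The scoring scheme, the stopword list and the function names are assumptions chosen for illustration; the slides do not say how "important" is decided.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "was", "by", "for"}


def split_sentences(text):
    """Split on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lower-cased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarise(text, n=1):
    """Return the n highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    frequency = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        tokens = content_words(sentence)
        if not tokens:
            return 0.0
        # Average collection frequency of the sentence's content words.
        return sum(frequency[t] for t in tokens) / len(tokens)

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]


text = "Bones contain calcium. Calcium keeps bones strong. The weather was nice."
print(summarise(text, n=1))
```

Abstracting, by contrast, would generate new sentences rather than select existing ones, which this sketch does not attempt.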

Conclusions

Do not overestimate the power of AI in IR
  Tasks depending on real-world knowledge are hard to design, to build and perhaps to use
  Systems must be flexible and sophisticated enough for a wide range of users
  Many IR tasks are not deep but of a more shallow linguistic nature

AI has a very valuable contribution to make
  Specialised systems where the domain is controlled, well-integrated and understood
  Support functions
  Case-based reasoning and dialogue functions
  Integrated functions

References

C. Stanfill and D.L. Waltz, 1992. Statistical Methods, Artificial Intelligence, and Information Retrieval. In Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval (ed. P.S. Jacobs), Lawrence Erlbaum Associates.

K. Sparck Jones, 1991. The role of Artificial Intelligence in Information Retrieval. JASIS 42(8), pp. 558-565.

W.B. Croft, 1987. Approaches to intelligent information retrieval. IP&M 23(4), pp. 249-254.

P. Jacobs and L. Rau, 1990. SCISOR: Extracting information from on-line news. Communications of the ACM 33(11), pp. 88-97.

A.F. Smeaton, 1992. Progress in the Application of Natural Language Processing to Information Retrieval Tasks. The Computer Journal, 36(3).

P. Srinivasan, 1991. Thesaurus construction. In W. Frakes and R. Baeza-Yates (eds), Information Retrieval: Data Structures & Algorithms, Prentice Hall.

C.S.G. Khoo and D.C.C. Poo, 1993. An expert system approach to online catalog subject searching. Information Processing & Management 30(2), pp. 223-238.

H.M. Brooks, 1987. Expert Systems and Intelligent Information Retrieval. Information Processing & Management 23(4), pp. 341-366.

K. Sparck Jones, 1999. Information retrieval and artificial intelligence. Artificial Intelligence 114, pp. 257-281.
