Artificial Intelligence Approaches for Information Retrieval
Outline
  Artificial Intelligence (AI)
  AI and IR
  AI applied to IR
    Information characterisation
    Information seeking
    System Integration
    Support functions
  Conclusion
  References
Aims of AI
Building intelligent entities as well as understanding them
Systems that reason with information in some way
  Ex: problem solving, classification, learning, planning
Usually use some explicit representation of information and some means of manipulating information
Four goals of AI
Systems that think like humans
“The exciting new effort to make computers think ... machines with minds, in the full and literal sense” (Haugeland, 1985)
“[The automation of] activities that we associate with human thinking, activities such as decision making, problem-solving, learning ...” (Bellman, 1978)
Systems that think rationally
“The study of mental faculties through the use of computational models” (Charniak and McDermott, 1985)
“The study of the computations that make it possible to perceive, reason, and act” (Winston, 1992)
Systems that act like humans
“The art of creating machines that perform functions that require intelligence when performed by people” (Kurzweil)
“The study of how to make computers do things at which, at the moment, people are better” (Rich and Knight, 1991)
Systems that act rationally
“A field of study that seeks to explain and emulate intelligent behavior in terms of computational processes” (Schalkoff, 1990)
“The branch of computer science that is concerned with the automation of intelligent behavior” (Luger and Stubblefield)
AI and IR
“[AI] This is the use of computers to carry out tasks requiring reasoning on world knowledge, as exemplified by giving responses to questions in situation where one is dealing with only partial knowledge and with indirect connectivity”
Karen Sparck Jones, 1991
“We construe a system to be knowledge-based when its behavior depends largely on accessing or encoding information”
Call for papers - FLAIRS conferences
Areas of AI for IR
Natural language processing
Knowledge representation
  Expert systems
  Ex: Logical formalisms, conceptual graphs, etc
Machine learning
  Short term: over a single session
  Long term: over multiple searches by multiple users
Computer Vision
  Ex: OCR
Reasoning under uncertainty
  Ex: Dempster-Shafer, Bayesian networks, probability theory, etc
Cognitive theory
  Ex: User modelling
AI applied to IR
Four main roles investigated
Information characterisation
Search formulation in information seeking
System Integration
Support functions
Information characterisation - Approach 1 (1)
“Strongest” AI approach
Set of documents represented as a single knowledge base
Directly manipulating the information available => Knowledge-base retrieval
[Figure: knowledge base derived from document text. BONE is made of calcium; HUMERUS, RADIUS and ULNA are each a kind of BONE. Source fragments: “… calcium content of bones …”, “… the humerus bones …”, “… radius and ulna, bones in arm …”]
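The knowledge base sketched above can be illustrated with a few triples and one inheritance step. This is only a minimal sketch: the relation names `is_made_of` and `is_kind_of`, and both helper functions, are assumptions chosen for illustration, not part of any system named in these slides.

```python
# Toy knowledge base mirroring the bone/calcium example.
# Each fact is a (subject, relation, object) triple; relation names are illustrative.
FACTS = {
    ("bone", "is_made_of", "calcium"),
    ("humerus", "is_kind_of", "bone"),
    ("radius", "is_kind_of", "bone"),
    ("ulna", "is_kind_of", "bone"),
}

def ancestors(concept, facts=FACTS):
    """Return every concept reachable from `concept` via is_kind_of links."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subj, rel, obj in facts:
            if subj == current and rel == "is_kind_of" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found

def is_made_of(concept, material, facts=FACTS):
    """True if the concept, or any concept it is a kind of, is made of the material."""
    candidates = {concept} | ancestors(concept, facts)
    return any((c, "is_made_of", material) in facts for c in candidates)

print(is_made_of("humerus", "calcium"))  # True, inherited from bone
print(is_made_of("calcium", "bone"))     # False
```

The answer for "humerus" is not stated in any single document fragment; it is derived by manipulating the knowledge base directly, which is the point of this approach.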
Information characterisation - Approach 1 (2)
Query: “does calcium deficiency cause Smith’s disease?”
Criticism: this is a model for question-answering systems (not “traditional IR”)
Replace document text (natural language) with a knowledge base in an artificial language
Much of the (textual) information is lost
What will be put in the knowledge base?
Issue of information extraction
Problem with large collection
Successful in specific domain (SCISOR)
Information characterisation - Approach 2
Weaker view of AI
Keep documents and use knowledge base as access tool (query formulation)
  Semantic-based access, concept-based access
  Interface and presentation
Better classification of document text and better access
Criticism: problems of (automatic) linkages (documents have different styles, languages and levels of discussion)
Information characterisation - Approach 3
Even weaker AI approach
Abandon knowledge base but use AI (syntactic level) to characterise document content
Sophisticated matching
Use NLP to derive
  Noun-phrases: “The mother of Jane <=> Jane’s mother”
  Sentences: “The boy ate the apple <=> The apple was eaten by the boy”
Normalisation is necessary!
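A rough sketch of what normalisation means for the noun-phrase case: two surface forms are mapped to one canonical (head, modifier) pair so that they match. The regular expressions and the function name `normalise_noun_phrase` are assumptions for illustration; real NLP normalisation needs parsing rather than pattern matching, and the sentence (active/passive) case is not covered here.

```python
import re

# Two surface patterns for the same possessive noun phrase (illustrative only).
OF_PATTERN = re.compile(r"^(?:the\s+)?(\w+)\s+of\s+(\w+)$", re.IGNORECASE)
GENITIVE_PATTERN = re.compile(r"^(\w+)['’]s\s+(\w+)$", re.IGNORECASE)

def normalise_noun_phrase(phrase):
    """Map a phrase to a (head, modifier) pair, or None if no pattern applies."""
    text = phrase.strip()
    match = OF_PATTERN.match(text)
    if match:
        head, modifier = match.group(1), match.group(2)
        return (head.lower(), modifier.lower())
    match = GENITIVE_PATTERN.match(text)
    if match:
        modifier, head = match.group(1), match.group(2)
        return (head.lower(), modifier.lower())
    return None

print(normalise_noun_phrase("The mother of Jane"))  # ('mother', 'jane')
print(normalise_noun_phrase("Jane’s mother"))       # ('mother', 'jane')
```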
Very little evidence of success (so far)
Information characterisation - Approach 4
Weakest AI approach
Use AI to select good natural language index terms
  Thesaurus construction
  Compound terms
Use world knowledge and a bit of linguistics (eg noun vs verb, discourse)
Important caveat or warning:
“Most criticisms are valid for cases when we have large-scale and a variety of need on user side. We may have quite a different situation on specialised contexts where an AI approach … may be both justifiable and feasible”
Karen Sparck Jones, 1991
Information seeking
Characterisation of the user’s information need (and not the actual matching)
User modelling
“Automating the intermediary” giving the user an intelligent front-end
Over iterative searching and dialogue, determine the user’s real information need
  Medical doctor vs medical student
Student and general topic: look for a survey document
Criticism:
  Users have difficulty expressing their information need
  Difficulty of manually or automatically deriving rules for systems
Possible in limited context
Expert Systems
Expert systems (1)
Designed to simulate expert in narrow/specific field
Rule-based systems
  fact: p
  rule: if p then q
  --------------------
  then add fact: q
Ex:
  if (patient has red spots) then (patient has measles)
  patient has red spots
  --------------------
  patient has measles
Rules can be uncertain
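The inference step above (given fact p and rule "if p then q", add fact q) can be sketched as a small forward-chaining loop. This is a minimal illustration under the assumption that facts are plain strings and every rule has a single premise; the function name `forward_chain` is hypothetical, and uncertainty in rules is not modelled.

```python
def forward_chain(facts, rules):
    """Apply 'if p then q' rules until no new fact can be added (modus ponens)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

rules = [("patient has red spots", "patient has measles")]
facts = {"patient has red spots"}
print(sorted(forward_chain(facts, rules)))
# ['patient has measles', 'patient has red spots']
```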
Expert systems (2)
Development of an expert intermediary system to assist with
  query formulation
  search strategy selection
Problem (Brooks 87)
  Evaluation
  Formulation of the expertise (typical problem with expert systems, and not just for IR)
Expert systems for searching (1)
(Khoo and Poo 1993)
Expert interface to online catalog
Expert system
  Use search heuristics derived from humans
  Rules for selecting good heuristics
  Explanation of strategy selection
Expert systems for searching (2)
Relevance feedback
Collect statistics from search
  Number of documents retrieved, precision, …
Use rules to automatically improve content of search
  Query terms used - addition/removal of terms, synonyms
  Connectives used
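The two statistics named above could be collected as follows. This is a sketch under the assumption that relevance judgements are available as a set of document identifiers (for instance from user feedback); the function name `search_statistics` and the document ids are made up for illustration.

```python
def search_statistics(retrieved_ids, relevant_ids):
    """Return retrieval size and precision for one search.

    Precision is the fraction of retrieved documents that were judged relevant.
    """
    retrieved = list(retrieved_ids)
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    return {"retrieved": len(retrieved), "precision": precision}

stats = search_statistics(["d1", "d2", "d3", "d4"], {"d2", "d4", "d9"})
print(stats)  # {'retrieved': 4, 'precision': 0.5}
```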
Rules
3 types of rules
Data abstraction rules
  if precision <= 20% then precision level is 1
  if precision > 80% then precision level is 5
  if retrieval size is 101-200 then retrieval level is 4
  (1 - very low … 5 - very high)
Heuristic matching rules
  if precision level is 2 or 3 and retrieval level > 2 then use narrowing strategy
Refinement rules
  if a narrowing strategy is needed then select strategy “use terms that have high frequency in relevant records” with weight 0.8
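The three rule types can be sketched as three small functions that feed one another. Only the rules quoted above are taken from the slide; the intermediate precision bands (levels 2 to 4) are assumptions, and the retrieval level is passed in directly because only one of its bands (101-200 gives level 4) is stated. All function names are hypothetical.

```python
def precision_level(precision):
    """Data abstraction: map precision (0.0 to 1.0) to a level from 1 to 5.

    Only the <=20% and >80% boundaries come from the slide; the rest are assumed.
    """
    if precision <= 0.20:
        return 1
    if precision <= 0.40:   # assumed boundary
        return 2
    if precision <= 0.60:   # assumed boundary
        return 3
    if precision <= 0.80:   # assumed boundary
        return 4
    return 5

def needs_narrowing(p_level, r_level):
    """Heuristic matching rule: precision level 2 or 3 and retrieval level above 2."""
    return p_level in (2, 3) and r_level > 2

def refine(p_level, r_level):
    """Refinement rule: return the (strategy, weight) pairs that apply."""
    if needs_narrowing(p_level, r_level):
        return [("use terms that have high frequency in relevant records", 0.8)]
    return []

# Precision of 35% with 150 records retrieved (retrieval level 4 per the slide).
print(refine(precision_level(0.35), 4))
```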
Application of rules
Forward chaining
  Analyse change in statistics
  Decide which heuristic rules apply
  Choose refinement rules (strategies)
Backward chaining
  Analyse strategies in turn to see if conditions hold
Once strategies are chosen
  Rank strategies by weight
  Implement in turn
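The last step (rank the chosen strategies by weight, then implement them in turn) might look like the sketch below. Stopping at the first strategy that succeeds is an assumption, since the slide does not say when iteration ends; the function names and every strategy except the 0.8-weighted one are hypothetical.

```python
def rank_strategies(candidates):
    """Order (strategy, weight) pairs from highest to lowest weight."""
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

def apply_in_turn(candidates, try_strategy):
    """Try strategies in rank order; return the first one the callback accepts, else None."""
    for strategy, _weight in rank_strategies(candidates):
        if try_strategy(strategy):
            return strategy
    return None

candidates = [
    ("add a connective to restrict the query", 0.5),                   # hypothetical
    ("use terms that have high frequency in relevant records", 0.8),  # from the slide
    ("remove a synonym", 0.3),                                         # hypothetical
]
for strategy, weight in rank_strategies(candidates):
    print(weight, strategy)
```

Here `try_strategy` stands for whatever re-runs the search with the refined query and reports whether the statistics improved.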
Expert systems
Other expert systems exist for
  Specialised dialogue functions (eg building a user model)
  Domain knowledge representation
Intensive to build
Good for helping with complex tasks
System Integration
How to search over different collections, different types of objects, different representations
  Wrappers
  Mediators
Machine processable semantics of information for B2B, C2B, e-commerce
  XML
  RDF
  Ontologies
Wrapper
Wrappers have been developed as a component in information mediation architectures
Integrated query access to heterogeneous and distributed information sources
Wrapper: intermediate layer that mediates between users and information sources
Mediator: makes it transparent to the user that the information sources are distributed, i.e. translates a query into sub-queries
Wrapper: makes it transparent to the mediator that the information sources are heterogeneous (protocol, syntax, semantics)
WWW: use HTML-based document structure for heuristic information extraction
[Figure: layered architecture. Query Agent → Mediator → Wrapper 1, Wrapper 2, …, Wrapper n → Source 1, Source 2, …, Source n]
Agent-based technology
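A minimal sketch of the mediator/wrapper split described above: each wrapper hides the format of one source behind a common `search` method, and the mediator hides the fact that there are several sources. The class names, record formats and sample records are all invented for illustration; here every sub-query is simply the same search term.

```python
class CsvWrapper:
    """Wraps a hypothetical source that stores records as 'title;author' strings."""
    def __init__(self, rows):
        self.rows = rows

    def search(self, term):
        results = []
        for row in self.rows:
            title, author = row.split(";")
            if term.lower() in title.lower():
                results.append({"title": title, "author": author})
        return results

class DictWrapper:
    """Wraps a hypothetical source whose records are dicts with different key names."""
    def __init__(self, records):
        self.records = records

    def search(self, term):
        return [
            {"title": r["name"], "author": r["creator"]}
            for r in self.records
            if term.lower() in r["name"].lower()
        ]

class Mediator:
    """Sends a sub-query to every wrapper and merges the answers into one list."""
    def __init__(self, wrappers):
        self.wrappers = wrappers

    def search(self, term):
        merged = []
        for wrapper in self.wrappers:
            merged.extend(wrapper.search(term))
        return merged

mediator = Mediator([
    CsvWrapper(["Intelligent retrieval;A. Author", "Cooking basics;B. Writer"]),
    DictWrapper([{"name": "Retrieval with agents", "creator": "C. Editor"}]),
])
for hit in mediator.search("retrieval"):
    print(hit["title"], "/", hit["author"])
```

The caller sees one result list in one record shape, regardless of how many sources there are or how each stores its data.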
XML
Unlike HTML, XML tags can be written for specific applications
  Tags define the semantics of the data (HTML tags are mainly used as a layout language)
Example
<Person>
<Name>Mounia Lalmas</Name>
<Email>[email protected]</Email>
</Person>
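Because the tags name the data, a program can pick out fields by tag. The sketch below reads a record like the one above with Python's standard `xml.etree.ElementTree` module; the e-mail value is a placeholder, since the address is not recoverable from the slide.

```python
import xml.etree.ElementTree as ET

# The e-mail value is a placeholder, not the address from the original slide.
document = """
<Person>
  <Name>Mounia Lalmas</Name>
  <Email>user@example.org</Email>
</Person>
""".strip()

root = ET.fromstring(document)
# Build a tag -> text mapping from the direct children of <Person>.
record = {child.tag: child.text for child in root}
print(root.tag)        # Person
print(record["Name"])  # Mounia Lalmas
```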
XML Schema
Define a grammar and meaningful tags for documents (DTD)
They are XML documents
Provide a rich set of datatypes that can be used to define the values of elementary tags
Provide namespace mechanism to combine XML documents with heterogeneous vocabularies
Example
<!ELEMENT name (title*, (first_name | initial), middle_name?, last_name+)>
<!ELEMENT first_name (#PCDATA)>
<name>
  <title>Dr</title>
  <first_name>Mounia</first_name>
  <last_name>Lalmas</last_name>
</name>
RDF
XML provides semantic information as a by-product of defining the structure of document
XML prescribes a tree structure for documents, and the different leaves of the tree have a well-defined tag and context with which the information can be understood
Structure and semantics of documents are interwoven
The Resource Description Framework RDF provides a means for adding semantics to a document without making any assumptions about the structure of the documents and it provides pre-defined modelling primitives for expressing semantics of data
RDF
RDF is an XML application (i.e. its syntax is defined in XML) customised for adding meta-information to web documents
Currently under development as a W3C standard for content descriptions of web data
Three types of objects:
Resources (subjects)
  Entities that can be referred to by an address on the web (by a URI); elements that are described by RDF statements
Properties (predicates)
  Define a binary relation between resources and/or atomic values provided by the primitive datatype definitions in XML
Statements (objects)
  Specify for a resource a value of the property; provide the actual characterisation of the web documents
RDF: Simple example
Author(http://www.dcs.qmul.ac.uk/~mounia) = Mounia
States that the author of the named web document is Mounia
Values can also be structured entities:
Author(http://www.dcs.qmul.ac.uk/~mounia) = X
Name(X) = Mounia
Email(X) = [email protected]
where X denotes an actual URI (Uniform Resource Identifier)
Syntax of RDF is different!
RDF Schema
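The statements in this example can be pictured as (resource, property, value) triples. The sketch below is plain Python, not RDF syntax; the URI used for X and the helper names `values` and `author_names` are hypothetical, and the e-mail statement is omitted because the address is not recoverable.

```python
PAGE = "http://www.dcs.qmul.ac.uk/~mounia"
X = "urn:example:person-x"  # hypothetical URI standing in for X

# Each statement gives, for a resource, the value of one property.
statements = [
    (PAGE, "Author", X),
    (X, "Name", "Mounia"),
]

def values(resource, prop, triples):
    """Return all values of `prop` stated for `resource`."""
    return [v for r, p, v in triples if r == resource and p == prop]

def author_names(page, triples):
    """Follow Author to the structured entity, then read its Name."""
    names = []
    for author in values(page, "Author", triples):
        names.extend(values(author, "Name", triples))
    return names

print(author_names(PAGE, statements))  # ['Mounia']
```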
Ontologies
Neither XML nor RDF defines a standard vocabulary for describing the semantics of documents
  Define standard vocabularies
Neither XML nor RDF defines a standard structure for describing the semantics of documents
  Define standard structures or mappings between different structures
Ontologies
  Consensual and formal specifications of conceptualisations
  Provide a shared and common understanding of a domain that can be communicated across people and application systems
Ontologies
Definition of vocabulary (i.e. concepts)
Definition of structure (i.e. attributes and concept hierarchy)
Logical characterisation of concepts and attributes
  Domain and range restrictions
  Properties of relations (symmetry, transitivity)
  General logical axioms
Examples:
  WordNet
  CYC
  Ontolingua
  DAML, OIL (www.semanticweb.org)
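Two of the ingredients listed above, a concept hierarchy and domain/range restrictions, can be illustrated in a few lines. The concepts (`WebPage`, `Document`, `Researcher`, `Person`) and the `author` property are invented for this sketch and are not taken from any of the ontologies named above.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
SUBCLASS_OF = {"WebPage": "Document", "Researcher": "Person"}

# Hypothetical property with a domain and a range restriction.
PROPERTIES = {"author": {"domain": "Document", "range": "Person"}}

def is_a(concept, target):
    """Follow subclass links upward; the subclass relation is treated as transitive."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False

def statement_is_valid(subject_type, prop, object_type):
    """Check a statement against the domain and range of its property."""
    spec = PROPERTIES.get(prop)
    if spec is None:
        return False
    return is_a(subject_type, spec["domain"]) and is_a(object_type, spec["range"])

print(statement_is_valid("WebPage", "author", "Researcher"))  # True
print(statement_is_valid("Researcher", "author", "WebPage"))  # False
```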
Support functions
Information extraction
Abstracting and summarising
Cataloguing (Ontology)
Automatically linking parts of texts (Hypertext)
Thesaurus/dictionary building (Linguistics)
Story telling (News)
Information extraction
Analyse unstructured text
Extract “relevant” data from collection of documents
Application area of computational linguistics
Examples
  News reading program (Jacobs and Rau, 1990)
  Generation of fields from free text for database purpose
  Application to the web (“semantic-based” search engine, e-commerce)
Information extraction
Based on extraction patterns
Concept + linguistic patterns as pre-conditions
Example: We want to extract the target of the terrorist attack from a sentence:
The parliament was bombed by the guerrillas.
Triggering word: bombed (stemming)
Linguistic pattern: <subject><passive-verb>
Subject is extracted as target
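The extraction pattern above can be approximated with a regular expression: the trigger word in a passive construction, with the grammatical subject captured as the target. This is a simplification that assumes one clause per sentence and matches the literal form "bombed" instead of stemming; the pattern and the function name `extract_target` are illustrative.

```python
import re

# Trigger "bombed" in a passive construction: <subject> was/were bombed [by <agent>].
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?:was|were)\s+bombed(?:\s+by\s+.+?)?\.?$",
    re.IGNORECASE,
)

def extract_target(sentence):
    """Return the subject of a passive 'bombed' sentence, or None if the pattern fails."""
    match = PATTERN.match(sentence.strip())
    if not match:
        return None
    return match.group("subject")

print(extract_target("The parliament was bombed by the guerrillas."))  # The parliament
print(extract_target("The guerrillas bombed the parliament."))         # None
```

The active sentence is deliberately not matched: with this pattern the subject is the target only when the verb is passive, so a separate pattern would be needed for the active form.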
Summarising and abstracting
Summarising: Extracting important sentences
Abstracting: Forming a sequence of sentences that summarise the content of a document.
Output of web search engines
Useful for viewing (at a glance) retrieved documents
  Cost in downloading the documents
  Help for relevance feedback user assessment
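Summarising by extracting important sentences needs some measure of importance, which the slides do not specify. The sketch below assumes a simple word-frequency score (average frequency of a sentence's non-stopword terms across the whole text); the stopword list, the scoring rule and the function name `summarise` are all assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "by", "with"}

def content_words(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def summarise(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    frequency = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(frequency[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]

text = ("Retrieval systems rank documents. "
        "Retrieval systems index documents for retrieval. "
        "Cats sleep.")
print(summarise(text, max_sentences=1))
```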
Conclusions
Do not overestimate the power of AI in IR
  Tasks depending on real world knowledge are hard to design, to build and perhaps to use
  Hard to make flexible and sophisticated enough for a wide range of users
  Many IR tasks are not deep but of a more shallow linguistic nature
AI has a very valuable contribution to make
  Specialised systems where the domain is controlled, well-integrated and understood
  Support functions
  Case-based reasoning and dialogue functions
  Integrated functions
References
C. Stanfill and D.L. Waltz, 1992. Statistical Methods, Artificial Intelligence, and Information Retrieval. In Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval (ed. P.S. Jacobs), Lawrence Erlbaum Associates.
K. Sparck Jones, 1991. The Role of Artificial Intelligence in Information Retrieval. JASIS 42(8), pp 558-565.
W.B. Croft, 1987. Approaches to Intelligent Information Retrieval. IP&M 23(4), pp 249-254.
P. Jacobs and L. Rau, 1990. SCISOR: Extracting Information from On-line News. Communications of the ACM 33(11), pp 88-97.
A.F. Smeaton, 1992. Progress in the Application of Natural Language Processing to Information Retrieval Tasks. The Computer Journal 36(3).
P. Srinivasan, 1991. Thesaurus Construction. In W. Frakes and R. Baeza-Yates (eds), Information Retrieval: Data Structures & Algorithms, Prentice Hall.
C.S.G. Khoo and D.C.C. Poo, 1993. An Expert System Approach to Online Catalog Subject Searching. Information Processing & Management 30(2), pp 223-238.
H.M. Brooks, 1987. Expert Systems and Intelligent Information Retrieval. Information Processing & Management 23(4), pp 341-366.
K. Sparck Jones, 1999. Information Retrieval and Artificial Intelligence. Artificial Intelligence 114, pp 257-281.