QALL-ME: Ontology and Semantic Web
Invited talk at Driving Future Question Answering: Research Trends And Market Perspectives Workshop, Trento, Italy
Co-funded by the European Union
QALL-ME: Ontology and Semantic Web
Constantin Orasan
University of Wolverhampton
http://clg.wlv.ac.uk
Structure of presentation
1. The QALL-ME ontology
2. The ontology for answer retrieval
3. The ontology for the bibliographical domain
4. The ontology for presentation
5. Where next?
Ontology in QALL-ME
- The QALL-ME ontology provides a conceptualised description of the domain in which the system is used
- It is used to:
  - Provide a bridge between languages
  - Pass information between different components of the system
  - Encode the data
  - Retrieve the data
QALL-ME ontology
- An ontology for the domain of tourism was developed and used in the prototype (Ou et al., 2008)
- Experiments with (existing) ontologies for the bibliographical domain were carried out (Orasan et al., 2009)
Ontology for the domain of tourism
- Developed to address the user needs
- Inspired by existing ontologies such as Harmonise, eTourism, etc. … but developed specially for the project
- Aligned to WordNet and SUMO
- Freely available from the QALL-ME website
Part of the ontology (cinema/movies)
[Figure: diagram of the cinema/movies part of the ontology, with the labels listed below]
Classes: MovieShow, Cinema, CinemaRoom, Movie, Event, EventContent, Period, DateTimePeriod, TimePeriod, DatePeriod, TicketPrice, SitePrice, Currency, Director, Star, Producer, Writer, PostalAddress, GPSCoordinate, DirectionLocation, Contact, SiteFacility, RoomFacility
Properties: isInSite, hasPrice, hasEventContent, hasPeriod, hasTimePeriod, hasDatePeriod, startTime, endTime, startDate, endDate, priceType, priceValue, hasCurrency, hasDirector, hasStar, hasProducer, hasWriter, hasPostalAddress, hasGPSCoordinate, hasContact, hasRoom, hasSiteFacility, hasRoomFacility, name, description, synopsis, genre, certificate
Class hierarchy links: subClassOf
Semantic annotation and database organization
- The ontology was used to encode the data
- Annotated data from the content providers was converted to RDF triples
- The RDF documents can be stored in databases or plain text files
- The Jena RDF API was used for the operations
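As a rough illustration of the conversion step, the sketch below turns one annotated record into N-Triples lines. It is not the project's Jena-based code: the `DATA` base IRI, the record layout, and the function names are assumptions made for this example, and only the ontology namespace is taken from the SPARQL query shown in this talk.

```python
# Ontology namespace as it appears in the talk's SPARQL example.
QME = "http://qallme.itc.it/ontology/qallme-tourism.owl#"
# Assumption: placeholder base IRI for instance data.
DATA = "http://example.org/data/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def record_to_ntriples(record_id: str, class_name: str, fields: dict) -> list:
    """Return one rdf:type triple plus one triple per field of the record."""
    subject = f"<{DATA}{record_id}>"
    lines = [f"{subject} {RDF_TYPE} <{QME}{class_name}> ."]
    for prop, value in fields.items():
        lines.append(f'{subject} <{QME}{prop}> "{escape_literal(value)}" .')
    return lines


# Hypothetical record, for illustration only.
triples = record_to_ntriples("movie1", "Movie",
                             {"name": "Example Movie", "genre": "drama"})
print("\n".join(triples))
```

The resulting lines could be written to a plain text file or loaded into a triple store, matching the two storage options mentioned above.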
Semantic annotation and database organization
[Figure: data conversion pipeline. Components: World Wide Web, HTML Parser, XML Schema, XML Documents, QALL-ME Ontology, RDF Documents, Database. Arrow labels: Download, Define, Determine, Transform, Convert.]
Ontology for answer retrieval
What movie starring Halle Berry is on in Birmingham?
Class: MovieShow
  Property: isInSite, Range: Cinema
    Property: hasPostalAddress, Range: PostalAddress
      Property: isInDestination, Range: Destination
        Property: name, Range: string <Birmingham>
  Property: hasEventContent, Range: Movie
    Property: name, Range: string <unknown>
    Property: hasStar, Range: Star
      Property: name, Range: string <Halle Berry>
PREFIX qme: <http://qallme.itc.it/ontology/qallme-tourism.owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?movieName
WHERE {
  ?MovieShow qme:isInSite ?Cinema .
  ?Cinema qme:hasPostalAddress ?PostalAddress .
  ?PostalAddress qme:isInDestination ?Destination .
  ?Destination qme:name "Birmingham"^^xsd:string .
  ?MovieShow qme:hasEventContent ?Movie .
  ?Movie qme:name ?movieName .
  ?Movie qme:hasStar ?Star .
  ?Star qme:name "Halle Berry"^^xsd:string .
}
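The mapping from the class/property chain to the query can be sketched in a few lines. This is only an illustration of the idea, not the QALL-ME implementation: the path representation and the names `path_triples` and `build_query` are assumptions made for this example.

```python
PREFIXES = (
    "PREFIX qme: <http://qallme.itc.it/ontology/qallme-tourism.owl#>\n"
    "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
)


def path_triples(root, steps, final):
    """Walk (property, range) steps from ?root; the last object is `final`."""
    triples, subject = [], f"?{root}"
    for i, (prop, rng) in enumerate(steps):
        obj = final if i == len(steps) - 1 else f"?{rng}"
        triples.append(f"{subject} qme:{prop} {obj} .")
        subject = obj
    return triples


def build_query(root, target, constraints):
    """Build a SELECT query from constraint paths and one target path."""
    triples = []
    for steps, value in constraints:
        triples += path_triples(root, steps, f'"{value}"^^xsd:string')
    steps, var = target
    triples += path_triples(root, steps, f"?{var}")
    # Paths share prefixes, so drop repeated triples while keeping order.
    unique = list(dict.fromkeys(triples))
    body = "\n".join("  " + t for t in unique)
    return f"{PREFIXES}SELECT ?{var}\nWHERE {{\n{body}\n}}"


constraints = [
    ([("isInSite", "Cinema"), ("hasPostalAddress", "PostalAddress"),
      ("isInDestination", "Destination"), ("name", "string")], "Birmingham"),
    ([("hasEventContent", "Movie"), ("hasStar", "Star"),
      ("name", "string")], "Halle Berry"),
]
target = ([("hasEventContent", "Movie"), ("name", "string")], "movieName")

query = build_query("MovieShow", target, constraints)
print(query)
```

Each constraint path ends in a string literal and the target path ends in the selected variable, so the printed query contains the same eight triple patterns as the query above, in a slightly different order.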
Ontology for MRP
- Minimal Relation Patterns (MRP) represent relations in the ontology
- Can be used in textual entailment
- Already presented
Ontology for generation of hypotheses
- Starting from the ontology we can create hypotheses
  - What is the name of the movie with [DIRECTOR]?
  - What is the director of the movie with the name [NAME]?
- Can be done for any language
- Can generate the SPARQL at the same time
- Can be done for any domain
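A minimal sketch of how hypotheses and their SPARQL could be generated together from a class and its properties is given below. The property table, the English templates, and the function names are assumptions made for this example, and prefix declarations are omitted from the generated queries for brevity.

```python
# Assumption: property -> (label used in the question, True if the value is
# an object with its own name rather than a plain literal).
MOVIE_PROPERTIES = {
    "hasDirector": ("director", True),
    "genre": ("genre", False),
}


def value_triples(subject, prop, is_object, value):
    """Triples linking ?subject to `value` through `prop`."""
    if is_object:
        return [f"?{subject} qme:{prop} ?{prop} .",
                f"?{prop} qme:name {value} ."]
    return [f"?{subject} qme:{prop} {value} ."]


def wrap(triples):
    return "SELECT ?answer WHERE { " + " ".join(triples) + " }"


def generate_patterns(class_name, properties):
    """Return (question pattern, SPARQL template) pairs for one class."""
    var = class_name.lower()
    patterns = []
    for prop, (label, is_object) in properties.items():
        slot = f"[{label.upper()}]"
        # The name is asked for and the attribute is given.
        q1 = f"What is the name of the {var} with {slot}?"
        s1 = [f"?{var} qme:name ?answer ."]
        s1 += value_triples(var, prop, is_object, f'"{slot}"')
        # The attribute is asked for and the name is given.
        q2 = f"What is the {label} of the {var} with the name [NAME]?"
        s2 = [f'?{var} qme:name "[NAME]" .']
        s2 += value_triples(var, prop, is_object, "?answer")
        patterns.append((q1, wrap(s1)))
        patterns.append((q2, wrap(s2)))
    return patterns


patterns = generate_patterns("Movie", MOVIE_PROPERTIES)
for question, sparql in patterns:
    print(question, "->", sparql)
```

Supporting another language would, under these assumptions, mean swapping the two question templates, while the SPARQL side stays unchanged.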
Ontology generated patterns
- 91% of the questions from the benchmark have one or two constraints
- Investigation of the benchmark indicated three types of questions:
  - T1 – Query the name of a site or event which has one or more non-name attributes;
    "Can you tell me the name of a Chinese restaurant in Walsall?"
  - T2 – Query a non-name attribute of a site or event whose name is known; and
    "Can you give me the address for the Kinnaree Thai Restaurant?"
  - T3 – Query a non-name attribute of a site or event whose name is unknown but using its other non-name attribute(s) as the constraint(s).
    "Could you give me a contact number for an Italian restaurant in Solihull?"
- The T3 example can be decomposed into the following two questions:
  - T1: Could you give me the name of an Italian restaurant in Solihull?
  - T2: Could you give me a contact number for <the name of the restaurant in T1>?
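The decomposition of a T3 question into a T1 step followed by a T2 step can be sketched over in-memory records as follows. The data, field names, and function names are hypothetical and stand in for the RDF store and the generated SPARQL.

```python
# Hypothetical records, for illustration only.
RESTAURANTS = [
    {"name": "Example Trattoria", "cuisine": "Italian",
     "town": "Solihull", "phone": "0000 000001"},
    {"name": "Example Noodle Bar", "cuisine": "Chinese",
     "town": "Walsall", "phone": "0000 000002"},
]


def t1_names(records, **constraints):
    """T1: names of the records matching all non-name constraints."""
    return [r["name"] for r in records
            if all(r.get(k) == v for k, v in constraints.items())]


def t2_attribute(records, name, attribute):
    """T2: a non-name attribute of the record with a known name."""
    return [r[attribute] for r in records if r["name"] == name]


def t3_attribute(records, attribute, **constraints):
    """T3: answer T1 first, then ask T2 for every name found."""
    return {name: t2_attribute(records, name, attribute)
            for name in t1_names(records, **constraints)}


answer = t3_attribute(RESTAURANTS, "phone", cuisine="Italian", town="Solihull")
print(answer)
```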
Automatically generated patterns
- The ontology can be used to generate patterns for T1 and T2 questions with one or two constraints
- 2703 patterns were generated for English and German
- The SPARQL queries were also generated
- Evaluation on 200 questions
  - Baseline = cosine bag of words
  - Semantic engine = similarity on concepts + EAT (expected answer type) + entity filtering
- Language and domain independent

          Baseline   Semantic engine
English   42.46%     65%
German    34.96%     64.88%
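The baseline named above, cosine similarity over bags of words, can be sketched as follows. The tokenisation and the two patterns are assumptions made for this example; the evaluated system's preprocessing is not described in the slides.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def best_pattern(question, patterns):
    """Return the pattern whose bag of words is closest to the question."""
    q = bag_of_words(question)
    return max(patterns, key=lambda p: cosine(q, bag_of_words(p)))


# Hypothetical patterns with slots, for illustration only.
patterns = [
    "Can you tell me the name of a [CUISINE] restaurant in [TOWN]?",
    "Can you give me the address for the [NAME] restaurant?",
]
question = "Can you give me the address for the Kinnaree Thai Restaurant?"
print(best_pattern(question, patterns))
```

Once a pattern is selected, its associated SPARQL template can be instantiated with the entities found in the question.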
How do we move to another domain?
Domain of scientific publications
- Experiments for the bibliographic domain were carried out
  - What papers did C. Orasan publish in 2008?
- Existing ontologies were combined:
  - Semantic Web for Research Communities (SWRC) models concepts from the research community
  - A subset of Dublin Core was used to describe the properties of a bibliographical entry
  - Simple Knowledge Organisation System (SKOS) was used to model relations between terms
- The data from BibTeX format was converted to the domain ontology
- SPARQL patterns were generated
- The retrieval algorithm was not changed … but some changes had to be introduced at the level of the framework
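As an illustration of the BibTeX conversion, the sketch below maps a few fields of a simple entry to Dublin Core element IRIs. The field-to-term mapping, the subject IRI, and the sample entry are assumptions made for this example; the slides do not say which Dublin Core subset was used.

```python
import re

DC = "http://purl.org/dc/elements/1.1/"
# Assumption: an illustrative mapping from BibTeX fields to Dublin Core terms.
FIELD_TO_DC = {"title": "title", "author": "creator", "year": "date"}


def parse_bibtex_fields(entry):
    """Extract `key = {value}` fields from a simple entry (no nested braces)."""
    pairs = re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", entry)
    return {key.lower(): value.strip() for key, value in pairs}


def entry_to_triples(subject_iri, entry):
    """Return (subject, predicate, literal) tuples for the mapped fields."""
    fields = parse_bibtex_fields(entry)
    triples = []
    for field, dc_term in FIELD_TO_DC.items():
        if field not in fields:
            continue
        # BibTeX separates multiple authors with " and ".
        values = (fields[field].split(" and ") if field == "author"
                  else [fields[field]])
        for value in values:
            triples.append((subject_iri, DC + dc_term, value.strip()))
    return triples


# Hypothetical entry, for illustration only.
ENTRY = """@inproceedings{example2008,
  author = {A. Author and B. Author},
  title = {An Example Paper},
  year = {2008}
}"""

triples = entry_to_triples("http://example.org/pub/example2008", ENTRY)
for triple in triples:
    print(triple)
```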
How do we interact with the user?
- User satisfaction is largely determined by aspects such as the ease of use, learning curve, feedback, interface friendliness, etc. and not just by accuracy.
- What movies can I see at Symphony Hall this week?
- If no answers:
  - Look for a different location
  - Search for a different time period
  - Wrong presupposition
- User preferences
- Most of the feedback desiderata can be met without changing the current pipeline.
- 'Understanding' occurs in the Entailment engine (EE)
- The QPlanner does not have direct access to this information, but it can be injected in the results via the generated SPARQL, exploiting the RDF data model
Interactive Question Answering (IQA) ontology (Magnini et al., 2009)
- A question is analysed in terms of:
  - Expected answer type
  - Constraints
  - Context
- The answer will contain:
  - Core information
  - Justification
  - Complementary information
- The situation can be handled using a rich SPARQL query
- Rewriting rules for the SPARQL query in case of an empty answer
PREFIX declarations
CONSTRUCT {
  results triples
  AnswersObject triples
  QuestionInterpretation triples
}
WHERE {
  OPTIONAL {
    selection triples
  } .
}
qmq:qi rdf:type qmq:QuestionInterpretation ;
    qmq:hasInterpretation "In which cinema is [MOVIE] showed on [TIME]" ;
    qmq:hasConstraint qmq:c1 ;
    qmq:hasConstraint qmq:c2 ;
    qmq:hasFacet qmq:f1 .

qmq:c2 rdf:type qmq:Filter ;
    qmq:hasType qmo:DatePeriod ;
    qmq:hasProperty qmo:startDate ;
    qmq:hasValue '''[TIMEX2]''' ;
    qmq:failureReason "No film can be for the given date" .
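One way to picture the handling of an empty answer is to drop each constraint in turn and report the failure reason of any constraint whose removal yields results. The sketch below does this over in-memory records; the data, the dictionary keys, and the function names are hypothetical, and the actual system works through SPARQL rewriting rather than Python filtering.

```python
# Hypothetical data, for illustration only.
SHOWS = [
    {"movie": "Example Movie", "cinema": "Example Cinema", "date": "2009-06-01"},
]


def run(records, constraints):
    """Return the records that satisfy every constraint."""
    return [r for r in records
            if all(r.get(c["property"]) == c["value"] for c in constraints)]


def answer_with_feedback(records, constraints):
    """Return (results, reasons); reasons explain an empty result set."""
    results = run(records, constraints)
    if results:
        return results, []
    reasons = []
    for i, constraint in enumerate(constraints):
        # Relax one constraint at a time and check whether answers appear.
        relaxed = constraints[:i] + constraints[i + 1:]
        if run(records, relaxed):
            reasons.append(constraint["failure_reason"])
    return [], reasons


constraints = [
    {"property": "movie", "value": "Example Movie",
     "failure_reason": "No such movie is showing"},
    {"property": "date", "value": "2009-06-02",
     "failure_reason": "No film was found for the given date"},
]
results, reasons = answer_with_feedback(SHOWS, constraints)
print(results, reasons)
```

Under these assumptions, the reported reason points to the constraint to relax, for example by searching a different time period.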
Faceted browsing
Where next?
- We have the technology to “convert” a natural language question to SPARQL, via an ontology
- We can get access to a large number of resources using Linked Open Data
- We can expand the access to knowledge
Thank you!