querying for information integration: how to go from an imprecise intent to a precise query? aditya...

20
Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Upload: franklin-warren

Post on 16-Dec-2015

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Querying for Information Integration: How to go from an Imprecise Intent

to a Precise Query?

Aditya TelangSharma Chakravarthy, Chengkai Li

Page 2: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Motivation

• “Retrieve castles near London that are reachable by train in less than 2 hours”

• “Find 3-bedroom houses in Houston within 2 miles of a school and within 5 miles of a highway and priced under 250,000$”

• “Retrieve French restaurants within 1 mile of IMAX Theater in Dallas, Texas”

• …

Page 3: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Current Scenario

•“Retrieve castles near London that are reachable by train in less than 2 hours”

London Train schedules

Trains from London

Castles Near London

- Decision Making Process- Manually Combine Results to arrive at a decision

- Decision Making Process- Manually Combine Results to arrive at a decision

Page 4: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Ideal Scenario

Information Integration

System

Intent: Retrieve castles near London that are reachable by train in less than 2 hours

Actual Results for the intent

Page 5: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

The InfoMosaic Approach

Page 6: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Query Specification

• Query: “Bank within 1 mile of ‘University of Texas, Arlington’”

Query: Castle within 2 hours by train from London

Page 7: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

How to specify a query?

• Search Method (e.g., Google) –– Just needs to search for the ‘keyword’ in a set of

documents– Get list of documents and post-process (rank,

cluster, classify, etc.)– In an Integration scenario, this doesn’t work

• ‘bank’ – D1• ‘University of Texas, Arlington’ – D2• 1 (out of 1 mile) is ignored• Intersecting documents returned will not generate

results desired

Page 8: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

How to specify a query?

• Database Method (e.g., SQL) –– Too rigid– Need to know database (or source) and its

corresponding attributes• SELECT T1.a1, T2.a2 FROM T1, T2 WHERE …

– Web is not organized as a database hence exact mapping between sources and attributes is not feasible and not available

Page 9: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

How to specify a query?

• Natural Language –– Ideal mechanism– Inherently hard considering ambiguities of natural

language.– school – institution for education; group of fish– Mechanisms such as Question-Answering

frameworks focus on sophisticated language models built for specific domains independently.

– Incorrect assumption in a integration scenario

Page 10: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Query Specification

• Query – “castle near London”

List of documents retrieved from Web containing text – “castle near London”

Relation containing tuples

SELECT castle.name, …FROM castle_DBWHERECastle.location = ‘London’

Page 11: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Query Specification

• Query – “castle near London”

InformationIntegration

No idea about source, schema, attributes, etc.No idea about how to pose a query

No idea about user intent –Castle:= building, move in chess, … ?

SELECT castle.*WHERE castle.place = “London”

Page 12: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Proposed Approach

• Approach: refine-as-you-input• Approach: verify-after-input

Amount of input

Number of Interactions

(verify-after-input)

(refine-as-you-input)

1

2

3

4

Succinct (keyword)

Verbose (natural language, SQL)Ideal Formulation

Page 13: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Approach 1: Refine-as-you-input

• Based on most popular paradigm of querying used today – keyword search– Input: Set of keywords/concepts (e.g., castle, train, …)– Output: Set of 1 or more Precise Structured Query– Challenge:

• Keyword Resolution – entity, attribute, value?• Generating Query from minimal information

– Problem:• Could result in too many non-relevant queries

– Positive:• Paradigm accepted by Web, IR and even DB community !!!

Page 14: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Approach 1: Refine-as-you-input

User Interaction

Page 15: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Approach 1: Verify-after-input

• Based on a rigorous method of formulation queries – similar to SQL– Input – user filled template based– Output – single precise query– Problem

• Users don’t like filling too many details• Coming up with a unique template across domains

– Positive • Less ambiguous• Reduced number of user interactions

Page 16: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Approach 1: Verify-after-input

Page 17: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Evaluation Plan

• Testing the approaches on RDBMS where the schema and output is known

• Actual user studies

Page 18: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Related Work

Page 19: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Future Work

• Perform extensive experiments to prove the validity of the proposed approaches

• Address other issues in information integration

• Current focus – Ranking [Telang:DBRank’07]

Page 20: Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang Sharma Chakravarthy, Chengkai Li

Thank You !