Post on 20-Dec-2015
Spring 2005, Specification and Analysis of Information Systems
Towards Service Retrieval on the Web
Eran Toch*, Iris Reinhartz-Berger, Dov Dori, and Avigdor Gal
IBM Research Seminar Haifa, Feb 2007
The Technion
Haifa University
* Supported by the Levi Eshkol grant from Israel’s Ministry of Science
6
Our Definition
Service
• Interface: inputs, outputs (WSDL, Web pages)
• Composed with other services
• Quality: price, speed, freshness
9
The Semantics-based Approach
• Based on Web ontologies
– [Berners-Lee 2001]
• Service modeling languages (OWL-S)
• Logic-based inference to compose services
– [Paolucci 2002]
– [Patil 2004]
[Figure: Google Maps described by its preconditions, input and output.]
10
Drawbacks
• Approximation
– Logic inference yields low levels of approximation, so recall is low
• Tagging
– Tagging is difficult
– Agreement on mutual concepts is difficult
• Query Language
– Requires formal concepts
• Performance
– Inference takes time
11
Current Approaches
[Figure: current approaches placed on two axes, Precise to Approximated and Human Oriented to Machine Oriented. Labels: Text-based, Semantics-based, "We are here", "We want to be here".]
12
Agenda
1. Background
2. Approximated Service Retrieval
3. Results
1. OPOSSUM   2. Approximation   3. Indexing
[Figure: concept index table with columns Operation, Role, Concept, Certainty. Operations shown: Inform Hospital, Find Nearest Medical Center. Concepts shown: Hospital, Medical Center, Diagnosed Symptom, Mount Sinai, GPS Position, City. Roles: in, out. Certainty values: 1, 0.5, 0.4.]
13
Usability in Service Retrieval
• Simple query language
– No formal concepts
• Approximation
– Flexible results
• Compositions
– Retrieval of compositions, components
• Ranking
– Best matches on top
• Fast
– Sub-linear processing time
15
Architecture
[Figure: OPOSSUM architecture. Components: Crawler, Concepts Index, Composition Index, Query Evaluator. Actors and inputs: User; Service Description (WSDL, OWL-S, Web sources); Domain Knowledge (Ontologies).]
16
Queries
“drug”
“treatment drug”
“medical treatment map or address”
“drug chickenpox price:0-5$”
“input:treatment output:drug”
“treatment provider:mount sinai hospital”
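The example queries above mix free keywords, an "or" between alternatives, and field:value constraints. The following is a minimal sketch of how such strings might be split into parts; it is not the OPOSSUM parser. The field list, the function names `parse_query` and `expand_disjunctions`, and the rule that a field value runs until the next field prefix are all assumptions made for illustration.

```python
import re
from itertools import product

# Hypothetical field names, taken from the example queries on the slide.
FIELDS = ("input", "output", "price", "provider")

def parse_query(text):
    """Split a query into free keywords and field:value constraints.

    Assumption: a field value runs until the next field prefix or the end
    of the query, so "provider:mount sinai hospital" is a single value.
    """
    pattern = r"\b(" + "|".join(FIELDS) + r"):"
    # With a capturing group, re.split keeps the field names in the result:
    # [free text, field1, value1, field2, value2, ...]
    parts = re.split(pattern, text.strip())
    keywords = parts[0].split()
    constraints = {}
    for name, value in zip(parts[1::2], parts[2::2]):
        constraints[name] = value.strip()
    return keywords, constraints

def expand_disjunctions(keywords):
    """Expand "a or b" between neighbouring keywords into alternatives."""
    slots = []
    i = 0
    while i < len(keywords):
        word = keywords[i]
        if word == "or" and slots and i + 1 < len(keywords):
            # Attach the word after "or" as an alternative to the previous slot.
            slots[-1].append(keywords[i + 1])
            i += 2
        else:
            slots.append([word])
            i += 1
    return [list(choice) for choice in product(*slots)]
```

Under these assumptions, “medical treatment map or address” expands into two keyword lists, one ending in "map" and one ending in "address". Grouping "medical treatment" into a single concept would need the ontology and is not attempted here.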
17
Services and Queries
Service Network: an ontology-labeled, directed graph
Query: an ontology-labeled, rooted, directed tree
[Figure: query tree for “medical treatment map or address” with nodes medical treatment, Address and Map; service network with nodes Insurance Treatment Locator, Government Treatment Locator, Yellow pages, Hospital Info Service, Google Maps and Yahoo Maps.]
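As an illustration of the service network as a directed graph, the sketch below stores the operations named on the slide in an adjacency map and enumerates ordered operation sequences starting from one operation. The edges are invented for the example, and `compositions` and `max_length` are hypothetical names; the ontology labels on nodes and edges are left out.

```python
# Hypothetical service network: each operation maps to the operations that
# can consume its output. The edges are invented for illustration.
SERVICE_NETWORK = {
    "Insurance Treatment Locator": ["Yellow pages"],
    "Government Treatment Locator": ["Yellow pages"],
    "Yellow pages": ["Google Maps", "Hospital Info Service"],
    "Hospital Info Service": ["Yahoo Maps"],
    "Google Maps": [],
    "Yahoo Maps": [],
}

def compositions(network, start, max_length=4):
    """Enumerate simple paths (ordered operation sequences) from start."""
    results = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        results.append(path)
        if len(path) < max_length:
            for nxt in network.get(path[-1], []):
                if nxt not in path:  # keep paths simple (no repeated operation)
                    stack.append(path + [nxt])
    return results
```

Each returned path is a candidate sequence of operations; scoring the candidates against a query is a separate step.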
18
Results
• V – a virtual service
– a fully ordered sequence of operations
Insurance Treatment Locator, Yellow pages: 0.9
Insurance Treatment Locator, Yellow pages, Google Maps: 0.8
Insurance Treatment Locator, Yellow pages, Hospital Info Service, Yahoo Maps: 0.4
21
Service Graph Construction
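[Figure: service graph with the operations Insurance Treatment Locator, Government Treatment Locator, Yellow pages, Hospital Info Service, Google Maps and Yahoo Maps, connected through the concepts Hospital, Location and medical treatment. Annotations: Price = free, c = 0.7, c = 1.]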
22
Probability-based Ontologies
[Figure: ontology fragment with the concepts Hospital and Business and the instances Carmel and Rothschild. Properties shown: Name, Address, Departments, Treatments. Similarity relations carry certainty values: “is a” (c = 0.9), “near by”, and edges labeled c = 0.7 and c = 0.8.]
23
μ-satisfiability
• V satisfies the requirements of Q with a certainty in [0, 1]
• μ-satisfiability quantifies approximation using a single numerical value
– Easy to calculate, but not always accurate
Concept level
Operation level
Composition level
24
Concept Level Approximation
Each query concept, q_i, is matched against the concepts index.
Example: Business, Hospital?
25
Concept Approximation
[Figure: the concepts Hospital and Business, with the operations Get Hospital Address and Yellow Pages (Get business address).]
26
Semantic Distance
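[Figure: concept a and concept b, each connected by a path to concept x, within a similarity radius.]
Distance function, distance coefficient:
$$dis(a,b) = \frac{1}{\log_2(path)}$$
Path (semantic edge number / certainty):
$$path = \sum_{i \in a \to x} \frac{1}{c(i)} + \sum_{j \in b \to x} \frac{1}{c(j)}$$
c(i), c(j) – certainty value on edges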
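A small worked sketch of the slide's distance, reading the formula as dis(a, b) = 1 / log2(path), where path sums 1/c over the edges from a to x and from b to x. This reading of the garbled formula is an assumption, as are the function names and the choice to raise an error when log2(path) is not positive.

```python
import math

def path_weight(certainties_a_to_x, certainties_b_to_x):
    """Sum of 1/c over the edges from a to x and from b to x."""
    edges = list(certainties_a_to_x) + list(certainties_b_to_x)
    return sum(1.0 / c for c in edges)

def dis(certainties_a_to_x, certainties_b_to_x):
    """1 / log2(path), as read from the slide (an assumption)."""
    path = path_weight(certainties_a_to_x, certainties_b_to_x)
    if path <= 1.0:
        # log2(path) would be zero or negative, so the value is left undefined.
        raise ValueError("log2(path) is not positive; value undefined")
    return 1.0 / math.log2(path)
```

For example, one edge with certainty 0.5 plus one edge with certainty 1 gives path = 2 + 1 = 3. Lower certainty values make the path heavier, so a less certain edge counts as more than one hop.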
27
Operation Level Matching
A query, Q, is matched against an operation
[Figure: the query (medical treatment, Address, Map) compared with the operation Get Hospital List (City, Treatment, Address, Name).]
29
Composition Level Matching
A query is matched against a composition of operations
[Figure: the query tree (medical treatment, Address, Map) compared with a composition of operations.]
30
Composition Certainty
Shorter compositions have higher certainty
[Figure: example compositions whose edges are labeled c = 1, c = 0.8, c = 1, c = 1 and c = 0.2.]
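One way to see why longer compositions tend to score lower: if the certainty of a composition is taken to be the product of its edge certainties, then appending an edge with c ≤ 1 can never raise the score. The product rule is an assumption for this sketch; the slide does not state how the edge values are combined, and `composition_certainty` is a hypothetical name.

```python
from math import prod

def composition_certainty(edge_certainties):
    """Product of the edge certainties (assumed combination rule)."""
    return prod(edge_certainties)

# Extending a composition by one more edge never increases the product,
# because every certainty value lies in [0, 1].
short = composition_certainty([1.0, 0.8])
longer = composition_certainty([1.0, 0.8, 0.9])
```

Here `short` is 0.8 and `longer` is about 0.72.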
31
Approximation By Composition
[Figure: the operation Map by Address is approximated by the composition Yellow Pages + Map by Zip. Concepts shown: Map, Zip.]
32
Partial Matching
[Figure: partial match involving the operations Address to Zip Converter, Map by Zip, Hospital Locator and Hospital Address Finder.]
33
Excessive Matching
[Figure: excessive match involving the operations Get side effect, Get treatment info, Get drug price and Order drug.]
35
Quantifying Structural Approximation
• An approximated service is defined as:
• We define the structural satisfiability as:
“Full” Virtual Service
Approximated Service
Graph edit distance
Satisfiability of query components
36
Ranking
Q = “medical treatment map or address”
Insurance Treatment Locator, Yellow pages: 0.9
Insurance Treatment Locator, Yellow pages, Google Maps: 0.8
Insurance Treatment Locator, Yellow pages, Hospital Info Service, Yahoo Maps: 0.4
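The ranking step itself can be sketched as a sort on the certainty value, best match first. The result tuples below copy the three results and scores shown on the slide; the `rank` function and the (operations, certainty) tuple layout are assumptions.

```python
# (ordered operations, certainty) pairs, copied from the slide's example.
RESULTS = [
    (["Insurance Treatment Locator", "Yellow pages",
      "Hospital Info Service", "Yahoo Maps"], 0.4),
    (["Insurance Treatment Locator", "Yellow pages"], 0.9),
    (["Insurance Treatment Locator", "Yellow pages", "Google Maps"], 0.8),
]

def rank(results):
    """Return the results ordered by certainty, highest first."""
    return sorted(results, key=lambda result: result[1], reverse=True)
```

Python's `sorted` is stable, so results with equal certainty keep their original relative order.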
38
Concept Index
The size of the index is determined by the semantic radius
[Figure: concept index table with columns Concept, Role, Certainty, Operation. Operations shown: Insurance Treatment Locator, Google Maps. Concepts shown: Treatment, Drug, Symptom, Address, Map, Zip code, Hospital. Roles: in, out. Certainty values: 1, 0.5, 0.4.]
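To illustrate how the semantic radius drives index size, the sketch below builds an inverted index from concepts to (operation, role, certainty) entries and also indexes each operation under related concepts up to `radius` ontology hops away. The operation list, the ontology edges, and the rule of multiplying certainties along a path are hypothetical choices for this example, not the structure used by OPOSSUM.

```python
from collections import defaultdict

# Hypothetical operation descriptions: (operation, role, concept).
OPERATIONS = [
    ("Insurance Treatment Locator", "in", "Treatment"),
    ("Insurance Treatment Locator", "out", "Hospital"),
    ("Google Maps", "in", "Address"),
    ("Google Maps", "out", "Map"),
]

# Hypothetical ontology edges: concept -> [(related concept, certainty)].
RELATED = {
    "Treatment": [("Drug", 0.5), ("Symptom", 0.5)],
    "Address": [("Zip code", 0.4)],
}

def build_concept_index(operations, related, radius=1):
    """Index each operation under its concept and under related concepts."""
    index = defaultdict(list)
    for operation, role, concept in operations:
        frontier = [(concept, 1.0)]
        seen = {concept}
        # Pass 0 indexes the exact concept; each further pass moves one hop out.
        for _ in range(radius + 1):
            next_frontier = []
            for current, certainty in frontier:
                index[current].append((operation, role, certainty))
                for neighbour, c in related.get(current, []):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        next_frontier.append((neighbour, certainty * c))
            frontier = next_frontier
    return dict(index)
```

With `radius=0` only the exact concepts are indexed; with `radius=1` the same operations also appear under Drug, Symptom and Zip code, so the index grows with the radius.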
39
Composition Complexity
• We know that
– The Service Network is a graph
– Q is a graph
– Evaluating a query Q on the Service Network is subgraph isomorphism
• Which is NP-Complete
– In the general case
• Semantic Peer-to-peer Routing
– [Schlosser 2002]
– [Ben-Asher 2006]
40
Hierarchical Concept Clustering
[Figure: operations grouped into concept clusters: Hospital availability, Hospital Info Service, Hospital Locator, Yahoo Maps, Google Maps, Zip Finder, Update medical records, Get Physician Address, Get patient address.]
41
Multi-dimensional clustering
• Hypercubes
• Shortest path between the two most distant nodes = $\log_b N$
– N – number of nodes
– b – hypercube base
[Figure: hypercube with numbered nodes and edges. Labels: Yahoo Maps, Google Maps, Zip Finder, Geography, Yellow Pages, Business.]
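The log_b N bound can be checked on a small example. The sketch assumes a generalized hypercube in which each node is a string of base-b digits and two nodes are neighbours when they differ in exactly one digit; under that assumption the hop count between two nodes is the number of differing digits. The function names are hypothetical.

```python
def to_digits(node, base, dims):
    """Write a node id as a list of `dims` base-`base` digits."""
    digits = []
    for _ in range(dims):
        node, digit = divmod(node, base)
        digits.append(digit)
    return digits

def hops(a, b, base, dims):
    """Hop count between two nodes: number of digits in which they differ."""
    da, db = to_digits(a, base, dims), to_digits(b, base, dims)
    return sum(x != y for x, y in zip(da, db))

def diameter(base, dims):
    """Largest hop count from node 0 to any of the base**dims nodes."""
    n = base ** dims
    return max(hops(0, other, base, dims) for other in range(n))
```

With base 2 and 3 dimensions there are N = 8 nodes and the most distant pair is 3 hops apart, which equals log_2 8. Measuring from node 0 is enough because this graph looks the same from every node.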
42
Query Evaluation Complexity
• D - query disjunctions
• OP – retrieved operations
• - number of results
• N - number of operations in the service network
• b - hypercube base
45
Performance
Query                                  OWLS-MX  OPOSSUM  Ratio
hospital investigating                    1710       33     52
book price                                1647       35     48
country skilled occupation                1742       20     88
car price service                         1682       15    113
geopolitical entity weather process       1364       27     51
government degree scholarship             1782       32     56
novel author                              1662       40     42
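The Ratio column can be checked arithmetically. Reading each row as a four-digit OWLS-MX value followed by a two-digit OPOSSUM value, every listed ratio matches the OWLS-MX value divided by the OPOSSUM value, rounded up. The rounding rule is inferred from the numbers, not stated on the slide, and the `ratio` helper is a hypothetical name.

```python
import math

# (query, OWLS-MX, OPOSSUM, ratio as listed on the slide)
ROWS = [
    ("hospital investigating", 1710, 33, 52),
    ("book price", 1647, 35, 48),
    ("country skilled occupation", 1742, 20, 88),
    ("car price service", 1682, 15, 113),
    ("geopolitical entity weather process", 1364, 27, 51),
    ("government degree scholarship", 1782, 32, 56),
    ("novel author", 1662, 40, 42),
]

def ratio(owls_mx, opossum):
    """OWLS-MX value divided by OPOSSUM value, rounded up (inferred rule)."""
    return math.ceil(owls_mx / opossum)

# Queries whose listed ratio differs from the recomputed one (none expected).
mismatches = [q for q, a, b, r in ROWS if ratio(a, b) != r]
```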
46
Scalability
[Figure: processing time (in ms, axis 0 to 60) against number of services (axis 0 to 3000), with a trendline.]
Service Generator
– P – number of parameters
– nc – create new concept
– c – concept mapping
Simulated service
47
Indexing
[Figure: number of index entries (axis 0 to 20000) against number of service entries (axis 0 to 600).]
48
Domains and Approximation
[Figure: approximation factor (axis 0 to 2000) against domain size & connectivity (axis 0 to 2500).]
50
References
[Schlosser 2002] Mario T. Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: HyperCuP - Hypercubes, Ontologies, and Efficient Search on Peer-to-Peer Networks. AP2PC 2002: 112-124
[Ben-Asher 2006] Yosi Ben-Asher, Shlomo Berkovsky: Semantic Data Management in Peer-to-Peer E-Commerce Applications. J. Data Semantics VI: 115-142 (2006)
[Berners-Lee 2001] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. Scientific American, 284(5), 2001, pp. 34-43.
[Patil 2004] A. Patil, S. Oundhakar, A. Sheth, K. Verma: METEOR-S Web Service Annotation Framework. In Proceedings of WWW 2004, pages 553–562, New York, NY, May 2004.
[Paolucci 2002] Massimo Paolucci, Takahiro Kawamura, Terry R. Payne, Katia P. Sycara: Semantic Matching of Web Services Capabilities. In International Semantic Web Conference, pages 333–347, 2002.
[Toch 2006] Eran Toch, Iris Reinhartz-Berger, Avigdor Gal, Dov Dori: OPOSSUM: Bridging the Gap between Web Services and the Semantic Web. Proceedings of NGITS 2006, pp. 357-358.