a comparison of sawsdl based semantic web service discovery algorithms

A COMPARISON OF SAWSDL BASED SEMANTIC WEB SERVICE DISCOVERY

ALGORITHMS

by

SHIVA SANDEEP GARLAPATI

(Under the Direction of John A. Miller)

ABSTRACT

The advent of Web Services has revolutionized the way web-based applications communicate. However, an ongoing problem has been how to discover the appropriate web services. Early discovery techniques were based purely on the syntactic aspects of web services. This approach is often inadequate because it does not consider the semantics of the web service. Hence, there has been significant research on semantic web service discovery. In this thesis, we compare and analyze three SAWSDL based semantic web service discovery algorithms, namely SAWSDL-MX, MWSDI and TVERSKY.

INDEX TERMS: Semantic Web Service, SAWSDL, Ontology, Web Service discovery and

WSDL.




B.E., ANNA UNIVERSITY, INDIA, 2007

A Thesis Submitted to the Graduate Faculty of The University of Georgia in Partial

Fulfillment of the Requirements for the Degree

MASTER OF SCIENCE

ATHENS, GEORGIA

2010




Major Professor: John A. Miller

Committee: Krzysztof J. Kochut

Thiab R. Taha

Electronic Version Approved:

Maureen Grasso

Dean of the Graduate School

The University of Georgia

August 2010


DEDICATION

To my parents, my brother, my relatives and all my friends.


ACKNOWLEDGEMENTS

I thank my professor Dr. John A. Miller for the encouragement and support

provided throughout.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

1. INTRODUCTION

2. BACKGROUND

      SEMANTIC WEB SERVICES

      SAWSDL

3. DISCOVERY ALGORITHMS

      SAWSDL-MX MATCHMAKER

      TVERSKY MODEL BASED MATCHING ALGORITHM

      METEOR-S DISCOVERY MATCHING ALGORITHM

4. EVALUATION AND DISCUSSION OF RESULTS

      EVALUATION OF ALGORITHMS BASED ON RESULTS

5. CONCLUSION AND FUTURE WORK

REFERENCES

APPENDICES

A. APPENDIX


LIST OF TABLES

Table 3-1: Tversky model Similarity Scores for common ontology

Table 3-2: Similarity of Operations

Table 3-3: Concept Similarity

Table 3-4: Property Similarity

Table 4-1: Precision, Recall and F-Measure for SAWSDL-M0

Table 4-6: Precision, Recall and F-Measure for TVERSKY

Table 4-7: Precision, Recall and F-Measure for MWSDI

Table 4-8: Average Precision, Recall and F-Measure for SAWSDL

Table 4-9: Syntactic similarity using Extended Jaccard measure


LIST OF FIGURES

Figure 4-1: Precision Graph for SAWSDL-M0, TVERSKY, MWSDI and Average SAWSDL-MX Hybrid

Figure 4-2: Recall Graph for SAWSDL-M0, TVERSKY, MWSDI and Average SAWSDL-MX Hybrid

Figure 4-3: F-Measure Graph for SAWSDL-M0, TVERSKY, MWSDI and Average SAWSDL-MX Hybrid

Figure A-1: novel_author_service.wsdl

Figure A-2: novel_author_service.wsdl (continued)

Figure A-3: surfing_destination_service.wsdl

Figure A-4: surfing_destination_service.wsdl (continued)

Figure A-5: novel_authorbook-type_service.wsdl

Figure A-6: novel_authorbook-type_service.wsdl (continued)

Figure A-7: activity_destination_service.wsdl

Figure A-8: activity_destination_service.wsdl (continued)

Figure A-9: activity_beach_service.wsdl

Figure A-10: activity_beach_service.wsdl (continued)


CHAPTER 1

INTRODUCTION

Web services have revolutionized the way web-based applications communicate.

The World Wide Web Consortium (W3C) defines a Web Service as "a software system

designed to support interoperable machine-to-machine interaction over a network. It has

an interface described in a machine-processable format (specifically Web Services

Description Language WSDL [4]). Other systems interact with the web service in a

manner prescribed by its description using SOAP [3] messages, typically conveyed using

HTTP with an XML serialization in conjunction with other web-related standards." [1].

Web services provide a way for web-based applications to interact using the Extensible Markup Language (XML) [2], the Simple Object Access Protocol (SOAP) and the Web Services Description Language (WSDL). SOAP is a protocol for exchanging information (messages in XML) between web services and usually relies on other protocols (for example, HTTP [5]) for transport. Web services are described using an XML language called WSDL. Lately, the emphasis has been moving towards Representational State Transfer (REST) [6] services, which do not require SOAP but use HTTP or similar protocols directly.

Universal Description, Discovery and Integration (UDDI) [7] is an XML based registry to publish and discover web services (the OASIS UDDI technical specification committee has been closed and there has been no further development after 2007 [8]). The


UDDI registry enables service providers to publish their web services to the world and service requesters to find and locate appropriate web services. Web service discovery [9, 10] by UDDI is purely keyword and taxonomy based, which is often not adequate in terms of correctness. A better way to enhance web service discovery is to add semantics to service descriptions and to use algorithms or mechanisms that perform semantic matching. Many approaches that add such semantics have emerged, such as OWL-S [11], WSMO [12], WSDL-S [13] and SAWSDL [14]. Semantics can be added by annotating the WSDL [15] with references to concepts in ontologies. An ontology [16] is a formal representation of a set of concepts within a domain and the relationships between those concepts. The set of WSDL extensions that supports such annotations is called Semantic Annotations for WSDL and XML Schema (SAWSDL).

This thesis focuses on three SAWSDL based semantic web service discovery

algorithms. SAWSDL-MX [17] is a hybrid semantic matchmaker which uses logic-based

and text-based similarity information to determine the match. The Tversky [18] model

finds the similarity match by identifying the relationship between ontological concepts

and the similarity of properties corresponding to them. The MWSDI [19] discovery

algorithm is based on finding the similarity match between the request and service by

comparing the ontological concepts of the operation, inputs and outputs.

The thesis is structured as follows. Chapter 2 briefly explains what semantic web services are and how to semantically annotate web services with SAWSDL. Chapter 3 discusses each of the three algorithms in detail. Chapter 4 evaluates each of the discovery engines and compares the precision, recall and F-Measure obtained in the experiments. Chapter 5 presents the conclusions and future work.


CHAPTER 2

BACKGROUND

2.1 SEMANTIC WEB SERVICES

Semantic Web Services (SWS) [20] extend traditional web services by adding references to ontologies. This makes the web services more machine-understandable, since the ontologies provide a shared, standardized vocabulary. The semantic annotations help make the description of a web service unambiguous. When multiple services use common ontologies, they follow the same vocabulary, which allows machines to perform automated discovery and composition of web services. SAWSDL is one of the approaches to semantically annotate web services. It is discussed in detail below.

2.2 SAWSDL

SAWSDL is an extension of WSDL that adds semantics to service descriptions. This is achieved through model references and schema mappings. A modelReference points to a concept in an ontology that carries the intended meaning. A model reference is commonly added to interfaces, operations, faults, elements, simple types and complex types. Each of them may carry multiple references to multiple ontologies. A schema mapping is used to tackle mismatches of data between the request and the service. The schema mappings are of two types. The liftingSchemaMapping describes the transformation from the lower level (XML) to the upper level (ontology), and the loweringSchemaMapping describes the transformation from the upper level to the lower level. The model references help in automated discovery of services, while the schema mappings help in automated service execution.

Here is an example from a SAWSDL file [21]. The model references and the schema mapping appear as the sawsdl:modelReference and sawsdl:liftingSchemaMapping attributes.

<xsd:element name="Author" type="AuthorType"/>
<xsd:element name="Novel"
    sawsdl:liftingSchemaMapping="http://127.0.0.1/services/liftingSchemaMappings/novel_author_service_Novel_liftingSchemaMapping.xslt"
    type="NovelType"/>
<xsd:complexType name="NovelType"
    sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Novel">
  <xsd:sequence>
    <xsd:element name="hasSize" type="Medium"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="Medium"
    sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Medium">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
<xsd:simpleType name="AuthorType"
    sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Author">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
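As an illustration of how a discovery engine might read these annotations, the following Python sketch collects the model references from a schema fragment like the one above. It is not taken from any of the three implementations. The enclosing xsd:schema element, the namespace declarations and the function name model_references are assumptions added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C SAWSDL recommendation.
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Assumption: the fragment is wrapped in an xsd:schema element that declares
# the namespaces; the thesis excerpt shows only the inner declarations.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                        xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xsd:element name="Author" type="AuthorType"/>
  <xsd:complexType name="NovelType"
      sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Novel">
    <xsd:sequence>
      <xsd:element name="hasSize" type="Medium"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:simpleType name="AuthorType"
      sawsdl:modelReference="http://127.0.0.1/ontology/books.owl#Author">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
</xsd:schema>"""


def model_references(xml_text):
    """Map each annotated schema component name to its list of concept URIs."""
    root = ET.fromstring(xml_text)
    attr = "{%s}modelReference" % SAWSDL_NS
    refs = {}
    for node in root.iter():
        value = node.get(attr)
        if value:
            # One attribute may hold several whitespace-separated URIs.
            refs[node.get("name")] = value.split()
    return refs


if __name__ == "__main__":
    for name, uris in model_references(SCHEMA).items():
        print(name, uris)
```

The concept URIs returned here are what the matching algorithms of Chapter 3 compare against the ontology.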


CHAPTER 3

DISCOVERY ALGORITHMS

The main feature of these semantic matching algorithms is the degree of match (match score) between the operations, inputs and outputs of requests and services. The semantic annotations for each of these are taken and the relationships between the annotated concepts in the ontology are considered to find the degree of match. Another main feature is how the services are ranked once they are discovered.

In the following sections, we discuss the three discovery algorithms in detail.

3.1 SAWSDL-MX MATCHMAKER

SAWSDL-MX is a hybrid semantic service matchmaker. There are multiple variants, namely SAWSDL-MX1, SAWSDL-M0+WA and SAWSDL-MX2. SAWSDL-MX1 is logic based, with five possible degrees of match: Exact, Plug-in, Subsumes, Subsumed-by and Fail. It then applies text similarity filters to rank the services with the same degree of match.

SAWSDL-M0+WA is very similar to SAWSDL-MX1 and does the same logic based matching, but ranks the services using the WSDL Analyzer (WA) tool [22], which calculates the structural similarity of the entire WSDL. SAWSDL-MX2 computes all three kinds of matching: logic based, text similarity and structural similarity. Moreover, it calculates the ranking using a machine learning approach, a binary SVM (Support Vector Machine) [23] classifier, which ranks the services given a training set of services relevant to the request. However, the WSDL Analyzer tool and SVM-based classifiers are beyond the scope of this thesis. SAWSDL-MX1 is the only variant that relies solely on the semantic annotations in the SAWSDL and on text similarities, and it is described in more detail below.

LOGIC-BASED MATCHING

The SAWSDL-MX matchmaker uses the terminology of OWL-DL. ≡ and ⊑ denote concept equivalence and concept subsumption, respectively. Let O be the ontology used by the matchmaker and H(O) be the concept hierarchy/taxonomy in the ontology. Then, for concepts C, D ∈ H(O),

D ∈ LSC(C) if D ⊑ C and ∄ E s.t. D ⊑ E ⊑ C

where LSC(C) is the set of immediate sub-concepts of C. Similarly, LGC(C) is the set of immediate super-concepts of C.

Let R.I be the set of input concepts of the request R and S.I be the set of input concepts of the service S. Similarly, R.O is the set of output concepts of R and S.O is the set of output concepts of S. For simplicity, let us assume for the moment that all the lists of inputs and outputs are singletons. In this notation, the types of match supported by SAWSDL-MX are given below:

Exact match: R exactly matches S.

    o R.I ≡ S.I ∧ R.O ≡ S.O

Plug-in match: S plugs into R.

    o R.I ⊑ S.I ∧ S.O ∈ LSC(R.O)

Subsumes match: R subsumes S.

    o R.I ⊑ S.I ∧ S.O ⊑ R.O

Subsumed-by match: R is subsumed by S.

    o R.I ⊑ S.I ∧ (R.O ≡ S.O ∨ S.O ∈ LGC(R.O))

Fail: S fails to match R according to the logic based semantic filter criteria.

These types of match were originally designed for the OWLS-MX matchmaker [24]. Services are ranked by degree of match in the order listed above. This purely logic based variant is called SAWSDL-M0. Services with the same degree of match are ranked in random order the first time, but follow the same ranking principle thereafter.
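The logic-based filters can be sketched in Python for the singleton case described above. This is a minimal illustration, not the SAWSDL-MX implementation. It assumes a tree-shaped toy taxonomy (PARENT) with invented concept names in place of an OWL-DL reasoner, and it applies the filters in ranking order.

```python
from enum import IntEnum


class Match(IntEnum):
    """Degrees of match, ordered so that a higher value ranks better."""
    FAIL = 0
    SUBSUMED_BY = 1
    SUBSUMES = 2
    PLUG_IN = 3
    EXACT = 4


# Hypothetical taxonomy: child concept -> immediate parent concept.
PARENT = {
    "Novel": "Book",
    "Book": "Publication",
    "Author": "Person",
}


def subsumed_by(d, c):
    """True if d ⊑ c: d equals c or c is an ancestor of d."""
    while d != c and d in PARENT:
        d = PARENT[d]
    return d == c


def lsc(c):
    """Immediate sub-concepts of c."""
    return {child for child, parent in PARENT.items() if parent == c}


def lgc(c):
    """Immediate super-concepts of c (at most one in a tree)."""
    return {PARENT[c]} if c in PARENT else set()


def degree_of_match(r_in, r_out, s_in, s_out):
    """Degree of match between request R and service S (singleton I/O)."""
    if r_in == s_in and r_out == s_out:
        return Match.EXACT
    if not subsumed_by(r_in, s_in):        # every other filter needs R.I ⊑ S.I
        return Match.FAIL
    if s_out in lsc(r_out):
        return Match.PLUG_IN
    if subsumed_by(s_out, r_out):
        return Match.SUBSUMES
    if s_out == r_out or s_out in lgc(r_out):
        return Match.SUBSUMED_BY
    return Match.FAIL


if __name__ == "__main__":
    print(degree_of_match("Novel", "Person", "Book", "Author").name)
```

Because Match is an IntEnum, sorting candidate services by the returned value in descending order gives the ranking by degree of match.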

TEXT SIMILARITY MATCHING

SAWSDL-MX calculates the text similarity of the request and service using one of the following token-based similarity measures: Loss-of-Information [24], Extended Jaccard [34], Cosine [35] and Jensen-Shannon [36]. To calculate the text similarity, the concepts from both the request and the service, such as the input concepts and output concepts from the ontology, are unfolded into term vectors, as in the vector space model of information retrieval. Each term is given a tf-idf (term frequency-inverse document frequency) [25] weight based on whether it belongs to an input or an output. Each pair of inputs and each pair of outputs is compared, and the average is taken as the text similarity for the request and service pair.

The SAWSDL-(M1-M4) variants use hybrid matching, which calculates the degree of match using the logic-based matching and then ranks the services based on one of these text similarity measures: Loss-of-Information (M1), Extended Jaccard (M2), Cosine (M3) or Jensen-Shannon (M4).
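Two of these measures, Cosine and Extended Jaccard, have compact standard definitions over weighted term vectors and can be sketched as follows. The vectors are represented as dictionaries from term to weight. The terms and weights in the example are invented, and the tf-idf weighting itself is not computed here.

```python
import math


def dot(a, b):
    """Dot product of two sparse term-weight vectors (dict: term -> weight)."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def cosine(a, b):
    """Cosine similarity: a.b / (|a| * |b|)."""
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


def extended_jaccard(a, b):
    """Extended Jaccard similarity: a.b / (|a|^2 + |b|^2 - a.b)."""
    ab = dot(a, b)
    denominator = dot(a, a) + dot(b, b) - ab
    return ab / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical unfolded concept terms with made-up weights.
    request = {"novel": 1.0, "book": 1.0}
    service = {"novel": 1.0, "author": 1.0}
    print(cosine(request, service), extended_jaccard(request, service))
```

Both functions return 1 for identical vectors and 0 for vectors with no terms in common, so either can be used to order services that share the same logic-based degree of match.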

3.2 TVERSKY MODEL BASED MATCHING ALGORITHM

The Tversky algorithm uses the Tversky model [27] to calculate the degree of

match between services. The services are annotated in SAWSDL. The approach is a

similarity based model in which the match score is calculated by comparing the input

concepts, output concepts and functionality concepts of the requests and the services

referenced in the ontologies.

MATCHING ALGORITHM

The Tversky model matches services by individually finding SimI (Similarity of Inputs), SimO (Similarity of Outputs) and SimF (Similarity of Functionality). Each of these analyzes the number of common properties (which may be inherited) between the pairs of input, output and functionality concepts of the request R and the service S, as conceptualized in the ontology. We say that two concepts R and S have a common property if, for a property p of R and a property q of S, the weighted average of the name match and the range match is greater than a threshold α, which is chosen arbitrarily. This is discussed in more detail later in the section. Given a concept C, p(C) is defined as the set of properties of the concept C in the ontology.

MATCHING BASED ON COMMON ONTOLOGY

Case 1: For SimI(R, S) (refer to Table 3-1), the similarity or degree of match score for the input is: (a) 1 if R.I ≡ S.I; the two concepts are said to be equivalent if R.I ⊑ S.I ∧ S.I ⊑ R.I, i.e., both concepts subsume each other. (b) 1 if R.I ⊑ S.I; instances of R.I are necessarily instances of S.I, in other words R.I has a subclass-of relationship with S.I. (c) The ratio of the number of common properties to the number of properties of S.I, if S.I ⊑ R.I, i.e., the request concept subsumes the service concept; in this case p(R.I) ∩ p(S.I) = p(R.I). (d) The ratio of the number of common properties to the number of properties of S.I, if R.I ∩ S.I ≠ ∅, i.e., the two intersecting concepts should have at least one property in common.

Table 3-1: Tversky model Similarity Scores for common ontology

Input/Output/Functionality   Match Score                      Condition

SimI(R,S)                    1                                R.I ≡ S.I
                             1                                R.I ⊑ S.I
                             |p(R.I)| / |p(S.I)|              S.I ⊑ R.I
                             |p(R.I) ∩ p(S.I)| / |p(S.I)|     R.I ∩ S.I ≠ ∅

SimO(R,S)                    1                                R.O ≡ S.O
                             |p(S.O)| / |p(R.O)|              R.O ⊑ S.O
                             1                                S.O ⊑ R.O
                             |p(R.O) ∩ p(S.O)| / |p(R.O)|     R.O ∩ S.O ≠ ∅

SimF(R,S)                    1                                R.F ≡ S.F
                             |p(S.F)| / |p(R.F)|              R.F ⊑ S.F
                             1                                S.F ⊑ R.F
                             |p(R.F) ∩ p(S.F)| / |p(R.F)|     R.F ∩ S.F ≠ ∅

Case 2: For SimO(R, S) (refer to Table 3-1), the similarity or degree of match for the output is: (a) 1 if R.O ≡ S.O, i.e., the two concepts are equivalent. (b) The ratio of the number of common properties to the number of properties of R.O, if R.O ⊑ S.O, i.e., the service concept subsumes the request concept; in this case p(R.O) ∩ p(S.O) = p(S.O). (c) 1 if S.O ⊑ R.O; instances of S.O are necessarily instances of R.O, in other words R.O has a superclass-of relationship with S.O. (d) The ratio of the number of common properties to the number of properties of R.O, if R.O ∩ S.O ≠ ∅; the two intersecting concepts should have properties in common.

Case 3: For SimF(R, S) (refer to Table 3-1), the similarity or degree of match for the functionality is: (a) 1 if R.F ≡ S.F, i.e., the two concepts are equivalent. (b) The ratio of the number of common properties to the number of properties of R.F, if R.F ⊑ S.F, i.e., the service concept subsumes the request concept; in this case p(R.F) ∩ p(S.F) = p(S.F). (c) 1 if S.F ⊑ R.F; instances of S.F are necessarily instances of R.F, in other words R.F has a superclass-of relationship with S.F. (d) The ratio of the number of common properties to the number of properties of R.F, if R.F ∩ S.F ≠ ∅; the two intersecting concepts should have properties in common.
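The input and output cases of Table 3-1 can be sketched in Python as follows. This is an illustrative reading, not the thesis implementation. The ontology (PARENT, PROPS) is hypothetical, property sets include inherited properties, properties are matched by exact name rather than by the thresholded name/range match, and a score of 0 for concepts with no common property is an assumption, since the table does not list that case.

```python
# Hypothetical ontology: child -> parent, and the properties of each concept
# (inherited properties included).
PARENT = {"Book": "Publication", "Novel": "Book", "Magazine": "Publication"}
PROPS = {
    "Publication": {"title"},
    "Book": {"title", "isbn"},
    "Novel": {"title", "isbn", "genre"},
    "Magazine": {"title", "issue"},
}


def is_subconcept(d, c):
    """True if d ⊑ c: d equals c or c is an ancestor of d."""
    while d != c and d in PARENT:
        d = PARENT[d]
    return d == c


def _ratio(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def sim_input(r, s):
    """SimI(R, S) for one pair of input concepts."""
    if is_subconcept(r, s):                      # R.I ≡ S.I or R.I ⊑ S.I
        return 1.0
    if is_subconcept(s, r):                      # S.I ⊑ R.I
        return _ratio(len(PROPS[r]), len(PROPS[s]))
    return _ratio(len(PROPS[r] & PROPS[s]), len(PROPS[s]))


def sim_output(r, s):
    """SimO(R, S) for one pair of output concepts."""
    if is_subconcept(s, r):                      # R.O ≡ S.O or S.O ⊑ R.O
        return 1.0
    if is_subconcept(r, s):                      # R.O ⊑ S.O
        return _ratio(len(PROPS[s]), len(PROPS[r]))
    return _ratio(len(PROPS[r] & PROPS[s]), len(PROPS[r]))


if __name__ == "__main__":
    print(sim_input("Book", "Novel"), sim_output("Book", "Novel"))
```

SimF follows the same shape as sim_output, applied to the functionality concepts. The asymmetry is deliberate: a service that accepts a more general input than requested, or returns a more specific output, is scored as a full match.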

MATCHING BASED ON MULTIPLE ONTOLOGIES

Not all web services reference the same ontology. Different web services can be described by different ontologies. This gives rise to certain issues, which can be tackled by using a feature-based similarity measure that compares concepts based on their common and distinguishing features (properties). It takes into account the features or properties of concepts, including those they inherit. The similarity functions MOSimI(R, S), MOSimO(R, S) and MOSimF(R, S) for inputs, outputs and functionality over multiple ontologies are defined as follows:

\[
MOSim_I(R,S) = \sqrt{\frac{M(p(R.I),p(S.I))}{|p(R.I)| + |p(S.I)| - M(p(R.I),p(S.I))} * \frac{M(p(R.I),p(S.I))}{|p(S.I)|}}
\]

\[
MOSim_O(R,S) = \sqrt{\frac{M(p(R.O),p(S.O))}{|p(R.O)| + |p(S.O)| - M(p(R.O),p(S.O))} * \frac{M(p(R.O),p(S.O))}{|p(R.O)|}}
\]

\[
MOSim_F(R,S) = \sqrt{\frac{M(p(R.F),p(S.F))}{|p(R.F)| + |p(S.F)| - M(p(R.F),p(S.F))} * \frac{M(p(R.F),p(S.F))}{|p(R.F)|}}
\]

Each of the above formulas is the geometric mean of two ratios: the best mapping between the properties relative to the number of distinct properties of the two concepts, and the best mapping relative to the number of properties of the service (for inputs) or of the request (for outputs and functionality). Function M establishes a mapping between the properties of the two concept classes S and R. It finds the best matching or mapping between two sets of properties, P = {p_1, p_2, ..., p_u} and Q = {q_1, q_2, ..., q_v}, and is determined by using the Hungarian algorithm [28] for weighted bipartite graph matching.

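\[
M(P,Q) = \operatorname{Max} \sum_{i=1}^{u} \sum_{j=1}^{v} (m_{ij} * x_{ij}),
\qquad \text{where } x_{ij} =
\begin{cases}
1 & \text{if } (p_i, q_j) \text{ is selected} \\
0 & \text{otherwise}
\end{cases}
\]

\[
\text{such that } \sum_{j=1}^{v} x_{ij} = 1 \text{ for all } i, \qquad \sum_{i=1}^{u} x_{ij} = 1 \text{ for all } j
\]

\[
m_{ij} = w_1 * namematch(p_i, q_j) + w_2 * rangematch(p_i, q_j)
\]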

The weight m_ij is calculated using the similarity between the properties. A property has three parts: name, domain and range. Name match can be done using several string matching techniques such as N-gram, stemming, etc. Currently, our implementation uses the N-gram algorithm [29] to calculate the similarity between the names of properties. Range match can be divided into two parts: data type match and name match. Data type match checks whether both properties have the same data type. If they do, the value of the match is 1; otherwise it is 0 (this could certainly be refined to give a value between 0 and 1, as done by the MWSDI algorithm).
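The mapping function M and the multiple-ontology input similarity can be sketched as below. This is a minimal sketch under stated assumptions: the score matrix m (rows for request properties, columns for service properties) is supplied by the caller, a brute-force search over one-to-one mappings stands in for the Hungarian algorithm, and MOSimI is read as the square root of M / (|p(R.I)| + |p(S.I)| - M) multiplied by M / |p(S.I)|. The function names are invented for this example.

```python
import math
from itertools import permutations


def best_mapping(m):
    """Maximum total score of a one-to-one mapping between rows and columns.

    Brute force, so only suitable for small property sets; the Hungarian
    algorithm finds the same optimum in polynomial time.
    """
    rows = len(m)
    cols = len(m[0]) if m else 0
    if rows == 0 or cols == 0:
        return 0.0
    if rows <= cols:
        return max(sum(m[i][c[i]] for i in range(rows))
                   for c in permutations(range(cols), rows))
    return max(sum(m[r[j]][j] for j in range(cols))
               for r in permutations(range(rows), cols))


def mo_sim_input(m, n_request_props, n_service_props):
    """MOSimI(R, S) from the property score matrix and the property counts."""
    best = best_mapping(m)
    distinct = n_request_props + n_service_props - best
    if distinct <= 0 or n_service_props == 0:
        return 0.0
    return math.sqrt((best / distinct) * (best / n_service_props))


if __name__ == "__main__":
    # Hypothetical scores for 2 request properties against 3 service properties.
    scores = [[0.9, 0.1, 0.0],
              [0.2, 0.6, 0.3]]
    print(best_mapping(scores), mo_sim_input(scores, 2, 3))
```

For larger property sets, an existing assignment solver such as scipy.optimize.linear_sum_assignment with maximize=True could replace the brute-force search. The output and functionality variants differ only in dividing by the number of request properties in the second ratio.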

3.3 MWSDI DISCOVERY MATCHING ALGORITHM

The MWSDI discovery algorithm supports both semantic and syntactic discovery of services. The request R is matched against a service S using a matching function which returns a similarity value in the range 0 to 1. This similarity value has two dimensions: syntactic similarity and semantic similarity. These similarities correspond to the inputs, outputs and operations of the (R, S) pairs. A match score is calculated for each pair of request and service. The pairs are ranked in descending order of match score and presented for the selection of a proper web service. MWSDI is a more general approach, since it supports matching at the interface (port type) level, which may include several operations.


MATCHING ALGORITHM

The similarity score is obtained by matching the operations of R and S which

include a set of inputs, a set of outputs and a functionality. Semantic Similarity of

operations (refer to table 3-2) is used as the matching unit while matching R and S.

SIMILARITY OF OPERATIONS

The function OPSim (refer to Table 3-2) calculates the similarity for an individual operation pair. OPSim comprises input similarity, output similarity, syntactic similarity and conceptual similarity.

Input Similarity (inpSim): The input similarity calculates the similarity between the set of inputs of the request and the service. Given the sets of inputs, we use the Hungarian algorithm to find the best mapping. This is better explained using a bipartite graph. Consider a graph G = (R.I, S.I, M), where R.I is the set of request inputs, S.I is the set of service inputs and M is the set of concept match scores m_ij (the concept similarity match score for (R.I_i, S.I_j)). We want to find the best mapping with the maximum match score.

Output Similarity (oupSim): The output similarity calculates the similarity between the set of outputs of the request and the service. Given the sets of outputs, we use the Hungarian algorithm to find the best mapping. In terms of a bipartite graph, G = (R.O, S.O, M), where R.O is the set of request outputs, S.O is the set of service outputs and M is the set of concept match scores m_ij (the concept similarity match score for (R.O_i, S.O_j)). We want to find the best mapping with the maximum match score.


Syntactic Similarity (synSim): The syntactic similarity is the similarity of the

name of the operation in the request and service. To find the syntactic similarity of the

operation names, we have used the N-gram similarity algorithm, which is one of the

numerous string similarity measure algorithms.
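The text does not spell out which N-gram formula is used, so the sketch below shows one common variant as an assumption: the Dice coefficient over sets of character trigrams, compared case-insensitively. The function names and the example operation names are invented.

```python
def ngrams(text, n=3):
    """Set of character n-grams of a lower-cased string."""
    text = text.lower()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_sim(a, b, n=3):
    """Dice coefficient of the two n-gram sets, a value between 0 and 1."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2.0 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    # Hypothetical operation names of a request and a service.
    print(ngram_sim("getAuthor", "getAuthors"))
```

The same function can serve wherever the chapter calls for a syntactic name comparison, such as operation names, concept names and property names.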

Table 3-2: Similarity of Operations

\[
OPSim(R,S) = w_1\, inpSim(R.I, S.I) + w_2\, oupSim(R.O, S.O) + w_3\, synSim(R.name, S.name) + w_4\, conSim(R.F, S.F)
\]

\[
inpSim(R.I, S.I) = \frac{1}{m}\left(\operatorname{Max} \sum_{i=1}^{u} \sum_{j=1}^{v} (m_{ij}\, x_{ij})\right),
\qquad \text{where } x_{ij} =
\begin{cases}
1 & \text{if } (R.I_i, S.I_j) \text{ is selected} \\
0 & \text{otherwise}
\end{cases}
\]

\[
\text{such that } \sum_{j=1}^{v} x_{ij} = 1 \text{ for all } i, \qquad \sum_{i=1}^{u} x_{ij} = 1 \text{ for all } j
\]

where R.I = {I_1, I_2, ..., I_u} and S.I = {I_1, I_2, ..., I_v} are the lists of inputs for request R and service S, respectively, and

\[
m_{ij} = conSim(R.I_i, S.I_j)
\]

\[
oupSim(R.O, S.O) = \frac{1}{m}\left(\operatorname{Max} \sum_{i=1}^{u} \sum_{j=1}^{v} (m_{ij}\, x_{ij})\right),
\qquad \text{where } x_{ij} =
\begin{cases}
1 & \text{if } (R.O_i, S.O_j) \text{ is selected} \\
0 & \text{otherwise}
\end{cases}
\]

\[
\text{such that } \sum_{j=1}^{v} x_{ij} = 1 \text{ for all } i, \qquad \sum_{i=1}^{u} x_{ij} = 1 \text{ for all } j
\]

where R.O = {O_1, O_2, ..., O_u} and S.O = {O_1, O_2, ..., O_v} are the lists of outputs for request R and service S, respectively, and

\[
m_{ij} = conSim(R.O_i, S.O_j)
\]

\[
synSim(R, S) = nGramSim(R.label, S.label)
\]


Concept Similarity (conSim): The concept similarity is the similarity of the operation concept referenced in the ontology. The concept similarity is also used in calculating m_ij in the input and output similarity. So in general, the concept similarity explains the similarity of two concepts in a given ontology.

The concept similarity is a combination of concept syntactic similarity, coverage

similarity and property similarity.

Concept Syntactic Similarity (conSynSim): The concept syntactic similarity is the

similarity of the ontological name of two concepts. To find the syntactic similarity of the

ontological names, we have used the N-gram similarity algorithm.

Coverage Similarity (cvrgSim): The coverage similarity measures the extent of knowledge with which one concept covers the other. If the concepts have a ratio of ≥ 0.8 for compropSim (the number of common properties over the number of unique properties), then we say the concepts have a coverage similarity of 1. If not, we walk up the hierarchy from the service concept to find a parent concept that has compropSim ≥ 0.8. If one is found, the coverage similarity is penalized by 0.1 * 2^x, where x is the number of levels up in the hierarchy at which the parent concept is found. The same process is repeated to find a child concept in the hierarchy that has compropSim ≥ 0.8. If one is found, the penalty is 0.05 * x, where x is the number of levels down in the hierarchy at which the child concept is found. If none of the above cases applies, the coverage similarity is 0.
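The coverage rules can be sketched in Python as follows. This is a reading of the rules above rather than the MWSDI code. It assumes the upward penalty is 0.1 * 2^x and the downward penalty is 0.05 * x, that the resulting score is not clamped at 0, that parents are searched before children, and that properties are compared by exact name. The ontology below is hypothetical.

```python
def comprop_sim(props_r, props_s):
    """Common properties over distinct properties of the two concepts."""
    distinct = props_r | props_s
    return len(props_r & props_s) / len(distinct) if distinct else 0.0


def coverage_sim(r, s, props, parent, threshold=0.8):
    """cvrgSim(R, S) for request concept r and service concept s."""
    if comprop_sim(props[r], props[s]) >= threshold:
        return 1.0
    # Walk up from the service concept, one level at a time.
    levels, concept = 0, s
    while concept in parent:
        concept = parent[concept]
        levels += 1
        if comprop_sim(props[r], props[concept]) >= threshold:
            return 1.0 - 0.1 * 2 ** levels
    # Walk down from the service concept, one level at a time.
    levels, frontier = 0, [s]
    while frontier:
        frontier = [c for c, p in parent.items() if p in frontier]
        levels += 1
        for concept in frontier:
            if comprop_sim(props[r], props[concept]) >= threshold:
                return 1.0 - 0.05 * levels
    return 0.0


# Hypothetical ontology used only for illustration.
PARENT = {"Book": "Publication", "Novel": "Book"}
PROPS = {
    "Publication": {"title"},
    "Book": {"title", "isbn", "author", "publisher"},
    "Novel": {"title", "isbn", "author", "publisher", "genre", "plot"},
    "Magazine": {"issue"},
}

if __name__ == "__main__":
    print(coverage_sim("Book", "Novel", PROPS, PARENT))
```

In this toy ontology, a request for Book against a service annotated with Novel finds the match one level up and is penalized by 0.2, while the reverse direction finds it one level down and is penalized by only 0.05.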


Table 3-3: Concept Similarity

\[
conSim(R,S) = w_5\, cvrgSim(R,S) + w_6\, propSim(R.p, S.q) + w_7\, conSynSim(R,S)
\]

where R.p is the set of properties of R, {p_1, p_2, ..., p_m}, and S.q is the set of properties of S, {q_1, q_2, ..., q_n}.

\[
conSynSim(R,S) = nGramSim(R.label, S.label)
\]

\[
cvrgSim(R,S) =
\begin{cases}
1 & compropSim(R,S) \ge 0.8 \\
1 - 0.1 * 2^{x} & compropSim(R, parent_x(S)) \ge 0.8 \\
1 - 0.05 * x & compropSim(R, child_x(S)) \ge 0.8 \\
0 & \text{otherwise}
\end{cases}
\]

where x is the difference in levels within the hierarchy.

\[
compropSim(R,S) = \frac{|p(R) \cap p(S)|}{|p(R) \cup p(S)|}
\]

Property Similarity (propSim): The property similarity is the similarity of the sets of properties of two concepts. Given the sets of properties, we use the Hungarian algorithm to find the best mapping. In terms of a bipartite graph, G = (R.p, S.q, M), where R.p is the set of properties of the request concept, S.q is the set of properties of the service concept and M is the set of property match scores m_ij (the property similarity match score for (R.p_i, S.q_j)). We want to find the best mapping with the maximum match score. The property match score has multiple parts. One contributor is a constant c. The value of c is 1 when both properties are inverse functional properties or when neither property is an inverse functional property. Otherwise, the value of c is 0.8.

Range similarity (rangeSim): The range similarity is the weighted average of the property syntactic similarity and the property range similarity. The property syntactic similarity is calculated using the N-gram algorithm. The property range similarity is the similarity of the ranges of the two properties. If the properties are object properties, their ranges are ontological concepts. The property range similarity is the number of common properties of the two range concepts divided by the number of properties of the range concept of the service property.

Property Syntactic similarity (propSynSim): The property syntactic similarity is

the similarity of the ontological name of two properties. To find the syntactic similarity

of the ontological names, we have used the N-gram similarity algorithm.

Property Range similarity (propRangeSim): The property range similarity is the

ratio of common properties of the range concepts of the property to the list of properties

of the range concept of service property.

Cardinality Match (cardSim): The cardinality provides information for determining whether the properties match. The cardinality match is (a) 1 if the request property and the service property have the same cardinality; (b) 1 if both the request property and the service property are functional properties; (c) 0.9 if the request property needs more than one value and the service property has only one value, since the request requirement is not fully met; and (d) 0.7 if the service property needs more than one value and the request property has only one value.


Unmatched Properties (unMatchedProp): When matching two concepts, the concepts may not have the same number of properties. The request concept may have the same number of properties as the service (advertisement) concept, more, or fewer. In cases where the advertisement concept has fewer properties than the request concept, we penalize the match by 0.05 for each unmatched property.
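The components above can be combined into a single property match score, sketched below. The combination is an assumption about how Table 3-4 should be read: the constant c multiplied by the cube root of the product of range, cardinality and syntactic similarity, minus 0.05 per unmatched property. Cardinalities are simplified to integer value counts, and the function names are invented.

```python
def card_sim(request_card, service_card, both_functional=False):
    """Cardinality match between a request property and a service property."""
    if both_functional or request_card == service_card:
        return 1.0
    # The request needs more values than the service offers: 0.9; else 0.7.
    return 0.9 if request_card > service_card else 0.7


def inverse_functional_constant(request_is_ifp, service_is_ifp):
    """c is 1 if both or neither property is inverse functional, else 0.8."""
    return 1.0 if request_is_ifp == service_is_ifp else 0.8


def property_match(range_sim, cardinality_sim, syn_sim, c, unmatched=0):
    """Assumed form of m_ij: c * cube root of the product, minus the penalty."""
    product = range_sim * cardinality_sim * syn_sim
    return c * product ** (1.0 / 3.0) - 0.05 * unmatched


if __name__ == "__main__":
    c = inverse_functional_constant(True, False)
    print(property_match(1.0, card_sim(3, 1), 1.0, c, unmatched=1))
```

Because the three similarities are multiplied, a zero in any one of them drives the cube-root term to zero, which is a stricter behaviour than an arithmetic average would give.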

Table 3-4: Property Similarity

\[
propSim(R.p, S.q) = \frac{1}{m}\left(\operatorname{Max} \sum_{i=1}^{m} \sum_{j=1}^{n} (m_{ij} * x_{ij})\right),
\qquad \text{where } x_{ij} =
\begin{cases}
1 & \text{if } (R.p_i, S.q_j) \text{ is selected} \\
0 & \text{otherwise}
\end{cases}
\]

\[
\text{such that } \sum_{j=1}^{n} x_{ij} = 1 \text{ for all } i, \qquad \sum_{i=1}^{m} x_{ij} = 1 \text{ for all } j
\]

\[
m_{ij} = c * \sqrt[3]{rangeSim(R.p_i, S.q_j) * cardSim(R.p_i, S.q_j) * propSynSim(R.p_i, S.q_j)} - 0.05 * unMatchedprop(R.p_i, S.q_j)
\]

\[
c =
\begin{cases}
1 & \text{if } R.p_i, S.q_j \text{ are inverse functional properties} \\
1 & \text{if } R.p_i, S.q_j \text{ are not inverse functional properties} \\
0.8 & \text{otherwise}
\end{cases}
\]

\[
rangeSim(R.p_i, S.q_j) = w_8\, propSynSim(R.p_i, S.q_j) + w_9\, propRangeSim(R.p_i, S.q_j)
\]

\[
propRangeSim(R.p_i, S.q_j) = \frac{|p(R.p_i.Range) \cap p(S.q_j.Range)|}{|p(S.q_j.Range)|}
\]

\[
cardSim(R.p_i, S.q_j) =
\begin{cases}
1 & cardinality(R.p_i) = cardinality(S.q_j) \\
1 & R.p_i, S.q_j \text{ are functional properties} \\
0.9 & cardinality(R.p_i) > cardinality(S.q_j) \\
0.7 & cardinality(R.p_i) < cardinality(S.q_j)
\end{cases}
\]

\[
propSynSim(R.p_i, S.q_j) = nGramSim(R.p_i.label, S.q_j.label)
\]

The weights are all normalized, (w1, w2, w3, w4) = {3/10, 3/10, 3/10, 1/10}, (w5,

w6, w7) = {1/3, 1/3, 1/3} and (w8, w9) = {1/2, 1/2}.
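The property match can be illustrated with a short sketch. It assumes that the score of one property pair is the constant c times the geometric mean of rangeSim, cardSim and propSynSim, and that the overall score is the average over the best one-to-one pairing of request and service properties. The function names are hypothetical, the unmatched-property penalty is left out, and the brute-force search over permutations is a stand-in for the Hungarian algorithm (it finds the same optimum but only scales to small inputs).

```python
from itertools import permutations


def pair_score(range_sim: float, card_sim: float, syn_sim: float, c: float = 1.0) -> float:
    """Constant c times the geometric mean of the three component scores."""
    return c * (range_sim * card_sim * syn_sim) ** (1.0 / 3.0)


def best_assignment(scores: list[list[float]]) -> tuple[float, list[tuple[int, int]]]:
    """Pick the one-to-one pairing of rows and columns with the highest average score.

    Rows are request properties, columns are service properties.
    Assumes there are no more rows than columns.
    """
    rows, cols = len(scores), len(scores[0])
    best_total, best_pairs = -1.0, []
    for perm in permutations(range(cols), rows):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_total / rows, best_pairs


# Picking the largest entry first (0.9) would force the pair scored 0.1;
# the optimal pairing takes 0.8 and 0.85 instead.
average, pairs = best_assignment([[0.9, 0.8], [0.85, 0.1]])
```

The example matrix shows why an assignment algorithm is used rather than a greedy choice: the best overall pairing does not contain the single highest-scoring pair.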


CHAPTER 4

EVALUATION AND DISCUSSION OF RESULTS

The evaluation of the algorithms is based on the first SAWSDL test collection, SAWSDL-TC1 [30]. It is publicly available and has 895 services with annotations spread across 24 ontologies. The collection has 26 request files and is associated with a binary relevance set, which lists all the services that are relevant. The relevance factor is binary, i.e., each service is either “relevant” or “not relevant”. Each service in the collection is restricted to a single interface and a single operation, and all the annotations refer to OWL ontologies. From this collection, we carefully picked 36 services and 4 request services whose annotations refer to a single ontology. For each request, we have a relevance set drawn from the 36 services, which was obtained from SAWSDL-TC1. For request R1 (shown in Appendix A) there are 4 relevant services, and for the other three requests R2, R3 and R4 there are 7, 4 and 7, respectively. The operations of the services in the collection are not annotated, i.e., no functionality is specified for the operations. The services were modified to refer to ontologies on the Web; before modification, they referenced ontologies on a local HTTP server.


IMPLEMENTATION OF THE ALGORITHMS

SAWSDL-MX GUI TOOL

The SAWSDL-MX tool [31] is publicly available. It is implemented in Java and provides a GUI to run the algorithm. In this thesis, we did not implement the algorithm ourselves; we ran and tested the GUI version of SAWSDL-MX 2.1. Running the tool is a three-step process.

The first step is called Test collection. We can either load an existing collection or create a new one. To create a collection, we add the set of services in the service offers section, either by URL or from the local system; we added 36 services. Similarly, we add the set of requests in the service request section; we added 4 requests, and for each request the relevance set is added.

The second step is the Matchmaker, which is divided into SAWSDL-MX1 and SAWSDL-MX2. In SAWSDL-MX1, we can choose one of the variants SAWSDL-(M0-M4) in the configuration dropdown. Depending on the configuration selected, we can change the minimum degree of match or the text similarity threshold. After selecting the variant and the threshold, the Match button is pressed to start the matchmaking process.

The third and final step is the Evaluation, which is divided into 2 sections, one showing the ranking of services for the request and the other showing the evaluation charts of precision/recall/response time. Along with the ranking, the degree of match and the score are shown for each service. Each service in the ranking is color coded “Black” or “Red”, meaning relevant or irrelevant, based on the relevance set defined for each request in the first step.

In the implementation of SAWSDL-MX, only the top-level annotations of the complex types in the SAWSDL are considered. If there are multiple annotations, i.e., multiple model references to a single element, only one model reference, chosen at random, is considered. The algorithm is not restricted to one ontology reference; there can be multiple ontology references in the SAWSDL.

TVERSKY MODEL PROTOTYPE IMPLEMENTATION

The Tversky Model Prototype API [32] is implemented in Java. The requests are created in the program by creating a WS_Spec object, which stores information about the service such as service name, ontology path, operation name, operation model reference, input parameter, input model references, input types, output parameter, output model references and output types. The services are stored in a native XML database, eXist [33]. Each service in the database is retrieved and stored locally when running the algorithm. We created a SAWSDL parser to parse these SAWSDL service files and retrieve the appropriate information to create the WS_Spec object.
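The information held by such a specification object can be sketched as a plain record. The following is an illustration in Python rather than the Java WS_Spec class of the prototype; the class names, field names, file names and URIs are all hypothetical and only mirror the fields listed above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Parameter:
    """One input or output of an operation."""
    name: str
    type: str
    model_reference: Optional[str] = None  # URI of the annotated ontology concept


@dataclass
class WSSpec:
    """Hypothetical mirror of the fields a service specification stores."""
    service_name: str
    ontology_path: str
    operation_name: str
    operation_model_reference: Optional[str] = None  # absent in the test collection
    inputs: list[Parameter] = field(default_factory=list)
    outputs: list[Parameter] = field(default_factory=list)


# A request with one annotated input and one annotated output (made-up values).
request = WSSpec(
    service_name="NovelAuthorRequest",
    ontology_path="http://example.org/books.owl",
    operation_name="getAuthor",
    inputs=[Parameter("novel", "xsd:string", "http://example.org/books.owl#Novel")],
    outputs=[Parameter("author", "xsd:string", "http://example.org/books.owl#Author")],
)
```

Leaving `operation_model_reference` unset corresponds to a service whose operation carries no annotation.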

We implemented the Hungarian algorithm and N-gram algorithm that are used by

the Tversky Model.

The following steps explain the setup and execution of the algorithm.

1. The API uses a native XML database as the registry to store the services. Install the eXist XML database. In the database, create a collection and upload the services.

2. Edit the configuration file (Config.java) to point to the location of the registry and to a local directory into which the services are downloaded.

3. Create a request by giving the path to the ontology, an operation, its model reference, a set of inputs with name, type and model reference, and a set of outputs with name, type and model reference.

4. Run the algorithm, which gives the match score for each operation pair of the request and the service.

In the implementation of the Tversky Model, we have considered both the top-level and bottom-level annotations of the complex types in the SAWSDL. If there are multiple annotations, i.e., multiple model references to a single element, we consider only the first model reference. The implementation removes the weight for input, output or functional similarity if the request does not have the corresponding component. For example, in our test cases we do not have reference concepts for the operation, hence w4 is set to zero and the other weights are normalized. The implementation of the algorithm is restricted to one ontology reference per service. However, the algorithm has conditions to deal with the case where the request and the service reference different ontologies.
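The weight adjustment can be worked through with the weights (w1, w2, w3, w4) = (3/10, 3/10, 3/10, 1/10). The sketch below assumes that "normalized" means rescaling the remaining weights proportionally so that they sum to 1; the function name is hypothetical.

```python
from fractions import Fraction


def renormalize(weights: dict[str, Fraction], missing: set[str]) -> dict[str, Fraction]:
    """Zero the weights of missing components and rescale the rest to sum to 1."""
    kept = {name: (Fraction(0) if name in missing else w) for name, w in weights.items()}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("every component is missing")
    return {name: w / total for name, w in kept.items()}


weights = {
    "w1": Fraction(3, 10),
    "w2": Fraction(3, 10),
    "w3": Fraction(3, 10),
    "w4": Fraction(1, 10),
}
# No operation annotation in the request, so w4 is dropped.
adjusted = renormalize(weights, missing={"w4"})
```

Under this assumption, dropping w4 leaves w1, w2 and w3 at 1/3 each.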

MWSDI API IMPLEMENTATION

The MWSDI API [32] is implemented in Java. The requests are created in the program by creating a WS_Spec object, which stores information about the service such as service name, ontology path, operation name, operation model reference, input parameter, input model references, input types, output parameter, output model references and output types. The services are stored in a native XML database, eXist. Each service in the database is retrieved and stored locally when running the algorithm. We created a SAWSDL parser to parse these SAWSDL service files and retrieve the appropriate information to create the WS_Spec object.

We used the OWL API to find the concepts that are model referenced in the service. We implemented the Hungarian algorithm and the N-gram algorithm that are used by the MWSDI algorithm.

The following steps explain the setup and execution of the algorithm.

1. The API uses a native XML database as the registry to store the services. Install the eXist XML database. In the database, create a collection and upload the services.

2. Edit the configuration file (Config.java) to point to the location of the registry and to a local directory into which the services are downloaded.

3. Create a request by giving the path to the ontology, an operation and its model reference, a set of inputs with name, type and model reference, and a set of outputs with name, type and model reference.

4. Run the algorithm, which gives the match score for each operation of the service.

In the implementation of the MWSDI algorithm, we have considered both the top-level and bottom-level annotations of the complex types in the SAWSDL. If there are multiple annotations, i.e., multiple model references to a single element, we consider only the first model reference. The implementation of the algorithm is restricted to one ontology reference per service. The implementation removes the weight for input, output or functional similarity if the request does not have the corresponding component. For example, in our test cases we do not have reference concepts for the operation, hence w4 is set to zero and the other weights are normalized. The algorithm does not have conditions to handle the case where the request and the service reference different ontologies, and hence the individual scores might be low.

The evaluation in this thesis takes a further step in comparing these algorithms quantitatively using the match score, which is a number in the range [0, 1], with 0 being no match and 1 an exact match.

To perform statistical evaluations we have compared the precision, recall and F-

measure of the algorithms.

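$$
\text{Recall } (r) = \frac{\text{number of correctly discovered services}}{\text{number of all correct services}}
$$

$$
\text{Precision } (p) = \frac{\text{number of correctly discovered services}}{\text{number of discovered services}}
$$

$$
\text{F-Measure } (F) = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
$$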

Precision is the ratio of the number of true positives (correctly discovered services) to the sum of true positives and false positives (the total number of discovered services). Precision does not say anything about relevant services that were missed. Recall is the ratio of true positives to the sum of true positives and false negatives (relevant services that were not discovered), but it does not say anything about irrelevant services that were discovered. The F-Measure gives a balanced score for testing the accuracy of the algorithms.
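As a worked illustration of the three measures, the following sketch computes them from a set of returned services and a relevance set. The function name and service identifiers are hypothetical; returning 0.0 when nothing is returned is an assumption made to avoid division by zero. The example reproduces request R3 for SAWSDL-M0, where 6 services are returned and all 4 relevant services are among them.

```python
def evaluate(returned: set[str], relevant: set[str]) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one request."""
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# 4 relevant services, 6 returned, all 4 relevant ones found (made-up identifiers).
relevant = {"s1", "s2", "s3", "s4"}
returned = {"s1", "s2", "s3", "s4", "s5", "s6"}
precision, recall, f_measure = evaluate(returned, relevant)
```

Rounded to two decimals this gives a precision of 0.67, a recall of 1.0 and an F-Measure of 0.80.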

The results for each of the algorithms with their Precision, Recall and F-measure

are shown in tables below. The experimental data used to test the algorithms are shown in

the Appendix. R1-R4 are the 4 requests used in the evaluations.


Table 4-1 gives the Precision, Recall and F-Measure of SAWSDL-M0, which is logic-based matching with the minimum degree of match set to subsumed-by.

Table 4-2 gives the Precision, Recall and F-Measure of SAWSDL-M1, which is hybrid logic-based matching with the Loss of Information syntactic similarity measure and a syntactic threshold of 0.3.

Table 4-3 gives the Precision, Recall and F-Measure of SAWSDL-M2, which is hybrid logic-based matching with the Extended Jaccard syntactic similarity measure and a syntactic threshold of 0.3.

Table 4-4 gives the Precision, Recall and F-Measure of SAWSDL-M3, which is hybrid logic-based matching with the Cosine syntactic similarity measure and a syntactic threshold of 0.3.

Table 4-5 gives the Precision, Recall and F-Measure of SAWSDL-M4, which is hybrid logic-based matching with the Jensen-Shannon syntactic similarity measure and a syntactic threshold of 0.3.

Table 4-6 gives the Precision, Recall and F-Measure of the TVERSKY model with a similarity threshold of 0.3.

Table 4-7 gives the Precision, Recall and F-Measure of MWSDI with a similarity threshold of 0.3.

Table 4-8 gives the average Precision, Recall and F-Measure of the SAWSDL-MX hybrid variants.

Table 4-9 gives the syntactic similarity using the Extended Jaccard measure.


Table 4-1 Precision, Recall and F-Measure for SAWSDL-M0

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                2(2)                  1.0         0.5      0.67
R2(7)                6(6)                  1.0         0.86     0.92
R3(4)                6(4)                  0.67        1.0      0.80
R4(7)                6(6)                  1.0         0.86     0.92
Average                                    0.94        0.82     0.85


Table 4-2 Precision, Recall and F-Measure for SAWSDL-M1

Threshold = 0.3

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                17(4)                 0.23        1.0      0.37
R2(7)                0(0)                  0.0         0.0      0.0
R3(4)                14(4)                 0.28        1.0      0.44
R4(7)                0(0)                  0.0         0.0      0.0
Average                                    0.09        0.36     0.15



Table 4-3 Precision, Recall and F-Measure for SAWSDL-M2

Threshold = 0.3

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                7(3)                  0.43        0.75     0.55
R2(7)                0(0)                  0.0         0.0      0.0
R3(4)                8(4)                  0.5         1.0      0.67
R4(7)                0(0)                  0.0         0.0      0.0
Average                                    0.17        0.32     0.22


Table 4-4 Precision, Recall and F-Measure for SAWSDL-M3

Threshold = 0.3

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                17(4)                 0.23        1.0      0.37
R2(7)                0(0)                  0.0         0.0      0.0
R3(4)                18(4)                 0.22        1.0      0.36
R4(7)                0(0)                  0.0         0.0      0.0
Average                                    0.08        0.36     0.13



Table 4-5 Precision, Recall and F-Measure for SAWSDL-M4

Threshold = 0.3

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                17(4)                 0.23        1.0      0.37
R2(7)                0(0)                  0.0         0.0      0.0
R3(4)                16(4)                 0.25        1.0      0.4
R4(7)                0(0)                  0.0         0.0      0.0
Average                                    0.09        0.36     0.14

Table 4-6 Precision, Recall and F-Measure for Tversky Model

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                0(0)                  0.0         0.0      0.0
R2(7)                1(1)                  1.0         0.14     0.25
R3(4)                5(4)                  0.80        1.0      0.89
R4(7)                1(1)                  1.0         0.14     0.25
Average                                    0.78        0.27     0.32


Table 4-7 Precision, Recall and F-Measure for MWSDI

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                0(0)                  0.0         0.0      0.0
R2(7)                2(2)                  1.0         0.28     0.43
R3(4)                4(4)                  1.0         1.0      1.0
R4(7)                1(1)                  1.0         0.14     0.25
Average                                    0.82        0.31     0.40

Table 4-8 Average Precision, Recall and F-Measure for SAWSDL-MX Hybrid

Request   Precision   Recall   F-Measure
R1(4)     0.28        0.94     0.41
R2(7)     0.0         0.0      0.0
R3(4)     0.31        1.0      0.47
R4(7)     0.0         0.0      0.0
Average   0.11        0.35     0.16
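The per-request precision and recall values for R1 and R3 in Table 4-8 are consistent with an arithmetic mean over the four hybrid variants in Tables 4-2 to 4-5. The sketch below checks this; that the thesis computed the averages this way is an inference from the numbers, not something the text states, and the variable names are hypothetical.

```python
from statistics import mean

# Per-variant values for SAWSDL-M1 to SAWSDL-M4, copied from Tables 4-2 to 4-5.
r1_precision = [0.23, 0.43, 0.23, 0.23]
r1_recall = [1.0, 0.75, 1.0, 1.0]
r3_precision = [0.28, 0.5, 0.22, 0.25]

r1_avg_precision = round(mean(r1_precision), 2)
r1_avg_recall = round(mean(r1_recall), 2)
r3_avg_precision = round(mean(r3_precision), 2)
```

The three results match the R1 and R3 entries of Table 4-8 for precision and recall.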


Table 4-9 Syntactic similarity using Extended Jaccard measure

Threshold = 0.3

Request (Relevant)   Returned (Relevant)   Precision   Recall   F-Measure
R1(4)                18(4)                 0.22        1.0      0.36
R2(7)                0(0)                  0.0         0.0      0.0
R3(4)                17(4)                 0.23        1.0      0.37
R4(7)                0(0)                  0.0         0.0      0.0
Average                                    0.08        0.36     0.13


The following figures give the graphical representation of the Precision, Recall and F-measure values for SAWSDL-M0, TVERSKY, MWSDI and the average of the SAWSDL-MX hybrid variants.

Figure 4-1 Precision graph for SAWSDL M0, Tversky, MWSDI and average SAWSDL-MX Hybrid (x-axis: requests R1 to R4; series: SAWSDL-MX, Tversky, MWSDI, SAWSDL-M0)

Figure 4-2 Recall graph for SAWSDL M0, Tversky, MWSDI and average SAWSDL-MX Hybrid (x-axis: requests R1 to R4; series: SAWSDL-MX, Tversky, MWSDI, SAWSDL-M0)


Figure 4-3 F-Measure graph for SAWSDL M0, Tversky, MWSDI and average

SAWSDL-MX Hybrid

4.1 EVALUATION OF ALGORITHMS BASED ON THE RESULTS

The experiments were conducted on 4 requests and 36 advertisement services. The SAWSDL-MX algorithm was executed on its 5 variants, namely SAWSDL-M0 with the minimum degree of match set to Subsumed-by and SAWSDL-(M1-M4) with a syntactic similarity threshold of 0.3. The Tversky and MWSDI algorithms were also run with a threshold of 0.3. The relevant services for each request were provided in the SAWSDL-TC (Test Collection), which eliminated the need for human evaluators to identify the relevant services for each request. Based on the results for each algorithm, we made the following observations:


- The hybrid approach of SAWSDL-MX, which integrates syntactic similarity into the discovery and ranking of services, increased the number of false positives.

- SAWSDL-M0, which is the logic-based-only variant, has better precision than the hybrid variants.

- MWSDI performs the concept match by taking a weighted average of the syntactic similarity, property similarity and coverage similarity. This is a more detailed matching technique than that of the Tversky model, which calculates the match score based on the common property matching.

- MWSDI finds the best mapping between the sets of inputs and outputs to obtain the maximum overall match using the Hungarian algorithm, whereas the Tversky model takes the average of the match scores between each individual pair of inputs or outputs.

- MWSDI uses the Hungarian algorithm and has a deeper hierarchy of comparison, which makes it slow and time consuming.

- The Tversky model and SAWSDL-MX handle operation matching between request and service even if they are annotated to separate ontologies. MWSDI has coverage similarity functions, which are dedicated to similarity matching of concepts from the same ontology. This affects the match score considerably.

- SAWSDL-MX offers a choice of four syntactic similarity measures (Loss of Information, Extended Jaccard, Cosine and Jensen-Shannon), whereas Tversky and MWSDI rely only on the N-gram syntactic similarity measure.

- The Extended Jaccard syntactic similarity approach has better results than the other similarity measures used in the SAWSDL-MX variants.

- The Test Collection has services with no model references on the operations and with references only on inputs and outputs, which biases it toward SAWSDL-MX, since TVERSKY and MWSDI rely on functional similarity of the operation concept.

- TVERSKY and MWSDI have good precision compared to SAWSDL-MX, which has better recall values.
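The difference between choosing the best one-to-one mapping and averaging individual pair scores can be seen on a small made-up score matrix. This is a sketch under two assumptions: the averaging is read as a mean over all request/service parameter pairs, and the best mapping is found by brute force over permutations as a stand-in for the Hungarian algorithm. The function names and scores are hypothetical.

```python
from itertools import permutations
from statistics import mean


def all_pairs_average(scores: list[list[float]]) -> float:
    """Mean of every request/service parameter pair score."""
    return mean(score for row in scores for score in row)


def best_mapping_average(scores: list[list[float]]) -> float:
    """Mean score of the best one-to-one mapping of rows to columns."""
    rows, cols = len(scores), len(scores[0])
    return max(
        sum(scores[i][j] for i, j in enumerate(perm)) / rows
        for perm in permutations(range(cols), rows)
    )


# Rows: request inputs, columns: service inputs (made-up match scores).
scores = [[1.0, 0.1],
          [0.2, 0.9]]
pairwise = all_pairs_average(scores)
mapped = best_mapping_average(scores)
```

Here the all-pairs mean is pulled down by the two poorly matching cross pairs, while the best mapping only counts the two pairs that are actually assigned to each other.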


CHAPTER 5

CONCLUSION AND FUTURE WORK

The goal of this work was to evaluate the effectiveness of several SAWSDL-based semantic web service discovery algorithms. In particular, we looked at how well they fared in precision and recall. The MWSDI algorithm was modified slightly to improve the accuracy of matching. The original version of the MWSDI algorithm had context similarity, which was removed due to the difficulty of implementation. MWSDI concentrates more on matching the functionality of the operation. Unfortunately, the SAWSDL files in SAWSDL-TC do not have annotations on the operation. The limitations of the SAWSDL-MX1 matchmaker are that it relies only on the top-level annotations, ignoring the bottom-level annotations, and that it performs operation matching based only on the inputs and outputs and not on the functionality of the operation. The findings from our evaluation of the three algorithms are as follows: SAWSDL-MX1 performs well with the logic-based matching filter and has good recall values. The Tversky model handles the matching of services from different ontologies and performed well, with good precision. The MWSDI approach has good precision values, since it makes use of the property similarity and, in addition, has the coverage similarity to tackle concepts at different levels of the same ontology.


In the future, we intend to do a comprehensive comparison of these algorithms using SAWSDL test collections that have annotations on the operation of the service. The Tversky algorithm can be improved by including the Hungarian algorithm in the similarity match on the inputs and outputs, and by using N-gram similarity to calculate the syntactic similarity for the different parts of the WSDL. Also, additional information retrieval techniques such as the Loss of Information, Extended Jaccard, Cosine and Jensen-Shannon similarity measures will be considered for the Tversky and MWSDI algorithms, instead of using only N-gram similarity to calculate the syntactic similarity. The weights in MWSDI are chosen arbitrarily; in the future, we can integrate machine learning techniques to determine the weights. We also intend to evaluate the SAWSDL-M0+WA and SAWSDL-MX2 matchmakers as part of a future evaluation of SAWSDL-based semantic web service discovery algorithms.


REFERENCES

1. Web service glossary http://www.w3.org/TR/ws-gloss/.

2. Extensible Markup Language (XML) 1.0 (Fifth Edition) [Available at

http://www.w3.org/XML/].

3. SOAP Version 1.2 Part 1: Messaging Framework (Second Edition) [ Available at:

http://www.w3.org/TR/soap12-part1/].

4. Erik Christensen, Francisco Curbera, Greg Meredith, Sanjiva Weerawarana, W3C

Web Services Description Language (WSDL). [Available at:

http://www.w3c.org/TR/wsdl.].

5. HTTP - Hypertext Transfer Protocol [Available at:

http://www.w3.org/Protocols/].

6. Fielding, Roy Thomas (2000), Architectural Styles and the Design of Network-

based Software Architectures, Doctoral dissertation, University of California,

Irvine.

7. UDDI. Universal Description, Discovery, and Integration (UDDI v3.0). 2005

[Available at: http://www.uddi.org/].

8. Message announcing closure of Technical Committee. [Available at:

http://lists.oasis-open.org/archives/uddi-spec/200807/msg00000.html].


9. Web Services Discovery [Available at: http://msdn.microsoft.com/en-

us/library/f9t5yf68(VS.80).aspx.].

10. Web service Discovery [Available at:

http://en.wikipedia.org/wiki/Web_Services_Discovery].

11. OWL-S. OWL-based Web Service Ontology. 2004 [Available at:

http://www.daml.org/services/owl-s/].

12. WSMO. Web Services Modeling Ontology (WSMO). 2004 [Available at:

http://www.wsmo.org/].

13. Rama Akkiraju, Joel Farrell, John Miller, Meenakshi Nagarajan, Marc-Thomas

Schmidt, Amit Sheth, Kunal Verma, Web Service Semantics - WSDL-S,

http://www.w3.org/Submission/WSDL-S, Retrieved 10 Oct 2006.

14. Joel Farrell, Holger Lausen. Semantic Annotations for WSDL. 2007 [Available

at: http://www.w3.org/TR/sawsdl/].

15. Roberto Chinnici, Jean-Jacques Moreau, Arthur Ryman, Sanjiva Weerawarana,

Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language

[Available at: http://www.w3.org/TR/wsdl20/].

16. OWL. OWL Web Ontology Language Reference, W3C Recommendation.

[Available at: http://www.w3.org/TR/owl-features/].

17. Matthias Klusch, Patrick Kapahnke and Ingo Zinnikus: Hybrid Adaptive Web

Service Selection with SAWSDL-MX and WSDL Analyzer. The 6th Annual

European Semantic Web Conference (ESWC 2009).


18. Jorge Cardoso, John A. Miller and Savitha Emaini, "Web Services Discovery

Utilizing Semantically Annotated WSDL," in Reasoning Web 2008, Lecture

Notes in Computer Science (LNCS), Vol. 5224, Baroglio et al., Editors

(September 2008) pp. 240-268.

19. Kunal Verma, Amit P. Sheth, Swapna Oundhakar, Kaarthik Sivashanmugam, and

John A. Miller, "Allowing the Use of Multiple Ontologies for Discovery of Web

Services in Federated Registry Environment," Technical Report #UGA-CS-

LSDIS-TR-07-011, Department of Computer Science, University of Georgia,

Athens, Georgia (February 2007) pp. 1-27.

20. Semantic Web Services (SWS) [Available at:

http://en.wikipedia.org/wiki/Semantic_Web_Services].

21. SAWSDL file [Available at:

http://cs.uga.edu/~shiva/activivtyDestination_sawsdl.wsdl].

22. Ingo Zinnikus, Rupp H.J, Fischer K, "Detecting Similarities between Web Service Interfaces: The WSDL Analyzer" In: Second International Workshop on Web Services and Interoperability (WSI 2006), Pre-conference Workshop of Conference on Interoperability for Enterprise Software and Applications, I-ESA 2006, March 20-21, Bordeaux (2006).

23. Support Vector Machines (SVM) [Available at:

http://en.wikipedia.org/wiki/Support_vector_machine].


24. Matthias Klusch , B. Fries, and K. Sycara. Automated Semantic Web Service

Discovery with OWLS-MX. In 5th International Conference on Autonomous

Agents and Multi-Agent Systems (AAMAS). 2006. Hakodate, Japan: ACM Press.

25. Term Frequency-Inverse Document Frequency. [Available at:

http://en.wikipedia.org/wiki/Tf–idf].

26. Bipartite Graph. [Available at: http://en.wikipedia.org/wiki/Bipartite_graph].

27. Tversky, A., Features of Similarity. Psychological Review, 1977.

84(4): p. 327-352.

28. Hungarian Algorithm. [Available at:

http://en.wikipedia.org/wiki/Hungarian_algorithm].

29. Ngram Algorithm. [Available at :

http://74.125.47.132/search?q=cache:wXdDByJaveoJ:www.cs.ualberta.ca/~kondr

ak/papers/spire05.ps+n-GRam+algorithm+.ps&hl=en&ct=clnk&cd=10&gl=us].

30. SAWSDL-TC1 test collection. [Available at:

http://projects.semwebcentral.org/projects/sawsdl-tc/].

31. SAWSDL-MX tool. [Available at:

http://projects.semwebcentral.org/projects/sawsdl-mx/].

32. Discovery algorithms, Tversky and MWSDI. [Available at:

http://cs.uga.edu/~shiva/Discovery.zip].

33. Native XML database eXist. [Available at:

http://www.exist.com/].

34. Jaccard index. [Available at: http://en.wikipedia.org/wiki/jaccard_index].


35. Cosine similarity. [Available at: http://en.wikipedia.org/wiki/Cosine_similarity].

36. Jensen-Shannon divergence. [Available at: http://en.wikipedia.org/wiki/Jensen-

Shannon_divergence].


A. APPENDIX

TEST CASES

The SAWSDL-TC is publicly available. The 5 request files and the 36 services

we used for testing are available at http://cs.uga.edu/~shiva/sawsdl-tc1.zip. Here is an

example for a request service and advertisement service.

REQUESTS

Here are the samples of requests that have been used. Two of the used requests

are shown in the following pages.

Figure A-1 novel_author_service.wsdl

Figure A-2 novel_author_service.wsdl (continued)

Figure A-3 surfing_destination_service.wsdl

Figure A-4 surfing_destination_service.wsdl (continued)

SERVICES

Here are some of the advertisement SAWSDL services that have been used.

Figure A-5 novel_authorbook-type_service.wsdl

Figure A-6 novel_authorbook-type_service.wsdl (continued)

Figure A-7 activity_destination_service.wsdl

Figure A-8 activity_destination_service.wsdl (continued)

Figure A-9 activity_beach_service.wsdl

Figure A-10 activity_beach_service.wsdl (continued)
