[ieee 2014 ieee/wic/acm international joint conferences on web intelligence (wi) and intelligent...

Web Service Matchmaking Using a Hybrid of Signature and Specification Matching Methods

Syed Sibte Raza Abidi, Ali Daniyal, Syed Farrukh Mehdi NICHE Research Group, Faculty of Computer Science

Dalhousie University, Halifax, Canada

Abstract—Web services are independent software systems designed to offer machine-to-machine interactions over the WWW to achieve well-described operations. Web service matchmaking is the process of searching a service registry (such as UDDI) to discover a web service that satisfies the functional requirements requested by a service consumer. In this paper, we present a hybrid web service matchmaker that analyzes both the signature and the specification of a web service. For signature-level service matching, we have developed two techniques: (i) logical similarity measures applied to the services’ input/output concepts; and (ii) non-logical matching based on a Structure Preserving Semantic Matching algorithm. For specification matching, we have introduced a short sentence matching approach applied to a service’s description. We evaluated the performance of the matchmaker using the OWLS-TC dataset and compared it with that of a similar hybrid matchmaker. Our results indicate a significant improvement in recall and a slight improvement in precision.

Keywords—Web Service Matchmaking, Signature Matching, Specification Matching, OWL-S

I. INTRODUCTION

A web service is an independent program that can be published, searched and invoked on the web [1]. A web service operates in an environment in which users, computing applications, and data sources are distributed; it provides the necessary communication and interaction protocols for these distributed elements to be dynamically interconnected to perform a particular task. Nowadays, web services are employed by major computing organizations to provide a range of functionalities. For example, Microsoft .Net provides the Passport service to its developers. Passport allows users to access multiple web pages after a single sign-on. Microsoft’s software developer’s kit allows developers to access the Passport service to incorporate an authentication process into their web-based applications. Similarly, Google provides MAP API web services that take URL parameters in an HTTP request for map data and return the map data in the HTTP response to the client application.

Web service architecture consists of three main entities: Service Provider, Service Requester and Service Broker. The service provider creates and publishes the web service on a web service registry, such as Universal Description, Discovery, and Integration (UDDI) [1]. The service requester is the consumer of the web service, who searches for a service in a registry, invokes it and utilizes its output. The service broker is the middle layer between service provider and service requester; it allows web service providers to publish their service descriptions and helps service consumers to discover the relevant web services, i.e. service matchmaking.

Web service matchmaking is the process of searching a service registry (such as UDDI) to discover a web service that satisfies the functional requirements requested by a service consumer. It is pursued by matching the service request with the service’s description, i.e. the service’s signature and specification. A web service signature describes its input/output parameters. A web service specification describes its functional behaviour in terms of a textual description of function, service category, and intended business. There are two main approaches to service matchmaking: (a) logical (or semantic) matching approaches, which use deductive methods (i.e. logical concepts, rules and ontologies) to analyse the service’s description; here, the Semantic Markup for Web Services (OWL-S) is used to describe a web service, and this ontological description is used to match a service with the service request; and (b) non-logical (or syntactic) matching approaches, which analyse the service’s textual descriptions (i.e. inputs/outputs, parameters’ structures, textual description, and names) to match it with the service request. Typically, text analysis, such as keyword-based matching, is applied to the WSDL based service description to find a service match [2, 3, 4, 6].

In this paper we present a hybrid approach to web service matchmaking targeting signature and specification matching. Our objective is to improve the recall of web service matchmakers, whilst maintaining a high precision value, such that a larger set of matched web services is returned by the matchmaker. We present our Hybrid Web Service Matchmaker (HWSM), shown in Figure 1, which is a hybrid of logical and syntactic approaches for achieving both service signature and specification matching. Our rationale for using both the signature and the specification of a web service is that the overall functionality of a service is described not just by its input/output parameters; rather, its textual specification best describes its intended behaviour. Given that a service’s specification is usually provided as short focused sentences, for specification matching we have implemented a novel short-sentence matching method that compares a service’s specification with the service request. In this way, we are proposing a departure from the traditional information retrieval methods used for specification matching. In our approach, a

2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)

978-1-4799-4143-8/14 $31.00 © 2014 IEEE

DOI 10.1109/WI-IAT.2014.107


short sentence structure matching method [9] is used to parse the services’ text-based specification description, and the analysis is then based on the syntactic structure of the sentence, i.e. the subject, verb and object of the description. Again, this approach is different from comparing keywords, which is the typical approach for specification matching. In addition, we introduce a new signature matching approach that entails a Structure Preserving Semantic Matching (SPSM) algorithm [7, 8]. SPSM offers an interesting logical matching method that compares the service requests’ and web services’ input/output parameter structure by evaluating the semantic similarity between their structure-defining concepts. The signature and specification of a web service are extracted from both the WSDL description and the OWL-S description. We have evaluated our HWSM using the OWLS-TC [10] test collection containing around 1000 web services, described using WSDL and OWL-S standards. To compare the performance of our matchmaker, we used OWLS-MX [4] as a benchmark, where precision, recall, and accuracy are the performance measures.

Fig. 1. Schematic of Hybrid Web Service Matchmaker

II. RELATED WORK

Many matchmaker tools exist that use the web services’ signatures and specification information to discover relevant web services based on user requests.

Non-logical matching approaches, using information retrieval methods, have been quite prevalent in service matchmaking. Elgazzar [2] proposed a keyword-based search approach that is applied to five features of a WSDL based service description: WSDL content, WSDL types, WSDL messages, WSDL ports and web service name. The similarity between two web service descriptions is computed by averaging the similarities of all five features. Lui [6] suggested that keyword-based search is suboptimal as it contains very little text. Rather, a semantic similarity method is used to match services based on port types, input/output ports, messages, operations, and other definitions. Dong [3] proposed a web service discovery model very similar in approach to [2, 6], where syntactic techniques (TF/IDF) are applied to the service name, input/output parameters, and operations of a web service as described in WSDL to cluster web services based on the parameter names. Gehao [16] also employed keyword based matching on the names of services, parameters, operations and service description.

Logical (i.e. semantic) matching approaches are based on semantic similarity methods using some external knowledge resource. Wang [11] exploits the semantics of the WSDL description’s identifiers and the structure of their operations, messages and data types. The service request is a textual specification. WordNet is used for semantic distance calculation. Hui [15] presents an approach that employs the OWL-S service profile for service matching, where the similarity measure is based on the distance between concepts representing request and service parameters. OWL-S facilitates the standard usage of description logic reasoning to automatically determine the web service similarity based on concept subsumption relations. Guilherme [12] uses the same approach based on semantic similarity, whereby the semantic description of services is captured in SAWSDL.

Klusch [4] presented the first hybrid approach, called OWLS-MX, which combines methods for logic-based semantic similarity of OWL-S services and content based syntactic similarity. OWLS-MX first determines the semantic similarity by defining five logical filters. If logical matching does not succeed, then OWLS-MX processes the OWL-S web service description of all available web services. The service descriptions are converted into a vector of terms, to which token-based string similarity measures are applied to compute the content similarity between the query and the web services.

Yu-Huai [5] also presents a hybrid solution that builds on logical matching based on the OWL-S profile and keyword based matching of textual attributes from the WSDL document.

In conclusion, we note that existing web service matchmaking solutions exploit four main components of a web service description, i.e. input/output parameter, parameter’s structure, textual description, and web service name. Table 1 compares the main matchmaking solutions based on their matching methods; note that HWSM is our proposed solution.

TABLE I. COMPARISON OF MATCHMAKING SOLUTIONS. LEGEND: IR=INFORMATION RETRIEVAL METHODS; GRAPH=GRAPH BASED METHODS; LOGICAL=LOGICAL MATCHING METHODS; SHORT

[Table I body: rows are the solutions [2], [6], [3], [13], [11], [4] and HWSM; columns are I/O (IR, Logical), Structure (IR, Graph), Description (IR, Short Text) and Name (IR). Cell entries lost in extraction.]

III. OVERVIEW OF OUR HYBRID SERVICE MATCHMAKER

In our HWSM we have developed similarity measurement methods that are applied to both the signature and specification of web services; a hybrid of these methods establishes the similarity of a service’s description with a service request. In terms of matchmaking, our objective is to achieve a higher recall whilst maintaining high precision; the intent is to return a list of web services ranked by their similarity to the service request. In our hybrid approach, we use logical methods to analyze the services’ input/output concepts, and syntax based methods to analyze the messaging structure, textual description and service name.

For signature matching, we build on the work reported by Klusch [4]. Given our intent to increase the recall of the matchmaker, we pursue fine-grained logical matching at every input/output concept level as opposed to the higher service level pursued by Klusch [4]. In fine-grained logical matching, any web service that partially satisfies consumer needs would also be selected but with a low rank. For example, a web service ‘S’ having inputs “name”, “credit card number” and “pin” will be selected as a candidate service for the requested web service ‘R’ that requires only “credit card number” and “pin” as inputs. The intent here is to retrieve a broader set of ranked services to the service consumer. During signature matching, to account for situations when the logical filters cannot be applied, such as due to the absence of representative domain ontology, we establish similarity based on the services’ structure. In this regard, we have investigated a graph based approach viz. the Structure Preserving Semantic Matching (SPSM) algorithm [7] that is applied to the parameters’ structure to determine the similarities between two graphs representing the input/output parameters of the service request and the candidate service. Our combination of logical and structural matching of the signature provides a more in-depth similarity matching option.

For specification matching, we use syntax based text matching techniques applied to the services’ textual description and name. The textual description contains important information regarding the functionality and behaviour of a web service. Traditionally, matchmakers [4, 13] have treated the textual description defined in a service’s profile as a vector of keywords and applied content-based information retrieval techniques to compute similarities between the service request and the service signature [9]. We argue that keyword-based string similarity measures are not suitable for short sentences, such as the textual descriptions of services. In long documents, word co-occurrence is an important source of information; but in a short sentence, which has a small number of words, word co-occurrence is not a significant factor and hence the analysis can suffer from a lack of information. To address this issue, for matching the textual description of a service specification we have used a syntax based approach that is formulated to deal with short sentences [9]. In this way, we establish the syntactic structure of a service’s description and of the service request and then compare them to establish specification similarities.

Finally, a weighted aggregation of the hybrid of methods (i.e. logical signature similarity, structural signature similarity, textual specification similarity and name similarity) determines an overall similarity score between the service description and the service request. Our HWSM always ensures that the service is related to the request if and only if they have similar signatures and similar specifications.

A. Workings of our HWSM

Our HWSM takes as input a web service request, and returns a ranked list of related web services. The workflow of the HWSM approach is as follows (as shown in Fig. 1):

1. Web service matchmaking is initiated by providing an OWL-S description of a web service request through the HWSM request interface for web service consumers.

2. Preprocessing involves the extraction of four major components of web services—i.e. input/output concepts, their structure, textual description and web service name.

3. After pre-processing, the different similarities are computed by individual modules: (a) Logical signature matching is performed between input/output concepts of the service request and all available web services in the UDDI using the domain ontology; (b) Structural signature matching between the service request and the structures of all available web services is performed using an SPSM algorithm; (c) Textual specification description matching is done between the service request and all available services using our short sentence semantic matching approach; and (d) Web service names are broken into meaningful words and semantic matching is performed to compute similarity between web services.

4. The cumulative similarity, based on the four similarity methods, is calculated and HWSM returns the list of ranked web services to the consumer. Finally, HWSM allows the consumers to select a ‘Top N services’, where N is the number of retrieved web services.
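Step 3(d) above breaks web service names into meaningful words before semantic matching. The paper does not say how the splitting is done; the following is a minimal sketch, assuming names follow underscore and camel-case conventions such as `BicycleCar_Price_service`. The function name `split_service_name` is hypothetical.

```python
import re


def split_service_name(name: str) -> list[str]:
    """Split a service name on underscores and camel-case boundaries.

    Assumption: names use underscores and/or camel case; this is an
    illustrative tokenizer, not the method used by HWSM.
    """
    words = []
    for chunk in re.split(r"[_\W]+", name):
        # 'BicycleCar' -> 'Bicycle', 'Car'; acronyms and digit runs stay whole.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [w.lower() for w in words]


print(split_service_name("BicycleCar_Price_service"))
```

The resulting word lists could then be compared pairwise with a WordNet-based measure, as the step describes.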

IV. WEB SERVICE SIGNATURE MATCHING METHODS

In HWSM we have implemented two different signature matching approaches: (i) a logical approach using a domain ontology to match the service requests’ and candidate services’ inputs and outputs at the concept level; and (ii) a graph based approach, using the Structure Preserving Semantic Matching (SPSM) algorithm [7, 8], to compare the structure of a service with the request. We explain both approaches below.

A. Logical Approach to Match Service Parameters

The web service ontology (OWL-S) employs a domain ontology to represent the services’ arguments. Logical matching is based on a deductive approach that calculates the similarity between two concepts represented in a domain ontology based on the relationships between their properties. For logical matching, we first take the input/output concepts from a service request and the advertised web services and align them to the domain ontology. Next, we apply logical rules to determine the subsumption relation between a pair of concepts; the hierarchical relationship between two concepts in the domain ontology determines the semantic similarity between them, which could be exact match, subsumption or no relation.

Klusch [4] proposed five logical filters to compute the semantic similarity between a service request and web service signatures: EXACT, PLUG-IN, SUBSUME, SUBSUMED-BY and FAILED. These logical filters were applied at the web service level to increase the precision of the matchmaking model, where a match was registered if any one of these logical filters was successfully applied to the service request and the service signature. This approach works well where a service consumer is looking only for an exact or specific web service match. For example, if a user requests a web service with ‘latitude’, ‘longitude’ and ‘time’ as inputs and ‘temperature’ as an output, Klusch’s approach will not retrieve a web service that takes ‘position’ and ‘time’ as inputs and returns ‘temperature’ as output (where the parameter ‘position’ is the generic subsumer of ‘latitude’ and ‘longitude’). In this case, the match fails because, when matching at the signature level, some parameters are an EXACT match while others are a PLUG-IN match; hence no definitive semantic similarity can be determined.

In our signature matching approach, we use the logical filters proposed by Klusch [4], but instead of service level matching we propose a fine-grained logical matching approach that computes the degree of subsumption relation for each parameter separately (instead of determining the same degree of subsumption relation for all parameters) for a pair of web services. The individual parameter level matches are consolidated later to give a cumulative semantic similarity. We explain our logical signature matching approach below, but first let us describe the logical filters used for matching [4].

Let D be the domain ontology that has all the concepts of queries and advertised web services. LSC(C) is the set of least specific concepts C’ where C’ are the direct children of the concept C. LGC(C) is the set of least generic concepts C’ which are the direct parent(s) of the concept C in a domain ontology. In our work, to the five filters proposed by OWLS-MX [4], we have assigned a matching score with respect to the significance of the match ranging from 0 – 1.

Exact Match: Available service S’s parameter exactly matches with the request service R’s parameter. This similarity shows that an input/output parameter of an available web service perfectly matches with some input/output parameter of a requested web service based on logic-based equivalence. The highest similarity value of ‘1’ is assigned to the measure.

Plug-in Match: Available service S’s parameter p plugs into the requested service R’s parameter q, i.e. p ∈ LSC(q). This filter gives a little flexibility to the exact match. It guarantees that the available service S will have the most specific input/output parameters as compared to what has been requested by the consumer request service R. The semantic similarity score for PLUG-IN match is 0.8.

Subsume Match: Available service S’s parameter p SUBSUME the requested service R’s parameter, i.e. p∈Subclasses(q). Subsume match is similar to the plug-in match except that it provides more flexibility in comparing the subsumption relation in an ontology for the input/output parameters. Instead of comparing with the direct child of service R’s parameter, as in PLUG-IN, it looks for any subsumed concept in the hierarchy. The semantic similarity score of 0.7 is assigned to this filter.

Subsumed-by Match: Requested service R’s parameter q is SUBSUMED-BY an available service S’s parameter p, i.e. p ∈ LGC(q). This match focuses only on the direct parent concept instead of looking for more general concepts. A semantic similarity score of 0.6 is given to SUBSUMED-BY.

Failed Match: If none of the above filters match, then the logical comparison between S’s given parameter and R’s parameter has failed. This is assigned a similarity of 0.
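The five filters and their scores can be sketched over a toy ontology. This is an illustrative sketch only: the `PARENT` map is a hypothetical single-parent hierarchy (a real domain ontology may allow multiple parents and would normally be queried through a reasoner), the concept `Vehicle` is invented for the example, and the order in which the filters are tried is an assumption.

```python
# Hypothetical toy ontology: child concept -> direct parent concept.
PARENT = {
    "OnePersonBicycle": "Bicycle",
    "Bicycle": "Vehicle",
    "Car": "Vehicle",
}

# Scores as assigned to the filters in the text.
SCORES = {"EXACT": 1.0, "PLUG-IN": 0.8, "SUBSUME": 0.7,
          "SUBSUMED-BY": 0.6, "FAILED": 0.0}


def ancestors(concept: str) -> list[str]:
    """All concepts above `concept` in the toy hierarchy."""
    found = []
    while concept in PARENT:
        concept = PARENT[concept]
        found.append(concept)
    return found


def logic_filter(p: str, q: str) -> str:
    """Classify service parameter p against request parameter q."""
    if p == q:
        return "EXACT"
    if PARENT.get(p) == q:        # p is a direct child of q
        return "PLUG-IN"
    if q in ancestors(p):         # p is any deeper descendant of q
        return "SUBSUME"
    if PARENT.get(q) == p:        # p is the direct parent of q
        return "SUBSUMED-BY"
    return "FAILED"


def logic_sim(p: str, q: str) -> float:
    return SCORES[logic_filter(p, q)]


print(logic_filter("Bicycle", "OnePersonBicycle"), logic_sim("Bicycle", "OnePersonBicycle"))
```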

Based on these filters, we individually compare the logical similarity between each pair of parameters of available service S and requested service R and then aggregate the similarity of all parameters as follows:

Total input parameters similarity:

$$SIM_{IN}(S_{IN}, R_{IN}) = \sum_{i=0}^{N} \mathrm{LogicSim}(S_{IN_i}, R_{IN_i})$$

where N = # R’s input parameters [Equation 1]

Total output parameters similarity:

$$SIM_{OUT}(S_{OUT}, R_{OUT}) = \sum_{i=0}^{N} \mathrm{LogicSim}(S_{OUT_i}, R_{OUT_i})$$

where N = # R’s output parameters [Equation 2]

The overall semantic similarity is the average of total input similarity and total output similarity, as follows:

Total Similarity:

$$SIM(S, R) = (SIM_{IN} + SIM_{OUT}) / N$$

where N = # R’s input & output parameters [Equation 3]

The following example explains the calculation of similarity using our signature matching method. The service request (called serviceR) is looking for a service that takes two inputs, i.e. ‘Car’ and ‘OnePersonBicycle’, and returns ‘Price’ as an output. The input/output description of serviceR is compared with all available web service descriptions in the UDDI. There is a service (called serviceS), ‘BicycleCar_Price_service’, which also takes two inputs, ‘Car’ and ‘Bicycle’, and returns ‘Price’. Logical similarity is calculated between all corresponding input/output parameters of serviceS and serviceR: serviceS input parameter ‘Car’ has an Exact Match with serviceR input parameter ‘Car’, hence a similarity value of 1 is computed. Next, serviceS input parameter ‘Bicycle’ gives a Subsumed-by Match with serviceR input parameter ‘OnePersonBicycle’, hence a similarity value of 0.6. According to equation 1, the overall input similarity is (1+0.6)/2 = 0.8. The output parameter ‘Price’ of serviceS gives an Exact Match with the output parameter ‘Price’ of serviceR, giving a similarity value of 1. According to equation 2, the overall output similarity is 1/1 = 1. The overall logical similarity between serviceR and serviceS, using equation 3, is (0.8+1)/2 = 0.9, and serviceS is retrieved.
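The arithmetic of this worked example can be reproduced with a few lines of Python. The sketch follows the example as stated: per-parameter scores are averaged within inputs and within outputs, and the two averages are then averaged. A literal reading of Equation 3, dividing by the total number of parameters, would give a different value, so the normalization used here is an assumption taken from the example rather than from the equations. The per-parameter scores are the ones quoted in the text.

```python
def mean(values: list[float]) -> float:
    return sum(values) / len(values)


# Per-parameter logical similarity scores from the worked example.
input_scores = [1.0, 0.6]   # Car vs Car (Exact), Bicycle vs OnePersonBicycle (Subsumed-by)
output_scores = [1.0]       # Price vs Price (Exact)

sim_in = mean(input_scores)        # (1 + 0.6) / 2
sim_out = mean(output_scores)      # 1 / 1
sim_total = (sim_in + sim_out) / 2

print(round(sim_in, 4), round(sim_out, 4), round(sim_total, 4))
```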

B. Graph Based Approach for Service Structure Matching

In HWSM, we have incorporated an additional structure-matching approach to address web service matchmaking in situations where the domain ontology is incomplete or insufficient. For such cases, we have implemented a signature matching method that pursues matchmaking by analyzing the services’ structure of operations and parameter types.

Since a WSDL document describes the services’ input/output parameters in a hierarchical way (i.e. there is a separate graph-like structure for input and output messages), it is possible to use graph matching approaches to calculate the similarity between the operations of two services. We have used a Structure Preserving Semantic Matching (SPSM) algorithm [7, 8] that computes the semantic similarity between two graphs extracted from the operation’s message of each WSDL document. SPSM takes a tree-like structure as an input, generates concepts of labels and concepts of nodes, and then computes a semantic similarity between those concepts. The relationship that could exist between two concepts is: equivalent, general, specific, or mismatch. The rationale for using SPSM is that it is a semantic matching method that works well with a domain ontology. Our structure matching approach has four steps:

Step 1: Process the services’ WSDL document to extract the operations’ messages, types, and identifier names. The extracted service structure is represented as a graph structure that contains the label of operations, input/output parameters, and their data types as nodes. This generates two separate graphs, one each for input and output messages.

Step 2: Submit input graph structure to SPSM to calculate the input structure similarity. SPSM converts the labels of graph nodes into concepts using WordNet and then determines the semantic correspondence between each pair of nodes using WordNet. SPSM returns the degree of semantic structural similarity between two graph-like structures in the range of 0-1 (1 means exact match while 0 means no match).

Step 3: Submit the output graph structure to SPSM, and calculate the output structure similarity as done in step 2.

Step 4: Calculate the overall structural similarity by averaging the similarity of input and output messages.

The following example illustrates the working of the structural signature matching. A user queries for a web service that takes the concepts ‘Car’ and ‘OnePersonBicycle’ as inputs and returns the concept ‘Price’ as an output. The structure of the operations’ messages of the service request (R) and a potential service (S) is shown below.

• R (input): get PRICE (4WHEELCAR Four Wheel Car Type, 1 PERSONCYCLE One Person Cycle Type (Person any Type))

• R (Output): get PRICE (PRICE Price Type (currency string, amount float))

• S (input): get PRICE (CAR Car Type, 1 PERSONCYCLE One Person Cycle Type (Person any Type))

• S (Output): get PRICE (PRICE Price Type (currency string, amount float))

SPSM performs the semantic structural similarity measure on the input/output messages and returns structural similarity scores of 0.875 and 1.0 for the pairs of input messages and output messages, respectively. The overall similarity is the average of the structural similarity of the inputs and outputs, i.e. (1.0 + 0.875) / 2 = 0.9375, and the service is retrieved.

V. SPECIFICATION MATCHING: SYNTAX BASED SERVICE DESCRIPTION MATCHING

The OWL-S profile of a service provides the functional and non-functional description of a web service. Along with the essential functional specification (i.e. input/output parameters), the non-functional specification (i.e. textual description) can help in service matchmaking. Usually, a web service’s textual description contains 1-2 sentences (around 20-50 words) that describe the functionality of the web service.

Most service matchmaking methods have used a keyword-based information-retrieval method—i.e. a vector space model—to match a service request with candidate services by calculating the cosine similarity using WordNet, whereby WordNet is employed to check if the two words are synonymous. One drawback of a vector space model is that keywords within a textual description are treated as isolated items thus ignoring the syntactic associations between them.

Given that service descriptions usually use short sentences, we argue that a syntax based short sentence matching approach can be used, in addition to the traditional keyword based service description similarity methods. We have investigated a syntax based approach that uses short sentence matching to analyze the service descriptions [9]. We believe that the analysis of the underlying structure of the service description can provide better insights into the intended meaning of the service description, since two descriptions may contain the same words yet their different syntactic structures may give two different meanings. For example, given two descriptions (a) This service receives ‘car’ and returns ‘price’; and (b) This service receives ‘price’ and returns ‘car’; the keywords are the same but their syntactic structure is different, hence the functionality of these services is different. Our description matching approach is able to detect the actual intent of the words as follows:
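The limitation of keyword vectors in this example can be checked directly. The sketch below is a plain bag-of-words cosine similarity, written for illustration and not the vector space model of any particular matchmaker; it shows that descriptions (a) and (b) produce identical term vectors even though they describe opposite functionality.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-cased word counts; word order is discarded."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


desc_a = "This service receives car and returns price"
desc_b = "This service receives price and returns car"

vec_a, vec_b = bag_of_words(desc_a), bag_of_words(desc_b)
print(vec_a == vec_b, round(cosine(vec_a, vec_b), 6))
```

Because the two vectors are equal, any measure defined purely on term counts cannot tell the two services apart, which is the motivation for comparing subject, verb and object relations instead.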

Step 1: Web service textual description is extracted from a service profile and is parsed using the Stanford parser to extract the syntactic structure of the service description. This syntactic structure is represented in typed dependency forms, provided by Stanford, which show the grammatical relationships between two words like the Subject of word ‘take’ is ‘service’ and the Object of word ‘take’ is ‘price’. To compute the structural similarity between textual descriptions we focus on the subject, verb and object of a sentence.

Step 2: After extracting the syntactic information from a sentence, we calculate the similarity between the same syntactic function for two different service descriptions. We use WordNet to calculate the Lin similarity measure [14] between two words, in this case the heads and the dependents. The Lin similarity measure is the ratio of twice the information content of the least common subsumer of the two words to the sum of the information content of each word. In matching a syntactic function, if the similarity between heads fails then the similarity between their dependents is ignored, and vice versa. For example, consider a comparison between two syntactic functions, dobj(return, price) and dobj(receive, price). In this comparison, the similarity between heads (i.e. ‘return’ and ‘receive’) is 0 while the similarity between dependents (i.e. ‘price’ and ‘price’) is 1. The interpretation here is that one web service ‘receives’ ‘price’ as input whereas the other web service ‘returns’ ‘price’ as output. In such a case, the overall similarity of the syntactic function is determined to be 0.
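For reference, the Lin measure used in Step 2 can be written out as below. This is the standard formulation of the measure; the symbols IC, lcs and P are notation introduced here, with P(c) denoting the corpus probability of encountering concept c.

```latex
\[
\mathrm{sim}_{Lin}(w_1, w_2) =
  \frac{2 \cdot IC\big(\mathrm{lcs}(w_1, w_2)\big)}{IC(w_1) + IC(w_2)},
\qquad
IC(c) = -\log P(c)
\]
```

The score lies between 0 and 1 and equals 1 when the two words map to the same concept, which matches the value of 1 reported for identical dependents in the example.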

Step 3: The overall syntactic structural similarity between two textual descriptions is computed by taking an average of subject, verb and object similarity. If R is a service request and S is a candidate web service then the overall textual similarity is calculated as:

SIMTEXT(S, R)=[SIMSUBJECT (S, R)+SIMVERB(S, R)+SIMOBJECT(S, R)]/3

The following example illustrates the working of our approach. The textual description of service request R is ‘This service returns price of the pair of a car and 1 person bicycle’. The textual description of a candidate web service S is ‘This service returns price of the requested pair of a car and a bicycle’. Our method compares every pair of syntactic functions from request R and service S if and only if they are in the same group of subject, verb or object. Using the Lin measure, similarity is computed between the heads and dependents. Num(bicycle, 1) is compared with Nsubj(return, service), as both belong to the subject group. Lin computes a similarity score of 0.0 between the heads (i.e. ‘bicycle’ and ‘return’) and a similarity score of 0.0 between their mismatched dependents (i.e. ‘1’ and ‘service’). Next, there is no match between Num(bicycle, 1) and Dobj(return, price) as they lie in different groups. Based on our method, the resulting relations for subjects, verbs and objects are shown in Table II.

TABLE II. DESCRIPTION SIMILARITY CALCULATION. LEGEND: SIM = SIMILARITY; GRP = GROUP

Group            | Service R             | Service S             | Lin Sim
Subject + Object | Num(bicycle,1)        | Nsubj(return,service) | 0
                 |                       | Dobj(return,price)    | Diff Grp
                 | Nn(bicycle,person)    | Nsubj(return,service) | 0.1223
                 |                       | Dobj(return,price)    | Diff Grp
                 | Nsubj(return,service) | Nsubj(return,service) | 1.0
                 |                       | Dobj(return,price)    | Diff Grp
                 | Dobj(return,price)    | Nsubj(return,service) | Diff Grp
                 |                       | Dobj(return,price)    | 1.0
                 | SIM(Sub+Obj)          |                       | 2.1223/4 = 0.5305
Head (Verb)      | Dobj(return,price)    | Dobj(return,price)    | 1.0
Total Similarity |                       |                       | (1 + 0.5305)/2 = 0.765
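The arithmetic in Table II can be retraced as follows. The sketch takes the four same-group Lin scores listed in the table as given (it does not compute them) and applies the aggregation the table itself shows: a mean over the subject and object comparisons, then a mean of that value with the verb score. Variable names are illustrative.

```python
# Same-group Lin scores from the Subject + Object block of Table II.
subject_object_scores = [0.0, 0.1223, 1.0, 1.0]

# Verb (head) comparison score from Table II.
sim_verb = 1.0

sim_sub_obj = sum(subject_object_scores) / len(subject_object_scores)  # 2.1223 / 4
sim_text = (sim_verb + sim_sub_obj) / 2

print(round(sim_sub_obj, 4), round(sim_text, 4))
```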

VI. INTEGRATING SIGNATURE AND SPECIFICATION SIMILARITIES: HYBRID WEB SERVICE MATCHING

To determine the overall similarity between a service request and the candidate services, we perform a weighted aggregation of the signature and specification similarities. Signature matching is more significant than specification matching, hence it has a higher weight (see equation 4). In signature matching, the two components of parameter and structure matching are of equal relevance. Hence, if the service parameters do not match concepts in the domain ontology, the overall signature similarity is affected. Service name similarity makes the least contribution because if a pair of services fails specification and signature matching, then having the same service name is not of much importance. We use name matching as a booster, such that if the computed total similarity is above the selection threshold, then the name similarity is applied to give a similarity boost (as given in equation 5).

SIM1(R, S) = 0.7 * Avg(Logical_SIM(R, S), Structural_SIM(R, S)) + 0.3 * Textual_SIM(R, S)   [equation 4]

SIM2(R, S) = SIM1(R, S) + 0.2 * (1 - SIM1(R, S)) * Name_SIM(R, S), if SIM1(R, S) < 1   [equation 5]
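Equations 4 and 5 translate directly into code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the selection threshold is left as a parameter because this section does not give its value.

```python
def sim1(logical: float, structural: float, textual: float) -> float:
    """Equation 4: weighted aggregation of signature and specification scores."""
    signature = (logical + structural) / 2.0  # equal weight for both components
    return 0.7 * signature + 0.3 * textual


def sim2(s1: float, name_sim: float, threshold: float) -> float:
    """Equation 5: name-based boost, applied only above the selection threshold.

    The threshold value is an assumption supplied by the caller; the boost
    closes at most 20% of the gap between s1 and 1.
    """
    if threshold <= s1 < 1.0:
        return s1 + 0.2 * (1.0 - s1) * name_sim
    return s1


# Hypothetical component scores, chosen only to exercise the formulas.
base = sim1(logical=0.8, structural=0.6, textual=0.765)
boosted = sim2(base, name_sim=1.0, threshold=0.5)
print(f"SIM1 = {base:.4f}, SIM2 = {boosted:.4f}")
```

Because the boost is scaled by (1 - SIM1), the result never exceeds 1, and a service that is already a perfect match is left unchanged.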

VII. EVALUATION AND RESULTS

Web service matchmaking is a process of retrieving relevant web services given a service request. The performance of a matchmaker is determined by its ability to retrieve relevant web services, and is measured in terms of precision, recall and accuracy. We performed a series of experiments using the standard OWLS-TC dataset that is routinely used to evaluate the performance of web service matchmaking methods. We use the OWLS-MX matchmaker [4] as a benchmark as it entails a similar hybrid approach employing logic-based reasoning and content-based matching. The goal of HWSM is to achieve high recall without loss of precision.

A. Dataset

For our experiment, we randomly selected 150 web services, covering seven different domains, from the OWLS-TC dataset. The seven domains are education, medical care, food, travel, communication, economy, and weapons. To evaluate our matchmaker we selected 7 different queries (or web service requests), where each query covers a specific domain. We chose 7 queries in order to evaluate the effectiveness of HWSM across different domains. The number of candidate web services and service queries used in the experiments was selected to keep the experimental runs manageable whilst ensuring their significance.

For each service query there is an a priori defined list of relevant web services. Table III gives the number of web services associated with each query (the actual names of the services are not listed here due to lack of space). For evaluation purposes, if a matchmaker, when given query 1, retrieves exactly the 17 web services associated with the query, then it achieves perfect recall and precision. For our experiments, we present the 7 queries to the matchmakers, which search all 150 web services and return a set of related web services. We compare the retrieved web services against the list of services associated with each query (as noted in Table III).

TABLE III. TOTAL NUMBER OF RELEVANT WEB SERVICES FOR EACH QUERY. THE EXACT SERVICES ARE NOT LISTED IN THIS TABLE.

No.      Query Web Service                  # of Relevant Web Services
Query 1  Car1PersonBicyclePriceService      17
Query 2  Comedy Film finder service         5
Query 3  HikingSurfingDestination           9
Query 4  GroceryStoreFoodService            5
Query 5  UniversityLectureService           18
Query 6  InvestigatingFinding               5
Query 7  GovernmentMissileFundingService    8

B. Results: Logical Signature Matching of Parameters

Table IV shows the evaluation results of the logical signature matching approach of HWSM, with a comparison to the logical component of OWLS-MX. The similarity values applied are: EXACT (1.0), PLUG_IN (0.8), SUBSUME (0.7), SUBSUME-BY (0.6) and FAILED (0). Queries 1-7 were applied to both matchmakers, and each matchmaker returned a ranked list of relevant web services (shown for query 1 in Table V) from the 150 available web services. Our results show that for all the queries the fine-grained logical matching approach used by HWSM returns a better recall than OWLS-MX (average 0.68 versus 0.25), at the cost of some precision (average 0.75 versus 1.00). However, the accuracy of the two matchmakers is comparable. The higher recall of HWSM is due to the fine-grained parameter matching approach, which allows a reasonable degree of flexibility when comparing service signatures.

TABLE IV. COMPARISON BETWEEN THE OWLS-MX AND HWSM MATCHMAKERS. LEGEND IS AS FOLLOWS: PR=PRECISION; RE=RECALL; AC=ACCURACY; NU=NUMBER OF SERVICES RETURNED FOR THE QUERY; FIRST COLUMN LISTS THE QUERY NUMBERS AS STATED IN TABLE III

       OWLS-MX                  HWSM
       NU   PR    RE    AC      NU   PR    RE    AC
Q1     3    1.00  0.17  0.91    12   0.91  0.64  0.95
Q2     1    1.00  0.20  0.97    7    0.60  0.80  0.97
Q3     2    1.00  0.20  0.95    5    1.00  0.55  0.97
Q4     2    1.00  0.40  0.98    6    0.50  0.80  0.98
Q5     3    1.00  0.16  0.90    9    1.00  0.50  0.94
Q6     2    1.00  0.40  0.98    5    0.60  0.60  0.97
Q7     2    1.00  0.25  0.96    10   0.70  0.88  0.97
Avg         1.00  0.25  0.95         0.75  0.68  0.96

Table V shows the retrieval results of both matchmakers for query 1. It is noted that out of the 17 relevant services, OWLS-MX retrieved only 3 web services, whereas HWSM retrieved not only these 3 web services but also an additional 8 relevant web services and 1 irrelevant web service. Furthermore, it is useful to note that HWSM assigns the same ranks to the three services retrieved by OWLS-MX.

TABLE V. SERVICE RETRIEVAL FOR QUERY 1. THE NUMBERS IN THE OWLS-MX & HWSM COLUMNS INDICATE THE RANK OF THE RETRIEVED WEB SERVICE

No   Query 1 - Relevant Services        OWLS-MX   HWSM
1    Car1PersonBicyclePrice             1         1
2    CarCyclePrice                      -         -
3    Kohl Car1PersonBicyclePrice        2         2
4    Bicycle4Wheeledcar_Price           -         4
5    Vehicle price                      -         -
6    T-car price                        -         11
7    4WheeledCar1PersonBicyclePrice     3         3
8    Auto Year price                    -         -
9    Recommended price of car model     -         10
10   4wheeledcar year price             -         8
11   leynthu rent a car                 -         -
12   Car2PersonBicyclePrice             -         5
13   2PersonBicycle4Wheeledcar_Price    -         7
14   car price report                   -         -
15   car price                          -         -
16   4wheeledcar year price report      -         9
17   CheapCar 2PersonBicyclePrice       -         6

     Precision                          1.00      0.91
     Recall                             0.17      0.64
     Accuracy                           0.91      0.95
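The three measures for query 1 can be recomputed from the retrieval counts stated in the text. The sketch below assumes that accuracy is the fraction of all 150 services classified correctly, i.e. (TP + TN) / N; this definition is not spelled out in this section, but it agrees with the reported values up to rounding or truncation. The function name is illustrative.

```python
def retrieval_metrics(retrieved_relevant, retrieved_total,
                      relevant_total, collection_size):
    """Return (precision, recall, accuracy) from simple retrieval counts."""
    tp = retrieved_relevant                 # relevant and retrieved
    fp = retrieved_total - tp               # retrieved but irrelevant
    fn = relevant_total - tp                # relevant but missed
    tn = collection_size - tp - fp - fn     # correctly left out
    precision = tp / retrieved_total if retrieved_total else 0.0
    recall = tp / relevant_total if relevant_total else 0.0
    accuracy = (tp + tn) / collection_size  # assumed definition
    return precision, recall, accuracy


# Query 1: 17 relevant services among 150 candidates.
owls_mx = retrieval_metrics(3, 3, 17, 150)    # 3 retrieved, all relevant
hwsm = retrieval_metrics(11, 12, 17, 150)     # 11 relevant + 1 irrelevant
print("OWLS-MX:", [f"{m:.3f}" for m in owls_mx])
print("HWSM:   ", [f"{m:.3f}" for m in hwsm])
```

With only 17 relevant services in a collection of 150, true negatives dominate the accuracy score, which is why accuracy stays above 0.9 for both matchmakers even though their recall differs widely.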

If a consumer is looking for exactly matched services then both OWLS-MX and HWSM perform well. But in a scenario such as web service composition, where a consumer is seeking many alternative web services that sufficiently satisfy the service request, HWSM is the better choice.

C. Results: Textual Specification Description Matching

To evaluate the performance of our proposed syntactic structure based specification matching method, we compared it with a traditional information retrieval technique (cosine similarity). We are unable to compare HWSM with OWLS-MX directly since we do not have the full details of its description matching methods. However, knowing that it uses information retrieval methods, we developed a version of HWSM (termed here HWSM-1) that uses information retrieval methods. HWSM-2 uses our syntax-based short sentence method. Table VI shows the performance of the two description based matching approaches. We note that HWSM-2 provides a significant improvement in both precision (about 15 percentage points) and recall (about 8 percentage points), with an increase of 3 percentage points in accuracy, as compared to standard information retrieval techniques.

TABLE VI. RESULTS FOR SPECIFICATION DESCRIPTION MATCHING

       HWSM-1              HWSM-2
       PR    RE    AC      PR    RE    AC
Q1     0.75  0.88  0.95    0.85  1.00  0.98
Q2     0.25  1.00  0.91    0.40  1.00  0.94
Q3     0.40  0.88  0.91    0.50  1.00  0.94
Q4     0.22  0.80  0.94    0.60  1.00  0.97
Q5     0.70  0.77  0.93    0.73  0.77  0.95
Q6     0.38  1.00  0.95    0.50  1.00  0.97
Q7     0.47  0.88  0.94    0.67  1.00  0.97
Avg    0.45  0.88  0.93    0.60  0.96  0.96
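For orientation, a bag-of-words cosine similarity of the kind used as the information retrieval baseline can be sketched as below. The tokenization and the use of raw term counts are assumptions of this sketch; the exact term weighting used in HWSM-1 is not described in this section.

```python
import math
import re
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between raw term-count vectors of two texts."""
    a = Counter(re.findall(r"[a-z0-9]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z0-9]+", text_b.lower()))
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Three of four terms overlap, so the score is 3 / (2 * 2) = 0.75.
print(cosine_similarity("This service returns price",
                        "This service receives price"))
```

The example shows a limitation of purely lexical matching: a description that returns a price and one that receives a price still score 0.75, whereas the syntax-based method described earlier assigns 0 to the corresponding dobj functions because their heads do not match.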

D. Results: Hybrid OWLS-MX versus Hybrid HWSM

We evaluated the performance of the hybrid models of OWLS-MX and HWSM, where the similarity scores for both signature and specification matching were combined to give a cumulative similarity score for a web service. Table VII shows the results for the hybrid matching approaches used by both OWLS-MX and HWSM. Again, we note that HWSM provides an improvement in recall (average 0.96 versus 0.64) with comparable precision (average 0.62 versus 0.67) over OWLS-MX, and an overall gain of 2 percentage points in accuracy.

TABLE VII. COMPARISON BETWEEN THE OWLS-MX AND HWSM MATCHMAKERS. LEGEND IS AS FOLLOWS: PR=PRECISION; RE=RECALL; AC=ACCURACY; NU=NUMBER OF SERVICES RETURNED FOR THE QUERY; FIRST COLUMN LISTS THE QUERY NUMBERS AS STATED IN TABLE III

       OWLS-MX                  HWSM
       NU   PR    RE    AC      NU   PR    RE    AC
Q1     8    1.00  0.40  0.94    20   0.85  1.00  0.98
Q2     14   0.30  1.00  0.94    12   0.40  1.00  0.95
Q3     5    1.00  0.55  0.97    18   0.50  1.00  0.94
Q4     4    0.75  0.60  0.98    8    0.60  1.00  0.98
Q5     9    0.77  0.38  0.91    19   0.73  0.77  0.94
Q6     9    0.44  0.80  0.96    9    0.56  1.00  0.97
Q7     13   0.46  0.75  0.94    11   0.73  1.00  0.98
Avg         0.67  0.64  0.94         0.62  0.96  0.96
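The averages reported for the hybrid matchmakers in Table VII are unweighted means over the seven queries, which can be checked with a few lines. The per-query values are copied from the table; the data layout and function name are illustrative.

```python
# Per-query (precision, recall, accuracy) values copied from Table VII.
OWLS_MX = {
    "Q1": (1.00, 0.40, 0.94), "Q2": (0.30, 1.00, 0.94),
    "Q3": (1.00, 0.55, 0.97), "Q4": (0.75, 0.60, 0.98),
    "Q5": (0.77, 0.38, 0.91), "Q6": (0.44, 0.80, 0.96),
    "Q7": (0.46, 0.75, 0.94),
}
HWSM = {
    "Q1": (0.85, 1.00, 0.98), "Q2": (0.40, 1.00, 0.95),
    "Q3": (0.50, 1.00, 0.94), "Q4": (0.60, 1.00, 0.98),
    "Q5": (0.73, 0.77, 0.94), "Q6": (0.56, 1.00, 0.97),
    "Q7": (0.73, 1.00, 0.98),
}


def macro_average(rows):
    """Unweighted mean of each metric over all queries."""
    n = len(rows)
    return tuple(sum(row[i] for row in rows.values()) / n for i in range(3))


for name, rows in (("OWLS-MX", OWLS_MX), ("HWSM", HWSM)):
    pr, re_, ac = macro_average(rows)
    print(f"{name}: PR={pr:.4f} RE={re_:.4f} AC={ac:.4f}")
```

Each query counts equally in this average regardless of how many relevant services it has, so a query with 5 relevant services influences the summary as much as one with 18.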

Table VIII shows the results for both matchmakers for query 1. The retrieval pattern is consistent, i.e. the precision of HWSM is less than that of OWLS-MX, which means that OWLS-MX is providing consumers with the most closely matched web services. At the same time, many relevant web services are not retrieved by hybrid OWLS-MX due to its conservative matching methods, resulting in low recall. On the other hand, HWSM is able to retrieve most of the relevant web services that are missed by OWLS-MX, but at times at the expense of precision. This observation supports our objective of retrieving a larger set of relevant services (some partially matched as well) to facilitate the web service composition process.

TABLE VIII. SERVICE RETRIEVED FOR QUERY 1

No   Query 1 - Relevant Services        OWLS-MX   HWSM
1    Car1PersonBicyclePrice
2    CarCyclePrice
3    Kohl Car1PersonBicyclePrice
4    Bicycle4Wheeledcar_Price
5    Vehicle price
6    T-car price
7    4WheeledCar1PersonBicyclePrice
8    Auto Year price
9    Recommended price of car model
10   4wheeledcar year price
11   leynthu rent a car
12   Car2PersonBicyclePrice
13   2PersonBicycle4Wheeledcar_Price
14   car price report
15   car price
16   4wheeledcar year price report
17   CheapCar 2PersonBicyclePrice

     Precision                          1.00      0.85
     Recall                             0.40      1.00
     Accuracy                           0.94      0.98

VIII. CONCLUDING REMARKS

In this work we presented a hybrid approach to improve the performance of web service matchmakers. We leverage both logical and text analysis methods to analyze the specification and signature of web services to provide a more comprehensive match between a service request and the retrieved services. We introduce two new approaches, namely (i) a relaxation of the logical filters to provide a fine-grained matching mechanism that analyzes individual parameters during signature matching; and (ii) a short sentence matching method to analyze service specifications. The evaluation shows that our approach is comparable, if not better at times, to a similar well-established hybrid web service matchmaker.

Although it is useful to exploit the textual description of services, we note that this results in a lower precision but a higher recall. In the hybrid OWLS-MX we note that, due to the inclusion of similarity results based on specification description analysis, the overall precision has decreased. For example, when using logical matching the service “CarCycle_Price_Service” was not retrieved, but in the hybrid approach this service has been retrieved due to the specification description similarities, thus reducing the precision of OWLS-MX. Similarly, HWSM also shows an increase in false positives. Table IX compares the influence of specification matching with respect to logical signature matching; lower precision and higher recall are noted for the hybrid approaches that include specification matching.

TABLE IX. COMPARISON BETWEEN THE LOGICAL SIGNATURE MATCHING AND HYBRID APPROACHES

                    OWLS-MX             HWSM
                    PR    RE    AC      PR    RE    AC
Logical Signature   1.00  0.25  0.95    0.75  0.68  0.96
Hybrid              0.67  0.64  0.94    0.62  0.96  0.96

In the future our intent is to incorporate additional grammatical relations for specification description analysis in order to improve the scope of description analysis. We plan to use a rich domain ontology to pursue structure matching using graph based approaches.

IX. REFERENCES

[1] Aphrodite Tsalgatidou, Thomi Pilioura. An Overview of Standards and Related Technology in Web Services. University of Athens, Distributed and Parallel Databases, 12, 135–162, 2002.

[2] Khalid Elgazzar, Ahmed E. Hassan, Patrick Martin. Clustering WSDL Documents to Bootstrap the Discovery of Web Services. International Conference on Web Services, 2010.

[3] Xin Dong, Alon Halevy, Jayant Madhavan, Ema Nemes, Jun Zhang. Similarity Search for Web Services. Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004.

[4] Matthias Klusch, Benedikt Fries, Mahboob Khalid. OWLS-MX: A hybrid Semantic Web service matchmaker for OWL-S services. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 7, Issue 2, April 2009.

[5] Yu-Huai Tsai, San-Yih Hwang, Yung Tang. A Hybrid Approach to Automatic Web Services Discovery. International Joint Conference on Service Sciences, 2011.

[6] Fangfang Liu, Yuliang Shi, Jie Yu, Tianhong Wang, Jingzhe Wu. Measuring Similarity of Web Services Based on WSDL. IEEE International Conference on Web Services, 2010.

[7] Fausto Giunchiglia, Pavel Shvaiko and Mikalai Yatskevich. S-match: an algorithm and an implementation of semantic matching. In Proceedings of the European Semantic Web Symposium, LNCS 3053, pp. 61-75, February 2004.

[8] Fausto Giunchiglia, Fiona McNeill, Mikalai Yatskevich, Juan Pane, Paolo Besana, Pavel Shvaiko. Approximate structure-preserving semantic matching. University of Trento, ODBASE, 2008.

[9] Jesús Oliva, José Ignacio Serrano, María Dolores del Castillo, and Ángel Iglesias. SyMSS: A syntax-based measure for short-text semantic similarity. Data & Knowledge Engineering 70, 390–40, 2011.

[10] OWLS-TC V2, OWL-S Service Retrieval Test Collection. http://www.semwebcentral.org/projects/owls-tc/

[11] Yiqiao Wang and Eleni Stroulia. Semantic Structure Matching for Assessing Web-Service Similarity. Computer Science Department, University of Alberta, Edmonton, AB, T6G 2E8, Canada, 2003.

[12] Guilherme C. Hobold, Frank Siqueira. Discovery of Semantic Web Services Compositions based on SAWSDL Annotations. 19th International Conference on Web Services, 2012.

[13] Wenjie Li, Wenjing Guo. Semantic-based Web Service Matchmaking Algorithm in Biomedicine. International Conference on BioMedical Engineering and Informatics, 2008.

[14] Dekang Lin. An Information-Theoretic Definition of Similarity. Proceedings of the 15th International Conference on Machine Learning, 1998.

[15] Hui Peng et al. Semantic Description and Discovery of travel web services. Artificial Intelligence and Computational Intelligence - 4th International Conference, 2012.

[16] Gehao Lu, Tengfei Wang, Guojin Zhang, Shijin Li. Semantic Web Services Discovery Based on Domain Ontology. World Automation Congress (WAC), pp. 1-4, 2012.
