On Evaluating and Publishing Data Concerns for Data as a Service

APSCC 2010, Hangzhou 9 Dec 2010 1 On Evaluating and Publishing Data Concerns for Data as a Service Hong-Linh Truong and Schahram Dustdar Distributed Systems Group, Vienna University of Technology [email protected] http://www.infosys.tuwien.ac.at/Staff/truong

Page 1: On Evaluating and Publishing Data Concerns for Data as a Service

APSCC 2010, Hangzhou 9 Dec 2010 1

On Evaluating and Publishing Data Concerns for Data as a Service

Hong-Linh Truong and Schahram Dustdar

Distributed Systems Group, Vienna University of Technology

[email protected]

http://www.infosys.tuwien.ac.at/Staff/truong

Page 2: On Evaluating and Publishing Data Concerns for Data as a Service

Overview

- Motivation and background
- Data concern-aware service engineering process
- A framework for evaluating and publishing QoD of DaaS
- Experiments
- Conclusions and future work

Page 3: On Evaluating and Publishing Data Concerns for Data as a Service

The rise of DaaS

- Web services technologies and the cloud computing model foster the concept of data/information as a service (DaaS)
  - Provide data capabilities rather than computation or software
- Providing DaaS is an increasing trend
  - In both business and e-science environments
  - Bio data, weather data, company balance sheets, etc., via Web services
- But data is associated with many data concerns
  - Quality of data, privacy, licensing, etc.

Page 4: On Evaluating and Publishing Data Concerns for Data as a Service

Examples of DaaS

Source: http://www.undata-api.org/

Source: http://www.strikeiron.com/Catalog/StrikeIronServices.aspx

Source: http://docs.gnip.com/w/page/23722723/Introduction-to-Gnip

Page 5: On Evaluating and Publishing Data Concerns for Data as a Service

Motivation: the role of data concerns

- Data consumers/data integrators need "data concerns"
  - to use data in the right way: Is the data good? Is it free?
  - to filter irrelevant results and avoid information overload
  - to save processing time, energy, and storage
- Both DaaS service providers and data providers need to evaluate and provide data concerns
- Should we perform data composition?

Page 6: On Evaluating and Publishing Data Concerns for Data as a Service

Motivation: service provider versus data provider

- The DaaS service provider is separated from the data provider

[Figure: consumers, the service provider (DaaS), and data providers (a sensor, another DaaS), annotated with concerns quality1/privacy1 and quality2/privacy2]

- There is a lack of techniques and tools for evaluating and publishing data concerns for DaaS

Page 7: On Evaluating and Publishing Data Concerns for Data as a Service

Example: DaaS provider != data provider

Source: http://www.infochimps.org

Page 8: On Evaluating and Publishing Data Concerns for Data as a Service

Background: data resources

- Data items → data resources → DaaS APIs → consumers
- DaaS and data providers have the right to publish the data

[Figure: data items are grouped into data resources; the DaaS exposes the data resources through service APIs to consumers via SOAP/REST]

Page 9: On Evaluating and Publishing Data Concerns for Data as a Service

Background: diverse concerns associated with service and data

Hong-Linh Truong, Schahram Dustdar, "On Analyzing and Specifying Concerns for Data as a Service", The 2009 Asia-Pacific Services Computing Conference (IEEE APSCC 2009), (c) IEEE Computer Society, December 7-11, 2009, Biopolis, Singapore.

Page 10: On Evaluating and Publishing Data Concerns for Data as a Service

Data concern-aware service engineering process

- Typical activities for data wrapping and publishing
- Typical activities for data updating and retrieval

Page 11: On Evaluating and Publishing Data Concerns for Data as a Service

Wrapping, selecting, and updating data in DaaS

- Strategies typically differ for structured data and unstructured data; this is not the focus of our work
- We reuse existing techniques so that our data concern evaluation and publishing techniques can be plugged in

Page 12: On Evaluating and Publishing Data Concerns for Data as a Service

Evaluating data concerns (1)

Based on three concepts: evaluation scopes, evaluation modes, and integration models

- Evaluation scopes: enable fine-grained evaluation
  - Three scopes: data resource, service operation, and service as a whole
- Evaluation modes: suitable for different types of data
  - Off-line (before the data is accessed) and on-the-fly (when the data is requested)
- Integration models: suitable for different tool integration strategies
  - Push and pull of data concerns
  - Pass-by-value versus pass-by-reference to data concern evaluation tools

Page 13: On Evaluating and Publishing Data Concerns for Data as a Service

Evaluating data concerns (2)

[Figure: three integration variants: pull with pass-by-reference, pull with pass-by-value, and push with pass-by-value]
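The three integration variants can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `EvaluationTool` and `DaaS`, the URI `urn:example:population`, the toy completeness metric, and the reading of "push" as "the DaaS receives already evaluated concerns" are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory store standing in for data resources addressed by URI.
RESOURCES: Dict[str, List[dict]] = {
    "urn:example:population": [
        {"country": "Austria", "value": 8.4},
        {"country": "China", "value": None},
    ],
}


def completeness(records: List[dict]) -> float:
    """Toy QoD metric (assumed): fraction of records whose 'value' is present."""
    if not records:
        return 0.0
    return sum(r["value"] is not None for r in records) / len(records)


class EvaluationTool:
    """Stand-in for a third-party data concern evaluation tool."""

    def evaluate_by_value(self, records: List[dict]) -> Dict[str, float]:
        # Pass-by-value: the caller ships the data itself to the tool.
        return {"completeness": completeness(records)}

    def evaluate_by_reference(
        self, uri: str, resolve: Callable[[str], List[dict]]
    ) -> Dict[str, float]:
        # Pass-by-reference: the tool gets a URI and fetches the data itself.
        return self.evaluate_by_value(resolve(uri))


class DaaS:
    """Stand-in for the service that stores concerns per data resource."""

    def __init__(self, tool: EvaluationTool) -> None:
        self.tool = tool
        self.concerns: Dict[str, Dict[str, float]] = {}

    def pull_by_reference(self, uri: str) -> Dict[str, float]:
        # Pull: the DaaS asks the tool, handing over only the reference.
        self.concerns[uri] = self.tool.evaluate_by_reference(uri, RESOURCES.__getitem__)
        return self.concerns[uri]

    def pull_by_value(self, uri: str) -> Dict[str, float]:
        # Pull: the DaaS asks the tool, handing over the data.
        self.concerns[uri] = self.tool.evaluate_by_value(RESOURCES[uri])
        return self.concerns[uri]

    def receive_push(self, uri: str, concerns: Dict[str, float]) -> None:
        # Push (assumed reading): concerns evaluated elsewhere are sent to the DaaS.
        self.concerns[uri] = concerns
```

Pass-by-reference keeps large data sets out of the request at the cost of requiring the tool to reach the data resource; pass-by-value works for any object but moves the data.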

Page 14: On Evaluating and Publishing Data Concerns for Data as a Service

Publishing data concern information

- Off-line publishing of data concerns
  - suitable for static data concerns
  - the publishing of data concerns of a data resource is separated from the service operation that provides access to the data resource
- On-the-fly publishing of data concerns by associating concerns with retrieved data resources
  - the resulting data resources (e.g., from queries) are annotated with data concerns evaluated by data concern evaluation tools
  - suitable for providing dynamic data concerns
- On-the-fly publishing of data concerns through queries
  - different service operation parameters are used to query data concerns of data resources
  - suitable for validating data concerns before accessing data resources
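The difference between off-line publishing and on-the-fly annotation can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the record layout, and the single `completeness` metric are hypothetical and not taken from the framework.

```python
from typing import Callable, Dict, List


def evaluate_concerns(records: List[dict]) -> Dict[str, float]:
    """Assumed evaluation tool: share of records with a non-missing 'value'."""
    if not records:
        return {"completeness": 0.0}
    present = sum(1 for r in records if r.get("value") is not None)
    return {"completeness": present / len(records)}


def publish_offline(catalog: Dict[str, dict], resource_id: str, records: List[dict]) -> None:
    # Off-line: concerns are evaluated ahead of time and stored in a catalog
    # that is separate from the operation that serves the data.
    catalog[resource_id] = evaluate_concerns(records)


def retrieve_annotated(records: List[dict], predicate: Callable[[dict], bool]) -> dict:
    # On-the-fly: concerns are evaluated for exactly the data returned by
    # this request and attached to the result.
    selected = [r for r in records if predicate(r)]
    return {"data": selected, "concerns": evaluate_concerns(selected)}
```

The off-line catalog describes the whole data resource, while the annotated result describes only the subset a query selected, which is why the latter suits dynamic concerns.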

Page 15: On Evaluating and Publishing Data Concerns for Data as a Service

How do we utilize the data concern-aware service engineering process?

- Using this model we can determine and publish several concerns
- Our proof of concept
  - A framework for evaluating and publishing QoD of DaaS
  - A proof-of-concept implementation of the data concern-aware service engineering process
- Another example: model and publish privacy concerns for DaaS [ECOWS 2010]

Michael Mrissa, Salah-Eddine Tbahriti, Hong-Linh Truong, "Privacy model and annotation for DaaS", The 8th European Conference on Web Services (ECOWS 2010), (c) IEEE Computer Society, 1-3 December, 2010, Ayia Napa, Cyprus

Page 16: On Evaluating and Publishing Data Concerns for Data as a Service

QoD framework: pull QoD evaluation models for DaaS

- Pass-by-reference and pass-by-value
  - References to data resources: URIs
  - Values: any object
- Third-party data evaluation tools

Page 17: On Evaluating and Publishing Data Concerns for Data as a Service

QoD framework: publishing concerns (1)

- Off-line data concern publishing
  - a common data concern publication specification
  - a tool for providing data concerns according to the specification
  - supported by external service information systems

Page 18: On Evaluating and Publishing Data Concerns for Data as a Service

QoD framework: publishing concerns (2)

- On-the-fly querying of data concerns associated with data resources
  - Using our proposed REST parameter convention [Composable Web 2010]
  - Based on metric names in the data concern specification
  - Requests are specified with query parameters of the form metricName=value

GET /resource?accuracy="0.5"&location="Europe"

Hong-Linh Truong, Schahram Dustdar, Andrea Maurino, Marco Comerio: Context, Quality and Relevance: Dependencies and Impacts on RESTful Web Services Design. ICWE Workshops 2010: 347-359
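A request that follows the metricName=value convention could be handled as in the Python sketch below. The matching semantics are an assumption made for illustration (a numeric metric is treated as a minimum threshold, any other metric as an exact match); the slides do not define them, and `parse_concern_query` and `matches` are hypothetical helper names.

```python
from typing import Dict, Union
from urllib.parse import parse_qsl, urlsplit


def parse_concern_query(url: str) -> Dict[str, str]:
    """Extract metricName=value pairs, dropping quotes around the values."""
    query = urlsplit(url).query
    return {name: value.strip('"') for name, value in parse_qsl(query)}


def matches(concerns: Dict[str, Union[float, str]], requested: Dict[str, str]) -> bool:
    """Check published concerns of a data resource against a request.

    Assumed semantics: numeric metrics must be at least the requested
    value; all other metrics must be equal to it.
    """
    for name, value in requested.items():
        if name not in concerns:
            return False
        published = concerns[name]
        if isinstance(published, (int, float)):
            if published < float(value):
                return False
        elif str(published) != value:
            return False
    return True
```

A service operation would call `parse_concern_query` on the incoming request and return only the data resources for which `matches` holds, so a consumer can validate concerns before retrieving data.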

Page 19: On Evaluating and Publishing Data Concerns for Data as a Service

QoD framework: QoD monitoring and composition

- QoD concern monitoring and composition are useful for evaluating aggregated data resources
- Our approach
  - Utilizing monitoring rules
  - QoD metrics of data resources are passed to a rule engine
  - Rules are user-defined for monitoring and composing QoD metrics
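The idea of passing QoD metrics to user-defined rules can be sketched in plain Python. This sketch does not use a rule engine and does not reproduce the authors' rules: the `Rule` class, the `compose` policy (an aggregated resource is only as good as its worst source) and the 0.8 threshold in the usage note are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Rule:
    """A user-defined monitoring rule: fire `message` when `condition` holds."""
    name: str
    condition: Callable[[Metrics], bool]
    message: str


def compose(sources: List[Metrics]) -> Metrics:
    """Compose QoD metrics of several data resources.

    Assumed policy: for each metric shared by all sources, the aggregate
    takes the minimum value.
    """
    if not sources:
        return {}
    shared = set.intersection(*(set(m) for m in sources))
    return {name: min(m[name] for m in sources) for name in sorted(shared)}


def monitor(metrics: Metrics, rules: List[Rule]) -> List[str]:
    """Return the messages of all rules whose condition is satisfied."""
    return [rule.message for rule in rules if rule.condition(metrics)]
```

For example, a rule such as `Rule("low-dataset", lambda m: m["datasetcompleteness"] < 0.8, "dataset completeness below 0.8")` would fire on a composed resource whose weakest source reports 0.6.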

Page 20: On Evaluating and Publishing Data Concerns for Data as a Service

Experiments

- Implementation
  - Java, JAX-RS/Jersey
  - Drools
- Utilizing UNDataAPI (www.undata-api.org)
  - XML data sets without QoD
- Illustrating examples: check data from 1990-2009 using two metrics
  - datasetcompleteness: the completeness of the list of countries
  - dataelementcompleteness: the completeness of data elements in the list
- RESTful services wrapping UNDataAPI
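The two metrics can be made concrete with a small Python sketch. The formulas are assumptions, since the slides name the metrics but do not define them: here datasetcompleteness is the share of expected countries that appear in the data set, and dataelementcompleteness is the share of (country, year) elements with a non-missing value among the countries that appear. The record layout is hypothetical.

```python
from typing import Iterable, List


def dataset_completeness(records: List[dict], expected_countries: Iterable[str]) -> float:
    """Assumed definition: fraction of expected countries present in the data set."""
    expected = set(expected_countries)
    if not expected:
        return 0.0
    present = {r["country"] for r in records}
    return len(present & expected) / len(expected)


def data_element_completeness(records: List[dict], years: Iterable[int]) -> float:
    """Assumed definition: fraction of (country, year) elements that carry a
    value, counted over the countries that appear in the data set."""
    wanted_years = set(years)
    countries = {r["country"] for r in records}
    expected = len(countries) * len(wanted_years)
    if expected == 0:
        return 0.0
    filled = {
        (r["country"], r["year"])
        for r in records
        if r.get("value") is not None and r["year"] in wanted_years
    }
    return len(filled) / expected
```

For the 1990-2009 check mentioned above, `years` would be `range(1990, 2010)`.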

Page 21: On Evaluating and Publishing Data Concerns for Data as a Service


Experiment: evaluating and annotating QoD metrics

http://www.infosys.tuwien.ac.at/prototyp/SOD1/dataconcerns/

Page 22: On Evaluating and Publishing Data Concerns for Data as a Service


Experiments: publishing QoD with data resources

Page 23: On Evaluating and Publishing Data Concerns for Data as a Service


Experiments: simple rules for monitoring and composing QoD

Page 24: On Evaluating and Publishing Data Concerns for Data as a Service

Conclusions and future work

- A novel, generic data concern-aware service engineering process for DaaS
- A proof-of-concept implementation for evaluating quality of data in REST-based DaaS
  - but in principle other concerns can be supported
  - more evaluation is needed
- Open research questions
  - how to deal with other concerns?
  - what are the trade-offs between on-line and off-line evaluation?
  - how to utilize evaluated data concerns for optimizing data compositions?

Page 25: On Evaluating and Publishing Data Concerns for Data as a Service

Thanks for your attention!

Hong-Linh Truong
Distributed Systems Group
Vienna University of Technology
Austria

[email protected]

http://www.infosys.tuwien.ac.at