integrating software engineering tools


  • 7/31/2019 Integrating Software Engineering Tools

    1/13

    Integrating software engineering tools and repositories with XML and XSLT

    Henrik Hedberg
    Department of Information Processing Science, University of Oulu
    PL 3000, 90014 University of Oulu, [email protected]

    Abstract

    Interoperability between heterogeneous repositories and applications is often needed in Internet-based software development. At present XML is increasingly being used to integrate repositories and to express data fetched from various sources, but mismatches are encountered between the schemas of different repositories. XSLT is typically used to stylize results, but this does not utilize the full potential of the technology. It could also be used for more complex manipulations such as schema transformations by converting data to and from various XML formats. XSLT also enables the construction of a common vocabulary for federations of data sources. To transform both queries and results, they must be expressed in XML. This paper presents an XML-based data query and manipulation language called XSQML (eXtensible Structured Query Modelling Language) and a framework for using it. This solution is simple, independent of programming languages and repository types, and applicable to changing situations and environments, such as the web. A prototype implementation of the framework called Xfi (eXtensible Framework for Interoperability) connects relational databases, a versioned file repository, flat files and XSQML-aware applications.

    Keywords: interoperability, heterogeneity, mismatch, XML, extensible markup language, query language, XSLT, schema transformation, common data structures, SQL, file repositories

    1 Introduction

    A distributed software development environment consists of tools of various kinds. In many cases applications handle the same or very similar kinds of data. Components of the necessary data are distributed over multiple heterogeneous repositories or replicated in several locations. The methods for enabling applications to reach such data, where they exist, are often manual or represent inflexible tailored solutions, which leads to inefficiency. Without coordination the integrity of the data may be lost.

    What is needed is interoperability. Integration does not necessarily mean tight binding of all applications, but may often be a loose connection between independent data sources and utilizers. Applications and repositories can continue operating as before, but also share their data with other applications.

    At present this kind of work is increasingly being done using XML (eXtensible Markup Language) [3], which can express all kinds of data fetched from various sources and is widely seen as a lever for application integration. As a web standard, XML has already broken the acceptance barriers and has been adopted in a wide variety of domains from small data repositories to enterprise applications.

    This paper presents an XML-based data query and manipulation language called XSQML (eXtensible Structured Query Modelling Language) which organizes data into entities and their attributes. A data model is used to reflect the different sources, such as relational databases and file repositories. The language introduces four basic commands for manipulating data.

    Until now XSLT (XSL Transformations) has mainly been used for stylistic transformations, although it could also modify a schema independently of its formatting [5]. Use of the XML-based data query and manipulation language not only allows the results to be handled but also ensures that requests can be transformed from one XML vocabulary to another.

    Xfi (eXtensible Framework for Interoperability) is a framework which connects applications using XSQML and legacy repositories. It utilizes XSQML and XSLT-based schema transformations to provide a common data structure and one point of interconnectivity with applications. The prototype implementation can handle relational databases using SQL, a versioned file repository and flat files.

    The present solution has many benefits, such as openness, simplicity, independence of programming languages and repository types and suitability for the web. It is also applicable to changing situations and environments.

    The rest of the paper is structured as follows. After a short review of related work, the XSQML language is described, along with a discussion of how to use XSLT to perform schema transformations and how to transfer XSQML requests and replies. The fourth chapter outlines the basic idea of Xfi, that of a common structure for data, and its implementation. The fifth chapter presents the prototype software development environment, and is followed by the conclusions.

    2 Related work

    There are already related solutions on the market. Oracle and IBM, for example, have XML-enabled their databases and XML has been used to integrate relational data sources. This chapter briefly reviews some alternatives.

    Oracle has released a utility called XMLSQL [1] which converts the result of a plain SQL query into XML by mapping columns to top-level elements and scalar values to subelements with text-only content. XML can be stored in a database so that XPath expressions can be used within a SQL operator, and XSLT can be applied to the resulting XML documents or for viewing an XML document in the database.

    IBM's XML Extender for DB2 [4] serves as a repository for XML documents which can be queried using SQL. Proprietary Data Access Definitions (DAD) define how XML is indexed, and an XML document can be generated from existing data using macros.

    SilkRoute [8] offers an ability to view and query relational data in XML. Initially the database is viewed as a deeply nested XML document defined by writing a complex RXL (Relational to XML transformation Language) query which combines the extraction part of SQL with the construction part of XML-QL. The actual user query is presented through the XML as viewed in XML-QL.

    XML-DBMS [2] is a generic XML load and extract utility for relational databases that uses its own XML-based language to describe mappings between documents and relational data. XML DTD and relational schemas can be generated dynamically.

    The idea of integrating heterogeneous information sources using a 3-tier scheme is presented in [13]. XML is used to carry the results but the original query is in SQL, which is also used for the necessary schema transformations. In [14] a medical database is monitored with a web browser, which matches the typical use of XML.

    The problem with existing solutions is product specificity, or limitation to a particular domain. Many of them include only data query capabilities or just construct XML documents from a whole database. Support exists only for relational databases, which is not enough for real interoperability. Important data could be located in a hierarchical directory service using LDAP (Lightweight Directory Access Protocol), a file repository using CVS (Concurrent Versions System) or just a flat file. Few solutions can handle schema mismatches, and because of incompatible interfaces, different products cannot be further integrated to cover a larger area. Although existing solutions prove the potential of XML, the need still exists for a more general interoperability framework.

    It is interesting to note that the combination of a markup language and middleware has been found successful [7]. XML can carry complex data structures, and semantic translations can be used to manage interface definition changes. Also, XSLT has been seen as a potential mapping language for semantic data conversions, although it has been argued that the specification is at too low a level of abstraction.

    3 XSQML: An XML-based query language

    XSQML (eXtensible Structured Query Modelling Language) is a query and data manipulation language based on XML. The main objective of the present research was to develop a lightweight technique for data transfers without manual operations. The work concentrated on creating a standard for interfaces between separate applications and repositories, such as relational databases, directory services, versioned file repositories or flat file systems, that would be applicable to a distributed web environment and a variety of programming languages, such as Java, C/C++ and Perl. The solution also had to be semi-integrated, so that components could operate independently but participate in a federation to allow their local data to be shared. This leads to schema mismatches, which can be solved using XSL transformations.

    The development of XSQML was started by considering what kind of functionality was needed and how it should be represented. The operations needed were basic queries and data manipulations, so that the complexity could be minimized. As most of the example applications included a relational database and the well-known, proven technology required for this, SQL was taken as a basis for the work. The aim was to develop an XML-based language which merges the useful parts of both technologies. XSQML models simple SQL-like statements as XML documents, and the result is also naturally a pure XML document with a behaviour that matches the XML-Acceptor Pattern [15].

    3.1 Structure of content data

    The content data in XSQML is organized into entities and their attributes. An entity is an XML element which has other elements as children, while attributes are elements whose content is pure character data representing the value of the attribute. Attribute elements can appear only as children of the corresponding entities. The name of an entity element specifies its type, which defines the attributes that it can have. Entities can also have subentities, which is indicated by simply nesting the related elements.

    It is important to note that attribute-value pairs are not expressed as XML element attributes. All the data are contained in textual nodes, which have element names that describe their meaning. XML element attributes are reserved for language directives. The model does not contain any mixed data items (elements which have both element and character data children).

    The structure may be illustrated by means of the following example, which involves two entities, Entity1, which has two attributes named attribute1 and attribute2, having the values value1 and value2, respectively, and Entity2, a subentity of Entity1 that has attribute3 possessing value3. The syntax is self-descriptive, as can be seen.

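    <Entity1>
      <attribute1>value1</attribute1>
      <attribute2>value2</attribute2>
      <Entity2>
        <attribute3>value3</attribute3>
      </Entity2>
    </Entity1>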

    In SQL, or when querying other tabular data, entities are mapped to tables and attributes to columns. An entity name is adopted directly as a table name and attribute-value pairs correspond to particular cell values in specified columns of certain rows. Not all the columns need to be specified as attributes, but naturally there cannot be any attributes other than existing columns. Nested entities can be thought of as joins in SQL, so that the previous example contains two tables linked together.
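The entity-to-table mapping described above can be sketched in a few lines of Python. This is only an illustration, not part of XSQML or Xfi: the function name `tables_and_columns` is hypothetical, and the sketch assumes that any element with child elements is an entity and any element without them is an attribute, so an entity with no attributes at all would be misclassified.

```python
import xml.etree.ElementTree as ET

# The content data example from the text: Entity1 with a nested Entity2.
CONTENT = """
<Entity1>
  <attribute1>value1</attribute1>
  <attribute2>value2</attribute2>
  <Entity2>
    <attribute3>value3</attribute3>
  </Entity2>
</Entity1>
"""

def tables_and_columns(entity, mapping=None):
    """Collect {table name: {column name: value}} from nested entities.

    Hypothetical helper: the classification rule (children => entity,
    no children => attribute) is an assumption made for this sketch.
    """
    if mapping is None:
        mapping = {}
    columns = mapping.setdefault(entity.tag, {})
    for child in entity:
        if len(child) > 0:
            # Element children: a subentity, i.e. a joined table.
            tables_and_columns(child, mapping)
        else:
            # Pure character data: an attribute, i.e. a column value.
            columns[child.tag] = (child.text or "").strip()
    return mapping

result = tables_and_columns(ET.fromstring(CONTENT))
print(result)
```

Each key of the returned dictionary corresponds to a table and each inner key to a column, which mirrors the way nested entities are read as joined tables.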

    3.2 Transactions

    One XSQML transaction is composed of two XML documents: a request and a reply. These consist of content data embodied in language elements. Specific XSQML elements and attributes belong to the XML namespace http://tol.oulu.fi/i3/xsqml/.

    Each XSQML request is a data query or manipulation expressed in XSQML. This version of XSQML includes four commands which mimic their SQL counterparts: select, insert, update and delete. Requests consist of a command element which specifies the operation and descendant elements which contain the query constraints or actual data pushed into the repository as specified in the content data model. There can be only one first-level entity.

    Only specified attributes are fetched when querying data. To limit the selection of entities, query, update and delete constraints are written in the place of attribute values. Empty elements mean that the requester is interested in all the values of the corresponding attribute. Language directives such as how to produce reply elements or to order the results are given as attributes of command or content elements.
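The convention of writing constraints in place of attribute values can be sketched as follows. This is a minimal illustration under stated assumptions, not the Xfi SQL driver: the function `entity_to_select` is hypothetical, it handles a single entity without the enclosing command element or any directives, and it assumes a constraint is an operator followed by a value, as in `<= 2000`.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a constraint is a comparison operator followed by a value.
CONSTRAINT = re.compile(r"^(<=|>=|<>|=|<|>)\s*(.+)$")

def entity_to_select(xml_text):
    """Turn one content entity into a parameterized SELECT statement.

    Empty attribute elements become selected columns; non-empty ones
    become WHERE conditions. Element names are used directly as table
    and column names, so a real driver would have to validate them.
    """
    entity = ET.fromstring(xml_text)
    table = entity.tag
    columns, conditions, params = [], [], []
    for attribute in entity:
        text = (attribute.text or "").strip()
        if not text:
            columns.append(f"{table}.{attribute.tag}")
            continue
        match = CONSTRAINT.match(text)
        if match is None:
            raise ValueError(f"unrecognized constraint: {text!r}")
        operator, value = match.groups()
        conditions.append(f"{table}.{attribute.tag} {operator} ?")
        params.append(value)
    if not columns:
        raise ValueError("no attributes were requested")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params

# "<" must be escaped as &lt; inside XML character data.
sql, params = entity_to_select(
    "<Entity1><attribute1/><attribute2>&lt;= 2000</attribute2></Entity1>"
)
print(sql, params)
```

Keeping the constraint values as query parameters rather than pasting them into the statement is a design choice of this sketch; the paper does not say how the prototype handles them.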

    For example, if we want to fetch the value of attribute1 from the Entity1 entities whose value of attribute2 is lower than or equal to 2000, the following request can be used. There is also a directive that results should be placed in ascending order of attribute1. Note that lower than (


    = 2000-01-01


    The Delegrator searches for the entities used in the request and decides which repository is responsible for such data. As all the entities are tables in the same database, the whole request can be forwarded to the next module (Arrow 2).

    The XSLT converter transforms the request using preconfigured XSLT rules. The target repository does not contain the entity name Inspector, but it does have an entity named Resource, which includes human resources, marked with a type attribute having the value h. Inspection is mapped in the same way to a Task, the type of which is i, and Participation is transformed directly to Task_resources.

    = h

    >= 2000-01-01= i
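The renaming step can be approximated in plain Python to make the mapping concrete. Note that this swaps in `xml.etree.ElementTree` for XSLT, which is what the converter actually uses. The `RULES` table, the `transform` function and the nesting of the sample request are assumptions made for illustration; only the three entity mappings and the two type constraints come from the text.

```python
import xml.etree.ElementTree as ET

# Source entity name -> (target entity name, added type constraint or None).
RULES = {
    "Inspector": ("Resource", "= h"),
    "Inspection": ("Task", "= i"),
    "Participation": ("Task_resources", None),
}

def transform(element):
    """Rename entity elements in place and add the type constraints."""
    for child in list(element):
        transform(child)
    rule = RULES.get(element.tag)
    if rule is not None:
        element.tag, type_constraint = rule
        if type_constraint is not None:
            # The target schema distinguishes kinds by a type attribute.
            ET.SubElement(element, "type").text = type_constraint
    return element

# Hypothetical request content; the real XSQML command element is omitted.
request = ET.fromstring(
    "<Inspector><name/>"
    "<Participation><Inspection>"
    "<begin>&gt;= 2000-01-01</begin>"
    "</Inspection></Participation>"
    "</Inspector>"
)
converted = transform(request)
print(ET.tostring(converted, encoding="unicode"))
```

An inverse rule table would be needed to map reply entities back to the requester's vocabulary, in the same way that the converter uses an inverse XSLT document.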

    The XSLT converter sends this modified request to the SQL driver (3), which parses it and forms a counterpart SQL statement, which is executed (4).

    SELECT Resource.id, Resource.name
    FROM Resource, Task_resources, Task
    WHERE Resource.type = 'h' AND Task.type = 'i'
    AND Task.begin >= '2000-01-01'
    AND Resource.id = Task_resources.resource
    AND Task_resources.task = Task.id
    ORDER BY Resource.name ASC;

    This time the RDBMS returns the following result set (5). There are only two persons who have performed an inspection since 1 January 2000.

    Resource.id Resource.name
    ----------- ------------------
    203         Doe John
    7           Meikalainen Matti
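The statement can be tried against a small in-memory SQLite database. This is a sketch under assumptions: the table definitions and all rows are invented for illustration (the paper does not give the schema or the data), chosen so that the result matches the two persons reported above. The column `begin` is quoted because BEGIN is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows; only the table and column names
# used in the query come from the text.
conn.executescript("""
CREATE TABLE Resource (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE Task (id INTEGER PRIMARY KEY, "begin" TEXT, type TEXT);
CREATE TABLE Task_resources (resource INTEGER, task INTEGER);

INSERT INTO Resource VALUES
    (203, 'Doe John', 'h'),
    (7, 'Meikalainen Matti', 'h'),
    (12, 'Projector', 'e');
INSERT INTO Task VALUES
    (1, '2000-03-15', 'i'),
    (2, '1999-11-02', 'i'),
    (3, '2000-05-01', 'd');
INSERT INTO Task_resources VALUES
    (203, 1), (7, 1), (7, 2), (12, 3), (203, 3);
""")

QUERY = """
SELECT Resource.id, Resource.name
FROM Resource, Task_resources, Task
WHERE Resource.type = 'h' AND Task.type = 'i'
AND Task."begin" >= '2000-01-01'
AND Resource.id = Task_resources.resource
AND Task_resources.task = Task.id
ORDER BY Resource.name ASC;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Storing dates as ISO 8601 text lets the `>=` comparison work as a plain string comparison, which is why the sketch uses the TEXT type for `begin`.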

    The SQL driver converts the results to an XSQML reply and sends this to the requester, which was the XSLT converter (6). This is quite straightforward, except that a primary key which was not explicitly queried is added to the entity element.

    Doe John

    Meikalainen Matti
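Building reply entities from a result set can be sketched as below. The function `rows_to_entities` is hypothetical, and so is the choice to carry the primary key as an XML attribute named `id`: the text only says that the key is added to the entity element, not how it is represented. The enclosing XSQML reply element is left out because its syntax is not shown here.

```python
import xml.etree.ElementTree as ET

def rows_to_entities(entity_name, key_name, column_names, rows):
    """Build one entity element per row.

    The first value of each row is taken as the primary key and stored
    as an XML attribute (an assumption of this sketch); the remaining
    values become attribute elements with character data content.
    """
    entities = []
    for key, *values in rows:
        entity = ET.Element(entity_name, {key_name: str(key)})
        for column, value in zip(column_names, values):
            ET.SubElement(entity, column).text = str(value)
        entities.append(entity)
    return entities

entities = rows_to_entities(
    "Resource", "id", ["name"],
    [(203, "Doe John"), (7, "Meikalainen Matti")],
)
serialized = [ET.tostring(entity, encoding="unicode") for entity in entities]
for line in serialized:
    print(line)
```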

    The XSLT converter transforms the reply to its original schema using an inverse XSLT document. Only entity names are affected.


    2. R. Bourret, C. Bornhövd, and A. Buchmann: A Generic Load/Extract Utility for Data Transfer Between XML Documents and Relational Databases. In Second International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, pp. 134-143, 2000.

    3. T. Bray, J. Paoli, and C.M. Sperberg-McQueen (eds.): Extensible Markup Language (XML) 1.0. W3C Recommendation. http://www.w3.org/TR/1998/REC-xml-19980210

    4. J. Cheng: XML and DB2. In Proceedings of the 16th International Conference on Data Engineering, pp. 569-573, 2000.

    5. J. Clark (ed.): XSL Transformations (XSLT) Version 1.0, W3C Recommendation. http://www.w3.org/TR/1999/REC-xslt-19991116

    6. J. Clark: XT Java package, version 19991105. http://www.jclark.com/xml/xt.html

    7. W. Emmerich, W. Schwarz, and A. Finkelstein: Markup meets middleware. In Proceedings of the 7th IEEE Workshop on Distributed Computing Systems, pp. 261-266, 1999.

    8. M. Fernández, T. Wang-Chiew, and D. Suciu: SilkRoute: Trading between Relations and XML. In Proceedings of 9th International World Wide Web Conference, 2000. http://www9.org/w9cdrom/202/202.html

    9. M. Girardot, and N. Sundaresan: Millau: an encoding format for efficient representation and exchange of XML over the Web. In Proceedings of 9th International World Wide Web Conference, 2000. http://www9.org/w9cdrom/154/154.html

    10. Hypertext Transfer Protocol v1.1, RFC 2616. http://www.w3.org/Protocols/rfc2616/rfc2616.html

    11. L. Harjumaa: Virtual Software Inspections over the Internet. In Proceedings of the 3rd Workshop on Software Engineering over the Internet, pp. 30-40, 2000.

    12. B. Martin, and B. Jano (eds.): WAP Binary XML Content Format, W3C Note. http://www.w3.org/1999/06/NOTE-wbxml-19990624

    13. C. Petrou, S. Hadjiefthymiades, and D. Martakos: An XML-based, 3-tier scheme for integrating heterogeneous information sources to the WWW. In Proceedings of the Tenth International Workshop on Database and Expert Systems Applications, pp. 706-710, 1999.

    14. A. Pons, J. Millet, E. Gijarro, and M. Mainteiga: Medical database migration using new XML Internet standard. Computers in Cardiology, pp. 93-96, 1999.

    15. The XML-Acceptor Pattern - XML is the API. http://www.xmleverywhere.com/xml-acceptor.html