Biodiversity and climate change use scenarios framework for the GEOSS interoperability pilot process

Stefano Nativi a,*, Paolo Mazzetti a, Hannu Saarenmaa b, Jeremy Kerr c, Éamonn Ó Tuama d

a Italian National Research Council (CNR)-IMAA, C.da S. Loja, Zona industriale-Tito Scalo, 85050 Italy
b Finnish Museum of Natural History, 00014 University of Helsinki, Finland
c Canadian Facility for Ecoinformatics Research, Department of Biology, University of Ottawa, Box 450, Station A, Ottawa, ON, Canada K1N6N5
d Global Biodiversity Information Facility Secretariat, Universitetsparken 15, 2100 Copenhagen, Denmark

ECOLOGICAL INFORMATICS 4 (2009) 23–33
available at www.sciencedirect.com
www.elsevier.com/locate/ecolinf
doi:10.1016/j.ecoinf.2008.11.002
1574-9541/$ – see front matter © 2008 Elsevier B.V. All rights reserved.

* Corresponding author. E-mail addresses: [email protected] (S. Nativi), [email protected] (P. Mazzetti), [email protected] (H. Saarenmaa), [email protected] (J. Kerr), [email protected] (É. Ó Tuama).

ARTICLE DATA

Article history: Received 22 August 2008. Received in revised form 25 November 2008. Accepted 26 November 2008.

Keywords: Biodiversity; Climate Change; SOA (Service Oriented Architecture); Mediation services; GEOSS (Global Earth Observation System of Systems); IP3 (Interoperability Process Pilot Project); Megascience infrastructure; Macroecological research

ABSTRACT

Climate change threatens to commit 15–37% of species to extinction by 2050. There is a clear need to support policy-makers analyzing and assessing the impact of climate change along with land use changes. This requires a megascience infrastructure that is capable of discovering and integrating enormous volumes of multi-disciplinary data, i.e. data from biodiversity, earth observation, and climatic archives. Metadata and services interoperability is necessary. The Global Earth Observation System of Systems (GEOSS) works to realize such an interoperability infrastructure based on systems architecture standardization. In this paper we describe the results of linking the infrastructures of Climate Change research and Biodiversity research together using the approach envisioned by GEOSS. In fact, we present and discuss a service-oriented framework which was applied to implement and demonstrate the Climate Change and Biodiversity use scenario of the GEOSS Interoperability Process Pilot Project (IP3). This interoperability is done for the purpose of enabling scientists to do large-scale ecological analysis. We describe a generic use scenario and related modelling workbench that implement an environment for studying the impacts of climate change on biodiversity. The Service Oriented Architecture framework, which realizes this environment, is described. Its standard-based components and services, according to GEOSS requirements, are discussed. This framework was successfully demonstrated at the GEO IV Ministerial Meeting in Cape Town, South Africa, November 2007.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Climate change threatens to commit 15–37% of species to extinction by 2050 (Thomas et al., 2004; also Buckley and Roughgarden 2004; Harte et al., 2004), accelerating a mass extinction precipitated by widespread land use changes. The need to assess these impacts and recommend solutions to policy-makers is correspondingly acute and has been highlighted by the Fourth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC, 2007).

Such analyses require robust infrastructure capable of integrating enormous volumes of data from biodiversity archives, satellite remote sensing, and climatic data. The integration is a stepwise process, where careful definition of a series of specific applications for these data (use scenarios), including step-by-step processes for analyses, is required.



2. 24E CO L O G I CA L IN F O RM A TI CS 4 ( 2 0 09 ) 233 3 Using real data to solve specific problems, these use scenarios Service Oriented Architecture (SOA) as a framework, and can be transformed into use cases and implemented within an describe the flow of steps as a business process using SOA interoperable, distributed component architecture. Differentterminology and build on the OpenModeller technology. In parts of that infrastructure have already been establishedthis paper, we extend the scope, and present an SOA independently by the IPCC and by the Global Biodiversityarchitectural solution developed for analyzing the impact of Information Facility (GBIF) in their own areas, respectively. climate change on biodiversity, including the required useHowever, linking major infrastructures in separate scenario. Because acquiring and processing environmental domains together will require additional sources of metadatadata is a crucial step in this analysis, we describe the SOA and related infrastructure services. That is, unless theframework and the developed architecture components and integration is done using customized point-to-point protocols services which are standard-based according to GEOSS where provider and user know each other but third parties are requirements. We present how these were used in the GEO excluded. The Global Earth Observation System of SystemsIV Summit in Cape Town in November 2007 for on-the-fly data (GEOSS) now promises to make these disparate services discovery and selection (Nativi et al., 2007b). Finally, we available through its Clearinghouse registry of registriesdescribe the system and the experiments which are part of system. Interoperability Pilot Process (IP3). This is an action of theThe Group on Earth Observations (GEO) currently includes GEOSS AR-07-01 Task (GEO, 20072009). Several components 76 member countries, the European Commission, and 51and services are already registered in the GEOSS registers. 
We intergovernmental, international, and regional organizations. believe that presentation of this work can help inform the GEO envisioned a system of systems to help realize a future ecological and biodiversity community of the importance of wherein decisions and actions for the benefit of humankindGEOSS for efficient macroecological research. are informed via coordinated, comprehensive and sustained Earth observations and information (GEO, 2005). The GEOSS 1.1. The Species Response to Climate Change use scenarios Implementation Plan spans 10 years (20052015) and recog- for GEOSS IP3 nises nine Societal Benefit Areas (SBAs) including Climate, Ecosystems and Biodiversity. The GEOSS strategy consists of The presented SOA framework was applied to implement and leveraging existing systems and services and promotingdemonstrate the Climate Change and Biodiversity use sce- interoperability through the adoption of a Service Oriented nario of the GEOSS IP3. Architecture (SOA) framework approach based on established The Interoperability Process Pilot Project (IP3) is part of the standards from bodies such as the International OrganizationGEOSS task AR-07-01 (GEO, 20072009) aiming to prototype and for Standardization (ISO) and Open Geospatial Consortiumvalidate the implementation of the Core GEOSS infrastruc- (OGC).ture and the processes for contributing and linking systems.In this paper, we describe the results of linking theThe IP3 was conceived as a way to exercise the process that has infrastructures of Climate Change research and Biodiversity been defined for reaching interoperability arrangements research together using an approach that is compatible with (Khalsa et al., submitted for publication). IP3 helps to identify the GEOSS service-oriented framework. 
This interoperability the system components and discuss the standards, interface is done for the purpose of enabling scientists to do large- protocols and interoperability agreements currently used by scale ecological analysis. We describe a generic use scenario disciplinary systems, such as GBIF and IPCC. and related modelling workbench that implement an envir- IP3 developed a series of projects involving different SBAs, onment for studying the impacts of climate change onworking out a suite of demonstrations. Four systems/disciplines biodiversity. were initially identified as sources for the pilot project, coveringThe most widely used approach for describing the steps for Earth's water cycle, climate, seismology, and biodiversity large-scale biodiversity data analysis is Ecological Niche(Khalsa et al., submitted for publication). One of them, namely: Modelling (ENM), pioneered by Peterson et al. (2001, 2002)Species Response to Climate Change was developed into and refined subsequently by many others (e.g. Elith et al., functional demonstrations building on the presented frame- 2006). ENM is now employed for a range of global change and work: Ecological Niche Modelling was used to predict present- macroecological applications (e.g. White and Kerr 2007; Kerrday niches for different species (e.g. butterflies in Canada and et al., 2007; Kharouba et al., in press). GBIF has promoted thisAlaska and pikas in the North-West America) and then to approach and organised several international workshops on predict their shifts under different global and regional climate the topic.1 The modelling tools for ENM have diversified andchange scenarios. These demonstrations were presented at the are being made available as an open framework and web GEO IV Ministerial Meeting in Cape Town, South Africa services2 through the OpenModeller project3 (see Canhos et al., November 2007 (Nativi et al., 2007b). 
2004).The steps that are required for ENM have recently been described in detail by Santana et al. (2008). They use the2.Scenarios definitionIn the following, we briefly delineate the steps, with accom-1panying data needs, in a simple scenario intended to provide http://www.gbif.org/prog/ocb/modeling_workshop.2 http://openmodeller.cria.org.br/wikis/omgui/Use_Case_Scenario_example outcomes for a topical purpose, namely predicting for_Open_Modeller_Web_Services_API. shifts in the spatial distribution of species' niches as a 3 http://openmodeller.sourceforge.net/. consequence of climate change. Each step in the process we 3. E CO L O G I CA L IN F O R MA TI CS 4 ( 2 0 09 ) 233 325 Fig. 1 Activity diagram for the use scenario of biodiversity and climate change.outline reflects the business processes for ENM (Santana et al., those gaps. The need for more and better data can be 2008; Fig. 1). They are: communicated to policy makers. 3. Determine what environmental characteristics most likely 1. Identify taxa for which sufficient data exist to conductinfluence target species' niches. Examples of data that arebroad-scale analyses aimed at predicting the impacts of most widely necessary include high resolution land cover andglobal change on species distributions in the future. It is climate data. Climate and land use change models can helpuseful for such data to have a historical dimension also, forecast future environmental conditions, but models ofreaching back 30 years or more, so that responses tofuture conditions are unlikely to match either present-day,recently observed climatic and land use changes can bespatial observations of climatic or weather, or of land use. Thedocumented. Predictions for future niche shifts are likely to latter can be observed remote using very high resolutionbe more accurate when limited to species that have recently satellite data (see Kerr and Ostrovsky, 2003) or in situ. 
Clearly,responded predictably to climate changes that have been models of the future are subject to relatively large uncertain-directly observed (Kharouba et al., in press). Although there ties but can nevertheless provide plausible forecasts of changeare many biodiversity datasets that satisfy these stringent that can, and should, be considered for planning purposes.criteria, they are patchily distributed (e.g. birds from the 4. Determine what climatological data are needed for Ecolo-United Kingdom, butterflies from Canada, etc.). ENM can begical Niche Modelling of the selected group of organisms forapplied to these datasets. Identifying other datasets is apast, current, or future scenarios.challenge but one that GBIF can help solve. If all required5. Determine which modelling algorithms will most accuratelydatasets are stored in a repository online, then data miningand precisely predict shifts in distribution and abundance fortechniques can be used to discover available, comprehen-the selected group of organisms. Identify the reporting needssive datasets. If caching or other central or distributed in terms of data accuracy and error propagation.repositories are inaccessible or do not exist, expert advice 6. Collect the selected species occurrence data (e.g. frommay still successfully identify needed datasets.GBIF), environmental and climate data (e.g. from IPCC) to 2. After assembling biodiversity datasets and mapping theirthe modelling workbench.spatial and temporal distributions, gaps in information7. Run the models and present outputs as series of maps andbecome clearer. These gaps can then imply new datapredicted abundance numbers. Model accuracy should besharing opportunities within and among countries to fill in tested so uncertainty in model outputs under the range of 4. 26E CO L O G I CA L IN F O RM A TI CS 4 ( 2 0 09 ) 233 3desired scenarios can be included to provide a realisticFig. 2 shows the overall system architecture. 
It consists ofdepiction of policy options. This step will eliminate modelsix main logical components:outputs that are clearly inaccurate and consequentlyminimize the likelihood that failed models will inadver- 1. Biodiversity Data Provider: a component providing biodi-tently influence policy. This approach resembles that of theversity data. It supports two logical operations: a) gettingIPCC is presenting different climate change scenarios,an index of available datasets; b) getting data of a specificdepending on variations in emission reduction efforts.dataset. 2. Climatological Data Provider: a component providingThe above scenario is but one example of a broad-scaleclimatological data. It supports two logical operations: application for biodiversity data. Biodiversity is also affected a) collecting an index of available datasets; by other factors such as tropical deforestation, for which other b) collecting specific data, after a suitable target dataset is scenarios can be produced.identified.Biodiversity is not only being impacted, but is also an3. Catalog: a component performing queries on the available essential component in providing ecosystem services forbiodiversity and climatological datasets. It supports search agriculture, health, the chemical industry, etc. However,operations. Such operation can be very complex, applying these additional scenarios can be foreseen to build on the different kinds of filters based on spatial and/or temporal same pool of primary biodiversity data as the describedcriteria. It performs search operation using indices from climate change scenario. known data providers. This catalog implements distribu-tion and mediation functionalities (i.e. distribution andmediation for heterogeneous protocols, interaction style 3.The frameworkinterface type, information model) through the sameservice interfaces. 
It implements a broker service which As explained above, the typical biodiversity application supports extended interfaces for asynchronous query scenarios require modelling the impact of climate change ondistribution and caching. Experience with initial imple- species distribution. To build such models within a distributedmentations of the GEOSS architecture components has computing environment, heterogeneous data resources (e.g.demonstrated the importance of a brokering service in biodiversity, climatic and other environmental resources) andorder to facilitate discovery across the GEOSS federation. processing services (e.g. implementing ENM algorithms) mustThe mediation role applies to interoperability across interoperate seamlessly. We have developed and thoroughlycatalog services provided by the GEOSS Climate Change, tested a conceptual framework to permit interoperability Biodiversity and other environmental communities. testing for biodiversity applications. This framework also4. Model Provider: a component that runs ENM techniques on allows testing the GEOSS service architecture through theselected biodiversity and climatological datasets. It sup- development of relevant scenarios that draw on data andports a main operation to run the model by specifying the information exchange from a series of systems intercon-algorithm, the parameter values, and the datasets to be nected through SOA and by applying established standards.used.Fig. 2 The logical architecture of the framework. 5. E CO L O G I CA L IN F O R MA TI CS 4 ( 2 0 09 ) 233 3 27 5. Use Scenario Controller: a controller component that The catalog service is provided by the GI-Cat component thatimplements workflow within the business process typical accesses the data servers' metadata/indexing interfaces toof the biodiversity scenario described above. It is controlledperform queries (Nativi et al., 2007a; Bigagli et al., 2006). 
Itby the user through the GUI.implements and can be accessed through a standard OGC CS-W 6. Graphical User Interface (GUI): The component for userinterface (OGC, 2007a,b).interaction. It controls the workflow manager to perform The OpenModeller component implements the Modelthe required operations for implementing the biodiversity Provider. It is able to run ENM according to differentbasic scenario. algorithms and parameters. It exposes a proprietary SOAPinterface. Since it can work only on local files, it is necessary toThese components play the three typical roles of a SOAupload all required data locally. To avoid a double transfer where Consumers discover Providers through a Registry. In ouroperation we added a Data Uploader component. This exposes framework Data and Model providers are the Service Provi-a simple web interface that accepts a data description, ders; the GUI-Controller pair acts as a Consumer and the including all the information required for accessing data. Catalog plays the role of the Registry. Where necessary it alsoWhen a description is sent, the Data Uploader provides for the acts as a Broker between Consumer and Providers. This fourth retrieval of the data and for local storage. Thus the logical component is necessary for heterogeneous and federated interaction between the Controller and the Providers for data systems. access (see Fig. 2) is implemented with an indirect interactionThe previous logical architecture has been implementedthrough the Data Uploader. using a layered web architecture with a Service-OrientedThe Controller component implements the business pro- approach selecting or deploying specific data and modelcess of the use scenarios. 
According to the instruction providers, and introducing new components where required.provided by the user through the GUI, the Controller accessesThe functioning system includes multiple interactingthe Catalog and Model Provider for searching, evaluating and components and implements simple user interfaces (Fig. 3). choosing data, and for running models. The GBIF Portal Server and the Climatological Data Server are the data providers. Each of them has instances of interfaces for accessing metadata and data. The GBIF Portal Server4.Test scenario implements a REST-based interface to retrieve taxonomic information and species occurrences data through HTTP-GETA first demonstration dealt with the Canadian butterfly operations directed on specific resources addressed by properspecies (Amblyscirtes vialis) and its response to climate change. URLs. The Climatological Data Provider implements an OGC This demonstration was presented at the GEOSS IV Ministerial WCS interface (OGC, 2005) providing functionalities forSummit as part of the achievements of the GEOSS IP3 for the retrieving index and metadata (i.e. getCapabilities and descri-Biodiversity and Climate Change SBAs (Species Response to beCoverage) and data (i.e. getCoverage). Climate Change use scenarios) (Nativi et al., 2007b). Fig. 3 The framework main components. 6. 28E CO L O G I CA L IN F O RM A TI CS 4 ( 2 0 09 ) 233 3 Fig. 4 The framework deployment architecture.The deployment architecture realized for the demonstra- In the following paragraphs the main components are tion is formalized by the schema depicted in Fig. 4. The GBIF described in more detail highlighting the technological con- Portal Node is the instance of the GBIF Data Portal Server, while straints and choices. a NCAR climatological server node hosts the Climatological Data Server. 
A Catalog Server Node located at the CNR-IMAA4.1.Biodiversity Data Provider runs an instance of GI-Cat configured for returning CS-W responses according to the ISO profile (OGC, 2007b). AnotherBiodiversity occurrences are discovered and accessed through Node located at CNR-IMAA hosts the OpenModeller serverweb services published by the GBIF Data Portal5 and using instance and the Data Uploader components they must widely deployed biodiversity standards.6 reside on the same Node. The GBIF Data Portal provides unified access to overThe other interacting Node is the User Device which is 151 million primary species-occurrence records (both speci- typically a device capable of running a Web Browser and a Javamens and observations) from some 266 data providers around Virtual Machine, such as a desktop or laptop computer. In the the world, and covering a diverse range of taxa and ecosys- browser, it runs the Use Scenario application allowing the data tems (Hobern and Saarenmaa, 2005). A high proportion of uploading, the model description and running, and the datathese records are geo-referenced, and ongoing efforts in the visualization output. The search operations are performed data providing communities stress the necessity and value of using a Java-based client of GI-cat (called GI-go GeoBrowser4)providing an accurate geo-location for records. The GBIF for performances issues.virtual database represents a unique resource for Earth Observation studies which require ground-truthing data, 45 http://zeus.pin.unifi.it/joomla/index.php?option=com_content&- http://data.gbif.org/.6 task=view&id=12&Itemid=59. http://wiki.tdwg.org/twiki/bin/view/DarwinCore/WebHome. 7. E CO L O G I CA L IN F O R MA TI CS 4 ( 2 0 09 ) 233 329 Fig. 5 Results of the model demonstrated at the GEO Ministerial Meeting in Cape Town, South Africa November 2007. The model and its projections are the result of successful interoperation of all components of the system. 
A) The Amblyscirtes vialis distribution projected for the year 2000; B) The Amblyscirtes vialis distribution projected for the year 2050 under the IPCC climate change scenario. Light marks correspond to 100% of probability; gray marks to 50% of probability (photo by Erik Nielsen).whether historical (to study change over time) or contempor-3. Google Earth mapping service providing 1-degree cell ary. In addition to the web based interface which provides the density data or placement marks. user with three main routes into the data served by the GBIF4. Prototype OGC compliant Web Map Service. network a user can explore by species, by country or by dataset with options to download the data GBIF also exposesGBIF works closely with Biodiversity Information Stan- the data through several web services. These are described in dards (BIS) /TDWG,7 an international organisation that de- the following section.velops standards and protocols for sharing biodiversity data.The main components of the network contributed by GBIF Foremost amongst these, and deployed widely in the GBIF are:network are the following:1. Data providing nodes currently some 266 distributed1. Darwin Core8: a standard designed to facilitate thearound the world and growing. exchange of information about the geographic occurrence 2. A central registry of the data providing nodes imple- of species and the existence of specimens in collections. Itmented using UDDI.includes an extension mechanism to allow inclusion of 3. A central indexing and caching system of the data providedother information. Its geospatial extension is particularlyby the nodes. relevant for GEOSS applications. 4. A data portal front end providing unified access to all nodes2. ABCD Schema9: (Access to Biological Collection Data), moreon the network. comprehensive than Darwin Core, this is also designed to 5. Web services for programmatic access to data on the promote accessibility to biological collection data.network. 3. 
DiGIR10: Distributed Generic Information Retrieval, basedon HTTP, XML and UDDI, is a protocol designed for unifiedThe GBIF data portal provides a number of web services: access to distributed databases. 4. TAPIR11: (TDWG Access Protocol for Information Retrieval) 1. A registry of data providing nodes implemented usingis a newer HTTP/XML based protocol standard developedSOAP to UDDI. by BIS/TDWG for accessing structured data stored in 2. Several related REST style web services for data resourcesdistributed databases. It combines and extends the fea-within the GBIF network, including: tures of BioCASe (a protocol based on DiGIR and developed1. Taxon data web service: providing access to records of for the EU funded project BioCASE for use with ABCD taxon concepts.encoded data) and DiGIR to provide a more generic2. Occurrence record data web service: providing access toprotocol. records of the occurrence of organisms.7http://www.tdwg.org/.3. Occurrence density data web service: providing access to 8http://www.tdwg.org/activities/darwincore/. records showing the density of occurrence records by 9http://www.tdwg.org/activities/abcd/. one-degree cell.10http://digir.sourceforge.net/http://www.tdwg.org/activities/tapir/.4. Provider web service: providing access to records 11http://www.tdwg.org/dav/subgroups/tapir/1.0/docs/TAPIR describing the data providers.Specification_2008-09-18.html. 8. 30 E CO L O G I CA L IN F O RM A TI CS 4 ( 2 0 09 ) 233 3Fig. 6 Client application: user-interface.4.2.Climatological Data Provider oil and gas) availability, rapid pace and direction of techno-logical change favoring balanced development. Climatological data were obtained from the NCAR GIS portal12 A1B Scenario Run set is represented by the five ensemble which provides web access to free global datasets of climate members. Climate models are an imperfect representation of change scenarios. 
These data (spanning 50 years from 2000 to the earth's climate system and climate modellers employ a 2050) have been generated for the 4th Assessment Report of technique called ensembling to capture the range of possible the Intergovernmental Panel on Climate Change (IPCC) by theclimate states. A climate model run ensemble consists of two Community Climate System Model (CCSM) (IPCC, 2007). This or more climate model runs made with the exact same service can be discovered using the GEOSS Clearinghouseclimate model, using the exact same boundary forcings, (include URL?).where the only difference between the runs is the initial The portal provides several climate change scenarios, as conditions. provided by IPCC: a scenario is a description of a possible outlookThe datasets are processed to generate grid coverages at 1 for the future state of the world, not a forecast of the future. The resolution in the ESRI ARCGrid format and served through the constant 20th century forcing shows the least increase in future standard OGC WCS (Web Coverage Service) interface version surface temperature, the B1 and A1B scenarios displays moderate1.0 (OGC, 2005). Fig. 5 depicts the results obtained for a use case increases and the A2 scenario results in the largest response. dealing with the Canadian common roadside skipper butterfly The interoperability experiments mainly considered the (A. vialis). This use case was demonstrated at the GEO A1B scenario. The A1 storyline and scenario family describes a Ministerial Meeting in Cape Town, South Africa November future world of very rapid and successful economic develop-2007 (Nativi et al., 2007b). ment, low population growth, and the rapid introduction of new and more efficient technologies. 
4.3. Catalog service

GI-cat (Bigagli et al., 2004) is a distributed catalog providing a unique and consistent interface that enables the interrogation of biodiversity and climatological data resources. GI-cat exposes an OGC CS-W/ISO interface (OGC, 2007b) and is able to federate heterogeneous catalogs and access servers that implement international geospatial standards (e.g. OGC OWS). In addition, GI-cat implements a mediation server, making it possible to federate components that apply non-standard services (e.g. THREDDS/OPeNDAP servers) and GEOSS Special Arrangements for interoperability (e.g. GBIF). While waiting for the GEOSS Clearinghouse to provide an openly documented interface, an interoperability arrangement was introduced for the GBIF portal services. It consists of a formal mapping of the GBIF data model to the ISO 19115 core metadata profile, and of the adaptation of GI-cat to the GBIF service protocols.
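The mapping-plus-adaptation arrangement can be pictured with a small sketch. The following Python fragment is purely illustrative and is not GI-cat code: the input dictionary keys (name, description, occurrence_points), the IsoCoreRecord class and the function names are assumptions made for this example. It maps a GBIF-style resource description onto a few ISO 19115 core elements (title, abstract, topic category, geographic bounding box) and then answers a bounding-box query over the mapped records, as a broker might do once its sources have been harmonized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoCoreRecord:
    """A small subset of ISO 19115 core metadata elements (chosen for this sketch)."""
    title: str
    abstract: str
    topic_category: str
    west: float
    east: float
    south: float
    north: float


def gbif_to_iso_core(resource):
    """Map a hypothetical GBIF-style resource description to ISO core elements."""
    lons = [lon for lon, _lat in resource["occurrence_points"]]
    lats = [lat for _lon, lat in resource["occurrence_points"]]
    return IsoCoreRecord(
        title=resource["name"],
        abstract=resource.get("description", ""),
        topic_category="biota",  # ISO 19115 topic category code for flora and fauna
        west=min(lons), east=max(lons),
        south=min(lats), north=max(lats),
    )


def intersects(record, bbox):
    """True if the record's bounding box overlaps bbox = (west, south, east, north)."""
    west, south, east, north = bbox
    return not (record.east < west or record.west > east
                or record.north < south or record.south > north)


def federated_search(records, bbox):
    """Return the harmonized records whose extent overlaps the query box."""
    return [record for record in records if intersects(record, bbox)]


# Invented sample resource, used only to exercise the mapping.
sample = {
    "name": "Butterfly occurrences (demo)",
    "description": "Illustrative occurrence points",
    "occurrence_points": [(-100.0, 50.0), (-80.0, 45.0)],
}
mapped = gbif_to_iso_core(sample)
print(federated_search([mapped], (-90.0, 40.0, -70.0, 60.0)))
```

Keeping the mapping in one function per source is one possible design: each federated component then needs only its own adapter, while the query logic works on the common metadata model.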
4.4. Model service

OpenModeller¹³, an open source Ecological Niche Modelling (ENM) framework, was used as the component for processing collected data and generating future projections. It is currently being developed by the Centro de Referência em Informação Ambiental (CRIA), Escola Politécnica da USP (Poli), and Instituto Nacional de Pesquisas Espaciais (INPE) as an open-source initiative. It is developed as a stand-alone application (OpenModeller Desktop), but the modelling kernel is also accessible through specific modules implementing external interfaces like SOAP and SWIG (Simplified Wrapper and Interface Generator)¹⁴. In our demonstration we use the SOAP server module implemented as a CGI component of an Apache Web Server.

The proprietary OpenModeller SOAP interface implements operations for the basic modelling activities. The main ones are:

1. getLayers for viewing the available environmental layers;
2. getAlgorithms for viewing the available modelling algorithms;
3. createModel for creating a model based on selected environmental layers and the provided species occurrence data;
4. projectModel for projecting a pre-generated model according to selected environmental layers (e.g. climate model outputs).

The interfaces to the most time-demanding operations (createModel and projectModel) are implemented in an asynchronous way. Each operation call returns a ticket which can be used in a getProgress operation.

At the time of the demonstration implementation we needed to resolve some interoperability issues for integrating the OpenModeller SOAP Server in our framework:

- Environmental data access: OpenModeller was not able to access remotely located environmental data. Thus we added the Data Uploader to retrieve the required data and to store it in a proper local directory.
- Occurrence data access: OpenModeller required providing occurrence data in the createModel request message. We would like to have the same approach both for environmental data and biodiversity data. We solved this issue by uploading occurrence data in a Web folder using the Data Uploader. Then the Controller could access the required data and properly build the request message.
- Occurrence data format: OpenModeller required a specific format for occurrence data. For performance reasons the format translation is carried out by the Data Uploader during the upload.
- Model Output access: OpenModeller saves the model outputs in a local directory. To make them accessible we simply expose the directory through a Web Server.

Environmental and biodiversity data searching on a catalog service was implemented through the transparent interoperability with GI-cat.

¹³ http://openmodeller.sourceforge.net.
¹⁴ http://www.swig.org/.

4.5. The client application

To implement the use scenario business logic and the user–system interface we developed a client application running in a Web browser environment using AJAX¹⁵ technologies. With this tool, the user is guided through the process of: 1) discovering the data (by submitting queries to GI-cat) and accessing selected data through the GBIF and WCS/NCAR data servers; 2) creating the model; 3) running ENM projections; 4) showing results.

The user interface reflects the typical use scenario workflow (see Fig. 6). A different tab is dedicated to each of the four main operations: Data Search and Access, Model Creation, Model Projection and Output View. A fifth tab is used for debugging. Inside each tab the respective sub-operations are available through accordion menus whose content is dynamically updated. The user interface is implemented in JavaScript, XHTML and XSLT, using the jQuery library for graphical effects and GUI widgets.

The application implementing the required business logic is also implemented in JavaScript. A simple SOAP client library has been developed to interact with the OpenModeller Server. The client application provides functions that implement: the access to the required services, the building of request messages, the presentation of response messages, and the interaction with the user.

¹⁵ http://www.w3schools.com/ajax/default.asp.
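The asynchronous interaction described in Section 4.4, where a call returns a ticket that is then passed to getProgress, follows a common polling pattern. The sketch below imitates it in Python with an in-memory stand-in for the server. It does not call the real OpenModeller SOAP interface: the class FakeModelServer, its method signatures and the percentage-style progress value are assumptions made for illustration, and only the operation names createModel and getProgress come from the text.

```python
import itertools
import time


class FakeModelServer:
    """In-memory stand-in for the modelling server (not the real SOAP API)."""

    def __init__(self, steps_per_job=3):
        self._ticket_numbers = itertools.count(1)
        self._steps_done = {}
        self._steps_per_job = steps_per_job

    def create_model(self, layers, occurrences):
        """Start a (simulated) long-running job and return its ticket at once."""
        ticket = f"ticket-{next(self._ticket_numbers)}"
        self._steps_done[ticket] = 0
        return ticket

    def get_progress(self, ticket):
        """Return progress in percent; each poll advances the simulated job."""
        done = min(self._steps_done[ticket] + 1, self._steps_per_job)
        self._steps_done[ticket] = done
        return round(100 * done / self._steps_per_job)


def wait_for(server, ticket, poll_interval=0.0, max_polls=100):
    """Poll get_progress until the job reports 100 percent or polls run out."""
    for _ in range(max_polls):
        progress = server.get_progress(ticket)
        if progress >= 100:
            return progress
        time.sleep(poll_interval)
    raise TimeoutError(f"{ticket} did not finish after {max_polls} polls")


server = FakeModelServer(steps_per_job=3)
ticket = server.create_model(["hypothetical_layer"], [(-100.0, 50.0)])
print(ticket, wait_for(server, ticket))
```

Returning a ticket immediately keeps the browser client responsive during long model runs; the bound on the number of polls is a design choice of this sketch, so that a stalled job surfaces as an error instead of an endless loop.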
5. Conclusions

In this paper, we have described how the distributed components needed for research on the biodiversity consequences of global climate change can be linked. An informatics framework was presented and discussed. This framework was successfully demonstrated at the GEO IV Ministerial Meeting in Cape Town, South Africa, November 2007, as part of the GEOSS IP3 task.

The framework described in this paper is the first to make ENM available to any user with a web browser and through web services. It is an example of an electronic scratch-book for data analysis, automating the steps of the workflow. Such capabilities will be needed from the GEOSS Portal in future. The framework presents valuable innovations such as: an OpenModeller service online with an AJAX client, the OpenModeller environmental and biodiversity data searching integrated in a transparent way through the interoperability with a standard catalog service (i.e. the CS-W implemented by GI-cat), and the mapping of GBIF standard metadata to the ISO 19115 core profile (the metadata model applied by GI-cat).

The tests of this framework demonstrated the need for international standards to support interoperability and the effectiveness of establishing Special Arrangements for interoperability where these standards are not fully supported or extended. This is important to establish crosswalks among heterogeneous information communities. Distributed and mediation catalog services, implementing a broker approach, proved to be a good solution to managing the complexity in multi-disciplinary or federated systems, like GEOSS.

The framework helped validate GEOSS' initial infrastructure, contributing and linking systems. GBIF and IPCC components were registered in the architecture registers. GBIF REST-based services were submitted as Special Arrangements to the GEOSS Standard and Interoperability Forum (SIF) (Khalsa et al., 2007a,b). Other legacy interfaces, characterizing the resource provider components, were assessed and should be reconsidered on the basis of international standards.

The IP3 Mediator, based on the GI-cat technology, will become a component of the GEOSS services architecture. This service is able to query and access GBIF data through a standard OGC CS-W interface; queries are allowed by area, time interval, taxa, data sources, and free text keywords. Another important lesson learned was the need to include modelling tools in the resources managed by GEOSS.

The framework components described here do not yet make use of the GEO Portals, as they were not available at the time when this work was done (the 1st half of 2007). This interoperability topic will be developed in the near future. In fact, the IP3 framework will be extended and its multi-disciplinary capabilities will be strengthened, demonstrating the impact of local Climate Change on Biodiversity (2008–2009).

In our opinion, this pilot framework and its successful implementation demonstrate the importance of GEOSS for efficient macroecological research.

Acknowledgment

We thank Siri Jodha Khalsa, leader of the IP3 initiative, for his contributions to this work.

REFERENCES

Bigagli, L., Nativi, S., Mazzetti, P., Villoresi, G., 2004. GI-Cat: a web service for dataset cataloguing based on ISO 19115. Proc. of 1st International Workshop on Geographic Information Management (GIM'04), 15th International Workshop on Database and Expert Systems Applications (DEXA'04). IEEE Computer Society Press. ISBN: 0-7695-2195-9, pp. 846–850.
Bigagli, L., Nativi, S., Mazzetti, P., 2006. Mediation to deal with information heterogeneity – application to Earth System Science. European Geosciences Union, Advances in Geosciences, vol. 8, pp. 3–9. SRef-ID: 1680-7359/adgeo/2006-8-3.
Buckley, L.B., Roughgarden, J., 2004. Biodiversity conservation: effects of changes in climate and land use. Nature 460, 2 p following 33.
Canhos, V.P., Souza, R., de Giovanni, R., Canhos, D.A.L., 2004. Global biodiversity informatics: setting the scene for a New World of ecological modelling. Biodiversity Informatics 1, 1–13.
Elith, J., Graham, C.H., NCEAS Species Distribution Modelling Group, 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29, 129–151.
GEO, 2005. In: Battrick, Bruce (Ed.), Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan. ESA Publications Division, The Netherlands. ISBN: 92-9092-495-0. ISSN No.: 0250-1589.
GEO, 2007–2009. Work plan. Towards convergence. 30 pp. Group on Earth Observations, Geneva.
Harte, J., Ostling, A., Green, J.L., Kinzig, A., 2004. Biodiversity conservation: climate change and extinction risk. Nature 430, 2 p following 33.
Hobern, D., Saarenmaa, H., 2005. GBIF data portal strategy. 40 pp. GBIF Secretariat. http://circa.gbif.net/Public/irc/gbif/dadi/library?l=/architecture/portal_strategy_1/.
IPCC, 2007. Summary for policymakers. In: Climate Change 2007: The Physical Science Basis. In: Solomon, S., Qin, D., Manning, M., Chen, Z., Marquis, M., Averyt, K.B., Tignor, M., Miller, H.L. (Eds.), Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.
Kerr, J.T., Kharouba, H., Currie, D.J., 2007. The macroecological contribution to global change solutions. Science 316, 1581–1584.
Kerr, J.T., Ostrovsky, M., 2003. From space to species: ecological applications for remote sensing. Trends in Ecology and Evolution 18, 299–305.
Khalsa, S.J., Nativi, S., Shibasaki, R., Ahern, T., Rainer, J.M., 2007a. The GEOSS Interoperability Process Pilot Project, EGU proceedings, Vienna (Austria), 15–20 April 2007.
Khalsa, S.J., Nativi, S., Shibasaki, R., Ahern, T., Thomas, D., 2007b. The GEOSS Interoperability Process Pilot Project, IGARSS '07, Barcelona (Spain), July 2007.
Khalsa, S.J., Nativi, S., Geller, G., submitted for publication. The GEOSS Interoperability Process Pilot Project (IP3). Submitted to IEEE TGARS Special Issue on Data Archiving and Distribution.
Kharouba, H.M., Algar, A., Kerr, J.T., in press. Historically calibrated predictions of butterfly species' range shift using global change as a pseudo-experiment. Ecology.
Nativi, S., Bigagli, L., Mazzetti, P., Mattia, U., Boldrini, E., 2007a. Discovery, query and access services for Imagery Gridded and Coverage Data: a clearinghouse solution. IGARSS '07, Barcelona (Spain), July 2007.
Nativi, S., Mazzetti, P., Saarenmaa, H., Kerr, J., Kharouba, H., Ó Tuama, É., Singh Khalsa, S.J., 2007b. Predicting the impact of climate change on biodiversity – a GEOSS scenario. GEO Ministerial IV Plenary, Cape Town, 29–30 November 2007. The Full Picture 2007 GEO Book, 262–264. Tudor Rose, Leicester, UK.
OGC, 2005. OpenGIS Web Coverage Service (WCS) Implementation Specification, Ver. 1.0 (Corrigendum) (1.0.0), OGC 2005 document N. 05-076.
OGC, 2007a. OpenGIS Catalog Services Specification, Ver. 2.0.2, OGC 2007 document N. 07-006R1.
OGC, 2007b. Catalogue Services Specification 2.0.2 – ISO Metadata Application Profile, Ver. 1.0.0, OGC 2007 document N. 07-045.
Peterson, A.T., Sanchez-Cordero, V., Soberon, J., Bartley, J., Buddemeier, R.H., Navarro-Siguenza, A.G., 2001. Effects of global climate change on geographic distributions of Mexican Cracidae. Ecological Modelling 144, 21–30. www.specifysoftware.org/Informatics/bios/biostownpeterson/Petal_EM_2001.pdf.
Peterson, A.T., Ortega-Huerta, M.A., Bartley, J., Sanchez-Cordero, V., Soberon, J., Buddemeier, R.H., Stockwell, D.R.B., 2002. Future projections for Mexican faunas under global climate change scenarios. Nature 416, 626–629. www.specifysoftware.org/Informatics/bios/biostownpeterson/Petal_N_2002.pdf.
Santana, F.S., Siqueira, M.F., Saraiva, A.M., Correa, P.L.P., 2008. A reference business process for ecological niche modelling. Ecological Informatics 3 (1), 75–86.
Thomas, C.D., Cameron, A., Green, R.E., Bakkenes, M., Beaumont, L.J., Collingham, Y.C., Erasmus, B.F.N., Ferreira de Siqueira, M., Grainger, A., Hannah, L., Hughes, L., Huntley, B., van Jaarsveld, A.S., Midgley, G.F., Miles, L., Ortega-Huerta, M.A., Peterson, A.T., Phillips, O.L., Williams, S.E., 2004. Extinction risk from climate change. Nature 427, 145–148 (8 January).
White, P.J., Kerr, J.T., 2007. Human impacts on environment–diversity relationships: evidence for biotic homogenization from butterfly species richness patterns. Global Ecology and Biogeography 16, 290–299.