CyberGIS Toolkit: A Software Toolbox Built for Scalable cyberGIS Spatial Analysis and Modeling Yan Liu 1,2 , Michael Finn 4 , Hao Hu 1 , Jay Laura 3 , David Mattli 4 , Anand Padmanabhan 1,2 , Serge Rey 3 , Eric Shook 5 , Kornelijus Survila 1 , and Shaowen Wang 1,2 1 CyberInfrastructure and Geospatial Information Laboratory (CIGI) 2 National Center for Supercomputing Applications (NCSA) University of Illinois at Urbana-Champaign 3 GeoDa Center for Spatial Analysis and Computation Arizona State University 4 Center of Excellence for Geospatial Information Science U.S. Geological Survey 5 Department of Geography Kent State University CyberGIS All-Hands Meeting 2013 Seattle, WA., September 15, 2013


CyberGIS Toolkit: A Software Toolbox Built for Scalable cyberGIS Spatial Analysis and Modeling

Yan Liu 1,2, Michael Finn 4, Hao Hu 1, Jay Laura 3, David Mattli 4, Anand Padmanabhan 1,2, Serge Rey 3, Eric Shook 5, Kornelijus Survila 1, and Shaowen Wang 1,2

1 CyberInfrastructure and Geospatial Information Laboratory (CIGI)

2 National Center for Supercomputing Applications (NCSA)

University of Illinois at Urbana-Champaign

3 GeoDa Center for Spatial Analysis and Computation

Arizona State University

4 Center of Excellence for Geospatial Information Science

U.S. Geological Survey

5 Department of Geography

Kent State University

CyberGIS All-Hands Meeting 2013

Seattle, WA., September 15, 2013


Outline

• Purposes
• Software integration approach
  o Software component selection
  o Software engineering
  o Scalability analysis
• Progress update
  o CyberGIS Toolkit 0.5-alpha release
  o Continuous integration framework
  o Deployment on advanced cyberinfrastructure
• Case study: Parallel PySAL (pPySAL)
  o Parallel PySAL project
  o Illustration of parallelization strategies
• Future work and concluding discussion


Six Major Goals of the NSF CyberGIS Project

1. Engage multidisciplinary communities through a participatory approach to evolving CyberGIS software requirements;

2. Integrate and sustain a core set of composable, interoperable, manageable, and reusable CyberGIS software elements based on community-driven and open source strategies;

3. Empower high-performance and scalable CyberGIS by exploiting spatial characteristics of data and analytical operations for achieving unprecedented capabilities for geospatial scientific discoveries;

4. Enhance an online geospatial problem solving environment to allow for the contribution, sharing and learning of CyberGIS software by numerous users, which will foster the development of crosscutting education, outreach and training programs with significant broad impacts;

5. Deploy and test CyberGIS software by linking with national and international cyberinfrastructure to achieve scalability to significant sizes of geospatial problems, amounts of cyberinfrastructure resources, and number of users; and

6. Evaluate and improve the CyberGIS framework through domain science applications and vibrant partnerships to gain better understanding of the complexity of coupled human-natural systems.

Slide callouts: Software as deliverables, Building blocks, Computational solutions


CyberGIS Software Environment

Diagram components: CyberGIS Gateway, CyberGIS Toolkit, GISolve Middleware


“CyberGIS Toolkit, a deep approach to CyberGIS software integration research and development, is focused on developing and leveraging innovative computational strategies needed to solve significant geospatial scientific problems by exploiting high-end cyberinfrastructure (CI) resources.”

-- NSF CyberGIS Project, 3rd Year Annual Report (2013)


Objectives

• Identify and integrate a set of loosely coupled scalable geospatial software components into the CyberGIS Toolkit
• Establish and sustain the CyberGIS Toolkit as a reliable software toolbox through an open and rigorous software building, testing, packaging, and deployment framework
• Capture computational and spatial characteristics of a software element focusing on computational performance, scalability, and portability in various CI environments
  o XSEDE (Extreme Science and Engineering Discovery Environment, http://xsede.org)
  o Open Science Grid
  o Extreme-scale supercomputers
• Provide a software environment for computational and data scientists to easily configure and use CyberGIS Toolkit components


Software Integration Approach

• Software component selection
  o Open source strategy
  o Community-driven component identification
    • Benefits to related science areas
• Software engineering
  o Complexity in software integration
  o A holistic framework for streamlined component integration
• CI-based scalability evaluation and enhancement
  o Computational intensity analysis – theoretical approach
  o Performance analysis and profiling – experimental approach


Open Source Strategy

• Components
  o Each component must be open sourced
  o No restriction on any particular open source license
• Toolkit
  o Improves the accessibility to integrated software capabilities through the establishment of a software toolbox
  o Focuses on the robustness, portability, compatibility, and scalability of each component within advanced CI environments


Component Selection Example: pRasterBlaster

• Need for scalable map reprojection in cyberGIS analytics
  o Spatial analysis and modeling
    • Distance calculation on raster cells requires appropriate projection
  o Visualization
    • Reprojection for faster visualization on Web Mercator base maps
• pRasterBlaster integration in CyberGIS Toolkit and Gateway
  o Software componentization: librasterblaster, pRasterBlaster, MapIMG
  o Build, test, and documentation
  o Gateway user interface
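
To make the Web Mercator target mentioned above concrete, here is a minimal sketch of the standard spherical Web Mercator (EPSG:3857) forward formula for a single point. It is not taken from pRasterBlaster or librasterblaster; the function name `lonlat_to_web_mercator` and the clamping constant are assumptions chosen for illustration. A raster reprojection tool applies a transform like this (or its inverse) per cell, which is why the work parallelizes well.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS84 semi-major axis).
EARTH_RADIUS_M = 6378137.0
# Assumed clamp: Web Mercator is conventionally cut off near +/-85.05 degrees,
# because the projection diverges toward the poles.
MAX_LAT_DEG = 85.051129


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project a longitude/latitude pair (degrees) to Web Mercator metres."""
    # Clamp latitude so the logarithm below stays finite.
    lat_deg = max(-MAX_LAT_DEG, min(MAX_LAT_DEG, lat_deg))
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0)
    )
    return x, y


if __name__ == "__main__":
    # Seattle, WA (approximate coordinates, for illustration only).
    print(lonlat_to_web_mercator(-122.33, 47.61))
```

Because each output cell depends only on its own coordinates, a raster can be split into independent row blocks and projected by separate processes.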


pRasterBlaster Component View

librasterblaster pRasterBlaster MapIMG

Cyberinfrastructure Service Providers Developers End Users


Integration Challenges

• Diversity and heterogeneity of software components
  o Programming languages
  o Application vs. library
  o Dependent libraries
  o Code availability
  o Programming models
  o Cyberinfrastructure resources
• Multiple levels of integration
  o Desktop level
  o Heterogeneous software environments
  o Cyberinfrastructure
• Software distribution
  o CI deployment
  o Packaging for broader distribution


Strategies

• Build and test
  o Establish a streamlined process to build and test software codes
• Computational intensity analysis
  o Computational bottleneck evaluation
  o Scalability analysis
  o Generalize solutions as computational and spatial knowledge
• Packaging and distribution
  o Package software for download and build in user computing environments
  o Provide common distribution packages (Debian, RPM, Windows installer, etc.)
  o Establish CyberGIS Toolkit as a configurable software module that can be loaded/unloaded on supercomputers
• Documentation and training
  o Build a CyberGIS Toolkit web site to host the software suite, user guide, development documentation, education materials, user feedback, and forums
  o Develop online training materials


Continuous Integration Framework

Figure labels: Developer-level Testing Support; NMI-based Continuous Integration; Scalability Analysis and Enhancement; Continuous Integration Management Service. Example tools shown: Travis CI, Jenkins.

NMI: NSF Middleware Initiative (http://batlab.org)


Component Integration Process

• Selection
• Parallelization
• Development of test cases
• Development of build and test plans
• NMI regular build and test
• Scalability analysis on CI
  o Identification of potential computational bottlenecks
  o Scalability to the number of processors
  o Scalability to problem size
• Release in the CyberGIS Toolkit
• Accessibility in CyberGIS
  o Cyberinfrastructure deployment for high-end users
  o Lowering access barriers for community users through Gateway and GISolve
  o Incorporation in advanced application workflow
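
The two scalability questions in the process above (scaling with processor count, scaling with problem size) are commonly summarized as strong-scaling speedup and weak-scaling efficiency. The sketch below shows those textbook metrics only; it is not the toolkit's own analysis code, and the function names and the timing numbers in the usage section are made up for illustration.

```python
def strong_scaling(timings):
    """Fixed problem size: relative speedup and efficiency per processor count.

    timings maps processor count -> wall-clock seconds. The smallest
    processor count present is used as the baseline.
    Returns a list of (processors, speedup, efficiency) tuples.
    """
    base_p = min(timings)
    base_t = timings[base_p]
    rows = []
    for p in sorted(timings):
        speedup = base_t / timings[p]
        # Ideal speedup relative to the baseline is p / base_p.
        efficiency = speedup * base_p / p
        rows.append((p, speedup, efficiency))
    return rows


def weak_scaling_efficiency(timings):
    """Problem size grows in proportion to processor count.

    Ideal behaviour is constant run time, so efficiency is simply
    baseline time divided by observed time.
    """
    base_t = timings[min(timings)]
    return {p: base_t / timings[p] for p in sorted(timings)}


if __name__ == "__main__":
    # Hypothetical timings in seconds, not measurements from the toolkit.
    for p, s, e in strong_scaling({1: 100.0, 2: 50.0, 4: 40.0}):
        print(f"{p:>4} procs  speedup {s:5.2f}  efficiency {e:5.2f}")
```

A drop in efficiency as processors are added is the usual first hint of a bottleneck (communication, I/O, or a serial section) worth profiling.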


Demo

• CyberGIS Toolkit 0.5-alpha release

o http://cybergis.cigi.uiuc.edu/cyberGISwiki/doku.php/ct


Software Components

• Parallel Agent-Based Modeling – PABM

o HPC models: MPI, parallel I/O

o Contributor: UIUC team

• Parallel PySAL

o Parallel Python implementation

o Contributor: ASU team

• Parallel map reprojection – pRasterBlaster

o HPC models: MPI, parallel I/O

o Contributor: High-performance mapping group, CEGIS, USGS

• SpatialText

o Full-text geocoding of massive social media and text data

o Contributor: Kalev Leetaru and UIUC team


Dependency Graph – CyberGIS Toolkit 0.5-alpha

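Figure labels (edges not recoverable from the transcript): toolkit components PABM, pRasterBlaster, SpatialText, pPySAL; dependencies MPI, IPM, PAPI, Lustre, GDAL, GEOS, Proj4, PySAL, Perl, Numpy, Scipy, Python.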
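
A dependency graph like this one determines the order in which components must be built. The sketch below derives a valid build order with Python's standard-library `graphlib` (Python 3.9+). The edges are assumptions for illustration only, since the arrows of the original figure did not survive in this transcript; `DEPENDS_ON` and `build_order` are hypothetical names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# ASSUMED edges, for illustration only: the transcript lists the nodes of the
# dependency graph but not its arrows. Each key depends on the items in its set.
DEPENDS_ON = {
    "pPySAL": {"PySAL"},
    "PySAL": {"NumPy/SciPy"},
    "NumPy/SciPy": {"Python"},
    "pRasterBlaster": {"MPI", "GDAL"},
    "GDAL": {"GEOS", "Proj4"},
    "PABM": {"MPI"},
}


def build_order(depends_on):
    """Return the nodes so that every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, name in enumerate(build_order(DEPENDS_ON), start=1):
        print(step, name)
```

`TopologicalSorter` raises `graphlib.CycleError` if the graph contains a circular dependency, which is a useful early check in a build plan.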


Toolkit Deployment on XSEDE

• XSEDE resources
  o Stampede@TACC: 10 petaflops; cluster computing, multi-threaded, GPU computing
  o Lonestar@TACC: 0.3 petaflops; cluster computing, multi-threaded, large memory
  o Trestles@SDSC: 0.1 petaflops; cluster computing, data I/O intensive computing
  o Gordon@SDSC: 0.34 petaflops; cluster computing, data I/O intensive computing, large memory
  o Blacklight@PSC: 0.037 petaflops; shared memory
  o Keeneland@NICS: cluster computing, GPU computing
• Compiling and installation
  o Compilers: Intel, GNU, PGI
  o MPI: Open MPI, mvapich2, mpich2, PGI, Intel MPI
• Software environment configuration
  o Environment Modules
  o Adaptive software module loading/unloading
  o Demo


Case Study: Parallel PySAL

Serge Rey, Jay Laura


Concluding Discussion

• The first set of selected software components has been integrated into the CyberGIS Toolkit 0.5-alpha
• A prototype integration framework has been established to allow sophisticated multi-level integration tests
• Scalability analysis has been effective in identifying computational bottlenecks and improving the performance of several components
• CyberGIS Toolkit has been deployed on XSEDE for community access
  o Through GISolve Open Service API
• Community evaluation and feedback are critical for building high-quality and scientifically sound software


Acknowledgements

• NSF Software Infrastructure for Sustained Innovation (SI2) Program

• This material is based in part upon work supported by NSF under Grant Number OCI-1047916. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


We need your feedback!

Contact: [email protected]

Thanks!
