An Australian Geoscience Data Cube Aaron Sedgmen Geoscience Australia

Post on 29-Mar-2015

Page 1: An Australian Geoscience Data Cube Aaron Sedgmen Geoscience Australia

An Australian Geoscience Data Cube

Aaron Sedgmen

Geoscience Australia

Page 2

Overview

• Organisational background

• Data cube concept

• Geoscience Australia’s data cube implementation

• The shift from traditional methods of managing EO data

• Example applications of the data cube

• Where to with the data cube

Page 3

Organisational Background

Geoscience Australia – a government agency providing advice and information to the Australian Government and geoscientific information to industry and other stakeholders.

National Earth Observations Group – provides earth observation products and services, as well as expert advice and information for decision makers.

Page 4

The Space-Time Data Cube: a new paradigm for managing and using environmental data

Page 5

The Data Cube concept

[Figure: data cube diagram with dimension labels 191610 (x), 575 (t) x 7 (λ)]

Page 6

“Cubing” Landsat images

[Figure: Landsat images are diced into tile squares and then stacked; axes: space, time]
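
The dice-and-stack idea can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not GA's ingestion code: the scene size, the tile size and the function names `dice` and `stack` are hypothetical, and the scenes are assumed to be already co-registered on a common grid.

```python
import numpy as np

def dice(scene, tile_size):
    """Cut a 2-D scene into square tiles keyed by (row, col) tile index."""
    rows, cols = scene.shape
    tiles = {}
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            key = (r // tile_size, c // tile_size)
            tiles[key] = scene[r:r + tile_size, c:c + tile_size]
    return tiles

def stack(scenes, tile_size):
    """Dice each scene, then stack tiles sharing an index along a time axis."""
    diced = [dice(s, tile_size) for s in scenes]
    return {key: np.stack([d[key] for d in diced], axis=0) for key in diced[0]}

# Three synthetic 4x4 "acquisitions" of the same area (hypothetical data).
scenes = [np.full((4, 4), t, dtype=np.int16) for t in range(3)]
cube = stack(scenes, tile_size=2)
print(sorted(cube))        # tile indices
print(cube[(0, 0)].shape)  # (time, y, x)
```

Each dictionary entry is then a small space-time stack for one tile, which is the unit a time-series analysis would read.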

Page 7

GA’s data cube implementation

• GA developed a working data cube prototype in early 2013 to undertake time-series analysis of Landsat data

• Contains fifteen years (1998-2012) of the Landsat 5 & 7 archive covering the Australian land mass

• 3,960,528 tiles sourced from a total of 550,537 Level 1T, ARG25, Pixel Quality & some Fractional Cover datasets

• 110 TB of compressed GeoTIFF files

• Access to the cube is via a Python API that enables generation of mosaiced time slices, and temporal stacks of derived quantities

• Users can apply their own algorithms via the API for generating derived quantities
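
The slides do not show the API itself, so the following is only a sketch of the pattern the bullets describe: a user-supplied function is applied to each time slice to build a temporal stack of a derived quantity. The names `derive_stack` and `ndvi`, the band order and the synthetic array are all assumptions made for illustration and are not part of GA's API.

```python
import numpy as np

def ndvi(bands):
    """Example user algorithm: NDVI from a (band, y, x) array.

    Assumes band 0 is red and band 1 is near-infrared (hypothetical order).
    """
    red = bands[0].astype(np.float64)
    nir = bands[1].astype(np.float64)
    total = nir + red
    # Leave the result at zero where both bands are zero (avoids 0/0).
    return np.divide(nir - red, total, out=np.zeros_like(total), where=total != 0)

def derive_stack(tile_stack, algorithm):
    """Apply a per-time-slice algorithm to a (time, band, y, x) stack."""
    return np.stack([algorithm(time_slice) for time_slice in tile_stack], axis=0)

# Synthetic stack: 5 time steps, 2 bands, 3x3 pixels (hypothetical data).
rng = np.random.default_rng(0)
tile_stack = rng.integers(0, 1000, size=(5, 2, 3, 3))
ndvi_stack = derive_stack(tile_stack, ndvi)
print(ndvi_stack.shape)                     # (time, y, x)
print(np.median(ndvi_stack, axis=0).shape)  # per-pixel temporal median
```

Passing the algorithm as a plain function keeps the stacking logic independent of whatever quantity a user wants to derive.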

Page 8

Hosting of the data cube at NCI

• The GA data cube is hosted on the National Computational Infrastructure (NCI), located at the Australian National University in Canberra.

• The Raijin supercomputer at the NCI is currently ranked around 27th in the world, based on the following specifications:

    • 57,472 cores

    • 160 Tbytes memory

    • 10 Pbytes spinning disk

    • 1.2 Pflops computer performance

• The storage and processing power available at NCI is a critical enabler for the data cube

Page 9

GA’s Traditional EO product process

EO products have traditionally been produced on demand for areas of interest from tape archives of scene-based raw data.

• Client requests product

• Identify footprint of product in space or time

• Search catalogue, order scenes

• 1 Petabyte hierarchical archive: millions of individual scenes; tape store accessed by robot

• Orthorectification, calibration, cloud masking, atmospheric correction, mosaicing

• Feature extraction, algorithm application, spectral unmixing

• Product packaging and delivery

Page 10

A paradigm shift from traditional methods

• The data cube holds multiple Landsat products for the entire archive – removes the need to generate products at time of request

• Hosting the data cube at NCI co-locates “big data” with high performance computing – enables in-situ analysis of the whole archive

• Computational analysis is moved from the scientist’s local environment to a central HPC facility

• Removes the need to download and replicate the data

• Provides computing power not otherwise available to many scientists

• Opens up possibilities to integrate the Landsat archive with other “big data” datasets hosted at the HPC facility

Page 11

Surface water

Menindee Lakes time series

1998-2012

Total observations per grid cell ~600-1200

4000*4000 grid cells
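
One way to summarise a per-cell time series like this is to count, for each grid cell, how often water was observed among the valid observations. The sketch below shows that calculation on synthetic data. The boolean `wet` and `valid` arrays stand in for the output of a water classifier and a pixel-quality mask, neither of which is described in the slides, and the array sizes are scaled down for illustration.

```python
import numpy as np

def water_frequency(wet, valid):
    """Fraction of valid observations flagged as water, per grid cell.

    wet, valid: boolean arrays of shape (time, y, x).
    Cells with no valid observations are returned as NaN.
    """
    wet_count = np.sum(wet & valid, axis=0).astype(np.float64)
    valid_count = np.sum(valid, axis=0).astype(np.float64)
    freq = np.full(valid_count.shape, np.nan)
    np.divide(wet_count, valid_count, out=freq, where=valid_count > 0)
    return freq

# Synthetic example: 4 observations of a 1x3 strip of grid cells.
wet = np.array([[[1, 0, 0]], [[1, 1, 0]], [[1, 0, 0]], [[1, 1, 0]]], dtype=bool)
valid = np.array([[[1, 1, 0]], [[1, 1, 0]], [[1, 0, 0]], [[1, 1, 0]]], dtype=bool)
freq = water_frequency(wet, valid)
print(freq)
```

Normalising by the number of valid observations matters here because the count per grid cell varies widely (roughly 600 to 1200 in the Menindee Lakes example).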

Page 12

Continental-Scale surface water results

Time series analysis of entire 15 yr archive of ARG25 data at 25 m resolution.

~2 days processing time (pre Raijin HPC facility)

Page 13

What the GA data cube is not (yet)

• A publicly available production system

    • Still a working prototype being used for internal environmental science projects

• A real-time delivery system for time-series data serving large numbers of concurrent users (i.e. a web-delivery system)

    • A number of OGC specifications, including CF-netCDF, Web Coverage Service (WCS), Web Processing Service (WPS) and Web Coverage Processing Service (WCPS), are being investigated for enabling this capability.

• Yet another system for delivering “pretty pictures” (a la GeoServer or Google Earth Engine)

    • The data cube environment is optimised for scientific analysis. The delivery of portrayal data (e.g. map images via WMS) is best served by systems optimised for data distribution.

Page 14

Acknowledgements

• Dr Stuart Minchin – Chief, Environmental Geoscience Division, Geoscience Australia

• Alex Ip – Senior Developer, eResearch Infrastructure, Geoscience Australia

Page 15

Phone: +61 2 6249 9576

Web: www.ga.gov.au

Email: [email protected]

Address: Cnr Jerrabomberra Avenue and Hindmarsh Drive, Symonston ACT 2609

Postal Address: GPO Box 378, Canberra ACT 2601

Questions?

Thank you