The EarthServer initiative: towards Agile Big Data Services

EarthServer :: GEOSS 2012, Bonn :: ©2012 Peter Baumann The EarthServer initiative: towards Agile Big Data Services 2nd GEOSS Science and Technology Stakeholder Workshop Bonn, Germany, 2012-aug-29 Peter Baumann Jacobs University | rasdaman GmbH Bremen, Germany [email protected]



TRANSCRIPT

Page 1: The EarthServer initiative:  towards Agile Big Data Services


The EarthServer initiative: towards Agile Big Data Services

2nd GEOSS Science and Technology Stakeholder Workshop
Bonn, Germany, 2012-aug-29

Peter Baumann

Jacobs University | rasdaman GmbH
Bremen, Germany

[email protected]

Page 2: The EarthServer initiative:  towards Agile Big Data Services


About the Presenter

Professor of CS, Jacobs University
• Head, Large-Scale Scientific Information Systems research group

Main outcome so far: rasdaman
• first "Big Raster Data Analytics" server

Standardization
• OGC: chair of raster-relevant working groups, editor of 12+ standards & candidate standards
• ISO: working on Raster ("Array") SQL
• INSPIRE: invited expert for coverages

www.jacobs-university.de/lsis, www.rasdaman.org

Page 3: The EarthServer initiative:  towards Agile Big Data Services


Roadmap

OGC standards

rasdaman

EarthServer

EarthServer & GEOSS

Conclusions

Page 4: The EarthServer initiative:  towards Agile Big Data Services


Feature and Coverage Data Standards

Core element in OGC: geographic feature
• = abstraction of a real world phenomenon
• associated with a location relative to Earth

Special kind of feature: coverage
• = space-time varying multi-dimensional phenomenon
• Typical representative: raster image
• ...but there is more!

Typically, coverages are Big Data

Page 5: The EarthServer initiative:  towards Agile Big Data Services


Coverage Types

[Diagram: coverage type hierarchy rooted at «FeatureType» AbstractCoverage. Types shown: DiscreteCoverage, ContinuousCoverage, MultiPointCoverage, MultiCurveCoverage, MultiSurfaceCoverage, MultiSolidCoverage, GridCoverage, RectifiedGridCoverage, ReferenceableGridCoverage. Annotations: "as per GML 3.2.1"; "all n-D"; "New subtypes possible".]

Page 6: The EarthServer initiative:  towards Agile Big Data Services


Coverage Encoding

Pure GML: complete coverage represented by GML

Special Format: other suitable file format (ex: MIME type “image/tiff”)

Multipart-Mixed: multipart MIME, type “multipart/mixed”

[Diagrams: components carried by each container]
• GML Coverage: Domain set, Range type, Range set, App Metadata
• GML Coverage with xlink to a NetCDF file: Domain set, Range type, xlink, App Metadata
• NetCDF: Domain set, Range type, Range set, App Metadata
• GeoTIFF: Range type, Range set

Page 7: The EarthServer initiative:  towards Agile Big Data Services


Core OGC Service Standards

[Diagram: OGC service standards grouped by the kind of content they serve]
• feature data: WFS, WFS-T, FE
• coverage data: WCS, WCS-T, WCPS
• images: WMS
• metadata: CS-W, CS-T, CQL

• WMS "portrays spatial data": pictures
• WCS "provides data + descriptions; data with original semantics, may be interpreted, extrapolated, etc." [09-110r4]

Page 8: The EarthServer initiative:  towards Agile Big Data Services


Web Coverage Service (WCS) Core: Simple & efficient access to multi-dimensional coverages

subset = trim | slice

WCS Extensions for additional functionality facets
• "band extraction", scaling, reprojection, interpolation, query language, ...

Application Profiles define domain-oriented bundling
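The two subset kinds can be illustrated with a request builder. This is a minimal sketch, assuming the WCS 2.0 key-value-pair (GET) encoding, where each `subset` parameter names an axis and gives either an interval (trim) or a single point (slice). The endpoint URL, the coverage name, and the axis names `Lat`, `Long`, and `time` are hypothetical; real axis names depend on the coverage.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def get_coverage_url(endpoint, coverage_id, trims=None, slices=None):
    """Build a WCS 2.0 KVP GetCoverage URL with trim and slice subsets."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
    ]
    # Trim: keep an interval along an axis; the result keeps that axis.
    for axis, (low, high) in (trims or {}).items():
        params.append(("subset", f"{axis}({low},{high})"))
    # Slice: fix one coordinate along an axis; the result loses that axis.
    for axis, point in (slices or {}).items():
        params.append(("subset", f"{axis}({point})"))
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage: a 2-D lat/long cutout at one time step.
url = get_coverage_url(
    "https://example.org/wcs",
    "HypotheticalTemperatureCube",
    trims={"Lat": (40, 50), "Long": (5, 15)},
    slices={"time": '"2012-08-29"'},
)
print(url)
print(parse_qs(urlsplit(url).query)["subset"])
```

Trimming both spatial axes and slicing the time axis turns a 3-D cube into a 2-D result, which is the combination the Core is designed to make cheap.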


Page 9: The EarthServer initiative:  towards Agile Big Data Services


Web Coverage Processing Service (WCPS)

Raster Query Language: ad-hoc navigation, extraction, aggregation, analytics
• Time series
• Image processing
• Summary data
• Sensor fusion & pattern mining
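A WCPS query is plain text, so a client only needs to compose the query string and hand it to the server. The sketch below assumes the query is sent through the WCS Processing extension as a `ProcessCoverages` request with a `query` parameter; the endpoint and the coverage name are hypothetical, and the band names `nir` and `red` are assumed to exist in that coverage.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def ndvi_query(coverage, fmt):
    """Compose a WCPS query computing (nir - red) / (nir + red) per cell."""
    return (
        f"for $c in ( {coverage} ) "
        f'return encode( ($c.nir - $c.red) / ($c.nir + $c.red), "{fmt}" )'
    )


def wcps_request_url(endpoint, query):
    """Wrap a WCPS query in a KVP ProcessCoverages request (assumed binding)."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": query,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical coverage name and endpoint.
query = ndvi_query("HypotheticalLandsatScene", "image/tiff")
wcps_url = wcps_request_url("https://example.org/wcs", query)
print(query)
print(wcps_url)
```

The server evaluates the expression next to the data and returns only the encoded result, so the client never downloads the input bands.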

Page 10: The EarthServer initiative:  towards Agile Big Data Services


EarthServer: Big Earth Data Analytics

Scalable On-Demand Processing for the Earth Sciences
• EU funded, 3 years, 5.85 million EUR
• Platform: rasdaman (Array Analytics server)
• Distributed query processing, integrated data/metadata search, 3D clients

Strictly open standards: OGC WMS + WCS + WCPS; W3C XQuery; X3D

6 * 100+ TB databases for all Earth sciences + planetary science

Page 11: The EarthServer initiative:  towards Agile Big Data Services


The rasdaman Raster Analytics Server

Array DBMS for massive n-D raster data
• new database attribute type: array<celltype,extent>
• Data integration: rasters stored in standard database

Extending ISO SQL with array operators:

select img.green[x0:x1,y0:y1] > 130
from LandsatArchive as img

"tile streaming" architecture
• n-D array as a set of n-D tiles
• extensive optimization, hw/sw parallelization

In operational use
• dozen-Terabyte objects
• Analytics queries in 50 ms on laptop

www.rasdaman.org
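The idea of storing an n-D array as a set of tiles and touching only the tiles a query needs can be sketched in a few lines. This is an illustrative toy in plain Python, not rasdaman's implementation: the function names, the 4x4 tile size, and the synthetic 8x8 "green" band are all assumptions. It evaluates the same kind of predicate as the query above (a cell-wise `> 130` on an inclusive index box).

```python
def make_tiles(array, tile_h, tile_w):
    """Partition a 2-D list of lists into a dict {(row0, col0): tile}."""
    tiles = {}
    for r0 in range(0, len(array), tile_h):
        for c0 in range(0, len(array[0]), tile_w):
            tiles[(r0, c0)] = [row[c0:c0 + tile_w] for row in array[r0:r0 + tile_h]]
    return tiles


def threshold_subset(tiles, r_lo, r_hi, c_lo, c_hi, threshold):
    """Evaluate `cell > threshold` on the inclusive box [r_lo:r_hi, c_lo:c_hi].

    Only tiles that intersect the box are read; returns (mask, tiles_touched).
    """
    out = [[False] * (c_hi - c_lo + 1) for _ in range(r_hi - r_lo + 1)]
    touched = 0
    for (r0, c0), tile in tiles.items():
        r1 = r0 + len(tile) - 1
        c1 = c0 + len(tile[0]) - 1
        if r1 < r_lo or r0 > r_hi or c1 < c_lo or c0 > c_hi:
            continue  # tile does not intersect the query box
        touched += 1
        for r in range(max(r0, r_lo), min(r1, r_hi) + 1):
            for c in range(max(c0, c_lo), min(c1, c_hi) + 1):
                out[r - r_lo][c - c_lo] = tile[r - r0][c - c0] > threshold
    return out, touched


# Synthetic 8x8 band with values 0, 4, 8, ..., 252, stored as four 4x4 tiles.
green = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
tiles = make_tiles(green, 4, 4)
mask, touched = threshold_subset(tiles, 3, 4, 0, 2, 130)
print(mask, touched)
```

Here the query box straddles two of the four tiles, so the other two are never read. On large archives that pruning, together with per-tile parallelism, is where such an architecture gains its speed.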

Page 12: The EarthServer initiative:  towards Agile Big Data Services


Value-Added Satellite Image Archive

[Diedrich et al 2001]

Page 13: The EarthServer initiative:  towards Agile Big Data Services


rasdaman: Distributed Query Processing

WCPS peer-to-peer cloud
• each node accepts all requests
• Incoming node distributes query, semantics based
• Manifold optimization criteria

Incoming query:

for $a in ( A ), $b in ( B )
return encode( ( ($a.nir - $a.red) / ($a.nir + $a.red) - ($b.nir - $b.red) / ($b.nir + $b.red) ), "HDF5" )

Subquery sent to the node holding coverage A:

for $a in ( A )
return encode( ($a.nir - $a.red) / ($a.nir + $a.red), "array-compressed" )

Subquery sent to the node holding coverage B:

for $b in ( B )
return encode( ($b.nir - $b.red) / ($b.nir + $b.red), "array-compressed" )

[Owonibi 2012]
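The query split shown on this slide can be mimicked locally. The sketch below is a simulation under stated assumptions, not the rasdaman scheduler: the two "nodes" are plain dictionaries with made-up cell values, a thread pool stands in for remote execution, and the function names are hypothetical. Each node computes its own (nir - red) / (nir + red) subquery, and the incoming node only subtracts the two partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def ndvi(coverage):
    """Subquery run where the data lives: per-cell (nir - red) / (nir + red)."""
    return [
        (nir - red) / (nir + red)
        for nir, red in zip(coverage["nir"], coverage["red"])
    ]


# Hypothetical nodes, each holding one coverage as flat lists of cell values.
NODES = {
    "A": {"nir": [0.6, 0.8, 0.5], "red": [0.2, 0.2, 0.5]},
    "B": {"nir": [0.3, 0.6, 0.9], "red": [0.1, 0.2, 0.1]},
}


def ndvi_difference(nodes):
    """Incoming node: dispatch one subquery per coverage, then combine."""
    with ThreadPoolExecutor() as pool:
        partial_a = pool.submit(ndvi, nodes["A"])
        partial_b = pool.submit(ndvi, nodes["B"])
        # Only the reduced per-node results travel back to be combined.
        return [a - b for a, b in zip(partial_a.result(), partial_b.result())]


result = ndvi_difference(NODES)
print(result)
```

The design point is that the expensive band arithmetic runs beside each coverage, so only one derived array per node crosses the network instead of two raw bands.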

Page 14: The EarthServer initiative:  towards Agile Big Data Services


EarthServer Contribution to GEOSS

Integrated n-D coverage data / metadata search
• Smooth integration with Broker

[Nativi, Mazzetti 2012]

Page 15: The EarthServer initiative:  towards Agile Big Data Services


EarthServer Contribution to GEOSS

Integrated n-D coverage data / metadata search
• Smooth integration with Broker
• Including "reverse lookup" queries: "give me metadata for data with specific properties"
• Also integration with MapServer, GDAL, ...

Scalable n-D interfaces, based on OGC standards
• Working "in situ" on existing archives; no copying!

Flexible ad-hoc processing & filtering
• Through OGC standardized query language

n-D visual Web clients
• 1D diagrams, 2D maps, 3D data cubes, 3D timeseries sets, ...
• Dynamically composed from query results

Page 16: The EarthServer initiative:  towards Agile Big Data Services


Conclusion

Sensor, image, & statistics data = a main source of Big Data in Earth Sciences
• Petrol industry has "more bytes than barrels"

OGC standards offer common platform
• spatio-temporal coverages: a unified, cross-domain data model
• Web Coverage Service suite: from simple download to flexible analytics
• www.ogcnetwork.net/wcs

EarthServer can contribute Agile Analytics to GEOSS
• OGC coverage standards
• rasdaman technology

www.earthserver.eu