Improving Data Catalogs with Free and Open Source Software Kevin O’Brien University of Washington Joint Institute for the Study of the Atmosphere and Ocean Steven C Hankin – NOAA/PMEL Roland Schweitzer – Weathertop Consulting AGU Fall Meeting 2013


Page 1: Improving Data Catalogs with Free and Open Source Software Kevin O’Brien University of Washington Joint Institute for the Study of the Atmosphere and Ocean

Improving Data Catalogs with Free and Open Source Software

Kevin O’Brien
University of Washington
Joint Institute for the Study of the Atmosphere and Ocean

Steven C Hankin – NOAA/PMEL
Roland Schweitzer – Weathertop Consulting

AGU Fall Meeting 2013

Page 2

The Unified Access Framework (UAF)

• A Global Earth Observation Integrated Data Environment (GEO-IDE) project

• An attempt to improve scientific data management and access

• Focus on successes

Page 3

Lots of data already available

Page 4

What “success” did UAF choose to copy?

Year 1 focused on gridded datasets.

Service stack: netCDF-CF-DAP-THREDDS-WMS

Projects: (too many to name)

Data formats: netCDF, GRIB, HDF

Applications: Matlab, ArcGIS, Ferret, GrADS, Google Earth, IDV, LAS, ERDDAP, …

Users: (too many to name)

Page 5

Developing the UAF Catalog Cleaner

(a ‘web crawler’)

[Diagram: UAF ‘RAW’ catalog → Catalog Cleaner → UAF ‘CLEAN’ catalog. Both catalogs share the same top-level branches: NOAA, NOAA Affiliated, IOOS National Partners, and IOOS Regional Partners. Node labels recoverable from the diagram: NMFS, OAR, NWS, NESDIS, NODC, NGDC, GFDL, PMEL, AOML, OCO, PFEG, NDBC, ESRL, Coastwatch, NOMADS, NAVO, AOOS, NANOOS, CENCOOS, SCCOOS, PACIOOS, GLOS, NERACOOS, MACOORA, SECOORA, CARICOOS, GCOOS.]

Page 6

Tree Crawl Dataset Crawl Cleaner

CatalogRef and Dataset URLs

Raw catalog XML
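
The Tree Crawl step on this slide (raw catalog XML in, CatalogRef and dataset URLs out) can be sketched roughly as follows. This is a minimal illustration in Python, not the Catalog Cleaner's actual code. The function name `tree_crawl_step` and the sample catalog are hypothetical. The sketch assumes THREDDS InvCatalog 1.0 XML, where child catalogs appear as `catalogRef` elements with an `xlink:href` attribute and accessible datasets carry a `urlPath` attribute.

```python
import xml.etree.ElementTree as ET

# Namespaces used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def tree_crawl_step(catalog_xml):
    """Return (catalog_refs, dataset_paths) found in one catalog document.

    Hypothetical helper: a full crawler would fetch each catalog reference
    and repeat this step until no unvisited catalogs remain.
    """
    root = ET.fromstring(catalog_xml)
    # Child catalogs to visit next.
    catalog_refs = [
        ref.get(f"{{{XLINK_NS}}}href")
        for ref in root.iter(f"{{{THREDDS_NS}}}catalogRef")
        if ref.get(f"{{{XLINK_NS}}}href")
    ]
    # Datasets that expose data (container datasets have no urlPath).
    dataset_paths = [
        ds.get("urlPath")
        for ds in root.iter(f"{{{THREDDS_NS}}}dataset")
        if ds.get("urlPath")
    ]
    return catalog_refs, dataset_paths


# Made-up sample catalog, for illustration only.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink" name="example raw catalog">
  <catalogRef xlink:href="child/catalog.xml" xlink:title="child"/>
  <dataset name="top">
    <dataset name="sst" urlPath="MSGSST/SST.nc"/>
  </dataset>
</catalog>"""

refs, paths = tree_crawl_step(SAMPLE)
print(refs)
print(paths)
```

Each returned catalog reference would be resolved against the parent catalog's URL and crawled in turn; the dataset paths feed the Dataset Crawl stage.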

Page 7

Tree Crawl Dataset Crawl Cleaner

url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/OCEAN_GEOSTROPHIC_CURRENTS/CURRENTS.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/GLOBAL_MONTHLY_CARBON_FLUXES/FLUXES.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/GLOBAL_SEASON_CARBON_FLUXES/FLUXES.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/ROMSMETEO/kk1.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/MCI_GULF/kk1.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/MSGSST/SST.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/TERRA_K490_GULF/terrak490.nc"
url="http://cwcgom.aoml.noaa.gov/thredds/dodsC/TERRA_K490_GULF_3D/terrak490.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.199910.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.199911.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.199912.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200001.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200002.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200003.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200004.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200005.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200006.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200007.nc"
url="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/soill.200008.nc"

CatalogRef and Dataset URLs

Page 8

Tree Crawl Dataset Crawl Cleaner

Aggregations

CF compliance

Access services

UAF Clean Catalog
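
To illustrate the "Aggregations" item, here is a hedged sketch of one way to spot files that are candidates for aggregation. It is not taken from the Catalog Cleaner itself: the heuristic (treating an isolated six-digit run as a YYYYMM stamp) and the names `DATE_STAMP` and `aggregation_candidates` are assumptions made for this example. The URLs are ones shown on the Dataset Crawl slide.

```python
import re
from collections import defaultdict

# Assumed heuristic: exactly six consecutive digits look like a YYYYMM stamp.
DATE_STAMP = re.compile(r"(?<!\d)\d{6}(?!\d)")


def aggregation_candidates(urls):
    """Group URLs that differ only in their date stamp.

    Returns a dict mapping a URL template to the sorted URLs that match it;
    templates matched by a single URL are dropped.
    """
    groups = defaultdict(list)
    for url in urls:
        template = DATE_STAMP.sub("{yyyymm}", url)
        groups[template].append(url)
    return {t: sorted(members) for t, members in groups.items() if len(members) > 1}


BASE = "http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/NARR.dailyavgs/subsurface/"
URLS = [
    BASE + "soill.199910.nc",
    BASE + "soill.199911.nc",
    BASE + "soill.199912.nc",
    "http://cwcgom.aoml.noaa.gov/thredds/dodsC/MSGSST/SST.nc",
]

candidates = aggregation_candidates(URLS)
for template, members in candidates.items():
    print(template, len(members))
```

A real cleaner would also need to confirm that the grouped files share variables and grids before aggregating them; this sketch only proposes candidates from the URL pattern.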

Page 9

UAF Clean Catalog

Page 10

How to provide feedback to data providers?

• Remember the “Building on Success” theme

• ncISO metadata assessment tool is very successful

Page 11
Page 12
Page 13

How about a catalog quality assessment tool?

How to provide feedback to data providers?

• Remember the “Building on Success” theme

• ncISO metadata assessment tool is very successful

Page 14
Page 15
Page 16

Statistics for current catalog and all its children

Links to rubric reports for child catalogs
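
The idea of reporting statistics for a catalog together with all its children can be sketched as a recursive roll-up. This is an illustrative guess at the bookkeeping, not the rubric tool's implementation: the `CatalogStats` class, its fields, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogStats:
    """Hypothetical per-catalog counts; fields are assumptions for this sketch."""
    name: str
    datasets: int = 0          # datasets listed directly in this catalog
    missing_services: int = 0  # datasets lacking an expected access service
    children: list = field(default_factory=list)


def rolled_up(node):
    """Return totals for this catalog plus every catalog beneath it."""
    total = {"datasets": node.datasets, "missing_services": node.missing_services}
    for child in node.children:
        sub = rolled_up(child)
        for key in total:
            total[key] += sub[key]
    return total


# Made-up catalog tree, for illustration only.
tree = CatalogStats("root", datasets=2, missing_services=1, children=[
    CatalogStats("child-a", datasets=3),
    CatalogStats("child-b", datasets=5, missing_services=2, children=[
        CatalogStats("grandchild", datasets=1, missing_services=1),
    ]),
])

print(rolled_up(tree))
```

Calling `rolled_up` on any child node gives the figures for that child's own report, which matches the slide's pairing of parent statistics with links to child reports.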

Page 17

Missing services

Data issues

Page 18


Page 19

Data issues

Original Catalog

Page 20

Moving Forward…

• Welcome feedback on rubric and Catalog Cleaner tool

• Change wording in rubric

• UAF master catalog to go beyond gridded files

• Use ERDDAP to include in situ featureTypes

• Continue community outreach to improve catalogs

Page 21

Thank you!

UAF: geo-ide.noaa.gov
Catalog Cleaner code and documentation: http://ferret.pmel.noaa.gov/LAS/documentation/the-uaf-catalog-cleaner/
THREDDS: www.unidata.ucar.edu/projects/THREDDS
netCDF: www.unidata.ucar.edu/netcdf
OPeNDAP: www.opendap.org
CF: cf-pcmdi.llnl.gov

Kevin.M.O’[email protected]

AGU Fall Meeting 2013