Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
Mike Folks, The HDF Group
Ruth Duerr, NSIDC
Background and basic concept
HDF4 is
FLEXIBLE
EXTENSIBLE
SELF-DESCRIBING
I’m Plastic Man!
But there’s a cost…
Complexity!
How do we save HDF users from having to deal with all of
the complexity under the hood?
Through the HDF software libraries, either by using the
HDF APIs directly or by using HDF tools that depend on the
HDF libraries.
But what about the future…
• There is a risk in depending solely on the HDF libraries to access HDF-formatted data over the long term.
• It is possible, especially in the distant future, that the libraries may not be available.
Really smart people and software?
Maybe future data users and their computers will be so smart that the HDF4 format will be a piece of cake.
Maybe not.
We need an “easy” button
“If only we could read HDF data with an independent program that does not rely on the HDF API… A possible approach [would be to] extend hdfls to print a hierarchical map of a data file, [and] write ncdump/hdp-like utilities to find, assemble and write out SDSes and vdatas.”
“Leveraging HDF Utilities,” Christopher Lynnes, HDF Workshop X.
The project
HDF4 mapping
• Problem
  - The complex internal byte layout of HDF files requires one to use the API to access HDF data.
  - This makes long-term readability of HDF data dependent on long-term allocation of resources to support HDF software.
• Proposed solution
  - Create a map of the layout of data objects in an HDF file, allowing a simple reader to be written to access the data.
HDF4 mapping project activities
1. Assess and categorize HDF4 data held by NASA
  - To determine what types of objects to map.
  - To get an idea of the magnitude of the project.
2. Develop prototype for proof of concept
  - Develop markup-language based layout specification.
  - Develop tool to produce layout for an HDF4 file.
  - Develop and test two independent tools to read HDF4 data based solely on the map files.
Project activities (continued)
3. Assess results and plan next steps
  - Present results and options for proceeding to the community.
  - Assess the likely usefulness of this approach, as well as any desirable modifications.
  - Evaluate the effort required for a full solution that best meets community needs.
  - Submit a proposal for the work needed to provide a full solution.
1. Assess and categorize
How many NASA HDF4 products?
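| Data Center | HDF4 Products |
|-------------|---------------|
| ASF         | 0             |
| GES-DISC    | 236           |
| GHRC        | 54            |
| ASDC        | 63            |
| LP-DAAC     | 67            |
| NSIDC       | 47            |
| ORNL-DAAC   | 2             |
| PO.DAAC     | 22            |
| SDAC        | 0             |
| MrDC        | 95            |
| Total       | 586           |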
Data characteristics
Product Characteristics Examined
• Product Identification
  - Product Name
  - Data Level
  - Archive Location
  - Product Version
• Whether the product was multi-file
• For HDF-EOS products
  - HDF-EOS version
  - For point data
    • Number of point data sets
    • Maximum number of levels
  - For swath data
    • Number of swaths
    • Maximum number of dimensions
    • Organized by time, space, both, or other
    • Whether dimension maps were used
  - For gridded data
    • Number of grids
    • Max number of dimensions in a grid
    • Number of projections used
    • Whether any grids were indexed
• HDF Version
• For raster data
  - Number of 8-bit rasters
  - Number of 24-bit rasters
  - Number of general rasters
  - Whether any rasters had attributes
  - Whether any rasters were compressed
  - Whether any rasters were chunked
  - Whether there were any palettes
• For SDS data
  - Number of SDSs
  - Maximum number of dimensions
  - Did any SDS have attributes
  - Was any SDS annotated
  - Were dimension scales used
  - Was compression used and if so what kind
  - Was chunking used
• For Vdata
  - Number of Vdata structures
  - Did any Vdata have attributes
  - Did any Vdata fields have attributes
  - Was compression used and if so what kind
  - Was chunking used
Other results
• Slightly more than half of the HDF4 products are in HDF-EOS 2 format
• Grids are the most common HDF-EOS data structures in use
• No products use a combination of grid, swath, and point data structures
2. Prototype and proof of concept
HDF4 mapping prototype workflow
[Workflow diagram. Labels shown:]
• HDF4 File “H4.hdf”
• hmap, linked with HDF4 library
• HDF4 Mapping File (XML document) “H4.hdf.map.xml”
• Groups, Data Objects, Structural and Application Metadata; Locations of Object Data
• Object Data
• Reader 1 (C program)
• Reader 2 (Perl script)
Proof-of-concept results
• The HDF Group created prototype map generation software and a draft map specification
• Map generator was tested on a wide variety of data products
• GES-DISC and NSIDC independently wrote software that uses maps to read data files in the NSIDC and GES-DISC archives
• Summary - the concept is feasible!
Example map fragment
    <?xml version="1.0" encoding="utf-8"?>
    <hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
      <hdf4:RootGroup>
        <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
          <hdf4:Attribute name="data range" ntDesc="32-bit signed integer">
            0 255
          </hdf4:Attribute>
          <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
          <hdf4:Dataspace ndims="2">
            10 100
          </hdf4:Dataspace>
          <hdf4:Datablock nblocks="1">
            <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
            <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
          </hdf4:Datablock>
        </hdf4:SDS>
      </hdf4:RootGroup>
    </hdf4:HDFMap>
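To illustrate how little a map-based reader needs, here is a minimal Python sketch that reads the SDS described by the fragment above without the HDF library. It is not one of the project's readers (those were a C program and a Perl script). Several things are assumptions for illustration: the function name `read_sds` is hypothetical, only signed integer data is handled, multiple blocks are assumed to appear as `BlockOffset`/`BlockNbytes` pairs in matching order, and the "data file" is an in-memory stand-in with filler bytes rather than a real HDF4 file.

```python
import io
import struct
import xml.etree.ElementTree as ET

NS = {"hdf4": "http://www.hdfgroup.org/HDF4/HDF4Map"}
INT_CODES = {1: "b", 2: "h", 4: "i", 8: "q"}  # struct codes for signed integers


def read_sds(map_xml, data_file, name):
    """Return (shape, values) for the SDS `name`, guided only by the map."""
    root = ET.fromstring(map_xml)
    for sds in root.iterfind(".//hdf4:SDS", NS):
        if sds.get("objName") != name:
            continue
        dtype = sds.find("hdf4:Datatype", NS)
        if dtype.get("dtypeClass") != "INT":
            raise NotImplementedError("this sketch handles signed integers only")
        shape = tuple(int(n) for n in sds.find("hdf4:Dataspace", NS).text.split())
        size = int(dtype.get("dtypeSize"))
        order = ">" if dtype.get("byteOrder") == "BE" else "<"
        # Assumption: offsets and byte counts are listed in matching order.
        block = sds.find("hdf4:Datablock", NS)
        offsets = [int(e.text) for e in block.findall("hdf4:BlockOffset", NS)]
        lengths = [int(e.text) for e in block.findall("hdf4:BlockNbytes", NS)]
        raw = bytearray()
        for offset, nbytes in zip(offsets, lengths):
            data_file.seek(offset)
            raw += data_file.read(nbytes)
        count = len(raw) // size
        values = struct.unpack(f"{order}{count}{INT_CODES[size]}", bytes(raw))
        return shape, values
    raise KeyError(name)


MAP_XML = """<?xml version="1.0" encoding="utf-8"?>
<hdf4:HDFMap xmlns:hdf4="http://www.hdfgroup.org/HDF4/HDF4Map">
  <hdf4:RootGroup>
    <hdf4:SDS objName="data1" objPath="/" objID="xid-DFTAG_NDG-2">
      <hdf4:Datatype dtypeClass="INT" dtypeSize="4" byteOrder="BE" />
      <hdf4:Dataspace ndims="2"> 10 100 </hdf4:Dataspace>
      <hdf4:Datablock nblocks="1">
        <hdf4:BlockOffset> 2502 </hdf4:BlockOffset>
        <hdf4:BlockNbytes> 4000 </hdf4:BlockNbytes>
      </hdf4:Datablock>
    </hdf4:SDS>
  </hdf4:RootGroup>
</hdf4:HDFMap>"""

# Stand-in for "H4.hdf": 2502 filler bytes, then 1000 big-endian 32-bit integers.
fake_file = io.BytesIO(bytes(2502) + struct.pack(">1000i", *range(1000)))

shape, values = read_sds(MAP_XML, fake_file, "data1")
print(shape, len(values), values[:3])
```

The numbers in the fragment are self-consistent: a 10 x 100 array of 4-byte integers occupies 4000 bytes, which matches `BlockNbytes`. A reader only has to seek to the offset, read that many bytes, and interpret them using the datatype and byte order given in the map.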
Next steps
Effort for full implementation
• Finalize map file XML specification
  - Compatibility with existing standards: NcML, XFDU, PREMIS, ESML, DFDL
• Implement production quality mapping tool and API
• Possibly do similar assessment for HDF5 maps.
Implementation Processes
• Generate maps for existing archives
  - GES-DISC approach: append the map XML to the XML files already kept for each file in their archive
  - NSIDC non-ECS data implementation: add an XML file for each data file in same directory
  - ROM to add capability to NASA ECS systems in process
  - Other NASA systems TBD
• Generate maps for new data
  - Add map generation as a step in the ingest process using stand-alone tool
  - Request product generation systems to use new API calls that generate maps
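For an archive that keeps one map file beside each data file, a first bookkeeping step could be to find the data files that still lack a map. The Python sketch below is illustrative only. The function name `files_missing_maps` is hypothetical, and the `<datafile>.map.xml` naming is an assumption borrowed from the “H4.hdf.map.xml” example in the prototype workflow, not a documented archive convention.

```python
import tempfile
from pathlib import Path


def files_missing_maps(archive_dir, data_suffix=".hdf", map_suffix=".map.xml"):
    """Return data files under archive_dir that have no sibling map file."""
    missing = []
    for data_path in sorted(Path(archive_dir).rglob(f"*{data_suffix}")):
        # Assumed naming: "H4.hdf" is described by "H4.hdf.map.xml".
        map_path = data_path.with_name(data_path.name + map_suffix)
        if not map_path.exists():
            missing.append(data_path)
    return missing


# Small demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "A.hdf").touch()
    (root / "A.hdf.map.xml").touch()
    (root / "B.hdf").touch()
    missing_names = [p.name for p in files_missing_maps(root)]

print(missing_names)
```

The resulting list could then be fed to whatever map generation step the archive adopts, whether a stand-alone tool at ingest or a batch run over existing holdings.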
How you can help
• Consider what it might take to implement this for your archive
• Review the materials on the wiki and elsewhere - comment heavily!
  - Wiki page added to NASA’s ESDC wiki
  - Project page at The HDF Group website: http://www.hdfgroup.org/projects/hdf4mapping/
Thank you.
This report is based upon work supported in part by a Cooperative Agreement with the National Aeronautics and Space Administration (NASA) under NASA Award NNX06AC83A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.