NSF Meeting on Cyberinfrastructure for Surficial Processes, Jan.18-19, 2006
Slide 1
GEON: The Geosciences Network
Chaitan Baru
San Diego Supercomputer Center (SDSC)
California Institute for Telecommunications and Information Technology (Calit2)
Slide 2
Data Management
DATA COLLECTION
DATA PUBLICATION
DATA ACCESS
DATA ANALYSIS
GEON
Slide 3
GEON Background
• See website: www.geongrid.org, and portal
• Began as a collaboration among ~15 institutions
• Goals
  • Provide a Cyberinfrastructure-based Interpretive Environment for Earth Science research, e.g. for data acquired in EarthScope
  • Support for data discovery
  • A platform for data integration
  • Train students and geoscience researchers in state-of-the-art and advanced IT concepts, i.e. technical aspects of geoinformatics
• Two-Tier approach
  • Develop working systems, while also doing research and building advanced prototypes
• The focus this year is on registering content and tools at the portal and providing a number of “reference” datasets
• The end goal is to provide science infrastructure, with support for both “hosted” and “non-hosted” data
Slide 4
Topics for Today
• LIDAR data management and processing in GEON
  • Courtesy: Prof. Ramon Arrowsmith, Arizona State
• Data Registration
• Linkage with other geoinformatics, CI projects
• Won’t cover details of grid computing, visualization, data integration, …
Slide 5
~1.2 billion data points
Example Data Set:
• Northern San Andreas fault and associated marine terraces.
• Flown February 2003
• Funded by NASA in collaboration w/ USGS.
• ~418 Square Kilometers
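As a rough sanity check on the figures for this data set, the average point density follows directly from the two numbers on the slide. This is only back-of-the-envelope arithmetic: both inputs are approximate, and real LiDAR density varies across the survey area.

```python
# Both figures are approximate values taken from the slide.
total_points = 1.2e9      # ~1.2 billion data points
area_km2 = 418.0          # ~418 square kilometers

# Convert the area to square meters (1 km^2 = 1,000,000 m^2).
area_m2 = area_km2 * 1_000_000

# Average number of points per square meter over the whole survey.
density_per_m2 = total_points / area_m2

print(f"Average density: {density_per_m2:.2f} points per square meter")
```

That works out to a little under 3 points per square meter on average.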
Slide 6
~1.1 million data points to produce this DEM
Slide 19
Lidar Processing Workflow: Using Kepler
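[Workflow diagram; labels recovered from the figure]
• Stages
  • Subset
  • Analyze: move, process
  • Visualize: move, render, display
• Compute resources
  • Arizona Cluster: NFS Mounted Disk, IBM DB2
  • Datastar: NFS Mounted Disk
• Data labels passed between stages: d1, d2 (grid file), sd
• Visualization
  • iView3D/Browser
  • Create Scene file
  • Fledermaus (or ASU OpenGL tool LViz)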
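The Subset and Analyze stages of the workflow can be illustrated with a toy example. This is a minimal sketch, not the GEON/Kepler implementation: the function names, the bounding-box subset, the mean-elevation gridding, and the sample points are all assumptions made for illustration, and reading d1 as the subset output is a guess from the diagram labels (only "d2 (grid file)" is stated on the slide).

```python
from collections import defaultdict


def subset(points, xmin, ymin, xmax, ymax):
    """Keep only the (x, y, z) points that fall inside the bounding box."""
    return [(x, y, z) for x, y, z in points
            if xmin <= x <= xmax and ymin <= y <= ymax]


def grid_mean(points, xmin, ymin, cell):
    """Bin points into square cells and average z per cell (a very simple DEM)."""
    sums = defaultdict(lambda: [0.0, 0])   # cell index -> [sum of z, count]
    for x, y, z in points:
        key = (int((x - xmin) // cell), int((y - ymin) // cell))
        sums[key][0] += z
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Made-up toy point cloud: (x, y, z) tuples in arbitrary units.
cloud = [(0.5, 0.5, 10.0), (0.7, 0.2, 12.0), (1.5, 0.5, 20.0), (9.0, 9.0, 99.0)]

d1 = subset(cloud, 0.0, 0.0, 2.0, 2.0)     # "Subset" stage (assumed meaning of d1)
d2 = grid_mean(d1, 0.0, 0.0, cell=1.0)     # "Analyze" stage producing a grid
print(d2)
```

A production workflow would run the subset as a database query and use a proper interpolation method for the grid; the point here is only the shape of the pipeline.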
Slide 20
LiDAR DATA SETS COMMITTED (?) TO GEON DISTRIBUTION:
Data Set | # of points | Schema | Source
Northern San Andreas (NSAF) | 1.2 billion | 10 column (x,y,z + attributes) | NASA / USGS
West Rainier | ~800 million – 1 billion (est.) | 10 column (x,y,z + attributes) |
Southern SAF Laser Scan | ?? Likely to be 5+ billion | ?? | NCALM
NAPA | ~500M | ?? |
E. CA Shear Zone (E. Mohave) | ~500M | ?? |
Antarctic Dry Valleys | 10-100M? | ?? | Bea Csatho (Ohio State)
Hector Mine EQ | 10-100M? | ?? | Ken Hudnut (USGS)
Alvord (Tripod) | 16.6M | 4 column (x,y,z + intensity) | John Oldow (U. Idaho)
Slide 21
Current Activities
• “Release” of GLW (GEON LIDAR Workflow) capability
• Incorporation of ground-based LIDAR data
• Ground-based Data Collection Workshop, organized by John Oldow, April 6-7, 2006, SDSC/Calit2 Synthesis Center. Sponsored by NSF
Slide 22
Data Registration: GEONsearch
Choose a filetype
Choose subject (from a “base” ontology)
Choose location (from a gazetteer Web service)
Choose a time (numeric range or from a time ontology Web service)
Choose concepts from ontologies
www.geongrid.org
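The search criteria listed on this slide (filetype, subject, location, time, concepts) can be pictured as filters over a catalog of registered resources. The following is a minimal sketch under stated assumptions: the Resource fields, the search function, and the two catalog entries are hypothetical and do not reflect the actual GEONsearch schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical catalog entry; field names are illustrative only."""
    name: str
    filetype: str
    subject: str
    bbox: tuple            # (west, south, east, north) in degrees
    time_range: tuple      # (start, end), units left unspecified
    concepts: set = field(default_factory=set)


def ranges_overlap(a, b):
    """True when two (start, end) ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def boxes_overlap(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, filetype=None, subject=None, bbox=None,
           time_range=None, concepts=()):
    """Return resources matching every criterion that was supplied."""
    hits = []
    for r in catalog:
        if filetype is not None and r.filetype != filetype:
            continue
        if subject is not None and r.subject != subject:
            continue
        if bbox is not None and not boxes_overlap(r.bbox, bbox):
            continue
        if time_range is not None and not ranges_overlap(r.time_range, time_range):
            continue
        if not set(concepts) <= r.concepts:
            continue
        hits.append(r)
    return hits


# Made-up catalog entries for illustration.
catalog = [
    Resource("toy_faults", "shapefile", "structure",
             (-124.0, 38.0, -122.0, 40.0), (0, 10), {"fault"}),
    Resource("toy_gravity", "netCDF", "geophysics",
             (-110.0, 30.0, -100.0, 35.0), (0, 5), {"gravity"}),
]

hits = search(catalog, filetype="shapefile",
              bbox=(-123.0, 37.0, -121.0, 39.0), concepts={"fault"})
print([r.name for r in hits])
```

In the real system the location and time would be resolved through the gazetteer and time ontology services rather than passed in as raw numbers.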
Slide 23
GEONsearch and myGEON
[Architecture diagram; labels recovered from the figure]
• GEONsearch
  • Search condition(s): spatial, temporal, concept
  • Log
  • GEON Catalog
  • GEON Datasets (extracted information/indexes)
  • Web services: Gazetteer, Geologic Age
• myGEON
  • Search results (in Data Integration Cart©)
  • Saved session
  • Save Map Session
• Map Service
  • Move data
  • Create map service
  • Create session
• Labels on connecting arrows
  • Selected results (shape file, GEON IDs)
  • Handle to interactive map session
Slide 24
The 1-2-3 of GEON Data Registration
1. Register dataset or tool to index terms
  • Allows users to more easily discover relevant resources
2. Register dataset “schema” to ontology
  • E.g. Age_MA → Geologic Age
  • Could be relational DBMS, shapefile, Excel, netCDF, …
  • Allows discovery of datasets that have information of interest, e.g. “all datasets that have velocity data”
3. Register data values to ontology
  • E.g. “Jur” → Jurassic Age from Geologic Age ontology
  • Allows advanced data integration, e.g. integrate Paleobiology data with Paleostrat, or Neptune, Janus, etc.
• Prerequisite: ontologies need to be defined (by community), represented in OWL, and registered
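The three registration levels can be pictured as three lookup tables of increasing detail. This is a minimal sketch, assuming hypothetical dataset names, column names, and index terms; only the Age_MA to Geologic Age and "Jur" to Jurassic mappings come from the slide, and the real registry is presumably backed by OWL ontologies rather than Python dictionaries.

```python
# Level 1: dataset -> index terms (supports keyword discovery).
index_terms = {
    "ds_age": {"geochronology"},       # hypothetical dataset and term
    "ds_gps": {"geodesy"},             # hypothetical dataset and term
}

# Level 2: dataset schema (column names) -> ontology concepts.
schema_map = {
    "ds_age": {"Age_MA": "Geologic Age"},                 # example from the slide
    "ds_gps": {"vel_e": "Velocity", "vel_n": "Velocity"}, # hypothetical columns
}

# Level 3: raw data values -> ontology terms, per (dataset, column).
value_map = {
    ("ds_age", "period_code"): {"Jur": "Jurassic"},       # column name is hypothetical
}


def datasets_with_concept(concept):
    """Level 2 query: which datasets have a column mapped to this concept?"""
    return sorted(ds for ds, cols in schema_map.items()
                  if concept in cols.values())


def to_ontology_term(dataset, column, raw_value):
    """Level 3 lookup: translate a raw value to its ontology term if registered."""
    return value_map.get((dataset, column), {}).get(raw_value, raw_value)


print(datasets_with_concept("Velocity"))                  # "all datasets that have velocity data"
print(to_ontology_term("ds_age", "period_code", "Jur"))
```

Once values from different datasets resolve to the same ontology term, records from those datasets can be joined on that term, which is the basis for the integration use case in item 3.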
Slide 25
GEON Data Registration
Ontology Registration
Dataset Registration (hosted)
Data Item (Schema) Registration (hosted / non-hosted)
Data Item Detail Registration (values)
Service Registration
Resource Registration
Slide 26
Data Registration Activities
• GEON “Mini-Workshop on Information Exchange from Distributed Data Systems”, Feb 7th, 2006
  • Co-organized by Chuck Meertens, UNAVCO/GEON and Ben Domenico, Unidata/LEAD
  • Goal: Register netCDF/OPeNDAP data in GEON portal
Slide 27
GEON IDV
• Courtesy Dr. Chuck Meertens, UNAVCO
• Adapt IDV for earth science datasets
• Incorporate web service calls in IDV to invoke GEONsearch, and to access and manipulate netCDF-based 3D and 4D data sets
Slide 28
Geo-ontologies
• Data Registration and Ontology meetings
  • GEON Data Registration meeting, March 10-11, SDSC
  • Volcano Ontology meeting, sponsored by NASA SESDI project (Semantically-Enabled Scientific Data Integration), Feb 16/17, SDSC
• An opportunity for the community to develop community standards for knowledge representation, e.g.
  • Schemas, controlled vocabularies, ontologies
  • And, choose a common representation system, e.g. OWL
Slide 29
Linkage with Other Geoinformatics, CI Projects
• CUAHSI Hydrologic Information System (HIS)
  • HIS is using GEON data registration and search capability, mapping services, the GEON PoP node structure, and the “GEON Pack” (i.e. a common software stack)
• CHRONOS
  • Database federation
• Hosting Paleo-pollen databases
• Hosting NAVDAT
• IT collaborations with NCMIR/BIRN (NIH), SESDI (NASA), LEAD, GRASP (Grid Benchmarking), Globus (Data Replication Service middleware)
Slide 30
E.g., CHRONOS Federated Databases
• The following databases are all part of the CHRONOS Federated Database at SDSC, based on IBM’s DB2 Information Integrator. The federated database is registered in GEON.
  • Neptune
  • PaleoStrat
  • PaleoBiology
  • Janus
  • TimeScale
  • FAUNMAP
  • MIOMAP
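The idea behind a federated database is that one query can span several independently managed databases. The sketch below illustrates that idea only; it uses SQLite's ATTACH DATABASE as a stand-in and is not how DB2 Information Integrator works. The database aliases, table names, columns, and rows are all made up for illustration.

```python
import sqlite3

# One connection acts as the "federation" front end.
conn = sqlite3.connect(":memory:")

# Attach two separate in-memory databases, standing in for two member databases.
conn.execute("ATTACH DATABASE ':memory:' AS source_a")
conn.execute("ATTACH DATABASE ':memory:' AS source_b")

# Each member database keeps its own (hypothetical) schema.
conn.execute("CREATE TABLE source_a.occurrences (taxon TEXT, age_ma REAL)")
conn.execute("CREATE TABLE source_b.samples (taxon TEXT, site TEXT)")

# Made-up toy rows.
conn.executemany("INSERT INTO source_a.occurrences VALUES (?, ?)",
                 [("taxon_x", 12.5), ("taxon_y", 30.0)])
conn.executemany("INSERT INTO source_b.samples VALUES (?, ?)",
                 [("taxon_x", "site_1"), ("taxon_z", "site_2")])

# A single query joins across both member databases.
rows = conn.execute(
    """
    SELECT a.taxon, a.age_ma, b.site
    FROM source_a.occurrences AS a
    JOIN source_b.samples AS b ON a.taxon = b.taxon
    ORDER BY a.taxon
    """
).fetchall()
print(rows)
```

The join only works because both sources use the same taxon identifiers, which is why registering data values to a shared ontology matters for integration.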
Slide 31
Opportunities
• Leverage CI from existing projects in the same or even different disciplines
• Adopt a service-oriented architecture (SOA)
  • i.e. standardize on Web service interfaces for tools, applications, and data
  • E.g. Web Mapping Services for map image services, and WFS, WCS, and other standards, e.g. for accessing geologic maps, gravity data, sensor data, …
  • Need to deal with .NET and Java compatibility
• Develop centralized community services, e.g. for LIDAR processing
• Develop community standards for knowledge representation
  • Schemas, controlled vocabularies, ontologies. Choose common representation system, e.g. OWL
• Organize community meetings, workshops, conferences
• Develop “Meta-workflow” frameworks
  • Support inter-operation among different scientific workflow systems
• There may be an opportunity to work through a proposed new GSA Division on Geoinformatics and AGU working group on IT
• Geoinformatics 2006. See www.geongrid.org/geoinformatics2006
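To make the point about standardizing on Web service interfaces concrete, here is a sketch of how a client might build an OGC Web Map Service (WMS 1.1.1) GetMap request. The query parameter names are the standard WMS ones; the endpoint URL, layer name, and helper function are hypothetical, and no GEON service is implied.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL for one layer and one bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                                # empty string = default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),      # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url("https://example.org/wms", "geology",
                     (-125.0, 32.0, -114.0, 42.0), 800, 600)
print(url)
```

Because every compliant server accepts the same request shape, a client written this way can switch map providers by changing only the base URL and layer name.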