Page 1: DATA SHARING ISSUES, METADATA, ARCHIVES, AND COMPREHENSION Urgency NEESgrid () schedule: –Characterize the Earthquake Engineering

DATA SHARING ISSUES, METADATA, ARCHIVES, AND COMPREHENSION

• Urgency
• NEESgrid (www.neesgrid.org) schedule:
– Characterize the Earthquake Engineering community use of data and metadata: January 2002.
– Distribute preliminary metadata standards: May 2002.
– Publish standards for data and metadata models and representations by September 2002 (Prudhomme and Mish, 2001).
• Consortium Developer of NEES (www.nees.org)
– Working groups on data issues: looking for interested volunteers

Identify/define uses of data and metadata

• To help me remember what I did last time
• To permit other researchers to duplicate test
• Real time remote PI interaction
• To allow numerical simulation
– Interactive decision making during experiment
– Years after the test
• Automated control of the experiment
• Visualization
– Research and education, sponsors
• Data search/query filter
• Artificial Intelligence, inverse/system identification
• Software sharing by common interface OpenSees

Use of data

• Data search/query filter

• Artificial Intelligence, inverse/system identification

• Software sharing by common interface OpenSees

Experience of geotech community

• CWRU database on element tests

• VELACS USC

• COSMOS and IRIS

• PEER structures databases UCSD, UW

• UCD cgm.engr.ucdavis.edu

Other community examples

• Atmosphere/ocean research: NCAR, NOAA, Navy
– Example of flux vector interchanged between programs
– User-specific API to interface with “black box”
– CORBA (Common Object Request Broker Architecture): a spec for an “object” that may be accessed by many platforms (Java, Fortran, etc.)
• Fluid flow
– Visualization code runs with solver
– OpenGL
– Generic flux vector
– Connection of mismatched meshes (regular and scattered)
– Meshing experimental data with numerical data

Data use and format

• Think ahead for uses

– Needs assessment

– Format changes

– Visualization of large data sets is demanding

• What is data?

• Format

– Access tools input and output

– Don’t store twice because it is in different format (calibration?)

Formats, coding

• Oracle

• Flat ASCII

• XML

What are benefits of standardization?

• Knowledge of data format at one facility is transferable to others.
– E.g., numerical simulation of tests at CWRU, UCD.
– Training of experimenters may be transferable.
• User interfaces to databases may be sharable, so we may not each have to develop the interfaces independently.
– Search, query, automated I/O, visualization, ...

Barriers to standardization and how to overcome them

• Need a “killer app” that assumes a standard

• The gap between Civil Engineering and Information Technology.

“Killer App” features

• To help me remember what I did last time: automated metadata documentation

• To permit other researchers to duplicate test

• Real time remote PI interaction: teleparticipation

• To allow numerical simulation

– Interactive decision making during experiment

– Years after the test

• Automated control of the experiment

“Killer App” features(2)

• Visualization

• Data search/query/access/filter

• Web portal - for all of the above?

Metadata Design

• Determine the structure of metadata to optimize
– Intuitive query language
– Readable to computers and humans
– Completeness without redundancy
– Flexibility and Evolution
• Curation by NEES SI and Consortium
• Write code: XML document type definitions

Strawman metadata structure

1. Project Identifiers
2. Catalog of Materials, Objects, Sensors and Apparatus
3. Sequence of Model Test Events and Measurements
4. Sensor Channel Gain Lists (1)
5. Image Data
6. Control Data Files
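As a rough illustration of the six-part strawman, the sketch below holds each section in a plain Python container and checks one cross-reference between section 4 and section 2. The class name, the field names, the `sensors` and `sensor_sn` keys, and the `unknown_sensors` helper are all assumptions made for this example; the slides do not define them.

```python
from dataclasses import dataclass, field


@dataclass
class ModelTestMetadata:
    """Hypothetical container mirroring the six strawman sections."""
    project_identifiers: dict = field(default_factory=dict)   # section 1
    catalog: dict = field(default_factory=dict)                # section 2
    event_sequence: list = field(default_factory=list)         # section 3
    channel_gain_lists: list = field(default_factory=list)     # section 4
    image_data: list = field(default_factory=list)             # section 5
    control_data_files: list = field(default_factory=list)     # section 6


def unknown_sensors(meta):
    """Return serial numbers used in section 4 that are absent from the section 2 catalog."""
    known = set(meta.catalog.get("sensors", {}))
    used = {entry["sensor_sn"] for entry in meta.channel_gain_lists}
    return sorted(used - known)


if __name__ == "__main__":
    meta = ModelTestMetadata(
        catalog={"sensors": {"PCB3245": {"type": "Piezoelectric Accelerometer"}}},
        channel_gain_lists=[
            {"channel": 1, "sensor_sn": "PCB3245", "gain": 10.0},
            {"channel": 2, "sensor_sn": "UNLISTED01", "gain": 10.0},  # made-up serial number
        ],
    )
    print(unknown_sensors(meta))
```

Keeping each sensor's description only in the catalog and referring to it by serial number elsewhere is one way to aim for completeness without redundancy.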

Discussion Items

• Philosophical issues related to culture of data sharing?
– Data producer should get first shot at publication
– How long should we allow a data generator to ponder before other people can have access?
– How do we publish electronic data?
– Give academic credit to data publishers,

XML

<ModelTest>
  <Catalog>
    <Sensors>
      <Sensor SN="PCB3245">
        <Type>Piezoelectric Accelerometer</Type>
        <Manufacturer>PCB</Manufacturer>
        <Model>352</Model>
        <CalibrationDate>092899</CalibrationDate>
        <Sensitivity Unit="mV/g">100</Sensitivity>
        <Range>50g</Range>
        <SensorData>http://www.pcb.com/pcb3245</SensorData>
      </Sensor>
    </Sensors>
  </Catalog>
</ModelTest>
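A minimal sketch of how a program might read this sensor catalog, using Python's standard `xml.etree.ElementTree` module. The XML string repeats the slide's fragment with a closing `</ModelTest>` tag added so that it parses; the function name `read_sensor_catalog` and the dictionary keys are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, closed with </ModelTest> so it is well-formed.
XML_TEXT = """
<ModelTest>
  <Catalog>
    <Sensors>
      <Sensor SN="PCB3245">
        <Type>Piezoelectric Accelerometer</Type>
        <Manufacturer>PCB</Manufacturer>
        <Model>352</Model>
        <CalibrationDate>092899</CalibrationDate>
        <Sensitivity Unit="mV/g">100</Sensitivity>
        <Range>50g</Range>
        <SensorData> http://www.pcb.com/pcb3245 </SensorData>
      </Sensor>
    </Sensors>
  </Catalog>
</ModelTest>
"""


def read_sensor_catalog(xml_text):
    """Return a dict keyed by sensor serial number (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    catalog = {}
    for sensor in root.findall("./Catalog/Sensors/Sensor"):
        sensitivity = sensor.find("Sensitivity")
        catalog[sensor.get("SN")] = {
            "type": sensor.findtext("Type"),
            "manufacturer": sensor.findtext("Manufacturer"),
            "model": sensor.findtext("Model"),
            "calibration_date": sensor.findtext("CalibrationDate"),
            "sensitivity": float(sensitivity.text),
            "sensitivity_unit": sensitivity.get("Unit"),
            "range": sensor.findtext("Range"),
            "sensor_data_url": sensor.findtext("SensorData").strip(),
        }
    return catalog


if __name__ == "__main__":
    for serial, info in read_sensor_catalog(XML_TEXT).items():
        print(serial, info["type"], info["sensitivity"], info["sensitivity_unit"])
```

Because the unit travels with the value as an attribute, a reader can check it before using the number, which is one way the format stays readable to both computers and humans.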

There must be nice interfaces to complex data structures. Automatic metadata generator should do most of the work.

TEDS (Transducer Electronic Data Sheets), SCEDS, automated geometry definition will make the job do-able.

Discussion Items

• At what metadata level do we refer to other archives instead of re-archiving?
Example:
– Accelerometer amplifier gain for each test event archive
– Accelerometer calibration in the test archive
– Date and method of calibration in facility archive
– Cross-axis sensitivity at manufacturer's archive
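One way to picture referring to other archives instead of re-archiving is a layered lookup: each archive stores only the fields it owns, and a query falls through from the most specific level to the most general. The sketch below is an assumption-laden illustration, not a NEES design; the field names and all values are invented placeholders, apart from the calibration date and the 100 mV/g sensitivity, which echo the XML example.

```python
from collections import ChainMap

# Each archive level owns only its own fields (all names and values hypothetical).
event_archive = {"amplifier_gain": 10.0}
test_archive = {"sensitivity_mV_per_g": 100.0}
facility_archive = {"calibration_date": "092899",
                    "calibration_method": "placeholder method"}
manufacturer_archive = {"cross_axis_sensitivity_percent": 5.0}

# Most specific level first; a lookup stops at the first level holding the key.
LEVELS = [
    ("event", event_archive),
    ("test", test_archive),
    ("facility", facility_archive),
    ("manufacturer", manufacturer_archive),
]


def locate(key):
    """Return (level_name, value) for the first archive level that holds key."""
    for name, archive in LEVELS:
        if key in archive:
            return name, archive[key]
    raise KeyError(key)


# A merged read-only style view with the same precedence, for convenience.
sensor_view = ChainMap(*(archive for _, archive in LEVELS))

if __name__ == "__main__":
    print(locate("amplifier_gain"))
    print(locate("cross_axis_sensitivity_percent"))
    print(sensor_view["calibration_date"])
```

Nothing is copied between levels, so a corrected value in the facility or manufacturer archive is seen by every test that refers to it.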

Strawman metadata hierarchy

• Section 1 of the outline in Table 1 contains metadata associated with the research project.

• Section 2 is a catalog of physical objects used to construct or test the model. This includes: apparatus used to test the model, passive materials and markers that are placed in the model, and sensors that are used in the model tests.

Strawman metadata hierarchy

• Section 5 describes image data. This could include photographs, video camera data, and/or engineering drawings of configuration.

• Section 6 describes the data required to control the experiment. This could determine the location of a CPT sounding, the rate of penetration of a penetrometer, or command files to control a shaker.

Strawman metadata hierarchy

• Section 3 describes sequencing of events. A sequence can be the measurement of the location of an object, or an event involving activation of an actuator or a penetrometer sounding. 

• Section 4 includes the sensor-channel-gain lists; this documents which sensors are plugged into which amplifier channels, and also includes the sequence in which the sensor data was recorded, and parameters that define gains and filters.
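To show why the sensor-channel-gain list matters, the sketch below converts a recorded voltage to acceleration using a sensor sensitivity (mV/g) and an amplifier gain. The relation used, acceleration = voltage / (sensitivity × gain), is the usual one for a linear sensor and amplifier and is an assumption here, as are the function name, the list layout, and the sample numbers; only the 100 mV/g sensitivity comes from the XML example.

```python
def volts_to_g(volts, sensitivity_mv_per_g, amplifier_gain):
    """Convert a recorded amplifier output (V) to acceleration (g).

    Assumes a linear sensor and amplifier:
        output_mV = acceleration_g * sensitivity_mV_per_g * amplifier_gain
    """
    if sensitivity_mv_per_g <= 0 or amplifier_gain <= 0:
        raise ValueError("sensitivity and gain must be positive")
    return volts * 1000.0 / (sensitivity_mv_per_g * amplifier_gain)


# Hypothetical sensor-channel-gain list: which sensor is on which channel, and its gain.
channel_gain_list = [
    {"channel": 1, "sensor_sn": "PCB3245", "gain": 10.0},
]

# Sensitivity looked up from the sensor catalog (value from the XML example).
sensitivity_by_sn = {"PCB3245": 100.0}  # mV/g

if __name__ == "__main__":
    recorded_volts = {1: 2.0}  # made-up reading for channel 1
    for entry in channel_gain_list:
        accel = volts_to_g(
            recorded_volts[entry["channel"]],
            sensitivity_by_sn[entry["sensor_sn"]],
            entry["gain"],
        )
        print(entry["sensor_sn"], accel, "g")
```

Without the channel assignment and gain for each event, the raw voltages in a data file cannot be turned back into engineering units.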

CAD of geometry and instrument location numbers

Printable version of report (pdf) describing experiment and automatically generated data time histories

Excel spreadsheets of metadata

ASCII data files of sensor readings during about 90 simulated earthquakes (about 1 MB each)

Excel spreadsheet describing calibration factors, amplifier channel numbers, gains, data file format, ...

Event BV, page 3 of pdf document - semiautomatic plot generation using MathCAD program, central vertical array of accelerometer data

[Figure: NEES Collaboratory diagram. Labels: NEES Consortium, NEES Consortium Development, Site Council, System Integrator, NEESgrid, Simulation/Experimental Facilities (Site A, Site B, Site C, Other site 1, Other site 2), Earthquake Researchers, Educators & Students, Professional Engineers, Other Practitioners.]

[Figure: UC Davis Research Network. Labels: OXC nodes, prototype OLS routers, links to Berkeley, Santa Cruz, Sacramento, and Merced, 80 PC cluster, SGI 16 processor parallel computer, 3D visualization machine_1, 3D visualization machine_2, GSR_1, GSR_2, NEES Imaging, HDTV Camera, Environmental Monitoring, SGI image processor.]

[Figure: Model configurations DKS02, DKS03, DKS04, and DKS05. Legend: accelerometer, pore fluid pressure, displacement. Labels: dry Nevada sand (Dr ~ 100%), dry Nevada sand (Dr ~ 98%), saturated Nevada sand (Dr ~ 102%), 4 mm cover sand, 5 mm cover sand, transverse arrays, air hammers, concrete basin, shaking, accelerometers A10, A13, A20, A22, A25, A28. Dimensions shown: inside container width = 787 mm, inside container width = 904 mm, 1651 mm, 1762 mm, 558 mm, 549 mm, 474 mm, 245 mm.]