Data in the NEES Data Repository
Conditions for Current and Future Use and Re-Use
Stanislav Pejša, NEEScomm Data Curator, NEES
Quake Summit 2012, Boston, Massachusetts, July 12, 2012
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Table of Contents
• Reference models
• Data Flow in NEES Data Repository
• NEES Data Goals
• Data Archiving
• Quality Assurance
• Access and Sharing
• Data Re-Use
• Data Preservation
I2S2 Research Lifecycle
Research activity
Administrative activity
Publication
Archive activity
http://www.ukoln.ac.uk/projects/I2S2/documents/I2S2-ResearchActivityLifecycleModel-110407.pdf
DCC Curation Lifecycle
Data: digital objects
Full lifecycle actions
• Description and representation information
• Preservation planning
• Community watch and participation
• Curate and preserve
Sequential actions
• Conceptualise
• Create or receive
• Appraise and select
• Ingest
• Preservation action
• Store
• Access, use, re-use
• Transform
Occasional actions
• Dispose
• Reappraise
• Migrate
http://www.dcc.ac.uk/resources/curation-lifecycle-model
OAIS Functional Model
6 functional entities
• Ingest
• Archival storage
• Data management
• Administration
• Preservation planning
• Access
Data Flow in NEES
http://nees.org/warehouse/experiment/1622/project/637
NEES Data Goals
Aligned with NSF Data Management Plan (DMP) requirements*
• All research data and documentation will be archived
  NSF DMP: types of data and other materials to be produced during the project
• Archived data will be of high quality
  NSF DMP: standards to be used for data and metadata format and content
• Archived data will be accessible and shareable
  NSF DMP: policies for access and sharing
• Archived data will be re-usable
  NSF DMP: policies and provisions for re-use
• Archived data will be preserved
  NSF DMP: plans for archiving and preservation of access to them
* http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/gpg_2.jsp#dmp
Data Archiving
Who: research team, site personnel, curator, NEEScomm
What: sensor measurements, sensor calibrations, observations, analyses, numerical simulations, images and videos, reports (including publications and presentations), logs
When
• Dates are stated in the Data Sharing and Archiving Policies (1 month, 6 months, 12 months)
• For as long as the data are useful ~ indefinitely ~ for 20 years
Where: Project Warehouse http://nees.org/warehouse/welcome
Why
• increases researcher's impact
• saves work, time, money
• good practice
• advances science
Information Package
Information Package - discoverable through descriptive information
• Content Information - the original target of preservation; consists of
  • Content Data Object (bits)
  • Representation Information - needed to make the object understandable to the community (record)
• Preservation Description Information - information needed to preserve the Content Information
  • Provenance
  • Context
  • Reference (identification)
  • Fixity - protects the Content Information from undocumented alteration
  • Access rights
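The parts of an information package listed above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names, and the choice of SHA-256 for fixity are hypothetical and do not describe how the NEES Data Repository stores its packages.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class ContentInformation:
    data_object: bytes        # Content Data Object (the bits)
    representation_info: str  # what the community needs to interpret the bits


@dataclass
class PreservationDescription:
    provenance: str     # where the data came from
    context: str        # how the data relate to other material
    reference: str      # identifier of the content
    fixity: str         # digest used to detect undocumented alteration
    access_rights: str  # conditions of access


@dataclass
class InformationPackage:
    descriptive_info: dict  # supports discovery of the package
    content: ContentInformation
    pdi: PreservationDescription

    def fixity_ok(self) -> bool:
        """Recompute the digest of the bits and compare it with the stored one."""
        current = hashlib.sha256(self.content.data_object).hexdigest()
        return current == self.pdi.fixity


def build_package(bits: bytes, representation_info: str, title: str,
                  identifier: str) -> InformationPackage:
    """Assemble a package; all example values here are placeholders."""
    pdi = PreservationDescription(
        provenance="uploaded by research team",
        context="part of a hypothetical experiment",
        reference=identifier,
        fixity=hashlib.sha256(bits).hexdigest(),
        access_rights="public",
    )
    content = ContentInformation(bits, representation_info)
    return InformationPackage({"title": title}, content, pdi)
```

Calling `fixity_ok()` after any change to `data_object` returns `False`, which is the role the slide assigns to fixity information.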
Quality Assurance
• Data need to be understandable
• Standards
  • Seeing standards
  • Research teams: professional standards; team guidelines for data management; NEEScomm requirements
  • NEES sites: certifications; professional standards; local guidelines (naming conventions, etc.)
  • NEES Data Repository: OAIS; PREMIS; Dublin Core; documentation and metadata requirements
• Curation: interactive and iterative exchange; assessment of the technical quality of data and relevant documentation
Access and Sharing
Time
• Unprocessed data - within 1 month
• Corrected data and documentation - within 6 months
• Data made PUBLIC - within 12 months
Conditions for access and sharing (let others know that they can use your data)
• Open Data - data
• Creative Commons - presentations, reports, pre-prints/post-prints, teaching materials
• Open Source - software
More on intellectual property considerations: https://nees.org/legal/licensing
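The 1-, 6-, and 12-month milestones in this section can be turned into calendar dates with a few lines of code. The sketch below is illustrative only: the slide does not say which event the periods are counted from, so `reference_date` is an assumption, and the function and dictionary names are hypothetical.

```python
import calendar
from datetime import date

# Month offsets taken from the slide; the date they count from is an assumption.
MILESTONES = {
    "unprocessed data": 1,
    "corrected data and documentation": 6,
    "data made public": 12,
}


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of short months."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def sharing_deadlines(reference_date: date) -> dict:
    """Return the latest date for each milestone, counted from reference_date."""
    return {name: add_months(reference_date, offset)
            for name, offset in MILESTONES.items()}
```

For example, `sharing_deadlines(date(2012, 7, 12))` gives 12 August 2012, 12 January 2013, and 12 July 2013 for the three milestones.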
Data Re-Use
Use of known, tested, and open formats is key to the success of any future attempt to use data
Data Use - Using research data for the current research purpose/activity to infer new knowledge about the research subject.
Data Re-use - Using research data for a research purpose/activity other than that for which it was intended.
Data Purposing - Making research data available and fit for the current research activity.
Data Re-purposing - Making existing research data available and fit for a future known research activity.
Supporting Data Re-use - Managing existing research data such that it will be available for a future unknown research activity.
Darlington, M. (ed.) (2011a). "ERIM Terminology", version 4. University of Bath, last updated April 12, 2011. http://wiki.bath.ac.uk/display/ERIMterminology/ERIM+Terminology+V4
Ball, A., Darlington, M., Howard, T., McMahon, C., Culley, S. (2012). Visualizing Research Data Records for their Better Management. Journal of Digital Information, Vol. 13, No. 1. Available at http://journals.tdl.org/jodi/article/view/5917
Preservation
Bit-level preservation
• All files will be stored and preserved at the bit level
Full preservation
• Required and recommended (supported) formats
Preservation strategies: format migration, format refresh, emulation
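Bit-level preservation is commonly checked by recording a checksum for every file and recomputing it later. The following is a minimal sketch of that general technique, not the repository's actual procedure: the function names, the manifest layout, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its digest."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose bits have changed."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

A manifest built at ingest can be re-verified on a schedule; an empty list from `verify_manifest` means every recorded file is still bit-for-bit identical. Format migration and emulation address a different risk, namely files that are intact but no longer readable.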