
RDA Data Foundation and Terminology (DFT) IG: Introduction

Prepared for RDA Plenary San Diego, March 9, 2015

Gary Berg-Cross, Raphael Ritz, Co-Chairs DFT IG

A PID record that points to a metadata record and to instantiations of identical bit-streams that may store additional attributes

Goal: Describe a basic, abstract (but clear) data organization model that systemizes the already large body of definition work on data management terms, especially as involved in RDA’s efforts.
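The PID record described above can be read as a small data structure. The following is a minimal sketch under stated assumptions, not the DFT model itself: the class names, field names, example identifier, and URL are hypothetical, and comparing SHA-256 digests is only one assumed way to check that the stored copies hold identical bit-streams.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instantiation:
    """One stored copy of the bit-stream (hypothetical structure)."""
    location: str
    content: bytes


@dataclass
class PIDRecord:
    """A PID record pointing to a metadata record and to copies of one bit-stream."""
    pid: str
    metadata_ref: str
    instantiations: List[Instantiation] = field(default_factory=list)
    # Additional attributes the record may store (names are illustrative only).
    attributes: Dict[str, str] = field(default_factory=dict)

    def copies_identical(self) -> bool:
        """Return True when all stored copies share the same SHA-256 digest."""
        digests = {hashlib.sha256(i.content).hexdigest() for i in self.instantiations}
        return len(digests) <= 1


# Hypothetical example values, for illustration only.
record = PIDRecord(
    pid="hdl:0000/example",
    metadata_ref="https://example.org/metadata/1",
    instantiations=[
        Instantiation("repo-a", b"\x00\x01\x02"),
        Instantiation("repo-b", b"\x00\x01\x02"),
    ],
    attributes={"checksum_algorithm": "sha256"},
)
print(record.copies_identical())
```

Keeping the metadata reference separate from the instantiations mirrors the wording of the slide: the record points to both, rather than embedding either.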


DFT IG Session Agenda (16:00-17:30)

• 16:00-16:10 Overview of the DFT IG, Case Statement & the Breakout Session: Goals and Plans (Gary Berg-Cross)
• 16:10-16:20 Overview of the TeD-T tool (Raphael/Thomas)
• 16:20-16:45 Liaison relation to other RDA IGs and WGs & solicitation of ideas for additional Use Cases and candidate vocabulary items
  • MIG and related RDA work (Keith Jefferies)
  • Practical policy (Regan Moore)
  • Adopter DataFed.net (Aaron Addison & Cynthia Hudson Vitale)
  • Science Europe Working Group on Research Data (Peter Doorn)
  • Also possible to hear from:
    • Legal interoperability
    • Marine data harmonization
    • Data Fabric
    • PIT and data type registries
• 16:45-17:20 General Discussion (including remote participants)
• 17:20-17:30 Discussion of follow-on work & plan for follow-up virtual meeting


Prior DFT WG Activities & Accomplishments

• One of the first RDA WGs
• Drafted 4 related Model Documents on core work:
  1. Data Models 1: Overview (20+ models)
  2. Data Models 2: Analysis & Synthesis
  3. Data Models 3: Term Snapshot
  4. Data Models 4: Use Cases (work with other RDA WGs on use cases to illustrate data concepts)
• Presented draft work & held community discussions at the RDA P1-P3 meetings
• Participated in cross-WG discussions
• Developed the Semantic MediaWiki Term Definition Tool (TeD-T) to capture an initial list of terms and definitions for discussion; a demo was held at P3 (see http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page)
• Participated in Adoption Day: adopters of DFT include DataFed.net, CLARIN (Common Language Resources and Technology Infrastructure), etc.

[Figure: candidate list evolved to refined list; tool demo at Plenary 3]


Overview of Term Development

Starter areas and items:
• Persistent Identifiers (PIDs and types)
• Digital Object - Data Object
• Collection - Data Set - Aggregation
• Repository (Registries and related Policies)

[Figure: Scope; Terms from Model Papers; Placed in Tool; Getting Defs organized for review; Analysis and Revision Process; Defs & Refinement]

Digital Information Object: A digital item or group of items referred to as a unit, regardless of type or format, that a computer can address or manipulate as a single object.


Example of Work on 10 categories of Terms

Digital Object (aka Digital Entity)

A digital object is composed of a structured sequence of bits/bytes. As an object it is named. This bit sequence can be identified & accessed by a unique and persistent identifier or by use of referencing attributes describing its properties.

Note: Digital Entity definition from the X.1255 ITU standard: “machine-independent data structure consisting of one or more elements in digital form that can be parsed by different information systems; the structure helps to enable interoperability among diverse information systems in the Internet.”
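The two access paths in this definition (by persistent identifier, or by referencing attributes) can be illustrated with a toy in-memory registry. This is a sketch under stated assumptions rather than any RDA or X.1255 interface: the `DigitalObject` and `Registry` classes, their methods, and the example identifiers and attribute names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """A named bit sequence with a persistent identifier and descriptive attributes."""
    pid: str
    name: str
    bits: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


class Registry:
    """Toy in-memory registry (hypothetical); real PID systems resolve over a network."""

    def __init__(self) -> None:
        self._by_pid: Dict[str, DigitalObject] = {}

    def register(self, obj: DigitalObject) -> None:
        # A PID must be unique, so refuse to overwrite an existing entry.
        if obj.pid in self._by_pid:
            raise ValueError(f"PID already registered: {obj.pid}")
        self._by_pid[obj.pid] = obj

    def resolve(self, pid: str) -> Optional[DigitalObject]:
        """Access path 1: look the object up by its persistent identifier."""
        return self._by_pid.get(pid)

    def find(self, **attrs: str) -> List[DigitalObject]:
        """Access path 2: select objects whose attributes match all given values."""
        return [
            obj for obj in self._by_pid.values()
            if all(obj.attributes.get(k) == v for k, v in attrs.items())
        ]


# Hypothetical example values, for illustration only.
registry = Registry()
registry.register(DigitalObject("pid:1", "survey-2014", b"abc", {"format": "csv"}))
registry.register(DigitalObject("pid:2", "image-7", b"\x89PNG", {"format": "png"}))

by_pid = registry.resolve("pid:1")
by_attr = registry.find(format="png")
```

Resolution by PID returns at most one object, while attribute lookup may return several, which is why the two methods have different return types.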


More Terms and Initial definitions are in TeD-T


Lessons Learned and Follow Up

• It has, of course, been difficult to get consensus on the scope of a common vocabulary with detailed definitions.
• The work has been more about identifying models and vocabulary than about producing integrated definitions.
• We have been in frequent discussion with communities about our results and will intensify this interaction.
• Based on this experience, a broader plan for long-term maintenance will be submitted to the TAB and Council as part of the IG.
• As needed, in consultation with these & other appropriate RDA entities, some updates to term definitions can be anticipated as part of maintenance.
• A plan must be provided for maintaining the term tool (TeD-T) and for its use with DFT terms and perhaps by other WGs.
• A special task force may be empowered to do this and other maintenance activities in line with guidance from RDA governance organizations.
• Based on interest, a DFT IG was formed to continue efforts.


Coordinated with several other RDA Groups

• Considerable discussion of vocabularies has been part of RDA group activities at Plenaries and of ongoing RDA group discussion.
• We coordinated across groups with several RDA WGs, as shown in the Data Fabric figure on data concepts and relations.
• This coordination task needs to be ongoing.
• Potentially all groups could be engaged in this IG, and we with them.
• Much more work and discussion would be useful, such as with the PP WG, whose terminology was only briefly sketched out without full definitions.
• PP, along with MIG, has expressed an interest in more formalized definitions that can be processed by computer; the TeD-T tool may be capable of doing this, or at least of demonstrating its feasibility.
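As an illustration of what a definition "that can be processed by computer" might look like, here is a minimal sketch that stores a term as a structured record and round-trips it through JSON. It does not reflect how TeD-T stores or exports terms; the `TermDefinition` class and its fields are hypothetical, and the definition text is condensed from the Digital Object slide.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TermDefinition:
    """A vocabulary entry in machine-readable form (hypothetical schema)."""
    label: str
    definition: str
    synonyms: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys give a stable serialization that is easy to diff.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "TermDefinition":
        return cls(**json.loads(text))


term = TermDefinition(
    label="Digital Object",
    definition=(
        "A named, structured sequence of bits/bytes that can be identified "
        "and accessed by a unique and persistent identifier."
    ),
    synonyms=["Digital Entity"],
)

serialized = term.to_json()
restored = TermDefinition.from_json(serialized)
```

A structured record like this lets tools check, for example, that every term has a definition or that synonyms do not collide, which free-text definitions do not allow.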


Objectives for P5

1. Start IG discussion and leverage existing work and approach but improve both
   1. We are expecting considerable discussion of new requirements coming out of groups nearing completion, but also support as part of adoption.
   2. We can also leverage the experience of other IGs as to success factors.
2. Focus on facilitating community discussion on core concepts
   1. Based on feedback, some curated revisions of definitions and extension of the current synthesis model can be expected, to finalize and stabilize the effort for subsequent use.
3. Facilitate definition development
   1. Potential adopters will be encouraged at P5 to provide feedback on additional use case scenarios to illustrate what areas of work they plan on using the models and vocabulary for.
   2. This will serve to plan work and virtual meetings between P5 and P6.