ISO/IEC 11179 Part 4 Rules and Guidelines for the Formulation of Data Definitions

Lois Fritts SAIC January 17, 2000 Open Forum on Metadata Registries Santa Fe, NM SDC-0002-021-JE-2022

TRANSCRIPT

Page 1: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Lois Fritts, SAIC, January 17, 2000

Open Forum on Metadata Registries

Santa Fe, NM

SDC-0002-021-JE-2022

Page 2: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Challenges

Data element definitions and descriptions are often insufficient to support reuse or multiple users of data.

Data element names are often not definitive for value domains.

Data standardization must focus on data element definitions rather than names.

Page 3: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Purpose of Definitions

The purpose of a data element definition is to define a data element with words or phrases that describe, explain, or make definite and clear its meaning.

Page 4: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Data Definition Rules

A data definition shall be:

Unique

Singular

A statement of what the concept is, not only what it is not

A descriptive phrase or sentence

Limited to commonly understood abbreviations

Without embedded definitions

Page 5: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Unique

Distinguishable from every other definition within the registry.

Good - The date when a regulation became effective.

The date when collection of the sample began.

Poor - The date when something started.
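The uniqueness rule can be checked mechanically, at least in its crudest form. The following is a minimal sketch, assuming the registry is available as a plain mapping from data element name to definition text. The element names, the `normalize` helper, and the `find_duplicate_definitions` function are hypothetical and are not part of ISO/IEC 11179. The sketch only catches definitions that are identical after trivial normalization. It cannot tell that a vague definition such as "The date when something started" fails to distinguish one concept from another.

```python
from collections import defaultdict

def normalize(definition: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing periods."""
    return " ".join(definition.lower().split()).rstrip(".")

def find_duplicate_definitions(registry: dict) -> list:
    """Return groups of element names whose normalized definitions match."""
    by_definition = defaultdict(list)
    for name, definition in registry.items():
        by_definition[normalize(definition)].append(name)
    return [sorted(names) for names in by_definition.values() if len(names) > 1]

# Hypothetical registry content; the definitions reuse the slide's examples.
registry = {
    "Regulation Effective Date": "The date when a regulation became effective.",
    "Sample Collection Start Date": "The date when collection of the sample began.",
    "Permit Start Date": "The date when something started.",
    "Project Start Date": "the date when  something started",
}

print(find_duplicate_definitions(registry))
```

Here the two "something started" entries are reported as one duplicate group, while the two specific definitions pass.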

Page 6: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Singular

Always expressed in the singular.

Good -

The unique identification number assigned to a facility.

Poor -

Unique identification number assigned to facilities.

Page 7: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Positive, Not Negative

Cannot exclusively say what it is not.

Good - The name of a facility that is recognized by the local community as the commercial name.

Poor - The name of a facility that is not the legal name.

Page 8: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Descriptive

Include the essential characteristics of the concept.

Good - The name of the individual designated to be the facility’s representative for communications about the facility.

Poor - Person to contact.

Page 9: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Avoid Abbreviations

Use only commonly known abbreviations.

Good - The Standard Industrial Classification (SIC) code that represents the economic activity of a company.

Poor - The SIC code that represents the economic activity of a company.
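The SIC example suggests a simple automated check: flag any all-capitals token that is neither on an agreed list of common abbreviations nor introduced in parentheses after its expansion. The sketch below is one rough way to do that. The allow-list contents and the function name `unexpanded_abbreviations` are assumptions made for illustration. The check is deliberately crude: it accepts an abbreviation if "(ABBR)" appears anywhere in the definition and does not verify that the words before the parentheses actually spell it out.

```python
import re

# Assumed allow-list of abbreviations a registry treats as commonly understood.
COMMON_ABBREVIATIONS = {"ID"}

def unexpanded_abbreviations(definition: str) -> list:
    """Return all-caps tokens that are not allow-listed and not introduced as "(ABBR)"."""
    flagged = []
    for match in re.finditer(r"\b[A-Z]{2,}\b", definition):
        token = match.group()
        introduced = f"({token})" in definition
        if token in COMMON_ABBREVIATIONS or introduced or token in flagged:
            continue
        flagged.append(token)
    return flagged

good = ("The Standard Industrial Classification (SIC) code that represents "
        "the economic activity of a company.")
poor = "The SIC code that represents the economic activity of a company."

print(unexpanded_abbreviations(good))
print(unexpanded_abbreviations(poor))
```

The good definition yields an empty list, and the poor one yields a list containing `SIC`.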

Page 10: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

No Embedded Definitions

The definition of a second concept should not appear within the definition.

Good - The text that describes the method used to calibrate an instrument.

Poor - The text that describes the method used to calibrate an instrument, where calibration is the process of rectifying the graduation of quantitative instruments.

Page 11: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Data Definition Guidelines

State the essential meaning of the concept.

Be precise and unambiguous.

Be concise.

Be able to stand alone.

Be expressed without embedding rationale, functional usage, domain information or procedural information.

Avoid circular reasoning.

Use consistent terminology and structure for related definitions.

Page 12: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Essential Meaning

Avoid non-essential characteristics.

Good - The name of a country where mail is delivered.

Poor - The last line of a mail piece that names the country where mail is delivered.

Page 13: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Precise and Unambiguous

Express exact meaning of the concept.

Good - The calendar date when latitude and longitude coordinates were determined.

Poor - The data collection date.

Page 14: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Concise

Comprehensive without extraneous terms.

Good - The name of the person to contact for clarification of technical information.

Poor - The individual EPA or State officials may contact if clarification of the information reported on the form is required.

Page 15: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Stand Alone

Stand alone without further definition.

Good -

The Hydrologic Unit Code (HUC) that represents a surface drainage basin or a combination of drainage basins.

Poor -

The Hydrologic Unit Code (HUC) that represents a cataloging unit.

Page 16: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Without Embedded Rationale

Does not include rationale, functional usage, or procedural information.

Good - The distance in meters above or below a reference surface.

Poor - The distance above or below a reference surface, measured in meters rather than feet, because meter is an international unit of measure.

Page 17: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Avoid Circular Reasoning

A data element should not be defined in the context of another data element.

Poor -

Facility Identification Number–The number assigned to a facility.

Facility–The site identified by a facility identification number.
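Circular definitions like the pair above can be found by treating the registry as a graph: each data element points to every other element whose name appears in its definition, and any cycle in that graph is circular reasoning. The following is a minimal sketch under that assumption. The function names `references` and `find_cycle` are hypothetical, and the name matching is a plain case-insensitive substring test, so it will over-match on short names and miss paraphrased references.

```python
from typing import Optional

def references(definitions: dict) -> dict:
    """Map each element name to the other element names mentioned in its definition."""
    graph = {}
    for name, text in definitions.items():
        lowered = text.lower()
        graph[name] = {
            other for other in definitions
            if other != name and other.lower() in lowered
        }
    return graph

def find_cycle(graph: dict) -> Optional[list]:
    """Return one cycle of element names found by depth-first search, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # The path from the first occurrence of node back to node is a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for neighbor in sorted(graph[node]):
            cycle = visit(neighbor)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None

definitions = {
    "Facility Identification Number": "The number assigned to a facility.",
    "Facility": "The site identified by a facility identification number.",
}

print(find_cycle(references(definitions)))
```

For the slide's example the search reports the loop from Facility Identification Number to Facility and back.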

Page 18: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Consistent with Related Data

A common terminology and syntax.

Good - The code that represents the method used to determine vertical coordinates.

The name of the method used to determine vertical coordinates.

Poor - The code that represents the method used to determine horizontal coordinates.

The name of the method used to determine the latitude and longitude of a place.

Page 19: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Example Definition Syntax

Use a phrase, not a sentence.

The name of the country where mail is delivered.

Begin the definition by stating the representation class, such as:

The name of….

The code that represents….

The text that describes….

The measure of the….

The number assigned by…to identify….

The sum, dimension, capacity (quantity) of….
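The lead-in phrases listed on this slide lend themselves to a pattern check. The sketch below classifies a definition by its opening words. It is an illustration only: the labels ("name", "code", and so on), the `LEAD_INS` table, and the function `representation_class` are hypothetical, each elision in the slide's phrases is modeled as "any text", "The measure of the…" is loosened to "The measure of", and the last phrase is read as "The sum of", "The dimension of", "The capacity of", or "The quantity of".

```python
import re
from typing import Optional

# (label, pattern) pairs; labels are hypothetical, patterns follow the slide's lead-ins.
LEAD_INS = [
    ("name", r"The name of\b"),
    ("code", r"The code that represents\b"),
    ("text", r"The text that describes\b"),
    ("measure", r"The measure of\b"),
    ("number", r"The number assigned by\b.+?\bto identify\b"),
    ("quantity", r"The (?:sum|dimension|capacity|quantity) of\b"),
]

def representation_class(definition: str) -> Optional[str]:
    """Return the label of the first lead-in the definition starts with, or None."""
    text = definition.strip()
    for label, pattern in LEAD_INS:
        if re.match(pattern, text):  # re.match anchors at the start of the string
            return label
    return None

print(representation_class("The name of the country where mail is delivered."))
print(representation_class("Person to contact."))
```

A result of `None` is a prompt for review rather than an error. The slide offers these lead-ins as examples ("such as"), and some of the deck's own good definitions, for instance "The unique identification number assigned to a facility", open differently.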

Page 20: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Definitions in Context

Must state exactly the same concept.

Same - The measure of elevation in meters, above or below a reference datum (Registry).

The vertical distance in meters either above or below a reference surface (Standard).

Different - The height or depth of a facility relative to sea level.

Page 21: ISO/IEC 11179 Part 4  Rules and Guidelines for the Formulation of Data Definitions

Good definitions promote the standardization and reuse of data elements, leading to data sharing and integration of information systems.
