sc 32 tutorial session
DESCRIPTION
SC 32 Tutorial Session. WG 2 eXtended Metadata Registry (XMDR) April 19, 2005. Bruce Bargmeyer, Lawrence Berkley National Laboratory University of California. 11179 Metadata Registries Extensions. Register (and manage) any semantics that are useful for managing data. - PowerPoint PPT PresentationTRANSCRIPT
1Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
SC 32 Tutorial Session
WG 2 eXtended Metadata Registry (XMDR)
April 19, 2005
Bruce Bargmeyer,
Lawrence Berkley National Laboratory
University of California
2Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
11179 Metadata RegistriesExtensions
Register (and manage) any semantics that are useful for managing data. E.g., this may include registering not only permissible
values (concepts), definitions, but may extend to registration of the full concept systems in which the permissible values are found.
E.g., may want to register keywords, thesauri, taxonomies, ontologies, axiomitized ontologies….
Support traditional data management and data administration
Lay Foundation for semantics based computing: Semantics Service Oriented Architecture, Semantic Grids, Semantics based workflows, Semantic Web ….
3Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
XMDR Project Background
Collaborative, interagency effort DOD, EPA, USGS, NCI, Mayo Clinic, LBNL …& others
Draws on and contributes to interagency cooperation on ecoinformatics
Involves international, national, state, local government agencies, other organizations
Recognizes great potential of semantics-based computing, management of metadata Improving collection, maintenance, dissemination, processing of
very diverse data structures Collaboration arises from needs for traditional data administration,
for sharing data across multiple organizations, for managing complex semantics associated with data, and for semantics based computing.
Many Players, Many Interests…Shared Context
4Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Purposes of XMDR Project
1. Propose revisions to 11179 Parts 2 & 3 (3rd Ed.) – to serve as the design for the next generation of metadata registries.
2. Demonstrate Reference Implementation – to validate the proposed revisions
Extend semantics management capabilities Enable registration of correspondences between multiple
concept systems and between concept systems and data Explore uses of terminologies and ontologies Systematize representation of concepts and relationships Enable registration of metadata for knowledge bases Adapt & test emerging semantic technologies
5Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Movement TowardSemantics Management
Going beyond traditional Data Standards and Data Administration
In addition to anchoring data with definitions, we want to process data and concepts based on context and relationships, and axioms
In addition to natural language, we want to capture semantics with more formal description techniques FOL, DL, Common Logic, OWL
Going beyond information system interoperability and data interchange to processing based on inferences and probabilistic correspondence between concepts found in
natural language (in the wild) and both data in databases and concepts found in concept systems.
6Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
11179 Semantics Management
Real World
ModelingMethods &
Terminology
Model Artifacts,Data Exchange,
Concept Systems
Applications
Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology
CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF)
SQL, Object, RDF, Semantic Web
Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CLMDR Semantics
Management:Data elements,Domains,Concepts, Terms …
7Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
SC 32 Standards & Projects
Real World
ModelingTools
Applications
Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology
CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF)
SQL, Object, RDF, Semantic Web
WG 3 - SQL
Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CL
WG 2 - MMF (19763), CL (24707) &MOF PAS submission
WG 2 - MMF (19763), CL (24707) &XMI PAS submission
WG 2 – 11179
WG 2 - 20944
MDR SemanticsManagement:Data elements,Domains,Concepts, Terms …
WG2 - 20943
ModelingMethods &
Terminology
Model Artifacts,Data Exchange,
Concept Systems
8Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
XMDR Focus
Methodologies: EDR, NIAM, O-O, RDF, UML, Ontology
CDIF, (UML- MDL, XMI), OWL, CL & SCL (KIF)
SQL, Object, RDF, Semantic Web
WG 3 - SQL
Metamodel constructs: CDIF Core, MOF, XML-Schema, RDFS, ODM, OWL, CL
WG 2 - MMF (19763), CL (24707) &MOF PAS submission
WG 2 - MMF (19763), CL (24707) &XMI PAS submission
XMDR project
XMDR project
XMDR project
Real World
ModelingMethods &
Terminology
Applications
WG 2 – 11179Parts 2 & 3
WG 2 - 20944
MDR SemanticsManagement:Data elements,Domains,Concepts, Terms …
WG2 - 20943
Model Artifacts,Data Exchange,
Concept Systems
9Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Drawing Together
Metadata Registry
Terminology Thesaurus Themes
DataStandards
Ontology GEMET
StructuredMetadata
UsersUsers
MetadataRegistries
Terminology
CONCEPT
Referent
Refers To Symbolizes
Stands For
“Rose”,“ClipArt”
10Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Data Element ConceptAfghanistan
Belgium
China
Denmark
Egypt
France
Germany
…………
Data ElementsData ElementsAFG
BEL
CHN
DNK
EGY
FRA
DEU
…………
ISO 3166English Name
ISO 31663-Numeric Code
004
056
156
208
818
250
276
…………
ISO 31663-Alpha Code
Afghanistan
Belgium
China
Denmark
Egypt
France
Germany
…………
Name:Context:Definition:Unique ID: 4572Value Domain:Maintenance Org.:Steward:Classification:Registration Authority:Others
Name:Context:Definition:Unique ID: 3820Value Domain:Maintenance Org.:Steward:Classification:Registration Authority:Others
Name:Context: Definition:Unique ID: 1047Value Domain:Maintenance Org.:Steward:Classification:Registration Authority:Others
Name: Country IdentifiersContext:Definition:Unique ID: 5769Conceptual Domain:Maintenance Org.:Steward:Classification:Registration Authority:Others
11Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
What is an ontology?
The subject of ontology is the study of the categories of things that exist or may exist in some domain. The product of such a study, called an ontology, is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language L for the purpose of talking about D. The types in the ontology represent the predicates, word senses, or concept and relation types of the language L when used to discuss topics in the domain D.
Building, Sharing, and Merging Ontologies-John F. Sowa
12Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Terminolocgical & Formal(Axiomatized) Ontologies
The difference between a terminological ontology and a formal ontology is one of degree: as more axioms are added to a terminological ontology, it may evolve into a formal or axiomatized ontology.
Cyc has the most detailed axioms and definitions; it is an example of an axiomatized or formal ontology. EDR and WordNet are usually considered terminological ontologies.
Building, Sharing, and Merging OntologiesJohn F. Sowa
13Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
An Axiom for an Axiomatized Ontology
Definition: The resource_cost_point predicate, cpr, specifies the cost_value, c, (monetary units) of a resource, r, required by an activity, a, upto a certain time point, t.
If a resource of the terminal use or consume states, s, for an activity, a, are enabled at time point, t, there must exist a cost_value, c, at time point, t, for the activity, a,that uses or consumes the resource, r. The time interval, ti = [ts, te], during which a resource is used or consumed byan activity is specified in the use or consume specifications as use_spec(r, a, ts, te, q) or consume_spec(r, a, ts, te, q) where activity, a, uses or consumes quantity, q, of resource, r, during the time interval [ts, te]. Hence,
Axiom: a, s, r, q, ts, te, (use_spec(r, a, ts, te, q) enabled(s, a, t)) ∀ ∧ ∨(consume_spec(r, a, ts, te, q) enabled(s, a, t))≡ c, cpr(a,c,t,r)∧ ∃
Cost Ontology for TOronto Virtual Enterprise (TOVE)
14Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Samples of Eco & Bio Graph Data
Nutrient cycles in microbial ecologies These are bipartite graphs, with two sets of nodes, microbes and reactants (nutrients), and directed edges indicating input and output relationships. Such nutrient cycle graphs are used to model the flow of nutrients in microbial ecologies, e.g., subsurface microbial ecologies for bioremediation.
Chemical structure graphs: Here atoms are nodes, and chemical bonds are represented by undirected edges. Multi-electron bonds are often represented by multiple edges between nodes (atoms), hence these are multigraphs. Common queries include subgraph isomorphism. Chemical structure graphs are commonly used in chemoinformatics systems, such as Chem Abstracts, MDL Systems, etc.
Sequence data and multiple sequence alignments . DNA/RNA/Protein sequences can be modeled as linear graphs
Topological adjacency relationships also arise in anatomy. These relationships differ from partonomies in that adjacency relationships are undirected and not generally transitive.
15Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Eco & Bio Graph Data (Continued)
Taxonomies of proteins, chemical compounds, and organisms, ... These taxonomies (classification systems) are usually represented as directed acyclic graphs (partial orders or lattices). They are used when querying the pathways databases. Common queries are subsumption testing between two terms/concepts, i.e., is one concept a subset or instance of another. Note that some phylogenetic tree computations generate unrooted, i.e., undirected. trees.
Metabolic pathways: chemical reactions used for energy production, synthesis of proteins, carbohydrates, etc. Note that these graphs are usually cyclic.
Signaling pathways: chemical reactions for information transmission and processing. Often these reactions involve small numbers of molecules. Graph structure is similar to metabolic pathways.
Partonomies are used in biological settings most often to represent common topological relationships of gross anatomy in multi-cellular organisms. They are also useful in sub-cellular anatomy, and possibly in describing protein complexes. They are comprised of part-of relationships (in contrast to is-a relationships of taxonomies). Part-of relationships are represented by directed edges and are transitive. Partonomies are directed acyclic graphs.
Data Provenance relationships are used to record the source and derivation of data. Here, some nodes are used to represent either individual "facts" or "datasets" and other nodes represent "data sources" (either labs or individuals). Edges between "datasets" and "data sources" indicate "contributed by". Other edges (between datasets (or facts)) indicate derived from (e.g., via inference or computation). Data provenance graphs are usually directed acyclic graphs.
16Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
A graph theoretic characterization
Readily comprehensible characterization of metadata structures
Graph structure has implications for: Integrity Constraint Enforcement Data structures Query languages Combining metadata sets Algorithms for query processing
17Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Definition of a graph
Graph = vertex (node) set + edge set Nodes, edges may be labeled Edge set = binary relation over nodes
cf. NIAM
Labeled edge set RDF triples (subject, predicate, object) predicate = edge label
Typically edges are directed
18Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example of a graph
infectious disease
influenza measles
is-ais-a
19Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Types of Metadata Graph Structures
Trees Partially Ordered Trees Ordered Trees Faceted Classifications Directed Acyclic Graphs Partially Ordered Graphs Lattices Bipartite Graphs Directed Graphs Cliques Compound Graphs
20Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Graph Taxonomy
Directed Graph
Directed Acyclic Graph
Graph
Undirected Graph
Bipartite Graph
Partial Order Graph
Faceted Classification
Clique
Partial Order Tree
Tree
Lattice
Ordered Tree
Note: not all bipartite graphsare undirected.
21Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Trees
In metadata settings trees are almost most often directed edges indicate direction
In metadata settings trees are usually partial orders Transtivity is implied (see next slide) Not true for some trees with mixed edge types. Not always true for all partonomies
22Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Tree
Oakland Berkeley
Alameda County
California
part-of
part-of part-of
part-of
part-of
Santa Clara San Jose
Santa Clara County
part-of
23Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Trees - cont.
Uniform vs. non-uniform height subtrees Uniform height subtrees
fixed number of levels common in dimensions of multi-dimensional data
models
Non-uniform height subtrees common terminologies
24Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Partially Ordered Trees
A conventional directed tree Plus, assumption of transitivity Usually only show immediate ancestors
(transitive reduction) Edges of transitive closure are implied Classic Example:
Simple Taxonomy, “is-a” relationship
25Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Partial Order Tree
Polio Smallpox
Infectious Disease
Disease
is-a
is-a is-a
is-a
is-a
Diabetes Heart disease
Chronic Disease
is-a
Signifies inferred is-a relationship
26Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Ordered Trees
Order here refers to order among sibling nodes (not related to partial order discussed elsewhere)
XML documents are ordered trees Ordering of “sub-elements” is to support classic
linear encoding of documents
27Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Ordered Tree
part-ofpart-of
part-of
Paper
Title page Section Bibliography
Note: implicit ordering relation among parts of paper.
28Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Faceted Classification
Classification scheme has mulitple facets Each facet = partial order tree Categories = conjunction of facet values (often written
as [facet1, facet2, facet3]) Faceted classification = a simplified partial order graph Introduced by Ranganathan in 19th century, as Colon
Classification scheme Faceted classification can be descirbed with
Description Logc, e.g., OWL-DL
29Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Faceted Classification
Wheeled Vehicle Facet
2 wheeled
4 wheeled
3 wheeled
is-a is-a is-a
Internal Combustion
Human Powered
Vehicle Propulsion Facet
Bicycle Tricycle MotorcycleAuto
is-a is-a
is-a
is-ais-a
is-ais-a
is-a
is-a
is-a is-a
30Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Faceted Classifications and Multi-dimensional Data Model
MDM – a.k.a. OLAP data model Online Analytical Processing data model Star / Snowflake schemas
Fact Tables fact = function over Cartesian product of
dimensions dimensions = facets
geographic region, product category, year, ...
31Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Directed Acylic Graphs
Graph: Directed edges No cycles No assumptions about transitivity (e.g., mixed edge
types, some partonomies) Nodes may have multiple parents
Examples: Partonomies (“part-of”) - transitivity is not always
true
32Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Directed Acyclic Graph
Wheeled Vehicle
2 Wheeled Vehicle
4 WheeledVehicle
3 WheeledVehicle
is-a is-a is-a
Internal Combustion
Vehicle
Human PoweredVehicle
PropelledVehicle
Bicycle Tricycle MotorcycleAuto
is-a is-a
is-a
is-ais-a
is-ais-a
is-a
is-a
is-a is-a
is-a is-a
Vehicle
33Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Partial Order Graphs
Directed acyclic graphs + inferred transitivity Nodes may have multiple parents Most taxonomies drawn as transitive reduction, transitive closure
edges are implied. Examples:
all taxonomies most partonomies multiple inheritance
POGs can be described in Description Logic, e.g., OWL-DL
34Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example: Partial Order Graph
Wheeled Vehicle
2 Wheeled Vehicle
4 WheeledVehicle
3 WheeledVehicle
is-a is-a is-a
Internal Combustion
Vehicle
Human PoweredVehicle
PropelledVehicle
Bicycle Tricycle MotorcycleAuto
is-a is-a
is-a
is-ais-a
is-ais-a
is-a
is-a
is-a is-a
is-a is-a
Vehicle
Dashed line = inferred is-a (transitive closure)
35Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Directed Graph
Generalization of DAG (directed acyclic graph)
Cycles are allowed Arises when many edge types allowed Example: UMLS
36Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Lattices
A partial order For every pair of elements A and B
There exists a least upper bound There exists a greatest lower bound
Example: The power set (all possible subsets) of a finite set LUB(A,B) = union of two sets A, B GLB(A,B) = intersect of two sets A,B
37Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example Lattice: Powerset of 3 element set
{a,b,c}
{c}{a} {b}
{b,c}{a,b} {a,c}
Denotes subset
Empty Set
38Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Lattices - Applications
Formal Concept Analysis synthesizing taxonomies
Machine Learning concept learning
39Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Bipartite Graphs
Vertices = two disjoint sets, V and W All edges connect one vertex from V and one
vertex from W Examples:
mappings among value representations mappings among schemas (entity/attribute, relationship) nodes in Conceptual
Graphs
40Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example Bipartite Graph
California
Massachusetts
Oregon
CA
MA
OR
States Two-letter state codes
41Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Clique
Clique = complete graph (or subgraph) all possible edges are present
Used to represent equivalence classes Typically, on undirected graphs
42Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example of Clique
California
CA
Calif.
CAL
Here edges denote synonymy.
43Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Compound Graphs
Edges can point to/from subgraphs, not just simple nodes
Used in conceptual graphs CG is isomorphic to First Order Logic Could be used to specify contexts for
subgraphs
44Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Example Compound Graph
Colin Powell claimed
Iraq had WMDs
45Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Conclusions
We can characterize metadata structure in terms of graph structures
Partial Order Graphs are the most common structure: used for taxonomies, partonomies support multiple inheritance, faceted classification implicit inclusion of inferred transitive closure
edges
46Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Axiomitized Ontology
Definition: The resource_cost_point predicate, cpr, specifies the cost_value, c, (monetary units) of a resource, r, required by an activity, a, upto a certain time point, t.
If a resource of the terminal use or consume states, s, for an activity, a, are enabled at time point, t, there must exist a cost_value, c, at time point, t, for the activity, a,that uses or consumes the resource, r. The time interval, ti = [ts, te], during which a resource is used or consumed byan activity is specified in the use or consume specifications as use_spec(r, a, ts, te, q) or consume_spec(r, a, ts, te, q) where activity, a, uses or consumes quantity, q, of resource, r, during the time interval [ts, te]. Hence,
Axiom: a, s, r, q, ts, te, (use_spec(r, a, ts, te, q) enabled(s, a, t)) ∀ ∧ ∨(consume_spec(r, a, ts, te, q) enabled(s, a, t))≡ c, cpr(a,c,t,r)∧ ∃
47Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Challenges
How to register & manage the various graph structures? DBMS, File systems ….
How to query the graph structures? XQuery for XML Poor to non-existent graph query languages
How to get adequate performance, even in high performance computing environment
User interface complexity How to manage semantic drift Versions How to interrelate graphs with other graphs and with
data Granularity at which to register metadata (then point
to greater detail elsewhere?)
48Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
How can Terminologies and Ontologies help Manage Metadata?
At the level of metadata instances in a registry, connect metadata entities via shared terms via automatic indexing of metadata words via text values from specific metadata elements
At the level of the 11179 (or other) metamodel, ontologies can help specify formal relationships is-a and part-of hierarchies, etc. Inheritance, aggregation, … for automatic searching of sub-classes & inverses to specify semantic pathways for indexing
49Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Major Tasks, Deliverables & Milestones
Initial Architecture Design
Present Proposed 11179 Part 2 Revisions to SC32
WG2 mtg in DC
Research and Development
Task/Deliverable
Test Implementation (External Users)
Prepare Draft Revision of 11179 Part 2 for SC32
mtg in Berlin
System Test & Evaluation (Internal Participants)
Identify, Select Metadata Sources
Identify, Select Technologies
Develop Project Plan
Gantt Chart Forthcoming
50Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
General Tasks/Intentions
Limited Query Optimization
Brief ISO/IEC L8
Documents will Recommend Choices
Task/Intention
Brief DOD Metadata Working Group
Brief IC Metadata Working Group
Limited Help Functions
Prioritized Content Registration
Limited User Interface - Initially
Will Seek to Promote Awareness
51Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Potential Standards/Technologies
DBMS Object, XML, Relational, RDF/Graph, Logic, Text, Document, Multimedia
Knowledge Representation Web Ontology Language (OWL) Common Logic (CL)
Middleware/Messaging Cocoon 2, Jini, CoABS, JMS, XMLBlaster, SOAP
XML [Semantic] Web Services Axis, JWSDP
Agent Development ABLE, JADE
Engines/Servers OMS (IBM), Federator/OMS (OWI) Jess
Open Source and Risk Tolerant
52Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Architecture Approach
Fully modular approach Exemplars:
Apache Web Server Eclipse IDE Protégé Ontology Editor
Benefits: numerous modules are relatively easy to implement clean separation of concerns and high reusability and portability tooling support required is minimal
53Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Ontology EditorProtege11179 OWL Ontology
XMDR Prototype Architecture: Initial Implemented Modules
MetadataValidator
AuthenticationService
MappingEngine
RegistryExternalInterface
Generalization Composition (tight ownership) Aggregation (loose ownership)
Jena, Xerces
Java
RetrievalIndex
FullTextIndex
Lucene
LogicBasedIndex
Jena, OWI KSRacer
RegistryStore
WritableRegistryStore
Subversion
54Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
XMDR Content Priority List
Phase 1
(V.A) National Drug File Reference Terminology (?)
DTIC Thesaurus (Defense Technology Info. Center Thesaurus)
NCI Thesaurus National Cancer Institute Thesaurus
NCI Data Elements (National Cancer Institute Data Standards Registry
UMLS (non-proprietary portions)
GEMET (General Multilingual Environmental Thesaurus)
EDR Data Elements (Environmental Data Registry)
ISO 3166 Country Codes – from EPA EDR
USGS Geographic Names Information System (GNIS)
55Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
XMDR Content Priority List
Phase 2LOINC Logical Observation Identifiers Names and Codes
ITIS Integrated Taxonomic Information System
Getty Thesaurus of Geographic Names (TGN)
SIC (Standard Industrial Classification System)
NAICS (North American Industrial Classification System)
NAIC-SIC mappings
UNSPSC (United Nations Standard Products and Services Codes)
EPA Chemical Substance Registry System
EPA Terminology Reference System
ISO Language Identifiers ISO 639-3 Part 3
IETF Language Identifiers RFC 1766
Units Ontology
56Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
XMDR Content Priority List
Phase 3HL7 Terminology
HL7 Data Elements
GO (Gene Ontology)
NBII Biocomplexity Thesaurus
EPA Web Registry Controlled Vocabulary
BioPAX Ontology
NASA SWEET Ontologies
NDRTF
57Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
Acknowledgementsand References
Frank Olken, LBNL Kevin Keck, LBNL John McCarthy, LBNL
59Open Forum 2003 on Metadata RegistriesOpen Forum 2005 on Metadata Registries
11179 Metadata RegistriesExtensions
Register (and manage) any semantics that are useful for managing data. E.g. Add semantic information beyond definitions link any concept found in concept systems or in the “wild”
(or clusters of concepts and relations) to data This may include all of the permissible values
(concepts) and the full concept systems in which the permissible values are found. E.g., may want to register keywords, thesauri, taxonomies,
ontologies, axiomitized ontologies…. Lay Foundation for semantics based computing:
Semantics Service Oriented Architecture, Semantic Grids, Semantics based workflows, Semantic Web …