Trends in Concept Modelling
Turning Issues into Solutions
How to Discipline a Cat
Sue Ellen Wright, Kent State University
What is ISOcat? – 1
• The implementation of the ISO TC 37 Data Category Registry, a Metadata Registry
• A knowledge resource containing, i.a., definitions for data category concepts used to annotate language resources
• A (potentially) authoritative concept database that can be used to anchor relations in external Relation Registries (RRs) and other knowledge resources
Neeri Conference, Helsinki, 2009-10-01 3
Semantic Issues for DCs
• What data element names occur in these data?
• What does the content of these DCs “mean” in a semantic sense?
• Can I utilize these data in my environment, especially across barriers of communities of practice?
Goals of ISOcat
• Supporting the reusability, integratability, and interoperability of data by defining data categories (data element concepts) as an instantiation of ISO 11179
• Trusting data in a climate of different communities of practice
• Community collaboration in defining the data categories used in language resources
Metadata Registries in 11179
• An ISO metadata registry consists of a hierarchy of "concepts" with associated properties for each concept. Concepts are similar to classes in object-oriented programming but without the behavioral elements. Properties are similar to class attributes. ISO standards require that each concept and property have a precisely worded data element definition.
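The class analogy above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ISO 11179 data model or the ISOcat schema: the names `Concept`, `Property`, and `ancestors` are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Property:
    """A property of a registry concept, comparable to a class attribute."""
    name: str
    definition: str

    def __post_init__(self) -> None:
        # Every property must carry a definition (hypothetical check).
        if not self.definition.strip():
            raise ValueError(f"property {self.name!r} needs a definition")


@dataclass
class Concept:
    """A registry concept: data only, no behaviour beyond validation."""
    name: str
    definition: str
    parent: "Concept | None" = None
    properties: list[Property] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every concept must carry a definition (hypothetical check).
        if not self.definition.strip():
            raise ValueError(f"concept {self.name!r} needs a definition")

    def ancestors(self) -> list[str]:
        """Walk up the hierarchy and return the names of all parent concepts."""
        names = []
        node = self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names
```

A concept created with an empty definition is rejected at construction time, which mirrors the requirement that nothing enters the registry undefined.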
What is ISOcat? – 2
• A social network designed to facilitate the creation of data category specifications for use in linguistic annotation schemes
• A forum for achieving consensus on data category names, definitions, permissible instances and data category selections for work groups and thematic domains
• A framework for standardizing a subset of these data categories and data category selections (e.g., tagsets)
ISOcat as an MDR with a history
• “Authority” and credibility are hampered by:
– Incorrect spellings, definitions, examples
– Narrow perspectives that ignore individual language specifics
– Failure to observe moderately uniform conventions
– Failure in the past to accommodate consensus among experts
Knowledge Resource Issues
• Flawed legacy data from the Syntax pilot DCR
– Lack of stable guidelines for attributes such as definitions & some names
– Introduction of data refinements, which require updating virtually all DCs
– Technical glitches that resulted in missing DCs
– Inevitable errors, coupled with inability to edit entries
– Lack of efficient group consensus mechanism
• Danger of similar issues in the future
Examples: Flawed SALT Input
• Critical standardized items not imported
• Recreated, but not marked as standard
/part of speech/ is standardized as per ISO 12620:1999, but shows up as private & candidate
Uncorrected Data Errors
• Incorrect definition for some languages
In languages that actually use the accusative case, it is frequently used for other purposes, particularly as an object of some prepositions, or to use a noun as an adverbial. (Alas, even Crystal can be wrong on occasion.)
Uncorrected Data Errors
• Discursive, rambling definitions; inappropriate example
Again, reliance on one source & lack of feedback from speakers of other languages result in misinformation. A better example: She fixed him a nice lunch.
Language-Related Issues
• English can talk about the accusative case but does not actually have one: it conflates (at least) dative and accusative into an objective case, for which there is no entry.
Conflicting Definitions
• The object is to define appellative nouns used as components of proper nouns
Uncorrected Data Errors
• Tautological definition, confusing note, lack of a clarifying example
Definitions should not simply restate the elements of the data category name.
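A rule like this can be checked mechanically, at least roughly. The sketch below is a hypothetical heuristic, not part of ISOcat or the DCR Guidelines: it flags a definition when, after removing a small illustrative stop-word list, it contains no word that is not already in the data category name.

```python
import re

# Small, illustrative stop-word list; a real check would need a fuller one.
STOP_WORDS = {
    "a", "an", "the", "of", "for", "in", "that", "which",
    "is", "are", "to", "and", "or",
}


def content_words(text: str) -> set[str]:
    """Lower-case the text, split it into words, and drop stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}


def looks_tautological(name: str, definition: str) -> bool:
    """Return True when the definition adds no content word beyond the name."""
    extra = content_words(definition) - content_words(name)
    return not extra
```

The comparison is on exact word forms, so a definition that swaps in a related form of a word from the name would not be flagged. A check of this kind can only prompt human review, not replace it.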
Uncorrected Data Errors
• Incorrect “language-independent” name (incorrect English name)
The correct name is participial adjective.
Solutions
• Data must be reliable and adhere to consistent rules in order to contribute to trust on the Web.
• DCR Guidelines (unavailable during Syntax phase) are now in place. http://www.isocat.org/manual/DCRGuidelines.pdf
Solutions
• Provide appropriate social networking environment & technical features
• Designed to avoid and correct similar discrepancies in the future
• Rationale:
– Individual experts can make mistakes.
– Group consensus verifies form and content.
– Multiple mother tongues contribute to broader understanding.
– Even non-standardized DCs benefit from consensus.
Roles in ISOcat – Individuals
• Guests
– Access, select, and output data
• Experts
– Above, plus save DCs, DCSs
– Create new DCs
– Share DCs
– Create & serve on Ad Hoc Groups, TDGs
– Coordinate Ad Hoc Group, chair TDGs
– Submit DCs for standardization
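The two individual roles amount to a small permission table in which experts inherit everything guests can do. The following is a minimal sketch of that idea. The `Action` names and the `may` helper are invented for illustration and do not describe how ISOcat implements access control.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical action names derived from the role lists above."""
    ACCESS = auto()
    SELECT = auto()
    OUTPUT = auto()
    SAVE = auto()
    CREATE_DC = auto()
    SHARE_DC = auto()
    JOIN_GROUP = auto()
    LEAD_GROUP = auto()
    SUBMIT_FOR_STANDARDIZATION = auto()


# Guests may only access, select, and output data.
GUEST_ACTIONS = frozenset({Action.ACCESS, Action.SELECT, Action.OUTPUT})

# Experts get the guest actions plus everything else on the list.
EXPERT_ACTIONS = GUEST_ACTIONS | frozenset({
    Action.SAVE,
    Action.CREATE_DC,
    Action.SHARE_DC,
    Action.JOIN_GROUP,
    Action.LEAD_GROUP,
    Action.SUBMIT_FOR_STANDARDIZATION,
})

PERMISSIONS = {"guest": GUEST_ACTIONS, "expert": EXPERT_ACTIONS}


def may(role: str, action: Action) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, frozenset())
```

Building the expert set as a union keeps the "above, plus" relationship explicit, so a change to the guest set carries over automatically.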
Roles in ISOcat
• Ad hoc groups – informal groups of experts assigned by individual experts
– Comment on and reach consensus on DCs in shared space
– Submit (if desired) DCs for standardization
• Semi-formal ad hoc groups:
– LISA/OSCAR group, others (DITA? DARWIN? XLIFF?)
– CLARIN work groups
– ISOcat work groups for other major projects
• Informal, truly ad hoc groups
Roles in ISOcat
• Thematic Domain Groups
– Formal groups appointed by ISO TC 37 P members & TDG Chairs
• TDG Chairs
– Manage DC evaluation process per ISO 12620
• DCR Board members
– Conduct DC validation & harmonization process
QA Scenario 1, Phase 1
• Expert A spots perceived error in DC belonging to Expert B or to a TDG
• Expert A clones currently locked DC in his/her own workspace
• Here s/he can propose editorial corrections in the cloned DC
• Expert A invites Expert B to share clone
• Expert A informs TDG and shares clone
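The clone-and-share steps of this scenario can be sketched as follows. This is a simplified model under stated assumptions: the `DataCategory` fields and the helpers `clone_for_review`, `propose_definition`, and `share` are hypothetical and are not taken from the ISOcat implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DataCategory:
    """Pared-down DC record; all field names are assumptions for this sketch."""
    name: str
    definition: str
    owner: str
    locked: bool = True
    cloned_from: "DataCategory | None" = None
    shared_with: set[str] = field(default_factory=set)


def clone_for_review(dc: DataCategory, reviewer: str) -> DataCategory:
    """Copy a locked DC into the reviewer's workspace as an editable clone."""
    return DataCategory(
        name=dc.name,
        definition=dc.definition,
        owner=reviewer,
        locked=False,
        cloned_from=dc,
    )


def propose_definition(clone: DataCategory, new_definition: str) -> None:
    """Record an editorial correction; locked DCs cannot be edited."""
    if clone.locked:
        raise PermissionError(f"{clone.name!r} is locked; clone it first")
    clone.definition = new_definition


def share(clone: DataCategory, *parties: str) -> None:
    """Invite the original owner and/or a TDG to view the clone."""
    clone.shared_with.update(parties)
```

The original entry is never modified: the proposed correction lives only in the clone until the owner or group decides what to do with it.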
QA Scenario 2, Phase 1
• TDG chair/Ad Hoc Group leader creates a DCS for review
– Selection from current DCs or
– Creation of new DCs
– Profile change management
• DCS (with its DCs) assigned to a pre-defined group (ad hoc or TDG)
Profiles versus DCSs
• Profile membership is part of the DC specification
– the profile indicates the thematic domain of the DC
– the profile view in the UI is created by a query
– there are a limited number of profiles
• A DCS is a collection of DCs
– hand picked by a user for a specific purpose
– can contain DCs from various profiles
– there can be an unlimited number of DCSs
• There isn’t (yet) a profile specific view on a DCS
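The difference between the two can be shown in a short sketch: a profile view is computed by querying a field stored on each DC, whereas a DCS is an explicit, hand-assembled list. The class and function names, and the sample profile labels in the usage note, are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataCategory:
    """Pared-down DC: profile membership is stored on the DC itself."""
    name: str
    profiles: frozenset[str]


def profile_view(registry: list[DataCategory], profile: str) -> list[DataCategory]:
    """A profile view is the result of a query over the whole registry."""
    return [dc for dc in registry if profile in dc.profiles]


@dataclass
class DataCategorySelection:
    """A DCS is a hand-picked collection that may span several profiles."""
    name: str
    members: list[DataCategory] = field(default_factory=list)

    def profiles_covered(self) -> set[str]:
        """Collect every profile that any member DC belongs to."""
        covered: set[str] = set()
        for dc in self.members:
            covered |= dc.profiles
        return covered
```

For example, a user could put one DC tagged with a "terminology" profile and another tagged with a "morphosyntax" profile into the same `DataCategorySelection`, which no single `profile_view` call would return together.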
QA Scenario 3, Phase 1
• Any role (individual expert, group, DCRB member) identifies the need to harmonize DCs between TDGs or across communities of practice within a TDG
• DC or DCs collected, cloned
• Discussion group assigned & notified
• Multiple ad hoc or thematic domain groups may have to interact.
All QA Scenarios, Phase 2
• Wiki and/or forum based discussion
• Informal consensus or formal balloting
• Correction according to above decision
• Harmonization issues
– Option 1: DCs merged to form one, perhaps with multiple profile options
– Option 2: Multiple DCs, with clear indication of differences that justify doublettes
– Option 3: Build external RR linking doublettes
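Option 3 keeps both DCs untouched and records the link between them outside the registry. One minimal way to picture such an external Relation Registry is as a set of triples. The sketch below is hypothetical: the `RelationRegistry` class, the relation label, and the identifiers in the usage note are invented for the example and do not reflect an actual RR format.

```python
from dataclasses import dataclass, field


@dataclass
class RelationRegistry:
    """External store of (dc_a, relation, dc_b) triples; DCs stay unchanged."""
    relations: set[tuple[str, str, str]] = field(default_factory=set)

    def link(self, dc_a: str, relation: str, dc_b: str) -> None:
        """Record that two data categories are related."""
        self.relations.add((dc_a, relation, dc_b))

    def related(self, dc_id: str) -> set[str]:
        """Return the ids linked to dc_id, regardless of link direction."""
        found: set[str] = set()
        for a, _, b in self.relations:
            if a == dc_id:
                found.add(b)
            elif b == dc_id:
                found.add(a)
        return found
```

Calling `link("terminology:partOfSpeech", "doubletteOf", "morphosyntax:partOfSpeech")` would let a tool starting from either entry discover the other through `related`, while each DC keeps its own value domain.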
Justified Doublettes – Terminology
• The value domain for /part of speech/ in Terminology is very short.
Doublettes – Morphosyntax PoS
• The value domain for morphosyntax is extremely large.
Tasks
• Enable the retention of historical data category specifications while facilitating ongoing revisions
• Integration of a variety of social networking features
– Wiki and forum add-ons
– Internal messaging system
– Link to external mail
• Creation, consensus, revision, harmonization
Thanks for your attention!
Come play with the cat!
http://www.isocat.org/