ISO 16642 - a tutorial, Part 2: Representing data categories
ISO 16642 - a tutorial
Part 2: Representing data categories
TMF - Terminological Markup Framework
Laurent Romary - Laboratoire Loria
Why formalize DatCats?
Systematizing data category description:
– Notion of a Data Category Registry (DCR)
  • I need a data category: is it there?
    – Query by name, definition, etc.
Automating processes:
– Format control of TMLs
– Filters from one TML to GMT
Which model for DatCats?
Using XML:
– Coherence with TMF principles
– Using stylesheets to generate schemas and filters
Using RDF (Resource Description Framework):
– Intended format for representing metadata:
  • The description of a DatCat is metadata with regard to TMF
RDF - a quick presentation
Cf. other file
Data Categories
A Formal Description
Data Category Registry
[Diagram labels: DCRegistry; Description; VersionNumber (dcsd:VersionNumber); Data Category (dcsd:DataCategory, rdf:about)]
Data Category description
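[Diagram: a Data Category node and its description properties, each with the RDF property name shown on the slide]
• DCIdentifier: dcsd:DCIdentifier
• DCName: dcsd:DCName
• DCDefinition: dcsd:DCDefinition
• DCType (S, C): dcsd:DCType
• DCParent: dcsd:DCParent
• Content: dcsd:Content
• Locus: dcsd:Level
• DCExample: dcsd:DCExample
• DCComment: dcsd:DCComment
• DCAdmin: dcsd:DCAdmin
Slide stamp: Salt 2000-11-08/SEW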
Simple and complex DatCats
Complex data categories
– shall serve as field identifiers (not names) in databases and can have content. The datatype for this content shall be declared for each data category and can commonly take the form of different categories of text, defined data types (such as dates), and specified data domains, e.g., picklists comprising standardized permissible instances.
  » Example: /Part of Speech/
Simple data categories
– shall serve as the content of complex data categories.
  » Example: /Noun/, /Verb/, /Adjective/, etc.
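The distinction between the two kinds of data category can be sketched in a few lines of Python. This is only an illustration of the idea, not part of ISO 16642 or of the SALT tools: the class names `SimpleDatCat` and `ComplexDatCat` and the method `accepts` are hypothetical, and only the picklist case is checked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SimpleDatCat:
    """A simple data category: a permissible value such as /Noun/."""
    name: str


@dataclass
class ComplexDatCat:
    """A complex data category: a field identifier with a declared content type."""
    name: str
    data_type: str  # e.g. "plainText" or "picklist"
    picklist: List[SimpleDatCat] = field(default_factory=list)

    def accepts(self, value: str) -> bool:
        # For a picklist, only the listed simple data categories are valid content.
        if self.data_type == "picklist":
            return any(item.name == value for item in self.picklist)
        # Other content types are not validated in this sketch.
        return True


part_of_speech = ComplexDatCat(
    name="Part of Speech",
    data_type="picklist",
    picklist=[SimpleDatCat("Noun"), SimpleDatCat("Verb"), SimpleDatCat("Adjective")],
)
```

Here `part_of_speech.accepts("Noun")` is true while `part_of_speech.accepts("Pronoun")` is false, which mirrors the rule that simple data categories are the content of complex ones.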
Levels and content
[Diagram labels]
• Content: DataType (dcsd:DataType) and TargetType (dcsd:TargetType); references to other datcat(s) given as a list of references (rdf:Alt containing rdf:li items)
• Level/Loci: references to other datcat(s) given as a list of references (rdf:Alt containing rdf:li items)
Administrative properties
[Diagram: the DCAdmin node (dcsd:DCAdmin) of a Data Category and its properties]
• Status: dcsd:Status
• StatusDate: dcsd:StatusDate
• StatusNote: dcsd:StatusNote
• EditionDate: dcsd:EditionDate
• Source: dcsd:Source
• VariantNames: dcsd:VariantNames
  – ShortForm: dcsd:ShortForm
  – AdmittedName: dcsd:AdmittedName
  – ForbiddenName: dcsd:ForbiddenName
RDF Representation
/term/ - RDF description (1)
<dcsd:DataCategory dcsd:DCIdentifier="ISO12620A01"
                   dcsd:DCName="term"
                   dcsd:position="A.01"
                   dcsd:DCType="C">
  <dcsd:DCDefinition>A verbal designation of a general concept in a specific subject field</dcsd:DCDefinition>
  <dcsd:DCComment>
    <dcsd:sourceComment>For definition of related term, see ISO 1087-1, 3.4.3.</dcsd:sourceComment>
    <dcsd:conceptComment>Terms can consist of single words or be composed of multiword strings…</dcsd:conceptComment>
    <dcsd:Example>"radix" in annex C, figure C.1.</dcsd:Example>
    <dcsd:DictionnaryID>A.1</dcsd:DictionnaryID>
  </dcsd:DCComment>
/term/ - RDF description (2)
  <dcsd:Content dcsd:DataType="plainText"/>
  <dcsd:Level>
    <rdf:Alt>
      <rdf:li>TL</rdf:li>
      <rdf:li>TC</rdf:li>
    </rdf:Alt>
  </dcsd:Level>
  <dcsd:DCAdmin dcsd:OrgSource="ISO TC 37"
                dcsd:DocSource="ISO12620:1999"
                dcsd:subDate="2000-10-20 SEW"
                dcsd:registryComment="Prepared 2000-10-20"
                dcsd:Status="Accepted"/>
</dcsd:DataCategory>
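To show how such a description can be consumed by a program, here is a minimal sketch that reads an abridged copy of the /term/ description with Python's standard `xml.etree.ElementTree`. The slides do not give the namespace URI bound to the `dcsd` prefix, so `DCSD_NS` below is a placeholder assumption, and the helper names `q` and `summarize` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCSD_NS = "http://example.org/dcsd#"  # placeholder: the dcsd namespace URI is not given in the slides

# Abridged copy of the /term/ description, with namespace declarations added.
TERM_RDF = f"""
<dcsd:DataCategory xmlns:rdf="{RDF_NS}" xmlns:dcsd="{DCSD_NS}"
    dcsd:DCIdentifier="ISO12620A01" dcsd:DCName="term"
    dcsd:position="A.01" dcsd:DCType="C">
  <dcsd:DCDefinition>A verbal designation of a general concept in a specific subject field</dcsd:DCDefinition>
  <dcsd:Content dcsd:DataType="plainText"/>
  <dcsd:Level>
    <rdf:Alt><rdf:li>TL</rdf:li><rdf:li>TC</rdf:li></rdf:Alt>
  </dcsd:Level>
  <dcsd:DCAdmin dcsd:OrgSource="ISO TC 37" dcsd:DocSource="ISO12620:1999"
      dcsd:Status="Accepted"/>
</dcsd:DataCategory>
"""


def q(ns: str, local: str) -> str:
    """Build the '{namespace}local' name that ElementTree uses internally."""
    return f"{{{ns}}}{local}"


def summarize(xml_text: str) -> dict:
    """Collect the main properties of one data category description."""
    root = ET.fromstring(xml_text)
    content = root.find(q(DCSD_NS, "Content"))
    level = root.find(q(DCSD_NS, "Level"))
    admin = root.find(q(DCSD_NS, "DCAdmin"))
    return {
        "identifier": root.get(q(DCSD_NS, "DCIdentifier")),
        "name": root.get(q(DCSD_NS, "DCName")),
        "type": root.get(q(DCSD_NS, "DCType")),
        "definition": root.findtext(q(DCSD_NS, "DCDefinition")).strip(),
        "data_type": content.get(q(DCSD_NS, "DataType")),
        # The levels are the rdf:li alternatives listed under dcsd:Level.
        "levels": [li.text for li in level.iter(q(RDF_NS, "li"))],
        "status": admin.get(q(DCSD_NS, "Status")),
    }


summary = summarize(TERM_RDF)
```

The resulting dictionary gives direct access to the name, type, content datatype, permitted levels and status, which is the kind of information a registry query "by name, definition, etc." would need.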
/term type/ - RDF description (1)
<dcsd:DataCategory dcsd:DCIdentifier="ISO12620A0201"
                   dcsd:DCName="term type"
                   dcsd:position="A.02.01"
                   dcsd:DCType="C">
  <dcsd:DCDefinition>An attribute assigned to a term</dcsd:DCDefinition>
  <dcsd:DCComment>
    <dcsd:DictionnaryID>A.2.1</dcsd:DictionnaryID>
  </dcsd:DCComment>
  <dcsd:Content dcsd:DataType="picklist">
    <rdf:Alt>
      <rdf:li>ISO12620A020101</rdf:li>
      <rdf:li>ISO12620A020102</rdf:li>
      <rdf:li>ISO12620A020119</rdf:li>
    </rdf:Alt>
  </dcsd:Content>
/term type/ - RDF description (2)
  <dcsd:Level>
    <rdf:Alt>
      <rdf:li>TL</rdf:li>
      <rdf:li>TC</rdf:li>
    </rdf:Alt>
  </dcsd:Level>
  <dcsd:DCAdmin dcsd:OrgSource="ISO TC 37"
                dcsd:DocSource="ISO12620:1999"
                dcsd:subDate="2000-10-20 SEW"
                dcsd:registryComment="Prepared 2000-10-20"
                dcsd:Status="Accepted"/>
</dcsd:DataCategory>
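The picklist of /term type/ refers to other data categories by identifier, so a registry tool would typically want to check that each reference resolves. The following sketch, under the same assumptions as any reading of these slides (a placeholder `dcsd` namespace URI, hypothetical function names `picklist_refs` and `unresolved`), extracts the references and reports those missing from a given set of registered identifiers.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCSD_NS = "http://example.org/dcsd#"  # placeholder: the dcsd namespace URI is not given in the slides

# Abridged copy of the /term type/ description, with namespace declarations added.
TERM_TYPE_RDF = f"""
<dcsd:DataCategory xmlns:rdf="{RDF_NS}" xmlns:dcsd="{DCSD_NS}"
    dcsd:DCIdentifier="ISO12620A0201" dcsd:DCName="term type" dcsd:DCType="C">
  <dcsd:Content dcsd:DataType="picklist">
    <rdf:Alt>
      <rdf:li>ISO12620A020101</rdf:li>
      <rdf:li>ISO12620A020102</rdf:li>
      <rdf:li>ISO12620A020119</rdf:li>
    </rdf:Alt>
  </dcsd:Content>
</dcsd:DataCategory>
"""


def picklist_refs(xml_text):
    """Return the identifiers listed in a picklist, or [] for other content types."""
    root = ET.fromstring(xml_text)
    content = root.find(f"{{{DCSD_NS}}}Content")
    if content is None or content.get(f"{{{DCSD_NS}}}DataType") != "picklist":
        return []
    return [li.text for li in content.iter(f"{{{RDF_NS}}}li")]


def unresolved(refs, registered_ids):
    """Return the references that are not present in the registry."""
    return [ref for ref in refs if ref not in registered_ids]


refs = picklist_refs(TERM_TYPE_RDF)
missing = unresolved(refs, {"ISO12620A020101"})
```

With a registry that only knows `ISO12620A020101`, the two other identifiers are reported as missing, which is one concrete form of the question "I need a data category: is it there?".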
Actualizing a DatCat
TMF specific properties
Styling properties
[Diagram: the Style node (dcsd:Style) of a Data Category and its properties]
• StyleName: dcsd:StyleName, one of Simple, Element, Attribute, TypedElement, ValuedElement, TVElement
• ElementName: dcsd:ElementName
• AttributeName: dcsd:AttributeName
• TypeValue: dcsd:TypeValue
• Value: dcsd:Value (for 'Simple')
• AnchorInfo: dcsd:Anchor
• AnchorLevel
Attribute style description
• dcsd:StyleName="Attribute"
  – Conditions of use:
    • Not valid for annotations
  – Required properties:
    • dcsd:AttributeName
  – Example:
    • dcsd:AttributeName="id"
    • <anchorElement id="xx54893">…</anchorElement>
Element style description
• dcsd:StyleName="Element"
  – Required properties:
    • dcsd:ElementName
  – Example:
    • dcsd:ElementName="definition"
    • <definition>…</definition>
TypedElement style description
• dcsd:StyleName="TypedElement"
  – Required properties:
    • dcsd:ElementName, dcsd:TypeValue
  – Example:
    • dcsd:ElementName="termNote"
    • dcsd:TypeValue="partOfSpeech"
    • <termNote type="partOfSpeech">N</termNote>
ValuedElement style description
• dcsd:StyleName="ValuedElement"
  – Conditions of use:
    • Not valid for annotations
  – Required properties:
    • dcsd:ElementName
  – Example:
    • dcsd:ElementName="pos"
    • <pos value="noun"/>
TVElement style description
• dcsd:StyleName="TVElement"
  – Conditions of use:
    • Not valid for annotations
  – Required properties:
    • dcsd:ElementName, dcsd:TypeValue
  – Example:
    • dcsd:ElementName="free"
    • dcsd:TypeValue="pos"
    • <free type="pos" value="noun"/>
Simple style description
• dcsd:StyleName="Simple"
  – Conditions of use:
    • Express the value of simple data categories
  – Required properties:
    • dcsd:Value
  – Example:
    • dcsd:Value="Nom"
    • <pos>Nom</pos>
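The six styles described above can be read as serialization rules. The sketch below is one possible reading in Python, not the SALT implementation: the functions `render` and `simple_value` are hypothetical, styles are passed as plain dictionaries keyed by the property names without the `dcsd:` prefix, and the default anchor element name is taken from the Attribute example. Treating `pos` as an Element-style category in the last call is an assumption made only to reproduce the Simple example.

```python
import xml.etree.ElementTree as ET


def simple_value(style: dict) -> str:
    """Return the literal that a 'Simple' style assigns to a simple data category."""
    if style["StyleName"] != "Simple":
        raise ValueError("expected a Simple style")
    return style["Value"]


def render(style: dict, value: str, anchor: str = "anchorElement") -> str:
    """Serialize one value of a complex data category according to its style."""
    name = style["StyleName"]
    if name == "Attribute":
        # The value becomes an attribute of the anchor element.
        elem = ET.Element(anchor, {style["AttributeName"]: value})
    elif name == "Element":
        elem = ET.Element(style["ElementName"])
        elem.text = value
    elif name == "TypedElement":
        elem = ET.Element(style["ElementName"], {"type": style["TypeValue"]})
        elem.text = value
    elif name == "ValuedElement":
        elem = ET.Element(style["ElementName"], {"value": value})
    elif name == "TVElement":
        elem = ET.Element(
            style["ElementName"], {"type": style["TypeValue"], "value": value}
        )
    else:
        raise ValueError(f"unsupported style: {name}")
    return ET.tostring(elem, encoding="unicode")


typed = render(
    {"StyleName": "TypedElement", "ElementName": "termNote", "TypeValue": "partOfSpeech"},
    "N",
)
nom = render(
    {"StyleName": "Element", "ElementName": "pos"},
    simple_value({"StyleName": "Simple", "Value": "Nom"}),
)
```

Note that `ElementTree` writes empty elements with a space before the closing slash (for example `<pos value="noun" />`), which is equivalent XML to the form shown on the slides.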
Dealing with languages
Two types of languages
Working language
• The language used at a given place in a document, along the XML hierarchy
• Representation: xml:lang
Object language
• The language being described at a given place in the terminological entry (e.g. the one that characterizes the Language Section level)
• Representation: as a data category "language", with a narrow scope
Example - DXLT
<langSet lang="en" xml:lang="fr">
  <descrip type="definition">Une valeur entre 0 et 1 utilisée...</descrip>
  <tig>
    <term xml:lang="en">alpha smoothing factor</term>
    <termNote type="termType">fullForm</termNote>
  </tig>
</langSet>
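The difference between the two kinds of language can be made concrete with a short Python sketch over the DXLT fragment (quotes normalized). The working language is inherited down the XML hierarchy through `xml:lang`, while the object language is read from the `lang` attribute of the language section. The function name `working_languages` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DXLT = """
<langSet lang="en" xml:lang="fr">
  <descrip type="definition">Une valeur entre 0 et 1 utilisée...</descrip>
  <tig>
    <term xml:lang="en">alpha smoothing factor</term>
    <termNote type="termType">fullForm</termNote>
  </tig>
</langSet>
"""


def working_languages(elem, inherited=None):
    """Yield (element, working language), inheriting xml:lang from ancestors."""
    lang = elem.get(XML_LANG, inherited)
    yield elem, lang
    for child in elem:
        yield from working_languages(child, lang)


root = ET.fromstring(DXLT)
object_language = root.get("lang")  # the language the section is about
by_tag = {elem.tag: lang for elem, lang in working_languages(root)}
```

In this fragment the section describes English (`object_language` is `"en"`), the definition and the term note are written in French because they inherit `xml:lang="fr"`, and only the term itself overrides the working language with `"en"`.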
Example - GMT
<struct type="LS" xml:lang="fr">
  <feat type="language">en</feat>
  <feat type="definition">Une valeur entre 0 et 1 utilisée...</feat>
  <struct type="TL">
    <feat type="term" xml:lang="en">alpha smoothing factor</feat>
    <feat type="termType">fullForm</feat>
  </struct>
</struct>
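Comparing the DXLT and GMT examples suggests what a TML-to-GMT filter does. The sketch below is a hand-written Python approximation of such a filter, inferred only from these two slides; the real filters are described in the deck as generated XSLT. The mapping tables `STRUCT_TYPES` and `TYPED_FEATS` and the function `to_gmt` are hypothetical and cover just the elements that appear in the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

STRUCT_TYPES = {"langSet": "LS", "tig": "TL"}  # structural elements -> GMT level codes
TYPED_FEATS = {"descrip", "termNote"}          # feat type comes from the type attribute

DXLT = """
<langSet lang="en" xml:lang="fr">
  <descrip type="definition">Une valeur entre 0 et 1 utilisée...</descrip>
  <tig>
    <term xml:lang="en">alpha smoothing factor</term>
    <termNote type="termType">fullForm</termNote>
  </tig>
</langSet>
"""


def to_gmt(elem):
    """Convert one DXLT element (and its children) into GMT struct/feat elements."""
    if elem.tag in STRUCT_TYPES:
        out = ET.Element("struct", {"type": STRUCT_TYPES[elem.tag]})
        if XML_LANG in elem.attrib:
            out.set(XML_LANG, elem.attrib[XML_LANG])  # working language is kept as xml:lang
        if "lang" in elem.attrib:
            # The object language becomes a "language" data category.
            feat = ET.SubElement(out, "feat", {"type": "language"})
            feat.text = elem.attrib["lang"]
        for child in elem:
            out.append(to_gmt(child))
        return out
    feat_type = elem.get("type") if elem.tag in TYPED_FEATS else elem.tag
    out = ET.Element("feat", {"type": feat_type})
    if XML_LANG in elem.attrib:
        out.set(XML_LANG, elem.attrib[XML_LANG])
    out.text = elem.text
    return out


gmt = to_gmt(ET.fromstring(DXLT))
term_level = gmt.find("struct")
```

The design point illustrated here is that GMT keeps only two generic element names, `struct` and `feat`, and moves every TML-specific name into a `type` attribute, so one generic consumer can process any TML after filtering.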
Conclusion
– A general model for analysing and representing terminological data collections
– An underlying formalism expressed in XML and RDF
– Associated tools (Salt project):
  • DCSEditor
  • DCSBrowser
  • Automatic generation of XSLT filters and XML schemas from a given TML specification
Useful pointers
SALT project
– http://www.loria.fr/projets/SALT
– http://www.ttt.org/
The TMF site
– http://www.loria.fr/projets/TMF