Terminology standards – enhancing language
ISO/TC 37 Semantic Interoperability
ISO/TC 37 Secretariat
c/o Infoterm
Christian Galinski
Bamako (Mali) 2005-05-06/07
ISO/TC 37 – Bamako 2005-06/07
Overview
• UNESCO’s IFAP Area 4
• IFAP UNESCO and multilinguality
• Advocating open access solutions
• Language in industry
• eContent development
• Global semantic interoperability Standards for ...
• Terminology standardization
• Terminology? Content entities
• Terminology eContent
• Terminology in ISO/TC 37
• + Language resources & LR management
• + Content resources
• Standardization of terminological principles and methods
• ISO/TC 37
• ISO/TC 37/SC 1 ~ 4
• ISO/TC 37 Outlook
• Semantic interoperability – HOW?
What is terminology?
The description of the specialized vocabulary of an application domain
Cf. Eugen Wüster: conceptual view
knowledge representation at concept level
• Monolingual or multilingual
• Mainly nouns (incl. multi-word nominal units), some verbs, adjectives and adverbs
• A strong yet practical simplification of lexical description
• Increasing occurrence of non-verbal knowledge representations
IFAP Areas of intervention
What are IFAP’s areas of intervention?
• Area 1: Development of international, regional and national information policies
• Area 2: Development of human resources and capabilities for the information age
• Area 3: Strengthening institutions as gateways for information access
• Area 4: Development of information processing and management tools and systems (Multilingualism) standards
ISO/TC 37 methodology standards:
• terminology
• language resources (at the level of concepts)
• other content entities (at the level of concepts)
UNESCO and multilinguality
Promoting a wider, more equitable access to information (« Recommendation on the promotion of multilingualism and universal access to Cyberspace » / Initiative B@bel)
Raising awareness of issues of equitable access and multilingualism
Encouraging Member States to
• Develop strong policies which promote and facilitate language diversity on the Internet (Guidelines for Terminology Policies)
• Create widely available online tools and applications (such as terminologies, automatic translators, dictionaries) for content in local languages
• Share best practices and information (ISO/TC 37)
Advocating open access solutions
“Member States and international organizations should encourage open access solutions including the formulation of technical and methodological standards for information exchange, portability and interoperability, as well as online accessibility of public domain information on global information networks.” (UNESCO Recommendation on Multilingualism and Access to Cyberspace)
“Governments should promote the development and use of open, interoperable, non-discriminatory and demand-driven standards.” (WSIS Action Plan)
Open source software? + Open content?
Language in industry
Exchange of content entities: e.g. entry in a product catalogue
• Name of company (® enterprise)
• Name of product (model) (™ enterprise)
• Generic name of product (e.g. © Harmonized System)
• Class (name under which the product falls) (e.g. © eCl@ss)
• Verbal/textual description (© enterprise)
• Picture (© rights owner)
• Technical data
  • (unified) branch properties (e.g. © OAGi)
  • Standardized characteristics (e.g. © DIN)
  • Enterprise product specific data (e.g. for collaborative business)
  • Enterprise internal data (maybe confidential/secret)
225/55/16 V
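The catalogue entry listed on this slide can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, field names, company name and product name are hypothetical and do not come from any standard or from the presentation; only the grouping of fields and the designation "225/55/16 V" are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TechnicalData:
    """Technical data of a catalogue entry, grouped as on the slide (hypothetical names)."""
    branch_properties: Dict[str, str] = field(default_factory=dict)
    standardized_characteristics: Dict[str, str] = field(default_factory=dict)
    product_specific: Dict[str, str] = field(default_factory=dict)
    internal: Dict[str, str] = field(default_factory=dict)  # may be confidential

    def shareable(self) -> Dict[str, Dict[str, str]]:
        """Return copies of every group except the enterprise-internal data."""
        return {
            "branch_properties": dict(self.branch_properties),
            "standardized_characteristics": dict(self.standardized_characteristics),
            "product_specific": dict(self.product_specific),
        }


@dataclass
class CatalogueEntry:
    """One product catalogue entry as a content entity (hypothetical model)."""
    company_name: str
    product_name: str
    generic_name: str
    product_class: str
    descriptions: Dict[str, str] = field(default_factory=dict)  # language code -> text
    picture_uri: Optional[str] = None
    technical_data: TechnicalData = field(default_factory=TechnicalData)


# Invented example values; only the size designation appears on the slide.
entry = CatalogueEntry(
    company_name="Example Tyres Ltd",
    product_name="RoadGrip X",
    generic_name="pneumatic tyre",
    product_class="tyres",
    descriptions={
        "en": "Summer tyre for passenger cars",
        "fr": "Pneu été pour voitures particulières",
    },
    technical_data=TechnicalData(
        standardized_characteristics={"size designation": "225/55/16 V"},
        internal={"purchase price": "confidential"},
    ),
)

# Data intended for exchange between organizations leaves out the internal group.
exchange_payload = entry.technical_data.shareable()
```

Keeping the textual description keyed by language code is one possible way to treat the entry as multilingual from the outset; other designs are equally conceivable.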
eContent DEVELOPMENT
Workflow management for content development: net-based, distributed, cooperative creation of structured content
CO-OPERATION
INTEROPERABILITY
STANDARDIZATION
Re-use in applications (based on the “single-source” principle):
• eLearning
• eGovernment
• eHealth
• eBusiness
• other e...s
multilingual, multimodal, multimedia
complying with: multi-channel output, accessibility requirements
THE CHALLENGE: (user point-of-view)
• throughout the enterprise/organization: requested e.g. in e-government
• between enterprises/organizations: requested by the market
• within industry consortia: requested by industry branches
• between industry consortia: ??? (urgently needs harmonization and especially open standards)
• between different e…s: requested by the user
• between different language communities: requested by the end user
within the standardization world
Global Semantic Interoperability
STANDARDS FOR:
hw sw methodology standards
• Technology: ITU, ISO, IEC, industry
• Business models: UN/ECE, ISO, industry
• “Language”: ISO/TC 37, research consortia
• Transfers/transactions: ITU, UN/ECE, industry
• Standards*: MoU/MG – why?
• Content: ? Methodology!!! semantic interoperability
• Legal issues: ?
*Standards should be examined as to whether they support, allow or hinder multilinguality and cultural diversity (very important for SMEs) and semantic interoperability at large.
Terminology standardization
Standardization of terminologies
• Terminological data
  • Linguistic and non-linguistic representations
    • Designations: term, abbreviation, graphic symbol, formula, acoustic symbol, etc.
    • Descriptions: definition, explanation, non-linguistic [descriptive] representation, etc.
  • Source-related data
  • Data management related data (field, record, holding)
  • Classification (multiple)
• Terminology-related data: names, phraseology, ...
Standardization of terminological principles and methods
generic for many types of content entities
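The kinds of terminological data listed above can be sketched as a concept-oriented record. This is an illustrative model only: the class and field names are assumptions made for this example and are not data categories defined by any ISO/TC 37 standard, and the definition text is invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Designation:
    """A term, abbreviation, graphic symbol, formula, etc. (hypothetical model)."""
    kind: str                       # e.g. "term", "abbreviation", "graphic symbol"
    value: str
    language: Optional[str] = None  # None for non-linguistic designations
    source: Optional[str] = None    # source-related data


@dataclass
class Description:
    """A definition, explanation or other descriptive representation."""
    kind: str                       # e.g. "definition", "explanation"
    text: str
    language: str
    source: Optional[str] = None


@dataclass
class ConceptEntry:
    """One concept with its designations, descriptions and classifications."""
    entry_id: str                   # data-management-related data
    designations: List[Designation] = field(default_factory=list)
    descriptions: List[Description] = field(default_factory=list)
    classifications: List[str] = field(default_factory=list)  # multiple allowed

    def terms(self, language: str) -> List[str]:
        """Return all designations of kind 'term' in the given language."""
        return [d.value for d in self.designations
                if d.kind == "term" and d.language == language]


# Invented example entry; the definition is a placeholder, not a standardized one.
entry = ConceptEntry(
    entry_id="c-001",
    designations=[
        Designation("term", "language resource", "en"),
        Designation("abbreviation", "LR", "en"),
        Designation("term", "ressource linguistique", "fr"),
    ],
    descriptions=[
        Description("definition", "placeholder definition for illustration", "en"),
    ],
    classifications=["language resource management", "terminology"],
)

english_terms = entry.terms("en")
```

Grouping all designations under one concept, rather than one record per word, reflects the conceptual view mentioned earlier in the presentation.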
Terminology? content entities
Terminology? knowledge representations
• Nomenclature, taxonomy, typology, partonomy, ...
• Glossary, vocabulary, ...
• Terminological phraseology
• Graphical symbols and other non-linguistic representations?
• Properties, characteristics, attributes, ...
• Ontology
• Names? to be further studied
+ closely related:
Thesauri, classification schemes, keywords
Encyclopedic (knowledge) entries
• Knowledge-enriched terminology entries
• Names, proper names, ...
Ontologies, topic maps, ...
ONE methodology
Terminology eContent
embedded terminology (or combination of terminology + …)
• Texts: translation, localization, internationalization…
• Speech: communication…
• Image: CAD/CAM…
• Multimedia: video, presentations…
knowledge-rich terminology
• Encyclopedic knowledge: Wikipedia…
• “Knowledge” management: incl. true “content management”
  • document management
  • communication management
  • information management
“popularized” terminology
“Terminology and other language and content resources”
ONE methodology
Terminology today
Given its pervasive occurrence in all (written or spoken) domain communication, terminology today has to be considered an economic factor, especially in
• product data description and management (incl. eCatalogues and product classification)
• quality management
• inter-cultural aspects of management and marketing
• translation and localization
• information, documentation, software development
• knowledge transfer, teaching and training, …
Multilinguality and cultural diversity
terminology science as a field of fundamental research as well as applied R&D
impact on standardization
Terminology in ISO/TC 37
Multifunctional nature of terminology:
• Terminology as knowledge representation
• Terminologies as means of domain communication
• Terminologies as means of access to other kinds of information (objects)
• Terminologies as means of knowledge ordering at micro-level
+ Language resource management
Language resources:
• Text corpora tagging (on the basis of grammar models)
• Lexicographical data
  • Words
  • Collocations
  • Morphology
• Terminology
• Speech data
LR management:
• Input / import
• Metadata (incl. bundling/bindings etc.)
• Data modelling & metamodel(s)
• Exchange / interoperability
• etc.
+ other kinds of content entities
Textual & non-linguistic types of content:
• Audio information (e.g. read-out written content)
• AV information (e.g. sign language)
• Multimedia information
• Haptic information (e.g. in “intelligent cars”)
• …
Increasingly different (technical) types of content co-occur or are embedded in each other or are combined with each other – e.g. traffic telematics
ISO/TC 37 – Standardization of terminological principles and methods
• Fundamental principles
• Vocabulary of terminology
• Terminography
• Language resource management
• Terminology work (especially systematic ~~)
• Applications based on terminology methods
• Content management? eContent mContent
  • Multilingual, multimodal, multimedia, universal accessibility, multi-channel
  • Re-usability interoperability/ies
  • Resource-sharing peer2peer
ISO/TC 37
Old title: Terminology and other language resources
Old scope: Standardization of principles, methods and applications relating to terminology and other language resources
New title: Terminology and other language and content resources
New scope: Standardization of principles, methods and applications relating to terminology and other language and content resources in the contexts of multilingual communication and cultural diversity
As is the case with terminologies, language resources in general have to be considered as multilingual, multimedia and multimodal from the outset.
Generic fundamental standards for all activities involving language
ISO/TC 37/SC 1 (1)
Title: Principles and methods
Old scope: Standardization of basic principles and methods for developing scientific and technical terminologies and other language resources
New scope: ??? still under discussion
ISO/TC 37/SC 1 prepares the meta-standards for the documents prepared by ISO/TC 37/SCs 2, 3 and 4, which cannot be consistent and coherent without these standards. The same applies to the documentation of content management in organizations.
ISO/TC 37/SC 1 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 1:
• ISO 704:2000 Terminology work – Principles and methods
• ISO 860:1996 Terminology work – Harmonization of concepts and terms
• ISO 1087-1:2000 Terminology work – Vocabulary – Part 1: Theory and application
The following standards are under preparation:
• ISO/CD 704 Terminology work – Principles and methods
• ISO/CD 860 Terminology work – Harmonization of concepts and terms
• ISO/PWI 1087-1 Terminology work – Vocabulary – Part 1: Theory and application
• ISO/WD 22134 Practical guide for socioterminology
ISO/TC 37/SC 2 (1)
Title: Terminography and lexicography
New scope: Standardization of terminological and lexicographical working methods, procedures, coding systems, workflows, and cultural diversity management, as well as related certification schemes
Tens of thousands of terminology commissions, committees and other terminological entities (especially terminology standardizing SCs and WGs within the standardization framework) are using ISO/TC 37/SC 2 standards. This indirectly improves the overall degree of re-usability and interoperability of the resulting data and documents.
ISO/TC 37/SC 2 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 2:
• ISO 639-1:2002 Codes for the representation of names of languages – Part 1: Alpha-2 code
• ISO 639-2:1998 Codes for the representation of names of languages – Part 2: Alpha-3 code
• ISO 1951:1997 Lexicographical symbols and typographical conventions for use in terminography
• ISO 10241:1992 International terminology standards – Preparation and layout
• ISO 12199:2000 Alphabetical ordering of multilingual terminological and lexicographical data represented in the Latin alphabet
• ISO 12616:2002 Translation-oriented terminography
• ISO 15188:2001 Project management guidelines for terminology standardization
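As a small illustration of how the alpha-2 and alpha-3 language codes relate, the sketch below maps between them for a handful of languages. The table is deliberately tiny and the function names are hypothetical; a real application would load the full code lists rather than hard-code them.

```python
# Alpha-2 (ISO 639-1) to alpha-3 (ISO 639-2, terminologic form) for a few languages.
ALPHA2_TO_ALPHA3 = {
    "en": "eng",  # English
    "fr": "fra",  # French
    "de": "deu",  # German
    "ar": "ara",  # Arabic
    "bm": "bam",  # Bambara
}

# ISO 639-2 defines a second, bibliographic alpha-3 code for some languages.
BIBLIOGRAPHIC_TO_TERMINOLOGIC = {
    "fre": "fra",
    "ger": "deu",
}

ALPHA3_TO_ALPHA2 = {a3: a2 for a2, a3 in ALPHA2_TO_ALPHA3.items()}


def to_alpha3(code: str) -> str:
    """Normalize an alpha-2 or alpha-3 code to the terminologic alpha-3 form.

    Raises KeyError for alpha-2 codes missing from the small table above;
    alpha-3 codes that are not known bibliographic variants pass through unchanged.
    """
    code = code.strip().lower()
    if len(code) == 2:
        return ALPHA2_TO_ALPHA3[code]
    return BIBLIOGRAPHIC_TO_TERMINOLOGIC.get(code, code)


def to_alpha2(code: str) -> str:
    """Return the alpha-2 code for an alpha-2 or alpha-3 input (KeyError if unknown)."""
    return ALPHA3_TO_ALPHA2[to_alpha3(code)]


french_alpha3 = to_alpha3("FR")
french_alpha2 = to_alpha2("fre")
```

Normalizing every incoming code to one form before comparison is one simple way to keep language tags interoperable across data sets that mix the code sets.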
ISO/TC 37/SC 2 (3)
The following standards are under preparation:
• ISO/CD 639-3 Codes for the representation of names of languages – Part 3: Alpha-3 code for comprehensive coverage of languages
• ISO/WD 639-4 Codes for the representation of names of languages – Part 4: Implementation guidelines and general principles for language coding
• ISO/WD 639-5 Codes for the representation of names of languages – Part 5: Alpha-3 code for language families and groups
• ISO/CD 639-6 Codes for the representation of names of languages – Part 6: Extension coding for language variation
• ISO/DIS 1951 Presentation/representation of entries in dictionaries
• ISO/CD 10241-1 Terminological entries in standards – Part 1: General requirements
• ISO/AWI 10241-2 Terminological entries in standards
• ISO 12615 Bibliographic references and source identifiers for terminology
• ISO/PWI TR 22128 Quality assurance guidelines for terminology products
• ISO/PWI 22130 Additional language coding
• ISO/NP 23185 Assessment and benchmarking of terminological holdings
ISO/TC 37/SC 3 (1)
Old title: Computer applications for terminology
New title: Terminology management systems and content interoperability
New scope: Standardization of principles and requirements for semantic interoperability, terminology and content management systems, and knowledge ordering tools
Software developers use the documents of ISO/TC 37/SC 3 to design terminology management systems (TMS) or terminology management modules to be integrated into content management as well as information and knowledge management systems. In this way the terminological principles and methods (provided by ISO/TC 37/SC 1) are directly integrated as ‘defaults’ into concrete system design for handling all kinds of information.
ISO/TC 37/SC 3 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 3:
• ISO 1087-2:2000 Terminology work – Vocabulary – Part 2: Computer applications
• ISO 6156:1987 Magnetic tape exchange format for terminological/lexicographical records (MATER) – withdrawn
• ISO 12200:1999 Computer applications in terminology – Machine-readable terminology interchange format (MARTIF) – Negotiated interchange
• ISO 12620:1999 Computer applications in terminology – Data categories
• ISO 16642:2003 Computer applications in terminology – Terminological markup framework
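To give a rough idea of what a machine-readable terminological entry might look like, the sketch below serializes one concept as nested XML: an entry containing language sections, each containing term sections. The element and attribute names are invented for illustration and are not the markup defined by MARTIF or by any format conforming to the terminological markup framework; the sketch assumes only Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_entry(entry_id: str, terms_by_language: Dict[str, List[str]]) -> ET.Element:
    """Build a nested entry: entry -> languageSection -> termSection -> term.

    All element names here are hypothetical and chosen for readability.
    """
    entry = ET.Element("entry", {"id": entry_id})
    # Sort languages so the serialized output is stable between runs.
    for language, terms in sorted(terms_by_language.items()):
        lang_sec = ET.SubElement(entry, "languageSection", {"lang": language})
        for term in terms:
            term_sec = ET.SubElement(lang_sec, "termSection")
            ET.SubElement(term_sec, "term").text = term
    return entry


entry = build_entry("c1", {"en": ["terminology"], "fr": ["terminologie"]})
xml_text = ET.tostring(entry, encoding="unicode")

# Reading the data back shows that the nesting survives a round trip.
parsed = ET.fromstring(xml_text)
french_term = parsed.find("languageSection[@lang='fr']/termSection/term").text
```

Keeping one entry per concept and one section per language means a new language can be added without touching the existing sections.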
ISO/TC 37/SC 3 (3)
The following standards are under preparation:
• ISO/PWI TR 12618 Computational aids in terminology – Design, implementation and use of terminology management systems
• ISO/CD 12620-1 Computer applications in terminology – Data categories – Part 1: Model for description and procedures for maintenance of data category registries for language resources
• ISO/CD 12620-2 Computer applications in terminology – Data categories – Part 2: Terminological data categories
ISO/TC 37/SC 4 (1)
Title: Language resource management
Scope: Standardization of specifications for computer-assisted language resource management
Given the fact that
• linguistic infrastructures are being established or reinforced as part of the rapidly evolving information and communication society;
• professional activities involving language resource sharing and standardization are increasing in diverse areas: governmental or non-governmental organizations, public or private institutions, educational institutions, commercial enterprises, etc.;
• both globalization and localization necessitate multilingual communication;
there is an increasing need for new standardization as well as urgent recognition of existing de facto standards and their transformation into International Standards.
ISO/TC 37/SC 4 (2)
The following standards are under preparation:
• ISO/AWI 21829 Terminology for language resources
• ISO/CD 24610-1 Language resource management – Feature structures – Part 1: Feature structure representation
• ISO/WD 24611 Language resource management – Morphosyntactic annotation framework
• ISO/WD 24612 Language resource management – Linguistic annotation framework
• ISO/WD 24613 Language resource management – Lexical markup framework
• ISO/AWI 24614-1 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 1: General principles and methods
• ISO/AWI 24614-2 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 2: Word segmentation for Chinese, Japanese and Korean
• ISO/NP 24614-3 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 3: Word segmentation for other languages
Basic principles and requirements concerning multilingual e/m-content development, data categories/metadata, data modelling, rules for repositories (maintained in MAs/RAs/Reg’s)
State-of-the-art
METHODOLOGY / APPLICATIONS
[Diagram; recoverable layers:]
• (family of) metamodels*: ISO 16642*
• Data models**: ISO 12200**, eBusiness**, other e...s**
• Data categories***: ISO 12620***
• Domain data dictionaries*** (DDDs***)
* ISO 16642 TMF; ISO 10303-11 EXPRESS; ISO 10303-21 SDAI; …
** ISO 12200 MARTIF; ISO 13584-42 PLIB ~ IEC 61360-2
*** ISO 12620 Data categories; ISO 13584-511 Fastener dictionary; IEC 61360-4 Core dictionary; …
Semantic interoperability standards
• Content-related requirements
• Workflow methodology
• Metadata
• Metadata repositories
• Data modelling principles and requirements
• Micro data models
• Metamodels
• Content repositories
• Federation of repositories
• …
CONFERENCES
• Terminology Summer School – Cologne (Germany) 2005-07-14/23
• TAMA 2005 “Terminology in Advanced Management Applications” – Wiesbaden (Germany) 2005-11-09
• TKE 2005 “Terminology and Knowledge Engineering” – Copenhagen (Denmark) 2005-08-15/19
• OFMR 2006 “Open Forum on Metadata Registries” – Japan 2006-03-20/22
Thank you for your attention
ISO/TC 37
c/o Infoterm – International Information Centre for Terminology
ADDRESS:
Aichholzgasse 6/12
A-1120 Vienna – Austria
Tel: +43-1-817 44 99
Fax:+43-1-817 44 [email protected]://www.infoterm.info
ISO/TC 37 Secretariat:
Secretary: Christian Galinski
Chairman: Håvard Hjulstad (SN)