LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch and Gerda Koch, AIT

Post on 13-Jul-2015




  • Thesaurus Management Quickstart


  • What are controlled vocabularies?

    organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching

    include preferred and variant terms

    have a defined scope or describe a specific domain


  • Thesaurus = a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators.

    Important:

    ISO 25964 is a standard for building thesauri

    SKOS is a W3C recommendation designed for representation of controlled vocabularies and is built upon RDF and RDFS. It allows publication of such vocabularies as linked data.

    [1] http://www.niso.org/schemas/iso25964/ (September 19th, 2014)

  • ISO 25964

    Part 1: Thesauri for information retrieval
    - published in 2011
    - developing a thesaurus (mono- and multilingual)
    - replaced previous standards ISO 2788/5964
    - includes data model and XML schema

    Part 2: Interoperability with other vocabularies
    - published in 2013
    - recommendations for the establishment and maintenance of mappings between multiple thesauri, or between thesauri and other types of vocabularies

    Data Model


  • SKOS: Simple Knowledge Organization System

    http://www.w3.org/2004/02/skos/intro

    SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way. Using RDF also allows knowledge organization systems to be used in distributed, decentralised metadata applications. Decentralised metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.

  • Multilingual vocabulary issues (examples)

    structural problems: conceptual systems differ in the various languages

    equivalence problems: lexicalisation of concepts differs in different languages

    e.g. bone vs. fish bone (en); Knochen vs. Gräten (de) [1]

    intra- and inter-language problems: terms differ in meaning (homographs); a given term can have more than one meaning in a language

    e.g. Turkey (country) and turkey (animal)

    [1] http://www.dsoergel.com/cv/B67.pdf 20th August, 2014

  • Federated Model

    LoCloud vocabulary based on federated model having independent vocabularies for various languages in the same domain (no one language is dominant)

    alignment of vocabularies via concept identifiers, end-user can search in all linked indexing vocabularies

    AIT experimental application based on TemaTres Vocabulary Tool
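The federated idea can be sketched in a few lines of Python: each language keeps its own vocabulary, and the only link between them is a shared concept identifier. The terms, identifiers and function names below are invented for illustration and do not come from the LoCloud or TemaTres code base.

```python
from collections import defaultdict

# Independent vocabularies per language: term -> concept identifier.
# All terms and identifiers are illustrative, not LoCloud data.
VOCABULARIES = {
    "en": {"bone": "c1", "fish bone": "c2"},
    "de": {"Knochen": "c1", "Gräte": "c2"},
}


def build_concept_index(vocabularies):
    """Map each concept identifier to its terms in every language."""
    index = defaultdict(dict)
    for lang, terms in vocabularies.items():
        for term, concept_id in terms.items():
            index[concept_id].setdefault(lang, []).append(term)
    return index


def search(term, vocabularies, index):
    """Look a term up in every language and return the aligned terms per concept."""
    results = {}
    for terms in vocabularies.values():
        for candidate, concept_id in terms.items():
            if candidate.casefold() == term.casefold():
                results[concept_id] = index[concept_id]
    return results


index = build_concept_index(VOCABULARIES)
print(search("Knochen", VOCABULARIES, index))
```

Because no language acts as a pivot, adding a further language only means adding another term-to-identifier mapping; existing vocabularies stay untouched.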

  • TemaTres ...

    supports distributed management models

    ensures consistency and integrity of data and relationships between terms

    has features specially designed to provide data traceability and quality control in the context of a controlled vocabulary

    supports the analysis and categorisation of terms for search

    enables vocabularies to be represented in a wide range of metadata standards relevant to knowledge management

    http://www.vocabularyserver.com/

  • TemaTres functionalities

    No limits to number of terms, alternative labels, levels of hierarchy, etc.

    allows import/export of data in text or SKOS format

    multilingualism

    SPARQL endpoint

    relationships between terms

    notes

    user management

    Reports

    Additionally: meta-terms (define facets, collections or arrays of terms), expose vocabularies with powerful web services, search term suggestions (did you mean...?), display terms in multiple deep levels in the same screen, user management, duplicate and free terms control, multilingual terminology mapping, etc.

  • Why TemaTres?

    Fast to use

    Making vocabularies available as Webservice in the enrichment process

    Many vocabularies (like UNESCO, GEMET, PICO) have already been established with this tool and are usable in the LoCloud infrastructure (http://www.vocabularyserver.com/vocabularies.php, 175 vocabularies available)

    Additionally, own vocabularies can be created

    Best starting point: SKOS file for import
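To give an idea of how an enrichment process might address such a web service, the sketch below only builds request URLs. It assumes the services.php?task=...&arg=... pattern and a search task as used by TemaTres instances; the base URL is a placeholder, and task names and parameters should be checked against the documentation of the instance actually in use.

```python
from urllib.parse import urlencode

# Hypothetical base URL of one TemaTres vocabulary; replace with a real instance.
BASE_URL = "https://example.org/tematres/vocab/"


def service_url(task, arg=None, base_url=BASE_URL):
    """Build a request URL of the assumed form services.php?task=...&arg=..."""
    params = {"task": task}
    if arg is not None:
        params["arg"] = arg
    # urlencode escapes spaces and special characters in the argument.
    return f"{base_url}services.php?{urlencode(params)}"


print(service_url("search", "fish bone"))
```

Separating URL construction from the actual HTTP request keeps the assumption about the endpoint in one place, so it can be corrected easily if the instance behaves differently.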

  • Import in TemaTres

    Tabulated text

    Tagged text

    SKOS Core
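For the tabulated text option, a minimal Python sketch of reading a hierarchy could look as follows. It assumes one term per line with one leading tab per hierarchy level; this layout is an assumption for illustration, so the exact format expected by the TemaTres importer should be verified before preparing real data.

```python
def parse_tabulated(text):
    """Return (term, parent) pairs from tab-indented text; top terms get parent None."""
    pairs = []
    stack = []  # stack[d] holds the most recent term seen at depth d
    for line_no, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # skip empty lines
        depth = len(raw) - len(raw.lstrip("\t"))
        term = raw.strip()
        if depth > len(stack):
            raise ValueError(f"line {line_no}: indentation skips a level")
        parent = stack[depth - 1] if depth > 0 else None
        del stack[depth:]  # forget deeper terms from earlier branches
        stack.append(term)
        pairs.append((term, parent))
    return pairs


# Illustrative input, not taken from an actual LoCloud vocabulary.
SAMPLE = "monument\n\tcastle\n\t\ttower house\n\tchurch\nobject\n"

for term, parent in parse_tabulated(SAMPLE):
    print(term, "<-", parent)
```

The resulting pairs correspond to broader/narrower relationships and could be checked for duplicates or orphaned terms before an import is attempted.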

  • Vocabularies that can at present be used during LoCloud aggregation:

    Author | Name of vocabulary
    University of California, Santa Barbara | Alexandria Digital Library Feature Type Thesaurus
    Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Archeological Objects Thesaurus Scotland
    English Heritage | Archeological Sciences Thesaurus
    English Heritage | Building Materials Thesaurus
    English Heritage | Components Thesaurus
    American Folklore Society | Ethnographic Thesaurus
    English Heritage | Event Type Thesaurus
    English Heritage | Evidence Thesaurus
    English Heritage | FISH Archeological Objects Thesaurus
    Eionet European Environment Information and Observation Network | General Multilingual Environmental Thesaurus GEMET
    Federation Internationale des Archives du Film (FIAF) | General Subject headings for Film Archives
    The Discovery Programme | Irish Monuments
    The Discovery Programme | Irish Periods
    Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Maritime Craft Thesaurus Scotland
    English Heritage | Maritime Craft Type Thesaurus
    English Heritage and Royal Commission on the Historical Monuments of England | MDA Archaeological Objects Thesaurus
    Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Monument Thesaurus Wales
    Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Monument Type Thesaurus
    English Heritage | Period Thesaurus
    Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Period Thesaurus Wales
    Bibliographic Standards Committee of the Rare Books and Manuscripts Section (ACRL/ALA) | Relator Terms for Use in Rare Book and Special Collections Cataloguing
    Universidad de León | Tesauro de Ciencias de la Documentación
    Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 1: Subject Terms
    Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 2: Genre and Physical Characteristic Terms
    Ministero per i Beni e le Attività Culturali | Thesaurus PICO 4.1
    UKAT | UK Archival Thesaurus (UKAT)
    UNESCO | UNESCO thesaurus

  • Tool for vocabulary training

    Mediathread is CCNMTL's [1] open-source platform for exploration, analysis, and organization of web-based multimedia content

    Launched at Columbia in 2010, Mediathread has now been used in over 300 courses across a wide range of subject domains, including Social Work, Journalism, East Asian Studies, Art History, Film Studies, History, Public Health, Education, and English. [2]

    Mediathread is in use today at over 25 colleges and universities, including MIT, Dartmouth College, Princeton University, Wellesley College, etc. [3]

    Mediathread is under constant development

    [1] Columbia Center for New Media Teaching and Learning, http://ccnmtl.columbia.edu/portfolio/custom_software_applications_and_tools/mediathread.html 2014-10-09

    [2] http://mediathread.info/content/cases-columbia 2014-10-09

    [3] http://getmediathread.com/index.html#who 2014-10-09

  • Accessing Mediathread



  • Next Parts: Thesaurus Management:

    Part 1: Basics

    Part 2: Import/Export

    Part 3: Multilingual Vocabularies

    Option: Mediathread in a Nutshell

  • Mediathread in a Nutshell

  • After logging into Mediathread

  • Mediathread sections (I)

    From Your Instructor (left side)

    Contains the compositions with the instructions

    Start with "How to use the Mediathread tool?"

    Followed by Chapter 0 to Chapter 6

    Compositions give instructions

    After reading each composition, complete the associated Assignment (same chapter number and name)

  • Mediathread sections (II)

    Assignments contain exercises (middle)

    Accomplish them by clicking on "Respond to Assignment"

    If necessary, check the instructions in the compositions again

  • Reading Compositions (I)

    After clicking on a Composition

    Read the text on the left side

    Click on the symbol or text to see PowerPoint slides on the right side

  • Reading Compositions (II)

    Change size and position of the slides by

    Using the arrow and plus/minus signs on the left

    Using the scroll function of your mouse (to change size)

    Dragging the slide by holding the left mouse button (to change position)


  • Reading Compositions (III)

    When finished, click on "LoCloud Vocabulary Training" to return to the course overview

    Here click on the next Composition or on "Respond to Assignment"

    Or use the links at the bottom of each Composition or Assignment


  • The LoCloud Vocabulary Training ...

    is an English online tool workshop

    includes all features of the vocabulary tool TemaTres

    is too comprehensive to be completed in this section

    can be started in class and finished any time online

    Please use the time left to start with the Vocabulary Training ...

  • Starting Vocabulary Training ...

    Open Mediathread under http://mtp.ait.co.at

    Logging in