metadata transformation important technical considerations: extraction / normalization / enrichment


Post on 17-Jan-2016






  • METADATA TRANSFORMATION

    Important technical considerations: extraction / normalization / enrichment

  • Extraction: XML is picky

    All tags must be closed, as opposed to HTML

    It doesn't like any of your special characters: & = &amp;, < = &lt;, > = &gt;

    Encoding sensitive
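The two rules above (closed tags, escaped special characters) can be checked mechanically. The following is a minimal sketch using Python's standard library; the `<title>` and `<record>` element names and the sample value are illustrative assumptions, not taken from the ESE.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def make_title_element(value: str) -> str:
    """Wrap value in a <title> element (hypothetical name), escaping &, < and >."""
    return f"<title>{escape(value)}</title>"


raw_value = "Arts & Crafts <1900>"

# Raw special characters break the parser.
unescaped = f"<title>{raw_value}</title>"

# The same value with &amp;, &lt; and &gt; substituted parses cleanly.
escaped = make_title_element(raw_value)

# An unclosed tag, which an HTML parser would tolerate, is rejected.
unclosed = "<record><title>Untitled</record>"

for label, doc in [("unescaped", unescaped), ("escaped", escaped), ("unclosed", unclosed)]:
    print(label, is_well_formed(doc))
```

Escaping at the point where values are written into the export, rather than patching the file afterwards, keeps the original value recoverable when the XML is parsed again.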

  • Extraction: Attributes

    Consider the following example:

    If you export names, post codes and coordinates into the coverage element, how can you use these afterwards?

    London12.1234,89.1235531

    The ESE doesn't define these attributes for anything but language

    London12.1234,89.1235531

  • Extraction: Additional data

    The ESE may not always contain all the information which MAY be interesting from an aggregator's perspective

    The ESE can be extended without breaking the format, but it needs to be done in such a way as not to conflict or interfere with the XML structure of ESE elements
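One common way to add data without interfering with existing elements is to place the additions in a separate XML namespace, so that consumers who only know the standard elements can skip them. The sketch below assumes this approach; the extension namespace URI, the `ext` prefix, the `coordinates` element, the simplified `record` wrapper and the sample values are all hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
EXT_NS = "http://example.org/aggregator-extension/"  # hypothetical extension namespace

ET.register_namespace("dc", DC_NS)
ET.register_namespace("ext", EXT_NS)


def add_extension(record: ET.Element, name: str, value: str) -> ET.Element:
    """Append an extension element in its own namespace, leaving other children untouched."""
    child = ET.SubElement(record, f"{{{EXT_NS}}}{name}")
    child.text = value
    return child


# A simplified record with one standard element (illustrative values).
record = ET.Element("record")
title = ET.SubElement(record, f"{{{DC_NS}}}title")
title.text = "View of London"

# The extra information lives beside, not inside, the standard elements.
add_extension(record, "coordinates", "51.5074,-0.1278")

serialized = ET.tostring(record, encoding="unicode")
print(serialized)

# A consumer that only understands the standard namespace can filter on it.
standard_only = [el for el in record if el.tag.startswith(f"{{{DC_NS}}}")]
```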

  • Normalization: dates

    Date extraction is somewhat inaccurate and may well render bogus results:

    "...was almost as bad as in the 1920s..."

    "...back in the dark ages..."

    Values given by reference may be erroneously considered valid for the content

    If uncertain about what to put where, consider what is most useful to the end-user
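To illustrate why free-text date extraction is risky, the sketch below contrasts a naive "find any four-digit year" approach with a more cautious one that only accepts a field consisting of a bare year. Both functions and the sample sentences are illustrative assumptions, not a recommended extraction method.

```python
import re
from typing import List, Optional

# Any four-digit number starting with 1 or 20 that is not part of a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(1\d{3}|20\d{2})(?!\d)")


def naive_years(text: str) -> List[int]:
    """Return every year-like number found anywhere in free text."""
    return [int(match) for match in YEAR_PATTERN.findall(text)]


def cautious_year(field_value: str) -> Optional[int]:
    """Accept a value only if the whole field is a four-digit year."""
    stripped = field_value.strip()
    if re.fullmatch(r"\d{4}", stripped):
        return int(stripped)
    return None


description = "The weather was almost as bad as in the 1920s"

# The naive approach dates the object to 1920, although the year is only a reference.
print(naive_years(description))

# A vague period yields nothing at all.
print(naive_years("back in the dark ages"))

# The cautious approach refuses to guess from prose.
print(cautious_year(description))
print(cautious_year("1560"))
```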

  • Normalization: vocabularies

    Van Eyck, Jan

    Jan Van Eyck

    Van Eyck Jan

    Van Eyck, Jan en Hubert

    gebroeders Van Eyck

    Van Eyck, J. (1395-1441)

    (Example from, courtesy of Jef Malliet)
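A purely mechanical normalization only gets part of the way with variants like these. The sketch below, a hypothetical helper rather than any established algorithm, strips life dates and punctuation and ignores word order; it merges the first three spellings but leaves the abbreviated and collective forms separate, which is where a controlled vocabulary becomes necessary.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def name_key(name: str) -> Tuple[str, ...]:
    """Build an order-insensitive key: drop life dates and punctuation, lowercase, sort tokens."""
    without_dates = re.sub(r"\(\s*\d{4}\s*-\s*\d{4}\s*\)", " ", name)
    tokens = re.findall(r"[^\W\d_]+", without_dates.lower())
    return tuple(sorted(tokens))


def group_variants(names: List[str]) -> Dict[Tuple[str, ...], List[str]]:
    """Group name strings that share the same key."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


variants = [
    "Van Eyck, Jan",
    "Jan Van Eyck",
    "Van Eyck Jan",
    "Van Eyck, Jan en Hubert",
    "gebroeders Van Eyck",
    "Van Eyck, J. (1395-1441)",
]

groups = group_variants(variants)
for key, members in groups.items():
    print(key, members)
```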

  • Normalization: precision

    ca. 1560

    1560 ?

    16th century

    1500-1599

    (Example from, courtesy of Jef Malliet)
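One way to make such values comparable is to map each to a year range plus a flag recording that the source expressed uncertainty. The following is a minimal sketch covering only the four notations listed above; the function name and return shape are assumptions, and the century convention (16th century = 1500-1599) follows the example rather than the strict calendar definition.

```python
import re
from typing import Optional, Tuple


def to_year_range(value: str) -> Optional[Tuple[int, int, bool]]:
    """Return (start_year, end_year, uncertain) or None if the value is not recognized."""
    text = value.strip().lower()

    # Explicit range such as "1500-1599".
    match = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", text)
    if match:
        return int(match.group(1)), int(match.group(2)), False

    # Century such as "16th century".
    match = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th) century", text)
    if match:
        start = (int(match.group(1)) - 1) * 100
        return start, start + 99, False

    # Single year, optionally marked "ca." or followed by "?".
    match = re.fullmatch(r"(ca\.?\s*)?(\d{4})\s*(\?)?", text)
    if match:
        year = int(match.group(2))
        uncertain = bool(match.group(1) or match.group(3))
        return year, year, uncertain

    return None


for raw in ["ca. 1560", "1560 ?", "16th century", "1500-1599"]:
    print(raw, to_year_range(raw))
```

Keeping the uncertainty flag separate from the range means the original precision is not silently lost when values are normalized.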

  • Enrichment: what is it?

    Example: mapping content values to a common vocabulary with defined relationships between them

    Enables vast quantities of unrelated content to be automatically linked to each other, rendering considerable added value

    Example: automatic language translation

    Poor quality, but possibly better than nothing
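The vocabulary-mapping example can be sketched as a lookup from raw values to concept identifiers, followed by a walk along the relationships defined between concepts. Everything in the sketch below is hypothetical: the concept identifiers, the single `broader` relationship, the variant table and the record IDs are invented for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical common vocabulary: each concept has a label and an optional broader concept.
VOCABULARY: Dict[str, Dict[str, Optional[str]]] = {
    "person/jan-van-eyck": {"label": "Jan van Eyck", "broader": "group/flemish-painters"},
    "group/flemish-painters": {"label": "Flemish painters", "broader": None},
}

# Hypothetical mapping from normalized source values to concept identifiers.
VARIANTS: Dict[str, str] = {
    "van eyck, jan": "person/jan-van-eyck",
    "jan van eyck": "person/jan-van-eyck",
    "van eyck jan": "person/jan-van-eyck",
}


def enrich(value: str) -> Optional[str]:
    """Map a raw value to a concept identifier, or None if it is unknown."""
    return VARIANTS.get(" ".join(value.lower().split()))


def with_broader(concept_id: str) -> List[str]:
    """Return the concept followed by all of its broader concepts."""
    chain = []
    current: Optional[str] = concept_id
    while current is not None:
        chain.append(current)
        current = VOCABULARY[current]["broader"]
    return chain


def link_records(records: Dict[str, str]) -> Dict[str, List[str]]:
    """Group record IDs under every concept they map to, directly or via broader concepts."""
    links: Dict[str, List[str]] = {}
    for record_id, raw_value in records.items():
        concept_id = enrich(raw_value)
        if concept_id is None:
            continue  # unmapped values stay unlinked
        for concept in with_broader(concept_id):
            links.setdefault(concept, []).append(record_id)
    return links


records = {"rec1": "Van Eyck, Jan", "rec2": "Jan  Van Eyck", "rec3": "unknown artist"}
links = link_records(records)
print(links)
```

Records from different providers that spelled the name differently end up under the same concept, and also under its broader concept, which is the automatic linking described above.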