leipzig functional categorisation 11/12/2013

Giovanni Colavizza, Leipzig eHumanities Seminars, 11/12/2013
Functional Categorisation for Historical Place Types
Giovanni Colavizza, Leibniz-Institut für Europäische Geschichte (IEG), Mainz
[email protected]

Section 1: introduction and motivations

The topic

Controlled vocabulary: “a pre-selected list of terms used for categorisation.” or “an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching.” @Patricia Harpring

Gazetteer: “a geographical dictionary or directory used in conjunction with a map or atlas.” @Wikipedia

Focus for this talk: controlled vocabularies of concepts, not proper names. Historical place types.

Examples

Terms (label-concept relations) are often not defined. Natural language is context- and interpretation-specific.

@Dalia Varanka, A topographic feature taxonomy for a US national topographic mapping ontology, 2009.

@Excerpt from LinkedGeoData ontology.

Motivations

Quantitative analysis ↦ Classification
Classification for quantitative analysis ↦ unambiguous, consistent, shared

Controlled vocabularies for place types at the moment:
• grow out of necessity and are project-specific
• have a high degree of ambiguity
• lack explicit (formal) definitions of terms
• are not designed for portability and reuse

Basic definitions - I

Taxonomy: “a semantic network of concepts (referred to, or labeled, via a controlled vocabulary), linked by hierarchical relationships. A taxonomy is thus a limited thesaurus.”

Thesaurus: “a semantic network of concepts (referred to, or labeled, via a controlled vocabulary), linked by equivalence, hierarchical and associative relationships.”

Ontology: “formal and explicit specification of a shared conceptualisation.” @Studer, 1998 and Guarino, 2009

Basic definitions - II

Taxonomy: contains categories organised hierarchically. Used to classify. E.g. “vehicles” ↦ “terrestrial vehicles” ↦ “car”.

Thesaurus: contains concepts and labels for them, organised relationally. Used to index and search. E.g. “terrestrial vehicles” ↦ “car”@en (alternatives: “macchina”@it, “voiture”@fr, ...; relates_to: “car park”, “highway”, ...).

Ontology: contains classes, properties and logical rules, and possibly instances of classes. Used to instantiate and reason. E.g. “car” is_subclass_of “vehicle”. “has_horsepower” is a property between an instance of class “car” and a positive integer. “Audi RS Q3” is_a “car”. And so on.

@Thomas Francart
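The difference between the three structures can be made concrete with a minimal Python sketch. It only re-uses the vehicle example from this slide; the names `TAXONOMY`, `Concept`, `broader_chain`, `is_a` and `valid_horsepower` are hypothetical and chosen for illustration, and plain dictionaries stand in for what a real system would express in SKOS or OWL.

```python
from dataclasses import dataclass, field
from typing import Optional

# Taxonomy: categories linked only by hierarchy (category -> broader category).
TAXONOMY = {
    "car": "terrestrial vehicles",
    "terrestrial vehicles": "vehicles",
    "vehicles": None,
}


def broader_chain(category, parents=TAXONOMY):
    """Return the broader categories of `category`, nearest first."""
    chain = []
    parent = parents.get(category)
    while parent is not None:
        chain.append(parent)
        parent = parents.get(parent)
    return chain


# Thesaurus: a concept carries labels per language and associative links.
@dataclass
class Concept:
    labels: dict                      # language tag -> label
    broader: Optional[str] = None     # hierarchical relationship
    related: list = field(default_factory=list)  # associative relationships


CAR = Concept(
    labels={"en": "car", "it": "macchina", "fr": "voiture"},
    broader="terrestrial vehicles",
    related=["car park", "highway"],
)

# Ontology: classes, instances and a rule that constrains a property.
SUBCLASS_OF = {"car": "vehicle"}
INSTANCE_OF = {"Audi RS Q3": "car"}


def is_a(instance, cls):
    """True if `instance` belongs to `cls` directly or through subclassing."""
    current = INSTANCE_OF.get(instance)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def valid_horsepower(instance, value):
    """has_horsepower links an instance of 'car' to a positive integer."""
    is_positive_int = isinstance(value, int) and not isinstance(value, bool) and value > 0
    return is_a(instance, "car") and is_positive_int
```

Only the ontology part supports reasoning: `is_a("Audi RS Q3", "vehicle")` holds although that fact was never stated directly.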

Getty’s TGN - I

Getty’s Thesaurus of Geographic Names: “a database of places in context.”

Target: professionals in the heritage sector. Always growing by design.

Structure of a record:

Getty’s TGN - II

Possible tensions from two directions:
• The mix of physical features and administrative entities in the hierarchies, since “a geographic place is an administrative entity or a physical feature with a name”.
• The need to account for both current and historical places, types and hierarchies.

Good ideas:
• Instances of administrative entities. E.g. Ancient Egypt (former nation) is predecessor of Egypt (modern nation).
• Instances have time spans.

Getty’s TGN - III

Place type: “a term that characterises a significant aspect of the place, including its role, function, political anatomy, size, or physical characteristics.” @TGNGuidelines, section 3.6.1.1

Foundation for the hierarchy of every TGN record via preferred type. Organised in flat general categories (Christian types, Physical features types, etc.).

Most place types can be assigned to three macro-areas: physical features, administrative divisions (geopolitics and internal state structure) and functions (religious, economic, social, etc.).

Getty’s TGN - IV

Terminological issues. Guideline: prefer the local terminology. The USA has states and counties; Italy has regions and provinces. The Italian region is merged with region (a generic administrative entity) and generic region (another, more generic entity).

These lead to ontological issues. Place types are not themselves structured into a defined thesaurus, nor are they formally distinguished into different domains with specific rules to disambiguate them.

Section 2: proposal

Desiderata

• Allow for comparison beyond a single project (data integration)
• Interoperability and portability
• Scalability
• More accurate retrieval
• Reasoning…
• Essentially: make vocabularies more machine-actionable

One possible solution: integrate a stricter knowledge model in the backend of controlled vocabularies. Express it via thesauri of concepts built in conformance with ontologies. Standards are already there: ISO 25964 (data model), SKOS (ontology).

An example - I

List of monasteries in France:

Id  Name            Type           …
1   Manlieu Abbey   tgn:monastery  …
2   Argentan Abbey  tgn:monastery  …
…   …               …              …

Can we improve on the simple tag “monastery”?

An example - II

Thesaurus of concepts:

<skos:Concept rdf:about="labelling.org/function-concept-10">
  <skos:prefLabel xml:lang="en">worship</skos:prefLabel>
</skos:Concept>
<skos:Concept rdf:about="labelling.org/function-concept-11">
  <skos:prefLabel xml:lang="en">estate administration</skos:prefLabel>
</skos:Concept>

Controlled vocabulary of place types:

<skos:Concept rdf:about="labelling.org/voc7/label-33">
  <skos:prefLabel xml:lang="en">monastery</skos:prefLabel>
  <skos:related rdf:resource="labelling.org/function-concept-10"/>
  <skos:related rdf:resource="labelling.org/function-concept-11"/>
</skos:Concept>

In the database:

Id  Name            Type
1   Manlieu Abbey   voc7/label-33:monastery
2   Argentan Abbey  voc7/label-33:monastery
…   …               …
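How a database tag could be followed back to its functional concepts is sketched below in Python, using only the standard library. This is an illustration under assumptions, not the Labelling system itself: the SKOS snippets from this slide are wrapped in an `rdf:RDF` root with the standard RDF and SKOS namespace declarations (which the slide omits), and `load_concepts` and `functions_of` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BASE = "labelling.org/"  # URI prefix used in the slide's examples

# The slide's concepts, wrapped in a root element so the document parses.
RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="labelling.org/function-concept-10">
    <skos:prefLabel xml:lang="en">worship</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="labelling.org/function-concept-11">
    <skos:prefLabel xml:lang="en">estate administration</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="labelling.org/voc7/label-33">
    <skos:prefLabel xml:lang="en">monastery</skos:prefLabel>
    <skos:related rdf:resource="labelling.org/function-concept-10"/>
    <skos:related rdf:resource="labelling.org/function-concept-11"/>
  </skos:Concept>
</rdf:RDF>"""


def load_concepts(rdf_xml):
    """Map each concept URI to its preferred label and related concept URIs."""
    concepts = {}
    for node in ET.fromstring(rdf_xml).findall(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about")
        label = node.findtext(f"{{{SKOS}}}prefLabel")
        related = [r.get(f"{{{RDF}}}resource")
                   for r in node.findall(f"{{{SKOS}}}related")]
        concepts[uri] = {"label": label, "related": related}
    return concepts


def functions_of(concepts, type_tag):
    """Resolve a database tag such as 'voc7/label-33:monastery' to function labels."""
    term_id = type_tag.rsplit(":", 1)[0]   # drop the human-readable suffix
    term = concepts[BASE + term_id]
    return [concepts[uri]["label"] for uri in term["related"]]


CONCEPTS = load_concepts(RDF_XML)
print(functions_of(CONCEPTS, "voc7/label-33:monastery"))
# prints ['worship', 'estate administration']
```

The database row keeps its natural-language tag, while the functions come from the back-end thesaurus, so two projects using different tags can still be compared through shared function concepts.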

Idea - concept

An integrated approach:
1. develop back-end thesauri
2. vocabularies are built as needed, in natural language, associating tags with formally defined concepts (avoid late integration)

n-m mapping between vocabularies and ontologies. Focus on what’s shared. Add details to the backend. Pareto principle: 80% of the effects (the tags we need) come from 20% of the causes (concepts).

Historical place types - I

Quite problematic:
• Same nouns mean different things in space, time, culture
• Generic tags for specific meanings lead to ambiguity
• Layers of knowledge: historical agents, socio-political contexts, historians’ interpretations, etc.

Example: “palazzo” in Medieval and Early Modern Venice.
For contemporaries: the Doge’s palace. Other nobles’ palaces had proper names, e.g. Ca’ Foscari means House Foscari.
For us: a category of (historical) buildings, usually former nobles’ residences, OR a more generic category of somewhat big buildings.

Functional categorisation - I

Historical knowledge is mostly about events and processes, which drive the production of evidence (sources).

@Grossner, Representing Historical Knowledge in Geographic Information Systems, 2010.

Functional categorisation - II

We model representations more than real objects, and we study humans: purpose and function are the main concern.

From nouns to verbs:
• Most vocabularies of place types/features are already loosely classified by functionality (economic activity, leisure facility, place of culture, etc.)
• There are fewer verbs than nouns (WordNet synsets: ~82k nouns, ~14k verbs)
• Verbs act as bridges between concepts in natural language, linked data triples, etc.

Not the only perspective (e.g. natural features, institutions), but a starting point.

Example: Barber-shops in Venice

@Filippo De Vivo, Patrizi, informatori, barbieri. Politica e comunicazione a Venezia nella prima età moderna. Milan: Feltrinelli, 2012. In English: id., Information and communication in Venice: Rethinking Early Modern Politics. Oxford: Oxford University Press, 2007.

Historical place types - II

Problems:
1. Same nouns mean different things in space, time, culture
2. Generic tags for specific meanings lead to ambiguity
3. Layers of knowledge: historical agents, socio-political contexts, historians’ interpretations, etc.

Expected outcomes:
1. We can add specifications of space, time, culture to concepts defining a term
2. Generic tags can be linked to specific concepts
3. The process of linking vocabulary terms to concepts helps historians clarify their reasoning and the layer of knowledge they are representing
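The first expected outcome, attaching space, time and culture to the concepts behind a term, could look roughly like the following Python sketch. It uses the “palazzo” example from the earlier slide; the `Sense` record, its field names and the `senses` helper are assumptions made for illustration and are not taken from the talk's own model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    definition: str
    place: str
    period: str
    perspective: str  # whose understanding of the term this sense records


# One generic tag, several scoped senses (wording follows the "palazzo" slide).
PALAZZO = (
    Sense("the Doge's palace",
          "Venice", "Medieval and Early Modern", "contemporaries"),
    Sense("historical building, usually a former nobles' residence",
          "Venice", "Medieval and Early Modern", "present-day"),
    Sense("somewhat big building",
          "Venice", "Medieval and Early Modern", "present-day"),
)


def senses(term, **scope):
    """Return the senses of `term` whose fields match every given scope value."""
    return [s for s in term
            if all(getattr(s, key) == value for key, value in scope.items())]
```

Calling `senses(PALAZZO, perspective="contemporaries")` returns the single sense used by historical agents, whereas the present-day perspective still yields two candidates, which makes the remaining ambiguity explicit instead of hiding it behind one tag.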

Historical place types - III

Solving the “palazzo” problem:

What is a place - conceptual model I

What is a place - conceptual model II

Section 3: implementation

How?

• Build thesauri of functions with a bottom-up approach, from sources
• Build vocabularies when needed, reusing existing ones if possible
• Develop software to integrate such thesauri and the creation/re-use of controlled vocabularies
• Raise and foster a community of interest and work together

Let’s break down each part…

Building thesauri of functions - I

Good starting point: Getty’s AAT function facets: http://www.getty.edu/vow/AATHierarchy?find=&logic=AND&note=&subjectid=300054593

They provide a general framework, i.e. functional domains and upper layers: economics, government, social, education, etc.

Small teams of historians and ontologists: start from sources and make explicit part of the knowledge entailed in them. A process of abstraction from detail and generalisation.

Let’s see an example…

Building thesauri of functions - II

Building thesauri of functions - III

Giovanni Bartolomeo da Gabiano, bookseller, publisher and entrepreneur in Venice.

Business letters from which we can infer the activities going on at his shop at Rialto.

“Data in mane de Messer Ioanne Bertolamie a la libraria da la Fontana in Venecia”
“Given into the hands of Mr Giovanni Bartolomeo, at the bookshop at the Fountain [insigna] in Venice”

Building thesauri of functions - IV

Various activities, today usually considered separate:
• book-selling, accounting, warehousing, etc.
• publishing and sometimes printing
• patronage and other social activities
• …

This letter mentions new editions being made (apparently the market was good for medical treatises: Avicenna, Aliabate, etc.) and engravings ordered for them.

Building vocabularies - I

Similar to building thesauri of functions, but without supervision.

Essential to:
• allow the same tags we currently find in controlled vocabularies to be used, i.e. natural language and possibly no definitions
• allow for intuitive linkage with thesauri, and suggest vocabulary tags already built and close in meaning
• design for continuous growth: term merge or split, sub-categories, …

Building vocabularies - II

An example: Venetian fiscal declarations, 1514.
Rented “flats” (lit. small houses: ‘chaseta’) for residence: ‘flat’ (in vocabulary).
Possibly interesting functions according to the source: ‘renting’, ‘lodging’/‘dwelling’ under ‘economic functions’.
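The requirement to suggest vocabulary tags that are already built and close in meaning could be approached by comparing the function concepts each tag is linked to. The Python sketch below uses Jaccard overlap for this; that measure is an assumption of the example, not something the talk prescribes, and `VOCABULARY`, `jaccard` and `suggest` are hypothetical names. The two entries re-use the ‘flat’ and ‘monastery’ examples from the slides.

```python
# Vocabulary tags and the function concepts they are linked to.
VOCABULARY = {
    "flat": {"renting", "lodging"},
    "monastery": {"worship", "estate administration"},
}


def jaccard(a, b):
    """Share of function concepts two tags have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(functions, vocabulary=VOCABULARY):
    """Rank existing tags by overlap with the functions of a new tag.

    Tags with no function in common are left out; ties are broken
    alphabetically so the result is deterministic.
    """
    scored = [(tag, jaccard(functions, linked))
              for tag, linked in vocabulary.items()]
    matches = [(tag, score) for tag, score in scored if score > 0]
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# A historian about to create a new tag for a place used for lodging
# would be pointed to the existing 'flat' tag first.
print(suggest({"lodging"}))
# prints [('flat', 0.5)]
```

Because the comparison runs on back-end concepts rather than on labels, it works across languages and across project-specific wording.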

Building vocabularies - III


@Luzzatto, Sergio, Pedullà, Gabriele (editors), Atlante della letteratura italiana, vol. 1, Torino, Einaudi, 2010.

The software: Labelling system - I

Design requirements:
• Building thesauri of concepts in the back-end.
• Building controlled vocabularies, for users.
• Querying the system for such contents (for every agent, openly).
• Administering and linking all these tasks and users into a single system.
• Provide transparent management of the most used data formats.
• Reuse open source solutions whenever possible.
• Be very intuitive and easy to use.

Waiting for a possible grant on this…

The software: Labelling system - II

The software: Labelling system - III


http://www.vocabularyserver.com/

The software: Labelling system - IV

Implementation is key:
• we are struggling to have several people from different backgrounds work with standards such as SKOS and RDF
• we need a common entry point, as transparent as possible
• we need to differentiate vocabulary building and thesauri of functions concretely

The community - I

Work on integration and alignment requires a community of interest and a long time: growth is slow.

Experts’ workshop on Controlled Vocabularies, Mainz 10-11/10/2013:
• gathered experts from different fields (history, IT, geography, …)
• discussed place types and the functional categorisation extensively
• established a working group to start the process

As of today:
• circa 30 experts
• wiki space and mailing list within DARIAH-DE

The community - II

Space on the DARIAH-DE wiki. Already populated with references, vocabularies and first alignment projects, plus the RDF (with SKOS) version of the Getty’s AAT function facets.

Send me an e-mail to join us :)

Summary

1. Motivation: controlled vocabularies are ambiguous and lack definitions
2. Object: historical place types
3. Proposal: use functional categorisation to overcome limitations
4. Implementation: community of interest, reuse of standards, ad-hoc software, bottom-up source-based approach

Future directions

Short term priority: development of the Labelling system

Long term:
• engage with more researchers and projects
• test the method in different settings
• steadily grow the vocabulary base
• integrate existing vocabularies in the system

Thanks!

Giovanni Colavizza, Leibniz-Institut für Europäische Geschichte (IEG), Mainz

[email protected]