4/07/12
<title> Practising Metadata
<emph> The Text Encoding Initiative and
the Architecture of Meaning </emph> </title>
Dr Paul Scifleet, Charles Sturt University, Sydney, AU
With Professor Susan P Williams, University of Koblenz, Koblenz, DE
… research focuses on the organisation of knowledge in virtual spaces. He is interested in the design and management of information resources in networked environments, and in particular, the challenges individuals and organisations face in practice. He is concerned with the ways digital networks and information flows shape (and are shaped by) social and organisational communication and the changing dimensions of this for documenting society.
The social organisation of knowledge: the exploration of how our documentary practices are [socially] constituted
Background and context
› Markup Languages (ML) available for over 20 years
› Widespread and growing agreement among professional communities and standards developers about the type of information that must be supported within each domain
› However, the literature indicates that the community vision faces challenges … and many goals may not be realised (cf. Debreceny & Gray (2003); Wrightson (2007); Sperberg-MacQueen & Burnard (2004))
› Arguments for and against the role of markup languages continue today in social media, as programmers argue over whether including metadata in short message communications (XMPP vs ATOM vs JSON, and combinations of these) is worth the time and effort
Diagram: XML validation, XML transformation, Document input, Data store, XMLification, Shared
How are the definitions of content and the design of encoded documents being determined in practice?
The process and practices of documenting are black-boxed and/or treated as unproblematic and routine. The articulation work involved in documenting and documentation is largely invisible.
The aim of my work has been to extend existing understandings by providing an in-depth investigation of documentary practices.
The research challenge
The Study
› A survey focusing on how MLs are being used in practice
- 2008, 32 respondents, 12 countries {Australia [1], Denmark [1], France [1], Japan [1], Taiwan [1], Canada [2], Italy [2], Netherlands [2], Norway [2], Slovenia [2], United Kingdom [4], USA [13]}
- Text Encoding Initiative (TEI): all kinds of texts, with a focus on literary and linguistic works; widely adopted
- Chosen for its maturity. Has been very influential in the design of other semantically rich vocabularies, e.g. Extensible Business Reporting Language (XBRL), Health Markup (HL7) and Legal Markup (LegalML)
- Research design comprised 3 data collection instruments:
(1) the completion of a questionnaire booklet,
(2) the contribution of encoded texts to support the study’s analysis,
(3) interviews conducted by the researcher
Opening up the black box
TEI fragment
› Paul Caton (2001) has described this type of encoding as: “…a signifying practice strongly implicated in a politically conservative human ideology.”
› … because a normative encoding practice favours a mechanistic view of content designation that tends to hide the political and performative aspects of the activity of encoding.
› Geoffrey Nunberg (1996) has described digitisation as the “morselization” of text into uniform, structured, quantifiable components.
› Both share a belief that there is a changing conception of information within this that requires further attention to the practices that surround the design of the material documentary form if we are to ensure that the effort of changing to digital structures and processes does not result in the loss of substantial, meaningful and material human knowledge.
Criticism and concern
TEI fragment
1. Prioritises the array of practices that converge on the research phenomena as a field of practice.
2. Explores more than routine and normative elements of the activity.
3. Presents a field of practice as a structuring space where interactions amongst the elements of the field are determined by the resources available to the field, their interactions with the habitus (background, knowledge and feel for the game) of the practitioner and the interrelationships of these things with each other.
4. Is allied with a social constructionist epistemology that emphasises the relational idiom between people and artefacts and their interaction.
5. Is an interpretive approach to research that presents practice as theory.
Research lens: Practice Theory
Conceptual framework for the Field of Practice
› 3 interrelated data collection techniques (questionnaire, markup analysis, in-depth interviews) to account for both sides of the encoding relationship (the documenter’s relationship to the document)
› Questionnaire survey of 32 TEI markup projects (quant. and qual. questions)
› In-depth automated markup analysis (28 projects submitted 630 text files)
› In-depth interviews with participants representing 15 projects
Mixed method study
Managing findings
Themes in documentary practice
Conceptual framework for the Field of Practice
Themes in documentary practice
Key findings: ML as Standard
Themes in documentary practice
› Rate the degree of autonomy your unit has in deciding which text encoding projects to proceed with.
› While more than 78% of scholar-practitioners working on digital encoding projects rated their units as highly autonomous or more, all library respondents rated theirs between somewhat autonomous and not at all autonomous
Raising two issues that we explored further in the study…
Key findings: Mission
“The TEI’s adoption as a model in digital library projects raised some interesting issues about the whole philosophy of the TEI, which had been designed mostly by scholars who wanted to be as flexible as possible… A rather different philosophy prevails in library and information science where standards are defined and then followed closely – this to ensure that readers can find books easily.” (Susan Hockey, discussing the history of the TEI, 2004)
› Are there different encoding practices emerging within different professional areas of responsibility in the governing institutions, most particularly between academic scholarship and librarianship?
› This study found no evidence of differences between the encoded texts that could be attributed to a distinction between scholars and librarians
› A sense of autonomy may play out in practice and there are different encoding models in play, but they are not attributable to distinctions between scholars and librarians
Questions of Professional Difference
Example of markup analysis
Paul Scifleet Documentary Practice 181
Figure 6.12: Profile of encoded document batches showing tag use
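A tag-use profile of the kind named in the figure caption can be approximated by counting how often each element occurs across a batch of encoded files. The sketch below is illustrative only: the slides do not describe the study's actual tooling, so the function names (`tag_profile`, `batch_profile`) and the sample document are assumptions made for this example.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a namespace prefix such as '{http://www.tei-c.org/ns/1.0}p'."""
    return tag.rsplit("}", 1)[-1]


def tag_profile(xml_text: str) -> Counter:
    """Count how often each element name occurs in one encoded document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


def batch_profile(documents) -> Counter:
    """Sum the per-document tag counts over a batch of documents."""
    total = Counter()
    for doc in documents:
        total.update(tag_profile(doc))
    return total


# Hypothetical miniature TEI document, used only to exercise the functions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div><head>Title</head><p>One <name>Ann</name></p><p>Two</p></div>
  </body></text>
</TEI>"""

profile = batch_profile([SAMPLE])
for name, count in profile.most_common():
    print(f"{name}\t{count}")
```

Comparing such profiles between batches (for example, batches from scholar-led and library-led projects) is one simple way to look for differences in tag use, although the study's own measures may have been different.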
› The study identified significant organisational influences on encoders’ decision making, affecting both what is encoded and how it is encoded (depth & detail)
› All participants acknowledged
- The service orientation of their work (production, for a sometimes unknown user)
- Conformance to collaborative projects or production for database distribution and subscription services
› However, particularly in a scholarly environment, the influence of organisational arrangements on encoding choices, and the institutional influence on documentary practice, is often subtle and evident only in the general context of organisational doxa (factors like in-kind support and limitations on resources play a role but are often not recognised for what they are):
“…under these conditions many constraints are not noticed until they are breached.”
The Perception of Autonomy
Key finding
1. An emergent documentary practice
- The study presents a profile of documentary practice previously unexplored
2. The document
- The profiles of encoded documents show no discernible differences based on pre-existing professional domains of practice (not traditional humanities scholarship or librarianship)
3. The documentary task
- The categorized list of professionals involved in documentary practice presents an unexpectedly complex picture of collaborative document production
- Project directors; IT specialists; content, metadata and markup specialists; digital collections and preservation specialists; editors; financial management and other specialised advisory roles including legal
4. A design process
- The findings show a consistent pattern in design: document analysis, similar procedures for encoding, end user analysis and post-implementation review
Implications and directions
› The conceptual framework of practice informs the logic of practice, but it is not a logical model for document description
- It is arguable that such models are still needed
› Findings present a description of the processes and procedures for working with markup languages
- Help to achieve a better articulation of the dimensions of the problem space
- Improve confidence, education and training
› Variables and significant relationships for further investigation are now known
- The use of batch text analysis could be extended
- Applying the study (theory & methodology) to other MLs
› Characteristics of practice inform our understanding (practitioner & academic) of a rich and generative social practice
Social IA:
Investigating the information architecture of social media to support an understanding of the social construction of knowledge (through Web 2.0)
• Perceptions of Privacy & consumer information management in Web 2.0
• Enterprise Social Networks (genres of communication in EMB ~ the Yammer project)
• Social media, data journalism and digital scholarship (the GNIP project)
Picture: AAP Image/Lukas Coch, 7 March, 2012
Implications and directions
› How can information managers work with the information architecture of social media (acquiring, managing, disseminating)?
› How can that information architecture support qualitative research in social studies?
› QMA
› Genre analysis
› Communicative practice (Jürgen Habermas and Speech Act theory)
Implications and directions
› Caton, P.: Towards a Politics of Text Encoding. Association for Computers and the Humanities/Association for Literary and Linguistic Computing, Annual Conference, New York (2001)
› Debreceny, R., Gray, G.L.: The production and use of semantically rich accounting reports on the internet: XML and XBRL. International Journal of Accounting Information Systems 2, 47-74 (2003)
› Hockey, S.: History of Humanities Computing. In: S. Schreibman, R. Siemens and J. Unsworth (Eds.), A Companion to Digital Humanities. Blackwell Publishing, Malden MA, 3-19 (2004)
› Nunberg, G.: Farewell to the Information Age. In: G. Nunberg (Ed.), The Future of the Book. University of California Press, Berkeley & Los Angeles, California, 103-138 (1996)
› Wrightson, A.: Is it Possible to be Simple Without Being Stupid? Exploring the Semantics of Model-driven XML. Extreme Markup Languages (2007)
› Sperberg-MacQueen, C.M., Burnard, L.: Guidelines for Electronic Text Encoding and Interchange. The TEI Consortium (2004)
Some references made in this presentation
Questions?