THE FRONT MATTERS: Capturing Journal Front Matter Content with JATS

Upload: susan-lambert

Post on 27-Dec-2015


Page 1: { THE FRONT MATTERS: Capturing Journal Front Matter Content with JATS


THE FRONT MATTERS:

Capturing Journal Front Matter Content with JATS

Page 2:

Front Matter vs. Journal Matter (disambiguation)

For the purposes of this presentation: “front matter” = “journal matter”

In the current publishing environment where more and more journals are published online, there are many examples of journals without a traditional “front”.

Page 3:

Obvious

Page 4:

This… not as much


Page 5:

Team Introduction

Rachael Carter is a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine. Rachael graduated in 2010 from the University of Maryland with a Master of Library Science.

Kathryn Funk is a technical editor for NIHMS and PubMed Health at the National Center for Biotechnology Information at the US National Library of Medicine. Kathryn graduated from The Catholic University of America with a Master of Library and Information Science.

Rebecca Mooney, formerly a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine, recently moved to a new position as a Project Analyst in the IT Department of the American Association for the Advancement of Science (AAAS). Rebecca graduated in 2008 from the University of Maryland with a Master of Library Science.

Page 6:

“Decisions must be made about what will actually be saved for future use… Will the content consist only of articles in a journal, or will it also include front matter (such as the names of the members of the journal’s editorial board)?”

Marcum, 2001

The Big Picture

Page 7:

PMC as an archive has a responsibility to answer:

What should we preserve? How should we preserve it? Why preserve it?

NLM Initiative

Page 8:

PMC Submission Method A

Page 9:

• Currently, PMC strives to archive data at the article level, but sees potential benefit in preserving information about the journal in which the articles were published: for example, who was Editor in Chief at the time of publication, and what was the journal’s philosophy at that time?

• TOCs: PMC creates its own table of contents, organized by article type. This is still very article-based, not at the issue level.

PMC structure

Page 10:

Front Matter “capturing” in PMC as it currently exists – through banner journal-links only

Page 11:

What PMC Front Matter IS
• Editorial board
• Journal philosophy
• Submission guidelines
• Subscription information
• Covers
• Journal contact information
• Publisher information

What PMC Front Matter is NOT
• Tables of contents
• Advertisements
• Forewords
• Prefaces

Scope of Front Matter within project

Page 12:

Frontmatter DTD development Timeline

NLM DTD developed

issue-admin.dtd was made available

pmc-journalmatter.dtd developed

Atypon Issue XML presented at JATS-Con

2001 2011 2012

Page 13:

XML to the rescue

- The content is queryable and reusable
- Updating just requires editing a file
- Allows for data manipulation over various platforms/formats

Value of capturing front matter as XML

Limitations of PDF

- Assumes there is an issue to scan
- Difficult to update content
- Limited to certain platforms and technologies
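As a rough illustration of the "queryable and reusable" point, the sketch below parses a small editorial-board document with Python's standard library and lists each person's name and role. The sample XML is hypothetical: the element names follow the models presented in this talk, the `<surname>` and `<given-names>` children of `<name>` are assumed to be inherited from JATS, the metadata elements are omitted for brevity, and nothing here has been validated against pmc-journalmatter.dtd.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample for illustration only; metadata elements are omitted and
# the markup has not been validated against pmc-journalmatter.dtd.
SAMPLE = """\
<journalmatter journalmatter-type="standing" content-type="edboard">
  <body>
    <person-list>
      <title>Editorial Board</title>
      <person>
        <name><surname>Doe</surname><given-names>Jane</given-names></name>
        <role>Editor in Chief</role>
      </person>
      <person>
        <name><surname>Roe</surname><given-names>Richard</given-names></name>
        <role>Associate Editor</role>
      </person>
    </person-list>
  </body>
</journalmatter>
"""


def list_people(xml_text):
    """Return (display name, role) pairs for every <person> in the document."""
    root = ET.fromstring(xml_text)
    people = []
    for person in root.iter("person"):
        surname = person.findtext("name/surname", default="")
        given = person.findtext("name/given-names", default="")
        role = person.findtext("role", default="")
        people.append((f"{given} {surname}".strip(), role))
    return people


if __name__ == "__main__":
    for name, role in list_people(SAMPLE):
        print(f"{name}: {role}")
```

The same few lines of code would work unchanged on any number of such files, which is the kind of reuse a scanned PDF does not offer.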

Page 14:

o Mostly because we already use JATS

o It’s flexible

o Already had a meaningful framework to capture journal article content

o Works well within the structure of PMC (consistency)

Why we chose to create an extension to JATS

Page 16:

To capture front matter in the environment in which it was published

To work as much as possible with the existing JATS framework

To create a DTD that would allow for flexibility in both use and rendering

Goals

Page 17:

Testing 1 2 3

Looking at samples

Defined content types

Created new elements

Completed first iteration of the pmc-journalmatter.dtd

Tagged samples of front matter using our DTD and made adjustments

User testing: PMC journal managers

Adjustments made to final DTD based on user feedback

Page 18:

Highlighted physical example of a journal’s front matter

Page 19:

Anything in RED is required

<journal-meta> contains, in order:
• <journal-id>*
• <journal-title-group>
• <issn>*
• <isbn>*
• <publisher>?

<issue-meta> contains, in order:
• <pub-date>*
• <volume>?
• <issue>?
• <issue-title>*
• <issue-sponsor>*
• <first-page>, <last-page>?, <page-range>? OR <elocation-id>?

<document-meta> contains, in order:
• <pub-date>*
• <document-title>
• <self-uri>*

<body> contains, in order:
• <person-list> requires one or more <person>
• <person> contains, in order:
  • <name> OR <string-name> OR <collab>
  • <degrees>*
  • <address>*
  • <aff>*
  • <role>*
  • <ext-link>*
  • <xref>*

Initial Classification

Page 20:

Created new elements

<person-list>

<issue-meta>

<document-meta>

Page 21:

Tagged samples of front matter using our DTD and made adjustments

Page 22:

User testing: PMC journal managers

Page 23:

DTD technical details

pmc-journalmatter.dtd

.ent

.mod

pmc-journalmatter custom .ent

customizations

Page 24:

<journalmatter journalmatter-type="issue" content-type="edboard">

Root element: journalmatter

Page 25:

How to generate a foundation for organizing and labeling the front matter content?

Answering the question: can we tag all of this content in one document?

Challenges

Page 26:

Root element attribute: @journalmatter-type

Page 27:

Prevents hybrid of issue and non-issue content in the same document

Changes in content can be more easily updated

Allows a single journal to have issue and standing documents

Issue vs. Standing: The Benefits

Page 28:

standing – Information for Authors

Example: Standing & Issue

issue - Cover

Page 29:

@content-type: separate documents

Flexibility in tagging and rendering

Update as needed, e.g., journal philosophy vs. editorial board

Root element: @content-type

Page 30:

@content-type

edboard

cover

general-info

publisher

info-for-authors

other

Individual documents for each @content-type.

Page 31:

Cover ("cover"): can include cover image, caption, and cover image copyright information.

Editorial Board ("edboard"): can include executive editors, associate editors, etc. as well as general editorial board members.

General Journal Information ("general-info"): can include but is not limited to journal mission statement, scope, journal contact information, subscription information, copyright, and other journal-specific content.

Publisher Information ("publisher"): can include publisher philosophy, other journals published, contact information, etc.

Information for Authors ("info-for-authors"): can include article submission and formatting instructions.

Other ("other"): if the document is not one of the listed types or the type of document cannot be determined, the "other" attribute value may be used.

@content-type values
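A minimal sketch of how the two root-element attributes might be checked before a file is accepted. The value sets are copied from the slides ("issue"/"standing" for @journalmatter-type and the six @content-type values listed above); treating them as closed lists, and the function name `check_root`, are assumptions made for illustration rather than documented behavior of pmc-journalmatter.dtd.

```python
import xml.etree.ElementTree as ET

# Values taken from the slides; treating them as closed lists is an assumption.
JOURNALMATTER_TYPES = {"issue", "standing"}
CONTENT_TYPES = {
    "cover", "edboard", "general-info", "publisher", "info-for-authors", "other",
}


def check_root(xml_text):
    """Return a list of problems found on the root element (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "journalmatter":
        problems.append(f"unexpected root element <{root.tag}>")
    jm_type = root.get("journalmatter-type")
    if jm_type not in JOURNALMATTER_TYPES:
        problems.append(f"unexpected journalmatter-type: {jm_type!r}")
    content_type = root.get("content-type")
    if content_type not in CONTENT_TYPES:
        problems.append(f"unexpected content-type: {content_type!r}")
    return problems


if __name__ == "__main__":
    doc = '<journalmatter journalmatter-type="issue" content-type="edboard"/>'
    print(check_root(doc))
```

A real workflow would rely on DTD validation for this; the sketch only shows that the two attributes together identify what kind of document is in hand.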

Page 32:

The 4 Main elements of a document

<journal-meta>

<issue-meta>

<document-meta>

<body>

<journalmatter>

Page 33:

<!ENTITY % journal-meta-model "(journal-id*, journal-title-group*, issn*, isbn*, publisher*)">

<journal-meta>

Page 34:

JATS journal-meta

pmc-journalmatter journal-meta

Page 35:

<!ENTITY % issue-meta-model "(pub-date*, volume?, issue?, issue-id*, issue-title*, issue-sponsor*)">

<issue-meta>

Page 36:

JATS article-meta

pmc-journalmatter issue-meta

Page 37:

<!ENTITY % document-meta-model "((document-title, document-subtitle?)?, contrib-group?, pub-date*, (((fpage, lpage?, page-range?) | elocation-id)?), self-uri*, permissions?)">

<document-meta>
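To make the ordering rules in a content model like the one above concrete, here is a small sketch that turns the model string into a regular expression and tests whether a given sequence of child element names would be allowed. It is not a DTD validator: it handles only element names, sequences, choices, and the `?`, `*`, `+` indicators, and the helper names `model_to_regex` and `children_match` are invented for this example.

```python
import re

# Content model copied from the document-meta-model entity shown above.
DOCUMENT_META_MODEL = (
    "((document-title, document-subtitle?)?, contrib-group?, pub-date*, "
    "(((fpage, lpage?, page-range?) | elocation-id)?), self-uri*, permissions?)"
)


def model_to_regex(model):
    """Compile a simple DTD content model into a regex over child names.

    Each element name becomes a token followed by one space, so the children
    ["fpage", "lpage"] are matched as the string "fpage lpage ".
    """
    pattern = re.sub(r"\s+", "", model)      # whitespace is insignificant
    pattern = pattern.replace("(", "(?:")    # DTD groups -> non-capturing groups
    pattern = re.sub(
        r"[A-Za-z][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        pattern,
    )
    pattern = pattern.replace(",", "")       # a sequence is plain concatenation
    return re.compile(pattern)


def children_match(model, child_names):
    """Return True if the ordered child names satisfy the content model."""
    tokens = "".join(name + " " for name in child_names)
    return model_to_regex(model).fullmatch(tokens) is not None


if __name__ == "__main__":
    print(children_match(DOCUMENT_META_MODEL, ["document-title", "pub-date", "self-uri"]))
    print(children_match(DOCUMENT_META_MODEL, ["pub-date", "document-title"]))
```

The same helper can be pointed at the journal-meta and issue-meta model strings, since those are plain sequences.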

Page 38:

JATS article-meta

pmc-journalmatter

document-meta

Page 39:

Borrowed directly from JATS (with a few additions)

<body>

Page 40:

Addition: <person-list>

<!ELEMENT person-list (title?, person+) >
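The declaration above allows an optional `<title>` followed by one or more `<person>` elements. The sketch below builds such a list with Python's standard library and refuses to create an empty one. The function name `build_person_list` and the `<surname>`/`<given-names>` structure inside `<name>` (assumed to come from JATS) are illustrative assumptions, not part of the presentation.

```python
import xml.etree.ElementTree as ET


def build_person_list(people, title=None):
    """Build a <person-list> element from (surname, given names, role) tuples.

    Mirrors the (title?, person+) model: the title is optional, but at least
    one person is required.
    """
    if not people:
        raise ValueError("person-list requires at least one person")
    person_list = ET.Element("person-list")
    if title is not None:
        ET.SubElement(person_list, "title").text = title
    for surname, given_names, role in people:
        person = ET.SubElement(person_list, "person")
        # <surname>/<given-names> inside <name> are assumed to follow JATS.
        name = ET.SubElement(person, "name")
        ET.SubElement(name, "surname").text = surname
        ET.SubElement(name, "given-names").text = given_names
        if role:
            ET.SubElement(person, "role").text = role
    return person_list


if __name__ == "__main__":
    element = build_person_list([("Doe", "Jane", "Editor in Chief")], title="Editors")
    print(ET.tostring(element, encoding="unicode"))
```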

Page 41:

Person-list vs. Person-group

Page 42:

advisory-board: A board appointed to advise the editorial board

editor: Content editors

editorial-board: A group of editors on a publication

guest-editor: Content editors that have been invited to edit all or part of a work

reviewer: Content reviewer

transed: Editors of a translated version of a work

@person-list-type

Page 43:

Not required – suggested list

Not a controlled attribute

Only used when content-type="general-info"

Intent was to give meaning for searching and grouping purposes

Used similarly to JATS’ @sec-type

@sec-type

Page 44:

@sec-type is not a required or controlled attribute. However, when "general-info" is the @content-type of the document, the following is a suggested list of types:

association*

copyright

journal-contact

journal-philosophy

subscription-info

*This refers to associations that may be affiliated with a journal but do not necessarily publish the journal.

List of @sec-types
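Since the stated intent of @sec-type is searching and grouping, the following sketch indexes the sections of a "general-info" document by their @sec-type value. The sample document is hypothetical: the `<sec>`, `<title>`, and `<p>` elements are assumed to behave as in a JATS `<body>`, metadata elements are omitted, and the markup has not been validated against pmc-journalmatter.dtd.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample for illustration only; not validated against the DTD.
SAMPLE = """\
<journalmatter journalmatter-type="standing" content-type="general-info">
  <body>
    <sec sec-type="journal-philosophy">
      <title>Aims and Scope</title>
      <p>The journal publishes original research.</p>
    </sec>
    <sec sec-type="subscription-info">
      <title>Subscriptions</title>
      <p>Published quarterly.</p>
    </sec>
  </body>
</journalmatter>
"""


def sections_by_type(xml_text):
    """Map each @sec-type value to the titles of the sections carrying it."""
    root = ET.fromstring(xml_text)
    index = {}
    for sec in root.iter("sec"):
        # The attribute is optional, so untyped sections get a fallback key.
        sec_type = sec.get("sec-type", "untyped")
        index.setdefault(sec_type, []).append(sec.findtext("title", default=""))
    return index


if __name__ == "__main__":
    for sec_type, titles in sections_by_type(SAMPLE).items():
        print(sec_type, titles)
```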

Page 45:

http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/

DTD Documentation

Page 46:

So how’s it all going to look?

?

Page 47:

Still relatively untested
- No rendering
- No actual use

Lack of an existing model

Based on perceived needs of PMC as an archive. Unanticipated uses beyond.

Different naming conventions and structures of published journal front matter

Limitations

Page 48:

Trying to start a conversation

Looking for ways to best capture front matter to suit needs both inside PMC and in the broader JATS community

Determining whether the content types will be applicable for future applications

Initiating use of the DTD and seeing what happens

Looking Forward

Page 49:

Breena Krick

Jeff Beck

Audrey Hamelers

Christopher Maloney

PMC Journal Managers

Acknowledgements

Page 50:

Andrew N. The Oxford Journals Online Archives: The Purpose and Practicalities of a Major Digitization Program. Serials Review. (2006. June). 32(12), 78-80.

Holdsworth David. Preservation Strategies for Digital Libraries. Glasgow, UK: HATII, University of Glasgow; DCC Digital Curation Manual. (2007. November). Retrieved from: http://www.dcc.ac.uk/resource/curation-manual/chapters/preservation-strategies-digital-libraries.

Marcum D. Scholars as Partners in Digital Preservation. CLIR Issues. (2001. March/April). 20. Retrieved from: http://www.clir.org/pubs/issues/issues20.html.

Markantonatos N. Article vs Issue XML: Capturing the Table of Contents under the NLM DTD. Bethesda, MD: National Center for Biotechnology Information; Journal Article Tag Suite Conference (JATS-Con) Proceedings 2011. (2011). Retrieved from: http://www.ncbi.nlm.nih.gov/books/NBK57236/.

Wheeler B. Journal Identity in the Digital Age. Journal of Scholarly Publishing. (2010). 42(1), 45-88.

NLM Journal Archiving and Interchange Tag Suite. Retrieved from: http://dtd.nlm.nih.gov/.

PMC Journal Matter DTD Documentation. Retrieved from: http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/.

BMC Cancer. Retrieved from: http://www.biomedcentral.com/bmccancer/.

Frontiers in Cancer Genetics. Retrieved from: http://www.frontiersin.org/cancer_genetics.

References

Page 52:

Questions?

Page 53:

Multiple documents: dependent on the information being captured

1 XML document: journalmatter-type="standing" OR "issue"

2 documents: one with journalmatter-type="standing", one with journalmatter-type="issue"

Cover

Editorial Board

General Journal Information

Publisher Information

Information for Authors

"standing": "edboard", "general-info", "publisher", "info-for-authors"

"issue": "cover", "edboard", "general-info", "publisher", "info-for-authors"