Tracking Changes to JATS XML in an Online Proofing System
Dartmouth Journal Services
Content Services for STM Publishers
Peer Review
Editorial
XML-First Composition
Electronic Deliverables
Online Hosting
Mobile Apps
The Sheridan Group (TSG), Hunt Valley, MD
Technology Lab (TSG), Hunt Valley, MD
Dartmouth Journal Services (DJS), Waterbury, Vermont
Dartmouth Printing Company (DPC), Hanover, New Hampshire
The Sheridan Press (TSP), Hanover, Pennsylvania
Sheridan Books (SBI), Ann Arbor and Chelsea, Michigan
The Sheridan Group Companies
If the online journal is the journal of record, then how come almost all production workflows only provide PDF proofs?
PDFs cannot incorporate, without significant cost, the elements of tomorrow’s scientific articles (e.g., multimedia content, data linking, semantic enrichment, supplemental material).
HTML presentation is the future of science articles, even if PDF is the file of record today. HTML5 alone offers an expanding array of features that will improve presentation in the browser (e.g., MathML support, offline caching, native browser support for multimedia).
The Question
The PDF-Based Workflow
Correction Cycle
Project Team
Charles O’Connor, Workflow Automation Specialist, ArticleExpress Project Manager
Mike Hepp, Director, Technology Strategy, ArticleExpress Project Leader
Antony Gnanapiragasam, Workflow Automation Specialist, ArticleExpress System Architect
Tina Fleischer, Technical Support Specialist, ArticleExpress Quality Assurance
Web-based Proofing, Editing, and Review System and Automated XML-Driven Composition
The Solution
Online XML Editing
Correction Cycle
Building Support for the Idea
DJS
Collaborative Online Editing Environment
Production Team
Production Editor
Publisher
Corresponding Author
Co-Authors
For a browser-based XML article proofing system to function well in a journal publishing workflow, it must have a comprehensive change tracking capability:
Support for multiple users interacting with the system and the document in the same workflow step
The ability to act on changes regardless of which role or actor made them and regardless of the order in which they were inserted
The ability for editors to accept or reject changes without breaking the underlying XML
The Technical Challenge for Track Changes
Underlying XML Editing Environment
Although the XML editing environment choice was important, there were limitations that needed to be overcome through custom development:
No easy way for authors to add more complex XML structures
Change tracking: limited to insertions and deletions
XML Editing Limitations
[Diagram: editing steps 1 to 6, each associated with its own XML version, XML 1 to XML 6]
Sequential Editing
[Diagram: editing steps and XML versions XML 1 to XML 9, with parallel branches brought together by a Merge step]
Parallel Editing
The Longest Common Subsequence Problem
We may get an accurate representation of the difference between the original and the edited versions of the text, but it may not tell us what the author actually did.
Limitations of XML Differencing Approach
Before: “I say cheese to you”
After: “I say oh pleeze to you”
Diff: “I say coh pleesze to you” (deleted: “c”, “s”; inserted: “o”, “ pl”, “z”)
Limitations of XML Differencing Approach
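The jumbled diff in this example can be reproduced with a character-level comparison. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in for the differencing tool (an assumption: the slides do not name one, and `SequenceMatcher` matches longest contiguous blocks rather than computing a strict longest common subsequence). `char_diff` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def char_diff(before: str, after: str) -> tuple[str, str]:
    """Return (flat, marked) renderings of a character-level diff.

    flat   : deleted and inserted characters interleaved with no markup
    marked : the same sequence with [-deleted-] and {+inserted+} markers
    """
    flat, marked = [], []
    matcher = SequenceMatcher(None, before, after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        deleted, inserted = before[i1:i2], after[j1:j2]
        if tag == "equal":
            flat.append(deleted)
            marked.append(deleted)
            continue
        if deleted:
            flat.append(deleted)
            marked.append(f"[-{deleted}-]")
        if inserted:
            flat.append(inserted)
            marked.append(f"{{+{inserted}+}}")
    return "".join(flat), "".join(marked)


flat, marked = char_diff("I say cheese to you", "I say oh pleeze to you")
print(flat)    # I say coh pleesze to you
print(marked)  # I say [-c-]{+o+}h{+ pl+}ee[-s-]{+z+}e to you
```

The result is an accurate minimal difference, yet it reads as three unrelated micro-edits rather than the single word replacement the author actually made.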
Loss of Granularity
Attempting to overcome this problem by applying a cleanup parameter or otherwise grouping changes can lead to a loss in the granularity of changes.
Changes within changes will not be marked individually as changes, which is a problem if they should be dealt with discretely.
Limitations of XML Differencing Approach
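The effect of grouping can be sketched with a simple cleanup pass over a character-level diff. This is an illustration only: `grouped_changes` and its `min_equal` threshold are hypothetical names, and Python's `difflib.SequenceMatcher` stands in for the differencing tool, which the slides do not identify.

```python
from difflib import SequenceMatcher


def grouped_changes(before: str, after: str, min_equal: int = 3):
    """List (deleted, inserted) pairs, merging edits separated by short matches.

    Any unchanged run shorter than min_equal that follows an edit is absorbed
    into that edit, so nearby edits collapse into one coarse change.
    """
    ops = SequenceMatcher(None, before, after, autojunk=False).get_opcodes()
    merged = []
    for tag, i1, i2, j1, j2 in ops:
        short_equal = tag == "equal" and (i2 - i1) < min_equal
        previous_is_edit = bool(merged) and merged[-1][0] != "equal"
        if previous_is_edit and (tag != "equal" or short_equal):
            # Extend the previous edit to cover this span as well.
            _, start_a, _, start_b, _ = merged[-1]
            merged[-1] = ("replace", start_a, i2, start_b, j2)
        else:
            merged.append((tag, i1, i2, j1, j2))
    return [
        (before[i1:i2], after[j1:j2])
        for tag, i1, i2, j1, j2 in merged
        if tag != "equal"
    ]


before, after = "I say cheese to you", "I say oh pleeze to you"
print(grouped_changes(before, after, min_equal=1))  # no grouping: three edits
print(grouped_changes(before, after, min_equal=3))  # grouped: one coarse edit
```

With grouping, the three small edits become one replacement of “chees” with “oh pleez”, which is easier to read but can no longer be accepted or rejected piece by piece.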
Loss of Granularity
Before: <p>hello world</p>
After: <p><italic>hello silly italic world</italic></p>
Limitations of XML Differencing Approach
Custom Elements
Use information from event handlers in SDL LiveContent Create to create custom track changes elements and attributes.
Edits can be performed in a number of ways, so we needed different elements to capture them.
Custom Elements
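As a rough illustration of what a custom track-changes element might carry, the sketch below builds one with Python's standard `xml.etree.ElementTree`. Everything named here is an assumption: the `tc-insert`, `tc-delete`, and `tc-format` element names, the `id`, `author`, and `role` attributes, and the `tracked_element` helper are hypothetical, and no SDL LiveContent Create API is used or implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical edit kinds; the slides name insertions, deletions, and formatting.
EDIT_KINDS = {"insert", "delete", "format"}


def tracked_element(kind: str, text: str, author: str, role: str, change_id: str) -> ET.Element:
    """Wrap edited text in a hypothetical tc-<kind> element.

    The attributes record who made the edit and in which role, so a reviewer
    can later act on it regardless of who made it.
    """
    if kind not in EDIT_KINDS:
        raise ValueError(f"unknown edit kind: {kind!r}")
    element = ET.Element(f"tc-{kind}", {"id": change_id, "author": author, "role": role})
    element.text = text
    return element


edit = tracked_element("insert", "oh pleeze", "A. Author", "corresponding-author", "c1")
print(ET.tostring(edit, encoding="unicode"))
```

Because each edit is recorded as its own element at the moment it happens, nothing has to be inferred afterwards by comparing document versions.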
Custom Elements
Example of formatting.
Custom Elements
[Diagram: editing steps 1 to 9 applied across XML versions XML 1 to XML 5]
Random Access Sequential Editing
Solved by designing and developing:
Comprehensive change tracking
Rule engine that protects the structure of the XML by governing the order of acceptance and rejection of edits
Denormalization of nested elements to granularly expose all edits
Change Tracking Solution
Rule Engine for Accepting/Rejecting Changes
Format
Delete
Insert
ProofExpress – Review Mode
Problem: Order of decision making…
If the system does not enforce an order of decision making, then the process may break the XML.
Rule Engine for Accepting/Rejecting Changes
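One way to picture an enforced order of decision making is as a precondition checked before every accept or reject. The sketch below is not the ArticleExpress rule engine. It assumes, purely for illustration, that a change nested inside another change may only be decided after the enclosing change has been decided; the `Change`, `can_decide`, and `decide` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    change_id: str
    kind: str                          # "insert", "delete", or "format"
    parent: Optional["Change"] = None  # enclosing change, if any
    decision: Optional[str] = None     # "accept", "reject", or None (pending)


def can_decide(change: Change) -> bool:
    """Assumed rule: an enclosing change must be resolved first."""
    return change.parent is None or change.parent.decision is not None


def decide(change: Change, decision: str) -> None:
    """Record a decision, refusing any that would violate the assumed order."""
    if decision not in ("accept", "reject"):
        raise ValueError(f"unknown decision: {decision!r}")
    if not can_decide(change):
        raise RuntimeError(
            f"{change.change_id} is nested in {change.parent.change_id}, "
            "which must be decided first"
        )
    change.decision = decision


outer = Change("c1", "insert")
inner = Change("c2", "format", parent=outer)
print(can_decide(inner))   # False: the enclosing insertion is still pending
decide(outer, "accept")
print(can_decide(inner))   # True
```

If the real rules resolve inner changes first, the same structure applies with the check reversed; the point is only that the system, not the user, guards the order.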
Order of Decision Making Solution
XML Denormalization
For the accept/reject rule engine to work properly, the track changes tool must show how changed nodes are nested within each other.
As per the rule engine, insertions and deletions should always be the outside changes when they occur in relation to changes in formatting.
XML Denormalization
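Keeping insertions and deletions on the outside can be sketched as a small tree rewrite. The example below uses Python's standard `xml.etree.ElementTree` and handles only the simplest case, a formatting-change wrapper whose sole child is a tracked insertion or deletion. The element names `tc-insert`, `tc-delete`, and `tc-format` are hypothetical; the slides do not show the actual custom elements.

```python
import xml.etree.ElementTree as ET

OUTER_KINDS = {"tc-insert", "tc-delete"}  # hypothetical names
FORMAT_KINDS = {"tc-format"}              # hypothetical name


def hoist_tracked(parent: ET.Element) -> None:
    """Rewrite <tc-format><tc-insert>x</tc-insert></tc-format>
    as <tc-insert><tc-format>x</tc-format></tc-insert>, recursively."""
    for index, child in enumerate(list(parent)):
        hoist_tracked(child)
        grandchildren = list(child)
        if child.tag not in FORMAT_KINDS or len(grandchildren) != 1:
            continue
        tracked = grandchildren[0]
        has_loose_text = (child.text or "").strip() or (tracked.tail or "").strip()
        if tracked.tag not in OUTER_KINDS or has_loose_text:
            continue
        # Build the swapped pair: tracked change outside, formatting inside.
        new_outer = ET.Element(tracked.tag, dict(tracked.attrib))
        new_inner = ET.SubElement(new_outer, child.tag, dict(child.attrib))
        new_inner.text = tracked.text
        new_inner.extend(list(tracked))
        new_outer.tail = child.tail
        parent.remove(child)
        parent.insert(index, new_outer)


root = ET.fromstring(
    '<p>say <tc-format type="italic"><tc-insert id="c1">hi</tc-insert></tc-format> now</p>'
)
hoist_tracked(root)
print(ET.tostring(root, encoding="unicode"))
```

A production version would also need to split formatting wrappers that only partly overlap an insertion or deletion, which this sketch leaves untouched.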
XML Denormalization
Formatting nodes
XML Denormalization
ArticleExpress Demo
Questions?