the interoperability imperative
TRANSCRIPT
Bill Kasdorf
VP and Principal Consultant, Apex Content Solutions
Member of IDPF Board, EPUB 3 WG, W3C DPUB IG
The Interoperability Imperative: How Publishing Technologies Continue to Converge
Books in Browsers 2014: “Bridging the Web and Digital Publishing”
Unofficial Draft 30 June 2015: “EPUB+WEB” (AKA “EPUB-WEB”)
GitHub, 24 September 2015: “Portable Web Documents for the OWP”
W3C Working Draft, 15 October 2015: “Portable Web Publications for the OWP”
W3C Editor’s Draft, 28 November 2016: “Web Publications for the OWP”
(a work in process)
(P)WP
Recent Realization: First we need to define a Web Publication!
Meaning an arbitrarily extensive and complex collection of resources on the web (web pages, CSS, fonts, images, media, scripts, etc.) that has an identity, that can be referenced, etc.
Whether/how it’s packaged is a separate issue.
“PUBLICATION” ≠ “DOCUMENT”
but “BUNCH OF STUFF ON THE WEB” might = “PUBLICATION”
The “P” is coming to mean
“packaged” more than “portable”
[Infographic: STM Tech Trends: Outlook 2020, “The Technology Floodgates Are Open.” Sections: Publishing, Users, Research, Privacy and Security, Technology. Themes: User-centered Publishing delivers Precision Information; The Machine is the New Reader; Science as a Social Machine; Data Privacy requires a Web of Trust.]
It’s not just about text.
And almost all of this depends on Web
technologies.
Minor update to HTML 5 on 1 Nov. 2016.
Mostly fine-tuning to align with actual practice.
A few new elements, like <details> and <summary>, for info users can choose whether or not to read.
Removed some features, mostly very technical.
A few changes, like <figcaption> anywhere in <figure>.
HTML 5.2 is being worked on, due late 2017.
This will continue to evolve. This is a good thing!
HTML 5.1
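As an illustration of the two HTML 5.1 changes named above, here is a minimal Python sketch that generates the markup with the standard library. The file names, captions, and text are hypothetical placeholders, not taken from the talk.

```python
import xml.etree.ElementTree as ET

# A <figure> whose <figcaption> sits between two images. HTML 5.1 allows the
# caption anywhere inside the figure. All content here is made up.
figure = ET.Element("figure")
ET.SubElement(figure, "img", src="cell-before.png", alt="Cell before treatment")
caption = ET.SubElement(figure, "figcaption")
caption.text = "Figure 1. Cell before (top) and after (bottom) treatment."
ET.SubElement(figure, "img", src="cell-after.png", alt="Cell after treatment")

# A <details> block: the <summary> is the visible label, and the remaining
# content is shown only if the reader chooses to expand it.
details = ET.Element("details")
summary = ET.SubElement(details, "summary")
summary.text = "Methods note"
para = ET.SubElement(details, "p")
para.text = "Samples were imaged at 40x magnification."

# Serialize as HTML (void elements such as <img> get no closing tag).
figure_html = ET.tostring(figure, encoding="unicode", method="html")
details_html = ET.tostring(details, encoding="unicode", method="html")
print(figure_html)
print(details_html)
```

The same markup can of course be written by hand; generating it here only keeps the example self-checking.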
Working toward print-quality rendering.
“Grid” and “flexbox” for complex layouts.
CSS variables and the “calc” function for adaptability.
New font features for sophisticated display.
MS is working on a new spec for table behavior; goal is interoperability among browsers.
These will help make complex (STM!) content more reliable and responsive.
CSS is modular: ongoing progress.
CSS
The Web Publication Vision: ONE PUBLICATION FOR BOTH ONLINE AND OFFLINE USE.
The same content in two different “states”:
Offline, packaged or cached;
Online, with all essential resources linked.
A canonical URL that leads to both.
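The idea of one publication in two states can be sketched as a small data structure. This is purely illustrative: the class name, fields, and method below are hypothetical and do not come from any W3C draft. It assumes a publication is identified by a canonical URL, lists its resources as relative paths, and may have some of them cached locally.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin


@dataclass
class WebPublication:
    """Hypothetical model: one identity, many resources, two states."""
    canonical_url: str                              # the publication's identity
    resources: list = field(default_factory=list)   # relative resource paths
    cached: dict = field(default_factory=dict)      # path -> local copy

    def locate(self, path: str, online: bool) -> str:
        """Return where to read a resource from in the current state."""
        if path not in self.resources:
            raise ValueError(f"{path} is not part of this publication")
        if online:
            # Online state: the resource is linked from the canonical URL.
            return urljoin(self.canonical_url, path)
        if path in self.cached:
            # Offline state: fall back to the packaged or cached copy.
            return self.cached[path]
        raise LookupError(f"{path} is not available offline")


pub = WebPublication(
    canonical_url="https://example.org/pubs/report/",
    resources=["index.html", "style.css"],
    cached={"index.html": "/cache/report/index.html"},
)
print(pub.locate("index.html", online=True))
print(pub.locate("index.html", online=False))
```

How the offline copy is produced (packaging versus caching) is deliberately left out, since the talk treats packaging as a separate issue.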
EPUB 3 has become essential to the publishing ecosystem
E-Readers: It’s the “master format” for virtually all systems.
Accessibility: It’s the format for interchange of accessible content.
Education: Platforms are built on the EPUB for Education profile.
Not Just Books: It’s used for all kinds of publications.
Global: Widely adopted in US, EU, Far East, Israel.
We want to avoid two competing specs. These need to be the same thing.
Could be one master spec, or a layered spec with “profiles”:
e.g., PWP as a profile of a WP (a type of WP), and “EPUB 4” in turn as a profile of PWP (like EPUB for Education is for EPUB), a type of PWP requiring more predictability, accessibility, archivability.
EPUB 4 vs. (P)WP
This is why we’re working on
combining the IDPF into the W3C.
Publishing Business Group
Provides a formal voice for publishing in the W3C.
Dues are comparable to IDPF dues. IDPF members are automatically members for two years at current IDPF dues.
Publishing Working Group
Full participation in W3C standards at W3C dues.
EPUB 3 Community Group
Free to all; maintains EPUB 3.
Here’s the plan.
People are afraid publishers will become “lost” in the W3C.
This plan is designed to ensure that publishers have an even greater presence and influence and can participate at costs that even small publishers can bear.
Tightens up EPUB 3 without breaking it.
New spec format, better integrated.
Deprecates unused features of EPUB 3.0.
Clarifies/extends support for remote resources (e.g., metadata, fonts, datasets).
Improves CSS behavior between author, reading system (RS), and user.
Improved and stricter accessibility support.
Expected to advance to member vote as final Recommended Spec on Thursday.
EPUB 3.1
Better alignment with OWP.
Undated references to HTML & SVG to stay in sync.
No EPUB CSS profile; uses CSS WG “official definition.”
Some metadata improvements.
Prioritizes linked bibliographic metadata records.
Deprecates @refines attribute and adds more explicit attributes for specific functionality.
Final notice on NCX. Slated for removal in next major revision of EPUB.
EPUB 3.1
EPUB in Microsoft Edge
Read EPUBs natively just as you can with PDFs.
HTML5-based version of Woodwing
Magazine industry’s leading production system moves from proprietary to OWP software.
VitalSource Content Studio
Easy cloud-based creation of complex educational media as EPUB 3.
IBM adopts EPUB 3 for all documents
Significant move away from PDF.
EPUB is Not Just for Books!
Separate spec devoted to accessibility.
Clear guidelines to enable certification of accessibility and discovery of accessible features in an EPUB.
Based on WCAG 2.0: Level A is required, Level AA is recommended.
Adds publication-specific requirements.
Requires accessibility-specific metadata.
Techniques document provides “how to do it” advice.
Applicable to, and referenceable by, any version of EPUB and other specs too.
EPUB Accessibility 1.0
Categories of compliance.
“Discovery-Enabled”: Just metadata.
“Accessible”: Metadata + WCAG 2.0 + EPUB requirements.
“Optimized”: Metadata + specific features.
Metadata aligned with schema.org.
accessMode (textual, visual, auditory, tactile).
accessibilityFeature (what features it has).
accessibilityHazard (e.g., flashing can cause seizures).
accessibilitySummary (human-readable explanation).
accessModeSufficient (e.g., text + alt text = textual).
EPUB Accessibility 1.0
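To make the schema.org-aligned metadata listed above concrete, the following Python sketch generates the kind of `<meta property="schema:…">` entries an EPUB package document can carry. The property names come from the slide; the specific values and the summary text are illustrative assumptions for a hypothetical title, and namespace handling for the package document is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative accessibility metadata for a hypothetical EPUB.
# Property names follow the slide; the values are example choices.
ACCESSIBILITY_METADATA = [
    ("schema:accessMode", "textual"),
    ("schema:accessMode", "visual"),
    ("schema:accessModeSufficient", "textual"),
    ("schema:accessibilityFeature", "alternativeText"),
    ("schema:accessibilityFeature", "tableOfContents"),
    ("schema:accessibilityHazard", "noFlashingHazard"),
    ("schema:accessibilitySummary",
     "All images have text alternatives, so the book can be read as text only."),
]


def build_metadata(pairs):
    """Wrap (property, value) pairs in <meta> elements under <metadata>."""
    metadata = ET.Element("metadata")
    for prop, value in pairs:
        meta = ET.SubElement(metadata, "meta", property=prop)
        meta.text = value
    return metadata


metadata_xml = ET.tostring(build_metadata(ACCESSIBILITY_METADATA), encoding="unicode")
print(metadata_xml)
```

Repeating a property, as with `accessMode` here, is how a publication declares more than one value.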
W3C Web Annotation Working Group
Web Annotation Data Model
Web Annotation Vocabulary
Web Annotation Protocol
Provide interoperable “data structures” for annotations:
Can exchange annotations between systems.
Can store annotations on an annotation server.
Put annotations on text, images, videos, etc.
Final recommendations to be published in February.
Annotations
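The interoperable “data structure” mentioned above is a JSON-LD document. Below is a minimal Python sketch of one annotation in the shape described by the Web Annotation Data Model: a textual comment attached to a quoted passage in a web page. The helper name, identifiers, URLs, and text are hypothetical.

```python
import json


def make_annotation(annotation_id, target_url, exact_text, comment):
    """Build a simple annotation: a plain-text comment on a quoted passage."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,
            # Select the annotated passage by quoting it exactly.
            "selector": {"type": "TextQuoteSelector", "exact": exact_text},
        },
    }


annotation = make_annotation(
    annotation_id="https://example.org/annotations/1",
    target_url="https://example.org/articles/42.html",
    exact_text="publishing technologies continue to converge",
    comment="Compare with the EPUB and Web Publications work.",
)

# Serializing to JSON is what lets one system hand the annotation to another.
payload = json.dumps(annotation, indent=2)
print(payload)
```

Because the payload is plain JSON, the receiving system only needs to understand the shared model, not the software that created the annotation.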
Annotating All Knowledge
Coalition of over 70 scholarly publishers, platforms, libraries, and technology organizations.
Open source, standards-based, supports key formats (HTML, PDF, EPUB, images, video and data).
Ambitious 3-year timeline:
Pilots at JSTOR, arXiv, eLife, etc.
Force11: Community Platform, Working Group.
Annotations
Ambitious AAK 3-Year Timeline
Year 1, Design & Build: Interview users, run experiments, gather requirements. Discussions w/ key platforms. Ship standards. Write software.
Year 2, Deploy: Make annotation available with articles, books and other media objects.
Year 3, Market: Drive adoption through partnerships and targeted programs.
(Thanks to Maryann E. Martone, Hypothes.is)
Annotations
IIIF: The International Image Interoperability Framework
A Community . . . Over 600 national libraries, research institutions, museums, tech firms, aggregators, and projects.
. . . that creates APIs . . . Image (the pixels); Presentation (human readable info); Authentication (almost finished); Search (to come).
. . . that it uses to create interoperable services. Focusing on providing a good UX.
Interoperable Images
Image API
URL and identifier for image; can express regions, size, mirror, rotation, rights info, multiple versions.
Presentation API
Structure, properties (labels, rights, technical info, links), can associate transcription, translation, commentary, etc. with regions of an image.
Working on Audio, Video; 3D in future
All based on Web technologies, including Annotations.
Interoperable Images
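The Image API expresses region, size, mirroring, and rotation directly in the URL path. The Python sketch below assembles such a URL following the `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` pattern of the IIIF Image API. The function name, server address, and identifier are hypothetical, and the default size keyword `full` assumes version 2 of the API (version 3 uses `max` instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, mirror=False, quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL from its path segments."""
    # A leading "!" in the rotation segment asks for a mirrored image.
    rotation_segment = f"{'!' if mirror else ''}{rotation}"
    # The identifier must be URL-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return (f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/"
            f"{rotation_segment}/{quality}.{fmt}")


# Whole image, unmodified.
full_url = iiif_image_url("https://example.org/iiif", "page-001")

# A 400x300 pixel region starting at (100, 200), scaled to 200 pixels wide,
# mirrored and rotated 90 degrees.
detail_url = iiif_image_url(
    "https://example.org/iiif", "page-001",
    region="100,200,400,300", size="200,", rotation=90, mirror=True,
)
print(full_url)
print(detail_url)
```

Because every compliant server understands the same path segments, a viewer can request the same crop from any institution's images without custom integration.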
WEB PUBLICATIONS
Web Publications for the Open Web Platform: https://w3c.github.io/dpub-pwp/
HTML 5.1
https://www.w3.org/TR/html/
EPUB 3.1
http://www.idpf.org/epub/31/spec/epub-spec.html
ACCESSIBILITY
EPUB Accessibility 1.0: http://www.idpf.org/epub/a11y/
EPUB Accessibility Techniques: http://www.idpf.org/epub/a11y/techniques/techniques.html
WEB ANNOTATIONS
Data Model: https://www.w3.org/TR/annotation-model/
Vocabulary: https://www.w3.org/TR/2016/CR-annotation-vocab-20160906/
Protocol: https://www.w3.org/TR/2016/CR-annotation-protocol-20160906/
ANNOTATING ALL KNOWLEDGE
https://hypothes.is/annotating-all-knowledge/
INTERNATIONAL IMAGE INTEROPERABILITY FRAMEWORK
http://iiif.io/
Resources