The Interoperability Imperative: How Publishing Technologies Continue to Converge

Bill Kasdorf, VP and Principal Consultant, Apex Content Solutions

Member of IDPF Board, EPUB 3 WG, W3C DPUB IG

Upload: apex-covantage

Post on 15-Apr-2017


IDPF+W3C

1. A Brief History of the Convergence

Books in Browsers 2014: “Bridging the Web and Digital Publishing”

Unofficial Draft 30 June 2015: “EPUB+WEB” (AKA “EPUB-WEB”)

GitHub, 24 September 2015: “Portable Web Documents for the OWP”

W3C Working Draft, 15 October 2015: “Portable Web Publications for the OWP”

W3C Editors Draft, 28 November 2016: “Web Publications for the OWP”

(a work in process)

(P)WP

Recent Realization: First we need to define a Web Publication!

Meaning an arbitrarily extensive and complex collection of resources on the web (web pages, CSS, fonts, images, media, scripts, etc.) that has an identity, that can be referenced, etc.

Whether/how it’s packaged is a separate issue.

“PUBLICATION” ≠ “DOCUMENT” but “BUNCH OF STUFF ON THE WEB” might = “PUBLICATION”

(P)WP

The “P” is coming to mean “packaged” more than “portable.”
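
The working definition above (a collection of web resources with its own identity, with packaging treated as a separate question) can be pictured as a small data structure. This is only an illustrative sketch: the class names, fields, and example.org URLs are hypothetical and are not taken from any W3C draft.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One constituent resource of a publication (hypothetical model)."""
    url: str
    media_type: str


@dataclass
class WebPublication:
    """A collection of web resources that has a single identity of its own."""
    identifier: str                       # URL that names the whole publication
    resources: list[Resource] = field(default_factory=list)
    packaged: bool = False                # packaging is separate from identity

    def add(self, url: str, media_type: str) -> None:
        self.resources.append(Resource(url, media_type))

    def package(self) -> "WebPublication":
        """Return the same publication in a packaged state; identity is unchanged."""
        return WebPublication(self.identifier, list(self.resources), packaged=True)


pub = WebPublication("https://example.org/pubs/interop")
pub.add("https://example.org/pubs/interop/ch1.html", "text/html")
pub.add("https://example.org/pubs/interop/style.css", "text/css")
offline = pub.package()
print(offline.identifier == pub.identifier, offline.packaged)
```

The point of the sketch is that `package()` changes the state of the publication but not its identifier or its list of resources.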

[Infographic: “STM Tech Trends: Outlook 2020. The Technology Floodgates Are Open.” Five numbered panels: 1 Publishing, 2 Users, 3 Research, 4 Privacy and Security, 5 Technology. Headline themes: User-centered Publishing delivers Precision Information; The Machine is the New Reader; Science as a Social Machine; Data Privacy requires a Web of Trust.]

It’s not just about text.

And almost all of this depends on Web technologies.

Minor update to HTML 5 on 1 Nov. 2016.

Mostly fine-tuning to align with actual practice.

A few new elements, like <details> and <summary>, for info users can choose whether or not to read.

Removed some features, mostly very technical.

A few changes, like <figcaption> anywhere in <figure>.

HTML 5.2 is being worked on, due late 2017.

This will continue to evolve. This is a good thing!

HTML 5.1
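
As an illustration of the elements mentioned on this slide, here is a minimal sketch of a `<details>`/`<summary>` disclosure block and a `<figure>` whose `<figcaption>` sits between its images rather than first or last. The content strings, file names, and the `TagCollector` helper are made up for the example; Python is used only to hold the markup and inspect its nesting.

```python
from html.parser import HTMLParser

SNIPPET = """
<details>
  <summary>Methods (optional reading)</summary>
  <p>Samples were collected weekly.</p>
</details>
<figure>
  <img src="plot-a.png" alt="Plot A">
  <figcaption>Two views of the same data set.</figcaption>
  <img src="plot-b.png" alt="Plot B">
</figure>
"""


class TagCollector(HTMLParser):
    """Records (parent, child) tag pairs so the nesting can be inspected."""

    VOID = {"img"}  # void elements never get an end tag

    def __init__(self):
        super().__init__()
        self.stack = []
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        self.pairs.append((parent, tag))
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


collector = TagCollector()
collector.feed(SNIPPET)
print(collector.pairs)
```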

Working toward print-quality rendering.

“Grid” and “flexbox” for complex layouts.

CSS variables and the “calc” function for adaptability.

New font features for sophisticated display.

MS is working on a new spec for table behavior; goal is interoperability among browsers.

These will help make complex (STM!) content more reliable and responsive.

CSS is modular: ongoing progress.

CSS

The Web Publication Vision: ONE PUBLICATION FOR BOTH ONLINE AND OFFLINE USE.

The same content in two different “states”:

Offline, packaged or cached;

Online, with all essential resources linked.

A canonical URL that leads to both.

Wouldn’t it be great if there were no difference between an online publication and an EPUB?

OMG! OMG! OMG! Please don’t kill EPUB!

Calm down! Nobody’s taking EPUB away!!

EPUB 3 has become essential to the publishing ecosystem

E-Readers: It’s the “master format” for virtually all systems.

Accessibility: It’s the format for interchange of accessible content.

Education: Platforms are built on the EPUB for Education profile.

Not Just Books: It’s used for all kinds of publications.

Global: Widely adopted in US, EU, Far East, Israel.

We want to avoid two competing specs. These need to be the same thing.

Could be one master spec, or a layered spec with “profiles”:

e.g., PWP as a profile of a WP (a type of WP), and “EPUB 4” in turn as a profile of PWP (like EPUB for Education is for EPUB), a type of PWP requiring more predictability, accessibility, archivability.

EPUB 4 vs. (P)WP
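
One way to picture the layered option is as a chain of profiles, each inheriting its parent's requirements and adding stricter ones. The sketch below is purely illustrative: the `Profile` class and the requirement names are invented placeholders that echo the slide's wording, and neither PWP nor “EPUB 4” is defined this way in any published spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """A spec layer: its own requirements plus everything its parent requires."""
    name: str
    own_requirements: frozenset
    parent: Optional["Profile"] = None

    def requirements(self) -> frozenset:
        inherited = self.parent.requirements() if self.parent else frozenset()
        return inherited | self.own_requirements

    def is_profile_of(self, other: "Profile") -> bool:
        """True if `other` is this profile or one of its ancestors."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Placeholder requirement names, chosen only to mirror the slide's wording.
WP = Profile("WP", frozenset({"identity", "referenceable"}))
PWP = Profile("PWP", frozenset({"packaged"}), parent=WP)
EPUB4 = Profile("EPUB 4", frozenset({"predictable", "accessible", "archivable"}), parent=PWP)

print(sorted(EPUB4.requirements()))
print(EPUB4.is_profile_of(WP), WP.is_profile_of(EPUB4))
```

Under this reading, anything that satisfies the “EPUB 4” layer automatically satisfies PWP and WP, which is what keeps the layers from becoming competing specs.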

This is why we’re working on combining the IDPF into the W3C.

Publishing Business Group
Provides a formal voice for publishing in the W3C.
Dues are comparable to IDPF dues. IDPF members are automatically members for two years at current IDPF dues.

Publishing Working Group
Full participation in W3C standards at W3C dues.

EPUB 3 Community Group
Free to all; maintains EPUB 3.

Here’s the plan.

People are afraid publishers will become “lost” in the W3C.

This plan is designed to ensure that publishers have an even greater presence and influence and can participate at costs that even small publishers can bear.

In the meantime, the evolution of EPUB 3 continues.

Tightens up EPUB 3 without breaking it.

New spec format, better integrated.

Deprecates unused features of EPUB 3.0.

Clarifies/extends support for remote resources (e.g., metadata, fonts, datasets).

Improves CSS behavior between author/RS/user.

Improved and stricter accessibility support.

Expected to advance to member vote as final Recommended Spec on Thursday.

EPUB 3.1

Better alignment with OWP.

Undated references to HTML & SVG to stay in synch.

No EPUB CSS profile; uses CSS WG “official definition.”

Some metadata improvements.

Prioritizes linked bibliographic metadata records.

Deprecates @refines attribute and adds more explicit attributes for specific functionality.

Final notice on NCX.

Slated for removal in next major revision of EPUB.

EPUB 3.1

EPUB in Microsoft Edge
Read EPUBs natively just as you can with PDFs.

HTML5-based version of Woodwing
Magazine industry’s leading production system moves from proprietary to OWP software.

VitalSource Content Studio
Easy cloud-based creation of complex educational media as EPUB 3.

IBM adopts EPUB 3 for all documents
Significant move away from PDF.

EPUB is Not Just for Books!

Accessibility

2. Mainstreaming Accessibility

Separate spec devoted to accessibility.

Clear guidelines to enable certification of accessibility and discovery of accessible features in an EPUB.

Based on WCAG 2.0: A is must, AA is recommended.

Adds publication-specific requirements.

Requires accessibility-specific metadata.

Techniques document provides “how to do it” advice.

Applicable and referenceable by any version of EPUB and other specs too.

EPUB Accessibility 1.0

Categories of compliance.

“Discovery-Enabled”: Just metadata.

“Accessible”: Metadata + WCAG 2.0 + EPUB requirements.

“Optimized”: Metadata + specific features.

Metadata aligned with schema.org.

accessMode (textual, visual, auditory, tactile).

accessibilityFeature (what features does it have).

accessibilityHazard (e.g., flashing can cause seizures).

accessibilitySummary (human-readable explanation).

accessModeSufficient (e.g., text + alt text = textual).

EPUB Accessibility 1.0
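
To make the metadata properties concrete, the following sketch emits them as EPUB package `meta` elements using the `schema:` prefix and then checks that all five properties named on the slide are present. The values and summary text are invented sample data, the function names are hypothetical, and the XML is simplified (it omits the package namespace), so treat it as a sketch and consult the EPUB Accessibility spec and its techniques document for the authoritative rules.

```python
import xml.etree.ElementTree as ET

# Sample values for one imaginary publication (illustrative only).
ACCESSIBILITY_METADATA = [
    ("schema:accessMode", "textual"),
    ("schema:accessMode", "visual"),
    ("schema:accessModeSufficient", "textual"),
    ("schema:accessibilityFeature", "alternativeText"),
    ("schema:accessibilityHazard", "noFlashingHazard"),
    ("schema:accessibilitySummary", "All images have text alternatives."),
]

EXPECTED_PROPERTIES = {
    "schema:accessMode",
    "schema:accessModeSufficient",
    "schema:accessibilityFeature",
    "schema:accessibilityHazard",
    "schema:accessibilitySummary",
}


def build_metadata(pairs):
    """Return a <metadata> element holding one <meta property="..."> per pair."""
    metadata = ET.Element("metadata")
    for prop, value in pairs:
        meta = ET.SubElement(metadata, "meta", {"property": prop})
        meta.text = value
    return metadata


def missing_properties(metadata):
    """List the properties named on the slide that are absent from the metadata."""
    present = {m.get("property") for m in metadata.findall("meta")}
    return sorted(EXPECTED_PROPERTIES - present)


metadata = build_metadata(ACCESSIBILITY_METADATA)
print(ET.tostring(metadata, encoding="unicode"))
print(missing_properties(metadata))
```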

Annotations

3. Progress on Interoperable Annotations.

W3C Web Annotation Working Group

Web Annotation Data Model

Web Annotation Vocabulary

Web Annotation Protocol

Provide interoperable “data structures” for annotations:

Can exchange annotations between systems.

Can store annotations on an annotation server.

Put annotations on text, images, videos, etc.

Final recommendations to pub in February.

Annotations
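
As a rough illustration of what such an interoperable data structure looks like, the sketch below builds one annotation in the JSON-LD shape described by the Web Annotation Data Model: a body (the comment) attached to a target (a quoted passage on a page). The example.org URLs and the `make_annotation` helper are hypothetical; only the property names follow the data model.

```python
import json


def make_annotation(anno_id, target_url, exact_text, comment):
    """Build a Web Annotation as a plain dict ready for JSON serialization."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact_text,
            },
        },
    }


annotation = make_annotation(
    "https://example.org/annotations/1",      # hypothetical annotation server URL
    "https://example.org/articles/42.html",   # hypothetical annotated page
    "interoperable",
    "Key term for this talk.",
)
serialized = json.dumps(annotation, indent=2)
print(serialized)

# A second system can read the exchanged JSON back without loss.
restored = json.loads(serialized)
```

Because the annotation is stored apart from the page it points at, the same structure can be exchanged between systems or kept on an annotation server.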

Annotating All Knowledge

Coalition of over 70 scholarly publishers, platforms, libraries, and technology organizations.

Open source, standards-based, supports key formats (HTML, PDF, EPUB, images, video and data).

Ambitious 3-year timeline:

Pilots at JSTOR, arXiv, eLife, etc.

Force11: Community Platform, Working Group.

Annotations

Ambitious AAK 3-Year Timeline

Year 1, Design & Build: Interview users, run experiments, gather requirements. Discussions w/ key platforms. Ship standards. Write software.

Year 2, Deploy: Make annotation available with articles, books and other media objects.

Year 3, Market: Drive adoption through partnerships and targeted programs.

(Thanks to Maryann E. Martone, Hypothes.is)

Annotations

Images

4. Coming Soon: Interoperable Images.

IIIF: The International Image Interoperability Framework

A Community . . .
Over 600 national libraries, research institutions, museums, tech firms, aggregators, and projects.

. . . that creates APIs . . .
Image (the pixels); Presentation (human readable info); Authentication (almost finished); Search (to come).

. . . that it uses to create interoperable services.
Focusing on providing a good UX.

Interoperable Images

Image API
URL and identifier for image; can express regions, size, mirror, rotation, rights info, multiple versions.

Presentation API
Structure, properties (labels, rights, technical info, links), can associate transcription, translation, commentary, etc. with regions of an image.

Working on Audio, Video; 3D in future
All based on Web technologies, including Annotations.

Interoperable Images
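
The Image API expresses region, size, mirroring, and rotation directly in the URL path, in the order identifier/region/size/rotation/quality.format. The helper below is a minimal sketch of composing such a URL. The function name, its defaults, and the server address are assumptions for illustration, and the set of accepted values (for example `max` versus `full` for size) depends on the API version a server implements.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation=0, mirror=False, quality="default", fmt="jpg"):
    """Compose {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    rot = f"{'!' if mirror else ''}{rotation}"   # a leading "!" requests mirroring
    ident = quote(identifier, safe="")           # identifiers are URL-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rot}/{quality}.{fmt}"


base = "https://images.example.org/iiif"  # hypothetical image server

whole_image = iiif_image_url(base, "page-001")
detail = iiif_image_url(base, "ms/12", region="100,200,400,300",
                        size="200,", rotation=90, mirror=True)
print(whole_image)
print(detail)
```

Because every request is just a URL, any viewer that understands the pattern can ask any compliant server for the same crop or rotation.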

WEB PUBLICATIONS
Web Publications for the Open Web Platform: https://w3c.github.io/dpub-pwp/
HTML 5.1: https://www.w3.org/TR/html/
EPUB 3.1: http://www.idpf.org/epub/31/spec/epub-spec.html

ACCESSIBILITY
EPUB Accessibility 1.0: http://www.idpf.org/epub/a11y/
EPUB Accessibility Techniques: http://www.idpf.org/epub/a11y/techniques/techniques.html

WEB ANNOTATIONS
Data Model: https://www.w3.org/TR/annotation-model/
Vocabulary: https://www.w3.org/TR/2016/CR-annotation-vocab-20160906/
Protocol: https://www.w3.org/TR/2016/CR-annotation-protocol-20160906/

ANNOTATING ALL KNOWLEDGE
https://hypothes.is/annotating-all-knowledge/

INTERNATIONAL IMAGE INTEROPERABILITY FRAMEWORK
http://iiif.io/

Resources

Thanks!

Bill Kasdorf

+1 734 904 6252

@BillKasdorf