does dita need tags?
TRANSCRIPT
CIO Enterprise Content Center of Excellence | ©Copyright IBM Corp. 2014, 2015 1
Does DITA Need Tags?
Michael Priestley, Enterprise Content Technology Strategist, IBM CIO
Lu Ai, Program Manager, Content Standards and Structure, IBM Marketing
Carlos Evia, Director of Technical and Professional Writing, Virginia Tech
Content owners: Michael Priestley, Lu Ai, Carlos Evia
• Why lightweight DITA?
• Authoring DITA in Markdown
• Publishing DITA with JSON
• Next steps
Does DITA need tags?
Collaboration and integration depend on standards
Barriers to collaboration
Process
• Lack of governance
• Lack of feedback
Culture
• Lack of incentives for reuse
• Incentives for reinvention
Content
• Inconsistent content types
• Inconsistent classification
Technology
• Lifecycle silos tie authoring to delivery
• Formats tie content to authoring system
Can you get there with DITA?
• Yes.
• But…
• Challenges of complexity and format-dependency
• Opportunity:
• Lightweight DITA as a new OASIS standard (in development)
When enough is too much
Full DITA
Lightweight DITA
“Here’s something simple!”
“We need more features!”
“It’s too complex!”
Lightweight DITA vs Full DITA
          Full DITA                              Lightweight DITA
Topics    ~100 elements                          ~30 elements
Maps      10 elements (+30 shared with topic)    3 elements (+3 shared with topic)
<p>What elements are allowed in a paragraph?</p>
Full DITA:
dl parml fig syntaxdiagram imagemap image
lines lq note hazardstatement object ol pre
codeblock msgblock screen simpletable sl table
ul boolean cite keyword apiname option
parmname cmdname msgnum varname wintitle ph b
i sup sub tt u codeph synph filepath msgph
systemoutput userinput menucascade uicontrol q
term abbreviated-form tm xref state data
data-about foreign unknown draft-comment fn
indextermref indexterm required-cleanup
Lightweight DITA:
image
ph (phrase)
b (bold)
i (italic)
u (underline)
sup (superscript)
sub (subscript)
xref (link)
data
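One way to make the comparison concrete is to check a paragraph against the short Lightweight DITA list. The following is a minimal sketch, not part of any DITA toolkit: the function name `disallowed_children` is hypothetical, and it only inspects direct children of the paragraph, using the nine element names listed on this slide.

```python
import xml.etree.ElementTree as ET

# Element names taken from the Lightweight DITA list on this slide.
LIGHTWEIGHT_P_CHILDREN = {
    "image", "ph", "b", "i", "u", "sup", "sub", "xref", "data",
}


def disallowed_children(p_xml):
    """Return the sorted names of direct child elements of a <p>
    that are not in the Lightweight DITA list (hypothetical helper)."""
    paragraph = ET.fromstring(p_xml)
    used = {child.tag for child in paragraph}
    return sorted(used - LIGHTWEIGHT_P_CHILDREN)


# <uicontrol> is allowed in a full DITA paragraph but is not in the short list.
print(disallowed_children("<p>Press <uicontrol>OK</uicontrol> to <b>save</b>.</p>"))
```

A paragraph that uses only `b`, `i`, or `xref` would come back with an empty list.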
Lightweight DITA at OASIS
Current focus on:
• Industry scenarios
o Medical
o Software development
o Education
o Machine industries
o Marketing/ecommerce
• Format mappings
o XML
o HTML5
o Markdown
o JSON
Collaboration and integration depend on standards
Authoring DITA in Markdown
Mapping DITA to HTML5 (HDITA)
Mapping DITA to Markdown (MarkDITA)
• The “beautiful structure” of a step
• When the DTD is away…
o … the “steps” will play
1. Am I a step?
<step>
<cmd>Or am I a step?</cmd>
</step>
Overprotected child vs. the free spirit
<!ENTITY % step.content
"((%note;)*,
%cmd;,
(%choices; |
%choicetable; |
%info; |
%itemgroup; |
%stepxmp; |
%substeps; |
%tutorialinfo;)*,
(%stepresult;)? )"
>
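The content model above can be read as a sequence rule over the child elements of a step. As a rough illustration (a sketch only, not how a DTD validator works internally), the rule can be expressed as a regular expression over child element names. The helper name `is_valid_step` is hypothetical, and the sketch ignores attributes and what each child itself contains.

```python
import re
import xml.etree.ElementTree as ET

# The step.content model from the slide, written as a regex over
# space-terminated child element names.
STEP_CONTENT = re.compile(
    r"(note )*"
    r"cmd "
    r"((choices|choicetable|info|itemgroup|stepxmp|substeps|tutorialinfo) )*"
    r"(stepresult )?"
)


def is_valid_step(step_xml):
    """Check only the order and names of a step's direct children."""
    step = ET.fromstring(step_xml)
    sequence = "".join(child.tag + " " for child in step)
    return step.tag == "step" and STEP_CONTENT.fullmatch(sequence) is not None


print(is_valid_step("<step><cmd>Or am I a step?</cmd></step>"))
print(is_valid_step("<step><info>Too early</info><cmd>Do it</cmd></step>"))
```

The Markdown list item "1. Am I a step?" carries no such rule, which is the contrast the slide draws between the overprotected child and the free spirit.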
The Markdown wars
• There is no Markdown standard
• Well, there’s CommonMark
• And there’s Gruber’s original
• And many other flavors
• Should we go the Fountain way?
• Or should we embrace the simplest scenario?
# I am a topic
This is my abstract or shortdesc
- A point
- Another point
MarkDITA is not a new flavor
• {#identifier .class}
• The best/only way to talk about headings
• A solution to the <section> problem
# I am a topic
With a shortdesc paragraph
## And I am a section {.section}
Add a scoop of Pandoc header attributes
# The point of it all
I can sum it up here
I can say some more stuff
## Stuff {.section}
And so on
- This
- Is
- A List
From < > to ##
# How to do something {.task}
Introduction to something
## Prerequisites {.prereq}
Find some time to do it
## Context {.context}
Be prepared to do it
1. Plan it
2. Do it
Even better
{.section}
{.example}
{.prereq}
{.context}
{.result}
{.postreq}
@outputclass equivalents
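To show how the {#identifier .class} attributes on headings could be read by a tool, here is a minimal sketch in Python. It is not the MarkDITA prototype and not Pandoc itself: the regular expression and the function name `parse_heading` are assumptions made for illustration, and only the simple `#id` and `.class` tokens are handled.

```python
import re

# An ATX heading with an optional trailing attribute block,
# e.g. "## Prerequisites {.prereq}" or "# Title {#intro .task}".
HEADING = re.compile(r"^(#{1,6})\s+(.*?)(?:\s*\{([^}]*)\})?\s*$")


def parse_heading(line):
    """Return (level, title, identifier, classes), or None if the
    line is not a heading (hypothetical helper)."""
    match = HEADING.match(line)
    if match is None:
        return None
    hashes, title, attrs = match.groups()
    identifier, classes = None, []
    for token in (attrs or "").split():
        if token.startswith("#"):
            identifier = token[1:]
        elif token.startswith("."):
            classes.append(token[1:])
    return len(hashes), title, identifier, classes


for line in ["# How to do something {.task}", "## Prerequisites {.prereq}"]:
    print(parse_heading(line))
```

A converter could then use the first class name to pick the DITA element (task, prereq, context, and so on), with a plain topic or section as the fallback. That mapping is a design choice and is not shown here.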
• [key] works for keyref declared in a map
• conref… that’s another story
• Unless we go the GitHub way:
{{ site.data.conrefs.phrases.reusable }}
Content reuse
# Topic collection {.map}
- [First concept](c-first.md)
- [First task](t-first.md)
- [Second task](t-second.md)
Maps
Or
<map>
<title>Topic collection</title>
<topicref href="c-first.md"
format="markdown" />
<topicref href="t-second.dita" />
</map>
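The two map forms above carry the same information, so one can be generated from the other. The sketch below is an assumption-laden illustration, not the DITA-OT Markdown plugin: `markdown_map_to_dita` is a hypothetical name, only a level-one heading and flat link items are recognized, and the link text is dropped because the XML example on the slide does not carry it.

```python
import re
import xml.etree.ElementTree as ET

LINK_ITEM = re.compile(r"^\s*[-*]\s+\[([^\]]*)\]\(([^)]*)\)\s*$")
MAP_TITLE = re.compile(r"^#\s+(.*?)\s*(?:\{[^}]*\})?\s*$")


def markdown_map_to_dita(text):
    """Turn a Markdown list of links into a DITA <map> string."""
    root = ET.Element("map")
    for line in text.splitlines():
        title = MAP_TITLE.match(line)
        item = LINK_ITEM.match(line)
        if title:
            ET.SubElement(root, "title").text = title.group(1)
        elif item:
            href = item.group(2)
            ref = ET.SubElement(root, "topicref", href=href)
            # Mirror the slide: Markdown targets get format="markdown".
            if href.endswith(".md"):
                ref.set("format", "markdown")
    return ET.tostring(root, encoding="unicode")


source = """# Topic collection {.map}
- [First concept](c-first.md)
- [First task](t-first.md)
- [Second task](t-second.md)
"""
print(markdown_map_to_dita(source))
```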
My relationship with Jarno Elovirta
• But we already have MkDocs and Jekyll
• And Pandoc transforms to EPUB and
Why bother with DITA?
Next slide: DITA benefits and advantages
DITA benefits and advantages
Publishing DITA with JSON
JavaScript Object Notation (http://json.org)
What is JSON
Data type
• Number
• String
• Boolean
• Object
• Array
• Null
Data structure
• name/value pair (NVP)
"name": "Lu Ai"
• object
{"firstname": "Lu","lastname": "Ai"}
• array
"presenters": [{
"firstname": "Lu","lastname": "Ai"
},{
"firstname": "Michael","lastname": "Priestley"
}]
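As a quick illustration of why this structure is easy for programs to consume, the snippet below loads the presenters example with Python's standard `json` module. The outer braces are added here because a bare name/value pair has to sit inside an object to be a complete JSON document. The variable names are only for illustration.

```python
import json

# The slide's "presenters" array, wrapped in an object so it parses on its own.
document = """
{
  "presenters": [
    {"firstname": "Lu", "lastname": "Ai"},
    {"firstname": "Michael", "lastname": "Priestley"}
  ]
}
"""

data = json.loads(document)

# Objects become dicts and arrays become lists, so access is plain indexing.
names = [f'{p["firstname"]} {p["lastname"]}' for p in data["presenters"]]
print(names)
```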
Why JSON
• Lightweight, easy to parse format
• Popular alternative to XML as a data interchange format
Use case at IBM
• Dynamic content delivery via API
[Diagram: CMS1, CMS2, and CMS3; Enterprise Content Catalog (JSON); Converters (json, xml, csv, DITA); user 1 and user 2 (json)]
IBM Case Study Structured Content Pilot
Current State
• PDF only asset management
• Proprietary data model
• Low reusability
• Difficult to track and measure
Future State (DITA)
• Component based content management
• Open standard
• Reusable
• Measurable
IBM Case Study DITA Specialization: a non-semantic approach
topic
  title: macy’s.com: …
  section: {overview}
    dlentry: {the need}
    dlentry: {the solution}
    dlentry: {the result}
  …
  section: {client background}
  section: {need}
  section: {solution}
  section: {solution components}
  section: {benefit}
  …
This less constrained specialization approach provides flexibility, but does not offer much semantics.
IBM Case Study DITA: a mechanical translation to JSON
DITA JSON
• Property name has minimal semantics
• Unnatural/complicated structure in JSON
• Difficult to query
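To make "mechanical translation" concrete, here is a minimal sketch that copies XML element names straight into JSON property names. It is not the converter used in the pilot: the function name `mechanical`, the sample markup, and the placeholder text are all assumptions, and attributes and mixed content are ignored.

```python
import json
import xml.etree.ElementTree as ET


def mechanical(element):
    """Mirror the markup one-to-one: tag names become property names."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    node = {}
    for child in children:
        value = mechanical(child)
        if child.tag in node:
            # Repeated siblings collapse into a list under one key.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


SAMPLE = """
<section>
  <title>Overview</title>
  <dl>
    <dlentry><dt>The need</dt><dd>Placeholder need text</dd></dlentry>
    <dlentry><dt>The solution</dt><dd>Placeholder solution text</dd></dlentry>
  </dl>
</section>
"""

root = ET.fromstring(SAMPLE)
result = {root.tag: mechanical(root)}
print(json.dumps(result, indent=2))
```

The output keeps names such as `dl` and `dlentry`, which say how the content was marked up rather than what it means. That is why the resulting JSON is awkward to query.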
IBM Case Study DITA Specialization: a semantic approach
This tightly constrained specialization approach provides rich semantics, but is less flexible.
casestudy
  title: macy’s.com
  csprolog
    metadata
    clientbackground
  csneed
    title
    shortdescription
    needbody
  cssolution
    title
    shortdescription
    solutionbody
    solutioncomponents
    transformation
  csbenefit
    title
    shortdescription
    benefitbody
    longquote
    shortquote
  csinsidestory
  …
IBM Case Study DITA: a semantic approach
DITA Design
• Adopt a semantic approach in developing Case Study DITA specialization
• Allow minimal formatting tags in Case Study DITA
• Analyze specific content needs and use cases for case studies to best determine when to use semantic and formatting tags
IBM Case Study DITA: semantic translation to JSON
JSON Design
• Flat structure in JSON whenever possible
• Structured hierarchy is only used if it makes semantic sense
• Formatting tags removed or retained in JSON “value”
• Program friendly: simpler structure, easier queries, cleaner output
Query Comparison: compile an “overview” based on two different JSON data models
Non-semantic: complex query
section[@.title="Overview"].dl.dlentry[?(@.dt="The need")].dd
+ section[@.title="Overview"].dl.dlentry[?(@.dt="The solution")].dd
+ section[@.title="Overview"].dl.dlentry[?(@.dt="The result")].dd.q
+ section[@.title="Overview"].dl.dlentry[?(@.dt="The result")].ph
Semantic: simpler query
csneed.shortdesc + cssolution.shortdesc + csbenefit.shortdesc
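The difference between the two queries can be sketched in plain Python over two small dictionaries. Everything here is illustrative: the sample values are placeholders, the function names are hypothetical, and the non-semantic model is simplified (the slide's query also reaches into `q` and `ph` for the result, which this sketch leaves out).

```python
# Non-semantic model: meaning lives in title and dt values, not in property names.
non_semantic = {
    "section": [
        {
            "title": "Overview",
            "dl": {
                "dlentry": [
                    {"dt": "The need", "dd": "Need text."},
                    {"dt": "The solution", "dd": "Solution text."},
                    {"dt": "The result", "dd": "Result text."},
                ]
            },
        }
    ]
}

# Semantic model: property names carry the meaning directly.
semantic = {
    "csneed": {"shortdesc": "Need text."},
    "cssolution": {"shortdesc": "Solution text."},
    "csbenefit": {"shortdesc": "Result text."},
}


def overview_non_semantic(doc):
    """Filter sections by title, then definition entries by term."""
    parts = []
    for section in doc["section"]:
        if section["title"] != "Overview":
            continue
        entries = {e["dt"]: e["dd"] for e in section["dl"]["dlentry"]}
        for term in ("The need", "The solution", "The result"):
            parts.append(entries[term])
    return " ".join(parts)


def overview_semantic(doc):
    """Three direct lookups by name."""
    keys = ("csneed", "cssolution", "csbenefit")
    return " ".join(doc[key]["shortdesc"] for key in keys)


print(overview_non_semantic(non_semantic))
print(overview_semantic(semantic))
```

Both functions build the same overview, but the first has to search by string values while the second only follows names.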
IBM Case Study JSON Data Model
metadata
• industry, country, language, category, product, collateraltype, productgroup, creator, sme, owner, reviewdate, expirationdate, contact, businesspartner
content
• title, subtitle, clientinfo, clientcontact, need, solution, benefit, clientexperience, quotes, solutioncomponents, formoreinfo, copyrightstatement, disclaimers
associatedassets
• casestudy_pdf, clientlog, industryimage, ibmcolorblock, video
Next Steps
• Finalize Case Study DITA specialization and JSON data model
• Case Study JSON Schema
• DITA2JSON converter
Next steps/resources
• Continuing design work at the Lightweight DITA subcommittee:
o https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita-lightweight-dita
• Public discussion at the LinkedIn group:
o https://www.linkedin.com/groups/Lightweight-DITA-4943862/about
• Test out early prototype mappings:
o https://github.com/jelovirt/dita-ot-markdown
o http://any2dita.com/
What’s happening