“It’s Not Rocket Science!” Applying CMS and Semantic Enrichment to Transform Book Publishing
©2010 Really Strategies, Inc. | www.rsuitecms.com
“It’s not rocket science!”
Wednesday Webinar Series
Applying CMS and semantic enrichment to transform book publishing
©2010 Really Strategies, Inc. | www.reallysi.com
Webinar Agenda
Welcome, Overview, Introductions
Online Poll
Semantic enrichment
Wolters Kluwer Health case study
Online Poll
Content management with RSuite
Q&A
Who is Really Strategies? Founded: 2000
Consulting Services to Publishers
Specialists in XML-based Content Management Solutions
Project/Program Management
Workflow Analysis and Reengineering
Content and Metadata Modeling
Technology Assessment and Roadmaps
Much more…
A content management system for publishers.
Serving over 100 companies
STM, Educational, Media, Tech Pubs
Webinar Presenters
Jake Zarnegar, CTO, Silverchair
Jabin White, Director of Strategic Content, Wolters Kluwer Health – Professional & Education
Mike Sherlock, Program Manager, Really Strategies, Inc.
The CMS – Semantic Landscape
Content creation: Author submission; Online submission; Tools used
Content management: Editorial workflow; Check in/check out; Version control; Production; Metadata; Transformations
Content delivery: Web sites; iPad and other mobile apps; Good old print
Content enrichment
Taxonomy management
ONLINE POLL
Silverchair | www.silverchair.com
Jake Zarnegar, Silverchair
Why Enrich Your Content With Semantics?
Semedica, a division of Silverchair
Tagmaster: Semantic autotagging w/expert review
Totem: Taxonomy/Ontology manager
Cortex: Biomedical taxonomy & thesaurus
Swiss: Semantic web services
My Brother is a Rocket Scientist
His Test Answers
A-C-B-A-E-D-C-A-B-E-A-C-B-B-E-D-C-A-B-C-B-C-A-D-B-E-A-D-C-A-B-A-E-A-C-A-E-C-B-A-E-B-D-D-E-A-B-C-A-B-A-C-A-A-C-E-C-B-D-D-A-B-C-E-A-C-C-D-E-B-A-B-B-C-D-E-A-D-A-E-B-E-C-A-E-D-A-C-E-B-C-A-B-E-A-A-D-E-A-B
Semantic Enrichment Raison D’Etre
To put thousands and thousands of tiny meaningful hooks in your data so that your software applications can create richer outcomes for your users and your organization.
Semantics in 3 Minutes
Semantics are About Meaning
Semantics describe the meaning of your content, on top of the physical structure. Meaning is generally conveyed in topics and concepts.
Semantic metadata formally answers the most important question of all for content producers and users:
What is this content about?
“Atomizing” Information
The semantic approach requires us to go beyond documents and think of our content as data.
For example:
1 textbook chapter = 1 document
OR
1 textbook chapter = 712 distinct pieces of data (sections, paragraphs, lists, tables, figures, equations, etc.)
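To make the document-versus-data contrast concrete, here is a minimal Python sketch that walks a tiny chapter and lists its addressable pieces. The element names (`chapter`, `section`, `para`, and so on) and the sample markup are assumptions for illustration, not a DTD shown in this webinar.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical chapter markup; element names are illustrative assumptions.
CHAPTER_XML = """
<chapter id="ch1">
  <section id="s1">
    <para>Hypertension is common.</para>
    <para>Treatment varies.</para>
    <list><item>Diet</item><item>Exercise</item></list>
  </section>
  <section id="s2">
    <para>See the table.</para>
    <table><row>data</row></table>
    <figure src="fig1.png"/>
  </section>
</chapter>
"""

# The kinds of pieces we choose to treat as separately addressable data.
ATOM_TAGS = {"section", "para", "list", "table", "figure", "equation"}

def atomize(xml_text):
    """Return every element in the chapter whose tag is in ATOM_TAGS."""
    root = ET.fromstring(xml_text.strip())
    return [el for el in root.iter() if el.tag in ATOM_TAGS]

atoms = atomize(CHAPTER_XML)
counts = Counter(el.tag for el in atoms)
print(len(atoms), dict(counts))
```

On this toy chapter the walk finds 8 pieces; a full textbook chapter would yield hundreds, which is the point of the 712 figure.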
But breaking down content into its smallest parts is not an end unto itself…
Taxonomy as Semantic Foundation
• The taxonomy is the framework for the semantic layer and semantic tagging—crucial for concept grouping and hierarchical relationships
• Also serves to normalize terminology and language variances when combined with a robust thesaurus
• Industry-standard taxonomies facilitate integration
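The two roles described above, normalizing language variants and supplying hierarchy, can be sketched in a few lines of Python. The terms, the parent concept, and the function names below are assumptions chosen for illustration; they are not taken from Cortex or any real taxonomy.

```python
# Thesaurus: maps language variants to one preferred term (illustrative entries).
THESAURUS = {
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
}

# Taxonomy: maps each concept to its parent (a toy hierarchy, assumed).
PARENT = {
    "hypertension": "cardiovascular diseases",
    "myocardial infarction": "cardiovascular diseases",
    "cardiovascular diseases": None,
}

def normalize(term):
    """Return the preferred term for a variant, or the cleaned term itself."""
    key = term.strip().lower()
    return THESAURUS.get(key, key)

def ancestors(concept):
    """Walk up the hierarchy and return the broader concepts, nearest first."""
    chain = []
    parent = PARENT.get(concept)
    while parent is not None:
        chain.append(parent)
        parent = PARENT.get(parent)
    return chain

preferred = normalize("High Blood Pressure")
print(preferred, ancestors(preferred))
```

With both in place, content tagged from either phrasing lands on the same concept and can be grouped under the same broader heading.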
Use taxonomy axes to organize your atomized content on key traits and prepare it for recombination…
Nuts & Bolts: Semantic Tagging
• Tagging is the insertion of semantic (meaning) information in the XML, whose smallest unit is called a tag
• Tags can also be stored in database tables and header files when the content itself cannot carry inline markup (such as images and videos)
• Tagging should be done at the smallest “atomic” level of data possible
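As a rough illustration of tagging at the paragraph level, the Python sketch below records matching concepts on each paragraph of a small XML fragment. The `concepts` attribute, the concept IDs, and the naive substring matching are all assumptions made for this example; it does not represent how Tagmaster or any production autotagger works.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: term -> concept ID (IDs are made up for illustration).
VOCAB = {"hypertension": "C-HTN", "leukemia": "C-LEUK"}

SAMPLE = """
<section>
  <para>Hypertension often goes undiagnosed.</para>
  <para>Leukemia and hypertension can coexist.</para>
  <para>Follow up in two weeks.</para>
</section>
"""

def tag_paragraphs(xml_text):
    """Set a 'concepts' attribute on each <para> that mentions a known term."""
    root = ET.fromstring(xml_text.strip())
    for para in root.iter("para"):
        text = "".join(para.itertext()).lower()
        # Naive substring match; a real tagger would need far more care.
        found = sorted(cid for term, cid in VOCAB.items() if term in text)
        if found:
            para.set("concepts", " ".join(found))
    return root

tagged = tag_paragraphs(SAMPLE)
paras = list(tagged.iter("para"))
print(ET.tostring(tagged, encoding="unicode"))
```

Paragraphs with no matching term are left untouched, so the tags stay attached only to the small units they actually describe.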
Paragraph entity identification. What is this content about?
Semantic article summary. What is this content about?
Semantics for Your Users
Know Your Users!
Focus your metadata creation on how your users want to use your content:
• How do they search? Browse? At what point in their workflow is your product used?
Almost all information sites have multiple use cases. You need to know what those use cases are for your products.
Start with what is the most important to the most users and work your way down a priority list.
The Semantic Use Test
I am specifically identifying __________ because ____________ is very important to my ____________ users when they are _____________.
Semantic Metadata: Focus on Use
Example: I am specifically identifying concise disease treatment content because immediate access to treatment options is very important to my emergency physician users when they have 8 seconds to look up an answer.
McGraw-Hill: metadata targeted to deliver fast, concise treatment info to ED
Semantic Metadata: Focus on Use
Example: I am specifically identifying skin disorder images on all body locations and all types of skin because visual diagnosis is very important to my family physician users when they are trying to identify a rash.
Derm101: images show up immediately in the diagnosis results for searches
Semantic Metadata: Focus on Use
Example: I am specifically identifying manufacturer names because the source of medical devices is very important to my surgical resident users when they are prepping for a procedure.
Not Likely!
Semantics for Your Organization
Use Semantics to Know Your Users
Use Semantics to “Know Thyself!”
Thank you!
For more information:
Jake Zarnegar
CTO, Silverchair
President, Silverchair Information [email protected]
(434) 296-6333 x236
www.silverchair.com
www.semedica.com
Jabin White
Director of Strategic Content
Wolters Kluwer Health – Professional & Education
Really Strategies/Silverchair Webinar – September 29, 2010
Agenda
• A little background (framing the problem)
• Our goals
• When we’re done, we’ll be able to…
Who we are
• We are Wolters Kluwer Health – Professional & Education
• Wolters Kluwer Health includes:
▫ Lippincott Williams & Wilkins titles
▫ Ovid
▫ UpToDate
▫ Provation Order Sets
▫ Drug Facts & Comparisons
▫ Medi-Span
▫ Clin-eguide
A Little History
• Joined WK Health in May 2009
▫ Responsible for making sure content flows through company more efficiently (DTDs, Content Management, Authoring Tools, Semantic Enrichment, Product Information Management, etc.)
• The reasons are not important, but we hadn’t spent a lot of time modernizing our digital production methods
Today – Our typical workflow
• Book is “signed”
• Instructions for authors are sent, and ignored
• Chapters, etc., are submitted in MS Word
• Word files are sent “over the wall” (outsourced), coded, and put into a pagination software (still some Quark, moving to Adobe InDesign)
• Final pages are approved
• High-resolution PDFs are sent to printer
• After final pages are approved, vendors convert into XML (if the title was comped after May 2009). If before, we roll the dice…
• Delivered back to P&E archive, along with printer PDFs, application files, and images
So what’s your problem?
• We pay at every step of the previous workflow, and we believe unnecessarily near the end
• If we need ePub, we have to go back into the archive to a “mixed bag” of content (some Quark, some PDF, some XML)
• There is no central repository – or common format – in which to apply semantic tagging
▫ And the frustrating thing is we have GOOD DTDs!
• If we believe in semantic markup, which we do, we must essentially throw content over the wall again just as in composition (shampoo, rinse, repeat)
Enter RSuiteCMS
• RSuiteCMS gives us the ability to control the workflow and use good content management practices (it does a LOT more, but we’re starting slow)
• Very importantly, we get to have authors write in XML without them knowing (or quite frankly caring)
• We put a LOT of work into the authoring environment, trying to keep authors away from angle brackets
• “It takes a lot of hard work to make things simple”
When we’re done, we’ll be able to…
• …Produce structured content with lower effort/cost
• Working on SECOND RSuiteCMS implementation as we speak
▫ Will scale in latter part of 2010 and 2011
• We are moving cautiously and ensuring “buy in” from stakeholders at each step
• Ideally, we will grow our ability to produce clean, structured XML to check into our repository
• But Rome wasn’t tagged in a day...
Enter Semedica
• Gives us the ability to add semantic tagging to our content, either when it is finished (in the repository) or while it is being worked on (within RSuiteCMS)
• Semedica gives us the ability to:
▫ Leverage a standard taxonomy (Cortex)
▫ Add to the taxonomy and manage equivalencies – perhaps mined from our search logs – “Wenckenback = Wenckebach” (Totem)
▫ Apply the tags to our content (Tagmaster)
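One way to picture mining equivalencies from search logs is the Python sketch below, which proposes close matches for unrecognized search terms so that an expert can confirm or reject them. This is an assumption about how such mining could work, using the standard library's `difflib`; it is not a description of Totem, and the term list and log entries are invented for illustration.

```python
from difflib import get_close_matches

# Preferred terms already in the taxonomy (illustrative list).
TAXONOMY_TERMS = ["wenckebach", "hypertension", "leukemia", "myocardial infarction"]

def suggest_equivalents(search_term, terms=TAXONOMY_TERMS, cutoff=0.8):
    """Propose taxonomy terms that a search term probably refers to.

    Returns an empty list when the term is already known or nothing is close.
    """
    query = search_term.strip().lower()
    if query in terms:
        return []
    return get_close_matches(query, terms, n=3, cutoff=cutoff)

# Hypothetical search-log entries.
search_log = ["Wenckenback", "hypertension", "rocket science"]
proposals = {q: suggest_equivalents(q) for q in search_log}
print(proposals)
```

The output is only a list of candidates; deciding that "Wenckenback" really should map to "Wenckebach" remains an editorial call.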
Why Semantic Tagging?
• It adds extra power to our content to drive:
▫ More precise searching
▫ Contextually-based connections
▫ Lowering of “two terms meaning the same thing” syndrome (hypertension vs. high blood pressure; heart attack vs. myocardial infarction)
▫ Filling in of content gaps
▫ Asking questions of data (aka, querying): “How many chapters do we publish that are tagged with the term ‘pediatric oncology’ or ‘leukemia’ that also contain the treatment ‘interferon therapy’?”
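The sample question about chapters can be expressed as a simple query once chapters carry semantic tags. The Python sketch below assumes each chapter's tags have been collected into sets of topics and treatments; the chapter IDs, the data, and the function name are hypothetical.

```python
# Hypothetical tag index: chapter ID -> tagged topics and treatments.
CHAPTERS = {
    "ch01": {"topics": {"pediatric oncology"}, "treatments": {"interferon therapy"}},
    "ch02": {"topics": {"leukemia"}, "treatments": {"chemotherapy"}},
    "ch03": {"topics": {"leukemia", "hematology"}, "treatments": {"interferon therapy"}},
    "ch04": {"topics": {"cardiology"}, "treatments": {"interferon therapy"}},
}

def chapters_matching(chapters, any_topics, treatment):
    """Return IDs of chapters tagged with at least one topic AND the treatment."""
    wanted = set(any_topics)
    return sorted(
        chapter_id
        for chapter_id, tags in chapters.items()
        if tags["topics"] & wanted and treatment in tags["treatments"]
    )

hits = chapters_matching(
    CHAPTERS, ["pediatric oncology", "leukemia"], "interferon therapy"
)
print(len(hits), hits)
```

Without the tags, answering the same question would mean full-text searching every chapter and reconciling synonyms by hand.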
How RSuiteCMS & Semedica need each other
• I wouldn’t think of using Semedica to enrich Word files (and not just because Jake would laugh)
• I couldn’t make the business case for RSuiteCMS to help produce structural XML without dangling the prospect of semantic enrichment
• Which came first, the chicken or the egg?
Jabin White
Director of Strategic Content
Wolters Kluwer [email protected]: @jabinwhite
Blog: Technically Speaking at http://www.bookbusinessmag.com/channel/technically-speaking
Facilitating semantic enrichment for the 5-Minute Clinical Consult product
XML source content must be updated regularly by working medical professionals without using desktop software
and
updates must be easily imported and exported for multiple channels without technical intervention
Solution: a simplified, browser-based interface to RSuite
Contributors only see what they need to see
Workflow tools
Enable contributors to manage their own tasks
Integrated Xopus XML editor
Enables balance between required content structure and flexibility to enhance information
Custom PubMed lookup and reference management
Users can search, insert, cite, and auto-renumber to ensure markup consistency
Editing in XML source enables users to tag items of interest
In this example, user highlights text and uses an icon to label text as a drug name
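What the highlight-and-label action does to the underlying XML can be sketched as wrapping the selected phrase in an inline element. The Python below is an illustration only: the `<drug>` element name, the sample sentence, and the helper function are assumptions, and this is not the Xopus or RSuite API.

```python
import xml.etree.ElementTree as ET

def wrap_phrase(para, phrase, tag):
    """Wrap the first occurrence of phrase in para's leading text with <tag>.

    Only handles a phrase found in the text before any child element.
    Returns True if the phrase was found and wrapped.
    """
    text = para.text or ""
    start = text.find(phrase)
    if start == -1:
        return False
    inline = ET.Element(tag)
    inline.text = phrase
    # Text after the phrase becomes the new element's tail.
    inline.tail = text[start + len(phrase):]
    para.text = text[:start]
    para.insert(0, inline)
    return True

# Hypothetical paragraph and drug name, standing in for the user's selection.
para = ET.fromstring("<para>Take aspirin with food.</para>")
wrapped = wrap_phrase(para, "aspirin", "drug")
result = ET.tostring(para, encoding="unicode")
print(result)
```

The contributor sees only highlighted text; the inline element is what later lets software find every drug mention.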
Alternate view: XML markup stays hidden in background
Contributors are not aware that they are editing XML content
Exporting XML from repository
Managing Editor can easily select a content set and choose an export target
Lessons learned
• It’s hard to add XML structure to unstructured content
• Authors must work on a single content source; semantic enrichment is too valuable to throw away
• Challenges to manage:
▫ Editing tool vs. form: need to balance conformity vs. medical usefulness
▫ Hiding XML from editors requires very tight content controls
▫ Training occasional external contributors not a viable option
▫ Lack of control over user’s browser types and versions makes technical support difficult
It’s not rocket science, it’s just a lot of hard work
QUESTIONS