On Our Way to IPTC rNews 1.0 - Inception and Design of a Standard
DESCRIPTION
The International Press Telecommunications Council (IPTC) is developing rNews, a new standard for embedding publishing metadata into online documents using RDFa or HTML5 Microdata. In this talk, @smyles, @kansandhaus and @agebhard, members of the IPTC, look back at how rNews came into existence and advanced to what it is today, argue the business case for semantic markup and introduce rNews in its current state.
http://www.iptc.org
http://rnews.org
Given at Semantic Web Media Summit on Sept 14th, 2011
http://lanyrd.com/2011/semantic-web-media-summit/shhtp/
TRANSCRIPT
IPTC Document:
DIR1004.2-AuMschedule.doc Page 1 of 2 © 2010 International Press Telecommunications Council | www.iptc.org
Document history [Document URN: urn:iptc:workdoc:dir:1004:2 ]
Revision Issue Date Pages Author (revised by) Remark
1 2010-09-22 2 Michael Steidl
2 2010-10-11 2 Michael Steidl Revised after Chairs call
rNews
Embedded Data For The News Industry
Hello!
§ Stuart Myles (@smyles): Lead of the IPTC Semantic Web WG & Deputy Director of Schema Standards, The Associated Press
§ Evan Sandhaus (@kansandhaus): Lead Architect, Semantic Platforms, The New York Times Company
§ Andreas Gebhard (@agebhard): Managing Editor, Getty Images
...And 50 Others
STORY
PHOTO
Story components which are obvious to a person…
STORY
PHOTO
...are not so obvious to a machine.
The Problem of Structured Data
§ Modern Web Sites Built with 3 Tier Architecture
• Data Tier: Database where content lives.
• Presentation Tier: HTML document that is sent to the user.
• Logic Tier: Software that reads from the Data Tier and outputs the Presentation Tier.
Data Tier
Logic Tier
Display Tier
The Problem Of Structured Data: Continued
Label     Type    Value
id        number  1248069162607
Headline  text    New Web Code Draws Concern...
Byline    text    By TANZINA VEGA
Date      date    20101010
Body      text    In the next few years, a powerful...
Length    number  1123
Tag       text    Privacy
Tag       text    Computers and the Internet
Tag       text    Web Browsers
<html>
  <head>
    <title> New Web Code Draws Concern... </title>
  </head>
  <body>
    <div> New Web Code Draws Concern... </div>
    <div> By TANZINA VEGA </div>
    <div> October 10, 2010 </div>
    <div> In the next few years, a powerful... </div>
  </body>
</html>
Data Tier, Logic Tier, Display Tier
§ Content is very well structured on the Data Tier, but all of this structure is lost in translation to the Display Tier.
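To make the loss of structure concrete, here is a minimal Python sketch of a Logic Tier. It assumes a dictionary-shaped record with the field values from the slide; the `record` layout and the `render_page` function are hypothetical illustrations, not part of any real publishing system.

```python
from datetime import datetime
from html import escape

# Hypothetical Data Tier record, using the values shown on the slide.
record = {
    "id": 1248069162607,
    "headline": "New Web Code Draws Concern...",
    "byline": "By TANZINA VEGA",
    "date": "20101010",
    "body": "In the next few years, a powerful...",
    "tags": ["Privacy", "Computers and the Internet", "Web Browsers"],
}


def render_page(rec):
    """Hypothetical Logic Tier: flatten the record into anonymous <div>s."""
    date_text = datetime.strptime(rec["date"], "%Y%m%d").strftime("%B %d, %Y")
    lines = [rec["headline"], rec["byline"], date_text, rec["body"]]
    divs = " ".join(f"<div> {escape(text)} </div>" for text in lines)
    title = escape(rec["headline"])
    return (
        f"<html> <head> <title> {title} </title> </head> "
        f"<body> {divs} </body></html>"
    )


page = render_page(record)
print(page)
```

The output contains the headline, byline, date and body text, but nothing that says which `<div>` is which, and the id and tags never reach the page at all.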
The Problem Of Structured Data: Continued
<html>
  <head>
    <title> New Web Code Draws Concern... </title>
  </head>
  <body>
    <div> New Web Code Draws Concern... </div>
    <div> By TANZINA VEGA </div>
    <div> October 10, 2010 </div>
    <div> In the next few years, a powerful... </div>
  </body>
</html>
Display Tier
=
?
§ Search engines, social networks, aggregators and other sites only see the Display Tier, and cannot leverage the underlying structure of the data.
Semantic Markup Standards
§ Microformats: First, Simple, Rigid
§ RDFa: Official, Complex, OpenGraph
§ Microdata: Unofficial, Flexible, Schema.org
§ JSON: Official, Developers, External
rNews
rNews Defined
rNews is a data model for embedding machine-readable publishing metadata in web documents and a set of suggested implementations.
rNews is a data model
[Diagram: rNews data model]
Classes: NewsItem, Article, Comment, ImageObject, VideoObject, AudioObject, Organization, Person, Location, Concept, PostalAddress, GeoCoordinates
Properties: comment, associatedMedia, associatedArticle, about, mentions, address, geoCoordinates, name, creator, editor, contributor, provider, copyrightHolder, accountablePerson, sourceOrganization
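As a rough illustration of how a few of the classes and properties in the data model diagram relate, the following Python sketch uses dataclasses. Which property belongs to which class is an assumption here, since the diagram's arrows are not recoverable from the slide text; the sample values are borrowed from the deck's worked example, and the snake_case field names are my own adaptation of `associatedMedia` and `sourceOrganization`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str


@dataclass
class Organization:
    name: str


@dataclass
class ImageObject:
    url: str
    creator: Optional[Person] = None  # assumed placement of "creator"
    source_organization: Optional[Organization] = None  # "sourceOrganization"


@dataclass
class Article:
    headline: str
    creator: Optional[Person] = None
    associated_media: List[ImageObject] = field(default_factory=list)


# Illustrative instance only; not an official rNews serialization.
article = Article(
    headline="Allies Are Split on Goal and Exit Strategy in Libya",
    creator=Person("STEVEN LEE MYERS"),
    associated_media=[
        ImageObject(
            url="img/libya_sample_reuters.jpg",
            creator=Person("Goran Tomasevic"),
            source_organization=Organization("Reuters"),
        )
    ],
)
print(article.associated_media[0].source_organization.name)
```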
for embedding machine-readable publishing metadata in web documents
Headline, Byline, Tags, Creator, ...
and a set of suggested implementations
§ RDFa: Today
§ Microdata: Very Soon
§ JSON: Maybe?
rNews - Working Example
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head></head>
<body>
  <div>
    <div>
      <div>Allies Are Split...</div>
      <div>NATO Takes Command</div>
      <div>
        <img src="img/libya_sample_reuters.jpg"/>
        <div>Credit: Goran Tomasevic/Reuters</div>
        <div>Rebel fighters take...</div>
      </div>
      <div>By STEVEN LEE MYERS</div>
      <div>WASHINGTON | March 24, 2011</div>
      <div>
        <p>Having largely succeeded...</p>
      </div>
      <div>
        <p><a href="http://www.nytimes.com/content/help/rights/copyright/copyright-notice.html">
          © Copyright 2011
        </a><span>The New York Times Company</span></p>
        <p><a href="http://www.nytimes.com/ref/membercenter/help/agree.html">
          Disclaimer
        </a></p>
      </div>
    </div>
    <div>
      <div>
        <div>Section</div>
        <div>World</div>
      </div>
      <div>Tags</div>
      <div>
        <div>
          <div>People</div>
          <div>Qaddafi, Muammar el-</div>
        </div>
      </div>
      <div>
        <div>Discussion (3)</div>
        <div>
          <div>So the question is..."</div>
          <div>
            <a href="http://timespeople.nytimes.com/view/user/27242827/activities.html">Chuck</a></div>
          <div>March 25th, 2011 8:27 am</div>
        </div>
      </div>
    </div>
  </div>
</body>
</html>
HTML 5 Microdata
<!DOCTYPE HTML>
<html itemscope itemtype="http://schema.org/NewsArticle">
<head>
  <style type="text/css">@import url(css/iptc_times2.css);</style>
  <meta itemprop="dateCreated" content="2011-03-23"/>
  <meta itemprop="description" content="The questions about the command..."/>
  <meta itemprop="inLanguage" content="en-US"/>
  <meta itemprop="thumbnailUrl" content="http://graphics8.nytimes.com/images/common/icons/t_wb_75.gif"/>
  <meta itemprop="genre" content="Current"/>
  <meta itemprop="id" content="1248069687395"/>
  <meta itemprop="version" content="2"/>
  <meta itemprop="publishingPrinciples" content="http://www.nytco.com/press/ethics.html"/>
  <meta itemprop="wordCount" content="879"/>
</head>
<body>
  <div style="height:900px" class="article">
    <div class="a_column">
      <div itemprop="headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
      <div itemprop="alternativeHeadline" class="rider">NATO Takes Command</div>
      <div itemprop="associatedMedia" itemscope itemtype="http://schema.org/ImageObject">
        <img itemprop="URL" class="image" src="img/libya_sample_reuters.jpg"/>
        <div class="image_credit">Credit:
          <span itemprop="creator" itemscope itemtype="http://schema.org/Person">
            <span itemprop="name">Goran Tomasevic</span>
          </span>
          /
          <span itemprop="sourceOrganization" itemscope itemtype="http://schema.org/Organization">
            <span itemprop="name">Reuters</span>
            <meta itemprop="tickerSymbol" content="NYSE TRI"/>
          </span>
        </div>
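A consumer of this markup could pull the `itemprop` values back out with very little code. The sketch below uses only Python's standard `html.parser`; it is a simplification that lists properties in document order and does not build the nested item tree a real Microdata processor would. The class name `ItempropCollector` and the shortened sample page are hypothetical.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect (itemprop, value) pairs in document order (nesting ignored)."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # itemprop still waiting for its text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        if prop is None:
            return
        if "itemscope" in a:
            # The value is a nested item; record its type instead of text.
            self.pairs.append((prop, a.get("itemtype")))
        elif "content" in a:
            self.pairs.append((prop, a["content"]))
        elif "src" in a:
            self.pairs.append((prop, a["src"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


# Shortened version of the slide's markup, for illustration only.
sample = """
<html itemscope itemtype="http://schema.org/NewsArticle">
<head><meta itemprop="wordCount" content="879"/></head>
<body>
<div itemprop="headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
<div itemprop="associatedMedia" itemscope itemtype="http://schema.org/ImageObject">
  <img itemprop="URL" class="image" src="img/libya_sample_reuters.jpg"/>
</div>
</body></html>
"""

collector = ItempropCollector()
collector.feed(sample)
collector.close()
pairs = collector.pairs
for name, value in pairs:
    print(name, "=", value)
```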
RDFa
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns:rnews="http://dec.iptc.org/rnews/0.1/">
<head>
  <style type="text/css">@import url(css/iptc_times2.css);</style>
</head>
<body>
  <div class="article" style="height:623px">
    <div class="a_column">
      <div property="rnews:headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
      <div class="rider">NATO Takes Command</div>
      <div class="main_image">
        <img class="image" src="img/libya_sample_reuters.jpg"/>
        <div class="image_credit">Credit: Goran Tomasevic/Reuters</div>
        <div class="image_caption">
          Rebel fighters take cover during a shelling near Ajdabiyah, Libya on Thursday.
        </div>
      </div>
      <div rel="rnews:createdBy" class="byline">By
        <span about="http://demo.iptc.org/per/steven_lee_myers" typeof="rnews:Person">
          <span property="rnews:name">STEVEN LEE MYERS</span>
        </span>
      </div>
      <div class="publication_date">
        <span property="rnews:dateline">WASHINGTON</span>
        |
        <span property="rnews:dateCreated" content="2011-03-24">March 24, 2011</span>
      </div>
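The `property` attributes in the RDFa version can be read back in a similar way. This is a deliberately naive Python sketch built on the standard `html.parser`: it matches the literal `rnews:` prefix and prefers a `content` attribute over element text, but it does not resolve namespaces or handle `about`, `typeof` and `rel`, so it is not a conforming RDFa distiller. The class name `RdfaPropertyCollector` and the trimmed sample are hypothetical.

```python
from html.parser import HTMLParser


class RdfaPropertyCollector(HTMLParser):
    """Map rnews:* property names to their literal values (naive sketch)."""

    def __init__(self, prefix="rnews:"):
        super().__init__()
        self.prefix = prefix
        self.values = {}
        self._pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property")
        if not prop or not prop.startswith(self.prefix):
            return
        name = prop[len(self.prefix):]
        if a.get("content") is not None:
            # A machine-readable value overrides the displayed text.
            self.values[name] = a["content"]
        else:
            self._pending = name

    def handle_data(self, data):
        if self._pending and data.strip():
            self.values[self._pending] = data.strip()
            self._pending = None


# Trimmed fragment of the slide's markup, for illustration only.
sample = """
<div property="rnews:headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
<div class="publication_date">
  <span property="rnews:dateline">WASHINGTON</span>
  |
  <span property="rnews:dateCreated" content="2011-03-24">March 24, 2011</span>
</div>
"""

collector = RdfaPropertyCollector()
collector.feed(sample)
collector.close()
values = collector.values
print(values)
```

Note how `dateCreated` comes back as the ISO-style `content` value rather than the human-readable date shown to the reader.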
The Way to rNews
The Way To rNews
2010
§ June: Genesis of rNews - Evan at SemTech 2010
§ November 5 - Rome: chartered
§ internal discussions about NYT draft
2011
§ March 9 - Dubai: rNews 0.1
§ lots of feedback, changes and additions
§ June 9 - Berlin: rNews 0.5
§ June 28: rNews 0.6
§ September 6: rNews 0.7 [aligned w/ schema.org]
§ October 7 - Vienna: rNews 1.0
Engaging Our Community
Feedback we incorporated...
§ In Person
• 3 Meetups: New York, Berlin, London
• Over a dozen one-on-one meetings with leading media and technology companies.
§ Online
• rNews.org forum
• Numerous blog posts
§ In the Standards Community
• W3C Community Group
• Media Standards Trust
Feedback we incorporated...
[Diagram: Location and GeoCoordinates]
point, circle, elevation, polygon, box, line
latitude, longitude, altitude
Feedback we incorporated...
Person
editor
NewsItem
rNews Benefits
Or Why You Should Care About rNews
Benefit #1: Better Links
With Structured Data
No Structured Data
Benefit #2: Better Analytics
JavaScript can extract richer news metadata
Analytics per item, not just per page
Benefit #3: Better Ad Placement
Leverage metadata, not just text
Avoid unfortunate juxtapositions
rNews as a news API
Level the Playing Field
Encourage Open Innovation
How Can You Help Us Get to rNews 1.0?
§ Check out the rNews 0.7 spec
§ Mark up some pages using rNews
§ Extract rNews properties using your favourite distiller
§ Dream up The Next Metadata Killer App™
Let us know what you think
Let us know how we can help
@smyles • @agebhard • @kansandhaus
rNews
Thank You