![Page 1: A Semantic Makeover for CMS Data - Linked Jazz · Linked Jazz Project — @linkedjazz // Code4Lib 2015. Title: slides.key Created Date: 20150210192945Z](https://reader033.vdocuments.mx/reader033/viewer/2022052015/602c9e16843fd51e3d7a6cc9/html5/thumbnails/1.jpg)
A Semantic Makeover for CMS Data
Bill Levay — @wjlevay
Linked Jazz Project — @linkedjazz // Code4Lib 2015
Project GitHub Repo
github.com/wjlevay/tulane-jazz-data
Tulane University Digital Collections
Two collections:
Hogan Jazz Archive Photography Collection
Ralston Crawford Collection of Jazz Photography
CONTENTdm system
Tulane University Digital Collections
1,787 digital images
at least 681 unique individuals
at least 2,767 depictions — http://xmlns.com/foaf/0.1/depiction
People depicted in the same photograph can be said to “know” each other — http://xmlns.com/foaf/0.1/knows
These relationships can be expressed in RDF
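One way to derive the "knows" pairs is to take every two-person combination within each photograph. A minimal sketch, assuming a hypothetical dict that maps a photo URI to the person URIs depicted in it; the example.org URIs are placeholders, not project data:

```python
from itertools import combinations

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

def knows_pairs(depictions):
    """Return sorted, de-duplicated (person_a, person_b) pairs for people who share a photo.

    `depictions` maps a photo URI to a list of person URIs
    (a hypothetical structure used only for this illustration).
    """
    pairs = set()
    for people in depictions.values():
        # sorted(set(...)) drops repeated names and gives each pair a stable order
        for a, b in combinations(sorted(set(people)), 2):
            pairs.add((a, b))
    return sorted(pairs)

# Placeholder data: three people in one photo, two of them again in another
depictions = {
    "http://example.org/photo/1": [
        "http://example.org/person/a",
        "http://example.org/person/b",
        "http://example.org/person/c",
    ],
    "http://example.org/photo/2": [
        "http://example.org/person/a",
        "http://example.org/person/b",
    ],
}

for a, b in knows_pairs(depictions):
    print(f"<{a}> <{FOAF_KNOWS}> <{b}> .")
```

The sketch emits one direction per pair; whether to also assert the reverse triple is a modeling choice the slides do not settle.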
Searching VIAF
Python script searches VIAF for each name
viafURL = 'http://viaf.org/viaf/search?query=local.personalNames+%3D+{SEARCH}&httpAccept=text/xml'
Uses name + birth year if we have it
Assigns grades to search results based on our confidence in the match
Parses XML results, which include alt names, LC and Wikipedia IDs, titles of attributed works
Whitelisted terms for titles: “New Orleans,” “ragtime,” “jazz,” “big band,” etc.
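The search and grading steps might look roughly like the sketch below. The URL template and the whitelist terms come from the slide; the function names, the quoting of the search term, and the A/B/C grading scheme are assumptions made for illustration, not the project's actual logic. No request is sent here, and the endpoint may no longer behave as it did in 2015.

```python
from urllib.parse import quote

# URL template as shown on the slide; {SEARCH} is filled with the encoded query
VIAF_URL = ("http://viaf.org/viaf/search?query=local.personalNames+%3D+{SEARCH}"
            "&httpAccept=text/xml")

# Whitelisted title terms from the slide (the slide ends with "etc.", so not exhaustive)
WHITELIST = ("new orleans", "ragtime", "jazz", "big band")

def build_search_url(name, birth_year=None):
    """Build a search URL from a name, adding the birth year when we have it."""
    term = f"{name} {birth_year}" if birth_year else name
    # Wrapping the term in double quotes is an assumption about the query syntax
    return VIAF_URL.format(SEARCH=quote(f'"{term}"'))

def grade_match(titles, birth_year_matched):
    """Hypothetical grading: 'A' = year and title evidence, 'B' = one of them, 'C' = neither."""
    title_hit = any(term in title.lower() for title in titles for term in WHITELIST)
    score = int(title_hit) + int(birth_year_matched)
    return "CBA"[score]

print(build_search_url("Louis Armstrong", 1901))
print(grade_match(["Satchmo: my life in New Orleans"], birth_year_matched=True))
```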
Building N-Triples
If VIAF results give us Wikipedia ID, form a DBpedia URI
Else, use Library of Congress URI
Append datatype IRI (internationalized resource identifier) to date triples
Use GeoNames URI for places
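The DBpedia-or-Library-of-Congress rule above can be sketched as a small helper. This is an illustration only: it assumes the Wikipedia ID is a page title and the LC ID is a name authority identifier with no spaces, and `person_uri` is a hypothetical name.

```python
from urllib.parse import quote

DBPEDIA_BASE = "http://dbpedia.org/resource/"
LC_NAMES_BASE = "http://id.loc.gov/authorities/names/"

def person_uri(wikipedia_title=None, lc_id=None):
    """Prefer a DBpedia URI built from the Wikipedia title; otherwise fall back to LC."""
    if wikipedia_title:
        # DBpedia resource names mirror Wikipedia page titles, with underscores for spaces
        local_name = wikipedia_title.strip().replace(" ", "_")
        return DBPEDIA_BASE + quote(local_name, safe="_(),'")
    if lc_id:
        return LC_NAMES_BASE + lc_id.strip()
    # No usable identifier: the caller decides how to handle unmatched names
    return None

print(person_uri(wikipedia_title="Louis Armstrong"))
print(person_uri(lc_id="n00000000"))  # placeholder ID, not a real record
```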
![Page 23: A Semantic Makeover for CMS Data - Linked Jazz · Linked Jazz Project — @linkedjazz // Code4Lib 2015. Title: slides.key Created Date: 20150210192945Z](https://reader033.vdocuments.mx/reader033/viewer/2022052015/602c9e16843fd51e3d7a6cc9/html5/thumbnails/23.jpg)
Dates
YYYY: http://www.w3.org/2001/XMLSchema#gYear
YYYY-MM: http://www.w3.org/2001/XMLSchema#gYearMonth
YYYY-MM-DD: http://www.w3.org/2001/XMLSchema#date
1960s, circa 1950, Early 1949, Spring 1946: http://www.w3.org/2001/XMLSchema#string
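The mapping from date format to datatype IRI could be sketched with a few regular expressions. This is a minimal illustration with hypothetical function names; it checks only the shape of the value (digits and hyphens), not whether the month or day is valid, and anything else falls through to a plain string.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Most specific pattern first; each must match the whole value
PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "date"),
    (re.compile(r"\d{4}-\d{2}"), "gYearMonth"),
    (re.compile(r"\d{4}"), "gYear"),
]

def date_datatype(value):
    """Return the XSD datatype IRI that fits the shape of a date value."""
    value = value.strip()
    for pattern, local_name in PATTERNS:
        if pattern.fullmatch(value):
            return XSD + local_name
    # "1960s", "circa 1950", "Early 1949", "Spring 1946" and similar end up here
    return XSD + "string"

def date_literal(value):
    """Format the value as an N-Triples typed literal."""
    return f'"{value.strip()}"^^<{date_datatype(value)}>'

for example in ["1949", "1949-03", "1949-03-15", "1960s", "circa 1950"]:
    print(date_literal(example))
```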
Building N-Triples
<personURI> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<personURI> <http://xmlns.com/foaf/0.1/name> "First Last"@en .
<personURI> <http://xmlns.com/foaf/0.1/depiction> <photoURI> .
<person1URI> <http://xmlns.com/foaf/0.1/knows> <person2URI> .
<photoURI> <http://purl.org/dc/terms/created> "YYYY-MM-DD"^^<http://www.w3.org/2001/XMLSchema#date> .
<photoURI> <http://purl.org/dc/terms/spatial> <geonamesURI> .
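Filling in these patterns for a single photograph might look like the sketch below. The function names and the example.org URIs are hypothetical; the sketch assumes a full YYYY-MM-DD creation date and leaves out the foaf:knows lines, which follow the same URI-to-URI pattern.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

def escape_literal(text):
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))

def uri_triple(subject, predicate, obj):
    """Format a triple whose subject, predicate and object are all URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."

def photo_triples(photo_uri, people, created=None, place_uri=None):
    """Build N-Triples lines for one photo.

    `people` is a list of (person_uri, name) tuples; `created` is assumed
    to be a YYYY-MM-DD string and `place_uri` a place URI (both optional).
    """
    lines = []
    for person, name in people:
        lines.append(uri_triple(person, RDF_TYPE, FOAF + "Person"))
        lines.append(f'<{person}> <{FOAF}name> "{escape_literal(name)}"@en .')
        lines.append(uri_triple(person, FOAF + "depiction", photo_uri))
    if created:
        lines.append(f'<{photo_uri}> <{DCTERMS}created> "{created}"^^<{XSD_DATE}> .')
    if place_uri:
        lines.append(uri_triple(photo_uri, DCTERMS + "spatial", place_uri))
    return lines

# Placeholder values only
for line in photo_triples(
    "http://example.org/photo/1",
    [("http://example.org/person/a", 'Jo "Red" Doe')],
    created="1949-03-15",
    place_uri="http://example.org/place/1",
):
    print(line)
```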
Future Development
Integrate with existing Linked Jazz dataset
Improve VIAF matching script
Automate GeoNames place URI lookup
Work with Tulane to publish linked data
The problem of photo collages
Next Up: Discographies
Express jazz discography data in RDF
Event-based with recording session as focus
MusicBrainz/LinkedBrainz have tackled discographies to some extent, but not in the vein of traditional jazz discography
Music Ontology and Event Ontology
Use MusicBrainz URIs for releases
Acknowledgments
Hogan Jazz Archive, Tulane University
Dr. Cristina Pattuelli
Matt Miller
the Linked Jazz Team
github.com/wjlevay/tulane-jazz-data
linkedjazz.org
Bill Levay — @wjlevay
Linked Jazz Project — @linkedjazz // Code4Lib 2015