the listening experience database

The Listening Experience Database Alessandro Adamou Knowledge Media Institute, The Open University [email protected] @anticitizen79


Post on 16-Apr-2017

Page 1: The Listening Experience Database

The Listening Experience Database

Alessandro Adamou

Knowledge Media Institute, The Open University

[email protected]

@anticitizen79

Background

Research gap between leading strands of analysis of the musical experience (cognitive, commercial, critical) [1,2], widened during the Web of Data age.

Primary sources assumed to exist in significant quantities but:

• unstructured (or worse, not digitised) and/or

• unpublished and/or

• domain-biased (popular interest, phonographic era, social media)

LED consortium formed end 2012 to collate primary evidence of listening.

• £0.75m AHRC grant (2013-15)

• £0.98m AHRC grant (2016-19)

[1] S. Burstyn. In quest of the period ear. Early Music, XXV(4):692-701, November 1997.

[2] R. C. Wegman. Music as heard: Listeners and listening in Late-Medieval and Early Modern Europe (1300-1600). Musical Quarterly, 82(3-4):432-433, 1998.

Crowdsourcing in databases

Obtaining data by soliciting contributions from a community.

Examples:

• Historic Cambridge Newspaper Collection

• Zooniverse (SETIlive, Old Weather etc.)

• Wikimedia Foundation projects

• setlist.fm, discogs.com

• UK Reading Experience Database

“Must a modern database really start up empty today?”

Inclusion protocols

• From any historical period and culture

– current oldest entry is 11th Century AD

• Involving any musical genre

• Must be documented with a referenceable source

– also unpublished, if obtained from an archival resource

– e.g. diaries, private correspondence, oral history, official papers, (auto)biographies, social media

• No solicited criticism or fictional accounts

• No minimum standard for the level of detail in describing the entities involved

• In English (primarily or officially translated)

LED-in-a-slide

http://open.ac.uk/Arts/LED

8227 individual listening experiences / ~10k submissions

Evidence from published sources as well as manuscripts

Supervised crowdsourcing by experts and enthusiasts

Implemented using Linked Data, cross-domain data reuse

Faceted browsing, search, geographical browsing

~ 400,000 RDF triples

Shape of LED data - by region

Shape of LED data - by genre

“Art music” groups Classical (incl. contemporary Classical), Baroque, Romantic, Chamber music

Shape of LED data - by period

Native Linked Data implementation

• All generated data are entirely stored as RDF triples

– Browsing, searching etc. directly on the quad store

• Multi-tenancy and crowdsourcing model with named graphs

• Modular ontology using Bibo, DC, Music Ontology, FOAF, Schema.org (and a little in-house)

• Data reuse and reconciliation with external sources is integrated with the whole lifecycle

• Flexibility of SPARQL query interface: not constrained by the facets offered by the Web portal.
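The named-graph model in the list above can be pictured with a toy quad store. This is a minimal pure-Python sketch and not LED's actual implementation: the `QuadStore` class, the graph names, the `led:` identifiers and the idea of copying a contributor's graph into a public one after review are all assumptions made for illustration.

```python
from collections import defaultdict


class QuadStore:
    """Toy in-memory quad store: triples grouped by named graph."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, graph, s, p, o):
        """Store the triple (s, p, o) inside the named graph."""
        self._graphs[graph].add((s, p, o))

    def match(self, graph=None, s=None, p=None, o=None):
        """Yield (graph, s, p, o) quads; None acts as a wildcard."""
        names = [graph] if graph is not None else sorted(self._graphs)
        for name in names:
            for triple in sorted(self._graphs.get(name, ())):
                if all(want is None or want == got
                       for want, got in zip((s, p, o), triple)):
                    yield (name, *triple)

    def promote(self, source, target):
        """Copy every triple of one graph into another (e.g. after review)."""
        self._graphs[target] |= self._graphs[source]


# Hypothetical graph names: one graph per contributor plus a public graph.
store = QuadStore()
store.add("graph:user/alice", "led:experience/1", "rdf:type", "led:ListeningExperience")
store.add("graph:user/bob", "led:experience/2", "rdf:type", "led:ListeningExperience")
store.promote("graph:user/alice", "graph:public")

public = list(store.match(graph="graph:public"))
everything = list(store.match(p="rdf:type"))
```

Keeping each contributor's statements in their own graph means a query can be scoped to one tenant, to the public graph, or to all graphs at once, which is the kind of flexibility a SPARQL endpoint over a quad store offers beyond fixed facets.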

External Linked datasets

● British National Bibliography http://bnb.data.bl.uk

○ Published works in the UK

● DBpedia http://dbpedia.org

○ Geographical data, musical works, published works worldwide

● LinkedBrainz http://linkedbrainz.org

○ Discontinued, but reengineering code has been made public

● data.gov.uk http://reference.data.gov.uk

○ Exact time instants and intervals in the British calendar

● VIAF http://viaf.org

○ For bridging alignments between BNB and DBpedia (mainly)
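Bridging through VIAF amounts to grouping identifiers that are linked, directly or indirectly, into one equivalence class. The sketch below does this with a small union-find in plain Python. It is an illustration under assumptions, not LED's reconciliation code: the function name `bridge_alignments` is invented, and the `viaf:EXAMPLE-*` identifiers are placeholders rather than real VIAF records.

```python
def bridge_alignments(links):
    """Group identifiers into equivalence classes from pairwise links."""
    parent = {}

    def find(x):
        # Follow parent pointers to the class representative,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for node in list(parent):
        classes.setdefault(find(node), set()).add(node)
    return list(classes.values())


# Placeholder VIAF identifiers; only the pairing pattern matters here.
links = [
    ("bnb:AustenJane1775-1817", "viaf:EXAMPLE-1"),
    ("dbpedia:Jane_Austen", "viaf:EXAMPLE-1"),
    ("dbpedia:Aaron_Copland", "viaf:EXAMPLE-2"),
]
groups = bridge_alignments(links)
```

Because the BNB and DBpedia identifiers both point at the same VIAF placeholder, they end up in the same group even though no direct link between them was supplied.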

Dealing with vagueness

• Under-represented or vague spatial data

– e.g. “at home in Haymarket”, “church in Italy”, “a trip from Vienna to London”

• Not fully qualified temporal instants or intervals

– e.g. “April 3, the 1820s”, “late 18th Century”, “Summer 1938, at night”, “sometime between [...]”

• Entities being described but not named

– e.g. “British soldiers”, “Anglican Mass”, “mourners of Felix Mendelssohn”, “Mrs. Britten”

• Unaligned semantics

– e.g. “Chords”, “Electric Guitar”, “Gibson Les Paul Sunburst”

– e.g. “King of England”, “Queen”, “Monarch”

Not the primary application of Linked Data, but the paradigm and founding semantics can be adapted (to an extent).

Spatial data extraction

Free text input → Stanbol + OpenNLP → Curated input → RDF + GeoSPARQL

• Free text input: Apollo Theatre, gallery

• Stanbol + OpenNLP, candidates for “Apollo Theatre, gallery”:

– dbp:Apollo_Theater (Manhattan)

– dbp:Apollo_Theatre (City of Westminster)

• RDF + GeoSPARQL: led:place/12345 sg:sfIntersects dbp:Apollo_Theater ; rdfs:label “Apollo ...gallery”@en
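The last step of the pipeline, turning a curated choice into RDF, can be sketched as a small function that emits Turtle. This is only an illustration: the function name `curated_place` is invented and the `LED` base URI is a placeholder, since the slide gives only the `led:` prefix. The GeoSPARQL, RDFS and DBpedia namespaces are the standard ones. Which DBpedia candidate gets linked is the curator's decision, so it is passed in as an argument.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DBP = "http://dbpedia.org/resource/"
LED = "http://example.org/led/"  # placeholder, not LED's real namespace


def curated_place(place_id, free_text, dbpedia_name):
    """Return Turtle linking a locally minted place to a DBpedia resource.

    The free text is kept as an English label, so detail that DBpedia
    does not model (such as "gallery") is preserved.
    """
    subject = f"<{LED}place/{place_id}>"
    # Escape backslashes and double quotes for a Turtle string literal.
    label = free_text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"{subject} <{GEO}sfIntersects> <{DBP}{dbpedia_name}> ;\n"
        f'    <{RDFS}label> "{label}"@en .'
    )


turtle = curated_place(12345, "Apollo Theatre, gallery", "Apollo_Theatre")
```

Minting a separate place and relating it with `sfIntersects`, instead of reusing the DBpedia resource directly, keeps the vaguer or finer-grained location distinct from the well-known entity it overlaps.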

Fuzziness in temporal data

Extended Date/Time Format (standard draft, Library of Congress, 2012)

• Allows formalisation of underspecified points in time and intervals

– “187u-22-uu” means “sometime in Summer in the 1870s”

• We extended it to support subjective intervals (e.g. early/mid/late, also for daytimes) and ranges (from-to)

• Made available in RDF for others to reuse, through data.open.ac.uk (currently only materialised data)
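To make the notation concrete, here is a minimal sketch that reads strings of the shape shown above, where `u` marks an unspecified digit and the codes 21 to 24 stand for the seasons in the 2012 draft. It handles only this basic shape and is an assumption-laden illustration: the function name `describe` and the returned dictionary are invented, and LED's own extensions (early/mid/late, daytimes, ranges) are not covered.

```python
import re

# Season codes from the EDTF draft: 21 Spring, 22 Summer, 23 Autumn, 24 Winter.
SEASONS = {"21": "Spring", "22": "Summer", "23": "Autumn", "24": "Winter"}

PATTERN = re.compile(
    r"^(?P<year>[0-9u]{4})(?:-(?P<mid>[0-9u]{2}))?(?:-(?P<day>[0-9u]{2}))?$"
)


def describe(edtf):
    """Summarise an underspecified date such as '187u-22-uu'."""
    m = PATTERN.match(edtf)
    if m is None:
        raise ValueError(f"unsupported date expression: {edtf!r}")
    year, mid, day = m["year"], m["mid"], m["day"]
    month = None
    if mid and mid.isdigit() and 1 <= int(mid) <= 12:
        month = int(mid)
    return {
        # Unspecified digits widen the year into a range.
        "earliest_year": int(year.replace("u", "0")),
        "latest_year": int(year.replace("u", "9")),
        "season": SEASONS.get(mid) if mid else None,
        "month": month,
        "day": int(day) if day and day.isdigit() else None,
    }


summer_1870s = describe("187u-22-uu")
```

For the slide's example this yields the year range 1870 to 1879 with the season set to Summer and no month or day, matching the reading “sometime in Summer in the 1870s”.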

LED contributions to the LD cloud

Novel data: historical music performances

Royal Carl Rosa Company – “Faust”

for orchestra and voice

date: 14 May, 1917

location: Garrick Theatre (indoors, private space)

Novel data: portions and quotes of document sources / manuscripts (not modelled in BNB)

Journeying boy : the diaries of the young Benjamin Britten 1928-1938

Diary entries:

• Page 17, Feb 14 1929: “Still absent from school work. Everso much more […]”

• Page 67, March 18 1931: “Go with Mummy to B.B.C – Beethoven concert […]”

• Page 70, April 22 1931: “Go to John Nicholson’s to tea at 2.45. & to hear Gramophone records on his new Radio-Gram Hear. Brahms. Pft. Concerto Mov. 1. (Rubenstein) Tchaik.”

• …

LED contributions to the LD cloud

Refined data: semantic alignments between DBpedia, BNB and MusicBrainz

dbpedia:Aaron_Copland ≡ mb:aad3af83-5b59-4b86-a569-1a8409149b09#_

dbpedia:Jane_Austen ≡ bnb:AustenJane1775-1817

Refined data: biographical enhancements

Mary Somerville

Full name: Mary Fairfax Greig Somerville

Social group: Rulers, chiefs, aristocracy & gentry etc.

Occupation: Scientist

Religion: Christian, Protestant

wrote: Memoir of Mary Somerville (1817, 1840’s, 1849, 1850…)

Figures on reuse

Type                            Unique instances   Total reuse   Peak
People                          8186               31869         1479
Written works                   425                7474          431
Geographical locations          1410               8470          1061
Musical works (songs, albums)   6790               4241          46
Musical genres                  343                7104          1195

Computed on 8227 distinct listening experiences

Reused distinct instances from external data sources:

Source        Reused distinct instances
DBpedia       2596
BNB           553
data.gov.uk   3278
MusicBrainz   1203

Ongoing work

• Text mining of listening evidence (e.g. most commonly used terms for describing listening for specific periods or genres).

• Analytics on structured data (community detection/clustering)

• Detection of listening experiences through Web crawling or hooking into the user experience

• Controlled vocabularies (e.g. HISCO for historical occupations)

• Linked Data Fragments for facilitating reuse (under investigation)

Further Reading

about LED:

Brown, S., Barlow, H., Adamou, A. and d'Aquin, M. (2015). The Listening Experience Database Project: Collating the Responses of the "Ordinary Listener" to Prompt New Insights into Musical Experience, The International Journal of the Humanities: Annual Review, 13, p. 17-32, CGPublisher

Brown, S., Adamou, A., Barlow, H. and d'Aquin, M. (2014). Building listening experience Linked Data through crowd-sourcing and reuse of library data, Proceedings of the 1st International Workshop on Digital Libraries for Musicology, p. 1-8, ACM

related:

Hyvönen, E. (2012). Publishing and Using Cultural Heritage Linked Data on the Semantic Web, Morgan & Claypool

Lewis, D. and Martin, T. (2015). Managing Vagueness with Fuzzy in Hierarchical Big Data. 2015 INNS Conference on Big Data, Vol. 53, p. 19-28, Elsevier

Thank you - Q&A time

Alessandro Adamou

Knowledge Media Institute, The Open University

[email protected]

@anticitizen79