Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery
ANR-14-CE24-0020
Pasquale Lisena, Raphaël Troncy, Konstantin Todorov, Manel Achichi
Digital Libraries for Musicology (DLfM) Workshop, 28th October 2017 | Shanghai Conservatory of Music
Information contained in librarian knowledge but not publicly available
Hard question for current music models and ontologies
Different practical implications (MIR, concert and radio programming, music recommendation)
Project Goals
• Improve music description to foster music exchange and reuse
• Connect sources, multiply usage, enrich user experience
• Music-specific data model
• Vocabularies and data publicly available as Linked Open Data
• Tools for visualization, interconnections, recommendation
• Experience and praxis for other institutions
Source Datasets
Works: 62 550 | XML
Scores: 9 154 | XML
Concerts: 340 609 | XML
Discs: 9 500 | XML

Works: 6 846 | UNIMARC
Scores: 30 319 | UNIMARC
Concerts: 5 164 | XML
Discs: 8 602 | XML

Works: 135 940 | INTERMARC
Scores: 89 184 | INTERMARC
Source Datasets
DATASET
Works
Scores
Concerts
Discs
Classic work
Jazz improvisation
Ethnic/World/Traditional music
How to manage this complex metadata?
State of the Art: Music Ontology
- One of the first examples of describing music using the Semantic Web
- Extends FRBR, Timeline Ontology, Event Ontology
- Uses vocabularies for Keys, Musical Instruments (by MusicBrainz), Genres (DBpedia)
Yves Raimond, Samer A. Abdallah, Mark B. Sandler, and Frederick Giasson. 2007. The Music Ontology. In 8th International Conference on Music Information Retrieval (ISMIR). 417–422
The DOREMUS model
F15 Work
F22 Expression
F28 Expression Creation
- Music-specific extension of FRBRoo
- Triplet pattern: Work-Expression-Event
- Dynamic: every triplet is autonomous and linkable to the other ones
- Relies on Linked Data principles (everything is a URI, RDF model)
http://data.doremus.org/ontology
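The Work-Expression-Event pattern can be pictured as a small set of subject-predicate-object statements. The following is a minimal sketch in plain Python, not the DOREMUS implementation: the `BASE` namespace, the `make_triplet` helper and the key `sonata-1` are hypothetical, and the `R17 created` link is an assumption borrowed from FRBRoo that does not appear on the slides. For brevity the composer and date are attached directly to the creation event.

```python
BASE = "http://example.org/"  # hypothetical namespace, not the DOREMUS one


def make_triplet(key, title, composer, year):
    """Build one autonomous Work-Expression-Event triplet as plain tuples."""
    work = f"{BASE}work/{key}"
    expression = f"{BASE}expression/{key}"
    event = f"{BASE}event/{key}"
    return [
        (work, "rdf:type", "F14 Work"),
        (expression, "rdf:type", "F22 Expression"),
        (event, "rdf:type", "F28 Expression Creation"),
        (work, "R3 is realized in", expression),
        (event, "R17 created", expression),  # assumption: taken from FRBRoo
        (expression, "P102 has title", title),
        (event, "P14 carried out by", composer),
        (event, "P4 has time span", year),
    ]


def objects(triples, subject, predicate):
    """Return every object attached to a subject through a predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


triples = make_triplet(
    "sonata-1",
    "Sonate pour violoncelle et piano no 1",
    "Ludwig van Beethoven",
    "1796",
)
work_uri = f"{BASE}work/sonata-1"
expression_uri = objects(triples, work_uri, "R3 is realized in")[0]
print(objects(triples, expression_uri, "P102 has title"))
```

Because every node is identified by a URI, a second triplet (for example a performance or a publication) can point at the same expression without touching the statements above.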
[Diagram: example graph for a Beethoven sonata described with the DOREMUS model]
Classes: F14 Work, F15 Complex Work, F19 Publication Work, F22 Expression, F24 Publication Expression, F28 Expression Creation, F30 Publication Event, E7 Activity, M2 Opus Statement, M6 Casting, M23 Casting Detail, M42 Performed Expression Creation, M43 Performed Expression, M44 Performed Work
Properties: R3 is realized in, R10 has member, P4 has time span, P7 took place at, P9 consists of, P14 carried out by, P98 born, P100 died, P102 has title, P165 incorporates, U2 foresees mop, U4 had princeps publication, U5 had premiere, U12 has genre, U17 has opus statement, U30 quantity, U31 had function of type, U38 has descriptive expression, U54 is performed expression of
Values shown in the example:
- Title: “Sonate pour violoncelle et piano no 1”@fr, “Sonates”, “Sonata in F”
- Opus statement: 5, 1
- Genre: Sonata, sonata@it, sonate@fr, klaviersonate@de
- Key: F Major, F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es
- Casting: Piano (Pianoforte@it, Fortepian@pl), quantity 1; Cello (Violoncello@it, Violoncelle@fr), quantity 1
- Expression creation: 1796
- Carried out by: Ludwig van Beethoven, Ludwig von Beethoven (born 1770, died 1827); function: composer, compositeur@fr, compositore@it
- Performance: Berlin, 1796
- Publication: Vienna, 1797
Controlled Vocabularies
<http://data.doremus.org/vocabulary/iaml/mop/wsa>
Labels (alternate labels, alternate languages): “Sax”@en, “Saxophone”@en, “Saxofone”@pt, “Sassofono”@it, “Saxophone”@fr
Notes: “English term is preferred globally”
Hierarchy: “Woodwinds”@en, “Legni”@it; “Baritone Saxophone”@en
Applications
• Disambiguation
• Search
• Graph-based analysis
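The disambiguation and search applications both reduce to resolving a free-text label, in any language, to the URI of a concept. A minimal sketch of that lookup follows, assuming the labels have already been loaded into a Python dictionary; the helper names `normalize`, `build_index` and `lookup` are illustrative and not part of any DOREMUS tool.

```python
import unicodedata

CONCEPT = "http://data.doremus.org/vocabulary/iaml/mop/wsa"

# Labels copied from the slide: (text, language tag).
LABELS = {
    CONCEPT: [
        ("Sax", "en"),
        ("Saxophone", "en"),
        ("Saxofone", "pt"),
        ("Sassofono", "it"),
        ("Saxophone", "fr"),
    ],
}


def normalize(label):
    """Strip accents, case and surrounding whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold().strip()


def build_index(labels):
    """Map every normalized label to the set of concept URIs carrying it."""
    index = {}
    for uri, pairs in labels.items():
        for text, _lang in pairs:
            index.setdefault(normalize(text), set()).add(uri)
    return index


def lookup(index, term):
    """Return the candidate URIs for a free-text term, sorted for stability."""
    return sorted(index.get(normalize(term), set()))


index = build_index(LABELS)
print(lookup(index, " SASSOFONO "))
```

A term that maps to several URIs is ambiguous and would need context, such as the hierarchy, to be resolved; a term that maps to none is a candidate for manual review.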
Controlled Vocabularies
Genres: Diabolo (629), IAML (607), Itema3 (212), Redomi (313), RAMEAU (654)
Medium of performance: MIMO (2480), Itema3 (314), IAML (419), Diabolo (2117), RAMEAU (876), Redomi (179)
Musical keys: 29
Modes: 22
Catalogues: 151
Derivation types: 16
Functions: ~30 (coming soon)
http://data.doremus.org/vocabularies
Interlinking: Vocabularies
http://data.doremus.org/vocabulary/iaml/genre/cha “cha-cha-cha”
=
http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha “cha cha cha”
http://yamplusplus.lirmm.fr/
String matching + graph traversal
Interface for validating the matching
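The string-matching half of this alignment can be illustrated with the two labels above. The sketch below is not YAM++: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure, leaves out the graph-traversal step entirely, and the 0.9 threshold and the `waltz` distractor entry are hypothetical.

```python
import re
from difflib import SequenceMatcher

IAML = {
    "http://data.doremus.org/vocabulary/iaml/genre/cha": "cha-cha-cha",
}
DIABOLO = {
    "http://data.doremus.org/vocabulary/diabolo/genre/cha_cha_cha": "cha cha cha",
    "http://example.org/genre/waltz": "waltz",  # hypothetical distractor
}


def normalize(label):
    """Lowercase and turn every run of punctuation or spaces into one space."""
    return re.sub(r"[\W_]+", " ", label.lower()).strip()


def similarity(a, b):
    """Similarity in [0, 1] between two labels after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_concepts(source, target, threshold=0.9):
    """Propose, for each source concept, its best target above the threshold."""
    proposals = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            proposals.append((s_uri, best_uri, best_score))
    return proposals


for source_uri, target_uri, score in match_concepts(IAML, DIABOLO):
    print(source_uri, "=", target_uri, score)
```

The output is a list of proposals rather than final links, which is where a validation interface comes in.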
001 FRBNF139081882FR
100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
Annotations on the slide: LANG, TITLE, MOP, OPUS, KEY
MARC issues
“MARC must die” -- Roy Tennant, 2002
http://lj.libraryjournal.com/2002/10/ljarchives/marc-must-die/#_
• Different variants: UNIMARC, INTERMARC
• Free-text fields: different practices in describing the same information (“Op. 27 n. 2” vs. “Op. 27 no 2”)
• Frequent mistakes in editorial work: wrong fields, typos, wrong punctuation
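To make the free-text problem concrete, here is a small sketch that splits the 144 field shown above into subfields and reduces the competing opus spellings to one form. It is an illustration only, not the marc2rdf code; the function names and the regular expressions are assumptions, and real records will contain variants these patterns do not cover.

```python
import re

FIELD_144 = "$w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"

# "Op.", a number, then optionally "n." / "no" and a second number.
OPUS_PATTERN = re.compile(
    r"op\.?\s*(\d+)(?:\s*,?\s*(?:no|n)\.?\s*(\d+))?",
    re.IGNORECASE,
)


def parse_subfields(field):
    """Split '$<code><value>' runs into (code, value) pairs, keeping order."""
    return re.findall(r"\$(\w)([^$]*)", field)


def parse_opus(text):
    """Return (opus number, sub-number or None), or None when nothing matches."""
    found = OPUS_PATTERN.search(text)
    if found is None:
        return None
    number = int(found.group(1))
    sub_number = int(found.group(2)) if found.group(2) else None
    return number, sub_number


subfields = dict(parse_subfields(FIELD_144))
print(subfields["a"], "|", subfields["b"], "|", parse_opus(subfields["p"]))

for variant in ("Op. 27 n. 2", "Op. 27 no 2", "Op. 27, no 2"):
    print(variant, "->", parse_opus(variant))
```

All three spellings reduce to the same pair of integers, which is the form a converter would need before building an opus statement.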
Data conversion
marc2rdf
https://github.com/DOREMUS-ANR/marc2rdf/
expert-made mapping rules
controlled vocabularies
Tasks
• Field parsing and mapping
• NLP techniques
• Graph generation
• String2URI
Interlinking: Works
http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea “Sonata quasi una fantasia”
=
http://data.doremus.org/expression/22679001-2cd0-3f84-b502-0f337429966f “Quasi una fantasia”
Legato: https://github.com/DOREMUS-ANR/legato
F-measure > 0.85
Precision > 0.87
Recall > 0.82
Interlinking: Works
1. Data cleaning: removing “noisy” properties, i.e. identifiers, comments, …
2. Instance profiling: represent each resource as a sub-graph
3. Instance indexing and matching: convert the sub-graph into a set of keywords in order to apply text document matching techniques
4. Post-processing: clustering of the datasets to identify false positives from the previous steps
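Steps 1 to 3 can be sketched in a few lines. This is a deliberate simplification and not the Legato code: resources are flat dictionaries instead of sub-graphs, Jaccard overlap stands in for whatever text-matching technique Legato applies, and step 4 is omitted. Apart from the two URIs and titles taken from the previous slide, all property values, the `example.org` resource and the 0.5 threshold are hypothetical.

```python
import re

NOISY_PROPERTIES = {"identifier", "comment"}

SOURCE = {
    "http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea": {
        "title": ["Sonata quasi una fantasia"],
        "composer": ["Ludwig van Beethoven"],  # hypothetical value
        "identifier": ["src-001"],  # hypothetical value
    },
}
TARGET = {
    "http://data.doremus.org/expression/22679001-2cd0-3f84-b502-0f337429966f": {
        "title": ["Quasi una fantasia"],
        "composer": ["Ludwig van Beethoven"],  # hypothetical value
        "comment": ["imported record"],  # hypothetical value
    },
    "http://example.org/expression/other": {  # hypothetical resource
        "title": ["Sonata in F"],
        "composer": ["Ludwig van Beethoven"],
    },
}


def clean(resource):
    """Step 1: drop properties that carry no matching signal."""
    return {p: v for p, v in resource.items() if p not in NOISY_PROPERTIES}


def profile(resource):
    """Steps 2-3: flatten the remaining values into a set of keywords."""
    words = set()
    for values in resource.values():
        for value in values:
            words.update(re.findall(r"\w+", value.lower()))
    return words


def jaccard(a, b):
    """Share of keywords the two profiles have in common."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def match(source, target, threshold=0.5):
    """Step 3: link each source resource to its best-scoring target."""
    links = []
    for s_uri, s_resource in source.items():
        s_words = profile(clean(s_resource))
        best_uri, best_score = None, 0.0
        for t_uri, t_resource in target.items():
            score = jaccard(s_words, profile(clean(t_resource)))
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, best_uri, round(best_score, 2)))
    return links


links = match(SOURCE, TARGET)
print(links)
```

In this toy data the two "Quasi una fantasia" records share six of seven keywords, while the distractor shares only the composer's name and one title word, so only the first pair passes the threshold.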
Visualizing
http://overture.doremus.org
Prototype of a web app that uses the DOREMUS dataset
• Follow the links like in the graph
• Enriched experience: DBpedia, GeoNames, …
• Timeline of related events
• Similar works recommendation
Future Work
• Pivot Vocabularies of Genres and MoPs as a result of the interconnection task
• Recommendation System. First step: “Combining Music Specific Embeddings for Computing Artist Similarity” @ISMIR2017
• Schema.org injection in all pages. Goals: SEO, simplification of the data in order to extend their usage
This and more questions: https://github.com/DOREMUS-ANR/knowledge-base/tree/master/query-examples
Links
DOREMUS Website: http://www.doremus.org/
GitHub page with tools, converters, ontologies, ...: https://github.com/DOREMUS-ANR/
Dataset & SPARQL Endpoint: https://data.doremus.org/sparql and https://data.doremus.org/fct
OVERTURE: https://overture.doremus.org/
This presentation: https://www.slideshare.net/squalelis
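As a starting point for exploring the SPARQL endpoint listed above, the sketch below builds a request that lists every property and value of one expression URI from the slides. It follows the standard SPARQL protocol (a `query` parameter and an `Accept` header) and uses only the Python standard library; the function names are illustrative, the response shape assumes the standard SPARQL JSON results format, and nothing here has been checked against the live service.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://data.doremus.org/sparql"
EXPRESSION = (
    "http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea"
)


def describe_query(uri, limit=20):
    """SPARQL query returning the properties and values of one resource."""
    return f"SELECT ?p ?o WHERE {{ <{uri}> ?p ?o }} LIMIT {limit}"


def build_request(query, endpoint=ENDPOINT):
    """Prepare a GET request asking for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run(query):
    """Send the query; needs network access and a reachable endpoint."""
    with urllib.request.urlopen(build_request(query), timeout=30) as response:
        return json.load(response)["results"]["bindings"]


query = describe_query(EXPRESSION, limit=5)
request = build_request(query)
print(query)
print(request.full_url)
```

Calling `run(query)` would perform the actual request; it is left out of the example so that the sketch runs offline.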