ieee-themes: analysis and exploitation of musician social networks for recommendation and discovery
Post on 05-Dec-2014
Analysis and Exploitation of Musician Social Networks
for Recommendation and Discovery
Kurt Jacobson
Ben Fields b.fields@gold.ac.uk
Christophe Rhodes
Mark Sandler
Michael Casey
Fields et. al - Analysis and Exploitation of Musician Social Networks2
overview
– motivation
– dataset
– experiments
– social radio
motivation
motivation Novelty Curves
motivation The Web
So much music, so little time.
So much music, so little of it good.
How do we discover good music?
listening
listening
social
dataset Sampling Myspace
[diagram: a randomly selected artist links to that artist's top friends, and each of those artists links in turn to its own top friends]
dataset Sampling Myspace
– scale-free (mostly)
– 15,478 nodes (artist pages)
– 120,487 directed edges
– 91,326 undirected edges
– avg. degree
  – 15.5 as a directed graph
  – 11.8 when undirected
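The average-degree figures can be checked against the node and edge counts. The following is a minimal sketch; it assumes the directed figure counts both incoming and outgoing links per node, which the slide does not state explicitly.

```python
# Counts taken from the slide.
nodes = 15_478
directed_edges = 120_487
undirected_edges = 91_326

# Assumption: the directed "avg. degree" counts in-links plus out-links,
# so each edge contributes two endpoints in both the directed and the
# undirected case.
avg_degree_directed = 2 * directed_edges / nodes
avg_degree_undirected = 2 * undirected_edges / nodes

print(f"directed:   {avg_degree_directed:.2f}")
print(f"undirected: {avg_degree_undirected:.2f}")
```

Under that assumption the values come out at roughly 15.6 and 11.8, in line with the 15.5 and 11.8 quoted on the slide.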
dataset Cumulative Degree Distribution
experiments
experiments Geodesic v. Acoustic Distance
– pair nodes by geodesic distance
– looking for correlation with pairwise EMD (Earth Mover's Distance)
– result is inconclusive
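The comparison described on this slide can be sketched as follows. This is an illustration only, not the authors' code: the toy graph and the `acoustic` values are made up, geodesic distance is taken as breadth-first hop count, and Pearson correlation stands in for whatever correlation measure was actually used.

```python
from collections import deque
from itertools import combinations
from statistics import mean


def geodesic_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy network (undirected, as adjacency lists).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

# Hypothetical pairwise acoustic distances (stand-ins for EMD values).
acoustic = {
    ("a", "b"): 0.2, ("a", "c"): 0.5, ("a", "d"): 0.4,
    ("b", "c"): 0.3, ("b", "d"): 0.6, ("c", "d"): 0.1,
}

hops, dists = [], []
for u, v in combinations(sorted(adj), 2):
    d = geodesic_distances(adj, u).get(v)
    if d is not None:  # skip pairs with no connecting path
        hops.append(d)
        dists.append(acoustic[(u, v)])

r = pearson(hops, dists)
print(f"correlation between hop count and acoustic distance: {r:.3f}")
```

A correlation near zero on the real data would match the inconclusive result reported above.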
experiments Max Flow v. Acoustic Distance
– pairs of artist nodes are grouped by their maximum flow value
– a randomized network was also created for comparison
– results point toward a mostly orthogonal relationship
– the mutual information shows that most information is not shared across the two spaces
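For readers unfamiliar with maximum flow, here is a minimal sketch of the quantity used to group artist pairs. It uses the Edmonds-Karp algorithm on a made-up graph with unit capacities; the slides do not say which algorithm or capacities were used, so both are assumptions.

```python
from collections import defaultdict, deque


def max_flow(edges, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, capacity in edges:
        residual[u][v] += capacity
        residual[v][u] += 0  # make sure the reverse edge exists
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update residual capacities along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical "top friend" links, each with capacity 1 (an assumption).
edges = [("a", "b", 1), ("a", "c", 1), ("b", "d", 1), ("c", "d", 1), ("b", "c", 1)]

print(max_flow(edges, "a", "d"))
print(max_flow(edges, "a", "b"))
```

With unit capacities the maximum flow between two artists equals the number of edge-disjoint paths connecting them, which is one way to read it as a measure of social closeness.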
experiments Max Flow v. EMD
experiments Max Flow v. Marsyas distance
experiments Low Entropy Communities
– looking at whether communities are more homogeneous if edges are weighted with sonic similarity
– uses genre entropy
Figure 1. Box and whisker plot showing the spread of community genre entropies for each graph partition method, where gm is greedy modularity, gm+a is greedy modularity with audio weights, wt is walktrap, and wt+a is walktrap with audio weights. The horizontal line represents the genre entropy of the entire sample. The circles represent the average value of genre entropy for a random partition of the network into an equivalent number of communities.
If an artist specified no genre tags, this node is ignored and makes no contribution to the genre entropy score. In our data set, 2.6% of artists specified no genre tags.
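The genre entropy score can be illustrated with a short sketch. It assumes the score is the Shannon entropy of the pooled genre-tag distribution within a community, using the natural logarithm; the excerpt does not give the exact formula or log base, so treat both as assumptions. The artist names and tags are invented.

```python
from collections import Counter
from math import log


def genre_entropy(artist_tags):
    """Shannon entropy (natural log) of the genre tags pooled over a community.

    artist_tags maps an artist to a list of genre tags. Artists with an
    empty list add nothing to the counts, so they are effectively ignored.
    """
    counts = Counter(tag for tags in artist_tags.values() for tag in tags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log(n / total) for n in counts.values())


# Hypothetical communities.
homogeneous = {"artist1": ["Rap"], "artist2": ["Rap"], "artist3": []}
mixed = {"artist1": ["Rap"], "artist2": ["Punk"]}

print(genre_entropy(homogeneous))
print(genre_entropy(mixed))
```

A community whose artists all share one tag scores zero, and the score grows as the tags become more evenly spread across genres.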
4 RESULTS
The results of the various community detection algorithms are summarized in Figure 1 and Table 1. When the genre entropies are averaged across all the detected communities, we see that for every community detection method the average genre entropy is lower than S_G, the genre entropy of the whole sample, as well as lower than the average genre entropy for a random partition of the graph into an equal number of communities. This is strong evidence that the community structure of the network is related to musical genre.
It should be noted that even a very simple examination of the genre distributions for the entire network sample suggests a network structure that is closely related to musical genre. Of all the genre associations collected for our data set, 50.3% of the tags were either “Hip-Hop” or “Rap” while 11.4% of tags were “R&B”. Smaller informal network samples, independent of our main data set, were also dominated by a handful of similar genre tags (e.g. “Alternative”, “Indie”, “Punk”). In context, this suggests our sample was essentially “stuck” in a community of Myspace artists associated with these particular genre inclinations. However, it is possible that these genre distributions are indicative of the entire Myspace artist network. Regardless, given that the genre entropy of our entire set is so low to begin with, it is an encouraging result that we could efficiently identify communities of artists with even lower genre entropies.

algorithm   c     ⟨S_C⟩   ⟨S_rand⟩   Q
none        1     1.16    -          -
gm          42    0.81    1.13       0.61
gm+a        33    0.90    1.13       0.64
wt          195   0.80    1.08       0.61
wt+a        271   0.70    1.06       0.62

Table 1. Results of the community detection algorithms, where c is the number of communities detected, ⟨S_C⟩ is the average genre entropy for all communities, ⟨S_rand⟩ is the average genre entropy for a random partition of the network into an equal number of communities, and Q is the modularity for the given partition.
From Figure 1 we see that, without audio-based weighting, the greedy modularity algorithm (gm) and the walktrap algorithm (wt) result in nearly the same genre entropies. However, the walktrap algorithm results in almost five times as many communities, which we would expect, because of smaller community size, to result in a lower genre entropy. It should also be noted that the optimized greedy modularity algorithm is considerably faster than the walktrap algorithm: O(m log n) versus O(n^2 log n).
With audio-based weighting, we see mixed results. Audio-based weighting seems to improve the results of the walktrap algorithm (wt+a), decreasing genre entropy and increasing modularity slightly. However, applying audio weights to the greedy modularity algorithm (gm+a) actually increased the genre entropy scores and resulted in the identification of fewer communities. It should be noted that our approach to audio-based similarity was fairly primitive and alternative approaches may yield better results.
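The modularity Q reported for each partition can be made concrete with a small worked example. This sketch assumes the standard Newman definition for an unweighted, undirected graph, Q = Σ_c [ L_c/m − (d_c/2m)^2 ], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and m is the number of edges. The excerpt does not spell out the exact variant used (the audio-weighted runs would need a weighted form), and the toy graph is invented.

```python
from collections import Counter


def modularity(edges, community_of):
    """Newman modularity of a partition of an unweighted, undirected graph."""
    m = len(edges)
    intra = Counter()   # edges with both endpoints in the same community
    degree = Counter()  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the two triangles gives Q = 5/14, while lumping everything into one community gives Q = 0, which is why higher Q is read as stronger community structure.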
5 MYSPACE AND THE SEMANTIC WEB
Since our results indicate that the Myspace artist network is of interest in the context of music-related studies, we have made an effort to convert this data to a more structured format. We have created a Web service [5] that describes any Myspace page in a machine-readable Semantic Web format. Using FOAF [6] and the Music Ontology [7], the service describes a Myspace page in RDF/XML. This will allow future applications to easily make use of Myspace network data (e.g. for music recommendation).
[5] available at (Omitted for submission)
[6] http://www.foaf-project.org/
[7] http://musicontology.com/
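To give a feel for what such a description might look like, the sketch below builds a small RDF/XML document with Python's standard library. It is not the service's actual output: the choice of `mo:MusicArtist`, `foaf:name`, and `foaf:knows` is an assumption about how an artist and its top friends could be modeled, and the `example.org` URIs and the `describe_artist` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
MO = "http://purl.org/ontology/mo/"

# Register prefixes so the serialized XML uses rdf:, foaf:, and mo:.
for prefix, uri in (("rdf", RDF), ("foaf", FOAF), ("mo", MO)):
    ET.register_namespace(prefix, uri)


def describe_artist(uri, name, top_friend_uris):
    """Hypothetical helper: describe one artist page and its top friends."""
    root = ET.Element(f"{{{RDF}}}RDF")
    artist = ET.SubElement(root, f"{{{MO}}}MusicArtist", {f"{{{RDF}}}about": uri})
    ET.SubElement(artist, f"{{{FOAF}}}name").text = name
    for friend in top_friend_uris:
        # Assumption: a top-friend link is modeled as foaf:knows.
        ET.SubElement(artist, f"{{{FOAF}}}knows", {f"{{{RDF}}}resource": friend})
    return ET.tostring(root, encoding="unicode")


doc = describe_artist(
    "http://example.org/artist/1",
    "Example Artist",
    ["http://example.org/artist/2", "http://example.org/artist/3"],
)
print(doc)
```

Because the output is plain RDF/XML, any RDF toolkit could load it and follow the friend links as graph edges.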
social radio
social radio Weighted Max Flow Playlists
– max flow presents an interesting opportunity to create playlists along paths of least resistance
– preliminary testing shows promise
– needs more exhaustive testing
social radio Playlist Generator
social radio The Social Radio
– produce playlists via weighted distance paths
– next destination song is determined via a vote across all listeners
– candidate songs selected from disparate communities
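The two mechanisms listed above can be sketched together. This is a guess at the general shape, not the actual Social Radio implementation: it assumes the "weighted distance path" is a shortest path under per-edge distance weights (found here with Dijkstra's algorithm) and that the vote is a simple plurality. The graph, weights, and votes are invented.

```python
import heapq
from collections import Counter


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best[node]:
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


def next_destination(votes):
    """Plurality vote; ties go to the candidate that was voted for first."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical artist graph; edge weights stand in for acoustic distance.
graph = {
    "a": {"b": 0.2, "c": 0.9},
    "b": {"c": 0.3, "d": 0.8},
    "c": {"d": 0.1},
    "d": {},
}

# Hypothetical listener votes over candidate destinations.
destination = next_destination(["d", "c", "d"])
playlist = shortest_path(graph, "a", destination)
print(destination, playlist)
```

Each round, the winning destination becomes the end point of the next path, so the playlist moves through the network in steps that are acoustically small but socially guided.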
resources
– http://mypyspace.sourceforge.net/
– http://dbtune.org/myspace/
– http://omras2.doc.gold.ac.uk/software/fftExtract/
– slides: http://slideshare.com/BenFields
– contact: b.fields@gold.ac.uk
http://blog.benfields.net
twitter: @alsothings
Questions?