mp25: audio fingerprinting and metadata correction with python
DESCRIPTION
TRANSCRIPT
Audio fingerprinting and metadata correction with Python
Alastair Porter
November 21, 2011
Me
Background in Computer Science
Masters, McGill Music Tech
Online:
http://github.com/alastair (20/28 music; 11 in Python)
http://twitter.com/alastairporter
Python as a go-to language
Quick for prototyping
Use the same code in a production release
Very handy for API access (thin wrapper around urllib2)
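As a rough illustration of the "thin wrapper" idea, here is a minimal sketch written for Python 3, where urllib2 became urllib.request. The class name ApiClient, the base URL api.example.org, and the endpoint and parameter names are all hypothetical and are not taken from the talk.

```python
import json
import urllib.parse
import urllib.request


class ApiClient:
    """Thin wrapper: build a URL, fetch it, decode the JSON body."""

    def __init__(self, base_url, opener=urllib.request.urlopen):
        self.base_url = base_url.rstrip("/")
        # The opener is injectable so the client can be exercised offline.
        self.opener = opener

    def build_url(self, endpoint, **params):
        # Sort parameters so the same call always yields the same URL.
        query = urllib.parse.urlencode(sorted(params.items()))
        return f"{self.base_url}/{endpoint}?{query}"

    def get(self, endpoint, **params):
        with self.opener(self.build_url(endpoint, **params)) as response:
            return json.loads(response.read().decode("utf-8"))


# Hypothetical service and parameters, shown only to illustrate the shape.
client = ApiClient("https://api.example.org/v1")
print(client.build_url("search", artist="Example Artist", fmt="json"))
```

Only build_url runs here, so the sketch never touches the network; calling get against a real service would need that service's actual endpoints and parameters.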
Music and Metadata
The problem:
People are really bad at naming music
Inconsistent over releases
The solution:
Crowdsourcing
Get info from as many trusted sources as possible
Make renaming take no effort
MusicBrainz
Amazon
Amazon (Coverart)
Last.fm
Last.fm (Genre tags)
MusicBrainz
albumidentify
http://github.com/albumidentify/albumidentify
MP3, FLAC, Ogg, CDs
Identification strategy
If there’s a CD TOC, use that (MusicBrainz lookup)
If no match, use audio fingerprinting
If no match, do a text lookup (artist/album)
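The fallback order above can be sketched as a chain of lookups that stops at the first match. This is only an illustration: the three lookup functions are hypothetical placeholders with canned return values, not albumidentify's actual code.

```python
from typing import Callable, Optional, Sequence

# Each strategy takes a directory path and returns a release ID,
# or None when it finds no match. All names here are hypothetical.
Strategy = Callable[[str], Optional[str]]


def identify(path: str, strategies: Sequence[Strategy]) -> Optional[str]:
    """Try each strategy in order and return the first match."""
    for strategy in strategies:
        release_id = strategy(path)
        if release_id is not None:
            return release_id
    return None


def lookup_by_toc(path: str) -> Optional[str]:
    # Placeholder: pretend this rip has no saved CD TOC.
    return None


def lookup_by_fingerprint(path: str) -> Optional[str]:
    # Placeholder: pretend the audio fingerprints matched a release.
    return "release-from-fingerprint"


def lookup_by_text(path: str) -> Optional[str]:
    # Placeholder: last resort, an artist/album text search.
    return "release-from-text"


result = identify(
    "/music/unknown_album",
    [lookup_by_toc, lookup_by_fingerprint, lookup_by_text],
)
print(result)
```

Keeping the strategies in an ordered list makes the priority explicit and lets a new lookup be added without touching identify.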
Fingerprinting
Converts an audio signal to a short sequence of numbers
Smaller to compare than an entire file
Perceptual features rather than byte comparison (works with different encodings)
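To make "perceptual features rather than byte comparison" concrete, here is a deliberately simplified toy. It is not the fingerprinter albumidentify uses, nor any production algorithm: it just emits one bit per frame boundary saying whether the frame energy rose. The function names, the frame size, and the synthetic test tone are assumptions made for illustration.

```python
import math
from typing import List


def toy_fingerprint(samples: List[float], frame_size: int = 256) -> List[int]:
    """Return one bit per frame boundary: 1 if energy rose, else 0."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame))
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]


def bit_errors(a: List[int], b: List[int]) -> int:
    """Count positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# One second of a 440 Hz tone at 8 kHz with a slow loudness wobble.
RATE = 8000
samples = [
    math.sin(2 * math.pi * 440 * n / RATE)
    * (1 + 0.5 * math.sin(2 * math.pi * 3 * n / RATE))
    for n in range(RATE)
]
# A quieter copy stands in for "the same audio, stored differently".
quieter = [0.5 * s for s in samples]

print(samples == quieter)  # the raw values differ
print(bit_errors(toy_fingerprint(samples), toy_fingerprint(quieter)))
```

The raw sample values of the two signals differ everywhere, yet the fingerprints agree, because the bits depend only on how the energy changes over time. Real fingerprinters use much richer spectral features and tolerate a small number of bit errors when matching.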
Identification strategy
Fingerprinting gives us a set of candidate tracks
A track could be on many albums (original release, best of, mix album)
Keep a list of what tracks we have for each album
Once we fill all the slots for an album, success!
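The slot-filling idea can be sketched roughly as follows. This assumes each file's candidates arrive as (album ID, track number) pairs and that the track count of each album is already known, for example from a MusicBrainz lookup. The data layout, function name, and sample data are hypothetical, not albumidentify's internals.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# A candidate is (album_id, track_number); one file may match many albums.
Candidate = Tuple[str, int]


def find_complete_album(
    file_candidates: Dict[str, List[Candidate]],
    album_sizes: Dict[str, int],
) -> Optional[Tuple[str, Dict[int, str]]]:
    """Return (album_id, {track_number: filename}) for the first album
    whose slots are all filled, or None if no album is completed."""
    slots: Dict[str, Dict[int, str]] = defaultdict(dict)
    for filename, candidates in file_candidates.items():
        for album_id, track_number in candidates:
            filled = slots[album_id]
            # Keep the first file seen for each slot.
            filled.setdefault(track_number, filename)
            if len(filled) == album_sizes[album_id]:
                return album_id, dict(filled)
    return None


# Hypothetical data: three files, two of which also appear on a best-of.
file_candidates = {
    "01.mp3": [("original", 1), ("best-of", 4)],
    "02.mp3": [("original", 2)],
    "03.mp3": [("original", 3), ("best-of", 7)],
}
album_sizes = {"original": 3, "best-of": 10}

match = find_complete_album(file_candidates, album_sizes)
print(match)
```

The best-of album never fills all ten slots, so only the original release is reported as a match.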
Metadata strategy
Text information from MusicBrainz
Genre from Last.fm
Image from Amazon (or folder.jpg)
MusicBrainz tells us where these are (don’t need to search)
Save in every file (text is cheap)
Writing it all out
Custom MP3/ID3 writer
Ogg meta tags
FLAC meta tags
Name files:
Artist/Artist - Year - Album/01 - Artist - Track
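A minimal sketch of building a path in this layout is shown below. The character-replacement rule in safe and the function names are assumptions; the talk does not say how albumidentify sanitises names.

```python
import re


def safe(text: str) -> str:
    """Replace characters that are awkward in file names (an assumed rule)."""
    return re.sub(r'[\\/:*?"<>|]', "_", text).strip()


def target_path(artist: str, year: int, album: str,
                track_number: int, title: str, extension: str) -> str:
    """Build Artist/Artist - Year - Album/NN - Artist - Track.ext"""
    artist_dir = safe(artist)
    album_dir = f"{safe(artist)} - {year} - {safe(album)}"
    filename = f"{track_number:02d} - {safe(artist)} - {safe(title)}.{extension}"
    # Join with "/" to mirror the layout shown on the slide.
    return "/".join([artist_dir, album_dir, filename])


print(target_path("AC/DC", 1980, "Back in Black", 1, "Hells Bells", "mp3"))
```

Sanitising each component before joining keeps an artist name that contains a slash from creating an extra directory level.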
ReplayGain!
Be a good citizen: submit fingerprints to MusicBrainz
What’s next
New version of MusicBrainz
New fingerprinter
More metadata
More metadata
Thanks
More information:
MusicBrainz: http://musicbrainz.org
albumidentify: http://github.com/albumidentify/albumidentify
More fingerprinting: http://acoustid.org, http://echoprint.me
Last.fm