
Organization of and Searching in Musical Information

Donald Byrd

School of Music

Indiana University

19 January 2006

rev. Jan. 2006

Overview

1. Introduction and Motivation

2. Basic Representations

3. Why is Musical Information Hard to Handle?

4. Music vs. Text and Other Media

5. OMRAS and Other Projects

6. Summary

1. Introduction and Motivation

• Three basic forms (representations) of music are important

  – Audio: most important for most people (general public)

    • All Music Guide (www.allmusicguide.com) has info on >>230,000 CDs

  – MIDI files: often best or essential for some musicians, especially for pop, rock, film/TV

    • Hundreds of thousands of MIDI files on the Web

  – CMN (Conventional Music Notation): often best, sometimes essential for musicians (even amateurs) and music researchers

    • Music holdings of Library of Congress: over 10M items

      – Includes over 6M pieces of sheet music and tens/hundreds of thousands of scores of operas, symphonies, etc.: all notation, especially Conventional Music Notation (CMN)

• Differences among the forms are profound


2. Basic Representations of Music & Audio

Audio (e.g., CD, MP3): like speech

Time-stamped Events (e.g., MIDI file): like unformatted text

Music Notation: like text with complex formatting

[Figure: examples of the three representations: digital audio, time-stamped events, music notation]

Basic Representations of Music & Audio

                    Audio                       Time-stamped Events       Music Notation
Common examples     CD, MP3 file                Standard MIDI File        Sheet music
Unit                Sample                      Event                     Note, clef, lyric, etc.
Explicit structure  none                        little (partial voicing   much (complete voicing
                                                information)              information)
Avg. rel. storage   2000                        1                         10
Convert to left     -                           OK job: easy              OK job: easy
                                                Good job: hard            Good job: hard
Convert to right    1 note: pretty easy         OK job: hard              -
                    other: hard or very hard
Ideal for           music                       music                     music
                    bird/animal sounds
                    sound effects
                    speech

The Four Parameters of Notes

• Four basic parameters of a definite-pitched musical note

  1. pitch: how high or low the sound is: perceptual analog of frequency

  2. duration: how long the note lasts

  3. loudness: perceptual analog of amplitude

  4. timbre or tone quality

• Above is decreasing order of importance for most Western music

• …and decreasing order of explicitness in CMN!
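As a rough illustration of the four parameters, a time-stamped-event representation might store them as in the sketch below. None of this comes from the talk: the `Note` class and its field names are hypothetical, loudness is modeled as a MIDI-style velocity (0-127), timbre as a plain instrument label, and the frequency conversion assumes 12-tone equal temperament with A4 = 440 Hz at MIDI note 69.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical container for the four parameters of a definite-pitched note."""
    pitch: int        # MIDI note number (60 = middle C); MIDI-style encoding is an assumption
    duration: float   # length in quarter notes
    loudness: int     # MIDI-style velocity, 0-127
    timbre: str       # instrument label, e.g. "violin"


def frequency_hz(midi_pitch: int, a4_hz: float = 440.0) -> float:
    """Frequency of a MIDI pitch, assuming equal temperament with A4 = MIDI note 69."""
    return a4_hz * 2.0 ** ((midi_pitch - 69) / 12.0)


a4 = Note(pitch=69, duration=1.0, loudness=80, timbre="violin")
print(frequency_hz(a4.pitch))       # 440.0
print(frequency_hz(a4.pitch + 12))  # one octave up: 880.0
```

Pitch and duration get precise numeric fields here, while timbre is reduced to a label, which loosely mirrors the decreasing explicitness noted above.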

How to Read Music Without Really Trying

• CMN shows at least six aspects of music:

  – NP1. Pitches (how high or low): on vertical axis

  – NP2. Durations (how long): indicated by note/rest shapes

  – NP3. Loudness: indicated by signs like p, mf, etc.

  – NP4. Timbre (tone quality): indicated with words like “violin”, “pizzicato”, etc.

  – Start times: on horizontal axis

  – Voicing: mostly indicated by staff; in complex cases also shown by stem direction, beams, etc.

• See “Essentials of Music Reading” musical example.


3. Why is Musical Information Hard to Handle?

1. Units of meaning: not clear anything in music is analogous to words (all representations)

2. Polyphony: “parallel” independent voices, something like characters in a play (all representations)

3. Recognizing notes (audio only)

4. Other reasons

– Musician-friendly I/O is difficult

– Diversity: of styles of music, of people interested in music

Units of Meaning (Problem 1)

• Not clear anything in music is analogous to words

  – No explicit delimiters (like Chinese)

  – Experts don’t agree on “word” boundaries (unlike Chinese)

  – Music is always art => “meaning” much more subtle!

• Are notes like words?

  • No. Relative, not absolute, pitch is important

• Are pitch intervals like words?

  • No. They’re too low level: more like characters

• Are pitch-interval sequences like words?

  • In some ways, but

    – Ignores note durations

    – Ignores relationships between voices (harmony)

    – Probably little correlation with semantics
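The point about relative pitch can be made concrete with a short sketch. Assuming pitches are encoded as MIDI note numbers (an assumption of this example, not something the slides specify), the hypothetical function below reduces a melody to its sequence of pitch intervals in semitones. Transposing the melody leaves the interval sequence unchanged, but a version with a different rhythm also maps to the same sequence, which is the weakness of ignoring note durations.

```python
from typing import List, Tuple

# Each note is (MIDI pitch, duration in quarter notes); this encoding is an assumption.
NoteT = Tuple[int, float]


def pitch_intervals(melody: List[NoteT]) -> List[int]:
    """Return the signed semitone intervals between successive notes."""
    pitches = [pitch for pitch, _ in melody]
    return [b - a for a, b in zip(pitches, pitches[1:])]


# Opening of "Twinkle, Twinkle, Little Star" in C major: C C G G A A G
twinkle_c = [(60, 1), (60, 1), (67, 1), (67, 1), (69, 1), (69, 1), (67, 2)]
# Same tune transposed up a whole step to D major
twinkle_d = [(p + 2, d) for p, d in twinkle_c]
# Same pitches with a different (long-short) rhythm
twinkle_dotted = [(p, 1.5 if i % 2 == 0 else 0.5) for i, (p, _) in enumerate(twinkle_c)]

print(pitch_intervals(twinkle_c))                                     # [0, 7, 0, 2, 0, -2]
print(pitch_intervals(twinkle_d) == pitch_intervals(twinkle_c))       # True
print(pitch_intervals(twinkle_dotted) == pitch_intervals(twinkle_c))  # True
```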


Independent Voices in Music (Problem 2)

J.S. Bach: “St. Anne” Fugue, beginning


Independent Voices in Text

MARLENE. What I fancy is a rare steak. Gret?

ISABELLA. I am of course a member of the / Church of England.*

GRET. Potatoes.

MARLENE. *I haven’t been to church for years. / I like Christmas carols.

ISABELLA. Good works matter more than church attendance.

--Caryl Churchill: “Top Girls” (1982), Act 1, Scene 1

Performance (time goes from left to right):

M: What I fancy is a rare steak. Gret?                                                   I haven’t been...
I:                                     I am of course a member of the Church of England.
G:                                                                    Potatoes.

Music Notation vs. Audio

• Relationship between notation and its sound is very subtle

• Not at all one symbol <=> one symbol

  – Notes w/ornaments (trills, etc.) are one => many

  – All symbols but notes are one => zero!

  – Bach F-major Toccata example

• Style-dependent

  – Swing (jazz), dotting (baroque art music)

  – Improvisation (baroque art music, jazz)

  – “Events” (20th-century art music)

  – How well-defined is style-dependent

• Interpretation is difficult even for musicians

  – Can take 50-90% of lesson time for performance students

Music Perception and Music IR

• Salience is affected by texture, loudness, etc.

  – Inner voices in orchestral music rarely salient

• Streaming effects and cross-voice matching

  – produced by timbre: Wessel’s illusion (Ex. 1, 2)

  – produced by register: Telemann example (Ex. 3)

• Octave identities, timbre and texture

  – Beethoven “Hammerklavier” Sonata example (Ex. 4, 5)

  – Affects pitch-interval matching
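To see why octave identities affect pitch-interval matching, consider the sketch below. It is only an illustration under assumed MIDI pitch numbers, with hypothetical function names, and is not a method attributed to any system in this talk. Moving some notes up an octave changes the raw interval sequence completely, while reducing intervals modulo 12 makes the two versions match again.

```python
from typing import List


def intervals(pitches: List[int]) -> List[int]:
    """Signed semitone intervals between successive MIDI pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def intervals_mod_octave(pitches: List[int]) -> List[int]:
    """Intervals reduced modulo 12, so octave displacements are ignored."""
    return [i % 12 for i in intervals(pitches)]


original = [60, 62, 64, 65]   # C4 D4 E4 F4
displaced = [60, 74, 64, 77]  # same pitch classes, D and F moved up an octave

print(intervals(original))    # [2, 2, 1]
print(intervals(displaced))   # [14, -10, 13]
print(intervals_mod_octave(original) == intervals_mod_octave(displaced))  # True
```

The price of the reduction is that a rising fifth and a falling fourth (7 and -5 semitones) become indistinguishable, so a matcher that uses it trades precision for tolerance of octave displacement.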


4. Music vs. Text and Other Media

                 ---- Explicit Structure ----
                 least                   medium                 most                  Salience increasers
Music            audio                   events                 notation              loud; thin texture
Text             audio (speech)          ordinary written text  text with markup      “headlining”: large, bold, etc.
Images           photo, bitmap           PostScript             drawing-program file  bright color
Video w/o sound  videotape               MPEG?                  Premiere file         motion, etc.
Biological data  DNA sequences,          MEDLINE abstracts      ??
                 3D protein structures

Features of Music: Text Analogies

• Simultaneous independent voices and texture

  • Analogy in text: characters in a play

• Chords within a voice

  • Analogy in text: character in a play writing something visible to the audience while saying something different out loud

• Rhythm

  • Analogy in text: rhythm in poetry

• Notes and intervals

  • Note pitches rarely important

  • Intervals more significant, but still very low-level

  • Analogy in text: interval = (very roughly!) letter, not word

Features of Text: Music Analogies

• Words

  • Analogy in music: for practical purposes, none

• Sentences

  • Analogy in music: phrases (but much less explicit)

• Paragraphs

  • Analogy in music: sections of a movement (but less explicit)

• Chapters

  • Analogy in music: movements

5. OMRAS and its Research

• OMRAS: Online Music Recognition and Searching

  – Details at www.omras.org

• Support from Digital Libraries Initiative, Phase 2

  – First major grant for music IR; from 1999 to 2002

• Joint project of IU, UMass, and King’s College London

• Goal: search realistic databases in all three representations

• Research Tools

  – True polyphonic search, i.e., search polyphonic music for polyphonic pattern

  – Full GUI for complex music notation

  – Modular architecture: to let users mix and match

OMRAS Audio-degraded Music IR Experiment

• Before (original audio recording)

• After (audio -> MIDI -> audio)

• Started with recording of 24 preludes and fugues by Bach

• Colleagues in London did polyphonic music recognition

  • Audio -> events “an open research problem”

  • Results vary from excellent to just recognizable

• One of worst-sounding cases is Prelude in G Major from the Well-Tempered Clavier, Book I

OMRAS Audio-degraded Music IR Experiment

• Started with recording of 24 preludes and fugues by Bach

• Colleagues in London did polyphonic music recognition

  – “Convert to right [more than one note]: hard or very hard”

  – Results are recognizable, but… Listen (worst-sounding case)!

• Jeremy Pickens (UMass) converted results to MIDI file and used as queries against database of c. 3000 pieces in MIDI form

  – Method: “harmonic distributions”

• Outcome for “worst” case: the actual piece was ranked 1st!

• Average outcome: actual piece ranked c. 2nd
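The outcome measure used here, the rank of the actual piece in the result list, can be sketched as follows. The example does not implement the “harmonic distributions” method; the similarity scores, piece labels, and function name are all invented for illustration, and only the rank arithmetic is the point.

```python
from typing import Dict


def rank_of(target: str, scores: Dict[str, float]) -> int:
    """1-based rank of `target` when pieces are sorted by descending similarity."""
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking.index(target) + 1


# Invented similarity scores for two queries; each key is the piece the query came from.
query_results = {
    "Prelude in G": {"Prelude in G": 0.91, "Fugue in C": 0.40, "Prelude in D": 0.62},
    "Fugue in C": {"Prelude in G": 0.55, "Fugue in C": 0.70, "Prelude in D": 0.81},
}

ranks = [rank_of(query, scores) for query, scores in query_results.items()]
print(ranks)                    # [1, 2]
print(sum(ranks) / len(ranks))  # 1.5
```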


OMRAS Research: Music Notation

• CMN often best form for musicians (even amateurs)

  – CMN sometimes essential for music researchers

• Searching CMN is obviously important...

• But almost no work on it so far! Why?

  – Specialized audience

  – Complexity => huge investment in programming

  – Lack of test collections

• Prospects for solving problems are good

NightingaleSearch

• Nightingale® is high-end commercial music editor for Macintosh

  – www.ngale.com

• NightingaleSearch inherits all normal functionality of Nightingale

• Searching commands use “Search Pattern” score as query

• Find next (“editor”) or find in database (“IR”) searching

  – Find in database is exact- or best-match

• Options: match pitch, match duration, etc.

• Does passage-level retrieval
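The difference between exact-match and best-match passage retrieval can be sketched as below. This is not NightingaleSearch's algorithm or API: the function, its parameters, and the note encoding are hypothetical, and the threshold is interpreted here as the maximum number of mismatched notes, which is an assumption. The `match_pitch` and `match_duration` flags stand in for the matching options mentioned above.

```python
from typing import List, Tuple

NoteT = Tuple[int, float]  # (MIDI pitch, duration in quarter notes); assumed encoding


def find_passages(score: List[NoteT], pattern: List[NoteT], threshold: int = 0,
                  match_pitch: bool = True, match_duration: bool = True) -> List[int]:
    """Return start indexes of passages that differ from the pattern in at most
    `threshold` notes. threshold=0 is exact matching."""
    hits = []
    for start in range(len(score) - len(pattern) + 1):
        window = score[start:start + len(pattern)]
        mismatches = 0
        for (s_pitch, s_dur), (p_pitch, p_dur) in zip(window, pattern):
            pitch_differs = match_pitch and s_pitch != p_pitch
            duration_differs = match_duration and s_dur != p_dur
            if pitch_differs or duration_differs:
                mismatches += 1
        if mismatches <= threshold:
            hits.append(start)
    return hits


score = [(60, 1), (62, 1), (64, 1), (65, 1), (60, 1), (62, 0.5), (64, 1), (67, 1)]
pattern = [(60, 1), (62, 1), (64, 1)]

print(find_passages(score, pattern))                        # [0]
print(find_passages(score, pattern, threshold=1))           # [0, 4]
print(find_passages(score, pattern, match_duration=False))  # [0, 4]
```

The sketch compares absolute pitches in a single voice, so it would miss transposed or polyphonic occurrences; handling those is exactly what makes real music search harder.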


*Bach: “St. Anne” Fugue, with Search Pattern


NightingaleSearch in Action

• With BachStAnne, exact-match OK, but...

• Best-match (threshold 2) gives much better recall (of passages) with no loss of precision

• A harder example: user looking in a digital music library for “Twinkle, Twinkle, Little Star” (demo with a tiny personal library)
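Recall and precision, the two measures used to compare exact-match and best-match searching, can be computed as in the following sketch. The passage identifiers and both result sets are invented purely for illustration; they are not data from the demo.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = fraction of retrieved items that are relevant;
    recall = fraction of relevant items that were retrieved."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical passages (labelled by measure number) that truly contain the pattern
relevant = {"m1", "m3", "m8", "m12"}
exact_hits = {"m1", "m8"}               # made-up exact-match result
best_hits = {"m1", "m3", "m8", "m12"}   # made-up best-match result

print(precision_recall(exact_hits, relevant))  # (1.0, 0.5)
print(precision_recall(best_hits, relevant))   # (1.0, 1.0)
```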

*Mozart: Variations for piano, K. 265, on “Ah, vous dirais-je, Maman”

[Music example: Theme and Variation 2]

*Suzuki: “Twinkle” Variations

[Music example: Variations A, B, C, and D, followed by the Theme]

Typke’s MIR System Survey

• Rainer Typke’s “MIR Systems: A Survey of Music Information Retrieval Systems” lists many systems

  – http://mirsystems.info/

• Commercial system: Shazam

• Some research systems can be used over the Web, incl.:

  – C-Brahms

  – Meldex/Greenstone

  – Mu-seek

  – MusicSurfer

  – Musipedia/Tuneserver/Melodyhound

  – QBH at NYU

  – Themefinder

Machinery to Evaluate Music-IR Research

• Problem: how do we know if one system is really better than another, or an earlier version?

• Solution: standardized tasks, databases, evaluation

  – In use for speech recognition, text IR, question answering, etc.

  • Important example: TREC (Text Retrieval Conference)

• For music IR, we now have...

  • IMIRSEL (International Music Information Retrieval Systems Evaluation Laboratory) project

    – http://www.music-ir.org/evaluation/

  • MIREX (Music IR Evaluation eXchange) modeled on TREC

    – 2005: audio only

    – 2006: audio and symbolic

Collections (a.k.a. Databases) (1 of 2)

• Collections are improving, but very slowly

• For research: poor to fair

  – “Candidate Music IR Test Collections”

    • http://mypage.iu.edu/~donbyrd/MusicTestCollections.HTML

  – Representation “CMN” vs. CMN

• For practical use: pathetic (symbolic) to good (pop audio)

  – Most are commercial, especially audio

  – Very little free/public domain

  – …especially audio! (cf. RWC)

• IPR issues are a total mess

Collections (a.k.a. Databases) (2 of 2)

• Why is so little available?

  – Symbolic form: no efficient way to enter

  – Solution: OMR? AMR? research challenges

  – Music is an art!

  – Cf. “Searching CMN” slides: chicken & egg problem

  – IPR issues are a total mess

6. Summary (1 of 2)

• Basic representations of music: audio, events, notation

  – Fundamental difference: amount of explicit structure

• Have very different characteristics => each is by far best for some users and/or applications

• Converting to reduce structure much easier than to add

• Music in all forms very hard to handle mostly because of:

  – Units of meaning problem

  – Polyphony

• Both problems are much less serious with text

6. Summary (2 of 2)

• Projects include

  – Audio-based: via recognition of polyphonic music (OMRAS, query-by-humming, etc.)

  – CMN-based: monophonic query vs. polyphonic database (emphasis on UI) (OMRAS)

  – Style-genre identification from audio

  – Creative applications: music IR for improvisation, etc.

• Machinery to evaluate research is coming along (MIREX)

• Collections

  – For research: poor to fair

  – For practical use: pathetic (symbolic) to good (pop audio)

  – Improving, but…

  – Serious problems with IPR as well as technology