Music: Broken Symmetry, Geometry, and Complexity


Gary W. Don, Karyn K. Muir, Gordon B. Volk, and James S. Walker

The relation between mathematics and music has a long and rich history, including: Pythagorean harmonic theory, fundamentals and overtones, frequency and pitch, and mathematical group theory in musical scores [7, 47, 56, 15]. This article is part of a special issue on the theme of mathematics, creativity, and the arts. We shall explore some of the ways that mathematics can aid in creativity and in understanding artistic expression in the realm of the musical arts. In particular, we hope to provide some intriguing new insights on such questions as:

• Does Louis Armstrong's voice sound like his trumpet?

• What do Ludwig van Beethoven, Benny Goodman, and Jimi Hendrix have in common?

• How does the brain fool us sometimes when listening to music? And how have composers used such illusions?

• How can mathematics help us create new music?

• Melody contains both pitch and rhythm. Is it possible to objectively describe their connection?

• Is it possible to objectively describe the complexity of musical rhythm?

In discussing these and other questions, we shall outline the mathematical methods we use and provide some illustrative examples from a wide variety of music.

Gary W. Don is professor of music at the University of Wisconsin-Eau Claire. His email address is dongw@uwec.edu.

Karyn K. Muir is a mathematics major at the State University of New York at Geneseo. Her email address is kkm5@geneseo.edu.

Gordon B. Volk is a mathematics major at the University of Wisconsin-Eau Claire. His email address is volkgb@uwec.edu.

James S. Walker is professor of mathematics at the University of Wisconsin-Eau Claire. His email address is walkerjs@uwec.edu.

The paper is organized as follows. We first summarize the mathematical method of Gabor transforms (also known as short-time Fourier transforms, or spectrograms). This summary emphasizes the use of a discrete Gabor frame to perform the analysis. The section that follows illustrates the value of spectrograms in providing objective descriptions of musical performance and the geometric time-frequency structure of recorded musical sound. Our examples cover a wide range of musical genres and interpretation styles, including: Pavarotti singing an aria by Puccini [17], the 1982 Atlanta Symphony Orchestra recording of Copland's Appalachian Spring symphony [5], the 1950 Louis Armstrong recording of La Vie en Rose [64], the 1970 rock music introduction to Layla by Duane Allman and Eric Clapton [63], the 1968 Beatles song Blackbird [11], and the Renaissance motet, Non vos relinquam orphanos, by William Byrd [8]. We then discuss signal synthesis using dual Gabor frames, and illustrate how this synthesis can be used for processing recorded sound and creating new music. Then we turn to the method of continuous wavelet transforms and show how they can be used together with spectrograms for two applications: (1) zooming in on spectrograms to provide more detailed views and (2) producing objective time-frequency

    30 Notices of the AMS Volume 57, Number 1

portraits of melody and rhythm. The musical illustrations for these two applications are from a 1983 Aldo Ciccolini performance of Erik Satie's Gymnopédie I [81] and a 1961 Dave Brubeck jazz recording, Unsquare Dance [94]. We conclude the paper with a quantitative, objective description of the complexity of rhythmic styles, combining ideas from music and information theory.

Discrete Gabor Transforms: Signal Analysis

We briefly review the widely employed method of Gabor transforms [53], also known as short-time Fourier transforms, or spectrograms, or sonograms. The first comprehensive effort in employing spectrograms in musical analysis was Robert Cogan's masterpiece, New Images of Musical Sound [27], a book that still deserves close study. A more recent contribution is [62]. In [37, 38], Dörfler describes the fundamental mathematical aspects of using Gabor transforms for musical analysis. Other sources for theory and applications of short-time Fourier transforms include [3, 76, 19, 83, 65]. There is also considerable mathematical background in [50, 51, 55], with musical applications in [40]. Using sonograms or spectrograms for analyzing the music of birdsong is described in [61, 80, 67]. The theory of Gabor transforms is discussed in complete detail in [50, 51, 55] from the standpoint of function space theory. Our focus here, however, will be on its discrete aspects, as we are going to be processing digital recordings.

The sound signals that we analyze are all digital, hence discrete, so we assume that a sound signal has the form {f(t_k)}, for uniformly spaced values t_k = kΔt in a finite interval [0, T]. A Gabor transform of f, with window function w, is defined as follows. First, multiply {f(t_k)} by a sequence of shifted window functions {w(t_k − τ_ℓ)}_{ℓ=0}^{M}, producing time-localized subsignals, {f(t_k)w(t_k − τ_ℓ)}_{ℓ=0}^{M}. Uniformly spaced time values, {τ_ℓ = t_{jℓ}}_{ℓ=0}^{M}, are used for the shifts (j being a positive integer greater than 1). The windows {w(t_k − τ_ℓ)}_{ℓ=0}^{M} are all compactly supported and overlap each other; see Figure 1. The value of M is determined by the minimum number of windows needed to cover [0, T], as illustrated in Figure 1(b). Second, because w is compactly supported, we treat each subsignal {f(t_k)w(t_k − τ_ℓ)} as a finite sequence and apply an FFT F to it. This yields the Gabor transform of {f(t_k)}:

(1)  {F{f(t_k)w(t_k − τ_ℓ)}}_{ℓ=0}^{M}.

We shall describe (1) more explicitly in a moment (see Remark 1 below). For now, note that because the values t_k belong to the finite interval [0, T], we always extend our signal values beyond the interval's endpoints by appending zeros; hence the full supports of all windows are included.

Figure 1. (a) Signal. (b) Succession of window functions. (c) Signal multiplied by middle window in (b); an FFT can now be applied to this windowed signal.

The Gabor transform that we employ uses a Blackman window defined by

w(t) = 0.42 + 0.5 cos(2πt/λ) + 0.08 cos(4πt/λ)   for |t| ≤ λ/2,
w(t) = 0                                          for |t| > λ/2,

for a positive parameter λ equaling the width of the window where the FFT is performed. In Figure 1(b) we show a succession of these Blackman windows. Further background on why we use Blackman windows can be found in [20].
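As a concrete illustration, here is a short Python sketch (our own, not from the paper) of this Blackman window and of a half-overlapped succession of windows as in Figure 1(b); the interval length, window width λ, and overlap are illustrative choices.

```python
import numpy as np

def blackman(t, lam):
    """Blackman window of width lam: 0.42 + 0.5*cos(2*pi*t/lam)
    + 0.08*cos(4*pi*t/lam) for |t| <= lam/2, and 0 otherwise."""
    t = np.asarray(t, dtype=float)
    w = 0.42 + 0.5 * np.cos(2 * np.pi * t / lam) + 0.08 * np.cos(4 * np.pi * t / lam)
    return np.where(np.abs(t) <= lam / 2, w, 0.0)

# A succession of half-overlapped windows covering [0, T], as in Figure 1(b):
T, lam = 1.0, 0.25
centers = np.arange(0.0, T + lam / 4, lam / 2)     # shifts spaced lam/2 apart
tk = np.linspace(0.0, T, 1001)
cover = sum(blackman(tk - c, lam) ** 2 for c in centers)
# cover stays strictly positive on [0, T]: the overlapping windows
# leave no part of the signal uncovered.
```

Note that w(0) = 1 and w(±λ/2) = 0, so each window tapers smoothly to zero at its edges before the FFT is applied.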

The efficacy of these Gabor transforms is shown by how well they produce time-frequency portraits that accord with our auditory perception, which is described in the vast literature on Gabor transforms that we briefly summarized above. In this paper we shall provide many additional examples illustrating their efficacy.

Remark 1. To see how spectrograms display the frequency content of recorded sound, it helps to write the FFT F in (1) in a more explicit form. The FFT that we use is given, for even N, by the following mapping of a sequence of real numbers {a_m}_{m=−N/2}^{N/2−1}:

(2)  F : {a_m} → {A_ν = (1/√N) Σ_{m=−N/2}^{N/2−1} a_m e^{−i2πνm/N}},

where ν is any integer. In applying F in (1), we make use of the fact that each Blackman window w(t_k − τ_ℓ) is centered on τ_ℓ = jℓΔt and is 0 for t_k outside of its support, which runs from t_k = (jℓ − N/2)Δt to t_k = (jℓ + N/2)Δt, and is 0 at t_k = (jℓ ± N/2)Δt. So, for a given windowing specified by ℓ, the FFT F in (2) is applied to the vector (a_m)_{m=−N/2}^{N/2−1} defined by

(f(t_k)w([k − jℓ]Δt))_{k=jℓ−N/2}^{jℓ+N/2−1}.

In (2), the variable ν corresponds to frequencies for the discrete complex exponentials e^{−i2πνm/N} used in defining the FFT F. For real-valued data, such as recorded sound, the FFT values A_ν satisfy the symmetry condition A_{−ν} = (A_ν)*, where (A_ν)* is the complex conjugate of A_ν. Hence, no significant information is gained with negative frequencies. Moreover, when the Gabor transform is displayed, the values of the Gabor transform are plotted as squared magnitudes (we refer to such plots as spectrograms). There is perfect symmetry at ν and −ν, so the negative frequency values are not displayed in the spectrograms.

January 2010 Notices of the AMS 31
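The whole procedure (window, FFT, squared magnitudes for the nonnegative frequencies only) can be sketched in a few lines of Python. This is a minimal illustration of our own; the sample rate, window length N, and shift j below are arbitrary choices.

```python
import numpy as np

def gabor_spectrogram(f, N, j):
    """Squared-magnitude Gabor transform of a real signal f: slide a
    length-N Blackman window by j samples at a time, FFT each windowed
    segment, and keep |A_nu|^2 for nu = 0, ..., N/2 only (the negative
    frequencies are redundant for real data)."""
    n = np.arange(N)
    w = 0.42 - 0.5 * np.cos(2 * np.pi * n / (N - 1)) \
             + 0.08 * np.cos(4 * np.pi * n / (N - 1))       # sampled Blackman window
    f = np.concatenate([np.zeros(N // 2), f, np.zeros(N)])  # zero-pad the ends
    cols = [np.abs(np.fft.rfft(f[h:h + N] * w)) ** 2
            for h in range(0, len(f) - N, j)]
    return np.array(cols).T          # rows = frequency bins, columns = time

# a pure 440 Hz tone at 8000 samples/second shows up as one bright row
fs = 8000
t = np.arange(fs) / fs
S = gabor_spectrogram(np.sin(2 * np.pi * 440 * t), N=512, j=128)
peak_hz = S[:, S.shape[1] // 2].argmax() * fs / 512   # near 440 Hz, up to bin width
```

In practice one displays log(1 + S), or a decibel scale, since the squared magnitudes span a huge dynamic range.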

Gabor Frames

When we discuss audio synthesis, it will be important to make use of the expression of Gabor transforms in terms of Gabor frames. An important seminal paper in this field is [92]. A comprehensive introduction can be found in [48]. We introduce Gabor frames here because they follow naturally from combining (1) and (2). If you prefer to see musical examples, then please skip ahead to the next section and return here when we discuss signal synthesis.

Making use of the description of the vector (a_m) given at the end of Remark 1, we can express (1) as

(3)  (1/√N) Σ_{k=k0}^{k1} f(kΔt) w([k − jℓ]Δt) e^{−i2πν[k−jℓ]/N}.

In (3), the values k0 and k1 are the lower and upper limits of k that are needed to extend the signal values beyond the endpoints of [0, T]. We have also made use of the fact that w([k − jℓ]Δt) = 0 for k ≤ jℓ − N/2 and for k ≥ jℓ + N/2.

We now define the functions G_{ℓ,ν}(k), for ν = −N/2, …, N/2 − 1 and ℓ = 0, …, M, as

(4)  G_{ℓ,ν}(k) = (1/√N) w([k − jℓ]Δt) e^{−i2πν[k−jℓ]/N}

and write C_{ℓ,ν} for the Gabor transform values that are computed by performing the computation in (3):

(5)  C_{ℓ,ν} = Σ_{k=k0}^{k1} f(t_k) G_{ℓ,ν}(k).

We have suppressed the dependence on j since it is a fixed value, specifying the amount of shifting for each successive window (via τ_ℓ = t_{jℓ}). It is set in advance and does not change during the analysis procedure using {G_{ℓ,ν}}.

The significance of Equation (5) is that each Gabor transform value C_{ℓ,ν} is expressed as an inner product with a vector (G_{ℓ,ν}(k)). Hence, the entire arsenal of linear algebra can be brought to bear on the problems of analyzing and synthesizing with Gabor transforms. For instance, the vectors {G_{ℓ,ν}} are a discrete frame. The theory of frames is a well-established part of function space theory, beginning with the work of Duffin and Schaeffer on nonharmonic Fourier series [45, 99] through applications in wavelet theory [33, 18, 58] as well as Gabor analysis [50, 51, 55, 24].

To relate our discrete Gabor frame to this body of standard frame theory, and to provide an elementary condition for a discrete Gabor frame, we require that the windows satisfy

(6)  A ≤ Σ_{ℓ=0}^{M} w²(t_k − τ_ℓ) ≤ B

for two positive constants A and B (the frame constants). The constants A and B ensure numerical stability, including preventing overflow during analysis and synthesis. The inequalities in (6) obviously hold for our Blackman windows when they are overlapping as shown in Figure 1(b). Using (4) through (6), along with the Cauchy-Schwarz inequality, we obtain (for K := k1 − k0 + 1):

(7)  Σ_{ℓ=0,ν=−N/2}^{M,N/2−1} |C_{ℓ,ν}|² ≤ (KB) ‖f‖²,

where ‖f‖ is the standard Euclidean norm: ‖f‖² = Σ_{k=k0}^{k1} |f(t_k)|². When we consider Gabor transform synthesis later in the paper, we shall also find that

(8)  (A/K) ‖f‖² ≤ Σ_{ℓ=0,ν=−N/2}^{M,N/2−1} |C_{ℓ,ν}|².

So we have (using B1 := A/K and B2 := KB):

(9)  B1 ‖f‖² ≤ Σ_{ℓ=0,ν=−N/2}^{M,N/2−1} |C_{ℓ,ν}|² ≤ B2 ‖f‖²,

a discrete version of standard analysis operator bounds in function space theory. By analysis operator, we mean the Gabor transform G defined by

G : {f(t_k)} → {C_{ℓ,ν}}.
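These bounds are easy to check numerically. The Python sketch below (our own, with illustrative sizes, and a unitary FFT so that the 1/√N factor makes F norm-preserving) builds the squared-window sum in (6) and verifies that the coefficient energy Σ|C_{ℓ,ν}|² lands between A‖f‖² and B‖f‖², which in turn implies the looser bounds in (9) since B1 ≤ A and B2 ≥ B.

```python
import numpy as np

rng = np.random.default_rng(0)
N, j, M = 64, 32, 8              # window length, shift, index of last window
K = j * M + N                    # number of (zero-extended) signal samples

n = np.arange(N)
w = 0.42 - 0.5 * np.cos(2 * np.pi * n / (N - 1)) \
         + 0.08 * np.cos(4 * np.pi * n / (N - 1))    # sampled Blackman window

# frame condition (6): sum of squared shifted windows over the signal support
cover = np.zeros(K)
for l in range(M + 1):
    cover[l * j : l * j + N] += w ** 2

# random test signal supported away from the zero-padded edges
f = np.zeros(K)
f[N // 2 : N // 2 + j * M] = rng.standard_normal(j * M)
A = cover[N // 2 : N // 2 + j * M].min()     # frame constants over the support
B = cover.max()

# Gabor coefficients C_{l,nu} as in (4)-(5), unitary FFT convention
C = np.array([np.fft.fft(f[l * j : l * j + N] * w) / np.sqrt(N)
              for l in range(M + 1)])
energy = (np.abs(C) ** 2).sum()
norm2 = (np.abs(f) ** 2).sum()
# Parseval for the unitary FFT gives energy = sum_k |f(t_k)|^2 * cover(k), hence:
assert A * norm2 - 1e-8 <= energy <= B * norm2 + 1e-8
```

The design point is exactly the one the frame condition encodes: as long as the overlapping squared windows never dip to zero, no part of the signal's energy is lost by the analysis operator.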

Remark 2. (1) It is not necessary to use a single fixed window w to perform Gabor analysis. For example, one could use windows w_ℓ(t_k − τ_ℓ) for each ℓ, provided A ≤ Σ_{ℓ=0}^{M} w_ℓ²(t_k − τ_ℓ) ≤ B is satisfied. Doing so allows for more flexibility in handling rapid events like drum strikes [100], [37, Chap. 3], [39]. (2) The time values {t_k} do not need to be evenly spaced (this is called nonuniform sampling) [50, 51]. An application using nonuniform sampling is described in Example 12.

Musical Examples of Gabor Analysis

We now discuss a number of examples of using spectrograms to analyze recorded music. Our goal is to show how spectrograms can be used to provide another dimension, a quantitative dimension, to understanding the artistry of some of the great performances in music. For each of the displayed spectrograms, you can view a video of the spectrogram being traced out as the music plays by going to this webpage:

(10)  http://www.uwec.edu/walkerjs/MBSGC/

Viewing these videos is a real aid to understanding how the spectrograms capture important features of the music. The website also provides an online bibliography and links to the software



we used. Another site with advanced software for time-frequency analysis is [69].

As we describe these examples, we shall briefly indicate how they relate to the art of music and its creation. While spectrograms are extremely useful for quantitative analysis of performance (and performance itself is an act of artistic creation), we shall also point out some ways in which spectrograms can be of use in aiding the creative process of musical composition. When we finish discussing these examples, we will summarize all of these observations on musical creativity.

Example 1 (Pavarotti's vocals). Our first example is a spectrogram analysis of a 1990 recording of the conclusion of Puccini's aria, Nessun Dorma, featuring a solo by Luciano Pavarotti [17]. See Figure 2. This spectrogram shows the high-amplitude vibrato (oscillations of pitch) that Pavarotti achieves. Using the spectrogram, we can measure quantitatively the amplitudes in the vibrato. This illustrates an interesting creative application of spectrograms. Because they can be displayed in real time as the singer is performing, they can be used to help performers analyze and improve their vibrato, both in amplitude and steadiness. One affordable package for real-time spectrogram display can be found at [1]. Notice also that Pavarotti is able to alter the formant structure (the selective amplification of different frequency bands) of the same word in the lyrics, which produces a clearly audible change in brightness (timbre) of the sound. Using spectrograms to study formants is usually the domain of linguistics [75]. But here we see that formants, through a real-time adaptation of selective frequency amplification (selective resonance), play a role in the magnificent instrumentation that is possible with the human voice.¹

Example 2 (Contrasting choral and solo voicings). Our second example is another passage from the same recording of Nessun Dorma [17], containing a choral passage and the introduction of Pavarotti's concluding solo. See Figure 3. We can see in the spectrogram that there is a notable contrast between the sinuous, blurred vibrato of the chorus in the beginning of the passage versus the clearer vibrato of Pavarotti's solo voice, which is centered on elongated, single pitches. The blurring of the vibrato of the chorus is due to reverberation (persistent echoing). We can describe this with the following model:

(11)  f(t_k) = Σ_{j=0}^{J} g([k − jm]Δt) h(j),

¹There is some speculation that the human vocal apparatus actually evolved first in order to sing, rather than speak [73].

Figure 2. Spectrogram from a recording of Nessun Dorma with Luciano Pavarotti. Pavarotti's large-amplitude vibrato is clearly visible and measurable. The differing formants (selective amplification of different frequency bands) for different vocalizations of the lyrics (printed below the spectrogram) are also clearly displayed. Pavarotti changes the formants for the first two vocalizations of vincerò, and this is clearly audible as a change of brightness (timbre) of the sound. In addition, Pavarotti holds the final note, sung as an extended ò, with large, constant amplitude vibrato for more than 5 seconds, an amazing feat. (A video for this figure is at (10), a larger graphic is at [101].)

where m is a positive integer, g is the sound that is reverberating, and h is a damping function. The superposition of the slightly damped, time-shifted versions of g creates the blurring effect, due to the closeness together in time of almost identical, shifted versions of g. Equation (11) is a discrete convolution. Audio engineers frequently employ a model like this, called convolution reverb, to simulate the effects of reverberation [44], [7, Sec. 16.7.2]. The function h, called the impulse response, is created once through digitally recording the reverberation of a very sharp sound (an impulse). Assuming linearity and shift-invariance properties of reverberation, the reverberation of other sounds is simulated digitally via Equation (11). We shall give an example of using convolution reverb for producing a new musical composition, Sierpinski Round, in Example 13.
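A toy version of convolution reverb is easy to write down. In this Python sketch (our own illustration), the impulse response is synthetic, just a few exponentially damped echoes rather than a recorded one, and m = 1 in Equation (11):

```python
import numpy as np

def convolution_reverb(g, h):
    """Discrete convolution, Equation (11) with m = 1:
    f[k] = sum_j g[k - j] h[j], superposing damped, time-shifted
    copies of the dry sound g weighted by the impulse response h."""
    J, K = len(h), len(g)
    f = np.zeros(K + J - 1)
    for j in range(J):
        f[j : j + K] += h[j] * g       # shifted copy of g, damped by h[j]
    return f

fs = 8000                              # samples per second (illustrative)
# synthetic impulse response: four echoes, 125 ms apart, each 0.6x quieter
h = np.zeros(fs // 2)
h[::1000] = 0.6 ** np.arange(4)

t = np.arange(fs // 4) / fs
dry = np.sin(2 * np.pi * 440 * t)      # a short dry 440 Hz tone
wet = convolution_reverb(dry, h)       # the "reverberated" result
```

For a realistic room sound one would replace h by a recorded impulse response, and for long responses the convolution is done with FFTs rather than the direct loop above.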

In contrast to the reverberation in the chorus, Pavarotti's vocals, which emerge at high volume from the midst of the chorus (an emotionally moving experience), are much more clearly defined in their vibrato, and his vibrato oscillates around constant pitches. The lack of reverberation in Pavarotti's vocals is probably due to a difference in the way his voice was recorded. The recording was done live at the ruins of the Baths of Caracalla in Rome. Pavarotti sang into a single microphone, while the multiple voices of the chorus were recorded by another microphone (or a small set of microphones relative to the large number of


chorus members). Consequently, the recording of the chorus picks up the reverberation off of the walls of the ruins, while Pavarotti's single microphone records his initial voicing without much reverberation. The resounding emergence of Pavarotti's voice from the midst of the chorus is no doubt an artistic choice of Puccini to create a dramatic contrast between the individual and the collective. The blurring of the reverberation of the choral voices versus the clarity of the solo voice serves to further enhance the contrast. This enhanced contrast may also be a creative decision, and with methods such as convolution reverb it can be deliberately produced in creating digital recordings.

Example 3 (Does Louis Armstrong's voice sound like his trumpet?). The classical pianist Edna Stern has answered this question very nicely, with her succinct description of the creative artistry of Louis Armstrong:

The way he plays trumpet is very similar to his singing, it comes through in the way he is vibrating or marking the expressive moments in the music. Also the timing and the way he is building his phrases are both done in the same declamatory way as his singing. When he is singing, his vibratos are very similar to the trumpet vibratos and I think that there is a clarity in the way he punctuates his syllables, using the exploding ones in a sort of trumpet way. [89]

In Figure 4 we show two spectrograms of clips taken from Armstrong's 1950 recording of La Vie en Rose [64] that illustrate perfectly, in a quantitative way, what Stern is talking about. They contain a trumpet solo and vocals of Armstrong. All the vocals exhibit vibrato. The fundamentals for the trumpet notes are in a higher frequency range (around 500 Hz) than the vocals (around 200 Hz). The most prominent trumpet notes exhibiting vibrato are between 0.2 and 1.4 seconds, between 10.5 and 12.2 seconds, and between 14.0 and 15.3 seconds. It is interesting how Armstrong increases the amplitude of the vibrato as these notes progress. This is most evident for the three notes just cited. There is an interesting contrast between these trumpet notes compared with the constant frequency notes of other instruments in the passage (the horizontal bars in the spectrogram that correspond to bass notes at the lowest frequencies, as well as guitar and piano notes at higher frequencies). During the recording, Armstrong also plays notes with constant amplitude vibrato. For example, at the end of the recording, he sustains a note with constant amplitude

Figure 3. Spectrogram from a recording of Nessun Dorma with Luciano Pavarotti. In the first 7.5 seconds, there is a chorus of voices singing. Their lyrics produce several sinuous blurry curves, corresponding to fundamentals and overtones of male and female voices. This chorus is singing with vibrato, but the vibrato is obscured by the blurring (a convolution of reverberating sound). In contrast, Pavarotti's solo emerges from the chorus at about 7.0 seconds. His voicings are with much clearer vibrato and are centered on single, more elongated pitches. (A video for this figure is at (10), a larger graphic is at [101].)

vibrato for about six seconds. The corresponding spectrogram image is similar to the final note, ò, held by Pavarotti in Figure 2, so we do not display it.

The exploding syllables that Stern refers to are evident in the very brief, sharply sloped structures that initiate many of the vocals. A particularly prominent one occurs in the vocals spectrogram in Figure 4 at about 14.0 seconds. A similar explosive onset of a trumpet note occurs at about 8.8 seconds in the spectrogram containing the trumpet solo.

Louis Armstrong is well known for both his trumpet playing and his unique vocal style. Here we have shown, in a quantitative way, that it is pointless to try to separate these two aspects of his performance.

Example 4 (Dissonance in rock music). Our next example is a recording of one of the most famous passages in rock music, the introduction to the 1970 recording of Layla by Derek and the Dominoes [63]. See Figure 5. The passage begins with the two guitars of Duane Allman and Eric Clapton playing in perfect synchronicity. At around 1.7 seconds, however, a buzzing distortion of the sound occurs. This distortion is due to a beating effect between closely spaced overtones. Although this dissonance may offend some ears, it is most certainly a conscious effect done by the musicians as a prelude to the intense pleading, with a very rough textured voice, of the singer later in the song. It is interesting to contrast this intense dissonance with the more subtle one invoked in Gymnopédie I (see Figure 15). When we discuss


Figure 4. Top: Spectrogram of a trumpet solo by Louis Armstrong. Bottom: Spectrogram of Louis Armstrong vocals. (A video for this figure is at (10), a larger graphic is at [101].)

Figure 5. Spectrogram from a recording of Layla with Eric Clapton and Duane Allman on guitars. An important feature of the spectrogram is the clear indication of beating between different overtones. For example, for time between 1.7 and 2.2 seconds and frequencies around 450 Hz and 750 Hz, and also for time between 2.2 and 2.4 seconds and frequencies around 300 Hz, 550 Hz, and 700 Hz. The beating appears as sequences of dark points lying between horizontal overtone bands. It is clearly audible as a buzzing sound. (A video for this figure is at (10), a larger graphic is at [101].)

Gymnopédie I, we will show how this contrast can be characterized quantitatively and how that could be of use in musical composition.

Example 5 (The geometry of Appalachian Spring). The use of finite groups in analyzing the structure of notes and chords in musical scores is well established [56, 47, 15]. The work now extends even to the use of highly sophisticated methods from algebraic geometry [70] and orbifold topology [95]. Computer methods are being used as well [21, 12, 7].

One advantage that spectrograms have over analysis of scores, however, is that they provide a way for us to identify these group operations being performed over an extended period of time in a complex musical piece. In such cases, score analysis would be daunting, at least for those without extensive musical training. Spectrograms can help those without score-reading experience to see patterns, and they can help those who do have score-reading experience to connect these group operations to the patterns that they see in the score. Either way, it is valuable. Certainly a solid understanding of these group operations, and how master composers have used them, is an important aspect of learning to create new musical compositions.

Figure 6. Spectrogram of a passage from Appalachian Spring by Aaron Copland. In the time intervals marked by A through G, there is a complex sequence of transpositions and time dilations, played by various instruments, of the basic melody introduced in A. (A video for this figure is at (10), a larger graphic is at [101].)

As an example of analyzing a spectrogram without score references, we look at Section 7 from an Atlanta Symphony recording of Aaron Copland's symphonic suite, Appalachian Spring [5]. See Figure 6. Copland uses group-theoretic operations in the time-frequency plane (transposition and dilation) in order to develop the basic melody of the Shaker hymn, Simple Gifts.

In the spectrogram, within the time interval A and the frequency range of 300 to 600 Hz, there is the initial statement of the melody, played by woodwinds. There is also a transposition and dilation (time shortening) of part of the melody to a much higher frequency range (time range 0:15 to 0:17 and frequency range 900 Hz to 1200


Hz), a sort of grace note effect, but using an entire melodic motive. In the time interval B and the frequency range of 200 to 500 Hz, there is a transposition down to stringed instruments, with a time dilation (shrinkage) of the melody, plus some interpolated notes so that the melody still plays for about 30 seconds. In the time interval C, there is a fugue-like development of the melody. It begins with another transposition of A's melody, to a lower frequency range than in B. Then at about time 1:12 there is an overlay of a transposition of A's melody to a higher frequency range played by horns, and underlying it a very slow-moving time dilation (stretching) of a transposition down to a lower frequency range played by bass strings. In the time interval D there is a transition passage, played by woodwinds, of short motives pulled from the melody. The beginning of time interval E, from about time 1:37 to time 1:40, contains a rising horn crescendo that leads into a transposition of the melody from A to a higher frequency range (also played by horns). Within the time interval F, there is a transposition down again, played by woodwinds. Finally, the passage concludes with the time interval G, containing a combination of all the previous transpositions: played by strings, horns, and woodwinds at high volume, emphasized by a stately rhythmic pounding of a bass drum.² This passage from Appalachian Spring is a remarkable example of an extended development of a melodic theme.

Example 6 (What do Ludwig van Beethoven, Benny Goodman, and Jimi Hendrix have in common?). The short answer, of course, is that they all created great music. In Figure 7 we show three spectrograms of recordings of short passages from their music. It is interesting to find the similarities and differences between these spectrograms and the music they reflect. In the Beethoven passage, which is a portion of his Piano Sonata in E (Opus 109) performed by David Aez Garcia [66, Movement 1, Measures 15-17], we see a descending series of treble notes followed by an ascending series, reflecting an approximate mirror symmetry. The symmetry is broken, however, by an ascending set of bass notes (indicated by the arrow on the spectrogram). The mirror-symmetric pattern is also broken by the gentle rising treble notes trailing off to the right.

The spectrogram of the Goodman recording, a clip from a circa 1943 live broadcast of Sing, Sing, Sing [14],³ also shows a similar, approximate mirror symmetry from time 3.5 seconds to 8.5 seconds, which is also broken by a gently rising scale trailing off to the right. In this case, Goodman is playing a clarinet and bending the notes as is commonly done in jazz style. The bending of the notes is clearly displayed in the spectrogram by the connected curved structures (which possess a symmetry of their own). This is a significant contrast to the discretely separated, constant harmonic piano notes in the Beethoven passage.

²For those listeners familiar with the religious background of the Shaker hymn, the metaphor, external to the music, of the inheritance of the earth is unmistakable in this concluding portion.

The spectrogram of the Hendrix passage, a clip from his 1968 recording of All Along the Watchtower [46], exhibits a similar pattern to the other two: an approximately mirror-symmetrical descension and ascension of pitch, followed by a gently rising trailing off of melodic contour. Hendrix, however, illustrates a unique aspect of his music. Rather than using discrete notes, he instead uses his electric guitar to generate a continuous flow of chords. The chord progression is continuous rather than a set of discrete tones typically used in most Western music. It is interesting that the electronic sound produced here is surprisingly warm and soft, especially in the later trailing off portion. Perhaps this is due to Hendrix's synthesizing continuous chord transitions and vibrato (wah-wah) within the blues scale, a remarkable achievement.

A more extended spectrogram analysis of the Beethoven piano sonata can be found in [27, pp. 49-56]. For more on jazz vis-à-vis classical music, see [16, pp. 106-132]. For incisive comments on Hendrix in particular, see [2, pp. 197 and 203].

Example 7 (Birdsong as a creative spark for music). Birdsong has a long and distinguished history of providing a creative spark for musical composition. Donald Kroodsma, perhaps the world's leading authority on birdsong, has succinctly described this history:

As early as 1240, the cuckoo's cuckoo song appears in human music, in minor thirds. The songs of skylarks, song thrushes, and nightingales debut in the early 1400s. The French composer Olivier Messiaen is my hero, as he championed birdsongs in his pieces. I love Respighi's Pines of Rome, too, as the songs of a nightingale accompany the orchestra. I learned recently, too, why I have always especially enjoyed the Spring concerto from

³We hope our discussion of this piece will help to alleviate its undeserved obscurity. At the time of writing, [14] is still in print.


Figure 7. Top: Spectrogram from a recording of Beethoven's Piano Sonata in E (Opus 109). The arrow indicates an ascending bass scale, entering in contrast to the descending treble scale. Middle: Spectrogram from a Benny Goodman recording of Sing, Sing, Sing. Bottom: Spectrogram from a Jimi Hendrix recording of All Along the Watchtower. (A video for this figure is at (10), a larger graphic is at [101].)

Vivaldi's Four Seasons: It is the birds and their songs that celebrate the return of spring in this concerto, and now I clearly hear their birdsongs in the brief, virtuoso flourishes by the solo violinist. [61, p. 275]

Kroodsma's book, from which this quote is taken, is full of spectrograms and their use in analyzing the musical qualities of birdsong. There is even a most amusing description on p. 274 of his researching which composers Java sparrows prefer and carrying out musical education with them!

Unfortunately, we do not have space here to do justice to the music of Messiaen. However, there is an excellent discussion, utilizing both spectrograms and scores, of the creative inspiration of birdsong for his music in Rothenberg's book [80]. The website for that book [98] also contains some samples of Messiaen's music and related birdsongs. One technique that Messiaen employed was to slow down the tempo of birdsong to increase its comprehensibility to human ears. He did this based on his own meticulous transcriptions into musical scores of songs he heard in the field. With Gabor transforms, this process can be done digitally with the recorded songs themselves; we will discuss how in Example 12. There is certainly much more to explore in Messiaen's music using the tools of spectrograms. The tools of percussion scalograms for rhythm analysis (which we discuss later) can be employed as well, since Messiaen incorporated the rhythmic styles of birdsong in his music [80, p. 198].

Besides Messiaen, Stravinsky also incorporated elements of birdsong into some of his compositions. In his classic work of musical theory, Mâche devotes an entire chapter to studying, via spectrograms, the connections between the music of birds and elements of Stravinsky's music [67, Chap. 5]. Mâche uses spectrograms as a kind of generalized musical notation, which gets around the very challenging problem of transcribing birdsong into standard musical notation. Mâche also discusses other animal music-making in this chapter, which he entitles "Zoomusicology".

Figure 8. Spectrogram of Paul McCartney's duet with a songbird, from the Beatles' recording "Blackbird". (A video for this figure is at (10), a larger graphic is at [101].)

Besides the world of classical music, or "art music", birdsong has been used in more popular music as well. A striking example is from the Beatles, who bridged the worlds of popular song and artistic music. At the end of their 1968 recording of "Blackbird" [11], Paul McCartney literally sings a duet with a blackbird. In Figure 8 we show a spectrogram of this portion of the recording. The blackbird's chirps lie in the upper portion of the time-frequency plane with sharply sloped rapid changes of pitch, while McCartney's vocals lie more in the lower portion (although his overtones often extend into the region where the blackbird is singing). In some places, for instance between 10.0 and 10.5 seconds, McCartney's rapid articulation is quite similar to the blackbird's, as we can see in the similar sharply sloped structures (for McCartney, most prominently between 250 and 1000 Hz, and for the blackbird, most prominently between 1500 and 3000 Hz), whereas in other places there is a contrast between the longer time-scale rhythm of McCartney's vocals and the shorter time-scale rhythms of the blackbird's chirps.

Example 8 (Do our ears play tricks on us?). In fact, our ears can fool us sometimes. The most famous auditory illusion is due to Shepard [85]. Shepard created electronic tones that seem to rise endlessly in pitch, even though they in fact ascend through just one octave. The tones can also be arranged to create an illusion of endless descent. The website [52] has a nice demonstration of Shepard tones. To hear examples showing that an illusion is actually occurring, go to the URL in (10) and select the topic "Example 8: Audio Illusions".

Figure 9. Spectrogram of Shepard's illusion of an endlessly rising series of tones, which in fact stay within just one octave. The arrows indicate where the illusion occurs. (A video for this figure is at (10), a larger graphic is at [101].)

Shepard's illusion is easily explained using spectrograms. Figure 9 practically speaks for itself. The illusion is due to our brain tracking the pitch contour of the fundamentals for the tones and expecting the next fundamental to be exactly one octave higher than where it occurs. The overtones of the electronic tones are all designed in an octave series (the fundamental multiplied by two, four, eight, etc.) to hide the drop in pitch.
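The octave-spaced construction is simple enough to sketch in code. The following Python fragment generates Shepard-style tones: each tone consists only of octave-spaced partials, weighted by a fixed bell-shaped envelope over log-frequency so that partials fade in at the bottom and out at the top. All parameter values here (base pitch, envelope width, octave count, sample rate) are our own illustrative assumptions, not Shepard's original settings.

```python
import numpy as np

def shepard_tone(step, sr=8000, dur=0.5, base=55.0, n_octaves=8):
    """One Shepard tone: octave-spaced partials (base * 2**k, shifted by
    `step` semitones), under a fixed log-frequency amplitude envelope,
    so that the wrap-around after twelve steps is hard to hear."""
    t = np.arange(int(sr * dur)) / sr
    tone = np.zeros_like(t)
    ratio = 2 ** (step / 12.0)           # semitone position within the octave
    for k in range(n_octaves):
        f = base * ratio * 2 ** k
        if f >= sr / 2:                  # stay below the Nyquist frequency
            continue
        # bell-shaped envelope centered mid-spectrum on a log-frequency axis
        w = np.exp(-0.5 * ((np.log2(f) - np.log2(base) - n_octaves / 2) / 1.5) ** 2)
        tone += w * np.sin(2 * np.pi * f * t)
    return tone

# an "endlessly rising" loop: 12 semitone steps, after which the pattern repeats
scale = np.concatenate([shepard_tone(s) for s in range(12)])
```

Played in a loop, `scale` gives the rising illusion; reversing the step order gives the endless descent.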

Shepard's auditory illusion has been used subsequently by composers in electronic music, the most notable compositions being produced by Risset [7, Chap. 13]. The endlessly rising illusion can be used for expressing moments of infinite, transcendent joy, while the endlessly descending illusion can be used for invoking the opposite emotion.

The special arrangement of only octave spacings in the overtones does not lend itself well to traditional instruments. It is natural to ask, however, whether composers in the past have used illusions like the Shepard tones. Some examples by Bach have been proposed [60, p. 719]. Another

Figure 10. Top: Extract from the score of William Byrd's motet, Non vos relinquam orphanos. Bottom: Spectrogram of a recording of the passage, with its structures aligned with the vocals in the score above. An ascending series of overtones lies within the parallelogram. (A video for this figure is at (10), a larger graphic is at [101].)

example is pointed out by Hodges in his delightful article on geometry and music [47, Chap. 6]. The illusion occurs in performances of a Renaissance motet by William Byrd, Non vos relinquam orphanos ("I will not leave you comfortless"). To describe the illusion, we quote from Hodges's article:

Jesus is foretelling his ascension into heaven, "Vado": "I am going." The moment passes quickly. … The "Vado" motif seems to move steadily upward through the voices, pointing to Jesus' own movement upwards to heaven. In fact, the movement is not as steady as it sounds; at two of the repetitions there is no movement upwards. … the ear is deceived.

In Figure 10, we show the score and a spectrogram that illustrates the overtones of the voices at the four vocalizations of "Vado" in a 2003 Cambridge Singers recording of the motet [8]. There is clearly a similarity to the Shepard tone arrangement of overtones. As Hodges points out, the effect is subtle. Undoubtedly it is more noticeable when listening to the motet in a locale with long reverberation times contributing to the frequency tracking process in our brains (such as the Great Hall of University College School, London, where this recording was made). As a partial confirmation of this idea, we will amplify the indicated overtones in the


motet's spectrogram in Example 10. This is an example of audio synthesis with Gabor transforms, which we discuss in the next section.

Summary. The musical examples we have discussed were chosen to illustrate that spectrograms enhance our understanding of how great performing artists produce their music and to provide ideas for using Gabor analysis as an aid to creating new music. Our discussion of these examples produced at least these five ideas:

(1) Perfecting vibrato. Using spectrograms, either recorded or in real time, to analyze and perfect vibrato. This applies to vibrato in singing and also to vibrato in instrumental music. More generally, this applies to improvements in other musical techniques, such as holding a constant pitch.

(2) Convolution reverb. Spectrograms can be used to compare the results, in a quantitative way, of recordings made with different convolution reverb methods.

(3) Analyzing dissonance. The beating effect of dissonance can be quantified easily with either spectrograms or scalograms (as we describe in Example 14). This allows for evaluation of creative efforts to employ dissonance in music.

(4) Visualizing geometric transformations of the time-frequency plane. Such transformations are regularly used in musical compositions. Spectrograms provide an effective tool for analyzing the patterns of such transformations over longer time scales than score analysis facilitates. Such analyses are valuable for understanding the connection between these patterns and the music they reflect.

(5) Zoomusicology. Spectrograms provide a generalized musical notation for capturing the production of sonorities by other species, such as birds. This can yield new ideas for pitch alterations and rhythmic alterations (analyzed with the percussion scalogram method). Also, slowing down the tempo of birdsong, which we discuss later in Example 12, is an important technique (first applied by Messiaen). With Gabor transforms, this slowing down can be performed automatically (without tedious and complicated field transcriptions to standard musical notation).

Discrete Gabor Transforms: Signal Synthesis

The signal $\{f(t_k)\}$ can be reconstructed from its Gabor transform $\{C_{\ell,\nu}\}$. We shall briefly describe this reconstruction and show how it can be expressed in terms of synthesis with a dual frame, a frame dual to the Gabor frame $\{G_{\ell,\nu}\}$. Following this discussion, we apply this synthesis method to audio processing and the creation of new electronic music.

First, we briefly sketch the reconstruction process. The FFT in (2) is invertible, via the formula:

(12) $a_m = \dfrac{1}{N}\displaystyle\sum_{\nu=-N/2}^{N/2-1} A_\nu\, e^{i2\pi\nu m/N}.$

Hence, by applying such FFT-inverses to the Gabor transform in (1), we obtain a set of discrete signals:

$\bigl\{\{f(t_k)\,w(t_k-\tau_\ell)\}\bigr\}_{\ell=0}^{M}.$

For each $\ell$, we then multiply the $\ell$th signal by $\{w(t_k-\tau_\ell)\}$ and sum over $\ell$, obtaining

$\Bigl\{\displaystyle\sum_{\ell=0}^{M} f(t_k)\,w^2(t_k-\tau_\ell)\Bigr\} = \Bigl\{f(t_k)\displaystyle\sum_{\ell=0}^{M} w^2(t_k-\tau_\ell)\Bigr\}.$

Multiplying the right side of this last equation by the values

$\Bigl(\displaystyle\sum_{\ell=0}^{M} w^2(t_k-\tau_\ell)\Bigr)^{-1},$

which by (6) are no larger than $A^{-1}$, we obtain our original signal values $\{f(t_k)\}$.
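This windowed analysis / re-window-and-normalize reconstruction can be checked numerically. Below is a minimal Python sketch under our own illustrative assumptions (a Hann window with 50% overlap); it is not the paper's implementation, and a production version would treat the window edges more carefully.

```python
import numpy as np

def gabor_transform(f, w, hop):
    """Analysis: FFT of each windowed segment; one row per window shift."""
    N = len(w)
    L = (len(f) - N) // hop + 1
    return np.array([np.fft.fft(f[l*hop : l*hop + N] * w) for l in range(L)])

def gabor_reconstruct(C, w, hop, n):
    """Invert each FFT (recovering f*w per window), multiply by the window
    again, overlap-add, then divide by the accumulated sum of w^2."""
    N = len(w)
    out, norm = np.zeros(n), np.zeros(n)
    for l, row in enumerate(C):
        seg = np.real(np.fft.ifft(row))      # = f * w on this window
        out[l*hop : l*hop + N] += seg * w    # multiply by the windowing again
        norm[l*hop : l*hop + N] += w ** 2    # sum of w^2(t_k - tau_l)
    return out / np.maximum(norm, 1e-12)     # guard against zeros at the edges

rng = np.random.default_rng(0)
f = rng.standard_normal(1024)
w = np.hanning(128)                          # overlapping windows, sum(w^2) > 0
rec = gabor_reconstruct(gabor_transform(f, w, 64), w, 64, 1024)
```

Away from the very first and last samples (where the Hann window vanishes), `rec` reproduces `f` to floating-point accuracy.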

Synthesis Using Dual Frames

We now describe one way in which this reconstruction can be expressed via a dual frame. Alternative ways are described in [23, 25, 57].

We apply inverse FFTs to the Gabor transform values $\{C_{\ell,\nu}\}$ in (5), then multiply by $\{w([k-j\ell]\Delta t)\}$, and sum over $\ell$ to obtain:

(13) $\displaystyle\sum_{\ell=0}^{M}\;\sum_{\nu=-N/2}^{N/2-1} C_{\ell,\nu}\,\frac{w([k-j\ell]\Delta t)}{N}\, e^{i2\pi\nu(k-j\ell)/N}.$

We then divide by $\sum_{m=0}^{M} w^2([k-jm]\Delta t)$, obtaining

$\displaystyle\sum_{\ell=0}^{M}\;\sum_{\nu=-N/2}^{N/2-1} C_{\ell,\nu}\,\frac{w([k-j\ell]\Delta t)\, e^{i2\pi\nu(k-j\ell)/N}}{N\sum_{m=0}^{M} w^2([k-jm]\Delta t)} = f(t_k).$

Now, define the dual Gabor frame $\{\Gamma_{\ell,\nu}\}$ by

(14) $\Gamma_{\ell,\nu}(k) = \dfrac{w([k-j\ell]\Delta t)\, e^{i2\pi\nu(k-j\ell)/N}}{N\displaystyle\sum_{m=0}^{M} w^2([k-jm]\Delta t)}.$

We leave it as an exercise for the reader to show that $\{\Gamma_{\ell,\nu}\}$ is, in fact, a discrete frame. Using (14), we obtain

(15) $f(t_k) = \displaystyle\sum_{\ell=0}^{M}\;\sum_{\nu=-N/2}^{N/2-1} C_{\ell,\nu}\, \Gamma_{\ell,\nu}(k).$

Equation (15) is our synthesis of $\{f(t_k)\}$ using the dual frame $\{\Gamma_{\ell,\nu}\}$.

We note that by combining (15) with (6) and using the Cauchy-Schwarz inequality, we obtain Inequality (8).

    January 2010 Notices of the AMS 39

Remark 3. When there is a high degree of overlapping of windows, then $N\sum_{m=0}^{M} w^2([k-jm]\Delta t)$ is approximately a constant $C$, and we have $\Gamma_{\ell,\nu}(k) \approx \overline{G_{\ell,\nu}(k)}/C$. Hence, the dual Gabor frame is, modulo a constant factor and complex conjugation, approximately the same as the Gabor frame. In any case, (15) shows that we can reconstruct the original signal $f$ as a linear combination of the vectors in the dual Gabor frame, each of which is supported within a single window of the form $w([k-j\ell]\Delta t)$. These last two statements reveal why we multiplied by the windowings before summing in (13).

The synthesis on the right side of (15) can be expressed independently of the Gabor transform. We shall use $S$ to denote the synthesis mapping

(16) $\{B_{\ell,\nu}\} \xrightarrow{\;S\;} \Bigl\{\tilde f(t_k) = \displaystyle\sum_{\ell=0}^{M}\;\sum_{\nu=-N/2}^{N/2-1} B_{\ell,\nu}\, \Gamma_{\ell,\nu}(k)\Bigr\}$

based on the right side of (15), but now applied to an arbitrary matrix $\{B_{\ell,\nu}\}$.

Musical Examples of Gabor Synthesis

We shall now illustrate Gabor synthesis with musical examples. There are two basic schemes. One scheme, which is used for creating new music, is to use the synthesis mapping $S$ in (16). The input matrix $\{B_{\ell,\nu}\}$ is specified by the composer, and the output $\{\tilde f(t_k)\}$ is the new, electronic audio. There is software available for synthesizing music in this way. A beautiful example is the MetaSynth program [71].

Another scheme, which is used for processing audio, can be diagrammed as follows (where $P$ denotes some processing step):

(17) $\{f(t_k)\} \xrightarrow{\;G\;} \{C_{\ell,\nu}\} \xrightarrow{\;P\;} \{B_{\ell,\nu}\} \xrightarrow{\;S\;} \{\tilde f(t_k)\}.$

The end result $\{\tilde f(t_k)\}$ is the processed audio.

Example 9 (Reversing figure and ground). In [38], Dörfler mentions that one application of Gabor transforms would be to select a single instrument, say a horn, from a musical passage. We will illustrate this idea by amplifying the structure H shown in the spectrogram on the left of Figure 11, which is from a passage in a 1964 Boston Symphony Orchestra recording of Stravinsky's Firebird Suite [93]. The sound corresponding to this structure is a faint sequence of ascending harp notes.

To amplify just this portion of the sound, we multiply the Gabor transform values by a mask of quadratically increasing values from 3.7 to 4.1 within a narrow parallelogram containing H, and value 1 outside the parallelogram4 (see the right of Figure 11), and then perform the synthesis mapping $S$. Notice that the amplified structure A stands out more from the background of the

    4The precise formula can be found at the website in (10).

remainder of the spectrogram (which is unaltered). Listening to the processed sound file, we hear a much greater emphasis on the harp notes than in the original, and the volume of the rest of the passage is unchanged. We have altered the figure-ground relationship of the music.

The modification we have performed in this example is a joint time-frequency filtering of the Gabor transform. With this example, we have touched on an important field of mathematical research known as Gabor multipliers. More details can be found in [9, 41, 49, 82, 97].
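In miniature, a Gabor multiplier is just a pointwise mask applied to the transform before synthesis. The following self-contained Python sketch amplifies one frequency band of a two-component test signal by a factor of 4; the window, hop, band, and gain values are our own illustrative choices, not those used for the Firebird passage.

```python
import numpy as np

def stft(f, w, hop):
    """Analysis: one FFT per windowed segment (rows indexed by window shift)."""
    N = len(w)
    return np.array([np.fft.fft(f[l*hop : l*hop + N] * w)
                     for l in range((len(f) - N) // hop + 1)])

def istft(C, w, hop, n):
    """Synthesis: inverse FFTs, re-window, overlap-add, normalize by sum w^2."""
    N = len(w)
    out, norm = np.zeros(n), np.zeros(n)
    for l, row in enumerate(C):
        out[l*hop : l*hop + N] += np.real(np.fft.ifft(row)) * w
        norm[l*hop : l*hop + N] += w ** 2
    return out / np.maximum(norm, 1e-12)

sr = 8000
t = np.arange(sr) / sr
f = np.sin(2*np.pi*440*t) + np.sin(2*np.pi*1000*t)   # two-component "passage"
w, hop = np.hanning(256), 128
C = stft(f, w, hop)
freqs = np.fft.fftfreq(C.shape[1], 1/sr)
mask = np.ones(C.shape)                              # the Gabor multiplier
mask[:, np.abs(np.abs(freqs) - 1000) < 50] = 4.0     # amplify the ~1000 Hz band
out = istft(mask * C, w, hop, len(f))                # 1000 Hz roughly 4x louder
```

The 440 Hz component passes through essentially unchanged, which is the "figure-ground" reversal in its simplest form.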

Figure 11. Left: Spectrogram of a portion of the Firebird Suite. The structure to be selectively amplified is labeled H. Right: Spectrogram with the amplified structure labeled A. (A video for this figure is at (10), a larger graphic is at [101].)

Figure 12. Selective amplification of a region in the spectrogram of a passage from a William Byrd motet, cf. Figure 10. (A video for this figure is at (10), a larger graphic is at [101].)

Example 10 (Amplifying Byrd's illusion). A similar example of Gabor multipliers involves an amplification of a portion of the Gabor transform of the passage from the William Byrd motet considered in Example 8. We again amplify a parallelogram region of the Gabor transform (see Figure 10). This produces the spectrogram shown in Figure 12. After applying the synthesis mapping $S$ to the modified transform, we are better able to hear the illusion. That is because we have selectively amplified the overtones of the vocals that our hearing is tracking when we hear the illusion.

Example 11 (Removing noise from audio). One of the most important examples of audio processing with Gabor transforms is in audio denoising. For reasons of space, we cannot do justice to this vast field. Suffice it to mention that an excellent description of the flexibility of Gabor transform denoising, combining Gabor multipliers with sophisticated Bayesian methods, is given in Dörfler's thesis [37, Sec. 3.3]. An elementary introduction is given in [96, Sec. 5.10]. One advantage of Gabor transforms is that the compact support of the windows allows for real-time denoising, including adaptation to changing noise characteristics.

Example 12 (Slowing down sound). In Example 7 we raised the issue of slowing down the tempo of birdsong. This could be done in two ways. The first way is to simply increase the size of the interval $\Delta t$ between successive signal values when playing back the sound. This has the effect, however, of decreasing the frequencies of the pitches in the sound. For example, if $\Delta t$ is doubled in size, then the frequencies of all the pitches are halved, and that lowers all the pitches by one octave. The website [98] has examples of this type of slowing down of birdsong.

Another approach to slowing down the birdsong, while leaving the pitches unchanged, is to make use of the Gabor transform $\{C_{\ell,\nu}\}$ of the sound. For example, to decrease the sound tempo by a factor of 2, we would first define a matrix $\{B_{\ell,\nu}\}$ as follows. We set $B_{2\ell,\nu} = C_{\ell,\nu}$ for each $\ell$, and then interpolate between successive values $B_{2\ell,\nu}$ and $B_{2\ell+2,\nu}$ to define each value $B_{2\ell+1,\nu}$. Finally, we apply the synthesis mapping in (16) to the matrix $\{B_{\ell,\nu}\}$. The windows for this synthesis are all shifts of the original window $w$, centered at points spaced by the original increment. Now, however, twice as many windowings are done (so as to extend twice as far in time). This method keeps the pitches at the same levels as the original sound, but they now last twice as long.

Slowing down by a factor of 2 is just the easiest method to describe. This latter approach can also be used to slow down (or speed up) the tempo of the sound signal by other factors. Both these methods can also be applied, either separately or in combination, to other digital sound files besides birdsong. For example, the MetaSynth program [71] allows for this sort of morphing of digital sound. It is interesting that the morphing we have described (which via Gabor frames can be done in a localized way on just parts of the signal) requires the extensions of the Gabor transform methodology described in Remark 2.
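The factor-2 matrix construction in Example 12 can be sketched in a few lines of Python. We use simple linear interpolation of the complex coefficients as a crude stand-in for "interpolate between successive values"; practical time-stretchers (phase vocoders) also re-align the phases of the interpolated frames before synthesis.

```python
import numpy as np

def stretch2(C):
    """Build B from the Gabor transform C as in Example 12:
    B[2l] = C[l], and B[2l+1] is interpolated between neighbors.
    (Linear interpolation of complex values is a crude stand-in;
    phase-vocoder implementations also re-align the phases.)"""
    M = len(C)
    B = np.empty((2 * M - 1,) + C.shape[1:], dtype=complex)
    B[0::2] = C                       # even rows: copies of the original frames
    B[1::2] = 0.5 * (C[:-1] + C[1:])  # odd rows: interpolated frames
    return B

C = np.arange(12, dtype=float).reshape(3, 4) + 1j   # toy Gabor transform
B = stretch2(C)
```

Feeding `B` to the synthesis mapping $S$, with windows at the original spacing, yields a sound lasting (nearly) twice as long at the same pitch.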

Example 13 (Granular synthesis of new music). As mentioned above, composers can use the scheme in (16) to create new music. Some good examples can be found at the MetaSynth website [71]. However, we cannot resist showing two of our own compositions, which are really just motifs that still need to be set within larger compositions.

Figure 13. Spectrogram for the fractal music "Fern". (A video for this figure is at (10), a larger graphic is at [101].)

In Figure 13 we show the spectrogram of a granularly synthesized musical passage, "Fern" [35]. The method used for generating the grains $\{B_{\ell,\nu}\}$ was to use the iterated function system (IFS) created by Michael Barnsley for drawing a fractal image of a fern leaf [10, pp. 86-87]. Using the probabilities $p_1 = 0.01$, $p_2 = 0.85$, $p_3 = 0.07$, and $p_4 = 0.07$, one of the corresponding four affine transformations,

$A_i\begin{bmatrix} t \\ \nu \end{bmatrix} = \begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix}\begin{bmatrix} t \\ \nu \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}, \quad (i = 1, 2, 3, 4),$

is applied to a pair of time-frequency coordinates (starting from an initial seed pair of coordinates). We shall not list all of the four affine transformations $A_i$ here. They can be found in [10, Table 3.8.3 on p. 87]. It will suffice to list just one of them:

$A_3\begin{bmatrix} t \\ \nu \end{bmatrix} = \begin{bmatrix} 0.2 & -0.26 \\ 0.23 & 0.22 \end{bmatrix}\begin{bmatrix} t \\ \nu \end{bmatrix} + \begin{bmatrix} 0 \\ 0.07 \end{bmatrix},$

which is applied with probability $p_3 = 0.07$. Notice that $A_3$ is not a member of the Euclidean group, since the matrix used for it is not orthogonal.

This IFS is used to draw points of the fern that begin the fundamentals of the notes. An instrumental spectrum generator was then used to generate overtones and time durations for the notes and to generate the actual sound file. Our iterations were done 2500 times using John Rahn's Common Lisp Kernel for music synthesis [78], which converts the generated fractal shape into a file of instrumental tone parameters determining initiation, duration, and pitch. This file of instrumental tone parameters was then converted into a digital audio file using the software C-Sound [30].
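The random IFS iteration itself is straightforward to sketch in Python. The coefficients below are the commonly published Barnsley-fern values, which we use as a stand-in for [10, Table 3.8.3] (an assumption on our part; only the map probabilities and one matrix are quoted above). Each generated point supplies the (time, frequency) seed for one grain.

```python
import numpy as np

# four affine maps (A, b): x -> A @ x + b, with the standard
# Barnsley-fern coefficients (our stand-in for [10, Table 3.8.3])
MAPS = [
    (np.array([[0.0, 0.0], [0.0, 0.16]]), np.array([0.0, 0.0])),
    (np.array([[0.85, 0.04], [-0.04, 0.85]]), np.array([0.0, 1.6])),
    (np.array([[0.2, -0.26], [0.23, 0.22]]), np.array([0.0, 1.6])),
    (np.array([[-0.15, 0.28], [0.26, 0.24]]), np.array([0.0, 0.44])),
]
PROBS = [0.01, 0.85, 0.07, 0.07]

def fern_points(n=2500, seed=0):
    """Iterate the IFS from a seed point; each row of the output is a
    (time, frequency) pair that seeds one note/grain."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    pts = np.empty((n, 2))
    for i in range(n):
        A, b = MAPS[rng.choice(4, p=PROBS)]   # pick a map by its probability
        x = A @ x + b
        pts[i] = x
    return pts

pts = fern_points()
```

Plotting `pts` reproduces the familiar fern shape; mapping the two coordinates to note onset time and fundamental frequency gives the raw material for "Fern".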

An interesting feature of the type of music shown in Figure 13 is that the notes contained in the piece do not exhibit the classical note symmetries (transpositions, reflections, etc.) that are all members of the Euclidean group for the time-frequency plane. That is because some of the affine transformations $A_i$, such as $A_3$, are not members of the Euclidean group. Perhaps these non-Euclidean operations on its notes are one source of the eerie quality of its sound.


Figure 14. Spectrogram for the granular synthesis music, "Sierpinski Round". (A video for this figure is at (10), a larger graphic is at [101].)

As a second instance of granular synthesis, we applied Barnsley's IFS for the Sierpinski triangle [10, Table 3.8.1 on p. 86]. However, instead of generating only the grains for tones from Sierpinski's triangle, we also superimposed two shiftings in time of all such grains, thus producing three Sierpinski triangles superimposed on each other at different time positions. The spectrogram of the composition, "Sierpinski Round" [36], is shown in Figure 14. Besides having visual artistic value, it is interesting to listen to. It sounds like a ghostly chorus, with a bit of the Shepard rising-pitch-illusion effect.

We have only touched on the vast arena of granular synthesis, fractal music, and electronic music in general. More work is described in [7, 47, 74, 77, 28]. Although electronic music has been developing for about half a century, it is perhaps still in its infancy. Tonal music, based on harmony of one form or another, has been with us for millennia. In his masterful critical history of twentieth-century music, Alex Ross provides a balanced, nuanced view of the development of all forms of new music, including electronic music [79, Part III].

Summary. We have described several uses of Gabor transform synthesis for creating new music. These uses are (1) modifying the intensity of a specific instrument or set of overtones in a musical passage, (2) removing noise, (3) slowing down (or speeding up) the sound, and (4) granular synthesis. Although this only scratches the surface of the use of Gabor transform synthesis, we did provide additional references in all of our examples.

Continuous Wavelet Transforms

In this section we briefly review the method of scalograms (continuous wavelet transforms) and then discuss the method of percussion scalograms. Both methods will be illustrated with musical examples.

Scalograms

The theory of continuous wavelet transforms (CWTs) is well established [31, 26, 68]. A CWT differs from a spectrogram in that it does not use translations of a window of fixed width; instead it uses translations of differently sized dilations of a window. These dilations induce a logarithmic division of the frequency axis. The discrete calculation of a CWT that we use is described in detail in [4, Section 4]. We shall only briefly review it here.

Given a function $\psi$, called the wavelet, the continuous wavelet transform $W[f]$ of a sound signal $f$ is defined as

(18) $W[f](\tau, s) = \dfrac{1}{\sqrt{s}}\displaystyle\int_{-\infty}^{\infty} f(t)\,\overline{\psi\Bigl(\dfrac{t-\tau}{s}\Bigr)}\,dt$

for scale $s > 0$ and time translation $\tau$. For the function $\psi$ in the integrand of (18), the variable $s$ produces a dilation and the variable $\tau$ produces a translation.

We omit various technicalities concerning the types of functions $\psi$ that are suitable as wavelets; see [26, 31, 68]. In [26, 32], Equation (18) is derived from a simple analogy with the logarithmically structured response of our ear's basilar membrane to a sound stimulus $f$.

We now discretize Equation (18). First, we assume that the sound signal $f(t)$ is nonzero only over the time interval $[0, T]$. Hence (18) becomes

$W[f](\tau, s) = \dfrac{1}{\sqrt{s}}\displaystyle\int_{0}^{T} f(t)\,\overline{\psi\Bigl(\dfrac{t-\tau}{s}\Bigr)}\,dt.$

We then make a Riemann sum approximation to this last integral using $t_m = m\Delta t$, with uniform spacing $\Delta t = T/N$, and discretize the time variable $\tau$, using $\tau_k = k\Delta t$. This yields

(19) $W[f](\tau_k, s) \approx \dfrac{T}{N}\,\dfrac{1}{\sqrt{s}}\displaystyle\sum_{m=0}^{N-1} f(t_m)\,\overline{\psi\bigl([t_m - \tau_k]\,s^{-1}\bigr)}.$

The sum in (19) is a correlation of two discrete sequences. Given two $N$-point discrete sequences $\{f_k\}$ and $\{\psi_k\}$, their correlation $\{(f : \psi)_k\}$ is defined by

(20) $(f : \psi)_k = \displaystyle\sum_{m=0}^{N-1} f_m\, \psi_{m-k}.$

[Note: For the sum in (20) to make sense, the sequence $\{\psi_k\}$ is periodically extended, via $\psi_{-k} := \psi_{N-k}$.]

Thus Equations (19) and (20) show that the CWT, at each scale $s$, is approximated by a multiple of a discrete correlation of $\{f_k = f(t_k)\}$ and $\{\psi^s_k = s^{-1/2}\,\overline{\psi(t_k s^{-1})}\}$. These discrete correlations are computed over a range of discrete values of $s$, typically

(21) $s = 2^{-r/J}, \quad r = 0, 1, 2, \ldots, I \cdot J,$

where the positive integer $I$ is called the number of octaves and the positive integer $J$ is called the number of voices per octave. For example, the choice of 6 octaves and 12 voices corresponds (based on the relationship between scales and frequencies described below) to the equal-tempered scale used for pianos.

The CWTs that we use are based on Gabor wavelets.5 A Gabor wavelet, with width parameter $\omega$ and frequency parameter $\nu$, is defined as follows:

(22) $\psi(t) = \omega^{-1/2}\, e^{-\pi(t/\omega)^2}\, e^{i2\pi\nu t/\omega}.$

Notice that the complex exponential $e^{i2\pi\nu t/\omega}$ has frequency $\nu/\omega$. We call $\nu/\omega$ the base frequency. It corresponds to the largest scale $s = 1$. The bell-shaped factor $\omega^{-1/2}e^{-\pi(t/\omega)^2}$ in (22) damps down the oscillations of $\psi$, so that their amplitude is significant only within a finite region centered at $t = 0$. (This point is discussed further, with graphical illustrations, in [4, 20].) Because the scale parameter $s$ is used in a reciprocal fashion in Equation (18), it follows that the reciprocal scale $1/s$ will control the frequency of oscillations of the function $s^{-1/2}\psi(t/s)$ used in Equation (18). Thus frequency is described in terms of the parameter $1/s$, which Equation (21) shows is logarithmically scaled. This point is carefully discussed in [4] and [96, Chap. 6], where Gabor scalograms are shown to provide a method of zooming in on selected regions of a spectrogram. Here we shall provide just one new example.

Example 14 (Subtle dissonance in Gymnopédie I). On the bottom of Figure 15, we show a spectrogram of the beginning of a 1983 Aldo Ciccolini performance of Erik Satie's Gymnopédie I [81]. There is a very subtle dissonance of some of the overtones of the notes in the piece. To see this dissonance we calculated a Gabor scalogram for the first 5.944 seconds of the passage, using 1 octave and 256 voices, a width parameter of 0.1, and frequency parameter 65. The base frequency is therefore 650 Hz, and the maximum frequency is 1300 Hz. This scalogram is shown on the top of Figure 15. The scalogram has zoomed in on the spectrogram enough that we can see several regions where overtones are so closely spaced that beating occurs. This subtle beating is barely evident to our ears, contributing to the haunting quality of the tones in the piece.

Besides Satie, other composers have used subtle dissonance effects in overtones. Debussy was a master of such techniques, including using different types of scales to enhance the interplay of the overtones. The musical history of Ross [79, pp. 43-49] contains a nice synopsis of Debussy's use of overtones, along with the pentatonic scale of much Oriental music (Chinese, Vietnamese, and Javanese gamelan music). The article [34] explores the overtone structures of Debussy in depth and also examines the question of whether

5A CWT with a Gabor wavelet is closely related to the Stockwell transform [91, 54, 43, 90].


Figure 15. Bottom: Spectrogram of the beginning of Gymnopédie I by Erik Satie. Top: A zooming in on the first six seconds, for frequencies between 650 Hz and 1300 Hz. There is beating at three places, marked a, b, c in both the scalogram and spectrogram. (A video for this figure is at (10), a larger graphic is at [101].)

the interplay of overtones would be enhanced if the system of just intonation were employed.

What Figures 15 and 5 illustrate is that we can use spectrograms and scalograms to produce a quantitative measure of the amount of beating in overtones. For example, beating frequencies can be easily read off from the spectrogram in Figure 5 and the scalogram in Figure 15. Furthermore, the intensity of the overtones that are clashing can be measured from the numerical data that those graphs are displaying. It is an interesting topic for future study: to measure these effects in the music of Debussy, for example, and compare the use of alternative systems of intonation.

The theory that beating in overtones creates a dissonant sound in music originated with Helmholtz [59], [47, Chap. 5]. Sethares's book [84] is an important reference relating Helmholtz's theory to a wide variety of different scales, including the pentatonic scale and Javanese gamelan music.

Percussion Scalograms

As described in [96, Chap. 6] and [20], scalograms can be used in conjunction with spectrograms to produce quantitative portraits of musical rhythm, called percussion scalograms. The theory behind percussion scalograms is described in detail in [20]. It is related to work of Smith [86, 87, 88]. Here we will only state the method. It consists of these two steps:


Step 1. Let $\{|C_{\ell,\nu}|^2\}$ be the spectrogram image. Calculate the average $\langle|C|^2\rangle$ over all frequencies at each time index $\ell$:

(23) $\langle|C|^2\rangle(\ell) = \dfrac{1}{N/2}\displaystyle\sum_{\nu=0}^{N/2-1}|C_{\ell,\nu}|^2,$

and denote the time average of $\langle|C|^2\rangle$ by $A$:

(24) $A = \dfrac{1}{M+1}\displaystyle\sum_{\ell=0}^{M}\langle|C|^2\rangle(\ell).$

Then the pulse train $\{P(\ell)\}$ is defined by

(25) $P(\ell) = \mathbf{1}_{\{k\,:\,\langle|C|^2\rangle(k) > A\}}(\ell),$

where $\mathbf{1}_S$ is the indicator function for a set $S$ ($\mathbf{1}_S(t) = 1$ when $t \in S$ and $\mathbf{1}_S(t) = 0$ when $t \notin S$). The values $\{P(\ell)\}$ describe a pulse train whose intervals of 1-values mark off the position and duration of the percussive strikes.

Step 2. Compute a Gabor CWT of the pulse train signal $\{P(\ell)\}$ from Step 1. This Gabor CWT provides an objective picture of the varying rhythms within a percussion performance.

When implementing this method, it is sometimes necessary to process the spectrogram by limiting its values to certain frequency bands (intervals of $\nu$ values), setting values of the spectrogram to 0 outside of such bands. We now show an example in which this leads to a precise analysis of the relationship between melody and rhythm. (A second example is discussed in [20, Example 6 on p. 355].)
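Step 1 of the method reduces to a few lines of code. Here is a Python sketch of the thresholding rule (23)-(25), applied to a toy spectrogram of our own invention with two loud "strikes" over a quiet background; Step 2 would then feed the resulting pulse train to a Gabor CWT routine.

```python
import numpy as np

def pulse_train(spec_power):
    """Step 1: average |C_{l,nu}|^2 over frequency at each time index l,
    then threshold at the overall time average A, giving a 0/1 pulse train."""
    avg = spec_power.mean(axis=1)   # <|C|^2>(l): average over the nu axis
    A = avg.mean()                  # time average of <|C|^2>
    return (avg > A).astype(int)    # P(l) = 1 exactly where above average

# toy spectrogram power: two loud "strikes" over a quiet background
power = np.full((100, 64), 0.01)
power[10:14, :] = 1.0               # a strike lasting 4 time indices
power[60:62, :] = 1.0               # a shorter strike
P = pulse_train(power)
```

The runs of 1's in `P` mark the position and duration of the two strikes, exactly as described above.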

Example 15 (Melody and rhythm in "Unsquare Dance"). In the 1961 Dave Brubeck Quartet's recording of "Unsquare Dance" [94], there is an amazing performance involving hand claps, piano notes, and bass notes all played in the unusual time signature of 7/4. In Figure 16 we show our analysis of the melody and rhythm in a passage from "Unsquare Dance". We used three different frequency ranges from the spectrogram to isolate the different instruments from the passage. The passage begins with a transition from rapid drumstick strikings to hand clappings when the piano enters. The rhythm of the hand clappings plus piano notes has a 7/4 time signature. Notice that the bass notes are playing with a simple repetition of 4 beats that helps the other musicians play within this unusual time signature. In sum, the analysis shown in Figure 16 provides quantitative evidence for the "tightness" (rhythmic coherence) with which these musicians are performing.

Summary. We gave a couple of examples of the use of scalograms and percussion scalograms in quantitatively analyzing musical performance in the work of Satie and Brubeck. These examples show the value of these techniques for analyzing


Figure 16. Melody and rhythm in a passage from "Unsquare Dance". Top: Spectrogram; P aligns with the piano notes. Second from top: Percussion scalogram of frequencies above 3000 Hz; drumstick and hand clap percussion are emphasized. Third from top: Percussion scalogram of frequencies between 400 and 3000 Hz; piano notes are emphasized. Bottom: Percussion scalogram of frequencies below 400 Hz; bass notes are emphasized. Notice that the hand clapping interlaces with the piano notes, 7 beats to a measure of 7/4 (marked off by the bass notes). (A video for this figure is at (10), a larger graphic is at [101].)

pitch, rhythm, and overtone structure. We indicated a topic of future research on the music of Debussy, which we hope will yield techniques that can be applied to composition in general.

Quantifying Rhythmic Complexity

Article [20] introduced sequences of rest lengths between notes (percussive strikes). See Figure 17. This was done because the rests between notes are at least as important as the notes themselves. As the classical pianist Artur Schnabel said: "The notes I handle no better than many pianists. But the pauses between the notes, ah, that is where the art resides."6

We will now attempt to quantify the complexity of a sequence of rests, using ideas from information theory, such as entropy. Ideas from information theory have been applied extensively to musical analysis. A major landmark in the field is the book by Meyer [72], which, although not providing explicit formulas and quantitative methods, nevertheless set the stage for future work. The field is now vast. One place to search for more material is the online bibliography of Downie [42], which, besides covering the important practical application of using information theory to retrieve musical works from databases, also includes many papers on the general topic of information theory and music. If there is anything new in our approach, it would be the use of percussion scalograms derived from spectrograms of recorded music of extemporaneous, or improvisational, performances (as opposed to score analysis). In any case, we felt this work to be sufficiently interesting to include here, even if it is only a brief introduction to the relations between information theory and music.

6To hear Artur Schnabel perform, go to the URL in (10) and click on the link "Artur Schnabel: Beethoven's Moonlight Sonata".

    Figure 17. Percussion scalograms. Left: From "Dance Around". Middle: From "Welela". Right: From "Taxman" (bass notes only). Below each scalogram is the sequence of rests for the rhythmic sequence.

    First, we review the basic notion of entropy from information theory [6, 29]. We assume that we have a stochastic source that emits sequences of symbols from a finite set S = {a_0, a_1, . . . , a_n}, and p_k is the probability that the symbol a_k is emitted. The entropy E of the source is then measured in bits by the formula

    (26) E = ∑_{k=0}^{n} p_k log2(1/p_k).

    The entropy E provides a greatest lower bound for the average number of bits per symbol that any uniquely decipherable encoding method could achieve in encoding the symbols from the source. In practice, a finite sequence of symbols would need to be encoded. By the law of large numbers, the relative frequencies of occurrence of the symbols {a_k} will be converging to their probabilities, so the relative frequencies can be used in place of the theoretical probabilities {p_k}.

    We will interpret an entropy value as indicating the magnitude of complexity of a finite sequence: the idea being that the higher the entropy, the more complex the sequence; hence the more bits it takes to encode the sequence. Using the term complexity, rather than information, allows us to avoid any connotations of meaning for our sequences of rests. Such connotations of meaning are generally irrelevant in information theory anyway; something being more complex has nothing to do with whether it is more or less meaningful.

    We will now illustrate these ideas with a specific example of a rest sequence, the rest sequence from the "Dance Around" passage shown in Figure 17. The sequence is

    (27) 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 .

    For this sequence, we get relative frequencies of p0 = 14/21 and p1 = 7/21 for the rest symbols 0 and 1, respectively. Hence the entropy is E = 0.92. We leave it as an exercise for the reader to check that for the rest sequence from the "Welela" passage shown in Figure 17, the entropy is E = 1.46. These results seem to confirm our impression that the "Welela" sequence is more complex rhythmically than the "Dance Around" sequence.
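    Readers who want to experiment can compute the relative-frequency entropy of (26) in a few lines of Python. The sketch below is our own illustration (not code from the article); it checks the "Dance Around" value on the rest sequence (27).

```python
from collections import Counter
from math import log2

def entropy(seq):
    """Markov-0 (memoryless) entropy in bits, as in formula (26),
    with relative frequencies in place of the source probabilities."""
    counts = Counter(seq)
    n = len(seq)
    return sum((c / n) * log2(n / c) for c in counts.values())

# Rest sequence (27), from the "Dance Around" passage:
dance_around = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
                1, 1, 1, 0, 0, 0, 0, 0, 0]
print(round(entropy(dance_around), 2))  # 0.92
```

    Here the relative frequencies are 14/21 for the symbol 0 and 7/21 for the symbol 1, so the sum has just two terms.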

    However, there is more to it than that. In Table 1 we show our calculations of the entropy E for 10 different rhythmic passages from different genres of music. While the entries for E do seem to be consistent with the fact that African drumming is generally regarded as more rhythmically complex than rock drumming, we do notice that the value of E for "Dance Around" is fairly high. For instance, it is essentially the same as the "Taxman" sequence. But the "Taxman" sequence is

    (28) 1 0 0 0 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1,

    which seems to be more complex than the "Dance Around" sequence in (27). Moreover, using the entropy E would give exactly the same values of E = 1 for the following two sequences:

    (29) 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1

    (30) 0 0 1 1 1 0 0 0 1 0 0 0 1 1 1 1

    Table 1. Complexity Measures E and C for Rhythmic Sequences

    Title                Genre     E     C
    Wipe Out             Rock      0.57  0.54
    Dance Around         Rock      0.92  0.68
    Toad                 Rock      0.90  0.81
    What'd I Say         R & B     1.22  0.97
    Taxman               Rock      0.99  0.98
    African Drumming 1   African   1.24  1.13
    African Drumming 2   African   1.42  1.14
    Welela               African   1.46  1.23
    African Drumming 3   African   1.46  1.44
    Sing, Sing, Sing     Jazz      1.52  1.47

    Audio files and discography for these sequences are at the URL in (10).

    January 2010 Notices of the AMS 45

    But it seems clear that the second sequence is more complex rhythmically than the first.

    To solve this problem, we introduce a second notion of entropy. The entropy E is based on one type of modeling of a stochastic source: the memoryless, or Markov-0, source model [29, 13]. We now consider another model, the Markov-1 source model. In the Markov-1 model, we assume that the probabilities of emitting any given symbol will depend on what the previous emitted symbol was. These probabilities will be the transition probabilities between states, a state being the value of a symbol. The Markov-1 entropy, which we will denote by C, is then calculated by an expected value over all states of the entropies for these transition probabilities. Rather than give a precise formula, it is probably easier to just give an example.

    Consider the sequence (29). To do our calculation, we must assume an initial state. Because 0 is the most common rest value in rhythm, we will always assume that the initial state is 0 (this involves appending a dummy symbol 0 to the beginning of the sequence). We then get probabilities of p0 = 9/16 and p1 = 7/16 of being, respectively, in state 0 and state 1 prior to emitting another symbol. The transition probabilities are

    State 0 (p0 = 9/16):  0 → 0: 8/9,  0 → 1: 1/9
    State 1 (p1 = 7/16):  1 → 0: 0/7,  1 → 1: 7/7

    and we then calculate the Markov-1 entropy C by

    C = p0 E0 + p1 E1,

    where E0 and E1 are calculated from the transition probabilities for state 0 and state 1, respectively. This yields C = (9/16)(0.503258) + (7/16)(log2 1) = 0.28, whereas for the sequence (30) we obtain C = 0.89. These Markov-1 entropy values are more consistent with the apparent complexity of the two sequences.
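    The Markov-1 computation can likewise be sketched in Python. This is our own illustration, using the dummy initial state 0 described above; it reproduces the values for sequences (29) and (30).

```python
from collections import Counter, defaultdict
from math import log2

def markov1_entropy(seq):
    """Markov-1 entropy C = sum over states a of p_a * E_a, where E_a
    is the entropy of state a's empirical transition probabilities
    and p_a is the relative frequency of being in state a.  A dummy
    initial state 0 is prepended, as in the text."""
    seq = [0] + list(seq)
    trans = defaultdict(Counter)  # state -> counts of next symbols
    for a, b in zip(seq, seq[1:]):
        trans[a][b] += 1
    total = len(seq) - 1          # total number of transitions
    C = 0.0
    for counts in trans.values():
        n_a = sum(counts.values())  # transitions leaving this state
        E_a = sum((c / n_a) * log2(n_a / c) for c in counts.values())
        C += (n_a / total) * E_a
    return C

seq29 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
seq30 = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
print(round(markov1_entropy(seq29), 2))  # 0.28
print(round(markov1_entropy(seq30), 2))  # 0.89
```

    For (29), state 1 always transitions to itself, so E1 = 0 and only the state-0 term contributes, giving the small value C = 0.28.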

    In Table 1 we show our calculations of the Markov-1 entropy C for the 10 different rhythmic passages. These results seem promising. The African drumming sequences are measured as the most complex. That includes the jazz sequence from "Sing, Sing, Sing", which imitates African rhythm, and is one of the most famous drum sequences in jazz history, perhaps because of its complexity. The rock drumming sequences are less complex using this measure.

    Obviously we have worked with only a very small sample of music. We intend to continue assembling more data, but we do find the results of interest and hope that this discussion will spur some further work by others as well. Working on other measures of complexity, such as contextual entropy, is another research path. By contextual entropy we mean the calculation of entropy based on a non-Markov description of the source. Symbols will be produced with probabilities that depend on what context the symbol is in. One set of contexts would be the levels of hierarchy of the rhythm. As discussed in [20], there is a more complete notation for rhythm sequences that includes grouping the notes into hierarchies. The level of hierarchy a symbol belongs to might then be used as the context for that symbol. This idea has the advantage that it may be able to incorporate the fact that production of rhythm sequences is not a Markov process.7

    We have done a couple of preliminary calculations with the "Dance Around" and "Welela" rhythmic hierarchies reported in [20, Examples 1 and 2]. We found that this new complexity measure, based on single (memoryless) frequencies of the note lengths as well as contextual (hierarchical) groupings, gives complexities of 1.12 for the "Dance Around" sequence and 1.58 for the "Welela" sequence. The two sequences are now closer in complexity value, and this may reflect the possibility that this new measure may be taking into account some of the speed variations in the "Dance Around" drumming (as discussed in the caption of Figure 5 in [20]). We mention these results only to pique the reader's interest. If our speculations are borne out, then we shall describe our findings in more detail in a subsequent paper.

    Summary. We have described some complexity measures for musical rhythm and done an initial study of their value in quantifying the complexity in different rhythmic styles. We will continue working on the relation between entropy measures of complexity and musical rhythm, as that work has just begun.

    Concluding Remarks

    We have introduced some exciting aspects of the relations between mathematics and music. The methods of Gabor transforms and continuous wavelet transforms were shown to provide a wealth of quantitative methods for analyzing music and creating new music. Readers who wish to learn more can find ample resources in the References.

    7The reason is similar to what Chomsky describes for phrase structure in language [22, p. 22]: in English, we can find a sequence a + S1 + b, where there is a dependency between a and b, and we can select as S1 another sequence containing c + S2 + d, where there is dependency between c and d, then select as S2 another sequence of this form, etc. The parallel between English phrases and musical phrases (with S1, S2, . . . as bridges) is obvious.


    Acknowledgements

    This research was partially supported by the National Science Foundation Grant 144-2974/92974 (REU site SUREPAM program) and by a grant from the UW-Eau Claire Office of Research and Sponsored Programs. We thank William Sethares for his cogent review of an earlier draft of this paper. We are also grateful to Marie Taris for her perceptive editorial guidance throughout our preparation of this article. J.S.W., in particular, would like to thank Steven Krantz for all the encouragement and aid he has given him in pursuing the interdisciplinary study of mathematics and music.

    References

    [1] Acoustical Research Institute of the Austrian Academy of Sciences, STx software, available at http://www.kfs.oeaw.ac.at/content/section/16/392/lang,8859-1/.

    [2] J. Adams, Hallelujah Junction: Composing an American Life, Farrar, Straus and Giroux, New York, 2008.

    [3] J. B. Allen and L. R. Rabiner, A unified approach to short-time Fourier analysis and synthesis, Proc. IEEE 65 (1977), 1558–64.

    [4] J. F. Alm and J. S. Walker, Time-frequency analysis of musical instruments, SIAM Review 44 (2002), 457–76, available at http://www.uwec.edu/walkerjs/media/38228[1].pdf.

    [5] Appalachian Spring, Rodeo, Fanfare for the Common Man, Atlanta Symphony Orchestra, cond. Louis Lane, Telarc, 1982, Track 6: "Appalachian Spring (Suite)".

    [6] R. B. Ash, Information Theory, Dover, New York, 1990.

    [7] G. Assayag, H. Feichtinger, J.-F. Rodrigues (eds.), Mathematics and Music: A Diderot Mathematical Forum, Springer, New York, 2002.

    [8] Ave Verum Corpus: Motets and Anthems of William Byrd, by Cambridge Singers, recorded at Great Hall of University College School, London, Collegium Records, 2003, Track 16: "Non Vos Relinquam".

    [9] P. Balazs, Regular and Irregular Gabor Multiplier with Application to Psychoacoustic Masking, Ph.D. thesis, Univ. of Vienna, 2005.

    [10] M. Barnsley, Fractals Everywhere, Second Edition, Academic Press, Cambridge, MA, 1993.

    [11] The Beatles (The White Album), Capitol/EMI, 1968, Disc 1, Track 11: "Blackbird", converted to mono by Audacity sound editor.

    [12] J. W. Beauchamp, ed., Analysis, Synthesis, and Perception of Musical Sounds: The Sound of Music, Springer, 2007.

    [13] T. C. Bell, J. G. Cleary, and I. H. Witten, Text Compression, Prentice-Hall, Englewood Cliffs, NJ, 1990.

    [14] Benny Goodman Orchestra, Jazz Collector Edition, Laser Light Digital, 1991, Track 13: "Sing, Sing, Sing", circa 1943 live broadcast of the Benny Goodman Orchestra.

    [15] D. Benson, Music: A Mathematical Offering, Cambridge, 2006.

    [16] L. Bernstein, The Joy of Music, Amadeus Press, Pompton Plains, NJ, 2004.

    [17] Carreras, Domingo, Pavarotti in Concert, Orchestra del Maggio musicale fiorentino and Orchestra del Teatro dell'Opera di Roma, conducted by Zubin Mehta, Decca, 1990, Track 12: "Nessun Dorma".

    [18] P. Casazza, The art of frame theory, Taiwanese J. of Math. 4 (2000), 129–201, available at http://www.math.nthu.edu.tw/~tjm/abstract/0006/tjm0006_1.pdf.

    [19] D. Chen and S. Qian, Joint Time-Frequency Analysis: Methods and Applications, Prentice Hall, Englewood Cliffs, NJ, 1996.

    [20] X. Cheng, J. V. Hart, and J. S. Walker, Time-frequency analysis of musical rhythm, Notices of the American Mathematical Society 56 (2009), 344–60, available at http://www.ams.org/notices/200903/.

    [21] E. Chew, Math & Music: The Perfect Match, OR/MS Today 35 (2008), 26–31, available at www-rcf.usc.edu/~echew/papers/ORMS_Today_2008/2008-orms_today_ec_lowres.pdf.

    [22] N. Chomsky, Syntactic Structures, Mouton de Gruyter, New York, 2002.

    [23] O. Christensen, Pairs of dual Gabor frame generators with compact support and desired frequency localization, Appl. Comput. Harmon. Anal. 20 (2006), 403–10.

    [24] O. Christensen, An Introduction to Frames and Riesz Bases, Birkhäuser, Boston, 2003.

    [25] O. Christensen and W. Sun, Explicitly given pairs of dual frames with compactly supported generators and applications to irregular B-splines, J. Approx. Theory 151 (2008), 155–63.

    [26] C. K. Chui, Wavelets: A Mathematical Tool for Signal Analysis, SIAM, Philadelphia, PA, 1997.

    [27] R. Cogan, New Images of Musical Sound, Harvard University Press, Cambridge, MA, 1984.

    [28] Computer Music Journal, webpage: http://www.mitpressjournals.org/cmj/.

    [29] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, 1991.

    [30] C-Sound is available at http://www.csounds.com/.

    [31] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, PA, 1992.

    [32] I. Daubechies and S. Maes, A nonlinear squeezing of the continuous wavelet transform based on auditory nerve models, in Wavelets in Medicine and Biology, CRC Press, Boca Raton, FL, 1996, 527–46.

    [33] I. Daubechies, A. Grossman, and Y. Meyer, Painless nonorthogonal expansions, J. Math. Phys. 27 (1986), 1271–83.

    [34] G. W. Don, Brilliant colors provocatively mixed: Overtone structures in the music of Debussy, Music Theory Spectrum 23 (2001), 61–73.

    [35] G. W. Don, "Fern", unpublished recording, available at http://www.uwec.edu/walkerjs/MBSGC/fern.wav.

    [36] G. W. Don and J. S. Walker, "Sierpinski Round", unpublished recording, available at http://www.uwec.edu/walkerjs/MBSGC/Sierpinski_Round.wav.

    [37] M. Dörfler, Gabor Analysis for a Class of Signals Called Music, Ph.D. thesis, Univ. of Vienna, 2002.

    [38] M. Dörfler, Time-frequency analysis for music signals: A mathematical approach, J. of New Music Research 30 (March 2001).



    [39] M. Dörfler and H. Feichtinger, Quilted Gabor families I: Reduced multi-Gabor frames, Appl. Comput. Harmon. Anal. 356 (2004), 200123, available at http://homepage.univie.ac.at/monika.doerfler/.

    [40] M. Dörfler and H. G. Feichtinger, Quantitative Description of Expression in Performance of Music, Using Gabor Representations, Proceedings of the Diderot Forum on Mathematics and Music, Vienna, 1999.

    [41] M. Dörfler and B. Torrésani, Representation of operators in the time-frequency domain and generalized Gabor multipliers, to appear in J. Fourier Anal. Appl.

    [42] J. S. Downie, Music Information Retrieval online bibliography: http://www.music-ir.org/research_home.html.

    [43] J. Du, M. W. Wong, and H. Zhu, Continuous and discrete inversion formulas for the Stockwell transform, Integral Transforms Spec. Funct. 18 (2007), 537–43.

    [44] J. Duesenberry, Convolution reverb and beyond, Electronic Musician, April 1, 2005, available at http://emusician.com/mag/emusic_convolution_reverb_beyond/index.html.

    [45] R. J. Duffin and A. C. Schaeffer, A class of nonharmonic Fourier series, Trans. Amer. Math. Soc. 72 (1952), 341–66.

    [46] Electric Ladyland, Jimi Hendrix Experience, Reprise, 1968, Track 15: "All Along the Watchtower", converted to mono by Audacity sound editor.

    [47] J. Fauvel, R. Flood, and R. Wilson (eds.), Music and Mathematics: From Pythagoras to Fractals, Oxford, New York, 2003.

    [48] H. G. Feichtinger, F. Luef, and T. Werther, A Guided Tour from Linear Algebra to the Foundations of Gabor Analysis, in Gabor and Wavelet Frames, IMS Lecture Notes Series 10 (2007), 1–49.

    [49] H. G. Feichtinger and K. Nowak, A First Survey of Gabor Multipliers, 2003, in reference [51], available at http://www.univie.ac.at/nuhag-php/home/feipub_db.php.

    [50] H. G. Feichtinger and T. Strohmer (eds.), Gabor Analysis and Algorithms: Theory and Applications, Birkhäuser, Boston, MA, 1998.

    [51] H. G. Feichtinger and T. Strohmer (eds.), Advances in Gabor Analysis, Birkhäuser, Boston, MA, 2002.

    [52] S. Flinn, Shepard's Tones webpage: http://www.cs.ubc.ca/nest/imager/contributions/flinn/Illusions/ST/st.html.

    [53] D. Gabor, Theory of communication, Journal of the Institute for Electrical Engineers 93 (1946), 873–80.

    [54] P. C. Gibson, M. P. Lamoureux, and G. F. Margrave, Letter to the editor: Stockwell and wavelet transforms, J. Fourier Anal. Appl. 12 (2006), 713–21.

    [55] K. Gröchenig, Foundations of Time-Frequency Analysis, Birkhäuser, Boston, MA, 2001.

    [56] L. Harkleroad, The Math Behind the Music, Cambridge University Press, Cambridge, UK, 2006.

    [57] D. M. Healy Jr. and S. Li, A parametric class of discrete Gabor expansions, IEEE Trans. Signal Process. 44 (1996), 201–11.

    [58] C. E. Heil and D. F. Walnut, Continuous and discrete wavelet transforms, SIAM Review 31 (1989), 628–66, available at http://people.math.gatech.edu/~heil/papers/siam.pdf.

    [59] H. Helmholtz, On the Sensations of Tone, Dover, New York, 1954.

    [60] D. R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid, Vintage, New York, 1980.

    [61] D. Kroodsma, The Singing Life of Birds, Houghton-Mifflin, New York, 2005.

    [62] J. Latartara, Pedagogic applications of Fourier analysis and spectrographs in the music theory classroom, J. of Music Theory Pedagogy 22 (2008), 61–90.

    [63] Layla and Other Assorted Love Songs, Derek and the Dominoes, original recording remastered, Polydor, 1970, Track 13: "Layla", converted to mono by Audacity sound editor.

    [64] Louis Armstrong: Gold, Hip-O Records, 1996, original June 26, 1950, recording remastered, Disc 2, Track 3: "La Vie en Rose".

    [65] G. Loy, Musimathics: The Mathematical Foundations of Music, Vol. 2, MIT Press, Cambridge, MA, 2007.

    [66] Ludwig van Beethoven, Piano Sonata in E (Opus 109), Movements 1 and 2, performed by Daniel Añez Garcia on January 25, 2007, at Jeunesses Musicales du Canada, available at http://www.youtube.com/watch?v=e5jgYHGoAg0&feature=related.

    [67] F. B. Mâche, Music, Myth and Nature, Contemporary Music Studies, Vol. 6, Taylor & Francis, London, 1993.

    [68] S. Mallat, A Wavelet Tour of Signal Processing, Second Edition, Academic Press, San Diego, CA, 1999.

    [69] MatLab, Time-frequency toolboxes, available at http://www.univie.ac.at/nuhag-php/mmodule/.

    [70] G. Mazzola, The Topos of Music, Birkhäuser, Basel, 2002.

    [71] MetaSynth, music software, http://www.uisoftware.com/MetaSynth/.

    [72] L. B. Meyer, Music, the Arts, and Ideas: Patterns and Predictions in Twentieth-Century Culture, University of Chicago Press, 1967.

    [73] S. Mithen, The Singing Neanderthals: The Origins of Music, Language, Mind, and Body, Harvard University Press, 2006.

    [74] J. of New Music Research, website: http://www.tandf.co.uk/journals/nnmr/.

    [75] W. O'Grady, M. Dobrovolsky, and M. Arnoff, Contemporary Linguistics, an Introduction, St. Martin's Press, New York, 1993.

    [76] L. R. Rabiner and J. B. Allen, Short-time Fourier analysis techniques for FIR system identification and power spectrum estimation, IEEE Trans. Acoust. Speech Signal Process. 27 (1979), 182–92.

    [77] Perspectives of New Music, website: http://www.perspectivesofnewmusic.org/.

    [78] J. Rahn webpage: http://faculty.washington.edu/~jrahn/.

    [79] A. Ross, The Rest Is Noise: Listening to the Twentieth Century, Farrar, Straus and Giroux, New York, 2007.

    [80] D. Rothenberg, Why Birds Sing: A Journey into the Mystery of Bird Song, Basic Books, New York, 2005.



    [81] Satie: L'Oeuvre pour piano, vol. 1, Aldo Ciccolini, piano, recorded in 1983, EMI, 1987, Track 7: "Gymnopédie I".

    [82] K. Schnass, Gabor Multipliers: A Self-contained Survey, master's thesis, Univ. of Vienna, 2004.

    [83] W. Sethares, Rhythm and Transforms, Springer, New York, 2007.

    [84] W. Sethares, Tuning, Timbre, Spectrum, Scale, Springer, London, 2005.

    [85] R. N. Shepard, Circularity in judgements of relative pitch, J. of the Acoustical Soc. of America 36 (1964), 2346–53.

    [86] L. M. Smith, Modelling rhythm perception by continuous time-frequency analysis, Proceedings of the 1996 International Computer Music Conference, Hong Kong, 392–95.

    [87] L. M. Smith, A multiresolution time-frequency analysis and interpretation of musical rhythm, thesis, University of Western Australia, 2000.

    [88] L. M. Smith and H. Honing, Time-frequency representation of musical rhythm by continuous wavelets, J. of Mathematics and Music 2 (2008), 81–97.

    [89] E. Stern, quoted from the Pandalous website, available at http://www.pandalous.com/ using search topic: Louis Armstrong.

    [90] R. G. Stockwell, A basis for efficient representation of the S-transform, Digital Signal Processing 17 (2007), 371–93.

    [91] R. G. Stockwell, L. Mansinha, and R. P. Lowe, Localization of the complex spectrum: the S transform, IEEE Trans. Signal Processing 44 (1996), 998–1001.

    [92] T. Strohmer, Numerical algorithms for discrete Gabor expansions, see [50, pp. 267–94].

    [93] Suite: The Firebird, Igor Stravinsky, Boston Symphony Orchestra, 1964 recording, cond. Erich Leinsdorf, BMG Classics, 1991.

    [94] Time Further Out, Dave Brubeck Quartet, SONY, 1961, Track 7: "Unsquare Dance", recorded on June 8, 1961.

    [95] D. Tymoczko, The geometry of musical chords, Science 313 (2006), 72–74, available at http://www.sciencemag.org/cgi/content/full/313/5783/72/DC1.

    [96] J. S. Walker, A Primer on Wavelets and their Scientific Applications, Second Edition, Chapman & Hall/CRC Press, Boca Raton, FL, 2008; material referenced from Chapters 5 and 6 is available at http://www.uwec.edu/walkerjs/MBSGC/PrimerExtracts/.

    [97] T. Werther et al., CPR artifact removal in ventricular fibrillation ECG signals using Gabor multipliers, IEEE Trans. Biomedical Engineering 56 (2009), 320–27.

    [98] Why Birds Sing website: http://www.whybirdssing.com/.

    [99] R. M. Young, An Introduction to Nonharmonic Fourier Series, Academic Press, New York, 1980.

    [100] M. Zibulski and Y. Zeevi, Discrete multiwindow Gabor-type transforms, IEEE Trans. Signal Processing 45 (1997), 1428–42.

    [101] Online version of figures: available at http://www.ams.org/noticespages/201001/2010010030/.

