a constraint onthe expressive timingofa melodic … constraint onthe expressive timingofa melodic...

14
Haskins Laboratories Status Report on Speech Research 1992, SR-109/110, 263-276 A Constraint on the Expressive Timing of a Melodic Gesture: Evidence from Performance and Aesthetic Judgment* Bruno H. Repp Discussions of music performance often stress diversity and artistic freedom, yet there is general agreement that interpretation is not arbitrary and that there are standards that performances can be judged by. However, there have been few objective demonstrations of any extant constraints on music performance and judgment, particularly at the level of expressive microstructure. The present study illustrates such a constraint in one specific case: the expressive timing of a melodic gesture that occurs repeatedly in Robert Schumann's famous piano piece, "Traumerei." Tone onset timing measurements in 28 recorded performances by famous pianists suggest that the most common "temporal shape" of this (nominally isochronous) musical gesture is parabolic, and that individual variations can be described largely by varying a single degree of freedom of the parabolic timing function. The aesthetic validity of this apparent constraint on local performance timing was investigated in a perceptual experiment. Listeners judged a variety of timing patterns (original parabolic, shifted parabolic, and nonparabolic) imposed on the same melodic gesture, produced on an electronic piano under MIDI control. The original parabolic patterns received the highest ratings from musically trained listeners. (Musically untrained listeners were unable to give consistent judgments.) The results support the hypothesis that there are classes of optimal temporal shapes for melodic gestures in music performance, and that musically acculturated listeners know and expect these shapes. Being classes of shapes, they represent fZezible constraints within which artistic freedom and individual preference can manifest themselves. INTRODUCTION Much has been written about music perfor- mance, with the emphasis generally being on the diversity among interpretations by different artists and in different historic periods. Yet, in each period (and quite likely across periods) there have also been generally accepted performance standards, which were reflected in music educa- tion, performance practice, and music criticism. This research was made possible through the generosity of Haskins Laboratories <Michael Studdert.Kennedy, president). Additional support came from Nm BRSG Grant RR-05596 to the Laboratories. A short version of this paper was presented at the Second International Conference on Music Perception and Cognition in Los Angeles, February 1992. I am grateful to Pat Shove for many stimulating discussions. 263 The nature of these standards has been discussed in a number of treatises (most notably Lussy, 1882), but rarely in objective and quantitative terms. This is particularly true with regard to the expressive microstructure of performance-all those variations that are not easily captured in music notation but that are essential to the com- municative function of interpretation. Musicians are usually only dimly aware of these variations, which they control intuitively rather than deliber- ately. Similarly, musical listeners perceive the structure and expression conveyed by these varia- tions without being aware of the microstructure as such. It has been up to experimental psychologists to discover and measure these variations objec- tively (e.g., Palmer, 1989; Repp, 1990; Gabrielsson, Bengtsson, & Gabrielsson, 1983; Seashore, 1938/67; Shaffer, 1981).

Upload: phungngoc

Post on 15-May-2018

224 views

Category:

Documents


0 download

TRANSCRIPT

Haskins Laboratories Status Report on Speech Research1992, SR-109/110, 263-276

A Constraint on the Expressive Timing of aMelodic Gesture: Evidence from Performance and

Aesthetic Judgment*

Bruno H. Repp

Discussions of music performance often stress diversity and artistic freedom, yet there isgeneral agreement that interpretation is not arbitrary and that there are standards thatperformances can be judged by. However, there have been few objective demonstrations ofany extant constraints on music performance and judgment, particularly at the level ofexpressive microstructure. The present study illustrates such a constraint in one specificcase: the expressive timing of a melodic gesture that occurs repeatedly in RobertSchumann's famous piano piece, "Traumerei." Tone onset timing measurements in 28recorded performances by famous pianists suggest that the most common "temporalshape" of this (nominally isochronous) musical gesture is parabolic, and that individualvariations can be described largely by varying a single degree of freedom of the parabolictiming function. The aesthetic validity of this apparent constraint on local performancetiming was investigated in a perceptual experiment. Listeners judged a variety of timingpatterns (original parabolic, shifted parabolic, and nonparabolic) imposed on the samemelodic gesture, produced on an electronic piano under MIDI control. The originalparabolic patterns received the highest ratings from musically trained listeners.(Musically untrained listeners were unable to give consistent judgments.) The resultssupport the hypothesis that there are classes of optimal temporal shapes for melodicgestures in music performance, and that musically acculturated listeners know and expectthese shapes. Being classes of shapes, they represent fZezible constraints within whichartistic freedom and individual preference can manifest themselves.

INTRODUCTIONMuch has been written about music perfor­

mance, with the emphasis generally being on thediversity among interpretations by differentartists and in different historic periods. Yet, ineach period (and quite likely across periods) therehave also been generally accepted performancestandards, which were reflected in music educa­tion, performance practice, and music criticism.

This research was made possible through the generosity ofHaskins Laboratories <Michael Studdert.Kennedy, president).Additional support came from Nm BRSG Grant RR-05596 tothe Laboratories. A short version of this paper was presentedat the Second International Conference on Music Perceptionand Cognition in Los Angeles, February 1992.

I am grateful to Pat Shove for many stimulating discussions.

263

The nature of these standards has been discussedin a number of treatises (most notably Lussy,1882), but rarely in objective and quantitativeterms. This is particularly true with regard to theexpressive microstructure of performance-allthose variations that are not easily captured inmusic notation but that are essential to the com­municative function of interpretation. Musiciansare usually only dimly aware of these variations,which they control intuitively rather than deliber­ately. Similarly, musical listeners perceive thestructure and expression conveyed by these varia­tions without being aware of the microstructure assuch. It has been up to experimental psychologiststo discover and measure these variations objec­tively (e.g., Palmer, 1989; Repp, 1990;Gabrielsson, Bengtsson, & Gabrielsson, 1983;Seashore, 1938/67; Shaffer, 1981).

264

Even though a number of studies of expressivemicrostructure have been published, they haverarely provided evidence of constraints onperformance parameters. The principal reason isthat they usually were based on very smallsamples of performances, so no statements couldbe made about the generality of particularmicrostructural patterns. Hypotheses about thegenerality of such patterns, as instantiated forexample in the performance rules of Friberg(1991) or in the hierarchical timing model of Todd(1985), remain to be validated on largeperformance data bases. Moreover, studies ofmusic performance have rarely combinedmeasurements with formal perceptual evaluationsto confirm the aesthetic validity of thehypothesized or measured patterns.

The work of Johan Sundberg and his colleaguesis a significant exception (see Sundberg, Friberg,& Fryden, 1991). A study by Sundberg andVerrillo (1980) had a purpose very similar to thatof the present research. These authors were con­cerned with the temporal shape of the ritardando,the gradual slowing of tempo commonly observedin performance at the ends of most compositions.They asked whether there was an optimal timecourse for this slowing down which performers ob­served and listeners expected. They selected 24recordings of rhythmically uniform music, mostlyby J. S. Bach, and measured the onset intervalsbetween successive tones, whose reciprocals theythen plotted as local tempo decreasing over time.Sundberg and Verrillo found that the averagefunction resulting from these measurements couldbe described in terms of two linear segments, thesecond steeper in slope than the first. They alsoconducted a perceptual test in which musicallyexperienced listeners were presented with ex­cerpts that exhibited various forms of ritardando,some corresponding to the observed average func­tion and others having deviant temporal shapes ofvarious kinds. The listeners tended to prefer theritardandi corresponding to the originalperformances.

In a later discussion of the same data, Kronmanand Sundberg (1987) abandoned the bilinearmodel and instead fitted the average data pointswith a single curve (a square-root function), whichthey claimed was similar to that observed whenother rhythmic motor activities, such as locomo­tion, come to a smooth halt. (Specific references torelevant literature were not given.) This functionthus may represent a rather general constraint onthe optimal shape of the musical ritardando.

Although these studies exhibit some method­ological weaknesses l and therefore can only be re­garded as preliminary, they nevertheless set agood precedent for the kind of approach to betaken in investigations of performanceconstraints.

The present investigation concerns possibleconstraints on the temporal shape of an expressivemelodic gesture. (The performance-oriented term"melodic gesture" is used here to refer to a briefsequence of melody tones that is executed as asingle expressive unit. The equivalent term"(rhythmic) group" is often used in the musicologi­cal literature.) By a constraint is meant arestriction on the performance patterns that occurin expert interpretations and that are judgedacceptable by musically experienced listeners.Melodic gestures occur throughout Western musicin most styles, and they come in a large variety offorms. It seems unlikely that all these forms aresubject to any single performance constraint. Thenature of these constraints may vary as a functionof many factors, including tempo, metric andharmonic structure, style, and so on. Rather thansearching for a universal constraint, the presentstudy focused on the timing pattern of oneparticular melodic gesture. If it could bedemonstrated that this pattern is subject to asignificant constraint in performance and inperceptual judgment, this would at least providean existence proof of such constraints onexpressive microstructure. Moreover, by focusingon a specific case, the constraint can becharacterized rather precisely. Questions about itsorigin and generality may then form the basis forfuture research.

The melodic gesture under investigation occursin Robert Schumann's famous piano piece,"Traumerei" (No.7 of "Kinderszenen," op. 15),whose score is shown in Figure 1. The melodicgesture moves from bar 1 into bar 2. (See Figure 3below.) In its notated form, it consistsof five eighth-notes ascending in pitch and a finallonger note which repeats the pitch of thepreceding eighth-note. The gesture recurs sixtimes (eight times, if the obligatory repeat of thefirst eight bars is counted) during the piece,with some variations in key and intervalstructure. These recurrences are aligned verticallyin Figure 1. The gesture is of central importanceto the expressive quality of a performanceof "Traumerei" and may be assumed to begiven close attention by both performers andlisteners.

A Constraint on the Expressive Timing ofMelodic Gesture: Evidence from Performance and Aesthetic Judgment 265

~(espr.)

• S' • I~~..-:::::::= (espr.)

I~'

! I' ;:::::

I < I

!Fd

OJ---r ----

)i v

-:-

It ~.

!< ~.

E,

(l

'.lM

[I]~ ----

): v~.

~.

'-"~

--ITQ]

~,!

VI V,~

(espr.)

[ill-,~d -- Ci6l

~l IJII ~rit. r-~ I I.

ioJ t 1:7'

tn~' r--l. ~~ ~ 01 II < I

i-. ~~..1<.' .. z· ... .J:_:

...,'"

)

..I

fit.I < I=

• < •. ='I d

~

t" "~.

'.lM.

[TIl

), " ~.

1>II

----.----'3:Ul.

Figure 1. Piano score of Schumann's "Traurnerei," arranged on the page so parallel structures are vertically aligned.The score was created after the Clara Schumann edition (Breitlcopf 6: Hartel) using MusicProse software; minordeviations from the original are due to software limitations.

266

o Bars 112

• Bars 516

" Bars 9/10

.. Bars 13114

o Bars 17/18 I--i--ir---p;

• Bars 21122

1400·T:=:::I==r::;--T-,-n

1200

(j) 1000E---o 800*---+--i----+-...,a--;-7';f',;.........;

400+,--+--+--+--+--....,'-6 '-7 '·8 2-' 2·2

BAR/EIGHTH-NOTE

PERFORMANCE MEASUREMENTSTone onset timing measurements were obtained

from the digitized waveforms of 28 differentperformances of "Traumerei," taken fromcommercial recordings (LP, CD, or cassette) by 24pianists. Two famous pianists (Alfred Cortot andVladimir Horowitz) were represented with threedifferent recordings each. The measurements wereaveraged over the obligatory repeat of bars 1-8(observed by all but two pianists in the sample)before further analysis. Thus there were data forsix instances of the melodic gesture of interest ineach of the 28 performances, a total of 168.

Initially, the geometric mean durations of thefive lOIs for each of the six instances of the ges­ture were computed across the 28 performances.These durations (in ms) were plotted as a functionof score time (i.e., at equal abscissa intervals), andtheir pattern was examined as to whether it couldbe fit by some simple function. These data areshown in Figure 2. It is evident that the timingpattern of each instance was fit well by a smoothcurvilinear function, in fact a parabola (quadraticcurve). Overall, pianists tended to speed upsomewhat in the initial part of the melodic gestureand to slow down at the end. This slowing downwas especially pronounced in the last instance ofthe melodic gesture (bars 21-22), where the scoreindicates a fermata (hold) on the last note. It wasleast pronounced in the two instances in the mid­dle section of the piece (bars 9-10 and 13-14). Allinstances, however, were described well byquadratic functions which differed mainly in cur­vature.

Figure 2. Timing patterns of six instances of the samemelodic gesture in "Triumerei." The data points are thegeometric average durations of 28 performances (Repp,1982), with quadratic functions fitted to them. Theabscissa labels refer to bars 1-2.

The onsets of the tones corresponding to the sixmelody notes define five interonset intervals(lOIs) which would be equally long if the musicwere performed mechanically (e.g., by acomputer). In fact, they are never equal in ahuman performance; pianists always give anexpressive temporal shape to this crucial part ofthe melody. This temporal shape can be visualizedas the pattern of observed 101 durations, plottedas connected points equidistant along the x-axis("score time"). How many such patterns are there?In principle, the melodic gesture can be performedwith any temporal pattern whatsoever.2 However,the hypothesis pursued here is that only certainpatterns actually occur in expert performancesand are found acceptable by listeners.

One characteristic of this class of patterns maybe predicted on the basis of the general principleof 1mal lengthening (e.g., Lindblom, 1978; Todd,1985): A slowing down of tempo is often observedat the ends of action units such as phonologicalphrases in speech or melodic gestures in music,particularly when they coincide with the end of alarger structural unit, such as a clause(subphrase) or phrase. Therefore, the timingpatterns to be investigated may be expected toshow some lengthening of the last 100(s). Anindependent reason for lengthening of the last 101might be the occurrence of two grace notes(essentially a written-out arpeggio) in the lefthand during that interval (see Figure 1). However,these grace notes occur only in bars 2, 6, and 18,not in bars 10, 14, and 22. The pattern ofexecution of these two sets of variants may differ.Another relevant phenomenon is the possiblelengthening of accented tones. In the score, thefourth note of the melodic gesture follows a barline and thus in theory carries a strong metricalaccent (downbeat). Based on the notated music,therefore, a lengthening of the fourth intertoneinterval might be predicted. Musical intuitionsuggests, however, that this theoretical accent issuspended in performance, and that the accentedtone of the melodic gesture is in fact the final one.Whether this is in fact so is an empirical issue tobe addressed below. In principle, nothing canprevent a pianist from placing an overt accent onthe fourth tone.

In the remainder of this paper, a summary ofperformance measurements is followed by thedetailed report of a perceptual experiment. Themeasurements derive from a comprehensiveanalysis of timing microstructure in performancesof Schumann's "Traumerei"; for details, the readeris referred to Repp (1992).

A Constraint on the Expressive Timing ofMelodic Gesture: Evidence from Perfomulnce and Aesthetic Judgment 267

Subsequently, all 168 individual timing patternswere plotted and examined in the same way. Itwas found that 87% of them could be describedrather well by quadratic functions of varyingelevation (i.e., average tempo) and curvature (i.e.,degree of tempo modulation). All but two of theexceptions followed a single pattern: a relativeshortening of the last IOl.3 This pattern, whosemain representative was the French pianist AlfredCortot in his three performances, suggests adifferent structural interpretation of the melodicgesture: a division into two subgestures and/or anintention to place an accent on the fourth tone.Three other pianists showed this patternintermittently; Cortot himself consistently avoidedit in bars 21-22, where he showed the standardparabolic timing curve.

Further analysis of the coefficients of thequadratic polynomials (y = a + bx + cx2 ) fit to 87%of all instances revealed some strong relationshipsamong the constant (a), linear (b), and quadratic(c) terms of these functions. The latter two, inparticular, were highly correlated. There was alsoa substantial correlation between the quadraticand constant terms. Linear regressions among thecoefficients made it possible to predict the linearand constant terms from the quadratic term andthus to generate a single family of parabolas byvarying the quadratic term alone. (This family isshown in the upper left-hand panel of Figure 4below.) It captures a substantial amount of thevariance in the data, with deviations occurringmainly in the constant term (i.e., elevation alongthe ordinate, corresponding to variations inoverall tempo), which is irrelevant to the temporalshaping of the melodic gesture.

These quadratic curves represent a strongconstraint on the timing pattern of the melodicgesture studied here. Apparently, the largemajority of expert pianists achieve a parabolictiming function by controlling a single degree offreedom. No pianist lengthened the second 101,say, or shortened the first, or showed any pattern(other than the type favored by Cortot) thatdeviated substantially from a parabQlic trajectory(though see Footnote 2). Even the Cortot patternfollowed a parabolic curve through the first fourlOIs. To the author, however, these performancessound mannered. This subjective impression, inconjunction with the overall predominance ofparabolic timing patterns, suggested that themore typical parabolic patterns might also bepreferred by other musically experiencedlisteners. This hypothesis was tested in thefollowing perceptual experiment.

PERCEPTUAL EXPERIMENT

The purpose of this experiment was todemonstrate that listeners' aesthetic preferencesconverge on the timing patterns that characterizethe majority of expert performances. To that end,subjects were presented with the melodic gestureof interest, executed with a variety of timingpatterns, each of which was to be rated foracceptability on a 10-point scale. The timingpatterns included parabolic and "hybrid"(nonparabolic) shapes. Among the former, therewere some that belonged to the family of functionsobserved in actual performances, whereas othersdeviated in the location of the minimum. It wasexpected that listeners would prefer the "normal"over the deviant parabolic shapes. In addition,these functions varied in curvature. Sincelisteners might also exhibit a preference for aparticular curvature (degree of tempo modulation)within each class of temporal shapes, somedeviant shapes might actually be preferred oversome normal shapes. However, for a given degreeof curvature, the normal shapes were expected tobe preferred most. The hybrid shapes weregenerated from two normal parabolic patterns ofdifferent curvature by interchanging their lOIs inall possible ways. It was expected that listeners'judgments would reflect the hybrids' degree ofapproximation to a parabolic shape. The responseswere also expected to yield information aboutwhat deviations from the normal shapes are morereadily tolerated than others. In fact, one of thesedeviant shapes resembled the Cortot type ofpattern.

The role of listeners' musical experience was arather crucial issue. If the parabolic constraintuncovered in the performance timingmeasurements reflects a general principle ofphysical motion, i.e., an optimal pattern ofacceleration-deceleration, then even listenerswithout much musical experience might show apreference for it. The alternative possibility is thatlisteners need to be attuned to temporal patternsin classical music performance to show reliablepreferences in this task. To investigate this issue,subjects both with and without musical experiencewere tested. A second question, concerningsubjects with musical experience, was whethertheir judgments would be based on generalknowledge of performance principles in classicalmusic, or on specific knowledge of "Triiumerei"and its performance. This question was notaddressed rigorously, but some relevantinformation was obtained.

268

Methods

Subjects. Twenty-six subjects participated. Themajority of them had responded to anadvertisement in the Yale campus newspaper;others were recruited personally by the authorand included some friends and family memberswho served without pay. Twelve subjects had littlemusical education; most of them did not play anyinstrument, while some had studied aninstrument for a short time. Fourteen subjectswere musically experienced; they included 11pianists, two violinists, and one flutist, ranging inskill from advanced amateur to professional level.

Stimuli. The stimuli were generated on aRoland RD250S digital piano under MIDI control.Temporal resolution was 5 ms. Each stimulusconsisted of the excerpt shown in Figure 3 (frombars 1-2 of "Triiumerei"), played with one of 45different timing patterns. The timing pattern wasapplied only to the melodic gesture of interest,which comprised five lOis; the timing of thepreceding context, comprising three longer lOis,was constant at values representing the geometricmeans of the 28 expert performances measured byRepp (1992): 1065, 1380, and 1825 ms,respectively. The timing of the left-hand grace­note tones during the last 101 of the criticalgesture was such that the first tone started afterone third of the IOI had elapsed and ended withthe onset of the second tone, which started afterone third of the remaining interval had elapsed.(This timing pattern was fairly common in the 28performances examined.) To make room for thegrace-note tones, the preceding chord, the tied­over quarter-notes of the preceding chord in bar 2

were realized as tied-over eighth-notes.Sustaining pedal was added as indicated in thescore. The tones had a fixed expressive intensitypattern similar to that of one of the expertperformances.

The timing patterns of the critical melodicgesture are illustrated in Figure 4. The upper left­hand panel shows the five "normal" patterns,which followed parabolic functions of varyingcurvature. Each parabola was generated by theequation, 101(ms) =C + Lx + Qx2 , where x standsfor the ordinal numbers of the lOIs (1,... ,5). Thequadratic term (Q) of the polynomial equation wasset at values of 20, 40, 60, 80, and 100, which spanthe range of most empirically observed timingfunctions. The linear (L) and constant (C) terms ofthe parabolas were derived according to theempirically determined regression equations, L =35 - 5.5Q and C = 388 + 7.8Q (see Repp, 1992).The resulting stimuli were named Q20, ..., Q1oo.

The lower panels in Figure 4 illustrate two setsof deviant parabolic curves. Each set varied in Qalong the same values as the normal set, but theconstant and linear terms differed. In the "left­shifted" set, C was decreased by 300 and L wasincreased by 100, whereas, in the "right-shifted"set, C was increased by 300 and L was decreasedby 100. In each case, the change in one parameterwas arbitrary, but the change in the otherparameter was chosen so as to keep the averageIOI duration equal to that of the normal conditionwith the same Q. The stimuli in the left-shiftedset (Q20L, ..., QlOOL) started faster and endedslower than the normal stimuli; the opposite wastrue for the stimuli in the right-shifted set(Q20R, ..., QI00R).

*~.~.

0 1 2~

,.., ~ ('J J.( --- .........

/'I

\. -. ,. .tJ "*-

~v· ~ (' I

< p Of" .d.I

( .. ,_tc ..

Figure 3. The musical excerpt used in the experimmt (from ban 1-2 of "'Triumerei," with slightly modified finalnotes). The melodic gesture of interest is boxed in.

A Constraint on the Expressive Timing ofMelodic Gesture: Evidence from Peifrmrulnce and Aesthetic Judgment 269

1200I

1200

0 020

• 0401000 1000

D. 060

J.. 080

800 0 0100 800

400 400 I..--. 1-6 1-7 1-8 2-1 2-2 1-6 1-7 1-8 2-1 2-2(j)

E--- 1400 1000

0 0 020R

1200• Q40R

1000 800 D. 060R

J.. oaoR800

600 600

400

200 400 I

1·6 1-7 1-8 2-1 2-2 1-6 1-7 1-8 2-1 2-2

BAR/EIGHTH-NOTE

Figure 4. Timing patterns of the experimental stimuli. Upper left-hand panel: nonnal parabolic patterns. Lower left­hand panel: left-shifted parabolic patterns. Lower right-hand panel: right-shifted parabolic patterns. Upper right-handpanel: hybrid patterns.

270

The remaining 30 timing patterns were gener­ated as illustrated in the upper right-hand panelof Figure 4. The heavy lines in the figure illustratethe Q20 and Q100 timing patterns, representedhere as polygons rather than as smooth curves.Thirty hybrid patterns were generated by inter­changing 101 durations from those two patterns.With two possible values for each of five lOIs,there are 32 possible patterns, two of which arethe original ones. The original patterns werecoded arbitrarily as HooooO (= Q20) and Hllll1(= Q100), and hybrid patterns were coded asH10000, HllOOO, etc. Clearly, some of thesehybrids (e.g., H00100, HllOll) were very similarto the originals, whereas others were moredissimilar. Although some of them were clearlynonparabolic (e.g., H01010), others might by fit bya left-shifted or right-shifted parabola (HooOlland H1lloo, respectively). In contrast to the left­and right-shifted parabolic patterns, however, allindividual lOIs in the hybrid patterns were withinthe normal range. One hybrid pattern, HllllO,was not unlike the Cortot pattern described above.

The stimuli were recorded electronically fromthe audio output jack of the digital piano ontohigh-quality cassette tape . Six examples wererecorded at the beginning of the tape, the firstthree with isochronous timing of the melodic ges­ture (i.e., with constant lOIs of 500 ms), and thesecond three being stimuli HOll01, Q80R, andQ40. These examples were followed by three dif­ferent randomized sequences of the 45 stimuli.Interstimulus intervals were 5 s, wi' m addi­tional 5 s after each group of 15, and "her 5 sbetween blocks.

Procedure. Subjects received a dubbed copy ofthe master cassette, accompanied by detailedprinted instructions, an answer sheet, and aquestionnaire about their musical experience.They listened on their home audio equipment andreturned the completed materials. (Control oversound quality and playback level was not crucialin this study.)

The instructions displayed the score of heexcerpt (cf. Figure 3) and included the followingcrucial sections:

...Eacb time the excerpt will be played with asligbtly different timing pattern of the notes. Yourtask is to judge the aesthetic appeal of eacb timingpattern. Clearly, there are no rigbt or wrongresponses bere; I want to ftnd out wbat soundsgood to you...."

(After the first set of examples had beenintroduced:)

.....In the following three examples, the eighth­notes vary in duration, as they would in a bumanperformance. Eacb of the three examples bas adifferent timing pattern, and they may not (in fact,should not) sound equally good to you. Clearly,there are some timing patterns that are preferable toothers.... In the following test, you will indicate[your] preference by giving a numerical ratingbetween 1 and 10 to each excerpt you bear, wbere10 is the best possible rating and 1 is the worst. ...However, don't use these [ratings] in an absolutesense, but try to adjust to the diversity of timingpatterns you bear and use the whole scale; that is,give ratings of 9 or 10 to the best patterns you bearin the course of this experiment, and ratings of 1 or2 to the worst, regardless of bow you might judgethese patterns in an absolute sense. Avoid givingtoo many ratings in the middle range; try to use theextremes as well...."

Nearly all subjects in fact used the whole rangeof rating categories.

Results and DiscussionConsistency of judgments. The first question to

ask was whether the subjects were able to performthe task-that is, give reliable judgments. Thereliability of their ratings could be determined bycorrelating the ratings across the three blocks ofstimuli. Since the first block served to familiarizesubjects with the stimuli, the correlation betweentr second and third blocks was expected to be1 r than that between the first block ande of the other two. However, although thisw .rue for some individual subjects, there wasno such overall tendency in the data, and thethree interblock correlations were thereforeaveraged for each subject.

All 14 musically experienced subjects exhibitedsignificant average correlations, ranging from 0.31(p < .05) to 0.79 (p < .0001). Of the 12 musicallyinexperienced subjects, however, only two showeda significant average correlation (0.49 and 0.51,respectively, both p < .001); for the rest, thecorrelations ranged from -0.01 to 0.18. This is avery striking difference. Most musically untrainedsubjects apparently did not possess a stablecriterion by which to judge the stimuli.

A second criterion that separated the twosubjects groups was their response to timingpattern Q20L. As was evident to the authorduring stimulus generation, this pattern (as wellas Q40L) sounded really ridiculous, in contrast tothe other patterns, which seemed at leastmoderately acceptable. Indeed, all musically

A Constraint on the Expressive Timing ofMelodic Gesture: Evidence from Performtlnce and Aesthetic Judgment 271

Figure 5. Average ratings given to the parabolic patterns.

20 40 60 80 100

CURVATURE (0)

As is evident from the figure, the prediction thatthe normal parabolic curves would receive thehighest ratings was confirmed. The main effect ofType was highly significant [F(2,26) = 65.60, p <0.0001). There was also a significant main effect ofCurvature [F(4,52) = 5.50, p < 0.001], though it

was irrelevant in view of a strong two-wayinteraction [F(8,104) = 20.95, p < 0.0001]. Thisinteraction was evidently due to the very differenteffect of Curvature for left-shifted parabolas thanfor normal and right-shifted ones.

The latter two stimulus types were analyzed ina separate ANaYA There were significant effectsof Type [F(l,13) = 40.24, p < 0.0001] and ofCurvature [F(4,52) = 11.26, p < 0.0001], but anonsignificant interaction [F(4,52) = 1.44, p =0.24]. Normal parabolas were rated more highlythan right-shifted ones at all degrees of curvature,and for both types the most preferred curvaturewas Q40. By contrast, left-shifted parabolas werejudged extremely unfavorably at low curvatures(as noted earlier) and more favorably at highcurvatures. At Q100, left-shifted functions werealmost as acceptable as normal ones, and moreacceptable than right-shifted ones. This indicatesthat the subjects were particularly averse tohearing a short first IOI; for stimuli Q60L toQ100L, the reduction of the starting tempoapparently compensated for the exaggeration ofthe final slow-down.

Hybrid patterns. The 30 hybrid patterns,together with the parent patterns Q20 and Q100,formed a 2 x 2 x 2 x 2 x 2 design: Each of the fivelOIs could either have a short duration (from Q20)or a long duration (from Q100). These fivepositions will be referred to by the letters Po, B, C,D, E in the following. A 5-way repeated-measuresANaYA was conducted on the subjects' ratings.Significant main effects in this analysis wouldindicate that the listeners preferred a shorter orlonger IOI duration in particular positions. Sucheffects were more likely in the positions whereQ20 and Q100 differed most, i.e., E and A (cf.Figure 4). Of greater interest were anyinteractions among the five position factors, whichwould indicate that the relationships amongseveral lOis mattered. The average ratings areshown in Table 1.

Only one of the five main effects reached signifi­cance, that of position A [F(1,13) = 8.10, p < 0.02]:Listeners preferred the shorter 101 in that posi­tion (see Table 1, bottom row).4 The main effectfor the last position, E, was nonsignificant [F(1,13)1.06, p < 0.32], even though the change induration was larger (424 ms vs. 264 ms). This isinteresting in view of the "Cortot pattern"mentioned earlier, in which the last 101 isabnormally shortened; apparently, the presentlisteners were not very consistent in theirresponses to different degrees of finallengthening.5

10 --- Normal

9 -- Left-shifted

(98 -.- RighI-shifted

ZI- 7<:(a: 6W(9 5<:(a: 4

W> 3<:(

2

1 I

experienced subjects assigned their lowest ratingsto Q20L, with average ratings ranging from 1.0 to2.0. Ten of the 12 musically inexperiencedsubjects, however, gave this stimulus averageratings between 5.0 and 9.33! The remaining twosubjects gave average ratings of 2.67 and 3.0,respectively; however, they were not the twoindividuals who showed significant reliability ofjudgments.

Because of this striking dichotomy injudgmental criteria and consistency between thetwo subject groups, further analysis was restrictedto the data of the 14 musically experiencedsubjects. Their responses to the parabolic andhybrid patterns were analyzed separately.

Parabolic patterns. The parabolic patternsconstituted a 3 (Type) by 5 (Curvature) design.The subjects' ratings were averaged over the threeblocks and subjected to a two-way repeated­measures ANaYA. The average ratings areplotted in Figure 5.

272 Rep"

Table 1. Average ratinGs (or the hybrid stimuli.

Code Rating Code Rating Code Rating Code Rating

HOOOOO 7.4 HO1000 6.5 HlOOOO 5.4 HI 1000 6.9HOOOOI 7.6 HOlOOl 6.5 HlOOOl 4.9 HllOOl 5.1HOOOlO 6.8 HOlOlO 6.1 HlOOlO 5.2 HllOlO 5.6HOOOll 6.0 HOlOll 6.1 HlOOll 4.6 HllOll 5.3HOOlOO 7.1 HOlloo 6.7 HlOloo 5.7 HlllOO 5.5HOOlOl 7.0 HOllOl 6.1 HlOlOl 5.5 Hlll01 6.0HOOIlO 6.8 HOlllO 6.6 HlOllO 4.7 HllllO 5.7HOOlll 6.8 HOllll 6.8 HlOlll 4.2 Hlllli 5.4

HOO ... 6.9 HOI ... 6.4 HlO ... 5.0 Hll ... 5.7

HO ... 6.7 HI ... 5.4

There were several significant interactions,however, which indicated that listeners did notjudge 101 durations individually. The threelargest interactions were AB [F(1,13) = 16.59, p <0.002], ABCD [F(1,13) =16.15, p < 0.002], and CE[F(1,13) = 12.00, p < 0.005]. Four additionalinteractions, ACD, BCD, BD, and BDE, weresignificant at p < 0.05, and three furtherinteractions, ACDE, ABCE, and BCDE, werenearly significant (P < 0.06). It is perhapsnoteworthy that the only two positions that werenever involved together in a significantinteraction are A and E. The beginning and theend of the timing pattern thus seemed to bejudged independently.

These interactions indicate that it is the patternof lOis that mattered, not individual 101durations. The AB interaction, for example, showsthat a shorter second 101 (B) was preferred whenthe first 101 (A) was short, but a longer B waspreferred when A was long (see Table 1,penultimate row). The BD and CE interactionsshow a similar pattern of preferred positivecovariation between two positions. Now consider amore complex interaction, ABCD, which subsumestwo other significant interactions, ACD and BCD.It can be viewed as four CD interactions, one foreach of the four combinations of A and B values.Three of these four two-way interactions exhibitthe positive covariation described above, but one(that for long A and short B) exhibits negativecovariation. That is, in that specific conditionlisteners preferred a long C when D was short,and a short C when D was long. The reason forthis complex interaction is not obvious, but it isremarkable that position C, which had a durationdifference of only 24 ms, was so strongly involvedin it. Listeners' sensitivity to small deviations in

that position mirrors the restricted range ofobserved 101 durations.

Another prediction may be examined in thehybrid pattern data. The parent patterns, Q20and Q100, were parabolas of the normal type. Thehybrid patterns approached parabolas in variousdegrees. Therefore, none of them should have beenrated higher than the parent patterns, whereasquite a few should have been rated lower.However, the difference in average ratingsbetween Q20 and Q100 of about 2 points (cf.Figure 5) must be taken into account. The revisedprediction, therefore, is that no hybrid patternshould have been judged more acceptable thanQ20, but some should have been judged lessacceptable than Ql00.

The first part of the prediction was confirmed:Only one hybrid stimulus, HOOOOl (i.e., Q20 witha lengthened final 101), received a higher averagerating than Q20 (7.6 vs. 7.4), but this differencewas certainly not significant. The second part ofthe prediction was also supported: There were 7hybrid stimuli that received lower average ratingsthan Q100 (SA). The lowest rated stimulus,H10111 (4.2), corresponded to Q100 with ashortened second 101. In fact, Table 1 shows thatall stimuli of the type H10... received low ratingswhose range (4.2 to 5.7) did not overlap at all withthe ratings of HOO and HOl ... stimuli (range:6.1 to 7.6); Hll stimuli were in between(range: 5.1 to 6.9). This again confirms the relativeimportance of the first 101 in relation to thesecond. Clearly, listeners did not like a relativelylong first 101. This is easily explained by the tonalstructure: The first tone of the melodic gesture, E,is a half-step below the tonic (i.e., a "dissonantlower neighbor") and, moreover, in a metricallyweak position, which calls for a quick resolution.

A Constraint on the Expressive Timing ofMelodic Gesture: EfJidencefrom Perjomuznce and Ae;thetic Judgment 273

It might also be asked whether the ratings ofthe hybrid stimuli reflected the degree to whichthey approached a parabolic timing curve. As theresults for parabolic patterns show, however, whatmatters is not so much the parabolic shape itselfas its parameters. That deviations from aparabolic shape can be tolerated is illustrated bythe ratings for hybrid stimulus H11110 (5.7),which were slightly higher than those for theperfectly parabolic stimulus H11111, alias Q100(5.4). Hl1110 resembles the "Cortot pattern," andCortot may have taken advantage of listeners'tolerance for variations in the final 101. Theresulting timing shape is only moderatelyacceptable, however, which matches the author'simpression from listening to Cortot's recordings(whatever other qualities they may have).

GENERAL DISCUSSIONThe present results are limited in a number of

ways, which will be discussed below. Within theselimitations, however, they provide a clear indica­tion of a constraint on performance timing that isshared by expert performers and musically accul­turated listeners. While it may be perfectly obvi­ous to some theorists that such constraints mustexist, their objective demonstration and character­ization has rarely been undertaken before. The lo­cal constraint examined here is flexible enough topermit a large variety of concrete timing patterns;yet there is reason to believe that, in a specificmusical context, a single pattern may be optimal.Because of the contextual timing variation inher­ent in different performances, the evidence for op­timality comes from the perceptual data alone.For the specific musical excerpt presented here,the timing shape labeled Q40 seemed to be best,on the average.

It is necessary to discuss now what possiblegenerality this finding may have. Three major is­sues concern individual differences in preferencesand experience, the specific stimulus conditions ofthe experiment, and the specific musical excerptselected.

Individual differences among listeners did exist,of course, as they do in nearly all psychologicalstudies. However, the high levels of significance ofsome of the effects obtained suggest considerableagreement. More extensive replications ofjudgments per subject would be needed tointerpret individual differences. A fewobservations are offered here: All subjects but onegave some of their highest ratings to parabolicpatterns of the normal type; the exception was aprofessional pianist who gave her highest ratings

to stimuli HI1110 (the Cortot-like pattern),H11010, and Q40R, which all shared an initialaccelerando but had a reduced ritardando at theend. Among the normal parabolic patterns, mostsubjects' preference fell on patterns with lowercurvature (Q20, Q40, or Q60), though twosubjects, both accomplished pianists, preferredthose with higher curvature. One subject,interestingly the youngest in the group (an 11·year old girl who studies the piano), did notdifferentiate much among the different degrees ofcurvature, though she clearly preferred thenormal patterns over the left- and right-shiftedones. How the internal standards by which suchpatterns are judged are acquired in the course ofmusic education is of course a very interestingquestion for future research.

That musical experience is a sine qua non forreliable performance in the experimental task wasdemonstrated convincingly here. The precisenature of the necessary experience is less clear,however. The subject sample did not includeindividuals who cannot play an instrument butlisten extensively to classical music; the musicallyexperienced subjects were all instrumentalists ofvarying degrees of proficiency. The severalprofessional pianists in the group, who surely hadthe most extensive musical education, actuallywere not the most reliable judges. It is entirelypossible that professional musicians' criteria areless (lXed than those of amateurs and ordinarymusic lovers, because constant interaction withother musicians as students, ensemble players,and teachers may encourage tolerance of a largevariety of interpretative nuances.

It seems unlikely that specific knowledge of"Traumerei" and exposure to performances of thismusic in the past played a significant role in sub­jects' judgments. Most subjects were very familiarwith the piece, but some were not. One subject, aflutist, indicated that she did not know it at all;two others, who are string players, and the 11·year-old pianist indicated they were "fairly" famil­iar with it. Yet, these subjects gave reliable judg­ments consistent with those of the other musi­cians. The more important argument is one ofplausibility: Although some surface characteristicsof previously heard performances may well be partof the memories of familiar pieces of music, per­formance rules must also be stored in a more ab­stract form, so as to be applicable to music neverheard before. Musically experienced listenerssurely can judge the performance quality of novelmusic in a familiar style, just as performers cansightread new music with good expression, again

274

provided the style is familiar (as it is in the case ofany piece from the Romantic period).

Subject variables thus do not seem to impose aserious limitation on the generality of the presentresults. Stimulus variables are more of a problem.There are at least three factors that may influencesubjects' judgments but were kept constant in theexperiment. One is the melodic and harmonic con­tent of the musical excerpt. As pointed out in thesection on performance measurements, themelodic gesture under examination occurs sixtimes in "Traumerei," and only two of these in­stances are exactly identical. As Figure 2 showed,the average timing curve of the excerpt used inthe rating task (which occurs in bars 1-2 and 17­18) has only moderate curvature, comparable tothe Q40 stimulus. A lower curvature was typicalof the variants in the middle section of the piece,while high curvatures were mainly associatedwith the last instance preceding the fermata. Thusthe listeners indeed preferred the curvature ap­propriate to the excerpt offered, but they may wellprefer a different curvature for other variants.However, the preference for normal parabolicshapes should hold across all variants.

A second factor is the timing (and the impliedtempo) of the context in which the critical melodicgesture was presented. The lOIs of the precedingmusical events were set somewhat arbitrarily atthe geometric means of the performance sample.It is possible, even likely, that a different choice oflOIs for the context would have influencedsubjects' preferences. For example, if the lOis hadbeen longer (implying a slower tempo), listenersmay have opted for a more curved or elevatedtiming function. This would be interesting to testin future experiments. As it was, however,listeners were presented with an averagecontextual timing pattern, and they preferred acurvature that also corresponded to the average,which seems appropriate. Their generalpreference for normal parabolas should beindependent of variation in contextual timing.

A third factor is the intensity microstructure ofthe melodic gesture, which was also held fixed. Itwas derived from an individual performance, andits contour may not have been close to theaverage.6 It did constitute a crescendo, however,as marked in the score. Very little is known at thistime about the perceptual interdependence, if any,of timing and intensity microstructure. It isconceivable that a different intensity contourwould change subjects' curvature preferences.Again, however, there is no reason to believe thatthe subjects would prefer atypical timing patterns,

as long as the intensity microstructure stayedwithin the normal range ofvariation.

A fmal consideration is the selection of timingfunctions presented in the experiment. Clearly,there are many possible shapes that were notincluded, mainly because they were expected tosound terrible and might have offended musicallisteners' sensibilities. This is not a seriousomission. On the other hand, it is conceivable thatthere are timing curves superior to Q40 in thisparticular context. The left- and right-shiftedparabolas constituted fairly gross deviations, andthere are other functions closer to the normal onesthat, in a sufficiently sensitive perceptual test,might prove even more highly acceptable. It mustalso be noted that implicit tempo (which isdifficult to quantify in a temporally modulatedperformance; see Repp, 1992) was confoundedwith curvature to some extent, Q100 having aslower tempo than Q20. Listeners' overallpreference for Q40 may have constituted apreference for (contextually appropriate) tempo asmuch as for curvature. This would have to besorted out by varying the constant and quadraticparameters of the timing curves independently.

In summary, consideration of various stimulus­related factors suggests that listeners' preferencefor a particular curvature of the timing functionmay well be context-dependent; however, theirgeneral preference for normal parabolic shapesmost likely is not. It should also be rememberedthat the normal family of parabolic shapes wasderived from a set of performances that variedwidely in the performance parameters (tempo,contextual timing, intensity microstructure)whose possible role in perception was justconsidered. The generality of the parabolicconstraint across this performance variationshould have a parallel in perceptual preferenceacross similar variation.

This leads to the broader question concerningthe generality of the parabolic constraint to otherkinds of melodic gestures and musical styles. Oneobvious limitation is that the constraint canmeaningfully apply only to melodic gestures thathave at least four lOIs. The more lOIs, thestronger the constraint may manifest itself. Repp(1992) examined the timing patterns of threeother melodic gestures in "Traumerei," eachcomprising 4 lOIs near the end of a phrase; they,too, seemed to follow the constraint, but somewhatless consistently than the 5-101 gesture examinedhere. Gestures with less that 4 lOis, of course,cannot violate the parabolic constraint; they aresimply irrelevant to it.

A Constraint on the Expressive Timing 0[Melodic Gesture: Evidencefrom Performance and Aesthetic Judgment 275

Another limitation is that the gestures mayneed to have a ritardando in them. This was truefor all the instances examined by Repp (1992).Moreover, the present results are in strongagreement with the performance and perceptionresults of Sundberg and Verrillo (1980), whofocused on final ritardandi in Baroque music. Theparabolic constraint thus may characterizeritardandi at all levels of the grouping structure,and quite possibly across different musical styles.It may indeed represent a "natural" way ofchanging tempo, including both accelerando andritardando, though the evidence for accelerando islimited to the initial part of the melodic gestureexamined here.

These tempo changes, moreover, must beuninterrupted. This perhaps constitutes the mostserious limitation of the parabolic constraint. Itmay only apply to gestures that are rhythmicallyuniform and do not contain tones that receivespecial emphasis for harmonic or melodic reasons.If so, it characterizes only a small minority of themelodic gestures in a musical piece, though theymay be the most salient ones, which mark theends of major sections. This minority, however,turns into a majority if all short melodic gesturesin which the constraint applies trivially areincluded. It is noteworthy that Todd (1992), in theprocess of extending his coarse-grained model ofexpressive timing (Todd, 1985) to detailed localtiming patterns, has been assuming a linearvelocity function of tempo change for melodicgestures ("segments") of any length, apparentlywith good success. A linear velocity change isequivalent to a quadratic timing function for theraw lOIs during a unidirectional tempo change.Previously, Todd (1985, 1989) presented datasuggesting that the global timing shapes of wholephrases can be modelled by a family of parabolicfunctions. His current, somewhat modifiedconception promises to contitute a valid basis of ageneral performance model.

The parabolic functions used in Repp (1992) andin the present study were empirically derived andmay eventually have to give way to similar buttheoretically motivated functions such as proposedby Todd (1992), provided that they fit the dataequally well. The extramusical origin ofconstraints on performance timing is still a matterof speculation, but it is likely to lie in aspects ofphysical movement that have invaded musicalperformance and ultimately account for thefrequent allusions to "musical motion" in themusicological literature. Although musical motionis often attributed to tonal sequences without

explicitly appealing to performance, it seemslikely that music needs to be set into motion by aperformer, real or imaginary. Once the physicalmovement has entered the music, it will in tum beable to "move" a listener, provided it has theproperties that the sensitive listener is attuned to.The kinds of melodic gestures that are most"moving" in a good performance are probablythose that give the timing constraint a chance toemerge clearly and impress itself on the listener.

REFERENCESFriberg.. A. (1991). Generative rules for music performance: A

formal description of a rule system. Computer Musk Journal, 15,56-71.

Gabrielsson, A., Bengtsson, I., &: Gabrielsson, B. (1983).Performance of musical rhythm in 3/4 and 6/8 meter.Samdintwilln JOlln1l11 ofPsychology, 24, 193-213.

Kronman, U., &: Sundberg.. J. (1987). Is the musical rilard anallusion to physical motion? In A. Gabrielsson (Ed.), Action lindperception in Rhythm and Music (pp. 57-68). Stockholm:Publications issued by the Royal Swedish Academy of MusicNo. 55.

Undblom, B. (1978). Final lengthening in speech and music. In E.Carding, G. Bruce, and R. Bannert (Eds.), Nordic Prosody (pp.85-101). Lund University, Sweden: Department of Linguistics.

Lussy, M. (1882). MusiCIII expression: IIccmts, nuances, and tempo inrocaland instrumental music. London: Novello.

Palmer, C. (1989). Mapping musical thought to musicalperformance. Journal of Experimental Psychology: HumanPerception and Perfomumce, 15,331-346.

Repp, B. H. (1990). Patterns of expressive timing in performancesof a Beethoven minuet by nineteen famous pianists. Journal ofthe Aroustiad Society ofAmeriCII, 88, 622-641.

Repp, B. H. (1992). Diversity and commonality in musicperformance: An analysis of timing microstructure inSchumann's "'Triumerei:' Journal of the AcoustiCllI Society ofAmeriCII, 92(5), 2546-2568

Seashore, C. E. (1938/67). Psychology of music. New York:McGraw-Hill/Dover.

Shaffer, L H. (1981). Performances of Chopin, Bach, and Bartok:Studies in motor programming. Cognitive Psychology,13, 326­376.

Sundberg, J., Friberg.. A., &: Fryden, L. (1991). Threshold andpreference quantities of rules for music performance. MusicPerception, 9, 71-91.

Sundberg.. J., &: Verrillo, V. (1980). On the anatomy of the retard:A study of timing in music. Journal of the AcoustiCllI Society ofAmeriCII, 68, 772-779.

Todd, N. (1985). A model of expressive timing in tonal music.Music Ptrception, 3, 33-58.

Todd, N. (1989). A computational model of rubato. ContempomTyMusic Rmew, 3, 69-88.

Todd, N. (in press). The kinematics of musical expression. Journalofthe AcoustiClll Society ofAmeriCIl.

FOOTNOTES-Music Perception, in press.IThese weaknesses include preselection of performances with

Htypical" ritardimdi, averaging across heterogeneous musicalmaterials, a rather unbalanced and poorly described design inthe perception experiment, and great Variability in listeners'judgments.

276

2That is, as long as the pattern does not lead musically literatelisteners to conclude that a rhythmic mistillce has been made. Inother words, the performance timing pattern must be compatiblewith the notated temporal pattern. There is, of course, a greyarea here in which faithfulness to the score may be a matter ofopinion.

3The remaining two exceptions, both occurring in the perfor­mance of Brazilian pianist Cristina Ortiz, exhibited a relativelengthening of the third 101 instead (Le., a W-shaped pattern).

4This seems to contradict the earlier observation that subjectsdisliked a short first 101 in left-shifted pattern;. Note, however,that the first 101 of stimulus Q20 corresponded to that of

stimulus Q80L (d. Figure ~). Thus, while listeners dislikedabnormally short first lOIs, within the normal range theyptefened short over long first lOIs.

SIt should not be inferred that the subjects were unable to detectthe difference in duration of the final 101 (424 ms). The presenttask was one of MSthetic, not sensory discrimination.

6The methods of deriving and transferring the intensity valueswill not be defended here, as they are still in need of validation.Suffice it to say that the dynamic variation sounded appropriateto the author. An analysis of the intemity microstructure of theentire sample of 28 "Triumerei" performances remains to beconducted.