


Haskins Laboratories Status Report on Speech Research
1994, SR-117/118, 193-209

Acoustics, Perception, and Production of Legato Articulation on a Digital Piano*

Bruno H. Repp

This study investigated the perception and production of legato ("connected") articulation in repeatedly ascending and descending tone sequences on a digital piano (Roland RD-250s). Initial measurements of the synthetic tones revealed substantial decay times following key release. High tones decayed faster than low tones, as they did prior to key release, and long tones decayed sooner than short tones because of their more extensive pre-release decay. Musically trained subjects (including pianists) were asked to adjust the key overlap times (KOTs) of successive piano tones so that they sounded optimally, minimally, or maximally legato. The results supported two predictions based on the acoustic measurements: KOTs for successive tones judged to be optimally or maximally legato were greater for high than for low tones, and greater for long than for short tones, so that auditory overlap presumably remained more nearly constant. For minimal legato adjustments the effect of tone duration was reversed, however. Adjusted KOTs were also longer for relatively consonant tones (3 semitones separation) than for dissonant tones (1 semitone separation). Subsequently, KOTs were measured in skilled pianists' legato productions of tone sequences similar to those in the perceptual experiment. KOTs clearly increased with tone duration, an effect that was probably motoric in origin. There was no effect of tone height, suggesting that the pianists did not immediately adjust to differences in acoustic overlap. KOTs were slightly shorter for dissonant than for consonant tones. They also varied with position in the ascending-descending tone sequences, indicating that the pianists exerted strategic control over KOT as a continuous expressive dimension. There were large individual differences among pianists, both in the perceptual judgment and in the production of legato.

*This research was supported by NIH grant MH-51230. I am grateful to Charles Nichols for his extensive assistance, to Janet Hander-Powers and Nigel Nettheim for comments on an earlier draft, and to the participating pianists for their patience.

INTRODUCTION

A. Modes of articulation

On most musical instruments, successive tones can be produced in two basic modes of articulation: unconnected and connected. In the unconnected mode, perceptible intervals of (what seems to be) silence separate successive tones. On the piano, this is achieved by releasing a key before the next key is depressed. It is appropriate whenever the score indicates a rest or staccato articulation, and also at the ends of phrases or slurs as an aspect of "phrasing." The connected mode is generally referred to as legato articulation. Here the preceding tone seems to end at the same time as the following tone begins. Correspondingly, the pianist releases a key at about the same time as the next key is depressed.

The piano is one of a number of instruments that permit the simultaneous sounding of several tones. This is achieved by depressing several keys at the same time.1 The possibility of this third mode of articulation, the simultaneous or chordal mode, has implications for legato articulation: In order to achieve a very smooth connection between tones, a pianist may release a key after depressing the key for the following tone, so that there is a small amount of overlap. The duration of this overlap (time of key release minus time of key depression for the following tone) will be referred to as key overlap time (KOT) in the following; it is positive when there is overlap, and negative when the key release precedes the following key depression.2
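The KOT definition above can be illustrated with a minimal sketch. The function name and the example timings are hypothetical and are not taken from the study; the sketch only applies the stated rule (release time of one key minus depression time of the next).

```python
def key_overlap_times(notes):
    """Return KOTs (ms) for successive notes.

    `notes` is a list of (key_down_ms, key_up_ms) pairs in playing order.
    KOT = release time of a note minus depression time of the next note:
    positive means the keys overlapped, negative means there was a gap.
    """
    return [
        release - next_down
        for (_, release), (next_down, _) in zip(notes, notes[1:])
    ]


# Hypothetical timings: the first pair overlaps by 40 ms (legato),
# the second pair is separated by a 150 ms gap (detached).
example = [(0, 540), (500, 850), (1000, 1400)]
kots = key_overlap_times(example)
```

Here `kots` holds one value per pair of successive notes, so a sequence of n notes yields n - 1 KOTs.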

A small amount of key overlap is not perceived as a simultaneity but rather as an increased connectedness of the tones. One reason why some amount of key overlap can be tolerated is that the sound level of a piano tone, after a rapid rise, decays as a function of time, so that the end of the preceding tone is usually much softer than the beginning of the following tone. Because of this discrepancy in relative intensity and the abrupt rise time of the following tone, masking may occur, so that a brief acoustic overlap may not be readily detectable.

However, masking is not likely to provide a full explanation because the acoustic overlap of successive piano tones is actually much more extensive than suggested by the KOT. Although the damper extinguishes the string vibrations when a key is released, this process is not instantaneous, and soundboard vibrations and acoustic reverberation in a room may further contribute to prolonging a tone's acoustic duration. The highest piano strings (usually starting with F#6) do not have any dampers at all. Thus, when two tones are perceived as unconnected (i.e., when KOT is negative), the apparent silent interval between them is at least partially filled by the decaying energy of the first tone, and there may in fact be some acoustic overlap. In typical legato articulation, where one key is released shortly after the depression of the next key, the acoustic overlap may be quite substantial, yet listeners do not complain about hearing simultaneous tones. This is probably due to auditory grouping of a tone's "tail" with its "body," and consequent perceptual segregation of the simultaneous tones, within limits (see Bregman, 1990).3 Here we note only that a short KOT corresponds to a much longer tone overlap time, whose precise duration depends on the intensities and decay rates of the tones and on the ambient noise level.

These considerations give rise to a number of interesting questions about the acoustics, perception, and production of legato articulation on the piano--questions that the present study was intended to address in a preliminary way:

(1) Acoustics: What are the decay characteristics of piano tones, particularly after key release?

(2) Perception: How much key overlap (and consequent tone overlap) can listeners tolerate when judging the articulation of successive tones to be legato, before they start hearing simultaneities? Does experience as a pianist affect these judgments? What factors influence the amount of key overlap listeners are willing to tolerate?

(3) Production: What are the typical KOTs when pianists play legato? Is there variation among individual pianists in this respect? Do pianists vary their timing of key depressions to compensate for factors that affect perceptual overlap? Are there differences in degree of legato between pianists' right and left hands, and between pairs of fingers on each hand?

There is not much previous research addressing these questions; what little is known to the author is summarized in the following sections.

B. Acoustics

The acoustic decay characteristics of sustained (undamped) piano tones are fairly well understood (see, e.g., Martin, 1947; Weinreich, 1990; Wogram, 1990). After a brief rise time and early amplitude peak, the sound decays slowly, typically with an initial faster rate of decay giving way to a slower rate. These two decay rates seem to represent the respective contributions of vertical and horizontal string motion: The vertical component is initially larger but is transmitted to the vertically moving soundboard and hence decays to below the level of the horizontal component within a few seconds (Weinreich, 1990). The decay rate of the vertical component can be altered radically, however, by interactions among the two or three strings struck by the same key (Weinreich, 1990) and by the relative impedance of the soundboard at the vibrating frequencies (Wogram, 1990); thus the two components may not always be clearly distinguishable in the amplitude envelope. Tones above roughly 700 Hz (about F5) seem to have only a single decay rate (Martin, 1947). The fact most relevant to the present study is that the decay rate increases with fundamental frequency: While a low tone such as C2 takes about 4 s to decay by 20 dB (1/10 of the amplitude) on an upright piano, C4 takes only about 2 s, and C6 about 1 s (Wogram, 1990). Because of the complex resonance characteristics of the soundboard and other factors, however, the decay rates of tones close in pitch may differ substantially (Benade, 1990; Wogram, 1990). The precise decay characteristics of individual piano tones thus are largely instrument-specific and need to be measured in any particular experimental context.
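As a small worked check of the figures quoted from Wogram (1990), the sketch below converts a level change in dB to an amplitude ratio and turns the quoted times into average decay rates. The function and variable names are illustrative only, and treating the decay as a single average rate is a simplification of the two-rate behavior described above.

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to an amplitude (pressure) ratio."""
    return 10 ** (db / 20)


def mean_decay_rate(db_drop, seconds):
    """Average decay rate in dB per second over the given time span."""
    return db_drop / seconds


# Approximate times (s) to decay by 20 dB on an upright piano,
# as quoted in the text from Wogram (1990).
time_to_minus_20_db = {"C2": 4.0, "C4": 2.0, "C6": 1.0}
rates = {tone: mean_decay_rate(20, t) for tone, t in time_to_minus_20_db.items()}
```

With these inputs, a 20 dB drop corresponds to an amplitude ratio of 1/10, and the average rates come out as 5, 10, and 20 dB/s for C2, C4, and C6, doubling with each two-octave step.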

The literature offers surprisingly little information about the decay characteristics of piano tones following key release, after the damper falls upon the strings. These post-release decay times are likely to depend on the amplitude of string vibration when the damper touches the strings, on the thickness of the strings, on the weight and surface condition of the damper, and on the velocity of key release (damper lowering), among other things. Moreover, the post-release decay times of tones heard by a listener are probably a good deal longer than those of the strings alone, since they include the decay of (undamped) soundboard vibrations and reverberation. Therefore, it is necessary to measure these times in any specific setting used for research purposes. In the first part of the present study, this was done on the output of an electronic instrument whose sounds presumably had been modeled on a specific piano recorded under specific conditions. The main concern here was not the representativeness of these measurements for pianos in general but an adequate characterization of the specific acoustic environment in which the following perception and production experiments were conducted.

C. Perception

In the large literature on auditory psychophysics, there does not seem to be a single study that investigated listeners' ability to detect the acoustic overlap between the end of one tone and the beginning of another tone. However, there are many studies on the detectability of a silent gap between two pure tones or noise bursts (e.g., Perrott & Williams, 1971; Williams & Perrott, 1972; Collyer, 1974; Fitzgibbons, Pollatsek, & Thomas, 1974; Divenyi & Danner, 1977; Neff, Jesteadt, & Brown, 1982; Shailer & Moore, 1983; Buus & Florentine, 1985; Formby & Forrest, 1991). Unfortunately, none of these studies employed complex tones with decaying amplitudes; amplitude envelopes were rectangular, and gap detection thresholds were generally on the order of a few milliseconds. Many studies have shown that the gap detection threshold increases as the frequency separation of two pure tones increases, and Dannenbring and Bregman (1978) have reported perception of illusory overlap when alternating nonoverlapping tones are sufficiently different in frequency to be allocated to separate auditory streams. The intensity of the sounds seems to have no effect on gap detection, except at very low levels. However, Plomp (1964) demonstrated in a widely cited study that the gap detection threshold increases as the relative sound level of the second of two noise bursts decreases. He attributed this finding to a decay of auditory sensation following the offset of the first noise. This internal decay, in terms of equivalent sound pressure level (dB), was a negative exponential function of time and reached the sensation threshold 225 ms after noise offset, regardless of noise intensity. That is, the internal decay time was constant, but the decay rate varied with sound level.

Plomp's results are cited here to acknowledge that a sound often does not end with its acoustic termination; if its acoustic offset is abrupt, it continues to resonate for some time in the listener's auditory system. When a sound decays over time, however, the physical input generally will determine the auditory sensation, since acoustic decay (in dB) generally is a linear rather than negative exponential function of time (Benade, 1990). Thus, unless the acoustic decay rate is very high, the sensation level due to internal decay of auditory input will generally be below the momentary input level. For this reason (and others), the psychoacoustic literature is not very helpful in predicting what degree of acoustic overlap of piano tones might lead listeners to judge them as connected (legato) or unconnected. Even a controlled psychoacoustic study with ramped complex tones and highly trained listeners would only provide a partial answer. The question is best addressed within the realistic sound environment of piano tones, using listeners with musical experience rather than extensive laboratory training.

A recent study by Kuwano, Namba, Yamasaki, and Nishiyama (1994) moved in that direction. In the initial, more psychoacoustic, part of their research, they synthesized a 13-note melody with sounds whose pressure envelopes were either rectangular or triangular (i.e., linearly decaying).4 The melody was presented at a fixed tempo, and the overlap of the tones was varied by changing their total duration. Musically untrained subjects judged when the tones changed from "separated" to "connected," and from "connected" to "overlapping." The average tone overlap times corresponding to these two boundaries were -56 and 16 ms for steady-state tones but 30 and 104 ms for decaying tones. That is, listeners judged steady-state tones to be optimally connected when they had a brief silent gap between them, whereas decaying tones perceived as connected overlapped by 67 ms, on the average. Apparently, this overlap was not heard as such.
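The 67 ms value quoted above equals the midpoint of the two category boundaries reported for decaying tones. Whether Kuwano et al. derived it in exactly this way is an assumption made here; the sketch below only checks the arithmetic and applies the same midpoint to the steady-state boundaries. The function name is illustrative.

```python
def category_midpoint(lower_boundary_ms, upper_boundary_ms):
    """Midpoint (ms) of the overlap range judged "connected"."""
    return (lower_boundary_ms + upper_boundary_ms) / 2


# Boundaries ("separated"/"connected", "connected"/"overlapping") from the text.
steady_state = category_midpoint(-56, 16)  # negative value = silent gap
decaying = category_midpoint(30, 104)
```

Under this reading, the "connected" range is centered on a 20 ms silent gap for steady-state tones and on a 67 ms overlap for decaying tones.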

In the second, more realistic, part of their study, Kuwano et al. asked a pianist to play the tune on an electronic piano in eight different ways, ranging from "markedly separated" to "extensively overlapping," while keeping the tempo constant. The recordings were subsequently presented to listeners who judged them as "separated," "marginally connected," or "overlapping." Their responses confirmed the pianist's intentions: The performance intended to be "marginally connected" (i.e., legato) also received the highest number of judgments in that category. The average overlap time of successive tones in that performance was 240 ms.

Kuwano et al. calculated overlap times by reproducing the tones of the pianist's performances individually on a MIDI system and measuring the point at which each tone had decayed to 60 dB below its peak level; this point was taken to be the end of the tone, and its distance from the onset time of the originally following tone was the overlap time.5 Although this -60 dB criterion is commonly adopted in measuring reverberation times and, according to Benade (1990), often corresponds to the threshold of audibility for decaying isolated sounds, it may be a rather liberal criterion for tones in context, as the final part of the decaying tone is probably masked by the following tone. Still, in the absence of an independent investigation of masking among piano tones, the estimate of acoustic overlap of legato tones provided by Kuwano et al. is the best currently available. Although the overlap times of tones judged "connected" were much shorter in their first experiment, for reasons that are not quite clear, their second experiment is more representative of actual piano performance.
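The -60 dB procedure described above can be sketched as follows. This is not Kuwano et al.'s implementation: the function names, the sampled dB envelope, and the example values are assumptions chosen only to show how a tone's end point and the resulting overlap time would be obtained under that criterion.

```python
def tone_end_time(envelope_db, step_ms, criterion_db=-60.0):
    """Return the time (ms from tone onset) at which the level first falls
    to `criterion_db` relative to the peak, searching from the peak onward.

    `envelope_db` is a list of levels in dB sampled every `step_ms` ms.
    Returns None if the criterion is never reached.
    """
    peak = max(envelope_db)
    peak_index = envelope_db.index(peak)
    for i in range(peak_index, len(envelope_db)):
        if envelope_db[i] - peak <= criterion_db:
            return i * step_ms
    return None


def tone_overlap_time(onset_ms, end_after_onset_ms, next_onset_ms):
    """Acoustic overlap (ms): end of this tone minus onset of the next one."""
    return onset_ms + end_after_onset_ms - next_onset_ms


# Hypothetical envelope: peak at 0 dB, then a linear decay of 1.5 dB per 10 ms.
envelope = [-1.5 * i for i in range(60)]
end = tone_end_time(envelope, step_ms=10)
overlap = tone_overlap_time(0, end, next_onset_ms=250)
```

In this made-up case the tone reaches -60 dB after 400 ms, so a following tone starting at 250 ms overlaps it by 150 ms. A stricter criterion (e.g., `criterion_db=-40.0`) would shorten the estimated overlap, which is the concern raised about masking in context.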

D. Production

Key overlap times have been measured in several studies of piano performance, though none of them focused specifically on legato playing. Thus, Sloboda (1983) showed that pianists vary KOTs (i.e., articulation) to convey different metrical interpretations of the same sequence of notes.6 In a follow-up study, Sloboda (1985) displayed frequency distributions of KOTs for two pianists playing two tunes. The distributions were bimodal, with one peak around 40 ms and the other somewhere between -100 and -220 ms, corresponding to legato and staccato modes of articulation, respectively.

KOTs were also measured by MacKenzie and Van Eerd (1990) in a study of rapid scale playing at several tempi. When the tempo was 8 tones per second or faster, average KOTs were 5 ms or less; the longest times (28 ms) were observed at the slowest tempo (4 tones per second). KOTs were about twice as long for the right hand as for the left hand; however, this difference was confounded with register, as the two hands played two octaves apart. Rapid scale playing is not conducive to optimal legato articulation, but neither does it permit staccato; thus these data may be representative of "minimal" legato.

Palmer (1989) reported KOTs, even though she defined overlap acoustically as "the overlapping time of two adjacent notes' amplitude envelopes" (p. 335). Since she used a synthesizer with tones that had a fixed 80 ms decay following key release, and since she averaged KOTs across a number of tones varying in articulation, her data are not very revealing from the present perspective. More relevant are average KOT profiles for 10 pianists playing simple tunes, presented by Drake and Palmer (1993). Maximum KOTs were between 0 and 50 ms, presumably reflecting legato articulation. This study also included an analysis of a concert pianist's performance of an excerpt from a Beethoven sonata, but the KOTs were not reported in detail. Interestingly, both Palmer (1989) and Drake and Palmer (1993) found that the KOT was shorter when one of the two tones, especially the first one, was relatively long.

The situation most conducive to legato playing, short of providing explicit instructions, is the expressive performance of a slow, melodious piece. Repp (1994, in press) selectively measured KOTs in two pianists' performances of Schumann's "Träumerei." One pianist (an amateur) generally showed KOTs around 0 ms, whereas the other pianist (a professional musician) showed surprisingly long KOTs, often of several hundreds of milliseconds. Since her performances did not sound abnormal, such extreme overlaps are evidently acceptable to the ear in an expressive performance of complex music, but they may represent an extreme case.7

E. The present study

The production data available so far suggest that legato articulation is most often associated with KOTs of 0-50 ms, though longer overlaps may be observed under some circumstances. None of these measurements were obtained in a situation in which pianists were instructed to play legato. Little attention has been given to acoustic tone overlap and to listeners' perceptual tolerance for such overlap, with the exception of the pioneering study by Kuwano et al. (1994). The present research took an approach similar to theirs, but it focused in more detail on the acoustic characteristics of the tones used and on several factors that may influence KOTs. The study included an acoustic analysis, a perceptual experiment, and a production experiment.

The purpose of the acoustic analysis was to describe the decay characteristics of the digital piano tones used in the subsequent experiments. The analysis was expected to confirm the known fact that high piano tones decay more rapidly than low tones before the key is released. Therefore, the post-release decay times were expected to be shorter for high tones than for low tones. One question of interest was whether the post-release decay times reflect merely a tone's sound level at the moment of key release, or whether there are also differences between high and low tones in their post-release rates of decay.

In the perceptual experiment, the method of adjustment was used to obtain judgments of optimal and marginal legato from musically trained listeners, including pianists. The stimuli were scales and arpeggi varying along three dimensions: tempo, register, and interval step size (correlated in this instance with relative dissonance versus consonance of successive tones). It was predicted that more key overlap would be tolerated at slow tempi and in a high register, because long and high tones are softer at key release than are short and low tones, so that acoustic overlap is both less extensive and less audible. Listeners' sensitivity to these factors may reflect perceptual constancy in terms of some criterion of auditory tone overlap. It was also expected that more key (as well as acoustic) overlap would be tolerated with relatively consonant sequences of tones than with relatively dissonant ones, both for auditory and aesthetic reasons. Because of the higher commonality of partials of consonant complex tones (Kameoka & Kuriyagawa, 1969), their acoustic overlap may be less detectable as well as more pleasing to the ear than the overlap of dissonant tones.

In the production experiment, skilled pianists provided samples of their best legato by playing the scales and arpeggi used in the perception experiment. The question was whether they would adjust their KOTs in response to factors that affect acoustic overlap, or whether legato playing is motorically invariant and insensitive to these factors, even though they may influence perceptual judgments. (This hypothesis will be further qualified below.) Secondary questions concerned the existence of individual differences in KOTs, as well as possible differences between the left and right hands and among pairs of adjacent fingers.

I. DECAY CHARACTERISTICS OF THE DIGITAL PIANO TONES USED

The purpose of the acoustic analysis was to characterize the sound pressure envelopes of the piano tones used in the perception and production experiments, with particular attention to the decay following key release. The analysis was essential because the tones were produced by an electronic instrument about whose sound inventory no detailed specifications were available from the manufacturer. A previous study (Repp, 1993) had focused on their peak sound levels and on aspects of their spectral structure but had not considered their decay characteristics.

It would have led too far to investigate in detail the extent to which these synthetic piano tones were representative of natural piano tones. However, it seemed highly likely that they were modelled on a set of acoustically recorded tones. Thus, their decay probably represented not only the decay of string vibrations but also the decay of soundboard vibrations and acoustic reverberation in some enclosed space. This was as it should be because it corresponds to what a listener hears and because it makes the synthetic tones sound realistic, especially over earphones.

A. Method

The instrument was a Roland RD-250s digital piano using a proprietary "adaptive synthesis" algorithm. "Piano 1" sound was used with a constant MIDI velocity of 40, because this velocity was also used in the subsequent perception experiment.8 To reduce the number of measurements, only every third tone was analyzed in the range from C2 (65 Hz) to C7 (2093 Hz). Four series of tones between these two endpoints were played under the control of a Macintosh IIvx computer, using the Performer MIDI sequencing program. The four series differed in the nominal tone duration, i.e., in the interval between MIDI "note on" (key depression) and "note off" (key release) commands, which was 250, 500, 750, or 1000 ms. The nominal intertone interval (between each MIDI "note off" command and the next "note on" command) was 500 ms.9
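The stimulus series described above can be sketched as a list of timed note events. The sketch is illustrative, not the sequencer file used in the study: it assumes the common convention that middle C (C4) is MIDI note 60, so that C2 is note 36 and C7 is note 96, and it assumes equal temperament with A4 = 440 Hz. All function names are hypothetical.

```python
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def note_name(midi_note):
    """Name a MIDI note, assuming the convention that C4 is note 60."""
    return f"{NOTE_NAMES[midi_note % 12]}{midi_note // 12 - 1}"


def frequency_hz(midi_note):
    """Equal-tempered fundamental frequency, assuming A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)


def build_series(duration_ms, gap_ms=500, velocity=40):
    """Return (name, note_on_ms, note_off_ms, velocity) tuples for every
    third semitone from C2 (note 36) to C7 (note 96)."""
    events, t = [], 0
    for midi_note in range(36, 97, 3):
        events.append((note_name(midi_note), t, t + duration_ms, velocity))
        t += duration_ms + gap_ms  # 500 ms from note off to the next note on
    return events


# One series per nominal tone duration.
series = {d: build_series(d) for d in (250, 500, 750, 1000)}
```

Under these assumptions each series contains 21 tones (C2, Eb2, F#2, A2, ..., C7), and the computed endpoint frequencies round to the 65 Hz and 2093 Hz given in the text.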

The analog output of the digital piano was input to a Macintosh IIci computer using AudioMedia software with a sampling rate of 44.1 kHz. The digitized waveforms were then analyzed using Signalyze software. The RMS amplitude envelope was computed for each tone using a window size of 30 ms. Since the program does not display dB values, all measurements were performed on the RMS amplitude values and subsequently converted into dB (with an arbitrary reference). As the output of the digital piano was deterministic and temporal resolution was very fine, it was not necessary to perform each measurement more than once. Apparent irregularities were double-checked, however.
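A rough sketch of this kind of measurement is given below. It is not the Signalyze algorithm, whose details are not described here: the use of consecutive non-overlapping 30 ms windows, the function name, and the test signal are all assumptions made for illustration.

```python
import math


def rms_envelope_db(samples, sample_rate_hz, window_ms=30.0, reference=1.0):
    """RMS level in dB for consecutive non-overlapping windows.

    The reference is arbitrary, so only level differences are meaningful.
    Silent windows are reported as -inf.
    """
    window = max(1, int(sample_rate_hz * window_ms / 1000))
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        frame = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in frame) / window)
        levels.append(20 * math.log10(rms / reference) if rms > 0 else float("-inf"))
    return levels


# Hypothetical test signal: one second of a 440 Hz sine whose amplitude drops
# tenfold halfway through, which should appear as a step of about 20 dB.
rate = 44100
tone = [
    (1.0 if n < rate // 2 else 0.1) * math.sin(2 * math.pi * 440 * n / rate)
    for n in range(rate)
]
levels = rms_envelope_db(tone, rate)
```

Because the dB reference is arbitrary, the absolute values in `levels` carry no meaning by themselves; what matters is the difference between windows, such as the roughly 20 dB drop between the first and second halves of the test signal.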

B. Results and discussion

Figure 1 shows pre-release decay as a function of musical pitch, as measured in the 1000-ms tones. The graph shows each tone's peak RMS level, as well as the sound levels 250, 500, 750, and 1000 ms from energy onset.10 Peak level was reached after a variable rise time, which ranged from 24 to 43 ms for C2 to C5 (except for A4, which had an unusually slow rise time of 89 ms) but was much shorter (around 5 ms) from Eb5 on.11 Peak levels were fairly stable between C2 and C6, although there was some variation from tone to tone (as already demonstrated by Repp, 1993). Above C6, peak level decreased as a function of pitch. The initial pre-release decay also increased dramatically at high pitches. Tones below C6 decayed by only a few dB over the first 250 ms, whereas from C6 on there was a very substantial initial decay. Beyond 250 ms, the decay rates varied less dramatically as a function of pitch, except for the tones in the lowest octave, which decayed at a much slower rate. Some individual tones (e.g., Eb4, A5) had amplitude envelopes with irregular characteristics, which may reflect beats caused by slightly mistuned strings in the original piano that served as a model. The sound level of C7 beyond 500 ms was too low to be measured. It is clear from Figure 1 that the pre-release decay increased with pitch, as expected, though there were irregularities in this relationship which presumably reflect the complex acoustics of real pianos.12


The four lower functions in Figure 1 represent the sound levels at the time of key release for the four tone durations employed here. Since this sound level decreased as pitch increased and as tone duration increased, the post-release decay time must likewise have decreased with increasing pitch and increasing duration. This is confirmed in Figure 2, which shows how soon after key release tones of different nominal durations reached 1/10 (-20 dB) or 1/100 (-40 dB) of their peak amplitude.13 Missing data points indicate that the specified level was reached before key release. The post-release decay times were quite substantial. Even by the conservative -40 dB criterion, the lowest tones took about 300 ms to decay, and tones up to C6 took at least 100 ms. Tones above C6 decayed somewhat sooner, though A6 and C7 were abnormal in that they showed much slower decay when released early than when released late. The reason for this anomaly was not clear, as these tones should not have been affected at all by key release because of the absence of dampers for the highest strings in a real piano.

Figure 1. Pre-release decay: Relative RMS sound levels of digital piano tones at their peaks and at four time points in their amplitude envelopes (250, 500, 750, and 1000 ms, measured from energy onset), plotted as a function of pitch.

Figure 2. Post-release decay times: Time after key release by which the tones had decayed to -20 and -40 dB of their peak level, plotted as a function of pitch for nominal tone durations of 250, 500, 750, and 1000 ms.

The effects of pre-release decay on the post-release decay time, shown in Figure 2, would have been obtained even if the post-release decay rate had been constant. However, higher-pitched tones also decayed faster than lower tones, not only before but also after key release. Pre-release tone duration, on the other hand, did not seem to have any systematic effect on post-release decay rate; therefore, Figure 3 shows the data averaged over the four nominal tone durations. The figure shows the time it took for the post-release sound level to decay by 20 dB. (This is the time difference between the -20 dB and -40 dB points in Figure 2.) This time decreased from about 180 to 90 ms over the pitch range investigated (i.e., the decay rate increased from about 11 to 22 dB per 100 ms), and the decrease as a function of pitch was much steeper during the lowest octave than during the higher octaves. The points for the two highest pitches have been omitted in the figure because of their much slower post-release decay rates (cf. Figure 2). The lines were fitted by hand to indicate the general trend of the data; again, there are some irregularities.14
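The relation between the 20 dB decay times and the decay rates, and the way a constant rate would produce the pattern of Figure 2, can be checked with a short sketch. The function names and the example levels at key release are hypothetical; the constant-rate model is the simplifying case mentioned at the start of the paragraph above, not a description of the measured tones.

```python
def decay_rate_db_per_s(db_drop, time_ms):
    """Average decay rate in dB per second."""
    return db_drop / (time_ms / 1000)


def post_release_time_ms(level_at_release_db, criterion_db, rate_db_per_s):
    """Time after key release needed to reach `criterion_db` (re peak level),
    assuming a constant post-release decay rate.

    Returns None if the level is already at or below the criterion at release,
    which corresponds to a missing data point in Figure 2.
    """
    if level_at_release_db <= criterion_db:
        return None
    return (level_at_release_db - criterion_db) / rate_db_per_s * 1000


low_rate = decay_rate_db_per_s(20, 180)   # lowest tones: 20 dB in about 180 ms
high_rate = decay_rate_db_per_s(20, 90)   # highest tones shown: 20 dB in about 90 ms
```

The two rates come out at roughly 111 and 222 dB/s. With a hypothetical release level of -10 dB re peak, `post_release_time_ms(-10, -40, high_rate)` gives 135 ms, and a tone that has already fallen below -40 dB before release returns `None`.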

[Figure 3 plot omitted: time to decay by 20 dB (roughly 80 to 200 ms) as a function of pitch.]

Figure 3. Post-release decay rates: The time it took for tones to decay by 20 dB following key release.

Although there seem to be no data in the literature to compare the present results with, the complex and somewhat irregular acoustic characteristics of the present tones suggest that they were indeed modelled after acoustically recorded piano tones. It should be noted, however, that natural piano tones exhibit much greater variation in peak sound level (Repp, 1993); some kind of equalization must have been applied in the proprietary synthesis scheme that generated the digital tones. To what extent the present findings are representative of the decay characteristics of natural piano tones remains to be determined. However, they adequately describe the acoustic environment within which the following experiments were conducted.

II. PERCEPTUAL JUDGMENT OF LEGATO ARTICULATION

The purpose of this experiment, as already indicated, was to investigate the influence of three factors (register, tempo, and relative consonance) on the amount of key overlap perceived as legato. An adjustment task was used to determine the overlaps judged to represent the "best" as well as "minimal" and "maximal" legato. It was expected that listeners would tolerate more key overlap in conditions where there is less acoustic overlap due to shorter post-release decay times, namely in the high register and at a slow tempo (long tone durations). Furthermore, it was predicted that more key (and acoustic) overlap would be tolerated for relatively consonant than for dissonant tones. (The consonant tones were also more widely separated in pitch, which could lead to the same prediction.) Half of the musically trained subjects were pianists, and it was of interest whether their specific experience would be reflected in different criteria for, and/or reduced variability of, their adjustments.

A. Method

1. Subjects. Fourteen paid volunteers participated. All but one were Yale undergraduates; they ranged in age from 18 to 25. All subjects were musically trained, having received between 8 and 15 years of formal instruction on at least one instrument. Seven were pianists (though several of them played a second instrument as well); the others played various instruments including violin, cello, double bass, oboe, trumpet, and guitar.

2. Materials and procedure. An interactive adjustment task was set up using the Roland RD-250s digital piano interfaced with a Macintosh IIvx computer. The control program was written using the Max graphic programming environment. The program created various random orders of 24 tone sequences resulting from the combination of three registers, four tempi, and two step sizes (or degrees of relative consonance). All tones had a constant MIDI velocity of 40. Each sequence consisted of a continuously ascending and descending scale or arpeggio based on five different tones. The three registers (low, medium, high) represented starting frequencies of C2 (65 Hz), C4 (262 Hz), and C6 (1047 Hz), respectively. The four tempi represented tone inter-onset intervals (IOIs) of 260, 519, 779, and 1039 ms.15 The two step sizes were 1 and 3 semitones (st), so that the tone sequences represented either a short chromatic scale extending over 4 st (a major third) or a diminished-seventh-chord arpeggio extending over one octave. Tones separated by 1 st formed the highly dissonant interval of a minor second, whereas tones separated by 3 st formed the moderately consonant interval of a minor third. Sequences were started and stopped by clicking START and STOP "buttons" on the computer screen. Nominal tone duration (key release time, and hence also KOT) was controlled by a horizontal "slider" on the screen that could be dragged or clicked with the mouse. Each sequence started with the slider in the left-most position, corresponding to a nominal tone duration of 150 ms, which made the tones sound definitely unconnected. Nominal tone durations controlled by the slider ranged from 150 to 1500 ms in 10-ms steps; they were not displayed numerically.
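The stimulus design described above can be sketched in a few lines. This is an illustrative reconstruction, not the original Max program: the names are hypothetical, and the MIDI note numbers assume the common convention in which C4 (262 Hz) is note 60 and tuning is equal-tempered with A4 = 440 Hz.

```python
import itertools
import random

# Starting pitches as MIDI note numbers (assumed convention: C4 = 60).
REGISTERS = {"low": 36, "medium": 60, "high": 84}  # C2, C4, C6
IOIS_MS = (260, 519, 779, 1039)
STEP_SIZES_ST = (1, 3)


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note, assuming A4 (note 69) = 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def one_cycle(start_note, step_st, n_tones=5):
    """One ascending-descending cycle; repeating it gives a continuous sequence."""
    up = [start_note + i * step_st for i in range(n_tones)]
    down = up[-2:0:-1]  # descend without repeating the top or bottom tone
    return up + down


def random_block(seed=None):
    """All 24 register x IOI x step-size combinations in random order."""
    conditions = list(itertools.product(REGISTERS, IOIS_MS, STEP_SIZES_ST))
    random.Random(seed).shuffle(conditions)
    return conditions
```

For example, `one_cycle(60, 3)` yields the diminished-seventh arpeggio spanning one octave from C4, and `one_cycle(60, 1)` the chromatic fragment spanning a major third.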

Subjects sat at the computer and listened binaurally to the output of the digital piano over Sennheiser HD540II earphones. The volume was set at the same comfortable level for all subjects. After receiving written instructions and a few minutes of free practice, subjects completed four blocks of 24 trials each, with short breaks in between. Each block was initiated by the experimenter who reset the program, which then generated a new random sequence of the same 24 trials. The current trial number was displayed on the computer screen. The subject initiated each trial by clicking the START button and terminated it by clicking the STOP button after adjusting the slider according to the criterion specified. The program stored the slider settings, and the experimenter saved them in a file at the end of each block. Each block took between 10 and 15 minutes to complete.

Instructions were different for each of the first three blocks. In the first block, the subject was asked to adjust the slider so as to find the "best" legato. (S)he was advised to move the slider slowly to the right until the tones sounded not only connected but unacceptably overlapping, and then to reverse direction and try to "zero in" on the optimal legato setting by moving the slider back and forth over a narrower region. (S)he was warned not to "overshoot" the target zone on the slider when the tempo was fast.16 In the second block, the subject was asked to zero in on the boundary between unconnected and connected tones, and to find the setting that was just barely acceptable as legato ("minimal" legato). In the third block, the subject was asked to find the highest (right-most) setting that was still acceptable as legato, before the tones started to sound noticeably overlapping ("maximal" legato). The fourth block replicated the first, the task being again to find the best legato.

3. Data analysis. KOTs were determined by subtracting the IOI durations from the nominal tone durations adjusted by the subjects. In a few instances of overshoot, where the KOT exceeded the IOI (indicating that the subject was oblivious to the complete overlap of successive tones), the IOI was subtracted once again, yielding the KOT of tones separated by one intervening tone. Twelve missing data points (the program occasionally failed to save the last trial in a block) were replaced by the group mean for that particular condition. Repeated-measures analyses of variance were conducted on the data from each block, with the three independent variables (IOI, register, step size) as within-subject factors, and with pianistic expertise as a between-subject factor.
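The KOT computation, including the overshoot correction, can be written as a short function. This is a sketch under the definitions given in the text; the function name is hypothetical.

```python
def kot_from_setting(nominal_duration_ms, ioi_ms):
    """Key overlap time implied by a slider setting.

    KOT is the nominal tone duration minus the IOI. If the result exceeds the
    IOI (an overshoot in which successive tones overlap completely), the IOI is
    subtracted once more, giving the KOT relative to the tone after next.
    """
    kot = nominal_duration_ms - ioi_ms
    if kot > ioi_ms:
        kot -= ioi_ms
    return kot


# The left-most slider position (150 ms) at the fastest tempo (IOI = 260 ms)
# corresponds to a nominal gap of 110 ms, i.e., a KOT of -110 ms.
print(kot_from_setting(150, 260))
```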

B. Results and discussion

The results are summarized in Figure 4. Consider first the "best" legato adjustments, which are shown averaged over blocks 1 and 4. There was substantial between-subject variability. (Note that standard errors are shown, not standard deviations.) The variability increased with register, step size, and IOI. Nevertheless, there were a number of reliable effects. First, as predicted, KOT increased with register [F(2,24) = 25.52, p < .0001]; the average KOTs were 14, 62, and 139 ms, respectively. Second, also as predicted, KOT increased as a function of step size or relative consonance [F(1,12) = 20.96, p < .0007]; the averages were 48 and 95 ms for 1- and 3-st step sizes, respectively. The effect of step size was reduced in the high register, which was reflected in a two-way interaction [F(2,24) = 3.65, p < .05]. Third, there was also a main effect of IOI [F(3,36) = 3.57, p < .03], though it was smaller and nonmonotonic; the respective average KOTs were 49, 84, 83, and 70 ms. The nature of the IOI effect varied with both register and step size; the triple interaction was significant [F(6,72) = 2.43, p < .04]. Two additional significant effects were the main effect of blocks [F(1,12) = 6.59, p < .03], due to shorter KOTs (by 20 ms) in block 4 than in block 1, and the block by step size interaction [F(1,12) = 5.42, p < .04], due to a reduced step size effect in block 4. Variability was also reduced in block 4. Pianistic expertise had no effect at all.


[Figure 4 plot omitted: three panels (low, middle, and high register), each showing adjusted KOT for the "Max", "Best", and "Min" legato criteria as a function of inter-onset interval (ms) at 1-st and 3-st step sizes.]

Figure 4. Adjusted KOT in three conditions as a function of register, step size, and IOI. The bars represent plus/minus one standard error.

The results for "maximal" legato judgments were basically similar, except that the adjusted KOTs were longer and the effect of IOI was stronger and more nearly monotonic. Variability was very high and increased with register and IOI, though not with step size. The effect of register was pronounced [F(2,24) = 21.45, p < .0001], with average KOTs of 87, 162, and 235 ms, respectively. The predicted effect of step size (or consonance) was present [F(1,12) = 8.91, p < .02], though not very large; the average KOTs were 137 and 185 ms, respectively. The step size effect was again absent in the high register, although the two-way interaction fell short of significance. The effect of IOI was highly significant [F(3,36) = 19.20, p < .0001], due to a negatively accelerated increase in KOT with IOI (80, 164, 198, and 204 ms, respectively). There was a significant IOI by register interaction [F(6,72) = 3.71, p < .003], due to larger effects of IOI as register increased. There was no effect of pianistic expertise.

Adjustments of "minimal" legato exhibited much less variability. The average KOTs were negative, indicating that nominal gaps of up to 100 ms can still be acceptable as legato, depending on the condition. The most striking effect here was that of IOI [F(3,36) = 25.22, p < .0001], though it was inverted: KOT decreased (i.e., nominal gap time increased) as IOI increased, the average durations being -26, -50, -89, and -107 ms, respectively. There was also a significant effect of step size [F(1,12) = 9.40, p < .01] in the predicted direction, with average KOT being less for chromatic scales (-76 ms) than for diminished-seventh-chord arpeggi (-60 ms). Finally, there was an effect of register [F(2,24) = 3.85, p < .04], though it was not monotonic: KOTs were shortest for low tones (-81 ms) and longest for medium-pitched tones (-57 ms), with high tones in between (-66 ms). No other effects reached significance.

From Figure 4 it is clear that the range of KOTs acceptable as legato increases dramatically with IOI and with register. Since several degrees of connectedness are probably discriminable within the larger ranges (though the relevant perceptual experiments remain to be conducted), the results imply that, the slower the tempo and the higher the register, the greater the variety of possible legato nuances. The opposite effects of IOI on the upper and lower boundaries of the legato range probably account for the irregular effect of IOI on "best" legato judgments, which lie approximately in the center of the range in the low and middle registers, but closer to the upper boundary in the high register.

The effect of IOI on "maximal" legato judgments was as expected, with increasing KOTs being tolerated as IOI increased. This is almost certainly due to the lower sound levels at release and the shorter post-release decay times of long tones, which result in reduced acoustic and auditory overlap with the following tone. It is noteworthy that the IOI effect was largest between 250 and 500 ms and smallest or absent between 750 and 1000 ms. This agrees with the faster initial decay of piano tones, which takes place during the first 500 ms or so (see Figure 1).

The strong effect of register for both "best" and "maximal" legato judgments was also in the predicted direction, with the largest KOTs in the high register and the smallest in the low register. This is consistent with the much faster decay of high than low tones, both before and after key release. Unlike the effect of IOI, the register effect was of similar magnitude in "best" and "maximal" legato judgments.

The predicted effect of step size was obtained for both "best" and "maximal" legato judgments, but it was virtually absent in the high register. This interaction is compatible with an interpretation in terms of relative consonance. The relative dissonance of complex tones has been attributed to the interaction of individual partials (Plomp & Levelt, 1965; Kameoka & Kuriyagawa, 1969), and since high piano tones have fewer significant partials than low tones, they are likely to show fewer such interactions and hence smaller effects of perceived dissonance. It is difficult to see how such an interaction could have arisen from fundamental frequency separation alone. The fundamentals of tones separated by 3 st were well within the auditory filter bandwidth (Moore & Glasberg, 1983) in the low register but were separated by more than one critical band in the high register, whereas frequencies separated by 1 st were always within the same critical band. If anything, this should have led to a larger step size effect in the higher register.
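The bandwidth argument can be checked roughly with a small calculation. The sketch below uses the polynomial for the equivalent rectangular bandwidth (ERB) that is commonly attributed to Moore and Glasberg (1983); the coefficients are quoted from memory and should be verified against the original source, and the ERB is only a stand-in for the "critical band" referred to in the text. The function names are hypothetical.

```python
def erb_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency f_hz.

    Polynomial commonly attributed to Moore & Glasberg (1983), with the
    frequency expressed in kHz (an assumption to be verified).
    """
    f_khz = f_hz / 1000.0
    return 6.23 * f_khz ** 2 + 93.39 * f_khz + 28.52


def separation_in_erbs(f_low_hz, step_st):
    """Separation of two fundamentals step_st semitones apart, in ERB units."""
    f_high_hz = f_low_hz * 2.0 ** (step_st / 12.0)
    centre_hz = (f_low_hz * f_high_hz) ** 0.5  # geometric mean of the two tones
    return (f_high_hz - f_low_hz) / erb_hz(centre_hz)


for label, f0 in (("C2", 65.41), ("C6", 1046.5)):
    for step in (1, 3):
        print(label, step, "st:", round(separation_in_erbs(f0, step), 2), "ERB")
```

Under these assumptions, a 3-st step from C2 spans well under one ERB, a 3-st step from C6 spans more than one, and 1-st steps stay below one ERB in both registers, in line with the description above.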

The "minimal" legato judgments essentially represent gap detection thresholds. Of course, these estimates are much less precise than those obtained in typical psychoacoustic experiments, but their ecological validity may be greater. It is interesting to note that the average adjustments never corresponded to a key overlap, not even for high tones whose rapid pre-release decay caused a substantial drop in sound level before the onset of the following tone. Moreover, even though this drop increased with tone duration, more rather than less of a nominal gap was needed to hear a separation between long tones, and this was true regardless of pitch, the effect of register being rather small and nonmonotonic. Thus it was not the case that a sufficiently large drop in the amplitude envelope disrupted perception of connectedness. Rather, it seemed as if long tones sounded inherently more legato than short tones, so that more of a physical separation was required to hear them as disconnected. Although Kuwano et al. (1994) did not investigate the effect of nominal tone duration on perceptual judgments, their measurements of acoustic overlap times in a pianist's production show longer overlaps for long than for short tones when the intention was to play with various degrees of connectedness or overlap, but also longer acoustic gaps for long than for short tones when the intention was to play in a disconnected mode.17 This pattern seems congruent with the present perceptual findings of an increased range of KOTs for long tones and of an inverted effect of tone duration (i.e., IOI) on nominal gap durations.

Kuwano et al. (1994) did not report KOTs but only acoustic overlap times, based on a -60 dB criterion for the end of a tone. According to that measure, the acoustic overlap was less than 170 ms when listeners gave predominantly "separated" responses, about 240 ms when they judged the tones to be "marginally connected," and more than 280 ms when they gave mostly "overlapping" responses. The IOIs were 600 and 300 ms (Kuwano, pers. comm.; the test melody contained both quarter and eighth notes), the pitch steps varied between 2 and 5 st, and the pitches ranged from F4 to F5. The present 260 and 519 ms IOI conditions in the middle register with a 3 st step size come closest to their stimuli. The average post-release decay time to -60 dB (extrapolated from the -20 and -40 dB points in Figure 2) of the relevant tones is roughly 250 ms. This implies acoustic overlaps of about 230 ms for "minimal" legato, 340 ms for "best" legato, and 400 ms for "maximal" legato. If the "marginally connected" category of Kuwano et al. is equated with the present "minimal" legato, then the data seem in agreement. It seems, however, that the present "best" legato stimuli would have been judged by their listeners as "overlapping," and the present "maximal" legato stimuli, as "extensively overlapping." There are a number of differences between the studies, however, that could account for this apparently lower tolerance for overlap in their subjects, such as their use of musically untrained subjects. Also, their piano tones may have had decay characteristics different from those of the present tones; unfortunately, they did not describe those characteristics.
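The comparison with Kuwano et al. rests on two steps: extrapolating the post-release decay time to a -60 dB criterion, and adding that decay time to the KOT. A minimal sketch follows. The function names are hypothetical, the linear (in dB) extrapolation is an assumption about how the extrapolation was done, and the -20 and -40 dB input times are invented so that the result comes out at the 250 ms quoted in the text. The KOTs used in the example are simply those implied by the quoted overlaps (overlap minus 250 ms), not values reported separately.

```python
def extrapolate_decay_time_ms(t_minus20_ms, t_minus40_ms, criterion_db=-60.0):
    """Linearly extrapolate (in dB) the time to reach criterion_db
    from the times at which the -20 and -40 dB points were reached."""
    ms_per_db = (t_minus40_ms - t_minus20_ms) / 20.0
    return t_minus40_ms + (-40.0 - criterion_db) * ms_per_db


def acoustic_overlap_ms(kot_ms, decay_time_ms):
    """Acoustic overlap = key overlap time + post-release decay time."""
    return kot_ms + decay_time_ms


# Hypothetical -20 and -40 dB times chosen to give a -60 dB time of 250 ms.
decay_60 = extrapolate_decay_time_ms(50.0, 150.0)

# KOTs implied by the overlaps quoted in the text (overlap minus 250 ms).
for label, kot in (("minimal", -20.0), ("best", 90.0), ("maximal", 150.0)):
    print(label, acoustic_overlap_ms(kot, decay_60))
```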

On a more general level, the present data agree with their findings in demonstrating that a substantial acoustic overlap can be tolerated by listeners before they complain about simultaneity of pitches. However, the results should not be taken to imply that the overlap is not detectable as such, even though the end of the "tail" of the decaying tone is almost certainly masked by the more powerful following tone. To assess the auditory detectability of overlap and the masking between simultaneous complex tones, precise psychoacoustical experiments are required. What matters more than detectability in a musical context, however, is perceptual and aesthetic tolerance within a specific instrumental environment and an associated performance tradition.

Because of masking between and/or perceptual segregation of simultaneous tones, it seems unlikely that any perceptual criterion (either "best" or "maximal" legato) corresponds to a fixed amount of acoustic overlap. This possibility was explored briefly by adding the post-release decay times determined in the first part of this study to the KOTs found in the second part. As expected, there was no constancy overall, though low-pitched tones judged to be maximally legato all overlapped by about the same amount. The listeners' responses in the adjustment task may be taken as prima facie evidence for some auditory constancy in terms of degree of connectedness. However, there may be no simple acoustic correlate of this constancy.

A final word is in order about individual differences. There was no difference between experienced pianists and nonpianists, but there was large variability within each group. The most unusual subject was a guitarist whose adjusted KOTs were all negative, even for "maximal" legato. Apparently, he was acutely sensitive to acoustic tone overlap and employed a criterion appropriate for his own instrument. Of course, he contributed strongly to the variability among the nonpianists. But even when his data were excluded, there was no clear difference between the two groups, and pianists themselves apparently had widely divergent criteria for what counted as a good legato. This may be related to individual differences in average KOT during legato articulation, an aspect of a pianist's "touch." The final part of this study investigated this production aspect of legato playing.

III. PRODUCTION OF LEGATO ARTICULATION BY PIANISTS

The purpose of this experiment was twofold. One aim was to measure the KOTs pianists produce when they intend to play optimally legato and to assess the magnitude of individual differences, as well as possible differences between right and left hands and between pairs of fingers. The second aim was to investigate whether pianists adjust, consciously or subconsciously, to factors that affect acoustic tone overlap and perceptual judgments of legato style, as demonstrated in the previous experiment. The pianists were asked to play scales and arpeggi like those used as stimuli in Part II, on the same instrument. Thus they were operating under similar acoustic conditions, and even though the sounds were synthetic and the keyboard felt different from that of a real piano, it was expected that, if adjustments in legato playing occur in response to acoustic factors on a real piano, they would also occur on a digital piano.

The same three factors as in Part II were varied in the materials, and the predictions were the same: Pianists were expected to show longer KOTs in a high register than in a low register, at a slow tempo (long IOIs) than at a fast tempo (short IOIs), and in a relatively consonant than in a dissonant sequence of tones. A complicating circumstance, however, was that these factors, tempo and step size in particular, may also have purely motoric consequences that are independent of the auditory feedback about tone overlap. Thus it seems that a smooth legato is more difficult to achieve (and perhaps also aesthetically less desirable) at a fast than at a slow tempo; if so, this effect reinforces the prediction based on acoustic considerations, but results supporting the prediction then cannot be attributed to a single cause. The situation is different with regard to step size: It may be more difficult to achieve a smooth legato when the fingers are spread (3 st step size) than when they are close together (1 st step size); if so, this effect counteracts the predicted effect of relative consonance, so that attribution of an obtained effect to acoustic-aesthetic or motoric causes is possible. Only a change of register seems to have no obvious motoric implications, as long as each hand stays within its typical range on the keyboard. Thus an effect of register on KOT would provide the best evidence for an adjustment to acoustic conditions.

Individual differences among pianists in average KOT were of interest because they may reflect an aspect of the elusive quality of "touch." Since in much music the right hand plays the melody and the left hand the accompaniment, it was also considered possible that legato style is better developed in the right hand, leading to longer KOTs. As noted in the Introduction, MacKenzie and Van Eerd (1990) observed such a hand difference in rapid scale playing, but the fact that the right hand played in a higher register was not controlled for. In the present design, to avoid awkwardness, the low register was played only with the left hand, and the high register only with the right; however, the middle register was assigned to either hand, thus making a direct comparison possible. Finally, it was hypothesized that there might be more overlap between the "weak" fourth finger and its neighbors than between the more independent first three fingers.

A. Method

1. Subjects. The subjects were eight highly skilled pianists. Four of them (all female) were graduate students at the Yale School of Music; three (all male) were seniors in Yale College and, as winners of the annual undergraduate concerto competition, had performed as soloists with the Yale Symphony Orchestra; and one (female, older than the others) was a semi-professional accompanist. They were paid for their participation.

2. Materials. The materials were very similar to those used in Experiment 1, except that they were shown in musical notation on separate sheets of paper. Each scale or arpeggio ascended and descended three times and ended on a long note. The step size was 1 or 3 st, and the starting pitch was C2, C4, or C6. Two identical versions were prepared for the middle register, one marked "with the left hand" and the other "with the right hand." The low register examples were to be played with the left hand, and the high register examples with the right hand. Three tempo conditions were indicated by the note values: sixteenth notes, eighth notes, or quarter notes. All examples were to be played at about 60 beats per minute, so that the tempi corresponded to IOIs of approximately 250, 500, and 1000 ms, respectively.
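The correspondence between note values and IOIs follows directly from the metronome rate. The sketch below assumes that the beat at 60 per minute is the quarter note, which is what the quoted IOIs imply; the names are hypothetical.

```python
# Number of notes of each value per quarter-note beat (assumed beat unit).
NOTES_PER_BEAT = {"quarter": 1, "eighth": 2, "sixteenth": 4}


def ioi_ms(note_value, beats_per_minute=60.0):
    """Nominal inter-onset interval for a note value at a given tempo."""
    beat_ms = 60000.0 / beats_per_minute
    return beat_ms / NOTES_PER_BEAT[note_value]


for value in ("sixteenth", "eighth", "quarter"):
    print(value, ioi_ms(value))
```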

3. Procedure. The subject sat at the Roland RD-250s keyboard and listened to its output over Sennheiser HD540II earphones. A metronome flashing silently at 60 beats per minute stood within view. After presenting written instructions and permitting a brief warm-up period on the instrument, the experimenter placed the randomly shuffled 24 music sheets before the pianist, one at a time. (S)he was asked to play each example "with (her or his) best legato" at the tempo indicated but not exactly with the metronome; expressive timing and dynamics, to the extent that the simple materials encouraged them, were welcome. The overall dynamic level asked for was mezzoforte. The fingering was prescribed in the instructions as 5-4-3-2-1-2-3-4-5 for the left hand and 1-2-3-4-5-4-3-2-1 for the right hand.18 The experimenter monitored the playing of all examples to make sure that the instructions were followed. If either the subject or the experimenter was not satisfied with an example, it was repeated immediately. All productions were recorded in MIDI format using Performer software.

4. Data analysis. KOTs were calculated and subjected to repeated-measures analyses of variance. Three separate analyses were conducted: on the left-hand data, on the right-hand data, and on the middle-register data for both hands. The factors in these analyses were step size (2 levels), IOI (3 levels), register or hand (2 levels), fingers (8 levels), and repetitions (3 levels), with subjects as the random factor yielding the interactions that served as error terms. Note that the fingers factor (i.e., pairs of fingers: 5-4, 4-3, 3-2, 2-1, 1-2, 2-3, 3-4, 4-5 for the left hand, and 1-2, 2-3, 3-4, 4-5, 5-4, 4-3, 3-2, 2-1 for the right hand) was not decomposed into finger pairs (4 levels) and order of fingers within each pair (2 levels), for reasons that will become evident. The repetitions factor was treated as crossed with the other factors, not as nested within examples. Analogous ANOVAs were also performed on each individual pianist's data, in which case repetitions served as the random factor.
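How KOTs follow from recorded key events can be illustrated with a short sketch. It assumes that each tone has already been reduced to a pair of key-depression and key-release times in ms (for instance, from MIDI note-on and note-off events); the function name and the example times are hypothetical.

```python
def key_overlap_times(notes):
    """KOTs for a monophonic sequence of (onset_ms, offset_ms) pairs.

    Each KOT is the release time of one tone minus the depression time of the
    following tone: positive for overlap, negative for a gap.
    """
    return [
        offset - next_onset
        for (_, offset), (next_onset, _) in zip(notes, notes[1:])
    ]


# Three invented tones at an IOI of 500 ms, each released after the next onset.
example = [(0, 560), (500, 1080), (1000, 1450)]
print(key_overlap_times(example))
```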

B. Results and discussion

The main results, averaged across pianists, repetitions, and fingers, are shown in Figure 5. The average KOTs ranged from about 50 to 150 ms across the different conditions. The most striking effect was that of IOI, which was significant in all three analyses [left hand: F(2,14) = 7.93, p = .005; right hand: F(2,14) = 9.68, p < .003; both hands: F(2,14) = 9.47, p < .003]. KOTs increased as the tempo decreased, as was predicted on both acoustic and motoric grounds.


[Figure 5 plot omitted: panels for the low, middle, and high registers at 1-st and 3-st step sizes, showing average KOT for the left and right hands as a function of approximate IOI (ms).]

Figure 5. Legato production: Average KOT as a function of IOI, step size, register, and hand.

There was also an effect of step size or relative dissonance: KOTs were slightly longer for diminished-seventh-chord arpeggi than for chromatic scales [left hand: F(1,7) = 4.14, p < .09; right hand: F(1,7) = 21.76, p < .003; both hands: F(1,7) = 15.11, p = .006]. The effect was reliable only for the right hand, and in the middle register there was a significant interaction between hand and step size [F(1,7) = 5.71, p < .05]. Since there is no obvious motoric reason why arpeggi should be played more legato than chromatic scales, the step size effect probably reflects an adjustment to the relative consonance or dissonance of the successive tones. The greater sensitivity of the right hand to this dimension may be due to its leading role in melodic material.19

There was no significant effect of register for either hand. Although it seems that more overlap occurred in the high than in the low register, this difference may be due to hands rather than register. The absence of a register effect indicates that the pianists did not adapt their legato technique to the acoustic decay characteristics of the tones. The effect of IOI may then also be motoric rather than perceptual in origin.

There was a significant effect of hand in the middle register [F(1,7) = 6.82, p < .04]. KOTs were longer for the right hand (101 ms average overall) than for the left hand (86 ms). Although seven of the eight pianists showed a difference in this direction, it was individually reliable for only three. Note also that, in the middle register, the hand difference was apparently restricted to the arpeggi.

As expected, there were striking individual differences in average KOTs. Individual grand averages ranged from 27 ms to 145 ms. There was no obvious relation to either gender or relative experience of the pianists. Although experienced pianists can undoubtedly change their degree of legato when the music requires it, the fact that they produced such different KOTs in the same experimental situation suggests genuine individual differences in legato technique or "touch."20

The final effect to consider is that of fingers, which is portrayed in Figure 6. It was highly reliable [left hand: F(7,49) = 7.86, p < .0001; right hand: F(7,49) = 13.15, p < .0001; both hands: F(7,49) = 12.93, p < .0001] and was shown by every pianist. Its pattern was not what had been expected, however. Rather than reflecting longer KOTs for less independent pairs of fingers, it represented an interaction between finger pairs and order or, more simply, a main effect of position in the scale or arpeggio: In each upward movement and in each downward movement, KOTs started out long and then decreased until the scale or arpeggio reversed direction. This decrease was fairly linear in scales but apparently nonlinear in arpeggi. The finger by step size interaction was significant for the right hand [F(7,49) = 4.10, p < .002] and for both hands [F(7,49) = 7.68, p < .0001]; for the left hand, the triple interaction with register was significant instead [F(7,49) = 3.36, p < .006]. Although the two hands are shown separately in Figure 6, the interaction of fingers with hand was not significant.


[Figure 6 plot omitted: average KOT for the right and left hands as a function of finger pair, shown by step size.]

Figure 6. Legato production: Average KOT as a function of finger pair, step size, and hand.

The consistency of the finger effect indicates that it represents an important aspect of legato articulation, though its exact cause is uncertain at present. One possibility is that the forearm rotates as the scale or arpeggio is being played, transferring weight from one part of the hand to the other; this larger-scale movement may lag behind the fingers, causing decreased key overlap when moving in a given direction. Another factor that could contribute to the relatively short KOT just before a reversal is that the same finger must be used twice in close succession (2-1-2 or 4-5-4). However, there was no finger by IOI interaction, which would be expected if anticipatory lifting of a finger played an important role. Another possible factor that apparently can be ruled out is dynamic variation. The pianists naturally tended to increase the dynamic level as they went up and to decrease it as they went down a scale or arpeggio. While the absolute and especially the relative sound levels of successive tones could have an influence on the perception and production of KOTs, note that the obtained parallel trends for the ascending and descending halves (Figure 6) are contrary to the opposite trends expected on the basis of dynamics. Most likely, therefore, the finger effect, like the tempo effect, is not perceptual but rather motoric or cognitive in origin.

IV. GENERAL DISCUSSION

The present study combined acoustic analyses, perceptual judgments, and production measurements in an attempt to obtain some basic information about legato playing. Stimulated by the recent demonstration by Kuwano et al. (1994) that tones played and judged to sound connected show considerable acoustic overlap, the present study extended the investigation to focus on several factors that influence the amount of this overlap.

The acoustic analyses demonstrated that the post-release decay times of piano tones decrease as pitch (register) increases and as the duration of the tone increases. These effects are due in part to pre-release decay, which is faster for high than for low tones and more extensive for long than for short tones, and in part to an increase in post-release decay rate with pitch (but not with duration). This information, previously unavailable in systematic form in the literature, led to the prediction that listeners would adjust their perceptual criteria in judging legato articulation, and that pianists would adjust their legato playing, so as to avoid extensive acoustic overlap. That is, KOTs (the directly observable variable in MIDI recordings) were expected to be longer for tones with shorter post-release decay times (i.e., high and long tones).
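As an illustration of how KOT can be read off a MIDI-style recording, the following minimal sketch applies the definition used in this paper (time of key release minus time of key depression for the following tone). The function name, the tuple layout, and the example times are assumptions made for illustration; they are not taken from the study's data or software.

```python
from typing import List, Tuple

# Hypothetical representation: (key depression time, key release time), in ms.
Note = Tuple[float, float]


def key_overlap_times(notes: List[Note]) -> List[float]:
    """Return one KOT per pair of successive tones.

    KOT = release time of tone n minus depression time of tone n+1.
    Positive values mean the keys overlap; negative values mean a gap.
    """
    return [notes[i][1] - notes[i + 1][0] for i in range(len(notes) - 1)]


# Invented example: three overlapping tones followed by a detached one.
notes = [(0.0, 540.0), (500.0, 1030.0), (1000.0, 1480.0), (1500.0, 2000.0)]
print(key_overlap_times(notes))  # [40.0, 30.0, -20.0]
```

The release time of the last tone is not used, because KOT is defined only for pairs of successive tones.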

These predictions were confirmed in the perceptual experiment, which was really a study of "passive" legato production, without involvement of the fingers. Listeners' adjustments were evidently sensitive to acoustic overlap, with KOTs being longer for high and long tones. Whether perceptual constancy in terms of some criterion of auditory (non)overlap was maintained across conditions could not be demonstrated directly, but it may be assumed that the subjects aimed for such a constancy, as this was essentially what the instructions requested them to do. It would be naive to expect a simple measure of acoustic overlap to correspond to such a perceptual constancy, but application of dynamic models of auditory processing may uncover an invariant auditory property in future research.

The results of "active" legato production were different. Although KOTs were longer for long than for short tones, this may be attributed to motoric factors. There was no effect of register; what seemed like one was probably due to a difference between hands. Thus pianists' playing seemed to reflect primarily motoric constraints, not adjustments to varying acoustic overlap. This implies, paradoxically, that the pianists' intended "best" legato might be judged by listeners (even by themselves!) to be nonoptimal in certain conditions, for example at a slow tempo in the low register (compare Figures 5 and 4). A perceptual evaluation of the recorded natural legato samples remains to be conducted. It should be noted that, unlike the stimuli in the perception experiment, the natural productions varied in timing and dynamics, and in KOT from one tone pair to the next.21 These kinds of natural variation may well have an effect on the perceptual criterion for what constitutes a good legato in a melody.

There was one factor that both passive and active legato players were sensitive to: When the successive tones were relatively dissonant (1 st apart), adjusted and produced KOTs were shorter than when the tones were relatively consonant (3 st apart). Although, in principle, pitch separation per se could have been responsible for this difference, the effect was attributed to relative consonance on grounds of plausibility.22 However, the effect may not be purely aesthetic in nature: Due to the larger number of shared harmonics of relatively consonant tones, there may be greater masking and/or fusion between consonant than between dissonant tones (see DeWitt & Crowder, 1987), making acoustic overlap more difficult to detect. There was also a confounding of step size with pitch range, and hence with average pitch and average acoustic overlap: On the average, arpeggi had slightly shorter overlaps than chromatic scales because they covered a one-octave range, whereas the latter extended only over the first 4 st in this range. More detailed investigations will be required to sort out these variables.

Palmer (1989) and Drake and Palmer (1993) reported that KOTs tended to be shorter when one of two successive tones was relatively long, especially the first one. This observation seems to contradict the clear tendency towards longer KOTs for longer tones in the present study. However, their finding was obtained in materials containing both short and long tones, and their pianists were not instructed to play legato throughout. Most likely, their finding reflects the fact that long tones tend to end motives and phrases; the reduced KOT following such tones is an aspect of "phrasing" which was absent in the present homogeneous materials, unless the progressive reduction in KOT during the ascending and descending portions of the scales and arpeggi is to be interpreted in a similar manner.

MacKenzie and Van Eerd (1990) observed longer KOTs for the right hand than for the left hand in rapid scale playing. The present production data, obtained at much slower tempi, tend to support their interpretation of this effect as one of hand, rather than of register. It appears that, in some pianists at least, the right hand is more adept at playing legato than the left hand.

The absolute KOTs reported here are longer than those obtained in most earlier studies. This is probably due to the relatively slow tempi and to the explicit instruction to play legato, though it is also possible that the Roland RD-250s keyboard, with its relatively easy action, encouraged an increased legato. Certainly, replications and extensions of the present findings on a real computer-monitored piano are desirable, together with measurements of the post-release decay times of natural piano tones.

The relative insensitivity of the present pianists to acoustic/perceptual factors should not be interpreted as implying that pianists cannot adjust their degree of legato when musical circumstances require it. On the contrary, they are likely to be quite flexible in that regard. Kuwano et al. (1994) demonstrated that a pianist can play with different degrees of connectedness or separation when instructed to do so, though their results look somewhat categorical, with little difference in acoustic overlap between tones intended to be "marginally connected," "overlapping a little," and "overlapping to some degree." However, these terms are nearly synonymous and may have been so understood by the pianist. Whether there is a tendency to perceive or produce legato in a categorical fashion is an interesting question for future research, as are the many factors that may affect KOT, such as fingering, phrasing, and individual style. All expressive parameters of music performance are subject to continuous and systematic modulation, and KOT is hardly an exception. However, it is a little studied aspect of piano technique, and much remains to be learned about it.

REFERENCES

Benade, A. H. (1990). Fundamentals of musical acoustics. New York: Dover. (Originally published in 1976 by Oxford University Press.)

Bregman, A. S. (1990). Auditory scene analysis. Cambridge, MA: MIT Press.

Buus, S., & Florentine, M. (1985). Gap detection in normal and impaired listeners: The effect of level and frequency. In A. Michelsen (Ed.), Time resolution in auditory systems (pp. 159-179). Berlin: Springer-Verlag.

Collyer, C. E. (1974). The detection of a temporal gap between two disparate stimuli. Perception and Psychophysics, 16, 96-100.

Dannenbring, G. L., & Bregman, A. S. (1976). Stream segregation and the illusion of overlap. Journal of Experimental Psychology: Human Perception and Performance, 2, 544-555.

DeWitt, L. A., & Crowder, R. G. (1987). Tonal fusion of consonant musical intervals: The oomph in Stumpf. Perception and Psychophysics, 41, 73-84.


Divenyi, P. L., & Danner, W. F. (1977). Discrimination of time intervals marked by brief acoustic pulses of various intensities and spectra. Perception and Psychophysics, 21, 125-142.

Drake, C., & Palmer, C. (1993). Accent structures in music performance. Music Perception, 10, 343-378.

Fitzgibbons, P. J., Pollatsek, A., & Thomas, I. B. (1974). Detection of temporal gaps within and between perceptual tonal groups. Perception and Psychophysics, 16, 522-528.

Formby, C., & Forrest, T. G. (1991). Detection of silent temporal gaps in sinusoidal markers. Journal of the Acoustical Society of America, 89, 830-837.

Fowler, C. A. (1984). Segmentation of coarticulated speech in perception. Perception and Psychophysics, 36, 359-368.

Kameoka, A., & Kuriyagawa, M. (1969). Consonance theory part II: Consonance of complex tones and its calculation method. Journal of the Acoustical Society of America, 45, 1460-1469.

Kuwano, S., Namba, S., Yamasaki, T., & Nishiyama, K. (1994). Impression of smoothness of a sound stream in relation to legato in music performance. Perception and Psychophysics, 55, 173-182.

MacKenzie, C. L., & Van Eerd, D. L. (1990). Rhythmic precision in the performance of piano scales: Motor psychophysics and motor programming. In M. Jeannerod (Ed.), Attention and Performance XIII (pp. 375-408). Hillsdale, NJ: Erlbaum.

Martin, D. W. (1947). Decay rates of piano tones. Journal of the Acoustical Society of America, 19, 535-541.

Moore, B. C. J., & Glasberg, B. (1983). Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. Journal of the Acoustical Society of America, 74, 750-753.

Namba, S., Kuwano, S., & Yamasaki, T. (1992). Distinction between legato and staccato styles of rendition in piano playing: An illusion. Presented at the Second International Conference on Music Perception and Cognition, Los Angeles, CA.

Neff, D. L., Jesteadt, W., & Brown, E. L. (1982). The relation between gap discrimination and auditory stream segregation. Perception and Psychophysics, 31, 493-501.

Palmer, C. (1989). Mapping musical thought to musical performance. Journal of Experimental Psychology: Human Perception and Performance, 15, 331-346.

Perrott, D. R., & Williams, K. N. (1971). Auditory temporal resolution: Gap detection as a function of interpulse frequency disparity. Psychonom. Sci. 25, 73-74.

Plomp, R. (1964). Rate of decay of auditory sensation. Journal of the Acoustical Society of America, 36, 277-282.

Plomp, R., & Levelt, W. J. M. (1965). Tonal consonance and critical bandwidth. Journal of the Acoustical Society of America, 38, 548-560.

Repp, B. H. (1993). Some empirical observations on sound level properties of recorded piano tones. Journal of the Acoustical Society of America, 93, 1136-1144.

Repp, B. H. (1994). Relational invariance of expressive microstructure across global tempo changes in music performance: An exploratory study. Psychological Research, 56, 269-284.

Repp, B. H. (in press). Pedal timing and tempo in expressive piano performance: A preliminary investigation. Psychology of Music.

Shailer, M. J., & Moore, B. C. J. (1983). Gap detection as a function of frequency, bandwidth, and level. Journal of the Acoustical Society of America, 74, 467-473.

Sloboda, J. A. (1983). The communication of musical metre in piano performance. Quarterly Journal of Experimental Psychology, 35A, 377-396.

Sloboda, J. A. (1985). Expressive skill in two pianists: Metrical communication in real and simulated performances. Canadian Journal of Psychology, 39, 273-293.

Weinreich, G. (1990). The coupled motion of piano strings. In Five Lectures on the Acoustics of the Piano, edited by A. Askenfelt (Royal Swedish Academy of Music, Stockholm), pp. 73-81.

Williams, K. N., & Perrott, D. R. (1972). Temporal resolution of tonal pulses. Journal of the Acoustical Society of America, 51, 644-647.

Wogram, K. (1990). The strings and the soundboard. In Five Lectures on the Acoustics of the Piano, edited by A. Askenfelt (Royal Swedish Academy of Music, Stockholm), pp. 83-99.

FOOTNOTES

*Journal of the Acoustical Society of America, 97, 3862-3874 (1995).

1Alternatively, the damper pedal may be engaged while depressing keys successively. The present study, however, was not concerned with pedaling.

2The technique of "partial release," which is possible on grand pianos, will not be considered here.

3Sonoko Kuwano (Namba, Kuwano, and Yamasaki, 1992) has pointed out that the overlap is distinctly audible as an "impurity" at the onset of an isolated tone excerpted from a legato passage. Note the parallel to the perception of coarticulation in speech (e.g., Fowler, 1984).

4The spectral composition of the sounds was not reported.

5Although they failed to mention this in their published paper, Kuwano et al. did include the original key release when playing back the individual tones (Kuwano, personal communication). Thus they had information about post-release decay times as well as about KOTs, but only acoustic tone overlap times were reported.

6Sloboda seemed to be unaware of the post-release decay of piano tones, as he stated that "This event [i.e., release of the key] is simultaneous with the termination of the sound" (p. 383).

7Extreme legato is also referred to as legatissimo or "finger pedaling." Nearly continuous use of the damper pedal in these performances further increased the acoustic overlap of tones and may also have increased the tolerance levels of listeners.

8On the Roland RD-250s, the spectral characteristics of the tones change with key velocity (intensity), as they do on a real instrument (cf. Repp, 1993). The choice of the fixed velocity value was arbitrary.

9The author is striving toward a consistent terminology of musical events. Nominal tone duration refers to the MIDI level of description and should be distinguished from (actual, acoustic) tone duration, which includes the post-release decay time, and from note duration (better: note value) which is not a physical magnitude at all but refers to the relative values (integer ratios) of notated symbols, such as 1/4 or 1/8.

10Since the onset of energy in the amplitude envelope started 15 ms (half the duration of the integration window) before the onset of energy in the waveform, these measurement points correspond to 235, 485, 735, and 985 ms, respectively, from energy onset in the waveform.

11The rise time was the time from the onset to the peak of the amplitude envelope, minus 30 ms. This estimate was reasonably accurate because of the fast rise and slow decay of the energy. A signal with zero rise time and no decay (i.e., a step function) would have a rise time of exactly 30 ms (the window duration) in the RMS amplitude envelope.

12In fact, preliminary analyses of acoustically recorded piano tones suggest that their decay characteristics are even more irregular than those of the synthetic tones analyzed here.

13These times, too, were adjusted for the window duration by subtracting 30 ms from the times measured in the envelope. Resolution of amplitude values was not fine enough to determine the -60 dB points, which had been used as the criterion of tone offset by Kuwano et al. (1994). However, those time points can be extrapolated from the differences between the -20 dB and -40 dB time points (see also Figure 3), assuming that the post-release decay (in dB) was linear.

14Very similar results were obtained when, instead of taking the difference between the -20 and -40 dB time points relative to peak sound level, the time to decay by 20 dB was measured relative to the sound level at key release. Additional measurements showed that an increase in key velocity from 40 to 60 increased this decay time by about 10 ms regardless of pitch, but a further velocity increase from 60 to 80 had no effect. Key velocity varied in the legato production experiment, reported below.

15The intended IOIs were 250, 500, 750, and 1000 ms. However, when the sound output was later digitized and measured, it was discovered that the IOIs were consistently too long by 3.9%, due to an unrecognized problem with the Max software. This problem did not affect the accuracy of timing: The standard deviation of IOIs within the same sequence was less than 1 ms, probably representing just human measurement error. The nominal tone durations (key release times) were not affected either.

16Overshoot was possible (and did occur on a few trials) because the range of the slider was fixed; thus, fast tone sequences required an adjustment in the lower (left-hand) region of the slider, whereas slow sequences required an adjustment in the upper (right-hand) region.

17The melody used by Kuwano et al. contained both long and short tones. Presumably, they were referring to the length of the first tone in short-long and long-short sequences.

18This fingering would not be used for an extended chromatic scale, but it is quite natural for a 5-note chromatic scale, though many alternative fingerings are possible.

19This interpretation is very tentative, as there seems to be a step size effect for the left hand in the low register which is just as large as that for the right hand in the high register. However, neither the step size main effect nor the register by step size interaction was significant for the left hand.

20It would be interesting to observe the same pianists' KOTs in more natural playing, without explicit instructions to play legato. In fact, at the end of the experimental session four of the present subjects played a simple monophonic tune three times with their right hand in the middle register, to provide performance data for a different study. Three of them produced KOTs that were 40-50 ms shorter than in the most comparable experimental condition, but the fourth pianist produced times that were about 15 ms longer, on the average. Their "overlap profiles" for the tune were quite dissimilar, possibly due to different choices of fingering. Clearly, there is much more to be learned about the factors that cause systematic variation in KOTs.

21The average MIDI velocity in the pianists' playing was a good deal higher than the fixed velocity in the perception experiment, but this was partially offset by a relatively lower volume of the auditory feedback. As mentioned in an earlier footnote, however, the post-release decay characteristics of the digital piano tones did not vary much with dynamic level.

22It could be argued that the experiment should have been designed to separate these two factors from the outset. However, this is impossible without a radical change in fingering patterns and scale construction, which would raise new problems. For example, a pitch interval larger but more dissonant than the minor third (3 st) is the tritone (6 st), but it would permit only a 3-tone arpeggio with the fingering 1-3-5-3-1. Passing the thumb under the other fingers is undesirable in a study of legato, as it almost certainly reduces KOT.
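The window correction and the linear extrapolation described in footnote 13 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the envelope readings are invented, and the extrapolation holds only if the post-release decay is linear in dB, as the footnote assumes.

```python
WINDOW_MS = 30.0  # RMS window duration stated in footnotes 11 and 13


def correct_for_window(envelope_time_ms: float) -> float:
    """Subtract the window duration from a time measured in the envelope."""
    return envelope_time_ms - WINDOW_MS


def extrapolate_minus_60_db(t20_ms: float, t40_ms: float) -> float:
    """Estimate the -60 dB time point from the -20 dB and -40 dB time points.

    With a decay that is linear in dB, the drop from -40 to -60 dB takes
    as long as the drop from -20 to -40 dB.
    """
    return t40_ms + (t40_ms - t20_ms)


# Invented envelope readings (ms after key release), for illustration only.
t20 = correct_for_window(130.0)  # 100.0
t40 = correct_for_window(230.0)  # 200.0
print(extrapolate_minus_60_db(t20, t40))  # 300.0
```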