Some fluid dynamic aspects of speech*



Avraham HIRSCHBERG

Laboratory for Fluid Dynamics and Heat Transfer, Dept. of Physics, Eindhoven University of Technology
W&S 0-54, Postbox 513, 5600 MB Eindhoven, The Netherlands.

Bulletin de la Communication Parlée n°2, 1992, pp. 7-30

*Paper originally presented at the Fourth Colloquium Signal Analysis and Speech, 22-23 October 1990, held at the Institute for Perception Research, Eindhoven, The Netherlands.

Abstract

The production of sound during phonation is mainly due to the unsteady flow in the vocal tract. While the equations describing this flow are accurately known, the solution of these highly non-linear equations is impossible without the use of an approximation. Aeroacoustics is a science that considers a systematic definition of flow and acoustics allowing an optimal approximation. In many cases common sense yields an approximation which agrees with the results of aeroacoustics. In other cases aeroacoustics clarifies problems induced by an intuitive approach. In the present paper we give a review of the basic concepts of aeroacoustics. We discuss the flow in the vocal tract (vocal cord oscillation, vortex shedding and turbulence). The influence of the monopole, dipole and quadrupole character of the sources of sound on the excitation of the vocal tract is described on the basis of a caricature of the vocal tract. Finally we discuss the aeroacoustics of human whistling in relation to voiced sound production.

Keywords: speech production, fluid dynamics, aeroacoustics, vocal tract excitation sources.

Introduction

Papers by Teager & Teager (1983, 1990) and Kaiser (1983) on non-linear sound production mechanisms and flow in the vocal tract provide us with questions without answers. The aim of the present paper is to give the reader an informal introduction to the subject which indicates which types of questions on the interaction between flow and acoustics (aeroacoustics) may be relevant for speech production research, and in particular which types of questions are tractable, because many questions will be left without an answer.

Some theoretical aspects of the problem have been addressed by McGowan (1988) in a paper that discusses the application to phonation of the formal approach of aeroacoustics as developed by Powell (1964, 1990) and Howe (1975, 1980). Experimental aspects of the flow in the vocal tract have been considered by, among others, Ishizaka & Matsudaira (1972), Gupta et al. (1973), Titze (1988), Rothenberg (1981), Scherer & Titze (1983), Koike (1980), Cranen (1987), Shadle (1985a,b), Barney et al. (1990), Liljencrants (1990), Thomas (1986), Iijima et al. (1988) and Hegerl (1989). Except for the papers of Teager & Teager (1983, 1990) and the studies on human whistling by Shadle (1985a) and Wilson et al. (1971), most experimental studies on the aeroacoustics of phonation focus on the glottal flow. The present paper is an attempt to fill the gap between the formal paper of McGowan (1988) and the experimental data available. I will use current knowledge on systems similar to the vocal tract. For example, much of my own experience on internal flow oscillations (Bruggeman et al., 1991; Hirschberg et al., 1988; Peters et al., 1992) and woodwind musical instruments (Hirschberg et al., 1990; van Zon et al., 1990; Hirschberg et al., 1991; Fabre, 1992) can be used to illustrate fluid dynamic effects which can influence phonation. This should at least partially compensate for my lack of specific experience in phonetics.

I will restrict myself to some fluid dynamic aspects of the problem. In particular I would like to give some general considerations on the character of flow-induced sound sources and the relationship between acoustics and flow in speech production. For a formal approach to aeroacoustics the reader should consult the original papers of Powell (1964, 1990) and Howe (1975, 1980) or the few available textbooks (Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983). I will try to translate some of the crucial specific concepts of aeroacoustics into common physical concepts, and to give a feeling for the use of these general concepts in speech by considering some simple examples: vocal fold oscillation, human whistling and sound production by turbulence. As an excellent formal introduction to the subject is available (McGowan, 1988), I will use a slightly less formal approach. In particular, I will neglect the convective effects on the acoustic wave propagation. This implies that the Green's function, which will be defined in section 2, ignores convective effects. Convective effects are, however, included in the sound source!

The reader should be aware of the limitations of a qualitative approach. Fluid dynamics is governed by essentially non-linear differential equations, and therefore results of particular experiments should be generalised with extreme care. In spite of the fact that the basic equations describing a flow are well known and accurate, the non-linearity of the equations makes an accurate prediction of the flow virtually impossible. Even if a numerical approach is considered, we always have to introduce some approximations.

The uncertainty in the description of the flow is a problem because sound production by flow in the vocal tract is an extremely inefficient process which depends on details of the flow. Typically in free space at velocities $u_0$ of one tenth of the speed of sound $c_0$, about $10^{-9}$ of the energy of the flow is transformed into acoustic energy! Compared to this, voiced sound production is expected to be quite efficient (order $10^{-2}$). This problem is further complicated by the fact that the perception of speech is determined by the details of the spectral distribution of sound and its temporal evolution.

It is also important to note that the mechanism of regeneration of oscillations of the vocal folds may be quite different from the mechanisms which determine the quality of speech. In a musical instrument such as a large oboe (1.5 m pipe) the fundamental oscillation frequency (220 Hz) at which the reed is oscillating is virtually absent in the acoustic far field outside the instrument (Hirschberg et al., 1991). The musically relevant sound consists of higher harmonics, which are very efficiently radiated and therefore do not contribute significantly to the regeneration of the reed oscillation. The low frequency pressure fluctuations which are responsible for the oscillation of the reed are kept within the instrument because the radiation efficiency at low frequencies is very low. One should therefore make a distinction between the problem of vocal cord oscillation and that of the production of speech. The strong correlation between the low frequency oscillations in the sub- and supraglottal pressures observed by Koike (1980) and Cranen (1987) is expected to influence the vocal cord oscillations. In contrast to this, however, the higher frequency behaviour of the supraglottal pressure seems rather independent of the subglottal pressure oscillations. In voiced sound production the high frequencies are much more strongly represented in the supraglottal pressure than in the subglottal pressure (Koike, 1980; Cranen, 1987). In this sense a source/filter model could be justified in spite of the strong low frequency coupling between the sub- and supraglottal systems.

I will start by proposing a definition of “sound” and by discussing in section 2 the relationship between flow and acoustic field. In section 3 I will give an informal discussion of some important concepts of fluid mechanics: vorticity, boundary layers, flow separation, vortices and turbulence. A more accurate treatment of these concepts can be found in textbooks on fluid mechanics. Very useful introductions to the subject are given by Lugt (1983), Tritton (1988), Prandtl & Tietjens (1934), Milne-Thomson (1966) and Batchelor (1967). In section 3, I will also discuss the problem of vocal cord oscillation, which is very closely related to the process of flow separation.

Three basic types of sound sources are essential in phonation:
- the monopole (volume injection) [+]
- the dipole (force) [+ −]
- the quadrupole [± ∓] or [+ − − +]

The periodic volume flow through oscillating vocal cords acts as a monopole on the supraglottal (downstream) part of the vocal tract. Vortex shedding induces an aeroacoustic dipole (Powell, 1964; Shadle, 1985a; Blake, 1986).

Turbulence in free space induces a quadrupole (Powell, 1964; Blake, 1986). The understanding of the influence of the type of sound source on the production of sound is a key element that is absent from the discussion of Teager & Teager (1983, 1990). In section 4, I will explain why the type and position of the source are crucial for sound production. In order to keep the discussion clear I use in this section a caricature of the vocal tract: the vocal tract will be replaced by a closed tube with uniform cross section!

As an example of a dipole source we consider in section 5 a model for human whistling proposed by Shadle (1985a) and Wilson et al. (1971). This is an example of sound production for which the interaction between flow and acoustics is essentially different from the interaction assumed in a source/filter model. We will address the question of whether this type of interaction is also relevant for voiced sound production.

2. Aeroacoustics

2.1 Lighthill's analogy

Sound consists of fluctuations $\rho'$ of the fluid density $\rho$ in the (audio) range of frequencies $f$ between 20 Hz and 20 000 Hz, which propagate as waves with a speed $c$ given by (Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983):

$$c^2 = \left(\frac{\partial p}{\partial \rho}\right)_s \tag{1}$$

where $p$ is the pressure and the derivative is taken at constant entropy $s$. In the absence of mean flow the influence of friction and heat transfer on acoustic waves in a pipe is limited to a region close to the wall, the visco-thermal boundary layers. The thickness $\delta$ of these boundary layers is of order $(\nu/\omega)^{0.5}$, where $\omega$ is the angular frequency and $\nu$ is the kinematic viscosity of air ($\nu = 1.5 \times 10^{-5}$ m²/s) (Lighthill, 1978). Hence $\delta < 0.3$ mm for $f > 20$ Hz. Friction and heat transfer can therefore often be neglected in the bulk of the flow when we consider acoustic wave propagation. This implies that the pressure fluctuations are adiabatic: the entropy $s$ is constant (Lighthill, 1978; Dowling & Ffowcs Williams, 1983). The pressure fluctuations $p'$ corresponding to the density fluctuations are therefore given by $p' = c^2 \rho'$. As the typical pressure fluctuations $p'$ have an amplitude less than $10^{-2}$ of the atmospheric pressure $p_a$ (less than 160 dB), the equations governing the propagation of acoustic waves can be linearised. (In particular situations, such as long propagation distances, a more severe restriction on the amplitude should be applied before linear theory can be used.)
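As a quick numerical check of the two estimates above, the following Python sketch evaluates the boundary layer thickness $(\nu/\omega)^{0.5}$ at 20 Hz and the sound pressure level corresponding to $p' = 10^{-2} p_a$. The atmospheric pressure of $10^5$ Pa, the reference pressure of 20 µPa and the function names are assumptions introduced for illustration, and the quoted amplitude is treated as an rms value.

```python
import math

NU_AIR = 1.5e-5   # kinematic viscosity of air in m^2/s (value quoted in the text)
P_ATM = 1.0e5     # atmospheric pressure in Pa (assumed round value)
P_REF = 2.0e-5    # reference pressure for dB SPL in Pa (assumed standard value)


def boundary_layer_thickness(frequency_hz, nu=NU_AIR):
    """Order-of-magnitude boundary layer thickness (nu/omega)**0.5 in metres."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(nu / omega)


def sound_pressure_level(p_rms):
    """Sound pressure level in dB relative to P_REF."""
    return 20.0 * math.log10(p_rms / P_REF)


delta_20hz = boundary_layer_thickness(20.0)
spl_limit = sound_pressure_level(1.0e-2 * P_ATM)
print(f"boundary layer thickness at 20 Hz: {delta_20hz * 1e3:.2f} mm")
print(f"level for p' = 0.01 p_a: {spl_limit:.0f} dB")
```

Under these assumptions the thickness comes out at a few tenths of a millimetre and the level lies between 150 and 160 dB, consistent with the orders of magnitude quoted in the text.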

Deviations from the linearised wave equation are defined by Lighthill (Powell, 1964; Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983) as aeroacoustic sound sources $Q$. Hence if the listener is placed in a stagnant fluid with a speed of sound $c_0$ we have by definition:

$$\frac{\partial^2 \rho'}{\partial t^2} - c_0^2 \frac{\partial^2 \rho'}{\partial x_i^2} = Q \tag{2}$$

Note that it is crucial in Lighthill's derivation that $c_0$ is a constant corresponding to the speed of sound at the location of the listener. This is the consequence of the following requirement: in order to be useful, equation (2) should describe the propagation of sound at the location $\mathbf{x}$ of the listener, placed outside the source region in a uniform stagnant fluid [$Q(\mathbf{x},t) = 0$]. Furthermore, simple order of magnitude estimates can only be carried out on the basis of Lighthill's analogy when the source region where $Q \neq 0$ is small compared to the acoustic wavelength $\lambda$. When the source region is small compared to $\lambda$ we say it is compact.
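The compactness criterion can be made concrete by comparing a source size with the acoustic wavelength $\lambda = c_0/f$. The short Python sketch below does this for a few audio frequencies. The speed of sound of 343 m/s, the 1 cm source size and the function names are hypothetical values chosen for illustration only; they are not taken from the text.

```python
C0 = 343.0           # speed of sound in air in m/s (assumed, about 20 degrees C)
SOURCE_SIZE = 0.01   # hypothetical size of the source region in metres


def wavelength(frequency_hz, c0=C0):
    """Acoustic wavelength lambda = c0 / f in metres."""
    return c0 / frequency_hz


def size_to_wavelength_ratio(source_size_m, frequency_hz, c0=C0):
    """Ratio of source size to wavelength; much smaller than 1 means compact."""
    return source_size_m / wavelength(frequency_hz, c0)


for f in (100.0, 1000.0, 5000.0):
    ratio = size_to_wavelength_ratio(SOURCE_SIZE, f)
    print(f"f = {f:6.0f} Hz: wavelength = {wavelength(f):.3f} m, size/wavelength = {ratio:.4f}")
```

For a source of this assumed size the ratio stays well below one over most of the speech range, and it grows linearly with frequency, so compactness is a low frequency property.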

Lighthill (Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983) shows that $Q$ can be expressed in terms of a stress tensor $T_{ij}$:

$$Q = \frac{\partial^2 T_{ij}}{\partial x_i \partial x_j} \tag{3}$$

$T_{ij}$ is related to the flow velocity $v_i$ by:

$$T_{ij} = \rho v_i v_j - \sigma_{ij} + (p - c_0^2 \rho')\,\delta_{ij} \tag{4}$$

where $\sigma_{ij}$ is the viscous stress tensor.

The first term $\rho v_i v_j$ in equation (4), which is called the Reynolds stress tensor, is responsible for the sound generation by non-linear convective forces in the flow, such as sound production by turbulence.

The second term is the influence of viscosity, which can often be neglected in the bulk of the flow because the Reynolds number $Re$ of the flow in the vocal tract is large ($Re = O(10^3)$) (see Tritton, 1988; Prandtl & Tietjens, 1934; Milne-Thomson, 1966; Batchelor, 1967; and section 3).

The last term $(p - c_0^2 \rho')\,\delta_{ij}$ represents the sound production due to non-isentropic processes (such as heat transfer or combustion) or to a difference between the local speed of sound $c(\mathbf{y},t) = (\partial p/\partial \rho)_s^{0.5}$ in the vocal tract ($\mathbf{y}$) and the speed of sound $c_0$ at the location ($\mathbf{x}$) of the listener. The term $(p - c_0^2 \rho')\,\delta_{ij}$ is influenced by the presence of moisture and CO2 in the breath and by the temperature difference between the vocal tract and the environment. The importance of such effects in flames is obvious when we listen to the strong increase of sound production during the ignition of the flame of a gas burner. In phonation this effect is not expected to be important. If it is significant at all, it is only expected to be so in the production of fricative sounds. Of course such effects will be significant in experiments on the influence of the inhalation of helium on phonation as described by Teager & Teager (1983, 1990) and Kaiser (1983). We also neglect the entropy increase due to friction upon mixing of the jet formed at the vocal folds with the air in the vocal tract. This entropy variation induces a small monopole contribution which is a factor $(U_0/c_0)^2$ smaller than the effect of the variable volume flow through the glottis.

While the aeroacoustic formulations based on different acoustic variables ($p'$, $\rho'$, ...) are in principle equivalent as long as no approximation is introduced, the goal of aeroacoustics is to obtain a reasonable prediction of sound production based on an approximation of the source. Therefore in aeroacoustics the variables $p'$ and $\rho'$ are not equivalent as they are in acoustics (there is no simple relationship between these variables because in the source region $p' \neq c^2 \rho'$).

When describing the influence of non-linear convective effects ($\rho v_i v_j$) on the sound production, the choice of $\rho'$ yields the most easily interpretable form of the aeroacoustic source $Q$. When considering unsteady heat transfer processes the choice of $p'$ is more adequate (Howe, 1975; Lighthill, 1978). Using $p'$ instead of $\rho'$, one obtains a formulation of the aeroacoustic source which stresses the monopole character of sound production by unsteady heat transfer, which is not obvious in Lighthill's formulation. In phonation, if we want to include the influence of a mean potential flow $\mathbf{U}_0$ on the acoustic wave propagation, the optimal choice of acoustic variable is expected to be the total enthalpy $B' = p'/\rho_0 + \mathbf{v}' \cdot \mathbf{U}_0$ (McGowan, 1988; Howe, 1975). For most qualitative discussions $p'$ or $\rho'$ can be used interchangeably if the flow is isentropic and the mean flow velocities are much lower than the speed of sound ($|\mathbf{U}_0|/c \ll 1$).

2.2 The Green's function

Lighthill's formulation (2-4) is equivalent to the original laws of conservation of mass and momentum governing the flow, from which it is derived, and is therefore an exact equation. The power of Lighthill's approach is that it yields a convenient formulation for introducing approximations. Using a differential equation such as (2-3) to obtain an approximate formulation requires estimating derivatives of the relevant quantities. This is an inaccurate and therefore hazardous approach. This is the reason why Lighthill proposed to use an integral formulation based on the Green's function formalism (Goldstein, 1976; Morse & Feshbach, 1953).

The Green's function is a generalised function which is defined as the solution of the wave equation for the case that a pulse $\delta(t-\tau)\,\delta(\mathbf{x}-\mathbf{y})$ is the source of sound:

$$\frac{\partial^2 G}{\partial t^2} - c_0^2 \nabla^2 G = \delta(t-\tau)\,\delta(\mathbf{x}-\mathbf{y}) \tag{5}$$

where $\tau$ is the time at which the pulse is released at the source position $\mathbf{y}$. Hence $G(\mathbf{x},t\,|\,\mathbf{y},\tau)$ is the linear system response observed at time $t$ at the listener location $\mathbf{x}$ to the pulse $\delta(t-\tau)\,\delta(\mathbf{x}-\mathbf{y})$.

Of course $G$ is not fully defined by a differential equation. We have to specify the initial and boundary conditions. The initial conditions for $G$ correspond to the causality condition ($G = 0$ and $\partial G/\partial t = 0$ for $t < \tau$): we should not hear the pulse before it has been released. We can choose for $G$ the same boundary conditions as for the actual acoustic field $\rho'$. We call $G$ in such a case a “tailored” Green's function (Goldstein, 1976). If the physical boundary conditions in our problem (phonation) can be described by a relationship between the local values of $\rho'$ and its gradient $\nabla \rho'$, then we obtain by using a tailored Green's function $G$ a formal solution of the wave equation (2) in the form (Goldstein, 1976; Morse & Feshbach, 1953):

$$\rho'(\mathbf{x},t) = \int_{-\infty}^{t} \iiint_V Q(\mathbf{y},\tau)\, G(\mathbf{x},t\,|\,\mathbf{y},\tau)\, d^3y\, d\tau \tag{6}$$

where we consider a fixed volume $V$ as the domain of definition of the Green's function. The linear boundary conditions specified above correspond to a locally reacting wall with properties defined in terms of an impedance. Intuitively, equation (6) can be understood as the weighted sum of impulsive point sources with amplitude $Q$ distributed in space and time. As $Q$ is in general a non-linear function of $\rho'$, equation (6) is in general not an explicit solution of the problem; it is an integral equation. Equation (6) and the corresponding tailored Green's function are not always an optimal choice. We will now see that the surface integrals which appear when $G$ is not tailored can have a simple physical interpretation, making the use of a non-tailored Green's function quite attractive.
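The reading of equation (6) as a weighted sum of impulse responses can be illustrated with a discrete-time analogue. The Python sketch below is not a model of the vocal tract: the second-order recursion and its coefficients are hypothetical, chosen only to give a causal, damped linear system. It checks that the response to an arbitrary source sequence q equals the superposition of shifted impulse responses weighted by q, which is the discrete counterpart of integrating $Q$ against $G$ over the emission time $\tau$.

```python
import numpy as np

A1, A2 = 1.6, -0.8   # hypothetical coefficients of a stable, damped resonator


def respond(q):
    """Response of y[n] = A1*y[n-1] + A2*y[n-2] + q[n], starting from rest (causality)."""
    y = np.zeros(len(q))
    for n in range(len(q)):
        y[n] = q[n]
        if n >= 1:
            y[n] += A1 * y[n - 1]
        if n >= 2:
            y[n] += A2 * y[n - 2]
    return y


n_steps = 200
pulse = np.zeros(n_steps)
pulse[0] = 1.0
g = respond(pulse)                  # discrete analogue of G: response to a unit pulse

rng = np.random.default_rng(0)
q = rng.standard_normal(n_steps)    # arbitrary source sequence, analogue of Q

direct = respond(q)                         # solve the recursion with the source directly
superposed = np.convolve(q, g)[:n_steps]    # sum over emission times m of q[m] * g[n - m]

print("direct solution matches superposition:", np.allclose(direct, superposed))
```

The superposition only works because the system is linear and its properties do not change in time. If the source itself depended on the response, as $Q$ does in general, the sum would become an equation to be solved rather than an explicit answer.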

The advantage of the integral formulation becomes obvious when we see that by using Lighthill's expression for $Q$ and by partial integration we can move the space derivatives from $\partial^2 T_{ij}/\partial y_i \partial y_j$ towards the Green's function $G$. Starting from the general Green's theorem (Goldstein, 1976; Morse & Feshbach, 1953), we obtain after some manipulation (using the momentum conservation law) the equation:

$$\rho'(\mathbf{x},t) = \int_{-\infty}^{t} \iiint_V T_{ij}(\mathbf{y},\tau)\, \frac{\partial^2 G}{\partial y_i \partial y_j}\, d^3y\, d\tau - \int_{-\infty}^{t} \iint_S G\, \frac{\partial \rho v_i}{\partial \tau}\, n_i\, dS\, d\tau - \int_{-\infty}^{t} \iint_S \left(p'\delta_{ij} - \sigma_{ij} + \rho v_i v_j\right) \frac{\partial G}{\partial y_j}\, n_i\, dS\, d\tau \tag{7}$$

where $\mathbf{n}$ is the outer normal at the surface $S$ enclosing the volume $V$ in which $G$ has been defined. As the Green's function is the solution of a linear problem, we can much more easily obtain accurate estimates of the derivatives of $G$ than of the derivatives of $T_{ij}$. Hence we can now obtain a much more accurate estimate of the sound production by a flow on the basis of global estimates of the flow properties than we could have obtained by using the differential equation (2). If the Green's function is chosen such that either $G = 0$ or $(\partial G/\partial y_i)\, n_i = 0$ on the surface $S$ enclosing $V$, equation (7) will take a simple form. If we define $G$ in the vocal tract, which we assume to be closed at the glottis and to have rigid walls, the second surface integral will vanish, either due to the presence of a wall or because the flow far outside the vocal tract is assumed to vanish (free space). The first surface integral then represents the contribution of displacement (vibration) of the walls of the vocal tract and of the flow through the glottis.

As a first example of the application of Lighthill's theory we now compare the sound production by turbulence in free space to the sound production by turbulence in a tube. These examples should illustrate the effect of the non-uniformity of the Green's function. For the moment we ignore the details of the flow. Turbulence will be discussed in section 3.5.

In free space the symmetry of the Green's function $G_0$ with respect to differentiation with respect to the observer and source positions further simplifies the problem, because the derivatives $\partial G_0/\partial y_i$ can be replaced by $-\partial G_0/\partial x_i$ and hence taken out of the integral (Goldstein, 1976; Blake, 1986). In the far field approximation (for $|\mathbf{x}| \gg \lambda$) the length scale for the space variation of $\rho'$ is simply the wavelength $\lambda = c_0/f$, so that all derivatives $\partial/\partial x_i$ can be replaced by a division by $\lambda$. Assuming that the characteristic frequency in the flow is proportional to the mean flow velocity $U_0$, we find that $\lambda$ is inversely proportional to $U_0$. Using these results and the fact that we expect $\rho v_i v_j$ to scale with $\rho U_0^2$ leads to the famous $(U_0/c)^8$ law for the intensity $p'^2$ of the sound generated by turbulence in free space (Powell, 1990; Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983). The spectrum of the sound appears to be smooth with a maximum around $f = U_0/D$, where $D$ is the width of the turbulent flow.

Such simple laws are obviously not valid in the vocal tract. In the vocal tract the spatial and spectral non-uniformity of the acoustic response $G$ is essential. It is therefore not surprising that Ingard & Singhal (1975) report a large scatter in the measured power law dependence of the internal sound intensity generated by a turbulent flow in a pipe. Globally, in a duct $p'^2$ is proportional to $(U_0/c)^n$ where $5 \leq n \leq 6$. A power $n = 6$ corresponds to an infinitely long pipe (Howe, 1975). This approximation should be valid in a finite duct segment at higher frequencies which are still below the cutoff frequency but at which friction and radiation losses considerably reduce resonant behaviour. When considering the sound intensity within a narrow frequency band, variations of $n$ from 5 to 8 have been observed. In an infinitely extended pipe with uniform cross section at low Mach numbers the interaction between the acoustic field and the turbulence is rather weak (Meecham, 1965). The interaction will usually be concentrated in a region with a variable cross section or a bend. The effect of a bend can qualitatively be explained by using the method of images (Morse & Feshbach, 1953; Meecham, 1965). In general we expect significant sound production by turbulence only from a region near an edge, where the Green's function is strongly non-uniform. A formal solution is given by Howe (1975), from which a value of $p'^2$ proportional to $(U_0/c)^4$ can be expected for a localised turbulent spot convected through an abrupt constriction.
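To get a feeling for how different these power laws are, the Python sketch below converts an exponent $n$ in $p'^2 \propto (U_0/c)^n$ into the change in level, in dB, produced by a given change in flow velocity. The function name and the choice of a velocity doubling as the test case are assumptions made for illustration.

```python
import math


def level_change_db(exponent, velocity_ratio):
    """Change in 10*log10(p'^2) when U0 is multiplied by velocity_ratio,
    assuming p'^2 is proportional to (U0/c)**exponent."""
    return 10.0 * exponent * math.log10(velocity_ratio)


for n in (4, 5, 6, 8):
    print(f"n = {n}: {level_change_db(n, 2.0):.1f} dB per doubling of U0")
```

Each unit of the exponent adds about 3 dB per doubling of the velocity, so the free-space law ($n = 8$) and the edge-dominated law ($n = 4$) differ by roughly 12 dB per velocity doubling.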

The influence of the spatial non-uniformity of the Green's function on sound produced by turbulence in a pipe is illustrated by the influence of the teeth on fricative sound production (Shadle, 1985a,b). Formally this can be explained by the fact that near a sharp edge such as a tooth, $\partial^2 G/\partial y_i \partial y_j$ is very large (locally infinite at a sharp edge).

A second example of the effect of an edge is given in Fig. 1 and 2, where we show the internal sound spectrum measured at 5 cm from the end of a 28 cm long organ pipe with a square cross section of 2 x 2 cm². The pipe geometry is shown in Fig. 3. The temperature is 20°C. In Fig. 1 we show the spectrum obtained when the jet (1 mm thick) is blown onto the labium, which results in a self-sustained oscillation of the jet which is strongly coupled to an acoustic resonance of the pipe. The spectrum is dominated by the periodic sound due to the jet oscillation, which is composed of exactly harmonic components. The pipe oscillation is dominated by the third mode (1612 Hz). About 60 dB lower we observe the sound produced by turbulence, which is a broadband spectrum modulated by the response of the pipe. By blowing into the pipe about 1 mm below the labium we obtained the spectra of Fig. 2 for blowing pressures $p_0$ of respectively 0.25, 0.5, 2 and 4 kPa. Below 0.25 kPa, the jet is laminar. The transition from laminar to turbulent occurs between 0.25 and 0.5 kPa. From the data above 0.5 kPa we observe that $p'^2$ increases globally by 12 dB for an increase of $p_0$ by a factor 2. This corresponds to an increase of $p'^2$ proportional to $(U_0/c)^4$. This $(U_0/c)^4$ dependence, as expected for a dipole in a pipe, is due to the presence of the labium (sharp edge) (Howe, 1975; Goldstein, 1976; Blake, 1986; Dowling & Ffowcs Williams, 1983). At low frequencies we observe a modulation of the spectra by the longitudinal resonance modes of the pipe (roughly harmonics of 520 Hz).

It is interesting to note that at high frequencies there is a significant difference between the turbulent noise in Fig. 1 and that in the corresponding spectrum ($p_0 = 2$ kPa) in Fig. 2. Above 8.6 kHz the turbulent noise in Fig. 1 is about 5 dB higher than in Fig. 2. This is expected to be due to the difference in distance between the flow and the sharp edge at the labium.

Fig. 3 Organ pipe geometry (dimension labels in the figure include 4 mm and 280 mm).

The most striking feature of the data presented in Fig. 2 is the sudden increase of $p'$ just above the cutoff frequency for the first transversal mode of the pipe ($f = 8.6$ kHz). A similar effect has been observed by Badin (personal communication) in a study of fricative sounds. It is further interesting to note that the width of the supraglottal part of the vocal tract is much larger than the height. Therefore, the cutoff frequency for the first transversal mode in the vocal tract is determined by the width. Hence the coupling of turbulence with the first propagating transversal mode is not strongly affected by the variation in the height of the tongue body and is observed for various fricatives at about the same frequency.
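The 8.6 kHz cutoff quoted above can be recovered from the pipe dimensions. The Python sketch below assumes the standard relation $f_c = c_0/(2w)$ for the first transverse mode of a hard-walled duct of width $w$, which is not written out in the text, and a speed of sound of 343 m/s for air at about 20°C. The function name is hypothetical.

```python
C0 = 343.0   # speed of sound in air in m/s at about 20 degrees C (assumed)


def first_transverse_cutoff(width_m, c0=C0):
    """Cutoff frequency c0 / (2 * width) of the first transverse mode
    of a duct with rigid walls, in Hz."""
    return c0 / (2.0 * width_m)


pipe_cutoff = first_transverse_cutoff(0.02)   # 2 cm wide square organ pipe from the text
print(f"first transverse cutoff of the organ pipe: {pipe_cutoff / 1e3:.1f} kHz")
```

Since the relation only involves the largest transverse dimension, it also shows why the width of the vocal tract, rather than the height under the tongue, sets the frequency at which the first transversal mode starts to propagate.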

From the above discussion it appears that a major advantage of the formal approach described above is that we have separated the problem into two clearly distinct parts:
- the calculation of a linear system response $G$,
- the estimation of the flow.

Although we do not yet have detailed information about the flow, we can already recognise the strong influence of the spatial and spectral non-uniformity of $G$.

It is important to realise that it may be easier to define a Green's function which is not tailored. In such a case surface contributions will appear in the derivation of the integral equation for $\rho'$ (Goldstein, 1976; Morse & Feshbach, 1953). A simple example of this statement is the use of a Green's function for the supraglottal part of the vocal tract in which the glottis is assumed to be closed. If in the calculation of $G$ we replace the glottis by a rigid closed wall, we have $(\partial G/\partial y_i)\, n_i = 0$ at the glottis. Hence the second surface integral in equation (7) vanishes as a result of this choice. The interpretation of the first surface integral is that the time derivative of the transglottal mass flux $(\partial \rho v_i/\partial t)\, n_i$ acts as a monopole sound source on the supraglottal part of the vocal tract. In practice the descriptions of phonation based on a source/filter model are often based on an intuitive application of this concept. We further see that the choice of the Green's function affects the character (monopole, dipole, ...) of the sound source! If we choose a tailored Green's function the transglottal volume flux is no longer a monopole sound source. We will see further on that in such a case we have a dipole sound source at the glottis (corresponding to the time dependent pressure difference between the trachea and the supraglottal part of the vocal tract).

Fig. 1 Power spectra of the internal acoustical pressure measured at 5 cm from the pipe end in the organ pipe shown in Fig. 3. The blowing pressure $p_0$ is 2 kPa. The jet is oscillating, inducing strong harmonics in the spectrum. The oscillations are dominated by the third pipe mode, $f$ = 1620 Hz.

Fig. 2 Same as Fig. 1 with a non-oscillating jet. The jet is blowing 2 mm below the labium. The sound is produced by turbulence. Below $p_0$ = 0.25 kPa, the jet is laminar. The onset of turbulence occurs between 0.25 and 0.5 kPa. Note the sudden increase of $p'$ at the cutoff frequency, $f_c$ = 8.6 kHz, for the first propagating transversal mode of the pipe.


An appropriate Green's function $G$ for a system like the vocal tract can efficiently be calculated either by the well known procedure of an expansion in standing waves (modes; Morse & Feshbach, 1953) or by using a Matched Asymptotic Expansion (MAE) procedure (Lesser & Crighton, 1975; Kevorkian & Cole, 1981).

In the Matched Asymptotic Expansion procedure various regions of the flow field are described by means of distinct approximations. The MAE procedure yields a formal recipe to glue the various regions into a solution of the problem. This is a formalisation of the intuitive procedure used by most authors for the glottal flow. In the trachea and the supraglottal part of the vocal tract one assumes plane wave propagation at low frequencies. At the glottis a plane wave approximation will certainly fail because of the fast area variation. Furthermore the variations in flow velocity are very large, so that we expect a significant influence of the non-linear convective acceleration which is neglected in the acoustic approximation. One would therefore expect that we have to use locally an exact description of the flow. However, because of the strong spatial gradients one can locally neglect, in first approximation, the time derivatives in the wave equation. Furthermore, as the dimension $D$ of the glottis is small compared to the acoustic wavelength $\lambda$ (compact flow), it can be shown that we can assume the glottal flow to have a uniform density (locally incompressible). This yields a quasi-stationary incompressible flow approximation of the flow through the glottis which is commonly used in the study of phonation. The assumption of continuity of flow and pressure yields the necessary relationship between the approximate solutions in the various regions (subglottal part of the vocal tract, glottis and supraglottal part of the vocal tract).

By using the method of images (Morse & Feshbach, 1953) one can also gain considerable insight into the structure of $G$. For example, by using the method of images one can easily see that a dipole oriented perpendicular to a hard wall will behave as a quadrupole. The reader will find the method of images applied to a cylinder in the textbook of Milne-Thomson (1966). Applying this method we see that a quadrupole in the vicinity of a cylinder will behave as a dipole if the radius of the cylinder is small compared to the distance between the two dipoles forming the quadrupole (the edge of a tooth can be approximated as a thin cylinder).

2.3 Powell's vortex sound theory

A formal definition of the acoustic field has been proposed by Howe (1980). This definition appears to be very useful in the study of phonation (McGowan, 1988). In general one can decompose any field such as the velocity field $\mathbf{v}$ into an irrotational part derived from a potential $\varphi$ and a rotational (solenoidal) part derived from a vector potential $\mathbf{A}$. We further split the potential into a steady part $\varphi_0$ and an unsteady part $\varphi'$. We obtain by definition:

$$\mathbf{v} = \nabla\varphi_0 + \nabla\varphi' + \nabla\times\mathbf{A} \qquad (8)$$

By definition the potential flow $\nabla\varphi$ is irrotational because $\nabla\times\nabla\varphi = 0$. By definition the flow $\nabla\times\mathbf{A}$ induced by the vorticity $\boldsymbol{\omega} = \nabla\times\mathbf{v}$ is incompressible because $\nabla\cdot(\nabla\times\mathbf{A}) = 0$. Furthermore the density variations in the steady potential flow $\nabla\varphi_0$ are small because the pressure variations driving the flow in the vocal tract are small compared with the atmospheric pressure. This is equivalent to the statement that $\nabla\varphi_0$ is a low Mach number flow ($U/c \ll 1$). The acoustic flow $\mathbf{u}_a$ is defined by Howe (1980) as the unsteady, compressible part of the flow, $\nabla\varphi'$:

$$\mathbf{u}_a = \nabla\varphi' \qquad (9)$$

This definition stresses the importance of the vorticity $\boldsymbol{\omega}$ as a source of sound.

While Lighthill's formalism is the adequate form to discuss qualitatively the influence of turbulence, when the flow is not turbulent it is more appropriate to use for the study of phonation Powell's vortex sound approach (1964, 1990), which we will now summarise. For low Mach number flows, when we neglect friction and thermal processes, it can be shown that:

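$$\frac{\partial^2 T_{ij}}{\partial x_i\,\partial x_j} \simeq \rho_0\,\frac{\partial^2 (v_i v_j)}{\partial x_i\,\partial x_j} \simeq \rho_0\,\nabla\cdot(\boldsymbol{\omega}\times\mathbf{v}) \qquad (10)$$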


This expression was first derived by Powell (1964) for free space. Howe (1975; see also McGowan, 1988) demonstrated that it can also be used for internal flows in the presence of an irrotational mean flow [$\mathbf{U}_0 = \nabla\varphi_0$] if we use the total enthalpy [$B' = p'/\rho_0 + \mathbf{u}_a\cdot\mathbf{U}_0$] as aeroacoustic variable instead of $p'$ or $\rho'$. The acoustic velocity $\mathbf{u}_a$ is related to $B'$ by the momentum conservation law:

$$\frac{\partial\mathbf{u}_a}{\partial t} + \nabla B' = 0 \qquad (11)$$

These equations are used by McGowan (1988). For a qualitative discussion the convective terms in the wave equation can be neglected and we can approximate $B'$ by $c_0^2\rho'/\rho_0$ or $p'/\rho_0$. Equation (10) clearly indicates a relationship between the production of sound and the presence of vorticity $\boldsymbol{\omega}$ in the flow.

Furthermore it appears that when the flow is compact ($D \ll \lambda$) it is most efficiently described in terms of vortex dynamics. We can understand this when considering the momentum conservation law for a friction-less fluid (Euler equation):

$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v}\right) = -\nabla p \qquad (12)$$

We see that because for an isentropic flow $\rho = \rho(p)$, by dividing this equation by $\rho$ and taking the curl we can remove the pressure forces $\nabla p$ from the equation of motion. This implies that in terms of vorticity $\boldsymbol{\omega}$ the equation of motion is a purely kinematic equation (Lugt, 1983; Tritton, 1988; Prandtl & Tietjens, 1934; Milne-Thomson, 1966; Batchelor, 1967). In a two-dimensional flow we find:

$$\left(\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right)\boldsymbol{\omega} = 0 \qquad (13)$$

which implies that vorticity is a fluid property: it is convected away with the local flow velocity $\mathbf{v}$. This explains our common observation that vortices shed by impulsively blowing cigarette smoke travel with the smoke as a ring (Lugt, 1983).

2.4 Acoustic energy

In an intuitive discussion it is easier to convince people of the nature of aeroacoustic sound sources by describing the interaction of flow with the acoustic field in terms of the energy $W_a$ transferred from the source $Q$ to the acoustic field $p'$ during a certain period of time, such as a period of oscillation ($T = 1/f$) for periodic fields, rather than by using the integral formulation based on the Green's function. When $Q$ corresponds to the injection of volume at a rate ($dV/dt$) at a fixed position in space, it is obvious that the acoustic work $W_a$ performed by the source is given by:

$$W_a = \int p'\,dV = \int_0^T p'\,\frac{dV}{dt}\,dt = \int_0^T p'\,Q\,dt \qquad (14)$$

We see from equation (14) that a volume source in free space ($p' \approx 0$) will be less efficient than a confined sound source. An example of a volume source in free space is a compact pipe termination with diameter $D$. The acoustic volume flux at the pipe exit, $dV/dt = \pi D^2 u_a/4$, acts as a monopole sound source on the free space outside the pipe. The pressure $p'$ at the pipe exit is in first approximation in phase with the acceleration because $\partial\mathbf{u}_a/\partial t = -\nabla p'/\rho_0$. Therefore the work $W_a$ performed by the source will be determined by the small deviation from this incompressible approximation. It can be shown that $W_a = (\pi D/2\lambda)^2 \int u_a^2\,dt$. Hence in free space a compact volume source is a very inefficient source of sound because $(D/\lambda) \ll 1$. This explains the usefulness of the supraglottal part of the vocal tract (see section 4).
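To get a feeling for how small the factor $(\pi D/2\lambda)^2$ is at speech frequencies, the following sketch evaluates it for a compact opening. The 2 cm diameter, the speed of sound and the function name are assumptions chosen for illustration only.

```python
import math


def compact_radiation_factor(diameter_m, frequency_hz, c=344.0):
    """Dimensionless factor (pi*D / (2*lambda))**2 that limits the work done
    by a compact volume source radiating into free space."""
    wavelength = c / frequency_hz
    return (math.pi * diameter_m / (2.0 * wavelength)) ** 2


# Assumed opening diameter of 2 cm, evaluated at a few frequencies.
for f in (100.0, 1000.0, 3000.0):
    print(f"{f:6.0f} Hz: {compact_radiation_factor(0.02, f):.2e}")
```

The factor grows with the square of the frequency, so the low harmonics of a compact source radiate least efficiently.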

In the case of vortex sound in free space, Howe (1980) has demonstrated that $W_a$ can be calculated by:

$$W_a = -\int_0^T \iiint_V \rho\,(\boldsymbol{\omega}\times\mathbf{v})\cdot\mathbf{u}_a \; d\mathbf{y}\,dt \qquad (15)$$


Equation (15) indicates, as we will discuss further, that vorticity can only perform acoustic work in the presence of an acoustic velocity $\mathbf{u}_a$. This is typical for a dipole sound source. The term $-\rho(\boldsymbol{\omega}\times\mathbf{v})$ has the form of a Magnus force density exerted on the fluid element $d\mathbf{y}$ with vorticity $\boldsymbol{\omega}$ placed in a velocity field $\mathbf{v}$. Such a force is perpendicular to $\mathbf{v}$ and can therefore not perform any work. However, when $\mathbf{u}_a$ is not parallel to $\mathbf{v}$, formula (15) shows that the "Magnus force" transfers energy from the rotational flow to the acoustic flow. The fact that a vortex ring acts as a dipole can be understood both intuitively and formally from the discussions given in the literature (Powell, 1990; Howe, 1975; Blake, 1986; Prandtl & Tietjens, 1934).

2.5 Summary

From the discussion given above it should be clear that the aeroacoustic approach to phonation consists of five main steps:
- The identification of a specific sound production mechanism and the choice of the adequate aeroacoustic variable ($p'$, $\rho'$ or $B'$).
- The formulation of a wave equation and the definition of the source $Q$.
- The choice of a convenient Green's function and the derivation of an integral equation.
- Manipulation of the integral equation to transfer space derivatives from the source $Q$ to the Green's function.
- Introduction of a model for the flow into the integral equation.

In Lighthill's approach the last step is an order of magnitude estimate. This crude approach yields scaling laws which indicate the dependence of sound production on various parameters in the problem. In many cases this order of magnitude estimate is already quite interesting because our ear has a logarithmic sensitivity to sound! At this level of approximation one neglects the feedback of acoustic fluctuations on the flow. In free space it is usually reasonable to neglect the feedback from the acoustic field to the incompressible flow which generates the sound.

The vocal tract is a resonator in which acoustic energy can accumulate. In a resonator the acoustic velocities ($u_a = p'/\rho c$) may become larger than the incompressible flow velocities exciting the field (Hirschberg et al., 1991). This implies a strong feedback from the acoustic field to the flow and an essentially non-linear behaviour. The self-sustained oscillation of the jet flow in the flute shown in Fig. 1 is an example of the possible effect of this feedback. A simple example in phonation is the dependence of the flow through the glottis on the transglottal pressure, which is significantly influenced by the acoustic pressure fluctuations (Rothenberg, 1981; Koike, 1980; Cranen, 1987). Hence in such a case the integral equation is not an explicit solution of the problem. An iterative procedure should be used to solve the equation.

3. Some elements of fluid mechanics

3.1 Approximations of the basic equations and characterisation of the flow

The laws of mass, momentum and energy conservation governing a fluid flow are well known. However, because the equations are non-linear it is in general not possible to obtain an exact solution. In particular the non-linearity due to the convective acceleration $(\mathbf{v}\cdot\nabla)\mathbf{v}$ in the momentum conservation law (12) can make even a straightforward numerical solution quite inaccurate. We will therefore always have to use an approximation. We consider in this section some elementary approximations which can be used to gain insight into the behaviour of the flow in the vocal tract.

In general an approximation is obtained by considering the dimension-less form of the equations of motion. In this form there appears in front of each term a dimension-less number which is a measure for the relative importance of the term. Under specific conditions some small terms can be neglected.

In phonation the most crucial parameters are the Strouhal number $Sr_0$, the Reynolds number $Re$, the Helmholtz number $He$ and the Mach number $M$. The Strouhal number $Sr_0 = fD/U_0$ is a measure for the ratio of the acceleration due to the unsteadiness of the flow and the convective acceleration due to the non-uniformity of the flow. The Helmholtz number $He = D/\lambda$ yields information about the compactness of the flow (uniformity of the density). The Reynolds number $Re = D U_0/\nu$, where $\nu$ is the kinematic viscosity, is a measure for the ratio of convective forces and viscous forces. The Mach number $M = U_0/c$ yields information on the density variations in a steady flow (for $M \ll 1$, $\Delta\rho/\rho \approx M^2/2$).
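The four dimension-less numbers can be evaluated together for a given choice of scales, as in the sketch below. The class name and the numerical values (a 1 mm aperture, a 30 m/s flow velocity, a 100 Hz oscillation, and standard properties of air) are assumptions made for illustration; as stressed in the text, other choices of $f$, $D$ and $U_0$ may be more appropriate for other aspects of the flow.

```python
from dataclasses import dataclass


@dataclass
class FlowScales:
    f: float            # characteristic frequency (Hz)
    D: float            # characteristic length (m)
    U0: float           # characteristic velocity (m/s)
    c: float = 344.0    # speed of sound in air (m/s), assumed
    nu: float = 1.5e-5  # kinematic viscosity of air (m^2/s), assumed

    def strouhal(self):
        return self.f * self.D / self.U0

    def helmholtz(self):
        # D / lambda, with lambda = c / f
        return self.D * self.f / self.c

    def reynolds(self):
        return self.U0 * self.D / self.nu

    def mach(self):
        return self.U0 / self.c


# Hypothetical glottal scales, for illustration only.
glottis = FlowScales(f=100.0, D=1.0e-3, U0=30.0)
print("Sr0 =", glottis.strouhal())
print("He  =", glottis.helmholtz())
print("Re  =", glottis.reynolds())
print("M   =", glottis.mach(), " M^2/2 =", glottis.mach() ** 2 / 2)
```

With these assumed scales $Sr_0$, $He$ and $M$ are all small compared to one, while $Re$ is of order $10^3$.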

The meaningfulness of dimension-less numbers depends largely on the correct choice of the characteristic frequency $f$, length $D$, velocity $U_0$... This requires some empirical knowledge of the flow. This insight can be provided by experiments as described by Teager & Teager (1983, 1990), Cranen (1987), Shadle (1985a,b) and Barney et al. (1990). Furthermore different choices may be appropriate to investigate different aspects of a flow.

The difficulty of the use of a Reynolds number as a measure for the importance of inertial forces compared to frictional forces is now illustrated by considering the flow in a duct. A boundary layer approximation is only meaningful in regions where the shape of the vocal tract changes rapidly. In a long tube inertial forces are negligible compared to viscous forces. We have a so-called fully developed pipe flow which is dominated by friction. In fact inertia is then negligible even if $Re = U_0 D/\nu \gg 1$. In such a case the Reynolds number based on the pipe diameter is mainly an indication for the stability of the flow (occurrence of turbulence). This example illustrates that the use of dimension-less numbers to estimate the relative importance of various terms in the equations of motion is only meaningful when we have a reasonable understanding of the flow.

When the Reynolds number is very low ($Re < 1$), viscous forces dominate and the non-linearity of the equations is not crucial. In the vocal tract we have typical Reynolds numbers of the order of $10^3$. Non-linearity is an essential feature of the flow. In first approximation when $Re \gg 1$ we can neglect friction in the bulk of the flow. When the flow is irrotational this yields a potential flow which is reasonably easily calculated. However we can never neglect friction at the wall. There is always at least a thin region (with a thickness $\delta$ of the order of $D/\sqrt{Re}$ for a stationary flow and of order $\sqrt{\nu/f}$ for oscillating flows) along the wall where friction is as important as inertial forces. This region is called a boundary layer. Typical for a boundary layer is that the pressure in this region is imposed by the outer friction-less bulk of the flow. Furthermore the boundary layer always contains rotation because it is a quasi-parallel flow $u = u(y)$ in which the component along the wall dominates but varies from the outer flow velocity $U_0$ to zero at the wall. In the ideal case, boundary layers remain thin and friction yields only a small correction to the ideal friction-less potential flow which is described in section 3.2.

Even in a limited region with rapidly changing geometry the potential flow approximation is usually not valid. The most spectacular deviation from a potential flow is due to the separation of the boundary layer from the wall (section 3.3). At the separation point the vorticity contained in the boundary layers is injected into the main flow. As we consider the high Reynolds number limit, the vorticity remains bound to the fluid particle (13). The evolution of the vorticity distribution results in the formation of a free jet (stationary flow) or periodic vortex shedding (periodic flow).

At high Reynolds numbers we can neglect viscous forces in the flow as long as the flow is non-turbulent. Above a critical Reynolds number, depending on the type of flow considered, the non-linearity of the convective acceleration may result in a flow instability which is called turbulence (Lugt, 1983; Tritton, 1988). Typical for turbulence is a high dissipation of energy (section 3.5).

3.2 Potential flow

When the flow is irrotational ($\boldsymbol{\omega} = \nabla\times\mathbf{v} = 0$) we can define a potential $\varphi$ so that $\mathbf{v} = \nabla\varphi$. In such a case we can write the momentum equation for a friction-less fluid (12) in the integral form (Prandtl & Tietjens, 1934; Milne-Thomson, 1966; Batchelor, 1967):

$$\frac{\partial\varphi}{\partial t} + \frac{|\mathbf{v}|^2}{2} + i = g(t) \qquad (16)$$

where i is the specific enthalpy which can be calculated by using the equation:

$$i = \int \frac{dp}{\rho} \qquad (17)$$


and $g(t)$ is a function of time which without loss of generality can be included into the potential (because this does not affect the velocity field $\mathbf{v} = \nabla\varphi$). Equation (16) is the Bernoulli equation for an unsteady compressible isentropic potential flow. When we consider a compact flow ($He \ll 1$) we can use the incompressible approximation:

$$\frac{\partial\varphi}{\partial t} + \frac{|\mathbf{v}|^2}{2} + \frac{p}{\rho_0} = g(t) \qquad (18)$$

In the case of the glottal flow the unsteady term $\partial\varphi/\partial t$ is often negligible because $Sr_0 \ll 1$ and we obtain the commonly used equation:

$$\frac{|\mathbf{v}|^2}{2} + \frac{p}{\rho_0} = \text{constant} \qquad (19)$$

While equation (18) is certainly an excellent approximation in the glottis, equation (19) is a reasonable but cruder approximation. In particular in voiced sound production, just upon closure of the glottis we see from the data of Koike (1980) and Cranen (1987) a sudden rise in transglottal pressure of the order of 2 kPa. The characteristic rise time of the pressure is 2 ms. The characteristic thickness of the glottis is 3 mm (Titze, 1988). This implies an order of magnitude for the unsteady pressure $\rho_0\,\partial\varphi/\partial t$ of $10^{-1}$ kPa. Hence using (19) instead of (18) we neglect effects of the order $10^{-1}$. This unsteady potential term is taken into account in the analysis of Gupta et al. (1973). Note that the influence of the volume source due to the vocal fold displacement is of the same order of magnitude as the unsteady effects described above. Hence one should either neglect both effects or take both into account.

If we assume the cross sectional areas of the sub- and supraglottal parts of the vocal tract to be equal, we find by using the quasi-stationary mass conservation law that in the incompressible potential flow approximation the velocity does not change. By applying the stationary incompressible Bernoulli equation (19) through the glottis we find that there is no pressure difference across the glottis! Hence in the quasi-stationary friction-less approximation a variation of the glottal area will influence neither the transglottal flow nor the transglottal pressure. This corresponds to the paradox of d'Alembert discussed by Teager & Teager (1983). An object placed in an infinitely extended stationary potential flow does not offer any resistance to the flow. Flow separation, which is the result of viscous forces, is necessary to explain drag.
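The order of magnitude quoted for the unsteady term can be checked with a few lines of arithmetic. The sketch below is one plausible way to arrive at it, not necessarily the author's: it assumes that the potential scales as velocity times glottal thickness, that the velocity follows from the stationary Bernoulli equation, and that the air density is 1.2 kg/m^3. The function name is hypothetical.

```python
import math


def unsteady_pressure_estimate(dp, rise_time, thickness, rho0=1.2):
    """Order-of-magnitude estimate of rho0 * d(phi)/dt in Pa, taking
    phi ~ v * thickness and v = sqrt(2 * dp / rho0)."""
    v = math.sqrt(2.0 * dp / rho0)         # Bernoulli velocity scale (m/s)
    return rho0 * v * thickness / rise_time


dp = 2.0e3          # transglottal pressure rise (Pa), value from the text
rise_time = 2.0e-3  # characteristic rise time (s), value from the text
thickness = 3.0e-3  # glottal thickness (m), value from the text

p_unsteady = unsteady_pressure_estimate(dp, rise_time, thickness)
print(f"unsteady term ~ {p_unsteady:.0f} Pa, relative size ~ {p_unsteady / dp:.2f}")
```

Under these assumptions the unsteady term comes out at roughly a tenth of a kilopascal, a few percent to a tenth of the transglottal pressure.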

3.3 Boundary layer separation, shear layers, free jets and vortices

We have seen in the previous section that a friction-less incompressible flow approximation cannot explain why the glottis acts as a volume flux control device. The key of the problem is that the boundary layers separate from the wall in the diverging part of the glottis. Upstream of the separation point the potential flow approximation is valid. Downstream the flow is not irrotational any more.

The occurrence of boundary layer separation can be understood qualitatively when we start by considering a fluid particle in the main flow. As stated by the momentum conservation law (12), in absence of friction (in the main flow), the particle motion is determined by an equilibrium between the convective force $\rho(\mathbf{v}\cdot\nabla)\mathbf{v}$ and the pressure gradient $\nabla p$. The pressure gradient normal to the wall vanishes in a boundary layer so that the pressure is imposed by the friction-less outer flow. Because the pressure in a boundary layer is imposed by the outer flow, the pressure gradient tangential to the wall is the same in the boundary layer as in the main friction-less flow. In the outer flow convective forces are in equilibrium with the pressure gradient (12). As friction in the boundary layer implies a loss of kinetic energy, the convective force in the boundary layer may not always be sufficiently large to compensate the pressure gradient. When the adverse pressure gradient is too large, such as at a sharp edge (teeth), or if the diverging part of the channel is too long, boundary layer separation will occur.

Boundary layer separation can in first approximation be described as the formation of a free shear layer. A shear layer is a line separating a recirculation region with low velocities (dead water region) from the mean flow.

A tube bounded by two shear layers of opposite vorticity is called a free jet when the shear layers are reasonably straight. In a free jet the velocity is approximately uniform and hence the pressure is uniform and equal to the pressure in the recirculation region.


Thin shear layers are unstable (Blake, 1986; Lugt, 1983; Tritton, 1988; Prandtl & Tietjens, 1934). Small perturbations with sufficiently low frequency will induce a roll up of the shear layer into coherent structures which we call vortices. The vorticity of the shear layer is concentrated in these vortices. In first (extremely crude) approximation the vorticity can be considered to be concentrated along a line. One can prove that such a line vortex must either form a closed ring or end at a wall [$\nabla\cdot\boldsymbol{\omega} = 0$, hence $\oint_S \omega_i n_i\,dS = 0$]. The formation of ring vortices will certainly occur both at the glottis (during the opening) (Barney et al., 1990) and at the mouth opening (Wilson et al., 1971). A similar vortex shedding is observed at the end of a pipe which is acoustically driven by a clarinet mouthpiece (Fig. 4) (Hirschberg et al., 1991). A ring vortex can be very persistent. It will travel at a velocity of the order of $U_0$. At high Reynolds numbers the vortex becomes unstable and is annihilated after some time by turbulence.

Fig. 4 Vortex shedding due to acoustical resonance of a pipe driven by a clarinet mouthpiece. Flow visualization by shadow method and CO2 injection. (Pipe diameter is 2 cm.)

3.4 Flow in the glottis and vocal cord oscillations

At the glottis we expect that, after the shedding of a starting vortex, a quasi-stationary jet flow will be established. Within one oscillation period the vortex will travel over a distance of the order of $U_0/f_0 = O(10^{-1}\,\mathrm{m})$, which is much larger than the aperture of the glottis $h = O(10^{-3}\,\mathrm{m})$. In such a case the stationary free jet approximation commonly used in the literature is expected to be a fair approximation. We will see that this assumption should be considered with care.

As the energy in the jet or vortex is in general dissipated by turbulence and because the glottis aperture is small compared to the vocal tract diameter, there is almost no recovery of total pressure ($p + \rho v^2/2$) upon deceleration of the flow further downstream. This explains the occurrence of a transglottal pressure drop $\Delta p$ which was not predicted by a friction-less flow. The magnitude of $\Delta p$ does not crucially depend on the details of the jet flow far downstream of the glottis. In principle $\Delta p$ depends on the pressure in the trachea, including acoustic fluctuations, and on the acoustic field downstream of the glottis. (The stationary pressure decrease in the supraglottal part of the vocal tract is negligible.) However, the corresponding transglottal volume flow is very sensitive to changes in jet flow near the separation point.

In a quasi-stationary approximation the velocity $v_j$ in the jet is calculated from the transglottal pressure drop $\Delta p$ by using the equation:

$$v_j = (2\,\Delta p/\rho)^{0.5} \qquad (20)$$

Equation (20) is in fact obtained by using Bernoulli's equation (19), neglecting the flow velocity upstream of the glottis. This equation is independent of the geometry of the glottis. The transglottal volume flux is the product of $v_j$ and the jet cross sectional area $S_j$. In general flow separation, which determines $S_j$, will not occur at the narrowest cross section of the glottis $S_g$ but somewhere further downstream (Fig. 5). As $S_g < S_j$ and because of the continuity of the volume flux ($S_g v_g = S_j v_j$), the velocity $v_g$ at the narrowest cross section of the glottis will be larger than $v_j$. By applying Bernoulli's equation (19), we see that a local minimum of the pressure occurs, which results in a pressure force $F_b$ which tends to close the glottis (Ishizaka & Matsudaira, 1972; Gupta et al., 1973; Titze, 1988; Rothenberg, 1981; Scherer & Titze, 1983; Koike, 1980; Cranen, 1987) and which we further call the Bernoulli force.
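The chain of reasoning above (jet velocity from the pressure drop, continuity of the volume flux, then Bernoulli between the narrowest section and the jet) can be sketched numerically as follows. The function names, the 800 Pa pressure drop, the area ratio and the air density are illustrative assumptions.

```python
import math


def jet_velocity(dp, rho=1.2):
    """Quasi-stationary jet velocity from the transglottal pressure drop."""
    return math.sqrt(2.0 * dp / rho)


def pressure_at_narrowest_section(dp, S_g, S_j, rho=1.2):
    """Pressure at the narrowest glottal section relative to the jet pressure,
    using continuity (S_g * v_g = S_j * v_j) and the stationary Bernoulli
    equation. A negative value is the local pressure minimum."""
    if not 0.0 < S_g <= S_j:
        raise ValueError("expected 0 < S_g <= S_j")
    v_j = jet_velocity(dp, rho)
    v_g = v_j * S_j / S_g
    return 0.5 * rho * (v_j ** 2 - v_g ** 2)


# Assumed values: 800 Pa pressure drop, jet area 20 % larger than S_g.
dp = 800.0
print("v_j =", jet_velocity(dp), "m/s")
print("p_g - p_j =", pressure_at_narrowest_section(dp, S_g=1.0e-5, S_j=1.2e-5), "Pa")
```

When separation occurs at the narrowest section itself ($S_j = S_g$) the pressure minimum, and with it the closing force in this simple picture, disappears.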


Fig. 5 Flow separation in the glottis. Formation of the supraglottal jet and starting vortex.

We see that the occurrence of a starting vortex at the glottis is crucial for phonation mainly because it influences the position of the separation point ($S_j$). The formation of a starting vortex implies an initial deviation from a quasi-stationary model which for small Strouhal numbers $Sr_0$ will only be significant during a fraction of the oscillation period. For the regeneration of the glottis oscillations, this effect is usually neglected. However, as noted in the introduction, because the sounds relevant for perception correspond to higher frequencies, this unsteady behaviour may be relevant in phonation.

In order to explain the maintenance of the oscillation of the glottis by the Bernoulli force it is necessary that $F_b$ depends on factors other than the aperture $h$ of the glottis. We can understand this by considering the work $W_b$ performed by $F_b$ over an oscillation period:

$$W_b = \int_0^T F_b\,\frac{dh}{dt}\,dt \qquad (21)$$

For periodic oscillations $W_b = 0$ if $F_b$ depends only on $h$. In the classical two mass model (Ishizaka & Matsudaira, 1972; Gupta et al., 1973; Titze, 1988; Rothenberg, 1981; Scherer & Titze, 1983; Koike, 1980; Cranen, 1987) or the collapsing tube model (Titze, 1988) the phase shift between $h$ and $F_b$ is obtained by a variation of the glottis geometry. In these models a fixed separation point of the supraglottal jet is implicitly assumed. It is interesting to note that the assumption of a fixed position of the separation point is not justified by experience. Stationary measurements by Scherer & Titze (1983) of the pressure distribution along a model of the glottis show a dependence of the separation point on the aperture $h$ of the glottis. As the glottis is made narrower, the influence of frictional forces increases because $Re$ decreases with decreasing flux. This results in a movement of the separation point towards the narrowest cross section of the glottis. For narrow cross sections the measurements (Scherer & Titze, 1983) show a disappearance of the local pressure minimum ($F_b = 0$), indicating a separation at the narrowest cross section. Our experiments on the oscillation of a valve demonstrate that the time dependence of this flow separation process may explain the oscillation of a rigid valve in absence of acoustic feedback (Hirschberg et al., 1991). This indicates that the separation of a boundary layer is a rather slow process and that the flow in the glottis can be essentially unsteady in spite of the fact that $Sr_0 \ll 1$. Self-sustained oscillation in a "single mass" model of the vocal fold in absence of acoustical feedback is possible. The second degree of freedom needed for a dephasing of $F_b$ and $h$ is the movement of the separation point. Again this demonstrates that we should be extremely careful when using dimension-less numbers such as $Re$ and $Sr_0$ in order to justify an approximation. This supports the remark of Teager & Teager (1983) that experiments are important.
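The statement that a force depending only on $h$ performs no net work over a period can be verified numerically. The sketch below integrates $F_b\,dh$ over one cycle of a sinusoidal aperture for two hypothetical force laws: one that depends on the instantaneous aperture only, and one that follows the aperture with a phase lag. The force law $-1/h$, the aperture values and the lag are arbitrary assumptions, chosen only to make the point; they are not a model of the glottis.

```python
import math

H0 = 1.0e-3   # mean aperture (m), assumed
AMP = 0.5e-3  # oscillation amplitude (m), assumed


def aperture(theta):
    """Aperture as a function of the phase theta = 2*pi*t/T."""
    return H0 + AMP * math.sin(theta)


def work_per_period(force, n=4000):
    """Midpoint-rule approximation of the integral of F dh over one period."""
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        dh = AMP * math.cos(theta) * dtheta
        total += force(theta) * dh
    return total


def force_h_only(theta):
    # Hypothetical closing force set by the instantaneous aperture alone.
    return -1.0 / aperture(theta)


def force_with_lag(theta, lag=0.3):
    # Same law, but responding to the aperture a phase `lag` earlier.
    return -1.0 / aperture(theta - lag)


print("no phase shift  :", work_per_period(force_h_only))
print("with phase shift:", work_per_period(force_with_lag))
```

Only the second case yields a net energy exchange over a cycle, and its sign reverses when the sign of the phase shift is reversed.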

It is interesting to note that acoustic feedback, which is the main cause of reed oscillation in a clarinet (Hirschberg et al., 1990; 1991), may also be quite important in phonation. It is obvious that because the acoustic pressure fluctuations $p'$ at the glottis are comparable to the transglottal pressure $\Delta p$, the acoustic pressure fluctuations will also contribute to the force on the vocal folds, either directly or by a modulation of $v_j$ and hence of $F_b$ (Gupta et al., 1973; Titze, 1988; Rothenberg, 1981; Cranen, 1987). While Gupta et al. (1973) indicate that an acoustically driven vocal cord oscillation is possible in absence of Bernoulli force, Titze (1988) shows that acoustic loading is a significant effect. Cranen (1987) indicates that as a result of acoustic feedback, a permanent leak of the glottis will have a significant influence on phonation.


As noted by Teager & Teager (1983, 1990) a free jet will have a tendency to follow a smoothly curved wall. This corresponds to our common experience with the teapot effect: when pouring slowly, the tea flows along the wall rather than separating from the wall at the pipe exit. This effect is called the Coanda effect (Tritton, 1988). The Coanda effect can induce a strong asymmetry of the flow in the glottis which is observed both in experiments (Teager & Teager, 1983, 1990) and numerical calculations (Liljencrants, 1990). When the jet follows one of the walls, it is the separation at the opposite side which will determine the flux and the Bernoulli force. We do not expect a drastic influence of the Coanda effect on the sound production at the glottis.

3.5 Turbulence

In fluid mechanics we make a strong distinction between vortex shedding and turbulence (Lugt, 1983; Tritton, 1988). Vortex shedding is the process described above in section 3.3. The 2-D vortices formed at the glottis or ring vortices shed at the mouth opening are very persistent. (Note: ring vortices are 2-D structures in cylindrical coordinates.) Turbulence is an essentially three-dimensional motion which can rather abruptly annihilate a vortex ring. Turbulence is a chaotic behaviour of the flow triggered by the non-linear convective forces in the flow. This occurs at high velocities when viscous forces are not sufficient to stabilise the flow. In turbulence, energy extracted from the mean flow at large length scales by the stretching of large vortex structures by a non-uniformity of the mean flow is transferred to smaller length scales by "vortex stretching" of these smaller vortex structures. This is the so-called "cascade" process of successive vortex stretching by which the energy is transferred to decreasing length scales. When the energy has reached a critical length scale corresponding to a Reynolds number of order one (the Kolmogorov length scale) it is dissipated by viscous forces. Due to this very effective dissipation process, turbulence dies in absence of a non-uniform main flow. Turbulence which occurs in the jet downstream of the glottis will soon result in a disappearance of the jet structure.

An accurate theoretical description of unsteady turbulent flows is impossible at the present time (Binder & Ronneberger, 1991).

While turbulence will almost certainly occur in the supraglottal jet, at typical conditions encountered in speech the flow in the oscillating glottis is not expected to be turbulent. This justifies the use of the equation of Bernoulli (19) in the glottis. A two-dimensional numerical simulation of the glottis flow implicitly excludes three-dimensional turbulence. I expect that locally this is a reasonable approximation in the glottis.

At low amplitudes the interaction between turbulence and the acoustic field in an infinitely extended pipe with uniform cross section is weak (Howe, 1980; Binder & Ronneberger, 1991). Hence we expect a coupling between the turbulence and the acoustic field only for the fundamental frequency of the vocal cord oscillation, because there the amplitude of the acoustic velocity can be comparable to the main flow velocity. If significant, this interaction will be located in regions of high acoustic velocity amplitude, hence at the mouth opening. As we will discuss further in section 5, at the mouth opening in voiced sound production we will also have periodic vortex shedding (Fig. 4). We expect this effect to be acoustically much more relevant in voiced sound production than the sound production by turbulence.

3.6 Sound and pseudo-sound

When we measure a pressure fluctuation in the vocal tract with a microphone we have two contributions: one from the acoustic field and one from the incompressible flow. The contribution from the incompressible flow is the "pseudo-sound", which consists of pressure disturbances that do not propagate with the speed of sound. These pressure fluctuations are convected with the local flow velocity (think of the low pressure in a tornado). Because turbulence is a chaotic flow with a broad-band spectrum, it can be distinguished from the acoustic field by spectral analysis (Cranen, 1987) if we consider voiced sounds. Furthermore, when measuring with two microphones a few centimetres apart, as done by Cranen (1987), the pseudo-sound can be distinguished from sound because it is less spatially coherent. At shorter distances the coherent part of the pseudo-sound can also be distinguished from sound because it corresponds to a propagation of pressure fluctuations by convection ($U_0$), while sound waves propagate at the speed of sound ($c$). When placing single pressure transducers to measure the acoustic field one should stay far enough away from the glottis that the vortices have had time to die out. The experiments of Barney et al. (1990) yield useful information on a reasonable choice of this distance.
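The two-microphone distinction between convected and acoustic pressure fluctuations can be made concrete by comparing travel times. The following is a minimal sketch; the microphone spacing, convection velocity and speed of sound are assumed illustrative values, not data from the experiments cited above.

```python
def arrival_delay(spacing_m, speed_m_s):
    """Time needed for a disturbance to travel from one microphone to the other."""
    return spacing_m / speed_m_s

spacing = 0.02   # microphone spacing in m (assumed)
u0 = 20.0        # convection velocity U0 in m/s (assumed)
c = 340.0        # speed of sound in m/s (assumed)

delay_convected = arrival_delay(spacing, u0)   # pseudo-sound, carried by the flow
delay_acoustic = arrival_delay(spacing, c)     # sound wave

print(f"pseudo-sound delay: {delay_convected * 1e3:.3f} ms")
print(f"sound delay:        {delay_acoustic * 1e3:.3f} ms")
print(f"ratio c/U0:         {delay_convected / delay_acoustic:.1f}")
```

With these assumed numbers the convected disturbance arrives 17 times later than the sound wave, which is the kind of separation a cross-correlation of the two signals could exploit.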

4. Excitation of a simple resonator by basic types of sound sources

As stated above aeroacoustic sound sources in the vocal tract can have the character of a monopole (oscillating flow through the glottis), dipole (vortex shedding) and quadrupole (turbulence).

In free space a compact monopole is a very inefficient source of sound (2.4); this is the reason why the glottis is placed deep in the vocal tract. Furthermore, while in free space a dipole is a factor $(D/\lambda)^2$ less efficient than a monopole (Goldstein, 1976; Blake, 1986; Lighthill, 1978; Dowling & Ffowcs Williams, 1983), this is not the case in a resonator. A dipole placed in a duct can be acoustically more efficient than a volume source. The efficiency of a source depends crucially on its position relative to the acoustic resonance modes. In terms of section 2.2 we would say that the Green's function is non-uniform. We now discuss this crucial effect on the basis of a simple caricature of the vocal tract.

The supraglottal part of the vocal tract is a resonator which we represent for simplicity as a pipe segment of length $L$, closed at one end (glottis) and open at the other end (mouth). We will now show that the capability of a sound source to excite such a resonator depends strongly on the frequency of oscillation of the source and on its position in the resonator. In this discussion we will mainly use the energetic considerations of section 3.5. We will therefore speak in terms of acoustic pressure $p'$ and velocity $u_a$. Of course the entire discussion could be carried out in terms of the Green's function $G$ and its space derivatives $\nabla G$. We have chosen the informal approach, hoping that the reader is satisfied by the knowledge that a more formal discussion is possible.

At frequencies below the cutoff frequency of the pipe, the acoustic field can be represented as two plane waves with opposite propagation directions. Interference of these waves results in standing waves. For a given acoustic source the acoustic field can be considered as built up out of a series of standing waves with wavelength

$\lambda_n = 4L/(1+2n)$ ; $n = 0, 1, 2, \ldots$ (22)

Each of these standing waves is a so-called mode, which behaves as an independent harmonic oscillator (acoustic mass/spring system) with a resonance frequency $f_n = c/\lambda_n$ (note: in speech, the resonances of the vocal tract are called formants). The closed end corresponds to a node of the acoustic velocity distribution and a maximum of the pressure amplitude in the standing waves. At the open end the acoustic pressure $p'$ is almost zero (pressure node).
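As an illustration of equation (22), the mode wavelengths and resonance frequencies of the closed-open pipe can be tabulated directly. This is a minimal sketch: the tube length of 0.17 m and the speed of sound of 340 m/s are assumed round values, and the function name is hypothetical.

```python
def quarter_wave_modes(length_m, c=340.0, n_modes=3):
    """Return (wavelength, frequency) pairs for the first modes of a closed-open pipe.

    Uses lambda_n = 4 L / (1 + 2 n) and f_n = c / lambda_n.
    """
    modes = []
    for n in range(n_modes):
        wavelength = 4.0 * length_m / (1 + 2 * n)
        modes.append((wavelength, c / wavelength))
    return modes

# Assumed vocal tract length of 0.17 m.
for n, (wavelength, freq) in enumerate(quarter_wave_modes(0.17)):
    print(f"n = {n}: wavelength = {wavelength:.3f} m, f = {freq:.0f} Hz")
```

For this uniform tube the resonances fall at odd multiples of the lowest one (about 500, 1500 and 2500 Hz with the assumed values).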

Let us place a monopole (pulsating sphere) in the resonator. The volume flow injected is $Q = dV/dt$, where $V$ is the volume of the sphere. The source performs acoustic work given by:

$W_a = \int_0^t p' Q \, dt'$ (23)

We see from this formula that placing a monopole at the open end ($p' \approx 0$) will not excite the resonator. Please note that direct injection of $Q$ in free space, without a vocal tract, would be a very ineffective way of producing sound because the sound source cannot perform much work ($p' \approx 0$). Hence the vocal tract is not only a filter; it also provides an impedance matching between source and free space. The injection of $Q$ at the closed end can excite a mode of the resonator if we adjust the oscillation frequency to that of the mode. If we neglect losses and non-linear effects, we see from equation (23) that $p'$ will increase indefinitely with increasing time (resonance). The higher $p'$, the more work the source can perform.
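The unbounded growth at resonance can be illustrated with the harmonic-oscillator picture of a single mode. The following is a minimal sketch, assuming a lossless linear oscillator driven exactly at its resonance frequency and starting from rest; the function names, the unit drive amplitude and the 500 Hz frequency are illustrative assumptions.

```python
import math

def resonant_response(t, drive=1.0, f0=500.0):
    """x(t) for x'' + w0**2 * x = drive * cos(w0 * t), starting from rest.

    Closed-form solution of the undamped oscillator driven at resonance:
    x(t) = drive * t * sin(w0 * t) / (2 * w0).
    """
    w0 = 2.0 * math.pi * f0
    return drive * t * math.sin(w0 * t) / (2.0 * w0)

def envelope(t, drive=1.0, f0=500.0):
    """Amplitude envelope of the response; it grows linearly with time."""
    w0 = 2.0 * math.pi * f0
    return drive * t / (2.0 * w0)

for t in (0.01, 0.02, 0.04):
    print(f"t = {t:.2f} s  envelope = {envelope(t):.3e}")
```

In this idealised case the amplitude doubles whenever the elapsed time doubles; in a real vocal tract, losses and non-linear effects limit the growth.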


A dipole corresponds to two monopoles of equal strength $Q$ but opposite phase, placed at a small distance $\delta$ from each other along the pipe; $\delta$ should be small compared to the wavelength $c/f$. We will show that this corresponds to a force excitation. Assume for simplicity that the flow between the two monopoles is uniform. The velocity of the fluid in this region is given by $Q/S$, where $S$ is the cross-sectional area of the pipe. The momentum of the fluid in the region is $[\rho_0 (Q/S) S \delta]$, where $\rho_0$ is the mean fluid density. From Newton's law we know that the rate of change of momentum corresponds to a force of magnitude $F$ directed along the axis of the dipole:

$F = d[\rho_0 Q \delta]/dt$ (24)

In words: the air between the monopoles is, like a cat in a bag, jumping up and down. This results in a force $F$ on the "bag". In the case of the pipe the force $F$ is provided by the surrounding air (the bag) in the form of a pressure jump $\Delta p = F/S$ over the region where the dipole is placed. Any pressure jump in the flow can be interpreted as a dipole (McGowan, 1988).
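For a harmonic volume flow, equation (24) gives a force amplitude that is easy to evaluate. The sketch below assumes $Q = \hat{Q} \sin(2\pi f t)$ with constant $\rho_0$ and $\delta$, so that the amplitude of $F$ is $\rho_0 \delta \, 2\pi f \hat{Q}$; all numerical values and function names are illustrative assumptions.

```python
import math

def dipole_force_amplitude(q_amp, delta, freq, rho0=1.2):
    """Amplitude of F = d(rho0 * Q * delta)/dt for Q = q_amp * sin(2 pi freq t)."""
    return rho0 * delta * 2.0 * math.pi * freq * q_amp

def pressure_jump(force, area):
    """Pressure jump over the dipole region, Delta p = F / S."""
    return force / area

q_amp = 1.0e-4   # volume flow amplitude in m^3/s (assumed)
delta = 0.01     # monopole separation in m (assumed)
freq = 100.0     # oscillation frequency in Hz (assumed)
area = 3.0e-4    # pipe cross section in m^2 (assumed)

force = dipole_force_amplitude(q_amp, delta, freq)
print(f"F amplitude       = {force:.3e} N")
print(f"Delta p amplitude = {pressure_jump(force, area):.2f} Pa")
```

The force, and hence the equivalent pressure jump, scales linearly with frequency for a fixed volume flow amplitude.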

The acoustic work $W_a$ performed by the force $F$ is given by:

$W_a = \int_0^t F (dx/dt) \, dt' = \int_0^t F u_a \, dt'$ (25)

where $u_a$ is the acoustic velocity. We see from formula (25) that a dipole sound source like vortex shedding at the closed pipe end ($u_a = 0$) will not excite the modes of the pipe. Hence in this model of the vocal tract, at low frequencies, we do not expect the dipole contribution due to the distribution of vorticity in the supraglottal jet to be a significant source of sound (McGowan, 1988). A high frequency burst can however be expected if the starting vortex shed during the opening of the glottis passes close to one of the false folds. As noted by McGowan (1988), we do not expect this process to be accurately described by a quasi-stationary flow model. Howe (1975) gives some examples of the convection of vorticity along a flow inhomogeneity which are very similar to the problem of the interaction of the starting vortex with the false folds. In first approximation a two-dimensional description of the flow can be used. We also see from equation (25) that a dipole placed at the open end, where the amplitude of $u_a$ is maximum, will strongly interact with the acoustic field in the pipe. Hence the vortex shedding illustrated in Fig. 4 is expected to be an effective sound source.

A quadrupole is obtained by placing two opposite dipoles at a small distance from each other ($\delta \ll c/f$). In a pipe with uniform cross section a quadrupole is a very ineffective sound source, whatever its position along the pipe, except when it is placed near an edge (where $\partial^2 G/\partial y_i^2$ is singular, see section 6).
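The position dependence of the monopole and dipole coupling can be sketched with the standing-wave shapes of the closed-open pipe. The block below assumes ideal mode shapes, with pressure amplitude proportional to $|\cos(k_n x)|$ and velocity amplitude proportional to $|\sin(k_n x)|$, where $x = 0$ is the closed end and $k_n = 2\pi/\lambda_n$; the tube length and function name are hypothetical.

```python
import math

def mode_shapes(x, length, n=0):
    """Normalised pressure and velocity amplitudes of mode n at position x.

    x = 0 is the closed end (glottis), x = length the open end (mouth).
    A monopole couples through the pressure, a dipole through the velocity.
    """
    k = 2.0 * math.pi * (1 + 2 * n) / (4.0 * length)
    return abs(math.cos(k * x)), abs(math.sin(k * x))

length = 0.17  # assumed tube length in m
for label, x in (("closed end", 0.0), ("middle", length / 2), ("open end", length)):
    p_amp, u_amp = mode_shapes(x, length)
    print(f"{label:10s}: monopole coupling = {p_amp:.2f}, dipole coupling = {u_amp:.2f}")
```

For the first mode the monopole coupling is maximal at the closed end and vanishes at the open end, while the dipole coupling behaves in the opposite way.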

We have considered here only the supraglottal part of the vocal tract as a resonator. This can be a useful approximation; however, there is a priori no reason to exclude the coupling with the subglottal part and the lungs. In particular, low frequency oscillations might be due to resonance of the entire system. If we now consider a model of the vocal tract in which the glottis is a diaphragm separating two tube segments, the transglottal flow is no longer a monopole: there is no creation of volume at the glottis. If we neglect the effect of the wall displacement and the increase in entropy upon turbulent mixing (an effect of order $(U_0/c)^2$), the fluctuating part of the glottal flow can be considered to be generated by a dipole (corresponding to the transglottal pressure difference).

It is interesting to understand this process in terms of vortex sound theory. It should be clear from the discussion in section 3 that the dipole is induced by the modulation of the vorticity in the shear layers bounding the supraglottal jet. The main effect of the vorticity can be described by assuming two monopoles of opposite phase placed across the glottis. The force necessary to maintain this dipole is supplied by the glottis. On the basis of a model of the vocal tract described as a closed tube we did not expect the detail of the distribution of the vorticity in the jet far from the glottis to be significant for the sound production. We can now explain this in terms of vortex sound theory. We consider the excitation of standing waves of the entire system. Due to the continuity of acoustic flux, the acoustic velocity $u_a$ in the opening of the diaphragm representing the glottis should be large compared to the acoustic velocity in the pipes just upstream and downstream of the diaphragm. Therefore we expect (equation 15) that only the vorticity in the supraglottal jet close to the glottis will perform acoustic work. From the formula of Howe we see that an interaction is only expected when the path of the vorticity cuts the acoustic streamlines ($u_a$ should not be parallel to $v$). At the separation point the vorticity is by definition flowing in a direction quite different from that of the potential flow $u_a$, so that a strong interaction is possible. At large distances from the glottis the direction of the flow $v$ which convects the vorticity and the direction of the acoustic velocity $u_a$ are expected to be almost parallel, so that even if $u_a$ were not weak, the interaction between the acoustic field and the vorticity would be very weak. This further confirms our statement that the distinction between transglottal flow and jet vorticity is, for low frequencies, an artifact rather than a fundamental improvement of the theory.

A significant generation of sound may occur at the false folds. However, this type of interaction can only be effective if the local acoustic velocity related to the excited modes is large. Therefore the characteristic frequency for this sound corresponds roughly to a wavelength $\lambda$ of four times the distance between the glottis and the false fold. This implies quite high frequencies, which are not expected to be relevant for speech production.
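The order of magnitude of the frequency associated with the false folds follows from the quarter-wavelength argument above. This is a minimal sketch; the distance of 5 mm between glottis and false fold and the speed of sound are assumed values chosen only for illustration.

```python
def false_fold_frequency(distance_m, c=340.0):
    """Frequency whose wavelength is four times the glottis-to-false-fold distance."""
    return c / (4.0 * distance_m)

distance = 0.005  # assumed glottis-to-false-fold distance in m
print(f"characteristic frequency = {false_fold_frequency(distance) / 1e3:.0f} kHz")
```

With these assumed values the characteristic frequency is of the order of 17 kHz, far above the range of the lower formants.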

5. Human whistling

Human whistling is a typical example of a flow phenomenon which is induced by a strong feedback from the acoustic field to the flow (Shadle, 1985a; Wilson et al., 1971). Human whistling is the result of acoustically induced vortex shedding at the teeth or the lips, coupled with a Helmholtz resonator oscillation of the mouth (Shadle, 1985a; Wilson et al., 1971). Human whistling cannot be explained by a simple modification of the source/filter model. We will now see that the vortex shedding responsible for whistling also occurs in voiced sound production, which may result in a significant non-linear response of the vocal tract to the fluctuating flow through the glottis.

As stated above, periodic vortex shedding induced at an open pipe termination by strong acoustic oscillations is a dipole-type sound source. The acoustic dipole corresponding to a vortex ring is directed perpendicular to the plane of the ring. The relationship between a ring vortex and a dipole is extremely well explained formally by Prandtl & Tietjens (1934) and informally by Powell (1990).

A new vortex is shed at the moment that the acoustic flow velocity passes through zero, changing direction from pipe inwards to pipe outwards (Bruggeman et al., 1991; Hirschberg et al., 1988). It can be seen from Howe's formula (15) that in such a case we indeed expect absorption of sound by the vortex. It should however be clear that after half a period of the oscillation of the acoustic field the acoustic velocity will change sign, and therefore the vortex will produce acoustic energy. In a pipe with sharp edges, the amplitude of the acoustic velocity decreases very rapidly in space when we travel away from the edge (where $u_a$ is infinitely large!). Hence it is not surprising that the production is not able to compensate the initial strong absorption (Hirschberg et al., 1988). In a horn like the lips, the acoustic velocity is not singular at the separation point (there are no sharp edges). Therefore the initial absorption will be modest. Furthermore, it appears that if the travel time of the vortex in the horn matches the oscillation period, the energy production can be larger than the initial absorption. Indeed we observe, for a critical range of Strouhal numbers $Sr_0 = fD/U_0$ of order unity, that the energy reflection coefficient of a pipe termination with a horn can be larger than unity (Fig. 6 & 7). This is an essentially non-linear process, and it is therefore not surprising that in this range of $Sr_0$ the reflection of a harmonic wave generates higher harmonics (Fig. 8). A more detailed discussion of these data is given by Peters et al. (1992).

This type of behaviour is an explanation for human whistling (Shadle, 1985a; Wilson et al., 1971). As stated above, the mouth cavity with the pipe formed by our lips is a Helmholtz resonator which determines the frequency of oscillation. We adjust the blowing velocity $U_0$ to reach the critical range of $Sr_0$. In this case vortex shedding is certainly induced by the acoustic field. Other phenomena like the "singer's formant" could also be related to the occurrence of acoustically induced vortex shedding in the vocal tract.
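The adjustment of the blowing velocity to the critical Strouhal range can be sketched numerically. The example below assumes a target $Sr_0$ of exactly one as a stand-in for "order unity", together with an illustrative whistling frequency and lip opening; none of these numbers comes from the measurements discussed here.

```python
def strouhal(freq_hz, diameter_m, u0_m_s):
    """Strouhal number Sr0 = f D / U0."""
    return freq_hz * diameter_m / u0_m_s

def blowing_velocity(freq_hz, diameter_m, target_sr=1.0):
    """Mean flow velocity U0 that gives the target Strouhal number."""
    return freq_hz * diameter_m / target_sr

freq = 1000.0     # whistling frequency in Hz (assumed)
diameter = 0.01   # lip opening in m (assumed)

u0 = blowing_velocity(freq, diameter)
print(f"U0 for Sr0 = 1: {u0:.1f} m/s")
print(f"check: Sr0 = {strouhal(freq, diameter, u0):.2f}")
```

Because the resonator fixes the frequency, a higher note or a wider lip opening requires a proportionally higher blowing velocity to stay at the same Strouhal number.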

In some cases, periodic vortex shedding can be induced without acoustic feedback. The feedback is generated by the motion of the vortices. This is the case in a flute at very low blowing pressures. In such a case the blowing frequency is proportional to the blowing velocity and it is not as stable as in the case of an acoustically induced oscillation (Fabre, 1992). Another common example of such whistling, due to flow instability without acoustic feedback, is the sound produced by a cylinder when moved at high speed through the air (aeolian tones).

Fig. 6 Energy reflection coefficient $r_e$ measured at a pipe end with a horn, as a function of $U_0/(fD)$. Pipe diameter $D$ = 3 cm. The radius of curvature of the horn is 2 x $D$. The baffle diameter of the horn is 5 x $D$. The measurement is carried out with an accurate two-microphone method ($\Delta r_e/r_e$ = 0.2%). The data presented are for the fundamental $f_0$ (below 240 Hz) at an oscillation amplitude $(u_a)_1/U_0$ between 0.3 and 0.6, for flow conditions typical for phonation (Mach number below 0.1). As a reference we indicate the low frequency data obtained in the absence of flow (dashed line: $U_0 = 0$).

Fig. 7 End correction $\delta_p$ corresponding to the data of Fig. 6. Note that the end correction is defined by using the end of the uniform pipe cross section as reference plane, rather than the end of the horn (solid line: with flow; dashed line: $U_0 = 0$).

Fig. 8 Energy reflection coefficient $r_e$ for the higher harmonics of the excitation signal. Same conditions as Fig. 6. [$f_0$ = 60 Hz, $(u_a)_1/U_0$ = 0.64; $f_2 = 2f_0$, $(u_a)_2/U_0$ = 0.06; $f_3 = 3f_0$, $(u_a)_3/U_0$ = 0.11; $f_4 = 4f_0$, $(u_a)_4/U_0$ = 0.04].

Flow has another spectacular effect on the acoustic properties of an open pipe termination. The phase of the pressure reflection coefficient, which is commonly expressed as an end correction $\delta_p$, becomes strongly frequency dependent (Fig. 7). The end correction $\delta_p$ is defined here by taking the end of the straight pipe as reference for measuring the phase between the reflected and incoming pressure waves. At high Strouhal numbers $Sr_0 = fD/U_0$ we find a behaviour similar to the one found in the absence of flow. Around $Sr_0 = O(1)$ we observe a dramatic decrease in $\delta_p$. At low Strouhal numbers $\delta_p$ is, for a horn such as the lips, an order of magnitude lower than in the absence of mean flow. For a pipe with sharp edges the end correction decreases "only" by a factor 2.4, but the effect of the mean flow on the end correction is still present (Peters et al., 1992; Rienstra, 1983). Hence this effect is not a particularity of horns. Such effects have to be taken into account if we want to deduce the acoustical properties of the vocal tract from measurements of the geometry.

It is also interesting to note that vortex shedding will also occur in the absence of mean flow if the acoustic particle displacement is large enough ($u_a/(\omega D) = O(1)$). In such a case the acoustic energy will always be absorbed by the vortices, because there is no mean flow to extract energy from (Peters et al., 1992)!
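The displacement criterion can be evaluated for a given acoustic velocity amplitude. The following sketch assumes that the particle displacement amplitude is $u_a/\omega$ with $\omega = 2\pi f$; the numerical values and the function name are purely illustrative.

```python
import math

def displacement_ratio(u_a, freq_hz, diameter_m):
    """Acoustic particle displacement amplitude divided by D: u_a / (omega * D)."""
    omega = 2.0 * math.pi * freq_hz
    return u_a / (omega * diameter_m)

u_a = 2.0         # acoustic velocity amplitude in m/s (assumed)
freq = 100.0      # frequency in Hz (assumed)
diameter = 0.003  # opening width in m (assumed)

ratio = displacement_ratio(u_a, freq, diameter)
print(f"u_a / (omega D) = {ratio:.2f}")
```

A ratio of order one, as obtained with these assumed values, indicates that the air is displaced over a distance comparable to the opening, the regime in which vortex shedding is expected even without mean flow.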

We may conclude from this discussion that vortex shedding at the lips or teeth can significantly affect voiced speech production.


6. Conclusions

An exact solution of the equations of motion for the flow in the vocal tract cannot be obtained. Even a numerical solution will not be accurate because we are not yet able to describe an unsteady turbulent flow. Aeroacoustics provides a theoretical framework to develop adequate approximations. In many cases the formal approach leads to the solutions obtained intuitively. A major advantage of the formal approach is that it clarifies the limits of validity of an approximation. The formal approach provides a clear distinction between the acoustic problem of constructing a Green's function and the more difficult problem of estimating the generation of sound by the flow.

There is considerable freedom in the choice of the Green's function. The character of the source depends on this choice. A discussion of the monopole, dipole or quadrupole character of the source of sound cannot be carried out without specifying the Green's function.

The most important fluid dynamic aspects of phonation which we have identified are:
- The flow through the glottis in voiced sound production.
- The periodic vortex shedding at the lips in voiced sound production and whistling.
- The turbulent flow around sharp obstacles in fricative sound production.

The flow in the glottis can be described as a locally incompressible flow. We do not expect that turbulence will significantly affect this flow. Also, the details of the turbulence in the jet downstream of the glottis will not be relevant. Turbulence is important because it explains the existence of the transglottal pressure as a result of losses of stagnation pressure. A two-dimensional laminar flow approximation therefore seems a very reasonable first approximation in the glottis, if flow separation is accurately described. We expect that a quasi-stationary model will not be accurate, because the flow separation, which determines both the transglottal volume flux and the Bernoulli force, is expected to be essentially unsteady. The unsteadiness of the flow is essential as long as the starting vortex remains close to the glottis. However, we do not expect that the effect of the transglottal volume flux and that of the periodic vortex shedding at the glottis are essentially distinct phenomena. The main effect of the vorticity is to control the volume flux through the glottis by determining the flow separation behaviour. The excitation of the supraglottal part of the vocal tract by the dipole contribution of the vorticity is a minor phenomenon at low frequencies. The contribution of the dipole described by McGowan (1988) is only significant at high frequencies. We expect that in such a case an essentially unsteady description of the interaction with the false folds should be used. Examples of very similar problems are given by Howe (1975). Due to the Coanda effect, a symmetric flow is not expected to occur (Teager & Teager, 1983; Liljencrants, 1990). A simulation which imposes flow symmetry may therefore be less accurate than a full simulation. However, the occurrence of a Coanda effect in an oscillating flow has not yet been confirmed by experiments. As we expect in voiced sound production that the acoustic pressure fluctuations $p'$ will be comparable to the average transglottal pressure $\Delta p$, one should use a model which takes the variation in $p'$ into account (Titze, 1988; Rothenberg, 1981; Cranen, 1987). A reasonable approach, which is equivalent to the models of Titze (1988) and Cranen (1987), is to assume a linear acoustic response of the vocal tract, coupled to the numerical simulation of the flow in the glottis. Such a model has the great advantage of taking the acoustic loading into account. Effects such as a shift of formants due to a permanent leak of the glottis can be described by the model (Cranen, 1987).

The description of acoustically induced periodic vortex shedding is one of the main goals of our further research. Bruggeman et al. (1991) and Peters et al. (1992) present some interesting simple models which can be used to gain insight. A recent review of the literature on the application of vortex sound theory is given by Blevins (1990). In particular, when a two-dimensional flow approximation is reasonable, vortex shedding is most efficiently described in terms of singularities in a potential flow. The major difficulty is the description of the unsteady flow separation.

As the description of turbulent flow remains difficult, an accurate description of fricative sound is difficult. We have however seen that most of the results of Shadle (1985a,b) and of Barney et al. (1990) can be rationalised in terms of aeroacoustics. It is also interesting to investigate further why turbulence near an edge (such as the labium of a flute or a tooth) is so effective in exciting the first transverse mode of a pipe.


Including the contribution of the first propagating transverse mode in the vocal tract into a source/filter model is facilitated by the fact that the cutoff frequency is rather constant, because it is mainly determined by the width of the vocal tract and not by the height.

It seems reasonable to conclude that a more systematic approach to phonation, integrating flow and acoustics, can be very valuable. In this sense the remarks of Teager & Teager (1983, 1990) should be given the credit that they have already triggered some research (Kaiser, 1983; McGowan, 1988; and this review!). Many of the questions raised by Teager & Teager, however, are either facts of life which are easily answered using the present knowledge of aeroacoustics, or remain open questions. For example, aeroacoustics teaches us that sound production in a low Mach number compact flow is a very inefficient process. It is not at all surprising that only a minute fraction of the flow energy contributes to speech. This fact also explains why a considerable improvement of sound production is possible: we have a large energy supply available! The proposition of Teager & Teager (1983, 1990) to carry out accurate in vivo measurements of turbulence in the vocal tract in order to deduce sound production is an impracticable procedure. We will never obtain more than some qualitative insight from such flow measurements in a vocal tract. In vivo experiments are a challenge because of the many potential sources of error. I therefore very much appreciate the in vitro type of experiments of Gupta et al. (1973), Shadle (1985a,b) and Barney et al. (1990). I think that such experiments are in many respects much more relevant than many in vivo experiments.

The statement of Teager that "... pure theory (is) running a distance last when it attempts to explain the totally unexpected" can in aeroacoustics often be reversed. Experiments without a reasonable formal background are useless in aeroacoustics. In some cases theory even predicted a behaviour which was not obvious from the experimental data. Lighthill's $(U_0/c)^8$ power law is such an example.
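The steepness of an eighth-power law is easier to appreciate in decibels. The sketch below assumes that the radiated acoustic power scales as $U_0^8$ with all other parameters held fixed; the function name and the velocity ratios are illustrative.

```python
import math

def level_change_db(velocity_ratio, exponent=8):
    """Change in radiated power level (dB) when the flow velocity is scaled.

    Assumes power proportional to velocity**exponent, other parameters fixed.
    """
    return 10.0 * exponent * math.log10(velocity_ratio)

for ratio in (1.1, 2.0):
    print(f"velocity x {ratio}: {level_change_db(ratio):+.1f} dB")
```

Under this assumption a doubling of the flow velocity raises the radiated power by about 24 dB, a dependence that is hard to guess from scattered measurements alone.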

Finally, Teager & Teager's (1983, 1990) statement that a source/filter model is not an accurate description of phonation, because the flow in the vocal tract is essentially non-linear, seems a quite reasonable but extremely general statement. However, we could state that it is fascinating how well this caricature performs in view of all the potential problems which it neglects.

Acknowledgement

This investigation in the programme of the Foundation for Fundamental Research on Matter (FOM) has been supported by the Netherlands Technology Foundation (STW project ETN 71.1403). I wish to thank C. d'Alessandro, P. Badin, B. Cranen, J.H. Eggen, A. Houtsma and J. Smith for teaching me the little I know about speech and for making me share their enthusiasm for this subject. I wish to thank R. McGowan for his comments on the draft of this paper.

List of symbols

$A$            vector potential
$B$            total enthalpy
$c$            local speed of sound
$c_0$          speed of sound at the listener position
$D$            characteristic length scale, pipe diameter
$f$            frequency
$f_c$          cutoff frequency for transversal pipe resonance
$f_i$          frequency of harmonic $i$
$F_b$          Bernoulli force on the vocal cords
$F$            force
$G$            Green's function $G(x, t | y, \tau)$
$G_0$          free space Green's function $G_0 = \delta(t - \tau - |x - y|/c_0)/(4\pi c_0 |x - y|)$
$h$            aperture of the glottis, distance between the vocal cords
$He$           Helmholtz number, $He$ = …
$i$            specific enthalpy
$k$            wave number $k = \omega/c$
$L$            tube length
$M$            Mach number $M = U_0/c$
$n$            integer
$n_i$          outer normal on surface $S$
$\Delta p$     transglottal pressure
$p$            pressure
$p'$           pressure fluctuations
$p_0$          reservoir pressure
$Q$            source of sound
$Re$           Reynolds number $Re = U_0 D/\nu$
$r_e$          energy reflection coefficient
$s$            specific entropy
$S$            surface enclosing $V$
$Sr_0$         Strouhal number $Sr_0 = fD/U_0$
$t$            time
$T$            oscillation period $T = f^{-1}$
$T_{ij}$       Lighthill's stress tensor
$U_0$          mean flow velocity
$\mathbf{u}_a$ acoustic velocity
$u_a$          mean acoustic velocity
$\mathbf{v}$   particle velocity
$V$            volume, domain of definition of $G$
$\omega$       vorticity $\omega = \nabla \times \mathbf{v}$
$W_a$          acoustic work
$W_b$          work of Bernoulli force $F_b$
$x_i$, $x$     position of the listener
$y_i$, $y$     position of the source
$\delta_p$     end correction for reflection of plane waves at an open pipe termination
$\delta_v$     viscous boundary layer thickness
$\delta$       distance between source and sink in dipole
$\lambda$      acoustic wavelength
$\omega$       radial frequency $\omega = 2\pi f$
$\rho$         density
$\rho'$        density fluctuations
$\rho_0$       average density
$\phi$         flow potential
$\phi'$        acoustic potential
$\phi_0$       mean flow potential
$\sigma_{ij}$  viscous stress tensor

References

BARNEY, A.M., SHADLE, C.H. & THOMAS, D.W. (1990). Airflow measurement in a dynamic mechanical model of the vocal folds. University of Southampton, report (Southampton, UK), 155-158.

BATCHELOR, G.K. (1967). An introduction to fluid dynamics. Cambridge University Press.

BINDER, G. & RONNEBERGER, D. (1991). Response of shear flows to imposed unsteadiness. In INPG (Ed.), Proceedings of the Euromech Colloquium N° 272. Aussois, France: INPG.

BLAKE, W.K. (1986). Mechanics of flow-induced sound and vibration, Vol. I: General concepts and elementary sources. Applied Mathematics and Mechanics, 17-I, Academic Press Inc., Orlando.

BLEVINS, R.D. (1990). Flow-induced vibration (2nd ed.). New York: Van Nostrand Reinhold.

BRUGGEMAN, J.C., HIRSCHBERG, A., VAN DONGEN, M.E.H., WIJNANDS, A.P.J. & GORTER, J. (1991). Self-sustained aeroacoustic pulsations in gas transport systems: experimental study of the influence of closed side branches. Journal of Sound and Vibration, 149, forthcoming (1-23).

CRANEN, B. (1987). The acoustic impedance of the glottis: measurements and modeling. Unpublished PhD dissertation, Katholieke Universiteit te Nijmegen, The Netherlands.

DOWLING, A.P. & FFOWCS WILLIAMS, J.E. (1983). Sound and sources of sound. Chichester: Ellis Horwood Publishers.

FABRE, B. (1992). La production du son dans les instruments de musique à embouchure de flûte. PhD thesis, Université du Maine, Le Mans.

GOLDSTEIN, M.E. (1976). Aeroacoustics. New York: McGraw Hill.

GUPTA, V., WILSON, T.A. & BEAVERS, G.S. (1973). A model for vocal cord excitation. Journal of the Acoustical Society of America, 54, 1607-1617.

HEGERL, G.C. (1989). Neue Ansätze zur numerischen Simulation der Glottisanregung und -strömung. Fortschritte der Akustik, DAGA, 323.

HIRSCHBERG, A., BRUGGEMAN, J.C., WIJNANDS, A.P.J. & MORGENSTERN, M. (1988). The "whistler nozzle" and horn as aeroacoustic sound sources in pipe systems. In Proceedings of the Institute of Acoustics, 10, 701-708.

HIRSCHBERG, A., VAN DE LAAR, R.W.A., MARROU-MAURIÈRES, J.P., WIJNANDS, A.P.J., DANE, H.J., KRUIJSWIJK, S.G. & HOUTSMA, A.J.M. (1990). A quasi-stationary model of air flow in the reed channel of single reed woodwind instruments. Acustica, 70, 146-154.

HIRSCHBERG, A., GILBERT, J., WIJNANDS, A.P.J. & HOUTSMA, A.J.M. (1991). Non-linear behaviour of single-reed woodwind musical instruments. NAG Journaal, Nederlands Akoestisch Genootschap, 107, 31-43.

HOWE, M.S. (1975). Contributions to the theory of aerodynamic sound, with application to excess jet noise and the theory of the flute. Journal of Fluid Mechanics, 71, 625-673.

HOWE, M.S. (1980). The dissipation of sound at an edge. Journal of Sound and Vibration, 70, 407-411.

HOWE, M.S. (1984). On the absorption of sound by turbulence and other hydrodynamic flows. IMA Journal of Applied Mathematics, 32, 187-209.

IIJIMA, H., MIKI, N. & NAGAI, N. (1988). Viscous flow analyses of the glottal model using a finite element method. In Proceedings of the Second Joint Meeting of ASA and ASJ, paper N° S5.9, IEEE 1989, 246-249.

INGARD, U. & SINGHAL, V.K. (1975). Effect of flow on the acoustic resonances of an open ended duct. Journal of the Acoustical Society of America, 58, 788-793.

ISHIZAKA, K. & MATSUDAIRA, M. (1972). Fluid mechanical consideration of vocal cord vibrations. (Speech Communication Research Laboratory Monograph N° 8).

KAISER, J.F. (1983). Some observations on vocal tract operation from a fluid point of view. In I.R. Titze & R.C. Scherer (Eds.), Vocal Fold Physiology (pp. 358-386). Denver, CO: Denver Center for Performing Arts.

KEVORKIAN, J. & COLE, J.D. (1981). Perturbation methods in applied mathematics. New York: Springer-Verlag.

KOIKE, Y. (1980). Sub- and supraglottal pressure variation during phonation. In K.N. Stevens & M. Hirano (Eds.), Vocal Fold Physiology (pp. 181-192). Tokyo: Univ. of Tokyo.

LESSER, M.B. & CRIGHTON, D.G. (1975). In Physical Acoustics, 11, Ed. by Mason, W.P., Academic Press.

LIGHTHILL, J. (1978). Waves in fluids. Cambridge: Cambridge University Press.

LILJENCRANTS, J. (1990). Numerical simulations of glottal flow. Forthcoming in J. Gauffin & B. Fritzell (Eds.), Vocal Fold Physiology. Raven Press.

LUGT, H.J. (1983). Vortex flow in nature and technology. New York: John Wiley & Sons.

MCGOWAN, R.S. (1988). An aeroacoustic approach to phonation. Journal of the Acoustical Society of America, 83, 696-704.

MEECHAM, W.C. (1965). Surface and volume sound from boundary layers. Journal of the Acoustical Society of America, 37, 516-522.


MILNE-THOMSON, L.M. (1966). Theoretical aerodynamics (4th edition). Macmillan & Co., Dover Edition (1973).

MORSE, P.M. & FESHBACH, H. (1953). Methods of theoretical physics, Vol. I & II. New York: McGraw-Hill Book Co.

PRANDTL, L. & TIETJENS, O.G. (1934). Fundamentals of hydro- and aeromechanics. New York: Dover Publications, Inc.

PETERS, M.C.A.M., HIRSCHBERG, A., VAN DE KONIJNENBERG, J., HUIJSMAN, F., DE LEEUW, R.W., OP DE BEEK, S. & WIJNANDS, A.P.J. (1992). Experimental study of the low Mach number limit of the acoustic behaviour of open pipe termination. Paper presented at the 14th AIAA conference on Aeroacoustics, Aachen, Germany, 11-14 May 1992; and paper submitted to the Journal of Fluid Mechanics.

POWELL, A. (1964). Theory of vortex sound. Journal of the Acoustical Society of America, 36, 177-195.

POWELL, A. (1990). Some aspects of aeroacoustics: from Rayleigh until today. Transactions of ASME, Journal of Vibration and Acoustics, 112, 145-159.

RIENSTRA, S.W. (1983). A small Strouhal number analysis for acoustic wave-jet flow-pipe interaction. Journal of Sound and Vibration, 86, 539-556.

ROTHENBERG, M. (1981). Acoustic interaction between the glottal source and the vocal tract. In K.N. Stevens & M. Hirano (Eds.), Vocal Fold Physiology. Tokyo: Univ. of Tokyo, 305-328.

SCHERER, R.C. & TITZE, I.R. (1983). Pressure-flow relationships in a model of the laryngeal airway with diverging glottis. In D.M. Bless & J.H. Abbs (Eds.), Vocal Fold Physiology: Contemporary Research and Clinical Issues (pp. 177-193). San Diego, CA: College Hill.

SHADLE, C.H. (1985a). The acoustics of fricative consonants. PhD Thesis, Dept. of Electrical Engineering and Computer Sciences, MIT, Rsch. Lab. Elect. Report N° 506.

SHADLE, C.H. (1985b). Models of fricative consonants involving sound generation along the wall of a tube. In Proceedings of the International Conference on Acoustics N° 12, A3-4.

TEAGER, H.M. & TEAGER, S.M. (1983). The effects of separated air flow on vocalizations. In D.M. Bless & J.H. Abbs (Eds.), Vocal Fold Physiology: Contemporary Research and Clinical Issues (pp. 124-145). San Diego, CA: College Hill.

TEAGER, H.M. & TEAGER, S.M. (1990). Evidence for non-linear production mechanism in the vocal tract. In W.J. Hardcastle & A. Marchal (Eds.), Speech Production and Speech Modeling. Dordrecht, The Netherlands: Kluwer Academic Pub.

THOMAS, T.J. (1986). A finite element model of fluid flow in the vocal tract. Computer Speech and Language, 1, 131-151.

TITZE, I.R. (1988). The physics of small-amplitude oscillation of the vocal folds. Journal of the Acoustical Society of America, 83, 1536-1552.

TRITTON, D.J. (1988). Physical fluid dynamics (2nd edition). Oxford: Clarendon Press.

WILSON, T.A., BEAVERS, G.S., DE COSTER, M.A., HOLGER, D.K. & REGENFUSS, D. (1971). Experiments on the fluid mechanics of whistling. Journal of the Acoustical Society of America, 50, 366-372.