aural-perceptual speaker identification - problems with non contemporary samples

13
Aural-perceptual speaker identification: problems with noncontemporary samples Harry Hollien* and Reva Schwartz *Institute for Advanced Study of the Communication Processes, University of Florida ABSTRACT The degree to which speech and/or speech samples are noncontemporary is considered important to the speaker identification process. There are two dimensions to the problem; the first relates to the listener and, especially, to earwitness lineups. Here, the subject or witness is asked to make identifications at various times after having heard (but not having seen, of course) the speaker. It has been found that a person’s memory for a voice decays over time. In the second case, it is the samples of the speaker’s utterances which are temporally displaced. The prevailing opinion here has been that the use of non- contemporary speech samples poses just as difficult a challenge to the speaker identifi- cation process as does the decaying memory of a witness. Accordingly, research was carried out to test this possibility (Hollien and Schwartz, in press); it was found that the overall drop in correct identification over latencies from four weeks to six years was only about 15–25 per cent. It was not until the greatest of the time separations was studied (i.e., twenty years) that a substantial drop occurred (to 31 per cent). At this juncture, a number of questions arose; and three of them have been investigated. First, is listener gender important to the process; second, are the identification levels affected by the type of lis- teners employed and, finally, can external factors serve to differentially degrade listener performance? It was found that the first question could be answered in the negative and the second two in the affirmative. These findings should aid in clarifying some of the rela- tionship between sample latency and identification accuracy. KEYWORDS speaker identification, auditory identification, noncomtempory speech, perceptual processing INTRODUCTION Most forensic approaches to speaker identification take one of two major forms (there are exceptions, of course). In one instance, a victim or witness is asked to identify the voice of another person; one who was heard but not seen. One example: a telephone call. A second example: a woman was raped but did not actually see the perpetrator – only heard his speech and voice. She is asked to make an identification (earwitness lineup) at some time after (often long after) having heard him speak. The second of the two procedures is one where a voice has been recorded (examples: death threats, bomb threats, sexual harassment) and attempts are made later to determine the identity of the speaker, often from a pool of suspects. Here, © University of Birmingham Press 2000 1350-1771 Forensic Linguistics 7(2) 2000

Upload: gggtapir

Post on 22-Apr-2015

83 views

Category:

Documents


6 download

TRANSCRIPT

Page 1: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

Aural-perceptual speaker identification:problems with noncontemporary samples

Harry Hollien* and Reva Schwartz

*Institute for Advanced Study of the Communication Processes,University of Florida

ABSTRACT The degree to which speech and/or speech samples are noncontemporary isconsidered important to the speaker identification process. There are two dimensions tothe problem; the first relates to the listener and, especially, to earwitness lineups. Here, thesubject or witness is asked to make identifications at various times after having heard (butnot having seen, of course) the speaker. It has been found that a person’s memory for avoice decays over time. In the second case, it is the samples of the speaker’s utteranceswhich are temporally displaced. The prevailing opinion here has been that the use of non-contemporary speech samples poses just as difficult a challenge to the speaker identifi-cation process as does the decaying memory of a witness. Accordingly, research wascarried out to test this possibility (Hollien and Schwartz, in press); it was found that theoverall drop in correct identification over latencies from four weeks to six years was onlyabout 15–25 per cent. It was not until the greatest of the time separations was studied (i.e.,twenty years) that a substantial drop occurred (to 31 per cent). At this juncture, a numberof questions arose; and three of them have been investigated. First, is listener genderimportant to the process; second, are the identification levels affected by the type of lis-teners employed and, finally, can external factors serve to differentially degrade listenerperformance? It was found that the first question could be answered in the negative andthe second two in the affirmative. These findings should aid in clarifying some of the rela-tionship between sample latency and identification accuracy.

KEYWORDS speaker identification, auditory identification, noncomtempory speech,perceptual processing

INTRODUCTIONMost forensic approaches to speaker identification take one of two majorforms (there are exceptions, of course). In one instance, a victim or witnessis asked to identify the voice of another person; one who was heard butnot seen. One example: a telephone call. A second example: a woman wasraped but did not actually see the perpetrator – only heard his speech andvoice. She is asked to make an identification (earwitness lineup) at sometime after (often long after) having heard him speak. The second of thetwo procedures is one where a voice has been recorded (examples: deaththreats, bomb threats, sexual harassment) and attempts are made later todetermine the identity of the speaker, often from a pool of suspects. Here,

© University of Birmingham Press 2000 1350-1771

Forensic Linguistics 7(2) 2000

Page 2: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

200 Forensic Linguistics

samples of both the ‘unknown’ speaker (evidence tapes) and various othertalkers (knowns) are available, with the tapes of the suspects or ‘knowns’collected later (often much later than the evidence tapes) in the form ofexemplars. Since the samples to be compared are acquired at differentpoints in time, their noncontemporariness can be rather substantial eventhough they may be processed by an examiner at a single session. The dif-ference between the first procedure (the earwitness lineup or voice parade)and the second (speaker identification using noncontemporary samples) isillustrated in Figure 1. The tasks involving judgements of noncontem-porary speech samples are portrayed in the top panel; a typical earwitnesslineup procedure in the lower.

BackgroundInformation basic to earwitness lineups can be found in the aural-per-ceptual speaker identification literature; summaries and reviews of the rel-evant relationships are available (Hecker 1971; Hollien 1990; Künzel1987; Nolan 1983; Stevens 1971; Yarmey 1995). Still other authors have

Figure 1 A graphic portrayal of the differences between the present exper-iments (top box) and the way in which earwitness identification iscarried out (bottom box). In the current instance, a processor (Pis an auditor or computer operator) attempts to determinewhether or not the speakers heard in samples (A) and (B) are thesame person. Note that it is the recordings which are made at dif-ferent times. In earwitness identification, there are no recordingsof the perpetrator's voice (C). Rather, the listener (L) attempts toremember his voice and pick him out of a lineup (D to H).

Page 3: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

201Aural-perceptual speaker identification

focused even more specifically on the ‘earwitness’ process itself bystudying memory decay, memory for voices and so on. It is now quite wellestablished that witness’ performance and reliability are somewhatcomplex (Aarts 1984; Hollien 1988; Reich and Duke 1979; Rosenberg1973; Scherer 1986; Williams 1964) and, more to the point, that aperson’s memory for a voice decays as a function of time (Bull andClifford 1984; DeJong 1998; Hollien 1996; Hollien, et al. 1983; Künzel1994; McGehee 1937, 1944; Saslove and Yarmey 1980; Yarmey 1993).Some deviation among the generally observed patterns and decay slopeshas been reported but these shifts appear related to variation in: (1) theprocessing of the stored information, (2) the specific tasks being carriedout, and/or (3) the procedures employed by the investigators.

As stated, it is this second type of speaker identification task which iscurrently being studied. Here, samples of a speaker’s utterances areobtained at different latencies over a period of time and, then, subjected toidentification procedures. Thus, it is the talker’s performances (or changesin them) at specific points in time that are the variables of interest. In thisregard, it has been suggested that noncontemporary speech poses just asdifficult a challenge to the speaker identification process as does decay ofa listener’s memory (Rothman 1977). Yet, surprisingly, very little addi-tional research has been carried out on the question. Of course, Endress,Bambach and Flosser (1971) did address the issue – at least tangentially –when they investigated the effects of ageing on speaker identification. Aswith Suzuki et al. (1996), they studied changes in certain speech charac-teristics but found no universals even over long periods of time. Thus, itappeared that there was but a single investigation in which the authordirectly addressed the effects of noncontemporary speech on speaker iden-tification. Rothman (1977) studied twenty-four talkers who were recordedreading a passage and then recorded again, one week later. His resultswere somewhat unexpected as he found that the mean correct identifi-cation scores dropped to 42 per cent when noncontemporary sampleswere employed. If the Rothman findings are accurate, the use of noncon-temporary speech samples can have a rather serious negative effect on thespeaker identification process – and especially if aural-perceptual proce-dures are used.

It was not until 1995 that Schwartz posed the question of whetherpeoples’ voices actually changed dramatically over short periods of time.Her response was to carry out pilot research that roughly paralleledRothman’s project; her results were not consistent with his. Accordingly, alarge investigation was designed and conducted in an attempt to clarify thecited relationships (Hollien and Schwartz, in press). It was discovered thatapparently Rothman was in error. However, the new research, in turn, leftsome questions unanswered. But first, a review.

Page 4: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

202 Forensic Linguistics

The basic experimentDetails of the cited investigation can be found in Hollien and Schwartz (inpress); however, they will be reviewed briefly. The procedures weredesigned to test the possible effects of noncontemporary speech samples onspeaker recognition. They were assessed by means of four parallel experi-ments – two involving relatively short speaking latencies (four and eightweeks; four and thirty-two weeks) and two others based on comparisonssix and twenty years apart. Speech material for the short term pairings com-prised 6–8 second sentences drawn from the Fisher–Logemann articulationtest; those for the long term identifications (six and twenty years) were 7–9second sentences extracted from standard reading passages (such as the‘Rainbow Passage’ or ‘Apology for Idlers’). Talkers for the first set of exper-iments were normal, healthy males drawn from the faculty, staff and stu-dents at the Institute for Advanced Study of the Communication Processes,University of Florida; those for the longer latencies were individuals whowere presently available to the experimenters but who also had been atalker in earlier (related) experiments – and, hence, had high-qualitysamples of their speech stored in the IASCP database. Since at least tentalkers were required for each procedure, sufficiently large groups (wheresubjects met all past and present criteria) were found only in the 1989database (six-year separation), and in that for 1975 (twenty-year sepa-ration). These samples were extracted and the subjects re-recorded. A totalof 149 listeners were employed; they were distributed among auditorgroups varying from thirty to forty-one members. The procedure employedwas the paired comparison technique (ABX) with randomized/counterbal-anced pairs made up of samples drawn from material at each of two pointsin time (zero and four weeks, for example). Between sixty-eight andseventy-four token pairs were presented in each of the four sub-experi-ments with the patterns for each virtually identical; variation resulted onlyif a single pairing (example: zero to six years) or two (for example: zero tofour, zero to eight, weeks) were structured. Listeners also were required tomeet selection criteria; this included a hearing test (>92 per cent correctSRT) and demonstration of the ability to recognize if each of a series of testpairs was produced by the same or different talkers at an 85 per cent orbetter level of correct identification. These test samples were mixed ran-domly with the experimental series in roughly 15–20 per cent (same) and35 per cent (different) proportions, respectively. Listeners’ test responseswere checked first and, if a subject did not reach the 85 per cent correctidentification level on both, his or her experimental responses were dis-carded. In fact none of the auditors had to be eliminated as their correctjudgement ranges were 90–100 per cent and 87–100 per cent respectively.

The results from this experiment may be best understood by consider-ation of Figure 2. As can be seen, subjects scored an expected 95 per centcorrect identification for the contemporary speech, after which their

Page 5: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

203Aural-perceptual speaker identification

scores dropped (four-week condition) to a band roughly between 70 percent and 85 per cent and remained in this band for at least six years. Thesedata are consistent with those reported by Schwartz in 1995, (i.e., 79.3and 81.6 per cent correct) but not with Rothman’s (1977). Once these pat-terns were established, a number of additional problems became evident.For example, why did a drop of about 20 per cent in correct identificationoccur between the contemporary utterances and those occurring only fourweeks later and why did a very sharp drop occur between the six- andtwenty-year separations? Three relevant experiments were conducted;they should provide at least some information about these issues.

METHODThe procedures for the cited experiments exactly paralleled those for theprimary investigation (Hollien and Schwartz, in press). That is, taperecordings of male voices were presented in pairs (at different latencies) andnormally hearing listeners judged if the two were produced by a singlespeaker, or two different ones. As stated, subjects were able to do so but somequestions remained. Three specific ones were extracted for study. While theywill not answer all of the cited concerns, they constitute an initial response.

Figure 2 Mean scores from the primary study. Subjects responded tononcontemporary samples as a function of delays of from fourweeks to twenty years. The contemporary baseline is the mean(95.1 per cent) for all 149 listeners. The fitted curve is asecond-order polynomial.

Log2 Time (in weeks)

Perc

ent

Cor

rect

Page 6: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

204 Forensic Linguistics

Gender effects The first issue of interest was gender and the possibility that one of thesexes might have biased the results by performing differently from theother. There were slightly over two female listeners to every male (105 : 44) and this imbalance could have had an effect on the data if,indeed, gender differences exist. In this regard, it was as long ago as 1937that McGehee first reported that the performance of her male listenerswas markedly superior to that of her females. Since then, anecdotal andtangential evidence (while mixed) occasionally supported her position.More recent studies, however, have not shown differences of this type(Clifford 1980; Thompson 1985; Yarmey and Matthys 1992) even thoughsometimes a slight trend can be seen. Thus, it would appear appropriate totest for this possibility, and if it occurs, the curve found in Figure 2 mighthave to be modified.

Subjects for this experiment were drawn directly from the listenercohorts employed in the original study. Since there were only forty-fourmales, an equal sample of forty-four females was randomly drawn fromthe overall total of 105. The sole exception was that the number offemales drawn from each of the four procedures exactly equalled thenumber of the men in that procedure. Once retrieved, the male–femalescores were compared.

Performance by professionalsThe second experiment explored the possibility that trained professionalswould perform differently from the 149 student listeners employed in theprimary investigation (see also Köster 1981; Shirt 1984). It involved a newgroup of listeners. First, it was asked if the decline in correct identifica-tions of contemporary speech and various noncontemporary samples(roughly 95 per cent vs. 74–85 per cent) might be due, in any way, to therelative inexperience of the student auditors. Second, did a similar rela-tionship exist for the sharp drop between six and twenty years? The lis-teners for this second experiment were eight experienced phoneticianswith advanced degrees; moreover, half of them knew at least a few of thetalkers. Admittedly, it would have been desirable if this latter relationshiphad not occurred at all but it must be conceded that it is most difficult toobtain both the services of eight qualified phoneticians and eleven talkerswho can provide laboratory quality samples twenty years apart. On theother hand, this experiment included nearly optimum listeners (i.e.,trained, experienced professionals). They would be expected to be highlymotivated and resistant to any distracting conditions/events on the tapes –if, indeed, such did occur. The new listeners were first presented the taperecording which was used as the basis for the fourth sub-experiment in theprimary project (i.e., the zero vs. twenty-year sets). Later, five of these

Page 7: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

205Aural-perceptual speaker identification

same subjects were presented a similar tape but where the voices had beenrecorded four weeks apart (the procedure was again identical with theothers). As with the other research, listeners were requested to make‘same’ and ‘different’ judgements.

External variablesThe third investigation was designed to determine if at least one type of‘distractor’ could degrade the ability of listeners to carry out accuratespeaker identification tasks of the type studied here. There is some evi-dence that such might be the case (Aarts 1984; Bull and Clifford 1984;Hollien et al. 1982; Künzel 1994; Reich and Duke 1979). This exper-iment involved a new group of speakers, a set which included a number of‘sound-alikes’. These voices (included were pairs of brothers, fathers andsons, and so on) had been identified as such for other research. It was pos-tulated that these built-in similarities might degrade the discriminationtask. The experimental procedure employed was virtually identical to allof the others. Specifically, it focused on thirty-three listeners whoresponded to the same type of stimuli in exactly the same way as did all ofthe others. This entire process was repeated with a second group of thirty-two auditors (primarily for reliability purposes). If complicating (or dis-tracting) factors such as this one have but little (or no) effect on theprocess, correct identification scores in the region of 75 to 80 per centcould be expected.

RESULTSGender The results of the procedure comparing the men’s responses to those ofthe women may be best understood by consideration of Table 1. First, itshould be remembered that the forty-four male and the forty-four femalelisteners also judged the contemporary pairs and these judgements pro-vided their base-line data. Second, please note that the total number ofauditors in the five experimental categories seen in Table 1 will be foundto be greater than forty-four. The reason for this is that anyone who par-ticipated in the zero-to-eight-week sessions or in the zero-to-thirty-two-week task also made zero-to-four-week comparisons. Hence the fifteensubjects found in the zero-to-four-week category are divided (as eight andseven respectively) into the zero-to-eight- and zero-to-thirty-two-weekcategories. Finally, it should be noted that the number of subjects (or N)actually refers to the number of pairs presented; there actually were, forexample, sixteen ‘subjects’ in the zero-to-twenty-year category (i.e., eightmales and eight females).

As may be seen from consideration of the table, the overall differences

Page 8: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

206 Forensic Linguistics

between the sexes were rather small. The males did score slightly higher infour of the six categories but their overall advantage in correct identificationwas tiny (mean = 0.15 per cent); it certainly was not statistically significant.The results, therefore, agree with most of those reported during the pasttwenty to twenty-five years; they certainly do not support McGehee’s obser-vation. The importance of these data is that subjects’ gender cannot accountfor any of the patterns reported in the primary study. That is, both sexes per-formed in a similar fashion; they did equally well on the task.

ProfessionalsConsideration of Table 2 will reveal that the phoneticians did strikinglybetter in the voice identifications than did the students; this was to beexpected (Köster 1981; Schiller and Köster 1998). Yet it must be remem-bered that they (the students) were not typical ‘lay’ listeners as all had back-grounds (albeit somewhat limited) in phonetics, linguistics and/or speech.Nevertheless, these data underscore the fact that trained, experienced

Table 1 Correct responses to the paired speech samples uttered by indi-viduals 4, 8, 32 weeks, and 6, 20 years apart. Listeners wereyoung and healthy, there were forty-four males and forty-fourfemales.

Table 2 Comparison of the responses of experienced phoneticians withthose of student listeners

Listeners N Contemporary Noncontemporary (per cent) (per cent)

A latency: 4 weeksphoneticians 5 100 89students 67 93 76

B latency: 20 yearsphoneticians 8 100 74students 41 98 33

Condition No. pairs Males Females Diff.

Contemp. 44 92.0 91.1 -0.90–4 weeks 15 73.6 74.0 +0.40–8 weeks 8 79.9 79.5 -0.40–32 weeks 7 71.0 73.6 +2.60–6 years 10 86.1 85.0 -1.10–20 years 8 32.6 31.1 -1.5

Page 9: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

207Aural-perceptual speaker identification

phoneticians can be expected to carry out speaker identification tasks in amanner that is substantially superior to that of individuals who do notenjoy their background and sophistication. More importantly, however,these data also demonstrate that, while lay listeners can be expected toperform quite well when the task is optimal (contemporary samples, highfidelity recording conditions, etc.) their accuracy will suffer even whensignal degradation is slight – thus the differences that can occur after onlyfour-week latencies. Note the disparity between the respective correctscores of 89 per cent and 76 per cent (respectively) for these judgements.This difference would appear compelling even though a statistical test wasnot applied due to the extreme differences in population size (five vs.sixty-seven).

Just as striking is the mean score of 74 per cent correct identificationachieved by the phoneticians for the zero-to-twenty-year pairs. This dif-ference is not as useful in demonstrating the disparity between the listenercohorts as it is in demonstrating that most people’s speech does not changeas dramatically as might be thought even over relatively long periods oftime. Indeed, these data underscore the argument that noncontemporaryspeech samples can be used effectively in nearly all types of speaker identi-fication.

The issue of listener familiarity with the speakers should be addressed atthis juncture. As it turns out, neither the student nor the professionalsknew any of the speakers used in the zero-to-four-week contrast. Yet, thegroup performance differed here, and did so quite dramatically(undoubtedly due to differences in skill). On the other hand, it must beconceded that familiarity with the talkers could have raised the pho-netician’s scores somewhat when the twenty-year contrasts were tested.None the less, with one exception, many of the phoneticians were notfamiliar with the talkers and, those that were, knew only some of them(i.e., the talkers). This knowledge was also displaced in time (i.e., they hadnot heard the speakers for many years). It is also of interest to note that,while seven of the eight auditors knew one particular talker rather well,not a single one of them correctly matched his sample to the one he haduttered twenty years previously. In short, it is argued that any distortingeffect here probably was of minor consequence.

An external variableThe third set of experiments was structured to (potentially) degrade theidentification process by employing similar sounding voices. The resultshere can be best understood by consideration of Figure 3. First, it can beseen that, with mean scores of 95 per cent, and 92 per cent correct, bothgroups performed appropriately with respect to the contemporary utter-ances (a range of 92–96 per cent was reported for subjects in the other

Page 10: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

208 Forensic Linguistics

studies). They did so even though the distractor condition would beexpected to create, at least, minor problems.

On the other hand, severe degradation was found when the ‘distraction’of sound-alike speakers was confounded with a four-week time separationbetween the samples in each of the pairs. Note the sharp drop in correctidentification (to 44 per cent, and 40 per cent respectively); it is wellbeyond that which would be expected for a four-week latency alone.

Figure 3 Correct identification levels for two groups of listeners whoresponded to contemporary speech and, then, to noncontem-porary sets of samples with a four-week time separation. Thespeaker cohort consisted primarily of pairs of individuals whosounded like each other.

Page 11: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

209Aural-perceptual speaker identification

Again, the first group was slightly better at the task than the second butthis difference was not significant (P>0.14). In any event, these datastrongly suggest that any factor which degrades the signal can have a neg-ative effect on speaker identification accuracy when it is attempted by laywitnesses. Since the relevant research literature abounds with findings thatwould – and could – suggest possible sources of degradation (see, forexample, DeJong 1998; Hollien 1990; Nolan 1983; Yarmey 1995), atleast some reduction in accuracy here was expected. What was unexpectedwas the severity of the drop. Quite obviously, a number of related factorsmust be researched if human performance relative to speaker recognitionis to be appropriately understood.

CONCLUSIONSThese investigations serve to confirm the results of the primary exper-iment. That is, the (previously) predicted severe reduction in the ability oflisteners to identify individuals from noncontemporary speech samplesover long periods of time was not substantiated. Rather, it appears thatnoncontemporary speech samples can be expected (overall) to show onlyminimal decay – especially when employed in aural-perceptual speakeridentification – for periods of up to six years and, perhaps, even longer ifexperienced professionals are employed. Of course, a particular individualmight show change sufficient to be counted as an exception; however,such would not be the case for most auditors.

What also may be concluded from these additional studies is that:1 Either men or women may be used with impunity in speaker identifi-

cation tasks.2 Professional training and experience will result in superior per-

formance.3 Factors which degrade the signal or operate negatively on the auditor

can sharply reduce performance level.4 The nature and severity of the negative effects are not yet very well

understood.

ACKNOWLEDGMENTSThis research was supported by grants from the National Institutes ofHealth (especially, NIAAA 09377) and the US Army Research Office. Theauthors also wish to thank Dr Gea DeJong and Dr H. B. Rothman for theirassistance during various phases of the project.

Page 12: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

210 Forensic Linguistics

REFERENCESAarts, N. W. (1984) ‘The effect of listener stress on perceptual speaker

identification’, unpublished MA thesis, University of Florida.Bull, R. and Clifford, B. R. (1984) ‘Earwitness voice recognition

accuracy’, in G. L. Wells and E. Loftus (eds) Eyewitness Testimony:Psychological Perspectives, Cambridge: Cambridge University Press.

Clifford, B. R. (1980) ‘Voice identification by human listeners, on ear-witness reliability’, Law and Human Behavior, 4: 373–94.

DeJong, G. (1998) ‘Earwitness characteristics and speaker identificationaccuracy’, unpublished PhD dissertation, University of Florida.

Endress, W., Bambach, W. and Flosser, G. (1971) ‘Voice spectrograms as afunction of age, voice disguise and voice imitation, Journal of theAcoustical Society of America, 49: 1842–8.

Hecker, M. H. L. (1971) ‘Speaker recognition: an interpretive survey ofthe literature’, American Speech and Hearing Association, Monographno. 16, Washington, DC.

Hollien, H. (1988) ‘Voice recognition’ in M. LeBlanc, P. Tremblay and A.Blumenstein (eds), Nouvelles Technologies et Justice Penale, Montreal, 9:180–229.

Hollien, H. (1990) The Acoustics of Crime: The new science of forensicphonetics, New York/London: Plenum Press.

Hollien, H. (1996) ‘Consideration of guidelines for earwitness lineups’,Forensic Linguistics, 3: 14–23.

Hollien, H. and Schwartz, R., ‘Speaker identification untilizing noncon-temporary speech’, Journal of Forensic Sciences, in press.

Hollien, H., Bennett, G. and Gelfer, M. P. (1983) ‘Criminal identificationcomparison: aural vs. visual identification resulting from a simulatedcrime’, Journal of Forensic Sciences, 25: 208–21.

Hollien, H., Majewski, W. and Doherty, E. T. (1982) ‘Perceptual identifi-cation of voices under normal, stress and disguised speaking conditions’,Journal of Phonetics, 10: 139–48.

Köster, J. P. (1981) ‘Auditive Sprecherkennug bei Experten und Naiven’, inFestschrift Wangler, Hamburg: Helmut Buske, AG, 52: 171–80.

Künzel, H. J. (1987) Sprecher-Erkennung: Grundzüge forensischerSprachverarbeitung, Heidelberg: Kriminalistik-Verlag.

Künzel, H. J. (1994) ‘On the problem of speaker identification by victimsand witnesses’, Forensic Linguistics, 1: 45–58.

McGehee, F. (1937) ‘The reliability of the identification of the humanvoice’, Journal of General Psychology, 17: 249–71.

McGehee, F. (1944) ‘An experimental study in voice recognition’, Journalof General Psychology, 31: 53–65.

Nolan, F. J. (1983) The Phonetic Bases of Speaker Recognition, Cambridge:Cambridge University Press.

Page 13: Aural-perceptual Speaker Identification - Problems With Non Contemporary Samples

211Aural-perceptual speaker identification

Reich, A. R. and Duke, J. E. (1979) ‘Effects of selective vocal disguisesupon speaker identifiation by listening’, Journal of the Acoustical Societyof America, 66: 1023–8.

Rosenberg, A. E. (1973) ‘Listener performance in speaker verificationtasks, IEEE Transactions of Audio and Electroacoustics, AU-21: 221–5.

Rothman, H. B. (1977) ‘A perceptual (aural) and spectrographic investi-gation of talkers with similar sounding voices, Proceedings of the 1977International Conference on Crime Countermeasures, Oxford, 37–41.

Saslove, H. and Yarmey, A. (1980) ‘Long-term auditory memory: speakeridentification’, Journal of Applied Psychology, 45: 111–16.

Scherer, K. R. (1986) ‘Voice, stress, and emotion’ in H. Appley and R.Trumbull (eds), Dynamics of Stress: Physiological, Psychological andSocial Perspectives, New York: Plenum Press, 157–79.

Schiller, N. O. and Köster, O. (1998) ‘The ability of expert witnesses toidentify voices: a comparison between trained and untrained listeners,Forensic Linguistics, 5: 1–9.

Schwartz, R. (1995) ‘Effect of non-contemporary speech on aural–per-ceptual speaker identification’, paper presented at IAFP-95, Congress ofthe International Association of Forensic Phonetics, Orlando, Florida,July.

Shirt, M. (1984) ‘An auditory speaker recognition experiment’,Proceedings of the Institute of Acoustics Conference, 6: 101–4.

Stevens, K. N. (1971) ‘Sources of inter- and intra-speaker variability in theacoustic properties of speech sounds’, Proceedings of the SeventhInternational Congress on Phonetic Sciences, Montreal, 206–32.

Suzuki, T., Tanimotoc, M., Osanai, T. and Kido, H. (1996) ‘Acousticalvariation of voice with ageing of male speakers on vowels and nasalsounds’, presented at the Annual Meeting of the American Academy ofForensic Sciences, Nashville, February.

Thompson, C. (1985) ‘Voice identification: speaker identifiability and cor-rection of the record regarding sex effects’, Human Learning, 4: 19–27.

Williams, C. E. (1964) ‘The effects of selected factors on the aural identifi-cation of speakers’, Tech. Doc. Opt. ESD-TDR-65-153, Electron Syst.Division, USAF, Hancom Field.

Yarmey, A. D. (1995) ‘Earwitness speaker identification’, Psychology,Public Policy, Law, 1: 792–816.

Yarmey, A. D. and Matthys, E. (1992) ‘Voice identification of anabductor’, Applied Cognitive Psychology, 6: 367–77.