

Processing Differences between Feature-Based Facial Composites and Photos of Real Faces

CURT A. CARLSON¹*, SCOTT D. GRONLUND², DAWN R. WEATHERFORD¹ and MARIA A. CARLSON³

¹Texas A&M University–Commerce, USA
²University of Oklahoma, USA
³University of Mary Hardin–Baylor, USA

Abstract: Computer-generated faces (composites) constructed by selecting individual facial features (e.g. eyes, nose, mouth) are poorly recognized because this process contrasts with the natural holistic processing of real faces. This result suggests that there should be differences in the cognitive processing of these composites compared with photos of real faces, which would make these stimuli problematic for theories seeking to explain real face processing. We conducted five experiments to test potential conditions for moving composite processing closer to how real face photos are processed, first taking the perspective of researchers who construct composites with a random selection of available features and then taking a perspective closer to police by creating each composite to match a real face photo. Composites with randomly selected features (but configured like real faces) showed no face inversion effect and recognition memory for these composites benefited from increased encoding time, unlike real face photos. Although composites constructed to match real face photos yielded an inversion effect, they still were remembered differently than the photos. Researchers should not use feature-based composites as proxies for real face photos. We conclude with a discussion of alternative methods of constructing composites. Copyright © 2012 John Wiley & Sons, Ltd.

It is no secret that there are problems with faces produced by feature-based composite systems. These systems construct faces via a selection of individual facial features such as nose, eyes, and mouth, rather than a more holistic process. They do not produce good likenesses of target faces, whether these individuals are well known to the participant¹ or not (Frowd, McQuiston-Surrett, Anandaciva, Ireland, & Hancock, 2007). Composites also are problematic as a diagnostic tool for determining the potential guilt of a suspect (Charman, Gregory, & Carlucci, 2009). In the UK, composite systems have undergone years of scrutiny, revealing that, for example, two popular systems, E-FIT and PRO-fit, produced composites that were correctly recognized (named) only about 20% of the time, even after a very short time interval between encoding of the target and composite construction (e.g. Davies, van der Willik, & Morrison, 2000; Frowd, Hancock, & Carson, 2004). When a more forensically relevant time interval was used (48 hours), recognition accuracy dropped to almost zero across several composite systems (Frowd et al., 2005b; Frowd et al., 2007a). Even when participants were cued to a list of celebrities in whose image composites had been constructed, recognition accuracy stayed below 10% (Frowd et al., 2007a).

The many problems with composite-face construction, recognition, and recall have been known for decades (e.g. Davies et al., 2000; Ellis, Davies, & Shepherd, 1978; Ellis, Shepherd, & Davies, 1975; Koehn & Fisher, 1997; Kovera, Penrod, Pappas, & Thill, 1997). Ellis et al. (1978) concluded (regarding Photofit, a feature-based composite-face program) that, ‘the system may require people to do something which they are not very good at: namely, fragmenting a wholistic percept or gestalt’ (p. 306). A more recent paper included a similar statement: ‘the disappointing performance of facial composite generators may, in part at least, be attributed to the piecemeal manner in which witnesses are required to select individual facial features in isolation. The psychological science of face recognition shows that this approach goes against the grain of our natural face processing strategy, which exploits our ability to process the configural or holistic properties of a face’ (Valentine, Davis, Thorner, Solomon, & Gibson, 2010, pp. 72–73). If this is true, then there should be differences in how featurally constructed composites (defined below) are processed relative to photographs of real faces.²

We utilized two approaches to composite construction in identifying cognitive processing differences between feature-based composites and real face photos. First, we played the role of a researcher utilizing these stimuli as a substitute for real face photos; facial features were selected randomly to create composites. This was our starting point for two reasons: (i) we wanted to establish a baseline against which we could compare better levels of composite construction; and (ii) we wanted to evaluate whether researchers (e.g. Harley, Dillon, & Loftus, 2004; Loftus, Oberg, & Dillon, 2004) should reconsider using composites constructed in this manner because they are not processed like real face photos. We applied increasing levels of control over these composites for Experiments 2 and 3 to see how far this would push the boundary toward equating the processing of composites and real face photos. In addition, we did this to evaluate the kinds of composites sometimes utilized in the eyewitness memory literature. Some studies use composites first constructed from a random selection of features and then apply various levels of control (Flowe & Besemer, in press; Flowe & Cottrell, in press; Flowe & Ebbesen, 2007; Vladeanu, Lewis, & Ellis, 2006).

¹ The participant referenced here is the person set with the task of recognizing the composite, as opposed to the person creating the composite. This distinction is important to keep in mind for the remainder of this paper. When speaking of face or composite processing, we always will be referring to the person recognizing the face/composite, and not the person constructing the composite. We thank an anonymous reviewer for bringing up this point of clarity.

² There is no consistent use of the terms ‘holistic’, ‘configural’, and ‘featural’ in the literature; therefore, we explicitly define them here. Following Wells and Hryciw (1984), we will use ‘holistic’ and ‘configural’ to represent spatial-location information about facial features in combination with relational information among features (i.e. interfeature topography). Featural processing will refer to processing of features (e.g. eyes, nose, mouth) in the absence of information concerning relationships among them. It is also important to note that we compared composites with unchanging photographs of real faces rather than the real faces themselves (i.e. no change in view or expression between study and test for composites or real faces). Some researchers (e.g. Hancock, Bruce, & Burton, 2000) have suggested that this procedure tests participants’ memory of the images rather than the faces themselves. We agree that researchers should not study face processing or memory with the exact same face photos used at study and test, but we wanted to match our tests with those (e.g. Flowe & Ebbesen, 2007) who used the same composite image at study and test.

*Correspondence to: Curt A. Carlson, Department of Psychology & Special Education, Texas A&M University–Commerce, Commerce, TX 75429, USA. E-mail: [email protected]

Applied Cognitive Psychology, Appl. Cognit. Psychol. (2012). Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/acp.2824

We took a second approach to composite construction in Experiments 4 and 5, taking a perspective closer to that of the police, who would construct a composite to resemble an eyewitness’s memory of a real face. By taking these two different approaches, we attempted to identify boundary conditions for composite processing versus real face photo processing. In other words, we wanted to assess how composites are processed cognitively as we transitioned from inferior to superior construction strategies.

There clearly are advantages to being able to construct artificial faces, both from a practical perspective, such as the use by law enforcement to construct composites of suspects (see Davies & Valentine, 2007, for a review), and in the lab. Constructing faces according to required specifications provides for a great degree of control and therefore could lead to an enhanced understanding of face processing. It recently has been argued that all facial stimuli used in psychological research should be standardized in order to control for factors such as distinctive features and often-overlooked aspects such as levels of brightness and contrast (Gronenschild, Smeets, Vuurman, van Boxtel, & Jolles, 2009). Feature-based composite systems could be a useful means to this end. However, the manner in which these systems construct composites presents a paradoxical situation for researchers.

CONTROL OF FACIAL STIMULI IS IMPORTANT, BUT AT WHAT COST?

The need to gain better control over facial stimuli has made feature-based composite systems attractive to psychological researchers (e.g. Amishav & Kimchi, 2010; Flowe & Ebbesen, 2007; Loftus et al., 2004; Wilford & Wells, 2010). For example, there is a large literature attesting to the memorial benefits of distinctive faces compared with more typical faces. The well-known von Restorff effect (an item that is different from a set of similar contextual items will be better remembered) has been extended to faces (Cohen & Carr, 1975). Faces rated as more distinctive were recognized better, even a week after encoding, compared with faces rated as less distinctive.³ Famous faces rated as more distinctive also have been shown to be recognized faster than equally familiar non-distinctive faces (Valentine & Bruce, 1986). However, Davidenko (2007) argued that the distinctiveness effect was an artifact of the distractor faces used. He asserted that, in general, distractor faces paired with distinctive targets were more dissimilar from those targets than were distractor faces paired with typical targets. When Davidenko equated the similarity between distractors and targets for the distinctive and the typical targets, the distinctiveness effect disappeared. In fact, a slight advantage for typical faces emerged. It clearly is important for theory that researchers have precise control over their stimuli, which explains the appeal of feature-based composite systems.

The importance of gaining control over similarity relations among real faces also has been demonstrated in the applied literature. For example, the sequential lineup advantage (Lindsay & Wells, 1985; Steblay, Dysart, Fulero, & Lindsay, 2001) was shown to depend in part on whether the similarity of the perpetrator (or innocent suspect) to the other members of the lineup was poor (Carlson, Gronlund, & Clark, 2008). Order effects in sequential lineups also have been shown to be dependent on the placement of a next-best foil that is more similar to the suspect than are the remaining foils (Clark & Davey, 2005). In addition, synthetic faces have been used to demonstrate a role for the perceptual similarity structure of distractors (i.e. how similar the faces are to each other in a perceptual multidimensional space) in recognition memory (Yotsumoto, Kahana, Wilson, & Sekuler, 2007).

We highlight two studies that utilized feature-based composites as proxies for real faces. Flowe and Ebbesen (2007) used these stimuli in order to have control over similarity relations between targets and foils in lineups. They varied similarity relations in several simultaneous and sequential lineups and found that participants used a more liberal criterion to choose the suspect when he was more dissimilar from the foils, regardless of whether the lineup members were presented simultaneously or sequentially. However, if feature-based composites are processed differently than real faces, how generalizable are these results? The second study we highlight (Vladeanu et al., 2006) focused on associative priming of faces. With a recognition test, they assessed the level of priming for paired faces without any semantic information versus paired faces with the same or different semantic information (e.g. presenting a pair of faces, each of which was presented with the same information about job and hometown or a different job and hometown). They used feature-based composites as proxies for real faces, first generating them with a random selection of features and then making some adjustments (e.g. eliminate obvious distinctive features, add age lines on some). The use of these stimuli could pose problems for interpreting the results if these composites are not processed like real faces.

The poor performance of feature-based composite systems in constructing composites, namely the fact that they require the user to deal with faces in a piecemeal, feature-based manner (Wells & Hasel, 2007), could be problematic for using them in place of real faces. Before researchers can develop theory concerning real faces based on research using feature-based composite systems (e.g. Amishav & Kimchi, 2010), it is important to understand any differences in how these two kinds of stimuli are processed. There have been several advances in the development of composite faces since artists’ renditions (see Davies & Valentine, 2007, for a review), from mechanical systems (e.g. Identikit and Photofit) to computerized systems (e.g. Mac-A-Mug Pro, E-FIT, FACES). But those constructing facial composites can produce likenesses from memory no better with the latter than with the former (Davies et al., 2000). Some researchers (e.g. Davies et al., 2000; Koehn & Fisher, 1997; Kovera et al., 1997; Valentine et al., 2010) already have proposed that the reason for this could be because composite faces are not processed configurally. However, we are aware of no published research in which an attempt has been made to uncover the underlying processing differences between composites and photographs of real faces or to identify the boundary conditions for feature-based composites being processed like real face photos. These were our goals.

³ Distinctiveness effects also have been identified for facial composites constructed from several systems (e.g. E-FIT, EvoFIT, Photofit; see Frowd et al., 2005a). Our goal was not to identify distinctiveness effects with real faces or composites but rather to control for their potential influence on the cognitive processing of faces across experiments.

THE PRESENT STUDY

We constructed composites from randomly selected features in Experiment 1, then applied several levels of control across Experiments 2 and 3, and finally created composites to look like real face photos for Experiments 4 and 5. The point of departure was a replication of the face inversion effect first reported by Yin (1969). Yin had participants study faces and other objects (airplanes, houses, and caricatures of men in motion) in an upright or inverted orientation. He found that performance suffered for all stimuli when studied and tested in an inverted orientation (although not significantly for the airplanes), but this was especially true for faces. Yin concluded that the greater cost of inversion to faces was due in part to faces being special. Experiment 1 was a replication of this original study on face inversion except for two changes: (i) no airplanes or caricatures of men in motion were used, and (ii) in addition to faces and houses, we included feature-based composites.

EXPERIMENT 1

We compared accuracy and response time (RT) among real face photos, composites, and houses, presented either upright or inverted. We hypothesized that the inversion decrement (lower accuracy and increased RT for inverted compared with upright stimuli) would be more pronounced for real face photos than for houses or composites because holistic or configural information is more important for real faces and more disrupted by inversion.

Method

Participants

A group of undergraduate psychology students (N = 27) from a Midwestern US university participated for research credit in a psychology class. They signed up for the experiment through an online system in which they could choose the psychology experiments in which they wanted to participate. The current experiment was described as one involving decision-making about faces. Although no demographic information was collected for this or subsequent experiments, all participants were drawn from a pool of undergraduate psychology students, the majority of whom were Caucasian and between the ages of 18 and 24 years, and over 50% of whom were female.

Materials

The E-prime program (version 1.1, Psychology Software Tools, 2003; Schneider, Eschman, & Zuccolotto, 2002) was used to present three types of stimuli (house photos, photos of real faces, and composites) and to record responses and RT. We used FACES 4.0 (IQ Biometrix, Inc., 2003) as our composite software for two reasons: (i) it is the most popular program used by police departments in the USA to construct composites of perpetrators⁴ (McQuiston-Surrett, Topp, & Malpass, 2006), and (ii) it has been used in recent years to create proxies for real faces in psychological research (Amishav & Kimchi, 2010, used FACES 3.0; Carlson, 2009; Flowe & Besemer, in press; Flowe & Cottrell, in press; Flowe & Ebbesen, 2007; Harley et al., 2004; Loftus et al., 2004; Vladeanu et al., 2006, used InterQuest Faces, an earlier version of FACES; Wilford & Wells, 2010). We used the random face generator option in the software to construct composites with a random and independent selection of features from the uniform distribution of 4400 features available (see Figure 1 for examples). This resulted in the creation of male, female, and ambiguous-gender composite faces.

Although selection of features was random, their placement on each face was not. It was set automatically by the program to mimic the arrangement of features on an actual face, with an additional assumption of symmetry (although hairstyle typically was not symmetrical). Distance between features, including inter-ocular distance, was randomized for each face. See Figure 2, top row, for examples of composites randomly generated from the program. There was little variance among the composites in terms of configural information. In other words, the majority of the composites had configurations that fell within the range typified by those in Figure 2, top row. Those that did not, or in any way did not look like a real face (based on our visual inspection), were eliminated. In addition, the program’s face generator is designed to blend features to produce photo-quality imitations of real faces. The only change made to each composite was the removal of glasses, earrings, or any other ‘artificial’ addition to the face in order to reduce their distinctiveness and make them more comparable with the real face photos, none of which had these artificial features. These composites were constructed in the same manner as those used as proxies for real faces in the psychological literature (Harley et al., 2004; Loftus et al., 2004; Vladeanu et al., 2006).

⁴ It has not been used by British police, but has been evaluated by British researchers (e.g. Frowd et al., 2007a).

Figure 1. Three example features each chosen at random from FACES 4.0

We downloaded the house photos from various real estate websites. Only photos showing the front of the house were used, but they differed in how much greenery was in the front yard. Photos including trees, shrubs, fences, or other possibly distinctive features were minimized, but could not be eliminated because of the ubiquitous nature of these objects. In addition, the photos varied in resolution because they were taken from different websites. This issue was addressed in Experiment 2.

We downloaded the real face photos from the Florida Department of Corrections supervised population website (www.dc.state.fl.us/activeoffenders/search.asp). We selected only young (between the ages of 18 and 40 years) men and women without glasses, earrings, or other unnatural features. In order to make the real face photos more comparable with the composites, we cropped out all visible clothing and placed all faces against a white background.

Procedure

After reading the informed consent, participants (individually, or in groups of two to four, as in all experiments reported here) read brief instructions and then completed four experimental blocks, randomly ordered for each participant. One block presented upright faces with a set of upright houses; another block presented these same stimuli upside-down. The third type of block presented upright composites with a different set of upright houses, and the fourth block type presented these same stimuli upside-down. Each block included an encoding and a test phase. The encoding phase consisted of 20 houses and 20 faces or composites presented in a different random order for each participant. Each stimulus was on the screen for 3 seconds, with a 1-second white screen between each. Every participant viewed the same sets of houses, faces, and composites (i.e. we did not randomly generate a new set of composites for each participant). Immediately after encoding, a set of instructions was presented on the screen, explaining the upcoming self-paced two-alternative forced-choice test. Participants began the test whenever they completed reading the instructions. Each test consisted of 24 pairs of stimuli (12 pairs of houses and 12 pairs of composites/real face photos) presented in the same orientation as encoding. One member of each pair was a previously presented stimulus and the other was novel. The goal was to choose from each pair the stimulus that was presented before. A different random order of pairs was presented to each participant, with the correct choice presented on each side of the screen 50% of the time.
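The test phase described above can be sketched in a few lines of Python. This is only an illustration of the two-alternative forced-choice logic, not the E-prime script used in the study: the function name `build_2afc_test`, the item labels, and the seed are hypothetical, and balancing the correct side across all 24 pairs (rather than within each stimulus type) is an assumption, because the text states only that the correct choice appeared on each side 50% of the time.

```python
import random


def build_2afc_test(old_items, new_items, rng):
    """Pair each studied item with a novel item and balance the side
    on which the studied (correct) item appears."""
    if len(old_items) != len(new_items):
        raise ValueError("need exactly one novel item per studied item")
    n = len(old_items)
    # Half of the pairs show the correct item on the left, the rest on the right.
    sides = ["left"] * (n // 2) + ["right"] * (n - n // 2)
    rng.shuffle(sides)
    trials = [
        {"old": old, "new": new, "correct_side": side}
        for old, new, side in zip(old_items, new_items, sides)
    ]
    rng.shuffle(trials)  # a different random pair order for each participant
    return trials


# Hypothetical labels: 12 face (or composite) pairs and 12 house pairs per test.
old_faces = [f"face_old_{i:02d}" for i in range(12)]
new_faces = [f"face_new_{i:02d}" for i in range(12)]
old_houses = [f"house_old_{i:02d}" for i in range(12)]
new_houses = [f"house_new_{i:02d}" for i in range(12)]

rng = random.Random(2012)  # arbitrary seed, standing in for one participant
trials = build_2afc_test(old_faces + old_houses, new_faces + new_houses, rng)
print(len(trials), sum(t["correct_side"] == "left" for t in trials))  # prints: 24 12
```

Because the old and new lists are concatenated in the same order, faces are always paired with faces and houses with houses; seeding a separate `random.Random` per participant would give each participant a different pair order.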

Design

We utilized a 3 (stimulus: house versus real face photo versus composite) × 2 (orientation: upright versus inverted) repeated-measures design.

Figure 2. Example composites used in Experiment 1 (top row) and Experiments 2 and 3 (middle row), and two examples of real face photo-composite pairs (bottom row) used in Experiments 4 and 5


Results and discussion

We averaged accuracy and RT separately across items within each condition for each participant and analyzed accuracy and RT with separate repeated-measures ANOVAs.⁵ There was an interaction between stimulus and orientation for accuracy, F(2, 52) = 3.40, p = .041, ηp² = .12 (see Figure 3, top panel), and for RT, F(2, 52) = 13.83, p < .001, ηp² = .35 (see Figure 3, bottom panel). Paired-samples t-tests⁶ revealed a significant inversion decrement for real face photos in terms of accuracy, t(26) = 4.31, p < .001, Cohen’s d = 0.85, and RT, t(26) = 5.74, p < .001, d = 1.14. These findings replicate the typical face inversion effect. However, in support of our hypothesis, there was no inversion effect for the composites, either for accuracy, t(26) = 1.20, ns, or RT, t(26) = 1.22, ns. For houses, there was no inversion decrement for accuracy, t(26) = 1.81, ns, but there was for RT, t(26) = 2.77, p = .01, d = 1.09. It is interesting to note that participants were slower to identify inverted compared with upright houses, as they were for real face photos, but this pattern did not occur for composites. In addition, participants were more accurate for upright real face photos compared with upright composites, t(26) = 3.81, p < .001, d = 0.66, or houses, t(26) = 6.45, p < .001, d = 1.52. They also identified upright real face photos faster than composites, t(26) = 4.54, p < .001, d = 0.86, or houses, t(26) = 6.52, p < .001, d = 1.29. Overall, these results indicate that composites with randomly selected features are not perceived like photographs of real faces.

In order to validate these results, we sought to replicate them in Experiment 2 with a more tightly controlled set of stimuli. Two issues needed to be addressed. First, the pictures of houses used in Experiment 1 were from several different websites, making them differ in several ways in terms of the inherent properties of the houses (e.g. architecturally), as well as size and resolution of the pictures. Second, and more importantly, the real face photos and composites all had hair, which increased distinctiveness across faces within each set. This could be a problem because real faces with distinctive features are easier to recognize when inverted (Pullan & Rhodes, 1996). In addition, research has shown that internal facial features could be processed differently than external features such as hair, both for real faces (Ellis, Shepherd, & Davies, 1979; Young, Hay, McWeeny, & Flude, 1985) and for composites (Frowd et al., 2007a). Starting with Experiment 2, we eliminated hair from all composites and real face photos, which also should make it less likely that the composites will be processed featurally (Reinitz, Lammers, & Cochran, 1992).
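As a quick consistency check on the Experiment 1 interaction effects, partial eta-squared can be recovered from an F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies that identity to the two reported interactions; the function name is ours and is not taken from the paper or from any statistics package.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Stimulus x orientation interactions reported for Experiment 1.
accuracy_effect = partial_eta_squared(3.40, 2, 52)
rt_effect = partial_eta_squared(13.83, 2, 52)
print(round(accuracy_effect, 2), round(rt_effect, 2))  # prints: 0.12 0.35
```

Both values agree with the reported effect sizes of .12 and .35.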

EXPERIMENT 2

Method

Participants

A separate group of students (N = 28) from the same pool as Experiment 1 participated. None of the same students participated in more than one of our experiments.

Materials

The same three types of stimuli were presented as in Experiment 1. However, more care was taken in selection of these stimuli. First, the houses were selected from a single real estate website to minimize differences in their design/architecture and to ensure that all photos could be presented at the same size and resolution. All hair (including facial hair) was removed from a new set of composites generated with randomly selected features. This effectively made them all look male to better match the real face photos; see middle row of Figure 2 for examples. We then applied the same constraints to these composites as in Experiment 1 (e.g. eliminating artificial features such as glasses and earrings). Finally, a new set of real face photos was sampled with the same restrictions as Experiment 1 except that only bald men or men with shaved heads (and little to no facial hair) were selected.

Procedure

The same procedure was followed as in Experiment 1.

Design

As in Experiment 1, we utilized a 3 (stimulus: house versus real face photo versus composite) × 2 (orientation: upright versus inverted) repeated-measures design.

5 Only by-participants analyses are presented in this paper, although data from all experiments were analyzed both by-participants and by-items. These joint analyses revealed that results were not driven by a particular participant/item or a small group of participants/items (i.e. the significance, or lack of significance, of effects did not change with by-items analysis).

6 Bonferroni adjustment was applied to all t-tests reported in this paper, and all reported p-values are two-tailed. In addition, following Dunlap, Cortina, Vaslow, and Burke (1996), the original standard deviations (rather than the paired t-test value) were used to calculate the effect size for paired-samples t-tests. This is intended to prevent an overestimated effect size.

Figure 3. Results from Experiment 1. Top graph: mean proportion correct. Bottom graph: mean response time (milliseconds). Bars represent 95% confidence intervals

Copyright © 2012 John Wiley & Sons, Ltd. Appl. Cognit. Psychol. (2012)

Results and discussion

Results for both accuracy and RT supported the majority of our findings from Experiment 1. Although the interaction was not significant for accuracy, F(2, 54) = 0.58, ns (Figure 4, top panel), planned contrasts revealed a face inversion effect for real face photos, t(27) = 2.99, p = .004, d = 0.71, and no inversion effect for composites, t(27) = 1.47, ns. There was a significant interaction for RT, F(2, 54) = 3.71, p = .031, ηp² = .12 (Figure 4, bottom panel). Participants responded faster for real face photos when upright than when inverted, t(27) = 3.54, p < .001, d = 0.60, but this inversion effect did not occur for composites, t(27) = 0.42, ns. This indicated that the elimination of hair and other distinctive features from composites did not improve how composites were perceived relative to real face photos.

As in Experiment 1, houses exhibited an inversion effect for RT, t(27) = 3.55, p < .001, d = 0.46, but unlike Experiment 1, there also was an effect for accuracy, t(27) = 3.49, p = .002, d = 1.34. Also, unlike Experiment 1, participants were as accurate in their identification of upright composites as they were for upright real face photos, t(27) = 1.70, ns, and there was no difference in accuracy between upright houses and upright real face photos, t(27) = 1.37, ns. However, participants were more accurate in their identification of upright houses compared with upright composites, t(27) = 2.97, p = .005, d = 0.75. Finally, as in Experiment 1, participants were slower in identifying upright composites, relative to both upright real face photos, t(27) = 3.67, p < .001, d = 0.59, and upright houses, t(27) = 2.99, p = .005, d = 0.37.

These findings replicate the primary results from Experiment 1, showing that composites composed of randomly selected features do not exhibit an inversion effect in terms of accuracy or RT. These stimuli were perceived differently than real face photos, despite the increased control we applied. We believe that this was because holistic or configural information was more important for real face photos and more disrupted by inversion. Conversely, featural processing was more important for these composite faces and was not disrupted by inversion. Experiment 3 was conducted to determine if there were downstream cognitive consequences in how feature-based composites were processed compared with real face photos.

EXPERIMENT 3

It has been suspected for decades (e.g. Ellis et al., 1975, 1978) that feature-based composite programs could negatively affect holistic memories of faces. Ellis et al. (1978) stated that such systems ‘may require people to do something which they are not very good at: namely, fragmenting a wholistic percept or gestalt’ (p. 306). Wilford and Wells (2010) recently reiterated this: ‘memories for faces are stored holistically, whereas composite systems require a more decomposed, featural representation’ (p. 1614). If this is true, and humans naturally encode upright faces holistically (e.g. Farah, Wilson, Drain and Tanaka, 1998), a memory trace that results from holistic processing should be a better match to a test face than a memory trace resulting from featural processing. To assess whether or not feature-based composites are encoded holistically, Experiment 3 manipulated encoding time. Research has shown that upright facial features are processed in parallel (Sergent, 1984; Smith & Nielsen, 1970), whereas inverted facial features, houses, and words are processed serially (e.g. Farah et al., 1998). In addition, the processing of upright faces is quick and automatic (taking as little as 50 milliseconds; Richler, Mack, Gauthier, & Palmeri, 2009), and holistic processing is faster than featural processing for real faces (Bartlett, Helm, & Jerger, 2001; Carbon & Leder, 2005; Hole, 1994). In the present experiment, we presented faces/composites for either 1 or 3 seconds each, therefore allowing plenty of time in both conditions for holistic processing (e.g. Richler et al., 2009). As such, we did not expect that the increased encoding time would affect the quality of match between a memory trace and test face for real face photos because the memory trace already would be of high quality. However, if composite faces are not processed holistically and automatically, but rather more featurally and serially, tripling encoding time should yield benefits for later recognition ability. In particular, we hypothesized that recognition accuracy for feature-based composites would increase with increased encoding time but would remain unchanged for real face photos. To assess accuracy, we used d′ (Z(hit rate) − Z(false alarm rate)), a standard measure of discrimination based on signal detection theory. Because we had no theoretical interest in response bias, we do not report it. However, we will examine a related measure in Ratcliff’s (1978) Diffusion Model.

We also expected latency to be slowed for recognition decisions involving a composite face. To gain insight into the psychological processes that underlie these accuracy and latency effects, we fit a simplified version of Ratcliff’s (1978) Diffusion Model to the RT data. We used this model because it provides a powerful means of linking RT distribution parameters to underlying cognitive processes (Wagenmakers, 2009). If we are correct that the accuracy effects are a function of the quality of match to memory, then the drift rate in the Diffusion Model, which assesses quality of match, should increase with increased encoding time for composites but remain unchanged for real face photos. However, if feature-based composite faces are processed holistically like real faces, the memory trace should be an equivalent match regardless of encoding time, with equivalent drift rates across encoding times. That would mean that slower composite RTs would be due to either increased response conservatism or non-retrieval related factors. With regard to the latter (termed Ter in the Diffusion Model), the perceptual differences found in Experiments 1 and 2 lead us to expect that Ter should be greater for composites, reflecting a slower encoding process due to greater difficulty extracting composites’ features. We describe the simplified Diffusion Model after describing the next experiment and the resulting accuracy and latency data.

Figure 4. Results from Experiment 2. Top panel: mean proportion correct; Bottom panel: mean response time (milliseconds). Bars represent 95% confidence intervals

Method

Participants

A new set of students (N = 92) from the same pool participated for course credit.

Materials

A new set of composites was generated using a different random selection of features. The same constraints were applied to this set as used on the composites from Experiment 2, with the additional restriction of replacing potentially distinctive natural features (e.g. crooked nose, large eyes) that could enhance recognition accuracy. The 80 composites that remained were divided randomly into 40 targets and 40 lures. We eliminated six of the real face photos used in Experiment 2 because of potentially distinctive natural features. Six replacements were chosen in the same manner as Experiment 2, and then this set was divided randomly into 40 targets and 40 lures.

Procedure

After reading the informed consent and instructions, participants viewed four experimental blocks, randomly ordered for each participant. Each block represented one of the four conditions of the 2 (real face photos versus composites) × 2 (1- versus 3-second presentation at encoding) repeated-measures design. Each block included an encoding phase, filler task (3-minute word-search puzzle), and test phase. The encoding phase consisted of 20 real face photos or 20 composites presented in upright orientation for 1 or 3 seconds each. A visual mask was presented during the inter-stimulus interval, lasting 3 seconds in the 1-second condition, and 1 second in the 3-second condition, in order to maintain a consistent retention interval for each block. The test phase (recognition memory task) consisted of either the 20 studied real face photos and 20 real face photo lures or the 20 studied composites and 20 composite lures. All stimuli were tested in randomized order for each participant. Each test stimulus was presented individually in upright orientation, and participants indicated with the ‘y’ key (indicating yes) or ‘n’ key (indicating no) whether they recognized each.

Design

The experiment was a 2 (stimulus: real face photos versus composites) × 2 (encoding time: 1 versus 3 seconds) repeated-measures design.

Results and discussion

As predicted, there was a significant interaction for d′, F(1, 91) = 32.32, p < .001, ηp² = .26 (Figure 5, left), such that longer encoding time did not change accuracy for real face photos, t(91) = 0.74, ns, but significantly increased accuracy for composites, t(91) = 7.16, p < .001, d = 0.97. See Table 1 for descriptive statistics. Participants correctly recognized more composites after the longer encoding time, t(91) = 3.60, p < .001, d = 0.33, but the longer encoding time did not help participants to correctly recognize more real face photos, t(91) = 0.64, ns. In addition, participants false alarmed to fewer composites after longer encoding time, t(91) = 6.60, p < .001, d = 0.70, but the false alarm rate remained the same for real face photos regardless of encoding time, t(91) = 2.07, ns.
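As a reading aid, the partial eta squared values reported next to each F statistic can be recovered from F and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies that identity to F values quoted in this paper; the function name is hypothetical and the snippet is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0 or f_value < 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F values quoted in the text: (label, F, df_effect, df_error)
    reported = [
        ("Exp. 3, d' interaction", 32.32, 1, 91),
        ("Exp. 3, RT stimulus main effect", 64.77, 1, 91),
        ("Exp. 4, accuracy inversion effect", 101.95, 1, 77),
    ]
    for label, f_value, df1, df2 in reported:
        print(f"{label}: {partial_eta_squared(f_value, df1, df2):.2f}")
```

Rounded to two decimals, these three cases reproduce the values given in the text (.26, .42, and .57).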

Prior to the RT analyses (Figure 6, left column), we replaced outliers greater than 8000 milliseconds (eight from the two composite conditions and seven from the two real face photo conditions) with 8000 milliseconds and replaced those less than 150 milliseconds (two RTs) with 150 milliseconds. The results did not differ significantly when all data were included. There was no interaction, F(1, 91) = 2.71, ns, but planned comparisons revealed that (i) participants took longer to make recognition decisions for composites after they were encoded for 3 seconds compared with 1 second, t(91) = 2.77, p = .004, d = 0.25; but (ii) RT did not change for real face photos as a function of encoding time, t(91) = 1.06, ns. In addition, participants were faster to respond to real face photos compared with composites, F(1, 91) = 64.77, p < .001, ηp² = .42, both after 1 second of encoding, t(91) = 5.11, p < .001, d = 0.39, and after 3 seconds, t(91) = 5.79, p < .001, d = 0.56.

Figure 5. Accuracy results (d′) for composites and real face photos at 1- and 3-second encoding time. Results from Experiments 3 (left) and 5 (right) are presented together in order to enable comparisons between results for randomly generated and systematically constructed composites. Bars represent 95% confidence intervals
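The outlier treatment described for the RT analyses (values above 8000 milliseconds reset to 8000, values below 150 milliseconds reset to 150) amounts to clipping each RT to a fixed range. A minimal Python sketch follows; the function name and the sample RTs are hypothetical and are not data from the experiment.

```python
def clip_rts(rts_ms, lower=150, upper=8000):
    """Replace RTs outside [lower, upper] milliseconds with the nearest bound."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return [min(max(rt, lower), upper) for rt in rts_ms]


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, for illustration only.
    sample = [90, 640, 1210, 9500]
    print(clip_rts(sample))
```

Clipping keeps every trial in the analysis while limiting the influence of extreme values, in contrast to trimming, which would drop those trials.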

As mentioned above, Ratcliff’s (1978) Diffusion Model has been shown to be a powerful means of linking RT distribution parameters to underlying cognitive processes (Wagenmakers, 2009). However, the seven-parameter model is too powerful for the data collected here. We applied a simplified version of the model, the Robust EZ Diffusion Model (Wagenmakers, van der Maas, Dolan, & Grasman, 2008; Wagenmakers, van der Maas, & Grasman, 2007), which uses proportion correct, mean RT of correct responses, and variance RT of correct responses, to yield three latent parameters: (i) the drift rate (v), which represents the quality of match between a memory trace and test item; (ii) the upper decision boundary (a), which provides an indication of response conservatism; and (iii) an estimate of the time for encoding and non-retrieval related processes (Ter).

Our primary prediction involved an interaction for the drift rate, such that the quality of match between a memory trace and test item would increase for composites with increased encoding time, but remain the same for real face photos. This would be the case if the featural nature of these composites required more time to process and form a better memory trace. In addition, we expected Ter to be greater for composites on the basis of findings from Experiments 1 and 2, which revealed differences in how feature-based composites were perceived. Such a difference in Ter would provide additional evidence that feature-based composites are not processed holistically. Finally, an increase in a with increased encoding time would represent an increase in conservatism. Although we have no prediction regarding this parameter, it will be interesting to see whether increased encoding time influenced response conservatism differently for composites compared with real face photos.

For each condition, each participant’s proportion correct, mean RT of correct responses, and variance RT of correct responses were input into a program retrieved from http://users.fmg.uva.nl/ewagenmakers/papers.html. The resulting parameter estimates for v, a, and Ter were submitted to a repeated-measures ANOVA. The average parameter estimates per condition are presented in Figure 7 (left column).
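To make the mapping from the three summary statistics to v, a, and Ter concrete, the sketch below implements the closed-form equations of the basic EZ Diffusion Model (Wagenmakers et al., 2007). It is not the Robust EZ program used in this study, which additionally handles contaminant RTs, and it is not the authors' code. It assumes RTs in seconds, the conventional scaling parameter s = 0.1, and a proportion correct that is not exactly 0, 0.5, or 1; the function name and the input values are illustrative only.

```python
import math


def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Basic EZ diffusion estimates (v, a, Ter) from summary statistics.

    prop_correct: proportion of correct responses (not 0, 0.5, or 1)
    rt_variance:  variance of correct RTs, in seconds squared
    rt_mean:      mean of correct RTs, in seconds
    s:            scaling parameter (0.1 by convention)
    """
    if not 0.0 < prop_correct < 1.0 or prop_correct == 0.5:
        raise ValueError("prop_correct must be in (0, 1) and differ from 0.5")
    if rt_variance <= 0:
        raise ValueError("rt_variance must be positive")

    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct**2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    # Drift rate: quality of the match between memory trace and test item.
    v = math.copysign(1.0, prop_correct - 0.5) * s * x**0.25
    # Boundary separation: response conservatism.
    a = s**2 * logit / v
    # Mean decision time implied by v and a; the remainder of mean RT is Ter.
    y = -v * a / s**2
    mean_decision_time = (a / (2.0 * v)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    ter = rt_mean - mean_decision_time
    return v, a, ter


if __name__ == "__main__":
    # Illustrative inputs, not data from this study.
    v, a, ter = ez_diffusion(prop_correct=0.802, rt_variance=0.112, rt_mean=0.723)
    print(f"v = {v:.3f}, a = {a:.3f}, Ter = {ter:.3f}")  # about 0.100, 0.140, 0.300
```

In practice the function would be applied once per participant and condition, and the resulting estimates would then be entered into the repeated-measures ANOVA described above.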

As predicted, the interaction was significant for the drift rate, F(1, 91) = 22.95, p < .001, ηp² = .20, as were both main effects (stimulus: F(1, 91) = 141.63, p < .001, ηp² = .61; time: F(1, 91) = 11.96, p = .001, ηp² = .12). Planned contrasts indicated that the drift rate was worse for composites than for real face photos after both encoding times (after 1 second: t(91) = 11.84, p < .001, d = 1.61; after 3 seconds: t(91) = 5.67, p < .001, d = 0.73), but it improved for composites with longer encoding time, t(91) = 6.04, p < .001, d = 0.90. The drift rate was unaffected by encoding time for real face photos, t(91) = 0.49, ns. Also as predicted, encoding and non-retrieval related processes (Ter) were slowed for the composites, F(1, 91) = 28.04, p < .001, ηp² = .24. However, the Diffusion Model does not allow us to specify whether this component of the slowed composite RTs was due to slowed encoding/perceptual processes (as we have argued) or slowed response processes (i.e. a more complex decision process involving the composites). Finally, composites yielded more conservative responding (greater values of a), F(1, 91) = 5.18, p = .025, ηp² = .05.

We theorized that feature-based composites are processed featurally, which contrasts with the holistic processing of real faces. The EZ Diffusion Model analysis supports this theory. Memory traces created from these composites were of poorer quality, as assessed by the drift rate, and these traces were enhanced by additional encoding time. It is likely that the additional time allowed for more features to be encoded or for the same number of features to be encoded better. Either of these mechanisms differed from the encoding involving the real face photos, which resulted in high-quality memory traces irrespective of encoding time. In addition, the difference in Ter between composites and real face photos was consistent with our findings from Experiments 1 and 2. Finally, the differences in response conservatism (a) might indicate that participants were metacognitively aware that the information derived from composites was of lower quality, which influenced them to respond more cautiously. Taken together, these results are consistent with the hypothesis that real face photos are processed holistically and automatically, but feature-based composites are processed featurally and serially.

EXPERIMENT 4

Although we attempted to control for variations between real face photos and composites in Experiments 2 and 3, it is possible that the differing results between composites and real face photos were due to inherent differences between these stimuli regarding factors like distinctiveness or attractiveness. However, such differences may not arise if composites were constructed to resemble an actual person, as the police would do when trying to match an eyewitness’s memory for a perpetrator. Therefore, to better control for distinctiveness and attractiveness across stimulus sets, we created a new set of systematically created composites for Experiments 4 and 5, which maximized their similarity to real face photos.

Table 1. Hit rate and false alarm rate by condition for Experiment 3 (N = 92)

Stimulus          Encoding time (seconds)   Hit rate        False alarm rate
Composite         1                         0.48 (0.017)    0.38 (0.018)
Composite         3                         0.55 (0.021)    0.27 (0.018)
Real face photo   1                         0.64 (0.018)    0.21 (0.015)
Real face photo   3                         0.65 (0.022)    0.24 (0.012)

Note: Standard errors are presented in parentheses.
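The d′ measure defined for Experiment 3, Z(hit rate) − Z(false alarm rate), can be illustrated with the condition means in Table 1. The Python sketch below uses only the standard library; the function and variable names are hypothetical. Note the assumption: the analyses reported here computed d′ per participant, so values derived from group-mean rates are an approximation for illustration and will not equal the mean d′ values plotted in Figure 5.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """d' = Z(hit rate) - Z(false alarm rate); both rates must lie strictly in (0, 1)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Group-mean hit and false alarm rates from Table 1 (Experiment 3).
TABLE_1 = {
    ("composite", 1): (0.48, 0.38),
    ("composite", 3): (0.55, 0.27),
    ("real face photo", 1): (0.64, 0.21),
    ("real face photo", 3): (0.65, 0.24),
}

if __name__ == "__main__":
    for (stimulus, seconds), (hits, false_alarms) in TABLE_1.items():
        print(f"{stimulus}, {seconds} s: d' = {d_prime(hits, false_alarms):.2f}")
```

Even at the level of group means, the qualitative pattern matches the reported interaction: d′ rises with encoding time for composites and does not rise for real face photos.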

Method

Participants

A new group of students (N = 78) from the same pool as the previous experiments participated.

Materials

The same three types of stimuli were used as in Experiments 1 and 2: real face photos, composites, and houses. The same set of houses was used as in Experiment 2, but the real face photos and composites were yoked together. Four undergraduate research assistants each were given 10 real face photos (all bald men or men with shaved heads, with no distinctive features, little to no facial hair, between the ages of 18 and 40 years, and in grayscale, as in Experiment 2) and were trained in the use of the FACES 4.0 program. Their goal was to create a composite that looked as similar as possible to each photo. This method aimed to provide a strong test of whether feature-based composites could ever be processed like real face photos. However, note that an eyewitness, working from memory rather than from a photo of the suspect, could not replicate this process. In fact, a reliance on memory might produce composites that function more like those with randomly selected features because memory traces of internal features of an unfamiliar face are more likely to be error prone than those of external features (e.g. Veres-Injac & Schwaninger, 2009). For example, in a memory model like WITNESS (Clark, 2003), memory is represented as a vector of features. The more error prone the memory (e.g. for internal features), the more elements of the vector contain random rather than veridical feature encodings.

Subsequent interviews with the research assistants, and careful analysis of the new composites they created, revealed the strategies and methods they used when constructing these composites. They utilized the full extent of the subtle features available in the program, such as wrinkles and age lines, shading manipulations on various parts of the face, and skin blemishes, which do not appear in the composites randomly generated by the program. In addition, the features of these systematically created composites were manipulated to not always be symmetrical (e.g. moving one eye slightly lower or higher to match the comparable real face photo), whereas the faces generated by the program always placed the features symmetrically. Therefore, these new composites were more detailed and had a wider variety of subtle configurations among features, compared with the composites used in the first three experiments.

Figure 6. RT for Experiments 3 (left) and 5 (right) presented here for comparisons between randomly and systematically constructed composites. Bars represent 95% confidence intervals

The resulting 40 real face photo-composite pairs were presented individually to a group of 27 undergraduates, who had no prior exposure to any of the materials used in the experiments. Their goal was threefold: (i) rate each pair on a 7-point Likert-type similarity scale; (ii) give a distinctiveness rating on a 7-point Likert-type scale to each stimulus individually; and (iii) provide an attractiveness rating on a 7-point Likert-type scale to each stimulus individually. The order of these three ratings was randomized for each participant. We used the resulting ratings to select 20 composite-face pairs with a high average similarity rating (M = 5.82, SD = 0.58), which were equivalent in distinctiveness (real face photos: M = 1.96, SD = 0.32; composites: M = 1.91, SD = 0.41; t(19) = 0.62, ns), and attractiveness (real face photos: M = 3.73, SD = 0.49; composites: M = 3.81, SD = 0.59; t(19) = 0.84, ns).

Procedure

With the exception of the change in stimuli, we used the same procedure as in Experiments 1 and 2.

Design

We utilized a 3 (stimulus: house versus real face photo versus composite) × 2 (orientation: upright versus inverted) repeated-measures design.

Results and discussion

The use of systematically created composites yielded results that differed from Experiments 1 and 2. There was an inversion deficit for real face photos and composites, in terms of both accuracy, F(1, 77) = 101.95, p < .001, ηp² = .57 (see Figure 8, top), and RT, F(1, 77) = 39.19, p < .001, ηp² = .34 (see Figure 8, bottom). There was no interaction for accuracy, F(1, 77) = 2.10, ns, or RT, F(1, 77) = 2.07, ns. These results show that feature-based composites systematically created to look like real face photos were perceived as such. This could be the result of configural or holistic processing of these more realistic composites. This is good news for researchers utilizing feature-based composites as proxies for real face photos, but it is important to acknowledge the extent to which we had to work on the composites to mimic the processing of real face photos. Nevertheless, planned comparisons revealed that, as in Experiments 1 and 2, participants responded to upright composites (M = 2084, SD = 743) significantly slower than upright real face photos (M = 1916, SD = 580), t(77) = 2.98, p = .004, d = 0.25. This indicated that there still could be lingering differences in how the two stimuli were perceived.

Figure 7. Robust EZ Diffusion Model parameter estimates from Experiments 3 (left) and 5 (right). Bars represent 95% confidence intervals

Experiment 5 was conducted to determine if the differences in memory processing found for composites in Experiment 3 carry over to these improved composites. If these systematically constructed composites are processed holistically, with facial information extracted in parallel and automatically, RT differences should disappear. But if the RT differences noted in Experiment 4 signal that these composites still differ from real face photos, there still will be differences in how they are remembered, which will appear in Experiment 5.
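The effect size for the upright composite versus real face photo RT comparison in Experiment 4 can be reconstructed from the reported means and standard deviations, in the spirit of the Dunlap et al. (1996) recommendation to use the original standard deviations rather than the paired t value. The Python sketch below assumes the standardizer is the root mean of the two condition variances; the paper does not state the exact formula, so this is an illustration rather than the authors' computation, and the function name is hypothetical.

```python
import math


def cohens_d_original_sds(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using the two conditions' own SDs (root mean of the variances)."""
    pooled_sd = math.sqrt((sd_1**2 + sd_2**2) / 2.0)
    if pooled_sd == 0:
        raise ValueError("both standard deviations are zero")
    return (mean_1 - mean_2) / pooled_sd


if __name__ == "__main__":
    # Upright RTs (ms) reported for Experiment 4: composites vs. real face photos.
    d = cohens_d_original_sds(2084, 743, 1916, 580)
    print(f"d = {d:.2f}")
```

Under this assumption the result rounds to the reported d = 0.25. Converting the paired t(77) = 2.98 directly to d would instead depend on the correlation between the two conditions, which is the source of the overestimation the cited recommendation is meant to avoid.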

EXPERIMENT 5

Method

Participants

A new group of students (N = 76) from the same pool participated in the experiment.

Materials

The 20 highly similar real face photo-composite pairs from Experiment 4 were used as targets, and the remaining 20 pairs created (but not used) in Experiment 4 were used as lures.

Procedure

The procedure was identical to Experiment 3, but with half the number of stimuli. For each block, participants encoded 10 targets (real face photos or composites) and were tested with the 10 targets and 10 lures. The order of stimuli presentation at encoding and test was random for each participant. Targets and lures always were of the same type (real face photos or composites) within a particular block’s recognition test.

Design

We used a 2 (stimulus: real face photos versus composites) × 2 (encoding time: 1 versus 3 seconds) repeated-measures design.

Results and discussion

There was an interaction for d′, F(1, 75) = 9.21, p = .005, ηp² = .23 (see Figure 5, right), such that participants were more accurate after longer encoding time for composites, t(75) = 5.60, p < .001, d = 0.66, but longer encoding time did not enhance recognition accuracy for real face photos, t(75) = 1.09, ns. Participants correctly recognized the composites at a significantly lower rate than real face photos (see Table 2), F(1, 75) = 4.86, p = .031, ηp² = .06. There was an interaction for correct recognition, F(1, 75) = 10.17, p = .002, ηp² = .12, such that participants correctly recognized more composites after longer encoding time, t(75) = 4.98, p < .001, d = 0.30, but did not correctly recognize more real face photos, t(75) = 1.08, ns. Also, participants false alarmed less to novel real face photos than to novel composites, F(1, 75) = 41.67, p < .001, ηp² = .36. In addition, the interaction for false alarm rate was significant, F(1, 75) = 7.15, p = .009, ηp² = .09, such that participants false alarmed less to novel composites after longer encoding time, t(75) = 2.85, p = .005, d = 0.26, but not to novel real face photos, t(75) = 0.85, ns. This pattern of results mimicked the pattern found with composites constructed from randomly selected features (Experiment 3).

Prior to analysis of mean RT (see Figure 6, right column), we modified outliers on the basis of the same criteria as Experiment 3, which resulted in one RT greater than 8000 milliseconds reset to 8000 milliseconds, and one RT lower than 150 milliseconds reset to 150 milliseconds. There was a significant interaction for mean RT overall, F(1, 75) = 5.38, p = .023, ηp² = .07, a main effect of stimulus, F(1, 75) = 4.33, p = .041, ηp² = .06, and a main effect of encoding time, F(1, 75) = 10.14, p = .002, ηp² = .12. Participants responded quicker to real face photos compared with composites after a 3-second encoding, t(75) = 2.97, p = .004, d = 0.31, but not after a 1-second encoding, t(75) = 0.45, ns. RT did not change for real face photos as a function of encoding time, t(75) = 0.91, ns. However, participants took more time in making recognition decisions for composites after they were encoded for 3 compared with 1 second, t(75) = 4.41, p < .001, d = 0.32.

Figure 8. Mean proportion correct (top panel) and response time (bottom panel) for upright and inverted stimuli from Experiment 4. Bars represent 95% confidence intervals

Table 2. Hit rate and false alarm rate by condition for Experiment 5 (N = 76)

Stimulus          Encoding time (seconds)   Hit rate        False alarm rate
Composite         1                         0.68 (0.024)    0.36 (0.023)
Composite         3                         0.74 (0.022)    0.31 (0.021)
Real face photo   1                         0.77 (0.021)    0.20 (0.019)
Real face photo   3                         0.74 (0.018)    0.22 (0.019)

Note: Standard errors are presented in parentheses.

On the basis of the accuracy data, the systematically constructed composites were remembered differently than real face photos. The RT results support the same conclusion: The current findings match the findings from Experiment 3, although the effect sizes were much reduced. What does the Robust EZ Diffusion Model tell us about these RT differences?

The Robust EZ Diffusion Model analysis revealed no interaction between type of stimulus and encoding time for drift rate, F(1, 70) = 0.72, ns, and no main effect of encoding time, F(1, 70) = 1.01, ns. However, there was a main effect of stimulus, such that the composites had a lower drift rate than the real face photos, F(1, 70) = 7.06, p = .01, ηp² = .09 (see Figure 7, top right). This was consistent with the memory traces formed for systematically constructed composites being of poorer quality than those formed for real face photos. However, the average drift rate (collapsed over encoding time) for these composites (M = 0.08, SD = 0.04) was much greater than that for the composites from Experiment 3 (M = 0.04, SD = 0.03), t(322) = 11.56, p < .001, d = 1.13, reflecting the extensive effort put into constructing these composites. Additional evidence that the systematic construction had beneficial effects comes from the fact that there was no interaction, unlike Experiment 3, and the effect size for the stimulus main effect was much reduced (ηp² = .61 in Experiment 3 compared with .09 here).

The estimated decision boundaries mimicked those found for composites constructed from a random selection of features (see Figure 7, middle). Participants were more conservative with their decision boundary for composites relative to real face photos, F(1, 70) = 7.95, p = .006, ηp² = .10, and they were more conservative with longer encoding time, F(1, 70) = 11.03, p = .001, ηp² = .14. However, the significant interaction (F(1, 70) = 10.29, p = .002, ηp² = .13) tells us that participants were more conservative with their decision boundary at the longer encoding time for composites (t(71) = 3.14, p = .002, d = 0.49) but not for real face photos (t(74) = 1.02, ns). This suggests that, even with these improved composites, participants still were metacognitively aware of their lower information quality compared with real face photos and compensated by waiting longer for the memorial evidence to accumulate.

These composites yielded the same value of Ter as real face photos, F(1, 70) = 0.05, ns, supporting the findings from Experiment 4 that the two classes of stimuli were perceived similarly.

GENERAL DISCUSSION

We addressed two primary issues across five experiments. Experiments 1, 2, and 4 evaluated whether feature-based composites (like those used as proxies for real face photos in the literature) were perceived like real face photos. Experiments 3 and 5 tested two theories of how feature-based composites form memory traces relative to real face photos. One theory stated that feature-based composites are processed holistically and form memory traces automatically with all features encoded in parallel, like real faces. This is the theory supported (at least implicitly) by those in the literature who utilize such composites in place of real face photos. However, an alternative theory is that feature-based composites are processed featurally, in a serial fashion, which contrasts with how real faces are processed. If this latter theory is true, researchers should not generalize results based on such composites to real faces. We first tested these theories from the perspective of researchers who used composites constructed from randomly selected features (e.g. Ellis, Meador, & Bodfish, 1985; Flowe & Besemer, in press; Vladeanu et al., 2006). We applied increasing levels of control across these experiments to make composites more comparable with real face photos. Experiments 4 and 5 were conducted from a standpoint similar to the police, who construct a composite to resemble an eyewitness’s memory of a real face.

The first two experiments yielded the typical face inversion effect, such that accuracy was reduced and RT was increased for inverted relative to upright real face photos, but we found no such effect for the composites. Experiment 3 revealed no effect of study time on memory for real face photos, but d′ for composites improved with study time. Analysis using the Robust EZ Diffusion Model supported the theory that real face photos are processed quickly and automatically and feature-based composites are processed more featurally. The quality of match between memory traces and test faces was poorer for composites, participants set a more conservative decision boundary for composites, and the time for encoding or response processes was greater for composites.

Of course, neither police nor all researchers construct composites by randomly selecting features, which led us to test a new set of composites in Experiments 4 and 5, each constructed with the goal of looking as similar as possible to a matched real face photo. Experiment 4 revealed an almost identical inversion deficit between these composites and real face photos. This suggests that feature-based composites created to look like real face photos were perceived like real face photos. However, participants did respond to upright faces more quickly than upright composites, which could signal lingering perceptual differences. It also could indicate a certain metacognitive caution when faced with composites because participants realized they were not photos of real faces. In Experiment 4, this higher criterion slowed responses without an accompanying gain in accuracy. Experiment 5 showed that more systematically constructed composites still were remembered differently than real face photos. Discrimination accuracy remained higher for real face photos, and longer encoding time increased accuracy for composites but not for real face photos. Mean RT and Robust EZ Diffusion Model results supported these differences, although there was indication of improved processing for these composites compared with those constructed from randomly selected features. Nevertheless, the quality of match between memory traces and test faces remained poorer for composites.
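The discrimination accuracy measure referred to above (d′) can be illustrated with a short Python sketch of the standard equal-variance signal detection computation. The function name `sdt_measures`, the response counts, and the log-linear correction for extreme rates are assumptions of this sketch; the text does not state how hit or false alarm rates of 0 or 1 were handled in these experiments.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from old/new recognition response counts.

    A log-linear correction (add 0.5 to each count, 1 to each total) is applied
    so that rates of exactly 0 or 1 do not produce infinite z scores. This
    correction is an assumption of the sketch, not the authors' stated method.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf                      # inverse standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # discrimination accuracy
    criterion = -0.5 * (z(hit_rate) + z(fa_rate)) # response bias (c)
    return d_prime, criterion


# Hypothetical counts for two encoding times (not data from these experiments).
short_encoding = sdt_measures(hits=14, misses=6, false_alarms=8, correct_rejections=12)
long_encoding = sdt_measures(hits=18, misses=2, false_alarms=4, correct_rejections=16)
print(f"short: d' = {short_encoding[0]:.2f}, long: d' = {long_encoding[0]:.2f}")
```

A pattern like the one reported for composites would appear as a larger d′ at the longer encoding time, whereas the pattern for real face photos would show little change in d′ across encoding times.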

Copyright © 2012 John Wiley & Sons, Ltd. Appl. Cognit. Psychol. (2012)

These results indicate that feature-based composites are unsatisfactory for generalization to real faces, but we acknowledge that only FACES was utilized in these experiments. It is possible that our results would be somewhat less applicable to other feature-based systems that perform slightly better than FACES (e.g. Frowd et al., 2007a). Regardless, researchers should be cautious about theoretical conclusions they draw when utilizing feature-based composites as proxies for real faces or real face photos. For example, research indicates that configural processing is linked to recollection (Mäntylä & Holm, 2005), and the conjunction effect (e.g. Bartlett, Searcy, & Truxillo, 1996; Jones & Jacoby, 2001) provides evidence suggesting that conjunction false alarms are often due to familiarity in the absence of recollection. Because conjunctions are simply a recombination of features, this suggests that familiarity is linked to feature-based processing. Therefore, if this kind of feature-based composite is used, familiarity could play a more prominent role (e.g. yielding greater effect sizes for whatever measure of familiarity is being used) than if real face photos are used.

Another example comes from the eyewitness memory literature. There is a debate concerning the best way to present a lineup: (i) simultaneously, or all members presented at once; or (ii) sequentially, or one member presented at a time, and once a choice is made, the lineup ends. Lindsay and Wells (1985) argued that the sequential lineup is better (but see Gronlund, Carlson, Dailey, & Goodsell, 2009) because it makes an ‘absolute judgment’ more likely than a ‘relative judgment’. In other words, an eyewitness could compare each member of a sequential lineup with their memory for the perpetrator without being distracted by the other members of the lineup, whereas an eyewitness viewing a simultaneous lineup could be tempted to compare the faces to each other, making more of a relative or comparative judgment. Of course, if an innocent suspect is present in the lineup, the relative judgment puts him at risk. If the absolute judgment relies on recollection (Carlson & Gronlund, 2011; Gronlund, 2005), the use of feature-based composites in lineup studies (Flowe & Besemer, in press; Flowe & Cottrell, in press; Flowe & Ebbesen, 2007) could reduce the occurrence of absolute judgment processes by reducing the reliance on recollection. Perceptual fluency (Jacoby & Dallas, 1981), the ease with which items are perceived or recognized, might be another mechanism underlying lineup choices (Meissner, Tredoux, Parker, & MacLin, 2005). Increased perceptual fluency can result in an increased likelihood of choosing a particular face. However, composites constructed from randomly selected features should elicit less perceptual fluency than would the typical real face (or photo), which means that the contribution of perceptual fluency could be suppressed if such composites were used instead of real face photos. Our findings from Experiment 3 based on the parameter Ter from the EZ Diffusion Model support this contention.

We went to great lengths to create feature-based composites that would be processed like real face photos, and although we succeeded in replicating the face inversion effect (Experiment 4), there were still differences in how our systematically created composites were processed compared with real face photos (Experiment 5). This was the case even though we carefully constructed each composite with reference to a photograph of the real face. Is there anything else that can be done to make composites function more like real face photos?

One approach is to have participants make personality-based judgments about each composite. Wells and Hryciw (1984), using Identi-Kit composites, found that personality judgments induced participants to process the composites more holistically relative to a feature judgment control. Those who made feature judgments were better able to reconstruct the face at test but were poorer at picking the face out of a lineup. Just the opposite was the case for those who initially made the personality judgments. Therefore, perhaps the degree of holistic processing of composites would be enhanced by having participants perform personality judgments at encoding, something that might happen naturally if participants were told that they were viewing a criminal rather than being instructed to remember a stimulus. Two additional holistic-type methods used to improve the quality of composites are the development of a ‘holistic cognitive interview’ (Frowd, Bruce, Smith, & Hancock, 2008), which involves face constructors making personality judgments before constructing a composite, and the use of dynamic caricatures to improve people’s ability to correctly recognize a composite face (Frowd, Bruce, Ross, McIntyre, & Hancock, 2007).

Another idea is that composites might function more like real face photos if they are simplified, ‘to preclude the use of irrelevant photographic idiosyncracies in face matching’ (Wilson, Loffler, & Wilkinson, 2002, p. 2919). For example, Wilson and colleagues used synthetic faces that started as photographs of actual individuals but underwent several simplifications. These included the use of identical eyes and eyebrows (although their locations varied) and the elimination of hair and skin texture, skin color, and wrinkles. The synthetic faces also were bandpass filtered, preserving lower spatial frequencies that carry more holistic information to the exclusion of more detailed information. It is possible that the standardization of some features, and the elimination of others, forces an individual to rely more on configural or holistic information in synthetic faces (see also, e.g. Hole, George, Eaves, & Rasek, 2002, concerning the preservation of configural processing after Gaussian blurring of real face photos). The same might happen if these modifications were made to feature-based composites, even those constructed from randomly selected features.

A good test of these various modifications of feature-based composites would be a paradigm similar to that which we utilized in Experiments 4 and 5. First, make sure a face inversion effect is present for the composites. Then, compare the composites with real faces while manipulating encoding time. If the accuracy and RT differences we identified in Experiment 5 disappear, that would be a good indication that the composites are perceived and remembered more like real faces.

There are programs that utilize more holistic methods for developing composites (see Davies & Valentine, 2007, for a review). These include EvoFIT (Frowd et al., 2004), Eigen-fit (Gibson, Pallares-Bejarano, & Solomon, 2003; now known as EFIT-V), and ID (Tredoux, Rosenthal, Nunez, & da Costa, 1999). Rather than grouping together features, these programs use more sophisticated approaches, including principal component analysis and genetic and evolutionary algorithms. These ‘holistic’ methods allow for a greater sampling of a multitude of facial qualities, beyond simply individual features, such as the precise configurations of internal facial features. Studies focused on developing EvoFIT have found that it can produce better composites than feature-based systems (Frowd et al., 2007b; Frowd et al., 2010). In addition, EFIT-V was used (Valentine et al., 2010) to create composites that were subsequently morphed within or between witnesses. Morphs of several composites, each from a different constructor, were better named and judged to be better likenesses of the faces on which they were based, compared with morphs of composites created by one individual. All morphs were significantly better matches to the original real faces compared with individual EFIT-V composites. In sum, these various programs, as well as morphing techniques, represent steps toward creating composites that are processed more like real faces and real face photos. However, to reiterate, a good test of these newer composites would be to evaluate them in the paradigms we utilized. These paradigms should include RT analyses that can provide a thorough analysis of the underlying cognitive processing of these composites and how they compare with real faces.

CONCLUSION

The control of stimulus similarity and the relationships among stimuli is important for theory testing and development (e.g. Davidenko, 2007; Yotsumoto et al., 2007). Feature-based composites provide a high degree of control. However, it appears that these stimuli are not processed like real face photos, even after several levels of control are applied. This has consequences for researchers utilizing these stimuli as proxies for real faces. However, alternative methods of composite construction exist, and these should be tested along the same lines as the paradigms used in the present studies. If these composites pass these tests, then conclusions reached on the basis of these composites should be better candidates for generalization to real faces, while also providing the concomitant benefits of increased stimulus control so important for theory testing.

REFERENCES

Amishav, R., & Kimchi, R. (2010). Perceptual integrality of componential and configural information in faces. Psychonomic Bulletin & Review, 17, 743–748. DOI: 10.3758/PBR.17.5.743

Bartlett, J. C., Helm, A., & Jerger, S. (2001). Selective attention to inner and outer parts of faces: Evidence for holistic and featural processing. Unpublished manuscript, Department of Psychology, University of Texas at Dallas.

Bartlett, J. C., Searcy, J. H., & Truxillo, C. (1996, November). Both parts and wholes affect face recognition. Paper presented at the 37th annual meeting of the Psychonomic Society, Chicago, IL.

Carbon, C., & Leder, H. (2005). When feature information comes first! Early processing of inverted faces. Perception, 34, 1117–1134. DOI: 10.1068/p5192

Carlson, C. A. (2009). Distinctiveness in an eyewitness identification paradigm: Comparing simultaneous and sequential lineups. (Unpublished doctoral dissertation). University of Oklahoma, Norman, OK.

Carlson, C. A., & Gronlund, S. D. (2011). Searching for the sequential lineup advantage: A distinctiveness explanation. Memory, 19, 916–929. DOI: 10.1080/09658211.2011.613846

Carlson, C. A., Gronlund, S. D., & Clark, S. E. (2008). Lineup composition, suspect position and the sequential lineup advantage. Journal of Experimental Psychology. Applied, 14, 118–128. DOI: 10.1037/1076-898X.14.2.118

Charman, S. D., Gregory, A. H., & Carlucci, M. (2009). Exploring the diagnostic utility of facial composites: Beliefs of guilt can bias perceived similarity between composite and suspect. Journal of Experimental Psychology. Applied, 15, 76–90. DOI: 10.1037/a0014682

Clark, S. E. (2003). A memory and decision model for eyewitness identification. Applied Cognitive Psychology, 17, 629–654.

Clark, S. E., & Davey, S. L. (2005). The target-to-foils shift in simultaneous and sequential lineups. Law and Human Behavior, 29, 151–172. DOI: 10.1007/s10979-005-2418-7

Cohen, M. E., & Carr, W. J. (1975). Facial recognition and the von Restorff effect. Bulletin of the Psychonomic Society, 6, 383–384. Retrieved from http://www.psychonomic.org

Davidenko, N. (2007). Silhouetted face profiles: A new methodology for face perception research. Journal of Vision, 7, 1–17. DOI: 10.1167/7.4.6

Davies, G. M., & Valentine, T. (2007). Facial composites: Forensic utility and psychological research. In R. C. L. Lindsay, D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), Handbook of eyewitness psychology Vol. 2 (pp. 59–86). Mahwah, NJ: Erlbaum.

Davies, G. M., van der Willik, P., & Morrison, L. J. (2000). Facial composite production: A comparison of mechanical and computer-driven systems. Journal of Applied Psychology, 85, 119–124. DOI: 10.1037/0021-9010.85.1.119

Dunlap, W. P., Cortina, J. M., Vaslow, J. B., & Burke, M. J. (1996). Meta-analysis of experiments with matched groups or repeated measures designs. Psychological Methods, 1, 170–177. DOI: 10.1037/1082-989X.1.2.170

Ellis, H. D., Davies, G. M., & Shepherd, J. W. (1978). A critical examination of the photo-fit system for recalling faces. Ergonomics, 21, 297–307. DOI: 10.1080/00140137808931726

Ellis, N. R., Meador, D. M., & Bodfish, J. W. (1985). Differences in intelligence and automatic memory processes. Intelligence, 9, 265–273. DOI: 10.1016/0160-2896(85)90028-

Ellis, H. D., Shepherd, J. W., & Davies, G. M. (1975). An investigation of the use of the photo-fit technique for recalling faces. British Journal of Psychology, 66, 29–37. Retrieved from: http://onlinelibrary.wiley.com/journal/10.1111/%28ISSN%292044-8295

Ellis, H. D., Shepherd, J. W., & Davies, G. M. (1979). Identification of familiar and unfamiliar faces from internal and external features: Some implications for theories of face recognition. Perception, 8, 431–439. DOI: 10.1068/p080431

E-Prime 1.1. (2003). E-Studio (version 1.1.4.15) [Computer software]. Pittsburgh, PA: Psychology Software Tools, Inc.

Farah, M. J., Wilson, K. D., Drain, H. M., & Tanaka, J. R. (1998). What is “special” about face perception? Psychological Review, 105, 482–498. DOI: 10.1037/0033-295X.105.3.482

Flowe, H. D., & Besemer, A. N. (in press). The effect of target-foil discriminability on criterion placement in sequential and simultaneous lineups. Psychology, Crime and Law. Retrieved from http://www2.le.ac.uk/departments/psychology/ppl/hf49

Flowe, H. D., & Cottrell, G. (in press). An examination of simultaneous lineup decision processes using eye tracking. Applied Cognitive Psychology. Retrieved from http://www2.le.ac.uk/departments/psychology/ppl/hf49

Flowe, H. D., & Ebbesen, E. B. (2007). The effect of lineup member similarity on recognition accuracy in simultaneous and sequential lineups. Law and Human Behavior, 31, 33–52. DOI: 10.1007/s10979-006-9045-9

Frowd, C. D., Bruce, V., McIntyre, A., & Hancock, P. J. B. (2007a). The relative importance of external and internal features of facial composites. British Journal of Psychology, 98, 61–77. DOI: 10.1348/000712606X104481

Frowd, C. D., Bruce, V., Ness, H., Bowie, L., Paterson, J., Thomson-Bogner, C., McIntyre, A., & Hancock, P. J. B. (2007b). Parallel approaches to composite production: Interfaces that behave contrary to expectation. Ergonomics, 50, 562–585. DOI: 10.1080/00140130601154855

Frowd, C. D., Bruce, V., Ross, D., McIntyre, A., & Hancock, P. (2007). An application of caricature: How to improve the recognition of facial composites. Visual Cognition, 15, 954–984. DOI: 10.1080/13506280601058951

Frowd, C. D., Carson, D., Ness, H., McQuiston-Surrett, D., Richardson, J., Baldwin, H., & Hancock, P. (2005a). Contemporary composite techniques: The impact of forensically-relevant target delay. Legal and Criminological Psychology, 10, 63–81. DOI: 10.1348/135532504X15358

Frowd, C. D., Carson, D., Ness, H., Richardson, J., Morrison, L., McLanaghan, S., & Hancock, P. (2005b). A forensically valid comparison of facial composite systems. Psychology, Crime & Law, 11, 33–52. DOI: 10.1080/10683160310001634313

Frowd, C. D., Bruce, V., Smith, A. J., & Hancock, P. (2008). Improving the quality of facial composites using a holistic cognitive interview. Journal of Experimental Psychology. Applied, 14, 276–287. DOI: 10.1037/1076-898X.14.3.276

Frowd, C. D., Hancock, P., & Carson, D. (2004). EvoFIT: A holistic, evolutionary facial imaging technique for creating composites. ACM Transactions on Applied Perception, 1, 19–39. DOI: 10.1145/1008722.1008725

Frowd, C. D., McQuiston-Surrett, D., Anandaciva, S., Ireland, C. G., & Hancock, P. (2007). An evaluation of US systems for facial composite production. Ergonomics, 50, 1987–1998. DOI: 10.1080/00140130701523611

Frowd, C. D., Pitchford, M., Bruce, V., Jackson, S., Hepton, G., Greenall, M., McIntyre, A., & Hancock, P. J. B. (2010). The psychology of face construction: Giving evolution a helping hand. Applied Cognitive Psychology. DOI: 10.1002/acp.1662

Gibson, S., Pallares-Bejarano, A., & Solomon, C. (2003). Synthesis of photographic quality facial composites using evolutionary algorithms. In R. Harvey, & J. A. Bangham (Eds.), Proceedings of the British Machine Vision Conference 2003 (pp. 221–230). London: British Machine Vision Association.

Gronenschild, E., Smeets, F., Vuurman, E., van Boxtel, M., & Jolles, J. (2009). The use of faces as stimuli in neuroimaging and psychological experiments: A procedure to standardize stimulus features. Behavior Research Methods, 41, 1053–1060. DOI: 10.3758/BRM.41.4.1053

Gronlund, S. D. (2005). Sequential lineup advantage: Contributions of distinctiveness and recollection. Applied Cognitive Psychology, 19, 23–37. DOI: 10.1002/acp.1047

Gronlund, S. D., Carlson, C. A., Dailey, S. B., & Goodsell, C. A. (2009). Robustness of the sequential lineup advantage. Journal of Experimental Psychology. Applied, 15, 140–152. DOI: 10.1037/a0015082

Hancock, P. J. B., Bruce, V., & Burton, A. M. (2000). Recognition of unfamiliar faces. Trends in Cognitive Sciences, 4, 330–337. DOI: 10.1016/S1364-6613(00)01519-9

Harley, E. M., Dillon, A. M., & Loftus, G. R. (2004). Why is it difficult to see in the fog? How stimulus contrast affects visual perception and visual memory. Psychonomic Bulletin & Review, 11, 197–231. Retrieved from http://pbr.psychonomic-journals.org/

Hole, G. J. (1994). Configurational factors in the perception of unfamiliar faces. Perception, 23, 65–74. DOI: 10.1068/p230065

Hole, G. J., George, P. A., Eaves, K., & Rasek, A. (2002). Effects of geometric distortions on face-recognition performance. Perception, 31, 1221–1240. DOI: 10.1068/p3252

IQ Biometrix. (2003). FACES, the Ultimate Composite Picture (Version 4.0) [Computer software]. Fremont, CA: IQ Biometrix, Inc.

Jacoby, L. L., & Dallas, M. (1981). On the relationship between autobiographical memory and perceptual learning. Journal of Experimental Psychology. General, 110, 306–340. DOI: 10.1037/0096-3445.110.3.306

Jones, T. C., & Jacoby, L. L. (2001). Feature and conjunction errors in recognition memory: Evidence for dual-process theory. Journal of Memory and Language, 45, 82–102. DOI: 10.1006/jmla.2000.2761

Koehn, C. E., & Fisher, R. P. (1997). Constructing facial composites with the Mac-A-Mug Pro system. Psychology, Crime, & Law, 3, 209–218. DOI: 10.1080/10683169708410815

Kovera, M. B., Penrod, S. D., Pappas, C., & Thill, D. L. (1997). Identification of computer-generated facial composites. Journal of Applied Psychology, 82, 235–246. DOI: 10.1037/0021-9010.82.2.235

Lindsay, R. C. L., & Wells, G. L. (1985). Improving eyewitness identifications from lineups: Simultaneous versus sequential lineups presentations. Journal of Applied Psychology, 70, 556–564. DOI: 10.1037/0021-9010.70.3.556

Loftus, G. R., Oberg, M. A., & Dillon, A. M. (2004). Linear theory, dimensional theory, and the face-inversion effect. Psychological Review, 111, 835–863. DOI: 10.1037/0033-295X.111.4.835

Mäntylä, T., & Holm, L. (2005). Remembering parts and wholes: Configural processing in face recollection. European Journal of Cognitive Psychology, 17, 753–769. DOI: 10.1080/09541440440000258

McQuiston-Surrett, D., Topp, L. D., & Malpass, R. S. (2006). Use of facial composite systems in US law enforcement agencies. Psychology, Crime, & Law, 12, 505–517. DOI: 10.1080/10683160500254904

Meissner, C. A., Tredoux, C. G., Parker, J. F., & MacLin, O. H. (2005). Eyewitness decisions in simultaneous and sequential lineups: A dual-process signal detection theory analysis. Memory & Cognition, 33, 783–792. Retrieved from http://mc.psychonomic-journals.org

Pullan, L., & Rhodes, G. (1996). Why are inverted faces hard to recognize? A test of the relational feature by hypothesis. New Zealand Journal of Psychology, 25, 8–10. Retrieved from http://www.psychology.org.nz/NZ_Journal

Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108. DOI: 10.1037/0033-295X.85.2.59

Reinitz, M., Lammers, W., & Cochran, B. (1992). Memory conjunction errors: Miscombination of stored stimulus features can produce illusions of memory. Memory & Cognition, 20, 1–11. Retrieved from http://mc.psychonomic-journals.org

Richler, J. J., Mack, M. L., Gauthier, I., & Palmeri, T. J. (2009). Holistic processing of faces happens at a glance. Vision Research, 49, 2856–2861. DOI: 10.1016/j.visres.2009.08.025

Schneider, W., Eschman, A., & Zuccolotto, A. (2002). E-Prime user’s guide. Pittsburgh, PA: Psychology Software Tools.

Sergent, J. (1984). An investigation into component and configural processes underlying face perception. British Journal of Psychology, 75, 221–242. Retrieved from http://www.ingentaconnect.com/content/bpsoc/bjp

Smith, E. E., & Nielsen, G. D. (1970). Representations and retrieval processes in short-term memory: Recognition and recall of faces. Journal of Experimental Psychology, 85, 397–405. DOI: 10.1037/h0029727

Steblay, N. M., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001). Eyewitness accuracy rates in sequential and simultaneous lineup presentations: A meta-analytic comparison. Law and Human Behavior, 25, 459–474. DOI: 10.1023/A:1012888715007

Tredoux, C., Rosenthal, Y., Nunez, D., & da Costa, L. (1999). Face reconstruction using a configural, eigenface-based composite system. Paper presented to the third meeting of the Society for Applied Research in Memory and Cognition, Boulder, Colorado, July 1999. Retrieved from http://web.uct.ac.za/depts/psychology/plato/

Valentine, T., & Bruce, V. (1986). The effect of race, inversion and encoding activity upon face recognition. Acta Psychologica, 61, 259–273. DOI: 10.1016/0001-6918(86)90085-5

Valentine, T., Davis, J. P., Thorner, K., Solomon, C., & Gibson, S. (2010). Evolving and combining facial composites: Between-witness and within-witness morphs compared. Journal of Experimental Psychology. Applied, 16, 72–86. DOI: 10.1037/a0018801

Veres-Injac, B., & Schwaninger, A. (2009). The time course of processing external and internal features of unfamiliar faces. Psychological Research/Psychologische Forschung, 73, 43–53. DOI: 10.1007/s00426-008-0147-5

Vladeanu, M., Lewis, M., & Ellis, H. (2006). Associative priming in faces: Semantic relatedness or simple co-occurrence? Memory & Cognition, 34, 1091–1101. Retrieved from http://mc.psychonomic-journals.org

Wagenmakers, E.-J. (2009). Methodological and empirical developments for the Ratcliff diffusion model of response times and accuracy. European Journal of Cognitive Psychology, 21, 641–671. Retrieved from http://www.tandf.co.uk/journals/pp/09541446.html

Wagenmakers, E.-J., van der Maas, H. L. J., Dolan, C., & Grasman, R. P. P. P. (2008). EZ does it! Extensions of the EZ-diffusion model. Psychonomic Bulletin & Review, 15, 1229–1235. DOI: 10.3758/PBR.15.6.1229

Wagenmakers, E.-J., van der Maas, H. L. J., & Grasman, R. P. P. P. (2007). An EZ-diffusion model for response time and accuracy. Psychonomic Bulletin & Review, 14, 3–22. Retrieved from http://pbr.psychonomic-journals.org/

Wells, G. L., & Hasel, L. E. (2007). Facial composite production by eyewitnesses. Current Directions in Psychological Science, 16, 6–10. DOI: 10.1111/j.1467-8721.2007.00465.x

Wells, G. L., & Hryciw, B. (1984). Memory for faces: Encoding and retrieval operations. Memory & Cognition, 12, 338–344. Retrieved from http://mc.psychonomic-journals.org

Wilford, M. M., & Wells, G. L. (2010). Does facial processing prioritize change detection? Change blindness illustrates costs and benefits of holistic processing. Psychological Science, 21, 1611–1615. DOI: 10.1177/0956797610385952

Wilson, H. R., Loffler, G., & Wilkinson, F. (2002). Synthetic faces, face cubes, and the geometry of face space. Vision Research, 42, 2909–2923. DOI: 10.1016/S0042-6989(02)00362-0

Yin, R. K. (1969). Looking at upside-down faces. Journal of Experimental Psychology, 81, 141–145. DOI: 10.1037/h0027474

Yotsumoto, Y., Kahana, M. J., Wilson, H. R., & Sekuler, R. (2007). Recognition memory for realistic synthetic faces. Memory & Cognition, 35, 1233–1244. Retrieved from http://mc.psychonomic-journals.org

Young, A. W., Hay, D. C., McWeeny, K. H., & Flude, B. M. (1985). Matching familiar and unfamiliar faces on internal and external features. Perception, 14, 737–746. DOI: 10.1068/p140737
