volume 28 number 1

98
VOLUME 28 1999 Special Edition Chart Interpretation The Utah Numerical Scoring System Brian G. Bell, David C. Raskin, Charles R. Honts, & John C. Kircher Manually SCQring Polygraph Charts Utilizing the Seven-Position Numerical NUMBER 1 Analysis Scale at the Department of Defense Polygraph Institute 10 Jimmy Swinford Development of Deception Criteria Prior to 1950 28 Norman Ansley Numerical Evaluation of the ArJ1lY Zone Comparison Test 37 Gary D. Light Numerical Scoring Systems in the Triad of Matte Polygraph Techniques 46 James A. Matte The Academy for Scientific Investigative Training's Horizontal Scoring System and Examiner's Algorithm for Chart Interpretation 56 Nathan J. Gordon The Control Question Technique: A Search for Improved Decision Rules 65 Eitan Elaad Rank Order Analysis 74 Kathleen Miritello Scoring in a Computer Age 77 James Wygant ,. Short Report: Proposed Method for Scoring Electrodermal Responses 82 Donald Krapohl Chart Interpretation - A Bibliography 85 Norman Ansley Published Quarterly © American Polygraph Associatio n, 1999 P.O. Box 8037, Chattanooga, Tennessee 37414-0037

Upload: trancong

Post on 19-Jan-2017

240 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: VOLUME 28 NUMBER 1

VOLUME 28 1999

Special Edition

Chart Interpretation

The Utah Numerical Scoring System Brian G. Bell, David C. Raskin, Charles R. Honts, & John C. Kircher

Manually SCQring Polygraph Charts Utilizing the Seven-Position Numerical

NUMBER 1

Analysis Scale at the Department of Defense Polygraph Institute 10 Jimmy Swinford

Development of Deception Criteria Prior to 1950 28 Norman Ansley

Numerical Evaluation of the ArJ1lY Zone Comparison Test 37 Gary D. Light

Numerical Scoring Systems in the Triad of Matte Polygraph Techniques 46 James A. Matte

The Academy for Scientific Investigative Training's Horizontal Scoring System and Examiner's Algorithm for Chart Interpretation 56

Nathan J. Gordon

The Control Question Technique: A Search for Improved Decision Rules 65 Eitan Elaad

Rank Order Analysis 74 Kathleen Miritello

Scoring in a Computer Age 77 James Wygant

,. Short Report: Proposed Method for Scoring Electrodermal Responses 82 Donald Krapohl

Chart Interpretation - A Bibliography 85 Norman Ansley

Published Quarterly © American Polygraph Association, 1999

P.O. Box 8037, Chattanooga, Tennessee 37414-0037

Page 2: VOLUME 28 NUMBER 1

Bell, Raskin, Honts & Kircher

Polygraph, 1999, 28 (1) 1

The Utah Numerical Scoring System

Brian G. Bell1, David C. Raskin2, Charles R. Honts3, & John C. Kircher1

Abstract

The Utah method for numerically evaluating polygraph charts is a highly reliable and valid methodfor scoring specific-incident, comparison-question tests. For respiration, electrodermal activity (skinconductance or skin resistance), relative blood pressure (cardiograph), and peripheral vasomotoractivity (finger plethysmograph), a score from +3 to -3 is assigned for each presentation of arelevant question. The reaction to the relevant question is compared to the reaction to a nearbycomparison (control) question. A positive score is assigned when the psychophysiological reaction isgreater to the comparison question than to the relevant question, a negative score is assigned whenthe reaction is greater to the relevant question, and a zero is assigned when the responses to therelevant and comparison questions are approximately equal. Scores are based on the criteriadescribed in the present report. Common artifacts that may affect numerical evaluations arediscussed, as are limitations of this scoring system.

Key words: comparison-question tests, detection of deception, numerical scoring, polygraph

This report describes the Utah methodfor numerically evaluating polygraph chartsfrom specific-incident, comparison-questiontests. The development of the Utah methodwas preceded by numerical scoring techniquesintroduced by Backster (1969) and the U.S.Army (Weaver, 1980; 1985). Although theseearly scoring systems represented majorimprovements over global approaches to chartevaluation, many of their scoring rules had noscientific basis and had not been validated byscientific research. The Backster system inparticular had been shown to be biasedagainst truthful subjects, (Raskin, 1986) andconsisted of many complex scoring rules thatmade it difficult to evaluate polygraph chartsreliably. The Utah system was developed tosimplify the scoring process, reduce bias, and

improve the accuracy of decisions. It consistsof relatively few rules that may be applied withconsiderable consistency by different numeri-cal evaluators after a brief period of training.

The reliability of the Utah scoringsystem has been evaluated in severallaboratory experiments at the University ofUtah. The results of five such studies aresummarized in Table 1. On average, theinterrater reliability of the Utah systemexceeded .90, as measured by the correlationsbetween total numerical scores assigned bytwo or more evaluators. The percentagreement on decisions exceeded 95% whenboth numerical evaluators reached a definitedecision. Similar reliabilities between raterswho use the Utah system have been obtained

Acknowledgements

Portions of the research described in this report were funded by a Marriner S. Eccles Graduate Fellowship in PoliticalEconomy from the University of Utah, the National Institute of Law Enforcement and Criminal Justice (Contract 75-NI-99-0001), and the National Institute of Justice (Grants 8l IJ-CX-0051 and 85-IJ-CX-0040). The views expressed in this articleare those of the authors and do not reflect the official policy or position of the National Institute of Justice or the U.S.Government. The authors also gratefully acknowledge comments by Paul Bernhardt on an early draft of the manuscript.For reprints, contact Dr. John Kircher, Department of Educational Psychology, 327 MBH, University of Utah, Salt Lake City,UT 84112.

1 Department of Educational Psychology, 327 MBH, University of Utah, Salt Lake City, UT 84112.2 P.O. Box 2419, Homer, AK 99603.3 Department of Psychology, Boise State University, Boise, ID 83725.

Page 3: VOLUME 28 NUMBER 1

Utah Numerical Scoring System

Polygraph, 1999, 28 (1) 2

in field studies. For example, the interratercorrelation was .94 in a field study by Hontsand Raskin (1988). These reliabilities farexceed standards of acceptable interrater

reliability for psychological tests as establishedby the American Psychological Association(1985).

Table 1. Reliability of the Utah System of Numerical Scoring in Laboratory Studies

Study

Agreement on DecisionsBetween Original Examiner

and Independent Evaluator*

Correlation Between Numerical Scores of Independent

Evaluatorand Original Examiner

Podlesny & Raskin (1978) 100% .97

Rovner et al. (1979) 95% .97

Kircher & Raskin (1988) 99% .97

Honts et al. (1994) 96% .92

Horowitz et al. (1997) .98** .92

* Includes only cases in which both examiners made a decision (excludes inconclusives)** Only Kappa was reported in this study.

The validity of the Utah system ofnumerical evaluation has also been estab-lished. Table 2 presents decision accuraciesfrom several laboratory experiments.Excluding inconclusive outcomes, the overallpercentage of correct decisions was 91% forguilty subjects and was 89% for innocentsubjects.

The results from field studies with theUtah system are consistent with thosereported in Table 2 (Honts & Raskin, 1988;Raskin, 1976; Raskin, Kircher, Honts, &Horowitz, 1988). In one field study, twonumerical evaluators independently evaluatedthe polygraph charts using the Utah system(Raskin, 1976). Their decisions were 100%correct for both guilty and innocent suspects.In another study, decisions were 92% correctfor guilty suspects and 100% correct forinnocent suspects (Honts & Raskin, 1988).

Overview of the Utah ScoringSystem

The Utah scoring system, when usedwith the probable-lie and directed-lie

comparison question tests, assigns numericalscores by assessing differences betweenrelevant and comparison questions. Scores areassigned on a 7-point scale that ranges from -3 to +3. The reaction to a relevant question iscompared to the reaction produced by atemporally adjacent, comparison question. If arelevant question was presented between twocomparison questions, its reaction iscompared to the reaction to the comparisonquestion that produced the stronger physio-logical response.

For each channel, the relative size ofthe reactions to the comparison and relevantquestions is evaluated and quantified. Positivescores are assigned when the physiologicalreaction to the comparison question wasgreater than the reaction to the relevantquestion. Negative scores are assigned whenthe reaction to the relevant question wasgreater, and zero is assigned when reactions tothe relevant and comparison questions are notnoticeably different. In general, a noticeabledifference between the reactions to thecomparison and relevant questions is assigneda score of 1. A strong, clear difference between

Page 4: VOLUME 28 NUMBER 1

d' <E"

~ is' ...... ID ID .ID

11X1

.:;

w

Table 2. Validity of the Utah System of Numerical Scoring in Laboratory Studies

Guilty Innocent

Study N Correct Incorrect Inconclusive % N Correct Incorrect Inconclusive 0/0 Correct* Correct*

Raskin & Hare (1978) 24 88% 0% 12% 100% 24 75% 4% 21% 95%

Podlesny & Raskin (1978) 20 7CPlo 15% 15% 82% 20 9CPlo 5°1o 5°1o 95%

Rovner et al. (1979)a 24 88% CPlo 12% 10CPlo 24 88% 8% 4% 92%

Kircher & Raskin (1988) 50 88% 6°1o 6% 94% 50 86% 6% 8% 93%

Honts et al. (1994) 20 7CPlo 2CPlo lCPlo 78% 20 75% lCPlo 15% 88%

Horowitz et al. (1997)b 15 53% 2CPlo 27% 73% 15 8CPlo 13% 7% 86%

* The percent correct was calculated by dividing the number correct by the sum of the number correct and the number incorrect.

a Excludes 24 countermeasure-trained subjects. b Excludes 90 subjects given relevant-irrelevant and directed lie tests.

Page 5: VOLUME 28 NUMBER 1

Utah Numerical Scoring System

Polygraph, 1999, 28 (1) 4

the reactions is assigned a 2. A score of 3 isassigned when there is a dramatic differencebetween the reactions to the two questions,the tracing is stable, and the strongerresponse is the largest on the chart for thatphysiological measure.

In a single-issue test, all relevantquestions must be answered truthfully or allmust be answered deceptively. In this case,the scores for all presentations of relevantquestions are summed. The subject is reportedas deceptive if the total score is -6 or lower,truthful if the total is +6 or higher, andinconclusive if the total is between -6 and +6.

For mixed issue tests, such as theModified General Question Test, some relevantquestions can be answered truthfully whileothers can be answered deceptively. In thiscase, a separate total is obtained for eachrelevant question. When the total score for asingle relevant question is -3 or lower, thesubject's answer to that question is considereddeceptive. When the total score for a singlerelevant question is +3 or higher, the subject'sanswer is considered truthful. When the totalscore for a question is between -3 and +3, theoutcome is considered inconclusive. However,if the total score for all relevant questionscombined is at least +6 or -6, and the totalscores for each relevant question are in thesame direction (all positive or all negative), thesubject is considered truthful or deceptive,respectively, to each relevant question.

Scoring Criteria

A total of ten scoring criteria are usedto assess the relative strength of physiologicalreactions to relevant and comparisonquestions. The criteria change depending onthe physiological measure being evaluated.Scores are assigned to respiration, electro-dermal, cardiograph, and finger plethysmo-graph channels.

RespirationFor a given relevant question, changes

in respiration are evaluated first because deepbreaths may affect how other channels areevaluated. In general, a reaction to a questionis indicated by suppressed respiratory activity.The greater the suppression, the stronger thereaction. Suppression is indicated primarily by

a reduction in the amplitude of at least twosuccessive respiration cycles followingquestion onset and brief periods of apnea(cessation of breathing). A rise in therespiration baseline, as indicated by a rise inthe bottoms of at least two respiration cycles,is another criterion for scoring a reaction. Anincrease in cycle time (slowing of respirationrate) is also a criterion but is less heavilyweighted than changes in amplitude, apnea,and baseline increase. Increases in respiratoryactivity, such as increased amplitude,speeding of respiration, and drops in respir-ation baseline, are not indications of a reactionand are not criteria for scoring.

Although thoracic and abdominalrespiration are recorded on separate channelsof the polygraph, only one numerical score isassigned that is based on a composite of bothchannels. Respiration reactions to thecomparison and relevant questions areevaluated by noting the combined amount ofreaction in both respiration channels for therelevant question and for the comparisonquestion. A single numerical score is thenassigned based on the difference in thecomposite reactions to the relevant andcomparison questions. Thus, the numericalscore for one relevant question may be basedon observed changes in thoracic respiration,abdominal respiration, or a composite of both,depending on the total amount of changeobserved in the two channels.

Electrodermal ActivityThe electrodermal channel is evaluated

next. Numerical scores for electrodermalactivity are based mainly on changes in peakamplitude. The amplitude of a reaction isdefined as the greatest difference between anylow point and subsequent high point thatoccurs within the scoring window (describedbelow). Amplitude may be measured by usingthe numerical scoring subprogram in theStoelting Computerized Polygraph System(CPS) or a similar system. If only printed orinked charts are available, the amplitude ismeasured with a ruler to the nearest 0.5millimeter. For each relevant question, a scoreof 1 is assigned if the amplitude of the reactionto the relevant or comparison question is twiceas large as the amplitude of the reaction to theother question. A score of 2 is assigned whenthe amplitude of the reaction is three times as

Page 6: VOLUME 28 NUMBER 1

Bell, Raskin, Honts & Kircher

Polygraph, 1999, 28 (1) 5

large, and a score of 3 is assigned when theamplitude is four times as large. However, ascore of 3 may be assigned only when thebaseline is stable and the reaction is thelargest on the chart. The baseline isconsidered unstable if there are manynonspecific electrodermal responses on thechart. Under those conditions, a score of 3cannot be assigned.

The duration of the electrodermalreaction and its complexity (the number ofwaves or fluctuations that occur within thescoring window) are also considered whenscoring the electrodermal channel. Reactionsthat have clearly longer duration or greatercomplexity may increase the score from 0 to 1or from 1 to 2. The larger score may beassigned if the ratio of the amplitudes is atleast 1.5:1 or 2.5:1, and the larger reactionhas longer duration and/or is more complex.However, a score of 1 or 2 cannot be assignedif the reactions differ only in duration and/orcomplexity, and these criteria are not used toassign a score of 3.

CardiographFor the cardiograph channel, reactions

are measured as rises in the baseline. Thenumerical score is based primarily on thelargest rise in the baseline that occurs withinthe scoring window. Again, use of a computer-scoring algorithm, such as the CPS, or a ruleris recommended for making measurements ofincreases in the baseline. A minimum ratio of1.5 to 1 is required for a score of 1.Measurements of baseline increases are madeon the diastolic side of the waveform becausethe diastolic points show greater change thanthe systolic points and are easier to see.However, increases in the systolic points maybe used if it is unclear whether to assign a 0 or1 or to assign a 1 or 2 based on the diastolicpoints. The duration of the response is alsoconsidered. The rules described above forelectrodermal reactions apply to thecardiograph; reactions with longer durationmay increase the numerical score from 0 to 1or from 1 to 2.

Finger PlethysmographPeripheral vasomotor activity is mea-

sured from a photoplethysmograph attachedto the tip of the finger. Constriction of bloodvessels in the finger produces a reduction

(constriction) in the amplitude of finger pulses.Numerical scores are based on the durationand magnitude of reductions in finger pulseamplitude. Responses of longer durationand/or magnitude are assigned larger num-erical scores. Unlike the cardiograph andelectrodermal channels, scores of 1 or 2 maybe assigned to this response system whenthere is little or no difference in the reductionof pulse amplitude, but there is a cleardifference in the duration of the reactions.

Scoring Windows

For all channels, the response is notscored unless it begins after question onset.However, the minimum latency for a responsevaries depending on the physiologicalmeasure. Respiratory and cardiograph reac-tions may be scored if they begin immediatelyafter question onset. Electrodermal reactionsare scored only if they begin at least 0.5seconds after question onset, and finger pulsereactions are scored only if they begin at least2 seconds after question onset. If a reactionbegins prior to the minimum latency, a pointof inflection or clear increase in slope thatoccurs after the minimum latency may beconsidered the beginning of the reaction. Forall physiological measures, the reaction mustbegin no later than 5 seconds after thesubject's answer, unless the subjectcharacteristically has reactions that begin 5 to8 seconds after answering. Such delayedreactions should be scored conservatively.Reactions that begin outside these scoringwindows are not scored. The duration of areaction that begins within the scoring windowmay be considered up to 20 seconds followingquestion onset.

Distributions of Numerical Scores

We measured the frequency with whichwe observed physiological changes that metthe criteria for 'noticeable', 'clear', and'dramatic' differences in a random sample of25 innocent and 25 guilty subjects whoparticipated in a previous experiment (Kircher& Raskin, 1988). Guilty subjects hadcommitted a mock theft, innocent subject didnot commit the theft, and all subjects hadbeen promised a substantial monetary bonusif they could convince the polygraph examinerthat they were innocent of the crime. The

Page 7: VOLUME 28 NUMBER 1

Utah Numerical Scoring System

Polygraph, 1999, 28 (1) 6

charts were scored by the second author(DCR) who had no contact with the subjectsand was unaware of the subjects' guilt orinnocence. The overall accuracy of decisionswas 95% for guilty subjects and 96% forinnocent subjects. For each physiologicalcomponent, each of three relevant questionswas scored against the probable-lie

comparison question that immediatelypreceded it on each of three charts. Thisprovided a sample of 450 numerical scores foreach physiological component (3 relevantquestions X 3 charts X 50 subjects). Theabsolute values of these scores are presentedin Figure 1 for each physiological measure.

Figure 1.Distributions of numerical scores for respiration, electrodermal, cardiograph, and

finger plethysmograph.

Electrodermal (N=450)

0 +/-1 +/-2 +/-30

20

40

60

80

Numerical Scores for Individual Comparisons

Pe

rce

nt

Cardiograph (N=450)

0 +/-1 +/-2 +/-30

20

40

60

80

Numerical Scores for Individual Comparisons

Pe

rce

nt

Plethysmograph (N=450)

0 +/-1 +/-2 +/-30

20

40

60

80

Numerical Scores for Individual Comparisons

Pe

rce

nt

Respiration (N=450)

0 +/-1 +/-2 +/-30

20

40

60

80

Numerical Scores for Individual Comparisons

Perc

en

t

Page 8: VOLUME 28 NUMBER 1

Bell, Raskin, Honts & Kircher

Polygraph, 1999, 28 (1) 7

As shown in Figure 1, numerical scoresof '0' were assigned considerably more oftenthan any other value. It also may be seen thatthe frequency with which different numericalscores were assigned depended on thephysiological measure. More than 70% of thenumerical scores were '0' for the respirationand plethysmograph channels, whereasapproximately 50% of the scores were '0' forskin conductance and cardiograph channels.These results suggest that, on average,decisions will be based largely on thenumerical scores assigned to the electrodermaland cardiograph channels.

Although electrodermal and cardio-graph channels had about the same numberof scorable (nonzero) differences, numericalscores of 2 or 3 were more common for theelectrodermal channel. Since more scores of 2and 3 were assigned to the electrodermalchannel, it had more influence than thecardiograph on the total numerical scores andthe outcomes of the polygraph tests. Together,the data indicate that the Utah scoring rulesgive greater weight to electrodermal reactionsthan to cardiovascular, respiration, or plethys-mograph reactions. The relative weights giventhe four physiological measures by the Utahscoring rules are remarkably consistent withthe optimal combinations of weighted physio-logical measures that are generated by ourComputerized Polygraph System (CPS) (Kircher& Raskin, 1991).

It should be noted that scores of 3 wereextremely rare. The rules allow for theassignment of scores of 3 to any channel, butin this sample of 450 comparisons, not asingle score of 3 was assigned to respiration,cardiograph, or plethysmograph channels.

Artifacts

The quality of the tracing is consideredwhen assigning scores. Artifacts, such as deepbreaths, coughs, movements, and physio-logical abnormalities, affect how scores areassigned. To minimize the occurrence ofartifacts, we instruct examinees to avoidmovements and to breathe normally while therecordings are made. Nevertheless, artifactsmay render the tracings unscorable.

If a deep breath occurs shortly beforequestion onset, respiration should not bescored. If the deep breath is accompanied byphysiological changes in other channels, theother channels may or may not be scored. Ifthe reaction in the other channel began beforethe deep breath, then the portion precedingthe deep breath may be used in scoring if it islarger than the reaction to the question towhich it is compared. If it is smaller and is to acomparison question, then another comp-arison question may be used. The evaluatorshould also examine all of the charts for thesubject and locate any other places where adeep breath occurred, especially at pointswhere no question had been asked. If there isa similar physiological change at this point,then the reaction following the deep breathmust not be used for scoring. If there is noreaction following the deep breath, then thereaction may be scored, but it should bescored conservatively.

If movements distort more than twosuccessive pulses in the cardiovascularchannels after question onset, the cardio-vascular changes that occur after the move-ment should not be scored. If there is areaction that precedes the artifact, it may beused for scoring if it is larger than the reactionto which it is compared. However, if only oneor two pulses are distorted, it is usuallypossible to visually interpolate across theartifact and infer what the reaction would havebeen if the movement had not occurred. Ifmultiple artifacts occur within the scoringwindow, it is usually not possible to score theresponse.

Physiological abnormalities, such aspremature ventricular contractions (PVCs),may also render the cardiovascular reactionunscorable if they occur in the scoringwindow. PVCs are contractions of the leftventricle that occur before the left atrium hascontracted and filled the left ventricle, causingvery little blood to be pumped into the aorta.This is followed by a relatively long pausebefore the next ventricular contraction. Duringthis pause, the drop in blood pressureproduces a distinct downward deflection in thecardiovascular tracing. If two or three PVCsoccur within the scoring window, the signal isusually so distorted that it is not possible toscore it. However, a cardiovascular reaction

Page 9: VOLUME 28 NUMBER 1

Utah Numerical Scoring System

Polygraph, 1999, 28 (1) 8

that occurred before the PVC may be scored. Itis usually also possible to score the reaction ifit contains only one PVC, although thesubsequent rise in the tracing that is therecovery from the PVC should not be scored asa reaction.

Limitations

The research that supports the use ofthe Utah system of numerical scoring hasbeen limited to specific-incident examinations.It has not been validated for employmentscreening or periodic testing of employees withaccess to sensitive information. Furthermore,most of our research has focused on theprobable-lie test. Use of the numerical scoringsystem with the directed-lie has also beenvalidated (Honts & Raskin, 1988; Horowitz etal., 1997), but the relevant-irrelevant and

other types of tests have received almost noattention.

Most of the laboratory research withthe Utah system has used a single-issue testthat contains three repetitions of neutral,comparison, and relevant questions in thequestion sequence. Other question sequencesor mixed-issue tests have not been testedextensively in our laboratory, although twofield studies included at least one numericalevaluator who used the Utah system with theModified General Question Test format(Raskin, 1976; Raskin et. al., 1988), and theaccuracy of those decisions was comparable tothose we have observed in our laboratoryexperiments. Therefore, there is evidence thatthe validity of the Utah scoring techniquegeneralizes across similar test formats.

References

American Psychological Association (1985). Standards for educational and psychological testing.Washington, D.C.: American Psychological Association.

Backster, C. (1969). Technique fundamentals of the tri-zone polygraph test. New York: BacksterResearch Foundation.

Honts, C.R., & Raskin, D.C. (1988). A field study of the validity of the directed lie control question.Journal of Police Science and Administration, 16, 56-61.

Honts, C. R., Raskin, D. C. & Kircher, J. C. (1994). Mental and physical countermeasures reducethe accuracy of polygraph tests. Journal of Applied Psychology, 79, 252-259.

Horowitz, S. W., Kircher, J. C., Honts, C. R., & Raskin, D. C. (1997). The role of comparisonquestions in physiological detection of deception. Psychophysiology, 34, 108-115.

Kircher, J. C. & Raskin, D. C. (1988). Human versus computerized evaluations of polygraph datain a laboratory setting. Journal of Applied Psychology, 73, 291-302.

Kircher, J. C. & Raskin, D. C. (1991-98). Computerized Polygraph System (CPS) Versions 1.00-2.20. Scientific Assessment Technologies, Inc., 2532 Chadwick Street, Salt Lake City, UT 84106

Podlesny, J. A. & Raskin, D. C. (1978). Effectiveness of techniques and physiological measures inthe detection of deception. Psychophysiology, 15, 344-359.

Raskin, D. C., (1976). Reliability of chart interpretation and sources of errors in polygraphexaminations. Report to the National Institute of Law Enforcement and Criminal Justice(Contract 75-NI-99-0001). Salt Lake City: University of Utah, Department of Psychology.

Page 10: VOLUME 28 NUMBER 1

Bell, Raskin, Honts & Kircher

Polygraph, 1999, 28 (1) 9

Raskin, D. C. & Hare, R. D. (1978). Psychopathy and detection of deception in a prison population.Psychophysiology, 15, 126-136.

Raskin, D. C. (1986). The polygraph in 1986: Scientific, political, and legal issues surroundingapplication and acceptance of polygraph evidence. Utah Law Review, 1986, 29-74.

Raskin, D. C., Kircher, J. C., Honts, C. R., & Horowitz, S. W. (1988). A study of the validity ofpolygraph examinations in criminal investigation. Final report to the National Institute ofJustice (Grant No. 85-IJ-CX-0040). Salt Lake City: University of Utah, Department ofPsychology.

Rovner, L. I., Raskin, D.C., & Kircher, J. C. (1979). Effects of information and practice on detectionof deception. Psychophysiology, 16, 197-198. (Abstract).

Weaver, R. S. (1980). The numerical evaluation of polygraph charts: Evolution and comparison ofthree major systems. Polygraph, 9, 94-108.

Weaver, R. S. (1985). Effects of differing numerical chart evaluation systems on polygraphexamination results. Polygraph, 14, 34-41.

Page 11: VOLUME 28 NUMBER 1
Page 12: VOLUME 28 NUMBER 1

Table 2. Validity of the Utah System of Numerical Scoring in Laboratory Studies

Guilty Innocent

Study N Correct Incorrect Inconclusive %Correct*

N Correct Incorrect Inconclusive %Correct*

Raskin & Hare (1978) 24 88% 0% 12% 100% 24 75% 4% 21% 95%

Podlesny & Raskin (1978) 20 70% 15% 15% 82% 20 90% 5% 5% 95%

Rovner et al. (1979)a 24 88% 0% 12% 100% 24 88% 8% 4% 92%

Kircher & Raskin (1988) 50 88% 6% 6% 94% 50 86% 6% 8% 93%

Honts et al. (1994) 20 70% 20% 10% 78% 20 75% 10% 15% 88%

Horowitz et al. (1997)b 15 53% 20% 27% 73% 15 80% 13% 7% 86%

* The percent correct was calculated by dividing the number correct by the sum of the number correct and the number incorrect.

a Excludes 24 countermeasure-trained subjects.b Excludes 90 subjects given relevant-irrelevant and directed lie tests.

Page 13: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 10

Manually Scoring Polygraph Charts Utilizing the Seven-PositionNumerical Analysis Scale at the Department of Defense Polygraph

Institute

Jimmie Swinford

This documentation sets forthinformation on how the seven-positionnumerical analysis scale is taught/utilized atthe Department of Defense Polygraph Institute(DoDPI). It lists the criteria currently used atDoDPI for manually scoring all threephysiological parameters recorded onpolygraph charts during the conduct of apsychophysiological detection of deception(PDD) examination. Finally, this documentidentifies the DoDPI scoring procedures andmethods utilized for assigning values, rangingfrom +1, +2, +3 and 0 to -1, -2 and -3, for therespiratory, electrodermal activity andcardiovascular tracings of a comparisonquestion test (CQT) format.

Background Information

During the manual scoring process oftest data analysis, the PDD examinercompletes several steps before actuallyassigning values to recorded physiologicalresponses. Initially, the examiner reviews thephysiological data on the charts in an attemptto determine what an examinee's physiologicaltracing looks like while in a state ofhomeostasis or equilibrium. During thisprocess, the PDD examiner also determines ifthere is any unwanted noise on the signal ofinterest when a scoreable question(comparison or relevant) was asked on thechart. If unwanted noise on the signal ofinterest is present, then a portion, or all, of thephysiological activity recorded during thatquestion may be unusable during the scoringprocess. However, just because unwantednoise appears in one of the physiologicalparameters does not mean that the otherparameters in that analysis spot cannot bescored. For example, if an artifact appears inthe respiratory tracings, they are not scored;

but the electrodermal and cardiovasculartracings may be scored if they have not beenaffected by the artifacted respiratory tracings.However, if all of the recorded physiologicalparameters appear to have been affected byunwanted noise, that particular questioncannot be utilized during the scoring process.

Next, the examiner scores the chartsusing the appropriate analysis spots for thatparticular CQT format. After eliminating chartexcerpts containing artifacts, recovery, andother unwanted noise, the examiner comparesresponses of the relevant question(s) toresponses of the applicable comparisonquestion(s) by individual physiologicalrecorded parameters. Responses are onlyscored when there is no unwanted noise onthe signal of interest at the time the stimuluswas presented, and if the responses beganwithin the response onset window (withlatency exceptions). Depending on the type ofresponse, respiratory responses end whenrecovery starts, the tracing returns to theprestimulus baseline or homeostasis returns.Electrodermal and cardiovascular responsesend when the tracing either: (1) returns tothe prestimulus baseline or (2) stabilizes at anew tonic level.

During the scoring process,physiological tracings in an analysis spot withno responses or comparable responses areassigned a value of "0". If there is any form ofunwanted noise on the signal of interest thatprevents that signal from being scored, a "("(zero with a line through it) is placed on thescore sheet. In the seven-position scale,magnitudes of plus and minus values, rangingfrom a +1, +2, or +3 to a -1, -2 or -3, areawarded according to specific analysisprocesses for each physiological activity.

The author is an instructor for the Department of Defense Polygraph Institute.

Page 14: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 11

After values are assigned for each analysisspot for all the PDD charts collected for anexamination, these values are tallied anddecisions are rendered according to the

specific cut-off scores necessary for aparticular CQT format. Depending on the typeof CQT or PDD examination, the examiner mayrender the following decisions:

Specific Issue CQT Screening CQT

1. No Opinion No Opinion

2. No Deception Indicated (NDI) No Significant Responses (NSR)

3. Deception Indicated (DI) Significant Responses (SR)

Types of Physiological CriteriaUtilized in the Scoring Process

Respiratory Tracing:

This is the display of physiologicalactivity indicative of an examinee's breathingpattern that is recorded by a pneumographcomponent. The respiratory tracing consists ofinhalation and exhalation cycles. Anexaminee's breathing pattern and rate mayvary due to their physical conditioning.Normally, during the data collection phase, theexaminer will attach two recording sensors(pneumograph chest assembly) to theexaminee via some type of device. Typically,the pneumograph chest assembly consists of a

convoluted tube, return mechanism, anti-rollbars, beaded chain or velcro strips and rubbertubing for connection to the computer sensorbox or analog instrument. One pneumographchest assembly will be placed around theexaminee's upper body area to record thethoracic breathing pattern. A secondpneumograph chest assembly will be placedaround the lower abdomen area to record theabdominal breathing pattern. When scoringthe respiratory tracings, the PDD examinerwill encounter the following five maincategories of responses: (1) Changes in rate,(2) Changes in amplitude, (3) Change ofbaseline, (4) Loss of baseline and (5) Apnea.These five main categories of responses consistof the following 12 scoreable criteria:

1

R E S P I R A T I O N R A T E C H A N G E S

C r i t e r i a 1 : R e s p i r a t i o n R a t e D e c r e a s e

l l - 4

Page 15: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 12

2

R E S P I R A T I O N R A T E C H A N G E S

l l - 5

C r i t e r i a 2 : R e s p i r a t i o n R a t e I n c r e a s e

3

R E S P I R A T I O N R A T E C H A N G E S

C r i t e r i a 3 : R e s p i r a t i o n I n h a l a t i o n /E x h a l a t i o n R a t i o C h a n g e

l l - 3

4

R E S P I R A T I O N A M P L I T U D E C H A N G E S

l l - 7

C r i t e r i a 4 : R e s p i r a t i o n A m p l i t u d e I n c r e a s e

Page 16: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 13

5

R E S P I R A T I O N A M P L I T U D E C H A N G E S

C r i t e r i a 5 : R e s p i r a t i o n A m p l i t u d eD e c r e a s e / S u p p r e s s i o n

1 1 - 5

6

R E S P I R A T I O N A M P L I T U D E C H A N G E S

C r i t e r i a 6 : P r o g r e s s i v e I n c r e a s eF o l l o w e d b y A D e c r e a s e

l l - 3

7

R E S P I R A T I O N A M P L I T U D E C H A N G E S

C r i t e r i a 7 : P r o g r e s s i v e I n c r e a s e a n dR e t u r n t o H o m e o s t a s i s

l l - 5

Page 17: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 14

8

R E S P I R A T I O N A M P L I T U D E C H A N G E S

C r i t e r i a 8 : P r o g r e s s i v e D e c r e a s e a n d

R e t u r n t o H o m e o s t a s i s

l l - 5

9

R E S P I R A T I O N B A S E L I N E C H A N G E

l l - 7

l l - 5

C r i t e r i a 9 : R e s p i r a t i o n B a s e l i n e C h a n g e ( T e m p o r a r y )

1 0

R E S P I R A T I O N B A S E L I N E L O S S

l l - 4

l l - 4

C r i t e r i a 1 0 : R e s p i r a t i o n B a s e l i n e L o s s ( P e r m a n e n t )

Page 18: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 15

1 1

R E S P I R A T I O N A P N E A

l l - 5

l l - 7

C r i t e r i a 1 1 : H o l d i n g

C r i t e r i a 1 2 : B l o c k i n g

Scoring the Respiratory Tracing

During the process of scoring therespiratory tracings, the examiner mustaccomplish several steps before being able toscore physiological data. First, the examinermust decide if a physiological change(response) has occurred in a timely manner.To be considered timely, a response shouldoccur within the response onset window(typically from stimulus onset until theexaminee's answer or first completerespiratory cycle after the answer to a reviewedtest question) (latency exceptions excluded).Even if a physiological change has occurred ina timely manner, the examiner must alsodecide if there was any type of unwanted noise(artifact, recovery, etc.) on the signal ofinterest at the time the stimulus (question)was applied. Finally, the examiner must thendetermine what constitutes a response to thestimulus and when compensatory action(recovery) begins for this response. Inrespiratory tracings, if there is any type ofresponse to a presented stimulus, there willgenerally be some form of recovery

(compensatory action) that will occur before anexaminee's respiratory pattern returns toequilibrium or homeostasis. In scoringrespiratory tracings, this is one of the moredifficult determinations for a basic examinerstudent to make. It is imperative to makethese distinctions as response can only bescored against response. An examiner cannotscore response against any form of unwantednoise (i.e., artifact or recovery) or vice versa.

After making the above determinations,the examiner will award equal value fordifferent response criteria. For example, if acomparison question has a change of baselineresponse and the relevant question has adecrease in amplitude response, then equalvalue will be assigned for these criteria in thescoring process. In this particular situation, ascore of "0" would be appropriate unless oneresponse lasted longer (more duration).Typically, in the seven-position scale, theexaminer will utilize the following guidelinesbased on the number of observed physiologicalresponse criteria:

Page 19: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 16

Question Type

Comparison Relevant Assigned Value

0 criteria 0 criteria 0

1 criterion 1 criterion 0 (unless duration is a factor)

1 criterion 0 criteria +1

1 criterion multiple criteria - 1

Multiple criteria 0 criteria +2

0 criteria multiple criteria - 2

0 criteria dramatically better - 3

In deciding whether a physiologicalresponse for one of the comparative questionshas multiple criteria present, the examinercannot decide that a specific response hasmultiple criteria if the observed responsewould automatically cause another criterion tobe present. For example, an increase ordecrease in amplitude or apnea will generally

cause a change in rate to occur. Even thoughthere may appear to be multiple criteriapresent, it cannot be considered multiplecriteria for scoring purposes. Typically, thefollowing singular criterion will constitutemultiple criteria when observed in respiratorytracings:

Singular Criterion Multiple Criteria

Amplitude and/or Rate Changes + Apnea or baseline changes

Change/Loss of Baseline + Apnea or amplitude/rate change

Apnea + Change or loss of baseline

In the seven-position scale, durationwill generally allow assigning a value of only+/-1 (no more). Typically, duration is a factoronly in those situations where one singularresponse criterion is compared againstanother singular response criterion. In theseinstances, if one response lasts longer (moreduration), then a value of +/- 1 may beassigned for duration. However, if onecomparative question has a singular responsewhile the other comparative response exhibits

multiple criteria, then duration is generally nota factor. As a rule, multiple response criteriawill cause more response duration than asingular criterion.

In scoring the respiratory tracings, theexaminer will rarely assign values higher thana "0" or +/-1. Occasionally, an examiner mayassign a value of +/-2 based on physiologicalcriteria observed in the comparative questions.Rarely, examiners will assign a value of +/-3.

Page 20: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 17

To assign a +/-3 value, the physiologicalresponse in one question must be sodramatically better (multiple criteria for anextended period) than the response it is beingcompared against.

Electrodermal Activity (EDA) Tracing

The EDA tracing is the display of anexaminee's physiological patterns of eitherskin resistance or skin conductance of anexosomatic recording obtained with agalvanograph component. During the datacollection phase, the examiner will attach asensor to the examinee called the EDAfingerplate electrode assembly. Normally, the

fingerplate electrode assembly consists of twostainless steel plates, with velcro straps, andshielded cable for connection to the computersensor box or analog instrument. Ideally, thefingerplates will be placed on two fingers of anexaminee's non-dominant hand. During thedata collection phase, once the sensor isattached to an examinee, an external(exosomatic) electrical signal is applied to theexaminee's skin. The amount of resistance(skin resistance) that is encountered when thesignal is applied or how freely the electricalsignal travels (skin conductance) is recordedby the galvanograph component. Thefollowing criteria will be utilized when scoringan EDA tracing:

1 2

E L E C T R O D E R M A L T R A C I N G

l l - 4 l l - 5 l l - 6

C r i t e r i a 1 : A m p l i t u d e C h a n g e

1 3

E L E C T R O D E R M A L T R A C I N G

C r i t e r i a 2 : C o m p l e x R e s p o n s e

l l - 5 l l - 6

Page 21: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 18

1 4

E L E C T R O D E R M A L T R A C I N G

C r i t e r i a 3 : R e s p o n s e D u r a t i o n

< - - - - - - - - - - - - - - - - - - - - - - - - >l l - 5

Scoring the EDA Tracing

In the seven-position numericalanalysis scale, assigned EDA values are basedon a ratio scale. This is the only physiologicalparameter where a ratio method is used inscoring the responses. Because the EDAcomponent is generally the most responsive ofall the physiological parameters recorded bythe polygraph, the unit of measurement fordetermining ratio is generally a vertical chartdivision. If the compared responses are likeresponses (i. e., both amplitude or bothcomplex) and they are equal in amplitude andduration, then a score of "0" is assigned andthe ratio method is unnecessary. However, ifone of the compared responses hassignificantly more amplitude than the other,the examiner will utilize the ratio method inassigning values. To determine the ratio forappropriate responses, the smaller response isdivided into the larger response (i. e., six chartdivisions divided by two chart divisions wouldbe a ratio of 3:1). Once the ratio of responsesis determined, then values are assigned basedon the following ratio formula:

RATIO SCORE

4:1 +3/-3

3:1 +2/-2

2:1 +1/-1

1:1 0

As an example, if there is a comparisonquestion amplitude response of three chartdivisions and a relevant question amplituderesponse of seven chart divisions, then theratio would be 2.5:1 (7 divided by 3 equals2.5). To determine the value for this responseratio, the examiner would utilize the ratioscale. In this case, the value would be a -1since the larger response occurred in therelevant question. When using the ratio scale,the response must be at or above theappropriate ratio level to assign a specificvalue, (i. e., to assign a value of +/- 2, theresponse ratio must be at the 3:1 ratio level,etc.) In the above example, the response ratiowas 2.5:1. Accordingly, since it was not at the3:1 ratio level, the assigned score is a -1.

Vs (Comparison) (Relevant) 3 Chart Divisions 7 Chart Divisions =

Ratio of 2.5:1 = Score of -1

Page 22: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 19

There is one exception to the above ruleand that is when the ratio is less than 2:1.For instance, if there is a comparison questionamplitude response of three chart divisionsand a relevant question amplitude response oftwo chart divisions, the ratio is 1.5:1. If theratio method were utilized, the assigned valuewould be a "0" since the ratio is less than 2:1.However, since the comparison response is 1.5times larger in amplitude than the relevantquestion response, it cannot be ignored. As

such, when the ratio is less than 2:1, theconcept of "bigger is better" applies for theinitial +1/-1 . In this instance, a score of +1would be assigned, since the comparisonquestion response is 1.5 times larger inamplitude than the relevant question. The"bigger is better" concept applies only to theinitial +1 or -1. Once the compared responsesreach a ratio of 2:1, this concept no longerapplies.

Vs

(Comparison) (Relevant) 3 Chart Divisions 2 Chart Divisions

Ratio of 1.5:1 = Score of +1 (Bigger is Better)

As indicated, all of the above conceptsapply to "like responses", i. e., amplitudeversus amplitude or complexity versuscomplexity. It would be great if examineesprovided "like responses" all the time.However, as we all know, this does not alwaysoccur. As such, there must be provisions forcomparing "unlike responses" (amplitudeversus complexity) using the ratio method. Incertain instances, complexity will allowassigning a value of +1 or -1 (no more).

When comparing an amplituderesponse to a complex response, complexitywill only be considered when both responsesare equal in amplitude or the ratio is less than2:1. Once the ratio factor reaches a scale of2:1, complexity is no longer a considerationand values are assigned for amplitude ratio.As an example, when there is a comparisonquestion with an amplitude response of sixchart divisions and a relevant question with acomplex response of six chart divisions, theassigned value would be a -1.

Vs

(Comparison) (Relevant) 6 Chart Divisions 6 Chart Divisions

Score = -1

Page 23: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 20

If the unlike compared responses are less thana 2:1 ratio in amplitude and the smallerresponse is complex, then the value will be a

"0". If the larger response is complex, thencomplexity is no longer a consideration. Forexample:

Vs Vs

(Comparison) (Relevant) (Comparison) (Relevant)3 Chart Divisions 2 Chart Divisions 3 Chart Divisions 2 Chart Divisions

Score = 0 Score = +1

Once the comparative ratio reaches ascale of 2:1, complexity is no longer a factorfor consideration. Values are then assigned

for amplitude responses according to the ratioscale. For example:

Vs

(Comparison) (Relevant) 4 Chart Divisions 2 Chart Divisions

Ratio of 2:1 = Score of +1

Other EDA Scoring Considerations:

Duration in "like responses" will allowassigning a value of +1 or -1 only (no more).Duration is not a consideration in "unlikeresponses" (i. e., amplitude versus complexity)as complexity will generally have more

duration just by the nature of the type ofresponse. Therefore, in assigning a value forduration, the responses must be similarresponses (amplitude versus amplitude orcomplexity versus complexity) and they mustbe equal or equivalent in amplitude:

Vs or Vs (Comparison) (Relevant) (Comparison) (Relevant) 2 Chart Divisions 2 Chart Divisions 2 Chart Divisions 1.5 Chart Divisions

Score = -1 (Duration) Score = 0

Something Against Nothing Concept

If one of the questions (comparison orrelevant) has a physiological response, whilethe other question in the same analysis spothas no response (preventing utilization of theratio method), then the unit of measurement(chart division) is utilized for assigning values.

By accomplishing this, an examiner canalways be consistent in assigning values forthe responses being scored. For instance, ifthere are no responses in the comparisonquestion, while the relevant question has twochart divisions of amplitude response, a valueof -2 may be assigned (since the unit ofmeasure is a chart division). Likewise, if the

Page 24: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 21

reverse were true, then a value of +2 wouldhave to be assigned to be consistent in

applying this principle.

Vs (Comparison) (Relevant) (Nothing) 2 Chart Divisions

Score = -2

Cardiovascular Tracing

This tracing displays the physiologicalpatterns of an examinee's relative bloodvolume and pulse rate that are recorded by acardiograph component. The contraction andrelaxation of an examinee's heart will causethe polygraph system (analog or computerized)to record the systolic stroke (heartcontraction), diastolic stroke (relaxation periodof the heart) and a dicrotic notch, whichappears during the diastolic stroke of theheart. During the data collection phase, thesensor attached to the examiner will normallybe a cardiovascular blood pressure cuff

assembly. Usually, this sensor consists of arubber bladder, covered with a cloth sleeveand tightening component (velcro wrap), pumpbulb assembly which includes asphygmomanometer and associated rubbertubing for connecting the sensor to thecomputer sensor box or analog instrument.When scoring the cardiovascular tracing, theexaminer will normally encounter four maincategories of response. These are: (1)Changes in baseline, (2) Changes inamplitude, (3) Changes in rate and (4)Premature ventricular contractions. Thesefour main categories of response consist of thefollowing eight scoreable criteria:

1 5

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 1 : P h a s i c I n c r e a s e a n dD e c r e a s e i n B a s e l i n e

l l - 8

1 6

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 2 : T o n i c I n c r e a s e i n B a s e l i n e

l l - 9

Page 25: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 22

1 7

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 3 : T o n i c D e c r e a s e i n B a s e l i n e

l l - 7

1 8

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 4 : I n c r e a s e i n A m p l i t u d e

l l - 8

1 9

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 5 : D e c r e a s e i n A m p l i t u d e

l l - 7

Page 26: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 23

2 0

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 6 : I n c r e a s e i n R a t e

l l - 8

2 1

C A R D I O V A S C U L A R T R A C I N G

C r i t e r i a 7 : D e c r e a s e i n R a t e

l l - 7

2 2

C A R D I O V A S C U L A R T R A C I N G

l l - 5 l l - 6l l - 4

l l - 7 l l - 8 l l - 9

C r i t e r i a 8 : P r e m a t u r e V e n t r i c u l a r C o n t r a c t i o n s ( P V C s )

Page 27: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 24

Scoring the Cardiovascular Tracing

The primary criteria used in scoringthe cardiovascular tracing will be phasicincreases and returns to baseline (baselinearousal) (criterion #1). Generally, the mostcommon physiological response observedduring a conventional PDD examination is anincrease (arousal) from the baseline level-usually beginning at or near stimulus onsetand lasting for a few seconds with an eventualreturn to the prestimulus tonic level.However, for some examinees, this responsemay last up to 30 seconds or more. If abaseline arousal response returns to theprestimulus level, it is considered a phasicresponse. If there is baseline arousal withoutreturning to the prestimulus level, it isconsidered a tonic response. If there arephysiological changes following both thecomparison and relevant questions within ananalysis spot, greater weight is assigned to thequestion that evoked the greater change eitherin the amount of (degree) or duration (length)of baseline change.

Since one of the major factors inproperly using the seven-position scale isconsistency in applying the scoring principles,the unit of measurement for scoring a phasicor tonic response is generally a vertical chartdivision. In assigning values, any amount ofvisually discernible baseline arousal isawarded a value of +/- 1. To reach the +/- 2level, the cardiovascular response must be atleast two complete vertical chart divisionsmore than its comparative response. Likewise,for a +/- 3, the response must be at least threevertical chart divisions more than itscomparative response.

In scoring a phasic response, durationof baseline arousal must be a consideration.Generally, duration of a cardiovascularresponse will allow assigning a value of +/- 1only (no more). If one response has slightlymore baseline arousal, while the otherresponse has slightly less arousal, but withmore duration, then a score of "0" would beappropriate. If both compared responses havethe same amount of baseline arousal, but oneresponse has more visually discernibleduration, then a value of +/-1 is given to theresponse having more duration. Once a scoreof +/- 1 is assigned for baseline arousal in the

cardiovascular tracing, then duration is nolonger a consideration.

Other Considerations in Scoring theCardiovascular Tracing

If a score is given for a phasic changeof baseline response, changes in amplitude,pulse rate and premature ventricularcontractions in the cardiovascular tracing aregenerally not scored. Generally, these criteriaare scored only when there is no baselinearousal of the cardiovascular tracing. Whenscored, values of +/- 1 (no more) are awardedfor changes in amplitude, pulse rate and/orpremature ventricular contractions in thecardiovascular. Likewise, a value of +/- 1 (nomore) is awarded for a cardiovascular tracinghaving a tonic response (increase and/ordecrease in baseline). Generally, in thecomparison process, cardiovascular responseshaving unlike attributes (tonic against phasic;phasic response against a change in tracingamplitude with no baseline arousal; etc.) willresult in a value of "0".

If both comparative responses haveequal degrees and duration of baselinearousal, speed of arousal of the cardiovasculartracing from the baseline (if visuallydiscernible) may allow assigning a value of +/-1 (no more).

If a consistent response is exhibited toa particular question or a category ofquestions-comparisons only or relevants only-throughout the entire PDD examination,premature ventricular contractions may bescored. If consistency is established, a valueof +/- 1 (no more) may be awarded forpremature ventricular contractions; however,this cardiovascular criterion is seldom scoredas consistency to a particular question or classof questions is rarely established.

Glossary of Terms

Analysis Spot - The relevant and comparisonquestion(s) that are actually evaluated duringspot analysis. The number of appropriatecomparison question(s) for each relevantquestion will vary depending on test formatused [i.e., Test for Espionage and Sabotage(TES) format, Zone Comparison Test (ZCT),

Page 28: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 25

and Modified General Question Test (MGQT)].Regardless of the test format, each relevantquestion is always compared to the mostappropriate comparison question on a tracingby tracing basis. If the test format allows arelevant question to be compared to more thanone comparison question, then thecomparison question with the greater responsefor that physiological tracing is used forcomparison purposes.

Artifact - A change in an examinee'sphysiological pattern (activity) that is notattributable to a reviewed test question(stimulus) or recovery.

Cardiovascular Tracing - A display ofphysiological patterns of an examinee's relativeblood volume and pulse rate that are recordedby a cardiograph component. The contractionand relaxation of an examinee's heart willcause the polygraph to record the systolicstroke (heart contraction), diastolic stroke(relaxation period of the heart) and a dicroticnotch, which appears during the diastolicstroke of the heart. The criteria used toevaluate this physiological tracing are changesin baseline, changes in amplitude and changesin rate.

Comparison Question - A question that isdesigned to produce a physiological response.During spot analysis, the physiologicalresponses of comparison questions arecompared to the physiological responses ofrelevant questions.

Electrodermal Activity (EDA) Tracing - Thedisplay of physiological patterns of either skinresistance or skin conductance obtainedthrough exosomatic recording with agalvanograph component. When evaluatingthis component tracing, the criteria consideredare changes in amplitude, complexity ofresponse and duration of response.

EDA Recovery Phase - The physiologicalactivity displayed in an EDA tracing thatoccurs between the highest peak andsubsequent return to the prestimulus or newlyestablished baseline. The EDA recovery phasebegins once the tracing has reached its highestpeak.

EDA Rise Time - The physiological activitydisplayed in an EDA tracing beginning withresponse onset and ending at the peak.

Homeostasis - A complex interactiveregulatory system by which the body strives tomaintain a state of internal equilibrium.During test data analysis, the examiner looksat the physiological tracings to ensure that theexaminee is in a state of homeostasis before ascoreable test question is presented. If anexaminee's physiological activity is not in astate of homeostasis (i.e., there is noise on thesignal of interest) when a scoreable question ispresented, then subsequent physiologicalactivity should not be considered a response tothat stimulus and cannot be scored.

Psychophysiological Detection of Deception(PDD) Chart - A graphic representationcontaining selected physiological datagenerated by an examinee during the datacollection phase of a PDD examination.

PDD Examination - A process thatencompasses all activities that take placebetween a PDD examiner and an examineeduring a specific series of interactions. Theseinteractions may include the pretest interview,use of a polygraph to collect physiological datafrom an examinee while presenting a series oftests (data collection phase), test data analysisphase, and the post-test interview phase,which may include interrogation of theexaminee.

PDD Examiner - Someone who hassuccessfully completed formal education andtraining in conducting PDD examinations andis certified by his or her agency to conductsuch examinations.

PDD Series - Collection of PDD charts bypresentation of reviewed test questions to anexaminee the number of times required by aparticular PDD testing format. A PDDexamination may consist of any number ofPDD series.

PDD Test Data - The signal of interest thatmay consist of artifact(s), recovery, other noiseor examinee physiological response(s) tostimuli.

Page 29: VOLUME 28 NUMBER 1

Swinford

Polygraph, 1999, 28 (1) 26

PDD Test Data Analysis - Analysis of thepsychophysiological responses recorded on thePDD chart(s). For scoring purposes, only datathat are timely with an applied stimulus(reviewed test question) and free of artifactsand noise on the signal of interest can beconsidered.

Recovery (Returning to Homeostasis) - Adeviation in a PDD tracing attributable to aphysiological phenomenon occurring as acompensatory action after a response or anartifact.

Relevant Question - A question that pertainsdirectly to the matter under investigation or tothe issue(s) for which the examinee is beingtested.

Respiratory Tracing - The display ofphysiological patterns indicative of anexaminee's breathing activity as recorded bythe pneumograph component. The respiratorytracing consists of inhalation and exhalationstrokes. An examinee's breathing pattern andrate may vary due to their physicalconditioning. Evaluation criteria consideredduring the scoring process are changes inamplitude, apnea, changes in rate, changes inbaseline, and loss of baseline.

Response - A physiological change that occursfollowing, and is attributable to, thepresentation of an applied stimulus (i.e.,reviewed test question). Responses areevaluated when they occur within theresponse onset window (latency exceptions)and there is no noise on the signal of interestat the time the stimulus is presented. Aphasic response is a discrete (known origin)response to a specific stimulus that isgenerally seen as an upward movement fromthe baseline with subsequent return to theprestimulus (original) baseline. A tonicresponse is a discrete (known origin) responseto a specific stimulus that is generally seen asa movement from the prestimulus baselineand establishment of a new baseline withoutreturning to the prestimulus baseline.

Response Amplitude - The displayedphysiological activity reflected in a PDDtracing occurring between response onset andresponse peak (highest level from prestimulusbaseline).

Response Duration - The physiologicalactivity (time) displayed between responseonset and offset. Typically, this is the timefrom response onset until return to theprestimulus baseline (phasic response) or anewly established baseline (tonic response).

Response Latency - The time betweenstimulus onset and response onset.

Response Onset - The first indication ofchange from the prestimulus level ofphysiological activity to an applied stimulus(reviewed test question). To be utilized duringtest data analysis, unless latency is involved,response onset must occur within theresponse onset window to an applied stimulus(reviewed test question).

Response Onset Window - The period of timebetween stimulus onset (verbal) and anexaminee's verbal response to that stimulus(assuming an examinee's verbal responseoccurs in a timely manner). Typically, to beconsidered during test data analysis, anexaminee's physiological responses shouldoccur during that period. However, if anexaminee consistently exhibits responselatencies that are outside of this responseonset window, the response onset window maybe increased to include an examinee'sconsistent late responding.

Spot Analysis Concept - The procedurewherein each component tracing is separatelyevaluated by comparing the response of arelevant question to the response of acomparison question.

Stimulus Onset - During data collection, thisis the beginning of the presentation of the firstword of a reviewed question.

Tonic Level - An examinee's level ofphysiological activity occurring prior tostimulus onset. This is sometimes referred toas the resting or baseline activity level. Toniclevel describes a person's physiological activitywhen resting.

Unwanted (Excessive) Noise on Signal ofInterest - Any noise (physiological activity)that should prevent a stimulus (scoreable testquestion) from being presented during thedata collection phase. If an examiner asks a

Page 30: VOLUME 28 NUMBER 1

7-Position Scale at DoDPI

Polygraph, 1999, 28 (1) 27

scoreable question when there is unwantednoise on the signal of interest, this mayprevent that question from being utilizedduring the scoring process. However,unwanted noise on one physiological tracingmay not prevent other tracings in that sameanalysis spot from being evaluated. For

example, unwanted noise on the respiratorytracings may prevent them from beingevaluated. However, during the analysisprocess for that scoreable question, if thecardiovascular and/or EDA tracings areunaffected by the unwanted noise, they maybe used for evaluation.

References

Abrams, S. (1989). The complete polygraph handbook. Lexington, MA: Lexington Books.

Matte, J. A. (1980). The art and science of the polygraph technique. Springfield, IL: Charles C.Thomas Publisher.

Matte, J. A. (1996). Forensic psychophysiology using the polygraph: Scientific truth verification-lie detection. Williamsville, NJ: J.A.M. Publications.

Weinstein, D. A. (1994). Anatomy and physiology for the forensic psychophysiologist. FortMcClellan, AL: Department of Defense Polygraph Institute.

Page 31: VOLUME 28 NUMBER 1

Deception Criteria Prior to 1950

Polygraph, 1999, 28 (1) 28

Development of Deception Criteria Prior to 1950

Norman Ansley

Abstract

This is a review of the literature published up to 1950 that contributed to the current list ofphysiological responses considered deception criteria. Even making allowances for differences interminology, there are deception criteria in the current DoDPI list that had not been observed, or ifobserved, not described before 1950. An appendix describes Luria’s motor movement techniqueand Wertheimer’s word association test. As means of detecting deception, both were discontinuedbefore 1950.

Key words: cardiovascular, deception criteria, electrodermal, motor movement, polygraph history,respiration, terminology, word association.

This is a review of the literaturepublished up to 1950 that contributed to thecurrent list of physiological responsesconsidered deception criteria. That yearmarked the halfway point for the developmentof polygraph testing, as we know it in 1999. In1950 that only formal polygraph training wasat the Keeler Polygraph Institute, and mostexaminers were preceptor trained or self-taught. Most of the instruments were two-channel (cardiograph and pneumograph)mechanical units, although there were somewith electrodermal units. The most widelyused technique was relevant-irrelevant. A fewexaminers used one or the other of twopublished Control Question techniques, onepublished by Summers (1939), and the otherby Inbau (1948). Among the many short-comings in 1950 was a lack of agreement onwhat constituted deception criteria. Addinadequate chart markings, and that indepen-dent analysis of someone else’s charts wasdifficult, and the results were problematic.

In 1950, Charles M. Wilson, presidentof the International Society for the Detection ofDeception (ISDD), was asked, “Should graphsbe released or shown after the test?” Wilson’sreply was printed in the ISDD Bulletin. He saidthat in his experience he never released anoriginal record to anyone. He did not thinkmaking copies a good policy since possession

of the record by an untrained operatorrepresents the first step in the direction ofperversion and quackery. Wilson said thecharts mean nothing to anyone who was notpresent when the tests were run, and the onlyuse to which they could be put was to cloudthe issue (Wilson 1950).

If one examiner could not reliably readcharts from another examiner, what did theyknow about chart interpretation in 1950? Inthis paper we list sixteen studies or reportswhich included something on deceptioncriteria. The sixteen studies or texts did notdiscuss rank order scoring, only two had aform of numerical analysis, and computerswere not yet useful machines. Taking theDepartment of Defense Polygraph Institute(DoDPI) list as state of the art for hand scoringin 1999, how many of the criteria had beenidentified by 1950? In the pneumographtracing, DoDPI lists 12 items. Seven had beenidentified by 1950: I/E ratio change,amplitude increase, amplitude decrease/suppression, amplitude progressive decreaseand return to homeostasis, respirationbaseline change – temporary, baseline change– permanent, and apnea – blocking. By 1950,they had not yet observed respiration rateincrease, rate decrease, respiration amplitudeprogressive increase followed by decrease,amplitude progressive increase and return

The author is a Life Member of the APA and president of Forensic Research Incorporated. Correspondence on this topic orrequests for reprints should be sent to 35 Cedar Road, Severna Park, MD 21146.

Page 32: VOLUME 28 NUMBER 1

Ansley

Polygraph, 1999, 28 (1) 29

to homeostasis, and apnea – holding.Considering terminology, they might have seenthe difference between holding and blockingbut did not think the difference mattered.Some of today’s examiners might have troublerecognizing DoDPI’s more exact definitions ofstaircases up, staircases down, and staircasesup and down. DoDPI lists three criteria underelectrodermal, and two, amplitude andduration, were in the pre-1950 literature.Only the complex response, which someexaminers call a saddle, was not mentioned.DoDPI includes eight deception criteria undercardiograph including premature ventricularcontractions which were not listed prior to1950, but some would say should not be listednow. Of the seven others on the list, six wereknown: phasic increase and decrease inbaseline, tonic increase in baseline, tonicdecrease in baseline, pulse rate increase, pulserate decrease, and decrease in amplitude ofthe tracing. The one lacking in 1950 was theincrease of the amplitude of the tracing. Itwould appear that the well-informed examinerof 1950 had enough deception criteria todecide most of his cases, but the more we goback in time, the less he had. The cumulativegrowth of a body of technical and scientificknowledge is a vital part of a profession. Inthe text that follows we will see thedevelopment of knowledge.

One wonders if the pioneers ininstrumental detection of deception knew ofDaniel Defoe’s proposal to take the pulse of asuspected thief. One would think he wasdiscussing a modern polygraph problem whenhe observed, “It may be true that thisdiscovery by the pulsation of the blood cannotbe brought to a certainty, and therefore it isnot to be brought into evidence; but I insist, ifit be duly and skillfully observed, it may bebrought to be allowed for a just addition toother circumstances, especially if concurringwith other just grounds of suspicion.” (1730)(Moore 1955).

Cesare Lombroso (1911) mentions acase in which he used his recording hydro-sphygmograph. His apparatus measuredblood volume and pulse rate. He reports, “Thesame apathy persisted when he was spoken toof the robbery on the railroad, while there wasan enormous depression – a fall of 14mm –when the Torelli theft was mentioned. I

concluded, that he had no part in the railwayrobbery, but he had certainly participated inthe Torelli affair; and my conclusions werecompletely verified.” Here we have a measureof a cardiovascular reaction, and a verifieddecision.

In 1914, Vittorio Benussi published theresults of an experiment relating to thesymptoms of lying in respiration. At theUniversity of Graz in Leipzig, Benussi hadsubjects read aloud five statements, some ofwhich were coded and not to be read as stated.Half of the items in the 80 experiments were tobe lied about. Panels of witnesses madejudgments as to when subjects were lying, andwhen they were telling the truth. Using aMarey pneumograph which recorded on apolygraph, Benussi measured the distancebetween the beginning and end (length) ofeach of three, four, or five cycles of breathingafter the subject spoke. For each cycle ofbreathing Benussi measured the length (time)of the inspiration (I) and the length of theexpiration (E), and calculated the ratio (I/E)for each of the cycles before and after thestatement. He found that lying producedgreater I/E ratios than truthfulness. Of the 80experiments, I/E analysis resulted in one falsepositive error and one false negative error, fora total accuracy of 97.5%. The average panelaccuracy was 56% for truthful and 58% fordeceptive statements. This experiment attrac-ted the attention of Marston, Larson, andothers to the diagnostic value of a respiratoryrecording. Also, the I/E ratio has remained onthe deception criteria lists of the DoDPolygraph Institute, The Maryland Institute ofCriminal Justice, and other polygraph schoolsand courses.

John A. Larson (1923) had experiencein hundreds of criminal cases as a basis forhis description of deception criteria. Larsonrecorded a continuous cardiograph andpneumograph pattern on a smoked drumapparatus, and observed that the record of theinnocent suspect will usually vary but slightly,if at all, from its normal. In describing someguilty test results, Larson describes repressionin the pneumograph tracing, and the accom-panying chart illustration shows a rise and fallin the cardiograph pattern of the confirmeddeception. In another chart we see supp-ression, loss of baseline, and changes in the

Page 33: VOLUME 28 NUMBER 1

Deception Criteria Prior to 1950

Polygraph, 1999, 28 (1) 30

I/E ratio and rhythm and regularity in thepneumograph and a rise and fall in thecardiograph tracing, but his text does notdescribe this illustration. Larson notes thatthe cardiac curve is usually more significantthan the respiratory curve. In the descriptionof a chart, Larson writes of the extremeblocking effect of deception. In one chartLarson described deception causing a drop inthe blood pressure curve with the obliterationof the pulsations. In addition, there was anincrease in frequency. Describing anotherchart segment with a lie, Larson notes in boththe cardiac and respiratory curves there wasrepression. Larson states the followingchanges have been observed as the effect ofdeception. These changes may occur in boththe cardiac and the respiratory curves or inone alone, more frequently in the cardiacaction:

1. Increase in blood pressure – a rise.

2. Decrease in blood pressure.

3. Increase in height.

4. Increase in frequency.

5. Summative effects.

6. Incomplete inhibition.

7. Complete inhibitory effect.

8. Irregular fluctuations, especiallynoticeable at the base of each cardiacpulsation.

9. Combination of any of the aboveeffects in the same individual.

10. These changes may occur withbut little latent period, or then may beaccumulative in effect and more generallydistributed.

Leonarde Keeler (1930) wanted tocompare the peak of tension polygraphtechnique with the word association method.Seventy-five subjects took a one-chart peak oftension on which of ten cards they hadchosen. If the chosen card, placed by chance,was first or last in the sequence, the test wasrepeated in a different sequence. The

deception criteria were a rise in blood pressurefollowed by a release in tension after thechosen card, and the greatest suppression inthe respiratory tracing. There were 71 correctdecisions of 75 (95%) on the first trial. Post-test interviews attributed the failures to a lackof interest or concern which resulted in a lackof responses. Here we have pneumographsuppression and a rise and relief in thecardiograph pattern established as validdeception criteria. By comparison, the wordassociation test of 30 students was correct in19 (63%). For results of another comparisonsee the work of John E. Winter (1936) in adormitory theft case.

Professor John E. Winter (1936)investigated thefts in the women’s dormitoriesat West Virginia University with two methods:Jung’s word association test with a chrono-scope for reaction time, and a Larson typepolygraph test employing respiratory andcardiovascular measures, from separatedevices. The breathing curve was rated asregular or irregular; light or deep. The bloodpressure curve was rated as regular orirregular, and medium or strong. Winter gavethree levels of significance to the results ofeach of the methods: 0 for no significance,“nothing to indicate guilt;” 1 for “somesignificance and points in direction of guilt;”and 2 for “distinct signs of guilt.” There were25 women suspects and each received twoLarson type tests, with consistent responsesexcept for the culprit. The first test of eachsubject was labeled practice. From therespiration recording there were 24 zeros,including the thief, who confessed. On herpractice she scored a 2 on her cardiographcurve, the only one to do so. She was given apost-confession test where she again scored a2 on the cardiograph curve. This may be thefirst case of numerical scoring. Wordassociation cleared 19 innocent suspects, andhad the thief among the five who scored a 1.

Winter’s polygraph apparatus wasreported as “an ordinary pneumograph, aBaumanometer, an improved form of theErlanger capsule for high and low airpressure, and a MacKenzie polygraph for acontinuous record of breathing and heartaction.” For a picture and description of theMacKenzie polygraph, see Polygraph (1992)21(4) 349-350.

Page 34: VOLUME 28 NUMBER 1

Ansley

Polygraph, 1999, 28 (1) 31

C.D. Lee wrote an article, “The LieDetector,” published in the September, 1937issue of the Fingerprint and IdentificationMagazine. Lee illustrates the article with apicture of a chart from the examination ofJerone Selz who confessed to murder after thetest. There was a double rise and fall in thecardiograph pattern and suppressed res-piration following the question, “Did you killMrs. Rice?” The remainder of the article isabout the instrument and testing.

Leon G. Turrou (1938) in his book NaziSpies in America describes several polygraphexaminations given to suspects and witnessesinvolved in a German espionage ring. Turroudescribes how the instrument functions(cardiograph and pneumograph), then quotesKeeler on the procedure. Eight suspects orwitnesses were tested. Because manyquestions were asked each examinee, a systemof asterisks was devised to give someindication of results. In the report, oneasterisk after a question indicated a mildemotional reaction, two a strong emotionalreaction, and three asterisks, quite anemotional reaction, “such as would be foundwhen the subject is telling a whopper.” Oneexaminee was asked nine relevant questions.There were no asterisks behind four of thequestions, two asterisks behind one question,and three asterisks behind four questions; asplit call from a multiple issue relevant-irrelevant test format. During the testing of asuspect, 18 relevants were asked, and in thereport there were no asterisks behind five ofthe relevant questions, one asterisk behindfour of the relevants, two asterisks behind fiverelevants, and three asterisks behind fourrelevants. This evaluation of the charts wasunusual, at least unusual to appear in thereport. In reality, the asterisks were anumerical system, zero to three, for eachquestion.

William M. Marston published a bookin 1938. Under the heading “Judging aPolygraph Record,” Marston states thatchanges in the blood pressure are the chiefand only dependable criterion of deception.This is shown by the shifting of the entiremass of pulse tracings toward the upper edgeof the recording strip. Variations in the pulseare not significant. Regarding breathing,Marston said marked changes in respiration

tracings that accompany changes in the bloodpressure justify a judgment of deception. Henoted that Benussi’s breathing ratios areprobably extremely significant of lying, but ithas never proved practical. Marston said asudden hump in the breathing record may bemeaningful, as may a “shoulder” in either theinspiration or expiration tracing. Also indica-tive is a sudden irregularity indicating a“catching of the breath,” or an unaccountableflattening out of the whole respiration tracingindicating an extended series of shallowbreaths.

The Reverend Walter G. Summers,S.J., prepared a paper on his work before hisdeath on September 24, 1938. Published in1939, it describes a sophisticated test formatand means of chart analysis. In a theft casesthere would be three relevant questions. Insequence the questions asked aboutknowledge, guilt, and possession. Called“significant” questions, examples were, “Doyou know who took the money?”, “Did youtake the money?”, and “Have you the moneyon your person?” He said that within onerecord there were usually included threedifferent but related significant questions,each of which was asked three times.Interspersed among the non-significant ques-tions (irrelevants) are emotional standardquestions (controls). An emotional standardquestion precedes each significant question.The format is three pairs of control-relevantquestion, with irrelevants put in as needed.Examples of irrelevants were “Are you wearinga black coat?” and “Did you eat breakfast thismorning?” Examples of emotional standards,developed after extensive interviewing of theexaminee were: “Where you ever arrested?”and “Do you own a revolver?”

The analytical system is modern.Summers said “…we contrast and comparethe reactions to the significant questions withthe reactions to the emotional standards. Ifthe reactions to the significant questions areconsistently greater than the deflections to theemotional standards, the individual isconsciously trying to deceive the examiner. If,on the other hand, the deflections to thecritical questions are not consistently greaterthan those to the emotional standards, theindividual is truthfully expressing his state ofmind. This is the essential criterion of

Page 35: VOLUME 28 NUMBER 1

Deception Criteria Prior to 1950

Polygraph, 1999, 28 (1) 32

interpretation.” Professor Summers used arecording galvanometer, the Fordham Patho-meter, which he manufactured. A letter to theauthor from William E. Kirwan in 1952,indicated the New York State TroopersScientific Laboratory was still using theSummer’s technique, with excellent results(Kirwan 1952). Summers, who conductedlaboratory and criminal cases, established thecontrol question test concept, including theanalytic procedure (Summers 1939).

Paul Trovillo (1942) wrote what isprobably the first treatise on the topic ofdeception test criteria. The illustrations weretaken from real cases. Although theelectrodermal unit was not widely used, thereis a good section of illustrations of GSRtracings. For the cardiograph he lists andillustrates:

1. Common form of blood pressurerise (and return to baseline).

2. Blood pressure increase. . .complicated by cyclical increasethroughout the graph.

3. Rapid rise and decline in bloodpressure, accompanied by obliteration ofpulse amplitude.

4. Gradual increase in bloodpressure.

5. Constriction of pulse amplitudeand gradual rise in blood pressure.

6. Slight rise accompanied by rapiddecline in blood pressure.

7. Peak of tension.

8. Rapid changes in heart rhythm.

9. Another form of change in heartrhythm (includes general pulseirregularity).

10. Complication of deception pattern– increase in blood pressure and return tobaseline, variations in pulse frequency,and reduction of pulse amplitude.

11. Reduction in pulse amplitude.

For the respiration tracing, he lists andillustrates:

1. Suppression at point of deception.

2. Respiratory block.

3. Rise in baseline.

4. Respiratory suppression precedingdeception stimulus, followed by deeperrespiration at point of deception.

5. Regularity of respiration up to andthrough the deception stimulus, followedby irregular respiration.

6. Respiratory irregularities up topoint of deception, followed by regularrespiration.

For the electrodermal he lists andillustrates:

1. Comparatively large area ofreaction at point of deception.

2. Comparatively large magnitude ofreaction at point of deception.

3. Peak of tension test (experimentalage test), reactions to each age up to andincluding the point of deception, thennone.

4. Peak of tension card test. The onlylarge reaction.

5. Peak of tension. Pattern atdeception different from patterns attruthful answers.

6. Gradual rise in the electrodermalpattern.

Trovillo then lists and illustrates whathe calls ambiguities in the records. In thecardiograph tracing he shows the effect ofbody movements, a deep breath, generalexcitement, increase in blood pressure even atirrelevant questions, an absence of bloodpressure and pulse rate changes during lying,inconsistency of reactions on questionsinvolving guilt, startle response of innocentsubjects, and a cardiac irregularity.

Page 36: VOLUME 28 NUMBER 1

Ansley

Polygraph, 1999, 28 (1) 33

For ambiguous respiratory patterns helists and illustrates: deception-like supp-ression found among some innocentexaminees, effects of superfluous talking andphysical movement, erratic breathing of aninnocent person from great fear, a deep breathtaken deliberately to obliterate suppression,normal shallow breathing following a deepbreath, effect of sinus congestion, lack ofresponse in known guilty subject, andrespiratory tremor found in both relevant andirrelevant questions by an excited person.

For the ambiguous electrodermalpatterns he lists and illustrates: over-activityof the reaction, effects of bodily movement,effect of deep breath at the very moment ofresponse, unresponsiveness in guilty subject,inconsistent reactions in guilty subject, andguilt reactions in innocent persons.

In 1942, Fred E. Inbau published thefirst of his three books on Lie Detection andCriminal Interrogation. The techniques wererelevant-irrelevant and peak of tension. In thesection on deception criteria he notes that thecriteria differ somewhat for the twotechniques. For the cardiograph he mentionsan increase in blood pressure and theillustration shows it returning to baseline afterfirst going below the baseline. Other criteriainclude a sharp drop in blood pressure, andslowing of the pulse rate. For the respirationpattern he lists suppression, and heavybreathing about twenty or twenty-five secondsafter the reply to a question. Inbau writesabout the EDA and the lack of knowledgeabout it, and concludes that electrodermaltracings alone cannot be considered asadequate for deception diagnosis, but it maybe occasionally helpful as an adjunct to theother recordings.

In 1943, C.D. Lee prepared anInstruction Manual for the Berkeley Polygraph.It is a complete text on conducting examin-ations and reading the charts. The methodsare relevant-irrelevant and peak of tension.He notes the pattern of the innocent is one ofregularity and uniformity with no markeddifference between the effect produced byneutral questions and those related to thecrime. The tension may remain constant,decrease, fluctuate slightly, but seldomincrease. In the guilty, the tension is lacking

in regularity and uniformity. Illustrationsshow a phasic rise and fall of the cardiographpattern associated with deception. In thepneumograph, he shows repressed breathing,followed later by a sigh of relief. Most of theillustrations were of the cardiograph pattern,and the rise and fall of the cardiograph patternis clearly the primary indication of deception.

Joseph W. Haney (1944) was a forensicpsychologist and experienced polygraphexaminer in the Chicago Crime Laboratory.He was interested in the catalogue ofdeception criteria by Paul V. Trovillo. Haneywondered if the respiration responses des-cribed by Trovillo might not be produced by anondeceptive mental task as well as deception.Haney did that, producing charts withblocking (apnea), suppression, and baselinerises. Haney suggested that before usingthese as deception criteria, one should see ifthey occur also at irrelevant questions.

In 1948, Fred E. Inbau published thesecond edition of his book Lie Detection andCriminal Interrogation. In addition to therelevant-irrelevant and peak of tension teststhere was the Reid control question test. Thesection on deception criteria has not changedin a significant way. For the cardiograph,Inbau mentions an increase in blood pressure,and the illustration shows it returning tobaseline after first dropping below thebaseline. Other criteria include a sharp dropin blood pressure, and slowing of the pulserate. For the respiration system he listssuppression, and heavy breathing abouttwenty or twenty-five seconds after the reply toa question. In regard to the electrodermalchannel, the author said electrodermalresponses have been found to be of littlepractical value in diagnosing deception.

Baesen, Chung & Yang (1948-1949) tellus the chart criteria they used in a laboratoryresearch project employing a two-channelKeeler polygraph. They used pulse ratechanges, sudden and delayed drops in bloodpressure, duration of rise and fall in bloodpressure, and location of the dicrotic notch.Notice was taken of changes in respirationbaseline, blocking and suppression ofrespiration either prior to, during, or immed-iately following the question.

Page 37: VOLUME 28 NUMBER 1

Deception Criteria Prior to 1950

Polygraph, 1999, 28 (1) 34

In 1950, Colonel Ralph W. Pierce,president of Leonarde Keeler, Inc. was writingabout the use of the peak of tension test. Hewrote, “One man reacted to this test, his bloodpressure rising until the question concerningthe German Luger was asked, then falling off.He also showed marked irregularity in hisbreathing up to the question about Luger,followed by regularity to the end of the test.The galvanometer pen also rose sharply at thequestion concerning the Luger. This man alsoreacted similarly to the other tests referring tothe disposition of the gun, its condition, etc.”The examinee confessed to the crime. ColonelPierce’s description has tonic changes in thecardiograph and pneumograph tracings, but aphasic response in the electrodermal, channelshowing a combination of deception criteria.

Abandoned Methods for DetectingDeception

In the period before 1950 there weretwo techniques that were subject toconsiderable research as means for detectingdeception. Their criteria for deception werenot related to the methods in the polygraphtechnique.

Luria (1930, 1932) developed a liedetection method that involved tremors and

motor movement. It received some researchattention in the United States but was notused in criminal cases (Berrien, 1939; Morgan& Ojemann 1942).

From the turn of the century into the1930s the word association test wasconsidered a method for detecting guilt incriminal cases (Wertheimer & Klein 1904,Jung 1919). However, Larson (1922) found“the association words with time reaction donot give as satisfactory results as the cardio-respiratory changes.” Larson added, “We cansay this definitely in cases where the suspecthas subsequently confessed where, althoughthere were marked and striking changes in thetracings, the findings by association methodwere not significant.” Keeler (1930) found theassociation method performed poorly whencompared to polygraph test results. Winter(1936) in a real case of theft involving 25students, found the association method hadthe thief among five in a narrowed pool ofsuspects, but his cardio-pneumo methodidentified the culprit, followed by a confession.Although word association with reaction timeremains as a psychological tool, its use insolving crime has disappeared.

References

Baesen, H. V., Chung, C. & Yang, C. (1948-49). A lie detector experiment. Journal of Criminal Law,Criminology, and Police Science, 39, 532-537.

Benussi, V. (1914). Die atmung asymptome der luge (The respiratory symptoms of lying). Archivfur die Gestamte Psychologie, 11, 244-273. Translated and printed in Polygraph, 4(1), 52-76.

Berrien, F. K. (1939). Finger oscillations as indices of emotion. I Preliminary Validation, Journal ofExperimental Psychology, 24, 485-498; and II Further validation and use in detecting deception,Journal of Experimental Psychology, 24, 609-620.

Haney, J. W. (1943-44). The analyses of respiratory criteria in deception tests, a possible source ofmisinterpretation. Journal of Criminal Law and Criminology, 34, 268-273.

Inbau, F. E. (1942). Lie Detection and Criminal Interrogation. Baltimore: Williams & WilkinsCompany.

Inbau, F. E. (1948). Lie Detection and Criminal Interrogation. Second Ed., Rev. Baltimore:Williams & Wilkins Company.

Page 38: VOLUME 28 NUMBER 1

Ansley

Polygraph, 1999, 28 (1) 35

Instruction Manual for the Berkeley Psychograph (1943). San Rafael, California: Lee & Sons.

Jung, C. G. (1919). Studies in Word Association. New York: Moffet, Yard & Company.

Keeler, L. (1930). A method for detecting deception. American Journal of Police Science, 1(1),38-51.

Kirwan, W. E., (16 Oct 1952). Letter from the Director or the New York State Troopers ScientificLaboratory to Norman Ansley.

Larson, J. A. (1923). The cardio-pneumo-psychogram and its use in the study of the emotions,with practical applications. Journal of Experimental Psychology, 5(5), 323-328.

Larson, J. A. (1923). The cardio-pneumo-psychogram in deception. Journal of ExperimentalPsychology, 6(6), 420-454.

Lee, C. D. (Sep 1937). The lie detector. Fingerprint and Identification Magazine.

Lombroso, C., M.D. (1911). Crime: Its Causes and Remedies. Boston: Little, Brown & Company.Excerpt reprinted in Polygraph (1991) 20 (1) 52-53.

Luria, A. R. (1930). Die Methode der Abbildenden Motorik in der Titbestandiagnostik . Zeitschriftfur Angewandte Psycholgie, 35, 139-183.

Luria, A. R. (1932). The Nature of Human Conflicts: On Emotion, Conflict and Will. New York:Liveright.

Marston, W. M. (1938). The Lie Detector Test. New York: Richard R. Smith. Reprinted by theAmerican Polygraph Association, 1989.

Moore, J. R. (1955). Defoe’s project for lie-detection. American Journal of Psychology, 68, 672.

Morgan, M. I. & Ojemann, R. H. (1942). A study of the luria method. Journal of AppliedPsychology, 26, 168-179.

Pierce, R. W. (1950). The peak of tension test. ISDD Bulletin, 3 (3) 1, 4, 5.

Summers, W. G. (1939). Science can get the confession. Fordhan Law Review, 8, 335-354.

Trovillo, P. V. (1942). Deception test criteria: How one can determine truth and falsehood frompolygraphic records. Journal of Criminal Law and Criminology, 33 (4) 338-358.

Turrou, L. G. (1938). Nazi Spies in America. New York: Random House.

Wertheimer, M. & Klein J. (1904). Psychologische Tatbestandsdiagnostik. Achiv fur Kriminal-Anthropologie und Kriminalistic, 15, 78-113.

Wilson, C. M. (1950). Should graphs be released or shown after test? ISDD Bulletin, 3 (3), 5, 8.

Winter, J. E. (1936). A comparison of the cardio-pneumo-psychograph and association methods inthe detection of lying in cases of theft among college students. Journal of Applied Psychology,20, 243-248.

Page 39: VOLUME 28 NUMBER 1

Attachment 1Deception Criteria for Lie Detection Pioneers

Defoe Lombroso Benussi Larson Keeler Winter Lee Turrou Marston Summers Trovillo Inbau Lee Haney Inbau Baesenet al

Pierce

1730 1911 1914 1923 1930 1936 1937 1938 1938 1939 1942 1942 1943 1944 1948 1949 1950

Respiratoryrate decrease

rate increaseI/E ratio change X Xamplitude increase X X X X X Xamplitude decrease-suppression X X X X X X X X X Xprogressive increase - decreaseprogressive increase and returnprogressive decrease and return Xbaseline change - temporary Xbaseline change - permanent X X Xapnea - holding (inhalation)apnea - holding (exhalation) X X X X

Electrodermalamplitude change X X Xcomplex responseresponse duration and return X

Cardiovascularbaseline increase and decrease X X X X X X X Xbaseline increase X X X Xbaseline decrease X X X X Xamplitude increaseamplitude decrease Xpulse rate increase X X X Xpulse rate decrease X X XPVCs

Numerical analysis X X

Page 40: VOLUME 28 NUMBER 1
Page 41: VOLUME 28 NUMBER 1
Page 42: VOLUME 28 NUMBER 1

Light

Polygraph, 1999, 28 (1) 37

Numerical Evaluation of the Army Zone Comparison Test

Gary D. Light

Abstract

This study involved a comparison between the spot-totaling and cumulative numerical evaluationprocedures with the zone comparison test (ZCT) as applied by the United States Army CriminalInvestigation Command (USACIDC) polygraph program. The examinations utilized for this researchwere conducted 1 January through 31 March 1991. Subsequently, the USACIDC polygraphprogram replicated this research with data from the calendar year 1997, and these data wereincorporated into this study. The cumulative test data analysis procedure was applied to the ZCTexaminations from both time periods. A total of 358 confirmed deceptive examinations wereidentified. The cumulative evaluation procedure correctly identified 241 (67%) of the confirmeddeceptive examinations while 107 examinations (30%) would have been classified as no opinion andten examinations (3%) would have been classified as false negatives. Based upon these findings, itwould appear that the cost of the cumulative evaluation procedure in terms of utility and accuracyof the PDD process (30% no opinion) is too significant for an agency supporting a law enforcementmission to accept.

In 1961, the ZCT, as developed byCleve Backster, was adopted by USACIDC foruse in polygraph examinations of criminalsuspects (Brisentine, 1991). In 1962, with thepermission and assistance of Cleve Backster,the ZCT was incorporated into the formallesson plan of the Polygraph School, UnitedStates Army Military Police School (USAMPS),Fort Gordon, GA (Decker, 1991). The ZCT,with certain modifications, is still being taughtat the Department of Defense PolygraphInstitute (DoDPI) (Cole, 1991)]. Thisquestioning format of the ZCT is referred to asthe "Army" ZCT. One modification of theoriginal ZCT protocol incorporated byUSACIDC and USAMPS was a change in theprocedure for test data analysis (scoring). Thenumerical evaluation of the ZCT was taught intwo formats at USAMPS (Sneed, 1991).Originally, the numerical evaluation of the testdata was based solely on the overall sum ofthe spot totals, referred to as "cumulativescoring" (Brisentine, 1991). The secondnumerical evaluation procedure, as taught atUSAMPS and currently at DoDPI, is referred toas "spot-totaling". USAMPS adopted the spot-

totaling procedure after a review of numerouscriminal specific examinations. During theUSAMPS review, it was found that thecumulative method of chart evaluation wasincorrectly assigning deceptive examinees NoOpinion (NO) classifications (Sneed, 1991).

In the ZCT, a spot is the pairing of arelevant and comparison questions in whichthe physiological responses are comparedcomponent by component against one another,and a whole number value between –3 and +3is assigned to each spot for each component.The difference between spot-totaling and thecumulative evaluation procedure is thedecision rule. The spot-totaling numericalevaluation procedure results in a determin-ation of non-deception indicated (NDI) if thesum of all spot-totals greater than +5, with apositive score occurring in each spot. Theclassification of Deception Indicated (DI) ismade if the overall total is less than –5, or ifthe evaluation of any relevant question (spot-total) is -3 or less, regardless of the grand sumof all spot-totals. A classification of NO isrendered when an the grand sum for the

The author was the supervisor of the quality control section of the USACIDC polygraph program in 1991, when this projectwas originally completed. The project was intended to review the utility of the spot-total analysis procedure for the USACIDCpolygraph program for internal purposes, but not for publication. Mr. Milton O. Webb, current supervisor of the qualitycontrol section, USCIDC polygraph program, suggested that 1997 data be included. The author is grateful for the editing ofMs. Sheila Thomas, DoDPI. This document is submitted for publication with the permission of the USACIDC polygraphprogram.

Page 43: VOLUME 28 NUMBER 1

Scoring the Army ZCT

Polygraph, 1999, 28 (1) 38

examination falls between +6 and –6, or anexamination has a spot-total of 0 to minus -2,regardless of the cumulative total.

The cumulative numerical evaluationprocedure results in a determination of NDI ifthe sum of all of the spot-totals is +6 orgreater. The classification of DI is rendered ifthe sum of the spot-totals is -6 or less. Aclassification of NO is rendered when the sumof the spot-totals is between +6 and -6. Nodecision is based on the score of an individualrelevant question.

The ZCT is used in approximately halfof the polygraph examinations conducted bythe USACIDC polygraph program. Between1980 and 1990, USACIDC conductedapproximately 15,839 field examinations ofcriminal suspects utilizing the ZCT. Duringthis period 76% of these examinations wereconfirmed as deceptive. Further, between1980 and 1990, USACIDC maintained aconfession rate of 70% of those examinationsin which the suspect was called DI usingUSACIDC spot-totaling scoring rules. Since1966, USACIDC has utilized the spot-totalingmethod exclusively for chart evaluation(Brisentine, 1991).

The research study of Capps andAnsley (1991) indicated that polygraphexaminations utilizing the spot-totalingmethod of numerical evaluation might resultin a significant number of false positiveexaminations. In this study, Capps andAnsley used confirmed polygraph examin-ations that were numerically evaluated, andopinions that were based upon the cumulativetotaling procedure. In this research, anopinion of NDI was rendered even when spot-totals were in the minus ranges. The Cappsand Ansley research reported that whenutilizing spot-totaling rules, the false positiveerror is not as readily identified. The researchindicated that by "classifying in this way(cumulative totaling procedure), false positiveerrors that may have come into existence byuse of the spot rule are identified" (Capps andAnsley, 1991).

In 1985, Richard Weaver, utilizing 15criminal specific examinations retrieved fromthe Wisconsin State Crime Laboratory, appliedthe Backster, Utah and USAMPS numerical

evaluation procedures. The results of theseevaluation procedures indicated that althoughthere was no significant difference in theopinions rendered when similar classificationprocedures (Utah and USAMPS) were utilized,a significant difference occurred when theweaker comparison question was compared toa relevant question. These findings indicatethat the method of evaluating the test data canaffect ZCT decisions.

Method

This research project was originallyinitiated in response to the research project ofCapps and Ansley (1991), that indicated thatthe cumulative evaluation process might be amore appropriate method of evaluation for theZCT by a government agency. Specifically, theUSACIDC polygraph program evaluated thespot-total and the cumulative evaluationprocedures to determine which would be themost appropriate for that agency. The originalresearch and report were completed in 1991,based upon the review of all DI examinationsconducted by the USACIDC polygraphprogram between 1 January and 31 March1991, utilizing the ZCT question format astaught at USAMPS. The 1991 research was aninternal review of existing PDD procedureswith no intent of publishing the results. In1998, it was requested the research be editedfor submission to this journal for possiblepublication. Prior to the 1991 report beingedited and submitted for publication, theUSACIDC program was provided with a copy ofthe draft report. The USACIDC programrequested that they be allowed to replicate thedata collection procedures completed in 1991.It was agreed that USACIDC personnel wouldreview all DI examinations involving the ZCTfor the calendar year 1997. The examinationsthat had been determined to be DI utilizing thespot-totaling procedure would subsequently beevaluated utilizing the cumulative evaluationprocedure. The results of that procedurewould then be compared to the data collectedin 1991.

Between 1 January and 31 March1991, 482 examinations were conducted and145 of these examinations were opined to beDI. All of the test data for these 145examinations were evaluated by the spot-totaling procedure. Of these 145 DI

Page 44: VOLUME 28 NUMBER 1

Light

Polygraph, 1999, 28 (1) 39

examinations, 108 were verified as DI by useof confession.

Between 1 January and 31 December1997, 959 examinations were conducted and556 examinations were opined to be DI. All ofthe test data for these 556 examinations wereevaluated by the spot-totaling procedure. Ofthese 556 DI examinations, 250 were verifieddeceptive by posttest confession of theexaminee.

The 358 examinations (1991 and 1997totals) that were confirmed deceptive consistedof examinations in which the examineeconfessed to being involved in the investigationfor which the polygraph examination wasconducted. The quality control section of theUSACIDC polygraph program routinely reviewsall examinations that are completed foraccuracy, and to determine if the opinionrendered by the original examiner can besupported by the agency's investigation. Apart of that quality control process is todetermine if the examinee has madestatements which support the opinion which isrendered based upon the physiologicalresponses to the relevant questions. Whenstatements are made by the examinee that areagainst the self-interest of the individual andare consistent with the deceptive physiologicaldata, this examination is consideredconfirmed. As will be discussed later, wheninconsistent results are found between thecase facts and the physiological responses tothe relevant questions, a review of thisdiscrepancy is completed and this examinationis categorized in the USACIDC polygraphprogram data base as a "contradictedexamination".

All of the 358 DI examinations used inthis research project underwent the USACIDCquality control review to verify that eachexamination was confirmed as deceptive. Thecriterion of confessions as ground truth is asubject for debate (Patrick and Iacono, 1991).This research project is not a validation study,but is concerned only with the impact thecumulative evaluation procedure would havewhen applying this test data evaluationprocedure in lieu of the spot-total analysisprocedure.

The Lafayette Factfinder analogpolygraph was the primary instrument used in

1991. The 1997examinations were conductedmainly with the Axciton (Axciton Systems,Houston, Texas) computerized polygraphsystem. All polygraph examinations admini-stered underwent the USACIDC quality controlprocess, which ensured each examinationcomplied with the policies and standardsrequired by USACIDC. With minormodifications, the USACIDC policies adhere tothe procedures as taught by DoDPI for theconduct of the ZCT. The examiners whocollected the examinations for this projectwere either certified examiners, or internscompleting their internship for the USACIDCpolygraph program. Each examiner had abachelor's degree and at least five years ofcriminal investigative experience as aUSACIDC special agent.

The following ZCT question format wasutilized by USACIDC in the conduct of theseexaminations:

NeutralSacrifice RelevantSymptomaticComparisonPrimary RelevantComparisonPrimary RelevantSymptomaticComparisonSecondary Relevant

The review of the relevant literature forthe original USACIDC research project wascompleted in 1991. The review of theliterature was not updated for the 1997 data.The 1997 data was incorporated to thisresearch project at the request of USACIDCpolygraph program. It is believed that thereplication of the procedures by differentresearchers and involving data collected sixyears apart would significantly add to thisresearch.

Results

DI Confirmed Examinations in 1991

Of the 145 DI examinations, 108examinations were confirmed by confession.When utilizing cumulative evaluationprocedures, 74 (69%) of the examinations weredetermined to be DI. Of the remaining

Page 45: VOLUME 28 NUMBER 1

Scoring the Army ZCT

Polygraph, 1999, 28 (1) 40

examinations, 32 (30%) examinations wouldhave been determined to be NO and two (1%)

would have been classified as false negative.(See Table 1.)

Table 1.Polygraph decisions in 1991 as they would be affected by two types of decision rules:

cumulative and spot totals. (n=108 confirmed deceptive examinations)

ResultsMethod DI NO NDICumulative 74 (69%) 32 (30%) 2 (1%)Spot-Totaling 108 (100%) 0 0

DI = Deception Indicated NO = No Opinion NDI = No Deception Indicated

DI Confirmed Examinations in 1997

Of the 556 DI examinations, 250examinations were confirmed by confession.When utilizing cumulative evaluation

procedures, 167 (67%) of the examinationswere determined to be DI. Of the remainingexaminations, 75 (30%) examinations wouldhave been determined to be NO and eight (3%)would have been classified as false negative.

Table 2.Polygraph decisions in 1997 as they would be affected by two types of decision rules:

cumulative and spot totals. (n=250 confirmed deceptive examinations)

ResultsMethod DI NO NDICumulative 167 (67%) 75 (30%) 8Spot-Totaling 250 (100%) 0 0

DI = Deception Indicated NO = No Opinion NDI = No Deception Indicated

Combined Confirmed DI Examinations in1991 and 1997

Of the 701 total DI examinations, 358examinations were confirmed by confession.When utilizing cumulative evaluation

procedures, 241 (67%) of the examinationswere determined to be DI. Of the remainingexaminations, 107 (28%) examinations wouldhave been determined to be NO and ten (3%)would have been classified as false negative.

Page 46: VOLUME 28 NUMBER 1

Light

Polygraph, 1999, 28 (1) 41

Table 3.Polygraph decisions in 1991 and 1997 combined as they would be affected by two types of

decision rules: cumulative and spot totals. (n=358 confirmed deceptive examinations)

ResultsMethod DI NO NDICumulative 241 (67%) 107 (30%) 10 (3%)Spot-Totaling 358 (100%) 0 0

DI = Deception IndicatedNO = No OpinionNDI = No Deception Indicated

Discussion

This research indicates that thecumulative evaluation procedure has asignificant negative impact upon the outcomeof the ZCT. Applying the cumulativeevaluation procedure to the confirmed DIexaminations increased the NO rate to 30%.This NO ratio is significantly higher than the13% NO ratio attained by USACIDC fieldexaminers during the 1990 calendar year. The13% NO ratio includes all USACIDCexaminations, called truthful or deceptive,with all question formats for that time period.The USACIDC NO percentage, when NDIexaminations are not considered, was less that8% for the calendar year 1990.

A 30% rate of NO outcomes would havea significant negative impact upon an agencysupporting a criminal investigative mission.Attachment A outlines seven case historiesthat illustrate the problem. The case historiesclearly illustrate that persons who haveconfessed to homicide, child molestation,forcible rape, larceny, indecent assaults,frauds, and other felony offenses would nothave been correctly identified. Six of the sevenexamples provided were deceptive examineeswho would have been classified as NO. Thenumber of these suspects who would havefailed to confess subsequent to furtherpolygraph testing is unknown. However, thelikelihood of resolving issues throughtestimonial evidence is reduced with eachsubsequent series of questions required toobtain a conclusive opinion. Case history 6 isa troubling example. That guilty suspectwould have been misclassified as NDI of a

forcible rape without the spot-totaling decisionrule. The suspect may not have beeninterrogated at all, if the cumulative evaluationprocedure were accepted as the test dataevaluation procedure for USACIDC.

The rationale for the use of thecumulative evaluation procedure is theconcern for the false positive , that is, thepolygraph examination identified theindividual as having committed the offense,when in fact the person was not involved inthe incident. The concern is that, "theclassification used by those governmentagencies that employ the spot rule does notallow for this (false positive), since a subjectwith a minus three in any one overall spot-total is classified as practicing deception (andfiled accordingly) regardless of whether or nothe is truthful" (Capps and Ansley, 1991). TheUSACIDC polygraph program, since 1976,conducts a review of all final reports ofinvestigation in which a polygraph examin-ation was completed (Brisentine, 1991). Thesereports of investigations are compared to thepolygraph examination report that wasconducted in support of that investigation.These two reports are compared to assure thatthe polygraph examination is consistent withthe results of the field agent that conductedthe original investigation. This final review isto identify those examinations in which thecase agent's final report of investigation iscontradicted by the polygraph report. Once apolygraph report is identified as being possiblycontradicted by the final report of invest-igation, an attempt is made to identify thecause for this conflict. The number ofcontradicted examinations is approximately 6

Page 47: VOLUME 28 NUMBER 1

Scoring the Army ZCT

Polygraph, 1999, 28 (1) 42

per year. Those examinations havetraditionally been both false negative andpositive. The occurrence of a significant num-ber of false positive examinations in USACIDCis not in evidence.

The concerns of researchers that falsepositive examinations can occur in up to 50%of the examinations (Ben-Shakhar andFuredy, 1990) is not supported by theUSACIDC evidence. There is little doubt that iffalse positive decisions occurred as suspectedby some critics of PDD, such a significanterror rate would be identified through theadministrative processing of USACIDC PDDexaminations. The evidence is in sharpcontrast to these hypothetical projections.

The impact of the false positive for aspecific issue examination in a lawenforcement setting can be addressed byestablishing priorities based upon costs andutilities. This concept was recognized by Ben-Shakhar, Lieblich, and Bar-Hillel (1982), whenthey noted that one of the few situations inwhich polygraph tests could have positivecosts and utility is the police investigation.They recognized that the costs associated witha false positive in the criminal specificexamination are minimal while the utility issignificant. Ben-Shakhar and Furedy (1990),placed the costs associated with the falsepositive in the context of the USACIDC specificissue examination when stating theexamination is, "a case where the outcomewould only be a decision of the sort of whetherto continue the interrogation of a givensuspect or to release the person andconcentrate on alternate leads".

Summary

The use of a NO in test data evaluationis a necessary safety valve that precludesforcing a conclusive opinion when that opinioncannot be defended with the availablephysiological response patterns. However, ifUSACIDC field examiners rendered NOclassifications in over 30% of the examinationsconducted, it would be a disservice to theperson undergoing the examination and toUSACIDC field elements. The logic in utilizingthe spot-totaling numerical evaluationprocedure for criminal specific examinations isapparent for USACIDC. This study wasconsistent with the USAMPS review whichfound that the costs associated with theartificially created NO were too great whentaking into consideration that the actualoccurrence of the false positive is minimal. Asclearly demonstrated in this study, thecumulative evaluation procedure failed toconclusively identify numerous deceptivepersons, to include six sex offenders, 12 felonsinvolved in crimes against property, four drugoffenders, and five other felons involved incrimes against persons (these examples wereextracted from the 1991 data). In a criminalspecific issue examination, the concern for thefalse positive exists in the federal government,but there is a paucity of research whichindicates that its actual occurrence issignificant. The costs of the cumulativeevaluation procedure on the utility andaccuracy of the PDD process is high, and isconsidered too significant for an agencysupporting a law enforcement mission toaccept.

References

Backster, C. ( 1961) Report on Polygraph Technique Trends. Speech given at the 1961 Annual ASIReport on Polygraph Technique Trends. New York, NY.

Ben-Shakhar, G. & Furedy, J. (1990). Theories and Applications in the Detection of Deception. NewYork: Springer-Verlag.

Ben-Shakhar, G., Lieblich, I., and Bar-Hillel, M. (1982). An evaluation of polygrapher's judgments:A review from a decision theoretic perspective. Journal of Applied Psychology, 67, 701-713.

Page 48: VOLUME 28 NUMBER 1

Light

Polygraph, 1999, 28 (1) 43

Brisentine, R. (1991). Personal communication. Former Director United States Army CriminalInvestigation Command, Dundalk, MD Polygraph Program, personal conversation.

Capps, M. H., & Ansley, N. (1992). Analysis of federal polygraph charts by spot and chart total.Polygraph, 21(2), 110-131.

Capps, M. (1990). Evaluation of Two Federal Zone Comparison Score Systems. Unpublishedresearch.

Cole, R. (1991). Personal communication. Instructor Department of Defense Polygraph Institute,Fort McClellan, AL.

Decker, R. (1991). Personal communication. Former Director, United States Army Military PoliceSchool, Fort McClellan, AL.

Mason, P. L. (1998). Polygraph validity with urinalysis results as ground truth. Polygraph, 27(1),45-48.

Patrick, C. & Iacono, W. (1991). Validity of the Control Question Polygraph Test: The Problem ofSampling Bias. Journal of Applied Psychology, 76, 229-238.

Sneed, E. (1991). Personal communication. Former Instructor, Department of Defense PolygraphInstitute, Fort McClellan, AL.

Weaver, R. (1985). Effects of differing numerical chart evaluation systems on polygraphexamination results. Polygraph, 14, 1, 90 - 104.

Attachment A

The following are selected summations of the actual polygraph examinations extracted fromthe USACIDC files. These examinations were originally opined to be DI utilizing the spot-totalanalysis evaluation procedure. When applying the cumulative evaluation procedure, a NO or afalse negative determination resulted for each of the following examinations.

Case 1. Homicide

During a failed drug transaction, a drug dealer produced a .45 caliber automatic pistol andshot one of the two persons attempting to make the drug purchase. The victim died as a result of ahead wound. The other person attempting to purchase the drugs fled the area unharmed. Thedrug dealer, who did the shooting, was arrested for the murder. Additional information wasdeveloped that another person assisted and conspired with the drug dealer to kill the two personsduring the drug transaction. The drug dealer who shot and killed the individual declined to talkabout the incident. The suspected co-conspirator consented to undergo the examination. Duringthe post test phase, the individual admitted to planning to "rip-off" the victim by using the .45automatic. He further admitted to assisting the shooter in the actual robbery.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +2 +4 -4

Cumulative total = +2

Page 49: VOLUME 28 NUMBER 1

Scoring the Army ZCT

Polygraph, 1999, 28 (1) 44

Case 2. Child molesting

Two daughters alleged that their father had been sexually molesting them for several years.The children had made the same allegations three years earlier, but the allegations could not besubstantiated and the children continued to reside with their father. Based upon the newallegations made to a social worker by the children, the father underwent a polygraph examination.The father confessed after the examination to having committed the sexual acts reported by the twodaughters.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +2 -3 -3

Cumulative total = -4

Case 3. Theft

During the U.S. military operation "Just Cause," several barracks rooms on a U. S. Armyinstallation were broken into, and over $5,000.00 worth of property was stolen. The roomsbelonged to soldiers deployed to Panama in support of operation Just Cause. A soldier that was notdeployed was a suspect, and he consented to a polygraph examination. After failing theexamination, the soldier admitted to stealing the master keys to the rooms and to assisting threeother soldiers in stealing the property.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +7 -4 +1

Cumulative total = +4

Case 4. Sexual harassment

A female alleged that, while she was at work, a male employee sexually assaulted her bykissing her without her permission and, subsequently, by touching her around the private parts ofher body. She alleged that all of these acts occurred against her will. A polygraph examination wasconducted of the male employee who confessed after testing to having fondled the victim as alleged.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +1 +2 -3

Cumulative total = 0

Page 50: VOLUME 28 NUMBER 1

Light

Polygraph, 1999, 28 (1) 45

Case 5. Fraud

A person reported that jewelry and money in the amount of $350.00, was stolen from herroom while she was on vacation. The facts surrounding the theft, as related by the complainant,were not consistent with those related by witnesses. Subsequent to the polygraph examination, thealleged victim confessed to having fabricated the theft of the property in order to make a false claimfor reimbursement.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +7 0 -4

Cumulative total = +3

Case 6. Rape/Indecent Assault

A female alleged that while at a party she became highly intoxicated and passed out. Shestated that when she woke up she found that her panties had been removed. The female providedthe name of a male soldier who was present in the room prior to her passing out. A polygraphexamination was administered of the male who subsequently admitted to engaging in sexualintercourse with the female against her will.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +7 +4 -4

Cumulative total = +7

Case 7. Wrongful Use of a Controlled Substance (Urinalysis)

During a command directed urinalysis test, a soldier rendered a urine sample which testedpositive for the presence of marihuana. The soldier stated he was given and smoked a cigar thatcould possibly have contained marihuana. The soldier denied knowing the cigar containedmarihuana at the time he smoked the cigar. The soldier consented to a polygraph examination andlater admitted he knowingly consumed marihuana.

Spot I Spot II Spot IIIQuestion #5 Question #7 Question #10

Spot Scores +2 -4 +3

Cumulative total = +1

Page 51: VOLUME 28 NUMBER 1

Scoring Systems for the Matte Techniques

Numerical Scoring Systems in the Triad of Matte Polygraph Techniques

James Allan Matte

The Matte family of polygraph techniques consist of the Matte Quadri-Track Zone Comparison Technique which is a single-issue test, the Matte Quinque-Track Zone Comparison Technique which is an exploratory multiple-issue test, and the Matte Suspicion-Knowledge-Guilt (S-K-G) test designed to identify the examinee who has major involvement, some direct involvement, or guilty knowledge regarding a specific issue.

The numerical quantification system in the analysis of the physiological data used in the aforementioned triad of Matte polygraph techniques employs the Backster chart interpretation rules (See pages 398-406, Matte 1996), with some minor changes described herein.

To attain an objective measure of the reactions or lack of reaction to each relevant and control question in each of the three tracings (pneumograph, electrodermal, cardiograph), the numerical scoring system designed by Cleve Backster (Backster 1963) is used in the triad of Matte polygraph techniques. This system provides the forensic psychophysiologist (FP) with a means of objectively evaluating each relevant question versus its neighboring control question, hereafter referred to as a spot (control vs. relevant), in each tracing according to chart interpretation rules with penalties for violation of those rules, by the assignment or scoring of each spot with a number from a seven-position scale.

Value +3 MT Maximum Truthful Score +2 T Truthful Score +1 t

O? -1 d -2 D -3 MD

Minimum Truthful Score

Minimum Deception Score Deception Score Maximum Deception Score

Numbers preceded by a minus sign fall into the deceptive area; numbers preceded by a plus sign fall into the truthful area.

The following are departures from the Backster Rules. Most of these departures were implemented prior to 1980 and all of them were in effect prior to the validity study of the Matte Quadri-Zone Comparison Technique (subsequently renamed Quadri­Track ZCT) published in 1989, with the exception that as a result of this validity study, the threshold or minimum score required to reach a conclusion for the truthful in the Matte Quadri-Track ZCT was reduced from +4 per chart to +3 per chart.

Backster "Either-Or" Rule:

To arrive at an interim spot analysis tracing determination of (+2) or (-2) there must be a significant and timely tracing reaction in either the red zone (relevant) or green zone (control) being compared.

Backster ZCT: If the red zone indicates a lack of reaction, it should be compared with the neighboring green zone containing the larger

The author is the owner of Matte Polygraph Services, Inc., and author of Forensic Psychophysiology Using the Polygraph' Scientific Truth Verijication - Lie Detection. Reprint requests should be addressed to Dr. James Matte, Matte Polygraph Services, 43 Brookside Drive, Williamsville, NY 14221-6915, or to his e-mail [email protected].

Polygraph, 1999, 28 (1) 46

Page 52: VOLUME 28 NUMBER 1

timely reaction. If the red zone indicates a timely and significant reaction it should be compared with the neighboring green zone containing no reaction or the least reaction.

Matte OTZCT: Red Zone is always compared with the green zone question preceding it. the red zone questions are switched in position with each chart conducted, thus are alternately compared against each green zone question. (Matte 1996)

Tracings Included: respiration, electrodermal, and cardiograph.

Backster "Green Zone 'Yes' Answer Penalty Rule:

If a "yes answer is given to a green zone question which is a reversal of the answer given during the pretest question review, that green zone cannot be used as a spot analysis 'presence-of-reaction" zone.

A green zone involving such an answer reversal can be used as a spot analysis "lack­of-reaction" zone where no reaction, or a reaction significantly smaller than the red zone reaction, is indicated.

Backster ZCT: Such use should be avoided if another adjacent lack-of reaction green zone, properly answered, is available.

Matte QTZCT: The forensic psychophys­iologist cannot jump to another Track to make a comparison. See Figure 3 for Primary, Secondary, and Inside Tracks.

Tracings Included: respiration, electrodermal, and cardiograph.

Backster "Green Zone Abuse" Rule:

If the intensity of a green zone reaction appears to be at least four times as dramatic as a minor reaction in the red zone, it is not proper to feature the minor red zone reaction and compare it with the other neighboring green zone which may show a lesser reaction or no reaction.

Matte OTZCT: This rule does not apply to the Quadri-Track or Quinque-Track ZCT

Polygraph, 1999, 28 (1) 47

Matte

inasmuch as the FP must compare each red zone question to the preceding green zone question, and cannot jump to another track to make a comparison. (Matte 1996)

Tracings Included: respiration, electrodermal, and cardiograph.

Backster "Question Pacing" Upgrading Rule:

To upgrade a ((+2) or (-2) interim spot analysis rating to a (+3) or (-3) final spot rating each of the two zones being inter­compared must embrace a minimum of twenty seconds and a maximum of thirty-five seconds.

Note: Question pacing is measured from the first word of one question to the first word of the question that follows.

Matte OTZCT: Requires a muumum of twenty-seconds between the answer to a test question to the commencement of the next question that follows it, not to exceed thirty seconds.

Tracings Included: respiration, electrodermal, and cardiograph.

Matte "Dual-Equal Strong Reaction" Rule: An Exception to the Backster's One-to-One Rule.

When the red and green zones being inter­compared both contain timely, specific and significant reactions of maximum and equal strength, a minus one (-1) score is assigned to that spot.

Tracings Included: respiration, electrodermal, and cardiograph.

Note: When there is a presence of mild reaction which would warrant only a minimum score of - / + 1 in both the relevant question and its neighboring control question respectively of equal magnitude, such as in Figure 1, where there is no presence of parasympathetic activation (questions 46-33) a numerical value of zero must be assigned to this spot in the respiration tracing. However when there is a presence of strong reaction which would be manifested by distinct

Page 53: VOLUME 28 NUMBER 1

Scoring Systems for the Matte Techniques

activation of both sympathetic and parasympathetic systems in both zones being inter-compared of equal magnitude (Figure 2), a minimum deception score of -1 must be given to this spot. This rule applies only to the pneumograph and cardiograph tracings, not the electrodermal. The electrodermal tracing is excluded because it is more volatile and sensitive to extraneous stimuli.

The aforesaid rule is based on the premise that both zone questions appear to be equally threatening to the examinee, the

degree of threat being proportionate to the degree of the responses, which indicate that while the examinee may be attempting deception to the relevant question, its neighboring control question may be too intense due to faulty structure, embraces an equally or more serious unknown crime, or a countermeasure attempt was made. A sophisticated guilty examinee may be able to cause a reaction on the control question but cannot control an oncoming reaction to the relevant question.

Figure 1 Equally MUd Reactions

1 1 -46 (control) 33 (relevant)

Figure 2 Equally Strong Reactions

1 -46 (control)

Appendix 1 depicts the Matte Quadri­Track Zone Comparison Test structure which shows that the vertical score tallied from spots 1, 2 and 3 are combined for a total score, inasmuch as all spots deal with the same single issue. Appendix 2 depicts the Tri-Spot Quantification System for the Quadri-Track ZCT. and Figure 3 shows the Conclusion Table from which a determination is made as to Truth, Indefinite (Inconclusive),

Polygraph, 1999, 28 (1) 48

33 (relevant)

or Deception from the total scores tallied from spots 1, 2, and 3. It should be noted that in the Matte Quadri-Track Zone Comparison Technique and the Matte Suspicion­Knowledge-Guilt (SKG) tests, a minimum of two polygraph charts (tests) must be conducted to reach a conclusion of truth, deception or inconclusive, and in the Matte Quinque-Track Zone Comparison Technique, a minimum of three polygraph charts must be

Page 54: VOLUME 28 NUMBER 1

conducted to reach a conclusion. This requirement increases external reliability. The American Polygraph Association standards require that a minimum of two charts be conducted to reach a determination

Matte

of truth or deception. (See Appendices 1 and 2 for the Matte Quadri-Track Zone Comparison Test Structure and Tri-Spot Quantification System).

Figure 3

CnNCTJIC:TnN TART r:-RESULTS FOR 1 CHART CIRCLE APPROPRIATE NUMBER BELOW

+ 27 to + 3 + 2 to -4 -5 to -27 TRUTH Lreur.r Lre. :t;

RESULTS FOR 2 CHARTS _ CIRCLE APPROPRIATE NUMBER BELOW + 54 to + 6 + S to -9 -10 to -54 TRUTH INDEFINITE

RESULTS FOR ) CHARTS _ CIRCLE APPROPRIATE NUMBER BELOW +81 to + 9 + 8 to -14 -IS to -81 TRUTH lITE

RESULTS FOR 4 CHARTS - CIRCLE APPROPRIATE NUMBER BELOW + lO8to + 12 +11 to -19 -20 to -108 TRUTH

Matte Quinque-Track Zone Comparison Technique:

Appendix 3 shows the Matte Quinque-Track Zone Comparison Technique test structure. The Matte Quinque-Track ZCT is an Exploratory Multiple-Issue test, thus all four spots are vertically and independently scored but cannot be horizontally tallied because each spot deals with a different issue. A minimum of three charts (ideally four) must be conducted to attain a minimum number of comparisons for each spot. Appendix 4 is the Matte Quinque­Track ZCT Quantification system, including its Conclusion Table.

Matte Suspicion-Knowledge-Guilt (SKG) Test:

The Matte Suspicion-Knowledge-Guilt Test, hereafter referred to as the S-K-G test is designed to provide the forensic psycho­physiologist with a single test capable of identifying the examinee who has major involvement, some direct involvement, or guilty knowledge, yet containing similar controls to that found in the Matte Quadri­Track Zone Comparison Technique. For detailed information about the SKG test, consult Matte (1996, chapter 17).

Polygraph, 1999, 28 (1)

INDEFINITE DECEPTION

49

Appendix 5 shows the SKG Test structure and quantification system. It should be noted that test question number 31 is treated as a control question. Question number 31, which relates to unfounded suspicion on the part of the examinee, is not a relevant question but rather a control, inasmuch as it can be readily assumed that an examinee who shows no reaction to any of the relevant questions but shows a reaction to suspicion can be excluded as a participant or witness in the crime. Therefore, the forensic psychophysiologist can choose between control question 48, which is a control question which encompasses both periods normally covered by control ques­tions 46 and 47 in the Quadri-Track ZCT, and control question 31 for the greatest physiological evidence of sympathetic and parasympathetic action, which the FP then uses as the control question to compare against relevant questions 42, 34, 33, and 32 individually.

There will be few occasions when question 31 (control) will exceed control question 48 in overall sympathetic/para­sympathetic activity. However, its inclusion is necessary to offset the chance that an examinee's suspicion of someone may be so strong that, without the suspicion question

Page 55: VOLUME 28 NUMBER 1

Scoring Systems for the Matte Techniques

on the test, he or she may show reaction on the knowledge question because there is no

other question in that general category to relieve the energy.

Figure 4 below shows the Comparison Score Table.

Figure 4

Comparison Score Table

Relevants 42 34 33 32 24

Controls 48or3l 48or3l 48or3l 48or3l 23

Tally (+/- (+/- (+/- (+/- (+/-

Figure 5 below reflects the SKG Conclusion Table. It should be noted that a minimum of two polygraph charts, as in any

test, must be conducted to insure external reliability, before a determination of truth or deception can be rendered.

Figure 5

S-K-G Conclusion Table

Results For 1 Chart: Circle Appropriate Number Below

+2 Or More Truth

+1 To-2 Inconclusive

-3 Or More Deception

Results For 2 Charts: Circle Appropriate Number Below

+4 Or More Truth

+3 To-5 Inconclusive

References

-6 Or More Deception

Backster, C. (1963). Backster Standardized Polygraph Notepack and Technique Guide. New York: Backster School of Lie Detection.

Backster, C. (1990). Backster zone comparison technique chart analysis rules. Handout at lecture of the 25th annual seminar/workshop of the American Polygraph Association, 14 August 1990, Louisville, Kentucky.

Matte, J. A. (1996). Forensic Psychophysiology Using the Polygraph: Scientific Truth Verification - Lie Detection. Williamsville, New York: J.A.M. Publications.

Matte, J. A. (1989). A field validation study of the Quadri-Zone Comparison Technique. Polygraph, 18(4), 187-202.

Matte, J. A. and Reuss, R. M. (1989). Validation Study on the Quadri-Zone Comparison Technique. Research Abstract, LD 01452, Vol. 1502, 1989, University Microfilm International.

Polygraph, 1999, 28 (1) 50

Page 56: VOLUME 28 NUMBER 1

d' <E'

~ r

j 1&1 ..:::

C/1 ~

Appendix 1

MATTE QUADRI-TRACK ZONE COMPARISON TEST STRUCTURE

(Cannot jump track to make comparison)

OUTSIDE PRIMARY SECONDARY INSIDE

PNEUMOTRACING~ TRACK TRACK TRACK TRACK

c:: .0.; 0 c::: 0 III c:: III c::

:~ ..... > 0 > 0 ..,~ ..... ..... ..... ..... c:: c:: > ~.o..: til III til c:: ..,

]J .., 1-0 0 1-00

ELECTRODERMAL III ~tIl III ::l ::l 0 til til 0 ..... 0 .....

(GSRlGSG) TRACING .-< c:: 0. III ::l til .-< ..... III III 1-0 .., 1-0'" O'tIl " .. ::l ~ t ::l 1-0 '" 1-0'"

H ><: rJl 0' 0' ~ III ~III 1-0 0 Q. " ~III at

I>l~~ at .... 5 ::l

H ..... -- .. ..... 111 75- C:: .. c:: .., .... 0' '-' III c:: "'"'" o c:: o c:: 0 0

Nv~Mf.Nv::'~ ~ ..... "'" 1-0 ~ "'" 1-0 ~ .-< '-' E rJl 111'-< '-'> ~~

.., > 1-0 0 III c:: CARDIO TRACING f ::l ~ ~ 0'" :J 0 VlOJ Vl III ~ 1-0 0."

.. ::l III ... .-< ~~ .-< OJ .. 0> 0.0 ....... III III ~ c:: ;t:1II

::l ,,'" E~ > c:: '" >J '" 0 .-< OJ ~

,., III 0 :c:: u III Z Vl Vl ",u '"

QUESTION NUMBER 14J 39 25 46 33 47 35 23 24

COLOR CODE Y YR B G R G R Gw Rw TRI-ZONE COMPARISON ZONE ZONI ZONE ZONE tzONE !Z.ONE ZONE COLOR LEGEND: ZONES SPOT ONE SPOT TWO ~p?_T~~~' - - .- - - - - -- - .- .. . _ .. - ----- -- - -----

w Note: White (w) suffix to a

Gw

Indicates Zone is influenced by Zones in Spots #1 and #2 Inside Issue Control Question (Variable strength)

Zone places that Zone in the Inside Track to recoup response scores lost as a result of an Inside Issue.

THREE SPOTS SCORED AND TALLIED FOR

Rw

YR Y

Inside Issue Relevant Question (Variable strength) Sacrifice Relevant Question Neutral Question (Irrelevant)

A GRAND TOTAL = TRUTH, DECEPTION, INCONCLUSIVE

TRACK Identifies a pair SPOT Identifies a Track of questions related for which is quantified. comparison/quantification (G & R Zone) or evaluation (B Zone). © 1995 by James Allan Matte

PUTSIDE TRACK

c:: 0 ..... ..,~

til III III ::l ::l til

0'00 H

" ..... 111

"'"'" ., ..... E 00 0'" .... ::l 0.0 e~ ;>, Vl

26

B ZONE

s= a r+ ~

Page 57: VOLUME 28 NUMBER 1

~ cE'

~ :;r

-~ I~ .=::

CI1 tv

;)rt:.L-l

TRUTH

::! +3 +2

GSR 33 +3 +2

CAR 33 +3 +2

SPEC-2 TRUTH

:i +3 +2

GS 33 +3 +2

CA 33 +3 +2

SPEC-3 TRUTH

~i +3 +2

GS 33 +3 +2

CA 33 +3 +2

SPEC-4 TRUTH

~~ +3 +2

GS 33 +3 +2

CA 33 +3 +2

TARGET (

GRAND TOTAL:

SPOT ONE

INDEF

+1 0 -I

+1 0 -I

+1 0 -I

INDEf

+1 0 -I

+1 0 -I

+1 0 -I

INDEF

+1 0 ·1

+1 0 -I

+1 0 -I

INDEf

+1 0 -I

+1 0 -I

+1 0 ·1

TOTAL:

FOR ( ) CHARTS.

DECEP

-2 -3

-2 -3

-2 -3

DECEP

-2 ·3

-2 -3

-2 -3

DECEP

-2 -3

-2 -3

-2 -3

DECEP

-2 -3

-2 -3

-2 -3

Appendix 2

}1atte Quadri-Track Zone Comparison Test

Tri-Spot Quantification System

SPOT TWO

TRUTH INDEF DECEP

" ( ) ~ +3 +2 ... 0 -I -2 -3 " ( )

~ " ( ) 35 +3 +2 +1 0 -I -2 -3 c ( ) 24 c ( ) ® +3 +2 +1 0 -I -2 ·3 "' ( ) 24

TRUTH INDEF DECEP c ( )

~ +3 +2 +1 0 -I -2 -3 c ( )

~ = ( ) 35 +3 +2 +1 0 -I -2 ·3 "' ( ) 24 " ( ) 35 +3 +2 +1 0 -I -2 ·3 "' ( ) 24

TRUTH INDU DECEP

"' ( ) i +3 +2 +1 0 -I -2 ·3 . ( )

~ "' ( ) 35 +3 +2 +1 0 -I -2 ·3 "' ( ) 24 = ( ) 35 +3 +2 +1 0 -I -2 ·3 "' ( ) 24

TRUTH INDEF DECEP

" ( )

~ +3 +2 +1 0 -I -2 -3 = ( )

~ = ( ) 35 +3 +2 +1 0 -I -2 -3 "' ( ) 24) = ( ) 35 +3 +2 +1 0 ·1 -2 ·3 " ( ) 24)

( ) TOTAL: ( )

SPOT THREE

TRUTH INDEF DECEP

+3 +2 ... 0 -I -2 -3

+3 +2 +1 0 -I -2 -3

+3 +2 +1 0 -I ·2 -3

TRUTH INDEF DECEP

+3 ··2 +1 0 -I -2 -3

+3 +2 +1 0 -I -2 -3

+3 +2 +1 0 -I ·2 ·3

TRUTH INDEF DECEP

+3 +2 +1 0 -I -2 -3

+3 +2 +1 0 -I -2 ·3

+3 +2 +1 0 -I -2 ·3

TRUTH INDEF DECEP

+3 +2 +1 0 -I -2 ·3

+3 +2 +1 0 -I -2 ·3

+3 +2 +1 0 -I -2 ·3

TOTAL:

" ( )

" ( )

EO ( )

" ( )

EO ( )

" ( )

"' ( ) EO ( )

,. ( ) -

EO ( )

EO '( )

"' ( )

C/J n

j. ~ ~ (1)

~ 0' '"1

~ s= ~ ~

~ n

[ .g

(1) (IJ

Page 58: VOLUME 28 NUMBER 1

d' <E'

~ :i'" -10 10 :0 I~ ..::

C/l VJ

Appendix 3

MATTE QUINQUE-TRACK ZONE COMPARISON TEST STRUCTURE (EXPLORATORY)

(Cannot. lumP t.rack t.o make comparison) OUTSIDE FIRST SECOND mlRD FOt:RTH TRACK TR.o\CK TRACK TR-\CK TRACK

P?'iEUMOTRACJ~G~ .. > .. .. c .... > > '"' ... 0 C 0 ., .... .... .... c

C ... 0 .... ::> C ., C til C C ..

:~ .. .... : c ::> 0 c ::> 0 C ... >

> .. ~ U'" 0 ........ 0 ........ :: .. .. ELECTRODERMAL .. c ~ ~ .... oc"" ... u .. .... u ... ... C C .... c

"" "-l'J'. ... oc til ... oc ., .. :: 0 .. ::

(GSRlGSG) TRACING ~ ~ ~ 6- a til .. :n I<lti II! I<lti '" u .... .,,: .... ., ":6 ti " .. " '" ... .. ... .. C>- u .... " -a " -a 0- 1 '" '" .... ., ........ .... c Cf .. Cf ... .. '" .. .. C '"'''' ..... C ..... C .... C" .. "

- ::> u .. ...., ... c oJ .. 0 .. .. 0 ... 00' 000'

CARDIO TRACING M]~ ¥ e .... ... ... g ...... c ... I- C ... .., 0" ::>"' ..... .. ..... " u .. ...... uc > "c > ::> C > .... .... Co::> 1 C '" uo .. uo :I Co :J

::> ua:: ~e CU ..... IU ..... lU .... til 0 .. .. 0 <lJ C .. C '" " c z I/) I/) z '" 0 '" 0 a:: III ~

Z Z

Qt:ESTlO~ NUMBER 14J 39 15 46 45J 47 45K 48 45L 031 032

icOLOR CODE Y YR B G R G R G R G R tnu.ZO?'iE COVIPARISON ZO~E IiQ.1\iE ~ONE ZD."I[ ZON'f 7fl?'iE tzONI O~EI tzmlE mLOR LEGEND: ZONES \SPOT ONE ~OTTWO ~!T TH~ ~rT FOUR "" - - -- - .- .". 'NALYSI A!liALYSI ?'iAL YSIS NALYSI~

~SCORE SCORE SCORE SCORE

+- +- +- +-R Relevant Question (Strong) 3. Red (Strong Relevant) YR Sacrifice Relevant Question TRACK Identifies a pair Y Neutral Question (Irrelevant) of questions related for

SPOT Identifies a Track which is quantified.

OUTSIDE TRACK

C 0 ....~

.. OJ ., " .. '" ::> II: a .... u .. .... .., .. .... ~ ~ o " "'0 Co ......

~ I/)

16

B ZO~E

o Prefix- Option Use. comparison (G&R Zone) or evaluation (B Zone)

Note: Fourth Track includes Suspicion Question 031 Green Zone versus Knowledge Question 032 Red Zone.

ABOVE FOUR SPOTS ARE VERTICALLY AND INDEPENDENTLY SCORED BUT CANNOT BE HORIZO~TALLY SCORED FOR A TOTAL TALLY BECAUSE EACH SPOT D£.ALS WITH A DIFfERENT ISSUE. A MINIMU!\I OF THREE CHARTS (IDEALLY FOUR) MUST BE CONDUCTED TO ATTA~ A MINIMUM Nt:MBER OF COMPARISONS FOR EACH SPOT.

© 1995 by James AUan Matte ~ a ;

Page 59: VOLUME 28 NUMBER 1

Scoring Systems for the Matte Techniques

CHARTl TI PO 45J +3+2 GR 45J +3+2 CO 45J +3+2

CHART 2 TI PO 45J +3+2 GR 45J +3+2 CO 45J +3+2

CHART 3 TI 11'0 4!>J +J+l GR 45J +3+2 CO 45J +3+2 TARGET (

Appendix 4

MATIE QUINQUE-TRACK ZONE COMPARISON TECHNIQUE

TARGET ( ) USED ON CHART NR. 14J WERE YOU BORN IN THE U.S.? I13L I13F

(Last Nam~ i(First Name) RE: WHETHER OR NOT (YOURSELF) DO YOU INTEND TO ANSWER TRUTH

39 TRUTHFULL Y EACH QUESTION AB~TTHAT?

ARE YOU COMPLETELY CONVINCED THAT I WILL NOT ASK YOU AN 25 UNREVIEWED QUESTION DURING THIS CHART?

46 BETWEEN THE AGES OF ( ) AND ( ) - DO YOU REMEMBER:

45J DURING THE FIRST ( ) YEARS OF YOUR LIFE - DO YOU REMEMBER:

47

45K DURING THE FIRST ( ) YEARS OF YOUR LIFE - DO YOU REMEMBER:

48

45L

031

032 IS THERE SOMETHING ELSE YOU ARE AFRAID I WILL ASK YOU A

26 QUESTION ABOUT, EVEN THOUGH I TOLD YOU I WOULD NOT?

QUANTIFICATION SYSTEM

~ ---- . -QUINQUE-TRACK ZONE COMPARISON TECHNIQUE

INDEF DI TI INDEF IH TI INDEF DI TI INDEF DI +1 0-1 -2-3( 145K +3+2 +1 0-1 -2-3( 45L +3+2 +1 0-1 -2-3( 32 +3+2 +1 0-1 -2- (

+1 0-1 -2-3( j45K +3+2 +1 0-1 -2-3( 45L +3+2 +1 0-1 -2-3( 32 +3+2 +1 0-1 -2- ( +1 0-1 -2-3( ~5K +3+2 +1 0-1 -2-3( 45L +3+2 +1 0-1 -2-3( 32 +3+2 +1 0-1 -2-3(

INDEF DI TI INDEF D1 TI INDEF DI TI INDEF D1 +1 0-1 -2-3( )45K +3+2 +1 0-1 -2-3 )r+ 5L +3+2 +1 0-1 -2-3 32 +3+2 +1 0-1 -2-3( +1 0-1 -2-3( )45K +3+2 +1 0-1 -2-3 )r+ 5L +3+2 +1 0-1 -2-3r- 32 +3+2 +1 0-1 -2-3( +1 0-1 -2-3 )45K +3+2 +1 0-1 -2-3 )~5L +3+2 +1 0-1 -2-3 32 +3+2 +1 0-1 -2-3(

INDEF DI TI INDEF DI TI lNDEF DI TI INDEF DI +1 u-l -l-", 45K +3+l +1 0-1 -2-3 ,45L +3+2 +1 0-1 -2-3 ;32 +3+2 +1 u-l -2-3 +1 0-1 -2-3 )~5K +3+2 +1 0-1 -2-3 ) 45L +3+2 +1 0-1 -2-3 32 +3+2 +1 0-1 -2-3( +1 0-1 -2-3 )~5K +3+2 +1 0-1 -2-3 )~5L +3+2 +1 0-1 -2-3 )32 +3+2 +1 0-1 -2-3(

TOTAL ( TOTAL ( TOTAL ( TOTAL (

CONCLUSION TABLE NDI INCL DI 4SJ 4SK 4SL 031

THREE CHARTS +27 to +5 +4 to -7 -8 to -27 ( ) ( ) ( ) ( ) FOUR CHARTS +36 to +7 +6 to -9 -10 to -36 ( ) ( ) ( ) ( ) FIVE CHARTS +45 to +9 :+-8 to -11 -11 to -45 ( )( )( )( ) Each of the 4 SPOTS (45 series and 032) are scored, tallied and evaluated separately against the Conclusion Table according to the number of charts conducted. STIMULATION TEST DATA - NUMBER SELECTED: __ CHART NR: ____ _

Polygraph, 1999, 28 (1) 54

) ) )

) ) )

~ )

Page 60: VOLUME 28 NUMBER 1

Appendix 5

S-K G Test I 13L I Used on Chart Nr. Are you completely convinced that I will not ask you an unrcviewed question during

~ this chart? Regarding the: Do you intend to answer truthfully each question about

39 that?

Before that occurred -42 Same as above Did vou definitely know it was about to happen?

At the very time that 34 Same as above occurred - were you

(onJhe scene) 48 During the first ( ) years of your life - Do you remember:

33 Same as above Did you (yourself)

32 Same as above Do you know for sure (who)

31 Same as above Do you suspect anyone in particular of

23 Are you afraid an error will be made on this test regarding: (whether or not you were involved in this crime)

24 Are you hopeful an error will be made on this test regarding: (whether or not you were involved in this crime) --

26 Is there something else you are afraid I will ask you a question about, even though I told ~ou I would not?

S-K-G OUANTIFICATION SYSTEM SCORE TABLE K -I

PNE 42 Score ( ) GSR 42 Score ( ) CAR 42 Score ( SKG-2

) )

42 Total ( ) 34 Total ( ) 33 Total ( ) 32 Total ( 24 Total ( ) S-K-G CONCLUSION TABLE

RESUL TS FOR 1 CHART: TRUTH INCONCLUSIVE DECEPTION +~ +

RESULTS FOR 2 CHARTS: TRUTH INCONCLUSIVE DECEPTION +4 r r +3 0-5

S-K-G CONCLUSIONS: TRUTHFUL to relevant question(s) _____ _ DECEPTIVE to relevant question(s) _____ _

TARGET () INCONCLUSIVE to relevant question(s) ____ _

Polygraph, 1999, 28 (1) 55

Matte

Page 61: VOLUME 28 NUMBER 1

Horizontal Scoring

Polygraph, 1999, 28 (1) 56

The Academy for Scientific Investigative Training's HorizontalScoring System and Examiner's Algorithm System for Chart

Interpretation©

Nathan J. Gordon

Abstract

The Horizontal Scoring System was developed at the Academy for Scientific Investigative Training in1981 as a means of eliminating subjectivity in scoring and the skewing of test results by thesubjective selection of control questions for comparison to the relevant question. It allows forobjective numerical chart analysis, while eliminating the subjectivity of the assignment of numbersfrom the traditional seven-position scale, and the additional subjectivity of the selection of whichcontrol question to use for comparison to the relevant question. The Examiner Algorithm Systemwas developed at the Academy in 1997, and utilizes discrete measurements which allow theexaminer to accurately and consistently determine what constitutes the greatest reaction.

In 1963, Backster developed anumerical scoring system where valuesranging from a +3, to a -3, were assigned toeach relevant question's independentphysiological parameter after it was comparedto those same parameters of a control questionselected by the examiner. The decision ofwhether this "control vs. relevant" comparisonyielded no difference (0), a slight difference(±1), a clear difference (±2), or a huge vs. nodifference (±3), and the corresponding score forthe comparison is left to the subjectivity of theindividual examiner. There are alsodifferences in opinion on how the examinershould select the control question forcomparison to the relevant question (Weaver,1980). The Army compares the relevantquestion to the strongest control question,skewing their test toward truthfulness.Backster compares the relevant question tothe weakest control question, skewing the testtoward non-truthfulness. The University ofUtah compares the relevant question to thecontrol question preceding it, not skewing thetest outcome toward any direction. Weaveralso pointed out that the matter is furthercomplicated in that there are clear differencesamong these three groups as to whatconstitutes a reaction. The Army's position isthat any change from the norm constitutes a

reaction, while Backster clearly discriminatesbetween what he defines as reaction and relief,and Utah takes a position somewhere inbetween.

With so much subjectivity anddifference in opinions among major groups inour profession, it is not difficult to imaginethat examiners of different schools of thoughtcould have differences of opinion analyzing thesame polygraph charts, even though they areall using "numerical analysis." Theseconditions led to our search to eliminate thesubjectivity of what number to assign from thetraditional seven-position scale, and whichcontrol question to select for comparison tothe relevant question and to the developmentof the Horizontal Scoring System (Gordon &Cochetti, 1987).

In the Horizontal Scoring System thesubjectivity of control question selection istotally eliminated by the examiner comparingall of the control and relevant questions ineach individual parameter, creating ahierarchy of the greatest to least reaction. Forexample, in a standard Backster "You Phase"format there is a maximum of three controlquestions (numbers 46, 47, and 48) and threerelevant questions (numbers 33, 35 and 37),

The author is the Director of the Academy for Scientific Investigative Training. Requests for reprints should be forwarded toNathan J. Gordon, Academy for Scientific Investigative Training, 1704 Locust St, 2nd Floor, Philadelphia, PA 19103.

Page 62: VOLUME 28 NUMBER 1

Gordon

Polygraph, 1999, 28 (1) 57

making a total of six reactions being comparedin each individual parameter. The examineremploying Horizontal Scoring identifies thegreatest reaction in thoracic breathing, andassigns it a six (6), the next greatest a five (5),the next greatest a four (4), the next greatest athree (3), the next greatest a two (2), and thesmallest reaction a one (1).

The same process is then repeated inthe abdominal breathing patterns, theelectrodermal patterns and the cardiographpatterns. After all the hierarchies have beenestablished, each question's pneumographscores (thoracic and abdominal) are averaged,and then added to the electrodermal score andthe cardiograph score, resulting in a totalquestion score for each of the control andrelevant questions. This total question scoreis then assigned a plus (+), in each of thecontrol questions, and a minus (-), in each ofthe relevant questions. Since the Backster"You Phase" is a single-issue technique, thescores can be combined for a total chart score.

In a general question technique, suchas Reid, Arther or the MGQT, there are fourrelevant questions and two control questionsof varying weights. The same ranking of six toone can be performed, since one is stillcomparing six questions in each parameter. Iftwo pneumographs are being used, we wouldagain average them for a single pneumographscore, and again add them to theelectrodermal and cardiograph score for eachquestion for a total question score. We wouldagain assign a plus (+) to the control questionscores and a minus (-) to the relevant questionscores. However, we cannot combine thescores since the relevant questions of inquirydo not represent a single issue. We must nowcompare each relevant question's total scorewith the total score of the control questionwith which it would traditionally be compared.The difference between these scores willrepresent that relevant question's final score.For example, if relevant question #3 had atotal score of -15, we would compare it tocontrol question #6, which had a total score ofa +9, and derive the final score for relevantquestion #3 as a -6: (-15) - (+9) = -6. Wewould follow the same process to evaluaterelevant question #5, and then use the sameprocess to compare relevant questions #8 and9, to control question #10.

In situations where there are equalreactions between questions in the parameterbeing scored, we average the scores of thepositions they are vying for. For example, wehave identified the greatest reaction in a givenparameter, and assign it a 6. There are nowtwo equal reactions competing for ranks of 5and 4. We average those two numbers (5+4 =9; 9/2 = 4.5) and assign each a 4.5.

The cutoffs we are currently using are±1.5 per relevant question, per chart. Forthree charts, with three "single issue" relevantquestions we use a ±13, and for three chartswith two "single issue" relevant questions a ±9. For a single question (i.e., relevant question#3 in MGQT) we use a ±3 for two charts, and a±4.5 for three charts. These numbers need tobe reevaluated to determine if adjusting themwould result in even more accurate results.

Previous research has shown theHorizontal Scoring System to be highly validand reliable (Horvath, 1985, and, Driscoll &Honts, 1987). Both studies concluded thatHorizontal Scoring was as accurate astraditional scoring, and the latter study statedthat Horizontal Scoring was much easier toteach and apply. While Horizontal Scoringsucceeded in removing the subjectivity of whatnumber from a seven-position scale to apply,and which control question to select forcomparison to the relevant question, it still leftthe question of what constitutes the "greatest"reaction to the subjectivity of the individualexaminer.

In 1997, we devised a mathematicalmethod (algorithm) for measuring changes foreach physiological parameter, which in itselfreflects the degree and importance of thepsychophysiological reactions occurring . Thealgorithm works independently of anyunderstanding of the psychological orphysiological basis of why, or what is actuallyhappening within the parameter. It isdesigned to simply apply objectivemathematical equations to measure what isoccurring, and remove examiner subjectivity.Since examiners are applying the sameformula to determine the degree of reaction,reliability between examiners' interpretationsof the same charts is dramatically increased.Examiner experience in chart interpretation isthereby negated.

Page 63: VOLUME 28 NUMBER 1

Horizontal Scoring

Polygraph, 1999, 28 (1) 58

Pneumograph

I initially theorized a method forinterpreting the pneumograph that was a dualsystem monitoring changes in pneumographsuppression (PS) and pneumograph duration(PD). I strongly believe these two reactionsreflect the major changes occurring in thisparameter:

(PS + PD) = pneumograph reaction.

To confirm my theory, students at theAcademy for Scientific Investigative Trainingwere instructed to measure in millimeters theheight of each of the breathing cycles in aspecific question. The first four cyclesfollowing answering distortion were measuredand then totaled. Each control and relevantreaction was then assigned a value for thegreatest (defined by the smallest total number,which represented the greatest overallsuppression) to the least reaction (HorizontalScoring System). As previously stated, thehighest value assigned was determined by thetotal of the number of relevant and controlquestions utilized in the polygraph techniquebeing employed for the exam in question. Eachpneumograph was similarly scored based onduration, and given scores from greatest toleast. To do this, chart time was measured, inmillimeters, from the end of exhalation in theanswering distortion cycle to the beginning ofinhalation in the fifth respiration cycle.

In 1997, Emanuel Cohen, one of thestudents attending the Academy for ScientificInvestigative Training from Israel, listened tomy lecture on interpreting the pneumographbased on my formula of PS + PD. He thensuggested a simpler mathematical equation forestablishing pneumograph reactions:

1. Measure in millimeters the heights ofthe first four cycles after answeringdistortion in the tracing being eval-uated.

2. Measure the duration of these fourcycles from the end of the exhalationcycle containing answering duration,until the beginning of the fifth inhala-tion cycle.

3. Divide the duration by the total ofthe four cycle heights in the reactionbeing evaluated, which in essence givesthe amount of suppression and dur-ation in the tracing.

4. The larger the number, the greaterthe reaction.

In Figure 1, the thoracic respirationhas been scored using the algorithm and thenassigned numbers from 6 to 1 in accordancewith the Horizontal Scoring System. Thegreatest reaction was to question R12, and itreceived a 6. Questions R9 and R6 were tiedas the next greatest reaction, and since theywere vying for horizontal positions (ranks) 5and 4, they were each given a 4.5, whichrepresents the average of those two positions(5+4 = 9; 9/2 = 4.5). The three controlquestions, C5, C8 and C11, were also tied forhorizontal positions 3, 2 and 1. Each receivedthe average of those scores (3+2+1+ = 6; 6/3 =2), a 2.

In Figure 2 the abdominal respirationtracing is scored, and the two pneumographscores for each question will be averaged, sothat the pneumograph score only representsone-third of the questions total score.

Electrodermal

Measurement of the electrodermalrecording is performed by multiplying theheight by the base of the tracing. The heightof the tracing is established by a straight linedrawn from the highest peak to the base. Theduration is established by measuring thedistance of a straight line drawn from thebeginning of the reaction, straight out untilthe point it intersects the downwardmovement of the tracing in automatic mode.Multiplying these two measurements togetherreflects tracing amplitude plus duration, andthe larger the number, the greater thereaction. In Figure 3, the electrodermalresponses are scored.

Cardiograph

The cardiograph tracing measurementis established by drawing a twenty-secondstraight line out from the bottom of thecardiograph tracing at the beginning of the

Page 64: VOLUME 28 NUMBER 1

Gordon

Polygraph, 1999, 28 (1) 59

question. A measurement, in millimeters, ismade to determine the height of any changesthat occur in the baseline of the cardiographabove that line. In essence, we are measuringincreases in blood volume. In Figure 4, thecardiograph is scored and tied Horizontalposition scores are averaged (R6 & R12, C8 &R9).

In Figure 5, the results are enteredonto a Horizontal Scoring System sheet, andsince it was a single-issue examination thescores can be combined, for a total chart scoreof -22.5. The same procedure would then befollowed for the other charts in the exam-ination and a total examination score derived.

In conclusion, the Horizontal ScoringSystem has been successfully used in the fieldsince 1981, and the Examiner AlgorithmSystem since 1997. The Academy's HorizontalScoring System clearly removes examinersubjectivity in the assignment of numericalvalues to reactions and the skewing of testresults by arbitrary control question selection.This system is also easier to analyze andteach, allowing for broader application. TheExaminer Algorithm System simplifies andstandardizes the process of determining thedegree of a reaction. The reliability andvalidity of the algorithm in a blind evaluationstudy will be published later this year.

References

Driscoll, L.N., & Honts, C.R. (1988). An evaluation of the reliability and validity of rank order andstandard numerical scoring of polygraph charts. Polygraph, 16(4).

Gordon, N.J., & Cochetti, P.M. (1987). The horizontal scoring system. Polygraph, 16(2), 116-125.

Horvath, F. (1985). Personal communication.

Weaver, R. S. (1980). The numerical evaluation of polygraph charts: Evolution and comparison ofthree major systems. Polygraph, 9, 94-108.

Page 65: VOLUME 28 NUMBER 1

d' IE'

~ ;r ...... \0 \0 ~

1116

..::

CA 19 19 19 19

®

page 1 of 3

CD

99~5-{)1 Exam 1 Subject Examiner nlg Date 1/13/99 View SIZe Normal T1iTle Start 2 29 44 PM End 2 34 09 PM Duration. 0425 Cuff Pressure Start. 68 End: 70 98-05-01 R (Zone)

,

Figure 1

Gain Settings: P2 P1 GS CA page 2 of 3 99~5~1

Recorded: Start 8.5 4.2 2.0 19 Subject Recorded: End 8.5 4.2 2.0 1.9 Examiner· njg Printed: Start 8.5 42 2.0 19 Date. 1113/99 View Size Printed: End 85 4.2 2.0 1.9 Time Start. 2.2944 PM

Cuff Pressure Start 68 I 98-05-01 R (Zone)

Page 66: VOLUME 28 NUMBER 1

d' ~

~ :;:-.... 10 10 10

I~ .!::

CA 19 19 1.9 19

page 1 of 3

Figure 2

99-O5-ll1 - Exam 1 Gain Settings: Subject Recorded: start Examiner njg Recorded: End Date 1/13/99 View SIZe Normal Prrnted: start Time Start 2 2944 PM End 2 3409 PM Duration 0425 Printed End C,J[f I'ressure Stalt 68 r:nd 70 ~]805·01F' (Zone)

P2 P1 GS CA page 2 of 3 99-05-01 8.5 4.2 20 1.9 Subject-85 4.2 2.0 1.9 Examiner· njg 8.5 42 2.0 19 Date 1113/99 View Size 85 4.2 20 19 Time Start 2.2944 PM

Cuff Pressure Stalt 68 I 98-05-01 R (Zone)

Page 67: VOLUME 28 NUMBER 1

d' <E'

~ :;r -ID ID .ID

I~ .:=

CA 19 19 19 19

page 1 of 3 99'{)5-01 Subject Examiner njg [jote 1113/99 View Size Normal

Figure 3

Tilne Star1 2 29 44 PM End 2 3409 PM Duration 0425 Cuff ['ressur" Start 68 End 70 9805~01 R (Zone)

P2 85 85 85 85

P1 42 42 42 42

GS 20 20 20 20

page 2 of 3 99'{)S'{)1 Subject Examiner njg Date. 1113/99 View Size Time Start 2 29 44 PM Cuff Pressure Start 68 I 98~05-01 R (Zone)

......4

Page 68: VOLUME 28 NUMBER 1

d' <E'

~ ;r ......

'" '" .'" I!); .!:;

5 CA

TIO\lts ~~ +~·5

page 1 of 3 99-O~1

Subject Examiner njg

Figure 4

Gain Settings:

~ate ~3f99 View Size Normal

_/ Time rt 2.29.44 PM End 234 09 PL..~tion~5 l# P- e sure Start 68 End 70 -r- ;:)

98-05-01 R (Zone) •

Recorded Start Recorded. End Pnnted Start Pnnted: End

P2 P1 85 42 85 42 85 42

Ir25

GS 20 20 20 20

-I-~

page 2 of 3 99-05-01 Subject Examiner njg

ir!i1mW Size _1 . PM

u r rt 68 I 98-05-01 R (Zone)

Page 69: VOLUME 28 NUMBER 1

Figure 5

Academy for Scientific Investigative Training's Horizontal Scoring System©

CHART! O#~ 0#1' 0# Ii. O#Q 0# 11

PI 2 6 2 4.5 2

P2 1 6 2 3 4

Avg. Pneumo 1.5 6 2 3.75 3

E 1 6 4 5 2

C 4 5.5 2.5 2.5 1

I Total +6.5 -17.5 +8.5 -11.25 +6

CHART 2 0# 0# 0# 0# 0#

PI

P2

Avg. Pneumo

E

C

I Total

CHART 3 0# 0# 0# 0/1. 0#

PI

P2

Avg. Pneumo

E

C

I Total

0 Q# CHART 1 CHART 2 CHART 3

5 +.5

12 -17.5

8 +8.5

9 -11.25

11 +6

6 -13.25

Determination:

l

O#fi

4.5

5

4.75

3

5.5

-13.25

0#

0#

Page 70: VOLUME 28 NUMBER 1

Elaad

Polygraph, 1999, 28 (1) 65

The Control Question Technique: A Search forImproved Decision Rules

Eitan Elaad

Abstract

Analysis of polygraph examinations is done with two main methods. In the semi-objectivenumerical scoring method, the examiner assigns a numerical score between -3 and +3 for everycomparison between the responses to relevant and adjacent control questions. Then, the numericalscores are summed up to yield a final score. Another, less systematic method of analyzing thecharts, use global evaluations. Here, the examiner collects reactions which are liable to be of helpin reaching a decision. Then, he or she selects the blatant responses and reach a decision. Thepresent study was designed to examine these two interpretation methods and to look for betterdecision rules. Two samples of verified polygraph examination records, conducted with thecommon Control Question Technique, were selected for the purposes of the present study. Resultssuggested to replace the raw scores of the numerical scoring method with the mean per comparisonpoint scores and combine both interpretation methods. It was further suggested to use aninconclusive region with boundaries skewed in the direction of the negative scores.

The most preferred polygraph methodin field practice is the Control QuestionTechnique (CQT; Raskin, 1989; Reid & Inbau,1977; Podlesny, 1993). Briefly, the CQTcontains an extensive pre-test interview inwhich the examinee is given the opportunity totalk about the offense and to present his orher version of the crime. During thisinteraction between the examiner and theexaminee, the questions are formulated.Then, the examinee is attached to thepolygraph and is asked a series of questionswhile continuously measuring the variousphysiological reactions. The questions are ofthe following three types: (a) Relevantquestions - directly crime-relevant questions ofthe "did you do it?" type (e.g., "Did you breakinto Mr. Smith's store last Friday night?"). (b)Control questions - focusing on general, non-specific misconducts, of a nature as similar aspossible to the issue under investigation (e.g.,"Have you ever taken something valuablewithout permission?"). (c) Irrelevant questions- focusing on completely neutral issues, (e.g.,are you sitting on a chair?). These areintended to absorb the initial orientingresponse evoked by any opening question, and

to enable rest periods between the more loadedquestions. Typically, the whole question seriesis repeated three or four times.

The inference rule underlying the CQTis based on a comparison of the responsesevoked by the relevant and control questions.Deceptive individuals are expected to showmore pronounced responses to the relevantquestions, whereas truthful individuals areexpected to show the opposite pattern ofresponsivity (i.e., more pronounced responsesto the control questions).

Polygraph records conducted with thecontrol question test are usually interpretedaccording to one of two main methods. Thefirst is the semi-objective numerical scoring(NS) technique (Barland & Raskin, 1975). Thisprocedure is a routine at the Polygraph Unit ofthe Israeli Police. According to the numericalscoring procedure relevant data is gathered bysystematically addressing every comparisonpoint (a single comparison of responsemagnitude between a relevant and a controlquestion for each physiological measure byone examiner), and numbers (-3,-2,-1, 0, 1, 2,

Dr. Elaad is an internationally known researcher in the psychophysiological detection of deception. Request for reprintsof this paper can be sent to Dr. Elaad, Behavior Section, Division of Identification and Forensic Science, Israel NationalPolice Headquarters, Jerusalem, 91906 Israel. This article is based on a paper presented at the IDENTA 85 conference inJerusalem, Israel, on February 27, 1985.

Page 71: VOLUME 28 NUMBER 1

Search for Improved Decision Rules

Polygraph, 1999, 28 (1) 66

3) are assigned to each comparison. Theabsolute value of the assigned number reflectsthe magnitude of the difference between theresponses evoked by the two questions withinthe pair (e.g., -3 or +3 reflect a very largedifference, -1 or +1 reflect a small differenceand 0 reflects no difference), and the sign ofthe assigned number reflects the direction ofthe difference, such that positive numbers areassociated with a pattern of largerphysiological reactivity to the control question,and negative numbers reflect the oppositepattern. These numbers are then summed upacross question pairs, across physiologicalmeasures and across polygraph charts to yielda total score. Thus, if for example a polygraphexamination is based on three charts andthree physiological measures and if two pairsof relevant-control questions are identified foreach chart, then the total score rangesbetween -54 and +54.

While the first stage of the NS whichnotes relevant data and gathers it, has theadvantage that all possible information isreferred to in a reliable manner (whenconducted by knowledgeable and well trainedpolygraph examiners), the second stage, whichintegrates the information, suffers fromoversimplicity. Thus, all three physiologicalmeasures are given the same weight, noattention is paid to dynamic changes in thesubject's responses during the test session,and the consistency of the responses over timeis not considered.

Another method of CQT record analysisis the global record evaluation (GRE) in whichthe evaluator studies the record and notes anysignificant changes in response patternsbetween the relevant and control questions.This type of data analysis presents theopportunity to treat the record as a wholerather than a combination of all itscomponents. However, the GRE is basedmainly on the subjective impression of theevaluator, and since evaluators differ in theweight they place on various physiologicalmeasures and in their emphasis on thedynamic changes during the test, no definitedecision rules are applied and the evaluatormakes the data selection while collecting it.Consequently, the final result is based onsome few salient cues which pass the selectionprocess and this exhibits relatively poor

interevaluator reliability (Podlesny & Raskin1977).

The purpose of this article is tointroduce some decision rules which exhibitimprovement of the CQT detection rate, whileusing the NS and the GRE techniques, on datacollected from two independent, and com-pletely different samples of verified polygraphrecords.

Method

Two samples of real life criminalpolygraph records, which were verified byconfession and/or conviction of the guiltyparty, were utilized for the purposes of thepresent analysis. The two samples wererandomly drawn from the Israel ScientificInterrogation Unit's pool of verified polygraphtests. One sample consisted of 69 records ofsevere crimes (SC) (e.g. murder, attemptedmurder) and the other sample selected 60records of minor crimes (MC) (e.g. theft,burglary, fraud). The MC sample was dividedevenly between actually guilty and actuallyinnocent suspects' records (30 in each) whilethe SC sample consisted of 51 innocent and18 guilty suspects' records. All the recordswere of polygraph examinations conducted bytrained field examiners according to controlquestion techniques (Reid & Inbau 1977,Backster 1969). Three physiological measureswere present on each test record. The first,recorded thoracic and abdominal respiration.The second recorded skin resistance responses(SRR), and the third recorded the cardio-vascular activity (e.g. blood pressure, bloodvolume and heart rate).

Polygraph Record Analysis

The records were independentlyanalyzed by eight (MC) or ten (SC) experiencedpolygraph examiners. This was done twice, intwo counterbalanced sessions separated intime. In one, the evaluator was asked to makea blind interpretation of the record accordingto the semi-objective numerical scoringtechnique. In the other, the same polygraphexaminers were asked to analyze the recordsaccording to the global record evaluationmethod. Here, the evaluator was asked tomake an overall decision regarding eachrelevant question in the record. The overall

Page 72: VOLUME 28 NUMBER 1

Elaad

Polygraph, 1999, 28 (1) 67

decision was expressed on a 5-point scalewhich defines at its extreme points strongconfidence in the truthful or deceptive result,at the 2nd and 4th points low confidence inthe results and on the 3rd point aninconclusive decision. While evaluating therecords, the examiners were blind to the guiltor innocence of the examinee, to the frequencyof deceptive and truthful records in thesample, to the criminal case the recordreferred to, and to the outcomes of hisprevious evaluation of the same record (on thesecond analysis of the record). Every recordwas evaluated by three polygraph examinersassigned randomly with the exemption that noexaminer was to score a record of anexamination originally conducted by himself.The composition of the three scorers changedfrom one record to another according to aLatin Square design.

Results

Reliability in record scoring

To compute the interscorer agreementrate for the numerical scoring technique, thefinal scores obtained were condensed into 5agreement regions according to the 5 mainresults of field tests used for polygraphexaminations in the Israeli police. For thepresent study purposes, the two extremepoints gathered all final scores less than -7(clear deception results) or larger than +7(clear truthful results). In the 2nd region, allfinal scores between -3 and -7 inclusive, weregathered (reserved deception results).Similarly, in the 4th region all final scoresbetween +3 and +7, inclusive, were gathered(reserved truthful results). The 3rd regiongathered all final scores between -2 and +2inclusive (inconclusive region).

The difference (d) between every pair ofexaminers who scored the same record wasdetermined. The lowest value of suchdifference was 0 (two final scores in the sameregion) whereas the largest value that couldhave been computed for such a difference was4 (two scores in extreme opposite regions).The difference score was subtracted from thelargest value of 4 in order to compute theagreement score of that pair. The largestpossible difference (d=4) was then multipliedby the total number of pairs, defining the

highest possible difference score. The ratiobetween the sum of all agreement scores andthe highest possible difference score,determined the interscorer agreement rate. Itturned out that the agreement rates computedfor the severe crime and the minor crimerecords were, 85.05% and 87.5%, respectively.Both agreement rates are fairly high andsuggest that the numerical scoring was donein a reliable manner.

Analyzing the charts using a zero cutoff

Selection of a zero cutoff point yield twodecision regions. One gathered all mean totalscores (across scorers) which are less than 0and indicate deception (DI). The othercorresponded to scores which were equal orgreater than 0 and were defined as nodeception indicated (NDI). The distribution ofDI and NDI decisions according to a zero cutofffor each crime sample, is presented in Table 1.

Table 1 reveals that innocent exam-inees yielded more truthful than deceptiveoutcomes whereas guilty suspects were foundmore deceptive than truthful. However, whenthe severe crime sample is considered, thefalse positive error rate (records of actuallyinnocent suspects which were classified as DI)was considerably larger that the false negativerate (43.1% and 5.6%, respectively). For theminor crime sample, an identical false positiveand false negative error rate (23.3%) wasfound.

The selecte d zero cutoff point is by nomeans the best classification rule for guiltyand innocent examinees. In real life poly-graphy an inconclusive region is included toreduce the error rate.

Including an Inconclusive Region

The inclusion of an inconclusive regionwas suggested by Barland and Raskin (1975),and is used routinely in field polygraphy. Table2 presents the results for both the MC and SCsamples with the inclusion of the inconclusiveregion. The inconclusive regions were selectedby searching for the optimal discriminationbetween guilty and innocent suspects, thus,its boundaries are asymmetrical around zeroand are skewed in the direction of the negativescores for both MC and SC samples.

Page 73: VOLUME 28 NUMBER 1

Search for Improved Decision Rules

Polygraph, 1999, 28 (1) 68

Table 1Frequencies of truthful and deceptive decisions obtained from numerical scoring of the

charts using a zero cutoff.

Actually Guilty Actually Innocent Across Suspects

Decisions DI NDI DI NDI DI NDISevere Crimes 17 1 22 29 39 30Minor Crimes 23 7 7 23 30 30Across Crimes 40 8 29 52 69 60

Note. DI - Deception indicated; NDI - No deception indicated

Table 2Distribution of raw NS scores in three decision regions including skewed inconclusive regions

selected for each sample.

ActuallyGuilty

ActuallyInnocent

AcrossSuspects

Decisions DI Inc NDI DI Inc NDI DI Inc NDISevere Crimes 14 3 1 5 17 29 19 20 30Minor Crimes 20 5 5 4 4 22 24 9 27Across Crimes 34 8 6 9 21 51 43 29 57

Note. DI - Deception indicated; NDI - No deception indicated;INC - Inconclusive.

The selected inconclusive regions for the Severe Crime sample -4 / -1and for the Minor Crime sample -3 / 0.

Table 2 reveals the importance of theinconclusive region for reducing the falsepositive error rate. This is manifested in bothSC and MC samples. When the SC sample isconsidered, the inclusion of the inconclusiveregion eliminates 17 out of 22 (77%) falsepositive errors without any loss of truenegative decisions. Regarding the MC sample,3 out of 7 (43%) of the false positive errorswere shifted into the inconclusive region, whileonly 1 out of 23 (4%) true negative decisionswas lost. The results for the false negativedecisions are not as impressive. For the SCsample the inclusion of the inconclusive regiondid not effect the single false negative decision,however it reduced the true positive decisionsfrom 17 to 14 (an 18% improvement).

For the MC sample, 2 of 7 (29%) falsenegative decisions were shifted into theinconclusive region and the same happened to3 out of 23 (13%) true positive decisions.

Mean per Comparison Point

The NS technique is a procedure tonote and gather data in a reliable manner butis not necessarily the optimal method tointegrate this data. Table 3 presents thefrequencies and percentages of the comparisonpoint scores, with positive, negative and zerosigns, gathered from two collections of data forguilty and innocent examinees separately.

Page 74: VOLUME 28 NUMBER 1

Elaad

Polygraph, 1999, 28 (1) 69

Table 3Comparison point score signs for guilty and innocent.

Criterion Actually Guilty Actually Innocent

Score Sign - 0 + - 0 +Severe Crimes Total = 1674 Total = 4749Number 693 704 277 1146 2165 1438Percent 41.4 42.1 16.6 24.1 45.6 30.3Minor Crimes Total = 2928 Total = 2784Number 1081 1232 615 620 1159 1005Percent 36.9 42.1 21.0 22.3 41.6 36.1MeanPercent 39.2 42.1 18.8 23.2 43.6 33.2Across Crimes

Table 3 reveals that zero scores areconsistently more than 40% of all scoresassigned to the polygraph records. For theguilty suspects, across crime samples, scoreswith negative signs exceed the scores withpositive signs by more than 20% while for theinnocent suspects the positive scores are only10% more than the negative scores. A closerlook at Table 3 reveals that the differenceoriginates from the severe crime sample wherethe guilty suspects produce about 25% morenegative scores than positive scores and theinnocent group produce positive scores inexcess of negative ones by only 6%. The veryhigh proportion of negative scores in the

innocent group implies that a relatively minordeviation from the average may have seriousconsequences.

The minor crimes sample seems to beless vulnerable. It presents similar butopposite figures for guilty and innocentsuspects with about a 15% difference betweenthem.

Table 4 sums up the figures of Table 3into percentages of agreement and disagree-ment with the ground truth and separatesthem into the three physiological measures.

Table 4Percent agreement between the ground truth and comparison point score signs for the two

crime samples and the three physiological measures.

SevereCrimes

MinorCrimes

Agreement Zero Scores Disagreement Agreement Zero Scores DisagreementMeasures % % % % % %Respiration 38.35 35.73 25.92 44.54 32.83 22.64SRR 26.34 54.65 19.01 32.72 49.16 18.12CardiovascularActivity

34.84 43.62 21.53 32.30 43.59 24.11

Mean 33.18 44.67 22.15 36.52 41.86 21.62

Page 75: VOLUME 28 NUMBER 1

Search for Improved Decision Rules

Polygraph, 1999, 28 (1) 70

The results of the two samples, overphysiological measures, are very similar:About 36% of the responses in the MC sampleand 33% in the SC sample are scored inaccordance with the actual state of thesuspect toward the investigated crime. About22% of the responses in the two samples arein the opposite direction. This means that thedecision is based, on the average, on 13% ofthe responses in the test. When the differentphysiological measures are regardedindividually, Table 4 shows that for severecrimes a difference of 13% is preserved for therespiration and cardiovascular measures butfor the SRR the difference decreases to only7%. The difference for the SRR in the MCsample is 14%, but for respiration it increasesto 22% and for the cardiovascular it decreasesto only 8%. The zero scores percents for thetwo samples are very similar, with a relativelylarge proportion of these scores for the SRR(about half of the scores) and a relatively lowpercent for respiration.

In light of the results presented inTables 3 and 4, the widespread use of rawscores sums to determine polygraph outcomes

should be reconsidered. Many polygraphrecords exhibit a small difference between thenumber of positive and negative scores andthe use of raw scores sums may lead to adecision based on a record which is not clearenough. The problem is not acute for shortrecords since the raw scores sum will probablybe assigned to the inconclusive region.Records with many comparison points and asmall difference between the percentages ofpositive and negative scores, increase theprobability that the raw score sum will fall in adecision region.

To demonstrate the long recordsproblem, the records of both MC and SCsamples were divided into two groupsaccording to the length of the record. Thecutoff point was 27 comparisons.Consequently, the SC sample was divided into35 short and 34 long records and the MCsample was divided into 26 short and 34 longrecords. The distribution of raw scores resultsfor actually guilty and actually innocentsuspects according to the length of the recordsare presented in Table 5.

Table 5Distribution of raw NS scores according to the ground truth and the length of the records.

Length of RecordsActually Guilty Actually Innocent Across Subjects

Record'sLength

Long Short Long Short Long Short

DI 9 5 5 0 14 5Severe Crimes Inc 0 3 5 12 5 15

NDI 1 0 14 15 15 15DI 9 11 4 0 13 11

Minor Crimes Inc 4 1 1 3 5 4NDI 5 0 11 11 16 11

Note. The selected inconclusive regions for the Severe Crime sample -4 / -1 and for the Minor Crimes -3 / 0.

Table 5 presents a strong connectionbetween long records and false decisionprobability. All 6 false positive and falsenegative decisions in the SC sample, and all 9

errors of both types in the MC sample arefound in long records. It seems that when aclear result is manifested on the record afterthree repetitions of the question battery the

Page 76: VOLUME 28 NUMBER 1

Elaad

Polygraph, 1999, 28 (1) 71

polygraph examiner tend to end the test andreport the result. When the decision is notclear the examiner tends to continue with thetest hoping that the forthcoming repetitionswould clarify the result. Unfortunately, thisapproach may lead to false decisions.

To overcome the problem, the rawscore sum was replaced with the mean scoreper comparison point (MCP). The MCP isconsidered as a single score assigned by onescorer to one comparison of responses toadjacent relevant and control questions in asingle physiological measure. The theoreticalmaximum MCP is 3.00 and will occur onlywhen all three examiners decide that theresponses to all control questions are greaterthan those to all adjacent relevant questionsand the difference is judged to has themaximum value (+3). This should occur in allthree measures for all the questions in therecord. In general, a positive mean score

indicates a relatively stronger response to thecontrol questions. Conversely, -3.00 is thetheoretical minimum score possible, andnegative mean score indicates that theresponses to the relevant questions in a givenrecord are stronger than those to the adjacentcontrols. In order to compare the validity ofthe MCP with that of the NS raw scores, theMCP score for each record was computed.

The optimal inconclusive region for theMC sample has boundaries of -.2 and +.1.Thus, all MCP scores less than -.2 areconsidered as deception indicators (DI), whileall MCP scores greater than +.1 are consideredas no deception indicators (NDI). All otherMCP scores are considered inconclusive.Similarly, the optimal inconclusive region forthe SC group was determined. The region hasboundaries of -.2 and 0. Table 6 presents thedistribution of MCP scores in each of the threedecision regions for both SC and MC groups.

Table 6Distribution of MCP scores according to the skewed inconclusive regions for the two samples.

Actually Guilty Actually Innocent Across Subjects

DI 12 1 13Severe Crimes Inc 5 21 26

NDI 1 29 30DI 17 2 19

Minor Crimes Inc 9 9 18NDI 4 19 23

Note. The selected inconclusive regions for the Severe Crime sample -.2 / 0 and for the Minor Crimes -.2 / +.1.

Table 6 reveals that the distribution ofMCP scores in the three decision regionseliminate many of the false positive decisionsfound for the raw score sums. This is true forboth MC and SC groups. When the SC groupis considered, 4 out of 5 (80%) false positivedecisions were shifted into the inconclusiveregion while none of the 29 true negativedecisions were affected. For the MC group, 2out of 4 (50%) false positive decisions wereincluded in the inconclusive region at the price

of 3 out of 22 (14%) true negative decisions.In the actually guilty suspect group, the onlyfalse negative decision in the SC sample wasunaffected by the change from NS raw scoresto MCP scores, but 2 out of 14 (14%) truepositive decisions were shifted into theinconclusive region. For the MC group asimilar percent of false negative (1 out of 5,(20%)) and true positive (3 out of 20 (15%))decisions were changed to inconclusives.

Page 77: VOLUME 28 NUMBER 1

Search for Improved Decision Rules

Polygraph, 1999, 28 (1) 72

The Intersection Rule

The effect of a hybrid method, based onthe NS technique and the GRE method ofrecord interpretation, upon correct andincorrect decision rates, was investigated. Forthis purpose, the evaluations of the threeevaluators on the 5-point scale were summedup yielding a 13 point continuum in which thefirst score (3) reflects an unanimous decisionof the three evaluators that the suspect isundoubtedly deceptive, while the last (15)reflects a consensus about a strong truthfuldecision. The 13-point continuum was dividedinto three decision regions which bestdiscriminate between guilty and innocentsuspects. The first consisted of the five

extreme DI evaluations (sums of 3 to 7,inclusive). The second gathered all evaluationsums between 8 and 10 inclusive,(inconclusive region), and the third consistedof all the sums greater than or equal to 11(NDI region).

A new decision rule was now defined,according to which a result would beconsidered truthful only if it is included inboth NS and GRE NDI regions. The same rulewas applied to the deceptive results. All otherdecisions which did not meet the demands ofthe intersection rule were regarded as incon-clusives. Table 7 presents the distribution ofMCP scores in the three decision regions afterapplying the intersection decision rule.

Table 7Distribution of MCP scores in the three decision regions for the two samples after applying

the intersection decision rule.

Actually Guilty Actually Innocent AcrossSuspects

DI 11 0 11Severe Crimes Inc 6 30 36

NDI 1 21 22Minor Crimes DI 17 1 18

Inc 11 10 21NDI 2 19 21

Table 7 reveals that for the MC groupthe intersection rule eliminated 3 out of 6(50%) errors (1 false positive and 2 falsenegative errors). Furthermore, the rule has noeffect on the correct detection rate of both truepositive and true negative decisions. Whenthe SC group is considered, the decision ruleshifted the only false positive error to theinconclusive region but did the same to 8 outof 29 (28%) true negative decisions. For theguilty group of subjects, the intersection rulereduced the true positive decisions from 12 to11 (-8%). While not being impressive, even theresults for the SC group seem to justify theuse of the intersection rule.

For both samples the decision ruleeliminated half of the errors. The price inreduction of correct decisions was relatively

low (0% in the MC group and 22% in the SCgroup).

Conclusions and Suggestions

The robustness of the present resultslean on the fact that the reported decisionrules apply to two independent and completelydifferent samples of polygraph records. Onthis ground it is recommended to use bothsemi-objective numerical scoring and globalrecord evaluation techniques. For each, aninconclusive region should be includedbecause, as the present analyses show, such aregion considerably reduces both false positiveand false negative errors, at a price ofrelatively lower reduction of the correctdecisions. The inconclusive region boundariesseem to be slightly skewed in the direction of

Page 78: VOLUME 28 NUMBER 1

Elaad

Polygraph, 1999, 28 (1) 73

the negative scores, and this tendency issomewhat stronger for the SC group whencompared to the MC group. It is furtherrecommended to replace the NS raw scoressums with MCP scores which standardizeresults of records with varied length andeliminate many of the errors which result fromthis variation. Finally, it is suggested to applyan intersection decision rule to both MCP andGRE results. This may lead to a furthersubstantial reduction of both false positive andfalse negative errors from a rate of about 10%to a rate of about 5%. At the same time theinconclusive region is expected to grow fromabout 30% to about 40%.

It must be noted, however, that a largeproportion of the polygraph records in the

present study were verified by confessions.This may introduce a sampling bias since theconfession is not independent of the polygraphoutcomes and the probability that a suspectwill confess may depend on the polygraphresults. Hence, deceptive outcomes mayencourage interrogation efforts to induce aconfession, while a truthful outcome mayconvince the interrogator to stop interrogating.This may result in an underestimation of thefalse-negative rate. On the other hand, onecannot exclude the possibility of falseconfessions. In this case the false negative ratemay be overestimated. Hence, the verifiedsample is not representative of the populationof polygraph examinations (Elaad andSchachar, 1985).

References

Backster, C. (1969). Technique fundamentals of the tri-zone polygraph test. New York: BacksterResearch Foundation. Inc.

Barland, G.H. & Raskin, D.C. (1975). An evaluation of field techniques in detection of deception.Psychophysiology, 12, 321-330.

Elaad, E. & Schahar, E. (1985). Polygraph field validity. Polygraph, 14, 217-223.

Podlesny, J.A., & Raskin, D.C. (1977). Physiological measures and the detection of deception.Psychological Bulletin, 84, 782-799.

Podlesny, J.A. (1993). Is the guilty knowledge polygraph technique applicable in criminalinvestigations? A review of FBI case records. Crime Laboratory Digest, 20, 57-61.

Raskin, D.C. (1989). Polygraph techniques for the detection of deception. In D.C. Raskin (Ed.),Psychological Methods in Criminal Investigation and Evidence. New York, Springer PublishingCompany.

Reid, J.E., & Inbau, F.E. (1977). Truth and deception: The polygraph ("lie detection") technique. 2nded. Baltimore: Williams & Wilkins.

Page 79: VOLUME 28 NUMBER 1

Rank Order Analysis

Polygraph, 1999, 28 (1) 74

Rank Order Analysis

Kathleen Miritello

The name Rank Order Analysis literallyexplains this procedure, as it is a "ranking" ofthe questions on a chart from greatest to leastresponsiveness. In rank order chart analysiseach physiological parameter (respiration,electrodermal, and cardiovascular) isevaluated separately. The benefits of rankorder analysis include:

1. A systematic approach to chartevaluation which has been proven inresearch studies to be more accurate thana visual or global review.

2. Each physiological parameter isevaluated separately and given equalemphasis in the decision process. Anexaminer cannot ignore or overlook any ofthe parameters based on individual prefer-ences or biases.

3. Rank order analysis provides theexaminer with a cumulative picture of theconsistency and intensity of reactions onthe polygraph charts. It serves to organizethe data contained in the charts, and isespecially helpful when reactions areinconsistent, chart quality is marginal, ornumerous charts have been run.

4. Use of this procedure enhances theexaminer's confidence in the final con-clusion by providing a good foundation forthe decision. It can serve as a qualitycontrol check of the charts in the fieldwhen an examiner is without benefit of anon-the-spot QC for his or her charts.

5. A systematic and objective chartanalysis prevents the examiner fromignoring or rationalizing the pattern ofreactions on the charts. Rank order ratiosforce the examiner to recognize which

questions were consistently most reactive,despite any possible behavioral clueswhich may appear to be indicatingtruthfulness.

6. The recorded rank order analysisresults allow for a more effective qualitycontrol review by providing a definitiverecord of the examiner's chart analysiswhich can be compared with the findingsof the QC supervisor. Any source of poss-ible disagreement is readily identified.

7. The reliability of chart analysis, animportant measure which is related tovalidity, can be clearly demonstratedthrough use of rank order analysis.

8. The chances of an improperinconclusive call are reduced with rankorder analysis. In some cases which mightotherwise be considered inconclusive, theuse of this technique can define potentialareas of deception more clearly in thecharts, steering the examiner's interro-gation in the proper direction in order togain information.

Rank order chart analysis is performedby the following steps.

Step 1. The number of questions to beevaluated per chart determines the values tobe assigned in ranking the questions. Countthe number of questions to be evaluated oneach chart and subtract 1. The resultingnumber should be awarded to the questionyou feel has the largest reaction in respiration.For example, if there are 6 questions to beevaluated, 6 - 1 = 5. The largest respirationreaction will be given a 5. The second largestreaction should be given a 4, and so on downto 0 for the question with the least or no

Editor's note: This paper was written by Ms. Miritello in 1988 as instructions on conducting rank order analysis formultiple-issue examinations, and was archived among the holdings of the Department of Defense Polygraph Institute. It wasupdated for publication here.

Page 80: VOLUME 28 NUMBER 1

Miritello

Polygraph, 1999, 28 (1) 75

reaction. This procedure is repeated for theelectrodermal and cardiovascular patterns,

again ranking the questions in order ofresponse down to 0. (See example below.)

Chart #2: Questions A, B, C, D, E, and F.

Question A B C D E FRank 2 5 3 4 1 0

Step 2. If you believe reactions toquestions are of equal strength, add the valueof the ranks they would be assigned and divideby the number of questions involved. In theexample below, question B stands out clearlyas being the most reactive. It is assigned arank value of 5. The next most reactivequestions, D and C, appear to be of equalstrength. The values that would be assigned

next are 4 and 3. These are added together,and then divided by the number of tiedquestions: 4 + 3 = 7; -7/2 = 3.5. Questions Dand C are then both assigned values of 3.5.You continue to assign the remaining valueswhere the ranks left off since values 4 and 3have already been assigned, 2 is given to thenext most reaction question, A, and so ondown to 0.

Question A B C D E FRank 2 5 3.5 3.5 1 0

Step 3. If you feel an entire parametercannot be evaluated (e.g., PVC contamination,totally erratic respiration, completely flatelectrodermal tracing), give all questions theaverage rank for that parameter. In the

following example, acceptable tracings exist inthe respiration and electrodermal channels,but the ranks of the cardiovascular channelare averaged due to excessive PVCcontamination.

Question A B C D E RRespiration 2 5 3.5 3.5 1 0Electrodermal 3 4 2 5 0 1Cardiovascular 2.5 2.5 2.5 2.5 2.5 2.5

Step 4. Next, the totals are added foreach question by summing the valuesassigned to each parameter.

Question A B C D E FRank 2 5 3.5 3.5 1 0Electrodermal 3 4 2 5 0 1Cardiovascular 2.5 5 4 2.5 1 0Total 7.5 14 9.5 11 2 1

Step 5. This process is completed foreach separate chart. The examiner has theoption of evaluating only the relevant andcomparison questions, or also evaluating

irrelevant questions. At a minimum, all rele-vant questions and at least one comparisonquestion should be evaluated.

Page 81: VOLUME 28 NUMBER 1

Rank Order Analysis

Polygraph, 1999, 28 (1) 76

Step 6. The final step is to calculateratios for each question. Ratios provide someindication of the degree of separation betweenthe questions by placing them against a 0 to1.0 scale. Also, in tests where the relevantquestions are not asked an equal number oftimes, ratios make adjustments for thedifferences in the resulting totals to allow for afair comparison between questions. Ratios arecalculated by dividing the total value for eachquestion by the maximum possible total forthat chart. This maximum value is deter-mined by summing the values accorded to the

highest rank in each parameter. In theexamples we have shown, the highest value foreach parameter was a 5, since we wereevaluating 6 questions. A maximum of 5 forthe respiration channel, plus 5 for theelectrodermal channel, plus 5 for thecardiovascular channel, equals a maximumpossible total of 15. Using the totals from theexample shown above, the ratio for A would be7.5 divided by 15, or .50. The remainder of theratios have been calculated and are shownbelow.

Question A B C D E FRespiration 2 5 3.5 3.5 3.5 0Electrodermal 3 4 2 5 0 1Cardiovascular 2.5 5 4 2.5 1 0Totals 7.5 14 9.5 11 2 1Ratios .50 .93 .63 .73 .13 .07

To calculate rank order ratios for aquestion after several charts, the totals fromeach chart are added together and divided by

the maximum possible value for each of thecharts. For example, question G received thefollowing total values:

Chart 4: 11 (Maximum possible value = 15)Chart 5: 9.5 (Maximum possible value = 15)Chart 6: 7.5 (Maximum possible value = 12)

(only five questions were ranked)

The total for question G would be 28and the ratio is determined by dividing thatfigure by 42 (the sum of the maximumpossible values) = .67.

However, in another example, questionH is asked twice and results in the followingfigures:

Chart 4: 11 (Maximum possible value = 15)Chart 5: 14 (Maximum possible value = 15)

Even though the final total, 25, is lessthan the total for question G, the ratioindicates that question G was actually morereactive ; 25/30 = .83.

It is extremely important to keep inmind that rank order analysis does not provide

any automatic cut-off scores which can beused to reach firm conclusions of truth ordeception. The purpose of the rank orderratios is to provide the examiner with moreaccurate information for applying the prin-ciples of chart analysis already in use.

Page 82: VOLUME 28 NUMBER 1

Wygant

Polygraph, 1999, 28 (1) 77

Scoring in a Computer Age

James Wygant

The extensive use of computer-aidedchart evaluation brings new focus to manualscoring. The obvious questions are: 1) when ismanual scoring necessary; and 2) what is thebest means of resolving a discrepancy betweena manual score and a computer result?

Examiners are not apt to find answersto those questions in any research, except tothe extent that the value of manual scoring iswell established by numerous studies over thepast 25 years. Despite widespread acceptanceof the concept of scoring, actual practice fallsinto four general categories. An examiner maymanually score all charts from every exam-ination, even the most obvious. He may scoreonly the charts that are not obvious. He mayscore only the exams for which he can nototherwise reach a "global impression" of truthor lie. And finally, the examiner may scorenothing, relying entirely on "global impres-sions" or on a computer if he or she has one.We could argue that these four options happento have been listed in order of best to worstpractice. Fortunately for the profession,informal surveys of examiners confirm thatmost manually score everything, even whenusing a computer.

If nearly everybody agrees that scoringis the best way to evaluate charts, there arevarying opinions about what range of numbersto use when matching issue questions(relevants) against comparison questions(controls). Cleve Backster, who is generallycredited with devising numerical scoring, setthe range at +3 to -3. Zero meant nodistinguishable difference between physio-logical responses on an issue question and acomparison question. A score of 1 meant aslight difference. Two meant a noticeabledifference. Three was reserved for whoppersand for what Backster called "upgrading." Aplus score meant greater response on thecomparison question; a minus meant the issuequestion produced the greater response.

Backster applied scoring to what hecalled the "you phase" test, a reference to the"did you do it" thrust of the issue questions,basically a single-issue test by another name.Perhaps an unintended and unanticipatedconsequence of the increase in use ofnumerical scoring has been a gradual anduniversal change in test formats, even amongthose who don't know a "you phase" from a "U-boat." When testing specific allegations, mostexaminers now use a comparison questionformat that includes 2-3 issue questions and2-3 comparison questions. The issue ques-tions often constitute a single issue or at leastissues that are so closely related that a lie toany one, or truth on any one, forces the samefor the others. There are fewer and fewerinstances in specific allegation testing of mixedissues and relevant-irrelevant formats, neitherof which is as well adapted to scoring as asingle issue.

The scoring range that Backsteradvocated, +3 to -3, is commonly identified asthe seven-position scale, because it encom-passes seven possible values, including zero.The range might just as well have been +99 to-99, although the addition and subtractionnecessary to calculate totals would have beenmore cumbersome, and cutoffs applied to totalscores would need to be higher. Still, anyrange of numbers will works with theappropriate cutoffs. There is nothing sacredabout +3 and -3, except that those numbersoffer a range of values that are relatively easyto use.

In actual practice, examiners who usethe seven-position scale rarely assign values of+3 or -3. That is reflected in research byCapps and Ansley (1992a). In 440 cases, with11,682 separate comparisons of responses,examiners assigned threes to the extent shownin Table 1. The numbers represent the per-centage of all values that were either +3 or -3.

The author is a polygraph examiner in private practice. Reprint requests should be sent to him at The Mayer Building, 1130SW Morrison St, Ste 220, Portland, OR 97205.

Page 83: VOLUME 28 NUMBER 1

Scoring in a Computer Age

Polygraph, 1999, 28 (1) 78

Table 1. Percentages of +3 or -3 scores among 440 cases. (from Capps & Ansley, 1992a).

Chart Pneumograph Electrodermal Cardiograph

1 1.3 18.6 2.8 2 1.2 19.6 2.2 3 3.1 23.9 4.3

Two things are obvious from the tableddata. Threes are rarely given to pneumographor cardiograph reactions; and they are usedwith the electrodermal function to such adisproportionate extent that the examinergives substantially more weight to that onetracing. The cause of this phenomenon restswithin the nature of the tracings. Theelectrodermal tracing is much simpler to scorethan the pneumograph or cardiographtracings. It consists of a smooth flowing linethat rises and falls in measurable increments.Differences that warrant a three in scoring areeasier to observe and can be justified with aruler. The cardiograph and pneumographtracings are more complex, relying to a greaterextent on subjective judgement, and conse-quently invite argument about the magnitudeof a score. Polygraph examiners are like mostother people, preferring a safe, conservativecourse rather than one that can lead to adispute. That translates to fewer threes givento respiration and cardiograph tracings.

Other Scoring Methods

Many examiners use a five-positionscale, +2 to -2, and thereby avoid weightingthe electrodermal channel. In the past fewyears many have also begun using a three-position scale, +1 to -1. The rationale for thethree-position scale is that it is allegedly themost objective. It relies simply on thepresence or absence of a scorable response.Those who use it will tell you that theydistrust examiner judgment in deciding whenthe difference between issue and comparisonquestions is great enough to permit a +2 or -2.On the other hand, those who use twos arguethat not all differences in responses are of thesame magnitude. They assert that since somedifferences are clearly greater than others,provision must be made for that in scoring.

A separate Capps and Ansley studyconcluded that the inconclusive rate with a

three-position scale was more than doublethat obtained with a seven-position scale,although accuracy remained equally high withboth methods for those cases in which adecision was made. The researchers used 100verified single-issue examinations. The diff-erence in inconclusive rates was greatest forthe 52 verified deceptive tests. The three-position scale produced an inconclusive rate of23%, while the seven-position rate was only4%.

Apart from concern about inconclusiveresults, the various methods rarely contributeto any difference of opinion in test results.Two examiners using different scoring rangesshould theoretically arrive at the same resultof truth or deception, if able to reach anyconclusion at all. If the conclusions differ, thefault does not lie with the numbers, but withthe different judgments the two examinershave made regarding what they think they areseeing on the same set of charts.

Research might someday help resolvethe issue of scoring ranges, but in themeantime we tolerate examiners using avariety of number systems. Other non-numerical systems, for instance involvingcheck marks, have also been proposed overthe years. All note-keeping systems have thesame goal, to create a simple method for theexaminer to track what he or she has observedthrough an entire examination.

The amount of data an examiner mustconsider is considerable. A chart usuallyincludes at least four channels: two respir-ation records, an electrodermal tracing, and acardiograph tracing. A typical chart mightalso consist of three issue questions and threecomparison questions. The examiner willusually assign three scores per issue question;one for respiration, one for electrodermal, andone for cardiograph. To arrive at those scores,the examiner must make comparisons

Page 84: VOLUME 28 NUMBER 1

Wygant

Polygraph, 1999, 28 (1) 79

between an issue question and at least onecomparison question. For an examiner wholooks at more than one comparison question,the numbers given here as examples are onlygoing to get bigger. Even though the examinerassigns only one score for respiration, both theupper and lower tracings must be reviewed.The total number of judgments an examinermust make for one question on one chart istherefore at least eight (looking at each tracingfor the issue question and one comparisonquestion). For three issue questions on onechart, that is 24 judgments. For three charts,that is 72 judgments, a lot to rememberaccurately and to tabulate without some kindof system.

We will assume that an examiner isconvinced that he needs to keep a writtennotation of what he is seeing during his chartevaluation. He must then decide what he willdo with that data, after he's gone to thetrouble of collecting it.

Results Reporting

Most examiners refrain from referringto their scores when they report test results. Ifa defense attorney or deputy district attorneygets "trained" by one examiner regarding hiscutoffs and the strength of his results, he isapt to assume that all examiners score thesame. Although polygraph testing has becomehighly standardized, the fact is that someexaminers score conservatively, others aremore generous in assigning values, some use arange of +1 to -1, while others range from +3to -3. One examiner's score on a set of chartsmay be +12, while another's is +6. That is not

a problem for either examiner, since they bothconclude truthfulness. It may be a problemfor an attorney who thinks that scores are asconcrete as the inches on a ruler and thatsomething must be wrong if there is a six-point difference.

The simplest method of avoiding thatkind of confusion is to avoid reportingnumbers, either those derived from manualscoring or from computer evaluation. Thenumbers are not the result. The result is whatthe examiner concludes from the numbers andanything else his training and experiencecause him to believe is relevant and can bejustified.

That also means that cutoffs are notabsolute. Most examiners use +6 and -6 asminimum scores necessary for a finding oftruthfulness or lying in a single-issueexamination that consists of three issuequestions and three charts. In fact, cutoffs aslow as +3 and -3 have been shown to worknearly as well, while reducing the number ofinconclusive results.

In an article about the selection ofcomparison questions in scoring, MichaelCapps and Norman Ansley (1992b) coin-cidentally reported interesting data regardingcutoffs. They had eleven examiners score fortyexaminations, for a total of 440 conclusions.The tests were single-issue proceduresconsisting of two issue questions. Best resultswere obtained from comparing each issuequestion to the stronger adjacent control. Theresults were tabulated with various cutoffs.(See Table 2)

Table 2. Correct, incorrect, and inconclusive outcomes for eleven examiners scoring 40examinations. (From Capps & Ansley, 1992b).

Cutoff Number Correct Number Incorrect Number Inconclusive Percent Correct

+/-6 361 10 69 97.30%+/-5 368 13 59 96.60%+/-4 382 13 45 96.70%+/-3 386 17 37 95.80%+/-2 395 24 22 94.30%+/-1 402 28 10 93.50%

Page 85: VOLUME 28 NUMBER 1

Scoring in a Computer Age

Polygraph, 1999, 28 (1) 80

It doesn't take any advanced math tosee that as the cutoffs were lowered two thingshappened: the number of inconclusive resultsdeclined and the number of mistakesincreased, though not as dramatically. Errorswent up from 10 to 17 when the cutoff waschanged from +/-6 to +/-3. At the same time,inconclusives were cut nearly in half, from 69to 37, representing a drop in the inconclusiverate from 15.7% to 8.4% and a correspondingrise in the error rate from 2.7% to 4.4%.

The point of reviewing these data is notto advocate lower cutoffs, but only to illustratethat there are two primary determining factorsin setting cutoffs. The first is accuracy. Thesecond is inconclusive rate. Any examinerwho is suffering an inconclusive rate greaterthan 20% may want to consider either scoringmore aggressively or lowering cutoffs, or both.On the other hand, any examiner who has anexceptionally low inconclusive rate is probablymaking too many mistakes.

Human vs. Computer

That brings us to a question raised inthe first paragraph: what should an examinerdo about a difference of opinion between his orher score and a computer result? That is aquestion every examiner who uses computer-aided scoring must expect to address at somepoint. A difference is certain to occur. Itshould not happen often, and it should rarelybe a circumstance of opposite opinions, oneconcluding truth while the other concludeslying. This is an area too new for researchresults to be available. Anecdotal informationsuggests that differences are occurring formost examiners less than 10% of the time; andmost of the differences are only one degree (theexaminer's manual score indicating a definiteresult while the computer says it's incon-clusive, or the reverse of that.

Any examiner who frequently disagreeswith his or her computer may want to asksome human colleagues to review a fewexaminations. If everybody is getting resultsthat differ from the examiner who gave thetest, that examiner needs to figure out why.

Occasional differences can derive fromseveral sources. In using computer-aidedevaluation, it is imperative that the examiner

not mix issues. The computer software waswritten with the assumption that the examineris submitting a single-issue test, or at leastone in which the issue questions addresssomething in which lying to one presumeslying to all. Asking a computer to evaluate atest in which one question asks about stealingmoney from the victim, one asks about rapingthe victim, and one asks about the suspect'salibi is an invitation to disaster. The peoplewho wrote the software assumed the examinerwould be using the best test available, whichresearch indicates is the single issue. Thesoftware programmers have repeatedly cau-tioned that a computer conclusion of truth orlie in a mixed issue test may not be reliable.

Another consideration that is withinthe examiner's control is the care he or sheexercises in manually removing distortion fromthe computer's consideration. Although thecomputer algorithms are good at identifyingmany forms of distortion, they can miss themost obvious. It is the examiner's respon-sibility to feed in good data if he or she expectsto get back reliable results.

There is one other common source ofdifferences that, unlike the others, is totallybeyond the examiner's control. It is themethod by which the charts are evaluated.Examiners are generally limited to comparingan issue question to a control on either side ofit. To jump much further away from the issuequestion makes the comparison more difficultand is usually not attempted unless there areno better options. A computer does not sufferthat same handicap. It can make comparisonsand do statistical analyses that encompass allof the charts, not just limited portions of anyparticular chart. That does not mean that itsresults are necessarily more accurate orreliable, but they are certainly more compre-hensive.

Faced with a difference, what should aconscientious examiner do? Since theexaminer must assume responsibility for theresult, he or she should not automaticallyeither accept or reject a computer result thatdiffers. Further testing or review by anotherexaminer is often helpful and probably shouldbe the first consideration, before issuing areport in a case where examiner and computerdisagree. There may occasionally be cases in

Page 86: VOLUME 28 NUMBER 1

Wygant

Polygraph, 1999, 28 (1) 81

which the examiner is comfortable reportingthat his or her result was definite truth or lie,while the computer was inconclusive, or viceversa.

Whether an examiner even mentions adifference with a computer depends to a greatextent on how he or she routinely writesreports. If reports typically make no referenceto a computer result, the examiner cancontinue to give an opinion withoutmentioning any difference with the computer.It may lead to some surprises, disputes, anddisappointments later on, but at least therecan be no demonstration that the examinertried to conceal information he or she wouldhave otherwise ordinarily reported. Theexaminer in those circumstances can alwaysargue that he considered the computer resultadvisory, along with his own score, hisobservations, and whatever else he believesjustifiably contributes to his opinion.

When an examiner does routinelymention the computer result in his reports,omitting it when there is a difference ofopinion can lead to serious problems. Toreveal a difference belatedly suggests anattempt to hide something and can damagethe examiner's credibility.

The styles for describing a non-agreeing computer opinion vary considerably.A report might say: "Based on a review of myown numerical evaluation and other factorsthat included computer evaluation of thesecharts, I conclude that these charts indicatedeception/truthfulness with regard to the

questions about the issue." There is nomention of what the computer concluded, onlythat it was taken into consideration. This isprobably the least desirable mode of reporting.A more candid version would say: "I concludefrom my numerical evaluation of these chartsthat they indicate lying/truthfulness withregard to the questions about the issue.Computer evaluation of these same chartsproduced an inconclusive result." The reversesituation, manual inconclusive and definitecomputer result, might be worded: "It is myopinion, based on my evaluation of thesecharts, that they are inconclusive. Computerevaluation indicated lying/truthfulness to thequestions about the issue, which I do not findsupported by the charts."

The common factor in all of thesesamples is that the examiner (the big "I" in theexamples above) assumes full responsibility forthe result he or she is delivering. There is noattempt to characterize the conclusion assomething beyond the examiner's control, theautomatic determination of either a computeror some mystical scoring system. After all, ifan examiner wants to give the impression thatthis game is played solely by the numbers,then he is advancing the notion that he isabout as skilled as the Roto-Rooter man.

Whether any examiner reports anycomputer result is ultimately a matter ofpersonal preference. How he or she respondsto any differences of opinion, whether with acomputer or another examiner, is always areflection of the examiner's professionalstature and self-confidence.

References

Capps, M.H., & Ansley, N. (1992a). Numerical scoring of polygraph charts: What examiners reallydo. Polygraph, 21(4), 264-320.

Capps, M.H., & Ansley, N. (1992b). Strong control versus weak control. Polygraph. 21(4), 341-348.

Page 87: VOLUME 28 NUMBER 1

Scoring EDRs

Polygraph, 1999, 28 (1) 82

Short report

Proposed Method for Scoring Electrodermal Responses

Donald Krapohl

The material in this report is drawnfrom a larger study which is in progress. Thedata from 300 polygraph cases were randomlyselected from the Department of DefensePolygraph Institute confirmed case database.Half of the charts were from truthfulexaminees and the other half were fromdeceptive examinees. All cases were single-issue examinations conducted in the field oncomputer polygraphs by federal or lawenforcement polygraph examiners. Groundtruth was established by confession of thesubject, confession of someone whoexculpated the subject, or other irrefutableforensic evidence. Accuracy of the originalexaminer decision was not a criterion forinclusion of the cases in the database. Thetest format for all cases was the DoDPI ZoneComparison Test (DoDPI, 1992).

The electrodermal response (EDR) datawere standardized as the ratio of the relevantquestion EDR amplitude divided by theamplitude of the greater adjacent comparisonquestion, within each chart. A total of 2700EDR ratios were calculated from the 300 cases(3 charts consisting of 3 relevant questionseach per case). The distribution of ratios wassequentially divided into seven groups such

that all groups had the same number of ratios.Each division corresponded to one of thevalues used in the seven-position scale: +3,+2, +1, 0, -1, -2, -3. Ratios greater than 1.00indicated greater EDR amplitudes to therelevant question than to the comparisonquestion, while ratios smaller than 1.00indicated the EDR to the relevant question wasless than the EDR to the comparison question.Table 1 lists the cutting scores based on thepresent data.

The scores obtained using this methodwere compared to those of the 2:1, 3:1, 4:1requirements of the conventional (DoDPI,1999) scoring rules with these 300 cases. Theproposed method produced a mean score percase more in the correct direction for deceptive(t(149)=11.34, p<.01) and nondeceptive(t(149)=15.25, p<.01) than did the conven-tional scoring rules (see Table 2). Moreover,the proposed method produced a higherproportion of EDR total scores in the correctdirection than did the conventional method(86.3% versus 74.0%, z=3.77, p<.01). Inshort, the proposed method outperformed theconventional method for scoring the EDR withthese 300 field criminal cases.

Table 1Cutting scores for the seven-position scale and the corresponding ratio of relevant question

EDR amplitudes and comparison question EDR amplitudes. (n=2700 ratios)

7-Position Value EDR Ratios (R/C)+3 <.44+2 .44 - .67+1 .68 - .920 .93 - 1.20-1 1.21 - 1.60-2 1.61 - 2.44-3 >2.44

Page 88: VOLUME 28 NUMBER 1

Krapohl

Polygraph, 1999, 28 (1) 83

Table 2Mean values for the conventional and proposed method of scoring the electrodermal channel

by case (n=150 deceptive and 150 nondeceptive cases)

Conventional ProposedDeceptive cases -4.37 -8.33Nondeceptive cases 3.53 8.43

Table 3Conventional and proposed ratios for scoring EDRs

Greater response torelevant question

Greater response tocomparison question

Comparison R:C R:C R:C R:C R:C R:CScore -3 -2 -1 +1 +2 +3Conventional 4 : 1 3 : 1 2 : 1 1 : 2 1 : 3 1 : 4Proposed 2.45 : 1 1.61 : 1 1.21 : 1 1 : 1.09* 1 : 1.49* 1 : 2.27*

* Note: These values are reciprocals of the ratios found in Table 1. They have been inverted inTable 3 to facilitate the comparison with the more familiar conventional ratios.

In the first of two cross validations ofthe proposed scoring rules, a holdout sampleconsisting of 30 confirmed deceptive and 30confirmed nondeceptive cases were tested.Applying the proposed scoring method to thesenew data, the mean of the total EDR scores fordeceptive cases was more negative than thosescores rendered by the conventional scoringrules (-10.5 versus -5.6) and this differencewas significant (t(29)=7.37, p<.01). Similarly,the mean of the total EDR scores fornondeceptive cases was more positive thanthose of the conventional scoring system (7.1versus 2.7) and this difference was alsostatistically significant (t(29)=5.62, p<.01).The proposed method produced 54 of 60scores in the correct direction, while theconventional method produced only 49,though the difference in the proportions wasnot significant (z=1.32, p>.01). A second crossvalidation with 100 new cases will becompleted and reported later this year.

It is interesting to note that theproposed scoring rules are not symmetrical.The assignment of a negative value to therelative sizes of the EDR amplitudes (R/C)

requires a stronger response to the relevantquestion than does the assignment of apositive value if the comparison is in the otherdirection (C/R). In other words, thethresholds for assigning +1,+2, and +3 arelower than the thresholds for assigning a -1, -2 and -3. Table 3 compares the proposed andconventional ratios. There are many possibleexplanations for this response pattern, but aworking hypothesis is that, on average,nondeceptive examinees do not respond asstrongly to the comparison questions as thedeceptive examinees respond to relevantquestions. In support of this hypothesis,Figure 1 shows the relative sizes of EDRs tocomparison and relevant questions for the 300deceptive and nondeceptive subjects used todevelop this scoring method. If nondeceptivesubjects do not react electrodermally tocomparison questions as strongly as deceptivesubjects do to relevant questions, optimalscoring rules would be asymmetrical, such aswas found here. The present findings, ofcourse, require replication. If confirmed, allscoring methodologies will have toaccommodate the asymmetry in responsepatterns for truthful and deceptive examinees.

Page 89: VOLUME 28 NUMBER 1

Scoring EDRs

Polygraph, 1999, 28 (1) 84

Figure 1

Relative EDR amplitudes for relevant and comparison questions for deceptive and nondeceptive examinees

(n=1350 samples per bar)

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Deceptive NondeceptiveGround Truth

ED

R A

mpli

tude (

arb

itra

ry u

nit

s)

Comparison QuestionRelevant Question

Summary

Despite the near universality of theconventional EDR ratio scoring rules inpolygraph training and practice, the presentwriter could find no support in the literaturefor setting the EDR ratios at 2:1, 3:1 and 4:1.The notion that psychophysiological re-sponding is so mathematically convenient isprobably unduly optimistic. Though theconventional EDR ratios have been part of

validated scoring procedures over the last 30years, the present findings suggest that thoseEDR scoring rules have not been optimized,and that decision accuracy would benefit bythe development of empirically based scoringrules. The method outlined in this report isone such attempt. A forthcoming full reportwill include the second cross validation data,and similar analyses of the cardiograph andpneumograph tracings.

References

Department of Defense Polygraph Institute (1992). Zone comparison test. Ft. McClellan, Alabama.

Department of Defense Polygraph Institute (1999). Test data analysis (manual scoring process),seven-position numerical analysis scale. Ft. McClellan, Alabama.

Page 90: VOLUME 28 NUMBER 1

Ansley

Chart Interpretation - A Bibliography

Norman Ansley

A selective bibliography on the topic of chart interpretation. It includes recent developments in the use of computers. Most of these papers are available in the files of the DoDPI, the files of the APA References Service, and the files of the author.

Key words: Algorithms, Bibliography, Chart Interpretation, Computer Charts, Hand Scoring

Abrams, S. (1989). The Complete Polygraph Handbook. Lexington, MA: Lexington Books.

Abrams, S. (1977). A Polygraph Handbook for Attorneys. Lexington, MA: Lexington Books.

Abrams, S. (1973). Problems in Interpreting Polygraph Examinations in Malingerers. Polygraph, ~ (4), 377-380.

Abrams, S. (1975). The Validity of the Polygraph With Children. Journal of Police Science and Administration, ~ (3), 310-311.

Abrams, S. (1974). The Validity of the Polygraph With Schizophrenics. Polygraph, ~ (3),328-337.

Abrams, S. & Weinstein, E. (1974). The Validity of the Polygraph With Retardates. Journal of Police Science and Administration, ~ (1), 11-14.

Adachi, K. (1989). Discriminant Functions Based on Models for Randomized Block Design for Detection of Deception. Reports of the National Institute of Police Science, 42 (3), 24-31.

Adachi, K. & Suzuki, A. (1990). Correlation Structure Among Response Measures in Detection of Deception. Reports of the National Institute of Police Science, 43 (3), 13-16.

Adams, T. (1982). A Neurophysiological review and a Proposed Rationale for Interpreting Electrodermal Polygraph Records. Polygraph, 11 (4), 285-303.

Amsel, T. T. (1997). Fear of Consequences and Motivation as Influencing Factors on Psychophysiological Detection of Deception. Polygraph, 26 (4), 255-267.

Angus, J. & Castelaz, P. (1993). Artificial Neural Network Analysis of Polygraph Signals (Final Report) DoDPI 93-R-0010.

Ansley, N. Capillary Responses as a Polygraph Channel. In N. Ansley (Ed), Legal Admissibility of the Polygraph. Springfield, IL: Charles C. Thomas, 1975, pp. 257-265.

Ansley, N. (1959). A Comparison of Arm-Cuff and Wrist-Cuff Blood Pressure Patterns in Polygraph Charts. Journal of Criminal Law, Criminology, and Police Science, 50 (2) 192-194.

The author is a Life Member of the APA and president of Forensic Research, Inc, 35 Cedar Road, Severna Park, MD 21146.

Polygraph, 1999, 28 (1) 85

Page 91: VOLUME 28 NUMBER 1

Scoring Bibliography

Ansley, N. (1990). Scientific Research on Detecting Deception and Verifying Troth. Washington: United States Department of Defense.

Ansley, N. (in press). The Validity of the Modified General Question Test (MGQT).

Aragaki, F. (1979). Intensity of the Suspicion Expressed in Questions and the Physiological Responses. Paper presented at the National Convention of Police Psychology at the National Research Institute of Police Science.

Arther, R. O. (1955). Blood Pressure Rises on Relevant Questions in Lie Detection - Sometimes an Indication of Innocence Not Guilt. Journal of Criminal Law, Criminology, and Police Science, 46, 112-115.

Arther, R. O. (1971). Breathing Analysis, Part I. Journal of Polygraph Studies, .§ (4) 1-4, Part II, Journal of Polygraph Science, .§ (5), 1-4.

Arther, R. O. (1977). Miscellaneous Chart Analysis Principles, Part I, Journal of Polygraph Science, 11 (5), 1-4, Part II, Journal of Polygraph Science, 12 (1), 1-4.

Arther, R. O. (1970). Spot Responders. Journal of Polygraph Studies, 1: (6), 1-2.

Arther, R. O. (1976). Known-Lie Test Chart Analysis. Journal of Polygraph Science, 11 (2), 1-4.

Arther, R. O. (1996). The Two Types of Spot Responders. Journal of Polygraph Science, 30 (4), 1-4.

Backster, C. (1974). Anticlimax Dampening Concept. Polygraph, ~ (1), 48-50.

Backster, C. (1963). The Backster Chart Reliability Rating Method. Law & Order, 1 (1), 63-64.

Backster, C. (1996). The Backster Zone Comparison Technique, An Update from the Source. Paper delivered at the American Polygraph Association Seminar, New Orleans.

Backster, C. (1964). Cardio Baseline Changes Sometimes Misleading. Military Police Journal, p. 24.

Backster, C. (1963). New Standards for Chart Interpretation - Do the Charts Speak for Themselves? Law and Order, 11 (6), 67-71.

Backster, C. (1964). Polygraph Spot Analysis versus Non-localized Analysis. Law & Order, 12 (2), 31-32.

Backster, C. (Sep 1963). Polygraph Technique Trouble Shooting. Law & Order, pp. 54-55.

Baesen, H. V., Chung, C. & Yang, C. (1948-49), A Lie Detector Experiment. Journal of Criminal Law, Criminology, and Police Science, 39, 532-537.

Barland, G. H. (1972). Reliability of Polygraph Chart Evaluations. Polygraph, 1 (4), 192-206.

Barland, G. H. (May 1981). A Validation and Reliability Study of Counterintelligence Screening Test. Security Support Battalion, 902nd Military Intelligence Group, Fort George G. Meade, MD.

Benussi, V. (1914). Die Atmung Asymptome der Luge (The Respiratory Symptoms of Lying). Archiv fur die Gestamte Psychologie, 11, 244-273. Translated and printed in Polygraph, 1: (1), 52-76.

Biran, I. (1975). Chart Interpretation: An Interdisciplinary Paradigm. Polygraph, 1: (1), 7-17.

Polygraph, 1999, 28 (1) 86

Page 92: VOLUME 28 NUMBER 1

Ansley

Blackwell, N. J. (1994). An Evaluation of the Effectiveness of the Automated Scoring System (PASS) in Detecting Deception in a Mock Crime Analog Study. DoDPI94-R-003.

Burtt, H.E. (1921). Further Techniques for Inspiration Expiration Ratios. Journal of Experimental Psychology, 1, 106-111.

Burtt, H. E. (1921). The Inspiration Expiration Ratio During Truth and Falsehood. Journal of Experimental Psychology, 1, 1-23.

Capps, M. H. & Ansley, N. (1992). Analysis of Federal Polygraph Charts by Spot and Chart Total. Polygraph, 21 (2), 110-131.

Capps, M. H. & Ansley, N. (1992). Analysis of Private Industry Polygraph Charts by Spot and Chart Total. Polygraph, 21 (2), 132-142.

Capps, M. H. & Ansley, N. (1992). Anomalies: The Contributions of the Cardio, Pneumo, and Electrodermal Measures Toward a Valid Conclusion. Polygraph, 21 (4),321-340.

Capps, M. H. & Ansley, N. (1992). Comparison of Two Scoring Scales. Polygraph, 21 (1),39-43.

Capps, M. H. & Ansley, N. (1992). Numerical Scoring of Polygraph Charts: What Examiners Really Do. Polygraph, 21 (4),264-320.

Capps, M. H. & Ansley, N. (1992). Strong Control versus Weak Control. Polygraph, 21 (4), 341-348.

Capps, M. H. , Knill, B. L. & Evans, R. K. (1993). Effectiveness of Symptomatic Questions. Polygraph, 22 (4), 285-298.

Capps, M. H., Knill, B. L., Evans, R. K. & Johnson, G. J. (1995). Cognitive Arousal as a Means of Evaluation. Polygraph, 24 (1), 1-5.

Castelaz, P. F. & Angus, J. E. (1994). Deception Classification via Nonconventional Processing Techniques. SPR Abstracts, S12.

Clingan, L. (1980). To Score or Not to Score. The Police Polygraphists Journal, 23-25.

Decker, R. E. (Fall 1981). Chart Analysis. C.A.P.E. Newsletter,! (4), 5-24.

Decker, R. E., Stein, A. E. & Ansley, N. (1972). A Cardio Activity Monitor, Polygraph,! (3) 108-124.

Duston, B. M. (Dec 1990). A Comparison of Discriminant Analysis Techniques for Classifying Polygraph Charts. Naval Oceans Systems Center, San Diego.

Duston, B. M. (Nov 1992). Statistical Techniques for Classifying Polygraph Data. Naval Command, Control, and Ocean Surveillance Center, San Diego.

Edel, E. C. & Jacoby J. (1975). Examiner Reliability in Polygraph Chart Analysis: Identification of Physiological Responses. Journal of Applied Psychology, 60 (5),632-634.

Edel, E. C. & Moore, L. A. Jr. (1984). Analysis of Agreement in Polygraph Charts. Polygraph, 13 (2), 193-197.

Polygraph, 1999, 28 (1) 87

Page 93: VOLUME 28 NUMBER 1

Scoring Bibliography

Elaad, E. (1985). Decision Rules in Polygraph Examinations. In Anti-Terrorism; Forensic Science; Psychology in Police Investigations; Proceedings of IDENTA '85. Jerusalem: Heiliger & Co., 167-178.

Elaad, E. (1990). Detection of Guilty Knowledge in Real-Life Criminal Investigations. Journal of Applied Psychology, 75 (5), 521-529.

Elaad, E., Ginton, A. & Jungman, N. (1992). Detection Measures in Real-Life Criminal Guilty Knowledge Tests. Journal of Applied Psychology, 77 (5), 757-767.

Elaad, E. & Kleiner, M. (1990). Polygraph Chart Interpreter Experience on Psychophysiological Detection of Deception. Journal of Police Science and Administration, 17 (2), 115-123.

Franz, M. L. (1989). Technical Report: Relative Contributions of Physiological Recordings to Detect Deception. Contract No. MDA904-88-M-6612, Argenbright Polygraph, Inc.

Furgerson, R. M. (Oct 1989). Evaluating Investigative Polygraph Results. FBI Law Enforcement Bulletin, 58 (10),6-11, reprinted in Polygraph, 18 (4), 208-215.

Gately, R. D. (1983). Specific Testing With the Atari 800 Computer. Unpublished manuscript.

Geddes, L. A. & Newberg, D. C. Cuff Pressure Oscillations in the Measurement of Relative Blood Pressure. Polygraph, § (2), 113-122.

Glassford, E. & Nyboer, J. (1975). Emotional Stress and Oscillometric Variations of the Pulse Curve. Polygraph, 1: (2), 108-117.

Golden, R. I. & Hunter, F. L. (Aug 1970). The Measurement of Upper and Lower Respiration in Lie Detection. Paper presented at the American Polygraph Association Seminar, Los Angeles.

Gordon, N. J. & Cochetti, P. M. (1987). The Horizontal Scoring System. Polygraph, 16 (2), 116-125.

Haney, J. W. (1943-44). The Analyses of Respiratory Criteria in Deception Tests, A Possible Source of Misinterpretation. Journal of Criminal Law and Criminology, 34, 268-273.

Haney, K. W. (1972). Question Spacing and the Length of Reactions. Polygraph, 1 (4), 234-237.

Harrelson, L. H. (1965). Chart Interpretation and Question Formulation. In Keeler Polygraph Institute Second Annual Seminar. Chicago: Keeler Polygraph Institute.

Harrleson, L. H. (1964). The Keeler Technique, 2nd ed. Chicago: Keeler Polygraph Institute.

Hathaway, S. R. & Hanscom, C. B. (1958). "The Statistical Evaluation of Polygraph Records." In V. A. Leonard (Ed.) Academy Lectures on Lie Detection, v.2. Springfield, IL: Charles C. Thomas, 112-121.

Hilliard, D. L. (1979). A Cross Analysis Between Relevant Questions and a Generalized Intent to Answer Truthfully Question. Polygraph, § (1), 73-77.

Holmes, W. D. (1958). "The Degree of Objectivity in Chart Interpretation." In V. A. Leonard (Ed.) Academy Lectures on Lie Detection, v.2. Springfield, IL: Charles C. Thomas, 83-85.

Honts, C. R. & Driscoll, L. N. (1988). A Field Validity Study of the Rank Order Scoring System (ROSS) in Multiple Issue Control Question Tests. Polygraph, 17 (1),1-15.

Polygraph, 1999, 28 (1) 88

Page 94: VOLUME 28 NUMBER 1

Ansley

Horvath, F. (1994). The Value and Effectiveness of the Sacrifice Relevant Question: An Empirical Assessment. Polygraph, 23 (4), 261-279.

Howland, D. P. (1978). An Application of the Positive Control Concept of Polygraph Examination. Polygraph, Z (4), 286-294.

Howland, D. (1981). Positive Control Question Technique; Pre-Test Interview and Chart Interpretation. Polygraph, 10 (1), 37-41.

Inbau, F. E. (1942). Lie Detection and Criminal Interrogation. Baltimore: Williams & Wilkins Co.

Inbau, F. E. (1948). Lie Detection and Criminal Interrogation. Second Ed., Rev. Baltimore: Williams & Wilkins Co.

Inbau, F. E. & Reid, J. E. (1953). Lie Detection and Criminal Interrogation. Third Ed., Rev. Baltimore: Williams & Wilkins Co.

Instruction Manual for the Berkeley Psychograph (1943). San Rafael, CA: Lee & Sons.

Jayne, B. (1981). Purposeful Non-Cooperation: A Diagnostic Opinion of Deception. Polygraph, 10 (3), 156-174.

Joseph, C. N. (1957). "Compensatory Responses and Irregularities in Polygraph Chart Interpretation." In V. A. Leonard (Ed.) Academy Lectures in Lie Detection, v.l. Springfield,IL: Charles C. Thomas, 93-99.

Keeler, L. (1930). A Method for Detecting Deception. American Journal of Police Science, 1 (1), 38-51.

Keifer, R. (JuI1993). Chart Interpretation. Paper presented at the American Polygraph Association Seminar, Newport Beach, CA.

Kircher, J. C. (1989). Response Quantification and Decision Making in the Detection of Deception: Current Status and Future Directions. SPR Abstracts, 26 (4A), 2-3.

Kircher, J. C. & Raskin, D. C. (1988). Human versus Computerized Evaluations of Polygraph Data in a Laboratory Setting. Journal of Applied Psychology, 73 (2), 291-302.

Kircher, J. C., Raskin D. C., Honts, C. R. & Horowitz, S. W. (1995). Lens Model Analysis of Decision Making by Field Polygraph Examiners. SPR Abstracts, S45.

Koll, M. (1979). Analysis of Zone Charts by Various Pairings of Control and Relevant Questions. Polygraph, ~ (2), 154-160.

Knapp, R. B. & Layeghi, S. (1994). Classification of Deception Using Fuzzy Pattern Recognition. SPR Abstracts, S12.

Kubis, J. F. (1973). Analysis of Polygraph Data, Dependent and Independent Situations. Part 1. Polygraph, ~ (1), 42-58. Part II. Polygraph, ~ (2), 89-107.

Landis, C. & Gullette, R. (1925). Studies of Emotional Reactions. Systolic Blood Pressure and Inspiration-Expiration Ratios. Journal of Comparative Psychology, Q (3), 221-253.

Landis, C. & Wiley, L. E. (1926). Changes in Blood Pressure and Respiration During Deception. Journal of Comparative Psychology, § (1), 1-19.

Polygraph, 1999, 28 (1) 89

Page 95: VOLUME 28 NUMBER 1

Scoring Bibliography

Larson, J. A. (1923). The Cardio-Pneumo-Psychogram in Deception. Journal of Experimental Psychology, § (6), 420-454.

Lee, C. D. (1953). The Instrumental Detection of Deception - The Lie Test. Springfield, IL: Charles C. Thomas.

MacNitt, R. D. (1942). Measures of Deception. Journal of Criminal Law and Criminology, 35 (3).

Magiera, A. C. (1975). "Patterns of Purposeful Distortions." In N. Ansley (Ed.) Legal Admissibility of the Polygraph. Springfield,IL: Charles C. Thomas, 199-209.

Marston, W. M. (1938). The Lie Detector Test. New York: Richard R. Smith. Reprinted by the American Polygraph Association, 1989, pp. 102-105.

Matte, J. A. (1996). Forensic Psychophysiology Using the Polygraph, Scientific Truth Verification - Lie Detection. Williamsville, NY: J.A.M. Publications.

Matte, J. A. (1980). The Art and Science of the Polygraph Technique. Springfield, IL: Charles C. Thomas.

Matte, J. A. (1981). Polygraph Quadri-Zone Reaction Guide. Polygraph, 10 (3), 186-193.

Minor, P. (Aug 1985). Modified General Question Technique (MGQT) Summary. Paper delivered at the American Polygraph Association Seminar, Reno, Nevada.

Minor, P. K. (Aug 1985). Modified Relevant-Irrelevant (MRI) Technique. Paper delivered at the American Polygraph Association Seminar, Reno, Nevada.

Minor, P. K. (1989). "The Relevant-Irrelevant Technique." In Abrams, S. The Complete Polygraph Handbook. Lexington, MA: Lexington Books, 143-152.

Minor, P. K. (Aug 1994). The Relevant-Irrelevant Technique. Paper delivered at the American Polygraph Association seminar at Newport Beach.

Minor, P. K. (Aug 1985). The Zone Comparison Test (ZCT). Paper delivered at the American Polygraph Association Seminar, Reno, Nevada.

Miritello, K. (Nov 1990). Rank Order Analysis. Unpublished manuscript.

Mohr, C. T. (1982). Problems Encountered in Testifying to Test Charts of Other Examiners. Empire State Polygraph Society Newletter, § (4), 8-11.

Moroney, William F. & Zenhausen, R. J. (1972). Detection of Deception as a Function of Galvanic Skin Response Recording Methodology. Journal of Psychology, 80 (2), 255-262.

Murray, K. E. (1989). Movement Recording Chairs: A Necessity? Polygraph, 18 (1), 15-23.

Nakayama, M. & Yamamura, T. (1990). Changes of Respiration Pattern to the Critical Question on Guilty Knowledge Technique. Polygraph, 19 (3), 188-198.

Nieman, D. A. (1982). Physiological Response to Rank Order of Problems as Measured by the Polygraph. Doctoral Dissertation, United States International University, UMI 43/03B, p. 856, DEC82-18282 vo1.3, p.735.

Polygraph, 1999, 28 (1) 90

Page 96: VOLUME 28 NUMBER 1

Ansley

Olsen, D. E., Ansley, N., Feldberg, I. E. Harris, J. C. & Cristion, J. A. (1991). Recent Developments in Polygraph Technology. Johns Hopkins APL Technical Digest, 12 (4), 347-357.

Olsen, D. E., Harris, J. C., Capps, M. H. & Ansley, N. (1997). Computerized Polygraph Scoring System. Journal of Forensic Sciences, 42 (1), 61-71.

Olsen, D. E., Harris, J. C., & Chiu, Wendy W. (1994). The Development of a Physiological Detection of Deception Scoring Algorithm. SPR Abstracts, 31, S 11.

Pierce, R. W. (1950). The Peak of Tension Test. ISDD Bulletin, ~ (3), 1,4,5.

Raskin, D. C., Barland, G. H. & Podlesny, J. A. (1977). Validity and Reliability of Detection of Deception. Polygraph, § (1), 1-39.

Reburn, J. W. & Mayo, J. F. (Apr 1975). The Effect of Noise on Polygraph Tracings. Maryland Polygraph Review, 8-9.

Reid, J. E. (1992). Controlled Breathing as an Indication of Deception. Polygraph, 11 (1),12-13.

Reid, J. E. (1992). Interpretation of Truth and Deception in Polygraph Test Records. Polygraph, 11 (1),46-61.

Reid, J. E. & Inbau, F. E. (1966). Truth and Deception: The Polygraph ("Lie Detector") Technique. Baltimore: Williams & Wilkins.

Reid, J. E. & Inbau, F. E. (1977). Truth and Deception: The Polygraph ("Lie Detector") Technique. Second Ed., Rev. Baltimore: Williams & Wilkins.

Robbins, N. E. & Penley, W. J. (1975). Review of Polygraph Charts of Non-Deceptive Subjects. Polygraph, 1 (3), 199-206.

Silverberg, B. A. (1980). A Note on the Diagnostic Value of the Dicrotic Pulse. Polygraph, 2 (2), 114-119.

Slowik, S. M. (1971). Comparison of Thoracic and Abdominal Respiration Deceptive Responses. Paper presented to the American Polygraph Association Seminar, Atlanta, Georgia.

Slowik, S. M., Buckley, J. P., Kroeker, L. & Ash, P. (1973). Abdominal and Thoracic Pneumograph Recordings. Polygraph, £ (1), 12-27.

Summers, W. G. (1936). Guilt Distinguished from Complicity. Psychology Bulletin, 33, 787.

Summers, W. G. (1939). Science Can Get the Confession. Fordham Law Review, §, 335-354.

Suter, James E. (1979). A Field Study of the Usefulness of the Cardio Activity Monitor. Polygraph, § (2), 112-119.

Suzuki, A. & Hikita, Y. (1981). An Analysis of Responses on Polygraph; A Dimunition of Responses. Polygraph, 10 (1), 1-7.

Suzuki, A., Ohnishi, K. Matsuno, K., & Arasuna, M. (1979). Amplitude Rank Score Analysis ofGSR in the Detection of Deception: Detection Rates Under Various Examination Conditions. Polygraph, § (3), 242-252.

Polygraph, 1999, 28 (1) 91

Page 97: VOLUME 28 NUMBER 1

Scoring Bibliography

Suzuki, A., Watanabe, S., Ohnishi, K., Matsuno, K., & Arasuna, M. (1979). The Objective Analysis ofGSR in the Detection of Deception. Polygraph, § (1),53-63.

Thackray, R. I. & Orne, M. T. (1968). A Comparison of Physiological Indices in Detection of Deception. Psychophysiology, 1: (3), 329-339.

Timm, H. W. & Hedges, W. K. (1993). Factors Theoretically Affecting the Incidence of Deceptive Responses During Pre-employment Screening Procedures. Polygraph, 22 (3), 229-244.

Trovillo, P. V. (1942). Deception Test Criteria: How One Can Determine Truth and Falsehood from Polygraphic Records. Journal of Criminal Law and Criminology, 33 (4), 338-358.

Trovillo, P. V. (1942). Some Indices of Deception for the Interpretation of Polygraph (Lie Detector) Tests. Psychological Bulletin, 39, 599-600.

Turrou, L. G. (1938). Nazi Spies in America. New York: Random House.

United States Army Military Police School (USAMPS) (n.d.). Chart Interpretation - Chart Analysis, Classroom Issue, IN 330, 10 118.

United States Department of Defense Polygraph Institute (DoDPI) (1998). Instructions for Manually Scoring Charts Collected During a Psychophysiological Detection of Deception (POD) Examination.

United States Department of Defense Polygraph Institute (6 June 1990). Peak of Tension (POT) Test Construction Summary Sheet.

van Herk, M. (1991). Numerical Evaluation: Seven Point Scale ± 6 and Possible Alternatives; A Discussion. Polygraph, 20 (2), 70-79.

Wainer, H. & Gruvaeous, G.T. (1974). Using Variance as a Discriminator in Lie Detection. Journal of Applied Psychology, 59 (1), 110-112.

Watanabe, S. & Suzuki, A. (1977). The Heart Rate as an Index of Lie Detecting Test. Polygraph, § (2), 103-112.

Watanabe, S. & Kajitani, M. (1995). Abstract: The Heart Rate Response as an Index of the Lie Detector Test. Polygraph, 24 (1), 56.

Weaver, R. S. (1985). Effects of Differing Numerical Chart Evaluation Systems on Polygraph Examination Results. Polygraph, 14 (1), 34-42.

Weaver, R. S. (1980). The Numerical Evaluation of Polygraph Charts: Evolution and Comparison of Three Major Systems. Polygraph, 2 (2), 94-108.

Wicklander, D. E. & Hunter, F. L. (1975). The Influence of Auxiliary Sources of Information in Polygraph Diagnosis. Journal of Police Science and Administration, ~, 405-409.

Winter, J. E. (1936). A Comparison of the Cardio-Pneumo-Psychograph and Association Methods in the Detection of Lying in Cases of Theft Among College Students. Journal of Applied Psychology, 20, 243-248.

Wygant, J. R. (1984). Rationale for Scoring. Polygraph, 13 (3), 263-266.

Polygraph, 1999, 28 (1) 92

Page 98: VOLUME 28 NUMBER 1

Ansley

Yankee, W. J. (July 1968). A Report on the Computerization of Polygraphic Records. Paper presented to the Fifth Annual Seminar of the Keeler Polygraph Institute Alumni Association, Chicago.

Yankee, W. J. & Laughner, D. L. (1973). An Investigation of Mutual Relationships Between Sighs and Cardio Tracings in Chart Interpretation. Polygraph, ~ (1), 28-35.

Polygraph, 1999, 28 (1) 93