
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 32, NO. 2, FEBRUARY 2013 289

Scan to Nonscan Conversion via Test Cube Analysis

Ozgur Sinanoglu

Abstract—Increasing complexity of integrated circuits has forced the industry to abandon partial scan, which necessitates a computationally demanding and unaffordable sequential automatic test pattern generation (ATPG), and to instead adopt full scan, despite its costs. In this paper, we propose a partial scan scheme driven by a computationally efficient test cube analysis. We tackle the challenges associated with the identification of the conditions to restore the controllability and observability compromised due to partial scan, and with the formulation of these conditions in terms of test cube operations. Upon the identification of a maximal-sized set of scan flip-flops that are converted to nonscan, a simple postprocessing of the test cubes helps compute the values to be loaded into the scan flip-flops, eliminating the need to rerun ATPG, while at the same time ensuring the quality of full scan. We further enhance this framework through techniques that process the test data before and after the application of the proposed test cube analysis-driven partial scan technique, in order to enlarge the size of the nonscan flip-flop set. The proposed scheme combines the simplicity of the conventional ATPG flow with the area, performance, test time, and test power reduction benefits of partial scan. The proposed test cube analysis-driven partial scan scheme is orthogonal to, and thus fully compatible with, other test cost-reduction techniques, such as test data compression and test power reduction, which can be applied in conjunction.

Index Terms—Partial scan, test cost reduction, test cube analysis, test cube operations.

I. Introduction

INCREASING complexity of integrated circuits has forced the industry to abandon partial scan, which removes scan multiplexers corresponding to some of the flip-flops, disconnecting them from the scan path. As a result, controllability and observability of these flip-flops are compromised, necessitating sequential automatic test pattern generation (ATPG), where these flip-flops are controlled and observed through functional paths. The computational cost of sequential ATPG cannot be afforded, given the complexity of integrated circuits today. Consequently, the industry has given up on partial scan and adopted full scan despite its costs.

Manuscript received October 13, 2011; revised March 8, 2012, June 23, 2012, and September 3, 2012; accepted September 5, 2012. Date of current version January 18, 2013. A preliminary version [1] of this work was presented at the VLSI Test Symposium, Dana Point, CA, in May 2011. This paper was recommended by Associate Editor J. L. Dworak.

The author is with the Department of Computer Engineering, New York University Abu Dhabi, Abu Dhabi, UAE (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCAD.2012.2218603

Full scan incurs area, performance, and test costs. Insertion of as many multiplexers as the number of flip-flops in the design obviously imposes a considerable area cost. Furthermore, these multiplexers are inserted on functional paths, resulting in critical path prolongation by a multiplexer delay, and hence degrading the performance of the design timing-wise. Full scan also incurs significant test costs. Every test pattern consists of as many bits as the number of flip-flops in the design, translating into high test time and test data volume. Another problem that full scan imposes is the excessive switching activity during the test, as all the flip-flops are active during shift operations. Elevated levels of power dissipation [2] occur during testing, which, if overlooked, can cause reliability issues.

A computationally efficient partial scan can be a remedy for the problems of full scan. Removal of scan multiplexers, and thus taking some of the flip-flops off the scan path [3]:

1) reduces area cost;
2) potentially improves the critical paths of the design, thus enhancing functional performance;
3) reduces the scan path length, decreasing test time and test data volume, a form of test data compression;
4) reduces switching activity during testing and curbs power dissipation and IR drop, as only the scan flip-flops toggle during shift operations while the nonscan flip-flops are inactive during shift.

An extensive amount of research has been conducted in partial scan design. The previously proposed techniques in this field can be classified mainly into three categories: 1) structure-based techniques that typically involve breaking cycles and/or reducing scan depth [4]–[14]; 2) testability-based techniques that select scan flip-flops based on testability improvements [8], [9], [15]–[24]; and 3) test generation-based techniques that intertwine test generation and scan flip-flop selection [25]–[30]. Other partial scan techniques include those driven by layout constraints [8], timing constraints [31], retiming [5], [32], and toggling rate of flip-flops and entropy measures [33]. These techniques typically necessitate the utilization of sequential ATPG to generate test patterns on the partially scanned design, or of combinational ATPG with time frame expansion (if all cycles are broken); they not only fail to comply with the existing design/test flow that industry utilizes today but are also incapable of ensuring the quality of full scan.

0278-0070/$31.00 © 2013 IEEE

In this paper, we propose a computationally efficient and design flow-compliant partial scan scheme that can deliver all the aforementioned area, test cost reduction, and performance benefits while ensuring the quality of full scan. The proposed scheme is driven by an analysis of test cubes, which have been generated by a combinational ATPG tool, and identifies the flip-flops that can be converted to nonscan while keeping the test quality intact. Upon the conversion of scan flip-flops to nonscan, a simple postprocessing of the test cubes helps compute the values to be loaded into the remaining scan flip-flops, eliminating the need to rerun ATPG. This way, the proposed partial scan scheme combines the simplicity of the conventional ATPG flow with the area, performance, and test cost reduction benefits of partial scan. Ensuring the quality of full scan and eliminating the need to rerun ATPG on the partially scanned design render the proposed scheme fully compliant with the design and test flow and, to the best of our knowledge, uniquely differentiate it from the previously proposed partial scan techniques; as these features exist in full scan, and not in the other partial scan approaches proposed earlier, we confine our comparisons to full scan only.

The challenges engendered in inferring, from only a given set of test cubes, a maximal-sized set of flip-flops that can be converted to nonscan consist of the identification of the conditions to restore the controllability and observability compromised due to partial scan, and of the formulation of these conditions in terms of test cube operations. By tackling these challenges, we propose a partial scan scheme that offers yet another benefit: as the structural details of the design are not required in this analysis, and the proposed tool instead operates only on a set of test cubes, partial scan implementation can be outsourced with no intellectual property considerations. It is also noteworthy that the proposed test cube analysis-driven partial scan scheme is orthogonal to, and thus fully compatible with, other test cost reduction techniques, such as test data compression [34] and test power reduction [2], which need to be applied subsequently on the test cubes processed by the proposed partial scan technique.

The theoretical framework that reduces nonscan cell identification to basic test cube operations also sheds light on properties that are desirable in test data to bolster the effectiveness of the proposed scheme, enabling the development of test data preprocessing and postprocessing operations to further enhance the scan to nonscan conversion ratio. We therefore present a suite of simple yet efficient test/justification cube replacement operations to be applied prior and subsequent to the proposed test cube analysis-driven partial scan scheme. These techniques fully preserve all the simplicity and flow-compatibility benefits that the basic test cube analysis framework offers, while boosting the cost reductions attained.

The remainder of this paper is organized as follows. In Section II, we describe the proposed partial scan scheme, with a particular focus on the clocking scheme. In Section III, we follow this discussion up with the proposed test cube analysis. In Section IV, we present the preprocessing and postprocessing techniques applied on test data to improve the scan to nonscan conversion ratio. Section V discusses the application flow for the proposed scheme. Experimental results and concluding remarks are presented in Sections VI and VII, respectively.

II. Proposed Partial Scan Scheme

Conversion of a scan flip-flop to a nonscan flip-flop can be accomplished by removing the associated scan multiplexer and rerouting the scan chain around the flip-flop, bypassing it. The end result is area cost reduction and potentially performance enhancement due to the removal of the scan multiplexer, in addition to the test time, data volume, and power dissipation reductions due to the shortened scan chain; yet, controllability and observability of the converted flip-flop are compromised with the removal of the scan multiplexer.

In order to preserve test quality, the effect of the scan to nonscan conversion should be nullified by restoring the compromised controllability and observability. The latter is easier to regain, via a simple tap off the output of the flip-flop as an observation point. Observing the content of the flip-flop through the observation point subsequent to each capture operation suffices to restore the observability compromised due to scan to nonscan conversion. Furthermore, the observation points corresponding to multiple nonscan flip-flops can be compacted together via a logic cone analysis [35] in order to reduce the associated area cost while keeping the error detection level intact; error masking can be prevented by compacting the outputs of flip-flops that have disjoint input cones. The compacted observation points can be multiplexed onto a primary output (PO) or alternatively feed an existing or a dedicated compactor/multiple-input signature register (MISR) along with the scan chain(s). The utilization of an MISR necessitates the additional design effort of fixing all unknown response bits (x's) captured in the nonscan cells; alternative solutions such as the X-canceling MISR [36] can be employed if the additional design effort is unaffordable.

The compromised controllability is more challenging to restore. With the removal of the scan multiplexer, the nonscan flip-flop should be justified, through the functional path driving the flip-flop, to the value required by a test pattern. In order to render a simple test cube analysis sufficient for the identification of whether and how this justification can be accomplished, we constrain any such justification to span only a single time frame; in the proposed scheme, a single clock pulse received only by the nonscan flip-flops justifies them to the required value. As the associated functional paths are driven by the scan flip-flops, the justify pulse is applied after the shift pulses (upon the completion of the shift-in operations, and thus upon the load of the scan flip-flops) and before the capture pulse(s) (so that the nonscan flip-flops are also loaded through the functional paths prior to capture). The clocking of the flip-flops in the proposed partial scan scheme is provided in Fig. 1; aside from the newly inserted justify pulse, this clocking scheme is identical to that of the traditional scan-based scheme, and can be implemented via simple clock gating rather than separate dedicated clocks for the two groups of flip-flops. The shift pulses drive only the scan flip-flops, the justify pulse drives only the nonscan flip-flops, and the capture pulse(s) are received by both the scan and nonscan flip-flops. Compared to full scan testing, the same pattern is applied to the circuit under test prior to the capture pulse(s), and with a careful feed of the observation points to the outputs/compactor/MISR, the same response is observed.


Fig. 1. Clocking in the proposed partial scan scheme.

Both static (stuck-at) and dynamic (launch-off-capture at-speed) types of testing are supported, as shown in Fig. 1, while launch-off-shift testing would require drastic changes in the proposed clocking scheme.

In this scheme, a nonscan flip-flop may possibly receive an unintended value upon the justify pulse due to a defect in the functional path driving this flip-flop. Although such a defect can be implicitly detected in the subsequently captured response, its detection is ensured by observing the observation points twice: once after the justify pulse, and once after the capture pulse.

III. Basic Framework for the Proposed Test Cube Analysis

The simplicity of the proposed partial scan scheme in justifying a nonscan flip-flop enables a test cube analysis-driven identification of the flip-flops that can be converted to nonscan. In the test cube analysis presented in this section, we adhere to the constraint that fault coverage must remain intact. In other words, the test cube analysis identifies a subset of flip-flops to be converted to nonscan by ensuring that "all test cubes" can still be applied intact. Thus, the combinational ATPG process conducted to generate the test cubes need not be repeated. In this section, we present the proposed test cube analysis, which can be applied as a post-ATPG process.

We first introduce the following terminology. The test cubes of the design, which represent the values to be loaded into the flip-flops of the circuit for detecting the faults of a particular type, are denoted by TC[i][j], where 0 ≤ i < Num_cubes and 0 ≤ j < Num_inputs; TC[i][j] denotes the binary value of the jth input [primary input (PI) or pseudo-primary input (PPI)] in the ith test cube, Num_cubes denotes the number of test cubes, and Num_inputs denotes the total number of PIs and PPIs.¹ We next define a flip-flop justification cube, JCv[i][j], which denotes the binary value of the jth input for justifying flip-flop i to a value v (0 or 1) through the functional paths, where 0 ≤ i < Num_FFs and 0 ≤ j < Num_inputs, and Num_FFs denotes the number of flip-flops. We also use the notation TC[i] and JCv[i] to denote the bit sequence (vector) for a test cube and a justification cube, respectively.

¹ PPIs correspond to the outputs of the flip-flops that drive the combinational logic.

Fig. 2. Justification of a flip-flop through functional paths.

A design fragment consisting of a single logic cone is provided in Fig. 2, where the functional logic driving the rightmost flip-flop e is shown and the details regarding the scan logic (multiplexers) are hidden. This flip-flop can be justified to 1 by setting the leftmost flip-flop a to 1, or bc to 01. Similarly, ab = 01 or ac = 00 sets flip-flop e to 0. The condition for justifying a flip-flop to a value is identical to the detection condition for the fault on the D-input of the flip-flop that is stuck at the complementary value; the activation of the fault necessitates the flip-flop to be justified to the value complementary to the stuck-at value, while there is no propagation requirement for the fault, as the flip-flop is an observable point. In other words, the test cube for the s-a-0 fault on the D-input of e is 1xxxx or x01xx, denoting the condition for justifying e to 1. Similarly, the test cube for the s-a-1 fault on the D-input of e is 01xxx or 0x0xx, denoting the condition for justifying e to 0, as illustrated in Fig. 2. In this example, JC0[e] = 01xxx or 0x0xx, and JC1[e] = 1xxxx or x01xx. The justification information for each flip-flop is actually embedded within the set of test cubes.
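The cube notation above lends itself to a direct implementation. As an illustrative sketch (not the author's tool), cubes can be represented as strings over {0, 1, x}, with two cubes mergeable if and only if no bit position holds complementary 0/1 values; the function names here are our own:

```python
def can_merge(c1: str, c2: str) -> bool:
    """Two cubes merge iff no bit position holds complementary 0/1 values."""
    return all(a == 'x' or b == 'x' or a == b for a, b in zip(c1, c2))

def merge(c1: str, c2: str) -> str:
    """Combine two mergeable cubes, keeping the specified bit at each position."""
    assert can_merge(c1, c2)
    return ''.join(b if a == 'x' else a for a, b in zip(c1, c2))
```

For instance, the cube 0x010 merges with JC0[e] = 01xxx, yielding 01010.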

The proposed scheme is based on loading a pattern into the scan flip-flops and, subsequently, performing a single-cycle partial capture (justify) on the nonscan flip-flops to load them through the functional paths. Therefore, a pattern that is loaded into the scan flip-flops serves two purposes for each test cube: 1) delivering the care bits of the test cube into the scan flip-flops, and 2) fulfilling the conditions to deliver the care bits of the test cube into the nonscan flip-flops through the functional paths. Every care bit in the test cube corresponding to a nonscan flip-flop therefore necessitates the embedding of the associated justification cube for this flip-flop in the loaded pattern. The pattern loaded into the scan flip-flops represents a test cube merged with as many justification cubes as the number of care bits corresponding to the nonscan flip-flops.

A. Single Flip-Flop Conversion

There are two conditions to be satisfied in order to convert a flip-flop f to nonscan.


1) S1: JCv[f] can be merged² with TC[i] for all i such that TC[i][f] = v. In other words, there must be no 0–1 conflicts between a test cube that requires f to be at v, and the condition for justifying f to v.

2) S2: JCv[f][f] = x for v = 0 and v = 1. In other words, the justification condition for f should not specify f itself to a value, creating a circular dependency; if f is converted to nonscan, it can only be justified by controlling other scan flip-flops.

For the example above, flip-flop e can be converted to nonscan if the first condition is met, as the second condition is satisfied; neither JC0[e] nor JC1[e] requires e to be specified. If all the test cubes that require e to be at 0 merge with JC0[e], and all the test cubes that require e to be at 1 merge with JC1[e], then e can be converted to nonscan. For instance, a test cube 0x010, which specifies e as 0, is compatible with JC0[e] = 01xxx. Therefore, if 0101 is loaded into the other flip-flops a, b, c, and d in four shift cycles, a subsequent justify pulse received only by e would load 0 into e, delivering the desired bits of the test cube into all the flip-flops.

B. Pair Conversion

It is possible that two flip-flops that can be converted individually cannot be converted together due to conflicting justification conditions. Next, we define the conditions for converting two flip-flops f1 and f2 simultaneously.

1) P1 = S1 and S2: Single flip-flop conversion conditions are met for both f1 and f2.

2) P2: JCv1[f1] can be merged with JCv2[f2], if ∃i such that TC[i][f1] = v1 and TC[i][f2] = v2. In other words, if the two bits corresponding to f1 and f2 are both specified by a test cube i, then the associated justification cubes of f1 and f2 should be nonconflicting.

3) P3: JCv[f1][f2] = x and JCv[f2][f1] = x for v = 0 and v = 1. In other words, the justification cube for either flip-flop should not specify the other flip-flop, creating a circular dependency; if both flip-flops are converted to nonscan, they can only be justified by controlling other scan flip-flops.
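The P2/P3 checks can likewise be sketched over string cubes (an illustrative implementation under our own naming; P1 is assumed to be checked separately via the single-conversion conditions):

```python
def can_merge(c1, c2):
    # Cubes merge iff no position holds complementary 0/1 values
    return all(a == 'x' or b == 'x' or a == b for a, b in zip(c1, c2))

def pair_compatible(f1, f2, test_cubes, jc):
    """Check P2 and P3 for the flip-flops at bit positions f1 and f2.

    jc: maps a flip-flop position to its {'0': cube, '1': cube} justification cubes.
    """
    # P3: neither flip-flop's justification cubes may specify the other
    for v in '01':
        if jc[f1][v][f2] != 'x' or jc[f2][v][f1] != 'x':
            return False
    # P2: whenever a test cube specifies both, their justification cubes must merge
    return all(can_merge(jc[f1][tc[f1]], jc[f2][tc[f2]])
               for tc in test_cubes if tc[f1] != 'x' and tc[f2] != 'x')
```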

The next question that can be raised is how we can maximize the number of flip-flops converted to nonscan, as commensurate benefits in area cost, test time, test data volume, and test power dissipation will be reaped. The single flip-flop conversion conditions can be used to identify all the candidate flip-flops that can potentially be converted, while the pair conversion conditions introduce a notion of compatibility between two flip-flops. This compatibility notion can be extended to a group of flip-flops as follows.

C. Group Conversion

A group of flip-flops can all be converted to nonscan if the following conditions hold.

1) G1 = P1 = S1 and S2: Single flip-flop conversion conditions are met for each flip-flop in the group.

2) G2: For each test cube TC[i] that specifies some of the bits in the group, the justification cubes corresponding to the specified flip-flops should all be nonconflicting, and thus mergeable.

3) G3: The justification cube for none of these flip-flops should specify any other flip-flop in the group.

² Two cubes can be merged together if and only if the two cubes never have complementary values in the same bit position.

Fig. 3. Test cube analysis mapped to the independent set problem.

It can be observed that group conversion is a direct extension of pair conversion. If pair conversion conditions are met for every pair of flip-flops within a group, then the group conversion conditions automatically hold. The underlying reason is the natural extension of pairwise to group compatibility of cube merge operations; for instance, if cubes c1 and c2, c1 and c3, and c2 and c3 can merge, then c1, c2, and c3 can all merge together.

The problem of identifying a maximal-sized group of flip-flops that can all be converted to nonscan can thus be mapped to the maximum independent set problem [37]. A conflict graph can be formed, wherein the nodes correspond to the flip-flops that satisfy the single flip-flop conversion conditions. An edge denoting a conflict is inserted between two nodes that fail the pair conversion conditions. A maximal-sized group of independent nodes³ represents all pairwise compatible flip-flops, namely, a group of flip-flops that can all be converted to nonscan. Since the independent set problem is known to be NP-complete, efficient heuristics can be utilized to identify near-optimal solutions.
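One simple heuristic of this kind is a minimum-degree greedy selection on the conflict graph; this is an illustrative choice under our own naming, not necessarily the heuristic used by the author:

```python
def greedy_independent_set(nodes, edges):
    """Greedy minimum-degree heuristic for a (near-)maximal independent set."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining, chosen = set(nodes), []
    while remaining:
        # Pick the remaining node with the fewest remaining conflicts
        n = min(sorted(remaining), key=lambda x: len(adj[x] & remaining))
        chosen.append(n)
        remaining -= adj[n] | {n}   # drop the chosen node and all its neighbors
    return chosen
```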

³ An independent group of nodes denotes a group of nodes with no edge connecting any node to any other node in the group.

The test cube analysis to create the conflict graph, on which the maximum independent set algorithm is executed, is illustrated on an example with 18 test cubes and seven flip-flops in Fig. 3. Out of the seven flip-flops, only two of them, b and e, cannot meet the single flip-flop conversion conditions; JC0[b] = 01xx1x0 requires b to be specified, and JC0[e] = 0xx1xx1 cannot be merged with the fourth test cube x0x10x0, which specifies e to be 0. The conflict graph is thus formed with five nodes corresponding to the remaining flip-flops. In this graph, nodes a and g are conflicting, as JC0[a] = x1xxxx1 specifies g. Also, a and f are conflicting, as the test cube 11x101x specifies both a and f as 1s, and JC1[a] = xxx1xx0 and JC1[f] = x1xx0x1 cannot merge. A pair of compatible flip-flops is a and c, as no test cube specifies both of them at the same time (P2 automatically satisfied), and as the justification cube of one flip-flop does not specify the other flip-flop (P3 satisfied); c is also pairwise compatible with f and g. As a result, one possible solution for the maximum independent set is {a, c}; both flip-flops can be converted to nonscan by removing the two scan multiplexers.

The same figure also shows the bits to be loaded into the scan flip-flops b, d, e, f, and g; these new cubes are obtained by merging the original test cubes with the justification cube of the nonscan flip-flop specified by the test cube, and by removing the bits of a and c. Each of the new test cubes requires five shift cycles, as opposed to seven, and a subsequent justify pulse received only by a and c to load the required values into these nonscan flip-flops. During shift cycles, only five flip-flops (and their clock lines) potentially toggle, while the other two flip-flops preserve their values throughout the shift cycles, as they are not clocked during this period of time.
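This postprocessing step, which produces the shortened load cubes, can be sketched as follows (again assuming string cubes; `nonscan` holds the converted bit positions, and the function names are ours):

```python
def can_merge(c1, c2):
    # Cubes merge iff no position holds complementary 0/1 values
    return all(a == 'x' or b == 'x' or a == b for a, b in zip(c1, c2))

def merge(c1, c2):
    # Combine two mergeable cubes, keeping the specified bit at each position
    return ''.join(b if a == 'x' else a for a, b in zip(c1, c2))

def load_patterns(test_cubes, nonscan, jc):
    """Merge each test cube with the justification cubes of the nonscan
    flip-flops it specifies, then drop the nonscan bit positions."""
    patterns = []
    for tc in test_cubes:
        merged = tc
        for f in nonscan:
            if tc[f] != 'x':                      # care bit on a nonscan flip-flop
                assert can_merge(merged, jc[f][tc[f]])
                merged = merge(merged, jc[f][tc[f]])
        patterns.append(''.join(b for i, b in enumerate(merged) if i not in nonscan))
    return patterns
```

For the single-conversion example, the cube 0x010 with e (position 4) nonscan and JC0[e] = 01xxx yields the four-bit load pattern 0101.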

IV. Techniques for Conversion Ratio Improvement

In this section, we present a suite of techniques, all applied post-ATPG, geared toward improving the proposed basic framework in terms of the number of flip-flops converted to nonscan. The basic framework presented in the previous section operates on a given set of test cubes and justification cubes. Based on the observation that a fault can be detected in multiple different ways, the proposed scheme can be enhanced by exploiting such a degree of freedom. In particular, a judicious selection of the best possible test cube for every fault and the best possible pair of justification cubes for every flip-flop may potentially yield a larger number of scan to nonscan conversions. This section elaborates on these enhancements.

A. Test Data Preprocessing Techniques

Test cube and justification cube selections can be judiciously performed prior to the application of the proposed test cube-driven partial scan scheme in order to obtain a conflict graph that yields a larger independent set, and thus a larger number of scan to nonscan conversions. A computationally efficient approach to obtain such a conflict graph relies on the following basic observation: a conflict graph with a larger number of nodes and/or a smaller number of edges is expected to lead to a larger independent set. Upon the application of these preprocessing techniques, a test cube set and a justification cube set are obtained and fed to the proposed graph-based partial scan framework that identifies the flip-flops that can be converted to nonscan.

1) Test Cube Selection: Assuming that multiple test cubes are available to detect each fault, the selection of exactly one test cube per fault can be performed judiciously to yield a conflict graph with a larger independent set. This section elaborates on such a selection mechanism.

Each test cube may potentially prevent a number of flip-flop conversions. This can be in two different forms: 1) a test cube may prevent a single flip-flop conversion if it fails to merge with the justification cube of this flip-flop, as per S1; or 2) a test cube may prevent a pairwise conversion if it has care bits in two positions corresponding to two flip-flops with incompatible justification cubes, resulting in a conflict edge in the graph between the corresponding two nodes, as per P2.

In an effort to obtain a conflict graph with a larger number of nodes and a smaller number of edges, we define a cost function to drive the selection of test cubes. Every candidate test cube is assigned a cost that is a function of two factors: 1) the number of single conversions it prevents (the number of nodes that are excluded from the graph because of this candidate test cube), and 2) the number of pairwise conversions it prevents (the number of edges added to the graph because of this candidate test cube). A weighted average of these two factors defines the cost assigned to each test cube; for every fault, the test cube with the smallest cost is selected.

Referring back to the example in Fig. 3, the test cube “01x101x” prevents the single conversion of flip-flop b, as this cube is in conflict with the justify-1 cube of b, and prevents pairwise conversions for (a, d) and (d, f). The two numbers that define the cost of selecting this particular test cube are 1 and 2, respectively.

The cost function above may end up overpenalizing certain candidate test cubes for single/pairwise conversions that are repeatedly prevented by a multiplicity of test cubes. Thus, the cost function can be further refined as the weighted average of the unique single/pairwise conversions prevented by a test cube. For the example above for test cube “01x101x,” the refined numbers become 0 and 0, as the conversion of flip-flop b is disabled by its own justification cube that specifies b, and as there are also other test cubes that prevent the pairwise conversions for (a, d) and (d, f). By preprocessing a given set of test cubes with multiple options for each fault, a conflict graph with a larger independent set can be obtained, enhancing the scan to nonscan conversion ratio.
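A sketch of this refined cost function follows (hypothetical names; a weighted sum is used, which ranks candidates identically to a weighted average for fixed weights):

```python
def cube_cost(unique_singles, unique_pairs, w_single=1.0, w_pair=1.0):
    # unique_singles: single conversions prevented ONLY by this cube;
    # unique_pairs: conflict edges introduced ONLY by this cube.
    # Conversions prevented by several cubes are excluded upstream.
    return w_single * len(unique_singles) + w_pair * len(unique_pairs)

def select_test_cube(candidates):
    # candidates: {cube: (unique_singles, unique_pairs)} for one fault;
    # pick the cube with the smallest cost.
    return min(candidates, key=lambda c: cube_cost(*candidates[c]))
```

In the running example, the refined factor sets of cube “01x101x” are empty, so it carries a cost of 0 and would be preferred over a candidate that uniquely blocks conversions.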

2) Justification Cube Selection: Analogously, multiple justification cubes may be available to justify a flip-flop to a binary value; the selection of exactly one justification cube per flip-flop per binary value can be performed judiciously to yield a conflict graph with a larger independent set. This section elaborates on such a selection mechanism.

Similar to test cubes, a justification cube may potentially prevent flip-flops from being converted to nonscan. A justification cube for a flip-flop may prevent the conversion of the flip-flop itself if the justification cube fails to merge with a test cube that requires this flip-flop to be at a care bit value as per S1, or if the justification cube specifies the flip-flop itself to a value as per S2, creating a circular dependency. For every flip-flop, the two justification cubes (one for justify-to-0 and one for justify-to-1) that pass these two conversion conditions should be selected, if possible. Otherwise, the flip-flop can be deemed unconvertible.

A justification cube for a flip-flop may also prevent the pairwise conversion of this flip-flop with another one. Even if the justification cube for a flip-flop is mergeable with all the test cubes that specify the flip-flop (allowing single conversion), its merge with a test cube may introduce a conflict with the justification cube of another flip-flop as per P2, preventing pairwise conversion of the two flip-flops simultaneously. This happens when the justification cubes of the two flip-flops are incompatible and a test cube specifies both flip-flops. If multiple candidates exist for a justification cube, a cost function-driven selection mechanism may reduce the likelihood of pairwise conflicts, enhancing the conversion ratio.

A candidate justification cube can be assessed based on its “similarity” to the test cubes that specify its corresponding flip-flop; the selection of justification cubes that are more similar to the test cubes creates, indirectly, a similarity among the justification cubes selected for different flip-flops, possibly reducing pairwise conflicts. For a justification cube that is compatible with all the test cubes that specify its corresponding flip-flop, we can keep track of the number of do not care bits that turn into care bits upon its merge into the test cubes. A smaller number indicates a higher similarity of the candidate justification cube to the test cubes that specify the flip-flop. A smaller number (higher similarity) also indicates a reduced likelihood of pairwise conflicts with a justification cube of another flip-flop, as this latter justification cube, which is supposed to merge with the test cubes to pass single conversion, will still merge with them upon the merge of the test cubes with the former “similar” justification cube. For every flip-flop, the pair of justification cubes with the least cost is selected.
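A sketch of this similarity-based cost (hypothetical helper names; the cubes below are illustrative, not those of Fig. 3; a return value of None flags an incompatible candidate):

```python
def merge_fill_count(jcube, tcube):
    # Count the don't-care bits of the test cube that turn into care bits
    # when the justification cube is merged in; None if the cubes conflict.
    filled = 0
    for j, t in zip(jcube, tcube):
        if j != 'x' and t != 'x' and j != t:
            return None  # opposite care bits: the cubes cannot merge
        if t == 'x' and j != 'x':
            filled += 1
    return filled

def similarity_cost(jcube, tcubes):
    # Total fill count over all test cubes that specify the flip-flop;
    # a smaller total means a more "similar" (cheaper) candidate.
    fills = [merge_fill_count(jcube, t) for t in tcubes]
    return None if None in fills else sum(fills)
```

Mirroring the Fig. 3 discussion, three individual merges that fill 2, 0, and 1 don't-care bits yield a total cost of 3.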

Referring back to the example in Fig. 3, in order to enable the conversion of b, the justify-0 cube for b needs to be replaced by another cube that does not specify b. To assess the justify-1 cube for a, we first note that there are three test cubes that specify a to 1. The merge of the justify-1 cube of a with these three cubes individually results in 2, 0, and 1 do not care bits turning into care bits in these test cubes, resulting in a similarity-based cost factor of 3.

3) Iterative Application of Test Cube and Justification Cube Selection: The test cube selection procedure described above assumes a given set of preselected justification cubes, as the candidate test cubes are assessed by referring to the set of justification cubes. Similarly, the justification cube selection procedure assumes a given set of preselected test cubes. In this section, we elaborate on how this circular dependency is circumvented through an iterative application of the two selection mechanisms.

In the proposed framework, initially, test cube selection is performed; however, as the justification cube selection has not been completed yet, and thus, no basis exists for assessing the test cubes through the cost function above, this initial test cube selection is driven by the number of care bits in the test cubes. Only during the first iteration, the test cube with the fewest care bits is selected for every fault. With fewer care bits, the expectation is that the test cubes will not only be compatible with more justification cubes, but also necessitate fewer pairwise conversion checks, and thus result in fewer conversion fails. An alternative implementation with no obvious advantage or disadvantage can initially perform justification cube selection rather than test cube selection.

Fig. 4. Iterative application of preprocessing techniques.

Once the initial test cube set selection is completed, justification cube selection can proceed with the resulting test cube set. Upon the selection of the justification cubes based on the cost function-driven procedure described earlier, test cube selection can be repeated in order to revise the test cube set, rendering a better match with the selected justification cube set. This iterative application can continue with justification and test cube selections until no further improvement is observed. After each test/justification cube selection procedure application, the conflict graph can be created to count the number of nodes and edges in the graph; alternatively, the independent set identification procedure can also be executed to compute exactly the number of conversions at the expense of increased computational runtime. Iterative application of the selection procedures can be terminated when the conflict graph is no longer improved in terms of the number of nodes/edges or the size of the independent set of nodes. The iterative application is depicted as a flowchart in Fig. 4. Conversion from scan to nonscan can be subsequently effected.
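The first-iteration seeding and the improvement-driven loop can be sketched as follows (hypothetical names; the two selection procedures and the conflict-graph metric are passed in as callables):

```python
def care_bits(cube):
    # Number of specified (non-x) bits in a cube.
    return sum(b != 'x' for b in cube)

def initial_test_cubes(options_per_fault):
    # First iteration only: pick the cube with the fewest care bits per fault.
    return [min(opts, key=care_bits) for opts in options_per_fault]

def iterate_selection(select_jcubes, select_tcubes, tcubes, graph_metric,
                      max_iters=10):
    # Alternate justification/test cube selection until the graph metric
    # (e.g., nodes minus edges, or independent set size) stops improving.
    best = None
    jcubes = select_jcubes(tcubes)
    for _ in range(max_iters):
        metric = graph_metric(tcubes, jcubes)
        if best is not None and metric <= best:
            break
        best = metric
        tcubes = select_tcubes(jcubes)
        jcubes = select_jcubes(tcubes)
    return tcubes, jcubes
```

The loop terminates as soon as an iteration fails to improve the metric, matching the stopping condition described above.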

B. Test Data Postprocessing Techniques

While the iterative application of the test cube and justification cube set preprocessing procedures may yield a conflict graph with a larger independent set, a technique that focuses directly on the resultant independent set and aims at expanding it further could potentially deliver further savings. In this section, we present a pair of test/justification cube set postprocessing techniques executed by referring to the conflict graph whose independent set has been identified.

Page 7: Scan to Nonscan Conversion via Test Cube Analysis

SINANOGLU: SCAN TO NONSCAN CONVERSION 295

1) Independent Set Expansion Via Test Cube and/or Justification Cube Replacements: The first postprocessing technique we present aims at expanding the independent set of the conflict graph by including additional nodes; these are the nodes that have been originally excluded from the independent set due to a conflict edge between them and another node in the independent set, or due to a failure in their single conversion. By replacing the test/justification cube that has caused their exclusion with another test/justification cube, these nodes can be included in the independent set, enhancing the conversion ratio.

In order to apply this postprocessing technique, the nodes that failed the single conversion should also be included in the conflict graph, but be marked in order to differentiate them from other nodes that passed single conversion. Nodes whose only justification cubes specify themselves are definitely not convertible and thus can be excluded from the graph. Other nodes that failed single conversion due to a conflict between a test cube and a justification cube should be marked and preserved in the graph, as they may potentially pass single conversion upon test/justification cube replacement. Pair conversion checks should also be applied between such nodes and the other nodes in the graph, and identified conflict edges should be added as well.

The postprocessing technique starts from the independent set and identifies a target node that could potentially be included in the independent set. This is the node with the fewest conflicts with the nodes that belong to the independent set. The technique aims at resolving all these conflicts, one at a time, through test cube and/or justification cube replacements, as long as none of these replacements introduces any conflicts between the nodes in the independent set or any single conversion failure for the nodes in the independent set. Upon the eradication of all the conflicts, the target node can be included in the independent set, expanding the set size by one. In the process, new conflict edges or single conversion failures may be introduced outside of the independent set, necessitating that the entire graph be updated for future iterations. The technique subsequently identifies the next target node and repeats the same steps. This iterative application continues until a certain number of target nodes (defined by a threshold) fail to be included in the independent set.
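The target-selection step of this expansion loop can be sketched as (hypothetical names; edges holds the conflict edges as pairs):

```python
def conflicts_with_set(node, independent_set, edges):
    # Number of conflict edges between a candidate node and the current set.
    return sum((node, m) in edges or (m, node) in edges
               for m in independent_set)

def pick_target(candidates, independent_set, edges):
    # Target = excluded node with the fewest conflicts to resolve; its
    # conflicts are then attacked one at a time via cube replacements.
    return min(candidates,
               key=lambda n: conflicts_with_set(n, independent_set, edges))
```

With independent set {a, c} and conflict edges (a, g), (a, f), (c, f), node g (one conflict) would be targeted before f (two conflicts).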

In the example in Fig. 3, we note that node g is in conflict with a but not with c. Therefore, node g can be included in the independent set only if the conflict between a and g can be resolved through test/justification cube replacement. In this particular case, the conflict is due to the justification cubes of a and g, as they specify each other. If alternative justification cubes can be identified for both a and g, then g may be added to the independent set given that: 1) single conversion rules still hold for a and g; 2) the justify-0 cube of a is compatible with the justify-1 cube of g, and vice versa, as a test cube specifies a and g to 0 and 1 and another test cube specifies them to 1 and 0; 3) no conflict is introduced between a and c, which can happen only if one of the new justification cubes of a specifies c, as no test cube specifies a and c simultaneously; and 4) no conflict is introduced between c and g, which can happen if one of the new justification cubes of g specifies c or if the justify-0 cube of c is in conflict with one of the new justification cubes of g.

2) Independent Set Expansion Via Test Cube Elimination: So far we have strictly adhered to the constraint that all test cubes can still be applied even after the conversions, perfectly preserving the fault coverage. This constraint can be relaxed to tolerate only a minor coverage loss, but in return eliminate many conflicts in the graph, thereby increasing the number of scan to nonscan conversions.

The second postprocessing technique we present is based on the elimination of a few (defined by a threshold) test cubes to include as many additional nodes as possible in the independent set. Upon the identification of the target nodes that are in conflict with at least one of the nodes in the independent set, we identify the number of test cubes that need to be eliminated for each one of these target nodes. The node that requires the elimination of the smallest number of test cubes is added to the independent set at the expense of the faults that are uniquely detected by the eliminated test cubes.

It should be noted that the target nodes are those that are in conflict with the independent set nodes only due to conflicting justification cubes; the test cube(s) that specify the target node and an independent set node create this conflict, and can be eliminated to resolve the conflict. If the justification cube of an independent set node specifies a flip-flop to a value, the node corresponding to this flip-flop cannot be added to the independent set, no matter which test cubes are eliminated.
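A sketch of the selection among target nodes (hypothetical names, including the extra node "h"; blocking_cubes maps each target node to the set of test cubes whose elimination would clear its conflicts):

```python
def cheapest_inclusion(blocking_cubes, budget):
    # Pick the target node that requires eliminating the fewest test cubes;
    # return it with the cubes to drop, or None if even the cheapest node
    # exceeds the elimination budget (threshold).
    node = min(blocking_cubes, key=lambda n: len(blocking_cubes[n]))
    cubes = blocking_cubes[node]
    return (node, cubes) if len(cubes) <= budget else None
```

In the Fig. 3 example discussed next, including f requires dropping the two cubes “01x100x” and “11x101x”.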

In the example in Fig. 3, node f is in conflict with node a as two of the five test cubes specify these two flip-flops to values such that the corresponding justification cubes are conflicting; “01x100x” specifies a and f both to 0s (the justify-0 cube of a conflicts with the justify-0 cube of f) and “11x101x” specifies a and f both to 1s (the justify-1 cube of a conflicts with the justify-1 cube of f). These two test cubes can be eliminated from the test set to enable the inclusion of f in the independent set along with a and c at the expense of some coverage loss.

It is also important to note that while coverage loss upon test cube elimination is likely, it is not certain; it is also possible that the faults detected by the eliminated test cubes are also detectable by other test cubes when their do not care bits are specified serendipitously. As it is difficult to foretell the resulting do not care filling of the test cubes, however, the threshold that dictates the maximum number of test cubes that can be eliminated can only be set pessimistically.

V. Application Flow

In this section, we present a flowchart for the application of the proposed partial scan framework that also includes the preprocessing and postprocessing steps. This flowchart not only helps design for test engineers implement the proposed framework, but also emphasizes the key differences of the proposed partial scan framework from conventional partial scan schemes.

Fig. 5. Application flow for the proposed scheme.

The flowchart for the application of the proposed partial scan technique is provided in Fig. 5. The combinational ATPG engine that operates on the netlist produces the set of test cubes that is fed to the proposed partial scan framework. These test cubes are first preprocessed via test/justification cube selection mechanisms, and subsequently, the conflict graph is created. Application of the independent set identification procedure on this graph pinpoints the initial set of flip-flops that can be converted to nonscan, which is then expanded via the postprocessing techniques, leading to the final set of nonscan flip-flops. With this information, the remaining flip-flops are stitched into one or more scan chains while at the same time test cube merging operations are performed to compute the final test data to be loaded into the scan flip-flops. The scan-inserted netlist and the final test data are the outputs of this process.

As the flow illustrates, the conventional ATPG process needs to be executed only once in the proposed scheme; both the identification of the nonscan flip-flops and the computation of the final test data are completed through test cube operations. This feature differentiates the proposed partial scan framework from the previously proposed partial scan schemes, which compute the final test data by running sequential ATPG or combinational ATPG with time frame expansion on the partially scanned design upon the identification of the nonscan flip-flops.

In the flow depicted in Fig. 5, the netlist on which the combinational ATPG is executed is an alpha-stitched netlist where the scan cells are arbitrarily (alphabetically) ordered. For stuck-at and launch-off-capture tests, the test cubes generated from such a netlist will still be valid for the final scan-stitched netlist, where the scan cells are reordered based on their physical position; a simple transformation in the form of reordering the stimulus and response bits can create the test data for the final scan-stitched netlist. The underlying reason is the decoupling of the shift and capture operations in the stuck-at and launch-off-capture type of tests. On the other hand, reordering of the scan cells invalidates the test data for launch-off-shift testing, as such a decoupling does not exist in this case. As the launch operation depends on the particular ordering of scan cells, the proposed scheme has to be radically changed for launch-off-shift testing, as stated earlier.
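The bit-reordering transformation can be sketched as (hypothetical names; the same mapping applies to the response bits):

```python
def reorder_stimulus(pattern, alpha_order, physical_order):
    # Map a pattern generated for the alphabetically ordered scan chain
    # onto the physically reordered chain (valid for stuck-at and
    # launch-off-capture tests, where shift and capture are decoupled).
    pos = {cell: i for i, cell in enumerate(alpha_order)}
    return ''.join(pattern[pos[cell]] for cell in physical_order)
```

For example, pattern "01x" for cells ordered a, b, c becomes "x01" when the physical chain order is c, a, b.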

Scan stitching in the conventional or in the proposed flow adds multiplexers on the scan paths only. As a result, new faults are introduced into the netlist. Some of these newly added faults, namely, those that fall on the functional path (the functional data input of the scan multiplexer), are equivalent to the faults on the D-input of the flip-flop connected to the scan multiplexer. All the other newly added faults reside on the scan path, and are tested by the conventional scan flush tests. Consequently, no additional test generation effort needs to be expended for these newly added faults post-stitching. The original set of test cubes remains valid after scan stitching.

Test data (stimulus) compression necessitates an encodability check on the data to be delivered into the scan cells. Once the test cubes are manipulated via merge and bit-removal operations, as dictated by the proposed scheme, they are subject to encodability checks. The ones that are encodable can be delivered into the scan cells through the decompressor, while the remaining ones need to be applied in the serial mode, where the decompressor is bypassed. The final set of test cubes can be expected to have a higher care bit density compared to the original set. While the bits corresponding to the nonscan cells are dropped, the merge with the justification cubes increases the care bit density, rendering a subsequent compression less effective compared to the compression applied on the original test cubes. It should be noted, however, that the proposed scan to nonscan conversion is a form of test data compression itself; with the subsequent compression, further reductions in test data volume can be attained.

Last-minute changes in the netlist are reflected as a change in the set of test cubes. In the proposed methodology, this may potentially lead to a different set of nonscan cells. Thus, a last-minute change in the netlist necessitates the re-execution of the proposed flow, and thus, the resynthesis of the scan chain. However, the typical scenario is that such last-minute changes are localized and minor; the resulting changes in the scan chain can also be expected to be localized, necessitating a minor effort to implement the scan chain resynthesis. Additional information regarding which test cube results in the exclusion of which nodes and the inclusion of which edges can be back annotated into the conflict graph, significantly expediting the reidentification of the new set of nonscan cells.

Similar to the last-minute changes in the functional part of the design, targeting a new fault model also necessitates the re-execution of the test generation tool to produce the new set of test cubes and the re-execution of the proposed scan synthesis flow. The typical (and more appropriate) approach, however, is to determine the target fault model early on based on the reliability requirements of the product and the existing electronic design automation tool capabilities, which takes place much earlier than the scan stitching phase. Late changes in fault models are not common in a well-defined design and test flow.

TABLE I

Results of the Basic Test Cube Analysis Framework

                                                          Reductions Attained (%)
Circuit   Test Cubes  Flip-Flops  Single  Multiple  CR (%)  Test Time  Test Data Vol.  Avg. Test Power  Runtime (s)
s713          46          19         6        6      31.6     25.0        31.6             34.2            <1
s953         101          29        23       23      79.3     73.3        79.3             71.0            <1
s1423         80          74         2        2       2.7      1.3         2.7              4.0            <1
s3271         65         116         6        3       2.6      1.7         2.6              3.4            <1
s3330         87         132        93       52      39.4     38.3        39.4             39.7            <1
s3384        153         183       111       46      25.1     24.5        25.1             25.5            <1
s4863         96         104       102       48      46.2     44.8        46.2             46.3            <1
s5378        170         179       130       72      40.2     39.4        40.2             40.4            <1
s6669        147         239       193       86      36.0     35.4        36.0             36.2            <1
s9234        238         228        22       17       7.5      7.0         7.5              7.9            <1
s13207       274         669       283      202      30.2     30.0        30.2             30.3             5
s15850       185         597        72       50       8.4      8.2         8.4              8.5             2
s35932       825        1728        42       41       2.4      2.3         2.4              2.4             7
s38417       952        1636       514      312      19.1     19.0        19.1             19.1           215
s38584       793        1452        65       43       3.0      2.3         3.0              3.0            13
b20          324         490       352      111      22.7     22.4        22.7             22.8           203
b21          341         490       359      160      32.7     32.4        32.7             32.8           208
b22          491         735       461      202      27.5     27.3        27.5             27.6           579
b17          674        1414       905      421      29.8     29.7        29.8             30.1           397
b18          983        3270      1412      874      26.7     26.7        26.7             26.7           944
b19         1452        6642      3741     2143      32.3     32.2        32.3             32.4          2312

The proposed scheme avoids test generation and fault simulation on the final netlist under the assumption that the combinational logic remains identical through the scan stitching process. This assumption remains valid as long as all logic optimizations are performed prior to the application of the proposed scheme, after which the combinational logic is fully retained. This can be ensured by applying the proposed analysis as late as possible in the flow (after physical placement). If, however, the combinational logic is subject to further logic optimizations or changes subsequent to the proposed test cube analysis, another round of ATPG and fault simulation should be executed on the final netlist, as the original test data and the fault list (possibly a new set of stuck-at faults, critical paths, and bridging pairs) are no longer valid.

The observability of nonscan cells through a compactor/MISR in the proposed scheme, if overlooked, may hamper the debug/diagnostics of the design, which full scan normally delivers. One approach to restore debug/diagnostics capabilities is the reconfiguration of the MISR into a shadow register in the debug mode, allowing the perfect observation of one subset of nonscan cells at a time. While this may require the repeated application of the same diagnostic pattern multiple times, the entire fail data can eventually be collected and the diagnostic resolution can be perfectly restored.

VI. Experimental Results

We have implemented the proposed test cube analysis tool and applied it on a variety (ISCAS89 and ITC99) of academic benchmark circuits. In this section, we present the results, which mainly consist of the number of flip-flops that can be converted to nonscan without losing any fault coverage, in addition to all the test cost reductions. We have executed our tool with the test cubes of stuck-at faults, while we note that this analysis can be applied with any underlying fault model.

Table I provides the results of the basic framework for the proposed partial scan scheme. The first three columns provide the name of the benchmark circuit, the number of test cubes, and the number of flip-flops, while column 4 presents the number of flip-flops that satisfy the single flip-flop conversion conditions and can thus be converted to nonscan individually; this number denotes the number of nodes in the conflict graph of the proposed test cube analysis. Column 5 presents the number of flip-flops that can be converted to nonscan altogether, while column 6 provides the same number in percentage (denoted as conversion ratio or CR) with respect to the number of flip-flops. Columns 7–9 denote the reductions, in percentage, in test time, test data volume, and average test power, all with respect to conventional full scan. Finally, column 10 provides the runtime of the test cube analysis. The number given in column 5 denotes the size of the maximal-sized independent set in the conflict graph. For s5378, for instance, the proposed test cube analysis shows that 130 out of 179 flip-flops satisfy the single flip-flop conversion conditions, and can be converted to nonscan; 72 of these 130 flip-flops can be simultaneously converted to nonscan, as this group of 72 flip-flops (40.2%) satisfies the group conversion conditions. Conversion of 72 flip-flops to nonscan in turn leads to 39.4% reduction in test time, 40.2% reduction in test data volume, and 40.4% reduction in test power with respect to full scan.

The results show that the proposed test cube analysis approach is capable of converting 30%–40% of flip-flops to nonscan for seven circuits, while the conversion percentage is poor (2%–3%) in four circuits, of which two are small and two are among the largest; no direct conclusion can thus be drawn regarding effectiveness versus size. For one circuit, 23 out of 29 flip-flops are converted, resulting in an almost 80% conversion ratio. For the remaining six circuits, the proposed tool attains around 8% conversion for two of the circuits, and 19%–28% for the other four. The distribution of the care bits in test and justification cubes, which is an end result of the cone structure of the design, determines the conflict graph and, thus, the effectiveness of the proposed technique. A larger number of test cubes does not necessarily translate into a larger number of conflicts in the graph, if the care bits are typically clustered in the same positions; as the results show, there is no direct correlation between the size of the circuit (or the number of test cubes) and the effectiveness of the proposed technique.

TABLE II

Results Enhanced Through Preprocessing and Postprocessing Techniques

                                                          Reductions Attained (%)
Circuit   Test Cubes  Flip-Flops  Single  Multiple  CR (%)  Test Time  Test Data Vol.  Avg. Test Power  Runtime (s)
s713          46          19        10        9      47.4     40.0        47.4             47.8            <1
s953         101          29        23       23      79.3     73.3        79.3             71.0            <1
s1423         80          74         2        2       2.7      1.3         2.7              4.0            <1
s3271         65         116        20        9       7.8      6.8         7.8              8.5            <1
s3330         87         132       101       62      47.0     45.9        47.0             47.1            <1
s3384        153         183       122       51      27.9     27.2        27.9             28.2            <1
s4863         96         104       109       53      51.0     49.5        51.0             50.9            <1
s5378        170         179       147       85      47.5     46.7        47.5             47.5            <1
s6669        147         239       201       92      38.5     37.9        38.5             38.6            <1
s9234        238         228        22       17       7.5      7.0         7.5              7.9            <1
s13207       274         669       296      210      31.4     31.2        31.4             31.5             7
s15850       185         597        75       55       9.2      9.0         9.2              9.4             3
s35932       825        1728       103       87       5.0      5.0         5.0              5.1             9
s38417       952        1636       552      371      22.7     22.6        22.7             22.7           229
s38584       793        1452        69       48       3.3      3.2         3.3              3.4            15
b20          324         490       375      124      25.3     25.1        25.3             25.4           210
b21          341         490       377      173      35.3     35.0        35.3             35.4           221
b22          491         735       469      238      32.4     32.2        32.4             32.5           600
b17          674        1414       923      452      32.0     31.9        32.0             32.2           422
b18          983        3270      1441      880      26.9     26.9        26.9             27.0           961
b19         1452        6642      3760     2184      32.9     32.9        32.9             32.9          2379

The reductions attained in test data volume equal the percentage scan to nonscan conversion ratio, provided in column 6, as each test pattern is compressed (shorter scan sequence) by an amount that equals the conversion ratio. Slightly lower reductions are attained for test time, as every pattern requires an additional justify cycle in the proposed scheme; nevertheless, the test time reductions are typically very close to the conversion ratio. The reductions attained in average test power, which is computed by counting the transitions both in the scan chains and in the combinational logic, are close to the conversion ratio and almost always exceed it. The underlying reason for this slight positive deviation is the power-efficient justify cycle, where only a small portion of the nonscan flip-flops toggle. It is difficult, however, to quantify the exact area cost savings, as the cost of the observation points depends on the scan configuration (number of POs, chains, and the compactor/MISR, if any); we expect the savings due to the scan multiplexers removed by the proposed scheme to outweigh the cost of observation points, leading to some overall area savings.

Table II provides in an analogous format the results of the proposed framework further enhanced through the preprocessing and postprocessing techniques presented in Section IV. It can be seen by comparing Tables I and II that the results of the basic framework are improved for all but three of the benchmark circuits, leading to commensurate enhancements in test time, data volume, and test power as well; for b22, for instance, the preprocessing techniques help increase the number of nodes in the conflict graph from 461 to 469, while the preprocessing and postprocessing techniques together expand the independent set size from 202 to 238, enhancing the conversion ratio from 27.5% up to 32.4%. The runtime is only slightly increased in all cases, fully justifying the additional savings delivered.

TABLE III

Comparison of Compression Results: Original Cubes Versus Cubes Processed by the Proposed Technique

           Compressing Original Cubes    Compressing Cubes After Merging
Circuit    xor Based      Run Length     xor Based      Run Length
s35932       9.6x            4.6x          9.5x            4.6x
s38417       8.3x            4.1x          7.9x            3.8x
s38584       8.6x            5.0x          8.5x            4.9x
b17          7.1x            3.4x          6.5x            3.0x
b18          8.9x            4.7x          8.0x            4.2x
b19          8.7x            4.7x          8.1x            4.3x

Table III provides the test compression-level comparisons. Two different compression schemes (xor-based [38] and run-length coding [39]) have been applied on the original test cubes and on the test cubes processed by the proposed partial scan technique. Compression results are provided in the table with respect to the size of the original test cubes. For the combinational xor-based compression, a two-phase test application is assumed; encodable patterns are applied in the first phase from five channels through the xor decompressor into 50 chains, and the remaining patterns are applied into five chains in the second serial top-up phase. The results show that the compression levels are slightly deteriorated when the proposed technique is utilized in conjunction with a compression technique with respect to utilizing compression alone; the underlying reason is that the cube merge operations (with the justification cubes) increase the care bit density, which slightly degrades compression levels.

TABLE IV

Conversion Ratio Enhancement Through Test Cube Compromise

                      Conversion Ratio % (Fault Coverage Loss %)
Circuit    100% TCs     99.9% TCs     99.8% TCs     99.5% TCs     99% TCs      98% TCs
s35932     5.0 (0.0)    5.3 (<0.1)    5.5 (<0.1)    6.2 (0.1)     8.5 (0.3)   11.3 (0.8)
s38417    22.7 (0.0)   22.9 (<0.1)   23.0 (<0.1)   24.5 (0.1)    27.0 (0.4)   30.6 (1.2)
s38584     3.3 (0.0)    3.8 (<0.1)    4.0 (<0.1)    4.4 (0.1)     7.2 (0.2)   10.7 (0.7)
b17       32.0 (0.0)   32.4 (<0.1)   32.4 (<0.1)   33.0 (0.1)    34.4 (0.3)   40.1 (0.9)
b18       26.9 (0.0)   27.5 (<0.1)   27.9 (<0.1)   29.2 (0.2)    30.3 (0.2)   33.6 (0.8)
b19       32.9 (0.0)   33.0 (<0.1)   33.2 (<0.1)   33.7 (0.1)    34.2 (0.3)   36.1 (0.9)

Finally, we present the results of the test cube elimination-based postprocessing technique, which enables a tradeoff between the conversion ratio (and thus, test time, data volume, and power savings) and the test quality. The results are provided in Table IV; for six benchmark circuits, the attained conversion ratios are provided in column 2 for the case where all test cubes (100%) are applied with no quality loss (results from Table II), as well as for the cases where 0.1%, 0.2%, 0.5%, 1%, and 2% of the test cubes are judiciously eliminated from the test set in columns 3–7, respectively. The results also confirm that the conversion ratios are enhanced consistently with the elimination of test cubes that create conflicts in the graph, expanding the size of the independent set at the expense of minor fault coverage loss. Eliminating up to 0.5% of the test cubes can significantly improve the effectiveness of the proposed scheme with very small fault coverage loss.

VII. Conclusion

Partial scan has been abandoned by the industry, as it necessitates sequential ATPG to recover the controllability and observability loss. Full scan has been adopted instead, despite the area, performance, and test costs it incurs. In this paper, we proposed a test cube analysis-driven partial scan scheme. The proposed technique operates only on a set of test cubes generated by a combinational ATPG tool, and identifies a maximum number of flip-flops that can be converted to nonscan while delivering the quality of full scan.

By identifying the conditions to recover the controllability and observability compromised due to partial scan, and by formulating these conditions via test cube operations, we enabled a computationally efficient partial scan scheme that is compatible with the conventional ATPG flow. Upon the identification of the flip-flops that can be converted to nonscan, the test cubes were postprocessed to ensure the delivery of the original set intact into all the flip-flops. This simple postprocessing step eliminated the need for an ATPG rerun.

The theoretical framework underlying the proposed scheme also sheds light on desirable properties in test data that lead to further enhancements in conversion ratio. Building on these observations, we developed a suite of test data-processing schemes, all applied post-ATPG, in the form of test/justification replacement techniques. These techniques bolster the effectiveness of the proposed partial scan framework while preserving all the compatibility benefits of the basic framework.

The proposed partial scan scheme combines the simplicity of the conventional (full scan-based) ATPG flow with the area, performance, test time, and test power reduction benefits of partial scan. The removal of scan multiplexers delivers area as well as performance savings, while the shortening of the scan path translates into test time and data volume reductions. Furthermore, as the nonscan flip-flops are inactive during shift operations, power dissipation in the scan path, in the combinational logic, and in the clock tree is reduced. The proposed partial scan scheme can be applied in conjunction with test compression and test power reduction techniques to drive the test costs down even further.

Acknowledgment

The author would like to thank N. Alawadhi, a graduate student from the University of California, Irvine, for his contributions in the implementation and experimentation during the early phases of this work.

References

[1] N. Alawadhi and O. Sinanoglu, “Revival of partial scan: Test cube analysis driven conversion of flip-flops,” in Proc. VLSI Test Symp., May 2011, pp. 260–265.

[2] P. Girard, “Survey of low-power testing of VLSI circuits,” IEEE Design Test Comput., vol. 19, no. 3, pp. 80–90, May 2002.

[3] J. Rearick, “The case for partial scan,” in Proc. Int. Test Conf., Nov. 1997, p. 1032.

[4] P. Ashar and S. Malik, “Implicit computation of minimum-cost feedback-vertex sets for partial scan and other applications,” in Proc. Design Automat. Conf., Jun. 1994, pp. 77–80.

[5] S. T. Chakradhar, A. Balakrishnan, and V. D. Agrawal, “An exact algorithm for selecting partial scan flip-flops,” in Proc. Design Automat. Conf., Jun. 1994, pp. 81–86.

[6] K.-T. Cheng and V. D. Agrawal, “A partial scan method for sequential circuits with feedback,” IEEE Trans. Comput., vol. 39, no. 4, pp. 544–548, Apr. 1990.

[7] K.-T. Cheng, “Single clock partial scan,” IEEE Design Test Comput., vol. 12, no. 2, pp. 24–31, Feb. 1995.

[8] V. Chickermane and J. H. Patel, “An optimization based approach to the partial scan design problem,” in Proc. Int. Test Conf., Sep. 1990, pp. 377–386.

[9] V. Chickermane and J. H. Patel, “A fault oriented partial scan design approach,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 1991, pp. 400–403.


[10] R. Gupta and M. A. Breuer, “The BALLAST methodology for structured partial scan design,” IEEE Trans. Comput., vol. 39, no. 4, pp. 538–544, Apr. 1990.

[11] A. Kunzmann and H. J. Wunderlich, “An analytical approach to the partial scan design problem,” J. Electron. Test.: Theory Applicat., vol. 1, pp. 163–174, 1990.

[12] D. H. Lee and S. M. Reddy, “On determining scan flip-flops in partial-scan designs,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 1990, pp. 322–325.

[13] J. Park, S. Shin, and S. Park, “A partial scan design by unifying structural analysis and testabilities,” in Proc. Int. Symp. Circuits Syst., vol. 1, 2000, pp. 88–91.

[14] S.-E. Tai and D. Bhattacharya, “A three-stage partial scan design method using the sequential circuit flow graph,” in Proc. Int. Conf. VLSI Design, Jan. 1994, pp. 101–106.

[15] M. Abramovici, J. J. Kulikowski, and R. K. Roy, “The best flip-flops to scan,” in Proc. Int. Test Conf., Oct. 1991, p. 166.

[16] V. Boppana and W. K. Fuchs, “Partial scan design based on state transition modeling,” in Proc. Int. Test Conf., Oct. 1996, pp. 538–547.

[17] P. Kalla and M. Ciesielski, “A comprehensive approach to the partial scan problem using implicit state enumeration,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 21, no. 7, pp. 810–826, Jul. 2002.

[18] K. S. Kim and C. R. Kime, “Partial scan by use of empirical testability,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 1990, pp. 314–317.

[19] P. S. Parikh and M. Abramovici, “Testability-based partial scan analysis,” J. Electron. Test.: Theory Applicat., vol. 7, pp. 47–60, Aug. 1995.

[20] G. S. Saund, M. S. Hsiao, and J. H. Patel, “Partial scan beyond cycle cutting,” in Proc. Int. Symp. Fault-Tolerant Comput., Jun. 1997, pp. 320–328.

[21] E. Trischler, “Incomplete scan path with an automatic test generation methodology,” in Proc. Int. Test Conf., 1980, pp. 153–162.

[22] D. Xiang, S. Venkataraman, W. K. Fuchs, and J. H. Patel, “Partial scan design based on circuit state information,” in Proc. Design Automat. Conf., Jun. 1996, pp. 807–812.

[23] D. Xiang and J. H. Patel, “A global algorithm for the partial scan design problem using circuit state information,” in Proc. Int. Test Conf., Oct. 1996, pp. 548–557.

[24] D. Xiang and J. H. Patel, “Partial scan design based on circuit state information and functional analysis,” IEEE Trans. Comput., vol. 53, no. 3, pp. 276–287, Mar. 2004.

[25] V. D. Agrawal, K.-T. Cheng, D. D. Johnson, and T. S. Lin, “Designing circuits with partial scan,” IEEE Design Test Comput., vol. 5, no. 2, pp. 8–15, Apr. 1988.

[26] M. S. Hsiao, G. S. Saund, E. M. Rudnick, and J. H. Patel, “Partial scan selection based on dynamic reachability and observability information,” in Proc. Int. Conf. VLSI Design, Jan. 1998, pp. 174–180.

[27] H.-C. Liang and C. L. Lee, “An effective methodology for mixed scan and reset design based on test generation and structure of sequential circuits,” in Proc. Asian Test Symp., 1999, pp. 173–178.

[28] X. Lin, I. Pomeranz, and S. M. Reddy, “Full scan fault coverage with partial scan,” in Proc. Design, Automat. Test Eur., 1999, pp. 468–472.

[29] I. Park, D. S. Ha, and G. Sim, “A new method for partial scan design based on propagation and justification requirements of faults,” in Proc. Int. Test Conf., Oct. 1995, pp. 413–422.

[30] S. Sharma and M. S. Hsiao, “Combination of structural and state analysis for partial scan,” in Proc. Int. Conf. VLSI Design, 2001, pp. 134–139.

[31] J.-Y. Jou and K.-T. Cheng, “Timing-driven partial scan,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 1991, pp. 404–407.

[32] D. Kagaris and S. Tragoudas, “Retiming-based partial scan,” IEEE Trans. Comput., vol. 45, no. 1, pp. 74–87, Jan. 1996.

[33] O. Khan, M. L. Bushnell, S. K. Devanathan, and V. D. Agrawal, “SPARTAN: A spectral and information theoretic approach to partial scan,” in Proc. Int. Test Conf., 2007, paper 21.1.

[34] N. A. Touba, “Survey of test vector compression techniques,” IEEE Design Test Comput., vol. 23, no. 4, pp. 294–303, Apr. 2006.

[35] Z. You, J. Huang, M. Inoue, J. Kuang, and H. Fujiwara, “A response compactor for extended compatibility scan tree construction,” in Proc. Int. Conf. ASIC, Oct. 2009, pp. 609–612.

[36] N. A. Touba, “X-canceling MISR: An X-tolerant methodology for compacting output responses with unknowns using a MISR,” in Proc. Int. Test Conf., 2007, paper 6.2.

[37] R. E. Tarjan and A. E. Trojanowski, “Finding a maximum independent set,” SIAM J. Comput., vol. 6, no. 3, pp. 537–546, 1977.

[38] I. Bayraktaroglu and A. Orailoglu, “Decompression hardware determination for test volume and time reduction through unified test pattern compaction and compression,” in Proc. VLSI Test Symp., 2003, pp. 113–118.

[39] A. Chandra, K. Chakrabarty, and R. A. Medina, “How effective are compression codes for reducing test data volume?” in Proc. VLSI Test Symp., 2002, pp. 91–96.

Ozgur Sinanoglu received the Ph.D. degree in computer science and engineering from the University of California at San Diego, San Diego, in 2004.

He was a Senior Design-for-Testability Engineer with Qualcomm, San Diego, primarily responsible for developing cost-effective test solutions for low-power systems-on-chip. After a four-year academic experience at Kuwait University, he joined New York University Abu Dhabi, Abu Dhabi, UAE, in 2010. He has published nearly 100 conference and journal papers. He holds three issued and several pending patents. His current research interests include reliability and security of integrated circuits, mostly focusing on design for testability and design for trust.

Dr. Sinanoglu was a recipient of the Best Paper Award at the VLSI Test Symposium in 2011.