SYNOPSYS CONFIDENTIAL

DFT_clk_mux and DFT_clk_chain

Data Sheet Revision 1.0

November 14, 2011

ABSTRACT

DFT Compiler adds DFT_clk_mux and DFT_clk_chain components to the design when insert_dft is run with the set_dft_configuration -clock_controller enable setting. These components are not documented in the DFT Compiler Scan User Guide. This data sheet documents the architecture and operation of these components and provides a checklist for users concerned about the components’ impact on their design.

This document describes the implementation instantiated by the F-2011.09 release. The most recent changes were:

In the D-2010.03-SP2 release, an option was added to use clock gating latches.

In the E-2010.12 release, the hierarchy of the new blocks was flattened during insert_dft.

Note: The PLL controller that is included with DFT Compiler is an example that is not guaranteed to be appropriate for use in your design. If you decide to use this design, you are responsible for validating that this functionality works in the context of your design.

1 System Overview

The DFT_clk_mux and DFT_clk_chain are inserted as two separate modules in the top level of the design, but they always function together as a unit. The DFT_clk_mux is inserted between the OCC (On-Chip Clocking) clock generator, usually a PLL (Phase-Locked Loop), and its clock tree to provide control over the clock for scan shifting and capture. The DFT_clk_chain contains data to control the capture operation of the DFT_clk_mux. These blocks are kept separate because the flip-flops inside DFT_clk_mux must be nonscan to allow them to switch clock sources correctly, but the flip-flops inside DFT_clk_chain must be on the scan chains so that the capture pulses can be controlled by ATPG.

The purpose of these blocks is to allow ATPG to specify capture sequences consisting of a fixed number of pulses from a PLL that may be running asynchronously to the primary inputs controlled by the ATE. The scan shift operation takes place under direct ATE control, and switching between the different clock sources is done without glitches. The fast sequential ATPG engine in TetraMAX specifies capture sequences with a maximum of 10 cycles, so there is no benefit in creating DFT_clk_mux blocks capable of emitting more pulses, although it is legal and the IP block works in this case.

1.1 Schematics

These schematics correspond to the connections made automatically by the insert_dft command for a specification with two PLL clocks and a maximum of two clock pulses per capture cycle. If more clock pulses are selected, the DFT_clk_chain becomes longer, and the counter and decoder become larger. Note that the logic is shown generically, and may appear different after synthesis.

In Figure 1, the DFT_clk_mux is shown as it would be instantiated in the design. Before the insert_dft command is run, the PLL is connected to the Clock Drivers, and the Clock Trees and Scan Flops must already exist. These are not changed by insert_dft (besides adding the Scan Enable and serial scan connections), but the DFT_clk_mux is inserted at the output of the PLL with DFT_clk_chain controlling it.

The circuitry inside DFT_clk_mux is shown in separate figures for clarity. The hierarchy inside it is flattened during insert_dft.

[Schematic not reproduced. Labels in the drawing: CLKA, CLKB, pll_reset, test_mode, pll_bypass, test_se, ATECLK, test_siN, test_soN; PLL; DFT_clk_mux, containing two Fast Pulse Controller blocks and two Clock Selection Circuit blocks, with bus slices [1:0] and [3:2]; DFT_clk_chain, containing four scan flip-flops (DSI, SE, Q) labeled [3][2][1][0]; Clock Drivers; Clock Trees; Scan Flops.]

Figure 1. DFT_clk_mux & DFT_clk_chain in the design. The contents of the dashed boxes are shown in the following figures.

[Schematic not reproduced. Labels in the drawing: pll_reset, fast_clk, slow_clk_enable (from Clock Selection Circuit), clk_enable[0], clk_enable[1], pipeline_or_tree (to Clock Selection Circuit); flip-flops \U_clk_control_i_0/load_n_meta_0_l_reg, \U_clk_control_i_0/load_n_meta_1_l_reg and \U_clk_control_i_0/load_n_meta_2_l_reg; Counter: 0-to-3 (then hold), with rst_n, load_n (load 0) and Q[1:0]; Decoder: 2-to-4, with outputs [3], [2], [1], [0].]

Figure 2. Contents of the “Fast Pulse Controller” block from Figure 1. The instance names of the clock domain crossing synchronization flip-flops are shown as they appear before running the change_names command, and are for the first DFT_clk_mux to be inserted. For subsequent instances, increment the first 0.

[Schematic not reproduced. Labels in the drawing: slow_clk, test_se, pll_reset, test_mode, pll_bypass, fast_clk, pipeline_or_tree, slow_clk_enable, clk; two flip-flops (D, Q).]

Figure 3. Contents of the “Clock Selection Circuit” block from Figure 1, using the default (false) of test_occ_insert_clock_gating_cells. Clock paths are shown in red.

[Schematic not reproduced. Labels in the drawing: slow_clk, test_se, pll_reset, test_mode, pll_bypass, fast_clk, pipeline_or_tree, slow_clk_enable, clk; two flip-flops (D, Q); two latches (D, GN, Q).]

Figure 4. Contents of the “Clock Selection Circuit” block from Figure 1, using set test_occ_insert_clock_gating_cells true. The inner dashed boxes show logic that can be replaced by integrated clock gating cells using the test_icg_p_ref_for_dft variable. Clock paths are shown in red.

2 DFT_clk_mux

2.1 Naming Convention

The module is instantiated under this name:

<string>_DFT_clk_mux_<number>

where

<string> is the current_design during the insert_dft run

<number> is the uniquification number of the controller, starting from 0

2.2 Ports

Port Name          Direction   Function
reset              Input       1 to reset controller, 0 to allow controller to operate
test_mode          Input       1 to control clock, 0 to select fast_clk unconditionally
pll_bypass         Input       1 to select slow_clk, 0 to allow clock switch-over operations
scan_en            Input       Mediates clock switch-over operation
clk_enable[m:0]    Input       Capture pulse control from clock chain
fast_clk[n:0]      Input       Fast clock from PLL
slow_clk           Input       ATE clock
clk[n:0]           Output      Output clock to scan flip-flops

Table 1. DFT_clk_mux I/O ports

The widths of the buses are determined by the options of the set_dft_clk_controller command:

clk and fast_clk are as wide as the number of elements in the -pllclocks list.

clk_enable is as wide as the number of elements in the -pllclocks list times the argument of the -cycles_per_clock option.

When the bus width would be 1, a scalar port of the same name is used instead.
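As an illustration of the width rules above, the following Python sketch computes the DFT_clk_mux port widths from the two option values. The function name mux_port_widths and the clock names passed to it are hypothetical and are not part of DFT Compiler; only the arithmetic is taken from the rules above.

```python
def mux_port_widths(pll_clocks, cycles_per_clock):
    """Return the DFT_clk_mux port widths implied by the width rules.

    pll_clocks: names given to -pllclocks (illustrative values only).
    cycles_per_clock: integer argument of -cycles_per_clock.
    A width of 1 means a scalar port is created instead of a bus.
    """
    n_clocks = len(pll_clocks)
    return {
        "clk": n_clocks,
        "fast_clk": n_clocks,
        "clk_enable": n_clocks * cycles_per_clock,
    }


# Hypothetical example: two PLL clocks, two capture pulses per clock.
widths = mux_port_widths(["pll/CLKA", "pll/CLKB"], cycles_per_clock=2)
for port, width in widths.items():
    kind = "scalar" if width == 1 else f"bus [{width - 1}:0]"
    print(f"{port}: {kind}")
```

With two clocks and two cycles per clock, this gives 2-bit clk and fast_clk buses and a 4-bit clk_enable bus, which matches the [3:0] enable bits drawn in Figure 1.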

2.3 Connections

As instantiated by insert_dft, the DFT_clk_mux ports are connected as follows:

Port Name    Type            Default Name                                               CTL DataType
reset        Primary Input   pll_reset                                                  snps_pll_reset
test_mode    Primary Input   test_mode                                                  TestMode
pll_bypass   Primary Input   pll_bypass                                                 snps_pll_bypass
scan_en      Primary Input   test_se                                                    ScanEnable
clk_enable   Internal        DFT_clk_chain(clk_ctrl_data)                               -
fast_clk     Internal        -pllclocks hookup pin (last element in list is bit 0)      -
slow_clk     Primary Input   -ateclocks argument                                        MasterClock ScanMasterClock
clk          Internal        -pllclocks destination (last element in list is bit 0)     -

Table 2. DFT_clk_mux default connections

2.4 Functional Operation

The functional operation of DFT_clk_mux is to select either the fast_clk input or the slow_clk input to pass to the clk output.

Three of the inputs are static controls to the output multiplexer. Switching any of these inputs takes effect immediately and can result in glitches on the clk output. These signals are listed in Table 3.

test_mode   pll_bypass   source of clk output
0           -            fast_clk
1           1            slow_clk
1           0            dynamic selection

Table 3. Static control states in DFT_clk_mux
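The static selection in Table 3 can be read as a small truth function. The sketch below is a behavioral illustration only; the function name clk_source is hypothetical, and the code does not model the glitching that the text warns about when these inputs change.

```python
def clk_source(test_mode, pll_bypass):
    """Return the source of the clk output for the given static controls."""
    if test_mode == 0:
        return "fast_clk"           # pll_bypass is a don't-care in this row
    if pll_bypass == 1:
        return "slow_clk"
    return "dynamic selection"      # left to the remaining (dynamic) inputs


for test_mode in (0, 1):
    for pll_bypass in (0, 1):
        print(test_mode, pll_bypass, clk_source(test_mode, pll_bypass))
```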

The remaining inputs control the dynamic selection of the two clocks. When used properly, they ensure that switching between the clocks is done without glitches. A clock is deselected on its own falling edge. The clk output is then held low until the new clock is selected on its own falling edge, which ensures glitch-free operation and full pulse widths.

reset is only used for initialization. In the test protocol, it pulses to 1 and then stays at 0 for the remainder of the test. When reset goes back to 0, the sequence of operations is:

If scan_en is 1, one slow_clk pulse is required and then slow_clk is selected.

If scan_en is 0, the next fast_clk pulse starts a capture pulse sequence.

Pulsing reset to 1 after initialization is improper use, and will result in the clk output immediately going to 0.

DFT_clk_mux can reset itself even without the reset pulse. Set scan_en to 1, then wait for one fast_clk pulse followed by one slow_clk pulse (which selects the slow_clk input). After five more fast_clk pulses, the block is ready to go through a capture sequence.

clk_enable is a bus connected to the clk_ctrl_data output of a DFT_clk_chain block. This bus is loaded during the scan shift operation. Changing this input while scan_en is low is improper use and can result in unpredictable glitching on the clk output. Each bit enables a pulse on an output clk signal at a particular clock cycle count of its corresponding fast_clk input. A value of 1 represents a pulse and a value of 0 represents no pulse. The grouping is first by output clock and second by count.

For example, if set_dft_clk_controller has three elements in its -pllclocks list and a -cycles_per_clock argument of 2:

clk_enable[0] enables a pulse on count 1 on clk[0]

clk_enable[1] enables a pulse on count 2 on clk[0]

clk_enable[2] enables a pulse on count 1 on clk[1]

clk_enable[3] enables a pulse on count 2 on clk[1]

clk_enable[4] enables a pulse on count 1 on clk[2]

clk_enable[5] enables a pulse on count 2 on clk[2]
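The grouping rule (first by output clock, then by count) amounts to integer division of the bit index by the -cycles_per_clock value. The following sketch reproduces the list above; the helper names decode_clk_enable and encode_clk_enable are hypothetical and exist only for this illustration.

```python
def decode_clk_enable(bit, cycles_per_clock):
    """Map a clk_enable bit index to (clk index, count)."""
    clk_index, offset = divmod(bit, cycles_per_clock)
    return clk_index, offset + 1    # counts are numbered from 1


def encode_clk_enable(clk_index, count, cycles_per_clock):
    """Inverse mapping: (clk index, count) to clk_enable bit index."""
    return clk_index * cycles_per_clock + (count - 1)


# Three clocks, -cycles_per_clock 2, as in the example above.
for bit in range(3 * 2):
    clk_index, count = decode_clk_enable(bit, 2)
    print(f"clk_enable[{bit}] enables a pulse on count {count} on clk[{clk_index}]")
```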

scan_en is connected to the scan enable signal used by the internal scan chains. It works as follows:

When scan_en goes high, slow_clk is selected following its first falling edge. Every transition on slow_clk is passed through to the clk output.

When scan_en goes low, the signal is resynchronized from the slow clock domain (captured by a single flip-flop in the clock selection block) to the fast clock domain (resynchronized by three successive synchronizer flip-flops in the fast pulse controller block). Once the low scan enable signal has been resynchronized, a counting sequence from 0 to N+1 is initiated by the fast pulse controller, according to the -cycles_per_clock N argument. Cycles 0 and N+1 are quiet, while cycles 1 through N selectively issue fast clock pulses depending on the values loaded into the clock chain.
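The counting sequence can be sketched for a single clock as follows. This is a simplified behavioral illustration, not the gate-level logic: it only reports which counter cycles are enabled to pulse, and it ignores the synchronization delay and the exact fast_clk cycle in which the pulse appears on clk. The function name capture_pulses is hypothetical.

```python
def capture_pulses(enable_bits):
    """Return one 0/1 entry per counter cycle 0..N+1 for a single clock.

    enable_bits[i] is the clk_enable value for count i+1, so
    N = len(enable_bits). Cycles 0 and N+1 are always quiet.
    """
    n = len(enable_bits)
    pulses = []
    for count in range(n + 2):
        if 1 <= count <= n:
            pulses.append(1 if enable_bits[count - 1] else 0)
        else:
            pulses.append(0)        # quiet cycle at either end
    return pulses


print(capture_pulses([1, 1]))   # [0, 1, 1, 0]
print(capture_pulses([0, 1]))   # [0, 0, 1, 0]
```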

If the OCC controller is used with a pipelined scan-enable signal, additional steps are needed to ensure correct operation. For more information, see “On-Chip Clocking Support” in the DFT Compiler Scan User Guide.

Figures 5 and 6 show the behaviors in a case with set_dft_clk_controller -cycles_per_clock 2:

[Waveform not reproduced. Traces: slow_clk, clk, fast_clk, scan_en. Annotations: scan_en falling deselects slow_clk asynchronously; 3 synchronization cycles; Count = 0 (no pulse); Count = 1 (pulse next cycle if enabled); Count = 2 (pulse next cycle if enabled); Count = 3 (terminal); scan_en rising takes effect on next falling clock edges.]

Figure 5. Capture cycle example using the default (false) of test_occ_insert_clock_gating_cells.

[Waveform not reproduced. Traces: slow_clk, clk, fast_clk, scan_en. Annotations: scan_en falling deselects slow_clk on next rising edge; 3 synchronization cycles; Count = 0 (no pulse); Count = 1 (pulse 2nd following cycle if enabled); Count = 2 (pulse 2nd following cycle if enabled); Count = 3 (terminal); scan_en rising takes effect on next rising clock edges.]

Figure 6. Capture cycle example using set test_occ_insert_clock_gating_cells true.

The dotted arrows show data setup relationships to their corresponding clock edges. scan_en must be synchronized to slow_clk and it must change while slow_clk is low to avoid truncating its pulse on clk. No synchronization with fast_clk is assumed and clock domain crossing synchronization logic is provided. Minimum widths are required for both the high and low pulses of scan_en:

The scan_en low pulse must encompass a slow_clk pulse followed by a number of fast_clk pulses equal to the -cycles_per_clock argument plus five (three synchronization cycles plus two extra counter cycles). Failure to meet this requirement will cause a failure during pattern simulation. Capture pulses will be skipped, but no glitching will occur and the following scan operation will work correctly.

If needed, increase the duration of the scan_en low pulse by using the set_atpg -min_ateclock_cycles cycles command in TetraMAX to specify the number of slow clock cycles that the signal is held low. You can calculate this value using the waveform diagrams, the period of the slow clock, and the largest period across all fast clocks.

If the clock pulses have considerable propagation delay to the scan flip-flops, you can also use the -min_ateclock_cycles option to lengthen the low scan_en pulse so that the clock pulses reach their destination before the rising scan enable transition.

There is no maximum scan_en low pulse width.

The scan_en high pulse must encompass a slow_clk pulse followed by five fast_clk pulses. Failure to meet this requirement may cause all capture pulses in the following capture cycle to be skipped. There is no maximum scan_en high pulse width.
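The pulse-width requirements above can be turned into a rough first estimate. The sketch below assumes that "a slow_clk pulse" costs one full slow clock period and that each fast_clk pulse costs one period of the slowest fast clock. Both are simplifying assumptions, as is the conversion to a whole number of slow clock cycles, so the waveform diagrams remain the reference. All function names and the example periods are hypothetical.

```python
import math

EXTRA_FAST_PULSES_LOW = 5    # three synchronization cycles + two extra counter cycles
FAST_PULSES_HIGH = 5         # "five fast_clk pulses" for the high pulse


def min_scan_en_low_time(slow_period, fast_periods, cycles_per_clock):
    """Estimate the shortest scan_en low time (same unit as the periods)."""
    fast_pulses = cycles_per_clock + EXTRA_FAST_PULSES_LOW
    return slow_period + fast_pulses * max(fast_periods)


def min_scan_en_high_time(slow_period, fast_periods):
    """Estimate the shortest scan_en high time (same unit as the periods)."""
    return slow_period + FAST_PULSES_HIGH * max(fast_periods)


def slow_cycles_needed(duration, slow_period):
    """Round a duration up to a whole number of slow clock cycles."""
    return math.ceil(duration / slow_period)


# Hypothetical numbers: 100 ns ATE clock, fast clocks of 2.5 ns and 4 ns.
low = min_scan_en_low_time(100.0, [2.5, 4.0], cycles_per_clock=2)
print(low, slow_cycles_needed(low, 100.0))   # 128.0 2
```

Any margin for clock propagation delay to the scan flip-flops would be added on top of this estimate.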

2.5 Special Considerations

The DFT_clk_mux component is added into the design by the insert_dft command when set_dft_configuration -clock_controller enable is set. Here are the special considerations that users should be aware of in order to use it successfully.

1. When the insert_dft command maps DFT_clk_mux to gates, it does not optimize it for insertion delay, drive strength or differential delay (pulse shaping). The timing is invalidated by insert_dft, so afterwards update_timing must be run before report_timing. If a timing problem is found, run the compile -incremental command (which can be run in any case to ensure the best optimization).

It is also possible to completely remap the logic in DFT_clk_mux using a nonincremental compile. In this case, run the characterize command on the DFT_clk_mux instance, change the current_design to the DFT_clk_mux design, then run a full compile command. Do not use the compile -scan command since the clock controller must not be put onto the scan chains.

2. Some of the flip-flops inside DFT_clk_mux are used for signaling from the slow clock domain to the fast clock domain(s). These flip-flops should be replaced with metastability-hardened flip-flops if these are available in the standard-cell library. The instances that should be replaced are those shown in Figure 2. The instance names after change_names -rules verilog are:

U_clk_control_i_*_load_n_meta_{0,1,2}_l_reg

where * starts at 0 and increments as needed to cover the number of clocks controlled by the specific DFT_clk_mux.
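To build a replacement list for a script, the name pattern above can be expanded as in the following sketch. The function name meta_flop_names is hypothetical, and the names are relative to the DFT_clk_mux instance; any hierarchy prefix in a real design is not covered here.

```python
def meta_flop_names(num_clocks):
    """Expand U_clk_control_i_*_load_n_meta_{0,1,2}_l_reg for one DFT_clk_mux."""
    return [
        f"U_clk_control_i_{clock}_load_n_meta_{stage}_l_reg"
        for clock in range(num_clocks)     # * starts at 0, one value per clock
        for stage in range(3)              # three synchronizer stages
    ]


for name in meta_flop_names(2):
    print(name)
```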

3. These same metastability flip-flops may cause unnecessary failures in full-timing gate-level simulation. The timing checks of the U_clk_control_i_*_load_n_meta_0_l_reg instances should be disabled to prevent this. Only the first of the metastability flip-flops in each DFT_clk_mux instantiation needs to have its timing disabled. In VCS, this can be done by using the noTiming configuration file attribute. See the VCS User Guide for details.

4. Static timing analysis requires a special setup to enable the required clock gating checks. This setup is described in SolvNet article 022490, titled “Static Timing Analysis Constraints for On-Chip Clocking Support.”

5. Clock Tree Synthesis (CTS) can cause timing problems if it is not set up properly. If CTS is allowed to balance the clock skew to the flip-flops inside DFT_clk_mux to the same value as the flip-flops on the endpoints of the clock tree, then the clock output of DFT_clk_mux may include glitches or shortened clock pulses. This is because the DFT_clk_mux flip-flops gate the clock before it has gone through the clock tree’s delay. The solution to this is to skew the clock to the DFT_clk_mux flip-flops to be earlier than that going to other destinations of the same clock. In IC Compiler, this can be done using the set_clock_tree_exceptions -float_pins command. See the IC Compiler documentation for details.

Note that the clock for DFT_clk_chain can use a clock balanced to the functional flip-flops on endpoints of the clock tree. Its flip-flops are on the scan chains with the functional flip-flops, and its outputs to DFT_clk_mux are ignored during shift but stable during the capture cycle, so they do not have to meet single-cycle timing requirements on those paths.

3 DFT_clk_chain

This section describes the use of the DFT_clk_chain block with regular scan and scan compression.

3.1 Naming Convention

The module is instantiated under this name:

<string>_DFT_clk_chain_<number>

where

<string> is the current_design during the insert_dft run

<number> is the uniquification number of the controller, starting from 0

3.2 Ports

Port Name             Direction   Function
clk                   Input       Falling edge clock
se                    Input       1 to shift scan chains, 0 to hold previous data
si[n:0]               Input       Scan inputs
so[n:0]               Output      Scan outputs
clk_ctrl_data[m:0]    Output      Parallel output data

Table 4. DFT_clk_chain I/O ports

The widths of the buses are determined by the options of the set_dft_clk_controller command:

si and so are as wide as the argument of -chain_count.

clk_ctrl_data is as wide as the number of elements in the -pllclocks list times the argument of the -cycles_per_clock option.

When the bus width would be 1, a scalar port of the same name is used instead.

3.3 Connections

As instantiated by the insert_dft command, the DFT_clk_chain ports are connected as follows:

Port Name       Type             Default Name              CTL DataType
clk             Internal         DFT_clk_mux(clk[max])     -
se              Primary Input    test_se                   ScanEnable
si              Primary Input    test_si                   ScanDataIn
so              Primary Output   test_so                   ScanDataOut
clk_ctrl_data   Internal         DFT_clk_mux(clk_enable)   -

Table 5. DFT_clk_chain default connections

3.4 Functional Operation

DFT_clk_chain shifts data on the falling edge of clk, from the si inputs to the so outputs, when se is high. When se is low, the previous data is held. The data is also available on the clk_ctrl_data parallel output bus. When data is scanned, the first bit from si[0] feeds the flip-flop driving clk_ctrl_data[0].
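The shift and hold behavior can be sketched with a small behavioral model of a single chain. Only two facts are taken from the description above: si feeds the flip-flop driving clk_ctrl_data[0], and data is held when se is low. The ordering of the remaining flip-flops (data moving toward higher bit indices, with so taken from the last flip-flop) is an assumption made for this illustration, and the class name ClkChainModel is hypothetical.

```python
class ClkChainModel:
    """Behavioral sketch of a one-chain DFT_clk_chain (not the real netlist)."""

    def __init__(self, length):
        # flops[i] drives clk_ctrl_data[i]; si feeds flops[0].
        # ASSUMPTION: data moves toward higher indices, so taps the last flop.
        self.flops = [0] * length

    def falling_edge(self, se, si):
        """Apply one falling edge of clk."""
        if se:
            self.flops = [si] + self.flops[:-1]
        # se low: previous data is held

    @property
    def so(self):
        return self.flops[-1]

    @property
    def clk_ctrl_data(self):
        return list(self.flops)


chain = ClkChainModel(length=4)
for bit in (1, 0, 1, 1):
    chain.falling_edge(se=1, si=bit)
chain.falling_edge(se=0, si=0)    # se low: contents unchanged
print(chain.clk_ctrl_data)        # [1, 1, 0, 1]
```

Under this assumed ordering, the last bit shifted in ends up on clk_ctrl_data[0].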

3.5 Special Considerations

The addition of the DFT_clk_chain by the insert_dft command may cause difficulties if the clock tree has already been balanced. Make sure that scan shifting works properly with full timing, especially at the boundaries of the DFT_clk_chain. This can be done in PrimeTime using the script written out by the TetraMAX tmax2pt.tcl utility command write_timing_constraints -mode shift. The most likely timing problems are with hold time (-delay_type min).

One way to avoid clock skew problems is to move the DFT_clk_chain clock connection to the ATE clock (as defined by set_dft_clk_controller -ateclocks). The drawback is that DFT_clk_chain then receives an extra shift pulse, which invalidates its data on scan out. The workaround for this in TetraMAX is to use add_cell_constraint OX on every flip-flop in DFT_clk_chain.

DFT_clk_chain shifts on the falling clock edge. This allows it to be stitched at the beginning of the scan chain, which is very helpful in scan compression mode. However, it may require a separate scan chain of its own if the set_scan_configuration -clock_mixing no_mix or mix_clocks_not_edges options have been applied. When inserting DFT_clk_chain at the top level of a design, it is better to use the set_scan_configuration -clock_mixing mix_edges or mix_clocks options so that edge mixing is permitted.