-
SNUG 2013 1
Using at-speed testing with OCC on a complex SoC
A user experience
BRUDER Bertrand
Atmel
June 11, 2013
Grenoble
-
Agenda
1. Principle of at-speed Scan and On-Chip Clocking
2. Scan at-speed Complex SoC Design Integration
3. Synthesis Flow and STIL generation
4. STA constraints
5. Multi-Pass ATPG flow
6. Results
7. Conclusions and Future Work
-
1/7 Principle of at-speed Scan Using On-Chip Clocking
-
OCC At-speed test : main concepts
At-speed test detects delay-related defects
The stuck-at fault model does not fit at-speed requirements : the transition fault model is used
Synopsys OCC is used to manage at-speed clock switching
[Figure: OCC block diagram and waveform. Labels: ATE slow clock, test_se, launch and capture pulses, OCC, reference clock, bypass, PLL.]
-
2/7 Scan at-speed Complex SoC Design Integration
-
At-speed testing on a complex SoC
Scan at-speed test drastically increases the number of patterns
Compression is needed
Complex SoC designs have multiple clocks at different frequencies
In our case some of the clocks have different frequencies, but are synchronous.
1. How to test all clock domains at their maximum speed ?
2. How to test inter-clock-domain and multi-cycle paths at speed ?
What has to be taken into account
[Figure: launch (L) and capture (C) pulses on an inter-clock-domain path and on a multi-cycle path; clocks come from the PLL through a divider (div).]
-
SAMA5D3 clock system analysis
Clock system has 4 synchronous subsystems :
ARM processor, which runs at 400 MHz
LCD controller, which runs at 266 MHz
System clock, which runs at 133 MHz
1/3 of the peripherals, which runs at 66 MHz
[Figure: SAMA5D3 clock tree. PLLA (800 MHz) feeds the APMC dividers (labels: %1.5, %3, %2, %2), which produce armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]
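At-speed test applies the launch and capture pulses one functional clock period apart, so it can help to see the period each domain implies. The following is a minimal sketch using only the nominal frequencies quoted on this slide; the dictionary and function names are illustrative and do not come from the presentation.

```python
# Nominal domain frequencies in MHz, as quoted on the slide.
# The mapping of names to frequencies follows the slide text and the mode slides.
DOMAIN_FREQ_MHZ = {
    "armclock": 400,
    "lcdclock": 266,
    "hclocks": 133,
    "pclocks": 66,
}


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


for name, freq in DOMAIN_FREQ_MHZ.items():
    # The launch-to-capture interval at speed is one period of the domain clock.
    print(f"{name:9s} {freq:4d} MHz  period = {period_ns(freq):6.2f} ns")
```

The fastest domain (400 MHz) leaves 2.5 ns between launch and capture, while the slowest leaves roughly 15 ns, which is why each domain needs its own at-speed frequency.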
-
Scan at-speed OCC insertion
Solution 1 : One Synopsys OCC per clock domain
-
OCC insertion : Solution 1
4 OCCs are inserted
All dividers have to keep their functionality in scan mode
Advantages :
Easy automated solution
[Figure: One Synopsys OCC per clock domain. PLLA (800 MHz) feeds the APMC dividers (%1.5, %3, %2, %2); one OCC is placed on each of armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]
Drawbacks :
Dividers are not tested at-speed
Synopsys OCC limitation : All OCCs are considered asynchronous
Data paths from one sub-domain to another cannot be tested
Loss of 15% of at-speed coverage !
-
Scan at-speed OCC insertion
Solution 2 : Use of one custom synchronous OCC
-
OCC insertion : Solution 2
4 synchronous OCCs are inserted
All dividers have to keep their functionality in scan mode
Advantages :
Inter-clock-domain and multi-cycle paths can be tested at-speed : no at-speed coverage loss
ATPG is done in one pass
[Figure: Use of one custom synchronous OCC per domain. PLLA (800 MHz) feeds the APMC dividers (%1.5, %3, %2, %2); one OCC is placed on each of armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]
Drawbacks :
Dividers are not tested at-speed
Development of a synchronous OCC is time-consuming and not bug-free
Good solution, but due to our planning constraints and risk assessment study, this solution has not been retained
-
Scan at-speed OCC insertion
Solution 3 : One Synopsys OCC at the output of the PLL
-
OCC insertion : Solution 3
1 OCC is inserted
All dividers are bypassed in scan mode
4 test modes are added to select the frequency corresponding to the target domain
Test mode controller drives PLL frequency and scan modes
Advantages :
Dividers are tested at-speed
Inter-clock-domain and multi-cycle paths can be tested at-speed : no at-speed coverage loss
Drawbacks :
Not a fully automated solution
Increased ATPG complexity with the addition of 4 modes => Multi-pass ATPG
[Figure: One Synopsys OCC at the output of PLLA. PLLA (800 MHz) drives a single OCC; in scan mode the APMC dividers (%1.5, %3, %2, %2 in functional mode) are bypassed (%1) for armclock, lcdclock, hclocks and pclocks.]
Good trade-off between solutions (1) and (2) : solution retained
-
OCC insertion : Solution 3
Scan multiplexors are inserted on clock paths to protect clock branches against unsupported clock rates
Frequency partitioning is done in a second step during ATPG using multi-pass pattern generation
4 scan modes are created :
ARM scan mode
ARM + LCD scan mode
ARM + LCD + hclocks scan mode
ARM + LCD + hclocks + pclocks scan mode
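[Figure: Frequency partitioning. PLLA (800 MHz) drives the OCC; scan multiplexors in the APMC select either the OCC clock or the ATE clock for armclock, lcdclock, hclocks and pclocks; dividers are bypassed (%1).]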
-
OCC insertion : Solution 3
PLLA is programmed to run at 400 MHz
Muxes are controlled in such a way that only ARM gets the OCC clock
The rest of the system is clocked by the ATE tester clock
Mode 0 : ARM scan mode
[Figure: Mode 0 clocking. PLLA runs at 400 MHz; armclock receives the OCC clock (400 MHz); lcdclock, hclocks and pclocks receive the ATE clock through the APMC muxes.]
-
OCC insertion : Solution 3
PLLA is programmed to run at 266 MHz
Muxes are controlled in such a way that only ARM and LCD get the OCC clock
The rest of the system is clocked by the ATE tester clock
Inter-clock-domain paths between ARM and LCD are tested at 266 MHz
Mode 1 : ARM + LCD scan mode
[Figure: Mode 1 clocking. PLLA runs at 266 MHz; armclock and lcdclock receive the OCC clock (266 MHz); hclocks and pclocks receive the ATE clock through the APMC muxes.]
-
OCC insertion : Solution 3
PLLA is programmed to run at 133 MHz
Muxes are controlled in such a way that only ARM, LCD and hclocks get the OCC clock
pclocks is clocked by the ATE tester clock
Inter-clock-domain paths between ARM, LCD and hclocks are tested at 133 MHz
Mode 2 : ARM + LCD + hclocks scan mode
[Figure: Mode 2 clocking. PLLA runs at 133 MHz; armclock, lcdclock and hclocks receive the OCC clock (133 MHz); pclocks receives the ATE clock through the APMC muxes.]
-
OCC insertion : Solution 3
PLLA is programmed to run at 66 MHz
Muxes are controlled in such a way that the whole system gets the OCC clock
Inter-clock-domain paths between ARM, LCD, hclocks and pclocks are tested at 66 MHz
Mode 3 : ARM + LCD + hclocks + pclocks scan mode
[Figure: Mode 3 clocking. PLLA runs at 66 MHz; armclock, lcdclock, hclocks and pclocks all receive the OCC clock (66 MHz).]
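The four scan modes described on the preceding slides can be summarised as a small lookup table. This is only an illustrative sketch: the PLLA frequencies and the OCC/ATE assignment are taken from the Mode 0 to Mode 3 slides, while the names `DOMAINS`, `PLL_FREQ_MHZ` and `clock_sources` are hypothetical.

```python
# Domains ordered from fastest to slowest, as they are added mode by mode.
DOMAINS = ["armclock", "lcdclock", "hclocks", "pclocks"]

# PLLA frequency (MHz) programmed in each scan mode, per the mode slides.
PLL_FREQ_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def clock_sources(mode: int) -> dict:
    """Return, per domain, whether it is driven by the OCC or by the ATE clock.

    Mode k gives the OCC clock to the first k+1 domains of DOMAINS;
    the remaining domains stay on the ATE tester clock.
    """
    if mode not in PLL_FREQ_MHZ:
        raise ValueError(f"unknown scan mode: {mode}")
    return {d: ("OCC" if i <= mode else "ATE") for i, d in enumerate(DOMAINS)}


for mode, freq in PLL_FREQ_MHZ.items():
    print(f"Mode {mode} (PLLA {freq} MHz): {clock_sources(mode)}")
```

Each mode lowers the PLLA frequency to the maximum rate of the slowest domain it adds, so no domain on the OCC clock is ever driven faster than it supports.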
-
3/7 Synthesis flow and STIL generation
-
Synthesis flow
To prepare for synthesis, the test mode controller must provide control of :
OCC devices
Scan mode selection in multi-pass flow
PLL frequency
OCC insertion is done with set_dft_clock_controller with :
chain_count : set the number of FFs inside the clock chain
cycles_per_clock : max number of clock pulses during capture
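As a toy illustration of how a clock chain steers the capture pulses, the sketch below assumes that each clock-chain bit, loaded during scan shift, enables one fast-clock pulse in the capture window, and that the number of bits plays the role of cycles_per_clock. The function names and the bit encoding are hypothetical; this is not a model of the actual Synopsys OCC netlist.

```python
def capture_pulses(chain_bits):
    """Return the capture-window cycle indices in which a fast-clock pulse is emitted.

    chain_bits holds one 0/1 value per capture cycle (hypothetical encoding);
    its length plays the role of cycles_per_clock.
    """
    return [cycle for cycle, bit in enumerate(chain_bits) if bit]


def is_launch_capture(chain_bits):
    """True when exactly two back-to-back pulses are enabled (launch, then capture)."""
    pulses = capture_pulses(chain_bits)
    return len(pulses) == 2 and pulses[1] - pulses[0] == 1


print(capture_pulses([1, 1]))          # both cycles pulsed
print(is_launch_capture([1, 1]))       # a launch/capture pair
print(is_launch_capture([1, 0]))       # single pulse, no at-speed pair
```

With two cycles per clock, the pattern that enables both bits gives the launch and capture pulses needed for a transition test; a pattern with a single bit set produces only one pulse.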
-
STIL file generation
Not all frequency modes are described during OCC insertion, which reduces the complexity of DFT insertion.
Therefore, post-processing of the STIL files is needed to derive as many STIL files as there are scan frequency modes.
Only the scan mode control signals are impacted
If scan mode selection results from a sequential initialization of the test mode controller, post-processing consists in changing the test_setup sequence in the MacroDefs structures
SAMA5D3 example :
Top_ScanCompression_mode0.stil : PA[5:4] forced to 00
Top_ScanCompression_mode1.stil : PA[5:4] forced to 01
Top_ScanCompression_mode2.stil : PA[5:4] forced to 10
Top_ScanCompression_mode3.stil : PA[5:4] forced to 11
Specificity of our multi-pass ATPG flow
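The SAMA5D3 example follows a simple rule: the mode number is the binary value forced on PA[5:4]. The sketch below reproduces that mapping only; it does not edit any STIL file, and the function name `mode_settings` is hypothetical.

```python
def mode_settings(num_modes: int = 4, width: int = 2) -> dict:
    """Map each per-mode STIL file name to the PA[5:4] value forced in test_setup."""
    return {
        f"Top_ScanCompression_mode{m}.stil": format(m, f"0{width}b")
        for m in range(num_modes)
    }


for stil_file, pa_value in mode_settings().items():
    print(f"{stil_file}: PA[5:4] forced to {pa_value}")
```

A real post-processing script would also have to rewrite the test_setup sequence inside each file, which depends on how the test mode controller is initialised.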
-
4/7 STA constraints
-
STA scan constraints
Two scenarios : one shift and one capture
In multi-pass flow, all modes are overlaid
All clocks of each mode are created in the same STA scenario
Exclusive clock groups are defined to remove inter-mode timing paths
[Figure: clocks created in the overlaid STA scenario. PLLA (800 MHz) drives the OCC; mode selection in the APMC feeds armclock, lcdclock, hclocks and pclocks. Clock names shown: OCC_${OCC}_ATEClock; clkplla_$freq_mode0, clkplla_$freq_mode1, clkplla_$freq_mode2; OCC_$freq_mode0, OCC_$freq_mode1, OCC_$freq_mode2; LCD_$freq_mode0, LCD_$freq_mode1, LCD_$freq_mode2; hclock_$freq_mode0, hclock_$freq_mode1, hclock_$freq_mode2; pclock_$freq_mode0, pclock_$freq_mode1, pclock_$freq_mode2.]
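One way to express the exclusive clock groups is a single SDC `set_clock_groups` command with one `-group` per scan mode. The sketch below only builds that command text. The choice of `-physically_exclusive` is an assumption (the slide just says "exclusive"), and the clock names in the example are hypothetical instances of the `$freq` naming pattern shown in the figure.

```python
def exclusive_clock_groups(clocks_by_mode: dict) -> str:
    """Build one SDC set_clock_groups command with one -group per scan mode."""
    groups = " \\\n".join(
        "    -group {" + " ".join(names) + "}"
        for _, names in sorted(clocks_by_mode.items())
    )
    # -physically_exclusive is an assumption; the slide only says "exclusive".
    return "set_clock_groups -physically_exclusive \\\n" + groups


# Hypothetical clock names following the <clock>_<freq>_mode<k> pattern.
example = {
    0: ["clkplla_400_mode0", "OCC_400_mode0"],
    1: ["clkplla_266_mode1", "OCC_266_mode1", "LCD_266_mode1"],
    2: ["clkplla_133_mode2", "OCC_133_mode2", "LCD_133_mode2", "hclock_133_mode2"],
}
print(exclusive_clock_groups(example))
```

Grouping clocks by mode tells the timer that a clock from one mode never launches data captured by a clock from another mode, which removes the inter-mode paths from analysis.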
-
SDC generation for ATPG
To avoid timing violations during scan mode, multi-cycle and false paths have to be given to TetraMAX
The recommended Synopsys flow is to generate an SDC file during STA, which is then read by TetraMAX (read_sdc command)
The pt2tmax.tcl script generates the SDC from timing violations
write_exception_from_violation
Multi-pass ATPG flow : one SDC file is generated per mode
False path and multicycle path management
-
5/7 Multi-Pass ATPG flow
-
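ATPG at-speed Multi-pass flow
[Flow diagram, repeated for each mode $i (loop label truncated in the source: "For 1"). Recoverable labels: mode$i.stil; add_faults launch \ capture $occ_clock; read_faults \ delete mode$i-1; mode$i-1 dictionary.]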
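One possible reading of this flow is that each pass targets the transition faults whose launch and capture clocks are driven by the OCC in that mode, and that faults already listed in an earlier pass are deleted from later ones. The sketch below models only that bookkeeping, with a hypothetical data model (a fault name mapped to the clock domains on its path); it does not use TetraMAX commands.

```python
# Domains on the OCC clock in each pass, per the scan mode slides.
OCC_DOMAINS_BY_MODE = [
    {"arm"},
    {"arm", "lcd"},
    {"arm", "lcd", "hclocks"},
    {"arm", "lcd", "hclocks", "pclocks"},
]


def partition_faults(faults: dict) -> dict:
    """Assign each fault to the first mode whose OCC-clocked domains cover it.

    faults maps a fault name to the set of clock domains on its
    launch/capture path (hypothetical data model).
    """
    assigned = {mode: [] for mode in range(len(OCC_DOMAINS_BY_MODE))}
    for name, domains in faults.items():
        for mode, occ_domains in enumerate(OCC_DOMAINS_BY_MODE):
            if domains <= occ_domains:
                assigned[mode].append(name)
                break  # handled in this pass; later passes skip this fault
    return assigned


toy_faults = {
    "f_arm": {"arm"},
    "f_arm_lcd": {"arm", "lcd"},
    "f_bus": {"hclocks"},
    "f_periph_arm": {"pclocks", "arm"},
}
print(partition_faults(toy_faults))
```

Under this reading every fault is tested once, at the highest frequency at which all of its clock domains can run.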
-
Incremental ATPG Flow
Complete the ATPG generation using the stuck-at fault model
[Flow diagram. Recoverable labels: Multi-pass flow with run_atpg (target : 80%), producing the transition fault final dictionary; add_faults all; update_faults \ direct_credit mode3 (stuck-at equivalency); stuck-at dictionary; run_atpg (95%); final dictionary and stuck-at patterns.]
-
6/7 Scan At-speed Results
SAMA5D3 product
-
ATPG results : SAMA5D3 product
Scan at-speed ATPG multi-pass incremental flow (with OCC)
                        # faults                       # patterns   coverage
Transition fault model
  Mode 0 (400 MHz)      971,686                        2,081        76.25%
  Mode 1 (266 MHz)      8,114                          467          80.02%
  Mode 2 (133 MHz)      5,867,180                      12,874       75.51%
  Mode 3 (66 MHz)       12,856                         454          77.64%
  Total transition      6,859,836                      15,876       75.62% *
Stuck-at fault model
  Total stuck-at        9,287,250 (incr : 2,507,386)   4,337        95.10%
Total patterns                                         20,213
Scan ATPG stuck-at only (without OCC)
                        # faults                       # patterns   coverage
  Total stuck-at        9,287,250                      8,615        95.15%
Ratio : 2.4
* Without inter-domain at-speed test, at-speed coverage would have been around 60%
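The totals in the table can be cross-checked from the per-mode rows. The sketch below uses only the figures quoted on this slide; the variable names are illustrative. With these counts the pattern ratio comes out near 2.35, while the slide quotes 2.4.

```python
# (faults, patterns, coverage %) per transition pass, as quoted in the table.
passes = {
    "Mode 0 (400 MHz)": (971_686, 2_081, 76.25),
    "Mode 1 (266 MHz)": (8_114, 467, 80.02),
    "Mode 2 (133 MHz)": (5_867_180, 12_874, 75.51),
    "Mode 3 (66 MHz)": (12_856, 454, 77.64),
}

total_faults = sum(f for f, _, _ in passes.values())
total_patterns = sum(p for _, p, _ in passes.values())
# Fault-weighted average of the per-mode coverages.
weighted_cov = sum(f * c for f, _, c in passes.values()) / total_faults

stuck_at_patterns = 4_337          # incremental stuck-at pass
stuck_at_only_patterns = 8_615     # stuck-at-only run, without OCC
all_patterns = total_patterns + stuck_at_patterns
ratio = all_patterns / stuck_at_only_patterns

print(f"transition faults  : {total_faults:,}")
print(f"transition patterns: {total_patterns:,}")
print(f"weighted coverage  : {weighted_cov:.2f}%")
print(f"total patterns     : {all_patterns:,}")
print(f"pattern ratio      : {ratio:.2f}")
```

The per-mode fault counts add up exactly to the total, which is consistent with each transition fault being targeted in a single pass.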
-
SAMA5D3 : Early Silicon Results At-speed VS stuck-at
Stuck-at patterns only : 17 fails
At-speed patterns : 19 fails
2 parts were caught thanks to scan at-speed (251 parts tested)
Mode 0 : 0.4% loss vs stuck-at
Mode 1 : 0% loss vs stuck-at
Mode 2 : 0.8% loss vs stuck-at
Mode 3 : 0% loss vs stuck-at
[Figure annotations: High freq, Small area, Big area, Low freq]
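The headline figure follows from simple arithmetic on the counts given on this slide. The sketch assumes that the 19 at-speed fails include the 17 parts that already fail the stuck-at patterns, which matches the statement that 2 parts were caught thanks to scan at-speed.

```python
parts_tested = 251
stuck_at_fails = 17
at_speed_fails = 19

# Parts that pass stuck-at patterns but fail at-speed patterns
# (assumes the at-speed fails include all stuck-at fails).
extra_caught = at_speed_fails - stuck_at_fails
extra_pct = 100.0 * extra_caught / parts_tested

print(f"{extra_caught} extra parts caught = {extra_pct:.1f}% of {parts_tested} parts")
```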
-
7/7 Conclusions and future work
-
Conclusions and future work
We have successfully deployed Synopsys OCC in our DFT implementation flows at ATMEL Rousset, leading to the ATMEL SAMA5D3 product, in production since the beginning of 2013
Gain of the approach : 15% increase in at-speed coverage
This methodology is now part of the official Atmel flow for all new SoCs
For future work, we plan to implement a custom OCC as described in solution (2), improve the handling of functional timing exceptions in the flow, and consider complementary fault models such as Small Delay or Bridging Defects.
-
Thank You
-
Appendix A
CTS constraints
-
CTS constraints
CTS constraints aim at
balancing all flip-flops in the OCC driven by the fast_clk input.
balancing all flip-flops in the OCC driven by the slow_clk input.
balancing all flip-flops in the clock chain driven by the OCC output clk.
tagging the nets from the PLL outputs to the OCC fast_clk input.
tagging the net from the ATE clock pad to the OCC slow_clk input.
tagging the net from the ATE ref clock pad to the PLL input.
tagging the net from the OCC output (clk) to the APMC scan multiplexor.
See example on next slide
(Caution : CTS constraints are design dependent. The example is not exhaustive.)
-
CTS Constraints - Example
-
CTS Constraints - Reusability
CTS constraints are not reusable :
Constraints at top level are design dependent
Constraints in the test mode controller are design dependent
Excluded pins at the input of APMC are automatically set by the ::MCTS::check_CTS_spec case_sensitive during the functional pass of CTS