A Defect Tolerant and Performance Tunable Gate Architecture for End-of-Roadmap CMOS
Adit D. Singh
Electrical and Computer Engineering, Auburn University, AL 36849
National Science Foundation CNS 0708962 and CCF 0811454
Motivation
No visibility on technology beyond CMOS
CMOS appears here to stay!
Scaling projected to continue
At least a decade of design likely in nano-scale “End-of-Roadmap” CMOS
End-of-Roadmap CMOS
Characterized by:
  Atomic scale feature sizes (~100 Si atoms in 45nm)
  Physical limits in material properties
  Random dopant fluctuations
  Extreme sub-wavelength lithography
Potential for:
  High manufacturing defectivity
  Operational wear out & degradation
  Highly random performance variation
Needed: Defect Tolerance, Performance Tuning
Clock Speed and Parameter Variation
Clock rate determined by slowest path
Manufacturing variability forces different clock rates: “Speed Binning”
Speed Binning works for systematic variability
Less effective for random variability
[Figure: combinational logic between flip-flops (FF) driven by a common clock]
Speed Binning
Traditional Systematic Variability
  Device parameters track within a chip
  All gates slow or all fast
  Some chips slow, some fast
  Average clock rate over many manufactured parts = clock rate for average parameter values
Random Variability
Random parameter variability within chip
Every copy of a large circuit highly likely to have a few very slow paths
Average clock frequency << clock rate for average parameter values (for large circuits)
Random Variability
Statistical: 1 in 100 gates very slow
  18 gate design: a few slow parts
  150 gate design: virtually all slow parts
[Figure: Normal Distribution]
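The 18 gate versus 150 gate comparison can be checked with a short calculation. This is a minimal sketch that assumes each gate is very slow independently with probability 1 in 100; the function name and the independence assumption are illustrative and not taken from the slides.

```python
def prob_at_least_one_slow(num_gates: int, p_slow: float = 0.01) -> float:
    """Probability that a part contains at least one very slow gate.

    Assumes (for illustration) that gates are slow independently,
    each with probability p_slow.
    """
    return 1.0 - (1.0 - p_slow) ** num_gates


for n in (18, 150):
    p = prob_at_least_one_slow(n)
    print(f"{n:4d} gate design: P(at least one very slow gate) = {p:.3f}")
```

Under these assumptions roughly one part in six is affected at 18 gates, while most parts are affected at 150 gates, and the fraction keeps rising toward 1 as the design grows.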
Random Variability: Speed Vs Size
Large circuits statistically more likely to have one or more slow outlier paths
[Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000)]
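The growth of worst case delay with circuit size can be illustrated with a small Monte Carlo run. This is a sketch under stated assumptions, not the author's experiment: each path delay is drawn directly from a Normal distribution with mean 1 and standard deviation 1/6 (the ratio quoted on a later slide), circuit size is the number of independent paths, and all names are hypothetical.

```python
import random
import statistics


def mean_worst_case_delay(num_paths, trials=200, mean=1.0, sigma=1.0 / 6.0, seed=0):
    """Average, over many simulated circuits, of the slowest path delay.

    Assumption: path delays are independent Normal(mean, sigma) samples.
    """
    rng = random.Random(seed)
    worst = [
        max(rng.gauss(mean, sigma) for _ in range(num_paths))
        for _ in range(trials)
    ]
    return statistics.mean(worst)


sizes = (1, 10, 100, 1000)
averages = [mean_worst_case_delay(n) for n in sizes]
for n, avg in zip(sizes, averages):
    print(f"circuit size {n:5d}: average worst case path delay = {avg:.3f}")
```

The clock must accommodate the maximum over all paths, so the average worst case delay rises with circuit size even though the per-path distribution is unchanged.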
Post Manufacture Performance Tuning: “Delay Fault Tolerance”
Capability to allow speed-up of statistically slow outlier paths
[Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000), with GOAL marked]
Defect Tolerant & Tunable CMOS
[Figure: gate schematic. Labels: P-Net, N-Net, two "Sized and Programmable" devices, two "Programmable Switch" elements]
Defect Free Operation
Traditional CMOS
[Figure: gate schematic with P-Net, N-Net and two "Sized and Programmable" devices; programmable elements labeled ON, OFF, OFF, ON]
Defect in P-Net
Pseudo nMOS operation
Pull-up sized for ratio logic
Rpu ~ 4 Rpd
[Figure: gate schematic with P-Net and N-Net; programmable elements labeled OFF, OFF, ON, ON]
Defect in N-Net
Pseudo PMOS operation
Pull-down sized for ratio logic
Rpd ~ 4 Rpu
[Figure: gate schematic with P-Net and N-Net; programmable elements labeled OFF, OFF, ON, ON]
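The 4:1 sizing in the two ratio logic modes sets the degraded output level when both the always-on device and the working network conduct. The sketch below treats the transistors as linear resistors in a voltage divider and normalizes VDD to 1; both are simplifying assumptions, and the function name is hypothetical.

```python
def divider_output(vdd, r_top, r_bottom):
    """Output voltage of a resistive divider: r_top to VDD, r_bottom to ground."""
    return vdd * r_bottom / (r_top + r_bottom)


VDD = 1.0  # normalized supply (assumption)

# Pseudo nMOS (defect in P-Net): always-on pull-up with Rpu = 4 Rpd.
# When the N-Net conducts, the output low level is pulled slightly above ground.
v_ol = divider_output(VDD, r_top=4.0, r_bottom=1.0)

# Pseudo PMOS (defect in N-Net): always-on pull-down with Rpd = 4 Rpu.
# When the P-Net conducts, the output high level sits slightly below VDD.
v_oh = divider_output(VDD, r_top=1.0, r_bottom=4.0)

print(f"pseudo nMOS output low  = {v_ol:.2f} VDD")
print(f"pseudo PMOS output high = {v_oh:.2f} VDD")
```

With this model the low level is 0.2 VDD in pseudo nMOS mode and the high level is 0.8 VDD in pseudo PMOS mode, which is the usual trade-off of ratio logic: reduced noise margin and static current in exchange for tolerating the defective network.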
Performance Tuning: Slow P-transistor
Redundant PMOS speeds up rising transitions
Speed up greatest for very slow outlier transistor
Some slow down of opposite falling transitions
[Figure: gate schematic with P-Net and N-Net; programmable elements labeled ON, ON, ON, OFF]
Performance Tuning: Slow P-transistor
Assume nominal Rpu = Rpd and R_tuning = 4 Rpd

Defective Rpu   Extra Delay (Untuned)   Extra Delay (Tuned)
1X              0X                      -0.2X
1.5X            0.5X                    0.10X
2X              1X                      0.33X
4X              3X                      1.00X
6X              5X                      1.40X
8X              7X                      2.20X

Speedup: compare the Untuned and Tuned columns.
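One way to read the tuned column is as the slow pull-up in parallel with the tuning device. The sketch below assumes rising delay is proportional to the effective pull-up resistance (fixed load) and normalizes nominal Rpu to 1; this model and its names are assumptions for illustration, not stated on the slide.

```python
R_NOMINAL = 1.0  # nominal Rpu = Rpd, normalized (assumption)
R_TUNING = 4.0   # R_tuning = 4 Rpd, as stated on the slide


def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def extra_delay(r_defective, tuned):
    """Extra rising delay relative to nominal, in units of the nominal delay."""
    r_eff = parallel(r_defective, R_TUNING) if tuned else r_defective
    return r_eff / R_NOMINAL - 1.0


for r in (1.0, 1.5, 2.0, 4.0, 6.0, 8.0):
    print(f"Rpu = {r:>3}X  untuned {extra_delay(r, False):+.2f}X  "
          f"tuned {extra_delay(r, True):+.2f}X")
```

This simple model closely matches the listed values for the 1X through 6X rows. For the 8X row it gives about 1.67X rather than the listed 2.20X, so either that entry comes from a different model or the parallel-resistance reading is incomplete.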
Performance Tuning: Slow P-transistor
Assume 10 level path
Untuned delay = 13X
Tuned delay = 11X
Tuning 2 additional gates: Tuned delay = 10.6X
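The path figures follow from the extra-delay table if one gate on the path has a 4X pull-up. That choice is an inference (it is the row that reproduces 13X and 11X), and the sketch also assumes every level contributes 1X of nominal delay; constant and function names are hypothetical.

```python
PATH_LEVELS = 10
NOMINAL_GATE_DELAY = 1.0    # each level contributes 1X (assumption)
EXTRA_UNTUNED_SLOW = 3.0    # table row Rpu = 4X, untuned (assumed slow gate)
EXTRA_TUNED_SLOW = 1.0      # table row Rpu = 4X, tuned
EXTRA_TUNED_NOMINAL = -0.2  # table row Rpu = 1X, tuned


def path_delay(slow_gate_tuned, extra_nominal_gates_tuned=0):
    """Path delay in units of the nominal gate delay."""
    delay = PATH_LEVELS * NOMINAL_GATE_DELAY
    delay += EXTRA_TUNED_SLOW if slow_gate_tuned else EXTRA_UNTUNED_SLOW
    delay += extra_nominal_gates_tuned * EXTRA_TUNED_NOMINAL
    return delay


print(f"untuned:                  {path_delay(False):.1f}X")
print(f"slow gate tuned:          {path_delay(True):.1f}X")
print(f"plus 2 nominal gates:     {path_delay(True, 2):.1f}X")
```

Tuning the outlier recovers most of the loss; tuning two further nominal gates on the same path trims another 0.4X, at the cost of slower opposite transitions on those gates.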
Simulation Experiments
Simplified simulation of inverter chains
Transistor parameters drawn from a Normal Distribution, with different variance values
Circuit size measured by number of chains
For each “circuit”, worst case untuned and tuned delays obtained
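An experiment of this shape can be sketched in a few lines. The version below is an illustration under several assumptions that the slides do not specify: each stage is reduced to a single normalized resistance drawn from Normal(1, sigma), chains have 8 stages, the tuning policy switches in a parallel device of 4X resistance on the slowest stage of every chain, and the slow down of opposite transitions is ignored.

```python
import random
import statistics

STAGES = 8       # stages per inverter chain (assumption)
R_TUNING = 4.0   # tuning device resistance, in nominal units


def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def chain_delays(rng, sigma):
    """Return (untuned, tuned) delay of one chain, in nominal stage delays."""
    # Clip at a small positive value so a rare negative sample cannot occur.
    stages = [max(0.05, rng.gauss(1.0, sigma)) for _ in range(STAGES)]
    untuned = sum(stages)
    slowest = max(stages)
    # Hypothetical policy: tune only the slowest stage of the chain.
    tuned = untuned - slowest + parallel(slowest, R_TUNING)
    return untuned, tuned


def average_worst_case(num_chains, sigma=1.0 / 6.0, trials=100, seed=0):
    """Average worst case (untuned, tuned) chain delay over many circuits."""
    rng = random.Random(seed)
    worst_untuned, worst_tuned = [], []
    for _ in range(trials):
        delays = [chain_delays(rng, sigma) for _ in range(num_chains)]
        worst_untuned.append(max(u for u, _ in delays))
        worst_tuned.append(max(t for _, t in delays))
    return statistics.mean(worst_untuned), statistics.mean(worst_tuned)


results = {n: average_worst_case(n) for n in (1, 10, 100)}
for n, (untuned, tuned) in results.items():
    print(f"{n:4d} chains: untuned {untuned:.2f}  tuned {tuned:.2f}")
```

Because a parallel device can only lower the resistance in this model, the tuned worst case is always below the untuned one; a more faithful model would track pull-up and pull-down separately and charge the falling-transition penalty.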
Post Manufacture Performance Tuning: “Delay Fault Tolerance”
Simulate and average over a large number of instances for each “circuit” size
[Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000), with GOAL marked]
Observed Delay Variations, Tuned and Untuned
Standard Deviation = 1/6 mean
8 stage inverter chains
[Figure: delay (sec, x 10^-10, axis 1.9 to 2.7) plotted over 10^1 to 10^5, tuned and untuned; annotation: 20%]
Observed Delay Variations for different sigmas
[Figure: delay (sec, x 10^-10) vs. number of circuits (log scale, 10^1 to 10^4)]
Defect Tolerant & Tunable CMOS Gate
[Figure: gate schematic. Labels: P-Net, N-Net, two "Sized and Programmable" devices, two "Programmable Switch" elements]
Conclusion
End-of-Roadmap Applications