Design of Variable Input Delay Gates for Low Dynamic Power Circuits
Tezaswi Raja, Transmeta Corp., Santa Clara, CA
Vishwani D. Agrawal, Dept. of ECE, Auburn University
Michael L. Bushnell, Dept. of ECE, Rutgers University
Research Funded by: National Science Foundation
Sep 23, 2005. Tezaswi Raja: PATMOS Conf. Leuven.
Talk Outline
  Motivation
  Transistor Level Design of Variable Input Delay Gate
  Results
  References
  Conclusion and Future Work
Motivation: Variable Input Delay Gates
[Figure: the same circuit in three versions (unoptimized, buffer optimized, variable input delay gate), with numeric delay annotations on gates and paths]
Unoptimized:
  Produces glitches.
  Wastes power.
Buffer optimized:
  Glitches removed.
  Active power consumed in buffer.
  Leakage paths added through buffer.
Variable input delay gate:
  Glitches removed.
  No extra leakage paths added.
Issues:
  Can we design such a gate?
  How much can the delays through IO paths differ by?
Problem Statement
Design a gate at the transistor level such that:
  The gate has different delays along different IO paths.
  The maximum achievable difference in delay between any two paths (ub) through the gate can be quantified.
Transistor Level Implementation
We propose three new implementations of the variable input delay gate:
  Capacitance manipulation method, where the input capacitance offered by the respective transistor pair is varied.
  Pass transistor added design, where an extra transistor is added to increase the resistance and thereby the input delay. We propose the addition of:
    Single nMOS transistor
    CMOS pass transistor
We describe the pass transistor added design in detail here. The first design is documented in the paper.
Concept of Increasing Resistance
Need a CMOS gate with different delays along different IO paths.
Note that the resistance of the path influences only the delay and not the energy consumed.
Hence, adding more resistance can be the best way to add delay without wasting more energy.
Delay = Ron (Cp + Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2
[Figure: driving gate with on-resistance Ron, routing capacitance Cr, and input capacitance Cin]
Solution: Add another transistor in series to the path.
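As a rough numerical illustration of the point above, the following Python sketch evaluates the slide's delay and energy expressions for two values of Ron. The component values are hypothetical and chosen only for illustration; they do not come from the talk.

```python
# Sketch of the slide's first-order expressions:
#   Delay  = Ron * (Cp + Cr + Cin)
#   Energy = 0.5 * (Cr + Cin) * V^2
# All numeric values below are hypothetical, not taken from the talk.

def path_delay(r_on, c_p, c_r, c_in):
    """First-order RC delay of the path, in seconds."""
    return r_on * (c_p + c_r + c_in)

def transition_energy(c_r, c_in, vdd):
    """Energy per transition, in joules; note that it has no resistance term."""
    return 0.5 * (c_r + c_in) * vdd ** 2

C_P, C_R, C_IN = 2e-15, 5e-15, 3e-15   # farads (assumed)
VDD = 2.5                              # volts (assumed)

for r_on in (10e3, 20e3):              # ohms (assumed)
    d = path_delay(r_on, C_P, C_R, C_IN)
    e = transition_energy(C_R, C_IN, VDD)
    print(f"Ron = {r_on:.0f} ohm: delay = {d * 1e12:.1f} ps, energy = {e * 1e15:.2f} fJ")
```

In this model, doubling Ron doubles the delay while the energy per transition stays the same, because the energy expression contains no resistance term.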
Single nMOSFET Added Design
The input delay can be added by an nMOS transistor in series with the desired input path.
The addition of resistance does not increase the energy per transition.
d3,2 = Ron (Cr + Cin)
d3,1 = Ron (Cr + Cin) + Rs Cin
d3,1 = Output delay + Input delay
Energy = 0.5 (Cr + Cin) V^2
[Figure: gate with paths d3,1 and d3,2; the d3,1 input has series resistance Rs; each driver has on-resistance Ron, routing capacitance Cr, and input capacitance Cin]
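Subtracting the two path delays on this slide isolates the contribution of the added transistor. The block below only restates the slide's first-order expressions; the symbol Δd for the difference is introduced here and does not appear in the talk.

```latex
\begin{align*}
d_{3,2} &= R_{on}\,(C_r + C_{in}) \\
d_{3,1} &= R_{on}\,(C_r + C_{in}) + R_s C_{in} \\
\Delta d &= d_{3,1} - d_{3,2} = R_s C_{in}
\end{align*}
```

Under this model the delay difference between the two inputs is set by Rs and Cin alone, and the energy expression has no Rs term.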
Theoretical Calculation of ub
Note: nMOS conducts logic '0' well but '1' is degraded (shown by λ).
[Figure: logic 1 transmission and logic 0 transmission through Rs carrying current Ids; the transmitted level is 1 - λ - IdsRs for logic 1 and +IdsRs for logic 0; operating regions marked Linear and Cutoff]
For pmos cutoff (pmos threshold):
  1 - λ - IdsRs > Vdd - Vtp
For nmos cutoff (nmos threshold):
  IdsRs < Vtn
Constraints give upper bound on Rs and λ.
Upper bound on Rs determines upper bound on ub.
Can be made specific to any technology.
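One way to read the two constraints is as a pair of upper bounds on Rs, the smaller of which applies. The Python sketch below works this out under several assumptions that the slide does not state: logic '1' is taken to be Vdd (so Vdd cancels in the first constraint), Vtp is treated as a magnitude, the same Ids applies in both cases, and all numbers are hypothetical.

```python
# Sketch: upper bound on Rs implied by the two cutoff constraints on the slide.
# Assumptions (not stated in the talk): logic '1' equals Vdd, Vtp is a magnitude,
# the same Ids applies in both cases, and all numbers are hypothetical.

def rs_upper_bound(vtn, vtp, lam, ids):
    """Return the strict upper bound on Rs (ohms) that satisfies both constraints."""
    # Logic 1: Vdd - lam - Ids*Rs > Vdd - Vtp  =>  Rs < (Vtp - lam) / Ids
    rs_logic1 = (vtp - lam) / ids
    # Logic 0: Ids*Rs < Vtn  =>  Rs < Vtn / Ids
    rs_logic0 = vtn / ids
    return min(rs_logic1, rs_logic0)

VTN, VTP = 0.5, 0.5   # threshold magnitudes in volts (assumed)
LAM = 0.2             # degradation of logic '1' through the nMOS, volts (assumed)
IDS = 10e-6           # amperes (assumed)

rs_max = rs_upper_bound(VTN, VTP, LAM, IDS)
print(f"Rs must stay below about {rs_max / 1e3:.0f} kohm")
```

With these placeholder values the logic 1 constraint is the tighter one, which matches the slide's remark that the bound involves both Rs and λ.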
Effect of Input Slope
Theoretical ub cannot be realized in practice due to noise issues.
Increased resistance degrades the slope of a signal and we use the CMOS gate following it to regenerate the slope.
The regenerative capability of a gate is limited and this governs the practical ub value.
The slope allowed in a design depends on the noise specifications of the circuit.
Single nMOSFET Added Design
Advantages:
  Complete independent control of input delays.
  ub is very high compared to the capacitance manipulation method.
  Very little overhead compared to a conventional buffer.
  Can be integrated into full-custom as well as standard cell place and route design flows.
Design Issues:
  nMOSFET degrades the signal when passing logic 1. Hence, it increases the leakage of the transistors in the fanout stages. However, this is for certain input combinations only.
  Short circuit current is a function of the ratio of input/output slopes. Since we increase the input slope by inserting resistance, it might increase short circuit power by a minor amount.
CMOS Pass Transistor Added Design
The input delay can be added by a CMOS pass transistor in series with the desired input path.
This does not degrade the signal as both transistors together conduct both logic values well.
d3,2 = Ron (Cr + Cin)
d3,1 = Ron (Cr + Cin) + Rs Cin
d3,1 = Output delay + Input delay
Energy = 0.5 (Cr + Cin) V^2
[Figure: gate with paths d3,1 and d3,2; the d3,1 input has series resistance Rs; each driver has on-resistance Ron, routing capacitance Cr, and input capacitance Cin]
Theoretical Calculation of ub
[Figure: logic 1 transmission and logic 0 transmission through Rs carrying current Ids; the transmitted level is 1 - IdsRs for logic 1 and +IdsRs for logic 0; operating regions marked Linear and Cutoff]
For pmos cutoff (pmos threshold):
  1 - IdsRs > Vdd - Vtp
For nmos cutoff (nmos threshold):
  IdsRs < Vtn
Constraints give upper bound on Rs and λ.
Upper bound on Rs determines upper bound on ub.
Can be made specific to any technology.
Note that the resistance is a parallel combination of both the resistances of the transistors.
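The parallel combination mentioned in the last note can be sketched in a few lines of Python. The on-resistance values are hypothetical, and treating each transistor as a fixed resistor is a simplification made here for illustration.

```python
# Sketch: effective series resistance of a CMOS pass (transmission) gate,
# modelled as the nMOS and pMOS on-resistances in parallel.
# Resistance values are hypothetical.

def parallel_resistance(r_n, r_p):
    """Equivalent resistance of two resistors in parallel, in ohms."""
    return (r_n * r_p) / (r_n + r_p)

R_N, R_P = 20e3, 30e3          # ohms (assumed)
rs_eff = parallel_resistance(R_N, R_P)
print(f"Effective Rs = {rs_eff / 1e3:.1f} kohm")
```

The result is always smaller than either individual resistance, which is why the CMOS version needs larger devices than the single nMOS version to reach the same Rs.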
CMOS Pass Transistor Added Design
Advantages:
  No signal degradation for any logic value.
  No increase in leakage current in fanout stage.
  All other advantages of the nMOSFET added design.
Design Issues:
  Two transistors are added instead of one.
  Effective resistance per unit length is lower due to the parallel combination of resistances.
Technology Mapping
Determine sizes of transistors in a gate for the given delay and given load capacitance.
  First guess is given by the look-up table.
  Second stage is sensitivity driven.
  Reduces the complexity of transistor search.
[Flowchart: Delay required -> Look-up table for sizes -> Error acceptable? If yes: Transistor sizes. If no: Sensitivity of each transistor size to delay -> Increment that transistor dimension -> check error again]
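The two-stage flow on this slide can be sketched as a short loop. The Python below is a minimal sketch, not the authors' implementation: the delay model, the look-up table contents, the step size, and the tolerance are hypothetical stand-ins, and the function names are invented here. It also allows a decrement when the first guess overshoots, which the slide does not mention.

```python
# Sketch of the two-stage sizing loop: look-up-table first guess, then
# sensitivity-driven increments. The delay model, table, step size, and
# tolerance are all hypothetical stand-ins, not the authors' implementation.

def gate_delay(lengths, load_cap):
    """Toy delay model: each transistor adds delay proportional to its length."""
    coeffs = (2.0, 1.0, 0.5)   # assumed per-transistor resistance factors
    return sum(k * l for k, l in zip(coeffs, lengths)) * load_cap

def sensitivities(lengths, load_cap, step):
    """Finite-difference change in delay per step for each transistor dimension."""
    base = gate_delay(lengths, load_cap)
    result = []
    for i in range(len(lengths)):
        bumped = list(lengths)
        bumped[i] += step
        result.append(gate_delay(bumped, load_cap) - base)
    return result

def size_gate(target_delay, load_cap, lookup, step=0.05, tol=0.1, max_iter=1000):
    """Return transistor dimensions whose modelled delay is within tol of target."""
    # Stage 1: first guess from the look-up table (nearest tabulated delay).
    nearest = min(lookup, key=lambda d: abs(d - target_delay))
    lengths = list(lookup[nearest])
    # Stage 2: while the error is not acceptable, adjust the most sensitive dimension.
    for _ in range(max_iter):
        error = target_delay - gate_delay(lengths, load_cap)
        if abs(error) <= tol:
            return lengths
        sens = sensitivities(lengths, load_cap, step)
        i = max(range(len(sens)), key=lambda j: abs(sens[j]))
        lengths[i] += step if error > 0 else -step
    raise RuntimeError("sizing did not converge")

LOOKUP = {4.0: (1.0, 1.0, 2.0), 8.0: (2.0, 2.0, 4.0)}   # delay -> dimensions (assumed)
sizes = size_gate(target_delay=5.0, load_cap=1.0, lookup=LOOKUP)
print(sizes, gate_delay(sizes, 1.0))
```

Starting from the table entry closest to the target keeps the number of sensitivity iterations small, which is the complexity reduction the slide refers to.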
Physical Level Verification
c7552 un-optimized:
  Gate Count = 3827
  Transistor Count ≈ 40,000
  Critical Delay = 2.15 ns
  Area = 710 x 710 um^2
c7552 optimized (ub = 10):
  Gate Count = 3828
  Transistor Count ≈ 45,000
  Critical Delay = 2.15 ns
  Area = 760 x 760 um^2 (1.14)
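The overhead ratios can be checked directly from the figures above. This short Python calculation is an illustration added here and uses only the slide's values; the transistor counts are approximate on the slide, so the second ratio is approximate too.

```python
# Overhead ratios computed from the figures on the slide.
area_unopt = 710 * 710          # um^2
area_opt = 760 * 760            # um^2
area_ratio = area_opt / area_unopt

transistors_unopt = 40_000      # approximate, per the slide
transistors_opt = 45_000        # approximate, per the slide
transistor_ratio = transistors_opt / transistors_unopt

print(f"area ratio = {area_ratio:.3f}, transistor ratio = {transistor_ratio:.3f}")
```

The area ratio comes out near 1.146, which appears to correspond to the 1.14 shown in parentheses on the slide, and the transistor count grows by about 12.5%.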
Instantaneous Power Savings
Peak Power Savings = 68%
Average Energy Savings
Average Energy Savings = 58%
Related Publications
Theses
1. "Minimum Dynamic Power Design with Variable Input Delay Logic," PhD Thesis, Dept. of Elec. and Comp. Eng., Rutgers University, May 2004.
2. "Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program," MS Thesis, Dept. of Elec. and Comp. Eng., Rutgers University, May 2002.
Journal Papers
1. T. Raja, V. D. Agrawal and M. L. Bushnell, "Low Power CMOS Design for Minimum Power and Highest Speed using a New Gate Design," submitted to IEEE Transactions on VLSI (IEEETVLSI), April 2005.
Conference Papers
1. T. Raja, V. D. Agrawal and M. L. Bushnell, "Design of Variable Input Delay Logic for Low Dynamic Power Circuits," Proc. of PATMOS Conf., Sep 2005.
2. T. Raja, V. D. Agrawal and M. L. Bushnell, "Variable Input Delay Logic and its Application to Low Power Design," Proc. 18th Int'l. Conference on VLSI Design, Jan 2005.
3. T. Raja, V. D. Agrawal and M. L. Bushnell, "CMOS Design of Circuits for Minimum Power and Highest Speed," Proc. 17th Int'l. Conference on VLSI Design, Jan 2004.
4. T. Raja, V. D. Agrawal and M. L. Bushnell, "Minimum Dynamic Power Design of CMOS Circuits using a Reduced Constraint Set Linear Program," Proc. 16th Int'l. Conf. on VLSI Design, pp. 527-532, Jan 2003.
Conclusion
Pass transistor (nMOS and CMOS) can be used as a delay element instead of a buffer.
There are limitations to the size of the transmission gate used based on:
  Input slope degradation.
  Signal degradation when passing a high signal through nMOS.
Transmission gate can be used for delay as long as the delay does not exceed ub.
Described the technique to calculate ub for a given technology.
Described the algorithm for sizing of the three variable input delay gates for given delay requirements.
Presented results on power savings using these new gates.
Future Work:
  Include leakage power in the analysis.
  Analyze results for more recent technologies.
Thank you
Design Issues and FAQ
Is this not similar to input re-ordering techniques?
  Input re-ordering can change only the rise or fall delay but not both.
  The capacitance manipulation method also cannot have completely independent control over both rise and fall delays, but input re-ordering has zero control.
  The ub obtained by input re-ordering is much smaller than what can be obtained by capacitance manipulation.
Design Issues and FAQ
Does this increase leakage power?
  Observed no increase for 0.25u technology.
  Need to investigate for present technologies.
  Can be complemented with known leakage reduction techniques.
How big should the standard cell library be?
  For c7552 with 3827 gates, we needed 155 different standard cells generated by Prolific.
  Area can be further reduced if these cells are custom designed.
Transistor Overhead
[Figure: transistor overhead chart; entries 1 to 6 are labelled as follows]
  1, 4: nMOS added design (for maxdelay = 1 and 2)
  2, 5: CMOS added design (for maxdelay = 1 and 2)
  3, 6: Buffer added design (for maxdelay = 1 and 2)