TRANSCRIPT
مرتضي صاحب الزماني 1
Synthesis
مرتضي صاحب الزماني 2
What is Synthesis?
• Transformation of an abstract description into a more detailed description
• "+" operator is transformed into a gate netlist
• "if (VEC_A = VEC_B) then" is realized as a comparator which controls a multiplexer
• Transformation depends on several factors
• Simple operators (such as AND, OR, comparison) are converted into specific gates, whereas more complex operators such as multiplication are first converted into the tool's own macrocells.
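The comparator-plus-multiplexer mapping mentioned above can be sketched as follows. This is a minimal illustration, not taken from the slides: the entity name CMP_MUX, the data ports D0 and D1, and the 4-bit widths are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: names and widths are assumptions for illustration.
entity CMP_MUX is
  port (
    VEC_A, VEC_B : in  std_logic_vector(3 downto 0);
    D0, D1       : in  std_logic_vector(3 downto 0);
    Z            : out std_logic_vector(3 downto 0)
  );
end CMP_MUX;

architecture RTL of CMP_MUX is
begin
  process (VEC_A, VEC_B, D0, D1)
  begin
    if (VEC_A = VEC_B) then  -- typically realized as an equality comparator
      Z <= D1;               -- the comparator output selects this mux input
    else
      Z <= D0;
    end if;
  end process;
end RTL;
```

A synthesis tool would typically turn the condition into a 4-bit equality comparator whose single-bit result drives the select input of a 2-to-1 multiplexer; the exact gates depend on the tool and target library.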
مرتضي صاحب الزماني 3
Field Programmable Gate Array (FPGA)
مرتضي صاحب الزماني 4
Design cycle for FPLDs
• Advantages:
• Shorter design process.
• More innovation (the design process moves to higher, behavioral levels) (similar to high-level languages).
• Debugging the design is much easier and faster.
• Changes to the design are much easier.
• Similar to the programming cycle:
[Figure: programming cycle (write program, compile, run, edit) next to the design cycles (design entry, compile, simulate, edit) and (synthesize, simulate, edit)]
مرتضي صاحب الزماني 5
Synthesizability
• Only a subset of VHDL is synthesizable
• Different Tools support different subsets
• records?
• arrays of integers?
• clock edge detection?
• sensitivity list?
• ...
مرتضي صاحب الزماني 6
Different Language Support for Synthesis
مرتضي صاحب الزماني 7
How to Do?
• Macrocells
  • adder
  • comparator
  • bus interface
• Constraints
  • speed
  • area
  • power
• Optimizations
  • boolean: mathematic
  • gate: technological
مرتضي صاحب الزماني 8
Non-functional requirements
• Performance:
– Clock speed is generally a primary requirement.
– Usually expressed as a lower bound.
• Design cycle and Timing Closure
• Size:
– Determines manufacturing cost.
– If your design doesn’t fit into one size FPGA, you must use the next larger FPGA.
– For very large designs: multi-FPGAs.
• Power/energy:
– Power/Energy related to battery life and heat.
• May have more cost:
– More expensive packaging to dissipate heat.
– More extreme measures (e.g. cooling fans).
– Many digital systems are power- or energy-limited.
مرتضي صاحب الزماني 9
Mapping into an FPGA
• Must choose the FPGA:
– Capacity.
– Pinout/package type.
– Maximum speed.
مرتضي صاحب الزماني 10
Synthesis Process in Practice
• Despite the optimization mechanisms, some constraints may still be unmet after synthesis, so the flow is iterated.
مرتضي صاحب الزماني 11
Path delay
• Combinational network delay is measured over paths through network.
• Can trace a causality chain from inputs to worst-case output.
مرتضي صاحب الزماني 12
Path delay example
[Figure: a combinational network and its graph model]
مرتضي صاحب الزماني 13
Critical path
• Critical path = path which creates longest delay.
• Can trace transitions which cause delays that are elements of the critical delay path.
مرتضي صاحب الزماني 14
Critical path through delay graph
مرتضي صاحب الزماني 15
Delay Paths in a design
مرتضي صاحب الزماني 16
False paths
• Logic gates are not simple nodes—some input changes don’t cause output changes.
• A false path is a path which never happens due to Boolean gate conditions.
• False paths cause pessimistic delay estimates.
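A small circuit makes the idea of a false path concrete. The sketch below is an illustration only; the entity FALSE_PATH and its signal names are assumptions, not part of the slides.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: two multiplexers sharing one select signal.
entity FALSE_PATH is
  port (
    A, B, C, S : in  std_logic;
    Z          : out std_logic
  );
end FALSE_PATH;

architecture RTL of FALSE_PATH is
  signal Y1 : std_logic;
begin
  Y1 <= A when S = '1' else B;   -- first mux passes A only when S = '1'
  Z  <= C when S = '1' else Y1;  -- second mux passes Y1 only when S is not '1'
end RTL;
```

The structural path A -> Y1 -> Z exists in the netlist, but it can never be sensitized: the first mux needs S = '1' and the second needs the opposite. A purely topological timing analysis would still count it, which is the source of the pessimistic estimate.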
مرتضي صاحب الزماني 17
Placement and delay
• Placement helps determine routing.
• Routing determines wire length.
• Wire length determines capacitive load.
• Capacitive load determines delay.
مرتضي صاحب الزماني 18
Example: Adder placement and delay
• N-bit adder: (optimal placement)
+ + + +
مرتضي صاحب الزماني 19
Bad placement and routing
placement routing
With no delay constraints.
مرتضي صاحب الزماني 20
Bad placement and routing
• Adder has been distributed throughout the FPGA.
• I/O pins have been spread around the chip.
• P&R algorithms do not catch on to regularity.
مرتضي صاحب الزماني 21
Better placement and routing
With delay constraints.
• Better but far from optimal (less spread out horizontally but spread out vertically)
مرتضي صاحب الزماني 22
How to improve?
• Use macros (optimized),
• Put constraints on the placement of objects,
• Hand place objects.
– Example: later.
مرتضي صاحب الزماني 23
Power Optimization
مرتضي صاحب الزماني 24
Power optimization
• Transitions cause power consumption.
• Logic network design helps control power consumption:
– minimizing capacitance;
– eliminating unnecessary glitches.
مرتضي صاحب الزماني 25
Power optimization
• Leakage in more advanced processes.
– Even when logic is idle.
– The only way: disconnect the power supply from the logic when not needed for some time.
– It generally takes a considerable period (larger than a clock period) to reconnect power and let the circuits stabilize.
مرتضي صاحب الزماني 26
Glitching example
• Gate network:
مرتضي صاحب الزماني 27
Glitching example behavior
• NOR gate produces 0 output at beginning and end:
– beginning: bottom input is 1;
– end: NAND output is 1;
• Difference in delay between application of primary inputs and generation of new NAND output causes glitch.
مرتضي صاحب الزماني 28
Adder Chain Glitching
[Figure: bad = unbalanced chain computing a+b, then a+b+c, then a+b+c+d; good = balanced tree computing a+b and c+d in parallel, then a+b+c+d]
مرتضي صاحب الزماني 29
Explanation
• Unbalanced chain has signals arriving at different times at each adder.
• A glitch early in the chain propagates through every adder after it.
• Balanced tree introduces multiple glitches simultaneously, reducing total glitch activity.
مرتضي صاحب الزماني 30
Factorization for low power
• Proper factorization reduces glitching.
[Figure: bad and good factorizations of the same function; a has high transition probability]
مرتضي صاحب الزماني 31
Factorization techniques
• In example, a has high transition probability, b and c low probabilities.
• Reduce number of logic levels through which high-probability signals must travel in order to reduce propagation of glitches.
مرتضي صاحب الزماني 32
Example (ALU)
• The ALU output is not used in every cycle.
• If the ALU inputs change while the output is unused, energy is needlessly consumed.
مرتضي صاحب الزماني 33
Example (ALU)
• Control Signal selects whether data is allowed to pass the logic or the previous value is held to avoid transitions.
[Figure: Data passes through a D/Q storage element enabled by Control before reaching the Logic block]
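One way to express this input-holding idea in VHDL is sketched below. It is an assumption-laden illustration: the entity ALU_ISOLATE, the 8-bit operands, the use of clocked registers (rather than latches) as the holding element, and the adder standing in for the ALU are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: operands are held unless CONTROL = '1'.
entity ALU_ISOLATE is
  port (
    CLK            : in  std_logic;
    CONTROL        : in  std_logic;
    DATA_A, DATA_B : in  unsigned(7 downto 0);
    RESULT         : out unsigned(7 downto 0)
  );
end ALU_ISOLATE;

architecture RTL of ALU_ISOLATE is
  signal A_HELD, B_HELD : unsigned(7 downto 0) := (others => '0');
begin
  HOLD: process (CLK)
  begin
    if rising_edge(CLK) then
      if CONTROL = '1' then   -- new operands pass only when the result is needed
        A_HELD <= DATA_A;
        B_HELD <= DATA_B;
      end if;                 -- otherwise the previous values are kept
    end if;
  end process HOLD;

  RESULT <= A_HELD + B_HELD;  -- an adder stands in for the ALU logic
end RTL;
```

While CONTROL is '0', the inputs of the logic block do not toggle, so no transitions ripple through it. The price is the extra storage elements and, in this registered variant, one cycle of latency.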
مرتضي صاحب الزماني 34
Layout for low power
• Place and route to minimize capacitance of nodes with high glitching activity.
• Feed back wiring capacitance values to power analysis for better estimates.
مرتضي صاحب الزماني 35
State assignment for low power
• Later
مرتضي صاحب الزماني 36
Case Study
• 16 x 16 multiplier example.
مرتضي صاحب الزماني 37
The FPGA design process
• Xilinx ISE (Integrated Synthesis Environment)
– Translation from HDL. (Synthesis, Translation)
– Logic synthesis. (Mapping)
– Placement and routing. (Place and Route)
– Configuration generation. (Program File Generation)
مرتضي صاحب الزماني 38
Design experiments
• Synthesize with no constraints.
• Synthesize with timing constraint.
– Tighten timing constraint.
• Synthesize with placement constraints.
• Power:
– Many tools don’t allow us to directly specify power consumption; we must rewrite our hardware description for better power consumption characteristics.
مرتضي صاحب الزماني 39
Post-translation simulation model
• No timing or area constraints
• HDL model in terms of FPGA primitives.
• Example:
X_LUT4 \p12_Madd__n0015_Mxor_Result_Xo<1>1 (
  .ADR0(x_7_IBUF),
  .ADR1(y_13_IBUF),
  .ADR2(c12[7]),
  .ADR3(row12[8]),
  .O(row13[7])
);
مرتضي صاحب الزماني 40
Mapping report
Design Summary
--------------
Number of errors:      0
Number of warnings:    0
Logic Utilization:
  Number of 4 input LUTs:                            501 out of 1,024   48%
Logic Distribution:
  Number of occupied Slices:                         255 out of   512   49%
  Number of Slices containing only related logic:    255 out of   255  100%
  Number of Slices containing unrelated logic:         0 out of   255    0%
  *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:                           501 out of 1,024   48%
Number of bonded IOBs:                                64 out of    92   69%

Total equivalent gate count for design:  3,006
Additional JTAG gate count for IOBs:     3,072
Peak Memory Usage:  64 MB
مرتضي صاحب الزماني 42
Static timing analysis report
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS ;
20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is 20.916ns.
--------------------------------------------------------------------------------
After Mapping: estimated delays (no information about interconnects)
مرتضي صاحب الزماني 43
Static timing report: delays along paths
Data Sheet report:
-----------------
All values displayed in nanoseconds (ns)

Pad to Pad
---------------+-----------------+---------+
Source Pad     | Destination Pad |  Delay  |
---------------+-----------------+---------+
x<0>           | p<0>            |   5.824 |
x<0>           | p<10>           |  10.675 |
x<0>           | p<11>           |  11.214 |
x<0>           | p<12>           |  11.753 |
مرتضي صاحب الزماني 45
Static timing after routing
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 99.999 uS ;
20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is 38.424ns.
-------------------------------------------------------------------
• (vs 20.916 ns in mapping report) Because of interconnect delays.
مرتضي صاحب الزماني 46
Timing constraint
• Use timing constraint editor:
مرتضي صاحب الزماني 47
Post-map static timing report
Timing constraint: TS_P2P = MAXDELAY FROM
TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is 20.916ns.
Pad to pad
Hasn’t changed since this design has limited opportunities for logic synthesis to change delays by restructuring logic.
مرتضي صاحب الزماني 48
Post-routing static timing report
Timing constraint: TS_P2P = MAXDELAY FROM
TIMEGRP "PADS" TO TIMEGRP "PADS" 32 nS ;
20135312 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is 31.984ns.
Tools generally try to meet the delay goal as closely as possible to minimize area.
مرتضي صاحب الزماني 49
Tighter timing constraints
• Tighten requirement to 25 ns.
• Post-place-route timing report:
Timing constraint: TS_P2P = MAXDELAY FROM TIMEGRP "PADS" TO TIMEGRP "PADS" 25 nS ;
20135312 items analyzed, 11 timing errors detected. (11 setup errors, 0 hold errors)
Maximum delay is 31.128ns.
مرتضي صاحب الزماني 50
Report on a violated path
Slack:            -6.128ns (requirement - data path)
Source:           y<0> (PAD)
Destination:      p<30> (PAD)
Requirement:      25.000ns
Data Path Delay:  31.128ns (Levels of Logic = 31)
Modify the logic and/or physical design to improve the delay.
مرتضي صاحب الزماني 51
Power report
Power summary:                          I(mA)    P(mW)
----------------------------------------------------------------
Total estimated power consumption:                 333
                                  ---
          Vccint 1.50V:                     0        0
          Vccaux 3.30V:                   100      330
          Vcco33 3.30V:                     1        3
                                  ---
                Inputs:                     0        0
                 Logic:                     0        0
               Outputs:
                Vcco33                      0        0
               Signals:                     0        0
                                  ---
Quiescent Vccaux 3.30V:                   100      330
Quiescent Vcco33 3.30V:                     1        3

Thermal summary:
----------------------------------------------------------------
Estimated junction temperature:   36C
Ambient temp:                     25C
Case temp:                        35C
Theta J-A:                        34C/W
Helps us determine whether we need additional cooling.
مرتضي صاحب الزماني 52
Improving area
• Floorplanner window:
– Floorplanner: View/edit placed design
[Figure: floorplanner window showing the LEs and the chip floorplan]
• Green rectangles: components mapped to CLBs
مرتضي صاحب الزماني 53
Rat’s nest wiring
• If you click on a component in the design hierarchy window, its rat’s nest is shown.
مرتضي صاحب الزماني 54
Routing editor view
• FPGA Editor: View/Edit Routed Design
مرتضي صاحب الزماني 55
Editing constraints
• Use constraints editor to place constraints:
– This tool allows you to constrain the placement of logic as well as the assignment of chip I/Os to IOBs (e.g. useful for PCB design)
مرتضي صاحب الزماني 56
Design browser pane
مرتضي صاحب الزماني 57
Drag and drop constraints
مرتضي صاحب الزماني 58
Change the shape of constraints
مرتضي صاحب الزماني 59
Full set of placement constraints
• We place the rows of the multiplier one below the other to create the row structure of the floorplan.
مرتضي صاحب الزماني 60
Placement results
مرتضي صاحب الزماني 61
New timing report
• After placement constraints:
19742142 items analyzed, 0 timing errors detected. (0 setup errors, 0 hold errors)
Maximum delay is 29.934ns.
• Compares to 31 ns for unconstrained placement.
مرتضي صاحب الزماني 62
Combinational Process: Sensitivity List
library IEEE;
use IEEE.Std_Logic_1164.all;

entity IF_EXAMPLE is
  port (A, B, C, X : in  std_ulogic_vector(3 downto 0);
        Z          : out std_ulogic_vector(3 downto 0));
end IF_EXAMPLE;

architecture A of IF_EXAMPLE is
begin
  process (A, B, C, X)
  begin
    if ( X = "1110" ) then
      Z <= A;
    elsif (X = "0101") then
      Z <= B;
    else
      Z <= C;
    end if;
  end process;
end A;
مرتضي صاحب الزماني 63
Combinational Process: Sensitivity List
process (A, B, SEL)
begin
  if SEL = '1' then
    Z <= A;
  else
    Z <= B;
  end if;
end process;

• If SEL is missing in the sensitivity list, what will the behavior (simulation) be?
• Sensitivity list is usually ignored during synthesis.
• For equivalent behavior of the simulation model and the hardware, all signals which are read must be entered into the sensitivity list.
• Use a complete if-statement for the synthesis of combinational logic.
مرتضي صاحب الزماني 64
Combinational Process:Incomplete Assignments
library IEEE;
use IEEE.Std_Logic_1164.all;

entity INCOMP_IF is
  port (A, B, SEL : in  std_ulogic;
        Z         : out std_ulogic);
end INCOMP_IF;

architecture RTL of INCOMP_IF is
begin
  process (A, B, SEL)
  begin
    if SEL = '1' then
      Z <= A;
    end if;
  end process;
end RTL;

• What is the value of Z, if SEL = '0'?
• What hardware would be generated during synthesis?
• A latch that is transparent when SEL = '1' (transparent latch).
• It is probably unintended.
• In synchronous circuits flip-flops are also preferable, because they keep invalid intermediate signal values from passing on before the combinational circuit has settled.
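A common way to avoid the unintended latch is to give the output a default value at the top of the process. The sketch below assumes that B is the desired value when SEL is not '1'; that choice, and the entity name COMPLETE_IF, are illustrative assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical latch-free variant of INCOMP_IF.
entity COMPLETE_IF is
  port (A, B, SEL : in  std_ulogic;
        Z         : out std_ulogic);
end COMPLETE_IF;

architecture RTL of COMPLETE_IF is
begin
  process (A, B, SEL)
  begin
    Z <= B;              -- default assignment: Z is driven on every path
    if SEL = '1' then
      Z <= A;            -- overrides the default when selected
    end if;
  end process;
end RTL;
```

Because Z receives a value on every pass through the process, no storage is implied and a plain 2-to-1 multiplexer is expected from synthesis.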
مرتضي صاحب الزماني 65
Modeling of Flip-Flops
library IEEE;
use IEEE.Std_Logic_1164.all;

entity FLOP is
  port (D, CLK : in  std_ulogic;
        Q      : out std_ulogic);
end FLOP;

architecture A of FLOP is
begin
  process
  begin
    wait until CLK'event and CLK = '1';
    Q <= D;
  end process;
end A;
مرتضي صاحب الزماني 66
Description of Rising Clock Edge for Synthesis
• Standard for synthesis: IEEE 1076.6
• ... if condition
    RISING_EDGE(clock_signal_name) (not always supported)
    clock_signal_name'EVENT and clock_signal_name='1'
    clock_signal_name='1' and clock_signal_name'EVENT
    not clock_signal_name'STABLE and clock_signal_name='1'
    clock_signal_name='1' and not clock_signal_name'STABLE
• Synthesizers usually ignore the sensitivity list.
• They also do not support every form of wait.
• For memory elements: use if or wait until in one of these specific forms.
مرتضي صاحب الزماني 67
Description of Rising Clock Edge for Synthesis
• ... wait until condition
    RISING_EDGE(clock_signal_name)
    clock_signal_name'EVENT and clock_signal_name='1'
    clock_signal_name='1' and clock_signal_name'EVENT
    not clock_signal_name'STABLE and clock_signal_name='1'
    clock_signal_name='1' and not clock_signal_name'STABLE
    clock_signal_name='1'
• IEEE 1076.6 is not yet fully supported by all tools
مرتضي صاحب الزماني 68
Description of Rising Clock Edge for Synthesis
• In Std_Logic_1164 package

process
begin
  wait until RISING_EDGE(CLK);
  Q <= D;
end process;

function RISING_EDGE (signal CLK : std_ulogic) return boolean is
begin
  if (CLK'event and CLK = '1' and CLK'last_value = '0') then
    return true;
  else
    return false;
  end if;
end RISING_EDGE;
مرتضي صاحب الزماني 69
Gated Clock
• Designers avoid using gated clocks because of the problematic timing behavior of the circuit (adds skew).
• Low power designs deliberately disable clocks to reduce or eliminate power wasted by useless switching of transistors.

process
begin
  wait until RISING_EDGE(CLK);
  if (DGATE) then
    Q <= D;
  end if;
end process;

[Figure: mux and DFF with signals DGATE, CLK, D, Q]
مرتضي صاحب الزماني 70
Register Inference
library IEEE;
use IEEE.Std_Logic_1164.all;

entity COUNTER is
  port ( CLK : in  std_ulogic;
         Q   : out integer range 0 to 15 );
end COUNTER;

architecture A of COUNTER is
  signal COUNT : integer range 0 to 15;
begin
  process (CLK)
  begin
    if CLK'event and CLK = '1' then
      if (COUNT >= 9) then
        COUNT <= 0;
      else
        COUNT <= COUNT + 1;
      end if;
    end if;
  end process;
  Q <= COUNT;
end A;

• One-digit BCD counter
• For all signals which receive an assignment in clocked processes, memory is synthesized.
• COUNT: 4 FFs (constrained integer)
• Q is not used in a clocked process.
• Drawback: there is no reset mechanism at power-up.
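The missing power-up reset could be added as in the following sketch. It assumes an active-high asynchronous reset named RST; the port name, its polarity, and the entity name COUNTER_RST are not given in the slides.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical BCD counter with an assumed active-high asynchronous reset.
entity COUNTER_RST is
  port ( CLK, RST : in  std_ulogic;
         Q        : out integer range 0 to 15 );
end COUNTER_RST;

architecture A of COUNTER_RST is
  signal COUNT : integer range 0 to 15;
begin
  process (CLK, RST)
  begin
    if RST = '1' then                  -- forces a known state at power-up
      COUNT <= 0;
    elsif CLK'event and CLK = '1' then
      if COUNT >= 9 then
        COUNT <= 0;
      else
        COUNT <= COUNT + 1;
      end if;
    end if;
  end process;
  Q <= COUNT;
end A;
```

RST appears in the sensitivity list because it acts independently of the clock.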
مرتضي صاحب الزماني 71
Asynchronous Set/Reset
library IEEE;
use IEEE.Std_Logic_1164.all;

entity ASYNC_FF is
  port ( D, CLK, SET, RST : in  std_ulogic;
         Q                : out std_ulogic);
end ASYNC_FF;

architecture A of ASYNC_FF is
begin
  process (CLK, RST, SET)
  begin
    if (RST = '1') then
      Q <= '0';
    elsif SET = '1' then
      Q <= '1';
    elsif (CLK'event and CLK = '1') then
      Q <= D;
    end if;
  end process;
end A;

• if/elsif structure
• The last elsif has an edge
• No else
• For a synchronous set/reset, only clk goes into the sensitivity list (it can also be modeled with wait until).
• An asynchronous set/reset can only be modeled with a sensitivity list.
• All asynchronous inputs must be in the sensitivity list; otherwise the simulation result differs from the synthesized hardware.
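For contrast, a synchronous reset modeled with wait until might look like the sketch below. The entity name SYNC_FF and the active-high RST are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical flip-flop with a synchronous, active-high reset.
entity SYNC_FF is
  port ( D, CLK, RST : in  std_ulogic;
         Q           : out std_ulogic);
end SYNC_FF;

architecture A of SYNC_FF is
begin
  process                                -- no sensitivity list: wait until is used
  begin
    wait until CLK'event and CLK = '1';
    if RST = '1' then                    -- reset is sampled only at the clock edge
      Q <= '0';
    else
      Q <= D;
    end if;
  end process;
end A;
```

Here RST is just another data input to the flip-flop logic, so it takes effect only on a rising clock edge.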
مرتضي صاحب الزماني 72
Coding Style Influence

EXAMPLE1: process (SEL, A, B, C)   -- C added: it is read in the process
begin
  if SEL = '1' then
    Z <= A + B;
  else
    Z <= A + C;
  end if;
end process EXAMPLE1;
• Direct implementation

EXAMPLE2: process (SEL, A, B, C)   -- C added: it is read in the process
  variable TMP : bit;
begin
  if SEL = '1' then
    TMP := B;
  else
    TMP := C;
  end if;
  Z <= A + TMP;
end process EXAMPLE2;
• Manual resource sharing
• Needs only one adder.
• If SEL arrives late, the first circuit (direct implementation) is faster.
مرتضي صاحب الزماني 73
Source Code Optimization
• An operation can be described very efficiently for synthesis, e.g.:
• In one description the longest path goes via five, in the other description via three addition components - some optimization tools automatically change the description according to the given constraints.
مرتضي صاحب الزماني 74
Source Code Optimization
• If one of the inputs arrives later than others, it can be chosen for IN6 in the left implementation.
• If power is a consideration, IN6 could be used for the signal that changes more frequently in the left implementation since it passes through only one adder.
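The two descriptions discussed on these slides are not reproduced in the transcript. A plausible reconstruction, offered only as an assumption, is a six-input sum written once as a chain and once with parentheses. The entity ADD6, the 8-bit unsigned ports, and the output names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical six-input adder in two coding styles.
entity ADD6 is
  port (
    IN1, IN2, IN3, IN4, IN5, IN6 : in  unsigned(7 downto 0);
    OUT_CHAIN, OUT_TREE          : out unsigned(7 downto 0)
  );
end ADD6;

architecture RTL of ADD6 is
begin
  -- Evaluated left to right: IN1 passes through five adders, IN6 through one.
  OUT_CHAIN <= IN1 + IN2 + IN3 + IN4 + IN5 + IN6;

  -- Parentheses suggest a tree: the longest path has three adders.
  OUT_TREE  <= (IN1 + IN2) + (IN3 + IN4) + (IN5 + IN6);
end RTL;
```

Both outputs compute the same sum (modulo 256 with these widths). Whether the synthesized structure follows the source grouping depends on the tool and the constraints, as the slide notes.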
مرتضي صاحب الزماني 75
Operator Synthesis
• Depending on the operator and the library elements (expressed as standard gates, or as logic cells, or as macrocells in an ASIC), a module is created in the netlist.
• These modules are optimized for speed or for area (as specified by the user).
• In some synthesizers, comments can be written in the VHDL code to select, for example, a Carry-Lookahead or a Ripple Carry implementation.
مرتضي صاحب الزماني 76
Example: Adder

entity ADD is
  port (A, B : in  integer range 0 to 7;
        Z    : out integer range 0 to 15);
end ADD;

architecture ARITHMETIC of ADD is
begin
  Z <= A + B;
end ARITHMETIC;

• Notice: advantages of a range declaration with integer types:
  a) During simulation: check for "out of range..."
  b) During synthesis: only 4 bit bus width

library VENDOR_XY;
use VENDOR_XY.p_arithmetic.all;

entity MVL_ADD is
  port (A, B : in  stdlogic_vector (3 downto 0);
        Z    : out stdlogic_vector (4 downto 0) );
end MVL_ADD;

architecture ARITHMETIC of MVL_ADD is
begin
  Z <= A + B;   -- not allowed
end ARITHMETIC;
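The slide does not say why the vector addition is rejected. One likely reading, stated here as an assumption, is that a 4-bit sum is assigned to a 5-bit target. With the standard ieee.numeric_std package (used here in place of the vendor package), the widths can be made explicit; the entity name STD_ADD is hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical adder with an explicit carry bit, using numeric_std.
entity STD_ADD is
  port (A, B : in  std_logic_vector(3 downto 0);
        Z    : out std_logic_vector(4 downto 0));
end STD_ADD;

architecture ARITHMETIC of STD_ADD is
begin
  -- Both operands are zero-extended to 5 bits so the carry is kept
  -- and the result width matches the target.
  Z <= std_logic_vector(resize(unsigned(A), 5) + resize(unsigned(B), 5));
end ARITHMETIC;
```

The conversions make the unsigned interpretation of the vectors explicit, which the bare "+" on logic vectors leaves to the package in use.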
مرتضي صاحب الزماني 77
IF Structure <-> CASE Structure
· · ·
if (IN > 17) then
  OUT <= A ;
elsif (IN < 17) then
  OUT <= B ;
else
  OUT <= C ;
end if ;
· · ·

• Different descriptions may be synthesized differently

· · ·
case IN is
  when 0 to 16 => OUT <= B ;
  when 17      => OUT <= C ;
  when others  => OUT <= A ;
end case ;
· · ·

• Synthesizers may optimize both descriptions to the same result.
مرتضي صاحب الزماني 78
Variables in Clocked Processes
VAR_1: process(CLK)
  variable TEMP : integer;
begin
  if (CLK'event and CLK = '1') then
    TEMP := INPUT * 2;
    OUTPUT_A <= TEMP + 1;
    OUTPUT_B <= TEMP + 2;
  end if;
end process VAR_1;

• Registers are generated for all variables that might be read before they are updated

VAR_2: process(CLK)
  variable TEMP : integer;
begin
  if (CLK'event and CLK = '1') then
    OUTPUT <= TEMP + 1;
    TEMP := INPUT * 2;
  end if;
end process VAR_2;

• How many registers are generated?
مرتضي صاحب الزماني 79
ELSE for Clock Checking
process(CLK)
begin
  if (CLK'event and CLK = '1') then
    Q <= D;
  else
    Q <= A;
  end if;
end process;

• Most synthesizers refuse to synthesize a clock-edge test that has an else branch (they do not know how to map it to hardware).
مرتضي صاحب الزماني 80
Don’t Care
• In simulators, a comparison with '-' in a condition generally yields FALSE (the signal value is never equal to '-'):
    when (a = "1---") ....
• For example, if a = "1000", the condition does not become TRUE.
• For this purpose, std_match(s1, s2) (in the numeric_std package) can be used:
    when (std_match(a, "1---")) ....
• It tests all cases covered by '-'.
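A minimal sketch of std_match in a concurrent assignment is shown below. The entity MATCH_DEMO and its ports are assumptions for illustration; std_match itself is provided by ieee.numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical decoder: HIT is '1' whenever the top bit of A is '1'.
entity MATCH_DEMO is
  port (A   : in  std_logic_vector(3 downto 0);
        HIT : out std_logic);
end MATCH_DEMO;

architecture RTL of MATCH_DEMO is
begin
  -- std_match treats '-' as a wildcard, unlike the "=" operator.
  HIT <= '1' when std_match(A, "1---") else '0';
end RTL;
```

With A = "1000", the plain comparison A = "1---" is false, while std_match(A, "1---") is true because the three low bits are treated as don't cares.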
مرتضي صاحب الزماني 81
Synthesis Tips
• Loops must have a fixed range
• 'while' constructs usually cannot be synthesized
• Shared variables: not to be used in synthesizable code
• real, character and time: not synthesizable; their use is limited to testbenches and modeling.
• Some aggregate constructs may not be supported by your synthesis tool
• Synthesis tools map enumerations onto a suitable bit pattern automatically.
مرتضي صاحب الزماني 82
Synthesis Tips
Arrays: Only integer index sets are supported by all synthesis tools
Some synthesizers support up to two dimensions
Aliases are not always supported by synthesis tools
Procedures: Default parameters should not be used in synthesizable code (since some synthesizers initialize absent parameters with type'left, inconsistent with simulation)
مرتضي صاحب الزماني 83
Finite State Machines and VHDL
• One-, two- or three-processes
• State Coding
• FSM Types
  • Medvedev
  • Moore
  • Mealy
  • Registered Output
مرتضي صاحب الزماني 84
One-Process FSM
FSM_FF: process (CLK, RESET)
begin
  if RESET = '1' then
    STATE <= START ;
  elsif CLK'event and CLK = '1' then
    case STATE is
      when START =>
        if X = GO_MID then
          STATE <= MIDDLE ;
        end if ;
      when MIDDLE =>
        if X = GO_STOP then
          STATE <= STOP ;
        end if ;
      when STOP =>
        if X = GO_START then
          STATE <= START ;
        end if ;
      when others =>
        STATE <= START ;
    end case ;
  end if ;
end process FSM_FF ;
مرتضي صاحب الزماني 85
Two-Process FSM
FSM_LOGIC: process ( STATE , X)
begin
  case STATE is
    when START =>
      if X = GO_MID then
        NEXT_STATE <= MIDDLE ;
      end if ;
    when MIDDLE => ...
    when others => NEXT_STATE <= START ;
  end case ;
end process FSM_LOGIC ;

FSM_FF: process (CLK, RESET)
begin
  if RESET = '1' then
    STATE <= START ;
  elsif CLK'event and CLK = '1' then
    STATE <= NEXT_STATE ;
  end if;
end process FSM_FF ;
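In the combinational process above, NEXT_STATE is not assigned when a condition is false, which by the incomplete-assignment rule discussed earlier would imply latches. A complete, self-contained sketch with a default assignment follows. The entity FSM2, the 2-bit encodings of GO_MID, GO_STOP and GO_START, and the IN_STOP output are assumptions made only so the example is complete.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-process FSM with a default next-state assignment.
entity FSM2 is
  port (
    CLK, RESET : in  std_ulogic;
    X          : in  std_ulogic_vector(1 downto 0);
    IN_STOP    : out std_ulogic
  );
end FSM2;

architecture RTL of FSM2 is
  type STATE_TYPE is (START, MIDDLE, STOP);
  signal STATE, NEXT_STATE : STATE_TYPE;
  -- Assumed command encodings (not given in the slides).
  constant GO_MID   : std_ulogic_vector(1 downto 0) := "01";
  constant GO_STOP  : std_ulogic_vector(1 downto 0) := "10";
  constant GO_START : std_ulogic_vector(1 downto 0) := "11";
begin
  FSM_LOGIC: process (STATE, X)
  begin
    NEXT_STATE <= STATE;   -- default: stay in the current state
    case STATE is
      when START =>
        if X = GO_MID then NEXT_STATE <= MIDDLE; end if;
      when MIDDLE =>
        if X = GO_STOP then NEXT_STATE <= STOP; end if;
      when STOP =>
        if X = GO_START then NEXT_STATE <= START; end if;
    end case;
  end process FSM_LOGIC;

  FSM_FF: process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START;
    elsif CLK'event and CLK = '1' then
      STATE <= NEXT_STATE;
    end if;
  end process FSM_FF;

  IN_STOP <= '1' when STATE = STOP else '0';
end RTL;
```

The default assignment keeps the next-state logic purely combinational while preserving the "stay in state" behavior of the one-process version.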
مرتضي صاحب الزماني 86
How Many Processes?
• Structure and Readability
  • Asynchronous combinatoric ≠ synchronous storing elements => 2 processes
  • Graphical FSM (without output equations) resembles one state process => 1 process
• Simulation
  • Error detection easier with two state processes due to access to intermediate signals => 2 processes
• Synthesis
  • 2 state processes can lead to smaller generic net list and therefore to better synthesis results (depends on synthesizer but in general, it is closer to hardware) => 2 processes
مرتضي صاحب الزماني 87
State Encoding
type STATE_TYPE is ( START, MIDDLE, STOP ) ;
signal STATE : STATE_TYPE ;

• State encoding is responsible for the safety of the FSM
• Default encoding: binary
    START  -> "00"
    MIDDLE -> "01"
    STOP   -> "10"
• Speed optimized default encoding: one hot
    START  -> "001"
    MIDDLE -> "010"
    STOP   -> "100"
• If $\log_2(\#\,\text{states}) \neq \lceil \log_2(\#\,\text{states}) \rceil$, i.e. the number of states is not a power of two, some bit patterns are unused => unsafe FSM!
مرتضي صاحب الزماني 88
Encoding of CASE Statement
type STATE_TYPE is (START, MIDDLE, STOP) ;
signal STATE : STATE_TYPE ;
· · ·
case STATE is
  when START  => · · ·
  when MIDDLE => · · ·
  when STOP   => · · ·
  when others => · · ·
end case ;

• Adding the "when others" choice
• Not necessarily safe; some synthesis tools will ignore the "when others" choice
مرتضي صاحب الزماني 89
Extension of Type Declaration

type STATE_TYPE is (START, MIDDLE, STOP, DUMMY) ;
signal STATE : STATE_TYPE ;
···
case STATE is
  when START  => ···
  when MIDDLE => ···
  when STOP   => ···
  when DUMMY  => ···   -- or when others
end case ;

• Adding dummy values
• Only for binary encoding
• Advantages:
  • Safe FSM after synthesis
• $2^{\lceil \log_2 n \rceil} - n$ dummy states are needed for n states (n = 20 => 32 - 20 = 12 dummy states)
• Changing to one hot coding => unnecessary hardware (n = 20 => 12 unnecessary flip-flops)
مرتضي صاحب الزماني 90
Hand Coding

subtype STATE_TYPE is std_ulogic_vector (1 downto 0) ;
signal STATE : STATE_TYPE ;

constant START  : STATE_TYPE := "01";
constant MIDDLE : STATE_TYPE := "11";
constant STOP   : STATE_TYPE := "00";
···
case STATE is
  when START  => ···
  when MIDDLE => ···
  when STOP   => ···
  when others => ···
end case ;

• Defining constants
• Control of encoding
• Safe FSM
• Portable design
• Disadvantage:
  • More effort (especially when design changes)
مرتضي صاحب الزماني 91
FSM: Medvedev
Two Processes
architecture RTL of MEDVEDEV is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference
  end process REG ;

  CMB: process (X, STATE)
  begin
    -- Next State Logic
  end process CMB ;

  Y <= S ;
end RTL ;

• The output vector resembles the state vector: Y = S

One Process

architecture RTL of MEDVEDEV is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference with Logic Block
  end process REG ;

  Y <= S ;
end RTL ;
مرتضي صاحب الزماني 92
Medvedev Example (2-Process)
architecture RTL of MEDVEDEV_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START ;
    elsif CLK'event and CLK = '1' then
      STATE <= NEXTSTATE ;
    end if ;
  end process REG ;

  CMB: process (A, B, STATE)
  begin
    case STATE is
      when START =>
        if (A or B) = '0' then
          NEXTSTATE <= MIDDLE ;
        end if ;
      when MIDDLE =>
        if (A and B) = '1' then
          NEXTSTATE <= STOP ;
        end if ;
      when STOP =>
        if (A xor B) = '1' then
          NEXTSTATE <= START ;
        end if ;
      when others =>
        NEXTSTATE <= START ;
    end case ;
  end process CMB ;

  -- concurrent signal assignments for output
  (Y,Z) <= STATE ;
end RTL ;
مرتضي صاحب الزماني 93
Medvedev Example Waveform
• (Y,Z) = STATE => Medvedev machine
مرتضي صاحب الزماني 94
FSM: Moore
Three Processes
architecture RTL of MOORE is
  ...
begin
  REG: -- Clocked Process

  CMB: -- Combinational Process

  OUTPUT: process (STATE)
  begin
    -- Output Logic
  end process OUTPUT ;
end RTL ;

• The output vector is a function of the state vector: Y = f(S)

Two Processes

architecture RTL of MOORE is
  ...
begin
  REG: process (CLK, RESET)
  begin
    -- State Registers Inference with Next State Logic
  end process REG ;

  OUTPUT: process (STATE)
  begin
    -- Output Logic
  end process OUTPUT ;
end RTL ;
مرتضي صاحب الزماني 95
Moore Example
architecture RTL of MOORE_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: process (CLK, RESET)
  begin
    if RESET = '1' then
      STATE <= START ;
    elsif CLK'event and CLK = '1' then
      STATE <= NEXTSTATE ;
    end if ;
  end process REG ;

  CMB: process (A, B, STATE)
  begin
    case STATE is
      when START =>
        if (A or B) = '0' then
          NEXTSTATE <= MIDDLE ;
        end if ;
      when MIDDLE =>
        if (A and B) = '1' then
          NEXTSTATE <= STOP ;
        end if ;
      when STOP =>
        if (A xor B) = '1' then
          NEXTSTATE <= START ;
        end if ;
      when others =>
        NEXTSTATE <= START ;
    end case ;
  end process CMB ;

  -- concurrent signal assignments for output
  Y <= '1' when STATE = MIDDLE else '0' ;
  Z <= '1' when STATE = MIDDLE or STATE = STOP else '0' ;
end RTL ;

• Since the outputs depend only on the current state, no signal other than STATE is read by the output logic (or would appear in the sensitivity list of an output process).
مرتضي صاحب الزماني 96
Moore Example Waveform
• (Y,Z) changes simultaneously with STATE => Moore machine
مرتضي صاحب الزماني 97
FSM: Mealy
Three Processes
architecture RTL of MEALY is
  ...
begin
  REG: -- Clocked Process

  CMB: -- Combinational Process

  OUTPUT: process (STATE, X)
  begin
    -- Output Logic
  end process OUTPUT ;
end RTL ;

• The output vector is a function of the state vector and the input vector: Y = f(X,S)

Two Processes

architecture RTL of MEALY is
  ...
begin
  MED: process (CLK, RESET)
  begin
    -- State Registers Inference with Next State Logic
  end process MED ;

  OUTPUT: process (STATE, X)
  begin
    -- Output Logic
  end process OUTPUT ;
end RTL ;
مرتضي صاحب الزماني 98
Mealy Example
architecture RTL of MEALY_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: · · · -- clocked STATE process
  CMB: · · · -- Like Medvedev and Moore Examples

  OUTPUT: process (STATE, A, B)
  begin
    case STATE is
      when START =>
        Y <= '0' ;
        Z <= A and B ;
      when MIDDLE =>
        Y <= A nor B ;
        Z <= '1' ;
      when STOP =>
        Y <= A nand B ;
        Z <= A or B ;
      when others =>
        Y <= '0' ;
        Z <= '0' ;
    end case ;
  end process OUTPUT ;
end RTL ;
مرتضي صاحب الزماني 99
Mealy Example (Another Code)
architecture RTL of MEALY_TEST is
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: · · · -- clocked STATE process
  CMB: · · · -- Like Medvedev and Moore Examples

  -- Concurrent signal assignments for outputs
  Y <= '1' when (STATE = MIDDLE and (A or B) = '0')
             or (STATE = STOP and (A and B) = '0')
           else '0' ;
  Z <= '1' when (STATE = START and (A and B) = '1')
             or (STATE = MIDDLE)
             or (STATE = STOP and (A or B) = '1')
           else '0' ;
end RTL ;
مرتضي صاحب الزماني 100
Mealy Example Waveform
• (Y,Z) changes with input => Mealy machine
• Note the "spikes" of Y and Z in the waveform
مرتضي صاحب الزماني 101
Modeling Aspects
• Medvedev is too inflexible
  • but less hardware (no combinational circuit for output)
  • More effort to calculate state vector.
• Moore is preferred because of safe operation
  • Since the output depends only on the state vector, the next output values are stable long before the next clock edge.
• Mealy is more flexible, but there is a danger of
  • Spikes
  • Unnecessarily long paths (maximum clock period)
  • Combinational feedback loops
مرتضي صاحب الزماني 102
Registered Output
• Avoiding long paths and combinational loops.
• With one additional clock period
• Without additional clock period
مرتضي صاحب الزماني 103
Registered Output Example (1)
architecture RTL of REG_TEST is
  signal Y_I , Z_I : std_ulogic ;
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: · · · -- clocked STATE process

  CMB: · · · -- Like other Examples

  OUTPUT: process (STATE, A, B)
  begin
    case STATE is
      when START =>
        Y_I <= '0' ;
        Z_I <= A and B ;
      · · ·
  end process OUTPUT ;

  -- clocked output process
  OUTPUT_REG: process (CLK)
  begin
    if CLK'event and CLK = '1' then
      Y <= Y_I ;
      Z <= Z_I ;
    end if ;
  end process OUTPUT_REG ;
end RTL ;
مرتضي صاحب الزماني 104
Reg. Output Example Waveform
• One clock period delay between STATE and output changes.
• Input changes with clock edge result in an output change. (Danger of unintended values)
مرتضي صاحب الزماني 105
Registered Output Example (2)
architecture RTL of REG_TEST2 is
  signal Y_I , Z_I : std_ulogic ;
  signal STATE, NEXTSTATE : STATE_TYPE ;
begin
  REG: · · · -- clocked STATE process

  CMB: · · · -- Like other Examples

  OUTPUT: process ( NEXTSTATE , A, B)
  begin
    case NEXTSTATE is
      when START =>
        Y_I <= '0' ;
        Z_I <= A and B ;
      · · ·
  end process OUTPUT ;

  OUTPUT_REG: process (CLK)
  begin
    if CLK'event and CLK = '1' then
      Y <= Y_I ;
      Z <= Z_I ;
    end if ;
  end process OUTPUT_REG ;
end RTL ;
مرتضي صاحب الزماني 106
Reg. Output Example Waveform
• No delay between STATE and output changes. • "Spikes" of original Mealy machine are gone!
مرتضي صاحب الزماني 107
Case Study (Memory Controller)
[Figure: FSM block with inputs BUS_ID, Reset, READY, BURST, READ_WRITE, CLK and outputs OE, WE, ADDR1, ADDR0, driving an SRAM memory array with Address and Data buses]
• Devices on the bus start a bus access by asserting the bus_id of the mem_buffer (F3).
مرتضي صاحب الزماني 108
Case Study (Memory Controller)
• One cycle later, READ_WRITE becomes '1' to indicate that a read is to be performed (or '0' for a write).
مرتضي صاحب الزماني 109
Case Study (Memory Controller)
• A read may be a 4-word read (burst read): BURST must be asserted during the first cycle.
مرتضي صاحب الزماني 110
Case Study (Memory Controller)
• The controller accesses 4 locations of the buffer (each subsequent location is accessed after a further assertion of READY).
مرتضي صاحب الزماني 111
Case Study (Memory Controller)
• The controller asserts OE to the mem_buffer during a read and, in burst mode, increments the two low-order address bits.
مرتضي صاحب الزماني 112
Case Study (Memory Controller)
• A write is always a single word.
مرتضي صاحب الزماني 113
Case Study (Memory Controller)
• During a write, WE is asserted and the data is written to the location given by the address.
• Reads and writes are terminated by asserting READY.
مرتضي صاحب الزماني 114
State Diagram
[Figure: state diagram with states idle, decision, write, read1, read2, read3, read4; transitions are labeled with conditions on ready, read_write and ready·burst; synchronous reset]
مرتضي صاحب الزماني 115
Memory Controller
• Assumed for all states:
state
ready
مرتضي صاحب الزماني 116
VHDL Code (2-process)

library ieee;
use ieee.std_logic_1164.all;

entity memory_controller is
  port (
    reset, read_write, ready, burst, clk : in  std_logic;
    bus_id : in  std_logic_vector(7 downto 0);
    oe, we : out std_logic;
    addr   : out std_logic_vector(1 downto 0));
end memory_controller;

architecture state_machine of memory_controller is
  type StateType is (idle, decision, read1, read2, read3, read4, write);
  signal present_state, next_state : StateType;
begin

  -- combinational process: next-state logic and output logic
  state_comb: process(reset, bus_id, present_state, burst, read_write, ready)
  begin
    if (reset = '1') then
      oe <= '-'; we <= '-'; addr <= "--";   -- don't cares during reset
      next_state <= idle;
    else
      case present_state is
        when idle =>
          oe <= '0'; we <= '0'; addr <= "00";
          if (bus_id = "11110011" and ready = '1') then
            next_state <= decision;
          else
            next_state <= idle;
          end if;
        when decision =>
          oe <= '0'; we <= '0'; addr <= "00";
          if (read_write = '1') then
            next_state <= read1;
          else                              -- read_write = '0'
            next_state <= write;
          end if;
        when read1 =>
          oe <= '1'; we <= '0'; addr <= "00";
          if (ready = '0') then
            next_state <= read1;
          elsif (burst = '0') then
            next_state <= idle;
          else
            next_state <= read2;
          end if;
        when read2 =>
          oe <= '1'; we <= '0'; addr <= "01";
          if (ready = '1') then
            next_state <= read3;
          else
            next_state <= read2;
          end if;
        when read3 =>
          oe <= '1'; we <= '0'; addr <= "10";
          if (ready = '1') then
            next_state <= read4;
          else
            next_state <= read3;
          end if;
        when read4 =>
          oe <= '1'; we <= '0'; addr <= "11";
          if (ready = '1') then
            next_state <= idle;
          else
            next_state <= read4;
          end if;
        when write =>
          oe <= '0'; we <= '1'; addr <= "00";
          if (ready = '1') then
            next_state <= idle;
          else
            next_state <= write;
          end if;
      end case;
    end if;
  end process state_comb;

  -- clocked process: state register
  state_clocked: process(clk)
  begin
    if rising_edge(clk) then
      present_state <= next_state;
    end if;
  end process state_clocked;

end state_machine;

• Don't-care values assigned to the outputs (in the reset branch) let the synthesizer optimize the logic.
• Every output must be assigned in every branch; otherwise unwanted latches are inferred.
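To exercise the controller in simulation, a minimal self-checking testbench sketch for a 4-word burst read might look like the following. Only the memory_controller entity and its ports come from the slides; the testbench name memory_controller_tb, the 10 ns clock period, the helper procedure tick and the stimulus timing are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity memory_controller_tb is
end memory_controller_tb;

architecture sim of memory_controller_tb is
  constant T : time := 10 ns;                  -- assumed clock period
  signal clk : std_logic := '0';
  signal reset, read_write, ready, burst : std_logic := '0';
  signal bus_id : std_logic_vector(7 downto 0) := (others => '0');
  signal oe, we : std_logic;
  signal addr   : std_logic_vector(1 downto 0);
  signal done   : boolean := false;
begin
  dut: entity work.memory_controller
    port map (reset => reset, read_write => read_write, ready => ready,
              burst => burst, clk => clk, bus_id => bus_id,
              oe => oe, we => we, addr => addr);

  -- free-running clock, stopped once the stimulus has finished
  clk <= not clk after T / 2 when not done else '0';

  stimulus: process
    -- hypothetical helper: wait for one rising edge, then let outputs settle
    procedure tick is
    begin
      wait until rising_edge(clk);
      wait for 1 ns;
    end procedure;
  begin
    reset <= '1';                    tick;     -- force the idle state
    reset <= '0';
    bus_id <= x"F3"; ready <= '1';   tick;     -- idle -> decision
    read_write <= '1';               tick;     -- decision -> read1
    assert oe = '1' and we = '0' and addr = "00"
      report "read1 outputs wrong" severity error;
    burst <= '1';                              -- burst asserted in first read cycle
    for i in 1 to 3 loop                       -- read2, read3, read4
      tick;
      assert oe = '1' and addr = std_logic_vector(to_unsigned(i, 2))
        report "burst address wrong" severity error;
    end loop;
    tick;                                      -- read4 -> idle
    assert oe = '0' and we = '0'
      report "controller did not return to idle" severity error;
    done <= true;
    wait;
  end process stimulus;
end sim;
```

Because READY is held at '1' throughout, each read state lasts exactly one clock; deasserting READY in the loop would be a natural extension to check that the controller holds its state.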
Generating Outputs in Moore Machines
1) Outputs decoded combinationally from the state bits (the preceding code):
[Figure: Inputs -> Next-State Logic -> State Registers -> Output Logic -> outputs; the current state feeds back into the next-state logic.]
• Advantages:
  – Readable code
  – Easier maintenance
• Disadvantage:
  – Slow
Generating Outputs in Moore Machines
2) Outputs decoded in parallel and held in output registers:
[Figure: Inputs -> Next-State Logic -> State Registers (current state); the next state also feeds Output Logic -> Output Registers -> outputs.]
• The output assignments must be made outside the process in which the state transitions are defined.
• Assumption: we do this only for addr (oe and we have no timing problem).

architecture state_machine of memory_controller is
  type StateType is (idle, decision, read1, read2, read3, read4, write);
  signal present_state, next_state : StateType;
  signal addr_d : std_logic_vector(1 downto 0);  -- D-input to addr flip-flops
begin

  state_comb: process(bus_id, present_state, burst, read_write, ready)
  begin
    case present_state is            -- addr outputs not defined here
      when idle =>
        oe <= '0'; we <= '0';        -- addr is absent
        if (bus_id = "11110011" and ready = '1') then
          next_state <= decision;
        else
          next_state <= idle;
        end if;
      when decision =>
        oe <= '0'; we <= '0';
        if (read_write = '1') then
          next_state <= read1;
        else                         -- read_write = '0'
          next_state <= write;
        end if;
      when read1 =>
        oe <= '1'; we <= '0';
        if (ready = '0') then
          next_state <= read1;
        elsif (burst = '0') then
          next_state <= idle;
        else
          next_state <= read2;
        end if;
      when read2 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= read3;
        else
          next_state <= read2;
        end if;
      when read3 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= read4;
        else
          next_state <= read3;
        end if;
      when read4 =>
        oe <= '1'; we <= '0';
        if (ready = '1') then
          next_state <= idle;
        else
          next_state <= read4;
        end if;
      when write =>
        oe <= '0'; we <= '1';
        if (ready = '1') then
          next_state <= idle;
        else
          next_state <= write;
        end if;
    end case;
  end process state_comb;

  with next_state select             -- D-input to addr flip-flops
    addr_d <= "01" when read2,       -- defined here
              "10" when read3,
              "11" when read4,
              "00" when others;

  state_clocked: process(clk, reset)
  begin
    if reset = '1' then
      present_state <= idle;
      addr <= "00";                  -- asynchronous reset for addr flip-flops
    elsif rising_edge(clk) then
      present_state <= next_state;
      addr <= addr_d;                -- value of addr_d stored in addr
    end if;
  end process state_clocked;

end state_machine;
Generating Outputs in Moore Machines
• Problems:
  – 2 extra flip-flops.
  – To reach the addr flip-flops, the state bits must propagate through two combinational blocks (the next-state logic, then the output logic). If this takes 2 cells in a PLD, it can limit the maximum frequency.
[Figure: Inputs -> Next-State Logic -> State Registers (current state); the next state also feeds Output Logic -> Output Registers -> outputs.]
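The frequency limit mentioned above can be made explicit with a simple register-to-register timing budget. The delay symbols below ($t_{NS}$ for the next-state logic, $t_{OUT}$ for the output logic, $t_{co}$ and $t_{su}$ for the flip-flops) are notation assumed here for illustration, not taken from the slides.

```latex
\begin{align*}
\text{state register} \to \text{state register:}\quad
  & T_{clk} \ge t_{co} + t_{NS} + t_{su} \\
\text{state register} \to \text{addr register:}\quad
  & T_{clk} \ge t_{co} + t_{NS} + t_{OUT} + t_{su} \\
\Rightarrow\quad
  & f_{max} = \frac{1}{t_{co} + t_{NS} + t_{OUT} + t_{su}}
\end{align*}
```

The second path is the longer one, so the extra output-decoding delay $t_{OUT}$ directly lowers the achievable clock frequency.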
Generating Outputs in Moore Machines
3) Outputs encoded directly in the state bits (Medvedev), as in counters:
[Figure: Inputs -> Next-State Logic -> State Registers -> outputs; the current state feeds back into the next-state logic.]
• The state encoding must be chosen carefully.
• Requires more flip-flops.
• Needs no combinational logic for the outputs (higher speed).
State Encoding
• Assumption: we do this only for addr (we and oe have no timing problem).

State      Addr(1)  Addr(0)  s1  s2
idle       0        0        0   0
decision   0        0        0   1
read1      0        0        1   0
read2      0        1        x   x
read3      1        0        x   x
read4      1        1        x   x
write      0        0        1   1

• The don't-care (x) entries can be set to 00.
State Encoding
• If we also want to encode oe and we in the same way:

State      Addr(1)  Addr(0)  oe  we  s0
idle       0        0        0   0   0
decision   0        0        0   0   1
read1      0        0        1   0   0
read2      0        1        1   0   0
read3      1        0        1   0   0
read4      1        1        1   0   0
write      0        0        0   1   0
VHDL Code

architecture state_machine of memory_controller is
  -- state signal is a std_logic_vector rather than an enumeration type
  -- bit layout: state(4 downto 3) = addr, state(2) = oe, state(1) = we, state(0) = s0
  signal state : std_logic_vector(4 downto 0);
  constant idle     : std_logic_vector(4 downto 0) := "00000";
  constant decision : std_logic_vector(4 downto 0) := "00001";
  constant read1    : std_logic_vector(4 downto 0) := "00100";
  constant read2    : std_logic_vector(4 downto 0) := "01100";
  constant read3    : std_logic_vector(4 downto 0) := "10100";
  constant read4    : std_logic_vector(4 downto 0) := "11100";
  constant write    : std_logic_vector(4 downto 0) := "00010";
begin

  state_tr: process(reset, clk)      -- one-process FSM
  begin
    if reset = '1' then
      state <= idle;
    elsif rising_edge(clk) then
      case state is                  -- outputs not defined here
        when idle =>
          if (bus_id = "11110011") then
            state <= decision;
          end if;                    -- no else; implicit memory
        when decision =>
          if (read_write = '1') then
            state <= read1;
          else                       -- read_write = '0'
            state <= write;
          end if;
        when read1 =>
          if (ready = '0') then
            state <= read1;
          elsif (burst = '0') then
            state <= idle;
          else
            state <= read2;
          end if;
        when read2 =>
          if (ready = '1') then
            state <= read3;
          end if;                    -- no else; implicit memory
        when read3 =>
          if (ready = '1') then
            state <= read4;
          end if;                    -- no else; implicit memory
        when read4 =>
          if (ready = '1') then
            state <= idle;
          end if;                    -- no else; implicit memory
        when write =>
          if (ready = '1') then
            state <= idle;
          end if;                    -- no else; implicit memory
        when others =>
          state <= "-----";          -- don't care if undefined state
      end case;
    end if;
  end process state_tr;

  -- outputs associated with register values
  we   <= state(1);
  oe   <= state(2);
  addr <= state(4 downto 3);

end state_machine;
One-Hot Encoding
• N flip-flops for N states.
• Example: an FSM with 18 states.
State Sequential One-Hot
State0 00000 000000000000000001
State1 00001 000000000000000010
State2 00010 000000000000000100
State3 00011 000000000000001000
State4 00100 000000000000010000
State5 00101 000000000000100000
State6 00110 000000000001000000
State7 00111 000000000010000000
State8 01000 000000000100000000
State9 01001 000000001000000000
State10 01010 000000010000000000
State11 01011 000000100000000000
State12 01100 000001000000000000
State13 01101 000010000000000000
State14 01110 000100000000000000
State15 01111 001000000000000000
State16 10000 010000000000000000
State17 10001 100000000000000000
One-Hot Encoding
• Assume the following part of an FSM:
[Figure: State2 goes to State15 on cond1; State17 goes to State15 on cond2; State15 loops back to itself while cond3 = 0.]
One-Hot Encoding
a) Sequential Encoding

State     s4s3s2s1s0 (present)   cond1  cond2  cond3  ...   s4s3s2s1s0 (next)
state0
state1
state2    00010                  1      -      -            01111
...
state15   01111                  -      -      0            01111
state16
state17   10001                  -      1      -            01111

Next-state equations (one per state flip-flop):

$$D_{s_0} = \bar{s}_4\bar{s}_3\bar{s}_2 s_1\bar{s}_0\cdot cond_1 + \bar{s}_4 s_3 s_2 s_1 s_0\cdot\overline{cond_3} + s_4\bar{s}_3\bar{s}_2\bar{s}_1 s_0\cdot cond_2 + \dots$$
$$D_{s_1} = \bar{s}_4\bar{s}_3\bar{s}_2 s_1\bar{s}_0\cdot cond_1 + \bar{s}_4 s_3 s_2 s_1 s_0\cdot\overline{cond_3} + s_4\bar{s}_3\bar{s}_2\bar{s}_1 s_0\cdot cond_2 + \dots$$
$$D_{s_2} = \bar{s}_4\bar{s}_3\bar{s}_2 s_1\bar{s}_0\cdot cond_1 + \bar{s}_4 s_3 s_2 s_1 s_0\cdot\overline{cond_3} + s_4\bar{s}_3\bar{s}_2\bar{s}_1 s_0\cdot cond_2 + \dots$$
$$D_{s_3} = \bar{s}_4\bar{s}_3\bar{s}_2 s_1\bar{s}_0\cdot cond_1 + \bar{s}_4 s_3 s_2 s_1 s_0\cdot\overline{cond_3} + s_4\bar{s}_3\bar{s}_2\bar{s}_1 s_0\cdot cond_2 + \dots$$
$$D_{s_4} = \dots$$

• Very, very complex combinational logic.
One-Hot Encoding

$$D_{t_{15}} = t_2\cdot cond_1 + t_{17}\cdot cond_2 + t_{15}\cdot\overline{cond_3}$$

• Very simple combinational logic.
• But a large number of flip-flops.
• 18 very simple equations instead of 5 very complex ones.
• Fewer logic levels between the state registers, hence a higher clock frequency.
• Well suited to FPGAs.
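As a rough illustration of how small each one-hot equation is, the flip-flop for State15 in the FSM fragment could be sketched as follows. The entity name one_hot_t15, its port names and the asynchronous reset to '0' (assuming State15 is not the reset state) are hypothetical; a real design would hold all 18 state bits in one process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical fragment: only the State15 flip-flop of the 18-state FSM.
entity one_hot_t15 is
  port (
    clk, rst            : in  std_logic;
    t2, t17             : in  std_logic;   -- one-hot bits of State2 and State17
    cond1, cond2, cond3 : in  std_logic;
    t15                 : out std_logic);  -- one-hot bit of State15
end one_hot_t15;

architecture rtl of one_hot_t15 is
  signal t15_q : std_logic;
begin
  process (clk, rst)
  begin
    if rst = '1' then
      t15_q <= '0';                        -- assumed: State15 is not the reset state
    elsif rising_edge(clk) then
      -- enter from State2 on cond1, from State17 on cond2,
      -- and stay in State15 while cond3 = '0'
      t15_q <= (t2 and cond1) or (t17 and cond2) or (t15_q and not cond3);
    end if;
  end process;

  t15 <= t15_q;
end rtl;
```

The next-state logic is a single two-level expression over six signals, which fits comfortably in one FPGA logic cell feeding its own flip-flop.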
Power Reduction
• A suitable state assignment can reduce power consumption.
• For example, one-hot encoding needs only 2 signal changes per cycle.
• Other factors:
  – It requires a large number of registers.
  – The next-state logic.
• The result has to be evaluated experimentally.
• Gray encoding: better suited to counter-like FSMs.
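Gray encoding suits counter-like FSMs because consecutive codes differ in exactly one bit, so stepping through the states in sequence toggles a single state flip-flop per transition. A minimal sketch of a binary-to-Gray conversion that could be used to derive such state constants is shown below; the package name gray_pkg and the function name to_gray are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package gray_pkg is
  -- Converts a binary-coded vector into its Gray-code equivalent.
  function to_gray (b : std_logic_vector) return std_logic_vector;
end package gray_pkg;

package body gray_pkg is
  function to_gray (b : std_logic_vector) return std_logic_vector is
    -- normalise the index range to (length-1 downto 0)
    variable bin  : std_logic_vector(b'length - 1 downto 0) := b;
    variable gray : std_logic_vector(b'length - 1 downto 0);
  begin
    gray(gray'high) := bin(bin'high);        -- MSB is copied unchanged
    for i in gray'high - 1 downto 0 loop
      gray(i) := bin(i + 1) xor bin(i);      -- each lower bit: XOR of neighbours
    end loop;
    return gray;
  end function to_gray;
end package body gray_pkg;
```

For example, the binary sequence 000, 001, 010, 011, 100 maps to 000, 001, 011, 010, 110, and each step changes one bit only.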
Pipelining
• Main idea: split a large datapath operation that completes in one clock cycle into several smaller operations that complete over several clock cycles:
[Figure: before pipelining, Inputs -> Registers -> Datapath Operation -> Registers -> outputs, with tp = x. After pipelining, the operation is split into three parts separated by registers, each part with tp = x/3.]
Pipelining
• f becomes roughly 3 times higher (ignoring the tco and tsu of the pipeline registers).
• Throughput triples, but the outputs become available 3 clocks later (latency).
• There is also the cost of the added registers.
• Most FPGAs have no problem with this, but pipelining is used less often in CPLDs.
  – CPLDs can perform a large amount of logic in a single pass through the logic array.
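The "roughly 3 times" figure can be checked with a short calculation. Here $x$ is the combinational delay of the unpipelined datapath, as on the previous slide; the numeric values are assumed purely for illustration.

```latex
\begin{align*}
f_{\text{orig}} &= \frac{1}{t_{co} + x + t_{su}}, &
f_{\text{pipe}} &= \frac{1}{t_{co} + \tfrac{x}{3} + t_{su}} \\[4pt]
\intertext{Assumed example: $x = 27\,\text{ns}$ and $t_{co} + t_{su} = 3\,\text{ns}$.}
f_{\text{orig}} &= \frac{1}{30\,\text{ns}} \approx 33.3\,\text{MHz}, &
f_{\text{pipe}} &= \frac{1}{12\,\text{ns}} \approx 83.3\,\text{MHz} \\[4pt]
\frac{f_{\text{pipe}}}{f_{\text{orig}}} &= 2.5
\end{align*}
```

The speed-up approaches 3 only when $x$ is much larger than $t_{co} + t_{su}$; the register overhead is paid once per stage.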
Example: AMD AM2901
[Figure: AM2901 block diagram, including the src_op block.]

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;      -- numeric_std is in library ieee (the slide used work.numeric_std)
use work.am2901_comps.all;

entity am2901 is
  port (
    clk, rst     : in     std_logic;
    a, b         : in     unsigned(3 downto 0);          -- address inputs
    d            : in     unsigned(3 downto 0);          -- direct data
    i            : in     std_logic_vector(8 downto 0);  -- micro instruction
    c_n          : in     std_logic;                     -- carry in
    oe           : in     std_logic;                     -- output enable
    ram0, ram3   : inout  std_logic;                     -- shift lines to ram
    qs0, qs3     : inout  std_logic;                     -- shift lines to q
    y            : buffer unsigned(3 downto 0);          -- data outputs (3-state)
    g_bar, p_bar : buffer std_logic;                     -- carry generate, propagate
    ovr          : buffer std_logic;                     -- overflow
    c_n4         : buffer std_logic;                     -- carry out
    f_0          : buffer std_logic;                     -- f = 0
    f3           : buffer std_logic);                    -- f(3) w/o 3-state
end am2901;

architecture am2901 of am2901 is
  alias dest_ctl : std_logic_vector(2 downto 0) is i(8 downto 6);
  alias alu_ctl  : std_logic_vector(2 downto 0) is i(5 downto 3);
  alias src_ctl  : std_logic_vector(2 downto 0) is i(2 downto 0);

  signal ad, bd  : unsigned(3 downto 0);
  signal q       : unsigned(3 downto 0);
  signal r, s    : unsigned(3 downto 0);
  signal alu_out : unsigned(3 downto 0);
begin

  -- instantiate and connect components
  u1: ram_regs port map(clk => clk, rst => rst, a => a, b => b,
                        alu_out => alu_out, dest_ctl => dest_ctl,
                        ram0 => ram0, ram3 => ram3, ad => ad, bd => bd);
  u2: q_reg    port map(clk => clk, rst => rst, alu_out => alu_out,
                        dest_ctl => dest_ctl, qs0 => qs0, qs3 => qs3, q => q);
  u3: src_op   port map(d => d, ad => ad, bd => bd, q => q,
                        src_ctl => src_ctl, r => r, s => s);
  u4: alu      port map(r => r, s => s, c_n => c_n, alu_ctl => alu_ctl,
                        alu_out => alu_out, g_bar => g_bar, p_bar => p_bar,
                        c_n4 => c_n4, ovr => ovr);
  u5: out_mux  port map(ad => ad, alu_out => alu_out, dest_ctl => dest_ctl,
                        oe => oe, y => y);

  -- define f_0 and f3 outputs
  f_0 <= '0' when alu_out = "0000" else 'Z';
  f3  <= alu_out(3);

end am2901;
Pipelined AM2901
[Figure: AM2901 block diagram, including the src_op block, with a pipeline register added. Annotation: added to time-align the outputs.]

Pipelined AMD AM2901

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;      -- numeric_std is in library ieee (the slide used work.numeric_std)
use work.am2901_comps.all;

entity am2901 is
  port (
    clk, rst         : in     std_logic;
    a, b             : in     unsigned(3 downto 0);          -- address inputs
    d                : in     unsigned(3 downto 0);          -- direct data
    i                : in     std_logic_vector(8 downto 0);  -- micro instruction
    c_n              : in     std_logic;                     -- carry in
    oe               : in     std_logic;                     -- output enable
    ram0, ram3       : inout  std_logic;                     -- shift lines to ram
    qs0, qs3         : inout  std_logic;                     -- shift lines to q
    y                : buffer unsigned(3 downto 0);          -- data outputs (3-state)
    g_bar_q, p_bar_q : buffer std_logic;                     -- carry generate, propagate
    ovr_q            : buffer std_logic;                     -- overflow
    c_n4_q           : buffer std_logic;                     -- carry out
    f_0              : buffer std_logic;                     -- alu_out = 0
    f3               : buffer std_logic);                    -- alu_out(3) w/o 3-state
end am2901;

architecture am2901 of am2901 is
  alias dest_ctl : std_logic_vector(2 downto 0) is i(8 downto 6);
  alias alu_ctl  : std_logic_vector(2 downto 0) is i(5 downto 3);
  alias src_ctl  : std_logic_vector(2 downto 0) is i(2 downto 0);

  signal ad, bd  : unsigned(3 downto 0);
  signal q       : unsigned(3 downto 0);
  signal r, s    : unsigned(3 downto 0);
  signal alu_out, alu_out_q : unsigned(3 downto 0);
  -- unregistered ALU flags; not declared on the slide, but needed
  -- because the ports were renamed to *_q
  signal g_bar, p_bar, ovr, c_n4 : std_logic;
begin

  -- instantiate and connect components
  u1: ram_regs port map(clk => clk, rst => rst, a => a, b => b,
                        alu_out => alu_out_q, dest_ctl => dest_ctl,
                        ram0 => ram0, ram3 => ram3, ad => ad, bd => bd);
  u2: q_reg    port map(clk => clk, rst => rst, alu_out => alu_out_q,
                        dest_ctl => dest_ctl, qs0 => qs0, qs3 => qs3, q => q);
  -- u3, u4 and u5 are unchanged from the non-pipelined version
  u3: src_op   port map(d => d, ad => ad, bd => bd, q => q,
                        src_ctl => src_ctl, r => r, s => s);
  u4: alu      port map(r => r, s => s, c_n => c_n, alu_ctl => alu_ctl,
                        alu_out => alu_out, g_bar => g_bar, p_bar => p_bar,
                        c_n4 => c_n4, ovr => ovr);
  u5: out_mux  port map(ad => ad, alu_out => alu_out, dest_ctl => dest_ctl,
                        oe => oe, y => y);

  -- define f_0 and f3 outputs from the registered ALU result
  f_0 <= '0' when alu_out_q = "0000" else 'Z';
  f3  <= alu_out_q(3);

  -- pipeline registers for the ALU result and its status flags
  process (clk)
  begin
    if rising_edge(clk) then
      alu_out_q <= alu_out;
      g_bar_q   <= g_bar;
      p_bar_q   <= p_bar;
      ovr_q     <= ovr;
      c_n4_q    <= c_n4;
    end if;
  end process;

end am2901;