ece 545 recommendedreading lecture 12 • 7 series … design process (1) design and implement a...
TRANSCRIPT
1
George Mason University
FPGA Resources
ECE 545Lecture 12
2
Recommended reading
• 7 Series FPGAs Configurable Logic Block:User Guide
§ Overview§ Functional Details
3
Block RAM
s
Block RAM
s
ConfigurableLogicBlocks
I/OBlocks
What is an FPGA?
BlockRAMs
4
Modern FPGARAM blocks
Multipliers
Logic blocks
Graphics based on The Design Warrior’s Guide to FPGAsDevices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Multipliers/DSP units
RAM blocks
Logic resources
(#Logic resources, #Multipliers/DSP units, #RAM_blocks)
George Mason University
CLB Structure
6
Xilinx Spartan-6 CLB
2
7ECE 448 – FPGA and ASIC Design with VHDL
SLICEX
8
4-input LUT (Look-Up Table) (used in earlier families of FPGAs)
• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
x1 x2 x3 x4
y
x1 x2
y
LUT
x1x2x3x4
y
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y0100010101001100
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y1111111111110000
x1 x2 x3 x4
y
x1 x2 x3 x4
y
x1 x2
y
x1 x2
y
LUT
x1x2x3x4
y
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y0100010101001100
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y0100010101001100
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y1111111111110000
0x1
0x2 x3 x4
0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1
y1111111111110000
9
6-Input LUT of Spartan-6
10
11
Reset and Set Configurations
• No set or reset• Synchronous set• Synchronous reset• Asynchronous set (preset)• Asynchronous reset (clear)
12
Three Different Types of Slicesin Xilinx FPGAs
3
13
SLICEX
14
SLICEL
15
u Each CLB contains separate logic and routing for the fast generation of sum & carry signals• Increases efficiency and
performance of adders, subtractors, accumulators, comparators, and counters
u Carry logic is independent of normal logic and routing resources
Fast Carry Logic
LSB
MSB
Carry
Log
icRo
utin
g
x y COUT
0011
0101
y
y
CINCIN
Propagate = x Å yGenerate = ySum= Propagate Å CIN = x Å y Å CIN
xy
Carry & Control Logic in Xilinx FPGAs
Carry & Control Logic in Spartan 3 FPGAs
LUT
Hardwired (fast) logic
xy
Simplified View of a Xilinx FPGA Carry and Arithmetic Logic in One
Logic Cell
xy
4
20
21
Accessing Carry Logic
u All major synthesis tools can infer carry logic for arithmetic functions
• Addition (SUM <= A + B)• Subtraction (DIFF <= A - B)• Comparators (if A < B then…)• Counters (count <= count +1)
22
SLICEM
23
16-bit SR
16 x 1 RAM
4-input LUT
The Design Warrior’s Guide to FPGAsDevices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Xilinx Multipurpose LUT (MLUT)
64 x 1 ROM(logic)
64 x 1 RAM
32-bit SR
24
MLUT as a 32-bit Shift Register (SRL32)
5
25
Question 1
How many Xilinx LUTs are necessary to implement the following functions:
A. f1 = x1x2 + x3x4 + x5x6
f2 = x1x2x3x4x5
B. f1 = x1x2 + x2x3 + x3x4 + x4x5
f2 = x1+x2+x3+x4+x5
26
Question 2
How many Xilinx LUTs are necessary to implement:
A. 4-to-1 MUXB. 2-to-4 decoder with enableC. 3-to-8 decoder with enable
27
Question 3
How many Xilinx LUTs are necessary to implement 4-bit priority encoder?
w 0
w 3
y 0
y 1d001
010
w0 y1
d
y0
1 1
01
1
11
z
1--
0
-
w1
01-
0
-
w2
001
0
-
w3
000
0
1
z
w 1
w 2 w
y
28
Question 4How many Xilinx LUTs are necessary to implement the following circuit:
George Mason University
Question 5
Determine the amount of Xilinx FPGA resources needed
to implement a given circuit
R0
R1
R2
R3
R4
R5
R6
R7
R8
R9
R10
R11
R12
R13
R14
R15
w
abcd
yF
m
clk
0 1 runCircuit 1:Top level
6
1
01
0
01234567
cin
x y
cout
s
<<<3
x3
x2
x1
x0
y3
y2
y1
y0
w1
w0
En
y3
y2
y1
y0
a
bcd
a
b
c
dc
ab
e
ef
3
2-to-4 Decoder
FullAdder
f
g
h
g h
y
Circuit 1:F – function R0
R1
R2
R3
R4
R5
R6
R7
R8
R9
R10
R11
R12
R13
R14
R15
z
abcde
yF
d
clk
0 1 runCircuit 2:Top level
1
01
0
01234567
x y
cout
s
>>2
x3
x2
x1
x0
y3
y2
y1
y0
y1
y0
z
w3
w2
w1
w0
a
b
c
d
ae
f
gh
3
Priority Encoder
HalfAdder
g
h
i
e i
y
a
b
c
d
Circuit 2:F – function
Circuit 3: Top level
en enaclk
y(0)
en enaclken enaclken enaclk
x
3232 32 32
32
y(1)
+ −
R0 R1 R2 R3
A BA>B
16 16 16 16
1616
Circuit 4:Top level
George Mason UniversityECE 448 – FPGA and ASIC Design with VHDL
Input/Output Blocks(IOBs)
7
37
Basic I/O Block Structure
DEC
Q
SR
DEC
Q
SR
DEC
Q
SR
Three-StateControl
Output Path
Input Path
Three-State
Output
Clock
Set/Reset
Direct Input
Registered Input
FF Enable
FF Enable
FF Enable
38
IOB Functionality
• IOB provides interface between the package pins and CLBs
• Each IOB can work as uni- or bi-directional I/O
• Outputs can be forced into High Impedance• Inputs and outputs can be registered
• advised for high-performance I/O• Inputs can be delayed
George Mason University
FPGA Design Flow
FPGA Design process (1)Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key set on 8031 microcontroller. Unlike in the experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds…..
Library IEEE;use ieee.std_logic_1164.all;use ieee.std_logic_unsigned.all;
entity RC5_core isport(
clock, reset, encr_decr: in std_logic;data_input: in std_logic_vector(31 downto 0);data_output: out std_logic_vector(31 downto 0);out_full: in std_logic;key_input: in std_logic_vector(31 downto 0);key_read: out std_logic;
);end AES_core;
Specification / Pseudocode
VHDL description (Your Source Files)Functional simulation
Post-synthesis simulationSynthesis
On-paper hardware design (Block diagram & ASM chart)
FPGA Design process (2)
Implementation
Configuration
Timing simulation
On chip testing
George Mason University
Synthesis
8
43
architecture MLU_DATAFLOW of MLU is
signal A1:STD_LOGIC;signal B1:STD_LOGIC;signal Y1:STD_LOGIC;signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
beginA1<=A when (NEG_A='0') else
not A;B1<=B when (NEG_B='0') else
not B;Y<=Y1 when (NEG_Y='0') else
not Y1;
MUX_0<=A1 and B1;MUX_1<=A1 or B1;MUX_2<=A1 xor B1;MUX_3<=A1 xnor B1;
with (L1 & L0) selectY1<=MUX_0 when "00",
MUX_1 when "01",MUX_2 when "10",MUX_3 when others;
end MLU_DATAFLOW;
VHDL description Circuit netlist
Logic Synthesis
44
Circuit netlist (RTL view)
45
Mapping
LUT2
LUT1
FF1
FF2
LUT0
George Mason University
Implementation
47
Implementation
• After synthesis the entire implementation process is performed by FPGA vendor tools
48
Mapping
LUT2
LUT1
FF1
FF2
LUT0
9
49
PlacingCLB SLICES
FPGA
50
RoutingProgrammable Connections
FPGA
51
Configuration
• Once a design is implemented, you must create a file that the FPGA can understand• This file is called a bit stream: a BIT file (.bit extension)
• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information
Two main stages of the FPGA Design Flow
Synthesis
Technologyindependent
Technologydependent
Implementation
RTLSynthesis Map Place & Route Configure
- Code analysis- Derivation of main logic constructions- Technology independent optimization- Creation of “RTL View”
- Mapping of extracted logic structures to device primitives- Technology dependent optimization- Application of “synthesis constraints”-Netlist generation- Creation of “Technology View”
- Placement of generated netlist onto the device-Choosing best interconnect structure for the placed design-Application of “physical constraints”
- Bitstream generation- Burning device