ece 545 recommendedreading lecture 12 • 7 series … design process (1) design and implement a...

9
1 George Mason University FPGA Resources ECE 545 Lecture 12 2 Recommended reading 7 Series FPGAs Configurable Logic Block: User Guide § Overview § Functional Details 3 Block RAMs Block RAMs Configurable Logic Blocks I/O Blocks What is an FPGA? Block RAMs 4 Modern FPGA Graphics based on The Design Warrior’s Guide to FPGAs Devices, Tools, and Flows. ISBN 0750676043 Copyright © 2004 Mentor Graphics Corp. (www.mentor.com) Multipliers/DSP units RAM blocks Logic resources (#Logic resources, #Multipliers/DSP units, #RAM_blocks) George Mason University CLB Structure 6 Xilinx Spartan-6 CLB

Upload: duonglien

Post on 16-Mar-2018

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

1

George Mason University

FPGA Resources

ECE 545Lecture 12

2

Recommended reading

• 7 Series FPGAs Configurable Logic Block:User Guide

§ Overview§ Functional Details

3

Block RAM

s

Block RAM

s

ConfigurableLogicBlocks

I/OBlocks

What is an FPGA?

BlockRAMs

4

Modern FPGARAM blocks

Multipliers

Logic blocks

Graphics based on The Design Warrior’s Guide to FPGAsDevices, Tools, and Flows. ISBN 0750676043

Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Multipliers/DSP units

RAM blocks

Logic resources

(#Logic resources, #Multipliers/DSP units, #RAM_blocks)

George Mason University

CLB Structure

6

Xilinx Spartan-6 CLB

Page 2: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

2

7ECE 448 – FPGA and ASIC Design with VHDL

SLICEX

8

4-input LUT (Look-Up Table) (used in earlier families of FPGAs)

• Look-Up tables are primary elements for logic implementation

• Each LUT can implement any function of 4 inputs

x1 x2 x3 x4

y

x1 x2

y

LUT

x1x2x3x4

y

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y0100010101001100

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y1111111111110000

x1 x2 x3 x4

y

x1 x2 x3 x4

y

x1 x2

y

x1 x2

y

LUT

x1x2x3x4

y

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y0100010101001100

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y0100010101001100

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y1111111111110000

0x1

0x2 x3 x4

0 00 0 0 10 0 1 00 0 1 10 1 0 00 1 0 10 1 1 00 1 1 11 0 0 01 0 0 11 0 1 01 0 1 11 1 0 01 1 0 11 1 1 01 1 1 1

y1111111111110000

9

6-Input LUT of Spartan-6

10

11

Reset and Set Configurations

• No set or reset• Synchronous set• Synchronous reset• Asynchronous set (preset)• Asynchronous reset (clear)

12

Three Different Types of Slicesin Xilinx FPGAs

Page 3: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

3

13

SLICEX

14

SLICEL

15

u Each CLB contains separate logic and routing for the fast generation of sum & carry signals• Increases efficiency and

performance of adders, subtractors, accumulators, comparators, and counters

u Carry logic is independent of normal logic and routing resources

Fast Carry Logic

LSB

MSB

Carry

Log

icRo

utin

g

x y COUT

0011

0101

y

y

CINCIN

Propagate = x Å yGenerate = ySum= Propagate Å CIN = x Å y Å CIN

xy

Carry & Control Logic in Xilinx FPGAs

Carry & Control Logic in Spartan 3 FPGAs

LUT

Hardwired (fast) logic

xy

Simplified View of a Xilinx FPGA Carry and Arithmetic Logic in One

Logic Cell

xy

Page 4: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

4

20

21

Accessing Carry Logic

u All major synthesis tools can infer carry logic for arithmetic functions

• Addition (SUM <= A + B)• Subtraction (DIFF <= A - B)• Comparators (if A < B then…)• Counters (count <= count +1)

22

SLICEM

23

16-bit SR

16 x 1 RAM

4-input LUT

The Design Warrior’s Guide to FPGAsDevices, Tools, and Flows. ISBN 0750676043

Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Xilinx Multipurpose LUT (MLUT)

64 x 1 ROM(logic)

64 x 1 RAM

32-bit SR

24

MLUT as a 32-bit Shift Register (SRL32)

Page 5: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

5

25

Question 1

How many Xilinx LUTs are necessary to implement the following functions:

A. f1 = x1x2 + x3x4 + x5x6

f2 = x1x2x3x4x5

B. f1 = x1x2 + x2x3 + x3x4 + x4x5

f2 = x1+x2+x3+x4+x5

26

Question 2

How many Xilinx LUTs are necessary to implement:

A. 4-to-1 MUXB. 2-to-4 decoder with enableC. 3-to-8 decoder with enable

27

Question 3

How many Xilinx LUTs are necessary to implement 4-bit priority encoder?

w 0

w 3

y 0

y 1d001

010

w0 y1

d

y0

1 1

01

1

11

z

1--

0

-

w1

01-

0

-

w2

001

0

-

w3

000

0

1

z

w 1

w 2 w

y

28

Question 4How many Xilinx LUTs are necessary to implement the following circuit:

George Mason University

Question 5

Determine the amount of Xilinx FPGA resources needed

to implement a given circuit

R0

R1

R2

R3

R4

R5

R6

R7

R8

R9

R10

R11

R12

R13

R14

R15

w

abcd

yF

m

clk

0 1 runCircuit 1:Top level

Page 6: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

6

1

01

0

01234567

cin

x y

cout

s

<<<3

x3

x2

x1

x0

y3

y2

y1

y0

w1

w0

En

y3

y2

y1

y0

a

bcd

a

b

c

dc

ab

e

ef

3

2-to-4 Decoder

FullAdder

f

g

h

g h

y

Circuit 1:F – function R0

R1

R2

R3

R4

R5

R6

R7

R8

R9

R10

R11

R12

R13

R14

R15

z

abcde

yF

d

clk

0 1 runCircuit 2:Top level

1

01

0

01234567

x y

cout

s

>>2

x3

x2

x1

x0

y3

y2

y1

y0

y1

y0

z

w3

w2

w1

w0

a

b

c

d

ae

f

gh

3

Priority Encoder

HalfAdder

g

h

i

e i

y

a

b

c

d

Circuit 2:F – function

Circuit 3: Top level

en enaclk

y(0)

en enaclken enaclken enaclk

x

3232 32 32

32

y(1)

+ −

R0 R1 R2 R3

A BA>B

16 16 16 16

1616

Circuit 4:Top level

George Mason UniversityECE 448 – FPGA and ASIC Design with VHDL

Input/Output Blocks(IOBs)

Page 7: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

7

37

Basic I/O Block Structure

DEC

Q

SR

DEC

Q

SR

DEC

Q

SR

Three-StateControl

Output Path

Input Path

Three-State

Output

Clock

Set/Reset

Direct Input

Registered Input

FF Enable

FF Enable

FF Enable

38

IOB Functionality

• IOB provides interface between the package pins and CLBs

• Each IOB can work as uni- or bi-directional I/O

• Outputs can be forced into High Impedance• Inputs and outputs can be registered

• advised for high-performance I/O• Inputs can be delayed

George Mason University

FPGA Design Flow

FPGA Design process (1)Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key set on 8031 microcontroller. Unlike in the experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds…..

Library IEEE;use ieee.std_logic_1164.all;use ieee.std_logic_unsigned.all;

entity RC5_core isport(

clock, reset, encr_decr: in std_logic;data_input: in std_logic_vector(31 downto 0);data_output: out std_logic_vector(31 downto 0);out_full: in std_logic;key_input: in std_logic_vector(31 downto 0);key_read: out std_logic;

);end AES_core;

Specification / Pseudocode

VHDL description (Your Source Files)Functional simulation

Post-synthesis simulationSynthesis

On-paper hardware design (Block diagram & ASM chart)

FPGA Design process (2)

Implementation

Configuration

Timing simulation

On chip testing

George Mason University

Synthesis

Page 8: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

8

43

architecture MLU_DATAFLOW of MLU is

signal A1:STD_LOGIC;signal B1:STD_LOGIC;signal Y1:STD_LOGIC;signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;

beginA1<=A when (NEG_A='0') else

not A;B1<=B when (NEG_B='0') else

not B;Y<=Y1 when (NEG_Y='0') else

not Y1;

MUX_0<=A1 and B1;MUX_1<=A1 or B1;MUX_2<=A1 xor B1;MUX_3<=A1 xnor B1;

with (L1 & L0) selectY1<=MUX_0 when "00",

MUX_1 when "01",MUX_2 when "10",MUX_3 when others;

end MLU_DATAFLOW;

VHDL description Circuit netlist

Logic Synthesis

44

Circuit netlist (RTL view)

45

Mapping

LUT2

LUT1

FF1

FF2

LUT0

George Mason University

Implementation

47

Implementation

• After synthesis the entire implementation process is performed by FPGA vendor tools

48

Mapping

LUT2

LUT1

FF1

FF2

LUT0

Page 9: ECE 545 Recommendedreading Lecture 12 • 7 Series … Design process (1) Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key

9

49

PlacingCLB SLICES

FPGA

50

RoutingProgrammable Connections

FPGA

51

Configuration

• Once a design is implemented, you must create a file that the FPGA can understand• This file is called a bit stream: a BIT file (.bit extension)

• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information

Two main stages of the FPGA Design Flow

Synthesis

Technologyindependent

Technologydependent

Implementation

RTLSynthesis Map Place & Route Configure

- Code analysis- Derivation of main logic constructions- Technology independent optimization- Creation of “RTL View”

- Mapping of extracted logic structures to device primitives- Technology dependent optimization- Application of “synthesis constraints”-Netlist generation- Creation of “Technology View”

- Placement of generated netlist onto the device-Choosing best interconnect structure for the placed design-Application of “physical constraints”

- Bitstream generation- Burning device