RB - EPFL/IC/LAP - A2008 1
FPGARM4U
LAP/I&C/EPFL
Lecturer (chargé de cours)
LSN/EIG
Prof. HES
Filippo Rusco, Yorick Brunet
Sections of the course
The last section introduces an ARM9 processor interfaced with the FPGA4U board by a proprietary bus.
Linux runs on the ARM board and the FPGA implements application-specific programmable interfaces.
Schedule (4): 4 weeks
Embedded systems on FPGA and ARM processor, master interfaces
C (8h): ARM family, ARM9 more specifically; USB2 and microcontroller (FX2)
L (8h): Multiprocessor system ARM/NIOS; FPGARM4U / FPGA4U
Report, demo & presentation
FPGARM4U: Hardware design
Hardware Design
When a new product is designed, choosing the processor is a very difficult task.
Many processors are available on the market, and their lifetime can be (very) short.
In the embedded world, the ARM family is widely deployed.
Within this "single" family, thousands of devices are offered.
Overview of the family
FPGARM4U-ARM family
ARM7, ARM9 and ARM11 are hard-core processors
Cortex are synthesizable cores
Clock: 30 MHz to 2 GHz
Family    Licenses
Cortex™   49
ARM11™    70
ARM9™     249
ARM7™     158
Ref.: http://www.arm.com/products/licensing/licencees.html
FPGARM4U-ARM Cortex family
• ARM Cortex-A Series, applications processors for complex OS and user applications. Supports the ARM and Thumb-2 instruction sets.
• ARM Cortex-R Series, embedded processors for real-time systems. Supports the ARM and Thumb-2 instruction sets.
• ARM Cortex-M Series, deeply embedded processors optimized for cost sensitive applications. Supports the Thumb-2 instruction set only.
FPGARM4U Atmel Families
FPGARM4U ARM9 DIOPSIS
From: http://www.atmel.com/products/diopsis/overview.asp
ARM9 + DSP engine: 10 floating-point operations per cycle, > 1 GFLOP/s
FPGARM4U Atmel mAgicV DSP
FPGARM4U Architecture
AT91SAM9263
• SDRAM 64 MB: 2 x 16M x 16
• Serial Flash 8 MB
• Ethernet 10/100 Mb/s
• CAN Bus 2.0
• RS-232 UART
• I2C
• 2 USB Master FS
• 1 USB Slave FS (12 Mb/s)
• SD/MMC/SDIO
• JTAG
• Extension connector for FPGA4U
• Extension connectors for I/O
FPGARM4U AT91SAM9263
ARM9E-S Technical Reference Manual (Rev 1) (290 pages, revision B, updated 4/08)
http://www.atmel.com/dyn/resources/prod_documents/doc6178.pdf
AT91SAM9263 Preliminary Summary (51 pages, revision FS, updated 9/08)
http://www.atmel.com/dyn/resources/prod_documents/6249s.pdf
AT91SAM9263 Preliminary (1097 pages, revision F, updated 9/08)
http://www.atmel.com/dyn/resources/prod_documents/doc6249.pdf
RevA: > 50 bugs
RevB: > 40 bugs
FPGARM4U AT91SAM9263
ARM926EJ-S core, architecture v5TEJ, 3 instruction sets:
• 32-bit ARM instruction set used in ARM state
• 16-bit Thumb instruction set used in Thumb state
• 8-bit Java bytecode used in Jazelle state
5-stage pipeline (6 stages for Java bytecode), MAC (multiply-accumulate)
ARM 926EJ-S core: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0222b/DDI0222.pdf
FPGARM4U AT91SAM9263
Caches: Instruction 16 kBytes, Data 16 kBytes
Fast internal TCM (Tightly Coupled Memory): SRAM 80 kBytes
Internal memories: 16 kBytes SRAM, 128 kBytes ROM (monitor, boot)
FPGARM4U AT91SAM9263
Fast internal Tightly Coupled Memory: SRAM 80 kBytes, split into 3 areas A, B, C: Instruction TCM, Data TCM, Frame Buffer
ITCM: the user can map this SRAM block anywhere in the ARM926 instruction memory space using CP15 instructions and the TCR
Internal memories: 16 kBytes SRAM, 128 kBytes ROM (monitor, boot)
FPGARM4U Boot Strategies
The system always boots at address 0x0. To ensure maximum boot possibilities, the memory layout can be changed with two parameters.
REMAP lets the user map the internal SRAM bank at address 0x0. This is done by software once the system has booted. (Refer to the section “AT91SAM9263 Bus Matrix” in the product datasheet for more details.)
When REMAP = 0, BMS selects the memory mapped at address 0x0:
• If BMS = 1 at reset, the boot memory is the embedded ROM.
• If BMS = 0 at reset, the boot memory is the memory connected to Chip Select 0 of the External Bus Interface.
The internal memory area mapped between addresses 0x0 and 0x000F FFFF is reserved for this purpose.
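The REMAP/BMS rule can be summarised as a small decision function. This is only an illustrative sketch of the rule stated on the slide: the type and function names are hypothetical, and it does not touch the actual bus-matrix registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: which memory is visible at address 0x0. */
typedef enum {
    BOOT_INTERNAL_SRAM,  /* REMAP = 1 (set by software after boot) */
    BOOT_EMBEDDED_ROM,   /* REMAP = 0, BMS = 1 at reset            */
    BOOT_EBI0_CS0        /* REMAP = 0, BMS = 0 at reset            */
} boot_memory_t;

/* Models the slide's rule: REMAP takes priority, otherwise BMS decides. */
boot_memory_t memory_at_zero(bool remap, bool bms_at_reset)
{
    if (remap) {
        return BOOT_INTERNAL_SRAM;
    }
    return bms_at_reset ? BOOT_EMBEDDED_ROM : BOOT_EBI0_CS0;
}

/* Usage example: the three cases described on the slide. */
void check_memory_at_zero(void)
{
    assert(memory_at_zero(false, true)  == BOOT_EMBEDDED_ROM);
    assert(memory_at_zero(false, false) == BOOT_EBI0_CS0);
    assert(memory_at_zero(true,  false) == BOOT_INTERNAL_SRAM);
}
```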
FPGARM4U BMS = 1, Boot on Embedded ROM
The system boots on the internal Boot Program:
• Boot at slow clock
• Auto baudrate detection
• Downloads and runs an application from external storage media into internal SRAM
• Downloaded code size depends on embedded SRAM size
• Automatic detection of valid application
• Bootloader on a non-volatile memory: SD Card, NAND Flash, SPI DataFlash® and Serial Flash connected on NPCS0 of the SPI0
• Interface with SAM-BA® Graphic User Interface to enable code loading via: serial communication on a DBGU, USB Bulk Device Port
FPGARM4U BMS = 0, Boot on External Memory
• Boot at slow clock
• Boot with the default configuration for the Static Memory Controller: byte select mode, 16-bit data bus, Read/Write controlled by Chip Select; allows boot on 16-bit non-volatile memory.
• The customer-programmed software must perform a complete configuration.
To speed up the boot sequence when booting at 32 kHz on EBI0 CS0 (BMS = 0), the user must:
1. Program the PMC (main oscillator enable or bypass mode).
2. Program and start the PLL.
3. Reprogram the SMC setup, cycle, hold, mode timings registers for CS0 to adapt them to the new clock.
4. Switch the main clock to the new value.
FPGARM4U Boot devices
The Serial Flash memory receives the Bootstrap program; it needs to be programmed beforehand through a serial link and the SAM-BA software.
Then a USB key with an OS kernel such as Linux,
or the network (Ethernet, NFS).
Laboratory exercise
FPGARM4U Extension I/O
Many devices available as slaves:
• MultiMedia Card (2 channels)
• Timers (3x)
• PWM (4 channels), Pulse Width Modulation
• TWI (serial I2C, 1x), Two Wire Interface
• SPI (serial, 2x), Serial Peripheral Interface
• SSC (serial, 2x), Synchronous Serial Controller
• UART (3x), Universal Asynchronous Receiver/Transmitter
• CAN (1x)
• AC97 (sound)
• USB (1 slave)
FPGARM4U Extension I/O
Many devices available as masters (DMA):
• Ethernet 10/100
• USB Master (2x)
• LCD controller
• Image sensor
• 2D graphic controller
• 20 channels peripheral DMA
• 2 general DMA (external request)
FPGARM4U Extension Bus
2 External Bus Interfaces for memories or external programmable interfaces:
EBI0, 6 Chip Selects nCS0[0..5]:
• SDRAM
• SRAM
• ROM, EPROM, Flash
• Compact Flash
• NAND Flash (serial)
• ECC controller
• 8, 16, 32 bits data bus
FPGARM4U Extension Bus
EBI1, 3 Chip Selects nCS1[0..2]:
• SDRAM
• SRAM
• ROM, EPROM, Flash
• NAND Flash (serial)
• ECC controller
• 8, 16, 32 bits data bus
FPGA4U FPGARM4U
An ARM processor is connected to an FPGA through an extension bus.
The ARM9 or an internal DMA unit is the master of the transfers.
A specific External Bus Interface is used: EBI1.
This document explains how to build a bridge in the FPGA for ARM9-to-Avalon transfers.
FPGARM4U FPGA4U internal Avalon interface
To interface the FPGA4U and the Avalon Bus, an EBI-Avalon bridge (Avalon master) needs to be designed.
Laboratory exercise:
• Search the datasheet for the timing of EBI1 used as a 16-bit external memory interface for SRAM.
• Design an Avalon master that receives EBI1 as the external requester of transfers.
• Realize it in VHDL, implement it, simulate it and … test it.
FPGARM4U EBI1 Bus
The EBI1 Bus is used with FPGA4U as interface for FPGA:
Addresses[22..0]   8 MBytes space
Data[15..0]        16 bits data bus
nCS0, nCS2         2 Chip Selects
nOE                Output Enable (Read)
nWr[1..0]          2 Write Select
nWAIT              Wait clk cycles requested
FPGARM4U FPGA4U Bridge: internal Avalon interface
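[Figure: the AT91SAM9263 is connected through EBI1 (Add, Ctrl, nWAIT, Data) to the Bridge inside the Cyclone II FPGA. On the Avalon bus: Bridge, NIOSII, SDRAM Ctrl and LCD Ctrl, with Avalon master (AM) and Avalon slave (AS) ports; external SDRAM and LCD.]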
FPGARM4U FPGA4U Bridge: internal Avalon interface
The Bridge receives the Addr, Ctrl and transfers Data
It maps addresses from the EBI to addresses on the Avalon bus
The bridge can be configured by the EBI1 with the nCS2 selection signal
The PageAddr register is added to the EBI1_Address to generate the Avalon Address
FPGARM4U FPGA4U Bridge interface
[Figure: Bridge inside the Cyclone II FPGA. Register written by the ARM: PageAddr. Registers written by the NIOS: SDRAM_start, SDRAM_Lgth, LCD_FrBuf, LCD_Lgth. Signals: nCS0, nCS2, nWAIT, As_CS, WaitRequest. Adder: AM_Addr[31..0] = PageAddr + EBI1_Addr[22..0].]
FPGARM4U FPGA4U Bridge interface
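[Figure: ARM memory space versus Avalon memory space. On the ARM side, the CS0 window (CS0 BaseAdd, CS0 LgtAdd, 8 MB) and the CS2 window (CS2 BaseAdd). On the Avalon side, the SDRAM (SDRAM_start, SDRAM_Lgth) and LCD_FrBuf. PageAddr is added to the CS0 address to place the 8 MB window in the Avalon memory space.]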
FPGARM4U FPGA4U Bridge interface
ARM Write    NIOS Write
PageAddr     SDRAM_start
             SDRAM_Lgth
             LCD_FrBuf
             LCD_Lgth
The Bridge is seen as a programmable interface from the ARM and the NIOSII
All the registers have to be readable by both processors
FPGARM4U FPGA4U Bridge interface
The number of bits dedicated to the PageAddr and to the adder determines the granularity of the address mapping
ARM address bits 31..23 (0111 0000 0) decode nCS0; bits 22..0 (X) are the EBI1 address.
The 16-bit page (P) is added to bits 31..16; bits 15..0 pass through unchanged.
  H' 70xx xxxx   ARM address, nCS0 address decoding
  H' 00xx xxxx   Address to map on Avalon
+ H' PPPP 0000   Page Address
= H' YYYY xxxx   Avalon Address
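The mapping on this slide (strip the nCS0 decoding, then add the page in the upper 16 bits) can be sketched in C as follows. The function and macro names are hypothetical, and the 16-bit page width is read from the bit diagram; the real bridge is hardware (VHDL), so this is only a software model of the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* EBI1_Addr[22..0]: the 8 MB window seen through nCS0. */
#define EBI1_WINDOW_MASK 0x007FFFFFu

/* Model of the bridge adder: Avalon address = (PageAddr << 16) + EBI1 offset.
 * page_addr is the assumed 16-bit PageAddr register (H' PPPP).
 * Unsigned arithmetic wraps modulo 2^32 if the sum overflows. */
uint32_t bridge_avalon_address(uint16_t page_addr, uint32_t arm_addr)
{
    uint32_t offset = arm_addr & EBI1_WINDOW_MASK; /* H' 00xx xxxx */
    uint32_t page   = (uint32_t)page_addr << 16;   /* H' PPPP 0000 */
    return page + offset;                          /* H' YYYY xxxx */
}

/* Usage example with arbitrary values. */
void check_bridge_avalon_address(void)
{
    assert(bridge_avalon_address(0x0000u, 0x70001234u) == 0x00001234u);
    assert(bridge_avalon_address(0x0123u, 0x70123456u) == 0x01353456u);
}
```

Because the page overlaps address bits 22..16, an adder (not a simple concatenation) is needed, which is why the window can start on any 64 KB boundary in this model.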
FPGARM4U FPGA4U internal Avalon interface
External Bus Interface 1 from the ARM µC
Integrates three External Memory Controllers:
• Static Memory Controller
• SDRAM Controller
• ECC Controller
Additional logic for NAND Flash
Optional full 32-bit External Data Bus
Up to 23-bit Address Bus (up to 8 Mbytes linear)
Up to 3 Chip Selects, configurable assignment:
• Static Memory Controller on NCS0
• SDRAM Controller or Static Memory Controller on NCS1
• Static Memory Controller on NCS2, optional NAND Flash support
FPGARM4U FPGA4U internal Avalon interface
Static Memory Controller (§8.2.2) of EBI1
8-, 16- or 32-bit Data Bus
Multiple access modes supported:
• Byte Write or Byte Select Lines
• Asynchronous read in Page Mode supported (4- up to 32-byte page size)
Multiple device adaptability:
• Compliant with LCD Module
• Control signals programmable setup, pulse and hold time for each Memory Bank
Multiple Wait State Management:
• Programmable Wait State Generation
• External Wait Request (nWAIT signal)
• Programmable Data Float Time
Slow Clock mode supported
FPGARM4U FPGA4U internal Avalon interface
[Figure: AT91SAM9263 EBI1 signals: Addresses[22..0], Data[15..0], nCS0, nCS2, nOE, nWr[1..0], nWAIT.]
EBI1 interface
SRAM + SDRAM + NAND Flash
EBI1 interface
[Figure: A<31..0>, the full 4 GBytes memory space of the ARM9, with nOE, nWr0 and nWr1. nCS0 and nCS2 each select an 8 MB window (8 MB is the maximum space for a CS0 access).]
SMC documentation
SMC: Static Memory Controller
This is the mode used to access the FPGA as a simple 16-bit wide memory (Doc 6249.pdf, Atmel, Chap. 22, p. 197)
What are the access timings?
SMC documentation
Read access, programmable timings:
nRD_: SetUp, Pulse, (Hold), Cycle
nCS_: Rd_SetUp, Rd_Pulse, (Rd_Hold)
SMC documentation
Write access, programmable timings:
nWE_: SetUp, Pulse, (Hold), Cycle
nCS_: Wr_SetUp, Wr_Pulse, (Wr_Hold)
Wait cycles of EBI1/SMC
Internal wait clock cycles are provided through the programmable set-up, pulse and cycle (hold) times of the read and write transfer cycles.
There are 2 modes for external wait cycles with this interface:
• Frozen mode
• Ready mode
The nWAIT signal is synchronized on 2 rising clock edges before being used inside the EBI, so nWAIT can be asynchronous to Clk.
Wait cycles Frozen mode
When the synchronized nWAIT is sampled active, the internal clock counters hold their value until nWAIT is deactivated.
They then resume counting down to finish the transfer cycle.
Wait cycles Ready mode
The pulse runs to its programmed length. If the synchronized nWAIT is active at that clock, wait clock cycles are added until the synchronized nWAIT is deactivated.
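The difference between Frozen and Ready mode can be seen with a small clock-by-clock model. This is a simplified sketch, not the SMC's actual state machine: the function names are hypothetical, `nwait[i] != 0` means the synchronized nWAIT is active at clock `i`, and in Ready mode nWAIT is assumed to be examined from the clock at which the programmed pulse would end.

```c
#include <assert.h>
#include <stddef.h>

/* Synchronized nWAIT at clock clk; inactive beyond the recorded samples. */
static int nwait_active(const int *nwait, size_t n, size_t clk)
{
    return clk < n && nwait[clk] != 0;
}

/* Frozen mode: the pulse down-counter holds on every clock where nWAIT
 * is active. Returns the total pulse duration in clocks. */
size_t frozen_pulse_clocks(size_t pulse, const int *nwait, size_t n)
{
    size_t remaining = pulse;
    size_t clk = 0;
    assert(pulse > 0);
    while (remaining > 0) {
        if (!nwait_active(nwait, n, clk)) {
            remaining--;
        }
        clk++;
    }
    return clk;
}

/* Ready mode: the pulse runs to its programmed length, then is extended
 * one clock at a time while nWAIT is still active. */
size_t ready_pulse_clocks(size_t pulse, const int *nwait, size_t n)
{
    size_t clk = pulse;
    assert(pulse > 0);
    while (nwait_active(nwait, n, clk)) {
        clk++;
    }
    return clk;
}
```

In this model, an nWAIT that is active for two clocks in the middle of a 4-clock pulse stretches the Frozen-mode pulse to 6 clocks but leaves the Ready-mode pulse at 4, since nWAIT has already been released when the pulse ends.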
nWAIT synchronization
For metastability filtering, the nWAIT signal is doubly synchronized before use in the Bus Interface. It therefore only takes effect 2 clocks after its activation!
Thus nWAIT doesn't need to be synchronized before entering the EBI.
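The 2-clock delay of a double synchronizer can be modelled with two flip-flops in series. A minimal sketch, with hypothetical names, that works on the logical "active" level rather than on the active-low pin:

```c
#include <assert.h>
#include <stdbool.h>

/* Two flip-flops in series, both clocked by the bus clock. */
typedef struct {
    bool ff1; /* first stage, may go metastable in real hardware */
    bool ff2; /* second stage, the value actually used by the logic */
} sync2_t;

/* One rising clock edge: shift the input through both stages and
 * return the synchronized value visible after this edge. */
bool sync2_clock(sync2_t *s, bool async_in)
{
    s->ff2 = s->ff1;
    s->ff1 = async_in;
    return s->ff2;
}

/* Usage example: the input is seen only after the second edge. */
void check_sync2(void)
{
    sync2_t s = { false, false };
    assert(sync2_clock(&s, true) == false); /* 1st edge: not yet visible */
    assert(sync2_clock(&s, true) == true);  /* 2nd edge: now effective   */
}
```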
Wait cycles Latency
Set-up and pulse lengths have to be programmed long enough for the nWAIT signal to be synchronized and accepted before the end of the pulse time.
minimal pulse length = nWAIT latency (from device selection) + 2 resynchronization cycles + 1 cycle
p.220
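The minimal pulse length formula translates directly into code. The helper that converts a latency in nanoseconds into clock cycles is an added assumption (rounding up), and the numbers in the usage example are arbitrary, not measured values.

```c
#include <assert.h>

/* Hypothetical helper: latency in ns rounded up to whole clock cycles. */
unsigned ns_to_cycles(unsigned ns, unsigned clk_mhz)
{
    assert(clk_mhz > 0);
    return (ns * clk_mhz + 999u) / 1000u;
}

/* minimal pulse length = nWAIT latency + 2 resynchronization cycles + 1 */
unsigned min_pulse_cycles(unsigned nwait_latency_cycles)
{
    return nwait_latency_cycles + 2u + 1u;
}

/* Usage example: a device that asserts nWAIT 40 ns after selection,
 * with an assumed 100 MHz clock. */
void check_min_pulse(void)
{
    unsigned latency = ns_to_cycles(40u, 100u); /* 4 cycles */
    assert(latency == 4u);
    assert(min_pulse_cycles(latency) == 7u);
}
```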
SMC Registers
Registers to initialize for EBI1 and SMC:
EBI1_CSA SRAM/SDRAM, PullUp, VddIO
0xFFFFED24 => 0xFFFFEC00 + 0x0124
SMC Registers
4 configuration registers for each CS (p.228):
• SMC_SetUp
• SMC_Pulse
• SMC_Cycle
• SMC_Mode
Hold = Cycle - Pulse - SetUp
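Since the hold time is not programmed directly, it can be useful to compute it from the three programmed values. A minimal sketch with a hypothetical function name; it returns -1 when the cycle is too short to contain the set-up and the pulse.

```c
#include <assert.h>

/* Hold = Cycle - Pulse - SetUp, all in clock cycles.
 * Returns -1 if Cycle is shorter than SetUp + Pulse. */
int smc_hold_cycles(unsigned cycle, unsigned pulse, unsigned setup)
{
    if (cycle < pulse + setup) {
        return -1;
    }
    return (int)(cycle - pulse - setup);
}

/* Usage example: Cycle = 3, Pulse = 1, SetUp = 1 gives Hold = 1. */
void check_smc_hold(void)
{
    assert(smc_hold_cycles(3u, 1u, 1u) == 1);
    assert(smc_hold_cycles(2u, 2u, 1u) == -1);
}
```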
SMC Registers
SMC_SetUp: Reset: 0x01 01 01 01
xxx_Setup<5> * 128 + xxx_Setup<4..0>: 0..31 - 128..159 clock cycles
CS0: 0xFFFFEA00, CS2: 0xFFFFEA20
SMC Registers
SMC_Pulse: Reset: 0x01 01 01 01
xxx_Pulse<6> * 256 + xxx_Pulse<5..0>: 1..63 - 256..319 clock cycles
0: unpredictable result!
CS0: 0xFFFFEA04, CS2: 0xFFFFEA24
SMC Registers
SMC_Cycle: Reset: 0x00 03 00 03
xxx_Cycle<8..7> * 256 + xxx_Cycle<6..0>: 1..127 - 256..383 - 512..639 - 768..895 clock cycles
0: unpredictable result!
CS0: 0xFFFFEA08, CS2: 0xFFFFEA28
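The three field encodings (set-up, pulse, cycle) are not plain binary counts: the upper bits add a large fixed offset. The following sketch decodes a raw field value into clock cycles using the formulas from these slides; the function names are hypothetical.

```c
#include <assert.h>

/* xxx_Setup<5> * 128 + xxx_Setup<4..0> */
unsigned smc_setup_cycles(unsigned field)
{
    field &= 0x3Fu;
    return ((field >> 5) & 1u) * 128u + (field & 0x1Fu);
}

/* xxx_Pulse<6> * 256 + xxx_Pulse<5..0> */
unsigned smc_pulse_cycles(unsigned field)
{
    field &= 0x7Fu;
    return ((field >> 6) & 1u) * 256u + (field & 0x3Fu);
}

/* xxx_Cycle<8..7> * 256 + xxx_Cycle<6..0> */
unsigned smc_cycle_cycles(unsigned field)
{
    field &= 0x1FFu;
    return ((field >> 7) & 3u) * 256u + (field & 0x7Fu);
}

/* Usage example: the upper limits quoted on the slides. */
void check_smc_fields(void)
{
    assert(smc_setup_cycles(0x3Fu) == 159u);
    assert(smc_pulse_cycles(0x7Fu) == 319u);
    assert(smc_cycle_cycles(0x1FFu) == 895u);
}
```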
SMC Registers
SMC_Mode: Reset: 0x10 00 10 00
CS0: 0xFFFFEA0C, CS2: 0xFFFFEA2C
• PS: Page Size: 4, 8, 16, 32 Bytes
• PMEN: Page Mode Enable
• TDF_Mode: DataFloatOptimization (1)
• TDF_Cycles: DataFloatCycles, 0..15
• DBW: DataBusWidth: 8, 16, 32, -
• BAT: Byte Access Type
• ExnW_Mode: nWAIT Mode
• Write_Mode: Ctrl by nWR (1) / nCS (0)
• Read_Mode: Ctrl by nRD (1) / nCS (0)
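The CS0 and CS2 addresses given on the register slides follow a regular pattern: a base of 0xFFFFEA00, 0x10 bytes per chip select and 4 bytes per register. The sketch below computes a register address from that pattern. The names are hypothetical, and the 0x10 stride is inferred from the CS0/CS2 addresses on the slides, so it should be checked against the datasheet.

```c
#include <assert.h>
#include <stdint.h>

#define EBI1_SMC_BASE 0xFFFFEA00u /* SMC_SetUp of CS0, from the slides */

/* Register offsets inside one chip-select block. */
enum smc_reg {
    SMC_REG_SETUP = 0x0,
    SMC_REG_PULSE = 0x4,
    SMC_REG_CYCLE = 0x8,
    SMC_REG_MODE  = 0xC
};

/* Address of an SMC register for chip select cs (0..2 on EBI1). */
uint32_t ebi1_smc_reg_address(unsigned cs, enum smc_reg reg)
{
    assert(cs <= 2u);
    return EBI1_SMC_BASE + 0x10u * cs + (uint32_t)reg;
}

/* Usage example: the two extreme addresses quoted on the slides. */
void check_smc_addresses(void)
{
    assert(ebi1_smc_reg_address(0u, SMC_REG_SETUP) == 0xFFFFEA00u);
    assert(ebi1_smc_reg_address(2u, SMC_REG_MODE)  == 0xFFFFEA2Cu);
}
```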
SMC Registers
SMC_Mode:
BAT, Byte Access Type:
• 0: Write: nCS, nWE, nBS<3..0>; Read: nCS, nRD, nBS<3..0>
• 1: Write: nCS, nWR<3..0>; Read: nCS, nRD
EXnW_Mode, External Wait Mode:
• 00 Disabled
• 01 Reserved
• 10 Frozen
• 11 Ready
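A mode word for the FPGA access could be assembled as sketched below. The field encodings (EXnW_Mode 00/01/10/11, DBW 8/16/32) come from the slides, but the bit positions are not given there: the ones used here (Read_Mode bit 0, Write_Mode bit 1, EXnW_Mode bits 5..4, BAT bit 8, DBW bits 13..12) are assumptions that must be checked against the SMC_Mode register description in the datasheet. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    EXNW_DISABLED = 0, /* 00 */
    EXNW_RESERVED = 1, /* 01 */
    EXNW_FROZEN   = 2, /* 10 */
    EXNW_READY    = 3  /* 11 */
} exnw_mode_t;

typedef enum { DBW_8 = 0, DBW_16 = 1, DBW_32 = 2 } dbw_t;

/* Builds an SMC_Mode value. Bit positions are ASSUMED, see lead-in. */
uint32_t smc_mode_word(bool read_by_nrd, bool write_by_nwr,
                       exnw_mode_t exnw, bool bat, dbw_t dbw)
{
    assert(exnw != EXNW_RESERVED);
    return (read_by_nrd  ? 1u : 0u)
         | ((write_by_nwr ? 1u : 0u) << 1)
         | ((uint32_t)exnw << 4)
         | ((bat ? 1u : 0u) << 8)
         | ((uint32_t)dbw << 12);
}

/* Extracts EXnW_Mode from a mode word (same assumed position). */
exnw_mode_t smc_mode_exnw(uint32_t mode)
{
    return (exnw_mode_t)((mode >> 4) & 3u);
}
```

Under these assumptions, the reset value 0x10 00 10 00 decodes to a 16-bit data bus with external wait disabled, which matches the default boot configuration described earlier.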
Memory Mapping
0x7000 0000
0x9000 0000
Design of the Bridge
• SMC access from ARM μC to FPGA
• Master/Slave Avalon Bus
• Needs to be initialized by NIOSII
• Functions of the bridge can be programmed by ARM with nCS2
• Access to memory: mapping of 8 MBytes to the full 4 GBytes of the Avalon Bus (in 32 bits mode)
• Mapping by a register and an adder
• FIFO for advanced features: prefetch of next data in Read cycle; end write cycle before real end of write cycle
Design of the Bridge
Hypotheses:
• ARM clock: 200 MHz
• FPGA clock: 50 MHz
• Simple read/write cycle
To do, at least, to determine the bridge timings:
• Draw the read and write transfer cycles from EBI1 to Avalon
• Synchronize with a double flip-flop in the bridge: nCS - nOE for read; nCS - nWr1 / nCS - nWr0 for write
• Determine the SetUp/Pulse/Cycle for read and write transfers for correct nWAIT generation (mode?)
FPGARM4U: Software design
FPGARM4U ARM9 registers
The processor has 6 working spaces:
• System and User
• Fast interrupt (FIQ)
• Supervisor
• Abort
• Interrupt (IRQ)
• Undefined
Some registers are specific to the current space
FPGARM4U ARM9 registers
Some registers have dedicated functions:
• Program Counter (PC, r15)
• Link Register (LR, r14)
• Stack Pointer (SP, r13)
• Current Program Status Register (CPSR)
The SPSRs are the Saved Program Status Registers: the CPSR is copied into SPSR_xxx when the mode changes
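The link between the processor mode and the register bank in use can be sketched as follows. The mode values are the standard CPSR mode-bit encodings of the ARM architecture; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPSR mode bits M[4:0]. */
typedef enum {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12, MODE_SVC = 0x13,
    MODE_ABT = 0x17, MODE_UND = 0x1B, MODE_SYS = 0x1F
} arm_mode_t;

/* The 6 working spaces listed above. */
typedef enum {
    BANK_SYS_USR, BANK_FIQ, BANK_SVC, BANK_ABT, BANK_IRQ, BANK_UND
} reg_bank_t;

arm_mode_t cpsr_mode(uint32_t cpsr)
{
    return (arm_mode_t)(cpsr & 0x1Fu);
}

/* System and User share one register bank; each exception mode has its own. */
reg_bank_t register_bank(arm_mode_t mode)
{
    switch (mode) {
    case MODE_FIQ: return BANK_FIQ;
    case MODE_SVC: return BANK_SVC;
    case MODE_ABT: return BANK_ABT;
    case MODE_IRQ: return BANK_IRQ;
    case MODE_UND: return BANK_UND;
    case MODE_USR:
    case MODE_SYS: return BANK_SYS_USR;
    }
    assert(0 && "invalid mode bits");
    return BANK_SYS_USR;
}

/* Only the exception modes have an SPSR to receive the saved CPSR. */
bool has_spsr(arm_mode_t mode)
{
    return mode != MODE_USR && mode != MODE_SYS;
}
```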
FPGARM4U ARM9 registers
In Thumb mode, fewer registers are available
FPGARM4U BOOT
Start-up of the ARM:
• The processor starts at 0x00000000.
• The internal ROM searches for the serial memory, which needs to be initialized.
• Connection as USB device and SAM-BA GUI from Atmel.
• Program the Serial Flash Memory with a Bootstrap program.
FPGARM4U
The Bootstrap program initializes the processor and some interfaces.
Search for an OS to download:
• As USB Master, read a USB key: a kernel needs to be loaded on the key.
• With Ethernet and NFS, load the kernel from a server.
FPGARM4U
Laboratory exercise: install a Linux system on FPGARM4U.
Follow the instructions on:
http://fpga4u.epfl.ch/wiki/FPGARM4ULinux
You need:
• An FPGARM4U board
• A USB key
• A serial adapter