apb_doc

1

ABSTRACT

ARM introduced the Advanced Microcontroller Bus Architecture (AMBA) 4.0

specifications which includes Advanced eXtensiable Interface (AXI) 4.0. AMBA bus protocol

has become the de facto standard SoC bus. That means more and more existing IPs must be able

to communicate with AMBA 4.0 bus. Based on AMBA 4.0 bus, we designed an Intellectual

Property (IP) core of Advanced Peripheral Bus (APB) Bridge, which translates the AXI4.0-lite

transactions into APB 4.0 transactions. The bridge provides an interfaces between the high-

performance AXI bus and low-power APB domain. We then interface this bridge to an UART

transmitter and receiver.

AMBA bus is widely used in the high-performance SoC designs. The AMBA

specification defines an on-chip communication standard for designing high-performance

embedded microcontrollers.

2

1. INTRODUCTION

There are many companies that develop core IP for SoC products. The interfaces to these

cores can differ from company to company and can sometimes be proprietary in nature. The SoC

developer then must expend time, effort, and money to create “bridge” or “glue” logic that

allows all of the cores inside the SoC to communicate properly with each other. Incompatible

interfaces are thus barriers to both IP developers and SoC developers. SoC integrated circuits

envisioned by this subcommittee span a wide breadth of applications, target system costs, and

levels of performance and integration.

Integrated circuits have entered the era of System-on-a-Chip (SoC), which refers to

integrating all components of a computer or other electronic system into a single chip. It may

contain digital, analog, mixed-signal, and often radio-frequency functions – all on a single chip

substrate. With the increasing design size, IP is an inevitable choice for SoC design. And the

widespread use of all kinds of IPs has changed the nature of the design flow, making On-Chip

Buses (OCB) essential to the design.

Of all OCBs existing in the market, the AMBA bus system is widely used as the de facto

standard SoC bus. On March 8, 2010, ARM announced availability of the AMBA 4.0

specifications. As the de facto standard SoC bus, AMBA bus is widely used in the high-

performance SoC designs. The AMBA specification defines an on-chip communication standard

for designing high-performance embedded microcontrollers.


specifications in March 2010, which includes Advanced extensible Interface (AXI) 4.0. AMBA

bus protocol has become the de facto standard SoC bus. That means more and more existing IPs

must be able to communicate with AMBA 4.0 bus. Based on AMBA 4.0 bus, This design is an

Intellectual Property (IP) core of AXI(Advanced extensible Interface) Lite to APB(Advanced

Peripheral Bus) Bridge, which translates the AXI4.0-lite transactions into APB 4.0 transactions.

The bridge provides interfaces between the high-performance AXI bus and low-power APB

domain.

3

2.LITERATURE REVIEW

The Advanced Microcontroller Bus Architecture (AMBA) is used as the on-chip bus

in system-on-a-chip (SoC) designs. Since its inception, the scope of AMBA has gone far beyond

microcontroller devices, and is now widely used on a range of ASIC and SoC parts including

applications processors used in modern portable mobile devices like smart phones.

The AMBA protocol is an open standard, on-chip interconnect specification for the

connection and management of functional blocks in a System-on-Chip (SoC). It facilitates right-

first-time development of multi-processor designs with large numbers of controllers and

peripherals.

2.1 About System on Chip (SoC)

SOC is a device which integrates all the computer devices on to one chip, which runs

with desktop operating systems like windows, Linux etc. The contrast with a microcontroller is

one of degree. Microcontrollers typically have under 100 KB of RAM (often just a few

kilobytes) and often really are single-chip-systems, whereas the term SoC is typically used with

more powerful processors, capable of running software such as the desktop versions

of Windows and Linux, which need external memory chips (flash, RAM) to be useful, and which

are used with various external peripherals. In short, for larger systems system on a chip is

hyperbole, indicating technical direction more than reality: increasing chip integration to reduce

manufacturing costs and to enable smaller systems. Many interesting systems are too complex to

fit on just one chip built with a process optimized for just one of the system's tasks.

When it is not feasible to construct an SoC for a particular application, an alternative is

a system in package (SiP) comprising a number of chips in a single package. In large volumes,

SoC is believed to be more cost-effective than SiP since it increases the yield of the fabrication

and because its packaging is simpler.

4

2.1.1 Structure of SoC

A typical SoC consists of:

A microcontroller, microprocessor or DSP core(s). Some SoCs—called multiprocessor

system on chip (MPSoC)—include more than one processor core.

Memory blocks including a selection of ROM, RAM, EEPROM and flash memory.

Timing sources including oscillators and phase-locked loops.

Peripherals including counter-timers, real-time timers and power-on reset generators.

External interfaces including industry standards such

as USB, FireWire, Ethernet, USART, SPI.

Analog interfaces including ADCs and DACs.

Voltage regulators and power management circuits.

These blocks are connected by either a proprietary or industry standard bus such as

the AMBA bus from ARM Holdings. DMA controllers route data directly between

external interfaces and memory, bypassing the processor core and thereby increasing the data

throughput of the SoC.

5

Figure 2.0 ARM-SoC Block Diagram

2.2 About Advanced Microcontroller Bus Architecture (AMBA)

Advanced Microcontroller Bus Architecture (AMBA) specification defines an on chip

communications standard for designing high-performance embedded microcontrollers. The de

facto standard for on-chip communication.

The AMBA protocol is an open standard, on-chip interconnect specification for the

connection and management of functional blocks in a System-on-Chip (SoC). It facilitates right-

first-time development of multi-processor designs with large numbers of controllers and

peripherals. AMBA promotes design re-use by defining a common backbone for SoC modules

using specifications for ACE, AXI, AHB, APB and ATB.

AXI4-

AXI4-

APB Bus

6

AMBA 4 is the latest addition to the AMBA family adding five new interface

protocols: ACE for full cache coherency between processors; ACE-Lite for I/O coherency; AXI4

to maximize performance and power efficiency; AXI4-Lite and AXI4-Stream ideal for

implementation in FPGA.

Another family of buses that the Power.org Bus Architectures TSC decided to investigate

is the ARM AMBA (Advanced Microcontroller Bus Architecture) set of buses: APB, AHB and

AXI. The APB (Advanced Peripheral Bus) is considered a low-performance, peripheral level

bus. The AHB (Advanced High-performance Bus) is considered (despite the bus name) a mid-

performance bus and the AXI (Advanced eXtensible Interface) bus is considered a high-

performance bus. The AMBA family of buses was chosen for consideration because of their

widespread acceptance in the industry and large amount of existing IP cores.

We note in passing that the APB bus is not always used as a standalone bus. Some

AMBA based IP cores will use a higher performing bus like AHB or AXI for the primary

function, but use an APB port for access to configuration registers.

2.2.1 AMBA Specifications

AMBA enables IP re-use

IP re-use is an essential component in reducing SoC development costs and timescales.

AMBA specifications provide the interface standard that enables IP re-use meeting the essential

requirements of:

1. Flexibility

IP re-use requires a common standard while supporting a wide variety of SoCs with different

power, performance and area requirements. With its ACE, AXI, AHB and APB interface

protocols; AMBA 4 has the flexibility to match every requirement.

2. Multi-Layer

7

The Multi-layer architecture acts as a crossbar switch between masters and slaves in an AMBA 3

AXI or AHB system. The parallel links allow the bandwidth of the interconnect to support the

peak bandwidth of the masters without increasing the frequency of the interconnect.

3. Compatibility

It is a standard interface specification that ensures compatibility between IP components from

different design teams or vendors. The AMBA specification is available as both a written

specification as well as a set of assertions that unambiguously define the interface protocol, thus

ensuring this level of compatibility.

4. Support

The wide adoption of AMBA specifications throughout the semiconductor industry has driven a

comprehensive market in third party IP products and tools to support the development of AMBA

based systems. The availability of AMBA assertions promotes this industry wide participation.

2.2.2Design principles of AMBA

The important aspect of a SoC is not only which components or blocks it houses, but also

how they are interconnected. AMBA is a solution for the blocks to interface with each other.

The objective of the AMBA is to:

1. facilitate right-first-time development of embedded microcontroller products with one or

more CPUs, GPUs or signal processors,

2. be technology independent, to allow reuse of IP cores, peripheral and system microcells

across diverse IC processes,

3. encourage modular system design to improve processor independence, and the

development of reusable peripheral and system IP libraries

Minimize silicon infrastructure while supporting high performance and low power on-chip

communication.

8

2.2.3AMBA Versions

1.VER1.0(ASB&APB)

2.VER2.0(AHB)

3.VER3.0(AXI,ATB)

4. VER 4.0 (AXI 4,AXI LITE,AXI STREAM (AXI))

2.2.4 AMBA 4.0 specification buses/interfaces

o Advanced eXtensible Interface (AXI)

AXI4

AXI4-Lite

AXI-4 Stream

o Advanced High-performance Bus (AHB)

o Advanced System Bus (ASB)

o Advanced Peripheral Bus (APB)

o Advanced Trace Bus (ATB)

In this project these are to be used Advanced eXtensible Interface (AXI4-Lite)&

Advanced Peripheral Bus (APB) because these are high bandwidth data transfer between high

performance devices like processor, DMA, RAM etc..,. and Peripheral devices.

2.3 About Advanced eXtensible Interface (AXI)

In previous SOC design a lot of buses are used which are introduced by ARM. ARM

introduced ASB the first High data transfer between modules, then again ARM introduced AHB

and AXI in followed version of AMBA. ARM AMBA (Advanced Microcontroller Bus

Architecture) set of buses: ASB, AHB and AXI. The AHB (Advanced High-performance Bus) is

considered (despite the bus name) a mid-performance bus and the AXI (Advanced eXtensible

Interface) bus is considered a high-performance bus. The AMBA family of buses was chosen for

consideration because of their widespread acceptance in the industry and large amount of

existing IP cores.

History of High-Performance Buses

9

1. ASB (Advanced System Bus)

2. AHB (Advanced High-performance Bus)

3. AXI (Advanced eXtensible Interface)

1. AXI4

2. AXI4-Lite

3. AXI4-Stream

1. ASB (Advanced System Bus)

The Advanced System Bus (ASB) specification defines a high-performance bus that can be

used in the design of high performance 16 and 32-bit embedded microcontrollers. AMBA ASB

supports the efficient connection of processors, on-chip memories and off-chip external memory

interfaces with low-power peripheral microcell functions. The bus also provides the test

infrastructure for modular macro cell test and diagnostic access. The features of ASB are;

1. High performance

2. Pipelined operation

3. Burst transfers

4. Multiple bus masters

2. AHB (Advanced High-performance Bus)

AHB is the mid-performance bus in the AMBA family. The protocol defines a 32-bit

address bus (HADDR), but this has been extended in some implementations. The read and write

data buses (HRDATA and HWDATA) may be defined under the specification as 2n bits wide,

from 8-bit to 1024-bit, but the most common implementation has been 32-bit. With 27 additional

control signals, and assuming the most common address and data bus width of 32-bit, AHB can

have up to 123 I/O for each AHB master.

10

Up to 16 AHB masters may be connected to a central interconnect which arbitrates

between master requests, and multiplexes the winning request and corresponding transfer

qualifiers to the slaves. Slave read data is multiplexed back to the masters.

In addition to previous release, it has the following features:

1. single edge clock protocol

2. split transactions

3. several bus masters

4. burst transfers

5. pipelined operations

6. single-cycle bus master handover

7. non-tristate implementation

8. Large bus-widths (64/128 bit).

A simple transaction on the AHB consists of an address phase and a subsequent data

phase (without wait states: only two bus-cycles). Access to the target device is controlled

through a MUX (non-tristate), thereby admitting bus-access to one bus-master at a time.

3. AXI (Advanced eXtensible Interface)

AXI is the high-performance bus in the AMBA family. The architecture defines three

write channels and two read channels. The write channels are address, write data, and response.

The read channels are address and read data. The address channels include 32-bit address buses,

AWADDR and ARADDR, but this could be extended in some implementations. The write and

read data buses (WDATA and RDATA) may be defined under the specification as any 2n

number, from 8-bit to 1024-bit. With the assumption that both the address and data buses are 32-

bit, and that the data buses are 128-bit, the write address, write data, and write response channels

would require 56, 139, and 8 I/O, respectively. The read address and read data channels would

11

require 56 and 137 I/O, respectively. Thus, each 128-bit AXI master has 396 I/O total. AXI

masters and slaves are connected together through a central interconnect, which routes master

requests and write data to the proper slave, and returning read data to the requesting master. The

interconnect also maintains ordering based on tags if, for example, a single master pipelines read

requests to different slaves.

AXI uses a handshake between VALID and READY signals. VALID is driven by the

source, and READY is driven by the destination. Transfer of information, either address and

control or data, occurs when both VALID and READY are sampled high. AXI, the third

generation of AMBA interface defined in the AMBA 3 specification, is targeted at high

performance, high clock frequency system designs and includes features which make it very

suitable for high speed sub-micrometer interconnect.

1. separate address/control and data phases

2. support for unaligned data transfers using byte strobes

3. burst based transactions with only start address issued

4. issuing of multiple outstanding addresses

5. Easy addition of register stages to provide timing closure.

The AXI Revisions (Advanced eXtensible Interface)

1. AXI4

2. AXI4-Lite

3. AXI4-Stream

The table shows the differences between Interface parameters for above revisions.

12

Table 2.0 Interface parameters for AXI4 and AXI4-Lite

13

Table 2.1 Interface parameters for AXI4-Stream

1. AXI4 signals modified in AXI4-Lite

The AXI4-Lite interface does not fully support the following signals:

RRESP, BRESP

2. AXI4 signals not supported in AXI4-Lite

The AXI4-Lite interface does not support the following signals:

a) AWLEN, ARLEN The burst length is defined to be 1, equivalent to an AxLEN value of

zero.

b) AWSIZE, ARSIZE All accesses are defined to be the width of the data bus.

c) AWBURST, ARBURST The burst type has no meaning because the burst length is 1.

d) AWLOCK, ARLOCK All accesses are defined as Normal accesses, equivalent to an

AxLOCK value of zero.

e) AWCACHE, ARCACHE All accesses are defined as Non-modifiable, Non-buffer able,

equivalent to an Ax CACHE value of 0b0000.

Note:-AXI4-Lite requires a fixed data bus width of either 32-bits or 64-bits

14

2.4 About the Advanced Peripheral Bus (APB)

The Advanced Peripheral Bus (APB) is part of the Advanced Microcontroller Bus

Architecture (AMBA) protocol family. It defines a low-cost interface that is optimized for

minimal power consumption and reduced interface complexity.

The APB protocol is not pipelined, use it to connect to low-bandwidth peripherals that do not

require the high performance of the AXI protocol.

The APB protocol relates a signal transition to the rising edge of the clock, to simplify the

integration of APB peripherals into any design flow. Every transfer takes at least two cycles.

The APB can interface with:

1. AMBA Advanced High-performance Bus (AHB)

2. AMBA Advanced High-performance Bus Lite (AHB-Lite)

3. AMBA Advanced Extensible Interface (AXI)

4. AMBA Advanced Extensible Interface Lite (AXI4-Lite)

You can use it to access the programmable control registers of peripheral devices.

2.4.1APB revisions

The APB Specification Rev E, released in 1998, is now obsolete and is superseded by the

following three revisions:

1. AMBA 2 APB Specification

2. AMBA 3 APB Protocol Specification v1.0

3. AMBA APB Protocol Specification v2.0.

Out of these AMBA APB Protocol Specification v2.0 is used in this project.

AMBA 2 APB Specification

15

This specification defines the interface signals; the basic read and write transfers and the two

APB components the APB Bridge and the APB slave. This version of the specification is

referred to as APB2.

AMBA 3 APB Protocol Specification v1.0

The AMBA 3 APB Protocol Specification v1.0 defines the following additional functionality:

• Wait states.

• Error reporting.

The following interface signals support this functionality:

PREADY A ready signal to indicate completion of an APB transfer.

PSLVERR An error signal to indicate the failure of a transfer.

This version of the specification is referred to as APB3.

AMBA APB Protocol Specification v2.0

The AMBA APB Protocol Specification v2.0 defines the following additional functionality:

• Transaction protection.

• Sparse data transfer.

The following interface signals support this functionality:

PPROTA protection signal to support both non-secure and secure transactions on APB.

PSTRBA write strobe signal to enable sparse data transfer on the write data bus.

This version of the specification is referred to as APB4.

2.5Different types of Bridges

16

1. ASB to APB Bridge

2. AHB to APB Bridge

3. AXI4-Lite to APB Bridge

Out of these AXI4-Lite to APB Bridge is used for this project. This is the latest bridge.

1. ASB to APB Bridge

The APB Bridge provides an interface between the ASB and the Advanced Peripheral Bus

(APB). It continues the pipelining of the ASB by inserting wait cycles on the ASB only when

they are needed. It inserts them for burst transfers or read transfers when the ASB must wait for

the APB.

Fig. 2.1 ASB to APB Bridge Block Diagram

The implementation of this block contains:

1. a state machine, which is independent of the device memory map

17

2. ASB address, and data bus latching

3. Combinatorial address decoding logic to produce PSELxsignals.

To add new peripherals, or alter the system memory map only the address decode section needs

to be modified

2. AHB to APB Bridge

The AHB2APB interfaces the AHB to the APB. It buffers address, control and data from the

AHB, drives the APB peripherals and returns data and response signals to the AHB. It decodes

the address using an internal address map to select the peripheral. The AHB2APBis designed to

operate when the APB and AHB clocks have the same frequency and phase.

In addition to supporting all basic requirements for AMBA compliance, the AHB2APB provides

enhancements to the APB protocol. With AHB2APB, an APB peripheral can insert wait states by

holding its pready signal low for the required number of cycles.

In the AMBA specification, APB accesses are word-wide (32 bits). The AHB2APB provides

the signal pbyte_enable[3:0] to allow byte and half-word accesses. These can be used by the

APB peripheral as necessary.

The AHB2APB does not perform any alignment of the data, but transfers data from the AHB to

the APB for write cycles and from the APB to the AHB for read cycles.

The AHB2APB does not support burst transfers and converts any burst transfers into a series

of APB accesses. The AHB slave interface supplied by the AHB2APB does not make use of the

split response protocol.

18

Fig. 2.2 AHB to APB Bridge Block Diagram

3. AXI4-Lite to APB Bridge

In this study, we focused mainly on the implementation aspect of an AXI4-Lite to APB

Bridge. The AXI4-Lite to APB Bridge provides an interface between the high-performance AXI

domain and the low power APB domain. It appears as a slave on AXI bus but as a master on

APB that can access up to sixteen slave peripherals. Read and write transfers on the AXI bus are

converted into corresponding transfers on the APB. The AXI4-Lite to APB bridge Block

diagram is shown in Figure. (2.3)

19

Fig. 2.3 AXI to APB Bridge Block Diagram

Features of bridge

The Xilinx AXI to APB Bridge is a soft IP core with these features:

1. AXI interface is based on the AXI4-Lite specification

2. APB interface is based on the APB3 specification, supports optional APB4 selection

3. Supports 1:1 (AXI:APB) synchronous clock ratio

4. Connects as a 32-bit slave on 32-bit AXI4-Lite

5. Connects as a 32-bit master on 32-bit APB3/APB4

6. Supports optional data phase time out

2.6 Connections of an Asynchronous FIFO

The Asynchronous FIFO is a First-In-First-Out memory queue with control logic that

performs management of the read and write pointers, generation of status flags, and optional

handshake signals for interfacing with the user logic.

Almost Full Flag: Generates an Almost Full signal, indicating that one additional write can be

performed before the FIFO is full.

20

Almost Empty Flag: Generates an Almost Empty signal, indicating that one additional read can

be performed before the FIFO is empty.

The optional handshaking control signals (acknowledge and/or error) can be enabled via the

Handshaking Options .

• Read Acknowledge Flag: Asserted active on the clock cycle after a successful read has

occurred. This signal, when selected, can be made active high or low through the GUI.

• Read Error Flag: Asserted active on the clock cycle after a read from the FIFO was attempted,

but not successful. This signal, when selected, can be made active high or low through the GUI.

• Write Acknowledge Flag: Asserted active on the clock cycle after a successful write has

occurred. This signal, when selected, can be made active high or low through the GUI.

• Write Error Flag: Asserted active on the clock cycle after a write to the FIFO was attempted,

but not successful. This signal, when selected, can be made active high or low through the GUI.

Fig 2.4Connections of an Asynchronous FIFO

21

3. DESIGN BACKGROUND

Integrated circuits have entered the era of System-on-a-Chip (SoC), which refers to

integrating all components of a computer or other electronic system into a single chip. It may

contain digital, analog, mixed-signal, and often radio-frequency functions – all on a single chip

substrate. With the increasing design size, IP is an inevitable choice for SoC design. And the

widespread use of all kinds of IPs has changed the nature of the design flow, making On-Chip

Buses (OCB) essential to the design.

The main criteria used to select buses for consideration were:

1. The level of acceptance and penetration in the industry: It would hinder acceptance of the bus

hierarchy if an obscure set of buses were chosen that did not have an existing industry support

structure and library of useful available IP.

2. The technical ability to support the wide range of applications found in typical SoCs: It would

be a mistake to specify a set of buses that lacked the features or sufficient bandwidth to support

common SoC applications.

3. The ease of interfacing to Power Architecture processors: Given that the main goal of

Power.org is to promote Power Architecture processors, it would make little sense for the

Power.org Bus

4. Architectures to specify a set of buses that were difficult or cumbersome to attach to

Embedded Power Processors or would add unnecessary bridges in the main transaction paths.

5. The ability to easily obtain a license for the bus architecture to create verification and core IP.

Buses that have been restricted to internal development only were not considered in this report.

Of all OCBs existing in the market, the AMBA bus system is widely used as the de facto

standard SoC bus. On March 8, 2010, ARM announced availability of the AMBA 4.0

specifications. As the de facto standard SoC bus, AMBA bus is widely used in the high-

22

performance SoC designs. The AMBA specification defines an on-chip communication standard

for designing high-performance embedded microcontrollers.


specifications in March 2010, which includes advanced extensible Interface (AXI) 4.0. AMBA

bus protocol has become the de facto standard SoC bus. That means more and more existing IPs

must be able to communicate with AMBA 4.0 bus. Based on AMBA 4.0 bus, we designed an

Intellectual Property (IP) core of AXI (Advanced extensible Interface) Lite to APB (Advanced

Peripheral Bus) Bridge, which translates the AXI4.0-lite transactions into APB 4.0 transactions.

The bridge provide Interfaces between the high-performance AXI bus and low-power APB

domain. Advanced Microcontroller Bus Architecture (AMBA) specification defines an on chip

communications standard for designing high-performance embedded microcontrollers.

AMBA Versions

1.VER1.0(ASB&APB)

2.VER2.0(AHB)

3.VER3.0(AXI,ATB)

4.VER 4.0 (AXI 4,AXI LITE,AXI STREAM (AXI))

AMBA 4.0 specification buses/interfaces

1.AdvancedeXtensible Interface (AXI)

AXI4

AXI4-Lite

AXI-4 Stream

Advanced High-performance Bus (AHB)

Advanced System Bus (ASB)

Advanced Peripheral Bus (APB) Advanced Trace Bus (ATB)

23

In this project these are to be used Advanced eXtensible Interface (AXI4-Lite)&

Advanced Peripheral Bus (APB)because these are high bandwidth data transfer between high

performance devices like processor, DMA, RAM etc..,. and Peripheral devices.

4.PROJECT AIM

An Intellectual Property (IP) core of AXI4(Advanced eXtensible Interface) Lite to

APB(Advanced Peripheral Bus) Bridge, which translates the AXI4.0-lite transactions into APB

4.0 transactions. The bridge provides interfaces between the high-performance AXI bus and low-

power APB domain.

4.1 Existing Model

Overview:

The AXI to APB Bridge translatesAXI4-Lite transactions into APB transactions. The

bridge functions as a slave on the AXI4-Liteinterfaceandasa master on the APB interface. The

AXI to APB Bridge main use model is to connect the APB slaves with AXI masters. Both AXI4-

Lite and APB transactions are happened during rising edge of the clock. The AXI to APB Bridge

block diagram is shown in Figure (4.1) and described in subsequent sections.

Fig. 4.1. AXI to APB Bridge Block Diagram

24

AXI4-Lite Slave Interface

The AXI4-Lite Slave Interface module provides a bidirectional slave interface to the

AXI. The AXI address and data bus widths are always fixed to 32-bits and 1024bits. When both

write and read transfers are simultaneously requested on AXI4-Lite, the read request is given

more priority than the write request. This module also contains the data phase time out logic for

generating OK response on AXI interface when APB slave does not respond.

APB Master Interface

The APB Master module provides the APB master interface on the APB. This interface

can be APB3 or APB4, which can be selected by setting the generic C_M_APB_PROTOCOL.

When C_M_APB_PROTOCOL=apb4, the M_APB_PSTRB, and M_APB_PPROT signals are

driven at the APB Interface. The APB address and data bus widths are fixed to 32-bits.

4.2 Features of the project

We provide an implementation of AXI4-Lite to APB Bridge which has the following features:

1. 32-bit AXI slave and APB master interfaces.

2. PCLK clock domain completely independent of ACLK clock domain.

3. Support up to 16 APB peripherals.

4. Support the PREADY signal which translates to wait states on AXI.

5. An error on any transfer results in SLVERR as the AXI read/write response.

4.3 Proposed Model

In the proposed model we try to interface the APB bridge to UART transmitter and a

receiver. AXI4-LITE to APB bridge we interface it to an UART Transmitter and receiver to send

and receive bits from high performance AXI4-LITE bus to UART as axi_uart interface.

For transmitting and receiving data bits from or to UART transmitter and receiver we

append start bit as of to show the initialization of sequence and stop bit to show completion of

25

sequence. Apart from the start and stop bits we also have parity bit which shows the odd/even

parity of the sequence. So we have each one bit of a start bit, stop bit and parity bit, and also 8

bits of data.

In UART instead of using the normal clock cycle we use the baud clock which is 8 clock cycles,

which means for every 8 clock cycles or for every one baud clock cycle the data sequence gets

transmitted/ received .

When the write address is selected the data sent (WDATA) of 8 bits along with start , stop and

parity bit gets transmitted on to the uart transmitter.

similarly during the receiving when the read register is selected the data from RDATA is read

onto the uart receiver .

For AXI-UART the AWADDR(write address) gives the address of the UART register and gets

selected whether it is to read, write or control. When the address on AWADDR is 00 the write

data is stored in WADDR, when the address on AWADDR is 02 the data bits stored on WDATA

is transmitted into UART_tx.

When the read address (ARADDR) is 01 the data bits present on the UART_rx is read onto the

RDATA ( read data).

UART

A Universal Asynchronous Receiver/Transmitter, abbreviated UART is a type of "asynchronous

receiver/transmitter", a piece of computer hardware that translates data between parallel and

serial forms. UARTs are commonly used in conjunction with communication standards such as

EIA, RS-232, RS-422 or RS-485. The universal designation indicates that the data format and

transmission speeds are configurable and that the actual electric signaling levels and methods

(such as differential signaling etc.) typically are handled by a special driver circuit external to the

UART.

A UART is usually an individual (or part of an) integrated circuit used for serial communications

over a computer or peripheral device serial port. UARTs are now commonly included in

microcontrollers.

26

Transmitting and receiving serial data

` The Universal Asynchronous Receiver/Transmitter (UART) takes bytes of data and

transmits the individual bits in a sequential fashion. [1] At the destination, a second UART re-

assembles the bits into complete bytes. Each UART contains a shift register, which is the

fundamental method of conversion between serial and parallel forms. Serial transmission of

digital information (bits) through a single wire or other medium is much more cost effective than

parallel transmission through multiple wires. The UART usually does not directly generate or

receive the external signals used between different items of equipment. Separate interface

devices are used to convert the logic level signals of the UART to and from the external

signaling levels. External signals may be of many different forms. Examples of standards for

voltage signaling are RS-232, RS-422 and RS-485 from the EIA. Historically, current (in current

loops) was used in telegraph circuits. Some signaling schemes do not use electrical wires.

Examples of such are optical fiber, IrDA (infrared), and (wireless) Bluetooth in its Serial Port

Profile (SPP). Some signaling schemes use modulation of a carrier signal (with or without

wires). Examples are modulation of audio signals with phone line modems, RF modulation with

data radios, and the DC-LIN for power line communication.

Communication may be simplex (in one direction only, with no provision for the receiving

device to send information back to the transmitting device), full duplex (both devices send and

receive at the same time) or half duplex (devices take turns transmitting and receiving).

Character framing

27

The idle, no data state is high-voltage, or powered. This is a historic legacy from telegraphy, in

which the line is held high to show that the line and transmitter are not damaged. Each character

is sent as a logic low start bit, a configurable number of data bits (usually 8, but legacy systems

can use 5, 6, 7 or 9), an optional parity bit, and one or more logic high stop bits. The start bit

signals the receiver that a new character is coming. The next five to eight bits, depending on the

code set employed, represent the character. Following the data bits may be a parity bit. The next

one or two bits are always in the mark (logic high, i.e., '1') condition and called the stop bit(s).

They signal the receiver that the character is completed. Since the start bit is logic low (0) and

the stop bit is logic high (1) there are always at least two guaranteed signal changes between

characters.

Obviously a problem exists if a receiver detects a line that is low for more than one character

time. This is called a "break." It is normal to detect breaks to disable a UART or switch to an

alternative channel. Sometimes remote equipment is designed to reset or shut down when it

receives a break. Premium UARTs can detect and create breaks.

Receiver

All operations of the UART hardware are controlled by a clock signal which runs at a

multiple of the data rate. For example, each data bit may be as long as 16 clock pulses. The

receiver tests the state of the incoming signal on each clock pulse, looking for the beginning of

the start bit. If the apparent start bit lasts at least one-half of the bit time, it is valid and signals

the start of a new character. If not, the spurious pulse is ignored. After waiting a further bit time,

the state of the line is again sampled and the resulting level clocked into a shift register. After the

required number of bit periods for the character length (5 to 8 bits, typically) have elapsed, the

contents of the shift register is made available (in parallel fashion) to the receiving system. The

UART will set a flag indicating new data is available, and may also generate a processor

interrupt to request that the host processor transfers the received data.

The best UARTs "resynchronize" on each change of the data line that is more than a half-bit

wide. In this way, they reliably receive when the transmitter is sending at a slightly different

speed than the receiver. (This is the normal case, because communicating units usually have no

28

shared timing system apart from the communication signal.) Simplistic UARTs may merely

detect the falling edge of the start bit, and then read the center of each expected data bit. A

simple UART can work well if the data rates are close enough that the stop bits are sampled

reliably.

It is a standard feature for a UART to store the most recent character while receiving the next.

This "double buffering" gives a receiving computer an entire character transmission time to fetch

a received character. Many UARTs have a small first-in, first-out FIFO buffer memory between

the receiver shift register and the host system interface. This allows the host processor even more

time to handle an interrupt from the UART and prevents loss of received data at high rates.

Transmitter

Transmission operation is simpler since it is under the control of the transmitting system.

As soon as data is deposited in the shift register after completion of the previous character, the

UART hardware generates a start bit, shifts the required number of data bits out to the line,

generates and appends the parity bit (if used), and appends the stop bits. Since transmission of a

single character may take a long time relative to CPU speeds, the UART will maintain a flag

showing busy status so that the host system does not deposit a new character for transmission

until the previous one has been completed; this may also be done with an interrupt. Since full-

duplex operation requires characters to be sent and received at the same time, practical UARTs

use two different shift register

Application

Transmitting and receiving UARTs must be set for the same bit speed, character length,

parity, and stop bits for proper operation. The receiving UART may detect some mismatched

settings and set a "framing error" flag bit for the host system; in exceptional cases the receiving

UART will produce an erratic stream of mutilated characters and transfer them to the host

system.

29

Typical serial ports used with personal computers connected to modems use eight data bits, no

parity, and one stop bit; for this configuration the number of ASCII characters per second equals

the bit rate divided by 10.

Some very low-cost home computers or embedded systems dispensed with a UART and used the

CPU to sample the state of an input port or directly manipulate an output port for data

transmission. While very CPU-intensive, since the CPU timing was critical, these schemes

avoided the purchase of a UART chip that was costly in the 1970's. The technique was known as

a bit-banging serial port.

30

5. PROJECT OVERVIEW

An AMBA-based microcontroller typically consists of a high-performance system

backbone bus (AMBA AXI Lite), able to sustain the external memory bandwidth, on which the

CPU, on-chip memory and other Direct Memory Access (DMA) devices reside. This bus

provides a high-bandwidth interface between the elements that are involved in the majority of

transfers. Also located on the high performance bus is a bridge to the lower bandwidth APB,

where most of the peripheral devices in the system are located see Figure (5.1).

Fig. 5.1 Overview of AXI Lite to APB Bridge

31

5.1 About the Advanced eXtensible Interface (AXI) Protocol:

The AMBA AXI protocol supports high-performance, high-frequency system designs.

The AXI protocol:

1. is suitable for high-bandwidth and low-latency designs

2. provides high-frequency operation without using complex bridges

3. meets the interface requirements of a wide range of components

4. is suitable for memory controllers with high initial access latency

5. Provides flexibility in the implementation of interconnect architectures is backward-

compatible with existing AHB and APB interfaces.

The key features of the AXI protocol are:

1. Separate address/control and data phases

2. Support for unaligned data transfers, using byte strobes

3. uses burst-based transactions with only the start address issued

4. separate read and write data channels that can provide low-cost DMA

5. Support for issuing multiple outstanding addresses

6. Support for out-of-order transaction completion

7. Permits easy addition of register stages to provide timing closure.

The AXI protocol includes the optional extensions that cover signaling for low-power operation

AXI Revisions

1. AXI4

32

2. AXI4-Lite

3. AXI4-stream

Note: Out of these AXI4-Lite is used.

1. AXI4:

The AXI4 protocol is an update to AXI3 to enhance the performance and utilization of the

interconnect when used by multiple masters. It includes the following enhancements:

1. Support for burst lengths up to 256 beats

2. Quality of Service signaling

3. Support for multiple region interfaces

2. AXI4-Lite:

The AXI4-Lite protocol is a subset of the AXI4 protocol intended for communication with

simpler, smaller control register style interfaces in components. The key features of the AXI4-

Lite interface are:

1. All transactions are burst length of one

2. All data accesses are the same size as the width of the data bus

3. Exclusive accesses are not supported

3. AXI4-Stream:

The AXI4-Stream protocol is designed for unidirectional data transfers from master to slave with

greatly reduced signal routing. Key features are

1. Supports single and multiple data streams using the same set of shared wires

2. Support for multiple data widths within the same interconnect

3. Ideal for implementation in FPGA

33

5.1.1AXI strengths:

AXI offers several features that allow for high performance. However, achieving the

highest performance at a system level requires that all of the AXI masters and slaves are

designed to a high performance target, since,. AXI can use the higher frequency along with a

data bus that can be defined to 128-bit and above to achieve the band

1. Data Pipelining:

Buses that do not pipeline cannot achieve full bus utilization and peak bandwidth given

that there will be transfer latency to and from the slaves. AXI supports 16 deep read and 16 deep

write data pipelining. This allows for full bus utilization, even with interfaces with extremely

long latency.

2. Multiple Data Beats per Burst Request:

AXI address and data phases are independent from one another, so a burst request that is

accepted by a slave when READY and VALID are active must consist of the number of data

beats the master requested, with no further interaction with the address and control signals. The

driver of the burst data also sends a LAST signal to mark the last beat of burst data. width

required for mid- and some high-performance bus applications

3. Write Strobes:

AXI defines write strobes to enable the transfer of some or all of the bytes on the write

data bus in a single beat. Given a 128-bit AXI, this allows any number of bytes between 1 and 16

to be transferred in a single beat on the write data bus. Buses that do not support write strobes

can require that multiple transfers be made for bytes inside a single quad word. Having write

strobes also enables AXI masters to limit propagation of write data with errors. If the master

discovers there are errors in write data that has yet to be sent to AXI, it is allowed to complete

the burst from a data beat standpoint, but without enabling any of the write strobes.

4. Out-of-order Reads:

34

AXI uses the ARID [3:0] bus on the read address channel to allow slaves to return read

data out-of-order with respect to the order of the read requests. Reads made with the same

ARID[3:0] must remain in order, but reads made with different ARID[3:0] may be returned out-

of-order. This improves read data channel utilization by allowing slaves with shorter latencies to

return data as soon as it is available, rather than forcing them to return in the request order,

potentially behind a slave with much longer read latency. This is a better mechanism than

performing delayed reads, since they require communication between the master and slave once

the read data is available, and then a subsequent repeat of the read request from the master.

Simple slaves may not choose to implement out-of-order read data, so system integrators must be

careful when selecting IP to be placed on the same AXI interconnect. Simple slaves with long

latency will negate the benefit of the other slaves performing reads out-of-order, since the

interconnect must guarantee that reads with the same ARID from different slaves return read data

in the same order in which the requests were made.

5. Write Data Interleaving:

Multiple AXI masters may attempt to write data to a single slave at the same time. Some

master’s buffer all of the write data before making the request, but others assemble data to send

after the request is made. AXI slaves that support write data interleaving can accept write data

from both types of masters in what looks like a single data tenure, with the AWID value

switching between the masters sending write data. This allows interconnect to avoid stalling the

write data bus to the slave, waiting for writes data to be assembled. Instead, the interconnect

sends write data from a “buffered write” master in between beats of write data from the

“assemble write” master.

6. Separate Data Buses:

AXI doubles the peak bandwidth at a given frequency by using separate buses for read

and write data. These buses are used in conjunction with data pipelining to increase performance.

7. Handshaking:

The VALID and READY handshaking mechanism in AXI allows both masters and

slaves to control the flow of data. This can be problematic if, for example, a poorly designed

35

master makes a write request while still assembling write data that is coming in from a slower or

narrower interface. One real benefit of having the handshake mechanism is that slaves with long

latencies can accept more requests than they have buffer space for. So, for example, an AXI

PCIe slave may queue eight reads, but only have enough buffer space for four. The slave can do

this if the masters are guaranteed to accept the read data when the slave drives it back. This is an

area and cost savings, but requires knowledge of specific AXI master capabilities, and should

therefore be implemented as a configuration option in a general purpose AXI slave.

8. Exclusive Access:

AXI supports an exclusive access mechanism that enables semaphore types of operations

without requiring the bus to be locked. The process is

1)A master requests an exclusive read from an address location,

2)The same master, sometime later, attempts to complete the exclusive operation by attempting

an exclusive write to the same address. The exclusive write is only able to complete successfully

if no other master has written to that location between the exclusive read and writeand this

mechanism could be used to improve the performance of Power processors attached to AXI.

9. No Slave Burst Termination:

Once an AXI slave acknowledges a burst transfer, it is responsible for accepting all of the

write data or generating all the read data associated with that burst. This simplifies master

designs, since the master does not have to prepare to make a subsequent request if the slave

terminated the original request before all of its data was transferred. Giving slaves the ability to

terminate bursts is not required when the burst length is declared with the request, and is

reasonably short. AXI bursts are 16 beats or fewer.

10. Excellent Verification Support:

AXI offers a large selection of verification IP from several different suppliers. The

solutions offered support several different languages and run in a choice of environments.

11. Flexible Complexity:

36

AXI masters and slaves are more difficult to design than AHB masters and slaves.

However, there are several optional features that may be left unimplemented in order to reduce

the increase in complexity. And may be somewhat offset with the availability of more

sophisticated verification support.

12. Data Bus Width:

AXI is defined with a choice of several bus widths, in 2n increments from 8-bit to 1024-

bit. The ability to select the data bus width at the time of synthesis improves the scalability (in

peak bandwidth) of the core IP. However, supporting even a few bus widths does add to the

complexity of the design, and choices need to interlock with other core IP to maintain

interoperability.

13. Error Reporting:

AXI defines errors in the read and write response channels. The slave can respond with

SLVERR if it detects a problem, and the interconnect can respond with DECERR if no slave

accepts the transfer. Interrupts from masters are handled outside the AXI architecture.

14. IP Core Portability:

The vast majority of the IP cores designed with an AXI interface are available as

“synthesizable” cores. The source Verilog or VHDL is provided and synthesized into the target

technology.

5.2AXI Architecture

The AXI protocol is burst-based and defines the following independent transaction channels:

1. read address

2. read data

3. write address

4. write data

37

5. Write response.

An address channel carries control information that describes the nature of the data to be

transferred. The data is transferred between master and slave using either:

1. A write data channel to transfer data from the master to the slave. In a write transaction,

the slave uses the write response channel to signal the completion of the transfer to the

master.

2. A read data channel to transfer data from the slave to the master.

The AXI protocol:

1. Permits address information to be issued ahead of the actual data transfer

2. supports multiple outstanding transactions

3. Supports out-of-order completion of transactions.

Figure (5.2) shows how a read transaction uses the read address and read data channels.

Fig 5.2 Channel architecture of reads

Figure 5.3 shows how a write transaction uses the write address, write data, and write response

channels.

38

Fig 5.3 Channel architecture of writes

Channel definition for AXI Architecture

Each of the independent channels consists of a set of information signals, inthese VALID

and READY signals that provide a two-way handshake mechanism. Transfer occurs only when

both the VALID and READY signals are HIGH. All five transaction channels use the same

VALID/READY handshake process to transfer address, dataand controlinformation.

The information source uses the VALID signal to show when valid address, data or

control information is available on the channel. The destination uses the READY signal to show

when it can accept the information. Both the read data channel and the write data channel also

include a LAST signal to indicate the transfer of the final data item in a transaction.

1.Readandwriteaddresschannels.

Read and write transactions each have their own address channel which carries all of the

required address and control information for a transaction.

2. Read data channel.

The read data channel conveys both the read data and read response information from the slave

back to the master. It includes:

39

1. The data bus, which can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide (RDATA)

2. A read response indicating the completion status of the read transaction.

3. Data and response group signals are maintained until the RReady signal is asserted

3. Write data channel.

The write data channel conveys the write data from the master to the slave and includes

1. The data bus, which can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide (WDATA)

2. One byte lane strobe for every byte, indicating which bytes of the data bus are valid

(WSTRB)

3. The last transfer inside a burst must be signaled through the WLAST signal

data, strobe and WLast information are maintained until the WReady signal is asserted.

4. Write Response channel. The write response channel provides a way for the slave to respond

to write transactions.

1. All write transactions use completion signaling.

2. The completion signal occurs once for each burst, not for each individual data transfer within

the burst.

3. Response group signals are maintained until the BReady signal is asserted. 5.3About the

Advanced Peripheral Bus (APB) Protocol

The Advanced Peripheral Bus (APB) is part of the Advanced Microcontroller Bus Architecture

(AMBA) protocol family. It defines a low-cost interface that is optimized for minimal power

consumption and reduced interface complexity.

APB revisions

The APB Specification Rev E, released in 1998, is now obsolete and is superseded by the

following three revisions:

1. AMBA 2 APB Specification

2. AMBA 3 APB Protocol Specification v1.0

40

3. AMBA APB Protocol Specification v2.0.

Out of these using AMBA APB Protocol Specification v2.0.

The APB Bus is the lowest performance bus in the AMBA family. There are separate

address (PADDR), write data (PWDATA), and read data (PRDATA) buses, up to 32-bits each.

With 7 additional control signals, there can be up to 103 I/O for each APB slave. There is one

APB master, usually the bridge from a higher performance bus that begins a transfer by asserting

the appropriate PSELn signal with PADDR. PWRITE is active for a write and inactive on a read.

PENABLE is asserted in the second clock, and is held active until PREADY is returned by the

slave. The minimum transfer, read or write, is two clocks. APB slaves also have the option of

inserting wait states for reads or writes by withholding PREADY. There is an optional

PSLVERR signal used by the slave to report an error on a read or write with PREADY.

Assuming a frequency of 133MHz, (which should be achievable in a 90nm or smaller

process) and the maximum data bus widths of 32-bits, the overall APB bandwidth could be as

high as 267MB/s, but that assumes that none of the APB slaves would insert any wait states.

Also, the total APB bandwidth must be shared between all of the APB slaves.

5.3.1APB strengths

The APB offers a very low cost (based on area), very low power (based on I/O), and the

least amount of complexity. It is intended for use with simple slaves that do not require much

bandwidth.

1. Interface Simplicity: The APB interface and protocol are very simple and easy to learn.

Relatively little effort needs to be spent on APB design and verification, leaving more time to

focus on the device logic.

2. Core IP: There are more than 75 APB cores available from at least 20 different companies.

The cores cover a broad range of small bandwidth applications. There is also verification

enablement in models and checkers/monitors available from at least 6 different companies with

support for 5 different languages.

41

3. Verification IP: There is verification enablement for APB in models and checkers/monitors

available from at least 6 different companies with support for 5 different languages

4.Bridges Exist: APB cores are accessed by the AHB-to-APB or AXI-to-APB Bridge.

5. IP Core Portability: Most of the IP cores designed with an APB interface are available as

“synthesizable” cores. The source code, written in Verilog or VHDL, is provided and

synthesized into the target technology. This IP delivery method typically presents few timing

challenges at the 66-133MHz range expected for APB.

6. Error Reporting: APB defines an optional signal, PSLVERR, driven by APB slaves on reads

or writes with PREADY to indicate an error with the transfer.

42

6.SIGNAL CONNECTIONS

Figure (6.1) shows the component signal connections. The bridge uses:

• AMBA AXI-Lite and APB signals as described in the AMBAAXI-Lite 4.0 protocol

specification.

Fig 6.1 Signal connections

43

INPUT / OUTPUT signal descriptionsXI4-Lite Signals list:-

44

Table 6.1 signal connections for AXI4-Lite

45

APB Signals list:-

Table 6.2 signal connections for APB

6.1 Handshake mechanisms of AXI & APB

In AXI 4.0 specification, each channel has VALID and READY signals for handshaking.

The source asserts VALID when the control information or data is available. The destination

asserts READY when it can accept the control information or data. Transfer occurs only when

both the VALID and READY are asserted. Figure 6.2 Shows all possible cases of

46

VALID/READY handshaking. Note that when source asserts VALID, the corresponding control

information or data must also be available at the same time. The arrows in Figure. Indicate when

the transfer occurs. A transfer takes place at the positive edge of clock. Therefore, the source

needs a register input to sample the READY signal. In the same way, the destination needs a

register input to sample the VALID signal. Considering the situation of Figure(c), we assume the

source and destination use output registers instead of combination circuit, they need one cycle to

pull low VALID/READY and sample the VALID/READY again at T4 cycle. When they sample

the VALID/READY again at T4, there should be another transfer which is an error. Therefore

source and destination should use combinational circuit as output. In short, AXI protocol is

suitable register input and combinational output circuit.

The APB Bridge buffers address, control and data from AXI4-Lite, drives the APB

peripherals and returns data and response signal to the AXI4-Lite. It decodes the address using

an internal address map to select the peripheral. The bridge is designed to operate when the APB

and AXI4-Lite have independent clock frequency and phase. For every AXI channel, invalid

commands are not forwarded and an error response generated. That is once a peripheral accessed

does not exist, the APB Bridge will generate DE CERR as response through the response

channel (read or write). And if the target peripheral exists, but asserts PSLVERR, it will give a

SLVERR response.

47

Fig. 6.2. Waveforms for handshaking mechanism

6.2 Asynchronous FIFO

An asynchronous FIFO refers to a FIFO design where data values are written to a FIFO

buffer from one clock domain and the data values are read from the same FIFO buffer from

48

another clock domain, where the two clock domains are asynchronous to each other.

Asynchronous FIFOs are used to safely pass data from one clock domain to another clock

domain. There are many ways to do asynchronous FIFO design, including many wrong ways.

Most incorrectly implemented FIFO designs still function properly 90% of the time. Most

almost-correct FIFO designs function properly 99%+ of the time. Unfortunately, FIFOs that

work properly 99%+ of the time have design flaws that are usually the most Difficult to detect

and debug (if you are lucky enough to notice the bug before shipping the product), or the most

costly to diagnose and recall.

6.2.1 Passing multiple asynchronous signals:

Attempting to synchronize multiple changing signals from one clock domain into a new

clock domain and insuring that all changing signals are synchronized to the same clock cycle in

the new clock domain has been shown to be problematic. FIFOs are used in designs to safely

pass multi-bit data words from one clock domain to another.

Data words are placed into a FIFO buffer memory array by control signals in one clock

domain, and the data word sare removed from another port of the same FIFO buffer memory

array by control signals from a second clock domain. Conceptually, the task of designing a FIFO

with these assumptions seems to be easy.

6.2.2 Asynchronous FIFO pointers:

In order to understand FIFO design, one needs to understand how the FIFO pointers

work. The write pointer always points to the next word to be written; therefore, on reset, both

pointers are set to zero, which also happens to be the next FIFO word location to be written. On a

FIFO-write operation, the memory location that is pointed to by the write pointer is written, and

then the write pointer is incremented to point to the next location to be written.

Similarly, the read pointer always points to the current FIFO word to be read. Again on

reset, both pointers are reset to zero, the FIFO is empty and the read pointer is pointing to invalid

data (because the FIFO is empty and the empty flag is asserted). As soon as the first data word is

written to the FIFO, the write pointer increments, the empty flag is cleared, and the read pointer

that is still addressing the contents of the first FIFO memory word, immediately drives that first

49

valid word onto the FIFO data output port, to be read by the receiver logic. The fact that the read

pointer is always pointing to the next FIFO word to be read means that the receiver logic does

not have to use two clock periods to read the data word. If the receiver first had to increment the

read pointer before reading a FIFO data word, the receiver would clock once to output the data

word from the FIFO, and clock a second time to capture the data word into the receiver. That

would be needlessly inefficient.

The FIFO is empty when the read and write pointers are both equal. This condition

happens when both pointers are reset to zero during a reset operation, or when the read pointer

catches up to the write pointer, having read the last word from the FIFO.

A FIFO is full when the pointers are again equal, that is, when the write pointer has

wrapped around and caught up to the read pointer. This is a problem. The FIFO is either empty

or full when the pointers are equal, but which? One design technique used to distinguish between

full and empty is to add an extra bit to each pointer. When the write pointer increments past the

final FIFO address, the write pointer will increment the unused MSB while setting the rest of the

bits back to zero as shown in Figure 1 (the FIFO has wrapped and toggled the pointer MSB). The

same is done with the read pointer. If the MSBs of the two pointers are different, it means that

the write pointer has wrapped one more time that the read pointer. If the MSBs of the two

pointers are the same, it means that both Pointers have wrapped the same number of times.

Using n-bit pointers where (n-1) is the number of address bits required to access the entire

FIFO memory buffer, the FIFO is empty when both pointers, including the MSBs are equal. And

the FIFO is full when both pointers, except the MSBs are equal.

The FIFO design in this paper uses n-bit pointers for a FIFO with 2(n-1) write-able locations

to help handle full and empty conditions. More design details related to the full and empty logic.

50

Fig. 6.3 FIFO full and empty conditions

51

7. MECHANISM OF AXI & APB

7.1 Finite state machine

A finite-state machine (FSM) or finite-state automaton (plural: automata), or simply

a state machine, is a mathematical model used to design computer programs and digital

logic circuits. It is conceived as an abstract machine that can be in one of a finite number

of states. The machine is in only one state at a time; the state it is in at any given time is called

the current state. It can change from one state to another when initiated by a triggering event or

condition, this is called a transition. A particular FSM is defined by a list of the possible

transition states from each current state, and the triggering condition for each transition.

Finite-state machines can model a large number of problems, among which are electronic design

automation, communication protocol design, parsing and other engineering applications.

In biology and artificial intelligence research, state machines or hierarchies of state machines are

sometimes used to describe neurological systems, and in linguistics they can be used to describe

the grammars of natural languages.

52

7.2 State Diagram Of APB

7.2.1 Operating states

Figure 7.1 shows the operational activity of the APB.

The state machine operates through the following states:

IDLE: This is the default state of the APB.

SETUP: When a transfer is required the bus moves into the SETUP state, where the appropriate

select signal, PSELx, is asserted. The bus only remains in the SETUP state for one clock cycle

and always moves to the ACCESS state on the next rising edge of the clock.

53

ACCESS: The enable signal, PENABLE, is asserted in the ACCESS state. The address, writes,

select, and write data signals must remain stable during the transition from the SETUP to

ACCESS state.

Exit from the ACCESS state is controlled by the PREADY signal from the slave:

• If PREADY is held LOW by the slave then the peripheral bus remains in

the ACCESS state.

• If PREADY is driven HIGH by the slave then the ACCESS state is exited and the bus returns

to the IDLE state if no more transfers are required.

Alternatively, the bus moves directly to the SETUP state if another transfer follows.

7.2.2 Write transfer

Fig: 7.2 waveform for Write transfer

The write transfer starts with the address, write data, write signal and select signal all

changing after the rising edge of the clock. The first clock cycle of the transfer is called the

SETUP cycle. After the following clock edge the enable signal PENABLE is asserted and this

54

indicates that the ENABLE cycle is taking place. The address, data and control signals all remain

valid throughout the ENABLE cycle. The transfer completes at the end of this cycle.

The enable signal, PENABLE, will be disserted at the end of the transfer. The select signal will

also go LOW, unless the transfer is to be immediately followed by another transfer to the same

peripheral. In order to reduce power consumption the address signal and the write signal will not

change after a transfer until the next access occurs. The protocol only requires a clean transition

on the enable signal. It is possible that in the case of back to back transfers the select and write

signals may glitch.

7.2.3 Read transfer

Fig: 7.3 wave form for Read transfer

The timing of the address, write, select and strobe signals are all the same as for the write

transfer. In the case of a read, the slave must provide the data during the ENABLE cycle. The

data is sampled on the rising edge of clock at the end of the ENABLE cycle.

7.2.4 Protection unit support

55

To support complex system designs, it is often necessary for both the interconnect and

other devices in the system to provide protection against illegal transactions. For the APB

interface, this protection is provided by the PPROT [2:0] signals.

The three levels of access protection are:

1. Normal or privileged, PPROT [0]

• LOW indicates a normal access

• HIGH indicates a privileged access.

This is used by some masters to indicate their processing mode. A privileged processing mode

typically has a greater level of access within a system.

2. Secure or non-secure, PPROT [1]

• LOW indicates a secure access

• HIGH indicates a non-secure access.

This is used in systems where a greater degree of differentiation between

Processing modes is required.

Note: This bit is configured so that when it is HIGH then the transaction is considered non-

secure and when LOW, the transaction is considered as secure.

3. Data or Instruction, PPROT [2]

• LOW indicates a data access

• HIGH indicates an instruction access.

This bit gives an indication if the transaction is a data or instruction access.

Note: This indication is provided as a hint and is not accurate in all cases. For example, where a

transaction contains a mix of instruction and data items. It is recommended that, by default, an

access is marked as a data access unless it is specifically known to be an instruction access.

56

Table 7.1 Protection encoding

Note: The primary use of PPROT is as an identifier for Secure or Non-secure transactions. It is

acceptable to use different interpretations of the PPROT [0] and PPROT [2] identifiers.

7.3 AXI4-Lite

7.3.1 Definition of AXI4-Lite

This section defines the functionality and signal requirements of AXI4-Lite components.

The key functionality of AXI4-Lite operation is:

1. All transactions are of burst length 1

2. All data accesses use the full width of the data bus.

3. AXI4_Lite supports a data bus width of 32-bit or 64-bit.

4. All accesses are Non-modifiable, Non-bufferable

5. Exclusive accesses are not supported.

57

Signal list:

Shows the required signals on an AXI4-Lite interface.

Table 7.2 AXI4-Lite interface signals

7.3.2 Bus width

AXI4-Lite has a fixed data bus width and all transactions are the same width as the data bus. The

data bus width must be, either 32-bits or 64-bits.

ARM expects that:

• The majority of components use a 32-bit interface

• Only components requiring 64-bit atomic accesses use a 64-bit interface.

A 64-bit component can be designed for access by 32-bit masters, but the implementation must

ensure that the component sees all transactions as 64-bit transactions.

7.3.3Write strobes

The AXI4-Lite protocol supports write strobes. This means multi-sized registers can be

implemented and also supports memory structures that require support for 8-bit and 16-bit

accesses.

58

All master interfaces and interconnect components must provide correct write strobes.

Any slave component can choose whether to use the write strobes. The options permitted are:

• To make full use of the write strobes

• To ignore the write strobes and treat all write accesses as being the full data bus width

• To detect write strobe combinations that are not supported and provide an error response.

A slave that provides memory access must fully support write strobes. Other slaves in the

memory map might support a more limited write strobe option.

7.3.4 Conversion, protection, and detection

Connection of an AXI4-Lite slave to an AXI4 master requires some form of adaptation if

it cannot be ensured that the master only issues transactions that meet the AXI4-Lite

requirements.

This section describes techniques that can be adopted in a system design to aid with the

interoperability of components and the debugging of system design problems. These techniques

are:

1. Conversion: This requires the conversion of all transactions to a format that is compatible

with the AXI4-Lite requirements.

2. Protection: This requires the detection of any non-compliant transaction. The non-compliant

transaction is discarded, and an error response is returned to the master that generated the

transaction.

3. Detection: This requires observing any transaction that falls outside the AXI4-Lite

requirements

• notifying the controlling software of the unexpected access

• permitting the access to proceed at the hardware interface level.

59

7.3.5 Address structure

The AXI protocol is burst-based. The master begins each burst by driving control

information and the address of the first byte in the transaction to the slave. As the burst

progresses, the slave must calculate the addresses of subsequent transfers in the burst. A burst

must not cross a 4KB address boundary.

Note: This prevents a burst from crossing a boundary between two slaves. It also limits the

number of address increments that a slave must support.

1. Burst length

The burst length is specified by:

ARLEN [7:0], for read transfers

AWLEN [7:0], for write transfers.

In this specification, AxLEN indicates ARLEN or AWLEN.

AXI3 supports burst lengths of 1 to 16 transfers, for all burst types.

AXI4 extends burst length support for the INCR burst type to 1 to 256 transfers. Support for all

other burst types in AXI4 remains at 1 to 16 transfers.

The burst length for AXI3 is defined as,

Burst Length = AxLEN[3:0] + 1

The burst length for AXI4 is defined as,

Burst Length = AxLEN [7:0] + 1, to accommodate the extended burst length of the INCR burst

type in AXI4.

AXI has the following rules governing the use of bursts:

• For wrapping bursts, the burst length must be 2, 4, 8, or 16

• A burst must not cross a 4KB address boundary

60

• Early termination of bursts it not supported.

No component can terminate a burst early. However, to reduce the number of data transfers in a

write burst, the master can disable further writing by deserting all the write strobes. In this case,

the master must complete the remaining transfers in the burst. In a read burst, the master can

discard read data, but it must complete all transfers in the burst.

Note: Discarding read data that is not required can result in lost data when accessing a read-

sensitive device such as a FIFO. When accessing such a device, a master must use a burst length

that exactly matches the size of the required data transfer. In AXI4, transactions with INCR burst

type and length greater than 16 can be converted to multiple smaller bursts,

2. Burst size

The maximum number of bytes to transfer in each data transfer, or beat, in a burst, is specified

by: ARSIZE [2:0], for read transfers

AWSIZE [2:0], for write transfers.

Table 7.3 Burst size encoding

61

3. Burst type

The AXI protocol defines three burst types:

1. FIXED: In a fixed burst, the address is the same for every transfer in the burst.

This burst type is used for repeated accesses to the same location such as when loading or

emptying a FIFO.

2. INCR: In an incrementing burst, the address for each transfer in the burst is an increment of

the address for the previous transfer. The increment value depends on the size of the transfer.

For example, the address for each transfer in a burst with a size of four bytes is the previous

address plus four. This burst type is used for accesses to normal sequential memory.

3. WRAP: A wrapping burst is similar to an incrementing burst, except that the address wraps

around to a lower address if an upper address limit is reached.

The following restrictions apply to wrapping bursts:

• The start address must be aligned to the size of each transfer

• The length of the burst must be 2, 4, 8, or 16 transfers.

The burst type is specified by:

• ARBURST [1:0], for read transfers.

• AWBURST [1:0], for write transfers.

In this specification, Ax BURST indicates ARBURST or AWBURST

62

Table 7.4Burst type encoding

63

8.VERILOG CODING

Apb Bridge

`timescale 1ns / 1ps

module apb_bridge_cha(input Aclk,

input Pclk,

input Rst,

input Awvalid,

input [31:0]Awaddr,

input Wvalid,

input [31:0] Wdata,

input [3:0] Wstrb,

input Bready,

input Arvalid,

input [31:0] Araddr,

input Rready,

input Pready,

input [31:0] Prdata,

input Pslverr,

output reg Awready,

output reg Wready,

output reg Bvalid,

output reg Bresp,

output reg Arready,

output reg Rvalid,

output reg[31:0] Rdata,

output reg Rresp,

output reg [31:0] Paddr,

output reg Psel,

output reg Penable,

64

output reg Pwrite,

output reg [31:0] Pwdata,

output reg [3:0] Pstrb);

///********************************Signals Of the Internal*******************************

///INTERNAL SIGNALS

///INTERCONNCTES SIGNALS

reg [1:0] awvalid_s,wvalid_s,arvalid_s;

reg bvalid_s;

reg awvalid_initial,arvalid_initial;

reg [31:0] awaddr_s,araddr_s;

reg pready_s,pslverr_s;

reg rready_s;

reg [31:0] prdata_s,wdata_s;

reg a_wr_s,a_rd_s,p_wr_s,p_rd_s;

wire [31:0] pwdata_s_w,rdata_s_w;

reg rd_addr,wr_addr;

reg [31:0]paddr_s;

wire [31:0]paddr_s2;

reg bvalid1,bvalid2;

reg rready_s1,rready_s2,rready_s3;

---- Address logic for the interface

-----Addresss Selection

-----Assgining the signal AXI Signal to a Signal

-----Later the signal is conncted to the FIFO Input Signal

----------------------------------------------------------------------------------*/

65

always@(Arvalid,araddr_s,awaddr_s,Awvalid)

begin

if(Arvalid)

paddr_s<= araddr_s;

else

paddr_s <= awaddr_s;

end

/*----------------Completed The Address Logic Signals-----------------------------------*/

A read and write request are asserted on same time then give a priority signal

---for the read signal than write signal.

always@(Awvalid,Arvalid)

begin

if((Awvalid == 1'b1)&(Arvalid==1'b1))

begin

awvalid_initial = 1'b0;

arvalid_initial = 1'b1;

end

else if((Awvalid == 1'b1)&(Arvalid==1'b0))

begin



end

else if((Awvalid == 1'b0)&(Arvalid==1'b1))

begin



end

else

begin

66



end

end

/*----------------------Priority Of the signals is completed----------

----------------------Write Signals For AW signals-----------------------Awvalid Should be High For The Address Transfer

---Awready Should get High For the Pready is High signal

---If Pready is low and Awvalid is hign signal then the signals

---incidacte that the AXI(Master) is ready but Apb(Slave) is not ready to recevice.

---If Pready is High and Awvalid is Low signals then the signals that

---incidacet that the AXI(Master) is not ready but ABP(Slvae) is ready.

---If Pready is high than Awready signal will be High.

---When Pready and Awvalid is Both are High,We make one enable signal that will be HIgh

---when both signals will be high.

always@(posedge Aclk)

begin

if(~Rst)

begin

awvalid_s <= 2'b0;

Awready <= 1'b0;

awaddr_s <= 32'h0;

end

else if((awvalid_initial==1'b1)&(pready_s==1'b1))

begin

awvalid_s <= 2'b11;

awaddr_s <= Awaddr;

Awready <= 1'b1;

end

67


begin

awvalid_s <= 2'b01;

Awready <= 1'b1;

// awaddr_s <= 32'h0;

end


begin

awaddr_s <= Awaddr;

awvalid_s <= 2'b10;

Awready <= 1'b0;

end

else

begin

// awaddr_s <= 32'h0;

awvalid_s <= 2'b0;

Awready <= 1'b0;

end

end

//--------------------------------------------------------------------

---In this process Wdata tranfter Takes Place Here

---Wdata signal & Wready signals Pready.

---Wdata is take from when Wvalid is High.

---Wready is high if Pready Signal is High if not it will be Low.

---I have take One Enable signal .The enable Signal will be High both

---Signal(Wvalid,Pready,AWenable)should be high.

---Wready will be High if Pready is High.

---Wvalid is high then only my Data will Taken from input

----------------------------------------------------------------------always@(posedge Aclk)

68

begin

if(~(Rst))

begin

wvalid_s <= 2'b0;

wdata_s <= 32'h0;

Wready <= 1'b0;

Pstrb <= 4'h0;

end

else if(awvalid_s==2'b11)

begin

if((Wvalid==1'b1)&(pready_s==1'b1))

begin

wdata_s <= Wdata;

wvalid_s <= 2'b11;

Wready <= 1'b1;

Pstrb <= Wstrb;

end

else if((Wvalid==1'b0)&(pready_s==1'b1))

begin

wvalid_s <=2'b01;

Wready <= 1'b1;

Pstrb <= 4'h0;

end

else if((Wvalid==1'b1)&(pready_s==1'b0))

begin

wdata_s <= Wdata;

wvalid_s <= 2'b10;

Wready <= 1'b0;

Pstrb <= Wstrb;

end

69

else

begin

wvalid_s <= 2'b00;

Wready <= 1'b0;

Pstrb <= 4'h0;

end

end

else if((awvalid_s==2'b01)|(awvalid_s==2'b10)|(awvalid_s==2'b0))

begin

wdata_s <= 32'h0;

wvalid_s <= 2'b0;

Wready <=0;

Pstrb <= 0;

end

end

**********************Response Signals on the AXI side****************************

---Bvalid,Pslverr

---Bvalid is Resopse signal Of Axi Data .

---It will respond That when slave have receviced the data from Master

reg p_rd_ss;

///I have made Regters Logic when we are Changing from One clock Domain to Other Clock Domain


begin

if(~Rst)

begin

Bvalid<= 1'b0;

bvalid1<=1'b0;

70

bvalid2<= 1'b0;

end

else if(p_rd_ss)

begin

bvalid1 <=bvalid_s;

bvalid2 <=bvalid1;

Bvalid <= bvalid2;

end

else

begin

bvalid1<=bvalid_s;

bvalid2 <=bvalid1;

Bvalid <= bvalid2;

end

end


begin

if(~Rst)

begin

bvalid_s<= 1'b0;

end

else if((awvalid_s==2'b11)&(pslverr_s==1'b0))

begin

if((wvalid_s==2'b11)&(pready_s==1'b1))

begin

bvalid_s<= 1'b1;

end

else

begin

bvalid_s<= 1'b0;

71

end

end

else

begin

bvalid_s<= 1'b0;

end

end

--- Bresp from to AXI signals from APB .

---INtially We have Bvalid signal then Axi will Send Bready signal to Abp

---If Bready is HIgh than Bresp will be High.


begin

if(~Rst)

begin

Bresp <= 1'b0;

end

else if(bvalid_s)

begin

if(Bready)

begin

Bresp <= 1'b1;

end

else

begin

Bresp <= 1'b0;

end

end

else

begin

Bresp <= 1'b0;

72

end

end

****************************** Read Signals on the AXI side ********************

---Arvalid Should be High For The Address Transfer

---Arready Should get High For the Pready is High signal

---If Pready is low and Arvalid is hign signal then the signals

---incidacte that the AXI(Master) is ready but Apb(Slave) is not ready to recevice.

---If Pready is High and Arvalid is Low signals then the signals that

---incidacet that the AXI(Master) is not ready but ABP(Slvae) is ready.

---If Pready is high than Arready signal will be High.

---When Pready and Arvalid is Both are High,We make one enable signal that will be HIgh

---when both signals will be high.

----------------------------------------------------------------------always@(posedge Aclk)

begin

if(~Rst)

begin

Arready <=1'b0;

arvalid_s <= 2'b00;

araddr_s <= 32'h0;

end

else if((arvalid_initial==1'b1)&(pready_s==1'b1))

begin

arvalid_s <= 2'b11;

araddr_s <= Araddr;

Arready <= 1'b1;

end


73

begin

arvalid_s <= 2'b10;

araddr_s <= Araddr;

Arready <= 1'b0;

end


begin

arvalid_s <= 2'b01;

araddr_s <= 32'h0;

Arready <= 1'b1;

end

else

begin

arvalid_s <= 2'b00;

araddr_s <= 32'h0;

Arready <= 1'b0;

end

end

--Read Data Signal .

---In The process Slave will be sending the Data for particular Addresss

---which is send by AXI(Master).

---In this process we will be writing Only the enable signals.


begin

if(~Rst)

begin

Rvalid <= 1'b0;

rready_s<=1'b0;

end

74

else if(arvalid_s == 2'b11)

begin

if((pready_s==1'b1)&(Rready==1'b1))

begin

Rvalid <= 1'b1;

rready_s <= 1'b1;

end

else if((pready_s==1'b0)&(Rready==1'b1))

begin

Rvalid <= 1'b0;

rready_s <= 1'b1;

end

else if((pready_s==1'b1)&(Rready==1'b0))

begin

Rvalid <= 1'b1;

rready_s <= 1'b0;

end

else

begin

Rvalid <= 1'b0;

rready_s <= 1'b0;

end

end

else

begin

Rvalid <= 1'b0;

rready_s <= 1'b0;

end

end

75

--- Read Resopse Singal

always@(posedge Aclk )

begin

if(~Rst)

begin

Rresp <= 1'b0;

end

else if((pslverr_s==1'b0)&(rready_s1==1'b1))

begin

Rresp <= 1'b1;

end

else

begin

Rresp <= 1'b0;

end

end

---It internal Signa for the Enabling

---a_rd_s: Incidates that AXI READ SIGNAL FOR THE FIFO

---LOGIC THAT WHEN FIFO SHOULD READ THE DATA FROM FIFO OUT .


begin

if(~Rst)

begin

a_rd_s <= 1'b0;

end

else if((pready_s==1'b1)&(arvalid_s==2'b11))

begin

if((rready_s))

begin

a_rd_s <= 1'b1;

76

end

else

begin

a_rd_s <= 1'b0;

end

end

end

---INTERNAL SIGNAL PURPOSE

---PREADY SIGNAL :IT REPRESENT THAT SLAVE IS READY NOT TO RECEVICE THE DATA SIGNAL

---ANY CONTROL SIGNALS

always@(posedge Pclk)

begin

if(~Rst)

begin

pready_s <= 1'b0;

end

else if(Pready)

begin

pready_s <= 1'b1;

end

else

begin

pready_s <= 1'b0;

end

end

---PLSVERR SIGNAL

---THIS IS SHOWS ERROR INCIDATION

---IF THERE IS ERROR IT IS HIGH IF NOT LOW


77

begin

if(~Rst)

begin

pslverr_s <= 1'b0;

end

else

begin

if(Pslverr)

begin

pslverr_s <= 1'b1;

end

else

begin

pslverr_s <= 1'b0;

end

end

end

--- IN THIS PRO0ESS

---WHEN SHOULD READ DATA SHOULD BE TAKEN IN


begin

if(~Rst)

begin

prdata_s <= 32'h0;

end

else if((arvalid_initial==1'b1)&(Pready==1'b1)&(Rready==1'b1))

begin

prdata_s <= Prdata;

end


78

begin

prdata_s <= Prdata;

end


begin

prdata_s <= 32'h0;

end

else

begin

prdata_s <= 32'h0;

end

end

---P_WR_S:INCIDATES THAT ABP SIDE WRITE SIGNAL FOR THE READ FIFO


begin

if(~Rst)

begin

p_wr_s <= 1'b0;

end

else if((rready_s==1'b1)&(pready_s==1'b1))

begin

p_wr_s <= 1'b1;

end

else

begin

p_wr_s <= 1'b0;

end

end

---PENABLE SIGNAL

reg penable1,penable2,penable3,penable4;

79


begin

if(~Rst)

begin

Penable <= 1'b0;

penable1<= 1'b0;

penable2<= 1'b0;

penable3<= 1'b0;

penable4<= 1'b0;

end

else if((Wready))

begin

penable1<= 1'b1;

penable2 <=penable1;

penable3<= penable2;


Penable <= penable4;

end

else

begin

penable1<= 1'b0;




Penable <= penable4;

end

end

---PWDATA : IT IS SOUTPUT SIGNAL

80

---TAKING DATA FROM WRITE FIFO


begin

if(~Rst)

begin

Pwdata<= 32'h0;

end

else if((Penable==1'b1)|(pready_s==1'b1))

begin

Pwdata <= pwdata_s_w;

end

else

begin

Pwdata <= 32'h0;

end

end

---RDATA SIGNAL ON AXI SIDE

---THIS SIGNAL TAKE DATA FROM ABP SLAVE GIVE IT TO AXI


begin

if(~Rst)

Rdata <= 32'h0;

else

if(a_rd_s)

Rdata <= rdata_s_w;

else

Rdata <= 32'h0;

end

81

/***************************FIFO Interfacing Logic*********************************************/

//ENABLE SIGNAL FOR FIFO SIGNAL

reg wready_s;


begin

if(~Rst)

begin

wready_s <= 1'b0;

end

else if(Wready)

begin

wready_s <= 1'b1;

end

else

begin

wready_s <= 1'b0;

end

end

reg p_rd_en;


begin

if(~Rst)

begin

82

a_wr_s <=1'b0;

p_rd_en <=1'b0;

end

else if((pslverr_s==1'b0)&(Wready==1'b1))

begin

p_rd_en <= 1'b1;

a_wr_s <= 1'b1;

end

else

begin

a_wr_s <=1'b0;

p_rd_en <=1'b0;

end

end

reg rd_a;


begin

if(~Rst)

begin

p_rd_s <= 1'b0;

end

else if((p_rd_en==1'b1)|(rd_a==1'b1))

begin

p_rd_s <= 1'b1;

end

else

p_rd_s <= 1'b0;

end

83

reg p_rd_ss1,p_rd_ss2;


begin

if(~Rst)

begin

p_rd_ss <= 1'b0;

p_rd_ss1 <= 1'b0;

p_rd_ss2 <= 1'b0;

end

else

begin

p_rd_ss <= p_rd_s;

p_rd_ss1 <= p_rd_ss;

p_rd_ss2 <=p_rd_ss1;

end

end

//---WRITE FIFO INTERFACING MODULE

fifo_code wr_fifo(.Data_in(wdata_s),

.RClk(Pclk),

.ReadEn_in(p_rd_ss2),

.Clear_in(Rst),

.WClk(Aclk),

.WriteEn_in(a_wr_s),

.Data_out(pwdata_s_w)

// empty,

84

//full

);

//---READ FIFO INTERFACING MODULE

fifo_code rd_fifo(.Data_in(prdata_s),

.RClk(Aclk),

.ReadEn_in(a_rd_s),

.Clear_in(Rst),

.WClk(Pclk),

.WriteEn_in(p_wr_s),

.Data_out(rdata_s_w)

//empty,

//full

);

//-------------- Pwrite Sognal For the

reg pwrite1,pwrite2;


begin

if(~Rst)

begin

Pwrite <= 1'b0;

pwrite1<= 1'b0;

pwrite2<= 1'b0;

end

else if((wready_s))//|(Wready==1'b1))

begin

pwrite1<= 1'b1;

85

pwrite2 <=pwrite1;

Pwrite <= pwrite2;

end

else

begin

pwrite1<= 1'b0;

pwrite2<= pwrite1;

Pwrite <= pwrite2;

end

end

/*The Address fifo have tochange*/

reg rd_en;


begin

if(~Rst)

begin

wr_addr = 1'b0;

rd_en =1'b0;

end

else

if(((Awready==1'b1)|(Arready==1'b1))&(pslverr_s==1'b0)&(pready_s==1'b1)&((awvalid_s==2'b11)|(arvalid_s==2'b11)))

begin

wr_addr = 1'b1;

rd_en =1'b1;

end

else

begin

rd_en =1'b0;

86

wr_addr = 1'b0;

end

end


begin

if(~Rst)

rd_addr <= 1'b0;

else if((rd_en==1'b1))

rd_addr <= 1'b1;

else

rd_addr <= 1'b0;

end


begin

if(~Rst)

rd_a <= 1'b0;

else if(( rd_addr==1'b1))

rd_a <= 1'b1;

else

rd_a <= 1'b0;

end

fifo_code addr(.Data_in(paddr_s),

.RClk(Pclk),

.ReadEn_in(rd_addr),

.Clear_in(Rst),

.WClk(Aclk),

.WriteEn_in(wr_addr),

87

.Data_out(paddr_s2)

);

---SLECTION OF SLAVE


begin

if(~Rst)

begin

Psel <= 1'b0;

end

else if((rd_addr==1'b1)&(pslverr_s==1'b0))

begin

if(pready_s==1'b1)

begin

Psel <= 1'b1;

end

else

begin

Psel <= 1'b0;

end

end

end

//---ADDRESS ON ABP SIDE


begin

if(~Rst)

Paddr <= 32'h0;

else

88

Paddr <= paddr_s2;

end

//ENABLE FOR READ READY SIGNAL


begin

if(~Rst)

rready_s1 <=1'b0;

else

rready_s1 <=rready_s3;

end


begin

if(~Rst)

rready_s2 <= 1'b0;

else begin

rready_s2 <= rready_s;

rready_s3 <= rready_s2;

end

end

//--------------------------------------------------------------------------------

endmodule

//--------------Completed Code For the Apb Bridge---------------------------------//

89

`timescale 1ns / 1ps

//------------------------------------------------------------------------------

module apb_uart_top (

//APB Interface

input PCLK,

input PRESETn,

input PSEL,

input PENABLE,

input PWRITE,

input [31:0] PWDATA,

input [31:0] PADDR,

output [31:0] PRDATA,

///UART

output tx,

input rx

);

wire [7:0]uart_rx_data,uart_tx_data;

wire uart_rd,uart_wr;

wire txrdy,rxrdy;

//-----------------------------------------------------------------------------

// Input/Output Declaration

//-----------------------------------------------------------------------------

90

//Instantiation of UART Controller Register module

APB Interface

apb_interface APB_IF(.PCLK(PCLK),

.PRESETn(PRESETn),

.PSEL(PSEL),

.PENABLE(PENABLE),

.PWRITE(PWRITE),

.PWDATA(PWDATA),

.PADDR(PADDR),

.PRDATA(PRDATA),

.uart_rx_data(uart_rx_data),

.uart_tx_data(uart_tx_data),

.rxrdy(rxrdy),

.txrdy(txrdy),

.uart_rd(uart_rd),

.uart_wr(uart_wr) );

uart UART_TOP(.clk(PCLK),

.rst(PRESETn),

.wr(uart_wr),

.rd(uart_rd),

.rx(rx),

.tx(tx),

.din(uart_tx_data),

.rdata_out(uart_rx_data),

.txrdy(txrdy),

.rxrdy(rxrdy)

);

91

endmodule //<<UART_cntrl>>

FIFO MODULE

module fifo_code

#(parameter DATA_WIDTH = 32,

ADDRESS_WIDTH = 7)

//FIFO_DEPTH = (1 << ADDRESS_WIDTH))

//Reading port

(output reg [31:0] Data_out,

output reg Empty_out,

input wire ReadEn_in,

input wire RClk,

//Writing port.

input wire [31:0] Data_in,

output reg Full_out,

input wire WriteEn_in,

input wire WClk,

input wire Clear_in);

/////Internal connections & variables//////

reg [31:0] Mem [0:255];

wire [7:0] pNextWordToWrite, pNextWordToRead;

wire EqualAddresses;

wire NextWriteAddressEn, NextReadAddressEn;

wire Set_Status, Rst_Status;

reg Status;

wire PresetFull, PresetEmpty;

92

//////////////Code///////////////

//Data ports logic:

//(Uses a dual-port RAM).

//'Data_out' logic:

always @ (posedge RClk,negedge Clear_in)

begin

if(~Clear_in)

Data_out <= 32'h0;

else if (ReadEn_in && !Empty_out)

Data_out <= Mem[pNextWordToRead];

end

reg pt;

//'Data_in' logic:

always @ (posedge WClk,negedge Clear_in)

begin

if(~Clear_in)

pt <= 1'b0;

else if (WriteEn_in & !Full_out)

Mem[pNextWordToWrite] <= Data_in;

end

//Fifo addresses support logic:

//'Next Addresses' enable logic:

assign NextWriteAddressEn = WriteEn_in & ~Full_out;

assign NextReadAddressEn = ReadEn_in & ~Empty_out;

//Addreses (Gray counters) logic:

GrayCounter GrayCounter_pWr

(.GrayCount_out(pNextWordToWrite),

.Enable_in(NextWriteAddressEn),

93

.Clear_in(Clear_in),

.Clk(WClk)

);

GrayCounter GrayCounter_pRd

(.GrayCount_out(pNextWordToRead),

.Enable_in(NextReadAddressEn),

.Clear_in(Clear_in),

.Clk(RClk)

);

//'EqualAddresses' logic:

assign EqualAddresses = (pNextWordToWrite[ADDRESS_WIDTH:0] == pNextWordToRead[ADDRESS_WIDTH:0])?1'b1:1'b0;

//'Quadrant selectors' logic:

assign Set_Status = (pNextWordToWrite[ADDRESS_WIDTH-1] ~^ pNextWordToRead[ADDRESS_WIDTH]) &

(pNextWordToWrite[ADDRESS_WIDTH] ^ pNextWordToRead[ADDRESS_WIDTH-1])?1'b1:1'b0;

assign Rst_Status = ((pNextWordToWrite[ADDRESS_WIDTH-1] ^ pNextWordToRead[ADDRESS_WIDTH]) &

(pNextWordToWrite[ADDRESS_WIDTH] ~^ pNextWordToRead[ADDRESS_WIDTH-1]))?1'b1:1'b0;

//'Status' latch logic:

always @ (Set_Status, Rst_Status, Clear_in) //D Latch w/ Asynchronous Clear & Preset.

begin

94

if (Rst_Status | !Clear_in)

Status = 0; //Going 'Empty'.

else if (Set_Status)

Status = 1; //Going 'Full'.

end

//'Full_out' logic for the writing port:

assign PresetFull = Status & EqualAddresses; //'Full' Fifo.

always @ (posedge WClk, posedge PresetFull) //D Flip-Flop w/ Asynchronous Preset.

begin

// if(~Clear_in)

// Full_out <= 1'b0;

// else

if (PresetFull)

Full_out <= 1;

else

Full_out <= 0;

end

//'Empty_out' logic for the reading port:

assign PresetEmpty = ~Status & EqualAddresses; //'Empty' Fifo.

always @ (posedge RClk, posedge PresetEmpty) //D Flip-Flop w/ Asynchronous Preset.

begin

if (PresetEmpty)

Empty_out <= 1;

else

Empty_out <= 0;

end

endmodule

95

GRAY COUNTER MODULE

module GrayCounter#(parameter COUNTER_WIDTH =8)

(output reg [COUNTER_WIDTH-1:0] GrayCount_out, //'Gray' code count output.

input wire Enable_in, //Count enable.

input wire Clear_in, //Count reset.

input wire Clk);

/////////Internal connections & variables///////

reg [COUNTER_WIDTH-1:0] BinaryCount;

/////////Code///////////////////////

always @ (posedge Clk,negedge Clear_in)

begin

if (~Clear_in) begin

BinaryCount <= {COUNTER_WIDTH{1'b 0}} + 1; //Gray count begins @ '1' with

GrayCount_out <= {COUNTER_WIDTH{1'b 0}}; // first 'Enable_in'.

end

else if (Enable_in) begin

BinaryCount <= BinaryCount + 1;

GrayCount_out <= {BinaryCount[COUNTER_WIDTH-1],

BinaryCount[COUNTER_WIDTH-2:0] ^ BinaryCount[COUNTER_WIDTH-1:1]};

end

end

96

endmodule

UART module uart( input clk,rst,wr,rd,rx,

output tx,

input [7:0] din,

output [7:0] rdata_out,

output txrdy, rxrdy);

wire parityerr;

wire rxrdy1,txrdy1;

assign rxrdy=(rst==1'b1)?(rxrdy1):1'bZ;

assign txrdy=(rst==1'b1)?(txrdy1):1'bZ;

uart_transmitter u4( .clk(clk),

.rst_n(rst),

.wr(wr),

.data(din),

.txrdy(txrdy1),

.tx(tx));

uart_receiver u5( .clk(clk),

.rst_n(rst),

.rd(rd),

.rx(rx),

.parityerr(parityerr),

.rxrdy(rxrdy1),

.det_rx(det_rx),

.rd_clk(rd_clk),

.flag(flag),

.data(rdata_out));

endmodule

97

UART_RECEIVER

module uart_receiver(input clk, rst_n, rd, rx,

output reg parityerr,

output reg rxrdy,

output reg det_rx,

output reg rd_clk,

output reg flag,

output reg [7:0]data);

integer count, countrx;

//reg [7:0] temp_rhr;

reg rbaud_clk;

reg [10:0] rsr;

reg [7:0] rhr;

// This module is to detect the receiving bit i.e., startbit

always@(posedge clk)

begin

if(rst_n ==1'b0)

begin

det_rx <=1'b0; // On reset, we assume that, det_rx is not activated.

countrx <=1'b0;

end

else if(rx ==1'b0)

begin

det_rx <=1'b1; /// If a start bit is received, det_rx control signal is enabled.

98

countrx <= countrx+1;

end

else if (flag ==1'b1) ///and (countrx < 100))

det_rx <=1'b0; ///If start bit occupies the first bit of RSR i.e., all the bits are received into receiver

end

///This module is to keep track of the count, which will be helpfull in generating baud clocks


begin

if(rst_n ==1'b0)

count <= 0;

else if((det_rx ==1'b1)&&(count == 9))

count <= 0;

else if(det_rx ==1'b1)

count <= count+1;

else

count <= 0;

end

/// This module is for generation of baud clk


begin

if(rst_n ==1'b0)

rbaud_clk <=1'b0;

else if(count == 1)

rbaud_clk <=1'b1;

else

99

rbaud_clk <=1'b0;

end

//This module is for receiving data from transmitter line to receiver i.e, to RSR

always@(posedge rbaud_clk )

begin

if(rst_n ==1'b0)

rsr <=11'b11111111111;

else

begin

rsr[9:0] <= rsr[10:1]; //Receiving bits bit by bit

rsr[10] <= rx;

if(flag ==1'b1) //If start bit reaches the first position, then, it is reset

rsr <= 11'b11111111111;

end

end

//This module is to assign value to the flag


begin

if(rst_n ==1'b0)

flag <=1'b0;

else

100

begin

if(rsr[0] ==1'b0)

flag <=1'b1;

else if(det_rx ==1'b1)

flag <=1'b0;

end

end

///This module is to receive data from RSR to RHR


begin

if(rst_n ==1'b0)

rhr <= 8'b11111111;

else

rhr <= rsr[8:1];

end


begin

if(rst_n ==1'b0)

rd_clk <=1'b0;

else

begin

if(flag ==1'b1)

rd_clk <=1'b1;

else

rd_clk <=1'b0;

101

end

end

//This module is to shift data from RHR to Dataline with the help of read rd signal

always@(posedge flag)

begin

if(rst_n ==1'b0)

data <= 8'b0;

else

data <= rhr;

end

/// This module is to monitor, whether Receiver is ready or not


begin

if(rst_n ==1'b0)

rxrdy <=1'b0;

else if(flag ==1'b1)

rxrdy <=1'b1;

else if(rd ==1'b1)

rxrdy <=1'b0;

end

///This module is for parity error


begin

if(rst_n ==1'b0)

parityerr <=1'b0;

102

else if(rd ==1'b1)

parityerr <=1'b0;

else if (rsr[0] ==1'b0)

begin

if(((rsr[8] ^ rsr[7] ^ rsr[6] ^ rsr[5] ^ rsr[4] ^ rsr[3] ^ rsr[2] ^ rsr[1]) &(rsr[9])) ==1'b1)

parityerr <=1'b0;

else

parityerr <= 1'b1;

end

end

endmodule

103

UART_TRANSMITTER

module uart_transmitter(input clk, rst_n, wr,

input[7:0] data,

output reg txrdy,

output reg tx);

integer count;

reg[7:0] tbr;

reg [10:0]tsr;

reg baud_clk;

reg tx_sts;

/*This module is to keep tx_status to be 1 or not - i.e, to monitor tx is busy or not

-- tx_sts indicates transmitter status, tx_sts = 1 means transmitter is busy; tx_sts = 0 means transmitter is free*/


begin

if(rst_n == 1'b0)

tx_sts <= 1'b0; //Transmitter is free

else if((wr == 1'b1)&(txrdy ==1'b1))

tx_sts <= 1'b1; //Transmitter is busy

else if(txrdy ==1'b1)

tx_sts <= 1'b0; //Transmitter is free

end

///This module is to load data from dataline to tbr


begin

if(rst_n ==1'b0)

104

tbr <= 8'b0;

else if(txrdy ==1'b1) //If transmitter is ready, then we need to load data from dataline to data buffer register

tbr <= data;

end

///This module is to generate baud clock


begin

if(rst_n ==1'b0)

count <= 0;

else if((tx_sts== 1'b1)&(count ==9))

count <= 0;

else if(tx_sts ==1'b1)

count <= count+1;

else

count<=0;

end

///This module is used, to trigger the baud clock


begin

if(rst_n ==1'b0)

baud_clk <=1'b0;

else if(count == 1)

baud_clk <=1'b1;

else

baud_clk <=1'b0;

end

/// This module is for shifing bit by bit


105

begin

if(rst_n ==1'b0)

begin

tx<=1;

tsr <= 10'b0;

txrdy <=1'b1;

end

else if((wr ==1'b1)& (txrdy ==1'b1)) /// and ((tbr(0) or tbr(1) or tbr(2) or tbr(3) or tbr(4) or tbr(5) or tbr(6)) = '1')

///This piece of code is to load data from TBR to TSR(with Start, Stop and Parity)

begin

tsr[10] <=1'b1;

tsr[9] <= (tbr[0]^ tbr[1] ^ tbr[2] ^ tbr[3]^ tbr[4]^ tbr[5] ^ tbr[6]^ tbr[7]);

tsr[8:1] <= tbr;

tsr[0] <=1'b0;

end

if((tsr[0] | tsr[1] | tsr[2] | tsr[3] | tsr[4] | tsr[5]| tsr[6]| tsr[7] | tsr[8]| tsr[9] | tsr[10])==1'b0)

txrdy <=1'b1; //txrdy is 1, when TSR has finished sending data.

else

txrdy <=1'b0; //txrdy is 0, when TSR has data to be sent

if((txrdy ==1'b0)&(baud_clk ==1'b1))

begin

tx <= tsr[0];

tsr <={1'b0 , tsr[10:1]};

end

end

endmodule

106

AXI_INTERFACING_UART

module axi_interfacing_uart( input Aclk,

input Pclk,

input Rst,

input Awvalid,

input [31:0]Awaddr,

input Wvalid,

input [31:0] Wdata,

input [3:0] Wstrb,

input Bready,

input Arvalid,

input [31:0] Araddr,

input Rready,

output Awready,

output Wready,

output Bvalid,

output Bresp,

output Arready,

output Rvalid,

output [31:0] Rdata,

output Rresp,

input uart_rx,

output uart_tx );

wire[31:0] prdata_from_uart;

wire[31:0] paddr_from_apb;

wire[31:0] pwdata_from_apb;

wire psel_from_apb;

wire penable_from_apb;

wire pwrite_from_apb;

107

wire[3:0] pstrb_s;

///Internal Signals

reg psel_s;

reg pwrite_s;

reg penable_s;


begin

if(~Rst)

psel_s <= 1'b0;

else

psel_s <= psel_from_apb;

end

reg pwrite_s1,pwrite_s2,pwrite_s3,pwrite_s4;


begin

if(~Rst)

begin

pwrite_s <= 1'b0;

pwrite_s1 <= 1'b0;

pwrite_s2 <= 1'b0;

pwrite_s3 <= 1'b0;

pwrite_s4 <= 1'b0;

end

else if(pwrite_from_apb)

begin

pwrite_s1 <= 1'b1;

pwrite_s2 <= pwrite_s1;


pwrite_s <= pwrite_s3;

end

108

else

begin

pwrite_s1 <= 1'b0;



pwrite_s <= pwrite_s3;

end

end

reg penable_s1, penable_s2,penable_s3;


begin

if(~Rst)

begin

penable_s1 <= 1'b0;

penable_s2 <= 1'b0;

penable_s3 <= 1'b0;

penable_s <= 1'b0;

end

else if(penable_from_apb)

begin

penable_s1 <= 1'b1;

penable_s2 <= penable_s1;


penable_s <= penable_s3;

end

else

begin

penable_s1 <= 1'b0;


109


penable_s <= penable_s3;

end

end

apb_bridge_cha axi_2_apb(.Aclk(Aclk),

.Pclk(Pclk),

.Rst(Rst),

.Awvalid(Awvalid),

.Awaddr(Awaddr),

.Wvalid(Wvalid),

.Wdata(Wdata),

.Wstrb(Wstrb),

.Bready(Bready),

.Arvalid(Arvalid),

.Araddr(Araddr),

.Rready(Rready),

.Pready(1'b1),

.Prdata(prdata_from_uart),

.Pslverr(1'b0),

.Awready(Awready),

.Wready(Wready),

.Bvalid(Bvalid),

.Bresp(Bresp),

.Arready(Arready),

.Rvalid(Rvalid),

.Rdata(Rdata),

.Rresp(Rresp),

.Paddr(paddr_from_apb),

.Psel(psel_from_apb),

110

.Penable(penable_from_apb),

.Pwrite(pwrite_from_apb),

.Pwdata(pwdata_from_apb),

.Pstrb(pstrb_s)

);

apb_uart_top apb_2_uart(.PCLK(Pclk),

.PRESETn(Rst),

.PSEL(psel_s),

.PENABLE(penable_s),

.PWRITE(pwrite_s),

.PWDATA(pwdata_from_apb),

.PADDR(paddr_from_apb),

.PRDATA(prdata_from_uart),

.tx(uart_tx),

.rx(uart_rx)

);

endmodule

111

9. Applications

Product Type Application

Computing

Net book, Smart book, Tablet, eReader, Thin client

Mobile Handset

Smartphone’s, Feature phones

Automotive

Infotainment, Navigation

Digital Home

Set-top Box, Digital TV, Blu-Ray player, Gaming consoles

Enterprise

LaserJet printers, routers, wireless base-stations, VOIP phones and equipment

Wireless Infrastructure

Web 2.0, wireless base stations, switches, servers

112

10.RESULTS APB read:

APB_write

113

APB_Bridge:

APB_Interfacing_UART:

114

AXI_Interfacing_uart;

115

UART_Receiver:

UART_transmitter

116

References

1.Design and Implementation of APB Bridge based on AMBA 4.0 (IEEE 2011), ARM Limited.

2. http://en.wikipedia.org/wiki/System_on_a_chip#Structure

3. Power.org Embedded Bus Architecture Report Presented by the Bus Architecture TSC

Version 1.0 – 11 April 2008

4. http://www.arm.com/products/system-ip/amba/amba-open-specifications.php

5. ARM, "AMBA Protocol Specification 4.0", www.arm.com, 2010

6. ARM,AMBA Specification (Rev 2.0).

7. AMBA® 4 AXI4™, AXI4-Lite™, and AXI4-Stream™ Protocol Assertions Revision: r0p0

User Guide.

8. AMBA® APB Protocol Version: 2.0 Specifications.

9. ASB Example AMBA System Technical Reference Manual Copyright © 1998-1999 ARM

Limited.

10. AHB to APB Bridge (AHB2APB) Technical Data Sheet Part Number: T-CS-PR-0005-100

Document Number: I-IPA01-0106-USR Rev 05 March 2007.

11. LogiCORE IP AXI to APB Bridge(v1.00a)DS788 June 22, 2011 Product Specification.

12. Simulation and Synthesis Techniques for Asynchronous FIFO Design Clifford E.Cummings,

Sunburst Design, Inc. [email protected]. SNUG San Jose 2002 Rev 1.2., FIFO

Architecture, Functions, and ApplicationsSCAA042A November 1999.

13. ARM, "AMBA Protocol Specification 4.0", www.arm.com, 2010.

14.Ying-Ze Liao, "System Design and Implementation of AXI Bus", National Chiao Tung

University, October 2007.

117

15. Clifford E. Cummings, "Coding And Scripting Techniques For FSM Designs With

Synthesis-Optimized, Glitch-Free Outputs," SNUG

(Synopsys Users Group Boston, MA 2000) Proceedings, September 2000.

16. Clifford E. Cummings, “Synthesis and Scripting Techniques for Designing Multi-

Asynchronous Clock Designs,” SNUG 2001

17. Simulation and Synthesis Techniques for Asynchronous FIFO Design Clifford E.Cummings,

Sunburst Design, Inc. [email protected] San Jose 2002 Rev 1.2.,

18. Lahir, K., Raghunathan A., Lakshminarayana G., “LOTTERYBUS: a new high-performance

communication architecture for system-on-chip deisgns,” in Proceedings of Design Automation

Conference, 2001.

19.Sanghun Lee, Chanho Lee, Hyuk-Jae Lee, “A new multi-channel onchip-bus architecture for

system-on-chips,” in Proceedings of IEEE International SOC Conference, September 2004.

apb_doc

Documents