harshu1


Upload: pinky-ponky

Post on 17-Oct-2014


Page 1: harshu1

CHAPTER 1

INTRODUCTION

Serial communication is essential to computers and allows them to communicate with low-speed peripheral devices such as the keyboard, the mouse, modems, etc. The Universal Asynchronous Receiver and Transmitter (UART) is the most important component required in serial communication. A UART is an integrated circuit used for conversion of serial data to parallel and vice versa. In this project, we study, design and implement a UART and an APB interface environment for that UART using VHDL.

The UART-APB core is a serial communication controller with a serial data interface, intended primarily for embedded systems and ASIC design. The UART-APB core can be used to interface directly to industry-standard UARTs. The UART-APB core is intentionally a subset of full UART capability, to make the function cost-effective in a programmable device. The UART, a kind of serial communication circuit, is widely used in control systems. A universal asynchronous receiver/transmitter (UART) is an integrated circuit which plays the most important role in serial communication.

The APB interface allows access to the UART-APB core's internal registers,

FIFO, and internal memory. This interface is synchronous to the clock. The baud

generator creates a divided down clock enable that correctly paces the transmit

and receive state machines. To transmit data, it is first loaded into the transmit

data buffer in normal mode, and into the transmit FIFO in FIFO mode. The

receive state machine monitors the activity of the RX signal. Once a START bit

is detected, the receive state machine begins to store the data in the receive buffer

in normal mode and the receive FIFO in FIFO mode.


Block Diagram:

FIGURE 1.1: BASIC STRUCTURE OF APB UART

1.1 APB:

APB (Advanced Peripheral Bus) is used to connect general-purpose, low-speed, low-power peripheral devices. The bridge is the peripheral bus master, while all bus devices (Timer, UART, PIA, etc.) are slaves. APB is a static bus that provides simple addressing with latched addresses and control signals for easy interfacing.

APB is optimized for minimal power consumption and reduced interface

complexity to support peripheral functions.

Advanced Peripheral Bus provides the basic peripheral macro cell

communications infrastructure as a secondary bus from the higher bandwidth

pipelined main system bus. Such peripherals typically have interfaces which are


memory-mapped registers, have no high bandwidth interfaces and are accessed

under programmed control.

1.2 BAUD RATE GENERATOR:

The Baud Rate Generator is a programmable transmit and receive bit

timing device. Given the programmed value, it generates a periodic pulse, which

determines the baud rate of the UART transmission. This pulse is used by the

receiver and transmitter circuit to generate a sampling pulse for sampling the

received serial data and to determine the bit width of the transmit data.
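As a rough illustration of the paragraph above, the following Python sketch models the baud rate generator as a simple clock divider. The 16x oversampling factor, the example clock frequency and the function names are assumptions made for this example; they are not taken from the design described here.

```python
def baud_divisor(clk_hz, baud, oversample=16):
    """Return the integer divisor that turns clk_hz into an oversampled baud tick."""
    if clk_hz <= 0 or baud <= 0 or oversample <= 0:
        raise ValueError("clock, baud rate and oversampling factor must be positive")
    divisor = round(clk_hz / (baud * oversample))
    if divisor < 1:
        raise ValueError("clock is too slow for the requested baud rate")
    return divisor


def baud_ticks(divisor, n_clocks):
    """Simulate a down-counter and return the clock indices on which a tick fires."""
    ticks = []
    counter = divisor - 1
    for clk in range(n_clocks):
        if counter == 0:
            ticks.append(clk)          # periodic pulse used by transmitter and receiver
            counter = divisor - 1      # reload the programmed value
        else:
            counter -= 1
    return ticks
```

With an assumed 50 MHz clock and 9600 baud, `baud_divisor(50_000_000, 9600)` gives the value that would be programmed into the divider, and `baud_ticks` shows how often the pulse then occurs.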

1.3 UART (Universal asynchronous receiver/transmitter):

A universal asynchronous receiver/transmitter (UART) is a type of

"asynchronous receiver/transmitter", a piece of computer hardware that translates

data between parallel and serial forms.

The Universal Asynchronous Receiver/Transmitter (UART) controller is

the key component of the serial communications subsystem of a computer. The

UART takes bytes of data and transmits the individual bits in a sequential

fashion. At the destination, a second UART re-assembles the bits into complete

bytes.

When transmitting, the UART takes 8 bits of parallel data and converts

the data to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least

significant bit first), and one or more stop bits (logic 1).
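The framing described above can be sketched in a few lines of Python. This is only a behavioural illustration; the function name `frame_byte` and its `stop_bits` parameter are hypothetical and not part of the design.

```python
def frame_byte(value, stop_bits=1):
    """Return the line levels for one character: start bit, 8 data bits LSB first, stop bits."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    if stop_bits < 1:
        raise ValueError("at least one stop bit is required")
    data_bits = [(value >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1] * stop_bits          # start bit is 0, stop bits are 1
```

For example, the ASCII character "A" (0x41) with one stop bit occupies ten bit times on the line.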


CHAPTER 2

SYSTEM ON CHIP BUSSES

A system on a chip or system on chip (SoC or SOC) is an integrated

circuit (IC) that integrates all components of a computer or

other electronic system into a single chip. It may contain digital, analog, mixed-

signal, and often radio-frequency functions all on a single chip substrate.

This technology promises new levels of integration on a single chip,

called the System-on-a-Chip (SoC) design, but also presents significant

challenges to the chip designer. Currently, on-chip interconnection networks are

mostly implemented using buses. For SoC applications, design reuse becomes

easier if standard internal connection buses are used for interconnecting

components of the design.

A heterogeneous SoC might include one or more programmable

components such as general purpose processors cores, digital signal processor

cores, or application-specific intellectual property (IP) cores, as well as an analog

front end, on-chip memory, I/O devices, and other application specific circuits.

In other words, a SoC is an IC that implements most or all the functions of a

complete electronic system. On-chip bus organized communication architecture

(CA) is among the top challenges in CMOS SoC technology due to rapidly

increasing operation frequencies and growing chip size. In general, the

performance of the SoC design heavily depends upon the efficiency of its bus

structure. The balance of computation and communication in any application or

task is, of course, known as a fundamental determinant of delivered performance.

Usually, IP cores, as constituents of SoCs, are designed with many different

interfaces and communication protocols. Integrating such cores in a SoC often

requires insertion of suboptimal glue logic. Standards of on-chip bus structures

were developed to avoid this problem. Currently there are a few publicly

available bus architectures from leading manufacturers, such as CoreConnect

from IBM, AMBA from ARM, Silicon Backplane from Sonics, and others.


These bus architectures are usually tied to processor architecture, such as the

PowerPC or the ARM processor. Manufacturers provide cores optimized to work

with these bus architectures, thus requiring minimal extra interface logic.

SOME STANDARD BUS ARCHITECTURES OF THE

SYSTEM ON CHIP:

AMBA 2.0, 3.0 (ARM)

Core Connect (IBM)

Sonics Smart Interconnect (Sonics)

ST Bus (STMicroelectronics)

Wishbone (Open cores)

Avalon (Altera)

PI Bus (OMI)

MARBLE (Univ. of Manchester)

Core Frame (Palm Chip)

2.1 AMBA Bus

AMBA (Advanced Microcontroller Bus Architecture) is a bus standard devised by ARM with the aim of supporting efficient on-chip communication among ARM processor cores. AMBA is hierarchically organized into two bus segments, the system bus and the peripheral bus, mutually connected via a bridge that buffers data and operations between them.

Three distinct buses are defined within the AMBA specification:

The Advanced High-performance Bus (AHB)

The Advanced System Bus (ASB)

The Advanced Peripheral Bus (APB)


2.1.1 Advanced High-performance Bus (AHB)

The AMBA AHB is for high-performance, high clock frequency system

modules. The AHB acts as the high-performance system backbone bus. AHB

supports the efficient connection of processors, on-chip memories and off-chip

external memory interfaces with low-power peripheral macrocell functions. AHB

is also specified to ensure ease of use in an efficient design flow using synthesis

and automated test techniques.

AHB (Advanced High-performance Bus), a later generation of the AMBA bus, is intended for high-performance, high-clock-frequency synthesizable designs. It provides a high-bandwidth communication channel between an embedded processor (ARM, MIPS, AVR, DSP 320xx, 8051, etc.) and high-performance peripherals and hardware accelerators (ASICs, MPEG, color LCD, etc.), on-chip SRAM, the on-chip external memory interface, and the APB bridge. AHB supports multiple bus master operation, burst transfers, wide data bus configurations and non-tristate implementations. The constituents of AHB are the AHB master, slave and decoder.

2.1.2 Advanced System Bus (ASB)

The AMBA ASB is for high-performance system modules. AMBA ASB

is an alternative system bus suitable for use where the high-performance features

of AHB are not required. ASB also supports the efficient connection of

processors, on-chip memories and off-chip external memory interfaces with low-power peripheral macrocell functions. ASB (Advanced System Bus) is the first generation of AMBA system bus, used for simple cost-effective designs that support burst transfers, pipelined transfer operation, and multiple bus masters.

Characteristics of ASB

High Performance

Pipelined Operation

Burst Transfers

Multiple Bus Masters

2.1.3 Advanced Peripheral Bus (APB)


The AMBA APB is for low-power peripherals. AMBA APB is optimized

for minimal power consumption and reduced interface complexity to support

peripheral functions. APB can be used in conjunction with either version of the

system bus.

APB provides the basic peripheral macrocell communications infrastructure as a secondary bus from the higher-bandwidth pipelined main system bus. Such peripherals typically have interfaces which are memory-mapped registers, have no high-bandwidth interfaces and are accessed under programmed control. APB (Advanced Peripheral Bus) is used to connect general-purpose, low-speed, low-power peripheral devices. The bridge is the peripheral bus master, while all bus devices (Timer, UART, PIA, etc.) are slaves. APB is a static bus that provides simple addressing with latched addresses and control signals for easy interfacing.

Characteristics of APB

Low Power

Latched Address and Control

Simple Interface

Suitable for Many Peripherals

CHAPTER 3

AMBA Hierarchy


The processor, on-chip memory and external bus interface all reside on

the high performance system bus. This bus provides a high bandwidth interface

between the elements that are involved in the majority of transfers. Also located

on the high-performance ASB is a bridge to the lower-bandwidth APB, where

most of the peripherals in the system reside.

FIGURE 3.1: A TYPICAL AMBA SYSTEM

An AMBA-based microcontroller typically consists of a high-

performance system backbone bus (AMBA AHB or AMBA ASB), able to

sustain the external memory bandwidth, on which the CPU, on-chip memory and

other Direct Memory Access (DMA) devices reside. This bus provides a high-

bandwidth interface between the elements that are involved in the majority of

transfers. Also located on the high performance bus is a bridge to the lower

bandwidth APB, where most of the peripheral devices in the system are located.

AMBA APB provides the basic peripheral macrocell communications

infrastructure as a secondary bus from the higher bandwidth pipelined main

system bus. Such peripherals typically:

• have interfaces which are memory-mapped registers

• have no high-bandwidth interfaces

• are accessed under programmed control.

The external memory interface is application-specific and may only have

a narrow data path, but may also support a test access mode which allows the

internal AMBA AHB, ASB and APB modules to be tested in isolation with

system-independent test sets.

The Advanced Peripheral Bus appears as a local secondary bus that is

encapsulated as a single ASB slave device. APB provides a low-power extension

to the system bus which builds on ASB signals directly. The APB bridge appears

as a slave module which handles the bus handshake and control signal retiming on

behalf of the local peripheral bus. By defining the APB interface from the starting

point of the system bus, the benefits of the system diagnostics and test

methodology can be exploited.

A full ASB interface is used for:

Bus masters

On-chip memory blocks

External memory interfaces

High-bandwidth peripherals with FIFO interfaces

DMA slave peripherals

A simple APB interface is recommended for:

Simple register-mapped slave devices

Very low power interfaces where clocks cannot be globally routed

Grouping narrow-bus peripherals to avoid loading the system bus


3.1 Objectives of the AMBA specification:

The AMBA specification has been derived to satisfy four key requirements:

To facilitate the right-first-time development of embedded

microcontroller products with one or more CPUs or signal processors

To be technology-independent and ensure that highly reusable peripheral

and system macro cells can be migrated across a diverse range of IC

processes and be appropriate for full-custom, standard cell and gate array

technologies

To encourage modular system design to improve processor independence,

providing a development road-map for advanced cached CPU cores and

the development of peripheral libraries

To minimize the silicon infrastructure required to support efficient on-

chip and off-chip communication for both operation and manufacturing

test.

CHAPTER 4

AMBA Advanced Peripheral Bus (APB)


The Advanced Peripheral Bus (APB) is part of the Advanced

Microcontroller Bus Architecture (AMBA) hierarchy of buses and is optimized

for minimal power consumption and reduced interface complexity.

The AMBA APB should be used to interface to any peripherals which are low

bandwidth and do not require the high performance of a pipelined bus interface.

The latest revision of the APB ensures that all signal transitions are only related

to the rising edge of the clock. This improvement means the APB peripherals can

be integrated easily into any design flow, with the following advantages:

Performance is improved at high-frequency operation

Performance is independent of the mark-space ratio of the clock

Static timing analysis is simplified by the use of a single clock edge

No special considerations are required for automatic test insertion

Many Application-Specific Integrated Circuit (ASIC) libraries have a

better selection of rising edge registers

Easy integration with cycle-based simulators.

These changes to the APB also make it simpler to interface it to the new

Advanced High-performance Bus (AHB).

4.1 APB specification:

The APB specification is described under the following headings:

• Write transfer

• Read transfer

WRITE TRANSFER:


The write transfer starts with the address, write data, write signal and

select signal all changing after the rising edge of the clock. The first clock cycle

of the transfer is called the SETUP cycle. After the following clock edge the

enable signal PENABLE is asserted and this indicates that the ENABLE cycle is

taking place. The address, data and control signals all remain valid throughout the

ENABLE cycle. The transfer completes at the end of this cycle. The enable

signal, PENABLE, will be deasserted at the end of the transfer. The select signal

will also go LOW, unless the transfer is to be immediately followed by another

transfer to the same peripheral. In order to reduce power consumption the address

signal and the write signal will not change after a transfer until the next access

occurs. The protocol only requires a clean transition on the enable signal.
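The write sequence above can be summarised as a cycle-by-cycle table of signal values. The Python sketch below is a behavioural illustration only: the function name and the dictionary layout are hypothetical, and it assumes the transfer is not immediately followed by another one, so the bus returns to idle.

```python
def apb_write_cycles(addr, data):
    """Return the bus signal values for each clock cycle of one APB write transfer."""
    # SETUP cycle: address, write data, PWRITE and PSEL are driven; PENABLE is LOW.
    setup = {"phase": "SETUP", "PSEL": 1, "PENABLE": 0, "PWRITE": 1,
             "PADDR": addr, "PWDATA": data}
    # ENABLE cycle: everything stays valid; only PENABLE changes to HIGH.
    enable = dict(setup, phase="ENABLE", PENABLE=1)
    # After the transfer PSEL and PENABLE go LOW, while PADDR and PWRITE are
    # held unchanged until the next access to reduce power consumption.
    idle = dict(setup, phase="IDLE", PSEL=0, PENABLE=0)
    return [setup, enable, idle]
```

Printing the returned list for a given address and data value gives a textual version of the write transfer diagram.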

FIGURE 4.1: WRITE TRANSFER DIAGRAM

READ TRANSFER:


The timing of the address, write, select and strobe signals are all the same

as for the write transfer. In the case of a read, the slave must provide the data

during the ENABLE cycle. The data is sampled on the rising edge of clock at the

end of the ENABLE cycle.

FIGURE 4.2: READ TRANSFER DIAGRAM

4.2 APB BRIDGE:

The APB Bridge is the only bus master on the AMBA APB. In addition,

the APB Bridge is also a slave on the higher-level system bus.


FIGURE 4.3: APB BRIDGE INTERFACE DIAGRAM

APB bridge description

The bridge unit converts system bus transfers into APB transfers and

performs the following functions:

Latches the address and holds it valid throughout the transfer.

Decodes the address and generates a peripheral select, PSELx.

Only one select signal can be active during a transfer.

Drives the data onto the APB for a write transfer.

Drives the APB data onto the system bus for a read transfer.

Generates a timing strobe, PENABLE, for the transfer.
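The address decoding step listed above can be sketched as follows. The address map is purely hypothetical (base addresses, sizes and slave order are assumptions for this example); the point is only that at most one PSELx is active for any address.

```python
# Hypothetical address map: (base address, size in bytes) for each slave x.
ADDRESS_MAP = [
    (0x0000, 0x100),  # slave 0, e.g. a timer
    (0x0100, 0x100),  # slave 1, e.g. the UART
    (0x0200, 0x100),  # slave 2
]


def decode_psel(paddr):
    """Return one PSELx bit per slave; at most one bit is set for any address."""
    return [1 if base <= paddr < base + size else 0 for base, size in ADDRESS_MAP]
```

An address outside every region selects no slave at all, which corresponds to all PSELx signals staying LOW.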

4.3 APB SLAVE:


APB slave description

The APB slave interface is very flexible. For a write transfer the data can be

latched at the following points:

• on either rising edge of PCLK, when PSEL is HIGH

• on the rising edge of PENABLE, when PSEL is HIGH.

The select signal PSELx, the address PADDR and the write signal PWRITE can

be combined to determine which register should be updated by the write

operation. For read transfers the data can be driven onto the data bus when

PWRITE is LOW and both PSELx and PENABLE are HIGH, while PADDR is

used to determine which register should be read.
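A minimal behavioural sketch of such a slave is shown below. It assumes one of the permitted options, namely latching write data on the PCLK edge at the end of the ENABLE cycle, and it assumes four 32-bit registers at word-aligned addresses that wrap around; the class and method names are hypothetical.

```python
class ApbSlave:
    """Behavioural model of a register-mapped APB slave."""

    def __init__(self, n_regs=4):
        self.regs = [0] * n_regs

    def clock(self, psel, penable, pwrite, paddr, pwdata=0):
        """Evaluate one rising edge of PCLK; return PRDATA for a read, else None."""
        if not (psel and penable):
            return None                          # not in the ENABLE cycle of a transfer
        index = (paddr >> 2) % len(self.regs)    # assumed word-aligned register map
        if pwrite:
            self.regs[index] = pwdata & 0xFFFFFFFF   # PSELx, PADDR and PWRITE pick the register
            return None
        return self.regs[index]                  # read: PWRITE LOW, PSELx and PENABLE HIGH
```

Calling `clock` once with PENABLE LOW (SETUP) and once with PENABLE HIGH (ENABLE) reproduces a complete transfer as seen by the slave.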

AMBA APB signal list

All AMBA APB signals use the single letter P prefix. Some APB signals,

such as the clock, may be connected directly to the system bus equivalent signal.

The table below lists the AMBA APB signal names, along with a description of how each of the signals is used.

Name: Description

PCLK: Bus clock. The rising edge of PCLK is used to time all transfers on the APB.

PRESETn: APB reset. The APB bus reset signal is active LOW and will normally be connected directly to the system bus reset signal.

PADDR[31:0]: APB address bus. This is the APB address bus, which may be up to 32 bits wide and is driven by the peripheral bus bridge unit.

PSELx: APB select. A signal from the secondary decoder, within the peripheral bus bridge unit, to each peripheral bus slave x. This signal indicates that the slave device is selected and a data transfer is required. There is a PSELx signal for each bus slave.

PENABLE: APB strobe. This strobe signal is used to time all accesses on the peripheral bus. The enable signal is used to indicate the second cycle of an APB transfer. The rising edge of PENABLE occurs in the middle of the APB transfer.

PWRITE: APB transfer direction. When HIGH this signal indicates an APB write access and when LOW a read access.

PRDATA: APB read data bus. The read data bus is driven by the selected slave during read cycles (when PWRITE is LOW). The read data bus can be up to 32 bits wide.

PWDATA: APB write data bus. The write data bus is driven by the peripheral bus bridge unit during write cycles (when PWRITE is HIGH). The write data bus can be up to 32 bits wide.

APB slaves have a simple, yet flexible, interface. The exact implementation of

the interface will be dependent on the design style employed and many different

options are possible.


FIGURE 4.4: APB SLAVE INTERFACE DIAGRAM

CHAPTER 5

UART (Universal asynchronous receiver/transmitter)

Block Diagram:


FIGURE 5.1: UART (Universal asynchronous receiver/transmitter)

A universal asynchronous receiver/transmitter (usually abbreviated UART) is a type of "asynchronous receiver/transmitter", a piece of computer hardware that translates data between parallel and serial forms.

The Universal Asynchronous Receiver/Transmitter (UART) controller is

the key component of the serial communications subsystem of a computer. The

UART takes bytes of data and transmits the individual bits in a sequential

fashion. At the destination, a second UART re-assembles the bits into complete

bytes.

When transmitting, the UART takes 8 bits of parallel data and converts the data

to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least

significant bit first), and one or more stop bits (logic 1).

5.1 UART (Universal Asynchronous Receiver and Transmitter)


A UART is usually an individual (or part of an) integrated circuit used for serial communications over a computer or peripheral device serial port. UARTs are now commonly included in microcontrollers. A dual UART, or DUART, combines two UARTs into a single chip. Many modern ICs now come with a UART that can also communicate synchronously; these devices are called USARTs (universal synchronous/asynchronous receiver/transmitter).

The Universal Asynchronous Receiver/Transmitter (UART) takes bytes of data and transmits the individual bits in a sequential fashion. At the destination, a second UART re-assembles the bits into complete bytes. Each UART contains a shift register, which is the fundamental method of conversion between serial and parallel forms. Serial transmission of digital information (bits) through a single wire or other medium is much more cost effective than parallel transmission through multiple wires.

FIGURE 5.2: SERIAL DATA TRANSMISSION

The figure above shows the standard format for serial transmission. Since there is no clock (clk) line, data D is transmitted asynchronously, one byte at a time.


FIGURE 5.3: STRUCTURE OF UART

When no data is being transmitted, D remains high.

To mark the start of a character, a low start bit is transmitted (D goes low).

Next, 8 bits are transmitted, least significant bit first.

When text is being transmitted, ASCII code is usually used. In ASCII code each character is represented by 7 bits, and the 8th bit can be used as a parity bit.

After the 8 bits are transmitted, D must go high for at least one bit time (the stop bit), indicating that a character has been transmitted.

Then, another character can be transmitted at any time.

5.2 SERIAL DATA FORMAT


When transmitting, the UART takes 8 bits of parallel data and converts

the data to a serial bit stream that consists of a start bit (logic 0), 8 data bits (least

significant bit first), and one or more stop bits (logic 1)

FIGURE 5.4: STANDARD SERIAL DATA FORMAT


The Universal Asynchronous Receiver/Transmitter (UART) controller is

the key component of the serial communications subsystem of a computer. The

UART takes bytes of data and transmits the individual bits in a sequential

fashion. At the destination, a second UART re-assembles the bits into complete

bytes. Serial transmission is commonly used with modems and for non-networked communication between computers, terminals and other devices. Asynchronous transmission allows data to be transmitted without the sender having to send a clock signal to the receiver. Instead, the sender and receiver must agree on timing parameters in advance, and special bits are added to each word to synchronize the sending and receiving units.


In asynchronous transmission, teletype-style UARTs send a "start" bit,

five to eight data bits, least-significant-bit first, an optional "parity" bit, and then

one, one and a half, or two "stop" bits. The start bit is the opposite polarity of the

data-line's idle state. The stop bit is the data-line's idle state, and provides a delay

before the next character can start. (This is called asynchronous start-stop

transmission). In mechanical teletypes, the "stop" bit was often stretched to two

bit times to give the mechanism more time to finish printing a character. A

stretched "stop" bit also helps resynchronization.


5.3 Design of UART:

The UART, whose structure is shown in figure 5.5, consists of three units: the transmitter circuit, the receiver circuit and the Control/Status Registers.

FIGURE 5.5: STRUCTURE OF UART BLOCK


5.3.1 Design of UART Transmitter

The block diagram of the UART transmitter is shown in figure 5.6. The

data is loaded from Data Bus into TBR (Transmit Buffer Register) and from TBR

to TSR (Transmit Shift Register), based on the control and status signals

produced by the Control unit. The Size of TSR is taken in such a way that, it

should accommodate the START and STOP bits along with the Data bits which

are loaded from the Data Bus.

Transmission operation is simpler since it is under the control of the transmitting

system. As soon as data is deposited in the shift register after completion of the

previous character, the UART hardware generates a start bit, shifts the required

number of data bits out to the line, generates and appends the parity bit (if used),

and appends the stop bits. Since transmission of a single character may take a

long time relative to CPU speeds, the UART will maintain a flag showing busy

status so that the host system does not deposit a new character for transmission

until the previous one has been completed; this may also be done with an

interrupt. Since full-duplex operation requires characters to be sent and received

at the same time, practical UARTs use two different shift registers for transmitted

characters and received characters.

FIGURE 5.6: UART TRANSMITTER UNIT


The data loaded into the TSR has the START-DATA-STOP format shown in the figure; one bit is sent at a time, with reference to the baud clock. Correspondingly, the TSR is filled with 0s as bits are shifted out, and contains only 0s after the complete data packet has been transmitted.
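The shifting behaviour of the TSR can be illustrated with a small Python model. It assumes a 10-bit register (one start bit, 8 data bits, one stop bit) that shifts right with zeros entering at the top; the function names and the register width are assumptions made for this sketch.

```python
def load_tsr(data):
    """Build the 10-bit TSR word: stop bit (1) at bit 9, data in bits 8..1, start bit (0) at bit 0."""
    return (1 << 9) | ((data & 0xFF) << 1)


def shift_out(tsr, width=10):
    """Shift the TSR out one bit per baud clock; return (bits sent, final TSR contents)."""
    sent = []
    for _ in range(width):
        sent.append(tsr & 1)   # bit 0 drives the serial output
        tsr >>= 1              # a 0 is shifted in at the top of the register
    return sent, tsr
```

After ten baud clocks the returned TSR value is zero, matching the description above.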

5.3.2 Design of UART Receiver

The block diagram of the UART receiver is shown in figure 5.7. The incoming data is captured using the receive baud clock and loaded into the RSR (Receive Shift Register), then from the RSR into the RBR (Receive Buffer Register), and then onto the data bus, based on the control and status signals produced by the control unit. All operations of the UART hardware are controlled by a clock signal which

runs at a multiple of the data rate. For example, each data bit may be as long as

16 clock pulses. The receiver tests the state of the incoming signal on each clock

pulse, looking for the beginning of the start bit. If the apparent start bit lasts at

least one-half of the bit time, it is valid and signals the start of a new character. If

not, the spurious pulse is ignored. After waiting a further bit time, the state of the

line is again sampled and the resulting level clocked into a shift register. After the

required number of bit periods for the character length (5 to 8 bits, typically) have

elapsed, the contents of the shift register is made available (in parallel fashion) to

the receiving system. The UART will set a flag indicating new data is available,

and may also generate a processor interrupt to request that the host processor

transfers the received data. The size of the RSR is chosen so that it can accommodate the START and STOP bits along with the data bits received from the serial line.
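The receive procedure described above (start-bit validation, then sampling once per bit time) can be sketched in Python. The sketch assumes 16 clock pulses per bit, 8 data bits, no parity and one stop bit; `receive_byte` and `OVERSAMPLE` are hypothetical names, and the input is a list of line levels taken on each clock pulse.

```python
OVERSAMPLE = 16  # assumed number of clock pulses per data bit


def receive_byte(samples, data_bits=8):
    """Decode the first character from line levels sampled at 16x the baud rate.

    Returns (value, framing_ok), or None if no complete valid frame is found.
    """
    half = OVERSAMPLE // 2
    i = 0
    while i + half < len(samples):
        window = samples[i:i + half + 1]
        if all(level == 0 for level in window):
            # The line stayed low for half a bit time: accept it as a start bit.
            centre = i + half
            stop_at = centre + (data_bits + 1) * OVERSAMPLE
            if stop_at >= len(samples):
                return None                      # frame is truncated
            value = 0
            for k in range(data_bits):
                bit = samples[centre + (k + 1) * OVERSAMPLE]  # middle of data bit k
                value |= bit << k                # least significant bit arrives first
            framing_ok = samples[stop_at] == 1   # stop bit must be at the idle level
            return value, framing_ok
        i += 1                                   # idle line or spurious pulse: keep looking
    return None
```

A short low pulse that does not last half a bit time is skipped, which corresponds to ignoring a spurious start bit.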


FIGURE 5.7: UART RECEIVER

5.3.3 SERIAL DATA FORMAT

FIGURE 5.8: SERIAL DATA FORMAT


The start bit is always a 0 (logic low), which is also called a space. The

start bit signals the receiving DTE that a character code is coming. The next five

to eight bits, depending on the code set employed, represent the character. In the

ASCII code set the eighth data bit may be a parity bit. The next one or two bits

are always in the mark (logic high, i.e., '1') condition and called the stop bit(s).

They provide a "rest" interval for the receiving DTE so that it may prepare for the

next character which may be after the stop bit(s). The rest interval was required

by mechanical Teletypes which used a motor driven camshaft to decode each

character. At the end of each character the motor needed time to strike the

character bail (print the character) and reset the camshaft.

All operations of the UART hardware are controlled by a clock signal

which runs at a multiple (say, 16) of the data rate - each data bit is as long as 16

clock pulses. The receiver tests the state of the incoming signal on each clock

pulse, looking for the beginning of the start bit. If the apparent start bit lasts at

least one-half of the bit time, it is valid and signals the start of a new character. If

not, the spurious pulse is ignored. After waiting a further bit time, the state of the

line is again sampled and the resulting level clocked into a shift register. After the

required number of bit periods for the character length (5 to 8 bits, typically) have

elapsed, the contents of the shift register is made available (in parallel fashion) to

the receiving system. The UART will set a flag indicating new data is available,

and may also generate a processor interrupt to request that the host processor

transfers the received data. In some common types of UART, a small first-in,

first-out (FIFO) buffer memory is inserted between the receiver shift register and

the host system interface. This allows the host processor more time to handle an

interrupt from the UART and prevents loss of received data at high rates.

Transmission operation is simpler since it is under the control of the

transmitting system. As soon as data is deposited in the shift register, the UART

hardware generates a start bit, shifts the required number of data bits out to the

line, generates and appends the parity bit (if used), and appends the stop bits.

Since transmission of a single character may take a long time relative to CPU


speeds, the UART will maintain a flag showing busy status so that the host

system does not deposit a new character for transmission until the previous one

has been completed; this may also be done with an interrupt. Since full-duplex

operation requires characters to be sent and received at the same time, practical

UARTs use two different shift registers for transmitted characters and received

characters.

Transmitting and receiving UARTs must be set for the same bit speed,

character length, parity, and stop bits for proper operation. The receiving UART

may detect some mismatched settings and set a "framing error" flag bit for the

host system; in exceptional cases the receiving UART will produce an erratic

stream of mutilated characters and transfer them to the host system.

Typical serial ports used with personal computers connected to modems

use eight data bits, no parity, and one stop bit; for this configuration the number

of ASCII characters per second equals the bit rate divided by 10.
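The divide-by-10 figure follows from counting the bits in one frame: one start bit, eight data bits, no parity bit and one stop bit. The arithmetic can be sketched as VHDL constants, as below. The entity name frame_rate_example and the 9600 bit/s rate are assumptions chosen only for illustration; they are not part of the project design.

```vhdl
-- Illustrative sketch only: the entity name and BIT_RATE are assumed values.
entity frame_rate_example is
end frame_rate_example;

architecture sim of frame_rate_example is
    constant BIT_RATE      : natural := 9600; -- bits per second (assumed)
    constant START_BITS    : natural := 1;
    constant DATA_BITS     : natural := 8;
    constant PARITY_BITS   : natural := 0;    -- no parity
    constant STOP_BITS     : natural := 1;
    -- Total bits on the line for one character: 1 + 8 + 0 + 1 = 10
    constant FRAME_BITS    : natural :=
        START_BITS + DATA_BITS + PARITY_BITS + STOP_BITS;
    -- Characters per second = bit rate / bits per frame
    constant CHARS_PER_SEC : natural := BIT_RATE / FRAME_BITS;
begin
    -- Concurrent assertion, evaluated once at the start of simulation.
    assert CHARS_PER_SEC = 960
        report "unexpected character rate" severity failure;
end sim;
```

Setting PARITY_BITS to 1 gives an 11-bit frame, which is the frame length used by the 11-bit shift registers in the listings of Chapter 7.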

5.3.4 Special Receiver Conditions

Overrun Error:

An "overrun error" occurs when the UART receiver cannot process the

character that just came in before the next one arrives. Various UART devices

have differing amounts of buffer space to hold received characters. The CPU

must service the UART in order to remove characters from the input buffer. If the

CPU does not service the UART quickly enough and the buffer becomes full, an

Overrun Error will occur.

Underrun Error:

An "underrun error" occurs when the UART transmitter has completed
sending a character and the transmit buffer is empty. In asynchronous modes this
is treated as an indication that no data remains to be transmitted, rather than an
error, since additional stop bits can be appended. This error indication is commonly
found in USARTs, since an underrun is more serious in synchronous systems.

Framing Error:

A "framing error" occurs when the designated "start" and "stop" bits are

not valid. As the "start" bit is used to identify the beginning of an incoming

character, it acts as a reference for the remaining bits. If the data line is not in the

expected idle state when the "stop" bit is expected, a Framing Error will occur.

Parity Error:

A "parity error" occurs when the number of "active" (logic 1) bits in a received
character does not agree with the parity configuration of the UART, that is, when
the parity computed over the incoming data bits does not match the received
parity bit. Because the "parity" bit is optional, this error cannot occur if parity has
been disabled.
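The parity and framing checks described above can be sketched as small VHDL functions. This is a minimal illustration assuming even parity over eight data bits, which matches the way the transmitter listing in Chapter 7 forms its parity bit (XOR of the eight data bits). The package and function names are hypothetical and do not appear in the project code.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical helper package; names are not taken from the project code.
package uart_check_pkg is
    -- Returns '1' when the number of '1' bits in d is odd.
    function xor_reduce8(d : std_logic_vector(7 downto 0)) return std_logic;
    -- Even parity: returns '1' (error) when the received parity bit p
    -- differs from the XOR of the received data bits.
    function parity_error(d : std_logic_vector(7 downto 0);
                          p : std_logic) return std_logic;
    -- The stop bit must be at the idle (high) level; returns '1' otherwise.
    function framing_error(stop_bit : std_logic) return std_logic;
end uart_check_pkg;

package body uart_check_pkg is
    function xor_reduce8(d : std_logic_vector(7 downto 0)) return std_logic is
        variable r : std_logic := '0';
    begin
        for i in d'range loop
            r := r xor d(i);
        end loop;
        return r;
    end xor_reduce8;

    function parity_error(d : std_logic_vector(7 downto 0);
                          p : std_logic) return std_logic is
    begin
        return xor_reduce8(d) xor p;
    end parity_error;

    function framing_error(stop_bit : std_logic) return std_logic is
    begin
        return not stop_bit;
    end framing_error;
end uart_check_pkg;
```

For example, the data byte "10001001" contains three '1' bits, so its even-parity bit is '1'. Receiving that byte with parity bit '1' gives no parity error, while receiving it with parity bit '0' does.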

BAUD RATE GENERATOR

The Baud Rate Generator is a programmable transmit and receive bit-timing
device. Given the programmed value, it generates a periodic pulse, which
determines the baud rate of the UART transmission. This pulse is used by the
receiver and transmitter circuits to generate a sampling pulse for sampling the
received serial data and to determine the bit width of the transmit data.
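A baud rate generator of this kind can be sketched as a clock divider that emits a one-cycle pulse. The entity below is a minimal illustration, not the project's implementation: the name baud_gen_sketch, the generic DIVISOR (standing in for the programmed value) and the port tick are assumptions.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity; DIVISOR plays the role of the programmed value.
entity baud_gen_sketch is
    generic(DIVISOR : positive := 10);  -- clock cycles per baud tick (assumed)
    port(clk, rst_n : in  std_logic;
         tick       : out std_logic);   -- high for one clk cycle in every DIVISOR cycles
end baud_gen_sketch;

architecture rtl of baud_gen_sketch is
    signal count : integer range 0 to DIVISOR - 1 := 0;
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if rst_n = '0' then
                -- Synchronous, active-low reset
                count <= 0;
                tick  <= '0';
            elsif count = DIVISOR - 1 then
                -- Terminal count reached: restart and emit the pulse
                count <= 0;
                tick  <= '1';
            else
                count <= count + 1;
                tick  <= '0';
            end if;
        end if;
    end process;
end rtl;
```

The resulting tick rate is the clock frequency divided by DIVISOR. With a 50 MHz clock, for instance, a divisor of 5208 gives approximately 9600 ticks per second.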

CHAPTER 6

UART-APB

The UART-APB core is a serial communication controller with a serial
data interface that is intended primarily for embedded systems and ASIC
design. The UART-APB core can be used to interface directly to
industry-standard UARTs. The UART-APB core is intentionally a subset of full
UART capability to make the function cost-effective in a programmable device.
UARTs, a kind of serial communication circuit, are widely used in control
systems. A universal asynchronous receiver/transmitter (UART) is an integrated
circuit that plays the most important role in serial communication.

The APB interface allows access to the UART through APB. The UART
used in an SoC consists of a transmitter, a receiver and a baud rate generator. It
has to be connected to the APB, the peripheral bus in AMBA used to connect
different peripherals; hence an APB interface for the UART is needed.

The APB interface allows access to the UART-APB internal registers,
FIFO, and internal memory. This interface is synchronous to the clock. The baud
generator creates a divided-down clock enable that correctly paces the transmit
and receive state machines. To transmit data, it is first loaded into the transmit
data buffer in normal mode, and into the transmit FIFO in FIFO mode. The
receive state machine monitors the activity of the RX signal.
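Register access over APB takes at least two clock cycles: a setup phase with PSEL high and PENABLE low, followed by an access phase with PENABLE high. A testbench procedure that performs one such write transfer might look like the sketch below. The package name apb_bfm_sketch and the procedure apb_write are hypothetical; the address and data widths follow the apb_interface entity listed in Chapter 7, and wait states (PREADY) are not modelled because that entity has no such port.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench helper; not part of the project code.
package apb_bfm_sketch is
    procedure apb_write(
        constant addr    : in  std_logic_vector(9 downto 0);
        constant data    : in  std_logic_vector(31 downto 0);
        signal   pclk    : in  std_logic;
        signal   psel    : out std_logic;
        signal   penable : out std_logic;
        signal   pwrite  : out std_logic;
        signal   paddr   : out std_logic_vector(9 downto 0);
        signal   pwdata  : out std_logic_vector(31 downto 0));
end apb_bfm_sketch;

package body apb_bfm_sketch is
    procedure apb_write(
        constant addr    : in  std_logic_vector(9 downto 0);
        constant data    : in  std_logic_vector(31 downto 0);
        signal   pclk    : in  std_logic;
        signal   psel    : out std_logic;
        signal   penable : out std_logic;
        signal   pwrite  : out std_logic;
        signal   paddr   : out std_logic_vector(9 downto 0);
        signal   pwdata  : out std_logic_vector(31 downto 0)) is
    begin
        -- Setup phase: select the peripheral and present address and data
        wait until rising_edge(pclk);
        psel    <= '1';
        penable <= '0';
        pwrite  <= '1';
        paddr   <= addr;
        pwdata  <= data;
        -- Access phase: assert PENABLE for one clock cycle
        wait until rising_edge(pclk);
        penable <= '1';
        -- The transfer completes on the next rising edge; then release the bus
        wait until rising_edge(pclk);
        psel    <= '0';
        penable <= '0';
    end apb_write;
end apb_bfm_sketch;
```

Because the procedure contains wait statements, it has to be called from a process without a sensitivity list, for example as apb_write("0000000000", x"000000AA", PCLK, PSEL, PENABLE, PWRITE, PADDR, PWDATA);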

CHAPTER 7

SOFTWARE CODE

UART TRANSMITTER CODE

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity uart_transmitter is

port(clk, rst_n, wr: in std_logic;

data: in std_logic_vector(7 downto 0);

txrdy: inout std_logic;

tx: out std_logic);

end uart_transmitter;

architecture Behavioral of uart_transmitter is

signal count: integer;

signal tbr: std_logic_vector(7 downto 0);

signal tsr: std_logic_vector(10 downto 0);

signal baud_clk: std_logic;

signal tx_sts: std_logic;

begin

-- This module keeps track of the transmitter status (busy or free).

-- tx_sts = 1 means the transmitter is busy;

-- tx_sts = 0 means the transmitter is free.

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

tx_sts <= '0'; --Transmitter is free

elsif(wr = '1' and txrdy = '1') then

tx_sts <= '1'; --Transmitter is busy

elsif(txrdy = '1') then

tx_sts <= '0'; --Transmitter is free

end if;

end if;

end process;

--This module is to load data from dataline to tbr

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

tbr <= "00000000";

elsif(txrdy = '1') then

--If the transmitter is ready, load the data from the data line into the

--transmit buffer register

tbr <= data;

end if;

end if;

end process;

-- This module is to generate baud clock

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

count <= 0;

elsif(tx_sts = '1' and count = 9) then

count <= 0;

elsif(tx_sts = '1') then

count <= count+1;

else

count<=0;

end if;

end if;

end process;

--This module is used, to trigger the baud clock

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

baud_clk <= '0';

elsif(count = 1) then

baud_clk <= '1';

else

baud_clk <= '0';

end if;

end if;

end process;

-- This module is for shifting bit by bit

process(clk, baud_clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

tsr <= "00000000000";

txrdy <= '1';

elsif((wr = '1') and (txrdy = '1')) then

-- and ((tbr(0) or tbr(1) or tbr(2) or tbr(3) or tbr(4) or tbr(5) or tbr(6)) = '1')

-- This piece of code loads data from TBR into TSR (with Start, Stop and

-- Parity bits)

tsr(10) <= '1';

tsr(9) <= (tbr(0) xor tbr(1) xor tbr(2) xor tbr(3) xor tbr(4) xor tbr(5) xor tbr(6)

xor tbr(7));

tsr(8 downto 1) <= tbr;

tsr(0) <= '0';

end if;

if((tsr(0) or tsr(1) or tsr(2) or tsr(3) or tsr(4) or tsr(5)or tsr(6)or tsr(7) or tsr(8)or

tsr(9) or tsr(10)) = '0') then

txrdy <= '1'; --txrdy is 1, when TSR has finished sending data.

else

txrdy <= '0'; --txrdy is 0, when TSR has data to be sent.

end if;

if(txrdy = '0' and baud_clk = '1') then

tx <= tsr(0);

tsr <= '0' & tsr(10 downto 1);

end if;

end if;

end process;

end Behavioral;

TESTBENCH

clk <= not clk after 10 ns;

rst_n <= '0', '1' after 20 ns;

tb: process

begin

wr <= '1';

data <= "10001001";

wait for 2100 ns;

wr <= '0';

RECEIVER CODE

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity uart_receiver is

port(clk, rst_n, rd, rx: in std_logic;

parityerr: out std_logic;

rxrdy: inout std_logic;

rsr: inout std_logic_vector(10 downto 0);

rhr: inout std_logic_vector(7 downto 0);

det_rx: inout std_logic;

rd_clk: inout std_logic;

flag: inout std_logic;

data: out std_logic_vector(7 downto 0));

end uart_receiver;

architecture Behavioral of uart_receiver is

signal count, countrx: integer;

signal temp_rhr: std_logic_vector(7 downto 0);

signal rbaud_clk: std_logic;

begin

-- This module is to detect the receiving bit i.e., startbit

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

det_rx <= '0';

-- On reset, we assume that, det_rx is not activated.

countrx <= 0;

elsif(rx = '0') then

det_rx <= '1';

-- If a start bit is received, the det_rx control signal is enabled.

countrx <= countrx+1;

elsif (flag = '1') then --and (countrx < 100))

det_rx <= '0';

-- If the start bit occupies the first bit of RSR, i.e., all the bits have been

-- received into the receiver

end if;

end if;

end process;

-- This module keeps track of the count, which is helpful in

-- generating baud clocks

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

count <= 0;

elsif(det_rx = '1' and count = 9) then

count <= 0;

elsif(det_rx = '1') then

count <= count+1;

else

count <= 0;

end if;

end if;

end process;

---This module is for generation of baud clk

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

rbaud_clk <= '0';

elsif(count = 1) then

rbaud_clk <= '1';

else

rbaud_clk <= '0';

end if;

end if;

end process;

---This module receives data from the transmitter line into the receiver,

---i.e., into RSR

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

rsr <= "11111111111";

elsif(flag = '1') then

--Once the start bit has reached the first position, RSR is reset

rsr <= "11111111111";

elsif(rbaud_clk = '1') then

---Receiving bits bit by bit; rbaud_clk is used as a clock enable

rsr(9 downto 0) <= rsr(10 downto 1);

rsr(10) <= rx;

end if;

end if;

end process;

--- This module is to assign value to the flag

process(clk)

begin

if(rst_n = '0') then

flag <= '0';

elsif(clk'event and clk = '1') then

if(rsr(0) = '0') then

flag <= '1';

elsif(det_rx = '1') then

flag <= '0';

end if;

end if;

end process;

--- This module is to receive data from RSR to RHR

process(clk)

begin

if(rst_n = '0') then

rhr <= "11111111";

elsif(clk'event and clk = '1') then

rhr <= rsr(8 downto 1);

end if;

end process;

process(clk)

begin

if(rst_n = '0') then

rd_clk <= '0';

elsif(clk'event and clk = '1') then

if(flag = '1') then

rd_clk <= '1';

else

rd_clk <= '0';

end if;

end if;

end process;

---This module shifts data from RHR to the data line with the help of the read (rd)

---signal

process(flag)

begin

if(rst_n = '0') then

data <= "00000000";

elsif(flag'event and flag = '1') then

data <= rhr;

end if;

end process;

---This module is to monitor, whether Receiver is ready or not

process(clk)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

rxrdy <= '0';

elsif(flag = '1') then

rxrdy <= '1';

elsif(rd = '1') then

rxrdy <= '0';

end if;

end if;

end process;

---This module is for parity error

process(clk, rsr)

begin

if(clk'event and clk = '1') then

if(rst_n = '0') then

parityerr <= '0';

elsif(rd = '1') then

parityerr <= '0';

elsif(rsr(0) = '0') then

--Even parity: no error when the XOR of the data bits equals the received parity bit

if((rsr(8) xor rsr(7) xor rsr(6) xor rsr(5) xor rsr(4) xor rsr(3) xor rsr(2) xor rsr(1))

= rsr(9)) then

parityerr <= '0';

else

parityerr <= '1';

end if;

end if;

end if;

end process;

end Behavioral;

TESTBENCH

clk<=not clk after 1ns;

rst_n<='0','1' after 10ns;

process

begin

rx<='0';

wait for 35ns;

rx<='1';

wait for 35 ns;

rd<='0';rx<='0';

wait for 35ns;

rd<='1';rx<='1';

UART INTERFACE CODE

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity uart is

Port(clk,rst,wr,rd,rx:in std_logic;

tx : out std_logic;

din : in STD_LOGIC_VECTOR (7 downto 0);

rdata_out : out STD_LOGIC_vector(7 downto 0);

txrdy, rxrdy : out std_logic);

end uart;

architecture Behavioral of uart is

signal parityerr:std_logic;

signal rxrdy1,txrdy1:std_logic;

component uart_transmitter is

port(clk, rst_n, wr: in std_logic;

data: in std_logic_vector(7 downto 0);

txrdy: inout std_logic;

tx: out std_logic);

end component;

component uart_receiver is

port(clk, rst_n, rd, rx: in std_logic;

parityerr: out std_logic;

rxrdy: inout std_logic;

data: out std_logic_vector(7 downto 0));

end component;

begin

rxrdy<=rxrdy1 when rst='1' else 'Z';

txrdy<=txrdy1 when rst='1' else 'Z';

u4:uart_transmitter port map (

clk => clk,

rst_n=>rst,

wr=>wr,

data=>din,

txrdy=>txrdy1,

tx=>tx);

u5:uart_receiver port map(

clk => clk,

rst_n=>rst,

rd=>rd,

rx=>rx,

parityerr=>parityerr,

rxrdy=>rxrdy1,

data=>rdata_out);

end Behavioral;

TESTBENCH

process

begin

clk <= '1';

wait for 1 ns;

clk <= '0';

wait for 1 ns;

end process;

process

begin

rst <= '0';

wr <= '0';

wait for 100 ns;

rst <= '1';

wait for 15 ns;

din <= "01011011";

rx<='0';

wr <= '1';

wait for 85 ns;

din <= "01001001";

wait for 100 ns;rx<='1';

din <= "01011010";

wait for 100 ns;

din <= "01011110";

wait for 100 ns;rx<='0';

din <= "01111100";

wait for 100 ns;

rx<='1'; rd<='1';

wait;

end process;

APB INTERFACE CODE

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity apb_interface is

port( PCLK:in std_logic;

PRESETn:in std_logic;

PSEL:in std_logic;

PENABLE:in std_logic;

PWRITE:in std_logic;

PWDATA:in std_logic_vector(31 downto 0);

PADDR:in std_logic_vector(9 downto 0);

PRDATA:out std_logic_vector(31 downto 0);

---UART

uart_rx_data:in std_logic_vector(7 downto 0);

uart_tx_data:out std_logic_vector(7 downto 0);

rxrdy,txrdy:in std_logic;

uart_rd,uart_wr:out std_logic);

end apb_interface;

architecture Behavioral of apb_interface is

Constant REG0_ADDR:std_logic_vector(1 downto 0):="00";

Constant REG1_ADDR:std_logic_vector(1 downto 0):="01";

Constant REG2_ADDR:std_logic_vector(1 downto 0):="10";

Constant REG3_ADDR:std_logic_vector(1 downto 0):="11";

signal Reg0:std_logic_vector(31 downto 0):=(others=>'0');

signal Reg1:std_logic_vector(31 downto 0):=(others=>'0');

signal Reg2:std_logic_vector(31 downto 0):=(others=>'0');

signal Reg3:std_logic_vector(31 downto 0):=(others=>'0');

signal Next_Reg0:std_logic_vector(31 downto 0):=(others=>'0');

signal Next_Reg1:std_logic_vector(31 downto 0):=(others=>'0');

signal Next_Reg2:std_logic_vector(31 downto 0):=(others=>'0');

signal Next_Reg3:std_logic_vector(31 downto 0):=(others=>'0');

signal Next_PRDATA:std_logic_vector(31 downto 0):=(others=>'0');

signal iPRDATA:std_logic_vector(31 downto 0):=(others=>'0');

-- Read Fill Vector

signal ZeroFill:std_logic_vector(31 downto 0):=(others=>'0');

-- Gated version of PADDR

signal GatedPADDR:std_logic_vector(9 downto 0):=(others=>'0');

-- Internal read enable signal

signal Rden:std_logic;

-- Internal write enable signal

signal Wren:std_logic;

-- Internal PRDATA write enable signal

signal PRDATAEn:std_logic;

-- Internal Write Data Bus, to reduce power consumption

signal PWDATAIn:std_logic_vector(31 downto 0):=(others=>'0');

signal REG0rd:std_logic:='0';

signal REG1rd:std_logic:='0';

signal REG2rd:std_logic:='0';

signal REG3rd:std_logic:='0';

signal REG0wr:std_logic:='0';

signal REG1wr:std_logic:='0';

signal REG2wr:std_logic:='0';

signal REG3wr:std_logic:='0';

begin

PWDATAIn <= PWDATA when ((PSEL='1') and (PWRITE = '1')) else

(others=>'0');

GatedPADDR <= PADDR when (PSEL='1') else (others=>'0');

Wren <= PENABLE and PWRITE and PSEL;

Rden <= PSEL and (not(PWRITE)) and (not(PENABLE));

PRDATAEn <= PSEL and (not(PWRITE)) and (not(PENABLE));

PRDATA <= iPRDATA;

REG0wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =

REG0_ADDR)) else '0';

REG1wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =

REG1_ADDR)) else '0';

REG2wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =

REG2_ADDR)) else '0';

REG3wr <= '1' when ((Wren = '1') and (GatedPADDR(1 downto 0) =

REG3_ADDR)) else '0';

REG0rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =

REG0_ADDR)) else '0';

REG1rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =

REG1_ADDR)) else '0';

REG2rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =

REG2_ADDR)) else '0';

REG3rd <= '1' when ((Rden = '1') and (GatedPADDR(1 downto 0) =

REG3_ADDR)) else '0';

process(PCLK,PRESETn)

begin

if(PRESETn='0')then

iPRDATA <= (others=>'0');

elsif(PCLK'event and PCLK='1')then

iPRDATA <= Next_PRDATA;

end if;

end process;

process(REG0rd,REG1rd,REG2rd,REG3rd,REG0,REG1,REG2,REG3)

begin

if(REG0rd = '1') then

Next_PRDATA <= REG0;

elsif(REG1rd = '1') then

Next_PRDATA <= REG1;

elsif(REG2rd = '1') then

Next_PRDATA <= REG2;

elsif(REG3rd = '1') then

Next_PRDATA <= REG3;

else

Next_PRDATA <= ZeroFill;

end if;

end process;

---Implementation of REG0 register

---UART_Tx_DATA register

process(PWDATAIn,REG0wr,REG0)

begin

if(REG0wr = '1')then

Next_REG0 <= PWDATAIn;

else

Next_REG0 <= REG0;

end if;

end process;

process(PCLK,PRESETn)

begin

if(PRESETn='0')then

REG0 <= (others=>'1');

elsif(PCLK'event and PCLK='1')then

if(txrdy='1')then

uart_tx_data <= Next_REG0(7 downto 0);

else

REG0 <= Next_REG0;

end if;

end if;

end process;

-- Implementation of REG1 register

-- UART RX DATA request generating register

process(PWDATAIn,REG1wr,REG1)

begin

if(REG1wr='1')then

Next_REG1 <= PWDATAIn;

else

Next_REG1 <= REG1;

end if;

end process;

process(PCLK,PRESETn)

begin

if (PRESETn='0')then

REG1 <= (others=>'0');

elsif(PCLK'event and PCLK='1')then

if(rxrdy='1')then

REG1(7 downto 0) <=uart_rx_data;

else

REG1 <= Next_REG1;

end if;

end if;

end process;

---Implementation of REG2 register

---Control Register

process(PWDATAIn,REG2wr,REG2)

begin

if(REG2wr ='1') then

Next_REG2 <= PWDATAIn;

else

Next_REG2 <= REG2;

end if;

end process;

process(PCLK,PRESETn)

begin

if(PRESETn ='0') then

REG2 <= (others=>'0');

elsif(PCLK'event and PCLK='1') then

REG2 <= Next_REG2;

end if;

end process;

uart_wr <= REG2(0);

uart_rd <= REG2(7);

---Implementation of REG3 register

process(PWDATAIn,REG3wr,REG3)

begin

if(REG3wr ='1')then

Next_REG3 <= PWDATAIn;

else

Next_REG3 <= REG3;

end if;

end process;

process(PCLK,PRESETn)

begin

if (PRESETn ='0')then

REG3 <= x"00000200";

elsif(PCLK'event and PCLK='1') then

REG3 <= Next_REG3;

end if;

end process;

end Behavioral;

TESTBENCH

PROCESS

BEGIN

PRESETn<='0', '1' after 10 ns;

PSEL<='1';

PENABLE<='1';

PWRITE<='1';

PADDR<=x"00" & "00";

PWDATA<=x"000000aa";

wait for 40 ns;

txrdy<='1';

PADDR<=x"00" & "10";

PWDATA<=x"00000001";

wait for 100 ns;

PWRITE<='0';

PENABLE<='0';

uart_rx_data<=x"29";

PADDR<=x"00" & "01";

rxrdy<='1','0' after 5 ns;

-- Wait 100 ns for global reset to finish

wait for 100 ns;

APB UART INTERFACE CODE:

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity apb_uart_top is

port( --APB signals

PCLK:in std_logic;

PRESETn:in std_logic;

PSEL:in std_logic;

PENABLE:in std_logic;

PWRITE:in std_logic;

PWDATA:in std_logic_vector(31 downto 0);

PADDR:in std_logic_vector(9 downto 0);

PRDATA:out std_logic_vector(31 downto 0);

--UART Signals

tx : out std_logic;

rx:in std_logic);

end apb_uart_top;

architecture Behavioral of apb_uart_top is

component apb_interface is

port( PCLK:in std_logic;

PRESETn:in std_logic;

PSEL:in std_logic;

PENABLE:in std_logic;

PWRITE:in std_logic;

PWDATA:in std_logic_vector(31 downto 0);

PADDR:in std_logic_vector(9 downto 0);

PRDATA:out std_logic_vector(31 downto 0);

--- UART

uart_rx_data:in std_logic_vector(7 downto 0);

uart_tx_data:out std_logic_vector(7 downto 0);

rxrdy,txrdy:in std_logic;

uart_rd,uart_wr:out std_logic);

end component;

component uart is

Port(clk,rst,wr,rd,rx:in std_logic;

tx : out std_logic;

din : in STD_LOGIC_VECTOR (7 downto 0);

rdata_out : out STD_LOGIC_vector(7 downto 0);

txrdy, rxrdy : out std_logic);

end component;

Signal uart_rx_data,uart_tx_data:std_logic_vector(7 downto 0):=(others=>'0');

Signal uart_rd,uart_wr:std_logic:='0';

Signal txrdy,rxrdy:std_logic:='0';

begin

APB_IF: apb_interface port map(PCLK => PCLK,

PRESETn => PRESETn,

PSEL => PSEL,

PENABLE => PENABLE,

PWRITE => PWRITE,

PWDATA => PWDATA,

PADDR => PADDR,

PRDATA => PRDATA,

uart_rx_data => uart_rx_data,

uart_tx_data => uart_tx_data,

rxrdy => rxrdy,

txrdy => txrdy,

uart_rd => uart_rd,

uart_wr => uart_wr);

UART_TOP: uart port map(clk => PCLK,

rst => PRESETn,

wr => uart_wr,

rd => uart_rd,

rx => rx,

tx => tx,

din => uart_tx_data,

rdata_out => uart_rx_data,

txrdy => txrdy,

rxrdy => rxrdy);

end Behavioral;

TESTBENCH

PROCESS

BEGIN

PRESETn<='0', '1' after 2 ns;

PSEL<='1';

PENABLE<='1';

PWRITE<='1';

PADDR<=x"00" & "00";

PWDATA<=x"00000007"; -- uart-tx data

rx<='0';

wait for 4 ns;

PADDR<=x"00" & "10"; --to enable uart write

PWDATA<=x"00000001";

rx<='0';

wait for 4 ns;

PADDR<=x"00" & "10";

PWDATA<=x"00000000"; --to disable uart write

rx<='0';

wait for 10 ns;

rx<='0';

wait for 60 ns;

rx<='1';

wait for 60 ns;

rx<='0';

wait for 60 ns;

rx<='1';

PADDR<=x"00" & "10";

PWDATA<=x"00000080"; --to enable uart read

wait for 10 ns;

PADDR<=x"00" & "10";

PWDATA<=x"00000000"; --to disable uart read

wait for 2 ns;

PADDR<=x"00" & "01"; --to read from Reg1 i.e uart rx data

PSEL<='1';

PENABLE<='0';

PWRITE<='0';

CHAPTER 8

RESULT

UART TRANSMITTER TEST BENCH

UART RECEIVER TEST BENCH

UART INTERFACE

APB INTERFACE

APB UART TOP MODULE WAVEFORM

CHAPTER 9

TOOLS AND HDL USED

We used Xilinx ISE 9 for simulation and synthesis. We implemented the
prescribed design in VHDL, a widely used industry and IEEE-standard HDL.

Brief History:

VHDL was developed in the early 1980s, funded by the U.S. Department of
Defense, for managing design problems that involved large circuits and multiple
teams of engineers. The first publicly available version was released in 1985. In
1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL. Standardization followed in
1987 as IEEE 1076-1987. An improved version of the language was released in
1994 as IEEE Standard 1076-1993.

VHDL

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

HDL

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability

CHAPTER 10

VLSI DESIGN FLOW

10.1 INTRODUCTION

The word "digital" has made a dramatic impact on our society. More
significant is a continuous trend towards digital solutions in all areas, from
electronic instrumentation, control, data manipulation, signal processing and
telecommunications to consumer electronics. Development of such solutions has
been possible due to good digital system design and modeling techniques.

10.2 CONVENTIONAL APPROACH TO DIGITAL DESIGN

Digital ICs of SSI and MSI types have become universally standardized

and have been accepted for use. Whenever a designer has to realize a digital

function, he uses a standard set of ICs along with a minimal set of additional

discrete circuitry.

Consider a simple example of realizing a function as

$Q_{n+1} = Q_n + (A \cdot B)$

Here $Q_n$, A, and B are Boolean variables, with $Q_n$ being the value of Q at
the nth time step. Here $A \cdot B$ signifies the logical AND of A and B; the ‘+’
symbol signifies the logical OR of the logic variables on either side. A circuit to
realize the function is shown in Figure 10.1. The circuit can be realized in terms of
two ICs: an A-O-I gate and a flip-flop. It can be directly wired up, tested, and used.

FIGURE 10.1: SIMPLE DIGITAL CIRCUIT
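For comparison with the wired-up circuit, the same function can be sketched in VHDL as a single clocked process: the flip-flop holds Q, and its next value is Q OR (A AND B). The entity name q_latch_sketch and the synchronous active-low reset are assumptions added for illustration; the equation itself does not specify a reset.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity illustrating Q(n+1) = Q(n) + (A . B).
entity q_latch_sketch is
    port(clk, rst_n, a, b : in  std_logic;
         q                : out std_logic);
end q_latch_sketch;

architecture rtl of q_latch_sketch is
    signal q_reg : std_logic := '0';   -- current value of Q
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if rst_n = '0' then
                q_reg <= '0';          -- assumed reset, not part of the equation
            else
                -- Next state: OR of the present state with (A AND B)
                q_reg <= q_reg or (a and b);
            end if;
        end if;
    end process;

    q <= q_reg;
end rtl;
```

Once A and B are both high at a clock edge, Q goes high and stays high until reset, which is the behavior the equation describes.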

With comparatively larger circuits, the task mostly reduces to one of

identifying the set of ICs necessary for the job and interconnecting; rarely does

one have to resort to a micro level design. The accepted approach to digital

design here is a mix of the top-down and bottom-up approaches as follows:

• Decide the requirements at the system level and translate them to circuit

requirements.

• Identify the major functional blocks required like timer, DMA unit, register file

say as in the design of a processor.

• Whenever a function can be realized using a standard IC, use the same – for
example programmable counter, mux, demux.
• Whenever the above is not possible, form the circuit to carry out the block
functions using standard SSI – for example gates, flip-flops.
• Use additional components like transistor, diode, resistor, capacitor, wherever
essential.

Once the above steps are gone through, a paper design is ready. Starting
with the paper design, one has to do a circuit layout: the physical location of all
the components is tentatively decided, they are interconnected, and the
‘circuit-on-paper’ is made ready. From the layout a net-list is prepared. Based on
this, the PCB is fabricated and populated, and all the populated cards are tested
and debugged. The procedure is shown as a process flowchart in Figure 10.2.

FIGURE 10.2: SEQUENCE OF STEPS IN CONVENTIONAL ELECTRONIC CIRCUIT DESIGN

At the debugging stage one may encounter three types of problems:

• Functional mismatch: The realized and expected functions are different. One

may have to go through the relevant functional block carefully and locate any

error logically. Finally the necessary correction has to be carried out in hardware.

• Timing mismatch: The problem can manifest in different forms. One possibility

is due to the signal going through different propagation delays in two paths and

arriving at a point with a timing mismatch. This can cause faulty operation.

Another possibility is a race condition in a circuit involving asynchronous

feedback. This kind of problem may call for elaborate debugging. The preferred
practice is to debug at smaller module stages and to ensure that feedback
through larger loops is avoided; it becomes essential to check for the existence of
long asynchronous loops.

• Overload: Some signals may be overloaded to such an extent that the signal
transition may be unduly delayed or even suppressed. The problem manifests as
reflections and erratic behavior in some cases (the signal has to be suitably
buffered here). In fact, overload on a signal can lead to timing mismatches.

The above debugging has to be carried out after the prototype PCB has been
manufactured; it involves cost, time, and also a redesign process to develop a
bug-free design.

10.3 VLSI DESIGN

The complexity of the VLSI circuits being designed and used today makes the
manual approach to design impractical. Design automation is the order of the
day. With the rapid technological developments in the last two decades, the status
of VLSI technology is characterized by the following:

• A steady increase in the size and hence the functionality of the ICs.

• A steady reduction in feature size and hence increase in the speed of operation

as well as gate or transistor density.

• A steady improvement in the predictability of circuit behavior.

• A steady increase in the variety and size of software tools for VLSI design.

The above developments have resulted in a proliferation of approaches to VLSI
design. We briefly describe the procedure of automated design flow; the aim is
more to bring out the role of a Hardware Description Language (HDL) in the
design process. An abstraction-based model is the basis of the automated design.

10.3.1 Abstraction Model

The model divides the whole design cycle into various domains. With such
an abstraction through a division process, the design is carried out in different
layers. The designer at one layer can function without bothering about the layers
above or below. The thick horizontal lines separating the layers in Figure 10.3
signify the compartmentalization. As an example, let us consider design at the
gate level. The circuit to be designed would be described in terms of truth tables
and state tables. With these as available inputs, the designer has to express them
as Boolean logic equations and realize them in terms of gates and flip-flops. In
turn, these form the inputs to the layer immediately below. Compartmentalization
of the approach to design in the manner described here is the essence of
abstraction; it is the basis for development and use of CAD tools in VLSI design
at various levels.

The design methods at different levels use the respective aids such as Boolean
equations, truth tables, state transition tables, etc. But the aids play only a small
role in the process. To complete a design, one may have to switch from one tool
to another, raising the issues of tool compatibility and learning new
environments.

10.4 ASIC DESIGN FLOW

As with any other technical activity, development of an ASIC starts with an idea
and takes tangible shape through the stages of development shown in Figure
10.4 and in detail in Figure 10.5. The first step in the process is to expand the
idea in terms of behavior of the target circuit. Through stages of programming,
the same is fully developed into a design description, in terms of well-defined
standard constructs and conventions.

FIGURE 10.3: DESIGN DOMAIN LEVELS OF ABSTRACTION

FIGURE 10.4: MAJOR ACTIVITIES IN ASIC DESIGN

The design is tested through a simulation process; it is to check, verify,

and ensure that what is wanted is what is described. Simulation is carried out

through dedicated tools. With every simulation run, the simulation results are

studied to identify errors in the design description. The errors are corrected and

another simulation run carried out. Simulation and changes to design description

together form a cyclic iterative process, repeated until an error-free design is

evolved.

Design description is an activity independent of the target technology or

manufacturer. It results in a description of the digital circuit. To translate it into a

tangible circuit, one goes through the physical design process. The same

constitutes a set of activities closely linked to the manufacturer and the target

technology.

10.4.1 Design Description

The design is carried out in stages. The process of transforming the idea
into a detailed circuit description in terms of the elementary circuit components
constitutes design description. The final circuit of such an IC can have up to a
billion such components; it is arrived at in a step-by-step manner. The first step in
evolving the design description is to describe the circuit in terms of its behavior.
The description looks like a program in a high-level language like C. Once the
behavioral level design description is ready, it is tested extensively with the help
of a simulation tool; it checks and confirms that all the expected functions are
carried out satisfactorily. If necessary, this behavioral level routine is edited,
modified, and rerun, all done manually. Finally, one has a design for the
expected system, described at the behavioral level. The behavioral design forms
the input to the synthesis tools, for circuit synthesis. The behavioral constructs
not supported by the synthesis tools are replaced by data flow and gate level
constructs. To summarize, the designer has to develop synthesizable code for the
design.

FIGURE 10.5: ASIC DESIGN AND DEVELOPMENT FLOW

The design at the behavioral level is to be elaborated in terms of known

and acknowledged functional blocks. It forms the next detailed level of design

description. Once again the design is to be tested through simulation and

iteratively corrected for errors. The elaboration can be continued one or two steps

further. It leads to a detailed design description in terms of logic gates and

transistor switches.

10.4.2 Optimization

The circuit at the gate level – in terms of the gates and flip-flops – can be

redundant in nature. This redundancy can be reduced with the help of minimization

tools. This step is not shown separately in the figure. The minimized logical

design is converted to a circuit in terms of the switch level cells from standard

libraries provided by the foundries. The cell based design generated by the tool is

the last step in the logical design process; it forms the input to the first level of

physical design.
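
To make the idea of redundancy concrete, here is a minimal sketch, assuming a hypothetical three-input function that is not part of the project. The unminimized expression uses several gates, yet it reduces to a single input signal; an exhaustive truth-table comparison confirms that the two forms are equivalent. Real minimization tools work on far larger networks and use more sophisticated methods; this only illustrates the principle.

```python
from itertools import product


def redundant_logic(a, b, c):
    """Unminimized form: (a AND b) OR (a AND NOT b) OR (a AND c)."""
    return bool((a and b) or (a and not b) or (a and c))


def minimized_logic(a, b, c):
    """Minimized form: the expression above reduces to just a."""
    return bool(a)


def equivalent(f, g, n_inputs):
    """Compare two Boolean functions over every input combination."""
    return all(
        f(*bits) == g(*bits)
        for bits in product((False, True), repeat=n_inputs)
    )
```

Here `equivalent(redundant_logic, minimized_logic, 3)` holds, so the three AND gates, the inverter and the two OR gates of the first form can all be removed without changing the function.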

10.4.3 Simulation

The design descriptions are tested for their functionality at every level –

behavioral, data flow, and gate. One has to check here whether all the functions

are carried out as expected and rectify any that are not. All such activities are carried out by

the simulation tool. The tool also has an editor to carry out any corrections to the

source code. Simulation involves testing the design for all its functions,

functional sequences, timing constraints, and specifications. Normally testing and

simulation at all the levels – behavioral to switch level – are carried out by a

single tool; this is identified as the “scope of simulation tool.”
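
The kind of functional check described above can be sketched as a small test bench. The example below is written in Python for convenience, not in the project's VHDL, and it assumes the common 8N1 frame format (one start bit at 0, eight data bits sent least significant bit first, one stop bit at 1, no parity). The function names are hypothetical. The test bench applies every possible byte as a test vector and records any value that does not survive the round trip.

```python
def uart_frame(byte):
    """Serialize one byte as an assumed 8N1 frame (10 bits in total)."""
    data_bits = [(byte >> i) & 1 for i in range(8)]  # LSB first
    return [0] + data_bits + [1]                     # start bit, data, stop bit


def uart_deframe(bits):
    """Recover the byte from a 10-bit frame, rejecting bad start/stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


def run_test_bench():
    """Apply all 256 test vectors; return the list of failing values."""
    return [b for b in range(256) if uart_deframe(uart_frame(b)) != b]
```

An empty list from `run_test_bench()` corresponds to a simulation run with no functional errors; a timing-accurate test bench would additionally model the baud clock, which this sketch leaves out.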

10.4.4 Synthesis

With the availability of design at the gate (switch) level, the logical design

is complete. The corresponding circuit hardware realization is carried out by a

synthesis tool. Two common approaches are as follows:

• The circuit is realized through an FPGA. The gate level design description is the

starting point for the synthesis here. The FPGA vendors provide an interface to

the synthesis tool. Through the interface the gate level design is realized as a final

circuit. With many synthesis tools, one can directly use the design description at

the data flow level itself to realize the final circuit through an FPGA. The FPGA

route is attractive for limited volume production or a fast development cycle.

• The circuit is realized as an ASIC. A typical ASIC vendor will have its own

library of basic components like elementary gates and flip-flops. Eventually the

circuit is to be realized by selecting such components and interconnecting them

conforming to the required design. This constitutes the physical design. Being an

elaborate and costly process, a physical design may call for an intermediate

functional verification through the FPGA route. The circuit realized through the

FPGA is tested as a prototype. It provides another opportunity for testing the

design closer to the final circuit.

10.4.5 Physical Design

A fully tested and error-free design at the switch level can be the starting

point for a physical design [Baker & Boyce, Wolf]. It is to be realized as the final

circuit using (typically) a million components in the foundry’s library. The step-

by-step activities in the process are described briefly as follows:

• System partitioning: The design is partitioned into convenient compartments

or functional blocks. Often it would have been done at an earlier stage itself and

the software design prepared in terms of such blocks. Interconnection of the

blocks is part of the partition process.

• Floor planning: The positions of the partitioned blocks are planned and the

blocks are arranged accordingly. The procedure is analogous to the planning and

arrangement of domestic furniture in a residence. Blocks with I/O pins are kept

close to the periphery; those which interact frequently or through a large number

of interconnections are kept close together, and so on. Partitioning and floor

planning may have to be carried out and refined iteratively to yield best results.

• Placement: The selected components from the ASIC library are placed in

position on the “Silicon floor.” It is done with each of the blocks above.

• Routing: The components placed as described above are to be interconnected to

the rest of the block. This is done for each of the blocks by suitably routing the

interconnects. Once the routing is complete, the physical design can be taken as

complete. The final mask for the design can be made at this stage and the ASIC

manufactured in the foundry.
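
The remark that heavily interconnected blocks should sit close together can be illustrated with a simple cost estimate. The text does not name a metric, so the sketch below assumes half-perimeter wirelength, a commonly used estimate in which each net is charged the half-perimeter of the bounding box around the blocks it connects. The block names, coordinates and nets are hypothetical.

```python
def half_perimeter(net, positions):
    """Half-perimeter of the bounding box around all blocks on one net."""
    xs = [positions[block][0] for block in net]
    ys = [positions[block][1] for block in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets, positions):
    """Sum the half-perimeter estimate over every net in the design."""
    return sum(half_perimeter(net, positions) for net in nets)


# Three hypothetical blocks, each connected to the other two.
nets = [("io", "uart"), ("uart", "apb"), ("io", "apb")]

# Two candidate placements on a grid: (x, y) per block.
spread_out = {"io": (0, 0), "uart": (1, 0), "apb": (5, 5)}
compact = {"io": (0, 0), "uart": (1, 0), "apb": (1, 1)}
```

With these numbers the spread-out placement costs 20 grid units and the compact one costs 4, which is why floor planning and placement are refined iteratively before routing begins.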

10.4.6 Post Layout Simulation

Once the placement and routing are completed, the performance

specifications like silicon area, power consumed, path delays, etc., can be

computed. An equivalent circuit can be extracted at the component level and

performance analysis carried out on it. This constitutes the final stage called

“verification.” One may have to go through the placement and routing activity

once again to improve performance.
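
One of the figures computed at this stage, the worst path delay, can be sketched as a longest-path calculation over the extracted circuit. The example below assumes a hypothetical netlist in which each node has a single lumped delay in picoseconds and signals flow without loops; real tools also account for wire parasitics, which are omitted here.

```python
from functools import lru_cache


def critical_path_delay(delays, fanin):
    """Return the largest arrival time over all nodes of an acyclic netlist.

    delays: node name -> delay of that node (picoseconds)
    fanin:  node name -> list of nodes driving it (absent for inputs)
    """
    @lru_cache(maxsize=None)
    def arrival(node):
        # A node's output settles after its slowest input plus its own delay.
        slowest_input = max(
            (arrival(driver) for driver in fanin.get(node, [])), default=0
        )
        return slowest_input + delays[node]

    return max(arrival(node) for node in delays)


# Hypothetical extracted circuit: two parallel gates feeding a third.
delays = {"in": 0, "g1": 100, "g2": 250, "g3": 150, "out": 50}
fanin = {"g1": ["in"], "g2": ["in"], "g3": ["g1", "g2"], "out": ["g3"]}
```

For this netlist the critical path runs through `g2` and `g3` and takes 450 ps; if that exceeded the timing specification, placement and routing would be revisited as described above.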

10.4.7 Critical Subsystems

The design may have critical subsystems. Their performance may be

crucial to the overall performance; in other words, to improve the system

performance substantially, one may have to design such subsystems afresh. The

design here may imply redefinition of the basic feature size of the component,

component design, placement of components, or routing done separately and

specifically for the subsystem. A set of masks used in the foundry may have to be

done afresh for the purpose.

10.5 ROLE OF HDL

An HDL provides the framework for the complete logical design of the

ASIC. All the activities coming under the purview of an HDL are shown

enclosed in bold dotted lines in Figure 1.4. Verilog and VHDL are the two most

commonly used HDLs today. Both have constructs with which the design can be

fully described at all the levels. There are additional constructs available to

facilitate setting up of the test bench, spelling out test vectors for them and

“observing” the outputs from the designed unit.

IEEE has brought out Standards for the HDLs, and the software tools conform to

them. Verilog as an HDL originated at Gateway Design Automation, which Cadence

Design Systems later acquired; Cadence placed it into the public domain in 1990. It was

established as a formal IEEE Standard in 1995. A revised version was brought out in 2001. However,

most of the simulation tools available today conform only to the 1995 version of

the standard.

VHDL, used by a substantial number of VLSI designers today, is the HDL used in

this project for modeling the design.

CHAPTER 11

APPLICATIONS

The UART-APB core is used in embedded processor applications.

It is mainly used in full-duplex communication.

CHAPTER 12

FUTURE SCOPE

SOC FUTURE SCOPE

The SoCs of the future will:

have hundreds of hardware blocks,

have billions of transistors,

have multiple processors,

have large wire-to-gate delay ratios,

handle large amounts of high-speed data,

need to support “plug-and-play” IP blocks.

AMBA FUTURE SCOPE

The Advanced Microcontroller Bus Architecture (AMBA) is used as the on-chip

bus in system-on-a-chip (SoC) designs. Since its inception, the scope of AMBA

has gone far beyond microcontroller devices, and AMBA is now widely used on a range

of ASIC and SoC parts including applications processors used in modern portable

mobile devices like smartphones.

CHAPTER 13

CONCLUSION

Complex VLSI IC design has been revolutionized by the widespread

adoption of the SoC paradigm. The benefits of the SoC approaches are numerous,

including improvements in system performance, cost, size, power dissipation, and

design turnaround time. Many SoC designs consist of one or more IPs, designed

for a single or narrow set of applications with highly characterizable

communication. As the level of chip integration continues to advance at a fast

pace, the demand for efficient interconnects rapidly increases. Currently, on-chip

interconnection networks are mostly implemented using traditional interconnects

like buses. The wide variety of buses used in SoC designs presents a major

problem for design reuse. A number of companies and standards committees

have attempted to standardize buses and interfaces, with mixed results. In this

report we have discussed some of the issues facing SoC designers in determining

which bus architecture to use in order to provide flexible, high-bandwidth

communication between IPs.

CHAPTER 14

REFERENCES

[1] M. Keating and P. Bricaud, Reuse Methodology Manual for System-on-a-

Chip Designs, 2nd ed. Boston: Kluwer Academic Publishers, 1999.

[2] W. Ho and T. Pinkston, “A design methodology for efficient application-

specific on-chip interconnects,” IEEE Trans. on Parallel and Distributed Systems,

vol. 17, no. 2, pp. 174–190, Feb. 2006.

[3] N. Horspool and P. Gorman, The ASIC Handbook. Upper Saddle River, NJ:

Prentice Hall, 2001.

[4] L. Benini and G. De Micheli, “Networks on chips: A new paradigm for

component-based MPSoC design,” in Multiprocessor Systems-on-Chips, A. A.

Jerraya and W. Wolf, Eds. Amsterdam: Elsevier, 2005, pp. 49–80.

[5] CoreConnect bus architecture. IBM Microelectronics. [Online]. Available:

http://www.ibm.com/chips/products/coreconnect
