
IMPROVEMENT OF THE ORTHOGONAL CODE CONVOLUTION

CAPABILITIES USING FPGA IMPLEMENTATION

When data is stored, compressed, or communicated through a medium such as cable or air,

sources of noise and other parameters such as EMI, crosstalk, and distance can considerably

affect the reliability of these data. Error detection and correction techniques are therefore

required. Some of those techniques can only detect errors, such as the Cyclic Redundancy

Check; others are designed to detect as well as correct errors, such as Reed–Solomon codes.

However, the existing techniques are not able to achieve high efficiency and to meet

bandwidth requirements, especially with the increase in the quantity of data transmitted.

Orthogonal Code is one of the codes that can detect errors and correct corrupted data.

CYCLIC REDUNDANCY CHECK

The CRC is used to detect errors in a message. Two implementations are shown:

• Table driven CRC calculation

• Loop driven CRC calculation

This section describes the implementation of the CRC-16 polynomial; however, there are several other CRC formats, such as CRC-CCITT, CRC-32, or other polynomials. CRC is a common method for detecting errors in transmitted messages or stored data, and is a very powerful but easily implemented technique for obtaining data reliability.

THEORY OF OPERATION

The theory of a CRC calculation is straightforward. The data is treated by the CRC algorithm as a binary number. This number is divided by another binary number called the polynomial. The remainder of the division is the CRC checksum, which is appended to the transmitted message.

The receiver divides the message (including the calculated CRC), by the same polynomial the

transmitter used. If the result of this division is zero, then the transmission was successful.

However, if the result is not equal to zero, an error occurred during the transmission. The

CRC-16 polynomial is shown in Equation 1. The polynomial can be translated into a binary value, because the divisor is viewed as a polynomial with binary coefficients. For example, the CRC-16 polynomial translates to 1000000000000101b (the leading x^16 coefficient is implicit). All the remaining coefficients, like x^2 or x^15, are represented by a logical 1 in the binary value. The division uses modulo-2 arithmetic; a modulo-2 calculation is simply realized by XOR'ing two numbers.
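To make the division concrete, here is a minimal Python sketch of the modulo-2 long division just described (an illustration added for this text, not one of the original examples; the message value is arbitrary):

def mod2_div(bits, poly):
    # Modulo-2 long division over '0'/'1' strings. 'bits' must already be
    # augmented with len(poly) - 1 zero bits; the remainder is the CRC.
    rem = list(bits)
    n = len(poly) - 1                         # degree of the polynomial
    for i in range(len(bits) - n):
        if rem[i] == '1':                     # a 1 at the current MSB position:
            for j, p in enumerate(poly):      # modulo-2 subtraction = XOR
                rem[i + j] = '1' if rem[i + j] != p else '0'
    return ''.join(rem[-n:])

POLY = '11000000000000101'                    # x^16 + x^15 + x^2 + 1
msg = '1010001101011111'                      # an arbitrary two-byte message
crc = mod2_div(msg + '0' * 16, POLY)          # augment with 16 zeros, divide
assert mod2_div(msg + crc, POLY) == '0' * 16  # receiver check: remainder is 0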


EQUATION 1: THE CRC-16 POLYNOMIAL

x^16 + x^15 + x^2 + 1

EXAMPLE 1: MODULO-2 CALCULATION

Example Calculation

In this example calculation, the message is two bytes long; in general, the message can have any length in bytes. Before we can start calculating the CRC value, the message has to be augmented by n bits, where n is the degree of the polynomial. The CRC-16 polynomial has a degree of 16; therefore, 16 zero bits have to be appended to the original message. In the example calculation, the polynomial has a degree of 3; therefore, the message has to be extended by three zeros at the end. An example calculation for generating a CRC is shown in Example 2, and the reverse calculation, checking a message, is shown in Example 3.


EXAMPLE 2: CALCULATION FOR GENERATING A CRC

EXAMPLE 3: CHECKING A MESSAGE FOR A CRC ERROR

Figure 1: HARDWARE CRC-16 GENERATOR


Figure 2: LOOP DRIVEN CRC IMPLEMENTATION

HARDWARE IMPLEMENTATION

The CRC calculation is realized with a shift register and XOR gates. Figure 1 shows a CRC

generator for the CRC-16 polynomial. Each bit of the data is shifted into the CRC shift

register (Flip-Flops) after being XOR’ed with the CRC’s most significant bit.

SOFTWARE IMPLEMENTATIONS

There are two different techniques for implementing a CRC in software: a loop driven implementation and a table driven implementation. The loop driven implementation works like the calculation shown in Figure 2: the data is fed through a shift register, and if a one is shifted out at the MSB, the register contents are XORed with the polynomial; otherwise, the register is simply shifted one position to the left. The other method is to pre-calculate the effect of every possible input byte and XOR the pre-calculated values into the shift register.
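The two software techniques can be sketched in Python as follows (a hedged illustration rather than the design's actual implementation; the value 0x8005 encodes the CRC-16 polynomial with the x^16 term implicit):

def crc16_loop(data, poly=0x8005):
    # Loop driven: feed each message bit through a 16-bit shift register;
    # whenever a one is shifted out at the MSB, XOR in the polynomial.
    crc = 0
    for byte in data:
        crc ^= byte << 8                      # present the next 8 message bits
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def make_table(poly=0x8005):
    # Table driven: pre-calculate the register change for every input byte.
    table = []
    for byte in range(256):
        crc = byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
        table.append(crc)
    return table

TABLE = make_table()

def crc16_table(data):
    # One table lookup replaces eight shift-and-XOR steps per byte.
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ TABLE[((crc >> 8) ^ byte) & 0xFF]
    return crc

assert crc16_loop(b'123456789') == crc16_table(b'123456789')

Both functions compute the same checksum; the table version trades 512 bytes of memory for roughly an eight-fold reduction in inner-loop work.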

ADVANTAGES OF CRC vs SIMPLE SUM TECHNIQUES

CRC generation has many advantages over simple sum techniques or a parity check. A CRC allows detection of:

1. Single bit errors

2. Double bit errors

3. Bundled bit errors (bits next to each other)

A parity bit check detects only single bit errors. CRC error detection is mostly used where large data packages are transmitted, for example, in local area networks such as Ethernet.


REED–SOLOMON ERROR CORRECTION

In coding theory, Reed–Solomon (RS) codes are non-binary cyclic error-correcting codes

invented by Irving S. Reed and Gustave Solomon. They described a systematic way of

building codes that could detect and correct multiple random symbol errors. By adding t

check symbols to the data, an RS code can detect any combination of up to t erroneous

symbols, and correct up to ⌊t/2⌋ symbols. As an erasure code, it can correct up to t known

erasures, or it can detect and correct combinations of errors and erasures. Furthermore, RS

codes are suitable as multiple-burst bit-error correcting codes, since a sequence of b+1

consecutive bit errors can affect at most two symbols of size b. The choice of t is up to the

designer of the code, and may be selected within wide limits.
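As a quick illustration of the t-check-symbol arithmetic (this sketch uses the third-party Python package reedsolo, which is an assumption of this illustration and is not mentioned in the original text):

from reedsolo import RSCodec        # pip install reedsolo

rsc = RSCodec(10)                   # t = 10 check symbols are appended, so up
                                    # to floor(10/2) = 5 unknown symbol errors
                                    # can be corrected
codeword = rsc.encode(b'hello world')
corrupted = bytearray(codeword)
corrupted[0] ^= 0xFF                # corrupt two symbols (2 <= 5)
corrupted[5] ^= 0x55
# Recent reedsolo versions return (message, full codeword, errata positions)
decoded = rsc.decode(bytes(corrupted))[0]
assert decoded == b'hello world'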

In Reed-Solomon coding, source symbols are viewed as coefficients of a polynomial p(x)

over a finite field. The original idea was to create n code symbols from k source symbols by

oversampling p(x) at n > k distinct points, transmit the sampled points, and use interpolation

techniques at the receiver to recover the original message. That is not how RS codes are used

today. Instead, RS codes are viewed as cyclic BCH codes, where encoding symbols are

derived from the coefficients of a polynomial constructed by multiplying p(x) with a cyclic

generator polynomial. This gives rise to an efficient decoding algorithm, which was

discovered by Elwyn Berlekamp and James Massey, and is known as the Berlekamp-Massey

decoding algorithm.

Reed-Solomon codes have since found important applications from deep-space

communication to consumer electronics. They are prominently used in consumer electronics

such as CDs, DVDs, Blu-ray Discs, in data transmission technologies such as DSL &

WiMAX, in broadcast systems such as DVB and ATSC, and in computer applications such as

RAID 6 systems.

ORTHOGONAL CODES

Orthogonal codes are binary valued and have an equal number of 1's and 0's: an n-bit orthogonal code has n/2 1's and n/2 0's, and any two distinct orthogonal codes differ in n/2 positions. Therefore, all orthogonal codes will generate zero parity bits. The concept is illustrated by means of an 8-bit orthogonal code, as shown in Fig. 3. It has 8 orthogonal codes and 8 antipodal codes for a total of 16 biorthogonal codes. Antipodal codes are just the inverse of the orthogonal codes; they are also orthogonal among themselves.
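The code set of Fig. 3 can be reproduced with the standard recursive Hadamard (Walsh) construction; the sketch below (an added illustration, assuming that construction) also verifies the zero-parity and n/2-distance properties stated above:

def walsh(n):
    # Rows of the n x n binary Hadamard matrix (n a power of two)
    rows = [[0]]
    while len(rows[0]) < n:
        rows = [r + r for r in rows] + [r + [1 - b for b in r] for r in rows]
    return rows

ortho = walsh(8)                                 # 8 orthogonal codes
anti = [[1 - b for b in r] for r in ortho]       # 8 antipodal codes

for code in ortho + anti:                        # all 16 biorthogonal codes
    assert sum(code) % 2 == 0                    # even weight -> zero parity
for i in range(8):
    for j in range(i + 1, 8):                    # any two orthogonal codes
        assert sum(a != b for a, b in zip(ortho[i], ortho[j])) == 4   # n/2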


Fig. 3. Illustration of the proposed concept. An 8-bit orthogonal code set has 8 orthogonal codes and 8 antipodal codes for a total of 16 biorthogonal codes. All orthogonal and antipodal codes generate zero parity.

Since there is an equal number of 1’s and 0’s, each orthogonal code will generate a zero

parity bit. Therefore, each antipodal code will also generate a zero parity bit. A notable

distinction in this method is that the transmitter does not have to send the parity bit since the

parity bit is known to be always zero. Therefore, if there is a transmission error, the receiver

will be able to detect it by generating a parity bit at the receiving end. Before transmission a

k-bit data set is mapped into a unique n-bit orthogonal code. For example, a 4-bit data set is represented by a unique 8-bit orthogonal code which is transmitted without the parity bit.

When received, the data are decoded based on code correlation. It can be done by setting a

threshold midway between two orthogonal codes. This is given by the following equation:

dth = n / 4                                                            (1)

where n is the code length and dth is the threshold, which is midway between two orthogonal codes. Therefore, for the 8-bit orthogonal code (Fig. 4), we have dth = 8/4 = 2.


Fig. 4. Illustration of Encoding and Decoding.

t = (n / 4) - 1                                                        (2)

From Equation (2), t is the number of errors that can be corrected by means of an n-bit orthogonal code. For example, a single-error-correcting orthogonal code can be constructed by means of an 8-bit orthogonal code (n = 8). Similarly, a three-error-correcting orthogonal code can be constructed by means of a 16-bit orthogonal code (n = 16), and so on. Table I below shows a few orthogonal codes and the corresponding error correcting capabilities:

n     t
8     1
16    3
32    7
64    15

TABLE I. Orthogonal Codes and the Corresponding Chip Error Control Capabilities.

DESIGN METHODOLOGY

Since there is an equal number of 1's and 0's, each orthogonal code will generate a zero parity bit. If the data has been corrupted during transmission, the receiver can detect this by generating the parity bit for the received code: if it is not zero, the data is corrupted. However, the parity bit does not change for an even number of errors, so the receiver can detect only 2^n/2 of the possible corrupted received codes; the detection percentage is therefore 50%. Our approach is not to use the parity generation method to detect errors, but a simple technique based on comparing the received code with all the orthogonal code combinations stored in a lookup table. The technique involves:

i) a transmitter, and

ii) a receiver.

TRANSMITTER

The transmitter includes two blocks: an encoder and a shift register. The encoder encodes a k-bit data set into the n = 2^(k-1) bits of the orthogonal code, and the shift register transforms this code to serial data in order to be transmitted, as shown in Fig. 5. For example, 4-bit data is encoded to an 8-bit (2^3) orthogonal code according to the lookup table shown in Fig. 4. The generated

orthogonal code is then transmitted serially using a shift register with the rising edge of the

clock.

Fig. 5. Block diagram of the transmitter.


For example, suppose the transmitter transmits the input data value "0110", labeled 'data'. This data is encoded to the associated orthogonal code "00111100", labeled 'ortho'. The signal 'EN' is used to enable the transmission of the serial bits 'txcode' of the orthogonal code with every rising edge of the clock.
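A behavioral sketch of this transmitter (Python rather than the VHDL of the actual design, and assuming data values 0-7 map to the Walsh rows and 8-15 to their antipodal inverses, which matches the '0110' -> "00111100" example above):

def walsh(n):
    rows = [[0]]
    while len(rows[0]) < n:
        rows = [r + r for r in rows] + [r + [1 - b for b in r] for r in rows]
    return rows

W = walsh(8)
LUT = W + [[1 - b for b in r] for r in W]     # 16-entry encoder lookup table

def transmit(data):
    ortho = LUT[int(data, 2)]                 # encoder: 4-bit data -> 8-bit code
    for bit in ortho:                         # shift register: one bit out
        yield bit                             # per rising clock edge

txcode = ''.join(str(b) for b in transmit('0110'))
assert txcode == '00111100'                   # the 'ortho' value above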

SHIFT REGISTERS

A shift register consists of a number of single-bit "D-type" data latches connected together in a chain, so that the output from one data latch becomes the input of the next latch and so on, thereby moving the stored data serially to either the left or the right. The number of individual data latches making up a shift register is determined by the number of bits to be stored, with 8 bits being the most common. Shift registers are mainly used to store data and to convert data from either a serial to parallel or parallel to serial format, with all the latches being driven by a common clock (Clk) signal, making them synchronous devices. They are generally provided with a Clear or Reset connection so that they can be "SET" or "RESET" as required.

Generally, Shift Registers operate in one of four different modes:

Serial-in to Parallel-out (SIPO)

Serial-in to Serial-out (SISO)

Parallel-in to Parallel-out (PIPO)

Parallel-in to Serial-out (PISO)

SERIAL-IN TO PARALLEL-OUT

4-bit Serial-in to Parallel-out (SIPO) Shift Register


Let's assume that all the flip-flops (FFA to FFD) have just been RESET (CLEAR input) and that all the outputs QA to QD are at logic level "0", i.e., no parallel data output. If a logic "1" is connected to the DATA input pin of FFA, then on the first clock pulse the output of FFA (QA) will be set HIGH to logic "1", with all the other outputs remaining LOW at logic "0". Assume now that the DATA input pin of FFA has returned LOW to logic "0". The next clock pulse will change the output of FFA to logic "0" and set the output of FFB (QB) HIGH to logic "1". The logic "1" has now moved, or been "shifted", one place along the register to the right. When the third clock pulse arrives, this logic "1" value moves to the output of FFC (QC), and so on, until the arrival of the fifth clock pulse, which sets all the outputs QA to QD back to logic level "0", because the input has remained at a constant logic level "0".

Clock Pulse No.   QA  QB  QC  QD
0                 0   0   0   0
1                 1   0   0   0
2                 0   1   0   0
3                 0   0   1   0
4                 0   0   0   1
5                 0   0   0   0

The effect of each clock pulse is to shift the DATA contents of each stage one place to the right, as shown in the table above, until the complete DATA is stored, at which point it can be read directly from the outputs QA to QD. The DATA has then been converted from a serial data signal to a parallel data word.
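The table above can be reproduced with a short behavioral model (an added Python illustration, not a hardware description):

def sipo(serial_in, stages=4):
    q = [0] * stages                          # QA..QD, all flip-flops RESET
    print('Pulse  QA QB QC QD')
    for pulse, d in enumerate(serial_in, 1):
        q = [d] + q[:-1]                      # each pulse shifts one place right
        print(f'{pulse:5}   {q[0]}  {q[1]}  {q[2]}  {q[3]}')

sipo([1, 0, 0, 0, 0])                         # a single 1 followed by 0s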


SERIAL-IN TO SERIAL-OUT

This shift register is very similar to the one above, except that whereas the data was read directly in parallel form from the outputs QA to QD, this time the DATA is allowed to flow straight through the register. Since there is only one output, the DATA leaves the shift register one bit at a time in a serial pattern, hence the name Serial-in to Serial-out shift register.

4-bit Serial-in to Serial-out (SISO) Shift Register

This type of Shift Register also acts as a temporary storage device or as a time delay device,

with the amount of time delay being controlled by the number of stages in the register, 4, 8,

16, etc., or by varying the application of the clock pulses. Commonly available ICs include the 74HC595 8-bit serial-in/serial-out shift register with 3-state outputs.

PARALLEL-IN TO SERIAL-OUT

Parallel-in to Serial-out Shift Registers act in the opposite way to the Serial-in to Parallel-out

one above. The DATA is applied in parallel form to the parallel input pins PA to PD of the register, and is then read out of the register sequentially, one bit at a time on each clock cycle, in a serial format.

4-bit Parallel-in to Serial-out (PISO) Shift Register


As this type of Shift Register converts parallel data, such as an 8-bit data word into serial data

it can be used to multiplex many different input lines into a single serial DATA stream which

can be sent directly to a computer or transmitted over a communications line. Commonly available ICs include the 74HC165 8-bit parallel-in/serial-out shift register.

PARALLEL-IN TO PARALLEL-OUT

Parallel-in to Parallel-out Shift Registers also act as a temporary storage device or as a time

delay device. The DATA is presented in parallel format to the parallel input pins PA to PD and is then shifted to the corresponding output pins QA to QD when the register is clocked.

4-bit Parallel-in/Parallel-out (PIPO) Shift Register

As with the Serial-in to Serial-out shift register, this type of register also acts as a temporary

storage device or as a time delay device, with the amount of time delay being varied by the

frequency of the clock pulses.


Today, high-speed bidirectional universal-type shift registers such as the TTL 74LS194 and 74LS195 or the CMOS 4035 are available as 4-bit multi-function devices that can be used in serial-to-serial, shift-left, shift-right, serial-to-parallel, parallel-to-serial, and parallel-to-parallel data register applications, hence the name "Universal".

RECEIVER

The received code is processed through sequential steps, as shown in Fig. 4. The incoming serial bits are converted into n-bit parallel codes. The received code is compared with all the codes in the lookup table for error detection. This is done by counting the number of ones in the signal resulting from the XOR operation between the received code and each combination of the orthogonal codes in the lookup table. A counter is used to count the number of ones in the resulting n-bit signal and also to search for the minimum count. A minimum count other than zero indicates an error in the received code. The orthogonal code in the lookup table associated with the minimum count is the closest match for the corrupted received code. The matched orthogonal code in the lookup table is the corrected code, which is then decoded to k-bit data. The receiver is able to correct up to (n/4)-1 bits in the received impaired code. However, if the minimum count is associated with more than one combination of orthogonal code, then a signal, REQ, goes high.

Fig. 4. Block diagram of the receiver

Upon reception, the incoming serial data is converted into the 8-bit parallel code 'rxcode'. A counter is used to count the number of 1's after the XOR operation between the received code and all combinations of orthogonal code in the lookup table. 'Count' gives the minimum count of ones among them. The orthogonal code 'ortho' associated with the minimum count


is the closest match for the received code, which is then decoded to the final data given by

signal ‘data’. Three different cases result from this simulation. In the first case, the received

code has a match in the lookup table.

For example, if the received code is rxcode= “00111100”, count=’0’ and hence the received

code is not corrupted. The code is then decoded to the corresponding final data “0110”. In the

second case, the received code has no match in the lookup table.

If the received code is rxcode=“00110100”, the value of minimum count is ’1’, which reveals

an error. The corresponding orthogonal code is ortho= “00111100” which is the closest match

for the received code given by the minimum count, and the decoded final data is “0110”. In

this case the single bit error is detected and corrected by the receiver.

In the third case, there is more than one possible closest match in the lookup table. If the received code is rxcode = "00110000", the value of the minimum count is associated with more than one orthogonal code, and thus it is not possible to determine the closest match in the lookup table for the received code. The signal labelled 'REQ' then goes high, which is a request for a retransmission.
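All three cases can be traced with a behavioral sketch of the receiver (again Python for illustration, with the same assumed Walsh lookup table as in the transmitter sketch):

def walsh(n):
    rows = [[0]]
    while len(rows[0]) < n:
        rows = [r + r for r in rows] + [r + [1 - b for b in r] for r in rows]
    return rows

W = walsh(8)
LUT = W + [[1 - b for b in r] for r in W]

def receive(rxcode):
    bits = [int(c) for c in rxcode]
    # count of ones after XOR = Hamming distance to each table entry
    dists = [sum(a ^ b for a, b in zip(bits, code)) for code in LUT]
    best = min(dists)
    if dists.count(best) > 1:          # no unique closest match
        return None, True              # REQ goes high: retransmit
    return format(dists.index(best), '04b'), False

assert receive('00111100') == ('0110', False)   # case 1: exact match
assert receive('00110100') == ('0110', False)   # case 2: 1 error corrected
assert receive('00110000') == (None, True)      # case 3: ambiguous -> REQ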

COUNTERS

In digital logic and computing, a counter is a device which stores (and sometimes displays)

the number of times a particular event or process has occurred, often in relationship to a clock

signal. In practice, there are two types of counters:

Up counters, which increase (increment) in value

Down counters, which decrease (decrement) in value



In electronics, counters can be implemented quite easily using register-type circuits such as the flip-flop, and a wide variety of designs exist, e.g.:

Asynchronous (ripple) counter – changing state bits are used as clocks to subsequent state

flip-flops

Synchronous counter – all state bits change under control of a single clock

Decade counter – counts through ten states per stage

Up–down counter – counts both up and down, under command of a control input

Ring counter – formed by a shift register with feedback connection in a ring

Johnson counter – a twisted ring counter

Cascaded counter

Each is useful for different applications. Usually, counter circuits are digital in nature, and

count in natural binary. Many types of counter circuit are available as digital building blocks,

for example a number of chips in the 4000 series implement different counters.

Occasionally there are advantages to using a counting sequence other than the natural binary

sequence -- such as the binary coded decimal counter, a linear feedback shift register counter,

or a Gray-code counter.

Counters are useful for digital clocks and timers, and in oven timers, VCR clocks, etc.[1]

ASYNCHRONOUS (RIPPLE) COUNTER


Asynchronous counter created from JK flip-flops

The simplest asynchronous (ripple) counter stage is a single D-type flip-flop, with its D (data) input fed from

its own inverted output. This circuit can store one bit, and hence can count from zero to one

before it overflows (starts over from 0). This counter will increment once for every clock

cycle and takes two clock cycles to overflow, so every cycle it will alternate between a

transition from 0 to 1 and a transition from 1 to 0. Notice that this creates a new clock with a

50% duty cycle at exactly half the frequency of the input clock. If this output is then used as

the clock signal for a similarly arranged D flip-flop (remembering to invert the output to the

input), you will get another 1 bit counter that counts half as fast. Putting them together yields

a two bit counter:

Cycle   Q1  Q0  (Q1:Q0)dec
0       0   0   0
1       0   1   1
2       1   0   2
3       1   1   3
4       0   0   0

You can continue to add additional flip-flops, always inverting the output to its own input,

and using the output from the previous flip-flop as the clock signal. The result is called a

ripple counter, which can count to 2^n - 1, where n is the number of bits (flip-flop stages) in the counter. Ripple counters suffer from unstable outputs as the overflows "ripple" from stage to stage, but they do find frequent application as dividers for clock signals, where the instantaneous count is unimportant but the overall division ratio is. (To clarify this, a 1-bit


counter is exactly equivalent to a divide by two circuit; the output frequency is exactly half

that of the input when fed with a regular train of clock pulses).
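The two-bit count sequence above follows from a simple behavioral model (an added sketch; each stage toggles only when the previous stage produces a falling edge, which is what makes the carry "ripple"):

def ripple(pulses, bits=2):
    q = [0] * bits                     # q[0] is the externally clocked stage
    for cycle in range(1, pulses + 1):
        for i in range(bits):
            q[i] ^= 1                  # this stage toggles
            if q[i] == 1:              # 0 -> 1: no falling edge, ripple stops
                break
        print(f'cycle {cycle}: Q1 Q0 = {q[1]}{q[0]}  (dec {2 * q[1] + q[0]})')

ripple(4)                              # prints 01, 10, 11, 00 as in the table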

The use of flip-flop outputs as clocks leads to timing skew between the count data bits,

making this ripple technique incompatible with normal synchronous circuit design styles.

SYNCHRONOUS COUNTER

A 4-bit synchronous counter using J-K flip-flops

Most hardware-based counters are of this type.

A simple way of implementing the logic for each bit of an ascending counter (which is what

is depicted in the image to the right) is for each bit to toggle when all of the less significant

bits are at a logic high state. For example, bit 1 toggles when bit 0 is logic high; bit 2 toggles

when both bit 1 and bit 0 are logic high; bit 3 toggles when bit 2, bit 1 and bit 0 are all high;

and so on.
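That toggle rule translates directly into code (a behavioral sketch; in hardware all the toggle enables are computed in parallel by AND gates, not sequentially as in this loop):

def tick(state):
    # One clock of an ascending synchronous counter, LSB first:
    # bit i toggles iff all less significant bits are 1.
    out = list(state)
    enable = 1                          # bit 0 always toggles
    for i in range(len(out)):
        if not enable:
            break
        enable = out[i]                 # AND in this bit's old value
        out[i] ^= 1
    return out

state = [0, 0, 0, 0]
for _ in range(5):
    state = tick(state)
print(state)                            # [1, 0, 1, 0]: 5 counted, LSB first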

Synchronous counters can also be implemented with hardware finite state machines, which

are more complex but allow for smoother, more stable transitions.

Please note that the counter shown will have an error once it reaches 1110.

RING COUNTER

A ring counter is a shift register (a cascade connection of flip-flops) with the output of the

last one connected to the input of the first, that is, in a ring. Typically a pattern consisting of a


single 1 bit is circulated, so the state repeats every N clock cycles if N flip-flops are used. It

can be used as a cycle counter of N states.

JOHNSON COUNTER

A Johnson counter (or switchtail ring counter, twisted-ring counter, walking-ring counter, or

Moebius counter) is a modified ring counter, where the output from the last stage is inverted

and fed back as input to the first stage.[2][3][4] A pattern of bits equal in length to twice the

length of the shift register thus circulates indefinitely. These counters find specialist

applications, including those similar to the decade counter, digital to analog conversion, etc.

DECADE COUNTER

A decade counter is one that counts in decimal digits, rather than binary. A decimal counter

may have each digit binary encoded (that is, it may count in binary-coded decimal, as the

7490 integrated circuit did) or other binary encodings (such as the bi-quinary encoding of the

7490 integrated circuit). Alternatively, it may have a "fully decoded" or one-hot output code

in which each output goes high in turn; the 4017 was such a circuit. The latter type of circuit

finds applications in multiplexers and demultiplexers, or wherever a scanning type of

behavior is useful. Similar counters with different numbers of outputs are also common.

The decade counter is also known as a mod-10 counter.

UP–DOWN COUNTER

A counter that can change state in either direction, under control of an up–down selector input,

is known as an up–down counter. When the selector is in the up state, the counter increments

its value; when the selector is in the down state, the counter decrements the count.


NEED FOR TESTING

As the density of VLSI products increases, their testing becomes more difficult and costly.

Generating test patterns has shifted from a deterministic approach, in which a testing pattern

is generated automatically based on a fault model and an algorithm, to a random selection of

test signals. While in real estate the refrain is “Location!” the comparable advice in IC design

should be “Testing! Testing! Testing!”. No matter whether deterministic or random

generation of testing patterns is used, the testing pattern applied to the VLSI chips can no

longer cover all possible defects. Consider the manufacturing processes for VLSI chips as

shown in Fig. 16. Two kinds of cost can incur with the test process: the cost of testing and the

cost of accepting an imperfect chip. The first cost is a function of the time spent on testing or,

equivalently, the number of test patterns applied to the chip. The cost will add to the cost of

the chips themselves. The second cost represents the fact that, when a defective chip has been

passed as good, its failure may become very costly after being embedded in its application.

An optimal testing strategy should trade off both costs and determine an adequate test length

(in terms of testing period or number of test patterns).

Figure 6: Chip Manufacturing and testing flow

Apart from the cost, two factors need to be considered when determining the test lengths. The

first is the production yield, which is the probability that a product is functionally correct at

the end of the manufacturing process. If the yield is high, we may not need to test extensively

since most chips tested will be “good,” and vice versa. The other factor to be considered is

the coverage function of the test process. The coverage function is defined as the probability

of detecting a defective chip given that it has been tested for a particular duration or a given

number of test patterns. If we assume that all possible defects can be detected by the test

process, the coverage function of the test process can be regarded as a probability distribution

function of the detection time given the chip under test is bad. Thus, by investigating the

density function or probability mass function, we should be able to calculate the marginal

gain in detection if the test continues. In general, the coverage function of a test process can

be obtained through theoretical analysis or experiments on simulated fault models. With a

given production yield, the fault coverage required to attain a specified defect level, which is defined as the probability of having a "bad" chip among all chips passed by a test process, can then be determined.

While most problems in VLSI design have been reduced to algorithms in readily available software, the responsibility for the various levels of testing and the testing methodology can be a significant burden on the designer.

The yield of a particular IC is the number of good die divided by the total number of die per wafer. Due to the complexity of the manufacturing process, not all die on a wafer

correctly operate. Small imperfections in starting material, processing steps, or in

photomasking may result in bridged connections or missing features. It is the aim of a

test procedure to determine which die are good and should be used in end systems.

Testing a die can occur:

At the wafer level

At the packaged level

At the board level

At the system level

In the field

By detecting a malfunctioning chip at an earlier level, the manufacturing cost may be

kept low. For instance, the approximate cost to a company of detecting a fault at the

above level is:

Level            Approximate cost of detecting a fault
Wafer            $0.01 - $0.10
Packaged chip    $0.10 - $1
Board            $1 - $10
System           $10 - $100
Field            $100 - $1000

Obviously, if faults can be detected at the wafer level, the cost of manufacturing is kept

the lowest. In some circumstances, the cost of developing adequate tests at the wafer level, mixed-signal requirements, or speed considerations may require that further testing be done at the packaged-chip level or the board level. A component vendor can only test at the


wafer or chip level. Special systems, such as satellite-borne electronics, might be tested

exhaustively at the system level.

Tests may fall into two main categories. The first set of tests verifies that the chip

performs its intended function; that is, that it performs a digital filtering function, acts as

a microprocessor, or communicates using a particular protocol. In other words, these tests

assert that all the gates in the chip, acting in concert, achieve a desired function. These

tests are usually used early in the design cycle to verify the functionality of the circuit.

These will be called functionality tests. The second set of tests verifies that every gate

and register in the chip functions correctly. These tests are used after the chip is manufactured to verify that the silicon is intact. They will be called manufacturing tests.

In many cases these two sets of tests may be one and the same, although the natural flow

of design usually has a designer considering function before manufacturing concerns.

MANUFACTURING TEST PRINCIPLES

A critical factor in all LSI and VLSI [2] design is the need to incorporate methods of

testing circuits. This task should proceed concurrently with any architectural

considerations and not be left until fabricated parts are available.

Figure 7(a) shows a combinational circuit with n-inputs. To test this circuit exhaustively

a sequence of 2^n inputs must be applied and observed to fully exercise the circuit. This

combinational circuit is converted to a sequential circuit with the addition of m storage registers, as shown in Figure 7(b); the state of the circuit is determined by the inputs and the previous state. A minimum of 2^(n+m) test vectors must be applied to exhaustively

test the circuit. Clearly, this is an important area of design that has to be well understood.

Figure 7. The combinational explosion in test vectors. (a) A combinational circuit with n inputs: 2^n input vectors are required to test it exhaustively. (b) The same logic with m storage registers added: 2^(n+m) test vectors are required; for n = 25, m = 50 and 1 microsecond per test, the test time is over 1 billion years.
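The figure's headline number is easy to verify (simple arithmetic, shown only to make the scale concrete):

n, m = 25, 50
vectors = 2 ** (n + m)                 # exhaustive test of the sequential circuit
seconds = vectors * 1e-6               # at 1 microsecond per test vector
years = seconds / (3600 * 24 * 365)
print(f'{vectors:.2e} vectors -> {years:.2e} years')   # about 1.2e9 years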

OPTIMAL TESTING

With the increased complexity of VLSI circuits, testing has become more costly and time-

consuming. The design of a testing strategy, which is specified by the testing period based on

the coverage function of the testing algorithm, involves trading off the cost of testing and the

penalty of passing a bad chip as good. The optimal testing period is first derived, assuming

the production yield is known. Since the yield may not be known a priori, an optimal

sequential testing strategy which estimates the yield based on ongoing testing results, which

in turn determines the optimal testing period, is developed next. Finally, the optimal

sequential testing strategy for batches in which N chips are tested simultaneously is

presented. The results are of use whether the yield stays constant or varies from one

manufacturing run to another.

TESTING PRINCIPLES

Importance of Testing

According to Moore's law, the scale of ICs doubles every 18 months. In the 1980s, the term Very Large Scale Integration (VLSI) was used for chips having more than 100,000 transistors. Microprocessors produced in 1994 had around 3 million transistors; present-day computers contain many millions of transistors, and the count is likely to keep increasing for future needs. This vast increase in transistor count has been accompanied by a reduction in the size of the individual devices, known as the feature size. Reducing the feature size increases the number of manufacturing defects in the IC. Performance, area, power, and testability are some of the important attributes of complex VLSI systems.

Testing of the circuit should be done as early as possible in the design cycle, with efficient test generation patterns which reduce the cost of testing without compromising fault coverage. Many test pattern generation methods, such as Random Pattern Testing, Pseudorandom Testing, and Pseudo-Exhaustive Testing, have been developed for this purpose, aiming to reduce the test pattern length.

Testing is used not only to find the fault-free devices, Printed Circuit Boards (PCBs) and

systems but also to improve production yield at the various stages of manufacturing by

analyzing the cause of defects when faults are encountered. In some systems, periodic testing

is performed to ensure fault-free system operation and to initiate repair procedures when

faults are detected. A fault in a circuit causes the circuit to fail. A failure is the deviation of

the circuit from its original behaviour.

Testing Categories

Testing may fall into different categories depending on the intended goal. Some categories

are explained.

Functional testing

Functional testing applies test vectors to the circuit under test and analyses the output responses to determine whether the circuit performs its intended function. Here the internal structure need not be known.

Structural testing

Structural testing is based on the circuit's structural information and the set of fault models used. A structural test tries to find physical defects in the circuit by propagating faults to the outputs.

The selection of the fault model depends on the fault coverage and fault detection efficiency required. Improving the fault coverage can be easier and less expensive than improving the manufacturing yield; therefore, generating test stimuli with high fault coverage is very important. Fault models are needed for fault simulation as well as for test generation.


Several kinds of physical fault occur in a circuit, namely stuck-at faults (stuck-at-one, stuck-at-zero), transistor faults, open- and short-circuit faults, bridging faults, etc.
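A toy sketch of the stuck-at model (an added illustration on a made-up two-gate circuit; real tools apply the same idea to full gate-level netlists): a test vector detects a fault when the faulty and fault-free outputs differ.

from itertools import product

def circuit(a, b, c, fault=None):
    # y = (a AND b) OR c, with an optional stuck-at fault on internal net n1
    n1 = a & b
    if fault == 'n1/0': n1 = 0          # stuck-at-zero
    if fault == 'n1/1': n1 = 1          # stuck-at-one
    return n1 | c

detecting = [v for v in product((0, 1), repeat=3)
             if circuit(*v) != circuit(*v, fault='n1/0')]
print(detecting)                        # [(1, 1, 0)]: the only detecting vector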

Test Pattern Generation Methods

Test patterns are used for testing the circuit. The most commonly used test pattern generation

methods are discussed below.

Exhaustive Testing

There are several testing approaches; the most naive method is exhaustive testing. It feeds the circuit with all 2^n input combinations (n being the number of inputs) and checks the response of the circuit. Exhaustive testing provides complete fault coverage and is very easily implemented for combinational circuits. The problem is with sequential circuits, since such circuits contain storage elements and the number of test patterns required is much larger.

Pseudo Exhaustive Testing

This method is a slight modification of exhaustive testing that allows us to test the circuit without using all 2^n test patterns. The circuit is divided into several, possibly overlapping, cones, which are the logic elements that influence individual outputs of the circuit. All the cones are then tested exhaustively and separately, and thereby the whole circuit is completely tested. The only faults not covered by this model are bridging faults between elements belonging to different non-overlapping cones. If such an efficient decomposition is possible, the circuit can be tested with far fewer than 2^n test patterns.

Pseudo Random Testing

In pseudo-random testing, the test patterns are generated by a Pseudo-Random Pattern Generator (PRPG) and applied to the circuit's input pins. The difference between exhaustive testing and this method is the test pattern length: if the PRPG structure and seed are properly chosen, only a few test patterns (far fewer than 2^n) are necessary to completely test the circuit.

In some pseudo-random testing methods, the pseudo-random code words generated by a PRPG are transformed by additional logic (combinational or sequential) in order to reach better fault coverage. Such methods are the reseeding-based techniques, weighted testing, bit-fixing, bit-flipping, and others. These methods are often referred to as mixed-mode BIST.

Reseeding:

In this method the LFSR is seeded with more than one seed during the test; these seeds are stored in a Read-Only Memory (ROM). The seeds are smaller than the test patterns themselves, and more than one test pattern is derived from each seed, so the memory requirements are reduced. If a standard LFSR is used as a test pattern generator, it is not always possible to find a seed producing the required test patterns. A solution to this problem is to use a Multi-Polynomial LFSR (MP LFSR), where the feedback network can be adjusted. Both the seeds and the polynomials are stored in a ROM, and for each LFSR seed a unique LFSR polynomial is also selected. The structure of such a TPG is shown in Figure 8.

Figure 8. Multi polynomial BIST

This idea has been extended so that a folding counter, which is a programmable counter, is used as the PRPG. Here the number of folding seeds to be stored in ROM is reduced even further.

In spite of all these techniques for reducing memory overhead, implementing a ROM on chip is still very area-demanding, and thus the ROM should ideally be eliminated completely in BIST.

Weighted Pattern BIST


In this method two problems have to be solved: first, the weight sets have to be computed, and second, the weighted signals have to be generated. Many weight-set computation methods have been proposed, and it has been shown that multiple weight sets are necessary to produce patterns with sufficient fault coverage. These multiple weight sets have to be stored on the chip, so this method increases the area overhead. Several methods for reducing the area overhead have been proposed; one of them is the Generator of Unequiprobable Random Tests (GURT), which reduces the area overhead to a minimum but is restricted to only one weight set.

Testability Analysis:

Testability is a measure of how easily the internal signals of a circuit can be controlled from the primary inputs and observed at the primary outputs.

Many testability analysis techniques have been proposed; the most important one is the Sandia Controllability/Observability Analysis Program (SCOAP). This method performs testability analysis by computing observability and controllability values for each signal.

Observability:

The observability of a particular internal circuit node is the degree to which one can observe

that node at the outputs of an integrated circuit. This measure is of importance when a

designer/tester desires to measure the output of a gate within a larger circuit to check that it

operates correctly.

Controllability:

The controllability of an internal circuit node within a chip is a measure of the ease of setting the node to a 1 or 0 state. This measure is of importance when assessing the degree of difficulty of testing a particular signal within a circuit. A node with little controllability might require many hundreds or thousands of cycles to get it into the right state, and it is a very big task to generate a test sequence that sets a number of poorly controllable nodes into the right state. A well-designed circuit should have every node easily observable and controllable.

VLSI TESTING PROBLEMS

In VLSI circuits, faults can occur at any time: in design, processing, packaging, or in the field. Faults can occur at any place in the circuit.


The major problems detected so far in VLSI testing are as follows

1) Test generation problem

2) The Input Combinatorial Problem

3) The Gate to I/O Pin Ratio Problem

Test Generation Problem

The testability of a circuit can be increased with Design for Testability (DFT) techniques. Testing a circuit involves the generation of test patterns for the circuit, the application of these test patterns to the circuit, and the analysis of the circuit's output response. Generation of test patterns for a combinational circuit is an easy task; the problem is with sequential circuits, because of the presence of storage elements.

The Input Combinatorial Problem

The number of test vectors required to test combinational and sequential circuits has

increased to a greater degree as we move from MSI (Medium-Scale-Integrated) to Very

Large Scale Integrated circuits (VLSI).

The Gate to I/O Pin Ratio Problem

With the advances in System-on-Chip (SoC) technology, the number of transistors on ICs has increased greatly while feature sizes have been reduced. The reduction in size and increase in gate count cause difficulties in testing the internal nodes of the circuit, as they are hard to control and observe. Controllability and observability therefore become the critical issues.

A solution to these VLSI testing problems is the insertion of additional circuitry into the IC with Design for Testability features, known as Built-In Self-Test.

Built in Self Test

The recent advances in deep-submicron IC process technology and core-based IC design technology will lead to widespread use of logic BIST in the industry.

Built in Self Test (BIST) is a Design for Testability (DFT) technique that allows the circuit to

test itself and decide whether the output is correct or not by using its own test patterns


generated by the internal circuitry. Logic BIST offers a number of advantages compared with an external tester: external testers are several times slower than the IC's internal circuitry, and they are costly. The advantages of using BIST are that it relieves the tester memory problem, can be run efficiently on a very low-cost tester, and addresses the low accessibility and observability of internal nodes (known as test complexity). Many electronic systems are now implemented using embedded cores, and BIST is a suitable method for testing complex core-based systems.

The single stuck fault (SSF) model continues to be the most commonly used fault model for digital system testing. The new nanometre CMOS technologies, however, require models other than the single stuck fault model for testing. Multiple test generation techniques that target specific fault types have been introduced, but they are very costly. An alternative solution to this problem is the use of a single on-chip test pattern generator for generating test sequences for conventional (stuck-at) and non-conventional (delay, bridging, stuck-open) fault types.

The general Built-In Self-Test structure consists of three main parts, as shown in Fig. 9 below.

Fig 9. Built in Self Test Structure

During the test, the test patterns are generated and fed to the circuit under test (CUT), and the response is checked by the output response evaluator.

The three types of test pattern generation for BIST are deterministic, mixed-mode, and pseudo-random. Among them, pseudo-random is the practical solution, as this approach is not tied to specific faults. The pseudo-random sequences are generated using Linear Feedback Shift Registers (LFSRs).


BIST addresses the problems of VLSI testing by providing the circuit under test with a Linear Feedback Shift Register (LFSR) and a Multiple-Input Signature Register (MISR). The LFSR produces the test patterns required to test the circuit under test; the MISR compresses the parallel output data into a single signature.

Types of Shift Register:

Shift registers can have a combination of serial and parallel inputs and outputs, including

Serial in Parallel-out (SIPO) and Parallel-in Serial-out (PISO) types. There are also types that

have both serial and parallel input and types with serial and parallel output. There are also

bidirectional shift registers which allow you to vary the direction of the shift register. The

serial input and outputs of a register can also be connected together to create a circular shift

register. One could also create multi-dimensional shift registers, which can perform more

complex computation.

LINEAR FEEDBACK SHIFT REGISTERS

A linear feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state; that is, its input bit is driven by the exclusive-or (XOR) of some bits of the register. These selected bits are called taps, and the list of these taps is called the tap sequence. The initial value of the LFSR is called the seed, and because the operation of the register is deterministic, the sequence of values produced by the register is completely determined by its current (or previous) state. Likewise, because the register has a finite number of possible states, it must eventually enter a repeating cycle. However, an LFSR with a well-chosen feedback function can produce a sequence of bits which appears random and which has a very long cycle. The feedback function has several names: XOR, odd parity, sum modulo 2. In all cases the function is simple: 1) add the selected bit values; 2) if the sum is odd, the output is one; otherwise the output is zero.

Applications of LFSRs include generating pseudo-random numbers, pseudo noise sequences,

fast digital counters, and whitening sequences. Both hardware and software implementations

of LFSRs are common.
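A minimal software LFSR sketch (an added illustration; the tap sequence [4, 3], i.e. x^4 + x^3 + 1, is assumed here because it is maximal length for a 4-bit register):

def lfsr(seed, taps, n_bits):
    # Fibonacci LFSR: the input bit is the XOR (odd parity) of the tapped
    # bits of the previous state; taps are numbered 1..n_bits.
    state = seed
    while True:
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & ((1 << n_bits) - 1)
        yield state

gen = lfsr(seed=0b0001, taps=(4, 3), n_bits=4)
seq = [next(gen) for _ in range(15)]
assert len(set(seq)) == 15             # maximal length: all 2^4 - 1 states
assert 0 not in seq                    # the all-zero state is never entered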


Tap Sequence

The contents of the shift register, the bits tapped for the feedback function, and the output of the feedback function together describe the state of the LFSR. With each shift, the output of the feedback function changes, causing a change in the state of the LFSR.

The state space of an LFSR is the list of all the states the LFSR can be in for a particular tap sequence and a particular starting value. Any tap sequence will yield at least two state spaces for an LFSR. The tap sequences that yield only two state spaces are called maximal-length tap sequences.

The state of an LFSR that is n bits long can be any one of 2^n values, so the largest state space possible for such an LFSR is 2^n - 1 (the all-zero state forms its own space). LFSRs can have multiple maximal-length tap sequences. Describing one maximal-length tap sequence will automatically lead to another: if a maximal-length tap sequence is [n, A, B, C], another maximal-length tap sequence will be [n, n-C, n-B, n-A].

Multiple Input Shift Register

The LFSR produces a large amount of test data, and applying this input to the circuit under test produces a correspondingly large amount of response data. Checking the responses in this form would require a large number of output pins. This can be avoided by the use of a Multiple-Input Signature Register (MISR), which compresses the response data into a single signature. Only this one value then has to be compared and evaluated, so the MISR reduces the gate-to-I/O-pin ratio problem.
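A behavioral MISR sketch (an added illustration with an assumed 4-bit register and the same [4, 3] taps as above): each clock the register shifts as an LFSR and the parallel response word is XORed into its stages, so a corrupted response almost certainly changes the final signature.

def misr(responses, taps=(4, 3), n_bits=4):
    state, mask = 0, (1 << n_bits) - 1
    for word in responses:
        fb = 0
        for t in taps:                          # ordinary LFSR feedback...
            fb ^= (state >> (t - 1)) & 1
        state = (((state << 1) | fb) & mask) ^ word   # ...plus parallel inputs
    return state                                # the compacted signature

good = misr([0b1010, 0b0111, 0b1100, 0b0001])
bad = misr([0b1010, 0b0111, 0b1101, 0b0001])    # one response bit flipped
assert good != bad                              # the fault shows in the signature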

FPGA (Field Programmable Gate Array)

Field-programmable gate array (FPGA) technology continues to gain momentum, and the

worldwide FPGA market is expected to grow from $1.9 billion in 2005 to $2.75 billion by

2010. Since their invention by Xilinx in 1984, FPGAs have gone from being simple glue-logic chips to actually replacing custom application-specific integrated circuits (ASICs) and processors for signal processing and control applications.

What is an FPGA?

At the highest level, FPGAs are reprogrammable silicon chips. Using prebuilt logic blocks

and programmable routing resources, you can configure these chips to implement custom

hardware functionality without ever having to pick up a breadboard or soldering iron. You


develop digital computing tasks in software and compile them down to a configuration file or

bitstream that contains information on how the components should be wired together. In

addition, FPGAs are completely reconfigurable and instantly take on a brand new

“personality” when you recompile a different configuration of circuitry. In the past, FPGA

technology was only available to engineers with a deep understanding of digital hardware

design. The rise of high-level design tools, however, is changing the rules of FPGA

programming, with new technologies that convert graphical block diagrams or even C code

into digital hardware circuitry.

FPGA chip adoption across all industries is driven by the fact that FPGAs combine the best

parts of ASICs and processor-based systems. FPGAs provide hardware-timed speed and

reliability, but they do not require high volumes to justify the large upfront expense of custom

ASIC design. Reprogrammable silicon also has the same flexibility of software running on a

processor-based system, but it is not limited by the number of processing cores available.

Unlike processors, FPGAs are truly parallel in nature so different processing operations do

not have to compete for the same resources. Each independent processing task is assigned to

a dedicated section of the chip, and can function autonomously without any influence from

other logic blocks. As a result, the performance of one part of the application is not affected

when additional processing is added.

Benefits of FPGA Technology

1. Performance

2. Time to Market

3. Cost

4. Reliability

5. Long-Term Maintenance

Performance – Taking advantage of hardware parallelism, FPGAs exceed the computing

power of digital signal processors (DSPs) by breaking the paradigm of sequential execution

and accomplishing more per clock cycle. BDTI, a noted analyst and benchmarking firm,

released benchmarks showing how FPGAs can deliver many times the processing power per

dollar of a DSP solution in some applications. Controlling inputs and outputs (I/O) at the

hardware level provides faster response times and specialized functionality to closely match

application requirements.


Time to market – FPGA technology offers flexibility and rapid prototyping capabilities in

the face of increased time-to-market concerns. You can test an idea or concept and verify it in

hardware without going through the long fabrication process of custom ASIC design. You

can then implement incremental changes and iterate on an FPGA design within hours instead

of weeks. Commercial off-the-shelf (COTS) hardware is also available with different types of

I/O already connected to a user-programmable FPGA chip. The growing availability of high-level software tools decreases the learning curve with layers of abstraction, and these tools often include valuable IP cores (prebuilt functions) for advanced control and signal processing.

Cost – The nonrecurring engineering (NRE) expense of custom ASIC design far exceeds that

of FPGA-based hardware solutions. The large initial investment in ASICs is easy to justify

for OEMs shipping thousands of chips per year, but many end users need custom hardware

functionality for the tens to hundreds of systems in development. The very nature of

programmable silicon means that there is no cost for fabrication or long lead times for

assembly. As system requirements often change over time, the cost of making incremental changes to FPGA designs is quite negligible when compared to the large expense of respinning an ASIC.

Reliability – While software tools provide the programming environment, FPGA circuitry is

truly a “hard” implementation of program execution. Processor-based systems often involve

several layers of abstraction to help schedule tasks and share resources among multiple

processes. The driver layer controls hardware resources and the operating system manages

memory and processor bandwidth. For any given processor core, only one instruction can

execute at a time, and processor-based systems are continually at risk of time-critical tasks

pre-empting one another. FPGAs, which do not use operating systems, minimize reliability

concerns with true parallel execution and deterministic hardware dedicated to every task.

Long-term maintenance – As mentioned earlier, FPGA chips are field-upgradable and do

not require the time and expense involved with ASIC redesign. Digital communication

protocols, for example, have specifications that can change over time, and ASIC-based

interfaces may cause maintenance and forward compatibility challenges. Being

reconfigurable, FPGA chips are able to keep up with future modifications that might be

necessary. As a product or system matures, you can make functional enhancements without


spending time redesigning hardware or modifying the board layout.

Figure 9: FPGA Design Flow


Choosing an FPGA

When examining the specifications of an FPGA chip, note that they are often divided into

configurable logic blocks like slices or logic cells, fixed function logic such as multipliers,

and memory resources like embedded block RAM. There are many other FPGA chip

components, but these are typically the most important when selecting and comparing FPGAs

for a particular application.

Table 2. FPGA Resource Specifications for Various Families

Device           Gates      Flip-Flops   LUTs     Multipliers   Block RAM (kb)
Virtex-II 1000   1 million  10,240       10,240   40            720
Virtex-II 3000   3 million  28,672       28,672   96            1,728
Spartan-3 1000   1 million  15,360       15,360   24            432
Spartan-3 2000   2 million  40,960       40,960   40            720
Virtex-5 LX30    -----      19,200       19,200   32            1,152
Virtex-5 LX50    -----      28,800       28,800   48            1,728
Virtex-5 LX85    -----      51,840       51,840   48            3,456
Virtex-5 LX110   -----      69,120       69,120   64            4,608

Table 2 shows resource specifications used to compare FPGA chips within various Xilinx families. The number of gates has traditionally been a way to compare the size of FPGA chips to ASIC technology, but it does not truly describe the number of individual components inside an FPGA. This is one of the reasons that Xilinx did not specify the number of equivalent system gates for the new Virtex-5 family.

IMPLEMENTATION

The simulation has been performed using ModelSim software. The simulation results were verified for most of the combinations of 8-bit orthogonal code. Xilinx ISE software has been used for synthesis.

MODELSIM TUTORIAL

I. Starting ModelSim

1. Start ModelSim: Start → Programs → ModelSim → ModelSim.

2. After starting ModelSim, the Welcome to ModelSim 5.7g dialog shows up.

Fig. 1 Welcome Screen

II. Setting up the Project

The first thing to do is to create a project. Projects ease the interaction with ModelSim and are

useful for organizing files and simulation settings.

1. Create a new project by clicking on Jumpstart (see figure 1) on the Welcome to ModelSim dialog, and then on Create a Project. A new project can also be created without the help of the dialog window by selecting File → New → Project from the main window.

2. A “Create Project” window pops up (see fig 2). Select a suitable name for the project, set the project location (d:/temp in this example, as shown in fig 2), and leave the default library name as work. Hit OK.


Fig 2. Create project dialog

3. After hitting OK, an “Add items to the project” dialog pops up (see fig 3).

Fig 3. Add items to the project dialog

There are three options for adding files to the project:

Create new VHDL files (from scratch) and then add them to the project, or

Add already existing files to the project, or

Combine the two operations above.

The following illustrates how to add new files to the project that has just been created.

Creating a VHDL File from Scratch

1. From the “Add items to the project” dialog, click on Create a New File.

2. A “Create Project File” dialog pops up. Select an appropriate name for the file to be added; choose VHDL as the Add file as type option and Top Level as the Folder option (see fig 4).


Fig 4. “Create Project File” dialog

3. On the workspace section of the main window (see fig 5), double-click on the file just created

(dff.vhd in our case).

Fig 5. ModelSim’s main window

4. Type in code in the new window. For example, consider the simple D flip-flop code below (the missing semicolon after “end SIMPLE” is deliberate; it will be corrected in part III below to demonstrate error debugging).

entity DFF is
  port (D, CLK : in bit;
        Q  : out bit;
        QN : out bit := '1');   -- QN starts at '1', the complement of Q's default '0'
end DFF;

architecture SIMPLE of DFF is
begin
  process (CLK)                 -- wakes up on every CLK event
  begin
    if CLK = '1' then           -- true just after a rising edge
      Q  <= D after 10 ns;      -- 10 ns propagation delay
      QN <= not D after 10 ns;
    end if;
  end process;
end SIMPLE

5. Type (or simply paste) the above code into the new window.

6. Save your code (File → Save).

Adding Files to the Project

1. Select File → Add to Project → Existing File…

2. An “Add file to Project” dialog pops up. Select the file to be added to the project. Make sure to select VHDL from the “Add file as type” menu. Hit OK.

Fig 6. Add file to Project window


3. You should now see the file that you have just added in the workspace section of ModelSim’s

Main window.

III. Compiling / Debugging Project Files

1. Select Compile → Compile All.

2. The compilation result is shown on the main window. A red message indicates that there is an error in the code. Steps 3 through 5 will illustrate how to correct this error.

Fig 7. The error is indicated in red on the main window

3. Double-click on the error (shown in red) on the main window. This will open a new window that describes the nature of the error. Suppose the error message is as follows:

4. Double-click on the Error message. The error is highlighted in the source window:


Fig 8. The error is highlighted in the source window

5. Correct the above error by adding a semicolon after the “end SIMPLE” statement. Hit save, and then recompile the file. Repeat steps 1-5 until the code compiles with no errors.
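
After the fix, the last line of the file reads:

end SIMPLE;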

IV. Simulating the Design

This section covers the basics for simulating a design using ModelSim.

1. Click on the Library tab of the main window and then click on the (+) sign next to the work library. The name of the entity that has just been compiled, “DFF”, should appear (see Fig 9).

Fig 9: ModelSim’s Main Window

2. Double-click on dff to load the file. This should open a third tab “sim” in the main window.

3. Now select View → All Windows from the main window to open all ModelSim windows.

4. Locate the signals window and select the signals to be monitored during simulation. For this example, select all signals.



Fig 10: “The Signals Window”

5. Drag the above signals into the wave window using the left mouse button, or use Add → Wave → Selected Signals.

6. Do the same as in step 5 with the list window (i.e., drag the selected signals into the list window, or use Add → List → Selected Signals).

7. Now, the design is ready to simulate. For this purpose, some commands have to be typed from the

main window of the simulator.

8. On the main window type in: force clk 0 0 ns, 1 10 ns -repeat 20 ns and then hit enter.

This statement forces the clock to take the value 0 at 0 ns and 1 at 10 ns, and to repeat these values every 20 ns.

Fig 11: “Input the commands to the simulator as shown above”

9. Next, type run 40 ns on the main window and then hit enter. This will run the simulation for 40 ns. Observe the changes in both the wave and list windows.

10. Next, change the value of D to 1 by typing: force d 1 and then hit enter. The change in d will take place at 40 ns (the current simulation time).

11. Again, type run 40 ns on the main window to simulate the code for another 40 ns.
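
For reference, the whole stimulus from steps 8 through 11 can be entered at the ModelSim prompt as the sequence below (the initial force d 0 is an assumption, added here so that d starts from a known value):

force clk 0 0 ns, 1 10 ns -repeat 20 ns
force d 0
run 40 ns
force d 1
run 40 ns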


12. Now, select the wave window, and click on “Zoom Full”. The simulation should look as follows.

Fig 12: “The Wave window”

The above waveform shows that q follows d 10 ns after the rising edge of the clock. The same result is confirmed in the list window.
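
As an alternative to typing force commands interactively, the same stimulus can be captured in a small VHDL testbench. The sketch below is an assumption-based example (the entity DFF_TB is not part of this tutorial); it expects the DFF entity above to be compiled into the work library. Compile it, load DFF_TB instead of dff, and type run 80 ns.

entity DFF_TB is
end DFF_TB;

architecture TB of DFF_TB is
  signal D, CLK : bit := '0';
  signal Q, QN  : bit;
begin
  -- instantiate the flip-flop under test
  UUT : entity work.DFF port map (D => D, CLK => CLK, Q => Q, QN => QN);

  -- 20 ns clock period: 0 at 0 ns, 1 at 10 ns, repeating
  CLK <= not CLK after 10 ns;

  -- same stimulus as the force commands: D is 0 initially and 1 from 40 ns
  D <= '0', '1' after 40 ns;
end TB;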

V. Printing the Results:

Printing the list window:

1. To print the list window, select File → Write List (Tabular) from the list window and save the output as simulation.lst.

2. Start Notepad, open the file saved above, and print it.

Printing the wave window:

Printing from the wave window is simple. Select File → Print from the wave window.

Xilinx ISE Tutorial

This tutorial takes you through the process of creating, synthesizing, simulating, implementing, and downloading a simple VHDL design using the Xilinx Project Navigator.

Starting the Project Navigator

Start → Programs → Xilinx → Project Navigator

Creating a Project

Next, create a new project by selecting File → New Project. The following “New Project” window appears.



Fig 1: New Project Wizard Dialog Box

Select an appropriate project name and project location, and then set the top-level module type to HDL. Click next when finished. A new window (see figure 2) appears, prompting for device and design flow information. Fill in these boxes and click next when finished.

Fig 2: New Project Wizard device and design flow dialog box

Creating New VHDL files

The new window gives the option to create a new source to add to the project. Click on the New

Source button. In the new window (see Figure 3), select VHDL Module as the file type, and choose a

name for the new file. Check the box Add to project if it is unchecked and then click next.


Figure 3: The create new files window

Then, enter the inputs and outputs of the entity of the file created. Figure 4 shows the wizard

that assists this process.

In the Define VHDL Source window shown in Figure 4:

1. Leave the Entity name as it is.

2. Choose the appropriate Architecture Name for the file (e.g., Behavioral, Structural…)

3. Enter the Port names and select their appropriate Direction. For vector types, enter the

appropriate range by selecting the proper values for MSB and LSB. Click next.

Fig 4: Defining Entity port names



4. A New Source Information window that summarizes the entries pops up. Verify that the information entered is correct and then press finish. At this point, the Project Navigator should have created the DFF file outline, which consists of the entity with its signals and an empty architecture. Then, add code to the architecture part and save the file.

5. Click next on the “Create a New Source” window (Fig 5) to proceed to the next step. The Project Navigator now gives the option of adding existing files to the project. Files can be added by clicking on Add Source and then selecting the files to be added. Click next and then click finish.

Fig 5: “Create a New Source” window

For example, suppose the following code is used:

library IEEE;

use IEEE.STD_LOGIC_1164.ALL;

use IEEE.STD_LOGIC_ARITH.ALL;

use IEEE.STD_LOGIC_UNSIGNED.ALL;

-- Uncomment the following lines to use the declarations that are

-- provided for instantiating Xilinx primitive components.

--library UNISIM;


--use UNISIM.VComponents.all;

entity dff is

Port ( reset : in std_logic;

btnclk : in std_logic;

d : in std_logic_vector(3 downto 0);

q : out std_logic_vector(3 downto 0));

end dff;

architecture Behavioral of dff is

begin

-- A 4-bit register clocked by pressing down a button:

process (reset, btnclk)

begin

if (reset = '1') then

q <= (others => '0');

elsif (rising_edge(btnclk)) then

q <= d;

end if;

end process;

end Behavioral;

Fig 6: A 4-bit Register VHDL code
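
To exercise this register in ModelSim, a stimulus along the lines of the force commands from the ModelSim tutorial can be applied. The sketch below is illustrative only; the signal names come from the code above, and the data value 1010 is arbitrary:

force reset 1
run 20 ns
force reset 0
force btnclk 0
force d 1010
run 20 ns
force btnclk 1
run 20 ns

The rising edge on btnclk at 40 ns loads 1010 into q, while asserting reset at any time clears q to 0000.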

Checking Syntax and Simulating

Check the Syntax of the code that has just typed. Follow the instructions and on the same time keep

an eye on Figure 9.


First, check that the top file is selected (highlighted) under the Sources in Project window, as shown for “DFF.vhd” in Figure 9. If it is not, select it by left-clicking on it. Next, locate the Processes for Source window on the left of the screen (if the window is not visible, enable it by selecting View → Processes). In this window, locate the Synthesize menu and expand it by clicking on the +. Locate the “Check Syntax” command under the Synthesize menu. Double-click on “Check Syntax” and wait for the software to finish checking the code.

Fig 9: A green check mark indicates that a process succeeded.

A green check mark appears if the process completed without warnings (as in Figure 9), a yellow exclamation mark if there are warnings, and a red X if there are syntax errors in your code.

Observe the messages generated by the software in the process view window (see Figure 9).

(In Figure 9: the “Processes for Source” window lists the available commands, the “Process view” window displays messages, a green check mark indicates success, and DFF is the file that is currently selected.)


Correct any errors, and repeat the above step until the Check Syntax process succeeds. Next, proceed to simulate the design by double-clicking on “Launch ModelSim Simulator”, located under Design Entry Utilities in the Processes for Current Source window. ModelSim should start; simulate the design as described in the ModelSim tutorial above.

Synthesizing

First, check that the top file is selected (highlighted) under the Sources in Project window, as shown

for “DFF.vhd” in Figure 9. If it is not, select it by left-clicking on it. Next, synthesize the design by double-clicking on “Synthesize XST”, located in the Processes for Source window. If there are any errors, fix them and re-synthesize the corrected code.

There are several synthesis options that can be changed to optimize the design; a sketch of the corresponding XST script switches follows this paragraph. Right-click on “Synthesize XST” and click on Properties. The main options of interest are Optimization Goal, Optimization Effort, and FSM Encoding Algorithm. The first two can be found under the Synthesis Options tab. By selecting Speed for the Optimization Goal, Xilinx will try to synthesize the code to produce a faster design; by selecting Area, Xilinx will sacrifice speed and try to build the smallest design possible. The Optimization Effort option can be set to Normal or High, and tells Xilinx how “hard to try” to get a design that is as fast or as small as possible. Clicking on the HDL Options tab reveals the FSM Encoding Algorithm at the top.
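
These GUI properties correspond to switches in the script that ISE passes to XST. The sketch below is an assumption-based example; the file names dff.prj and dff.ngc are made up for illustration:

run
-ifn dff.prj
-ifmt VHDL
-top dff
-ofn dff.ngc
-opt_mode Speed
-opt_level 1
-fsm_encoding Auto

Here -opt_mode reflects the Optimization Goal (Speed or Area), -opt_level the Optimization Effort (1 for Normal, 2 for High), and -fsm_encoding the FSM Encoding Algorithm.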

If the synthesizer generates any errors, then synthesis will fail and the errors have to be fixed in the code before re-synthesizing. If synthesis succeeds, it is still important to check whether the synthesizer generated any warnings. Check for warnings (and errors) in the synthesis report.

After you have synthesized your design, double-click on “View Synthesis Report.” The report contains a summary of any errors or warnings that were generated. The report should not mention any inferred latches, as latches can cause timing problems when not properly used. If the report contains latch warnings, go over your code, remove the latches, and re-synthesize; a common cause is sketched below.
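
As an illustration (not part of the tutorial's design), a latch is typically inferred when a combinational process fails to assign a signal on every path. The entity and signal names below (latch_demo, en, a, y) are made up for this sketch:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity latch_demo is
    Port ( en : in std_logic;
           a : in std_logic;
           y : out std_logic);
end latch_demo;

architecture Behavioral of latch_demo is
begin
    -- Latch-free version: y is assigned on every path through the process.
    -- Writing only "if en = '1' then y <= a; end if;" (with no else) would
    -- force the synthesizer to infer a latch to hold y's value when en = '0'.
    process (en, a)
    begin
        if en = '1' then
            y <= a;
        else
            y <= '0';
        end if;
    end process;
end Behavioral;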

Note: If you modified your code in this step, then you should re-simulate it to ensure that the proper

functionality is still there.

Assigning I/O Pins

After synthesizing the code, the software needs to be informed about where the top entity's signals are to be routed; that is, the pin locations for the signals feeding in and out of the FPGA chip must be specified. Before proceeding to assign pin locations, an implementation constraints file needs to be created:


Select Project → New Source.

The New Source window of Figure 3 should appear.

Select Implementation Constraints File by clicking on it.

Choose a name for the constraints file; let's choose the name “pins”. Note that the software will add the extension “.ucf” to the constraints file name.

Click next.

Associate the constraints file with the top entity in the design. In this case, there is only a single source file (dff), so associate the constraints file with it.

Click next.

Click finish. The constraints file is now created, and pins can be assigned to the input and output signals.

To specify pin locations:

Locate your constraints file (in our case “pins.ucf”) under the Sources in Project window and double-click on it.

After a while, the Xilinx PACE window opens up.

Locate the Design Object List – I/O Pins tab at the bottom of the window. In this window, specify the I/O device pins that are used to test the design. For this tutorial, let's use switches SW3-SW0 to feed the 4-bit D input, LEDs Ld3-Ld0 to display the Q output, BTN1 to feed the reset signal, and BTN0 to feed the button clock.

The I/O devices used to test the design are now known; next, the FPGA pins to which they are connected must be found. These pin numbers are communicated to the Project Navigator as LOC constraints for the implementation.

The pin assignments for this tutorial are shown in Table 1. Enter the pin numbers of Table 1 in the appropriate LOC column for each I/O (see Figure 10).


Fig 10: Xilinx Constraints Editor

Once the pin assignment is completed, hit the save button, and then close the Xilinx PACE window.
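
Equivalently, the constraints end up as plain text in the pins.ucf file. The sketch below shows the expected format; the Pxx pin numbers are placeholders, to be replaced with the actual values from Table 1:

# pins.ucf -- pin numbers below are placeholders, not the Table 1 values
NET "reset"  LOC = "P1";
NET "btnclk" LOC = "P2";
NET "d<0>"   LOC = "P3";
NET "d<1>"   LOC = "P4";
NET "d<2>"   LOC = "P5";
NET "d<3>"   LOC = "P6";
NET "q<0>"   LOC = "P7";
NET "q<1>"   LOC = "P8";
NET "q<2>"   LOC = "P9";
NET "q<3>"   LOC = "P10";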

Implementing the Design

First, check that the top file is selected (highlighted) in the Sources in Project window. If it is not, select it by left-clicking on it. Next, implement the design by double-clicking on “Implement Design” in the Processes for Source window. Watch the process view window for any errors or warnings.

Generating a Programming File

After implementing the design, generate a programming file. Locate “Generate Programming File” in the Processes for Source window, right-click on it, and select Properties; a process properties window opens in which the programming options can be set. Then generate the programming bit file by double-clicking on “Generate Programming File.”
