
BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI
Publicat de Universitatea Tehnică „Gheorghe Asachi” din Iaşi
Tomul LVII (LXI), Fasc. 3, 2011
Secţia AUTOMATICĂ şi CALCULATOARE

ALGORITHMIC SOLUTION FOR DESIGN AND OPTIMISATION OF MULTI-PHASE PULSE GENERATORS

BY

ALEODOR DANIEL IOAN∗

“Gheorghe Asachi” Technical University of Iaşi, Faculty of Automatic Control and Computer Engineering

Received: July 29, 2011
Accepted for publication: September 7, 2011

Abstract. This paper introduces a new design method for generators of multiple phase pulses, which can be independently positioned over the signal period, overlapped or not. Such hardware structures are particular counter-based FSMs (Finite State Machines) and are frequently encountered in applications that need a sequence of pulses to control the execution path, such as video synch-generators, three phase inverters and pipeline processing. The classical design approach uses magnitude comparators on all counter outputs, which can be reduced to equality comparators on fewer counter outputs only by heuristic methods. An algorithmic solution is presented here that can be applied systematically as a universal technique and that uses only AND gates to detect the match combinations for the start and the end of each pulse. Furthermore, an innovative optimization procedure is proposed that reduces the global number of gate inputs to a minimum and is useful even in FPGA implementations. These solutions were applied in practice and extensively tested by designing multiple resolution/refresh synch-generators in an FPGA video interface for embedded systems.

Key words: Pulse generators, counter FSM, AND detection, BER cell, systematical structure, inputs minimization, synch-generator, FPGA video interface.

2000 Mathematics Subject Classification: 68M99, 68W35.

∗Corresponding author: e-mail: [email protected]


1. Introduction

The structure of digital electronic systems can be divided into two main functional blocks: the execution section and the sequencer that controls it (Mano & Kime, 2004). The execution section can generally be designed only by heuristic techniques, which differ from one application to another. For the sequencer, there are several algorithmic design methods that can be used in most implementations, based on the Mealy-Moore FSM (Finite State Machine) formalism (Ioan, 2010b). There are, however, cases where such an approach is ineffective because of the enormous number of states (hundreds or thousands) required for simple output pulse generation: each output is activated after n states and deactivated after m states, where usually m > n. Such sequencers can be seen as “multi-phase” pulse generators, because they output many pulses that are independently controlled in length, start and end position, relative to a given number of input clock periods. Applications include any case where a precise sequence of multiple pulses is needed to control the execution unit of the system, such as synch-generators for video interfaces (Ioan, 2008), modified three phase inverters (Alecsa & Ioan, 2011), pipeline processing and others.

This type of sequencer can also be synthesized in an algorithmic manner, using a binary counter that increments from an initial state to a final state, plus combinational decoding logic that amounts to a logical AND between two magnitude comparators for each pulse: one for the “greater than” start value of the counter and the other for the “smaller than” end value (Navabi, 2005). This is the hardware structure that results from an “if” condition in a hardware description language (VHDL or Verilog), and most designers today leave it in this form, without optimizing the structure (Chu, 2008). Nevertheless, this structure is very resource consuming, because all the comparators must be sized for all bits of the counter. It can also cause “glitch” hazards if the synthesis tools do not automatically insert registers on the outputs, because the combinational paths do not all have the same propagation time.

Another, more optimized structure can also be obtained systematically by lowering the design level to a schematic description and using the fact that the counter only increments, so all output values are scanned in one direction only, from lower to higher (Ioan, 2010a). The schematic still contains two comparators with inputs sized for all the counter bits, but instead of comparing magnitude they search for an equality match. The output of the “start match” comparator sets the pulse flip-flop and the output of the “end match” comparator resets it. This approach can completely eliminate the decoding hazard when the output pulse flip-flop is synchronized with the clock edge opposite to the incrementing edge of the counter (Ioan, 2010a). However, equality comparators on all counter bits are also prohibitive when multiple pulses must be generated, because they consume a lot of implementation resources.

Solutions for further optimization of such “multi-phase” pulse generators were proposed in (Ioan, 2008) and (Ioan, 2010a), but they are all heuristic and cannot be applied in an algorithmic manner. Based on particular properties of the start and end values or on the relative position of the pulses, the width of the comparator inputs can be reduced to a value smaller than the number of counter output bits. These heuristic methods require a large design effort and cannot be applied efficiently when there are many pulses to generate or when a pulse must have more than one position (start, length, end) within the counter period. This paper introduces a systematic approach that uses no comparators, only AND gates that detect begin/end combinations of “1” bits. This is made possible by an output sequential structure slightly more complex than a flip-flop, called a “BER cell” (Begin-End-Reset cell).

Besides the general schematic structure for the “multi-phase” pulse generator with AND combinational detection and BER cell sequential filtering, this article also proposes a completely new optimization algorithm to reduce the global number of the AND gates inputs to a minimum, using a two-level AND structure with reutilization of detection terms. This procedure can be done by hand using an “OR covering triangle”, but is completely suitable for software implementation as a small program.

The optimization technique is useful for schematic FPGA implementation in two ways: it reduces the schematic drawing effort in the design phase and it also produces measurable results after implementation. It is the only solution that still reduces the usage of combinational resources even after the FPGA synthesis, mapping and routing tools have further optimized the structure for the chip-specific CLB (Configurable Logic Block) architecture. Other classical minimization techniques are not applicable here, because they cannot simultaneously inter-minimize a large group of very simple AND logical functions. Also, reusing detection terms in a cascade of more than two levels does not give better results, because it can increase the propagation time if it is not entirely removed by the FPGA implementation tools.

2. Innovative Systematic Hardware Structure

As mentioned before, a multiple phase pulse generator is based on an n-bit synchronous binary counter with a synchronous reset input, so that it can be brought to zero exactly one clock period after the reset combination is detected on its outputs. Each pulse is generated by detecting two other binary combinations on the counter outputs: a “begin” combination that enables the pulse and an “end” combination that disables it. These combinations should be changeable, in order to modify the pulse duration and to shift its phase over the entire counter reset period.


Fig. 1 – General systematic hardware structure for “multi-phase” pulse generators.

[Fig. 1 block labels recovered from the drawing: “n” bit phase counter (Synchronous RESET, CLK, Q[n-1:0]); RESet mux; for each pulse P1, P2, ... Pm, a BEGin mux and an END mux (inputs I0, I1, ... Ik, select input Sel) driving a cell with BEG, END, RES and CLK inputs and an OUT output; D flip-flop (D, CK, Q) for the one-clock RESET delay; active clock edge inverter; MODE selection; bus width labels b1k<n, e1k<n, b2k<n, e2k<n, bmk<n, emk<n, erk<n.]

The proposed hardware structure at schematic level (Fig. 1) is both optimal and algorithmic: it uses only simple AND gates, with no additional comparators, to detect the counter combinations, and it can be applied systematically in every design situation, without heuristic techniques. Once the number of inputs of all AND detection gates has been reduced to a minimum, there cannot be a more efficient hardware structure for a general design, so the schematic can truly be called “optimal”.

In the Fig. 1 schematic, it is assumed that all m pulses have k+1 work modes, each mode consisting of a different counter period (common to all pulses) and of start/end moments and a length specific to each pulse. One work mode is selected simultaneously for all pulses and for the main counter by a group of 2m+1 multiplexers with k+1 inputs and a common selection bus. There are two multiplexers for each pulse, which select the “begin” and the “end” variants, and one multiplexer for the main counter, which selects the reset combination for a specific counter period.

Each input of the multiplexers is connected to an AND gate that detects a counter match combination, considering only the bits that have the logical “1” value in that combination. This is, of course, only a partial detection that ignores the logical “0” bits of the required combination. Its correct operation relies on two properties. First, the counter only increments, so the combinations are scanned only in ascending order. Therefore no earlier counter value can have all the same bit positions set to logical “1”: any number whose “1” bits include all the “1” bits of the required combination is greater than or equal to that combination. For a decrementing counter the opposite holds, and the detector should search for “0” bits with OR gates. The second property is the introduction of a special automaton as output cell.
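The first property can be checked with a short simulation. The following is a minimal Python sketch, not part of the original design flow; the function names and the 6-bit counter width are assumptions chosen for illustration. It models an AND gate wired only to the “1” bits of a target value and confirms that, for an incrementing counter, the gate fires for the first time exactly at the target, while later matches are the parasitic detections that the output cell has to filter.

```python
def and_detect(count: int, target: int) -> bool:
    """AND gate over the counter bits that are '1' in target ('0' bits are ignored)."""
    return count & target == target


def first_detection(target: int, n_bits: int) -> int:
    """First value of an incrementing n-bit counter at which the AND gate fires."""
    for count in range(2 ** n_bits):
        if and_detect(count, target):
            return count
    raise ValueError("target does not fit in the counter width")


# Hypothetical 6-bit example: the gate for 24h (100100b) watches only bits 5 and 2.
target = 0x24
hits = [c for c in range(2 ** 6) if and_detect(c, target)]
print(f"first detection: {first_detection(target, 6):02X}h")
print("all detections:", " ".join(f"{c:02X}h" for c in hits))
```

Every value after the first one in the printed list is a later reactivation of the same gate, which is why an output flip-flop alone is not sufficient.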

The “end” combination is always greater than the “begin” combination, so the BEGIN signal is certain to be activated for the first time before the END signal. It might seem that one SR (Set-Reset) flip-flop at the output would be enough, with BEGIN setting the flip-flop and END resetting it some time later. This does not work: the “begin” combination can be included in a greater “1” bit combination that occurs after the “end” combination, or even in the “end” combination itself. This possibility is excluded only for the counter reset detection, where only one combination is detected and no greater number can follow it, because the counter then returns to zero and starts a new cycle. For all the pulses, however, the “begin” and “end” detections can each last for more than one clock period, can be reactivated more than once, or can overlap. The AND gates are therefore reliable only for the first time that a “1” bit combination appears within the counter period.

If the BEGIN signal lasts for more than one clock period, or if it is reactivated before the END signal first appears, there is no problem, because a flip-flop that is already set to “1” is insensitive to another “set”. A partial overlap, in which both signals are simultaneously active for at least one clock period, is also not a problem, because a flip-flop with reset priority over set can be used. The real problem appears when the BEGIN signal lasts longer than the END signal when they overlap, or when BEGIN is reactivated some time after END was deactivated. In this situation the pulse is activated more than once in each counter period, and it can remain active for longer: through the counter reset and before the first BEGIN detection, until the first END detection. This is a serious malfunction.

The solution to this problem is to replace the simple output flip-flop with a little more complex sequential structure called “BER (Begin-End-Reset) cell”, which will function as a digital filter that removes all transitions on the BEGIN/END signals, except the first one. The operation and one implementation solution of such a BER cell will be discussed next.

To conclude the discussion of the Fig. 1 schematic, one more aspect should be emphasized: the number of inputs of each AND gate varies between 1 and n, depending on the number of logical “1” bits in the detected combination. Because some combinations share “1” bits with others, an algorithm that reduces the global number of inputs to a minimum by reusing those common sections is proposed further in this paper.

3. BER-Cell Operation and Implementation

The working principle for a BER cell is as follows:

− at the beginning of the counter period, the BER cell should be in an initial “00” state, sensitive to both the B (Begin) and E (End) signals, with inactive output;

− between the first B positive edge and the first E positive edge, the BER cell must keep its output active and should ignore the remaining active level of B or any following B transitions (“01” state);

− after the first E positive edge, the BER cell must become insensitive to the B signal until the end of the counter period, keeping its output inactive (“10” state), and it should ignore the remaining active level of E or any following E transitions;

− the R (Reset) signal that appears at the end of the counter period should bring the BER cell back to its first “00” state, in which the cell becomes sensitive again to the B signal, without changing the output.
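The four rules above can be sketched as a three-state machine. This is a minimal behavioural model in Python, not the author's implementation; the state names, the function names and the stimulus are hypothetical, and the priorities used when several inputs are active on the same edge (End wins over Begin, Reset is examined only in the “10” state) are assumptions that the list does not fix.

```python
from enum import Enum


class State(Enum):
    IDLE = "00"    # sensitive to Begin and End, output inactive
    ACTIVE = "01"  # first Begin seen, output active
    DONE = "10"    # first End seen, output inactive, Begin ignored


def ber_step(state: State, b: bool, e: bool, r: bool) -> State:
    """Next state after one active clock edge.

    Assumed priorities: End wins over Begin, and Reset is examined
    only in the DONE state.
    """
    if state is State.DONE:
        return State.IDLE if r else State.DONE
    if e:
        return State.DONE
    if b:
        return State.ACTIVE
    return state


def run(samples):
    """Feed (b, e, r) samples, one per clock edge; return the output after each edge."""
    state, out = State.IDLE, []
    for b, e, r in samples:
        state = ber_step(state, b, e, r)
        out.append(state is State.ACTIVE)
    return out


# Hypothetical stimulus: Begin overlaps End and outlasts it, then reappears
# before Reset; only the first Begin and the first End shape the pulse.
stimulus = [
    (0, 0, 0), (1, 0, 0), (1, 0, 0), (0, 0, 0),  # first Begin, held for two edges
    (1, 1, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0),  # End with overlap, parasitic Begin
    (0, 0, 1), (1, 0, 0),                        # Reset, then Begin of the next period
]
print(["1" if o else "0" for o in run(stimulus)])
```

The output is active only between the first Begin and the first End of each period; the parasitic Begin levels after End are ignored until Reset re-arms the cell.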

From the above, it follows that the BER cell has three states and can be synthesized as a small FSM (Finite State Machine, or automaton) with 2 flip-flops. This leads to a major decrease in the combinational resources used, 2*(k+1) AND gates instead of 2*(k+1) equality comparators, at the price of a small increase in sequential resources, 2 flip-flops instead of one. Considering that all these resources must be multiplied by m (the number of pulses), and that FPGA chips are better suited to sequential design (Rodrigues-Andina et al., 2007), this configuration is extremely useful for FPGA implementations.


Of course, the BER cell could be synthesized using the fluency graph together with the transition matrix method (Ioan, 2010b), but this is not so useful when the design software environment already has components more complex than a simple flip-flop, like the “D” type flip-flops with recessive clock enable (CE) and dominant synchronous reset (R) from the Altium Designer generic library (Altium, 2008). In this case, the implementation should be done directly from the timing diagrams. Fig. 2 presents these diagrams for a BER cell, considering also parasitic reactivations of B and E signals and partial overlapping active levels, longer than a clock period.

Fig. 2 – Timing diagrams for BER cell sequential filtering.

To ignore any “glitches” that may appear on the B, E and R signals because of propagation hazards in the combinational detection structure, all BER cells should work synchronously on the clock edge opposite to the incrementing edge of the main counter, as can be seen in the general schematic of Fig. 1. This configuration allows enough time for the combinational signals to stabilize completely.

As can be seen from Fig. 2, all parasitic levels and edges on the B and E signals are filtered out of the output; only the first positive edge in a counter period is considered. However, a wrong output could be generated if the R input were the same signal that resets the main counter. Because the counter has a synchronous reset, it needs another clock period to change its outputs. Depending on the specific combinations of the binary “1” bit detections, the Begin signal may therefore still be active in that clock period, after End has already been deactivated. Because Reset is also active, on the following negative clock edge the BER cell becomes sensitive again to the B input, so it activates the output pulse, which lasts until the next End detection.

[Fig. 2 signal traces: Clock, Begin AND, End AND, Reset AND, Delayed Reset, “BER” cell out, Wrong output. Annotations: first detection of “Begin” combination “x1x11…”; first detection of “End” combination “x11x1…”; ignored detections of “1” bit combinations; pulse “Begin”; pulse “End”; “Reset” delayed after the reset of main counter; cell “Reset” (sensitive again to “Begin” input); wrong pulse “Begin” due to cell “Reset” before main counter.]


This malfunction can be easily eliminated if the R signal for all BER cells is delayed by one clock period after the reset detection, so that the cells become sensitive to their B inputs at the same time as the counter is reset. This explains the presence of a D type flip-flop, with the same clock as the main counter, in the Reset path of the BER cells (Fig. 1).

As mentioned before, a BER cell can be implemented directly from the Fig. 2 timing diagrams using two Altium Designer FDRE flip-flops (Fig. 3).

Fig. 3 – BER cell implementation using two “D” flip-flops with dominant synchronous reset and recessive clock enable.

The first flip-flop (U1) controls the cell sensitivity to the B input and the second flip-flop (U3) controls the cell output. Both are clocked on the same clock edge and have a synchronous reset (R) with priority over the clock enable (CE) that validates the data load from the “D” input. When enabled, the U3 flip-flop always loads the inverted state of the U1 flip-flop.
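The two-flip-flop cell can also be modelled at register level. The following is a minimal Python sketch under stated assumptions: the Fig. 3 drawing is not reproduced here, so the wiring (U1 with D tied to “1”, CE driven by End and R driven by Reset; U3 with D driven by the inverted U1 output, CE driven by Begin and R driven by End) is inferred from the textual description in this section, and the class names and the stimulus are hypothetical.

```python
class FDRE:
    """D flip-flop with dominant synchronous reset (R) and recessive clock enable (CE)."""

    def __init__(self) -> None:
        self.q = 0

    def next_q(self, d: int, ce: int, r: int) -> int:
        if r:                      # reset has priority over clock enable
            return 0
        return d if ce else self.q


class BerCell:
    """Assumed wiring: U1(D=1, CE=End, R=Reset), U3(D=not U1.Q, CE=Begin, R=End)."""

    def __init__(self) -> None:
        self.u1 = FDRE()  # 1 means "insensitive to Begin"
        self.u3 = FDRE()  # cell output

    def clock(self, b: int, e: int, r: int) -> int:
        # Both next states are computed from the current outputs,
        # then updated together, as on a common clock edge.
        n1 = self.u1.next_q(d=1, ce=e, r=r)
        n3 = self.u3.next_q(d=1 - self.u1.q, ce=b, r=e)
        self.u1.q, self.u3.q = n1, n3
        return self.u3.q


# Hypothetical stimulus of (Begin, End, Reset) samples, one per clock edge.
stimulus = [
    (0, 0, 0), (1, 0, 0), (1, 0, 0), (0, 0, 0),
    (1, 1, 0), (1, 0, 0), (0, 0, 0), (1, 0, 0),
    (0, 0, 1), (1, 0, 0),
]
cell = BerCell()
trace = [cell.clock(b, e, r) for b, e, r in stimulus]
print(trace)
```

In this model the output rises on the first Begin, falls on the first End even though Begin is still active, stays low for the later Begin levels, and rises again only after Reset has cleared U1.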

The Begin signal is connected to the CE input of U3 and changes the “U1U3” cell state from the initial “00” to the “01” state, which is still B-sensitive and has the O output active. All following clock edges that find B still active before the first E detection load the same “1” value into U3, with no effect on the output. When End first becomes active, the cell state changes to “10” (B-insensitive, O-inactive) on the next clock edge, because U1 loads “1” through its CE and U3 is dominantly reset to “0”, regardless of the recessive CE=B state. All following clock edges that find B active with E inactive always load the same “0” value into U3, with no effect on the output, because U1 has already changed to “1”. The activation of Reset brings the cell back to its initial “00” state (B-sensitive, O-inactive) by resetting U1, which is dominant over the recessive CE=E.

4. Application: Dual Refresh Synch-Generator for FPGA Video Interface

One of the main applications of such an optimized pulse generator is in video interfaces for embedded systems. An FPGA implementation of a VGA video interface with the frame memory included on the same chip was presented in (Ioan, 2010a). The main utility of this interface comes from the autonomy that it brings to any embedded system (Nakano et al., 2008), which can then be used fully independently, without a connection to a personal computer (Hamblen, 1999).

Paper (Ioan, 2010a) presented the implementation of a VGA interface with an 85 Hz vertical refresh rate that used a heuristically optimized synch-generator. This rate proved to be very useful for all CRT monitors, but some newer LCD monitors did not work well without a lower 60 Hz refresh mode. When this synch-generator had to be redesigned for dual refresh mode (60 Hz / 85 Hz), applying all the heuristic techniques used before proved to be prohibitive, because the resulting schematic would have been too complex for any efficient optimization. The only feasible solution in this situation is the systematic hardware structure presented in this paper, which leads to minimum resource usage when applied together with an algorithm for global reduction of the detection gate inputs.

Only the horizontal synch-generator part will be discussed here, because the structure of the vertical one is similar, due to algorithmic character of the design process that uses this systematic technique. The design begins with the drawings of global timing diagrams first introduced in paper (Ioan, 2008). These diagrams give a graphical representation for the position of border, blank and sync pulses relative to the address counter period, but they should be modified for dual refresh mode, to show dual detection values for each generated event.

Fig. 4 – Design diagram of horizontal synch-generator with pulse positions for dual refresh.

[Fig. 4 labels recovered from the drawing: TOTAL HORIZONTAL LINE TIME; CHARACTER ADDRESS COUNTER ROLLOVER; HORIZONTAL SCREEN; LEFT BORDER; RIGHT BORDER; HBORD; HBLNK; HSYNC; hex detection values 00h, 20h, 24h, 25h, 26h, 29h, 2Bh, 2Eh, 30h, 31h, 33h.]

A more general diagram architecture for synch-generator design is proposed here, in Fig. 4. The diagram for the horizontal synch-generator was simplified compared to the papers (Ioan, 2008; Ioan, 2010a): it shows the address generation portion together with the pyramidal position of the pulses, keeping only the hex values at which the pulses change, without the preceding moments. Each “flag-like” moment indicator contains two hex values for the counter, the first for the 60 Hz refresh mode and the second for the 85 Hz mode, but any number of modes can be represented in this way on the same diagram.

Each hex value from Fig. 4 needs one AND gate to detect when the appropriate counter bits have become logical “1” for that combination. However, some greater combinations may include all the bits considered for a smaller value. A minimization algorithm is therefore needed to reuse these already generated AND terms, in order to reduce the schematic drawing effort and the implementation resources. The algorithm proposed below can be applied to any general multi-phase pulse generator design, but it is easier to present in the particular context of this synch-generator.

20h               1
24h               1    1
25h               1    1    1
26h               1    1    0    1
29h               1    0    0    0    1
2Bh               1    0    0    0    1    1
2Eh               1    1    0    1    0    0    1
30h               1    0    0    0    0    0    0    1
31h               1    0    0    0    0    0    0    1    1
33h               1    0    0    0    0    0    0    1    1    1
OR               20h  24h  25h  26h  29h  2Bh  2Eh  30h  31h  33h
w-1               0    1    2    2    2    3    3    1    2    3
n1-1              9    3    0    1    1    0    0    2    1    0
(w-1) x (n1-1)    0    3    0    2    2    0    0    2    2    0
n2-1              5    –    –    –    1    0    –    2    1    0
(w-1) x (n2-1)    0    –    –    –    2    0    –    2    2    0

Fig. 5 – OR covering triangle for inputs minimization of horizontal synch-generator.

The first step of the reduction algorithm is the construction of a so-called “OR covering triangle” (Fig. 5). All the m detection combinations are written on the vertical and horizontal sides of the triangle, in ascending order of their hex values. Then all binary covering values are calculated by performing a bitwise OR between the line combination l_i and the column combination c_j, according to Eq. (1):

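$$t_{ij} = \begin{cases} 1, & \text{if: } l_i \lor c_j = l_i \\ 0, & \text{if: } l_i \lor c_j \neq l_i \end{cases} \quad \text{with: } i, j = 1, \dots, m;\ j \le i. \qquad (1)$$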
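The construction of the triangle is mechanical and can be sketched in a few lines of Python. This is an illustrative sketch, not the author's program; the function names are hypothetical, and the only data used are the ten detection values listed in Fig. 5.

```python
def covers(column: int, line: int) -> int:
    """t_ij of Eq. (1): 1 if OR-ing the column into the line leaves the line unchanged."""
    return 1 if (line | column) == line else 0


def covering_triangle(values):
    """Return the values in ascending order and the lower-triangular matrix t[i][j], j <= i."""
    vals = sorted(values)
    tri = [[covers(vals[j], vals[i]) for j in range(i + 1)] for i in range(len(vals))]
    return vals, tri


# Detection values of the horizontal synch-generator, as listed in Fig. 5.
HSYNC = [0x20, 0x24, 0x25, 0x26, 0x29, 0x2B, 0x2E, 0x30, 0x31, 0x33]

vals, tri = covering_triangle(HSYNC)
for v, row in zip(vals, tri):
    print(f"{v:02X}h", *row)
```

Each printed row corresponds to one line of the triangle in Fig. 5; the diagonal is always 1 because every combination covers itself.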


The result is “1” only if the OR function leaves the line combination unchanged; in that case the corresponding column value is said to “cover” (be included in) this line combination. The idea is that the covered line can be detected by a narrower AND gate on a second level of gates, which reuses the column detection from the first level and adds only the missing “1” bits. The algorithm is proposed here with only two levels of AND gates, because adding another level would either be equivalent after the FPGA tools' optimizations or would increase the propagation time.

When a covered line reuses a previously detected column combination, the number of inputs of the covered AND gate is reduced by one less than the number of bits that are “1” in the covering column (its bit weight), because one input is still needed to bring in the reused combination. The contribution of each column j, if it were reused for a single AND gate only, can therefore be calculated as in Eq. (2), where b_jk is bit k of column combination j and N is the number of counter bits. It is written below each column combination in the triangle.

$$(w_j - 1) = \left( \sum_{k=0}^{N-1} b_{jk} \right) - 1 \qquad (2)$$

The total number of inputs that one column combination can eliminate is this contribution multiplied by the number of lines it covers minus one, because the column cannot be reused by itself. In each stage s, the algorithm calculates a new covering value with Eq. (3), considering only the lines that have not been covered by other columns in a previous stage.

$$(n_j^s - 1) = \left( \sum_{i=1}^{m} t_{ij} \right) - 1 \qquad (3)$$

The best column combination for stage s is then chosen: the one that has both a large number of “1” bits and covers as many lines as possible. If it is used on the first level of AND gates, it reduces the global number of inputs by the value:

$$R^s = \max_j \left\{ (w_j - 1) \cdot (n_j^s - 1) \right\} \qquad (4)$$

All the lines covered by the chosen column combination are eliminated from the triangle (Fig. 5). In the next stage, another column combination is chosen to cover the remaining lines, and so on, until there are no more lines to cover. If two column combinations have the same reducing power R, either of them can be selected, or both if they cover different line combinations.
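The staged selection of Eqs. (2) to (4) can be sketched as a small greedy loop. This is a minimal Python sketch under stated assumptions, not the author's program: it picks one column per stage (where the paper may select two tied columns in the same stage), breaks ties in favour of the smallest value, and stops when no remaining column eliminates any input. The function names are hypothetical; the data are the values of Fig. 5.

```python
def weight(value: int) -> int:
    """Bit weight w_j: number of '1' bits in the combination."""
    return bin(value).count("1")


def covers(column: int, line: int) -> bool:
    """True if the column's '1' bits are all present in the line (Eq. 1)."""
    return (line | column) == line


def minimise(values):
    """Greedy two-level reduction: returns (stages, uncovered lines)."""
    uncovered = sorted(values)
    stages = []
    while uncovered:
        best, best_gain = None, 0
        for c in uncovered:
            n = sum(covers(c, line) for line in uncovered)  # n_j^s, includes c itself
            gain = (weight(c) - 1) * (n - 1)                # term inside Eq. (4)
            if gain > best_gain:                            # ties keep the smaller value
                best, best_gain = c, gain
        if best is None:
            break  # remaining lines stay as plain first-level gates
        covered = [line for line in uncovered if covers(best, line)]
        stages.append((best, best_gain, covered))
        uncovered = [line for line in uncovered if line not in covered]
    return stages, uncovered


HSYNC = [0x20, 0x24, 0x25, 0x26, 0x29, 0x2B, 0x2E, 0x30, 0x31, 0x33]
stages, leftover = minimise(HSYNC)
for column, gain, covered in stages:
    print(f"reuse {column:02X}h: saves {gain} inputs on",
          " ".join(f"{v:02X}h" for v in covered))
print("not covered:", " ".join(f"{v:02X}h" for v in leftover))
print("total inputs eliminated:", sum(g for _, g, _ in stages))
```

For the Fig. 5 data, this sketch selects 24h, 29h and 30h, leaves 20h on the first level and eliminates 7 inputs in total.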


For the horizontal synch-generator of the Fig. 4 diagram, the combination 24h was chosen in the first stage, eliminating 3 inputs. In the second stage, both combinations 29h and 30h were selected simultaneously, eliminating 2+2 inputs, because they cover different lines. The 20h line remains uncovered and must also be implemented on the first AND level. In total, 7 inputs were eliminated by this algorithm, which is the maximal possible value with two gate levels.

The implementation of the horizontal synch-generator in Altium Designer environment, optimized according to Fig. 1 general schematic with BER cells and this reduction algorithm is presented in Fig. 6 (on the next page) and needs no further comments.

5. Experimental Results

Using the optimization solutions presented in this paper for the horizontal (Fig. 6) and vertical synch-generators, the video interface presented in (Ioan, 2010a) was redesigned to work with dual 60 Hz / 85 Hz refresh frequencies, keeping the hardware structure of the execution circuits intact. The implementation was made on an XC3S200 circuit from the Spartan-3 FPGA family, with the help of the Altium Designer and Xilinx ISE WebPack tools (Ioan, 2010b). The new implementation needs only 4 LUTs, 8 flip-flops and 13 slices more than the heuristically optimized implementation from paper (Ioan, 2010a), and it provides dual frequency refresh instead of a single one.

Table 1 shows a more detailed comparison between resources used by both implementations:

Table 1
Comparison Between Heuristic (85 Hz) and Algorithmic (60/85 Hz) Implementations

Used resources          Paper (Ioan, 2010a)   This paper   Difference
4-input LUTs (logic)    117 (3%)              121 (3%)     +4 (+0%)
4-input LUTs (total)    125 (3%)              129 (3%)     +4 (+0%)
Slice flip-flops        87 (2%)               95 (2%)      +8 (+0%)
Slices                  90 (4%)               103 (5%)     +13 (+1%)

To verify the proper functioning of the synch-generators designed and optimized as presented, the video interface was tested on an LCD monitor that reports 31 kHz (horizontal)/59 Hz (vertical) refresh frequencies (Fig. 7). The difference from the theoretical 31.5 kHz/60.0 Hz arises because the pixel clock used was 25.000 MHz instead of the 25.175 MHz required for the 60 Hz refresh mode (Ioan, 2010a). In this case, it was impossible to obtain the exact pixel clock frequency from the 50.000 MHz FPGA global clock.
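The size of this deviation can be estimated with simple scaling. The following Python sketch assumes that both refresh frequencies scale linearly with the pixel clock (the counter periods being unchanged) and that the monitor truncates the displayed values; both are assumptions for illustration, and the variable names are hypothetical.

```python
# Values quoted in the text above
nominal_clock_hz = 25.175e6   # pixel clock required for the 60 Hz mode
actual_clock_hz = 25.000e6    # pixel clock actually used
nominal_h_hz = 31.5e3         # theoretical horizontal frequency
nominal_v_hz = 60.0           # theoretical vertical frequency

# Assumption: line and frame rates are proportional to the pixel clock.
scale = actual_clock_hz / nominal_clock_hz
actual_h_hz = nominal_h_hz * scale
actual_v_hz = nominal_v_hz * scale

print(f"horizontal: {actual_h_hz / 1e3:.2f} kHz")  # about 31.28 kHz
print(f"vertical:   {actual_v_hz:.2f} Hz")         # about 59.58 Hz
```

Under these assumptions the expected values are about 31.28 kHz and 59.58 Hz, which is consistent with a display of “31 KHz/59 Hz” if the monitor drops the fractional part.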

Fig. 6 – Implementation schematic for dual refresh horizontal synch-generator.


Fig. 7 – Detail showing “31 KHz/59 Hz” refresh frequencies on a LCD monitor.

Another photo, of the image generated on a CRT monitor (Fig. 8), confirms the proper functioning of the synch-generators in the other mode, with 85 Hz refresh. This time the pixel clock is exactly 36.000 MHz and the monitor shows the theoretical 43.3 kHz (horizontal)/85.0 Hz (vertical) refresh frequencies.

6. Conclusions

This work presents an original method for the design and optimization of multi-phase pulse generators. Describing such hardware structures with HDLs (Hardware Description Languages) is very convenient, but the resulting comparator-based structure is inefficient in resources and not always hazard-free. The optimization presented here is applied when the structure is described as a graphical schematic.


Fig. 8 – Photo of a CRT monitor showing “H: 43.3 KHz V: 85.0 Hz” refresh.

A systematical approach to schematic optimization that works in any configuration is presented here for the first time: the paper introduces a general schematic structure, an innovative filtering element called “BER cell”, a general design diagram for synch-generators and a minimization algorithm for the number of inputs in detection gates. If a decrementing counter is used, the complementary method can be applied: the detection of “0” bit combinations with OR gates and the AND covering triangle for minimization.

One idea is proposed for further research: all these techniques could be integrated into a unified software tool that directly outputs the netlist file of the designed structure.


REFERENCES

Alecsa B.C., Ioan A.D., FPGA Implementation of a Sinusoidal PWM Generator with Zero Sequence Insertion. Proc. of IEEE International Symposium on Advanced Topics in Electrical Engineering, Bucharest, Romania, May 12-14, 2011.

Altium Ltd., Altium Designer. FPGA Design Basics. Training manual, http://www.altium.com/files/training/Module5FPGADesign.pdf, 2008.

Chu P.P., FPGA Prototyping by Verilog Examples, Xilinx Spartan-3 Version. John Wiley & Sons Inc., New Jersey, 2008.

Hamblen J., Using Large CPLDs and FPGAs for Prototyping and VGA Video Display Generation in Computer Architecture Design Laboratories. Technical Committee on Computer Architecture Newsletter, IEEE Computer Society, July, 1999.

Ioan A.D., Designing an Optimal Single Chip FPGA Video Interface for Embedded Systems. Proc. of IEEE International Symposium on Electrical and Electronics Engineering, Galaţi, România, September 16-18, 2010a.

Ioan A.D., Designing the Synchro-Generator of an FPGA or CPLD Video Interface for Micro-Systems. Proc. of International Conference on Development and Application Systems, Suceava, România, May 22-24, 2008.

Ioan A.D., New Techniques for Implementation of Hardware Algorithms inside FPGA Circuits. Advances in Electrical and Computer Engineering, Vol. 10, 2, 2010b.

Mano M.M., Kime C.R., Logic and Computer Design Fundamentals. 3rd edition, Prentice Hall, New Jersey, 2004.

Nakano K., Kawakami K., Shigemoto K., Kamada Y., Ito Y., A Tiny Processing System for Education and Small Embedded Systems on the FPGAs. Proc. of IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, Shanghai, China, December 17-20, 2008.

Navabi Z., Digital Design and Implementation with Field Programmable Devices. Kluwer Academic Publishers, Boston, 2005.

Rodriguez-Andina J.J., Moure M.J., Valdes M.D., Features, Design Tools, and Application Domains of FPGAs. IEEE Transactions on Industrial Electronics, IEEE Industrial Electronics Society, Vol. 54, 4, 2007.

ALGORITHMIC SOLUTION FOR THE DESIGN AND OPTIMISATION OF MULTI-PHASE PULSE GENERATORS

(Summary)

This paper proposes a new design method for multi-phase pulse generators, which produce at their outputs series of repetitive pulses positioned independently over the signal period, with or without partial or total overlap. Such hardware structures are particular cases of counter-based finite state machines (FSMs), which can be found in a wide range of applications where a precise sequence of pulses is needed to control the execution section of a system, such as video synch-generators, three-phase inverters or certain kinds of pipelined data processing. The classical approach to designing such generators uses magnitude comparators on all counter bits, which is also the structure that results from using “if” conditions in any hardware description language. By lowering the description to schematic level, the magnitude comparators can be reduced by heuristic methods to equality comparators with inputs on fewer bits than the counter has in total, but such techniques depend on the specific case and cannot be generalized. Here an algorithmic solution is presented that can be applied systematically, as a universal technique, and that uses only AND gates to detect the counter combinations at the start and end of each pulse. Taking the optimization further, the paper also proposes an (equally algorithmic) procedure for reducing to a minimum the global number of inputs of all these detection AND gates, a procedure that gives results even for implementations inside FPGA circuits, where the implementation tools significantly modify the initial structure of the schematic. Both proposed solutions were applied in practice and extensively tested by designing the synch-generators with multiple resolutions and refresh rates of a video interface for embedded systems, implemented in an FPGA together with its frame memory.