A Neural CMOS Integrated Circuit and its Application to Data Classification

İzzet Cem Göknar, Fellow, IEEE, Merih Yıldız, Member, IEEE, Shahram Minaei, Senior Member, IEEE, and Engin Deniz, Member, IEEE



Manuscript ID TNN-2011-P-2933


Abstract— Implementation and new applications of a tunable

Complementary Metal–Oxide–Semiconductor Integrated Circuit

(CMOS-IC) of a recently proposed classifier Core-Cell (CC) are

presented and tested with two different datasets. With two

algorithms, one based on Fisher’s linear discriminant analysis

and the other on perceptron learning, used to obtain CC’s

tunable parameters, Haberman and Iris datasets are classified.

The parameters so obtained are used for hard-classification of

datasets with a neural network structured circuit. Classification

performance as well as coefficient calculation times for both

algorithms are given. The CC has a 6 ns response time and 1.8 mW

power consumption. The fabrication parameters used for the IC

are taken from CMOS AMS 0.35 µm technology.

Index Terms—CMOS, Classifier, Fisher, Iris, Haberman

I. INTRODUCTION

CLASSIFICATION is an important subject matter in many

applications ranging from pattern recognition, neural

networks to artificial intelligence, from statistics to template

matching [1], [2]. In general, data classification can be realized

either by software or by hardware systems. Many algorithms

have been developed for classification [1]; however, for faster online operations on hard data it is desirable to realize these

classifiers in hardware which can be achieved with many

different approaches either in analog or digital domains.

Analog implementation of classification has many

advantages over digital ones. For one, complexity of analog

circuits is lower as compared to digital circuits; for another,

they can be built in voltage-mode or current-mode (where the input and output signals are voltages or currents, respectively). In voltage-mode implementations the

supply voltage level has an important impact on the dynamic

range of the circuit. Current-mode approach provides larger

dynamic range for processing the variables. It is well known

that shrinking bias voltages makes it difficult to process data

in voltage-mode. A simple summing circuit, in voltage-mode,

needs additional active blocks (e.g. operational amplifiers and

additional circuitry); current-mode processing on the other

hand is preferred as currents can be added by connecting

output terminals of the blocks without requiring the use of

extra active blocks.

Manuscript received January 25, 2011; revised June 15, 2011, October 20, 2011, and February 1, 2012. This work is part of project 106E139 supported by the Scientific & Technological Research Council of Turkey (TÜBİTAK). The authors are with the Department of Electronics and Communications Engineering, Dogus University, Acibadem, Kadikoy 34722, Istanbul, Turkey (e-mail: cgoknar@dogus.edu.tr, myildiz@dogus.edu.tr, sminaei@dogus.edu.tr).

As a drawback, current-mode circuits used to suffer from lower accuracy in comparison to voltage-mode ones; but, in newer technologies (180 nm and below)

where low supply voltages are used, the accuracy in voltage-

mode circuits is also critical whereas current signals can

maintain high ratio accuracy [3]. Some classifier circuits,

using advantages of the current-mode approach, are listed in

the next paragraph.

For template matching applications, current-mode circuits

are proposed in [4] based on Euclidean distance calculator and

in [5] based on threshold circuits. Another current-mode

circuit which covers both Euclidean distance calculation and

Gaussian neighborhood tapering is given in [6]. A current-

mode sorting circuit for pattern recognition is designed to

build transformation between features and classes in [7] and

[8]. For pattern recognition applications a current-mode, fuzzy

IC is presented in [9]. Different classification solutions

regarding the implementation of neural networks on

programmable digital circuits and devices can be found in

[10], [11], but these circuits are relatively costly and dissipate high power. A compact analog programmable multi-dimensional radial basis function based classifier is proposed

in [12] and CMOS implementation of a Neural Network (NN)

classifier with several output levels and a different architecture

is given in [13]. CMOS realization of a conscience mechanism

used to improve the effectiveness of learning in winner-take-

all artificial neural networks, which also eliminates the dead

neurons, is presented in [14]. Except the one in [13], all these

circuits suffer from the shortcoming of not being tunable.

In this paper, the new DU-TCC 1209 IC containing 3 CCs,

9 Second-Generation Current Conveyors (CCII) and 3 current

buffers is being introduced. The newer CC architecture

published in [5], which improves the response time and the

Relative Tracking Error (RTE) as compared to the CC given

in [13], is being exploited in the IC design/layout/fabrication.

Connected properly, these CCs can be exploited to realize

n-D classifiers, which can only classify data defined over

mesh-grid (rectangular partitioning) domains. To overcome

this deficiency, Linearly Weighting Circuits (LWC) that take

linear combinations of data and input these combinations to

CC are introduced as preprocessing units.

With two algorithms, modified/adapted versions of Fisher’s

Linear Discriminant Analysis (LDA) and Perceptron Learning

Algorithm (PLA), the weighting coefficients are calculated and soft- as well as hard-tested on the Iris and Haberman datasets.

Iris dataset consists of 50 samples from each of three

species of the Iris flowers, which are: virginica, versicolor and

setosa (3-class data). The flowers have 4 features which are

the lengths and widths of the sepal and petal in centimeters (4-D data). This dataset is classified with LDA to distinguish the

flowers from each other [15]; the Iris dataset is not linearly

separable and is frequently used to test many other

classification techniques. Haberman dataset contains cases

from a study that was conducted between 1958 and 1970 at the

University of Chicago's Billings Hospital on the survival of

patients who have undergone surgery for breast cancer. It

consists of 306 samples from two classes, the patients who

survived 5 years or longer (225 samples) and the patients who

died within 5 years (81 samples). The dataset is 3-D with age

of the patient at the time of operation, patient’s year of

operation and number of positive axillary nodes detected.

The paper is organized as follows: in Section II, block

diagram, the transfer characteristic and the schematics of the

current-mode CC are given. In Section III, LWC needed for

datasets separated by hyperplanes with arbitrary slopes is

introduced. Derivation of parameter values for classifying

datasets using the modified Fisher's LDA based algorithm is given in Section IV. The derivation of the same parameters with PLA is given in Section V. In Section VI, these weight

parameters are applied to the Iris and Haberman dataset

classifier circuits, which are constructed with LWCs and CCs;

also DU-TCC 1209 classification test results are compared

with the simulation results. Finally, Section VII concludes the

paper.

II. CMOS CORE CELL (CC)

The block diagram of the CC and its transfer characteristics

are shown in Figs. 1(a) and (b), respectively. The horizontal

position, the width and the height of the transfer characteristic

can be adjusted independently by means of external currents I1,

I2 and IH. The proposed CC schematic is shown in Fig. 2

where the transistors M1-M5 and M8-M12 constitute the two

threshold circuits respectively. The basic current mirror

constructed with transistors M6 and M7 performs the desired

operation of subtraction. The transistors M13, M14, M15 are

used to provide currents equal to IH (adjusting the output level)

for the threshold circuits. Similarly, the same approach is used

with transistors M16, M17 and M18 to apply the input current Iin

to both of the threshold circuits.

The current Iin is the 1-D data for each CC and Iout is the

output of the classifier. It has been shown in [5] that by

properly grouping CCs and adding the outputs in each group

Multi-Input-Multi-Output (MIMO) classifiers can be obtained.

A detailed Monte Carlo analysis of the CC for the VTH and β parameters of the MOS transistors is reported in [5]; the results show that parameter mismatch has little effect on the CC

characteristics.
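Behaviorally, the transfer characteristic of Fig. 1(b) can be sketched as a simple software model. This is an idealized sketch under our own assumptions (perfectly sharp thresholds at I1 and I2 and instantaneous response, which the real cell only approximates):

```python
def core_cell(i_in, i1, i2, i_h):
    """Idealized CC transfer characteristic of Fig. 1(b): the output
    equals i_h when the input current lies inside the window [i1, i2]
    and is zero outside; i1 sets the horizontal position, i2 - i1 the
    width, and i_h the height of the characteristic."""
    return i_h if i1 <= i_in <= i2 else 0.0

# With the CC-1 settings used later for the Iris classifier
# (I1 = 1 uA, I2 = 8 uA, IH = 10 uA):
print(core_cell(5e-6, 1e-6, 8e-6, 10e-6))   # input inside the window
print(core_cell(9e-6, 1e-6, 8e-6, 10e-6))   # input outside the window
```

Grouping several such cells and summing their outputs models the MIMO classifier behavior described above.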

III. CLASSIFICATION OF DATA PARTITIONED

WITH ARBITRARY HYPERPLANES

The CC presented in Section II, and the MIMO classifiers realized with it, partition data domains into rectangular mesh-grids, whereas there is a strong need to treat linearly non-separable data, as shown in Fig. 3(a).

Let the data to be classified be 2-D and belong to two different classes (A and B), as in Fig. 3(a). If the inputs x1 and x2 are multiplied by the coefficients w1 and w2 and applied to a CC with control currents I1 and I2, as shown in Fig. 3(b) (called a linearly non-separable 1-D data classifier), then the data domain will be partitioned with arbitrary hyperplanes as shown in Fig. 3(a). The block diagram realization of Fig. 3(b)

can easily be generalized to an n-D data classifier; a detailed

explanation is given in [16].
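The arrangement of Fig. 3(b), an LWC forming w1x1 + w2x2 followed by a CC, can be sketched in software as follows (an illustrative model with weights and window values of our own choosing; the cell behavior is the idealized double threshold of Fig. 1(b)):

```python
def classify_1d(x, w, i1, i2, i_h):
    """Fig. 3(b): the LWC forms the weighted sum of the inputs, and the
    idealized CC outputs i_h when that sum falls inside [i1, i2]."""
    s = sum(wi * xi for wi, xi in zip(w, x))   # LWC: w1*x1 + w2*x2 + ...
    return i_h if i1 <= s <= i2 else 0.0       # CC double-threshold window

# Illustrative 2-D example: samples with 1 <= x1 + 2*x2 <= 3 are flagged.
inside = classify_1d([1.0, 0.5], [1.0, 2.0], 1.0, 3.0, 10.0)   # sum = 2.0
outside = classify_1d([4.0, 1.0], [1.0, 2.0], 1.0, 3.0, 10.0)  # sum = 6.0
```

The strip between the two hyperplanes w·x = i1 and w·x = i2 is exactly the region the CC maps to a nonzero output.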

Fig. 1. (a) CC block diagram. (b) Transfer characteristic of the CC.

Fig. 2. CMOS implementation of the CC [5].


Fig. 3. Linearly non-separable (a) data domains; (b) 1-D data classifier.

Fig. 4. Classification methodology of LDA dataset [16].

The classification method and the IC developed in the

sequel will be able to classify linearly non-separable (e.g. as

the one in Fig. 4(a)) data. The dashed lines, called Double

Threshold Hyperplanes (DTH), in Fig. 4 (b)-(d) correspond to:

$w_1 x_1 + w_2 x_2 = a, \qquad w_1 x_1 + w_2 x_2 = b$   (1)

and are found to provide the best separation of the data, as shown in Fig. 4(b) (data outside of these DTH being already classified). Then the classified data is deleted and, for the rest, the DTH are

found again as shown in Fig. 4(c). The outcome of the

classification is shown in Fig. 4(d). So the classification of

data is achieved by finding the appropriate coefficients w1, w2

and corresponding CC currents. The output of the classifier

will be in conformity with Fig. 4(d). Algorithms for obtaining

these DTHs will be presented in Sections IV and V.
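The iterative methodology of Fig. 4 (find the best DTH, peel off the data already classified outside it, and repeat on the rest) can be sketched generically; here `find_dth` is a placeholder callback standing in for either of the two algorithms of Sections IV and V:

```python
def classify_iteratively(samples, find_dth, max_rounds=10):
    """Fig. 4 methodology: repeatedly find a DTH (a pair of parallel
    hyperplanes), record it, and keep only the still-ambiguous samples
    lying between the two hyperplanes for the next round.

    find_dth(samples) -> (w, a, b) is supplied by the caller; this loop
    only sketches the peeling procedure, not the DTH search itself."""
    hyperplanes = []
    remaining = list(samples)
    for _ in range(max_rounds):
        if not remaining:
            break
        w, a, b = find_dth(remaining)
        hyperplanes.append((w, a, b))
        # Data outside the strip a <= w.x <= b is already classified;
        # keep only the samples between the two hyperplanes.
        remaining = [x for x in remaining
                     if a <= sum(wi * xi for wi, xi in zip(w, x)) <= b]
    return hyperplanes

# Hypothetical one-round example with a fixed strip around the origin:
def toy_dth(samples):
    return ([1.0], 0.0, 1.0)

found = classify_iteratively([[-1.0], [0.5], [2.0]], toy_dth, max_rounds=1)
```

Each recorded (w, a, b) triple maps directly onto one LWC/CC pair, with w realized by the LWC and a, b by the control currents I1 and I2.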

IV. CLASSIFICATION OF DATA WITH

FISHER’S LDA BASED ALGORITHM

Fisher’s LDA, a successful linear feature extraction method

which maximizes between-class separability and minimizes

within-class variability, is used to find the classifier circuits’

parameters; it will be presented here in 2-D and applied to 4-

D; exactly the same procedure is valid in n-D [17].

Fisher’s LDA is used to find the projection of the data to a

“best” line, which has a direction vector v

and passes through

the origin. Let a two dimensional data with two classes c1 and

c2 be given as shown in Fig. 4(a). After the projection of the

data to this “best” line the histogram of the data is obtained as

shown in Fig. 5 where, µp1 and µp2 are the average distances of

the data to the origin, and σ1, σ2 are standard deviations. The

histogram characteristics help to find the “best” projection line

and hence DTH. This is achieved by finding the projection

Fig. 5. View of DTH on the projection line.

line satisfying the following two criteria: a) µp1 and µp2 are to

be at the maximum distance from each other, b) σ1 and σ2

should have minimal values. Let the 2-D vector $\vec{x}$ belong to a "two-class" dataset,

$\vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$   (2)

Projection of the $\vec{x}$ samples onto the projection line can be found with the scalar product $v^T \vec{x}$; $\mu_{p1}$ and $\mu_{p2}$ are the average distances of the first and second class data sets, respectively, whereas $\mu_1$, $\mu_2$ are the average vectors calculated component-wise for each class. It follows that:

$\mu_{p1} = v^T \mu_1, \qquad \mu_{p2} = v^T \mu_2$   (3)

as $y = v^T \vec{x}$ gives the distance of the data to the origin. So

the scatters of the first and the second class are given with:

$s_{p1}^2 = \sum_{\vec{x}_i \in C_1} (y_i - \mu_{p1})^2, \qquad s_{p2}^2 = \sum_{\vec{x}_i \in C_2} (y_i - \mu_{p2})^2$   (4)

According to these equations, to find the best projection line,

$J(v) = \dfrac{(\mu_{p1} - \mu_{p2})^2}{s_{p1}^2 + s_{p2}^2} = \dfrac{v^T S_B v}{v^T S_w v}$   (5)

should be maximized [17], where

$S_w = \sum_{i=1}^{n} \sum_{\vec{x} \in C_i} (\vec{x} - \mu_i)(\vec{x} - \mu_i)^T, \qquad S_B = \sum_{i=1}^{n} n_i (\mu_i - \mu)(\mu_i - \mu)^T$

are the within-class and between-class scatter matrices; here $n_i$ is the number of training samples in the i-th class and $n$ is the number of classes. Maximizing $J(v)$ is then equivalent to solving the generalized eigenvalue problem [17]:

$S_B v = \lambda S_w v$   (6)

The eigenvector corresponding to the eigenvalue with

maximum value (they are all positive, the matrices being

positive semi-definite) obtained from (6) gives the vector v

which is the direction of the projection line [18], [19]. The

projection of data on this line helps to find the DTH as shown

in Fig. 5 [20]. On the projection line in Fig. 5, “a” shows the

minimum distance to the origin of data in the second-class; its

coordinates are (xa1, xa2). Similarly, point “b” is the maximum

distance to the origin of data in the first-class dataset, whose

coordinates are (xb1, xb2).

The hyperplane equations that are going to classify the

datasets can then be written as:


$v^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - v^T \begin{bmatrix} x_{a1} \\ x_{a2} \end{bmatrix} = 0, \qquad v^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} - v^T \begin{bmatrix} x_{b1} \\ x_{b2} \end{bmatrix} = 0$   (7)

The process is repeated for data between the hyperplanes, the

data outside being classified.

The components of the so chosen eigenvector determine the

weight coefficients (e.g. (14) in Section VI.B) to be used in the

LWC that will implement the separating hyperplanes.

In the n-D case, the only difference is in the dimension of

the involved matrices in (6), which is now n×n; again one has to find the eigenvector corresponding to the eigenvalue with

maximum value to determine DTH [18]-[20].
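As a numerical sketch of this section's procedure (a toy illustration under our own assumptions: SciPy's generalized eigensolver stands in for whatever solver the authors used, the data is synthetic, and the sign of v is normalized so that class 2 projects farther from the origin):

```python
import numpy as np
from scipy.linalg import eigh

def fisher_dth(X1, X2):
    """Compute the projection vector v of (6) and the DTH positions of
    Fig. 5. X1, X2: (n_samples, n_features) arrays, one per class.
    Returns (v, a, b): 'b' is the largest class-1 projection and 'a'
    the smallest class-2 projection, as in Fig. 5."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    mu = np.vstack([X1, X2]).mean(axis=0)          # overall mean
    # Within-class scatter S_w, summed over both classes
    Sw = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    # Between-class scatter S_B
    Sb = (len(X1) * np.outer(mu1 - mu, mu1 - mu)
          + len(X2) * np.outer(mu2 - mu, mu2 - mu))
    # Generalized eigenproblem S_B v = lambda S_w v of (6);
    # eigh returns eigenvalues in ascending order, so take the last one.
    vals, vecs = eigh(Sb, Sw)
    v = vecs[:, -1]
    if v @ (mu2 - mu1) < 0:   # fix the sign so class 2 projects farther
        v = -v
    b = (X1 @ v).max()        # point "b": farthest class-1 projection
    a = (X2 @ v).min()        # point "a": nearest class-2 projection
    return v, a, b

# Toy 2-D two-class example (illustrative data, not the paper's datasets)
X1 = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]])
X2 = np.array([[5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
v, a, b = fisher_dth(X1, X2)
```

For separable data b < a, and the strip b < v·x < a contains the samples that still need a further DTH in the next round.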

V. CLASSIFICATION OF DATA WITH THE

PERCEPTRON LEARNING ALGORITHM

The classical PLA widely used in neural networks is based

on a single threshold activation function [21] whereas the one

used here is double threshold function as shown in Fig. 1(b).

Regions separated by DTH are characterized by:

$y_i = \begin{cases} 1, & \vec{x}^T v_i - a_i \ge 0 \\ 0, & \vec{x}^T v_i - a_i < 0 \end{cases}$   (8)

$y_i = \begin{cases} 1, & \vec{x}^T v_i - b_i \le 0 \\ 0, & \vec{x}^T v_i - b_i > 0 \end{cases}$   (9)

With the algorithm developed next for DTH, the vectors vi,

the DTH coefficients ai and bi in (8) and (9), relevant to the i-

th class, will be determined. Classification of the data will

then be achieved by separating the classes with appropriate

number of hyperplanes.

Perceptron Learning based Classification Algorithm

Selecting one of the data classes ci (i=1,2,..,m):

1. Check, if there is an appropriate DTH that separates all

data of class ci from all the others.

a. If there is, save these coefficients and delete this data class from the list; move to step 2.

b. If there is not, then move to step 3.

2. Continue with the remaining classes.

a. If the remaining class is m-th, then stop classification.

b. If not, move to the step 1.

3. Increase the number of DTH by 1 and check whether all

data from class ci can be separated from other classes.

a. If yes, then save the coefficients of the DTHs and remove this data class from the list; move to step 2.

b. If the class cannot be separated, then repeat step 3.

As the activation function is a hard limiter, the coefficients

can be calculated with the update rules of PLA [22] given

below.

Perceptron learning algorithm update rules:

$v_i(k+1) = v_i(k) + \eta\,(y_d - y_o)\,\vec{x}$   (10)

$a_i(k+1) = a_i(k) - \eta\,(y_d - y_o)$   (11)

$b_i(k+1) = b_i(k) + \eta\,(y_d - y_o)$   (12)

In (10-12) yd and yo are the desired and the output obtained

at that step respectively; η is the learning coefficient which has

to be chosen between 0 and 1. While updating, when yd = yo the

weight coefficients do not change. Learning algorithm stops

when all weight coefficients stay constant [23].

When learning is finished using the training set, the DTHs $a_i \le \vec{x}^T v_i \le b_i$ partition the data domain into regions each

containing a single class of data. Components of v

obtained

from the learning algorithm give the weight coefficients of

LWC, and ai and bi determine the CC’s control currents I1 and

I2. The remaining data set will be used to verify the correct

operation of the classifier so obtained. The CC current IH helps

to identify the class of the data. Thus, classification is

provided with an appropriate number of LWC and CC blocks. In

the sequel both algorithms will be used to hard and soft-

classify Iris and Haberman datasets.
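As a software sketch of a double-threshold perceptron update: the specific update directions for a and b below are our assumption (widening the window when a target sample is missed, shrinking it when an outside sample fires), consistent with the stated property that nothing changes when yd = yo:

```python
def train_dth(data, labels, eta=0.1, epochs=200):
    """Fit one DTH  a <= v.x <= b  with a perceptron-style update.

    labels are 1 for the class the DTH should enclose, 0 otherwise.
    The a/b update directions are assumptions, not the paper's exact
    rules; when yd == yo nothing is updated, as stated in the text."""
    v = [0.0] * len(data[0])
    a, b = 0.0, 1.0
    for _ in range(epochs):
        changed = False
        for x, yd in zip(data, labels):
            s = sum(vi * xi for vi, xi in zip(v, x))
            yo = 1 if a <= s <= b else 0        # double-threshold activation
            if yd != yo:
                d = eta * (yd - yo)
                v = [vi + d * xi for vi, xi in zip(v, x)]
                a -= d                           # widen window on a miss,
                b += d                           # shrink it on a false fire
                changed = True
        if not changed:                          # all coefficients constant:
            break                                # learning stops
    return v, a, b

# Tiny 1-D example: enclose the sample at x = 2, exclude x = 0 and x = 6.
v, a, b = train_dth([[2.0], [0.0], [6.0]], [1, 0, 0])
```

The learned v maps onto the LWC weights and the final a, b onto the CC control currents I1 and I2, exactly as described in the text.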

Fig. 6. Die photo of the classifier integrated circuit.

TABLE I DIMENSIONS OF MOS TRANSISTORS IN FIG 2

MOSFET M1, M2, M3, M4, M5, M8, M9, M10, M11, M12

M6, M7, M13, M14, M15, M16, M17, M18, M19, M20, M21, M22

W [m] 21 67.9

L [m] 1.05 1.05

Fig. 7. LWC configuration with DO-CCII.

Fig. 8. Projection of Iris data to the origin.


VI. REALIZATION OF THE CORE CIRCUIT, LINEARLY

WEIGHTING COEFFICIENTS AND TEST RESULTS

A. CMOS Realizations

The layout of the CC and of the IC including 3 CCs, 9

current conveyors and 3 current buffers have been designed

using MENTOR software with fabrication parameters for the

CMOS AMS 0.35 µm process [24]. The die photo of the

manufactured IC, called DU-TCC 1209, is shown in Fig. 6. In

order to provide user tunability, all 52 pins had to be used for I/O access, causing the pads to dominate the IC area and thus resulting in a pad-limited design. DU-TCC 1209 has a 2.62×2.62 mm² die area

and CMOS transistors’ dimensions of the CC are given in

Table I.

The block diagram of the LWC using a Dual Output Second

Generation Current Conveyor (DO-CCII) [16] is shown in Fig.

7. The voltage Vy is the input and the current Iz+ and Iz- are the

outputs of the circuit in Fig. 7; these output currents can be

expressed as:

$I_{z+} = \dfrac{V_y}{R}, \qquad I_{z-} = -\dfrac{V_y}{R}$

The resistance R is used to convert the voltage input data

Vy to current; moreover, the ratio 1/R provides the appropriate

weight value for the realization. It is worthwhile mentioning

that the DO-CCII can also be used to provide negative weight

values using the Z- terminal in case the need arises.

B. Experimental Setting and Applications

The 4-D Iris dataset has 150 samples, with an equal number from three classes (c1, c2, c3). Taking 40 samples from each class

and using Fisher’s LDA, the coefficients of the projection

(eigen)vector are obtained as:

$v = (0.57 \;\; {-0.80} \;\; 0.10 \;\; 0.14)$   (14)

The projection of the Iris data using $v$ (the scalar product $v^T \vec{x}$) is shown in Fig. 8. It can be seen from Fig. 8 that the

data belonging to three different classes can be separated with

appropriate boundaries (DTH); these boundaries determine the

CC control currents. To test DU-TCC 1209 with the Iris data,

the classifier block diagram given in Fig. 9 was constructed on

a specially designed Printed Circuit Board (PCB) shown in

Fig. 10 where potentiometers provide tunability; in building

the classifier, the product of the Iris input data with the vector v (the wi coefficients) was provided by LWC blocks, as outlined next.

In order to obtain the weighting coefficient w1 = 0.57 in (14), the resistor R1 at the X terminal of the 1st DO-CCII is tuned to the value R1 = 10/0.57 kΩ, providing an output current IZ+ = 0.57(Vy/10 kΩ). For the 2nd coefficient, w2 = 0.80 in (14), the resistor R2 at the X terminal of the 2nd DO-CCII is tuned to R2 = 10/0.80 kΩ, providing an output current IZ− = 0.80(Vy/10 kΩ) to secure the minus sign, and so on.
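The resistor tuning just described can be checked numerically; the 10 kΩ base value comes from the text, while the helper's name and form are ours:

```python
def lwc_resistor(weight, r_base=10e3):
    """Resistance at the X terminal of the DO-CCII realizing |weight|:
    with R = r_base/|w| the output current is Iz = Vy/R = |w|*Vy/r_base.
    Negative weights are taken from the Z- terminal instead."""
    return r_base / abs(weight)

# The four weighting coefficients of (14):
for w in (0.57, 0.80, 0.10, 0.14):
    print(f"w = {w:4.2f} -> R = {lwc_resistor(w) / 1e3:6.2f} kOhm")
```

Since the weight enters only through 1/R, each coefficient can be retuned in the field with a potentiometer, which is exactly what the test PCB of Fig. 10 provides.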

To test the weight values provided by the algorithm, the 30 unused data samples (10 from each class) were soft-applied to the classifier of Fig. 9 with a 4-channel programmable-output function generator. Each sample was applied for a 1 ms duration and was

classified correctly; the outcome is shown in Table II (only 20

shown because of space limitations).

Fig. 9. Iris classifier block diagram (Fisher's LDA is used).

Fig. 10. Test PCB for DU-TCC 1209.

TABLE II
IRIS AND HABERMAN TEST OUTCOME OF THE CLASSIFIER

Time interval           Iris data                    Haberman data
of data (ms)     x1    x2    x3    x4   Class     x1    x2    x3   Class
0 - 1            4.3   2.3   1.4   0.2   c1       34    60     1    c1
1 - 2            5.7   2.7   3.9   1.1   c2       61    68     1    c2
2 - 3            5.7   2.7   4.8   1.8   c2       51    59     3    c2
3 - 4            4.9   2.2   6     2.5   c3       37    59     6    c1
4 - 5            5.6   2.5   5.1   1.9   c3       54    58     1    c1
5 - 6            4.6   3     1.7   0.4   c1       61    62     5    c2
6 - 7            4.7   3.1   1.5   0.1   c1       42    63     1    c1
7 - 8            6.1   2.9   3.8   1.1   c2       53    61     1    c1
8 - 9            6.4   3     4     1.3   c2       48    67     7    c2
9 - 10           4.9   2     4.7   1.4   c2       65    66    15    c2
10 - 11          4.8   3.2   1.2   0.2   c1       60    59    17    c2
11 - 12          5.4   3.8   1.6   0.6   c1       42    59     2    c1
12 - 13          6.3   2.8   6.5   1.8   c3       30    62     3    c1
13 - 14          6.7   3     6.4   2     c3       65    62    22    c2
14 - 15          7.2   3.2   5.4   2.1   c3       41    60    23    c2
15 - 16          5.4   3.9   1.9   0.4   c1       46    58     3    c1
16 - 17          5.3   3.7   1.5   0.2   c1       42    61     4    c1
17 - 18          5.4   3.8   1.3   0.3   c1       72    67     3    c1
18 - 19          6.3   3.0   4.4   1.3   c2       47    63    23    c2
19 - 20          6.3   3.0   4.1   1.3   c2       43    58    52    c2

Fig. 11. Iris data classification oscilloscope outcome (LDA is used).

Fig. 12. Haberman classifier block diagram (PLA is used).


Fig. 13. Haberman data classification oscilloscope outcome (PLA is used).

To verify the performance of DU-TCC 1209, control currents of the CCs given in Fig. 9 are hard-set as given in Table III;

CC currents I1, I2 are selected according to the classification

boundaries obtained from the projection of the Iris data as

shown in Fig. 8 (for instance, class c1 is in the distance range of 0.1 to 0.8, so I1 = 1 µA and I2 = 8 µA). On the other hand, if

the output current Iout is 10 µA then the data is taken from

class c1, for 20 µA from c2, while for 30 µA from c3.
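The output-level decoding just described amounts to a nearest-level lookup. A trivial sketch follows; the tolerance value is our assumption, since a measured current will only approximate the nominal levels:

```python
def decode_class(i_out,
                 levels=((10e-6, "c1"), (20e-6, "c2"), (30e-6, "c3")),
                 tol=2e-6):
    """Map a measured output current to the class whose nominal level
    it approximates within tol; return None if no level matches."""
    for nominal, cls in levels:
        if abs(i_out - nominal) <= tol:
            return cls
    return None
```

For example, a measured 10.5 µA decodes to c1 and a measured 29 µA to c3, while a current far from all three nominal levels is rejected.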

Fig. 11 shows the output voltage taken across a 5 kΩ resistor

connected to the output of the Iris dataset classifier circuit to

measure its output current. Test results are in perfect agreement with the classes given in Table II.

The 3-D Haberman dataset contains 306 samples belonging

to two classes (c1, c2); for the training stage, taking 42 randomly chosen samples from each class and using the PLA developed in

Section V, the DTHs are obtained as:

$v_1 = (1.4 \;\; 2.2 \;\; 16)$, $a_1 = 11$ and $b_1 = 69$   (15)

$v_2 = (1.3 \;\; 2.2 \;\; 10)$, $a_2 = 20$ and $b_2 = 24$   (16)

The data remaining within the region enclosed by these

DTHs belongs to class c1, otherwise the data is from class c2.

These regions are constructed with the configuration in Fig. 12

(LWC-1, LWC-2, LWC-3 and CC-1 constitute the first region,

and the remaining blocks constitute the second region).

To verify the performance, the 20 unused data samples (10 from each class) were first soft-applied to the Haberman dataset classifier

of Fig. 12; all data was classified correctly and the outcome is

shown in Table II. Control currents of the CCs given in Fig.

12 were hard-set as given in Table III. These control currents

were chosen according to the coefficients a and b which are

obtained from the PLA. If the output current Iout is 30 µA then

the data is from c1, if 0 µA then the data is from c2.

Fig. 13 shows the output voltage taken across a 5 kΩ resistor

connected to the output of the classifier circuit to measure its

output current. Test results are in perfect agreement with the

classes given in Table II.

Iris dataset is also used to test the performance of the

classifier circuit with the PLA. Fig. 14 shows the result of the

classification of the Iris dataset with the PLA. If the output current Iout is 15 µA, then the data is from c1; if 30 µA, from c2; and if 0 µA, from c3. Test results are in perfect

agreement with the classes given in Table II. Haberman

dataset has also been classified with Fisher’s LDA based

algorithm and the outcome is exhibited in Fig. 15. If the output current Iout is 20 µA, then the data is from c1; if 60 µA, from c2. Test results are in perfect agreement with the

classes given in Table II.

Fig. 14. Iris data classification oscilloscope outcome (PLA is used).

Fig. 15. Haberman data classification oscilloscope outcome (LDA is used).

TABLE III
CONTROL CURRENTS OF THE CCS IN FIG. 9 AND FIG. 12

Control                 CCs in Fig. 9              CCs in Fig. 12
Currents        CC-1     CC-2     CC-3          CC-1     CC-2
I1              1 µA     16 µA    21 µA         11 µA    20 µA
I2              8 µA     19 µA    24 µA         69 µA    24 µA
IH              10 µA    20 µA    30 µA         30 µA    30 µA

TABLE IV
PERFORMANCE COMPARISON OF THE ALGORITHMS

                         Classification           Coefficient calculation
Algorithm                performance [%]          time [s]
                         Iris     Haberman        Iris     Haberman
Fisher LDA based         100      100             0.4      0.3
Perceptron learning      100      100             220      190

TABLE V
COMPARISON OF 1-D CLASSIFIER CIRCUITS

Exp/Sim(a)   Ref.   Technology   Supply voltage   Power dissipation              Response time   RTE
Exp          [4]    0.6 µm       3.3 V            14.95 mW                       -               -
Exp          [8]    0.35 µm      5 V              -                              -               -
Exp          [9]    2 µm         5 V              80 mW                          -               -
Exp          [12]   0.5 µm       3.3 V            90-160 µW                      20-40 µs        -
Exp          CC     0.35 µm      ±1.65 V          1.2 mW (soft), 1.8 mW (hard)   6 ns            0.5%
Sim          [13]   0.35 µm      ±1.65 V          1.4 mW                         7 ns            2%

(a) Sim: Simulated, Exp: Experimental

Haberman and Iris datasets were classified with Fisher’s

LDA and perceptron learning based algorithms. The

algorithms were executed on a Personal Computer (PC)

running at 3 GHz and having 1 GB Random Access-Memory

(RAM). The classification performance and coefficient

calculation times for both algorithms are given in Table IV, showing there is no misclassification.

The core cells given in [4], [8], [9], [12], [13] and the one

used in this paper are compared in terms of technology, power consumption, supply voltage, response time and RTE in Table V. A small response time and RTE are important for fast and correct decisions on the data. The power

consumption of the circuit in [13] and CC, given in Table V,

are obtained for I1=50 µA, I2=100 µA, IH=100 µA; smaller

choices of these currents yield much less consumption.


VII. CONCLUSION

This paper presented a NN classifier based on a 1-D classifier called the CC, implemented in DU-TCC 1209, an IC containing 3 CCs, 9 current conveyors and 3 current buffers with 52 I/O pins, designed and manufactured with CMOS AMS 0.35 µm technology parameters. After presenting the improved CC, the newly designed IC, DU-TCC 1209, was introduced. Hard-test results concerning the classification of the Haberman and Iris datasets are reported; to that purpose two algorithms, one based on perceptron learning, the other on Fisher's LDA, were developed. The weight values (control currents of the CCs) determined with these two algorithms were applied to the resulting NN classifier in both hard and soft mode, showing perfect agreement between simulations and measurements.

Other applications such as quantization, template matching with error correction, etc. were previously described at simulation level in [5], [16], [25], [26]. The running times of the algorithms are also included to compare performances. Moreover, DU-TCC 1209 is versatile: provided a library of weights (control currents) is available, its tunable weight parameters allow it to be used in many applications, whereas the circuits given in Table V are single-task oriented.

All applications so far are based on static properties; the dynamical behavior of DU-TCC 1209 remains to be analyzed and then tested. New applications developed by allowing the control currents (parameters) to vary with time, thus taking full advantage of DTH, will be explored in future works. In another direction, digital circuitry can be embedded into the IC so that online programming as well as tuning of the weighting parameters can be achieved in the field.

REFERENCES

[1] E. Hunt, Artificial Intelligence. New York: Academic, 1975.

[2] H. S. Abdel-Aty-Zohdy and M. Al-Nsour, “Reinforcement learning

neural network circuits for electronic nose,” in IEEE International Symp.

on Circuits and Systems, May 30 – June 2, 1999, pp. 379-382.

[3] C. Toumazou, G.S. Moschytz and M. B. Gilbert, Trade-offs in analog

circuit design: the designer’s companion. Kluwer Academic Pub., 2002.

[4] B. Liu, C. Chen, and J. Tsao, “A modular current-mode classifier circuit

for template matching application,” IEEE Trans. Circuit and Syst. II, Analog and Digital Sig. Proc., vol. 47, no. 2, pp. 145-151, 2000.

[5] M. Yıldız, S. Minaei, and C. Göknar, “A flexible current-mode classifier

circuit and its applications,” International Journal of Circuit Theory and

Applications, vol. 39, pp. 933-945, 2010.

[6] F. Li, C.-H. Chang, and L. Siek, "A compact current mode neuron circuit

with Gaussian taper learning capability," in IEEE International

Symposium on Circuits and Systems, May 24-27, 2009, pp. 2129-2132.

[7] G. Lin and B. Shi, “A current-mode sorting circuit for pattern

recognition,” in Intelligent Processing and Manufacturing of Materials, Honolulu, Hawaii, July 10-15, 1999, pp. 1003–1007.

[8] D. Y. Aksın, and S. Aras, “A compact Distance Cell for Analog

Classifiers,” in Proceedings of the IEEE International Symposium on

Circuits and Systems, Kobe, Japan, May 23-26, 2005, pp. 3627-3630.

[9] G. Lin and B. Shi, “A multi-input current-mode fuzzy integrated circuit

for pattern recognition,” in Intelligent Processing and Manufacturing of

Materials, Honolulu, Hawaii, July 10-15, 1999, pp. 687-693.

[10] J. L. Ayala, A. G. Lomena, M. Lopez-Vallejo, and A. Fernandez, “Design

of a Pipelined Hardware Architecture for Real-Time Neural Network Computations,” in 45th Midwest Symposium on Circuits and Systems,

Oklahoma, USA, August 4-7, 2002, pp. 419-422.

[11] J. Zhu, “Towards an FPGA based reconfigurable computing

environment for neural network implementations”, in International

Conf. on Artificial Neural Networks Conference, Edinburgh, UK, 6-10

September, 1999, pp. 661–666.

[12] S. Y. Peng, P. E. Hasler, and D. Anderson, “An Analog Programmable

Multi-Dimensional Radial Basis Function Based Classifier,” in

International Conference on Very Large Scale Integration, Atlanta,

USA, October 15-17, 2007, pp. 13-18.

[13] M. Yıldız, S. Minaei, and C. Göknar, “A CMOS Classifier Circuit using

Neural Networks with Novel Architecture,” IEEE Transaction on

Neural Networks, Vol. 18, 2007, pp. 1845-1849.

[14] R. Dlugosz, T. Talaska, W. Pedrycz, and R. Wojtyna, "Realization of the

Conscience Mechanism in CMOS Implementation of Winner-Takes-All Self-Organizing Neural Networks," IEEE Transaction on Neural

Networks , vol. 21, no. 6, pp. 961-971, June 2010.

[15] R. A. Fisher, “The use of multiple measurements in taxonomic

problems,” Annals of Eugenics, vol. 7, pp. 179-188, 1936.

[16] M. Yildiz, S. Minaei, and S. Özoğuz, “Linearly weighted classifier

circuit,” in Northeast Workshop on Circuits and Systems, Toulouse,

France, June 28 – July 1, 2009, pp. 99-102.

[17] L. Qi and W.T. Donald, “Principal feature classification,” IEEE

Transaction on Neural Networks, vol. 8, pp. 155-160, 1997.

[18] D. Qian, “Modified Fisher's linear discriminant analysis for

hyperspectral imagery,” IEEE Geoscience and Remote Sensing Letters,

Vol. 4, pp. 503-507, 2007.

[19] H. Çevikalp, “Theoretical analysis of linear discriminant analysis

criteria,” in IEEE 14th Signal Processing and Communications Applications, Antalya, Turkey, April 17-19, 2006, pp. 1-4.

[20] Q. Li, “Classification using principal features with application to speaker

verification,” Ph.D. diss., Univ. Rhode Island, Kingston, Oct. 1995.

[21] D. Y. Aksın, S. Aras, and İ.C. Göknar, “CMOS realization of user

programmable, single-level, double-threshold generalized perceptron,”

in Proceedings of Turkish Artificial Intelligence and Neural Networks

Conference, İzmir, Turkey, July 21-23, 2000, pp. 117-125.

[22] Y. Zhao, B. Deng, and Z. Wang, “Analysis and study of perceptron to

solve XOR problem,” in Proc. of the 2nd International Workshop on

Autonomous Decentralized System, China, Nov. 6-7, 2002, pp. 168-173.

[23] İ. Genç and C. Güzeliş, “Threshold class CNNs with input-dependent

initial state,” in IEEE International Workshop on Cellular Neural Networks and their App., London, Eng., April 14-17, 1998, pp. 130-135.

[24] Parameter Ruler Design CMOS AMS 0.35 µm, Mentor Graphics

Corporation, 2008.

[25] M. Yıldız, S. Minaei, and İ.C. Göknar, “A low-power multilevel-output

classifier circuit,” in European Conference on Circuit Theory and Design, Seville, Spain, Aug. 26-30, 2007, pp. 747-750.

[26] M. Yıldız, S. Minaei, C. Göknar, “Realization and template matching

application of a CMOS classifier circuit,” in Proc. of the Applied

Electronics Conf., Pilsen, Czech Rep., Sep. 10-11, 2008, pp. 231-234.