2009 IEEE Symposium on Computational Intelligence for Multimedia Signal and Vision Processing
Character recognition with two spiking neural network models on
multicore architectures
Mohammad A. Bhuiyan, Rommel Jalasutram, and Tarek M. Taha
Abstract— This paper presents the use of the Izhikevich and
Hodgkin Huxley neuron models for image recognition. The
former is more biologically accurate than the commonly used
integrate and fire neuron model but has similar low
computational requirements. Brain scale cortex models tend to
use the more biological neuron models. The results of this work
show that the Izhikevich model can be used for image
recognition and would be a good candidate for a large scale
visual cortex model. Neural networks based on these models
are developed and applied to character recognition. They were
able to identify 48 24×24 pixel images and their noisy versions. The
networks were accelerated using modern multicore processors
and showed significant speedups. Such processors are likely to
be used for developing high performance, large scale
implementations of these image recognition networks.
I. INTRODUCTION
Spiking neural network models are the third generation of
neural networks and are considered to be one of the most
biologically accurate models. Several studies indicate that
pulse coding (as opposed to rate coding) may be used in
biological neurons in the cortex for high speed image
recognition [5]. Rate coding generally requires longer
processing times as a neuron would have to wait for several
spikes to arrive before firing. In pulse coding, the timing
order of the incoming spikes can encode information at a
faster rate. Both [5] and [8] have utilized pulse coding based
spiking neural networks for image recognition. Gupta and
Long developed a model for recognizing a set of 48 3×5 pixel
images of characters. They utilized the integrate and fire
spiking neuron model for recognition and spike timing
dependent plasticity (STDP) for training [4]. Thorpe utilized a similar
neuron model, but examined the recognition of much larger
images.
At present there is a strong interest in the research
community to model biological scale image recognition
systems such as the visual cortex of a rat or a human. For
such systems, it would be useful to examine the use of more
biologically accurate neuron models than the integrate and
fire model for image recognition. Izhikevich points out that
the commonly used integrate and fire model is one of the
least biologically accurate spiking neuron models. He states
that “the model cannot exhibit even the most fundamental
properties of cortical spiking neurons, and for this reason it
should be avoided by all means” [12]. Izhikevich compares a
set of 11 spiking neuron models [12] in terms of their
biological accuracy and computational load. He shows that
the five most biologically accurate models are (in order of
biological accuracy) the: 1) Hodgkin-Huxley, 2) Izhikevich,
3) Wilson, 4) Hindmarsh-Rose, and 5) Morris-Lecar models.
Of these the Hodgkin-Huxley model is the most compute
intensive, while the Izhikevich model is the most
computationally efficient.
Current studies of large scale models of the cortex are
generally using either the Izhikevich [1] or the Hodgkin
Huxley [16] spiking neuron models. One of the main
problems with using the Hodgkin Huxley model for large
scale implementations is that it is computationally intensive
– each neuron update requires about 1200 flops [12]. Using
the integrate and fire model (requires about 5~13 flops per
neuron update) can make large scale implementations more
computationally efficient (such as [5]). The recently
published Izhikevich model is attractive for large scale
implementations as it is close to the Hodgkin Huxley model
in biological accuracy, but is similar to the integrate and fire
model in computational intensity (13 flops per neuron
update).
In this paper, we develop a character recognition model
based on the two layer spiking neuron network model in [8].
However we utilize the more biologically accurate
Izhikevich and Hodgkin Huxley neuron models instead of
the integrate and fire model. We are not aware of any other
studies that examine the use of the Izhikevich model for
pattern recognition. The studies in [1] and [13] utilize the
Izhikevich model primarily to examine spiking behavior and
computational load (they do not examine the use of the
model for inference). We utilize pulse coding to mimic the
high speed recognition taking place in the mammalian brain
[5] and STDP for training [4]. The model is trained to
recognize a set of 48 24×24 pixel images of characters and
was able to recognize noisy versions of these images.
We also examine the performance of the two character
recognition models on recent multicore processors. With the
limited scaling in processor clock frequencies, multicore
processors have become the standard industrial approach to
improve processor performance. However we are not aware
of any studies examining the implementation or performance
This work was supported by an NSF CAREER Award and grants
from the US Air Force.
M. A. Bhuiyan is a Ph.D. student. Email: [email protected].
R. Jalasutram is an M.S. student. Email: [email protected].
T. M. Taha is an Assistant Professor. Email: [email protected].
Electrical and Computer Engineering Department, Clemson
University, Clemson, SC 29631, USA.
978-1-4244-2771-0/09/$25.00 ©2009 IEEE
of these neuron models on multicore processors. The two
architectures examined in this study are the Intel Pentium D
925 processor and the IBM/Sony/Toshiba Cell broadband
engine [7]. The Intel Pentium D 925 has two processing
cores running at 3 GHz while the Cell broadband engine has
nine processing cores running at 3.2 GHz. The latter
processor has attracted significant attention from the
computing community recently. The fastest supercomputer
at present, the IBM Roadrunner supercomputer [19] installed
at Los Alamos National Lab, utilizes 12,240 Cell processors
and 6,912 AMD Opteron processors.
Section II of this paper discusses background material
including a brief introduction to the two spiking network
models considered for this study and a discussion of related
work. Section III presents the character recognition model.
Sections IV and V describe the experimental setup and
results of the model, while section VI concludes the paper.
II. BACKGROUND
A. Hodgkin Huxley Model
The Hodgkin–Huxley model [9] is considered to be one of
the most biologically accurate spiking neuron models. It
consists of four differential equations (eq. 1-4) and a large
number of parameters. The differential equations describe
the neuron membrane potential, activation of Na and K
currents, and inactivation of Na currents. The model can
exhibit almost all types of neuronal behavior if its
parameters are properly tuned. This model is very important
to the study of neuronal behavior and dynamics as its
parameters are biophysically meaningful and measurable. A
time step of 0.01 ms was utilized to update the four
differential equations as this is the most commonly used
value. Fig. 1 shows the spikes produced with this model.
\frac{dV}{dt} = \frac{1}{C}\left[ I - g_K n^4 (V - E_K) - g_{Na} m^3 h (V - E_{Na}) - g_L (V - E_L) \right]   (1)

\frac{dn}{dt} = \left( n_\infty(V) - n \right) / \tau_n(V)   (2)

\frac{dm}{dt} = \left( m_\infty(V) - m \right) / \tau_m(V)   (3)

\frac{dh}{dt} = \left( h_\infty(V) - h \right) / \tau_h(V)   (4)
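As an illustrative sketch (not the paper's actual implementation), eqs. (1)-(4) can be integrated with forward Euler at the 0.01 ms time step using the Appendix I conductances and reversal potentials. The standard 1952 alpha/beta rate functions below are assumed, since the paper does not list them; the alpha/beta form is equivalent to the n∞/τ form of eqs. (2)-(4).

```python
import math

# Parameters from Appendix I (original 1952 convention: rest near 0 mV).
g_K, g_Na, g_L = 36.0, 120.0, 0.3        # conductances
E_K, E_Na, E_L = -12.0, 115.0, 10.613    # reversal potentials (mV)
C = 1.0                                  # membrane capacitance
dt = 0.01                                # time step (ms), as in the paper

# Standard HH rate functions; x_inf = alpha/(alpha+beta), tau = 1/(alpha+beta).
def alpha_n(V): return 0.01 * (10 - V) / (math.exp((10 - V) / 10) - 1)
def beta_n(V):  return 0.125 * math.exp(-V / 80)
def alpha_m(V): return 0.1 * (25 - V) / (math.exp((25 - V) / 10) - 1)
def beta_m(V):  return 4.0 * math.exp(-V / 18)
def alpha_h(V): return 0.07 * math.exp(-V / 20)
def beta_h(V):  return 1.0 / (math.exp((30 - V) / 10) + 1)

def hh_step(V, n, m, h, I):
    """One forward-Euler update of eqs. (1)-(4)."""
    dV = (I - g_K * n**4 * (V - E_K)
            - g_Na * m**3 * h * (V - E_Na)
            - g_L * (V - E_L)) / C
    dn = alpha_n(V) * (1 - n) - beta_n(V) * n
    dm = alpha_m(V) * (1 - m) - beta_m(V) * m
    dh = alpha_h(V) * (1 - h) - beta_h(V) * h
    return V + dt * dV, n + dt * dn, m + dt * dm, h + dt * dh

# Simulate 20 ms with a constant input current and count threshold crossings.
V, n, m, h = 0.0, 0.32, 0.05, 0.6        # approximate resting state
spikes, above = 0, False
for _ in range(int(20 / dt)):
    V, n, m, h = hh_step(V, n, m, h, I=10.0)
    if V > 30 and not above:
        spikes += 1
    above = V > 30
print(spikes)
```

A constant suprathreshold current such as I = 10 produces repetitive firing in this parameter regime, which is the behavior plotted in Fig. 1.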
B. Izhikevich Model
Izhikevich proposed a new spiking neuron model in 2003
[11] that is based on only two differential equations (eq. 5-6)
and four parameters. The model requires significantly fewer
computations than the Hodgkin Huxley model (13 flops as
opposed to 1200 flops per neuron update), but can still
reproduce almost all types of neuron responses that are seen
in biological experiments. A time step of 1 ms was utilized
(as was done by Izhikevich in [11]). Fig. 2 shows the spikes
produced with this model.
\frac{dV}{dt} = 0.04V^2 + 5V + 140 - u + I   (5)

\frac{du}{dt} = a(bV - u)   (6)

if V \geq 30 mV, then V \leftarrow c, u \leftarrow u + d
Fig. 2. Spikes produced with the Izhikevich model.
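For illustration, a minimal forward-Euler simulation of eqs. (5)-(6) with the reset rule, using the 1 ms time step and the excitatory parameters from Appendix I, can be written as follows (a sketch; the input current value is assumed, not taken from the paper):

```python
# Excitatory-neuron parameters from Appendix I.
a, b, c, d = 0.02, 0.2, -55.0, 4.0
dt = 1.0                              # 1 ms time step, as in [11]
I = 10.0                              # assumed constant input current

V, u = -65.0, b * -65.0               # typical initial state
spikes = 0
for _ in range(1000):                 # simulate 1000 ms
    if V >= 30:                       # spike: apply the reset rule
        V, u = c, u + d
        spikes += 1
    V += dt * (0.04 * V * V + 5 * V + 140 - u + I)   # eq. (5)
    u += dt * a * (b * V - u)                         # eq. (6)
print(spikes)
```

Each loop iteration costs on the order of the 13 flops quoted above, which is what makes this model attractive for large scale simulation.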
C. Related work
Several groups have studied image recognition using
spiking neural networks. In general, these studies utilize the
integrate and fire model. Thorpe developed SPIKENET [5],
a large spiking neural network simulation software. The
system can be used for several image recognition
applications including identification of faces, fingerprints,
and video images. Johansson and Lansner developed a large
cluster based spiking network simulator of a rodent sized
cortex [14]. They tested a small scale version of the system
to identify 128×128 pixel images. Baig [2] developed a
temporal spiking network model based on integrate and fire
neurons and applied them to identify online cursive
handwritten characters. Gupta and Long [8] investigated the
application of spiking networks for the recognition of simple
characters. Other applications of spiking networks include
instructing robots in navigation and grasping tasks [17],
recognizing temporal sequences [3][10], and the robotic
modeling of mouse whiskers [15].
At present several groups are developing biological scale
implementations of spiking networks, but are generally not
examining the applications of these systems (primarily as
they are modeling large scale neuronal dynamics seen in the
brain).

Fig. 1. Spikes produced with the Hodgkin Huxley model.

The Swiss institution EPFL and IBM are developing
a highly biologically accurate brain simulation [16] at the
subneuron level. They have utilized the Hodgkin Huxley and
the Wilfred Rall [18] models to simulate up to 100,000
neurons on an IBM BlueGene/L supercomputer. At the IBM
Almaden Research Center, Ananthanarayanan and Modha
[1] utilized the Izhikevich spiking neuron models to simulate
55 million randomly connected neurons (equivalent to a rat-
scale cortical model) on a 32,768 processor IBM
BlueGene/L supercomputer. Johansson et al. simulated a
randomly connected model of 22 million neurons and 11
billion synapses using an 8,192 processor IBM
supercomputer [6].
III. NETWORK DESIGN
A two layer network structure was utilized in this study
with the first layer acting as input neurons and the second
layer as output neurons. Input images were presented to the first
layer of neurons, with each image pixel corresponding to a
separate input neuron. Thus the number of neurons in the
first layer was equal to the number of pixels in the input
image. Binary inputs were utilized in this study. If a pixel
was “on” a constant current was supplied to the input, while
no current was supplied if a pixel was “off.” The number of
output neurons was equal to the number of training images.
Each input neuron was connected to all the output neurons.
In the training process, images from the training set are
presented sequentially to the input neurons. The weight
matrix of each output neuron is updated using the STDP rule
each cycle. In the testing phase, an input image is presented
to the input neurons and after a certain number of cycles,
one output neuron fires, thus identifying the input image.
Two versions of the model were developed where the main
difference was the equations utilized to update the potential
of the neurons. In one case, the Hodgkin Huxley equations
presented in section II.A were used, while in the second
case, the Izhikevich equations in section II.B were used. The
parameters utilized in each case are specified in Appendix I.
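The two-layer structure above can be sketched as follows. This is a simplified illustration with assumed values: the "on"-pixel current I_ON, the learning rate, and the exact STDP window are not specified at this level of detail, so the pair-based update below stands in for the paper's STDP rule.

```python
import numpy as np

N_PIX = 24 * 24          # one input neuron per pixel
N_OUT = 48               # one output neuron per training image
I_ON = 20.0              # constant current for an "on" pixel (assumed)

rng = np.random.default_rng(0)
W = rng.uniform(0, 1, (N_OUT, N_PIX))   # input-to-output weight matrix

def output_currents(image, W):
    """Binary pixels gate a constant current into each output neuron."""
    return W @ (image.ravel() * I_ON)

def stdp_update(W, pre_spikes, post_spikes, lr=0.01):
    """Pair-based STDP sketch: when an output neuron spikes, potentiate
    its weights from inputs that spiked and depress the rest."""
    post = post_spikes.astype(float)[:, None]      # (N_OUT, 1)
    pre = pre_spikes.astype(float)[None, :]        # (1, N_PIX)
    W += lr * post * (pre - 0.5)                   # +/- depending on pre
    return np.clip(W, 0.0, 1.0)

# Testing phase: the output neuron driven by the largest current
# (and hence firing first) identifies the image.
image = rng.integers(0, 2, (24, 24))
winner = int(np.argmax(output_currents(image, W)))
```

In the full models, `output_currents` would feed the Hodgkin Huxley or Izhikevich membrane equations of section II rather than a direct argmax; the argmax here is a shorthand for "first output neuron to fire."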
IV. EXPERIMENTAL SETUP
The same set of training images was utilized for both the
Hodgkin Huxley and the Izhikevich versions of the model.
There were 48 training images, each 24×24 pixels in size. The
images represented the 26 upper case letters (A-Z), 10
numerals (0-9), 8 Greek letters, and 4 symbols. Fig. 3 shows
the training images used. The two models were initially
developed and tested in MATLAB.
Large scale high performance versions of the models
would likely utilize optimized C implementations on clusters
of modern multicore processors. Therefore we evaluated the
runtime performance of parallelized C implementations on
two high performance multicore architectures: an Intel
Pentium D 925 processor and the Cell Broadband Engine.
The Pentium D 925 processor runs at 3 GHz and has two
processing cores. A Sony Playstation 3 platform was utilized
for the Cell processor. This processor contains one
administrative core (PPU) and eight high performance
processing cores (SPUs). On the Playstation 3 platform, only
six of the eight SPUs are available for use. Both the Intel and
Cell processing platforms were running Fedora Core.
Fig. 3. Training images utilized. There are a total of 48 24×24 pixel images.
The two most important optimizations needed for high
performance implementations on these multicore
architectures are the parallelization and vectorization of the
models. The models were compiled using gcc on both
platforms. The models were parallelized by distributing the
neurons of each level across a set of threads that would run
in parallel on separate processing cores. The Pentium D
platform utilized POSIX threads for parallelization and the
Cell platform utilized the IBM Cell SDK 2.1 library. The
models were also vectorized to take advantage of the SIMD
units that are generally found on modern processors to allow
multiple instances of an operation to take place
simultaneously. On the Intel processor, four floating point
operations can be carried out simultaneously on each core
using SSE instructions. The Cell processor can also evaluate
four floating point operations in parallel on each SPU (eight
if fused multiply-add instructions are used). Further code
optimizations (such as loop unrolling, branch elimination,
and double buffering) that are generally needed for the Cell
processor were also implemented.
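The parallelization scheme, one contiguous slice of neurons per core with a barrier at the end of each time step, can be illustrated in Python. This sketches only the partitioning and synchronization, not the performance: Python threads serialize on the interpreter lock, whereas the paper's versions are C with POSIX threads and the IBM Cell SDK.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

N = 24 * 24                            # one Izhikevich neuron per pixel
V = np.full(N, -65.0)
u = 0.2 * V
I = np.full(N, 10.0)                   # assumed constant input currents
a, b, c, d = 0.02, 0.2, -55.0, 4.0     # Appendix I excitatory parameters

def update_slice(lo, hi):
    """One Euler step of eqs. (5)-(6) over the neuron slice [lo, hi)."""
    fired = V[lo:hi] >= 30
    V[lo:hi][fired] = c                # reset rule on spiking neurons
    u[lo:hi][fired] += d
    V[lo:hi] += 0.04 * V[lo:hi] ** 2 + 5 * V[lo:hi] + 140 - u[lo:hi] + I[lo:hi]
    u[lo:hi] += a * (b * V[lo:hi] - u[lo:hi])

n_workers = 4                          # stand-in for the processor cores
bounds = np.linspace(0, N, n_workers + 1, dtype=int)
with ThreadPoolExecutor(n_workers) as pool:
    for _ in range(100):               # 100 ms of simulation
        futures = [pool.submit(update_slice, lo, hi)
                   for lo, hi in zip(bounds[:-1], bounds[1:])]
        for f in futures:              # barrier: all slices finish a step
            f.result()                 # before the next step begins
```

The per-step barrier is the synchronization cost discussed in section V; each worker touches a disjoint slice of `V` and `u`, so no locking is needed within a step.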
V. RESULTS
Based on the model parameters utilized, the Izhikevich
and the Hodgkin Huxley models required at most 14 ms and
3.75 ms of simulation time respectively to produce output
spikes at the level two neurons. Thus input images were
presented as inputs for these simulation times for the
respective models. In both cases a membrane potential of
above 30 mV represented a neuron firing (and thus
recognition). The two models were tested with the 48
training images and were able to recognize all images
correctly.
Fig. 4 lists an additional set of test images (labeled 1 to
10) that were applied to the two networks. These include
some original and noisy versions of the first four training
images (‘A’, ‘B’, ‘C’, and ‘D’). The images were applied
sequentially to the inputs of both models. The membrane
potentials of the level two neurons corresponding to these
four images are shown in Fig. 5. Both models recognized all
of the images except for image 6. The spikes produced by
image 6 did not cross the 30 mV threshold needed for a
proper recognition.
The runtimes of the two models are presented in Table 1.
On the Pentium D platform, the runtimes for Matlab and
optimized parallel C implementations are reported. On the Cell
based Playstation 3 platform, the runtimes for serial
implementations on the PPU and parallel implementations
using all the cores are reported. As expected, in all cases, the
Izhikevich model required less time than the Hodgkin
Huxley model. Additionally, the C implementations were
significantly faster than the Matlab implementations.
Amongst the C implementations, the serial implementation
on the Cell PPU was slower than both the parallel C
versions.
In the case of the parallel implementations, the Cell processor
was faster than the Pentium D for the Hodgkin Huxley
model, but slower for the Izhikevich model. This disparity in
the performance of the Cell processor versus the Pentium D
is primarily due to the higher synchronization overheads in
the Cell processor compared to the Pentium D. The Cell
processor has six cores running and requires 0.55 ms for
synchronization, while the Pentium D with 2 cores requires
only 0.18 ms for synchronization for the Izhikevich model.
(1) (2) (3) (4) (5)
(6) (7) (8) (9) (10)
Fig. 4. Additional test images.
[Fig. 5(a): membrane potential traces, 0–140 ms, −100 to 150 mV]
[Fig. 5(b): membrane potential traces, 0–40 ms, −10 to 50 mV]
Fig. 5. Level two neuron membrane potentials for a serial presentation
of the images in Fig. 4 for (a) the Izhikevich model and (b) the Hodgkin
Huxley model. The red line represents the membrane potential of the
neuron for detecting an ‘A’, blue for ‘B’, green for ‘C’, and cyan for ‘D’.
Since the Izhikevich model for 24×24 pixel images requires
far fewer computations than the Hodgkin Huxley model, the
effect of the synchronization overhead dominates the Cell
run time and makes it slower than the Pentium D.
For a larger neural network, the synchronization
times will not change, thus making the Cell faster than the
Pentium D. To test this hypothesis, we scaled the networks
to utilize 192×192 pixel image inputs. As we did not have
input images to train this network, we utilized randomized
training matrices. This would not affect the results of the run
time though as a trained network would still have the same
number of computations taking place. Table 2 shows the
runtime for these networks. In this case it is seen that the
Cell processor is actually faster than the Pentium D for both
models. Fig. 6 shows that parallel implementations on multi-
core processors can provide significant performance gains
over single core processors for these image processing
models. The speedup is achieved due to thread and data level
parallelism from having multiple cores and vector operations
on each core respectively.
Table I
RUNTIMES FOR 24×24 PIXEL IMAGES

Platform          Language   Izhikevich (ms)   Hodgkin-Huxley (ms)
Pentium D 925     Matlab     109.7             1613.3
Pentium D 925     C          0.5               67.1
Cell – parallel   C          0.6               19.7
Cell – PPU        C          2.0               198.0
Table II
RUNTIMES FOR 192×192 PIXEL IMAGES WITH RANDOMLY WEIGHTED NETWORKS

Platform          Language   Izhikevich (ms)   Hodgkin-Huxley (ms)
Pentium D 925     Matlab     49120.0           2.11×10^6
Pentium D 925     C          31.9              1798.3
Cell – parallel   C          7.1               549.0
Cell – PPU        C          148.1             31231.0
[Fig. 6: bar chart of speedup (0–60) for Izhikevich 24×24, Izhikevich
192×192, Hodgkin Huxley 24×24, and Hodgkin Huxley 192×192;
series: Pentium, Cell Parallel]
Fig. 6. Speedup of parallel multicore implementations over a serial
implementation on the Cell PPU.
VI. CONCLUSIONS
In this paper we presented the use of the Izhikevich and
Hodgkin Huxley spiking neuron models for character
recognition. Two similar networks, one for each of the two
neuron models examined, were developed and trained to
recognize a set of 48 24×24 pixel character images. The
networks were tested with the training images and several
noisy versions of these images. All the training images and
most of the noisy images were correctly identified.
For the small network studied, the integrate and fire
neuron model would have sufficed. The Izhikevich
model however requires similar computational power as the
integrate and fire model, but is much more biologically
accurate. Most researchers examining brain scale cortex
models are utilizing the Izhikevich and Hodgkin Huxley
models. We are not aware of the use of the Izhikevich model
for image recognition (or any other type of inference). In this
paper, we demonstrated that the Izhikevich model can be a
good candidate for larger image recognition networks.
In this paper we implemented the two networks on two
high performance multicore processors: the Intel Pentium D
925 dual core processor and the Cell broadband engine with
nine processing cores. Such processors are likely to be
candidates for future large scale implementations of the
neuron models examined. We demonstrated that these
processors provide significant speedups for both of the
neural model based image recognition networks. To achieve
high performance on these processors, the networks had to
take advantage of the thread and data level parallelism
offered by these architectures. Further optimizations were
needed on the Cell processor. Our results indicate that the
Izhikevich model requires a shorter run time (as expected)
and that the Cell processor, with more cores, provides higher
speedups. The Cell platform utilized in this study had six
high performance cores available for use.
Future work in this area would be to scale up the image
recognition network to work with larger images. Changes to
the model would also be examined to recognize color and
more complex images. Finally we would examine the
clustered implementation of the model to enable much larger
networks to operate efficiently.
APPENDIX I
NEURON MODEL PARAMETERS
Hodgkin Huxley: gK=36 mho, gNa=120 mho, gL=0.3 mho,
EK=-12 mV, ENa=115 mV, EL=10.613 mV, V=-10 mV,
VK=0 mV, VNa=0 mV, VL=1 mV, time step=0.01 ms.
Izhikevich: Excitatory neurons: a=.02 , b= 0.2 , c= -55 , d=4;
Inhibitory neurons: a=0.06, b=0.22, c=-65, d=2, time step=
1 ms.
REFERENCES
[1] R. Ananthanarayanan and D. Modha, “Anatomy of a Cortical
Simulator,” Proceedings of the International Conference for
High Performance Computing, Networking, Storage and
Analysis (Supercomputing 2007), Reno, NV, November 2007.
[2] A. R. Baig, “Spatial-temporal artificial neurons applied to
online cursive handwritten character recognition,” in European
Symposium on Artificial Neural Networks, Bruges, Belgium,
April 2004, pp. 561–566.
[3] D. V. Buonomano and M. M. Merzenich, “A neural network
model of temporal code generation and position invariant
pattern recognition,” Neural Computation, vol. 11, pp. 103–
116, 1999.
[4] Y. Dan and M. Poo, “Spike time dependent plasticity of neural
circuits,” Neuron, vol. 44, pp. 23–30, 2004.
[5] A. Delorme and S. J. Thorpe, “SpikeNET: an event-driven
simulation package for modelling large networks of spiking
neurons,” Network: Computation in Neural Systems, 14(4), 613–
627, Nov. 2003.
[6] M. Djurfeldt, M. Lundqvist, C. Johansson, M. Rehn, O.
Ekeberg, and A. Lansner, “Brain-scale simulation of the
neocortex on the IBM Blue Gene/L supercomputer,” IBM
Journal of Research and Development, 52(1-2), 31–41, Jan.-
Mar. 2008.
[7] M. Gschwind, H. P. Hofstee, B. Flachs, M. Hopkins, Y.
Watanabe, T. Yamazaki, "Synergistic Processing in Cell’s
Multicore Architecture," IEEE Micro, 26(2), 10–24, Mar.
2006.
[8] A. Gupta, L. Long, “Character Recognition using Spiking
Neural Networks,” International Joint Conference on Neural
Networks, Aug. 2007.
[9] A. L. Hodgkin and A. F. Huxley, “A quantitative description
of membrane current and its application to conduction and
excitation in nerve,” Journal of Physiology, 117, 500–544,
1952.
[10] T. Ichishita, R. Fujii, “Performance evaluation of a temporal
sequence learning spiking neural network”, Proceedings of the
7th IEEE International Conference on Computer and
Information Technology, Oct. 2007.
[11] E. M. Izhikevich, “Simple Model of Spiking Neurons,” IEEE
Transactions on Neural Networks, vol. 14, no. 6, November,
2003, pp. 1569-1572.
[12] E.Izhikevich, “Which Model to Use for Cortical Spiking
Neurons?” IEEE Transactions on Neural Networks, 15(5),
1063-1070, 2004.
[13] E. Izhikevich and G. Edelman, "Large-Scale Model of
Mammalian Thalamocortical Systems," Proceedings of the
National Academy of Sciences, 105(9), 3593–3598, Mar.
2008.
[14] C. Johansson and A. Lansner, “Towards Cortex Sized
Artificial Neural Systems,” Neural Networks, 20(1), 48–61,
Jan. 2007.
[15] K-Team, Inc. Available online: http://www.k-team.com/
H. Markram, “The Blue Brain Project,” Nature Reviews
Neuroscience, 7, 153–160, 2006.
[17] C. Panchev and S. Wermter “Temporal sequence detection
with spiking neurons: towards recognizing robot language
instructions,” Connect. Sci., 18(1): 1-22, 2006.
[18] W. Rall, “Branching dendritic trees and motoneuron
membrane resistivity,” Experimental Neurology, 1, 503–532,
1959.
[19] J. Rickman, “Roadrunner supercomputer puts research at a
new scale,” Jun. 2008,
http://www.lanl.gov/news/index.php/fuseaction/home.story/st
ory_id/13602