2009 IEEE Symposium on Computational Intelligence for Multimedia Signal and Vision Processing
Character recognition with two spiking neural network models on
multicore architectures
Mohammad A. Bhuiyan, Rommel Jalasutram, and Tarek M. Taha
Abstract— This paper presents the use of the Izhikevich and
Hodgkin Huxley neuron models for image recognition. The
former is more biologically accurate than the commonly used
integrate and fire neuron model but has similar low
computational requirements. Brain scale cortex models tend to
use the more biological neuron models. The results of this work
show that the Izhikevich model can be used for image
recognition and would be a good candidate for a large scale
visual cortex model. Neural networks based on these models
are developed and applied to character recognition. They were
able to identify 48 24×24 pixel images and their noisy versions. The
networks were accelerated using modern multicore processors
and showed significant speedups. Such processors are likely to
be used for developing high performance, large scale
implementations of these image recognition networks.
I. INTRODUCTION
Spiking neural network models are the third generation of
neural networks and are considered to be one of the most
biologically accurate models. Several studies indicate that
pulse coding (as opposed to rate coding) may be used in
biological neurons in the cortex for high speed image
recognition [5]. Rate coding generally requires longer
processing times as a neuron would have to wait for several
spikes to arrive before firing. In pulse coding, the timing
order of the incoming spikes can encode information at a
faster rate. Both [5] and [8] have utilized pulse coding based
spiking neural networks for image recognition. Gupta and
Long developed a model for recognizing a set of 48 3×5 pixel
images of characters. They utilized the integrate and fire
spiking neuron model for recognition and spike timing
dependent plasticity (STDP) for training [4]. Thorpe utilized a similar
neuron model, but examined the recognition of much larger
images.
At present there is a strong interest in the research
community to model biological scale image recognition
systems such as the visual cortex of a rat or a human. For
such systems, it would be useful to examine the use of more
biologically accurate neuron models than the integrate and
fire model for image recognition. Izhikevich points out that
the commonly used integrate and fire model is one of the
least biologically accurate spiking neuron models. He states
that “the model cannot exhibit even the most fundamental
properties of cortical spiking neurons, and for this reason it
should be avoided by all means” [12]. Izhikevich compares a
set of 11 spiking neuron models [12] in terms of their
biological accuracy and computational load. He shows that
the five most biologically accurate models are (in order of
biological accuracy) the: 1) Hodgkin-Huxley, 2) Izhikevich,
3) Wilson, 4) Hindmarsh-Rose, and 5) Morris-Lecar models.
Of these the Hodgkin-Huxley model is the most compute
intensive, while the Izhikevich model is the most
computationally efficient.
Current studies of large scale models of the cortex are
generally using either the Izhikevich [1] or the Hodgkin
Huxley [16] spiking neuron models. One of the main
problems with using the Hodgkin Huxley model for large
scale implementations is that it is computationally intensive
– each neuron update requires about 1200 flops [12]. Using
the integrate and fire model (requires about 5~13 flops per
neuron update) can make large scale implementations more
computationally efficient (such as [5]). The recently
published Izhikevich model is attractive for large scale
implementations as it is close to the Hodgkin Huxley model
in biological accuracy, but is similar to the integrate and fire
model in computational intensity (13 flops per neuron
update).
In this paper, we develop a character recognition model
based on the two layer spiking neuron network model in [8].
However we utilize the more biologically accurate
Izhikevich and Hodgkin Huxley neuron models instead of
the integrate and fire model. We are not aware of any other
studies that examine the use of the Izhikevich model for
pattern recognition. The studies in [1] and [13] utilize the
Izhikevich model primarily to examine spiking behavior and
computational load (they do not examine the use of the
model for inference). We utilize pulse coding to mimic the
high speed recognition taking place in the mammalian brain
[5] and STDP for training [4]. The model is trained to
recognize a set of 48 24×24 pixel images of characters and
was able to recognize noisy versions of these images.
We also examine the performance of the two character
recognition models on recent multicore processors. With the
limited scaling in processor clock frequencies, multicore
processors have become the standard industrial approach to
improve processor performance. However we are not aware
of any studies examining the implementation or performance
This work was supported by an NSF CAREER Award and grants
from the US Air Force.
M. A. Bhuiyan is a Ph.D. student. Email: [email protected].
R. Jalasutram is an M.S. student. Email: [email protected].
T. M. Taha is an Assistant Professor. Email: [email protected].
Electrical and Computer Engineering Department, Clemson
University, Clemson, SC 29631, USA.
978-1-4244-2771-0/09/$25.00 ©2009 IEEE
of these neuron models on multicore processors. The two
architectures examined in this study are the Intel Pentium D
925 processor and the IBM/Sony/Toshiba Cell broadband
engine [7]. The Intel Pentium D 925 has two processing
cores running at 3 GHz while the Cell broadband engine has
nine processing cores running at 3.2 GHz. The latter
processor has attracted significant attention from the
computing community recently. The fastest supercomputer
at present, the IBM Roadrunner supercomputer [19] installed
at Los Alamos National Lab, utilizes 12,240 Cell processors
and 6,912 AMD Opteron processors.
Section II of this paper discusses background material
including a brief introduction to the two spiking network
models considered for this study and a discussion of related
work. Section III presents the character recognition model.
Sections IV and V describe the experimental setup and
results of the model, while section VI concludes the paper.
II. BACKGROUND
A. Hodgkin Huxley Model
The Hodgkin–Huxley model [9] is considered to be one of
the most biologically accurate spiking neuron models. It
consists of four differential equations (eq. 1-4) and a large
number of parameters. The differential equations describe
the neuron membrane potential, activation of Na and K
currents, and inactivation of Na currents. The model can
exhibit almost all types of neuronal behavior if its
parameters are properly tuned. This model is very important
to the study of neuronal behavior and dynamics as its
parameters are biophysically meaningful and measurable. A
time step of 0.01 ms was utilized to update the four
differential equations as this is the most commonly used
value. Fig. 1 shows the spikes produced with this model.
\frac{dV}{dt} = \frac{1}{C}\left[ I - g_K n^4 (V - E_K) - g_{Na} m^3 h (V - E_{Na}) - g_L (V - E_L) \right]   (1)

\frac{dn}{dt} = \left( n_\infty(V) - n \right) / \tau_n(V)   (2)

\frac{dm}{dt} = \left( m_\infty(V) - m \right) / \tau_m(V)   (3)

\frac{dh}{dt} = \left( h_\infty(V) - h \right) / \tau_h(V)   (4)
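As an illustrative sketch (not the paper's actual implementation), eqs. (1)-(4) can be integrated with forward Euler at the 0.01 ms time step using the Appendix I conductances and reversal potentials. The standard 1952 alpha/beta rate functions below are assumed, since the paper does not list them; the alpha/beta form is equivalent to the n∞/τ form of eqs. (2)-(4).

```python
import math

# Parameters from Appendix I (original 1952 convention: rest near 0 mV).
g_K, g_Na, g_L = 36.0, 120.0, 0.3        # conductances
E_K, E_Na, E_L = -12.0, 115.0, 10.613    # reversal potentials (mV)
C = 1.0                                  # membrane capacitance
dt = 0.01                                # time step (ms), as in the paper

# Standard HH rate functions; x_inf = alpha/(alpha+beta), tau = 1/(alpha+beta).
def alpha_n(V): return 0.01 * (10 - V) / (math.exp((10 - V) / 10) - 1)
def beta_n(V):  return 0.125 * math.exp(-V / 80)
def alpha_m(V): return 0.1 * (25 - V) / (math.exp((25 - V) / 10) - 1)
def beta_m(V):  return 4.0 * math.exp(-V / 18)
def alpha_h(V): return 0.07 * math.exp(-V / 20)
def beta_h(V):  return 1.0 / (math.exp((30 - V) / 10) + 1)

def hh_step(V, n, m, h, I):
    """One forward-Euler update of eqs. (1)-(4)."""
    dV = (I - g_K * n**4 * (V - E_K)
            - g_Na * m**3 * h * (V - E_Na)
            - g_L * (V - E_L)) / C
    dn = alpha_n(V) * (1 - n) - beta_n(V) * n
    dm = alpha_m(V) * (1 - m) - beta_m(V) * m
    dh = alpha_h(V) * (1 - h) - beta_h(V) * h
    return V + dt * dV, n + dt * dn, m + dt * dm, h + dt * dh

# Simulate 20 ms with a constant input current and count threshold crossings.
V, n, m, h = 0.0, 0.32, 0.05, 0.6        # approximate resting state
spikes, above = 0, False
for _ in range(int(20 / dt)):
    V, n, m, h = hh_step(V, n, m, h, I=10.0)
    if V > 30 and not above:
        spikes += 1
    above = V > 30
print(spikes)
```

A constant suprathreshold current such as I = 10 produces repetitive firing in this parameter regime, which is the behavior plotted in Fig. 1.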
B. Izhikevich Model
Izhikevich proposed a new spiking neuron model in 2003
[11] that is based on only two differential equations (eq. 5-6)
and four parameters. The model requires significantly fewer
computations than the Hodgkin Huxley model (13 flops as
opposed to 1200 flops per neuron update), but can still
reproduce almost all types of neuron responses that are seen
in biological experiments. A time step of 1 ms was utilized
(as was done by Izhikevich in [11]). Fig. 2 shows the spikes
produced with this model.
\frac{dV}{dt} = 0.04V^2 + 5V + 140 - u + I   (5)

\frac{du}{dt} = a(bV - u)   (6)

if V \geq 30 mV, then V \leftarrow c, u \leftarrow u + d
Fig. 2. Spikes produced with the Izhikevich model.
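For illustration, a minimal forward-Euler simulation of eqs. (5)-(6) with the reset rule, using the 1 ms time step and the excitatory parameters from Appendix I, can be written as follows (a sketch; the input current value is assumed, not taken from the paper):

```python
# Excitatory-neuron parameters from Appendix I.
a, b, c, d = 0.02, 0.2, -55.0, 4.0
dt = 1.0                              # 1 ms time step, as in [11]
I = 10.0                              # assumed constant input current

V, u = -65.0, b * -65.0               # typical initial state
spikes = 0
for _ in range(1000):                 # simulate 1000 ms
    if V >= 30:                       # spike: apply the reset rule
        V, u = c, u + d
        spikes += 1
    V += dt * (0.04 * V * V + 5 * V + 140 - u + I)   # eq. (5)
    u += dt * a * (b * V - u)                         # eq. (6)
print(spikes)
```

Each loop iteration costs on the order of the 13 flops quoted above, which is what makes this model attractive for large scale simulation.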
C. Related work
Several groups have studied image recognition using
spiking neural networks. In general, these studies utilize the
integrate and fire model. Thorpe developed SPIKENET [5],
a large spiking neural network simulation software. The
system can be used for several image recognition
applications including identification of faces, fingerprints,
and video images. Johansson and Lansner developed a large
cluster based spiking network simulator of a rodent sized
cortex [14]. They tested a small scale version of the system
to identify 128×128 pixel images. Baig [2] developed a
temporal spiking network model based on integrate and fire
neurons and applied them to identify online cursive
handwritten characters. Gupta and Long [8] investigated the
application of spiking networks for the recognition of simple
characters. Other applications of spiking networks include
instructing robots in navigation and grasping tasks [17],
recognizing temporal sequences [3][10], and the robotic
modeling of mouse whiskers [15].
At present several groups are developing biological scale
implementations of spiking networks, but are generally not
examining the applications of these systems (primarily as
they are modeling large scale neuronal dynamics seen in the
brain).

Fig. 1. Spikes produced with the Hodgkin Huxley model.

The Swiss institution EPFL and IBM are developing
a highly biologically accurate brain simulation [16] at the
subneuron level. They have utilized the Hodgkin Huxley and
the Wilfred Rall [18] models to simulate up to 100,000
neurons on an IBM BlueGene/L supercomputer. At the IBM
Almaden Research Center, Ananthanarayanan and Modha
[1] utilized the Izhikevich spiking neuron models to simulate
55 million randomly connected neurons (equivalent to a rat-
scale cortical model) on a 32,768 processor IBM
BlueGene/L supercomputer. Johansson et al. simulated a
randomly connected model of 22 million neurons and 11
billion synapses using an 8,192 processor IBM
supercomputer [6].
III. NETWORK DESIGN
A two layer network structure was utilized in this study
with the first layer acting as input neurons and the second
layer as output neurons. Input images were presented to the first
layer of neurons, with each image pixel corresponding to a
separate input neuron. Thus the number of neurons in the
first layer was equal to the number of pixels in the input
image. Binary inputs were utilized in this study. If a pixel
was “on” a constant current was supplied to the input, while
no current was supplied if a pixel was “off.” The number of
output neurons was equal to the number of training images.
Each input neuron was connected to all the output neurons.
In the training process, images from the training set are
presented sequentially to the input neurons. The weight
matrix of each output neuron is updated using the STDP rule
each cycle. In the testing phase, an input image is presented
to the input neurons and after a certain number of cycles,
one output neuron fires, thus identifying the input image.
Two versions of the model were developed where the main
difference was the equations utilized to update the potential
of the neurons. In one case, the Hodgkin Huxley equations
presented in section II.A were used, while in the second
case, the Izhikevich equations in section II.B were used. The
parameters utilized in each case are specified in Appendix I.
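The two-layer structure above can be sketched as follows. This is a simplified illustration with assumed values: the "on"-pixel current I_ON, the learning rate, and the exact STDP window are not specified at this level of detail, so the pair-based update below stands in for the paper's STDP rule.

```python
import numpy as np

N_PIX = 24 * 24          # one input neuron per pixel
N_OUT = 48               # one output neuron per training image
I_ON = 20.0              # constant current for an "on" pixel (assumed)

rng = np.random.default_rng(0)
W = rng.uniform(0, 1, (N_OUT, N_PIX))   # input-to-output weight matrix

def output_currents(image, W):
    """Binary pixels gate a constant current into each output neuron."""
    return W @ (image.ravel() * I_ON)

def stdp_update(W, pre_spikes, post_spikes, lr=0.01):
    """Pair-based STDP sketch: when an output neuron spikes, potentiate
    its weights from inputs that spiked and depress the rest."""
    post = post_spikes.astype(float)[:, None]      # (N_OUT, 1)
    pre = pre_spikes.astype(float)[None, :]        # (1, N_PIX)
    W += lr * post * (pre - 0.5)                   # +/- depending on pre
    return np.clip(W, 0.0, 1.0)

# Testing phase: the output neuron driven by the largest current
# (and hence firing first) identifies the image.
image = rng.integers(0, 2, (24, 24))
winner = int(np.argmax(output_currents(image, W)))
```

In the full models, `output_currents` would feed the Hodgkin Huxley or Izhikevich membrane equations of section II rather than a direct argmax; the argmax here is a shorthand for "first output neuron to fire."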
IV. EXPERIMENTAL SETUP
The same set of training images was utilized for both the
Hodgkin Huxley and the Izhikevich versions of the model.
There were 48 training images, each 24×24 pixels in size. The
images represented the 26 upper case letters (A-Z), 10
numerals (0-9), 8 Greek letters, and 4 symbols. Fig. 3 shows
the training images used. The two models were initially
developed and tested in MATLAB.
Large scale high performance versions of the models
would likely utilize optimized C implementations on clusters
of modern multicore processors. Therefore we evaluated the
runtime performance of parallelized C implementations on
two high performance multicore architectures: an Intel
Pentium D 925 processor and the Cell Broadband Engine.
The Pentium D 925 processor runs at 3 GHz and has two
processing cores. A Sony Playstation 3 platform was utilized
for the Cell processor. This processor contains one
administrative core (PPU) and eight high performance
processing cores (SPUs). On the Playstation 3 platform, only
six of the eight SPUs are available for use. Both the Intel and
Cell processing platforms were running Fedora Core.
Fig. 3. Training images utilized. There are a total of 48 24×24 pixel images.
The two most important optimizations needed for high
performance implementations on these multicore
architectures are the parallelization and vectorization of the
models. The models were compiled using gcc on both
platforms. The models were parallelized by distributing the
neurons of each level across a set of threads that would run
in parallel on separate processing cores. The Pentium D
platform utilized POSIX threads for parallelization and the
Cell platform utilized the IBM Cell SDK 2.1 library. The
models were also vectorized to take advantage of the SIMD
units that are generally found on modern processors to allow
multiple instances of an operation to take place
simultaneously. On the Intel processor, four floating point
operations can be carried out simultaneously on each core
using SSE instructions. The Cell processor can also evaluate
four floating point operations in parallel on each SPU (eight
if fused multiply-add instructions are used). Further code
optimizations (such as loop unrolling, branch elimination,
and double buffering) that are generally needed for the Cell
processor were also implemented.
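The parallelization scheme, one contiguous slice of neurons per core with a barrier at the end of each time step, can be illustrated in Python. This sketches only the partitioning and synchronization, not the performance: Python threads serialize on the interpreter lock, whereas the paper's versions are C with POSIX threads and the IBM Cell SDK.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

N = 24 * 24                            # one Izhikevich neuron per pixel
V = np.full(N, -65.0)
u = 0.2 * V
I = np.full(N, 10.0)                   # assumed constant input currents
a, b, c, d = 0.02, 0.2, -55.0, 4.0     # Appendix I excitatory parameters

def update_slice(lo, hi):
    """One Euler step of eqs. (5)-(6) over the neuron slice [lo, hi)."""
    fired = V[lo:hi] >= 30
    V[lo:hi][fired] = c                # reset rule on spiking neurons
    u[lo:hi][fired] += d
    V[lo:hi] += 0.04 * V[lo:hi] ** 2 + 5 * V[lo:hi] + 140 - u[lo:hi] + I[lo:hi]
    u[lo:hi] += a * (b * V[lo:hi] - u[lo:hi])

n_workers = 4                          # stand-in for the processor cores
bounds = np.linspace(0, N, n_workers + 1, dtype=int)
with ThreadPoolExecutor(n_workers) as pool:
    for _ in range(100):               # 100 ms of simulation
        futures = [pool.submit(update_slice, lo, hi)
                   for lo, hi in zip(bounds[:-1], bounds[1:])]
        for f in futures:              # barrier: all slices finish a step
            f.result()                 # before the next step begins
```

The per-step barrier is the synchronization cost discussed in section V; each worker touches a disjoint slice of `V` and `u`, so no locking is needed within a step.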
V. RESULTS
Based on the model parameters utilized, the Izhikevich
and the Hodgkin Huxley models required at most 14 ms and
3.75 ms of simulation time respectively to produce output
spikes at the level two neurons. Thus input images were
presented as inputs for these simulation times for the
respective models. In both cases a membrane potential of
above 30 mV represented a neuron firing (and thus
recognition). The two models were tested with the 48
training images and were able to recognize all images
correctly.
Fig. 4 lists an additional set of test images (labeled 1 to
10) that were applied to the two networks. These include
some original and noisy versions of the first four training
images (‘A’, ‘B’, ‘C’, and ‘D’). The images were applied
sequentially to the inputs of both models. The membrane
potentials of the level two neurons corresponding to these
four images are shown in Fig. 5. Both models recognized all
of the images except for image 6. The spikes produced by
image 6 did not cross the 30 mV threshold needed for a
proper recognition.
The runtimes of the two models are presented in Table 1.
On the Pentium D platform, the runtimes for Matlab and
optimized parallel C implementations are reported. On the Cell
based Playstation 3 platform, the runtimes for serial
implementations on the PPU and parallel implementations
using all the cores are reported. As expected, in all cases, the
Izhikevich model required less time than the Hodgkin
Huxley model. Additionally, the C implementations were
significantly faster than the Matlab implementations.
Amongst the C implementations, the serial implementation
on the Cell PPU was slower than both the parallel C
versions.
In the case of the parallel implementations, the Cell processor
was faster than the Pentium D for the Hodgkin Huxley
model, but slower for the Izhikevich model. This disparity in
the performance of the Cell processor versus the Pentium D
is primarily due to the higher synchronization overheads in
the Cell processor compared to the Pentium D. The Cell
processor has six cores running and requires 0.55 ms for
synchronization, while the Pentium D with 2 cores requires
only 0.18 ms for synchronization for the Izhikevich model.
(1) (2) (3) (4) (5)
(6) (7) (8) (9) (10)
Fig. 4. Additional test images.
[Fig. 5(a): membrane potential traces, 0–140 ms, −100 to 150 mV]
[Fig. 5(b): membrane potential traces, 0–40 ms, −10 to 50 mV]
Fig. 5. Level two neuron membrane potentials for a serial presentation
of the images in Fig. 4 for (a) the Izhikevich model and (b) the Hodgkin
Huxley model. The red line represents the membrane potential of the
neuron for detecting an ‘A’, blue for ‘B’, green for ‘C’, and cyan for ‘D’.
Since the Izhikevich model for 24×24 pixel images requires
far fewer computations than the Hodgkin Huxley model, the
effect of the synchronization overhead dominates the Cell
run time and makes it slower than the Pentium D.
For a larger neural network, the synchronization
times will not change, thus making the Cell faster than the
Pentium D. To test this hypothesis, we scaled the networks
to utilize 192×192 pixel image inputs. As we did not have
input images to train this network, we utilized randomized
training matrices. This would not affect the results of the run
time though as a trained network would still have the same
number of computations taking place. Table 2 shows the
runtime for these networks. In this case it is seen that the
Cell processor is actually faster than the Pentium D for both
models. Fig. 6 shows that parallel implementations on multi-
core processors can provide significant performance gains
over single core processors for these image processing
models. The speedup is achieved due to thread and data level
parallelism from having multiple cores and vector operations
on each core respectively.
Table I
RUNTIMES FOR 24×24 PIXEL IMAGES

Platform          Language   Izhikevich (ms)   Hodgkin-Huxley (ms)
Pentium D 925     Matlab     109.7             1613.3
Pentium D 925     C          0.5               67.1
Cell – parallel   C          0.6               19.7
Cell – PPU        C          2.0               198.0
Table II
RUNTIMES FOR 192×192 PIXEL IMAGES WITH RANDOMLY WEIGHTED NETWORKS

Platform          Language   Izhikevich (ms)   Hodgkin-Huxley (ms)
Pentium D 925     Matlab     49120.0           2.11×10^6
Pentium D 925     C          31.9              1798.3
Cell – parallel   C          7.1               549.0
Cell – PPU        C          148.1             31231.0
[Fig. 6: bar chart of speedup (0–60) for Izhikevich 24×24, Izhikevich
192×192, Hodgkin Huxley 24×24, and Hodgkin Huxley 192×192;
series: Pentium, Cell Parallel]
Fig. 6. Speedup of parallel multicore implementations over a serial
implementation on the Cell PPU.
VI. CONCLUSIONS
In this paper we presented the use of the Izhikevich and
Hodgkin Huxley spiking neuron models for character
recognition. Two similar networks, one for each of the two
neuron models examined, were developed and trained to
recognize a set of 48 24×24 pixel character images. The
networks were tested with the training images and several
noisy versions of these images. All the training images and
most of the noisy images were correctly identified.
For the small network studied, the integrate and fire
neuron model would have sufficed. The Izhikevich
model however requires similar computational power as the
integrate and fire model, but is much more biologically
accurate. Most researchers examining brain scale cortex
models are utilizing the Izhikevich and Hodgkin Huxley
models. We are not aware of the use of the Izhikevich model
for image recognition (or any other type of inference). In this
paper, we demonstrated that the Izhikevich model can be a
good candidate for larger image recognition networks.
In this paper we implemented the two networks on two
high performance multicore processors: the Intel Pentium D
925 dual core processor and the Cell broadband engine with
nine processing cores. Such processors are likely to be
candidates for future large scale implementations of the
neuron models examined. We demonstrated that these
processors provide significant speedups for both of the
neural model based image recognition networks. To achieve
high performance on these processors, the networks had to
take advantage of the thread and data level parallelism
offered by these architectures. Further optimizations were
needed on the Cell processor. Our results indicate that the
Izhikevich model requires a shorter run time (as expected)
and that the Cell processor, with more cores, provides higher
speedups. The Cell platform utilized in this study had six
high performance cores available for use.
Future work in this area would be to scale up the image
recognition network to work with larger images. Changes to
the model would also be examined to recognize color and
more complex images. Finally we would examine the
clustered implementation of the model to enable much larger
networks to operate efficiently.
APPENDIX I
NEURON MODEL PARAMETERS
Hodgkin Huxley: gK=36 mho, gNa=120 mho, gL=0.3 mho,
EK=-12 mV, ENa=115 mV, EL=10.613 mV, V=-10 mV,
VK=0 mV, VNa=0 mV, VL=1 mV, time step=0.01 ms.
Izhikevich: Excitatory neurons: a=.02 , b= 0.2 , c= -55 , d=4;
Inhibitory neurons: a=0.06, b=0.22, c=-65, d=2, time step=
1 ms.
REFERENCES
[1] R. Ananthanarayanan and D. Modha, “Anatomy of a Cortical
Simulator,” Proceedings of the International Conference for
High Performance Computing, Networking, Storage and
Analysis (Supercomputing 2007), Reno, NV, November 2007.
[2] A. R. Baig, “Spatial-temporal artificial neurons applied to
online cursive handwritten character recognition,” in European
Symposium on Artificial Neural Networks, Bruges, Belgium,
April 2004, pp. 561–566.
[3] D. V. Buonomano and M. M. Merzenich, “A neural network
model of temporal code generation and position invariant
pattern recognition,” Neural Computation, vol. 11, pp. 103–
116, 1999.
[4] Y. Dan and M. Poo, “Spike time dependent plasticity of neural
circuits,” Neuron, vol. 44, pp. 23–30, 2004.
[5] A. Delorme and S. J. Thorpe, “SpikeNET: an event-driven
simulation package for modelling large networks of spiking
neurons,” Network: Computation in Neural Systems, 14(4), 613–
627, Nov. 2003.
[6] M. Djurfeldt, M. Lundqvist, C. Johansson, M. Rehn, O.
Ekeberg, and A. Lansner, “Brain-scale simulation of the
neocortex on the IBM Blue Gene/L supercomputer,” IBM
Journal of Research and Development, 52(1-2), 31–41, Jan.-
Mar. 2008.
[7] M. Gschwind, H. P. Hofstee, B. Flachs, M. Hopkins, Y.
Watanabe, T. Yamazaki, "Synergistic Processing in Cell’s
Multicore Architecture," IEEE Micro, 26(2), 10–24, Mar.
2006.
[8] A. Gupta, L. Long, “Character Recognition using Spiking
Neural Networks,” International Joint Conference on Neural
Networks, Aug. 2007.
[9] A. L. Hodgkin and A. F. Huxley, “A quantitative description
of membrane current and its application to conduction and
excitation in nerve,” Journal of Physiology, 117, 500–544,
1952.
[10] T. Ichishita, R. Fujii, “Performance evaluation of a temporal
sequence learning spiking neural network”, Proceedings of the
7th IEEE International Conference on Computer and
Information Technology, Oct. 2007.
[11] E. M. Izhikevich, “Simple Model of Spiking Neurons,” IEEE
Transactions on Neural Networks, vol. 14, no. 6, November,
2003, pp. 1569-1572.
[12] E.Izhikevich, “Which Model to Use for Cortical Spiking
Neurons?” IEEE Transactions on Neural Networks, 15(5),
1063-1070, 2004.
[13] E. Izhikevich and G. Edelman, "Large-Scale Model of
Mammalian Thalamocortical Systems," Proceedings of the
National Academy of Sciences, 105(9), 3593–3598, Mar.
2008.
[14] C. Johansson and A. Lansner, “Towards Cortex Sized
Artificial Neural Systems,” Neural Networks, 20(1), 48–61,
Jan. 2007.
[15] K-Team, Inc. Available online: http://www.k-team.com/
H. Markram, “The Blue Brain Project,” Nature Reviews
Neuroscience, 7, 153–160, 2006.
[17] C. Panchev and S. Wermter “Temporal sequence detection
with spiking neurons: towards recognizing robot language
instructions,” Connect. Sci., 18(1): 1-22, 2006.
[18] W. Rall, “Branching dendritic trees and motoneuron
membrane resistivity,” Experimental Neurology, 1, 503–532,
1959.
[19] J. Rickman, “Roadrunner supercomputer puts research at a
new scale,” Jun. 2008,
http://www.lanl.gov/news/index.php/fuseaction/home.story/st
ory_id/13602