development of a dynamically extendable spinnaker chip ... · development of a dynamically...

8
Development of a Dynamically Extendable SpiNNaker Chip Computing Module Rui Ara´ ujo, Nicolai Waniek, and J¨ org Conradt Technische Universit¨at M¨ unchen, Neuroscientific System Theory, Karlstraße 45, 80333 M¨ unchen, Germany {rui.araujo, nicolai.waniek, conradt}@tum.de http://www.nst.ei.tum.de Abstract. The SpiNNaker neural computing project has created a hard- ware architecture capable of scaling up to a system with more than a million embedded cores, in order to simulate more than one billion spiking neurons in biological real time. The heart of this system is the SpiNNaker chip, a multi-processor System-on-Chip with a high level of interconnectivity between its processing units. Here we present a Dy- namically Extendable SpiNNaker Chip Computing Module that allows a SpiNNaker machine to be deployed on small mobile robots. A non- neural application, the simulation of the movement of a flock of birds, was developed to demonstrate the general purpose capabilities of this new platform. The developed SpiNNaker machine allows the simulation of up to one million spiking neurons in real time with a single SpiNNaker chip and is scalable up to 256 computing nodes in its current state. Keywords: Mobile robotics, parallel computing module, spiking neural networks simulations, SpiNNaker. 1 Introduction Abstract processes carried out in the biological brain are still one of the great challenges for computational neuroscience. Despite an increasing amount of ex- perimental data and knowledge of individual components, we only have an in- sufficient understanding of the operations of intermediate levels. However, those are believed to be the key elements in constructing thoughts and in processing information [5]. Large-scale neural network simulations will help increasing our knowledge about these intermediate levels. However, the huge amount of neurons in the human brain [6] and their connectivities [2] are difficult if not impossible to simulate. General purpose digital hardware is ill suited to perform these simu- lations, as they are unable to cope with the massive parallelism carried out in the brain. A possible approach is the usage of neuromorphic systems such as the BrainScales [7] or the Neuro-Grid hardware [3] which emulate the neural network with a physical implementation of the individuals neurons. S. Wermter et al. (Eds.): ICANN 2014, LNCS 8681, pp. 821–828, 2014. c Springer International Publishing Switzerland 2014

Upload: others

Post on 22-Sep-2019

21 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

Development of a Dynamically Extendable

SpiNNaker Chip Computing Module

Rui Araujo, Nicolai Waniek, and Jorg Conradt

Technische Universitat Munchen, Neuroscientific System Theory,Karlstraße 45, 80333 Munchen, Germany

{rui.araujo, nicolai.waniek, conradt}@tum.dehttp://www.nst.ei.tum.de

Abstract. The SpiNNaker neural computing project has created a hard-ware architecture capable of scaling up to a system with more thana million embedded cores, in order to simulate more than one billionspiking neurons in biological real time. The heart of this system is theSpiNNaker chip, a multi-processor System-on-Chip with a high level ofinterconnectivity between its processing units. Here we present a Dy-namically Extendable SpiNNaker Chip Computing Module that allowsa SpiNNaker machine to be deployed on small mobile robots. A non-neural application, the simulation of the movement of a flock of birds,was developed to demonstrate the general purpose capabilities of thisnew platform. The developed SpiNNaker machine allows the simulationof up to one million spiking neurons in real time with a single SpiNNakerchip and is scalable up to 256 computing nodes in its current state.

Keywords: Mobile robotics, parallel computing module, spiking neuralnetworks simulations, SpiNNaker.

1 Introduction

Abstract processes carried out in the biological brain are still one of the greatchallenges for computational neuroscience. Despite an increasing amount of ex-perimental data and knowledge of individual components, we only have an in-sufficient understanding of the operations of intermediate levels. However, thoseare believed to be the key elements in constructing thoughts and in processinginformation [5].

Large-scale neural network simulations will help increasing our knowledgeabout these intermediate levels. However, the huge amount of neurons in thehuman brain [6] and their connectivities [2] are difficult if not impossible tosimulate. General purpose digital hardware is ill suited to perform these simu-lations, as they are unable to cope with the massive parallelism carried out inthe brain. A possible approach is the usage of neuromorphic systems such asthe BrainScales [7] or the Neuro-Grid hardware [3] which emulate the neuralnetwork with a physical implementation of the individuals neurons.

S. Wermter et al. (Eds.): ICANN 2014, LNCS 8681, pp. 821–828, 2014.c© Springer International Publishing Switzerland 2014

Page 2: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

822 R. Araujo, N. Waniek, and J. Conradt

Another possible approach is the SpiNNaker system, which is a massively-parallel computer architecture based on a multi-processor System-on-Chip (MP-SoC) technology that can scale up to a million cores. It is capable of simulatingup to a billion spiking neurons in biological real time with realistic levels ofinterconnectivity between the neurons.

The SpiNNaker system was motivated by the attempt to understand andstudy biological computing structures. Their high level of parallelism with fru-gal amounts of energy as opposed to traditional electronics designs, which weremostly driven by serial throughput just until recently. The biological approachto the design of such many-cores architecture also brings new concerns in termsof fault-tolerance computation and efficiency. The SpiNNaker chip, the basicbuilding block of a SpiNNaker machine, relies on smaller processors than othermachines but in larger numbers, it has 18 highly efficient embedded ARM pro-cessors that allow the SpiNNaker system to be competitive according to twometrics, MIPS/mm2 and MIPS/W.

Fig. 1. The SpiNNaker Chip Computing Module Dataflow Architecture

The currently available machines with SpiNNaker chips are relatively large,the minimum size at this moment is 105× 95mm, which limits their deploymenton systems with limited size as for example, small mobile robots, specially flyingones due to very strict weight and space constraints. Additionally the SpiNNakersystems currently require a workstation, usually a desktop or a laptop, connectedthrough an Ethernet connection to bootstrap the system every time it powers

Page 3: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

Development of a SpiNNaker Chip Computing Module 823

on and to feed the processing data into the system. This requirement seriouslylimits the independence and deployment capabilities of systems with embeddedSpiNNaker chips. At present, it is necessary to add an wireless router in order tohave a mobile system with a SpiNNaker machine [4]. The drawbacks from thisapproach are fairly obvious, such as increased power consumption and spacerequirements since the typical wireless router consumes around 4 to 5 Watt andeven though there are fairly small models available at the market it would stilltake up some space.

Furthermore the amount of extensibility provided by standard SpiNNaker ma-chines is very limited since it only allows increases of computing power in fixedamounts. The current single board SpiNNaker machines are available in two ver-sions, one with four chips and another with forty eight. These are wildly differentamounts of processing capability which make it difficult to create intermediatesolutions. It would be interesting to have the capability of selecting how manySpiNNaker chips one needs to deploy without having to design new hardware.

The current requirements of the SpiNNaker architecture are not suitable for alot of applications where its processing power and capabilities would be helpful.It is then necessary to design a new solution that can overcome the limitationsof the present options.

2 SpiNNaker Chip Computing Module

Traditional SpiNNaker systems rely on the host system to provide the input data.This limits the possibilities of interfacing the SpiNNaker chip with the externalenvironment. In order to overcome this restriction, we added a microcontrollerwhich is responsible for communicating with the outside world. Figure 1 presentsthe general architecture for the developed solution which involves: 1) the designof a PCB with a single SpiNNaker chip and a microcontroller, 2) the softwareto drive the microcontroller, and 3) an application that runs on the host systemto communicate with the board. We strived to keep the hardware as simple andsmall as possible to achieve one of our main goals: portability. Therefore, oursolution consists only of the necessary components.

Several requirements determined the usage of our system. For instance, back-wards compatibility with the standard tools used for the bigger SpiNNaker ma-chines, ybug and tubotron, was an important aspect that lead to the creationof the host system application. Having backwards compatibility allows usersof our novel system to simply apply their already established know-how aboutSpiNNaker and its tools.

Adding the microcontroller had various benefits. We could replace the Eth-ernet connection with a simpler, albeit slower, universal asynchronous receiver/transmitter (UART) at a baud rate of 12 MBps, which roughly translates into1.2 Megabytes/s each way. Additionally, the microcontroller can boot the SpiN-Naker chip without a host system as it is capable of storing the previouslytransmitted image. The microcontroller and the application in the workstationact jointly to pretend they are another SpiNNaker chip which is connected to

Page 4: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

824 R. Araujo, N. Waniek, and J. Conradt

Microcontroller

SpiNNaker ChipPrototype Debug I/OSpiNNLink Port (optional)

Fig. 2. The SpiNNaker Chip Computing Module

the Ethernet, having their own point to point (P2P) address and position in thegrid. SpiNNaker machines are a two dimensional grid of nodes, where a node isa SpiNNaker chip, each with its own P2P address, a 16 bit number.

2.1 Evaluation

We took measurements to characterize our computing module. The main ob-jective was to determine the input and output capacity in terms of SpiNNakerpackets since higher level protocols rely on these. Both cores ran at 180 MHzduring all tests. The methodology used for these procedures was as follows:

– change the source code to set and clear a GPIO pin around the action to bemeasured.

– use the switch present in the prototype board to trigger the transmission ofa packet.

– using an oscilloscope, measure the time taken between the changes in stateof the previously selected GPIO pin.

The results are compiled in Table 1. The first two lines represent the timetaken to transmit a packet, including the calculation of the parity bit and con-sequential addition to the header, while the following two lines represent thetime taken only during the symbol transmission. There is no such difference forthe input side since it is only reading symbols and placing them at the queue inthe correct position.

It is possible to calculate the symbol transmission capacity with these results.Each symbol takes around 90 ns to be transmitted, which translates into a 11MSymbol transmission rate. The input capacity, around 43 % of the transmissionrate, lies at 4.78 MSymbols.

These numbers are fairly small when compared to the capacity on the SpiN-Naker side, which is around 62.4 MSymbols in both directions. This large dif-ference is explained by the fact the microcontroller implementation is entirely

Page 5: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

Development of a SpiNNaker Chip Computing Module 825

software based as opposed to the SpiNNaker which is implemented in dedi-cated hardware. Furthermore, the difference between the transmission and thereception rate lies in the increased complexity when receiving symbols. Whereasreceiving includes a step to identify symbols, no such step is required when trans-mitting packets. Another analysis showed that the transmission and receptioncapacity exceeds the UART’s 12 MBps by far. This means that this communi-cation channel is currently the bottleneck.

The packet transmission rate resides at 455K packets per second for packetswith payload and 770K packets per second for packets without further payload.As for the packet reception rate, 435K packets per second can be received if theydo not include a payload, whereas the rate for packets with payload is around238K packets per second.

Table 1. Performance Measurements for the transmission and reception of SpiNNakerpackets

Action Time taken (µs)

Packet Transmission with payload 2.20Packet Transmission without payload 1.30Symbol transmission for packet with payload 1.80Symbol transmission for packet without payload 1.00Packet reception with payload 4.20Packet reception without payload 2.30

3 Boids Model

We developed a case study for the SpiNNaker Chip Computing Module todemonstrate possible applications. SpiNNaker was obviously designed to sim-ulate spiking neural networks. however, we opted for a more general exampleand simulated the aggregate motion of a flock of birds while using distributedrules for each bird. The implemented model for this simulation is traditionallynamed the Boids model.

The Boids model is a distributed behavioral model [8] for flocks of flying birdsor fish schools. It shares many characteristics with particle systems – large setsof individual items, each with its own behavior – but has several crucial differ-ences. Traditional particle systems are usually employed to model fire, smoke, orwater. Each particle is created, ages, and finally dies, and is generally denotedby a point-like structure. In a boids simulation however, individual boids have ageometrical shape and, consequently, describe an orientation. Additionally, typ-ical particles do not interact between themselves as opposed to e.g. a bird whichmust do so in order to flock appropriately.

Boids models are often referred to as a prime example of Artificial Life [1].They illustrate a variety of principles which appear in natural systems, such asemergence where complex behavior comes from the local interaction of simple

Page 6: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

826 R. Araujo, N. Waniek, and J. Conradt

rules. Another example is unpredictability. Although a bird usually does notbehave chaotically in a temporally local context, it is difficult or near impossibleto predict its behavior on a larger time scale.

A natural flock has certain behavioral rules that allows it to exist and survive.For instance, birds tend to avoid collisions with other birds but still stay close tothe flock in order to protect the whole flock against predators. We impose suchpossibly contradictory rules on our simulated boids to induce seemingly naturalflocking behaviors. The basic rules are:

– Collision Avoidance: avoid collisions with nearby flock members

– Velocity Matching: attempt to match velocity with nearby flock members

– Flock Centering: attempt to stay close to nearby flock members

Each behavior rule produces an acceleration which is a contribution to atunable weighted average. The relative strength of each rule will dictate thegeneral behavior of the flock. For example, if the flock centering behavior hasa very low impact then the flock will be very sparse while it will still follow acommon direction.

Since the model attempts to simulate the movement of birds, it must be basedon a semi-realistic model of flight. It does not need to take in considerationall physical forces like aerodynamic drag or even gravity but it must limit thevelocity and instantaneous accelerations to realistic values. These restrictionshelp modeling creatures with finite amounts of energy.

Fig. 3. A frame of the Boids Visualizer with 2176 birds

3.1 Evaluation

We developed two distinct versions of the simulation. One incorporates the capa-bilities of our SpiNNaker Chip Computing Module to simulate the boids modelwhich is described above, the other does not and executes the simulation ona standard desktop computer. Both implementations use a desktop computer

Page 7: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

Development of a SpiNNaker Chip Computing Module 827

to visualize the simulation results. The differences between the two simula-tions can be directly inspected with this approach. A screenshot of the simu-lation is shown in Figure 3 and video of the simulation can be found online athttps://www.youtube.com/watch?v=KiyhVRgxugY. The method used for theevaluation is as follows:

– select a number of birds for the simulation.– run the simulation on the SpiNNaker Chip Computing Module, and record

the frames per second (FPS) of the visualization.– run the simulation on the computer and, again, record the frames per second.

The results of this benchmark are compiled in Table 2. The central processingunit (CPU) of the computer used for running this simulation was an AMDPhenom II X4 945 running at 3 GHz.

The O(N2) complexity of the algorithm is clearly visible in the results. Theincrease of the number of birds leads to an ever increasing reduction of the framerate. The version that uses the SpiNNaker Chip Computing Module also suffersfrom a reduction in the frame rate due to the shear number of objects it hasto represent. The graphics code was not optimized to handle a large number ofbirds, but as it is the same code for both versions, the rendering time is negligible.The upper limit of 60 frames per second is due to the monitor’s internal refreshrate setting.

Clearly, the SpiNNaker Chip Computing Module is advantageous. The simu-lation delivers higher frame rates when using the module when compared to thedesktop version of the simulation.

Table 2. Frames per second for the simulation with and without the SpiNNaker ChipComputing Module (SCCM)

Number of birds FPS on SCCM FPS on Desktop Computer

2176 60 604352 59 306528 43 15

4 Summary and Conclusions

We developed hardware and software for a novel module consisting of a singleSpiNNaker chip and an auxiliary microcontroller. The system can simulate hugeneural networks but still has a very low power consumption. In fact, its real-time I/O capabilities with limited power requirements make it highly suitablefor mobile power efficient systems. This point is even more emphasized by themodule’s small size. Furthermore, our module is designed to be easily extendableby other SpiNNaker Chip Computing Modules or by other SpiNNaker systems.We demonstrated the capabilities of one single module using a boids simulation.The evaluation shows that our system is capable of simulating large numbers

Page 8: Development of a Dynamically Extendable SpiNNaker Chip ... · Development of a Dynamically Extendable SpiNNaker Chip Computing Module RuiAra´ujo,NicolaiWaniek,andJ¨orgConradt Technische

828 R. Araujo, N. Waniek, and J. Conradt

of distributed items. Hence, the module is ideal to simulate large networks ofartificial neurons.

We are currently working on a version which is even further reduced in size.It will thus be able to use the module on tiny robots, in neuroprosthetics, ormultiple connected modules to simulate even larger artificial neural networkswith only little space consumption.

Acknowledgments. This project would not have been possible without theprecious help from Steve Temple, Luis Plana and Francesco Gallupi from theUniversity of Manchester who supplied the SpiNNaker chips and provided es-sential guidance while studying the SpiNNaker system.

References

1. Bedau, M.A.: Artificial life: organization, adaptation and complexity from the bot-tom up. Trends in Cognitive Sciences 7(11), 505–512 (2003)

2. Brotherson, S.: Understanding Brain Development in Young Children (April 2009),http://www.ag.ndsu.edu/pubs/yf/famsci/fs609.pdf

3. Choudhary, S., et al.: Silicon neurons that compute. In: Villa, A.E.P., Duch, W.,Erdi, P., Masulli, F., Palm, G. (eds.) ICANN 2012, Part I. LNCS, vol. 7552, pp.121–128. Springer, Heidelberg (2012)

4. Denk, C., Llobet-Blandino, F., Galluppi, F., Plana, L., Furber, S., Conradt, J.: Real-time interface board for closed-loop robotic tasks on the spinnaker neural comput-ing system. In: Mladenov, V., Koprinkova-Hristova, P., Palm, G., Villa, A.E.P., Ap-pollini, B., Kasabov, N. (eds.) ICANN 2013. LNCS, vol. 8131, pp. 467–474. Springer,Heidelberg (2013)

5. Furber, S., Brown, A.: Biologically-inspired massively-parallel architectures - com-puting beyond a million processors. In: Ninth International Conference on Applica-tion of Concurrency to System Design, ACSD 2009, pp. 3–12 (2009)

6. Nguyen, T.: Total number of synapses in the adult human neocortex. Journal ofMathematical Modeling: One + Two 3(1) (2010)

7. Pfeil, T., Grubl, A., Jeltsch, S., Muller, E., Muller, P., Petrovici, M.A., Schmuker,M., Bruderle, D., Schemmel, J., Meier, K.: Six networks on a universal neuromorphiccomputing substrate. ArXiv e-prints (October 2012)

8. Reynolds, C.W.: Flocks, herds and schools: A distributed behavioral model. SIG-GRAPH Comput. Graph. 21(4), 25–34 (1987)