

A memoryless, stochastic mechanism of timing of phases of behavior by a neural network controller

Saaniya Contractor, Nataliya Kozlova, and Vladimir Brezina, Mount Sinai School of Medicine, New York, NY, USA

REFERENCES

1. Proekt A, Brezina V, Weiss KR (2004) Dynamical basis of intentions and expectations in a simple neuronal network. PNAS 101: 9447-9452.
2. http://inka.mssm.edu/~nata/simulations/ode.html
3. http://inka.mssm.edu/~nata/simulations/neuralnet.html

I SUMMARY

II REAL APLYSIA CPG

III NEURAL NETWORK CONTROLLER

IV BEHAVIORAL SCENARIO

[Fig. 1]

For a sensorimotor network to generate adaptive behavior in the environment, the phases of the behavior must be appropriately timed. When the behavior is driven simply by the sensory stimuli from the environment, these can supply the timing. But when the behavior is driven by an internal "goal" that ignores and perhaps even opposes the immediate sensory stimuli, the timing must be generated internally by the network. We have modeled a realistic behavioral task that requires such internal timing, based on the feeding behavior of the sea slug Aplysia (Fig. 1A).

When an Aplysia feeds, it incrementally ingests long strips of seaweed, driven by ingestive stimuli emanating from the seaweed (Fig. 1B, left to right along the top). But if, having ingested a strip, the animal fails to break the strip off the substrate, it must incrementally egest the entire strip again. To do this, it must ignore the inherent ingestiveness of the seaweed and generate the opposite, egestive behavior, driven by an internal egestive goal, for a length of time that is appropriate for the length of the strip to be egested (Fig. 1B, right to left along the bottom).

In this poster, we compare the very different mechanisms by which this task is performed, equally well, by two different nervous systems: the real Aplysia feeding central pattern generator (CPG), and an artificially evolved neural network controller.

Using genetic algorithms, we then evolved simple artificial neural network controllers that were able to perform the behavioral task just as well as the real CPG does (see Fig. 10). Although we evolved controllers with up to 10 neurons, further investigation showed that 2-neuron controllers performed just as well and employed the same mechanism as controllers with more neurons; therefore only 2-neuron controllers are presented here. (1-neuron controllers were not able to perform the task at all.) All controllers presented here were evolved to perform the task in environments with τ = 30 and, initially, f = 0.3; for comparison, controllers were subsequently evolved with f = 0.7 instead.

Fig. 5 shows a representative simulation with the best controller evolved with τ = 30 and f = 0.3, performing the task in that same environment. (The best controllers evolved with f = 0.3 and with f = 0.7 can be run with different environmental parameter values on our Web site [3].)

Note the phases of goal-driven egestion c, c’, ..., in Fig. 5B.

Fig. 2 shows the behavior of a standard differential-equation-based model (red and blue curves), driven by either ingestive or egestive stimuli, fitted to experimental data obtained with that same stimulation of the Aplysia feeding CPG in vitro by Proekt et al. [1] (black and white circles).

As can be seen, the dynamics of the CPG are for the most part slow (“1D model”, blue). They integrate the incoming stimuli over multiple cycles of the feeding behavior so that the character of the behavioral output progressively evolves in the ingestive direction with repeated ingestive stimuli (Fig. 2A), and in the egestive direction with repeated egestive stimuli (Fig. 2B). Furthermore, after a switch from egestive to ingestive stimuli, the output exhibits inertia: it remains egestive for some time (Fig. 2C, arrow 4). After the converse switch from ingestive to egestive stimuli, however, there is no such inertia: the output becomes egestive immediately (Fig. 2C, arrow 3). The superposition of this one component of fast dynamics on the otherwise slow dynamics requires the second dimension of the full “2D model” (red).

We tested the CPG model in the behavioral task over a range of environments defined by the two parameters τ, the environmental length or time scale, and f, the fraction of the true stimulus in the environment that is perceived by the model. Fig. 3 shows a representative simulation with the 2D model with τ = 200 and f = 0.1. (The model can be run with different parameter values on our Web site [2].) Over a certain range of τ and f, the 2D model (but not the 1D model) performed the task extremely well (Fig. 4), not much worse than the theoretical average maximal performance of 0.33.

Further investigation showed that, in the CPG model, the goal-driven egestion is appropriately timed by a slowly decaying dynamical transient that "remembers" the time elapsed since the beginning of the egestion. The slow speed of the transient is due to the slow dynamics of the CPG, while the initial value from which the transient decays is set by the superimposed component of fast dynamics.
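The way a decaying transient can act as a timer may be illustrated with a small worked example. It assumes, purely for illustration, that the transient x(t) decays exponentially with a slow time constant τ_s from an initial value x_0, and that egestion ends when x falls to a threshold θ. The symbols x_0, τ_s, and θ are hypothetical and do not appear on the poster; the fitted 2D model may well differ in form.

```latex
\begin{align}
x(t) &= x_0 \, e^{-t/\tau_s}, \\
x(t^\ast) = \theta \quad &\Longrightarrow \quad t^\ast = \tau_s \ln\frac{x_0}{\theta}.
\end{align}
```

Under these assumptions, the duration of egestion t* is proportional to the slow time constant and depends only logarithmically on the initial value x_0, which in the CPG model would correspond to the value set by the fast component.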

[Figs. 2-6. Recoverable axis label: Environmental length / time scale, τ]

In the evolved neural network controllers, however, the phases of goal-driven egestion are timed by a completely different mechanism than in the CPG. The dynamics of these networks are characterized by discrete ingestive and egestive attractors, to which they switch in response to ingestive and egestive stimuli. In Fig. 6A the ingestive and egestive attractors are shown by the green and red dots, respectively, and their basins of attraction by the green and red coloring of the state space. The phase of goal-driven egestion is generated by the fact that the switch in behavior, from one attractor to the other, follows the switch in stimulus only with a considerable delay, during which the network continues to reside near the old attractor and generate the old behavior. A representative example is shown in Fig. 6B in the time domain, and in Fig. 6C in the state space.
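The attractor picture can be sketched with a toy model. The code below is not one of the evolved controllers: it assumes a hypothetical 2-neuron rate network with tanh units, self-excitation `a`, and mutual inhibition `b`, with values chosen by hand so that two attractors exist. It shows that a brief stimulus leaves the network in the basin of the old attractor, whereas a sustained stimulus carries it into the new one.

```python
import math

def step(x1, x2, stim, a=2.0, b=1.0, dt=0.1):
    """One Euler step of a hypothetical 2-neuron rate network.

    a: self-excitation, b: mutual inhibition (illustrative values only).
    stim > 0 pushes the network toward the state with neuron 2 active.
    """
    dx1 = -x1 + math.tanh(a * x1 - b * x2 - stim)
    dx2 = -x2 + math.tanh(a * x2 - b * x1 + stim)
    return x1 + dt * dx1, x2 + dt * dx2

def run(pulse_steps, stim=4.0, settle_steps=300):
    """Start near the old attractor (x1 > 0, x2 < 0), apply a stimulus pulse,
    then remove the stimulus and let the network settle."""
    x1, x2 = 0.9, -0.9
    for _ in range(pulse_steps):
        x1, x2 = step(x1, x2, stim)
    for _ in range(settle_steps):
        x1, x2 = step(x1, x2, 0.0)
    return x1, x2

if __name__ == "__main__":
    print("brief pulse:", run(3))   # network falls back to the old attractor
    print("long pulse: ", run(50))  # network settles at the new attractor
```

In this sketch the basin boundary is crossed only if the stimulus persists long enough, which is the toy analogue of the delayed switch described above.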

What governs the duration of the delay before the switch to the new attractor and behavior? Residing always near an attractor, the network has no long-term memory. Instead, the switch to the new attractor finally occurs when a sufficiently high local stimulus density appears in the stochastic stimulus input stream (e.g., at the arrow at the top right of Fig. 6B). This complex event occurs rarely.
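A minimal simulation can make the idea of a rare, density-triggered switch concrete. The sketch assumes, as a stand-in for the network, a leaky integrator of a Bernoulli stimulus stream in which a stimulus is perceived with probability f at each unit time step; the `decay` and `threshold` parameters are hypothetical and only set how dense the local stimulus must be before the switch occurs.

```python
import random

def switch_delay(f, decay, threshold, rng, max_steps=100_000):
    """Number of steps until a leaky integrator of a Bernoulli(f)
    stimulus stream first reaches the threshold."""
    y = 0.0
    for t in range(1, max_steps + 1):
        s = 1.0 if rng.random() < f else 0.0  # stimulus perceived with probability f
        y = decay * y + s                      # short-lived trace of recent stimuli
        if y >= threshold:
            return t
    return max_steps  # not reached within max_steps (practically never here)

def mean_delay(f, decay, threshold, trials=2000, seed=0):
    """Average switch delay over independent trials."""
    rng = random.Random(seed)
    total = sum(switch_delay(f, decay, threshold, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    # With decay 0.5, thresholds 1.0, 1.5, 1.75 require 1, 2, 3 consecutive stimuli.
    for thr in (1.0, 1.5, 1.75):
        print(thr, round(mean_delay(0.3, 0.5, thr), 1))
```

Because the integrator forgets its input within a few steps, the delay in this sketch is set by how rarely a dense enough burst occurs, not by any record of elapsed time; raising the threshold makes the triggering event rarer and the mean delay longer.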

To perform the task efficiently, the evolution of the network tunes its connection weights so that the switch requires a stimulus density that occurs, on average, about once in the time that is required to egest the typical length of seaweed strip with which the network is evolved.

This can be seen in a comparison of the best networks evolved with f = 0.3 and f = 0.7, in both cases with τ = 30, that is, the same desired delay duration of ~30 units of time. The environment with f = 0.7 has an intrinsically higher stimulus density, which would shorten the delay (Fig. 9A). To maintain the delay duration at the desired value, the network evolved with f = 0.7 has slower dynamics (Fig. 7), so that it makes the switch to the new attractor only upon receiving a higher density of stimulus (Fig. 8), a density that occurs on average again about every 30 units of time. Thus the network evolved with f = 0.7 is tuned to τ = 30 at f = 0.7, whereas the network evolved with f = 0.3 is tuned to τ = 30 at f = 0.3 (Figs. 9B and 10).
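The trade-off between stimulus density and switching criterion can be illustrated with a closed-form toy calculation. It assumes, hypothetically, that one stimulus is perceived with probability f per unit time step and that the switch requires a run of k consecutive stimuli, a crude stand-in for "local stimulus density". The expected wait for such a run in a Bernoulli stream is a standard result, (p^(-k) - 1) / (1 - p).

```python
def expected_wait_for_run(p, k):
    """Expected number of Bernoulli(p) steps until the first run of
    k consecutive stimuli (standard result for success runs)."""
    return (p ** -k - 1.0) / (1.0 - p)

def best_run_length(p, target, k_max=50):
    """Run length k whose expected wait is closest to the target delay."""
    return min(range(1, k_max + 1),
               key=lambda k: abs(expected_wait_for_run(p, k) - target))

if __name__ == "__main__":
    for f in (0.3, 0.7):
        k = best_run_length(f, 30.0)
        print(f, k, round(expected_wait_for_run(f, k), 1))
```

In this toy setting the run length closest to a 30-step delay is k = 2 at f = 0.3 but k = 6 at f = 0.7, so the denser environment calls for a stricter criterion to keep a similar delay. The match to 30 is only coarse because k is restricted to integers; the evolved networks presumably tune a continuous criterion through their connection weights.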

[Figs. 7-10]

In a behavioral scenario realistically modeled on the feeding behavior of Aplysia, we have contrasted two different mechanisms of internal timing of a phase of the behavior, namely egestion driven, in opposition to the current sensory stimuli, by an internal egestive goal.

1. In the real feeding CPG, the goal-driven egestion is timed by a slowly decaying dynamical transient that "remembers" the time elapsed since the beginning of the egestion.

2. In artificial neural network controllers, the timing is performed by a delayed switch away from an egestive behavioral attractor triggered by a stochastic event that occurs with an appropriately low probability.

Nature Precedings : doi:10.1038/npre.2009.2817.1 : Posted 26 Jan 2009