Abstract
Ralph Linsker developed an artificial neural network simulating the emergence of orientation selective cells in the visual cortex of the mammalian brain. He accomplished this using random input for his network. The goal of our project was to study the effects of structured input, namely real-world images and computer-generated barcode-like images, on Linsker's network. After reproducing Linsker's original results, we trained the network with structured input and found that the network did not develop similarly. We present the results of our experiments and suggest some ways to improve them.
Introduction
When the primary visual cortex of mammals began to be studied thoroughly in the 1950s, specialized cells were found to exist within it. One of these cell types, discovered by David Hubel and Torsten Wiesel (1959), was the orientation selective cell. This type of cell has a bar-shaped or edge-shaped receptive field with a certain orientation, which makes the cell respond to lines of that orientation. For example, if a vertical ray of light hits the excitatory region of
the cell's receptive field depicted in figure 1, the cell's activation is increased. However, when the light is rotated or moved laterally it will hit the inhibitory region(s) of the field as well, and the response of the cell will decrease or even stop completely. It is clear that the cell would display maximum activity when a bar of light with the correct orientation (vertical) hits the center of its receptive field. Furthermore, Hubel and Wiesel (1963) discovered that orientation selective cells exist in a logically organized structure in the brain. Cells which respond to similar orientations are located nearer to each other in columns of neurons in the visual cortex than cells which respond to orientations that differ more.
A series of deprivation experiments was carried out to better understand the development of the visual cortex (Hubel and Wiesel 1998). The most noteworthy discovery of these studies was the existence of orientation selective cells in animals whose eyes had been sewn shut before they had ever been opened (though Hubel and Wiesel report that there are fewer of these cells and that they are 'sluggish'). This meant that the specialized cells in the visual cortex develop without the need for any external input. However, Hirsch and Spinelli (1970) found that when animals were raised in environments lacking, for example, vertical components, orientation selective cells responding to vertical lines were not found in the animals' cortices, indicating that the environment is capable of affecting the development of the visual cortex. Any vertically orientated cells that did form during early development apparently disappeared in the postnatal stage.
The Effect of Structured Input on Linsker's Network
Selwin van Dijk and Geert Jan Alsem
Figure 1. The receptive field of an orientation selective cell. The area marked with + signs is excitatory, the areas marked with − signs are inhibitory.
Though the structure and development of the primary visual cortex had been studied, the mechanisms behind the development remained largely unknown. To explain the development of the specialized cells, Ralph Linsker performed experiments with an artificial neural network simulating the mammalian visual cortex (Linsker 1986a, 1986b, 1986c). Linsker designed the network to be as biologically plausible as possible. This is discussed further in (McDermott 1996). The network is governed by a simple set of Hebbian-type learning rules. It is trained using only random values as input to the first layer of cells. So, the activation values of any two input cells, whether they be adjacent or not, are completely uncorrelated. Cells from following layers are connected to cells of the previous layer through Gaussian-distributed connections. Under these conditions Linsker reported the emergence of spatial opponent cells, seen in figure 2 (Linsker 1986a), orientation selective cells (Linsker 1986b) (figure 1) and orientation columns (Linsker 1986c), effectively reproducing Hubel and Wiesel's findings in biological brains.
Two important things should be noted about Linsker's experiments. First, by using only random, uncorrelated data as input to the
network, Linsker could explain the existence of specialized cells in the visual cortex of prenatal or blinded mammals. However, the effect of structured, real-world input on the network was not studied, though it is known to have an effect on the mammalian brain (Hirsch and Spinelli 1970). And second, Linsker never implemented the network he first designed. Presumably due to limits on computational power at the time, he instead derived a different set of formulas averaging the change of each layer resulting from a large number of inputs over time (Linsker 1986a). We have found no accounts of an actual implementation of the network using the original formulas conceived, but not used, by Linsker. Furthermore, in deriving the formulas Linsker had to make several assumptions. Reproducing the results Linsker obtains with his derived calculations using an implementation of the original network would therefore be a substantial experiment in itself.
We are interested in the effect of structured input on Linsker's network. In order to study this effect we have reimplemented Linsker's neural network in its original form. This allowed us to impose no restrictions on the type of input used, as opposed to Linsker, who, in deriving his formulas, makes assumptions such as the input being uncorrelated.
First, we set out to reproduce Linsker's results of emerging spatial opponent and orientation selective cells. We then started experiments where another network was trained using real-world images. In this situation the activity values of two adjacent cells are often correlated, since two adjacent input cells have a high probability of representing part of the same object in the world. Because Linsker's network is considered to be a good model of the visual
Figure 2. The receptive field of a spatial opponent cell. The area marked with + signs is excitatory, the areas marked with − signs are inhibitory.
cortex we expected to find more orientation selective cells in networks trained this way. Also, as in real animal brains, we expected horizontal and vertical orientation selective cells to outnumber any others, because of the importance of horizontal and vertical objects in our world (Coppola, Purves, McCoy and Purves 1998). Finally, to match the results of the studies by Hirsch and Spinelli, we trained the network using input containing only vertically orientated structures. Again, we expected the same outcome as in Hirsch and Spinelli's original experiment, in this case the complete lack of development of cells responding to orientations differing by more than 15 degrees from the orientations present in the training data (Hirsch and Spinelli 1971).
In the next section we will describe the architecture of the network in detail, as well as the training methods used. Also, the data used in the experiments is described. Following that section we will present the results of the experiments conducted. We conclude this paper with a section discussing our results and the experiments in general.
Method
In this section we will describe the neural network used in our experiments, as well as the experiments themselves. We will begin by discussing the general architecture of the network. Next we shall describe the mechanism through which the network develops, that is, its training equations, the parameters and the algorithm it uses. Finally we will discuss the different kinds of experiments we ran on the network.
As with any artificial neural network, the network designed by Linsker consists of neurons, grouped in layers, and the connections between them. In order to get the network to display the desired behavior, we have largely kept to the specifications provided by Linsker in the first of his three articles (1986a).
The network consists of a number of two-dimensional layers, each divided up into rows and columns of neurons (figure 3). As Linsker describes the development of orientation selective cells in the seventh layer of the network (Linsker 1986b), the number of layers in our implementation is also set to seven. With the experiments using real-world images in mind, each layer was defined to have 75 rows
Figure 3. A single layer of the network. Actual layers are 75×75. Entire network consists of seven of these layers.
Figure 4. Schematic depiction of three layers of the network. Note the Gaussian distribution of the connections and the overlapping receptive fields. Neurons at the edge of the layer have fewer connections than those in the middle.
and 75 columns of neurons. Similarly sized layers are used by Linsker, as he mentions the use of layers of both 72×72 and 80×80 neurons in size (Linsker 1986c). Associated with each neuron is a value, modeling the activity of biological neurons.
One of the most important features of any neural network is the way in which its neurons are connected. Keeping with Linsker's design, each neuron receives input from a number of neurons in the previous layer (except, of course, the neurons in the very first layer, which receive their input directly). The information travels exclusively from input layer to output layer, making this a feedforward network. The presynaptic neurons with which a postsynaptic neuron is connected are drawn from a two-dimensional Gaussian distribution centered at the postsynaptic neuron's position (figure 4). The number of connections to neurons directly above the postsynaptic neuron is therefore greater than the number of connections to neurons located more to the side. Multiple connections between the same two neurons can, and in fact are likely to, exist. Also note that there is a large overlap between the sets of presynaptic neurons of two postsynaptic neurons located near each other. Because of this, the input to two neighboring neurons is largely identical, and their outputs will be correlated, given that the weights of their connections will also be similar due to the similarity of their input. In (Linsker 1986a) connection numbers of 300 and 600 connections per neuron are given. Weighing these figures against the time constraints on training the network, we have set the maximum number of connections per neuron at 400. Not all neurons reach this number, however, since some presynaptic neurons drawn from the distribution could lie outside the layer boundaries and are therefore nonexistent. In this
situation, though not touched upon by Linsker, we chose to simply remove these connections. Associated with each connection is a connection strength which serves to weigh the activity value of the presynaptic neuron before using it as input for the postsynaptic neuron. Connection strengths are allowed to vary between their extreme values of −1 and 1. The strength of each connection is initialized at a small random value between −1×10⁻⁵ and 1×10⁻⁵. The initial strengths need to be small enough that they have no lasting effect on the development of the network. For reasons which will become clear when we discuss the formulas used for training the network later in this section, the connections cannot be initialized with a strength of zero. A notable exception is formed by the connections between the first two layers. Linsker notes in (1986a) that these connections should all reach their maximum weight when the proper parameters are used. To speed up the process, we simply set these weights to their positive extremes. It should be noted, though, that it would be trivial to achieve the same effect through actual training.
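The connection scheme described above can be sketched as follows. This is a minimal reconstruction: the function name and the Gaussian width `sigma` are our own choices (the paper does not state the spread of the distribution), while the 75×75 layer size, the 400-connection cap, the removal of out-of-bounds connections and the ±1×10⁻⁵ weight initialization follow the text.

```python
import numpy as np

def draw_connections(row, col, size=75, n_conn=400, sigma=4.0, rng=None):
    """Draw the presynaptic positions for one postsynaptic neuron from a
    2D Gaussian centered on (row, col).  Positions falling outside the
    layer are simply removed, so neurons near the edge end up with fewer
    connections; duplicate positions are kept, since multiple connections
    between the same two neurons are allowed."""
    rng = rng if rng is not None else np.random.default_rng()
    pts = np.rint(rng.normal(loc=(row, col), scale=sigma,
                             size=(n_conn, 2))).astype(int)
    inside = ((pts[:, 0] >= 0) & (pts[:, 0] < size) &
              (pts[:, 1] >= 0) & (pts[:, 1] < size))
    pts = pts[inside]
    # initial strengths: small nonzero random values in [-1e-5, 1e-5]
    weights = rng.uniform(-1e-5, 1e-5, size=len(pts))
    return pts, weights
```

A neuron in the middle of a layer keeps essentially all 400 drawn connections, while a corner neuron keeps roughly a quarter of them, matching the partial receptive fields at the edges mentioned later.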
In (Linsker 1986c) the presence of lateral connections between neurons in the seventh layer is mentioned. As stated in that same article, and confirmed by Yamazaki (2002), these connections only serve the development of orientation columns. Because we are not interested in the emergence of these columns, we have omitted the intralayer connections from our implementation.
Training the network is accomplished through a simple set of formulas adjusting the activation value of each neuron and the strength of every connection with each new input. These formulas come straight from Linsker's first article (Linsker 1986a). The activation value of each neuron is calculated from the values of the
neurons to which it is connected using the formula:
(1)  a_i^M = ra + rb ∑_j w_ij a_j^L

In this formula, a_i^M denotes the activation of neuron i in layer M. L is the presynaptic and M the postsynaptic layer, ra and rb are constants and w_ij is the strength of the connection between neurons i and j. The sum in this formula is scaled by rb and adjusted with ra. The scaling factor rb is probably introduced to prevent the activations from reaching extreme values and is omitted in later accounts of Linsker's network (MacKay and Miller 1990, Yamazaki 2002). Because of the need to determine a proper value for rb for each layer of the network, we too have introduced a way of eliminating the scaling factor. After the activity values are adjusted, they are immediately scaled so that the absolute maximum of the activities equals 1. With this scaling mechanism in place, we found we can achieve good results with ra set to zero and rb set to one. This effectively reduces formula (1) to:
(2)  a_i^M = ∑_j w_ij a_j^L

So, the activation a of each neuron is the weighted sum of the activities of the previous layer's neurons with which it is connected.
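Formula (2), together with the rescaling of each layer to a maximum absolute activation of 1, amounts to the following sketch. The data layout, a list of per-neuron (positions, weights) pairs, is our own choice.

```python
import numpy as np

def layer_activations(prev_act, connections):
    """Formula (2): each neuron's activation is the weighted sum of its
    presynaptic activities.  `prev_act` is the previous layer as a 2D
    array; `connections` is a list of (positions, weights) pairs, one per
    neuron.  Afterwards the whole layer is rescaled so the maximum
    absolute activation equals 1."""
    act = np.array([np.dot(w, prev_act[pts[:, 0], pts[:, 1]])
                    for pts, w in connections])
    m = np.max(np.abs(act))
    return act / m if m > 0 else act
```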
The formula used to adjust the connection strengths is mentioned by Linsker, but not used by him (Linsker 1986a). Instead, as noted earlier, using knowledge of the data presented to the network, Linsker derives a different set of formulas emulating the probable development of the network. The validity of these formulas, however, depends heavily on the input consisting of an infinite number of
presentations of random, uncorrelated noise. Because of the type of data we plan on using we did not wish to put any such restrictions on the input presented and so we use the original formula. The connection strengths are updated using the following equation:
(3)  Δw_ij = ka + kb (a_i^M − a_0^M)(a_j^L − a_0^L)

Again, ka and kb are constants. The parameter a_0^L denotes the average activation of all neurons in layer L with which neuron i is connected. Similarly, a_0^M is the average activation of the postsynaptic M-layer neurons sharing the positions of the neurons from the presynaptic layer with which neuron i is connected. The important aspect to note about this equation is that the weight change is greater when the activity at the presynaptic neuron a_j^L is correlated with the activity at the output neuron a_i^M, and smaller when their respective activities are uncorrelated or anticorrelated. As Linsker does not use this formula, no values for the constants ka and kb are mentioned in his articles, nor were we able to find any other articles discussing these parameters. The constant kb determines the level of influence of the correlation between neurons i and j. The value of ka needs to be small enough for negative connection weights to arise. In testing we found that setting ka to zero did not produce the desired results, as no specialized cells were formed in the network from the third layer up. By running experiments using varying values for ka and kb
we found several that provided similarly good results. For a discussion of these experiments and their results, see the next section.
At this point it should be clear why the connection weights w_ij have to be initialized at a value other than zero. From equation (2) it follows that a sum over zero-valued connections results in activation values of zero, which, in turn, would lead to a Δw_ij fully determined by ka. The input to the network would then have no effect on its final form.
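Equation (3), together with the clipping of weights to their extreme values of −1 and 1, can be sketched as follows. The function signature is ours; the default ka and kb are the values settled on later in the text.

```python
import numpy as np

def update_weights(w, a_pre, a_post, a0_pre, a0_post, ka=0.001, kb=0.1):
    """Equation (3) for one postsynaptic neuron:
        dw_ij = ka + kb * (a_i^M - a0^M) * (a_j^L - a0^L)
    `a_pre` holds the presynaptic activities a_j^L, `a_post` is the
    scalar a_i^M.  Updated strengths are clipped to the allowed
    range [-1, 1]."""
    dw = ka + kb * (a_post - a0_post) * (a_pre - a0_pre)
    return np.clip(w + dw, -1.0, 1.0)
```

With ka = 0 and zero-valued activations the update is identically zero, which is the situation the paragraph above warns against.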
The network is trained one layer at a time, so not until one layer has fully developed is the next layer trained. The actual training of the network consists of providing input to the first layer of the network (i.e. setting the values for all neurons in the first layer), adjusting all activation values in the following layer using equation (2) and then updating all connection strengths between all neurons in the two layers using equation (3). This is repeated until all, or all but one, of the connections between any two neurons have reached their limiting value, as described in (Linsker 1986a). Due to time constraints, a maximum number of repetitions was defined at which training would cease, even if not all connections had matured. After testing we decided that 250,000 updates per layer would be our limit, as immature connections rarely matured beyond this point. Even when training was constrained by this limit, the process of training an entire network could take, depending on the input used, anywhere between 12 and 30 hours to complete. When one layer completes training, the next
layer is trained in much the same way, with, of course, the input data now filtered through all previously trained (and matured) layers. New layers are added and trained until the seventh layer has completely developed and the network is finished.
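The stopping rule and the layer-wise loop can be sketched as below. This is a schematic reconstruction: the `step` callback stands in for one full input presentation (propagation through the matured layers, activation update, weight update), and the function names are ours.

```python
import numpy as np

def is_mature(w, atol=1e-9):
    """A neuron counts as mature when all, or all but one, of its
    connection strengths have reached the limiting values -1 or +1."""
    at_limit = np.isclose(np.abs(w), 1.0, atol=atol)
    return np.count_nonzero(~at_limit) <= 1

def train_layer(weights, step, max_updates=250_000):
    """Present inputs until every neuron is mature or the update budget
    is spent.  `weights` is a list of per-neuron weight vectors; `step`
    performs one full presentation for the whole layer.  Returns the
    number of presentations used."""
    for n in range(max_updates):
        if all(is_mature(w) for w in weights):
            return n
        weights = step(weights)
    return max_updates
```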
The input to the network consists of a series of grayscale images with dimensions equal to those of the layers of the network. Input is fed to the network by setting the activations of the neurons in the first layer to the corresponding pixel values from these images. The activations are then scaled between the minimum and maximum values of −1 and 1, as is done with neurons in following layers after calculating their activation values.
In our first experiment, uniform noise is used as input. This is not actually read from image files, but rather generated internally at runtime. Each neuron's activation is set to a random valid pixel value (an integer in the range 0–255) and then scaled. This ensures that the random activations have 256 possible values, just as activations obtained by reading images have. An example of the random input provided to the network can be seen in figure 5a.
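Generating and scaling the noise input might look like this (a sketch; the function name is ours, while the 0–255 range and the mapping onto [−1, 1] follow the text):

```python
import numpy as np

def random_input(size=75, rng=None):
    """Uniform-noise input: each pixel an integer 0-255, then mapped
    linearly onto the activation range [-1, 1], so the noise has the
    same 256 possible values as image input."""
    rng = rng if rng is not None else np.random.default_rng()
    pix = rng.integers(0, 256, size=(size, size)).astype(float)
    return pix / 255.0 * 2.0 - 1.0
```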
For our second experiment we used real-world images to train the network. For this, photos obtained from flickr.com are used, which
Figure 5. Examples of input used in the experiments. (a) Random input. (b) Picture of a building. (c) Picture of a landscape. (d) Image containing only vertical components.
are prescaled to the desired size. We searched for photos of buildings to create a set of 500 images. The images from the set were presented to the network in randomized order. Photos of buildings (figure 5b) were chosen because we wanted to use input with more horizontal and vertical lines than are seen in uniform noise. Before feeding them into the network, the images are first converted to grayscale. Also, the average pixel value is subtracted from each individual pixel's value before the values are scaled. They are scaled so that the maximum absolute value equals 1. This kept the average activation in the input layer of the network at zero, as it was with the noise input.
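The preprocessing applied to the photos (grayscale conversion aside) can be sketched as follows; the function name is ours, while the mean subtraction and the scaling to a maximum absolute value of 1 follow the text:

```python
import numpy as np

def preprocess(gray):
    """Preprocessing for the photo experiments: subtract the image mean
    from every pixel, then scale so the maximum absolute value equals 1.
    This keeps the average activation of the input layer at zero."""
    centered = np.asarray(gray, dtype=float)
    centered = centered - centered.mean()
    m = np.abs(centered).max()
    return centered / m if m > 0 else centered
```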
The landscape images (figure 5c), used in our third experiment, are to test how the network reacts to real world images that are less structured than the pictures of buildings used in the second experiment. In these images, there is still high correlation between neighboring pixels, but sharp contours are generally not found with the exception of the horizon present in many of the photos. Again, a set of 500 photos was constructed from photos found on flickr.com. These pictures were treated in the same way as the pictures of buildings.
Training the network during the runs in our last experiment was done using images containing only vertically orientated structures as input (figure 5d). These images are generated
in much the same way as the images of random noise.
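A possible way to generate such images, assuming a small number of full-height lines per image (the paper does not state how many lines each image contains, so `n_lines` is an illustrative choice):

```python
import numpy as np

def vertical_lines(size=75, n_lines=8, rng=None):
    """Input for the last experiment: full-height vertical lines at
    random column positions on a dark background, scaled to [-1, 1]
    like the noise input."""
    rng = rng if rng is not None else np.random.default_rng()
    img = np.zeros((size, size))
    cols = rng.choice(size, size=n_lines, replace=False)
    img[:, cols] = 255.0
    return img / 255.0 * 2.0 - 1.0
```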
Results
In this section we will present the results of the experiments conducted. Most importantly, these results are the numbers of the different types of specialized cells: spatial opponent and orientation selective cells. These counts were generated from the trained networks. Cell types were recognized and their occurrence scored automatically. This was done by taking all the connections of a single neuron and representing them as a matrix of numbers, equal to the connection weights. This matrix was then matched with example matrices of various specialized cells. These example matrices were of spatial opponent cells and two forms of orientation selective cells of all rotations in steps of ten degrees (figure 6). The gray areas allow the position of the boundary between excitatory and inhibitory areas of the receptive fields to vary somewhat. The neuron was then classified as the type of cell with which the match was highest, if it was above a certain threshold and otherwise marked as unrecognized. The number of positive and negative connections in the example masks were kept equal, so matching scores would tend to be zero for receptive fields
Figure 6. Example masks of specialized cell types. (a) Spatial opponent cell. (b) Bilobed orientation selective cell of 130 degrees. (c) Orientation selective cell with an angle of 70 degrees.
Figure 7. Number of specialized cells in layers 3 through 7 of networks trained with random data, using ka = 0.005 and varying values for kb. (a) Number of spatial opponent cells. (b) Number of orientation selective cells.
Figure 8. Number of specialized cells in layers 3 through 7 of networks trained with random data, using ka = 0.01 and varying values for kb. (a) Number of spatial opponent cells. (b) Number of orientation selective cells.
Figure 9. Number of specialized cells in layers 3 through 7 of networks trained with random data, using ka = 0.02 and varying values for kb. (a) Number of spatial opponent cells. (b) Number of orientation selective cells.
Figure 10. Number of specialized cells in layers 3 through 7 of networks trained with random data, using ka = 0.04 and varying values for kb. (a) Number of spatial opponent cells. (b) Number of orientation selective cells.
Figure 11. Average numbers of cell types found in 13 networks trained with random input. Error bars represent standard deviation.
Figure 12. Layers 3 to 7 of a network trained with random input. White pixels are spatial opponent cells, black pixels unrecognized. Color gradient represents an orientation going from vertical (0° = red) through horizontal (90° = green) back to vertical (180° = red).
(a) Layer 3 (b) Layer 4 (c) Layer 5 (d) Layer 6 (e) Layer 7
consisting of randomly mixed connections. It should be noted, however, that the number of positive and negative connections did not remain exactly equal after scaling the masks to the size of a receptive field. Receptive fields with only positive connections were automatically dismissed without being classified as any specialized cell. This helps to prevent falsely recognized cells at the edge of a layer, since neurons at the edges have partial receptive fields which would otherwise be incorrectly matched with an orientation, causing artefacts in the output.
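The matching procedure can be sketched as follows. The exact scoring function and threshold used in our program are not given above, so the normalized dot-product score and the 0.5 threshold here are illustrative assumptions; the dismissal of all-positive fields follows the text.

```python
import numpy as np

def classify(field, masks, threshold=0.5):
    """Match one receptive field against example masks (+1 excitatory,
    -1 inhibitory, 0 for the gray 'don't care' band between them) and
    return the best-matching label, or 'unrecognized' if no score
    reaches the threshold.  All-positive fields are dismissed
    outright."""
    field = np.asarray(field, dtype=float)
    if np.all(field >= 0):
        return "unrecognized"
    scores = {name: float(np.sum(field * m)) / np.count_nonzero(m)
              for name, m in masks.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unrecognized"
```

Because each mask has equal numbers of positive and negative entries, a field of randomly mixed connections scores near zero against every mask, as intended.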
Before the main experiments could begin, we needed to determine values for a working set of parameters. To obtain these, we ran a series of tests using random input, while varying the values of the parameters present in equation (3). In figures 7 through 10 we show graphs containing the numbers of specialized cells in each layer of the networks trained in these tests. In figure 7a the number of spatial opponent cells is shown for the networks trained with ka = 0.0005 and varying values for kb. Figure 7b shows the number of orientation selective cells in the same networks. In figures 8, 9 and 10 the same data is presented for networks trained with increasing values of ka.
For all four values of ka, it can be seen that the number of spatial opponent cells is highest when kb is 100 times larger than ka. Also, using these values, the number of orientation selective cells increases throughout the layers of the network. Linsker describes a large number of spatial opponent cells in layer three, which are gradually replaced by orientation selective cells in later layers. This is the effect observed when using a ka:kb ratio of 1:100. With little difference between the four optimal tested parameter pairs, and considering
the duration of the training, we decided on using ka = 0.001 and kb = 0.1 for all the following experiments.
Next the main experiments were conducted. In the first experiment, 13 networks were successfully trained using random noise as input and using the parameters discussed above. Training time for these networks was around 26 hours each. Numbers of the cells found in each layer, averaged over the 13 networks, are shown in figure 11. It can be seen that there is a relatively large number of spatial opponent cells in the third layer of the network. Their occurrence gradually drops while the number of orientation selective cells increases.
Also, pictures of all layers of each network were output. Herein, the neurons are depicted by pixels whose colors represent the different orientations they were found to have. Additionally, spatial opponent cells are marked by a white pixel and cells that are not recognized as any special cell type by black
Figure 13. Average numbers of orientation selective cells in the seventh layer of 13 randomly trained networks.
pixels. Shown in figures 12a through 12e are the third to seventh layers of a network trained with random data. We omit the first two layers, since the first layer does not have any connections and the second layer always has only positive connections, resulting in a completely black output image. Here we see, just as in figure 11, that spatial opponent cells (white pixels) are mostly present in the third layer, while orientation selective cells (colored pixels) emerge more in later layers. The number of cells responding to each orientation is about the same, as can be seen in figure 13, showing the average numbers of different orientation selective cells in the seventh layer of the networks. For these randomly trained networks the number of immature neurons had dropped to zero, or almost zero, within the maximum of 250,000 updates.
It quickly became clear that when training
multiple networks with the same type of input, the results did not vary much. Also taking the long training time into account, it was decided that training only four full networks using photos of buildings as input would be sufficient. This was our second experiment. Training took nearly 30 hours for each network. After the maximum number of updates, between 100 and 150 neurons were still immature in each layer except the second. For this experiment we will only present images showing the recognized specialized cells in a single network (figures 14a through 14e). Most connections turn out to be positive and the corresponding cells correctly remain unrecognized, with some mixed connections at the edges that are falsely recognized, making the output suffer somewhat from artefacts. A graph like figure 11 would not be useful with these results, as it would be dominated by the falsely recognized cells at the edges of the layers.
Results from the third experiment, where
Figure 14. Layers 3 to 7 of a network trained with pictures of buildings. White pixels are spatial opponent cells, black pixels unrecognized. Color gradient represents an orientation going from vertical (0° = red) through horizontal (90° = green) back to vertical (180° = red).
(a) Layer 3 (b) Layer 4 (c) Layer 5 (d) Layer 6 (e) Layer 7
Figure 15. Layers 3 to 7 of a network trained with landscape pictures. White pixels are spatial opponent cells, black pixels unrecognized. Color gradient represents an orientation going from vertical (0° = red) through horizontal (90° = green) back to vertical (180° = red).
(a) Layer 3 (b) Layer 4 (c) Layer 5 (d) Layer 6 (e) Layer 7
the networks were trained with landscape pictures, are presented in figure 15. The training time was similar to the previous experiment, about 29 hours. The number of immature neurons was a bit higher, however, mostly ranging between 200 and 400 neurons. Note the large number of cells with horizontally orientated receptive fields in the center of the third layer. As can be seen in figure 16, cells with an orientation differing greatly from 90 degrees (horizontal) hardly emerged.
Another four networks were trained with randomly distributed vertical lines, making up our fourth and final experiment. The number of
immature neurons left was down to zero for most layers in the networks, which in turn explains the relatively short training time of about 15 hours. As seen in figure 17, a large number of cells with vertical receptive fields can be seen in the early layers of the network, disappearing in later layers. In figure 18 it becomes even clearer that a large number of vertically orientated cells emerged in layer three.
Discussion
The networks created and trained in our first experiment, using random activity as Linsker did, show the effects that Linsker describes in his articles. In the third layer many circularly symmetric spatial opponent cells, or, as Linsker calls them, Mexican hats, can be seen. An example of the receptive field of one of these can be seen in figure 19a. The number of spatial opponent cells then drops and orientation selective cells start appearing. This effect continues until the seventh layer, where orientation selective cells (figure 19b) take up most of the layer. This progression can be seen in figures 11 and 12 in the previous section.
From these results we conclude the network does indeed match Linsker's original model, making no assumptions about the input
Figure 17. Layers 3 to 7 of a network trained with random vertical lines as input. White pixels are spatial opponent cells, black pixels unrecognized. Color gradient represents an orientation going from vertical (0° = red) through horizontal (90° = green) back to vertical (180° = red).
(a) Layer 3 (b) Layer 4 (c) Layer 5 (d) Layer 6 (e) Layer 7
Figure 16. Average numbers of orientation selective cells in the third layer of 4 networks trained using landscape images.
as Linsker did for his derived model. Thus it explains the emergence of orientation selective cells in the brain under similar circumstances, that is, without any structured input. As was shown in studies by Hubel and Wiesel (1998), these cell types develop without any external input.
When training the network using images of buildings as input in the second experiment, hardly any specialized cells seem to emerge. Ignoring the edges, only a small number of orientated cells appear in the layers. These
however, may be a consequence of the detection method used, since, when manually checking the cells, we find that the detected orientation selective cells do not look quite as good as many of those found in networks from our first experiment. Further examination of the finished networks showed that almost all neurons have only positive connections (figure 20). An explanation could be that there is too much correlation between neighboring pixels in the images, which could result in all connections maturing to the positive limit. This can be illustrated as follows. Neuron Mi in figure 21 will be strongly influenced by the value of neuron Li because of the Gaussian distribution of the connections. Therefore, when Lj and Li are correlated, this will result in increased correlation between Lj and Mi as well. This, in turn, will result in an increase of the connection strength between Lj and Mi, as can be seen in equation (3).
In our third experiment images of
Figure 19. Receptive field of (a) a spatial opponent cell found in the third layer, (b) an orientation selective cell responding to 130° angles. Red pixels denote negative connections, green pixels positive ones. Gray pixels are of nonextreme connections.
Figure 18. Average numbers of orientation selective cells in the third layer of 4 networks trained using images with vertical lines.
Figure 20. Completely excitatory receptive field. Green pixels denote positive connections, gray pixels are of nonextreme connections.
Figure 21. A few neurons from two layers of a network.
landscapes were used as input. A striking feature of the trained networks is the number of orientation selective cells in early layers of the network (figures 15 and 16). All these cells respond to orientations around 90 degrees (horizontal) and are located around the center of the layer. We believe these cells to be the result of the horizon present around the center of most landscape pictures.
In the final experiment, with inputs of only vertical lines, a large number of cells in the third layer end up having vertically orientated receptive fields, with angles around 0 degrees (figure 22). In the subsequent layers, however, the number of these cells decreases again, leaving the seventh layer with mostly unclassified cells. We found that the connections of neurons in later layers appear to be a random mix of positive and negative, which is why our program marks them as unclassified.
It seems Linsker's network is not suitable for explaining how the visual cortex develops in biological brains once it receives input from the eyes. Perhaps it only shows that the visual cortex cannot develop properly if it receives structured visual input from the very beginning. Research has shown that the mammalian visual cortex starts developing before the eyes are used (Hubel and Wiesel 1998), which is exactly the situation Linsker simulated. Of course, development of the visual cortex does not end once it starts receiving visual input. Moreover, in the experiments by Hirsch and Spinelli (1970, 1971), the subjects whose visual cortex turned out to develop an abnormal ratio of the different orientation selective cells had already started their development before the structured input was presented. It seems to us a good idea to use this approach in a further experiment: to model more realistically the way the mammalian visual cortex develops, networks could be trained using random input at first, followed by training with structured input.
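Such a two-phase schedule could be sketched as follows. This is a toy single-unit model with a plain clipped Hebbian rule; the function name and every parameter value are our own choices, not taken from Linsker's model.

```python
import numpy as np

def hebbian_phase(w, patches, lr=1e-3, w_max=0.5):
    # One developmental phase for a single unit: plain Hebbian growth
    # with hard weight limits (a sketch, not Linsker's exact rule).
    for x in patches:
        y = w @ x                          # unit response
        w = np.clip(w + lr * y * x, -w_max, w_max)
    return w

rng = np.random.default_rng(1)
n = 25                                     # a 5x5 receptive field, flattened
w = rng.uniform(-0.05, 0.05, n)            # immature connections

# Phase 1: random input, mirroring development before eye opening.
w = hebbian_phase(w, rng.standard_normal((2000, n)))

# Phase 2: structured input -- a fixed pattern plus noise stands in for
# patches cut from real images -- starting from the phase-1 weights.
pattern = np.sign(rng.standard_normal(n))
patches = pattern + 0.3 * rng.standard_normal((2000, n))
w = hebbian_phase(w, patches)
```

In this toy setting the phase-2 weights end up aligned with the structured pattern, which is the kind of environment-driven refinement, on top of input-independent development, that the deprivation studies suggest.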
Concerning the experiments with structured input, another possibility exists. As Linsker (1986b) mentions, the network can be made to develop orientation selective cells in its early layers by choosing appropriate parameter values. These cells, Linsker remarks, are less robust against random variations in the initial conditions. Since we determined our parameter values based solely on the proper development of the network in the first experiment, it is entirely possible this has happened in our later experiments. Certainly the orientation selective cells seen in the third layers of the networks from our third and fourth experiments seem to be a direct result of the input presented during training. The absence of this effect in the second experiment could be due to the lack of lines at a regular position in the images and the generally lower contrast of those lines.
To summarize, we believe the results of the first experiment show that the network designed by Linsker will work when provided with actual input. As noted before, Linsker never implemented his network, so this is a very interesting result. We do feel, however, that the results could be greatly improved by searching the parameter space for optimal values for each layer separately, instead of, as we did, using one single set of parameters for the entire network.

Figure 22. Orientation selective cell found in the third layer of a network trained with vertical input. Orientation found is 0°. Red pixels denote negative connections, green pixels positive ones.
The results of the final three experiments were not what we had expected, nor indeed desired. Orientation selective cells did emerge, but they did so in the third layer of the network, quite unlike what Linsker's results predict. However, we feel it is too early to conclude that Linsker's network cannot cope with structured input.
Again, results could be improved by examining the parameter space in search of values giving better results for each layer. For our experiments we determined the best values for the constants by running tests with random input, trying to duplicate the effects seen in Linsker's network while keeping the computation time reasonable. We have, however, not investigated the effect of using a different set of parameters with structured input. The key difference between the input presented in the first experiment and that in the following ones is the correlation between its pixels, which, as mentioned earlier, results in an interlayer correlation between neurons. Since this correlation, directly manipulated by the parameters, is the factor determining the development of the network through equation (3), it is plausible that different parameters are needed. The large number of immature neurons remaining in the second experiment, and even more so in the third, also indicates to us that different parameter values are needed.
As it stands, we will only conclude that the network does not develop properly when it is trained with structured input while its parameters are optimized for uncorrelated input.
We would like to mention some of the difficulties we faced in these experiments. First, the neurons lying at the edge of a layer have only about half a normal receptive field. This poses problems for recognizing their cell type, which we partly solved by ignoring all-positive receptive fields. In later layers, however, the effect is passed on to neurons closer to the center, generating orientation selective cells which only develop due to this abnormal input. In some cases this could be solved by wrapping the connections around to the opposite side of the layer when they would fall off the edge. When using real-world images as input, as in the second and third experiments, this would probably produce an unnatural edge in the input, creating a similar problem.
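The wrap-around idea amounts to indexing connection coordinates modulo the layer size, turning the layer into a torus. A minimal sketch (in Python, where `%` already returns a non-negative result for negative offsets):

```python
def wrap(coord, size):
    # Toroidal boundary: a connection coordinate that falls off one side
    # of the layer re-enters on the opposite side.
    return coord % size

# On a 32-wide layer, an offset of -3 from column 1 lands at column 30,
# and column 33 wraps back to column 1.
print(wrap(1 - 3, 32), wrap(33, 32))   # 30 1
```

With this mapping every neuron has a complete receptive field, at the cost of the artificial seam noted above when the input is a natural image.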
We also had some trouble with cell type recognition. When viewing the recognized cells, we sometimes felt a cell was, for instance, a spatial opponent cell even though it had been classified as orientation selective. This proved very difficult to improve: whenever we changed the threshold for the matching value, any improvement in cell recognition was paired with an increase in false matches. The only remedy would be scoring the cell counts by hand (from a sample of the network, as counting all 39,000+ neurons in a single network is not recommended). This way, one could also concentrate on cells in the center of the layer, which receive no, or very little, input from neurons with incomplete receptive fields.
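Sampling only central cells for hand-scoring could look like the following hypothetical helper; the function name is ours, and the margin width would have to be chosen to cover the layers' cumulative edge effects.

```python
import random

def sample_centre(layer_size, margin, k, seed=0):
    # Draw k distinct neurons uniformly from the central region of a
    # square layer, skipping a border of `margin` cells whose receptive
    # fields are contaminated by edge effects.
    centre = [(r, c)
              for r in range(margin, layer_size - margin)
              for c in range(margin, layer_size - margin)]
    return random.Random(seed).sample(centre, k)
```

A hand-scorer would then inspect only the receptive fields at the returned coordinates.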
Acknowledgments
We would like to thank the following people for their help with this project: Fokie Cnossen, for her lectures on setting up a presentation and writing a report; and Gert Kootstra, for his guidance during the project and his help with understanding the material and creating the network.
References

Coppola, D. M., Purves, H. R., McCoy, H. N., & Purves, D. (1998). The distribution of oriented contours in the real world. Proceedings of the National Academy of Sciences of the United States of America, 95, 4002-4006.

Hirsch, H. V. B., & Spinelli, D. N. (1970). Visual experience modifies distribution of horizontally and vertically oriented receptive fields in cats. Science, 168, 869-871.

Hirsch, H. V. B., & Spinelli, D. N. (1971). Modification of the distribution of receptive field orientation in cats by selective visual exposure during development. Experimental Brain Research, 13, 509-527.

Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurons in the cat's striate cortex. Journal of Physiology, 148, 574-591.

Hubel, D. H., & Wiesel, T. N. (1963). Shape and arrangement of columns in cat's striate cortex. Journal of Physiology, 165, 559-568.

Hubel, D. H., & Wiesel, T. N. (1998). Early exploration of the visual cortex. Neuron, 20, 401-412.

Linsker, R. (1986a). From basic network principles to neural architecture: emergence of spatial-opponent cells. Proceedings of the National Academy of Sciences of the United States of America, 83, 7508-7512.

Linsker, R. (1986b). From basic network principles to neural architecture: emergence of orientation-selective cells. Proceedings of the National Academy of Sciences of the United States of America, 83, 8390-8394.

Linsker, R. (1986c). From basic network principles to neural architecture: emergence of orientation columns. Proceedings of the National Academy of Sciences of the United States of America, 83, 8779-8783.

MacKay, D., & Miller, K. (1990). Analysis of Linsker's application of Hebbian rules to linear networks. Network, 1, 257-297.

McDermott, J. (1996). The emergence of orientation selectivity in self-organizing neural networks. The Harvard Brain, 3(1), 43-51.

Yamazaki, T. (2002). A mathematical analysis of development of oriented receptive fields in Linsker's model. Neural Networks, 15(2), 201-207.