
Artificial Neural Network based Curve Prediction

LECTURE COURSE: AUSGEWÄHLTE OPTIMIERUNGSVERFAHREN FÜR INGENIEURE (Selected Optimisation Methods for Engineers)

SUPERVISOR: PROF. CHRISTIAN HAFNER

STUDENTS: ANTHONY HSIAO, MICHAEL BOESCH

Abstract

We use artificial neural networks to perform curve prediction. To this end, we have created a class of neural networks (feed-forward multi-layer perceptron networks with backpropagation) whose topology is determined by their genetic makeup. Using a simple evolutionary strategy on their genes, we optimise the networks' topologies to solve the problems at hand. With this approach, we could generate networks that predict simple functions such as sin(x), or linear combinations thereof, with moderate computational overhead. However, within the allowed network sizes, it was not possible to generate networks that satisfactorily predict more complex functions such as sinc(x) or the NASDAQ composite index. In general, though, generating neural networks with this form of evolutionary strategy appears to be a useful approach, as it substitutes for experience in neural network design.

Introduction

Curve prediction is one of the most popular applications of artificial neural networks. However, the success of using a neural network to solve a given problem is inherently linked to the designer's ability to apply an appropriate network to the task. Even relatively simple artificial neural networks such as the multi-layer perceptron, or variants thereof, have several degrees of freedom to which the network is very sensitive, e.g. the number of neurons, the number of hidden layers, and the type of transfer functions employed. For most tasks there is no design methodology that guarantees success. Instead of relying on design experience, we try to evolve a neural network topology that is suitable for the curve prediction task at hand.

Aim

• To develop an evolvable artificial neural network representation

• To optimise such a neural network to solve a number of curve prediction tasks

• To evaluate the ability of an evolutionary approach to evolve suitable neural networks for a given task

Neural Networks

In order to perform the prediction tasks described above, we use multi-layer perceptron networks and a simple backpropagation learning rule. Then, we use an evolutionary strategy to change the following parameters of the network:

• Number of hidden layers

• Number of neurons in each layer

•  Transfer function employed by the neurons in each layer (neurons in the

same layer will employ the same transfer function)

In order to do this, we define a genetic code for the class of neural networks comprising an N-digit binary bit string. To limit the optimisation search space, we arbitrarily limit the number of hidden layers to 10. The number of neurons per layer is limited to 15. The allowed transfer functions are linear, bounded linear, and hyperbolic tangent, as shown below:
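Written out, the three allowed transfer functions are as follows (a sketch standing in for the original figure; the ±1 bounds on the bounded linear function are our assumption):

$$ y_{\text{lin}}(x) = x, \qquad y_{\text{bnd}}(x) = \max\bigl(-1,\ \min(1,\ x)\bigr), \qquad y_{\text{tanh}}(x) = \tanh(x) $$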

Thus, the number of neurons in each hidden layer can be represented by a four-bit number, and the transfer function for the neurons in each layer by a two-bit number, for a total of six bits per layer. As there are up to 10 hidden layers, the total bit string is 60 bits long, laid out as follows:

Bit 0                                                                                Bit 59
| Layer 0 | Layer 1 | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Layer 6 | Layer 7 | Layer 8 | Layer 9 |

Each six-bit layer field is laid out as:

| Neuron bit 3 | Neuron bit 2 | Neuron bit 1 | Neuron bit 0 | Transfer bit 1 | Transfer bit 0 |

Each layer can have up to 15 neurons, as given by the binary number [Neuron bit 3 : Neuron bit 0]. If the bit string encodes zero neurons for a layer, that layer is interpreted as non-existent. Also, the two transfer bits [Transfer bit 1 : Transfer bit 0] can encode four transfer functions per layer, but only three are employed; the linear y = x transfer function is therefore given a bias by being encoded in two of the four possible states. Thus, changing the genes of a network changes its topology, making it more or less suitable for the tasks at hand.
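The decoding just described can be sketched as follows; the boolean[] genome representation and all class and method names are illustrative assumptions, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of decoding the 60-bit genome into a layer-by-layer topology.
class GenomeDecoder {
    enum Transfer { LINEAR, BOUNDED_LINEAR, TANH }

    record LayerSpec(int neurons, Transfer transfer) {}

    static List<LayerSpec> decode(boolean[] genome) {        // genome.length == 60
        List<LayerSpec> layers = new ArrayList<>();
        for (int layer = 0; layer < 10; layer++) {
            int base = layer * 6;
            int neurons = 0;                                 // [Neuron bit 3 : Neuron bit 0]
            for (int b = 0; b < 4; b++) {
                neurons = (neurons << 1) | (genome[base + b] ? 1 : 0);
            }
            int t = ((genome[base + 4] ? 1 : 0) << 1) | (genome[base + 5] ? 1 : 0);
            // Two of the four states map to LINEAR, biasing evolution towards y = x.
            Transfer transfer = switch (t) {
                case 0, 1 -> Transfer.LINEAR;
                case 2    -> Transfer.BOUNDED_LINEAR;
                default   -> Transfer.TANH;
            };
            if (neurons > 0) {                               // zero neurons: layer does not exist
                layers.add(new LayerSpec(neurons, transfer));
            }
        }
        return layers;
    }
}
```

For example, a layer field 0101 10 would decode to a five-neuron layer with the bounded linear transfer function.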

Optimisation

In order to find an optimal network topology for the given tasks, we use an evolutionary strategy to evolve the genetic makeup of the networks; this could also be regarded as a genetic algorithm without crossbreeding.

The Evolutionary Strategy

The algorithm employed works as follows:

1. Produce a first generation of population size seven of random bit strings

2. Generate randomly initialised networks from the population of bit strings

3. Train the networks on a training set using backpropagation

4. Run the networks on the test data

5. A fitness function evaluates the fitness of each network, and the fittest

network is kept for the next iteration, while the other ones are discarded.

6. The fittest network is cloned six times to refill the generation, and each of these clones is mutated by inverting one of the 60 bits at random.

7. The process repeats from step 3 until a maximum number of iterations has been performed, or until only marginal or no improvement in fitness is achieved over several iterations.

A table of all bit strings that have already been evaluated is kept, so as to avoid computing the same network topology multiple times.
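A minimal sketch of this strategy, which amounts to a (1+6)-style evolutionary loop: trainAndScore stands in for steps 2-5, and MAX_ITER, the stand-in score, and all names are illustrative assumptions (the stagnation-based stopping criterion is omitted for brevity):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Sketch of the evolutionary strategy described above: keep the fittest
// network, refill the generation with mutated clones, memoise evaluations.
class EvolutionSketch {
    static final int POP = 7, GENOME_BITS = 60, MAX_ITER = 200;
    static final Random RNG = new Random();

    static boolean[] evolve() {
        Map<String, Double> evaluated = new HashMap<>();     // bit strings already scored
        boolean[][] pop = new boolean[POP][GENOME_BITS];
        for (boolean[] g : pop)                              // step 1: random first generation
            for (int i = 0; i < GENOME_BITS; i++) g[i] = RNG.nextBoolean();

        boolean[] best = pop[0];
        double bestFitness = Double.NEGATIVE_INFINITY;
        for (int iter = 0; iter < MAX_ITER; iter++) {        // step 7: iteration bound
            for (boolean[] g : pop) {                        // steps 2-5: evaluate, keep fittest
                double f = evaluated.computeIfAbsent(Arrays.toString(g),
                                                     k -> trainAndScore(g));
                if (f > bestFitness) { bestFitness = f; best = g.clone(); }
            }
            for (int i = 0; i < POP; i++) {                  // step 6: refill with mutated clones
                pop[i] = best.clone();
                if (i > 0) {                                 // keep one unmutated copy
                    int bit = RNG.nextInt(GENOME_BITS);
                    pop[i][bit] = !pop[i][bit];
                }
            }
        }
        return best;
    }

    // Placeholder for: build the network from the genome, train it with
    // backpropagation, run it on the test data, and return its fitness.
    static double trainAndScore(boolean[] genome) {
        return RNG.nextDouble();                             // stand-in score
    }
}
```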

Here, we evaluate the fitness of the networks in two ways, depending on the task at

hand.

• For short term prediction, a part of the test signal series is used as input to

the network, and the first value the network predicts is compared with the


actual value of the series at that point. The cumulative error is found by

summing the absolute difference between the predicted value and the actual

value for all shifted versions of the actual signal as input to the network. The

fitness of the network is the reciprocal of the cumulative error. This method evaluates the network's ability to make short term predictions for a given pattern and number of inputs (see Figure 1 below).

[Diagram: the network, with a fixed number of inputs and one output, is slid along the actual signal (samples 1 ... N); at each shift its one-step prediction is compared with the next actual value.]

Figure 1: Short term prediction method
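In code, this short term evaluation might look like the following sketch; the Predictor interface and the MIN_ERROR floor (which implements the fitness cap discussed below) are assumptions:

```java
// Sketch of short term fitness: slide a window over the actual signal,
// compare each one-step prediction with the true next value, and take the
// reciprocal of the cumulative absolute error.
interface Predictor {
    double predict(double[] window);   // returns the first predicted value
}

class ShortTermFitness {
    static final double MIN_ERROR = 1e-6;   // assumed floor, caps the maximum fitness

    static double evaluate(double[] signal, Predictor net, int inputs) {
        double cumError = 0.0;
        for (int start = 0; start + inputs < signal.length; start++) {
            double[] window = java.util.Arrays.copyOfRange(signal, start, start + inputs);
            cumError += Math.abs(net.predict(window) - signal[start + inputs]);
        }
        return 1.0 / Math.max(cumError, MIN_ERROR);
    }
}
```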

• For long term prediction, a part of the signal is used as input to the

network, and the first predicted value is fed back and used as the next input

to the network, and this is repeated for a given number of points that are to

be predicted. Then, the fitness is the reciprocal of the cumulative absolute

difference between the actual signal and the recurrently predicted signal.

This method evaluates the network's ability to make long term predictions (forecasts) for a given starting pattern and number of points to predict (see Figure 2 below).

[Diagram: the network is seeded with a window of the actual signal (samples 1 ... N); each predicted value is appended to the input window and fed back, so the prediction is recurrently extended.]

Figure 2: Long term prediction method
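A corresponding sketch for the long term case, reusing the assumed Predictor interface from the previous sketch; the window-shifting feedback follows the description above, everything else is illustrative:

```java
// Sketch of long term fitness: feed each prediction back as the next input,
// then score the reciprocal of the cumulative absolute error over the forecast.
// Assumes signal.length >= inputs + horizon.
class LongTermFitness {
    static final double MIN_ERROR = 1e-6;   // assumed floor, caps the maximum fitness

    static double evaluate(double[] signal, Predictor net, int inputs, int horizon) {
        double[] window = java.util.Arrays.copyOf(signal, inputs);  // seed with actual signal
        double cumError = 0.0;
        for (int k = 0; k < horizon; k++) {
            double predicted = net.predict(window);
            cumError += Math.abs(predicted - signal[inputs + k]);
            System.arraycopy(window, 1, window, 0, inputs - 1);     // shift window left
            window[inputs - 1] = predicted;                         // feed prediction back
        }
        return 1.0 / Math.max(cumError, MIN_ERROR);
    }
}
```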

In both cases, a minimum error (and thus a maximum fitness) is imposed on each network, in order to avoid division-by-zero errors and infinite fitness. Furthermore, smaller networks, i.e. networks with fewer hidden layers, are preferred over larger ones, as are networks with a small number of neurons.
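One simple way to realise both the fitness cap and the size preference is to floor the error and scale the result down with network size; the penalty form and the weights below are our assumptions, not the authors' formula:

```java
// Sketch: cap the raw reciprocal-error fitness and bias it towards smaller
// networks. The penalty form and the weights 0.01 / 0.001 are assumptions.
class OverallFitness {
    static double score(double rawFitness, int hiddenLayers, int totalNeurons) {
        double sizePenalty = 1.0 + 0.01 * hiddenLayers + 0.001 * totalNeurons;
        return rawFitness / sizePenalty;   // smaller networks score higher at equal error
    }
}
```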

The Search Space

The size of the search space can be calculated as follows: the genome comprises 10 layer fields of 6 bits each, giving

$$ 2^{6 \cdot 10} = 2^{60} \approx 1.15 \times 10^{18} $$

possible bit strings.

In fact, the search space is slightly smaller than this, because some of the

topologies where one or more layers have zero neurons are equivalent. Still, it can

be appreciated that the search space is large enough to justify this optimisation approach.

Evaluation and Discussion

We developed a software application in Java that implements this evolvable artificial neural network representation, and which allows our evolutionary strategy to evolve the topology of the networks. To evaluate our approach, we adopted the following testing strategy.

Testing Strategy

There are two tasks, short- and long term (function) prediction, that our neural networks have to perform. Here, we qualitatively assess the ability of the evolved networks to perform each task, using the following representative test signals:

• Sinusoidal function: an arbitrary sinusoidal function such as sin(x) with a given amplitude, frequency and phase. This is probably the simplest test signal, and the networks can be expected to manage both tasks successfully.

• Aperiodic function: the aperiodic function sinc(x) = sin(x)/x. This is a challenging function to predict, as it is neither periodic nor monotonic. It would come as a positive surprise if the evolved networks managed this task successfully.

• Noisy aperiodic or pseudo-random function: an excerpt of the NASDAQ composite index's historical weekly values is used as an interesting and challenging test signal. An ability of the networks to predict this stock market index would be highly surprising.

Apart from running both tasks on the three test signals, the following questions shall be addressed:

• Do we always arrive at the same network for the same problem? In

order to answer this question, several trials of the same test shall be run. If 

the same network topologies are arrived at most of the time, it implies that

the evolutionary strategy converges to a local or global optimum, which is

desirable.

• How fit are the networks? As mentioned above, there exists a maximum

fitness that a network may achieve. How fit, relative to the maximum

achievable fitness, are the evolved networks?

Tabular Summary

The section below describes important aspects of the individual tests in detail. In addition, the tables below summarise the results.

Criteria (5 inputs)   Sin(x)   Sinc(x)   NASDAQ
Fitness               17%      15%       2%
%-Error               36%      86%       48%

Table 1: Summary of the tests for 5 inputs

Criteria (10 inputs)   Sin(x)   Sinc(x)   NASDAQ
Fitness                95%      15%       2%
%-Error                0%       113%      44%

Table 2: Summary of the tests for 10 inputs

Criteria (20 inputs)   Sin(x)   Sinc(x)   NASDAQ
Fitness                99%      15%       3%
%-Error                0%       95%       27%

Table 3: Summary of the tests for 20 inputs

Test Details

Sinusoidal

 The neural networks evolved are able to predict the sinusoidal signals with

acceptable accuracy, provided they receive enough inputs. Figure 3 below

illustrates the evolution process over several trials. Each point on the graph

represents an improved network topology over the previous one.

Figure 3: Fitness evolution for different trials for Sin(x) - Clear fitness improvement

As with the fitness evolution, the error performance of the evolved networks

improves. Figure 4 below illustrates how the %-error in the long term prediction

decreases in general with each generation.


Figure 4: Error evolution for different trials for Sin(x) - Clear performance improvement

Comparing the size of the network (number of neurons in the hidden layers) to the network's performance, it appears that there exists a certain range of 'right' sizes which allows a network to achieve high fitness. Another way of looking at this is that the network needs a certain minimum complexity (in terms of number of neurons) to be adequate for the task at hand. Below that critical size, it is unlikely that a network can achieve high fitness.

Figure 5: Size does matter – a network needs a certain minimum size or complexity to achieve high fitness

8/8/2019 Artificial Neural Network Based Curve Prediction Documentation

http://slidepdf.com/reader/full/artificial-neural-network-based-curve-prediction-documentation 9/13

Sinc(x)

Unlike in the previous case with sin(x), the evolutionary approach does not generate sufficiently fit networks to perform long term prediction on the sinc(x) function. Figure 6 and Figure 7 outline the evolutionary performance over several trials. They show clearly that the evolutionary approach works in principle, i.e. networks are evolving and improving; however, predicting sinc(x) seems to be too difficult a task for the simple feed-forward perceptrons employed here. The network evolution appears to hit a performance limit at about 15% fitness and 90% %-error.

Figure 6: Fitness evolution for different trials for Sinc(x) - No clear fitness improvement


Figure 7: Error evolution for different trials for Sinc(x) – The error decreases, but is still unacceptably high

NASDAQ

Attempting to perform long term prediction on the NASDAQ is ambitious. Here, the evolutionary approach again works to some extent, as it is able to generate networks with improving performance over several generations; however, the network model and its complexity are again not able to cope with the challenge posed by the NASDAQ. Figure 8 and Figure 9 summarise the network evolution over several trials.


Figure 8: Fitness evolution for different trials for the NASDAQ - Too difficult for the networks


Figure 9: Error evolution for different trials for the NASDAQ - Error performance improves, but it is still too high

General Comments

• The networks' performance is closely tied to randomness in the initialisation and in the success of the training. The training method employed, backpropagation, is not guaranteed to achieve a satisfactory level of training, and does not necessarily find the globally optimal parameters for the network. To mitigate this, we repeated the training of each network several times to increase the likelihood of obtaining a well-trained network. This, however, increased the computational load manyfold, to an impractical degree, and is therefore not an adequate remedy for the approach's sensitivity to randomness and initial conditions.
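A sketch of this repeated-training remedy; the Network interface, the factory, and the retry loop below are hypothetical stand-ins for the actual implementation:

```java
import java.util.function.Function;

// Minimal stand-in for the (hypothetical) network API used below.
interface Network {
    void train(double[] trainingSet);
    double fitness();
}

// Sketch: retrain a topology several times from fresh random weights and
// keep the best result; the computational cost grows linearly with retries.
class RepeatedTraining {
    static Network bestOfRetrainings(Function<boolean[], Network> factory,
                                     boolean[] genome, double[] trainingSet, int retries) {
        Network best = null;
        double bestFitness = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < retries; i++) {
            Network net = factory.apply(genome);   // fresh, randomly initialised network
            net.train(trainingSet);                // one backpropagation training run
            double f = net.fitness();
            if (f > bestFitness) { bestFitness = f; best = net; }
        }
        return best;
    }
}
```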

Conclusion

Our aims were:

• To develop an evolvable artificial neural network representation

• To optimise such a neural network to solve a number of curve prediction tasks

• To evaluate the ability of an evolutionary approach to evolve suitable neural networks for a given task

 The performance of an artificial neural network for a given problem is inherently

tied to the topology of the network, and it might prove difficult to produce


satisfactory results. Thus, we have set out to create a neural network design

framework that replaces the designer’s experience with an evolutionary strategy.

We have developed a bit string representation that corresponds to the genetic makeup of a multi-layer perceptron network, allowing an optimisation of the following network parameters: the number of hidden layers, the number of neurons in the hidden layers, and the transfer function employed by the neurons in each layer.

Using a simple evolutionary strategy, we tried to optimise the network topology for the tasks at hand by optimising the genetic makeup of a generation of networks. In principle, this approach has proven valid, and we have demonstrated the evolution of networks that predict a sine function. For more complex functions such as sinc(x) or the NASDAQ, the evolutionary approach worked, although it was limited by the network model's inherent ability to predict complex patterns.

We are optimistic about the approach to evolve neural network topologies for given

tasks, and there are several aspects that could be improved or further investigated.

In particular, we suggest the following:

• Within our simulations, we had to limit the search space by limiting the

allowed size of the networks and their transfer functions. A more extensive

investigation into the evolutionary approach could include other non-linear

transfer functions, larger networks as well as more interconnected or feed-

back networks.

• A major limiting computational factor was the learning rule employed

(backpropagation). It would be worthwhile to consider other learning rules.

• In our approach, we 'allow' evolution to develop networks of all sizes and shapes uniformly. It might be more fruitful to instead constrain the evolution to 'organic growth', i.e. to start with small networks and grow them by evolution.

• We have used our evolutionary strategy for function prediction only. It would be worthwhile to apply this method to other common ANN tasks, such as image recognition.