Predicting Stock Trending in a Financial Market with
Neural Networks and Genetic Algorithms
GRADUATE PROJECT TECHNICAL REPORT
Submitted to the Faculty of
the Department of Computing and Mathematical Sciences
Texas A&M University-Corpus Christi
Corpus Christi, Texas
in Partial Fulfillment of the Requirements for the Degree of
Master of Science in Computer Science
by
Brian H. McCord
Summer 2003
Committee Members
Dr. Michelle Moore
Committee Chairperson
Dr. Mario Garcia
Committee Member
Dr. Dulal Kar
Committee Member
ABSTRACT
The goal of this project was to predict stock trends based on data from a financial
market. This project originated from two ideas: that the human brain has a well-defined
structure; and that a financial market has a state and some rule of evolution. The system
in this project has two neural networks, which use a genetic algorithm to learn concepts.
Neural networks and genetic algorithms are two different optimization methods that
may be used, either separately or together, in many applications where other methods
have less success.
The assumption was that at a moment of time, two things are known: the value of
the data series at that time; and the state of the market given by its history. These values
acted as variables of input for two neural networks: one predicted the next value of the
data series; and the other predicted the next state of the market.
The neural network system was tested on various stocks, where predicted trend
line slopes were compared to actual trend line slopes. The overall results were good and
showed that the accuracy of predictions depended on parameter settings of the genetic
algorithm.
TABLE OF CONTENTS
Abstract
Table of Contents
List of Figures
List of Tables
1. Introduction and Background
2. Predicting Stock Trending with Neural Networks and Genetic Algorithms
   2.1 Technical Analysis in a Stock Market
      2.1.1 Determining an Uptrend
      2.1.2 Determining a Downtrend
   2.2 Architecture of Neural Networks
   2.3 Method of Neural Networks
   2.4 Learning in a Neural Network
   2.5 Summary
3. System Design
   3.1 Financial Data Analysis
      3.1.1 Obtaining Financial Data
      3.1.2 Quality Data
      3.1.3 Smoothing Data
      3.1.4 Data Organization
   3.2 Training by Backpropagation
      3.2.1 Backpropagation Phases
      3.2.2 Backpropagation Through Time
      3.2.3 System Topology of Backpropagation
   3.3 Partitioning the Data Series
   3.4 Determining Weights
   3.5 Managing Prediction Error
   3.6 Initializing Process
   3.7 The Ability of Randomness
   3.8 Programming Environment
      3.8.1 Programming Applications
      3.8.2 Input Files
4. Evaluation and Results
   4.1 Test Methods
   4.2 Theoretical Experimentation
      4.2.1 SINE Data
      4.2.2 Lorenz Data
   4.3 Empirical Experimentation
      4.3.1 Wells Fargo
      4.3.2 Genentech, Incorporated
      4.3.3 IBM Corporation
      4.3.4 Exxon Mobil Corporation
      4.3.5 HCA, Incorporated
      4.3.6 AOL Time Warner
      4.3.7 Wal-Mart Stores
      4.3.8 Intel Corporation
      4.3.9 Microsoft Corporation
      4.3.10 FedEx Corporation
5. Conclusion
6. Future Work
Bibliography and References
Appendix
LIST OF FIGURES
Figure 2.1. Charting an Uptrend
Figure 2.2. Charting a Downtrend
Figure 2.3. Neural Network Connections
Figure 2.4. The Hidden Layer
Figure 2.5. Simple Time Series Prediction
Figure 2.6. Multiple Time Series
Figure 2.7. Noisy Data
Figure 3.1. Flat Trends
Figure 3.2. Spikes Causing Noise
Figure 3.3. Calculating a Moving Average
Figure 3.4. Smoothing Data
Figure 3.5. Recurrence
Figure 3.6. Topology of Backpropagation
Figure 3.7. Partitioned Data
Figure 3.8. Neural Network Architecture
Figure 3.9. Encoded Weight Topology
Figure 3.10. Tournament Selection
Figure 3.11. Single-Point Crossover
Figure 3.12. Evolution Cycle
Figure 3.13. Error During Prediction
Figure 4.1. SINE Input Data
Figure 4.2. SINE Prediction
Figure 4.3. Lorenz Input Data
Figure 4.4. Lorenz Prediction
Figure 4.5. WFC Input Data
Figure 4.6. WFC Best Predicted Trend
Figure 4.7. DNA Input Data
Figure 4.8. DNA Best Predicted Trend
Figure 4.9. IBM Input Data
Figure 4.10. IBM Best Predicted Trend
Figure 4.11. XOM Input Data
Figure 4.12. XOM Best Predicted Trend
Figure 4.13. HCA Input Data
Figure 4.14. HCA Best Predicted Trend
Figure 4.15. AOL Input Data
Figure 4.16. AOL Best Predicted Trend
Figure 4.17. WMT Input Data
Figure 4.18. WMT Best Predicted Trend
Figure 4.19. INTC Input Data
Figure 4.20. INTC Best Predicted Trend
Figure 4.21. MSFT Input Data
Figure 4.22. MSFT Best Predicted Trend
Figure 4.23. FDX Input Data
Figure 4.24. FDX Best Predicted Trend
LIST OF TABLES
Table 3.1. Classes
Table 3.2. Input Data
Table 4.1. Tested Stocks
Table 4.2. WFC Prediction Results
Table 4.3. DNA Prediction Results
Table 4.4. IBM Prediction Results
Table 4.5. XOM Prediction Results
Table 4.6. HCA Prediction Results
Table 4.7. AOL Prediction Results
Table 4.8. WMT Prediction Results
Table 4.9. INTC Prediction Results
Table 4.10. MSFT Prediction Results
Table 4.11. FDX Prediction Results
Table 4.12. Best Tests
1. INTRODUCTION AND BACKGROUND
It is easy to find experts in various aspects of investing from whom to acquire
knowledge, but few regularly offer reliable knowledge to the public. One reason is that
the complexity of making investment decisions is such that the techniques involved are
rarely consistent. In addition, the knowledge of human experts is usually subjective and
limited. To overcome such limitations, knowledge acquisition from a machine-learning
system may be used. Machine learning can be used to automate the creation of
investment rules from inputs associated with a financial market, such as price-earnings
ratios, open prices, and close prices [Lisboa 2000]. Machine learning can also be used to
produce stronger trading rules. Some of the most popular approaches to machine
learning include neural networks and genetic algorithms. However, it should be
remembered that human experts remain an important medium from which one can find
fundamental and analytical knowledge. Therefore, their knowledge will always be
considered more important than machine-learned knowledge.
Neural network technology has some advantages over conventional expert system
approaches in some applications. First, since neural networks do not require that
knowledge be formalized, they are appropriate for domains where knowledge is scarce.
In this sense, a neural network may replace a rule-based system [Lisboa 2000].
The study of artificial neural networks originated from efforts to simulate the
functioning of the human brain in the areas of learning and problem solving. When a
person experiences an unfamiliar event, the brain makes certain considerations and
generalizations within the scope of previous and stored experiences. With this
functionality, attempts can be made to produce educated guesses [Nelson 1991]. An
example of this is demonstrated when an investor observes a stock’s price drop at the
beginning of every year with a strong period of recovery during the remainder of the
year. If the investor makes this observation year after year, and finally purchases the
stock during an early annual price drop, the investor’s expectations from past experiences
will be for the stock price to rise, thereby creating a profitable return later in the year
when the stock is sold.
Neural networks are a form of computer programming inspired by biology. They
are designed to mimic the functions of the human brain. The fundamental idea of neural
networks is based on the nerve cell called a neuron. Each neuron has connective strings
on each end. The connectors on one end are the dendrites, which carry signals into the
neuron, and the axon carries signals out through the connectors on the other end. The
point at which these signals are fired from the axon of one neuron to the dendrites of
another is called a synapse. The signals are binary, where they are either sent or not sent
[Nelson 1991]. Many different signals pass through the synapses. A neuron sums its
incoming signals, and if the sum exceeds a preset threshold value, the neuron fires a
signal across the synapse to another neuron [Welstead 1994].
The human nervous system is composed of a complex network of neurons. The
key to human behavior and thoughts is embedded in these networks. This enables the
human brain to simultaneously perform many tasks, otherwise known as parallel
processing [Baddeley 2000]. This basic structure forms the foundation of neural
networks where software can be programmed with linkages of nodes, inputs, and outputs.
The term node refers to the neural network equivalent of a neuron. Nodes have
input signals, which correspond to dendrites, and one output signal, which corresponds to
the axon. The input signals are assigned weights and summed at the node before
producing output. Initially, input signals may be assigned weights randomly. As the
network learns, these weights may be adjusted [Welstead 1994]. Many methods exist for
making these adjustments, and the procedure is typically known as the summation
function.
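The summation function described above can be sketched in code. The following is a minimal illustration of a single node with a step threshold; the function name and the choice of activation are assumptions for demonstration, not taken from the project:

```python
def node_output(inputs, weights, threshold=0.0):
    """Compute a node's output: a weighted sum of the inputs,
    fired through a simple step activation. The node "fires"
    (outputs 1) only if the weighted sum exceeds the preset
    threshold, mirroring the biological neuron described above."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0
```

For example, with inputs [0.5, 1.0] and weights [0.8, -0.2], the weighted sum is 0.2, which exceeds a zero threshold, so the node fires.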
Compared to the statistical models used by conventional systems, neural networks
differ markedly. They are especially suited for simulating intelligence in pattern
detection, association, and classification activities. They have also gained much interest
in economic decision-making. For example, financial organizations are second only to
the U.S. Department of Defense in sponsoring research in neural networks [Nelson
1991]. Although there are many applications of neural networks, investment problems
remain a difficult challenge. One alluring problem is predicting stock trends in a financial
market.
One of the first systems for predicting a financial market used many statistical
methods. Currently, there are many references on this subject, as well as several
companies that produce these statistical systems to make such predictions. However,
with advances in neural network research, statistical methods are no longer regarded as
the primary technique for predicting financial markets [Lisboa 2000].
Nevertheless, neural networks also have their weaknesses. One weakness is that
they might find some factors to be important for decision-making when those factors are
actually irrelevant or conflict with traditional theories. This can occur because neural
networks are entirely restricted by their data. Since the scope of training is always limited by
economics and time, networks that contradict theory are at risk of functioning well only
on data with structure similar to their training data. Another potential problem is that
most neural networks cannot guarantee an optimal solution to a problem. On the other
hand, if the neural network is properly configured and trained, it usually can provide
correct results [Lisboa 2000].
Neural networks must be provided with a method of learning. Typically, a
learning rule is applied for updating the connection strength between the nodes in a
neural network. One learning method with much left to be explored is the application
of a genetic algorithm. Employing a genetic algorithm is a distinctive alternative and
might provide a more effective means for neural networks to learn.
Genetic algorithms are software procedures modeled after genetics and evolution.
They are designed to efficiently search for optimal solutions to large problems. The
search proceeds in survival-of-the-fittest fashion by gradually manipulating a population
of potential problem solutions until the most superior ones dominate the population.
In living organisms, cells contain important groupings of special material
containing hereditary information. These structures are called chromosomes. Genes are
specific factors that are carried by chromosomes. The basic process of coding
information within genes and chromosomes is fairly simple, but the possible results are
almost limitless in number [Rees 1997].
The genetic makeup of each organism is called its genotype. The genotype
determines and limits many aspects of development and survival such as how an
organism responds to normal and abnormal environmental conditions [Rees 1997].
During the process of mating, the members of each chromosome pair exchange
genetic material in a technique known as crossover. Sometimes the copying of genetic
material results in slight imperfections known as mutation, which leads to additional
diversity in a population [Rees 1997].
These concepts provide an important environment for the specific procedures
used in genetic algorithms. To begin using a genetic algorithm as a problem-solving tool,
the problem must be represented in a manner that a genetic algorithm can work with.
This means representing the parameters of the problem solution using a string of digits,
usually binary. This representation has a biological parallel where the bit string can be
viewed as a chromosome-type structure, in which the 0s and 1s represent genes within
the chromosome.
The steps in a genetic algorithm also have some biological similarities. A genetic
algorithm begins by creating a population of potential problem solutions. Next, the
fitness of each individual in the population is calculated and each is assigned a numerical
value. Better-fit individuals are then selected to become parents of the next generation.
Finally, a second generation is formed from these selected parents by means of a
crossover process such as a random exchange of bits. Mutation is another possible
genetic operation that could be performed, where a random string bit is switched to its
opposite. Obviously, this operation would not be used very often but does introduce
some diversity into the genetic algorithm and decreases the chance for premature
convergence, where all the individuals in a population attain the same level of fitness too
early causing evolution to halt before an acceptable solution is found. The process of a
genetic algorithm repeats from the step of calculating fitness through the process of
mating until the population fully converges. Full convergence occurs when all strings
are identical in all bit positions, a state generally regarded as a near-optimal solution
[Goldberg 1989].
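The generational cycle above can be sketched in code. The population size, string length, selection scheme, and mutation rate below are illustrative choices for demonstration, not values prescribed by the text:

```python
import random

def evolve(fitness, length=16, pop_size=20, generations=50,
           mutation_rate=0.01, seed=42):
    """Sketch of the genetic-algorithm cycle described above:
    evaluate fitness, select better-fit individuals as parents,
    recombine them by single-point crossover, and occasionally
    mutate a bit to preserve diversity."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness-based selection: the better-fit half become parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, length)      # single-point crossover
            child = a[:point] + b[point:]
            for i in range(length):               # rare mutation: flip a bit
                if rng.random() < mutation_rate:
                    child[i] = 1 - child[i]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# Example problem: maximize the number of 1-bits in the string.
best = evolve(fitness=sum)
```

With the bit-counting fitness function, the population quickly comes to be dominated by strings that are all, or nearly all, ones.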
Both neural networks and genetic algorithms are patterned after nature. Neural
networks mimic brain functions, and genetic algorithms mimic the process of evolution.
The rationale behind both methods is simply to learn from nature’s efficiency. A neural
network must be trained by presenting it with a test set of data. It then makes predictions,
evaluates the results, and repeats over and over. Genetic algorithms involve multiple
generations of a large population. Calculations for each individual of the population are
performed repeatedly as the population undergoes optimization changes. This work
evaluates how well a neural network can learn from a genetic algorithm and make
accurate predictions in real-world systems.
In this project the real-world system was a financial market. The topic has been
chosen because of the author’s personal and career experiences in finance and economics.
By combining these experiences with knowledge gained from researching neural
networks and genetic algorithms, an opportunity was provided to design a method for
predicting stock trends.
2. PREDICTING STOCK TRENDING WITH NEURAL
NETWORKS AND GENETIC ALGORITHMS
This project involved the development of neural networks and a genetic algorithm
to predict stock trending in a financial market. Because this was an attempt to predict a
financial stock trend line, it essentially became a model for time series forecasting.
Neural networks that performed the prediction were trained using a backpropagation
technique and a genetic algorithm. As a learning error was calculated in one neural
network, a second neural network determined the state of the market at each instance of
input.
2.1 Technical Analysis in a Stock Market
The stock market can be difficult to understand regardless of one’s level of
experience. Stock brokers, advisory letters, experts, and the media are all sources of
information that share their opinions, but have different points of view. Technical
analysts, on the other hand, have devised methods of creating and interpreting stock
charts, which do not use the fundamentals of the sources mentioned above [Pistolese
1994]. In this project, the neural network system was designed and implemented to
predict stock trends, in which technical analysis of the predicted data created trend
patterns similar to technical analysis of the real data.
2.1.1 Determining an Uptrend
The price of any stock fluctuates in small and quick movements, which creates
short-term top and bottom points over a period of time. When stock prices are charted
and the successive bottoms are higher than preceding bottoms, the price is in an uptrend.
This can be seen in Figure 2.1. The graph shows that bottom B is higher than bottom A,
and bottom C is higher than bottom B. After bottom B was established, the uptrend
began. The uptrend was confirmed when bottom C occurred. Once bottoms A and
B have been created on the chart, an uptrend line can be drawn. Whenever subsequent
bottoms occur at or near this line, the uptrend is reconfirmed [Pistolese 1994].
Figure 2.1. Charting an Uptrend
2.1.2 Determining a Downtrend
When charted stock prices show that successive highs are lower than preceding
highs, the stock is in a downtrend, as seen in Figure 2.2. Here it can be seen that point B
is lower than point A, and point C is lower than point B. The downtrend was created as
soon as the downtrend line could be drawn from A to B. Next, point C confirms the
downtrend. When any subsequent top point occurs at, near, or below this trend line, the
downtrend is reconfirmed. In most situations the subsequent tops will not even come
near the line because downward trends tend to accelerate as they proceed [Pistolese
1994].
Figure 2.2. Charting a Downtrend
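The uptrend and downtrend rules above can be expressed as a simple check on a chart's successive extrema. The function below is a hypothetical sketch, not part of the project's system: it is applied to a series of bottoms (for uptrends) or a series of tops (for downtrends), such as points A, B, and C in the figures:

```python
def trend_from_extrema(points):
    """Classify a trend from successive chart extrema: strictly
    rising bottoms indicate an uptrend, and strictly falling
    tops indicate a downtrend (points A, B, C, ...)."""
    rising = all(a < b for a, b in zip(points, points[1:]))
    falling = all(a > b for a, b in zip(points, points[1:]))
    if rising:
        return "uptrend"
    if falling:
        return "downtrend"
    return "no clear trend"
```

For example, successive bottoms of 10, 12, and 15 reconfirm an uptrend, while successive tops of 30, 27, and 22 reconfirm a downtrend.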
2.2 Architecture of Neural Networks
Neural networks are made up of many simple processors, all of which are
programmed to perform the same basic task with each having a small local memory.
Each processor can be referred to as a node. An individual node has one output but more
than one input. Outputs of one node are essentially inputs to other nodes of the network.
In addition, an output may be fed back as an input to the same node. This is known as a
feedback system, which is commonly used in neural networks. On the other hand, a
feedforward network does not contain any feedback connections [Nelson 1991].
In most neural networks the processing that takes place is rather simple. The
process takes a weighted sum of the inputs and calculates an output value that is a
function of that sum. The node’s local memory stores interconnected parameters (or
weights), and when many nodes are linked together a neural network is created. The
strength of the connection between the nodes determines the network's ability to
correctly generalize. The basic model below in Figure 2.3 demonstrates the connectivity
of a simple neural network.
Figure 2.3. Neural Network Connections
In this model, weights are represented as lines connecting the input nodes and the
summation box. First, the input nodes contain values that are presented to the network.
The summation function collects the weight values and the input values. Then the output
node provides a generalization outcome.
The pattern of interconnected nodes is the primary distinguishing factor between
most neural networks. In most patterns, networks are arranged in layers (input, output,
and hidden layers) operating in synchrony with one another. In theory, one
hidden layer is sufficient for expressing a nonlinear relationship between
the input and output nodes. In this project a multilayered feedforward network was used
to deal with the dynamics of a financial market [Hecht-Nielsen 1990].
A hidden layer can be seen in Figure 2.4 below. The nodes in the middle layer
are known as hidden, because they do not receive direct input from the real world and
they do not produce direct output outside the network [Lisboa 2000].
Figure 2.4. The Hidden Layer
Because hidden layers contain some of the knowledge within a network, they are an
important component. There are no general rules concerning an appropriate number of
hidden layers and there are no general rules about the manner in which they should be
connected. They commonly function as filters for noisy data as information moves
through the network [Nelson 1991]. Hidden nodes were vital in this project when
financial input data experienced sharp fluctuations and anomalies. These had to be
recognized and understood in order for the neural networks to determine overall trending.
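The layered arrangement in Figure 2.4 can be sketched as a forward pass through one hidden layer. The sigmoid activation, layer sizes, and weight values below are illustrative assumptions, not the project's actual configuration:

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """Forward pass through a fully connected network with one
    hidden layer. Each hidden node takes a weighted sum of the
    inputs and applies a sigmoid; the single output node does
    the same over the hidden activations."""
    sigmoid = lambda s: 1.0 / (1.0 + math.exp(-s))
    hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws)))
              for ws in hidden_weights]
    return sigmoid(sum(h * w for h, w in zip(hidden, output_weights)))

# Two inputs, three hidden nodes, one output (all weights illustrative).
y = forward([0.2, 0.7],
            [[0.5, -0.3], [0.1, 0.8], [-0.4, 0.6]],
            [0.9, -0.2, 0.4])
```

The hidden nodes never touch the outside world directly; they only transform what the input layer passes to them, which is what lets them act as filters for noisy data.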
2.3 Method of Neural Networks
The neural networks operated much like a black box where inputs and desired
outputs were described, an initial guess was made at a network structure, and then
experimentation took place. The neural networks had more than one input and exactly
one output. The goal was to predict one point ahead in a single time series. Figure 2.5
shows the typical mapping of how a time series was used with the neural networks for
prediction.
Figure 2.5. Simple Time Series Prediction
The figure demonstrates the use of six adjoining points in a time series to predict
the next point. A training series was used to generate a large number of individual
samples [Welstead 1994]. A sample consisted of seven points: the current point 0; five
historical points of 1 through 5; and the predicted point to be used for directing the
training of the output.
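The seven-point sampling scheme above can be generated by sliding a window over the series. The function name and data structure below are illustrative:

```python
def make_samples(series, history=5):
    """Slide a window over the series: each sample pairs the
    current point plus `history` preceding points with the next
    point, which serves as the training target."""
    samples = []
    for t in range(history, len(series) - 1):
        window = series[t - history:t + 1]  # five historical points + current
        target = series[t + 1]              # the point to be predicted
        samples.append((window, target))
    return samples

samples = make_samples([10, 11, 13, 12, 14, 15, 16, 18])
```

Here the first sample is the six adjoining points [10, 11, 13, 12, 14, 15] paired with the target 16, and a training series much longer than this toy example would yield a large number of such samples.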
Time series predictions are typically made using points from a single time series
as predictors [Welstead 1994]. An example using a stock market model is shown in
Figure 2.6 below. The figure demonstrates how several different time series are
combined into a future price. The series in the project were determined through trial and
error, but the final technique was similar to this example. Here, the series are the close
prices and respective trading volumes of a stock at five different time periods.
Figure 2.6. Multiple Time Series
2.4 Learning in a Neural Network
In a multilayered feedforward network, the output nodes detect the errors. In this
project, these errors were propagated back to the nodes in the previous layer, and the
process was repeated until the input layer was reached. An effective algorithm that learns
in this manner, and was used here, was the backpropagation algorithm. This allowed an
incremental adjustment of weights to reduce the errors of prediction [Nelson 1991].
Once an acceptable set of weights was reached, the network ran in a feedforward mode to
organize new cases and make predictions.
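The incremental adjustment of weights can be illustrated with the delta rule for a single linear node; full backpropagation repeats this update layer by layer, passing the error backward. The learning rate, inputs, and target below are illustrative values, not the project's:

```python
def update_weights(weights, inputs, target, prediction, rate=0.1):
    """Nudge each weight in proportion to the prediction error and
    its input (the delta rule); repeated over many presentations,
    this incrementally reduces the error of prediction."""
    error = target - prediction
    return [w + rate * error * x for w, x in zip(weights, inputs)]

# Repeated adjustment drives a linear node's output toward the target.
weights = [0.0, 0.0]
for _ in range(100):
    prediction = sum(w * x for w, x in zip(weights, [1.0, 2.0]))
    weights = update_weights(weights, [1.0, 2.0], 1.0, prediction)
```

After the loop, the node's output for the input [1.0, 2.0] has converged to the target of 1.0.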
Sometimes a difficult issue with neural networks is that during the learning
process, the measured error might not decrease in a consistent manner. This means that it
is not always easy to decide when to stop the learning phase. This project determined the
stopping point based on performance of the test data [Lisboa 2000].
The neural networks had an error associated with each prediction. The accuracy
of the system was determined by this error. Because future data is unknown, known
historical data was used for the entire neural network process. This historical data was
partitioned into three sections so the neural networks could compute the error of
prediction and learn.
The first set was used to train the neural networks. Evaluating the accuracy of
predictions was performed by the second set. These first two sets were used to compute
the error of prediction. Finally, the third set was used to test the results.
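The three-way partition can be sketched as follows. The split fractions are illustrative assumptions, since the text does not specify the relative sizes of the sections:

```python
def partition(series, train_frac=0.70, eval_frac=0.15):
    """Split known historical data into three sections: one to
    train the networks, one to evaluate prediction accuracy
    during learning, and a final held-out section for testing."""
    n = len(series)
    a = int(n * train_frac)
    b = int(n * (train_frac + eval_frac))
    return series[:a], series[a:b], series[b:]

train, evaluate, test = partition(list(range(100)))
```

The first two sections drive the error-of-prediction computation during learning; the third is never seen until final testing.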
For input, the neural networks used historical values pertaining to stocks. An
important issue considered was that many data series are vulnerable to noise. When this
occurs, prediction results can be adversely affected. In order to produce better
results, real data was sometimes transformed into smooth data. This was extremely
important when dealing with stocks experiencing rapid price changes, or unstable
markets that have suffered as an indirect result of major world events. Figure 2.7 shows
an example of noisy data, but closer observation shows that the overall trend is moving
upwards. Detecting the overall movement was a critical aspect of the success of the
neural network system.
Figure 2.7. Noisy Data
In order to produce high-quality predictions, the connections (or weights) in the
neural network were identified for the system. However, instead of having a basic and
traditional mathematical function for determining weight adjustments for connectors, a
genetic algorithm was used. The genetic algorithm searched for the best possible weights
in order to strengthen connectors and assist the neural network in maintaining a low error
of prediction.
2.5 Summary
In summary, building a neural network involved many steps. The system in this
project was a fully connected, multilayered, feedforward network. In addition, a
backpropagation technique was used for training while a genetic algorithm determined
the best parameter values for the neural networks to produce as little error as possible.
3. SYSTEM DESIGN
3.1 Financial Data Analysis
The first step in conducting this research was to study financial trend patterns to
determine the key factors affecting them. This helped determine the type of input that
needed to be provided to the neural network system. Each issue that was discovered was
an important factor in adjusting input data in order for the neural network to perform.
Incorporating the main issues that cause financial trend lines to change gave the neural
network a better opportunity to produce accurate forecasts.
3.1.1 Obtaining Financial Data
One reason for displaying financial data, such as closing prices, in graphical form
is to make it easier to interpret stock trends and suspected price anomalies. In this
project, all input data was viewed graphically prior to being downloaded. This
pre-analysis of input data helped establish proper testing standards and parameters.
effectively determine stock trends and discover any anomalies, it was advantageous to
obtain historical data for long periods of time [Hecht-Nielsen 1990]. Likewise, the
neural network needed a sufficient amount of data in order to be tested effectively. Daily
data was obtained for several stocks spanning 13 months. This provided the system
with one year of data for training and evaluation, and one month of data for testing the
prediction.
Since reporting systems make use of financial data on individual stocks,
companies, and industries, the data is maintained in many public databases. As a result
of the widespread availability of electronic data sources, investment databases were
accessible for downloading large amounts of financial data. The data source that proved
the easiest to manipulate and the most efficient was the financial database of Yahoo.com
located on the Internet at http://finance.yahoo.com.
3.1.2 Quality Data
It is not necessary to have an approach for dealing with incomplete data because
data gaps do not exist in a securities market. When the market is closed to stock trading,
it is closed to all stocks. Even if a stock is dormant it is still available for trade provided
the market is open. Therefore, seemingly inactive data remains vital to the neural
network where flat trends sometimes occur, because all data is active in a stock market as
long as time is elapsing. A flat trend is illustrated in Figure 3.1.
Figure 3.1. Flat Trends
However, financial market data does have the tendency to quickly spike upward
and downward because price changes around a trend are somewhat random. Such spikes
in the data created a disturbance, otherwise known as noise, in the neural network’s
attempt to train. These spikes can be seen below in Figure 3.2. Predicting these notably
sharp changes around the trend was not possible because the noise can mistakenly be
interpreted as random data and trending cannot be identified [Lisboa 2000]. When the
neural network was tested with noisy data, it was more effective to first smooth any data
spikes.
Figure 3.2. Spikes Causing Noise
3.1.3 Smoothing Data
Moving-averages were used to smooth a data series and make it easier to find
trends amongst noisy data series. Because of the ability to locate trends, moving-
averages are common tools employed by technical analysts of financial markets
[Pistolese 1994]. Price and trading volume data are usually displayed in graphs showing
stock price trend lines, moving-average curves of stock prices, and moving-average
curves of trading volume. A moving average is created by calculating the average price
of a stock over a predetermined number of time periods. When dealing with stock prices,
the closing price is commonly used to compute the moving average. For the stock trends
in this project, data on closing prices and average prices were applied. For example, a
five-day moving average is calculated by adding the closing prices of the last five days
and dividing the total by five. A moving average moves because as the newest period is
added, the oldest period is discarded [Pistolese 1994]. If the next closing price in the
series is 32, then the new period is added, the oldest day, at a price of 22, is removed, and
a new five-day moving average is calculated. Figure 3.3 illustrates the concept of
calculating a moving average.
Figure 3.3. Calculating a Moving Average
The only disadvantage of moving averages in a financial market is that they lag
behind real-time market prices [Pistolese 1994]. Because lagging is a well-known
characteristic of moving averages, financial analysts emphasize them only for smoothing
data series and illustrating trends. As previously discussed, financial markets are
sensitive to many real-world events, which frequently create sharp changes in trend lines.
Day   Daily Close Price   5-day Moving AVG
1     22
2     24
3     26
4     25
5     23                  24
6     32                  26
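The calculation illustrated in Figure 3.3 can be sketched in C++ as follows. This is a minimal illustration; the function name and interface are hypothetical and do not come from the project's code.

```cpp
#include <cstddef>
#include <vector>

// Simple moving average: for each day from (period - 1) onward, average the
// closing prices of the current day and the (period - 1) days before it.
// Earlier days have no value, matching the blank cells in Figure 3.3.
std::vector<double> movingAverage(const std::vector<double>& closes, int period) {
    std::vector<double> avgs;
    double windowSum = 0.0;
    for (std::size_t i = 0; i < closes.size(); ++i) {
        windowSum += closes[i];
        if (i >= static_cast<std::size_t>(period)) {
            windowSum -= closes[i - period];   // drop the oldest period
        }
        if (i + 1 >= static_cast<std::size_t>(period)) {
            avgs.push_back(windowSum / period);
        }
    }
    return avgs;
}
```

Applied to the closing prices 22, 24, 26, 25, 23, and 32 with a five-day window, it produces the averages 24 and 26.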
Because the neural network perceives these sharp changes as noise, the concept of
moving averages was applied to the training of the neural network system whenever
necessary. Neural networks are designed to learn from past data and are therefore
unaffected by lag time, which made moving averages a good choice for smoothing sharp
anomalies. When smoothing was completed, the resulting data appeared more like the
smoother dotted line in Figure 3.4.
Figure 3.4. Smoothing data
3.1.4 Data Organization
It was critical that the organization of data be consistent for all input and output
files in order to always execute the program code successfully [Welstead 1994]. For the
neural network system, a text file was created for input and contained a data series
arranged in columns and separated by white spaces. A text file was also created for the
genetic algorithm. It contained information vital to the genetic algorithm, such as the
population size and mutation rate, and it also contained numbers representing neural
network values for training time, evaluating time, testing time, and the number of
neurons. Also included were data file dimensions with each dimension’s level of
importance.
3.2 Training by Backpropagation
The network system in this project contained several input nodes, one hidden
layer, and a single output node. Input variables were classified as closing prices and
trading volume at daily intervals of time. The output values were produced as a data
series of predicted closing prices with time intervals identical to the input. The neural
networks learned by using a backwards propagation of error known as backpropagation.
Basically, this is a feedforward technique, which calculates the difference between
predicted outputs and desired outputs from each iteration.
3.2.1 Backpropagation Phases
Backpropagation training was composed of three phases. The first phase
provided input to the neural networks and moved forward from the hidden layer to the
output layer. The next phase determined the difference (error) between the desired output
and the output produced by the neural network in the output layer [Nelson 1991]. Third,
the weight of each connection was adjusted in proportion to the error previously
calculated. Therefore, after this third step most weights had a different value. An
explanation of determining weights will appear later in section 3.4.
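The three training phases can be sketched for a small network with one hidden layer. This is an illustrative sketch only: the class name, sizes, learning rate, and sigmoid activation are assumptions for the example, not details taken from the project's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct TinyNet {
    std::vector<std::vector<double>> wHidden; // weights: input layer -> hidden layer
    std::vector<double> wOut;                 // weights: hidden layer -> output node
    double rate = 0.1;                        // learning rate (illustrative value)

    static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // Phase 1: move the input forward from the input layer through the
    // hidden layer to the single output node.
    double forward(const std::vector<double>& in, std::vector<double>& hidden) const {
        hidden.clear();
        for (const std::vector<double>& row : wHidden) {
            double sum = 0.0;
            for (std::size_t i = 0; i < in.size(); ++i) sum += row[i] * in[i];
            hidden.push_back(sigmoid(sum));
        }
        double out = 0.0;
        for (std::size_t h = 0; h < hidden.size(); ++h) out += wOut[h] * hidden[h];
        return sigmoid(out);
    }

    // Phase 2: determine the error between the desired and produced output.
    // Phase 3: adjust every connection weight in proportion to that error.
    double train(const std::vector<double>& in, double desired) {
        std::vector<double> hidden;
        double out = forward(in, hidden);
        double error = desired - out;                      // phase 2
        double deltaOut = error * out * (1.0 - out);
        for (std::size_t h = 0; h < hidden.size(); ++h) {  // phase 3
            double deltaH = deltaOut * wOut[h] * hidden[h] * (1.0 - hidden[h]);
            wOut[h] += rate * deltaOut * hidden[h];
            for (std::size_t i = 0; i < in.size(); ++i)
                wHidden[h][i] += rate * deltaH * in[i];
        }
        return error;
    }
};
```

Repeating `train` on the same instance shrinks the error, which is the behavior the three phases are meant to produce.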
3.2.2 Backpropagation Through Time
Errors were backpropagated even further. This is called backpropagation through
time and is a simple extension of what was previously discussed. Local memory was
created that contained states reflecting structural dependencies and states containing
structural predictions. This technique introduced an architecture that has a local recurrent
nature, but also had an overall global feedforward construction [Yves 1995]. The
property of recurrence that was used is shown below in Figure 3.5.
Figure 3.5. Recurrence
3.2.3 System Topology of Backpropagation
The backpropagation training methods that were performed in this project are
illustrated in Figure 3.6. Using a sliding input window, the diagram shows a 30-day
period of a stock’s close prices and trading volumes. Input for both neural networks
contained the values from the sliding window and the state of the system. Sometimes
price values would differ very little or not at all. Therefore, the second neural network
used the property of recurrence, which introduced a local memory so that states could
function as additional parameters that help distinguish between input instances. In the
first iteration, not depicted in Figure 3.6, states had an input value of zero until an initial
prediction was made. After the input was read, one network produced the predicted
prices and the other network produced the next predicted state. An error was calculated
for the predicted outcome, and connection weights were adjusted in proportion to that
error.
Figure 3.6. Topology of Backpropagation
During the learning process, each data instance was applied to the neural
networks, and output values were computed using the current weights. Then, weights
were adjusted in order to decrease error measures, and another data instance was applied.
After all data instances were applied, the entire process was repeated as long as a
significant amount of error reduction was accomplished. A large number of passes
through the data set were required to reach a stable solution. After these steps were
executed for all the data in one input window, one epoch had been completed.
3.3 Partitioning the Data Series
The data series was partitioned into three disjoint sets. These sets were the
training set, evaluation set, and test set. Traditionally, this method lends the majority of
the data to training, while the remaining data is equally divided between the other two
sets.
The backpropagation training occurred directly on the training set. The neural
networks’ ability to generalize was checked for accuracy on the evaluation set. Finally,
its ability to forecast was measured on the test set. This is more clearly understood with
Figure 3.7 below.
Figure 3.7. Partitioned Data
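The partitioning step can be sketched as follows. The set sizes here are illustrative parameters; the project read the actual counts for each set from its configuration file, and the struct and function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Partition a data series into the three disjoint sets. The majority of the
// data goes to training; the remainder is divided between evaluation and
// testing.
struct Partitions {
    std::vector<double> train, evaluate, test;
};

Partitions partitionSeries(const std::vector<double>& series,
                           std::size_t trainN, std::size_t evalN) {
    Partitions p;
    p.train.assign(series.begin(), series.begin() + trainN);
    p.evaluate.assign(series.begin() + trainN, series.begin() + trainN + evalN);
    p.test.assign(series.begin() + trainN + evalN, series.end());
    return p;
}
```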
3.4 Determining Weights
In order for neural networks to learn they need some type of learning rule [Lisboa
2000]. A learning rule functions as a guideline for performing the weight (connection
strength) updates. In this project, network configurations were created using a genetic
algorithm for learning. Essentially, the genetic algorithm found near optimal weights for
the neural networks’ connections. Its goal was for the error computed on data sets to be
as small as possible. In the genetic algorithm, the population consisted of encoded
weights that represented the individuals. Each individual (weight) had a fitness
associated with it, and individuals with better fitness were considered better solutions.
Between one generation and the next, individuals were selected from which to create
offspring by a crossover operation. The types of individuals consisted of learning rates of
neural network dimensions. The financial indicators, close-price and trading volume,
were used as the dimension types for learning.
The neural network system globally functioned as a feedforward network with a
sliding window of input. Each data window contained input for a two-level network
composed of ten inputs (five close prices and five associated volumes), two units in the
hidden layer (close price and volume), and a single output (close price). There were a
total of twenty weights connecting the input layer to the hidden layer. This architecture is
depicted below in Figure 3.8. The input nodes from the data set are labeled to help
identify their association with the hidden nodes for close price (C) and volume (V).
Figure 3.8. Neural Network Architecture
The use of a genetic algorithm implies a genotypic representation of the
individuals. This allows the genetic operators to modify them without using knowledge
about the individuals’ structure. In this project the genetic algorithm used binary encoded
weights to represent individuals in a population. Each weight was composed of eight
bits. Since each data window consisted of ten inputs all connected to two hidden
neurons, there were a total of twenty weights that were determined by the genetic
algorithm per epoch. The encoded weight (W) topology is better understood by viewing
Figure 3.9. As in the previous figure, the labeling of close price (C) and volume (V)
helps identify placement within the data set.
Figure 3.9. Encoded Weight Topology
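The text does not record exactly how an eight-bit gene was mapped to a real-valued connection weight, so the sketch below makes the common assumption of scaling the unsigned byte value 0..255 linearly onto a weight range, here [-1, 1]; the function name and range are hypothetical.

```cpp
#include <cstdint>

// Decode one eight-bit encoded weight into a real-valued connection
// weight by mapping 0..255 linearly onto the assumed range [-1, 1].
double decodeWeight(std::uint8_t gene) {
    return -1.0 + 2.0 * gene / 255.0;
}
```

A full genotype in this scheme would be twenty such bytes, one per input-to-hidden connection, decoded in the order shown in Figure 3.9.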
Starting from a randomly generated initial population of weights, the network
generator built a network from the genotype. The backpropagation technique for
measuring prediction error was performed on the neural network system, and the error
was delivered to an application for fitness evaluation and scaling. This process calculated
the actual fitness value for each weight according to its performance. A fitness value
represented the quality of an individual, and was used to rank the individual in a
population. The calculation of fitness is specific to the individual problem and is
essentially a driving force for an effective evolutionary search.
In this project the fitness of an individual represented the accuracy of the
prediction computed by the neural network system. Hence, the higher the fitness of an
individual, the lower the prediction error of the neural network system. It is important to
note that high error is bad and high fitness is good. A simple approach is to divide one by
the error: higher errors then give lower fitness, and lower errors give higher fitness.
However, there is a problem with this approach: if the error were ever exactly zero (a
perfect individual), dividing one by it would cause a divide-by-zero math error. The
better approach was to use the fitness formula shown in Eq. (3.1) below.
fitness = |N – total error|, (3.1)
where N is the number of training instances.
More precisely stated, given an individual i, let wi be the weight obtained by assigning to
a connection in the net the corresponding weight encoded in the individual. This yields
the following equations given below in Eq. (3.2) and Eq. (3.3).
fitness of i = |N – total error of wi|, (3.2)
where
total error of wi = sum of all instances of |desired output – prediction|. (3.3)
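Eq. (3.2) and Eq. (3.3) translate directly into code; the function names below are hypothetical, but the arithmetic is exactly the formulas above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Eq. (3.3): the total error of an individual's weights is the sum of
// |desired output - prediction| over all training instances.
double totalError(const std::vector<double>& desired,
                  const std::vector<double>& predicted) {
    double sum = 0.0;
    for (std::size_t i = 0; i < desired.size(); ++i)
        sum += std::fabs(desired[i] - predicted[i]);
    return sum;
}

// Eq. (3.2): fitness is |N - total error|, where N is the number of
// training instances, avoiding the divide-by-zero risk of a 1/error rule.
double fitness(const std::vector<double>& desired,
               const std::vector<double>& predicted) {
    double n = static_cast<double>(desired.size());
    return std::fabs(n - totalError(desired, predicted));
}
```

A perfect individual over N instances has zero total error and therefore fitness N, the maximum.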
Based on fitness, the genetic operator known as selection determined which
individuals were placed in a mating pool for reproduction. The chosen individuals were
then allowed to mate by means of a genetic crossover operator, which ensures that
offspring are new but still related to their parents. This evolution cycle was iterated for a
fixed number of generations.
The tournament method was chosen as the genetic algorithm's method of selecting
parents from the population to mate. Tournament selection resembles small battles
between individuals of the population to determine which of them is placed in the mating
pool of the next generation. Two tournaments were performed to determine a set of
parents. An example of tournament selection can be seen in Figure 3.10.
Figure 3.10. Tournament Selection
In a tournament, two individuals were chosen at random from the population and
their fitness values were compared. The individual with the better fitness was retained as
one of the two parents for mating. The second parent was chosen by the same method.
Because the focus of this project was to model a neural network system,
concentration was not given to the variety of selection methods that a genetic algorithm
can employ. However, it should be noted that the selection method can have a major
impact on the genetic algorithm. For financial applications, the choice of selection
method may depend on the type of problem being solved. If decisions must be made
quickly, especially decisions in real-time trading environments, then quicker convergence
may be more desirable. Tournament selection is recognized as a quicker methodology.
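Tournament selection as described above can be sketched as follows. This is a minimal illustration using the C standard library's `rand()`; the project's actual class interfaces (for example its `MemRandom` generator) differ.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// One tournament: draw two individuals at random and keep the index of the
// one with the better fitness. Running this twice chooses the two parents.
std::size_t tournament(const std::vector<double>& fitnessValues) {
    std::size_t a = std::rand() % fitnessValues.size();
    std::size_t b = std::rand() % fitnessValues.size();
    return fitnessValues[a] >= fitnessValues[b] ? a : b;
}
```

Fitter individuals win more tournaments on average, but any individual can still occasionally be selected, which preserves diversity in the mating pool.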
The crossover method that was implemented was single-point crossover, one of
the most powerful and popular techniques used in genetic algorithms. After two parents
were selected for mating, a point was randomly chosen where the two strings were to be
cut. The tails of the two strings were then exchanged, which left the
head of the first string with the tail of the second string and vice versa. This method of
crossover is demonstrated in Figure 3.11, where the randomly selected cut is after the
fourth bit.
Figure 3.11. Single-Point Crossover
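A sketch of the operation (the helper name is hypothetical, and the cut point is passed in explicitly so the exchange is easy to follow):

```cpp
#include <cstddef>
#include <string>

// Single-point crossover: join the head of one parent (everything before
// the cut) to the tail of the other (everything from the cut onward).
std::string crossover(const std::string& parent1, const std::string& parent2,
                      std::size_t cut) {
    return parent1.substr(0, cut) + parent2.substr(cut);
}
```

With the parent strings 00001111 and 10011001 and a cut after the fourth bit, the two possible offspring are 00001001 and 10011111.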
The basis for evolution is a set of individuals that establishes a population. The
better an individual adapts itself to the given environment, the greater its chance to
survive and produce offspring. In this project the neural network represented the
environment. Each individual was assigned a fitness value, which reflected its ability to
adapt to the given environment. The pseudocode for this process is:
MAX = preset maximum number of generations
POP = preset population size
IND = individual
BestIND = individual with highest fitness

Generate initial POP of individuals randomly
Evaluate each IND in population to get Fitness(IND)
Iterations = 0
While (Iterations < MAX)
    1. Select individuals for crossover
    2. Produce offspring and replace in population
    3. Mutate small percent of population
    4. Evaluate the fitness for new members
    5. BestCurrent = best fit IND from new population
    6. IF (Fitness(BestCurrent) > Fitness(BestIND))
           THEN BestIND = BestCurrent
    7. BestIND goes to neural network for grading
    8. Neural Network computes prediction error
    9. Adjust Fitness(BestIND) relative to error
    10. Iterations++
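Step 3 of the loop, mutation, can be sketched as flipping each bit of an individual's binary string with a small probability. The function name is hypothetical, and the project read its actual mutation rate from the GA configuration file.

```cpp
#include <cstdlib>
#include <string>

// Mutate a binary-encoded individual: flip each bit independently with
// probability `rate`, keeping the string length unchanged.
std::string mutate(std::string bits, double rate) {
    for (char& b : bits)
        if (static_cast<double>(std::rand()) / RAND_MAX < rate)
            b = (b == '0') ? '1' : '0';   // flip the bit
    return bits;
}
```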
A genetic algorithm is an iterative procedure, where each iteration is called a
generation. The application of elitism allows the most fit individuals from the previous
generation to be carried over into the next generation. During each iteration the two
nature-inspired principles of selection and reproduction are applied to the population.
The selection mechanism determines which individuals are allowed to produce offspring
for the next generation. The probability that an individual is allowed to reproduce, and
the number of offspring it produces, are based upon its fitness. Supervised learning in
neural networks, where predictions are measured for error, provides a measure of fitness
for individuals in a genetic algorithm; hence the need for the backpropagation technique
discussed earlier [Jain 2000]. The evolution cycle is characterized in Figure 3.12.
Figure 3.12.
Figure 3.12. Evolution Cycle
3.5 Managing Prediction Error
A problem that frequently occurs in neural networks is due to the nature of the
errors in the training set. The error in the training set might become lower and lower, but
errors on the second and third sets might be exponential. Because of this, errors cannot
simply be added together because it is possible that the population will suffer premature
convergence [Welstead 1994]. To deal with this, during the first generations of the
genetic algorithm, the error on the training set is more important. The errors from the
evaluation set will not be introduced into the system until after a few generations. The
goal of the error can be visualized in Figure 3.13, where it has been roughly sketched.
Figure 3.13. Error During Prediction
3.6 Initializing Process
Another problem is that neural networks needed random values to initialize their
weights [Welstead 1994]. In order for all evaluations to be of an equivalent level, a
random number generator was required and needed restarting from the same seed every
time training was initiated. This had the potential to also affect the genetic algorithm, so
two random number generators were needed: one for the neural networks; and one for
the genetic algorithm.
3.7 The Ability of Randomness
At several points in this project, randomly driven events occurred. In the mating
process, random selections were performed: a pair of individuals was randomly chosen,
and it was randomly decided where to cut the strings. Randomness, or chance, is one of
the distinguishing characteristics of both genetic algorithms and neural networks.
Although it might seem counterintuitive that random operations are productive, their
history in computer science shows that they are successful.
3.8 Programming Environment
The programming language used was C++ running on a UNIX platform. Data
was read from a text file and output was written to another text file. In order to assist the
visualization of outcome, the research in this project used Microsoft Excel to graphically
represent results.
This system supported multidimensional data (i.e., close prices and trading
volumes), and for every dimension the error of prediction was evaluated for importance.
For example, there were two dimensions (close price and trading volume) in an input file,
but a forecast was produced for only one dimension (close price). Originally it was
thought that high and low prices could be used instead of volume. However, it became
clear that the trading volume of a stock ultimately establishes the value of closing prices
based on the economic concepts of supply and demand. Therefore, trading volumes were
used primarily for strengthening the neural network system.
3.8.1 Programming Applications
In order to develop a successful program, several classes needed to be produced
including a separate application to smooth a data series. Some techniques, classes, and
functions were borrowed, converted, and conformed to this project from A Practical
Guide to Neural Nets [Nelson 1991], and Neural Network and Fuzzy Logic Applications
in C/C++ [Welstead 1994]. The classes that were used in this project are listed in Table
3.1 below and accompanied with brief descriptions.
Table 3.1. Classes
Class                        Description
DataSet                      loads a time series
Genotype                     contains information about genes
MemRandom                    the generator of random numbers
Node                         contains information about a neuron
NeuralNetwork                creates the neural networks' connecting nodes
GA and Environment           abstract classes for the genetic algorithm
ParGA and ParEnvironment     search for the best parameters for the neural networks
3.8.2 Input Files
The input of the system required two separate files. These files are listed and
described below in Table 3.2.
Table 3.2. Input Data
File Name File Description
Data.txt contains data series of all historical input values
GA.txt
contains configuration of system with the following format:
population size,
elitism (best solution in population copied to next generation),
mutation rate,
number of dimensions (close price and volume),
value of importance per dimension,
number of values for training set,
number of values for evaluation set,
number of values for test set,
number of states,
number of historical values in input,
number of neurons on networks
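For illustration only, a GA.txt following this format might look as follows. Every value here is invented for the example, and the labels in parentheses are annotations for the reader rather than part of the file; the report does not reproduce an actual configuration file.

```
50          (population size)
1           (elitism)
0.01        (mutation rate)
2           (number of dimensions)
1.0 0.5     (importance per dimension)
180         (training set values)
40          (evaluation set values)
40          (test set values)
5           (number of states)
30          (historical values in input)
2           (neurons on the networks)
```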
4. EVALUATION AND RESULTS
The goal of this project was to produce a functional model for predicting a stock
trend using a neural network system with a genetic algorithm. When theoretical tests
were conducted, forecasting with only a small error of prediction, such as 1%, was
desirable because theoretical tests exist primarily to assess functionality. When the
system was executed with a real series from a stock market, forecasts were considered
successful when the predicted trends followed the same direction as the trends derived by
financial technical analysts using charting techniques.
4.1 Financial Data Analysis
During the evaluation phase of this study, performance tests were initially
conducted through theoretical experimentation using the SINE trigonometric function
and the Lorenz equations. These tests provided an opportunity to test the neural network
system in a controlled environment, without the volatility and chaotic behavior that can
exist in a stock market. Because this project concentrated on stock trends, the outcomes
of the theoretical tests were graphed in Microsoft Excel and only a simple visual analysis
was conducted.
After conducting theoretical tests, empirical experiments were performed on
real data series from a stock market. To test real data, one year of historical trading
information was retrieved for ten stocks, each from a separate major market. Taking data
from the major markets allowed the neural network system to be tested against multiple
data series behaviors. Table 4.1 below contains a list of the company stocks with their
respective symbols and markets.
Table 4.1. Tested Stocks

Symbol   Company                    Market
WFC      Wells Fargo                Banking
DNA      Genentech, Incorporated    Biotech & Drug
IBM      IBM Corporation            Blue Chip
XOM      Exxon Mobil Corporation    Energy
HCA      HCA, Incorporated          Healthcare
AOL      AOL Time Warner            Internet
WMT      Wal-Mart Stores            Retail
INTC     Intel Corporation          Semiconductor
MSFT     Microsoft Corporation      Technology
FDX      FedEx Corporation          Transportation

Only data regarding closing price and trading volume were used in the empirical
experimentation process, because the volume of stock traded is an important indicator for
determining supply and demand. This helped the neural network system predict whether
the future price of the stock would be higher or lower. The correlation between closing
price and volume maintained the standard principles of popular financial analysis.
In addition to maintaining a constant type of input for each stock, many system
variables were held constant for two reasons. First, maintaining the dimensions of the
neural network system in a constant structure allowed result comparisons of different
stocks to be conducted fairly. Therefore, only genetic algorithm
parameters for population size and generations were adjusted between tests. This led to a
total of four tests per stock. Second, further test variations were desired, but execution of
the neural network system utilized an excessive amount of memory and run-time, which
often exceeded the designated student quota of the operating system.
Data spikes and anomalies were first smoothed, and then the data was fed to the
neural network system. To assess the empirical results, real data and predicted data were
graphed in Microsoft Excel. Next, the trend analysis technique was manually applied to
the testing data set for both the predicted data and the real data. The trend line slopes for
both data types were then calculated, compared for differences, and recorded in a table.
Slope was expressed as the rate of change along a trend line: the vertical distance divided
by the horizontal distance between the first two points that established the trend line. It
is important to note that the neural network system in this project did not attempt to
predict stock prices; it attempted to predict trend lines similar in direction to the actual
trend lines. Therefore, the values that formed the predicted trend lines were expected to
vary somewhat from the values that made up the actual trend lines, but their calculated
slopes were expected to be similar.
4.2 Theoretical Experimentation
4.2.1 SINE Data
Using Microsoft Excel, values were calculated for the trigonometric function
SINE, defined from a circle with a radius of one. A theoretical evaluation with a function
such as SINE allowed for a basic test using a data series without any of the noise,
anomalies, or volatility of the stock market. This kind of test gave the neural network
system an opportunity to be evaluated for simple functionality. Because SINE consists of
an easy, recognizable pattern, any properly running neural network system should be able
to learn and predict it.
Figure 4.1 shows the graphed SINE data that the neural network system had to
learn from. It is divided into the three partitions: train, evaluate, and test. It is structured
much like the data in a time series, except that here the coordinates are labeled in radians
and SIN(x), as opposed to time and price.
Figure 4.1. SINE Input Data
Figure 4.2 is a graph of the test section where the final predictions took place.
The thin line in the graph represents the prediction. The graph reveals a fairly accurate
prediction with a very close resemblance to the actual data. The results are of high
quality because the data followed a repetitious pattern and did not experience any sudden
anomalies.
Figure 4.2. SINE Prediction
4.2.2 Lorenz Data
The Lorenz equations are a system of nonlinear differential equations representing
a time-continuous system. These equations were originally designed to examine the
properties of nonlinear systems, and they are significant because, when graphed, they
demonstrate chaotic behavior in an equation-controlled environment. The chaotic aspect
of this system is that, regardless of the accuracy of the inputs, an advanced state of the
system cannot be predicted. In other words, the system amplifies small changes, like a
ripple effect, until they become significant enough to affect the accuracy of prediction
[Strogatz 2000]. The data for the Lorenz test was calculated and retrieved from the web
site Calculators On-Line Center for Mathematics, located at the following address:
http://www-sci.lib.uci.edu/HSG/RefCalculators2.html#COMP-LOR.
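For reference, the standard form of the Lorenz system, with fixed parameters sigma, rho, and beta (the particular parameter values used by the on-line calculator are not recorded in this report), is:

```latex
\begin{aligned}
\dot{x} &= \sigma\,(y - x), \\
\dot{y} &= x\,(\rho - z) - y, \\
\dot{z} &= x\,y - \beta\,z .
\end{aligned}
```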
The data that was retrieved and tested is graphed below in Figure 4.3. As before,
this data is not labeled with time and price. Lorenz equations simply use X and Y
indicators for a two-dimensional representation.
Figure 4.3. Lorenz Input Data
Figure 4.4 shows the outcome of the neural network’s attempt to predict the
Lorenz equations. Looking at the graph, it becomes obvious that the neural network
system needs some type of secondary input variable to help support its generalizations.
Otherwise, predictions in a chaotic system tend to resemble random guessing. The tests
conducted in this project, for predicting stock trends, used trading volume as a secondary
indicator.
Figure 4.4. Lorenz Prediction
4.3 Empirical Experimentation
For each stock tested, the input data for the closing price was partitioned and graphed.
The genetic algorithm was adjusted to four different combinations of population size (50
and 100) and generations (50 and 100). Results were recorded in a table and shown for
each stock. Each table shows whether the proper trend type (up or down) was predicted,
and contains the slope difference between the actual and predicted trends. A negative
slope difference indicates that the predicted slope was below the actual trend's slope; a
positive slope difference indicates that it was above. From each stock's set of four tests,
the prediction results with the smallest slope difference were graphed.
4.3.1 Wells Fargo
The first empirical test examined the stock of Wells Fargo (WFC). In Figure 4.5,
the input data for close price can be seen with three partitioned sections.
Figure 4.5. WFC Input Data
Results were recorded in Table 4.2, and they show that a higher number of
generations in the genetic algorithm produced better results, while population size may
not matter. The best performance took place in test four, where the slope difference
between the predicted and actual trends was lowest. All four tests predicted an uptrend,
even though the actual uptrend was very slight.
Table 4.2. WFC Prediction Results

Test #   Symbol   Trend Type Predicted?   GA Population   GA Generations   Actual Slope   Predicted Slope   Slope Difference
1        WFC      yes                     50              50               0.003          0.061             0.058
2        WFC      yes                     50              100              0.003          0.028             0.025
3        WFC      yes                     100             50               0.003          0.084             0.081
4        WFC      yes                     100             100              0.003          0.025             0.022

Below in Figure 4.6 is the graph of test four, the most successful test. The results
of all the tests for this stock showed that a higher population size and a higher generation
count produced the most accurate prediction. Actual data is represented by the thick
crooked line, and the actual trend is represented by the thick straight line; their uptrend is
barely visible. The predicted trend is represented by the thin line, and the prediction was
successful in determining an uptrend, varying from the actual trend only slightly in value.

Figure 4.6. WFC Best Predicted Trend: Test #4
4.3.2 Genentech, Incorporated
In Figure 4.7, Genentech’s input data for close price can be seen with its three
partitioned sections.
Figure 4.7. DNA Input Data
In Table 4.3, once again all of the tests successfully predicted that this stock
would perform an uptrend. The results of the tests for this stock showed that a higher
population size and a higher generation count produced the more accurate predictions.
The most successful test was test two, followed closely by test four; both used a higher
number of generations than tests one and three.
[Figure: DNA Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
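The three-way split shown in these input-data figures can be sketched as follows. The split fractions here are illustrative assumptions: the figures suggest roughly 257 trading days divided, in time order, into train, evaluate, and test sections, but the exact boundaries are not stated in this excerpt.

```python
def partition_series(prices, train_frac=0.7, evaluate_frac=0.2):
    """Split a price series into train / evaluate / test sections in time order.

    The fractions are illustrative assumptions, not the report's actual split.
    """
    n = len(prices)
    train_end = int(n * train_frac)
    eval_end = train_end + int(n * evaluate_frac)
    return prices[:train_end], prices[train_end:eval_end], prices[eval_end:]

# Example with 257 hypothetical daily closes.
closes = [50.0 + 0.01 * day for day in range(257)]
train, evaluate, test = partition_series(closes)
```

Splitting in time order matters for time-series forecasting: the test section must lie strictly after the training data, as the figures here show.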
Table 4.3. DNA Prediction Results

Company Symbol: DNA
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               yes                     0.094                0.018                   -0.076
2        50              100              yes                     0.094                0.108                   0.014
3        100             50               yes                     0.094                0.181                   0.087
4        100             100              yes                     0.094                0.109                   0.015

Below in Figure 4.8 is the graph of test two. The actual trend and the predicted
trend appear very similar, and both are established before day 20, when the stock
began to suffer from an anomaly.

Figure 4.8. DNA Best Predicted Trend: Test #2
[Figure: DNA Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]
4.3.3 IBM Corporation
Graphed in Figure 4.9, IBM’s input data for close price is represented with its
three partitioned sections.
Figure 4.9. IBM Input Data
As shown in Table 4.4, the most accurate forecast of IBM's stock came from test four.
Although its slope difference is a negative value, test four's predicted trend most
closely resembles the actual trend line. A negative slope difference simply means
that the predicted trend rose less than the actual trend. Again, a larger generation
size led to the most accurate outcome.
Table 4.4. IBM Prediction Results

Company Symbol: IBM
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               yes                     0.077                0.002                   -0.074
2        50              100              yes                     0.077                0.003                   -0.073
3        100             50               yes                     0.077                0.127                   0.051
4        100             100              yes                     0.077                0.066                   -0.011
[Figure: IBM Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
Below in Figure 4.10 is the graph of test four.

Figure 4.10. IBM Best Predicted Trend: Test #4
[Figure: IBM Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]

4.3.4 Exxon Mobil Corporation
Below in Figure 4.11, Exxon Mobil's input data for close price is shown with its
three partitioned sections.

Figure 4.11. XOM Input Data
[Figure: XOM Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
As shown in Table 4.5, the results of Exxon's stock forecast upheld the idea that a
large generation size is key to success with this neural network system. The results
of test three and test four were very close, with test four predicting slightly more
accurately. Furthermore, all tests correctly indicated an upward trend.
Table 4.5. XOM Prediction Results

Company Symbol: XOM
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               yes                     0.031                0.060                   0.029
2        50              100              yes                     0.031                0.051                   0.019
3        100             50               yes                     0.031                0.017                   -0.014
4        100             100              yes                     0.031                0.019                   -0.013

Below in Figure 4.12, the trend predicted in test four has been graphed. The
forecasted trend line passes through the middle of the actual price values, but as
mentioned before, this research did not attempt to predict stock prices. Here the
predicted trend line is composed of slightly higher values, but its slope appears
very similar to the slope of the actual trend.

Figure 4.12. XOM Best Predicted Trend: Test #4
[Figure: XOM Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]

4.3.5 HCA, Incorporated
In Figure 4.13, HCA's input data for close price can be seen with three partitioned
sections.

Figure 4.13. HCA Input Data
[Figure: HCA Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
As shown in Table 4.6, tests one, two, and three failed to predict the proper trend
type. Test four, on the other hand, prevailed, predicting the correct trend type and
again producing the most accurate prediction with the highest combination of
population and generation sizes.
Table 4.6. HCA Prediction Results

Company Symbol: HCA
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               no                      0.184                -0.225                  -0.409
2        50              100              no                      0.184                -0.079                  -0.263
3        100             50               no                      0.184                -0.038                  -0.222
4        100             100              yes                     0.184                0.252                   0.068

The graph in Figure 4.14 shows a substantial similarity between the predicted and
actual trends. Again, an interesting result occurred in which the two trends were
very similar in slope but many price values apart.

Figure 4.14. HCA Best Predicted Trend: Test #4
[Figure: HCA Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]
4.3.6 AOL Time Warner
In Figure 4.15, AOL Time Warner’s input data for close price was partitioned and
is illustrated below.
Figure 4.15. AOL Input Data
As shown in Table 4.7, the correct trend type was predicted only when the number of
generations was at its highest. As in previous test sets, test four resulted in the
lowest slope difference and predicted with the highest level of accuracy.
Table 4.7. AOL Prediction Results

Company Symbol: AOL
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               no                      0.034                -0.042                  -0.076
2        50              100              yes                     0.034                0.102                   0.068
3        100             50               no                      0.034                -0.079                  -0.113
4        100             100              yes                     0.034                0.043                   0.009
[Figure: AOL Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
The result of test four is graphed below in Figure 4.16. This test happened to be
the most accurate thus far, predicting a trend with values very similar to those of
the real trend.

Figure 4.16. AOL Best Predicted Trend: Test #4
[Figure: AOL Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]

4.3.7 Wal-Mart Stores
Figure 4.17 shows Wal-Mart Stores' input data with its partitioned sections.

Figure 4.17. WMT Input Data
[Figure: WMT Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
As with the previous stock, the neural network system came very close to predicting
the trend perfectly, and once again the most successful test was test four. However,
this stock was the first to demonstrate an actual downtrend. More importantly, every
test successfully predicted that a downtrend would occur. The outcomes are shown
below in Table 4.8.
Table 4.8. WMT Prediction Results

Company Symbol: WMT
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               yes                     -0.305               -0.035                  0.269
2        50              100              yes                     -0.305               -0.045                  0.259
3        100             50               yes                     -0.305               -0.035                  0.270
4        100             100              yes                     -0.305               -0.131                  0.174

The results of test four can be seen below in Figure 4.18.

Figure 4.18. WMT Best Predicted Trend: Test #4
[Figure: WMT Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]
4.3.8 Intel Corporation
In Figure 4.19, Intel’s input data for close price is graphed along with its three
partitioned sections.
Figure 4.19. INTC Input Data
The results of testing with INTC stock were not unusual. Test four produced the best
prediction, and its outcome can be seen in Table 4.9. The experiments in this test
set produced results similar to the earlier tests with HCA: both stocks were in an
uptrend, yet every test other than test four predicted a downtrend.
Table 4.9. INTC Prediction Results

Company Symbol: INTC
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               no                      0.026                -0.031                  -0.057
2        50              100              no                      0.026                -0.008                  -0.033
3        100             50               no                      0.026                -0.007                  -0.032
4        100             100              yes                     0.026                0.055                   0.029
[Figure: INTC Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
A graph showing the outcome of test four can be seen in Figure 4.20. The predicted
trend is hard to see because it lies so close to the actual trend line.

Figure 4.20. INTC Best Predicted Trend: Test #4
[Figure: INTC Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]

4.3.9 Microsoft Corporation
Below is Microsoft's input data, shown with its partitions in Figure 4.21.

Figure 4.21. MSFT Input Data
[Figure: MSFT Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
In this test set, the neural network system had another opportunity to predict a
downtrend. As seen in Table 4.10, each test produced a prediction that trended
downward, but tests two and four proved to be the more accurate attempts. Both used
a higher number of generations, resulting in the lowest slope differences in the
set. Test four generated the best outcome.
Table 4.10. MSFT Prediction Results

Company Symbol: MSFT
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               yes                     -0.143               -0.030                  0.113
2        50              100              yes                     -0.143               -0.041                  0.102
3        100             50               yes                     -0.143               -0.035                  0.107
4        100             100              yes                     -0.143               -0.050                  0.093

Figure 4.22 shows the graphical result of test four. Separated by a slight
difference in values, the two trend lines appear parallel and differ only slightly
in slope.

Figure 4.22. MSFT Best Predicted Trend: Test #4
[Figure: MSFT Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]
4.3.10 FedEx Corporation
In Figure 4.23, FedEx’s input data for close price can be seen in three partitioned
sections.
Figure 4.23. FDX Input Data
In this data set, tests one and two were not able to predict an uptrend, while tests
three and four predicted the proper trend. The success of test four once again
confirmed that the neural network system needs a sufficient amount of learning time;
as in other test sets, test four prevailed because it was provided with a larger
number of generations for learning. These results can be seen below in Table 4.11.
Table 4.11. FDX Prediction Results

Company Symbol: FDX
Test #   GA Population   GA Generations   Predicted Trend Type?   Actual Trend Slope   Predicted Trend Slope   Slope Difference
1        50              50               no                      0.063                -0.008                  -0.071
2        50              100              no                      0.063                -0.076                  -0.139
3        100             50               yes                     0.063                0.267                   0.204
4        100             100              yes                     0.063                0.060                   -0.003
[Figure: FDX Data Partitions of Actual Close Prices. Price ($) vs. Time (days); train, evaluate, and test sections marked]
The graph of test four's prediction is shown in Figure 4.24. Much like the results
from the MSFT test, the actual trend line and the predicted trend line appear
parallel and differ only faintly in slope.

Overall, the results showed that the system depends on population and generation
sizes. Results were best when the number of generations was highest, and improved
further when the population size increased. Table 4.12 shows that tests two and four
performed best in most of the empirical experiments.
Table 4.12. Best Tests

Test #   GA Population   GA Generations   # of Best Tests   # of 2nd Best Tests   Total Best & 2nd Best Tests
1        50              50               0                 0                     0
2        50              100              1                 4                     5
3        100             50               0                 3                     3
4        100             100              9                 1                     10
[Figure: FDX Verify Trend Prediction of Close Prices. Actual vs. predicted trend; Price ($) vs. Time (days)]
5. CONCLUSION
Neural networks originated from efforts to simulate the functioning of the human
brain in the areas of learning and problem solving. A neural network system must be
provided with a method of learning. A learning rule is applied for establishing and
controlling the connections among many pieces of input data. One such learning method
that has not received a great deal of attention is the application of a genetic algorithm.
The goal of this project was to predict a stock trend based on data from a financial
market. Because this was an attempt to predict a financial stock trend line, it essentially
became a model for time series forecasting. The neural networks that performed the
prediction were trained using a backpropagation technique and a genetic algorithm. As a
learning error was calculated in one neural network, a second neural network determined
the state of the market at each instance of input.
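A minimal sketch of this learning scheme, a genetic algorithm searching for network weights that minimize prediction error, might look like the following Python fragment. The network (a single linear neuron), the fitness function, and the selection and mutation scheme are all illustrative assumptions, not the project's actual implementation.

```python
import random

def predict(weights, inputs):
    # A single linear neuron stands in for the network (illustrative only).
    return sum(w * x for w, x in zip(weights, inputs))

def fitness(weights, samples):
    # Mean squared prediction error over (inputs, target) samples; lower is better.
    return sum((predict(weights, x) - t) ** 2 for x, t in samples) / len(samples)

def evolve(samples, n_inputs, population=50, generations=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, samples))
        survivors = pop[: population // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < population:
            parent = rng.choice(survivors)
            child = [w + rng.gauss(0, 0.1) for w in parent]   # Gaussian mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda w: fitness(w, samples))

# Toy task: recover target weights [0.5, -0.3] from hypothetical samples.
rng = random.Random(1)
samples = []
for _ in range(30):
    x = [rng.uniform(0, 1), rng.uniform(0, 1)]
    samples.append((x, 0.5 * x[0] - 0.3 * x[1]))
best = evolve(samples, n_inputs=2, population=30, generations=40)
```

Because the best half of the population always survives, the best error never worsens from one generation to the next, which is why larger generation counts tended to help in the tests above.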
Prediction results were recorded and analyzed graphically to calculate slopes of
trend lines for measuring accuracy. The accuracy level of each forecast was measured by
directly comparing its actual historical trend line and its predicted trend line for the same
time period. These tests were conducted using many different stocks from various major
markets.
This project demonstrated that artificial intelligence could be a substitute for
traditional statistical methods in predicting stock trends. Although neural networks are
not perfect in forecasting stock trends, they perform extremely well and offer vast
potential for growth. This project was important because it explored different genetic
algorithm parameters such as population size and generations. Results later showed that
the success of the neural network system is dependent on a sufficient amount of learning
time. In this case learning was aided by a genetic algorithm. When using a genetic
algorithm, the success of the neural network system was dependent on population size
and on the number of generations employed by the genetic algorithm.
Using a genetic algorithm for the process of learning within a neural network did
have a drawback. The computation cost was so high as to almost make the genetic
algorithm impractical. For example, the computation time easily reached between twelve
and eighteen hours for each empirical test. The high number of genetic algorithm
iterations accounted for this high cost of time, where total iterations were in the millions
per test. Given the population size (P), generations (G), and other fixed parameters,
iterations can be approximated with Eq. (5.1) below. The iterations are calculated in
Table 5.1 for the various combinations of population and generation sizes that were
tested. The cost of the numerous iterations, needed to improve the neural network,
quickly became computationally prohibitive.
Table 5.1. Iterations

Test #   GA Population   GA Generations   Total Iterations
1        50              50               10,700,000
2        50              100              21,400,000
3        100             50               21,400,000
4        100             100              42,800,000

Total Iterations = P * G * (2 wt / 1 wt) * (10 input nodes / 1 input node)
                       * (214 data windows / 1 data window) * 1 test        (5.1)

That is, the fixed factors multiply out to 2 * 10 * 214 = 4,280, so each test
performs approximately 4,280 * P * G iterations.
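The totals in Table 5.1 follow directly from Eq. (5.1). A quick check in Python:

```python
# Fixed factors from Eq. (5.1): 2 weight sets per input node,
# 10 input nodes per data window, 214 data windows per test.
ITERATIONS_PER_PG = 2 * 10 * 214  # = 4,280

def total_iterations(population, generations):
    """Approximate GA iterations for one empirical test, per Eq. (5.1)."""
    return population * generations * ITERATIONS_PER_PG

tests = {1: (50, 50), 2: (50, 100), 3: (100, 50), 4: (100, 100)}
totals = {t: total_iterations(p, g) for t, (p, g) in tests.items()}
# totals -> {1: 10700000, 2: 21400000, 3: 21400000, 4: 42800000}
```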
6. FUTURE WORK
Future use of neural networks to predict stock market movement has the potential
to be a growing area of research for a long time to come. As investors strive to
outperform stock markets, there will always be a strong interest in finding innovative
methods to improve investment returns. This driving force will push researchers to
continuously find more interesting results and produce new theories for training financial
neural networks. Financial neural networks require training so they can learn and form
generalizations about data. Training is a very important aspect of neural networks, and
different methods of training could be explored. Various combinations of parameters for
genetic algorithms could be tested.
This project used a genetic algorithm for weight training in a supervised neural
network system. The difficulty with using a genetic algorithm to optimize a neural
network is the high cost of its numerous iterations. Future work could explore
parallelizing a neural network's training in order to speed up learning. This would
involve the study of multiple processors and distributed memory. Implementation
could involve a cluster of computers and a message passing interface.
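As one illustration of this idea, fitness evaluation of a genetic algorithm population could be distributed across local processes with Python's standard multiprocessing module. The fitness function below is only a stand-in for an expensive network evaluation, and a real cluster implementation would use a message passing interface such as MPI instead.

```python
from multiprocessing import Pool

def fitness(weights):
    # Stand-in for an expensive neural-network evaluation.
    return sum(w * w for w in weights)

def evaluate_population(population, processes=4):
    """Score every candidate weight vector in parallel."""
    with Pool(processes) as pool:
        return pool.map(fitness, population)

if __name__ == "__main__":
    population = [[0.1 * i, -0.2 * i] for i in range(8)]
    scores = evaluate_population(population)
```

This parallelizes cleanly because each candidate's fitness is independent of the others; only selection and reproduction require the full population.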
The idea of combining neural networks and expert systems also shows potential.
Additional research could focus on a system that provides reasoning and explanations for
the neural network predictions. However, this might be a difficult challenge knowing
that many financial experts typically characterize stock markets as chaotic systems.
Nevertheless, neural networks seem to be an effective and readily available method for
dealing with nonlinear data series. Continued work on improving neural networks may
provide insights into the nature of chaotic systems.
Other ideas for future work include system features and usability aspects that could
be implemented in a neural network project. A fully functional user system might
allow the user to predict stock prices as opposed to stock trends. Furthermore, it
might predict stocks for a particular week or day, instead of trending a longer time
period such as a month. Because a day and a week are shorter periods of time than a
month, predicting them may provide interesting results and conclusions. Additionally,
a graphical user interface could be implemented to help the user employ these
features and to automatically display numeric results graphically.
APPENDIX
The digital media of this project are contained on the disk provided. The disk
includes a copy of the technical report, the data files, the program, and the
executable files.