Page 1: Stock Market Software - Walter Scott, Jr. College of ...projects-web.engr.colostate.edu/.../StockMarketSoftwareApproved... · Stock Market Software ... III: Mathematical Analysis


Stock Market Software First Semester Report

Fall Semester 2012

by

Jordan Courville

Rayan Alanazi

Prepared to partially fulfill the requirements for

ECE401

Department of Electrical and Computer Engineering

Colorado State University

Fort Collins, Colorado 80538

Project advisor: Dr. Edwin Chong g

Approved by: Dr. Edwin Chong g


Abstract

The goal of our senior design project is to create a program that forecasts the fluctuation of stock prices. These predictions will help an investor decide whether to sell, buy, or keep their stock. The idea might seem unrealistic, but certain companies are known to have complex prediction and recommendation programs of this kind. Their programs, however, are not open to the public. Our prediction program will use a company’s historical data to construct the predictions.

The problem that we need to solve is how to create our own successful prediction algorithm. In order to forecast future stock trends, the algorithm will need to use pattern matching, statistics, linear algebra, and partial differential equations. To achieve this goal we will use an iterative process consisting of research, design, programming, and testing.

The results from last year’s team have shown that singular value decomposition has strong potential in predictive forecasting. They also found a positive tendency towards accurate predictions of whether a stock price will rise or fall (Turner, Burkhart, 2).


Table of Contents

Title Page
Abstract
Table of Contents
List of Figures and Tables
I: Introduction
II: Web Crawler
III: Mathematical Analysis
A. Singular Value Decomposition
B. Method of Prediction
IV: Results and Analysis of our Prediction Algorithm
V: Matlab Programs
A. Stock Matrix
B. Normalization
C. NormSubMean
D. Overlap Matrix
E. Right Singular Vectors
F. Prediction Algorithm
VI: Conclusions and Future Work
Bibliography
Appendix A – Abbreviations
Appendix B – Budget
Appendix C – Project Timelines


List of Figures

Figure 1: Selected Right Singular Vectors (Turner, Burkhart, 10)
Figure 2: Magnitude of Variance (Turner, Burkhart, 10)
Figure 3: Actual data prices vs. Predicted data prices (Turner, Burkhart, 11)
Figure 4: Prediction on General Electric for next 30 days
Figure 5: Prediction on General Electric for next 15 days
Figure 6: Prediction on General Electric for next 10 days
Figure 7: Prediction on Apple for next 15 days


Chapter I – Introduction

The stock market is a primary indicator of the strength of the United States economy. It provides a way for companies to raise money. On the market, companies sell shares to investors, and in return the investors gain partial ownership of the company. Investors range from individuals to hedge fund traders. Once an investor purchases a share, they can sell or trade it whenever they choose. A good investor can become quite rich by knowing how to invest in the stock market. The key to turning a profit is to purchase shares when their price is low and sell them when it is high.

For years investors have been trying to forecast stock prices. If they could accurately determine the future value of a stock, they could decide whether to keep, trade, or sell it. However, thousands of different factors cause a stock to fluctuate, which makes forecasting difficult. Qualitative factors that play a role include the company’s image, quarterly earnings, dividends, the state of the national economy, election years, etc.

The goal of the design is to write a program that can effectively predict upcoming stock trends. By trends, we mean that we are not trying to predict the exact stock value. We are more interested in the shape of the curve: whether the stock is going to rise or fall. Our design takes a quantitative approach and looks at a company’s historical stock data, the idea being that stock patterns and trends repeat over time. The prediction algorithm that we are writing in Matlab relies on the historical stock data and the most recent days of data to forecast future prices.

This project is a continuation project in its fourth year. Over the past three years every team has approached the design differently. However, we determined very early in the semester that we would use the same prediction method as the 2011-2012 team. This method of analysis is a linear algebra tool called singular value decomposition (SVD). The results from last year have shown that SVD has strong potential in predictive forecasting. In the beginning stages of the project we planned to continue building on their code. The problem was that neither of us was experienced with Matlab, and trying to read, interpret, and fix somebody else’s code was becoming a strain. So we decided to start fresh and write our own code.

Before we could start writing code in Matlab, we first had to devise a way to get the stock data for any company on the market and store it. This was done using Yahoo Finance along with a web crawler program. Once we define the time window and the companies whose data we want to analyze, we can run the web crawler. The web crawler automatically downloads the stock data and stores it in a desired folder. The web crawler is discussed further in Chapter II.

Chapter III contains the mathematical analysis that has been used all year. The first section of this chapter, section A, goes over a linear algebra tool called singular value decomposition (SVD). SVD is used to break down a KxN matrix X into the factors U, S, and V, where $X = USV^T$. Our entire prediction method revolves around this linear algebra tool. In the next section, section B, we go over how SVD is used to create a prediction of future stock prices. In Chapter IV, Results and Analysis of our Prediction Algorithm, we run our algorithm on different companies to see if it can predict data points.

Chapter V goes over the key subroutine programs created this semester. These programs are the subcomponents that form our prediction program. Each subroutine plays a vital role in whether or not our prediction program produces an accurate prediction. Once the historical stock data was saved into a folder, we could import it into Matlab for analysis using the command xlsread('Stock Name','range'). This data was then used to create the stock matrix discussed in section A. This subroutine is titled StockToMatrix.m, and its purpose is to create a matrix that contains the historical stock data. The historical stock data is the engine that drives the entire prediction algorithm. Note that this data does not need to be updated; however, since the data spans a number of years, inflation has to be accounted for. To account for the effects of inflation we wrote two separate programs, the first titled Normalization.m and the second titled NormSubMean. In section B, we normalize the stock matrix, and as a result we get a new normalized matrix of the same dimensions as the stock matrix. In section C, we describe a program that determines the mean of the normalized matrix and then subtracts it from the normalized matrix. The result is a new matrix with the same dimensions as both the stock matrix and the normalized matrix. The program in section D, titled OverlapMatrix.m, creates a new matrix whose rows are formed by overlapping the periods of the historical stock data. The program described in section E of Chapter V calculates the right singular vectors, V. This V is a vital part of our prediction algorithm and is found by applying SVD to the overlap matrix. Section F goes into the details of our prediction program.


Chapter II – Web Crawler

The first task that our team had to solve was how to extract historical stock data from the web. Like Frank Turner and Greg Burkhart, we decided to use Yahoo.com to provide this information. Under the finance section of Yahoo.com you can find a company’s stock data from its first day on the market to its most recent day. Seven elements make up the stock data: for each date there is an opening price, highest price, lowest price, closing price, volume, and adjusted closing price. Downloading the stock data produces a .csv file that can be opened in Microsoft Excel.
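As an illustration of the file layout described above, the following Python sketch reads one column from a Yahoo-style .csv export. The report's own programs were written in Matlab; the column names (Date, Open, High, Low, Close, Volume, Adj Close), the sample rows, and the function name read_column are assumptions made for this example and are not taken from the report.

```python
import csv
import io

# Hypothetical sample in the assumed Yahoo-style layout (values are made up).
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume,Adj Close
2012-11-30,21.10,21.30,20.95,21.13,42000000,20.50
2012-11-29,20.90,21.20,20.85,21.05,39000000,20.42
2012-11-28,20.60,20.95,20.40,20.92,45000000,20.30
"""

def read_column(csv_text, column="Close"):
    """Return (dates, values) for one column, oldest trading day first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: row["Date"])  # ISO dates sort chronologically
    dates = [row["Date"] for row in rows]
    values = [float(row[column]) for row in rows]
    return dates, values

dates, closes = read_column(SAMPLE_CSV)
print(dates[0], closes[0])
```

Passing a different column name, for example "Open", selects a different element of the stock data without changing the rest of the code.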

At first we downloaded the stock data over a company’s entire existence manually. The problem with doing this is that it soon becomes tedious. When we want to perform tests on the predictive algorithm, we have to use multiple stocks to check its validity. So, instead of sorting through the Yahoo Finance page and downloading historical stock data for various companies by hand, we created a web crawler to do this for us. The web crawler downloads the stock data automatically. All that we need to specify is the list of companies and the date range we want. The web crawler then saves the spreadsheets to a designated folder. The web crawler was designed by us, but the original idea came from the 2011-2012 team, Frank Turner and Greg Burkhart.


Chapter III – Mathematical Analysis

A. Singular Value Decomposition

At the beginning of the semester we determined that the mathematical method of prediction we would use was singular value decomposition (SVD). SVD allows us to take a KxN matrix X and break it down into the factors $X = USV^T$. U is a KxK matrix whose K columns are known as the left singular vectors. The left singular vectors of X are the eigenvectors of $XX^T$. V is an NxN matrix whose N columns are known as the right singular vectors. The right singular vectors of X are the eigenvectors of $X^TX$. S is a KxN matrix with nonnegative values along its diagonal called singular values. These values are ordered from largest to smallest, and the nonzero singular values equal the square roots of the nonzero eigenvalues of $X^TX$ and $XX^T$.

Let X be the KxN matrix formed by stacking the vectors $x_1, \dots, x_K$ one on top of another as rows. Each vector can be represented as a linear combination of the columns $v_1, \dots, v_N$ of V, $x_i = \sum_{j=1}^{N} a_{ij} v_j$. If we let L < N and define $V_L = [v_1, \dots, v_L]$, we can represent each row approximately as $x_i \approx \sum_{j=1}^{L} a_{ij} v_j$, where $a_{ij} = v_j^T x_i$ are the coefficients (Chong, 1). The approximation holds because the first L singular values account for the bulk of the sum of the squared singular values. The results of figure 1 and figure 2 support the statement that the first L columns of V, which correspond to the L largest singular values, carry the bulk of this sum (Turner, Burkhart, 10).
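To make the phrase "bulk of the sum of the squared singular values" concrete, the Python sketch below computes the fraction of that sum captured by the first L singular values. The function name and the example singular values are hypothetical; in practice the values would come from an SVD routine such as Matlab's svd.

```python
def energy_fraction(singular_values, num_kept):
    """Fraction of the sum of squared singular values held by the first num_kept."""
    squares = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(squares)
    if total == 0:
        raise ValueError("all singular values are zero")
    return sum(squares[:num_kept]) / total

# Made-up singular values, largest first.
example = [10.0, 5.0, 2.0, 1.0]
for num_kept in range(1, len(example) + 1):
    print(num_kept, round(energy_fraction(example, num_kept), 4))
```

A value of L could then be chosen as the smallest number of singular values whose fraction exceeds some threshold; the threshold itself would be a design decision that the report does not specify.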

Figure 1: Selected Right Singular Vectors

Figure 2: Magnitude of Variance


B. Method of Prediction

Let $V_L = [v_1, \dots, v_L]$ and create a vector y of length M, where y represents the first M data points of a vector x of length N that comes from the same distribution as $x_1, \dots, x_K$. We want to predict the remaining N-M data points. The prediction method starts by defining $V_M$ as the first M rows of $V_L$. We then compute the coefficients that fit y with the columns of $V_M$, and our prediction of the remaining N-M days is obtained by applying those coefficients to the remaining N-M rows of $V_L$. We use only the first M rows of $V_L$ to compute the coefficients because we only have the first M components of x (Chong, 2). Figure 3 below shows what our prediction should look like (Turner, Burkhart, 11). The blue line of length M represents y, the actual first M data prices. The red line represents the fitted and predicted values; it does not match the actual data perfectly. It is smooth because we are only using the first L singular vectors. The section of the red line beyond day M represents the predicted N-M data prices. The goal is for the prediction to track the actual data prices over those N-M days.
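The steps of the prediction method can be sketched in Python as follows. This is a minimal illustration that assumes the coefficients are chosen by an ordinary least-squares fit of y on the first M rows of the retained singular vectors; that fitting rule, the function names, and the toy basis are assumptions of this sketch rather than details confirmed by the report, whose programs were written in Matlab.

```python
import math

def solve_linear_system(matrix, rhs):
    """Solve matrix * c = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the first M rows do not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def predict_remaining(v_l, y):
    """Fit y on the first M rows of v_l, then extend to the remaining rows."""
    m, num_vectors = len(y), len(v_l[0])
    v_m = v_l[:m]
    # Normal equations (V_M^T V_M) c = V_M^T y of the least-squares fit.
    gram = [[sum(row[i] * row[j] for row in v_m) for j in range(num_vectors)]
            for i in range(num_vectors)]
    moment = [sum(row[i] * y_t for row, y_t in zip(v_m, y))
              for i in range(num_vectors)]
    coeffs = solve_linear_system(gram, moment)
    return [sum(v * c for v, c in zip(row, coeffs)) for row in v_l[m:]]

# Toy basis with N = 4, L = 2: a constant vector and a linear trend.
scale = math.sqrt(20.0)
V_L = [[0.5, -3 / scale], [0.5, -1 / scale], [0.5, 1 / scale], [0.5, 3 / scale]]
prediction = predict_remaining(V_L, [1.0, 2.0, 3.0])  # M = 3 known points
print(prediction)
```

In the toy example the known points lie on a straight line, which the two basis vectors can represent exactly, so the single predicted value continues that line. With real stock data the columns of V_L would come from the SVD and the fit would only be approximate.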

Figure 3: Actual data prices vs. Predicted data prices


Chapter IV – Results and Analysis of our Prediction Algorithm

We used the prediction method discussed in section B of Chapter III to create a prediction algorithm. The prediction algorithm was created in Matlab, and the programs that make it up are discussed in detail in Chapter V. The goal is for the plots of our actual data vs. predicted data to look similar to figure 3. Throughout the semester we have been using the closing prices as the analyzed data. Before running the prediction algorithm we must first declare the stockname, period, M, and L. The stockname indicates the company that we wish to apply our prediction method to. The period, N = period, is the interval of days that we are interested in. M is the number of known days of data (the vector y, discussed in section B of Chapter III, is composed of these first M data points). L is a natural number that sets the number of important singular vectors to keep, as discussed in Chapter III. After declaring these four variables we can run the prediction algorithm and obtain the predicted N-M closing prices.

If we run our prediction algorithm after setting stockname = GE (General Electric),

N= period = 90, M = 60, and L = 10, we get the plot shown in figure 4 below.

Figure 4: Prediction on GE for the next 30 days

If you compare figures 4 and 3, it is evident that the prediction software is not producing an accurate prediction. If we look at the first M days, we can see that over-fitting is occurring. The blue line should not perfectly match the red line for the first M days. Instead it should resemble figure 3, where the predicted data line is smooth but still gives the general shape of the actual data. If we look at the predicted data for the remaining N-M days, in this case the next 30 days, we can tell that the prediction algorithm is not predicting data points that represent the actual stock data.


We now rerun the prediction algorithm with the same stockname, N, and L as before, but increase the number of known days to M = 75. We do this to see if predicting a smaller number of days will produce better results. Figure 5 below shows the plots of the predicted closing prices vs. the actual closing prices. Again there is over-fitting for the first M = 75 days (the blue line matches perfectly with the red line), and the predicted N-M = 15 days do not represent the N-M = 15 days of actual stock data.

Figure 5: Prediction on GE for the next 15 days

Continuing with the same approach, we increase the known days to M = 80 and decrease the days we wish to predict to N-M = 10 days. Figure 6 shows the predicted closing prices vs. the actual closing prices. Once again the first M days of the plot suggest that the blue predictive curve is experiencing over-fitting.

Figure 6: Prediction on GE for the next 10 days


If we change the company that we want to examine to Apple and set the parameters to N = 80, M = 65, and L = 10, we obtain the plot in figure 7. Note that the period had to be changed to N < 85 days, because we need our stock matrix to be a KxN matrix such that K > N. Figure 7 below shows the actual closing prices vs. the predicted closing prices. Once again the prediction plot is experiencing over-fitting, and the predicted N-M days do not accurately represent the actual N-M days.

Figure 7: Prediction on Apple for the next 15 days

Based on the results shown in the above figures, it is evident that the code (described in Chapter V) that our prediction algorithm uses is flawed. The over-fitting suggests that L is not playing the role it should. Dr. Chong has suggested that we may have used M, rather than L, as the number of important singular values. If this were the case, it would explain why the predicted plot is experiencing over-fitting.
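A short derivation shows why using M rather than L singular vectors would lead to a perfect fit over the known days. It assumes that the coefficients are chosen by a least-squares fit of y on the first M rows of the retained right singular vectors, and that those rows have full column rank; the fitting rule and the symbols $V_M$ and $a$ are assumptions of this sketch, not formulas quoted from the report.

```latex
\begin{align*}
  a &= \arg\min_{c}\, \lVert V_M c - y \rVert_2^2,
      \qquad V_M \in \mathbb{R}^{M \times L},\; y \in \mathbb{R}^{M} \\
  L = M,\ V_M \text{ invertible}
    &\;\Longrightarrow\; a = V_M^{-1} y
     \;\Longrightarrow\; V_M a = y
     \quad \text{(zero residual on all $M$ known days)} \\
  L < M
    &\;\Longrightarrow\; V_M a = V_M (V_M^T V_M)^{-1} V_M^T y
     \quad \text{(projection of $y$, generally $\neq y$)}
\end{align*}
```

With L much smaller than M the fitted curve can only follow the general shape of the data, which is the smooth behaviour expected from figure 3.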


Chapter V – Matlab Programs

Throughout this semester we have primarily been designing and writing code in Matlab, the one exception being our web crawler. The programs below are necessary elements of our prediction algorithm. Note, however, that the results shown in figures 4 through 7 of Chapter IV indicate that our code is not functioning as desired.

A. Stock Matrix

The Stock Matrix subroutine was the first of many programs necessary for predictive forecasting. The program depends on the company name, the period, and the data analyzed (let us call these the dependent variables). The data analyzed tells the program which column of the .csv file should be read (i.e. opening price, closing price, low price, high price, volume, or adjusted closing price). Throughout the semester we have been using the closing price.

After declaring the dependent variables and running the program, we get our stock matrix. From this point forward we will call this matrix X, where X is a KxN matrix and K > N. N equals the period that we declared before running the program. K is the number of rows of X, and is found by K = ⌈ ⌉.

Row i of matrix X is $x_i^T$, where each $x_i$ is a column vector of height N (Chong, 1). Each of these vectors represents the data analyzed of the company’s stock over N consecutive trading days, indexed in increasing order of days (Chong, 1).

The stock matrix X is composed of the historical stock data. This variable does not change. In this program we also define y, where “$y \in \mathbb{R}^M$ representing the first M data points of a vector we believe is drawn from the same distribution as $x_1, \dots, x_K$ (Chong, 2).”
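A minimal Python sketch of the stock matrix construction is given below. The function name mirrors StockToMatrix.m, but the body is a guess at its behaviour: in particular, this sketch keeps only complete rows of N days and drops any leftover days, which is an assumption and may differ from how the report computes K.

```python
def stock_to_matrix(prices, period):
    """Split a price series into rows of `period` consecutive trading days."""
    if period <= 0:
        raise ValueError("period must be positive")
    num_rows = len(prices) // period  # assumption: drop any incomplete final row
    return [prices[i * period:(i + 1) * period] for i in range(num_rows)]

closing_prices = [float(day) for day in range(1, 10)]  # made-up series of 9 days
X = stock_to_matrix(closing_prices, period=2)
print(len(X), len(X[0]))
```

Here K = 4 and N = 2, so the requirement K > N is met; the ninth day does not fill a complete row and is left out.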

B. Normalization

A dilemma for our prediction arises when the row vectors ($x_1, \dots, x_K$) of our stock matrix X contain prices spanning a number of years. This is because the value of currency changes due to inflation (i.e. $20 five years ago held more value than $20 today). Therefore, in order to get an accurate prediction we need to account for inflation. We address this problem by normalizing our stock matrix X (Chong, 4).

Suppose M is a given variable (we will explain the need for M later) and that M < N. We will let $x_1, \dots, x_K$ represent the row vectors of matrix X prior to normalization. From section A we know that $x_i$ is a vector of stock prices over N consecutive trading days. We call the normalized version of $x_i$ $z_i$, obtained by dividing $x_i$ by its Mth entry, $z_i = x_i / x_i(M)$. Thus, the normalization of matrix X produces a new matrix Z with the same KxN dimensions. Note that matrix Z is formed by stacking the row vectors $z_1, \dots, z_K$ on top of each other (Chong, 4).
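The normalization step can be sketched in Python as below. It assumes that each row is divided by its Mth entry, by analogy with the way y is later scaled by y(M); that rule and the function name normalize_rows are assumptions of this sketch, since the report's Normalization.m is not reproduced here.

```python
def normalize_rows(matrix, m):
    """Divide each row by its M-th entry (1-based), so that entry becomes 1."""
    normalized = []
    for row in matrix:
        reference = row[m - 1]
        if reference == 0:
            raise ValueError("cannot normalize a row whose M-th entry is zero")
        normalized.append([price / reference for price in row])
    return normalized

X = [[10.0, 20.0, 40.0], [100.0, 200.0, 400.0]]  # made-up prices, N = 3
Z = normalize_rows(X, m=2)
print(Z)
```

The two made-up rows differ by a factor of ten but have the same shape, and after normalization they are identical, which is the effect the normalization is meant to have on prices from different years.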


In order to use vector y from section A of this chapter in our prediction method, we also have to normalize it. Vector y represents the last M days of data points and is normalized by $y_{norm} = y / y(M)$, where y(M) is the value of the Mth (last) data point of y (Chong, 5). Here ynorm is a vector of length M and is the normalized version of y.

C. NormSubMean

The next program that we wrote is titled NormSubMean. The first operation that the program performs is calculating the mean of the normalized matrix Z. The mean is described by the average vector $\bar{z} = \frac{1}{K} \sum_{i=1}^{K} z_i$ (Chong, 1).

Once we have our average vector $\bar{z}$, we subtract its first M entries from the normalized ynorm, since ynorm has length M. We define ynew to be this difference. After determining the average vector $\bar{z}$, we also need to subtract it from each $z_i$ (Chong, 5).

We felt that the easiest way to do this was to create a mean matrix, which we called MU, composed of K copies of $\bar{z}$ stacked as rows. Thus, the mean matrix MU has the dimensions KxN. Now we can subtract the mean matrix MU from our normalized matrix Z, since they have the same dimensions. The resulting matrix is our NormSubMean matrix, which we will call XNew.
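The mean-subtraction step can be sketched in Python as follows. The function and variable names are hypothetical, the numbers are made up, and subtracting only the first M entries of the average vector from the normalized y is an assumption based on y having length M.

```python
def column_means(matrix):
    """Average vector: the mean of each column over all rows."""
    num_rows = len(matrix)
    return [sum(column) / num_rows for column in zip(*matrix)]

def subtract_mean(matrix, mean_vector):
    """Subtract the average vector from every row (same effect as Z - MU)."""
    return [[value - mu for value, mu in zip(row, mean_vector)] for row in matrix]

Z = [[0.5, 1.0, 2.0], [1.5, 1.0, 4.0]]  # made-up normalized matrix, K = 2, N = 3
z_bar = column_means(Z)
X_new = subtract_mean(Z, z_bar)
y_norm = [0.75, 1.0]                    # made-up normalized y, M = 2
y_new = [value - mu for value, mu in zip(y_norm, z_bar)]  # zip stops after M entries
print(z_bar, X_new, y_new)
```

Subtracting row by row avoids building the KxN matrix MU explicitly, but the result is the same as subtracting MU from Z.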

D. Overlap Matrix

The next subroutine program that we wrote is called OverlapMatrix.m. This program uses the NormSubMean matrix, XNew, to create the overlap matrix, whose rows are overlapping periods of the historical data. If we set our period to N=90 and want our overlap to be 30 days, the first rows are

row 1 = [day 90, day 89,…, day 1]

row 2 = [day 121, day 120,…, day 31]

row 3 = [day 152, day 151,…, day 62]

Note that day 1 represents the latest date and that the last data point of the overlap matrix represents the earliest date. If we set the overlap to zero, the resulting matrix will be the same as XNew. We wrote this program for cycling, so that the historical stock data could be used to predict trends and patterns.
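One way to build overlapping rows is sketched in Python below. It works on a one-dimensional series and assumes that "overlap" means the number of days shared by consecutive rows, so that the start of each row advances by N minus the overlap; both the definition and the function name are assumptions of this sketch and may not match OverlapMatrix.m, and the day ordering within a row is simply that of the input series.

```python
def overlap_matrix(series, period, overlap):
    """Build rows of `period` values where consecutive rows share `overlap` values."""
    if not 0 <= overlap < period:
        raise ValueError("overlap must satisfy 0 <= overlap < period")
    stride = period - overlap
    last_start = len(series) - period
    return [series[start:start + period] for start in range(0, last_start + 1, stride)]

days = list(range(1, 11))  # made-up series standing in for 10 trading days
print(overlap_matrix(days, period=4, overlap=2))
print(overlap_matrix(days, period=4, overlap=0))
```

With an overlap of zero the rows are simply consecutive non-overlapping periods, which matches the statement that a zero overlap reproduces the original matrix.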

E. Right Singular Vectors

This subroutine program is used to determine the matrix V (where V is composed of all the right singular vectors). The program simply reads the result of our overlap matrix program and applies SVD to the matrix. The program then saves only V.


F. Prediction Algorithm

The prediction program is the final program that we wrote this semester. The program depends on the stock name, period, M, and L variables. It begins by calling our subroutine program titled RightSingularVectors. The program also depends on the ynew found in section C of this chapter. We wrote our prediction program using the concepts discussed in the Method of Prediction, section B of Chapter III. However, because the stock data spans years, we replace the vector y in section B of Chapter III with ynew. If we run the prediction program with these changes we obtain a prediction, but the predicted values are still normalized and mean-subtracted, so they are not yet stock prices. “To make use of the predicted values we should add back the average vector $\bar{z}$ from days M+1 through N and also multiply the result by y(M) to get to actual stock prices (Chong, 5).” As a result of running this program, we get a table whose values are the predicted prices for the remaining N-M days, indexed in increasing order of days. The results of running our prediction algorithm can be seen in Chapter IV.
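The final conversion back to prices, as described in the quotation from Chong, can be sketched in Python as follows. The function name, argument names, and numbers are hypothetical; the sketch only illustrates adding back the average vector for days M+1 through N and multiplying by y(M).

```python
def to_actual_prices(predicted, mean_vector, m, y_last):
    """Add back the mean for days M+1..N, then rescale by y(M)."""
    tail_means = mean_vector[m:]
    if len(predicted) != len(tail_means):
        raise ValueError("expected one predicted value per day from M+1 to N")
    return [(value + mu) * y_last for value, mu in zip(predicted, tail_means)]

z_bar = [1.0, 1.0, 1.5, 2.0]   # made-up average vector, N = 4
predicted = [0.5, -0.25]       # made-up normalized predictions for days 3 and 4
prices = to_actual_prices(predicted, z_bar, m=2, y_last=20.0)
print(prices)
```

The order of the two operations matters: the mean is added while the values are still in normalized units, and only then is the result scaled by the last known price.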


Chapter VI – Conclusions and Future Work

The main priority for this semester was to create a predictive algorithm, and this much was achieved. The subroutines used by the predictive algorithm are those discussed in Chapter V. We feel that we have a good base for the project, and now that we are more experienced with Matlab we can continue to progress.

The results from Chapter IV indicate that our prediction algorithm is not coming close to producing a prediction curve that represents the actual curve. For starters, our prediction curve is experiencing over-fitting. The expected reason for this is that the first M (instead of L) “singular vectors are being used to represent bulk of the sum of the squared singular values (Chong 1).” Also, based on the results, it seems that the only way to make money would be to do the complete opposite of what the prediction curve indicates (invert the curve).

We recently figured out that we were not reading in the correct data to make a prediction. The stock matrix that we designed reads all of the stock data from the .csv file, and the y vector was simply the first M days of the first vector of the stock matrix. This is wrong: our stock matrix needs to contain the historical prices, which remain constant, and the y vector is the data that needs to be updated consistently (where y represents the stock data over the last M days).

Even though our results have yet to indicate it, we believe that SVD has a high degree of predictive power. In order to prove this, though, we are going to have to refine our code and continue to run tests on the prediction algorithm. We will also have to decide on a figure of merit, so that we can determine which results are satisfactory. Once experimentation gives us an algorithm that produces satisfactory results, we will need to change the type of data analyzed. Thus far in the project we have been exclusively using the closing prices. We will need to examine the prediction algorithm when we change the data analyzed to high prices, low prices, opening prices, volume, or adjusted closing prices. It is expected that the prediction algorithm will be independent of the analyzed data, but experimentation is necessary to confirm this. The next step, after creating a prediction algorithm independent of the data analyzed, would be to create a variable that indicates the type of data to be analyzed.

At the end of next semester we hope to have a GUI implemented in the design, where the user can enter a company’s name and the GUI will display a group of plots. The user can then decide whether the company is worth investing in.


Bibliography

Chong, E.K.P. "Prediction using Principal Components." (2012): 5. Print.

Turner, Frank, and Greg Burkhart. "Stock Market Software." (2012): 18. Print.


Appendix A: Abbreviations

SVD – Singular Value Decomposition

X – stock matrix, contains historical stock data

y – contains stock data over the last M days

Appendix B: Budget

This is a theoretical and software-based senior design project; therefore we have no expenses.


Appendix C: Project Timelines

Timeline: September 3, 2012

Task | Dates
Practice writing and reading Matlab code: reading Matlab books, finding code online and experimenting with it. | Summer-Present
Read the final report of last year’s team and their project continuation description. | 8/20/12-8/25/12
Obtain code from the 2011 senior design team. | 8/26/12-9/1/12
Begin studying the current market stock predictor programs, and decompose them to figure out the method they used. We will do this to see if there is an overlapping theme in programs and then decide for ourselves what direction we want to take. | 9/2/12-9/22/12
Begin decomposing the code to understand its purpose. | 9/23/12-10/6/12
Have either Frank or Greg run us through the code and how it works (i.e. what functions it is performing). Ask them where they left off and what troubles they had. | 9/23/12-10/6/12
Research the different methods of prediction. | 10/7/12-10/20/12
Determine if we want to continue working in Greg’s and Frank’s direction or start over with a new approach. | 10/21/12-10/27/12
Pick the route that we want to take with the method of prediction: SVD, correlation coefficient, or Bregman co-clustering. | 10/21/12-10/27/12
Determine how we want to analyze the data (i.e. how/if we want to group the data to analyze it), whether it is looking at the data as a whole, through a bear market, or through a bull market. We also want to consider how/if we should group the data (i.e. Microsoft, Apple, Linux would fall under the computer category and GE and Ford would fall into the vehicle category). | 10/28/12-11/10/12
Begin extracting stock market data, using a web crawler to put the data into Matlab. | 11/11/12-11/24/12
Start writing our Matlab code for predicting the rise and fall of stocks. | 11/25/12-12/1/12

Second Semester

Task | Dates
Design pattern matching software. | Week 1 - Week 5
Update web-crawler and implement prediction software. | Week 6 - Week 9
After finishing writing code, we need to run analysis and perform experiments on the code. | Week 10 - Week 12
Debug the code, and increase performance. | Week 13 - Week 14
Summarize findings in a report. | Week 15 - Week 16

Timeline: October 17, 2012

Task Begin End Assigned

Design web-crawler Monday 9/03/12 Thursday 9/20/12 Jordan

Design website Wednesday 9/12/12 Wednesday 9/19/12 Group

Notebook collection Monday 9/17/12 Friday 9/21/12 Individual

Research SVD

method of prediction

Monday 9/03/12 Sunday 9/30/12 Group

Complete

understanding of

previous teams code

Monday 9/03/12 Sunday 10/17/12 Group

Notebook collection Monday 10/08/12 Friday 10/12/12 Individual

Test StockToMatrix algorithm (we will be using Frank’s and Greg’s StockToMatrix algorithm) Monday 10/15/12 Wednesday 10/17/12 Jordan

Testing and Measurement plan Monday 10/15/12 Friday 10/19/12 Group

Research pattern matching algorithm Sunday 10/21/12 Wednesday 10/31/12 Rayan

Design our pattern matching algorithm Saturday 10/21/12 Saturday 11/03/12 Jordan

Notebook collection Monday 11/05/12 Friday 11/09/12 Individual

Test and debug pattern matching program Monday 11/05/12 Friday 11/09/12 Jordan

Research prediction algorithm Monday 11/05/12 Friday 11/09/12 Rayan

Design our prediction program Saturday 11/10/12 Monday 11/19/12 Jordan

Test and debug prediction program Monday 11/19/12 Wednesday 11/21/12 Group

Run experiments on prediction program Monday 11/19/12 Saturday 12/01/12 Group


Summarize findings in report Monday 11/26/12 Wednesday 12/05/12 Group

Written report due Wednesday 12/05/12 Wednesday 12/05/12

Second Semester

Create a test bench program Monday 12/31/12 Monday 1/14/13 Rayan

Create an algorithm that randomly predicts stock fluctuations Monday 1/14/13 Monday 1/28/13 Jordan

Create new experiments for our code Monday 1/14/13 Saturday 2/02/13 Rayan

Compare prediction program to the random guesser program Monday 1/28/13 Saturday 2/02/13 Group

Create a bankroll program Sunday 2/03/13 Saturday 2/16/13 Jordan

Test the bankroll program Sunday 2/17/13 Friday 2/22/13 Rayan

Modify our prediction program to obtain better results Saturday 2/23/13 Sunday 3/03/13 Jordan

Increase the dimensionality of our program so that it can make predictions on all the elements of the stock data Monday 3/04/13 Monday 3/18/13 Group

Rerun experiments on the code Tuesday 3/19/13 Saturday 3/23/13 Rayan

Create new experiments for our code Tuesday 3/19/13 Wednesday 4/02/13 Jordan

Prepare for E-days Tuesday 3/19/13 Thursday 4/12/13 Group

Run more experiments Friday 4/13/13 Saturday 5/04/13 Group

Write final report Sunday 4/28/13 Saturday 5/11/13 Group


Timeline: December 7, 2012

Task Length Begin End Assigned

Researching and practicing writing MATLAB code 11 days Sat 8/11/12 Thu 8/23/12 Jordan

Reviewing last team's code and final paper 7 days Fri 8/31/12 Sun 9/9/12 Group

Research how to create a web-crawler 7 days Fri 8/31/12 Sun 9/9/12 Jordan

Research SVD method of prediction 26 days Mon 9/3/12 Sun 9/30/12 Group

Design web-crawler 1 day Sun 9/9/12 Sun 9/9/12 Group

Design website 6 days Wed 9/12/12 Wed 9/19/12 Group

Notebook collection 3 days Mon 9/17/12 Fri 9/21/12 Individual

Researching low rank approximation 3 days Fri 9/28/12 Tue 10/2/12 Group

Commenting Frank's and Greg's code 1 day Thu 10/4/12 Thu 10/4/12 Jordan

Notebook collection 3 days Mon 10/8/12 Fri 10/12/12 Individual

Testing and measurement plan 6 days Mon 10/8/12 Sat 10/15/12 Group

Meeting with Frank Turner to go over his team's coding 1 day Wed 10/17/12 Wed 10/17/12 Group

Create Stock Matrix subroutine program 11 days Mon 10/22/12 Fri 11/2/12 Jordan

Reviewing Dr. Chong's Prediction using Principal Components 22 days Sat 10/27/12 Fri 11/23/12 Group

Notebook collection 3 days Mon 11/5/12 Fri 11/9/12 Individual

Modifying and debugging Stock Matrix subroutine program 16 days Mon 11/5/12 Fri 11/23/12 Jordan

Design a Normalization program, then modify and debug the program 7 days Sun 11/18/12 Sun 11/25/12 Jordan

Design NormSubAve subroutine program, then modify and debug the program 7 days Sun 11/18/12 Sun 11/25/12 Jordan

Design Overlap subroutine program, modify and debug 6 days Mon 11/19/12 Sun 11/25/12 Jordan


Design Right Singular Vectors subroutine program, modify and debug 6 days Mon 11/19/12 Sun 11/25/12 Jordan

Design Prediction program, modify and debug 5 days Tue 11/20/12 Sun 11/25/12 Jordan

Link all subroutine programs to our Prediction Program 2 days Mon 11/26/12 Tue 11/27/12 Jordan

Prepare Oral Presentation 6 days Fri 11/30/12 Wed 12/5/12 Group

Summarize findings in report 8 days Fri 11/30/12 Fri 12/7/12 Group

Presentation 0 days Wed 12/5/12 Wed 12/5/12 Group

Written report due 0 days Fri 12/7/12 Fri 12/7/12 Group

Notebook Collection 0 days Mon 12/10/12 Mon 12/10/12 Individual