predictors of stock market values

Upload: aaronchall

Post on 06-Apr-2018


  • 8/2/2019 Predictors of Stock Market Values

    1/38

    PREDICTORS OF STOCK MARKET VALUES

    QMB 6305

    UNIVERSITY OF WEST FLORIDA

    Submitted by

    Aaron Hall

    April 13, 2010

    Instructor

    DR. GAYLE BAUGH


    TABLE OF CONTENTS

    Introduction
        The Data
        Prediction And Moving Averages
        Data Manipulation
        Data Exploration
        Linear Prediction
        Checking The Model
    Conclusions
        The Model
        Confidence Intervals
        Error Comparison
        Summary
    References
    Appendix A: Acknowledgements
        GNU/Linux/Ubuntu
        R
        OpenOffice.org
        Other Tools
    Appendix B: R Code


    INTRODUCTION

    The goal of this project is to analyze available econometric data and find

    predictors of the valuation of the United States stock market. Availability of

    data is an important constraint for this analysis. Because the stock market has
    its value calculated every second, predictors with monthly frequency would
    have been preferable; however, macroeconomic data is usually published in
    quarterly and annual forms. Thus, all data used here has been annualized.

    The proxy for stock values, the model's dependent variable, is the
    Ibbotson Large Company total return series. Since these values are given as
    percentage returns from year 1925, the figures used for the model are
    transformed into the value of $1,000 invested in 1925 (Harrington, 2008).

    Independent (predictor) variables include projected GDP, interest rates,
    inflation, and the money supply.

    Projected GDP is measured by the Fed's predictions as reported in the
    Greenbook. Since the Greenbook is only released with a five-year delay, the
    data can be analyzed only up to that date. Further, since the Fed's
    methodology is not public, the results for this figure, if they are
    significant, cannot be accurately reproduced. GDP is an estimate even after
    the period for which it is measured, and is usually revised several times
    (St. Louis Fed, 2010).

    Projected GDP is important because it represents the productivity of the

    United States economy. It would make sense that the more productive the US

    economy, the greater chance for profitability for any US company, though

    certainly not a guarantee. If an investor believes the economy will be more


    productive in the future, perhaps this will increase the price the investor will

    pay for the investment.

    Since stock prices are based on the present values of future cash flows,

    changes in interest rates are likely to affect the valuation of stocks. Lower

    rates would increase the present value of the future cash flows. Interest rates
    also affect the firm's cost of capital: lower interest rates would
    decrease the borrowing costs of firms and increase their profitability.

    The original proposal sought to examine bond prices as a predictor
    variable. Because bond prices move inversely with interest rates, they would
    largely duplicate the interest rate variable, which precludes the use of
    bonds as a separate predictor.

    Inflation may also be a predictor of stock valuations. Higher inflation

    means that investors in bonds, if they are spending the bond income, are

    losing capital to inflation. As a result, they may seek higher returns in the

    stock market. Further, the values of hard assets owned by the companies

    may also be increasing in value relative to the weakening dollar. As with
    the Large Company stock values, the figures used for inflation are indexed to
    $1,000 in 1925.

    The money supply is the final predictor to examine. The money supply

    represents the number of dollars in circulation. It is related to inflation as the

    more dollars in circulation, the less each individual dollar may be worth.

    Since it may be highly correlated to inflation, it is unlikely both will remain in

    the final model. This idea is not unrelated to Keynes' idea that the

    components of money demand include a speculative demand in addition to


    classical notions of precautionary demand and transactional demand.

    Since some predictors are only available on a quarterly or annual basis,
    important data is only available on a yearly basis.

    A concern for this analysis is whether the stock market of today is
    affected by the same causes as it was even a decade ago.

    The Data

    The data is expected upon graphical examination to reveal trends and

    cyclicality. It is expected that since the stock market values are

    representative of growth, perhaps a log transformation of the data is the best

    approach. However, the other values also follow a growth form, and

    transforming both the predictor and dependent variables in an identical

    fashion will not yield any additional predictive power. The Box-Cox operation

    may reveal the optimal transformation.

    Stock market values will be represented by Ibbotson Large Company
    Returns. Since returns are given in terms of percentage gains or losses, the
    data is transformed into values based on $1000 invested in 1925; each figure
    represents the accumulated value at the end of the year (Harrington, 2008).
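The transformation from percentage returns to dollar values described above can be sketched in Python as follows. This is only an illustration, not the paper's own code: the function name `cumulative_values` and the three returns are hypothetical, and the returns are made up rather than taken from the Ibbotson series.

```python
def cumulative_values(annual_returns_pct, start_value=1000.0):
    """Compound annual percentage returns into end-of-year values."""
    values = []
    value = start_value
    for pct in annual_returns_pct:
        # A return of 10.0 means the value grows by 10% over the year.
        value *= 1.0 + pct / 100.0
        values.append(value)
    return values


# Hypothetical returns for three consecutive years (not Ibbotson data).
example_returns = [10.0, -5.0, 20.0]
example_values = cumulative_values(example_returns)
print([round(v, 2) for v in example_values])
```

Each output figure is the value at the end of its year, which is why the series in the tables grows multiplicatively rather than additively.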

    Projected economic production is measured by Projected Gross National

    Product (in billions) for up to year 1992, and Projected Gross Domestic

    Product (in billions) for year 1992 on, with one year of overlap in projections

    for year 1992. These projections are given by the Greenbook, which is

    released along with the Federal Open Market Committee meeting transcripts

    after a five year lag (St. Louis Fed, 2010).


    Inflation is given by Ibbotson Inflation Return data. The Fed Funds Rate

    will proxy for interest rates. Money supply may be represented variously by

    Institutional Money Funds (series IMFNS from the St. Louis Fed) and M2, and

    the two series added together. (M3 was to be our time series for money

    supply as it is the most encompassing definition of money, but it has been

    discontinued by the Fed on the grounds that the costs of gathering the data

    are not overcome by the value of the series.)

    It should be noted that both the Large Company and Inflation return data
    are given in terms of annual percentage growth and have been transformed
    to indicate the growth in the value of $1000 in 1925; these figures
    therefore indicate the value at the end of each period. For prediction, the
    predictor variables (other than projected GDP and GNP) will be lagged.

    Occam's razor states that if two possible explanations are equally likely,

    one should accept the least complicated explanation. When making

    predictions, one should accept more complicated explanations only if the

    more complicated explanation provides significantly greater prediction value.

    Thus, this paper will seek to find the simplest model with the best

    prediction value.


    Prediction And Moving Averages

    There are various ways of attempting to predict a variable based on its

    past values. The simplest method is to use the last measured value. This

    method may put too much emphasis on a single period's value. Various forms

    of moving averages can provide a more nuanced approach to prediction of

    the next period's value.

    Simple Moving Averages (SMAs) weight all periods evenly, and the only
    parameter is the number of periods used for prediction.

    Table 1: Raw Data

    Year  LCStock     GNP      GDP      Inflation FEDFUNDS  IMFNS   M2NS    M2IMFNS
    1978  89597.47    8445.3   NA       3777.37   NA        NA      NA      NA
    1979  106119.24   9385.1   NA       4280.14   13.78     9.5     1479.0  1488.5
    1980  140523.09   10219.1  NA       4810.87   18.90     15.2    1604.8  1620
    1981  133623.41   11342.9  NA       5240.97   12.37     38.0    1760.3  1798.3
    1982  162232.18   12479.4  NA       5443.79   8.95      50.0    1917.2  1967.2
    1983  198750.65   12979.4  NA       5650.66   9.47      42.5    2136.2  2178.7
    1984  211212.31   14604.2  NA       5873.86   8.38      65.9    2320.9  2386.8
    1985  279138.19   15628.9  NA       6095.30   8.27      68.2    2506.6  2574.8
    1986  330695.02   16478.2  NA       6164.18   6.91      88.5    2744.1  2832.6
    1987  347990.37   17707.1  NA       6436.02   6.77      95.0    2842.7  2937.7
    1988  406487.55   18951.2  NA       6720.49   8.76      94.9    3006.3  3101.2
    1989  534490.48   20890.4  NA       7032.99   8.45      112.5   3171.4  3283.9
    1990  517547.13   22115.7  NA       7462.71   7.31      141.5   3290.2  3431.7
    1991  675657.77   22918.9  NA       7691.07   4.43      191.2   3391.1  3582.3
    1992  727480.73   23731.8  23646.8  7914.11   2.92      216.0   3446.7  3662.7
    1993  800156.05   NA       25106.9  8131.75   2.96      221.3   3501.2  3722.5
    1994  810638.09   NA       26916.3  8348.86   5.45      216.1   3517.7  3733.8
    1995  1114059.93  NA       28464.6  8560.92   5.60      270.3   3663.9  3934.2
    1996  1371073.56  NA       29644.5  8845.15   5.29      332.4   3839.7  4172.1
    1997  1828463.70  NA       31702.1  8995.51   5.50      409.7   4053.6  4463.3
    1998  2351038.63  NA       33760.6  9140.34   4.68      565.6   4398.2  4963.8
    1999  2845697.15  NA       35286.5  9385.30   5.30      674.2   4661.9  5336.1
    2000  2586454.14  NA       39068.3  9703.47   6.40      833.4   4948.7  5782.1
    2001  2279183.39  NA       41928.5  9853.87   1.82      1248.0  5469.0  6717
    2002  1775483.86  NA       41733.8  10088.39  1.24      1300.8  5816.2  7117
    2003  2285047.73  NA       43518.7  10278.05  0.98      1154.7  6101.4  7256.1
    2004  2533432.42  NA       46639.7  10613.12  2.16      1103.0  6443.7  7546.7
    2005  2657823.95  NA       NA       10976.09  4.16      1172.1  6703.1  7875.2
    2006  3077760.13  NA       NA       11254.88  5.24      1378.4  7102.3  8480.7
    2007  3246729.16  NA       NA       11714.08  4.24      1934.8  7530.2  9465
    2008  2045439.37  NA       NA       11724.62  0.16      2430.9  8251.3  10682.2

    Exponential Moving Averages (EMAs) reduce the parameters involved in

    weighted moving averages to the number of periods used and a parameter,

    alpha, which defines the amount of weighting each period receives. If alpha is

    restricted to 2/(n + 1), one may reduce the number of parameters to 1

    (Colby, 2003).

    Weighted Moving Averages (WMA) may also be considered. There are

    innumerable variations in choice for weightings. Restricting the weighting to
    n - t + x, where n is the number of periods, t is the most recent period, and
    x is the period number, can provide a general rule structure with equally

    declining rates of weighting. SMA and the Last method are actually restricted

    cases of WMA, the SMA with equal weighting, and the Last method with

    n=1.

    These are simple methods for forecasting time series data. They can

    provide a baseline for deciding if other more complicated methods are

    worthwhile. Measurement of the degree to which they fail to predict can

    provide a way of eliminating the less effective methods of prediction. The

    methods are demonstrated and compared on the following pages.
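The four baseline methods and the three error measures used to compare them can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the paper's own code: all function names are hypothetical, the EMA is seeded with the first observation, and the WMA uses linearly declining weights n, n-1, ..., 1 from the most recent period backwards. The four input values are the 1978 to 1981 LCStock figures from Table 1.

```python
def last_forecast(series, i):
    """Forecast period i with the value observed in period i - 1."""
    return series[i - 1]


def sma_forecast(series, i, n=3):
    """Simple moving average of the n periods before period i."""
    window = series[i - n:i]
    return sum(window) / n


def wma_forecast(series, i, n=3):
    """Weighted moving average; the newest period gets weight n, the oldest 1."""
    window = series[i - n:i]
    weights = range(1, n + 1)  # oldest to newest
    return sum(w * y for w, y in zip(weights, window)) / sum(weights)


def ema_forecasts(series, alpha=0.5):
    """Exponential moving average forecasts; entry i is the forecast for period i."""
    forecasts = [None, series[0]]  # assumption: seed with the first observation
    for i in range(2, len(series)):
        forecasts.append(alpha * series[i - 1] + (1 - alpha) * forecasts[i - 1])
    return forecasts


def error_summary(actuals, forecasts):
    """Return (MAD, MSE, MAPE) for paired actual and forecast values."""
    errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    n = len(errors)
    mad = sum(errors) / n
    mse = sum(e * e for e in errors) / n
    mape = sum(e / a for e, a in zip(errors, actuals)) / n
    return mad, mse, mape


# LCStock for 1978-1981, taken from Table 1.
lcstock = [89597.47, 106119.24, 140523.09, 133623.41]

# Forecasts for 1981 (index 3) under each method.
forecast_1981 = {
    "Last": last_forecast(lcstock, 3),
    "SMA": sma_forecast(lcstock, 3),
    "EMA": ema_forecasts(lcstock)[3],
    "WMA": wma_forecast(lcstock, 3),
}
print({name: round(value, 2) for name, value in forecast_1981.items()})
```

With n = 3, the restriction alpha = 2/(n + 1) gives alpha = 0.5, which is the value used in this sketch.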


    Table 2: Prediction Method of using the Last Period's Value

    Year  LCStock     Last        Abs. Error  Squared Error     Abs. % Error
    1978  89597.47
    1979  106119.24   89597.47    16521.77    272968969.30      15.57%
    1980  140523.09   106119.24   34403.86    1183625368.92     24.48%
    1981  133623.41   140523.09   6899.68     47605638.59       5.16%
    1982  162232.18   133623.41   28608.77    818461848.89      17.63%
    1983  198750.65   162232.18   36518.46    1333598241.05     18.37%
    1984  211212.31   198750.65   12461.67    155293109.25      5.90%
    1985  279138.19   211212.31   67925.88    4613925152.16     24.33%
    1986  330695.02   279138.19   51556.82    2658106122.24     15.59%
    1987  347990.37   330695.02   17295.35    299129110.46      4.97%
    1988  406487.55   347990.37   58497.18    3421920136.68     14.39%
    1989  534490.48   406487.55   128002.93   16384749714.32    23.95%
    1990  517547.13   534490.48   16943.35    287077043.93      3.27%
    1991  675657.77   517547.13   158110.65   24998976830.29    23.40%
    1992  727480.73   675657.77   51822.95    2685618284.69     7.12%
    1993  800156.05   727480.73   72675.32    5281702797.86     9.08%
    1994  810638.09   800156.05   10482.04    109873251.96      1.29%
    1995  1114059.93  810638.09   303421.84   92064812356.11    27.24%
    1996  1371073.56  1114059.93  257013.63   66056004341.90    18.75%
    1997  1828463.70  1371073.56  457390.14   209205740036.57   25.01%
    1998  2351038.63  1828463.70  522574.93   273084552890.20   22.23%
    1999  2845697.15  2351038.63  494658.53   244687058285.70   17.38%
    2000  2586454.14  2845697.15  259243.01   67206938571.71    10.02%
    2001  2279183.39  2586454.14  307270.75   94415315113.52    13.48%
    2002  1775483.86  2279183.39  503699.53   253713215787.74   28.37%
    2003  2285047.73  1775483.86  509563.87   259655335708.01   22.30%
    2004  2533432.42  2285047.73  248384.69   61694953315.94    9.80%
    2005  2657823.95  2533432.42  124391.53   15473253157.22    4.68%
    2006  3077760.13  2657823.95  419936.18   176346398595.85   13.64%
    2007  3246729.16  3077760.13  168969.03   28550533539.91    5.20%
    2008  2045439.37  3246729.16  1201289.79  1443097161504.6   58.73%

                                  MAD         MSE               MAPE
                                  225742.56   115510479476.74   16.94%


    Table 3: Simple Moving Average, n=3

    Year  LCStock     3SMA        Abs. Error  Squared Error     Abs. % Error
    1978  89597.47
    1979  106119.24
    1980  140523.09
    1981  133623.41   112079.93   21543.48    464121451.77      16.12%
    1982  162232.18   126755.25   35476.94    1258612933.63     21.87%
    1983  198750.65   145459.56   53291.08    2839939693.60     26.81%
    1984  211212.31   164868.75   46343.57    2147726102.60     21.94%
    1985  279138.19   190731.71   88406.48    7815705416.34     31.67%
    1986  330695.02   229700.38   100994.63   10199915820.03    30.54%
    1987  347990.37   273681.84   74308.53    5521756957.95     21.35%
    1988  406487.55   319274.53   87213.02    7606111133.42     21.46%
    1989  534490.48   361724.31   172766.17   29848147904.41    32.32%
    1990  517547.13   429656.13   87891.00    7724827496.83     16.98%
    1991  675657.77   486175.05   189482.72   35903703032.64    28.04%
    1992  727480.73   575898.46   151582.27   22977183646.41    20.84%
    1993  800156.05   640228.54   159927.51   25576807786.21    19.99%
    1994  810638.09   734431.52   76206.58    5807442490.69     9.40%
    1995  1114059.93  779424.96   334634.98   111980567596.75   30.04%
    1996  1371073.56  908284.69   462788.87   214173535872.04   33.75%
    1997  1828463.70  1098590.53  729873.17   532714845282.45   39.92%
    1998  2351038.63  1437865.73  913172.89   833884735153.89   38.84%
    1999  2845697.15  1850191.96  995505.19   991030584614.88   34.98%
    2000  2586454.14  2341733.16  244720.98   59888359287.38    9.46%
    2001  2279183.39  2594396.64  315213.25   99359393130.41    13.83%
    2002  1775483.86  2570444.90  794961.03   631963045960.47   44.77%
    2003  2285047.73  2213707.13  71340.60    5089480910.29     3.12%
    2004  2533432.42  2113238.33  420194.09   176563073690.98   16.59%
    2005  2657823.95  2197988.00  459835.95   211449097709.30   17.30%
    2006  3077760.13  2492101.37  585658.77   342996192310.68   19.03%
    2007  3246729.16  2756338.83  490390.33   240482676908.24   15.10%
    2008  2045439.37  2994104.42  948665.04   899965361827.70   46.38%

                                  MAD         MSE               MAPE
                                  337495.89   204341961189.70   25.28%


    Table 4: Exponential Moving Average, n=3, alpha=0.5

    Year  LCStock     3EMA.5      Abs. Error  Squared Error     Abs. % Error
    1978  89597.47
    1979  106119.24   89597.47
    1980  140523.09   97858.35
    1981  133623.41   119190.72   14432.69    208302472.58      10.80%
    1982  162232.18   126407.07   35825.12    1283438940.56     22.08%
    1983  198750.65   144319.62   54431.02    2962736201.05     27.39%
    1984  211212.31   171535.14   39677.18    1574278358.49     18.79%
    1985  279138.19   191373.72   87764.47    7702601885.25     31.44%
    1986  330695.02   235255.96   95439.06    9108613854.10     28.86%
    1987  347990.37   282975.49   65014.88    4226934433.03     18.68%
    1988  406487.55   315482.93   91004.62    8281840836.41     22.39%
    1989  534490.48   360985.24   173505.24   30104067776.38    32.46%
    1990  517547.13   447737.86   69809.27    4873334340.09     13.49%
    1991  675657.77   482642.49   193015.28   37254899475.16    28.57%
    1992  727480.73   579150.13   148330.59   22001964771.08    20.39%
    1993  800156.05   653315.43   146840.62   21562167965.08    18.35%
    1994  810638.09   726735.74   83902.35    7039605132.02     10.35%
    1995  1114059.93  768686.92   345373.02   119282520409.14   31.00%
    1996  1371073.56  941373.43   429700.13   184642205957.35   31.34%
    1997  1828463.70  1156223.49  672240.21   451906896336.44   36.77%
    1998  2351038.63  1492343.60  858695.03   737357153315.08   36.52%
    1999  2845697.15  1921691.11  924006.04   853787164899.99   32.47%
    2000  2586454.14  2383694.13  202760.01   41111621713.91    7.84%
    2001  2279183.39  2485074.14  205890.75   42390999723.26    9.03%
    2002  1775483.86  2382128.76  606644.90   368018038091.87   34.17%
    2003  2285047.73  2078806.31  206241.42   42535521976.81    9.03%
    2004  2533432.42  2181927.02  351505.40   123556043793.01   13.87%
    2005  2657823.95  2357679.72  300144.23   90086558779.20    11.29%
    2006  3077760.13  2507751.83  570008.30   324909460857.22   18.52%
    2007  3246729.16  2792755.98  453973.18   206091648861.04   13.98%
    2008  2045439.37  3019742.57  974303.20   949266726355.81   47.63%

                                  MAD         MSE               MAPE
                                  311128.82   173819531389.31   23.61%


    Table 5: Weighted Moving Average, 3 periods

    Year  LCStock     3WMA        Abs. Error  Squared Error     Abs. % Error
    1978  89597.47
    1979  106119.24
    1980  140523.09
    1981  133623.41   120568      13055.87    170455826.59      9.77%
    1982  162232.18   131339      30892.91    954371666.51      19.04%
    1983  198750.65   149078      49672.90    2467397310.21     24.99%
    1984  211212.31   175723      35489.03    1259471001.03     16.80%
    1985  279138.19   198895      80243.12    6438958847.56     28.75%
    1986  330695.02   243098      87596.71    7673183321.03     26.49%
    1987  347990.37   293596      54394.74    2958787899.04     15.63%
    1988  406487.55   330750      75737.66    5736193038.66     18.63%
    1989  534490.48   374356      160134.08   25642922636.86    29.96%
    1990  517547.13   460739      56807.65    3227108677.42     10.98%
    1991  675657.77   504685      170972.79   29231696566.83    25.30%
    1992  727480.73   599426      128054.38   16397925184.80    17.60%
    1993  800156.05   675217      124938.57   15609647468.82    15.61%
    1994  810638.09   755181      55456.87    3075463885.92     6.84%
    1995  1114059.93  793285      320775.42   102896866984.15   28.79%
    1996  1371073.56  960602      410471.55   168486896330.42   29.94%
    1997  1828463.70  1191996     636467.26   405090572707.41   34.81%
    1998  2351038.63  1556933     794105.60   630603703969.31   33.78%
    1999  2845697.15  2013519     832177.68   692519690655.54   29.24%
    2000  2586454.14  2511272     75182.07    5652344215.05     2.91%
    2001  2279183.39  2633633     354449.17   125634213850.63   15.55%
    2002  1775483.86  2476026     700542.07   490759197131.8    39.46%
    2003  2285047.73  2078545     206502.31   42643204645.54    9.04%
    2004  2533432.42  2114216     419216.70   175742642136.79   16.55%
    2005  2657823.95  2324313     333511.19   111229711943.21   12.55%
    2006  3077760.13  2554231     523529.40   274083030393.65   17.01%
    2007  3246729.16  2847060     399669.05   159735345716.28   12.31%
    2008  2045439.37  3092255     1046815.91  1095823551868.69  51.18%

                                  MAD         MSE               MAPE
                                  302846.77   170434983551.10   22.20%


    Table 6: Moving Average Summary

    Method  MAD         MSE               MAPE
    Last    225742.56   115510479476.74   16.94%
    SMA     337495.89   204341961189.70   25.28%
    EMA     311128.82   173819531389.31   23.61%
    WMA     302846.77   170434983551.10   22.20%

    The Last method, in which the last period's value is used to predict the
    next period, clearly performs best among these methods: it has the least
    Mean Absolute Deviation (MAD), the least Mean Squared Error (MSE), and the
    least Mean Absolute Percentage Error (MAPE). Since the data follows a trend
    with occasional retracement, it makes sense that this simplest prediction
    method would provide the best performance.

    This is not to say that selecting the Last Period is a satisfactory approach
    to prediction. The variable is generally in an upward trend. If the troughs can
    be predicted, an investor will safely gain returns while avoiding or even
    profiting from market losses.

    Data Manipulation

    Because the data are to be used to predict future-period Large Company
    Stock returns from current data, all predictors other than projected GDP
    (Inflation, Interest Rates, and the Money Supply) will be lagged.

    Also, Projected GNP ends where Projected GDP begins, with one year of
    overlap. These series are spliced together with the average taken for the
    year of overlap (since there is minimal difference in the figures, less than
    half a percent). This series will be referred to as SPLICEGROSS in the data,
    and as GDP in the text from this point on.
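The splice described above can be sketched in Python as follows. This is an illustration only: the function name `splice` and the dictionary layout are hypothetical, while the figures for 1991 to 1993 are taken from Table 1.

```python
def splice(first, second):
    """Join two {year: value} series, averaging any year present in both."""
    spliced = {}
    for year in sorted(set(first) | set(second)):
        a = first.get(year)
        b = second.get(year)
        if a is not None and b is not None:
            spliced[year] = (a + b) / 2  # overlap year: take the average
        else:
            spliced[year] = a if a is not None else b
    return spliced


# Projected GNP and GDP around the 1992 overlap, from Table 1.
gnp = {1991: 22918.9, 1992: 23731.8}
gdp = {1992: 23646.8, 1993: 25106.9}

splice_gross = splice(gnp, gdp)

# Relative gap between the two projections in the overlap year.
overlap_gap = abs(gnp[1992] - gdp[1992]) / gdp[1992]
print(splice_gross, round(overlap_gap, 4))
```

The relative gap in the overlap year comes out below half a percent, which is the justification given for averaging the two figures.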

    The lagged money supply data will consist of the combined M2 and

    Institutional Money Funds, since M3 (the most expansive definition of the

    money supply) was discontinued. In the data, it is referred to under the

    MRM2IMFNS label. In the text, it will be referred to as simply the money

    supply. The most recent (thus lagged) Fed Funds and Inflation data are also

    used.

    Data Exploration

    The data are highly correlated. Pairwise correlation is measured where
    data is available for both terms. The term of primary interest is the
    dependent variable, Large Cap Stock. Correlations with the dependent
    variable are: Year, 0.926; GDP, 0.934; lagged Inflation, 0.912; lagged
    FEDFUNDS, -0.64; lagged M2 and Institutional Money Funds, 0.878.
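Measuring correlation only "where data is available for both terms" can be sketched in Python as below. The function name `pairwise_correlation` and the sample values are hypothetical, and missing observations (NA in Table 1) are represented here as `None`.

```python
from math import sqrt


def pairwise_correlation(xs, ys):
    """Pearson correlation using only positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Made-up series with gaps; only the first three positions are complete pairs.
series_a = [1.0, 2.0, 3.0, None, 5.0]
series_b = [2.0, 4.0, 6.0, 8.0, None]
rising = pairwise_correlation(series_a, series_b)

# A series that falls while the other rises gives a negative correlation.
falling = pairwise_correlation([1.0, 2.0, 3.0], [9.0, 7.0, 2.0])
print(round(rising, 3), round(falling, 3))
```

Because each pair of variables may have a different set of complete observations, the correlations in a matrix built this way are not all computed over the same years.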

    There are other pairwise correlations of note. FEDFUNDS is negatively
    correlated with all other variables. Its strongest relationships are with the
    GDP (-0.798, -0.781 lagged) and Inflation (-0.808) variables, both near -0.8.

    The measure of money supply is highly correlated with Inflation (0.962)

    as well as GDP (0.981). This relationship may indicate multicollinearity in the

    data, and it makes the money supply variable an early potential candidate for

    removal. The removal of the money supply (or any other variable, for that

    matter) from the model does not mean that it is unimportant, it merely

    means that it is not needed to predict the response variable.


    Multicollinearity is the problem of having one predictive variable being a
    near linear transformation of another predictive variable. When the variables
    are highly correlated, the model may still be reliable so long as the
    relationship between the independent variables is stable. If the relationship
    between variables changes, the model may cease to be reliable (Faraway,
    2005).

    Illustration 1: Plots of the Key Variables

    Multicollinearity also means that it is difficult to explain the individual

    importance of each variable. Small changes in the predicted variable can

    create large changes in the beta coefficients.

    The indication of multicollinearity is not only shown in the correlation

    matrix. It may also show itself in the model. (Variance inflation factors are
    another approach to examining collinearity, but are beyond the scope of this

    paper.)

    There are other potential problems as well, including heteroskedasticity
    (non-constant variance of the errors).

    Linear Prediction

    The first model,

        LCS = Year + Projected GDP + Inflation + Fed Funds Rate + Money Supply,

    is fit. GDP and the money supply are shown to be significant at the 5%
    level. Adjusted R-squared is 0.9014, and the p-value for the model is highly
    significant. (Excluded observations are those before 1980 and after 2004; all
    observations between and including 1980 and 2004 were included in the
    regression.)
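A linear model of this form can be fit by ordinary least squares. The Python sketch below is an illustration only and is not the paper's R code: it solves the normal equations directly with Gaussian elimination, whereas statistical packages typically use more numerically stable decompositions, which matters when predictors are as highly correlated as they are here. The names `solve` and `fit_ols` and the data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Fit y on the predictors in rows; returns (coefficients, R2, adjusted R2).

    coefficients[0] is the intercept.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y_i for d, y_i in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, fitted))
    ss_tot = sum((y_i - mean_y) ** 2 for y_i in y)
    r2 = 1 - ss_res / ss_tot
    n, p = len(y), k - 1
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, r2, adj_r2


# Made-up data that follows y = 1 + 2*a + 3*b exactly.
x_rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y_vals = [1 + 2 * a + 3 * b for a, b in x_rows]
beta, r2, adj_r2 = fit_ols(x_rows, y_vals)
print([round(b, 6) for b in beta], round(r2, 6), round(adj_r2, 6))
```

The sketch reports only coefficients and fit statistics; significance tests on individual terms, as discussed in the text, require standard errors that are not computed here.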


    Using the original model yields a very high F statistic and R-squared.
    However, only two of the independent variables are significant at the 5% level.
    On visual inspection, the errors appear to fall within a narrow band to the
    left and a much wider band to the right.

    Variance appears to be non-constant, a condition called
    heteroskedasticity. A Q-Q plot of the errors also indicates non-normality, but
    the residuals resemble log-normal residuals (which is more evidence for a log
    transformation of the dependent variable). The Shapiro-Wilk normality test
    gives a p-value of 0.03778. Since the Shapiro-Wilk null hypothesis is that the
    residuals are normal, this test provides formal evidence for the rejection of an
    assumption of normality. (R documentation indicates a rejection threshold of
    less than .1 is adequate, citing a remark in Applied Statistics by Patrick
    Royston in 1995 (R Development Core Team, 2009).)

    Illustration 2: Residual Variance is Non-Constant

    A further problem is autocorrelation. Visual inspection of the trending

    data is indicative of autocorrelation. The Durbin-Watson test is conclusive

    (Zeileis & Hothorn, 2002). It reports a p-value of 0.0002858, rejecting the

    hypothesis of non-correlated errors. An approach to deal with autocorrelation

    is to add the lagged response variable to the predictor variables. This

    approach is akin to using the Last method for prediction.
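The statistic behind the Durbin-Watson test can be sketched in Python as below. This computes only the test statistic, not the p-value reported in the text, and the function name and residual values are hypothetical. Values near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Made-up residuals: a persistent run (positive autocorrelation) and an
# alternating pattern (negative autocorrelation).
persistent = durbin_watson([1.0, 1.0, 1.0, 1.0])
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(persistent, alternating)
```

Trending data such as the series used here tends to leave residuals that stay on one side of zero for several periods, which pushes the statistic toward 0.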

    The proper transformation of the data can be easily estimated with the
    Box-Cox method (Venables & Ripley, 2002). The Box-Cox method transforms
    the response variable by raising it to the power of lambda, subtracting one,
    and dividing by lambda, except when lambda equals zero, in which case it
    takes the natural log of the response variable. The 95% confidence interval
    for lambda falls between approximately -0.28 and 0.05, confirming earlier
    suspicions of the appropriateness of taking the natural log of the response.

    Illustration 3: Box-Cox Operation Indicates Natural Log-Transformation
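For reference, the Box-Cox transformation discussed above is usually written as follows, where y is the response and lambda is the transformation parameter. The notation is the standard textbook form and is not taken from the paper itself.

```latex
y^{(\lambda)} =
\begin{cases}
  \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1.5ex]
  \ln y, & \lambda = 0.
\end{cases}
```

Since the reported confidence interval for lambda contains zero but not one, the log transformation is supported and leaving the response untransformed is not.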

    Based on the suggestions of the analysis thus far, the model will be

    changed and transformed before dropping insignificant variables in an

    attempt to improve the model. The new model is

        ln(LCS) = Year + Pro. GDP + Infl. + Fed Funds Rate + Money Supply + ln(Lagged LCS).

    The transformation of the dependent variable indicates a successful

    improvement on the model. The R-squared has increased, and the p-value is

    still significant. The money supply is still significant, but the untransformed

    projected GDP is no longer significant.

    The Shapiro-Wilk normality test now indicates normally distributed errors.
    The Durbin-Watson test still suggests autocorrelation, with a p-value of .0847.
    The Box-Cox test now indicates a wide range of possible transformations,
    including -2 and 1, within the 95% confidence interval. Perhaps the Box-Cox
    result is a problem with the predictors not having the correct transformation.
    The Money Supply, Inflation, and Gross Domestic Product are all functions
    of growth over time. The next model will take their logs as well.

    This iteration does not improve the results for any of the variables except
    the lagged predictor, which corrects for autocorrelation. Up to this point,
    inflation appears to be an unimportant variable, so it is removed for the
    next regression.

    Removing inflation improves this model. Removing variables usually


    decreases R-squared, but in this case, there was no change. Adjusted R-

    squared actually improved.
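The behaviour of adjusted R-squared when a variable is dropped follows from its usual definition, given here in standard notation (n observations, p predictors excluding the intercept) rather than quoted from the paper:

```latex
\bar{R}^{2} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1}
```

Dropping a predictor lowers p by one, which shrinks the fraction (n - 1)/(n - p - 1). If R-squared is unchanged, the penalty term therefore gets smaller and adjusted R-squared rises.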

    Since the lagged dependent variable is in the model as a predictor, it
    does a better job of predicting the next year's performance than the Year
    variable does. Since Year is now the least significant predictor, it is the
    next variable to remove from the model.

    Removing the Year variable lowers R-squared by an insignificant amount,
    while still improving adjusted R-squared. In addition, the GDP variable
    becomes significant again. The variable for the Federal Funds Rate is the lone
    remaining insignificant term, so it is removed from the model next.

    Removing the Fed Funds Rate creates very small decreases to R-squared
    (at 0.9842) and adjusted R-squared (at 0.982). The model is now

        ln(LCS) = ln(Pro. GDP) + ln(Money Supply) + ln(Lagged LCS),

    which is the best iteration so far. Each term is significant at the 5% level,
    and both GDP and the money supply were significant in the first iteration's
    model (before any transformations). (See Appendix B, page 33 for the
    regression ANOVA with the object code "malt5".)

    The least significant term is the money supply term. Even though it is

    significant at the 5% level, given the high R-squared, the model may be over-

    fit. The problem with over-fitting a model is that it is overly sensitive to newly

    sampled data. Training the model on a subset of the data and testing its

    ability to predict based on data outside the subset is one way of testing for

    fit, and this method shall be demonstrated at the end of this paper.
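    A hold-out check of this kind might look like the following sketch. The

    data are synthetic, and the names `train`, `test`, and the cut at year 24 are

    assumptions for illustration only; they are not the split used in this paper.

```r
# Hold-out check on synthetic data (illustrative only).
set.seed(42)
n <- 30
d <- data.frame(year = 1:n, x = cumsum(runif(n, 0.5, 1.5)))
d$y <- 1 + 0.8 * d$x + rnorm(n, sd = 0.3)

train <- d[d$year <= 24, ]   # fit on the earlier years
test  <- d[d$year > 24, ]    # predict the later years

fit  <- lm(y ~ x, data = train)
pred <- predict(fit, newdata = test)

# Compare error inside and outside the fitting sample
rmse_in  <- sqrt(mean(residuals(fit)^2))
rmse_out <- sqrt(mean((test$y - pred)^2))
print(c(in_sample = rmse_in, out_of_sample = rmse_out))
```

    An out-of-sample error far above the in-sample error would be one sign of

    over-fitting.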

    Removing the money supply term reduces both R-squared and adjusted

    R-squared slightly. It also costs projected GDP its significance: the

    p-value of that term rises to 0.267. It may be that the money supply adds

    meaning that is required for projected GDP to mean anything. An important

    thing to note is that this regression includes the observation from 1979,

    which the earlier regressions drop. A look at the data indicates nothing

    strange that should arise from that year being included. (There is also no

    indication of multiplicative effects; see Appendix B.)

    Checking The Model

    With the final model selected, rerunning the earlier diagnostics on

    $\ln(LCS) = \beta_0 + \beta_1 \ln(ProjGDP) + \beta_2 \ln(MoneySupply) + \beta_3 \ln(LaggedLCS)$

    tests its strength. The Shapiro-Wilk test gives a p-value of 0.2885, which

    is little evidence against the assumption of normality in the errors. The

    Durbin-Watson test gives a p-value of 0.1872, so the assumption of

    independent errors is not rejected (other forms of data exploration confirm

    this conclusion). The Box-Cox procedure indicates that the data are

    correctly transformed, with the likelihood maximized at a lambda close to

    one, although the 95% confidence interval ranges from approximately -0.75

    to 2.66. Thus no further transformation of the response is indicated, and

    we can treat the model as linear in the parameters.
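    The residual checks named above can be sketched with base R alone. The

    data below are synthetic and the object names are assumptions for

    illustration. Appendix B uses dwtest() from the lmtest package to obtain a

    Durbin-Watson p-value; this sketch only computes the Durbin-Watson statistic

    by hand, so it reports no p-value for that test.

```r
# Residual diagnostics on a log-linear fit (synthetic data, illustrative only).
set.seed(7)
n <- 25
x <- seq(1, 10, length.out = n)
y <- exp(0.5 + 0.3 * x + rnorm(n, sd = 0.1))

fit <- lm(log(y) ~ x)
e   <- residuals(fit)

# Shapiro-Wilk: a large p-value gives no reason to reject normal errors
sw <- shapiro.test(e)

# Durbin-Watson statistic: values near 2 suggest little first-order
# autocorrelation in the residuals
dw <- sum(diff(e)^2) / sum(e^2)

print(sw$p.value)
print(dw)
```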

    The one remaining problem with the data is multicollinearity. Projected

    GDP is correlated at 98% with the lagged money supply. Although a

    correlation this high would normally argue for removing one of the

    predictors, neither can be dropped at this point without the other losing

    its significance. Upon removal of projected GDP, the money supply

    variable's p-value increases to 0.3530. Removal of the money supply

    variable causes projected GDP's p-value to rise to 0.267.
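    To give a rough sense of scale, the pairwise correlation can be converted

    into a variance inflation factor. This is only a back-of-the-envelope sketch:

    the formula 1 / (1 - r^2) applies exactly only when these are the two

    predictors in a model, and the correlation used is the one reported in

    Appendix B for the untransformed SPLICEGROSS and MRM2IMFNS series, while the

    final model uses their logs and a third predictor.

```r
# Variance inflation implied by the pairwise correlation (rough illustration).
r   <- 0.9814615          # cor(SPLICEGROSS, MRM2IMFNS) from Appendix B
vif <- 1 / (1 - r^2)      # two-predictor approximation
print(round(vif, 1))      # roughly 27
```

    A factor of this size means the coefficient variances are inflated many

    times over relative to uncorrelated predictors, which is consistent with

    each term losing significance when the other is removed.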

    CONCLUSIONS

    The Model

    Illustration 4: Box-Cox Operation Maximum Likelihood: Linear Model

    This study indicates that together, the money supply and projected GDP

    provide information that indicates the direction of the stock market. To

    implement this model in making predictions about the stock market, first

    predict the next year's GDP to the same level of accuracy as the Fed (no

    small feat). Then, using the current year's end-of-year money supply and

    stock market values, combined with the GDP projection, predict the stock

    market's valuation with the following formula.

    $LCS = e^{-4.9809 + 1.9336 \ln(ProjGDP) - 1.0676 \ln(MoneySupply) + 0.5758 \ln(LaggedLCS)}$

    The transformation is justified by both the Box-Cox procedure and

    the improvement in R-squared of over 0.05.
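    As a worked example, the fitted formula can be wrapped in a small R

    function. The function name `predict_lcs` and its argument names are

    hypothetical; the coefficients are those of the final model, and the inputs

    are the 2005 row of Table 7.

```r
# Point prediction from the final model's coefficients.
# predict_lcs is a hypothetical helper name used only for this example.
predict_lcs <- function(proj_gdp, money_supply, lagged_lcs) {
  exp(-4.9809 +
        1.9336 * log(proj_gdp) -
        1.0676 * log(money_supply) +
        0.5758 * log(lagged_lcs))
}

# Inputs from the 2005 row of Table 7: GDP, M2IMF, and the prior year's LCSTOCK
pred_2005 <- predict_lcs(50553.5, 7547, 2533432.42)
print(pred_2005)
```

    With the rounded coefficients, the result is close to the predicted value of

    about 3.02 million shown for 2005 in Table 7.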

    Confidence Intervals

    The minimum and maximum residuals are -0.22263 and 0.23279,

    respectively. Because the model is fit on the log scale, these are

    differences in the exponent of e. To put them in perspective, an exponent

    that is 0.25 greater than expected gives a result 28.4% greater than

    expected. Similarly, an exponent that is 0.25 less than expected gives a

    result 22.1% less than expected.

    The residual standard error is 0.1368 with 21 degrees of freedom. Based

    on the two-tailed t-distribution with an alpha of 5%, the critical range is

    plus or minus 2.08 standard errors. Thus the approximate 95% confidence

    interval for the regression is from 24.8% less than expected to 33.0%

    greater than expected. This calculation indicates that this regression is no

    gold mine, and that even with some expectation of a future value, there can

    be very large variance.
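    The arithmetic behind these percentages can be checked in R. This sketch

    uses only the residual standard error and degrees of freedom quoted above;

    it ignores uncertainty in the estimated coefficients, so a full prediction

    interval would be somewhat wider. The variable names are illustrative.

```r
# Percent deviation implied by a log-scale error of +/- 0.25
up   <- exp(0.25) - 1       # about 0.284
down <- 1 - exp(-0.25)      # about 0.221

# Approximate 95% band from the residual standard error
rse   <- 0.1368
df    <- 21
tcrit <- qt(0.975, df)      # two-tailed, alpha = 5%; about 2.08
half  <- tcrit * rse        # half-width on the log scale

band <- c(lower = 1 - exp(-half), upper = exp(half) - 1)
print(round(c(up = up, down = down), 3))
print(round(band, 3))
```

    The band comes out at roughly 25% below to 33% above the expected value,

    close to the figures quoted above; small differences are due to rounding.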

    Error Comparison

    Based on the sample the regression was calculated from, and assuming

    accurate projection of GDP, the next four years would have had the results

    shown in Table 7. This prediction uses actual cumulative quarterly GDP

    instead of projected GDP (which, as noted earlier, is only released by the

    Fed after a five-year lag).

    The Mean Absolute Percent Error (MAPE) is the best measure of error here,

    since this out-of-sample prediction comes after significant growth in the

    stock market and other variables, so the absolute errors and squared errors

    are naturally much larger than those within the sample.

    Looking at the individual percentage errors for the first three years, note

    a consistent over-prediction that ranges from 7.75% to 13.46%. The MAPE of

    26.73% is skewed high by the 2008 observation.

    For the entire sample plus the next four years, the MAPE is 13.12%.

    Relative to even the best of the moving average prediction methods, by this

    measure, the regression is far superior.

    Table 7: Four Year Forecast and Error

    Year    LCSTOCK      GDP   M2IMF     AUTOCOR   Predicted    Abs Errors  Squared Errors  Abs%Error
    2005  2657824.0  50553.5    7547  2533432.42  3015493.21  357669.25767    127927297880     13.46%
    2006  3077760.1  53595.7    7875  2657823.95  3316359.46  238599.33319   56929641800.5      7.75%
    2007  3246729.2    56290    8481  3077760.13  3665965.51  419236.34554    175759113421     12.91%
    2008  2045439.4  57765.7    9465  3246729.16  3534858.84  1489419.4698   2218370357114     72.82%
    Beta:   -4.9809   1.9336  -1.068      0.5758      Sums:   2504924.4062   2578986410215
                                                      Means:  626231.10156    644746602554     26.73%
                                                                       MAE             MSE       MAPE
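    The error measures in Table 7 can be reproduced from its LCSTOCK and

    Predicted columns. The vector names below are illustrative; the numbers are

    copied from the table.

```r
# Recompute MAE, MSE, and MAPE for the four out-of-sample years in Table 7.
actual    <- c(2657824.0, 3077760.1, 3246729.2, 2045439.4)      # LCSTOCK
predicted <- c(3015493.21, 3316359.46, 3665965.51, 3534858.84)  # Predicted

abs_err <- abs(actual - predicted)
ape     <- abs_err / actual          # absolute percent error per year

mae  <- mean(abs_err)
mse  <- mean(abs_err^2)
mape <- mean(ape)

print(round(100 * ape, 2))           # per-year percentage errors
print(round(100 * mape, 2))          # mean absolute percent error
```

    Because each error is divided by that year's actual value, the MAPE stays

    comparable across years even as the level of the market grows, which is the

    reason given above for preferring it.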

    It may be considered unfair to judge by the MAPE over the whole set of

    years a regression that was fitted to those same years and designed to

    minimize mean squared error. However, there is little else to compare.

    Indeed, this regression may be the best available approximation for

    predicting the next year's stock market levels. Optimization of prediction

    notwithstanding, the variance may be far too high to create profitable

    trading rules based on the data.

    Table 8: Final Model's Error

    Year     LCSTOCK      GDP  MRM2IMFNS     AUTOCOR   Predicted  Abs Errors  Squared Errors  Abs%Error
    1980   140523.09  10219.1     1488.5   106119.24   124752.29    15770.80    248718251.35     11.22%
    1981   133623.41  11342.9       1620   140523.09   163920.03    30296.62    917885351.98     22.67%
    1982   162232.18  12479.4     1798.3   133623.41   171322.78    9090.599    82638994.598      5.60%
    1983   198750.65  12979.4     1967.2   162232.18   187800.16    10950.49    119913272.36      5.51%
    1984   211212.31  14604.2     2178.7   198750.65   237773.43    26561.12    705493083.59     12.58%
    1985   279138.19  15628.9     2386.8   211212.31   254694.48    24443.71    597495000.97      8.76%
    1986   330695.02  16478.2     2574.8   279138.19   305514.84    25180.18    634041355.26      7.61%
    1987   347990.37  17707.1     2832.6   330695.02   349602.05    1611.683    2597523.5158      0.46%
    1988   406487.55  18951.2     2937.7   347990.37   394866.81    11620.74    135041639.52      2.86%
    1989   534490.48  20890.4     3101.2   406487.55   492043.93    42446.55      1801709516      7.94%
    1990   517547.13  22115.7     3283.9   534490.48   605042.42    87495.29      7655425684     16.91%
    1991   675657.77  22918.9     3431.7   517547.13   607121.90    68535.87      4697165498     10.14%
    1992   727480.73  23689.3     3582.3   675657.77   720759.17    6721.556    45179310.683      0.92%
    1993   800156.05  25106.9     3662.7   727480.73   821835.76    21679.71    470009941.49      2.71%
    1994   810638.09  26916.3     3722.5   800156.05   976169.36    165531.3     27400600909     20.42%
    1995  1114059.93  28464.6     3733.8   810638.09  1092297.79    21762.14    473590650.94      1.95%
    1996  1371073.56  29644.5     3934.2  1114059.93  1341884.08    29189.48    852025708.45      2.13%
    1997   1828463.7  31702.1     4172.1  1371073.56  1617168.71    211295.0     44645574834     11.56%
    1998  2351038.63  33760.6     4463.3   1828463.7  2005824.72    345213.9    119172641764     14.68%
    1999  2845697.15  35286.5     4963.8  2351038.63  2254232.08    591465.1    349830931632     20.78%
    2000  2586454.14  39068.3     5336.1  2845697.15  2836037.94    249583.8     62292075342      9.65%
    2001  2279183.39  41928.5     5782.1  2586454.14  2824486.32    545302.9    297355288996     23.93%
    2002  1775483.86  41733.8       6717  2279183.39  2217762.22    442278.4    195610143382     24.91%
    2003  2285047.73  43518.7       7117  1775483.86  1957991.01    327056.7    106966098805     14.31%
    2004  2533432.42  46639.7     7256.1  2285047.73  2535676.41    2243.989    5035487.8639      0.09%
    2005  2657823.95  50553.5     7546.7  2533432.42  3015493.21    357669.3    127927297880     13.46%
    2006  3077760.13  53595.7     7875.2  2657823.95  3316359.46    238599.3     56929641800      7.75%
    2007  3246729.16    56290     8480.7  3077760.13  3665965.51    419236.3    175759113421     12.91%
    2008  2045439.37  57765.7       9465  3246729.16  3534858.84     1489419    2.21837E+012     72.82%
    Beta:    -4.9809   1.9336    -1.0676      0.5758      Sums:      5818252    3.80170E+012
                                                          Means:    207794.7    135775133291     13.12%
                                                                         MAE             MSE       MAPE

    Summary

    Since this model was arrived at through a series of iterations that

    eliminated one variable at a time, it may be argued that the findings are

    spurious and the result of random chance. It is unlikely, however, that the

    model is purely the result of chance.

    In general, this model states that the stock market goes up when the

    economy is expected to grow, and when the money supply is decreasing. The

    effect of the economy is about twice the size of the effect of the money

    supply (coefficients of 1.9336 and -1.0676, respectively).
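    Because both sides of the model are in logs, the coefficients can be read

    as elasticities. The sketch below shows the predicted percentage change in

    the market for a 1% change in one input, holding the other inputs fixed. The

    helper name `pct_change` is hypothetical.

```r
# Elasticity reading of the log-log coefficients (other inputs held fixed).
b_gdp <- 1.9336    # coefficient on ln(projected GDP)
b_m2  <- -1.0676   # coefficient on ln(money supply)

# Percent change in the prediction for a given percent change in one input
pct_change <- function(beta, pct) 100 * ((1 + pct / 100)^beta - 1)

gdp_effect <- pct_change(b_gdp, 1)   # about +1.94%
m2_effect  <- pct_change(b_m2, 1)    # about -1.06%
print(round(c(gdp = gdp_effect, m2 = m2_effect), 2))
```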

    Expectations of economic growth fuel speculation in stocks. When people

    expect the economy to grow more, stock prices increase. When there is less

    of an expectation for economic growth, stock prices do not increase as much.

    The Fed acts to contract the money supply when the economy is growing

    too fast. The stock market is known to be a leading indicator of economic

    growth. It would make sense that the Fed would be tightening the money

    supply as the stock market is increasing.

    Sometimes time series data runs the risk of reaching a change point

    where the effects being used for prediction cease to work (Chatfield, 2000). It

    is unlikely that the effects found here will cease to predict, however. These

    effects are the result of actions of or predictions by a United States

    government chartered organization that has powerful control over

    fundamental aspects of the economy.

    The high correlation between the two factors is an element of concern. It

    would make sense that if the Fed sees the economy growing above average

    the next year, it would act today to reduce the money supply. This

    reasoning would explain the high level of correlation. This interaction is

    troubling, but each variable needs the other to remain significant in the

    model. And without the two variables, the model is left with nothing but an

    autocorrelation correction variable based on the previous year's market and

    a residual standard error about a third higher.

    Low standard errors with many variables relative to the number of

    observations may indicate a model that is over-fit, but the two variables (plus

    the autocorrelation variable) do not seem to be too much relative to the size

    of the data available. In retrospect, this model also has the lowest standard

    errors, and since all of these models have a very high R-squared, optimizing

    for standard errors while keeping the number of predictors small would seem

    to be the best remaining approach.

    REFERENCES

    Chatfield, C. (2000). Time-Series Forecasting. Boca Raton: Chapman &

    Hall/CRC.

    Colby, R. W. (2003). The Encyclopedia of Technical Market Indicators. New

    York: McGraw-Hill.

    Faraway, J. J. (2005). Linear Models with R. Boca Raton: Chapman & Hall/CRC.

    Harrington, J. P. (Ed.). (2008). Ibbotson SBBI 2009 Classic Yearbook: Market

    Results for Stocks, Bonds, Bills, and Inflation 1926-2008. Chicago:

    Morningstar.

    R Development Core Team. (2009). R: A Language and Environment for

    Statistical Computing. Vienna, Austria: R Foundation for Statistical

    Computing. Retrieved from http://www.R-project.org

    St. Louis Fed. (2010). St. Louis Fed: Download Data for Series: M2NS, M2

    Money Stock. St. Louis Fed. Retrieved April 6, 2010, from

    http://research.stlouisfed.org/fred2/series/M2NS/downloaddata?cid=48

    Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S (4th

    ed.). New York: Springer. Retrieved from

    http://www.stats.ox.ac.uk/pub/MASS4

    Zeileis, A., & Hothorn, T. (2002). Diagnostic Checking in Regression

    Relationships. R News, 2(3), 7-10.

    APPENDIX A: ACKNOWLEDGEMENTS

    GNU/Linux/Ubuntu

    The GNU Community has developed or enabled the functioning of all of

    the tools used to create this document. All of these tools are open-source

    software packages that are free to use and free to modify. The Linux kernel

    powered the computing. Ubuntu is a popular distribution of Linux, and the

    source for software repositories that provided the operating system,

    supporting software, and core tools (except Zotero).

    R

    This study was done in R, a powerful command-line statistical

    programming package (R Development Core Team, 2009). The advantages of

    a command-line interface are that one may maintain an exactly reproducible

    copy of one's work (e.g., see Appendix B), while having complete access to

    many powerful functions. The disadvantage is that the learning curve takes

    longer to climb compared to graphical user interfaces.

    OpenOffice.org

    This paper was written in OpenOffice.org, an open-source version of

    Sun's StarOffice. Writer was used for word processing and document

    assembly. Calc was used for data manipulation, spreadsheet functions, and

    table creation.

    Other Tools

    SciTE with R syntax highlighting was also used to manipulate the code.

    Zotero Firefox and Writer plug-ins were used to manage citations.

    APPENDIX B: R CODE

    This is the console input/output. It requires the files to be in the location

    provided, and the lmtest and MASS libraries. The command prompt is the

    ">" symbol, and the "#" symbol indicates a non-executing comment.

    R version 2.9.2 (2009-08-24)
    Copyright (C) 2009 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    REvolution R enhancements not installed. For improved
    performance and other extensions: apt-get install revolution-r

    > comb cor(comb, use= "pairwise.complete.obs")

                       Year     LCSTOCK         GNP         GDP SPLICEGROSS   Inflation
    Year          1.0000000   0.9261381   0.9971895   0.9939544   0.9887894   0.9975192
    LCSTOCK       0.9261381   1.0000000   0.9715892   0.8080071   0.9340474   0.9154274
    GNP           0.9971895   0.9715892   1.0000000          NA   0.9999981   0.9855178
    GDP           0.9939544   0.8080071          NA   1.0000000   0.9999990   0.9943664
    SPLICEGROSS   0.9887894   0.9340474   0.9999981   0.9999990   1.0000000   0.9782972
    Inflation     0.9975192   0.9154274   0.9855178   0.9943664   0.9782972   1.0000000
    FEDFUNDS     -0.8233347  -0.6218049  -0.8141758  -0.4857107  -0.7987852  -0.8081528
    IMFNS         0.8859370   0.8180842   0.9549985   0.9533118   0.9287711   0.8753418
    M2NS          0.9683431   0.8936886   0.9902620   0.9829273   0.9898840   0.9679235
    M2IMFNS       0.9542132   0.8822151   0.9931752   0.9830307   0.9871683   0.9527529
    AUTOCOR       0.9261381   0.9508532   0.9668297   0.8545207   0.9352194   0.9307222
    AUTOCOR2      0.9261381   0.9060283   0.9720780   0.8874491   0.9320751   0.9238932
    AUTOCOR3      0.9369234   0.8664926   0.9575144   0.9434643   0.9391325   0.9144456
    MRINFL        0.9975192   0.9119503   0.9808353   0.9940487   0.9774272   0.9986158
    MRFUNDS      -0.8080723  -0.6402895  -0.7588302  -0.3646230  -0.7814267  -0.7788310
    MRIMFNS       0.8782463   0.8099785   0.9478959   0.9329031   0.9090652   0.8900329
    MRM2NS        0.9696137   0.8894880   0.9940004   0.9675501   0.9867425   0.9741104
    MRM2IMFNS     0.9547696   0.8778663   0.9947253   0.9609944   0.9814615   0.9621024
                   FEDFUNDS       IMFNS        M2NS     M2IMFNS     AUTOCOR    AUTOCOR2
    Year         -0.8233347   0.8859370   0.9683431   0.9542132   0.9261381   0.9261381
    LCSTOCK      -0.6218049   0.8180842   0.8936886   0.8822151   0.9508532   0.9060283
    GNP          -0.8141758   0.9549985   0.9902620   0.9931752   0.9668297   0.9720780
    GDP          -0.4857107   0.9533118   0.9829273   0.9830307   0.8545207   0.8874491
    SPLICEGROSS  -0.7987852   0.9287711   0.9898840   0.9871683   0.9352194   0.9320751
    Inflation    -0.8081528   0.8753418   0.9679235   0.9527529   0.9307222   0.9238932
    FEDFUNDS      1.0000000  -0.6737792  -0.7703007  -0.7510230  -0.6632477  -0.7031356
    IMFNS        -0.6737792   1.0000000   0.9612459   0.9785086   0.8846286   0.9476079
    M2NS         -0.7703007   0.9612459   1.0000000   0.9974368   0.9093549   0.9487988
    M2IMFNS      -0.7510230   0.9785086   0.9974368   1.0000000   0.9097530   0.9551136
    AUTOCOR      -0.6632477   0.8846286   0.9093549   0.9097530   1.0000000   0.9508532
    AUTOCOR2     -0.7031356   0.9476079   0.9487988   0.9551136   0.9508532   1.0000000
    AUTOCOR3     -0.7960527   0.9413438   0.9476510   0.9520965   0.9060283   0.9508532
    MRINFL       -0.8420432   0.8805821   0.9640316   0.9495986   0.9154274   0.9307222
    MRFUNDS       0.8351315  -0.5872710  -0.7318301  -0.6988825  -0.6218049  -0.6558817
    MRIMFNS      -0.6824676   0.9750173   0.9567361   0.9682330   0.8180842   0.9239063
    MRM2NS       -0.7584503   0.9539833   0.9985131   0.9937647   0.8936886   0.9404726
    MRM2IMFNS    -0.7456812   0.9678341   0.9966462   0.9960230   0.8822151   0.9445584
                   AUTOCOR3      MRINFL     MRFUNDS     MRIMFNS      MRM2NS   MRM2IMFNS
    Year          0.9369234   0.9975192  -0.8080723   0.8782463   0.9696137   0.9547696
    LCSTOCK       0.8664926   0.9119503  -0.6402895   0.8099785   0.8894880   0.8778663
    GNP           0.9575144   0.9808353  -0.7588302   0.9478959   0.9940004   0.9947253
    GDP           0.9434643   0.9940487  -0.3646230   0.9329031   0.9675501   0.9609944
    SPLICEGROSS   0.9391325   0.9774272  -0.7814267   0.9090652   0.9867425   0.9814615
    Inflation     0.9144456   0.9986158  -0.7788310   0.8900329   0.9741104   0.9621024
    FEDFUNDS     -0.7960527  -0.8420432   0.8351315  -0.6824676  -0.7584503  -0.7456812
    IMFNS         0.9413438   0.8805821  -0.5872710   0.9750173   0.9539833   0.9678341
    M2NS          0.9476510   0.9640316  -0.7318301   0.9567361   0.9985131   0.9966462
    M2IMFNS       0.9520965   0.9495986  -0.6988825   0.9682330   0.9937647   0.9960230
    AUTOCOR       0.9060283   0.9154274  -0.6218049   0.8180842   0.8936886   0.8822151
    AUTOCOR2      0.9508532   0.9307222  -0.6558817   0.9239063   0.9404726   0.9445584
    AUTOCOR3      1.0000000   0.9238932  -0.6733649   0.9417894   0.9414619   0.9493800
    MRINFL        0.9238932   1.0000000  -0.8081528   0.8753418   0.9679235   0.9527529
    MRFUNDS      -0.6733649  -0.8081528   1.0000000  -0.6418000  -0.7506362  -0.7293701
    MRIMFNS       0.9417894   0.8753418  -0.6418000   1.0000000   0.9538724   0.9741593
    MRM2NS        0.9414619   0.9679235  -0.7506362   0.9538724   1.0000000   0.9970302
    MRM2IMFNS     0.9493800   0.9527529  -0.7293701   0.9741593   0.9970302   1.0000000
    >
    > m <- lm(LCSTOCK ~ Year + SPLICEGROSS + MRINFL + MRFUNDS + MRM2IMFNS,
    +         data = comb, na.action = na.exclude)
    > summary(m)

    Call:
    lm(formula = LCSTOCK ~ Year + SPLICEGROSS + MRINFL + MRFUNDS +
        MRM2IMFNS, data = comb, na.action = na.exclude)

    Residuals:
        Min      1Q  Median      3Q     Max
    -364914 -203016  -41612  106983  820959

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -3.897e+08  3.763e+08  -1.036   0.3134
    Year         1.982e+05  1.913e+05   1.036   0.3134
    SPLICEGROSS  1.757e+02  8.159e+01   2.153   0.0444 *
    MRINFL      -8.347e+02  5.898e+02  -1.415   0.1732
    MRFUNDS      1.792e+04  3.871e+04   0.463   0.6486
    MRM2IMFNS   -6.165e+02  2.555e+02  -2.413   0.0261 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 290400 on 19 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9219, Adjusted R-squared: 0.9014
    F-statistic: 44.87 on 5 and 19 DF,  p-value: 7.147e-10

    > plot(fitted(m), residuals(m), xlab="Fitted",ylab="Residuals")
    > qqnorm(resid(m))
    > shapiro.test(residuals(m))

        Shapiro-Wilk normality test

    data:  residuals(m)
    W = 0.9142, p-value = 0.03778

    > library(lmtest)
    Loading required package: zoo

    Attaching package: 'zoo'

    The following object(s) are masked from package:base :

        as.Date.numeric

    > dwtest(m)

        Durbin-Watson test

    data:  m
    DW = 1.0697, p-value = 0.0002858
    alternative hypothesis: true autocorrelation is greater than 0

    >
    > library(MASS)
    > boxcox(m,plotit=T)
    > boxcox(m,plotit=T,lambda=seq(-0.5,0.5,by=0.1))
    >
    > malt <- lm(log(LCSTOCK) ~ Year + SPLICEGROSS + MRINFL + MRFUNDS +
    +            MRM2IMFNS + log(AUTOCOR), data = comb, na.action = na.exclude)
    > summary(malt)

    Call:
    lm(formula = log(LCSTOCK) ~ Year + SPLICEGROSS + MRINFL + MRFUNDS +
        MRM2IMFNS + log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.224728 -0.071272 -0.002158  0.078216  0.189551

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -4.631e+02  1.879e+02  -2.464   0.0240 *
    Year          2.387e-01  9.603e-02   2.485   0.0230 *
    SPLICEGROSS  -2.748e-06  3.466e-05  -0.079   0.9377
    MRINFL       -3.679e-04  2.572e-04  -1.430   0.1697
    MRFUNDS      -9.905e-04  1.666e-02  -0.059   0.9533
    MRM2IMFNS    -2.957e-04  1.246e-04  -2.373   0.0290 *
    log(AUTOCOR)  3.814e-01  1.829e-01   2.085   0.0516 .
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.123 on 18 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9891, Adjusted R-squared: 0.9854
    F-statistic: 271.3 on 6 and 18 DF,  p-value: < 2.2e-16

    > shapiro.test(residuals(malt))

        Shapiro-Wilk normality test

    data:  residuals(malt)
    W = 0.9747, p-value = 0.7653

    > dwtest(malt)

        Durbin-Watson test

    data:  malt
    DW = 1.8845, p-value = 0.0847
    alternative hypothesis: true autocorrelation is greater than 0

    > boxcox(malt,plotit=T)
    >
    > malt2 <- lm(log(LCSTOCK) ~ Year + log(SPLICEGROSS) + log(MRINFL) +
    +             MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb,
    +             na.action = na.exclude)
    > summary(malt2)

    Call:
    lm(formula = log(LCSTOCK) ~ Year + log(SPLICEGROSS) + log(MRINFL) +
        MRFUNDS + log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action =
        na.exclude)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.242438 -0.088883 -0.004927  0.094161  0.211644

    Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)      -72.479133  89.941534  -0.806  0.43085
    Year               0.037791   0.048280   0.783  0.44395
    log(SPLICEGROSS)   1.322121   1.589509   0.832  0.41643
    log(MRINFL)       -0.004836   1.307447  -0.004  0.99709
    MRFUNDS           -0.019492   0.018379  -1.061  0.30293
    log(MRM2IMFNS)    -1.318564   0.621873  -2.120  0.04813 *
    log(AUTOCOR)       0.620016   0.196808   3.150  0.00553 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1397 on 18 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9859, Adjusted R-squared: 0.9812
    F-statistic: 209.8 on 6 and 18 DF,  p-value: 1.178e-15

    >
    > malt3 <- lm(log(LCSTOCK) ~ Year + log(SPLICEGROSS) + MRFUNDS +
    +             log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)
    > summary(malt3)

    Call:
    lm(formula = log(LCSTOCK) ~ Year + log(SPLICEGROSS) + MRFUNDS +
        log(MRM2IMFNS) + log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.242398 -0.088944 -0.005036  0.094195  0.211602

    Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)      -72.58974   82.56247  -0.879  0.39027
    Year               0.03784    0.04502   0.841  0.41102
    log(SPLICEGROSS)   1.31767    1.00937   1.305  0.20733
    MRFUNDS           -0.01945    0.01461  -1.331  0.19881
    log(MRM2IMFNS)    -1.31744    0.52746  -2.498  0.02185 *
    log(AUTOCOR)       0.62009    0.19068   3.252  0.00419 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.136 on 19 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9859, Adjusted R-squared: 0.9822
    F-statistic: 265.8 on 5 and 19 DF,  p-value: < 2.2e-16

    >
    > malt4 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + MRFUNDS + log(MRM2IMFNS) +
    +             log(AUTOCOR), data = comb, na.action = na.exclude)
    > summary(malt4)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + MRFUNDS + log(MRM2IMFNS) +
        log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.24297 -0.08841  0.01259  0.11157  0.22009

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)      -3.22647    2.76583  -1.167   0.2571
    log(SPLICEGROSS)  1.83905    0.79046   2.327   0.0306 *
    MRFUNDS          -0.01805    0.01441  -1.253   0.2246
    log(MRM2IMFNS)   -1.24958    0.51741  -2.415   0.0254 *
    log(AUTOCOR)      0.63590    0.18835   3.376   0.0030 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.135 on 20 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9854, Adjusted R-squared: 0.9825
    F-statistic: 337 on 4 and 20 DF,  p-value: < 2.2e-16

    >
    > #Final model, malt5
    > malt5 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    +             log(AUTOCOR), data = comb, na.action = na.exclude)
    > summary(malt5)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
        log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.22263 -0.09233  0.01955  0.09150  0.23279

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)       -4.9809     2.4174  -2.060  0.05196 .
    log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
    log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
    log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1368 on 21 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
    F-statistic: 436.9 on 3 and 21 DF,  p-value: < 2.2e-16

    >
    > malt6 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(AUTOCOR),
    +             data = comb, na.action = na.exclude)
    > summary(malt6)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(AUTOCOR),
        data = comb, na.action = na.exclude)

    Residuals:
           Min         1Q     Median         3Q        Max
    -3.358e-01 -1.018e-01  4.623e-05  1.066e-01  2.075e-01

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)       -1.2297     1.6680  -0.737    0.468
    log(SPLICEGROSS)   0.4299     0.3777   1.138    0.267
    log(AUTOCOR)       0.7774     0.1647   4.721 9.34e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.144 on 23 degrees of freedom
      (7 observations deleted due to missingness)
    Multiple R-squared: 0.9832, Adjusted R-squared: 0.9817
    F-statistic: 672.3 on 2 and 23 DF,  p-value: < 2.2e-16

    >

    > malt7 summary(malt5)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
        log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.22263 -0.09233  0.01955  0.09150  0.23279

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)       -4.9809     2.4174  -2.060  0.05196 .
    log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
    log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
    log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1368 on 21 degrees of freedom
      (8 observations deleted due to missingness)
    Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
    F-statistic: 436.9 on 3 and 21 DF,  p-value: < 2.2e-16

    >
    > malt8 <- lm(log(LCSTOCK) ~ log(AUTOCOR), data = comb, na.action = na.exclude)
    > summary(malt8)

    Call:
    lm(formula = log(LCSTOCK) ~ log(AUTOCOR), data = comb, na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.48053 -0.09394  0.01333  0.12556  0.22081

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept)   0.86816    0.35789   2.426   0.0220 *
    log(AUTOCOR)  0.94333    0.02646  35.657

    > comb2 > cor(comb2, use= "pairwise.complete.obs")

                       Year     LCSTOCK SPLICEGROSS   MRM2IMFNS     AUTOCOR    AUTOCOR2
    Year          1.0000000   0.9261381   0.9887894   0.9547696   0.9261381   0.9261381
    LCSTOCK       0.9261381   1.0000000   0.9340474   0.8778663   0.9508532   0.9060283
    SPLICEGROSS   0.9887894   0.9340474   1.0000000   0.9814615   0.9352194   0.9320751
    MRM2IMFNS     0.9547696   0.8778663   0.9814615   1.0000000   0.8822151   0.9445584
    AUTOCOR       0.9261381   0.9508532   0.9352194   0.8822151   1.0000000   0.9508532
    AUTOCOR2      0.9261381   0.9060283   0.9320751   0.9445584   0.9508532   1.0000000
    AUTOCOR3      0.9261381   0.8664926   0.9391325   0.9493800   0.9060283   0.9508532
                   AUTOCOR3
    Year          0.9261381
    LCSTOCK       0.8664926
    SPLICEGROSS   0.9391325
    MRM2IMFNS     0.9493800
    AUTOCOR       0.9060283
    AUTOCOR2      0.9508532
    AUTOCOR3      1.0000000
    > m2 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    +          log(AUTOCOR), data = comb2, na.action = na.exclude)
    > summary(m2)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
        log(AUTOCOR), data = comb2, na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.22263 -0.09233  0.01955  0.09150  0.23279

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)       -4.9809     2.4174  -2.060  0.05196 .
    log(SPLICEGROSS)   1.9336     0.7975   2.425  0.02443 *
    log(MRM2IMFNS)    -1.0676     0.5033  -2.121  0.04597 *
    log(AUTOCOR)       0.5758     0.1846   3.119  0.00519 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1368 on 21 degrees of freedom
      (9 observations deleted due to missingness)
    Multiple R-squared: 0.9842, Adjusted R-squared: 0.982
    F-statistic: 436.9 on 3 and 21 DF,  p-value: < 2.2e-16

    >
    > plot(fitted(m2), residuals(m2), xlab="Fitted",ylab="Residuals")
    > qqnorm(resid(m2))
    > shapiro.test(residuals(m2))

        Shapiro-Wilk normality test

    data:  residuals(m2)
    W = 0.9527, p-value = 0.2885

    > library(lmtest)
    > dwtest(m2)

        Durbin-Watson test

    data:  m2
    DW = 1.8683, p-value = 0.1872
    alternative hypothesis: true autocorrelation is greater than 0

    > library(MASS)
    > boxcox(m2,plotit=T)
    > boxcox(m2,plotit=T,lambda=seq(-1,3,by=0.1))
    > m3 <- lm(log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
    +          log(SPLICEGROSS) * log(MRM2IMFNS) + log(AUTOCOR), data = comb2,
    +          na.action = na.exclude)
    > summary(m3)

    Call:
    lm(formula = log(LCSTOCK) ~ log(SPLICEGROSS) + log(MRM2IMFNS) +
        log(SPLICEGROSS) * log(MRM2IMFNS) + log(AUTOCOR), data = comb2,
        na.action = na.exclude)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.20975 -0.08643  0.01519  0.08435  0.22880

    Coefficients:
                                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)                     -11.86766   11.49803  -1.032  0.31432
    log(SPLICEGROSS)                  2.53960    1.27772   1.988  0.06072 .
    log(MRM2IMFNS)                   -0.10262    1.65485  -0.062  0.95117
    log(AUTOCOR)                      0.58928    0.18869   3.123  0.00536 **
    log(SPLICEGROSS):log(MRM2IMFNS)  -0.08825    0.14395  -0.613  0.54673
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1389 on 20 degrees of freedom
      (9 observations deleted due to missingness)
    Multiple R-squared: 0.9845, Adjusted R-squared: 0.9814
    F-statistic: 318.1 on 4 and 20 DF,  p-value: < 2.2e-16

    > # This creates the variables plot
    > comb3 plot(comb3)