Machine learning and short positions in stock trading strategies
D.E Allen, R. Powell and A. K. Singh
Edith Cowan University
Reading questions
1. What is short selling and why is it controversial?
2. What are Support Vector Machines (SVM) and why are they a useful technique?
3. Explain what kernel estimation is.
4. Why are different kernel estimators available?
5. Explain what logistic regression is.
6. What does Beta measure?
7. Why are Sharpe ratios a useful investment metric?
8. How does Beta differ from the Sharpe ratio?
9. How do we measure mean absolute error?
10. Why is out-of-sample forecasting important?
Introduction
Forecasting future stock price movements using financial indicators.
Past evidence supports the predictive power of financial factors, e.g. Beta, E/P, B/M and past returns.
Support Vector Machines (SVM) are capable of handling large amounts of unstructured, noisy or nonlinear data.
SVM classification is useful for predicting the direction of future price changes (+1, -1).
SVM in Classification
SVMs are characterized by:
Mapping input vectors into a higher-dimensional feature space.
Structural risk minimization.
Nonlinear modelling with kernel functions.
Kernel density estimators are non-parametric density estimators with no fixed structure. They depend on all the data points to obtain an estimate.
Classes are separated using an optimal separating hyperplane.
SVM Optimal Separating Hyperplane.
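For reference, the optimal separating hyperplane is commonly obtained from the soft-margin problem below. This is the standard textbook formulation, not notation taken from the slides: the symbols $w$, $b$, $\xi_i$ and the feature map $\phi$ are assumptions, and $C$ plays the role of the cost parameter.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\, w^{\top} w + C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \\
& \xi_i \ge 0, \qquad i = 1, \dots, n, \\[4pt]
\text{prediction:}\quad & \hat{y}(x) = \operatorname{sign}\!\left( w^{\top} \phi(x) + b \right) \in \{-1, +1\}.
\end{aligned}
```

A larger $C$ penalizes margin violations more heavily, so the fitted boundary follows the training data more closely.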
SVM kernel functions
SVMs use the following kernel functions: linear, polynomial, radial basis function (RBF) and sigmoid. Each is governed by kernel parameters such as gamma and the polynomial degree d.
The study uses the RBF kernel for its robustness on nonlinear data.
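The four kernels are usually written as follows. These are the standard textbook forms, given here as an assumption about the intended notation; $\gamma$, $r$ and $d$ denote the kernel parameters.

```latex
\begin{aligned}
\text{Linear:}     \quad & K(x_i, x_j) = x_i^{\top} x_j \\
\text{Polynomial:} \quad & K(x_i, x_j) = \left( \gamma\, x_i^{\top} x_j + r \right)^{d}, \qquad \gamma > 0 \\
\text{RBF:}        \quad & K(x_i, x_j) = \exp\!\left( -\gamma\, \lVert x_i - x_j \rVert^{2} \right), \qquad \gamma > 0 \\
\text{Sigmoid:}    \quad & K(x_i, x_j) = \tanh\!\left( \gamma\, x_i^{\top} x_j + r \right)
\end{aligned}
```

With the RBF kernel only $\gamma$ needs tuning alongside the cost parameter, which keeps a grid search two-dimensional.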
Data
Daily data for sample stocks from the Dow Jones Industrial Average over a period of 5 years (1/03/2005-9/03/2010).
Factors used for forecasting
Factor | Underlying rationale
Previous 2 days' daily log returns | Indicator of historical performance, widely used in time series analysis.
Beta (six-month rolling window) | Dependence of the return on the market return in the long run.
Price to Earnings Ratio | Indicator of the current company value, which affects the price movement.
Book to Market Ratio | Fama-French (1992, 1993)
Traded Volume | Indicator of the performance of the stock in the market.
Dividend Yield | Indicator of company performance. Blume (1980)
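Two of the factors above, log returns and rolling Beta, can be computed directly from price series. The following is a minimal sketch, not the authors' code: the function names are hypothetical, Beta is assumed to be the usual covariance-over-variance estimate, and the window length is left as a parameter because the slides only say "six months".

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def beta(stock_returns, market_returns):
    """Beta = Cov(stock, market) / Var(market) over one window."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var


def rolling_beta(stock_returns, market_returns, window):
    """Beta over each trailing window; None until enough history exists."""
    out = []
    for t in range(len(stock_returns)):
        if t + 1 < window:
            out.append(None)
        else:
            lo = t + 1 - window
            out.append(beta(stock_returns[lo:t + 1], market_returns[lo:t + 1]))
    return out
```

A stock whose returns are exactly twice the market's returns has a Beta of 2 in every window, which is a convenient sanity check.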
Methodology
The data are standardized.
The direction of price change is classified into binary classes, -1 and 1.
The testing sample is created using the last 130 days of data.
The parameters cost and gamma are optimized using grid search, a systematic way of seeking optima.
The model is built on the training data and used for forecasting, which is tested on the out-of-sample data (130 days).
SVM results are compared with logistic regression results (with the same training and testing data).
A simple investment strategy is used to check the predicted directions.
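The preprocessing steps above can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's implementation: all function names are hypothetical, the labelling rule (positive return gives +1, otherwise -1) is assumed, the scaler is fitted on the training sample only, and `score` stands for any user-supplied evaluation such as cross-validated accuracy.

```python
import statistics
from itertools import product


def chronological_split(rows, test_size=130):
    """Hold out the last `test_size` rows as the out-of-sample test set."""
    return rows[:-test_size], rows[-test_size:]


def fit_scaler(train_column):
    """Mean and standard deviation estimated from the training sample only."""
    return statistics.mean(train_column), statistics.pstdev(train_column)


def apply_scaler(column, mean, std):
    """Standardize a feature column with previously fitted statistics."""
    return [(x - mean) / std for x in column]


def direction_labels(returns):
    """Assumed rule: +1 for a positive return, -1 otherwise."""
    return [1 if r > 0 else -1 for r in returns]


def grid_search(score, costs, gammas):
    """Evaluate every (cost, gamma) pair and return the best-scoring one."""
    return max(product(costs, gammas), key=lambda pair: score(*pair))
```

Splitting chronologically rather than at random keeps future observations out of the training sample, which is what makes the 130-day test genuinely out of sample.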
Forecasting Results
Stock (SVM parameters) | Measure | SVM | Logistic Regression
Stock 1 (C = 724, Gamma = 0.1) | Correctly classified instances | 77 (59.2308%) | 67 (51.5385%)
 | Incorrectly classified instances | 53 (40.7692%) | 63 (48.4615%)
 | Mean absolute error | 0.4077 | 0.5015
Stock 2 (C = 1024, Gamma = 0.12) | Correctly classified instances | 112 (86.1538%) | 109 (83.8462%)
 | Incorrectly classified instances | 18 (13.8462%) | 21 (16.1538%)
 | Mean absolute error | 0.1385 | 0.316
Stock 3 (C = 1448, Gamma = 0.003162) | Correctly classified instances | 76 (58.4615%) | 67 (51.5385%)
 | Incorrectly classified instances | 54 (41.5385%) | 63 (48.4615%)
 | Mean absolute error | 0.4154 | 0.4962
Stock 4 (C = 724, Gamma = 3) | Correctly classified instances | 76 (58.4615%) | 69 (53.0769%)
 | Incorrectly classified instances | 54 (41.5385%) | 61 (46.9231%)
 | Mean absolute error | 0.4154 | 0.4963
Stock 5 (C = 1448, Gamma = 0.56) | Correctly classified instances | 80 (61.5385%) | 59 (45.3846%)
 | Incorrectly classified instances | 50 (38.4615%) | 71 (54.6154%)
 | Mean absolute error | 0.3846 | 0.5091
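The SVM error figures reported for each stock coincide with the misclassification rates (for Stock 1, 53/130 ≈ 0.4077). That is what mean absolute error reduces to when predictions are hard 0/1 class assignments, whereas a model that outputs probabilities, such as logistic regression, generally gives a different value. The sketch below reproduces the Stock 1 arithmetic on synthetic labels; the function names are hypothetical, and the assumption that the reported errors were computed this way is ours.

```python
def accuracy(actual, predicted):
    """Share of instances whose predicted class equals the actual class."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def mean_absolute_error(actual_up, predicted_up):
    """Mean of |actual - predicted|, with classes coded 0/1 or as P(up)."""
    total = sum(abs(a - p) for a, p in zip(actual_up, predicted_up))
    return total / len(actual_up)


# Synthetic example matching the Stock 1 SVM counts: 77 hits, 53 misses.
actual = [1] * 130
predicted = [1] * 77 + [0] * 53

stock1_accuracy = accuracy(actual, predicted)        # 77 / 130
stock1_mae = mean_absolute_error(actual, predicted)  # 53 / 130
```

Here `stock1_accuracy` and `stock1_mae` sum to one, as they must for hard predictions.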
Investment Strategy Results
Stock | Final return, SVM | Final return, logistic | Sharpe ratio, SVM | Sharpe ratio, logistic
Stock 1 | 20.10167056 | -12.0362 | 17.42748 | -13.0499
Stock 2 | 7.246199093 | 6.009645 | 4.356055 | 3.369538
Stock 3 | 16.33556329 | 15.30477 | 14.78509 | 13.72405
Stock 4 | 14.33568424 | 5.611437 | 14.83901 | 4.495077
Stock 5 | 18.27861273 | -5.49125 | 14.62362 | -6.39905
DJIA: final return 10.12379524, Sharpe ratio 8.10426878
The final net returns of the stocks are compared using the Sharpe Ratio.
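The slides do not spell out the trading rule, so the following is only a sketch under assumptions: go long on a +1 forecast and short on a -1 forecast, earn the signed daily return, and compute the Sharpe ratio as mean excess return over its sample standard deviation with a zero risk-free rate and no annualization. Function names are hypothetical.

```python
import statistics


def strategy_returns(predicted_directions, realized_returns):
    """Assumed rule: long (+1) or short (-1) according to the forecast."""
    return [d * r for d, r in zip(predicted_directions, realized_returns)]


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Toy example: every forecast is right, so every daily return is positive.
forecasts = [1, -1, 1, -1]
realized = [0.01, -0.02, 0.03, -0.01]
daily = strategy_returns(forecasts, realized)
final_return = sum(daily)
ratio = sharpe_ratio(daily)
```

A correct -1 forecast turns a falling price into a gain, which is where short positions contribute to the strategy's return.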
Conclusion
SVM classification outperforms logistic regression in classifying price direction.
A simple stock trading strategy also reveals the efficiency of SVM in stock trading.
Further applications can include prediction of other financial time series.
SVM regression can be further tested for similar work.