Using support vector machine with a hybrid feature selection method to the stock trend prediction


1

Using support vector machine with a hybrid feature selection method

to the stock trend prediction

Ming-Chi Lee, Expert Systems with Applications, 2009

Presenter: Yu Hsiang Huang
Date: 2012-05-17

2

Outline
• Introduction
• Feature selection
• Research design
• Experimental results and analysis
• Conclusion

3

Introduction
• Stock market
– A highly nonlinear dynamic system
• Applications of AI
– Expert systems, fuzzy systems, neural networks
– Back-propagation neural network (BPNN)
• Its predictive power is better than that of the other methods
• Requires a large amount of training data to estimate the distribution of the input patterns
• Prone to over-fitting by nature
• Fully depends on the researcher's experience and knowledge to preprocess the data
– relevant input variables, hidden layer size, learning rate, momentum, etc.

4

Introduction
• In this paper
– Support vector machine (SVM)
• Captures the geometric characteristics of the feature space without deriving network weights from the training data
• Extracts the optimal solution even with a small training set
• Reaches a global optimum rather than a local optimum
• No over-fitting problem
• Classification performance is influenced by the dimension, i.e. the number of feature variables
– Feature selection
• Addresses the dimensionality-reduction problem by determining the subset of available features that is most essential for classification
• Hybrid feature selection: filter method + wrapper method = F_SSFS
• F_SSFS: F-score + supported sequential forward search
• Optimal parameter search
– Compare the performance of BPNN and SVM

5

SVM-based model with F_SSFS

Data → Original feature variables
→ Hybrid feature selection
• Filter part: feature pruning using the F-score → Pre-selected features
• Wrapper part: the SSFS algorithm finds the best feature variables
→ Best feature variables
→ SVM: training, testing, and evaluating the classification accuracy

6

Feature selection
• Filter method
– No feedback from the classifier
– Estimates the classification performance by some indirect assessment
• Distance: reflects how well the classes separate from each other


7

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– F-score
• Plays the role of the filter
• Pre-selected features are the "informative" ones
• Given training vectors x_k, k = 1, 2, ..., m, with n_+ positive and n_- negative instances, the F-score of the i-th feature is

F(i) = \frac{\left(\bar{x}_i^{(+)} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{(-)} - \bar{x}_i\right)^2}{\frac{1}{n_+ - 1}\sum_{k=1}^{n_+}\left(x_{k,i}^{(+)} - \bar{x}_i^{(+)}\right)^2 + \frac{1}{n_- - 1}\sum_{k=1}^{n_-}\left(x_{k,i}^{(-)} - \bar{x}_i^{(-)}\right)^2}

where \bar{x}_i, \bar{x}_i^{(+)}, \bar{x}_i^{(-)} are the averages of the i-th feature over the whole, positive, and negative data sets
• The numerator indicates the discrimination between the positive and negative sets; the denominator indicates the discrimination within each of the two sets
• The larger the F-score, the more likely this feature is discriminative
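To make the formula concrete, a minimal NumPy sketch of the F-score computation (the function name f_score and the array layout are illustrative assumptions, not from the paper):

import numpy as np

def f_score(X, y):
    """Per-feature F-score; X is an (m, d) matrix, y holds labels +1 / -1."""
    pos, neg = X[y == 1], X[y == -1]
    mean_all = X.mean(axis=0)        # average of each feature over the whole set
    mean_pos = pos.mean(axis=0)      # average over the positive instances
    mean_neg = neg.mean(axis=0)      # average over the negative instances
    # numerator: discrimination between the positive and negative sets
    num = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # denominator: spread within each set, with the 1/(n - 1) normalization
    den = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return num / den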

8

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– F-score

Original feature variables → Calculate the F-score of each feature → Sort the F-scores → Select the top-K features by F-score → K pre-selected features
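The filter part then reduces to a sort and a cut. A short sketch reusing the hypothetical f_score helper above:

import numpy as np

def filter_top_k(X, y, k):
    """Indices of the K highest-F-score features plus the pruned feature matrix."""
    scores = f_score(X, y)
    top = np.argsort(scores)[::-1][:k]   # descending order of F-score
    return top, X[:, top]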

9

SVM-based model with F_SSFS

Data → Original feature variables
→ Hybrid feature selection
• Filter part: feature pruning using the F-score → Pre-selected features
• Wrapper part: the SSFS algorithm finds the best feature variables
→ Best feature variables
→ SVM: training, testing, and evaluating the classification accuracy

10

Feature selection
• Wrapper method
– Classifier-dependent
• Evaluates the "goodness" of the selected feature subset directly, with feedback from the classifier
• Should intuitively yield better performance
– Has limited applications
• Due to the high computational complexity involved

11

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– Supported sequential forward search (SSFS)
• Plays the role of the wrapper
• A variation of the sequential forward search (SFS) algorithm, specially tailored to SVM to expedite the feature-searching process
• Support vectors: training samples other than the support vectors contribute nothing to determining the decision boundary
• Dynamically maintains an active subset as the candidate support vectors
• Trains the SVM on the reduced subset rather than the entire training set, at less computational cost (see the sketch below)
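As a rough sketch of the wrapper search, here is a plain sequential forward search scored by 5-fold cross-validated SVM accuracy, written with scikit-learn (whose SVC wraps LIBSVM). The "supported" speed-up is omitted: the real SSFS additionally maintains an active subset of candidate support vectors so each SVM trains on a reduced sample set.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def sfs_wrapper(X, y, max_features, eps=1e-3):
    """Greedy forward selection; stops when the CV accuracy gain is below eps."""
    selected, best_acc = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        # score each candidate feature appended to the current subset
        scored = [(cross_val_score(SVC(kernel="rbf"), X[:, selected + [f]],
                                   y, cv=5).mean(), f) for f in remaining]
        acc, f = max(scored)
        if acc - best_acc < eps:   # termination: no significant improvement
            break
        best_acc, selected = acc, selected + [f]
        remaining.remove(f)
    return selected, best_acc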

12

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– Supported sequential forward search (SSFS)

      f1    f2    f3    f4   ...   fk-2  fk-1  fk   | label
r1    ...   ...   ...   ...  ...   ...   ...   ...  |  +
r2    ...   ...   ...   ...  ...   ...   ...   ...  |  -
...   ...   ...   ...   ...  ...   ...   ...   ...  |  -
rN    ...   ...   ...   ...  ...   ...   ...   ...  |  +

13

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– Supported sequential forward search (SSFS)
• The search runs iteratively (iteration 1, 2, ..., n+1), adding one feature per iteration
• Termination:
1. No significant improvement of the cross-validation accuracy is found
2. The desired number of features has been obtained

14

Feature selection
• F-score and Supported Sequential Forward Search (F_SSFS)
– F_SSFS
• Uses the F-score measure to decide the pre-selected feature subsets
• Uses the SSFS algorithm to select the final best feature subset
• Reduces the number of features that have to be tested through the training of the SVM
• Reduces the unnecessary computation time the wrapper method would spend testing the "non-informative" features

15

Research design
• Data collection and preprocessing
– Prediction target: the direction of change in the daily NASDAQ index
– Index futures lead the spot index
– 30 technical indices are used as the whole feature set: 20 futures contracts, 9 spot indexes, and the 1-day-lagged NASDAQ index
– "1" and "-1" denote whether the next day's index is higher or lower than today's
– From Nov 8, 2001 to Nov 8, 2007, with 1065 observations per feature
– The original data are scaled into the range (0, 1)

      f1    f2    f3   ...   f28   f29   f30  | label
1     ...   ...   ...  ...   ...   ...   ...  |  1
2     ...   ...   ...  ...   ...   ...   ...  | -1
...   ...   ...   ...  ...   ...   ...   ...  | -1
1065  ...   ...   ...  ...   ...   ...   ...  |  1
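A minimal sketch of this labeling and scaling step; the pandas layout and the "close" column name are assumptions for illustration, not from the paper:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_dataset(df):
    """df: 1065 daily rows with 30 indicator columns plus an assumed 'close' column."""
    # label +1 if the next day's index closes higher than today's, else -1
    y = np.where(df["close"].shift(-1) > df["close"], 1, -1)[:-1]
    # scale every feature into the range (0, 1); drop the last, unlabeled row
    X = MinMaxScaler(feature_range=(0, 1)).fit_transform(
        df.drop(columns=["close"]))[:-1]
    return X, y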

16

Research design
• SVM-based model with F_SSFS
– Filter part
• Calculate the F-score for every feature and rank the features without involving the classifier
• Sort the F-scores and select the K (threshold) highest-scored features to construct the candidate feature subset
– Wrapper part
• For each candidate feature, run 5-fold cross-validation and compute the average accuracy over the 5 folds
• Determine the feature to be added to the best feature subset using the objective function M (the average cross-validation accuracy)
• Repeat until no significant increase in the cross-validation accuracy is found or the desired number of features has been obtained (a sketch combining both parts follows)
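Putting the two parts together, under the same assumptions as the earlier sketches (the hypothetical filter_top_k and sfs_wrapper helpers):

# filter part: keep the K highest-F-score candidates (the paper settles on K = 22)
top_idx, X_k = filter_top_k(X, y, k=22)
# wrapper part: greedy SVM-guided search over the K candidates
sel, acc = sfs_wrapper(X_k, y, max_features=22)
best_features = top_idx[sel]   # map back to indices in the original feature space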

17

Research design
• Modeling for the support vector machine
– Model selection and parameter search
• Radial basis function (RBF) kernel
• Kernel parameter γ and penalty parameter C
• Grid-search on (C, γ) using 5-fold cross-validation
– Prevents the over-fitting problem
– The computational time to find good parameters is less than that of other methods
– The grid-search can easily be parallelized because each (C, γ) pair is independent
– Try exponentially growing sequences of (C, γ), e.g. as in the LIBSVM practical guide:
» C = 2^-5, 2^-3, ..., 2^15
» γ = 2^-15, 2^-13, ..., 2^3
• The final performance of the classifier is evaluated by the mean accuracy over the v validation folds
• The LIBSVM software is used to conduct the SVM experiments
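A minimal grid-search sketch with scikit-learn's SVC (a LIBSVM wrapper); the exponential grid follows the LIBSVM practical guide's suggestion quoted above, and X, y come from the preprocessing sketch:

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

param_grid = {
    "C":     2.0 ** np.arange(-5, 16, 2),   # 2^-5, 2^-3, ..., 2^15
    "gamma": 2.0 ** np.arange(-15, 4, 2),   # 2^-15, 2^-13, ..., 2^3
}
# each (C, gamma) pair is independent, so the grid parallelizes trivially
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, search.best_score_)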

18

SVM-based model with F_SSFS

Data → Original feature variables
→ Hybrid feature selection
• Filter part: feature pruning using the F-score → Pre-selected K features
• Wrapper part: the SSFS algorithm finds the best feature variables
→ Best feature variables
→ SVM: training, testing, and evaluating the classification accuracy

19

Experimental results and analysis
• Experimental result of F_SSFS
– The threshold K determines how many features to keep after filtering
• If K equals the number of all original features, the filter part does not contribute at all
• If K equals 1, the wrapper method is unnecessary

20

Experimental results and analysis
• Experimental result of F_SSFS – filter part
– Vary the threshold K
– As K increases, the prediction accuracy increases, but so does the selection-process time
– The performance and complexity of the algorithm can be balanced by tuning K
– K = 22 is chosen; after the wrapper part, 17 feature variables turn out to remain

21

Experimental results and analysis
• Experimental result of F_SSFS – wrapper part
– K = 22 is chosen and the wrapper part is run
– 17 features are left, with an average accuracy rate of 81.7%

22

Experimental results and analysis
• Result of SVM model selection
– RBF kernel
• Penalty parameter C, kernel parameter γ
• Grid-search using 5-fold cross-validation
– The optimal (C, γ) pair achieves a cross-validation rate of 87.1%

23

Experimental results and analysis
• Experimental result of SVM

• Experimental result of BPNN

24

Experimental results and analysis
• Experimental result of feature selection

– Key deficiency of neural-network models for stock trend prediction
• Difficulty in selecting the discriminative features and explaining the rationale for the stock trend prediction

– Relative importance of each feature

25

Experimental results and analysis
• Conclusion
– Stock trend prediction using a support vector machine with a hybrid feature selection method (F_SSFS)
– Reduces the high computational cost and the risk of over-fitting
– Future work: investigate the optimal values of the SVM parameters for the best prediction performance
– Future work: study the generalization of SVM with respect to the training set size and give a guideline for measuring the generalization performance
