Stock Price Prediction from Natural Language Understanding of News Headlines Machine learning experiment: Task is to predict whether a stock will rise or fall significantly in reaction to the appearance of some news headline about the stock. The agent is trained on a year's worth of dated headlines and closing prices for 100 stocks. Performance is measured on the number of accurately classified test headlines. Does the headline predict a RISE, FALL or NOTHING?

Post on 20-Dec-2015


Stock Price Prediction from Natural Language Understanding of News Headlines

Machine learning experiment:

• Task is to predict whether a stock will rise or fall significantly in reaction to the appearance of some news headline about the stock.

• The agent is trained on a year's worth of dated headlines and closing prices for 100 stocks.

• Performance is measured on the number of accurately classified test headlines. Does the headline predict a RISE, FALL or NOTHING?
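The labelling step implied by the task can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the 5% threshold, the comparison against the previous day's close, and the name `label_move` are all assumptions, since the slides do not say how a "significant" move was defined.

```python
RISE, FALL, NOTHING = "RISE", "FALL", "NOTHING"


def label_move(prev_close: float, close: float, threshold: float = 0.05) -> str:
    """Label a headline by the relative change in closing price.

    `threshold` is a hypothetical cut-off for a "significant" move;
    the slides do not state the value that was actually used.
    """
    change = (close - prev_close) / prev_close
    if change >= threshold:
        return RISE
    if change <= -threshold:
        return FALL
    return NOTHING


# Example: a stock closing at 106 after closing at 100 the day before.
print(label_move(100.0, 106.0))
```

With a rule of this kind most headlines fall into NOTHING, because large daily moves are rare.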


Some related prior work

• Thomas Fawcett and Foster John Provost (1999) - Activity Monitoring: monitor streams of data for something interesting.

• Lavrenko et al. (2000) - Ænalyst system: combines two time-series data streams, headlines and quotes.

• Sofus Macskassy (2003) - Information Filtering: prospective training data.

• Yu, H. and Hatzivassiloglou, V. (2003) - Towards Answering Opinion Questions: separate facts from opinions and determine the polarity of opinions, using a Bayesian classifier.


Naïve Bayes Classifier

$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$

where $h \in H$ and $H$ is the set {RISE, FALL, NOTHING}

priors:

P(RISE) = 0.013055

P(FALL) = 0.007976

P(NOTHING) = 0.978969

- The occurrences of rises and falls are sparse.

- Each word in the collected dictionary has a count of its total appearances and counts of its appearances during a RISE and during a FALL.

- P(D|h) is estimated by multiplying together, for each word in a headline, the probability of that word occurring during a RISE, FALL or NOTHING.
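The classifier described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the experiment's implementation: whitespace tokenisation, add-one smoothing (so an unseen word does not zero out the product), and summing log-probabilities instead of multiplying raw probabilities are choices made here, and the names `train` and `classify` are hypothetical. P(D) is dropped because it is the same for every class.

```python
import math
from collections import Counter, defaultdict

CLASSES = ("RISE", "FALL", "NOTHING")


def train(examples):
    """examples: iterable of (headline, label). Returns (priors, word_counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    n = sum(class_counts.values())
    priors = {c: class_counts[c] / n for c in CLASSES}
    return priors, word_counts, vocab


def classify(text, priors, word_counts, vocab):
    """Return the class h maximising log P(h) + sum of log P(word | h)."""
    best, best_score = None, -math.inf
    for c in CLASSES:
        if priors[c] == 0:
            continue  # class never seen in training
        total = sum(word_counts[c].values())
        score = math.log(priors[c])
        for word in text.lower().split():
            # Add-one smoothing: an assumption, not stated in the slides.
            score += math.log((word_counts[c][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best


# Toy, made-up headlines for illustration only.
toy = [
    ("analyst upgraded shares", "RISE"),
    ("firm upgraded outlook", "RISE"),
    ("analyst downgraded shares", "FALL"),
    ("company files disclosure", "NOTHING"),
    ("company holds meeting", "NOTHING"),
]
model = train(toy)
print(classify("upgraded", *model))
```

Working in log space avoids numerical underflow when many small word probabilities are combined. With priors as skewed as those listed above, the prior term strongly favours NOTHING unless the words in a headline carry substantial evidence.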


Dictionary generated

• Human-understandable model

• Can be used for further agent design

TOTAL OCCURRENCES   DURING RISE   DURING FALL   WORD
98                  11            0             upgraded
59                  0             5             downgraded
97                  21            1             bank
58                  6             1             big
15                  3             0             boosts
133                 11            3             deal
9                   0             2             disappoint
38                  2             8             drop
31                  3             1             despite
189                 8             4             disclosure
96                  3             1             growth
53                  7             1             gains
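One way to read the dictionary is to compare each word's RISE and FALL rates with the class priors. The sketch below does that for a few rows copied from the excerpt; it assumes the three counts are comparable (for example, all counted per headline), which the slides do not confirm, and the helper names `rates` and `lift` are hypothetical.

```python
# Priors as given on the Naïve Bayes slide.
PRIOR_RISE = 0.013055
PRIOR_FALL = 0.007976

# word -> (total occurrences, during RISE, during FALL), from the excerpt.
COUNTS = {
    "upgraded": (98, 11, 0),
    "downgraded": (59, 0, 5),
    "bank": (97, 21, 1),
    "drop": (38, 2, 8),
    "disclosure": (189, 8, 4),
}


def rates(word):
    """Fraction of the word's occurrences that coincided with a RISE / FALL."""
    total, rise, fall = COUNTS[word]
    return rise / total, fall / total


def lift(word):
    """How many times more often than the prior the word coincides with a RISE / FALL."""
    p_rise, p_fall = rates(word)
    return p_rise / PRIOR_RISE, p_fall / PRIOR_FALL


for w in COUNTS:
    rise_lift, fall_lift = lift(w)
    print(f"{w:12s} rise x{rise_lift:5.1f}  fall x{fall_lift:5.1f}")
```

For instance, "upgraded" coincides with a RISE in 11 of 98 occurrences, roughly 11%, against a prior of about 1.3%.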


Results

[Figure: % accuracy vs. number of training headlines. x-axis: 0 to 10000 training headlines; y-axis: accuracy from 0.93 to 1.]


Further work

• Bug fixes

• Something more than “bag of words”

• Per-symbol language models

• Simple word-based decision stumps as inputs to the Boosting algorithm
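A word-based weak learner of the kind the last bullet suggests might look like the sketch below. It is only an illustration: the slides do not name a boosting variant, so the code shows just the piece any such algorithm needs, a one-word rule and its weighted error. The names `make_word_stump` and `weighted_error` and the example headlines are hypothetical.

```python
def make_word_stump(word, label_if_present, label_if_absent="NOTHING"):
    """Build a one-rule classifier that tests for a single word in the headline."""
    def stump(headline):
        present = word in headline.lower().split()
        return label_if_present if present else label_if_absent
    return stump


def weighted_error(stump, examples, weights):
    """Weighted misclassification rate of a stump over (headline, label) pairs."""
    wrong = sum(w for (text, label), w in zip(examples, weights)
                if stump(text) != label)
    return wrong / sum(weights)


# Made-up examples and weights for illustration only.
examples = [
    ("broker upgraded the stock", "RISE"),
    ("company files disclosure", "NOTHING"),
    ("rating upgraded after results", "NOTHING"),
]
weights = [0.25, 0.25, 0.5]

stump = make_word_stump("upgraded", "RISE")
print(weighted_error(stump, examples, weights))
```

A boosting round would evaluate many such stumps, keep the one with the lowest weighted error, and then increase the weights of the headlines it misclassified.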