Stock Market Prediction Using Sentiment Detection

C. LEE FANZILLI

ADVISORS: PROF. DVORAK AND PROF. WEBB


Post on 24-Dec-2015


Hypothesis

Can we use the sentiment of tweets that mention an NYSE stock to predict that stock's future returns?

Can we predict contemporaneous returns?

Do returns predict Twitter sentiment instead?


Background

People have tried many different ways of predicting prices in the market.

Technical Analysis is a methodology for forecasting the direction of prices through the study of past market data, primarily price and volume.

Jan Larsen's paper (June 2010) reported gains of 300% on the initial investment using this method.


Challenges

The Efficient Market Hypothesis states that prices already reflect all available information, which would leave the market always at equilibrium.

Once we have a dataset, there is a fair amount of organizing and cleaning up to be done.

Not all data is useful, and the data that is useful may not be sufficient to support a claim.


Data

For this experiment we collected daily stock price information on AMD, Google, and Apple from Yahoo Finance.

We took a list of the top 101 tech tweeters from a Business Insider article and used those accounts as the source of our tweets.

Using Twitter’s API, we created a corpus of tweets.

Google Daily Price Information

Date        Open     High     Low      Close    Volume    Adj Close
2/24/2015   530      536.79   528.25   536.09   1002300   536.09
2/23/2015   536.05   536.44   529.41   531.91   1448900   531.91
2/20/2015   543.13   543.75   535.8    538.95   1440400   538.95
2/19/2015   538.04   543.11   538.01   542.87   986400    542.87
2/18/2015   541.4    545.49   537.51   539.7    1447600   539.7
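The regressions later in the deck work with returns rather than prices. As a minimal sketch of that conversion, the Python below computes simple daily returns from the five Google rows shown above. Using the adjusted close and simple (not log) returns is an assumption; the slides do not say which definition was used.

```python
from datetime import datetime

# (date, adjusted close) pairs copied from the Google table above.
prices = [
    ("2/24/2015", 536.09),
    ("2/23/2015", 531.91),
    ("2/20/2015", 538.95),
    ("2/19/2015", 542.87),
    ("2/18/2015", 539.70),
]


def daily_returns(rows):
    """Return [(date, simple_return)] in chronological order.

    Assumption: return_t = price_t / price_(t-1) - 1 on the adjusted close.
    """
    ordered = sorted(
        (datetime.strptime(day, "%m/%d/%Y").date(), price) for day, price in rows
    )
    return [
        (cur_day, cur_price / prev_price - 1.0)
        for (_, prev_price), (cur_day, cur_price) in zip(ordered, ordered[1:])
    ]


returns = daily_returns(prices)
for day, r in returns:
    print(f"{day}  {r:+.4%}")
```

Sorting by date first matters because the Yahoo Finance export lists the newest row first.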


Organization

We loaded our Twitter data into Apache CouchDB, a document database.

Next, we pulled the posting date and text from each tweet, then separated the tweets by which of our stocks was mentioned.

We then wrote a script to score each tweet’s overall sentiment.
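The separation step could look roughly like the Python sketch below. It is not the script used in the project: the keyword lists are hypothetical, and the example dates are placeholders. The `created_at` and `text` field names follow the tweet objects returned by Twitter's API.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists; the slides do not say how mentions were matched.
KEYWORDS = {
    "AAPL": {"apple", "aapl"},
    "AMD": {"amd"},
    "GOOG": {"google", "goog"},
}


def group_by_stock(tweets, keywords=KEYWORDS):
    """Map each ticker to the (created_at, text) pairs of tweets mentioning it."""
    grouped = defaultdict(list)
    for tweet in tweets:
        # Compare whole words so that e.g. "pineapple" does not match "apple".
        words = set(re.findall(r"[a-z0-9]+", tweet["text"].lower()))
        for ticker, terms in keywords.items():
            if words & terms:
                grouped[ticker].append((tweet["created_at"], tweet["text"]))
    return dict(grouped)


# Placeholder dates; the tweet texts are illustrative.
sample = [
    {"created_at": "2015-02-18", "text": "GOOG and Apple both moved today"},
    {"created_at": "2015-02-19", "text": "Nothing about tech here"},
]
by_stock = group_by_stock(sample)
```

A tweet that mentions two stocks is filed under both, which seems the natural reading of "separated them based on which of our stocks was mentioned".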


Sentiment Detection

Sentiment detection is a form of textual analysis.

The University of Pittsburgh provides the MPQA corpus.

For a given dataset, we were able to calculate the total number of positive, negative, neutral, and true neutral tweets.

Apple Stats

              Mean     # Pos    # Neg    # Neutral   # Balanced   # True Neutral   Total
Count         1.1941   926      2249     2560        247          2414             5735
Percentages            16.16%   39.22%   44.64%      4.31%        40.33%


Example of Scoring

Love @sunrise update - smoother calendar sync with GOOG apps and iPad app! [score: 1]

@Simonkhalaf @BenedictEvans no doubt, apps are winning but I still have sense that GOOG can change trajectory [score: 1]

Another protest against techies at 24th st-- google continues to be a rallying symbol for protestors http://t.co/NebJQ4pwrZ [score: -2]

The Beer Game -or- Why Apple Can't Build iPads in the US by @marksweep http://t.co/u2cl4Xne [score: -1]

apple analyst releases analysis based on another apple analysts analysis [score: 0]
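Scores like the ones above can be produced by summing word polarities from a lexicon. The Python sketch below illustrates that idea only. The lexicon is a tiny made-up stand-in (not MPQA entries), and the reading of "balanced" (sentiment words that cancel out) versus "true neutral" (no sentiment words at all) is an assumption about how the categories were defined.

```python
import re

# Tiny made-up lexicon for illustration only; these are NOT MPQA entries.
TOY_LEXICON = {"love": 1, "smooth": 1, "winning": 1, "bad": -1, "protest": -1}


def score_tweet(text, lexicon=TOY_LEXICON):
    """Return (score, n_hits): summed polarity and number of lexicon words found."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[word] for word in words if word in lexicon]
    return sum(hits), len(hits)


def classify(score, n_hits):
    """Assumed category rules: the sign of the score decides positive/negative;
    a zero score is 'balanced' if sentiment words cancelled out, and
    'true neutral' if no sentiment word was found."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "balanced" if n_hits > 0 else "true neutral"


for text in ["Love the smooth new update", "winning but bad", "analyst releases analysis"]:
    score, n_hits = score_tweet(text)
    print(f"{score:+d}  {classify(score, n_hits):<12}  {text}")
```

With a real lexicon, the per-category counts in a stats table would come from running `classify` over every tweet for a stock and tallying the labels.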


Results

We ran linear regression models in RStudio.

Our results indicate that there is little to no correlation between sentiment and future returns.

Results vary by stock, however: our analysis of Google showed that sentiment was significant (t-value = 2.61).

Some of the values may be explained by insufficient data.

Intercept and Sentiment rows report t-values.

                   Return(t-1)   Return(t)   Return(t+1)
Intercept (AAPL)   2.04          1.80        2.48
Intercept (AMD)    0.497         0.79        0.99
Intercept (GOOG)   0.15          -0.37       0.62
Sentiment (AAPL)   -0.52         0.15        -0.22
Sentiment (AMD)    0.12          -1.02       -0.51
Sentiment (GOOG)   -0.09         1.28        2.61**
R2 (AAPL)          -0.000336     -0.000448   -0.000436
R2 (AMD)           -0.01565      0.000789    -0.01091
R2 (GOOG)          -0.00291      0.001875    0.01745
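The regressions were run in RStudio, and that code is not shown in the slides. As a rough illustration of where an intercept and a sentiment t-value come from, the Python sketch below fits a one-variable least-squares line and pairs sentiment on day t with the return on day t-1, t, or t+1. The helper names, the direction of the regression (return on sentiment), and the sample numbers are all assumptions made for the example.

```python
import math


def ols_with_t_values(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (a, b, t_a, t_b), where each t-value is the estimate divided
    by its standard error.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)  # residual variance estimate
    se_b = math.sqrt(sigma2 / sxx)
    se_a = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return a, b, a / se_a, b / se_b


def align(sentiment, returns, lag):
    """Pair sentiment on day t with the return on day t + lag."""
    if len(sentiment) != len(returns):
        raise ValueError("series must have the same length")
    if lag > 0:
        return list(sentiment[:-lag]), list(returns[lag:])
    if lag < 0:
        return list(sentiment[-lag:]), list(returns[:lag])
    return list(sentiment), list(returns)


# Made-up daily series, for illustration only.
sentiment = [1, 0, -2, -1, 0, 2, 1]
returns = [0.004, -0.002, 0.001, -0.006, 0.003, 0.002, -0.001]

for lag in (-1, 0, 1):
    x, y = align(sentiment, returns, lag)
    a, b, t_a, t_b = ols_with_t_values(x, y)
    print(f"lag {lag:+d}: intercept t = {t_a:.2f}, sentiment t = {t_b:.2f}")
```

A common rule of thumb treats an absolute t-value of about 2 or more as significant at the 5% level for reasonably large samples.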


Apple Graphs

[Figure: two panels, "Future Returns" and "Returns Predicting Sentiment". X-axis: Tweet Sentiment; y-axis: Returns.]


Google Graphs

[Figure: two panels, "Future Returns" and "Returns Predict Sentiment". X-axis: Tweet Sentiment; y-axis: Returns.]


AMD Graphs

[Figure: two panels, "Future Returns" and "Returns Predicting Sentiment". X-axis: Tweet Sentiment; y-axis: Returns.]


Future Work

In the future we would take a look at indices in addition to individual stocks.

We would also use a broader range of Twitter data, not just tech tweets.

Rather than relying on raw returns alone, we would also compute the Cumulative Abnormal Return.

More Twitter data would have to be collected; many papers on similar experiments use millions of tweets, not thousands.

Instead of using a linear regression model, we would consider using Support Vector Machines and other Machine Learning tools.
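For the Cumulative Abnormal Return mentioned above, a minimal Python sketch follows. It assumes the simplest, market-adjusted definition: the abnormal return on a day is the stock's return minus a benchmark index return, and the CAR is the sum of abnormal returns over the window. Other expected-return models exist, and the slides do not say which one would be used. The numbers are made up.

```python
def cumulative_abnormal_return(stock_returns, benchmark_returns):
    """Return (abnormal_returns, car) under a market-adjusted model.

    Assumption: expected return on each day equals the benchmark's return.
    """
    if len(stock_returns) != len(benchmark_returns):
        raise ValueError("series must cover the same days")
    abnormal = [rs - rb for rs, rb in zip(stock_returns, benchmark_returns)]
    return abnormal, sum(abnormal)


# Made-up three-day window, for illustration only.
stock = [0.02, -0.01, 0.03]
index = [0.01, 0.00, 0.01]
abnormal, car = cumulative_abnormal_return(stock, index)
print(f"CAR over the window: {car:+.2%}")
```

Using the CAR as the dependent variable would separate moves specific to the stock from moves of the market as a whole.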


Works Cited

B. Wüthrich, D. Permunetilleke, S. Leung, V. Cho, J. Zhang, W. Lam, "Daily Prediction of Major Stock Indices from Textual WWW Data", The Hong Kong University of Science and Technology

J. Bollen, H. Mao, X. J. Zeng, "Twitter mood predicts the stock market", School of Informatics and Computing, Indiana University-Bloomington, October 2010

J. I. Larsen, "Predicting Stock Prices Using Technical Analysis and Machine Learning", Masters in Computer Science, Norwegian University of Science and Technology, June 2010