The process of gathering and analyzing Twitter data to predict stock returns
EC115
Economics
Purpose
Many Americans save for retirement through plans such as 401(k)s and IRAs, and these
retirement plans invest money in mutual funds, which are groups of stocks. Sometimes these
mutual funds lose money for potential retirees. In 2008, on average, employees lost
$10,000 from their retirement savings (Brandon, 2009).
Investors seeking information about public companies traditionally gather the
majority of their data from financial publications and documents filed by a company with the
Securities and Exchange Commission. These sources typically contain financial data including
revenues, earnings per share, price-earnings ratios, cash flows, and dividend yields, as well as
information on product launches and company management strategies (Rader, 2006). However, this
purely financial approach to market prediction neglects the bullish or bearish sentiments of the
public, which play a significant role in the movement of the market. Due to investors' total
dependence on financial data, stock market predictions are generally inaccurate (Ferri, 2013).
By 2008, the CXO Advisory Group had collected and graded more than 5,000 predictions, and the
accuracy of these predictions has stabilized at about 48 percent ever since (Figure 1).
Figure 1. Data collected on the accuracy of stock market predictions by the CXO advisory group.
Hypothesis
The experiment sought a correlation between the calculated social media sentiment value
and the stock returns of the company. The hypothesis was that a positive social media sentiment
would result in an increase in stock returns and a negative social media sentiment would result in
a decrease in stock returns. The null hypothesis was that social media sentiment would have no
impact on stock returns.
Background Literature Review
Social media is an outlet for instant news, which is valuable to investors since accurate
news about a company is a reliable predictor of that company's stock value. In one study,
principal components analysis was used to reduce the approximately 400 features extracted from
social media to about 30 features, capturing about 25% of the variance. An Ordinary Least
Squares regression was then used to forecast stock price movements from the approximately 30
variables (Sisk, 2013). When these statistical processes are applied to data collected from social
media, news metadata can inform short-term forecasts of price movements and volatility.
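The two-stage procedure described above (dimension reduction followed by regression) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the cited study's implementation: the data are synthetic random numbers, and the function names pca_reduce and ols_fit are hypothetical.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()  # share of variance kept
    return Xc @ Vt[:k].T, explained

def ols_fit(Z, y):
    """Ordinary least squares with an intercept; returns (coefficients, intercept)."""
    A = np.column_stack([Z, np.ones(len(Z))])     # last column carries the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[:-1], beta[-1]

# Synthetic stand-ins (assumptions): 500 observations of 400 features,
# and 500 made-up price movements.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 400))
y = rng.normal(size=500)

Z, explained = pca_reduce(X, 30)       # 400 features reduced to 30
coefs, intercept = ols_fit(Z, y)       # regress price movements on the 30 components
```

The value of explained reports how much of the total variance the retained components capture, which is the quantity the study reports as about 25%.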
In a study by Fisher and Statman (2000), there was no significant relationship
between the change in sentiment in one month and stock returns in the following month. For large
market capitalization stocks, the relationship was negative for all sentiment groups but
never statistically significant. The relationship between the change in sentiment during a month and
the following month's stock returns was positive for the small market capitalization stocks of the
Center for Research in Security Prices (CRSP) 9-10, but that relationship also was not
statistically significant.
Economists at the University of Rochester, Cornell University, and the University of
Texas concluded that individual investor sentiment predicts future stock returns, and that
investor sentiment is independent of data involving past returns or past volume (Kaniel et al.,
2004). Furthermore, the trading of these individual investors predicts weekly returns for stocks
of all market capitalizations, while data involving past returns are predictive only for stocks with a
small market capitalization over the same time period. These findings suggest that the sentiment of
the individual investor is highly indicative of the movement of stock price.
When the propensity of investors to speculate is high and when the stock is volatile,
sentiment is a significant predictor of the fluctuations in the stock price (Baker and Wurgler, 2007).
However, when a stock is stable and investors are calm, sentiment has less bearing on the
movement of stock price.
Using Facebook's Gross National Happiness Index (FGNHI), which measures the
average optimism and pessimism of Facebook users in a particular region, researchers found a
significant positive relation between sentiment and contemporaneous stock market returns,
showing that optimistic sentiment is related to gains in the market index and pessimistic
sentiment is related to losses in the market index (Siganos et al., 2014). However, the procedure of
using public sentiment as an indicator of the market may be flawed, since the relation between
sentiment and stock returns is subject to reverse causality. For example, when investors profit or
lose in the market, they may express the resulting sentiments through social media. Such
sentiments are not indicative of what the movement of the market will be tomorrow.
Bollen, Mao, and Zeng (2011) investigated whether measurements of collective mood
states derived from large-scale Twitter feeds are correlated to the value of the Dow Jones
Industrial Average (DJIA) over time. They used OpinionFinder, which measures positive vs. negative
mood, and Google-Profile of Mood States (GPOMS), which measures mood in terms of six
dimensions (Calm, Alert, Sure, Vital, Kind, and Happy). Through this process, an accuracy of
86.7% was achieved in predicting the daily up and down changes in the closing values of the
DJIA, and the Mean Average Percentage Error (MAPE) was reduced by more than 6%.
Research Methodology
Six companies were chosen for the calculations: Kirkland (KIRK), Zynga (ZNGA),
Aramark (ARMK), Foot Locker (FL), Verizon (VZ), and Disney (DIS). These companies were
chosen from the Russell Index Member Lists. Kirkland and Zynga were chosen from the Russell
Microcap List (Russell Investments, 2011), which lists the stocks of highly successful small
companies; Aramark and Foot Locker were chosen from the Russell Midcap List (Russell
Investments, 2011), which lists the stocks of highly successful midsize companies; Verizon and
Disney were chosen from the Russell 1000 Index List (Russell Investments, 2011), which lists the
stocks of highly successful large companies. These companies were chosen for their varying product fields, their importance
to consumer markets, and their distinctive names.
A Python program entitled StockPile was developed. The program asked for the name of the
company whose stock price was to be predicted. Using the Twitter REST API, all the Tweets mentioning the
name of the specified company in the last eight days were collected. Because the Twitter REST API
limits the number of queries allowed per fifteen-minute interval, multiple API
accounts were created and cycled through each time one hit the limit. Python's Natural Language
Toolkit (NLTK) assigned a sentiment and polarity value to each Tweet, and the Twitter API
supplied the number of favorites, the number of retweets, and the number of followers of the
author for each Tweet. However, the number-of-followers metric was removed after it was found to be
statistically insignificant. Each Tweet, with its corresponding data, was stored as a TweetObj
object, as shown in Figure 2. The metrics of the TweetObjs were then paired with the
stock price of that company two days in the future. The TweetObjs were saved
in a Pickle file. This process was repeated for each of the tested companies: Kirkland (KIRK),
Zynga (ZNGA), Aramark (ARMK), Foot Locker (FL), Verizon (VZ), and Disney (DIS).
Figure 2. Class Diagram for the TweetObj class.
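A minimal sketch of how such a Tweet record might be represented, paired with a time-shifted stock price, and pickled is given below. The field names, the attach_price helper, and the sample prices are assumptions made for illustration; the actual TweetObj class in StockPile is not reproduced in this paper. The sketch pickles plain dictionaries rather than class instances so that the saved data can be read back without the class definition.

```python
import pickle
from dataclasses import dataclass, asdict
from datetime import date, timedelta
from typing import Optional

@dataclass
class TweetObj:
    # Field names are hypothetical; they mirror the metrics named in the text.
    tweet_id: int
    text: str
    posted: date
    sentiment: float
    favorites: int
    retweets: int
    price: Optional[float] = None   # stock price some days after the Tweet

def attach_price(tweet, closes, shift_days=2):
    """Pair a Tweet with the closing price shift_days after it was posted."""
    target = tweet.posted + timedelta(days=shift_days)
    tweet.price = closes.get(target)   # None when no quote exists (weekend, holiday)
    return tweet

# Made-up closing prices, for illustration only.
closes = {date(2015, 1, 7): 24.10, date(2015, 1, 8): 24.55}

t = TweetObj(1, "example tweet", date(2015, 1, 6), 0.4, 3, 1)
attach_price(t, closes)                 # two-day shift, as in the text

blob = pickle.dumps([asdict(t)])        # serialize the records
restored = pickle.loads(blob)           # read them back
```

Tweets whose shifted date has no quote receive a price of None here; a real pipeline would need to decide whether to drop them or roll forward to the next trading day.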
A least squares multiple regression was conducted on the metrics of the TweetObjs and
the stock prices at a given time shift. The most accurate time shift was determined by
conducting a least squares multiple regression for each time shift from zero
days to one week. The regression model with the lowest p-value and
the greatest correlation coefficient was taken to indicate the delay with which
Twitter sentiment about a company affects the stock price of that company. The predictive
model with the most accurate time shift provided an equation to predict the stock price in the
following form:
\[ P_t = \sum_{i=1}^{n} C_i F_i + K \]
where n is the number of features, C_i is the coefficient of the i-th feature, F_i is the i-th feature
value, K is a constant, and P_t is the stock price after a certain time delay.
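The fit-and-select step can be sketched as follows. This is an illustration under several assumptions rather than the StockPile code: features are aggregated to one row per day instead of one row per Tweet, the shift is chosen by the highest in-sample R2 only (the p-value criterion is omitted), the data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_price_model(F, P):
    """Least-squares fit of P = F @ C + K; returns (C, K, r_squared)."""
    A = np.column_stack([F, np.ones(len(F))])      # last column carries K
    beta, *_ = np.linalg.lstsq(A, P, rcond=None)
    resid = P - A @ beta
    ss_tot = ((P - P.mean()) ** 2).sum()
    r2 = 1.0 - (resid ** 2).sum() / ss_tot
    return beta[:-1], beta[-1], r2

def best_time_shift(F_by_day, price_by_day, max_shift=7):
    """Fit one model per shift (0..max_shift days); keep the highest in-sample R2."""
    best = None
    n_days = len(price_by_day)
    for shift in range(max_shift + 1):
        F = F_by_day[: n_days - shift]             # features on day d
        P = price_by_day[shift:]                   # price on day d + shift
        C, K, r2 = fit_price_model(F, P)
        if best is None or r2 > best[3]:
            best = (shift, C, K, r2)
    return best

# Synthetic data (assumption): prices respond to features from two days earlier.
rng = np.random.default_rng(1)
F_days = rng.normal(size=(60, 3))                  # sentiment, favorites, retweets
true_C = np.array([0.5, 0.02, 0.1])
prices = np.empty(60)
prices[:2] = 24.0
prices[2:] = F_days[:-2] @ true_C + 24.0 + rng.normal(scale=0.01, size=58)

shift, C, K, r2 = best_time_shift(F_days, prices)
```

On this synthetic series the two-day shift is recovered because it was built into the data. Selecting a shift by in-sample fit alone can overstate accuracy, which is why the model was afterwards tested on newly collected Tweets.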
Figure 3. Data Flow Diagram for the StockPile algorithm.
After the algorithms were developed, a testing algorithm,
encapsulated in testing.py, was run on each of the following three days after the markets closed at 3:00 PM EST. This program
collected Tweets via the Twitter REST API for a period of one day and ran their values through
the predictive equation to determine a predicted stock price associated with each Tweet. These
predicted prices were then compared to the actual stock price at the corresponding time. An R2 score was
then calculated comparing the predicted values to the actual values.
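The testing step can be sketched as below: the fitted equation is applied to each new Tweet and the predictions are scored against actual prices. The coefficients, Tweets, and prices are made up for illustration, and the function names are hypothetical; testing.py itself is not reproduced here.

```python
def predict_price(coeffs, constant, features):
    """Evaluate the predictive equation for one Tweet's feature values."""
    return sum(c * f for c, f in zip(coeffs, features)) + constant

def r2_score(actual, predicted):
    """Coefficient of determination; negative when predictions are poor."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        # All actual prices identical: the score is undefined.
        raise ValueError("actual prices are constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot

# Hypothetical model and Tweets: (sentiment, favorites, retweets).
coeffs, constant = [0.5, 0.01, 0.1], 24.0
tweets = [(0.2, 10, 1), (-0.4, 0, 0), (0.8, 20, 2)]
actual = [24.3, 23.9, 24.7]            # made-up actual prices

predicted = [predict_price(coeffs, constant, t) for t in tweets]
score = r2_score(actual, predicted)

# A model that is far off scores below zero.
bad_score = r2_score(actual, [30.0, 30.0, 30.0])
```

The guard for constant actual prices matters if every Tweet from one day is compared against a single closing price, since the denominator is then zero.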
Figure 4. Tweets referencing a certain company are collected using the Twitter Streaming API. Each terminal window is collecting Tweets about a certain company.
Figure 5. Output of the computer program that collects the Tweets and associates them with a stock price. The output shows each Tweet's ID, the properties of that Tweet, and the stock price associated with that Tweet. The last line of the output shows the coefficients and the constant for the predictive model. The coefficients are in the brackets. The companies being run in the terminals, clockwise from the top-left terminal to the bottom-left terminal, are Kirkland, Zynga, Aramark, Foot Locker, Verizon, and Disney.
Figure 6. Using the predictive model provided by the previous program, this program tests the accuracy of the predictive model. The R2 value is shown at the bottom of the terminal window. The companies being run in the terminals, clockwise from the top-left terminal to the bottom-left terminal, are Kirkland, Zynga, Aramark, Foot Locker, Verizon, and Disney.
Results
Kirkland (KIRK)
Average volume of Tweets per day: 3442
Equation: 5.16412696e-01a + 2.07975484e-05b + 9.28207328e-02c + 24.4283273762
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 1.70142808328
Day 2 0.958742568133
Day 3 36.3152009966
Figure 7. Data for Kirkland
Zynga (ZNGA)
Average volume of Tweets per day: 938
Equation: 0.05174558a - 0.00014439b - 0.00452934c + 2.56291473118
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 0.0866621636987
Day 2 0.924072874293
Day 3 0.413653843975
Figure 8. Data for Zynga
Aramark (ARMK)
Average volume of Tweets per day: 179
Equation: 0.20719973a + 0.16352239b - 0.01903519c + 31.920903806
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 0.0500613156024
Day 2 19.2256210256
Day 3 7.89317668631
Figure 9. Data for Aramark
Foot Locker (FL)
Average volume of Tweets per day: 1103
Equation: 0.07723532a - 0.08410707b + 0.0054169c + 56.7938719015
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 331.704823394
Day 2 300.694360902
Day 3 178.346760997
Figure 10. Data for Foot Locker
Verizon (VZ)
Average volume of Tweets per day: 35251
Equation: 1.37212215e-01a - 8.71009324e-05b - 2.69172043e-03c + 47.203529803
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 0.981403546164
Day 2 14.3259053938
Day 3 2.06868027039
Figure 11. Data for Verizon
Disney (DIS)
Average volume of Tweets per day: 358214
Equation: 2.90521402e-01a + 3.97562468e-06b + 4.45964933e-02c + 94.8674834768
Where a is the sentiment, b is the number of favorites, and c is the number of retweets
Day R2 Value
Day 1 1242.50846904
Day 2 7.7987392641
Day 3 5.52669624237
Figure 12. Data for Disney
Conclusions
The R2 value is the coefficient of determination, the
proportion of the variance in the dependent variable that is predictable from the independent
variables. The higher the R2 value, the more closely the two datasets agree, with an upper bound of 1;
the lower the R2 value, the less closely they agree. A negative value means that the model predicts
worse than simply using the mean of the actual values. The datasets compared in this experiment
are the stock prices predicted by the model from Tweets and their related features, and the
actual stock prices. All but one of the trials in the experiment had a negative coefficient of
determination. This means the model predicted the stock price inaccurately.
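For reference, the standard definition of the coefficient of determination is written out below. The symbols P_j (actual price), \hat{P}_j (predicted price), \bar{P} (mean actual price), and m (number of test observations) are notation introduced here for illustration and do not appear elsewhere in this paper.

```latex
R^2 = 1 - \frac{\sum_{j=1}^{m} \left(P_j - \hat{P}_j\right)^2}
               {\sum_{j=1}^{m} \left(P_j - \bar{P}\right)^2}
```

The score equals 1 when every prediction is exact, equals 0 when the model does no better than always predicting \bar{P}, and has no lower bound: it is negative whenever the squared prediction error in the numerator exceeds the spread of the actual prices in the denominator.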
The Efficient Market Hypothesis states that information relevant to stock price is
immediately reflected in the stock price. Since social media sentiment was highly uncorrelated with the stock
price according to the values of the coefficient of determination, the Efficient Market Hypothesis
suggests that information on social media is not relevant to the fluctuation in stock price.
The results of this experiment are also consistent with stock movement following a Markov chain.
In a Markov chain, the next state depends only on the current state and not on the sequence of
states that preceded it. A special case of this type of movement is a random walk, which
is a succession of random steps. In the case of stocks, each step can go either
up or down. These steps represent the randomness of fluctuations in the stock market.
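A simple random walk of the kind described above can be simulated in a few lines. This is only a toy illustration: the starting price, the fixed step size, and the equal up/down probabilities are assumptions, not estimates from the data in this experiment.

```python
import random

def random_walk(start, steps, step_size=1.0, seed=None):
    """Simulate a price path in which each step is up or down with equal chance."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(steps):
        # The next price depends only on the current price plus a random step.
        prices.append(prices[-1] + rng.choice((-step_size, step_size)))
    return prices

walk = random_walk(start=50.0, steps=250, seed=42)

# Count how many steps went up; past steps give no information about the next one.
ups = sum(1 for a, b in zip(walk, walk[1:]) if b > a)
```

Because each step is drawn independently, knowing the earlier path does not improve a guess about the next step, which is the sense in which such a series is unpredictable.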
Nobel Prize-winning economist Eugene Fama (1965) found that “empirical evidence
provides strong support for the random walk model”. He notes that economists must prove that
models can predict prices better than simply randomly choosing ‘buy’ or ‘sell’. As Figure 1
showed, the average accuracy of predictive models is only 48 percent, less than the fifty-fifty
chance of choosing randomly. The results of this experiment corroborate his findings. The major
finding of this experiment was that social media is irrelevant to determining stock returns.
Further research to seek a correlation between social media and stock returns may include taking
into account longer periods of time for data gathering and testing, using more sensitive natural
language analysis tools, and calculating based on a time shift, with an assumption that sentiment data
from the previous days would impact future days.
Bibliography
Baker, M. and Wurgler, J. (2007). Investor Sentiment in the Stock Market.
Bollen, J., Mao, H. and Zeng, X. (2011). Twitter mood predicts the stock market. Journal of
Computational Science, 2(1), pp. 1-8.
Brandon, E. (2009, February 12). How Did Your 401(k) Really Stack Up in 2008? [online] US
News.
Available at:
http://money.usnews.com/money/retirement/articles/2009/02/12/how-did-your-401k-really-stack-up-in-2008
Fama, E. (1965). Random Walks in Stock Market Prices. Financial Analysts Journal, 21(5),
pp. 55-59.
Ferri, R. (2013). It's Official! Gurus Can't Accurately Predict Markets. [online] Forbes.
Available at:
http://www.forbes.com/sites/rickferri/2013/01/10/its-official-gurus-cant-accurately-predict-markets/
[Accessed 28 Dec. 2014].
Fisher, K. and Statman, M. (2000). Investor Sentiment and Stock Returns. Financial Analysts
Journal, 56(2), pp. 16-23.
Kaniel, R., Saar, G. and Titman, S. (2004). Individual Investor Sentiment and Stock Returns.
Li, X., Xie, H., Chen, L., Wang, J. and Deng, X. (2014). News impact on stock price return via
sentiment analysis. Knowledge-Based Systems, 69, pp. 14-23.
Minh, D. (2013). Sentiment and Influence Analysis of Twitter Tweets. US20130103667 A1.
Paniagua, J. and Sapena, J. (2014). Business performance and social media: Love or hate?
Business Horizons, 57(6), pp. 719-728.
Rader, J. (2006). Method and system for conducting sentiment analysis for securities research.
US20060242040 A1.
Russell Investments, (2011). Russell 1000 Index Member List. Washington: Russell Investments.
Russell Investments, (2011). Russell Microcap Index Membership List. Washington: Russell
Investments.
Russell Investments, (2011). Russell Midcap Index Member List. Washington: Russell
Investments.
Siganos, A., Vagenas-Nanos, E. and Verwijmeren, P. (2014). Facebook's daily sentiment and
international stock markets. Journal of Economic Behavior & Organization, 107,
pp. 730-743.
Sisk, J. (2013). Methods and systems for predicting market behavior based on news and
sentiment analysis. US20130138577 A1.
Acknowledgements
We would like to acknowledge and thank several people for providing us with an
abundance of support and assistance. We would like to express our gratitude to Dr. Alex
Tabarrok of George Mason University, Professor Francis DiTraglia of the University of
Pennsylvania, Dr. Andrew Lo of the Massachusetts Institute of Technology, and William Li of
the Massachusetts Institute of Technology for their highly supportive feedback on the economic
and programming sides. We would also like to thank Dr. Csaba Gabor and Mr. Phillip Ero of the
Thomas Jefferson High School for Science and Technology for providing us with support in
understanding our statistical analyses. We are grateful to our lab director, Dr. Dan Burden of the
Thomas Jefferson High School for Science and Technology, for patiently and proactively
ensuring that we had everything we needed to conduct our experiment. Finally, and most
importantly, we want to thank our families for their unparalleled support.