Data Mining of Informational Stream in Social Networks
Forecasting of Social, Market and Financial Trends
Bohdan Pavlyshenko
e-mail: [email protected]
Blog: bpavlyshenko.blogspot.com
Technologies used: R, Python, Java, Hadoop/MapReduce/Pig/Hive
The prototypes of data mining systems are based on the theory of formal concept analysis and on the theory of frequent itemsets. Using a model of a semantic concept lattice makes it possible to analyze semantically related sets of words and to construct association rules.
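To make the concept-lattice idea concrete, here is a minimal sketch, assuming a toy formal context in which the objects are tweets and the attributes are keywords. The tweets, keywords and function names are hypothetical illustrations, not the prototype described in this talk. A formal concept is a pair (extent, intent): the extent is every tweet that contains all keywords of the intent, and the intent is every keyword shared by all tweets of the extent.

```python
from itertools import combinations

# Hypothetical toy context: tweet id -> set of keywords (not data from the talk).
CONTEXT = {
    "t1": {"apple", "iphone", "buy"},
    "t2": {"apple", "iphone"},
    "t3": {"apple", "stock"},
    "t4": {"stock", "buy"},
}

def extent(keywords, context):
    """Tweets that contain every keyword in `keywords`."""
    return frozenset(t for t, words in context.items() if keywords <= words)

def intent(tweets, context):
    """Keywords shared by every tweet in `tweets` (all keywords if `tweets` is empty)."""
    all_words = set().union(*context.values())
    return frozenset(w for w in all_words if all(w in context[t] for t in tweets))

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every keyword subset.

    Brute force: fine for a toy context, exponential in the vocabulary size.
    """
    vocabulary = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(vocabulary) + 1):
        for subset in combinations(vocabulary, size):
            tweets = extent(frozenset(subset), context)
            concepts.add((tweets, intent(tweets, context)))
    return concepts

concepts = formal_concepts(CONTEXT)
# Print the lattice nodes from the most general concept to the most specific.
for tweets, words in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(tweets), sorted(words))
```

In this toy context the keyword "iphone" closes to the intent {apple, iphone}, which is the kind of semantically related word set the lattice exposes. A real system would need a dedicated algorithm rather than subset enumeration.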
Use of quantitative characteristics of informational streams for forecasting marketing trends and for analyzing users’ attitudes towards different goods and services (opinion mining).
Detection of the predictive potential of association rules in informational streams, and the use of these rules in autoregressive models (ARIMA, VAR) to predict, in particular, financial trends on stock markets. Such a model takes into account both the past behavior of a company’s financial time series and the time dynamics of the quantitative characteristics of association rules.
Analysis of communities and of the leaders who shape the analyzed trends in social networks. Detection of manipulative shaping of users’ attitudes towards a particular commodity or economic trend.
Causality analysis based on Granger tests to single out the leading and dependent time series, particularly for informational streams, economic indicators, etc.
Creation of a recommendation subsystem for users. For example, in an online store, this system analyzes users’ behavior, their purchases, and their feedback on goods or services. Based on a user’s activity, one can build that user’s semantic profile and then make various offers, taking into account both the user’s own activity and the decisions of users with similar profiles. Such an approach may significantly shorten the time users spend searching for goods and services, and may surface offers that are unknown to them but relevant, revealed on the basis of the activity of similar users.
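The recommendation idea can be sketched as follows, assuming that a user's profile is simply the set of items the user has bought and that similarity between profiles is measured with the Jaccard index. Both choices, along with the user names and items, are assumptions made for illustration; the semantic profiles described above may be built quite differently.

```python
from collections import Counter

# Hypothetical purchase histories (user -> set of items); illustrative only.
PROFILES = {
    "anna": {"laptop", "mouse", "backpack"},
    "boris": {"laptop", "mouse", "headphones"},
    "clara": {"novel", "bookmark"},
    "dmytro": {"laptop", "backpack", "dock"},
}

def jaccard(a, b):
    """Overlap of two item sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, profiles, k=2, top_n=3):
    """Score unseen items by the summed similarity of the k nearest users who have them."""
    own = profiles[user]
    neighbours = sorted(
        ((jaccard(own, items), name) for name, items in profiles.items() if name != user),
        reverse=True,
    )[:k]
    scores = Counter()
    for similarity, name in neighbours:
        for item in profiles[name] - own:
            scores[item] += similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Items backed only by zero-similarity neighbours are not recommended.
    return [item for item, score in ranked[:top_n] if score > 0]

print(recommend("anna", PROFILES))
```

Restricting the scoring to the k most similar users keeps unrelated profiles (here, "clara") from influencing the offers.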
The analysis of financial tweets
The package “Tweet Miner for Stock Market”
The formation of keyword frequent sets with the biggest support value
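As a rough illustration of how keyword frequent sets and their support values can be obtained, here is a minimal sketch on a handful of made-up tokenized tweets. The data, the function names and the thresholds are hypothetical, and the brute-force counting stands in for whatever algorithm the package actually uses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tokenized tweets (illustrative only, not data from the talk).
TWEETS = [
    {"apple", "stock", "buy"},
    {"apple", "stock", "sell"},
    {"apple", "iphone"},
    {"apple", "stock", "buy"},
    {"google", "stock"},
]

def frequent_sets(tweets, max_size=3, min_support=0.4):
    """Return {keyword set: support} for sets found in at least min_support of the tweets.

    Support is the fraction of tweets containing every keyword in the set.
    Brute-force counting; Apriori or FP-growth would prune candidates on real data.
    """
    counts = Counter()
    for words in tweets:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(words), size):
                counts[itemset] += 1
    n = len(tweets)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def top_sets(supports, k=5, min_size=2):
    """The k multi-keyword sets with the largest support."""
    multi = [(s, v) for s, v in supports.items() if len(s) >= min_size]
    return sorted(multi, key=lambda sv: (-sv[1], sv[0]))[:k]

supports = frequent_sets(TWEETS)
for itemset, support in top_sets(supports):
    print(itemset, round(support, 2))
```

Tracking the support of a chosen set day by day gives the kind of quantitative time series that can then be compared with a stock price series.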
The analysis of causal relationship between the frequent sets in tweets and Apple stock prices. The results obtained show that it is possible to predict stock prices on the basis of data mining of informational streams in social networks.
test 1
Granger causality test
Model 1: V3 ~ Lags(V3, 1:1) + Lags(V2, 1:1)
Model 2: V3 ~ Lags(V3, 1:1)
  Res.Df Df      F   Pr(>F)
1     87
2     88 -1  10.05 0.002103 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

test 2
Granger causality test
Model 1: V2 ~ Lags(V2, 1:1) + Lags(V3, 1:1)
Model 2: V2 ~ Lags(V2, 1:1)
  Res.Df Df      F Pr(>F)
1     87
2     88 -1 0.3261 0.5694
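The F statistic in such a test compares two nested lag-1 regressions: an unrestricted model that predicts a series from its own past plus the past of the other series, and a restricted model that uses its own past only. The sketch below reproduces that calculation in plain Python on synthetic data. The series, the seed and the helper names are assumptions for illustration and are unrelated to the tweet and price data behind the output above; the p-value, which needs the F distribution, is left out.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations, Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, targets))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))

def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' with a single lag."""
    y = effect[1:]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    restricted = [row[:2] for row in unrestricted]  # drop the lagged cause
    rss_u, rss_r = ols_rss(unrestricted, y), ols_rss(restricted, y)
    df_resid = len(y) - 3  # usable observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / df_resid), df_resid

# Synthetic example: x leads y by one step, y does not influence x.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(91)]
y = [0.0]
for t in range(1, 91):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 0.5))

f_xy, df_xy = granger_f(x, y)
f_yx, _ = granger_f(y, x)
print(f"x -> y: F = {f_xy:.2f} on (1, {df_xy}) df")
print(f"y -> x: F = {f_yx:.2f}")
```

With 91 points and one lag, the unrestricted model has 87 residual degrees of freedom, the same Res.Df value that appears in the output above. A large F in one direction and a small F in the other is the asymmetric pattern the two tests show.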
Granger causality test between quantitative characteristics of tweets and Apple stock prices.
Forecasting based on ARIMA model
Forecasting based on VAR model
The examples of the studies of semantic concepts in Twitter messages
The Final Olympic Tennis Tournament (2012)
Before the Eurovision 2013 final, we published our forecast of the winner and the favorites on our blog. The forecast later proved to be correct.
The prediction of Eurovision 2013 favorites
The analysis of travel trends
Travel trends
Market analysis of iPhone concept
In this work, we examine whether public opinion among Twitter users correlates with the decisions of people who are influential in society. As a case study, we take the discussion of the probable name of the British royal baby born in July 2013. We use methods of quantitative natural language processing, the theory of frequent sets, and algorithms for visualizing user communities. We also analyze the time dynamics of keyword frequencies. The analysis showed that the main predicted name dominated the spectrum of names before the official announcement. Using the theory of frequent sets, we showed that the full name, which consists of three component names, was in the top five by support value. We also found that the structure of the dynamically formed user communities taking part in the discussion is determined by only a few leaders, who significantly influence the viewpoints of other users.
The prediction of Royal baby’s name
Royal baby’s name forecasting
The name George dominated the spectrum of names before the official announcement.
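One simple way to follow the time dynamics of candidate names is to compute, for each day, the share of tweets that mention each name. The sketch below does this on a few invented tweets. The texts, dates, name list and tokenization are all hypothetical and much cruder than a real pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical (day, tweet text) pairs; not the tweets analyzed in the talk.
TWEETS = [
    ("2013-07-22", "I bet the baby will be George"),
    ("2013-07-22", "James or George, surely"),
    ("2013-07-23", "George! It has to be George"),
    ("2013-07-23", "Hoping for James"),
    ("2013-07-23", "george alexander louis sounds right"),
]
NAMES = {"george", "james", "alexander", "louis"}

def daily_name_share(tweets, names):
    """For each day, the share of that day's tweets mentioning each candidate name."""
    mentions = defaultdict(Counter)
    totals = Counter()
    for day, text in tweets:
        totals[day] += 1
        words = {w.strip(".,!?").lower() for w in text.split()}
        for name in names & words:
            mentions[day][name] += 1  # each tweet counts once per name
    return {day: {n: mentions[day][n] / totals[day] for n in sorted(names)}
            for day in sorted(totals)}

shares = daily_name_share(TWEETS, NAMES)
for day, by_name in shares.items():
    leader = max(by_name, key=by_name.get)
    print(day, leader, {n: round(v, 2) for n, v in by_name.items()})
```

Normalizing by the number of tweets per day keeps a busy day from looking like a surge in interest for every name at once.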
The first 10 frequent sets were formed from five names, three of which are the components of the Prince’s full name, George Alexander Louis.
User communities that shaped the discussion trends.
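A very simple proxy for finding the leaders of such communities is in-degree in the mention or retweet graph: the users whom the largest number of distinct other users refer to. The sketch below uses that proxy on an invented edge list. It is a stand-in chosen for illustration, not the community visualization algorithm used for the slide, and all user names are hypothetical.

```python
from collections import Counter

# Hypothetical (author, mentioned_or_retweeted_user) edges; illustrative only.
EDGES = [
    ("u1", "news_hub"), ("u2", "news_hub"), ("u3", "news_hub"),
    ("u4", "royal_fan"), ("u5", "royal_fan"), ("u1", "royal_fan"),
    ("u2", "u3"), ("news_hub", "royal_fan"),
]

def leaders(edges, top_n=2):
    """Rank users by in-degree: how many distinct users mention or retweet them."""
    followers = {}
    for source, target in edges:
        if source != target:  # ignore self-mentions
            followers.setdefault(target, set()).add(source)
    in_degree = Counter({user: len(sources) for user, sources in followers.items()})
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

def attention_share(edges, top_n=2):
    """Fraction of all edges that point at the top_n leaders."""
    top = {user for user, _ in leaders(edges, top_n)}
    return sum(1 for _, target in edges if target in top) / len(edges)

print(leaders(EDGES))
print(attention_share(EDGES))
```

A high attention share for a small number of accounts is one way to quantify the observation that only a few leaders determine the structure of the discussion.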
More test examples and studies are available on my blog: http://bpavlyshenko.blogspot.com
Bohdan Pavlyshenko, Ph.D., e-mail: [email protected]
Thank you for your attention!