Post on 10-Jun-2020
The Mass Media bias: Analysing and comparing the time series of
polls and news articles during the 2016 USA presidential election.
Federico Albanese (ffalbanese@gmail.com)
Director: Pablo Balenzuela
Codirector: Viktoriya Semeshenko
Departamento de Física, FCEyN-UBA
Objectives
1) Do the mass media influence society?
2) Does negative propaganda have a positive or a negative effect on a candidate?
3) Is there a bias in the mass media?
Polls
- 263 polls (an average of 2.7 polls per day)
- Made by: NBC, New York Times, LA Times, CBS, Fox News, Gravis, ABC, IBD (among others)
[Figure: poll difference ∆(Clinton - Trump), percentage [%] versus time [month]]
Media
New York Times, Fox News, Breitbart
- The most consumed and most googled newspaper in the USA [1].
- The most Republican media, according to a study made at Berkeley University (2013) [2].
- Fox News is more conservative, whereas Breitbart has been exclusively pro-Trump from the very first day.
An article by A. J. Delgado, Oct. 22, 2015
[1] Google Trends in the USA between the most important newspapers
[2] https://datascience.berkeley.edu/data-media-map-bitly/
First look into the data
[Figure: number of mentions per article in the New York Times; panels: Clinton, Trump]
Clinton was mentioned fewer than 5 times in most of the articles. In contrast, Trump was mentioned more than 80 times in some articles.
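The counting behind this first look can be sketched as below. This is a minimal illustration, not the procedure used for the slides: the matching rule (whole-word surname match), the function names and the toy articles are all assumptions.

```python
import re
from collections import Counter

# Hypothetical matching rule: whole-word surname matches.
# The slides do not say how mentions were identified.
PATTERNS = {
    "Clinton": re.compile(r"\bClinton\b"),
    "Trump": re.compile(r"\bTrump\b"),
}

def count_mentions(article_text):
    """Return the number of mentions of each candidate in one article."""
    return {name: len(p.findall(article_text)) for name, p in PATTERNS.items()}

def mention_histogram(articles, candidate):
    """Map 'mentions per article' -> 'number of articles with that count'."""
    return Counter(count_mentions(text)[candidate] for text in articles)

# Toy articles, invented for illustration only.
articles = [
    "Trump spoke in Ohio. Trump attacked Clinton. Trump left.",
    "Clinton met voters in Iowa.",
    "The economy grew last quarter.",
]
per_article = [count_mentions(a) for a in articles]
trump_hist = mention_histogram(articles, "Trump")
```

The histogram is the quantity a "mentions per article" plot would show: one bar per mention count, with the bar height given by the number of articles.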
Sentiment Analysis
Stanford NLP: the algorithm builds a binary tree from each sentence, taking into account the semantic composition.
Example sentence: (There are slow and repetitive parts, but it has just enough spice to keep it interesting)
Going from the children to the root, a sentiment value (positive, negative or neutral) is assigned to each node.
Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., & Potts, C. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 conference on empirical methods in natural language processing (pp. 1631-1642).
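The bottom-up labelling of a binary tree can be illustrated with a toy sketch. This is not the recursive neural model of Socher et al.: the lexicon, the `Node` class and the composition rule (sum of the child scores) are hypothetical stand-ins for the learned composition function, used only to show the traversal order from the children to the root.

```python
from dataclasses import dataclass
from typing import Optional

# Toy word polarities (hypothetical, for illustration only).
LEXICON = {"slow": -1, "repetitive": -1, "spice": 1, "interesting": 1}

@dataclass
class Node:
    word: Optional[str] = None      # set for leaves only
    left: Optional["Node"] = None   # set for internal nodes only
    right: Optional["Node"] = None
    score: int = 0

def annotate(node):
    """Post-order traversal: score the children first, then the parent."""
    if node.word is not None:
        node.score = LEXICON.get(node.word, 0)
    else:
        # Toy composition rule; the real model learns this function.
        node.score = annotate(node.left) + annotate(node.right)
    return node.score

def label(score):
    """Map a numeric score to one of the three sentiment classes."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Binary tree for the fragment "slow (spice interesting)".
tree = Node(
    left=Node(word="slow"),
    right=Node(left=Node(word="spice"), right=Node(word="interesting")),
)
root_score = annotate(tree)
root_label = label(root_score)
```

After `annotate` runs, every node carries its own score, so a label is available for each sub-phrase as well as for the whole sentence.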
Sentiment Analysis
Time Series:
[Figure: number of phrases versus dates, one panel per candidate (Clinton, Trump); series: # positive phrases, # neutral phrases, # negative phrases, # total phrases; marked events: (1) Republican National Convention, (2) First Debate, (3) Election Day]
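A minimal sketch of how such time series could be built from labelled phrases follows. The input layout (one `(date, candidate, label)` tuple per phrase), the function name and the toy data are assumptions; the slides do not describe the aggregation step.

```python
from collections import defaultdict
from datetime import date

LABELS = ("positive", "neutral", "negative")

def daily_counts(labelled_phrases, candidate):
    """Return {day: {"positive": n, "neutral": n, "negative": n, "total": n}}."""
    series = defaultdict(lambda: {**{lab: 0 for lab in LABELS}, "total": 0})
    for day, who, lab in labelled_phrases:
        if who != candidate:
            continue
        series[day][lab] += 1
        series[day]["total"] += 1
    # Sort by date so the result can be plotted directly as a time series.
    return dict(sorted(series.items()))

# Toy input, invented for illustration: (date, candidate, sentiment label).
phrases = [
    (date(2016, 7, 18), "Trump", "negative"),
    (date(2016, 7, 18), "Trump", "positive"),
    (date(2016, 7, 18), "Clinton", "neutral"),
    (date(2016, 9, 26), "Trump", "negative"),
]
trump_series = daily_counts(phrases, "Trump")
```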
Linear Correlation
Linear correlation with a 14-day lag

Difference in the polls
                              Coefficient  p-value    Coefficient  p-value    Coefficient  p-value
Clinton’s positive mentions   0.485        3.43e-6    -0.213       0.05       0.060        0.590
Clinton’s negative mentions   0.394        2.24e-4    -0.682       1.29e-12   -0.319       0.3
Clinton’s total mentions      0.453        1.70e-5    -0.616       5.54e-10   -0.174       0.116
Trump’s positive mentions     0.554        5.64e-8    -0.395       2.20e-4    0.160        0.149
Trump’s negative mentions     0.476        5.39e-6    -0.470       7.54e-6    -0.021       0.853
Trump’s total mentions        0.518        5.31e-7    -0.437       3.62e-5    0.082        0.460

- The more phrases the New York Times publishes, the bigger the difference in favor of Clinton.
- The more phrases Fox News publishes, the more Trump goes up in the polls and the smaller the difference.
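A lagged linear correlation of this kind can be sketched as follows. The direction of the lag (mentions on day t compared with polls on day t + lag) and the daily alignment of both series are assumptions, and the toy data use a 2-day lag to stay short; p-values are not computed here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)

def lagged_pearson(mentions, polls, lag):
    """Correlate mentions on day t with polls on day t + lag (assumed direction)."""
    if lag < 0 or lag >= len(mentions):
        raise ValueError("lag must satisfy 0 <= lag < len(mentions)")
    if lag == 0:
        return pearson(mentions, polls)
    return pearson(mentions[:-lag], polls[lag:])

# Toy series: the polls follow the mentions with a 2-day delay.
mentions = [1, 3, 2, 5, 4, 6, 8, 7]
polls = [0, 0] + [2 * m + 1 for m in mentions[:-2]]
r_lagged = lagged_pearson(mentions, polls, lag=2)
r_unlagged = lagged_pearson(mentions, polls, lag=0)
```

In this toy case the lagged coefficient is essentially 1 because the polls are an exact delayed linear function of the mentions; real series would give intermediate values like those in the table.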
Mutual Information of the symbolized time series
Mutual Information (MI) measures the dependency between two time series:
[Equation: MI(X, Y)]
where Xi and Yj are the values of two random variables X and Y, and “n” and “m” are the numbers of possible values of X and Y. The value of MI goes from 0 (no mutual information) to 1 (perfect relation between the variables).
- The permutation test was used to measure the significance of the statistical results [1].
- All the time series were symbolized for this analysis [2].
[1] François, D., Wertz, V., & Verleysen, M. (2006, April). The permutation test for feature selection by mutual information. In ESANN (pp. 239-244).
[2] Bandt, C., & Pompe, B. (2002). Permutation entropy: a natural complexity measure for time series. Physical Review Letters, 88(17), 174102.
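The three ingredients named above (symbolization, mutual information, permutation test) can be sketched together. Several choices here are assumptions, since the slides do not state them: ordinal patterns of order 3 for the symbolization, a plug-in estimate MI = H(X) + H(Y) - H(X, Y), and a normalization by min(H(X), H(Y)) to obtain values between 0 and 1.

```python
import math
import random
from collections import Counter

def ordinal_symbols(series, order=3):
    """Bandt-Pompe style symbolization: each window becomes its rank pattern."""
    symbols = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        symbols.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return symbols

def entropy(symbols):
    """Plug-in Shannon entropy (in nats) of a sequence of symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())

def mutual_information(xs, ys):
    """Plug-in estimate MI = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def normalized_mi(xs, ys):
    """MI scaled by min(H(X), H(Y)); this normalization is an assumption."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0

def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Share of shuffles of ys whose MI with xs reaches the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy series, invented for illustration only.
series = [1, 4, 2, 5, 3, 6, 2, 7, 1, 8, 3, 9, 5, 2, 6, 4, 8, 1, 7, 3]
sym = ordinal_symbols(series)
nmi_self = normalized_mi(sym, sym)
p_self = permutation_p_value(sym, sym)
```

Shuffling one series destroys any temporal alignment while keeping its symbol frequencies, so the shuffled MI values form the null distribution against which the observed value is compared.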
[Figure: mutual information of the symbolized time series; labels: Polls of Hillary Clinton, Hillary Clinton, Donald Trump]
The sentiment of the phrases matters: it is related to the time series of the polls.
Topic Detection
How could you mathematically represent a document?
- Vectors
V = [ ... , TF(t)*IDF(t) , ... ] -> dim = # words
with:
IDF(t) = log(N / nt)
where N is the # of documents and nt the # of documents in which the word t appears.
Combining all the vectors of all the documents, we have a matrix M.

Topic Detection: Dimensionality reduction
Non-Negative Matrix Factorization (NMF)
NMF is an algorithm where a matrix M is factorized into two matrices W and H (M ≈ H*W), with the property that all three matrices have no negative elements.
Advantages:
- Vectors have positive components (easy interpretation)
- Orthogonality is not imposed
Disadvantages:
- The # of topics is an input, not an output of the algorithm.

Ramos, J. (2003, December). Using tf-idf to determine word relevance in document queries. In Proceedings of the first instructional conference on machine learning (Vol. 242, pp. 133-142).
Xu, W., Liu, X., & Gong, Y. (2003, July). Document clustering based on non-negative matrix factorization. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 267-273). ACM.
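The pipeline from documents to topics can be sketched with NumPy. This is an illustration under stated assumptions, not the implementation behind the slides: TF is taken as the relative frequency of a word in a document, IDF as log(N / nt), documents are the rows of M, and the factorization M ≈ H*W is fitted with the standard Lee-Seung multiplicative updates, a solver the slides do not name. The toy documents are invented.

```python
import numpy as np

def tfidf_matrix(docs):
    """Rows = documents, columns = vocabulary words, entries = TF(t) * IDF(t)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    index = {w: j for j, w in enumerate(vocab)}
    doc_freq = np.zeros(len(vocab))
    for doc in tokenized:
        for w in set(doc):
            doc_freq[index[w]] += 1
    idf = np.log(len(docs) / doc_freq)        # IDF(t) = log(N / nt)
    M = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(tokenized):
        for w in doc:
            M[i, index[w]] += 1.0 / len(doc)  # TF as relative frequency (assumption)
    return M * idf, vocab

def nmf(M, n_topics, n_iter=300, seed=0, eps=1e-9):
    """Factorize M ≈ H @ W with non-negative H (docs x topics), W (topics x words)."""
    rng = np.random.default_rng(seed)
    H = rng.random((M.shape[0], n_topics)) + eps
    W = rng.random((n_topics, M.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        W *= (H.T @ M) / (H.T @ H @ W + eps)
        H *= (M @ W.T) / (H @ W @ W.T + eps)
    return H, W

docs = [
    "economy jobs taxes economy",
    "jobs taxes trade economy",
    "immigration border wall immigration",
    "border immigration visa wall",
]
M, vocab = tfidf_matrix(docs)
H, W = nmf(M, n_topics=2)
# The highest-weighted words in each row of W describe one topic.
top_words = [[vocab[j] for j in np.argsort(row)[::-1][:3]] for row in W]
```

The number of topics (`n_topics`) has to be supplied by the caller, which is the disadvantage listed above: the algorithm does not discover it.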
[Figure: example topics; panels: Economy, Social Issues: Immigration]
Topic detection for each media outlet separately

Topics:
- Social issues (immigration and racism)
- Economy
- Week review
- Clinton’s and Trump’s scandals
- Art
- Foreign affairs

Topics:
- Elections
- Clinton’s email scandal
- Social issues (immigration)
- Economy
- Foreign affairs
- Clinton Foundation scandals

Topics:
- Social issues (racism)
- FBI investigation of Clinton’s emails
- Third party
- Clinton Foundation scandals
- Social issues (immigration)
- Clinton’s email scandal
<< ffalbanese@gmail.com >>