A preliminary analysis of the #FeesMustFall Twitter Campaign
Khan, Y1 and Thakur, S2 and Shabat, S3 1 Masters candidate, KZN CoLab, DUT
2 Director, KZN eSkills CoLab, DUT 3 Lecturer Information Technology DUT
Table of Contents
A preliminary analysis of the #FeesMustFall Twitter Campaign
Abstract
Introduction
Background
The Twitter Data
VADER
The data
[Outlying interesting observations]
Describe the data [YASEEN]
The Twitter platform
The impact of bots on campaign
Interesting Finding
Data Set
Detecting a bot
The emotion trend
Sentiment Analysis
Further work
Conclusion
Bibliography
Word pool
Prayer Index
Bot detection
Timelines vs frequency
‘Amita Bachan – twitter cleans followers (bots)’
Bell Pottinger
Abstract
The #FeesMustFall (FMF) campaign was started in 2015 by students to lobby
government to fund student university education to redress past imbalances. A unique
feature of FMF was the leveraging of social media platforms to coordinate the
campaign, inform and lobby students as well as activists, garner support, and retain
media and community attention. This study consequently examined the Twitter
component of FMF, primarily through the acquisition of 576,583 tweets.
These tweets were collated, pre-processed and cleaned. This meant, inter alia,
removing duplicates and data that could not be discerned or made no sense. The
output data was then subjected to a series of analyses. The analytical methods utilized
included, amongst others, descriptive statistics, sentiment analysis using a natural
language processing (NLP) approach with VADER (Valence Aware Dictionary and
sEntiment Reasoner), timeline analysis, and hashtag data analysis. The results were
triangulated with real-world events.
This study is relevant to understanding student activism. The model and methodology
may, at a government level, be extended to anticipate and mitigate service delivery
protests and even help in tracking sources of illnesses like listeriosis. At a commercial
level companies may use this for real-time sentiment tracking.
The study shows that Twitter was a key and active platform of the campaign. It found
intriguing evidence of software robots, commonly called bots, which were deployed
to drive public sentiment. To the authors' knowledge, this was not mentioned in the
media or in any other study during this campaign. This influence will be analysed.
Further, perceived negative events can and did drive sentiment. One example is the
arson event in which the UKZN library was torched. This incident will be analysed in
this paper.
Students, it seems, remain students, with activism largely confined to weekdays and to
university term time. Contrary to some perceptions, slacktivism, although probably
present, was not a key component of the campaign. Slacktivism refers to actions
performed via the Internet in support of a political or social cause but requiring little
time or involvement, e.g. signing an online petition.
The FMF campaign had the desired effect, as the new President Ramaphosa
announced that from 2017 education would be free for students from families with a
combined income of less than ZAR 350,000.
Introduction
The #FeesMustFall (FMF) campaign was started in October 2015 by students to forcefully
lobby government to fund student university education to redress past imbalances. A
unique feature of FMF was the leveraging of social media platforms to coordinate the
campaign, inform and lobby students as well as activists, garner support and retain
media and community attention. This study consequently examined the Twitter
component of FMF, primarily because the researchers had access to 576,583
tweets that were posted from March 2015 to April 2017.
Background
The #FeesMustFall (FMF) campaign was started in October 2015 by students to force
government to fully fund student university education to redress past imbalances. The
FMF campaign was waged on the back of the #RhodesMustFall (RMF) campaign at UCT.
There was intriguing evidence of one tweet in March 2015 (put tweet here) as well as
one tweet in April 2015 (put tweet here), during the RMF campaign, suggesting
that RMF gave birth to the FMF campaign. Similarly, it may well be that the
DataMustFall campaign expands as a product of FMF.
The FMF students argued that higher education entrenched a new form of apartheid
based on a class system, as the poor could ill afford the fees, the accommodation, travel
and meals. FMF is celebrated as a non-partisan, largely student protest movement,
although many political parties tried to gain mileage out of the process. The movement
enjoyed much support from across the political spectrum, from rich and poor, business,
academia, and civil society.
The actual #FeesMustFall (FMF) movement started in Johannesburg after the
University of the Witwatersrand (Wits) declared an unaffordable rise in fees for 2016. Wits
claimed that the subsidy from government would not be enough to accommodate the
net increase in costs by the university, for library books, journal subscriptions, research
equipment, and academics’ salaries. Rhodes University in Grahamstown then
announced a minimum initial payment of 50% of fees for 2016, meaning that the
average student living in residence needed an upfront payment of ZAR45,000.00. The
FMF movement became a rallying cry against financial exclusion and debt traps for
economically disadvantaged students (Pillay, 2016).
Digital activism enabled the movement to flourish. Facebook, Twitter, and instant
messaging services allowed supporters to swiftly communicate, coordinate and
organise meetings and protest marches. On 23 October 2015, thousands of
supporters marched to the Union Buildings to demand free education from the then
State President, Jacob Zuma, and then Minister of Higher Education, Blade
Nzimande.
This campaign regrettably pitted students against university administrations, which
sometimes required police intervention and backing. The irony deepens when one
considers that both elements in the ecosystem supported the campaign. Although
society supported the FMF movement, that support was expressed largely on social
media, with little physical support. Pillay is unequivocal that “silence is (also) violence.”
There were some particularly horrifying events during the standoff. Emotions swayed
towards the students when they were tear-gassed and fired upon with rubber bullets.
On the other hand, society reacted with horror when a security guard was killed at
CPUT, a ZAR100 million building was torched at UJ and an irreplaceable historic library
was burnt at UKZN.
A particularly intriguing feature of FMF was the student and public leveraging of social
media platforms to garner support and keep the event in the public eye in a sustained
manner. Of particular interest was the students' use of Twitter with the hashtag
#FMF. The researchers have gathered 597,000 tweets from the period 2015-2017 for
this campaign.
The Twitter Data
These tweets were collated, preprocessed and cleaned. The latter meant removing
duplicates and data that were uninterpretable. The output data was subjected to a
series of analyses. The analytical methods utilized included, amongst others, descriptive
statistics, sentiment analysis using natural language processing (NLP), timeline
analysis, hashtag data analysis and VADER. Valence Aware Dictionary and sEntiment
Reasoner (VADER) is an NLP lexicon and rule-based sentiment analysis tool that is
specifically attuned to sentiments expressed in social media and works well on texts
from other domains (Python Foundation, 2018). The results were also triangulated with
real-world events.
It must be noted that the FMF campaign started as a lobby against fee increases but
progressed to reminding government of its free education pledge (Hehe, 2017).
The movement is the first national struggle waged almost entirely through social
media (SM) platforms. SM was used to mobilize students through virtual tools to
amplify their physical presence on particular campuses at particular times, such as
at Wits University. At the same time, media houses were strategically informed to
ensure that events occurred in the full glare of the public eye. The simultaneously
private and public nature of SM allowed student leaders to network and
coordinate without the knowledge of the authorities. It must be mentioned that although
students distrusted the university administrations, perhaps because they felt that
administrations were not doing enough to support their cause, philosophically both
were on the same side.
Until the launch of FMF, the campaign for free education was fought by the Student
Representative Councils (SRCs) with their respective universities. This had limited
success. RMF created the intrigue and the counter-memory angle (Bosch, 2017) which
ignited academic passion. This was matched by some revolting attention-grabbing
incidents, such as the dumping of faeces, which ignited the Twittersphere. RMF fuelled
FMF, with the first mention of FMF coming six months before October 2015.
South Africa is a country with a rich history of activism against insurmountable odds.
This began historically when settlers invaded the land and declared parts of it to be
the sovereign territory of the Netherlands, Britain, Germany and others. This affected
the tribes, as well as the Khoi San.
The dispossession was entrenched in law, and took on a race-based hue, when the
1948 government wrote Apartheid into law and reduced black people to servant
status. Even among black people, the oppressor government saw fit to accord
some groups, namely the so-called Indian and Coloured people, more privileges
than African people. Gender discrimination was a given across all populations.
In 1955 in Kliptown the people of SA passed several resolutions not least the right to
free education. The people of SA rose over the next three decades and after a
protracted struggle the new country was born in 1994, with Mandela installed as our
inaugural president. The country had, since then, many competing priorities to redress
apartheid, some of which required a form of reverse apartheid, where certain positions
were reserved for people of color to statistically redress the imbalance. The people
have been patient for a reasonable period of time.
However, as time passed, the patience of the people, particularly the disadvantaged,
was tested, and an increasing number of service delivery protests resulted. These
have in many cases forced government to react, which lent further credibility to
vigilantism as a method to attract government attention.
South Africa has the most unequal school system in the world, according to Nic Spaull
of the University of Stellenbosch (Spaull, 2017). Many schools are deemed fee-free
schools and receive a full subsidy. This is a method to redress the school situation.
The irony is that a poor student may go through their entire schooling career not paying
fees and suddenly be required to pay tertiary fees.
Moreover, Bosch (2017) argues that youth are increasingly using social networking sites
to develop a new biography of citizenship which is characterized by more
individualized forms of activism. In the present case, Twitter affords youth an
opportunity to participate in political discussions, as well as discussions of broader
socio-political issues of relevance in contemporary South African society, reflecting a
form of sub-activism (Bosch, 2017).
VADER
Valence Aware Dictionary and sEntiment Reasoner, also known as VADER, is a
parsimonious rule-based model for sentiment analysis of social media text. According
to Hutto & Gilbert (2014), the effectiveness of VADER was compared to eleven typical
sentiment analysis models such as Affective Norms for English Words (ANEW),
Linguistic Inquiry Word Count (LIWC), SentiWordNet (SWN) and also those that
utilise machine learning techniques that depend on Naïve Bayes, Maximum Entropy
and Support Vector Machine (SVM) algorithms. In that comparison, VADER ranked the
highest in predictive accuracy when tested on 4,200 tweets from Twitter, 3,708 product
review snippets from Amazon.com, 10,605 movie review snippets and 5,190 article
snippets from NY Times Editorials.
VADER is an example of a lexical method for sentiment analysis and its algorithm is
based on the following principles (Gab, 2017):
$$E \in [-4, 4], \quad \text{where } E \text{ represents the sentiment score per word}$$
Sentiment score or Emotion intensity of a word is measured on a scale from -4
to +4, where -4 is the most negative and +4 is the most positive. The midpoint
0 represents a neutral sentiment.
The overall sentiment (S) is normalized using the formula,
$$S = \frac{\sum_i E_i}{\sqrt{\left(\sum_i E_i\right)^2 + \alpha}}, \qquad i = 0, 1, 2, \ldots, n$$
where $E_i$ represents the sentiment score of the $i$th word,
$S \in [-1, 1]$ is the overall sentiment, and
$\alpha$ is a normalization parameter set at a value of 15.
$S > 0$ is Positive, $S < 0$ is Negative and $S = 0$ is Neutral.
The overall sentiment score of a sentence is the normalization of the sum of the
sentiment score of each sentiment-bearing word. A value below zero is considered to
have a negative overall sentiment with -1 being the most negative whilst a value above
zero has a positive sentiment with +1 being the most positive and a value of zero
returns a neutral sentiment.
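The normalization described above can be sketched in a few lines of Python. This is a minimal illustration of the formula only, not the VADER library itself; the function names and the example word scores are hypothetical and are not taken from the VADER lexicon.

```python
import math

ALPHA = 15  # normalization parameter, set to 15 as stated in the text


def normalize(valences, alpha=ALPHA):
    """Map summed word scores onto [-1, 1] via S = sum / sqrt(sum^2 + alpha)."""
    total = sum(valences)
    return total / math.sqrt(total * total + alpha)


def label(score):
    """Apply the thresholds from the text: S > 0 positive, S < 0 negative, S = 0 neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical per-word scores on the [-4, 4] scale (illustrative values only).
example_valences = [1.9, 0.0, -0.4]
example_score = normalize(example_valences)
example_label = label(example_score)
```

Because alpha is positive, the denominator is always larger than the absolute value of the numerator, so the score approaches but never reaches -1 or +1.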
VADER incorporates colloquialism, emoticons and punctuations into its sentiment
algorithm by considering five heuristics (Gab, 2017). They are as follows:
1. Punctuation
2. Capitalization
3. Degree Modifiers
4. Shift in polarity due to “but”
5. Tri-gram examination before a sentiment-laden lexical feature to catch polarity
negation
The development of VADER by Hutto & Gilbert (2014) involved 20 prescreened and
appropriately trained human raters, which boosts its credibility, and the tool can be
applied to various domains. VADER is based upon the English language and therefore
lacks linguistic diversity; continuous development is required to improve this area.
The data
These tweets were collated, preprocessed and cleaned. The latter meant removing
duplicates and data that made no sense. The output data was subjected to a series
of analyses. The analytical methods utilized included, amongst others, descriptive
statistics, sentiment analysis using natural language processing (NLP), timeline
analysis, hashtag data analysis and VADER. The results were also triangulated with
real-world events. The findings are analyzed in this paper.
There is a school of thought which says that online activism promotes slacktivism, which
refers to actions performed via the Internet in support of a political or social cause but
regarded as requiring little time or involvement, such as signing an online petition or
joining a campaign group on a social media website or application. FMF is different
because it had a physical protest component. One other campaign where online activism
had a physical component was Tahrir Square in Egypt during 2012.
[Outlying interesting observations]
The range of languages used in the tweets was interesting. A small number of tweets
were in Chinese (32), isiZulu (100), Afrikaans, …
Perspectives on the Internet range across the full spectrum, from those who view it as
potentially disruptive (Aday et al., 2010; Howard, 2010) to those who argue that it may
even support authoritarian regimes (Morozov, 2011). Tufekci and Wilson (2012)
examined evidence of how social media and the Internet were being used by protesters
as events unfolded in real time, in particular among participants in the Tahrir Square
protests in Egypt. Their central research questions were: Did social media use shape
how participants learned about the protests, how they planned their involvement, and
how they documented their involvement? (Tufekci and Wilson, 2012)
Software Robot
A social bot is a software robot or program that simulates human behavior in
automated interactions on social network platforms such as Facebook and Twitter.
They're sophisticated enough to fool other users and be taken for a human.
Social bots populate techno-social systems: they are often benign, or even useful, but
some are created to harm, by tampering with, manipulating, and deceiving social
media users (Ferrara, Varol, Davis, Menczer and Flammini, 2016).
Social bots have been used to infiltrate political discourse, manipulate the stock
market, steal personal information, and spread misinformation. The detection of social
bots is therefore an important research endeavour (Ferrara, Varol, Davis, Menczer
and Flammini, 2016)
Describe the data [YASEEN]
Range type dates etc.
Twitter data, also known as tweets, were purchased from a professional data service provider and consist of 576 583 data points. Each data point or tweet has the following metadata:
• Tweet – A message containing texts, emoticons and symbols limited to 280 characters (140 previously)
• Time Stamp – Date according to the Gregorian calendar as well as the time of the tweet
• User name – Unique identification of the user who tweeted
• Source of Tweet – The device used to send through a tweet
• Favourite – Tweet tagged as favourite
• Retweeted – Tweet that has been reposted or forwarded
Tweets were stored as text in a Microsoft Excel format and other media such as
images, videos and audio were omitted from the analysis. The timeline of tweets
gathered ranges from the 21 March 2015 until the 10 April 2017.
The data underwent a preprocessing phase prior to analysis which consisted of the
following:
• Removing duplicates and corrupt data points
• Validating data at random with tweets from the Twitter.com website
• Applying the VADER analysis using Python
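The first preprocessing step can be sketched as follows. This is only an assumed illustration: the field names (tweet, user_name, time_stamp) are hypothetical stand-ins for the metadata columns, and "corrupt" is interpreted here simply as a row with a missing required field, which may differ from the rule the authors applied.

```python
def clean_tweets(records):
    """Drop corrupt rows and exact duplicates, keeping the first occurrence.

    Each record is a dict; the keys used below are hypothetical column names.
    """
    seen = set()
    cleaned = []
    for rec in records:
        text = (rec.get("tweet") or "").strip()
        user = (rec.get("user_name") or "").strip()
        stamp = (rec.get("time_stamp") or "").strip()
        if not text or not user or not stamp:
            continue  # treated as corrupt: a required field is missing
        key = (user, stamp, text)
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Keying duplicates on user, time stamp and text together keeps genuine retweets by different users while removing rows that were exported twice.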
The Twitter platform
Twitter is a popular social networking and microblogging tool which was released in
2006. It has about 300m users, who write over five hundred million messages
each day. Twitter users express whatever is on their minds through a so-called tweet. A
tweet is much like an SMS but it may only be 280 characters long. This length may
well be extended according to Twitter. These tweets are sometimes annotated with a
tag called a hashtag, usually prefixed by the symbol #. A user on Twitter has their
user name preceded by an @, so Justin Timberlake, whose Twitter name is
jtimberlake, has the handle @jtimberlake.
For our study the student protest movement, known as the Fees Must Fall movement
was tagged. It became known as #FeesMustFall. This opportunistic tagging allows for
contextual searches, grouping of responses, identification of trends and other forms
of meta-analysis. Sometimes multiple tags are used in the same tweet. For example,
the university acronym was often used with the tweet. Thus, the following tweet:
“8am DUT steve biko campus that's where imma be at tomorrow#FeesMustFall #DUT
#MUT #UKZN #DurbanShutDown” [YASEEN]
Each hashtag in a tweet acts as a hyperlink that diverts the reader to other tweets carrying the same tag.
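Grouping tweets by tag starts with pulling the hashtags out of the tweet text. The sketch below uses a simple regular expression on the example tweet quoted above; the function name is hypothetical, and the pattern is a simplification that deliberately also matches a tag attached directly to the preceding word, as in "tomorrow#FeesMustFall".

```python
import re

# A '#' followed by one or more word characters (letters, digits, underscore).
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in a tweet, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


tweet = (
    "8am DUT steve biko campus that's where imma be at "
    "tomorrow#FeesMustFall #DUT #MUT #UKZN #DurbanShutDown"
)
tags = extract_hashtags(tweet)
```

Lower-casing the tags means that variants such as #FeesMustFall and #feesmustfall are counted together.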
The impact of bots on campaign
The massive spread of digital misinformation has been identified as a global risk that
could impact elections, national security, and company and individual reputations
(Shao et al, 2017). Much research is being undertaken to understand the viral diffusion
of misinformation. Indeed, Shao et al (2017) conducted research on the 2016 US
Presidential campaign to mine for misinformation.
Twitter has two kinds of directed relationships friend and follower. In the case where
the user A adds B as a friend, A is a follower of B while B is a friend of A. In Twitter
terms, A follows B. B can also add A as his friend (namely, following back or returning
the follow), but is not required. From the standpoint of information flow, tweets flow
from the source (author) to subscribers (followers). More specifically, when a user
posts tweets, these tweets are displayed on both the author’s homepage and those of
his followers (Chu et al, 2012).
The growing number of users and the very open nature of Twitter have made it an
ideal target for exploitation by automated programs, known as bots. Further, cyborgs
have emerged as an intermediary between humans and bots; these are either
human-assisted bots or bot-assisted humans. Cyborgs have become a feature on Twitter
and display interwoven hybrid characteristics of both manual and automated behavior
(Chu, Gianvecchio, Wang and Jajodia, 2012).
Table 1.0 Tweet distribution
Year   Month       Number of Tweets
2015   March                      1
       April                      1
       October              289,458
       November              13,452
       December               3,922
2016   January               13,318
       February               7,215
       March                  3,898
       April                  2,076
       May                      900
       June                   1,551
       July                   2,238
       August                 6,541
       September             38,472
       October               82,712
       November               9,505
       December               3,626
2017   January                4,113
       February               2,244
       March                  2,843
       April                  2,363
October 2015 has the highest number of tweets with a tally of 289 458 which is greater
than the total number of tweets for the entire year of 2016.
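The comparison can be checked directly from the monthly figures in Table 1.0. The short sketch below sums the months per year; the variable names are illustrative only.

```python
# Monthly tweet counts copied from Table 1.0.
MONTHLY_TWEETS = {
    2015: {"March": 1, "April": 1, "October": 289_458,
           "November": 13_452, "December": 3_922},
    2016: {"January": 13_318, "February": 7_215, "March": 3_898,
           "April": 2_076, "May": 900, "June": 1_551, "July": 2_238,
           "August": 6_541, "September": 38_472, "October": 82_712,
           "November": 9_505, "December": 3_626},
    2017: {"January": 4_113, "February": 2_244, "March": 2_843,
           "April": 2_363},
}

# Total number of tweets per calendar year.
yearly_totals = {year: sum(months.values())
                 for year, months in MONTHLY_TWEETS.items()}

october_2015 = MONTHLY_TWEETS[2015]["October"]
exceeds_2016 = october_2015 > yearly_totals[2016]
```

The 2016 months sum to 172,052 tweets, so October 2015 alone (289,458) exceeds the whole of 2016.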
[Figure: Total Tweets per weekday. x-axis: Monday to Sunday; y-axis: number of tweets, 0 to 120,000]
The tweet count is noticeably lower over the weekend (Saturday and Sunday) than on
the midweek days (Wednesday, Thursday and Friday), when the tweet count is
significantly high.
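A weekday tally of this kind can be produced from the time stamps alone. The sketch below is an assumed illustration: the time stamp format string is hypothetical, since the export format of the purchased data is not specified here.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def tweets_per_weekday(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per weekday; `fmt` is an assumed time stamp layout."""
    counts = Counter(
        WEEKDAYS[datetime.strptime(ts, fmt).weekday()] for ts in timestamps
    )
    # Return every weekday, in calendar order, including days with no tweets.
    return {day: counts.get(day, 0) for day in WEEKDAYS}
```

Using the numeric weekday index rather than a locale-dependent day name keeps the labels stable regardless of the machine's language settings.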
It is widely believed that the first widespread use of political bots to shift public
opinion was the alleged work of Bell Pottinger, a public relations company employed
to support the Gupta family. This was allegedly revealed by documents now known as
#GuptaLeaks. Guild (2017) suggests that the Twitterati come to the defense of South
Africa’s democracy by outing fake pro-Gupta Twitter bot accounts, which have been
used to promote the family by praising pro-Gupta supporters and selectively targeting
journalists and others perceived to be anti-Gupta. Many were created in India. One
such Twitter account, “Esaia Theron”, was shown to be fake (Child, 2017).
For example, Theron praised known supporter Andile Mngxitama:
@Mngxitama “We all have to admit that is the greatest of all so much
passion he has for the improvement of the country. #BLF”
On the other hand, journalist Barry Bateman was trolled by the bots, picking up 500
new fake followers daily, which forced him to lock his account. Theron condemned
Bateman for locking his account and blocking him (Theron).
Interesting Finding
The significant finding in this research was that evidence of bots used in #FMF was
uncovered. Given the nature and the participants of the FMF campaign, who are
largely viewed as the intelligentsia of the country, it was interesting to find evidence
of a bot, though in retrospect unsurprising.
It is highly probable that some slacktivists, who may well be bright students or
academics, authored the bots. Students are inherently intelligent and will always find
the easiest way to do something.
Table 2.0 Description of the platforms
Platform | Twitter | WhatsApp | SMS | Messenger
User currently online feature | Not available | Available | Not available | Available
Size | 140 characters | No restriction | 160 characters | No restriction
Cost | Almost free or low cost | Bandwidth cost | Charge per SMS | Bandwidth cost
User searchable | Hashtags, on all but private tweets | Only user can search. Encrypted | Only at user level | Only at user level
Communication Mode | Broadcast or Direct Message (DM) | One-to-one defined, one-to-many or many-to-many closed user groups | One-to-one or one-to-many | One-to-one or one-to-many
Sources (Differences, 2017)
Data Set
Information entropy is defined as “the average amount of information produced by
a probabilistic stochastic source of data.” As such, it is one effective way to quantify
the amount of randomness within a data set (Kramar, 2017).
Since one can reasonably conjecture that actual humans are more complicated than
automated programs, entropy can be a useful signal when one is attempting to identify
bots, as has been done by a number of previous researchers. Of the recent research
in social bot detection, particularly notable is the excellent work by groups
of researchers from the University of California and Indiana University. Their “botornot”
system uses a random forest machine learning model that incorporates 1,150 features
derived from user account metadata, friend/follower data, network characteristics,
temporal features, content and language features, and sentiment analysis. Botornot is
now called Botometer (Kramar, 2017).
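To make the entropy signal concrete, the sketch below computes the Shannon entropy of the gaps between an account's consecutive tweets. This is an assumed illustration and not the Botometer feature set: the function names, the use of inter-tweet gaps as the data source, and the one-minute bin width are all choices made here for the example.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(
        -(c / total) * math.log2(c / total) for c in counts.values()
    )


def interval_entropy(post_times, bin_seconds=60):
    """Entropy of the gaps between consecutive tweets.

    `post_times` are posting times in seconds; gaps are grouped into bins of
    `bin_seconds` (an assumed width) before the entropy is computed.
    """
    ordered = sorted(post_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return shannon_entropy(int(gap // bin_seconds) for gap in gaps)
```

An account that posts at perfectly regular intervals has a single gap value and therefore an entropy of zero, while irregular human posting spreads the gaps over many bins and yields a higher value.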
Detecting a bot
A tweet may be authored by a software social robot (bot) pretending to be human, by a
human, or by a human who uses bot technology to help them post more, faster, and
longer (cyborg). Each has some characteristics that assist in distinguishing between them.
Chu et al (2012) observed that a typical human user is very likely to follow “famous”
or reputable accounts.
$$\text{Account Reputation} = \frac{\text{follower\_no}}{\text{follower\_no} + \text{friend\_no}}$$
A celebrity has many followers and few friends e.g. Justin Timberlake who has a
reputation value of close to one. In contrast a bot has few followers and many friends.
This has a reputation close to zero. It follows that humans should have the highest
Account Reputation, followed by cyborgs, and then bots.
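The reputation measure is simple enough to compute directly. The sketch below follows the formula above; the handling of an account with no followers and no friends is an assumption made here, since the formula is undefined in that case.

```python
def account_reputation(follower_no, friend_no):
    """Account Reputation = follower_no / (follower_no + friend_no)."""
    total = follower_no + friend_no
    if total == 0:
        # Assumption: an account with no followers and no friends is given
        # the lowest reputation, because the ratio is otherwise undefined.
        return 0.0
    return follower_no / total


# Hypothetical accounts for illustration (not figures from the data set).
celebrity_like = account_reputation(follower_no=1_000_000, friend_no=100)
bot_like = account_reputation(follower_no=10, friend_no=2_000)
```

With these illustrative numbers the celebrity-like account scores close to one and the bot-like account close to zero, matching the ordering described above.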
In terms of the number of tweets, it turns out that cyborgs generate the most tweets,
followed by humans and finally bots (source?). At a superficial level this is surprising,
but on reflection bots tweet frequently over a short sustained period, during which
their rate is higher than that of humans, and then hibernate for a long period, perhaps
to avoid detection. Some bot accounts are now being suspended for extreme or
aggressive activity (Chu et al, 2012).
Indicators of a cyborg
1. Follows very few accounts, followed by very few
2. Usually topic specific
3. Frequency may be defined by characteristics
4. By equal periods
5. Short frequent bursts
6. Exact time in a day every day
7. Account Reputation
(Chu, 2012; Kramar, 2017)
Bot-authored tweets
1. Use timers to tweet
2. Or fixed intervals
3. Exhibit regular repetitive behavior
The use of equal periods between tweets was a very simple indicator of cyborg and
bot activity.
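One way to test for equal periods is to measure how much the gaps between consecutive tweets vary. The sketch below is an assumed heuristic and not the authors' procedure: the function name, the coefficient-of-variation threshold and the minimum number of tweets are all hypothetical parameters.

```python
from statistics import mean, pstdev


def looks_timer_driven(post_times, max_cv=0.05, min_tweets=5):
    """Flag an account whose gaps between tweets are almost identical.

    `post_times` are posting times in seconds. `max_cv` (the largest allowed
    coefficient of variation of the gaps) and `min_tweets` are assumed
    thresholds chosen for illustration.
    """
    if len(post_times) < min_tweets:
        return False  # too few tweets to judge regularity
    ordered = sorted(post_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return True  # every tweet was posted at the same instant
    return pstdev(gaps) / average_gap <= max_cv
```

Dividing the standard deviation by the mean makes the check independent of the posting rate, so an account tweeting every hour and one tweeting every ten minutes are both flagged if their spacing is equally rigid.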
Johannesburg marketing strategist Andrew Fraser, who has analysed many bots, said
they are easy to identify by their strange names, identical profiles and that they all
tweet the same thing at the same time. Using online tools, he found many were
generated in India.
SM may be used to legally monitor activity for evidence of the spread of an
illness. Chew and Eysenbach (2010) call this infoveillance.
The emotion trend
Of interest is the emotional mood swing of the tweets during FMF. Was the
positive always in the majority? Did the negative moods hold sway at any point? Did
major destructive events such as the burning of the UKZN library and the UJ hall sway moods?
Sentiment Analysis
[Figure: Sentiment Analysis. Monthly share of tweets (0% to 100%) classified as Negative, Neutral and Positive]
Figure 1.0 Sentiment Analysis
The figure above depicts the sentiment of tweets in proportion to the total number of
tweets accumulated for a given month. Outliers have been filtered out by excluding
months in which the total number of tweets was less than 100.
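The monthly proportions and the outlier filter can be sketched as follows. This is an assumed illustration: the input is taken to be pairs of a month key and an overall sentiment score, and the function name and month key format are hypothetical.

```python
from collections import Counter, defaultdict


def monthly_sentiment_shares(scored_tweets, min_tweets=100):
    """Share of negative, neutral and positive tweets per month.

    `scored_tweets` is an iterable of (month, score) pairs, where `score` is
    the overall sentiment S. Months with fewer than `min_tweets` tweets are
    dropped, mirroring the outlier filter described in the text.
    """
    by_month = defaultdict(Counter)
    for month, score in scored_tweets:
        if score > 0:
            sentiment = "positive"
        elif score < 0:
            sentiment = "negative"
        else:
            sentiment = "neutral"
        by_month[month][sentiment] += 1

    shares = {}
    for month, counts in by_month.items():
        total = sum(counts.values())
        if total < min_tweets:
            continue  # too few tweets for a meaningful proportion
        shares[month] = {
            name: counts[name] / total
            for name in ("negative", "neutral", "positive")
        }
    return shares
```

Under this filter, months such as March and April 2015, which contain a single tweet each, would not appear in the figure.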
Negative sentiment was highest in proportion to the other sentiment classifications
for the months of September, October and November 2016. This coincides with
events such as the burning of the UKZN Howard library on 06 September 2016
and the burning of a lecture hall at the University of Johannesburg (UJ).
[Discuss UKZN library incident]
Bots
Bots are automated applications that perform human-like tasks. In the context of social
media, they are known as socialbots (chatbots). Modern socialbots are capable of
holding a conversation with a human, but not without limitations. Limitations exist due
to the complex nature of human communication and language, and while natural
language processing (NLP) is improving, there are and probably always will be
factors that curb its efficiency. Factors such as sarcasm, pragmatism and colloquialism
are complex to overcome, but machine learning offers some relief, and by utilising
efficient algorithms together with NLP this gap becomes narrower.
In a specific context a bot may be undetectable even with the best of methods, as was
seen when the Turing test was passed years ago, but for a bot to achieve this in modern
times it has to evolve alongside the counterparts that aim to detect and remove
anti-social bot activities. Twitter has witnessed the emergence of bot activities that aim
to influence government and financial markets, such as the Bell Pottinger incident (Bond,
2017). Twitter has its own security measures that detect unusual activities within its
platform, but bots adapt and evolve and continue to plague social media platforms.
Thus, there is a need for growing research in this area, as malicious bots aim to
influence societies’ governments and economies. Twitterbots have been known to
attempt to influence the sentiment of Twitter users by associating a hashtag with
positive or negative text together with fake news and additional hashtags.
** Facebook has recently launched a website called wit.ai whereby users can create their own bots.
Bot characteristics
Unsurprisingly, an average bot displays repetitive behaviour and a high
volume of output, and is very frequently active. A Twitterbot is a type of bot software that
uses a Twitter API to control a Twitter account (Chu, et al., 2012). Twitterbots are
capable of performing tasks autonomously such as tweeting, retweeting, liking,
following, unfollowing, or direct messaging other accounts. Twitter imposes a set of
automation rules that cap Twitterbot behaviour (Twitter Inc., 2018) but these do not
effectively remove all malicious bots (Shao, et al., 2017).
According to Gilani et al. (2017), there are clear distinctions between bot and human activity across the following metrics:
Age of Account
User Tweets
User Retweets
User replies and mentions
URLs in tweets
Content uploaded
Likes per tweet
Retweets per tweet
Tweets Favourited
Friend-follower ratio
Activity sources count
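As a minimal sketch of how a few of these metrics might be derived, the Python fragment below computes account age, retweet share, URL share and friend-follower ratio from raw counts. The AccountStats record and all of its field names are hypothetical and chosen for illustration only; they are not taken from Gilani et al. (2017) or from the Twitter API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccountStats:
    """Hypothetical per-account counts; field names are illustrative only."""
    created: date          # date the account was created
    tweets: int            # original tweets posted by the account
    retweets: int          # retweets posted by the account
    friends: int           # accounts this user follows
    followers: int         # accounts following this user
    posts_with_url: int    # posts containing at least one URL


def account_metrics(acc: AccountStats, today: date) -> dict:
    """Derive a few of the listed metrics from raw counts."""
    total_posts = acc.tweets + acc.retweets
    return {
        "age_days": (today - acc.created).days,
        "retweet_share": acc.retweets / total_posts if total_posts else 0.0,
        "url_share": acc.posts_with_url / total_posts if total_posts else 0.0,
        # An account with no followers gets an infinite ratio rather than an error.
        "friend_follower_ratio": (
            acc.friends / acc.followers if acc.followers else float("inf")
        ),
    }
```

No thresholds are implied here; the values would only become meaningful when compared across accounts known to be human or automated.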
The following accounts have been shown to exhibit bot activity:
Twitter suspends accounts based on the following three criteria (Twitter Inc., 2018):
Spam
Account security at risk
Abusive tweets or behaviour
Users have now begun to use automated tools to boost their profiles on social
media. In particular, there are tools for Twitter that follow and unfollow
users automatically (Karlson, 2017). A framework for bot detection has been
constructed by Varol et al. (2017), in which more than 1000 features are leveraged
to evaluate the degree to which a Twitter account resembles known
social bots (Davis, et al., 2016). This framework is implemented in the website
truthy.indiana.edu/botornot (now Botometer) and is free to use online.
Bot influence
According to Gilani et al. (2017), bots have been observed to have a profound
influence in social media. Since sentiment shared in social media has been
recognised to affect external matters such as financial markets and political affinity, it
is unsurprising to witness the emergence of bots that aggressively lobby viewpoints or
spread malicious information.
Bot Networks
Online media has become influential in affecting the sentiment of users, and bot
creators have been targeting websites and social media to promote products or
causes, or to spread fake news. Sophisticated bots have been known to circumvent
the controls of social media platforms and to form networks of their own, known as
bot networks. The recent Trump and Russia allegations revealed bot networks on
Twitter that targeted the accounts of journalists and other users who opposed Donald
Trump, with some human accounts temporarily suspended after being attacked by a
network of bots1. This means that a bot network is capable of getting human accounts
suspended by triggering a breach of Twitter's rules.
Analysis of #FeesMustFall Twitter Data
Botometer rates EduFunder as 74% bot-like, and the account shows outlier behaviour (8 tweets within 2 seconds).
Table 3 analyses the #FeesMustFall Twitter data collected from 21 March 2015 until 9 April 2017. A total of 576 583 tweets were gathered. The table lists the 10 users who posted the most tweets during this period.
Table 3: The most prolific tweeters
| User Name | Avg. Sentiment Score | No. of hashtags (#) | Favourite | No. of URLs in Tweet | Number of Tweets | Retweet |
|---|---|---|---|---|---|---|
| Camaren Peter | -0,07 | 63817 | 488 | 15362 | 15403 | 242 |
| EduFunder | 0,05 | 13665 | 146 | 4111 | 7018 | 319 |
| Wake up SA!! | 0,00 | 15684 | 1013 | 2215 | 2318 | 1025 |
| Jou Ma Se Party | -0,09 | 533 | 56 | 2294 | 2258 | 70 |
| #AFRICA | -0,09 | 7355 | 426 | 2221 | 2193 | 100 |
| Jacaranda News | -0,01 | 3388 | 2035 | 712 | 2063 | 7185 |
| EWN Reporter | -0,04 | 2206 | 6532 | 969 | 1739 | 28974 |
| The Daily VOX | -0,02 | 1561 | 3523 | 1096 | 1053 | 12034 |
| POWER987News | 0,00 | 1297 | 1371 | 298 | 1041 | 6622 |
| ANN7 | -0,01 | 259 | 1114 | 366 | 949 | 4171 |
1 https://krebsonsecurity/2017/08/twitter-bots-use-likes-rts-for-intimidation/
- Avg Sentiment – The average sentiment of the user's tweets. Sentiment has been calculated
using the VADER2 (Valence Aware Dictionary and Sentiment Reasoner) library for
Python3.
- No. of hashtags – The total number of hashtags (#) used by the user.
- Favourite – The total number of times the user's tweets were marked as a favourite.
- No. of URLs in Tweet – The total number of URLs in the user's tweets.
- No. of Tweets – The total number of tweets posted by the user.
- Retweets – The total number of times the user's tweets were retweeted.
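The column definitions above can be illustrated with a short Python sketch that aggregates tweets per user. It is not the authors' actual pipeline: the dictionary keys ("user", "text", "favourites", "retweets"), the regular expressions and the function name are assumptions made for illustration. The sentiment scorer is passed in as a function so that the sketch runs without third-party packages; with the vaderSentiment package installed, `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` could be supplied instead, on the assumption that the compound score is the value reported in the table.

```python
import re

# Simple patterns for counting hashtags and URLs (illustrative, not exhaustive).
HASHTAG = re.compile(r"#\w+")
URL = re.compile(r"https?://\S+")


def summarise_users(tweets, score):
    """Aggregate per-user totals and the mean sentiment score.

    `tweets` is an iterable of dicts with the hypothetical keys
    "user", "text", "favourites" and "retweets".
    `score` is any function mapping tweet text to a float.
    """
    rows = {}
    for t in tweets:
        row = rows.setdefault(t["user"], {
            "tweets": 0, "hashtags": 0, "urls": 0,
            "favourites": 0, "retweets": 0, "sentiment_sum": 0.0,
        })
        row["tweets"] += 1
        row["hashtags"] += len(HASHTAG.findall(t["text"]))
        row["urls"] += len(URL.findall(t["text"]))
        row["favourites"] += t["favourites"]
        row["retweets"] += t["retweets"]
        row["sentiment_sum"] += score(t["text"])
    # Replace the running sum with the per-user average.
    for row in rows.values():
        row["avg_sentiment"] = row.pop("sentiment_sum") / row["tweets"]
    return rows
```

Sorting the resulting users by their "tweets" count in descending order and keeping the first ten would give a ranking of the same shape as Table 3.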
The researchers argue, with reasonable justification, that news media Twitter accounts such as ANN7, EWN Reporter and POWER987News are self-serving, because they direct traffic towards their own news articles as a marketing strategy. Twitter users view these accounts as reputable sources of information and retweet them.
2 VADER Sentiment Analysis. VADER (Valence Aware Dictionary and Sentiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains - https://github.com/cjhutto/vaderSentiment
3 Python is an interpreted high-level programming language for general-purpose programming - https://www.python.org/
EduFunder was found to be a bot, as the volume and frequency of its tweets were
abnormal, as can be seen in Figure 2 below:
Figure 2: Sample of Tweets from EduFunder
The vertical axis represents time in the format h:m:s (hour:minute:second) and the
horizontal axis represents the corresponding volume of tweets. Notice the significant
amount of activity within 10 seconds, with at least 2 tweets per second in this sample
of data. EduFunder also used IFTTT (If This Then That) to tweet, a service that is
widely used to create bot applets.
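Bursts of this kind can be located with a simple sliding window over tweet timestamps. The sketch below is an illustration rather than the method used in the study; the function name is hypothetical, and the default thresholds (8 tweets within 2 seconds) are taken from the outlier behaviour noted for EduFunder.

```python
from datetime import datetime, timedelta


def find_bursts(timestamps, min_tweets=8, window_seconds=2):
    """Return (window_start, count) for every tweet that begins a window of
    `window_seconds` containing at least `min_tweets` tweets."""
    times = sorted(timestamps)
    window = timedelta(seconds=window_seconds)
    bursts = []
    end = 0  # index one past the last tweet inside the current window
    for start, t0 in enumerate(times):
        # Advance the right edge while tweets still fall inside the window.
        while end < len(times) and times[end] - t0 <= window:
            end += 1
        count = end - start
        if count >= min_tweets:
            bursts.append((t0, count))
    return bursts
```

Because both window edges only move forward, the scan is linear in the number of tweets once they are sorted. A human account would be expected to return an empty list at these thresholds.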
Table 4: A few tweets from Figure 2
| Tweet | Sentiment |
|---|---|
| NikkyCage: RT Notty_Mnguni: "The only Blade we acknowledge" ☹☹ #FeesMustFall #NationalShutDown #UKZNFeesMustFall #… https://t.co/StX4389VIX | 0 |
| BCM_82: RT IOL: 10 powerful placards of #FeesMustFall https://t.co/3iWxA75HXs | 0,4215 |
| ABasaJJ: RT sihle_mda: #NationalShutdown ✊ #FeesMustFall ✊ | 0 |
| Ceendie_: Mandela didn't spend years in prison for this #Feesmustfall | -0,5106 |
| Mageba_Zulu: RT SASCO_Jikelele: Solidarity from Namibian students. #FeesMustFall #NationalShutDown https://t.co/2J9vbEiyDj | 0,296 |
Table 4 displays a few tweets corresponding to Figure 2 in order to deduce the nature
of the bot’s intentions. It can be seen that the bot posts many retweets and virtually no
‘original’ tweets. Perhaps it was created to amplify awareness of the #FMF
campaign. Some tweets from the bot were themselves retweeted, and it can
therefore be said that the bot exerted some influence on other
Twitter users.
Camaren Peter is more of a cyborg, as Hootsuite and Buffer were used to automate
tweets for this account. Unlike a fully human-controlled Twitter account, Camaren
Peter’s tweeting activity was punctual and showed a lack of variety in
the substance of its tweets. Figure 3 portrays this behaviour as follows:
Figure 3: Tweets vs Minutes of an hour
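The punctuality shown in Figure 3 can be approximated by counting tweets per minute of the hour and measuring how concentrated they are. The following sketch is an assumption about how such a measure could be computed, not the procedure behind Figure 3; the function names and the choice of the two busiest minutes are hypothetical.

```python
from collections import Counter
from datetime import datetime


def minute_histogram(timestamps):
    """Count tweets by the minute of the hour (0-59) in which they were posted."""
    return Counter(ts.minute for ts in timestamps)


def top_minute_share(timestamps, k=2):
    """Share of all tweets that fall in the k busiest minutes of the hour.

    Values near 1.0 suggest scheduled posting; an account that tweets at
    arbitrary times would be expected to score much lower.
    """
    hist = minute_histogram(timestamps)
    total = sum(hist.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in hist.most_common(k))
    return top / total
```

For example, an account that only posts on the hour and on the half hour would score 1.0 with k=2, whereas tweets spread evenly over all 60 minutes would score 2/60.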
Table 5 shows an example of a tweet from Camaren Peter:
Table 5: A sample tweet from Camaren Peter
| Tweet | Date | Favourite | Retweet | Tweet Source | Sentiment |
|---|---|---|---|---|---|
| Thought Factory (Oct 2015): Student Protests Scuppered by Institutions https://t.co/JTRUqaIc8e #FeesMustFall #SouthAfrica #leadership | 2016-10-03 11:56 | 5 | 5 | Hootsuite | -0,2263 |
VADER scores the tweet in Table 5 as negative, and on
inspection this seems accurate, as the text leaves the reader with a sense that the
student protests failed. Also note that this tweet was marked as a favourite 5 times and
retweeted 5 times. This signifies some degree of influence and, in the broader context,
means that cyborgs can affect social media users. The link in the tweet directs users
to a blog that discusses South African politics, thereby drawing them towards the aims of the
cyborg.
Further work
This study is relevant to understanding student activism. The model and methodology
may, at a government level, be extended to anticipate and mitigate service delivery
protests and even to help track sources of illnesses such as listeriosis. At a commercial
level, companies may use the approach to track real-time sentiment. Newly emerging
campaigns such as #DataMustFall may also be tracked.
Conclusion
Although not part of the study, the researchers are pleased to announce that the
campaign had a desired effect, as the new President Ramaphosa announced that
education will from 2017 be free for students from families with a combined income of
less than ZAR 350 000.
The study shows that Twitter was a key and active platform of the campaign. Contrary
to some perceptions, slacktivism, although present, was not a key component of the
campaign. The study found intriguing evidence of software robots, commonly called
social bots or simply bots, which were, to the authors' knowledge, not mentioned in the
media or in any study during this campaign.
Bibliography
Chew, C. and Eysenbach, G., 2010. Pandemics in the Age of Twitter: Content Analysis of Tweets during the 2009 H1N1 Outbreak. PLOS ONE, 5(11), e14118. https://doi.org/10.1371/journal.pone.0014118
Child, K., 2017. Pro-Gupta bots unmasked. Sunday Times, 10 July 2017.
Chu, Z., Gianvecchio, S., Wang, H. and Jajodia, S., 2012. Detecting automation of twitter accounts: Are you a human, bot, or cyborg?. IEEE Transactions on Dependable and Secure Computing, 9(6), pp.811-824.
Davis, C. A. et al., 2016. BotOrNot: A system to evaluate social bots. Proc. 25th Intl. Conf. Companion on World Wide Web, pp. 273-274.
Differences. 2017. Differences between Twitter and Texting. Report. Available at: http://www.differencebetween.net/technology/internet/difference-between-twitter-and-texting/
Ferrara, E., Varol, O., Davis, C., Menczer, F. and Flammini, A., 2016. The rise of social bots. Communications of the ACM, 59(7), pp. 96-104.
Gilani, Z. et al., 2017. An in-depth characterisation of Bots and Humans on Twitter. arXiv preprint arXiv:1704.01508, pp. 1-18.
Karlson, K., 2017. AGGREGATE: everyone’s using automated twitter following tools. [Online] Available at: https://aggregateblog.com/automated-twitter-following-tools/ [Accessed 13 February 2018]
Kramar, S., 2017. Identifying viral bots and cyborgs in social media. O'Reilly Media. [Online] Available at: https://www.oreilly.com/ideas/identifying-viral-bots-and-cyborgs-in-social-media
Python Foundation, 2018. Vader Sentiment 2.5. [Online] Available at: https://pypi.python.org/pypi/vaderSentiment
Shao, C. et al., 2017. The spread of fake news by social bots. arXiv preprint arXiv:1707.07592, pp. 1-27.
Spaull, N., 2017. South Africa has one of the world’s worst education systems. The Economist, 7 January 2017. [Online] Available at: https://www.economist.com/news/middle-east-and-africa/21713858-why-it-bottom-class-south-africa-has-one-worlds-worst-education
Twitter inc., 2018. About suspended accounts. [Online] Available at: https://help.twitter.com/en/managing-your-account/suspended-twitter-accounts [Accessed 8 February 2018].
Twitter Inc., 2018. Automation Rules. [Online] Available at: https://help.twitter.com/en/rules-and-policies/twitter-automation
Pillay, S.R. 2016. Silence is violence: (critical) psychology in an era of Rhodes Must Fall and Fees Must Fall
Varol, O. et al., 2017. Online Human-Bot Interactions: Detection, Estimation, and Characterization. arXiv:1703.03107v2, 27 March.pp. 1-11.
Word pool [social yob][social mob]coordinate manipulate incite
Prayer Index What was the prayer index during the campaign?
Bot detection
Timelines vs frequency
‘Amita Bachan – twitter cleans followers (bots)’
Bell Pottinger Slash and burn strategy Instant gratification