
Cyberbullying Detection: Modeling Aggression in Online Communities

Presented By: Yuba Bhandari, Lorena Mesa, Daniele Vinci

"We have been in touch with Ms. McGowan's team," Twitter said in a tweet on Thursday. "We want to explain that her account was temporarily locked because one of her Tweets included a private phone number, which violates of our Terms of Service." Source: CNN

https://www.theverge.com/2017/10/12/16463622/rose-mcgowan-temporarily-blocked-from-twitter-after-weinstein-tweets

http://money.cnn.com/2017/10/12/technology/rose-mcgowan-twitter-account/index.html

https://www.bostonglobe.com/opinion/2017/09/24/twitter-should-disable-donald-trump-account/8L3ueLmWkSjWNgi95SVqRP/story.html

https://www.theguardian.com/technology/2017/may/04/facebook-content-moderators-ptsd-psychological-dangers

http://www.wnyc.org/story/moderating-content-facebook

“ . . . the use of information and communication technologies to support deliberate, repeated, and hostile behaviour by an individual or group, that is intended to harm others”. Belsey, B. Cyberbullying.ca. Available online: http://www.cyberbullying.ca

What can go wrong when posting online?

▪ Highly prevalent in communities where users are under 18
▪ Impacts one’s mental health
▪ Cyberbullying can take different forms:
□ Flaming
□ Trolling

How has cyberbullying been deterred in the past?

Historically done via:
▪ Content moderation via the product owner
▪ Content moderation via user feedback (e.g. user reports)

Typically these approaches require moderators to manually review comments. This is highly inefficient and doesn’t scale well because every decision needs human input.

Our idea and how it evolved

Looked at pre-existing data sets and scraped our own to understand the variability of cyberbullying across different types of platforms.

Build a model to identify a piece of text as cyberbullying or not

Cyberbullying incorporates the criteria of traditional bullying: intent to harm, repetition, and power imbalance.

Cyber-based communication is unique and typical criteria are difficult to identify

Cyberbullying → Cyber Aggression!

First, we planned to develop an application to detect cyberbullying and help website owners address the problem in a timely manner.

Highly variable based on the communication platform in which the conversation happens.

BUT

Modeling Aggression
▪ Problem: By looking at aggression, we don’t have to consider intent. Yet aggression can be influenced by many things.
▪ For example, the communication platform has a lot of influence on how conversation happens.

▪ Solution: How do various communities monitor aggression?

Data Selection Criteria: Have at least 1 million users and permit scraping!

Wikipedia
Wikipedia pages have a "talk page" used for communicating with other users.
115,864 comments, of which about 88% are unflagged.

Formspring
Now-defunct Q/A message forum wherein a user poses a question and selects the best answer.

Reddit
Registered members submit content to the site, such as links, text posts, and images, which are then voted up or down by other members. It is an anonymous platform and uses varying rules of moderation.

Communities Data Sets

Wikipedia Detox Project
Dataset containing comments scraped from Wikipedia’s talk pages. The dataset also includes Crowdflower-sourced labels for comments based on aggression, toxicity, and personal attacks.

Formspring Data
Uploaded by a user (Sweta Agrawal) to the Kaggle website. The total dataset is about 12,000 comments labeled using a web service, Amazon’s Mechanical Turk. The data represents 50 IDs from Formspring.me that were crawled in Summer 2010. For each ID, the profile information and each post (question and answer) were extracted.

Reddit
Data collected with the PRAW API from Reddit in order to get a broad spectrum of audiences (across age ranges and interests). Collected data daily from four of the most active subreddits: politics, soccer, LeagueOfLegends, AskReddit. We used Mechanical Turk to label the ~5000 comments.

Data Science Pipeline / Workflow

Disclaimer! There is explicit and derogatory language in the next slides. We sincerely apologize :(

What type of stats do we care about in the data?
Reddit Politics Training Set: 2500 labeled comments (7% aggressive)

- Lexical diversity:
  - Aggressive: 0.21
  - Non: 0.19
- Word count:
  - Aggressive: 3816
  - Non: 74657
- Common words:
  - Aggressive: profanities, go, traitors, trump, going, better
  - Non: trump, would, like, people, one, get, dnc, https
- Word frequency distributions
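The statistics above can be approximated with the standard library alone. This is a minimal sketch, assuming that lexical diversity means unique tokens divided by total tokens and that a simple regex tokenizer is good enough; the function names and the sample comments are made up for illustration and are not taken from the Reddit data.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def corpus_stats(comments, top_n=3):
    """Return word count, lexical diversity, and the most common words."""
    tokens = [tok for comment in comments for tok in tokenize(comment)]
    counts = Counter(tokens)
    # Assumed definition: unique tokens / total tokens.
    diversity = len(counts) / len(tokens) if tokens else 0.0
    return {
        "word_count": len(tokens),
        "lexical_diversity": diversity,
        "common_words": counts.most_common(top_n),
    }


# Made-up comments, for illustration only.
sample = ["People would like this", "people like people"]
stats = corpus_stats(sample)
print(stats)
```

Running the same function separately on the aggressive and non-aggressive comments would give the kind of side-by-side comparison shown on the slide.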

Yellowbrick

Text Visualization of Bullying Corpus

▪ Vocab: 2,942
▪ Words: 19,046
▪ Hapax: 62 *

* Hapax references hapax legomenon; Greek for “(something) being said (only) once"
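For readers who want to see where the three numbers come from, here is a small sketch that counts vocabulary size, total words, and hapaxes for a token list. It is not the Yellowbrick implementation; `freq_summary` and the sample sentence are hypothetical.

```python
from collections import Counter


def freq_summary(tokens):
    """Count distinct words, total words, and words that occur exactly once."""
    counts = Counter(tokens)
    hapaxes = [word for word, n in counts.items() if n == 1]
    return {"vocab": len(counts), "words": len(tokens), "hapax": len(hapaxes)}


# Made-up sentence, for illustration only.
tokens = "you are so so mean and you know it".split()
summary = freq_summary(tokens)
print(summary)
```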

http://www.scikit-yb.org/en/latest/_modules/yellowbrick/text/freqdist.html

Python NLP Tools

sklearn: Open source Python machine learning library including classification algorithms.
pandas: Open source Python data analysis tool with “expressive data structures”
NLTK: Natural language toolkit for Python
jupyter: Open source web app to create and share code, visualizations, explanatory text

Detecting Aggression in Formspring Dataset

Data Extraction / Cleaning (Formspring Dataset)

Formspring Data:
❏ Each comment is an exchange of question and answer
❏ Labelled as bullying or non-bullying by three Mechanical Turk workers (scale 0 - 10)

Examples:
Q: Bitch u thee bomb like Tick TICK!<br>A: Hahah(: Thanks! Bitch u thee bomb like Tick TICK! Hahah(: Thanks! None No 0 n/a No 0 n/a Yes 9 Bitch u thee bomb like Tick TICK

Q: Well don't because it's crude.<br>A: crude? Well don't because it's crude. crude? None No 0 n/a No 0 n/a No 0 n/a

Scattertext viz!

Model Selection Triple (Key to Accurate Prediction)

Text Feature Extraction and Optimization

Text Normalization (Reduce # of features)
▪ Removal of special characters
▪ Punctuation and stop word removal
▪ Stemming vs. lemmatization

Feature Optimization
❏ Choice of text normalization
❏ Ngrams (use of one, two, or three words as a feature)
❏ Term frequency / document frequency tuning
❏ Limiting maximum features
❏ Dimensionality reduction (SVD)

What helped
▪ Stemming, stop word removal
▪ Ngrams: Monogram and bigram
▪ Frequency tuning: Helped
▪ Limiting max features: Helped
▪ SVD: Did not help
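To make the normalization and n-gram steps concrete, here is a minimal standard-library sketch. It removes special characters and punctuation, drops stop words, and emits monograms and bigrams. Stemming is left out because it needs a real stemmer (for example one from NLTK). The stop list and function names are hypothetical and far smaller than anything used in practice.

```python
import re

# Tiny illustrative stop list (an assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you"}


def normalize(text):
    """Lowercase, replace non-letters with spaces, and remove stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def ngrams(tokens, n_min=1, n_max=2):
    """Return all n-grams from n_min to n_max words, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


# Made-up comment, for illustration only.
features = ngrams(normalize("You are the WORST player, go away!!!"))
print(features)
```

Note that stop words are removed before the bigrams are built, so a bigram can join two words that were not adjacent in the original comment.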

Apply With Care
Bengfort et al., unpublished work

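How that may look in Python (sklearn):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

Pipeline([
    # TF-IDF features: monograms and bigrams, English stop words removed
    ('vect', TfidfVectorizer(stop_words='english', max_df=0.20,
                             ngram_range=(1, 2), max_features=50000)),
    ('clf', MultinomialNB())
])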

Link: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html

TfidfVectorizer:

▪ ngram_range controls the size of your ngrams

▪ max_df ignores terms whose document frequency is above this threshold (a proportion of documents when given as a float, as with 0.20)

▪ stop_words removes words with no intrinsic value (e.g. "the")

Quantifying Feature Optimization

TfidfVectorizer

Weighted by term frequency and inverse document frequency. Reduces the weight of words that occur frequently across the documents.
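The weighting idea can be illustrated without sklearn. The sketch below multiplies the raw term count by a smoothed inverse document frequency, idf = ln((1 + n) / (1 + df)) + 1, which is the default smoothing documented for scikit-learn's TF-IDF transformer. It also applies a max_df cut-off. It skips the vector normalization that sklearn applies, so the numbers will not match a real TfidfVectorizer exactly. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs, max_df=1.0):
    """Weight each term by raw count times smoothed inverse document frequency.

    Terms that appear in more than max_df (a fraction) of the documents are
    dropped. No vector normalization is applied in this sketch.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    kept = {t for t, df in doc_freq.items() if df / n_docs <= max_df}
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in kept}
    return [
        {t: tf * idf[t] for t, tf in Counter(doc).items() if t in kept}
        for doc in docs
    ]


# Toy tokenized documents, for illustration only.
docs = [["trump", "traitor"], ["trump", "people"], ["trump", "dnc"], ["people", "like"]]
weights = tfidf(docs, max_df=0.5)
print(weights)
```

Here "trump" appears in three of four documents and is dropped by max_df=0.5, while the rarer "traitor" gets a higher weight than "people", which appears in two documents.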

MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(15, 4))

Text Normalization       F1 Score
lemmatization            0.50
stemming                 0.56
monogram                 0.50
monogram and bigram      0.56
include stopwords        0.54
remove stopwords         0.56

Model Family Selection: Our problem is a classification problem.

Supervised Learning

1. Experience: Labeled comments as aggressive or not
2. Task: Label a comment
3. Performance Metric: How many did we get right?

Classifiers

NB: Probabilistic. Avoids overfitting (making assumptions beforehand about the likely distribution of the answer). But the independence assumption is a simplistic model of the world.

LR: Models the relationship between variables and is iteratively refined using a measure of error in the predictions made by the model (regularization, penalty term). Logistic regression gives linear class boundaries.

DT: Graphical model of rules that partitions the data until a decision is reached at one of the leaf nodes. Complexity is related to the amount of data and the partitioning method. Prone to overfitting. Minor variations in the data cause big changes in tree structure. Highly biased toward the training set (Random Forest to your rescue).

RF: Constructs a forest of decision trees. At each step of building a tree, it picks a random subset of features to try. It will eventually pick a subset of features that perform best in a tree classifier.

MLP: Can learn a non-linear function approximator. Between the input and the output layer, there can be one or more non-linear layers (i.e., hidden layers). Requires tuning a number of hyperparameters; sensitive to feature scaling.

SVM: Attempts to maximize the distance between classes; works in high-dimensional space. Uses a kernel to map data into a higher-dimensional space. Linear kernels are commonly used for text classification due to the large number of features involved.

Algorithm Selection and Hyperparameter Tuning

Total corpus: 118,910 words
Bully Comments: 747 (6%)
Non Bully Comments: 11954

Algorithm                Precision  Recall  F1 Score  F1 Score SVD (100)  F1 Score SVD (500)
MLPClassifier            0.74       0.45    0.56      0.43                0.49
LogisticRegression       0.92       0.14    0.24      0.31                0.40
Multinomial NB           0.0        0.0     0.0       Breaks              Breaks
DecisionTreeClassifier   0.42       0.40    0.41      0.25                0.20
Random Forest            0.85       0.17    0.22      0.13                0.02
SVM (LinearSVC)          0.79       0.40    0.53      0.29                0.44

Oversampling bullying comments “magic”
❏ Added an equal amount (6%) of bullying comments to the dataset
❏ Thereby increased the weight of bullying instances in the corpus
❏ Added an incentive for the learners to identify the true bullying instances
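One simple way to add an equal amount of minority-class examples is to duplicate them. The sketch below does this with plain lists. It is an assumption that the slide's oversampling worked by duplication, although the jump from 747 to 1494 bullying comments on the next slide is consistent with it. The function name and the toy data are hypothetical.

```python
def oversample_minority(texts, labels, minority_label=1, copies=1):
    """Append `copies` extra duplicates of every minority-class example."""
    extra_texts = [t for t, y in zip(texts, labels) if y == minority_label] * copies
    extra_labels = [minority_label] * len(extra_texts)
    return texts + extra_texts, labels + extra_labels


# Toy data, for illustration only: 1 = bullying, 0 = not bullying.
texts = ["nice goal", "you are trash", "good game", "thanks", "see you"]
labels = [0, 1, 0, 0, 0]

new_texts, new_labels = oversample_minority(texts, labels)
share_before = sum(labels) / len(labels)
share_after = sum(new_labels) / len(new_labels)
print(share_before, share_after)
```

One caveat worth keeping in mind: if duplicates are added before the data is split into folds, copies of the same comment can land in both the training and the validation fold, which can inflate cross-validation scores. Oversampling only the training portion of each fold avoids this.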

Accuracy  Precision  Recall  F1 Score
0.97      0.82       0.94    0.86

12 Fold Cross Validation MLPClassifier

Total corpus: 119,880 words
Bully Comments: 1494 (11%)
Non Bully Comments: 11954

Model Family Selection For Wiki Detox Data

▪ Unlike Formspring, the class imbalance was not as pronounced in Wikipedia Detox

▪ Logistic regression, naive Bayes, and a multi-layer perceptron were applied

▪ Formspring skews towards a younger user age unlike Wikipedia

Wiki Detox Logistic Regression

Results: What does the confusion matrix tell us?

The support column tells us the actual count of each label in the test data (the sum of a row of the confusion matrix).

                             Predicted Non Cyber (0/False)   Predicted Cyber (1/True)
Actual Non Cyber (0/False)   33137                           534
Actual Cyber (1/True)        1436                            3129

Rows are actual labels; columns are predicted labels.

An ideal classifier with 100% accuracy would produce a pure diagonal matrix, with every element predicted with the correct label.

RECALL
There are 33137+534 (33671) comments with actual label 0; 98% were successfully identified as Non Cyber (label 0).
Similarly, there are 1436+3129 (4565) comments with actual label 1; 69% were successfully identified as Cyber (label 1).

PRECISION
There are 33137+1436 (34573) comments predicted as label 0; 96% of them are actually Non Cyber (label 0).
There are 534+3129 (3663) comments predicted as label 1; 85% of them are actually Cyber (label 1).

False positive: 534 (actual label 0, predicted label 1)
False negative: 1436 (actual label 1, predicted label 0)

The F1-score is the harmonic mean of Precision and Recall.

ACCURACY
There are 33137+3129 (36266) correctly labeled comments out of 38236 total comments = 95%.

Total instances in the matrix = 38236 (the sum of the 4 cells). The data was split with train_test_split using test_size=0.33; 38236 is 33% of the 115,864 initial instances.
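The percentages quoted for the Cyber class (label 1) can be re-derived from the four cells of the confusion matrix. The sketch below uses the counts reported above; the function name is hypothetical, and the F1 value it computes is derived from those counts rather than quoted from the slides.

```python
def binary_metrics(tn, fp, fn, tp):
    """Compute precision, recall, F1, and accuracy for the positive class."""
    precision = tp / (tp + fp)          # of predicted positives, how many are right
    recall = tp / (tp + fn)             # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts from the Wiki Detox logistic regression confusion matrix.
m = binary_metrics(tn=33137, fp=534, fn=1436, tp=3129)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

This prints a precision of 0.85, a recall of 0.69, an F1 of 0.76, and an accuracy of 0.95 for the Cyber class.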

True negative

True positive

Not OK

Acceptable miss

Wiki Detox: Naive Bayes and Multi Layer Perceptron

What if we wanted to consider aggression in relation to topic analysis?

Unsupervised Learning Approaches focusing on Topics

Top terms per cluster
Extracting features from the training dataset using a sparse vectorizer
- Apply to MiniBatchKMeans
- Findings:
  - n_samples: 2500, n_features: 3542

Cluster 0: trump just like don people going right time say did
Cluster 1: need yes just democrats american really people read hillary stop

Clustering → dissimilarity distance (TF-IDF vectorizer output transformed with cosine similarity)
Example (distance matrix computed from the TF-IDF vectors):
[[ 0.00000000e+00 8.05677888e-01 1.00000000e+00 ..., 1.00000000e+00 6.01168210e-01 7.41111300e-01]
 [ 8.05677888e-01 0.00000000e+00 1.00000000e+00 ..., 1.00000000e+00 6.85804750e-01 9.49692201e-01]

- Using this precomputed distance matrix with multidimensional scaling, we can plot the differences and do some visual topic analysis
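A dissimilarity matrix like the one above can be built as one minus the cosine similarity between every pair of document vectors. The following standard-library sketch shows the computation on three toy vectors. The function names are hypothetical, and a real run would start from the TF-IDF vectors instead.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Pairwise cosine distances; the diagonal is (numerically close to) zero."""
    return [[cosine_distance(u, v) for v in vectors] for u in vectors]


# Toy document vectors, for illustration only.
vectors = [[1.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
dist = distance_matrix(vectors)
print(dist)
```

Documents that share no terms get a distance of 1.0, which matches the many 1.00000000e+00 entries in the example matrix. The resulting square matrix is the kind of precomputed input a multidimensional scaling routine can project into two dimensions for plotting.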

Dissimilarity distance of comments in subreddit Politics
* Sorry, including the text of comments wouldn’t have been legible!

"More than 1,000 new words, senses, and subentries have been added to the Oxford English Dictionary in our latest update, including worstest, fungivorous, and corporation pop."
Oxford English Dictionary September 2017 Update

http://public.oed.com/the-oed-today/recent-updates-to-the-oed/

Lessons Learned
- We built three models for three communities; in hindsight, starting with one model across all of them may have been better
- However, we optimized feature engineering techniques that we can recommend when building an aggression classifier
- Arriving at a “one model fits all” is significantly more difficult than we thought. Using a combination of models can help us arrive at better generalized predictive power.

What’s next?
▪ Document level features
▪ Sentiment analysis
▪ Topic analysis
▪ Doc2Vec
▪ Use Deep Representation:
□ Recurrent neural network
□ Convolutional neural network
▪ Can we add in demographic data somehow?
▪ Develop a “general” corpus representing aggression
