presentation - application of actor level social characteristic indicator selection for the...

“Application of Actor Level Social Characteristic Indicator Selection for the Precursory Detection of Bullies in Online Social Networks”

April 19th, 2016

Holly White, Jeremy Fields, Robert Hall, Joshua White

2

Introduction

Background on Cyberbullying

Bullying Traits

Dataset Selection

Analysis

– Pre-processing

– Determining Negativity

Initial Results

Conclusion and Future Work

References / Contact Info.

Overview

3

This work presents a precursory method for detection of “potential” bullies in social networks. The method produces false positives, and we have yet to quantify how many.

Disclaimer

4

Cyberbullying occurs globally and affects people of all ages and walks of life

– Effects include depression, low self-esteem, physical and psychological conditions, and suicide (3,4)

– Early detection can mitigate effects

Cyberbullying occurs within a wide range of technologies

– Analyzing social media for cases of cyberbullying requires methods that isolate key factors within large samples of data

Introduction

5

Cyberbullying affects both victims and offenders (8)

– Victims are 1.9X more likely to attempt suicide

– Offenders are 1.5X more likely

– Attacks are personal & focus on sexuality, race, intelligence, and appearance

Courts insist schools have a legal & moral responsibility to take action

– 49 states prohibit cyberbullying (8,9)

– The Safe Schools Improvement Act will require all schools to prohibit bullying in order to acquire funding

– Dignity For All Students Act (NYS) (10)

Cyberbullying

6

Cyberbullying

While there are numerous forms of cyberbullying, Chisolm created a list of the top 11

– Catfishing

– Use of MMOGs (Massively Multiplayer Online Games; typical of males)

– Material or messages dehumanizing, attacking, or threatening the target

– Flaming (hostile & insulting)

– Impersonating

– Slamming (by-standers are crucial)

– Ratting (utilization of victim's hardware)

– Relational aggression (mean girl & by-standers)

– Sexting (large repercussions)

– Shock Trolling

– Online Stalking

7

Aside from the chart below, cyberbullies are primarily teenaged and female (9)

Gender aside, bullies typically have low affective & cognitive empathy (13)

Bystanders play a major role (DASA)

Bullying Traits

8

This work applies to many different social networks

In 2010 the Department of Homeland Security proposed classifications for social networks [17]

Twitter is Unique:

– Allows for non-accepted Follower Relationships

– 140 Character Limit

– Easy Access API

– * The Twitter Rules *

Dataset Selection

9

Starting with a series of political hashtags collected as part of a previous research project, researchers at SUNY Polytechnic collected 9 million+ tweets from the trickler API.

Dataset Selection

This dataset is available upon request in full or summarized form, under a data sharing agreement. A complete summation of the dataset is also available in report form.

10

Build a process that can be highly parallelized using MapReduce

Reduce the dataset through various easy-to-compute mechanisms

Eliminate as much “noise” as possible

High-confidence selection of bullying messages

Analysis Goals
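
The MapReduce goal above can be illustrated with a small single-machine sketch. It is not the authors' pipeline: the record layout (a dict with a `user` key) and the names `map_partition`, `reduce_counts`, and `count_messages` are assumptions made for illustration. Each partition is mapped independently to per-account message counts, and the partial counts are then merged, which is the property that allows the work to be spread across many workers.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: count messages per account within one partition."""
    return Counter(record["user"] for record in records)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts (associative, so merge order is free)."""
    return left + right


def count_messages(partitions):
    """Run the map step on every partition, then fold the partial results."""
    return reduce(reduce_counts, map(map_partition, partitions), Counter())


# Hypothetical data: two partitions of tweet records.
partitions = [
    [{"user": "a", "text": "x"}, {"user": "b", "text": "y"}],
    [{"user": "a", "text": "z"}],
]
totals = count_messages(partitions)
```

Because the reduce step only adds counters, partitions can be processed in any order and on separate machines before the final merge.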

11

Removal of Bots/SPAM/etc.

[Plot: Entropy over Time]

– Previous work showed messages scoring under 4.9 to be bots/SPAM 99% of the time

– 325,396 messages removed from 189,263 accounts.

Analysis [Entropy]
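
A minimal sketch of an entropy-based filter follows. The slides do not state how entropy was computed or in what unit, so this example assumes character-level Shannon entropy in bits per character. The 4.9 cutoff is taken from the slide and may not transfer to this particular definition. The function names are hypothetical.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 4.9  # cutoff quoted on the slide; its exact definition is not given


def shannon_entropy(text):
    """Character-level Shannon entropy in bits (an assumed definition)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_probable_bot(text, threshold=ENTROPY_THRESHOLD):
    """Flag a message whose entropy falls below the threshold."""
    return shannon_entropy(text) < threshold
```

Messages flagged by `is_probable_bot` would be dropped before the polarity analysis. The threshold is a parameter so it can be re-tuned if a different entropy definition is used.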

12

Analyzed 2 groups (Male, Female)

– Assumption: Males are more negatively polarized

• A. Sifferlin stated that men share more negative emotion online than women [21]

• Our Findings:

– All Messages, Male Negative: 17.048%

– All Messages, Female Negative: 14.742%

– Difference: 2.306%

[Charts: Male, Female]

Analysis [Polarity]
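
The polarity figures are shares of each group's messages classified as negative, and the difference is in percentage points. A minimal sketch of that arithmetic, assuming each message already carries a negative or non-negative flag (the slides do not say how the flag or the gender label was assigned, and `negative_share` is a hypothetical helper):

```python
def negative_share(flags):
    """Percentage of messages flagged negative; flags is a list of booleans."""
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Percentages reported on the slide.
male_pct = 17.048
female_pct = 14.742

# Difference in percentage points between the two groups.
gap = round(male_pct - female_pct, 3)
```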

13

Next we analyzed negative messages directed at another user

– Twitter denotes directed messages as @username

– We found 725,572 messages were directed

– Within this data we found a much smaller gender difference (0.55%)

Analysis Cont.
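
Selecting directed messages can be sketched with a regular expression. The slides do not say whether "directed" means any @username mention or only a leading one. This example assumes any mention counts, and assumes usernames are 1 to 15 letters, digits, or underscores. The names `MENTION`, `directed_targets`, and `is_directed` are hypothetical.

```python
import re

# An "@" that is not preceded by a username character (which skips e-mail
# addresses), followed by an assumed username of 1-15 word characters.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})")


def directed_targets(text):
    """Return the lower-cased usernames a message is directed at."""
    return [name.lower() for name in MENTION.findall(text)]


def is_directed(text):
    """True when the message mentions at least one user."""
    return bool(directed_targets(text))
```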

14

We have not built the classifiers (discussed in future work)

– Instead we set an arbitrary threshold for selection of potential bully accounts

• Manual analysis showed that more than 4 negative messages directed at a particular user was a good threshold choice

• 1,035 individuals fell into the 4 or more category

• Sample messages from 1 such account are shown:

Initial Results
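
The threshold selection can be sketched as a count over (sender, target) pairs. The input format, a sequence of `(sender, target, is_negative)` tuples, and the function name are assumptions. The slide says both "more than 4" and "4 or more", so the cutoff is left as a parameter, defaulting here to 4 or more.

```python
from collections import Counter


def potential_bullies(messages, min_count=4):
    """Return senders with at least min_count negative messages at one target.

    messages: iterable of (sender, target, is_negative) tuples (assumed format).
    """
    pair_counts = Counter(
        (sender, target)
        for sender, target, is_negative in messages
        if is_negative
    )
    return {sender for (sender, _), n in pair_counts.items() if n >= min_count}
```

Passing `min_count=5` gives the stricter "more than 4" reading.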

15

Goal: Create a variable “real-time” threshold by training two probability-based machine learning classifiers:

– 1.) Assess negativity and “cruelty” of messages along with demographics of an individual compared to users of the same demographics

– 2.) Compare the overall negativity and “cruelty” of an individual, to the amount of negativity and “cruelty” shown to a specific user

Future Work / Next Steps
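
As a purely illustrative sketch of the kind of input the second classifier might use, the example below computes an individual's overall negativity rate and their negativity rate toward one specific user. The classifiers have not been built, so the feature definition, the `(sender, target, is_negative)` tuple format, and the function name are all assumptions, and "cruelty" is not modelled.

```python
def negativity_rates(messages, sender, target):
    """Return (overall_rate, target_rate) for one sender.

    messages: list of (sender, target, is_negative) tuples (assumed format).
    """
    own = [(t, neg) for s, t, neg in messages if s == sender]
    to_target = [neg for t, neg in own if t == target]
    overall = sum(neg for _, neg in own) / len(own) if own else 0.0
    toward = sum(to_target) / len(to_target) if to_target else 0.0
    return overall, toward
```

A large gap between the two rates would suggest negativity concentrated on one user rather than a generally negative account.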

16

Despite our arbitrary threshold, our method located bullying within the dataset

This method shows potential as a tool to combat cyberbullying

We aim to enhance this capability

Planned future work will also create more “real-time” recognition of cyberbullying

Future work with Rutgers University under an NSF grant is underway

Conclusion

17

Contact:

Holly M. White

[email protected]

Citations / Contact
