Collecting and Coding Twitter Data in DiscoverText
DESCRIPTION
These are the slides from a workshop I presented on September 23, 2014 to the University of Wisconsin-Madison Digital Humanities Research Network (http://dhresearchnetwork.wordpress.com/). The workshop gave an overview of my research using DiscoverText, the steps for collecting data in the cloud-based big data analytics software DiscoverText (https://discovertext.com/), and how to code data, along with limitations, challenges, and other resources for social media data collection and analysis.
TRANSCRIPT
Jill Hopke @jillhopke
Digital Humanities Research Network
UW-Madison September 23, 2014
Collecting and Coding Twitter Data in DiscoverText
Workshop Overview
My Research on Global Frackdown
Steps to collect Twitter data in DiscoverText
Coding data in DiscoverText
Limitations and Challenges
Other Tools/Resources
Theory-Driven Research is Key!
Collective Action
The Changing Nature of Activism
Collective Action
Connective Action
The Changing Nature of Activism
Collective Action
Connective Action
Collective Action Frames
Personal Action Frames
Transnational Anti-Fracking Activism
Distribution of Global Frackdown 2013 Events
Source: Global Frackdown. (n.d.). Events.
RQ1: What Twitter strategies do Global Frackdown activists use to mobilize for the October 19, 2013 day of action?
RQ2: How do Global Frackdown tweeters frame protest against hydraulic fracturing?
Research Questions
• Dataset of 9,449 tweets for the hashtag #globalfrackdown.
• Data collected from October 13 to October 27, 2013 using DiscoverText.
• Textual analysis of English (n=7,678) and Spanish (n=1,314) tweets.
• Unit of analysis is the individual tweet.
• Also conducted in-depth interviews with transnational activists.
Project Data
Tweeting the #GlobalFrackdown
[Figure: Tweet Frequency (October 13-27, 2013). Daily tweet counts from 10/13/13 to 10/27/13 on a 0 to 6,000 scale. Series: English, Spanish, Total.]
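A daily frequency chart like this one can be rebuilt from any exported list of tweets. The sketch below is only an illustration in Python: the sample rows and the field names `created_at` and `language` are assumptions made for the example, not DiscoverText's export schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows standing in for an exported archive. The field names
# "created_at" and "language" are assumptions, not DiscoverText's schema.
tweets = [
    {"created_at": "2013-10-19 14:02:11", "language": "English"},
    {"created_at": "2013-10-19 16:45:30", "language": "Spanish"},
    {"created_at": "2013-10-20 09:12:05", "language": "English"},
]


def daily_counts(rows):
    """Count tweets per (date, language) pair, plus a per-day "Total"."""
    counts = Counter()
    for row in rows:
        day = datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M:%S").date()
        counts[(day, row["language"])] += 1
        counts[(day, "Total")] += 1
    return counts


counts = daily_counts(tweets)
for (day, language), n in sorted(counts.items()):
    print(day.isoformat(), language, n)
```

Each (date, series) pair maps to one point on the chart, so the same counts could feed a spreadsheet or a plotting library.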
[Figure: Tweet Language (pie chart). Slice values: 79%, 14%, 3%, 2%, 1%, 1%, 0%. Legend: English, Spanish, French, Catalan, Basque, German, Other.]
[Figure: Proportion of Tweets by Device Source. Bars for English and Spanish on a 0 to 100 scale. Series: Mobile, Desktop, Application.]
[Figure: Proportion of Tweets with Photos (two pie charts). Spanish: 9%, 91%. English: 21%, 79%. Legend: Photo, No Photo.]
[Figure: Tweet Content Type. Bar chart of % of Tweets on a 0 to 60 scale. Series: English, Spanish.]
Most Frequently Used Hashtags
English                 Spanish
Fracking                Fracking
Elsipogtog              FrackingNo
Banfracking             StopFracking
IdleNoMore              FrackingEZ
PowerShift              BanFracking
ElsipogtogSolidarity    19O
BanFrackingNow          SiSePuede19O
Mikmaqblockade          Castelló
Cdnpoli                 GlobalFrackdo
NYC                     Chervon
Excluding #GlobalFrackdown
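A ranking like the one above can be approximated by counting hashtags in the tweet text and leaving out the search term itself. This is a minimal sketch in Python, not the procedure used for the slides: the sample tweets are invented, and `top_hashtags` is a hypothetical helper name.

```python
import re
from collections import Counter

# "#" followed by word characters. In Python 3, \w also matches accented
# letters, so a tag such as #Castelló is captured whole.
HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(texts, exclude=("globalfrackdown",), n=10):
    """Return the n most common hashtags, case-folded, minus excluded tags."""
    excluded = {tag.lower() for tag in exclude}
    counts = Counter()
    for text in texts:
        for tag in HASHTAG.findall(text):
            if tag.lower() not in excluded:
                counts[tag.lower()] += 1
    return counts.most_common(n)


# Invented sample tweets, for illustration only.
sample = [
    "Join us Oct 19! #GlobalFrackdown #BanFracking",
    "#globalfrackdown #Fracking #banfracking",
    "No al fracking #FrackingNo #GlobalFrackdown",
]
print(top_hashtags(sample))
```

Lowercasing before counting treats #BanFracking and #banfracking as one hashtag. Whether that is appropriate depends on the research question.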
Collecting Data in DiscoverText
DiscoverText Dashboard – Login and Try Out!
Start a New Project
Name Your Project
Importing Data
Twitter Data Types (API, GNIP, Historical)
Name Your Data Feed (Archive)
Enter Your Search Term
Schedule Feed Data Collection
Archive Management
List of Tweets
Viewing Tweets
Meta Data
Keeping Track of Feed Schedules
Coding in DiscoverText
(Part of) What I Did (Theory-Driven)
• First round: code for language.
• Second round: read a subsection of the data and develop a set of “working themes.”
• Code for themes. Memo/annotate interesting examples.
• Refine the codebook (themes) and continue coding.
• Intercoder reliability (you might want to do this; it depends on your methodological approach).
• I also used the machine-learning functions for a separate chapter to “classify” data for valence and certainty.
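For the intercoder reliability step, one common statistic for two coders is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The slides do not say which measure was used, so the Python sketch below is only an illustration, and the theme labels in it are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: for each code, the product of the two coders'
    # marginal proportions, summed over codes.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # Both coders used a single identical code throughout.
    return (observed - expected) / (1 - expected)


# Invented theme codes for six tweets, for illustration only.
coder_a = ["protest", "health", "protest", "water", "health", "protest"]
coder_b = ["protest", "health", "water", "water", "health", "protest"]
print(round(cohens_kappa(coder_a, coder_b), 2))  # 0.75
```

Here the coders agree on five of six tweets (0.83), chance agreement is 0.33, and kappa is (0.83 - 0.33) / (1 - 0.33) = 0.75. Designs with more than two coders usually call for a different statistic, such as Krippendorff's alpha.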
Coding Tweets
DiscoverText Coding Example
Limitations and Challenges
PRO: Doesn’t require programming knowledge. User-friendly interface. Powerful tool.
CON: The software’s advanced machine-learning functions are expensive! DiscoverText is one of the “affordable” platforms. Also, human subjects research/IRB considerations.
= Need for collaborations and grant funding.
Other Tools/Resources
Other Tools and Resources
• “Social Media Data Collection Tools” (see here): Running list of tools curated by Deen Freelon, Ph.D., http://dfreelon.org, @dfreelon.
• Digital Methods Initiative at University of Amsterdam (see here).
• Digital Methods (2013) by Richard Rogers (see here).
• Join the Association of Internet Researchers AIR-L mailing list (see here)!