m&e sharenet 2014_kolkata_connected customer audience analytics_twitter data analytics.doc
TRANSCRIPT
-
8/10/2019 M&E ShareNet 2014_Kolkata_Connected Customer Audience analytics_Twitter Data Analytics.doc
Data Analytics using Twitter Social Media
Author(s): Saikat Chatterjee
Document: Techathon Solution Overview Template
Owner: IBM Status: Draft
Contents
1 High Level Overview
1.1 Introduction
1.2 Solution Overview
2 Detailed Description
2.1 Architecture Overview
2.2 Macro design
3 Environment Needs
1. High Level Overview
1.1 Introduction
Twitter is a massive social networking site tuned towards fast communication. More than 140 million active users publish over 400 million 140-character Tweets every day. Twitter's speed and ease of publication have made it an important communication medium for people from all walks of life. Twitter has played a prominent role in socio-political events, such as the Arab Spring and the Occupy Wall Street movement. Twitter has also been used to post damage reports and disaster preparedness information during large natural disasters, such as Hurricane Sandy.
This document provides a very high level overview of the proposed solution and the software/hardware requirements necessary for building it.
1.2 Solution Overview
This application showcases some of the data analytics that can be achieved using the Twitter REST-based API:
Collecting, storing, and analyzing Twitter data
Storing this data in a tangible way for use in real-time applications
Focusing on common measures and algorithms that are used to analyze social media data
Visual analytics, an approach which helps humans inspect the data through intuitive visualizations
2. Detailed Description
Collecting, storing, and analyzing Twitter data
Users on Twitter generate over 400 million Tweets every day. Some of these Tweets are available to researchers and practitioners through public APIs at no cost. This section describes how to extract the following types of information from Twitter:
Information about a user,
A user's network consisting of his connections,
Tweets published by a user, and
Search results on Twitter.
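As a rough illustration, each of the four kinds of information listed above could be requested from its own REST endpoint. The sketch below only builds request URLs. It assumes the Twitter REST API v1.1 endpoint paths (users/show, friends/ids, statuses/user_timeline, search/tweets) that were current when this document was written, and it leaves out the OAuth signing that a real request needs. The helper name build_url, the dictionary keys, and the sample screen name are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the Twitter REST API v1.1 (since superseded).
BASE = "https://api.twitter.com/1.1"

# One assumed v1.1 endpoint per kind of information listed above.
ENDPOINTS = {
    "profile": "users/show.json",             # information about a user
    "network": "friends/ids.json",            # a user's connections
    "tweets": "statuses/user_timeline.json",  # Tweets published by a user
    "search": "search/tweets.json",           # search results
}


def build_url(kind, **params):
    """Return the GET URL for one kind of request (hypothetical helper)."""
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{BASE}/{ENDPOINTS[kind]}?{query}"


profile_url = build_url("profile", screen_name="example_user")
search_url = build_url("search", q="big data", count=10)
print(profile_url)
print(search_url)
```

Keeping the endpoint table separate from the URL builder means a change of API version would only touch the table.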
APIs to access Twitter data can be classified into two types based on their design and access method:
REST APIs are based on the REST architecture, now popularly used for designing web APIs. These APIs use the pull strategy for data retrieval. To collect information a user must explicitly request it.
Streaming APIs provide a continuous stream of public information from Twitter. These APIs use the push strategy for data retrieval. Once a request for information is made, the Streaming APIs provide a continuous stream of updates with no further input from the user. They have different capabilities and limitations with respect to what and how much information can be retrieved. The Streaming API has three types of endpoints:
Public streams: These are streams containing the public Tweets on Twitter.
User streams: These are single-user streams, with access to all the Tweets of a user.
Site streams: These are multi-user streams, intended for applications which access Tweets from multiple users.
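The difference between the pull and push strategies can be sketched without touching the network. The example below is only an illustration: FakeTweetSource is a hypothetical in-memory stand-in, not a Twitter client, and its method names are invented for this sketch. A pull caller gets data only when it asks, while a push subscriber registers once and then receives every later update.

```python
class FakeTweetSource:
    """Hypothetical in-memory stand-in for Twitter (not a real client)."""

    def __init__(self):
        self._tweets = []
        self._subscribers = []

    def publish(self, text):
        """Record a new Tweet and push it to every subscriber."""
        self._tweets.append(text)
        for callback in self._subscribers:
            callback(text)  # push: delivered without a new request

    def fetch_since(self, index):
        """Pull: answer one explicit request with a snapshot."""
        return self._tweets[index:]

    def subscribe(self, callback):
        """Push: register once, then receive all later Tweets."""
        self._subscribers.append(callback)


source = FakeTweetSource()
pushed = []
source.subscribe(pushed.append)  # one request, continuous updates

source.publish("first")
source.publish("second")
pulled = source.fetch_since(0)   # explicit request for what exists now
source.publish("third")          # arrives after the pull request

print(pulled)  # the pull snapshot does not contain "third"
print(pushed)  # the push subscriber saw every Tweet
```

To see "third" with the pull strategy, the caller would have to issue another fetch_since request.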
Storing Twitter Data
There has been an explosion in the size of data generated on social media. This data explosion calls for a new data storage paradigm. At the forefront of this movement is NoSQL, which promises to store big data in a more accessible way than the traditional, relational model. There are several NoSQL implementations. In this solution, we choose MongoDB.
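A minimal sketch of the storage step follows, under stated assumptions. It assumes Tweets arrive as JSON objects with the id, text, created_at and user.screen_name fields, and that the collection object offers a replace_one(filter, replacement, upsert=True) method like the one on a pymongo collection. The to_document and store helpers and the InMemoryCollection stand-in are hypothetical and exist only so the example runs without a database server.

```python
def to_document(tweet):
    """Keep only the fields this sketch stores (an assumed selection)."""
    return {
        "_id": tweet["id"],  # reuse the Tweet id as the document key
        "text": tweet["text"],
        "created_at": tweet["created_at"],
        "user": tweet["user"]["screen_name"],
    }


def store(collection, tweet):
    """Upsert one Tweet so that repeated downloads do not duplicate it."""
    doc = to_document(tweet)
    collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
    return doc


class InMemoryCollection:
    """Hypothetical stand-in that mimics only replace_one."""

    def __init__(self):
        self.docs = {}

    def replace_one(self, filter, replacement, upsert=False):
        key = filter["_id"]
        if key in self.docs or upsert:
            self.docs[key] = replacement


tweet = {
    "id": 1,
    "text": "hello",
    "created_at": "Mon Jan 06 10:00:00 +0000 2014",
    "user": {"screen_name": "example_user"},
}
collection = InMemoryCollection()
store(collection, tweet)
store(collection, tweet)  # second call overwrites the same document
print(len(collection.docs))
```

Keying the document on the Tweet id is a design choice of this sketch: it makes the write idempotent, which helps when the same Tweet is returned by more than one request.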
Analyzing Twitter Data
Many of the questions that we ask of our Twitter data can be answered through network analysis. Questions such as "who is important?", "who talks to whom?", and "what is important?" can all be answered through a network. Using proper network measures, we can find these important actors or topics in a network.
Centrality - Who is important?
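The text does not say which centrality measure is meant, so the sketch below picks one common option, in-degree centrality, purely as an assumption. It treats each pair (source, target) as "source retweeted or mentioned target", assumes the pairs are distinct, and scores a user by the share of other users who point at them. The function name and the sample user names are hypothetical.

```python
from collections import Counter


def in_degree_centrality(edges):
    """Score each node by incoming edges divided by (number of nodes - 1)."""
    nodes = {node for edge in edges for node in edge}
    incoming = Counter(target for _, target in edges)
    denominator = max(len(nodes) - 1, 1)  # avoid dividing by zero
    return {node: incoming[node] / denominator for node in nodes}


# Hypothetical interaction network: (who, whom).
edges = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]
scores = in_degree_centrality(edges)
most_important = max(scores, key=scores.get)
print(most_important, scores[most_important])
```

Other measures, such as betweenness or eigenvector centrality, would answer the same question with a different notion of importance.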
2.2 Macro design
3. Environment Needs
IDE: NetBeans #
Language and Tools: JDK ' #, % ' *, Apache Ant latest version, Git ' . / 0
Web Server: Apache Tomcat .
Development OS: Windows # (Internet must be accessible as our application calls Twitter's REST API), Administrator access level
Deployment OS: Linux (Internet must be accessible as our application calls Twitter's REST API), Administrator access level
Database: MongoDB latest version