Twitter sentiment analysis


Uploaded by: abhishek-m-shivalingaiah

Posted on 12-Apr-2017


Category:

Data & Analytics



Page 1: Twitter sentiment analysis

The University of Texas at Dallas utdallas.edu

Airline Twitter Analysis

Page 2: Twitter sentiment analysis

What did we want to do?

• Kaggle: Twitter Airline Sentiment dataset

• Exploratory Analysis
  i. When do people tweet?
  ii. Which airlines get the most tweets?
  iii. Which sentiments are dominant?
  iv. How are these sentiments distributed?

• Text Analytics
  i. Most frequently used words
  ii. Most frequently used words when the sentiment is negative
  iii. Most frequently used words when the sentiment is positive
  iv. Tweet length vs. sentiment

Page 3: Twitter sentiment analysis

Cleansing of data

• Every tweet began with “@airline name”

• 4 columns with hardly any data

• Null and missing values

• Coordinates required: geocoding
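
The cleansing steps listed on this slide can be sketched in Python as follows. This is a minimal illustration, not the code used for the project: the field names (`text`, `tweet_coord`) and the 5% fill threshold for calling a column "hardly any data" are assumptions, and rows are plain dictionaries rather than any particular data-frame type.

```python
import re

# One or more "@handle" mentions at the very start of a tweet, e.g. "@united ".
LEADING_MENTION = re.compile(r"^(?:@\w+\s*)+")

def strip_leading_mention(text):
    """Remove the airline handle(s) that open the tweet."""
    return LEADING_MENTION.sub("", text).strip()

def sparse_columns(rows, min_fill=0.05):
    """Return the columns whose share of non-empty values is below min_fill."""
    sparse = []
    for col in rows[0].keys():
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        if filled / len(rows) < min_fill:
            sparse.append(col)
    return sparse

def clean(rows, min_fill=0.05):
    """Drop nearly empty columns and strip the leading mention from each tweet."""
    drop = set(sparse_columns(rows, min_fill))
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in drop}
        kept["text"] = strip_leading_mention(kept.get("text") or "")
        cleaned.append(kept)
    return cleaned
```

Calling `clean(rows)` on a list of row dictionaries returns new dictionaries and leaves the input untouched, so the raw data stays available for comparison.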

Page 4: Twitter sentiment analysis

When do people tweet?

• Most tweets arrive during the morning rush hours, peaking at 9 am.


[Figure: Number of tweets every hour. x-axis: Hour (0 to 25); y-axis: Number of Tweets (0 to 1200).]
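
An hourly count like the one plotted here can be computed with a few lines of Python. This is a sketch under assumptions: the timestamp format string is hypothetical (the slides do not show the raw timestamps), and time-zone offsets are assumed to have been handled beforehand.

```python
from collections import Counter
from datetime import datetime

def tweets_per_hour(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets for each hour of the day (0 to 23)."""
    counts = Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

def peak_hour(hourly_counts):
    """Return the hour with the most tweets (the earliest hour wins ties)."""
    return max(range(24), key=lambda hour: hourly_counts[hour])
```

`tweets_per_hour` returns a 24-element list, so it can be passed directly to any plotting routine as the bar heights.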

Page 5: Twitter sentiment analysis


How are the tweets & sentiments distributed?

• United Airlines, American and US Airways receive most of the tweets.

• Most of the tweets are negative as expected.

• 63% of the tweets are negative.
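
A per-airline breakdown of this kind can be tabulated as sketched below. The input is assumed to be (airline, sentiment) pairs with the lowercase labels `negative`, `neutral`, and `positive`; those labels are an assumption about the data, not something the slides state.

```python
from collections import Counter, defaultdict

def sentiment_by_airline(records):
    """Map each airline to a Counter of its sentiment labels."""
    table = defaultdict(Counter)
    for airline, sentiment in records:
        table[airline][sentiment] += 1
    return table

def negative_share(table):
    """Fraction of each airline's tweets labelled 'negative'."""
    return {
        airline: counts["negative"] / sum(counts.values())
        for airline, counts in table.items()
    }
```

Summing each airline's counter gives its total tweet volume, so the same table answers both "who gets the most tweets" and "who gets the most negative tweets".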

Page 6: Twitter sentiment analysis

Distribution of sentiments for all the airlines

Sentiment   Frequency
Positive    0.1706621
Neutral     0.2295947
Negative    0.5997432

• The three airlines with the most tweets are also the ones with the most negative tweets. Why?

Page 7: Twitter sentiment analysis

Why so many negative tweets?

Page 8: Twitter sentiment analysis

Word clouds to show frequency of words used in negative tweets

[Word clouds of negative tweets for US Airways, United Airlines, and American Airlines]
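
The word frequencies that feed a word cloud can be computed as in the sketch below. The stop-word list is a tiny illustrative placeholder and the tokenization rule is an assumption; the slides do not describe the preprocessing that was actually applied.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "and", "is", "was", "my", "i",
              "for", "on", "of", "in", "it"}

def tokenize(text):
    """Lowercase the text, drop @mentions, and keep non-stop-word tokens."""
    text = re.sub(r"@\w+", " ", text.lower())
    return [w for w in re.findall(r"[a-z']+", text) if w not in STOP_WORDS]

def top_words(tweets, sentiment, n=10):
    """Most frequent words among tweets carrying the given sentiment label."""
    counts = Counter()
    for text, label in tweets:
        if label == sentiment:
            counts.update(tokenize(text))
    return counts.most_common(n)
```

Passing `"positive"` instead of `"negative"` gives the counts for the positive word cloud, and filtering the input by airline first gives one list per airline.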

Page 9: Twitter sentiment analysis


An outlier in the case of Delta Airlines

Page 10: Twitter sentiment analysis

Word cloud for all the positive tweets

Page 11: Twitter sentiment analysis

From which time zones are people tweeting?

• Flights travel all over the world.

• However, we observed that most of the tweets originate from the Eastern Time zone (US & Canada).

Page 12: Twitter sentiment analysis

Association Analysis

• Association analysis on the words used in the tweets.
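
The slides do not say which association measure was used. One common choice is the correlation between the presence of two words across tweets, sketched below. Each tweet is assumed to be already tokenized into a set of words, and the `min_corr` threshold of 0.5 is an arbitrary illustrative value.

```python
import math

def presence_vector(word, docs):
    """1 if the word occurs in the tweet, else 0, for every tweet."""
    return [1 if word in doc else 0 for doc in docs]

def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a word in all or none of the tweets carries no signal
    return cov / math.sqrt(var_x * var_y)

def associations(target, docs, min_corr=0.5):
    """Words whose presence correlates with the target at or above min_corr."""
    vocab = set().union(*docs) - {target}
    target_vec = presence_vector(target, docs)
    scores = {w: correlation(target_vec, presence_vector(w, docs)) for w in vocab}
    return sorted(
        ((w, s) for w, s in scores.items() if s >= min_corr),
        key=lambda pair: (-pair[1], pair[0]),
    )
```

The result is sorted from the strongest association down, with ties broken alphabetically so the output is reproducible.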

Page 13: Twitter sentiment analysis

Hierarchical clustering to determine association between words
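
As a rough illustration of how words can be grouped hierarchically, the sketch below merges words bottom-up. The distance (Jaccard distance between the sets of tweets each word appears in) and the linkage rule (single linkage) are assumptions chosen for brevity; the slides do not state which distance or linkage produced the original dendrogram.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 minus the overlap between two sets of tweet indices."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def word_occurrences(docs):
    """Map each word to the set of tweet indices it appears in."""
    occ = {}
    for i, doc in enumerate(docs):
        for word in doc:
            occ.setdefault(word, set()).add(i)
    return occ

def agglomerate(docs, n_clusters):
    """Single-linkage agglomerative clustering of words into n_clusters groups."""
    occ = word_occurrences(docs)
    clusters = [frozenset([w]) for w in sorted(occ)]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(jaccard_distance(occ[a], occ[b]) for a in c1 for b in c2)

    while len(clusters) > max(n_clusters, 1):
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters
```

Stopping at `n_clusters` corresponds to cutting a dendrogram at a fixed number of branches; recording each merge instead would give the full tree.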

Page 14: Twitter sentiment analysis

Cont’d

Page 15: Twitter sentiment analysis

K-means clustering
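
For readers unfamiliar with the technique, a bare-bones version of the standard k-means (Lloyd's) iteration is sketched below. The slides do not say what was clustered or how many clusters were used; the sketch assumes each item has already been turned into a numeric vector (for instance word counts), and it seeds the centroids with the first k points purely to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, k, max_iter=100):
    """Assign each point to one of k clusters; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_point(members)
    return labels, centroids
```

In practice the seeds are usually chosen at random and the algorithm is run several times, since the result depends on the starting centroids.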

Page 16: Twitter sentiment analysis

Cont’d

Page 17: Twitter sentiment analysis

Association between Tweet length and sentiment

• We observed that the longer a tweet is, the more likely it is to be negative in sentiment.
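
One simple way to check a relationship like this is to compare the average tweet length per sentiment label, as sketched below. Measuring length in characters is an assumption; the slides do not say whether characters or words were counted.

```python
from collections import defaultdict
from statistics import mean

def mean_length_by_sentiment(tweets):
    """Average character length of tweets for each sentiment label."""
    lengths = defaultdict(list)
    for text, sentiment in tweets:
        lengths[sentiment].append(len(text))
    return {sentiment: mean(vals) for sentiment, vals in lengths.items()}
```

A higher mean for the negative label than for the others would be consistent with the observation on this slide, though a mean alone does not show how much the length distributions overlap.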

Page 18: Twitter sentiment analysis

Cont’d

Page 19: Twitter sentiment analysis

What else did we try?

• A predictive model

• Setbacks we faced during the process

• Work on SPSS

• Categorization

Page 20: Twitter sentiment analysis

Why this Analysis? Will it help in some way?

• The airline industry lives on its customers.

• We get to know where we are doing well and where we are doing poorly.

• The association between tweet length and sentiment can serve as a basis for a predictive model.

• Companies can get to know their competition.

• Improve the flight journey overall.

Page 21: Twitter sentiment analysis

References

• Wikipedia.com

• Kaggle.com

• www.clarabridge.com/text-analytics/

• https://sites.google.com/site/manabusakamoto/home/r.../r-tutorial-3
