ELIS – Multimedia Lab Towards Twitter Hashtag Recommendation Using Distributed Word Representations and a Deep Feed Forward Neural Network CSSC-2014 New Delhi, 24 September 2014 Abhineshwar Tomar, Frederic Godin, Baptist Vandersmissen, Wesley De Neve, Rik Van de Walle Multimedia Lab, Ghent University – iMinds, Belgium Image and Video Systems Lab, KAIST, South Korea

Upload: wesley-de-neve

Post on 18-Dec-2014



TRANSCRIPT

Page 1: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

ELIS – Multimedia Lab

Towards Twitter Hashtag Recommendation Using Distributed Word Representations and a Deep Feed Forward Neural Network

CSSC-2014
New Delhi, 24 September 2014

Abhineshwar Tomar, Frederic Godin, Baptist Vandersmissen, Wesley De Neve, Rik Van de Walle

Multimedia Lab, Ghent University – iMinds, Belgium
Image and Video Systems Lab, KAIST, South Korea

Page 2: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Overview

• Introduction
• Goal
• Motivation
• Methodology
• Results
• Conclusion
• Future work

Page 3: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 4: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Twitter

• An online social network service that enables users to send and read short 140-character text messages, called "tweets" or "microposts"

Labels on the example tweet (figure):
- Tweet or micropost
- Retweet (sharing)
- Favorite (like or bookmark)
- Mention (starts with @)
- Hashtag (starts with #)

Page 5: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Famous Tweets

Note the presence of both textual and (embedded) visual information!

Page 6: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Twitter Statistics

• Usage in general
- 271 million monthly active users
- 500 million tweets are sent per day

• Hashtags
- Only 8% of the tweets contain hashtags
- 3% of the hashtags are used more than 5 times

Page 7: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Hashtags on Twitter

Hashtag usage:
- topic-based indexing & search
  • #socialnetwork
  • #Reddit
- conversational/event clustering
  • #www2014

Observation: only 8% of tweets contain a hashtag

Page 8: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 9: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Goal

Generate hashtags that adhere to the semantic and linguistic regularities of a tweet

Page 10: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 11: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Why

• Hashtags
- Content categorization and discovery
- Effective search of tweets

• Our approach
- Connect similar hashtags (topics)
- Promote the use of hashtags

• By understanding the semantics of the tweet

Page 12: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 13: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Methodology (1/3)

• Preprocessing
- Remove non-English words
- Remove non-ASCII characters
- Remove mentions (@USER)
- Remove URLs
- Remove RT @ from retweets

• Feature vector generation

• Training of a feed forward neural network

• Evaluation
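The preprocessing steps listed on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the regular expressions, the function name `preprocess`, and the optional `english_vocab` lookup for filtering non-English words are all assumptions, since the slide does not say how each step was implemented. URLs are stripped first here so that later patterns do not mangle them; the slide does not prescribe an order.

```python
import re

# Hypothetical patterns; the slides only name the steps, not the rules.
URL_RE = re.compile(r"https?://\S+")
RETWEET_RE = re.compile(r"\bRT\s+@\w+:?")   # "RT @user:" prefix of a retweet
MENTION_RE = re.compile(r"@\w+")            # any remaining @USER mention
PUNCTUATION = ".,!?:;\"'"


def preprocess(tweet, english_vocab=None):
    """Clean a raw tweet following the steps named on the slide.

    english_vocab is an assumed, optional set of lowercase English words;
    when given, tokens outside it are dropped.
    """
    text = URL_RE.sub(" ", tweet)            # remove URLs
    text = RETWEET_RE.sub(" ", text)         # remove "RT @" from retweets
    text = MENTION_RE.sub(" ", text)         # remove mentions
    # Remove non-ASCII characters (emoji, accented letters, ...).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    tokens = text.split()
    if english_vocab is not None:            # remove non-English words
        tokens = [t for t in tokens
                  if t.lower().strip(PUNCTUATION) in english_vocab]
    return " ".join(tokens)


cleaned = preprocess("RT @bob: I love this \u263a http://t.co/abc @alice")
print(cleaned)
```

A vocabulary lookup is only one way to detect non-English words; a language identifier applied to the whole tweet would be another option.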

Page 14: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Methodology (2/3)

• Training: learning the relation between tweets and hashtags

Training pair (figure):
- Tweet → word2vec → 300-D tweet vector
- Hashtag → word2vec → 300-D hashtag vector

Deep feed-forward neural network:
- 300-D input layer
- 1000-D hidden layer
- 500-D hidden layer
- 400-D hidden layer
- 300-D output layer

Example:
- Tweet: Elizabeth Warren Taking on Hillary as New Democratic Powerhouse
- Hashtag: #politics
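The layer sizes on this slide can be made concrete with a small, dependency-free forward pass. This is a sketch under stated assumptions: the slide does not say how word vectors are combined into a tweet vector (averaging is assumed here), which activation function was used (tanh is assumed), or how weights were initialised. The names `init_params`, `forward`, and `tweet_vector` are illustrative only, and a practical implementation would use a numerical library and a training loop rather than plain lists.

```python
import math
import random

# Layer widths as listed on the slide: input, three hidden layers, output.
LAYER_SIZES = [300, 1000, 500, 400, 300]


def init_params(seed=0):
    """Randomly initialise one (weights, biases) pair per layer transition."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        scale = 1.0 / math.sqrt(n_in)        # assumed initialisation scale
        # weights[j] holds the n_in incoming weights of output unit j.
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(params, x):
    """Map a 300-D tweet vector to a 300-D predicted hashtag vector."""
    h = x
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        # Assumed: tanh on hidden layers, linear output layer.
        h = z if i == last else [math.tanh(v) for v in z]
    return h


def tweet_vector(tokens, embeddings, dim=300):
    """Assumed aggregation: average the vectors of the known words."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Toy embeddings standing in for real word2vec vectors.
toy_embeddings = {"warren": [1.0] * 300, "hillary": [3.0] * 300}
x = tweet_vector(["warren", "hillary", "unknownword"], toy_embeddings)
y = forward(init_params(), x)
print(len(x), len(y))
```

During training, the output `y` would be compared with the word2vec vector of the tweet's actual hashtag; the loss function is not given on the slide.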

Page 15: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Methodology (3/3)

• Testing: recommending hashtags to tweets

Pipeline (figure):
- Tweet → word2vec → 300-D tweet vector
- 300-D tweet vector → deep feed-forward neural network → 300-D hashtag vector
- 300-D hashtag vector → vec2word → hashtags

Deep feed-forward neural network:
- 300-D input layer
- 1000-D hidden layer
- 500-D hidden layer
- 400-D hidden layer
- 300-D output layer

Example:
- Tweet: House Democrats suggest Obama impeachment is imminent to raise cash
- Hashtags: #politics, #crisis
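The "vec2word" step maps the predicted 300-D vector back to words. The slide does not describe how this lookup works; a common choice, assumed here, is to rank vocabulary words by cosine similarity to the predicted vector and return the top k. The function names and the three-dimensional toy vectors below are purely illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def vec2word(query, embeddings, k=5):
    """Return the k vocabulary entries closest to the query vector."""
    ranked = sorted(embeddings,
                    key=lambda word: cosine(query, embeddings[word]),
                    reverse=True)
    return ranked[:k]


# Toy 3-D vectors standing in for 300-D word2vec vectors.
toy_embeddings = {
    "politics": [0.9, 0.1, 0.0],
    "crisis": [0.7, 0.3, 0.1],
    "sports": [0.0, 0.1, 0.9],
}
predicted = [1.0, 0.2, 0.0]   # pretend output of the network
print(vec2word(predicted, toy_embeddings, k=2))   # ['politics', 'crisis']
```

With a vocabulary of 3 million entries, a linear scan like this would be slow; an approximate nearest-neighbour index would be the usual remedy.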

Page 16: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

word2vec

• Developed by Google Research

• Computes vector representations for words
- Through the use of neural network technology
- The vectors capture the semantic meaning of a word

• Trained on part of the Google News dataset (approximately 100 billion words)

• The model contains vectors for 3 million words and phrases

• Example word vector properties
- vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome')
- vector('king') - vector('man') + vector('woman') ≈ vector('queen')
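The vector arithmetic in the examples above can be illustrated with hand-made toy vectors. The four-dimensional vectors below are constructed so that the analogy holds exactly; they are not real word2vec output, and the helper `analogy` is a hypothetical name. Real embeddings only satisfy such relations approximately, which is why the nearest neighbour is taken.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings):
    """Word nearest to vector(a) - vector(b) + vector(c), inputs excluded."""
    target = [x - y + z for x, y, z in
              zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


# Toy dimensions: [is_capital, France, Italy, Germany].
toy_embeddings = {
    "Paris":   [1.0, 1.0, 0.0, 0.0],
    "France":  [0.0, 1.0, 0.0, 0.0],
    "Italy":   [0.0, 0.0, 1.0, 0.0],
    "Rome":    [1.0, 0.0, 1.0, 0.0],
    "Berlin":  [1.0, 0.0, 0.0, 1.0],
    "Germany": [0.0, 0.0, 0.0, 1.0],
}
print(analogy("Paris", "France", "Italy", toy_embeddings))   # Rome
```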

Page 17: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 18: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Results (1/3)

Tweets and their recommended hashtags:

1. Tweet: Someone dm/text me bc I’m so bored
   Recommended hashtags: madd, Oh noes, rainnwilson, sooooooo, fricken

2. Tweet: The good life is one inspired by love and guided by knowledge.
   Recommended hashtags: Ahh yes, FIVE THINGS About, YANKEES TALK, Kinder gentler, Ya gotta love

3. Tweet: Method of Losing Weight http://t.co/rs64CEuo5W
   Recommended hashtags: Shape Shifting, Treat Acne, Detect Cancer, Warps, Calorie Burn

4. Tweet: I hate today cause its room cleaning day for me!!!
   Recommended hashtags: FAN ’S ATTIC, Puh leez, Mopping robot, % #F######## 3v.jsn, InterestEURO JAP

5. Tweet: SPELLS AND SPELL-CASTING:ENCYCLOPEDIA OF 5000 SPELLS ( JUDIKA ILLES ):BLACKSMITH’S WATER HEALING SPELL: A... http://t.co/k0TfrqJFQW
   Recommended hashtags: DEBUTS NEW, NOW AVAILABLE FOR, TO PUBLISH, DESIGNED TO, IS READY TO

Page 19: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Results (2/3)

Page 20: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network

Results (3/3)

Hit rate for top-k recommendation:

#   Top-k    She et al.   Our approach
1   Top-5    82%          83.33%
2   Top-10   89%          86.67%
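The slide reports hit rates without defining the metric. One common reading, assumed here, is that a tweet counts as a hit when at least one of its top-k recommended hashtags is judged relevant, and the hit rate is the fraction of test tweets that are hits. The sketch below uses made-up recommendations and relevance judgements purely to show the computation; none of the data comes from the presentation.

```python
def hit_rate(recommendations, relevant, k):
    """Fraction of tweets with at least one relevant hashtag in the top k.

    recommendations: one ranked list of hashtags per tweet.
    relevant: one set of acceptable hashtags per tweet (same order).
    """
    if not recommendations:
        return 0.0
    hits = sum(
        1 for recs, rel in zip(recommendations, relevant)
        if any(tag in rel for tag in recs[:k])
    )
    return hits / len(recommendations)


# Hypothetical ranked recommendations for three tweets.
recs = [
    ["politics", "crisis", "news"],
    ["sports", "music", "film"],
    ["food", "travel", "art"],
]
# Hypothetical relevance judgements for the same tweets.
gold = [{"crisis"}, {"film"}, {"tech"}]

for k in (1, 2, 3):
    print(k, round(hit_rate(recs, gold, k), 2))
```

Under this definition the hit rate can only stay equal or increase as k grows, which matches the direction of both columns in the table.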

Page 21: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 22: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Conclusion

• Introduced a novel approach for hashtag recommendation, using distributed word representations and a feed forward neural network

• Learns semantic and linguistic regularities without requiring careful feature engineering

• Can easily take advantage of temporal information

• Supports the automatic creation of new hashtags/trends

Page 23: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Overview

Page 24: Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network


Future Work

• Use of more than four days of data

• Use word representations from different data sources

• Investigate impact of the quality of the word representations created

• Investigate impact of the use of DBpedia and Freebase
