1
Yuheng Hu (@hyheng), Arizona State Univ.
Ajita John, Avaya Labs
Fei Wang, IBM T.J. Watson Research
Subbarao Kambhampati, Arizona State Univ.
ET-LDA: Joint Topic Modeling for Aligning Events and their Twitter Feedback
3
Motivation
Republican Primary Debate, 09/07/2011
Tweets tagged with #ReaganDebate
Which part of the event did a tweet refer to?
What were the topics of the event and tweets?
Applications: Event playback/Analysis, Sentiment Analysis, Advertisement, etc
4
Event-Tweet Alignment: The Problem
• Given an event’s transcript S and its associated tweets T
– Find the segment s (s ∈ S) that tweet t (t ∈ T) topically refers to [t could also be a general tweet]
• Alignment requires:
1. Extracting topics in the tweets and event
2. Segmenting the event into topically coherent chunks
3. Classifying the tweets as General vs. Specific
6
Event-Tweet Alignment: Challenges
• Both topics and segments are latent
• Tweets are topically influenced by the content of the event. A tweet’s words’ topics can be
– general (high-level and constant across the entire event), or
– specific (concrete and related to specific segments of the event)
• General tweet = weakly influenced by the event
• Specific tweet = strongly influenced by the event
• An event is formed by discrete, sequentially ordered segments, each of which discusses a particular set of topics
7
Event-Tweet Alignment: Approaches
• Prior work
– Event Segmentation
• HMM-based, etc.
– Topic Modeling
• LDA, PLSI
• Possible Solution
– Apply LDA to event and tweets separately
– Measure the closeness by JS-divergence of their topic distributions
– Problem: Event and its Twitter feeds are modeled largely independently
• Our Solution: Joint Modeling
– ET-LDA (event-tweets LDA) considers an event and its Twitter feeds jointly and characterizes the topic influences between them in a fully Bayesian model
• Potential advantages
– Tweets provide a richer context about the topic evolution in the event
– Can measure the influence of the event on the twitterati
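The "Possible Solution" baseline compares topic distributions by JS-divergence. A minimal sketch of that comparison, assuming the topic mixtures of the tweet and of each segment have already been estimated (for example by separate LDA runs). The function names and the numbers are illustrative only and do not come from the paper:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def closest_segment(tweet_topics, segment_topics):
    """Index of the segment whose topic mixture is closest to the tweet's."""
    return min(range(len(segment_topics)),
               key=lambda i: js_divergence(tweet_topics, segment_topics[i]))

# Hypothetical topic mixtures over 3 topics (made-up numbers).
segments = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
tweet = [0.2, 0.7, 0.1]
best = closest_segment(tweet, segments)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (disjoint support), so a lower value means a closer topical match.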
9
ET-LDA Model
[Figure: ET-LDA graphical model with an Event part and a Tweets part. Annotations:]
• Determine event segmentation
• Determine tweet type
• Determine which segment a tweet (word) refers to
• Determine word’s topic in event
• Determine word’s topic (Tweets)
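The tweet-side annotations (tweet type, referred segment, word topic) can be read as a generative story for each tweet word. A toy sketch of that reading follows. It is not the model from the paper: all parameter names (`p_specific`, `segment_prefs`, and so on) are hypothetical, the parameters are fixed rather than drawn from priors, and event segmentation is not modeled:

```python
import random

def generate_tweet(n_words, p_specific, segment_prefs, segment_topics,
                   tweet_topics, topic_words, rng):
    """Sample one toy tweet as a list of (word, kind, segment_index) triples."""
    n_topics = len(topic_words)
    tweet = []
    for _ in range(n_words):
        if rng.random() < p_specific:
            # Specific word: pick a segment, then a topic from that segment's mixture.
            kind = "specific"
            seg = rng.choices(range(len(segment_topics)), weights=segment_prefs)[0]
            topic = rng.choices(range(n_topics), weights=segment_topics[seg])[0]
        else:
            # General word: the topic comes from the tweet's own mixture.
            kind = "general"
            seg = None
            topic = rng.choices(range(n_topics), weights=tweet_topics)[0]
        vocab, probs = zip(*topic_words[topic].items())
        word = rng.choices(vocab, weights=probs)[0]
        tweet.append((word, kind, seg))
    return tweet

# Made-up toy parameters: 2 topics, 2 event segments.
topic_words = [{"jobs": 0.6, "teachers": 0.4}, {"airlines": 0.5, "security": 0.5}]
segment_topics = [[0.9, 0.1], [0.1, 0.9]]
sample = generate_tweet(5, 0.7, [0.5, 0.5], segment_topics, [0.5, 0.5],
                        topic_words, random.Random(0))
```

A tweet with a high `p_specific` corresponds to a "specific" (strongly influenced) tweet, and a low value to a "general" one.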
10
ET-LDA Model
For more details of the inference, please refer to our paper: http://bit.ly/MBHjyZ
11
Learning ET-LDA: Gibbs sampling
For more details of the inference, please refer to our paper: http://bit.ly/MBHjyZ
Coupling between a and b makes the posterior computation of the latent variables intractable
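For readers unfamiliar with Gibbs sampling in topic models, here is a minimal collapsed Gibbs sampler for plain LDA. This is the standard textbook sampler, swapped in for illustration; it is not the ET-LDA sampler, whose update equations are in the paper and are not reproduced here. The function name and default hyperparameters are assumptions:

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA; docs are lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]              # topic counts per document
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                            # total words per topic
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return assignments, doc_topic, topic_word

# Toy corpus with word ids 0..3 (made-up data).
docs = [[0, 1, 0, 1], [2, 3, 2, 3, 2]]
assignments, doc_topic, topic_word = lda_gibbs(docs, n_topics=2, vocab_size=4, n_iters=20)
```

Each step resamples one latent variable conditioned on all the others, which avoids computing the intractable joint posterior directly.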
12
Experimental Evaluation
Evaluation Plan for ET-LDA
• Performance of topic extraction
• Performance of topic influence prediction
• Performance of event segmentation
Experimental Setup
• Tweets for President Obama’s speech on the Middle East (#MESpeech) & Republican Primary debate in the US (#ReaganDebate)
• Event transcripts from New York Times
• Tweets expanded with search snippets for context
13
Topics Extraction (#MESpeech)
MESpeech: specific topics are sensitive to the event’s context and keep evolving as the event progresses
14
Examples of segments (#MESpeech)
• 1st segment: Introduction
Thank you. Thank you. (Applause.) Thank you very much. Thank you. Please, have a seat. Thank you very much. I want to begin by thanking Hillary Clinton, who has traveled so much these last six months that she is approaching a new landmark – one million frequent flyer miles. I count on Hillary every single day, and I believe that she will go down as one of the finest Secretaries of State in our nation's history.
• 2nd segment: Overview of US foreign policy
The State Department is a fitting venue to mark a new chapter in American diplomacy. For six months, we have witnessed an extraordinary change taking place in the Middle East and North Africa. Square by square, town by town, country by country, the people have risen up to demand their basic human rights. Two leaders have stepped aside. More may follow. And though these countries may be a great distance from our shores, we know that our own future is bound to this region by the forces of economics and security, by history and by faith. Today, I want to talk about this change -- the forces that are driving it and how we can respond in a way that advances our values and strengthens our security. Now, already, we've done much to shift our foreign policy following a decade defined by two costly conflicts. After years of war in Iraq, we've removed 100,000 American troops and ended our combat mission there. In Afghanistan, we've broken the Taliban's momentum, …
7 segments
15
Event Segmentation (#MESpeech)
• Participants were asked to assess the quality of segmentation by ET-LDA and LCSeg (an HMM-based event segmentation tool, trained with a 15-state HMM)
– Participants: 5 graduate students
– Method: questionnaire
• ET-LDA performed consistently better than baselines (lower Pk values)
Pk: probability that a randomly chosen pair of words is incorrectly separated by a segment boundary
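A minimal sketch of how a Pk score can be computed, using the common sliding-window formulation: slide a window of width k over the text and count how often the reference and the hypothesis disagree about whether the two window ends fall in the same segment. The default k (half the mean reference segment length) is the usual convention and is an assumption here, not a detail taken from the slides:

```python
def pk(reference, hypothesis, k=None):
    """Pk segmentation error; inputs are per-unit segment ids, e.g. [0, 0, 1, 1]."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of units")
    n = len(reference)
    if k is None:
        # Conventional default: half the mean reference segment length.
        n_segments = 1 + sum(reference[i] != reference[i + 1] for i in range(n - 1))
        k = max(1, n // (2 * n_segments))
    if k >= n:
        raise ValueError("window k must be smaller than the sequence length")
    errors = 0
    for i in range(n - k):
        same_in_ref = reference[i] == reference[i + k]
        same_in_hyp = hypothesis[i] == hypothesis[i + k]
        errors += same_in_ref != same_in_hyp
    return errors / (n - k)

# Made-up example: the hypothesis misses the single true boundary.
reference = [0, 0, 0, 1, 1, 1]
missed = [0, 0, 0, 0, 0, 0]
score = pk(reference, missed)
```

A perfect segmentation scores 0, and lower values are better, which matches the "lower Pk values" reading on the slide.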
16
Examples of Specific/General tweets
• ReaganDebate
– Specific
– General
Yes, we need to talk about jobs and teachers needing jobs! #Reagandebate
Something the #GOP candidates won't mention about Reagan - Reagan grew the size of the federal government tremendously. #reagandebate
Huntsman said Ronnie!! Take a shot! #GOPDebate #tcot #ReaganDebate
Wow, Ron Paul. Really, you think airlines would give a rip about security? Free market nonsense. #reagandebate
17
Topic Influence Prediction (#MESpeech)
• Prediction of topical influences from the event on the unseen tweets in our test set (20% of total tweets), i.e., whether tweets are strongly or weakly influenced by the event
• Baseline: run LDA on the event and tweets separately, measure the JS-divergence between their topic distributions, and deem the closest tweets strongly influenced
• Human study to evaluate the “goodness” of prediction results (e.g., “Do you think this tweet is strongly correlated to this segment of the event?”)
The improvements are statistically significant
18
Conclusion
• Motivated joint modeling for event-tweet alignment
• Developed ET-LDA model
• Provided evaluations on two tweet datasets
– Demonstrated that ET-LDA significantly outperformed the traditional models
Thank you!
For details: [email protected] Web: http://bit.ly/Mkie7l
Twitter: @hyheng