Finding Question-Answer Pairs from Online Forums

G. Cong, L. Wang, C. Lin, Y. Song, and Y. Sun. (2008)

Presenter: Tan Kent Loong r04944005

Upload: tk-loong

Post on 09-Feb-2017


Page 3: Finding question answer pairs from online forum

Motivation

● Forums contain a huge amount of user-generated content on a variety of topics.
  ○ Human knowledge
  ○ Largely unstructured

Page 4: Finding question answer pairs from online forum

Flow Chart

Forum → Threads → Question Detection → Answer Detection

Page 5: Finding question answer pairs from online forum

Question Detection

● Rule based methods
  ○ 5W1H
  ○ End with ' ? '
    ■ 30% of questions do not end with a question mark.
      ● I am wondering where I can visit in Bangkok.
      ● I am having doubt about changing tyre.
    ■ 9% of sentences ending with a question mark are not questions.
      ● Like to enjoy a long walk while enjoying great sights and tastes?
      ● Only have three days to explore this city?

Not good!
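The two rules above can be sketched in a few lines to see where they break. This is a minimal illustration only: the function name `is_question_rule_based` and the exact 5W1H word list are assumptions, not something taken from the paper.

```python
import re

# 5W1H cue words; this exact list is an assumption for illustration.
WH_WORDS = {"what", "when", "where", "who", "why", "how"}


def is_question_rule_based(sentence: str) -> bool:
    """Flag a sentence as a question if it ends with '?' or starts with a 5W1H word."""
    text = sentence.strip().lower()
    if text.endswith("?"):
        return True
    words = re.findall(r"[a-z']+", text)
    return bool(words) and words[0] in WH_WORDS


# Sentences taken from the slide: the first is a real question the rules miss,
# the second is not really a question but the rules accept it.
for s in [
    "I am wondering where I can visit in Bangkok.",
    "Only have three days to explore this city?",
]:
    print(s, "->", is_question_rule_based(s))
```

Both slide examples are misclassified by this sketch, which is the motivation for learning patterns instead of hand-writing rules.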

Page 6: Finding question answer pairs from online forum

Labeled Sequential Pattern

1. Pre-process each sentence: keep keywords such as question words and replace the other words with their POS tags.
   “where can you find a job” → “where can PRP VB DT NN”
2. Build the sequence database.
   <a, d, e, f> → Q
   <a, f, e, f> → Q
   <d, a, f> → NQ
3. Calculate the support and confidence of each pattern.
   - <a, e, f> → Q: support 66.7%, confidence 100%
   - <a, f> → Q: support 66.7%, confidence 66.7%
4. Set a minimum support threshold and a minimum confidence threshold.

F1 score = 97.4%
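The support and confidence figures in step 3 can be reproduced with a short sketch. It assumes that a pattern matches a sequence when it is a (not necessarily contiguous) subsequence, that support is the fraction of all sequences that contain the pattern and carry the label, and that confidence is that count divided by the number of sequences containing the pattern. The helper names are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern is a (not necessarily contiguous) subsequence of sequence."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support_confidence(database, pattern, label):
    """Return (support, confidence) of the labeled pattern `pattern -> label`."""
    n_lhs = sum(1 for seq, _ in database if contains(seq, pattern))
    n_both = sum(
        1 for seq, lab in database if lab == label and contains(seq, pattern)
    )
    support = n_both / len(database)
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


# The toy sequence database from the slide.
database = [
    (["a", "d", "e", "f"], "Q"),
    (["a", "f", "e", "f"], "Q"),
    (["d", "a", "f"], "NQ"),
]

print(support_confidence(database, ["a", "e", "f"], "Q"))
print(support_confidence(database, ["a", "f"], "Q"))
```

Patterns whose support and confidence both exceed the chosen thresholds would then be kept as features for the question classifier.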

Page 7: Finding question answer pairs from online forum

Answer Detection

● Observation: Many-to-many
  ○ Multiple questions and answers within the same thread.
    ■ One question may have multiple replies.
    ■ One post may contain answers to multiple questions.

Page 8: Finding question answer pairs from online forum

Answer Detection

● Treat as traditional document retrieval problem
  ○ Cosine Similarity
  ○ Query likelihood language model
  ○ KL-divergence language model
● Classification method
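As an illustration of the retrieval view, the simplest of the listed options, cosine similarity, can be sketched as follows: the question acts as the query and each candidate answer as a document. Whitespace tokenization and raw term counts are simplifying assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(question, candidates):
    """Order candidate answers by decreasing similarity to the question."""
    return sorted(candidates, key=lambda c: cosine(question, c), reverse=True)


ranked = rank_candidates(
    "which hotel is good in Bangkok",
    ["I like noodles", "world hotel is good"],
)
print(ranked)
```

Each candidate is scored against the question in isolation here, so relationships between candidates play no role in the ranking.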

Page 9: Finding question answer pairs from online forum

KL-divergence

Think of it as a “distance” between the question language model and the answer language model.

p(w|Mq): probability of keyword w appearing in the question.
p(w|Ma): probability of keyword w appearing in the candidate answer.
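The slide's formula did not survive extraction, so the sketch below uses the standard definition KL(Mq || Ma) = Σ_w p(w|Mq) log(p(w|Mq) / p(w|Ma)). The unigram models, the add-one smoothing (`alpha`) and the direction of the divergence are assumptions made for illustration, not details confirmed by the slides.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Additively smoothed unigram model over a shared vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def kl_question_answer(question, answer):
    """Divergence between the question model Mq and the answer model Ma."""
    vocab = set(question.lower().split()) | set(answer.lower().split())
    mq = unigram_model(question, vocab)
    ma = unigram_model(answer, vocab)
    return kl_divergence(mq, ma)


q = "where is a good hotel"
print(kl_question_answer(q, "world hotel is good"))
print(kl_question_answer(q, "I like spicy noodles"))
```

A smaller value means the candidate answer's word distribution is closer to the question's, so candidates would be ranked by increasing divergence.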

Page 10: Finding question answer pairs from online forum

Answer Detection

● Treat as traditional document retrieval problem
  ○ Cosine Similarity
  ○ Query likelihood language model
  ○ KL-divergence language model
● Classification method

Cons: These methods consider neither the relationships between candidate answers nor forum-specific features.

a1: world hotel is good but I prefer century hotel
a2: world hotel has a very good restaurant
a2 (generator) → a1 (offspring)

Page 11: Finding question answer pairs from online forum

PageRank (without hyperlinks)

Page 12: Finding question answer pairs from online forum

Graph-Based Propagation

Page 13: Finding question answer pairs from online forum

Graph-Based Propagation

● Calculate weight based on
  ○ The probability, assigned by the language model, of generating one candidate answer from the other candidate answer
  ○ The distance of the candidate answer from the question
  ○ The authority of the author of the candidate answer

author(ag ; #reply2, #start)

Page 15: Finding question answer pairs from online forum

Graph-Based Propagation

1. Propagation without initial score:

2. Propagation with initial score:
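The formulas for the two variants were lost with the slide images. The sketch below is therefore a generic PageRank-style power iteration, not the paper's exact equations. It assumes `weights[o][g]` is the edge weight from an offspring answer `o` to its generator `g`, that `damping` and `iterations` are free parameters, and that the "with initial score" variant can be approximated by using normalized initial relevance scores as the restart distribution.

```python
def normalize(weights):
    """Row-normalize so that each node's outgoing weights sum to 1."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def propagate(weights, initial=None, damping=0.85, iterations=50):
    """Propagate scores from offspring to generators.

    initial=None     -> uniform restart (sketch of "without initial score")
    initial=[s0, ..] -> restart proportional to s (sketch of "with initial score")
    """
    n = len(weights)
    nw = normalize(weights)
    if initial is None:
        base = [1.0 / n] * n
    else:
        total = sum(initial)
        base = [s / total for s in initial]
    scores = base[:]
    for _ in range(iterations):
        scores = [
            (1 - damping) * base[g]
            + damping * sum(scores[o] * nw[o][g] for o in range(n))
            for g in range(n)
        ]
    return scores


# a1 (index 0) is an offspring of a2 (index 1), as in the hotel example.
weights = [[0.0, 1.0], [0.0, 0.0]]
print(propagate(weights))
print(propagate(weights, initial=[0.9, 0.1]))
```

With a uniform restart the generator a2 ends up with the higher score. Nodes without outgoing edges simply lose their mass in this sketch, which a fuller implementation would need to handle.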

Page 16: Finding question answer pairs from online forum

Integration with other methods

1. Graph based propagation → classification
2. Lexical mapping, e.g. “why → because”

Page 17: Finding question answer pairs from online forum

Evaluation


Page 21: Finding question answer pairs from online forum

Summary

Forum → Threads → Question Detection (Labeled Sequential Pattern) → Answer Detection (enhanced with Graph-Based Propagation)

Page 22: Finding question answer pairs from online forum

References

1. Finding question-answer pairs from online forums
   http://research.microsoft.com/en-us/people/cyl/sigir2008-gao-msra.pdf
2. PageRank without hyperlinks: Structural re-ranking using links induced by language models
   https://www.cs.cornell.edu/home/llee/papers/lmpagerank.home.html

Page 23: Finding question answer pairs from online forum

Thank you