predicting long-term impact of cqa posts: a comprehensive ... · yuan yao. joint work with ....

27
Yuan Yao Joint work with Hanghang Tong, Feng Xu, and Jian Lu Predicting Long-Term Impact of CQA Posts: A Comprehensive Viewpoint 1 Aug 24-27, KDD 2014

Upload: others

Post on 18-Mar-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Yuan YaoJoint work with

Hanghang Tong, Feng Xu, and Jian Lu

Predicting Long-Term Impact of CQA Posts: A Comprehensive Viewpoint

1Aug 24-27, KDD 2014

Page 2: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

2

Page 3: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

3

Page 4: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

CQA

4

Page 5: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Long-Term Impact

Q: How many users will find it beneficial?

What is the Long-Term Impact of a Q/A post?5

Page 6: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Challenges

Q: Why not off-the-shell data mining algorithms?

Challenge 1: Multi-aspect C1.1. Coupling between questions and answers C1.2. Feature non-linearity C1.3. Posts dynamically arrive

Challenge 2: Efficiency

6

Page 7: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

C1.1 Coupling

Strong positive correlation![Yao+@ASONAM’14]

[Yao+@ASONAM’14] Y Yao, H Tong, T Xie, L Akoglu, F Xu, J Lu. Joint Voting Prediction for Questions and Answers in CQA. ASONAM 2014.

Fq

Predicted Q Impact Yq

Question Features

Fa

Predicted A Impact Ya

Answer Features

VotingConsistency

1 2

3

7

Page 8: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

C1.1 Coupling

LIP-M: [Yao+@ASONAM’14]

Question prediction Answer prediction

Voting Consistency Regularization

1 2

3

8

Page 9: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

C1.2 Non-linearity

The kernel trick (e.g., SVM) Mercer kernel

Kernel matrix as new feature matrix

9

Page 10: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

C1.3 Dynamics

Solution: recursive least squares regression[Haykin2005]

(x1, y1)(x2, y2)

…(xt, yt)

(xt+1, yt+1)

Modelt

Modelt+1+

[Haykin2005] S Haykin. Adaptive filter theory. 2005.

Existing examples

New examples

Current model

New model

10

Page 11: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

This Paper

Q1: how to comprehensively capture the multi-aspect in one algorithm? Coupling, non-linearity, and dynamics

Q2: how to make the long-term impact prediction algorithm efficient?

11

Page 12: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

12

Page 13: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Modeling Non-linearity

Basic Idea: kernelize LIP-M Details - LIP-KM:

Closed-form solution:

Complexity: O(n3)

Question prediction Answer prediction

Voting Consistency Regularization

1 2

3

13

Page 14: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Modeling Dynamics

Basic idea: recursively update LIP-KM Details - LIP-KIM:

Complexity: O(n3) -> O(n2)

(matrix inverse lemma)

14

Page 15: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

16

Page 16: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Approximation Method (1)

Basic idea: compress the kernel matrix Details – LIP-KIMA:

1) Separate decomposition 2) Make decomposition reusable 3) Apply decomposition on LIP-KIM

Complexity: O(n2) -> O(n)

(Nyström approximation)

(Eigen-decomposition)

(SVD on X1)(Eigen-decomposition)

17

Page 17: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Approximation Method (2)

Basic idea: filter less informative examples Details - LIP-KIMAA:

Complexity: O(n) -> <O(n)

?

(x1, y1)(x2, y2)

…(xt, yt)

ModeltExisting examples

Current model

(xt+1, yt+1)

New examples

Modelt+1 +New model

Filtering

18

Page 19: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

20

Page 20: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Experiment Setup

Datasets (http://blog.stackoverflow.com/category/cc-wiki-dump/)

Stack Overflow , Mathematics Stack Exchange

Features Content (bag-of-words) & contextual features

Test setTraining setInitial set Incremental set

Time

21

Page 21: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Evaluation Objectives

O1: Effectiveness How accurate are the proposed algorithms

for long-term impact prediction?

O2: Efficiency How scalable are the proposed algorithms?

22

Page 22: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Effectiveness Results

Comparisons with existing models.

(better)

Our methods

23

Page 23: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Efficiency Results

The speed comparisons.

(better)

LIP-KIMAA(sub-linear)

Ours

25

Page 24: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Quality-Speed Balance-off

Our methods

(better)

26

Page 25: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Roadmap

Background and Motivations Modeling Multi-aspect Computation Speedup Empirical Evaluations Conclusions

27

Page 26: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Conclusions

A family of algorithms for long-term impact prediction of CQA posts

Q1: how to capture coupling, non-linearity, and dynamics?

A1: voting consistency + kernel trick + recursive updating

Q2: how to make the algorithms scalable? A2: approximation methods

Empirical Evaluations Effectiveness: up to 35.8% improvement Efficiency: up to 390x speedup and sub-linear

scalability28

Page 27: Predicting Long-Term Impact of CQA Posts: A Comprehensive ... · Yuan Yao. Joint work with . Hanghang Tong, Feng Xu, and Jian Lu. Predicting Long-Term Impact of CQA Posts: A Comprehensive

Q&A

Thanks!

Authors: Yuan Yao, Hanghang Tong, Feng Xu, and Jian Lu

Yuan Yao, [email protected]

29