A formal model to the routing questions problem in the context of twitter Cleyton Caetano de Souza

Upload: cleyton-souza

Post on 24-Jan-2015

DESCRIPTION

Presentation at ICWI 2011

TRANSCRIPT

Page 1: A formal model to the routing questions problem

A formal model to the routing questions problem in the context of Twitter

Cleyton Caetano de Souza

Schedule

1. Introduction
   1. Problem
2. Related Works
3. The model
   1. The problem
   2. Details
4. A solution to the model
5. Conclusion
6. Future Works

Cleyton-UFCG

Introduction

• The Web has become essential
  – The Web is a repository of information
• Search engines
  – Looking for answers
• Social networks
  – Waiting for answers


Problem

• Problems can occur when you publish your question
  – Nobody answers
  – Nobody sees it
  – Too many answers
• Directing the question to someone
  – You ensure an answer, but will it be a good one?


Problem

• Informally, the problem we propose to solve is: given a question posted by a user (the asker) on Twitter, find among the asker's followers the user who
  – (1) knows the answer
  – (2) has the trust of the asker
  – (3) will provide the answer quickly


Related Works

• (Morris, Teevan and Panovich 2010a)
  – 93.5% of users received answers to their questions after posting them
  – In 90.1% of cases, these responses were provided within one day
• Applications
  – Aardvark (Horowitz and Kamvar 2010)
  – Q-Sabe (Andrade et al 2003)
• What distinguishes our research


The Model

• Twitter is defined by the tuple
$T = \{U, R\}$
• where $U = \{u_1, \dots, u_{|U|}\}$ is the set of users
• and $R$ is the set of all relationships $r_{i,j}$ between two users $i$ and $j$
  – The existence of $r_{i,j}$ means that $i$ follows $j$, so $r_{i,j} \neq r_{j,i}$


The Model

• Each user $u$ has the attributes
  – $Followers_u$, which contains all users who follow $u$
  – $Following_u$, which contains all users who are followed by $u$
  – $M_u = (m_1, \dots, m_{|M_u|})$, an ordered list that contains all messages posted by $u$
• Each message $m$ has the attributes
  – $d_m$: the post date
  – $s_m$: the string posted
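The definitions above can be read as a small data structure. The following is a minimal sketch, assuming users are identified by name; the class names `User` and `Message` and the helper `follow` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Message:
    """A message m with its post date d_m and posted string s_m."""
    d: datetime
    s: str


@dataclass
class User:
    """A user u with Followers_u, Following_u and the ordered list M_u."""
    name: str
    followers: set = field(default_factory=set)   # names of users who follow u
    following: set = field(default_factory=set)   # names of users followed by u
    messages: list = field(default_factory=list)  # M_u, oldest first


def follow(users: dict, i: str, j: str) -> None:
    """Record the directed relationship r_{i,j}: i follows j."""
    users[i].following.add(j)
    users[j].followers.add(i)


users = {name: User(name) for name in ("ana", "bob")}
follow(users, "ana", "bob")  # r_{ana,bob} exists, r_{bob,ana} does not
```

Because the relationship is directed, only `bob` gains a follower here; `ana` gains an entry in her following set.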


The Problem

Given a query $q$ posted by $u$, a follower $f \in Followers_u$ and a function $p_{f,q}$ that tells us the chance that $f$ provides a good answer:
  – Find: $f$
  – To: $\max p_{f,q}$
  – Over: $Followers_u$
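Stated as code, the problem is an argmax over the asker's followers. This is only a sketch: the function `best_follower` and the example scores are hypothetical stand-ins for $p_{f,q}$, which the following slides build up from knowledge, trust and activity.

```python
def best_follower(followers, q, p):
    """Return the follower f that maximises p(f, q) over Followers_u."""
    return max(followers, key=lambda f: p(f, q))


# Hypothetical scores standing in for p_{f,q}
scores = {"ana": 0.2, "bob": 0.7, "carla": 0.5}
winner = best_follower(scores.keys(), "any question", lambda f, q: scores[f])
```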


The problem

• We believe that $p_{f,q}$ is correlated with three things
  – $k_{f,q}$: the knowledge that $f$ has in relation to $q$
  – $t_{u,f}$: the trust that $u$ has in $f$
  – $a_f$: the activity level of $f$
• So what we actually want is to find the best combination of $k_{f,q}$, $t_{u,f}$ and $a_f$


Knowledge

• Each message $m_u$ corresponds to a fraction of the total expertise of $u$
$k_u = \sum_{m_u \in M_u} k_{m_u}$
• In IR we represent this fraction as a vector of the words/tokens contained in $m_u$
• So $k_u$ is a vector in which each coordinate represents a token and its value is the frequency of that token over all messages $m_u$


Knowledge

• If $t_q$ is the frequency of token $t$ in $q$, the knowledge needed to answer the question satisfactorily is calculated as the inner product between the vector that represents the follower and the vector that represents the question
$k_{f,q} = \sum_{t \in q} t_q \cdot t_{k_f}$
where $t_{k_f}$ is the frequency of $t$ in the follower's knowledge vector $k_f$
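The knowledge score can be sketched with token counters. The tokenizer below (lowercase, split on whitespace) is an assumption made for illustration; the slides do not say how messages are tokenized, and the function names are hypothetical.

```python
from collections import Counter


def tokens(text: str) -> list:
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def knowledge_vector(messages: list) -> Counter:
    """k_u: frequency of every token over all messages of a user."""
    k = Counter()
    for s in messages:
        k.update(tokens(s))
    return k


def knowledge(k_f: Counter, question: str) -> int:
    """k_{f,q}: inner product of the question vector and the follower vector."""
    q = Counter(tokens(question))
    # A Counter returns 0 for tokens the follower never used.
    return sum(t_q * k_f[t] for t, t_q in q.items())


k_f = knowledge_vector(["python lists are mutable", "i like python"])
score = knowledge(k_f, "are python tuples mutable")
```

In this example the follower used "python" twice and "are" and "mutable" once each, while "tuples" never appears, so the inner product is 2 + 1 + 1 + 0 = 4.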


Trust

• Trust is related to
  – Friendship [Schenkel et al 2008]
  – Similarity [Kuter and Golbeck 2010]
• So we believe (and simplify) $t_{u,v} = f_{u,v} \cdot sim(u, v)$


Friendship

• Friendship measures the importance of one user to another
• On Twitter a good estimate of friendship should consider the mentions (connections) between $u$ and $v$, so
$f_{u,v} = \frac{|mentions_u(v)|}{|mentions_u|}$
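A minimal sketch of the friendship ratio, assuming mentions are written as `@name` inside the message text and extracted with a regular expression. Returning 0 when $u$ has made no mentions at all is a choice of this sketch, not something the slides specify.

```python
import re


def friendship(messages_u: list, v: str) -> float:
    """f_{u,v}: share of u's mentions that are directed at v."""
    mentions = [name for s in messages_u for name in re.findall(r"@(\w+)", s)]
    if not mentions:
        return 0.0  # assumption: no mentions means no measurable friendship
    return mentions.count(v) / len(mentions)


msgs = ["@bob thanks!", "lunch with @bob and @carla", "hello world", "@dave hi"]
f_uv = friendship(msgs, "bob")
```

Here two of the four mentions point at `bob`, so the ratio is 0.5.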


Similarity

• Similarity measures how alike two users are under some criterion
• Intuitively, similarity is related to how much the users' attributes have in common
$sim_1(u, v) \propto \frac{|Followers_u \cap Followers_v|}{|Followers_u \cup Followers_v|}$
$sim_2(u, v) \propto \frac{|Following_u \cap Following_v|}{|Following_u \cup Following_v|}$
$sim_3(u, v) \propto sim(k_u, k_v)$


Similarity

• Any combination of these equations could be used
• We chose to use
$sim(u, v) = \frac{sim_1(u, v)}{1 - sim_1(u, v)} \cdot \frac{sim_2(u, v)}{1 - sim_2(u, v)} \cdot \frac{sim_3(u, v)}{1 - sim_3(u, v)}$
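The three similarities and their combination might be sketched as follows. Two points are assumptions of this sketch rather than statements from the slides: cosine similarity is used as one possible choice for $sim(k_u, k_v)$, and a small `eps` guards the division when a similarity equals 1.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union (sim_1 and sim_2)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(k_u: dict, k_v: dict) -> float:
    """One possible sim_3 (assumption): cosine of the token-frequency vectors."""
    dot = sum(k_u[t] * k_v.get(t, 0) for t in k_u)
    norm_u = math.sqrt(sum(x * x for x in k_u.values()))
    norm_v = math.sqrt(sum(x * x for x in k_v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def odds(s: float, eps: float = 1e-9) -> float:
    """s / (1 - s); eps avoids division by zero when s == 1."""
    return s / max(1.0 - s, eps)


def sim(sim1: float, sim2: float, sim3: float) -> float:
    """Combined similarity: product of the three s / (1 - s) terms."""
    return odds(sim1) * odds(sim2) * odds(sim3)


s1 = jaccard({"a", "b", "c"}, {"b", "c", "d"})        # 2 shared out of 4
s2 = jaccard({"x", "y"}, {"y"})                       # 1 shared out of 2
s3 = cosine({"python": 1}, {"python": 1, "web": 1})   # partial overlap
combined = sim(s1, s2, s3)
```

Each term $s / (1 - s)$ equals 1 at $s = 0.5$ and grows without bound as $s$ approaches 1, so a single very high similarity can dominate the product.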


Activity

• Users do not all interact with the same intensity
• It seems intuitive that the activity level of a user depends on how frequently he or she posts new tweets


Activity

• Activity is the mean time between the messages posted by $u$
$a_u = \frac{(today - d_{m,|M_u|}) + \sum_{i=1}^{|M_u|-1} (d_{m,i+1} - d_{m,i})}{|M_u| + 1}$
where $d_{m,i}$ is the post date of the $i$-th message in $M_u$ and the sum runs over consecutive pairs of messages
• The lower this value, the more active the user and the greater the chance that he or she answers quickly
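A minimal sketch of the activity measure using `datetime`. It assumes the user has posted at least one message, and the dates are invented for the example.

```python
from datetime import datetime, timedelta


def activity(post_dates: list, today: datetime) -> timedelta:
    """a_u: the gap from the last message to today plus all gaps between
    consecutive messages, divided by |M_u| + 1."""
    dates = sorted(post_dates)          # assumes at least one message
    total = today - dates[-1]
    for earlier, later in zip(dates, dates[1:]):
        total += later - earlier
    return total / (len(dates) + 1)


today = datetime(2011, 11, 10)
dates = [datetime(2011, 11, 1), datetime(2011, 11, 4), datetime(2011, 11, 6)]
a_u = activity(dates, today)
```

The gaps in this example are 3, 2 and 4 days, which sum to 9 days; dividing by $|M_u| + 1 = 4$ gives 2 days and 6 hours. The gaps always add up to the time elapsed since the first message, so the value depends only on that date and on the number of messages.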


Solving the Model

• Calculating the tuple $(k_{f,q}, t_{u,f}, a_f)$ for each user is a simple task
• But how do we decide who is the best?


Solving the Model

• We consider this a multi-criteria decision-making problem
• We decided to solve it with the Weighted Product Model, based on [Triantaphyllou and Mann 1989]


Solving the Model-Step 1

• Solving the model starts by calculating the tuple $(k_{f,q}, t_{u,f}, a_f)$ for each user $f \in Followers_u$


Solving the Model-Step 2

• Then we arrange these users in a $|Followers_u| \times |Followers_u|$ matrix


Solving the Model-Step 3

• We create a function $map(x)$ that maps the values of $(k_{f,q}, t_{u,f}, a_f)$ onto the same scale
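The slides do not say which mapping is used. As one possible illustration only, the sketch below uses min-max scaling to $[0, 1]$ and optionally flips the scale, so that criteria where lower is better (such as the mean time between posts) point the same way as the others. The name `min_max` and the flag `higher_is_better` are hypothetical.

```python
def min_max(values: list, higher_is_better: bool = True) -> list:
    """Rescale values to [0, 1]; optionally flip so that 1 is always best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all equal: no follower stands out
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


knowledge = min_max([4, 0, 2])
activity = min_max([2.25, 10.0, 5.0], higher_is_better=False)
```

With this particular mapping the worst follower on a criterion gets exactly 0, which cannot be used as a denominator in the ratios of Step 4; a real mapping would need a strictly positive range.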


Solving the Model-Step 4

• For each pair $f_1, f_2$ with $f_1 \neq f_2$ we calculate
$p_{f_1,f_2} = \left(\frac{k_{f_1,q}}{k_{f_2,q}}\right)^{x} \cdot \left(\frac{t_{u,f_1}}{t_{u,f_2}}\right)^{y} \cdot \left(\frac{a_{f_1}}{a_{f_2}}\right)^{z}$
• The values $x$, $y$ and $z$ are importance factors; each must be between 0 and 1, and $x + y + z = 1$
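The pairwise ratio can be sketched directly from the formula. The sketch assumes every value has already been mapped to a strictly positive number on a common scale where larger is better; the function name `wpm_ratio` and the example numbers are illustrative.

```python
def wpm_ratio(f1: tuple, f2: tuple, weights: tuple) -> float:
    """p_{f1,f2}: product over (k, t, a) of (value of f1 / value of f2) ** weight."""
    x, y, z = weights
    if min(weights) < 0 or max(weights) > 1 or abs(x + y + z - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    (k1, t1, a1), (k2, t2, a2) = f1, f2
    return (k1 / k2) ** x * (t1 / t2) ** y * (a1 / a2) ** z


# Two followers that differ only in knowledge: 0.8 versus 0.2
p = wpm_ratio((0.8, 0.4, 0.9), (0.2, 0.4, 0.9), weights=(0.5, 0.25, 0.25))
```

Only the knowledge ratio differs from 1 here, so $p = 4^{0.5} = 2$; swapping the two followers gives the reciprocal, 0.5.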


Solving the Model-Step 5

• If $p_{f_1,f_2} > 0$ we put 1 in position $(f_1, f_2)$ and 0 in position $(f_2, f_1)$
• If $p_{f_1,f_2} < 0$ we put 0 in position $(f_1, f_2)$ and 1 in position $(f_2, f_1)$
• If $p_{f_1,f_2} = 0$ we put 1 in position $(f_1, f_2)$ and 1 in position $(f_2, f_1)$


Solving the Model-Step 6 (End)

• We calculate the sum of each row of the matrix; this number is the number of victories of the corresponding user
• In the end we have one victory count per user
• The question is routed to the user with the most victories
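Steps 4 to 6 can be put together in a short sketch. One assumption needs stating loudly: the slides compare $p_{f_1,f_2}$ against 0, but a product of positive ratios is never negative, so this sketch uses 1 as the threshold, which is the usual comparison point for the Weighted Product Model. The names and the example tuples are hypothetical, and the values are assumed to be already mapped to positive numbers where larger is better.

```python
def route(candidates: dict, weights: tuple) -> tuple:
    """Build the victory matrix and return (winner, victories per follower)."""
    names = list(candidates)
    n = len(names)
    wins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = 1.0
            for v1, v2, w in zip(candidates[names[i]], candidates[names[j]], weights):
                p *= (v1 / v2) ** w
            if p > 1.0:        # assumed threshold: 1, not 0
                wins[i][j] = 1
            elif p < 1.0:
                wins[j][i] = 1
            else:              # tie: both positions receive 1
                wins[i][j] = wins[j][i] = 1
    victories = {name: sum(row) for name, row in zip(names, wins)}
    return max(victories, key=victories.get), victories


candidates = {                 # (k, t, a) after mapping, invented values
    "ana": (0.8, 0.4, 0.9),
    "bob": (0.2, 0.4, 0.9),
    "carla": (0.5, 0.9, 0.3),
}
winner, victories = route(candidates, weights=(0.5, 0.25, 0.25))
```

With these numbers `ana` beats both other followers, `carla` beats `bob`, and the question would be routed to `ana`.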


Conclusion

• What distinguishes our research
  – We focus on a successful network
  – We treat the problem from a new perspective
  – We deal with a recent and interesting problem


Future Works

• The model has already been implemented
• We are investigating whether our heuristics are coherent
• We will investigate
  – Whether the indications of the model are accurate
  – Whether directing questions is more effective
  – Which importance factor matters most


Thank You

• Any questions?