Ranking Systems: Manipulability and Efficiency Eric Friedman*, ORIE Cornell University (Currently visiting: Dept of CS, U.C. Berkeley, 2005-6) *Work supported by NSF. ITR-0325453

Post on 20-Dec-2015

Ranking Systems: Manipulability and Efficiency

Eric Friedman*, ORIE

Cornell University

(Currently visiting: Dept of CS, U.C. Berkeley, 2005-6)

*Work supported by NSF. ITR-0325453

Ranking and Reputations

• Reputations are important

– Webpage ranking: links are “recommendations”

• High ranks lead to more “clicks”

– P2P: choosing partners

– Ebay: reputations are crucial (and quite valuable).

• Higher reputations lead to higher prices

– PGP: web of trust.

– Spam and DDoS protections

Problems with Reputation Systems

• Gaming reputation systems is becoming a serious problem.

– P2P: seti@home, Kazaa-lite

– Webpage ranking: link spamming

• Note: most (all?) current reputation systems are ad-hoc

– No formal requirements etc.

A research agenda: Understanding the tradeoffs between manipulability and efficiency

1) Quantify the manipulability of ranking systems.

2) Quantify the efficiency of ranking systems.

3) Find the ranking systems that are on the efficient frontier and maximize various objectives.

Today’s talk (some first steps)

• A framework for manipulability (w/Alice Cheng)

– Characterization of manipulability of ranking systems.

• Empirical analysis of PageRank on the WWW (w/Alice Cheng)

• Evaluating the efficiency of ranking mechanisms (work in progress)

Part I: Goals and Approach

• Our goal: create a formalism for analyzing and designing reputation systems that are robust to attacks.

– Here we focus on sybils; this is important in itself, but our goals are much broader.

• Note: the definitions were harder than the proofs.

• Approach: Game theory, mechanism design (i.e., Arrow's Theorem)

Trust Graphs

• Most reputation systems use trust graphs:

– $G=(V,E)$

– If $e=(i,j)$ then $T(e)$ = i’s (direct) trust of j.

– Higher $T(e)$ is better

• Reputation function: $f(G)_i$ = reputation of i.

• Rank: i outranks j if $f(G)_i > f(G)_j$

– Note: we focus on rank

• Why use a trust graph?

– Many (most?) interactions are 1st time interactions

• $(i,j) \notin E$

[Figure: example trust graph with edge labels 1, 1, 3, 2, 2, 1, 3]
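A trust graph and the "i outranks j" relation can be sketched in a few lines of Python. This is only an illustration: the node names, the trust values, and the reputation function `incoming_trust` (the sum of direct trust received) are assumptions made for the example and are not one of the systems discussed in the talk.

```python
# Trust graph as a dict: (i, j) -> T(e), i's direct trust of j.
# Node names and values are hypothetical.
trust = {
    ("alice", "bob"): 3.0,
    ("alice", "carol"): 2.0,
    ("bob", "carol"): 1.0,
    ("carol", "alice"): 1.0,
}


def incoming_trust(trust):
    """A deliberately naive reputation function f(G): total direct trust received."""
    nodes = {node for edge in trust for node in edge}
    reputation = dict.fromkeys(nodes, 0.0)
    for (_, j), value in trust.items():
        reputation[j] += value
    return reputation


def outranks(f, trust, i, j):
    """Rank comparison from the slide: i outranks j if f(G)_i > f(G)_j."""
    reputation = f(trust)
    return reputation[i] > reputation[j]


reputation = incoming_trust(trust)
bob_over_alice = outranks(incoming_trust, trust, "bob", "alice")
```

Any other reputation function with the same signature can be passed to `outranks` in place of `incoming_trust`.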

Some Representative Reputation Systems

• PageRank and related systems (Brin and Page 98, Kleinberg 98, Guha et al. 04)

– Start at an arbitrary node and then take a random walk on the graph.

• Flow methods (e.g., Flake et al. 02, Chuang and Stoica 02)

– Compute the max flow from i to j.

• Shortest path method.

– Let $c(e)=1/T(e)$, then find the shortest path from i to j in terms of the c’s.

Pagerank = Random Walk on Graph
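The random-walk view can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the talk's implementation: the restart probability of 0.15, the four-page example graph, and the function name `random_surfer` are all chosen for illustration. It estimates each page's score as the fraction of steps a random surfer spends there.

```python
import random


def random_surfer(links, steps=100_000, restart=0.15, seed=0):
    """Estimate PageRank-style scores as visit frequencies of a random walk.

    links maps every page to the list of pages it links to (every page
    must appear as a key). With probability `restart`, or when a page has
    no out-links, the surfer jumps to a uniformly random page.
    """
    rng = random.Random(seed)
    pages = list(links)
    visits = dict.fromkeys(pages, 0)
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        out_links = links[current]
        if not out_links or rng.random() < restart:
            current = rng.choice(pages)
        else:
            current = rng.choice(out_links)
    return {page: count / steps for page, count in visits.items()}


# Hypothetical example: three pages all recommend "hub".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = random_surfer(links)
```

Pages "b" and "c" receive no links, so the surfer reaches them only through restarts and they end up with the lowest scores.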

Maxflow = compute flow from a chosen source to a node

[Figure: flow network with source s and target t]
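A flow-based reputation can be sketched by treating each trust value T(e) as an edge capacity and computing the max flow from the chosen source. The code below uses the standard Edmonds-Karp (BFS augmenting path) algorithm, which is a choice made for this example; the graph, node names, and function name are hypothetical.

```python
from collections import deque


def max_flow(trust, source, target):
    """Max flow from source to target, using trust values as capacities."""
    if source == target:
        return float("inf")
    residual = {}
    for (u, v), capacity in trust.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0.0) + capacity
        residual[v].setdefault(u, 0.0)  # reverse edge for the residual graph
    if source not in residual or target not in residual:
        return 0.0
    total = 0.0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = target
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = target
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical trust graph; the reputation of a node is the flow it can receive from s.
trust = {
    ("s", "a"): 1.0, ("s", "b"): 0.5, ("a", "b"): 0.5,
    ("a", "t"): 0.7, ("b", "t"): 1.0,
}
flow_to_t = max_flow(trust, "s", "t")
flow_to_a = max_flow(trust, "s", "a")
```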

Shortest Path

[Figure: graph with source s and target t]
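The shortest path method can be sketched with Dijkstra's algorithm on edge costs c(e)=1/T(e), so that a lower total cost means a higher reputation. The example graph and the function name `path_costs` are assumptions made for illustration.

```python
import heapq


def path_costs(trust, source):
    """Shortest-path cost from source to every reachable node, with c(e) = 1 / T(e)."""
    adjacency = {}
    for (u, v), value in trust.items():
        if value > 0:  # zero trust means the edge is unusable
            adjacency.setdefault(u, []).append((v, 1.0 / value))
    cost = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > cost.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in adjacency.get(u, []):
            candidate = d + c
            if candidate < cost.get(v, float("inf")):
                cost[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return cost


# Hypothetical trust values; costs are 1, 2, and 4 respectively.
trust = {("s", "a"): 1.0, ("a", "b"): 0.5, ("s", "b"): 0.25}
cost = path_costs(trust, "s")
# Lower cost ranks higher: list the nodes from best to worst.
ranking = sorted((node for node in cost if node != "s"), key=cost.get)
```

Here b is reached more cheaply through a (cost 3) than directly (cost 4), so a outranks b.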

Sybils

• A single “agent” can replicate itself under a variety of pseudonyms.

Sybil Attacks

• Sybils are essentially unavoidable (Douceur 02)

• Sybil clouds can forge trust among each other.

– Using strong cryptography to prevent them is expensive and awkward.

Sybils in Practice

• Web ranking: create a large number of dummy websites and have them all link to each other.

• P2P: create a large number of peers and have them give each other high ratings.

• Ebay: fake transactions with yourself.

• Amazon shopping: post high evaluations of your own products.

Robustness Against Sybils

• PageRank: not robust.

– Empirically, can increase PageRanks dramatically with a few sybils. (more later)

• Max-flow: value robust but not rank robust.

• Shortest path: robust.

Robustness: Pagerank

• PageRank: not robust.

– Create a “flower”
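The “flower” attack can be tried on a toy graph. The sketch below is a minimal illustration under stated assumptions: a standard power-iteration PageRank with damping 0.85, a five-page ring as the honest graph, and a flower in which the target links to each sybil and each sybil links only back to the target. The names `pagerank` and `add_flower` are hypothetical helpers, not the authors' code.

```python
def pagerank(links, damping=0.85, iterations=200):
    """Power-iteration PageRank; links maps every page to its out-links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        updated = {page: (1.0 - damping) / n for page in pages}
        for page, out_links in links.items():
            targets = out_links if out_links else pages  # dangling page: spread evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                updated[target] += share
        rank = updated
    return rank


def add_flower(links, target, petals):
    """Return a copy of the graph in which `target` has created `petals` sybils."""
    attacked = {page: list(out_links) for page, out_links in links.items()}
    for index in range(petals):
        sybil = f"{target}_sybil{index}"
        attacked[sybil] = [target]       # each petal recommends the center
        attacked[target].append(sybil)   # the center links out to each petal
    return attacked


# Honest graph: a ring in which every page starts with the same PageRank.
ring = {"p0": ["p1"], "p1": ["p2"], "p2": ["p3"], "p3": ["p4"], "p4": ["p0"]}
before = pagerank(ring)
after = pagerank(add_flower(ring, "p0", 3))
magnification = after["p0"] / before["p0"]
```

In this toy case three sybils are enough to move p0 from a five-way tie to the top of the ranking.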

Robustness: Maxflow

• Max-flow: Designed for value robustness

– Flow into and out of sybil cloud cannot be changed!

[Figure: flow network with source s, a sybil cloud, and the min cut]

Robustness: Maxflow

• Max-flow: not rank robust

– b is higher ranked than a

[Figure: flow network with nodes a and b; labels 1, 0.5, 0.7, [1.2], [1]; min cut marked]

Robustness: Maxflow

• Max-flow: not rank robust

– a is higher ranked than b

[Figure: flow network with nodes a and b; labels 1, 0.5, 0, [0.5], [1]]

Robustness: Shortest Path

• Shortest path: robust

– a is higher ranked than b

[Figure: graph with nodes a and b; labels c=1, c=3, c=1, [2], [1]]

Robustness: Shortest Path

• Shortest path: robust

– a is higher ranked than b

– a can harm b, but a is already higher ranked than b

– b cannot hurt a, since it is not on the shortest path to a

[Figure: graph with nodes a and b; labels c=1, c=3, c=3, [3], [1]]

Sybilproofness

• Def: A sybil strategy for node i in $G=(V,E)$ is a graph $G'=(V',E')$ and a subset $U' \subseteq V'$ such that collapsing $U'$ into a single node yields G. (T’s are added together)

• Def: f is k-sybilproof if there do not exist a pair of nodes i, j and a sybil strategy for i such that $f(G)_i < f(G)_j$ and $f(G')_r > f(G)_j$ for some $r \in U'$, with $|U'| \le k+1$.

• Def: f is sybilproof if it is k-sybilproof for all $k>0$.

• Key: sybils can only forge recommendations among each other.
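The collapsing condition in the definition of a sybil strategy can be checked mechanically. The sketch below assumes a trust graph stored as a dict from (i, j) pairs to trust values; the helper names `collapse` and `is_sybil_strategy` and the example graphs are hypothetical. Edges inside the sybil set disappear on collapse, and parallel edges have their T values added, as the definition states.

```python
def collapse(trust, sybil_set, name):
    """Merge every node in sybil_set into a single node called `name`."""
    merged = {}
    for (u, v), value in trust.items():
        if u in sybil_set and v in sybil_set:
            continue  # forged trust inside the sybil cloud vanishes
        u = name if u in sybil_set else u
        v = name if v in sybil_set else v
        merged[(u, v)] = merged.get((u, v), 0.0) + value  # T's are added together
    return merged


def is_sybil_strategy(original, attacked, sybil_set, node):
    """True if collapsing sybil_set in `attacked` gives back `original`."""
    return node in sybil_set and collapse(attacked, sybil_set, node) == original


# Original graph G: a trusts i, and i trusts b.
original = {("a", "i"): 2.0, ("i", "b"): 1.0}
# G': node i adds one sybil s1 and the two forge trust in each other.
attacked = {
    ("a", "i"): 1.0, ("a", "s1"): 1.0,
    ("i", "s1"): 5.0, ("s1", "i"): 5.0,
    ("i", "b"): 1.0,
}
valid = is_sybil_strategy(original, attacked, {"i", "s1"}, "i")
# A graph that also forges trust from b is not a valid strategy for i.
forged = dict(attacked)
forged[("b", "s1")] = 4.0
invalid = is_sybil_strategy(original, forged, {"i", "s1"}, "i")
```

The second check fails because the sybils cannot create recommendations from nodes outside their own set.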

Results: Symmetric Reputations

• Def: A reputation function is symmetric if it is covariant under graph isomorphism.

• Theorem: There is no nontrivial symmetric sybilproof mechanism.

– In fact, for any G, any node (except the top one) can improve its ranking via sybils

• Theorem: There is no nontrivial symmetric k-sybilproof mechanism, for any $k \ge 1$.

– (How often this occurs for small k is open.)

Proof (via the butterfly)

[Figure: butterfly-shaped graph; labels j, s, i, G, U’]

• Sybilproofness: by symmetry, $f(G')_j = f(G')_s$

• k-sybilproofness: build G’ one sybil at a time

Results: Non-Symmetric

• Theorem: There exist sybilproof reputation functions (e.g., shortest path).

• Def: Given a root node $s \in V$, let $\mathcal{P}$ be the set of all collections of edge-disjoint paths* from s to i. Let g be a function from paths to reals and $\oplus$ be an (addition-like) operator on the reals.

Results: Non-Symmetric

• Let $f(G)_i = \max_{P \in \mathcal{P}} \bigoplus_{p \in P} g(p)$

• Max flow: $g(p)=\min\{T(e) \mid e \in p\}$, $\oplus = +$

• Shortest path: $g(p)=\min\{T(e) \mid e \in p\}$, $\oplus = \min$

• Other generalizations

– Leaky pipes etc.

Results: Non-Symmetric

• Theorem: f as defined above is value sybilproof, assuming:

– If p’ is an extension of p, then $g(p') < g(p)$.

– $\oplus$ is nondecreasing and g is nondecreasing with respect to T.

– If $p = p' + p''$ then $g(p) = g(p') \oplus g(p'')$

Results: Non-Symmetric

• Theorem: f as defined above is rank sybilproof iff $\oplus = \max$, assuming:

– For any p there exists an extension p’ such that $g(p)=g(p')$.

• I.e., f depends on the maximal path.

Summary (Part I)

• A framework for the analysis of the manipulability of ranking systems.

• Key distinction: rank vs. value

• Result 1: all symmetric ranking systems are manipulable.

• Result 2: “flow based” ranking systems are not value manipulable but are rank manipulable.

• Result 3: “path based” ranking systems are not manipulable.

Part II: Empirical Analysis of PageRank

• (Joint with Alice Cheng)

• (Inspired by Zhang et al. on collusion)

• Stanford web matrix -- ~280k pages.

• Question: How often are a small number of sybils helpful?

• Answer: Surprisingly often!

Value Magnification: 1 sybil

Value Magnification – by # of sybils

Rank as a function of old Rank -- 1-Sybil

Effect of on values

on ranks

Summary of Empirical

• Analytic approximations for these.

• PageRank is quite manipulable

– Especially for low ranked pages

• (but that’s where automated methods are supposed to work!)

Part III: Quantifying the Efficiency of Ranking Mechanisms

• Work in progress – some preliminary results.

• Is FlowRank or PageRank better than PathRank?

Model

• Random graph model (descriptive, not constructive)

• Follow the intuition behind PageRank

– Pages link more to “better pages”

– Better pages are more selective.

– $\Pr(\text{link}) = f(q_i, q_j)$

• Increasing in $q_j$

• FOSD in $q_i$

– Average outdegree = k, ($n \to \infty$)

– (many results have $k \to \infty$, and miss important aspects of ranking.)

Finding “Baddies”

• 2 layer example:

– ½ of the nodes are H and ½ are L

– L’s link uniformly at random

– H’s link to H’s with (relative) probability (1+a) and to L’s with (1-a).

– a=0: random graph

– a=1: two-tiered graph
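The two-layer model can be simulated directly. The sketch below makes several assumptions that the slides do not fix: every node draws exactly k out-links, repeated links are allowed, self-links are redrawn, and an H node picks the H class with probability (1+a)/2 and the L class with probability (1-a)/2 before choosing uniformly within the class. It then ranks by indegree and measures how often an L node outranks an H node. All function names are hypothetical.

```python
import random


def two_layer_graph(n, k, a, seed=0):
    """Return (half, out_links); nodes 0..half-1 are H, the rest are L."""
    rng = random.Random(seed)
    half = n // 2
    high = list(range(half))
    low = list(range(half, n))
    everyone = high + low
    out_links = []
    for i in range(n):
        targets = []
        while len(targets) < k:
            if i < half:
                # H node: relative weights (1+a) for H and (1-a) for L.
                pool = high if rng.random() < (1 + a) / 2 else low
            else:
                pool = everyone  # L node: uniform over all nodes
            j = rng.choice(pool)
            if j != i:
                targets.append(j)
        out_links.append(targets)
    return half, out_links


def indegree_error(n, k, a, seed=0):
    """Fraction of (L, H) pairs in which the L node has strictly larger indegree."""
    half, out_links = two_layer_graph(n, k, a, seed)
    indegree = [0] * n
    for targets in out_links:
        for j in targets:
            indegree[j] += 1
    wrong = sum(
        1
        for low_node in range(half, n)
        for high_node in range(half)
        if indegree[low_node] > indegree[high_node]
    )
    return wrong / (half * (n - half))


error_random = indegree_error(200, 5, 0.0)
error_tiered = indegree_error(200, 5, 1.0)
```

With a=0 the two classes are statistically indistinguishable, so indegree carries no information about quality; with a=1 the H nodes receive extra links and the error rate drops.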

Statistical Inference

• Now, ranking is a problem of statistical inference

– G is a random variable

– r is a statistical estimate of true qualities

– Note: unlike most inference problems we only have a single sample

3 methods

• PageRank

• InRank: rank by indegree

• MLRank: compute a maximum likelihood estimate.

Results

• $\Pr(\text{error}) = \Pr(r_i > r_j \mid q_i < q_j)$

• InRank: difference of Poissons

• PageRank: two stage calculation

– First by quality then statistical manipulations of PageRank equations.

• MLRank: find a subgraph with the maximal number of edges.

– NP complete

– Implemented a greedy algorithm

Results

[Figure: Pr(error) as a function of a for PageRank, InRank, and MLRank]

Results

• InRank is better than PageRank when the graph is close to random; PageRank is better otherwise. (General Theorem)

• Differences can be significant!

• MLRank is significantly better.

Some Intuition

• Case a=0 (Sketch -- ignoring special cases)

• PageRank

$r_i = \epsilon + (1-\epsilon) \sum_{j \in P(i)} \frac{r_j}{|S(j)|}$

– $r_j$’s are iid (in limit)

• InRank

$r_i = \epsilon + (1-\epsilon) \sum_{j \in P(i)} 1$

• Theorem: PageRank is more random.

• (But, also need to consider expected values)

Concluding Comments

• Reputation systems should be designed from requirements and subject to formal validation.

– Ex: What problem does PageRank solve? How well does it do it?

– Ex: Why is Flowrank better than Pathrank? Is it? When and why?

• Aside: fighting link spam

– Results show that most of the proposed methods can be defeated!

– Perhaps they work so well because they are not being used and spammers haven’t tried to defeat them. Endogeneity is important!

Concluding Comments

• Reputation systems are important and deserve formal, careful study!

– Axiomatic analyses.

– Econometric analyses.

• Lots of challenging open problems!
