deal or deceit: detecting cheating in distribution …skai2/files/cikm2014_shu_final.pdfkai shu et...

Post on 26-May-2018

222 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Kai Shu1,2, Ping Luo1

Li Wan2, Peifeng Yin

3, Linpeng Tang

4

Deal or Deceit:

Detecting Cheating in Distribution Channels

2

1

3 4

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

2

Introduction

• Motivation

• Problem Formulation

• Detection Framework

• Experimental results

• Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Introduction

3

Manufacturer

End user

Manufacturer

Partner

End user

Low

price

High

price

Price

Difference

Distribution channel Price difference

Manufacturer

Partner Partner

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Introduction

4

×

Cheating seller Cheating buyer

Detect cheating in distribution channel

×

A typical cheating scenario

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

5

• Introduction

Motivation

• Problem Formulation

• Detection Framework

• Experimental results

• Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Motivation

6

Time series of purchase quantities

Time series of purchase quantities for each partner

Data: from the perspective of manufactory

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Motivation (2)

7

0

20

40

60

80

100

120

140

160

180

200

220

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Purchase quantity

Month

𝑣1 𝑣2 𝑣1 𝑣2 𝑣1 𝑣2

×

×𝑣1 𝑣2

Negative

correlation

The purchase quantities of 𝑣1 and 𝑣2 change collectively

Observation

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

8

• Introduction

• Motivation

Problem Formulation

• Detection Framework

• Experimental results

• Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Problem Formulation

9

𝑥1 𝑥2 𝑥𝑛

𝑣1 𝑣2 𝑣𝑛

Degree of cheating

Input Output

Audit staff

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

10

• Introduction

• Motivation

• Problem Formulation

Detection Framework

• Experimental results

• Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework

11

all purchase

volume sequences

(𝑥𝑖|𝑖 = 1,2, … , 𝑛)

Building Partner

Correlation Graph (PCG)

Partitioning

Ranking the partners by

probabilistic modelRanking of

cheating partners

A ···

B ···

Step 2

Step 3

Step 1

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

12

0

20

40

60

80

100

120

140

160

180

200

220

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Purchase quantity

Month

𝑣1 𝑣2

• Tick-to-tick correspondence

Pearson correlation

• Symmetric measure

𝑣1

𝑣2 ?Cheating seller

Cheating buyer

Pearson 𝑣1, 𝑣2 = Pearson(𝑣2, 𝑣1)

No negative

correlation

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

13

Dynamic time warping

0

20

40

60

80

100

120

140

160

180

200

220

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Purchase quantity

Month

𝑣1 𝑣2 𝑣1 𝑣2 𝑣1 𝑣2

Learn the correspondence relationship of elements between two time

series to minimize the Pearson correlation.

Warping direction

Directed Pearson Correlation (DPC)

𝑟𝑑𝑝𝑐(𝑥1, 𝑥2) ≔ min{1

𝐿

𝑙=1

𝐿

(𝑥1 𝑖𝑙 − 𝑥1𝜎𝑥1

)(𝑥2 𝑗𝑙 − 𝑥2𝜎𝑥2

)}

Where (𝑖𝑙 , 𝑗𝑙) is the element of a warping path 𝑃 and 𝐿 = 𝑃 .

Cheating seller

Cheating buyer

𝑟𝑑𝑝𝑐(𝑥1, 𝑥2) ≠ 𝑟𝑑𝑝𝑐(𝑥2, 𝑥1)

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

14

𝑣𝑖 𝑣𝑗

𝑤𝑖𝑗 = −𝑟𝑑𝑝𝑐(𝑥𝑖 , 𝑥𝑗)

If 𝑤𝑖𝑗 ≥ 𝜂, keep the edge

×

If 𝑤𝑖𝑗 < 𝜂, remove the edge

all purchase

volume sequences

(𝑥𝑖|𝑖 = 1,2, … , 𝑛)

PCG

Step 1

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 2)

15

Cheating Sellers

Cheating Buyers

PCG

Bipartite Graph

A ···

B ···

Cheating Seller Cheating Buyerand Not true

A partner can only be cheating seller or cheating buyer

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 2)

16

Maximize 𝑤𝐴𝐵

Classical graph cut problem (MAX DICUT)

Cheating Sellers

Cheating Buyers

A ···

B ···

𝑤𝐴𝐵

𝑤𝐵𝐴

Maximize 3𝑤𝐴𝐵 −𝑤𝐵𝐴

New graph cut problem

We propose a greedy algorithm to solve this new NP-hard problem.

Solution algorithm

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 3)

17

A ···

B ···

𝑣𝑖(𝛼𝑖 , 𝛽𝑖)

𝑣𝑗(𝛼𝑗 , 𝛽𝑗)

𝑤𝑖𝑗

𝑣𝑖(𝛼𝑖 , 𝛽𝑖)

𝑣𝑗(𝛼𝑗 , 𝛽𝑗)

𝑤𝑖𝑗

Generate the edges and the weights in resultant bipartite graph.

Probabilistic model

Maxmize log 𝑝 𝑤𝑖𝑗 , 𝜽 + 𝜆 ∙ 𝐶(𝜽)

𝜽 is the set of all parameters

Bipartite Graph

Probabilistic model

𝐵𝑒𝑡𝑎 𝑤𝑖𝑗; 𝛼𝑖 , 𝛽𝑗

𝛼𝑖 𝛽𝑗 𝑤𝑖𝑗𝛽𝑗 𝑤𝑖𝑗𝛼𝑖

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 3)

18

𝜋𝑖𝑗 =𝛼𝑖𝛼𝑖 + 𝛽𝑗

𝜋𝑖𝑗 is the expected value of beta distribution with parameters (𝛼𝑖 , 𝛽𝑗)

Probability weight 𝜋𝑖𝑗

Ranking score function

Bipartite Graph

A ···

B ···

Ranking of

cheating partners

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

19

• Introduction

• Motivation

• Problem Formulation

• Detection Framework

Experimental results

• Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results

20

Dataset

Gold Silver All

# Total partners 104 424 528

# Cheating partners 17 85 102

Evaluation measure

For the top-k ranking list, giving a number 𝑘,we

count the number of hits in it.

𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛@𝑘 =#hits in top−k list

𝑘

𝑅𝑒𝑐𝑎𝑙𝑙@𝑘 =#hits in top−k list

𝑀

𝐹1@𝑘 = 2 ∙𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛@𝑘 ∙ 𝑅𝑒𝑐𝑎𝑙𝑙@𝑘

𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛@𝑘 + 𝑅𝑒𝑐𝑎𝑙𝑙@𝑘

Ground truth

k1 2 3 k-1

Precision

Recall

F1 score

k

evalu

ation

hits

AUC

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (1)

21

• DPC improves performance

0

5

10

15

20

25

30

35

40

AU

C_F1

Pearson_Rank Noncut_Rank Noncut_ProbRank Cut_Rank Cut_ProbRank

PCG Partition Ranking

Pearson correlation DPC No Yes Edge weight Probability

Noncut_Rank √ √ √

Noncut_ProbRank √ √ √

Cut_Rank √ √ √

Cut_ProbRank √ √ √

Pearson_Rank √ √ √

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (2)

22

• DPC improves performance

0

5

10

15

20

25

30

35

40

AU

C_F1

Pearson_Rank Noncut_Rank Noncut_ProbRank Cut_Rank Cut_ProbRank

PCG Partition Ranking

Pearson correlation DPC No Yes Edge weight Probability

Noncut_Rank √ √ √

Noncut_ProbRank √ √ √

Cut_Rank √ √ √

Cut_ProbRank √ √ √

Pearson_Rank √ √ √

Noncut_Rank Cut_RankNoncut_ProbRank Cut_ProbRank

• Partitioning improves

performance

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (3)

23

• DPC improves performance

0

5

10

15

20

25

30

35

40

AU

C_F1

Pearson_Rank Noncut_Rank Noncut_ProbRank Cut_Rank Cut_ProbRank

PCG Partition Ranking

Pearson correlation DPC No Yes Edge weight Probability

Noncut_Rank √ √ √

Noncut_ProbRank √ √ √

Cut_Rank √ √ √

Cut_ProbRank √ √ √

Pearson_Rank √ √ √

• Partitioning improves

performance

Noncut_Rank Noncut_ProbRank Cut_Rank Cut_ProbRank

• Probability ranking improves

performance

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (4)

24

• DPC improves performance

0

5

10

15

20

25

30

35

40

AU

C_F1

Pearson_Rank Noncut_Rank Noncut_ProbRank Cut_Rank Cut_ProbRank

PCG Partition Ranking

Pearson correlation DPC No Yes Edge weight Probability

Noncut_Rank √ √ √

Noncut_ProbRank √ √ √

Cut_Rank √ √ √

Cut_ProbRank √ √ √

Pearson_Rank √ √ √

• Partitioning improves

performance

Cut_ProbRank

• Probability ranking improves

performance

• Cut_ProbRank performs best

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (5)

25

Precision Recall F1 score

Cut_ProbRank always performs the best

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

26

• Introduction

• Motivation

• Problem Formulation

• Detection Framework

• Experimental results

Conclusion

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Conclusion

27

The first quantitative work to detect cheating in distribution

channels.

• A new correlation method (i.e. DPC) by Introducing

dynamic time warping technique.

• A new graph cut problem for graph partitioning.

• A probability model for node ranking.

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Thanks

28

top related