the applicability of benford's law fraud detection in the social sciences johannes bauer

18
The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

Upload: horace-rich

Post on 18-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

The Applicability of Benford's Law

Fraud detection in the social sciences

Johannes Bauer

Page 2: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

2 of 18

0 1 2 3 4 5 6 7 8 9

Benford 0,120 0,114 0,109 0,104 0,100 0,097 0,093 0,090 0,088 0,085

0,0

0,1

0,2

0,3

0,4

second significant digit

0 1 2 3 4 5 6 7 8 9

Benford 0,102 0,101 0,101 0,101 0,100 0,100 0,099 0,099 0,099 0,098

0,0

0,1

0,2

0,3

0,4

third significant digit

0 1 2 3 4 5 6 7 8 9

Benford 0,120 0,114 0,109 0,104 0,100 0,097 0,093 0,090 0,088 0,085

0 1 2 3 4 5 6 7 8 9

Benford 0,100 0,100 0,100 0,100 0,100 0,100 0,099 0,099 0,099 0,099

0,0

0,1

0,2

0,3

0,4

fourth significant digit

Benford distribution

k

i

kikk ddDdDP

1

111011 ])10(1[log)...(

1 2 3 4 5 6 7 8 9

Benford 0,301 0,176 0,125 0,097 0,079 0,066 0,058 0,051 0,046

0,0

0,1

0,2

0,3

0,4

first significant digit

Page 3: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

3 of 18

• Paraguay

• Share prices

• The surface of rivers

• House numbers

• Company balance sheets

Two ways to Benford's law

• Multiplications – R.W. Hammering (1970)

• Distributions – Theodor Hill (1995)

Benford distribution

Benford distributed data

Page 4: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

4 of 18

Detecting fraud

Approach:

• Distribution of digits with specific values

• Uncovering of fraudulent data

Page 5: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

5 of 18

First step: What is Benford distributed

Datasource: Kölner Zeitschrift für Soziologie und Sozialforschung Feb.1985 to Mar.2007 with support from Lehrstuhl Braun

N 1.digit 2.digit 3.digit 4.digit

Not standardised regressions 2180 X X X X

Logistic regressions 2251 X X X

T-values 1325 X X

Cox regressions 506 X X

Pseudo r² 239 X X X

Chi-square values 188 X X X

R² 131 X (X)

Page 6: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

6 of 18

First step: Unified distribution of digits

Normaldistribution

Mean: 3 Standard error: 2

Density between 3 and 4 (second digit)

Page 7: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

7 of 18

Research survey LS Braun

Hypothesis: 'The higher the education a person has, the fewer cigarettes he consumes per day'

1. digit: Ho rejected (X²=103.39,df = 8,p = 0.000)

2. digit: Ho rejected (X²=122.59,df = 9,p = 0.000)

Fraudulent regressions coefficients Distribution of the first digit (n= 4621)

0,0

0,1

0,2

0,3

0,4

Benford 0,301 0,176 0,125 0,097 0,079 0,067 0,058 0,051 0,046

Students 0,309 0,173 0,136 0,080 0,072 0,048 0,053 0,061 0,067

1 2 3 4 5 6 7 8 9

Page 8: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

8 of 18

Research survey: 3. and 4. digit

Fraudulent regressions coefficients Distribution of the third digit (n= 4541)

0,0

0,1

0,2

0,3

0,4

Benford 0,10 0,10 0,10 0,10 0,10 0,10 0,09 0,09 0,09 0,09

Studenten 0,09 0,15 0,10 0,14 0,09 0,07 0,07 0,09 0,08 0,07

0 1 2 3 4 5 6 7 8 9

Fraudulent regressions coefficients Distribution of the fourth digit (n= 4378)

0,0

0,1

0,2

0,3

0,4

Benford 0,10 0,10 0,10 0,10 0,10 0,10 0,09 0,09 0,09 0,09

Studenten 0,03 0,18 0,11 0,14 0,09 0,08 0,08 0,09 0,08 0,08

0 1 2 3 4 5 6 7 8 9

Ho rejected(X² = 304.89, df=9, p= 0.000)

Ho rejected(X² = 622.20, df=9, p= 0.000)

Students Students

Page 9: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

9 of 18

Research survey: Individual data

Deviations from the Benford distribution

1. digit 2. digit 3. digit 4. digit

35 40 42 41

47 persons

0.744 0.851 0.893 0.872

absolute

percentage

Page 10: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

10 of 18

Second step: detecting fraud

Approach:

• At what point is fraud recognisable?

Method:

• Random selection of fraudulent digits

• Goodness of fit test used on a Benford distribution

• Logistic regression

Page 11: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

11 of 18

first significant digit

Second step: results

Mean number of cases in order to reject H0 with a probability of 95 %:

1. digit: 989 cases

2. digit: 766 cases

3. digit: 351 cases

4. digit: 138 cases

Page 12: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

12 of 18

first significant digit ~ 50 % fabricated data

2. digit: 3308 cases

3. digit: 1351 cases

4. digit: 585 cases

Second step: results

Mean number of cases in order to reject H0 with a probability of 95 %:

1. digit: 4001 cases

Page 13: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

13 of 18

first significant digit ~ 10 % fabricated data

2. digit: 78883 cases

3. digit: 31266 cases

4. digit: 12592 cases

Second step: results

Mean number of cases in order to reject H0 with a probability of 95 %:

1. digit: 94439 cases

Page 14: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

14 of 18

Second step: Testing the results

first significant digit ~ 100 % fabricated data

47 persons, each with 100 data

1. digit

7 - 8

0.167

absolut

percentage

exp.deviation

1. digit

35

0.744

absolut

percentage

observed

Page 15: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

15 of 18

Second step: Individual data

Mean number of cases in order to reject H0 with a probability of 95 %:

1. digit: 136 cases

2. digit: 102 cases

3. digit: 100 cases

4. digit: 69 cases

first significant digit ~ 100 % fabricated data

Page 16: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

16 of 18

Second step: Individual data

first digit

fourth digit

Page 17: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

17 of 18

Summary of results

Fraud detection with Benford's law – proposals:

• Observation of individual data („The Wisdom of Crowds“ effect)

• Observation of higher digits

• Recording of all possible metric coefficients

• Application of Goodness of fit test, which react more strongly to the sample size (i.e. chi-square g.o.f. test)

• With a very small degree of fraud, the result is strongly dependent on the procedure of the forger

Page 18: The Applicability of Benford's Law Fraud detection in the social sciences Johannes Bauer

18 of 18

Literature

Benford, F. (1938): The Law of Anomalous Numbers, in: Proceedings of the American Philosophical Society, 78(4): 551-572

Diekmann, A. (2007): Not the First Digit! Using Benford’s Law to Detect Fraudulent Scientific Data, in: Journal of Applied Statistics, 34(3): 321-329

Hill, T.P. (1996): A Statistical Derivation of the Significant-Digit Law, in: Statistical Science, 10(4) 354-363

Hammering, R.W. (1970): On the Distribution of Numbers, in: Bell System Technical Journal, 49(8): 1609-1625

Judge, G.; Schechter, L. (2006): Detection problems in survey data using Benford's Law: Preprint.

Nigrini, M. (1999): I've Got Your Number. How a Mathematical Phenomenon Can Help CPAs Uncover Fraud and Other Irregularities, in: Journal of Accountancy, May: 79-83

Pietronero, L.; Tossati, E.; Tossati, V., Vespignani, A. (2001): Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf, in: Physica A, 293(1): 297-304