data diagnostics using second order tests of benford’s law · pdf filepositives and can...

DATA DIAGNOSTICS USING SECOND ORDER TESTS OF BENFORD’S LAW

by

Mark J. Nigrini Saint Michael’s College

Department of Business Administration and Accounting Colchester, Vermont, 05439

[email protected]

and

Steven J. Miller Brown University

Department of Mathematics Providence, RI 02912

[email protected]

June 1, 2006

Copyright © 2005-2006 by Mark J. Nigrini and Steven J. Miller. All rights reserved. We wish to thank the management of the restaurant company for allowing the use of their corporate data in the case study and the simulations. In accordance with their requests we have not shown the actual years that the data relates to and we have approximated some of the descriptive statistics. We also wish to thank the AJPT reviewers and the editor, Dan Simunic, for their excellent comments.

1


Summary SAS No. 99 requires auditors to use analytical procedures to identify the existence of unusual transactions, events, and trends. This use of analytical procedures and the effective use of the computer on transaction level data is an efficient means for auditors to partially fulfill their duties with regards to the detection of fraud and material misstatements.

Benford’s Law gives the expected patterns of the digits in numerical data, and it has been advocated as a test for the authenticity and reliability of transaction level accounting data. To date these tests have been tests of the first digits and first-two digits of accounting data. This paper proposes a new second order test of Benford’s Law that has the potential to provide new insights into accounting data. The second order test is an analysis of the digit frequencies of the differences between the ordered (ranked) values in a data set. The digit frequencies of these differences approximates the frequencies of Benford’s Law for most distributions of the original data. The second order tests generate few false positives and can detect rounded data, data generated by linear regression, data generated by using the inverse function of a known distribution, and inaccurate ordering. These conditions would not be easily detectable using traditional analytical procedures. Keywords: Benford’s Law, fraud detection, analytical procedures, audit tests. Data availability: The accounts payable data of the first case study are available from the authors.

2


INTRODUCTION Statement on Auditing Standards No. 99, Consideration of Fraud in a Financial Statement Audit (AICPA, 2002), establishes standards and provides guidance to auditors with respect to the detection of material misstatements caused by error or fraud. Auditors are required to explicitly consider the potential for material misstatements due to fraud. Analytical procedures applied to highly aggregated data can however only provide a broad initial indication of the existence of a material misstatement. Computer-assisted audit techniques could be used to inter alia test an entire population instead of a sample, and may be useful in identifying unusual or unexpected relationships or transactions. The effective use of the computer on transaction level data could present an efficient means for auditors to partially fulfill their responsibilities with regards to fraud and material misstatements. Digital analysis based on Benford’s Law is a computer-assisted audit technique that is applied to an entire population of transactional data. Benford’s Law is named after Frank Benford, who formulated the expected frequencies for tabulated data (Benford, 1938). Benford’s Law was introduced to the auditing literature in Nigrini and Mittermaier (1997), and researchers have since used these digit patterns to detect data anomalies by testing either the first or first-two digit patterns of transactional data or reported statistics (see Wallace (2002) and Moore and Benjamin (2004)). Benford’s Law routines are now included in both IDEA and ACL, and Cleary and Thibodeau (2005) recently critiqued the diagnostic statistics provided by these software programs.

This paper introduces a second order test related to Benford’s Law that could be used by auditors for the detection of fraud, errors, and fabricated data. This second order test diagnoses the internal structure of the data. It is derived from the fact that, for many different types of data sets, if the observations are sorted from smallest to largest (ordered), then the digits of the differences between the successive numbers are expected to conform perfectly or closely (called Almost Benford in Miller and Nigrini, 2006) to Benford’s Law. This second order test is demonstrated using three cases. The first case uses data that has been previously reviewed in the literature and which has a reasonable fit to Benford’s Law. The second case uses normally distributed annual revenue and cost data from a large restaurant chain, and the results are that the largest deviations from the expected results are for the data sets that are most likely to include errors. The third case uses the same revenue and cost data source with the data seeded with errors. The results showed that the second order test can detect (a) rounded data, (b) the fraudulent substitution of regression output for the real numbers, (c) fraudulent substitution using the inverse of a distribution function, and (d) inaccurate ranking in data that is assumed to be ordered from smallest to largest. These error conditions would not have been detectable by comparing the mean and standard deviation of the original data with the fraudulently seeded data. The second order test gives few, if any, false positives in that if the results are not Almost Benford, then the data does exhibit an inherent internal structural inconsistency.

The next section of this paper discusses the second order test of Benford’s Law. Thereafter the accounting studies are reviewed. A discussion section follows in which an agenda for further research is developed, and a concluding section summarizes the paper.

3

A SECOND ORDER TEST OF BENFORD’S LAW This section of the paper first describes Benford’s Law and then presents a result that forms the basis of a new test of Benford’s Law that could be performed in addition to the usual first digit or first-two digits tests. This new test is called the second order test of Benford’s Law.

Benford’s Law gives the expected patterns of the digits in tabulated data. The law is named after Frank Benford, who noticed that the first few pages of his tables of common logarithms were more worn than the later pages (Benford, 1938). From this he hypothesized that people were looking up the logs of numbers with low first digits (such as 1, 2, and 3) more often than the logs of numbers with high first digits (such as 7, 8, and 9) because there were more numbers in the world with logarithms with low first digits. The first digit of a number is the leftmost digit with the qualification that 0 is inadmissible as a first digit. The first digits of 2,204, 0.0025 and 20 million are all equal to 2. Benford empirically tested the first digits of 20 diverse lists of numbers with 20,229 records and noticed a marked skewness in favor of the low digits that approximated a geometric pattern. He then made some assumptions related to the geometric pattern of natural phenomena (despite the fact that some of his data sets were not related to natural phenomena) and formulated the expected patterns for the digits in tabulated data. These expected frequencies are shown below with D1 representing the first digit, and D1D2 representing the first-two digits of a number:

P(D1=d1)= log(1 + 1/ d1) d1 ∈ {1, 2, ... ,9}. (1) P(D1D2=d1d2)= log(1 + 1/d1d2) d1d2 ∈ {10, 11, 12, ... , 99}; (2)

where P indicates the probability of observing the event in parentheses. For example, the expected probability of the first-two digits 10 is 4.14 percent (log(1 + 1/10)). All logs in this paper are to the base 10. Durtschi, Hillison, and Pacini (2004) review the types of accounting data that are likely to conform to Benford’s Law and the conditions under which a “Benford Analysis” is likely to be useful. The use of Benford’s Law as a test of data authenticity has not been limited to internal audit and the attestation functions. For example, Hassan (2003) advocates its use in general managerial settings, and Hoyle et. al. (2002) apply Benford’s Law to biological findings. The mathematical theory supporting Benford’s Law is still continuously evolving. Examples include Berger, Bunimovich, and Hill (2005) who show that many dynamical systems follow Benford’s Law including most power, exponential, and rational functions, linearly-dominated systems, and non-autonomous dynamical systems. Kontorovich and Miller (2005) show the relationship between Benford’s Law and many functions from number theory. Work by Berger and Hill (2006) shows the relationship between Benford’s Law and Newton’s Method for finding the roots of a function where the iterates converge to a root. They then evaluate the effect of numerical algorithms based on Newton’s Method having a biased roundoff error due to the bias towards the lower digits of Benford’s Law. Recent mathematical papers have shown a widespread applicability of Benford’s Law, yet the tests used by auditors are the unchanged same tests advocated in the first auditing paper on Benford’s Law by Nigrini and Mittermaier (1997).

4

A set of numbers that closely conforms to Benford’s Law is called a Benford Set in Nigrini (2000, 12). When describing the conformity of data to Benford’s Law, other prior work has used the term strong Benford sequence or a weak Benford sequence (see Raimi 1976, 525), or as a strict Benford sequence (see Berger, Bunimovich, and Hill, 2005). The term Benford Set is preferred because auditors can relate to a set of transactional data from an audit cycle. The mathematical papers cited deal with sequences of numbers on the positive real line and auditors have little need to consider these mathematical phenomena when dealing with transactional data. The link between a geometric sequence and a Benford Set is well known in the literature and is discussed in Raimi (1976). The link was also evident to Benford who titled Part II of his paper “Geometric Basis of the Law” and declared that “Nature counts geometrically and builds and functions accordingly” (Benford 1938, 563). Raimi (1976) relaxes the tight restriction that the sequence should be perfectly geometric, and states that asymptotically (a close approximation to) geometric sequences will also produce a Benford Set. Raimi still further relaxes the geometric requirement and notes that “the interleaving of a finite number of geometric sequences” will produce a strong Benford sequence. A mixture of approximate geometric sequences will therefore also produce a Benford Set. This is stated as a theorem with a proof in Leemis, Schmeiser, and Evans (2000, 5). A geometric sequence can be written as:

Sn= arn-1 (with n = 1, 2, 3, …, N). (3)

where a is the first element of the sequence, and r is the ratio of the (n+1)th element divided by the nth element. A geometric sequence with N elements will have n spanning the range 1, 2, 3, …, N. In a graph of a geometric sequence, the rank (1, 2, 3, …, N) is shown on the X-axis, and the heights are arn-1. When creating such a sequence for the purposes of simulating a Benford Set, the value of r that would yield a geometric sequence with N elements over [a, b] is as follows,

r = 10(log(b) - log(a)) / (N-1) (4)

where a and b are the upper bound and lower bound of the geometric sequence. The notation [a, b] means that the range includes both the lower bound a and the upper bound b. The digit frequencies of a geometric sequence will form a Benford Set if two requirements are met. First, N should be reasonably large and this vague requirement of being “large” is because even a perfect geometric sequence with (say) 1,000 records cannot fit Benford’s Law perfectly. For example, for the first-two digits from 90 to 99 the expected proportions range from 0.0044 to 0.0048. Since any actual count must be an integer, it means that the actual counts (probably either 4 or 5) will translate to actual proportions of either 0.004 or 0.005. As the number of elements in the geometric sequence increases it becomes more possible for the actual proportions to tend towards the exact expected proportions of Benford’s Law. Another motivation for applying a first-two digits test only to data sets with N ≥ 1,000 is the requirement of the chi-square test that all cells have an expected value of about 5. Second, the log(b) – log(a) term in equation (4) should be an integer value. The geometric sequence needs to span a large enough range to allow each of the possible first digits to occur with the expected frequency of Benford’s

5

Law. A geometric sequence over the range [40, 48] will have all first digits equal to 4, and a geometric sequence over the range [10, 92] will be clipped short with no numbers beginning with 93 to 99. This fact is confirmed in Leemis, Schmeiser, and Evans (2000, 3) who state that,

Let W ~ U(a, b) where a and b are real numbers satisfying a < b. If the interval (10a, 10b) covers an integer number of orders of magnitude, then the first significant digit of the random variable T = 10W satisfies Benford’s Law exactly.

What is meant by the above is that it is the probability distribution of all the digits of the possible

values of T that forms a Benford Set. T is a random variable and just one number cannot be “Benford.” So if log(b) –log(a) (from equation (4)) is an integer, and the logarithms are equidistributed, then the exponentiated numbers follow Benford’s Law. Diaconis (1976) provides an early proof of this equivalence. Kontorovich and Miller (2005) and Lagarias and Soundararajan (2006) show some recent results using the distribution of the mantissas of the logs to show conformity to Benford’s Law. In these papers the mantissas of the logs (the fractional parts) are referred to as the logarithm modulo 1. The algebra below shows that the differences between the successive elements of a geometric sequence give a second geometric sequence of the form,

S’n = arn - arn-1 (with n = 1, 2, 3, …, N-1) (5)

= a(r-1) * rn-1.

As the elements of this new sequence form a geometric series, the distribution of these digits will also obey Benford’s Law. Thus the (N-1) differences form a Benford Set. The new second order test of Benford’s Law is derived from the following set of heuristics related to the digit patterns of the differences between the elements of ordered data:

1. If the data comprises a single geometric sequence of N elements conforming to Benford’s Law, then the N-1 differences between the ordered (ranked) elements of such a data set gives a second data set which also conforms to Benford’s Law.

2. If the data comprises N non-discrete random variables drawn from any continuous distribution with a smooth density function (e.g., the uniform, triangular, normal, gamma or weibull distributions) then the digit patterns of the N-1 differences between the ordered elements will exhibit Almost Benford behavior. Almost Benford behavior means that the digit patterns will conform closely, but not exactly, to Benford’s Law and this behavior will persist even with N tending to infinity. The deviations are small and quantifiable in terms of the underlying distribution (see Miller and Nigrini 2006).

3. Counter examples exist to the above two remarks. The instances where the differences between the ordered elements do not exhibit Benford or Almost Benford behavior are however expected to occur with a low enough frequency in real world data to merit a further investigation into the authenticity of the data in any data interrogation exercise.

6

An example of a counter example to (1) and (2) above occurs when the data comprises two overlapping and interleaved geometric sequences. In this case the differences between the ordered elements will usually not form a second Benford Set. Such a sequence is shown in Figure 1.

Insert Figure 1 about here In Figure 1 the first geometric sequence with N1=10,000 spans the half-open [30,300) interval and the second geometric sequence with N2=10,000 spans the [10, 100) interval. The combined sequence spans the range [10, 300) and has 20,000 elements (N1 + N2). The two sequences overlap on the [30, 100) interval and it is this range that causes the differences between the elements to diverge from Benford’s Law. The extent of the divergence depends on (a) the extent to which the two data sets overlap, and (b) the relative number of records (N) in each data set. The digit frequencies of the differences between the numbers in the combined sequence are shown in Figure 2.

Insert Figure 2 about here

The digit frequencies of the source data (N1 and N2) both individually and combined (appended) all conform perfectly to Benford’s Law. The differences between the ordered elements of the two geometric sequences also form a Benford Set. It is only when the two sequences are interleaved that the (N1 + N2 –1) differences do not conform to Benford’s Law.

Miller and Nigrini (2006) show that the digit patterns of the differences between the ordered elements of any “nice” distribution give a distribution that is close to Benford’s Law. This behavior is called Almost Benford behavior. Explicitly, they show the following:

Theorem (Miller-Nigrini 2006): Let X1, …, XN be independent identically distributed random

variables from a probability density p(x) taking on values x1, …, xN. Let y1, …, yN be the xi’s in increasing order. Assume p(x) is continuous with convergent second order Taylor series expansion for almost all x, and p’(x), p’’(x) are bounded independent of x for almost all x. Then as N tends to infinity the distribution of digits of the differences between adjacent observations (yi+1 – yi ) is close to Benford’s Law (and the small deviations from Benford’s Law are a function of the underlying distribution p(x)).

The above result leads naturally to the following procedure: Second Order Benford Test: Let x1, …, xN be a data set, and let y1, …, yN be the xi’s in increasing

order. Then for many natural data sets, for large N, the digits of the differences between adjacent observations (yi+1 – yi ) is close to Benford’s Law (and the small deviations from Benford’s Law are quantifiable in terms of the underlying distribution of the data set).

To demonstrate this Almost Benford behavior the data from four distributions were simulated.

7


Panel A of Figure 3 shows a histogram of data from a normal distribution with a mean of 500 and a standard deviation of 100. Panel B simulates a uniform distribution over the [0,1000) interval while Panel C simulates a triangular distribution with a lower endpoint of 0 and an upper endpoint of 1,000 and a mode of 500. Panel D simulates a Gamma distribution with a shape parameter of 2.5 and a scale parameter of 500. Each of the data sets had N=20,000 and the simulations were done using Minitab.

The distributions in Figure 3 were chosen so as to have a mixture of density functions with positive, zero, and negative slopes as well as a combination of linear and convex and concave sections in the density functions. The scale parameters (the means) have no effect on the differences between the ordered observations. The shape parameters (the standard deviations) do impact the sizes of the differences, as do the number of records (N). The data in each case was ordered (ranked from smallest to largest) and the first-two digits of the differences are shown in Figure 4.

Insert Figure 4 about here In each panel in Figure 4 the first-two digit frequencies of the differences between the ordered elements show a close approximation to Benford’s Law despite the fact that the densities shown in Figure 3 have different shapes. Almost Benford means that in one or two sections of the graph the actual proportions will tend to be less than those of Benford’s Law, and in one or two sections the actual proportions will tend to exceed those of Benford’s Law (by a quantifiable margin which depends mildly on the underlying distribution). If the simulations were repeated and the results aggregated, then these “over” and “under” sections would be easier to see. Such repeated simulations were not performed because auditors will usually only deal with one somewhat noisy data set at a time. The differences between the ordered data elements of random variables drawn from any nice distribution will exhibit Almost Benford behavior, but they will seldom conform perfectly to Benford’s Law even with N tending to infinity (Miller and Nigrini, 2006). For data testing purposes, however, the important fact is that these differences should be close to Benford’s Law for most data sets, and therefore an analysis of the digits of the differences can be used to test data integrity.

8

CASE STUDIES The utility of the second order test of Benford’s Law is illustrated using three cases. The first case shows a result for transaction level accounting data. The second case shows the findings from revenue and expense data that are potentially biased and erroneous. The third case shows the results of intentionally seeding accounting data with errors. Corporate Accounts Payable Data The data used for this case study is the accounts payables data set from Drake and Nigrini (2000). This data set contains the invoice date, invoice amount, and vendor numbers from an accounts payable system of a company in the software business. The data has of 38,176 records of which 36,515 invoices have dollar amounts greater than or equal to $10.00. It has become common practice to delete numbers less than $10.00 when testing against Benford’s Law because (a) these numbers might not have an explicit first-two digits or an explicit second digit (e.g., $0.01 or $6), and (b) these numbers are usually immaterial from an audit perspective. The expected results of Benford’s Law are usually not affected by a deletion of numbers below an integer power of 10.

Insert Figure 5 about here The graph of the first-two digits of the dollar amounts of the invoices is shown in Panel A of Figure 5. The fit of the actual proportions and those of Benford’s Law is visually appealing. For most of the higher first-two digit combinations (50 and higher) the difference between the actual and expected proportions is only a small percentage. The calculated chi-square statistic is 773.3 which exceeds the critical value of 112.02 at α = 0.05 with 89 degrees of freedom, and indicates that the null hypothesis of conformity to Benford’s Law should be rejected. The high calculated chi-square statistic is due in part to N being equal to 36,515. All things being equal, a data set with a large N will have a larger calculated chi-square statistic than one with a smaller N if the assumed distribution is slightly off. Thus while the data may be approximately Benford, N is large enough so that small deviations are detectable. The mean absolute deviation of 0.0012 indicates that the average deviation from Benford’s Law is about one-tenth of one percent, which is normal for corporate accounts payable data. The deviations are comparable to those of the accounts payable data analyzed in Nigrini and Mittermaier (1997) and the data set would pass a Benford’s Law test if this analytical procedure were used as a high-level reasonableness test. The results of the second order test are shown in Panel B of Figure 5. Differences of zero (which occur when the same number is duplicated in the list of ordered records) are not shown on the graph. The ordered differences gave 23,777 nonzero records indicating that there were 12,737 differences that were “lost” due to their being equal to zero. The second-order graph seems to have two different functions. The first Benford-like function applies to the first-two digits of 10, 20, 30, … , 90, and a second Benford-like function applies to the remaining first-two digit combinations (11 to 19, 21 to 29, …., 91 to 99).

The data set had 26,067 records with an Amount field from $10.00 to $999.99. Of the 26,066 differences fully 10,320 (39.6 percent) of these were equal to zero. For the nonzero differences there were 3,487 differences (13.4 percent) of 0.01 which are shown on the graph as having first-two digits of

9

10 (because the number could be written as 0.010) and 2,475 (9.5 percent) differences of only 0.02. The reason for the spikes at 10, 20, … , 90 is therefore that the numbers are tightly packed in the $10.00 to $999.99 interval.

The mathematical reason for the systematic spikes is that the data set does not comprise numbers from a continuous distribution. Currency amounts can only differ by multiples of $0.01. The results in Panel B of Figure 5 are because of the density of the numbers over a short interval and the fact that the numbers are restricted to 100 evenly spaced fractions after the decimal point. Neither of these conclusions could have been drawn from an analysis of the first and first-two digits of the original data. These Panel B patterns should occur with any discrete data (e.g., populations of counties) and the size of the spikes at 10, 20, …, 90 are a function of both the size of the data set and the range. For a larger N and a smaller range the pattern will be even more pronounced.

10

Franchisor Restaurant Data This case study shows the results of an analysis of accounting data from a large chain of restaurants. The company franchises about 5,000 restaurants and the franchisees report their sales monthly and pay their franchise fee based on these reported sales numbers. The data analyzed included the sales numbers and the food and supplies (hereafter food) purchases by the franchisees from the franchisor corporation. The annual total sales and food purchases for each location were recorded in two tables, one table for each of two successive years, namely 200t and 200u. The first test was an analysis of the sales numbers for 200t that were approximately normally distributed with a mean of $1 million and a standard deviation of $400,000. The results of the second order test are shown in Panel A of Figure 6.

Insert Figure 6 about here The second order test shows a reasonable fit when examined visually, but the calculated chi-square statistic of 326.9 exceeds the critical value of 112.02 of the chi-squared distribution with 89 degrees of freedom and α = 0.05. The chi-square test therefore calls for a rejection of the null hypothesis that the differences data conforms to Benford’s Law. The graph shows a tendency to spike at multiples of 10 but these spikes are smaller in size than those for the accounts payable numbers. The tendency to spike at the multiples of 10 was because the data was usually in whole dollars and there were cases where the differences amounted to an integer less than $10. The tendency for the unders to be clustered from 20 to 89 (not immediately apparent from the first-two digits graph but noticeable on a first digits graph) is symptomatic of Almost Benford behavior and is consistent with the fact that the original data is normally distributed.

The next test was a second order test of the normally distributed food cost numbers (mean of $300,000 and a standard deviation of $100,000) and the results are shown in Panel B of Figure 6. The fit is visually appealing, and with a calculated chi-square statistic of 108.2 the null hypothesis that the data conforms to Benford’s Law is not rejected at α = 0.05 with 89 degrees of freedom. There is no tendency to spike at multiples of 10 because the original data was in dollars and cents and the relatively large standard deviation produced a large enough spread between the differences to avoid spikes at 10, 20, … 90.

The final test was an analysis of the food cost proportions. For each restaurant this was equal to the sum of the annual food costs divided by the sum of the annual sales. This same metric was used internally by the franchisor to evaluate the reasonableness of the reported sales by the franchisees. The proportions were calculated and the results showed that the data was approximately normally distributed with a mean of about 0.30 and a standard deviation of about 0.04. The second order test results shown in Panel C of Figure 6 gave a result that conformed to Benford’s Law. The calculated chi-square statistic was 79.6 which was less than the critical value of 112.02 for 89 degrees of freedom and α = 0.05. The second order tests followed the Almost Benford behavior described in Nigrini and Miller (2006) in that the digit patterns of the differences approximated those of Benford’s Law. The weakest fit to Benford’s Law was for the reported sales data and the closest fit was for the calculated food cost proportions. Coincidentally the most error prone data was the reported sales data since these were mainly

11

self-reported numbers that were the basis for the franchisee fees payable. The second most error prone data was the food cost data due mainly to allocation errors which would occur when a franchisee that owned multiple locations purchased food for one location and then redistributed the food to the members of the group. Food cost omissions would occur when the franchisee purchased food directly from an outside supplier. The least error prone data set was the calculated food cost proportions because this was simply an arithmetic calculation (food costs / sales) for each restaurant. The calculation was arithmetically perfect even though errors were possible in both the numerator and denominator. The fit of the second order tests to Benford’s Law was weakest for the most error prone data and strongest for the least error prone data. In contrast to the usual Benford’s Law tests, there is no real value to investigating any spikes. A difference (with say first-two digits of 48) represents the difference between two successive elements in the ordered data, and it would be difficult to formulate a rule to say that either the first or the second of these two numbers appeared to be fraudulent or erroneous. Franchisor Restaurant Data Seeded with Errors The data for this case study was the restaurant data for revenues and food costs for the immediately following year, called 200u. These accounting numbers were seeded with errors to evaluate the types of errors or anomalies might be detected by the second order tests. Prior to running the second order tests the 200u data was compared to the 200t data. A comparison with similar prior-period data is a valid analytical procedure although these tests are usually performed on account balances and ratios rather than the data diagnostics of a set of transactional data. The second order test on the 200u sales numbers showed a graph with a similar profile to that of 200t and a comparable calculated chi-square statistic of 378.7. The same applied to the food costs which had a comparable calculated chi-square statistic of 138.7 and the food cost proportions which had a comparable calculated chi-square statistic of 112.6. When compared to the critical value of 112.02 for 89 degrees of freedom and α = 0.05, the results for 200u were comparable to those of 200t with the same goodness-of-fit ranking, although the calculated chi-square statistics in each case were somewhat higher.


The first simulation was to round the sales numbers to the nearest $10. In these simulations the goal was to create a second data set that closely mimicked the mean and standard deviation of the original data. A comparison of the mean and standard deviation of the manipulated 200u data with the 200t data would then not show anything unusual since these values were approximately equal for both years. We also imagined a situation where the auditor would not print and visually scan the transactional data for unusual items. Scanning the transactions is especially difficult with ERP (Enterprise Resource Planning) systems because access to the data tables is restricted and also requires specific knowledge of the proprietary query tools of the system. The use of rounded numbers with amounts rounded to the nearest dollar is common in ERP systems. A situation where all numbers are multiples of $10 would either mean that the rounding function was faulty or that the data was subject to a systematic error. Systematic errors are a known risk in ERP systems. The results of performing the second order test on the rounded sales

12

numbers are shown in Panel A of Figure 7. The results show the familiar spiked graph with a Benford-like pattern for the multiples of 10 and a sharply downward skewed series for the other digit combinations. The calculated chi-square statistic was 7,755.0 which when compared to the critical value of 112.02 for 89 degrees of freedom and α = 0.05 indicated a severe level of non-conformity to Benford’s Law. The second order test was therefore able to detect the rounded sales numbers. The second simulation was to order the 200u sales data and then to regress the sales numbers against the rank (sales = Y and rank = X). The regression equation was 375,000 + 143*Rank with an R-squared of about 0.85. The mean of the fitted values was equal to the mean of the original amounts but there was a small reduction in the standard deviation. The situation envisioned was that of a client fraudulently altering the numbers so that they would give good results to an auditor using a linear regression model for reasonableness tests. The substitution of the fitted values produced a data set where all the calculated differences between the fitted values were equal to 143.295 and the second order test results in Panel B of Figure 7 showed that all the differences had first-two digits equal to 14. The calculated chi-square value of 259,965.2 which when compared to the critical value of 112.02 for 89 degrees of freedom and α = 0.05 signaled an extreme level of nonconformity to Benford’s Law. The third simulation was to replace the original sales numbers with numbers from a normal distribution with a mean and standard deviation equal to that of the original numbers. The result would be a new set of sales numbers with a perfect fit to the normal distribution. Panel C of Figure 7 shows a histogram of the original data and the fitted values as a line. The left side of the graph is truncated and shows the count for all amounts less than $50,000. The situation envisioned was where the client substituted fitted values from a normal distribution for the original values in the belief that this would produce good results from any statistical technique that assumed that the data are normally distributed. Excel’s NORMINV function was used to generate the fitted values with parameters Rank/N, the original mean, and the original standard deviation; that is, N uniformly spaced numbers were chosen on [0,1] and then we evaluated the inverse cumulative distribution function at these points to simulate N points on the normal distribution (thus for the standard normal .025 would correspond to -1.96, .5 would correspond to 0 and .975 would correspond to 1.96). The results of the second order test in Panel D of Figure 7 shows that the patterns of the digits of the differences has a smooth pattern sharply decreasing to the right of 11. There is no fit to Benford’s Law as is evidenced by a calculated chi-square statistic of 19,932 when compared to the critical value of 112.02 for 89 degrees of freedom and α = 0.05. The fourth simulation was to perform the second order test on data that was not ranked in ascending order. The situation envisioned here was that where the auditor was presented with data that should have a natural ranking. This could be sales numbers for locations ranked by annual payroll numbers ascending, or number of votes cast in precincts ranked from the earliest reporting to the last to report, or miles driven by the taxis in a fleet ranked by gasoline usage in gallons. Here the auditor might expect the variable of interest to be ranked and the second order test could be used as a test to see whether the data is actually ranked on the variable of interest.


13

In Figure 8 the first digit results (as opposed to first-two digits) are shown because it is easier to see the systematic pattern of the deviations. The first method was to randomize the sales numbers and the results of the second order test are shown in Panel A. The graph shows a decreasing frequency from right to left for the digits, but the skewness is less pronounced than that of Benford’s Law. The calculated chi-square statistic is 252.5 which exceeds the critical value of 15.5 (8 degrees of freedom and α=0.05) by a wide margin. The null hypothesis that the data conforms to Benford’s Law is therefore rejected. The second ranking method was to sort the sales numbers by the ranking of the food costs numbers. The second order test results are shown in Panel B of Figure 8. The graph shows decreasing counts from right to left but the shape is not the same as that of Benford’s Law. The calculated chi-square statistic is 230.1 calling for a rejection of the null hypothesis since it exceeds the critical value of 15.5 (8 degrees of freedom and α=0.05). If the data were correctly ranked, the graph (not shown) would have a visually appealing close fit with a Mean Absolute Deviation of 0.006 and a calculated chi-square statistic of 38.5. Because the expected result is Almost Benford it is conservative to compare the calculated statistic to a cutoff value of 15.5. The correctly ranked data would have the null hypothesis of conformity rejected but it would be a near miss. This result is interesting because the fit to Benford’s Law deteriorated significantly with the “incorrect” ranking even though the correlation between the correctly ranked and incorrectly ranked sales numbers was 0.91. The third ranking method was to rank the sales by opening date with the first record being the first restaurant opened and the last record was for the most recently opened restaurant. Since this ranking was from the most established location to the newest location we might expect this ranking to be from the highest sales to the lowest sales. The correlation between annual sales and restaurant number was –0.173 indicating a mild disposition towards the lower numbers (oldest restaurants) having higher sales numbers. The results of the second order test are shown in Panel C of Figure 8. The results are similar to that of Panel A and the calculated chi-square statistic of 201.5 exceeds the critical value of 15.5 (8 degrees of freedom and α=0.05) and again calls for a clear rejection of the null hypothesis of conformity to Benford’s Law. The fourth ranking simulation was to sort the sales numbers by the food cost proportions (food costs / sum of sales). The expectation would be that those restaurants with higher sales would have lower proportions because they might be able to better manage food waste. The correlation between the food cost proportions and sales numbers was –0.179 which indicated a mild disposition for this to be the case. The results of the second order test are shown in Panel D of Figure 8. The digit pattern is similar to those of Panels A and C and the calculated chi-square statistic of 228.7 exceeds the critical value of 15.5 (8 degrees of freedom and α=0.05) and also calls for a rejection of the null hypothesis of conformity to Benford’s Law.

The overall results of the ranking simulations are that for those cases where the ranking was strongly affected (Panels A, C, and D) that the digit patterns were similar with chi-square statistics in excess of 200 and a less pronounced skewness than that of Benford’s Law. Where the ranking was not so significantly disturbed there was a different pattern but still with a calculated chi-square statistic in excess of 200. The second order test correctly signaled that the data was not correctly ranked.

14

DISCUSSION The new second order test of Benford’s Law analyses the digit patterns of the differences between the ordered (ranked) values of a data set. In most cases the digit frequencies of the differences will closely follow Benford’s law. The usual first digit and first-two digit tests are usually only of value on data that is expected to follow Benford’s law. In contrast, the second order test can be performed on any data set. Irrespective of the underlying distribution of the data, provided that it can be described as a “nice” distribution, the second order test is expected to yield digit frequencies closely approximating those of Benford’s Law.

The auditing literature, starting with Nigrini and Mittermaier (1997), has only advocated the use of the first digit and first-two digits tests for data sets that are expected to follow Benford’s Law. Deviations from the expected patterns are either explained by an incorrect expectation (i.e., the data should not have been expected to follow Benford’s Law) or the spikes (excesses) are red flags that should be investigated further. False positives arise when an investigation of a spike turns up a reasonable explanation for the spike (e.g., the frequent refunding of a customer deposit of $50). In contrast to the frequent occurrence of false positives generated by the usual Benford’s Law tests, this second order test could actually return compliant results even for data sets with errors such as underreported sales by restaurant franchisees. Deviations from the Almost Benford pattern (Miller and Nigrini, 2006) would in most cases signal an internal inconsistency in the data. At this stage of the development of the theory the evaluation of the results is somewhat qualitative (since the expectation is that the second order test generates something close to but not equal to Benford) with non-Benford behavior indicating that additional audit tests should be considered.

Analytical procedures involve a comparison of ratios and balances, and serve as reasonableness tests. These tests are required in the planning stages of the audit and in the completion phase and may also be performed in the testing phase of the audit. Both the usual Benford's Law tests and the second order tests search for internal inconsistencies in the patterns of the numbers making them a unique type of analytical procedure. In contrast to the usual analytical procedures where the deviation from the expected is evaluated based on subjectively determined materiality and tolerable misstatement levels, the Benford’s Law tests can be evaluated using objective means such as the chi-square statistic. The only caveat is that for the second order test the result is expected to be Almost Benford, and a test statistic that is therefore on the cusp of rejecting the null hypothesis might still be acceptable as being Almost Benford. In addition to detecting internal consistencies, the second order tests could provide the user with an increased understanding and insights into the data under review. This ability to provide additional insights should impact detection risk favorably, especially for those cycles most prone to a larger error counts such as the inventory and warehousing cycle and the sales and collection cycle. The simulations in this paper focused on seeding the data with rounded numbers and substituting “neat” regression output and data from the inverse of the normal distribution for the “real” data. The second set of simulations concentrated on the evaluation of the digit patterns of the second order tests with incorrectly ranked (sorted) data. In all cases the results showed a deviation from the Almost Benford pattern. We used sales and cost data that would be similar to transactional data that would be encountered in the testing phase of the audit. We also interpreted the term auditing broadly and saw it as including all

15

the various types of attestation activities carried out in the private and public sectors. The second order tests could therefore be used in a wide variety of cases and circumstances, and it could be used in both the controls evaluation phase of the audit, or as a testing procedure on data subject to a controls evaluation phase, or data not subject to a controls evaluation phase. A suggestion for future research is therefore that auditors use the second order test in a wide variety of settings and share the results in either professional or academic forums. The desired result is that with the passage of time auditors would develop expertise in interpreting the output.

In a continuous monitoring or continuous audit environment an auditor might need to evaluate data that represents a continuous stream of transactions, such as the sales made by an online retailer or an airline reservation system. Unpublished work by the authors suggests that the digit patterns of the differences should approximate those in Panel A of Figure 8 and that (a) the first-two digits graph would be stable over time even though the pattern might not be that of Benford’s Law, and (b) the first-two digits of the differences would also be stable across time. Future research that provides guidance on the possible benefits of continuously running the Benford’s Law based tests on (say) the last 10,000 records of transactional data with graphs acting like a dashboard control. The use of corporate dashboards has been mainly for managerial purposes (Eckerson, 2006) but the concept seems to be adaptable to the monitoring of accounting systems. Such research could demonstrate what patterns might be expected under conditions of “in control” and under conditions of “out of control.” The objective would be to identify the types of realistic and plausible errors that might cause the graphs to give an “out of control” signal. Preliminary unpublished work by the authors has shown that the applying the second order test to consecutive seismic readings generates a stable graph over time similar to that of Panel A of Figure 8.

Future research that seeks to uncover data inconsistencies in encouraged given the large penalties that are awarded for audit failures. Such research needs to be matched with a willingness on the part of the auditing profession to adopt more sophisticated data diagnostic tools.

16

CONCLUSIONS Auditors are required to explicitly consider the potential for material misstatements due to fraud.

The effective use of the computer on transaction level data could present an efficient means for auditors to at least partially fulfill their responsibilities with regards to fraud and material misstatements.

Digital analysis based on Benford’s Law is an audit technique that is applied to an entire population of transactional data. Benford’s Law provides the expected digit frequencies in tabulated data and Durtschi, Hillison, and Pacini (2004) discuss the types of accounting data that are likely to conform to Benford’s Law and the conditions under which a “Benford Analysis” is likely to be useful. This paper introduced a second order test of Benford’s Law. If the records in any nice data set are sorted from smallest to largest, then the digit patterns of the differences between the ordered records are expected to closely approximate Benford’s Law in all but a small subset of special circumstances.

The first study used the second order test on an accounts payable data set. The second order test graph did not conform to Benford’s Law because of the measurements being in dollars and cents with a large number of records spanning a small range of values. The second study analyzed the annual sales, annual food costs, and food cost proportions for the restaurants in a franchising operation. The second order results showed that the closest conformity was for the most “accurate” data. In the third study the original data was seeded with errors. The results showed that the second order test could detect rounded data and cases where the original data was replaced by other data (with a similar mean and standard deviation) generated by regression and other statistical means.

The paper also discussed the relationship between the second order tests and the more traditional analytical procedures used in the planning, testing, and completion phases of the audit. The second order test has the potential to uncover errors and irregularities that would not be discovered by traditional means. Future research on data diagnostic techniques was encouraged.

17

REFERENCES American Institute of Certified Public Accountants. 2002. Statement on Auditing Standards No. 99, Consideration

of Fraud in a Financial Statement Audit. New York, New York. Benford, F. 1938. The law of anomalous numbers. Proceedings of the American Philosophical Society 78: 551-572. Berger, A., Bunimovich, L.A., and T.P. Hill. 2005. One-dimensional dynamical systems and Benford’s Law,

Transactions of the American Mathematical Society, 357 (1): 197-219. Berger, A., and T. Hill. 2006. Newton’s Method obeys Benford’s Law. American Mathematical Monthly

forthcoming. Cleary, R., and J. Thibodeau. 2005. Applying Digital Analysis using Benford’s Law to detect fraud: The dangers of

type I errors. Auditing: A Journal of Practice and Theory 24 (1): 77-81. Diaconis, P. 1976. The distribution of leading digits and uniform distribution mod 1. The Annals of Probability 5

(1): 72-81. Drake, P.D., and M.J. Nigrini. 2000. Computer assisted analytical procedures using Benford’s Law. Journal of

Accounting Education 18: 127-146. Durtschi, C., Hillison, W., and C. Pacini. 2004. The effective use of Benford’s Law in detecting fraud in accounting

data. Journal of Forensic Accounting (5): 17-33. Eckerson, W. 2006. Performance dashboards: Measuring, monitoring, and managing your business. John Wiley &

Sons: Hoboken, New Jersey. Hassan, B. 2003. Examining data accuracy and authenticity with leading digit frequency analysis. Industrial

Management & Data Systems 103 (2): 121-125. Herrmann, D., and W. Thomas. 2005. Rounding of Analyst Forecasts. The Accounting Review 80 (July): 805-823. Hoyle, D., M. Rattray, R. Jupp, and A. Brass. 2002. Making sense of microarray data distributions. Bioinformatics

18 (4): 576-584. Kontorovich, A.V., and S. J. Miller. 2005. Benford’s Law, values of L-functions, and the 3x+1 problem. Acta

Arithmetica 120: 269-297. Lagarias, J. and Soundararajan, K. 2005. Benford's Law for the 3x+1 function, Working paper, Univ. of Michigan. Leemis, L.M., Schmeiser, B.W., and D.L. Evans. 2000. Survival distributions satisfying Benford’s Law. The

American Statistician 54 (3): 1-6. Miller, S. and M. Nigrini. 2006. Differences between independent variables and almost Benford behavior. Working

paper, Brown University. Moore, G., and C. Benjamin. 2004. Using Benford’s Law for Fraud Detection. Internal Auditing (Jan/Feb): 4-9. Nigrini, M.J. 2000. Digital Analysis using Benford’s Law: Tests and Statistics for auditors. Global Audit

Publications: Vancouver, British Columbia. Nigrini, M. J., and L. J. Mittermaier. 1997. The use of Benford’s Law as an aid in analytical procedures. Auditing:

A Journal of Practice and Theory 16 (Fall): 52-67. Raimi, R. 1976. The first digit problem. American Mathematical Monthly 83 (Aug-Sept): 521-538. Wallace, W. 2002. Assessing the quality of data used for benchmarking and decision-making. The Journal of

Government Financial Management (Fall) 51 (3): 16-22.

18

FIGURE 1

A COMBINATION OF GEOMETRIC SEQUENCES

The Figure depicts several sequences. Geometric #1 is a geometric sequence over the range [30, 300). Geometric #2 is a geometric sequence over the range [10,100). The Combined line is an ordered interleaving of the two geometric sequences such that on the interval [10, 30) the values are from geometric #2, and for the [30, 100) interval the values are from both geometric #1 and geometric #2, and for the interval [100, 300) the values are from geometric #2. The number of elements for the combined line is the sum of the elements in the two geometric sequences.

19

FIGURE 2

FIRST-TWO DIGITS OF THE DIFFERENCES BETWEEN INTERLEAVING SEQUENCES

The graph above shows the first-two digit frequencies of the differences between the records in the Combined sequence in Figure 1. The Combined sequence is the sequence generated by appending two geometric sequences and ordering the records to form a new sequence.

20

FIGURE 3

SIMULATED DATA FROM FOUR DISTRIBUTIONS Panel A: Normal Distribution

Panel B: Uniform Distribution

Panel C: Triangular Distribution

Panel D: Gamma Distribution

The Figure shows the four distributions generated to illustrate the second order test. Panel A shows a histogram of the data simulated from a normal distribution with a mean of 500 and a standard deviation of 100. Panel B shows data from a uniform distribution over the [0,1000) interval. Panel C shows the data from a triangular distribution over the [0,1000) interval with a mode of 500. Panel D shows a Gamma distribution with a shape parameter of 2.5 and a scale parameter of 500.

21

FIGURE 4

DIFFERENCES BETWEEN THE ORDERED VALUES FROM FOUR DISTRIBUTIONS Panel A: Normal Distribution Panel B: Uniform Distribution

Panel C: Triangular Distribution

Panel D: Gamma Distribution

The figure shows the results of the second order test on the simulated distributions in Figure 3. The first-two digit combinations (10 to 99) are shown on the X-axis and the proportions are shown on the Y-axis. The proportions of Benford’s Law are shown by the curved line and the actual proportions are represented by the vertical bars. The digit patterns are expected to closely approximate those of Benford’s Law but are not expected to conform perfectly to Benford’s Law.

22

FIGURE 5

DIGIT PATTERNS OF ACCOUNTS PAYABLE NUMBERS Panel A: Accounts Payable numbers

Panel B: Second order test

The Figure shows the digit frequencies of an accounts payable data set with 36,515 records. The y-axis shows the proportions and the first-two digits (10 to 99) are shown on the x-axis. The line shows the proportions of Benford’s Law and the bars show the actual proportions for the first-two digits of the accounts payable numbers. Panel A shows the digit frequencies of the original source data and Panel B shows the results of the second order test (the digit frequencies of the differences between the ordered records).

23

FIGURE 6

FIRST-TWO DIGITS OF THE DIFFERENCES BETWEEN 200t NUMBERS Panel A: 200t Sales Numbers

Panel B: 200t Food Cost Numbers

Panel C: 200t Food Cost Proportions

The figure shows the first-two digit frequencies of the differences between the ordered 200t data. Panel A shows the results for the sales numbers, Panel B shows the results for the food cost numbers, and Panel C shows the results for the food cost proportions. The first-two digit combinations (10 to 99) are shown on the X-axis and the proportions are shown on the Y-axis. The proportions of Benford’s Law are shown by the curved downward-sloping line and the actual proportions are represented by the vertical bars.

24

FIGURE 7

RESULTS OF SIMULATING ERRORS IN THE 200u SALES NUMBERS Panel A: Sales numbers rounded to nearest $10.

Panel B: Sales numbers from regression line.

Panel C: Actual and fitted normal densities.

Panel D: Sales numbers from Normal curve.

The Figure shows the second order tests after seeding the 200u sales data with errors. The first-two digit combinations (10 to 99) are shown on the X-axis and the proportions are shown on the Y-axis. The proportions of Benford’s Law are shown by the curved line and the actual proportions are represented by the vertical bars. Panel A shows the result of rounding the annual sales to multiples of $10. Panel B shows the result of using fitted values from a regression. Panel C shows the simulated normal density and the actual histogram of the sales data. Panel D shows the second order test on the simulated Normal data.

25

FIGURE 8

RESULTS OF SIMULATING ERRORS IN THE 200u SALES NUMBERS Panel A: Ranked by random number generator

Panel B: Ranked by Food dollars

Panel C: Ranked by Restaurant number

Panel D: Ranked by Food/Sales proportions

The Figure shows the second order tests after seeding the 200u sales with errors related to the ordering of the data. The first-two digit combinations (10 to 99) are shown on the X-axis and the proportions are shown on the Y-axis. The proportions of Benford’s Law are the curved line and the actual proportions are the vertical bars. Panel A shows the result of randomly sorting the sales numbers. Panel B shows the result of sorting the sales numbers by the sorted values of the food cost numbers. Panel C shows the result of sorting the sales by a proxy measure for the age of the restaurant. Panel D shows the results of sorting the sales numbers by the food cost proportions which are a rough measure of food cost efficiency.

data diagnostics using second order tests of benford’s law · pdf filepositives and can...

Documents