
Genetic algorithms for credit card fraud detection

SATVIK VATS*, SURYA KANT DUBEY, NAVEEN KUMAR PANDEY

Institute of Technology and Management

AL-1, Sector-7 GIDA, Gorakhpur, Uttar Pradesh, INDIA

E-mail address- [email protected]

Abstract: Due to the rise and rapid growth of e-commerce, the use of credit cards for online purchases has increased dramatically, and with it credit card fraud. Fraud is one of the major ethical issues in the credit card industry. As the credit card becomes the most popular mode of payment for both online and regular purchases, cases of fraud associated with it are also rising. In real life, fraudulent transactions are scattered among genuine transactions, and simple pattern matching techniques are often not sufficient to detect them accurately. Implementing efficient fraud detection systems has thus become imperative for all credit card issuing banks to minimize their losses. Many modern techniques based on artificial intelligence, data mining, fuzzy logic, machine learning, sequence alignment, genetic programming, etc., have evolved for detecting fraudulent credit card transactions. A genetic algorithm is an evolutionary search and optimisation technique that mimics natural evolution to find the best solution to a problem. Here the characteristics of credit card transactions undergo evolution to allow a modelled credit card fraud detection system to be tested.

Key-Words: Electronic commerce, fraud, credit card, genetic algorithms, detection

1 Introduction

In recent history information technology has

become far more pervasive in everyone’s lives. As reliance on software products increases, so does the

pressure to ensure that they work reliably and as

expected. This is why software testing has risen to

the forefront of public attention, with notable

instances such as the iPhone alarm bug [3]. In

1998, the Data Protection Act changed the way

data can be used [13]. Until then, developers working in industry in the UK had simply made copies of customer data and used them in an often less secure development environment. The Act

introduces legislation intended to give more rights

to individuals whose data is being held, and

restricts the uses to which it can be put. For example, a company wishing to outsource some of its development may not have the right to pass on its customers' data, hence doing so would be a

Proceedings of the 2013 International Conference on Education and Educational Technologies


violation of the Act. The company Grid-Tools

Limited is our industrial partner for this project.

They offer professional solutions for automatic test

data generation in the form of a tool called Data

Maker. This tool was originally written to address

the increasing size of data sets required by industry. As the set size increases it becomes impractical to

create the data by hand, so automatic methods had

to be found. Data Maker is capable of generating

synthetic data that conforms to the requirements of a test engineer’s specification. The data created is

not regulated by the Data Protection Act, because it

has been generated rather than gathered. Thus it is

not related to any individual and hence not covered

in legislation. The use of synthetic data has

advantages in its own right: large sets of data can

be created, with their composition tailored to meet

test coverage criteria. A set of real-world data may not do this: it is likely to be of relatively constant composition, and so will not exercise all aspects of the program. A limitation of Data Maker is that it can

only produce linear sets of data from its built-in functions. The systematic testing of some software,

however, requires data sets with trends. A typical

example of such a system is a credit card fraud

detection system. To thoroughly test such a system

one would require a large set of realistic

transactions, both legitimate and fraudulent. For

reasons discussed above real data should not be

used, so instead a way to generate such data must

be found.

2 Related work

Genetic algorithms are a heuristic used to solve

high-complexity computational problems. Apart

from modeling the phenomena occurring in nature,

they help in optimization, simulation, modeling,

design and prediction purposes in science,

medicine, technology, and everyday life [14]. A

recent survey of the state of the art was carried out

for the “Materials and Manufacturing Processes”

journal in 2009, by Paszkowicz [14]. As the name

of the journal suggests they were only concerned

with the application of genetic algorithms to

problems in chemistry and physics, but nonetheless

they highlighted some innovative uses. One cited example was to aid the design of new materials, in particular with regard to a reverse heat transfer problem. The problem consists of

finding a material with desirable thermal properties

that give rise to a good temperature field profile.

For a particular material well known equations can

be used to calculate the temperature profile, but

because of their complex nature the process cannot

easily be reversed to find optimal parameters. This

is an area where evolutionary search often excels, as we will see in a later example where the search is applied to an NP-complete problem. The

algorithm used in this case modeled a liquid

material that was being heated linearly on its

surface. The input to the algorithm, its initial

population, was properties of already known

similar liquids. The output computed for each

liquid was the temperature field and the cooling

rate. Good results were returned by the algorithm,

which were later confirmed to be correct

experimentally. Still in the same materials

engineering survey, evolutionary search has been

applied to the mechanical process of welding. To

produce a strong weld several parameters have to

be optimized, such as current, voltage, torch speed,

arc gap, shielding gas and its flow rate, type and

geometry of the electrode. It can already be seen

how this optimization process could lend itself to

the application of a genetic algorithm, and once

again good results were found for what would have

been an expensive experimental process. Not only

did the results of the optimization provide a better

set of welding parameters, they also shed light on



the transformation of the metal during the weld.

This had already been described theoretically, but

the results from the algorithm helped to bring

calculations and experimental results closer

together. In a purely theoretical area, genetic

algorithms have been applied to find approximate

solutions to the travelling salesman problem.

Scaling became an issue as the number of cities the

salesman had to visit increased. Braun [4] reported

that the algorithm could generate very good but not

optimal solutions for travelling salesman problems

with 442 to 531 cities. Using a standard SUN

workstation they could optimally solve problems

with up to 442 cities in under thirty minutes. The

biggest problem examined was 666 cities, which

could be solved approximately with a journey

0.04% longer than the optimum route. Potvin also

analyzed Travelling Salesman with genetic

algorithms [6]. The biggest problem reported in his

survey was one million cities, solved to within 4%

of an optimal route. This took four hours on a

powerful computer. He identified the role played by

the crossover operator on the outcome, with

performance being significantly affected by the

reordering of the tour. Perhaps the most well

known application of machine learning is robotic

movement. Schultz [1] applied the algorithm so

that autonomous robots could navigate and perform

collision avoidance of obstacles in their path. An

innovative part of his work was once again aimed

at cost and time saving, similar to the previously

detailed welding example. The task set for the autonomous robot was to navigate from a start point to an end point with no pre-planned route, avoiding randomly placed obstacles on its way.

3. Fraud Detection using Genetic Algorithm

Genetic algorithms are evolutionary algorithms

which aim at obtaining better solutions as time

progresses. Since their first introduction by Holland

[5], they have been successfully applied to many

problem domains from astronomy to sports [2],

from optimization [8] to computer science [7], etc.

They have also been used in data mining mainly for

variable selection [10] and are mostly coupled with

other data mining algorithms. In this study, we try

to solve our classification problem by using only a

genetic algorithm solution. In this module the system must detect whether or not fraud has occurred in a transaction, and must report the result to the user.

In the following we illustrate the concept of genetic algorithms with our own example based on boundary value testing. We implemented this algorithm in Java and can successfully generate inputs for the test. A genetic

algorithm is a paradigm often used to search vast

and poorly understood search spaces. With well

defined functions the algorithm will converge into

one area of the search space which holds the

optimal solution. This example is a very simple

instance of the algorithm that searches for a set of

optimum inputs for black box testing. The function

being tested checks whether or not a value x is

within the range 0 ≤ x ≤ 8. Boundary value testing

is concerned with selecting the following input values:

• Maximum.



• Maximum minus one.

• Nominal middle value.

• Minimum plus one.

• Minimum.

Fig.1 Selection of inputs for 0 ≤ x ≤ 8

These test cases will exercise the program to detect

any errors, particularly those that are “off by one”.

For simplicity we will assume that the correct input

values are known.

A generic genetic algorithm [11]

SimpleGeneticAlgorithm()
    initialise population;
    evaluate population;
    while (termination criteria not met)
        select solutions for next generation;
        perform crossover and mutation;
        evaluate population;

In the same way as a chromosome is the basic

building block of nature, so it is of a genetic

algorithm. The chromosome is an encoded

statement of the data which one wishes to optimise.

In our example the chromosome would represent a

tuple of all of the input values, and it is encoded as

a binary string. The reason for this choice will

become clear when further genetic operators are

considered. In our example, the inputs 8, 7, 3, 1 and

0 would be encoded as their binary equivalent, and

concatenated: 1000 0111 0011 0001 0000. The

second task is to write a function to compare the relative merit of chromosomes. The following pseudo-code shows a fitness calculation over an encoding of five bytes, each representing an input integer. A set of

chromosomes goes to make up the population of

the algorithm. Our algorithm is started with a

randomly generated population of chromosomes.

Evaluation of the fitness of a chromosome

int fitness(Chromosome input) {
    int fitness = 0;
    int[] ideal = {A, B, C, D, E};     // Array of ideal inputs A to E.
    int[] actual = input.toArray();    // Retrieve data from chromosome.
    for (int i = 0; i < ideal.length; i++) {
        // Accumulate the distance of each input from its ideal value.
        fitness += Math.abs(actual[i] - ideal[i]);
    }
    return fitness;
}
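The encoding and fitness calculation described above can be sketched as runnable Java. This is a minimal illustration rather than the authors' implementation: the class name, the method names and the 4-bit field width are assumptions, the width being chosen to match the example string 1000 0111 0011 0001 0000. The ideal inputs are the values quoted in the text.

```java
// Illustrative sketch only: names and the 4-bit field width are assumptions.
class BoundaryChromosomeSketch {
    static final int BITS_PER_INPUT = 4;           // assumed width, holds 0..15
    static final int[] IDEAL = {8, 7, 3, 1, 0};    // inputs quoted in the text

    // Concatenate each input as a fixed-width, zero-padded binary field.
    static String encode(int[] inputs) {
        StringBuilder sb = new StringBuilder();
        for (int value : inputs) {
            String bits = Integer.toBinaryString(value);
            for (int pad = bits.length(); pad < BITS_PER_INPUT; pad++) {
                sb.append('0');
            }
            sb.append(bits);
        }
        return sb.toString();
    }

    // Split the bit string back into its integer fields.
    static int[] decode(String chromosome) {
        int count = chromosome.length() / BITS_PER_INPUT;
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            int start = i * BITS_PER_INPUT;
            values[i] = Integer.parseInt(
                chromosome.substring(start, start + BITS_PER_INPUT), 2);
        }
        return values;
    }

    // Cost: summed absolute distance from the ideal inputs (0 is best).
    static int cost(String chromosome) {
        int[] actual = decode(chromosome);
        int total = 0;
        for (int i = 0; i < IDEAL.length; i++) {
            total += Math.abs(actual[i] - IDEAL[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        String c = encode(new int[] {8, 7, 3, 1, 0});
        System.out.println(c + " has cost " + cost(c));
    }
}
```

A chromosome whose decoded fields all equal the ideal inputs has cost zero, so the search minimises this value.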

Fig.2 Diagram of crossover.

Crossover is the operator used to reproduce

chromosomes. This works by taking a pair of

encoded chromosomes - the parents - and

combining them to produce two different

chromosomes - the progeny. When applied across two fit chromosomes this method aims to produce progeny that have inherited the best attributes of their parents, though this is not always the case. To illustrate the principle, consider two chromosomes and assume a central crossover: if the parents are 0011 and 1100, the two progeny will be 0000 and 1111 respectively - see Figure 2. This should make it clear that the bit string is simply crossed, as the name suggests. Depending on the encoding, crossing in the middle of the chromosome may be unlikely to give rise to fit progeny; where this is the case other points may be chosen, or indeed more than one point. Another

common solution is to select a random point up to

the length of the chromosome, and cross there.

Crossover pseudo-code

Chromosome[] crossover(Chromosome parentX, Chromosome parentY) {
    int crossPoint = 8;
    String xFirstHalf = parentX.substring(0, crossPoint);
    String xSecondHalf = parentX.substring(crossPoint, parentX.length());
    String yFirstHalf = parentY.substring(0, crossPoint);
    String ySecondHalf = parentY.substring(crossPoint, parentY.length());
    // Swap the tails of the two parents to form the progeny.
    Chromosome crossedX = new Chromosome(xFirstHalf + ySecondHalf);
    Chromosome crossedY = new Chromosome(yFirstHalf + xSecondHalf);
    return new Chromosome[] { crossedX, crossedY };
}
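As a runnable counterpart to the pseudo-code, the following sketch performs single-point crossover on plain bit strings. Representing a chromosome as a String and the names CrossoverSketch, crossover and crossoverRandom are assumptions made for illustration. The random-point variant corresponds to the alternative mentioned in the text.

```java
import java.util.Random;

// Illustrative sketch: chromosomes are plain bit strings here (an assumption).
class CrossoverSketch {
    // Single-point crossover: the parents swap everything after crossPoint.
    static String[] crossover(String parentX, String parentY, int crossPoint) {
        if (parentX.length() != parentY.length()) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (crossPoint < 0 || crossPoint > parentX.length()) {
            throw new IllegalArgumentException("cross point out of range");
        }
        String childX = parentX.substring(0, crossPoint) + parentY.substring(crossPoint);
        String childY = parentY.substring(0, crossPoint) + parentX.substring(crossPoint);
        return new String[] {childX, childY};
    }

    // Variant that picks the cross point uniformly from 0..length.
    static String[] crossoverRandom(String parentX, String parentY, Random rng) {
        return crossover(parentX, parentY, rng.nextInt(parentX.length() + 1));
    }

    public static void main(String[] args) {
        String[] progeny = crossover("0011", "1100", 2);
        System.out.println(progeny[0] + " " + progeny[1]);
    }
}
```

With the parents 0011 and 1100 and a central cross point of 2, this reproduces the progeny 0000 and 1111 from the worked example.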

Mutation is essential to a true genetic algorithm. In popular culture mutation is often viewed in a negative light - simply consider how many horror films are based around some kind of mutant! In fact, without mutation neither the world as we know it nor our algorithms would evolve efficiently. Mutation is defined as a minimal change to a chromosome, so with a binary string representation often a single bit is flipped. These changes are usually applied at the end of each generation, before the breeding pool and population are combined again, but only with a very small probability of each chromosome being affected. If this were not done, no new genetic information would be produced after the initial population - note that crossover does not create anything, but merely recombines existing chromosomes. Without new chromosomes the algorithm is likely to stop with a suboptimal population, or to run indefinitely without converging on a solution. If, on the other hand, mutation levels are set too high, the stream of new chromosomes could be too large, disrupting any convergent progress. If mutation were set to affect every chromosome in each generation and crossover were removed, the search would become completely stochastic.

Mutation pseudo-code

Chromosome mutate(Chromosome chromosome) {
    // Pick a random bit position in the chromosome.
    int randomIndex = new Random().nextInt(chromosome.length());
    // Flip the selected bit (setValueAt is an assumed setter).
    if (chromosome.valueAt(randomIndex) == 0) {
        chromosome.setValueAt(randomIndex, 1);
    } else {
        chromosome.setValueAt(randomIndex, 0);
    }
    return chromosome;
}
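A runnable version of bit-flip mutation might look like the sketch below. It assumes bit-string chromosomes and a per-chromosome mutation probability passed as the hypothetical parameter rate. The text only says this probability should be very small, so any concrete value is the caller's choice.

```java
import java.util.Random;

// Illustrative sketch: names and the String representation are assumptions.
class MutationSketch {
    // Flip the bit at the given index and return the new chromosome.
    static String flipBit(String chromosome, int index) {
        char[] bits = chromosome.toCharArray();
        bits[index] = (bits[index] == '0') ? '1' : '0';
        return new String(bits);
    }

    // With probability 'rate', flip one uniformly chosen bit; otherwise
    // return the chromosome unchanged.
    static String mutate(String chromosome, double rate, Random rng) {
        if (rng.nextDouble() >= rate) {
            return chromosome;
        }
        return flipBit(chromosome, rng.nextInt(chromosome.length()));
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(mutate("10000111", 0.05, rng));
    }
}
```

Setting rate to 1.0 mutates every chromosome, which together with removing crossover gives the purely stochastic search described above.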

3.1 Mathematical model

A chromosome is the logical unit of information transmission to the next generation [12]. The definition of a chromosome can be taken a little deeper. Usually the chromosome holds a binary encoding of the optimization subject. Where this is the case the genetic algorithm is considered discrete, as clearly only a finite set of values can be assumed. In some cases the encoding uses real numbers instead, creating a continuous genetic algorithm; for subjects such as temperature modeling, a continuous chromosome is more appropriate. For natural selection to take place, some way of comparing one chromosome with another must be available. In the algorithm this is modeled as a fitness or cost function, where a lower-cost chromosome is favored over a higher-cost one. The cost function is a mapping chromosome → R, where a value closer to zero indicates a better optimized chromosome. The formalization that follows has been drawn from work by Bäck [12] and Vose [9]. We begin by considering the algorithm at the highest level. It can be considered a finite state machine, where each state represents an arbitrary generation of the population at a time t. Between these states there is a transition, τ, to the next generation. The algorithm can be considered as a function with parameters, as shown in Equation (1).



Genetic Algorithm = (I, Φ, Ω, s, µ, λ, τ, ι) (1)

In this representation, the following notation is used:

• I is the space of chromosomes, or the underlying search space. Each chromosome is of length l.

• Φ is a cost function I → R.

• Ω represents a set of probabilistic genetic operators. We will specify these shortly.

• s represents a deterministic selection operator. A side effect of this operator is that the population size remains constant.

• µ is the number of parent individuals to include in reproduction.

• λ is the number of offspring individuals produced by reproduction.

• τ represents the complete process of transitioning from one generation to the next. This will be expanded shortly.

• ι represents an arbitrary termination condition.

The initial population is created by a function that randomly samples the integer range Z(0,2), that is, the bit values 0 and 1. This is done l · µ times, once for each bit of each of the µ chromosomes. To relate this model to the finite state machine outlined above, we will clarify the operation of τ. Consider the population P at a generation t, P(t):

∀t ≥ 1 : P(t + 1) = τ(P(t)) (2)

The termination condition, ι, can be as simple or

complex as required. For this analysis we will

assume it is simply a maximum generation count,

and that functionality to maintain this count is

provided. In implementation this can be combined

with an average cost value, a relative improvement

or a threshold standard deviation of the population.
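The initialization step and the repeated application of τ under a maximum generation count can be sketched as follows. This is a minimal illustration under stated assumptions: the population is stored as an int matrix of µ rows and l columns, τ is passed in as a UnaryOperator, and the names GenerationLoopSketch, randomPopulation and run are hypothetical.

```java
import java.util.Random;
import java.util.function.UnaryOperator;

// Illustrative sketch: the array representation and names are assumptions.
class GenerationLoopSketch {
    // Draw l * mu random bits: mu chromosomes, each of length l.
    static int[][] randomPopulation(int mu, int l, Random rng) {
        int[][] population = new int[mu][l];
        for (int i = 0; i < mu; i++) {
            for (int j = 0; j < l; j++) {
                population[i][j] = rng.nextInt(2);   // 0 or 1
            }
        }
        return population;
    }

    // Apply the transition tau once per generation, so that
    // P(t + 1) = tau(P(t)), stopping at the maximum generation count.
    static int[][] run(int[][] initial, UnaryOperator<int[][]> tau, int maxGenerations) {
        int[][] population = initial;
        for (int t = 1; t <= maxGenerations; t++) {
            population = tau.apply(population);
        }
        return population;
    }

    public static void main(String[] args) {
        int[][] start = randomPopulation(4, 20, new Random());
        // The identity transition stands in for selection, crossover and mutation.
        int[][] end = run(start, p -> p, 10);
        System.out.println("chromosomes: " + end.length);
    }
}
```

In a real run the identity transition would be replaced by a function that performs selection and applies the genetic operators Ω.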

We now return our analysis to the genetic operators, Ω. In the set of reproductive functions we have recombination (crossover) and mutation. These can be considered as sexual and asexual operators respectively, characterized by the number of input chromosomes used. Because of its simpler nature we will first consider mutation. It can be modeled as a function $\omega: I^p \to I^q$, where a chromosome in $I$ is represented as a binary vector. This means an arbitrary chromosome can be written as $(a_1, \ldots, a_l)$, where $l$ is the length of the binary string.

Mutation is the smallest change that can be made to a chromosome. By definition, mutation over a binary chromosome should be the random change of one bit. To model this, a random bit position $k \in \mathbb{N}$, $1 \le k \le l$, is selected in the chromosome and that bit is flipped, with $\bar{a}_k$ denoting the complement of $a_k$. The function then looks like this:

$(a_1, \ldots, a_k, \ldots, a_l) \to (a_1, \ldots, \bar{a}_k, \ldots, a_l)$ (3)

Crossover is the recombination of two chromosomes without loss of information. It works by taking two chromosomes and swapping over the values after a random cross point. To do this, once again a random position $k \in \mathbb{N}$, $0 \le k \le l$, is selected, and the function looks like this:

$(a_1, \ldots, a_l), (b_1, \ldots, b_l) \to (a_1, \ldots, a_k, b_{k+1}, \ldots, b_l), (b_1, \ldots, b_k, a_{k+1}, \ldots, a_l)$ (4)

Elitism is a property that prevents the current best chromosomes from participating in mutation. Many algorithms implement elitism, as it prevents the fitness of a population from decaying. If the population is in a suboptimal area of the search space, the best solution is retained until mutation produces a chromosome closer to the global optimum. From here normal evolution can continue.

3.2 Example run

To better illustrate the operation of a genetic algorithm, we shall dry run our own example. For simplicity we will use a five by five grid, as shown in Figure 3. The “optimal” square is shaded in the centre.

Chromosomes. Four chromosomes will be defined, each of which is a coordinate (x, y) on the grid.

Cost function. The number of squares the chromosome is away from the optimum square is used.

Fig.3 Search space.

Crossover. The two best chromosomes go to the next generation unchanged, and two new ones are created as:

$(x_1, y_1), (x_2, y_2) \to (y_2, x_1), (y_1, x_2)$ (5)

Mutation. Not implemented.

Initial population. Randomly instantiated.

Elitism. Implemented.

Fig.4 End of generation one: (a) chromosomes; (b) search space.

The initial population, generation one, is shown in Figure 4. To progress to generation two, the two best chromosomes are kept unchanged and two new ones are created by crossover. This is shown in Figure 5. Clearly the chromosome of cost one is selected, and as the remaining three have the same cost, we select the first of them, (5, 4). The same process is iterated again, creating generation three and giving rise to one optimal chromosome. This is shown in Figure 6.

Fig.5 End of generation two: (a) chromosomes; (b) search space.

Fig.6 End of generation three: (a) chromosomes; (b) search space.

One can see from this that the search space is systematically sampled, the best chromosomes are selected, and their traits are passed on to the next generation. The algorithm does work without mutation; in this case it was left out in the interests of minimizing the generation count. In a larger example mutation would become necessary to prevent the evolution stagnating in a suboptimal area, as without it convergence cannot be proven.
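The dry run can be reproduced with a short Java sketch. Several details are assumptions: coordinates are 1-based so the centre of the grid is (3, 3), the cost is taken as Manhattan distance (the text only says "number of squares away"), ties are broken in favour of the earlier chromosome, and the starting population in main is made up for illustration rather than taken from the figures.

```java
import java.util.Arrays;

// Illustrative sketch: metric, indexing and start population are assumptions.
class GridExampleSketch {
    static final int OPT_X = 3;   // centre of a 5x5 grid, 1-based
    static final int OPT_Y = 3;

    // Assumed cost: Manhattan distance from the optimal square.
    static int cost(int[] c) {
        return Math.abs(c[0] - OPT_X) + Math.abs(c[1] - OPT_Y);
    }

    // One generation: keep the two best chromosomes (elitism) and add two
    // progeny built with Equation (5): (y2, x1) and (y1, x2).
    static int[][] nextGeneration(int[][] population) {
        int[][] sorted = population.clone();
        // Arrays.sort on object arrays is stable, so ties keep their order.
        Arrays.sort(sorted, (p, q) -> Integer.compare(cost(p), cost(q)));
        int[] a = sorted[0];
        int[] b = sorted[1];
        return new int[][] {
            a,
            b,
            {b[1], a[0]},   // (y2, x1)
            {a[1], b[0]}    // (y1, x2)
        };
    }

    public static void main(String[] args) {
        int[][] population = {{5, 4}, {1, 1}, {3, 4}, {2, 5}};
        for (int generation = 1; generation <= 3; generation++) {
            System.out.println("generation " + generation + ": "
                + Arrays.deepToString(population));
            population = nextGeneration(population);
        }
    }
}
```

With this particular starting population the chromosome (3, 3), which has cost zero, appears in generation three. Other starting populations may need more generations or may stagnate, which is the situation mutation is meant to prevent.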

4. Flow of Genetic algorithm

• The initial population is selected randomly from the sample space, which contains many populations.

• The fitness value is calculated for each chromosome in each population, and the chromosomes are sorted by fitness.

• In the selection process two parent chromosomes are selected through the tournament method.

• Crossover forms new offspring (children) from the parent chromosomes using single-point crossover.

• Mutation alters the new offspring using a uniform probability measure.

• In elitist selection the best solutions are passed to the next generation.

• The new population is generated and undergoes the same process until the maximum number of generations is reached.

4.1 Selection process

Selection is used for choosing the best

individuals, that is, for selecting those

chromosomes with higher fitness values. The

selection operation takes the current population and

produces a ‘mating pool’ which contains the

individuals which are going to reproduce. There

are several selection methods, like biased selection,

random selection, roulette wheel selection,

tournament selection. In this work the following

selection mechanisms are used.

4.2 Tournament Selection

Tournament selection has been used in this work as it selects optimal individuals from diverse groups. It selects t individuals from the current population uniformly at random to form a tournament; the best individual of the group wins the tournament and is put into the mating pool for recombination. This process is repeated the

number of times necessary to achieve the desired

size of intermediate population. The tournament

size controls the selection strength. The larger the

tournament size, the stronger is the selection

process.
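Tournament selection as described can be sketched in Java as below. The sketch assumes that higher fitness is better and that the t entrants of one tournament are distinct (sampled without replacement by shuffling the indices), which the text does not specify. The names TournamentSketch, select and matingPool are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch: sampling without replacement is an assumption.
class TournamentSketch {
    // Run one tournament of size t and return the index of the winner.
    static int select(double[] fitness, int t, Random rng) {
        if (t < 1 || t > fitness.length) {
            throw new IllegalArgumentException("tournament size out of range");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < fitness.length; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);   // first t entries are the entrants
        int best = indices.get(0);
        for (int i = 1; i < t; i++) {
            int candidate = indices.get(i);
            if (fitness[candidate] > fitness[best]) {
                best = candidate;
            }
        }
        return best;
    }

    // Repeat tournaments until the mating pool has the desired size.
    static int[] matingPool(double[] fitness, int t, int poolSize, Random rng) {
        int[] pool = new int[poolSize];
        for (int i = 0; i < poolSize; i++) {
            pool[i] = select(fitness, t, rng);
        }
        return pool;
    }

    public static void main(String[] args) {
        double[] fitness = {0.1, 0.9, 0.4, 0.7};
        int[] pool = matingPool(fitness, 2, 4, new Random());
        System.out.println(java.util.Arrays.toString(pool));
    }
}
```

A tournament size of 1 amounts to uniform random selection, while a tournament as large as the population always returns the fittest individual, which illustrates how the size controls selection strength.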

4.3 Elitist Selection



To make sure that the best individuals are passed to further generations and are not lost through random selection, this selection operator is used. A few of the best chromosomes from each generation, those with the highest fitness values, are passed unchanged to the next generation of the population.
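Elitist selection can be sketched as picking the indices of the k fittest individuals, which are then copied unchanged into the next generation. The number of elites k is a hypothetical parameter, since the text only says "a few", and higher fitness is assumed to be better.

```java
import java.util.Arrays;

// Illustrative sketch: the parameter k and all names are assumptions.
class ElitismSketch {
    // Return the indices of the k highest-fitness individuals, best first.
    static int[] eliteIndices(double[] fitness, int k) {
        Integer[] order = new Integer[fitness.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by descending fitness.
        Arrays.sort(order, (a, b) -> Double.compare(fitness[b], fitness[a]));
        int[] elite = new int[k];
        for (int i = 0; i < k; i++) {
            elite[i] = order[i];
        }
        return elite;
    }

    public static void main(String[] args) {
        double[] fitness = {0.2, 0.9, 0.5, 0.7};
        System.out.println(Arrays.toString(eliteIndices(fitness, 2)));
    }
}
```

The remaining slots of the next generation would then be filled by offspring produced through crossover and mutation.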

4.4 Reproduction

The next step is to generate a second-generation population of solutions from those selected, through the genetic operators crossover (also called recombination) and/or mutation. For each new

solution to be produced, a pair of "parent" solutions

is selected for breeding from the pool selected

previously. By producing a "child" solution using

the above methods of crossover and mutation, a

new solution is created which typically shares

many of the characteristics of its "parents". New

parents are selected for each new child, and the

process continues until a new population of

solutions of appropriate size is generated. Although reproduction methods based on two parents are more "biology inspired", some research suggests that using more than two "parents" can produce better-quality chromosomes. These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation.

Generally the average fitness will have increased

by this procedure for the population, since only the

best organisms from the first generation are

selected for breeding, along with a small proportion

of less fit solutions, for reasons already mentioned

above. Although Crossover and Mutation are

known as the main genetic operators, it is possible

to use other operators such as regrouping,

colonization-extinction, or migration in genetic

algorithms.

4.5 Termination

This generational process is repeated until

a termination condition has been reached. Common

terminating conditions are:

• A solution is found that satisfies minimum

criteria

• Fixed number of generations reached

• Allocated budget (computation

time/money) reached

• The highest ranking solution's fitness is

reaching or has reached a plateau such that

successive iterations no longer produce

better results

• Manual inspection

• Combinations of the above
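Three of the conditions listed above (a solution meeting minimum criteria, a fixed number of generations, and a fitness plateau) can be combined in one check, sketched below. The parameters target and patience and the particular plateau rule are assumptions chosen for illustration, and higher fitness is assumed to be better.

```java
// Illustrative sketch: thresholds and the plateau rule are assumptions.
class TerminationSketch {
    // bestHistory holds the best fitness of each generation so far.
    static boolean shouldStop(int generation, int maxGenerations,
                              double[] bestHistory, double target, int patience) {
        // Fixed number of generations reached.
        if (generation >= maxGenerations) {
            return true;
        }
        if (bestHistory.length == 0) {
            return false;
        }
        int last = bestHistory.length - 1;
        // A solution satisfying the minimum criteria has been found.
        if (bestHistory[last] >= target) {
            return true;
        }
        // Plateau: no improvement over the last 'patience' generations.
        return bestHistory.length > patience
            && bestHistory[last] <= bestHistory[last - patience];
    }

    public static void main(String[] args) {
        double[] history = {1.0, 2.0, 2.0, 2.0};
        System.out.println(shouldStop(3, 100, history, 10.0, 2));
    }
}
```

Budget limits and manual inspection are not modelled here; they would be further conditions combined with these in the same way.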

5. Conclusion

This method proves accurate in detecting fraudulent transactions and minimizing the number of false alerts. The genetic algorithm is novel in this literature in terms of application domain. If this algorithm is applied in a bank's credit card fraud detection system, the probability of fraud can be predicted soon after a credit card transaction takes place, and a series of anti-fraud strategies can be adopted to protect banks from great losses and to reduce risk. The objective of the study differed from that of typical classification problems in that we had a variable misclassification cost. As standard data mining algorithms do not fit this situation well, we decided to use a multi-population genetic algorithm to obtain an optimized parameter.

Future Enhancements

The findings obtained here may not be generalized

to the global fraud detection problem. As future

work, some effective algorithm which can perform

well for the classification problem with variable

misclassification costs could be developed.

REFERENCES

[1] Alan C Schultz. Learning robot behaviours

using genetic algorithms. Navy Center for Applied

Research in Artificial Intelligence, Naval Research

Laboratory, Washington, 1994.

[2] Charbonneau P., Genetic Algorithms in Astronomy and Astrophysics. High Altitude Observatory, National Center for Atmospheric Research, pp. 309-334, 1995.

[3] Dr Markus Roggenbach. CS364 Software

testing slides. Swansea University, 2011.

[4] Heinrich Braun. On solving travelling salesman

problems by genetic algorithms. Springer

Berlin / Heidelberg, 1991.

[5] Holland J., Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.

[6] Jean-Yves Potvin. Genetic Algorithms for the

Travelling Salesman Problem. Centre de

Recherche sur les Transports, 1996.

[7] Kaya M., Autonomous Classifiers with Understandable Rule Using Multi-objective Genetic Algorithms. Expert Systems with Applications, Vol. 37, no. 4, pp. 3489-3494, 2009.

[8] Levi M., Burrows J., Fleming M., Hopkins M.,

The Nature, Extent and Economic Impact of Fraud

in the UK. Report for the Association of Chief

Police Officers' Economic Crime Portfolio. 2007.

[9] Michael Vose. The Simple Genetic Algorithm.

Massachusetts Institute of Technology, 1999.

[10] Minaei-Bidgoli B., Kashy D., Kortemeyer G., Punch W., Predicting Student Performance: An Application of Data Mining Methods with the Educational Web-based System LON-CAPA. Proceedings of ASEE/IEEE Frontiers in Education Conference, 2003.

[11] Srinivas M., Patnaik L., Genetic algorithms - a

survey. IEEE Computer Society, 1994.

[12] Thomas Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, 1996.

[13] UK Statute Law. Data Protection Act 1998.

Office of Public Sector Information, 1998.

[14] Wojciech Paszkowicz. Genetic Algorithms, a Nature-Inspired Tool: Survey of Applications in Materials Science and Related Fields. Taylor and Francis Group, 2009.

