Hierarchical Clustering Analysis of Crime Rate

M.M. Kembe (1), A.A. Onoja (2), O.C. Ogundare (3)

(1) Benue State University, Makurdi

(2) University of Jos, Jos, Nigeria

(3) University of Jos, Jos, Nigeria

Abstract: Cluster analysis searches for patterns in a data set by grouping the

(multivariate) observations into clusters. The goal is to find an optimal grouping for which the

observations or objects within each cluster are similar, but the clusters are dissimilar to each

other. One hopes to find the natural groupings in the data, groupings that make sense to the

researcher. The hierarchical clustering method uses the distance matrix to build a treelike

diagram, called a dendrogram. In a bid to proffer a solution to the problem of crime, which has

bedeviled Plateau State’s social and economic development, the hierarchical cluster technique

was employed as a powerful multivariate statistical tool to analyze the crime rate in three local

government areas of the state, based on similarities in the crime rates recorded from 2004 to 2014, to

inform appropriate military deployment and security tips for such areas. This study uses

the hierarchical clustering technique to examine the crimes committed on the Plateau and

recommends that more security personnel be deployed to localities such as Angwan Rogo,

Angwan Rukuba, Gangare, Odus Bauchi Ringroad, Jenta, Fobor, Apata, Dilimi Yandoya and

Bukuru. This is to help maintain and build a lasting peace on the land of Plateau especially in

this contemporary time when terrorism and other crimes are disrupting civic rest and destabilizing

the peace and economies of nations. The study also recommends that the government

encourage the youth in such localities by providing social amenities, employment opportunities,

peace campaign programs and sensitization of these communities on the need for a

peaceful coexistence.

Keywords: Crime Rate, Hierarchical clustering technique, Dendrogram

1.0 Introduction

The prevalence of crime in Jos South, Jos North and Jos East local government areas has

undoubtedly slowed down all forms of development in Plateau State. There have been jingles in

most radio and television stations in the state on the status quo, one of which states that “social

and economic development can only thrive in an atmosphere of peace, let’s give peace a chance”.

The fluctuation of crime in recent times has been attributed to several factors, which include

illiteracy, unemployment, poverty, a high rate of urban-rural drift in search of greener pastures and,

ultimately, the aftermath of incessant crises in these local government areas, where there is a high rate

of illegal possession of arms and cattle rustling, among others. In a bid to proffer a solution to this

problem, which has bedeviled the state’s social and economic development, the hierarchical

cluster technique was employed as a powerful multivariate statistical tool to analyze the crime rate in

these local government areas, based on similarities in the crime rates recorded from 2004 to 2014, to

inform appropriate military deployment and security tips for such areas.

2.0 Literature Review

Vander Walt et al. (1985) refer to crime as a wide range of activities, which include violent personal

crimes, property crimes, organized crimes and political crimes. They further distinguished

between crime defined in a juridical and in a non-juridical sense. Juridically (legally), crime is a contravention of the

law, to which a punishment is attached and imposed by the state. In other words, crime is an act

which is forbidden by the law, and if detected, is likely to be punished. Non-juridical

(criminological) crime can be viewed as an act of anti-social behavior which influences the life

of the individual, his/her community and society at large. Van Velzen (1998) described crime in

a non-juridical sense as an anti-social act entailing a threat to and a breach or violation of the

stability and security of a community and its individual members. Society is a network of

interacting persons, groups and institutions. Interaction involves establishing relations between

these individuals, groups and institutions. Crime is an act which violates these social relations

and it is this violation which harms the individual and society at large. Therefore crime in its

non-juridical sense (that is, when it is perceived as a personal threat) leads to feelings of fear and

mistrust. Literally, a criminal is a frustrated person who emerges in the society and violates the

law. Vander Walt et al. (1989) distinguished between a criminal defined in a juridical sense and a criminal

defined in a non-juridical sense. Juridically (legally), a criminal is a person who has been found

guilty and punished. In relation to socio-economic development, there are basically two main categories of crime: violent crime

(murder, assault, rape, abduction, etc.) and property crime (theft in its various forms, housebreaking,

robbery in its various forms, vandalism, arson and so on).

Brown et al. (1996) argue that people mostly fear violent crimes (such as

murder, rape, robbery and assault). Victims may be deeply angered when they are swindled or

their houses are broken into, but these emotions pale in comparison to the fear of death or serious

injury that can be inflicted by a violent crime. Crime has tended to undermine the importance of

development. Many people have been violently victimized in the past in this country, either by

means of murder, attempted murder, robbery, rape, assault, and so on.

In cluster analysis the search for patterns in a data set is done by grouping the (multivariate)

observations into clusters. The goal is to find an optimal grouping for which the observations or

objects within each cluster are similar, but the clusters are dissimilar to each other. One hopes to

find the natural groupings in the data, groupings that make sense to the researcher. In cluster

analysis, neither the number of groups nor the groups themselves are known in advance. To

group the observations into clusters, many techniques begin with similarities between all pairs of

observations. In many cases the similarities are based on some measure of distance. Other cluster

methods use a preliminary choice for cluster centers or a comparison of within- and between-

cluster variability. It is also possible to cluster the variables, in which case the similarity could be

a correlation. Clusters can be graphically represented by plotting the observations. If there are

only two variables (p = 2), we can do this in a scatter plot. For p > 2, we can plot the data in

two dimensions using principal components or biplots. A principal component

plot may, for example, show clear groupings of points. Another approach to plotting is

provided by projection pursuit, which seeks two-dimensional projections that reveal clusters

[Friedman and Tukey (1974); Huber (1985); Sibson (1984); Jones and Sibson (1987); Yenyukov

(1988); Posse (1990); Nason (1995); Ripley (1996)]. Hierarchical clustering technique has also

been referred to as classification, pattern recognition (specifically, unsupervised learning), and

numerical taxonomy. The techniques of hierarchical clustering have been extensively applied to

data in many fields, such as medicine, psychiatry, sociology, criminology, anthropology,

archaeology, geology, geography, remote sensing, market research, economics, and engineering.

Only quantitative variables are considered here. According to Duran and Odell (1974),

Jensen (1969), and Seber (1984), hierarchical methods and other clustering algorithms represent

an attempt to find “good” clusters in the data using a computationally efficient technique. It is

not generally feasible to examine all possible clustering possibilities for a data set, especially a

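large one. The number of ways of partitioning a set of n items into g clusters is given by:

$$N(n, g) = \frac{1}{g!} \sum_{k=1}^{g} \binom{g}{k} (-1)^{g-k} k^{n}.$$

This can be approximated by $g^{n}/g!$, which is large even for moderate values of n and g. The total number of possible partitions for a set

of n items is $\sum_{g=1}^{n} N(n, g)$. Hence, hierarchical methods and other approaches permit one to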

search for a reasonable solution without having to look at all possible arrangements. As noted

above, hierarchical clustering algorithms involve a sequential process. In each step of the

agglomerative hierarchical approach, an observation or a cluster of observations is merged into

another cluster. In this process, the number of clusters shrinks and the clusters themselves grow

larger. The process starts with n clusters (individual items) and ends with a single cluster containing the

entire data set. An alternative approach, called the divisive method, starts with a single cluster

containing all n items and partitions a cluster into two clusters at each step. The end result of the

divisive approach is n clusters of one item each. Agglomerative methods are more commonly

used than divisive methods. In either type of hierarchical clustering, a decision must be made as

to the optimal number of clusters. At each step of an agglomerative hierarchical approach, the

two closest clusters are merged into a single new cluster. The process is therefore irreversible in

the sense that any two items that are once lumped together in a cluster cannot be separated later

in the procedure; any early mistakes cannot be corrected. Similarly, in a divisive hierarchical

method, items cannot be moved to other clusters. An optional approach is to carry out a

hierarchical procedure followed by a partitioning procedure in which items can be moved from

one cluster to another. Since an agglomerative hierarchical procedure combines the two closest

clusters at each step, one must consider the question of measuring the similarity or dissimilarity

of two clusters. Different approaches to measuring distance between clusters give rise to

different hierarchical methods.
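
To give a sense of why an exhaustive search over all groupings is impractical, the number of possible partitions can be evaluated directly. The following Python sketch assumes the standard closed form for the number of ways to split n items into g non-empty clusters (a Stirling number of the second kind) and sums it over g to obtain the total number of clusterings. The function names are illustrative and are not part of the original study.

```python
from math import comb, factorial


def partitions_into_g(n: int, g: int) -> int:
    """Number of ways to split n items into exactly g non-empty clusters."""
    total = sum((-1) ** (g - k) * comb(g, k) * k ** n for k in range(1, g + 1))
    return total // factorial(g)  # the alternating sum is always divisible by g!


def all_partitions(n: int) -> int:
    """Total number of possible clusterings of n items (sum over g = 1..n)."""
    return sum(partitions_into_g(n, g) for g in range(1, n + 1))


if __name__ == "__main__":
    n, g = 16, 4  # 16 localities grouped into 4 clusters, as in Section 4.1
    print("exact count      :", partitions_into_g(n, g))
    print("approx. g^n / g! :", g ** n // factorial(g))
    print("all clusterings  :", all_partitions(n))
```

Even for the 16 localities and four clusters considered in Section 4.1, the exact count is roughly 1.7 × 10^8, which illustrates why a sequential procedure is preferred to enumerating every arrangement.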

3.0 Methods

Ayila (2004) defined research methods as the process of arriving at a dependable solution to

problems through a planned systematic collection, analysis and interpretation of data.

3.1 Hierarchical Cluster Analysis

The hierarchical clustering method uses the distance matrix to build a treelike diagram, called a

dendrogram. In the beginning, each individual object is considered a cluster containing one object,

that is, itself. If there are N objects, then there will be N clusters. Then, the two objects with the

closest distance are selected and combined into a single cluster. Now the number of clusters has

changed from N to N − 1. Then we compute the distances from the other objects to the newly

formed cluster. This distance calculation is based on one of the linkage methods, such as single

linkage, complete linkage, etc. After this, the distance matrix is updated. Now there are N − 1

clusters, so the new distance matrix will be an (N − 1) × (N − 1) matrix. This process is continued.

At each step of this process, the number of clusters is reduced by 1. Finally, all objects are

combined into one cluster. Their mutual relationship is expressed by the dendrogram. We then

examine the shape of the dendrogram and decide how many clusters there are in the whole

population and which objects should be included in each cluster. Specifically, the following step-

by-step procedure is adopted:

Step 1. Establish a distance measure from object to object, then compute the distance matrix D,

where $D = (d_{ij})$ and $d_{ij}$ is the distance between objects i and j.

There are N clusters, each containing only one object.

Step 2. Find the minimum distance in the distance matrix D. Assume that the distance from

object r to object s, $d_{rs}$, is this minimum. Then objects r and s are selected to form a single cluster (r, s).

Step 3. Delete the rows and columns corresponding to objects r and s in D. Then add a new row

and column corresponding to cluster (r, s). So the numbers of rows and columns of D are

reduced by 1. Compute the distances from the other objects to cluster (r, s) by using one of the

linkage methods, and use these new distances to fill the row and column corresponding to cluster

(r, s), so that there will be a new matrix D.

Step 4. Repeat steps 2 and 3 N − 1 times until all objects form a single cluster. At each step,

record the merged clusters and the distance at which they merge in the dendrogram. In a dendrogram, the distances

between clusters and the joining process are described very well. Of course, the objects usually do not

form just one single cluster, so there is a need to “cut” the dendrogram to get several clusters. A good

clustering should be as follows:

The objects within a cluster should be similar; in other words, the distances between the

objects within a cluster should be small.

The objects from different clusters should be dissimilar, or the distances between them

should be large. The dendrogram can be “cut” using either a distance measure or a similarity

measure. Here the similarity measure is defined by

3.2 Properties of Hierarchical Methods

Monotonicity: If an item or a cluster joins another cluster at a distance that is less than the

distance for the previous merger of two clusters, we say that an inversion or a reversal has

occurred. The reversal is represented by a crossover in the dendrogram.

A hierarchical method in which reversals cannot occur is said to be monotonic,

because the distance at each step is greater than the distance at the previous step. A distance

measure or clustering method that is monotonic is also called ultrametric. We now show that the

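single linkage and complete linkage methods are monotonic. Let $d_k$ be the distance at which two

clusters are joined at the kth step. We can describe steps k and k + 1 in terms of four clusters A, B,

C, and D. Suppose D(A, B) is less than the distance between any other pair among these four

clusters, so that A and B are joined at step k to form AB. Then $d_k$ = D(A, B) < min{D(A, C),

D(A, D), D(B, C), D(B, D), D(C, D)}. At step k + 1 the candidate distances are D(AB, C), D(AB, D) and D(C, D). Under single linkage D(AB, C) = min{D(A, C), D(B, C)}, and under complete linkage D(AB, C) = max{D(A, C), D(B, C)}, with the same forms for D(AB, D). Each candidate distance therefore exceeds $d_k$, so $d_{k+1}$ > $d_k$.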
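
The procedure of Section 3.1 and the monotonicity property can be illustrated together. The Python sketch below follows Steps 1 to 4 for single linkage (`linkage=min`) and complete linkage (`linkage=max`) and then checks that the merge distances never decrease. For simplicity it recomputes cluster distances from the raw points instead of updating the matrix D, which gives the same result for these two linkage methods. The function names and the sample points are hypothetical and are not taken from the study data.

```python
from itertools import combinations
from math import dist


def agglomerate(points, linkage=min):
    """Agglomerative clustering; returns (cluster_a, cluster_b, distance) per merge.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    """

    def cluster_distance(pair):
        a, b = pair
        return linkage(dist(points[i], points[j]) for i in a for j in b)

    # Step 1: every object starts as its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Step 2: find the two clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=cluster_distance)
        d = cluster_distance((a, b))
        # Step 3: replace the two clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        # Step 4: record the merge and its distance for the dendrogram.
        merges.append((a, b, d))
    return merges


def is_monotonic(merges):
    """True if no merge occurs at a smaller distance than the previous merge."""
    heights = [d for _, _, d in merges]
    return all(h1 <= h2 for h1, h2 in zip(heights, heights[1:]))


if __name__ == "__main__":
    # Hypothetical two-dimensional observations, for illustration only.
    sample = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.5), (10.0, 0.0)]
    for name, rule in (("single", min), ("complete", max)):
        result = agglomerate(sample, linkage=rule)
        print(name, [round(d, 2) for _, _, d in result], is_monotonic(result))
```

The recorded merge distances are the heights at which branches join in the dendrogram, so a monotonic sequence corresponds to a dendrogram without crossovers.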

Contraction or Dilation: these properties concern the distances or

proximities between the original points. As clusters form, the properties of this space of distances

may be altered somewhat. A clustering method that does not alter the spatial properties is

referred to by Lance and Williams (1967) as space-conserving. A method that is not space

conserving may either contract or dilate the space. A method is space-contracting if newly

formed clusters appear to move closer to individual observations, so that an individual item tends

to join an existing cluster rather than join with another individual item to form a new cluster.

This tendency is also called chaining. A method is space-dilating if newly formed clusters appear

to move away from individual observations, so that individual items tend to form new clusters

rather than join existing clusters. In this case, clusters appear to be more distinct than they are.

Dubien and Warde (1979) described the spatial properties as follows. Suppose that the distances

among three clusters satisfy D(A, B) < D(A,C) < D(B,C). Then a cluster method is space-

conserving if D(A,C) < D(AB,C) < D(B,C). A method is space-contracting if the first inequality

does not hold and space-dilating if the second inequality does not hold.
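
The Dubien and Warde criterion can be checked numerically for common linkage rules. The Python sketch below assumes the usual update formulas, D(AB, C) = min{D(A, C), D(B, C)} for single linkage, the maximum for complete linkage, and the simple mean for average linkage with clusters A and B of equal size. The distances are hypothetical values chosen only to satisfy D(A, C) < D(B, C).

```python
def classify(d_ab_c: float, d_ac: float, d_bc: float) -> str:
    """Classify D(AB, C) using the inequalities D(A, C) < D(AB, C) < D(B, C).

    Assumes D(A, B) < D(A, C) < D(B, C), so that A and B were merged first.
    """
    if not d_ac < d_ab_c:      # first inequality fails
        return "space-contracting"
    if not d_ab_c < d_bc:      # second inequality fails
        return "space-dilating"
    return "space-conserving"


d_ac, d_bc = 3.0, 5.0  # hypothetical distances with D(A, C) < D(B, C)

updates = {
    "single linkage": min(d_ac, d_bc),
    "complete linkage": max(d_ac, d_bc),
    "average linkage (equal sizes)": (d_ac + d_bc) / 2,
}
results = {name: classify(d, d_ac, d_bc) for name, d in updates.items()}

for name, label in results.items():
    print(f"{name}: {label}")
```

Under these assumptions single linkage is classified as space-contracting, complete linkage as space-dilating and average linkage as space-conserving, which matches the tendency of single linkage towards chaining described above.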

3.3 Assumptions

The likelihood distance measure assumes that the variables in the hierarchical cluster

model are independent. Further, each continuous variable is also assumed to have a normal

(Gaussian) distribution, and each categorical variable is assumed to have a multinomial

distribution. Empirical internal testing indicates that the procedure is fairly robust to violations of

both the assumption of independence and the distributional assumptions, but one should try to be

conscious of how well these assumptions are met. The data used for this research work

were collected from the Nigerian Police divisions of three local government areas of Plateau State

from 2004 to 2014.

4.0 Analysis of Data

The data for this research work were analyzed using the Predictive Analytics Software (PASW),

IBM version 20.

4.1 Interpretation of Results

The vertical icicle plot shows the ordering of the clustered cases, from Yantrailer and Fobor, in that order, to

Angwan Rogo. These cases can be cross-checked on the dendrogram, which keeps the cases of a cluster close

together. The top

cluster, middle cluster and end cluster are all shown on the icicle plot; the essence of the icicle plot

is to keep all the cases together. To interpret the icicle plot, the vertical columns are used to keep

track of the progression of the clustering. For each of these cases there is a vertical column;

where the column is short, it shows where the clusters have been joined. The first two cases to be joined

are Jenta and Odus Ringroad, followed by Maijuju and Dilimi Yandoya, Bukuru and Gangare, and

so on. The last two groups to join are represented by the tallest column: the group that

contains Dadin Kowa and the group that contains Bauchi Road. The dendrogram is the pictorial

representation of the clusters, showing how they are joined. Cluster one consists of six cases,

namely Angwan Rogo, Angwan Rukuba, Bukuru, Gangare, Kabong and Zawan; cluster two

consists of one case, Dadin Kowa; cluster three consists of five cases, namely Angware,

Jenta, Apata, Bauchi Road and Odus Ringroad; and cluster four consists of four cases, namely

Maijuju, Fobor, Yantrailer and Dilimi Yandoya. It is worth noting that clusters three and four

are closer together than the other clusters. From the scree plot it can be seen that the crime

data is quadratic in nature, with stage 15 at the peak of the graph. The Ward method cluster

membership is used to form the frequency table showing the distribution of all the clusters, with

cluster one having 37.5% and cluster two having the lowest percentage, about 6.3%. The ANOVA

table shows all the crime types committed on the Plateau, from murder down to cattle

rustling. All the crime types are significant at the 5% level of significance except burglary, which has a p-value of 0.087,

greater than the significance level of 0.05.
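
The percentages quoted from the frequency table follow directly from the cluster memberships listed above. As a quick check, the Python sketch below rebuilds the table from those memberships; only the locality names and cluster assignments come from the text, and the variable names are illustrative.

```python
# Cluster memberships as listed in Section 4.1 (Ward method).
clusters = {
    1: ["Angwan Rogo", "Angwan Rukuba", "Bukuru", "Gangare", "Kabong", "Zawan"],
    2: ["Dadin Kowa"],
    3: ["Angware", "Jenta", "Apata", "Bauchi Road", "Odus Ringroad"],
    4: ["Maijuju", "Fobor", "Yantrailer", "Dilimi Yandoya"],
}

# Total number of localities and the share of each cluster in percent.
total = sum(len(members) for members in clusters.values())
percentages = {k: 100 * len(v) / total for k, v in clusters.items()}

for k, members in clusters.items():
    print(f"Cluster {k}: {len(members):2d} localities, {percentages[k]:.2f}%")
```

With 16 localities in total, cluster one accounts for 6/16 = 37.5% and cluster two for 1/16 = 6.25%, in agreement with the reported figures.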

4.2 Discussion of Results

From the results of the analysis above, it is worth noting that hierarchical clustering analysis

is a powerful multivariate tool that can classify multivariate statistical data based on their

similarities. It identifies hidden factors that cannot otherwise be accounted for. From the

dendrogram it can be seen that cluster one, which consists of six cases, namely Angwan Rogo,

Angwan Rukuba, Bukuru, Gangare, Kabong and Zawan, shows many similarities in crime rate,

and these cases call for security attention to tackle the upsurge of crime in such localities. Places

like Angwan Rukuba, Angwan Rogo, Gangare, Kabong and Bukuru have been known in past

and recent times for high rates of crimes such as rape, murder, assault, drug abuse and cattle

rustling. Cluster two, which consists of just one case, Dadin Kowa, has been known in

past and recent times for peaceful coexistence among all its clans, tribes and religious

groups, who live together in peace and harmony. This locality witnesses a low crime rate, which

is in line with the publications made by the peace and conflict resolution, University of Jos

(2014), which awarded the locality for its peaceful coexistence on the Plateau. Cluster three,

which consists of five localities, namely Angware, Jenta, Apata, Bauchi Road and Odus Ringroad, also

calls for more security deployment. These localities have, in the past and present,

been known for high rates of rape, murder (especially campus occultism), prostitution, drug abuse,

robbery and burglary; they have proven notorious over time

and call for security attention, especially joint task force operations. The localities in cluster four,

Maijuju, Fobor, Yantrailer and Dilimi Yandoya, are also similar in their mode of crime

operations. These localities have, in the past and present, been known for prevalent cases of

assault, robbery and murder; they have threatened the lasting peace on the Plateau and

call for more security operations. From the ANOVA table it can be seen

that the various kinds of crimes committed on the Plateau, namely murder, rape, robbery,

assault, drug abuse and cattle rustling, are all significant, that is, they have p-values less than

0.05, which means they are all significantly committed on the land of Plateau, but

burglary is not significant. This does not imply that it is not committed in

the land, but that it is done on a relatively small scale compared to the other crimes.

4.3 Conclusion and Recommendation

In conclusion, this study has examined the crimes committed on the Plateau and

recommends that more security personnel be deployed to localities such as Angwan Rogo,

Angwan Rukuba, Gangare, Odus Bauchi Ringroad, Jenta, Fobor, Apata, Dilimi Yandoya and

Bukuru. This is to help maintain and build a lasting peace on the land of Plateau especially in

this contemporary time when terrorism and other crimes are disrupting civic rest and destabilizing

the peace and economies of nations. Also, the government of the day should encourage the youth in

such localities by providing social amenities, employment opportunities, peace campaign

programs and sensitization of these communities on the need for peaceful coexistence.

REFERENCES

Brown, S. E., Esbensen, F., and Geis, G. (1996), Criminology: Explaining Crime and its Context, 2nd

edition, Cincinnati: Anderson Publishing.

Dubien, J. L., and Warde, W. D. (1979), “A Mathematical Comparison of the Members of an

Infinite Family of Agglomerative Clustering Algorithms,” Canadian Journal of Statistics, 7, 29–38.

Duran, B. S., and Odell, P. L. (1974), “Cluster Analysis: a Survey,” Lecture Notes in Economics

and Mathematical Systems, New York: Springer-Verlag.

Friedman, J. H., and Tukey, J. W. (1974), “A Projection Pursuit Algorithm for Exploratory Data

Analysis,” IEEE Transactions on Computers, C-23(9), 881–890.

Huber, P. J. (1985), “Projection Pursuit,” Annals of Statistics, 13, 435–525.

Jensen, R. E. (1969), “A Dynamic Programming Algorithm for Cluster Analysis,” Operations

Research, 17, 1034–1057.

Jones, M. C., and Sibson, R. (1987), “What is Projection Pursuit?” Journal of the Royal

Statistical Society, Series A, 150, 1–36.

Lance, G. N., and Williams, W. T. (1967), “A General Theory of Classificatory Sorting

Strategies: I. Hierarchical Systems,” Computer Journal, 9, 373–380.

Nason, G. (1995), “Three-dimensional Projection Pursuit,” Journal of Applied Statistics, 44(4),

411–430.

Posse, C. (1990), “An Effective Two-dimensional Projection Pursuit Algorithm,” Communications

in Statistics: Simulation and Computation, 19, 1143–1164.

Ripley, B. D. (1996), Pattern Recognition and Neural Networks, Cambridge: Cambridge

University Press.

Seber, G. A. F. (1984), Multivariate Observations, New York: Wiley.

Sibson, R. (1984), “Present Position and Potential Developments: Some Personal Views.

Multivariate analysis (with discussion),” Journal of the Royal Statistical Society, Series A, 147,

198–207.

Vander Walt, P. J., Cronje, G., and Smith, B. F. (1985), Criminology: An Introduction, Pretoria:

HAUM.

Van Velzen, F. A. (1998), Fear of Crime: A Socio-Criminological Investigation, unpublished D.Phil.

thesis, KwaDlangezwa: University of Zululand.

Yenyukov, I. S. (1988), “Detecting Structures by Means of Projection Pursuit,” Compstat, 88,

47–58.
