from police and judicial databases to an offender-oriented data warehouse

8
FROM POLICE AND JUDICIAL DATABASES TO AN OFFENDER-ORIENTED DATA WAREHOUSE Sunil Choenni and Ronald Meijer Research and Documentation Centre (WODC) of the Ministry of Security and Justice, Schedeldoekshaven 131, 2511 EM Den Haag, The Netherlands ABSTRACT For the execution of the criminal law, several organisations are involved in the criminal justice system. Each of these organisation has their own more or less isolated databases/sources. As the interest in the criminal justice chain as a whole increased in the nineties, the need for coherent information about the entire chain increased as well. The data from the different organisations may be compared to each other, for instance, to gain some understanding of how specific groups of suspects or criminal proceedings move through the chain. Therefore, we developed an offender-oriented data warehouse on the basis of three databases at different organisations. In this paper we give an overview of the challenges that we had to tackle in setting up the data warehouse and we describe how we are applying the data warehouse in practice. KEYWORDS e-government, databases, data warehouse, criminal justice system 1. INTRODUCTION The police and the judicial authorities of the Netherlands count various organisations, each of which operates relatively autonomously and independently. Each developed its own operational systems. The police built its regional Identification Service Systems (HKS) and the Public Prosecutions Department developed the Public Prosecutions Departments Administration System (COMPAS). Judicial data are registered in the Criminal Records System (JDS), and the Custodial Institutions Agency enters its data in the Custodial Institutions Enforcement System (TULP). Alongside the operational systems, derivative databases were developed intended to make the data from the operational systems suitable for making analyses and reports for management information purposes for the organisation. A national HKS was derived from the regional HKS databases; data from all regional versions were merged into this national HKS. In addition to the COMPAS operational system, the Public Prosecutions Department (PPD) developed a national variant, the Public Prosecutions Department Data (OMDATA), merging data from all court districts. The Judicial Records Research and Policy Database (OBJD) contains a selection of data from JDS. As each of the three sources – HKS, OMDATA, OBJD – was developed by a different organisation for different target groups, with each organisation having its own information requirements, it will not be surprising that these sources differ in many respects. On the other hand, however, partly the same information is stored in several sources. One of the consequences of this is that, if a question can be answered by more than one of these sources, the answer is hardly ever the same. Depending on the source used, the answer may be different. In addition to this, the answers may be inconsistent due to variations among the definitions used in the different systems, which does not help the validity and reliability of the data. The provision of information may consequently be considered far from optimal. As the interest in the criminal justice chain as a whole and the interrelationship among the different co- operating organisations increased in the nineties, the need for coherent information about the entire chain increased as well. The data from the different co-operating organisations may be compared to each other, for instance, to gain some understanding of how specific groups of suspects or criminal proceedings move through the chain. Inconsistencies in the data were consequently considered undesirable for an effective and ISBN: 978-972-8939-46-5 © 2011 IADIS 98

Upload: independent

Post on 11-Dec-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

FROM POLICE AND JUDICIAL DATABASES TO AN OFFENDER-ORIENTED DATA WAREHOUSE

Sunil Choenni and Ronald Meijer Research and Documentation Centre (WODC) of the Ministry of Security and Justice, Schedeldoekshaven 131, 2511 EM

Den Haag, The Netherlands

ABSTRACT

For the execution of the criminal law, several organisations are involved in the criminal justice system. Each of these organisation has their own more or less isolated databases/sources. As the interest in the criminal justice chain as a whole increased in the nineties, the need for coherent information about the entire chain increased as well. The data from the different organisations may be compared to each other, for instance, to gain some understanding of how specific groups of suspects or criminal proceedings move through the chain. Therefore, we developed an offender-oriented data warehouse on the basis of three databases at different organisations. In this paper we give an overview of the challenges that we had to tackle in setting up the data warehouse and we describe how we are applying the data warehouse in practice.

KEYWORDS

e-government, databases, data warehouse, criminal justice system

1. INTRODUCTION

The police and the judicial authorities of the Netherlands count various organisations, each of which operates relatively autonomously and independently. Each developed its own operational systems. The police built its regional Identification Service Systems (HKS) and the Public Prosecutions Department developed the Public Prosecutions Departments Administration System (COMPAS). Judicial data are registered in the Criminal Records System (JDS), and the Custodial Institutions Agency enters its data in the Custodial Institutions Enforcement System (TULP).

Alongside the operational systems, derivative databases were developed intended to make the data from the operational systems suitable for making analyses and reports for management information purposes for the organisation. A national HKS was derived from the regional HKS databases; data from all regional versions were merged into this national HKS. In addition to the COMPAS operational system, the Public Prosecutions Department (PPD) developed a national variant, the Public Prosecutions Department Data (OMDATA), merging data from all court districts. The Judicial Records Research and Policy Database (OBJD) contains a selection of data from JDS. As each of the three sources – HKS, OMDATA, OBJD – was developed by a different organisation for different target groups, with each organisation having its own information requirements, it will not be surprising that these sources differ in many respects. On the other hand, however, partly the same information is stored in several sources. One of the consequences of this is that, if a question can be answered by more than one of these sources, the answer is hardly ever the same. Depending on the source used, the answer may be different. In addition to this, the answers may be inconsistent due to variations among the definitions used in the different systems, which does not help the validity and reliability of the data. The provision of information may consequently be considered far from optimal.

As the interest in the criminal justice chain as a whole and the interrelationship among the different co-operating organisations increased in the nineties, the need for coherent information about the entire chain increased as well. The data from the different co-operating organisations may be compared to each other, for instance, to gain some understanding of how specific groups of suspects or criminal proceedings move through the chain. Inconsistencies in the data were consequently considered undesirable for an effective and

ISBN: 978-972-8939-46-5 © 2011 IADIS

98

efficient development of policy. Just like other organisations, the Research and Documentation Centre (WODC) of the Ministry of Security and Justice (acronym in Dutch: WODC) has taken a step towards the integration of information systems. The integration concerns three information systems from the criminal justice chain, namely HKS, OMDATA, and OBJD. These information systems have been incorporated in an Offender-Oriented Data Warehouse, with the aim to optimise the provision of information.

By integrating the data into a data warehouse (DW), it will be possible to tackle the problems around inconsistencies, reliability, and validity. A DW ensures a uniform approach to data, integrates its management, and ensures maximum accessibility. Setting up the Offender-Oriented DW requires both expertise in the area of judicial authorities and computer science.

Relating data from different systems is a labour intensive job, because the information from a system does not contain any references (‘foreign keys’) to so-called primary keys of the other systems. A primary key is a collection (not empty) of characteristics (also referred to as attributes) which uniquely specifies a record in a system (Elmasri et al., 2004). It is not always permitted to use the foreign keys existing among some systems, because they are privacy sensitive. Consequently, it is currently not yet feasible to relate information from the different co-operating organisations at the individual level.

In this paper, we discuss our approach to set up an offender-oriented DW given three sources, HKS, OMDATA and OBJD. Furthermore, we illustrate how we are using the DW in practice. In our approach, we relate these databases on the basis of the collection of the common attributes between them. For the purpose of integrating the data from the different databases intensive use was made of extensive domain expertise available at the WODC, which had been developed in the course of years.

Different systems contain different attributes, but partly also the same attributes. The attribute ‘offender’s place of residence’ may, for instance, be relevant to a police unit engaged in bringing in a suspect as well as to the PPD engaged in proceedings against a suspect and/or the correspondence as a result of these proceedings. In this case, the attribute ‘place of residence’ will be found in HKS as well as OMDATA. It furthermore appeared that, although many of the attributes found in systems were not the same, they did show strong correlations. The contents of HKS and OMDATA, for instance, show that in respect of one record, the same date had been entered behind the attributes ‘date of the offence’ and ‘date of official report’ in nearly all cases. In practice, the offence is nearly always reported to the police on the same day as the offence has been committed. In order to relate these databases, we also used attributes that strongly correlated and we counted these attributes towards the collection of the common attributes.

The remainder of this paper is organized as follows. In Section 2, we briefly discuss the Dutch criminal justice system and the databases playing a role in our data warehouse. Section 3 is devoted to our approach, the architecture and how we use our data warehouse. Section 4 concludes the paper.

2. DATABASES IN THE DUTCH CRIMINAL JUSTICE SYSTEM

The Dutch criminal justice system may be regarded as a chain of successive links consisting of the phases of criminal investigation, prosecution, trial, and enforcement of the sentences, orders, and measures (Kalidien and Eggen, 2009). The criminal investigation phase starts the moment that the police or another investigating officer has learned of a suspicion that an offence has been committed. The police may in some cases decide to settle cases themselves by a dismissal, a transaction, or – in case of juveniles – a referral to a so-called ‘Halt bureau’ for a ‘Halt settlement’, to avoid a criminal record. The criminal investigation phase is followed by the prosecution phase. On the basis of the results of the criminal investigation, the public prosecutor decides whether to prosecute a suspect or not. The PPD may dismiss the case or he may offer the suspect a transaction (an out-of-court settlement) or issue a punishment order for several minor offences without involving the court by imposing a fine. The PPD is the only authority with the power to decide who is to appear before the court and for which offence. The trial phase starts as soon as the criminal investigation has been concluded and the PPD has decided to prosecute the suspect (to summon a person). The case is then before the court. If a court finds a defendant guilty, it may impose a sanction. Dutch criminal law has principal punishments, additional punishments, and non-punitive measures. The principal punishments are imprisonment, fines, and community orders. Additional punishments include the deprivation of specific rights and confiscation. The enforcement phase starts when the judgment in the first instance or in appeal has become final and conclusive: the judgment will then be enforced. The PPD assigns the actual enforcement of

IADIS International Conference e-Society 2011

99

the sanctions to different judicial and private agencies. The enforcement of custodial sentences and orders are always assigned to the Ministry of Security and Justice / the Custodial Institutions Agency. The enforcement of community orders is assigned to the probation and after-care service in respect of adults and to the Child Protection Board in respect of minors. The Central Fine Collection Agency is responsible for the collection of fines.

It will be evident from the foregoing that the criminal justice system is a complex system involving many more or less independent organisations. The law imposes rules and regulations regarding the data and information policy on each organisation. Each organisation in the criminal justice chain has its own database system in order to facilitate the primary processes. These database systems are not directly connected to each other (Kalidien et al., 2009 ). Below, we will particularly focus on the three systems that have been integrated into our Offender-Oriented DW, namely HKS, OMDATA, and OBJD.

2.1 HKS Database

In HKS, the police enter data of all natural persons who have been apprehended by the police for an offence and in respect of whom an official report has been drawn up. The database contains data on suspects in respect of whom official reports have been drawn up since 1996. A suspect is registered in HKS under a unique personal number: meta number. The file includes such personal details as date, place and country of birth, and sex. In addition, data is entered concerning the perpetration of the offence, such as the person’s age when the first offence was committed and the number of official reports drawn up in his or her total criminal history. The official reports drawn up are registered under a report number; attributes are registration date of the report, year of the offence, and the police region where it has been registered. A report may contain several offences. Within a report, the offences are given an offence number. Attributes entered for each offence are: sections of law violated, number of sections of law violated for each offence, an indication about whether it concerned an attempt to commit the offence, an indication of its severity and a classification of the offence. On the basis of the HKS file, answers may be obtained to questions about numbers of suspects a year or reference year, the numbers of suspects by personal and offence data, or by category, such as habitual offender. Answers may also be obtained to questions about numbers of reports a year or reference year, by details of the suspects or by type of offence.

2.2 OMDATA Database

OMDATA contains all data on the prosecution and trials in all criminal proceedings initiated by the PPD. In respect of court cases (serious offences), OMDATA only contains data on first instance cases. In respect of subdistrict court cases (minor offences), OMDATA contains data on first instance cases. Each case has a unique case number, which is a combination of the district code, the year in which the case was registered at the Public Prosecutor’s Office, and the Public Prosecutor’s Office number. Attributes entered for each case are: some data on the offender, such as date and country of birth, sex, and place of residence. In addition, it includes the decisions made by the PPD, data on dismissals and transactions, information on the final judgement, together with court decision and sentences, orders, and measures imposed. A case may contain several offences. The offences within a case are numbered. The attributes entered in respect of the offences include the following: laws and sections of law violated, date and place of the offence, out-of-court settlement offered by the PPD, type of dismissal, nature and scope of the transaction, some details about the preliminary stage at the police, and several attributes typical of traffic offences, such as breath alcohol content. If a case has been taken to court, it has been heard. There may be several hearings in a case. Each hearing is given a cause list number. During each hearing, some operational data are noted down, including the dates of settlement, notice for enforcement, and the fact that a judgment has become final. Other information included in OMDATA is information on the available legal remedies for appeal and information on the final judgment, together with the court decision and the sentences, orders, and measures imposed.

2.3 OBJD Database

OBJD contains information on the settlements of criminal proceedings when the case has been settled conclusively. OBJD contains information on persons (encrypted), criminal proceedings - together with the

ISBN: 978-972-8939-46-5 © 2011 IADIS

100

related decisions – and any notices. Personal attributes in OBJD are: a unique criminal records number (JDS number), registration number from the municipal administration, month and year and country of birth, nationality, and sex. Legal attributes in OBJD are: unique registration number listed in the Chamber of Commerce, place and country of business as well as place and country of business of the Chamber of Commerce. A person may have several criminal proceedings in his name. Each of these has a unique encrypted case identification number (case id#) and Public Prosecutor’s Office number. The following attributes have been entered for each case: year of registration at the public prosecutor's office, district, type of case, month and year of settlement, agency providing the data, out-of-court settlement offered by the PPD, indications whether the case was settled, whether the case was settled out of court by the PPD or at court, whether it concerned a district-court case or a subdistrict-court case, whether juvenile criminal law was applied, and a number of details concerning the most severe offence in the case. A case may contain several offences. Each offence is given a unique offence identification number (offence id#) and an offence number. The attributes entered in respect of the offences include the following: laws and sections of law violated, place and year of the offence, age of the person at the time of committing the offence, and, where applicable, any out-of-court settlement offered by the PPD, type of dismissal and the grounds for dismissal, and nature and scope of the transaction. If the case has been settled by a court, there is a court decision. A decision in a case may contain several elements. These sentence elements have been numbered by case. The attributes include type of sentence, whether the sentence is unconditional or conditional, severity of the sentence, length of the operational period, and an indication whether a special condition has been set, and if so, what type of special condition.

3. AN OFFENDER-ORIENTED DATA WAREHOUSE

Our approach to setting up a DW was based on making maximum use of database diagrams and the contents of the databases. A database diagram describes the objects (entities) about which information is stored and the relationships between those objects. It also gives a description of the characteristics (attributes) stored for each object. In using the contents of the databases, we have frequently used the available domain expertise.

When taking two records from two different systems, we used the following general rule to establish whether these records related to one and the same object in reality or to two different objects: as the number of common attributes with the same values for two records from two different systems is larger, the chance of the records relating to the same object in reality is larger. The chance that we are dealing with the same person is, for instance, considerable in the following two records: 1) a record from HKS relating to a person residing in Brussels in respect of whom an official report had been drawn up on 6 May 2009; and 2) a record from OMDATA on a person residing in Brussels who committed an offence on 6 May 2009. This chance would, however, be much smaller if HKS showed that the date of the official report had not been entered correctly and is consequently unknown.

In Section 3.1, we discuss our approach on the basis of an example. For a detailed description, we refer to (Choenni et al, 2009). In Section 3.2, we discuss the results that we obtained with our approach and its applicability. Then, in Section 3.3, we discuss briefly the architecture of our DW and in Section 3.4 we demonstrate how we use the DW in practice.

3.1 Approach

Let us consider the ‘snapshots’ – example records – of the two databases, represented in Table 1 and 2.

Table 1. Example of records from HKS

id# Name Address Postal Code Sex Year of Birth Official Report

Date of Official Report

1 Paelix Schmidtweg 4 2500 EH M 1975 1234-01 6 June 2001 2 Jans Wagenstraat 9 1212 ZK F 1960 3453-97 1 May 1997

IADIS International Conference e-Society 2011

101

Table 2. Example of records from OMDATA

Id# Name Address Place of Residence

Sex Date of Birth Case# Type of Offence

Date of Offence …

3 Paelix Schmidtweg 4 Almere M 4 May 1975 35-01 1 5 June 2001 3 Paelix Schmidtweg 4 Almere M 4 May 1975 235-01 2 6 June 2001 4 Heksien Knuthstraat 48 Tiel F 6 Oct.1975 342-01 1 6 June 2001

The contents of these databases include data on suspects and the offences they committed. The databases

and corresponding diagrams are simplified forms of HKS and OMDATA. In the databases # stands for number and id# for identification number.

The corresponding database diagrams (in so-called ‘Entity-Relationship’ – abbreviated to ER – notation) for HKS and OMDATA, respectively, are represented in Figure 1.

Figure 1. HKS and OMDATA ER diagrams

HKS stores information on suspects, the official reports drawn up in respect of them, and the offences of which they are suspected (represented as types of entity in the above diagram). Several official reports may, for instance, have been drawn up in respect of a suspect, but one report – as included in HKS – can relate to only one suspect (in the above diagram indicated by means of a ‘rake’ as a one-in-many relationship). A similar relationship can be found between official reports and the offences committed. If we look at OMDATA, we see that this database, in addition to information about a suspect and the offences of which he/she is suspected, also registers case-related information. Because both diagrams have overlapping types of entities, we take the ‘intersection’ of these diagrams, which results in the combined diagram represented in Figure 2. This diagram contains the types of entities which share a collection of separate diagrams as well as the relationships between these types of entities.

Figure 2. Combined HKS and OMDATA ER diagram

We selected the contents of the databases that correspond with the combined diagram and removed privacy sensitive attributes, such as name and address. If we apply this to the snapshots of HKS and OMDATA, we will get the following stripped snapshots (Table 3).

Table 3. Example of records from a combined HKS and OMDATA diagram; without privacy sensitive attributes

id# Sex Year of Birth Date of Official Report

1 M 1975 6 June 2001 2 F 1960 1 May 1997

Table 4. Stripped OMDATA

id# Place of Residence

Sex Date of Birth Type of Offence

Date of the Offence

3 Almere M 4 May 1975 1 5 June 2001 3 Almere M 4 May 1975 2 6 June 2001 4 Tiel F 6 Oct. 1975 1 6 June 2001

Suspect Offence

Suspect

Offence

or

Suspect

Case

Offence

Entity

One-to-many relationship

Official Report

ISBN: 978-972-8939-46-5 © 2011 IADIS

102

We subsequently determine the probability that two entities relate to the same person on the basis of the common attributes. In the example, we think there is a good chance that the person in HKS with id# =‘1’ is the same person as in OMDATA with id# = ‘3’, because the overlapping attributes (sex, year of birth) assume the same values. Besides, we know from practice that on the date that an offence was committed, an offence was reported to the police. We see that the date of the official report in HKS for entity with id# =‘1’ is the same for entity with id# = ‘3’ in OMDATA (second row).

3.2 Results & Applicability

We applied the above-mentioned principles to 10,000 suspects in HKS. These suspects had been selected from previous WODC research and were carefully (read: on the basis of unique numbers and distinguishing attributes) connected to OMDATA. Out of the 10,000 suspects from HKS, 8,705 suspects were retrieved in OMDATA. The common attributes we identified for HKS and OMDATA are year of birth, sex, country of birth, date of official report / date of the offence, the first five sections of law included in an official report or charged by the PPD. Connecting the HKS and OMDATA files on the basis of these common attributes has resulted in the correct results in more than 93 % of cases (see Table 5).

Table 5. Results of approach

Result Number Percentage Type-I error 490 4.9% Type-II error 133 1.3% Correct result 9377 93.7%

In 490 cases, we assumed on the basis of our approach that it concerned different suspects, whereas it was

the same suspect in reality (type 1 error) and in 133 it concerned different suspects in reality, whereas we assumed on the basis of our approach that it was the same suspect (type 2 error). In 9,377 of the cases, the classification was correct.

Our experiment shows that the results are rather good. However, the applicability of our approach depends on the characteristics of the overlapping attributes. An important characteristic of an attribute is its selectivity factor, which is defined as 1/(the number of different values an attribute assumes) (Choenni et al, 2003). The smaller the selectivity factor of an attribute, the more discriminating the attribute is. We discuss a heuristic based on selectivity factors to estimate the chances whether our approach may be applied successfully or not.

Let S be a set of overlapping attributes of a set of relations iR ,i = 1,2, ...n, and ),( jiRsf α is the

selectivity factor of attribute jα in relation iR . Then, we compute H=∏∏= ∈

n

i Siji

j

RRsf1

).,(α

α , in which

iR is the cardinality of a relation iR . The smaller H , the more suitable our approach will be. The optimal

value that H may assume is close to 1. In the case H is too large, this implies that there are many tuples referring to a same entity in real-life. To match the proper t tuple to the real-world entity in these cases, we extensively exploit domain knowledge (Choenni et al, 2009).

3.3 Architecture

After having thoroughly analysed the three databases, HKS, OMDATA and OBJD, the architecture of the DW was developed and subsequently implemented in an ORACLE environment. It will be evident from the descriptions of the databases in the previous section that the data relate to the same cases at different phases in the criminal justice process, with identical and different attributes. The identical attributes relate to the data which are relevant to each of the three organisations: all organisations will, for instance, register the type of offence. Other attributes are relevant to the work of one organisation, but not to that of the other. The identical attributes were used to merge the files. The architecture of our DW is represented in Figure 3.

IADIS International Conference e-Society 2011

103

Figure 3. WODC Offender-oriented data warehouse architecture* Note*: DW= Data Warehouse; ndm= national drug monitor; pmj= justice chain forecasting model

The data extracted from the databases are data that may be of interest to the users of the DW, initially, in particular, the WODC researchers. The data will be cleared, transformed, and subsequently loaded into the DW. The DW tables constitute the actual DW: data from different databases have been combined and ordered here. In respect of the Offender-Oriented DW, this means that the data from the three sources have been connected and oriented towards persons. The structure of the DW has been developed in such a way that all data in the DW relate to persons in one way or another. In addition, information about the data in the DW is stored in a so-called ‘metadatabase’. These metadata contain all information required to assess the data, such as definitions, origin, adaptations, and restrictions. The metadatabase is consequently a collection of all documentation available about the sources and adaptations of the data.

3.4 Application: a Drug Crime Data Mart

Our department contributes to various national and international reports containing figures and information of registered drug crime in the Netherlands, including monitor reports, recurrent annual surveys, specific research projects, and questions from our information desk. The majority of these information products are made under the flag of the National Drug Monitor (NDM) (Meijer et al., 2008).

The Drug Crime Data Mart (DCDM) consists of a selection of all data from the Offender-Oriented DW that are relevant to the NDM. This selection was obtained by identifying the information requirement concerning drug crime of the various NDM projects and by determining in which respect these data were identical for those projects. On the basis of this analysis, we could subsequently define the DCDM. This data mart consists of a set of data from the DW that meets the information requirement of the various NDM projects. The DCDM is currently the source for those projects. By means of the data mart, it was possible to integrate the existing NDM projects and to streamline the existing ad hoc software. The data mart is suitable for analysis and reporting purposes.

The DCDM has been expanded by a module that is intended for the publication of tables and figures in reports, a so-called ‘Publication on Demand’ module (POD), which makes it possible to insert pre-selected tables and figures automatically in pre-determined places or to update them. The layout of the relevant tables and figures is maintained during this process. The computerisation of publication processes enables the editors to focus their attention exclusively on putting the tables and figures in the right layout. The only thing the editors have to do is initiate the process of inserting and updating the tables and figures: inserting and updating follows automatically. This saves considerable time and, in addition, it considerably reduces the chance of errors. By means of the POD module , the figures from a previous NDM publication – e.g. on custodial punishments and detention years in the period 2001-2006 – were automatically supplemented with the year 2006 for the purpose of a follow-up report (Van Laar et al. 2008 ). For more details, please refer to (Meijer et al, 2008).

All in all, we have been able to establish that the DCDM contributes considerably to the integration of the existing NDM projects. Although these projects may still be considered as independent projects from a research management point of view, the implementation of the underlying data is managed as one project.

Extraction

Cleaning Transformation Loading

DW

Metadatabase ...

NDM

OM Data

OBJD

HKS

PMJ

ISBN: 978-972-8939-46-5 © 2011 IADIS

104

The data collection is performed periodically for all NDM projects and the updates of the data are done automatically. The documentation of the related projects is stored centrally. This increases the overview of the definitions used and makes it easier to adapt them, if necessary. Query programmes, which used to be written in different computer languages and syntaxes and which were spread across several projects, have been transformed in standard SQL and incorporated under one central activity. This has simplified programme maintenance and increased efficiency. Finally, we would like to state in this context that expanding the data mart with POD could be very effective for more or less standardised reports.

4. CONCLUSION

The Offender-Oriented DW has been realised by bringing together and connecting different databases on the basis of common attributes and the domain expertise available at the WODC. Identifying the available domain expertise has been a time-consuming job. A major advantage of this is, however, that we currently understand the databases in our centre better and that we know better which information is contained in these databases. We are currently able to determine rather quickly whether a request for information from judicial partners can be answered or not. The efforts made for identifying the domain expertise furthermore also contributed to the improvement in quality of the data. Another advantage of setting up the DW is that we are currently able to answer a number of new types of questions: questions concerning the relationship among the different databases.

The DW may currently be searched in two ways. The first method is a conventional one, namely by making ad hoc queries or search software. The second method is by integrating data marts into the DW. The advantage of setting up data marts is that this considerably facilitates the integration of projects that use the same collection of data. This results in efficiency gain. This applies, in particular, to standardised reports, such as the NDM report.

We would like to note that the Offender-Oriented DW is not suitable for answering questions in respect of small groups of individuals, because we are dealing with connections on the basis of common attributes. As a result of this, we never know with certainty whether the connections are correct at the individual level. We can only say something about the quality of the connections from a statistical point of view. An experiment has revealed that these connections may be qualified as good. Furthermore, we have provided a rule of thumb that may be used to determine beforehand to what extent our approach is applicable.

It is our aim to also integrate databases that relate to the enforcement of sanctions into the DW in the near future.

REFERENCES

Choenni, S. and van Dijk, J., 2009. Towards Privacy Preserving Data Reconciliation for Criminal Justice Chains. in Proceedings of the 10th Annual international Conference on Digital Government Research: Social Networks: Making Connections between Citizens, Data and Government (May 17 - 20, 2009). S. A. Chun, R. Sandoval, and P. Regan, Eds. ACM International Conference Proceeding Series; vol. 390. Digital Government Society of North America, 223-229.

Choenni, S., Blanken, H., and Chang, T. 1993. Index Selection in Relational Databases, In International Conference of Computing and Information (ICCI) 1993, IEEE press, 491-496

Elmasri, R., Navathe, S. (2004) Fundamentals of Database Systems, Addison-Wesley Kalidien, S., Choenni, R., and Meijer, R., 2009. Towards a Tool for Monitoring Crime and Law Enforcement. In

proceedings of the 3rd European Conference on Information Management and Evaluation (ECIME),Gothenburg, September 17-18, 2009; Academic Publishing Limited, Curtis Farm, Kidmore end, pp. 239-247.

Kalidien, S. N. and Eggen, A. Th. J., 2009. Criminaliteit en rechtshandhaving 2008. WODC, The Hague. Meijer, R., van Dijk J., Leertouwer E., and Choenni, S., 2008. A Drug Crime Data Mart to Support Publication on

Demand. Proceedings of the 2nd European Conference on Information Management and Evaluation (September 11-12, 2008); Academic Publishing Limited, Reading, UK, 277-286.

Van Laar, M. et al., 2008. The Netherlands Drug Situation 2007; Report to the EMCDDA, by the Reitox National Focal Point. Trimbos Institute, Utrecht.

IADIS International Conference e-Society 2011

105