spatial analysis of crime data: a case study mike tischler presented by arnold boedihardjo

Spatial Analysis of Crime Data: A Case Study

Mike Tischler

Presented by Arnold Boedihardjo

Outline

• Motivation
• Spatial autocorrelation
• Approach
• Issues
• Data sets

Motivation

• Goal: reduce crime activity
• Develop a tool to extract crime patterns
– Allow visualization of patterns

• Ultimately, predict crime occurrences

Spatial Autocorrelation

• Tobler’s first law of geography: “everything is related to everything else, but near things are more related than distant things”

• Possible causes of spatial dependency
– Spatial causality: an object (event) is a direct cause of nearby objects (events)
– Spatial correlation: nearby objects (events) behave similarly
– Spatial interaction: movements of objects induce a relationship between objects in different locations
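One common way to put a number on spatial autocorrelation is Moran's I, a standard statistic that the slides themselves do not name. The sketch below is a minimal illustration under stated assumptions: the function name `morans_i`, the binary "within `radius`" neighbor weights, and the toy values are all hypothetical choices made for this example, not part of the case study.

```python
import numpy as np

def morans_i(values, coords, radius):
    """Moran's I with binary weights: w_ij = 1 if 0 < dist(i, j) <= radius.

    The weighting scheme is an assumption for this sketch; other
    schemes (inverse distance, k nearest neighbors) are also common.
    """
    x = np.asarray(values, dtype=float)
    c = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all locations.
    dist = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
    w = ((dist > 0) & (dist <= radius)).astype(float)
    z = x - x.mean()  # deviations from the mean
    return len(x) * (w * np.outer(z, z)).sum() / (w.sum() * (z ** 2).sum())

# Six locations on a line, one unit apart (toy data).
coords = [(i, 0.0) for i in range(6)]
clustered = [1, 1, 1, 5, 5, 5]    # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5]  # neighbors always differ

print(morans_i(clustered, coords, radius=1.0))    # positive
print(morans_i(alternating, coords, radius=1.0))  # negative
```

A positive value indicates that nearby locations tend to carry similar values, which is the situation Tobler's law describes; a negative value indicates that neighbors tend to differ.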

Approach

• Provide a spatial-based model to describe the density of incident objects (e.g., crime locations) within a given set of spatial objects

• The density values are essentially probability values, hence can be used as a predictive metric for future occurrences of incident objects

Example: When will the next crime happen?

[Figure: map showing Bank A, Bank B, Bank C, and two stores, with crime incidents (marked "C") plotted among them.]

How to formalize our intuition in a probabilistic framework?

• The probability of a crime occurring at bank C is higher than at the stores
– Furthermore, the probability is equivalent to that at bank A and bank B
• How to define the probabilities?
– Kernel Density Estimation

Applying the KDE

• Suppose that our sample set, S, is not the incident points but the pairwise distances from the incidents to their nearest-neighbor (NN) non-incident objects (e.g., banks and stores)

• If we apply the KDE to S, the kernel functions will be centered at these pairwise distances, and each query point will be transformed to its NN distances to the non-incident spatial objects

• Formally, we have the following multivariate KDE

$$D(\vec{x}) = \sum_{i=1}^{|incidents|} \prod_{d=1}^{|dim|} K_{H_d}\big(NN(\vec{x}, d) - s_{i,d}\big)$$
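The estimator described in this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes that each dimension d corresponds to one feature type (banks, stores), that NN(x, d) is the Euclidean distance from x to the nearest feature of type d, and that K is a Gaussian kernel with one bandwidth per dimension. The division by the number of incidents is an added normalization, and all function names and coordinates are hypothetical.

```python
import numpy as np

def gaussian_kernel(u, h):
    """1-D Gaussian kernel with bandwidth h, evaluated at offset u."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def nn_distances(points, features_by_type):
    """Distance from each point to its nearest feature of each type.

    Returns an array of shape (n_points, n_types).
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))
    columns = []
    for feats in features_by_type:
        feats = np.asarray(feats, dtype=float)
        # Pairwise distances, shape (n_points, n_features_of_this_type).
        d = np.linalg.norm(points[:, None, :] - feats[None, :, :], axis=2)
        columns.append(d.min(axis=1))
    return np.column_stack(columns)

def density(query, incidents, features_by_type, bandwidths):
    """Product-kernel KDE evaluated in NN-distance space."""
    s = nn_distances(incidents, features_by_type)  # samples s_{i,d}
    q = nn_distances(query, features_by_type)      # NN(x, d) per query
    h = np.asarray(bandwidths, dtype=float)
    # Kernel values, shape (n_query, n_incidents, n_types).
    k = gaussian_kernel(q[:, None, :] - s[None, :, :], h)
    # Product over dimensions, sum over incidents, then normalize
    # (the 1/n factor is an assumption of this sketch).
    return k.prod(axis=2).sum(axis=1) / len(s)

# Toy layout (hypothetical coordinates): three banks, two stores.
banks = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]   # Bank A, Bank B, Bank C
stores = [(2.0, 5.0), (8.0, 5.0)]
# Incidents cluster near Bank A and Bank B, plus one near a store.
incidents = [(0.5, 0.0), (0.0, 0.5), (10.0, 0.5), (9.5, 0.0), (2.0, 4.5)]

d_bank_c, d_store = density([banks[2], stores[1]], incidents,
                            [banks, stores], bandwidths=[1.0, 1.0])
print(d_bank_c > d_store)
```

Because the samples are distances to feature types rather than raw coordinates, Bank C receives a high density even though no incident was observed near it, which matches the intuition in the example.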

After applying the KDE, we have the following…

[Figure: Bank A, Bank B, Bank C, and the two stores after applying the KDE.]

Research Issues

• How to select the features (e.g., banks, stores)? Employ notions of density attractors and repellers.

• If the above is solved, how to improve the quality of the density estimates? Currently, an adaptive KDE approach is being tested.

• How to incorporate temporal correlation?
• Producing this model is computationally intensive: feature selection, NN search for every feature, and multiple queries on KDE
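The slides do not say which adaptive KDE variant is being tested. As a hedged illustration only, the sketch below shows one widely used scheme, a sample-point adaptive estimator in the style of Abramson and Silverman: a fixed-bandwidth pilot estimate is computed first, and each sample then gets its own bandwidth, wider where the pilot density is low. The function names, the 1-D setting, and the sensitivity parameter `alpha=0.5` are assumptions for this example.

```python
import numpy as np

def gauss(u):
    """Standard normal density."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def fixed_kde(x, samples, h):
    """Ordinary 1-D KDE with a single global bandwidth h."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    u = (x[:, None] - samples[None, :]) / h
    return gauss(u).sum(axis=1) / (len(samples) * h)

def local_bandwidths(samples, h, alpha=0.5):
    """Per-sample bandwidths: h scaled by (pilot / geometric mean)^(-alpha)."""
    samples = np.asarray(samples, dtype=float)
    pilot = fixed_kde(samples, samples, h)   # pilot density at each sample
    g = np.exp(np.mean(np.log(pilot)))       # geometric mean of the pilot
    return h * (pilot / g) ** (-alpha)

def adaptive_kde(x, samples, h, alpha=0.5):
    """Sample-point adaptive KDE: each kernel uses its own bandwidth."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    h_i = local_bandwidths(samples, h, alpha)
    u = (x[:, None] - samples[None, :]) / h_i[None, :]
    return (gauss(u) / h_i[None, :]).sum(axis=1) / len(samples)

# Toy NN distances (hypothetical): four clustered values and one outlier.
samples = [0.4, 0.5, 0.5, 0.6, 4.6]
print(local_bandwidths(samples, h=1.0))
print(adaptive_kde([0.5, 4.6], samples, h=1.0))
```

The isolated sample receives a wider kernel than the clustered ones, so sparse regions are smoothed more while dense regions keep finer detail. Note that the pilot pass adds another round of kernel evaluations, which compounds the computational cost already listed above.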

Data Set

• Washington DC crime data
• Crime incident reports in parse-able formats:
– XML, Text/CSV, KML or ESRI

• Geographic feature layers are also available for download (could not verify, but was told by a very reliable source)

• Other regional information is available (e.g., census tracts)

• http://data.octo.dc.gov

