Data Mining University Crime
Presenters: Stuart King, Konrad Kuczynski, Ying Liu


Upload: franklin-thornton

Post on 17-Dec-2015



Data Mining University Crime

Presenters: Stuart King

Konrad Kuczynski

Ying Liu


Data Sources

a) FBI Uniform Crime Reporting campus crime statistics for 669 schools from 1995 through 2007 (except 2004):
   1995-2005: http://www.securityoncampus.org/crimestats/index.html
   2006-2007: http://www.fbi.gov/ucr/cius2006/offenses/standard_links/universities_colleges.html
              http://www.fbi.gov/ucr/cius2007/offenses/standard_links/universities_colleges.html

b) City Crime Data: 1999-2007: http://www.fbi.gov/ucr/ucr.htm

c) State Poverty: 1999-2007: http://www.census.gov/hhes/www/saipe/county.html


Data Mining Tasks

• Cluster/Classify
  1: Prediction of missing population values
     Used for improving both charting and anomaly detection of Per Capita crime data
  2: Prediction of crime trends
     Can be used by universities to allocate resources for the coming year

• Outlier Detection
  3: Detect anomalies in crime
     Can be used by universities to target “root cause and prevention” projects


Data Preprocessing

• University crime data cleanup
  669 universities listed under 971 names
  136 universities were removed because they reported fewer than 5 years of data
  If a year of crime data was missing, a fabricated record of averages was added

• Matching City, State, and Zip code information for each University

• Data sets after cleaning and merging:
  5,869 records of crime data for 533 universities with City/State/Zip code added
  1,712 records of crime data for 249 universities with the following added: City/State/Zip, City Crime Data, State Poverty Data

• Additionally, Per Capita and ∆ (difference) values were calculated
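The cleanup rules on this slide can be sketched roughly as follows. This is only an illustration, not the presenters' code: the slide does not say how the "record of averages" was computed, so the sketch assumes it is the mean of the university's own reported years. The names `clean`, `records`, and `MIN_YEARS` and the sample data are all hypothetical.

```python
from statistics import mean

MIN_YEARS = 5  # threshold stated on the slide

def clean(records, years):
    """records maps university -> {year: crime count}.

    Drops universities with fewer than MIN_YEARS reported years and fills
    each missing year with the mean of the reported years (an assumption
    about what "record of averages" means).
    """
    cleaned = {}
    for university, by_year in records.items():
        if len(by_year) < MIN_YEARS:
            continue  # removed: too few reported years
        average = mean(by_year.values())
        cleaned[university] = {y: by_year.get(y, average) for y in years}
    return cleaned

# Made-up data: "A" is missing 2004, "B" reports only two years.
records = {
    "A": {2000: 10, 2001: 20, 2002: 30, 2003: 40, 2005: 50},
    "B": {2000: 1, 2001: 2},
}
cleaned = clean(records, range(2000, 2006))
```

Here "B" is dropped and the missing 2004 value for "A" is filled with the average of its five reported years.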


Data Mining

• Data used for each task

Columns: St., StPov%, Pop., Vio., Murd., Rape, Rob, Aslt, Prpty, Burg, Theft, Car, Arson

Task 1, G: | | in one column
Task 1, P: | | in one column; % in ten columns
Task 2, G: U∆ in nine columns
Task 2, P: S∆ in one column; U∆ and C∆ together in eleven columns
Task 3:    %U∆ in nine columns

• 1: Population cluster/classify

• 2: Crime Trend cluster/classify

• 3: Anomaly Detection

• G: Grouping = Clustering

• P: Prediction = Classifying

• % Per capita values
• ∆ Difference values
• | | Absolute values
• U University
• C City
• S State
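As a rough illustration of the "%" and "∆" markers in the legend, the sketch below computes per capita and difference values for one made-up university. The slides do not define either calculation precisely, so it assumes that per capita means count divided by population and that ∆ means the change from one year to the next. The function names and all numbers are hypothetical.

```python
def per_capita(counts, populations):
    """'%' values: crime count divided by population, year by year."""
    return [c / p for c, p in zip(counts, populations)]

def delta(values):
    """'∆' values: change from each year to the next (assumed definition)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# Made-up burglary counts and population for three consecutive years.
burglaries = [30, 45, 40]
population = [10_000, 10_000, 8_000]

u_delta = delta(burglaries)                               # 'U∆'
pct_u_delta = delta(per_capita(burglaries, population))   # '%U∆'
```

In this made-up example the raw count falls in the last year while the per capita value still rises, because the population fell. That kind of difference is one reason to look at per capita values and not only raw counts.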


Data Mining

• Algorithm used for each task

Task                         Mining              Algorithm
1: Population Prediction     Clustering          EM
                             Classification      Decorate using J48
2: Crime Trend Prediction    Clustering          EM
                             Classification      J48
3: Crime Anomalies           Outlier Detection   DBSCAN with minPoints=1
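A note on the DBSCAN setting: under the usual definition, where a point counts itself among its neighbors, minPoints=1 makes every point a core point. No point is then labeled as noise, and the clusters are simply groups of points chained together within the distance eps. The slides do not say how outliers were read off the result. The sketch below assumes, for illustration only, that points ending up alone in their own cluster are the outliers. It is a plain-Python stand-in, not the tool the presenters used, and `dbscan_min1`, `singleton_outliers`, and the sample points are hypothetical.

```python
from collections import Counter
from math import dist

def dbscan_min1(points, eps):
    """DBSCAN with minPoints=1: clusters are the connected components of
    the graph that links any two points at most eps apart."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already reached from an earlier point
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            for k, q in enumerate(points):
                if labels[k] is None and dist(points[j], q) <= eps:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels

def singleton_outliers(labels):
    """Indices of points that are alone in their cluster (assumed outlier rule)."""
    sizes = Counter(labels)
    return [i for i, label in enumerate(labels) if sizes[label] == 1]

# Made-up 2-D points: three close together and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0)]
labels = dbscan_min1(points, eps=0.5)
outliers = singleton_outliers(labels)
```

With these points the first three share a cluster and the fourth forms a cluster of its own, so it is the only one reported as an outlier.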


Visualizations

• Summary charts and graphs for Per Capita and Clustering data

• Interactive Map showing cluster changes for 533 Universities

• Interactive Map showing predicted clusters for 249 Universities

• Interactive Map showing where 355 outliers occurred

• Interactive Charts showing values for outliers

http://www.cse.msu.edu/~kingstua/Team3