

Case Study 49: Property Crimes

First M Last([email protected])

For

Professor Beintema

Managerial Statistics (GM533)

Keller School of Management

August 2010


I. Executive summary

Our study examined data provided by various U.S. government agencies on property crime rates in the fifty U.S. states and eight possible contributing factors, including per capita income, high school dropout rate, average precipitation, population density, and urbanization. Our analysis revealed that of the eight possible contributing factors, only three (urbanization rate, high school dropout rate, and population density) had a statistically significant relationship with property crime rates. A regression model built on these three variables explains approximately 66% of the variation in property crime rates across the states, which we consider a reasonably strong fit. Accounting for the remaining 34% of the variation would require further data and evaluation of other possible factors.

II. Introduction

According to the U.S. Department of Justice (2006), property crime includes several criminal offenses such as burglary, car and motorcycle theft, larceny theft, and arson. Property crimes involve “taking of money or property, but there is no force or threat of force against the victims.” One exception to the basic rule, however, is arson, which does not involve the taking of property and does involve force against the victims.

The purpose of this case study is to evaluate the available data, determine which variables contribute most to property crime rates, and address several conceptions and misconceptions about the leading causes of property crimes in the U.S. The questions that this study will answer include:

1. Are crime rates higher in urban than rural areas?

2. Does unemployment or education level contribute to property crime rates?

3. Does public assistance contribute to property crime rates?

4. What other factors relate to property crimes?


The study used data that was collected from a “variety of U.S. government sources, including:

the 1988 Uniform Crime Reports, Federal Bureau of Investigation; the Office of Research and Statistics,

Social Security Administration; the Commerce Department, Bureau of Economic Analysis; the National

Center for Education Statistics, U.S. Department of Education; the Bureau of the Census, Department of

Commerce and Geography Division; the Labor Department, Bureau of Labor Statistics; and the National

Climatic Data Center, U.S. Department of Commerce. The data set was originally collected by Louis J.

Moritz, an operations manager” (Bowerman et al., 2010). A copy of the available data set is attached

in Appendix A. The data consists of the following information for each of the fifty states:

1. Property crime rate per hundred thousand inhabitants

2. Per capita income

3. High school dropout rate

4. Average precipitation in the major city

5. Percentage of public aid recipients

6. Population density

7. Public aid for families with children

8. Percentage of unemployed workers

9. Percentage of the residents living in urban areas

III. Analysis and methods

We used MegaStat to analyze the given data and test the various facts and hypotheses about the data. We ran a multiple regression analysis on the data to determine which variables were most strongly related to the crime rate. In this scenario, our dependent variable (Y) was the property crime rate for each state, and the independent variables were the other eight variables (per capita income, dropout rate, etc.) given for each of the states. The MegaStat output is shown in Appendix B and pertinent excerpts are shown below.


Regression Analysis

R²            0.686
Adjusted R²   0.625      n           50
R             0.828      k           8
Std. Error    754.255    Dep. Var.   CRIMES (Y)

Regression output (with 95% confidence interval)

variables       coefficients   std. error   t (df=41)    p-value     95% lower     95% upper
Intercept        -1,008.0855   1,003.2571      -1.005      .3209   -3,034.2043    1,018.0334
PINCOME (X1)          0.0156       0.0731       0.213      .8323       -0.1320        0.1632
DROPOUT (X2)         73.3997      21.5165       3.411      .0015       29.9463      116.8532
PUBAID (X3)         -49.3649      39.8547      -1.239      .2225     -129.8531       31.1233
DENSITY (X4)         -2.2108       0.7018      -3.150      .0030       -3.6281       -0.7934
KIDS (X5)             0.4108       1.3363       0.307      .7601       -2.2878        3.1095
PRECIP (X6)          -0.5357      10.9622      -0.049      .9613      -22.6744       21.6030
UNEMPLOY (X7)       -57.4497      78.7026      -0.730      .4696     -216.3928      101.4933
URBAN (X8)           65.8552      11.0268       5.972   4.74E-07       43.5862       88.1242

The summary of this analysis:

1. R^2 = 68.6%: This is the proportion of variation in the dependent variable Y that is explained by variation in the independent variables Xi. In other words, using this model, almost 69% of the variation in the crime rate can be attributed to the independent variables X1 – X8.

2. To determine how each of the independent variables relates to the dependent variable, we examine the regression coefficient of each independent variable. A positive coefficient indicates that the crime rate rises as the independent variable increases, holding the other variables constant, while a negative coefficient indicates that the crime rate falls. The output shows that X2 (dropout rate) has the largest coefficient in absolute value, followed by X8 (urbanization), X7 (unemployment rate), and finally X3 (public aid). Because the variables are measured in different units, the size of a coefficient alone does not show how important a variable is; that also depends on its standard error and p-value. For example, the X2 coefficient of 73.40 means that every one-percentage-point increase in a state's dropout rate is associated with about 73.40 more property crimes per hundred thousand inhabitants. The X7 coefficient of -57.45 would mean that every one-percentage-point increase in the unemployment rate is associated with about 57.45 fewer property crimes per hundred thousand inhabitants, although this coefficient is not statistically significant (p-value = .4696).

3. Next, we examine the p-value of each of the independent variables to determine which ones are statistically significant, using a strict cutoff of alpha = .005. X2, X4, and X8 are the only independent variables with a p-value less than .005, with X8 (urbanization) having the lowest p-value. This indicates that the evidence of a relationship with crime rate is strongest for urbanization.

4. To refine our model further, we drop the independent variables that are not statistically significant and re-run the regression analysis with only the variables that have a low p-value.

5. The new regression analysis for the data that uses only significant independent variables (X2, X4,

and X8) yields the results below (also presented in Appendix C).

             

Regression Analysis

R²            0.656
Adjusted R²   0.633      n           50
R             0.810      k           3
Std. Error    745.822    Dep. Var.   CRIMES (Y)

ANOVA table

Source                    SS    df                  MS        F    p-value
Regression   48,778,906.6927     3     16,259,635.5642    29.23   9.92E-11
Residual     25,587,492.1905    46        556,249.8302
Total        74,366,398.8832    49

Regression output (with 95% confidence interval)

variables       coefficients   std. error   t (df=46)    p-value     95% lower     95% upper
Intercept        -1,052.5531     613.1049      -1.717      .0928   -2,286.6692      181.5630
DROPOUT (X1)         57.7544      15.3153       3.771      .0005       26.9262       88.5826
DENSITY (X2)         -1.9318       0.5270      -3.666      .0006       -2.9926       -0.8710
URBAN (X3)           67.8889       8.4077       8.075   2.30E-10       50.9650       84.8127


6. Next, we examine R^2 = 65.6% and Adjusted R^2 = 63.3% for this model to determine how much of the variation in Y is accounted for by the three variables retained in the model. We determine that the model is useful to us: it accounts for a high percentage of the variation in crime rate (Y), and its Adjusted R^2 is slightly higher than that of the eight-variable model (62.5%).

7. We further look at the F statistic of 29.23 and its associated p-value of 9.92E-11, which is well below 0.005 (the strictest of the significance levels we consider). This is a positive sign that the model as a whole is useful.

The hypotheses for the overall F-test are: Ho: B1 = B2 = B3 = 0 (i.e., none of the independent variables is significantly related to Y) vs. Ha: at least one Bi <> 0 (i.e., at least one of the independent variables is significantly related to Y). Because of the extremely low p-value, we reject Ho and conclude that there is extremely strong evidence that at least one of the independent variables is significantly related to Y.

8. Next, we examine the significance of each of the proposed independent variables by looking at their individual p-values:

X1 p-value = .0005

X2 p-value = .0006

X3 p-value = 2.30E-10

If we select an alpha of 0.005 as a cutoff, each of the p-values is below it, so we conclude that all 3 independent variables are significantly related to crime rate and should be included in the model.

9. Now that we’ve determined the fitness of our model, we look at the following information:

a. Scatter plot for each of the independent variables (presented in Appendix D).

b. Descriptive statistics and point estimates:

i. Point estimate: the regression equation of the least squares line:

y-hat = b0 + b1x1 + b2x2 + b3x3

y-hat = -1,052.5531 + 57.7544x1 - 1.9318x2 + 67.8889x3

ii. Standard error = 745.82

iii. Multiple coefficient of determination (R^2) = 0.656

iv. Conclusion: 65.6% of the variability in crime rate can be explained by changes in dropout rate, population density, and urbanization

c. Confidence intervals:

i. The 95% confidence interval for the dropout rate coefficient is [26.93, 88.58], which means that for each one-percentage-point increase in dropout rate, the crime rate is expected to increase by between about 27 and 89 crimes per hundred thousand inhabitants, holding the other variables constant.

ii. The 95% confidence interval for the population density coefficient is [-2.99, -0.87], which means that for each additional person per square mile, the crime rate is expected to decrease by between about 0.87 and 2.99 crimes per hundred thousand inhabitants, holding the other variables constant.

iii. The 95% confidence interval for the urbanization coefficient is [50.97, 84.81], which means that for each one-percentage-point increase in urbanization, the crime rate is expected to increase by between about 51 and 85 crimes per hundred thousand inhabitants, holding the other variables constant.
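The point estimate and confidence intervals in step 9 can be checked with a short calculation. The sketch below is illustrative only: the names are hypothetical, the coefficients and standard errors are copied from the refined regression output, the inputs for state 1 (Alabama) come from Appendix A, and the critical value 2.0129 is an assumed two-sided 95% t value for 46 degrees of freedom taken from a t table rather than from the MegaStat output.

```python
# Coefficients and standard errors copied from the refined (three-variable) model.
INTERCEPT = -1052.5531
COEFFICIENTS = {"DROPOUT": 57.7544, "DENSITY": -1.9318, "URBAN": 67.8889}
STD_ERRORS = {"DROPOUT": 15.3153, "DENSITY": 0.5270, "URBAN": 8.4077}

# Assumption: two-sided 95% critical value of t with 46 degrees of freedom.
T_CRITICAL = 2.0129


def predict_crime_rate(values):
    """Evaluate y-hat = b0 + b1*x1 + b2*x2 + b3*x3 for one state."""
    return INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


def confidence_interval(name):
    """Return (lower, upper) = coefficient -/+ t * standard error."""
    margin = T_CRITICAL * STD_ERRORS[name]
    return COEFFICIENTS[name] - margin, COEFFICIENTS[name] + margin


# State 1 (Alabama) values from Appendix A.
state_1 = {"DROPOUT": 30.5, "DENSITY": 80.8, "URBAN": 60.0}
observed = 4003.1

predicted = predict_crime_rate(state_1)
residual = observed - predicted  # negative: the model over-predicts this state
intervals = {name: confidence_interval(name) for name in COEFFICIENTS}

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
for name, (lower, upper) in intervals.items():
    print(f"{name}: [{lower:.2f}, {upper:.2f}] crimes per 100,000 per unit increase")
```

The interval endpoints are in the units of the dependent variable (property crimes per hundred thousand inhabitants) per one-unit change in the independent variable, not in percent.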

IV. Conclusions and summary

We first used MegaStat to run a regression analysis on all eight possible variables that could affect property crimes, so we could determine which of the variables were significantly related to the crime rate. The initial test revealed that only three factors were significantly related to crime rate; in order of significance, they were urbanization, high school dropout rate, and population density. We then used MegaStat again and ran another regression analysis with those three variables. We consider the resulting model a statistically strong one, since it explains 66% of the variation in property crime. Using our model, we provided and explained the descriptive statistics and the confidence intervals for the data. Finally, we were able to answer the questions posed by the assignment:

a. Are crime rates higher in urban than rural areas?

According to 9.c.iii above, we are very confident that states with a higher percentage of urban residents have higher property crime rates.

b. Does unemployment or education level contribute to property crime rates?

According to 3 above, unemployment does not appear to contribute significantly to property crime rates (p-value = .4696). Education level, as measured by the high school dropout rate, is significantly related: states with higher dropout rates tend to have higher property crime rates.

c. Does public assistance contribute to property crime rates?

According to 3 above, public assistance does not appear to contribute significantly to property crime rates (p-value = .2225 for PUBAID and .7601 for KIDS).

d. What other factors relate to property crimes?

According to the available data, the other factors that appear to be related to property crimes are the high school dropout rate and population density. Holding the other variables constant, more densely populated states tend to have lower property crime rates.

References

Bowerman, O’Connell, Orris, & Murphree (2010). Essentials of Business Statistics, Third Edition. New York: The McGraw-Hill Companies.

Department of Justice, Federal Bureau of Investigation (2006). Property Crime. Retrieved from http://www.fbi.gov/ucr/cius_04/offenses_reported/property_crime/index.html


Appendix A

Data Set

STATE   CRIMES  PINCOME  DROPOUT  PUBAID  DENSITY  KIDS  PRECIP  UNEMPLOY  URBAN
    1   4003.1    12604     30.5     6.5     80.8   114    59.4       7.2     60
    2   4398.8    19514     26.4     4.3      0.9   593    53.2       9.3   64.3
    3   6861.2    14887       30     3.5     30.7   268     7.1       6.3   83.8
    4   3796.9    12172     21.4     5.9       46   190    49.2       7.7   51.5
    5   5705.7    18855     31.5     8.8    181.2   581    17.3       5.3   91.3
    6   5705.7    16417       24     3.7     31.9   317    15.3       6.4   80.6
    7   4642.2    22761     21.8     4.3    663.6   486    44.4         3   78.8
    8   4347.4    17699       29     4.3    341.7   267    41.4       3.2   70.6
    9   7819.9    16546     36.5     4.1    227.8   240    55.2         5   84.6
   10   5661.2    14980       35     6.4    109.2   252    48.6       5.8   62.4
   11   5731.9    16898     15.5     4.9    170.9   481    23.5       3.2   86.5
   12   3738.2    12657     20.4     2.7     12.2   250    11.7       5.8     54
   13   4810.4    17611     22.2    22.2    208.7   309    33.3       6.8   83.3
   14     3770    14721     24.1     3.6    154.6   263    39.1       5.3   64.2
   15   3819.8    14764     13.4       5     50.6   348    38.6       4.5   58.6
   16   4514.7    15905     15.9     3.8     30.5   338    28.6       4.8   66.7
   17   2804.7    12795     32.1       7     93.9   204    43.6       7.9   50.9
   18   5043.3    12193     38.4     8.8       99   167    59.7      10.9   68.7
   19   3420.3    14976     21.2     6.5     38.9   370    43.5       3.8   47.5
   20   4897.9    19316     23.5     5.1    469.9   329    41.8       4.5   80.3
   21   4371.3    20701       24     5.9    752.7   536    43.8       3.3   83.8
   22   5342.7    16387     28.6     8.5    162.2   481      31       7.6   70.7
   23   4024.7    16787     11.3     4.6     54.1   515    26.4         4   66.9
   24   3267.6    10992     34.4    11.1     55.5   119    52.8       8.4   47.3
   25   4292.2    15492     23.9     5.5     74.6   264    33.9       5.7   68.1
   26     4144    12670     15.5     4.5      5.5   362    11.4       6.8   52.9
   27   3866.8    15184     13.3     3.8     20.9   320    30.3       3.6   62.9
   28   5672.5    17440     18.7     2.6      9.6   273     4.2       5.2   85.3
   29   3186.1    19016     25.4     1.7    120.7   407    36.5       2.4   52.2
   30   4712.5    21882     20.3     5.6   1033.9   357    41.9       3.8     89
   31   5948.1    12481     26.8     5.7     12.4   225     8.9       7.8   72.1
   32     5212    19299     33.3       8      378   523    39.3       4.2   84.6
   33   4360.4    14128     30.9       5    132.9   243    42.5       3.6   42.9
   34   2668.9    12720     11.6     3.1      9.6   350    15.4       4.8   48.8
   35   4193.2    15485     20.8     7.5    264.7   298    37.8         6   73.3
   36   5154.6    13269     24.2     4.8     47.2   278    30.9       6.7   67.3
   37     6513    14982     29.2       4     28.8   348    37.4       5.8   67.9
   38   2814.4    12168     18.9     6.1    267.4   347      40       5.1   69.3
   39   4807.7    16793       28       6    940.9   450    41.9       3.1     87
   40   4671.2    12764     32.2     6.3    114.9   186    51.6       4.5   54.1
   41   2467.3    12475     13.1     3.9      9.4   270    17.5       3.9   46.4
   42   3936.7    13659     32.8     6.4    118.9   155    48.5       5.8   60.4
   43   7365.1    14640     34.1     4.5     64.3   169      42       7.3   79.6
   44   5335.5    12013     17.5     3.2     20.6   343    15.3       4.9   84.4
   45   4098.2    15382     17.3     5.6     60.1   469    33.7       2.8   33.8
   46   3877.5    17640     23.4       4    151.5   257    45.2       3.9     66
   47   6646.6    16569     21.9     5.8     69.9   443    38.6       6.2   73.5
   48   2107.4    11658     22.9     8.2     77.8   238    40.7       9.9   36.2
   49   3757.6    15444     16.3     7.7     89.2   473    30.9       4.3   64.2
   50   3653.1    13718     20.4     3.2      4.9   303    13.3       6.3   62.7


Variables

CRIMES     Property crime rate per hundred thousand inhabitants (property crimes include burglary, larceny, theft, and motor vehicle theft); calculated as # of property crimes committed divided by total population/100,000
PINCOME    Per capita income for each state
DROPOUT    High school dropout rate (%, 1987)
PRECIP     Average precipitation in inches in the major city in each state over 1951 - 80
PUBAID     Percentage of public aid recipients (1987)
DENSITY    Population/total square miles
KIDS       Public aid for families with children, dollars per family
UNEMPLOY   Percentage of unemployed workers
URBAN      Percentage of the residents living in urban areas
STATE      Number (1-50) representing the state

 1 = Alabama          26 = Montana
 2 = Alaska           27 = Nebraska
 3 = Arizona          28 = Nevada
 4 = Arkansas         29 = New Hampshire
 5 = California       30 = New Jersey
 6 = Colorado         31 = New Mexico
 7 = Connecticut      32 = New York
 8 = Delaware         33 = North Carolina
 9 = Florida          34 = North Dakota
10 = Georgia          35 = Ohio
11 = Hawaii           36 = Oklahoma
12 = Idaho            37 = Oregon
13 = Illinois         38 = Pennsylvania
14 = Indiana          39 = Rhode Island
15 = Iowa             40 = South Carolina
16 = Kansas           41 = South Dakota
17 = Kentucky         42 = Tennessee
18 = Louisiana        43 = Texas
19 = Maine            44 = Utah
20 = Maryland         45 = Vermont
21 = Massachusetts    46 = Virginia
22 = Michigan         47 = Washington
23 = Minnesota        48 = West Virginia
24 = Mississippi      49 = Wisconsin
25 = Missouri         50 = Wyoming


Appendix B

Initial MegaStat output for the multiple regression analysis that takes into consideration all given data variables, with crime rate as the dependent variable and the other data as independent variables.

             

Regression Analysis

R²            0.686
Adjusted R²   0.625      n           50
R             0.828      k           8
Std. Error    754.255    Dep. Var.   CRIMES (Y)

ANOVA table

Source                    SS    df                  MS        F    p-value
Regression   51,041,456.2559     8      6,380,182.0320    11.21   3.11E-08
Residual     23,324,942.6273    41        568,901.0397
Total        74,366,398.8832    49

Regression output (with 95% confidence interval)

variables       coefficients   std. error   t (df=41)    p-value     95% lower     95% upper
Intercept        -1,008.0855   1,003.2571      -1.005      .3209   -3,034.2043    1,018.0334
PINCOME (X1)          0.0156       0.0731       0.213      .8323       -0.1320        0.1632
DROPOUT (X2)         73.3997      21.5165       3.411      .0015       29.9463      116.8532
PUBAID (X3)         -49.3649      39.8547      -1.239      .2225     -129.8531       31.1233
DENSITY (X4)         -2.2108       0.7018      -3.150      .0030       -3.6281       -0.7934
KIDS (X5)             0.4108       1.3363       0.307      .7601       -2.2878        3.1095
PRECIP (X6)          -0.5357      10.9622      -0.049      .9613      -22.6744       21.6030
UNEMPLOY (X7)       -57.4497      78.7026      -0.730      .4696     -216.3928      101.4933
URBAN (X8)           65.8552      11.0268       5.972   4.74E-07       43.5862       88.1242
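The summary statistics at the top of this output follow directly from the ANOVA sums of squares. The sketch below recomputes them as a consistency check; the variable names are illustrative, and the only inputs are the regression and residual sums of squares, n, and k as reported by MegaStat.

```python
# Sums of squares copied from the ANOVA table of the full (eight-variable) model.
ss_regression = 51_041_456.2559
ss_residual = 23_324_942.6273
n = 50  # number of states
k = 8   # number of independent variables

ss_total = ss_regression + ss_residual
df_residual = n - k - 1  # 41

# R^2: share of the total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R^2 penalizes the model for each additional independent variable.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# Mean squares and the overall F statistic (MS regression / MS residual).
ms_regression = ss_regression / k
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# The standard error of the regression is the square root of MS residual.
std_error = ms_residual ** 0.5

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
print(f"F = {f_stat:.2f}, standard error = {std_error:.3f}")
```

Run as written, the printed values should agree with the reported R², Adjusted R², F, and Std. Error up to rounding.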


Appendix C

Refined MegaStat output for multiple regression analysis that takes into consideration independent variables that strongly affect the dependent variable (crime rate).

             

Regression Analysis

R²            0.656
Adjusted R²   0.633      n           50
R             0.810      k           3
Std. Error    745.822    Dep. Var.   CRIMES (Y)

ANOVA table

Source                    SS    df                  MS        F    p-value
Regression   48,778,906.6927     3     16,259,635.5642    29.23   9.92E-11
Residual     25,587,492.1905    46        556,249.8302
Total        74,366,398.8832    49

Regression output (with 95% confidence interval)

variables       coefficients   std. error   t (df=46)    p-value     95% lower     95% upper
Intercept        -1,052.5531     613.1049      -1.717      .0928   -2,286.6692      181.5630
DROPOUT (X2)         57.7544      15.3153       3.771      .0005       26.9262       88.5826
DENSITY (X4)         -1.9318       0.5270      -3.666      .0006       -2.9926       -0.8710
URBAN (X8)           67.8889       8.4077       8.075   2.30E-10       50.9650       84.8127
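One way to check that little was lost by dropping the five non-significant variables is a partial F-test that compares the residual sums of squares of the full and refined models. This test is not part of the MegaStat output or of the analysis above; the sketch below is a supplementary check under that assumption, with illustrative variable names and inputs copied from the two ANOVA tables.

```python
# Residual sums of squares and degrees of freedom from the two ANOVA tables.
sse_full, df_full = 23_324_942.6273, 41        # eight-variable model
sse_reduced, df_reduced = 25_587_492.1905, 46  # three-variable model

# Number of independent variables removed from the full model.
dropped = df_reduced - df_full  # 5

# Partial F: extra unexplained variation per dropped variable,
# relative to the residual mean square of the full model.
extra_ss = sse_reduced - sse_full
partial_f = (extra_ss / dropped) / (sse_full / df_full)

print(f"variables dropped = {dropped}")
print(f"partial F = {partial_f:.3f}")
```

A partial F below 1 is far under the 5% critical value for 5 and 41 degrees of freedom (roughly 2.4), so the five dropped variables together add no significant explanatory power. This is consistent with the Adjusted R² rising from 0.625 to 0.633 in the refined model.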


Appendix D

Scatter plots for the three independent variables, high school dropout rate (X1), population density (X2), and urbanization (X3), that significantly affect the dependent variable, property crime rate (Y).

[Scatter plot: Crime and Dropout Rate. Horizontal axis: DROPOUT (X1); vertical axis: CRIMES (Y). Trend line: f(x) = 71.7172258646256 x + 2832.58007008327, R² = 0.167947891260972]

[Scatter plot: Crime and Population Density. Horizontal axis: DENSITY (X2); vertical axis: CRIMES (Y). Trend line: f(x) = 0.324658737786672 x + 4506.02529038453, R² = 0.00371175403619672]

[Scatter plot: Crime and Urbanization. Horizontal axis: URBAN (X3); vertical axis: CRIMES (Y). Trend line: f(x) = 57.1814101542171 x + 737.009819651514, R² = 0.457157880440283]