
ANALYSIS OF DISCRIMINATION IN PRIME AND SUBPRIME

MORTGAGE MARKETS

by

R. Glenn Hubbard*

Darius Palia

Wei Yu

This Draft: September 2012

Abstract

This paper examines evidence of lending discrimination in prime and subprime mortgage

markets in New Jersey. Existing single-equation studies of race-based discrimination in

mortgage lending assume race is uncorrelated with the disturbance term in the loan denial

regression. We show that race is correlated with both observable and unobservable risk variables,

leading to biased coefficient estimates. To mitigate this problem, we specify a system of

equations and use a full information maximum likelihood (FIML) method and two-stage least

squares (2SLS). As an instrumental variable for race, we use the number of African-American

church members at the county level. Both FIML and 2SLS show that minorities are more likely

to be rejected than whites in the prime market, but less likely to be rejected than whites in the

subprime market, results supportive of the information-based theory of discrimination. We also

find that the reduction in rejection rates to minority neighborhoods from 1996 to 2008 cannot be

fully justified by risk, suggesting a relaxation of lending standards to minority neighborhoods.

Using the methodology of Mian and Sufi [2009], we also find evidence for strong credit supply

effects.

JEL codes: J15, G21

* Corresponding author. Address: Dean and Russell L. Carson Professor of Finance & Economics, Columbia

Business School, 3022 Broadway, Uris Hall 101, New York, NY 10027. Phone: (212)-854-2888. Email address:

[email protected]. We thank Orley Ashenfelter, Ivan Brick, Paul Calem, Markus Brunnermeier, Serdar Dinc,

Alan Krueger, Henry Farber, Alexandre Mas, Atif Mian, and Cecilia Rouse for helpful comments. Part of this

research was conducted when the second author was a visiting professor at Princeton University. All errors remain

our responsibility.


1. Introduction:

It is now widely accepted that the recent credit crisis and Great Recession have as one of

its primary causes the excesses of the U.S. mortgage market (for example, see Blinder [2007,

2009], Stiglitz [2007], Calomiris [2008], and Brunnermeier [2009]). The adverse impact of

declines in the value of residential real estate has been especially acute in the subprime

mortgage markets, in which loans were made to borrowers with lower credit quality and/or short

credit histories. According to Inside Mortgage Finance,1 the volume of subprime mortgage

originations grew from $75 billion in 1994 to a peak of $625 billion in 2006, and then fell to

$93 billion in 2009.

The rise of the subprime mortgage market allows us to test for race-based discrimination

in the residential real estate prime and subprime markets. With home ownership rates at 67.4

percent in 2009,2 housing is an important asset in households’ portfolio holdings, with

consequences for other portfolio assets and asset returns.3 However, there are still significant

differences in homeownership rates among African-Americans (46.2 percent), Hispanics (48.4

percent), and whites (71.4 percent).4 The Fair Housing Act of 1968 prohibits discrimination

based on race, and is actively enforced by the Office of Fair Housing and Equal Opportunity in

the U.S. Department of Housing and Urban Development (HUD). Moreover, the Home

Mortgage Disclosure Act (HMDA) of 1975 and the Community Reinvestment Act (CRA) of

1977 were enacted by Congress5 to monitor lending institutions’ fair lending practices to

minority and low-income borrowers and neighborhoods.

The subprime market has both a beneficial and destructive impact on mortgage borrowers

(see Gramlich [2007] for a good discussion). On the one hand, subprime lending makes credit

accessible to borrowers with blemished credit histories and/or volatile income who do not

qualify for mortgages in the prime lending market. On the other hand, a proportion of subprime

borrowers were very vulnerable to any adverse economic shock due to low verifiable income

stability and savings, and were forced to sell their houses early, often ending in foreclosures.

1 Mortgage Market Statistical Annual [2010, volume 1].

2 U.S. Census Bureau 2009 Housing Vacancies and Homeownership Survey.

3 Hubbard [1985], Campbell [2006], Cochrane [2007], among others, propose that real estate is an illiquid/nontraded

asset that has a significant effect on a household’s portfolio choice and asset returns. Empirical evidence of such

effects has been found in Flavin and Yamashita [2002], and Piazzesi, et al. [2007], among others.

4 U.S. Census Bureau 2009 Housing Vacancies and Homeownership Survey.

5 The HMDA was enacted by Congress and is implemented by the Federal Reserve Board's Regulation C. The CRA was

enacted by Congress and is implemented by Regulations 12 CFR parts 25, 228, 345, and 563e.


Some lenders have also been sued for allegedly targeting minority borrowers and minority

neighborhoods for high-cost subprime loans, a practice referred to as “reverse redlining.” For

example, the NAACP has filed a lawsuit in federal court in Los Angeles against 12 mortgage

lenders for steering African-American borrowers into high-cost subprime loans (New York

Times, October 15, 2007). Similarly, the City of Baltimore (National Public Radio, January 11,

2008) and the State of Illinois (Wall Street Journal, July 31, 2009) both sued Wells Fargo for

high-cost subprime mortgage lending to minority borrowers.

Given the importance of real estate and the stated objective of regulators to ensure fair and

equal access to housing, researchers have examined whether African-Americans and Hispanics

are discriminated against by lenders (see, for example, Black, Schweitzer, and Mandell [1978];

King [1980]; and Munnell, et al. [1996], among others). These studies have examined whether

minorities are rejected more often in prime mortgage loans than white applicants (referred to as

the accept/reject decision). We add to this research in four ways.

First, we examine discrimination in both prime and subprime markets. In doing so, we

analyze any differences in discrimination between prime and subprime markets, the latter

being where a large proportion of minority borrowers tend to obtain their mortgages (Scheessele

[2002]; Calem, et al. [2004]; and Mayer and Pence [2008]).

Second, we use two economic theories of discrimination (taste-based, Becker, 1957;

information-based statistical theory, Phelps, 1972, Arrow, 1973) to derive possible hypotheses

for testing (see Section 2.1 for further details). Analyzing the subprime market along with the

prime market allows us to test the differing implications of these theories using actual data. In

order to test these theories, recent studies have used small samples of self-reported data, or

studied a game, or conducted an experiment.6 For example, Levitt (2004) describes two

important caveats in his study of the voting behavior on the television show titled the “Weakest

Link,” namely, that the environment is not a market setting, and individuals who both apply and

are then selected for the game show are not representative of the general population. Individuals

6 Some studies have tested for discrimination in a game setting (for example, Fershtman and Gneezy [2001]; Levitt

[2004]) or in a paired-audit experimental setting (for example, Turner, et al. [2002]; Bertrand and Mullainathan

[2004]). Yinger (1998), Ross and Yinger (2003), and Anderson, Fryer and Holt (2005) provide detailed surveys on

the discrimination literature and Levitt and List (2009) provide an excellent overview of field experiments in

economics. Hausman and Wise (1979), Heckman (1992), Heckman and Smith (1995), and Manski (1996), among

others, have criticized these studies as not being generalizable to large-scale markets, not having true randomness,

having attrition bias, and participants changing their behavior due to their awareness of being measured.


who own real estate are representative of the general population, and lenders are required to

disclose details of every loan application to the Federal Financial Institutions Examination

Council (FFIEC) under HMDA.

Third, previous research on race-based lending discrimination uses a single-equation

model of mortgage acceptance and assumes the race variable to be uncorrelated with the error

term in the accept/reject model. Though problems associated with single-equation tests for

discrimination have been observed in studies by Rachlis and Yezer [1993], Yezer, Phillips and

Trost [1994], among others, these papers have focused on the endogeneity of loan terms,

especially the loan-to-value ratio.7 None of these papers has tested the hypothesis that race is

correlated with observable and unobservable risks.8 If such a hypothesis is not statistically

rejected, the average disturbance term in these single-equation probit models is not conditionally

zero, resulting in inconsistent regression parameter estimates.

Fourth, we ameliorate the bias in the single-equation probit model using two methods.

The first method uses the full information maximum likelihood method (FIML) or bivariate

probit. The advantage of this method is that it is at the individual loan level, the unit at which the

lending officer makes the decision. The disadvantage is that the regression estimates depend

crucially on the normality assumption of the error terms in the two equations. The second

method is two-stage least squares (2SLS) at the neighborhood census tract level. The advantage

of 2SLS is that the regression estimates do not depend crucially on the statistical distribution of

the error terms, but its disadvantage is that it is at the census tract level.

We define an instrumental variable for race, namely the number of people in that county

who attend a predominantly African-American church.9 We conjecture that this variable is likely

to be strongly related to race (making it a strong instrumental variable), and that religious status

should not be related to risk variables and is therefore exogenous to the lender’s accept/reject

loan decision (making it a valid instrumental variable). The latter assumption is also made

stronger given that we have controlled for neighborhood characteristics such as unemployment

7 We also test the endogeneity of the loan-to-value ratio, LTV, but find it to be exogenous in the loan denial

regression using the Durbin-Wu-Hausman test.

8 Many authors have criticized these single-equation studies (Zandi [1993]; Lebowitz [1993]; Horn [1994, 1997];

Yezer, Phillips, and Trost [1994]; Ross and Yinger [1999]; Stengel and Glennon [1999]; and LaCour-Little [1999])

for what is essentially an omitted variable problem, but no study has corrected for the correlation between the race

variable and the error term.

9 See section 3.1 for more details on this variable. Note that this variable is not whether the borrower attends church,

which might reflect some risk-taking characteristics.


rates, percentage who attended college, age of property, percentage of renter occupied housing,

house price appreciation rates, and owner occupied traits.
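
To illustrate why an instrument can help, the following is a minimal simulation sketch, not the estimation used in this paper. The data-generating process, all coefficients, and all names are hypothetical assumptions: an unobserved risk factor raises both the minority share and the rejection outcome, while the instrument shifts the minority share only. With one instrument, one endogenous regressor, and no controls, the 2SLS estimate reduces to the ratio cov(z, y) / cov(z, x).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def simulate_ols_vs_iv(n=20000, true_effect=0.5, seed=1):
    """Return (OLS slope, IV slope) on simulated tract-level data.

    Hypothetical setup: z is the instrument, u is unobserved risk,
    x is a standardized minority share, y is the rejection outcome.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)  # instrument, generated independently of u
        ui = rng.gauss(0.0, 1.0)  # unobserved risk, omitted from the regression
        xi = 0.8 * zi + 0.6 * ui + rng.gauss(0.0, 1.0)
        yi = true_effect * xi + 1.0 * ui + rng.gauss(0.0, 1.0)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    ols = cov(x, y) / cov(x, x)  # inconsistent: x is correlated with u
    iv = cov(z, y) / cov(z, x)   # just-identified IV, equal to 2SLS in this case
    return ols, iv


ols_slope, iv_slope = simulate_ols_vs_iv()
```

In this setup the OLS slope moves away from the assumed true effect of 0.5 because x is correlated with u, while the IV slope stays close to it in large samples. The 2SLS regressions in the paper also include control variables, which this sketch omits.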

It is still possible that we have not controlled for some unobservable characteristic. We

therefore use the insightful approach of Altonji, Elder, and Taber (2005a, 2005b) who examine

the impact of Catholic schools on education outcomes using an instrumental variable approach.

As in these papers, we conduct three checks. a) We examine whether observable loan, borrower, and property

characteristics differ significantly between neighborhoods with a large number of people who attend a

predominantly African-American church and neighborhoods with few people attending a predominantly

African-American church. If these differences are economically large, then our instrumental

variable is capturing some observable risk characteristic. b) We examine whether our instrumental

variable is significantly related to the accept/reject decision in neighborhoods with a large

proportion of White borrowers. If it is, then our instrumental variable is capturing

some unobservable risk characteristic that is not exclusively correlated with race. c) We examine

the lower bound (upper bound) bias of our estimates, defined as equal selection on unobserved

variables when compared to observed variables (defined as zero correlation between race and

unobserved variables). Making observations about the differential impact of unobservable and

observable risks in subprime and prime markets, we examine how much the bias affects our

results.

Finally, the recent literature (for example, Mian and Sufi [2009]; Keys, et al. [2010];

among others) has suggested that much of the current credit crisis in the subprime market has

arisen from the large increase in credit supply provided by lax screening incentives inherent in

the lender’s originate-to-distribute model. Mian and Sufi [2009] find that zip codes that in 1996 (a

period before the credit expansion) had either high denial rates or a high share of subprime borrowers received

much greater credit in 2002-05 than other zip codes. Such borrowers obtained increased credit

and had higher default rates despite having lower income and employment growth rates (proxies

for increased mortgage demand) during that period. We use their methodology to isolate the

supply effect by using the proportion of applications in each census tract that was rejected in 1996. In doing so,

we attempt to disentangle whether our results are driven by changes in mortgage demand or

mortgage supply.
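
The tract-level rejection proportion mentioned above is a simple aggregation of application records. The sketch below shows one way to compute it; the function name, the record layout, and the sample data are hypothetical and are not taken from the HMDA files used in the paper.

```python
from collections import defaultdict


def tract_rejection_rates(applications):
    """Map each census tract to its share of rejected applications.

    `applications` is an iterable of (tract_id, rejected) pairs, where
    `rejected` is 1 for a denied application and 0 otherwise.
    """
    totals = defaultdict(int)
    rejections = defaultdict(int)
    for tract_id, rejected in applications:
        totals[tract_id] += 1
        rejections[tract_id] += rejected
    return {tract: rejections[tract] / totals[tract] for tract in totals}


# Hypothetical 1996 records: (tract_id, rejected)
sample_1996 = [
    ("A", 1), ("A", 0), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0), ("B", 1), ("B", 0),
]
rates_1996 = tract_rejection_rates(sample_1996)
```

Here tract "A" has two rejections out of four applications and tract "B" has one out of four, so the resulting shares are 0.5 and 0.25.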

We construct a unique data set from seven sources to capture a broad set of borrower risk

characteristics, loan risk characteristics, property risk characteristics, lender characteristics,


religious affiliations, and macroeconomic variables. Our large sample consists of 2,026,556

mortgage applications made by borrowers in New Jersey from 2000 to 2008.

Our principal results are as follows.

First, we find that African-American or minority borrowers are more likely to be rejected

than white borrowers in the prime market. These single-equation results are consistent with those

found by many studies such as Black, Schweitzer, and Mandell [1978]; King [1980]; Schafer and

Ladd [1981]; and Munnell, et al. [1996]. When we examine the subprime market, we find that

African-American or minority borrowers are less likely to be rejected than white borrowers.

Second, we find at the individual-loan level that race is correlated with observable risks

using 14 risk measures. Using Rivers-Vuong and likelihood ratio tests, we also show that race is

positively correlated with unobservable risks (such as wealth, gifts and bequests, and loan

documentation) in the accept/reject regression. The fact that race is correlated with observable

and unobservable risks in the single-equation model violates the assumption that race is

uncorrelated with the error term, resulting in biased parameter estimates in the single-equation probit model.

Using a FIML system of equations, we find that African-American or minority borrowers are more

(less) likely to be rejected than white borrowers in the prime (subprime) market, after controlling

for risks. This result is supportive of the information-based theory of discrimination.

Third, we generally find similar results at the neighborhood census tract level. The

Durbin-Wu-Hausman test also shows that race is correlated with the disturbance term in the

neighborhood single-equation OLS regressions, suggesting that the regression parameter

estimates are inconsistent. In our two-stage least squares estimates (2SLS) with a valid

instrumental variable we find that neighborhoods with a higher proportion of minority borrowers

are associated with a significantly higher (lower) percentage of loans rejected in the prime

(subprime) market.

Fourth, using the methodology of Altonji, Elder, and Taber (2005a, 2005b) we find that

our instrumental variable, namely, the number of people in the neighborhood who attend a

predominantly African-American church, is not consistently correlated with observable risk

variables. We also find that it does not correlate with the accept/reject decision in predominantly

white neighborhoods. Finding a higher ratio of unobservables to observable risk variables in

subprime markets when compared to prime markets, we find that our main results in support of

information-based discrimination are unaffected by any possible bias. Finally, we also find that


this instrumental variable is strongly correlated with the race variable, helping us avoid the well-

known weak instrument problem.

Fifth, to disentangle demand and supply effects further, we examine the change in reject

rates, neighborhood risk characteristics and racial composition between 1996 and 2008. We find

that the reduction in rejection rates to minority neighborhoods from 1996 to 2008 cannot be fully

justified by risk, suggesting a relaxation of lending standards to minority neighborhoods. These

results for strong credit supply effects are consistent with those in Mian and Sufi [2009], who

also find reductions in loan denial rates to neighborhoods despite significant deterioration in

credit quality.

A number of papers have suggested that home price depreciation has been a significant

factor for rising mortgage defaults in the subprime industry (see Demyanyk and Van Hemert [2011];

Gerardi, Shapiro and Willen [2008]; Mayer and Pence [2008], Hubbard and Mayer [2008];

Mayer, Pence, and Sherlund [2009]). Accordingly, we control for home price levels at the MSA-

level so as to ensure that none of our above results are affected due to home price depreciation.

Our finding that increased credit supply affected minorities in the subprime market

should not be confused with “reverse redlining” by lenders. The former argument is a general

lessening of standards by lenders resulting in minorities in subprime neighborhoods being more

severely affected. The latter argument rests on the necessary assumption that minorities in

subprime neighborhoods were steered and/or targeted for unnecessary loans. We do not test for

this latter argument which is beyond the scope of this paper.

The paper is organized as follows. Section 2 describes the testable hypotheses and the

comparison to Levitt’s [2004] analysis of the game show “Weakest Link.” In Section 3, we

present our empirical methodology and variables. Section 4 describes the data’s sources and

basic characteristics, and Section 5 presents our empirical results. Section 6 concludes.

2. Testable Hypotheses, Comparison to Levitt’s (2004) Analysis of the Game Show

“Weakest Link,” Unobservable Risks

2.1 Testable Hypotheses

There are two leading theories of discrimination. Under the taste-based theory of

discrimination (Becker [1957]), the motive that drives the behavior is animus or prejudice

towards a particular group, that is, the economic actors simply do not like that group and do not


want to interact with members in that group. Under the information-based theory of

discrimination (Phelps [1972] and Arrow [1973]), the motive that drives the agent’s behavior is

expected profit maximization. In an imperfect information world, economic agents discriminate

against certain groups because they believe these groups have lower productivity (or in this

context credit quality), which will reduce their profit.

Using the implications of these theories in the prime and subprime markets, we develop

testable hypotheses of regressions of the accept/reject decision on race. For ease of exposition,

let

$$Reject_{it} = \beta_0 + \beta X_{it} + \gamma Race_{it} + \varepsilon_{it} \qquad (1)$$

where $Reject_{it}$ is a dummy variable that equals one if a loan application is rejected, and zero

otherwise; $Race_{it}$ is a dummy variable set to unity if the borrower is African American and/or

Hispanic, and zero if White; $X_{it}$ is a vector of observable risk variables categorized into

borrower risks, loan risks, property risks, lender risks, and macroeconomic risks; and $\varepsilon_{it}$ is the

disturbance term.

Ex-ante, we don’t know whether discrimination exists in mortgage lending and which

theory of discrimination explains the data. In the prime market, the two theories have similar

predictions. Under the taste-based theory, lenders tend to reject minority borrowers simply

because they don’t like them. Under the information-based theory, lenders tend to reject minority

borrowers because they believe minorities are associated with lower credit quality and doing so

will minimize default and maximize expected profit. In the subprime market, the two theories

have different predictions. Under the taste-based theory, lenders continue to disproportionately

reject minority borrowers because they dislike them. But the strategic incentive switches under

the information-based theory. In the subprime market, lenders can make higher expected profits

by charging higher loan prices if they believe minorities are associated with lower credit quality,

resulting in a much higher cost of discrimination in the subprime market. Therefore, there is a

flipping of behavior between the prime and subprime markets, which distinguishes the taste-based

theory from the information-based theory.

Before the subprime mortgage crisis, the subprime mortgage market was expected to have

higher profits than the prime market. For example, an Office of Thrift Supervision study (2000;

page 11) points out that “All in all, subprime lending should have higher expected profits than

prime lending because of its higher risk.” Comparing subprime loans with prime loans, Lax, et al.

(2000) find that 50% of the interest rate premium charged to subprime borrowers is not

related to their higher levels of risk. Azmy (2005) also shows that profits in the subprime lending

industry are high enough that, for some lenders, the excess profit is not justified by risk.

Accordingly, we have the following three testable hypotheses with respect to the sign and

magnitude of the regression coefficient on Race ($\gamma$).

Hypothesis 1: If $\gamma$ is positive and of equal magnitude in the prime and subprime markets,

then we have evidence supportive of the taste-based theory.

Hypothesis 2: If $\gamma$ is positive in the prime market, and zero or negative in the subprime

market, then we have evidence supportive of the information-based theory.

Hypothesis 3: If $\gamma$ is not positive in either the prime or subprime markets, then there is no

evidence supportive of either the taste-based or information-based theories of discrimination.
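
The decision rule implied by the three hypotheses can be sketched as a small function. This is only an illustration: the function name and the tolerance `tol` used for "zero" and "equal magnitude" are hypothetical, and an actual test would rely on standard errors rather than point estimates. Sign patterns not covered by the three hypotheses are returned as unclassified.

```python
def classify_evidence(coef_prime, coef_subprime, tol=0.01):
    """Map point estimates of the race coefficient to Hypotheses 1-3.

    `tol` is a hypothetical tolerance, not a statistical test.
    """
    prime_positive = coef_prime > tol
    subprime_positive = coef_subprime > tol
    if prime_positive and subprime_positive:
        # Hypothesis 1 also requires equal magnitude in both markets.
        if abs(coef_prime - coef_subprime) <= tol:
            return "Hypothesis 1: taste-based"
        return "unclassified"
    if prime_positive and not subprime_positive:
        return "Hypothesis 2: information-based"
    if not prime_positive and not subprime_positive:
        return "Hypothesis 3: neither theory"
    # Positive only in the subprime market is not covered by the hypotheses.
    return "unclassified"


example = classify_evidence(0.05, -0.03)
```

With a positive prime-market coefficient and a negative subprime-market coefficient, as in the example call, the rule returns the information-based case.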

2.2 Comparison to Levitt’s (2004) Analysis of the Game Show “Weakest Link”

In this sub-section, we compare the implications of the theories of discrimination in the

prime and subprime mortgage market to the game show “Weakest Link” that was analyzed by

Levitt (2004). The “Weakest Link” is a television game show. In the game, players compete by

answering questions to get the final winner-take-all prize. In early rounds of the game, players

answer a series of questions to build a pot of money that can be added to the final winner-take-all

prize. At the end of each round, one player who received the most votes from other players is

removed from the game. The process continues until only two players are left. In the final round,

the player who answers more questions correctly will get the winner-take-all prize.

The key point is that it is the same difference in agents’ motives that drives the differing

behavior in early and later rounds in the Weakest Link and the differing behavior in the prime

and subprime mortgage markets. In the early rounds of the Weakest Link, the two theories have

similar predictions but for different reasons. Under the taste-based theory, participants in the

game tend to vote off the group that is discriminated against because they don’t like the members in that

group. Under the information-based theory, participants in the game also tend to vote off the

group that is discriminated against because they believe that group is associated with lower skill and

doing so will maximize the number of questions answered and therefore maximize their final

profit. In the later rounds, the two theories have different predictions. Under the taste-based

theory, the players in the targeted group will continue to receive more votes because other

participants don’t like them. But, the strategic incentive switches under the information-based


theory. If the participants in the game believe that a particular group is associated with lower

skill, they will avoid voting off members in that group in order to maximize their chance of

winning and maximize their final profit. It is this flipping of behavior that distinguishes the taste-

based discrimination from the information-based discrimination.

In the prime market, the two theories have similar predictions. Under the taste-based

theory, lenders tend to reject minority borrowers simply because they don’t like them. Under the

information-based theory, lenders tend to reject minority borrowers because they believe

minorities are associated with lower credit quality and doing so will minimize default and

maximize expected profit. In the subprime market, the two theories have different predictions.

Under the taste-based theory, lenders continue to disproportionately reject minority borrowers

because they dislike them. But the strategic incentive switches under the information-based

theory. In the subprime market, lenders can make higher expected profits by charging higher

loan prices if they believe minorities are associated with lower credit quality, resulting in a much

higher cost of discrimination in the subprime market. Therefore, similar to the flipping of

behavior in the Weakest Link, there is a flipping of behavior between the prime and subprime markets,

which distinguishes the taste-based theory from the information-based theory.

3. Methodology and Variables

3.1 The potential importance of unobservable risks

The standard econometric model used to examine racial discrimination is specified as

regression model (1). One difficulty in estimating this model is that RACE may be correlated with

the disturbance term $\varepsilon$. If RACE is correlated with observable and unobservable risk variables

that are not included in $X$, then RACE will be correlated with the disturbance term. This

correlation would lead to inconsistent estimates of the coefficient on RACE, which could result in incorrect

support for discrimination when in fact there is none, or vice versa.
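
The source of the inconsistency can be written out in a stylized linear version of model (1). The following is a sketch under assumptions that are not in the original text: the disturbance contains a single omitted risk variable $W_{it}$ (for example, wealth) with coefficient $\delta$, the remaining error $v_{it}$ is uncorrelated with the regressors, $\gamma$ denotes the coefficient on RACE, and the second line is the linear projection of $W_{it}$ on the included regressors.

```latex
\begin{align*}
\varepsilon_{it} &= \delta W_{it} + v_{it}, \\
W_{it} &= \pi_0 + \pi_X X_{it} + \pi_R \, RACE_{it} + \eta_{it}, \\
\operatorname{plim} \hat{\gamma}_{OLS} &= \gamma + \delta \, \pi_R .
\end{align*}
```

Under these assumptions the estimate is consistent only if $\delta = 0$ or $\pi_R = 0$. If the omitted variable lowers the probability of rejection ($\delta < 0$) and is lower on average for minority borrowers conditional on $X$ ($\pi_R < 0$), the bias term is positive and the estimated coefficient overstates $\gamma$. The probit case does not follow this exact formula, but the intuition is similar.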

Many risks are unobservable to the researcher due to data availability, but are privately

observable to the lender. These include borrower and loan characteristics such as wealth, bequest

or gifts received, and loan documentation.10

We describe some of these unobservable risks using

examples below.

10 These risks, which we term unobservable, could include both omitted quantifiable risk variables and

non-quantifiable variables.


Wealth. Consider the example of an African-American borrower and a white borrower

who come to the same lender for a loan application. Assume the two borrowers have similar

credit scores, income, debt-to-income ratios and the two loans are similar in every aspect such as

the loan-to-value ratio, loan amount, and property risks. Based on the observable information,

the researcher would expect to find similar loan decisions made by the lender. If the application

for the white borrower is accepted and the application for the African-American borrower is

rejected, then the researcher might conclude that this is evidence of racial discrimination.

This conclusion would be incorrect if the lender has private information about ability to

pay, which is not observable to the researcher. For example, if the lender knows that the white

borrower has a higher level of wealth than the African-American borrower, then the lender may

approve the loan for the white borrower because of a perceived stronger ability to repay the loan.

Therefore, if RACE is correlated with wealth, the coefficient on RACE may not reflect

discrimination, but the effect of wealth.
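
The wealth example can be made concrete with a small simulation. This is a hypothetical sketch, not the paper's data or method: the minority share, the wealth gap, and the rejection rule are all assumed for illustration. The simulated lender rejects on the basis of wealth plus noise and never uses race, yet the raw difference in rejection rates between groups is positive. With a single binary regressor, that difference equals the OLS coefficient on race.

```python
import random


def simulate_naive_race_gap(n=20000, seed=0):
    """Return the minority-minus-white rejection rate gap when the
    simulated lender uses only wealth, which the researcher cannot see."""
    rng = random.Random(seed)
    outcomes = {0: [], 1: []}
    for _ in range(n):
        race = 1 if rng.random() < 0.3 else 0      # 1 = minority (assumed share)
        # Assumed: minority applicants have lower mean wealth.
        wealth = rng.gauss(1.0 - 0.5 * race, 1.0)
        # Rejection depends on wealth plus idiosyncratic noise, never on race.
        reject = 1 if wealth + rng.gauss(0.0, 1.0) < 0.0 else 0
        outcomes[race].append(reject)
    rate = {group: sum(v) / len(v) for group, v in outcomes.items()}
    return rate[1] - rate[0]


naive_gap = simulate_naive_race_gap()
```

The gap is positive even though the true effect of race in the simulated decision rule is zero, which is the pattern a single-equation study could misread as discrimination.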

A number of studies (for example, Altonji, et al. [2000]; Gittleman and Wolff [2000];

and Blau and Graham [1990]) have shown that minority (African-American and Hispanic)

households have much lower wealth holdings than white households. In a recent survey of

research, Scholz and Levine [2004] show that wealth differences are large and persistent when

groups are divided by race, and the differences remain even when differences in educational

attainment across the groups are controlled for. Using pooled data from the Federal Reserve’s

Surveys of Consumer Finances from 1989, 1992, 1995, and 1998, they show that there are wide

net wealth disparities at every age. For example, they find that “at ages 51 to 55, …, mean

(median) net wealth of white households is $467,747 ($156,550), while for African-American

and Hispanic households it is $105,675 ($33,170).” They also find that “the net wealth of

African-American and Hispanic college graduates is similar to the net wealth of white high

school graduates, and the net wealth of African-American and Hispanic high school graduates is

similar to the net wealth of white high school dropouts (emphasis added).”

Bequests and gifts. One can make a similar argument about differing levels of bequests

and gifts by race. That is, the lender may know the bequest or gifts received by the African-

American, Hispanic, or white borrower, while the researcher does not. Therefore, the researcher

may incorrectly conclude evidence of discrimination, when differences really reflect inheritance.

Using data from the Panel Study of Income Dynamics, Charles and Hurst [2000] find that white


and African-American households differ in family help for a down payment on a home.

Specifically, more white households (42 percent) get family help than do African-American

households (fewer than 10 percent). Using PSID wealth supplements in 1984, 1989, and 1994,

Gittleman and Wolff [2000] find that inheritance plays an important role in wealth accumulation

for whites, but not for African-Americans. These papers suggest that the ability to obtain

financial help from family differs across races, which can be information known to the

lender, but unknown to the researcher.

Poor documentation. The relative importance of particular unobservable risks may

change when lending standards change. For example, “no documentation” or “low

documentation” loans were very rare in the 1990s, and were not included in earlier single-

equation studies. But such loans were 12 percent of mortgage loan originations in the second half

of 2005, a boom year for subprime mortgages.11

If minority status of the borrowers is correlated

with low and unstable income, they might be more likely to apply for low documentation loans.12

Therefore, omitting the documentation type of the loan biases the regression coefficients,

because ‘low documentation’ loans are associated with higher risks.

Soft private information. There are many soft information variables that the researcher

does not have access to which might be important to lenders. Examples include whether the borrower has

significant deposits with the bank, has a long relationship with the lender, or shares a common social

circle or cultural affinity with the lender (such as belonging to the same credit union), as well as the monthly or quarterly

loan origination target set for the loan officer by the supervisor.13

Therefore, it is hard for

researchers to document every variable that is relevant to the lender’s loan decision.

We describe below how our methodology mitigates the bias in parameter estimates.

11 State of the Nation’s Housing 2005 report at Harvard’s Joint Center for Housing Studies,

http://www.jchs.harvard.edu/publications/markets/son2005.

12 In a typical low documentation loan, income can be stated, but not verified. Therefore, borrowers with low or

unstable income are more likely to apply for a low documentation loan. Low documentation can capture additional

credit risks that are not captured in the borrower’s FICO credit score. For example, in the current FICO score, good

or bad performance is based on the worst delinquency on any obligation over the last two years. Therefore, a

borrower who is delinquent in one out of ten open accounts could represent the same level of unsatisfactory outcome

as someone who is delinquent on all ten accounts if their worst delinquency amount is the same. In the latter case,

the borrower who is delinquent on all ten accounts is more willing to apply for a low documentation loan, given that

a lender might require her to disclose more adverse information. Courchane, et al. [2007] show that low-

documentation loans in the subprime market are associated with higher credit risk.

13 Gan and Riddiough [2008] suggest that lenders possess proprietary credit quality information embedded in their

screening technology that is not observable to empirical researchers.


3.2 Equations and variables

We begin by explaining the system of equations at the individual loan-level. We model

the accept/reject decision of a loan application in a manner similar to that used in the single-

equation studies:

$$
\text{REJECT}_{it} = \beta_0 + \beta_1 \text{BORROWER}_{it} + \beta_2 \text{LOAN}_{it} + \beta_3 \text{PROPERTY}_{it} + \beta_4 \text{LENDER}_{it} + \beta_5 \text{MACRO}_{t} + \delta\, \text{RACE}_{it} + \varepsilon_{it} \qquad (2)
$$

where $\text{REJECT}_{it}$ is a dummy variable that equals one if the loan application is rejected, and zero

otherwise; $\text{BORROWER}_{it}$ is a vector of borrower risk variables; $\text{LOAN}_{it}$ is a vector of loan risk

variables; $\text{PROPERTY}_{it}$ is a vector of property risk variables; $\text{LENDER}_{it}$ is a vector of lender

characteristics; $\text{MACRO}_{t}$ is a vector of macroeconomic variables; and $\text{RACE}_{it}$ is a dummy variable that

equals one if the borrower is a member of a minority group (African-American or Hispanic), and

zero if the borrower is white. We have used a comprehensive set of loan variables. If RACE is

correlated with the disturbance term $\varepsilon_{it}$, it is technically endogenous, resulting in inconsistent

estimates of $\delta$. To rectify this problem we specify another equation for RACE:

$$
\text{RACE}_{it} = \gamma_0 + \gamma_1 \text{BORROWER}_{it} + \gamma_2 \text{LOAN}_{it} + \gamma_3 \text{PROPERTY}_{it} + \gamma_4 \text{BLACKCHURCH} + e_{it} \qquad (3)
$$

Because both the outcome variable $\text{REJECT}_{it}$ and the independent variable $\text{RACE}_{it}$ are

binary variables and $\text{RACE}_{it}$ is correlated with the disturbance $\varepsilon_{it}$, neither the traditional

two-stage least squares (2SLS) nor the Rivers-Vuong approaches will produce consistent estimators.

Instead, we use the full-information maximum likelihood (FIML) methodology to estimate the

model because this is the only valid econometric technique that can be used to estimate a binary

choice model with a binary endogenous variable (Greene [2003]; Wooldridge [2002]). This

method has been used to estimate binary choice models with endogenous variables by Evans,

Oates, and Schwab [1992], Evans and Schwab [1995], and Greene [1998]. In FIML, the

correlation of the residual terms in equations (2) and (3) is estimated by maximizing the log

likelihood function for the two equations simultaneously. Though the identification of the system

of equations does not need an instrument, we include the variable BLACKCHURCH, which is

the county-level number of African-American church members. The coefficient on RACE is

the coefficient of interest. We estimate the model for prime and subprime loans separately to

draw inferences for the taste-based and information-based theories of discrimination.
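For reference, when the disturbances of the rejection equation and the race equation are assumed to be jointly standard normal with correlation ρ, the system is a recursive bivariate probit, and the textbook form of its log-likelihood can be sketched as below. The notation is introduced here for illustration only and is an assumption of this sketch: $x_{it}$ stacks the controls of the rejection equation, $w_{it}$ stacks the regressors of the race equation (including BLACKCHURCH), $\delta$ is the coefficient on RACE, and $\Phi_2$ is the bivariate standard normal distribution function.

```latex
\ln L = \sum_{i,t} \ln \Phi_2\!\Big( q^{R}_{it}\,\big(x_{it}'\beta + \delta\,\text{RACE}_{it}\big),\;
        q^{M}_{it}\, w_{it}'\gamma;\; q^{R}_{it}\, q^{M}_{it}\,\rho \Big),
\qquad q^{R}_{it} = 2\,\text{REJECT}_{it} - 1, \quad q^{M}_{it} = 2\,\text{RACE}_{it} - 1 .
```

Setting ρ = 0 makes the likelihood factor into two separate probits, which corresponds to the single-equation case.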


Motivating the correlation coefficient ρ between $e_{it}$ and $\varepsilon_{it}$ is straightforward. The residual $e_{it}$ in equation (3)

captures characteristics of being a minority that are not correlated with borrower, loan, and property

risk characteristics. It could be unobserved risk or just another characteristic of minorities. If it is

other characteristics, and not risk, that is captured by equation (3), then the estimate of ρ should

be statistically insignificant. If minorities have higher unobservable risks, then the estimate of ρ

should be positive and statistically significant. Conversely, if minorities have lower unobservable

risks, then the estimate of ρ should be negative and statistically significant.

We have two specifications for RACE: AFRICAN-AMERICAN, a dummy variable set to

unity if the borrower is African-American, and zero if the borrower is white; and MINORITY, a

dummy variable set to unity if the borrower is either African-American or Hispanic, and zero if

the borrower is white.14

As in the previous literature, we exclude all other race categories such as

Asian-American.

We use a comprehensive list of control variables combining the explanatory variables

used by Holmes and Horvitz [1994]; Munnell, et al. [1996]; Berkovec, et al. [1998]; Day and

Liebowitz [1996, 1998]; and Calem, et al. [2004]. Specifically, the control variables cover five

broad categories: borrower risk characteristics, loan risk characteristics, property risk

characteristics, lender characteristics, and macroeconomic variables.

Our proxies for the various borrower risk characteristics are as follows: INCOME, the

natural logarithm of individual borrower income; MEDFICO, the borrower’s Census tract

median credit score (FICO); DTI, the average non-mortgage debt divided by average income

in the Census tract; PCTCOLLEGE, the percentage of the Census tract population older

than 25 years of age with at least a Bachelor’s degree; and UNEMPLOYED, the number of

unemployed civilians divided by the sum of employed and unemployed civilians in the Census

tract. We expect lower levels of INCOME, MEDFICO, and PCTCOLLEGE to be associated with

higher borrower risks and higher probability of rejection. We also expect higher DTI and

UNEMPLOYED to be associated with higher borrower risks and higher probability of rejection.

14 If the borrower is African-American (white) and is a male, we classify the borrower as African-American (white).

We re-estimated our models after modifying our definition of African-American to include cases in which the borrower and/or

co-borrower is African-American, with no reference to gender. Similar adjustments were made for Hispanics. Our

results were generally unchanged (results available on request), because only a few borrowers were

affected when we use different definitions of RACE.


Loan risk characteristics include LTV, the median loan amount divided by the product of

owner-occupied median house value and annual house price appreciation rate in the Census tract;

CONVENTIONAL, a dummy variable set to unity if the loan is a conventional loan, and zero if

the loan is a special program loan, such as VA or FHA loan; and AMOUNT, defined as

individual loan amount in thousands. We expect loans with higher LTV and CONVENTIONAL

loans to carry higher loan risk and therefore to be positively associated with the probability of

rejection.

Property risk characteristics include MEDAGE, the median age of residential property in

the census tract; PCTRENT, the number of renter-occupied housing units divided by the total

housing units in the census tract; HVCHG, the house price appreciation rate; NEWOWN, the

percentage change in number of owner occupants between the 1990 and 2000 censuses; and

OWNEROCC, a dummy variable equal to unity if the property underlying the loan is owner-

occupied, and zero otherwise. We expect higher levels of MEDAGE and PCTRENT to be

associated with higher property risk and lead to higher rejection probabilities, while properties

that are owner-occupied (OWNEROCC) or located in higher HVCHG or NEWOWN tracts are

less risky, and are accordingly expected to have lower rejection probabilities.

We include two measures of lender characteristics: BANK, a dummy variable equal to

unity if the lender is a commercial bank, savings, or thrift institution, and zero otherwise; and

HERF, the Herfindahl-Hirschman Index of the census tract, defined as the sum of squared

market shares of lenders in each census tract. We include the bank dummy variable to distinguish

depository and nondepository institutions. These two types of institutions are subject to very

different sets of regulations. For example, depositories are subject to Community Reinvestment

Act ratings and CAMEL ratings, but nondepositories are not. We include this dummy to capture

this difference.
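The HERF definition above (sum of squared lender market shares within a tract) can be illustrated with a short sketch. The text does not say whether shares are measured by number of applications or by dollar volume; the example below assumes application counts, and the function name and lender labels are hypothetical.

```python
from collections import Counter


def herfindahl(lender_ids):
    """Sum of squared market shares for one census tract.

    Assumption: a lender's share is its fraction of the tract's applications.
    """
    counts = Counter(lender_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no applications in tract")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical tract: four applications spread over three lenders.
tract_lenders = ["A", "A", "B", "C"]
hhi = herfindahl(tract_lenders)  # shares 0.5, 0.25, 0.25
```

The index equals one when a single lender receives every application in the tract and falls toward zero as applications are spread over more lenders.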

Three macroeconomic variables are chosen to control for differences in the economic

environment at the time of the loan application: PRIMERATE, the interest rate on prime loans;

TERM, the spread between the yields on the seven-year Treasury note and the three-month

Treasury bill; and DEFAULT, the spread between the yields on Baa and Aaa bonds.

To check the robustness of the loan-level FIML results, we also estimated the model at

the neighborhood-census-tract level. By aggregating the loan-level binary dependent variables

into neighborhood continuous variables, we can use two-stage least squares (2SLS) with


instrumental variables. If we find similar results for both levels, then the central results do not

depend on FIML. Moreover, some of the borrower risk variables, such as census tract

MEDFICO score and DTI, are imperfect proxies for an individual borrower’s risk. If the results

from the individual loan-level (with imperfect credit risk proxies) hold at the neighborhood-level

(with accurate credit risk proxies), imperfect borrower risk proxies are not driving the results.

At the neighborhood-census-tract level, previous studies examine either the number of

subprime loan rejections or originations. This emphasis is problematic because more loans could

be rejected or originated simply because there were more applications. Therefore, we examine

the percentage of subprime loans rejected in a census tract in order to also control for the demand

side of loans. The system of equations for the neighborhood level is specified as follows:

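$$
\text{PCTREJECTION}_{it} = \beta_0 + \beta_1 \text{BORROWER}_{it} + \beta_2 \text{LOAN}_{it} + \beta_3 \text{PROPERTY}_{it} + \beta_4 \text{MACRO}_{t} + \delta\, \text{PCTRACE}_{it} + \varepsilon_{it} \qquad (4)
$$

$$
\text{PCTRACE}_{it} = \gamma_0 + \gamma_1 \text{BORROWER}_{it} + \gamma_2 \text{LOAN}_{it} + \gamma_3 \text{PROPERTY}_{it} + \gamma_4 \text{BLACKCHURCH} + e_{it}, \qquad (5)
$$

where $\text{PCTREJECTION}_{it}$ is defined as the number of loan rejections divided by the number of loan

applications in census tract $i$ and $\text{PCTRACE}_{it}$ is the percentage of minority applicants in census

tract $i$. The coefficient on $\text{PCTRACE}_{it}$ is the coefficient of interest. We estimate the model

separately for prime and subprime loans. We use the Durbin-Wu-Hausman test to examine

whether $\text{PCTRACE}_{it}$ is correlated with the disturbance term $\varepsilon_{it}$.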
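The aggregation from loan-level indicators to the tract-level variables PCTREJECTION and PCTRACE can be sketched as follows. This is only an illustration: the record layout (tract id, rejection flag, minority flag) and the function name are assumptions, not the structure of the HMDA files.

```python
from collections import defaultdict


def tract_shares(applications):
    """Aggregate loan-level records to tract-level shares.

    `applications` is an iterable of (tract_id, rejected, minority) tuples,
    where `rejected` and `minority` are 0/1 indicators (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0, 0])  # applications, rejections, minority applicants
    for tract_id, rejected, minority in applications:
        entry = totals[tract_id]
        entry[0] += 1
        entry[1] += rejected
        entry[2] += minority
    return {
        tract: {"PCTREJECTION": rej / n, "PCTRACE": mino / n}
        for tract, (n, rej, mino) in totals.items()
    }


# Made-up records for two tracts.
sample = [
    ("T1", 1, 1), ("T1", 0, 0), ("T1", 0, 1), ("T1", 0, 0),
    ("T2", 0, 0), ("T2", 1, 1),
]
shares = tract_shares(sample)
```

Dividing by the number of applications is what distinguishes these shares from raw rejection counts, which rise mechanically with application volume.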

The control variables in the neighborhood regression are similar to those in the loan-level

regression. However, because some of the control variables such as OWNEROCC and BANK are

loan-level variables, we have fewer control variables in the neighborhood specifications.15

Similar to the loan-level regression, we have two specifications for the race variable PCTRACE:

PCTAFRICAN, the percentage of census tract applicants that are African-American; and

PCTMINORITY, the percentage of census-tract applicants that are either African-American or

Hispanic.

15 We estimated the loan-level specifications dropping variables that are not used in the neighborhood specifications

and found that none of our results change significantly.


Because both the outcome variable and independent variable are continuous, we can use

two-stage least squares to estimate the model. For this system of equations to be identified, we

need a valid and strong instrumental variable16

for PCTRACE. We use the following instrument

in the PCTRACE equation: BLACKCHURCH, the county-level number of black church

members. This instrument is chosen because black church membership is highly correlated with

African-American status, but religious status should be exogenous to the accept/reject loan

decision. In the first stage, we estimate by OLS a regression of PCTRACE on all control

variables and the instrumental variable. We use the fitted values from this stage as regressors in

the PCTREJECTION regression.
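The two stages described above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not the estimation code behind the reported tables: the function names, the data-generating process, and the coefficient values (a true effect of 0.5) are all assumptions made for the example.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + u."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, endog, controls, instrument):
    """2SLS with one endogenous regressor and one excluded instrument.

    Returns coefficients ordered as [constant, controls..., endog].
    """
    n = len(y)
    const = np.ones((n, 1))
    # First stage: endogenous regressor on all controls and the instrument.
    Z = np.hstack([const, controls, instrument.reshape(-1, 1)])
    fitted = Z @ ols(Z, endog)
    # Second stage: outcome on controls and the first-stage fitted values.
    X_hat = np.hstack([const, controls, fitted.reshape(-1, 1)])
    return ols(X_hat, y)


# Simulated data (illustrative only): u is an unobserved risk factor that
# raises both the minority share and the rejection rate.
rng = np.random.default_rng(0)
n = 5000
control = rng.normal(size=(n, 1))
z = rng.normal(size=n)  # stands in for the instrument
u = rng.normal(size=n)
pctrace = 0.8 * z + 0.5 * control[:, 0] + u + rng.normal(size=n)
pctrejection = 1.0 + 0.3 * control[:, 0] + 0.5 * pctrace + 2.0 * u + rng.normal(size=n)

beta_2sls = two_stage_least_squares(pctrejection, pctrace, control, z)
X_ols = np.hstack([np.ones((n, 1)), control, pctrace.reshape(-1, 1)])
beta_ols = ols(X_ols, pctrejection)
```

In this simulation the OLS coefficient on the endogenous regressor is pushed well above 0.5 by the omitted factor, while the 2SLS coefficient stays close to it. Standard errors taken directly from the second-stage regression are not valid and would need the usual 2SLS correction.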

Next, to examine whether changes in rejection rates are associated with changes in

lending standards, we further estimate the following model:

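$$
\text{DIFF\_PCTREJECTION}_{i,96-06} = \alpha_0 + \alpha_1 \text{DIFF\_BORROWER}_{i,96-06} + \alpha_2 \text{DIFF\_LOAN}_{i,96-06} + \alpha_3 \text{DIFF\_PROPERTY}_{i,96-06} + \alpha_4 \text{DIFF\_MACRO}_{i,96-06} + \text{RESID}_{i} \qquad (6)
$$

and

$$
\text{RESID}_{i} = \theta_0 + \theta_1 \text{DIFF\_PCTRACE}_{i,96-06} + e_{i}, \qquad (7)
$$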

where DIFF_PCTREJECTION is the difference in rejection rates between 1996 and 2006,

DIFF_BORROWER, DIFF_LOAN, DIFF_PROPERTY, and DIFF_MACRO are the differences

in borrower, loan, property and macroeconomic risks between 1996 and 2006.

In the first stage, we estimate model (6). In the second stage, we regress the residual of

regression (6) on the change in the share of minority borrowers between 1996 and 2006. The residual

from regression model (6) captures the change in rejection rates that is not explained by changes

in risks. If the coefficient on DIFF_PCTRACE in equation (7) is positive, it suggests that lenders

tightened lending standards to minority neighborhoods. If instead it is negative, it suggests a

relaxation of lending standards to minority neighborhoods.
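The two-step procedure in equations (6) and (7) can be sketched as below, again on simulated data. The function name, the number of risk variables, and the simulated coefficients (a slope of -0.3 on the change in minority share) are assumptions chosen for illustration.

```python
import numpy as np


def residual_on_race_change(diff_rejection, diff_risk, diff_pctrace):
    """Step 1: regress the change in rejection rates on changes in risk.
    Step 2: regress the step-1 residual on the change in minority share.
    Returns (intercept, slope) from step 2.
    """
    n = len(diff_rejection)
    const = np.ones((n, 1))
    X = np.hstack([const, diff_risk])
    alpha = np.linalg.lstsq(X, diff_rejection, rcond=None)[0]
    resid = diff_rejection - X @ alpha
    W = np.hstack([const, diff_pctrace.reshape(-1, 1)])
    theta = np.linalg.lstsq(W, resid, rcond=None)[0]
    return theta[0], theta[1]


# Simulated tract-level changes (illustrative only).
rng = np.random.default_rng(1)
n = 2000
diff_risk = rng.normal(size=(n, 2))
diff_pctrace = rng.normal(size=n)
diff_rejection = (0.4 * diff_risk[:, 0] - 0.2 * diff_risk[:, 1]
                  - 0.3 * diff_pctrace + rng.normal(scale=0.5, size=n))

intercept, slope = residual_on_race_change(diff_rejection, diff_risk, diff_pctrace)
```

A negative slope in this sketch corresponds to the case in which rejection rates fell by more than risk changes explain in tracts where the minority share rose.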

3. Data Sources and Descriptive Statistics

3.1 Data sources

We match seven sources of data to construct a broad set of borrower risk characteristics,

loan risk characteristics, property risk characteristics, lender characteristics and macroeconomic

variables. First, we use the Home Mortgage Disclosure Act (HMDA) data from 2000 to 2008 to

16 We use the Staiger-Stock test to examine whether this is a strong instrumental variable.

obtain individual loan-level data (such as whether a loan is being accepted or rejected, loan

amount, income, race and gender of the borrower, etc). We also use the HMDA data to derive

measures of lender characteristics, such as the Herfindahl-Hirschman Index of the census tract

and whether the lender is a bank. Second, following previous studies, we use the Department of

Housing and Urban Development’s (HUD) list of subprime lenders to code each loan as being

subprime or prime. HUD’s list of lenders is matched to HMDA by each lender’s unique

identification code. However, since HUD stopped publishing the subprime lender list after

2005, we extend the list for later years by classifying a lender as subprime if its higher-priced

loan originations exceed 50% of its total loan originations. The correlation between our list of subprime

lenders and the HUD subprime list is 0.8. Third, we use U.S. census data to derive census tract-

level demographic, property and borrower risk characteristics. We match the census data to

HMDA data by state, county, and census tract number. Fourth, we use the proprietary data from

TransUnion, a major credit bureau, for the tract-median FICO score (MEDFICO) and debt-to-income

ratio (DTI), which are widely accepted borrower-risk variables used by mortgage bankers and

brokers in their lending decisions. We match the credit bureau data to HMDA data by state,

county, and census tract number. Fifth, we match the House Price Index (HPI) data from the

Federal Housing Finance Agency (FHFA, formerly OFHEO) to HMDA data by year

and Metropolitan Statistical Area (MSA). We use these data to construct neighborhood house

price appreciation rate, which is used to calculate the loan-to-value ratio (LTV). Sixth, we match

data on African-American church members from the U.S. Religious Landscape Survey by county.

African-American churches are identified from the list of predominantly African-American

denominations in the Religious Congregations and Membership Study (RCMS). Finally, we use

macroeconomic data from the Federal Reserve Bank of St. Louis’s website

(http://research.stlouisfed.org) to control for macroeconomic risk. Detailed definitions of the

control variables, instrumental variables, their sources, and which regression they are used in are

listed in Table I.
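The rule used above to extend the subprime lender list after 2005 (a lender is treated as subprime when higher-priced loans exceed 50% of its originations) can be written as a short sketch. Whether the share is computed on loan counts or dollar amounts is not stated; the example assumes counts, and the lender ids and function name are hypothetical.

```python
def subprime_lenders(originations, threshold=0.5):
    """Return the ids of lenders whose higher-priced share exceeds `threshold`.

    `originations` maps lender id -> (higher_priced_count, total_count).
    Assumption: shares are based on the number of originated loans.
    """
    flagged = set()
    for lender, (higher_priced, total) in originations.items():
        if total > 0 and higher_priced / total > threshold:
            flagged.add(lender)
    return flagged


# Made-up origination counts for three lenders.
counts = {"L1": (60, 100), "L2": (50, 100), "L3": (5, 80)}
flagged = subprime_lenders(counts)
```

A lender with a share of exactly 50% is not flagged here, because the rule as stated requires the share to exceed 50%.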

***Table I***

3.2 Descriptive statistics

Table II reports summary statistics for variables in the loan-level regressions. The sample

includes 2,026,556 mortgage loan applications in New Jersey from 2000 to 2008 and is divided

into the prime loan sample (1,714,003 applications) and the subprime loan sample (312,553


applications). Examining the two samples, the probability of rejection is much higher in

subprime lending (0.46) than in prime lending (0.23).

***Table II***

The subprime mortgage sample has a higher proportion of minority applicants than the

prime mortgage sample. For example, 19 percent of subprime loan applicants are African-

Americans and 28 percent are minorities (African-American or Hispanic borrowers); the

corresponding values for prime loan applicants are 10 percent and 19 percent, respectively.

These results are consistent with Scheessele [2002], Calem, et al. [2004], and Mayer and Pence

[2008], who also find that a large proportion of minority borrowers obtain their mortgages in the

subprime market. Subprime borrowers also show higher credit risks than prime borrowers. On

average, subprime borrowers have lower income (natural logarithm of borrower income in

thousands is 4.35 for subprime borrowers and 4.54 for prime borrowers) and a lower credit score

(681 for subprime and 730 for prime). Further, subprime borrowers have a lower percentage of

college graduates (22 percent for subprime versus 29 percent for prime) and a higher

unemployment rate (7 percent for subprime and 6 percent for prime).

4. Results

In this section, we present our regression results at the individual loan-level and

neighborhood-level. We first estimate the loan-level regressions, in which we can control for

more variables at the level at which the loan is made, namely, the individual. The advantage of

the FIML method is that it is at the individual loan level, the unit at which the lending officer

makes the decision. The disadvantage is that the regression estimates depend crucially on the

normality assumption of the error terms. The second method is two-stage least squares (2SLS) at

the neighborhood census tract level. The advantage of 2SLS is that the regression estimates do

not depend crucially on the statistical distribution of the error terms, but its disadvantage is that

it is estimated at the census tract level rather than at the level of the individual loan.

4.1 Loan-level estimation

Similar to previous studies, we begin by estimating single-equation probit regressions

for the accept/reject decision, where RACE is treated as uncorrelated with the disturbance term.


Table III reports these results. In columns (1) and (3), RACE is defined as a dummy

variable that equals one if the borrower is African-American, and zero if the borrower is white.

In columns (2) and (4), RACE is equal to one if the borrower is either African-American or

Hispanic, and zero if the borrower is white. We estimate separate regressions for prime and

subprime lending and report the marginal effects.

***Table III***

In Table III, we find that the probability of rejection is 7.9 percent higher for an African-

American borrower than it is for a white applicant in the prime market. The probability of

rejection is 2.0 percent higher for a minority (African-American or Hispanic) borrower than for a

white borrower. These results are consistent with studies such as Munnell, et al. [1996], which

find that the probability of rejection is 8 percent higher for minorities (African-American or

Hispanic) than for whites. But, in the subprime market, we do not find similar results. The

probability of rejection is 3.6 percent lower for African-Americans than for whites.

We next investigate the effects of correcting for the correlation between RACE and the disturbance

term in the accept/reject regression. We begin by showing that RACE is correlated with other

observable risk characteristics in Table IV.

***Table IV***

Panel A of Table IV presents different risk characteristics for different race categories

and tests for the differences in means. Examining African-American versus white borrowers, we

find that African-Americans exhibit much higher borrower, loan, and property risks than whites.

African-Americans have, on average, lower income, lower credit score, a higher debt-to-income

ratio, lower education, and a higher unemployment rate than whites. The magnitude of the

difference is quite large for some variables. For example, the average credit score for whites is

735, but it is only 666 for African-Americans. The average debt-to-income ratio for whites is

18.5 percent while it is 30.1 percent for African-Americans. In terms of loan risks, the average

loan amount is much smaller for African-Americans than whites ($138,396 versus $192,271) and

the average loan-to-value ratio is slightly higher for African-Americans. Moreover, African-

Americans are more likely to apply for a special program loan such as VA or FHA loans given

that the proportion of conventional loans is much lower for African-Americans than for whites

(88.7 percent versus 96.3 percent). Examining property risk characteristics, we find that African-

Americans live in neighborhoods with higher property risks such as older houses, higher


percentage of rental housing units, a higher percentage of houses boarded up, etc. The results for

minorities are very similar to African-Americans. African-Americans and minorities are

significantly different from whites for all of the borrower, loan, and property risk variables.

Therefore, Panel A of Table IV provides evidence that RACE is strongly correlated with

observable borrower, loan, and property risk variables in the accept/reject regression.

We now test whether RACE is also correlated with unobservable risk variables. To do so,

we use the Rivers-Vuong [1988] endogeneity test17

and the likelihood ratio test.18

Both tests

reject the null hypothesis that RACE is exogenous, suggesting inconsistent parameter estimates

in the accept/reject regression (see Panel B of Table IV). Moreover, for all specifications, ρ is

positive and statistically significantly different from zero, suggesting that minorities are

associated with higher unobservable risks, leading to higher loan rejections. Therefore, the

omission of the unobservable risks overestimates the effect of RACE on rejection. Because we

reject the null hypothesis that RACE is uncorrelated with the disturbance term in the accept/reject

regression, single-equation probit estimation will produce inconsistent estimators.

In Table V, we use FIML, in which ρ is built into the likelihood function. The coefficient

on the RACE variable is positive (negative) in prime (subprime) markets. That is, these results

show that lending discrimination against minorities appears in prime markets after correcting for

endogeneity (correlation of RACE with the disturbance), similar to the results in Table III in

which RACE is treated as exogenous (uncorrelated with the disturbance). Furthermore, we find

that the coefficients on African-American and minority status are significantly negative in the

subprime market, suggesting that African-American and minority borrowers are more likely to

be approved for a subprime loan. These results suggest evidence consistent with information-

based discrimination.

***Table V***

4.2 Neighborhood-level estimation

We now estimate the model at the neighborhood census tract level. By aggregating the

loan-level binary dependent variables into neighborhood continuous variables, we can use two-

stage least squares (2SLS) with instrumental variables. If we find similar results for both levels,

17 We do not perform the Durbin-Wu-Hausman endogeneity test because the dependent variable in the accept/reject

regression is not a continuous variable. Appendix B describes the Rivers-Vuong [1988] test.

18 The likelihood ratio test is a test for whether ρ (the correlation coefficient between the residuals of the

accept/reject equation and the race equation) is statistically significantly different from zero.

then the central results do not depend on the FIML. Moreover, some of the borrower risk

variables, such as census tract MEDFICO score and DTI, are imperfect proxies for an individual

borrower’s risk. If the results from the individual loan-level (with imperfect credit risk proxies)

hold at the neighborhood-level (with accurate credit risk proxies), imperfect borrower risk

proxies are not driving the results.

We define PCTRACE as the percentage of applicants that are African-American in a

census tract and the percentage of minority applicants (African-American or Hispanic) in a

census tract. Similar to our loan-level regressions, we compare OLS results, which do not

correct for the correlation between race and the disturbance term, with two-stage least squares

results, which do. In doing so, we test whether RACE is technically

endogenous, resulting in inconsistent parameter estimates for OLS. In Panel A of Table VII, our

OLS regressions show higher rejection rates for minorities in the prime markets (consistent with

Gabriel and Rosenthal [1991], and Munnell, et al. [1996]). For subprime markets we find that

minorities have lower rejection rates. However, these regression parameter estimates might be

biased, which we test for using the Durbin-Wu-Hausman test for endogeneity. As the first step in the

Durbin-Wu-Hausman test, one typically regresses the endogenous variable on all exogenous

variables in the system and obtains the residual. We then include the residual as an additional

regressor in the original OLS regressions. If the coefficient on the residual is statistically

significantly different from zero, we can reject the hypothesis of exogeneity. Panel B of Table

VIII shows the chi-squared statistic to be statistically significantly different from zero, suggesting

inconsistent OLS parameter estimates on RACE.
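The regression-based form of the Durbin-Wu-Hausman test described above can be sketched as follows. The sketch reports a t-statistic on the first-stage residual with conventional standard errors rather than the chi-squared statistic mentioned in the text, and everything about the simulated data (names, coefficients, sample size) is an assumption made for illustration.

```python
import numpy as np


def ols_with_se(X, y):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


def dwh_t_stat(y, endog, controls, instrument):
    """t-statistic on the first-stage residual in the augmented regression."""
    n = len(y)
    const = np.ones((n, 1))
    # Step 1: regress the suspect regressor on all exogenous variables.
    Z = np.hstack([const, controls, instrument.reshape(-1, 1)])
    pi, _ = ols_with_se(Z, endog)
    v_hat = endog - Z @ pi
    # Step 2: add the residual to the original regression and test it.
    X_aug = np.hstack([const, controls, endog.reshape(-1, 1), v_hat.reshape(-1, 1)])
    beta, se = ols_with_se(X_aug, y)
    return beta[-1] / se[-1]


def simulate(endogenous, seed=0, n=5000):
    """Illustrative data; `u` enters the outcome only in the endogenous case."""
    rng = np.random.default_rng(seed)
    control = rng.normal(size=(n, 1))
    z = rng.normal(size=n)
    u = rng.normal(size=n)
    x = 0.8 * z + 0.5 * control[:, 0] + u + rng.normal(size=n)
    y = 1.0 + 0.3 * control[:, 0] + 0.5 * x + rng.normal(size=n)
    if endogenous:
        y = y + 2.0 * u
    return y, x, control, z


t_endogenous = dwh_t_stat(*simulate(endogenous=True))
t_exogenous = dwh_t_stat(*simulate(endogenous=False))
```

A large absolute t-statistic, as in the endogenous simulation, leads to rejecting exogeneity; in the exogenous simulation the statistic should be small.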

***Tables VI***

Given that PCTRACE is a continuous variable we can use two-stage least squares

regressions (2SLS) with an instrument variable to obtain the coefficient on RACE. Our selection

of instrumental variable is guided by econometric considerations. Our instrument,

BLACKCHURCH, captures the idea that Black church members are highly correlated with

African-American status, but religious status should be exogenous to the accept/reject loan

decision. We regress PCTREJECTION on PCTRACE and other neighborhood control variables,

the results of which are presented in Table VII. For prime lending, minority-concentrated

neighborhoods are associated with a higher percentage of rejection. Minority-concentrated

neighborhoods are associated with a lower percentage of rejection in subprime markets. These


results are consistent with those found in the individual loan-level regressions using FIML,

suggesting that the central result, that minorities are less likely than whites to be rejected in the

subprime market, does not depend on which method we use (2SLS or FIML) or on the

imperfect proxies we use for an individual borrower’s risk (MEDFICO and DTI). The results are

once again consistent with the information-based theory of discrimination.

***Table VII***

4.3 Validity and Strength of Instrumental Variable

To test the validity and biasedness of our instrument, we follow the methodology of Altonji

et al. (2005a, 2005b). First, we test whether the observable variables are significantly

different in areas with high concentrations of black church members and areas with low

concentrations of black church members. If the observable variables are both economically and

statistically significantly different in the two areas, it may suggest that BLACKCHURCH is also

correlated with other observable variables, and may not serve as a valid instrument. Therefore,

we sort the census tracts into two subsamples: above the median black church members and

below the median black church members. We present the differences in observable variables

between these two subsamples in Panel A of Table VIII. Most of the differences are statistically

significant due to our use of a large sample (see t-statistic and p-values). However, in the case of

many risk variables, the signs are not consistent with a risk story. For example, if

neighborhoods with a larger number of African-American church members have higher risk, one

would expect them to have a lower median FICO score than neighborhoods with a smaller

number of African-American church members. In fact we find the opposite (7.136% v. 7.007%,

respectively). Similarly, neighborhoods with a larger number of African-American church

members have lower loan-to-value ratios, lower percentage of renters and new owners, and

younger houses than neighborhoods with a smaller number of African-American church

members, a result contrary to the correlated risk story. In some cases, the differences are

statistically insignificant (for example, log median income and house price appreciation rates),

whereas in others the differences are very small (for example, conventional loans). In summary,

this panel does not show any evidence for consistently higher risk in neighborhoods with a larger

number of African-American church members when compared to neighborhoods with a smaller

number of African-American church members.
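The median-split comparison described above can be sketched with the standard library. The sketch computes, for one covariate, the difference in means between tracts above and at or below the instrument's median, together with a Welch t-statistic; the choice of Welch's statistic, the function name, and the toy numbers are assumptions for illustration, not the values in Table VIII.

```python
from math import sqrt
from statistics import mean, median, variance


def median_split_difference(instrument, covariate):
    """Difference in covariate means (above minus at-or-below the instrument
    median) and the corresponding Welch t-statistic."""
    cut = median(instrument)
    high = [x for z, x in zip(instrument, covariate) if z > cut]
    low = [x for z, x in zip(instrument, covariate) if z <= cut]
    diff = mean(high) - mean(low)
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return diff, diff / se


# Toy data: six tracts with made-up instrument and covariate values.
blackchurch = [1, 2, 3, 4, 5, 6]
covariate = [10, 12, 11, 20, 22, 21]
diff, t_stat = median_split_difference(blackchurch, covariate)
```

With samples as large as those used here, small differences can be statistically significant, which is why the text also weighs the sign and the economic size of each difference.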

***Table VIII***


The second approach to assess the validity of the instrumental variable is to identify a

sample of white borrowers and test whether BLACKCHURCH directly affects the accept/reject

decision in this sub-sample. If the coefficient on BLACKCHURCH is significant in such a sample,

it suggests that the instrumental variable itself may directly affect the accept/reject decision and

may not serve as a valid instrument. To implement this approach, we regress the accept/reject

decision on the observable variables and the instrument BLACKCHURCH for the top-quintile

white census tracts sample. As shown in Panel B of Table VIII, the instrument, black church

members, does not have a statistically strong effect on the accept/reject decision. Therefore, we

cannot reject the validity of BLACKCHURCH as an instrument.

The third approach to assess the validity of the instrumental variable in the individual-loan

FIML regressions is to assess the upper and lower bounds of the coefficients on the RACE

variable by varying ρ, the correlation coefficient between the residuals of the accept/reject

equation and the race equation. The upper bound bias assumes that ρ is 0, i.e., there is no

correlation between the race variable and the unobservable variables (namely, the single-

equation results). The lower bound bias assumes that the selection on the unobservables is the

same as selection on the observables. More specifically, it assumes that the correlation between

RACE and the observable variables in the regression is equal to the correlation between RACE

and the unobservable variables such as wealth, bequests, documentation, deposits with the

lending institution, and bargaining power. This is a strong assumption, as the observables are

likely to have a higher correlation with RACE than the unobservables.19

Additionally, as Altonji

et al. (2005a, 2005b) argue, researchers do not pick observable variables randomly, but rather to

get the highest fit in their regression model.

We calculate the correlation coefficient between the race variable and observable

variables, and then econometrically restrict the correlation coefficient between the race variable

and the unobservable variables to be the same as the correlation coefficient between the race

variable and observable variables in the FIML regression.20

The results are shown in Panel C of

Table VIII.

19 See Table III, wherein the share of outcomes correctly predicted by the observables varies from 65% to 88%. We
do not focus on the pseudo R2 statistic, because it has less information content and is known to be low in limited
dependent variable regressions (see Greene 2003).

20 This FIML specification restricts $\rho$ to be greater than zero, whereas the FIML specification in Table V makes
no restriction.

We expect the unobservable variables such as wealth and bequests, deposits with the
lender, and documentation to have much more impact on whether the borrower gets a loan in the
subprime market than in the prime market. We find evidence in support of this argument, as our
observable variables can explain 84-88% of the accept/reject decision in the prime market, which
is higher than the corresponding 62-64% in the subprime market (see Table V). This is also
confirmed at the census-tract level (see Table VI), wherein our 2SLS regressions capture 65% of
the variation in the prime market and only 14% in the subprime market. As the ratio of

unobservables to observables is likely to be higher in the subprime market than in the prime

markets, the bias is likely to be higher in the subprime market. Therefore, we expect the results

to be more towards the single-equation results in the prime market, and more towards the FIML

results in the subprime market. These results confirm our support for the information-based

theory of discrimination.

To test the robustness of our results and whether the same results hold during the housing

boom period from 2000 to 2005, we reran the individual loan-level FIML regression and

neighborhood level 2SLS regression for 2000-2005. The results are similar to the full sample

period from 2000 to 2008, and are given in Table IX. This suggests that our main results are not

dependent on the sample period used.

***Table IX***

4.3 Are the lower rejection rates for minorities in the subprime market due to increases in
credit supply?

Mian and Sufi [2009] and Keys, et al. [2010], among others, have suggested that much of
the current credit crisis in the subprime market has arisen from the large increase in credit supply
provided by lax screening incentives inherent in the lender's originate-to-distribute model. Mian
and Sufi [2009] find that zip codes in 1996 -- a period before the credit expansion -- with either
high denial rates or a high share of subprime borrowers got much greater credit in 2002-2005 than
other zip codes. Such borrowers obtained increased credit and had higher default rates despite
having lower income and employment growth rates (proxies for increased mortgage demand) during
that period. We use their methodology to isolate the supply effect by using the proportion of
loan applications in the census tract that were rejected in 1996.


Specifically, to separate the change in demand from the change in supply, in the
first-stage regression we regress the change in rejection rates on the changes in borrower
(without the race variable), loan, property, and macroeconomic risks between 1996 and 2008.
This first-stage regression represents the changes in rejection rates due to credit demand
effects. We take the residuals from this first-stage regression and regress them on the
difference in the proportion of minority borrowers between 1996 and 2008, or on the proportion
of minority borrowers in 1996. The results are presented in Panel A of Table X. For both
measures of racial composition, we find that minority-concentrated neighborhoods experienced a
disproportionately larger reduction in subprime loan rejection rates, even after controlling
for the changes in demand-side risk characteristics.
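The two-step residual regression described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the variable names (`d_risk`, `pct_minority`, `d_reject`), the single risk control, and the coefficients 0.5 and -0.3 used to generate the data are all assumptions made for the example.

```python
# Two-step residual regression on SYNTHETIC tract-level data (all numbers are made up).

def ols(y, X):
    """Least-squares coefficients for y = X b + e, where X is a list of rows."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]  # X'X
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]            # X'y
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[i] / A[i][i] for i in range(k)]

n = 130  # hypothetical number of census tracts
pct_minority = [(i % 10) / 10 for i in range(n)]        # assumed minority share in 1996
d_risk = [((7 * i) % 13) / 13 - 0.5 for i in range(n)]  # assumed change in a risk measure
# Assumed data-generating process: rejections rise with risk, fall with minority share.
d_reject = [0.5 * r - 0.3 * m for r, m in zip(d_risk, pct_minority)]

# Step 1: change in rejection rates on demand-side risk changes (race excluded).
a0, a1 = ols(d_reject, [[1.0, r] for r in d_risk])
resid = [y - a0 - a1 * r for y, r in zip(d_reject, d_risk)]

# Step 2: first-stage residuals on the racial-composition measure.
g0, g1 = ols(resid, [[1.0, m] for m in pct_minority])
print("risk coefficient:", round(a1, 3), "minority coefficient:", round(g1, 3))
```

A negative second-step coefficient corresponds to the pattern reported in the text: a larger fall in rejection rates in tracts with a higher minority share, after the demand-side risk changes have been netted out.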

***Table X***

To ensure further that the reduction in subprime rejection rates for minorities was not

driven by higher growth in income or employment, we sort the data into four quartiles according

to the proportion of minority borrowers. The top quartile includes those census tracts that have

the lowest minority concentration, while the bottom quartile includes those census tracts that

have the highest minority concentration. In Panel B, we find that neighborhoods with high
minority concentration experienced income and employment growth that was either lower than, or
statistically indistinguishable from, that in neighborhoods with low minority concentration.
The results suggest that the reduction in subprime rejection rates between 1996 and 2008 was
driven not by demand-related income or employment growth, but by an expansion of credit from
the supply side. These results are

consistent with those of Mian and Sufi [2009] and Keys, et al. [2010], who also find strong credit

supply effects.

6. Conclusions

Given the importance of real estate in household portfolios and the stated objective of
regulators to ensure fair and equal access to housing, researchers have examined whether
African-Americans and Hispanics are discriminated against by lenders (see, for example, Black,
Schweitzer, and Mandell [1978]; King [1980]; Schafer and Ladd [1981]; and Munnell, et al.
[1996], among others). These studies have examined whether minority applicants are rejected
more often for prime mortgage loans than white applicants (referred to as the accept/reject
decision). We examine discrimination in both prime and subprime markets, the latter being the
markets in which a large proportion of minority borrowers obtain their mortgages.

We replicate existing studies using a single-equation probit analysis. We find that race is

positively correlated with the decision to reject at the individual loan-level for prime markets.

This finding is consistent with the previous literature above. When we examine the subprime

market, we find that African-American or minority borrowers are less likely to be rejected than

white borrowers. However, we find that race is correlated with observable risks using 14 risk
measures. Using Rivers-Vuong and likelihood ratio tests, we also show that race is positively
correlated with unobservable risks (such as wealth, gifts and bequests, and loan
documentation) in the accept/reject model. The fact that race is correlated with observable and
unobservable risks in the single-equation model violates the assumption that race is
uncorrelated with the disturbance term, resulting in biased parameter estimates. We use two
methods to ameliorate the bias, FIML and 2SLS. We use an instrumental variable for race, namely,
the number of people who attend a predominantly African-American church. We argue that such a
religious status variable is unlikely to be in the lending officer's information set, and find
that its inclusion does not significantly change our main result. Using both FIML and 2SLS
methods, we find race to be

positively (negatively) related to the accept/reject decision in prime (subprime) markets. These

results suggest that African-American or minority borrowers are more (less) likely to be rejected

than white borrowers in the prime (subprime) market, confirming the information-based theory

of discrimination.

We also find that the reduction in rejection rates to minority neighborhoods from 1996 to

2008 cannot be fully justified by risk, suggesting a relaxation of lending standards to minority

neighborhoods. In doing so, we have controlled for home price levels at the MSA level. Our

results for strong credit supply effects are consistent with those in Mian and Sufi [2009], among

others, who also find reductions in loan denial rates to neighborhoods despite significant

deterioration in credit quality.

Author Affiliations

COLUMBIA BUSINESS SCHOOL, COLUMBIA UNIVERSITY

RUTGERS BUSINESS SCHOOL, RUTGERS UNIVERSITY


COLLEGE OF BUSINESS ADMINISTRATION, CALIFORNIA STATE POLYTECHNIC

UNIVERSITY, POMONA


References

Altonji, Joseph G., Ulrich Doraszelski, and Lewis Segal, “Black/White Differences in Wealth,”

Federal Reserve Bank of Chicago Economic Perspectives, 24 (1) (2000), 38-50.

Baher Azmy, Squaring the Predatory Lending Circle, 57 FLA. L. REV. 295, 307 (2005).

Berkovec, James A., Glenn B. Canner, Stuart A. Gabriel, and Timothy H. Hannan,

“Discrimination, Competition, and Loan Performance in FHA Mortgage Lending,” Review of

Economics and Statistics, 80 (2) (1998), 241-250.

Bertrand, Marianne, and Sendhil Mullainathan, “Are Emily and Greg More Employable than

Lakisha and Jamal? A Field Experiment on Labor Market Discrimination,” American Economic

Review, 94(4) (2004), 991-1013.

Black, Harold A., Robert L. Schweitzer, and Lewis Mandell, “Discrimination in Mortgage

Lending,” American Economic Review, 68(2) (1978), 186-191.

Blau, Francine D., and John W. Graham, “Black-White Differences in Wealth and Asset

Composition,” Quarterly Journal of Economics, 105(2) (1990), 321-339.

Blinder, Alan S., “Six Fingers of Blame in the Mortgage Mess,” New York Times. September 30,

2007.

Blinder, Alan S., “Six Blunders En Route to a Crisis,” New York Times, January 25, 2009.

Bound, John, David A. Jaeger, and Regina M. Baker, “Problems with Instrumental Variables

Estimation When the Correlation between the Instruments and Endogenous Explanatory

Variables is Weak,” Journal of the American Statistical Association, 90 (1995), 443-450.

Brunnermeier, Markus K., “Deciphering the Liquidity and Credit Crunch 2007-2008,” Journal of

Economic Perspectives 23(1) (2009), 77-100.

Calomiris, Charles W., “The Subprime Turmoil: What’s Old, What’s New, and What’s Next?”

Working Paper, Columbia Business School, 2008.

Calem, Paul S., Kevin Gillen, and Susan Wachter, “The Neighborhood Distribution of Subprime

Mortgage Lending,” Journal of Real Estate Finance and Economics, 29 (4) (2004), 393-410.

Campbell, John Y., “Household Finance,” Journal of Finance, 61 (2006), 1553-1604.

Charles, Kerwin K., and Erik Hurst, “The Transition to Home-Ownership and the Black-White

Wealth Gap,” Review of Economics and Statistics, 84(2) (2002), 281-297.

Cochrane, John, “Portfolio Theory,” Working Paper, University of Chicago, 2007.


Courchane, Marsha, Adam Gailey, and Peter Zorn, “Consumer Credit Literacy: What Price

Perception?” Working Paper, Federal Reserve Bank of Chicago, 2007.

Day, Theodore E., and Stan J. Liebowitz, “Mortgages, Minorities, and HMDA,” Paper presented

at the Federal Reserve Bank of Chicago, April 1996.

Day, Theodore E., and Stan J. Liebowitz, “Mortgage Lending to Minorities: Where’s the Bias,”

Economic Inquiry, 36 (1998), 3–28.

Demyanyk, Yuliya, and Otto Van Hemert, “Understanding the Subprime Mortgage Crisis,”

Review of Financial Studies, 24 (6) (2011), 1848-1880.

Evans, William N., Wallace E. Oates, and Robert M. Schwab, “Measuring Peer Group Effects: A

Study of Teenage Behavior,” Journal of Political Economy, 100 (1992), 966-991.

Evans, William N., and Robert M. Schwab, “Finishing High School and Starting College: Do

Catholic Schools Make a Difference?” Quarterly Journal of Economics, 110 (1995), 941-974.

Fershtman, Chaim, and Uri Gneezy, “Discrimination in a Segmented Society: An Experimental

Approach,” Quarterly Journal of Economics, 116 (1) (2001), 351-377.

Flavin, Marjorie, and Takashi Yamashita, “Owner-occupied Housing and Composition of the

Household Portfolio,” American Economic Review, 92 (2002), 345-362.

Gabriel, Stuart A., and Stuart S. Rosenthal, “Credit Rationing, Race and the Mortgage Market,”

Journal of Urban Economics, 29 (1991), 371-379.

Gittleman, Maury, and Edward N. Wolff, “Racial Wealth Disparities: Is the Gap Closing?”

Jerome Levy Economics Institute. Working Paper No. 311, 2000.

Gan, Jie, and Timothy J. Riddiough, “Monopoly and Informational Advantage in the Residential

Mortgage Market,” Review of Financial Studies, 21(6) (2008), 2677-2703.

Gerardi, Kristopher, Adam H. Shapiro, and Paul S. Willen, “Subprime Outcomes: Risky

Mortgages, Homeownership Experiences, and Foreclosures,” Working Paper, Federal Reserve

Bank of Boston, 2008.

Gramlich, Edward, Subprime Mortgages: America’s Latest Boom and Bust. (Washington, D.C.:

Urban Institute Press.) 2005.

Greene, William H., “Gender Economics Courses in Liberal Arts Colleges: Further Results,”

Journal of Economic Education, 29 (1998), 291-300.

Greene, William H., Econometric Analysis, 5th Edition (Upper Saddle River: Prentice Hall, 2003).


Hausman, Jerry, and David Wise, “Attrition Bias in Experimental and Panel Data: The Gary

Income Maintenance Experiment,” Econometrica 47(2) (1979), 455-473.

Heckman, James J., “Detecting Discrimination.” Journal of Economic Perspectives 12(2) (1998),

101-116.

Heckman, James J., and Jeffrey Smith, “Assessing the Case for Social Experiments,” Journal of

Economic Perspectives 9(2) (1995), 85-110.

Holmes, Andrew, and Paul Horvitz, “Mortgage Redlining: Race, Risk and Demand,” Journal of

Finance, 49 (1) (1994), 81-99.

Horn, David K., “Evaluating the Role of Race in Mortgage Lending.” FDIC Banking Review,

(Spring/Summer) (1994), 1-15.

Horn, David K., “Mortgage Lending, Race and Model Specification,” Journal of Financial

Services Research 11(1-2), (1997), 42-68.

Hubbard, R. Glenn, “Social Security, Liquidity Constraints, and Pre-Retirement Consumption,"

Southern Economic Journal, 51 (1985), 471-484.

Hubbard, R. Glenn and Christopher Mayer, “House Prices, Interest Rates, and the Mortgage

Market Meltdown,” Working Paper, Columbia Business School, 2008.

Keys, Benjamin J., Tanmoy Mukherjee, Amit Seru, and Vikrant Vig, “Did Securitization Lead to

Lax Screening? Evidence from Sub-Prime Loans,” Quarterly Journal of Economics, 125(1)

(2010), 307-362.

King, Alvin T., “Discrimination in Mortgage Lending: A Study of Three Cities,” Office of

Policy and Economic Research, Federal Home Loan Bank Board, Research Working Paper no.

91, 1980.

LaCour-Little, Michael, “Discrimination in Mortgage Lending: A Critical Review of the

Literature,” Journal of Real Estate Literature, 7 (1999), 15-49.

Lax, Howard, Michael Matni, Paul Raca and Peter Zorn, “Subprime Lending: An Investigation

of Economic Efficiency,” Housing Policy Debate, 15(3) (2004), 533-571.

Levitt, Steven, "Testing Theories of Discrimination: Evidence From Weakest Link," Journal of

Law and Economics, 47(2) (2004), 431-452.

Levitt, Steven, and John A. List, “Field Experiments in Economics: The Past, the Present, and

the Future.” European Economic Review, 53 (2009), 1-18.

Liebowitz, Stan J., “A Study That Deserves No Credit,” Wall Street Journal, September 1, p.

A14, 1993.


Manski, Charles F., “Learning about Treatment Effects from Experiments with Random

Assignment of Treatments,” Journal of Human Resources, 31(4) (1996), 707-733.

Mayer, Christopher, Karen Pence, and Shane M. Sherlund, “The Rise in Mortgage Defaults,”

Journal of Economic Perspectives, 23(1) (2009), 27-50.

Mayer, Christopher, and Karen Pence, “Subprime Mortgages: What, Where, and to Whom?” In

Glaeser, Edward and John Quigley, editors, Housing and the Built Environment: Access,

Finance, Policy, Lincoln Institute of Land Policy, Cambridge MA, 2008.

Mian, Atif, and Amir Sufi, “The Consequences of Mortgage Credit Expansion: Evidence from the

U.S. Mortgage Default Crisis,” Quarterly Journal of Economics, 124(4) (2009), 1449-1496.

Mortgage Market Statistical Annual, 2011.

Munnell, Alicia H., Geoffrey M. B. Tootell, Lynn E. Browne, and James McEneaney, “Mortgage

Lending in Boston: Interpreting HMDA Data,” American Economic Review 86(1) (1996), 25–53.

Office of Thrift Supervision, 2000. “What About Subprime Mortgages?” Mortgage Market

Trends, Volume 4 Issue 1.

Piazzesi, Monika, Martin Schneider, and Selale Tuzel, “Housing, Consumption and Asset

Pricing,” Journal of Financial Economics, 83 (2007), 531-569.

Rachlis, Mitchell B., and Anthony M. J. Yezer, “Serious Flaws in Statistical Tests for

Discrimination in Mortgage Markets,” Journal of Housing Research, 4(2) (1993), 315-336.

Rivers, Douglas, and Quang H. Vuong, “Limited Information Estimators and Exogeneity Tests

for Simultaneous Probit Models,” Journal of Econometrics 39(3) (1988), 347–366.

Ross, Stephen, and John Yinger, “Does Discrimination in Mortgage Lending Exist? The Boston

Fed Study and Its Critics,” In Margery Austin Turner and Felicity Skidmore, eds., Mortgage

Lending Discrimination: A Review of Existing Evidence. Washington, DC: Urban Institute, 43-

83, 1999.

Scheessele, Randall M., “Black and White Disparities in Subprime Mortgage Refinance

Lending,” Housing Finance Working Paper Series, HF-014, U.S. Department of Housing and

Urban Development, 2002.

Scholz, John K., and Kara Levine, “U.S. Black-White Wealth Inequality: A Survey,” in Social

Inequality, K. Neckerman (ed.), Russell Sage Foundation (2004), 895-929.

Schafer, Robert, and Helen F. Ladd, Discrimination in Mortgage Lending. (Cambridge: MIT

Press, 1981).


Staiger, Douglas, and James H. Stock, “Instrumental Variables Regression with Weak

Instruments,” Econometrica, 65 (1997), 557-586.

Stengel, Mitchell, and Dennis Glennon, “Evaluating Statistical Models of Mortgage Lending

Discrimination: A Bank-Specific Analysis.” Real Estate Economics, 27 (1999), 299-334.

Stiglitz, Joseph, “The House of Cards.” The Guardian, October 9, 2007.

Turner, Margery A., Stephen L. Ross, George C. Galster, and John Yinger, Discrimination in

Metropolitan Housing Market: National Results from Phase 1 of HDS2000. Washington D.C.,

2002.

Yezer, Anthony M. J., Robert F. Phillips, and Robert P. Trost, “Bias in Estimates of

Discrimination and Default in Mortgage Lending: The Effects of Simultaneity and Self-

Selection,” Journal of Real Estate Finance and Economics, 9 (1994), 197-215.

Wooldridge, Jeffrey M., Econometric Analysis of Cross Section and Panel Data, (Cambridge:

The MIT Press, 2002).

Zandi, Mark, “Boston Fed’s Study Was Deeply Flawed,” American Banker, August 19, 1993.


Appendix A

Full-Information Maximum Likelihood (FIML) for

Probit Regression with Binary Endogenous Variable

Greene [2003] and Wooldridge [2002] both show that it is inappropriate to use two-step

procedures in estimating a probit regression with a binary endogenous variable. In a two-step

procedure, such as in two-stage least squares (2SLS), one typically substitutes the fitted value for

the endogenous variable. Yet, when the endogenous variable is binary, the function is nonlinear.

Substituting the fitted values will not produce consistent estimators in nonlinear systems.

It is also interesting to note that using the FIML, we do not need an exclusion restriction,

or instrumental variable for the binary endogenous variable. We illustrate this point using the

following three models.

Model 1: Standard Probit Without Endogeneity

$$y_{1i} = 1[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0], \qquad (1)$$

where $y_2$ is exogenous. The log likelihood function is:

$$\ln L = \sum_i \left\{ y_{1i} \ln \Phi(x_{1i}\beta_1 + y_{2i}\beta_2) + (1 - y_{1i}) \ln\left[1 - \Phi(x_{1i}\beta_1 + y_{2i}\beta_2)\right] \right\},$$

where $\Phi$ is the standard normal cumulative distribution function. The maximum likelihood
estimates (MLE) of the parameters are obtained by taking the partial derivatives of $\ln L$ with
respect to $\beta_1, \beta_2$ and setting the first-order conditions equal to zero.
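As a minimal sketch of the Model 1 log likelihood, the function below evaluates $\ln L$ for a single scalar regressor. The names `beta1`, `beta2`, and `probit_loglik`, the scalar simplification, and the four data points are assumptions chosen for illustration; a real application would maximize this function numerically over the parameters.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def probit_loglik(beta1, beta2, x1, y2, y1):
    """Model 1 log likelihood: y1 = 1[x1*beta1 + y2*beta2 + u > 0], y2 exogenous."""
    total = 0.0
    for x, d, y in zip(x1, y2, y1):
        p = PHI(x * beta1 + d * beta2)          # P(y1 = 1 | x1, y2)
        total += y * log(p) + (1 - y) * log(1 - p)
    return total

# Hypothetical data: four applications, one risk regressor, one binary indicator.
x1 = [0.5, -1.0, 1.5, 0.0]
y2 = [1, 0, 0, 1]
y1 = [1, 0, 1, 0]
print(probit_loglik(0.0, 0.0, x1, y2, y1))  # equals 4 * ln(0.5) when all coefficients are zero
```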

Model 2: Probit with Continuous Endogenous Variable

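$$y_{1i} = 1[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0] \qquad (1)$$

$$y_{2i} = x_{2i}\delta_1 + v_{2i} \qquad (2)$$

where $y_{2i}$ is a continuous variable, and $u_{1i}$ and $v_{2i}$ are assumed to be bivariate normal, each
with mean zero, and with covariance matrix

$$\begin{pmatrix} 1 & \rho\sigma_2 \\ \rho\sigma_2 & \sigma_2^2 \end{pmatrix}.$$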

This model can be estimated by two methods: the Rivers and Vuong [1988] two-step

procedure and the maximum likelihood estimation (MLE). To compare with the other two

models (models 1 and 3), we only describe MLE in this Appendix.

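Assume all the variables in $x_{1i}$ are also included in $x_{2i}$. Then Evans, Oates, and Schwab
[1992] and Wooldridge [2002] derive the log likelihood function as follows. The joint
distribution of $(y_{1i}, y_{2i})$ conditional on $x_{2i}$ is

$$f(y_{1i}, y_{2i} \mid x_{2i}) = f(y_{1i} \mid y_{2i}, x_{2i}) \, f(y_{2i} \mid x_{2i}).$$

Because $y_{2i} \mid x_{2i} \sim \text{Normal}(x_{2i}\delta_1, \sigma_2^2)$,

$$f(y_{2i} \mid x_{2i}) = \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\left(-\frac{(y_{2i} - x_{2i}\delta_1)^2}{2\sigma_2^2}\right).$$

Because $v_{2i} = y_{2i} - x_{2i}\delta_1$ and $y_{1i} = 1[y_{1i}^* > 0]$,

$$P(y_{1i} = 1 \mid y_{2i}, x_{2i}) = \Phi\left(\frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)(y_{2i} - x_{2i}\delta_1)}{\sqrt{1 - \rho^2}}\right);$$

$$P(y_{1i} = 0 \mid y_{2i}, x_{2i}) = 1 - \Phi\left(\frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)(y_{2i} - x_{2i}\delta_1)}{\sqrt{1 - \rho^2}}\right).$$

Therefore, the log-likelihood function for this probit model with a continuous
endogenous variable is:

$$\ln L = \sum_i \left\{ y_{1i} \ln \Phi(z_i) + (1 - y_{1i}) \ln\left[1 - \Phi(z_i)\right] - .5 \ln(2\pi\sigma_2^2) - .5 \frac{(y_{2i} - x_{2i}\delta_1)^2}{\sigma_2^2} \right\},$$

where

$$z_i = \frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)(y_{2i} - x_{2i}\delta_1)}{\sqrt{1 - \rho^2}},$$

$$\sum_i \ln f(y_{1i} \mid y_{2i}, x_{2i}) = \sum_i \left\{ y_{1i} \ln \Phi(z_i) + (1 - y_{1i}) \ln\left[1 - \Phi(z_i)\right] \right\},$$

and

$$\sum_i \ln f(y_{2i} \mid x_{2i}) = \sum_i \left\{ -.5 \ln(2\pi\sigma_2^2) - .5 \frac{(y_{2i} - x_{2i}\delta_1)^2}{\sigma_2^2} \right\}.$$

Maximizing the log-likelihood function with respect to all parameters gives the MLEs
of $\beta_1, \beta_2, \delta_1, \sigma_2, \rho$. It is clear that $y_{2i}$ enters the log-likelihood function through the
density functions $f(y_{1i} \mid y_{2i}, x_{2i})$ and $f(y_{2i} \mid x_{2i})$. By taking the partial derivatives of $\ln L$
with respect to the parameters, $y_{2i}$ plays an important role in solving the first-order
conditions.21 Therefore, we need at least one variable in $x_{2i}$ that is not in $x_{1i}$ for the system
of equations to be identified.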
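The role of the continuous endogenous variable in the Model 2 likelihood can be made concrete with a short sketch. The function name, the parameter names (`beta1`, `beta2`, `delta1`, `sigma2`, `rho`), the scalar regressors, and the data values are assumptions for illustration only; the point is that the observed value of the second variable enters both the probit index and the normal density term.

```python
from math import log, pi, sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def loglik_continuous_endog(beta1, beta2, delta1, sigma2, rho, x1, x2, y2, y1):
    """Log likelihood of a probit with a continuous endogenous regressor (|rho| < 1)."""
    total = 0.0
    for a, b, c, y in zip(x1, x2, y2, y1):
        v = c - b * delta1                                   # reduced-form residual v2
        z = (a * beta1 + c * beta2 + (rho / sigma2) * v) / sqrt(1 - rho ** 2)
        total += y * log(PHI(z)) + (1 - y) * log(1 - PHI(z))           # f(y1 | y2, x2)
        total += -0.5 * log(2 * pi * sigma2 ** 2) - 0.5 * v ** 2 / sigma2 ** 2  # f(y2 | x2)
    return total

# Hypothetical data: two observations.
x1 = [0.3, -0.2]
x2 = [1.0, 0.5]
y1 = [1, 0]
ll_a = loglik_continuous_endog(0.5, 0.4, 1.0, 1.0, 0.3, x1, x2, [0.8, 0.1], y1)
ll_b = loglik_continuous_endog(0.5, 0.4, 1.0, 1.0, 0.3, x1, x2, [1.5, -0.6], y1)
print(ll_a, ll_b)  # changing the observed y2 changes the likelihood value
```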

Model 3: Probit with Binary Endogenous Variable

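$$y_{1i} = 1[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0] \qquad (1)$$

$$y_{2i} = 1[x_{2i}\delta_1 + v_{2i} > 0], \qquad (2)$$

where $y_{2i}$ is a binary variable, and $u_{1i}$ and $v_{2i}$ are assumed to be bivariate normal, each with
mean zero, and with covariance matrix

$$\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$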

Wooldridge [2002] suggests using the FIML to estimate such a model, but does not
derive the maximum likelihood function. Greene [2003] shows that the correlation between the
endogenous variable and the disturbance term ($\rho$) is built into the likelihood function.
Specifically, Greene [2003] shows that:

$$\text{Prob}(y_{1i} = 1, y_{2i} = 1 \mid x_{1i}, x_{2i}) = \Phi_2(x_{1i}\beta_1 + \beta_2, \; x_{2i}\delta_1, \; \rho)$$

$$\text{Prob}(y_{1i} = 1, y_{2i} = 0 \mid x_{1i}, x_{2i}) = \Phi_2(x_{1i}\beta_1, \; -x_{2i}\delta_1, \; -\rho)$$

$$\text{Prob}(y_{1i} = 0, y_{2i} = 1 \mid x_{1i}, x_{2i}) = \Phi_2(-(x_{1i}\beta_1 + \beta_2), \; x_{2i}\delta_1, \; -\rho)$$

$$\text{Prob}(y_{1i} = 0, y_{2i} = 0 \mid x_{1i}, x_{2i}) = \Phi_2(-x_{1i}\beta_1, \; -x_{2i}\delta_1, \; \rho)$$

where $\Phi_2$ is the bivariate normal cumulative distribution function. Greene [1998] proved that
the log-likelihood function for probit with binary endogenous variable is exactly the same as
the log-likelihood function for a bivariate-normal regression.22

21 $y_{2i}$ shows up in the first-order conditions through the partial derivative of the cumulative normal function
$\Phi(z)$ in the first and second terms, and the partial derivative of the fourth term in the log likelihood function.

In doing so, Greene [1998, p. 295] states

that “the counterintuitive result is that in the bivariate probit model, unlike in the linear

simultaneous equations model, if the two dependent variables are jointly determined, we just put

each on the right-hand side of the other equation [in our case, one of them] and proceed as if

there were no simultaneity problem.” Greene [2003, p. 716] further states that “we can ignore the

simultaneity in this model and we cannot in the linear regression model because, in this instance,

we are maximizing the log likelihood, whereas in the linear regression case, we are manipulating

certain sample moments that do not converge to the necessary population parameters in the

presence of simultaneity.” The log-likelihood function for this probit model with a binary

endogenous variable is the same as the bivariate probit and can be written as:

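$$\begin{aligned}
\ln L = \sum_i \Big\{ & 1[y_{1i} = 1, y_{2i} = 1] \ln \Phi_2(x_{1i}\beta_1 + \beta_2, \; x_{2i}\delta_1, \; \rho) \\
+ \; & 1[y_{1i} = 1, y_{2i} = 0] \ln \Phi_2(x_{1i}\beta_1, \; -x_{2i}\delta_1, \; -\rho) \\
+ \; & 1[y_{1i} = 0, y_{2i} = 1] \ln \Phi_2(-(x_{1i}\beta_1 + \beta_2), \; x_{2i}\delta_1, \; -\rho) \\
+ \; & 1[y_{1i} = 0, y_{2i} = 0] \ln \Phi_2(-x_{1i}\beta_1, \; -x_{2i}\delta_1, \; \rho) \Big\}
\end{aligned}$$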

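where $1[\cdot]$ is the indicator function. Because of the binary nature of $y_{2i}$, it does not appear
in the joint density function. Further, when we expand the summation, $y_{2i}$ does not appear in the
log likelihood function. Assume the data for $i$ observations read as $y_{11} = 1, y_{12} = 1$;
$y_{21} = 1, y_{22} = 0$; $y_{31} = 0, y_{32} = 1$; $y_{41} = 0, y_{42} = 0$; ...; $y_{i1} = 1, y_{i2} = 1$, where the first
subscript here indexes the observation and the second the equation. Then the log-likelihood can
be written as:

$$\begin{aligned}
\ln L = \; & \ln \Phi_2(x_{11}\beta_1 + \beta_2, \; x_{12}\delta_1, \; \rho) + \ln \Phi_2(x_{21}\beta_1, \; -x_{22}\delta_1, \; -\rho) + \ln \Phi_2(-(x_{31}\beta_1 + \beta_2), \; x_{32}\delta_1, \; -\rho) \\
& + \ln \Phi_2(-x_{41}\beta_1, \; -x_{42}\delta_1, \; \rho) + \dots + \ln \Phi_2(x_{i1}\beta_1 + \beta_2, \; x_{i2}\delta_1, \; \rho).
\end{aligned}$$

22 A proof of this result was suggested in Maddala [1983, p. 123] and pursued in Greene [1998].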

When we take the partial derivatives of this log likelihood function with respect to the
parameters, $y_{2i}$ does not play any role in the first-order conditions. Therefore, we do not need
an exclusion restriction (i.e., at least one variable in $x_{2i}$ that is not in $x_{1i}$) for the system
of equations to be identified.
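The four-cell structure of the Model 3 likelihood can be sketched in a few lines. Everything here is illustrative rather than the authors' implementation: the bivariate normal distribution function is approximated by Simpson's rule on a truncated range (an assumption made to keep the example dependency-free), the regressors are scalars, and the parameter values and data are hypothetical. The observed pair only selects which of the four cells contributes to the sum.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal

def bvn_cdf(a, b, rho, steps=2000):
    """Approximate P(U <= a, V <= b) for standard normals with correlation rho, |rho| < 1.

    Uses P(V <= b | U = x) = Phi((b - rho*x) / sqrt(1 - rho^2)) and Simpson's rule
    on [-8, a]; the mass below -8 is negligible.
    """
    lo = -8.0
    if a <= lo:
        return 0.0
    s = sqrt(1.0 - rho * rho)
    f = lambda x: N.pdf(x) * N.cdf((b - rho * x) / s)
    h = (a - lo) / steps
    total = f(lo) + f(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def cell_prob(y1, y2, x1, x2, beta1, beta2, delta1, rho):
    """Probability of the observed (y1, y2) cell in the recursive bivariate probit."""
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1          # +1 for outcome 1, -1 for outcome 0
    w1 = q1 * (x1 * beta1 + y2 * beta2)      # signed index of equation (1)
    w2 = q2 * (x2 * delta1)                  # signed index of equation (2)
    return bvn_cdf(w1, w2, q1 * q2 * rho)

def biprobit_loglik(params, data):
    beta1, beta2, delta1, rho = params
    return sum(log(cell_prob(y1, y2, x1, x2, beta1, beta2, delta1, rho))
               for y1, y2, x1, x2 in data)

# Hypothetical parameters and data: one observation per cell, as (y1, y2, x1, x2).
params = (0.4, -0.3, 0.6, 0.5)
data = [(1, 1, 0.2, 1.0), (1, 0, -0.5, 0.3), (0, 1, 1.1, -0.4), (0, 0, 0.0, 0.8)]
ll = biprobit_loglik(params, data)
print(ll)
```

For fixed regressors the four cell probabilities sum to one, which is a convenient check that the signs of the indices and of the correlation have been assigned consistently.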


Appendix B

Rivers-Vuong Endogeneity Test for

Binary Endogenous Variable in a Probit Regression

When the dependent variable is continuous, one usually performs the Durbin-Wu-

Hausman test for endogeneity. As a first step in the Durbin-Wu-Hausman test, one typically

regresses the endogenous variable on all exogenous variables in the system and obtains the

residual. The residual is then included as an additional regressor in the original OLS regression. If

the coefficient on the residual is statistically significant, then exogeneity is rejected.
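The regression-based version of the Durbin-Wu-Hausman test described above can be sketched as follows. The data are simulated under an assumed design (instrument `z`, endogenous regressor `y2`, outcome `y1`, and a built-in correlation between the two disturbances), and the small `ols` helper is written for this example only; none of it comes from the paper's data.

```python
import random
from math import sqrt

def ols(y, X):
    """Return (coefficients, standard errors) for y = X b + e; X is a list of rows."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | I]; Gauss-Jordan elimination turns it into [I | (X'X)^-1].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] + [float(i == j) for j in range(k)]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    return b, [sqrt(s2 * inv[i][i]) for i in range(k)]

# Simulated data with endogeneity built in: u is correlated with v (assumed design).
random.seed(0)
n = 500
z = [random.gauss(0, 1) for _ in range(n)]             # exogenous instrument
v = [random.gauss(0, 1) for _ in range(n)]
u = [0.8 * vi + random.gauss(0, 0.6) for vi in v]
y2 = [zi + vi for zi, vi in zip(z, v)]                 # endogenous regressor
y1 = [1 + 0.5 * a + ui for a, ui in zip(y2, u)]        # outcome

# Step 1: regress the endogenous variable on the exogenous variables; keep the residual.
b1, _ = ols(y2, [[1.0, zi] for zi in z])
v_hat = [a - b1[0] - b1[1] * zi for a, zi in zip(y2, z)]

# Step 2: add the residual to the original regression and test its coefficient.
b2, se2 = ols(y1, [[1.0, a, vh] for a, vh in zip(y2, v_hat)])
t_resid = b2[2] / se2[2]
print("coefficient on residual:", b2[2], "t statistic:", t_resid)
```

A t statistic that is large in absolute value on the first-stage residual leads to rejection of exogeneity, which is the expected outcome under this simulated design.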

Yet, in a probit model with binary endogenous variable, the Durbin-Wu-Hausman test is

not applicable mainly because the usual probit standard errors and test statistics are not strictly

valid. To test for exogeneity in simultaneous equation models with limited dependent variables,

Smith and Blundell [1986] developed a test for tobit models and Rivers and Vuong [1988]

developed a test for the probit model. Wooldridge [2002]23 recommends using the Rivers-Vuong
approach in testing for an endogenous binary variable in a probit model. The model is set up as

follows:

$$y_1 = 1[x_1\beta_1 + \alpha_1 y_2 + u_1 > 0] \qquad (1)$$

$$y_2 = 1[x_1\delta_{21} + x_2\delta_{22} + v_2 > 0], \qquad (2)$$

where $y_1$ is the binary dependent variable, $x_1$ is a set of exogenous variables, $y_2$ is the
potential endogenous binary variable, and $u_1$ is the disturbance term. The variables $x_2$ are
additional independent variables in equation (2) that are not included in equation (1). The
endogeneity in the model arises from the correlation of $y_2$ with $u_1$. Rivers and Vuong [1988]
assume that $(u_1, v_2)$, the disturbance terms in equations (1) and (2), are independent of $x_1$
and $x_2$ and distributed as bivariate normal with zero mean, unit variance, and
$\rho = Corr(u_1, v_2)$. If $\rho \neq 0$, then $u_1$ and $y_2$ are correlated, and probit estimation of
equation (1) is inconsistent for both $\beta_1$ and $\alpha_1$.

23 Wooldridge [2002, p. 478].

Showing that $u_1 = \theta_1 v_2 + e_1$ under joint normality of $(u_1, v_2)$, with $Var(u_1) = 1$,
Rivers and Vuong [1988] develop a simple two-step approach to test the endogeneity of $y_2$. The
two steps are similar to the Durbin-Wu-Hausman test. The first step involves estimating equation
(2) to get the residuals $\hat{v}_2$. In the second step, estimate the probit of $y_1$ on $x_1$, $y_2$,
and the residual $\hat{v}_2$.

One feature of Rivers and Vuong [1988] is that the usual probit t statistic on $\hat{v}_2$ is a
valid test of the null hypothesis that $y_2$ is exogenous, i.e., $H_0: \theta_1 = 0$. Yet, if
$\theta_1 \neq 0$, the usual probit standard errors and test statistics are not strictly valid and the
parameters are estimated only up to scale. The asymptotic variance of the estimated probit
parameters needs to be adjusted to account for the first stage estimation. The scaled probit
coefficients need to be divided by a factor, $(\hat{\theta}_{\rho 1}^2 \hat{\tau}_2^2 + 1)^{0.5}$, where
$\theta_{\rho 1} = \theta_1 / (1 - \rho^2)^{1/2}$ and $\tau_2^2 = Var(v_2)$.
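A small arithmetic sketch of the rescaling step may help. It assumes the factor takes the form $(\hat{\theta}_{\rho 1}^2 \hat{\tau}_2^2 + 1)^{0.5}$, with $\hat{\theta}_{\rho 1}$ the second-step coefficient on the residual and $\hat{\tau}_2^2$ the estimated residual variance; the function name and all numerical values are hypothetical.

```python
from math import sqrt

def rescale(scaled_coefs, theta_rho1_hat, tau2_sq_hat):
    """Divide scaled second-step probit coefficients by (theta^2 * tau^2 + 1)^0.5."""
    factor = sqrt(theta_rho1_hat ** 2 * tau2_sq_hat + 1.0)
    return [c / factor for c in scaled_coefs], factor

# Hypothetical second-step estimates: coefficient on the residual 0.75, residual variance 1.
coefs, factor = rescale([0.5, -1.0], theta_rho1_hat=0.75, tau2_sq_hat=1.0)
print(factor, coefs)  # factor is sqrt(0.5625 + 1) = 1.25
```

When the coefficient on the residual is zero the factor equals one, so no rescaling is needed under the null of exogeneity.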


Table I: Definition of Variables

Variable Definition Source Regression

Dependent variables:

REJECT Dummy set to unity if the loan is rejected, zero otherwise HMDA REJECT

PCTREJECT Percentage of loans rejected in the Census tract HMDA PCTREJECT

Race:

AFRICAN Dummy set to unity if borrower is African-American, zero if White HMDA REJECT

MINORITY Dummy set to unity if the borrower is African-American or Hispanic, zero if White HMDA REJECT

PCTAFRICAN Percentage of applicants African-American HMDA PCTREJECT

PCTMINORITY Percentage of applicants African-American or Hispanic HMDA PCTREJECT

Control variables:

Borrower risk characteristics

INCOME Log of borrower income (in thousands, use median in PCTREJECT regression) HMDA REJECT and PCTREJECT

MEDFICO Tract median FICO score Credit Bureau REJECT and PCTREJECT

DTI Average non-mortgage debt/average income in the Tract Credit Bureau and HMDA REJECT and PCTREJECT

PCTCOLLEGE Percentage of Tract population 25+ age with a Bachelor's Degree Census REJECT and PCTREJECT

UNEMPLOY Unemployed civilian/(Unemployed civilian + Employed civilian) Census REJECT and PCTREJECT

Loan risk characteristics

LTV Median loan amount/(Median house value * House price appreciation rate); this variable is split into three dummy variables: LTV80_90 (80%<LTV<=90%), LTV90_100 (90%<LTV<=100%) and LTV100 (LTV>100%) Census, HMDA and FHFA REJECT and PCTREJECT

CONVENTION Dummy set to unity if the loan is a conventional loan, zero otherwise HMDA REJECT

AMOUNT Loan amount HMDA REJECT


Property risk characteristics:

MEDAGE Median age of residential property in the Tract Census REJECT and PCTREJECT

PCTRENT Renter-occupied housing units/total housing units Census REJECT and PCTREJECT

HVCHG House price appreciation rate FHFA REJECT and PCTREJECT

NEWOWN Percentage change in owner occupants between 1990 and 2000 Census REJECT and PCTREJECT

OWNEROCC Dummy set to unity if the property is owner-occupied, zero otherwise HMDA REJECT and PCTREJECT

Lender Characteristics:

BANK Set to unity if the lender is a commercial bank, savings, or thrift institution, and zero otherwise HMDA REJECT

HERF Sum of squared market shares of lenders in the tract HMDA REJECT

Macroeconomic Variables:

PRIMERATE Prime rate St. Louis Fed REJECT and PCTREJECT

TERM Yield spread between the seven-year Treasury note and the three-month Treasury bill St. Louis Fed REJECT and PCTREJECT

DEFAULT Difference between the yield of a Baa bond and a Aaa bond St. Louis Fed REJECT and PCTREJECT

Instrumental variable:

BLACKCHURCH Number of African American church members in the county U.S. Religious Landscape Survey and RCMS REJECT and PCTREJECT

This table shows the definitions of variables in loan-level accept/reject regression, loan price regression and the Census tract level PCTREJECT regression. The first column

shows the name of the variable, the second column gives the definitions of the variables, the third column shows the data sources for the variables and the fourth column shows

which regression the variables are used in. Regression REJECT corresponds to equation (1) and (2), and regression PCTREJECT corresponds to equation (3) and (4).


Table II: Descriptive Statistics

All Loans Prime Loans Subprime Loans

Number Mean Median Std. Dev. Number Mean Median Std. Dev. Number Mean Median Std. Dev.

Dependent variables:

REJECT 2,026,556 0.26 0.00 0.44 1,714,003 0.23 0.00 0.42 312,553 0.46 0.00 0.50

Race:

AFRICAN 1,784,843 0.12 0.00 0.32 1,521,057 0.10 0.00 0.30 263,786 0.19 0.00 0.39

MINORITY 2,026,556 0.20 0.00 0.40 1,714,003 0.19 0.00 0.39 312,553 0.28 0.00 0.45

Control variables:

Borrower risk characteristics

INCOME 2,026,556 4.51 4.49 0.64 1,714,003 4.54 4.51 0.64 312,553 4.35 4.36 0.61

MEDFICO 2,026,556 707 720 65 1,714,003 712 724 63 312,553 681 695 70

DTI 2,026,556 0.87 0.94 0.35 1,714,003 0.86 0.94 0.35 312,553 0.91 0.97 0.33

PCTCOLLEGE 2,026,556 0.28 0.24 0.16 1,714,003 0.29 0.25 0.17 312,553 0.22 0.19 0.14

UNEMPLOY 2,026,556 0.06 0.05 0.04 1,714,003 0.06 0.04 0.04 312,553 0.07 0.06 0.05


LTV 2,026,556 0.76 0.74 0.20 1,714,003 0.76 0.73 0.20 312,553 0.79 0.76 0.18

CONVENTION 2,026,556 0.96 1.00 0.21 1,714,003 0.95 1.00 0.22 312,553 0.99 1.00 0.12

AMOUNT 2,026,556 220.73 192.99 330.59 1,714,003 221.49 190.00 352.56 312,553 216.34 197.00 150.79

Property risk characteristics

MEDAGE 2,026,556 31.95 33.00 12.68 1,714,003 31.57 32.00 12.69 312,553 34.10 35.00 12.36

PCTRENT 2,026,556 0.29 0.22 0.22 1,714,003 0.28 0.21 0.21 312,553 0.34 0.30 0.22

HVCHG 2,026,556 0.11 0.07 0.26 1,714,003 0.11 0.07 0.25 312,553 0.10 0.06 0.28

NEWOWN 2,026,556 0.15 0.05 0.42 1,714,003 0.16 0.05 0.41 312,553 0.11 0.03 0.45

OWNEROCC 2,026,556 0.92 1.00 0.27 1,714,003 0.91 1.00 0.28 312,553 0.93 1.00 0.25

Lender Characteristics

BANK 2,026,556 0.34 0.00 0.47 1,714,003 0.29 0.00 0.45 312,553 0.66 1.00 0.47

HERF 2,026,556 0.03 0.03 0.01 1,714,003 0.03 0.03 0.01 312,553 0.03 0.03 0.01


Table II (continued)

All Loans Prime Loans Subprime Loans

Number Mean Median Std. Dev. Number Mean Median Std. Dev. Number Mean Median Std. Dev.

Macroeconomic Variables

PRIMERATE 2,026,556 4.45 4.34 0.63 1,714,003 4.45 4.34 0.64 312,553 4.43 4.34 0.60

TERM 2,026,556 2.46 2.50 0.27 1,714,003 2.46 2.50 0.27 312,553 2.46 2.50 0.26

DEFAULT 2,026,556 0.79 0.77 0.10 1,714,003 0.79 0.77 0.11 312,553 0.78 0.77 0.09

Instrumental variable for race:

BLACKCHURCH 2,026,556 24,435 22,846 18,179 1,714,003 24,005 22,846 17,761 312,553 26,917 23,253 20,246

This table shows summary descriptive statistics of the variables employed in the loan-level regression for a sample of 250,000 mortgage loans in New Jersey from 2000 to 2008 HMDA

data. The sample is broken down into prime loans (217,886) and subprime loans (32,114). All variables are defined in Table I.


Table III: Single-Equation Probit Regressions at the Loan-Level -- Race Uncorrelated with Disturbance Term

Prime Lending Subprime Lending

(1) (2) (3) (4)

African-American Minority African-American Minority

RACE 0.079*** 0.020*** -0.036*** -0.002

(64.87) (22.58) (-13.30) (-1.02)


INCOME -0.065*** -0.071*** -0.112*** -0.118***

(-92.69) (-102.98) (-55.90) (-63.33)

MEDFICO -0.035*** -0.048*** -0.017*** -0.009***

(-36.00) (-51.73) (-6.66) (-3.82)

DTI 0.051*** 0.048*** 0.053*** 0.050***

(40.80) (39.14) (14.61) (14.55)

PCTCOLLEGE -0.059*** -0.050*** 0.035*** 0.039***

(-19.69) (-17.02) (3.47) (4.18)

UNEMPLOY 0.137*** 0.145*** 0.154*** 0.125***

(11.61) (13.02) (4.66) (4.18)


LTV80_90 0.007*** 0.007*** -0.004 -0.005*

(7.35) (8.12) (-1.62) (-1.95)

LTV90_100 -0.002 -0.002 0.013*** 0.009**

(-1.32) (-1.33) (2.92) (2.41)

LTV100 -0.004** -0.006*** -0.020*** -0.010**

(-2.03) (-3.29) (-3.66) (-2.03)

CONVENTION 0.011*** 0.012*** -0.249*** -0.258***

(6.31) (7.44) (-30.93) (-33.65)

AMOUNT 0.020*** 0.022*** 0.300*** 0.301***

(9.62) (9.74) (36.93) (40.69)


MEDAGE 0.000*** 0.000*** -0.000*** -0.000***

(9.27) (14.05) (-3.30) (-5.07)

PCTRENT 0.039*** 0.047*** 0.025*** 0.008

(16.91) (21.62) (3.63) (1.23)

HVCHG 0.006*** 0.004** -0.004 -0.000

(3.92) (2.48) (-0.78) (-0.11)

NEWOWN -0.011*** -0.010*** 0.001 0.002

(-11.07) (-10.94) (0.23) (0.63)

OWNEROCC -0.030*** -0.031*** -0.021*** -0.043***

(-22.70) (-24.39) (-5.13) (-11.52)


BANK -0.099*** -0.110*** 0.043*** 0.032***

(-130.82) (-150.92) (20.26) (16.71)

HERF -0.156*** -0.176*** 0.734*** 0.761***

(-4.71) (-5.58) (6.93) (7.83)


Table III (continued)


(1) (2) (3) (4)



PRIMERATE -0.036*** -0.035*** -0.153*** -0.151***

(-12.50) (-12.20) (-15.89) (-16.51)

TERM -0.086*** -0.081*** -0.326*** -0.325***

(-12.73) (-12.29) (-14.64) (-15.30)

DEFAULT -0.036*** -0.035*** 0.043*** 0.032***

(-12.50) (-12.20) (20.26) (16.71)

Number of Observations 1,521,057 1,714,003 263,786 312,553

Percent correctly predicted 88.25 87.54 65.13 65.14

Log-likelihood value -69856.52 -77009.33 -18486.13 -20370.35

Pseudo R-squared 0.055 0.061 0.032 0.030

This table shows probit regressions of REJECT on RACE and all control variables defined in Table I, where RACE is treated as uncorrelated with the disturbance. The dependent variable is REJECT, a dummy that equals one when the loan application is rejected, and zero otherwise. We estimate the model for two different samples, the prime loan sample and the subprime loan sample, and define RACE as a dummy variable set to one if the borrower is African-American or Minority (African-American or Hispanic), and zero if the borrower is white. This results in four specifications. Marginal effects are reported with robust t-statistics given in parentheses. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.
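For a dummy regressor such as RACE, a probit marginal effect is commonly computed as the change in the predicted rejection probability when the dummy moves from zero to one with the other regressors held fixed. The sketch below illustrates that calculation only; whether the table uses exactly this convention is an assumption, and the function names, the index value, and the coefficient are hypothetical rather than estimates from this paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dummy_marginal_effect(index_without_dummy, dummy_coef):
    """Change in P(REJECT = 1) when a 0/1 regressor switches from 0 to 1.

    index_without_dummy: linear index x'b of the other regressors (hypothetical).
    dummy_coef: probit coefficient on the dummy (hypothetical).
    """
    p_dummy_off = normal_cdf(index_without_dummy)
    p_dummy_on = normal_cdf(index_without_dummy + dummy_coef)
    return p_dummy_on - p_dummy_off


# Hypothetical inputs chosen only for illustration.
effect = dummy_marginal_effect(index_without_dummy=-0.75, dummy_coef=0.30)
print(round(effect, 3))
```

A positive coefficient yields a positive effect on the rejection probability and a negative coefficient a negative one, which is how the signs of the RACE rows above are read.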


Table IV: Evidence of Correlated Race and Disturbance Term in Single-Equation Probit Loan-Level Regressions of Table III

Panel A: Different Risk Characteristics

African-American vs. White Minority vs. White

Variable White African-American t-statistic p-value White Minority t-statistic p-value

INCOME 4.450 4.187 55.156 0.000*** 4.450 4.195 75.437 0.000***

MEDFICO 735.089 665.870 120.360 0.000*** 735.089 676.487 153.774 0.000***

DTI 18.544 30.083 -29.141 0.000*** 18.544 25.517 -26.708 0.000***

PCTCOLLEGE 0.328 0.236 73.371 0.000*** 0.328 0.227 115.641 0.000***

UNEMPLOY 0.046 0.067 -61.763 0.000*** 0.046 0.067 -91.569 0.000***

LTV 0.754 0.781 -8.811 0.000*** 0.754 0.782 -11.827 0.000***

CONVENTION 0.963 0.887 29.465 0.000*** 0.963 0.889 41.466 0.000***

AMOUNT 192.271 138.396 24.376 0.000*** 192.271 149.562 17.536 0.000***

OWNEROCC 0.924 0.937 -6.653 0.000*** 0.924 0.943 -13.809 0.000***

MEDAGE 29.362 33.551 -39.335 0.000*** 29.362 35.345 -80.156 0.000***

PCTRENT 0.214 0.326 -66.103 0.000*** 0.214 0.359 -116.885 0.000***

HVCHG 0.109 0.075 18.382 0.000*** 0.109 0.076 25.521 0.000***

NEWOWN 0.186 0.117 18.896 0.000*** 0.186 0.095 41.274 0.000***

PCTBOARD 0.167 0.381 -26.729 0.000*** 0.167 0.310 -30.065 0.000***

Panel B: Formal Tests

African-American Minority

Prime Subprime Prime Subprime

test statistic p-value test statistic p-value test statistic p-value test statistic p-value

Rivers-Vuong Test (t statistics) 2.84 0.005*** 1.72 0.086* 2.01 0.044** 2.33 0.020**

ρ (a) 0.184 0.166 0.158 0.252

Likelihood Ratio Test of ρ=0 (Chi-square statistics) 32.69 0.000*** 6.45 0.011** 22.08 0.000*** 11.21 0.001***

Panel A shows different borrower, loan, and property risk characteristics for different racial groups (African-American vs. white, Hispanic vs. white and Minority vs. white). Mean values for each control variable are reported by racial group, and the t-statistics test the difference in means between the two groups. Panel B shows two formal tests of whether race is correlated with the disturbance term in the accept/reject equation: the Rivers-Vuong t-test and the likelihood ratio test of ρ=0, where ρ is the correlation coefficient between the disturbances of the accept/reject equation and the race equation. The null hypothesis that race is exogenous (uncorrelated with the disturbance) in the loan-level regression is rejected when the test statistics exceed the critical values. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.

(a) ρ is the correlation coefficient of the disturbance terms in the loan accept/reject equation and the race equation.
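The p-values in Panel B can be checked against their test statistics with a few lines of arithmetic. The sketch below assumes that the likelihood ratio statistic has one degree of freedom (a single restriction on the correlation coefficient) and that the Rivers-Vuong t-statistic is compared with the standard normal, which is reasonable at these sample sizes; both assumptions, and the function names, are ours rather than stated in the table.

```python
import math


def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def two_sided_normal_pvalue(t_stat):
    """Two-sided p-value of a statistic under the standard normal."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# Test statistics copied from Panel B (African-American prime, African-American
# subprime, Minority prime, Minority subprime).
lr_stats = [32.69, 6.45, 22.08, 11.21]
rv_stats = [2.84, 1.72, 2.01, 2.33]

for stat in lr_stats:
    print("LR", stat, round(chi2_pvalue_1df(stat), 3))
for stat in rv_stats:
    print("RV", stat, round(two_sided_normal_pvalue(stat), 3))
```

Under these assumptions the computed values agree with the reported p-values up to rounding of the published statistics.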


Table V: Full-Information Maximum Likelihood (FIML) Loan-Level Regressions: Race Correlated with Disturbance Term


(1) (2) (3) (4)


RACE 0.015*** 0.047*** -0.140*** -0.030*

(3.35) (8.93) (-9.42) (-1.94)

Borrower risk characteristics

INCOME -0.068*** -0.073*** -0.111*** -0.118***

(-112.58) (-125.02) (-58.90) (-67.03)

MEDFICO -0.060*** -0.050*** -0.055*** -0.025***

(-38.69) (-32.96) (-11.26) (-5.10)

DTI -0.000 -0.001 -0.001 -0.003

(-0.09) (-0.34) (-0.13) (-0.67)

PCTCOLLEGE -0.042*** -0.057*** 0.079*** 0.043***

(-12.85) (-18.85) (6.57) (4.12)

UNEMPLOY 0.109*** 0.079*** 0.135*** 0.069**

(8.82) (6.83) (3.97) (2.27)


LTV80_90 0.004*** 0.004*** -0.007*** -0.007***

(3.75) (4.76) (-2.58) (-3.11)

LTV90_100 -0.008*** -0.004*** 0.007 0.003

(-5.06) (-2.61) (1.62) (0.85)

LTV100 -0.004** -0.004** -0.020*** -0.010**

(-2.05) (-2.23) (-3.71) (-2.17)

CONVENTION 0.007*** 0.013*** -0.247** -0.258***

(4.21) (8.21) (-29.72) (-32.60)

AMOUNT 0.020*** 0.021*** 0.286*** 0.294***

(27.58) (29.00) (35.82) (40.39)

Property risk characteristics

MEDAGE 0.000*** 0.000*** -0.001** -0.000***

(10.03) (11.29) (-2.46) (-4.24)

PCTRENT 0.009*** 0.021*** -0.007 -0.015**

(3.85) (9.07) (-0.99) (-2.25)

HVCHG 0.003** 0.004*** -0.106** -0.004

(2.02) (2.99) (-2.21) (-0.98)

NEWOWN -0.007*** -0.007*** 0.005* 0.006**

(-7.63) (-8.04) (1.94) (2.31)

OWNEROCC -0.023*** -0.028*** -0.018*** -0.039***

(-17.97) (-22.18) (-4.45) (-10.52)


BANK -0.094*** -0.105*** 0.041*** 0.030***

(-120.11) (-139.46) (19.41) (15.91)

HERF -0.139*** -0.163*** 0.696*** 0.732***

(-4.26) (-5.19) (6.58) (7.51)


Table V (continued)


(1) (2) (3) (4)



PRIMERATE 0.030*** 0.027*** -0.091*** -0.089***

(8.42) (7.72) (-8.39) (-8.70)

TERM 0.104*** 0.094*** -0.147*** -0.148***

(11.74) (10.83) (-5.55) (-5.89)

DEFAULT -0.243*** -0.223*** -0.254*** -0.252***

(-34.67) (-32.80) (-12.42) (-13.26)

Number of Observations 1,521,057 1,714,003 263,786 312,553

Percent correctly predicted 84.05 87.53 62.03 64.40

Log-likelihood value -105006.91 -1565193.10 -288866.37 -367757.39

This table shows Full-Information Maximum Likelihood (FIML) estimation of REJECT on control variables defined in Table

I and RACE, where RACE is treated as correlated with the disturbance. The correlation of the RACE with the disturbance term

is built into the likelihood function. The dependent variable, REJECT, is a dummy that equals one when the loan application is

rejected, and zero otherwise. We estimate the model for two different samples: the prime loan sample and the subprime loan

sample and define RACE as a dummy variable set to one if borrower is African-American, or Minority (African-American or

Hispanic), and zero if the borrower is white. This approach results in four specifications. Marginal effects are reported with robust t-statistics given in parentheses. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.


Table VI: OLS Neighborhood-Level Regressions – Race Uncorrelated with Disturbance Terms

Panel A: OLS

Prime Subprime

(1) (2) (3) (4)

PCTAFRICAN PCTMINORITY PCTAFRICAN PCTMINORITY

PCTRACE 0.079*** 0.116*** -0.071*** -0.164***

(7.02) (5.78) (-3.48) (-4.28)


INCOME -0.027*** -0.024*** -0.042*** -0.044***

(-8.28) (-7.66) (-8.74) (-9.55)

MEDFICO -0.043*** -0.034*** -0.073*** -0.099***

(-11.18) (-5.54) (-10.74) (-8.74)

DTI -0.004 -0.004 -0.002 0.001

(-0.68) (-0.65) (-0.34) (0.15)

PCTCOLLEGE -0.113*** -0.110*** -0.045*** -0.037***

(-17.53) (-16.14) (-3.83) (-3.02)

UNEMPLOY 0.028 -0.028 -0.033 0.069

(1.02) (-0.75) (-0.70) (1.09)


LTV80_90 0.000 -0.004*** -0.025*** -0.020***

(0.02) (-2.89) (-9.45) (-7.86)

LTV90_100 -0.004* -0.005* -0.010** -0.006

(-1.65) (-1.89) (-2.37) (-1.34)

LTV100 -0.005 -0.003 0.025*** 0.022***

(-1.61) (-0.99) (4.72) (4.43)


MEDAGE 0.000*** 0.000*** -0.000*** -0.000***

(7.05) (4.26) (-5.35) (-3.72)

PCTRENT 0.030*** -0.006 -0.065*** -0.018

(6.25) (-1.02) (-8.16) (-1.57)

HVCHG 0.021*** 0.020*** 0.010** 0.013***

(6.01) (5.53) (2.19) (3.02)

NEWOWN -0.009*** -0.007*** -0.000 -0.002

(-5.83) (-4.70) (-0.14) (-0.85)

Macroeconomic variables

PRIMERATE 0.031*** 0.037*** 0.037*** 0.027***

(9.69) (11.26) (7.49) (4.91)

TERM 0.063*** 0.074*** 0.055*** 0.033**

(6.55) (7.72) (4.11) (2.31)

DEFAULT -0.114*** -0.110*** -0.007 -0.004

(-7.58) (-7.44) (-0.41) (-0.23)

INTERCEPT 0.589*** 0.433*** 1.189*** 1.495***

(10.77) (6.21) (14.12) (12.50)

Observations 14,393 14,393 14,381 14,381

R-squared 0.649 0.655 0.182 0.184


Table VI (continued)

Panel B: Durbin-Wu-Hausman Endogeneity Test of PCTRACE

Prime Subprime

(1) (2) (3) (4)


Chi-sq statistics 6.381** 14.917*** 3.127* 9.290***

p-value (0.011) (0.000) (0.077) (0.002)

This table shows OLS regression of PCTREJECTION on some of the neighborhood control variables defined in Table I, and

PCTRACE, where PCTRACE is treated as uncorrelated with the disturbance. The dependent variable PCTREJECTION is

defined as number of loan rejections divided by number of loan applications in a Census tract. We estimate the model for two

different samples: the prime loan sample and the subprime loan sample and define PCTRACE as percentage of Census tract

applicants African-American or minority (African-American + Hispanic). This approach results in four specifications. We report robust t-statistics in parentheses and use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.
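The neighborhood-level variables are ratios built from loan-level records: PCTREJECTION is rejections over applications within a Census tract, and PCTRACE is the share of applicants in the racial group of interest. A minimal sketch of that aggregation follows; the record layout, the tract identifiers, and the function name are hypothetical and only illustrate the definitions in the note above.

```python
from collections import defaultdict


def tract_rates(loans):
    """Aggregate loan-level records to tract-level PCTREJECTION and PCTRACE.

    loans: iterable of (tract_id, rejected, in_group) tuples, where rejected
    and in_group are 0/1 flags. This layout is a hypothetical simplification.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # applications, rejections, group
    for tract_id, rejected, in_group in loans:
        tally = counts[tract_id]
        tally[0] += 1
        tally[1] += rejected
        tally[2] += in_group
    return {
        tract: {"PCTREJECTION": rej / apps, "PCTRACE": group / apps}
        for tract, (apps, rej, group) in counts.items()
    }


# Hypothetical applications in two tracts.
sample = [
    ("T1", 1, 1), ("T1", 0, 0), ("T1", 0, 1), ("T1", 0, 0),
    ("T2", 1, 0), ("T2", 1, 1),
]
rates = tract_rates(sample)
print(rates)
```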


Table VII: 2SLS Neighborhood-Level Regression – Race Correlated with Disturbance

The table shows the 2SLS regression of PCTREJECTION on some of the neighborhood control variables defined in Table I, and PCTRACE, where PCTRACE is treated as correlated with the disturbance. The dependent variable PCTREJECTION is defined as number of loan rejections divided by number of loan applications in a Census tract. We estimate the model for two different samples, the prime loan sample and the subprime loan sample, and define PCTRACE as percentage of Census tract applicants African-American or minority (African-American and Hispanic). This results in four specifications. PCTRACE (PRED) is the predicted value of PCTRACE in the first stage regression. The instrumental variables for PCTRACE in 2SLS are FETALDEATH, FETALSQ, URBANFLAG and INTER. We report robust t-statistics in parentheses and use ***, **, and * to denote significance at the 1, 5, and 10 percent level (two-sided), respectively.

Prime Subprime

(1) (2) (3) (4)


PCTRACE (PRED) -0.027 -0.014 -0.124*** -0.222***

(1.38) (0.66) (-3.76) (-5.55)


INCOME -0.022*** -0.023*** -0.040*** -0.044***

(-9.77) (-11.23) (-9.81) (-11.18)

MEDFICO -0.076*** -0.072*** -0.089*** -0.116***

(-12.39) (-11.13) (-8.61) (-9.75)

DTI -0.006** -0.006** 0.000 0.003

(-2.54) (-2.44) (0.01) (0.59)

PCTCOLLEGE -0.077*** -0.083*** -0.027** -0.025**

(-10.13) (-14.18) (-1.99) (-2.30)

UNEMPLOY 0.144*** 0.132*** 0.015 0.129***

(5.70) (4.43) (0.39) (2.70)

Loan risk characteristics

LTV80_90 -0.002 -0.001 -0.026*** -0.020***

(-1.49) (-0.88) (-9.17) (-7.09)

LTV90_100 -0.003 -0.003 -0.009** -0.004

(-1.21) (-1.28) (-2.14) (-0.88)

LTV100 -0.003 -0.004 0.027*** 0.023***

(-1.40) (-1.64) (5.63) (5.00)


MEDAGE 0.000*** 0.000*** -0.001*** -0.000***

(4.16) (5.31) (-6.11) (-3.69)

PCTRENT 0.014*** 0.021*** -0.073*** -0.005

(3.21) (3.89) (-10.06) (-0.46)

HVCHG 0.021*** 0.021*** 0.010*** 0.014***

(14.05) (14.13) (3.25) (4.73)

NEWOWN -0.009*** -0.009*** -0.001 -0.003

(-8.83) (-8.46) (-0.30) (-1.32)


PRIMERATE 0.031*** 0.030*** 0.036*** 0.023***

(12.19) (11.02) (7.11) (3.99)

TERM 0.064*** 0.063*** 0.050*** 0.023*

(9.98) (9.37) (3.89) (1.65)

DEFAULT -0.117*** -0.117*** -0.001 0.000

(-16.89) (-16.71) (-0.06) (0.02)

INTERCEPT 0.775*** 0.765*** 1.299*** 1.657***

(15.47) (11.56) (13.25) (12.64)

Observations 14,393 14,393 14,381 14,381

R-squared 0.645 0.646 0.162 0.163
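The mechanics behind the PCTRACE (PRED) row can be sketched in a stripped-down setting with one endogenous regressor, one instrument, and no other controls. The first stage regresses the regressor on the instrument, and the second stage regresses the outcome on the first-stage fitted values. The data below are hypothetical and constructed so that the disturbance is correlated with the regressor but not with the instrument; the paper's regressions include the full control set and robust standard errors, which this sketch does not reproduce.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(z, x, y):
    """2SLS with a single instrument z for a single endogenous regressor x."""
    a, b = ols(z, x)                    # first stage: x on z
    x_pred = [a + b * zi for zi in z]   # fitted values, the "(PRED)" regressor
    return ols(x_pred, y)               # second stage: y on fitted x


# Hypothetical data: u is the disturbance, orthogonal to z but part of x.
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
u = [1.0, -2.0, 2.0, -2.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 - 0.5 * xi + ui for xi, ui in zip(x, u)]  # true slope is -0.5

_, ols_slope = ols(x, y)
_, iv_slope = two_stage_least_squares(z, x, y)
print(ols_slope, iv_slope)
```

In this constructed example OLS is pulled away from the true slope by the correlation between the regressor and the disturbance, while the two-stage estimate recovers it.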


Table VIII: Tests of Validity and Strength of Instrumental Variable

Panel A: Differences in Observed Variables for Neighborhoods with Large Number of African-American Church Members (above the median) and Small Number of African-American Church Members (below the median)

Variable Above Below t-statistic p-value

INCOME 4.482 4.541 -0.534 0.592

MEDFICO 7.136 7.007 12.120 0.000***

DTI 0.899 0.841 4.741 0.000***

PCTCOLLEGE 0.263 0.291 -8.553 0.000***

UNEMPLOY 0.055 0.061 -8.907 0.000***

LTV 1.030 1.037 -5.480 0.000***

CONVENTION 0.955 0.957 -1.896 0.072**

AMOUNT 0.211 0.231 -3.634 0.000***

OWNEROCC 0.906 0.929 -3.927 0.000***

MEDAGE 29.002 34.901 -17.326 0.000***

PCTRENT 0.254 0.320 -9.094 0.000***

HVCHG 0.132 0.139 -1.072 0.283

NEWOWN 0.125 0.097 3.067 0.000***

PCTBOARD 0.270 0.304 -5.288 0.000***

Panel B: Test Whether BLACKCHURCH has a Significant Effect in Predominantly White Neighborhoods

(1) (2)

Prime Subprime

BLACKCHURCH 0.012 0.030

(0.854) (1.335)

INCOME 0.025 -0.036

(0.741) (-1.273)

MEDFICO -0.260*** -0.162***

(-10.692) (-5.361)

DTI -0.049 0.007

0.025 -0.036

PCTCOLLEGE -0.360*** -0.225***

(-9.372) (-7.579)

UNEMPLOY -0.023 -0.072***

(-1.247) (-3.050)

LTV80_90 -0.018 -0.093***

(-0.998) (-4.230)

Table VIII (continued)

LTV90_100 -0.058** 0.002

(-1.975) (0.064)

LTV100 0.036 -0.008

(1.242) (-0.318)

MEDAGE -0.076*** -0.041**

(-3.318) (-2.082)

PCTRENT 0.196*** 0.061**

(3.467) (2.262)

HVCHG 0.346*** 0.191***

(5.296) (6.730)

NEWOWN -0.046*** -0.022

(-3.362) (-1.334)

PRIMERATE 0.888*** 0.365*

(4.714) (1.899)

TERM 0.890*** 0.287

(3.730) (1.327)

DEFAULT -0.311*** -0.004

(-3.946) (-0.075)

Observations 3,011 3,012

R-squared 0.486 0.167

Panel C: Upper and Lower Bound Bias of the Coefficients on Race Variables in FIML

Prime Market

African-American Minority

ρ Race Coefficient ρ Race Coefficient

Lower Bound Bias 0.184 -0.022*** 0.166 -0.060***

(-20.89) (-74.64)

Upper Bound Bias 0 0.079*** 0 0.020***

(64.87) (22.58)

Subprime Market

African-American Minority

ρ Race Coefficient ρ Race Coefficient

Lower Bound Bias 0.158 -0.141*** 0.252 -0.164***

(-54.06) (-78.15)

Upper Bound Bias 0 -0.036*** 0 -0.002

(-13.30) (-1.02)


Table VIII (continued)

Panel D: Staiger-Stock (1997) instrument strength test

(1) (2)

PCTAFRICAN PCTMINORITY

F-statistics 38.662*** 26.323***

p-value (0.000) (0.000)

Panel A shows evidence of the validity of the instrument by comparing the differences in the top

(above the median) and bottom (below the median) black church concentrated areas to see

whether BLACKCHURCH is correlated with the observable variables. Panel B shows evidence of

the validity of the instrument by testing the direct effect of the BLACKCHURCH on the

accept/reject decision on a sample of white concentrated census tracts (top quintile). Panel C

shows the upper bound and lower bounds of the biases on the RACE coefficients in Full

Information Maximum Likelihood regression. The upper bound bias is calculated by restricting ρ, the correlation coefficient between the residuals of the accept/reject equation and the race equation, to 0. The lower bound bias is calculated by restricting ρ to be the correlation coefficient

between the race variable and the observable variables. Panel D shows Staiger and Stock (1997)

test of the strength of the instruments. This test is based on an F-test of the joint significance of

the instruments. The critical value for strong instruments is 10.
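The rule of thumb in Panel D compares the first-stage F-statistic on the instruments with a threshold of 10. With a single instrument and no other regressors, that F-statistic is the explained sum of squares divided by the residual mean square of the first-stage regression. The sketch below computes it for two hypothetical data sets; the data, the function name, and the simplification to one instrument without controls are assumptions made for illustration.

```python
def first_stage_f(z, x):
    """F-statistic for the slope in a simple regression of x on instrument z."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = x_bar - slope * z_bar
    rss = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    tss = sum((xi - x_bar) ** 2 for xi in x)
    explained = tss - rss
    return explained / (rss / (n - 2))  # 1 numerator df, n - 2 denominator df


STRONG_INSTRUMENT_THRESHOLD = 10.0  # rule-of-thumb cutoff cited in the note

# Hypothetical instrument and two hypothetical endogenous regressors.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_strong = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]  # tracks the instrument closely
x_weak = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]    # barely related to the instrument

f_strong = first_stage_f(z, x_strong)
f_weak = first_stage_f(z, x_weak)
print(f_strong > STRONG_INSTRUMENT_THRESHOLD, f_weak > STRONG_INSTRUMENT_THRESHOLD)
```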


Table IX: Evidence of Mortgage Lending Discrimination from 2000-2005

Panel A: FIML

Prime Lending Subprime Lending

(1) (2) (3) (4)

RACE 0.041*** 0.038*** -0.164*** -0.019

(5.90) (4.90) (-7.73) (-0.86)

Panel B: 2SLS

Prime Lending Subprime Lending

(1) (2) (3) (4)

RACE -0.014 -0.015 -0.089** -0.137***

(-0.62) (-0.52) (-2.08) (-2.58)

This table shows evidence of lending discrimination in the prime and subprime mortgage market in the

housing boom period from 2000 to 2005, using both the individual loan-level FIML regression and the

neighborhood-level 2SLS regression. Panel A shows the results of the FIML regression, and Panel B shows

the results of the 2SLS regression.


Table X: Changes in Rejection Rates between 1996 and 2008 for subprime

Panel A:

Black Minority Black Minority

DIFF_PCTRACE -0.094** -0.283*** PCTRACE96 -0.052*** -0.062***

(2.57) (11.61) (4.49) (7.65)

INTERCEPT -0.051*** -0.043*** INTERCEPT -0.047*** -0.041***

(39.00) (29.60) (32.23) (27.11)

Observations 1611 1611 Observations 1611 1611

R-squared 0.006 0.139 R-squared 0.015 0.046

Panel B:

Quartiles sorted by PCTMINORITY

Quartile Pctminority Income Growth Employment Growth

1 1.60% 3.19% -2.03%

2 4.22% 4.06% 0.16%

3 9.47% 4.17% 0.24%

4 38.83% 0.57% -2.40%

Difference in income growth between quartile 1 and 4:

Quartile 1 Quartile 4 Difference (1-4) p-value

3.19% 0.57% 2.62% 0.00

Difference in employment growth between quartile 1 and 4:


-2.03% -2.40% 0.37% 0.00

Quartiles sorted by PCTBLACK

Quartile Pctblack Income Growth Employment Growth

1 0.10% 2.25% -2.90%

2 1.31% 4.24% -0.34%

3 4.79% 4.19% 0.98%

4 30.45% 1.29% -2.80%

Difference in income growth between quartile 1 and 4:


2.25% 1.29% 0.96% 0.00

Difference in employment growth between quartile 1 and 4:


-2.90% -2.80% -0.10% 0.00

Panel A of this table shows second-stage regression results for models (5) and (6). In the first stage, the change in rejection rate between 1996 and 2008 is regressed on the change in borrower (without race), loan, property and macro risks between 1996 and 2008. In the second stage, the residual from the first-stage regression is regressed on the change in proportion of minority borrowers between 1996 and 2008 (columns 1 and 2), or the proportion of minority borrowers in 1996 (columns 3 and 4). We report robust t-statistics in parentheses and use ***, **, and * to denote significance at the 1, 5, and 10 percent level (two-sided), respectively. Panel B presents the difference in average income growth rate and employment growth rate from 1996 to 2008 between quartiles sorted by the proportion of minority population in the census tract.
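The two-step procedure described for Panel A can be sketched as follows: regress the change in the rejection rate on the change in risk, keep the residual as the part not explained by risk, and regress that residual on the change in the race share. The example uses a single hypothetical risk control and made-up tract-level changes purely to show the mechanics; the paper's first stage uses the full set of borrower, loan, property, and macro risk changes.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Residuals from a simple OLS regression of y on x."""
    intercept, slope = ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical tract-level changes between the two years.
d_risk = [-1.0, 0.0, 1.0, 0.0]      # change in one risk measure
d_pctrace = [0.0, -1.0, 0.0, 1.0]   # change in minority share
d_reject = [2.0 * r - 0.5 * m for r, m in zip(d_risk, d_pctrace)]

# Stage 1: strip out the part of the rejection-rate change explained by risk.
unexplained = residuals(d_risk, d_reject)

# Stage 2: relate the unexplained part to the change in minority share.
intercept, diff_pctrace_coef = ols(d_pctrace, unexplained)
print(diff_pctrace_coef)
```

A negative second-stage coefficient, as in this constructed example, indicates that rejection rates fell by more than risk alone would predict where the minority share rose.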
