ANALYSIS OF DISCRIMINATION IN PRIME AND SUBPRIME
MORTGAGE MARKETS
by
R. Glenn Hubbard*
Darius Palia
Wei Yu
This Draft: September 2012
Abstract
This paper examines evidence of lending discrimination in prime and subprime mortgage
markets in New Jersey. Existing single-equation studies of race-based discrimination in
mortgage lending assume race is uncorrelated with the disturbance term in the loan denial
regression. We show that race is correlated with both observable and unobservable risk variables,
leading to biased coefficient estimates. To mitigate this problem, we specify a system of
equations and use a full information maximum likelihood (FIML) method and two-stage least
squares (2SLS). As an instrumental variable for race, we use the number of African-American
church members at the county level. Both FIML and 2SLS show that minorities are more likely
to be rejected than whites in the prime market, but less likely to be rejected than whites in the
subprime market, results supportive of the information-based theory of discrimination. We also
find that the reduction in rejection rates to minority neighborhoods from 1996 to 2008 cannot be
fully justified by risk, suggesting a relaxation of lending standards to minority neighborhoods.
Using the methodology of Mian and Sufi [2009], we also find evidence for strong credit supply
effects.
JEL codes: J15, G21
* Corresponding author. Address: Dean and Russell L. Carson Professor of Finance & Economics, Columbia
Business School, 3022 Broadway, Uris Hall 101, New York, NY 10027. Phone: (212)-854-2888. Email address:
[email protected]. We thank Orley Ashenfelter, Ivan Brick, Paul Calem, Markus Brunnermeier, Serdar Dinc,
Alan Krueger, Henry Farber, Alexandre Mas, Atif Mian, and Cecilia Rouse for helpful comments. Part of this
research was conducted when the second author was a visiting professor at Princeton University. All errors remain
our responsibility.
1. Introduction:
It is now widely accepted that the recent credit crisis and Great Recession have as one of
their primary causes the excesses of the U.S. mortgage market (for example, see Blinder [2007,
2009], Stiglitz [2007], Calomiris [2008], and Brunnermeier [2009]). The adverse impact of
declines in the value of residential real estate has been especially acute in the subprime
mortgage markets, in which loans were made to borrowers with lower credit quality and/or short
credit histories. According to Inside Mortgage Finance,1 the volume of subprime mortgage
originations grew from $75 billion in 1994 to a peak of $625 billion in 2006, and then fell to
$93 billion in 2009.
The rise of the subprime mortgage market allows us to test for race-based discrimination
in the residential real estate prime and subprime markets. With home ownership rates at 67.4
percent in 2009,2 housing is an important asset in households’ portfolio holdings, with
consequences for other portfolio assets and asset returns.3 However, there are still significant
differences in homeownership rates among African-Americans (46.2 percent), Hispanics (48.4
percent), and whites (71.4 percent).4 The Fair Housing Act of 1968 prohibits discrimination
based on race, and is actively enforced by the Office of Fair Housing and Equal Opportunity in
the U.S. Department of Housing and Urban Development (HUD). Moreover, the Home
Mortgage Disclosure Act (HMDA) of 1975 and the Community Reinvestment Act (CRA) of
1977 were enacted by Congress5 to monitor lending institutions’ fair lending practices to
minority and low-income borrowers and neighborhoods.
The subprime market has both a beneficial and destructive impact on mortgage borrowers
(see Gramlich [2007] for a good discussion). On the one hand, subprime lending makes credit
accessible to borrowers with blemished credit histories and/or volatile income who do not
qualify for mortgages in the prime lending market. On the other hand, a proportion of subprime
borrowers were very vulnerable to any adverse economic shock due to low verifiable income
stability and savings, and were forced to sell their houses early, often ending in foreclosures.
1Mortgage Market Statistical Annual [2010, volume 1].
2 U.S. Census Bureau 2009 Housing Vacancies and Homeownership Survey.
3 Hubbard [1985], Campbell [2006], Cochrane [2007], among others, propose that real estate is an illiquid/nontraded
asset that has a significant effect on a household’s portfolio choice and asset returns. Empirical evidence of such
effects has been found in Flavin and Yamashita [2002], and Piazzesi, et al. [2007], among others.
4 U.S. Census Bureau 2009 Housing Vacancies and Homeownership Survey.
5 The HMDA was enacted by Congress and is implemented by the Federal Reserve Board's Regulation C. The CRA was
enacted by Congress and is implemented by Regulations 12 CFR parts 25, 228, 345, and 563e.
Some lenders have also been sued for allegedly targeting minority borrowers and minority
neighborhoods for high-cost subprime loans, a practice referred to as “reverse redlining.” For
example, the NAACP has filed a lawsuit in federal court in Los Angeles against 12 mortgage
lenders for steering African-American borrowers into high-cost subprime loans (New York
Times, October 15, 2007). Similarly, the City of Baltimore (National Public Radio January 11,
2008) and state of Illinois (Wall Street Journal July 31, 2009) both sued Wells Fargo bank, for
high-cost subprime mortgage lending to minority borrowers.
Given the importance of real estate and the stated objective of regulators to ensure fair and
equal access to housing, researchers have examined whether African-Americans and Hispanics
are discriminated against by lenders (see, for example, Black, Schweitzer, and Mandell [1978];
King [1980]; and Munnell, et al. [1996], among others). These studies have examined whether
minorities are rejected more often in prime mortgage loans than white applicants (referred to as
the accept/reject decision). We add to this research in five ways.
First, we examine discrimination in both prime and subprime markets. In doing so, we
analyze any differences in discrimination between prime and subprime markets, the latter
being where a large proportion of minority borrowers tend to obtain their mortgages (Scheessele
[2002]; Calem, et al. [2004]; and Mayer and Pence [2008]).
Second, we use two economic theories of discrimination (taste-based, Becker, 1957;
information-based statistical theory, Phelps, 1972, Arrow, 1973) to derive possible hypotheses
for testing (See Section 2.1 for further details). Analyzing the subprime market along with the
prime market allows us to test the differing implications of these theories using actual data. In
order to test these theories, recent studies have used small samples of self-reported data, or
studied a game, or conducted an experiment.6 For example, Levitt (2004) describes two
important caveats in his study of the voting behavior on the television show titled the “Weakest
Link,” namely, that the environment is not a market setting, and individuals who both apply and
are then selected for the game show are not representative of the general population. Individuals
6 Some studies have tested for discrimination in a game setting (for example, Fershtman and Gneezy [2001]; Levitt
[2004]) or in a paired-audit experimental setting (for example, Turner, et al. [2002]; Bertrand and Mullainathan
[2004]). Yinger (1998), Ross and Yinger (2003), and Anderson, Fryer and Holt (2005) provide detailed surveys on
the discrimination literature and Levitt and List (2009) provide an excellent overview of field experiments in
economics. Hausman and Wise (1979), Heckman (1992), Heckman and Smith (1995), and Manski (1996), among
others, have criticized these studies as not being generalizable to large-scale markets, not having true randomness,
having attrition bias, and participants changing their behavior due to their awareness of being measured.
who own real estate are representative of the general population, and lenders are required to
disclose details of every loan application to the Federal Financial Institutions Examination
Council (FFIEC) under HMDA.
Third, previous research on race-based lending discrimination uses a single-equation
model of mortgage acceptance and assumes the race variable to be uncorrelated with the error
term in the accept/reject model. Though problems associated with single-equation tests for
discrimination have been observed in studies by Rachlis and Yezer [1993], Yezer, Phillips and
Trost [1994], among others, these papers have focused on the endogeneity of loan terms,
especially the loan-to-value ratio.7 None of these papers has tested the hypothesis that race is
correlated with observable and unobservable risks.8 If such a hypothesis is not statistically
rejected, the average disturbance term in these single-equation probit models is not conditionally
zero, resulting in inconsistent regression parameter estimates.
Fourth, we ameliorate the bias in the single-equation probit model using two methods.
The first method uses the full information maximum likelihood method (FIML) or bivariate
probit. The advantage of this method is that it is at the individual loan level, the unit at which the
lending officer makes the decision. The disadvantage is that the regression estimates depend
crucially on the normality assumption of the error terms in the two equations. The second
method is two-stage least squares (2SLS) at the neighborhood census tract level. The advantage
of 2SLS is that the regression estimates do not depend crucially on the statistical distribution of
the error terms, but its disadvantage is that it is estimated at the census tract level rather than
at the individual loan level.
We define an instrumental variable for race, namely the number of people in that county
who attend a predominantly African-American church.9 We conjecture that this variable is likely
to be strongly related to race (making it a strong instrumental variable), and that religious status
should not be related to risk variables and is therefore exogenous to the lender’s accept/reject
loan decision (making it a valid instrumental variable). The latter assumption is also made
stronger given that we have controlled for neighborhood characteristics such as unemployment
7 We also test the endogeneity of the loan-to-value ratio, LTV, but find it to be exogenous in the loan denial
regression using the Durbin-Wu-Hausman test.
8 Many authors have criticized these single-equation studies (Zandi [1993]; Lebowitz [1993]; Horn [1994, 1997];
Yezer, Phillips, and Trost [1994]; Ross and Yinger [1999]; Stengel and Glennon [1999]; and LaCour-Little [1999])
for what is essentially an omitted variable problem, but no study has corrected for the correlation between the race
variable and the error term.
9 See section 3.1 for more details on this variable. Note that this variable is not whether the borrower attends church,
which might reflect some risk-taking characteristics.
rates, percentage who attended college, age of property, percentage of renter occupied housing,
house price appreciation rates, and owner occupied traits.
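The mechanics of the 2SLS approach described above can be sketched on simulated data. This is a minimal illustration, not the estimation code used in the paper: the variable names (church, risk, minority, reject), the coefficient values, and the sample size are all hypothetical, and the neighborhood controls are omitted for brevity.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate_tracts(n=50_000, gamma=0.3, seed=0):
    """Simulated tracts in which unobserved risk drives both race share and rejection."""
    rng = np.random.default_rng(seed)
    church = rng.normal(size=n)   # hypothetical instrument (standardized)
    risk = rng.normal(size=n)     # risk seen by the lender, not the researcher
    minority = 0.8 * church + 0.6 * risk + rng.normal(size=n)
    reject = gamma * minority + 1.0 * risk + rng.normal(size=n)
    return reject, minority, church

def two_stage_least_squares(y, endog, instrument):
    """First stage: endog on instrument. Second stage: y on the fitted values."""
    const = np.ones_like(y)
    Z = np.column_stack([const, instrument])
    fitted = Z @ ols(endog, Z)
    return ols(y, np.column_stack([const, fitted]))[1]

reject, minority, church = simulate_tracts()
gamma_ols = ols(reject, np.column_stack([np.ones_like(reject), minority]))[1]
gamma_2sls = two_stage_least_squares(reject, minority, church)
print(f"OLS: {gamma_ols:.3f}  2SLS: {gamma_2sls:.3f}  (simulated true value 0.3)")
```

In this simulation the OLS coefficient absorbs the unobserved risk term, while the 2SLS coefficient stays close to the simulated value because the instrument is constructed to be unrelated to risk. Whether that exclusion restriction holds in actual data is exactly what the checks in the following paragraphs are meant to probe.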
It is still possible that we have not controlled for some unobservable characteristic. We
therefore use the insightful approach of Altonji, Elder, and Taber (2005a, 2005b) who examine
the impact of Catholic schools on education outcomes using an instrumental variable approach.
As in these papers, we proceed in three steps: a) we examine whether observable loan, borrower,
and property characteristics are significantly different in neighborhoods with a large number of
people who attend a predominantly African-American church than in neighborhoods with few such
people. If these differences are economically large, then our instrumental
variable is capturing some observable risk characteristic; b) we examine whether our instrumental
variable is significantly related to the accept/reject decision in neighborhoods with a large
proportion of White borrowers. If so, then our instrumental variable is capturing
some unobservable risk characteristic that is not exclusively correlated with race; c) we examine
the lower-bound (upper-bound) bias of our estimates, defined as equal selection on unobserved
and observed variables (zero correlation between race and unobserved variables). Using the
differential impact of unobservable and observable risks in subprime and prime markets, we then
examine how much the bias affects our results.
Fifth, the recent literature (for example, Mian and Sufi [2009] and Keys, et al. [2010],
among others) has suggested that much of the current credit crisis in the subprime market has
arisen from the large increase in credit supply provided by lax screening incentives inherent in
the lender’s originate-to-distribute model. Mian and Sufi [2009] find that zip codes that in 1996,
a period before the credit expansion, had either high denial rates or a high share of subprime
borrowers got much greater credit in 2002-05 than other zip codes. Such borrowers obtained
increased credit and had higher default rates despite having lower income and employment growth
rates (proxies for increased mortgage demand) during that period. We use their methodology to
isolate the supply effect by using the proportion of loan applications in the census tract that
were rejected in 1996. In doing so,
we attempt to disentangle whether our results are driven by changes in mortgage demand or
mortgage supply.
We construct a unique data set from seven sources to capture a broad set of borrower risk
characteristics, loan risk characteristics, property risk characteristics, lender characteristics,
religious affiliations, and macroeconomic variables. Our large sample consists of 2,026,556
mortgage applications made by borrowers in New Jersey from 2000 to 2008.
Our principal results are as follows.
First, we find that African-American or minority borrowers are more likely to be rejected
than white borrowers in the prime market. These single-equation results are consistent with those
found by many studies such as Black, Schweitzer, and Mandell [1978]; King [1980]; Schafer and
Ladd [1981]; and Munnell, et al. [1996]. When we examine the subprime market, we find that
African-American or minority borrowers are less likely to be rejected than white borrowers.
Second, we find at the individual-loan level that race is correlated with observable risks
using 14 risk measures. Using Rivers-Vuong and likelihood ratio tests, we show that race is
also positively correlated with unobservable risks (such as wealth, gifts and bequests, and loan
documentation) in the accept/reject regression. The fact that race is correlated with observable
and unobservable risks violates the single-equation model's assumption that race is uncorrelated
with the error term, resulting in biased parameter estimates in the single-equation probit model.
Using a FIML system of equations, we find that African-American or minority borrowers are more
(less) likely to be rejected than white borrowers in the prime (subprime) market, after controlling
for risks. This result is supportive of the information-based theory of discrimination.
Third, we generally find similar results at the neighborhood census tract level. The
Durbin-Wu-Hausman test also shows that race is correlated with the disturbance term in the
neighborhood single-equation OLS regressions, suggesting that the regression parameter
estimates are inconsistent. In our two-stage least squares estimates (2SLS) with a valid
instrumental variable we find that neighborhoods with a higher proportion of minority borrowers
are associated with a significantly higher (lower) percentage of loans rejected in the prime
(subprime) market.
Fourth, using the methodology of Altonji, Elder, and Taber (2005a, 2005b) we find that
our instrumental variable, namely, the number of people in the neighborhood who attend a
predominantly African-American church, is not consistently correlated with observable risk
variables. We also find that it does not correlate with the accept/reject decision in predominantly
white neighborhoods. Finding a higher ratio of unobservables to observable risk variables in
subprime markets when compared to prime markets, we find that our main results in support of
information-based discrimination are unaffected by any possible bias. Finally, we also find that
this instrumental variable is strongly correlated with the race variable, helping us avoid the well-
known weak instrument problem.
Fifth, to disentangle demand and supply effects further, we examine the change in reject
rates, neighborhood risk characteristics and racial composition between 1996 and 2008. We find
that the reduction in rejection rates to minority neighborhoods from 1996 to 2008 cannot be fully
justified by risk, suggesting a relaxation of lending standards to minority neighborhoods. These
results for strong credit supply effects are consistent with those in Mian and Sufi [2009], who
also find reductions in loan denial rates to neighborhoods despite significant deterioration in
credit quality.
A number of papers have suggested that home price depreciation has been a significant
factor for rising mortgage defaults in the subprime industry (see Demyanyk and Van Hemert [2011];
Gerardi, Shapiro and Willen [2008]; Mayer and Pence [2008]; Hubbard and Mayer [2008];
Mayer, Pence, and Sherlund [2009]). Accordingly, we control for home price levels at the MSA
level so as to ensure that none of our above results are driven by home price depreciation.
Our finding that increased credit supply affected minorities in the subprime market
should not be confused with “reverse redlining” by lenders. The former argument is a general
lessening of standards by lenders resulting in minorities in subprime neighborhoods being more
severely affected. The latter argument rests on the necessary assumption that minorities in
subprime neighborhoods were steered and/or targeted for unnecessary loans. We do not test for
this latter argument which is beyond the scope of this paper.
The paper is organized as follows. Section 2 describes the testable hypotheses and compares
our setting to Levitt’s [2004] analysis of the game show “Weakest Link.” In Section 3, we
present our empirical methodology and variables. Section 4 describes the data’s sources and
basic characteristics, and Section 5 presents our empirical results. Section 6 concludes.
2. Testable Hypotheses, Comparison to Levitt’s (2004) Analysis of the Game Show
“Weakest Link,” Unobservable Risks
2.1 Testable Hypotheses
There are two leading theories of discrimination. Under the taste-based theory of
discrimination (Becker [1957]), the motive that drives the behavior is animus or prejudice
towards a particular group, that is, the economic actors simply do not like that group and do not
want to interact with members in that group. Under the information-based theory of
discrimination (Phelps [1972] and Arrow [1973]), the motive that drives the agent’s behavior is
expected profit maximization. In an imperfect information world, economic agents discriminate
against certain groups because they believe these groups have lower productivity (or in this
context credit quality), which will reduce their profit.
Using the implications of these theories in the prime and subprime markets, we develop
testable hypotheses of regressions of the accept/reject decision on race. For ease of exposition,
let
$$Reject_{it} = \beta_0 + \beta X_{it} + \gamma Race_{it} + \varepsilon_{it} \qquad (1)$$
where $Reject_{it}$ is a dummy variable that equals one if a loan application is rejected, and zero
otherwise; $Race_{it}$ is a dummy variable set to unity if the borrower is African American and/or
Hispanic, and zero if White; $X_{it}$ is a vector of observable risk variables categorized into
borrower risks, loan risks, property risks, lender risks, and macroeconomic risks; and $\varepsilon_{it}$ is the
disturbance term.
Ex-ante, we don’t know whether discrimination exists in mortgage lending and which
theory of discrimination explains the data. In the prime market, the two theories have similar
predictions. Under the taste-based theory, lenders tend to reject minority borrowers simply
because they don’t like them. Under the information-based theory, lenders tend to reject minority
borrowers because they believe minorities are associated with lower credit quality and doing so
will minimize default and maximize expected profit. In the subprime market, the two theories
have different predictions. Under the taste-based theory, lenders continue to disproportionately
reject minority borrowers because they dislike them. But the strategic incentive switches under
the information-based theory. In the subprime market, lenders can make higher expected profits
by charging higher loan prices if they believe minorities are associated with lower credit quality,
resulting in a much higher cost of discrimination in the subprime market. Therefore, there is
flipping of behavior in the prime and subprime markets which distinguishes the taste-based
theory from the information-based theory.
Before the subprime mortgage crisis, the subprime mortgage market was expected to have
higher profits than the prime market. For example, an Office of Thrift Supervision study (2000;
page 11) points out that “All in all, subprime lending should have higher expected profits than
prime lending because of its higher risk.” Comparing subprime loans with prime loans, Lax, et
al. (2000) find that 50% of the interest rate premium charged to subprime borrowers is not
related to their higher levels of risk. Azmy (2005) also shows that profits in the subprime lending
industry are high enough that, for some lenders, the excess profit is not justified by risk.
Accordingly, we have the following three testable hypotheses with respect to the sign and
magnitude of the regression coefficient on Race ($\gamma$).
Hypothesis 1: If $\gamma$ is positive and of equal magnitude in the prime and subprime markets,
then we have evidence supportive of the taste-based theory.
Hypothesis 2: If $\gamma$ is positive in the prime market, and zero or negative in the subprime
market, then we have evidence supportive of the information-based theory.
Hypothesis 3: If $\gamma$ is not positive in either the prime or subprime markets, then there is no
evidence supportive of either the taste-based or information-based theories of discrimination.
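The three hypotheses amount to a decision rule on the sign and relative size of the Race coefficient in the two markets. The sketch below encodes that rule; it is illustrative only. The function name and the tolerance `tol` are hypothetical, and the tolerance stands in for the formal statistical tests that would be needed in practice. Sign patterns that none of the three hypotheses covers are reported as indeterminate.

```python
def classify_evidence(coef_prime, coef_subprime, tol=0.05):
    """Map the Race coefficients from the two markets to Hypotheses 1-3.

    tol is a hypothetical threshold: coefficients at or below tol count as
    "not positive", and coefficients within tol of each other count as
    "of equal magnitude". A real analysis would use test statistics instead.
    """
    prime_positive = coef_prime > tol
    subprime_positive = coef_subprime > tol

    if prime_positive and subprime_positive:
        if abs(coef_prime - coef_subprime) <= tol:
            return "H1: taste-based"
        return "indeterminate: positive in both markets, unequal magnitude"
    if prime_positive:
        # Positive in prime, zero or negative in subprime.
        return "H2: information-based"
    if not subprime_positive:
        return "H3: no evidence of either theory"
    return "indeterminate: positive only in the subprime market"

print(classify_evidence(0.30, -0.10))
```

With a positive prime coefficient and a negative subprime coefficient, the rule returns the information-based label, which is the pattern the paper reports.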
2.2 Comparison to Levitt’s (2004) Analysis of the Game Show “Weakest Link”
In this sub-section, we compare the implications of the theories of discrimination in the
prime and subprime mortgage market to the game show “Weakest Link” that was analyzed by
Levitt (2004). The “Weakest Link” is a television game show. In the game, players compete by
answering questions to get the final winner-take-all prize. In early rounds of the game, players
answer a series of questions to build a pot of money that can be added to the final winner-take-all
prize. At the end of each round, one player who received the most votes from other players is
removed from the game. The process continues until only two players are left. In the final round,
the player who answers more questions correctly will get the winner-take-all prize.
The key point is that it is the same difference in agents’ motives that drives the differing
behavior in early and later rounds in the Weakest Link and the differing behavior in the prime
and subprime mortgage markets. In the early rounds of the Weakest Link, the two theories have
similar predictions but for different reasons. Under the taste-based theory, participants in the
game tend to vote off the group that is discriminated against because they don’t like the members in that
group. Under the information-based theory, participants in the game also tend to vote off the
group that is discriminated against because they believe that group is associated with lower skill and
doing so will maximize the number of questions answered and therefore maximize their final
profit. In the later rounds, the two theories have different predictions. Under the taste-based
theory, the players in the targeted group will continue to receive more votes because other
participants don’t like them. But, the strategic incentive switches under the information-based
theory. If the participants in the game believe that a particular group is associated with lower
skill, they will avoid voting off members in that group in order to maximize their chance of
winning and maximize their final profit. It is this flipping of behavior that distinguishes the taste-
based discrimination from the information-based discrimination.
In the prime market, the two theories have similar predictions. Under the taste-based
theory, lenders tend to reject minority borrowers simply because they don’t like them. Under the
information-based theory, lenders tend to reject minority borrowers because they believe
minorities are associated with lower credit quality and doing so will minimize default and
maximize expected profit. In the subprime market, the two theories have different predictions.
Under the taste-based theory, lenders continue to disproportionately reject minority borrowers
because they dislike them. But the strategic incentive switches under the information-based
theory. In the subprime market, lenders can make higher expected profits by charging higher
loan prices if they believe minorities are associated with lower credit quality, resulting in a much
higher cost of discrimination in the subprime market. Therefore, similar to the flipping of
behavior in the Weakest Link, there is flipping of behavior in the prime and subprime markets
which distinguishes the taste-based theory from the information-based theory.
3. Methodology and Variables
3.1 The potential importance of unobservable risks
The standard econometric model used to examine racial discrimination is specified as
regression model (1). One difficulty in estimating this model is that RACE may be correlated with
the disturbance term $\varepsilon$. If RACE is correlated with observable and unobservable risk variables
that are not included in X, then RACE will be correlated with the disturbance term. This
correlation would lead to inconsistent estimates of $\gamma$, the coefficient on RACE, which could
produce apparent support for discrimination when in fact there is none, or vice versa.
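One regression-based way to check for this correlation, in the spirit of the Durbin-Wu-Hausman test mentioned elsewhere in the paper, is to add the first-stage residual to the outcome regression and test its coefficient. The sketch below is a simplified illustration on simulated data, not the paper's procedure: the race variable is treated as continuous, there are no controls, and all names, coefficients, and sample sizes are hypothetical.

```python
import numpy as np

def ols_with_se(y, X):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov))

def dwh_t_stat(y, race, instrument):
    """t-statistic on the first-stage residual in the augmented regression."""
    const = np.ones_like(y)
    Z = np.column_stack([const, instrument])
    first_stage, _ = ols_with_se(race, Z)
    v_hat = race - Z @ first_stage            # part of race not explained by the instrument
    coef, se = ols_with_se(y, np.column_stack([const, race, v_hat]))
    return coef[2] / se[2]

def simulate(endogenous, n=20_000, seed=1):
    """Simulated data; `endogenous` switches the race-risk correlation on or off."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                    # hypothetical instrument
    u = rng.normal(size=n)                    # unobserved risk
    race = 0.8 * z + (0.6 if endogenous else 0.0) * u + rng.normal(size=n)
    y = 0.3 * race + u + rng.normal(size=n)
    return y, race, z

print("t-stat, race correlated with unobserved risk:", dwh_t_stat(*simulate(True)))
print("t-stat, race uncorrelated with unobserved risk:", dwh_t_stat(*simulate(False)))
```

A large absolute t-statistic on the residual indicates that the race variable is correlated with the disturbance term, so single-equation estimates would be inconsistent.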
Many risks are unobservable to the researcher due to data availability, but are privately
observable to the lender. These include borrower and loan characteristics such as wealth, bequests
or gifts received, and loan documentation.10 We describe some of these unobservable risks using
examples below.
10 These risks, which we term unobservable, could include both omitted quantifiable risk variables and
non-quantifiable variables.
Wealth. Consider the example of an African-American borrower and a white borrower
who come to the same lender for a loan application. Assume the two borrowers have similar
credit scores, income, debt-to-income ratios and the two loans are similar in every aspect such as
the loan-to-value ratio, loan amount, and property risks. Based on the observable information,
the researcher would expect to find similar loan decisions made by the lender. If the application
for the white borrower is accepted and the application for the African-American borrower is
rejected, then the researcher might conclude that this is evidence of racial discrimination.
This conclusion would be incorrect if the lender has private information about ability to
pay, which is not observable to the researcher. For example, if the lender knows that the white
borrower has a higher level of wealth than the African-American borrower, then the lender may
approve the loan for the white borrower because of a perceived stronger ability to repay the loan.
Therefore, if RACE is correlated with wealth, the coefficient on RACE may not reflect
discrimination, but the effect of wealth.
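The wealth example can be made concrete with a small simulation. The sketch below is a hedged illustration rather than the paper's specification: it uses a linear probability model fit by least squares as a stand-in for the probit, and the function name, the wealth gap, and the rejection rule are all hypothetical. By construction the simulated lender looks only at wealth, so any race coefficient reflects the omitted variable.

```python
import numpy as np

def omitted_wealth_bias(n=100_000, seed=0):
    """Return the race coefficient without and with wealth in the regression."""
    rng = np.random.default_rng(seed)
    race = rng.binomial(1, 0.3, size=n).astype(float)
    # Hypothetical assumption: minority applicants have lower average wealth.
    wealth = rng.normal(loc=-1.0 * race, scale=1.0, size=n)
    # The simulated lender rejects on wealth and noise only; race plays no role.
    latent = -0.5 * wealth + rng.normal(size=n)
    reject = (latent > 0.5).astype(float)

    def fit(columns):
        X = np.column_stack([np.ones(n)] + columns)
        coef, *_ = np.linalg.lstsq(X, reject, rcond=None)
        return coef

    race_only = fit([race])[1]             # researcher's view: wealth unobserved
    race_with_wealth = fit([race, wealth])[1]  # lender's view: wealth observed
    return race_only, race_with_wealth

short, long = omitted_wealth_bias()
print(f"race coefficient, wealth omitted:  {short:.3f}")
print(f"race coefficient, wealth included: {long:.3f}")
```

When wealth is omitted, the race coefficient is clearly positive even though the simulated lender never uses race; once wealth is included, the coefficient falls close to zero.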
A number of studies (for example, Altonji, et al. [2000]; Gittleman and Wolff [2000];
and Blau and Graham [1990]) have shown that minority (African-American and Hispanic)
households have much lower wealth holdings than white households. In a recent survey of
research, Scholz and Levine [2004] show that wealth differences are large and persistent when
groups are divided by race, and the differences remain even when differences in educational
attainment across the groups are controlled for. Using pooled data from the Federal Reserve’s
Surveys of Consumer Finances from 1989, 1992, 1995, and 1998, they show that there are wide
net wealth disparities at every age. For example, they find that “at ages 51 to 55, …, mean
(median) net wealth of white households is $467,747 ($156,550), while for African-American
and Hispanic households it is $105,675 ($33,170).” They also find that “the net wealth of
African-American and Hispanic college graduates is similar to the net wealth of white high
school graduates, and the net wealth of African-American and Hispanic high school graduates is
similar to the net wealth of white high school dropouts (emphasis added).”
Bequests and gifts. One can make a similar argument about differing levels of bequests
and gifts by race. That is, the lender may know the bequest or gifts received by the African-
American, Hispanic, or white borrower, while the researcher does not. Therefore, the researcher
may incorrectly conclude evidence of discrimination, when differences really reflect inheritance.
Using data from the Panel Study of Income Dynamics, Charles and Hurst [2000] find that white
and African-American households differ in family help for a down payment on a home.
Specifically, more white households (42 percent) get family help than do African-American
households (fewer than 10 percent). Using PSID wealth supplements in 1984, 1989, and 1994,
Gittleman and Wolff [2000] find that inheritance plays an important role in wealth accumulation
for whites, but not for African-Americans. These papers suggest that the ability to obtain
financial help from family differs across races, which can be information known to the
lender, but unknown to the researcher.
Poor documentation. The relative importance of particular unobservable risks may
change when lending standards change. For example, “no documentation” or “low
documentation” loans were very rare in the 1990s, and were not included in earlier
single-equation studies. But such loans were 12 percent of mortgage loan originations in the second half
of 2005, a boom year for subprime mortgages.11 If minority status of the borrowers is correlated
with low and unstable income, they might be more likely to apply for low documentation loans.12
Therefore, omitting the documentation type of the loan biases the regression coefficients,
because ‘low documentation’ loans are associated with higher risks.
Soft private information. There are many soft information variables that the researcher
does not have access to which might be important to lenders. Examples include whether the
borrower has significant deposits with the bank, has a long relationship with the lender, or
shares a common social circle or cultural affinity with the lender (such as belonging to the same
credit union), as well as the month’s/quarter’s loan origination target set for the loan officer
by the supervisor.13 Therefore, it is hard for
researchers to document every variable that is relevant to the lender’s loan decision.
We describe below how our methodology mitigates the bias in parameter estimates.
11 State of the Nation’s Housing 2005 report at Harvard’s Joint Center for Housing Studies,
http://www.jchs.harvard.edu/publications/markets/son2005.
12 In a typical low documentation loan, income can be stated, but not verified. Therefore, borrowers with low or
unstable income are more likely to apply for a low documentation loan. Low documentation can capture additional
credit risks that are not captured in the borrower’s FICO credit score. For example, in the current FICO score, good
or bad performance is based on the worst delinquency on any obligation over the last two years. Therefore, a
borrower who is delinquent in one out of ten open accounts could represent the same level of unsatisfactory outcome
as someone who is delinquent on all ten accounts if their worst delinquency amount is the same. In the latter case,
the borrower who is delinquent on all ten accounts is more willing to apply for a low documentation loan, given that
a lender might otherwise require her to disclose more adverse information. Courchane, et al. [2007] show that
low-documentation loans in the subprime market are associated with higher credit risk.
13 Gan and Riddiough [2008] suggest that lenders possess proprietary credit quality information embedded in their
screening technology that is not observable to empirical researchers.
3.2 Equations and variables
We begin by explaining the system of equations at the individual loan-level. We model
the accept/reject decision of a loan application in a manner similar to that used in the single-
equation studies:
$$\mathit{REJECT}_{it} = \beta_0 + \beta_1 \mathit{BORROWER}_{it} + \beta_2 \mathit{LOAN}_{it} + \beta_3 \mathit{PROPERTY}_{it} + \beta_4 \mathit{LENDER}_{it} + \beta_5 \mathit{MACRO}_{t} + \gamma\, \mathit{RACE}_{it} + \varepsilon_{it} \tag{2}$$
where $\mathit{REJECT}_{it}$ is a dummy variable that equals one if the loan application is rejected, and zero
otherwise; $\mathit{BORROWER}_{it}$ is a vector of borrower risk variables; $\mathit{LOAN}_{it}$ is a vector of loan risk
variables; $\mathit{PROPERTY}_{it}$ is a vector of property risk variables; $\mathit{LENDER}_{it}$ is a vector of lender
characteristics; $\mathit{MACRO}_{t}$ is a vector of macroeconomic variables; $\mathit{RACE}_{it}$ is a dummy variable that
equals one if the borrower is a member of a minority group (African-American or Hispanic), and
zero if the borrower is white. We have used a comprehensive set of loan variables. If RACE is
correlated with the disturbance term $\varepsilon_{it}$, it is technically endogenous, resulting in inconsistent
estimates of $\gamma$. To rectify this problem we specify another equation for RACE:
$$\mathit{RACE}_{it} = \alpha_0 + \alpha_1 \mathit{BORROWER}_{it} + \alpha_2 \mathit{LOAN}_{it} + \alpha_3 \mathit{PROPERTY}_{it} + \alpha_4 \mathit{BLACKCHURCH} + e_{it} \tag{3}$$
Because both the outcome variable $\mathit{REJECT}_{it}$ and the independent variable $\mathit{RACE}_{it}$ are
binary variables and $\mathit{RACE}_{it}$ is correlated with the disturbance $\varepsilon_{it}$, neither the traditional
two-stage least squares (2SLS) nor the Rivers-Vuong approaches will produce consistent estimators.
Instead, we use the full-information maximum likelihood (FIML) methodology to estimate the
model because this is the only valid econometric technique that can be used to estimate a binary
choice model with a binary endogenous variable (Greene [2003]; Wooldridge [2002]). This
method has been used to estimate binary choice models with endogenous variables by Evans,
Oates, and Schwab [1992], Evans and Schwab [1995], and Greene [1998]. In FIML, the
correlation of the residual terms in equations (2) and (3) is estimated by maximizing the log
likelihood function for the two equations simultaneously. Though the identification of the system
of equations does not need an instrument, we include the variable BLACKCHURCH, which is
the county-level number of African-American church members. The coefficient on RACE is
the coefficient of interest. We estimate the model for prime and subprime loans separately to
draw inferences for the taste-based and information-based theories of discrimination.
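To make the joint estimation concrete, the log likelihood of a recursive bivariate probit, the textbook form of this kind of estimator (see, e.g., Greene [2003]), can be sketched as follows. The expression rests on assumptions that are not spelled out in the text: the two disturbances are taken to be bivariate standard normal with correlation $\rho$; $x_{it}$ stacks the borrower, loan, property, lender, and macroeconomic controls of the rejection equation; $z_{it}$ stacks the regressors of the race equation, including BLACKCHURCH; $\gamma$ denotes the coefficient on RACE; and the sign indicators $q^{R}_{it}$ and $q^{M}_{it}$ are shorthand introduced here for illustration.

```latex
% Assumed notation (not from the paper): x_{it}, z_{it}, q^{R}_{it}, q^{M}_{it}
\begin{align*}
q^{R}_{it} &= 2\,\mathit{REJECT}_{it} - 1, \qquad q^{M}_{it} = 2\,\mathit{RACE}_{it} - 1, \\
\ln L &= \sum_{i,t} \ln \Phi_2\!\left( q^{R}_{it}\,\bigl(x_{it}'\beta + \gamma\,\mathit{RACE}_{it}\bigr),\;
         q^{M}_{it}\, z_{it}'\alpha;\; q^{R}_{it}\, q^{M}_{it}\,\rho \right)
\end{align*}
```

Here $\Phi_2(\cdot,\cdot;\rho)$ is the bivariate standard normal distribution function. When $\rho = 0$ the expression factors into two separate probit likelihoods, so the single-equation probit is the restricted case of this system.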
Motivating the correlation coefficient $\rho$ between $e_{it}$ and $\varepsilon_{it}$ is straightforward. The residual
$e_{it}$ in equation (3) captures characteristics of being a minority that are not correlated with
borrower, loan, and property risk characteristics. It could be unobserved risk or just another
characteristic of minorities. If it is other characteristics, and not risk, that is captured by
equation (2), then the estimate of $\rho$ should be statistically insignificant. If minorities have
higher unobservable risks, then the estimate of $\rho$ should be positive and statistically significant.
Conversely, if minorities have lower unobservable risks, then the estimate of $\rho$ should be
negative and statistically significant.
We have two specifications for RACE: AFRICAN-AMERICAN, a dummy variable set to
unity if the borrower is African-American, and zero if the borrower is white; and MINORITY, a
dummy variable set to unity if the borrower is either African-American or Hispanic, and zero if
the borrower is white.14 As in the previous literature, we exclude all other race categories, such
as Asian-American.
We use a comprehensive list of control variables combining the explanatory variables
used by Holmes and Horvitz [1994]; Munnell, et al. [1996]; Berkovec, et al. [1998]; Day and
Liebowitz [1996, 1998]; and Calem, et al. [2004]. Specifically, the control variables cover five
broad categories: borrower risk characteristics, loan risk characteristics, property risk
characteristics, lender characteristics, and macroeconomic variables.
Our proxies for the various borrower risk characteristics are as follows: INCOME, the
natural logarithm of individual borrower income; MEDFICO, the borrower’s Census tract
median credit score (FICO); DTI, the average non-mortgage debt divided by average income
in the Census tract; PCTCOLLEGE, the percentage of the Census tract population greater
than 25 years of age with at least a Bachelor’s degree; and UNEMPLOYED, the number of
unemployed civilians divided by the sum of employed and unemployed civilians in the Census
tract. We expect lower levels of INCOME, MEDFICO, and PCTCOLLEGE to be associated with
higher borrower risks and higher probability of rejection. We also expect higher DTI and
UNEMPLOYED to be associated with higher borrower risks and higher probability of rejection.
14 If the borrower is African-American (white) and is a male, we classify the borrower as African-American (white).
We re-estimated our models after modifying our definition of African-American to include cases in which the
borrower and/or co-borrower is African-American, with no reference to gender. Similar adjustments were made for
Hispanics. Our results were generally unchanged (results available on request), because only a few borrowers were
affected when we use different definitions of RACE.
Loan risk characteristics include LTV, the median loan amount divided by the product of
owner-occupied median house value and annual house price appreciation rate in the Census tract;
CONVENTIONAL, a dummy variable set to unity if the loan is a conventional loan, and zero if
the loan is a special program loan, such as VA or FHA loan; and AMOUNT, defined as
individual loan amount in thousands. We expect loans with a higher LTV and CONVENTIONAL
loans to carry higher loan risk and therefore to be positively associated with the probability of
rejection.
Property risk characteristics include MEDAGE, the median age of residential property in
the census tract; PCTRENT, the number of renter-occupied housing units divided by the total
housing units in the census tract; HVCHG, the house price appreciation rate; NEWOWN, the
percentage change in number of owner occupants between the 1990 and 2000 censuses; and
OWNEROCC, a dummy variable equal to unity if the property underlying the loan is owner-
occupied, and zero otherwise. We expect higher levels of MEDAGE and PCTRENT to be
associated with higher property risk and lead to higher rejection probabilities, while properties
that are owner-occupied (OWNEROCC) or located in higher HVCHG or NEWOWN tracts are
less risky, and are accordingly expected to have lower rejection probabilities.
We include two measures of lender characteristics: BANK, a dummy variable equal to
unity if the lender is a commercial bank, savings institution, or thrift, and zero otherwise; and
HERF, the Herfindahl-Hirschman Index of the census tract, defined as the sum of squared
market shares of lenders in each census tract. We include the bank dummy variable to distinguish
depository from nondepository institutions, because these two types of institutions are subject to
very different sets of regulations. For example, depositories are subject to Community
Reinvestment Act ratings and CAMEL ratings, but nondepositories are not.
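As a minimal sketch of the HERF definition, the snippet below computes the sum of squared lender market shares within a tract. It assumes that a lender's market share is its share of application counts in the tract; the text does not say whether counts or dollar volumes are used, and the function name, tract labels, and lender IDs are hypothetical.

```python
from collections import Counter

def herfindahl(lender_ids):
    """Sum of squared market shares, with shares measured as fractions of
    the applications in one tract (an assumption; see the lead-in)."""
    counts = Counter(lender_ids)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

# Hypothetical tracts: each list holds the lender ID of every application.
tract_loans = {
    "tract_A": ["L1", "L1", "L2", "L3"],   # shares 0.50, 0.25, 0.25
    "tract_B": ["L1", "L1", "L1", "L1"],   # a single lender
}
herf = {tract: herfindahl(ids) for tract, ids in tract_loans.items()}
print(herf)
```

With shares expressed as fractions, the index lies between 1/N for N equally sized lenders and 1 for a single lender, so higher values indicate a more concentrated tract.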
Three macroeconomic variables are chosen to control for differences in the economic
environment at the time the loan was applied for: PRIMERATE, the interest rate on prime loans;
TERM, the spread between the yield on the seven-year Treasury note and the yield on the
three-month Treasury bill; and DEFAULT, the spread between the yields on Baa and Aaa bonds.
To check the robustness of the loan-level FIML results, we also estimated the model at
the neighborhood-census-tract level. By aggregating the loan-level binary dependent variables
into neighborhood continuous variables, we can use two-stage least squares (2SLS) with
instrumental variables. If we find similar results for both levels, then the central results do not
depend on FIML. Moreover, some of the borrower risk variables, such as the census tract
MEDFICO score and DTI, are imperfect proxies for an individual borrower’s risk. If the results
from the individual loan-level (with imperfect credit risk proxies) hold at the neighborhood-level
(with accurate credit risk proxies), imperfect borrower risk proxies are not driving the results.
At the neighborhood-census-tract level, previous studies examine either the number of
subprime loan rejections or originations. This emphasis is problematic because more loans could
be rejected or originated simply because there were more applications. Therefore, we examine
the percentage of subprime loans rejected in a census tract in order to also control for the demand
side of loans. The system of equations for the neighborhood level is specified as follows:
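$$\mathit{PCTREJECTION}_{it} = \beta_0 + \beta_1 \mathit{BORROWER}_{it} + \beta_2 \mathit{LOAN}_{it} + \beta_3 \mathit{PROPERTY}_{it} + \beta_4 \mathit{MACRO}_{t} + \gamma\, \mathit{PCTRACE}_{it} + \varepsilon_{it} \tag{4}$$
$$\mathit{PCTRACE}_{it} = \alpha_0 + \alpha_1 \mathit{BORROWER}_{it} + \alpha_2 \mathit{LOAN}_{it} + \alpha_3 \mathit{PROPERTY}_{it} + \alpha_4 \mathit{BLACKCHURCH} + e_{it}, \tag{5}$$
where $\mathit{PCTREJECTION}_{it}$ is defined as the number of loan rejections divided by the number of loan
applications in census tract $i$ and $\mathit{PCTRACE}_{it}$ is the percentage of minority applicants in census
tract $i$. The coefficient on $\mathit{PCTRACE}_{it}$ is the coefficient of interest. We estimate the model
separately for prime and subprime loans. We use the Durbin-Wu-Hausman test to examine
whether $\mathit{PCTRACE}_{it}$ is correlated with the disturbance term $\varepsilon_{it}$.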
The control variables in the neighborhood regression are similar to those in the loan-level
regression. However, because some of the control variables such as OWNEROCC and BANK are
loan-level variables, we have fewer control variables in the neighborhood specifications.15
Similar to the loan-level regression, we have two specifications for the race variable PCTRACE:
PCTAFRICAN, the percentage of census tract applicants that are African-American; and
PCTMINORITY, the percentage of census-tract applicants that are either African-American or
Hispanic.
15 We estimated the loan-level specifications dropping variables that are not used in the neighborhood specifications
and found that none of our results change significantly.
Because both the outcome variable and independent variable are continuous, we can use
two-stage least squares to estimate the model. For this system of equations to be identified, we
need a valid and strong instrumental variable16 for PCTRACE. We use the following instrument
in the PCTRACE equation: BLACKCHURCH, the county-level number of black church
members. This instrument is chosen because black church membership is highly correlated with
African-American status, but religious status should be exogenous to the accept/reject loan
decision. In the first stage, we estimate by OLS a regression of PCTRACE on all control
variables and the instrumental variable. We use the fitted values from this stage as regressors in
the PCTREJECTION regression.
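The two stages can be illustrated with a small, fully synthetic example. This is a sketch under stated assumptions rather than the estimation actually run in the paper: the data are a toy factorial design, a single control `w` stands in for the full set of risk variables, `u` plays the role of an unobserved risk factor, and the helper `ols` and all numeric coefficients are hypothetical.

```python
from itertools import product

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic design (not the paper's data): every combination of an
# instrument z, one control w, and an unobserved risk factor u.
design = list(product([0.0, 1.0], [0.0, 1.0], [-1.0, 1.0]))
z = [d[0] for d in design]   # stands in for BLACKCHURCH
w = [d[1] for d in design]   # stands in for the risk controls
u = [d[2] for d in design]   # unobserved risk, omitted from every regression
pctrace = [0.5 + zi + 0.5 * wi + ui for zi, wi, ui in design]
# The true coefficient on pctrace is 2.0; u enters both equations,
# which is what makes pctrace endogenous.
pctrej = [1.0 + 2.0 * x + 3.0 * wi + 4.0 * ui
          for x, wi, ui in zip(pctrace, w, u)]

# Naive OLS of the rejection rate on [1, pctrace, w]
b_ols = ols([[1.0, x, wi] for x, wi in zip(pctrace, w)], pctrej)

# First stage: pctrace on [1, w, z]; keep the fitted values
a = ols([[1.0, wi, zi] for wi, zi in zip(w, z)], pctrace)
fitted = [a[0] + a[1] * wi + a[2] * zi for wi, zi in zip(w, z)]

# Second stage: rejection rate on [1, fitted pctrace, w]
b_2sls = ols([[1.0, xh, wi] for xh, wi in zip(fitted, w)], pctrej)

print("OLS coefficient on PCTRACE: ", round(b_ols[1], 6))
print("2SLS coefficient on PCTRACE:", round(b_2sls[1], 6))
```

In this design the naive OLS coefficient absorbs the omitted factor, while the second stage recovers the coefficient of 2.0 that generated the data. Standard errors taken from a manually run second stage are not valid, so applied work would rely on a dedicated instrumental-variables routine for inference.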
Next, to examine whether changes in rejection rates are associated with changes in
lending standards, we further estimate the following model:
$$\mathit{DIFF\_PCTREJECTION}_{i,96\text{-}06} = \beta_0 + \beta_1 \mathit{DIFF\_BORROWER}_{i,96\text{-}06} + \beta_2 \mathit{DIFF\_LOAN}_{i,96\text{-}06} + \beta_3 \mathit{DIFF\_PROPERTY}_{i,96\text{-}06} + \beta_4 \mathit{DIFF\_MACRO}_{i,96\text{-}06} + \mathit{RESID}_{i} \tag{6}$$
and
$$\mathit{RESID}_{i} = \delta_0 + \delta_1 \mathit{DIFF\_PCTRACE}_{i,96\text{-}06} + e_{i}, \tag{7}$$
where DIFF_PCTREJECTION is the difference in rejection rates between 1996 and 2006, and
DIFF_BORROWER, DIFF_LOAN, DIFF_PROPERTY, and DIFF_MACRO are the differences
in borrower, loan, property and macroeconomic risks between 1996 and 2006.
In the first stage, we estimate model (6). In the second stage, we regress the residual of
regression (6) on the change in the percentage of minority borrowers between 1996 and 2006.
The residual from regression model (6) captures the change in rejection rates that is not explained
by changes in risks. If the coefficient $\delta_1$ is positive, it suggests that lenders tightened lending to
minority neighborhoods. Otherwise, if it is negative, it suggests a relaxation of lending standards
to minority neighborhoods.
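The two-step logic of equations (6) and (7) can be sketched with made-up numbers. The example is illustrative only: it collapses the borrower, loan, property, and macroeconomic changes into a single hypothetical risk change, uses four synthetic tracts, and the helper name `simple_ols` is an assumption rather than anything taken from the paper.

```python
from itertools import product

def simple_ols(x, y):
    """Intercept and slope from a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical tract-level changes between 1996 and 2006 (synthetic values)
design = list(product([-1.0, 1.0], [0.0, 1.0]))
diff_risk = [d[0] for d in design]       # one composite change in risk
diff_pctrace = [d[1] for d in design]    # change in minority share
diff_pctrej = [0.1 + 0.5 * r - 0.3 * m for r, m in design]

# Stage 1 (equation 6): explain the change in rejection rates with the change in risk
b0, b1 = simple_ols(diff_risk, diff_pctrej)
resid = [y - (b0 + b1 * r) for y, r in zip(diff_pctrej, diff_risk)]

# Stage 2 (equation 7): regress the unexplained change on the change in minority share
d0, d1 = simple_ols(diff_pctrace, resid)
print("slope on DIFF_PCTRACE:", round(d1, 6))
```

In this toy data the second-stage slope is negative, which under the interpretation above would point to a relaxation of lending standards in tracts where the minority share rose.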
3. Data Sources and Descriptive Statistics
3.1 Data sources
We match seven sources of data to construct a broad set of borrower risk characteristics,
loan risk characteristics, property risk characteristics, lender characteristics and macroeconomic
variables. First, we use the Home Mortgage Disclosure Act (HMDA) data from 2000 to 2008 to
obtain individual loan-level data (such as whether a loan is accepted or rejected, loan
amount, income, race and gender of the borrower, etc.). We also use the HMDA data to derive
measures of lender characteristics, such as the Herfindahl-Hirschman Index of the census tract
and whether the lender is a bank. Second, following previous studies, we use the Department of
Housing and Urban Development’s (HUD) list of subprime lenders to code each loan as being
subprime or prime. HUD’s list of lenders is matched to HMDA by each lender’s unique
identification code. However, since HUD stopped publishing the subprime lender list after
2005, we classify a lender as subprime if its higher-priced loan originations exceed 50% of its
total loan originations. The correlation between our list of subprime
lenders and the HUD subprime list is 0.8. Third, we use U.S. census data to derive census
tract-level demographic, property and borrower risk characteristics. We match the census data to
HMDA data by state, county, and census tract number. Fourth, we use proprietary data from
TransUnion, a major credit bureau, for the tract-median FICO score (MEDFICO) and debt-to-income
ratio (DTI), which are widely accepted borrower-risk variables used by mortgage bankers and
brokers in their lending decisions. We match the credit bureau data to HMDA data by state,
county, and census tract number. Fifth, we match the House Price Index (HPI) data from the
Federal Housing Finance Agency (FHFA, formerly OFHEO) to HMDA data by year
and Metropolitan Statistical Area (MSA). We use these data to construct the neighborhood house
price appreciation rate, which is used to calculate the loan-to-value ratio (LTV). Sixth, we match
the number of African-American church members from the U.S. Religious Landscape Survey by
county. The African-American churches are identified from the list of predominantly
African-American denominations in the Religious Congregations and Membership Study (RCMS).
Finally, we use macroeconomic data from the Federal Reserve Bank of St. Louis’s website
(http://research.stlouisfed.org) to control for macroeconomic risk. Detailed definitions of the
control variables, instrumental variables, their sources, and which regression they are used in are
listed in Table I.
16 We use the Staiger-Stock test to examine whether this is a strong instrumental variable.
***Table I***
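The rule used to extend the subprime lender list after 2005 (a lender is subprime if higher-priced originations exceed 50% of its total originations) can be sketched as below. The sketch assumes the share is computed over counts of originations; the function name, lender IDs, and records are hypothetical and are not HMDA field names.

```python
from collections import defaultdict

def subprime_lenders(originations, threshold=0.5):
    """Map each lender ID to True if its share of higher-priced
    originations strictly exceeds the threshold."""
    total = defaultdict(int)
    higher_priced = defaultdict(int)
    for lender_id, is_higher_priced in originations:
        total[lender_id] += 1
        higher_priced[lender_id] += int(is_higher_priced)
    return {lid: higher_priced[lid] / total[lid] > threshold for lid in total}

# Hypothetical originations: (lender ID, higher-priced flag)
originations = [
    ("A", True), ("A", True), ("A", True), ("A", False),    # share 0.75
    ("B", True), ("B", False), ("B", False), ("B", False),  # share 0.25
    ("C", True), ("C", False),                              # share 0.50
]
flags = subprime_lenders(originations)
print(flags)
```

Because the rule is stated as "exceeds 50%", a lender sitting exactly at one half is treated as prime in this sketch.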
3.2 Descriptive statistics
Table II reports summary statistics for variables in the loan-level regressions. The sample
includes 2,026,556 mortgage loan applications in New Jersey from 2000 to 2008 and is divided
into the prime loan sample (1,714,003 applications) and the subprime loan sample (312,553
applications). Examining the two samples, the probability of rejection is much higher in
subprime lending (0.46) than in prime lending (0.23).
***Table II***
The subprime mortgage sample has a higher proportion of minority applicants than the
prime mortgage sample. For example, 19 percent of subprime loan applicants are
African-Americans and 28 percent are minorities (African-American or Hispanic borrowers); the
corresponding values for prime loan applicants are 10 percent and 19 percent, respectively.
These results are consistent with Scheessele [2002], Calem, et al. [2004], and Mayer and Pence
[2008], who also find that a large proportion of minority borrowers obtain their mortgages in the
subprime market. Subprime borrowers also show higher credit risks than prime borrowers. On
average, subprime borrowers have lower income (natural logarithm of borrower income in
thousands is 4.35 for subprime borrowers and 4.54 for prime borrowers) and a lower credit score
(681 for subprime and 730 for prime). Further, subprime borrowers have a lower percentage of
college graduates (22 percent for subprime versus 29 percent for prime) and a higher
unemployment rate (7 percent for subprime and 6 percent for prime).
4. Results
In this section, we present our regression results at the individual loan-level and
neighborhood-level. We first estimate the loan-level regressions, in which we can control for
more variables and at the level at which the loan is made, namely, the individual. The advantage
of the FIML method is that it is at the individual loan level, the unit at which the lending officer
makes the decision. The disadvantage is that the regression estimates depend crucially on the
normality assumption of the error terms. The second method is two-stage least squares (2SLS) at
the neighborhood census tract level. The advantage of 2SLS is that the regression estimates do
not depend crucially on the statistical distribution of the error terms, but its disadvantage is that
it is at the census tract level.
4.1 Loan-level estimation
Similar to previous studies, we begin by estimating single-equation probit regressions
for the accept/reject decision, where RACE is treated as uncorrelated with the disturbance term.
Table III reports these results. In columns (1) and (3), RACE is defined as a dummy
variable that equals one if the borrower is African-American, and zero if the borrower is white.
In columns (2) and (4), RACE is equal to one if the borrower is either African-American or
Hispanic, and zero if the borrower is white. We estimate separate regressions for prime and
subprime lending and report the marginal effects.
***Table III***
In Table III, we find that the probability of rejection is 7.9 percent higher for an African-
American borrower than it is for a white applicant in the prime market. The probability of
rejection is 2.0 percent higher for a minority (African-American or Hispanic) borrower than for a
white borrower. These results are consistent with studies such as Munnell, et al. [1996], which
find that the probability of rejection is 8 percent higher for minorities (African-American or
Hispanic) than for whites. But, in the subprime market, we do not find similar results. The
probability of rejection is 3.6 percent lower for African-Americans than for whites.
We next investigate the effects of correcting for the correlation between RACE and the
disturbance term in the accept/reject regression. We begin by showing that RACE is correlated
with other observable risk characteristics in Table IV.
***Table IV***
Panel A of Table IV presents different risk characteristics for different race categories
and tests for the differences in means. Examining African-American versus white borrowers, we
find that African-Americans exhibit much higher borrower, loan, and property risks than whites.
African-Americans have, on average, lower income, lower credit score, a higher debt-to-income
ratio, lower education, and a higher unemployment rate than whites. The magnitude of the
difference is quite large for some variables. For example, the average credit score for whites is
735, but it is only 666 for African-Americans. The average debt-to-income ratio for whites is
18.5 percent while it is 30.1 percent for African-Americans. In terms of loan risks, the average
loan amount is much smaller for African-Americans than whites ($138,396 versus $192,271) and
the average loan-to-value ratio is slightly higher for African-Americans. Moreover,
African-Americans are more likely to apply for a special program loan such as a VA or FHA loan,
given that the proportion of conventional loans is much lower for African-Americans than for whites
(88.7 percent versus 96.3 percent). Examining property risk characteristics, we find that African-
Americans live in neighborhoods with higher property risks such as older houses, higher
percentage of rental housing units, a higher percentage of houses boarded up, etc. The results for
minorities are very similar to African-Americans. African-Americans and minorities are
significantly different from whites for all of the borrower, loan, and property risk variables.
Therefore, Panel A of Table IV provides evidence that RACE is strongly correlated with
observable borrower, loan, and property risk variables in the accept/reject regression.
We now test whether RACE is also correlated with unobservable risk variables. To do so,
we use the Rivers-Vuong [1988] endogeneity test17 and the likelihood ratio test.18 Both tests
reject the null hypothesis that RACE is exogenous, suggesting inconsistent parameter estimates
in the accept/reject regression (see Panel B of Table IV). Moreover, for all specifications, ρ is
positive and statistically significantly different from zero, suggesting that minorities are
associated with higher unobservable risks, leading to higher loan rejections. Therefore, the
omission of the unobservable risks leads to an overestimate of the effect of RACE on rejection.
Because we reject the null hypothesis that RACE is uncorrelated with the disturbance term in the
accept/reject regression, single-equation probit estimation will produce inconsistent estimators.
In Table V, we use FIML, in which ρ is built into the likelihood function. The coefficient
on the RACE variable is positive (negative) in prime (subprime) markets. That is, these results
show that lending discrimination against minorities appears in prime markets after correcting for
endogeneity (correlation of RACE with the disturbance), similar to the results in Table III in
which RACE is treated as exogenous (uncorrelated with the disturbance). Furthermore, we find
that the coefficients on African-American and minority status are significantly negative in the
subprime market, suggesting that African-American and minority borrowers are more likely to
be approved for a subprime loan. These results are consistent with information-based
discrimination.
***Table V***
4.2 Neighborhood-level estimation
We now estimate the model at the neighborhood census tract level. By aggregating the
loan-level binary dependent variables into neighborhood continuous variables, we can use
two-stage least squares (2SLS) with instrumental variables. If we find similar results for both levels,
then the central results do not depend on FIML. Moreover, some of the borrower risk
variables, such as the census tract MEDFICO score and DTI, are imperfect proxies for an individual
borrower’s risk. If the results from the individual loan-level (with imperfect credit risk proxies)
hold at the neighborhood-level (with accurate credit risk proxies), imperfect borrower risk
proxies are not driving the results.
17 We do not perform the Durbin-Wu-Hausman endogeneity test because the dependent variable in the accept/reject
regression is not a continuous variable. Appendix B describes the Rivers-Vuong [1988] test.
18 The likelihood ratio test is a test for whether ρ (the correlation coefficient between the residuals of the
accept/reject equation and the race equation) is statistically significantly different from zero.
We define PCTRACE as the percentage of applicants that are African-American in a
census tract and the percentage of minority applicants (African-American or Hispanic) in a
census tract. Similar to our loan level regressions, we compare the OLS results, which do not
correct for the correlation between race and the disturbance term, with the results obtained when
we correct for the correlation using two-stage least squares regressions. In doing so, we test
whether RACE is technically endogenous, resulting in inconsistent parameter estimates for OLS.
In Panel A of Table VII, our OLS regressions show higher rejection rates for minorities in the
prime markets (consistent with Gabriel and Rosenthal [1991], and Munnell, et al. [1996]). For
subprime markets we find that minorities have lower rejection rates. However, these regression
parameter estimates might be biased, which we test for using the Durbin-Wu-Hausman test for
endogeneity. As the first step in the Durbin-Wu-Hausman test, one typically regresses the
endogenous variable on all exogenous variables in the system and obtains the residual. We then
include the residual as an additional regressor in the original OLS regressions. If the coefficient
on the residual is statistically significantly different from zero, we can reject the hypothesis of
exogeneity. Panel B of Table VIII shows the chi-squared statistic to be statistically significantly
different from zero, suggesting inconsistent parameter estimates on RACE.
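The regression-based version of the Durbin-Wu-Hausman procedure described above can be sketched as follows. The data are a synthetic factorial design rather than the paper's sample, the helper `ols` and every coefficient value are hypothetical, and the sketch stops at the point estimate: judging whether the coefficient on the residual differs from zero would additionally require its standard error, which is omitted here.

```python
from itertools import product

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Synthetic design: instrument z, control w, unobserved risk u, pure noise e
design = list(product([0.0, 1.0], [0.0, 1.0], [-1.0, 1.0], [-1.0, 1.0]))
pctrace = [0.5 + z + 0.5 * w + u for z, w, u, e in design]
pctrej = [1.0 + 2.0 * x + 3.0 * w + 4.0 * u + 0.5 * e
          for x, (z, w, u, e) in zip(pctrace, design)]

# Step 1: regress the suspect variable on all exogenous variables [1, w, z]
a = ols([[1.0, w, z] for z, w, u, e in design], pctrace)
resid = [x - (a[0] + a[1] * w + a[2] * z)
         for x, (z, w, u, e) in zip(pctrace, design)]

# Step 2: add the first-stage residual to the original regression
b = ols([[1.0, x, w, v] for x, v, (z, w, u, e) in zip(pctrace, resid, design)],
        pctrej)
print("coefficient on first-stage residual:", round(b[3], 6))
print("coefficient on PCTRACE:             ", round(b[1], 6))
```

A coefficient on the residual that is far from zero, as in this toy data, is the signal of endogeneity that the test formalizes. Including the residual also returns the same point estimate on PCTRACE as two-stage least squares would.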
***Table VI***
Given that PCTRACE is a continuous variable, we can use two-stage least squares
regressions (2SLS) with an instrumental variable to obtain the coefficient on RACE. Our selection
of the instrumental variable is guided by econometric considerations. Our instrument,
BLACKCHURCH, captures the idea that Black church members are highly correlated with
African-American status, but religious status should be exogenous to the accept/reject loan
decision. We regress PCTREJECTION on PCTRACE and other neighborhood control variables,
the results of which are presented in Table VII. For prime lending, minority-concentrated
neighborhoods are associated with a higher percentage of rejection. Minority-concentrated
neighborhoods are associated with a lower percentage of rejection in subprime markets. These
results are consistent with those found in the individual loan-level regressions using FIML,
suggesting that the central result that minorities were more likely than whites to obtain loans in
the subprime market does not depend on which method we use (2SLS or FIML) or on which
imperfect proxies we use for an individual borrower’s risk (MEDFICO and DTI). The results are
once again consistent with the information-based theory of discrimination.
***Table VII***
4.3 Validity and Strength of Instrumental Variable
To test the validity and potential bias of our instrument, we follow the methodology of Altonji
et al. (2005a, 2005b). First, we test whether the observable variables are significantly
different in areas with high concentrations of black church members and areas with low
concentrations of black church members. If the observable variables are both economically and
statistically significantly different in the two areas, it may suggest that BLACKCHURCH is also
correlated with other observable variables, and may not serve as a valid instrument. Therefore,
we sort the census tracts into two subsamples: above the median number of black church members
and below the median number of black church members. We present the differences in observable
variables between these two subsamples in Panel A of Table VIII. Most of the differences are
statistically significant due to our use of a large sample (see t-statistics and p-values). However,
for many risk variables the signs are not consistent with a risk story. For example, if
neighborhoods with a larger number of African-American church members have higher risk, one
would expect them to have a lower median FICO score than neighborhoods with a smaller
number of African-American church members. In fact we find the opposite (7.136% v. 7.007%,
respectively). Similarly, neighborhoods with a larger number of African-American church
members have lower loan-to-value ratios, lower percentage of renters and new owners, and
younger houses than neighborhoods with a smaller number of African-American church
members, a result contrary to the correlated risk story. In some cases, the differences are
statistically insignificant (for example, log median income and house price appreciation rates),
whereas in others the differences are very small (for example, conventional loans). In summary,
this panel does not show any evidence for consistently higher risk in neighborhoods with a larger
number of African-American church members when compared to neighborhoods with a smaller
number of African-American church members.
***Table VIII***
The second approach to assess the validity of the instrumental variable is to identify a
sample of white borrowers and test whether BLACKCHURCH directly affects the accept/reject
decision in this sub-sample. If the coefficient on BLACKCHURCH is significant in such a sample,
it suggests that the instrumental variable itself may directly affect the accept/reject decision and
may not serve as a valid instrument. To implement this approach, we regress the accept/reject
decision on the observable variables and the instrument BLACKCHURCH for the top-quintile
white census tracts sample. As shown in Panel B of Table VIII, the instrument, black church
members, does not have a statistically strong effect on the accept/reject decision. Therefore, we
cannot reject the validity of BLACKCHURCH as an instrument.
The third approach to assess the validity of the instrumental variable in the
individual-loan FIML regressions is to assess the upper and lower bounds of the coefficients on the
RACE variable by varying ρ, the correlation coefficient between the residuals of the accept/reject
equation and the race equation. The upper bound bias assumes that ρ is 0, i.e., there is no
correlation between the race variable and the unobservable variables (namely, the
single-equation results). The lower bound bias assumes that the selection on the unobservables is the
same as selection on the observables. More specifically, it assumes that the correlation between
RACE and the observable variables in the regression is equal to the correlation between RACE
and the unobservable variables such as wealth, bequests, documentation, deposits with the
lending institution, and bargaining power. This is a strong assumption, as the observables are
likely to have a higher correlation with RACE than the unobservables.19 Additionally, as Altonji
et al. (2005a, 2005b) argue, researchers do not pick observable variables randomly, but rather to
get the highest fit in their regression model.
We calculate the correlation coefficient between the race variable and observable
variables, and then econometrically restrict the correlation coefficient between the race variable
and the unobservable variables to be the same as the correlation coefficient between the race
variable and observable variables in the FIML regression.20 The results are shown in Panel C of
Table VIII.
19 See Table III, wherein the share of outcomes correctly predicted by the observables varies from 65% to 88%. We
do not focus on the pseudo R2 statistic, because it has less information content and is known to be low in limited
dependent variable regressions (see Greene 2003).
20 This FIML specification restricts ρ to be greater than zero, whereas the FIML specification in Table V makes no
restriction.
We expect unobservable variables such as wealth and bequests, deposits with the
lender, and documentation to have much more impact on whether the borrower gets a loan in the
subprime market than in the prime market. We find evidence in support of this argument, as our
observable variables can explain 84-88% of the accept/reject decision in the prime market, which
is higher than the corresponding 62-64% in the subprime market (see Table V). This is also
confirmed at the census-tract level (see Table VI), wherein our 2SLS regressions capture 65% of
the variation in the prime market and only 14% in the subprime market. As the ratio of
unobservables to observables is likely to be higher in the subprime market than in the prime
market, the bias is likely to be higher in the subprime market. Therefore, we expect the results
to be closer to the single-equation results in the prime market, and closer to the FIML
results in the subprime market. These results confirm our support for the information-based
theory of discrimination.
To test the robustness of our results and whether the same results hold during the housing
boom period from 2000 to 2005, we reran the individual loan-level FIML regression and the
neighborhood-level 2SLS regression for 2000-2005. The results are similar to those for the full
sample period from 2000 to 2008, and are given in Table IX. This suggests that our main results
are not dependent on the sample period used.
***Table IX***
4.3 Are the lower rejection rates for minorities in the subprime market due to increases in
credit supply?
Mian and Sufi [2009] and Keys, et al. [2010], among others, have suggested that much of
the current credit crisis in the subprime market has arisen from the large increase in credit supply
provided by lax screening incentives inherent in the lender’s originate-to-distribute model. Mian
and Sufi [2009] find that zip codes that in 1996 -- a period before the credit expansion -- had either
high denial rates or a high share of subprime borrowers got much greater credit in 2002-2005 than
other zip codes. Such borrowers obtained increased credit and had higher default rates despite having
lower income and employment growth rates (proxies for increased mortgage demand) during that
period. We use their methodology to isolate the supply effect by using the proportion of the
census tract that was rejected in 1996.
Specifically, to separate the change in demand from the change in supply, in the first-stage
regression we regress the change in rejection rates on the changes in borrower (without
the race variable), loan, property, and macroeconomic risks between 1996 and 2008. This
first-stage regression represents the changes in rejection rates due to credit demand effects. We take
the residuals from this first-stage regression and regress them on the difference in the proportion of
minority borrowers between 1996 and 2008, or on the proportion of minority borrowers in 1996.
The results are presented in Panel A of Table X. For both measures of racial composition, we
find that minority-concentrated neighborhoods experienced a disproportionately larger reduction in
subprime loan rejection rates, even after controlling for the changes in demand-side risk
characteristics.
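The two-step procedure described above can be sketched as follows. This is a minimal illustration only, not the estimation code used for Table X: the function and argument names (`ols`, `residuals`, `supply_effect`, `d_reject`, `d_risk`, `minority`) are hypothetical, and the tiny pure-Python least-squares solver stands in for whatever statistical package was actually used.

```python
def ols(X, y):
    """Least-squares coefficients for y = X b + e via the normal equations.
    X is a list of rows; include a leading 1.0 in each row for the intercept."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[r][k] for r in range(k)]


def residuals(X, y, b):
    """Residuals y - X b."""
    return [yi - sum(xj * bj for xj, bj in zip(r, b)) for r, yi in zip(X, y)]


def supply_effect(d_reject, d_risk, minority):
    """Stage 1: change in tract rejection rate on changes in risk controls
    (the demand side). Stage 2: stage-1 residuals on a minority-composition
    measure. Returns the stage-2 slope (hypothetical helper)."""
    X1 = [[1.0] + list(r) for r in d_risk]
    resid = residuals(X1, d_reject, ols(X1, d_reject))
    X2 = [[1.0, m] for m in minority]
    return ols(X2, resid)[1]
```

A negative second-stage slope would correspond to the pattern reported in the text: rejection rates fell more in tracts with a higher minority share than the change in risk characteristics alone predicts.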
***Table X***
To ensure further that the reduction in subprime rejection rates for minorities was not
driven by higher growth in income or employment, we sort the data into quartiles according
to the proportion of minority borrowers. The top quartile includes those census tracts that have
the lowest minority concentration, while the bottom quartile includes those census tracts that
have the highest minority concentration. In Panel B, we find that income and employment growth
in neighborhoods with high minority concentration was either lower than, or not statistically
significantly different from, that in neighborhoods with low minority concentration. The results
suggest that the reduction in subprime rejection rates between 1996 and 2008 was driven not by
demand-related income or employment growth, but by an expansion of credit from the supply side.
These results are consistent with those of Mian and Sufi [2009] and Keys, et al. [2010], who also
find strong credit supply effects.
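The quartile comparison can be illustrated with a short sketch. The function name and inputs are hypothetical, and the sketch only computes group means; the significance tests reported in Panel B of Table X are not reproduced here.

```python
def quartile_means(minority_share, growth):
    """Sort tracts by minority share, split them into four equal-sized
    groups, and return mean growth per group. Index 0 is the quartile with
    the lowest minority concentration, index 3 the highest."""
    pairs = sorted(zip(minority_share, growth))
    n = len(pairs)
    means = []
    for q in range(4):
        chunk = pairs[q * n // 4:(q + 1) * n // 4]
        means.append(sum(g for _, g in chunk) / len(chunk))
    return means
```

Comparing the first and last entries of the returned list mirrors the comparison of low- and high-minority-concentration neighborhoods described above.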
6. Conclusions
Given the importance of real estate in household portfolios and the stated objective of
regulators to ensure fair and equal access to housing, researchers have examined whether
African-Americans and Hispanics are discriminated against by lenders (see, for example, Black,
Schweitzer, and Mandell [1978]; King [1980]; Schafer and Ladd [1981]; and Munnell, et al.
[1996], among others). These studies have examined whether minority applicants are rejected more
often for prime mortgage loans than white applicants (referred to as the accept/reject decision). We
examine discrimination in both prime and subprime markets, the latter being the markets in
which a large proportion of minority borrowers obtain their mortgages.
We replicate existing studies using a single-equation probit analysis. We find that race is
positively correlated with the decision to reject at the individual loan level for prime markets.
This finding is consistent with the previous literature cited above. When we examine the subprime
market, we find that African-American or minority borrowers are less likely to be rejected than
white borrowers. However, we find that race is correlated with observable risks using 14 risk
measures. Using Rivers-Vuong and likelihood ratio tests, we also show that race is
positively correlated with unobservable risks (such as wealth, gifts and bequests, and loan
documentation) in the accept/reject model. The fact that race is correlated with observable and
unobservable risks in the single-equation model violates the assumption that race is
uncorrelated with the error term, resulting in biased parameter estimates. We use two methods to
ameliorate the bias, FIML and 2SLS. We use an instrumental variable for race, namely, the number of
people who attend a predominantly African-American church. We argue that such a religious status
variable is unlikely to be in the lending officer's information set, and find that its inclusion does
not significantly change our main result. Using both FIML and 2SLS methods, we find race to be
positively (negatively) related to the decision to reject in prime (subprime) markets. These
results suggest that African-American or minority borrowers are more (less) likely to be rejected
than white borrowers in the prime (subprime) market, confirming the information-based theory
of discrimination.
We also find that the reduction in rejection rates to minority neighborhoods from 1996 to
2008 cannot be fully justified by risk, suggesting a relaxation of lending standards to minority
neighborhoods. In doing so, we have controlled for home price levels at the MSA level. Our
results for strong credit supply effects are consistent with those in Mian and Sufi [2009], among
others, who also find reductions in loan denial rates to neighborhoods despite significant
deterioration in credit quality.
Author Affiliations
COLUMBIA BUSINESS SCHOOL, COLUMBIA UNIVERSITY
RUTGERS BUSINESS SCHOOL, RUTGERS UNIVERSITY
COLLEGE OF BUSINESS ADMINISTRATION, CALIFORNIA STATE POLYTECHNIC UNIVERSITY, POMONA
References
Altonji, Joseph G., Ulrich Doraszelski, and Lewis Segal, “Black/White Differences in Wealth,”
Federal Reserve Bank of Chicago Economic Perspectives, 24 (1) (2000), 38-50.
Baher Azmy, Squaring the Predatory Lending Circle, 57 FLA. L. REV. 295, 307 (2005).
Berkovec, James A., Glenn B. Canner, Stuart A. Gabriel, and Timothy H. Hannan,
“Discrimination, Competition, and Loan Performance in FHA Mortgage Lending,” Review of
Economics and Statistics, 80 (2) (1998), 241-250.
Bertrand, Marianne, and Sendhil Mullainathan, “Are Emily and Greg More Employable than
Lakisha and Jamal? A Field Experiment on Labor Market Discrimination,” American Economic
Review, 94(4) (2004), 991-1013.
Black, Harold A., Robert L. Schweitzer, and Lewis Mandell, “Discrimination in Mortgage
Lending,” American Economic Review, 68(2) (1978), 186-191.
Blau, Francine D., and John W. Graham,“Black-White Differences in Wealth and Asset
Composition,” Quarterly Journal of Economics, 105(2) (1990), 321-339.
Blinder, Alan S., “Six Fingers of Blame in the Mortgage Mess,” New York Times. September 30,
2007.
Blinder, Alan S., “Six Blunders En Route to a Crisis,” New York Times, January 25, 2009.
Bound, John, David A. Jaeger, and Regina M. Baker, “Problems with Instrumental Variables
Estimation When the Correlation between the Instruments and Endogenous Explanatory
Variables is Weak,” Journal of American Statistical Association, 90 (1995), 443-450.
Brunnermeier, Markus K., “Deciphering the Liquidity and Credit Crunch 2007-2008,” Journal of
Economic Perspectives 23(1) (2009), 77-100.
Calomiris, Charles W., “The Subprime Turmoil: What’s Old, What’s New, and What’s Next?”
Working Paper, Columbia Business School, 2008.
Calem, Paul S., Kevin Gillen, and Susan Wachter, “The Neighborhood Distribution of Subprime
Mortgage Lending,” Journal of Real Estate Finance and Economics, 29 (4) (2004), 393-410.
Campbell, John Y, “Household Finance,” Journal of Finance, 61 (2006), 1553-1604.
Charles, Kerwin K., and Erik Hurst, “The Transition to Home-Ownership and the Black-White
Wealth Gap,” Review of Economics and Statistics, 84(2) (2002), 281-297.
Cochrane, John, “Portfolio Theory,” Working Paper, University of Chicago, 2007.
Courchane, Marsha, Adam Gailey, and Peter Zorn, “Consumer Credit Literacy: What Price
Perception?” Working Paper, Federal Reserve Bank of Chicago, 2007.
Day, Theodore E., and Stan J. Liebowitz, “Mortgages, Minorities, and HMDA,” Paper presented
at the Federal Reserve Bank of Chicago, April 1996.
Day, Theodore E., and Stan J. Liebowitz, “Mortgage Lending to Minorities: Where’s the Bias,”
Economic Inquiry, 34 (1998), 3–28.
Demyanyk, Yuliya, and Otto Van Hemert, “Understanding the Subprime Mortgage Crisis,”
Review of Financial Studies, 24(6) (2011), 1848-1880.
Evans, William N., Wallace E. Oates, and Robert M. Schwab, “Measuring Peer Group Effects: A
Study of Teenage Behavior,” Journal of Political Economy, 100 (1992), 966-991.
Evans, William N., and Robert M. Schwab, “Finishing High School and Starting College: Do
Catholic Schools Make a Difference?” Quarterly Journal of Economics, 110 (1995), 941-974.
Fershtman, Chaim, and Uri Gneezy, “Discrimination in a Segmented Society: An Experimental
Approach.” Quarterly Journal of Economics,115 (1) (2001), 351-377.
Flavin, Marjorie, and Takashi Yamashita, “Owner-occupied Housing and Composition of the
Household Portfolio,” American Economic Review, 92 (2002), 345-362.
Gabriel, Stuart A., and Stuart S. Rosenthal, “Credit Rationing, Race and the Mortgage Market,”
Journal of Urban Economics, 29 (1991), 371-379.
Gittleman, Maury, and Edward N. Wolff, “Racial Wealth Disparities: Is the Gap Closing?”
Jerome Levy Economics Institute. Working Paper No. 311, 2000.
Gan, Jie, and Timothy. J. Riddiough, “Monopoly and Informational Advantage in the Residential
Mortgage Market,” Review of Financial Studies, 21(6) (2008), 2677-2703.
Gerardi, Kristopher, Adam H. Shapiro, and Paul S. Willen, “Subprime Outcomes: Risky
Mortgages, Homeownership Experiences, and Foreclosures,” Working Paper, Federal Reserve
Bank of Boston, 2008.
Gramlich, Edward, Subprime Mortgages: America’s Latest Boom and Bust. (Washington, D.C.:
Urban Institute Press.) 2005.
Greene, William H., “Gender Economics Courses in Liberal Arts Colleges: Further Results,”
Journal of Economic Education, 29 (1998), 291-300.
Greene, William H. 2003. Econometric Analysis, 5th Edition, (Upper Saddle River: Prentice
Hall.).
Hausman, Jerry, and David Wise, “Attrition Bias in Experimental and Panel Data: The Gary
Income Maintenance Experiment,” Econometrica 47(2) (1979), 455-473.
Heckman, James J., “Detecting Discrimination.” Journal of Economic Perspectives 12(2) (1998),
101-116.
Heckman, James J. and Jeffrey Smith, “Assessing the Case For Social Experiments” Journal of
Economic Perspectives 9(2) (1992), 85-110.
Holmes, Andrew, and Paul. Horvitz, “Mortgage Redlining: Race, Risk and Demand,” Journal of
Finance, 49 (1) (1994), 81-99.
Horn, David K., “Evaluating the Role of Race in Mortgage Lending.” FDIC Banking Review,
(Spring/Summer) (1994), 1-15.
Horn, David K., “Mortgage Lending, Race and Model Specification,” Journal of Financial
Services Research 11(1-2), (1997), 42-68.
Hubbard, R. Glenn, “Social Security, Liquidity Constraints, and Pre-Retirement Consumption,"
Southern Economic Journal, 51 (1985), 471-484.
Hubbard, R. Glenn and Christopher Mayer, “House Prices, Interest Rates, and the Mortgage
Market Meltdown,” Working Paper, Columbia Business School, 2008.
Keys, Benjamin J., Tanmoy Mukherjee, Amit Seru, and Vikrant Vig, “Did Securitization Lead to
Lax Screening? Evidence from Sub-Prime Loans,” Quarterly Journal of Economics, 125(1)
(2010), 307-362.
King, Alvin T., “Discrimination in Mortgage Lending: A Study of Three Cities,” Office of
Policy and Economic Research, Federal Home Loan Bank Board, Research Working Paper no.
91, 1980.
LaCour-Little, Michael, “Discrimination in Mortgage Lending: A Critical Review of the
Literature,” Journal of Real Estate Literature, 7 (1999), 15-49.
Lax, Howard, Michael Matni, Paul Raca and Peter Zorn, “Subprime Lending: An Investigation
of Economic Efficiency,” Housing Policy Debate, 15(3), 533-571.
Levitt, Steven, "Testing Theories of Discrimination: Evidence From Weakest Link," Journal of
Law and Economics, 47(2) (2004), 431-452.
Levitt, Steven, and John A. List, “Field Experiments in Economics: The Past, the Present, and
the Future.” European Economic Review, 53 (2009), 1-18.
Liebowitz, Stan J., “A Study That Deserves No Credit,” Wall Street Journal, September 1, p.
A14, 1993.
Manski, Charles F., “Learning about Treatment Effects from Experiments with Random
Assignment of Treatments,” Journal of Human Resources, 31(4) (1996), 707-733.
Mayer, Christopher, Karen Pence, and Shane M. Sherlund, “The Rise in Mortgage Defaults,”
Journal of Economic Perspectives, 23(1) (2009), 27-50.
Mayer, Christopher, and Karen Pence, “Subprime Mortgages: What, Where, and to Whom?” In
Glaeser, Edward and John Quigley, editors, Housing and the Built Environment: Access,
Finance, Policy, Lincoln Land Institute of Land Policy, Cambridge MA, 2008.
Mian, A., and A. Sufi. 2009. “The Consequences of Mortgage Credit Expansion: Evidence of the
U.S. Mortgage Default Crisis.” Quarterly Journal of Economics 124, 1449-1496.
Mortgage Market Statistical Annual, 2011.
Munnell, Alicia H., Geoffrey M. B. Tootell, Lynn E. Browne, and James McEneaney, “Mortgage
Lending in Boston: Interpreting HMDA Data,” American Economic Review 86(1) (1996), 25–53.
Office of Thrift Supervision, 2000. “What About Subprime Mortgages?” Mortgage Market
Trends, Volume 4 Issue 1.
Piazzesi, Monika, Martin Schneider, and Selale Tuzel, “Housing, Consumption and Asset
Pricing,” Journal of Financial Economics, 83 (2007), 531-569.
Rachlis, Mitchell B., and Anthony M. J. Yezer, “Serious Flaws in Statistical Tests for
Discrimination in Mortgage Markets.” Journal of Housing Research, 42 (1993), 315 – 336.
Rivers, Douglas, and Quang H. Vuong, “Limited Information Estimators and Exogeneity Tests
for Simultaneous Probit Models,” Journal of Econometrics 39(3) (1988), 347–366.
Ross, Stephen, and John Yinger, “Does Discrimination in Mortgage Lending Exist? The Boston
Fed Study and Its Critics,” In Margery Austin Turner and Felicity Skidmore, eds., Mortgage
Lending Discrimination: A Review of Existing Evidence. Washington, DC: Urban Institute, 43-
83, 1999.
Scheessele, Randall M, “Black and White Disparities in Subprime Mortgage Refinance
Lending,” Housing Finance Working Paper Series, HF-014, U.S. Department of Housing and
Urban Development, 2002.
Scholz, John K., and Kara Levine, “U.S. Black-White Wealth Inequality: A Survey,” in Social
Inequality, K. Neckerman (ed.), Russell Sage Foundation (2004), 895-929.
Schafer, Robert, and Helen F. Ladd, Discrimination in Mortgage Lending. (Cambridge: MIT
Press, 1981).
Staiger, Douglas, and James H. Stock. 1997. “Instrumental Variables Regression with Weak
Instruments,” Econometrica 65, 557-586.
Stengel, Mitchell, and Dennis Glennon, “Evaluating Statistical Models of Mortgage Lending
Discrimination: A Bank-Specific Analysis.” Real Estate Economics, 27 (1999), 299-334.
Stiglitz, Joseph, “The House of Cards.” The Guardian, October 9, 2007.
Turner, Margery A., Stephen L. Ross, George C. Galster, and John Yinger, Discrimination in
Metropolitan Housing Market: National Results from Phase 1 of HDS2000. Washington D.C.,
2002.
Yezer, Anthony M. J., Robert F. Phillips, and Robert P. Trost, “Bias in Estimates of
Discrimination and Default in Mortgage Lending: The Effects of Simultaneity and Self-
Selection,” Journal of Real Estate Finance and Economics, 9 (1994), 197-215.
Wooldridge, Jeffrey M., Econometric Analysis of Cross Section and Panel Data, (Cambridge:
The MIT Press, 2002).
Zandi, Mark, “Boston Fed’s Study Was Deeply Flawed,” American Banker, August 19, 1993.
Appendix A
Full-Information Maximum Likelihood (FIML) for
Probit Regression with Binary Endogenous Variable
Greene [2003] and Wooldridge [2002] both show that it is inappropriate to use two-step
procedures in estimating a probit regression with a binary endogenous variable. In a two-step
procedure, such as in two-stage least squares (2SLS), one typically substitutes the fitted value for
the endogenous variable. Yet, when the endogenous variable is binary, the function is nonlinear.
Substituting the fitted values will not produce consistent estimators in nonlinear systems.
It is also interesting to note that using the FIML, we do not need an exclusion restriction,
or instrumental variable for the binary endogenous variable. We illustrate this point using the
following three models.
Model 1: Standard Probit Without Endogeneity
$$y_{1i} = 1\left[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0\right], \tag{1}$$
where $y_{2i}$ is exogenous. The log likelihood function is:
$$\ln L = \sum_i \left\{ y_{1i}\ln\Phi\left(x_{1i}\beta_1 + y_{2i}\beta_2\right) + (1-y_{1i})\ln\left[1-\Phi\left(x_{1i}\beta_1 + y_{2i}\beta_2\right)\right] \right\},$$
where $\Phi$ is the standard normal cumulative distribution function. The maximum likelihood estimates (MLEs)
of the parameters are obtained by taking the partial derivatives of $\ln L$ with respect to $\beta_1, \beta_2$ and
setting the first-order conditions equal to zero.
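As an illustration of the Model 1 likelihood, the sketch below evaluates $\ln L$ for given parameter values. It is a minimal sketch under stated assumptions: the function names are hypothetical, $\beta_2$ is treated as a scalar coefficient on $y_{2i}$, and no maximization is performed (in practice one would pass this function to a numerical optimizer).

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_loglik(beta1, beta2, x1, y2, y1):
    """Log likelihood of the standard probit (Model 1).
    x1: list of regressor rows; beta1: matching coefficient list;
    beta2: scalar coefficient on the exogenous y2; y1: 0/1 outcomes."""
    total = 0.0
    for x1i, y2i, y1i in zip(x1, y2, y1):
        index = sum(b * x for b, x in zip(beta1, x1i)) + beta2 * y2i
        p = norm_cdf(index)
        total += y1i * math.log(p) + (1 - y1i) * math.log(1.0 - p)
    return total
```

With all coefficients set to zero every observation has fitted probability 0.5, so the function returns $n \ln 0.5$, which is a convenient sanity check.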
Model 2: Probit with Continuous Endogenous Variable
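$$y_{1i} = 1\left[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0\right] \tag{1}$$
$$y_{2i} = x_{2i}\gamma_1 + v_{2i}, \tag{2}$$
where $y_{2i}$ is a continuous variable, and $u_{1i}$ and $v_{2i}$ are assumed to be bivariate normal, each with
mean zero, and with covariance matrix
$$\begin{pmatrix} 1 & \rho\sigma_2 \\ \rho\sigma_2 & \sigma_2^2 \end{pmatrix}.$$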
This model can be estimated by two methods: the Rivers and Vuong [1988] two-step
procedure and the maximum likelihood estimation (MLE). To compare with the other two
models (models 1 and 3), we only describe MLE in this Appendix.
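Assume all the variables in $x_{1i}$ are also included in $x_{2i}$. Then Evans, Oates, and Schwab
[1992] and Wooldridge [2002] derive the log likelihood function as follows. The joint
distribution of $(y_{1i}, y_{2i})$ conditional on $x_{2i}$ is
$$f(y_{1i}, y_{2i} \mid x_{2i}) = f(y_{1i} \mid y_{2i}, x_{2i})\, f(y_{2i} \mid x_{2i}).$$
Because $y_{2i} \mid x_{2i} \sim \text{Normal}\left(x_{2i}\gamma_1, \sigma_2^2\right)$,
$$f(y_{2i} \mid x_{2i}) = \frac{1}{\sqrt{2\pi\sigma_2^2}} \exp\left[-\frac{\left(y_{2i} - x_{2i}\gamma_1\right)^2}{2\sigma_2^2}\right].$$
Because $v_{2i} = y_{2i} - x_{2i}\gamma_1$ and $y_{1i} = 1\left[y_{1i}^* > 0\right]$,
$$P(y_{1i} = 1 \mid y_{2i}, x_{2i}) = \Phi\left[\frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)\left(y_{2i} - x_{2i}\gamma_1\right)}{\sqrt{1-\rho^2}}\right];$$
$$P(y_{1i} = 0 \mid y_{2i}, x_{2i}) = 1 - \Phi\left[\frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)\left(y_{2i} - x_{2i}\gamma_1\right)}{\sqrt{1-\rho^2}}\right].$$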
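Therefore, the log-likelihood function for this probit model with a continuous
endogenous variable is:
$$\ln L = \sum_i \left\{ y_{1i}\ln\Phi(z_i) + (1-y_{1i})\ln\left[1-\Phi(z_i)\right] - .5\ln\left(2\pi\sigma_2^2\right) - .5\,\frac{1}{\sigma_2^2}\left(y_{2i} - x_{2i}\gamma_1\right)^2 \right\},$$
where
$$z_i = \frac{x_{1i}\beta_1 + y_{2i}\beta_2 + (\rho/\sigma_2)\left(y_{2i} - x_{2i}\gamma_1\right)}{\sqrt{1-\rho^2}},$$
$$\sum_i \ln f(y_{1i} \mid y_{2i}, x_{2i}) = \sum_i \left\{ y_{1i}\ln\Phi(z_i) + (1-y_{1i})\ln\left[1-\Phi(z_i)\right] \right\},$$
and
$$\sum_i \ln f(y_{2i} \mid x_{2i}) = \sum_i \left\{ -.5\ln\left(2\pi\sigma_2^2\right) - .5\,\frac{1}{\sigma_2^2}\left(y_{2i} - x_{2i}\gamma_1\right)^2 \right\}.$$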
Maximizing the log-likelihood function with respect to all parameters gives the MLEs
of $\beta_1, \beta_2, \gamma_1, \sigma_2, \rho$. It is clear that $y_{2i}$ enters the log-likelihood function through the density
functions $f(y_{1i} \mid y_{2i}, x_{2i})$ and $f(y_{2i} \mid x_{2i})$. When we take the partial derivatives of $\ln L$ with respect to
the parameters, $y_{2i}$ plays an important role in solving the first-order conditions.21 Therefore, we
need at least one variable in $x_{2i}$ that is not in $x_{1i}$ for the system of equations to be identified.
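The Model 2 log likelihood can be evaluated term by term as in the sketch below. This is an illustrative sketch, not the authors' code: the function names are hypothetical, $\beta_2$ is assumed to be a scalar, and the four terms inside the loop correspond to the four terms of $\ln L$ for a probit with a continuous endogenous regressor.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def loglik_continuous_endog(beta1, beta2, gamma1, sigma2, rho, x1, x2, y1, y2):
    """Log likelihood for a probit with a continuous endogenous regressor.
    beta1, gamma1: coefficient lists; beta2, sigma2, rho: scalars."""
    total = 0.0
    for x1i, x2i, y1i, y2i in zip(x1, x2, y1, y2):
        # Reduced-form residual v2 = y2 - x2 * gamma1.
        v2 = y2i - sum(g * x for g, x in zip(gamma1, x2i))
        index = sum(b * x for b, x in zip(beta1, x1i)) + beta2 * y2i
        z = (index + (rho / sigma2) * v2) / math.sqrt(1.0 - rho ** 2)
        p = norm_cdf(z)
        # Terms 1 and 2: conditional probit part, ln f(y1 | y2, x2).
        total += y1i * math.log(p) + (1 - y1i) * math.log(1.0 - p)
        # Terms 3 and 4: normal density part, ln f(y2 | x2).
        total += -0.5 * math.log(2.0 * math.pi * sigma2 ** 2)
        total += -0.5 * v2 ** 2 / sigma2 ** 2
    return total
```

Note that `y2i` enters both through `z` and through the residual `v2`, which is the dependence on $y_{2i}$ that the text refers to.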
Model 3: Probit with Binary Endogenous Variable
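$$y_{1i} = 1\left[x_{1i}\beta_1 + y_{2i}\beta_2 + u_{1i} > 0\right] \tag{1}$$
$$y_{2i} = 1\left[x_{2i}\gamma_1 + v_{2i} > 0\right], \tag{2}$$
where $y_{2i}$ is a binary variable, and $u_{1i}$ and $v_{2i}$ are assumed to be bivariate normal, each with
mean zero, and with covariance matrix
$$\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$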
Wooldridge [2002] suggests using the FIML to estimate such a model, but does not
derive the maximum likelihood function. Greene [2003] shows that the correlation between the
endogenous variable and the disturbance term ($\rho$) is built into the likelihood function.
Specifically, Greene [2003] shows that:
$$\text{Prob}(y_{1i}=1, y_{2i}=1 \mid x_{1i}, x_{2i}) = \Phi_2\left(x_{1i}\beta_1 + \beta_2,\; x_{2i}\gamma_1,\; \rho\right)$$
$$\text{Prob}(y_{1i}=1, y_{2i}=0 \mid x_{1i}, x_{2i}) = \Phi_2\left(x_{1i}\beta_1,\; -x_{2i}\gamma_1,\; -\rho\right)$$
$$\text{Prob}(y_{1i}=0, y_{2i}=1 \mid x_{1i}, x_{2i}) = \Phi_2\left(-(x_{1i}\beta_1 + \beta_2),\; x_{2i}\gamma_1,\; -\rho\right)$$
$$\text{Prob}(y_{1i}=0, y_{2i}=0 \mid x_{1i}, x_{2i}) = \Phi_2\left(-x_{1i}\beta_1,\; -x_{2i}\gamma_1,\; \rho\right)$$
where $\Phi_2$ is the cumulative bivariate-normal distribution function. Greene [1998] proved that the log-
likelihood function for probit with binary endogenous variable is exactly the same as the log-
likelihood function for a bivariate probit model.22 In doing so, Greene [1998, p. 295] states
that “the counterintuitive result is that in the bivariate probit model, unlike in the linear
simultaneous equations model, if the two dependent variables are jointly determined, we just put
each on the right-hand side of the other equation [in our case, one of them] and proceed as if
there were no simultaneity problem.” Greene [2003, p. 716] further states that “we can ignore the
simultaneity in this model and we cannot in the linear regression model because, in this instance,
we are maximizing the log likelihood, whereas in the linear regression case, we are manipulating
certain sample moments that do not converge to the necessary population parameters in the
presence of simultaneity.” The log-likelihood function for this probit model with a binary
endogenous variable is the same as the bivariate probit and can be written as:
$$\begin{aligned}
\ln L = \sum_i \Big\{ & 1\left[y_{1i}=1, y_{2i}=1\right] \ln\Phi_2\left(x_{1i}\beta_1 + \beta_2,\; x_{2i}\gamma_1,\; \rho\right) \\
& + 1\left[y_{1i}=1, y_{2i}=0\right] \ln\Phi_2\left(x_{1i}\beta_1,\; -x_{2i}\gamma_1,\; -\rho\right) \\
& + 1\left[y_{1i}=0, y_{2i}=1\right] \ln\Phi_2\left(-(x_{1i}\beta_1 + \beta_2),\; x_{2i}\gamma_1,\; -\rho\right) \\
& + 1\left[y_{1i}=0, y_{2i}=0\right] \ln\Phi_2\left(-x_{1i}\beta_1,\; -x_{2i}\gamma_1,\; \rho\right) \Big\},
\end{aligned}$$
where $1[\cdot]$ is the indicator function. Because of the binary nature of $y_{2i}$, it does not appear in the
joint density function. Further, when we expand the summation, $y_{2i}$ does not appear in the log
likelihood function. Assume the data for $i$ observations read as $y_{11}=1, y_{12}=1$;
$y_{21}=1, y_{22}=0$; $y_{31}=0, y_{32}=1$; $y_{41}=0, y_{42}=0$; ...; $y_{i1}=1, y_{i2}=1$. Then the log-
likelihood can be written as:
$$\begin{aligned}
\ln L = {} & \ln\Phi_2\left(x_{11}\beta_1 + \beta_2,\; x_{12}\gamma_1,\; \rho\right) + \ln\Phi_2\left(x_{21}\beta_1,\; -x_{22}\gamma_1,\; -\rho\right) + \ln\Phi_2\left(-(x_{31}\beta_1 + \beta_2),\; x_{32}\gamma_1,\; -\rho\right) \\
& + \ln\Phi_2\left(-x_{41}\beta_1,\; -x_{42}\gamma_1,\; \rho\right) + \dots + \ln\Phi_2\left(x_{i1}\beta_1 + \beta_2,\; x_{i2}\gamma_1,\; \rho\right).
\end{aligned}$$
When we take the partial derivatives of this log likelihood function with respect to the
parameters, $y_{2i}$ does not play any role in the first-order conditions. Therefore, we do not need an
exclusion restriction (i.e., at least one variable in $x_{2i}$ that is not in $x_{1i}$) for the system of equations
to be identified.
21 $y_{2i}$ shows up in the first-order conditions through the partial derivative of the normal density function $\phi(z)$ in the first and second term, and the partial derivative of the fourth term in the log likelihood function.
22 A proof of this result was suggested in Maddala [1983, p. 123] and pursued in Greene [1998].
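The four cell probabilities and the resulting log likelihood can be sketched in a few lines. This is an illustration under stated assumptions, not the estimation routine behind the reported FIML results: all function names are hypothetical, $\beta_2$ is a scalar, and $\Phi_2$ is approximated here by Simpson's rule on the one-dimensional integral $\int_{-\infty}^{a}\phi(x)\,\Phi\big((b-\rho x)/\sqrt{1-\rho^2}\big)\,dx$ rather than by a library routine. The sign-flipping device $q = 2y - 1$ reproduces the four cases listed above in a single expression.

```python
import math


def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bvn_cdf(a, b, rho, n=400, lower=-8.0):
    """Approximate Phi_2(a, b; rho) with Simpson's rule (n must be even).
    The lower limit -8 and the cap at +8 truncate negligible tail mass."""
    if a <= lower:
        return 0.0
    a = min(a, 8.0)
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / n

    def f(x):
        return norm_pdf(x) * norm_cdf((b - rho * x) / s)

    total = f(lower) + f(a)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lower + k * h)
    return total * h / 3.0


def cell_prob(y1, y2, w1, w2, rho):
    """Prob(y1, y2) with w1 = x1*beta1 + beta2*y2 and w2 = x2*gamma1."""
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1
    return bvn_cdf(q1 * w1, q2 * w2, q1 * q2 * rho)


def biprobit_loglik(beta1, beta2, gamma1, rho, x1, x2, y1, y2):
    """Log likelihood of the probit with a binary endogenous regressor."""
    total = 0.0
    for x1i, x2i, y1i, y2i in zip(x1, x2, y1, y2):
        w1 = sum(b * x for b, x in zip(beta1, x1i)) + beta2 * y2i
        w2 = sum(g * x for g, x in zip(gamma1, x2i))
        total += math.log(cell_prob(y1i, y2i, w1, w2, rho))
    return total
```

Two checks are useful when experimenting with such a sketch: $\Phi_2(0, 0; \rho) = 1/4 + \arcsin(\rho)/(2\pi)$, and the four cell probabilities for a given observation should sum to one.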
Appendix B
Rivers-Vuong Endogeneity Test for
Binary Endogenous Variable in a Probit Regression
When the dependent variable is continuous, one usually performs the Durbin-Wu-
Hausman test for endogeneity. As a first step in the Durbin-Wu-Hausman test, one typically
regresses the endogenous variable on all exogenous variables in the system and obtains the
residual. The residual is then included as an additional regressor in the original OLS regression. If
the coefficient on the residual is statistically significant, then exogeneity is rejected.
Yet, in a probit model with binary endogenous variable, the Durbin-Wu-Hausman test is
not applicable mainly because the usual probit standard errors and test statistics are not strictly
valid. To test for exogeneity in simultaneous equation models with limited dependent variables,
Smith and Blundell [1986] developed a test for tobit models and Rivers and Vuong [1988]
developed a test for the probit model. Wooldridge [2002]23 recommends using the Rivers-Vuong
approach in testing for an endogenous binary variable in a probit model. The model is set up as
follows:
$$y_1 = 1\left[x_1\delta_1 + \alpha_1 y_2 + u_1 > 0\right] \tag{1}$$
$$y_2 = 1\left[x_1\delta_{21} + x_2\delta_{22} + v_2 > 0\right], \tag{2}$$
where $y_1$ is the binary dependent variable, $x_1$ is a set of exogenous variables, $y_2$ is the potentially
endogenous binary variable, and $u_1$ is the disturbance term. The variables $x_2$ are additional
independent variables in equation (2) that are not included in equation (1). The endogeneity in
the model arises from the correlation of $y_2$ with $u_1$. Rivers and Vuong [1988] assume that
$(u_1, v_2)$, the disturbance terms in equations (1) and (2), are independent of $x_1$ and $x_2$ and distributed as
bivariate normal with zero mean, unit variance, and $\rho = \text{Corr}(u_1, v_2)$. If $\rho \neq 0$, then $u_1$ and $y_2$
are correlated, and probit estimation of equation (1) is inconsistent for both $\delta_1$ and $\alpha_1$.
Showing that $u_1 = \theta_1 v_2 + e_1$ under joint normality of $(u_1, v_2)$, with $\text{Var}(u_1) = 1$, Rivers
and Vuong [1988] develop a simple two-step approach to test the endogeneity of $y_2$. The two
steps are similar to the Durbin-Wu-Hausman test. The first step involves estimating equation (2) to
get the residuals $\hat{v}_2$. In the second step, one estimates the probit of $y_1$ on $x_1$, $y_2$, and the residual.
One feature of Rivers and Vuong [1988] is that the usual probit t statistic on $\hat{v}_2$ is a valid
test of the null hypothesis that $y_2$ is exogenous, i.e., $H_0: \theta_1 = 0$. Yet, if $\theta_1 \neq 0$, the usual probit
standard errors and test statistics are not strictly valid and the parameters are estimated only up to
scale. The asymptotic variance of the estimated probit parameters needs to be adjusted to account
for the first stage estimation. The scaled probit coefficients need to be divided by a factor,
$\left(\hat{\theta}_{\rho 1}^2 \hat{\tau}_2^2 + 1\right)^{0.5}$, where $\theta_{\rho 1} = \theta_1 / \left(1 - \rho^2\right)^{1/2}$ and $\tau_2^2 = \text{Var}(v_2)$.
23 Wooldridge [2002, p. 478].
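The rescaling step at the end of the two-step procedure is simple arithmetic, sketched below. The function and argument names are hypothetical; `theta_rho_hat` stands for the second-step coefficient on the first-stage residual and `var_v2_hat` for the estimated variance of that residual.

```python
import math


def rescale_probit_coefs(scaled_coefs, theta_rho_hat, var_v2_hat):
    """Divide the second-step (scaled) probit coefficients by
    (theta_rho_hat**2 * var_v2_hat + 1) ** 0.5 to undo the scaling."""
    factor = math.sqrt(theta_rho_hat ** 2 * var_v2_hat + 1.0)
    return [c / factor for c in scaled_coefs]
```

For example, with `theta_rho_hat = 0.75` and `var_v2_hat = 1.0` the factor is 1.25, so a scaled coefficient of 1.0 becomes 0.8. When the coefficient on the residual is zero the factor is one and no adjustment is made, consistent with the test being valid under the null.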
Table I: Definition of Variables
Variable | Definition | Source | Regression
Dependent variables: | | |
REJECT | Dummy set to unity if the loan is rejected, zero otherwise | HMDA | REJECT
PCTREJECT | Percentage of loans rejected in the Census tract | HMDA | PCTREJECT
Race: | | |
AFRICAN | Dummy set to unity if borrower is African-American, zero if White | HMDA | REJECT
MINORITY | Dummy set to unity if the borrower is African-American or Hispanic, zero if White | HMDA | REJECT
PCTAFRICAN | Percentage of applicants African-American | HMDA | PCTREJECT
PCTMINORITY | Percentage of applicants African-American or Hispanic | HMDA | PCTREJECT
Control variables: | | |
Borrower risk characteristics: | | |
INCOME | Log of borrower income (in thousands, use median in PCTREJECT regression) | HMDA | REJECT and PCTREJECT
MEDFICO | Tract median FICO score | Credit Bureau | REJECT and PCTREJECT
DTI | Average non-mortgage debt/average income in the Tract | Credit Bureau and HMDA | REJECT and PCTREJECT
PCTCOLLEGE | Percentage of Tract population 25+ age with a Bachelor's Degree | Census | REJECT and PCTREJECT
UNEMPLOY | Unemployed civilian/(Unemployed civilian + Employed civilian) | Census | REJECT and PCTREJECT
Loan risk characteristics: | | |
LTV | Median loan amount/(Median house value * House price appreciation rate); this variable is split into three dummy variables: LTV80_90 (80%<LTV<=90%), LTV90_100 (90%<LTV<=100%) and LTV100 (LTV>100%) | Census, HMDA and FHFA | REJECT and PCTREJECT
CONVENTION | Dummy set to unity if the loan is a conventional loan, zero otherwise | HMDA | REJECT
AMOUNT | Loan amount | HMDA | REJECT
Property risk characteristics: | | |
MEDAGE | Median age of residential property in the Tract | Census | REJECT and PCTREJECT
PCTRENT | Renter-occupied housing units/total housing units | Census | REJECT and PCTREJECT
HVCHG | House price appreciation rate | FHFA | REJECT and PCTREJECT
NEWOWN | Percentage change in owner occupants between 1990 and 2000 | Census | REJECT and PCTREJECT
OWNEROCC | Dummy set to unity if the property is owner-occupied, zero otherwise | HMDA | REJECT and PCTREJECT
Lender characteristics: | | |
BANK | Set to unity if the lender is a commercial bank, savings, or thrift institution, and zero otherwise | HMDA | REJECT
HERF | Sum of squared market shares of lenders in the tract | HMDA | REJECT
Macroeconomic variables: | | |
PRIMERATE | Prime rate | St. Louis Fed | REJECT and PCTREJECT
TERM | Yield spread between the seven-year Treasury note and the three-month Treasury bill | St. Louis Fed | REJECT and PCTREJECT
DEFAULT | Difference between the yield of a Baa bond and a Aaa bond | St. Louis Fed | REJECT and PCTREJECT
Instrumental variable: | | |
BLACKCHURCH | Number of African American church members in the county | U.S. Religious Landscape Survey and RCMS | REJECT and PCTREJECT
This table shows the definitions of variables in loan-level accept/reject regression, loan price regression and the Census tract level PCTREJECT regression. The first column
shows the name of the variable, the second column gives the definitions of the variables, the third column shows the data sources for the variables and the fourth column shows
which regression the variables are used in. Regression REJECT corresponds to equation (1) and (2), and regression PCTREJECT corresponds to equation (3) and (4).
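Two of the constructed variables in Table I, HERF and the LTV dummies, are mechanical transformations that a short sketch can make concrete. The function names and the use of lender loan counts as the basis for market shares are assumptions for illustration; the table itself only states "sum of squared market shares of lenders in the tract" and the three LTV ranges.

```python
def herfindahl(loan_counts):
    """Sum of squared market shares, with shares computed from per-lender
    loan counts in a tract (an assumed share definition)."""
    total = sum(loan_counts)
    return sum((c / total) ** 2 for c in loan_counts)


def ltv_dummies(ltv):
    """Split an LTV ratio into the three dummies defined in Table I.
    LTV at or below 80% is the omitted category (all dummies zero)."""
    return {
        "LTV80_90": int(0.80 < ltv <= 0.90),
        "LTV90_100": int(0.90 < ltv <= 1.00),
        "LTV100": int(ltv > 1.00),
    }
```

For instance, three lenders with 50, 30, and 20 loans give a HERF of 0.25 + 0.09 + 0.04 = 0.38, and an LTV of 0.85 sets only LTV80_90 to one.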
Table II: Descriptive Statistics
All Loans Prime Loans Subprime Loans
Number Mean Median Std. Dev. Number Mean Median Std. Dev. Number Mean Median Std. Dev.
Dependent variables:
REJECT 2,026,556 0.26 0.00 0.44 1,714,003 0.23 0.00 0.42 312,553 0.46 0.00 0.50
Race:
AFRICAN 1,784,843 0.12 0.00 0.32 1,521,057 0.10 0.00 0.30 263,786 0.19 0.00 0.39
MINORITY 2,026,556 0.20 0.00 0.40 1,714,003 0.19 0.00 0.39 312,553 0.28 0.00 0.45
Control variables:
Borrower risk characteristics
INCOME 2,026,556 4.51 4.49 0.64 1,714,003 4.54 4.51 0.64 312,553 4.35 4.36 0.61
MEDFICO 2,026,556 707 720 65 1,714,003 712 724 63 312,553 681 695 70
DTI 2,026,556 0.87 0.94 0.35 1,714,003 0.86 0.94 0.35 312,553 0.91 0.97 0.33
PCTCOLLEGE 2,026,556 0.28 0.24 0.16 1,714,003 0.29 0.25 0.17 312,553 0.22 0.19 0.14
UNEMPLOY 2,026,556 0.06 0.05 0.04 1,714,003 0.06 0.04 0.04 312,553 0.07 0.06 0.05
Loan risk characteristics
LTV 2,026,556 0.76 0.74 0.20 1,714,003 0.76 0.73 0.20 312,553 0.79 0.76 0.18
CONVENTION 2,026,556 0.96 1.00 0.21 1,714,003 0.95 1.00 0.22 312,553 0.99 1.00 0.12
AMOUNT 2,026,556 220.73 192.99 330.59 1,714,003 221.49 190.00 352.56 312,553 216.34 197.00 150.79
Property risk characteristics
MEDAGE 2,026,556 31.95 33.00 12.68 1,714,003 31.57 32.00 12.69 312,553 34.10 35.00 12.36
PCTRENT 2,026,556 0.29 0.22 0.22 1,714,003 0.28 0.21 0.21 312,553 0.34 0.30 0.22
HVCHG 2,026,556 0.11 0.07 0.26 1,714,003 0.11 0.07 0.25 312,553 0.10 0.06 0.28
NEWOWN 2,026,556 0.15 0.05 0.42 1,714,003 0.16 0.05 0.41 312,553 0.11 0.03 0.45
OWNEROCC 2,026,556 0.92 1.00 0.27 1,714,003 0.91 1.00 0.28 312,553 0.93 1.00 0.25
Lender Characteristics
BANK 2,026,556 0.34 0.00 0.47 1,714,003 0.29 0.00 0.45 312,553 0.66 1.00 0.47
HERF 2,026,556 0.03 0.03 0.01 1,714,003 0.03 0.03 0.01 312,553 0.03 0.03 0.01
Macroeconomic Variables
PRIMERATE 2,026,556 4.45 4.34 0.63 1,714,003 4.45 4.34 0.64 312,553 4.43 4.34 0.60
TERM 2,026,556 2.46 2.50 0.27 1,714,003 2.46 2.50 0.27 312,553 2.46 2.50 0.26
DEFAULT 2,026,556 0.79 0.77 0.10 1,714,003 0.79 0.77 0.11 312,553 0.78 0.77 0.09
Instrumental variable for race:
BLACKCHURCH 2,026,556 24,435 22,846 18,179 1,714,003 24,005 22,846 17,761 312,553 26,917 23,253 20,246
This table shows summary descriptive statistics of the variables employed in the loan-level regression for a sample of 250,000 mortgage loans in New Jersey from 2000 to 2008 HMDA
data. The sample is broken down into prime loans (217,886) and subprime loans (32,114). All variables are defined in Table I.
Table III: Single-Equation Probit Regressions at the Loan-Level -- Race Uncorrelated
with Disturbance Term
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
RACE 0.079*** 0.020*** -0.036*** -0.002
(64.87) (22.58) (-13.30) (-1.02)
Borrower risk characteristics
INCOME -0.065*** -0.071*** -0.112*** -0.118***
(-92.69) (-102.98) (-55.90) (-63.33)
MEDFICO -0.035*** -0.048*** -0.017*** -0.009***
(-36.00) (-51.73) (-6.66) (-3.82)
DTI 0.051*** 0.048*** 0.053*** 0.050***
(40.80) (39.14) (14.61) (14.55)
PCTCOLLEGE -0.059*** -0.050*** 0.035*** 0.039***
(-19.69) (-17.02) (3.47) (4.18)
UNEMPLOY 0.137*** 0.145*** 0.154*** 0.125***
(11.61) (13.02) (4.66) (4.18)
Loan risk characteristics
LTV80_90 0.007*** 0.007*** -0.004 -0.005*
(7.35) (8.12) (-1.62) (-1.95)
LTV90_100 -0.002 -0.002 0.013*** 0.009**
(-1.32) (-1.33) (2.92) (2.41)
LTV100 -0.004** -0.006*** -0.020*** -0.010**
(-2.03) (-3.29) (-3.66) (-2.03)
CONVENTION 0.011*** 0.012*** -0.249*** -0.258***
(6.31) (7.44) (-30.93) (-33.65)
AMOUNT 0.020*** 0.022*** 0.300*** 0.301***
(9.62) (9.74) (36.93) (40.69)
Property risk characteristics
MEDAGE 0.000*** 0.000*** -0.000*** -0.000***
(9.27) (14.05) (-3.30) (-5.07)
PCTRENT 0.039*** 0.047*** 0.025*** 0.008
(16.91) (21.62) (3.63) (1.23)
HVCHG 0.006*** 0.004** -0.004 -0.000
(3.92) (2.48) (-0.78) (-0.11)
NEWOWN -0.011*** -0.010*** 0.001 0.002
(-11.07) (-10.94) (0.23) (0.63)
OWNEROCC -0.030*** -0.031*** -0.021*** -0.043***
(-22.70) (-24.39) (-5.13) (-11.52)
Lender Characteristics
BANK -0.099*** -0.110*** 0.043*** 0.032***
(-130.82) (-150.92) (20.26) (16.71)
HERF -0.156*** -0.176*** 0.734*** 0.761***
(-4.71) (-5.58) (6.93) (7.83)
Table III (continued)
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
Macroeconomic Variables
PRIMERATE -0.036*** -0.035*** -0.153*** -0.151***
(-12.50) (-12.20) (-15.89) (-16.51)
TERM -0.086*** -0.081*** -0.326*** -0.325***
(-12.73) (-12.29) (-14.64) (-15.30)
DEFAULT -0.036*** -0.035*** 0.043*** 0.032***
(-12.50) (-12.20) (20.26) (16.71)
Number of Observations 1,521,057 1,714,003 263,786 312,553
Percent correctly predicted 88.25 87.54 65.13 65.14
Log-likelihood value -69856.52 -77009.33 -18486.13 -20370.35
Pseudo R-squared 0.055 0.061 0.032 0.030
This table shows probit regressions of REJECT on all control variables defined in Table I and RACE, where RACE is treated
as uncorrelated with the disturbance. The dependent variable is REJECT, a dummy that equals one when the loan application
is rejected, and zero otherwise. We estimate the model for two different samples, the prime loan sample and the subprime loan
sample, and define RACE as a dummy variable set to one if the borrower is African-American, or Minority (African-American or
Hispanic), and zero if the borrower is white. This results in four specifications. Marginal effects are reported with robust
t-statistics given in parentheses. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.
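The star notation can be checked against the reported t-statistics. The following is a minimal sketch, assuming two-sided tests and the large-sample normal critical values (1.645, 1.960, 2.576); the function name `significance_stars` is illustrative and not from the paper.

```python
def significance_stars(t_stat: float) -> str:
    """Map a t-statistic to star notation (assumes two-sided, normal approximation)."""
    abs_t = abs(t_stat)
    if abs_t >= 2.576:  # 1 percent level
        return "***"
    if abs_t >= 1.960:  # 5 percent level
        return "**"
    if abs_t >= 1.645:  # 10 percent level
        return "*"
    return ""


# Marginal effects and t-statistics copied from the LTV80_90 and LTV90_100 rows,
# columns (3) and (4), of Table III.
rows = [(-0.004, -1.62), (-0.005, -1.95), (0.013, 2.92), (0.009, 2.41)]
for coef, t in rows:
    print(f"{coef:+.3f}{significance_stars(t)}")
```

With samples of this size the normal approximation is a reasonable stand-in for the exact t distribution, which is why fixed cutoffs are used here.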
Table IV: Evidence of Correlated Race and Disturbance Term in Single-Equation Probit Loan-Level Regressions of Table III
Panel A: Different Risk Characteristics
African-American vs. White Minority vs. White
Variable White African-American t-statistic p-value White Minority t-statistic p-value
INCOME 4.450 4.187 55.156 0.000*** 4.450 4.195 75.437 0.000***
MEDFICO 735.089 665.870 120.360 0.000*** 735.089 676.487 153.774 0.000***
DTI 18.544 30.083 -29.141 0.000*** 18.544 25.517 -26.708 0.000***
PCTCOLLEGE 0.328 0.236 73.371 0.000*** 0.328 0.227 115.641 0.000***
UNEMPLOY 0.046 0.067 -61.763 0.000*** 0.046 0.067 -91.569 0.000***
LTV 0.754 0.781 -8.811 0.000*** 0.754 0.782 -11.827 0.000***
CONVENTION 0.963 0.887 29.465 0.000*** 0.963 0.889 41.466 0.000***
AMOUNT 192.271 138.396 24.376 0.000*** 192.271 149.562 17.536 0.000***
OWNEROCC 0.924 0.937 -6.653 0.000*** 0.924 0.943 -13.809 0.000***
MEDAGE 29.362 33.551 -39.335 0.000*** 29.362 35.345 -80.156 0.000***
PCTRENT 0.214 0.326 -66.103 0.000*** 0.214 0.359 -116.885 0.000***
HVCHG 0.109 0.075 18.382 0.000*** 0.109 0.076 25.521 0.000***
NEWOWN 0.186 0.117 18.896 0.000*** 0.186 0.095 41.274 0.000***
PCTBOARD 0.167 0.381 -26.729 0.000*** 0.167 0.310 -30.065 0.000***
Panel B: Formal Tests
African-American Minority
Prime Subprime Prime Subprime
test statistic p-value test statistic p-value test statistic p-value test statistic p-value
Rivers-Vuong Test (t-statistics) 2.84 0.005*** 1.72 0.086* 2.01 0.044** 2.33 0.020**
ρ^a 0.184 0.166 0.158 0.252
Likelihood Ratio Test of ρ=0 (Chi-square statistics) 32.69 0.000*** 6.45 0.011** 22.08 0.000*** 11.21 0.001***
Panel A shows borrower, loan, and property risk characteristics for different racial groups (African-American vs. white, Hispanic vs. white and Minority vs. white). Mean
values for each control variable are reported by racial group, and the t-statistics test the difference in means between the two groups. Panel B shows two formal tests
of whether race is correlated with the disturbance term in the accept/reject equation: the Rivers-Vuong t-test and the likelihood ratio test of ρ=0, where ρ is the correlation coefficient
between the disturbances of the accept/reject equation and the race equation. The null hypothesis that race is exogenous (uncorrelated with the disturbance) in the loan-level regression is rejected when the
test statistics exceed the critical values. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level, respectively.
a: ρ is the correlation coefficient of the disturbance term in the loan accept/reject equation and the race equation.
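The difference-in-means comparison in Panel A can be sketched as follows. The note does not say which t-test variant was used, so the unequal-variance (Welch) form below is an assumption, and the scores are hypothetical values rather than the paper's data.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """t-statistic for a difference in means, allowing unequal variances (Welch form)."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical FICO-like scores for two groups; NOT taken from the paper.
white = [735, 742, 728, 750, 731, 739]
minority = [676, 690, 662, 671, 684, 668]

t_value = welch_t(white, minority)
print(round(t_value, 2))
```

A positive statistic means the first group's mean is higher, which matches the sign convention of the MEDFICO row (white minus African-American or minority).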
Table V: Full-Information Maximum Likelihood (FIML) Loan-Level Regressions: Race
Correlated with Disturbance Term
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
RACE 0.015*** 0.047*** -0.140*** -0.030*
(3.35) (8.93) (-9.42) (-1.94)
Borrower risk characteristics
INCOME -0.068*** -0.073*** -0.111*** -0.118***
(-112.58) (-125.02) (-58.90) (-67.03)
MEDFICO -0.060*** -0.050*** -0.055*** -0.025***
(-38.69) (-32.96) (-11.26) (-5.10)
DTI -0.000 -0.001 -0.001 -0.003
(-0.09) (-0.34) (-0.13) (-0.67)
PCTCOLLEGE -0.042*** -0.057*** 0.079*** 0.043***
(-12.85) (-18.85) (6.57) (4.12)
UNEMPLOY 0.109*** 0.079*** 0.135*** 0.069**
(8.82) (6.83) (3.97) (2.27)
Loan risk characteristics
LTV80_90 0.004*** 0.004*** -0.007*** -0.007***
(3.75) (4.76) (-2.58) (-3.11)
LTV90_100 -0.008*** -0.004*** 0.007 0.003
(-5.06) (-2.61) (1.62) (0.85)
LTV100 -0.004** -0.004** -0.020*** -0.010**
(-2.05) (-2.23) (-3.71) (-2.17)
CONVENTION 0.007*** 0.013*** -0.247** -0.258***
(4.21) (8.21) (-29.72) (-32.60)
AMOUNT 0.020*** 0.021*** 0.286*** 0.294***
(27.58) (29.00) (35.82) (40.39)
Property risk characteristics
MEDAGE 0.000*** 0.000*** -0.001** -0.000***
(10.03) (11.29) (-2.46) (-4.24)
PCTRENT 0.009*** 0.021*** -0.007 -0.015**
(3.85) (9.07) (-0.99) (-2.25)
HVCHG 0.003** 0.004*** -0.106** -0.004
(2.02) (2.99) (-2.21) (-0.98)
NEWOWN -0.007*** -0.007*** 0.005* 0.006**
(-7.63) (-8.04) (1.94) (2.31)
OWNEROCC -0.023*** -0.028*** -0.018*** -0.039***
(-17.97) (-22.18) (-4.45) (-10.52)
Lender Characteristics
BANK -0.094*** -0.105*** 0.041*** 0.030***
(-120.11) (-139.46) (19.41) (15.91)
HERF -0.139*** -0.163*** 0.696*** 0.732***
(-4.26) (-5.19) (6.58) (7.51)
Table V (continued)
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
Macroeconomic Variables
PRIMERATE 0.030*** 0.027*** -0.091*** -0.089***
(8.42) (7.72) (-8.39) (-8.70)
TERM 0.104*** 0.094*** -0.147*** -0.148***
(11.74) (10.83) (-5.55) (-5.89)
DEFAULT -0.243*** -0.223*** -0.254*** -0.252***
(-34.67) (-32.80) (-12.42) (-13.26)
Number of Observations 1,521,057 1,714,003 263,786 312,553
Percent correctly predicted 84.05 87.53 62.03 64.40
Log-likelihood value -105006.91 -1565193.10 -288866.37 -367757.39
This table shows Full-Information Maximum Likelihood (FIML) estimation of REJECT on control variables defined in Table
I and RACE, where RACE is treated as correlated with the disturbance. The correlation of RACE with the disturbance term
is built into the likelihood function. The dependent variable, REJECT, is a dummy that equals one when the loan application is
rejected, and zero otherwise. We estimate the model for two different samples, the prime loan sample and the subprime loan
sample, and define RACE as a dummy variable set to one if the borrower is African-American, or Minority (African-American or
Hispanic), and zero if the borrower is white. This approach results in four specifications. Marginal effects are reported with
robust t-statistics given in parentheses. We use ***, **, and * to denote significance at the 1, 5, and 10 percent level,
respectively.
Table VI: OLS Neighborhood-Level Regressions – Race Uncorrelated with Disturbance Terms
Panel A: OLS
Prime Subprime
(1) (2) (3) (4)
PCTAFRICAN PCTMINORITY PCTAFRICAN PCTMINORITY
PCTRACE 0.079*** 0.116*** -0.071*** -0.164***
(7.02) (5.78) (-3.48) (-4.28)
Borrower risk characteristics
INCOME -0.027*** -0.024*** -0.042*** -0.044***
(-8.28) (-7.66) (-8.74) (-9.55)
MEDFICO -0.043*** -0.034*** -0.073*** -0.099***
(-11.18) (-5.54) (-10.74) (-8.74)
DTI -0.004 -0.004 -0.002 0.001
(-0.68) (-0.65) (-0.34) (0.15)
PCTCOLLEGE -0.113*** -0.110*** -0.045*** -0.037***
(-17.53) (-16.14) (-3.83) (-3.02)
UNEMPLOY 0.028 -0.028 -0.033 0.069
(1.02) (-0.75) (-0.70) (1.09)
Loan risk characteristics
LTV80_90 0.000 -0.004*** -0.025*** -0.020***
(0.02) (-2.89) (-9.45) (-7.86)
LTV90_100 -0.004* -0.005* -0.010** -0.006
(-1.65) (-1.89) (-2.37) (-1.34)
LTV100 -0.005 -0.003 0.025*** 0.022***
(-1.61) (-0.99) (4.72) (4.43)
Property risk characteristics
MEDAGE 0.000*** 0.000*** -0.000*** -0.000***
(7.05) (4.26) (-5.35) (-3.72)
PCTRENT 0.030*** -0.006 -0.065*** -0.018
(6.25) (-1.02) (-8.16) (-1.57)
HVCHG 0.021*** 0.020*** 0.010** 0.013***
(6.01) (5.53) (2.19) (3.02)
NEWOWN -0.009*** -0.007*** -0.000 -0.002
(-5.83) (-4.70) (-0.14) (-0.85)
Macroeconomic variables
PRIMERATE 0.031*** 0.037*** 0.037*** 0.027***
(9.69) (11.26) (7.49) (4.91)
TERM 0.063*** 0.074*** 0.055*** 0.033**
(6.55) (7.72) (4.11) (2.31)
DEFAULT -0.114*** -0.110*** -0.007 -0.004
(-7.58) (-7.44) (-0.41) (-0.23)
INTERCEPT 0.589*** 0.433*** 1.189*** 1.495***
(10.77) (6.21) (14.12) (12.50)
Observations 14,393 14,393 14,381 14,381
R-squared 0.649 0.655 0.182 0.184
Table VI (continued)
Panel B: Durbin-Wu-Hausman Endogeneity Test of PCTRACE
Prime Subprime
(1) (2) (3) (4)
PCTAFRICAN PCTMINORITY PCTAFRICAN PCTMINORITY
Chi-sq
statistics 6.381** 14.917*** 3.127* 9.290***
p-value (0.011) (0.000) (0.077) (0.002)
This table shows OLS regressions of PCTREJECTION on some of the neighborhood control variables defined in Table I, and
PCTRACE, where PCTRACE is treated as uncorrelated with the disturbance. The dependent variable PCTREJECTION is
defined as the number of loan rejections divided by the number of loan applications in a Census tract. We estimate the model for two
different samples, the prime loan sample and the subprime loan sample, and define PCTRACE as the percentage of Census tract
applicants who are African-American or minority (African-American + Hispanic). This approach results in four specifications. We
report robust t-statistics in parentheses and use ***, **, and * to denote significance at the 1, 5, and 10 percent level,
respectively.
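The tract-level variables defined in this note are simple ratios of counts. The sketch below shows one way to build them from loan-level records; the record layout (`tract`, `reject`, `minority` keys) and the function name `tract_shares` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def tract_shares(applications):
    """Aggregate loan applications into tract-level rejection and minority shares."""
    counts = defaultdict(lambda: {"n": 0, "rejected": 0, "minority": 0})
    for app in applications:
        tract = counts[app["tract"]]
        tract["n"] += 1
        tract["rejected"] += app["reject"]    # 1 if the application was rejected
        tract["minority"] += app["minority"]  # 1 if the applicant is in the minority group
    return {
        name: {
            "PCTREJECTION": c["rejected"] / c["n"],
            "PCTRACE": c["minority"] / c["n"],
        }
        for name, c in counts.items()
    }


# Hypothetical applications in two Census tracts; NOT the paper's data.
apps = [
    {"tract": "A", "reject": 1, "minority": 1},
    {"tract": "A", "reject": 0, "minority": 0},
    {"tract": "A", "reject": 0, "minority": 1},
    {"tract": "A", "reject": 0, "minority": 0},
    {"tract": "B", "reject": 1, "minority": 0},
    {"tract": "B", "reject": 1, "minority": 1},
]
shares = tract_shares(apps)
print(shares)
```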
Table VII: 2SLS Neighborhood-Level Regression – Race Correlated with Disturbance
The table shows the 2SLS regression of PCTREJECTION on some of the neighborhood control variables defined
in Table I, and PCTRACE, where PCTRACE is treated as correlated with the disturbance. The dependent variable
PCTREJECTION is defined as the number of loan rejections divided by the number of loan applications in a Census tract.
We estimate the model for two different samples, the prime loan sample and the subprime loan sample, and define
PCTRACE as the percentage of Census tract applicants who are African-American or minority (African-American and
Hispanic). This results in four specifications. PCTRACE (PRED) is the predicted value of PCTRACE in the first
stage regression. The instrumental variables for PCTRACE in 2SLS are FETALDEATH, FETALSQ, URBANFLAG
and INTER. We report robust t-statistics in parentheses and use ***, **, and * to denote significance at the 1, 5,
and 10 percent level (two-sided), respectively.
Prime Subprime
(1) (2) (3) (4)
PCTAFRICAN PCTMINORITY PCTAFRICAN PCTMINORITY
PCTRACE (PRED) -0.027 -0.014 -0.124*** -0.222***
(1.38) (0.66) (-3.76) (-5.55)
Borrower risk characteristics
INCOME -0.022*** -0.023*** -0.040*** -0.044***
(-9.77) (-11.23) (-9.81) (-11.18)
MEDFICO -0.076*** -0.072*** -0.089*** -0.116***
(-12.39) (-11.13) (-8.61) (-9.75)
DTI -0.006** -0.006** 0.000 0.003
(-2.54) (-2.44) (0.01) (0.59)
PCTCOLLEGE -0.077*** -0.083*** -0.027** -0.025**
(-10.13) (-14.18) (-1.99) (-2.30)
UNEMPLOY 0.144*** 0.132*** 0.015 0.129***
(5.70) (4.43) (0.39) (2.70)
Loan risk characteristics
LTV80_90 -0.002 -0.001 -0.026*** -0.020***
(-1.49) (-0.88) (-9.17) (-7.09)
LTV90_100 -0.003 -0.003 -0.009** -0.004
(-1.21) (-1.28) (-2.14) (-0.88)
LTV100 -0.003 -0.004 0.027*** 0.023***
(-1.40) (-1.64) (5.63) (5.00)
Property risk characteristics
MEDAGE 0.000*** 0.000*** -0.001*** -0.000***
(4.16) (5.31) (-6.11) (-3.69)
PCTRENT 0.014*** 0.021*** -0.073*** -0.005
(3.21) (3.89) (-10.06) (-0.46)
HVCHG 0.021*** 0.021*** 0.010*** 0.014***
(14.05) (14.13) (3.25) (4.73)
NEWOWN -0.009*** -0.009*** -0.001 -0.003
(-8.83) (-8.46) (-0.30) (-1.32)
Macroeconomic variables
PRIMERATE 0.031*** 0.030*** 0.036*** 0.023***
(12.19) (11.02) (7.11) (3.99)
TERM 0.064*** 0.063*** 0.050*** 0.023*
(9.98) (9.37) (3.89) (1.65)
DEFAULT -0.117*** -0.117*** -0.001 0.000
(-16.89) (-16.71) (-0.06) (0.02)
INTERCEPT 0.775*** 0.765*** 1.299*** 1.657***
(15.47) (11.56) (13.25) (12.64)
Observations 14,393 14,393 14,381 14,381
R-squared 0.645 0.646 0.162 0.163
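The two-stage logic behind Table VII, in which PCTRACE is replaced by its first-stage prediction, can be illustrated with a stripped-down sketch. It assumes a single instrument and no control variables, which is far simpler than the specification in the table, and it uses hypothetical data. Standard errors from a manual second stage are not valid, so only the point estimate is computed.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS slope of y on endogenous x, with one instrument z and no controls."""
    a0, a1 = ols(z, x)                  # first stage: regress x on the instrument
    x_hat = [a0 + a1 * zi for zi in z]  # predicted x, the analogue of PCTRACE (PRED)
    _, beta = ols(x_hat, y)             # second stage: regress y on predicted x
    return beta


# Hypothetical tract-level values; NOT the paper's data.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # instrument
x = [0.10, 0.18, 0.22, 0.35, 0.38, 0.52]  # endogenous regressor
y = [0.30, 0.28, 0.29, 0.24, 0.25, 0.20]  # outcome

beta_iv = two_stage_least_squares(z, x, y)
print(beta_iv)
```

In this one-instrument case the estimate equals the ratio of the reduced-form slope (y on z) to the first-stage slope (x on z), which is a convenient check on the implementation.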
Table VIII: Tests of Validity and Strength of Instrumental Variable
Panel A: Differences in Observed Variables for Neighborhoods with Large Number of African-American Church Members (above the median) and Small Number of African-American Church Members (below the median)
Variable Above Below t-statistic p-value
INCOME 4.482 4.541 -0.534 0.592
MEDFICO 7.136 7.007 12.120 0.000***
DTI 0.899 0.841 4.741 0.000***
PCTCOLLEGE 0.263 0.291 -8.553 0.000***
UNEMPLOY 0.055 0.061 -8.907 0.000***
LTV 1.030 1.037 -5.480 0.000***
CONVENTION 0.955 0.957 -1.896 0.072**
AMOUNT 0.211 0.231 -3.634 0.000***
OWNEROCC 0.906 0.929 -3.927 0.000***
MEDAGE 29.002 34.901 -17.326 0.000***
PCTRENT 0.254 0.320 -9.094 0.000***
HVCHG 0.132 0.139 -1.072 0.283
NEWOWN 0.125 0.097 3.067 0.000***
PCTBOARD 0.270 0.304 -5.288 0.000***
Panel B: Test Whether BLACKCHURCH has a Significant Effect in Predominantly White Neighborhoods
(1) (2)
Prime Subprime
BLACKCHURCH 0.012 0.030
(0.854) (1.335)
Borrower risk characteristics
INCOME 0.025 -0.036
(0.741) (-1.273)
MEDFICO -0.260*** -0.162***
(-10.692) (-5.361)
DTI -0.049 0.007
0.025 -0.036
PCTCOLLEGE -0.360*** -0.225***
(-9.372) (-7.579)
UNEMPLOY -0.023 -0.072***
(-1.247) (-3.050)
Loan risk characteristics
LTV80_90 -0.018 -0.093***
(-0.998) (-4.230)
LTV90_100 -0.058** 0.002
(-1.975) (0.064)
LTV100 0.036 -0.008
(1.242) (-0.318)
Property risk characteristics
MEDAGE -0.076*** -0.041**
(-3.318) (-2.082)
PCTRENT 0.196*** 0.061**
(3.467) (2.262)
HVCHG 0.346*** 0.191***
(5.296) (6.730)
NEWOWN -0.046*** -0.022
(-3.362) (-1.334)
Macroeconomic variables
PRIMERATE 0.888*** 0.365*
(4.714) (1.899)
TERM 0.890*** 0.287
(3.730) (1.327)
DEFAULT -0.311*** -0.004
(-3.946) (-0.075)
Observations 3,011 3,012
R-squared 0.486 0.167
Panel C: Upper and Lower Bound Bias of the Coefficients on Race Variables in FIML
Prime Market
African-American Minority
ρ Race Coefficient ρ Race Coefficient
Lower Bound Bias 0.184 -0.022*** 0.166 -0.060***
(-20.89) (-74.64)
Upper Bound Bias 0 0.079*** 0 0.020***
(64.87) (22.58)
Subprime Market
African-American Minority
ρ Race Coefficient ρ Race Coefficient
Lower Bound Bias 0.158 -0.141*** 0.252 -0.164***
(-54.06) (-78.15)
Upper Bound Bias 0 -0.036*** 0 -0.002
(-13.30) (-1.02)
Table VIII (continued)
Panel D: Staiger-Stock (1997) instrument strength test
(1) (2)
PCTAFRICAN PCTMINORITY
F-statistics 38.662*** 26.323***
p-value (0.000) (0.000)
Panel A shows evidence of the validity of the instrument by comparing the top
(above the median) and bottom (below the median) black-church-concentrated areas to see
whether BLACKCHURCH is correlated with the observable variables. Panel B shows evidence of
the validity of the instrument by testing the direct effect of BLACKCHURCH on the
accept/reject decision in a sample of white-concentrated census tracts (top quintile). Panel C
shows the upper and lower bounds of the biases on the RACE coefficients in the Full
Information Maximum Likelihood regression. The upper bound bias is calculated by restricting
ρ, the correlation coefficient between the residuals of the accept/reject equation and the race
equation, to 0. The lower bound bias is calculated by restricting ρ to be the correlation coefficient
between the race variable and the observable variables. Panel D shows the Staiger and Stock (1997)
test of the strength of the instruments. This test is based on an F-test of the joint significance of
the instruments. The critical value for strong instruments is 10.
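The rule of thumb in the last sentence can be applied directly to the Panel D statistics. This is a minimal sketch: the threshold of 10 and the two F-statistics come from the table, while the helper name `is_strong` is illustrative.

```python
STRONG_INSTRUMENT_THRESHOLD = 10.0  # critical value quoted in the table note


def is_strong(first_stage_f: float, threshold: float = STRONG_INSTRUMENT_THRESHOLD) -> bool:
    """Return True when the first-stage F-statistic exceeds the rule-of-thumb threshold."""
    return first_stage_f > threshold


# F-statistics copied from Panel D of Table VIII.
panel_d = {"PCTAFRICAN": 38.662, "PCTMINORITY": 26.323}
verdicts = {name: is_strong(f_stat) for name, f_stat in panel_d.items()}
print(verdicts)
```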
Table IX: Evidence of Mortgage Lending Discrimination from 2000-2005
Panel A: FIML
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
RACE 0.041*** 0.038*** -0.164*** -0.019
(5.90) (4.90) (-7.73) (-0.86)
Panel B: 2SLS
Prime Lending Subprime Lending
(1) (2) (3) (4)
African-American Minority African-American Minority
RACE -0.014 -0.015 -0.089** -0.137***
(-0.62) (-0.52) (-2.08) (-2.58)
This table shows evidence of lending discrimination in the prime and subprime mortgage market in the
housing boom period from 2000 to 2005, using both the individual loan-level FIML regression and the
neighborhood-level 2SLS regression. Panel A shows the results of the FIML regression, and Panel B shows
the results of the 2SLS regression.
Table X: Changes in Rejection Rates between 1996 and 2008 for subprime
Panel A:
Black Minority Black Minority
DIFF_PCTRACE -0.094** -0.283*** PCTRACE96 -0.052*** -0.062***
(2.57) (11.61) (4.49) (7.65)
INTERCEPT -0.051*** -0.043*** INTERCEPT -0.047*** -0.041***
(39.00) (29.60) (32.23) (27.11)
Observations 1611 1611 Observations 1611 1611
R-squared 0.006 0.139 R-squared 0.015 0.046
Panel B:
Quartiles sorted by PCTMINORITY
Quartile Pctminority Income Growth Employment Growth
1 1.60% 3.19% -2.03%
2 4.22% 4.06% 0.16%
3 9.47% 4.17% 0.24%
4 38.83% 0.57% -2.40%
Difference in income growth between quartile 1 and 4:
Quartile 1 Quartile 4 Difference (1-4) p-value
3.19% 0.57% 2.62% 0.00
Difference in employment growth between quartile 1 and 4:
Quartile 1 Quartile 4 Difference (1-4) p-value
-2.03% -2.40% 0.37% 0.00
Quartiles sorted by PCTBLACK
Quartile Pctblack Income Growth Employment Growth
1 0.10% 2.25% -2.90%
2 1.31% 4.24% -0.34%
3 4.79% 4.19% 0.98%
4 30.45% 1.29% -2.80%
Difference in income growth between quartile 1 and 4:
Quartile 1 Quartile 4 Difference (1-4) p-value
2.25% 1.29% 0.96% 0.00
Difference in employment growth between quartile 1 and 4:
Quartile 1 Quartile 4 Difference (1-4) p-value
-2.90% -2.80% -0.10% 0.00
Panel A of this table shows second-stage regression results of models (5) and (6). In the first stage, the change in
rejection rate between 1996 and 2008 is regressed on the change in borrower (without race), loan, property and
macro risks between 1996 and 2008. In the second stage, the residual from the first-stage regression is regressed on
the change in the proportion of minority borrowers between 1996 and 2008 (columns 1 and 2), or the proportion of
minority borrowers in 1996 (columns 3 and 4). We report robust t-statistics in parentheses and use ***, **, and * to
denote significance at the 1, 5, and 10 percent level (two-sided), respectively. Panel B presents the difference in
average income growth rate and employment growth rate from 1996 to 2008 between quartiles sorted by the
proportion of minority population in the census tract.
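The two-step procedure described in this note (regress the change in rejection rates on changes in risk, then regress the residual on the race variable) can be sketched as follows. For brevity the sketch assumes a single risk variable in the first step, whereas the note lists several, and all numbers are hypothetical.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Residuals of y after removing the part explained linearly by x."""
    intercept, slope = ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical tract-level changes between 1996 and 2008; NOT the paper's data.
d_risk = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]       # change in one risk measure
d_reject = [-0.05, -0.08, -0.02, -0.10, -0.12, -0.06]  # change in rejection rate
d_pctrace = [0.01, 0.05, 0.00, 0.10, 0.12, 0.03]       # change in minority share

# Step 1: strip out the part of the rejection-rate change explained by risk.
unexplained = residuals(d_risk, d_reject)

# Step 2: regress the unexplained change on the change in minority share.
intercept, slope = ols(d_pctrace, unexplained)
print(intercept, slope)
```

A negative slope in step 2 would correspond to the pattern in Panel A, where rejection rates fall by more than risk changes explain in tracts with growing minority shares.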