the fiscal cost of con ict: evidence from afghanistan 2005 ...pobarrett.com/conflict.pdf · losses...
TRANSCRIPT
The Fiscal Cost of Conflict: Evidence fromAfghanistan 2005-2017∗
Philip Barrett†
This version: April 5, 2020
Abstract
I use a novel monthly panel of provincially-collected central government revenues and con-
flict fatalities to estimate government revenues lost due to conflict in Afghanistan since
2005. Headline estimates are large, implying future revenue gains from peace of about 6
percent of GDP per year and total revenue losses of $3bn since 2005. Reduced collection
efficiency, rather than lower economic activity, appears to be the key channel. The key chal-
lenge to identification is omitted variable bias, which I address by extending Powell’s (2017)
generalized synthetic control method to a dynamic setting, confirming the main results.
JEL-Codes: E62, F51, H11, K42
Keywords: Conflict, peace dividend, fiscal policy, Afghanistan
∗I am grateful for helpful comments from Cooper Allen, Eli Berman, Tilman Bruck, Paul Collier, Christoph
Duenwald, Raphael Espinoza, Gunnar Heins, Divya Kirti, Frederico Lima, Allison Holland, Mariusz Sumlinkski,
Marialuz Moreno Badia, Catherine Pattillo, Valerie Ramey, Cullen Roberts, and Philippe Wingender. Tetyana
Sydorenko provided excellent research assistance. The views expressed in this paper are those of the author and
do not necessarily represent the views of the IMF, its Executive Board, or IMF management. An early version of
this paper appeared as IMF (2017).†Research Department, International Monetary Fund. Website: www.pobarrett.com Email: [email protected]
1
1 Introduction
A violent conflict is one of the largest shocks that an economy can face. Other events, such as
financial crises, terms of trade shocks, and severe policy errors all undoubtedly generate economic
costs. But few events have the capacity to shatter economies and ruin living standards like an
extended major conflict. This paper is an attempt to quantify one aspect of the economic impact
of such a conflict: the fiscal loss from conflict in Afghanistan during 2005-17.
The ongoing conflict in Afghanistan is just one chapter in a much longer war. While some
years have been less violent than others, for barely one year in the last 40 has the country been
truly at peace. The socialist coup of 1978 presaged the Soviet invasion a year later, followed
by the Mujahideen resistance, post-Soviet-withdrawal civil war of 1992-96, subsequent Taliban
ascendancy, 2001 invasion by a U.S.-led coalition, an aggressive troop surge in 2010, and finally
the draw down of coalition troops in 2014. Today, the outlook for Afghanistan remains bleak,
with the government continuing to battle Taliban-aligned forces across the country while the the
United States negotiates a settlement which will allow it to withdraw its troops. Its intensity and
duration mean there can be few more important case studies of the economic impact of conflict
than the war in Afghanistan.
Conflict imposes many costs on a society. Yet the loss of domestic government revenue is
a particularly important one – in any conflict – as it captures the extent to which the basic
functions of the state are eroded. The impoverishment of the government impedes its ability
to provide basic services essential to any functioning state, such as security of its people and
their property, functioning courts, education, health care, and basic social safety nets for the
very poorest. Fiscal records also have the added benefit of being a source of regular, consistently
comparable information – a rare object in an otherwise data-poor environment.
By measuring the fiscal cost of conflict in Afghanistan I therefore an attempt to quantify a
fundamental cost of a particularly important recent conflict.
To investigate this issue, I assemble a novel dataset of monthly central government revenue
collected in the 34 provinces of Afghanistan between 2005 and 2017, supplemented with data on
provincial conflict fatalities and other covariates, including local weather and prices of agricultural
products, labor, and opium.
I analyze this data using two complementary methods. The first, an ordinary least squares
(OLS) approach, using local projection in the style of Jorda (2005) in a panel to compute dynamic
responses of provincially collected revenues to a local violence shock. In this environment, the most
likely impediment to OLS revealing the causal impact of conflict on revenues is omitted variables
bias.1 That is, that some shocks might simultaneously impact both conflict and revenues within
1. The other common culprit, reverse causality, is unlikely to be as severe a threat to identification as local
1
a province, such as shocks to local economic output or time-varying administrative capacity.2
While I try to account for this in the OLS setting using a variety of controls, the challenges of
data collection in Afghanistan during the sample period mean that even the richest specifications
might omit important simultaneous drivers of both local conflict and local revenues.
To address this concern, I also employ a second methodology to estimate the same effects,
extending the generalized synthetic controls (GSC) method of Powell (2017) to a dynamic setting.
Similar to Abadie and Gardeazabal (2003), this approach creates a synthetic counterfactual for
each province as a weighted average of other provinces and is robust to very general forms of
omitted variables bias. The innovation in Powell (2017) is that the method builds synthetic
counterfactuals for contemporaneous responses to a continuous treatment.3 I further extend this
approach to produce dynamic impulse responses to a continuous treatment (in this case, local
conflict violence).
This methodology allows me to make three contributions to the understanding of the economic
impact of conflict.
First, I provide a quantitative measure of the extent to which conflict violence erodes the
capacity of the state to function, expressed in terms of the degradation of the state’s ability to
fund itself. While studies of the reverse mechanism are common,4 this paper appears to be the first
within-country study of this link.5 This makes concrete a fundamental but otherwise amorphous
cost of conflict – the damage to the state’s capacity to perform its most basic duties, such as
guaranteeing the safety and property of its citizens. By quantifying this effect, I also provide
a measure of the value of a lasting peace – a valuable object not only for economic scholars of
conflict but also for foreign donors and governments assessing the impact of their engagement
in Afghanistan. Moreover, by showing that the OLS and GSC estimates are similar, I provide
convincing evidence that the measured effect is causal.6
The size of this measured effect is large. Headline estimates suggest that government revenues
in Afghanistan would be about 50 percent higher in the absence of conflict violence – approx-
imately 6 percent of GDP per year. More bluntly, peace would likely mean that the Afghan
revenues do not fund local government military activity. Military activity is centrally co-ordinated and almostentirely funded by block grants from foreign donors.
2. See, for example, Collier and Hoeffler (1998), Miguel, Satyanath, and Sergenti (2004), Dube and Vargas(2013), and Iyigun, Nunn, and Qian (2017)
3. In their original approach, Abadie and Gardeazabal (2003) study only a binary treatment.4. A good recent example is Gehring, Langlotz, and Stefan (2018), who show that the extent to which opium
price shocks drive (or dampen) conflict is a function of the degree of government control.5. Gupta et al. (2004) addresses a similar question, albeit in a cross-country setting.6. I also show that entho-linguistic composition is a good instrument for violence in the cross-section, as violence
is concentrated of violence in the those provinces which have a historical ethnic link with the Taliban. The cross-sectional estimates are similar, or even larger, than the OLS estimates although the sample size is necessarilysmall. The instrument is not strong enough to use in a dynamic panel.
2
government could expand the services it provides to its people by around one half. The historic
cost of the Afghan conflict is similarly large. The same coefficients imply an estimated cumulative
cost to the Afghan treasury of the rise in violence since 2005 of around 250 percent of annual rev-
enues – approximately 3 billion US dollars in 2017.7 This is equivalent to the total contributions
since 2002 of the United States to the main multilateral development fund.8
The second contribution is to provide evidence on the mechanism for this effect. By manip-
ulating the timing assumptions of the local projection, I decompose the revenue loss into two
channels: that resulting from a change in the tax base, and that from the change in share of the
tax base collected as revenue. I find that the main driver is this second channel. This is consis-
tent with past work, which often finds that local economic activity is stable or even increases in
response to conflict. This finding also emphasizes that the measured impact is truly capturing a
reduction in state capacity, rather than just the consequence of changing local economic activity. I
also show that there are no statistically significant spillover effects of violence. Increased violence
in neighboring provinces has no impact on local revenue collection.
The third contribution is methodological. By extending Powell (2017) to produce dynamic
responses to shocks, I provide a general method for identifying the time-varying impact of contin-
uous shocks on an outcome variable which is robust to omitted variables bias.9 This is potentially
valuable for any environment where reliable covariate data is hard to come across. Conflict
situations are the most obvious such setting, but many other research questions in low-income
countries likely suffer from the same problem of missing covariates.
Section 2 reviews related literature. Section 3 introduces the main data sources. Section 4
presents results from OLS estimation in the dynamic setting and Section 5 from the generalized
synthetic control method. Section 6 shows that the main results still hold in a very simple
cross-sectional setting. Section 7 provides counterfactual simulations, and Section 8 concludes.
7. The focus on foregone revenues as measure of the cost of conflict potentially omits changes in the compositionof expenditure due to conflict. This is usually thought of as a key part of any “peace dividend”; funds used forsecurity spending can be reallocated to other functions. In the case of Afghanistan, though, this is likely a nearnet zero effect, as elevated Afghan security spending is funded almost entirely funded by foreign grants, whichwill decline along with spending in the event of peace. For example, in 2016 Afghanistan spent 11.6 percent ofGDP on security and received 8.6 percent of GDP from its major two security grants. So even if security spendingin Afghanistan were to decline to the NATO target of 2 percent of GDP (surely at the upper end of plausiblereductions in military spending), this would leave at more a 1 percent GDP peace dividend on the expenditureside after accounting for reduced security grants.
8. These results are robust to a variety of alternative regression specifications, and also hold when the sampleis restricted to a single episode of violence – the southern insurgency of 2007-08.
9. R Code to perform this in a general setting is available via the package pgsc on CRAN, the ComprehensiveR Archive Network.
3
2 Related Literature
Measures of the economic impact of conflict have a rich history. Early attempts used cross-country
variation in conflict to measure a variety of outcome variables: Collier (1999) computes the impact
of conflict on the level and composition of output; Gupta et al. (2004) calculate losses to growth
and tax revenues; and Knight, Loayza, and Villanueva (1996) estimates a peace dividend from
reduced military spending. However, cross-country studies suffer from rather serious challenges to
identification. Countries in conflict differ not only in their duration and intensity of the conflicts,
but also in many ways which may affect both conflict and economic outcomes, such as include
economic structure, geography, and political systems. As a result, it is very hard to convincingly
control for confounding factors across conflicts as diverse as Colombia, Sri Lanka, and Sudan (to
name just a few).10
More recent work on the cost of conflict have studied outcomes at partly or fully sub-national
levels. Besley and Mueller (2012) and Mueller (2016) estimate the cost of conflict in Northern
Ireland and Africa respectively, both using sub-national panels. And Brodeur (2018) uses plausi-
bly random variation in the success of terror attacks in the US during 1970-2013 to identify the
causal impact of terror attacks on local activity.11
Given its pre-eminence as an early 21st century conflict with major involvement of Western
countries, there are unsurprisingly several other papers which attempt to measure some aspect of
the cost of conflict. With limited data, the identification challenge has mostly been addressed by
using short panels of individual- or household-level data, with largely inconclusive results. Flo-
reani, Lopez-Acevedo, and Rama (2016) show that after accounting for the impact of increased
local troops due to conflict, household consumption does not respond significantly when conflict
rises. Blumenstock et al. (2018) use mobile phone data in 2013-16 to show that corporate phone
activity declines following a major violent incident. Parlow (2016) uses cross-provincial varia-
tion in conflict to measure the effect of conflict on the health of the Afghan population. And
Ciarli, Kofol, and Menon (2015) demonstrate that conflict causes households engaging in more,
yet less productive, economic activities. Other studies focus on the behavior of prices during
conflict. D‘Souza and Jolliffe (2012) study the impact of changing food prices on household-level
consumption patterns. And Bove and Gavrilova (2014) estimate the impact of conflict on food
prices in a subset of provinces in 2003-2009.
Ciarli, Kofol, and Menon (2015) is also particularly relevant in that it seeks to go beyond
identification via fixed effects in a panel, using a Bartik-style instrument for conflict, with the
10. See Blattman and Miguel (2010) for a detailed overview of the limitations of this approach.11. An alternative approach to identification is to specify a structural economic model. For example, World Bank
(2017) measure the cost of the recent Syrian conflict by performing counterfactual simulations in an estimatedmulti-region DSGE model.
4
cross-sectional variation coming from violence during the Soviet occupation of 1979-1989. In spite
of (or perhaps because of) their care in identifying the causal impact of conflict on household
income, the estimated responses are quite small, in contrast to the results reported here. This
difference might easily be explained by the many factors. For example, if households are more
adept than the government at substituting into other income-generating activities, especially at
short notice.12
When measuring the economic impact of conflict in the Basque country, Abadie and Gardeaz-
abal (2003) propose a synthetic control approach to identify the causal impact of conflict. This
is designed to address bias due to omitted variables which drive both conflict and economic ac-
tivity. By applying Powell’s generalized version of this framework I not only extend Abadie and
Gardeazabal’s approach in a technical sense, but also expanding the field of application to a
potentially more widely relevant case.13
My focus on omitted variables bias as a challenge to identification is motivated by a long
literature showing how economic shocks – likely an important driver of local revenues – robustly
predict conflict. In their seminal study, Miguel, Satyanath, and Sergenti (2004) use rainfall as an
instrument to identify exogenous variation in economic activity – and hence its causal impact on
conflict – in a sample of sub-Saharan African countries. Dube and Vargas (2013) use international
terms of trade shocks to identify similar variation in economic activity in commodity-producing
regions in Colombia.14
3 Data
This section gives an overview of the data used in my analysis.
3.1 Data on conflict
My source of data on conflict is the Uppsala Conflict Data Program Georeferenced Event Dataset
(UCDP GED). The UCDP GED is a commonly used dataset in conflict research across a wide
range of disciplines15, and includes observations worldwide dating during 1989-2017. The unit of
observation is an incident of conflict violence with at least one fatality, such as a bomb in a public
12. Differences in the sample may also be important. Ciarli, Kofol, and Menon (2015) study only 2005 and 2008,since when the level and spread of violence has increased considerably.
13. Afghanistan is the prototypical modern conflict in many ways, not least because it a low-income country.This is important because conflict in rich countries is rare. The Uppsala Conflict Database recorded conflict in 33countries in 2017, or which 19 were low income countries, and only one an advanced economy.
14. Others studying this link include Collier and Hoeffler (1998), Fearon and Laitin (2003), Sambanis (2002),Iyigun, Nunn, and Qian (2017), and Gehring, Langlotz, and Stefan (2018)
15. Examples include Michalopoulos and Papaioannou (2016) (economics), Hultman, Kathman, and Shannon(2013) (political science), and Themner and Wallensteen (2014) (conflict research)
5
place or an attack on a police checkpoint. Each record contains detailed information on location,
fatalities, actors, sources, and type of incident. In order to guarantee data quality, the dataset
draws from a wide range of primary sources, including official documents, media sources, NGO
reports, and archives.
The protagonists in the current conflict have been involved in Afghanistan since late 2001,
when a US-led coalition invaded Afghanistan and deposed the Taliban government. In the period
since, the UCDP GED records 19,955 separate conflict incidents in Afghanistan, resulting in
97,902 deaths.16 Figure 1 shows the time series for this data, aggregated to the national level per
1,000 since 2001.
There is considerable temporal and spatial variation in the data: deaths from conflict spiked
in southern provinces in 2006-2007; the troop “surge” of 2010-2013 corresponded to a reduction
of violence in the South but a rise everywhere else; and following the withdrawal of NATO forces
in 2014 conflict violence rose sharply nationwide. In Appendix A.2 I provide a more detailed
description of the data including other corroborating data sources, as well as a broader overview
of the conflict.
0.1
0.2
0.3
0.4
0.5
2005 2010 2015
Dea
ths
per
thou
sand
Figure 1: Deaths from conflict per 1,000 in Afghanistan 2002-17
16. The UCDP GED provides three estimates of fatalities for each incident: an upper bound, a lower bound,and a “best guess” estimate. Throughout, I use the best guess estimate.
6
3.2 Data on revenues
The dependent variable in my analysis is provincially-collected domestic revenues of the central
government, which I compile from the Afghan Ministry of Finance Monthly Fiscal Bulletins,
available publicly until 2017. This measure covers all revenues of the central government collected
within the borders of Afghanistan with a specific provincial origin. Examples include corporate
income tax paid by firms registered at provincial tax offices and customs duties collected at border
crossings. This measure excludes external revenues, such as foreign grants,17 as well as domestic
revenues which are categorized as “nationally” collected.18
I look at domestic revenues capture the Afghan government’s ability to provide self-sustaining
funding for ongoing expenditures. As such, losses to domestic revenues represent an economic cost
of conflict, measuring services that cannot be provided by the government. Grants, in contrast,
are a transfer from overseas citizens. And I focus specifically on provincially-collected domestic
revenues as they provide a source of spatial variation.
Total (nominal) domestic revenues in Afghanistan grew sharply during the sample period (see
Figure 2). Throughout, provincially-collected domestic revenues have made up large share of
total domestic revenues, averaging 59 percent during the sample period. This share has also been
stable. As shown in Figure 3, the growth rates of the provincially-collected and total revenues
are highly correlated. This data source is also comprehensive; the central government is the only
(legal) revenue-collecting agency in the provinces. Provincial governments have no meaningful
tx-raising powers and are funded almost entirely by transfers from the center.
There is considerable heterogeneity in provincial revenues, as shown in Table 6 in Appendix
A.1, which reports summary statistics for province-level revenue data, aggregated annually. There
are also several obvious outliers in the revenue data, the largest due to apparently misplaced
decimal points. Full details of outlier identification are given in Appendix A.3.
3.3 Other data
Given that Afghanistan has not conducted a census since before the Soviet invasion of 1979,
accurate provincial population data are hard to come by. Appendix A.1 discusses the difficulties
of using subsequent published provincial population estimates. So for national populations, I use
data from the World Bank, and for provincial shares I use the 2003-05 nationwide household list-
ing, reported in UNFPA and CSO (2007). The 2003-05 household listing also includes provincial
17. Grants were a large source of funding for the government of Afghanistan during this period, averaging over12 percent of GDP. Afghanistan has no meaningful net foreign debt issuance during this period.
18. Nationally collected domestic revenues are those which have no definite provincial origin. This typicallyrefers to revenues received by government entities with nationwide scope. For example: fees charged for issuingpassports, which are collected by the (single, nationwide) Foreign Ministry.
7
0
5
10
15
2005 2010 2015
Per
cent
Figure 2: Central government domestic revenues as a share of GDP
0
25
50
2005 2010 2015
Per
cent
Growth of total domestic revenues Growth of provincially collected domestic revenues
Figure 3: Nominal growth rates of total and provincially collected revenues
8
data on the share of speakers of each of the major languages. I use this in Section 6, where I use
language share as an instrument for violence in a purely cross-sectional analysis.
As controls in the dynamic panel estimation, I use eight other data series, from three sources:
four provincial meteorological series (precipitation, average temperature, days of frost, and aver-
age cloud cover) come from the Centre for Environmental Data Analysis (CEDA); opium prices
for five regions are listed in the UN Office on Drugs and Crime (UNODC) Afghanistan Opium
Survey; and regional wheat, sheep and daily unskilled labor costs come from the World Food
Program’s Vulnerability Analysis and Mapping project (WFP VAM). These controls are all avail-
able monthly, although coverage varies across each series. In robustness checks with annually
aggregated data I also use data on nighttime lights visible from space. A full description of this,
and all other data sources, can be found in Appendix A.1, with summary statistics in Table 7.
4 Ordinary least squares estimates
In this section I present baseline estimates of the dynamic impact of conflict on government
revenues in Afghanistan estimated by ordinary least squares. The headline results come from
local projection in the style of Jorda (2005). This produces long-run revenue responses of around
-0.5 to -0.6 log points for each annual death from conflict per 1,000 population. I also report results
from a single dynamic equation model and illustrate why this understates the true response.
By manipulating the timing assumption in the local projection, I attempt to decompose the
revenue loss into two channels: losses due to changes in the tax base and those due to reduced
collection efficiency. I conclude that changes in revenue are almost entirely due to the latter
effect. That is, the revenue cost of conflict does measure the reduction in the ability of the state
to function.
4.1 Model specification
I use the UCDP data to generate a month-province specific measure of (annualized) conflict
intensity, defined as:
xit =1000× Annualized UCDP conflict fatalities in province i in period t
Province i popuation share 2003-05 × National population in period t× 12
I use the start-of-sample population share for two reasons: the unreliability of published
provincial population estimates (discussed in Appendix A.1) and because even accurately scaling
by the current population would overstate increases in violence if people migrate away from areas
affected by conflict.
9
To measure the impact of conflict intensity on provincial revenues, I estimate two local pro-
jection models for provincial revenues
yi,t+h = bhxit +N∑n=1
ηnxi,t−n +M∑m=1
ρmyi,t−m +L∑l=0
α′lzi,t−l + µi + δt + ei,t+k (1)
Y hi,t+h = βhxit +
N∑n=1
ηnxi,t−n +M∑m=1
ρmyi,t−m +L∑l=0
α′lzi,t−l + µi + δt + ei,t+k (2)
where yit is the log of provincial revenues in local currency, xit is the shock to provincial conflict
intensity, and Y hi,t+h the cumulated log revenues between period t and t+ h:
Y hi,t+h =
h∑j=0
yi,t+j
The zit’s represent a vector of meteorological and price controls, as described in Table 7. The
µi represent a full set of province-specific fixed effects, which will absorb permanent differences
between provinces, and the δt are a full set of time fixed effects, which will absorb an national-level
time series variation. The error term eit absorbs all remaining time-province specific variation.
Local projection specifications of this form are mathematically a type of dynamic panel esti-
mation (although impulse responses are computed by repeated estimation at horizons h = 0, 1, . . .
rather than recursively). And so they inherit the same econometric challenges of dynamic panel
models. In particular, estimation of equations (1) and (2) by OLS will produce an unbiased
estimate of the causal effect of violence on local revenues if the usual sequential exogeneity con-
dition for a panel holds. Here, this requires that: 1) the error terms eit are serially uncorrelated
(due to the inclusion of lagged dependent variables); and 2) the error term eit is orthogonal to
{xit, . . . , xi0, yi0, . . . , yi0, zi,t, . . . , zi0} for all t.19 In general, the two most obvious ways in which
sequential exogeneity can be violated are omitted variables and via reverse causality. Attempting
to mitigate these concerns motivates much of the specific regression design details.
Inclusion of the price and weather controls aims to satisfy sequential exogeneity by reducing
the likelihood of omitted variables. If other local conditions affect both provincial revenues and
security simultaneously, then the error term eit will again be correlated with provincial conflict
fatalities xit. The meteorological series are controls for weather-related phenomena which might
19. See Acemoglu et al. (2018).
10
affect both local economic activity20 and the probability of conflict.21 Prices also absorb variation
due to local demand shocks which may affect both economic activity and conflict.22
Including lags of the independent variable similarly control for omitted variables, this time
by appealing to reasonable assumptions about the timing of shocks. If deteriorating economic
conditions cause subsequent violence, then omitting lags of yit will induce correlation between
eit and xit, violating sequential exogeneity and causing bias in the OLS estimates. Specifically,
regressing revenues on violence without accounting for past revenue performance will produce es-
timates which measure not only the causal impact of violence on revenues but also the selection of
low-revenue province-year pairs into violent episodes. In the language of a difference-in-differences
framework, this would violate the ‘parallel trends’ assumption. Including lags of the dependent
controls for this effect. Including dependent lagged variables is also a partial control for reverse
causality – that low revenues actually cause future local violence – although I argue in Section
4.5 that this is unlikely a major source of bias in this setting.
4.2 Headline results
Figure 4 shows the estimated coefficients from equation (1) using eight lags each of the controls (so
N = M = L = 8). As there is evidence of mis-reported revenue data being offset by corrections
in subsequent months (see Appendix A.3), I use data aggregated to quarterly frequency to reduce
unnecessary noise, and adjust xit to retain an annualized rate. The period responses, bh, are almost
always negative – suggesting a persistent and negative impact of conflict on revenue collection –
but rarely statistically significant (standard errors are clustered by time and province).
This motivates looking at the cumulative responses in equation (2), shown Figure 5. This is
the key results of this section. The cumulative effect is large and highly significant. The point
estimate is around -0.6 times quarterly revenues after 7 quarters.
Sluggish dynamics are an important aspect of the dynamic response. Much of the impact
comes more than a year after the shock. That the response is so delayed does make some intuitive
sense – delays in reporting or in local investment may mean that an uptick in local violence may
not feed through into reported revenues for some time. Indeed, the contemporaneous impact is
20. According to the 2003 National Risk and Vulnerability Assessment survey, agriculture was the principalsector of employment for 64 percent of male heads of household. The same survey finds that lack of irrigationwas the most-cited constraint reported by Afghan farmers, suggesting that production is particularly sensitive toweather.
21. Afghanistan is well known for having a “fighting season”, as the difficulty of living outdoors and the needto tend crops have a seasonal impact on manpower available for both for insurgents and potentially governmentforces. During 2005-17 nearly two thirds (64 percent) of conflict fatalities occurred in the six months from Mayto October. Local variation in weather is therefore likely to affect not only local revenues but also the length andtiming of this fighting season (and thus measured conflict deaths).
22. Time fixed effects will already absorb national inflation changes, which is the relevant price index for thecentral government’s budget constraint.
11
quite small. This feature also has some bearing on the choice of estimation method. Estimating
a single dynamic equation – which takes the estimates for equation (1) at h = 0 only and iterates
to produce an impulse response – would not capture this full effect.
To illustrate this point, shown in Figure 5 also includes the impulse response from a single
dynamic equation, which has an estimated long-run impact of between -0.25 and -0.3, less than
one half the impact estimated by local projection. Although the two responses agree for the
first four quarters, thereafter they diverge, suggesting that the single dynamic equation is mis-
specified either because of a inherently nonlinear dynamics or an amplifying feedback between
revenue collection and conflict.23
Table 1 aims to show that this result is robust across a variety of specifications. Specifications
become increasingly complex from left to right, with increasing lags of revenues and controls, and
province-specific time trends. Including estimation results at all horizons h would be unwieldy, so
the upper portion of the table reports only the regression coefficients and diagnostics for h = 0.
All standard errors are two-way clustered by province and period.
More important is the lower part of the table, which reports three summary measures of
the long-run cumulative response. There is no obvious single way to calculate this for a local
projection, so Table 1 includes two: βh at the horizon with the largest estimated response, and βh
at the most significant response. I also include the cumulative long run impact from the iterated
h = equation, about which there is no ambiguity, given by:∑Kk=1 βk
1−∑M
m ρm(3)
Unsurprisingly, with quarterly data at least four lags of revenue are required to remove seasonal
patterns (specification (2)). Persistence is typically in the order of 0.7 and significant, with the
longest lags always significant. Once these lags are included, the long-run measures of the impact
of conflict on revenues are very stable. Including price and weather controls (specifications (4) and
(5)) or lags of conflict fatalities (specifications (6), which corresponds to the impulse responses
shown in Figures 4 and 5, and (7)) or province-specific linear time trends (specifications (5) and
(7)) has no impact on the main findings. In all these cases, the maximal cumulative effect is
around -0.5 to -0.6 and occurs after 7 or 8 quarters, consistent with the visual impression given
by Figure 5
Although coefficients on controls are omitted here for brevity, most are significant at several
horizons and the price of opium is typically the most reliably so. It is almost always highly
23. As is well-known, mis-specification errors in the iterated equation setting compound over time (see Ramey(2016)). A yet further alternative is to use a VAR, although many of the features that are easy to implement inthe Jorda setting become much more cumbersome to implement (such as controls, time and unit fixed effects, andtwo-way robust standard errors).
12
0 2 4 6 8 10
−0.
2−
0.1
0.0
0.1
0.2
Horizon, h
Rep
onse
coe
ffici
ent,
b h
Local projection +/2 standard errors
Figure 4: Period revenue loss, log quarterly revenue
0 2 4 6 8 10
−1.
0−
0.8
−0.
6−
0.4
−0.
20.
0
Horizon, h
Rep
onse
coe
ffici
ent,
β h
●
●
●
●
●
●
● ●
●
● ●●
●●
●
Local projection +/− 2 standard errorsIterated dynamic equation +/− 2 standard errors
Figure 5: Cumulative revenue loss, log quarterly revenue
13
Table 1: Effect of conflict on revenues, OLS estimates
Dependent variable: provincial (log) revenues(1) (2) (3) (4) (5) (6) (7)
Deaths per 1,000 −0.051 −0.070∗∗ −0.089∗∗∗ −0.107∗∗∗ −0.114∗∗∗ −0.112∗∗∗ −0.113∗∗∗
(0.042) (0.030) (0.032) (0.033) (0.038) (0.040) (0.041)
Deaths per 1,000, lag 1 0.068 0.062(0.068) (0.066)
Deaths per 1,000, lag 2 −0.069 −0.069(0.042) (0.043)
Deaths per 1,000, lag 3 −0.051 −0.058(0.045) (0.042)
Deaths per 1,000, lag 4 0.073 0.063(0.059) (0.057)
Deaths per 1,000, lag 5 −0.087 −0.092(0.082) (0.082)
Deaths per 1,000, lag 6 0.040 0.026(0.044) (0.043)
Deaths per 1,000, lag 7 −0.112 −0.125(0.094) (0.102)
Deaths per 1,000, lag 8 0.133∗∗ 0.110∗∗
(0.057) (0.049)
Revenue lags 1 4 8 8 8 8 8Last significant lag 1 4 8 8 8 8 8
Revenue persistence 0.37 0.68 0.70 0.69 0.38 0.69 0.36(0.094) (0.071) (0.066) (0.060) (0.067) (0.060) (0.076)
Lagged price & weather 0 0 0 8 8 8 8Provincial time trend No No No No Yes No Yes
Autocorrelation test stat 1.2 0.2 0.6 0.6 0.5 0.3 0.3p-value 0.27 0.65 0.45 0.45 0.47 0.56 0.57
Observations 1,734 1,632 1,496 1,496 1,496 1,496 1,496R2 0.139 0.255 0.258 0.291 0.340 0.289 0.336Adjusted R2 0.095 0.214 0.213 0.207 0.244 0.200 0.235F Statistic 134.698∗∗∗ 107.565∗∗∗ 55.975∗∗∗ 7.021∗∗∗ 6.135∗∗∗ 6.983∗∗∗ 6.202∗∗∗
Max. cumul. impact, -0.304 -0.593*** -0.641*** -0.605*** -0.552*** -0.649*** -0.500***local projection (0.337) (0.219) (0.226) (0.217) (0.161) (0.180) (0.111)Horizon 7 7 7 7 7 8 8
Most signif. cumul. impact, -0.304 -0.593*** -0.641*** -0.605*** -0.552*** -0.649*** -0.500***local projection (0.337) (0.219) (0.226) (0.217) (0.161) (0.166) (0.111)Horizon 7 7 7 7 7 7 8
Long-run cumul. impact, -0.080 -0.219*** -0.298*** -0.342*** -0.182*** -0.375** -0.308**iterated dynamic eq. (0.065) (0.083) (0.088) (0.086) (0.056) (0.165) (0.132)
∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01Notes: Point estimates and regression diagnostics reported for one-period-ahead response, h = 0. Allspecifications include period and province fixed effects. Two-way clustered standard errors in parentheses.Significance calculated using two-sided tests throughout. Most significant cumulative and maximumcumulative impact are local projection estimates for horizons with smallest p-value and largest pointestimates respectively. AR test for panel data from Wooldridge (2010) with null of zero serial correlation.
14
statistically significant (not reported here for brevity), in line with work by Sonin, Wilson, and
Wright (2018) showing that rebel attacks in Afghanistan become more effective when opium prices
are high.
The foregoing discussion raises an obvious question: what mechanism is driving this response?
This is the question to which we turn in section 4.3.
4.3 Decomposing the sources of revenue losses
By manipulating the timing in the local projection, I can decompose the revenue loss due to
conflict into two channels. The revenue base effect is the decline in revenues due to changing
local economic activity. And the revenue efficiency effect is the decline in the share of given local
activity that authorities can collect as revenues.
To decompose losses into these two channels, I estimate for h = 1, . . . , H:
Y hi,t+h = βhxit +
N∑n=1
ηnxi,t−n +M∑m=1
ρmyi,t−m +L∑
l=−h
α′lzi,t−l + µi + δt + ei,t+k (4)
This differs from equation (2) due to the inclusion of the control variables zi,t−l for periods
subsequent to the shock up until the the outcome in period t+h. These variables – meteorological
observations and regional prices – are proxies for local economic activity. The impulse response
estimated in equation (4) is therefore conditioned not only on the state of the local economy
at time t, but also on its subsequent evolution. As a result, the resulting estimates exclude the
impact of changes in the revenue base, producing estimates of bh and βh which capture the revenue
efficiency effect alone.
Figure 6 presents the results of estimating equation (4) using the same lag structure as in
Figure 4. There is no statistically meaningful difference between the local projection with and
without. This suggests that the entirety of the revenue loss comes from an efficiency effect, and
that the local activity effect is unimportant. This is not inconsistent with other work that suggests
there are several channels through which conflict may in fact positively impact local activity.
Examples include: the purchase of goods and services by fighters moving into the area (Floreani,
Lopez-Acevedo, and Rama (2016)), government spending on development projects to counter
insurgent support (Berman, Shapiro, and Felter (2011)), and weakened state control relaxing
constraints on private sector activity (Guidolin and La Ferrara (2007)). So one interpretation of
Figure 6 is that such channels offset any negative impact of conflict on local activity, resulting in
a zero revenue base effect.
There is another interpretation of Figure 6. It might simply be the case that the future controls
simply do not capture local fluctuations in activity very well. In appendix B I test this possibility,
15
0 2 4 6 8 10
−1.
0−
0.5
0.0
Horizon, h
Rep
onse
coe
ffici
ent,
β h
●
● ●
●
● ●
●
●
● ●●
●
Local projection +/2 standard errorsLocal projection, with post−shock controls +/2 standard errors
Figure 6: Cumulative revenue loss, log quarterly revenue including future controls
estimating equation (4) using annual satellite data on the intensity of visible electric light at night
as an additional control. As satellite data is only available monthly from 2012, adding the extra
control comes at the cost of reducing the frequency of observation. Nevertheless, the resulting
estimates cannot reject the hypothesis that the revenue base effect is zero, in agreement with the
monthly estimates.
4.4 Spillovers
Later, I use the headline estimates from the province-level estimation to compute aggregate mea-
sures of the fiscal cost of conflict. A obvious challenge to this calculation is that specifications (1)
an (2) are purely local estimates, admitting no role for conflict violence elsewhere. If violence in
nearby provinces affects local revenue collection, then using local estimates to quantify the aggre-
gate impact of violence will be incorrect as it ignores important spillover effects. An interesting
aspect of this question is that the likely direction of the impact is not obvious. Violence in nearby
provinces could either have a negative impact – disrupting trade or tax collections – or a positive
one – displacing economic activity away from areas affected by conflict violence.
In this section I show that there is some evidence of positive local spillovers from conflict
violence, but that it is sufficiently weak that it can be reasonably ignored in aggregate calculations.
16
Specifically, I extend specification (4) to:
Y hi,t+h = βhxit + γhXit + ξh (xit ×Xit) +
N∑n=1
ηnxi,t−n +N∑n=1
θnXi,t−n
+M∑m=1
ρmyi,t−m +L∑l=0
α′lzi,t−l + µi + δt + ei,t+k (5)
Where Xit is the weighted average of conflict intensity in neighboring provinces. That is:
Xit =
∑j∈Ni
popjxjt∑j∈Ni
popj
where Ni is the set of neighboring provinces for province i and popj is the 2003-05 population in
province j.
Using this framework, I can construct three measures of spillovers. First, βh in equation
(5) represents the average impact of local conflict on horizon-h revenues conditioned on average
conflict in neighboring provinces, Xit. This adjusts the estimate of the impact of own-province
violence on revenues for systematic correlation with violence in neighboring provinces, a possible
form of omitted variables bias. Second, γh in equation (5) measures the average impact of
conflict in neighboring provinces, conditional on within-province conflict, xit. and so captures
the direct impact on local revenues of conflict in neighboring provinces. Third, the interaction
term ξh measures a possible indirect channel through which conflict violence may have non-local
spillovers, altering the marginal impact of local conflict. This would arise if, for example, a general
climate of widespread conflict violence (beyond that captured by fixed effects) makes traders or
tax administrators more sensitive to violence in their own provinces.
Table 2 summarizes evidence on spillovers more systematically, reporting measures of the
maximum impact of the local projection estimates for a variety of specifications, as in Table 1
but without the details of the h = 0 coefficients. Figures 15, 16, and 17 (in Appendix B.1) present
impulse responses for select specifications.
Estimates of βh are almost unchanged with the inclusion of neighboring-province violence.
This suggests that correlated variation in violence across provinces is not an important driver of
the main result. Of course, violence is correlated across provinces, and so these results can be
interpreted as evidence that the controls already included (perhaps most obviously, local opium
prices) are sufficient to address this concern.
Maximal estimates of γh, the estimated average impact of conflict in neighboring provinces,
are large and positive with a coefficient on log income of over 3. However, these estimates are
neither precise nor robust. They are highly sensitive to the choice of specification, and even in
17
the most restrictive specifications significance is elusive: only specifications with province-specific
time trends can produce coefficients which are more than two standard errors beyond zero at any
horizon. This contrasts with the estimates on local violence, where the coefficients are often three
or four times the standard error at multiple horizons and across various specifications.
Finally, estimates for the coefficient on the interaction term, ξh, present no compelling evidence
for systematic variation in the marginal impact of violence within a province conditioned on
conditions in neighboring provinces. While generally negative, the estimates are sensitive to
specification details and almost always statistically insignificant.
Overall, three evidence on cross-province spillovers of violence onto revenue collection are con-
sistent with three findings: that coefficients for within-province effects of violence are unaffected
by controlling for violence in neighboring provinces; that point estimates of direct spillovers are
large but almost never statistically significant and not robust to changing specifications; and that
intensity of conflict in neighboring provinces does not change the within-province relationship
between conflict and revenue collection. Overall, it is hard to find any evidence to reject the hy-
pothesis that spillovers from violence in neighboring provinces have no effect on the relationship
between conflict violence and revenue collection.
4.5 Identification and causal interpretation
The estimates described above can only be used to measure the fiscal cost of conflict in Afghanistan
if they identify the causal impact of violence on revenue collection. In this section I assess several
challenges to this interpretation and argue that omitted variables bias is likely the most important
of these. This informs the use of estimation by Generalized Synthetic Controls in the next section.
As mentioned already, autocorrelation of residuals is in this setting not only a sign of model
mis-specification but also a violation of sequential exogeneity, due to the inclusion of lagged
dependent variables. Table 1 therefore reports test statistics and p values for the panel residual
autocorrelation test of Wooldridge (2010). This is also comfortably rejected, even with a small
number of revenue lags. Results for h > 0 are similar.
Another relatively standard critique is Nickell bias. Because lagged regressors are dependent
variables in previous periods, this induces a correlation between the period t regressors and
the period t − m residuals (for 0 < m ≤ M). Nickell (1981) shows that the resulting bias
is asymptotically of order 1/T . This is a very serious problem for models with a handful of
observations per unit. But with 44 periods in each province this is not a first-order concern in
the current setting. Nevertheless, Appendix B also includes results at monthly frequency (with
at least 160 observations for each province the resulting bias will be small). These estimates are
less precisely identified due to likely negative serial correlation in the higher frequency data, but
18
(1)
(2)
(3)
(4)
(5)
(6)
(7)
Spe
cifi
cati
onde
tail
sC
onlict
inte
nsi
tyla
gs0
00
00
88
Rev
enue
lags
14
88
88
8L
agge
dpri
ce&
wea
ther
00
08
88
8P
rovin
cial
tim
etr
end
No
No
No
No
Yes
No
Yes
Max
imu
mcu
mu
lati
veim
pact
Bas
elin
e,fr
omT
able
2-0
.304
-0.5
93**
*-0
.641
***
-0.6
05**
*-0
.552
***
-0.6
49**
*-0
.500
***
(0.3
37)
(0.2
19)
(0.2
26)
(0.2
17)
(0.1
61)
(0.1
80)
(0.1
11)
Hor
izon
77
77
78
8
βh,
contr
olling
for
nei
ghb
orin
g-pro
vin
cevio
lence
-0.3
40-0
.644
***
-0.7
03**
*-0
.640
***
-0.5
68**
*-0
.681
***
-0.5
56**
*(0
.334
)(0
.210
)(0
.223
)(0
.218
)(0
.167
)(0
.177
)(0
.101
)H
oriz
on7
77
77
88
γh,
dir
ect
impac
tof
nei
ghb
orin
g-pro
vin
cevio
lence
0.41
40.
450
0.72
91.
161
1.46
9**
0.99
31.
279*
**(0
.781
)(0
.568
)(0
.609
)(1
.095
)(0
.646
)(0
.796
)(0
.441
)H
oriz
on7
66
129
119
ξ h,
inte
ract
ion
oflo
cal
and
nei
ghb
orin
g-pro
vin
cevio
lence
-0.3
090.
260
-0.3
21-0
.305
-0.6
20**
*-0
.370
-0.5
52**
loca
lpro
ject
ion
(0.6
10)
(0.8
14)
(0.3
57)
(0.3
28)
(0.1
63)
(0.3
63)
(0.2
71)
Hor
izon
1012
55
95
9
Not
e:∗ p<
0.1,∗∗p<
0.05
,∗∗∗ p<
0.01
All
spec
ifica
tion
sin
clude
per
iod
and
pro
vin
cefixed
effec
ts.
Tw
o-w
aycl
ust
ered
stan
dar
der
rors
inpar
enth
eses
.Sig
nifi
cance
calc
ula
ted
usi
ng
two-
sided
test
sth
rough
out.
Tab
le2:
Sum
mar
yim
pac
tlo
cal
pro
ject
ion
esti
mat
esfo
rsp
illo
vers
.
19
point estimates are similar, particularly in the richer specifications.
Unlike residual autocorrelation and Nickell bias, other critiques cannot be tested so simply
within the OLS setting, and instead require a more conceptual discussion. One such problem is
omitted variables bias – there may be third factors correlated with lower revenues and increased
violence. One obvious such factor is the strength of local institutions. For example, corrupt local
officials or powerful warlords might both intercept locally collected revenues and provide support
or intelligence to insurgent groups.24 Although I try to address these channels using controls for
weather and prices, it is not obvious that this is sufficient, especially in an environment such as
Afghanistan where one might have concerns about data quality. Without further investigation,
omitted variables bias represents a very serious challenge to causal interpretation of these results.
Another potential challenge to causal identification is reverse causality. Yet this seems some-
what implausible in the current setting. The most obvious channel – that locally collected revenues
fund local military activity – is ruled out simply because locally collected national revenues are
not funneled back to pay for local military expenditures. In fact, Afghan military expenditures
are paid for almost entirely by national grants from donors which, after accounting for time
fixed effects, are uncorrelated with local revenues by construction. In addition, military actions
conducted in concert with (or even at the direction of) foreign allies and likely reflect broader
strategic priorities rather than short-term shocks to provincial revenues.
Moreover, the spatial distribution of violence throughout the period of study is highly cor-
related with language share, suggesting that ethnicity is likely a much more important driver
of local violence than revenue fluctuations (see Section 6 and Appendix C). Indeed, for reverse
causality to be a challenge to identification, one would need to believe not only that 1) local
Taliban commanders have advance access to the government’s financial statements, but also that
2) they can interpret them to extract the information beyond that contained in lagged revenues,
violence, and prices, and that 3) they put more emphasis on this than, say, long-lasting social
and cultural ties when determining military strategy. This seems at least a little unlikely.
However, measurement error is a potential further challenge to identification. Conflict fatalities
may be imperfectly recorded. Or realized conflict deaths may be a noisy indicator of the risk of
violence – a potentially more important determinant of local collection efficiency or economic
activity. Nevertheless, this particular channel will bias coefficients towards zero, meaning that
reported estimates are a lower bound on the true cost of conflict.
A natural solution to omitted variables bias and measurement errors (and possible reverse
causality) is to try to find an instrument. In Section 6 I use ethnicity as an instrument in the
24. Corruption in particular is a very serious problem in Afghanistan. The country ranked 174th of 180 countriesin Transparency International’s 2019 Corruption Perceptions Index and has been in the bottom decile in everysurvey since 2013.
20
cross-section, and get results at least as large as without an instrument. However, the instrument
is weak in a panel setting, even as the share in a shift-share (i.e. Bartik (1991) type) instrument
once fixed effects and controls are included.
4.6 Robustness
Appendix B also contains further robustness checks, including omitting potentially influentially
observations, and detailed results from annual data with more controls. The corresponding results
all further corroborate the main results in this section.
5 Generalized synthetic control estimation
In this section I present estimates of the impact of conflict on government revenues from a gen-
eralized synthetic control estimation approach.
The generalized synthetic control approach, proposed by Powell (2017), provides estimates
which are robust to unknown omitted variables bias under fairly weak identifying assumptions.
I extend this method to calculate dynamic impulse responses, and show that the results are
statistically significant and in line with the OLS estimates.
5.1 Generalized synthetic controls: a brief introduction
The generalized synthetic control estimator, proposed by Powell (2017), extends the synthetic
control approach of Abadie and Gardeazabal (2003) to a setting with a continuous treatment. The
key econometric challenge that both seek to overcome is a failure of the exogeneity assumption:
that there may be correlation between a dependent variable the assignment of a treatment. In
regression settings, one can seek to address these issues by controls, including possibly lagged
dependent variables and (in a panel) time and/or unit fixed effects. Indeed, it is exactly this
concern that motivates the regression specifications of Section 4. However, this approach is fairly
restrictive, relying on the hope that the controls “soak up” all correlated shocks, and limiting the
possibility of arbitrary spatial correlation.
These challenges are particularly pertinent in the current setting, as controls are somewhat
sparse. Price controls are only available at the regional level, and while the meteorological con-
trols are quite rich, there is much non-climatic variation which they may fail to pick up. In
particular, there may be good reason to think that there is spatial correlation of time fixed ef-
fects in Afghanistan, which the aggregate time fixed effects will not control for. Provinces vary
considerably in ways that that are likely to produce common but complex variation in revenues:
some provinces are economically and culturally close integrated with Pakistan, others with Iran or
21
central Asia; in some provinces wheat is the principal crop, and others maize; some have airports,
or customs houses, or dams, and others do not; and some are snowy and mountainous, whereas
lowland regions depend on upstream precipitation.
The insight of the synthetic control approach, and which provides estimates robust to these
concerns, is that one need not explicitly identify all possible forms or sources of confounding
variation. Instead, what matters for inference is their overall impact on the dependent variable.
The synthetic control estimator applies this insight by constructing a “synthetic” comparator for
each unit i by weighting the other units according to some matching criterion. If the match is good
then the synthetic and real units should exhibit the same fluctuations due to the omitted variables.
Subtracting the real from synthetic outcomes removes the impact of the omitted variables from
both, leaving only the causal effect of the difference in their treatments. In the usual binary
treatment setting, matching only happens in pre-treatment periods. But in in the general case,
matching happens in all units at all times, accounting simultaneously for the estimated impact
of the treatment variable Dit.
Formally, in Powell (2017) the data is assumed to be generated by:
Yit = β′Dit + µ′iλt + eit (6)
Where Yit the outcome variable, Dit is the treatment, λit a vector of time-varying factors, and
µi is a vector of unit-specific factor loadings. As in Powell (2017), the treatment Dit is continuous
and potentially of multiple dimensions. This is in contrast to the canonical synthetic control
setting of Abadie and Gardeazabal (2003) where it is binary and univariate. Note that if µ′iλit is
correlated with Dit, regression using time and unit fixed effects will not recover the true value of
β (i.e. there is omitted variables bias).
For each period and province, a synthetic comparison can be constructed as a weighted average
of the other provinces with constant weights wij, where 1) wii = 0 ∀ i, 2) wij ∈ [0, 1] ∀ i, j, and
3)∑
j 6=iwij = 1 ∀ i. Given such weights, and given a candidate value of β, we can construct
“untreated” outcomes for both the real and synthetic provinces in each period t, Yit − βDit and∑j 6=iwij(Yjt−β′Djt). The estimator then minimizes the square difference between these untreated
outcomes over both wij and β, analogous to the “pre-treatment” matching in a standard synthetic
control model.
Thus, the generalized synthetic control estimator is a vector of coefficients β, and a N × Nmatrix of coefficients W = {wij} solving:
(β, W ) = arg minβ,W
1
2NT
N∑i=1
T∑t=1
[Yit − β′Dit −
∑j 6=i
(wij (Yjt − β′Djt))
]2 (7)
22
One of the key results in Powell (2017) is to show that if there exist weights wij satisfying
E [µ′iλt + eit|Dit] =∑j 6=i
wij(µ′iλt + eit) (8)
then the estimate defined by equation (7) is a consistent and asymptotically normal estimate for
the treatment effect, β. Intuitively, this assumption is just the requirement that the impact of
the factors not be too concentrated. If, for example, element i is the only unit with non-zero
dependence on factor j, then of course the variation in the other units will yield no information
about this factor.
In Appendix D I show that when the data generating process is not purely static as in equation
(6) but instead allows for a persistent impact of the treatment at multiple horizons, generalized
synthetic controls can be modified to give consistent estimates of the impulse response to a
time-t shock conditional on time-t information (i.e. a local projection), when assumption (8) is
supplemented with a standard identifying assumption for local projections. Mechanically, this
approach works by first estimating equation (7) for h = 0, to give an initial impulse b0 and weights
W0. Then, I estimate impulses at horizons h > 0 by solving equation (7) with the weights held
fixed at their horizon-0 values,W0. This is equivalent to computing the difference between outturns
and the synthetic controls for the dependent and treatment variables, (Yi,t+h −∑
j 6=i w0ijYj,t+h)
and (Dit −∑
j 6=i w0ijDjt), and then regressing the former on the later.
This is also analogous to how impulse responses are estimated the “standard” synthetic con-
trol method. First, the weights are estimated on the pre-treatment period. Then, the impulse
responses are computed from comparing outturns in treated units to their corresponding synthetic
untreated selves. The only differences are that 1) pre- and post-treatment times are defined rel-
ative to the every period t, 2) treatment intensity is continuous, and 3) the comparison in this
second step occurs via regression.
5.2 Estimated impulse responses
Table 3 present estimates using the synthetic control method. The estimation technique guards
against omitted variables bias, so the controls for local prices and weather are excluded from this
model. However, the motives for including the lagged independent and dependent variables – to
absorb serial correlation of shocks and persistent responses respectively – remain. So I include
between one and eight quarterly lags of revenue and conflict intensity in the GSC specification,
i.e. as elements of Dit in equation (6).
Analogous to Table 1, the full impulse responses are summarized at two horizons (full impulse
responses for select specifications follow in Figure 7). The first is the maximum cumulative impact
23
for the GSC impulse response for cumulative log revenues. The second is the same cumulative
impact but measured at the same horizon as the maximum OLS impact reported in Table 1.
Columns (1)-(4) present one step estimates, solving exactly the problem given in equation
(7). Columns (5)-(8) present the results of a two step estimator, where differences between the
true and synthetic untreated provinces are weighted by the inverse of the province-specific errors
from the one step estimate. These weighting schema, suggested by Powell (2017), down-weight
provinces where the model fit is poor, preventing them from dominating the estimation. These are
also typically more stable across specifications and so are the headline estimates for this section.
The magnitude of the estimated cumulative impacts is between -0.5 and -0.8 in almost all the
two-step estimates. Taking this as a benchmark for the estimates without omitted variable bias,
one can draw two conclusions. First, that omitted variables bias is potentially a severe problem
for OLS estimation without controls, producing estimated long-run responses that are about half
their true levels – compare the OLS and GSC estimates in specifications (1) and (5). Second, that
the controls used in the OLS estimation are in fact sufficient for addressing omitted variables bias,
as evidenced by the much closer agreement between the GSC and OLS estimates in specifications
(7) and (8).
The long-run effects are also typically statistically significant at the 5 percent level, at least for
the two-step estimator. Two-sided p-values for the long-run impact being non-zero are computed
using the asymptotic test suggested by Powell (2017). This is valid for fixed N as T → ∞, and
is robust to arbitrary correlation across provinces.25
The GSC estimates match the richest OLS specifications not only in the long run but also
in their dynamic paths. Focusing on long-run effects obscures this point somewhat, so Figure 7
shows the full dynamic response of the GSC estimates for specification (8), compared to their OLS
analogue. Here, approximate standard errors are computed by inverting the asymptotic normal
distribution using p-values for the test that the horizon h response is non-zero. Overall, the
impulse response is strikingly similar across the OLS and GSC methods, with sluggish dynamics
and an overall impact consistent with that reported in section 4. Indeed, even restricting to the
the simplest specification, specification (5) in Table 3, provides visually similar results.
The effect of the two-step weighting in columns (5)-(8) provides estimates that are, in general,
a little more stable across specifications than the one-step estimates in (1)-(4). This can be
interpreted as a further robustness check against a factor which might impede OLS estimation:
that a small number of points where the model fits poorly might have outsize influence on the
results. In Appendix B, I try to account for this by excluding provinces which might be, ex ante,
thought of as particularly influential. The weighting approach used in Table 3 is a more scientific
alternative, downweighting provinces where the fit is poor. That the results are broadly in line is
25. See Appendix D for further details on this procedure.
24
0 2 4 6 8 10
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
Horizon, h
Rep
onse
coe
ffici
ent,
βh
●
● ●
●
● ●
●
●
● ● ●
●
●
Local projection, GSC (Table 3, spec. (8)) +/2 standard errorsLocal projection, OLS (Table 1, spec. (6))Local projection, GSC (Table 3, spec. (5))
Figure 7: Cumulative revenue loss, log quarterly revenue. Approximate GSC standard errorsinferred from bootstrap χ2 test p-values.
25
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
One
step
One
step
One
step
One
step
Tw
ost
epT
wo
step
Tw
ost
epT
wo
step
Spe
cifi
cati
onde
tail
sC
onlict
inte
nsi
tyla
gs0
14
80
14
8R
even
ue
lags
11
48
11
48
Lag
ged
pri
ce&
wea
ther
01
48
01
48
(loca
lpro
j.on
ly)
Gen
eral
ized
Syn
thet
icC
ontr
olM
ax.
cum
ul.
impac
t-0
.907
***
-0.6
05**
-0.4
70**
*-0
.597
-0.5
06**
*-0
.525
-0.7
48**
-0.7
84*
p-v
alue
0.00
10.
049
0.00
70.
633
0.00
00.
234
0.02
60.
090
Hor
izon
1212
99
911
99
Impac
tat
OL
Shor
izon
-2.9
05-2
.002
-1.7
94**
*-2
.280
-1.8
99**
*-1
.786
-2.5
39**
*-2
.687
**p-v
alue
0.78
10.
234
0.00
50.
637
0.00
00.
662
0.00
90.
026
Hor
izon
77
88
77
88
Ord
inay
Lea
stS
quar
esM
ax.
cum
ul.
impac
t-0
.304
-0.1
29-0
.611
***
-0.6
49**
*-0
.304
-0.1
29-0
.611
***
-0.6
49**
*(0
.337
)(0
.276
)(0
.180
)(0
.180
)(0
.337
)(0
.276
)(0
.180
)(0
.180
)H
oriz
on7
78
87
78
8∗ p<
0.1∗∗p<
0.05∗∗∗ p<
0.01
Tab
le3:
Gen
eral
ized
synth
etic
contr
oles
tim
ates
.G
SC
two-
sided
test
sca
lcula
ted
usi
ng
a1,
000
poi
nt
boos
trap
asin
Pow
ell
(201
7).
26
further evidence that this is not a major problem in the OLS setting.
The GSC estimates also address one further challenge to the OLS estimates: over-fitting. As
well as eight lags each of the dependent and independent variables, the headline OLS specification
estimates 51 time- and 34 province- fixed effects, as well as eight lags of four quarterly weather
variables and four price variables (and, in the case of specification (7), 34 province-specific time
trends). A skeptical reader might be concerned that the results of Table 1 are spurious (even in
spite of the large and significant F-statistics) and a result of simply adding more and more controls
until a desirable result is found. In contrast, the GSC results estimate none of these extra control
variables. Yet even specifications with very short lag structures still provide impulse responses
which look very similar to the richest OLS specification, suggesting that the OLS results are not
a product of an overly rich regression.
5.3 Provincial weights
As a way to inspect the mechanism of this estimator a little, Figure 8 shows the weights from
the two step estimator in column (8) of Table 3 for the six provinces with the highest average
revenue: Balkh, Kandahar, Kabul, Herat, Nangarhar, and Nimruz (see Table 6 in Appendix A.1
for a full breakdown of provincial revenues). Two key characteristics stand out. First, that all of
the major six provinces have considerable weight on each other in their synthetic matches. The
synthetic matches for these six provinces put, on average, around 55 percent of their weight on the
other five major provinces (out of 34 province total). Second, that weight which is not assigned
to other major six provinces tends to be placed on neighboring provinces: see in particular Kabul
and Herat.
These patterns are reassuring. We should not be surprised that the natural comparator for a
major province is some combination of neighboring and other major provinces. The advantage of
the synthetic control method is that we can be agnostic about the exact combination of neighbors
and similar provinces when forming a control, instead letting the data speak to the appropriate
mix. In contrast, using control variables in an OLS is severely constrained by what data is
available.
6 Cross-sectional evidence from the 2006-2007 insurgency
With sophistication comes scope for deception. Dynamic statistical models like those used in
the preceding sections in particular leave much room for manipulation through choices over lag
structure, covariates and the like. Thus the skeptical reader may quite reasonably be concerned
that the headline results in this paper, which rely on a variety of dynamic statistical models,
27
0.00 0.25 0.50 0.75 1.00Balkh GSC weights
(a) Balkh
0.00 0.25 0.50 0.75 1.00Kandahar GSC weights
(b) Kandahar
0.00 0.25 0.50 0.75 1.00Kabul GSC weights
(c) Kabul
0.00 0.25 0.50 0.75 1.00Herat GSC weights
(d) Herat
0.00 0.25 0.50 0.75 1.00Nangarhar GSC weights
(e) Nangarhar
0.00 0.25 0.50 0.75 1.00Nimruz GSC weights
(f) Nimruz
Figure 8: Generalized synthetic control weights for six largest revenue provinces. Province towhich weights apply outlined in black. 28
are the product of econometric sleight-of-hand rather than robust statistical relationships. As a
check on this work, I therefore present my analysis in the simplest possible setting: cross-sectional
differences in provincial revenue responses to the wave of violence that swept through southern
Afghanistan in 2006-2007.
Of course, this approach has many drawbacks. With only 34 provinces, robustness checks
are constrained by degrees of freedom. Statistical inference is likewise problematic. Thus, if
performed in isolation, this exercise would not be convincing. Yet this lack of comprehensiveness
and sophistication brings a benefit: an almost absolute transparency. And because the the more
complex models of previous sections produce similar results, we can be sure that the main findings
of this paper are a consequence neither of the naivety of a simple cross-sectional analysis, or the
opacity of dynamic models, but are instead a robust feature of the data.
The insurgency of 2006 occurred when Taliban-aligned forces – routed by NATO in 2001-2002
but able to find shelter in sparsely populated regions either side of the border with Pakistan –
launched a sustained campaign of terrorist violence across the South.26 At the same time, the rest
of the country remained relatively safe (see Figure 1). As a result, there is considerable variation
in cross-sectional conflict violence in this period.
6.1 Ordinary least squares
Taking 2005 as a baseline pre-insurgency year and the 2007-08 average as the post-insurgency
period, Figure 9 plots the change in log revenues between against the change in provincial conflict
fatalities per 1,000 for each province. Averaging across two post-insurgency years minimizes the
noise in the revenue data, which is large, and allows for some heterogeneity in the timing of the
response, which the dynamic models suggest is usually maximized at around 1-2 years.
The regression coefficient for this simple bivariate relationship is, at -0.55, very similar to
those computed in the dynamic models and statistically significant at the 1 percent level (see
column (1) of Table 4).
In a two-period setting, taking first differences is identical to including province fixed effects in
the level. And so this approach already controls for omitted variables that might induce systematic
differences in provinces and so lead to a non-causal correlation between conflict and revenues, at
least in the levels. Yet other factors may still affect the growth of both log revenues and conflict
deaths. As a simple way of trying to address this concern, specifications (2)-(5) include further
covariates, capturing different ways in which initial conditions in provinces might vary by income
(initial revenues), importance of agriculture (precipitation and days of frost27), the gains from
26. Bordering Pakistan and (to a lesser extent) Iran, the Southern region of Afghanistan comprises Helmand,Kandahar, Nimroz, Uruzgan, and Zabul Provinces, and includes around 11 percent of the national population.
27. Days of frost appear to be much more predictive of local revenues than just average temperature, presumably
29
Table 4: Cross-sectional regressions: 2005 vs 2007-08 average
Dependent variable: Provincial revenues, log difference(1) (2) (3) (4) (5) (6)
Ordinary least squaresDeaths per 1,000, difference −0.554∗∗∗ −0.524∗∗∗ −0.519∗∗ −0.566∗∗∗ −0.813∗∗∗ −0.779∗∗∗
(0.114) (0.132) (0.207) (0.165) (0.186) (0.263)
Log revenues, 2005 −0.078∗∗∗ −0.085∗∗∗
(0.022) (0.013)
Log light intensity, 2005 −0.106(0.211)
Precipitation (mm), 2005 0.0002 −0.001 −0.001(0.001) (0.001) (0.001)
Days Frost, 2005 −0.0002 −0.001∗ 0.001∗∗∗
(0.001) (0.0005) (0.0002)
Log opium price, 2005 −0.906∗∗∗ −0.830∗∗∗ −0.933∗∗∗
(0.290) (0.233) (0.279)
Log wheat price, 2005 9.280∗∗∗ 12.078∗∗∗ 14.649∗∗∗
(2.848) (3.002) (3.754)
Median regressionDeaths per 1,000, difference −0.724∗∗∗ −0.679∗ −0.623∗∗∗ −0.700∗∗ −0.807∗∗∗ −0.617∗∗
(0.249) (0.350) (0.173) (0.281) (0.195) (0.290)
Future growth controlsDeaths per 1,000, difference −0.554∗∗∗ −0.524∗∗∗ −0.493 −0.527∗∗∗ −0.576 −0.494
(0.114) (0.132) (0.397) (0.190) (0.410) (0.440)
Instrumental variablesDeaths per 1,000, difference −0.889∗∗ −0.661 −1.154∗∗∗ −2.537∗∗∗ −2.391∗∗∗ −2.542∗∗∗
(0.436) (0.626) (0.333) (0.734) (0.452) (0.621)
Observations 34 34 34 34 34 34Adjusted R2 (OLS) 0.122 0.204 0.070 0.309 0.365 0.284Residual std dev (OLS) 0.44 0.41 0.44 0.38 0.34 0.36Instrument F stat 10.3 10.0 11.8 7.3 4.0 3.8
p-value 0.00 0.00 0.00 0.01 0.06 0.06Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Standard errors clustered at the region in parentheses
30
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.5
1.0
1.5
0.0 0.5 1.0 1.5
Conflict fatalities per 1,000: difference
Log
reve
nues
: diff
eren
ce
Region
● Rest of Country
South
Figure 9: Change in conflict fatalities and revenues during the southern insurgency (full-sampleregression line in black)
fighting (price of opium), and the cost of living (price of wheat). And as an alternative way of
measuring initial economic conditions, specification (6) replaces initial provincial revenues with
provincial log light intensity. In all cases, the relationship shown in Figure 9 still holds after
accounting for all these factors.
To check that these results are not driven by outliers, Table 4 also reports the coefficient on
conflict fatalities from a median quantile regression for the same specifications. This method
minimizes the impact of extreme values, and the results do not provide evidence that outliers
drive the OLS estimates.
As a way of understanding the channels of the effect described in this table, the row labeled
“future growth controls” replaces all non-conflict covariates with their growth rate between the
pre-conflict baseline and the post-conflict outcome (except of course, log revenues), analogous to
the exercise in in Section 4.3. Including covariates which should proxy for changes in economic
activity in each province measures the extent to which the revenue response is due to changes
because longer or later frosts can have large impacts on crops and animals.
31
in the local tax base. If conflict reduced revenue growth by inhibiting growth in local economic
activity, then this would be correlated with in changes in local nighttime lights or prices, leading
to a zero coefficient on conflict fatalities. In general, the point estimates are little different from
those in the main regression (albeit with much larger standard errors on account of worse model
fit). This suggests that the revenue response is not due to changes in the tax base, but instead
due to conflict interrupting the government’s ability to claim a given share of the tax base in
revenues.
6.2 Instrumental variables
Table 4 also shows instrumental variables estimates, using the provincial share of Pashto-speakers
in 2003-2005 as an instrument. Appendix C provides further details on the validity of language
share as an instrument, but the basic motivation is that the Taliban has always been an over-
whelmingly Pashtun-dominated group, and that the Taliban has remained at the core of the
resistance to the Afghan government since its fall in 2001. The ability of the Taliban to act more
effectively in friendly territory therefore means that provinces with a high Pashto-speaking share
are more likely to be a focus of insurgent violence, independent of other determinants of revenue
growth.
The resulting IV coefficients typically much larger than the OLS estimates, suggesting that
OLS estimates may in fact be under-estimates. Measurement error is a probable cause of this
– deaths in war are likely a noisy signal of the underlying security situation. And while the
instrument is relatively strong (in specifications (1)-(3) at least, which all have F-statistics of 10
or more), the coefficients are less stable. This is largely to be expected when doing a two-step
procedure with a small sample.
In summary, the estimates from the cross-provincial evidence show that: OLS estimates sug-
gest that an increase in 1 fatality per 1,000 people can reduces revenues by at least one half; that
this is robust to including other covariates; that future economic conditions do not explain the
impact of conflict; and IV estimates are at least twice as large, albeit more uncertain.
Alone these results are, of course, not completely convincing; in such a small sample one
cannot properly address issues of omitted variable bias, measurement error, and reverse causality
– let alone use asymptotic standard errors in good faith. But these results do not stand alone.
Rather, they serve a useful counterpoint to the their full dynamic equivalents in Sections 4 and 5,
which are designed with precisely these challenges in mind, but can be somewhat opaque. Indeed,
the results from the dynamic models mirror those summarized in Table 4: the long-run impact
in these models is around -0.5 to -0.6 for OLS estimates and this cannot be reduced by including
future economic covariates. This should be reassurance that the analysis is identifying a robust
32
feature of the data.
7 Measuring the cost of conflict
Using the estimates computed in the preceding sections, I compute two counterfactual measures
of the cost of conflict in Afghanistan. The evidence presented in Appendix B.1 suggests that
there are no spillover effects of violence in one province on the revenue collected in its neighbors.
So adding up local impacts will give a valid measure of the national revenue loss from conflict.
The first measure is forward-looking. Given an estimated cumulative impact of conflict γ this
is defined as:
M1(γ) =∑i
wi,2017 eγdi (9)
where i indexes provinces, wi,2017 is the 2017 of share provincially collected revenues in province i,
and di =(∑
t∈2004 xi,t −∑
t∈2017 xi,t)/12 is the difference in average conflict shocks between 2017
and 2006. Thus, M1(γ) measures the estimated per-year increase in relative revenues if violence
permanently returns back to 2006 levels, relative to 2017.
The second measure is backward-looking. It is given by:
M2(γ) =1
rev2017 × usd2017
∑i,t
(1 + r)(T−t)usdt × revi eγ(xi,t−xi,0) (10)
where i indexes provinces, wi,2017 is the 2017 of share provincially collected revenues in province
i, and di =(∑
t∈2004 xi,t −∑
t∈2017 xi,t)
is the difference in average annual conflict deaths per
10,000 between 2017 and 2004. M2(γ) thus computes the estimated per-year increase in relative
revenues if violence permanently returns back to levels seen in 2004 (the last year prior to the
sample period), relative to 2017.
Table 5 presents computed values for M1(γ) and M2(γ), using point estimates from the ap-
proaches pursued in previous sections. Columns (1)-(3) present the OLS estimates from the local
projection specifications. Column (2) is the headline OLS estimate. As discussed before, dynamic
equations estimates are likely mis-specified so are omitted. Columns (4)-(7) use long-run effects
from generalized synthetic controls.
Throughout, the results imply a very large annual peace dividend – the headline estimate is
51 percent of revenues, or about 6 percent of GDP, every year. This is roughly equivalent to
the Afghan government’s combined annual spending on development projects in infrastructure,
education, health, and agriculture (which amounted to 6.3 percent GDP in 2017). That the
unweighted estimate is smaller throughout reflects the (small) positive cross-sectional correlation
between average revenues and increases in violence; conflict deaths increased slightly more in the
33
LP
-OL
SL
P-O
LS
LP
-OL
SG
SC
-IG
SC
-IG
SC
-II
GSC
-II
(1)
(2)
(3)
(4)
(5)
(6)
(7)
γ-0
.605
***
-0.6
49**
*-0
.500
***
-1.7
94**
*-2
.280
-2.5
39**
*-2
.687
**
M1(γ
),%
ann.
rev
4651
3631
4249
53
M1(γ
),unw
eigh
ted,
%an
n.
rev
4651
3631
4249
53
M2(γ
),U
SD
bns
2.94
3.24
2.28
1.99
2.71
3.14
3.40
M2(γ
),%
ann.
rev
237
261
184
160
218
253
274
Sec
uri
tyla
gs0
88
48
48
Rev
enue
lags
88
84
84
8
Met
eoro
logi
cal
and
pri
cela
gs8
88
Pro
vin
cial
tim
etr
ends
No
No
Yes
Ref
eren
celo
cati
on(1
,4)
(1,6
)(1
,7)
(3,3
)(3
,4)
(3,7
)(3
,8)
Not
e:L
P-O
LS:
Loca
lpro
ject
ion
OL
S,
GSC
-I:
One-
step
gener
aliz
edsy
nth
etic
contr
ol,
GSC
-II:
Tw
o-st
ep
gener
aliz
edsy
nth
etic
contr
ol.
Shar
esar
eex
pre
ssed
asa
per
cent
of20
17an
nual
pro
vin
cial
ly-c
olle
cted
.
reve
nues
.“R
efer
ence
loca
tion
”is
the
table
and
spec
ifica
tion
num
ber
sin
this
pap
erw
her
ees
tim
ates
ofγ
are
firs
tre
por
ted,
e.g.
(1,2
)w
ould
refe
rto
Tab
le1,
spec
ifica
tion
2.D
ynam
iceq
uat
ion
esti
mat
esar
eth
e
infinit
e-hor
izon
impac
t,ot
her
sar
eta
ken
atth
eJor
da-
OL
Sm
axim
alim
pac
thor
izon
inT
able
1.
Tab
le5:
The
fisc
alco
stof
conflic
t.
34
provinces which contribute large shares of total revenues.
This estimated peace dividend is of the same order of magnitude as other estimates of the
cost of conflict. Mueller and Tobias (2016) is one of the few papers in this literature which uses a
continuous measure of conflict intensity (most others use binary conflict/no conflict indicators).
Relying on cross-country data they estimate that a civil war with at least 0.8 annual deaths per
10,000 entails a permanent per capita annual GDP loss of around 18 percent. The Afghan conflict
has been, by this measure, significantly more intense than this minimum cutoff, with fatalities
in 2014-2017 averaging 4 per 10,000 per year. As a very rough comparison, let us assume that
the average conflict in the Mueller and Tobias (2016) sample is twice as intense as the minimum
cutoff. Then applying the average GDP loss to Afghanistan’s annual fatality rate would give
a flow GDP loss of 45%.28 This is comfortably within the range of estimates in Table 5. Of
course, this calculation varies in many dimensions from the main topic of this paper, not least
as it measures (per capita) GDP rather than revenue. Nevertheless, this seems to be the closest
point of comparison in the literature and the results are at least comparable.
The historical loss calculations, represented by M2(γ) are much larger, as this is an accu-
mulated stock. The average across estimates reported in Table 5 is more than 225 percent of
provincially-collected revenues, equivalent to nearly 15 percent of GDP, close to $2.8bn. This
more than total non-discretionary developments grants to the Afghan government cumulative
from 2011 to 2017,29 and is similar to the United States’ contribution to the Afghanistan Recon-
struction Trust Fund.30
8 Conclusion
In this paper I present estimates of government revenue loss due to conflict in Afghanistan. I
find that the fiscal peace dividend is likely to be large, around 50 percent of current revenues.
The historic cost of the Afghan conflict is even larger, at around 225 percent current revenues –
some 3 billion US dollars, or equivalent to the United States contribution to the main Afghan
reconstruction fund. I use an extended version of generalize synthetic controls to show that OLS
estimates are not subject to omitted variables bias – the main challenge to causal identification.
I also present evidence that this loss arises due to a reduction in revenue share of local activity,
28. As 18÷ (0.8× 2)× 4 = 4529. Non-discretionary development grants are the main channel through which foreign donors disburse fund
for specific development projects (as opposed to discretionary development grants, which can be used by thegovernment for any purpose).
30. The ARTF is the main vehicle by which donor governments provide budgetary support to the governmentof Afghanistan. As of April 2018, the United States was the largest contributor, providing $3.2bn of the $10.9bntotal contributions.
35
rather than reduced local economic activity, and implying that revenue loss reflects reduced state
capacity.
These estimates are valuable in their own right as a measure of the gains that peace could bring
to a country long wrought by violence. But they also have a broader applicability to international
attempts to increase state capacity and public living standards in conflict-affected areas. In such
settings, provision of security competes for finite resources with many other valuable activities
such as investment in infrastructure, education, health care, and the like. The evidence helps
evaluate these difficult tradeoffs by measuring the expected fiscal gain from policies which can
reduce conflict violence.
36
References
Abadie, Alberto, and Javier Gardeazabal. 2003. “The economic costs of conflict: A case study of
the Basque Country.” American Economic Review 93 (1): 113–132.
Acemoglu, Daron, Suresh Naidu, Pascual Restrepo, and James A Robinson. 2018. “Democracy
Does Cause Growth.” Journal of Political Economy.
Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat, and Romain Wacziarg.
2003. “Fractionalization.” Journal of Economic Growth 8 (2): 155–194.
Atlas Narodov Mira. 1964. Moscow: Miklukho-Maklai Ethnological Institute at the Department of
Geodesy and Cartography of the State Geological Committee of the Soviet Union.
Bartik, Timothy J. 1991. “Who benefits from state and local economic development policies?”
Berman, Eli, Jacob N Shapiro, and Joseph H Felter. 2011. “Can hearts and minds be bought?
The economics of counterinsurgency in Iraq.” Journal of Political Economy 119 (4): 766–819.
Besley, Timothy, and Hannes Mueller. 2012. “Estimating the Peace Dividend: The impact of
violence on house prices in Northern Ireland.” American Economic Review 102 (2): 810–33.
Blattman, Christopher, and Edward Miguel. 2010. “Civil war.” Journal of Economic literature
48 (1): 3–57.
Blumenstock, Joshua Evan, Tarek Ghani, Sylvan Rene Herskowitz, Ethan Kapstein, Thomas
Scherer, and Ott Toomet. 2018. “Insecurity and industrial organization: Evidence from
Afghanistan.”
Bove, Vincenzo, and Evelina Gavrilova. 2014. “Income and Livelihoods in the War in Afghanistan.”
World Development 60:113–131.
Brodeur, Abel. 2018. “The Effect of Terrorism on Employment and Consumer Sentiment: Ev-
idence from Successful and Failed Terror Attacks.” American Economic Journal: Applied
Economics.
Ciarli, Tommaso, Chiara Kofol, and Carlo Menon. 2015. “Business as unusual. An explanation of
the increase of private economic activity in high-conflict areas in Afghanistan.”
Collier, Paul. 1999. “On the economic consequences of civil war.” Oxford economic papers 51 (1):
168–183.
Collier, Paul, and Anke Hoeffler. 1998. “On economic causes of civil war.” Oxford economic papers
50 (4): 563–573.
Cunningham, Kathleen Gallagher, and Nils B Weidmann. 2010. “Shared space: Ethnic groups,
state accommodation, and localized conflict.” International Studies Quarterly 54 (4): 1035–
1054.
37
D‘Souza, Anna, and Dean Jolliffe. 2012. “Rising food prices and coping strategies: Household-level
evidence from Afghanistan.” Journal of Development Studies 48 (2): 282–299.
Dube, Oeindrila, and Juan F Vargas. 2013. “Commodity price shocks and civil conflict: Evidence
from Colombia.” The Review of Economic Studies 80 (4): 1384–1421.
Esposito, John L. 2004. The Oxford Dictionary of Islam. Oxford University Press.
Fairfield, Annah, Kevin Quealy, and Archie Tse. 2009. “Troop Levels in Afghanistan Since 2001.”
New York Times (October 1).
Fearon, James D, and David D Laitin. 2003. “Ethnicity, insurgency, and civil war.” American
political science review 97 (1): 75–90.
Floreani, Vincent Arthur, Gladys Lopez-Acevedo, and Martın Rama. 2016. “Conflict and Poverty
in Afghanistan’s Transition.”
Gehring, Kai, Sarah Langlotz, and Kienberger Stefan. 2018. “Stimulant or Depressant?: Resource-
Related Income Shocks and Conflict.”
Guidolin, Massimo, and Eliana La Ferrara. 2007. “Diamonds are forever, wars are not: Is conflict
bad for private firms?” American Economic Review 97 (5): 1978–1993.
Gupta, Sanjeev, Benedict Clements, Rina Bhattacharya, and Shamit Chakravarti. 2004. “Fiscal
consequences of armed conflict and terrorism in low-and middle-income countries.” European
Journal of Political Economy 20 (2): 403–421.
Harris, IPDJ, PD Jones, TJ Osborn, and DH Lister. 2014. “Updated high-resolution grids of
monthly climatic observations–the CRU TS3. 10 Dataset.” International Journal of Clima-
tology 34 (3): 623–642.
Henderson, J Vernon, Tim Squires, Adam Storeygard, and David Weil. 2018. “The global distri-
bution of economic activity: nature, history, and the role of trade.” The Quarterly Journal
of Economics 133 (1): 357–406.
Hultman, Lisa, Jacob Kathman, and Megan Shannon. 2013. “United Nations peacekeeping and
civilian protection in civil war.” American Journal of Political Science 57 (4): 875–891.
IMF. 2017. “Measuring the Fiscal Revenue Cost of Conflict in Afghanistan.” Islamic Republic of
Afghanistan; Selected Issues, no. Country Report 17/378: 3–11.
Iyigun, Murat, Nathan Nunn, and Nancy Qian. 2017. The Long-run Effects of Agricultural Pro-
ductivity on Conflict, 1400-1900. Technical report. National Bureau of Economic Research.
Johnson, Thomas H, and M Chris Mason. 2007. “Understanding the Taliban and insurgency in
Afghanistan.” Orbis 51 (1): 71–89.
Jorda, Oscar. 2005. “Estimation and inference of impulse responses by local projections.” Amer-
ican economic review 95 (1): 161–182.
38
Knight, Malcolm, Norman Loayza, and Delano Villanueva. 1996. “The peace dividend: military
spending cuts and economic growth.” Staff papers 43 (1): 1–37.
Mc Evoy, Claire, and Gergely Hideg. 2017. “Global Violent Deaths 2017.” Small Arms Survey.
Michalopoulos, Stelios, and Elias Papaioannou. 2016. “The long-run effects of the scramble for
Africa.” American Economic Review 106 (7): 1802–48.
Miguel, Edward, Shanker Satyanath, and Ernest Sergenti. 2004. “Economic shocks and civil con-
flict: An instrumental variables approach.” Journal of political Economy 112 (4): 725–753.
Mueller, Hannes. 2016. “Growth and violence: argument for a per capita measure of civil war.”
Economica 83 (331): 473–497.
Mueller, Hannes, and Julia Tobias. 2016. “The cost of violence: Estimating the economic impact
of conflict.” International Growth Centre.
Nickell, Stephen. 1981. “Biases in dynamic models with fixed effects.” Econometrica: Journal of
the Econometric Society: 1417–1426.
Parlow, Anton. 2016. “Adult Health Outcomes during War: The Case of Afghanistan.”
Peters, Heidi M., Moshe Schwartz, and Lawrence Kapp. 2017. “Department of Defense Contractor
and Troop Levels in Iraq and Afghanistan: 2007-2017.” Congressional Research Service.
Powell, David. 2017. “Synthetic control estimation beyond case studies: Does the minimum wage
reduce employment?”
Ramey, Valerie A. 2016. “Macroeconomic shocks and their propagation.” In Handbook of Macroe-
conomics, 2:71–162. Elsevier.
Sambanis, Nicholas. 2002. “A review of recent advances and future directions in the quantitative
literature on civil war.” Defence and Peace Economics 13 (3): 215–243.
Sonin, Konstantin, Jarnickae Wilson, and Austin L Wright. 2018. “Rebel Capacity, Intelligence
Gathering, and the Timing of Combat Operations.”
Sundberg, Ralph, and Erik Melander. 2013. “Introducing the UCDP georeferenced event dataset.”
Journal of Peace Research 50 (4): 523–532.
Symon, Fiona. 2001. “Afghanistan’s Northern Alliance.” BBC News Online (September 19).
Themner, Lotta, and Peter Wallensteen. 2014. “Armed conflicts, 1946–2013.” Journal of Peace
Research 51 (4): 541–554.
UNAMA, Human Rights Service. 2018. “Afghanistan Protection of Civilians in Armed Conflict
Annual Report 2017.”
UNFPA, United Nations Population Fund, and Afghanstan CSO Central Statistics Office. 2007.
“Afghanistan A Socio-Economic Profile Household Listing 2003-2005.”
39
Vasagar, Jeevan, and Ewen MacAskill. 2005. “180,000 die from hunger in Darfur.” The Guardian
(March 16).
Wooldridge, Jeffrey M. 2010. Econometric analysis of cross section and panel data. MIT press.
World Bank. 2017. “The Toll of War: The Economic and Social Consequences of the Conflict in
Syria.”
40
A Data sources and cleaning
A.1 Detailed data description
Conflict fatalities UCDP GED: Event-level data on conflict fatalities, available for download
from http://ucdp.uu.se/downloads/. See Sundberg and Melander (2013) for many further
details on the dataset, including definitions and sources. Small Arms Survey: National fatal-
ity rate for violent deaths, available for download from http://www.smallarmssurvey.org/gbav.
See Mc Evoy and Hideg (2017), especially Annex 3, for details on sources and methodology.
UNAMA: Civilian deaths and injuries by region, 2009-17, taken from UNAMA (2018).
Cross-sectional covariates These come from the 2003 National Risk and Vulnerability As-
sessment (NRVA) in Rural Afghanistan. This collected data survey from 11,757 house-
holds, containing 85,577 individuals across all provinces of Afghanistan. The sample is then
reweighted to be representative of the rural population (over 80 percent of the national
population, according to the 2003-05 household listing).
Foreign troop levels ISAF: The “Resolute Support Facts & Figures” series is published near-
monthly, and covers both US and other coalition troops from 2007 to 2017. An archive is
available to download from https://www.nato.int/cps/en/natolive/107995.htm. Congres-
sional Research Service: Peters, Schwartz, and Kapp (2017) present figures from US Central
Command covering US troops from 2007 to 2016. New York Times: Fairfield, Quealy, and
Tse (2009) also cite US Central Command as their primary source for US troops between
2002 and 2009. Links to the original Central Command reports fora both the CRS and
NYT data no longer work, but data from 2007-09 largely overlap (see Figure 12).
Language shares UNFPA and CSO (2007) include data on the majority language of sampled
villages in the 2003-05 household listing. This measure excludes urban areas. See note on
Provincial population shares for further details about the household listing.
Meteorological data From “CRU TS4.01: Climatic Research Unit (CRU) Time-Series (TS)
version 4.01 of high-resolution gridded data of month-by-month variation in climate”, pub-
lished online by CEDA at http://catalogue.ceda.ac.uk/uuid/58a8802721c94c66ae45c3baa4d814d0.
This uses weather satellites to produce global data for a grid of one half-degree latitude by
one half-degree longitude boxes. At Afghanistan’s latitude, these are rectangles of approx-
imately 25 by 35 miles. See Harris et al. (2014) for further details about this data. I
convert the boxes to average provincial levels by averaging across CEDA gridpoints within
each province. My figures for precipitation match the ground-measured NOAA normals for
major cities reasonably well for lowland parts of Afghanistan. For example the Herat City
41
NOAA mean is 239mm, whereas my average for Herat province is 246mm. But for the more
mountainous regions, the numbers are often much larger than the NOAA measures. For in-
stance, NOAA mean precipitation for Kabul City is 312mm, whereas my measure for Kabul
province is 726mm. Presumably this occurs because cities in upland areas are located in
valleys, and so have less precipitation (snowfall in particular) than the province as a whole.
Given the importance of annual snowfalls to Afghan agriculture, using the satellite data is
likely to provide a better control for local economic conditions.
National Populations World Bank, compiled from UN World Population Prospects and na-
tional censuses. Countries used: Afghanistan, Iraq, Libya, Sudan, Sri Lanka, Somalia, and
Syria
Nighttime lights data Annual nighttime lights data are collected by the Operational Linescan
System of the US Air Force Defense Meteorological Satellite Program. From 1992 until 2013,
data were archived by the National Oceanic and Atmospheric Administration (NOAA), and
are available for download from https://ngdc.noaa.gov/eog/dmsp/downloadV4composites.html.
Since the launch of Suomi National Polar-orbiting Partnership satellite in 2011, NOAA pro-
duces average nighttime lights composite images using data from the Visible Infrared Imag-
ing Radiometer Suite Day-Night Band. Monthly data are available starting in April 2012
and can be downloaded from https://ngdc.noaa.gov/eog/viirs/download dnb composites.html
Opium prices Collected from Afghan Drug Price Monitoring monthly reports, published jointly
by UNODC and the Afghan Ministry of Counter Narcotics. These report farm gate prices
of of dry opium in the five regions used by the UNODC: Northeast, Northwest, South, East,
and West.
Provincial population shares The National Household Listing, conducted during 2003-05,
records every household in Afghanistan. This was a precursor to an intended census in
2008, and also includes basic information about households and individuals living therein.
Due to deteriorating security conditions, the 2008 survey was never conducted. UNFPA
and CSO (2007) report various statistics from the listing, including provincial populations.
The Central Statistics Office of Afghanistan (CSO) does publish annual provincial popula-
tion estimates, but I do not use them due to data quality concerns. These estimates are
computed by growing forward provincial population shares at annual growth rates. The
CSO provides no further details on source data for these growth rates; they appear to be
based purely on judgment.
Provincial revenues Monthly Fiscal Bulletins including provincial revenues are available online
42
from http://www.budgetmof.gov.af/index.php/en/2012-12-06-22-51-13/fiscal-bulletin (last
retreived 20 February 2020).
Wheat & rice prices From the World Food Program’s Vulnerability Analysis and Mapping
project. This samples prices monthly in markets in ten provinces. Six provinces have
data starting in 2005 (for wheat), and 2006 (for rice): Badakhshan, Faryab, Herat, Kabul,
Kandahar, and Nangarhar. As at least one province is in each of the five UNODC regions,
I create regional wheat and rice prices by averaging across the markets in each region.
A.2 Measuring the Afghan conflict since 2001
Figure 10 shows the nationwide annual fatality rate from violence per thousand of national popu-
lation, a measure of conflict intensity. Using nationwide conflict intensity as a guide, one can tell
the history of the Afghan conflict since the fall of the Taliban in four approximate periods: 2002-
2005, 2006-2009, 2010-2013, and 2014-2017. The divisions between these periods are indicated
by the vertical lines in the Figures 10-12.
From 2002 to 2005, the country was relatively peaceful. Deaths from conflict were no higher
than one per 1,000 nationwide. By this measure, the Afghan conflict was not particularly intense
relative to other concurrent conflicts. Table 8 illustrates this point, presenting the UCDP GED
conflict fatality rate for a sample of conflict-affected countries during this time.31
Conflict intensified markedly during 2006-2009. Increased violence was concentrated in the
South (see Figure 11). This region – overwhelmingly Pahshto-speaking, and a longtime stronghold
of the Taliban – accounted for over half of the 31,169 deaths recorded during 2006-2009.
Between 2010 and 2013, foreign troop levels surged, as shown in Figure 12.32 And while the
intensity of the conflict in the South subsided during this time, increasing violence elsewhere
meant that national conflict deaths fell little.
Comparing Figure 10 and 11 to Figure 12 emphasizes a further advantage of using the UCDP
GED data. Because the UCDP GED records all fatalities from conflict, it gives the broadest
possible measure of conflict violence. In contrast, using only military records, such as ISAF
31. Of course, given the obvious difficulties of data collection, international comparisons of conflict intensity areonly very approximate. UCDP GED data on Sudan in particular are very hard to match with other sources. In2005, the UN estimated that 10,000 civilians per month had died in Darfur over the preceding 18 months (Vasagarand MacAskill (2005)). With a population of around 30 million, this alone would have produced an average fatalityrate of 2 per 1,000 per year (180, 000÷ 30, 000, 000× 1, 000 = 6 over three years).
32. Data on from US troop levels is US Central Command, reported by the New York Times (Fairfield, Quealy,and Tse (2009), covering 2002-09), and by the Congressional Research Service (Peters, Schwartz, and Kapp (2017),covering 2007-17). Data on other international troop levels comes from ISAF’s “Resolute Support Facts & Figures”series, which covers separately US and other coalition troops during 2007-16.
43
Province Pop share Prov rev share Revenue growth(2003-2005) Mean St. dev. Mean St. dev.
Badakhshan 3.6 0.4 0.1 26.3 33.3Badghis 2.2 0.1 0.0 19.9 25.2Baghlan 3.2 0.6 0.3 11.6 47.5Balkh 4.8 16.5 3.8 21.2 44.5Bamyan 1.5 0.2 0.0 31.1 36.3Daykundi 2.0 0.1 0.0 47.2 63.5Farah 2.1 2.6 2.3 40.2 56.5Faryab 3.6 3.8 2.2 20.9 58.8Ghazni 4.7 0.4 0.1 23.2 37.9Ghor 2.8 0.2 0.0 22.0 26.0Helmand 6.0 0.7 0.3 21.4 46.2Herat 7.7 26.7 2.9 13.5 22.9Jowzjan 1.8 0.3 0.1 24.0 31.2Kabul 10.4 6.0 3.3 10.4 27.3Kandahar 4.3 7.1 1.6 14.0 29.2Kapisa 1.6 0.2 0.1 28.6 38.1Khost 2.7 1.7 1.0 5.4 57.6Kunar 1.8 0.2 0.1 26.0 78.2Kunduz 3.4 1.5 0.6 20.9 39.3Laghman 1.6 0.2 0.1 24.5 25.6Logar 1.4 0.2 0.1 29.5 29.0Wardak 2.3 0.2 0.1 31.7 28.5Nangarhar 5.8 19.1 2.7 23.1 27.1Nimruz 0.5 8.4 3.5 36.4 42.6Nuristan 0.6 0.1 0.0 26.0 68.5Paktia 2.2 1.1 0.6 27.1 41.2Paktika 3.3 0.3 0.1 23.1 31.1Panjshir 0.5 0.1 0.0 28.3 41.8Parwan 2.1 0.3 0.1 16.4 30.2Samangan 1.4 0.2 0.0 23.8 15.7Sar-e Pol 1.9 0.1 0.0 30.6 25.5Takhar 3.6 0.4 0.1 23.0 31.1Urozgan 1.4 0.1 0.0 21.7 56.2Zabul 1.5 0.1 0.0 25.3 39.2Provinces 100.0 0.0 18.7 19.0Total 20.3 16.2GDP 13.1 9.0
Table 6: Provincial revenue summary statistics, anuual 2005-2016
44
Ser
ies
Spat
ial
unit
Mea
sure
men
tU
nit
sP
erio
ds
No.
obs
Sta
rtE
nd
Mea
nW
ithin
sdA
cros
ssd
Pre
cipit
atio
nP
rovin
ceM
ilim
eter
s,to
tal
3419
265
2820
0220
1731
.029
.717
.1A
vete
mp
erat
ure
Pro
vin
ceC
elci
us,
aver
age
3419
265
2820
0220
1711
.28.
75.
2C
loud
cove
rP
rovin
ceP
erce
nt,
aver
age
3419
265
2820
0220
1736
.613
.56.
4F
rost
Pro
vin
ceD
ays,
tota
l34
192
6528
2002
2017
11.0
10.0
4.8
Opiu
mpri
ceR
egio
nA
fghan
is/k
g,gr
owth
520
410
2020
0120
170.
114
.442
.0Shee
ppri
ceR
egio
nA
fghan
is/a
nim
al,
grow
th5
166
830
2004
2017
0.5
6.0
30.6
Unsk
ille
dla
bor
Reg
ion
Afg
han
is/d
ay,
grow
th5
166
830
2004
2017
0.7
8.6
21.0
Whea
tpri
ceR
egio
nA
fghan
is/k
ilo,
grow
th5
166
830
2004
2017
0.7
7.3
19.1
Tab
le7:
Cov
aria
tesu
mm
ary
stat
isti
cs,
mon
thly
dat
a.“W
ithin
sd”
and
“Acr
oss
sd”
repre
sent
the
mea
nw
ithin
-pro
vin
cean
dac
ross
-pro
vin
cest
andar
ddev
iati
ons
resp
ecti
vely
for
each
seri
es.
45
0.2
0.4
0.6
2005 2010 2015
Source Conflict deaths (UCDP GED) Violent deaths (Small Arms Survey)
Figure 10: Nationwide deaths per thousand in Afghanistan
Country 2002-2005 2006-2009 2010-2013 2014-2016Afghanistan 0.05 0.22 0.23 0.40Iraq 0.08 0.10 0.06 0.34Sudan 0.13 0.05 0.05 0.05Sri Lanka 0.00 0.29 0.00Somalia 0.05 0.15 0.17 0.11Syria 0.00 0.00 1.42 2.96
Table 8: Deaths from conflict per thousand, international comparison
46
South South East
North East West & North West
Central & Highlands East
2005 2010 2015 2005 2010 2015
0
2000
4000
0
2000
4000
0
2000
4000
Civilian deaths & injuries (UNAMA) All deaths (UCDP GED)
Figure 11: Deaths and injuries from conflict in Afghanistan 2002-17
47
0
50,000
100,000
2005 2010 2015
US (NYT) US (CRS) US (ISAF) Total (ISAF)
Figure 12: Foreign troop levels
coalition deaths, will capture a changing share of fatalities, and hence mis-represent the true
extent of violence in Afghanistan.33
I corroborate the UCDP GED data using two other data sources. The first is the Small Arms
Survey Database on Violent Deaths (see Figure 10). This covers all violent deaths, including those
resulting from homicides and legal interventions, as well as from conflict. As should be expected,
this provides an upper bound on the UCDP GED fatality rates. It also matches the overall level
and year-to-year movements fairly well. This both provides an independent cross-check on the
UCDP GED data, as well as verifying that conflict violence is the major risk to life during this
period.34
33. Bove and Gavrilova (2014) rely only ISAF casualties as their measure of conflict when trying to estimateresponses of local prices to conflict. This is likely appropriate for the time period that they study, 2003-09. Butin subsequent years, changes in ISAF fatalities will reflect large fluctuations in aggregate troop levels as much asthe variation in actual conflict intensity.
34. Of course, the two series diverge somewhat from 2012 onward. While this might reflect difficulties of gatheringaccurate data in a war-torn country, it could also be because non-conflict deaths have indeed increased, perhapsreflecting a more general breakdown in government control.
48
The second corroborating dataset provides a cross-check on the spatial variation in the UCDP
GED data, albeit imperfectly. Figure 11 also includes civilian deaths and injuries, as measured
by the United Nations Assistance Mission to Afghanistan (UNAMA). This is narrower than the
UCDP data in one dimension (it covers only civilians), but broader in another (it includes both
deaths and injuries). For much of the sample, these two differences largely offset, suggesting
that number of civilian injuries and non-civilian deaths have been roughly equal over this period.
Moreover, the agreement of the levels across provinces makes clear that the spatial distribution
of violence across these two very different measures is similar: violence in one region relative
to another is very similar across the two measures. This provides reassurance that the spatial
variation in the UCDP data is reliable. The divergence of the UCDP GED and UNAMA data
in the last few years matches anecdotal reports that insurgent groups have shifted away from
targeting civilians, focusing instead on attacks on the security forces.
A.3 Data on provincially collected revenues
The provincial revenue data include numerous obvious outliers. Typically, these involve a coding
error in the (cumulative) monthly reports, such as a misplaced decimal point. This is then
corrected in subsequent (cumulative) reports, generating an offsetting error. Figure 13 shows the
raw monthly data for four provinces35, which all show this pattern of errors during 2006 and
2009. While such outliers are obvious to the naked eye, it is less obvious how to treat them
systematically, as the timing and magnitude of the errors is not quite the same across provinces
and periods. The negative values for revenues are particularly troublesome, as they become
missing values under a log-transformation. With lagged revenues a regressor, this removes many
sample observations.
In order to systematically remove outliers, I compute for each monthly observation a local
mean and standard deviation using a two-sided twelve-month window (the point itself is excluded
from the window). Any point which deviates from the local mean by more than 10 times the
local standard deviations is considered an outlier, and replaced with the local mean. The process
is repeated until there are no remaining outliers. This replaces 204 of the 4,828 month-province
observations, or almost exactly six outliers per province. Given that outliers usually come in
pairs, and that one can easily identify two or three candidate pairs for each province in Figure
14 by eye alone, this process seems to be reasonably accurately eliminating mis-coded points.
Quarterly and annual averages are then computed from the cleaned monthly underlying data.
The results of the outlier removal process for the sample provinces are shown in Figure 14.
Even though the data no longer include obvious mistakes, they are very volatile, with several
35. Two which collect a large share of provincial revenues – Herat and Nangarhar – and two which collect less –Farah and Laghman.
49
months still recording zero revenues. As a result, I decide to aggregate the data to quarterly
frequency for the main empirical analysis, although I do conduct some robustness checks with
the monthly data.
Table 9 describes the effect on key moments of the revenue data, after aggregating to quarterly
averages. The median level and growth rate are little unchanged, but the top and bottom of the
distribution are substantially affected, suggesting that the basic properties of much of the data
are little changed, as one would hope.
Original Data Outliers Removed DifferenceLevel (10m Afghanis)
Mean 33.5 33.9 0.4Median 3.6 3.7 0.11st percentile -4.7 0.1 4.85th percentile 0.0 0.3 0.2
10th percentile 0.3 0.5 0.225th percentile 1.1 1.3 0.275th percentile 15.3 15.8 0.590th percentile 100.0 106.2 6.195th percentile 219.5 217.3 -2.299th percentile 423.3 406.5 -16.8Med. within-province coeff. of variation 0.85 0.73 -0.12Med. across-province coeff. of variation 2.13 2.13 0.00Number weakly negative 59 0 -59
Growth rateMean 0.07 0.05 -0.03Median 0.05 0.04 -0.011st percentile -1.74 -1.03 0.715th percentile -0.54 -0.51 0.03
10th percentile -0.36 -0.32 0.0325th percentile -0.12 -0.12 0.0075th percentile 0.22 0.20 -0.0390th percentile 0.56 0.45 -0.1195th percentile 0.85 0.65 -0.2099th percentile 1.84 1.17 -0.67Med. within-province std. dev. 0.50 0.33 -0.17Med. across-province std. dev. 0.40 0.31 -0.09Number NAs 63 0 -63
Table 9: Revenue data with and without outliers, 1632 quarterly observations. Aggregated from4828 monthly observations, of which 123 classified as outliers. Growth rates computed as logdifference of Revenues + 1.
50
0e+00
2e+08
4e+08
6e+08
2005 2010 2015
Rev
enue
s
(a) Farah (4 outliers detected)
−2.5e+09
0.0e+00
2.5e+09
5.0e+09
2005 2010 2015
Rev
enue
s
(b) Herat (4 outliers detected)
−2e+08
0e+00
2e+08
2005 2010 2015
Rev
enue
s
(c) Laghman (7 outliers detected)
−2e+09
0e+00
2e+09
4e+09
2005 2010 2015
Rev
enue
s
(d) Nangarhar (5 outliers detected)
Figure 13: Raw revenue data for selected Provinces.
51
0e+00
2e+08
4e+08
6e+08
2005 2010 2015
Rev
enue
s
(a) Farah
0e+00
1e+09
2e+09
2005 2010 2015
Rev
enue
s
(b) Herat
0e+00
1e+07
2e+07
3e+07
2005 2010 2015
Rev
enue
s
(c) Laghman
0e+00
1e+09
2e+09
2005 2010 2015
Rev
enue
s
(d) Nangarhar
Figure 14: Cleaned revenue data for selected Provinces.
52
B Robustness of the main estimates
B.1 Impulse responses for cross-province spillovers
Figures 15, 16, and 17 show estimated impulse responses for the spillover measures in equation (5).
As noted in the main text, the impulse responses are almost identical for the headline estimate
with and without controls for neighboring province violence. There is weak evidence of direct
positive spillovers from neighboring provinces, but neither the magnitude nor the significance
of this effect are robust across specification. The effect of neighboring province violence on the
marginal impact of local violence is similarly inconclusive.
B.2 Influential observations
To verify that particularly influential observations are not driving the results, I recalculate the
OLS estimates excluding the data for Helmand and Kandahar provinces. These two neighboring
provinces experience the most conflict fatalities during the sample, accounting for 28 percent
of all conflict deaths during this time, despite being home to only 10 percent of the national
population. And in terms of per capita fatalities, Helmand by some distance the most violent
province (several other smaller provinces have similar fatality rates to Kandahar). They are also
economically important provinces, contributing 7 percent of provincially collected revenues.
The resulting estimates results are shown in Table 10. Omitting Kandahar and Helmand
has little effect compared to the full-sample results in Table 1. The comparison between early
and late-sample effects is clearest in the short-lagged specifications. This is because a larger
fraction of the observations are omitted in from the half-samples when estimating the longer-
lagged specifications. It is apparent from these that the early sample more clearly and reliably
identifies the effect of violence on provincial revenues. This is not entirely surprising, given the
intense and regionally-concentrated outbreak of violence in the Southern provinces during 2008-
10. Nevertheless, the point estimates for the later part of the sample are similar (especially for
specifications (2) and (3)), albeit with larger standard errors.
B.3 Alternate time aggregation
In this section, I investigate the sensitivity of the headline results to changing the frequency at
which the months data are aggregated. If the data is reported with negatively serially correlated
errors – as might happen if annual or quarterly totals are accurate but allocation to months is
inaccurate due to bureaucratic mistakes or other interruptions (see data cleaning, above) – then
aggregation to higher frequencies will produce more precise estimates of regression coefficients as
53
0 2 4 6 8 10
−1.
0−
0.8
−0.
6−
0.4
−0.
20.
0
Horizon, h
Rep
onse
coe
ffici
ent,
β h
● ●
●●
●
●●
●
●
● ●●
●●
●
●
No controls for violence in neighboring province +/− 2 standard errorsControlling for violence in neighboring provinces +/− 2 standard errors
Figure 15: Period revenue loss in response to provincial violence controlling for neighboring-province violence, log quarterly revenue
0 2 4 6 8 10
−2
−1
01
2
Horizon, h
Rep
onse
coe
ffici
ent,
γ h
● ●●
●● ●
●
●● ● ●
● ●
●●
●
Headline, specification (6) +/− 2 standard errorsAlternate, specification (5) +/− 2 standard errorsAlternate, specification (3) +/− 2 standard errors
Figure 16: Period revenue loss in response to neighboring-province violence controlling for own-province violence, log quarterly revenue
54
0 2 4 6 8 10
−6
−4
−2
02
46
Horizon, h
Rep
onse
coe
ffici
ent,
γ h
● ● ●
●●
●
●
●
● ● ●
● ●
● ●
●
Headline, specification (6) +/− 2 standard errorsAlternate, specification (5) +/− 2 standard errorsAlternate, specification (3) +/− 2 standard errors
Figure 17: Cumulative revenue loss, local and neighboring conflict violence interacted, log quar-terly revenue.
the reduction in noise due to period averaging offsets the decrease in the number of observations.
Indeed, this is the motivation for aggregation to quarterly frequency in the main text.
Table 11 reports summary estimates of the impact of conflict on revenue collection, scaled to
match the quarterly frequency measures of the headline measures. The difference between annual
and quarterly estimates is negligible, with magnitude, timing, and precision of the estimates very
similar. This should provide some confidence that the annual estimates are not badly compromised
by Nickell bias, even though they come from a relatively short, wide panel. The monthly estimates,
however, are more volatile and in general less precise, although in many cases the point estimates
are not far from the equivalent quarterly and annual observations. Standard errors are on the
monthly regression are particularly large for the simpler specifications, implying that the increased
noise is not offset by the extra observations.
For the annual data, I can also add an important extra covariate – provincial nighttime lights.
Nightlights data are available over the sample at only an annual frequency – monthly data start in
April 2012 – and serve as an additional control for local activity.36 This is important for the results
of Section 4.3, as nightlights arguably represent a more direct set of controls for local economic
36. Night lights are a commonly used as a proxy for local economic activity, particularly in data-poor countries.See Henderson et al. (2018) for a recent example.
55
activity than the other covariates. Figures 18 and 19 show the effect of including local nightlights
as controls for economic activity, and as post-shock controls. In both cases the point estimates
are smaller, but always statistically significant at some point and not statistically significantly
different from the estimates without local nightlight controls.
C Language, ethnicity, and violence in Afghanistan
In section 6 I use share of population speaking Pahto as their first language as instrument in the
cross-section. Here I motivate and justify the use of this instrument in more detail.
Pashto is the language of the Pashtuns, the largest ethnic group in Afghanistan. UNFPA
and CSO (2007) estimate that it is the first language of around half the population. Pashtuns
live principally in the southern part of Afghanistan, as well as in northern Pakistan. While the
links between Pashtun nationalism and the Taliban are complex, the Taliban undoubtedly was
and continues to be an overwhelmingly Pashtun-dominated organization. For example, Johnson
and Mason (2007) list the eleven most senior Taliban figures during the Taliban government, and
identifies them all as Pashtuns. And Esposito (2004) cites Pashtun tribal traditions as a key
influence on Taliban governance. In contrast, in the north of Afghnistan, where the population
predominantly speaks Dari (a dialect of Persian), Taliban influence has always been much weaker.
Even when the Taliban operated a government out of Kabul during 1996-01, the their control did
not extend to the North, where an alliance of principally non-Pashtun forces held sway.37
Given the overwhelmingly Pashtun composition of the Taliban, and given the Taliban’s key role
in the post-2005 insurgency, it is therefore not entirely surprising that the spatial distribution
of conflict violence throughout the sample mirrors Afghanistan’s ethno-linguistic composition.
Figure 20 presents the cross-sectional correlation of provincial pre-sample Pashto-speaking share
for the whole sample. The correlation is very strong.38 It is also a statistically significant predictor;
the F-statistic in the simple bivariate regression in Figure 20a is 24.
The ethnic distribution of Afghanistan is also stable over time, allaying concerns that 2003-05
data are merely a one-off snapshot. To this end, Figure 22 presents the distribution of primary
37. Indeed, Symon (2001) reports that the Northern Alliance even retained Afghanistan’s seat at the UN, as wellas several dozen embassies during the Taliban rule.
38. Berman, Shapiro, and Felter (2011) find a similar relationship in Iraq, where local voting patterns – considereda proxy for Sunni-Shia shares – predict local violence. In cross-country settings, though, other authors have foundthat measures of heterogeneity, rather than the prevelnace of just one ethnic group, are important predictorsof poor outcomes. Cunningham and Weidmann (2010) find that conflict is highest in sub-national units wherethe degree of dominance of the largest local ethno-linguistic group – defined as the share in excess of the nextlargest group – is near 0.5. And Alesina et al. (2003) find that the degree of fractionalization – also a measure ofheterogeneity – of both language and ethnicity predicts lower subsequent growth. In the Afghan case, these areall far inferior predictors of cross-sectional conflict violence than the simple Pashtun share, presumably due to thedegree of Pashtun influence in the major insurgent group.
56
−1 0 1 2 3
−1.
2−
1.0
−0.
8−
0.6
−0.
4−
0.2
0.0
Horizon, h
Rep
onse
coe
ffici
ent,
γ h
●
●
●
●
●
●
Headline +/− 2 standard errorsWith nightlight lags +/− 2 standard errors
Figure 18: Cumulative revenue loss, log annual revenue
−1 0 1 2 3
−1.
5−
1.0
−0.
50.
0
Horizon, h
Rep
onse
coe
ffici
ent,
γ h
●
● ●
●
●
●
With post−shock controls +/− 2 standard errorsWith post−shock nightlight controls +/− 2 standard errors
Figure 19: Cumulative revenue loss, log annual revenue including future controls.
57
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.0
0.5
1.0
1.5
0 25 50 75 100
Pashto−speaking share
Dea
ths
per
1,00
0 pe
r ye
ar
(a) Pashto-speaking share vs. conflict fatalities
25 50 75Pashto−speaking share
(b) Pashto-speaking share 2003-05
0.5 1.0 1.5Average annual deaths per 1,000
(c) Conflict fatalities, full sample
Figure 20: Spatial and cross-sectional variation of: Pashto-speaking share (2003-05) and meanannual conflict fatalities per thousand by province.
58
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
0.0
0.1
0.2
0.3
0.4
0.5
0 25 50 75 100
Pashto−speaking share
Dea
ths
per
10,0
00 p
er y
ear
(ave
. abs
. ann
. cha
nge)
Figure 21: Pashto-speaking share 2003-05 and mean annual absolute change in provincial conflictfatalities 2005-17
59
(1)
(2)
(3)
(4)
(5)
(6)
(7)
Spe
cifi
cati
onde
tail
sC
onlict
inte
nsi
tyla
gs0
00
00
88
Rev
enue
lags
14
88
88
8L
agge
dpri
ce&
wea
ther
00
08
88
8P
rovin
cial
tim
etr
end
No
No
No
No
Yes
No
Yes
Max
imu
mcu
mu
lati
veim
pact
Bas
elin
e,fr
omT
able
2-0
.304
-0.5
93**
*-0
.641
***
-0.6
05**
*-0
.552
***
-0.6
49**
*-0
.500
***
(0.3
37)
(0.2
19)
(0.2
26)
(0.2
17)
(0.1
61)
(0.1
80)
(0.1
11)
Hor
izon
77
77
78
8
Om
itti
ng
two
mos
tvio
lent
pro
vin
ces
-0.3
96-0
.671
***
-0.6
85**
*-0
.606
***
-0.5
17**
*-0
.672
***
-0.4
58**
*(0
.188
)(0
.134
)(0
.141
)(0
.187
)(0
.145
)(0
.235
)(0
.109
)H
oriz
on7
77
77
77
Ear
lysa
mple
:20
05-2
010
-0.5
60**
*-0
.527
***
-0.5
20**
*-0
.418
**0.
279*
-0.5
46**
-0.3
11**
*(0
.188
)(0
.134
)(0
.141
)(0
.187
)(0
.145
)(0
.235
)(0
.109
)H
oriz
on7
77
810
87
Lat
esa
mple
:20
11-2
017
-0.4
72-0
.715
**-0
.534
**-0
.605
***
-0.1
40*
-0.5
77**
*-0
.185
**lo
cal
pro
ject
ion
(0.3
37)
(0.2
92)
(0.2
08)
(0.2
23)
(0.0
80)
(0.2
14)
(0.0
78)
Hor
izon
79
99
39
10
Not
e:∗ p<
0.1,∗∗p<
0.05
,∗∗∗ p<
0.01
All
spec
ifica
tion
sin
clude
per
iod
and
pro
vin
cefixed
effec
ts.
Tw
o-w
aycl
ust
ered
stan
dar
der
rors
inpar
enth
eses
.Sig
nifi
cance
calc
ula
ted
usi
ng
two-
sided
test
sth
rough
out.
Tab
le10
:Sum
mar
yim
pac
tlo
cal
pro
ject
ion
esti
mat
es,
omit
ting
influen
tial
obse
rvat
ions.
60
(1)
(2)
(3)
(4)
(5)
(6)
(7)
Spe
cifi
cati
onde
tail
sC
onlict
inte
nsi
tyla
gs(q
uar
ters
)0
00
00
88
Rev
enue
lags
(quar
ters
)1
48
88
88
Lag
ged
pri
ce&
wea
ther
(quar
ters
)0
00
88
88
Pro
vin
cial
tim
etr
end
No
No
No
No
Yes
No
Yes
Max
imu
mcu
mu
lati
veim
pact
Annual
freq
uen
cy-0
.594
***
-0.6
99**
*-0
.628
***
-0.4
62**
*-0
.731
***
-0.4
49**
*(0
.177
)(0
.182
)(0
.189
)(0
.153
)(0
.185
)(0
.146
)H
oriz
on(q
uar
ters
)8
88
88
88
Quar
terl
y,fr
omT
able
2-0
.304
-0.5
93**
*-0
.641
***
-0.6
05**
*-0
.552
***
-0.6
49**
*-0
.500
***
(0.3
37)
(0.2
19)
(0.2
26)
(0.2
17)
(0.1
61)
(0.1
80)
(0.1
11)
Hor
izon
77
77
78
8
Mon
thly
freq
uen
cy-0
.455
-0.5
55-0
.576
*-0
.465
-0.7
43**
*-0
.747
***
-0.8
22**
*(0
.492
)(0
.436
)(0
.344
)(0
.314
)(0
.130
)(0
.216
)(0
.195
)H
oriz
on(q
uar
ters
)10
.711
.711
.711
.011
.711
.711
.7
Not
e:∗ p<
0.1,∗∗p<
0.05
,∗∗∗ p<
0.01
All
spec
ifica
tion
sin
clude
per
iod
and
pro
vin
cefixed
effec
ts.
Tw
o-w
aycl
ust
ered
stan
dar
der
rors
inpar
enth
eses
.Sig
nifi
cance
calc
ula
ted
usi
ng
two-
sided
test
sth
rough
out.
Est
imat
esar
ecu
mula
tive
im-
pac
tof
aunit
incr
ease
inquar
terl
ypro
vin
cial
conflic
tdea
ths
per
1,00
0,i.e.
equiv
alen
tto
quar
terl
yag
grge
gati
on.
Lag
stru
cture
sfo
ran
nual
and
mon
thly
regr
essi
ons
are
one-
quar
ter
and
thre
eti
mes
aslo
ng
asth
equar
terl
yca
sere
spec
tive
ly.
Anual
spec
ifica
tion
s(1
)is
omit
ted
asit
cannot
mat
cha
one-
quar
ter
lag.
Tab
le11
:Sum
mar
yim
pac
tlo
cal
pro
ject
ion
esti
mat
es,
alte
rnat
eag
greg
atio
nfr
equen
cies
.
61
ethno-linguistic group in Atlas Narodov Mira (1964), a Soviet ethnographic atlas,39 which shows
that the spatial concentration of Pashtuns has changed little in the intervening half-century.
The preceding evidence suggests that the instrument is likely to be valid ex ante. Exogeneity
relies on a weight of evidence. First, as discussed already, there is a plausible causal story linking
ethnicity and language to a violent insurgent group. Second, that a wide variety of covariates
(including first differences, akin to provincial fixed effects in the level) limit the scope for ethnicity
to have a direct impact on revenues. And third, that the Taliban have historically used violence
as their principal tactic for effecting change, further mitigating the likelihood hat there is a direct
impact of the instrument on the output.40 Relevance must be tested for empirically, which I do
in the first stage regressions in Section 6).
D Estimating impulse responses via Generalized Synthetic
Control
To allow for a more robust treatment of omitted variables in a dynamic setting, the canonical
generalized synthetic control method must be extended. Here, I show how the method can be
adapted to deal with the horizon h of the local projection approach.
Assume the data generating process is now:
yi,t = βxi,t + ρyi,t−1 + µ′iλt + ei,t (11)
Where E(ei,t+k|xi,t) = 0 ∀ k > 0. This is equation (1) with h = 0, N = 0,M = 1 and a more
general process for the omitted variables. The short lags are for notational simplicity only, the
extension to more complicated dynamical systems with large N,M is straightforward.
Then
yi,t+h =h∑k=0
βρkxi,t+k + ρhyi,t−1 +h∑k=0
µ′iλt+k +h∑k=0
ei,t+k (12)
The main identifying assumptions for OLS local projection is that 1) the factors are mean
zero conditional on controls, and 2) future shocks are mean zero conditional on the controls and
39. This atlas is the primary source for ethnographic information in many studies of the role of ethnicity in conflictand development, including Alesina et al. (2003), Cunningham and Weidmann (2010), and Gehring, Langlotz, andStefan (2018). In comparison to this data, which allows local coding based only on ordinal rankings of ethnicgroups (sole, majority, minority, etc.), the language share provides a much more finely graded picture of ethnicity.
40. This is most obvious in contrast with other violent organizations, who often also pursue goals with non-violentacts as well. For example, many militant organizations throughout the Middle East substitute for the state inprovision of services such as a court system or healthcare. These activities were neglected by the Taliban evenduring the late 1990s when they were the titular government, let alone as insurgents.
62
Pashtun coding Sole group Primary group Secondary group Not recorded
Figure 22: Atlas Narodov Mira (1964) ethnographic classification
63
lags of xi,t. That is, there exists some vector zit and number N such that ∀ k > 0:
E (λt+k|zit) = 0 (13)
E(xt+k|zit, {xi,t−j}Nj=0
)= 0 (14)
In the headline OLS local projection estimates, zit is lags of violence, revenue, local prices,
and meteorological conditions. Then equations (13) an (14) imply that:
E(yi,t+h|{xi,t−j}Nj=0, yi,t−1, zi,t) =h∑k=0
βρkE(xi,t+k|{xi,t−j}Nj=0, yi,t−1, zi,t) + ρhE(yi,t−1|{xi,t−j}Nj=0, yi,t−1, zi,t)
+h∑k=0
µ′iE(λt+k|{xi,t−j}Nj=0, yi,t−1, zi,t)
+ E(h∑k=0
ei,t+k|{xi,t−j}Nj=0, yi,t−1, zi,t)
= βρhxi,t + ρhyi,t−1
As ordinary least squares is a consistent estimator of the conditional mean, regression of yi,t+h on
({xi,t−j}Nj=0, yi,t−1, zi,t) will recover an impulse response bh which is a consistent estimator for the
marginal impact of xi,t on yi,t+h, which in this example is βρh.41
The generalized synthetic control approach applies when we have no such zi,t. Instead, the
central assumption replacing equations (13) is that there exist weights wij and an integer N such
that:
E [µ′iλt + eit|xit, yi,t−1] =∑j 6=i
wij(µ′iλt + eit) (15)
E(xi,t+k|{xi,t−j}Nj=0
)= 0 (16)
Equation (15) is the direct analogue of equation (8) in the main text. And equation (16) is a
standard identifying assumption in any local projection setting – that future fluctuations in the
independent variable are mean zero conditional on sufficient lags of itself.42
When h = 0, equations (12) and (15) are just (7) and (12) except with Dit = (xit, yit). Thus,
41. With more involved data generating processes with longer lags of xit and yit the impulse response will bemore complicated functions of the this underlying parameters. Nevertheless, local projection will still recover thisimpact correctly.
42. If the evolution of the independent variable also depends on lags of the dependent variable (as would be truein a VAR), then this assumption becomes E(xi,t+h|{xi,t−j}Nj=0, {yi,t−1}Nj=0). The results below then go throughwith the inclusion of lags of the independent variable as a regressor.
64
Powell’s results follow straight through and we get estimates b0, W0 for the period 0 impulse
response and weighting which are consistent and asymptotically normal.
Given these weights, we can define the synthetic control for a given unit i at time t as:
yi,t =∑i 6=j
w0ijyi,t
We can similarly define xi,t. For h > 0, we start by computing the conditional expectation of
the difference between the outcome and the synthetic control, yt+h − yi,t:
E(yi,t+h − yi,t+h|{xi,t−j}Nj=0, yi,t−1) =
h∑k=0
βρkE(xi,t+k − xi,t+k|{xi,t−j}Nj=0, yi,t−1)
+ ρhE(yi,t−1 − yi,t−1|{xi,t−j}Nj=0, yi,t−1)
+h∑k=0
E
(µ′iλt+k + ei,t+k −
[∑j 6=i
w0ij(µ
′iλt+k + ei,t+k)
]∣∣∣∣∣ {xi,t−j}Nj=0, yi,t−1
)= βρh (xi,t − xi,t+k) + ρh (yi,t−1 − yi,t−1) +
+h∑k=0
E
(E
(µ′iλt+k + ei,t+k −
[∑j 6=i
w0ij(µ
′iλt+k + ei,t+k)
]∣∣∣∣∣ {xi,t+k−j}Nj=0, yi,t+k−1
)∣∣∣∣∣ {xi,t−j}Nj=0, yi,t−1
)
Where the last line follows by the law of iterated expectations. As W 0 is consistent, then as-
sumption (16) implies that the probability limit of the last term is zero. Thus, the regression of
yi,t+h − yi,t on ({xi,t−j − xi,t−j}Nj=0, yi,t−1 − yi,t−1) will recover an impulse response bh which is a
consistent estimator for the marginal impact of xi,t on yi,t+h.
Note that when W is taken as given, the GSC estimator equation (7) is just the ordinary
least squares loss function of exactly this regression. So, we can form a consistent estimator of
the impulse response for h > 0 by solving equation (7) for h = 0 and constraining W = W 0 for
h > 0.
Powell’s bootstrap technique for hypothesis testing is also easily adapted. The model is re-
solved ith the null hypothesis imposed as a constraint (in Table 3, the null is that bh = 0). Under
the null, the gradient of the objective function at this solution is zero. Bootstrapping score by
resampling the estimated weights in a cluster-independent manner generates a distribution of test
statistics against which the sample statistic can be assessed, producing a p-value. Intuitively, this
is equivalent to drawing new units in a way that adjusts for their cross-sectional correlation.
Applying this bootstrap to the dynamic case is straightforward. The model is resolved subject
65
to the null hypothesis constraint, but with the same adjustment as in the unconstrained case: the
weights are estimated only at horizon h = 0 and imposed at h > 0. Then, the distribution of the
Wald statistic is computed as before. This is equivalent to drawing new units for the creation of
the synthetic units only at horizon h = 0.
66