TRANSCRIPT
PAD5700 lecture 11
Page 1 of 20
MASTER OF PUBLIC ADMINISTRATION PROGRAM
PAD 5700 -- Statistics for public management Fall 2013
Program evaluation and index construction
Index of the week
Quality of life
[Table: The Economist's 2005 quality-of-life rankings, reproduced here]
The idea behind indexes is to aggregate a number of indicators into one, mother-of-all
measure. The Economist newsmagazine, for instance, has been doing an annual 'best country to
live in' index, a portion of the results from which I've copied in above. The 2005 survey rated
Ireland #1, followed by Switzerland, Norway, Luxembourg and Sweden. The US came 13th,
trailing Singapore, Spain and Finland, but pipping Canada and New Zealand. The method
used in constructing the index is available online.
They use a steamroller to crack a walnut, and don't explain the development of the index as
clearly as they could. So don't sweat the details. Think big picture: the aggregation of
different indicators into one, overall indication.
Note that the index aggregates numerical scores on nine 'determinants of quality of life':
material well-being, health, political stability and security, family life, community life,
climate and geography, job security, political freedom, and gender equality.
o Each of these is, of course, measured through one, occasionally weak, indicator.
'Family life' is measured by the divorce rate; community life through a dummy variable
based on whether a country has 'a high rate of church attendance or trade-union
membership'; and climate and geography is reduced wholly to latitude.
Note that these ostensibly weren't chosen willy-nilly; they are derived from regression
analyses to identify indicators that "have been shown to be associated with life
satisfaction in many studies" (p. 2).
Note the regression data (bottom of page 2), with its adjusted R², and its test statistics: these
are our 't statistics', telling us how many standard errors the coefficients are from
zero, i.e. how statistically significant they are.
o The analysis also discarded other indicators that might have been considered, including
education, economic growth and inequality.
These nine indicators are then 'weighted', allowing those deemed somewhat
more important than others to have a greater effect on the mother-of-all measure.
The weights are based (somewhat loosely) on beta coefficients of the indicators: i.e.
those shown in regression analysis to have a stronger impact on the dependent
variable are weighted more.
The idea here is to resolve the paradox: 1 is better than 2 on a, but 2 is better than 1 on b. Which
is better overall: 1 or 2? To find this out, combine a and b.
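A minimal sketch of that aggregation (the scores and weights below are invented for illustration):

```python
# Option 1 beats option 2 on indicator a, but loses on indicator b.
scores = {
    "option1": {"a": 9.0, "b": 4.0},
    "option2": {"a": 6.0, "b": 7.0},
}

# Weights express which indicator we deem more important (loosely
# analogous to weighting by beta coefficients, as The Economist does).
weights = {"a": 0.7, "b": 0.3}

def index_score(indicators, weights):
    """Aggregate several indicators into one weighted index."""
    return sum(weights[k] * v for k, v in indicators.items())

for name, indicators in scores.items():
    print(name, round(index_score(indicators, weights), 2))
# option1 comes out ahead overall: 7.5 versus 6.3.
```

Note that with equal weights of 0.5 each, the two options would tie at 6.5: the overall ranking hinges on the weights, which is why index builders agonize over them.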
Perhaps the most prominent index in the world is the Dow Jones Industrial Average, an index of
30 stocks that are meant to provide a snapshot of the state of the US economy (or at least
investment in the US economy). The Dow Jones Company website includes a link to their
method in constructing their indices. Other examples of indices (two of which we’ve used in
this class):
The Human Development Index -- developed by the United Nations Development
Programme.
'Freedom in the World' scores -- developed by Freedom House, an international
nongovernmental organization dedicated to human rights.
The Ratings Percentage Index -- used by the NCAA to rank sports teams from different
conferences.
Program evaluation
There are a few weeks of program evaluation included in this course for the same reason that a
lot of the content is here: this makes a lot of sense, especially given the contemporary pressure
toward performance measurement in both the public and nonprofit sectors. As some examples:
The United Way's Outcome Measurement Resource Network.
The Clinton administration's National Partnership for Reinventing Government.
The Government Performance and Results Act of 1993.
The Office of Management & Budget's Program Assessment Rating Tool.
The Bush administration's President's Management Agenda.
Continued under the Obama administration.
The Obama Administration's Accountable Government Initiative.
The State of Florida Office of Program Policy Analysis & Government Accountability.
The Florida Legislature Government Program Summaries.
The City of Jacksonville's Key Benchmark Indicators.
Types of analysis
By way of introduction, Patton and Sawicki (on whom this lecture draws heavily, partly
because I like the book, partly because Berman & Wang don’t do a good job of this on p.
53-4) also offer the following as a categorization of different types of evaluations. Patton and
Sawicki note that some 100 evaluation techniques have been identified (p. 373), but as often as not
these are wheels that are reinvented and renamed.
Table 1 -- House's evaluation taxonomy (source: Patton and Sawicki, p. 374)

Systems analysis
  Major audiences or reference groups: economists, managers
  Assumes consensus on: goals; known cause and effect; quantified variables
  Method: PPBS, linear programming, planned variation, cost-benefit analysis
  Outcome: efficiency
  Typical questions: Are the expected effects achieved? Can the effects be achieved more economically? What are the most efficient programs?

Behavioral objectives
  Major audiences or reference groups: managers, psychologists
  Assumes consensus on: prespecified objectives; quantified outcome variables
  Method: behavioral objectives, achievement tests
  Outcome: productivity, accountability
  Typical questions: Is the program achieving the objectives? Is the program producing?

Decision-making
  Major audiences or reference groups: decision-makers, especially administrators
  Assumes consensus on: general goals; criteria
  Method: surveys, questionnaires, interviews, natural variation
  Outcome: effectiveness, quality control
  Typical questions: Is the program effective? What parts are effective?

Goal free
  Major audiences or reference groups: consumers
  Assumes consensus on: consequences; criteria
  Method: bias control, logical analysis, modus operandi
  Outcome: consumer choice, social utility
  Typical questions: What are all the effects?

Art criticism
  Major audiences or reference groups: connoisseurs, consumers
  Assumes consensus on: critics; standards
  Method: critical review
  Outcome: improved standards, heightened awareness
  Typical questions: Would a critic approve this program? Is the audience's appreciation increased?

Professional review
  Major audiences or reference groups: professionals, public
  Assumes consensus on: criteria; panel procedures
  Method: review by panel, self-study
  Outcome: professional acceptance
  Typical questions: How would professionals rate the program?

Quasi-legal
  Major audiences or reference groups: jury
  Assumes consensus on: procedures and judges
  Method: quasi-legal procedures
  Outcome: resolution
  Typical questions: What are the arguments for and against the program?

Case study
  Major audiences or reference groups: clients, practitioners
  Assumes consensus on: negotiations; activities
  Method: case studies, interviews, observations
  Outcome: understanding diversity
  Typical questions: What does the program look like to different people?
Principles of analysis
Patton and Sawicki offer a number of 'principles' of analysis, which I'll type in below:
Determine the focus of the evaluation
Try to become involved as early as possible
Decide what data will be produced
Determine what change will be measured
Identify what policy action or intervention is being evaluated
Use multiple methods of measurement
Design the evaluation so it can respond to program modifications
Design the evaluation to provide in-course as well as final evaluations
Involve program staff in the evaluation
Recognize the politics of evaluation
Make your preliminary findings available
Give a clear presentation
(source: Patton and Sawicki, p. 388-93)
Patton and Sawicki go on to note a variety of types of program evaluation:
Before-and-after comparisons -- in the event of the implementation of new policy, in a perfect
world pre-implementation data would be gathered to use as a benchmark. As will be discussed, a
fundamental problem with the social sciences is the difficulty involved in controlling for other
variables. As an example, in his 2001 State of the Union address, the President noted that "rising
energy prices" were a policy issue the administration had to address; while in a later speech his
budget director Mitch Daniels noted that
“The report we've issued this morning confirms that the nation has entered an era of solid
surpluses. Surpluses on the order of $160 billion, despite an economy that has been weak
now for over a year and in decline for that time. This is the second largest surplus in
American history, in the face of that weak economy, a phenomenon that should strike all
Americans as very positive.
“The 10-year forecast that we have projected is $1 trillion, again an astonishing number,
vastly more than the amount of publicly held debt that it will be possible to repay over that
time period. And this number reflects new commitments since the April budget of $198
billion in the first installment of the President's program to repair and rebuild our national
defenses, and also a revised increased estimate for Medicare reform including prescription
drug coverage for our senior citizens, up from $153 billion to $190 billion.”
In both cases, the administration was proven spectacularly wrong, and this is easy to
determine simply by comparing the state of affairs on these two variables (energy prices and the
federal debt): energy prices are well above what they were when the President expressed his
concerns about high prices (from ~$25 a barrel in 2001, to as high as $100 a barrel since), while
Mitch Daniels presided over one of the most extraordinary budgetary meltdowns in US history,
from record surpluses to then record deficits (from a $200b surplus in 2000, to a peak ~$400b
deficit in 2004, not including the over $1 trillion deficit from Bush's 2009 budget). Can Daniels
and Bush be blamed for missing these benchmarks? They, of course, legitimately cite other
factors out of their control that account for a lot of these problems (if by no means all, and by no
means most, especially in terms of the budget!). So the before and after evaluation of
administration policy, on energy prices and on public finance, becomes muddied by these other
factors.
With-and-without comparisons
Similarly, a control group among whom the program was not implemented can be adopted as a
contrast. Florida implements a program, and monitors the same underlying phenomenon that the
program is meant to address in Georgia, which didn't implement the new program. In this way it
is hoped that the Georgia case can help control for other factors, beyond the new program, that
might affect the underlying phenomenon. So improvement in Florida isn't enough to determine
program success; improvement in Florida relative to Georgia must be achieved. Take the US
federal deficit: perhaps the Bush administration was only whacked by global forces, so if we
compare the US to Canada we can get some sense of this. Nope, them thar' socialist Canadians
ran budget surpluses through most of the 2000s.
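The with-and-without logic amounts to a simple difference-in-differences calculation, sketched here with invented numbers:

```python
# Outcome measured before and after in the program state (Florida)
# and in a comparison state without the program (Georgia).
florida = {"before": 50.0, "after": 58.0}  # improved by 8
georgia = {"before": 50.0, "after": 55.0}  # improved by 5, with no program

def diff_in_diff(treated, control):
    """Credit the program only with improvement beyond the control's."""
    return ((treated["after"] - treated["before"])
            - (control["after"] - control["before"]))

print(diff_in_diff(florida, georgia))  # 3.0
```

Only the 3 points of improvement beyond Georgia's are attributed to the program; the other 5 points presumably reflect factors common to both states.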
Actual-versus-planned performance targets
Establish goals for the program, and assess performance relative to these. This is, of course, also
subject to fiddling, simply by setting easy goals. Years ago, when airline arrival times began to
be monitored, most airlines quickly improved their on-time rates, simply by padding projected
travel times. Similarly, the Bush administration crowed about having brought its 2004 budget in
at a deficit of only 3.6% of GDP, well below the projected 4.5%. Yet when this budget was
proposed, many noted the ruse in the overly pessimistic projections. The administration was also
active in calming citizens about the size of the federal deficits. In the President's 2006 Budget
Message, for instance, he argues (with accompanying graphic):
"The Budget forecasts that the deficit will
continue to decline as a percentage of GDP. In
2005, we project a deficit of 3.5 percent of
GDP, or $427 billion. And if we maintain the
policies of economic growth and spending
restraint reflected in this Budget, in 2006 and
each of the next four years, the deficit is
expected to decline. By 2009, the deficit is
projected to be cut by more than half from its
originally estimated 2004 peak—to just 1.5
percent of GDP, which is well below the 40-
year historical average deficit, and lower than
all but seven of the last 25 years."
Note that the 40 year historical average was buoyed considerably by the similarly record-setting
deficits of the Reagan/Bush era. So in effect, the supply-side tax cutting Bush administration
was arguing that its deficits were not that large, when compared to the large deficits of the
previous conservative, supply-side tax cutting administration. Not so comforting!
Experimental models
Experimentation at the societal (not to mention the national) level is all but impossible, but at the
program level it is somewhat less so. This underlies pretty much every medical trial ever
conducted. The logic of experimental models is simple: a group of subjects is recruited, and
divided randomly in half. One half is subjected to the treatment, the other half is not. In medical
trials, subjects typically don't know which of the two groups they are in. In the social
sciences this can be more difficult (i.e.: while the medical trial control group can be given a
'sugar pill' that is indistinguishable from the actual medicine, the halfway-house control group
knows whether or not it is in a halfway house), but certainly not impossible. A successful trial
would see the test group perform better than the control group.
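The random-assignment logic can be sketched as follows; all the data below is simulated, with the treatment assumed to add a constant benefit of 2.0:

```python
import random

random.seed(1)

# Recruit subjects and divide them randomly in half.
subjects = list(range(200))
random.shuffle(subjects)
treatment, control = subjects[:100], subjects[100:]

# Simulated outcome: noisy baseline performance, plus the assumed
# constant benefit of 2.0 for those receiving the treatment.
def outcome(treated):
    return random.gauss(10.0, 3.0) + (2.0 if treated else 0.0)

treated_scores = [outcome(True) for _ in treatment]
control_scores = [outcome(False) for _ in control]

# A successful trial: the test group outperforms the control group,
# and the gap estimates the treatment effect.
effect = (sum(treated_scores) / len(treated_scores)
          - sum(control_scores) / len(control_scores))
print(round(effect, 2))  # should land near the true effect of 2.0
```

Because assignment is random, the two groups are balanced (in expectation) on everything else, observed and unobserved, which is exactly what the Florida/Georgia comparison cannot guarantee.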
Quasi-experimental models
Real world application of an experimental design, "when a true experiment cannot be conducted
-- when we cannot randomly assign persons to treatment and control groups, when we cannot
control the administration of the program or policy or restrict the policy to a treatment group or
when programs are not directed at individuals" (p. 379). The general idea is as for that of
experimental designs, save that the control group is intentionally selected (rather than randomly
assigned) as similar enough to the test group that the control group can serve as a useful
benchmark. Patton and Sawicki also discuss time series designs, which are largely the same
thing save that multiple measurements will be taken across time, rather than a simple pre- and
post-test (O'Sullivan et al note the differences between these in their chapter 3). Patton and
Sawicki note that one needs to be careful with quasi-experimental designs, especially in terms of
being conscious of survey effect (people act differently when they know they are being studied),
external effects (terror mass murders messing up budget projections), and such.
Note that, with the Affordable Care Act being implemented differently in different states, we’ll
see a natural experiment play out over the next few years, as the results come in.
Cost-oriented evaluation approaches
Or cost-benefit analysis. Beyond the obvious, an advantage of this approach lies in that dollars
provide a metric allowing the comparison of programs in two different areas. Beds provided for
homeless people in one program could be compared to cops put on the beat in another. Given
success in each program, which will have greater impact on society? To the extent that one can
'dollarize' the impact of each program, these dollar impacts can then be compared. So one
measures the social impact of each program, divides it by the dollar cost, and gets social
impact per dollar of input.
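A sketch of that comparison, assuming the hard 'dollarizing' work has already been done (all figures below are invented):

```python
# Two programs with dollarized social impacts and dollar costs.
programs = {
    "shelter beds": {"impact": 900_000.0, "cost": 600_000.0},
    "police on the beat": {"impact": 780_000.0, "cost": 600_000.0},
}

def impact_per_dollar(p):
    """Dollarized social impact generated per dollar of input."""
    return p["impact"] / p["cost"]

for name, p in programs.items():
    print(name, impact_per_dollar(p))
# shelter beds: 1.5; police on the beat: 1.3. Beds buy more impact
# per dollar here, even though the two programs are in different areas.
```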
Dollarizing? Of course, 'dollarizing' these different impacts is the tough part. Monitoring dollar
inputs is fairly easy, but comparing the benefits of reduced crime to improvement in the lives of
once homeless people is hard to do quantitatively.
Patton and Sawicki also note a different type of cost-effectiveness comparison, between
programs of the same kind: surely, all else being equal, a program able to provide x beds at y cost is more effective
than a program providing x beds at 2y cost. It isn't as easy, though, to compare 100 beds at y cost
to ten fewer property crimes at 0.8y cost. Returning to the use of dollars as a metric for
comparison between different programs, an interesting recent example of this was provided by a
project known as the Copenhagen Consensus, which sought to work out where global
development dollars would yield the highest return on investment. Calculating dollar benefits v.
dollar costs of such grand projects is certainly controversial, yet consider the alternative: these
policies are implemented depending on the preferences of the rich world, upper-middle class
advocacy groups and government officials at the center of the decision-making process. The
Copenhagen Consensus analysis was led by Bjorn Lomborg, famous for the book The Skeptical
Environmentalist, in which he convincingly skewers a lot of views held by rich world, upper-
middle class, self-styled environmental advocacy groups. It found that greater returns would
come from policies aimed at malaria, HIV/AIDS, and malnutrition, all of which affect
developing world poor folk; rather than combating the perceived risk of global warming, dear to
the hearts of the rich world, upper-middle class self-styled environmental advocacy groups (with
family cottages on low-lying Cape Cod!).
Index construction
Define the concept -- what is it that you are trying to explain? If the analyst is unable to do this,
s/he may not understand what s/he is doing.
Selecting the items -- four concerns:
choose the "right items -- those that represent the dimension of interest and no other,"
"include enough items to distinguish among all the important gradations of a dimension
and to improve reliability,"
"decide whether every item is supposed to represent the entire dimension in which you
are interested or whether each is to represent just [a] part of the dimension,"
"keep costs down by excluding items that provide no extra information" (p. 301).
Combining the items in an index. Options:
'Likert scaling' (p. 308-12). For instance, a Likert scale-based question such as the following
would be a weak way to get a handle on political ideology:
Q. How would you describe your political ideology?
1. strong conservative 2. moderate conservative 3. moderate 4. moderate liberal 5.
strong liberal
This simply captures way too little. Instead, the analyst might develop a series of ten or
twenty questions, each with a consistent Likert scale, that assess this from different angles,
asking for opinions on a range of issues. By aggregating these results, a fuller, multi-faceted
indicator is achieved.
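The aggregation step can be sketched as follows (the ten responses below are invented):

```python
# One respondent's answers to ten ideology items, each on the same
# 1 (strong conservative) to 5 (strong liberal) Likert scale.
responses = [2, 3, 4, 3, 5, 2, 4, 3, 3, 4]

def likert_index(items):
    """Average several same-scale Likert items into one index score."""
    return sum(items) / len(items)

print(likert_index(responses))  # 3.3, slightly left of the 3.0 midpoint
```

The averaged score smooths over any single oddly-worded question, which is the reliability gain from using many items.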
Examples:
The dreaded Keirsey temperament sorter
The equally dreaded Myers Briggs.
Right-Left Brain test (this should be fun!). My result: left brain 70%, right brain 46%.
Facebook has innumerable tools of this sort. One that I particularly remember is the "Which
Rhode Island town are you?" After an hour or so of retaking the test, I managed to figure out
what answers I needed to give to be a Block Islander.
Transformations. Especially to standardize (make easier to compare) variables constructed in
different units.
o The approach taken for this by O'Sullivan et al (in pages 305-7) revolves largely around
deriving a relative score for each indicator, based either on the range, or on the standard
deviation, for that indicator. Example 10.3 provides an example of standardization based on
the range.
As an example of the importance of this, consider life expectancy and GDP per capita (at
purchasing power parity, a more comparable figure).
The lowest average life expectancy of any country in the world is Zambia, at 40.5
years (according to the Global Government data base we have been using). The
highest is Japan, at 82.3.
The lowest GDP per capita reported by the same source is $667, in Malawi; highest is
Luxembourg at $60,228.
Zambia's GDP per capita was $1023, Japan's $31,267.
If you were to simply add up life expectancy and income, Zambia would get a score
of 1063.5 (40.5 years + $1023), while Japan's score would be 31,349.3 (82.3 years +
$31,267).
The point is that income exerts far, far more influence on the index than
does life expectancy.
One solution: scale each of these variables as a proportion of the highest score;
the lowest life expectancy score would be 0.492 (40.5/82.3).
The lowest income score would be 0.011 (667/60,228).
As a result the income indicator will still exert more influence on the overall
index, as there is more variation on this one.
The better solution: scale through the range of the variable.
So for life expectancy, Zambia gets zero, Japan 100 points, the rest score
according to their relative placement within the range between Zambia and Japan.
So the United States would score 89.5. With a life expectancy of 77.9, this
score is determined by the formula (77.9-40.5)/(82.3-40.5) = 0.895, scaled up to 89.5.
Oddities can still occur. Luxembourg's income per capita was nearly $20,000
higher than the second highest, and probably reflects Luxembourg's status as
an international (or European) center of government and finance. So there
would be one country with a score of 100 for income, and #2 would come in
at around 60. For life expectancy, dozens of countries would have scores in
the 90s, over-emphasizing this variable.
The United Nations Development Program does this with its Human
Development Index which, like The Economist's quality-of-life index above, tries
to work out which countries outperform others in terms of 'human
development'. The UNDP uses life expectancy, education and income as the
three components of its index.
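The range-based rescaling can be sketched with the figures used above:

```python
def scale_by_range(x, lo, hi):
    """Rescale x onto 0-100 within the observed range [lo, hi]."""
    return 100.0 * (x - lo) / (hi - lo)

# Life expectancy: Zambia 40.5 (lowest), Japan 82.3 (highest).
print(scale_by_range(40.5, 40.5, 82.3))            # 0.0 for Zambia
print(scale_by_range(82.3, 40.5, 82.3))            # 100.0 for Japan
print(round(scale_by_range(77.9, 40.5, 82.3), 1))  # 89.5 for the US

# GDP per capita: Malawi $667 (lowest), Luxembourg $60,228 (highest).
print(round(scale_by_range(31267, 667, 60228), 1))  # Japan lands near 51.4
```

Note how Japan tops one component at 100 but sits around 51 on the other, which is the outlier problem (Luxembourg's income) described above.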
Weighting the separate items -- emphasize variables believed to be more important. In The
Economist's quality-of-life ranking, for instance, is the divorce rate as important as
income? O'Sullivan et al give a good example of this in the Uniform Crime Index. As they note,
the problem with this aggregate crime figure is that it, well, aggregates different types of crime,
producing a total number of crimes reported. This is worth knowing, but as an overall indicator
of the social impact of criminal activity, it fails to distinguish between violent crime and non-violent
crime. A reduction in the number of murders that is offset by an equal increase in
marijuana possession violations will result in no change in the UCI, yet surely society is better
off (or perhaps less worse off) in the latter case?
For the record, the 2001 crime rate in the US was 4160.5 per 100,000 inhabitants.
This 4160.5 included (add 'em up) 5.6 murders (per 100,000 inhabitants), 31.8 rapes, 148.5
robberies, 318.5 aggravated assaults, 740.8 burglaries, 2484.6 larceny thefts, and 430.6 motor
vehicle thefts. Is a one-point increase in the murder rate as important as a one-point increase in
the larceny theft rate? Should larceny thefts influence the crime index 500 times more than
the murder rate?
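One way to answer that question is to weight the components before summing. The 2001 rates below come from the text; the severity weights are entirely invented for illustration:

```python
# 2001 US crime rates per 100,000 inhabitants, from the text above.
rates = {
    "murder": 5.6, "rape": 31.8, "robbery": 148.5,
    "aggravated assault": 318.5, "burglary": 740.8,
    "larceny theft": 2484.6, "motor vehicle theft": 430.6,
}

# Unweighted, the index is just the sum, so larceny theft dominates it.
unweighted = sum(rates.values())

# Hypothetical severity weights (invented): a one-point rise in the
# murder rate now moves the index 100 times more than a one-point
# rise in the larceny theft rate.
weights = {
    "murder": 100.0, "rape": 50.0, "robbery": 10.0,
    "aggravated assault": 10.0, "burglary": 3.0,
    "larceny theft": 1.0, "motor vehicle theft": 3.0,
}
weighted = sum(weights[k] * rates[k] for k in rates)
print(round(unweighted, 1), round(weighted, 1))
```

The weighted figure is no longer interpretable as "crimes per 100,000", but changes in it track society's presumed valuation of different crimes rather than raw counts.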
Not surprisingly, when I checked the World Almanac 2007 to update those figures, I got this
note in place of a more recent UCI: "The use of the Crime Index in the Uniform Crime Reports
Program was discontinued beginning with the report of 2003 data" (p. 117).
However that raw data still gets reported. 2007, for instance, saw 3.1 murders, 30 rapes, 284
aggravated assaults, 148 robberies, 723 burglaries, 2178 larcenies/thefts, and 363 motor
vehicle thefts, for a total of about 3729 crimes per 100,000 population, a dramatic drop that policy
wonks are still arguing about.
An illustration of constructing indices
We'll do an example for categorical data. In the Appendix (pages ) you’ll see an example for
continuous variables. In the factor analysis example below, we'll construct an index for a
continuous variable.
The Belle County survey provides a number of indicators regarding citizen satisfaction. But
what to make of the results? Assume you want the results broken down by whether a respondent
lives in the city limits or not. A table of key responses for ‘Overall county service value rating’,
for instance, would look like this:
Analyze, Descriptive, Crosstabs
Rows = Residence in city limits; Column = Overall county service value rating
Click ‘cells’: Check ‘Row’ under Percentages
Okay
Table 2 -- Residence in city limits * Overall county service value rating Crosstabulation
(% within Residence in city limits)

            Very poor   Somewhat     Fair     Good     Excellent
            value       poor value   value    value    value       Total
Inside      4.5%        10.9%        46.0%    33.5%    5.1%        100.0%
Outside     5.6%        14.9%        38.5%    36.0%    5.0%        100.0%
Total       4.9%        12.2%        43.5%    34.4%    5.1%        100.0%
But is this single indicator enough? This is where an index can help aggregate the results. This
can be done a couple of ways, depending on your unit of analysis. A simple one is as follows:
You can construct an index for each individual respondent. The process will first require
creating a new variable, and then using the Transform function.
Go to Variable View (bottom left of the spreadsheet), and add another variable, call it
'index', leave it numeric, and Label it 'Index: service value ratings'.
Go back to Data View.
Transform
Compute
Target Variable will be 'index' (you have to type it in)
The Numeric Expression will be:
(valserv + valaid + valsch + valenv) / 4
This aggregates the responses from each respondent. These can then be added up using Case
summaries, just to use this again to construct a table (Analyze; Reports, Case Summaries;
Residence in city limits in 'Grouping Variable'; Index as 'variable', click off the checkmark in
Display cases, click on Statistics and choose Mean), click Continue, click Okay. You get this:
Table 3 -- Case Summaries
Index: service value ratings, mean by residence in city limits

Inside    3.2922
Outside   3.3347
Total     3.3065
As can be seen, the two hardly differ. Needless to say, though, we don't trust such
impressions; instead we do a hypothesis test. An independent samples test gives us a mean score of
3.29 for those residing in the city, and 3.33 for those outside, with p = .542. But hey: now you
know!
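For the curious, the same pipeline (compute each respondent's index, summarize by group, test the difference) can be sketched outside SPSS. The variable names (valserv, valaid, valsch, valenv) come from the survey, but the handful of respondents below is invented, so the numbers won't match the tables above:

```python
import math

# Invented mini-sample; the real survey has many more respondents.
respondents = [
    {"city": "Inside",  "valserv": 3, "valaid": 4, "valsch": 3, "valenv": 4},
    {"city": "Inside",  "valserv": 2, "valaid": 3, "valsch": 3, "valenv": 3},
    {"city": "Inside",  "valserv": 3, "valaid": 3, "valsch": 4, "valenv": 3},
    {"city": "Outside", "valserv": 4, "valaid": 4, "valsch": 3, "valenv": 4},
    {"city": "Outside", "valserv": 3, "valaid": 4, "valsch": 4, "valenv": 3},
    {"city": "Outside", "valserv": 3, "valaid": 3, "valsch": 3, "valenv": 4},
]

# Equivalent of Transform > Compute: index = (valserv+valaid+valsch+valenv)/4
for r in respondents:
    r["index"] = (r["valserv"] + r["valaid"] + r["valsch"] + r["valenv"]) / 4

# Equivalent of Case Summaries: mean index by residence group.
groups = {}
for r in respondents:
    groups.setdefault(r["city"], []).append(r["index"])
means = {g: sum(v) / len(v) for g, v in groups.items()}

# Welch's t statistic for the independent samples comparison.
def welch_t(x, y):
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

t = welch_t(groups["Inside"], groups["Outside"])
print(means, round(t, 2))
```

With so few (invented) respondents the |t| stays small, echoing the non-significant difference found in the real data.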
III. The program evaluation exercise
Present the analyses as you would in a professional report (20 page maximum, including
graphics). Introduce the analysis, justify the model that you have built, and interpret the
results. Present data in professional table/graphic formats as necessary.
Grading criteria:
Identify the issue/state the question
Demonstrate command of the material
Write professionally
Logical, coherent argument
Well-used tables/graphs
Follow instructions (debits)
For further explanation of these criteria, see the PAD 5700 Assignments page, starting about
halfway down the page.
Choose one of the four programs to evaluate:
Long term care facility
Report on the operation of the Adams long term care facility. The data is based on a survey
instrument of residents of Adams and of three other institutions. Develop
a list of key performance measures for the four facilities, and
identify correlates/determinants of those performance measures.
Write up the results as you would a professional report.
Long term care facilities dataset
County programs evaluation tool
Develop an evaluation tool for, and report on the operation of, public services provided by this
county. The data is based on a survey of residents. Develop
a list of key performance indicators for the various programs provided by the county, and
identify correlates/determinants of those performance measures, for the different
programs.
Write up the results as you would a professional report.
Belle County survey
Community concert program
Develop an evaluation tool to recommend types of music to be offered in a series of six concerts
presented by a city concert hall. The survey below (which is just a modified version of the 1993
General Social Survey) asked 1500 citizens for their preferences, as well as identified a range of
characteristics of these citizens. The community's musical preferences are indicated in the
questions titled: 'Music: ______'. In your recommendation, balance the following criteria:
satisfy as wide a cross section of the community as possible with these six concerts,
appeal to a wealthier clientele likely to be able to afford the concerts, and
recommend identifiable groups of people (no more than two per type of music) to whom
advertising for each concert could be targeted.
The data: Concert.
The UNF PORL Fall Statewide Omnibus Survey
Evaluate the results of the UNF Public Opinion Research Lab's Fall 2013 Omnibus statewide
survey, in terms of determinants of the views expressed. By this I mean: the official press release
from the survey presented political preference results as well as demographic results; in your
analysis, look at the impact of the demographics on the political preference (and other) results.
The data is based on a survey of residents. Do the following:
Identify the key outcomes that you will analyze,
identify important correlates/determinants of those outcomes.
Write up the results as you would a professional report.
UNF-PORL-Fall-State survey
The data
Variable descriptions
Appendix!
Factor analysis (the Quick & Nasty Guide)
After lots of trial and error, I think I finally figured this out. The idea in factor analysis is to look
for correlations between variables. If independent variables are correlated (and correlation
among independent variables violates one of the assumptions of regression analysis), one might
consider combining them into an index. For a good example (by a prominent public administration
scholar), see
James Perry and Leslie Berkes (1979). "Predicting local government strike activity: an
exploratory analysis." Western Political Quarterly 30(4), p. 513-27.
The logic (if not some of the details) of factor analysis is illustrated pretty well in this
article. Essentially, Perry and Berkes wanted to look at the impact, on local government strike
activity, of four broad variables: the macro-environment (social and economic
characteristics), government characteristics, public employment characteristics, and legal
policy. How, though, does one 'operationalize' these variables? They didn't, for instance, want
to do something as naive as offer the divorce rate as an indicator of 'family life', as in the
Economist article above. But which to choose, and how to avoid the 'multi-collinearity' problem
in regression analysis (see O'Sullivan et al, p. 442-5)? What factor analysis can do is help sort
this out.
So Perry and Berkes identified a number of variables that plausibly seemed like they could
operationalize each of their four broad variables. In Table 1 (p. 515), for instance, they identify
ten. These then are chunked into factor analysis, and what the process does is identify groups
among them, or identifies those that are significantly related. As you’ll see, it’s like a
multivariate correlation matrix – rather than just looking at relationships between pairs of
variables (a covaries with b), it looks for clumped groups of tightly correlated variables (a, b, c,
and d all covary). Possible results vary from -1 (perfect negative correlation) to 1 (perfect
correlation), so a high absolute value (closer to 1 or -1) reflects correlation. Analyzing the
results, the authors identify three broad groupings, we'll call them a, b and c for now:
a
  % of state population which is urban
  % of state population below low income level
  state per capita income
  state median family income
  % employed in non-agricultural establishments
  state population density
b
  % non-agricultural union membership
  right to work law
  man days idle (normalized)
c
  % of state population below low income level
  % population black
Note that all have 'loadings' (scores) above 0.50 (though they use 'number of teacher strikes' in
their 'past strike activity' factor in Table C, even though it only has a 'loading' of 0.46). The final
step in this process is simply to provide the 'factors' (the broad groupings of correlated variables)
with spiffier names, given the nature of the 'factors'. So they call the three 'factors'
urbanization/industrialization, private union influence, and race/poverty:
urbanization/industrialization
  % of state population which is urban
  state per capita income
  state median family income
  % employed in non-agricultural establishments
  state population density

private union influence
  % non-agricultural union membership
  right to work law
  man days idle (normalized)

race/poverty
  % of state population below low income level
  % population black
The process is repeated for the other groups of variables that they identify as relevant to their
other three broad variables.
What one does with this is:
Not include together, in further analysis, any two variables that 'load' onto a single 'factor':
they are correlated.
Instead, on the one hand only one of these variables might be selected to operationalize
the 'factor', perhaps the one with the highest 'loading'.
On the other hand, one might construct an index to operationalize the 'factor', using all or
some of the variables in that factor. This can provide a richer, less narrow
operationalization of the factor.
An example of factor analysis -- note: you will not necessarily have to do this for assessment
items!!!
We'll be using the indiana-crimeadjusted dataset for a factor analysis example. Let's look at
determinants of (or factors correlated with, to not make the usual mistake of assuming causality
in statistical analysis) population change in Indiana counties. Looking at our dataset, we have a
number of variables that might plausibly be related to this:
pop -- population: size might matter, large counties might be growing faster than small;
or vice versa.
latino -- percent Latino population
youths -- 18-24 year old population as a percentage of the total
femhspct -- percentage female headed households
gradhs -- high school graduation rate
gradcol -- college graduation rate
inchouse -- median household income
poverty -- poverty rate
crime -- crime rate
jobless -- unemployment rate
manfng -- manufacturing earnings, as a percentage of personal earnings
farm -- farm population, as a percent of the total
As a start, run descriptive statistics for these (and the dependent variable), just so you have a
better idea what you're dealing with. Also, make sure you have the right variables: we need
normalized (percentages or rates) variables, not raw numbers (with the exception of
population). In other words, you need the percentage of female headed households, not the
number of these.
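For those working outside SPSS, here is a minimal sketch of the same descriptive-statistics check in Python with pandas. The variable names match the dataset above, but the values are made up for illustration, not the Indiana data:

```python
import pandas as pd

# Made-up values for three of the dataset's variables (femhspct, poverty,
# jobless are the real variable names; the numbers are illustrative).
df = pd.DataFrame({
    "femhspct": [6.5, 9.2, 16.6, 11.0],   # % female-headed households
    "poverty":  [5.1, 12.4, 8.8, 15.0],   # poverty rate, %
    "jobless":  [3.2, 6.7, 4.4, 5.9],     # unemployment rate, %
})

# describe() gives count, mean, std, min, quartiles and max in one shot,
# much like SPSS's Descriptives output
print(df.describe())

# A quick sanity check that these are normalized rates, not raw counts
assert df.max().max() <= 100
```

The final assertion is exactly the "make sure you have the right variables" check: percentages and rates should top out below 100, while raw counts would not.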
Second, also run a correlation matrix, and a multiple regression, just to have these handy so that
we can compare these results (that we're a bit familiar with) with the results of the factor
analysis. If you're looking at this at home, I'll run the results and copy them into another file
http://www.unf.edu/~g.candler/PAD5700/11-FactorCor.pdf
just so that the results don't clutter up this page (not to mention that the correlation matrix won't
fit on an 8 1/2 x 11 sheet of paper if you print it out).
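The correlation-matrix step can likewise be sketched in Python with numpy. The variable names come from the dataset above, but the data here are synthetic, with income and poverty deliberately constructed to covary:

```python
import numpy as np

# Synthetic stand-ins for three of the dataset's variables
rng = np.random.default_rng(0)
inchouse = rng.normal(45, 8, 92)                          # median household income
poverty = 30 - 0.4 * inchouse + rng.normal(0, 1.5, 92)    # built to move against income
crime = rng.normal(2000, 500, 92)                         # unrelated noise

# Correlation matrix: rows/columns are inchouse, poverty, crime
r = np.corrcoef([inchouse, poverty, crime])

# income and poverty should be strongly negatively correlated;
# income and crime should be close to zero
print(np.round(r, 2))
```

This is the pairwise view (a covaries with b) that factor analysis then generalizes to clumps of variables.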
Returning to the factor analysis exercise, we will take all of our prospective independent
variables and do a factor analysis on them:
Analyze, Dimension Reduction, Factor
Load all independent (not population change) variables as Variables
Leave the Descriptives and Extraction settings at their default positions (respectively:
Initial solution; and Principal components, Correlation matrix, Unrotated factor solution,
Eigenvalues over 1, Maximum iterations for convergence: 25).
For Rotation select Method: Varimax (it seems to be all the rage), Display: Rotated
solution, and Maximum iterations: 25.
For Scores select Save as Variables: Regression.
For Options, leave it as Missing values: Exclude cases listwise.
Click okay.
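If you are working outside SPSS, the same steps can be roughly reproduced in Python. This is a sketch using scikit-learn's FactorAnalysis (which accepts a varimax rotation in recent versions) on synthetic data built to contain two correlated clumps. Note one assumption: scikit-learn fits a maximum-likelihood factor model rather than SPSS's default principal-components extraction, so loadings will not match SPSS exactly:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic data with two latent 'clumps': variables 0-1 share one
# underlying factor, variables 2-3 share another.
rng = np.random.default_rng(42)
n = 300
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])

# Varimax-rotated factor analysis, as in the SPSS Rotation dialogue
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
scores = fa.fit_transform(X)      # per-case factor scores (cf. fac1_1, fac2_1)
loadings = fa.components_.T       # rows = variables, columns = factors

print(np.round(loadings, 2))
```

The dominant loadings should group variables 0-1 onto one factor and 2-3 onto the other: exactly the 'clumping' of correlated variables described above.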
The results (including only the key factor loadings, and looking at the Rotated Component
Matrix, which doesn't differ significantly from the Component Matrix):
Rotated Component Matrix (a)

                                       Component
                                      1      2      3      4
Population                          .161   .819   .119   .136
Percent Latino population           .130   .431   .110   .681
18-24 population, %                -.070   .119   .953   .039
% female headed households         -.326   .846   .043   .173
High school grads, %                .788   .174   .316  -.027
College grads (B.A. or higher), %   .634   .353   .596  -.132
Median household income per capita  .942   .052  -.127  -.023
Poverty, %                         -.849   .275   .284  -.132
Crime rate, serious per 100k pop.  -.006   .644   .019  -.166
Unemployment rate, %               -.716  -.035  -.096  -.126
Manufacturing, % personal earnings  .023  -.216  -.076   .847
Farm population, %                 -.253  -.677  -.292   .079

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
To present these four identified factors (ordering the variables within each by factor
loading), in the format of my presentation of the Perry/Berkes study above, we get this:
a
  income (.942)
  high school graduation (.788)
  college graduation (.634)
  unemployment (-.716)
  poverty (-.849)

b
  female headed households (.846)
  population (.819)
  crime rate (.644)
  farm population (-.677)

c
  youth population (.953)
  college education (.596)

d
  manufacturing (.847)
  Latino population (.681)
The next (largely cosmetic) step is to name the four factors. This is just a matter of looking at
what is going on, and giving them names that are a bit more evocative than a-d. How about:
a -- socio-economic I
b -- socio-economic II
c -- college town
d -- manufacturing
So we now have four factors identified, as follows:
Socio-economic I
  income (median hh)
  high school graduates (%)
  college graduates (%)
  unemployment (%) (negative)
  poverty (%) (negative)

Socio-economic II
  female-headed households (%)
  population
  serious crime (per 100k)
  farm population (%) (negative)

College town
  youth population (%)
  college grads (%)

Manufacturing
  manufacturing earnings (%)
  Latino population (%)
Before moving on: the Socio-economic I and II result is a bit interesting. On the one hand, factor
analysis has identified two distinct 'factors' composed of general socio-economic data. On the
other, at this point I can't distinguish between the two -- what is different between them? -- so am
unable to come up with a designation better than Socio-economic I and Socio-economic II.
Further analysis can clarify what is going on. With these four factor variables, we have created
indices to rate the 92 counties in the state on these four criteria. SPSS creates them
automatically, labeling them fac1_1 to fac4_1 (check the dataset). We can now rank order them
(Data, Sort Cases. In the window: Sort By fac1_1). The results, showing the top and bottom
three in each category:
                Socio-economic I   Socio-economic II   College town   Manufacturing
Top three:      Hamilton           Marion              Monroe         Noble
                Hendricks          Lake                Tippecanoe     Elkhart
                Boone              Allen               Delaware       Lake
Bottom three:   Starke             Pulaski             Morgan         Brown
                Crawford           Warren              Ohio           Pike
                Switzerland        Lagrange            Franklin       Hendricks
These results make the difference between Socio-economic I and Socio-economic II pretty clear:
I is suburban counties, II urban counties. To rename the factor analysis results:
Suburban
  income (median hh)
  high school graduates (%)
  college graduates (%)
  unemployment (%) (negative)
  poverty (%) (negative)

Urban
  female-headed households (%)
  population
  serious crime rate (per 100k)
  farm population (%) (negative)

College town
  youth population (%)
  college grads (%)

Manufacturing
  manufacturing earnings (%)
  Latino population (%)
A way to get a better handle on what factor analysis is doing is to run a correlation
matrix on one of the factors we've identified above, say Socio-economic II (urbanization). We
get this:
[Correlations table (SPSS output) omitted; see the linked output file.]
Note first that each of the six two-way correlations between the four variables in this factor has
a p-value (the 2-tailed significance) of less than .002. As well, look at the Latino population,
which had a 'loading' of .431 on this 'factor' (remember that Perry and Berkes accepted a couple
of loadings that were nearly this low). Look at the p-values of the correlations between this
variable and the others in the factor:
Latino population and Population -- .000
Latino population and Female households -- .001
Latino population and Crime rate -- .106
Latino population and Farm population -- .054
The point here is that the four variables in this factor with loadings of at least 0.500 have
correlations with p-values of less than .002 with each of the other three variables in the factor.
Latino population is quite highly correlated with each of the four variables in this factor, but its
correlations with two of them are not quite as strong: p = .054 with Farm population, and p = .106
with Crime rate. So what factor analysis does is identify groups of variables that are all highly
correlated with one another.
Well bully for factor analysis!
But what's the point? One point is that when specifying models for regression analysis, we can
avoid multi-collinearity and, using index construction, develop richer variables. The regression
model that you ran earlier (and which I've presented on the linked output page mentioned above)
lists thirteen independent variables. We now know that many of these thirteen variables are very
highly correlated with each other: multi-collinearity. So instead, we will construct four new
variables -- the four factors identified in this analysis -- and use these as our independent
variables in regression analysis.
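One way to see that multi-collinearity directly is to compute variance inflation factors (VIF = 1/(1 - R2), where R2 comes from regressing each predictor on the others; a VIF much above 10 is usually read as serious collinearity). Here is a sketch with deliberately collinear synthetic data; the helper function and data are my own, not part of the lecture's SPSS workflow:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.
    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
    on the remaining columns (plus an intercept)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Two nearly-duplicate predictors (think high school and college graduation
# rates moving together) plus one independent predictor.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = a + 0.1 * rng.normal(size=200)      # almost a copy of a
c = rng.normal(size=200)                # unrelated
X = np.column_stack([a, b, c])

print(np.round(vif(X), 1))  # first two VIFs very large, third near 1
```

Replacing the near-duplicate predictors with a single factor-based index is precisely the fix the lecture proposes.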
From factors to variables
An intermediate step is to decide what we want to do with our factor analysis results, or just
to make sure we like what the computer program has done for us. Remember: the computer
works for us, we don't work for the computer.
For instance:
We have four nicely identified phenomena, or variables: Urban, Suburban, College and
Manufacturing.
An obvious missing component is 'ruralness', or farm communities.
o Farm population appears in only one of our factors, as a negative component of
Urbanization. So identifying this farm population variable alone as an
operationalization of farm counties seems reasonable.
We now have five variables: Urban, Suburban, College, Manufacturing, and Farm.
o We can also decide whether to leave farm population in the Urban index as a
negative number, and will do so mostly just to illustrate how to handle these in the
construction of our indices.
A last question we might ask concerns the negatively loaded variables on the Suburban
factor. This factor shows that low unemployment and low poverty are characteristics of
suburban counties. Or put another way: lack of social distress. Partly to avoid more
complicated math with negative numbers, but also because it seems reasonable to treat
'Distress' as a unique characteristic, we will add this to our list of counties and omit these
two characteristics from the Suburban variable.
We now have the following variables identified:
o Urban -- an index made up of the variables
female-headed households
population
serious crime rate
farm population (negative)
o Suburban -- an index made up of the variables
median household income
high school graduates
college graduates
o College -- an index made up of the variables
youth population
college graduates
o Manufacturing -- an index made up of the variables
manufacturing earnings
Latino population
o Farm -- made up of the variable
farm population
o Social Distress -- an index made up of the variables
unemployment
poverty
Variable transformations (or: Constructing indexes II)
You'll need to create five new variables (click on Variable View in the bottom right),
respectively
burbs -- Label it Factor: socio-economic I
urban -- Label it Factor: socio-economic II
college -- Label it Factor: college town
manfg -- Label it Factor: manufacturing
distress -- Label it Factor: social distress
We now have to address some of the standardizing problems raised by O'Sullivan et al. (p. 296-
9). The problem is that the four variables for the Socio-economic II (Urban) factor are expressed
in different units that can't simply be added, unlike the Likert scales in the Belle County
study. Descriptive statistics on the four variables, after all, show us that:
Two are percentages: Female headed households (with a range of about 10 from the highest
to the lowest) and Farm population (with a range of 22).
Population is expressed in terms of thousands (with a range of 855), and
crime rate is the number per 100,000 population (with a range of 9,331).
As a result, we can't simply add them up, as crime rate and especially population will be over-
emphasized. So what we'll do is a simple transformation of each variable into a percentile
score. As an example, for the Population variable, the minimum population will be subtracted
from each county population (county population-minimum population), which will then be
divided by the range for the population (maximum county population - minimum county
population), indicating how far along the range from lowest to highest each county sits. The formula to create this new,
standardized percentile variable would be
(county population-minimum population) / (maximum county population - minimum county population)
Also, variables that are negatively related to the factor need to be turned around, so that each
variable in the factor is pushing in the same direction. In the case of the Urban factor, Farm
population is 'negatively loaded', and so needs to be flipped around. So you don't want to know
how far it is from the minimum in percentage terms, but rather how far it is from the maximum
in percentage terms. The formula for this would be
(maximum farm population - county farm population) / (maximum farm population - minimum farm population)
The logic is that a higher number should indicate more consistency with what the index is trying
to capture. In the case of farm population, the absence of which is an indicator of urbanization, a
high number is opposite to urbanization. So "(maximum farm population - county farm
population)" -- the first part of the equation above -- turns this around. A county with a low farm
population will get higher numbers this way. This then gets divided by the range (maximum
farm population - minimum farm population), as before.
Are we having fun yet! And to think that I gave up a career in concrete for this...
So putting together the percentile formula for each of the four variables in this factor, we get the
following formula for the mean percentile ranking of the four variables in the Socio-economic II,
Urban factor:

(((pop-5.623) / (860.454-5.623)) + ((femhspct-6.5) / (16.6-6.5)) + ((22-farm) / (22-0)) + ((crime-319) / (9650-319))) / 4
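That composite formula translates directly into code. The min/max constants below are the ones given in the formula above; the helper names (scale_up, scale_down) and the sample county are my own, for illustration:

```python
def scale_up(x, lo, hi):
    """Min-max score: 0 at the minimum county, 1 at the maximum."""
    return (x - lo) / (hi - lo)

def scale_down(x, lo, hi):
    """Flipped score for negatively loaded variables: 1 at the minimum."""
    return (hi - x) / (hi - lo)

def urban_index(pop, femhspct, farm, crime):
    """Mean of the four scaled components of the Urban factor.
    The min/max constants are those in the lecture's formula."""
    return (scale_up(pop, 5.623, 860.454)
            + scale_up(femhspct, 6.5, 16.6)
            + scale_down(farm, 0.0, 22.0)     # farm population is flipped
            + scale_up(crime, 319.0, 9650.0)) / 4

# A hypothetical county: mid-sized, some female-headed households,
# little farm population, moderate crime
print(round(urban_index(pop=100.0, femhspct=10.0, farm=2.0, crime=3000.0), 3))
```

Note how scale_down handles the negatively loaded farm population: a county with little farm population scores close to 1, pushing in the same (urban) direction as the other three components.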
So to create a new, Socio-economic II (urbanization) variable, we go to
Transform, Compute
Type urban in as Target Variable
Load as Numeric Expression the formula above
Click Okay, and Change existing variable.
You can do a quick check to see if this looks reasonable by sorting the data by fac2_1, and
looking to see if your new variable seems to be similarly sorted.
Do the same thing for the other four factor variables.
So what?
Well, we can now do a more meaningful regression (and related) analysis of the determinants (or
correlates) of population change. Running a regression analysis of
population change = Suburbanization + Urbanization + College community + Manufacturing +
Social Distress + Farming
gives the results presented below.
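The regression step itself can be sketched outside SPSS with ordinary least squares via numpy. The six index names follow the model above, but the data here are synthetic (built so that suburbanization drives the outcome), so the coefficients are illustrative, not the Indiana results:

```python
import numpy as np

# Synthetic stand-in data: 92 'counties' with six index scores in [0, 1]
# and a population-change outcome driven mostly by suburbanization.
rng = np.random.default_rng(7)
n = 92
suburban = rng.uniform(0, 1, n)
urban = rng.uniform(0, 1, n)
college = rng.uniform(0, 1, n)
manufacturing = rng.uniform(0, 1, n)
distress = rng.uniform(0, 1, n)
farming = rng.uniform(0, 1, n)
pop_change = 2 + 50 * suburban - 20 * urban + rng.normal(0, 5, n)

# Design matrix with an intercept column, as in the SPSS output
X = np.column_stack([np.ones(n), suburban, urban, college,
                     manufacturing, distress, farming])
beta, *_ = np.linalg.lstsq(X, pop_change, rcond=None)

print(np.round(beta, 2))  # intercept, then the six index coefficients
```

With data built this way, the estimated suburbanization coefficient comes out large and positive and the urbanization coefficient negative, mirroring the pattern in the SPSS table below.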
Note that in the earlier regression (on the output link) you had an R squared of .639; now it is
only .415. Further, in the earlier regression the following variables were at least weakly (p ~ .10)
statistically significant:
Latino population (Standardized β = .148, p = .107)
high school grads (Standardized β = -.295, p = .037)
college grads (Standardized β = .343, p = .115)
median per capita household income (Standardized β = .766, p = .003)
unemployment (Standardized β = .189, p = .052)
manufacturing earnings (Standardized β = -.215, p = .020)
Note too that a lot of these variables are highly correlated (income, unemployment, high school
grads and college grads; Latino population and manufacturing earnings). So that whizbang R2 is
the result of our including single social phenomena, like suburbanization, more than once
(through the statistics for income, unemployment, high school grads and college grads). This
factor analysis model should give a more accurate indication of the determinants/correlates of
population growth.
Regression of Population change (1990-2000) on suburbanization, urbanization,
college community, manufacturing, social distress and farming

                        Coefficient        Standardized
                        (standard error)   coefficient    t score   probability
Constant                  2.03  (6.43)                      .316      .753
Suburbanization index    49.05  (8.39)        .755         5.88       .000
Urbanization index      -20.17 (13.81)       -.274        -1.50       .137
'College town' index    -12.72  (8.73)       -.181        -1.46       .149
Manufacturing index      -4.55  (5.25)       -.078        -0.87       .388
Social distress index    10.17  (9.93)        .123         1.024      .309
Farming index            -2.657 (8.021)      -.054         -.331      .741

Adjusted r2 = .323
F (6, 85) = 8.24, p = .000
Note, too, that the analysis makes sense. What the results tell us is that suburbanization is
strongly, and positively, related to population change. 'College town' and 'Urbanization' are
negatively, though not statistically significantly, correlated with population growth.
Manufacturing and social distress are not related to population change.