TRANSCRIPT
PAD5700 lecture 11
Page 1 of 20
MASTER OF PUBLIC ADMINISTRATION PROGRAM
PAD 5700 -- Statistics for public management Fall 2013
Program evaluation and index construction
Index of the week
Quality of life
[Table: The Economist's 2005 quality-of-life rankings, reproduced here]
The idea behind indexes is to aggregate a number of indicators into one, mother-of-all
measure. The Economist newsmagazine, for instance, has been doing an annual 'best country to
live in' index, a portion of the results from which I've copied in above. The 2005 survey rated
Ireland #1, followed by Switzerland, Norway, Luxembourg and Sweden. The US came 13th,
trailing Singapore, Spain and Finland, but pipping Canada and New Zealand. The method
used in constructing the index is available online.
They use a steamroller to crack a walnut, and don't explain the development of the index as
clearly as they could. So don't sweat the details. Think big picture: the aggregation of
different indicators into one, overall indication.
Note that the index aggregates numerical scores on nine 'determinants of quality of life':
material well-being, health, political stability and security, family life, community life,
climate and geography, job security, political freedom, and gender equality.
o Each of these is, of course, measured through one, occasionally weak, indicator.
'Family life' is measured by the divorce rate; community life through a dummy variable
based on whether a country has 'a high rate of church attendance or trade-union
membership'; and climate and geography is reduced wholly to latitude.
Note that these ostensibly weren't chosen willy-nilly; they are derived from regression
analyses to identify indicators that "have been shown to be associated with life
satisfaction in many studies" (p. 2).
Note the regression data (bottom of page 2), with its adjusted R², and its test statistics: these
are our 't statistics', telling us how many standard errors the coefficients are from
zero, i.e. how statistically significant they are.
o The analysis also discarded other indicators that might have been considered, including
education, economic growth and inequality.
These nine indicators are then 'weighted', allowing those deemed somewhat
more important than others to have a greater effect on the mother-of-all measure.
The weights are based (somewhat loosely) on beta coefficients of the indicators: i.e.
those shown in regression analysis to have a stronger impact on the dependent
variable are weighted more.
The idea here is to resolve the paradox: 1 is better than 2 on a, but 2 is better than 1 on b. Which
is better overall: 1 or 2? To find this out, combine a and b.
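A minimal sketch of that aggregation (the scores and weights below are invented for illustration):

```python
# Option 1 beats option 2 on indicator a, but loses on indicator b.
scores = {
    "option1": {"a": 9.0, "b": 4.0},
    "option2": {"a": 6.0, "b": 7.0},
}

# Weights express which indicator we deem more important (loosely
# analogous to weighting by beta coefficients, as The Economist does).
weights = {"a": 0.7, "b": 0.3}

def index_score(indicators, weights):
    """Aggregate several indicators into one weighted index."""
    return sum(weights[k] * v for k, v in indicators.items())

for name, indicators in scores.items():
    print(name, round(index_score(indicators, weights), 2))
# option1 comes out ahead overall: 7.5 versus 6.3.
```

Note that with equal weights of 0.5 each, the two options would tie at 6.5: the overall ranking hinges on the weights, which is why index builders agonize over them.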
Perhaps the most prominent index in the world is the Dow Jones Industrial Average, an index of
30 stocks that are meant to provide a snapshot of the state of the US economy (or at least
investment in the US economy). The Dow Jones Company website includes a link to their
method in constructing their indices. Other examples of indices (two of which we’ve used in
this class):
The Human Development Index -- developed by the United Nations Development
Programme.
'Freedom in the World' scores -- developed by Freedom House, an international
nongovernmental organization dedicated to human rights.
The Ratings Percentage Index -- used by the NCAA to rank sports teams from different
conferences.
Program evaluation
There are a few weeks of program evaluation included in this course for the same reason that a
lot of the content is here: this makes a lot of sense, especially given the contemporary pressure
toward performance measurement in both the public and nonprofit sectors. As some examples:
The United Way's Outcome Measurement Resource Network.
The Clinton administration's National Partnership for Reinventing Government.
The Government Performance and Results Act of 1993.
The Office of Management & Budget's Program Assessment Rating Tool.
The Bush administration's President's Management Agenda.
Continued under the Obama administration.
The Obama Administration's Accountable Government Initiative.
The State of Florida Office of Program Policy Analysis & Government Accountability.
The Florida Legislature Government Program Summaries.
The City of Jacksonville's Key Benchmark Indicators.
Types of analysis
By way of introduction, Patton and Sawicki (on whom this lecture draws heavily, partly
because I like the book, partly because Berman & Wang don’t do a good job of this on p.
53-4) also offer the following as a categorization of different types of evaluations. Patton and
Sawicki note that some 100 evaluation techniques have been identified (p. 373), but as often as not
these are wheels that are reinvented and renamed.
Table 1 -- House's evaluation taxonomy (source: Patton and Sawicki, p. 374)

Systems analysis
  Major audiences or reference groups: economists, managers
  Assumes consensus on: goals; known cause and effect; quantified variables
  Method: PPBS, linear programming, planned variation, cost-benefit analysis
  Outcome: efficiency
  Typical questions: Are the expected effects achieved? Can the effects be achieved more economically? What are the most efficient programs?

Behavioral objectives
  Major audiences or reference groups: managers, psychologists
  Assumes consensus on: prespecified objectives; quantified outcome variables
  Method: behavioral objectives, achievement tests
  Outcome: productivity, accountability
  Typical questions: Is the program achieving the objectives? Is the program producing?

Decision-making
  Major audiences or reference groups: decision-makers, especially administrators
  Assumes consensus on: general goals; criteria
  Method: surveys, questionnaires, interviews, natural variation
  Outcome: effectiveness, quality control
  Typical questions: Is the program effective? What parts are effective?

Goal free
  Major audiences or reference groups: consumers
  Assumes consensus on: consequences; criteria
  Method: bias control, logical analysis, modus operandi
  Outcome: consumer choice, social utility
  Typical questions: What are all the effects?

Art criticism
  Major audiences or reference groups: connoisseurs, consumers
  Assumes consensus on: critics; standards
  Method: critical review
  Outcome: improved standards, heightened awareness
  Typical questions: Would a critic approve this program? Is the audience's appreciation increased?

Professional review
  Major audiences or reference groups: professionals, public
  Assumes consensus on: criteria; panel procedures
  Method: review by panel, self-study
  Outcome: professional acceptance
  Typical questions: How would professionals rate the program?

Quasi-legal
  Major audiences or reference groups: jury
  Assumes consensus on: procedures and judges
  Method: quasi-legal procedures
  Outcome: resolution
  Typical questions: What are the arguments for and against the program?

Case study
  Major audiences or reference groups: clients, practitioners
  Assumes consensus on: negotiations; activities
  Method: case studies, interviews, observations
  Outcome: understanding diversity
  Typical questions: What does the program look like to different people?
Principles of analysis
Patton and Sawicki offer a number of 'principles' of analysis, which I'll type in below:
Determine the focus of the evaluation
Try to become involved as early as possible
Decide what data will be produced
Determine what change will be measured
Identify what policy action or intervention is being evaluated
Use multiple methods of measurement
Design the evaluation so it can respond to program modifications
Design the evaluation to provide in-course as well as final evaluations
Involve program staff in the evaluation
Recognize the politics of evaluation
Make your preliminary findings available
Give a clear presentation
(source: Patton and Sawicki, p. 388-93)
Patton and Sawicki go on to note a variety of types of program evaluation:
Before-and-after comparisons -- in the event of the implementation of new policy, in a perfect
world pre-implementation data would be gathered to use as a benchmark. As will be discussed, a
fundamental problem with the social sciences is the difficulty involved in controlling for other
variables. As an example, in his 2001 State of the Union address, the President noted that "rising
energy prices" were a policy issue the administration had to address; while in a later speech his
budget director Mitch Daniels noted that
“The report we've issued this morning confirms that the nation has entered an era of solid
surpluses. Surpluses on the order of $160 billion, despite an economy that has been weak
now for over a year and in decline for that time. This is the second largest surplus in
American history, in the face of that weak economy, a phenomenon that should strike all
Americans as very positive.
“The 10-year forecast that we have projected is $1 trillion, again an astonishing number,
vastly more than the amount of publicly held debt that it will be possible to repay over that
time period. And this number reflects new commitments since the April budget of $198
billion in the first installment of the President's program to repair and rebuild our national
defenses, and also a revised increased estimate for Medicare reform including prescription
drug coverage for our senior citizens, up from $153 billion to $190 billion.”
In both cases, the administration was proven spectacularly wrong, and this is easy to
determine simply by comparing the state of affairs on these two variables (energy prices and the
federal debt): energy prices are well above what they were when the President expressed his
concerns about high prices (from ~$25 a barrel in 2001, to as high as $100 a barrel since), while
Mitch Daniels presided over one of the most extraordinary budgetary meltdowns in US history,
from record surpluses to then record deficits (from a $200b surplus in 2000, to a peak ~$400b
deficit in 2004, not including the over $1 trillion deficit from Bush's 2009 budget). Can Daniels
and Bush be blamed for missing these benchmarks? They, of course, legitimately cite other
factors out of their control that account for a lot of these problems (if by no means all, and by no
means most, especially in terms of the budget!). So the before and after evaluation of
administration policy, on energy prices and on public finance, becomes muddied by these other
factors.
With-and-without comparisons
Similarly, a control group among whom the program was not implemented can be adopted as a
contrast. Florida implements a program, and monitors the same underlying phenomenon that the
program is meant to address in Georgia, which didn't implement the new program. In this way it
is hoped that the Georgia case can help control for other factors, beyond the new program, that
might affect the underlying phenomenon. So improvement in Florida isn't enough to determine
program success; improvement in Florida relative to Georgia must be achieved. Take the US
federal deficit: perhaps the Bush administration was only whacked by global forces, so if we
compare the US to Canada we can get some sense of this. Nope, them thar' socialist Canadians
ran budget surpluses through most of the 2000s.
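The with-and-without logic amounts to a simple difference-in-differences calculation, sketched here with invented numbers:

```python
# Outcome measured before and after in the program state (Florida)
# and in a comparison state without the program (Georgia).
florida = {"before": 50.0, "after": 58.0}  # improved by 8
georgia = {"before": 50.0, "after": 55.0}  # improved by 5, with no program

def diff_in_diff(treated, control):
    """Credit the program only with improvement beyond the control's."""
    return ((treated["after"] - treated["before"])
            - (control["after"] - control["before"]))

print(diff_in_diff(florida, georgia))  # 3.0
```

Only the 3 points of improvement beyond Georgia's are attributed to the program; the other 5 points presumably reflect factors common to both states.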
Actual-versus-planned performance targets
Establish goals for the program, and assess performance relative to these. This is, of course, also
subject to fiddling, simply by setting easy goals. Years ago, when airline arrival times began to
be monitored, most airlines quickly improved their on-time rates, simply by padding projected
travel times. Similarly, the Bush administration crowed about having brought its 2004 budget in
at a deficit of only 3.6% of GDP, well below the projected 4.5%. Yet when this budget was
proposed, many noted the ruse in the overly pessimistic projections. The administration was also
active in calming citizens about the size of the federal deficits. In the President's 2006 Budget
Message, for instance, he argues (with accompanying graphic):
"The Budget forecasts that the deficit will
continue to decline as a percentage of GDP. In
2005, we project a deficit of 3.5 percent of
GDP, or $427 billion. And if we maintain the
policies of economic growth and spending
restraint reflected in this Budget, in 2006 and
each of the next four years, the deficit is
expected to decline. By 2009, the deficit is
projected to be cut by more than half from its
originally estimated 2004 peak—to just 1.5
percent of GDP, which is well below the 40-
year historical average deficit, and lower than
all but seven of the last 25 years."
Note that the 40 year historical average was buoyed considerably by the similarly record-setting
deficits of the Reagan/Bush era. So in effect, the supply-side tax cutting Bush administration
was arguing that its deficits were not that large, when compared to the large deficits of the
previous conservative, supply-side tax cutting administration. Not so comforting!
Experimental models
Experimentation at the societal (not to mention the national) level is all but impossible, but at the
program level it is somewhat less so. This underlies pretty much every medical trial ever
conducted. The logic of experimental models is simple: a group of subjects is recruited, and
divided randomly in half. One half is subjected to the treatment, the other half is not. In medical
trials, subjects typically don't know which of the two groups they are in. In the social
sciences this can be more difficult (i.e.: while the medical trial control group can be given a
'sugar pill' that is indistinguishable from the actual medicine, the halfway-house control group
knows whether or not it is in a halfway house), but certainly not impossible. A successful trial
would see the test group perform better than the control group.
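The random-assignment logic can be sketched as follows; all the data below is simulated, with the treatment assumed to add a constant benefit of 2.0:

```python
import random

random.seed(1)

# Recruit subjects and divide them randomly in half.
subjects = list(range(200))
random.shuffle(subjects)
treatment, control = subjects[:100], subjects[100:]

# Simulated outcome: noisy baseline performance, plus the assumed
# constant benefit of 2.0 for those receiving the treatment.
def outcome(treated):
    return random.gauss(10.0, 3.0) + (2.0 if treated else 0.0)

treated_scores = [outcome(True) for _ in treatment]
control_scores = [outcome(False) for _ in control]

# A successful trial: the test group outperforms the control group,
# and the gap estimates the treatment effect.
effect = (sum(treated_scores) / len(treated_scores)
          - sum(control_scores) / len(control_scores))
print(round(effect, 2))  # should land near the true effect of 2.0
```

Because assignment is random, the two groups are balanced (in expectation) on everything else, observed and unobserved, which is exactly what the Florida/Georgia comparison cannot guarantee.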
Quasi-experimental models
Real world application of an experimental design, "when a true experiment cannot be conducted
-- when we cannot randomly assign persons to treatment and control groups, when we cannot
control the administration of the program or policy or restrict the policy to a treatment group or
when programs are not directed at individuals" (p. 379). The general idea is as for that of
experimental designs, save that the control group is intentionally selected (rather than randomly
assigned) as similar enough to the test group that the control group can serve as a useful
benchmark. Patton and Sawicki also discuss time series designs, which are largely the same
thing save that multiple measurements will be taken across time, rather than a simple pre- and
post-test (O'Sullivan et al note the differences between these in their chapter 3). Patton and
Sawicki note that one needs to be careful with quasi-experimental designs, especially in terms of
being conscious of survey effect (people act differently when they know they are being studied),
external effects (terror mass murders messing up budget projections), and such.
Note that, with the Affordable Care Act being implemented differently in different states, we’ll
see a natural experiment play out over the next few years, as the results come in.
Cost-oriented evaluation approaches
Or cost-benefit analysis. Beyond the obvious, an advantage of this approach lies in that dollars
provide a metric allowing the comparison of programs in two different areas. Beds provided for
homeless people in one program could be compared to cops put on the beat in another. Given
success in each program, which will have greater impact on society? To the extent that one can
'dollarize' the impact of each program, these dollar impacts can then be compared. So one
measures the social impact of each program, divides it by the dollar cost, and gets social
impact per dollar of input.
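A sketch of that comparison, assuming the hard 'dollarizing' work has already been done (all figures below are invented):

```python
# Two programs with dollarized social impacts and dollar costs.
programs = {
    "shelter beds": {"impact": 900_000.0, "cost": 600_000.0},
    "police on the beat": {"impact": 780_000.0, "cost": 600_000.0},
}

def impact_per_dollar(p):
    """Dollarized social impact generated per dollar of input."""
    return p["impact"] / p["cost"]

for name, p in programs.items():
    print(name, impact_per_dollar(p))
# shelter beds: 1.5; police on the beat: 1.3. Beds buy more impact
# per dollar here, even though the two programs are in different areas.
```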
Dollarizing? Of course, 'dollarizing' these different impacts is the tough part. Monitoring dollar
inputs is fairly easy, but comparing the benefits of reduced crime to improvement in the lives of
once homeless people is hard to do quantitatively.
Patton and Sawicki also note a different type of cost-effectiveness comparison, between
programs of the same kind: surely, all else being equal, a program able to provide x beds at y cost is more effective
than a program providing x beds at 2y cost. It isn't as easy, though, to compare 100 beds at y cost
to ten fewer property crimes at 0.8y cost. Returning to the use of dollars as a metric for
comparison between different programs, an interesting recent example of this was provided by a
project known as the Copenhagen Consensus, which sought to work out where global
development dollars would yield the highest return on investment. Calculating dollar benefits v.
dollar costs of such grand projects is certainly controversial, yet consider the alternative: these
policies are implemented depending on the preferences of the rich world, upper-middle class
advocacy groups and government officials at the center of the decision-making process. The
Copenhagen Consensus analysis was led by Bjorn Lomborg, famous for the book The Skeptical
Environmentalist, in which he convincingly skewers a lot of views held by rich world, upper-
middle class, self-styled environmental advocacy groups. It found that greater returns would
come from policies aimed at malaria, HIV/AIDS, and malnutrition, all of which affect
developing world poor folk; rather than combating the perceived risk of global warming, dear to
the hearts of the rich world, upper-middle class self-styled environmental advocacy groups (with
family cottages on low-lying Cape Cod!).
Index construction
Define the concept -- what is it that you are trying to explain? If the analyst is unable to do this,
s/he may not understand what s/he is doing.
Selecting the items -- four concerns:
choose the "right items -- those that represent the dimension of interest and no other,"
"include enough items to distinguish among all the important gradations of a dimension
and to improve reliability,"
"decide whether every item is supposed to represent the entire dimension in which you
are interested or whether each is to represent just [a] part of the dimension,"
"keep costs down by excluding items that provide no extra information" (p. 301).
Combining the items in an index. Options:
'Likert scaling' (p. 308-12). For instance, a Likert scale-based question such as the following
would be a weak way to get a handle on political ideology:
Q. How would you describe your political ideology?
1. strong conservative 2. moderate conservative 3. moderate 4. moderate liberal 5.
strong liberal
This simply captures way too little. Instead, the analyst might develop a series of ten or
twenty questions, each with a consistent Likert scale, that assess this from different angles,
asking for opinions on a range of issues. By aggregating these results, a fuller, multi-faceted
indicator is achieved.
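The aggregation step can be sketched as follows (the ten responses below are invented):

```python
# One respondent's answers to ten ideology items, each on the same
# 1 (strong conservative) to 5 (strong liberal) Likert scale.
responses = [2, 3, 4, 3, 5, 2, 4, 3, 3, 4]

def likert_index(items):
    """Average several same-scale Likert items into one index score."""
    return sum(items) / len(items)

print(likert_index(responses))  # 3.3, slightly left of the 3.0 midpoint
```

The averaged score smooths over any single oddly-worded question, which is the reliability gain from using many items.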
Examples:
The dreaded Keirsey temperament sorter
The equally dreaded Myers Briggs.
Right-Left Brain test (this should be fun!). My result: left brain 70%, right brain 46%.
Facebook has innumerable tools of this sort. One that I particularly remember is the "Which
Rhode Island town are you?" After an hour or so of retaking the test, I managed to figure out
what answers I needed to give to be a Block Islander.
Transformations. Especially to standardize (make easier to compare) variables constructed in
different units.
o The approach taken for this by O'Sullivan et al (in pages 305-7) revolves largely around
deriving a relative score for each indicator, based either on the range, or on the standard
deviation, for that indicator. Example 10.3 provides an example of standardization based on
the range.
As an example of the importance of this, consider life expectancy and GDP per capita (at
purchasing power parity, a more comparable figure).
The lowest average life expectancy of any country in the world is Zambia, at 40.5
years (according to the Global Government data base we have been using). The
highest is Japan, at 82.3.
The lowest GDP per capita reported by the same source is $667, in Malawi; highest is
Luxembourg at $60,228.
Zambia's GDP per capita was $1023, Japan's $31,267.
If you were to simply add up life expectancy and income, Zambia would get a score
of 1063.5 (40.5 years + $1023), while Japan's score would be 31,349.3 (82.3 years +
$31,267).
The point is that income exerts far, far more influence on the index than
does life expectancy.
One solution: scale each of these variables as a proportion of the highest score;
the lowest life expectancy score would be 0.492 (40.5/82.3).
The lowest income score would be 0.011 (667/60,228).
As a result the income indicator will still exert more influence on the overall
index, as there is more variation on this one.
The better solution: scale through the range of the variable.
So for life expectancy, Zambia gets zero, Japan 100 points, the rest score
according to their relative placement within the range between Zambia and Japan.
So the United States would score 89.5. With a life expectancy of 77.9, this
score is determined by the formula (77.9-40.5)/(82.3-40.5) = 0.895, scaled up to 89.5.
Oddities can still occur. Luxembourg's income per capita was nearly $20,000
higher than the second highest, and probably reflects Luxembourg's status as
an international (or European) center of government and finance. So there
would be one country with a score of 100 for income, and #2 would come in
at around 60. For life expectancy, dozens of countries would have scores in
the 90s, over-emphasizing this variable.
The United Nations Development Program does this with its Human
Development Index which, like The Economist's quality-of-life index above, tries
to work out which countries outperform others in terms of 'human
development'. The UNDP uses life expectancy, education and income as the
three components of its index.
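The range-based rescaling can be sketched with the figures used above:

```python
def scale_by_range(x, lo, hi):
    """Rescale x onto 0-100 within the observed range [lo, hi]."""
    return 100.0 * (x - lo) / (hi - lo)

# Life expectancy: Zambia 40.5 (lowest), Japan 82.3 (highest).
print(scale_by_range(40.5, 40.5, 82.3))            # 0.0 for Zambia
print(scale_by_range(82.3, 40.5, 82.3))            # 100.0 for Japan
print(round(scale_by_range(77.9, 40.5, 82.3), 1))  # 89.5 for the US

# GDP per capita: Malawi $667 (lowest), Luxembourg $60,228 (highest).
print(round(scale_by_range(31267, 667, 60228), 1))  # Japan lands near 51.4
```

Note how Japan tops one component at 100 but sits around 51 on the other, which is the outlier problem (Luxembourg's income) described above.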
Weighting the separate items -- emphasize variables believed to be more important. In The
Economist's quality-of-life ranking, for instance, is the divorce rate as important as
income? O'Sullivan et al give a good example of this in the Uniform Crime Index. As they note,
the problem with this aggregate crime figure is that it, well, aggregates different types of crime,
producing a total number of crimes reported. This is worth knowing, but as an overall indicator
of the social impact of criminal activity, it fails to distinguish between violent crime and non-violent
crime. A reduction in the number of murders that is offset by an equal increase in
marijuana possession violations will result in no change in the UCI, yet surely society is better
off (or perhaps less worse off) in the latter case?
For the record, the 2001 crime rate in the US was 4160.5 per 100,000 inhabitants.
This 4160.5 included (add 'em up) 5.6 murders (per 100,000 inhabitants), 31.8 rapes, 148.5
robberies, 318.5 aggravated assaults, 740.8 burglaries, 2484.6 larceny thefts, and 430.6 motor
vehicle thefts. Is a one-point increase in the murder rate as important as a one-point increase in
the larceny theft rate? Should larceny thefts influence the crime index 500 times more than
the murder rate?
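One way to answer that question is to weight the components before summing. The 2001 rates below come from the text; the severity weights are entirely invented for illustration:

```python
# 2001 US crime rates per 100,000 inhabitants, from the text above.
rates = {
    "murder": 5.6, "rape": 31.8, "robbery": 148.5,
    "aggravated assault": 318.5, "burglary": 740.8,
    "larceny theft": 2484.6, "motor vehicle theft": 430.6,
}

# Unweighted, the index is just the sum, so larceny theft dominates it.
unweighted = sum(rates.values())

# Hypothetical severity weights (invented): a one-point rise in the
# murder rate now moves the index 100 times more than a one-point
# rise in the larceny theft rate.
weights = {
    "murder": 100.0, "rape": 50.0, "robbery": 10.0,
    "aggravated assault": 10.0, "burglary": 3.0,
    "larceny theft": 1.0, "motor vehicle theft": 3.0,
}
weighted = sum(weights[k] * rates[k] for k in rates)
print(round(unweighted, 1), round(weighted, 1))
```

The weighted figure is no longer interpretable as "crimes per 100,000", but changes in it track society's presumed valuation of different crimes rather than raw counts.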
Not surprisingly, when I checked the World Almanac 2007 to update those figures, I got this
note in place of a more recent UCI: "The use of the Crime Index in the Uniform Crime Reports
Program was discontinued beginning with the report of 2003 data" (p. 117).
However that raw data still gets reported. 2007, for instance, saw 3.1 murders, 30 rapes, 284
aggravated assaults, 148 robberies, 723 burglaries, 2178 larcenies/thefts, and 363 motor
vehicle thefts, for a total of about 3729 crimes per 100,000 population, a dramatic drop that policy
wonks are still arguing about.
An illustration of constructing indices
We'll do an example for categorical data. In the Appendix (pages ) you’ll see an example for
continuous variables. In the factor analysis example below, we'll construct an index for a
continuous variable.
The Belle County survey provides a number of indicators regarding citizen satisfaction. But
what to make of the results? Assume you want the results broken down by whether a respondent
lives in the city limits or not. A table of key responses for ‘Overall county service value rating’,
for instance, would look like this:
Analyze, Descriptive, Crosstabs
Rows = Residence in city limits; Column = Overall county service value rating
Click ‘cells’: Check ‘Row’ under Percentages
Okay
Table 2 -- Residence in city limits * Overall county service value rating Crosstabulation
(% within Residence in city limits)

            Very poor   Somewhat     Fair     Good     Excellent
            value       poor value   value    value    value       Total
Inside      4.5%        10.9%        46.0%    33.5%    5.1%        100.0%
Outside     5.6%        14.9%        38.5%    36.0%    5.0%        100.0%
Total       4.9%        12.2%        43.5%    34.4%    5.1%        100.0%
But is this single indicator enough? This is where an index can help aggregate the results. This
can be done a couple of ways, depending on your unit of analysis. A simple one is as follows:
You can construct an index for each individual respondent. The process will first require
creating a new variable, and then using the Transform function.
Go to Variable View (bottom left of the spreadsheet), and add another variable, call it
'index', leave it numeric, and Label it 'Index: service value ratings'.
Go back to Data View.
Transform
Compute
Target Variable will be 'index' (you have to type it in)
The Numeric Expression will be:
(valserv + valaid + valsch + valenv) / 4
This aggregates the responses from each respondent. These can then be added up using Case
summaries, just to use this again to construct a table (Analyze; Reports, Case Summaries;
Residence in city limits in 'Grouping Variable'; Index as 'variable', click off the checkmark in
Display cases, click on Statistics and choose Mean), click Continue, click Okay. You get this:
Table 3 -- Case Summaries
Index: service value ratings, mean by residence in city limits

Inside    3.2922
Outside   3.3347
Total     3.3065
As can be seen, the two hardly differ. Needless to say, though, we don't trust such
impressions; instead we do a hypothesis test. An independent samples test gives us a mean score of
3.29 for those residing in the city, and 3.33 for those outside, with p = .542. But hey: now you
know!
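For the curious, the same pipeline (compute each respondent's index, summarize by group, test the difference) can be sketched outside SPSS. The variable names (valserv, valaid, valsch, valenv) come from the survey, but the handful of respondents below is invented, so the numbers won't match the tables above:

```python
import math

# Invented mini-sample; the real survey has many more respondents.
respondents = [
    {"city": "Inside",  "valserv": 3, "valaid": 4, "valsch": 3, "valenv": 4},
    {"city": "Inside",  "valserv": 2, "valaid": 3, "valsch": 3, "valenv": 3},
    {"city": "Inside",  "valserv": 3, "valaid": 3, "valsch": 4, "valenv": 3},
    {"city": "Outside", "valserv": 4, "valaid": 4, "valsch": 3, "valenv": 4},
    {"city": "Outside", "valserv": 3, "valaid": 4, "valsch": 4, "valenv": 3},
    {"city": "Outside", "valserv": 3, "valaid": 3, "valsch": 3, "valenv": 4},
]

# Equivalent of Transform > Compute: index = (valserv+valaid+valsch+valenv)/4
for r in respondents:
    r["index"] = (r["valserv"] + r["valaid"] + r["valsch"] + r["valenv"]) / 4

# Equivalent of Case Summaries: mean index by residence group.
groups = {}
for r in respondents:
    groups.setdefault(r["city"], []).append(r["index"])
means = {g: sum(v) / len(v) for g, v in groups.items()}

# Welch's t statistic for the independent samples comparison.
def welch_t(x, y):
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

t = welch_t(groups["Inside"], groups["Outside"])
print(means, round(t, 2))
```

With so few (invented) respondents the |t| stays small, echoing the non-significant difference found in the real data.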
III. The program evaluation exercise
Present the analyses as you would in a professional report (20 page maximum, including
graphics). Introduce the analysis, justify the model that you have built, and interpret the
results. Present data in professional table/graphic formats as necessary.
Grading criteria:
Identify the issue/state the question
Demonstrate command of the material
Write professionally
Logical, coherent argument
Well-used tables/graphs
Follow instructions (debits)
For further explanation of these criteria, see the PAD 5700 Assignments page, starting about
halfway down the page.
Choose one of the four programs to evaluate:
Long term care facility
Report on the operation of the Adams long term care facility. The data is based on a survey
instrument of residents of Adams and of three other institutions. Develop
a list of key performance measures for the four facilities, and
identify correlates/determinants of those performance measures.
Write up the results as you would a professional report.
Long term care facilities dataset
County programs evaluation tool
Develop an evaluation tool for, and report on the operation of, public services provided by this
county. The data is based on a survey of residents. Develop
a list of key performance indicators for the various programs provided by the county, and
identify correlates/determinants of those performance measures, for the different
programs.
Write up the results as you would a professional report.
Belle County survey
Community concert program
Develop an evaluation tool to recommend types of music to be offered in a series of six concerts
presented by a city concert hall. The survey below (which is just a modified version of the 1993
General Social Survey) asked 1500 citizens for their preferences, as well as identified a range of
characteristics of these citizens. The community's musical preferences are indicated in the
questions titled: 'Music: ______'. In your recommendation, balance the following criteria:
satisfy as wide a cross section of the community as possible with these six concerts,
appeal to a wealthier clientele likely to be able to afford the concerts, and
recommend identifiable groups of people (no more than two per type of music) to whom
advertising for each concert could be targeted.
The data: Concert.
The UNF PORL Fall Statewide Omnibus Survey
Evaluate the results of the UNF Public Opinion Research Lab's Fall 2013 Omnibus statewide
survey, in terms of determinants of the views expressed. By this I mean: the official press release
from the survey presented political preference results as well as demographic results; in your
analysis, look at the impact of the demographics on the political preference (and other) results.
The data is based on a survey of residents. Do the following:
Identify the key outcomes that you will analyze,
identify important correlates/determinants of those outcomes.
Write up the results as you would a professional report.
UNF-PORL-Fall-State survey
The data
Variable descriptions
Appendix!
Factor analysis (the Quick & Nasty Guide)
After lots of trial and error, I think I finally figured this out. The idea in factor analysis is to look
for correlations between variables. If independent variables are correlated (and correlation
among independent variables violates one of the assumptions of regression analysis), one might
consider combining them into an index. For a good example (by a prominent public administration
scholar), see
James Perry and Leslie Berkes (1979). "Predicting local government strike activity: an
exploratory analysis." Western Political Quarterly 30(4), p. 513-27.
The logic (if not some of the details) of factor analysis is illustrated pretty well in this
article. Essentially, Perry and Berkes wanted to look at the impact, on local government strike
activity, of four broad variables: the macro-environment (social and economic
characteristics), government characteristics, public employment characteristics, and legal
policy. How, though, does one 'operationalize' these variables? They didn't, for instance, want
to do something as naive as offer the divorce rate as an indicator of 'family life', as in the
Economist article above. But which to choose, and how to avoid the 'multi-collinearity' problem
in regression analysis (see O'Sullivan et al, p. 442-5)? What factor analysis can do is help sort
this out.
So Perry and Berkes identified a number of variables that plausibly seemed like they could
operationalize each of their four broad variables. In Table 1 (p. 515), for instance, they identify
ten. These then are chunked into factor analysis, and what the process does is identify groups
among them, or identifies those that are significantly related. As you’ll see, it’s like a
multivariate correlation matrix – rather than just looking at relationships between pairs of
variables (a covaries with b), it looks for clumped groups of tightly correlated variables (a, b, c,
and d all covary). Possible results vary from -1 (perfect negative correlation) to 1 (perfect
correlation), so a high absolute value (closer to 1 or -1) reflects correlation. Analyzing the
results, the authors identify three broad groupings, we'll call them a, b and c for now:
a
  % of state population which is urban
  % of state population below low income level
  state per capita income
  state median family income
  % employed in non-agricultural establishments
  state population density
b
  % non-agricultural union membership
  right to work law
  man days idle (normalized)
c
  % of state population below low income level
  % population black
Note that all have 'loadings' (scores) above 0.50 (though they use 'number of teacher strikes' in
their 'past strike activity' factor in Table C, even though it only has a 'loading' of 0.46). The final
step in this process is simply to provide the 'factors' (the broad groupings of correlated variables)
with spiffier names, given the nature of the 'factors'. So they call the three 'factors'
urbanization/industrialization, private union influence, and race/poverty:
urbanization/industrialization
  % of state population which is urban
  state per capita income
  state median family income
  % employed in non-agricultural establishments
  state population density

private union influence
  % non-agricultural union membership
  right to work law
  man days idle (normalized)

race/poverty
  % of state population below low income level
  % population black
The process is repeated for the other groups of variables that they identify as relevant to their
other three broad variables.
What one does with this is:
Not include together, in further analysis, any two variables that 'load' onto a single 'factor':
they are correlated.
Instead, on the one hand only one of these variables might be selected to operationalize
the 'factor', perhaps the one with the highest 'loading'.
On the other hand, one might construct an index to operationalize the 'factor', using all or
some of the variables in that factor. This can provide a richer, less narrow
operationalization of the factor.
An example of factor analysis -- note: you will not necessarily have to do this for assessment
items!!!
We'll be using the indiana-crimeadjusted dataset for a factor analysis example. Let's look at
determinants of (or factors correlated with, to not make the usual mistake of assuming causality
in statistical analysis) population change in Indiana counties. Looking at our dataset, we have a
number of variables that might plausibly be related to this:
pop -- population: size might matter, large counties might be growing faster than small;
or vice versa.
latino -- percent Latino population
youths -- 18-24 year old population as a percentage of the total
femhspct -- percentage female headed households
gradhs -- high school graduation rate
gradcol -- college graduation rate
inchouse -- median household income
poverty -- poverty rate
crime -- crime rate
jobless -- unemployment rate
manfng -- manufacturing earnings, as a percentage of personal earnings
farm -- farm population, as a percent of the total
As a start, run descriptive statistics for these (and the dependent variable), just so you have a
better idea what you're dealing with. Also, make sure you have the right variables: we need
normalized (percentages or rates) variables, not raw numbers (with the exception of
population). In other words, you need the percentage of female headed households, not the
number of these.
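For those working outside SPSS, here is a minimal sketch of the same descriptive-statistics check in Python with pandas. The variable names match the dataset above, but the values are made up for illustration, not the Indiana data:

```python
import pandas as pd

# Made-up values for three of the dataset's variables (femhspct, poverty,
# jobless are the real variable names; the numbers are illustrative).
df = pd.DataFrame({
    "femhspct": [6.5, 9.2, 16.6, 11.0],   # % female-headed households
    "poverty":  [5.1, 12.4, 8.8, 15.0],   # poverty rate, %
    "jobless":  [3.2, 6.7, 4.4, 5.9],     # unemployment rate, %
})

# describe() gives count, mean, std, min, quartiles and max in one shot,
# much like SPSS's Descriptives output
print(df.describe())

# A quick sanity check that these are normalized rates, not raw counts
assert df.max().max() <= 100
```

The final assertion is exactly the "make sure you have the right variables" check: percentages and rates should top out below 100, while raw counts would not.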
Second, also run a correlation matrix, and a multiple regression, just to have these handy so that
we can compare these results (that we're a bit familiar with) with the results of the factor
analysis. If you're looking at this at home, I'll run the results and copy them into another file
http://www.unf.edu/~g.candler/PAD5700/11-FactorCor.pdf
just so that the results don't clutter up this page (not to mention that the correlation matrix won't
fit on an 8 1/2 x 11 sheet of paper if you print it out).
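The correlation-matrix step can likewise be sketched in Python with numpy. The variable names come from the dataset above, but the data here are synthetic, with income and poverty deliberately constructed to covary:

```python
import numpy as np

# Synthetic stand-ins for three of the dataset's variables
rng = np.random.default_rng(0)
inchouse = rng.normal(45, 8, 92)                          # median household income
poverty = 30 - 0.4 * inchouse + rng.normal(0, 1.5, 92)    # built to move against income
crime = rng.normal(2000, 500, 92)                         # unrelated noise

# Correlation matrix: rows/columns are inchouse, poverty, crime
r = np.corrcoef([inchouse, poverty, crime])

# income and poverty should be strongly negatively correlated;
# income and crime should be close to zero
print(np.round(r, 2))
```

This is the pairwise view (a covaries with b) that factor analysis then generalizes to clumps of variables.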
Returning to the factor analysis exercise, we will take all of our prospective independent
variables and do a factor analysis on them:
Analyze, Dimension Reduction, Factor
Load all independent (not population change) variables as Variables
Leave the Descriptives and Extraction settings at their default positions (respectively:
Initial solution; and Principal components, Correlation matrix, Unrotated factor solution,
Eigenvalues over 1, Maximum iterations for convergence: 25).
For Rotation select Method: Varimax (it seems to be all the rage), Display: Rotated
solution, and Maximum iterations: 25.
For Scores select Save as Variables: Regression.
For Options, leave it as Missing values: Exclude cases listwise.
Click okay.
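If you are working outside SPSS, the same steps can be roughly reproduced in Python. This is a sketch using scikit-learn's FactorAnalysis (which accepts a varimax rotation in recent versions) on synthetic data built to contain two correlated clumps. Note one assumption: scikit-learn fits a maximum-likelihood factor model rather than SPSS's default principal-components extraction, so loadings will not match SPSS exactly:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic data with two latent 'clumps': variables 0-1 share one
# underlying factor, variables 2-3 share another.
rng = np.random.default_rng(42)
n = 300
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])

# Varimax-rotated factor analysis, as in the SPSS Rotation dialogue
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
scores = fa.fit_transform(X)      # per-case factor scores (cf. fac1_1, fac2_1)
loadings = fa.components_.T       # rows = variables, columns = factors

print(np.round(loadings, 2))
```

The dominant loadings should group variables 0-1 onto one factor and 2-3 onto the other: exactly the 'clumping' of correlated variables described above.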
The results (including only the key factor loadings, and looking at the Rotated Component
Matrix, which doesn't differ significantly from the Component Matrix):
Rotated Component Matrix (a)

                                       Component
                                      1      2      3      4
Population                          .161   .819   .119   .136
Percent Latino population           .130   .431   .110   .681
18-24 population, %                -.070   .119   .953   .039
% female headed households         -.326   .846   .043   .173
High school grads, %                .788   .174   .316  -.027
College grads (B.A. or higher), %   .634   .353   .596  -.132
Median household income per capita  .942   .052  -.127  -.023
Poverty, %                         -.849   .275   .284  -.132
Crime rate, serious per 100k pop.  -.006   .644   .019  -.166
Unemployment rate, %               -.716  -.035  -.096  -.126
Manufacturing, % personal earnings  .023  -.216  -.076   .847
Farm population, %                 -.253  -.677  -.292   .079

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 5 iterations.
To present these four identified factors (ordering the variables within each by factor
loading), in the format of my presentation of the Perry/Berkes study above, we get this:
a
  income (.942)
  high school graduation (.788)
  college graduation (.634)
  unemployment (-.716)
  poverty (-.849)

b
  female headed households (.846)
  population (.819)
  crime rate (.644)
  farm population (-.677)

c
  youth population (.953)
  college education (.596)

d
  manufacturing (.847)
  Latino population (.681)
The next (largely cosmetic) step is to name the four factors. This is just a matter of looking at
what is going on, and giving them names that are a bit more evocative than a-d. How about:
a -- socio-economic I
b -- socio-economic II
c -- college town
d -- manufacturing
So we now have four factors identified, as follows:
Socio-economic I
  income (median hh)
  high school graduates (%)
  college graduates (%)
  unemployment (%) (negative)
  poverty (%) (negative)

Socio-economic II
  female-headed households (%)
  population
  serious crime (per 100k)
  farm population (%) (negative)

College town
  youth population (%)
  college grads (%)

Manufacturing
  manufacturing earnings (%)
  Latino population (%)
Before moving on: the Socio-economic I and II result is a bit interesting. On the one hand, factor
analysis has identified two distinct 'factors' composed of general socio-economic data. On the
other, at this point I can't distinguish between the two -- what is different between them? -- so am
unable to come up with a designation better than Socio-economic I and Socio-economic II.
Further analysis can clarify what is going on. With these four factor variables, we have created
indices to rate the 92 counties in the state on these four criteria. SPSS creates them
automatically, labeling them fac1_1 to fac4_1 (check the dataset). We can now rank order them
(Data, Sort Cases. In the window: Sort By fac1_1). The results, showing the top and bottom
three in each category:
                Socio-economic I   Socio-economic II   College town   Manufacturing
Top three:      Hamilton           Marion              Monroe         Noble
                Hendricks          Lake                Tippecanoe     Elkhart
                Boone              Allen               Delaware       Lake
Bottom three:   Starke             Pulaski             Morgan         Brown
                Crawford           Warren              Ohio           Pike
                Switzerland        Lagrange            Franklin       Hendricks
These results make the difference between Socio-economic I and Socio-economic II pretty clear:
I is suburban counties, II urban counties. To rename the factor analysis results:
Suburban
  income (median hh)
  high school graduates (%)
  college graduates (%)
  unemployment (%) (negative)
  poverty (%) (negative)

Urban
  female-headed households (%)
  population
  serious crime rate (per 100k)
  farm population (%) (negative)

College town
  youth population (%)
  college grads (%)

Manufacturing
  manufacturing earnings (%)
  Latino population (%)
A way to get a better handle on what factor analysis is doing is to run a correlation
matrix on one of the factors we've identified above, say Socio-economic II (urbanization). We
get this:
[Correlations table (SPSS output) omitted; see the linked output file.]
Note first that each of the six two-way correlations between the four variables in this factor has
a p-value (the 2-tailed significance) of less than .002. As well, look at the Latino population,
which had a 'loading' of .431 on this 'factor' (remember that Perry and Berkes accepted a couple
of loadings that were nearly this low). Look at the p-values of the correlations between this
variable and the others in the factor:
Latino population and Population -- .000
Latino population and Female households -- .001
Latino population and Crime rate -- .106
Latino population and Farm population -- .054
The point here is that the four variables in this factor with loadings of at least 0.500 have
correlations with p-values of less than .002 with each of the other three variables in the factor.
Latino population is quite highly correlated with each of the four variables in this factor, but its
correlations with two of them are not quite as strong: p = .054 with Farm population, and p = .106
with Crime rate. So what factor analysis does is identify groups of variables that are all highly
correlated with one another.
Well bully for factor analysis!
But what's the point? One point is that when specifying models for regression analysis, we can
avoid multi-collinearity and, using index construction, develop richer variables. The regression
model that you ran earlier (and which I've presented on the linked output page mentioned above)
lists thirteen independent variables. We now know that many of these thirteen variables are very
highly correlated with each other: multi-collinearity. So instead, we will construct four new
variables -- the four factors identified in this analysis -- and use these as our independent
variables in regression analysis.
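One way to see that multi-collinearity directly is to compute variance inflation factors (VIF = 1/(1 - R2), where R2 comes from regressing each predictor on the others; a VIF much above 10 is usually read as serious collinearity). Here is a sketch with deliberately collinear synthetic data; the helper function and data are my own, not part of the lecture's SPSS workflow:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.
    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
    on the remaining columns (plus an intercept)."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Two nearly-duplicate predictors (think high school and college graduation
# rates moving together) plus one independent predictor.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = a + 0.1 * rng.normal(size=200)      # almost a copy of a
c = rng.normal(size=200)                # unrelated
X = np.column_stack([a, b, c])

print(np.round(vif(X), 1))  # first two VIFs very large, third near 1
```

Replacing the near-duplicate predictors with a single factor-based index is precisely the fix the lecture proposes.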
From factors to variables
An intermediate step is to decide what we want to do with our factor analysis results, or just
to make sure we like what the computer program has done for us. Remember: the computer
works for us, we don't work for the computer.
For instance:
We have four nicely identified phenomena, or variables: Urban, Suburban, College and
Manufacturing.
An obvious missing component is 'ruralness', or farm communities.
o Farm population appears in only one of our factors, as a negative component of
Urbanization. So identifying this farm population variable alone as an
operationalization of farm counties seems reasonable.
We now have five variables: Urban, Suburban, College, Manufacturing, and Farm.
o We can also decide whether to leave farm population in the Urban index as a
negative number, and will do so mostly just to illustrate how to handle these in the
construction of our indices.
A last question we might ask concerns the negatively loaded variables on the Suburban
factor. This factor shows that low unemployment and low poverty are characteristics of
suburban counties. Or put another way: lack of social distress. Partly to avoid more
complicated math with negative numbers, but also because it seems reasonable to treat
'Distress' as a unique characteristic, we will add this to our list of counties and omit these
two characteristics from the Suburban variable.
We now have the following variables identified:
o Urban -- an index made up of the variables
female-headed households
population
serious crime rate
farm population (negative)
o Suburban -- an index made up of the variables
median household income
high school graduates
college graduates
o College -- an index made up of the variables
youth population
college graduates
o Manufacturing -- an index made up of the variables
manufacturing earnings
Latino population
o Farm -- made up of the variable
farm population
o Social Distress -- an index made up of the variables
unemployment
poverty
Variable transformations (or: Constructing indexes II)
You'll need to create five new variables (click on Variable View in the bottom right),
respectively
burbs -- Label it Factor: socio-economic I
urban -- Label it Factor: socio-economic II
college -- Label it Factor: college town
manfg -- Label it Factor: manufacturing
distress -- Label it Factor: social distress
We now have to address some of the standardizing problems raised by O'Sullivan et al. (p. 296-
9). The problem is that the four variables for the Socio-economic II (Urban) factor are expressed
in different units that can't simply be added, unlike the Likert scales in the Belle County
study. Descriptive statistics on the four variables, after all, show us that:
Two are percentages: Female headed households (with a range of about 10 from the highest
to the lowest) and Farm population (with a range of 22).
Population is expressed in terms of thousands (with a range of 855), and
crime rate is the number per 100,000 population (with a range of 9,331).
As a result, we can't simply add them up, as crime rate and especially population will be over-
emphasized. So what we'll do is a simple transformation of each variable into a percentile
score. As an example, for the Population variable, the minimum population will be subtracted
from each county population (county population-minimum population), which will then be
divided by the range for the population (maximum county population - minimum county
population), indicating how far along the range from lowest to highest each county sits. The formula to create this new,
standardized percentile variable would be
(county population-minimum population) / (maximum county population - minimum county population)
Also, variables that are negatively related to the factor need to be turned around, so that each
variable in the factor is pushing in the same direction. In the case of the Urban factor, Farm
population is 'negatively loaded', and so needs to be flipped around. So you don't want to know
how far it is from the minimum in percentage terms, but rather how far it is from the maximum
in percentage terms. The formula for this would be
(maximum farm population - county farm population) / (maximum farm population - minimum farm population)
The logic is that a higher number should indicate more consistency with what the index is trying
to capture. In the case of farm population, the absence of which is an indicator of urbanization, a
high number is opposite to urbanization. So "(maximum farm population - county farm
population)" -- the first part of the equation above -- turns this around. A county with a low farm
population will get higher numbers this way. This then gets divided by the range (maximum
farm population - minimum farm population), as before.
Are we having fun yet! And to think that I gave up a career in concrete for this...
So putting together the percentile formula for each of the four variables in this factor, we get the
following formula for the mean percentile ranking of the four variables in the Socio-economic II,
Urban factor:

(((pop-5.623) / (860.454-5.623)) + ((femhspct-6.5) / (16.6-6.5)) + ((22-farm) / (22-0)) + ((crime-319) / (9650-319))) / 4
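That composite formula translates directly into code. The min/max constants below are the ones given in the formula above; the helper names (scale_up, scale_down) and the sample county are my own, for illustration:

```python
def scale_up(x, lo, hi):
    """Min-max score: 0 at the minimum county, 1 at the maximum."""
    return (x - lo) / (hi - lo)

def scale_down(x, lo, hi):
    """Flipped score for negatively loaded variables: 1 at the minimum."""
    return (hi - x) / (hi - lo)

def urban_index(pop, femhspct, farm, crime):
    """Mean of the four scaled components of the Urban factor.
    The min/max constants are those in the lecture's formula."""
    return (scale_up(pop, 5.623, 860.454)
            + scale_up(femhspct, 6.5, 16.6)
            + scale_down(farm, 0.0, 22.0)     # farm population is flipped
            + scale_up(crime, 319.0, 9650.0)) / 4

# A hypothetical county: mid-sized, some female-headed households,
# little farm population, moderate crime
print(round(urban_index(pop=100.0, femhspct=10.0, farm=2.0, crime=3000.0), 3))
```

Note how scale_down handles the negatively loaded farm population: a county with little farm population scores close to 1, pushing in the same (urban) direction as the other three components.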
So to create a new, Socio-economic II (urbanization) variable, we go to
Transform, Compute
Type urban in as Target Variable
Load as Numeric Expression the formula above
Click Okay, and Change existing variable.
You can do a quick check to see if this looks reasonable by sorting the data by fac2_1, and
looking to see if your new variable seems to be similarly sorted.
Do the same thing for the other four factor variables.
So what?
Well, we can now do a more meaningful regression (and related) analysis of the determinants (or
correlates) of population change. Running a regression analysis of
population change = Suburbanization + Urbanization + College community + Manufacturing +
Social Distress + Farming
gives the results presented below.
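The regression step itself can be sketched outside SPSS with ordinary least squares via numpy. The six index names follow the model above, but the data here are synthetic (built so that suburbanization drives the outcome), so the coefficients are illustrative, not the Indiana results:

```python
import numpy as np

# Synthetic stand-in data: 92 'counties' with six index scores in [0, 1]
# and a population-change outcome driven mostly by suburbanization.
rng = np.random.default_rng(7)
n = 92
suburban = rng.uniform(0, 1, n)
urban = rng.uniform(0, 1, n)
college = rng.uniform(0, 1, n)
manufacturing = rng.uniform(0, 1, n)
distress = rng.uniform(0, 1, n)
farming = rng.uniform(0, 1, n)
pop_change = 2 + 50 * suburban - 20 * urban + rng.normal(0, 5, n)

# Design matrix with an intercept column, as in the SPSS output
X = np.column_stack([np.ones(n), suburban, urban, college,
                     manufacturing, distress, farming])
beta, *_ = np.linalg.lstsq(X, pop_change, rcond=None)

print(np.round(beta, 2))  # intercept, then the six index coefficients
```

With data built this way, the estimated suburbanization coefficient comes out large and positive and the urbanization coefficient negative, mirroring the pattern in the SPSS table below.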
Note that in the earlier regression (on the output link) you had an R squared of .639; now it is
only .415. Further, in the earlier regression the following variables were at least weakly (p ~ .10)
statistically significant:
Latino population (Standardized β = .148, p = .107)
high school grads (Standardized β = -.295, p = .037)
college grads (Standardized β = .343, p = .115)
median per capita household income (Standardized β = .766, p = .003)
unemployment (Standardized β = .189, p = .052)
manufacturing earnings (Standardized β = -.215, p = .020)
Note too that a lot of these variables are highly correlated (income, unemployment, high school
grads and college grads; Latino population and manufacturing earnings). So that whizbang R2 is
the result of our including single social phenomena, like suburbanization, more than once
(through the statistics for income, unemployment, high school grads and college grads). This
factor analysis model should give a more accurate indication of the determinants/correlates of
population growth.
Regression of Population change (1990-2000) on suburbanization, urbanization,
college community, manufacturing, social distress and farming

                        Coefficient        Standardized
                        (standard error)   coefficient    t score   probability
Constant                  2.03  (6.43)                      .316      .753
Suburbanization index    49.05  (8.39)        .755         5.88       .000
Urbanization index      -20.17 (13.81)       -.274        -1.50       .137
'College town' index    -12.72  (8.73)       -.181        -1.46       .149
Manufacturing index      -4.55  (5.25)       -.078        -0.87       .388
Social distress index    10.17  (9.93)        .123         1.024      .309
Farming index            -2.657 (8.021)      -.054         -.331      .741

Adjusted r2 = .323
F (6, 85) = 8.24, p = .000
Note, too, that the analysis makes sense. What the results tell us is that suburbanization is
strongly, and positively, related to population change. 'College town' and 'Urbanization' are
negatively, though not statistically significantly, correlated with population growth.
Manufacturing and social distress are not related to population change.